This is what you have long waited for! You have written a lot of signal processing code based on lists? You planned to rewrite it to some efficient chunky list structure? Better don't do that! The chunk structure is usually not what you want. You only use it for efficiency. But it makes it almost impossible to write fusion rules that are both correct and applicable. Thus I propose to do most things on lists using the functions from the stream-fusion package and convert only once to a chunky structure, namely at the end of a series of transformations. If the operations process the list in one pass from start to end, which is the case for most signal processing operations, then fusion will jump in and turn everything into a single loop that creates the chunky sequence. Sometimes you need to INLINE functions in order to show GHC that there is something inside your definitions that can be fused across function boundaries. However, for cutting, arranging, and sharing signals, you will certainly prefer a chunky sequence as interim data structure.
That's the library you need: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/storablevector-streamfusion For an example and performance numbers, compile with $ cabal install -fbuildTests ... 1607 PreInlineUnconditionally 2343 PostInlineUnconditionally 759 UnfoldingDone 96 RuleFired 12 *# 2 ++ 8 /## 2 <# 1 <=# 2 ==#->case 3 SPEC Main.hPutArray 12 STREAM stream/unstream fusion ... "stream/unstream" is the fusion rule that makes your code fast! The higher the count, the better. $ ./dist/build/speedtest/speedtest storablevector-from-stream: 0.131797 storablevector-fused: 8.753800000000002e-2 storablevector: 0.23966099999999999 storablevector-lazy-from-stream: 0.197343 storablevector-lazy-fused: 0.104651 storablevector-lazy: 0.229007 loop-poke: 0.22254400000000002 list-binary-put: 0.111832 list-poke-buffer: 0.146816 list-poke: 0.14077699999999999 Interestingly, the tests 'storablevector' and 'storablevector-lazy' are much slower than their fusion counterparts, although fusion should just turn modular code into the same code as in 'storablevector' and 'storablevector-lazy'. Maybe some strictness tricks in stream-fusion, that I do not understand. Since my machine has around 1 GHz CPU clock rate, the numbers mean that the CPU spends about 100 cycles per sample value, which is quite much for my taste. Per sample it must perform a Double multiplication, a pointer increment, a conversion from Double to Int16, and finally write that to the disk. I thought that today's pipelined, superscalar, what-know-I processors can do multiple operations per cycle instead of spending multiple cycles per operation as in MC68000 days. Nonetheless, much fun with playing! Henning _______________________________________________ haskell-art mailing list haskell-art@lurk.org http://lists.lurk.org/mailman/listinfo/haskell-art