Andrew Coppin wrote:

1. Why do I have to type "ByteString" in my code? Why isn't the compiler automatically performing this optimisation for me?

One reason is that ByteString is stricter than String. Even a lazy ByteString is evaluated a 64KB chunk at a time, not one element at a time. You can see how this might lead to problems with a String like this:

"foo" ++ undefined

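The first three elements of this list are well-defined, but if you touch the fourth, you die. If the compiler silently turned this into a ByteString, building the chunk would force every element, so a program that only ever looks at the first three characters would die too.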
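A minimal sketch of the difference, assuming the bytestring package and using strict Data.ByteString.Char8 for simplicity. The names partial, stringPrefix and packedPrefix are made up for illustration:

```haskell
import Control.Exception (SomeException, evaluate, try)
import qualified Data.ByteString.Char8 as B

-- The string from above: three good characters, then a bottom.
partial :: String
partial = "foo" ++ undefined

-- Taking a prefix of the String never touches the undefined tail.
stringPrefix :: String
stringPrefix = take 3 partial

-- Packing has to walk the whole list, so it hits the undefined tail
-- even though we only ask for the first three bytes afterwards.
packedPrefix :: IO (Either SomeException B.ByteString)
packedPrefix = try (evaluate (B.take 3 (B.pack partial)))

main :: IO ()
main = do
  putStrLn stringPrefix
  r <- packedPrefix
  putStrLn (either (const "pack hit undefined") B.unpack r)
```

The String version should print foo, while the packed version should report that it hit undefined. A compiler that swapped one representation for the other behind your back would change the meaning of the program.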

2. ByteString makes text strings faster. But what about other kinds of collections? Can't we do something similar to them that makes them go faster?

Not as easily. The big wins with ByteString are, as you observe, that the data are tiny, uniformly sized, and easily unboxed (though using ForeignPtr seems to be a significant win compared to UArray, too). This also applies to other basic types like Int and Double, but leave those behind, and you get problems.

If your type is an instance of Storable, it's going to have a uniform size, but it might be expensive to flatten and unflatten it, so who knows whether or not it's truly beneficial. If it's not an instance of Storable, you have to store an array of boxed values, and we know that arrays of boxes have crummy locality of reference.
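To make the flatten/unflatten cost concrete, here is a rough sketch of a Storable instance. The Point type and the roundTrip helper are hypothetical, not taken from StorableVector or any other library:

```haskell
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (castPtr)
import Foreign.Storable (Storable (..))

-- Hypothetical record type, used only for illustration.
data Point = Point Double Double
  deriving (Eq, Show)

instance Storable Point where
  -- Every Point takes the same space: two Doubles side by side.
  sizeOf _ = 2 * sizeOf (undefined :: Double)
  alignment _ = alignment (undefined :: Double)
  -- Unflatten: read two Doubles and rebuild the boxed record.
  peek p = do
    x <- peekElemOff (castPtr p) 0
    y <- peekElemOff (castPtr p) 1
    return (Point x y)
  -- Flatten: take the record apart and write its fields out.
  poke p (Point x y) = do
    pokeElemOff (castPtr p) 0 x
    pokeElemOff (castPtr p) 1 y

-- Write a Point into temporary memory and read it straight back.
roundTrip :: Point -> IO Point
roundTrip pt = alloca $ \p -> poke p pt >> peek p

main :: IO ()
main = roundTrip (Point 1.5 2.5) >>= print
```

Every element access in a flat vector of Point goes through peek or poke, so the record is rebuilt or taken apart each time. For two Doubles that is cheap; for a type with many fields or nested structure it may well eat the locality win.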

Spencer Janssen hacked up the ByteString code to produce StorableVector as part of last year's SoC, but it never got finished off:

http://darcs.haskell.org/SoC/fps-soc/Data/StorableVector/

More recently, we've been pinning our hopes on the new list fusion stuff to give many of the locality of reference benefits of StorableVector with fewer restrictions, and all the heavy work done in a library.

        <b
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
