Don Stewart wrote:
Alex Mason wrote:
TernaryTrees is a package that extends Data.Set and Data.Map with some ternary tree structures, based on the article [] .
For the string (or rather ByteString) version:

Which has a number of other significant performance improvements (e.g. node fusion, ByteString instead of String) and a highly expressive interface. Because it uses ByteStrings it can trie any type which can be serialized into a vector of bits[1], albeit indirectly.

The real trick with tries is not in just having them[2], it's in having the right interface to make use of what they're good at. For example, if I have multiple tries, I'd like to merge them without doing it element by element[3]. Or if I know I'm going to be making a number of similar queries, it'd be nice if I could cache my position in the trie[4] to avoid repeating the work for the prefixes of all my queries[5]. Using tricks like these leads to significant improvements over using them like hashtables; tries aren't hashtables just like lists aren't arrays.
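
To make the interface point concrete, here is a minimal sketch of those two operations on a toy trie. Everything in it is an assumption for illustration: the names `Trie`, `descend`, `union` and `lookupT` are hypothetical and not the API of any package mentioned here, and it uses String keys with Data.Map children rather than ByteStrings and fused nodes.

```haskell
import Control.Applicative ((<|>))
import qualified Data.Map.Strict as M
import Data.Maybe (fromMaybe)

-- Toy trie: an optional value at this node plus children keyed by Char.
-- (Illustrative only; a real implementation would key on ByteString.)
data Trie a = Trie (Maybe a) (M.Map Char (Trie a))

empty :: Trie a
empty = Trie Nothing M.empty

insert :: String -> a -> Trie a -> Trie a
insert []     v (Trie _ cs) = Trie (Just v) cs
insert (k:ks) v (Trie x cs) =
  Trie x (M.insert k (insert ks v child) cs)
  where child = fromMaybe empty (M.lookup k cs)

fromList :: [(String, a)] -> Trie a
fromList = foldr (uncurry insert) empty

-- Walk down a prefix once. The subtrie that comes back is the
-- "cached position": later queries start from it, not from the root.
descend :: String -> Trie a -> Maybe (Trie a)
descend []     t           = Just t
descend (k:ks) (Trie _ cs) = M.lookup k cs >>= descend ks

lookupT :: String -> Trie a -> Maybe a
lookupT ks t = descend ks t >>= \(Trie x _) -> x

-- Structural merge, left-biased on colliding keys. Branches present in
-- only one trie are reused whole instead of being re-inserted key by key.
union :: Trie a -> Trie a -> Trie a
union (Trie x cs) (Trie y ds) = Trie (x <|> y) (M.unionWith union cs ds)
```

With this sketch, `descend "ca" t` is paid for once, and `lookupT "r"` or `lookupT "t"` on the result then answers the queries for "car" and "cat" without re-walking the shared prefix.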

Do you have benchmarks?

Somewhere in my email archive (care of Mark Wotton). I'll see if I can dig them up this weekend. The biggest issue here is finding nice datasets (and tasks) to give reasonable benchmarks for. Reading in all of /usr/dict (or the Brown corpus) and looking up all keys only gives one perspective (or two), and not necessarily the most helpful one for "real world" use. I haven't found any good dataset/task suites like there are for the Language Benchmarks Game, though I'd love to hear about one.

The tries /= hashtables comment stems from discussions on various Haskell blogs with people inventing their own (or wanting to benchmark Data.Map vs hashtables vs tries vs Bloom filters). As a drop-in replacement tries will perform adequately, but they're nothing overwhelming; the overwhelming comes from changing the usage algorithms to match the "stride" of the data structure. I don't think I have links to these discussions anymore to pull up code examples.

Live well,
Haskell-Cafe mailing list
