Hi all, attached the code of two new eggs.
llrb-syntax contains a syntax-rules implementation of left-leaning red-black trees. It expands code maintaining LLRB trees in a data structure defined by the user. Depending at the use case it will either allocate temporary space (in the pure case) or allocate no temporary space at all and use the minimum amount of mutations to maintain the LLRB tree in place. llrb-tree uses llrb-syntax to implement generic alternatives for assoc-list drop-ins (having a much better lookup behavior) and srfi-69 hash table drop-ins. Plus versions specialized for fixnums, strings and symbols as key. Both have a README, which should (hopefully) be ready to be pasted into the wiki. (Can't test the syntax, but hey, it's not that much I can have done wrong.) As I don't see the point in hosting my own infrastructure to publish them (after all I assume that once I'm past the inevitable bugs an shortcomings those will not see frequent updates by nature), I'd prefer if those would eventually go into the CHICKEN projects repository. (Where I only have r/o access so far.) --- While preparing I learned something I want to share. CHICKEN is always good for a surprise. And I'm astonished to say the least. Something is strange here. 1) Intuitively I'd assume that conserving space to reduce the load on the garbage collector should pay off. Even for CHCIKEN, which is supposed to be highly effective on temporary allocations. The numbers (included in "numbers.txt" in the tarball, though those should probably not make it into svn) did prove me wrong. The version specialized to fixnum keys is roughly 4x slower when running allocation free (thus loading the mutation stack) than the pure version. The same I found to be true for mutating versions having string-keys. 2) I never had the idea to actually use LLRB trees as a *replacement* for hash tables. After all LLRB maintains an order relation and is according to theories and urban legends *supposed* to be slower. See yourself: in the fixnum case, llrb is almost twice as fast as the corresponding hash table. (1M inserts in 90 seconds for Hashtable vs. 46 seconds for LLRB; References are even faster 1M refs in 19sec. vs. 10sec. for LLRB.) I'm astonished. Even the LLRB build using generic = and < is almost a fast and much faster than Hashtables. 3) Things become slightly more confusing when taking symbols as key. For those hash tables appear about 30% faster than LLRB on inserts and about 5% faster on referencing. (However randomization, which is commented out in the attached source was still active during the test. Maybe there's something hidden.) >From memory: before I tested with smaller numbers of keys, which resulted in the opposite finding: LLRB was found relative faster for smaller numbers of keys. After all it's O(log n), thus this is expected. 4) string->symbol puts #3 upside down: string->symbol 1000000 calls in 111935.0 ms caching the result in a LLRB-string tree naturally adds overhead on the first call: str2sym 1000000 refs in 185748.0ms – almost twice. Wait, what? *Almost* twice? How comes? Should have been around 230% according to #3. But on the second call string->symbol still takes those 112403.0ms, while the LLRB-cached version needs only 16321.0ms. I guess something is wrong in string->symbol. So far I've been unable to spot it. But who would have predicted that a balanced tree could be sorta faster than a hash table? Me not. Have a nice day! /Jörg PS: there's a tblbench.scm in the tarball as well, which produced those numbers. After all that's kind of a benchmark and if there's something lying to me, then it's probably benchmark.
llrb.tar.gz
Description: application/gzip
_______________________________________________ Chicken-users mailing list [email protected] https://lists.nongnu.org/mailman/listinfo/chicken-users
