Thanks Ian, that is some good analysis.

> - If you are doing a significant amount of deserialization with lots of
> threads then you should know that each deserialization requires a call
> to (with-lock ...) to ensure that the shared pool of buffer streams is
> thread safe (a problem with elephant < 0.9). This could conceivably
> cause a lockup if there are lots of small deserializations happening
> concurrently across threads mapping over the same Btrees.

I had a vague suspicion of something like that, but only looked at
transactions. I guess I would have to modify elephant to allow me to do
the locking to solve such a problem.

> Are you sure
> it's GC that's eating all the time, or non-lisp CPU time in general?

Well, the 99% CPU is reported for the sbcl process. I only know that
manually invoking a gc will trigger the problem.

> Although it breaks the abstraction barrier, using IDs will be a definite
> gain. You'd just make that second BTree pairs of word-freq / obj-oid.
> Then you use the OID and object type to grab the object directly from
> elephant: (elephant::get-cached-instance oid classname)

I have also been considering doing away with the second layer of BTrees
and using my own, more "linear" structures. I'm not sure what that could
look like exactly, though.

> You might be better off, performance
> wise, doing this in a C full-text indexing system and wrapping an
> interface to it.

I hadn't thought of that yet. Can you recommend any?

Anyway, I guess I was asking for trouble a bit with my setup. I'm not sure
how I'll proceed yet, but if I stick to the two-level BTree setup and use
IDs, I know what to look out for.

Thanks again,
Chris

_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel