I checked: Rucksack writes dirty objects to disk on each commit, so it should have the same performance profile as our existing backends, of course with different constants...

Ian

On May 16, 2008, at 11:26 AM, Ian Eslick wrote:



I just wanted to clarify a few points in this interesting discussion.

But if we are using a high-level language like Lisp, it's much more likely that CPU overhead will be large compared to memory latency. And since compression, even the simplest/fastest kind, will only add to CPU overhead, I'm very skeptical about its benefits.

I doubt that using a Lisp with a modern compiler affects CPU time in any interesting way; I've observed that an algorithm implemented with reasonable data structure choices is within tens of percent of the same algorithm in C. I don't care how fast your CPU is: data access patterns dominate time unless you're working on small or highly local datasets, in which case the inner loop speed _might_ matter.

By the way, an interesting fact: some time ago I found that even with the cache enabled in Elephant/Postmodern, slot reads are relatively slow -- on the scale of 10 thousand per second or so. It was quite surprising that the offending function is form-slot-key (which is used in CL-SQL too):

(defun form-slot-key (oid name)
  ;; Builds the string cache key from the object id and slot name.
  (format nil "~A ~A" oid name))

At least in Allegro, the profiler only counts thread time, and much of the delay shown by 'time' vs. the profiler is taken up waiting on cache-miss I/O. It may be that thread time is only a handful of percentage points of the total time?

Format is notoriously slow. Still, I find it hard to believe that it matters that much. (See above.)
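To see the gap for yourself, here is a minimal sketch (MY-CLASS and the NAME slot are placeholders; get-instances-by-class is Elephant's accessor):

(time
 (dolist (obj (ele:get-instances-by-class 'my-class))
   (slot-value obj 'name)))
;; If TIME reports far more real time than run time, the difference
;; is mostly spent waiting on cache-miss I/O, which a thread-time
;; profiler never sees.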

And even more surprising was to find out that there is no portable and fast way to convert an integer to a string. So I had to use SBCL's internal function called sb-impl::quick-integer-to-string -- apparently SBCL developers knew about this problem, so they made this function for themselves.

For fun, I did a test with princ: for integers only it's 2x faster, and for the above idiom it's only about 50% faster, but some of that is due to the string-stream approach. If you do it manually as they did, writing to an array one character at a time, it can get pretty fast -- but probably only 3x-4x faster.
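For concreteness, a minimal sketch of that manual approach -- filling a preallocated string one character at a time. The function name and the non-negative restriction are mine; this is an illustration, not SBCL's actual sb-impl::quick-integer-to-string:

(defun digits-to-string (n)
  "Render a non-negative integer as decimal by writing digits
into a preallocated string, avoiding FORMAT/PRINC entirely."
  (if (zerop n)
      "0"
      (let* ((len (loop for m = n then (truncate m 10)
                        while (plusp m)
                        count t))
             (s (make-string len)))
        (loop for i from (1- len) downto 0
              do (multiple-value-bind (q r) (truncate n 10)
                   (setf (char s i) (code-char (+ (char-code #\0) r)))
                   (setf n q)))
        s)))

The win comes from skipping FORMAT's directive interpreter and any intermediate stream entirely.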

So I think that subtle effects like I/O bandwidth will only be seen if we hardcore-optimize Elephant and the applications themselves. But as it is now, the storage backend won't have a big influence on overall performance and just needs to be moderately good.

I agree with this in general. However, I feel like this whole discussion is premature optimization being done while blindfolded. I do think that over time it would be nice to give people tools like compression to enable/disable for their particular applications, but it's probably not worth all the debate until we solve some of the bigger problems in our copious free time.

By the way, if you are caching your data in memory and using the most conservative transaction model of BDB (blocking commits), you are write-bandwidth limited. At the end of each little transaction you have to write a log entry to disk, then flush a 16KB page to disk, then wait for the controller to say OK, then write the log again, then return to your regularly scheduled programming. In prevalence, you only write the log and occasionally dump to disk. I suspect the SQL backends function similarly: when you finish a transaction, you have to wait for all of this to happen over a socket...
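To make that cost concrete, a sketch using Elephant's with-transaction (the OBJECTS batch and COUNTER slot are hypothetical) of how batching amortizes the per-commit log flush:

;; Slow: one synchronous log flush per object.
(dolist (obj objects)
  (ele:with-transaction ()
    (incf (slot-value obj 'counter))))

;; Faster: one commit, hence one log flush, for the whole batch.
(ele:with-transaction ()
  (dolist (obj objects)
    (incf (slot-value obj 'counter))))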

Prevalence is much, much faster because you don't have to flush data structures on each commit, so cl-prevalence performance with Elephant's data and transaction abstractions would be a really nice design point. I wonder if we would get some of this benefit from a Rucksack adaptation?
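For comparison, a toy sketch of a prevalence-style commit -- just an append to an already-open log stream, with no page flush (LOG-STREAM and the readable-printing convention are assumptions of the sketch, not cl-prevalence's API):

(defun prevalence-commit (log-stream op)
  ;; Serialize the transaction as a readable form and append it;
  ;; durability costs one sequential write, not a page dump.
  (prin1 op log-stream)
  (terpri log-stream)
  (finish-output log-stream))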

Ian


_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel
