I suspect the raw IO costs are going to dominate the query compilation time, and for straight scans and point reads, which I imagine to be the majority of the queries, the optimizers will find a good plan trivially. An interesting experiment would be to take a simple benchmark that stores a million objects of size x, and compare it to the same benchmark storing objects of size (* 2 x).
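A rough sketch of that experiment might look like the following. It is only an illustration under loud assumptions: it uses Python's standard `sqlite3` module as a stand-in store rather than Elephant or any of its backends, and the names `store_objects` and `compare_sizes` are made up for this example. The idea is just to time the same insert workload at size x and size 2x and see how the elapsed time scales.

```python
import os
import sqlite3
import tempfile
import time


def store_objects(path, count, size):
    """Store `count` blobs of `size` bytes in a fresh database; return elapsed seconds.

    Assumption: SQLite stands in for the real object store here.
    """
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY, payload BLOB)")
    # One random blob is reused so that data generation stays out of the timing.
    payload = os.urandom(size)
    start = time.perf_counter()
    with conn:  # one transaction, committed when the block exits
        conn.executemany(
            "INSERT INTO objects (id, payload) VALUES (?, ?)",
            ((i, payload) for i in range(count)),
        )
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed


def compare_sizes(count, x):
    """Run the same store benchmark with objects of size x and 2*x."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for size in (x, 2 * x):
            path = os.path.join(tmp, f"bench_{size}.db")
            results[size] = store_objects(path, count, size)
    return results
```

Calling `compare_sizes(1_000_000, x)` for some chosen x returns a dictionary mapping each object size to its elapsed time. If the time for 2x comes out close to double the time for x, that would suggest raw I/O dominates; if it barely moves, per-query overhead is probably the larger share.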
For the way that I'm using Elephant, it's just plain raw I/O, and that
is why BerkeleyDB is faster. Of course, in my personal usage, I use DCM
to do a huge amount of data caching...

One of the things that interests me is the inclusion of a performance
benchmark that we can share with the relational people. For example, we
might be able to support osdb:

http://osdb.sourceforge.net/

(Part of this is driven by my desire, which Ian shares, to have a
powerful example application as part of our documentation.)

The real problem here is that Elephant is so different, and so much
more powerful and convenient than a relational system, that one really
needs to use an "application" benchmark, not a "relational" benchmark.
If we use a relational benchmark, we are tying our hands behind our
back.

Once we release 0.9, the integration of the "postmodern" backend is
going to be my highest priority. The model I will move towards is: a
CLSQL backend for general databases and as a starting point for writing
database-specific backends, plus (hopefully) a collection of
database-specific backends that can provide better performance.

The one thing I don't ever want to do is give up backend-independence.
I hope I'm not the only person who sees tremendous benefit in being
able to switch your backend implementation choice at any time.

On Wed, 2007-04-11 at 08:16 -0400, Ian Eslick wrote:
> It's interesting that there is only a little performance advantage.
> Have you done some profiling to see where the time is going? It
> sounds like either the queries were simple enough that the
> compilation step was trivial or that we're seeing Amdahl's law and
> the SQL costs are swamped by some other activity.
>
> On Apr 11, 2007, at 4:30 AM, Henrik Hjelte wrote:
>
> > Regarding stored procedures, I agree with Ian that the main
> > performance advantage that comes from them is that the query
> > planning is prepared in advance. This is also done if you use
> > prepared SQL statements, so they give the same advantage. Stored
> > procedures can however be faster if they involve several steps,
> > because then you won't have to send intermediate results to the
> > client and then back to the server. What you should avoid for
> > performance reasons is repeatedly sending strings to parse and
> > execute.
> >
> > I have really tried to optimize the postmodern backend for speed;
> > still, it is slower than BerkeleyDB. The postmodern backend uses
> > prepared statements for almost everything "simple"; I could not
> > measure any performance advantage from using stored procedures for
> > this. There is one stored procedure left because it involves
> > several steps, so in theory it can be faster (compared to a couple
> > of prepared statements), but I haven't actually measured if and how
> > much faster.
> >
> > Negative: stored procedures for the clsql backend will definitely
> > remove portability between databases. Positive: a little faster.
> > But I am totally convinced that stored procedures will not bring
> > clsql even close to the performance of BerkeleyDB.
> >
> > /Henrik Hjelte
> >
> > On Tue, 2007-04-03 at 19:31 +0200, Pierre THIERRY wrote:
> >> Robert L. Read wrote on 03/04/2007 at 11:08:
> >>> Stored procedures tend to not be very portable; therefore to put
> >>> them in the current "postgres" backend, which should really be
> >>> called a "clsql" backend, would make it less likely to work with
> >>> MySQL.
> >>
> >> I was thinking of having some PostgreSQL-specific bits within the
> >> clsql backend. That would apply to MySQL or any other DB that can
> >> use stored procedures to make some queries faster.
> >>
> >>> However, this raises an interesting question: Is performance a
> >>> significant problem (at least for the Postgres users)? If you had
> >>> a "wish list" for Elephant features, would better performance be
> >>> at the top?
> >>
> >> I just don't want to be limiting. The only way to go seemed to me
> >> to be to benchmark various uses of stored procedures. On the other
> >> hand, having a cache for read queries, as was discussed earlier,
> >> could well make the stored procedures useless. Or not. Well, we
> >> need to measure.
> >>
> >> Doubtfully,
> >> Pierre
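To make the quoted point about "repeatedly sending strings to parse and execute" concrete, here is a small sketch. It is an illustration only, under stated assumptions: it uses Python's standard `sqlite3` module, not PostgreSQL or the postmodern backend, and the table `kv` and all function names are invented for the example. In `sqlite3`, a parameterized query keeps the same SQL text on every call, so the compiled statement can be reused from the module's statement cache, while a query built by string formatting presents new text each time.

```python
import sqlite3
import time


def make_db(rows):
    """Build an in-memory table of `rows` key/value pairs (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v TEXT)")
    conn.executemany(
        "INSERT INTO kv VALUES (?, ?)",
        ((i, f"value-{i}") for i in range(rows)),
    )
    conn.commit()
    return conn


def lookup_with_new_strings(conn, keys):
    # Every call sends different SQL text, so each one is parsed and planned anew.
    return [
        conn.execute(f"SELECT v FROM kv WHERE k = {int(k)}").fetchone()[0]
        for k in keys
    ]


def lookup_with_parameters(conn, keys):
    # One SQL text with a placeholder; the compiled statement can be reused.
    sql = "SELECT v FROM kv WHERE k = ?"
    return [conn.execute(sql, (k,)).fetchone()[0] for k in keys]


def timed(fn, conn, keys):
    """Return (result, elapsed seconds) for one lookup strategy."""
    start = time.perf_counter()
    result = fn(conn, keys)
    return result, time.perf_counter() - start
```

Running `timed` on both lookup functions over the same keys gives a first impression of how much of the cost is parsing and planning. As the thread says, the real answer for any given backend has to come from measuring it there.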
_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel