Re: [ZODB-Dev] Blob directory structure scalability limits

2008-06-19 Thread Martijn Faassen
Hi there,

On Thu, Jun 19, 2008 at 1:38 PM, Christian Theune <[EMAIL PROTECTED]> wrote:
[snip]
> We propose to keep both implementations around and allow to select which one
> to use. We would extend the FileSystemHelper to abstract the two strategies.

Could the default be to choose the more scalable one, or would this
lead to backwards compatibility issues?
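
To make the quoted proposal a bit more concrete, here is a minimal sketch of what two selectable directory strategies could look like. The names `FlatLayout`, `NestedLayout`, `LAYOUTS` and `blob_dir` are made up for illustration; this is not the FileSystemHelper code, and the real path schemes may well differ.

```python
import posixpath


class FlatLayout:
    """Hypothetical strategy: one directory per object id, all in one parent."""

    def oid_to_path(self, oid: bytes) -> str:
        return "0x" + oid.hex()


class NestedLayout:
    """Hypothetical strategy: one directory level per byte of the object id,
    so no single directory ever holds more than 256 entries."""

    def oid_to_path(self, oid: bytes) -> str:
        return posixpath.join(*("0x%02x" % byte for byte in oid))


# Assumed registry so that a configuration option can pick a strategy by name.
LAYOUTS = {"flat": FlatLayout, "nested": NestedLayout}


def blob_dir(base: str, oid: bytes, layout: str = "flat") -> str:
    """Return the directory for an object's blobs under the chosen layout."""
    return posixpath.join(base, LAYOUTS[layout]().oid_to_path(oid))


oid = (258).to_bytes(8, "big")             # an 8-byte object id
print(blob_dir("/var/blobs", oid, "flat"))
print(blob_dir("/var/blobs", oid, "nested"))
```

In this sketch the older flat layout is the default, which is one way to stay backwards compatible; whether that or the more scalable layout should be the default is exactly the open question.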

Regards,

Martijn
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Re: Ruby/Smalltalk OODB

2008-06-03 Thread Martijn Faassen
Hey,

On Tue, Jun 3, 2008 at 7:32 PM, Benji York <[EMAIL PROTECTED]> wrote:
> On Tue, Jun 3, 2008 at 1:13 PM, Gary Poster <[EMAIL PROTECTED]> wrote:
>>
>> What you *can't* do out of the box is ask "hey, what products have an
>> attribute that points to this brand?".  That's a back-reference, and that
>> needs solutions like the ones to which I was referring.
>
> In other words: if you want an index of references (or any attribute for
> that matter), you have to set it up (just as you would for a RDBMS).

Or what you'd do with in-memory Python objects, actually.
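
A small sketch of that point with plain in-memory Python objects. The `Brand` and `Product` classes and the `products_by_brand` index are invented for the example; nothing here is ZODB-specific.

```python
from collections import defaultdict


class Brand:
    def __init__(self, name):
        self.name = name


class Product:
    def __init__(self, name, brand):
        self.name = name
        self.brand = brand            # forward reference: product -> brand


# The back-reference index (brand -> products) is maintained by the application.
products_by_brand = defaultdict(list)


def add_product(name, brand):
    product = Product(name, brand)
    products_by_brand[brand].append(product)
    return product


acme = Brand("Acme")
other = Brand("Other")
add_product("Anvil", acme)
add_product("Rocket", acme)
add_product("Soap", other)

# "Which products point to this brand?" is answered by the index,
# not by the brand object itself.
names = [product.name for product in products_by_brand[acme]]
print(names)
```

Following `product.brand` works out of the box; going the other way only works because `add_product` keeps the index up to date.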

Regards,

Martijn


[ZODB-Dev] What the ZODB really does

2008-06-03 Thread Martijn Faassen

Hi there,

The discussion on Gemstone isn't the first time that people seem to 
think the ZODB doesn't do what it actually does. If you have object A 
that points to B, and object C that also points to B, updating B will
update B truly, and A and C will both be aware of this change.


Quite a few smart people seem to be under the impression the ZODB 
doesn't actually take care of this. They believe that B will be 
persisted twice, once for A, once for C. This may be the case if B 
doesn't inherit from Persistent, but if it does, there really will only 
be that instance.
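
The difference between "B embedded in each owner" and "B stored once and referenced" can be sketched with the standard library's pickle module and its persistent_id hook. This is a toy illustration only: the record store, the `RefPickler` and `RefUnpickler` classes and the helper functions are assumptions made up for the example, not ZODB code.

```python
import io
import pickle

# B is shared: both A and C hold a reference to the very same object.
b = {"name": "B", "value": 1}
a = {"name": "A", "ref": b}
c = {"name": "C", "ref": b}

# Case 1: A and C are serialized on their own, each embedding B by value.
a_copy = pickle.loads(pickle.dumps(a))
c_copy = pickle.loads(pickle.dumps(c))
embedded_shared = a_copy["ref"] is c_copy["ref"]   # two independent copies of B


class RefPickler(pickle.Pickler):
    """Writes a short reference for registered objects instead of their contents."""

    def __init__(self, file, refs):
        super().__init__(file)
        self.refs = refs               # maps id(object) -> record key

    def persistent_id(self, obj):
        return self.refs.get(id(obj))  # None means "pickle by value as usual"


class RefUnpickler(pickle.Unpickler):
    """Turns stored references back into objects through a resolver callback."""

    def __init__(self, file, resolve):
        super().__init__(file)
        self.resolve = resolve

    def persistent_load(self, pid):
        return self.resolve(pid)


def dump_record(obj, keys):
    # The object being written must itself be stored by value, so leave it out.
    refs = {i: key for i, key in keys.items() if i != id(obj)}
    buf = io.BytesIO()
    RefPickler(buf, refs).dump(obj)
    return buf.getvalue()


def load_all(records):
    cache = {}                         # one live object per record key

    def resolve(key):
        if key not in cache:
            cache[key] = RefUnpickler(io.BytesIO(records[key]), resolve).load()
        return cache[key]

    return {key: resolve(key) for key in records}


# Case 2: every object gets its own record; A and C only store a reference to B.
keys = {id(a): "a", id(b): "b", id(c): "c"}
records = {key: dump_record(obj, keys) for key, obj in (("a", a), ("b", b), ("c", c))}
loaded = load_all(records)
record_shared = loaded["a"]["ref"] is loaded["c"]["ref"]

loaded["b"]["value"] = 2               # update B once...
print(embedded_shared, record_shared, loaded["a"]["ref"]["value"])
```

In the first case there are two Bs after loading; in the second there is one, and a change to it is visible through both A and C.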


The ZODB is quite transparent that way. It's almost exactly like a pool 
(typically all stored in the same root dictionary) of Python objects. 
They can reference each other just fine. The only requirements I know of 
are:


* if you don't inherit your class from Persistent, or use a Python
builtin (which doesn't inherit from Persistent), things will be
serialized multiple times, as far as I'm aware. (I may be wrong)


* if you have a non-Persistent subobject (like a list) and you change 
it, you need to manually flag the persistence machinery on the object 
that its subobject changed, with _p_changed. This is *only* necessary if 
some of the objects are not Persistent. For common built-in collections 
in Python such as list and dictionary there are replacements 
(PersistentList, PersistentMapping), and more advanced building blocks 
for indexes (BTrees), that don't have this issue.
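
Why an in-place change to a plain list goes unnoticed can be shown with a toy stand-in. The `Tracked` class below is an assumption for illustration, not how Persistent is implemented; it only mimics the idea that assigning an attribute is what gets noticed, with `changed` playing the role of _p_changed.

```python
class Tracked:
    """Toy stand-in for a Persistent base class (not ZODB code): it marks
    itself as changed whenever one of its attributes is assigned."""

    def __init__(self):
        object.__setattr__(self, "changed", False)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        if name != "changed":
            object.__setattr__(self, "changed", True)


class Basket(Tracked):
    def __init__(self):
        super().__init__()
        self.items = []        # a plain list, which knows nothing about tracking
        self.changed = False   # pretend the new basket was just committed


basket = Basket()

basket.items.append("apple")   # mutates the list in place: no attribute assignment
seen_after_append = basket.changed

basket.changed = True          # manual flag, playing the role of _p_changed
seen_after_flag = basket.changed

print(seen_after_append, seen_after_flag)
```

A list type that reports its own changes would make the manual flag unnecessary, which is the job PersistentList and PersistentMapping do in the real thing.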


Anyway, the misapprehension that the ZODB somehow does less than it does 
seems to be an easy one for people to develop. I think it would be 
important to show on the ZODB home page that this is truly not the case. 
This needs to be repeated early and often.


The *reason* people don't use hard references like this all the time in 
an application is that sometimes you want back references, and sometimes 
you want loose coupling. So that's when things are referenced by a 
string or a lookup or whatnot. Just like you sometimes put Python 
objects in a dictionary with keys. The wide application of such soft 
references seems to give people the impression you can't point the same 
object from tree X and Y at the same time.


Regards,

Martijn



Re: [ZODB-Dev] Re: Ruby/Smalltalk OODB

2008-06-03 Thread Martijn Faassen
Hey,

On Tue, Jun 3, 2008 at 6:09 PM, Tim Cook <[EMAIL PROTECTED]> wrote:
[snip]
> But that's a feature and not a limitation.  :-)
>
> If I store patient A in demographics
> and a clinical entry B in ehr
>
> When I edit the clinical entry B my DBM had better NOT update B. It is
> supposed to create a new clinical entry that is referenced as a later
> version of B.

I don't get what you're saying here. The ZODB has transparent
references. An OODB should be like a normal collection of Python
objects, and references are transparent there.

Regards,

Martijn


Re: [ZODB-Dev] Re: Ruby/Smalltalk OODB

2008-06-03 Thread Martijn Faassen
On Tue, Jun 3, 2008 at 5:28 PM, Sean Allen <[EMAIL PROTECTED]> wrote:
[snip]
> What is different is that persistence is built into their vm.
> You just put anything into any global and its persisted.

Okay, that's like the PyPy approach and cannot be done really without
doing it on the interpreter level.

>> How does Gemstone implement efficient querying or indexing?
[snip]

Okay, this sounds like an indexing framework built into the database
layer, something the ZODB doesn't have, but of course has been built
on top with the catalog.

> i haven't looked into the specific details of how they wire it altogether
> but it comes down
> to, gemstone is a fullstack. whether you are using the smalltalk, java or
> eventual ruby...
> they write the vm which has primitives to make their ops fast, has built in
> persistence
> so you just dont think about it at all. in fact, you have to ask for a class
> to not be
> persistent.

Fullstack has its advantages, though also disadvantages. It means they
need to reimplement compliant interpreters for any language they want
to support, and that's going to hurt their library support. (as I
doubt arbitrary CPython extensions would work with a hypothetical
Python version of this)

[snip]
> theres an object store shared by all. there are multiple vms instances
> running whatever code and the
> entire thing can run across multiple machines... need to scale, add more
> machines in.

This is something ZEO also provides, as far as I can grasp from your
description.

> i'm still digging into it all, its only been 3 weeks so i still have a lot
> of the terminology wrong etc,
> but it really is a very cool product. not having to think about the data
> store is just real nice.
> its all just objects and you dont have to change anything about how you code
> for them unless
> you want to use indexes and then the changes are very minor.

I'd say that the ZODB by itself also doesn't put heavy requirements on
your code. The main thing is the subclassing from Persistent, and
_p_changed flags if you use non-persistent subobjects you still want
to persist.

For indexing, a framework like Zope 3 requires zero changes to the
classes themselves.

>>> pull it back out and there it is again, object pointers fully intact.
>>> store
>>> in 2 different directories, modify in one, blam! modified in the other.
>>
>> I'm not sure how this is different than using the root object to store
>> objects and ZEO?
>>
>
> if i have customer A who has order B
>
> and i store customer A to customer dictionary
> and order B to order dictionary
>
> then later  access order B from order dictionary, modify and update it
>
> does ZEO update the instance of order pointed to by customer A?
> I cant get it to do it. My understanding is it cant. Well, it could
> but it isnt 'right out of the box' seamless.

ZEO should do just that. I understand you have an object A which has a
reference to B. You also have a dictionary that has a reference to A,
and a dictionary that has a reference to B. Both A and the second
dictionary will be pointing to the same instance of B. (That holds if A
and B are both subclasses of Persistent. If not, B might be serialized
separately for each of them, I'm not sure.)
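
The customer/order scenario, written out with plain in-memory Python objects. This sketch does not use ZODB or ZEO at all; the `Customer` and `Order` classes and the two dictionaries are stand-ins, and the claim above is only that persistent objects are expected to behave the same way.

```python
class Order:
    def __init__(self, number, status):
        self.number = number
        self.status = status


class Customer:
    def __init__(self, name):
        self.name = name
        self.orders = []


customers = {}   # stand-in for a customer dictionary stored under the root
orders = {}      # stand-in for an order dictionary stored under the root

order_b = Order(1001, "new")
customer_a = Customer("A")
customer_a.orders.append(order_b)

customers["A"] = customer_a
orders[1001] = order_b

# Later: reach the order through the order dictionary and modify it.
orders[1001].status = "shipped"

# The customer sees the change, because there is only one Order instance.
same_instance = customers["A"].orders[0] is orders[1001]
print(same_instance, customers["A"].orders[0].status)
```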

> If you do that in gemstone, there is only one copy of Order B, no matter
> what variable in what dictionary you come at it from. And its drop dead
> simple.

> I looked at implementing that with zodb and moved along.

I'm confused. This has been the way the ZODB worked for a long time,
unless I'm really missing something in your description.

Regards,

Martijn


Re: [ZODB-Dev] Re: Ruby/Smalltalk OODB

2008-06-03 Thread Martijn Faassen
Hi there,

On Tue, Jun 3, 2008 at 3:28 PM, Sean Allen <[EMAIL PROTECTED]> wrote:
[snip]
> i dont think you can compare ZODB to gemstone's products. i've looked at
> both over the last few months with a decent level of depth and the gemstone
> from a programmer's standpoint is infinitely easier to use. no special
> classes, no nothing really. just put anything persistent in a global
> dictionary and blam its done.

The ZODB can be used that way; the root object *is* a global
dictionary. Admittedly Persistent is necessary to make sure that
attribute changes are persisted; this may be nicer in Gemstone. Or do
you mean any global dictionaries are persisted?

How does Gemstone implement efficient querying or indexing?

I know the PyPy people have a demo where multiple interpreters share
objects transparently, so perhaps this is closer to what Gemstone
does.

> pull it back out and there it is again, object pointers fully intact. store
> in 2 different directories, modify in one, blam! modified in the other.

I'm not sure how this is different than using the root object to store
objects and ZEO?

Regards,

Martijn


[ZODB-Dev] Re: Ruby/Smalltalk OODB

2008-06-02 Thread Martijn Faassen

Godefroid Chapelle wrote:
> Wichert Akkerman wrote:
>> Martijn Faassen wrote:
>>> What I find interesting is that Python has had such a thing for about
>>> a decade (the ZODB), and a mostly vaporware announcement in the Ruby
>>> world makes such a splash.
>>
>> That's because most things Zope are not very good with PR.
>
> Smart people are usually very demanding with themselves, which leads to
> the bad pattern of not advertising our good stuff because it is not
> perfect.
>
> Time to change our patterns, and advertise stuff before it is perfect
> ;-)


I don't need explanations of *why* we're not doing PR. I don't see the 
explanations as necessary in order to change the situation. It just 
takes a few people to step up and do the actual PR. The best PR at this 
point is a good collection of documentation in a site that looks like 
one for a serious open source project.


Lots of documentation on the ZODB is available, scattered around the web.
Let's gather it into one place for starters.


So, please, someone actually do something about this rotten situation? 
Finally? Please? Please?


Regards,

Martijn

P.S. This is not a complaint to all the people who are doing something. 
We just need a bit of effort from some more people to get this done.




Re: [ZODB-Dev] Re: Ruby/Smalltalk OODB

2008-06-02 Thread Martijn Faassen
Hi there,

On Mon, Jun 2, 2008 at 3:19 PM, Wichert Akkerman <[EMAIL PROTECTED]> wrote:
> Martijn Faassen wrote:
>> Christian Theune wrote:
>>>
>>> this might be interesting to ZODB users and developers:
>>> http://rss.slashdot.org/~r/Slashdot/slashdot/~3/302177093/article.pl
>>
>> What I find interesting is that Python has had such a thing for about a
>> decade (the ZODB), and a mostly vaporware announcement in the Ruby world
>> makes such a splash.
>
> That's because most things Zope are not very good with PR.

I know what the reason is. I just want to point it out again. Not
being good at it is not an excuse not to get better. It can be done,
it's just that it's not been a priority in people's minds.

>> The future is already here folks. So, what's the status of the
>> zodb.zope.org project to actually promote it better? It's easy to know what
>> to write on the homepage, just go to the Ruby buzz, translate the hype in
>> terms of the ZODB, tone it down some, and add that it's been battle-tested
>> for a decade.
>
> I assume you mean http://new.zope.org/projects/zodb ? It's there, waiting
> for the rest of that site to be finished.

Progress! Good.

Some criticisms next:

The site is nothing compared to what it could be concerning
documentation. It currently consists mostly of a huge number of
empty pages of documentation that will intimidate anyone who wants to
work on this site. I strongly recommend doing it the other way around:
mine the existing documentation and assemble this into a site, and
don't intimidate people with a lot of empty structure. It also holds
back the whole new.zope.org project itself, as lots of empty pages
will make people wait to publish the site until it's all filled,
meaning it will never actually happen.

Is anyone actually approaching ZODB developers to fill in pieces? The
only way any ZODB site is going to improve is if people on this very
mailing list will actually do something about it. It won't be me.

Finally, I still think this project is big enough to warrant its own
site, independent from the rest of zope.org. If you did this, you
wouldn't need the red letters that this can be used independently from
Zope, it'd be implicit, and people would really trust that this is
indeed the case instead of the nagging suspicion I think they'll end
up with now.

What I would do (but I'm not in charge of this and don't want to be)
is to put up a zodb.zope.org with what's there now, give everybody who
wants to add documentation access to add content, and then nag until
it contains some. Probably the current set of two real pages would be
enough to embarrass people into adding more.

Regards,

Martijn

P.S. Wichert, I know you're doing good work. I'm just hoping someone
on this mailing list is actually listening and wants to coordinate
this effort (and then actually does it).


[ZODB-Dev] Re: Ruby/Smalltalk OODB

2008-06-02 Thread Martijn Faassen

Hi there,

Christian Theune wrote:
> this might be interesting to ZODB users and developers:
> http://rss.slashdot.org/~r/Slashdot/slashdot/~3/302177093/article.pl


What I find interesting is that Python has had such a thing for about a 
decade (the ZODB), and a mostly vaporware announcement in the Ruby world 
makes such a splash.


The future is already here folks. So, what's the status of the 
zodb.zope.org project to actually promote it better? It's easy to know 
what to write on the homepage, just go to the Ruby buzz, translate the 
hype in terms of the ZODB, tone it down some, and add that it's been 
battle-tested for a decade.


(I realize GemStone is also battle-tested, but the Ruby version certainly is not)

Regards,

Martijn



[ZODB-Dev] Summer of Code: students please sign up!

2008-03-25 Thread Martijn Faassen

Hi there,

If you're a student and you want to hack on the ZODB this summer in the 
Google Summer of Code, sign up soon: the deadline is just a week away!


See here for more details:

http://faassen.n--tree.net/blog/view/weblog/2008/03/25/0

Regards,

Martijn




[ZODB-Dev] Re: Zeo Server as a single point of failure

2008-03-11 Thread Martijn Faassen

Hey,

David Pratt wrote:
[snip]
> I cannot work with sources in zope repository since it would require a
> contributor agreement with Zope Corp. I am unable to enter into this
> agreement for the foreseeable future. The zif collective
> (zif.sourceforge.net) is a way that I may contribute back to the zope
> community without the requirement of the agreement.


Just out of interest, would a contributor agreement with the Zope 
Foundation be any better for you?


Regards,

Martijn



[ZODB-Dev] Re: zodb.zope.org

2008-03-11 Thread Martijn Faassen

Hey,

Dirceu, this is great news! ZODB people, please help them!

Regards,

Martijn



[ZODB-Dev] zodb.zope.org

2008-03-06 Thread Martijn Faassen

Hi there,

There's so much energy on this list recently, so I figured I'd check 
again whether we can get zodb.zope.org going. We can talk about 
performance measurements, we can talk about RelStorage, there's endless 
amounts of good stuff to say, and we can mine up quite a few older 
documents that are relevant today (and could be edited to be completely 
up to date).


I think a zodb.zope.org would be a major step forward to attract 
attention to this very cool, very mature, very powerful technology that 
we're hiding away right now. Attracting attention might attract new 
users. Attracting new users might attract new developers who want to 
hack on the ZODB. New developers might mean they might implement 
whatever makes our dreams come true. Don't you guys want your dreams to 
come true? :)


Technology-wise I don't think we will have much of a problem either. We
need to get a hosting provider, but the Foundation can help with that. 
We'll need a simple web design - we've already been working on a design 
for zope.org that you may be able to reuse, and even getting a 
bare-bones new one won't be much work. Setting up a Plone site won't be
much of a problem either: both the new zope.org projects and also
grok.zope.org have buildouts and infrastructure ready for you. I suspect 
that grok.zope.org is close to what you need, and if you need more 
information I'm sure the people who worked on that can help.


What we need is one or two people who are willing to drive this effort 
so we get the site online and some basic content in. After that we
can give everybody a login into that site and they can start adding 
documents.


So, is anyone interested? I had some volunteers mail me last time, and I 
still have their email addresses, so you'll have a pre-selected group of 
helpers. :)


Regards,

Martijn



Re: [ZODB-Dev] Re: become a ZODB mentor in the Google Summer of Code!

2008-03-04 Thread Martijn Faassen
Hey,

On Tue, Mar 4, 2008 at 8:35 PM, Sidnei da Silva
<[EMAIL PROTECTED]> wrote:
[snip]
>  I'm on the list already, but just want to let you guys know that I've
>  found a victim^H^H^H^H^H volunteer for working on a ZODB project. I
>  entered the five proposals from the ZODB blueprints that he was
>  interested in (obviously he will only work on one!). I guess it's too
>  early to start casting votes?

Great! We first need to be accepted at all, and then see how many
projects we get. I'd like at least one of them to go to the ZODB.

Regards,

Martijn


[ZODB-Dev] Re: become a ZODB mentor in the Google Summer of Code!

2008-03-04 Thread Martijn Faassen

Hey,

Martijn Faassen wrote:
> It would be great if we put our community's secret gem, the ZODB, into
> the limelight more, and the Google Summer of Code would be a great
> opportunity. We need mentors, and fast, so if you want to mentor
> someone, please sign up in this wiki page here:
>
> http://wiki.zope.org/gsoc/SummerOfCode2008


No takers?

Regards,

Martijn



[ZODB-Dev] become a ZODB mentor in the Google Summer of Code!

2008-03-03 Thread Martijn Faassen
[I sent this off to the wrong mailing list before, but fortunately that 
gave me the chance to correct some content too :)]


Hi there,

It would be great if we put our community's secret gem, the ZODB, into 
the limelight more, and the Google Summer of Code would be a great 
opportunity. We need mentors, and fast, so if you want to mentor 
someone, please sign up in this wiki page here:


http://wiki.zope.org/gsoc/SummerOfCode2008

Ideas concerning ZODB improvements are also welcome on that page.

Besides the many technical ideas I'm sure you will all have, one topic I 
think would be very nice is for someone to actually work on a ZODB 
website. Unfortunately this cannot be a direct SoC project as the SoC 
focuses on code, not documentation, but it'd be a nice potential side
effect nonetheless. :)


The aim would be to approach Python developers who don't care about Zope 
at all, and tell them about the ZODB, how to get it, and how to join
our community. Python developers who *do* care about Zope are of
course more than welcome too, it should just be clear that this is for 
everybody.


Regards,

Martijn



[ZODB-Dev] Re: Getting started with ZODB

2007-09-19 Thread Martijn Faassen

Hey,

Concerning documentation and website presence of the ZODB, it'd be very 
nice if we had a zodb.zope.org. If anyone is interested in volunteering 
to help this get set up (mostly gathering existing info and organizing 
it in a nice new site), please send me an email.


Regards,

Martijn



[ZODB-Dev] some interesting benchmarks

2007-09-14 Thread Martijn Faassen

Hi there,

Recently there were two blog entries which did some simple benchmarks of
the ZODB and compared it to other databases. Possibly others hadn't noticed
those yet, so here are the references:


ZODB vs Relational Database: a simple benchmark

http://pyinsci.blogspot.com/2007/09/zodb-vs-relational-database-simple.html

ZODB vs Durus

http://pyinsci.blogspot.com/2007/09/zodb-vs-durus.html

Regards,

Martijn



[ZODB-Dev] Re: AW: diploma thesis: ZODB Indexing

2007-09-11 Thread Martijn Faassen

Jim Fulton wrote:
> On Sep 5, 2007, at 9:39 AM, Christian Theune wrote:
> [snip]
>> I also have the feeling that our goal for ad-hoc querying would be
>> incompatible with your envisioned framework for defining
>> collections and indexes.
>
> I guess I have no idea what you are talking about.
>
> I assumed you meant something along the lines of what people expect
> of relational databases.  In the relational world, people define
> tables and indexes in order to be able to do indexed ad-hoc queries.
> Maybe you are talking about something else.


It is interesting to compare with XML databases. Some XML databases like
MonetDB or eXist offer XPath queries into the database without anyone
having to pre-define indexes. Basically these databases tend to index
the entire tree structure. I'd suggest reading the eXist papers that are
available. I'd also take a look at MonetDB, as at its core it's a general
database system which has an RDB and an XML database built on top.


http://monetdb.cwi.nl/

I'm not sure how many of these ideas can be translated to work for 
Python structures - XPath is rather specific to XML, after all. But if 
anyone wants to talk about my ideas on all this, let me know. :)
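
To give a rough feel for what "index the entire tree structure" could mean for Python structures, here is a deliberately naive sketch. It is not how eXist or MonetDB work; the `index_tree` and `query` functions and the sample documents are invented for the example, and key paths stand in very loosely for XPath.

```python
from collections import defaultdict


def index_tree(document, doc_id, index, path=()):
    """Record every leaf value of a nested dict/list under its key path."""
    if isinstance(document, dict):
        for key, value in document.items():
            index_tree(value, doc_id, index, path + (key,))
    elif isinstance(document, list):
        for item in document:
            index_tree(item, doc_id, index, path)   # list positions are ignored
    else:
        index[path][document].add(doc_id)


def query(index, path, value):
    """Return the ids of documents that have `value` somewhere at `path`."""
    return sorted(index.get(tuple(path), {}).get(value, set()))


# path -> leaf value -> set of document ids; nothing is declared up front.
index = defaultdict(lambda: defaultdict(set))

documents = {
    1: {"title": "ZODB guide", "author": {"name": "Jim"}, "tags": ["zodb", "python"]},
    2: {"title": "XML notes", "author": {"name": "Martijn"}, "tags": ["xml", "python"]},
}
for doc_id, document in documents.items():
    index_tree(document, doc_id, index)

print(query(index, ["author", "name"], "Martijn"))
print(query(index, ["tags"], "python"))
```

Every path is indexed whether anyone queries it or not, so the price of not pre-defining indexes is paid in index size and write cost.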


Regards,

Martijn





[ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-25 Thread Martijn Faassen

Hey Jim,

Jim Fulton wrote:
> On Mar 25, 2007, at 12:33 PM, Martijn Faassen wrote:
> [snip]
>> I have the strong suspicion that modern relational databases are
>> currently better able to scale at queries using LIMIT and ORDER BY
>> than the Zope 3 catalog.
>
> I had a similar suspicion.  I assigned the Python Labs team the task
> of finding out through literature search the approaches used.  They
> found that there were none other than the sorts of things I've
> mentioned.


What about caching strategies? (as I sketched out in my last mail)

This article about MySQL claims that MySQL is the only database that 
does query result set caching. Surprising for such an obvious thought:


http://dev.mysql.com/tech-resources/articles/mysql-query-cache.html

Perhaps it doesn't work as well as one would think and that's why other 
database engines rejected it. :)



>> I cannot back this up as I haven't done measurements. Perhaps you
>> have done so?
>
> We did a literature search.


That's useful, but doesn't tell us very much about how they compare in
practice.

Perhaps someone should do measurements and see how the two compare in a
sort/batch use case. It shouldn't be too hard to set up a relational
database-based sorted batch along with a ZODB/catalog based sorted batch
and see how they both hold up.

>> * Do you estimate the performance of the Zope 3 catalog to be
>> equivalent to the performance of a modern relational database
>> system for queries that need to sort and batch their results?
>
> I estimate that the same issues apply to both.


Theoretical algorithm scalability is one thing, and the same issues
apply to both. Practical scalability might vary widely.

[snip]

> I think further improvements are warranted, but they will not achieve
> the goal that many people expect.


Okay, that brings the discussion forward.

To identify our goal it would be good if we did the above comparison 
with an RDB. We then know how much further we are able to go with the 
catalog. Not all the way to RDB performance probably, as they have an 
enormous headstart, but there may still be improvements to make. 
Obviously some of this is not easy, but an analysis of the performance 
characteristics of a search/sort/batch combination might still identify 
low hanging fruit. Or we might be surprised into the realisation there's 
no problem :)


Let's put the idea that there are silver bullets behind us; you've made 
your point. Let's instead determine how to move forward.


Regards,

Martijn



[ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-25 Thread Martijn Faassen

Hey Jim,

Jim Fulton wrote:
> On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote:
>> MF> I think one of the main limitations of the current catalog (and
>> MF> hurry.query) is efficient support for sorting and batching the query
>> MF> results. The Zope 3 catalog returns all matching results, which can then
>> MF> be sorted and batched. This will stop being scalable for large
>> MF> collections. A relational database is able to do this internally, and is
>> MF> potentially able to use optimizations there.
>
> What evidence do you have to support this assertion?


I have the strong suspicion that modern relational databases are 
currently better able to scale at queries using LIMIT and ORDER BY than 
the Zope 3 catalog. I cannot back this up as I haven't done 
measurements. Perhaps you have done so?


* Do you estimate the performance of the Zope 3 catalog to be equivalent 
to the performance of a modern relational database system for queries 
that need to sort and batch their results?


* If so, do you think it's just as easy for a developer to accomplish 
such equivalent performance with the Zope 3 catalog as it is with a 
relational database?


I've made a number of assertions:

a) one of the main limitations of the current catalog and hurry.query is 
efficient support for sorting and batching.


b) the Zope 3 catalog returns all matching results, which can then be 
sorted and batched. This will stop being scalable for large collections.


I'll amend b) by saying 'This will stop being scalable for large result 
sets'. I agree that b) as stated above is incorrect as the result set 
might be small, but I intended the amendment.


c) A relational database is able to do sorting and batching (limit 
queries) internally, and is potentially able to use optimizations here.


Which of these assertions are false?

Don't you think a relational database system that has support for sorting
and batching built into its query API can at the very least more easily
use approaches to reduce sorting cost, by rewriting the query, caching,
and potentially employing special indexes?


> We did some literature search on this a few years ago and found no
> special trick to avoid sorting costs.


I am at least cursorily aware of challenges surrounding efficient 
querying and batching. I am not looking for a special trick or magic 
bullet. I'd just like more help in avoiding sorting cost in a typical 
situation where results are displayed in a batched format.


If a catalog query returns 1 million results, which I want to show in 
batches of 10, sorted by some property of the results, I would like to 
reduce the costs. Currently the pattern I (and I imagine others) employ 
is to re-execute the query and then sort these results in memory for 
each batch, for each request.
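
The pattern just described, next to one cheaper alternative, as a sketch. The result set is a stand-in list of dictionaries rather than a real catalog query, and `batch_full_sort` and `batch_n_best` are hypothetical helper names; the second one uses heapq.nsmallest from the standard library to avoid sorting everything when only an early batch is needed.

```python
import heapq


def batch_full_sort(results, key, start, size):
    """The pattern described above: sort everything, then slice one batch."""
    return sorted(results, key=key)[start:start + size]


def batch_n_best(results, key, start, size):
    """Keep only the best start+size items while scanning the results."""
    return heapq.nsmallest(start + size, results, key=key)[start:]


def by_title(item):
    return item["title"]


# Stand-in for a large result set; a real one would come from a catalog query.
results = [{"id": n, "title": "doc %05d" % ((n * 7919) % 100000)} for n in range(100000)]

first = batch_n_best(results, by_title, 0, 10)
third = batch_n_best(results, by_title, 20, 10)
print([item["title"] for item in first[:3]])
```

This helps most for the first few batches; the further into the result set a batch lies, the closer the work gets to a full sort again. Whether this matches the N-best support mentioned elsewhere in the thread is not something the sketch claims.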


[you list some approaches to reduce sorting cost]

I would like some system that helps me reduce some of these costs, using 
the approaches you list, or at least some caching somewhere. I would 
imagine a relational database for instance can employ caching of result 
sets, so that if no writes occurred, a second LIMIT query asking for a 
different range will return results a lot faster.
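
A minimal sketch of that kind of result set caching, assuming the application can tell the cache about every write. The `CachedQuery` class, its `note_write` method and the toy data are all made up for illustration; neither the catalog nor any particular database is claimed to work this way.

```python
class CachedQuery:
    """Caches sorted result sets and throws them away after the next write."""

    def __init__(self, run_query):
        self.run_query = run_query   # callable: query -> iterable of result dicts
        self.generation = 0          # bumped on every write
        self.cache = {}              # (query, sort_field) -> (generation, sorted list)
        self.executions = 0          # how often the query really ran

    def note_write(self):
        self.generation += 1

    def batch(self, query, sort_field, start, size):
        entry = self.cache.get((query, sort_field))
        if entry is None or entry[0] != self.generation:
            self.executions += 1
            ordered = sorted(self.run_query(query), key=lambda row: row[sort_field])
            entry = (self.generation, ordered)
            self.cache[(query, sort_field)] = entry
        return entry[1][start:start + size]


data = [{"kind": "doc", "size": size} for size in (5, 3, 9, 1, 7)]


def run_query(kind):
    return [row for row in data if row["kind"] == kind]


cached = CachedQuery(run_query)
page1 = cached.batch("doc", "size", 0, 2)
page2 = cached.batch("doc", "size", 2, 2)   # served from the cached result set

data.append({"kind": "doc", "size": 0})     # a write...
cached.note_write()                         # ...invalidates what was cached
page1_again = cached.batch("doc", "size", 0, 2)
print(cached.executions)
```

Invalidating everything on any write is the crudest possible policy; it only pays off when reads clearly outnumber writes.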


Apparently the catalog does support N-best, you state later in this 
thread. How does one use this support? Can I add it to hurry.query somehow?


Perhaps all this is not the responsibility of the catalog itself, but a
system surrounding it. As long as it's obviously there for people to use.


Perhaps however I am seeing problems that aren't there?

Do you think there is no problem and we have parity with relational 
database implementations here?


Do you think the current situation cannot be improved much further?

Do you think any further improvements are not worth the costs?

Regards,

Martijn



[ZODB-Dev] Re: Community opinion about search+filter

2007-03-15 Thread Martijn Faassen

Hello,

Adam Groszer wrote:
> I'd like to ask your opinion, your experiences about searching and
> filtering in quite large object DBs.
> We need to add search and filter functions to our current app, where
> the user might be able to create quite _sophisticated_ filter criteria.
> (The app is a pure Z3 app, subject is document management)
>
> Currently we're looking at something based on catalog/indexes.
> As I checked the most comfortable solution would be based on
> hurry.query.
> Some questions arose:
> - Is it necessary/worth adding indexes on all attributes?
> - How does the index perform on modification and retrieval?
>
> The biggest problem is that this will be our first try, so we're
> missing experiences and are a bit puzzled about the right solution.
> Certain is that moving to RDB is not an option.


I think one of the main limitations of the current catalog (and 
hurry.query) is efficient support for sorting and batching the query 
results. The Zope 3 catalog returns all matching results, which can then 
be sorted and batched. This will stop being scalable for large 
collections. A relational database is able to do this internally, and is 
potentially able to use optimizations there.
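
For comparison, this is what sorting and batching inside a relational database looks like, sketched with the sqlite3 module from the standard library. The table, column and index names are invented for the example, and whether a given engine actually uses the index to avoid a sort is up to that engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE INDEX documents_title ON documents (title)")
conn.executemany(
    "INSERT INTO documents (id, title) VALUES (?, ?)",
    [(n, "doc %04d" % ((n * 37) % 1000)) for n in range(1000)],
)


def batch(conn, start, size):
    """Let the database sort the rows and cut out one batch of titles."""
    rows = conn.execute(
        "SELECT title FROM documents ORDER BY title LIMIT ? OFFSET ?",
        (size, start),
    )
    return [title for (title,) in rows]


first = batch(conn, 0, 3)
second = batch(conn, 3, 3)
print(first, second)
```

The application only ever receives the rows of the batch it asked for; the full result set never has to be materialized on the Python side.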


It would be very nice if someone could look into expanding hurry.query 
and/or the catalog to support these cases. It would be interesting to 
look at what Dieter Maurer has done with AdvancedQuery in Zope 2 in this 
regard as well.


Regards,

Martijn

