Re: [Zope3-dev] Re: [Zope3-checkins] SVN: Zope3/branches/tlotze/src/zope/interface/interface.py Simplifying some idioms, coding style.
Thomas Lotze wrote:

On Mon, 22 Aug 2005 11:18:44 -0400, Benji York wrote:

I didn't review your entire check in (and I realize that this is on a branch), but the non-use of setdefault there was probably intentional to keep from constructing empty dicts and lists when they may not be used.

A simple timing test suggests that using setdefault is actually faster, the construction of empty dicts and objects notwithstanding.

Great! I didn't mean to suggest it wasn't; I just wanted to point out the (possible) intent of the original coder. Perhaps they should have benchmarked it (like you did) instead of doing something they thought should be faster (premature optimization and all that).

ACK

I assume this means acknowledged as opposed to an exclamation of surprised disgust.

--
Benji York
Senior Software Engineer
Zope Corporation
___
Zope3-dev mailing list
Zope3-dev@zope.org
Unsub: http://mail.zope.org/mailman/options/zope3-dev/archive%40mail-archive.com
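For anyone who wants to repeat the comparison, here is a minimal sketch of the two idioms under discussion. It is not the code from interface.py; the function names and the sample data are made up for illustration, and the timings will vary by Python version and by how often keys repeat, so no particular result is implied.

```python
import timeit


def group_with_check(pairs):
    """Explicit membership test: a new list is built only for unseen keys."""
    groups = {}
    for key, value in pairs:
        if key not in groups:
            groups[key] = []
        groups[key].append(value)
    return groups


def group_with_setdefault(pairs):
    """setdefault: an empty list is constructed on every call, used or not."""
    groups = {}
    for key, value in pairs:
        groups.setdefault(key, []).append(value)
    return groups


# Hypothetical workload: 1000 values spread over 10 keys.
PAIRS = [(i % 10, i) for i in range(1000)]

if __name__ == "__main__":
    for func in (group_with_check, group_with_setdefault):
        seconds = timeit.timeit(lambda: func(PAIRS), number=200)
        print(f"{func.__name__}: {seconds:.3f}s")
```

Both functions return the same mapping, so the only question the benchmark answers is which one is cheaper for a given workload.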
[Zope3-dev] Re: Re: [Zope3-checkins] SVN: Zope3/branches/tlotze/src/zope/interface/interface.py Simplifying some idioms, coding style.
On Tue, 23 Aug 2005 08:36:22 -0400, Benji York wrote:

ACK

I assume this means acknowledged as opposed to an exclamation of surprised disgust.

ACK

--
Thomas Lotze
[Zope3-dev] Florent's O-R blog entry
I recently read Florent's object/relational blog entry at http://blogs.nuxeo.com/sections/blogs/florent_guillaume/2005_08_11_object_relational . It's getting a bit old now, but I didn't see much discussion (or a way to make a comment) so I thought I'd bring it up here to invite shared thoughts on his provocative ideas.

Florent spoke of both Zope 2 and Zope 3. Because of my interests, my current job description, and my choice of mailing list for this discussion, I'll be speaking exclusively about the Zope 3 side of things. My O/R experience is on a smaller scale than Florent's (or Ape's) goals, so my responses are offered with knowledge that I may need to be corrected.

Florent suggests that a proper enterprise-grade application server using Zope should use an object-relational mapper such as Ape, and rely on it at its core. He made a number of interesting observations about how this would allow us to discard the Zope catalog hack, store blobs on the filesystem, and take advantage of RDBMS maturity for managing and analyzing content data and metadata.

While I agree with some of his observations, I believe that Florent's position--a blanket embrace of O-R underneath ZODB for all enterprise use cases--is overzealous. Large business content management applications can have many different usage patterns and many different design characteristics and tradeoffs. An O-R mapping is one choice that has advantages and disadvantages.

The most serious disadvantage to O/R mapping is that the cost of creating and maintaining the mapping is not trivial. Requiring an O/R mapping is a significant barrier to entry, unless you dump all of the data in something like Ape's 'extra stuff' store--in which case you've lost many of the compelling advantages of an RDBMS back end in the first place. This cost could be somewhat alleviated with tools; however, to my knowledge, the tools do not yet exist.
Even with the tools, it would still be an extra layer of work demanded just to get things to work. Also, while I won't confidently assert speed losses as a disadvantage, it's worth mentioning that mapping code may (will usually?) introduce more CPU churn (and slower app speeds) than FileStorage.

In any case, I know there are some cases in which O/R mappings would be very useful. I do not agree that it is generically the right approach. It has a cost. Moreover, the advantages Florent listed are not as clear cut as he described.

Florent identified three advantages to O/R mapping: according to his blog, RDBMS indexing is clearly superior to the Zope catalog; blobs are best handled with mapping code; and content data and metadata are clearly tabular and so fit within a relational database cleanly and obviously, providing advantages such as built-in aggregating tools. He makes some good points, but I have caveats or disagreements with all three.

First, he identified the Zope catalog as a hack for which RDBMS indexes would be a cure. I don't see how the Zope 3 catalog is a hack, nor do I necessarily see RDBMS indexes as inherently advantageous in all cases.

I agree that it is a problem that, given enough indexed objects and/or enough indexes and/or a small object cache, loading the buckets when you traverse indexes can flush other objects from the ZODB cache. If the flushed objects are expensive to load and frequently used, that can be a noticeable problem. I believe this is a problem that can be addressed, or at least tuned for given applications. When it bites us enough that one of us in the community implements a smarter ZODB cache (or other solution) we'll all win.

It is also true, though you did not mention it, that the Zope 3 catalog has no standardized query language or query optimizer. The first job has some contenders, but the second one has no champions to my knowledge.

These are not reasons to discard BTrees, or indexes based on them.
They provide some significant advantages. Both common indexing requirements and new data structures, such as the fascinating RDFLib that Michel Pelletier has worked on, are handled well by the BTree code. The BTree code is time-tested, relatively easy to use, and well maintained. When combined with the transactional virtues of ZODB, the conflict resolution story reads very well, and very similarly to that of PostgreSQL (default behavior).

In terms of the actual indexes and catalog design, the Zope 3 text index is not as featureful as others, but the core algorithms are equivalent or even superior to many of them. In addition, the interface system and the catalog design allow integration with other backends, such as the Lucene text index (as Stephan has illustrated, I believe). It could even support an index with an RDBMS table back end, if desired. This might get you some of the advantages you listed for the O/R back end
[Zope3-dev] RDFLib and Zope 3
Michel (and anyone else with experience with RDFLib on the list),

I recently looked at RDFLib (http://rdflib.net/) and came away (after an hour or so) with a good first impression.

My biggest disappointment was that, from the perspective of a Zope 3 developer, using it alongside other Zope 3 indexes (and other intid-based data structures) meant that I would have to externally convert to and from RDF in order to merge results and convert the RDF URIs to objects.

It would be much more efficient if I could have an RDF resource class that represented an intid, and even more efficient if I could get IFBTrees back directly from searches that somehow included the intids.

Then I could leverage the relationship and keyword capabilities of RDFLib while also merging results efficiently with other index-like data structures in Zope 3. The intid-specific resources could even have stable URI representations without too much trouble, so that they could be exported and imported with RDFLib, if desired.

Have you thought about that use case? If one used a variation of your back end that assigned intids to non-intid-based resources like URIs and Literals and stored the relationships via intids, you could store the data as IFBTrees and offer up an API to get raw IFBTree results. Any obvious ways that would be a problem? Does it feel reasonable to you? Any suggestions?

I'm generally interested in RDFLib, your use of it, and your hopes for it, if you feel like holding forth. :-)

Gary
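To make the merging concern concrete, here is a rough sketch of what intersecting two intid-keyed result sets looks like. Plain dicts of intid to score stand in for IFBTrees so the example runs without ZODB; the function name, the ids, and the scores are all invented for illustration, and summing the scores is just one possible way to combine them.

```python
def intersect_scored(left, right):
    """Intersect two {intid: score} mappings, summing the scores of shared ids."""
    if len(left) > len(right):
        left, right = right, left  # iterate over the smaller mapping
    return {
        intid: score + right[intid]
        for intid, score in left.items()
        if intid in right
    }


# Hypothetical results from a text index, keyed by intid.
text_hits = {101: 0.75, 102: 0.25, 105: 0.5}
# Hypothetical results from an RDF search that already reports intids.
graph_hits = {102: 1.0, 105: 1.0, 109: 1.0}

merged = intersect_scored(text_hits, graph_hits)
print(merged)  # {102: 1.25, 105: 1.5}
```

If the graph search returned URIs instead, every hit would first need a URI-to-intid lookup before this step could run, which is the external conversion described above.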
Re: [Zope3-dev] Florent's O-R blog entry
On Aug 23, 2005, at 1:11 PM, Gary Poster wrote:

FWIW, my concluding sentence would have been better written as

Meanwhile, deciding that a community project require an O/R back end over FileStorage or DirectoryStorage, as Florent argues, feels like a significant case of throwing the baby out with the bath water.

Argh, communication. That still could be too-easily misinterpreted, and I didn't stare at it long enough before I sent it. One more try.

Meanwhile, deciding that a community project require any specific backend--Ape, FileStorage, DirectoryStorage, or another--feels like a mistake. Discarding FileStorage or DirectoryStorage, as Florent argues, is a significant case of throwing the baby out with the bath water.

We have at least three maintained and capable ZODB backends, with different strengths and weaknesses, appropriate for different use cases. Let's not jump to discard any of them.

Gary
Re: [Zope3-dev] Florent's O-R blog entry
On Aug 23, 2005, at 12:56 PM, Shane Hathaway wrote:

Gary Poster wrote:

In conclusion, the nebulous concept of enterprise applications on Zope does not have a clear cut decision for or against an O/R mapper such as Ape. The cost of O/R mappings is not inconsequential, and the advantages are not conclusive. I hope that large projects that the Zope community works on together can support both, and do not depend on or exclude their use. Florent makes some excellent observations, and solutions to the problems he identifies could be done at a number of layers in the code base. Meanwhile, switching entirely to an O/R back end over FileStorage or DirectoryStorage feels like a significant case of throwing the baby out with the bath water.

I would use this argument to support the idea of transparent ZODB-based O/R mapping, which is what Ape does. With a transparent mapper, users can choose their own storage backend. The baby is the application code and the bath water is FileStorage/DirectoryStorage. Ape keeps the baby 100% intact. ;-)

I strongly disagree that FileStorage/DirectoryStorage is bath water--something that has served its purpose, and is discardable.

I agree that O/R mapping like Ape provides is a great solution for some cases (such as the one you listed, and there are others) and allows you to transparently replace back ends if it is (or becomes) necessary. It is an exciting idea and technology, and appropriate for some use cases.

FWIW, my concluding sentence would have been better written as

Meanwhile, deciding that a community project require an O/R back end over FileStorage or DirectoryStorage, as Florent argues, feels like a significant case of throwing the baby out with the bath water.

Gary
Re: [Zope3-dev] Florent's O-R blog entry
Gary Poster wrote:

On Aug 23, 2005, at 1:11 PM, Gary Poster wrote:

FWIW, my concluding sentence would have been better written as

Meanwhile, deciding that a community project require an O/R back end over FileStorage or DirectoryStorage, as Florent argues, feels like a significant case of throwing the baby out with the bath water.

Argh, communication. That still could be too-easily misinterpreted, and I didn't stare at it long enough before I sent it. One more try.

Meanwhile, deciding that a community project require any specific backend--Ape, FileStorage, DirectoryStorage, or another--feels like a mistake. Discarding FileStorage or DirectoryStorage, as Florent argues, is a significant case of throwing the baby out with the bath water.

We have at least three maintained and capable ZODB backends, with different strengths and weaknesses, appropriate for different use cases. Let's not jump to discard any of them.

I agree 100%. However, your concern is that projects will require a specific ZODB backend, while my concern is that projects will dump ZODB altogether. I think the latter is the greater risk, and people need a middle ground so they don't isolate themselves from the rest of the community. Ape could be a part of that middle ground.

Also, I did not intend to disparage the excellent FileStorage and DirectoryStorage packages. I always tell people to use FileStorage or DirectoryStorage unless they have a good reason not to, and the biggest reason not to use FileStorage (through-the-web code is hard to put under version control) is already disappearing with Zope 3.

Shane
Re: [Zope3-dev] [i18n] help for 2 translations
On 8/22/05, Gary Poster [EMAIL PROTECTED] wrote:

- Food For Thought

Another way of writing this idiom might be 'Things We Ought to Think About' or 'Questions for Later'.

- Zope Stub Server Controller

Let's see--grep tells me that the string is in ./src/zope/app/applicationcontrol/browser/server-control.pt. A stub in this context is a short, incomplete version of something. Stub implies we need more. A similar title that does not have the same implication would be Zope Basic Server Controller. A server controller lets you control a server. In this case, the page lets you shut down or restart the Zope server.

Ok, thanks!

--
Sébastien Douche [EMAIL PROTECTED]
Re: [Zope3-dev] Florent's O-R blog entry
On 23.08.2005 at 20:36, Shane Hathaway wrote:

Gary Poster wrote:

On Aug 23, 2005, at 1:11 PM, Gary Poster wrote:

Argh, communication. That still could be too-easily misinterpreted, and I didn't stare at it long enough before I sent it. One more try.

Meanwhile, deciding that a community project require any specific backend--Ape, FileStorage, DirectoryStorage, or another--feels like a mistake. Discarding FileStorage or DirectoryStorage, as Florent argues, is a significant case of throwing the baby out with the bath water.

We have at least three maintained and capable ZODB backends, with different strengths and weaknesses, appropriate for different use cases. Let's not jump to discard any of them.

I agree 100%. However, your concern is that projects will require a specific ZODB backend, while my concern is that projects will dump ZODB altogether. I think the latter is the greater risk, and people need a middle ground so they don't isolate themselves from the rest of the community. Ape could be a part of that middle ground.

Also, I did not intend to disparage the excellent FileStorage and DirectoryStorage packages. I always tell people to use FileStorage or DirectoryStorage unless they have a good reason not to, and the biggest reason not to use FileStorage (through-the-web code is hard to put under version control) is already disappearing with Zope 3.

This is a good discussion, and I think this will provide a good ground for a technical pro/contra view of the storage situation.

But I think the post from Florent looks at this from a slightly different angle. Perhaps I misinterpret it, but his thoughts look at the needs for a content repository storage. I do not think he wanted to totally replace ZODB for all the other stuff. And assuming he looks at the storage question from this point (actually Florent is on holiday at the moment), his views are built with some general concerns as background.
Let's assume enterprise means big and sellable to corporations. Then the concerns of potential customers are valid: that valuable content is stored in some piece of software which is only known to a small group of developers. Building a content repository as a marketable solution on this piece of software needs more convincing than to say We have this piece of great software and your content ends in your favorite traditional RDBMS.

OK, I will stop interpreting what Florent may have thought; I'd better present my own path of thinking. In the end I'm against an RDBMS as the only core part of a Zope CMS repository.

I started with the general idea to have a content repository for simple content objects, which are all described by schemas. This leads to a rather flat and more structured, nearly homogeneous mass of objects, compared to the normal objects present in a Zope CMS. The repository is a layer over potentially many storages. This leads fairly easily to the idea to have a backend storage which stores this data into an RDBMS. This is the level Florent probably looked at also.

But I have concerns about many of the other points. At this level the RDBMS is really just a storage of attribute mappings. The whole logic, for example the relation between different content objects, is part of the stored data or held in the repository application or some registries. I assume that the moment one starts to use the relational aspects of the RDBMS, the application logic becomes part of the storage. This would need to be addressed in the O-R-mapper, which would mean that the O-R-mapper also becomes part of the application logic. There are further proposed benefits of an RDBMS storage, like indexing, direct searching, and report generation, which all reflect back into the application domain. In the end this would lead to the situation that one would circumvent the O-R-mapper for complex or special tasks and start to work directly on the data.
This in the end is bad from my point of view and greatly raises the complexity. It would also mean a big development effort to recreate, overshadow and map current functionality given us by Zope for nearly free.

There are many valid points where the ZODB has some shortcomings. Blob support for example will be much better, although it will not be totally solved by just storing blobs on the filesystem.

Which leads to my last point. From a solution point of view there are many hacks or individual adaptations involved to have a big scalable site. I think we should look at making some of these better, meaning more standardly incorporated into the z3ecms toolbox. Just for example, the answer to time-consuming cataloging for cases with many writes is to use the queued catalog product. But integrating it into a system is manual work; it needs a developer who knows how to do it and where to fiddle to integrate it right. Such technically already present
[Zope3-dev] Re: RDFLib and Zope 3
On Tue, 2005-08-23 at 12:49 -0400, Gary Poster wrote:

Michel (and anyone else with experience with RDFLib on the list), I recently looked at RDFLib (http://rdflib.net/) and came away (after an hour or so) with a good first impression.

Great. I've cc:ed Dan Krech, the lead rdflib developer, on this mail. For his benefit I might explain things that you obviously know.

My biggest disappointment was that, from the perspective of a Zope 3 developer, using it alongside other Zope 3 indexes (and other intid-based data structures) meant that I would have to externally convert to and from RDF in order to merge results and convert the RDF URIs to objects.

Correct. A specific and important optimization in Zope-style cataloging is that objects have a cheap unique integer to reduce catalog footprint and significantly improve result merging and joining. These integers are exposed as a utility component in Zope.

It would be much more efficient if I could have an RDF resource class that represented an intid, and even more efficient if I could get IFBTrees back directly from searches that somehow included the intids.

Yes, this is a problem that needs to be solved, and your suggestion is one way to solve it. I've discussed this a few times with Florent at the Paris and EUpy sprints and he had a similar suggestion. I'm uncomfortable with it for a few reasons: 1) intids are such a Zope-catalog-optimization-specific thing. I know why they are exposed, so that catalog results can be efficiently merged, but they don't have anything to do with RDF, so 2) rdflib can't really change its interface to accommodate them. Also, 3) they are backend specific: for example, rdflib has a URI-to-integer mapping for its in-memory and ZODB backends to reduce footprint, but a SQL backend would need no such integer; you would in fact have to *add* a column to hold that value just so the data would merge efficiently with a catalog.
This seems antithetical to Zope 3's philosophy in general as it violates the concept of not requiring third party libs and data to change themselves significantly just to work with Zope.

Of course, this isn't a problem of the catalog, it's a problem in general merging search results from anywhere. I'd like to make the optimization available so that searches on a graph can be efficiently merged with searches on a catalog, but I don't think it can be done by pushing intids down into rdflib, or for that matter any other third party component you want to play with the catalog efficiently.

Perhaps instead of pushing the integers down we could push URIs up: Zope's cataloging could grow another layer of indirection on top of intids and provide a URI utility that maps to intids. Of course you might object to that for the same reasons I'm objecting to this. ;) But at least URIs are a well known standard.

Somewhat at right angles to this, I think Zope needs to grow another search interface, a higher level one that hides all of this integer id stuff from the user. I proposed something incomplete along these lines to the z3labs site, an interface that could aggregate searches across multiple registered search sources, whether catalogs, rdflib Graphs, relational databases, remote systems, google, etc. With something like this, there is no need to worry about intersecting two floating point result sets efficiently; the underlying search framework performs that optimization if it is available. Note that the primary benefit of such an interface is not necessarily merging results across multiple sources, but instead providing a consistent interface regardless of the search source.

Then I could leverage the relationship and keyword capabilities of RDFLib while also merging results efficiently with other index-like data structures in Zope 3.
The intid-specific resources could even have stable URI representations without too much trouble, so that they could be exported and imported with RDFLib, if desired. Hmm so these resource objects you are suggesting, they would be persistent objects? I don't quite have the picture of what you suggest. Perhaps these resource classes can be managed by a utility? Have you thought about that use case? If one used a variation of your back end that assigned intids to non-intid-based resources like URIs and Literals and stored the relationships via intids, One doesn't need a variation, this is exactly the way the in-memory and ZODB backends work now as an optimization. But they are internal details of the implementation of those backends. you could store the data as IFBTrees and offer up an API to get raw IFBTree results. Any obvious ways that would be a problem? Does it feel reasonable to you? Any suggestions? Well not any good ones yet, although I know it's an important problem. I'll have to think about it a bit more. Do you understand my objections? Does anyone else have any suggestions out there?
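As one possible reading of the higher-level search interface mentioned above, here is a rough sketch. Nothing in it comes from the z3labs proposal: the class names, the `search` signature, and the choice to intersect results on URIs are all assumptions made only to show what a consistent interface over several registered sources might look like.

```python
class DictSource:
    """Hypothetical search source backed by a {term: set of URIs} mapping."""

    def __init__(self, index):
        self._index = index

    def search(self, term):
        # Return a copy so callers cannot mutate the underlying index.
        return set(self._index.get(term, ()))


class AggregateSearch:
    """Hypothetical front end: one search() call, many registered sources."""

    def __init__(self):
        self._sources = []

    def register(self, source):
        self._sources.append(source)

    def search(self, term):
        """Return the URIs that every registered source reports for term."""
        results = None
        for source in self._sources:
            hits = source.search(term)
            results = hits if results is None else results & hits
        return results or set()


# Stand-ins for a catalog and an rdflib graph; the URIs are made up.
catalog = DictSource({"zope": {"urn:doc:1", "urn:doc:2"}})
graph = DictSource({"zope": {"urn:doc:2", "urn:doc:3"}})

aggregate = AggregateSearch()
aggregate.register(catalog)
aggregate.register(graph)
print(aggregate.search("zope"))  # {'urn:doc:2'}
```

A real framework could swap the set intersection for an integer-keyed merge whenever all the sources involved happen to share intids, without the caller seeing any difference.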
Re: [Zope3-dev] Re: RDFLib and Zope 3
On Tue, 2005-08-23 at 18:04 -0400, Gary Poster wrote:

The relationship between ZODB content objects, their int id as provided by the pertinent intid utility, and a (theoretical) corresponding RDF URI is what I'm having a hard time not making hacky in my mind. I'll think about it some more.

They might not be that hacky. This might be the wrong direction to take, but URIs don't have to be visually meaningful; blank nodes, for example, are usually just '_:' concatenated to a random opaque string. If the URI were 'zope:' (maybe path/to/intid/util:) that would work just as well. It would also be trivial to transform into a feasible join key if the URI was also a URL that looked up, instead of some network resource, an intid. Actually being able to trivially transform an intid to an rdflib URI might be something to think about.

Thinking about it more, the current Zemantic uses the physical path of the object as the rdf:about= URI when an object adds itself, because honestly I could think of no other URI in Zope. This is obviously wrong, but I didn't have a better answer in Paris. Why not use the `intid` plus some URI sugar? If the URI and the intid can be easily converted from one to the other then that should solve the whole problem, no?

Another difficulty is that I like the RDF data model and the RDFLib implementation, but I haven't found a compelling reason to care much about the actual RDF format input and output. Is there a practically compelling defense of RDF as a format somewhere to which you can point me?

I'm sure you're aware of this but for others: RDF does not specify a syntax, only the data model. The most popular syntax, RDF/XML, is pretty bad, but rdflib also supports the NT syntax, which is a plain text format. There are some other triple languages out there that may look even better, and support for them in rdflib would require writing only a parser and maybe a serializer if you want that format back out.
I like SLIP but the parser needs some work:

http://www.scottsweeney.com/projects/slip/

and lastly I've been kicking around a new syntax based on SLIP I call SLIPR. Unlike SLIP which is for any XML, SLIPR is RDF only.

https://svn.cignex.com/public/slipr/data/pyinrdf.slpr

I only have the syntax outlined right now, I'm still working on the parser. This is my attempt to mix RDF with indented syntax. It looks great in python-mode. ;)

Unfortunately this is low priority. The good news is the high priority is a SPARQL parser, which is coming along nicely. Kudos to the fabulous pyparsing library. Hopefully we should have full SPARQL support by 2.4.

-Michel
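Coming back to the "intid plus some URI sugar" idea from earlier in this message, a minimal sketch of the round trip could look like the following. The `zope:intid:` prefix is invented purely for illustration (the thread only floats 'zope:' as a possibility), and the function names are hypothetical; nothing here is an rdflib or Zope API.

```python
# Hypothetical URI prefix; the thread does not settle on an actual scheme.
INTID_URI_PREFIX = "zope:intid:"


def intid_to_uri(intid):
    """Wrap an integer id in an opaque URI string."""
    return f"{INTID_URI_PREFIX}{intid}"


def uri_to_intid(uri):
    """Recover the integer id, rejecting URIs that do not use the prefix."""
    if not uri.startswith(INTID_URI_PREFIX):
        raise ValueError(f"not an intid URI: {uri!r}")
    return int(uri[len(INTID_URI_PREFIX):])


uri = intid_to_uri(42)
print(uri)                # zope:intid:42
print(uri_to_intid(uri))  # 42
```

Because the conversion is pure string handling in both directions, a graph could keep storing ordinary URIs while a catalog derives the join key from them without any lookup table.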