Re: [Zope3-dev] Re: [Zope3-checkins] SVN: Zope3/branches/tlotze/src/zope/interface/interface.py Simplifying some idioms, coding style.

2005-08-23 Thread Benji York

Thomas Lotze wrote:

On Mon, 22 Aug 2005 11:18:44 -0400, Benji York wrote:
I didn't review your entire check in (and I realize that this is on a 
branch), but the non-use of setdefault there was probably intentional to 
keep from constructing empty dicts and lists when they may not be used.


A simple timing test suggests that using setdefault is actually faster,
the construction of empty dicts and objects notwithstanding.


Great!  I didn't mean to suggest it wasn't; I just wanted to point out 
the (possible) intent of the original coder.  Perhaps they should have 
benchmarked it (like you did) instead of doing something they thought 
should be faster (premature optimization and all that).
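
For anyone who wants to repeat this kind of check, here is a minimal sketch of such a timing comparison. The two functions and the sample data are hypothetical stand-ins, not the code from interface.py, and the numbers will vary with the Python version and the machine.

```python
import timeit


def group_with_check(pairs):
    """Build a dict of lists, creating each list only when first needed."""
    result = {}
    for key, value in pairs:
        if key not in result:
            result[key] = []
        result[key].append(value)
    return result


def group_with_setdefault(pairs):
    """Same result, but an empty list is constructed on every iteration."""
    result = {}
    for key, value in pairs:
        result.setdefault(key, []).append(value)
    return result


# Hypothetical sample data: 1000 values spread over 10 keys.
PAIRS = [(i % 10, i) for i in range(1000)]

if __name__ == "__main__":
    for func in (group_with_check, group_with_setdefault):
        seconds = timeit.timeit(lambda: func(PAIRS), number=1000)
        print(f"{func.__name__}: {seconds:.3f}s")
```

Both functions return the same dictionary, so whichever one the measurement favors can be swapped in without changing behavior.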



ACK


I assume this means acknowledged as opposed to an exclamation of 
surprised disgust.

--
Benji York
Senior Software Engineer
Zope Corporation
___
Zope3-dev mailing list
Zope3-dev@zope.org
Unsub: http://mail.zope.org/mailman/options/zope3-dev/archive%40mail-archive.com



[Zope3-dev] Re: Re: [Zope3-checkins] SVN: Zope3/branches/tlotze/src/zope/interface/interface.py Simplifying some idioms, coding style.

2005-08-23 Thread Thomas Lotze
On Tue, 23 Aug 2005 08:36:22 -0400, Benji York wrote:

 ACK
 
 I assume this means acknowledged as opposed to an exclamation of 
 surprised disgust.

ACK

-- 
Thomas Lotze




[Zope3-dev] Florent's O-R blog entry

2005-08-23 Thread Gary Poster
I recently read Florent's object/relational blog entry at
http://blogs.nuxeo.com/sections/blogs/florent_guillaume/2005_08_11_object_relational .
It's getting a bit old now, but I
didn't see much discussion (or a way to make a comment) so I thought  
I'd bring it up here to invite shared thoughts on his provocative  
ideas.  Florent spoke of both Zope 2 and Zope 3.  Because of my  
interests, my current job description, and my choice of mailing list  
for this discussion, I'll be speaking exclusively about the Zope 3  
side of things.  My O/R experience is on a smaller scale than  
Florent's (or Ape's) goals, so my responses are offered with  
knowledge that I may need to be corrected.


Florent suggests that a proper enterprise-grade application server  
using Zope should use an object-relational mapper such as Ape, and  
rely on it at its core.  He made a number of interesting observations  
about how this would allow us to discard the Zope catalog hack,  
store blobs on the filesystem, and take advantage of RDBMS maturity  
for managing and analyzing content data and metadata.


While I agree with some of his observations, I believe that Florent's  
position--a blanket embrace of O-R underneath ZODB for all  
enterprise use cases--is overzealous.  Large business content  
management applications can have many different usage patterns and  
many different design characteristics and tradeoffs.  An O-R mapping  
is one choice that has advantages and disadvantages.


The most serious disadvantage to O/R mapping is that the cost of  
creating and maintaining the mapping is not trivial.  Requiring an
O/R mapping is a significant barrier to entry, unless you dump all of
the data in something like Ape's 'extra stuff' store--in which case  
you've lost many of the compelling advantages of an RDBMS back end in  
the first place.  This cost could be somewhat alleviated with tools;  
however, to my knowledge, the tools do not yet exist.  Even with the  
tools, it would still be an extra layer of work demanded just to get  
things to work.


Also, while I won't confidently assert speed losses as a  
disadvantage, it's worth mentioning that mapping code may (will  
usually?) introduce more CPU churn (and slower app speeds) than  
FileStorage.


In any case, I know there are some cases in which O/R mappings would  
be very useful.  I do not agree that it is generically the right  
approach.  It has a cost.  Moreover, the advantages Florent listed  
are not as clear cut as he described.


Florent identified three advantages to O/R mapping: according to his  
blog, RDBMS indexing is clearly superior to the Zope catalog;  blobs  
are best handled with mapping code; and content data and metadata are  
clearly tabular and so fit within a relational database cleanly and  
obviously, providing advantages such as built-in aggregating tools.
He makes some good points, but I have caveats or disagreements with  
all three.


First, he identified the Zope catalog as a hack for which RDBMS  
indexes would be a cure.  I don't see how the Zope 3 catalog is a  
hack, nor do I necessarily see RDBMS indexes as inherently  
advantageous in all cases.


I agree that it is a problem that, given enough indexed objects
and/or enough indexes and/or a small object cache, loading the buckets
when you traverse indexes can flush other objects from the ZODB  
cache.  If the flushed objects are expensive to load and frequently  
used, that can be a noticeable problem.  I believe this is a problem  
that can be addressed, or at least tuned for given applications.   
When it bites us enough that one of us in the community implements a  
smarter ZODB cache (or other solution) we'll all win.


It is also true, though you did not mention it, that the Zope 3  
catalog has no standardized query language or query optimizer.  The  
first job has some contenders, but the second one has no champions to  
my knowledge.


These are not reasons to discard BTrees, or indexes based on them.   
They provide some significant advantages.  Both common indexing  
requirements and new data structures, such as the fascinating RDFLib  
that Michel Pelletier has worked on, are handled well by the BTree  
code.  The BTree code is time-tested, relatively easy to use, and  
well maintained.  When combined with the transactional virtues of  
ZODB, the conflict resolution story reads very well, and very  
similarly to that of PostgreSQL (default behavior).


In terms of the actual indexes and catalog design, the Zope 3 text  
index is not as featureful as others, but the core algorithms are  
equivalent or even superior to many of them.  In addition, the  
interface system and the catalog design allow integration with other
backends, such as the Lucene text index (as Stephan has illustrated,  
I believe).  It could even support an index with a RDBMS table back  
end, if desired.  This might get you some of the advantages you  
listed for the O/R back end 

[Zope3-dev] RDFLib and Zope 3

2005-08-23 Thread Gary Poster
Michel (and anyone else with experience with RDFLib on the list), I  
recently looked at RDFLib (http://rdflib.net/) and came away (after  
an hour or so) with a good first impression.


My biggest disappointment was that, from the perspective of a Zope 3  
developer, using it alongside other Zope 3 indexes (and other
intid-based data structures) meant that I would have to externally convert
to and from RDF in order to merge results and convert the RDF URIs to  
objects.  It would be much more efficient if I could have an RDF  
resource class that represented an intid, and even more efficient if  
I could get IFBTrees back directly from searches that somehow  
included the intids.  Then I could leverage the relationship and  
keyword capabilities of RDFLib while also merging results efficiently  
with other index-like data structures in Zope 3.  The intid-specific  
resources could even have stable URI representations without too much  
trouble, so that they could be exported and imported with RDFLib, if  
desired.


Have you thought about that use case?  If one used a variation of  
your back end that assigned intids to non-intid-based resources like  
URIs and Literals and stored the relationships via intids, you could  
store the data as IFBTrees and offer up an API to get raw IFBTree  
results.  Any obvious ways that would be a problem?  Does it feel  
reasonable to you?  Any suggestions?


I'm generally interested in RDFLib, your use of it, and your hopes  
for it, if you feel like holding forth. :-)


Gary



Re: [Zope3-dev] Florent's O-R blog entry

2005-08-23 Thread Gary Poster


On Aug 23, 2005, at 1:11 PM, Gary Poster wrote:
FWIW, my concluding sentence would have been better written as  
Meanwhile, deciding that a community project require an O/R back  
end over FileStorage or DirectoryStorage, as Florent argues, feels  
like a significant case of throwing the baby out with the bath  
water.


Argh, communication.  That still could be too-easily misinterpreted,  
and I didn't stare at it long enough before I sent it.  One more try.


Meanwhile, deciding that a community project require any specific  
backend--Ape, FileStorage, DirectoryStorage, or another--feels like a  
mistake.  Discarding FileStorage or DirectoryStorage, as Florent  
argues, is a significant case of throwing the baby out with the bath  
water.  We have at least three maintained and capable ZODB backends,  
with different strengths and weaknesses, appropriate for different  
use cases.  Let's not jump to discard any of them.


Gary



Re: [Zope3-dev] Florent's O-R blog entry

2005-08-23 Thread Gary Poster


On Aug 23, 2005, at 12:56 PM, Shane Hathaway wrote:


Gary Poster wrote:

In conclusion, the nebulous concept of enterprise applications  
on  Zope does not have a clear cut decision for or against an O/R  
mapper  such as Ape.  The cost of O/R mappings is not  
inconsequential, and  the advantages are not conclusive.  I hope  
that large projects that  the Zope community works on together can  
support both, and do not  depend on or exclude their use.  Florent  
makes some excellent  observations, and solutions to the problems  
he identifies could be  done at a number of layers in the code  
base.  Meanwhile, switching  entirely to an O/R back end over  
FileStorage or DirectoryStorage  feels like a significant case of  
throwing the baby out with the bath  water.




I would use this argument to support the idea of transparent
ZODB-based O/R mapping, which is what Ape does.  With a transparent
mapper, users can choose their own storage backend.  The baby is
the application code and the bath water is
FileStorage/DirectoryStorage.  Ape keeps the baby 100% intact. ;-)


I strongly disagree that FileStorage/DirectoryStorage is bath  
water--something that has served its purpose, and is discardable.  I  
agree that O/R mapping like Ape provides is a great solution for some  
cases (such as the one you listed, and there are others) and allows  
you to transparently replace back ends if it is (or becomes)  
necessary.  It is an exciting idea and technology, and appropriate  
for some use cases.


FWIW, my concluding sentence would have been better written as
"Meanwhile, deciding that a community project require an O/R back end
over FileStorage or DirectoryStorage, as Florent argues, feels like a
significant case of throwing the baby out with the bath water."


Gary



Re: [Zope3-dev] Florent's O-R blog entry

2005-08-23 Thread Shane Hathaway

Gary Poster wrote:


On Aug 23, 2005, at 1:11 PM, Gary Poster wrote:

FWIW, my concluding sentence would have been better written as  
Meanwhile, deciding that a community project require an O/R back  end 
over FileStorage or DirectoryStorage, as Florent argues, feels  like a 
significant case of throwing the baby out with the bath  water.



Argh, communication.  That still could be too-easily misinterpreted,  
and I didn't stare at it long enough before I sent it.  One more try.


Meanwhile, deciding that a community project require any specific  
backend--Ape, FileStorage, DirectoryStorage, or another--feels like a  
mistake.  Discarding FileStorage or DirectoryStorage, as Florent  
argues, is a significant case of throwing the baby out with the bath  
water.  We have at least three maintained and capable ZODB backends,  
with different strengths and weaknesses, appropriate for different  use 
cases.  Let's not jump to discard any of them.


I agree 100%.  However, your concern is that projects will require a 
specific ZODB backend, while my concern is that projects will dump ZODB 
altogether.  I think the latter is the greater risk, and people need a 
middle ground so they don't isolate themselves from the rest of the 
community.  Ape could be a part of that middle ground.


Also, I did not intend to disparage the excellent FileStorage and 
DirectoryStorage packages.  I always tell people to use FileStorage or 
DirectoryStorage unless they have a good reason not to, and the biggest 
reason not to use FileStorage (through-the-web code is hard to put under 
version control) is already disappearing with Zope 3.


Shane



Re: [Zope3-dev] [i18n] help for 2 translations

2005-08-23 Thread Sebastien Douche
On 8/22/05, Gary Poster [EMAIL PROTECTED] wrote:
  - Food For Thought
 
 Another way of writing this idiom might be 'Things We Ought to Think
 About' or 'Questions for Later'.
 
  - Zope Stub Server Controller
 
 Let's see--grep tells me that the string is in ./src/zope/app/
 applicationcontrol/browser/server-control.pt.  A stub in this
 context is a short, incomplete version of something.  Stub implies
 we need more.  A similar title that does not have the same
 implication would be Zope Basic Server Controller.
 
 A server controller lets you control a server.  In this case, the
 page lets you shut down or restart the Zope server.


OK, thanks!

-- 
Sébastien Douche [EMAIL PROTECTED]



Re: [Zope3-dev] Florent's O-R blog entry

2005-08-23 Thread Janko Hauser


Am 23.08.2005 um 20:36 schrieb Shane Hathaway:


Gary Poster wrote:


On Aug 23, 2005, at 1:11 PM, Gary Poster wrote:

Argh, communication.  That still could be too-easily  
misinterpreted,  and I didn't stare at it long enough before I  
sent it.  One more try.
Meanwhile, deciding that a community project require any specific   
backend--Ape, FileStorage, DirectoryStorage, or another--feels  
like a  mistake.  Discarding FileStorage or DirectoryStorage, as  
Florent  argues, is a significant case of throwing the baby out  
with the bath  water.  We have at least three maintained and  
capable ZODB backends,  with different strengths and weaknesses,  
appropriate for different  use cases.  Let's not jump to discard  
any of them.




I agree 100%.  However, your concern is that projects will require  
a specific ZODB backend, while my concern is that projects will  
dump ZODB altogether.  I think the latter is the greater risk, and  
people need a middle ground so they don't isolate themselves from  
the rest of the community.  Ape could be a part of that middle ground.


Also, I did not intend to disparage the excellent FileStorage and  
DirectoryStorage packages.  I always tell people to use FileStorage  
or DirectoryStorage unless they have a good reason not to, and the  
biggest reason not to use FileStorage (through-the-web code is hard  
to put under version control) is already disappearing with Zope 3.


This is a good discussion, and I think it will provide a good
basis for a technical pro/contra view of the storage situation. But
I think Florent's post looks at this from a slightly different
angle. Perhaps I misinterpret it, but his thoughts look at the needs
of a content repository storage. I do not think he wanted to totally
replace ZODB for all the other stuff. And assuming he looks at the
storage question from this point of view (Florent is actually on
holiday at the moment), his views are built with some general
concerns as background.


Let's assume "enterprise" means "big and sellable to corporations".
Then the concerns of potential customers are valid: valuable
content is stored in a piece of software which is known only to a
small group of developers. Building a content repository as a
marketable solution on this piece of software takes more convincing
than saying "We have this piece of great software and your content
ends up in your favorite traditional RDBMS."


OK, I will stop interpreting what Florent may have thought; I had
better present my own line of thinking. In the end I'm against an
RDBMS as the only core part of a Zope CMS repository.


I started with the general idea of a content repository for
simple content objects, which are all described by schemas. This
leads to a rather flat and more structured, nearly homogeneous mass of
objects, compared to the normal objects present in a Zope CMS.
The repository is a layer over potentially many storages. This leads
fairly easily to the idea of a backend storage which stores this
data in an RDBMS. This is probably the level Florent looked at as well.
But I have concerns about many of the other points. At this level the
RDBMS is really just a store of attribute mappings. The whole logic,
for example the relations between different content objects, is part of
the stored data or held in the repository application or some
registries. I assume that the moment one starts to use the relational
aspects of the RDBMS, the application logic becomes part of the
storage. This would need to be addressed in the O-R mapper, which
would mean that the O-R mapper also becomes part of the application
logic. There are further proposed benefits of RDBMS storage, such as
indexing, direct searching, and report generation, which all
reflect back into the application domain. In the end this would lead
to a situation where one circumvents the O-R mapper for
complex or special tasks and starts to work directly on the data.
From my point of view this is bad and greatly raises the
complexity. It would also mean a big development effort to recreate,
overshadow, and map functionality that Zope currently gives us nearly
for free.


There are many valid points where the ZODB has some shortcomings.
Blob support, for example, will get much better, although it will not be
totally solved by just storing blobs on the filesystem. Which leads
to my last point. From a solution point of view, there are many
hacks or individual adaptations involved in running a big scalable
site. I think we should look at making some of these better, meaning
more standardly incorporated into the z3ecms toolbox. Just for example,
the answer to time-consuming cataloging in cases with many writes is
to use the queued catalog product. But integrating it into a system
is manual work; it needs a developer who knows how to do it and where
to fiddle to integrate it right. Such technically already present  

[Zope3-dev] Re: RDFLib and Zope 3

2005-08-23 Thread Michel Pelletier
On Tue, 2005-08-23 at 12:49 -0400, Gary Poster wrote:
 Michel (and anyone else with experience with RDFLib on the list), I  
 recently looked at RDFLib (http://rdflib.net/) and came away (after  
 an hour or so) with a good first impression.

Great.  I've cc:ed Dan Krech, the lead rdflib developer on this mail.
For his benefit I might explain things that you obviously know.

 My biggest disappointment was that, from the perspective of a Zope 3  
 developer, using it alongside other Zope 3 indexes (and other intid- 
 based data structures) meant that I would have to externally convert  
 to and from RDF in order to merge results and convert the RDF URIs to  
 objects. 

Correct.  A specific and important optimization in Zope-style cataloging
is that objects have a cheap unique integer to reduce catalog footprint
and significantly improve result merging and joining.  These integers
are exposed as a utility component in Zope.
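
As a rough illustration of why integer ids make merging cheap, here is a minimal sketch that combines two result sets of ids. Plain Python sets stand in for the BTrees-based set types a catalog would actually use, and the index contents and function names are made up for the example.

```python
# Hypothetical index results: each index returns the integer ids of the
# objects that matched its part of the query.
text_index_hits = {3, 7, 12, 19, 42}
keyword_index_hits = {7, 8, 19, 23, 42, 57}


def merge_and(*result_sets):
    """Intersect result sets, starting from the smallest to keep work low."""
    if not result_sets:
        return set()
    ordered = sorted(result_sets, key=len)
    merged = set(ordered[0])
    for other in ordered[1:]:
        merged &= other
    return merged


def merge_or(*result_sets):
    """Union of all result sets."""
    merged = set()
    for other in result_sets:
        merged |= other
    return merged


print(sorted(merge_and(text_index_hits, keyword_index_hits)))  # [7, 19, 42]
```

Because every index speaks the same small integer vocabulary, no object has to be loaded until the final merged ids are resolved.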

  It would be much more efficient if I could have an RDF  
 resource class that represented an intid, and even more efficient if  
 I could get IFBTrees back directly from searches that somehow  
 included the intids.  

Yes, this is a problem that needs to be solved, and your suggestion is
one way to solve it.  I've discussed this a few times with Florent at the
Paris and EUpy sprints and he had a similar suggestion.

I'm uncomfortable with it for a few reasons: 1) because intids are such
a Zope-catalog-optimization specific thing.  I know why they are
exposed, so that catalog results can be efficiently merged, but they
don't have anything to do with RDF, so 2) rdflib can't really change its
interface to accommodate them.  Also, 3) they are backend specific, for
example rdflib has a URI-to-integer mapping for its in-memory and ZODB
backends to reduce footprint, but a sql backend would need no such
integer, you would in fact have to *add* a column to hold that value
just so the data would merge efficiently with a catalog.  This seems
antithetical to Zope 3's philosophy in general as it violates the
concept of not requiring third party libs and data to change themselves
significantly just to work with Zope.  Of course, this isn't a problem
of the catalog, it's a problem in general merging search results from
anywhere.

I'd like to make the optimization available so that searches on a graph
can be efficiently merged with searches on a catalog, but I don't think
it can be done by pushing intids down into rdflib, or for that matter
any other third party component you want to play with the catalog
efficiently.  Perhaps instead of pushing the integers down we could push
URIs up: Zope's cataloging could grow another layer of indirection on
top of intids and provide a URI utility that maps to intids.  Of course
you might object to that for the same reasons I'm objecting to this. ;)
But at least URIs are a well known standard.

Somewhat at right angles to this, I think Zope needs to grow another
search interface, a higher level one that hides all of this integer id
stuff from the user.  I proposed something incomplete along these lines
to the z3labs site, an interface that could aggregate searches across
multiple registered search sources, whether catalogs, rdflib Graphs,
relational databases, remote systems, Google, etc.

With something like this, there is no need to worry about intersecting two
floating point result sets efficiently, the underlying search framework
performs that optimization if it is available.  Note that the primary
benefit of such an interface is not necessarily merging results across
multiple sources, but instead providing a consistent interface
regardless of the search source.
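
To make the idea a little more concrete, here is a minimal sketch of such an aggregating interface. It is not the z3labs proposal itself: the class name, the callable-based source convention, and the two sample sources are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical convention: a search source is any callable that takes a
# query string and returns an iterable of result identifiers.
SearchSource = Callable[[str], Iterable[str]]


class SearchAggregator:
    """Presents one search() call over several registered sources."""

    def __init__(self) -> None:
        self._sources: Dict[str, SearchSource] = {}

    def register(self, name: str, source: SearchSource) -> None:
        self._sources[name] = source

    def search(self, query: str) -> List[str]:
        """Query every source in registration order, dropping duplicates."""
        seen = set()
        results: List[str] = []
        for source in self._sources.values():
            for hit in source(query):
                if hit not in seen:
                    seen.add(hit)
                    results.append(hit)
        return results


# Two made-up sources standing in for a catalog and an RDF graph.
def catalog_source(query: str) -> List[str]:
    data = {"zope": ["doc1", "doc2"]}
    return data.get(query, [])


def graph_source(query: str) -> List[str]:
    data = {"zope": ["doc2", "doc3"]}
    return data.get(query, [])


aggregator = SearchAggregator()
aggregator.register("catalog", catalog_source)
aggregator.register("graph", graph_source)
print(aggregator.search("zope"))  # ['doc1', 'doc2', 'doc3']
```

The caller sees only identifiers; whether a source can merge integer ids internally is left to the source.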

 Then I could leverage the relationship and  
 keyword capabilities of RDFLib while also merging results efficiently  
 with other index-like data structures in Zope 3.  The intid-specific  
 resources could even have stable URI representations without too much  
 trouble, so that they could be exported and imported with RDFLib, if  
 desired.

Hmm so these resource objects you are suggesting, they would be
persistent objects?  I don't quite have the picture of what you suggest.
Perhaps these resource classes can be managed by a utility?

 Have you thought about that use case?  If one used a variation of  
 your back end that assigned intids to non-intid-based resources like  
 URIs and Literals and stored the relationships via intids, 

One doesn't need a variation, this is exactly the way the in-memory and
ZODB backends work now as an optimization.  But they are internal
details of the implementation of those backends.

 you could  
 store the data as IFBTrees and offer up an API to get raw IFBTree  
 results.  Any obvious ways that would be a problem?  Does it feel  
 reasonable to you?  Any suggestions?

Well not any good ones yet, although I know it's an important problem.
I'll have to think about it a bit more.  Do you understand my
objections?  Does anyone else have any suggestions out there?  

Re: [Zope3-dev] Re: RDFLib and Zope 3

2005-08-23 Thread Michel Pelletier
On Tue, 2005-08-23 at 18:04 -0400, Gary Poster wrote:

 The relationship between ZODB content objects, their int id as  
 provided by the pertinent intid utility, and a (theoretical)  
 corresponding RDF URI is what I'm having a hard time not making hacky  
 in my mind.  I'll think about it some more.

They might not be that hacky.  This might be the wrong direction to take,
but URIs don't have to be visually meaningful; blank nodes, for
example, are usually just '_:' concatenated to a random opaque string.
If the URI were 'zope:' (maybe path/to/intid/util:) that
would work just as well.  It would also be trivial to transform into a
feasible join key if the URI was also a URL that looked up, instead of
some network resource, an intid.

Actually being able to trivially transform an intid to an rdflib URI
might be something to think about.  Thinking about it more, the current
Zemantic uses the physical path of the object as the rdf:about= URI
when an object adds itself, because honestly I could think of no other
URI in Zope.  This is obviously wrong, but I didn't have a better answer
in Paris.  Why not use the `intid` plus some URI sugar?  If the URI and
the intid can be easily converted from one to the other then that should
solve the whole problem, no?
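
A minimal sketch of what that conversion could look like follows. The 'zope:intid:' prefix is a hypothetical scheme invented for this example, not an existing Zope or rdflib convention.

```python
# Hypothetical URI prefix; any stable, agreed-upon prefix would do.
PREFIX = "zope:intid:"


def intid_to_uri(intid: int) -> str:
    """Wrap an integer id in some URI sugar."""
    if intid < 0:
        raise ValueError("intids are expected to be non-negative")
    return f"{PREFIX}{intid}"


def uri_to_intid(uri: str) -> int:
    """Recover the integer id, rejecting URIs that do not use the prefix."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not an intid URI: {uri!r}")
    suffix = uri[len(PREFIX):]
    if not (suffix.isascii() and suffix.isdigit()):
        raise ValueError(f"malformed intid URI: {uri!r}")
    return int(suffix)


uri = intid_to_uri(12345)
print(uri)                # zope:intid:12345
print(uri_to_intid(uri))  # 12345
```

Since the mapping is a pure string transformation in both directions, neither side needs to store a lookup table.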

 
 Another difficulty is that I like the RDF data model and the RDFLib  
 implementation, but I haven't found a compelling reason to care much  
 about the actual RDF format input and output.  Is there a practically  
 compelling defense of RDF as a format somewhere to which you can  
 point me?

I'm sure you're aware of this but for others: RDF does not specify a
syntax, only the data model.  The most popular syntax, RDF/XML, is
pretty bad, but rdflib also supports the NT syntax, which is a plain
text format.  There are some other triple languages out there that may
look even better, and support for them in rdflib would require writing
only a parser and maybe a serializer if you want that format back out.
I like SLIP but the parser needs some work:

http://www.scottsweeney.com/projects/slip/

and lastly I've been kicking around a new syntax based on SLIP I call
SLIPR.  Unlike SLIP which is for any XML, SLIPR is RDF only.

https://svn.cignex.com/public/slipr/data/pyinrdf.slpr

I only have the syntax outlined right now, I'm still working on the
parser.  This is my attempt to mix RDF with indented syntax.  It looks
great in python-mode. ;)  Unfortunately this is low priority.  The good
news is the high priority is a SPARQL parser, which is coming along
nicely.  Kudos to the fabulous pyparsing library. Hopefully we should
have full SPARQL support by 2.4.

-Michel
