RE: [Zope-dev] OR mapping proposal
[Phillip] http://dev.zope.org/Wikis/DevSite/Proposals/ORMappingDB Comments encouraged!

[Albert] I've added some there. Jim highlighted a project Risk there: "Updates to RDBMS data outside of the OR mapping could cause cached data to be inconsistent." This strikes me as rather fundamental. Unless the goals include actual *sharing* of RDBMS data with other applications completely independent of Zope, I doubt that the most important benefits of an OR mapping could be achieved.

Essentially, SQL RDBMSs are *about* sharing data among applications. When customers want SQL, that is often what they actually want. An SQL RDBMS can be overkill for other purposes, which may be just as well achieved by an embedded ODBMS like ZODB, an SQL file system like MySQL, or an LDAP directory.

Alternative goals of *exporting* ZODB data to an RDBMS continuously, *importing* data from an RDBMS at regular intervals, and *embedding* an RDBMS database for exclusive use by Zope with no write access for other applications could all be met more easily. There is certainly no major difficulty on the RDBMS side in giving a Zope instance control over a set of tables for its own use and providing append-only and read-only access to export and import tables or views for regular or continuous replication. But the combination of all three (which could be delivered incrementally in any order) is *not* the same as *sharing*.

As I understand it, Zope's approach to caching inherently prevents support for the Isolation part of ACID. Conflicting writes to the same object are detected by version stamps, but the objects used by a transaction in one thread may have been inconsistently changed by transactions in other threads. This will not be detected unless those objects are also changed. Similar problems are inherent in LDAP directories, which are also designed for relatively static data with a low rate of updates. This is acceptable for many applications.
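The version-stamp mechanism mentioned above can be sketched concretely. This is a minimal illustration only, using sqlite3 as a stand-in RDBMS; the table name, columns, and function names are invented for the example, not part of any Zope or ACS API. The pattern: read the stamp together with the data, then update iff the stamp is still the one that was read, all in one short statement.

```python
# Sketch of optimistic checkout with a version stamp (invented schema).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table users (uid integer primary key,"
             " login text, stamp integer)")
conn.execute("insert into users values (1, 'albert', 0)")
conn.commit()

def checkout(conn, uid):
    # Read the object together with its current version stamp.
    row = conn.execute("select login, stamp from users where uid=?",
                       (uid,)).fetchone()
    return {"login": row[0], "stamp": row[1]}

def commit_update(conn, uid, obj, new_login):
    # Update iff the stamp is unchanged; bump the stamp in the same
    # statement so the check and the write are atomic.
    cur = conn.execute(
        "update users set login=?, stamp=stamp+1"
        " where uid=? and stamp=?",
        (new_login, uid, obj["stamp"]))
    conn.commit()
    if cur.rowcount == 0:
        # The row changed since checkout: the caller must re-read
        # and reconcile, not just blindly retry.
        raise RuntimeError("conflicting update detected")

obj = checkout(conn, 1)
commit_update(conn, 1, obj, "albert2")   # succeeds, stamp becomes 1
stale = {"login": "albert", "stamp": 0}  # a checkout that is now stale
```

A second `commit_update` using `stale` would raise, because its stamp no longer matches the row.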
Scope can and should be limited to sharing that works with optimistic checkout and does not require pessimistic locking. It is common for an Enterprise Object to be read from an RDBMS with its stamp noted, modified independently by an application, and then updated iff the stamp has not changed. Only the simultaneous checking of the stamp and update of the object needs to be wrapped within a short ACID RDBMS transaction. For example, ACS 4 maintains a timestamp on every object which can be used for this purpose. This is similar to the ZODB approach. Note however that:

1) The application must be prepared to deal with an exception that cannot just be handled as a lower-layer ConflictError by retrying.

2) The object will often be a composite - eg an order header *and* all its line items and fulfilments. Entanglement with other objects such as products (for pricing) is avoided by specific application programming (which may also be done in stored procedures within the DBMS).

3) This does not support *any* caching of objects outside of a transaction. The RDBMS itself provides internal caching (often of the entire database, for efficient queries with web applications). This leads to the ACS paradigm of "the web server is the database client", which is actually rather similar to Zope's "ZServer is the ZODB client". Both ACS and Zope involve complex issues for database client-side caching.

Both 1 and 2 completely preclude any possibility of the same level of transparency as for ZODB, while in no way hindering use of pythonic syntax. For most Zope web object publishing purposes, cached objects just need to be kept reasonably up to date rather than synchronized with RDBMS transactions.

The only viable mechanism I can think of for dealing with item 3 in a Zope context would involve the RDBMS maintaining a Changes table, to which it appends whenever any object that has a special column for ZeoItem is changed without also changing the value of ZeoItem.
(ACS does not do this, and I'm not sure what it does do.) Zeo would monitor that table, either by regular polling or continuously (eg with PostgreSQL, as a LISTENer responding to NOTIFY commands issued automatically whenever the triggers append to the Changes table). For each change, Zeo would notify its Zope instances to invalidate their caches for that item. I'm not familiar enough with Zope caching internals to know whether some other approach is feasible. Requiring such changes in a shared database is certainly undesirable.

Q1. Could somebody please post specific URLs for relevant documentation of Zope caching?

Q2. I have a related question about the Zope design overall. As far as I can make out, Zope actually keeps separate copies of persistent objects in RAM for each thread, and relies on the fact that there is a tree structure corresponding to the URL paths that ensures objects from which attributes will be acquired tend to already be in RAM when the acquisition occurs. I assume this is trading off the horrendous inefficiency of multiple
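The Changes-table mechanism described in this message can be sketched as follows. This is an illustration under invented names (the schema, the `cache` dict, and `poll_invalidations` are not part of any actual Zope or Zeo code): a trigger appends a row whenever an item changes, and a poller invalidates any cached item it finds there. With PostgreSQL the polling loop could be replaced by LISTEN/NOTIFY; sqlite3 is used here only to keep the sketch self-contained.

```python
# Sketch of trigger-maintained Changes table + polling invalidation.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table items (id integer primary key, data text);
create table changes (seq integer primary key autoincrement,
                      item_id integer);
create trigger items_changed after update on items
begin
    insert into changes (item_id) values (new.id);
end;
insert into items values (1, 'original');
""")

cache = {1: "original"}   # stand-in for a Zope object cache
last_seen = 0             # high-water mark in the Changes table

def poll_invalidations(conn):
    # Invalidate every cached item changed since the last poll.
    global last_seen
    for seq, item_id in conn.execute(
            "select seq, item_id from changes where seq > ?",
            (last_seen,)):
        cache.pop(item_id, None)
        last_seen = seq

# Another application updates the row behind the cache's back...
conn.execute("update items set data='modified' where id=1")
conn.commit()
poll_invalidations(conn)  # ...and the poller drops the stale entry
```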
Re: [Zope-dev] OR mapping proposal
Albert Langer wrote:

> [Phillip] http://dev.zope.org/Wikis/DevSite/Proposals/ORMappingDB Comments encouraged!
>
> [Albert] I've added some there. Jim highlighted a project Risk there: Updates to RDBMS data outside of the OR mapping could cause cached data to be inconsistent.

I agree! In fact, Jim and others here at DC have been suggesting this idea for a long time now, but I've always resisted because of this issue. But now I see two workable approaches to solving the problem:

1) Create an invalidation protocol where other applications are required to update a special table every time they make changes to the database. Zope checks this table at the beginning of each transaction. Databases that have strong support for triggers would be able to do this at the database level.

2) Some kinds of objects will stay in memory only for the duration of a transaction. PJE hinted at this and I like it. Some people may decide that *all* relational objects should behave this way, in which case the decreased performance would still be equal to or slightly better than competing projects AFAICT (since this proposal has the advantage of some of the logic being implemented in C).

Thanks for your comments in the wiki. After talking with others here at DC, it's clear I should have provided a description of the possible solutions for some of the major issues. We're all still learning how the process is supposed to work. The next step in the process is to get it reviewed, after which we can turn this into a project. Your comments will become a part of the project.

Shane

___ Zope-Dev maillist - [EMAIL PROTECTED] http://lists.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://lists.zope.org/mailman/listinfo/zope-announce http://lists.zope.org/mailman/listinfo/zope )
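Approach 2 in the message above (objects living only for the duration of a transaction) can be sketched in a few lines. The class and method names here are invented for illustration, not the proposal's API: a per-transaction cache is filled on first load and emptied at the transaction boundary, so no stale state survives between transactions.

```python
# Sketch of transaction-scoped relational objects (invented names).
class TransactionScopedCache:
    def __init__(self, loader):
        self._loader = loader   # callable: key -> freshly loaded state
        self._objects = {}

    def get(self, key):
        # Within one transaction, repeated gets return the same object.
        if key not in self._objects:
            self._objects[key] = self._loader(key)
        return self._objects[key]

    def finish(self):
        # Called at commit or abort: forget everything, so the next
        # transaction re-reads current RDBMS state.
        self._objects.clear()

loads = []
def load(key):
    loads.append(key)           # record each trip to the database
    return {"id": key}

cache = TransactionScopedCache(load)
a = cache.get(1)
b = cache.get(1)   # same transaction: no second load
cache.finish()     # transaction ends, cache emptied
c = cache.get(1)   # next transaction: loaded afresh
```

The trade Shane describes is visible here: every transaction pays for its own loads, but consistency with the RDBMS is automatic.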
[Zope-dev] OR mapping proposal
http://dev.zope.org/Wikis/DevSite/Proposals/ORMappingDB Comments encouraged!

Shane
Re: [Zope-dev] OR mapping proposal
On Tue, 15 May 2001, Phillip J. Eby wrote:

> If we had a standardized manipulation API or idioms (like JavaBeans) for application objects, then having lots of ways to *implement* storage would be a good thing. Different products and offerings could co-exist and compete in the storage implementation space, and users would have the benefits of being able to choose their back-ends and app frameworks without being locked into one framework's API.

That sounds useful, though I've been hinting at a more gradual approach. For example, I might define an OR mapping like this early on (roughly):

class UserMapping (ObjectMappingScheme):

    def loadState(self, object, key, store, cursor):
        records = cursor.execute('select login, password from users'
                                 ' where uid=%(key)s', {'key': key})
        d = object.__dict__
        r = records[0]
        d['id'] = key
        d['login'] = r['login']
        d['__'] = r['password']

    def storeState(self, object, key, store, cursor, transaction):
        d = object.__dict__
        cursor.execute('update users set login=%(login)s, password='
                       '%(__)s where uid=%(id)s', d)

Then I might see that many tables conform to a pattern and I would convert it to this:

UserMapping = SimpleAttributeMapping(table='users',
    columns={'id': 'uid', 'login': 'login', '__': 'password'})

Then I could make a web interface for creating SimpleAttributeMapping objects. Of course this is just the simplest case, and things get more interesting when you have hierarchies of objects or complex queries.

> Perhaps I should offer up a counter-proposal, focused on establishing a common API and proposing some of the requirements for same? Presumably we are all agreed that it should be as Pythonic as possible, but no more so. :) Also, API is perhaps not the right word, it is more about access and manipulation idioms. It needs to deal explicitly with the notion of relationships as well as attributes in the sense of data fields.
> And it needs to deal with the notion of how you determine what classes should be used for what things and how to get at those classes (since they may need to be backend-specific).

You mean a complementary proposal, don't you?

> These are issues, by the way, which the current ZODB API dodges, and that is why I've been saying that doing O-R mapping in ZODB doesn't help the key issues of database independence. You *still* have to code to a style that is compatible with changing back-ends. I think it might be helpful if we all got on the same page about what that style should be, and then all these efforts could go forward knowing that in the Zope application space, users will only need to learn one such style at the Python level, and any education efforts about that style can be leveraged across many possible implementation approaches.

Sounds great!

Shane
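One way the "common style" discussed above might be pinned down is a small abstract interface that every storage back-end implements, so application code manipulates plain attributes and never sees the back-end. The `ObjectMappingScheme` name and the `loadState`/`storeState` signatures echo the `UserMapping` example earlier in the thread; everything else here (the dict-backed "cursor", the `User` class) is invented purely for illustration.

```python
# Sketch of a back-end-neutral mapping interface (invented except
# where it echoes the UserMapping example from the thread).
class ObjectMappingScheme:
    def loadState(self, object, key, store, cursor):
        raise NotImplementedError

    def storeState(self, object, key, store, cursor, transaction):
        raise NotImplementedError

class DictBackedMapping(ObjectMappingScheme):
    # A trivial back-end: "rows" are dicts keyed by id, standing in
    # for an RDBMS cursor. Swapping in an SQL implementation would
    # not change the application-facing idiom at all.
    def loadState(self, object, key, store, cursor):
        object.__dict__.update(cursor[key])
        object.__dict__['id'] = key

    def storeState(self, object, key, store, cursor, transaction):
        cursor[key] = {k: v for k, v in object.__dict__.items()
                       if k != 'id'}

class User:
    pass

rows = {7: {'login': 'shane'}}
u = User()
DictBackedMapping().loadState(u, 7, None, rows)
u.login = 'shane2'                 # ordinary attribute manipulation
DictBackedMapping().storeState(u, 7, None, rows, None)
```

The point of the sketch is the idiom, not the implementation: application code only ever touches attributes, and the mapping scheme is the single place where back-end knowledge lives.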