BACKGROUND

I was attracted to Zope in 1998 because it freed us from the clumsiness of the first normal form.
Many in the Zope community can also boast membership in the NoSQL "old guard". The ZODB is great, but it, like all other OODBs, has a serious problem: the data is tied too closely to the application. Objects in a ZODB Data.fs instance cannot be retrieved unless the classes that define them are in memory and in perfect sync with the stored data. But the classes are not kept in the same storage, so that is never guaranteed to work. In an RDBMS the schema is part of the database, so schema and data can never be out of sync.

We lived with this for more than a decade, but then I learned: the data is always more valuable than the application, so we need to be able to get to it without the original software.

The Python community has always been smart about finding the middle ground. For databases, between the extremes of RDBs and OODBs, I believe the middle ground is semi-structured databases, as exemplified by CouchDB, MongoDB, Google Datastore (sort of) and a host of others that are gaining momentum, features, and use cases. In these, we don't store serialized objects, but just the data needed to reconstruct the objects. But the data is not completely dismembered into some normalized form. In a semi-structured database the data graph can follow the original object graph very closely, which makes retrieval easier for the programmer and more efficient for the database. And the schema is self-describing, which means that if you have a database backup, you can get to the data even if you don't have the software that put it there.

SEMI-STRUCTURED DATABASE MODEL

Here is a useful definition:

"""
The semi-structured data model is designed as an evolution of the relational data model that allows the representation of data with a flexible structure. Some items may have missing attributes, others may have extra attributes, some items may have two or more occurrences of the same attribute. The type of an attribute is also flexible: it may be an atomic value, or it may be another record or collection.
Moreover, collections may be heterogeneous, i.e., they may contain items with different structures. The semi-structured data model is a self-describing data model, in which the data values and the schema components co-exist. [1]
"""

"Self-describing" is the key. It is also interesting to note that the text above reads like a description of the Python data model in general.

Here is a formal definition, from the same source:

"""
A semi-structured data instance is a rooted, directed graph in which the edges carry labels representing schema components, and leaf nodes (i.e., nodes without any outgoing edges) are labeled with data values (integers, reals, strings, etc.).
"""

[1] M.T. Özsu and L. Liu, Encyclopedia of Database Systems, Springer, 2009.

THE OPPORTUNITY

I believe BFG, with its battle-tested traversal machinery, is uniquely well positioned to take advantage of the wider adoption of semi-structured databases. A missing piece is a generic API for semi-structured data, to fill the role that SQLAlchemy plays in the BFG ecosystem. Does anyone know whether a good candidate for this already exists?

I am very happy that I took the time to visit Paul, Chris and Tres on my last trip to Washington, DC. Thanks for the hospitality, the book, and the inspiring ideas, guys. I am excited about BFG and looking forward to the rest of this story.

Cheers,

Luciano

_______________________________________________
Repoze-dev mailing list
Repoze-dev@lists.repoze.org
http://lists.repoze.org/listinfo/repoze-dev
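
A postscript to make the formal definition quoted above more concrete: the sketch below turns a nested Python record into a rooted, directed graph with labeled edges and value-carrying leaves. It is only an illustration under stated assumptions, not the API of any existing library. The function name `to_graph`, the sample `person` record, and the convention that a list under a key stands for repeated occurrences of the same attribute are all mine.

```python
import itertools


def to_graph(record):
    """Flatten a nested dict into (root, edges, leaves).

    Hypothetical helper, for illustration only.
    edges  : list of (parent_id, label, child_id) tuples
    leaves : dict mapping node_id -> atomic value
    Assumes lists appear only directly as dict values.
    """
    edges = []
    leaves = {}
    counter = itertools.count()  # node ids: 0, 1, 2, ...

    def visit(item):
        node = next(counter)
        if isinstance(item, dict):
            for label, child in item.items():
                # A list models repeated occurrences of one attribute:
                # one edge per element, all carrying the same label.
                children = child if isinstance(child, list) else [child]
                for c in children:
                    edges.append((node, label, visit(c)))
        else:
            # No outgoing edges: a leaf labeled with its data value.
            leaves[node] = item
        return node

    root = visit(record)
    return root, edges, leaves


# Made-up sample record: a repeated attribute and a nested record.
person = {
    "name": "Ana",
    "phone": ["555-0100", "555-0101"],
    "address": {"city": "Lisbon"},
}

root, edges, leaves = to_graph(person)
for parent, label, child in edges:
    print(parent, label, child, leaves.get(child, "(record)"))
```

The edge labels ("name", "phone", "city") are the schema and they travel with the values, which is what "self-describing" means here: the graph can be read back without the class that produced it.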