I have been following the discussion thread for the serialization library review with some interest, as I think the topic is of extreme importance. Right up there with smart pointers and threading, it's something that would be used by many people for many different things. I want to thank Robert for the obviously extensive amount of thought and effort that he has put into this. I have this nagging feeling, though, that we all need to step back for a moment and re-examine our first principles.
Based on reading the documentation, carefully reading the discussion (from Unicode support to XML archive formats to registration issues, etc.), and my own thoughts on the subject, I get the distinct impression that we're not at all in agreement about what exactly a serialization library is supposed to do.

From the documentation, it is clear that Robert is most familiar with MFC's serialization mechanism, and the library follows in that general mold. He also states that this is a "serialization" library, and not a "persistence" library (like what you find with an OO database). I think that this is an extremely important point that I don't remember really being discussed at all. The discussion about alternate archive formats, especially XML, CSV, and other formats for interchange with other systems, makes sense in the context of "serialization." A discussion questioning whether these formats could truly represent all C++ structures (especially diamond inheritance) ensued, but this point is only relevant to persistence, not to serialization.

Because I think that a lot of the discussion hinges on this point, I'm going to venture to make a distinction, and I'd like to know if people agree.

Persistence: A transformation-less transfer of application native data to an alternate storage medium. Only useful, and only intended to be useful, to applications that agree a priori on object type and layout, presumably by sharing headers. May optionally account for differences in architecture or compiler. Must be symmetric--support both store and load. Alternate storage formats would differ only for efficiency reasons, perhaps at the expense of not supporting constructs not needed by a given application.

Serialization: A transformation of application native data into a serial intermediate exchange format specified by the application writer.
Whether objects can be read back in an order different from the one in which they were stored, or whether there is any object identification of any kind, is up to each individual format. Because it is not presumed that applications share a priori knowledge, it may be necessary to include metadata regarding data types. Various structuring mechanisms may be used, coming in various flavours: header, pre/post tags, post (i.e., terminated), length prepended, packeted, etc. Metadata may be independent of the structure or mixed with it. Also, it is not presumed that all expressible object layouts or relationships can be serialized to all possible formats; it is the responsibility of the format writer to account for this. Often, only store or load will be supported--the symmetric operation is performed by the application one is exchanging data with (really, the point of the exercise).

The library up for review strikes me as a serialization library intended to function as a persistence library, and I think this prompted everyone to ask for different things in the confusion. To me, a persistence library must take into account object factories, object lifetime management, and versioning, and should be fairly transparent (praying for MPL magic here), while a serialization library must deal with the archive format issues and explicit conversion logic. While either task is huge, I really think we need to clarify the purpose and scope of a boost serialization library that we would accept--that's only fair to Robert.

So here are my thoughts:

1) We explicitly acknowledge that persistence is a separate topic requiring a separate, mostly unrelated, library. It is very important in itself, but we should start a different thread later. That thread should talk about factories and lifetime management, and not get confused about whether or not XML is pertinent (it's not, keep it on this thread).

2) Assuming I can now focus on serialization, the most important requirement relates to what styles of archive can be generated.
In other words, this means defining the set of hooks in the serialization process to insert tags and/or metadata. For starters, we need very simple examples demonstrating JPEG style headers, XML style pre/post tags, SWF style length prepended tags, CSV style terminated lists, and perhaps protocol style packets.

3) We understand that the library still requires that archives be written. A couple of the examples should probably be useful, but other boost members should be encouraged to write and submit archives for their own pet format.

4) Having introduced tags and metadata issues, escaping schemes need to be introduced: URL-encoding, C-style backslashing, etc. I don't remember seeing anything that indicated this was already present; excuse me if it is.

5) The archive writer also needs a way to write something to verify archive files. Spirit would probably come in handy here. I haven't thought through the details of what I would want here; perhaps the serialization library can't really help with this task.

6) Versioning matters less at the class level than at the level of the entire archive format.

Perhaps with the metaprogramming panacea in hand after some language revisions, we will be able to unify all of the storage and translation mechanisms under one framework. Until then, I vote that we keep serialization serial and don't worry about persistence until we're ready to take the bull by the horns.

One last thought on the multiple format problem: it may be wiser to standardize all serialization on XML and rely on XSLT to transform into all the different formats we might want.

Again, I want to give a huge thanks to Robert: you've been wonderfully patient with all of us, even though everybody keeps voting "no." Thanks, too, to all the reviewers and their insights.

Cheers-
Augustus
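As an illustration of points 2 and 4 above, here is a minimal sketch of what a set of format hooks with per-format escaping might look like. Everything in it is an assumption made up for this example: `Point`, `FormatHooks`, `XmlHooks`, `CsvHooks`, and `save_point` are hypothetical names and are not part of the library under review. The "CSV style" format uses C-style backslash escaping rather than standard CSV quoting, purely to keep the example short.

```cpp
#include <cassert>
#include <string>

// Hypothetical application data to be exchanged.
struct Point {
    std::string label;
    int x;
    int y;
};

// Hypothetical hook set: each format decides what surrounds records and fields.
struct FormatHooks {
    virtual ~FormatHooks() = default;
    virtual void begin_record(std::string& out, const std::string& name) const = 0;
    virtual void field(std::string& out, const std::string& name,
                       const std::string& value, bool last) const = 0;
    virtual void end_record(std::string& out, const std::string& name) const = 0;
};

// XML style: pre/post tags around every value, entity escaping for markup characters.
struct XmlHooks : FormatHooks {
    static std::string escape(const std::string& s) {
        std::string r;
        for (char c : s) {
            switch (c) {
                case '&': r += "&amp;"; break;
                case '<': r += "&lt;";  break;
                case '>': r += "&gt;";  break;
                default:  r += c;
            }
        }
        return r;
    }
    void begin_record(std::string& out, const std::string& name) const override {
        assert(!name.empty());  // a tag needs a name
        out += "<" + name + ">";
    }
    void field(std::string& out, const std::string& name,
               const std::string& value, bool) const override {
        out += "<" + name + ">" + escape(value) + "</" + name + ">";
    }
    void end_record(std::string& out, const std::string& name) const override {
        out += "</" + name + ">";
    }
};

// CSV style: comma-separated fields, newline-terminated record, backslash escaping.
struct CsvHooks : FormatHooks {
    static std::string escape(const std::string& s) {
        std::string r;
        for (char c : s) {
            if (c == '\n') { r += "\\n"; continue; }   // keep one record per line
            if (c == ',' || c == '\\') r += '\\';      // protect separator and escape char
            r += c;
        }
        return r;
    }
    void begin_record(std::string&, const std::string&) const override {}
    void field(std::string& out, const std::string&,
               const std::string& value, bool last) const override {
        out += escape(value);
        if (!last) out += ',';
    }
    void end_record(std::string& out, const std::string&) const override {
        out += '\n';
    }
};

// The conversion logic is written once; the hooks supply structure and escaping.
std::string save_point(const Point& p, const FormatHooks& fmt) {
    std::string out;
    fmt.begin_record(out, "point");
    fmt.field(out, "label", p.label, false);
    fmt.field(out, "x", std::to_string(p.x), false);
    fmt.field(out, "y", std::to_string(p.y), true);
    fmt.end_record(out, "point");
    return out;
}
```

With these definitions, calling `save_point` on a `Point` labelled `a<b,c` at (3, -4) yields `<point><label>a&lt;b,c</label><x>3</x><y>-4</y></point>` through `XmlHooks`, and `a<b\,c,3,-4` followed by a newline through `CsvHooks`. A header style or length-prepended format would be one more `FormatHooks` subclass, although a length prefix would probably need the hook to buffer the record body before writing it.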
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost