On Wed, 11 Dec 2002 18:17:21 -0500 David Abrahams ([EMAIL PROTECTED]) wrote:

> I'm willing to use any terms that everyone will agree to
> (including yours)

Me too.

> but whichever terms we use should be at least as clearly defined
> as what Augustus wrote.

I'm afraid I couldn't quite get my head around them. To me,
"persistence" and "serialisation" are at different levels of
abstraction. Serialisation is one way to implement persistence. As
such they do not compete; they are not mutually incompatible
alternatives.

I think we have a consensus that a fully general persistence library,
one that could be implemented by dumping RAM images to disk or
whatever, is not what we want at this point. I'm OK with that. What I
don't understand is what Augustus means when he says:

    I think that plain serialization (your term) should be explicitly
    *not supported* and defer that use case to a safer, more airtight
    approach with a persistence library.

What is gained by excluding persistence, and/or the simpler kinds of
serialisation (where source and destination are the same program
running on the same hardware with the same compiler)?

> So far, you haven't provided a clear definition of serialization.

Actually I agree with Augustus's, as far as I understand it, which
isn't far. He seems to imply that serialisation does not need to
bother with object factories or object lifetime management. I don't
understand how that can be.

I can't figure out whether UDT versioning belongs to Persistence or to
Serialisation. He says Persistence, but doesn't that make Persistence
asymmetrical and involve it in non-trivial transforms? How can it be
achieved by transparent meta-programming magic? I doubt a robust but
transparent persistence mechanism can be built.

> > We could send a binary format through a "uuencode" filter, but a
> > text format which was natively safe would be neater (and probably
> > more efficient).
>
> Why would it be more efficient?

Because it has more knowledge. For example, if we write out the
number 500 using an alphabet of 64 safe characters, it takes 2
characters.
If we write it out using all 256 byte values, it still takes 2 of
them, but now to make it safe each byte needs 2 safe characters to
represent it, so it takes 4 characters altogether. The double
conversion is more verbose because the first part loses information.

> > Adding or removing instance variables is pretty straightforward.
>
> Erm. I am still leery of thinking of all this in terms of "instance
> variables". The representation of state written to the archive may or
> may not have a direct correspondence to a class' data members.

Sure. Call them "fields" if it helps. I sometimes find it helpful to
think in terms of concrete examples. The point is, sometimes a class
grows so that its serialised representation gets bigger.

> "schema ID"?

A term from MFC. It is what the submitted library calls a
file_version.

> Can you give an example of "containing the mess within the UDT?"

I don't have a good example to hand. Here's a made-up one:

    void MyClass::load( CArchive &ar ) {
        // Fetch the schema number; 10 and 15 are the range-check bounds.
        int schema = load_schema( ar, 10, 15 );

        // The base class changed at schema 13.
        if (schema >= 13)
            MyBaseClass::load( ar );
        else {
            MyOldBaseClass::load( ar );
            int myBaseClassData;
            ar >> myBaseClassData;
            MyBaseClass::init( myBaseClassData );
        }

        // myVar1 was added at schema 14; older archives get a default.
        if (schema >= 14)
            ar >> myVar1;
        else
            myVar1 = 100;

        // Only schema 14 stored this int; read it and discard it.
        if (schema == 14) {
            int unused;
            ar >> unused;
        }

        // myVar2 changed type at schema 13.
        if (schema >= 13)
            ar >> myVar2;
        else {
            MyOldType t;
            ar >> t;
            myVar2 = convert( t );
        }
    }

The first line fetches the class's schema/version number. The
arguments to load_schema() are used for range-checking; load_schema()
may throw. For safety, it's best not to use schemas 0 or 1, so I
usually start from 10.

The next block chains explicitly to the base class. In this case older
archives used a different base class so we have some nasty code to
make it work.

The next few lines load a variable. Old archives didn't store it, so
we have to provide a default value.

Schema 14 added an int which was later removed; if it is present we
have to skip over it.

The last few lines load another variable.
Older archives used a different type so we may need to load a
temporary object of that old type and then convert it.

I don't know what you think of this code - whether it horrifies you
for being too low level or lacking in design foresight. It is my
practical experience. Designs age, and the history accretes in the
serialisation load routines. I hope that the boost library will be
able to support this kind of evolution. I don't claim that code like
this is the best solution, but in practice I have found it works.

> It's beginning to sound more and more like the metaclass framework
> some people have been hinting at.

Do you mean that some framework could handle a history like that
reflected in the above code, automatically? How would that work? How
could it track changes to the base class over time?

Java manages it by storing a snapshot of the class hierarchy (as it
was when the archive was made) into the archive. That gives it enough
information to figure out how the hierarchy has changed. However, it
can lead to rather bloated archives.

> > Renaming classes is something which MFC doesn't support. I believe
> > that some of the proposals which came up during the review would
> > allow this.
>
> Why should a class name come into play, unless you were using
> std::type_info to archive it?

MFC uses its own macro-driven RTTI system, in which classes are
identified by name. If we don't use that, or type_info, then there is
probably no problem. We do need to make sure that we can add new
classes without somehow breaking the correspondence between the old
classes and whatever the archive stores to represent them.

> ... assuming there is such a factory method.

The archive has to store something to represent classes, and has to
be able to create instances of the classes so represented, in order
to restore polymorphic pointers. That's what I mean by a "factory
method". I don't mean to imply a particular implementation.
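
For concreteness, here is one way such a factory could look. It is a
minimal sketch only, not a proposal for the library's interface.
Shape, Circle, Rectangle, Factory and make_factory are hypothetical
names invented for this illustration, and it is written in C++14
(std::unique_ptr, std::make_unique). It assumes the archive stores a
programmer-chosen tag string for each class rather than anything
derived from type_info, so a renamed class can still be loaded under
its old tag.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical base class for objects restored through a polymorphic
// pointer.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string describe() const = 0;
};

struct Circle : Shape {
    std::string describe() const override { return "Circle"; }
};

// Assumption for the example: this class was called "Box" when older
// archives were written.
struct Rectangle : Shape {
    std::string describe() const override { return "Rectangle"; }
};

// Maps the tag stored in the archive to a function that creates an
// instance. The tag is chosen by the programmer, not taken from the
// C++ class name, so renaming a class does not invalidate old files.
class Factory {
public:
    using Creator = std::function<std::unique_ptr<Shape>()>;

    void add(const std::string& tag, Creator creator) {
        creators_[tag] = std::move(creator);
    }

    // Creates the object registered under tag; throws if the tag is
    // unknown.
    std::unique_ptr<Shape> create(const std::string& tag) const {
        auto it = creators_.find(tag);
        if (it == creators_.end())
            throw std::runtime_error("unknown class tag: " + tag);
        return it->second();
    }

private:
    std::map<std::string, Creator> creators_;
};

Factory make_factory() {
    Factory f;
    f.add("Circle", [] { return std::make_unique<Circle>(); });
    f.add("Rectangle", [] { return std::make_unique<Rectangle>(); });
    // Alias so that archives written before the rename still load.
    f.add("Box", [] { return std::make_unique<Rectangle>(); });
    return f;
}
```

With this, create("Box") on a tag read from an old archive yields a
Rectangle, and an unrecognised tag throws instead of returning a null
pointer. Adding a new class is one more add() call and leaves the
existing tags untouched.
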
> It sounds like your viewpoint on this is very heavily influenced by
> one particular kind of application.

Yes. Well, less so than my choice of words may have implied. And of
course in that passage I was discussing a trap that MFC fell into.
Your earlier comment:

    [...] the use of type_info::name() for type identification. Even
    if these were optional components to the library, they could
    provide enormous benefit for some applications.

made it sound like you might make the same mistake. If we use class
names to identify types, we need to make sure we can rename classes
and still load old files.

But generally, yes, I know what kind of applications I write and hope
boost will support them. If other people have different expectations,
shouldn't they write about them? Isn't that what this pre-coding
discussion is for?

I'm sorry for the length of this post, but now that I've written it,
maybe you can tell me whether I want a persistence library or a
serialisation library.

-- Dave Harris