Eliot Miranda wrote:
>
> On Fri, Jan 22, 2010 at 12:35 PM, Stéphane Ducasse wrote:
>
>     hi guys
>
>     the more I read about imageSegments the more I would like to remove
>     them (or to package them carefully - not sure that this is possible)
>     and may be add a new class to just have one simple way of invoking
>     the save (but not swapping back in)
>
>     I think that mariano diving into them is a great phd exercise but on
>     the long run I see it as a brittle mechanism.
>
>     what do you think?
>
>
> David Leibs' work on parcels in VW demonstrated that high-performance
> packaging can be done with no VM support.

Parcels are wonderful things, but my impression is that ImageSegment and
parcels are designed to do rather different things. Parcels are used for
physical delivery of packages (primarily code), whereas ImageSegments are
for arbitrary graphs of objects (primarily not code, although some folks
have tried to use ImageSegments for code). If parcels are designed to be
more general than package delivery, I'd like to hear about it.

I suspect that the only VM support that ImageSegments really need is the
mark-sweep primitives to discover which objects are in the ImageSegment.
All other algorithms, file format, etc. can (and maybe should) be
redesigned to be better, but using the GC to find the objects is the
heart of what ImageSegment is.

> When you implement a binary format, carefully designed for unpacking
> performance, at the image level you get the freedom to add flexibility.
> I added shape change support to parcels (and some higher level features
> that aren't relevant here) after David had left. So I think the right
> approach is to reimplement image segments entirely in the image without
> special VM support and add metadata to the format (class shape
> information) and you'll probably end up with something that is nearly
> as performant but much more flexible and evolvable.
>
>
> The two keys to the performance of David's design are the separation of
> objects from their references and the batching of object allocations. A
> parcel file starts with a number of allocations of well-known classes
> (e.g. this parcel contains 17 large integers of the following sizes, and
> 3 floats, and 17 symbols of the following sizes etc) followed by an
> arbitrary number of "N instances of class X".
> So the unpacker populates an object table with indices from 1 to N
> where N is the number of objects in the parcel, but it does so in
> batch, spinning in a loop creating N instances of each class in turn,
> instead of determining which object to create as it walks a (flattened)
> input graph. After the instance data comes the reference data, which
> slots refer to which objects. Again the unpacker can spin filling in
> slots from the reference data instead of determining whether to
> instantiate an object or dereference an object id as it walks the input
> graph. So loading is much faster than e.g. ReferenceStream-style
> approaches.

I like that approach for a file format. It probably doesn't even make
writing the file out much slower; the work has to be done in multiple
passes, but each pass is simpler. And write speed is important: A parcel
is typically written once and loaded many times, but one common pattern
of ImageSegment use is write once, load once, discard, or even write
once, load never.

Regards,

-Martin

_______________________________________________
Pharo-project mailing list
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
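
For anyone who wants to see the two-phase idea from the quoted description in concrete terms, here is a minimal Python sketch. It is not the parcel file format and not VisualWorks or Pharo code. The classes Node and Leaf, the functions pack and unpack, and the tuple layout are all made up for illustration, and it assumes every slot holds either None or another packable object. The writer works in simple passes (find objects, group by class, record references); the loader first allocates all instances in batch, then spins over the reference data filling in slots.

```python
from collections import defaultdict


class Node:
    """Hypothetical class with two slots; stands in for any class in a package."""
    def __init__(self):
        self.left = None
        self.right = None


class Leaf:
    """Hypothetical class with one slot."""
    def __init__(self):
        self.owner = None


def pack(roots):
    """Write side: three simple passes over the graph reachable from roots."""
    # Pass 1: discover every reachable object.
    seen, stack, found = set(), list(roots), []
    while stack:
        obj = stack.pop()
        if obj is None or id(obj) in seen:
            continue
        seen.add(id(obj))
        found.append(obj)
        stack.extend(vars(obj).values())

    # Pass 2: group by class so each class gets one contiguous index range.
    by_class = defaultdict(list)
    for obj in found:
        by_class[type(obj)].append(obj)
    allocations, index_of = [], {}
    for cls, objs in by_class.items():
        allocations.append((cls, len(objs)))        # "N instances of class X"
        for obj in objs:
            index_of[id(obj)] = len(index_of) + 1   # indices run 1..N

    # Pass 3: record which slot refers to which object (index 0 means None).
    references = [
        (index_of[id(obj)], slot, index_of.get(id(value), 0))
        for obj in found
        for slot, value in vars(obj).items()
    ]
    root_indices = [index_of[id(r)] for r in roots]
    return allocations, references, root_indices


def unpack(allocations, references):
    """Load side: batch allocation first, then a flat loop filling slots."""
    table = [None]                                   # entry 0 represents None
    for cls, count in allocations:
        # Create all instances of one class in a tight loop, no graph walk.
        table.extend(cls.__new__(cls) for _ in range(count))
    for holder, slot, target in references:
        setattr(table[holder], slot, table[target])
    return table


# Usage: a small graph with a cycle survives the round trip.
a, b, leaf = Node(), Node(), Leaf()
a.left, a.right, b.left, leaf.owner = b, leaf, a, a
allocations, references, root_indices = pack([a])
table = unpack(allocations, references)
root = table[root_indices[0]]
```

Note that the loader never has to decide between "instantiate" and "dereference" while reading: every index in the reference data already points at an allocated object, which is the property the quoted text credits for the load speed.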
