On Wed, Jan 30, 2008 at 08:34:44 +0000, Ciaran wrote:
> I understand exactly what you're saying about the data-type in flags
> approach being a corruption of the flag intent, but if there was to be an
> agreed 'cross-platform serialisation' strategy (such as the first one or
> two bytes of the actual data being the 'serialisation flag' )

Note that I'm not against the "SERIALIZED" flag itself, only against using
flags to encode the data type. The client can always tell that the data is
serialized by looking at the flags, but it can't tell the serialization
method from them.

Yes, it's possible to raise the SERIALIZED flag, encode the serialization
method in the first two bytes, and put the rest after that. IMHO this is
preferable to encoding the serialization type in flags, as was suggested in
another email.

But the problem of choosing a common serialization method is not limited to
the memcached domain. There's no single library that would allow you to pass
data structures back and forth between different environments over, say, a
socket, so it sounds ambitious if we are going to implement one for
memcached ;).

Serialization of an arbitrary data structure is a complex task, as one of
the sub-problems is reconstruction of a directed graph, with cycles in the
general case (think of pointers/references). As for limiting it to some
basic scalar types, strings, and arrays thereof, even here there are
deviations: not every language has a distinct boolean type, integers of
various widths, internal support for UTF-8, etc.

The common subset would leave us with integer, floating point, and string
(as a byte sequence; even the character vs. byte distinction is disputable).
And it's easy to write a serializer for these three (by stringifying numbers
as a last resort, for instance). Sad, but true.

--
Tomash Brechko
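
To make the layout discussed above concrete (a SERIALIZED flag, then a two-byte method tag, then the payload), here is a minimal sketch in Python. Everything named in it is an assumption made for illustration: the flag value, the two-byte tags, and the encode/decode helpers are hypothetical and are not defined by memcached or by any client library. Numbers are simply stringified, as suggested above.

```python
# Hypothetical constants, chosen only for this sketch.
FLAG_SERIALIZED = 0x1
TAGS = {int: b"I:", float: b"F:", str: b"S:"}  # two-byte method/type tags


def encode(value):
    """Return (flags, payload) for an int, float, or str value."""
    # bool is excluded on purpose: not every language has a distinct boolean.
    if type(value) not in TAGS:
        raise TypeError("only int, float and str are in the common subset")
    if isinstance(value, str):
        body = value.encode("utf-8")
    else:
        body = repr(value).encode("ascii")  # stringify numbers
    return FLAG_SERIALIZED, TAGS[type(value)] + body


def decode(flags, payload):
    """Inverse of encode(); payloads without the flag are returned as raw bytes."""
    if not flags & FLAG_SERIALIZED:
        return payload
    tag, body = payload[:2], payload[2:]
    if tag == b"I:":
        return int(body.decode("ascii"))
    if tag == b"F:":
        return float(body.decode("ascii"))
    if tag == b"S:":
        return body.decode("utf-8")
    raise ValueError("unknown serialization tag: %r" % tag)
```

For example, decode(*encode(42)) gives back the integer 42, while a value stored without the flag comes back as an untouched byte string. The flag only says "this is serialized"; the tag inside the data says how.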
