In-Reply-To: <[EMAIL PROTECTED]>

From the headers...

> typedef unsigned char version_type; // upto 255 versions
> namespace serialization_detail {
> typedef unsigned short class_id_type; // upto 64k kinds
> // of objects
> typedef int object_id_type; // upto 2G objects
> }

It seems to me these limits are arbitrary, and in some cases rather low. Wouldn't it be better, and more general, to use int or long?

On a related note, I think variable length integers ought to be supported as a primitive. For example, consider something like:

    void basic_oarchive::save_vri( unsigned long x ) {
        bool more_to_come = true;
        while (more_to_come) {
            unsigned char low_bits = x & 0x7f;
            x >>= 7;
            more_to_come = (x != 0);
            unsigned char high_bit = more_to_come ? 0x80 : 0x00;
            // Cast back to a byte: the | promotes its operands to int.
            *this << static_cast<unsigned char>(high_bit | low_bits);
        }
    }

    unsigned long basic_iarchive::load_vri() {
        unsigned long x = 0;
        unsigned shift = 0;
        bool more_to_come = true;
        while (more_to_come) {
            unsigned char bits;
            *this >> bits;
            // Groups arrive least significant first, matching save_vri.
            x |= static_cast<unsigned long>(bits & 0x7f) << shift;
            shift += 7;
            more_to_come = (bits & 0x80) != 0;
        }
        return x;
    }

This encodes an unsigned integer as a variable number of bytes. The low 7 bits of each byte contribute to the number, and the high bit says whether there are more bytes to come. The least significant 7 bits are written first. Although I've used this technique in the past I haven't tested this exact code, so it may have bugs or be in the wrong place.

If we are saving in an ASCII format we wouldn't want to do this, because ASCII is intrinsically variable length anyway. And of course, we cannot use it as the default way of writing integers, because for large numbers it is less efficient than a fixed-size write (although for a 32-bit value the overhead is never more than one byte). That said, when used appropriately the benefits include:

(a) Smaller archives in the common case.
(b) Faster loading and saving (because there are fewer bytes to move around).
(c) Avoidance of arbitrary limits caused by hardwired sizes.
(d) Extra portability due to not relying on the number and ordering of bytes in primitive types.

Of course something like this can be built on top of the current library, but if it is included then the library can use it for its own bookkeeping data. It can be used for things like class_id_type and the lengths of strings and vectors. Then the library will get benefits (a)-(d).

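The encoding can also be tried outside the archive classes. What follows is only a minimal sketch, assuming a plain std::vector<unsigned char> as the byte buffer. The names encode_vri and decode_vri are hypothetical and are not part of the serialization library; the decoder assumes well-formed input and merely asserts on truncated or over-long data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the serialization library.
// Appends x to out, 7 bits per byte, least significant group first.
// The high bit of a byte is set when more bytes follow.
inline void encode_vri(std::uint64_t x, std::vector<unsigned char>& out)
{
    do {
        unsigned char byte = static_cast<unsigned char>(x & 0x7f);
        x >>= 7;
        if (x != 0)
            byte |= 0x80;            // continuation flag
        out.push_back(byte);
    } while (x != 0);
}

// Hypothetical helper: decodes one value starting at in[pos] and
// advances pos past it. Assumes a complete, well-formed encoding.
inline std::uint64_t decode_vri(const std::vector<unsigned char>& in,
                                std::size_t& pos)
{
    std::uint64_t x = 0;
    unsigned shift = 0;
    for (;;) {
        assert(pos < in.size() && shift < 64);   // truncated or over-long input
        unsigned char byte = in[pos++];
        x |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0)
            return x;                // last byte of this value
        shift += 7;
    }
}
```

With this sketch any value below 128 occupies a single byte, 300 is written as the two bytes 0xAC 0x02, and the largest 32-bit value occupies five bytes, which is where the one byte of overhead comes from.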
For example, a string like "hello" is currently stored (by boarchive) with a size_t length, which on my machine is 32 bits, taking 9 bytes altogether. If the variable length format is used, it will take 6 bytes, a 33% saving. Further, it can be reloaded into a machine for which size_t is only 16 bits.

-- Dave Harris

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost