Ron Adam wrote:

> I was presuming it would be done in C code and it will just need a
> pointer to the first byte, memchr(), and then read n bytes directly
> into a new memory range via memcpy().
If the object supports the buffer interface, it can be done that way. But if not, it would seem to make sense to fall back on the iterator protocol.

> However, if it's done with a Python iterator and then each item is
> translated to bytes in a sequence (much slower), an encoding will
> need to be known for it to work correctly.

No, it won't. When using the bytes(x) form, encoding has nothing to do with it. It's purely a conversion from one representation of an array of values in the range 0..255 to another. When you *do* want to perform encoding, you use bytes(u, encoding) and say what encoding you want to use.

> Unfortunately Unicode strings don't set an attribute to indicate
> their own encoding.

I think you don't understand what an encoding is. Unicode strings don't *have* an encoding, because they're not encoded! Encoding is what happens when you go from a Unicode string to something else.

> Since some longs will be of different length, yes a bytes(0L) could
> give differing results on different platforms,

It's not just a matter of length. I'm not sure of the details, but I believe longs are currently stored as an array of 16-bit chunks, of which only 15 bits are used. I'm having trouble imagining a use for low-level access to that format, other than treating it as an opaque lump of data for turning back into a long later -- in which case, why not just leave it as a long in the first place?

Greg
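As a postscript, the distinctions above can be sketched in Python 3 terms (a hedged illustration, not part of the original discussion; the exact digit width reported below varies by build):

```python
import sys

# bytes(x) from an iterable of small ints: a pure change of
# representation -- no encoding is involved anywhere.
b = bytes([72, 105])
assert b == b'Hi'
assert list(b) == [72, 105]

# Encoding only enters the picture when converting *from text*,
# and you must say which encoding you want; different choices
# give different byte sequences.
s = "héllo"
assert s.encode("utf-8") != s.encode("latin-1")

# CPython really does store ints as an array of fixed-width
# digits; sys.int_info exposes the parameters (15 or 30 bits
# per digit, depending on the build).
print(sys.int_info.bits_per_digit)

# The portable way to get bytes out of an int is int.to_bytes(),
# which hides the internal digit layout entirely.
n = 2**40 + 7
data = n.to_bytes(6, "big")
assert int.from_bytes(data, "big") == n
```

The last point is the practical answer to the "opaque lump" concern: rather than exposing the internal chunk format, a well-defined external byte layout (endianness and length chosen by the caller) round-trips the value portably.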