On Fri, 16 Dec 2011 09:08:12 +0100, Mark Rotteveel wrote:

> Yes there is a binary representation defined in
> http://www.ietf.org/rfc/rfc4122.txt in section 4.1.2. It also explicitly
> says that bytes should be in network byte order (aka big-endian).

Exactly.

So the problem with our UUID functions (pre-2.5.2) was:

a) gen_uuid() returned the first three fields (a ULONG and two USHORTs) with 
their bytes in host order instead of network order, thus shifting the version 
digit two places to the right on little-endian systems and making the output 
non-compliant;

b) on POSIX, gen_uuid() didn't insert the version and variant bits at all, but 
just returned a 16-byte random sequence.
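
To make (a) and (b) concrete, here is a minimal sketch in Python. It is not 
Firebird code: the helper names are made up for illustration, and the only 
things taken from the RFC are the field layout and the version/variant bits.

```python
import os
import struct

def make_v4_fields(raw=None):
    """Split 16 bytes into UUID fields and stamp version 4 and the RFC 4122 variant.

    Hypothetical helper: skipping the two masking lines below is what
    problem (b) amounts to.
    """
    raw = os.urandom(16) if raw is None else raw
    time_low, time_mid, time_hi = struct.unpack(">IHH", raw[:8])
    rest = bytearray(raw[8:])
    time_hi = (time_hi & 0x0FFF) | 0x4000   # version 4 in the top nibble
    rest[0] = (rest[0] & 0x3F) | 0x80       # variant bits 10xx xxxx
    return time_low, time_mid, time_hi, bytes(rest)

def pack_network_order(fields):
    """All fields MSB-first, as RFC 4122 prescribes."""
    time_low, time_mid, time_hi, rest = fields
    return struct.pack(">IHH", time_low, time_mid, time_hi) + rest

def pack_little_endian_host_order(fields):
    """First three fields LSB-first: problem (a) on a little-endian host."""
    time_low, time_mid, time_hi, rest = fields
    return struct.pack("<IHH", time_low, time_mid, time_hi) + rest

def to_text(octets):
    """Faithful byte-by-byte rendering as a 36-char hyphenated string."""
    h = octets.hex()
    return "-".join((h[0:8], h[8:12], h[12:16], h[16:20], h[20:32]))

fixed = bytes.fromhex("0011223344550677" "0899aabbccddeeff")
fields = make_v4_fields(fixed)
print(to_text(pack_network_order(fields)))             # 00112233-4455-4677-8899-aabbccddeeff
print(to_text(pack_little_endian_host_order(fields)))  # 33221100-5544-7746-8899-aabbccddeeff
```

In the second line of output the version digit 4 sits two places to the right 
of where it belongs (7746 instead of 4677), which is exactly the symptom 
described under (a).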

Please note that there was and is nothing wrong with the two conversion 
functions uuid_to_char() and char_to_uuid(): they just faithfully convert the 
input in the order given. Garbage in, garbage out.

One thing does need changing, though: the standard stipulates that the 36-char 
hyphenated string should be output with lowercase characters, while on input 
both lower and upper case are allowed. So uuid_to_char() should produce 
lowercase a..f instead of A..F as it does now, and char_to_uuid() should accept 
both (as it already does).
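
As a rough illustration of that behaviour (lowercase on output, either case on 
input, bytes converted strictly in the order given), a Python sketch could look 
like this. The two functions only borrow the names of the Firebird built-ins; 
this is not their actual implementation.

```python
import re

# 8-4-4-4-12 hex digits, either case accepted on input
_UUID_TEXT = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)

def uuid_to_char(octets: bytes) -> str:
    """Render 16 bytes, in the order given, as a lowercase hyphenated string."""
    if len(octets) != 16:
        raise ValueError("expected exactly 16 bytes")
    h = octets.hex()  # bytes.hex() always yields lowercase digits
    return f"{h[0:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:32]}"

def char_to_uuid(text: str) -> bytes:
    """Parse a 36-char hyphenated string, upper or lower case, into 16 bytes."""
    if not _UUID_TEXT.fullmatch(text):
        raise ValueError("not a 36-character hyphenated UUID string")
    return bytes.fromhex(text.replace("-", ""))

octets = char_to_uuid("00112233-4455-4677-8899-AABBCCDDEEFF")
print(uuid_to_char(octets))  # 00112233-4455-4677-8899-aabbccddeeff
```

Neither function reorders anything, so whether the result is compliant depends 
entirely on what gen_uuid() put into the 16 bytes.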

Back to gen_uuid(): problem (b) has been fixed with the commit of 21 Dec 2011, 
but problem (a) still exists.

To make matters more complicated, two new conversion functions - 
uuid_to_char2() and char_to_uuid2() - have been introduced, which convert 
host-order CHAR(16) OCTETS uuids to network-order CHAR(36) strings and vice 
versa. Use of the existing uuid_to_char() and char_to_uuid() is discouraged. 
Side effect: if you restore a database containing CHAR(16) uuids generated on a 
different-endian system, uuid_to_char2() will produce a non-compliant string.
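
The side effect can be simulated with Python's standard uuid module, whose 
bytes_le form has the first three fields in little-endian order. The sketch 
assumes that uuid_to_char2() on a little-endian host treats the stored bytes as 
host order; that reading is based on the description above, not on the actual 
source.

```python
import uuid

# A compliant version-4 UUID generated on a big-endian host. Its host-order
# CHAR(16) image is identical to the network-order image.
original = uuid.UUID("00112233-4455-4677-8899-aabbccddeeff")
stored = original.bytes

# After a restore on a little-endian host, a host-order conversion (assumed
# behaviour of uuid_to_char2) reads the first three fields LSB-first.
misread = uuid.UUID(bytes_le=stored)

# A plain byte-by-byte conversion does not care where the data came from.
faithful = uuid.UUID(bytes=stored)

print(misread)   # 33221100-5544-7746-8899-aabbccddeeff
print(faithful)  # 00112233-4455-4677-8899-aabbccddeeff
```

The same 16 stored bytes yield two different strings depending on which host 
does the conversion, and the misread one no longer carries the version digit in 
the right place.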

To comply with the standard as well as keep things simple and transportable, I 
_strongly_ suggest the following:

- fix gen_uuid() so that it returns all the fields MSB-first as per RFC 4122;
- keep uuid_to_char() and char_to_uuid() largely as they were, but have 
uuid_to_char() output lowercase a..f;
- drop uuid_to_char2() and char_to_uuid2() before they see the light of day.

BTW, I would have posted much earlier but I usually don't have time to follow 
firebird-devel. I got alerted to this issue because of a DOC tracker item 
assigned to me.


Cheers,
Paul Vinkenoog


PS:
Of course I'm aware that in, say, a Windows UUID struct on little-endian 
systems, words and dwords are stored LSB first. But that's a _storage_ issue; a 
CHAR(16) UUID is a string of bytes, not a memory image of a C struct, a Pascal 
record, or whatever implementation-dependent structure on a certain system. It 
should therefore follow the standard which explicitly prescribes network byte 
order - and save us and our users a lot of headaches in the process ;-)

