On 2017/06/10 8:24 PM, Richard Damon wrote:
On 6/10/17 8:13 AM, R Smith wrote:

The typical usage, storing full text UUID minus braces in an ASCII/UTF-8 sequence will result in a hair over 2.3 times[1] the storage of INTs. Not really that bad I think.

If the field was declared as char[38], and UNICODE collation, many systems will allocate 4 bytes per character to allow for any possible character (of course, since we are only storing ASCII characters, we need to declare that so they take only 1 byte per character). Comparing that to 128 bits (16 bytes) is the 9x factor.

I've known systems to use Double-Byte character space in the past, but most RDBMSes repented and mended their evil ways by moving to a Multi-byte encoding variable length solution - but allocating 4-bytes blindly to anything Unicode?, well, let's just say in that case you don't need to consider whether to use UUIDs or not, you need to consider changing Database engines asap. (or perhaps at least the storage engine or type).

I mean 99.99% of the World's language and communication requirement Unicode characters sits in the BMP (Basic Multilingual Plane) which is in its entirety below the need for a 3rd byte. The higher planes that are currently allocated contains mostly non-linguistic definitions, like this smiley poo - 💩 SELECT Char(0x1F4A9); - though one can argue it has become a staple of the chat-client vernacular.


_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to