Re: [sqlite] Article: UUID or GUID as Primary Keys? Be Careful!

2017-06-10 Thread Richard Damon

On 6/10/17 6:31 PM, R Smith wrote:



On 2017/06/10 8:24 PM, Richard Damon wrote:

On 6/10/17 8:13 AM, R Smith wrote:


The typical usage, storing full text UUID minus braces in an 
ASCII/UTF-8 sequence will result in a hair over 2.3 times[1] the 
storage of INTs. Not really that bad I think. 


If the field was declared as char[38], and UNICODE collation, many 
systems will allocate 4 bytes per character to allow for any possible 
character (of course, since we are only storing ASCII characters, we 
need to declare that so they take only 1 byte per character). 
Comparing that to 128 bits (16 bytes) is the 9x factor.


I've known systems to use Double-Byte character space in the past, but 
most RDBMSes repented and mended their evil ways by moving to a 
Multi-byte encoding variable length solution - but allocating 4-bytes 
blindly to anything Unicode?, well, let's just say in that case you 
don't need to consider whether to use UUIDs or not, you need to 
consider changing Database engines asap. (or perhaps at least the 
storage engine or type).


I mean 99.99% of the World's language and communication requirement 
Unicode characters sits in the BMP (Basic Multilingual Plane) which is 
in its entirety below the need for a 3rd byte. The higher planes that 
are currently allocated contains mostly non-linguistic definitions, 
like this smiley poo -    SELECT Char(0x1F4A9); - though one can 
argue it has become a staple of the chat-client vernacular.
The issue is defining a field as CHAR not VARCHAR. For many systems, if 
the record is fixed size, there is an efficiency gain of CHAR over 
VARCHAR, so some older guidelines suggest using fixed width fields when 
possible. If the field is just ASCII, then it works well, but when 
UNICODE fixed width fields can become less useful, as the storage engine 
needs to be prepared for the totally worse case situation. Thus the 
situation I described is likely a design error, but is one I have seen. 
(It comes up more often in Indexes, which are usually by definition 
fixed width fields)


--
Richard Damon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Article: UUID or GUID as Primary Keys? Be Careful!

2017-06-10 Thread R Smith



On 2017/06/10 8:24 PM, Richard Damon wrote:

On 6/10/17 8:13 AM, R Smith wrote:


The typical usage, storing full text UUID minus braces in an 
ASCII/UTF-8 sequence will result in a hair over 2.3 times[1] the 
storage of INTs. Not really that bad I think. 


If the field was declared as char[38], and UNICODE collation, many 
systems will allocate 4 bytes per character to allow for any possible 
character (of course, since we are only storing ASCII characters, we 
need to declare that so they take only 1 byte per character). 
Comparing that to 128 bits (16 bytes) is the 9x factor.


I've known systems to use Double-Byte character space in the past, but 
most RDBMSes repented and mended their evil ways by moving to a 
Multi-byte encoding variable length solution - but allocating 4-bytes 
blindly to anything Unicode?, well, let's just say in that case you 
don't need to consider whether to use UUIDs or not, you need to consider 
changing Database engines asap. (or perhaps at least the storage engine 
or type).


I mean 99.99% of the World's language and communication requirement 
Unicode characters sits in the BMP (Basic Multilingual Plane) which is 
in its entirety below the need for a 3rd byte. The higher planes that 
are currently allocated contains mostly non-linguistic definitions, like 
this smiley poo -    SELECT Char(0x1F4A9); - though one can argue it 
has become a staple of the chat-client vernacular.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Article: UUID or GUID as Primary Keys? Be Careful!

2017-06-10 Thread Daniel Anderson
I drink to that!

2017-06-10 14:26 GMT-04:00 Jens Alfke :

>
> > On Jun 10, 2017, at 6:58 AM, Daniel Anderson  wrote:
> >
> > the article was integers (4 bytes) vs guid as a primary key, which as
> > string takes 36 bytes. so he was right in saying that string guid/uuid
> take
> > 9 time the space of simple integers.
>
> Ah, that explains where he got 9x. I still think it’s an unfair
> comparison, given the inefficient encoding of the UUID (it should be a
> 16-byte blob.) And in any design problem that’s at sufficient scale that
> you’re even considering UUIDs, a 32-bit primary key isn’t going to be
> sufficient, so it should be 64-bit. Now we’re down to a 2x difference.
>
> > in general the article was quite good and allowed people to ponder about
> > their choices/beliefs.
>
> I would have been less critical, had the author identified that the
> subject domain is traditional RDBMSs. Instead the implication is that this
> is relevant to any database, which it isn’t.
>
> Anyway, arguing about databases is boring. It’s the weekend — I’m off to
> play guitar and go to parties ;-)
>
> —Jens
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
Daniel
*L'action accède à la perfection quand, bien que vivant, vous êtes déjà
mort*
*Bunan*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Article: UUID or GUID as Primary Keys? Be Careful!

2017-06-10 Thread Jens Alfke

> On Jun 10, 2017, at 11:24 AM, Richard Damon  wrote:
> 
> If the field was declared as char[38], and UNICODE collation, many systems 
> will allocate 4 bytes per character to allow for any possible character

Maybe, but it’s a bad design! It’s a huge amount of bloat for most text that 
would be stored in a database (even most Asian characters will fit in 16 bits.) 
And worse, it doesn’t actually make working with text easier, because you can 
_not_ treat every 32-bit code point as a character. There are arcane Unicode 
rules for combining code points — an accented letter may be represented as the 
base letter followed by an accent mark, and some ideographs (including many 
emoji!) are composed of multiple combined ideographs. So even something simple 
like “how many characters are in this string” requires scanning the string, not 
just a simple array lookup. 

In the end, UTF-8 almost always becomes the best encoding, since everything has 
to be treated as variable-width anyway.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Article: UUID or GUID as Primary Keys? Be Careful!

2017-06-10 Thread Jens Alfke

> On Jun 10, 2017, at 6:58 AM, Daniel Anderson  wrote:
> 
> the article was integers (4 bytes) vs guid as a primary key, which as
> string takes 36 bytes. so he was right in saying that string guid/uuid take
> 9 time the space of simple integers.

Ah, that explains where he got 9x. I still think it’s an unfair comparison, 
given the inefficient encoding of the UUID (it should be a 16-byte blob.) And 
in any design problem that’s at sufficient scale that you’re even considering 
UUIDs, a 32-bit primary key isn’t going to be sufficient, so it should be 
64-bit. Now we’re down to a 2x difference.

> in general the article was quite good and allowed people to ponder about
> their choices/beliefs.

I would have been less critical, had the author identified that the subject 
domain is traditional RDBMSs. Instead the implication is that this is relevant 
to any database, which it isn’t.

Anyway, arguing about databases is boring. It’s the weekend — I’m off to play 
guitar and go to parties ;-)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Article: UUID or GUID as Primary Keys? Be Careful!

2017-06-10 Thread Richard Damon

On 6/10/17 8:13 AM, R Smith wrote:


In that worst case scenario (all of the UUID plus dashes and braces), 
the full storage requirement for a UUID would look like this: 
{--Axxx-Bxxx-} which totals 38 characters of 
ASCII (or UTF-8) text space which totals 38 bytes.
Let's be generous and assume the user made VARCHAR(40) provision on an 
old-style DB which reserves all the bytes, or better yet, a modern one 
with a length definition that takes a further 32-bit value, so 42 
bytes then. Even in this very worst case scenario, the full space 
requirement for a UUID is a dismal ~2.7 times more than the 16 bytes 
of space the original 128-bit value consumed. Let's further assume the 
worst text storage system using DBCS to store 16 bits per character 
(and nobody really does this), even then we only get to just over 5 
times. Where did he get 9 times from??
The typical usage, storing full text UUID minus braces in an 
ASCII/UTF-8 sequence will result in a hair over 2.3 times[1] the 
storage of INTs. Not really that bad I think. 
If the field was declared as char[38], and UNICODE collation, many 
systems will allocate 4 bytes per character to allow for any possible 
character (of course, since we are only storing ASCII characters, we 
need to declare that so they take only 1 byte per character). Comparing 
that to 128 bits (16 bytes) is the 9x factor.


Richard Damon
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Article: UUID or GUID as Primary Keys? Be Careful!

2017-06-10 Thread Simon Slavin
There’s no reason for a human to want to sort by GUID.  They’re arbitrary 
symbols and don’t have an order.  Anyone who says "Show me our spare parts in 
GUID order." needs to rethink why they’re doing what they’re doing.

However, SQLite does searching by sorting.  Binary chop is still the best way 
we know to find one value of many (though hashes look interesting), and for 
that you need your values in sorted order. So yes, you’re going to sort by your 
key values.

The other thing is that your primary key should be values which don’t matter to 
you.  Because if you care about a value sooner or later you’re going to think 
of a reason to change it.  So if you understand UUID, appreciate their 
structure, print them out, or use parts of them to indicate factories or 
data-sources, they’re significant values and you need something else to use as 
a primary key.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Article: UUID or GUID as Primary Keys? Be Careful!

2017-06-10 Thread J Decker
On Sat, Jun 10, 2017 at 5:13 AM, R Smith  wrote:

>
> On 2017/06/10 6:27 AM, Jens Alfke wrote:
>
>> On Jun 9, 2017, at 3:05 PM, Simon Slavin  wrote:
>>>
>>> Tangential to SQLite, but there’s little on the list at the moment so
>>> perhaps some of you might like this.
>>> >> careful-7b2aa3dcb439 >> d-or-guid-as-primary-keys-be-careful-7b2aa3dcb439>>
>>>
>> He makes some questionable points, like saying that an ASCII string of
>> hex has a “9x cost in size” compared to a binary representation, or that
>> hex strings would somehow get larger when converted from ISO-8859-1 to
>> UTF-8.
>>
>
> Just in case people wonder about your assertion that his assertion is
> wrong - to be specific: UTF8 consumes the exact same space as ASCII when
> you use only characters from the first block (0..7F) - which is exactly
> what a UUID uses, at a maximum characters 0..9, A..F or a..f, -, {, and }.
> Nothing outside of the first Unicode block - though it should be noted that
> some systems may use Double-byte (16 bit) character representations
> internally whenever the DB table text *storage* type is set to UTF-anything
> (which should be noted is not the same thing as the *DB-Interface* type
> being UTF-anything).
>
> So the blogger's point doesn't hold on that assertion.
>
> Further, a UUID/GUID as per the standard (RFC4122) consists of 128 bit
> value formatted to present like this 36-character sequence:
> --Axxx-Bxxx-
>
> where A is a variant and B is the version of the UUID represented.
> Variants define different methods of calculation, like whether the MAC
> address with a time component was used, or a Domain/Namespace based UUID
> etc. In DB systems we usually use variant 1 (MAC+Time with 100 nanosecond
> precision) which, unless mechanical failure or intentional deceit, must be
> unique (i.e. probability for global collision = 0 if created exactly as
> described and all systems work as designed, and some cosmic ray doesn't hit
> your processor just right [or is it just wrong?]).
>
> Anyway, about the layout, you can of course simply store the UUID as a 128
> bit value (or 2 64-bit INTs - considering you use the exact same variant
> and version for all your IDs, but this takes processing and you end up with
> a value that needs to be re-computed before it can be compared to anything
> outside of your system), or at a minimum remove any dashes and braces, but
> in reality most people will just plop it as-is into a Text/Varchar field
> that's been Uniqued and probably PK'd.
>
> In that worst case scenario (all of the UUID plus dashes and braces), the
> full storage requirement for a UUID would look like this:
> {--Axxx-Bxxx-} which totals 38 characters of
> ASCII (or UTF-8) text space which totals 38 bytes.
> Let's be generous and assume the user made VARCHAR(40) provision on an
> old-style DB which reserves all the bytes, or better yet, a modern one with
> a length definition that takes a further 32-bit value, so 42 bytes then.
> Even in this very worst case scenario, the full space requirement for a
> UUID is a dismal ~2.7 times more than the 16 bytes of space the original
> 128-bit value consumed. Let's further assume the worst text storage system
> using DBCS to store 16 bits per character (and nobody really does this),
> even then we only get to just over 5 times. Where did he get 9 times from??
> The typical usage, storing full text UUID minus braces in an ASCII/UTF-8
> sequence will result in a hair over 2.3 times[1] the storage of INTs. Not
> really that bad I think.
>
> I find it fascinating that the number 1 reason to not use UUIDs, and
> probably the only reason, he never even mentioned. Sheer speed. (He refers
> sorting speed, but the real gain is look-up speed, which gets compounded in
> a compound query). In MSSQL I measured almost double the lookup speed using
> INTs in a PK in stead of VARCHARs (I didn't even use UUIDs, simply
> 6-character client codes of the form ABC001 etc.).
>
>
but then you have to compare a lot of characters before you get a mismatch;
a uuid will fail comparisons sooner since the leading characters will be
different sooner than your example data would.  Definitely comparing apples
and oranges there.


> Where I DO agree with the blogger: Where space is not a big concern, use
> both UUIDs and INTs locally in your DB, that way it is always scalable,
> always merge-able with other global data and always fast with the right
> query.
>
> Cheers,
> Ryan
>
> [1] - It's hard to say exactly, most DBs use extra bits/bytes for field
> specifications, lengths etc, even for the INT fields, so making an exact
> blanket assertion here about ratio of char vs. int storage is not possible,
> but the given ratio should be close.
>
>
> ___
> sqlite-users mailing list
> 

Re: [sqlite] Article: UUID or GUID as Primary Keys? Be Careful!

2017-06-10 Thread Daniel Anderson
the article was integers (4 bytes) vs guid as a primary key, which as
string takes 36 bytes. so he was right in saying that string guid/uuid take
9 time the space of simple integers.

in general the article was quite good and allowed people to ponder about
their choices/beliefs.

he said be: *careful*, which is a good rule of thumb, as design decisions
are often cast in stone and cannot be change because to many external
programs depends on them.


regards


2017-06-10 8:13 GMT-04:00 R Smith :

>
> On 2017/06/10 6:27 AM, Jens Alfke wrote:
>
>> On Jun 9, 2017, at 3:05 PM, Simon Slavin  wrote:
>>>
>>> Tangential to SQLite, but there’s little on the list at the moment so
>>> perhaps some of you might like this.
>>> >> careful-7b2aa3dcb439 >> d-or-guid-as-primary-keys-be-careful-7b2aa3dcb439>>
>>>
>> He makes some questionable points, like saying that an ASCII string of
>> hex has a “9x cost in size” compared to a binary representation, or that
>> hex strings would somehow get larger when converted from ISO-8859-1 to
>> UTF-8.
>>
>
> Just in case people wonder about your assertion that his assertion is
> wrong - to be specific: UTF8 consumes the exact same space as ASCII when
> you use only characters from the first block (0..7F) - which is exactly
> what a UUID uses, at a maximum characters 0..9, A..F or a..f, -, {, and }.
> Nothing outside of the first Unicode block - though it should be noted that
> some systems may use Double-byte (16 bit) character representations
> internally whenever the DB table text *storage* type is set to UTF-anything
> (which should be noted is not the same thing as the *DB-Interface* type
> being UTF-anything).
>
> So the blogger's point doesn't hold on that assertion.
>
> Further, a UUID/GUID as per the standard (RFC4122) consists of 128 bit
> value formatted to present like this 36-character sequence:
> --Axxx-Bxxx-
>
> where A is a variant and B is the version of the UUID represented.
> Variants define different methods of calculation, like whether the MAC
> address with a time component was used, or a Domain/Namespace based UUID
> etc. In DB systems we usually use variant 1 (MAC+Time with 100 nanosecond
> precision) which, unless mechanical failure or intentional deceit, must be
> unique (i.e. probability for global collision = 0 if created exactly as
> described and all systems work as designed, and some cosmic ray doesn't hit
> your processor just right [or is it just wrong?]).
>
> Anyway, about the layout, you can of course simply store the UUID as a 128
> bit value (or 2 64-bit INTs - considering you use the exact same variant
> and version for all your IDs, but this takes processing and you end up with
> a value that needs to be re-computed before it can be compared to anything
> outside of your system), or at a minimum remove any dashes and braces, but
> in reality most people will just plop it as-is into a Text/Varchar field
> that's been Uniqued and probably PK'd.
>
> In that worst case scenario (all of the UUID plus dashes and braces), the
> full storage requirement for a UUID would look like this:
> {--Axxx-Bxxx-} which totals 38 characters of
> ASCII (or UTF-8) text space which totals 38 bytes.
> Let's be generous and assume the user made VARCHAR(40) provision on an
> old-style DB which reserves all the bytes, or better yet, a modern one with
> a length definition that takes a further 32-bit value, so 42 bytes then.
> Even in this very worst case scenario, the full space requirement for a
> UUID is a dismal ~2.7 times more than the 16 bytes of space the original
> 128-bit value consumed. Let's further assume the worst text storage system
> using DBCS to store 16 bits per character (and nobody really does this),
> even then we only get to just over 5 times. Where did he get 9 times from??
> The typical usage, storing full text UUID minus braces in an ASCII/UTF-8
> sequence will result in a hair over 2.3 times[1] the storage of INTs. Not
> really that bad I think.
>
> I find it fascinating that the number 1 reason to not use UUIDs, and
> probably the only reason, he never even mentioned. Sheer speed. (He refers
> sorting speed, but the real gain is look-up speed, which gets compounded in
> a compound query). In MSSQL I measured almost double the lookup speed using
> INTs in a PK in stead of VARCHARs (I didn't even use UUIDs, simply
> 6-character client codes of the form ABC001 etc.).
>
> Where I DO agree with the blogger: Where space is not a big concern, use
> both UUIDs and INTs locally in your DB, that way it is always scalable,
> always merge-able with other global data and always fast with the right
> query.
>
> Cheers,
> Ryan
>
> [1] - It's hard to say exactly, most DBs use extra bits/bytes for field
> specifications, lengths etc, even for the INT fields, so making an exact
> 

Re: [sqlite] Article: UUID or GUID as Primary Keys? Be Careful!

2017-06-10 Thread R Smith


On 2017/06/10 6:27 AM, Jens Alfke wrote:

On Jun 9, 2017, at 3:05 PM, Simon Slavin  wrote:

Tangential to SQLite, but there’s little on the list at the moment so perhaps 
some of you might like this.
>

He makes some questionable points, like saying that an ASCII string of hex has 
a “9x cost in size” compared to a binary representation, or that hex strings 
would somehow get larger when converted from ISO-8859-1 to UTF-8.


Just in case people wonder about your assertion that his assertion is 
wrong - to be specific: UTF8 consumes the exact same space as ASCII when 
you use only characters from the first block (0..7F) - which is exactly 
what a UUID uses, at a maximum characters 0..9, A..F or a..f, -, {, and 
}. Nothing outside of the first Unicode block - though it should be 
noted that some systems may use Double-byte (16 bit) character 
representations internally whenever the DB table text *storage* type is 
set to UTF-anything (which should be noted is not the same thing as the 
*DB-Interface* type being UTF-anything).


So the blogger's point doesn't hold on that assertion.

Further, a UUID/GUID as per the standard (RFC4122) consists of 128 bit 
value formatted to present like this 36-character sequence:

--Axxx-Bxxx-

where A is a variant and B is the version of the UUID represented. 
Variants define different methods of calculation, like whether the MAC 
address with a time component was used, or a Domain/Namespace based UUID 
etc. In DB systems we usually use variant 1 (MAC+Time with 100 
nanosecond precision) which, unless mechanical failure or intentional 
deceit, must be unique (i.e. probability for global collision = 0 if 
created exactly as described and all systems work as designed, and some 
cosmic ray doesn't hit your processor just right [or is it just wrong?]).


Anyway, about the layout, you can of course simply store the UUID as a 
128 bit value (or 2 64-bit INTs - considering you use the exact same 
variant and version for all your IDs, but this takes processing and you 
end up with a value that needs to be re-computed before it can be 
compared to anything outside of your system), or at a minimum remove any 
dashes and braces, but in reality most people will just plop it as-is 
into a Text/Varchar field that's been Uniqued and probably PK'd.


In that worst case scenario (all of the UUID plus dashes and braces), 
the full storage requirement for a UUID would look like this: 
{--Axxx-Bxxx-} which totals 38 characters of 
ASCII (or UTF-8) text space which totals 38 bytes.
Let's be generous and assume the user made VARCHAR(40) provision on an 
old-style DB which reserves all the bytes, or better yet, a modern one 
with a length definition that takes a further 32-bit value, so 42 bytes 
then. Even in this very worst case scenario, the full space requirement 
for a UUID is a dismal ~2.7 times more than the 16 bytes of space the 
original 128-bit value consumed. Let's further assume the worst text 
storage system using DBCS to store 16 bits per character (and nobody 
really does this), even then we only get to just over 5 times. Where did 
he get 9 times from??
The typical usage, storing full text UUID minus braces in an ASCII/UTF-8 
sequence will result in a hair over 2.3 times[1] the storage of INTs. 
Not really that bad I think.


I find it fascinating that the number 1 reason to not use UUIDs, and 
probably the only reason, he never even mentioned. Sheer speed. (He 
refers sorting speed, but the real gain is look-up speed, which gets 
compounded in a compound query). In MSSQL I measured almost double the 
lookup speed using INTs in a PK in stead of VARCHARs (I didn't even use 
UUIDs, simply 6-character client codes of the form ABC001 etc.).


Where I DO agree with the blogger: Where space is not a big concern, use 
both UUIDs and INTs locally in your DB, that way it is always scalable, 
always merge-able with other global data and always fast with the right 
query.


Cheers,
Ryan

[1] - It's hard to say exactly, most DBs use extra bits/bytes for field 
specifications, lengths etc, even for the INT fields, so making an exact 
blanket assertion here about ratio of char vs. int storage is not 
possible, but the given ratio should be close.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Article: UUID or GUID as Primary Keys? Be Careful!

2017-06-10 Thread Darko Volaric
And between centrally assigned "consecutive integer keys" and elaborate, 
probabilistic UUIDs are centrally allocated number ranges of a natively 
supported integer, eg 64 bit on SQLite. Problem solved.


> On Jun 10, 2017, at 6:27 AM, Jens Alfke  wrote:
> 
> 
>> On Jun 9, 2017, at 3:05 PM, Simon Slavin  wrote:
>> 
>> Tangential to SQLite, but there’s little on the list at the moment so 
>> perhaps some of you might like this.
>> >  
>> >
> 
> He makes some questionable points, like saying that an ASCII string of hex 
> has a “9x cost in size” compared to a binary representation, or that hex 
> strings would somehow get larger when converted from ISO-8859-1 to UTF-8.
> 
>> Several of his points don’t apply to SQLite, which works differently from 
>> most SQL engines, but it’s interesting reading nevertheless.
> 
> Most of what he says is extremely RDBMS-centric, even though he never says 
> so. I have the feeling he’s never used a NoSQL database. To me the whole 
> thing comes off as pretty parochial — I have to laugh at his assertion that 
> UUIDs aren’t scalable, since in systems like Couchbase Server*, which handles 
> ridiculously huge data sets, it’s extremely common to use them as keys. 
> 
> From a highly-scalable perspective, having a single global counter in the 
> database to assign consecutive integer keys is a horrifying bottleneck!
> 
>> (I find it ironic that the URLs for his posts are composed of two elements: 
>> an assigned piece of content and an arbitrary long number coded as 
>> hexadecimal.)
> 
> Blame that on Medium, which is hosting his blog posts :)
> 
> —Jens
> 
> * Disclaimer: I work for Couchbase (but on mobile software, not on servers.)
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Article: UUID or GUID as Primary Keys? Be Careful!

2017-06-09 Thread Jens Alfke

> On Jun 9, 2017, at 3:05 PM, Simon Slavin  wrote:
> 
> Tangential to SQLite, but there’s little on the list at the moment so perhaps 
> some of you might like this.
>   
> >

He makes some questionable points, like saying that an ASCII string of hex has 
a “9x cost in size” compared to a binary representation, or that hex strings 
would somehow get larger when converted from ISO-8859-1 to UTF-8.

> Several of his points don’t apply to SQLite, which works differently from 
> most SQL engines, but it’s interesting reading nevertheless.

Most of what he says is extremely RDBMS-centric, even though he never says so. 
I have the feeling he’s never used a NoSQL database. To me the whole thing 
comes off as pretty parochial — I have to laugh at his assertion that UUIDs 
aren’t scalable, since in systems like Couchbase Server*, which handles 
ridiculously huge data sets, it’s extremely common to use them as keys. 

From a highly-scalable perspective, having a single global counter in the 
database to assign consecutive integer keys is a horrifying bottleneck!

> (I find it ironic that the URLs for his posts are composed of two elements: 
> an assigned piece of content and an arbitrary long number coded as 
> hexadecimal.)

Blame that on Medium, which is hosting his blog posts :)

—Jens

* Disclaimer: I work for Couchbase (but on mobile software, not on servers.)
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users