[sqlite] Fwd: SAVEPOINT name

2015-05-21 Thread Baruch Burstein
On Thu, May 21, 2015 at 7:15 PM, Simon Slavin  wrote:
>
>
> Also, savepoint names can't be used as parameters when binding.


Does that mean that I can't prepare this: "SAVEPOINT :name"? What would be
the recommended method of preventing SQL injection for this?

-- 
?u?op-?p?sdn s? ?o??uo? ?no? 's??? p??? u?? no? ??


[sqlite] SAVEPOINT name

2015-05-21 Thread Simon Slavin

On 21 May 2015, at 8:51pm, Baruch Burstein  wrote:

> Does that mean that I can't prepare this: "SAVEPOINT :name"? What would be
> the recommended method of preventing SQL injection for this?

Is there really a need to use ad-hoc savepoint names ?  You can keep reusing 
the same name during successive savepoints, or you can use several different 
names, but would you want to make up names inside your program ?

Simon.


[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Darren Duncan
On 2015-05-21 9:14 PM, Stephen Chrzanowski wrote:
> {{I just got a bounced message.. Reposting}}

Both of your attempts got through.  You got a bounce because you sent it to an 
invalid list address in addition to a valid one; one bounced the other didn't.

> I've been watching this thread from the beginning with great interest, and
> I still don't see the difference between using a UUID or an auto-inc
> integer as a PK at the very raw, basic level. The database will only see
> them as a string of bits or bytes and handle accordingly.  IMO, using UUID
> is an extra overhead for humans to deal with, which is going to cause more
> grief than necessary.

Personally I'm a strong advocate of using natural keys only wherever possible, 
which also has an effect on how you design your database.  When followed 
judiciously, it can lead to very well designed databases.

While table-specific surrogate keys like auto-inc or uuids make sense in some 
situations, they are greatly over used, and most of the time natural keys can 
be 
used instead.  I've seen many people use surrogate keys when there were 
perfectly suitable natural keys available instead.

As to auto-inc vs uuids, the main difference I see is that the former gives you 
tighter coupling to the database or between rows than is otherwise necessary.

To explain, when you use auto-inc, you are depending on the database to tell 
you 
what your row identifiers are, which gets more complicated if you're entering a 
set of related records where you want to use the same identifiers in multiple 
tables to indicate related records, eg parent-child.

When you don't use auto-inc, you can know in advance in the application before 
talking to the database what the complete values of all your new rows are, you 
know what values you are using to relate records to each other, you don't have 
to insert one row to know how to associate other rows.

Also when you auto-inc, values tend to be serial, so the order you insert 
records affects their final values, where when you don't auto-inc, your order 
of 
insertion has no impact on their final values (in the absense of triggers).

In that respect uuids can be better than auto-inc because you lose those 
coupling problems.  On the other hand uuids themselves should be used very 
sparingly, and I haven't really seen a reason to use them yet.

-- Darren Duncan



[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Valentin Davydov
On Wed, May 20, 2015 at 11:52:08PM +, Peter Aronson wrote:
> Now you're just getting silly. ?What if the application sets all rowids,
> everywhere to 1? ?The fact is, the chance of collision on a UUID is pretty
> astronomically low as long as a decent source of entropy is used
> (see?http://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates).
> ?Yes, some application might not generate proper UUIDs, but that's true
> with any scheme that needs to coordinate disconnected data editing or
> generation on multiple machines.

Moreover, there are widespread examples of colliding UUIDs, say
EBD0A0A2-B9E5-4433-87C0-68B6B72699C7. This means that this idea 
have already proven to fail on it's intended usage.

> ?There are lots of applications out there that use UUIDs pretty successfully.

Much less than a number of applications which use integers ;-)

Val. Dav.



[sqlite] BestIndex problem

2015-05-21 Thread Emmanouil Karvounis
Greetings,

We are having an issue with BestIndex in our Virtual Table implementation.
Allow us to illustrate the problem with a simple example.

We create a virtual table named 'vt' that conceptually has 3 columns
"value1, value2, value3" and then we want to execute the following queries:

1) select value1 from vt where value1 = 7;

In this case, BestIndex passes the equal constraint on the first column and
by setting the output as pdxInfo->aConstraintUsage[0].argvIndex = 1,
we indicate that we accept this specific constraint. So, at Filter we get
the value 7 as argument0. Everything behaves normally so far.

However, in case we run any of the following equivalent queries (that
should pass to Filter more than one value), we get an error message
"xBestIndex returned an invalid plan":

2) select value1 from vt where value1 = 7 or value1 = 8;
3) select value1 from vt where value1 in (select * from tableA);//
suppose tableA contains an integer 'id' column and records (7, 8)
4) select value1 from vt, tableA where value1 = tableA.id;

Again, in each case we set pdxInfo->aConstraintUsage[0].argvIndex = 1 but
we get the above error message.

This behavior seems rather weird, so we'd like some expert help on what we
might be doing wrong.


Thank you for your time,
Manos Karvounis
Stamatis Christoforidis


[sqlite] Fwd: SAVEPOINT name

2015-05-21 Thread Baruch Burstein
Hi,

What are the restrictions (if any) on a sve point name? Alphanumeric? Same
as a string (enclosed in ' ' if necessary)? Something else?

Thank you,
Baruch

-- 
?u?op-?p?sdn s? ?o??uo? ?no? 's??? p??? u?? no? ??


[sqlite] in Linux system, 2 threads can access 2 separate databases (created by sqlite3) through sqlite3?

2015-05-21 Thread miles zhang
dear,
thanks for your reading.
I have 2 different databases ,  and I create 2 threads to access such 2 
databases with sqlite3 sql statement?
   can it be done successfully like this using sqlite3 lib?
   is it need to control access of such 2 thread for competition ? for example 
, locker.

   thx,
   hope your reply soon.

   miles zhang
   2015-05-21


[sqlite] Fwd: SAVEPOINT name

2015-05-21 Thread Simon Slavin

On 21 May 2015, at 3:56pm, Baruch Burstein  wrote:

> What are the restrictions (if any) on a sve point name? Alphanumeric? Same
> as a string (enclosed in ' ' if necessary)? Something else?

No official documentation, as far as I know.  The nearest is this:



It's an entity name, not a string, so the restrictions should be the same as 
those on table names and column names.  I think you need to double-quote them 
if you want spaces in them or use square brackets if you want to use 
punctuation in them.

Also, savepoint names can't be used as parameters when binding.

Simon.


[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Darren Duncan
I agree with Keith and disagree with Simon.

Regarding the current state for sequence generators that are specific to user 
databases, storing that state inside the same databases where they are used is 
exactly the right place.

Everything necessary to understand a database should be stored in that 
database, 
this is the correct way to do it, there is no wall being broken.

Sequence generators are NOT private, just as the ROWIDs of SQLite tables are 
NOT 
private either.  Users can and do directly access this information.

A sequence generator is nothing more than an abstraction over a variable in a 
database (a SQL "table" is also a variable in a database) combined with an 
atomic routine to return its current value plus optionally increment that 
variable.

A table with an auto-increment column is nothing more than an ordinary table 
with an on-insert triggered action to invoke said routine and use its value to 
modify the incoming row.

Some DBMSs don't store sequence generator state as their own thing, sometimes 
they just use the actual row value of the table being auto-incremented itself. 
Either way the principle is the same.

Some DBMSs perform the sequence state variable increment in an autonomous 
transaction that automatically commits before the main one, so that even if the 
transaction using the sequence generator rolls back, the sequence generator 
doesn't repeat the same values; others just keep this increment in the main 
transaction and so also can be rolled back and repeat.

That's all that standard SQL/etc sequence generators are, regular user data, 
and 
one shouldn't be fooled into thinking they are something else.

-- Darren Duncan

On 2015-05-20 4:21 PM, Keith Medcalf wrote:
> All relational database engines store configuration data within the users' 
> database.  Many of them just hide it behind varying layers of logically 
> imposed complication.  For example, you could simulate SQL Servers' 
> obfuscation by simply changing the name of the primary database alias from 
> "main" to "master" in the SQLite code, and then requiring that all "user" 
> tables are stored in an attached database.
>
>> -Original Message-
>> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
>> bounces at mailinglists.sqlite.org] On Behalf Of Simon Slavin
>> Sent: Wednesday, 20 May, 2015 14:38
>> To: sqlite-users at mailinglists.sqlite.org
>> Subject: Re: [sqlite] AUTOINC vs. UUIDs
>>
>> On 20 May 2015, at 8:52pm, Kees Nuyt  wrote:
>>
>>> The autoincrement clause causes an entry in the sqlite_sequence
>>> table.
>>
>> It's interesting that SQLite uses tables inside the user database for
>> private purposes like this.  A certain wall is broken when the designers
>> choose this option.  SQLite does it for sqlite_master, sqlite_sequence,
>> sqlite_stat*, and probably others I've forgotten.
>>
>> SQLite is handicapped by having no permanent data storage location.  It
>> has nowhere to store configuration information apart from inside the
>> user's databases or as compilation settings.  This is very unusual but, I
>> think, contributes a lot to how portable SQLite is: no need to understand
>> folder structure or safe places to keep configuration information;
>> increased startup-speed; reduced code size, fewer file handles, slightly
>> reduced memory.
>>
>> Simon.



[sqlite] ALTER TABLE ADD COLUMN

2015-05-21 Thread Kees Nuyt
On Thu, 21 May 2015 12:45:41 +0200, "Christoph P.U. Kukulies"
 wrote:

>Am 21.05.2015 um 10:00 schrieb Kees Nuyt:
>> On Thu, 21 May 2015 09:40:53 +0200, "Christoph P.U. Kukulies"
>>  wrote:
>>
>>> Am 21.05.2015 um 09:25 schrieb Hick Gunter:
 alter table [database] add column [real_length] numeric;
>>> Thanks, that did it right. Almost. Still I'm wondering why
>>> the output is
>>> not as "smooth" as the
>>> .fullschema listing before the ALTER ADD COLUMN:
>>> (comma at line start,
>> Alter table just adds ",your-row-spec" in front of ");".
>> If you want it to be more consistent, use comma at line start
>> for all column specs in the original schema.
>>
>>> lower case 'numeric')
>> That can be fixed by using
>> alter table [database] add column [real_length] NUMERIC;
>>
>> You can patch the layout of the CREATE TABLE statement in
>> sqlite_master by using
>> http://my.domain/sqlite.org/pragma.html#pragma_writable_schema

I forgot to edit that link, should be
http://www.sqlite.org/pragma.html#pragma_writable_schema

>> and replacing the contents of the sql column with your preferred
>> layout.
>> It is dangerous, so read the warning, make backups first and
>> test the result thoroughly!
>
>
> Thanks. So you mean to use the pragma at run time in sqlite3, like:
> sqlite> PRAGMA writable_schema = true;
> sqlite>

Yes, that.

>or at compile time and recompile it?

No, no compile time options required, the default sqlite3
command line will do.

> To me it would probably far easier, to rebuild the schema
> from ground up in sqlite3.exe since I'm still in a design phase.

That's much better.
For most of my projects I maintain the schema source in my
project source tree, together with a few scripts with test data.
The database is automaticaly created from that schema when it is
missing (e.g. in  make clean; make). 

-- 
Regards, 

Kees Nuyt



[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Darren Duncan
On 2015-05-21 12:16 PM, James K. Lowden wrote:
> On Wed, 20 May 2015 19:05:29 +0100
> Simon Slavin  wrote:
>
>> Posting this not because I agree with it but because the subject has
>> come up here a couple of times.
>>
>> >>
>>
>> "Today, I?ll talk about why we stopped using serial integers for our
>> primary keys, and why we?re now extensively using Universally Unique
>> IDs (or UUIDs) almost everywhere."
>
>   "A relational database is a graph where nodes are called
> entities and edges relations."
>
> It's hard to know where to start.  But that's definitely not it.

Indeed, that article had a rather poor idea what a relational database is.  For 
one thing, it didn't seem to grasp that each entire tuple/row in a 
relation/table is by definition its own identity.  If no subset of 
attributes/columns is a key, the entire tuple/row can be used to identify which 
tuple/row you're talking about. -- Darren Duncan



[sqlite] BestIndex problem

2015-05-21 Thread Hick Gunter
None of the queries named requires more than 1 parameter to xFilter. Why should 
they?

xBestIndex will be called once with no usable constraints and once with 
(value1, "=").

The difference is that xFilter will be called once for each constraint value, 
i.e. once for query 1 and twice for queries 2..4.

I guess you are setting argvIndex to 1 for a constraint that is not marked 
usable.

The exact calling sequence should be:

xBestIndex (1 constraint, not usable) => rows should be set to number of rows 
in vt
xBestIndex ( value1, "=" ) => rows should be set to 1 (if field value1 is a 
unique key for the table)
xFilter ( 7 ) => locate the row
xColumn ( ... ) => return the contents of the field
xNext  => set eof flag
xEof => return TRUE
xFilter ( 8 ) => locate the row
xColumn ( ... ) => return the contents of the field
xNext  => set eof flag
xEof => return TRUE


-Urspr?ngliche Nachricht-
Von: Emmanouil Karvounis [mailto:manosk at di.uoa.gr]
Gesendet: Donnerstag, 21. Mai 2015 17:21
An: sqlite-users at mailinglists.sqlite.org
Cc: Stamatios Christoforidis
Betreff: [sqlite] BestIndex problem

Greetings,

We are having an issue with BestIndex in our Virtual Table implementation.
Allow us to illustrate the problem with a simple example.

We create a virtual table named 'vt' that conceptually has 3 columns "value1, 
value2, value3" and then we want to execute the following queries:

1) select value1 from vt where value1 = 7;

In this case, BestIndex passes the equal constraint on the first column and by 
setting the output as pdxInfo->aConstraintUsage[0].argvIndex = 1, we indicate 
that we accept this specific constraint. So, at Filter we get the value 7 as 
argument0. Everything behaves normally so far.

However, in case we run any of the following equivalent queries (that should 
pass to Filter more than one value), we get an error message "xBestIndex 
returned an invalid plan":

2) select value1 from vt where value1 = 7 or value1 = 8;
3) select value1 from vt where value1 in (select * from tableA);//
suppose tableA contains an integer 'id' column and records (7, 8)
4) select value1 from vt, tableA where value1 = tableA.id;

Again, in each case we set pdxInfo->aConstraintUsage[0].argvIndex = 1 but we 
get the above error message.

This behavior seems rather weird, so we'd like some expert help on what we 
might be doing wrong.


Thank you for your time,
Manos Karvounis
Stamatis Christoforidis
___
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: hick at scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.




[sqlite] Fwd: SAVEPOINT name

2015-05-21 Thread Igor Tandetnik
On 5/21/2015 3:51 PM, Baruch Burstein wrote:
> Does that mean that I can't prepare this: "SAVEPOINT :name"?

Yes.

> What would be the recommended method of preventing SQL injection for this?

Do you plan to run SAVEPOINT "user-provided-string"? What for, if you 
don't mind me asking?
-- 
Igor Tandetnik



[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread James K. Lowden
On Thu, 21 May 2015 13:44:15 +0200
Mark Straver  wrote:

> > Oh and the other feature! of UUIDs is merging databases... if I
> > have a database deployed to 15 customers, each with their own ORGID
> > I can easily throw them in the same database without worrying about
> > fixing up primary keys and all records related to those keys.
> 
> You wouldn't with sequential PKs either, but instead of stuffing 2
> types of data into a single field (which I think is also a bad idea)
> you simply have an extra field with ORGID. The PK should always be
> unique to the target DB, unrelated to the source DBs, when merging.

Exactly.  One extra column makes the key unique across (in this
example) organizations.  Plus it adds information -- the organization
ID -- that would otherwise be lost in the hash.  

We have a whole Internet running on nonrandom addresses, a world wide
web running on client-determined nonrandom addresses.  Instead of
relying on hashes and patching up the inevitable conflicts, the
designers TCP/IP and HTTP wisely decided on human-readable,
deterministic identifiers.  Strange indeed it is to watch those who
build atop that infrastructure assert that hashes are necessary and
intergers insecure.  

--jkl


[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread James K. Lowden
On Wed, 20 May 2015 19:05:29 +0100
Simon Slavin  wrote:

> Posting this not because I agree with it but because the subject has
> come up here a couple of times.
> 
>  >
> 
> "Today, I?ll talk about why we stopped using serial integers for our
> primary keys, and why we?re now extensively using Universally Unique
> IDs (or UUIDs) almost everywhere."

"A relational database is a graph where nodes are called
entities and edges relations."

It's hard to know where to start.  But that's definitely not it.  

--jkl



> 
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Copy data between to databases unsing windows embedded compact 7, .net 3.5

2015-05-21 Thread afriendandmore
Hi,

I am developing an application for Windows Compact Embedded 7 using .net 
3.5 and sqlite.

We have two databases,

db1, db2.

A table shall b e copied from db1 to db2 using .net and c#.

INSERT INTO db1.table
SELECT *
   FROM db2.table

How can I execute this command using ado.net 3.5 (Ce.net 3.5), how to  
connections to two databases.

Is there an enviroment/console where I can run this (and other) command 
directly, to test them?

Thank you very much.


[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Igor Tandetnik
On 5/21/2015 1:37 PM, Valentin Davydov wrote:
> Moreover, there are widespread examples of colliding UUIDs, say
> EBD0A0A2-B9E5-4433-87C0-68B6B72699C7.

This particular issue ( 
http://lists.gnu.org/archive/html/bug-parted/2011-06/msg00026.html ) 
didn't happen because two independent UUID generation processes 
accidentally ended up generating the same UUID. It happened because the 
same UUID was deliberately reused - a decision that seemed good at the 
time, but later proved problematic. So it's quite misleading to cite 
this as the case of "colliding UUIDs".



[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Mark Straver
On 21/05/2015 09:33, J Decker wrote:
> chance of collision is slim... it's a truly large amount of bits...
> There's libuuid for linux, and builtin functions for windows to create them.

The chance of collision is only slim if you (as some people have already
indicated) have a proper entropy source to generate type 4s. In any other case
the chance of collision is pretty much guaranteed, and not just once, but for
each PK entry because of how PRNGs work.

The other main concern that seems to be the case here is merging data from
separate DBs with the same PK. Of course, if you are dealing with multiple
clients with individual input in local DBs, you would not just take the DBs
and slap them one on top of the other on the central storage location; that is
just being extremely naive.

I also wouldn't want to be the poor server having to check all UUIDs for
matching entries on every delta if those are used, which, because of them
being pseudo-random is very much a requirement.

> Oh and the other feature! of UUIDs is merging databases... if I have a
> database deployed to 15 customers, each with their own ORGID I can easily
> throw them in the same database without worrying about fixing up primary
> keys and all records related to those keys.

You wouldn't with sequential PKs either, but instead of stuffing 2 types of
data into a single field (which I think is also a bad idea) you simply have an
extra field with ORGID. The PK should always be unique to the target DB,
unrelated to the source DBs, when merging. Blurring those lines is asking for
trouble, IMHO.

So, I think the following can be said:
* UUIDs are not human-readable in case someone actually needs to remember a PK
* If you stuff an ORGID into the UUID, you're actually making it easier to
leak data; only a truly random UUID would make sense.
* UUIDs offer no key space advantage to a DB that is actually relevant to even
large DB systems (let alone normal SQLite use)
* UUIDs are not offering guarantees of uniqueness which a one-way
auto-increment would, requiring extensive checking "just in case"
(contributing to the next point)
* UUIDs depend on the client-side implementation of a PRNG and entropy source,
which may be flawed, produce repeatable results, etc.
* UUIDs take more resources to use (memory, CPU cycles for comp, hash, etc.)
which I think is right against what you'd want for SQLite

As a result, I don't think it's wise to use them, objectively speaking. It has
very little to do with having used them in the past myself or not (OT: that is
actually a psychological trap employed by many shady business models ;) ).
I do hope that if you use them, you never run into collisions and having to
sort out the pieces afterwards.

M.



[sqlite] Copy data between to databases unsing windows embedded compact 7, .net 3.5

2015-05-21 Thread Hick Gunter
ATTACH is the command you are looking for

-Urspr?ngliche Nachricht-
Von: afriendandmore [mailto:afriendandmore at ymail.com]
Gesendet: Donnerstag, 21. Mai 2015 15:15
An: sqlite-users at mailinglists.sqlite.org
Betreff: [sqlite] Copy data between to databases unsing windows embedded 
compact 7, .net 3.5

Hi,

I am developing an application for Windows Compact Embedded 7 using .net
3.5 and sqlite.

We have two databases,

db1, db2.

A table shall b e copied from db1 to db2 using .net and c#.

INSERT INTO db1.table
SELECT *
   FROM db2.table

How can I execute this command using ado.net 3.5 (Ce.net 3.5), how to 
connections to two databases.

Is there an enviroment/console where I can run this (and other) command 
directly, to test them?

Thank you very much.
___
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: hick at scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.




[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread e...@edlsystems.com
>
> As a result, I don't think it's wise to use them, objectively speaking. It has
> very little to do with having used them in the past myself or not (OT: that is
> actually a psychological trap employed by many shady business models ;) ).
> I do hope that if you use them, you never run into collisions and having to
> sort out the pieces afterwards.
>

I have done a fair amount of research here as it is connected to a 
project I am involved in.

The only real advantage of UUID is for distributed creation of records. 
The disadvantage of a sequence is that a single mechanism, that supplies 
the PK, needs to be accessed and locked whenever any record anywhere is 
created.  This is an issue for distributed databases (what I am involved 
in), but not an issue for an sqlite one, which almost by definition, is 
not distributed.

If you are using sqlite as a distributed database, the first question must 
be "why?", and the second reaction should be "you should be designing the 
application accordingly then", such as generating the UUID yourself rather 
than sqlite do it.

In short, it would be incorrect for sqlite to move over to UUIDs as the 
standard PK mechanism.

Eddy


[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Scott Robison
On Thu, May 21, 2015 at 11:37 AM, Valentin Davydov 
wrote:

> On Wed, May 20, 2015 at 11:52:08PM +, Peter Aronson wrote:
> > Now you're just getting silly.  What if the application sets all rowids,
> > everywhere to 1?  The fact is, the chance of collision on a UUID is
> pretty
> > astronomically low as long as a decent source of entropy is used
> > (see
> http://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates
> ).
> >  Yes, some application might not generate proper UUIDs, but that's true
> > with any scheme that needs to coordinate disconnected data editing or
> > generation on multiple machines.
>
> Moreover, there are widespread examples of colliding UUIDs, say
> EBD0A0A2-B9E5-4433-87C0-68B6B72699C7. This means that this idea
> have already proven to fail on it's intended usage.
>

MBR uses one byte unsigned integers to indicate partition type. If two
companies pick the same byte value to represent two different things, then
you have a collision and some software somewhere is going to do the wrong
thing if it doesn't know how to disambiguate using extra information. The
problem is not with the software, it is with two developers picking the
same byte value to mean different things. Since the "partition type"
namespace only has 2^8 possible values, and there are more than 256
partition formats defined, collisions are "very likely" (by which I mean
have already happened). This was unavoidable with a single byte, but given
that MBR dates back to a time that multiple bootable partitions on a single
hard drive were unheard of, it worked (for a while). Once people started
dual booting, partition type collisions became more painful.

GPT uses GUIDs. This gives 2^128 possible partition types. Collisions are
much less likely if one uses any of the RFC UUID generation procedures
(versions 1 through 5). If one deliberately chooses the exact same byte
sequence to mean something different, as happened in this case, that's not
a failure of UUID-style identifier creation. Any bad actor (whether acting
out of ignorance or maliciousness) can result in the exact same outcome
whether using 8 bit or 128 bit (or any other finite bit) identifiers.

>  There are lots of applications out there that use UUIDs pretty
> successfully.
>
> Much less than a number of applications which use integers ;-)


Integers have been around a lot longer, so of course there are more
successful applications. :)

I would not say UUIDs or their ilk should be used indiscriminately as was
recommended in the linked article that started this thread. They do have
their place though, particularly when the cost to coordinate the allocation
of relatively small integers is "too high". Maybe a device has no
connectivity to a central server while in the field (short of maybe IP over
Avian Carrier: https://tools.ietf.org/html/rfc1149). In that case one has
two choices. One, allocate an integer that you know is almost certainly
going to collide with someone else and sort it out later). Two, do what
SQLite does when the largest possible key is in use: start picking random
integers until you find an open slot or you time out. UUIDs (whether random
or deterministic in their generation technique) are option two (practically
if not exactly).

I would not use UUID-like identifiers in most cases. There are some cases
where they work better than integers. In fact, one need not limit oneself
to the official UUID generation techniques. One can do the same sort of
thing with 64 bit integers (as described above by SQLite), or 256 bit
blobs. Heck, one can do it with one bit integers, though the chance of
collision is rather high. :)

-- 
Scott Robison


[sqlite] ALTER TABLE ADD COLUMN

2015-05-21 Thread Christoph P.U. Kukulies
Am 21.05.2015 um 10:00 schrieb Kees Nuyt:
> On Thu, 21 May 2015 09:40:53 +0200, "Christoph P.U. Kukulies"
>  wrote:
>
>> Am 21.05.2015 um 09:25 schrieb Hick Gunter:
>>> alter table [database] add column [real_length] numeric;
>> Thanks, that did it right. Almost. Still I'm wondering why
>> the output is
>> not as "smooth" as the
>> .fullschema listing before the ALTER ADD COLUMN:
>> (comma at line start,
> Alter table just adds ",your-row-spec" in front of ");".
> If you want it to be more consistent, use comma at line start
> for all column specs in the original schema.
>
>> lower case 'numeric')
> That can be fixed by using
> alter table [database] add column [real_length] NUMERIC;
>
> You can patch the layout of the CREATE TABLE statement in
> sqlite_master by using
> http://knuyt.demon.nl/sqlite.org/pragma.html#pragma_writable_schema
> and replacing the contents of the sql column with your preferred
> layout.
> It is dangerous, so read the warning, make backups first and
> test the result thoroughly!


Thanks. So you mean to use the pragma at run time in sqlite3, like:
sqlite> PRAGMA writable_schema = true;
sqlite>

or at compile time and recompile it?

To me it would probably far easier, to rebuild the schema from ground up 
in sqlite3.exe
since I'm still in a design phase.
--
Christoph Kukulies
>



[sqlite] ALTER TABLE ADD COLUMN

2015-05-21 Thread Simon Slavin

On 21 May 2015, at 7:51am, Christoph P.U. Kukulies  wrote:

> Now I wonder why I don't read
> CREATE TABLE [database] (
> [database_name] TEXT  NULL,
> [table_name] TEXT  NULL,
> [data_type] TEXT  NULL,
> [data_size] INTEGER  NULL,
> [column_name] TEXT  NULL,
> [vendor] TEXT DEFAULT 'SQLBASE' NULL,
> [real_length] NUMERIC;

The standard of having a column name in square brackets does not belong to 
SQLite.  It is not something SQLite does, and it is not something SQLite 
insists on.  Whoever created the table orignally used square brackets and they 
worked, but that doesn't mean that everything else has to use square brackets 
from then on.

If you like things to use square brackets do it yourself, so do

alter table database add column [real_length] numeric;

Simon.


[sqlite] ALTER TABLE ADD COLUMN

2015-05-21 Thread Christoph P.U. Kukulies
Am 21.05.2015 um 09:08 schrieb Jean-Christophe Deschamps:
> At 08:51 21/05/2015, you wrote:
>
>> I used sqlite3.exe by invoking it on the command line with the name 
>> of a database.
>> Next I typed
>>
>>  alter table database add column real_length numeric;
>>
>> Next I typed .fullschema
>> and I'm getting:
>>
>> sqlite> .fullschema
>> CREATE TABLE [database] (
>> [database_name] TEXT  NULL,
>> [table_name] TEXT  NULL,
>> [data_type] TEXT  NULL,
>> [data_size] INTEGER  NULL,
>> [column_name] TEXT  NULL,
>> [vendor] TEXT DEFAULT 'SQLBASE' NULL
>> , real_length numeric);
>> /* No STAT tables available */
>>
>> Now I wonder why I don't read
>> CREATE TABLE [database] (
>> [database_name] TEXT  NULL,
>> [table_name] TEXT  NULL,
>> [data_type] TEXT  NULL,
>> [data_size] INTEGER  NULL,
>> [column_name] TEXT  NULL,
>> [vendor] TEXT DEFAULT 'SQLBASE' NULL,
>> [real_length] NUMERIC;
>   ^ missing )
Yes, this was my fault when hand editing (copy/paste) the message. 
Thanks for being exact.
--
Christoph Kukulies



[sqlite] ALTER TABLE ADD COLUMN

2015-05-21 Thread Christoph P.U. Kukulies
Am 21.05.2015 um 09:25 schrieb Hick Gunter:
> alter table [database] add column [real_length] numeric;
Thanks, that did it right. Almost. Still I'm wondering why the output is 
not as "smooth" as the
.fullschema listing before the ALTER ADD COLUMN: (comma at line start, 
lower case 'numeric')

sqlite> .fullschema
CREATE TABLE [database] (
[database_name] TEXT  NULL,
[table_name] TEXT  NULL,
[data_type] TEXT  NULL,
[data_size] INTEGER  NULL,
[column_name] TEXT  NULL,
[vendor] TEXT DEFAULT 'SQLBASE' NULL
);
/* No STAT tables available */
sqlite> alter table [database] add column [real_length] numeric;
sqlite> .fullschema
CREATE TABLE [database] (
[database_name] TEXT  NULL,
[table_name] TEXT  NULL,
[data_type] TEXT  NULL,
[data_size] INTEGER  NULL,
[column_name] TEXT  NULL,
[vendor] TEXT DEFAULT 'SQLBASE' NULL
, [real_length] numeric);
/* No STAT tables available */


--
Christoph Kukulies



[sqlite] ALTER TABLE ADD COLUMN

2015-05-21 Thread Jean-Christophe Deschamps
At 08:51 21/05/2015, you wrote:

>I used sqlite3.exe by invoking it on the command line with the name of 
>a database.
>Next I typed
>
>  alter table database add column real_length numeric;
>
>Next I typed .fullschema
>and I'm getting:
>
>sqlite> .fullschema
>CREATE TABLE [database] (
>[database_name] TEXT  NULL,
>[table_name] TEXT  NULL,
>[data_type] TEXT  NULL,
>[data_size] INTEGER  NULL,
>[column_name] TEXT  NULL,
>[vendor] TEXT DEFAULT 'SQLBASE' NULL
>, real_length numeric);
>/* No STAT tables available */
>
>Now I wonder why I don't read
>CREATE TABLE [database] (
>[database_name] TEXT  NULL,
>[table_name] TEXT  NULL,
>[data_type] TEXT  NULL,
>[data_size] INTEGER  NULL,
>[column_name] TEXT  NULL,
>[vendor] TEXT DEFAULT 'SQLBASE' NULL,
>[real_length] NUMERIC;
   ^ missing )

Probably because you used a third-party tool (like SQLite Expert) to 
create the table. Such tools often --and I know Expert does-- enclose 
schema names in a given pair of delimiters [...], "..." or `...` while 
the CLI simply issues the SQL command like you input it. 



[sqlite] ALTER TABLE ADD COLUMN

2015-05-21 Thread Christoph P.U. Kukulies
I used sqlite3.exe by invoking it on the command line with the name of a 
database.
Next I typed

  alter table database add column real_length numeric;

Next I typed .fullschema
and I'm getting:

sqlite> .fullschema
CREATE TABLE [database] (
[database_name] TEXT  NULL,
[table_name] TEXT  NULL,
[data_type] TEXT  NULL,
[data_size] INTEGER  NULL,
[column_name] TEXT  NULL,
[vendor] TEXT DEFAULT 'SQLBASE' NULL
, real_length numeric);
/* No STAT tables available */

Now I wonder why I don't read
CREATE TABLE [database] (
[database_name] TEXT  NULL,
[table_name] TEXT  NULL,
[data_type] TEXT  NULL,
[data_size] INTEGER  NULL,
[column_name] TEXT  NULL,
[vendor] TEXT DEFAULT 'SQLBASE' NULL,
[real_length] NUMERIC;

--
Christoph Kukulies


[sqlite] ALTER TABLE ADD COLUMN

2015-05-21 Thread Hick Gunter
Try

  alter table [database] add column [real_length] numeric;

-Urspr?ngliche Nachricht-
Von: Christoph P.U. Kukulies [mailto:kuku at kukulies.org]
Gesendet: Donnerstag, 21. Mai 2015 08:51
An: General Discussion of SQLite Database
Betreff: [sqlite] ALTER TABLE ADD COLUMN

I used sqlite3.exe by invoking it on the command line with the name of a 
database.
Next I typed

  alter table database add column real_length numeric;

Next I typed .fullschema
and I'm getting:

sqlite> .fullschema
CREATE TABLE [database] (
[database_name] TEXT  NULL,
[table_name] TEXT  NULL,
[data_type] TEXT  NULL,
[data_size] INTEGER  NULL,
[column_name] TEXT  NULL,
[vendor] TEXT DEFAULT 'SQLBASE' NULL
, real_length numeric);
/* No STAT tables available */

Now I wonder why I don't read
CREATE TABLE [database] (
[database_name] TEXT  NULL,
[table_name] TEXT  NULL,
[data_type] TEXT  NULL,
[data_size] INTEGER  NULL,
[column_name] TEXT  NULL,
[vendor] TEXT DEFAULT 'SQLBASE' NULL,
[real_length] NUMERIC;

--
Christoph Kukulies
___
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: hick at scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.




[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Jean-Christophe Deschamps
At 01:33 21/05/2015, you wrote:

>HUIDs have the birthday problem.  How many people do you need to have 
>in a group before two of them will have the same birthday?  HUIDs can 
>only be called LUID's (Locally Unique Identifiers) if you go to the 
>trouble of ensuring uniqueness *before* using one.  Validating Global 
>uniqueness is, quite obviously, a very difficult problem.
>
> > -Original Message-
> > From: sqlite-users-bounces at mailinglists.sqlite.org 
> [mailto:sqlite-users-
> > bounces at mailinglists.sqlite.org] On Behalf Of Jean-Christophe Deschamps
> > Sent: Wednesday, 20 May, 2015 15:38
> > To: General Discussion of SQLite Database
> > Subject: Re: [sqlite] AUTOINC vs. UUIDs
> >
> > At 23:24 20/05/2015, you wrote:
> >
> > >On Wed, May 20, 2015 at 2:20 PM, Steven M. McNeese
> > > wrote:
> > > > Often times people will use GUIDs as primary keys when different
> > > systems need to generate
> > > > rows and then merge together. Like an application that works with
> > > offline clients that push the
> > > > data to a server when the connect. However there are other ways of
> > > accomplishing the same thing.
> > >
> > >For curiosity - Is there a site/blog post somewhere
> > >enumerating/listing these other ways ?
> >
> > I don't know, but let's say your rowids range from -9223372036854775807
> > to 9223372036854775807, that's 18446744073709551614 possible rowids.
> >
> > Imagine that in 50 years, the total population on Earth will grow to
> > (say) 50 billion people (I hope it won't!).
> >
> > 18 446 744 073 709 551 614 / 50 000 000 000 = 368 934 881.474 191 
> 032 28
> >
> > That leaves you the possibility to assign a unique identifier to every
> > potential client on the planet (and a big one) in nearly 369 million
> > servers concurrently without the faintest risk of collision.
> >
> > At this rate, you may limit rowids to only positive integers ... and
> > hire a large army of telemarketers.
> >

I didn't advocate for HUIDs, at all!

My non-overlapping split range of rowids spread over N devices is 
provably immune to the birthday issue. This partitionning of rowids' 
range is equivalent (but simpler) to giving each device a globally 
unique prefix.

And within a few years it's quite possible that the typical int will be 
_int128, allowing this scheme to market any item over the entire galaxy. 



[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread J Decker
Sorry to be argumentative.  I'll drop it.

On Thu, May 21, 2015 at 4:44 AM, Mark Straver <
sqlite-users-list at palemoon.org> wrote:

> On 21/05/2015 09:33, J Decker wrote:
> > chance of collision is slim... it's a truly large amount of bits...
> > There's libuuid for linux, and builtin functions for windows to create
> them.
>
> The chance of collision is only slim if you (as some people have already
> indicated) have a proper entropy source to generate type 4s. In any other
> case
> the chance of collision is pretty much guaranteed, and not just once, but
> for
> each PK entry because of how PRNGs work.
>
> The other main concern that seems to be the case here is merging data from
> separate DBs with the same PK. Of course, if you are dealing with multiple
> clients with individual input in local DBs, you would not just take the DBs
> and slap them one on top of the other on the central storage location;
> that is
> just being extremely naive.
>
> I also wouldn't want to be the poor server having to check all UUIDs for
> matching entries on every delta if those are used, which, because of them
> being pseudo-random is very much a requirement.
>
> > Oh and the other feature! of UUIDs is merging databases... if I have a
> > database deployed to 15 customers, each with their own ORGID I can easily
> > throw them in the same database without worrying about fixing up primary
> > keys and all records related to those keys.
>
> You wouldn't with sequential PKs either, but instead of stuffing 2 types of
> data into a single field (which I think is also a bad idea) you simply
> have an
> extra field with ORGID. The PK should always be unique to the target DB,
> unrelated to the source DBs, when merging. Blurring those lines is asking
> for
> trouble, IMHO.
>
> having the ORGID alone doesn't help... you still have to fixup all related
records that reference the primary key... just inserting to get new PK's
doesn't help.  If the dataset is fairly complex (6 wide and 4 deep in
relations) that can be a very hairy problem.


> So, I think the following can be said:
> * UUIDs are not human-readable in case someone actually needs to remember
> a PK
>
(first reaction)Remembering PK's is a horrible practice, and best broken
early.
(edited)Remembering PKs is a habit one can get into when it's like 1-100,
like changing the item type or something... but having keys that are not
memorable just cause you to modify your habits, but don't hamper getting
jobs done.


> * If you stuff an ORGID into the UUID, you're actually making it easier to
> leak data; only a truly random UUID would make sense.

I wasn't saying the ORGID was part of the UUID it is itself unique, but
really irrelavent other than to indicate it would be possible to separate
the 15 back to original datasets by referncing a ORGID column.


>

* UUIDs offer no key space advantage to a DB that is actually relevant to
> even
> large DB systems (let alone normal SQLite use)
>
Other than ease of merge; ease of building local transactions that don't
require locks; ...


> * UUIDs are not offering guarantees of uniqueness which a one-way
> auto-increment would, requiring extensive checking "just in case"
> (contributing to the next point)
>
Other than they don't require updating all related records using the PK as
a FK.  Since appropriate entropy was given such that your million records a
second for the past 10 years still has a very tiny chance of collision.
 (and something generating that much data is probably a data point
collector like a monitor system for sensors on a rocket engine or maybe the
CERN collider, and should not have an issue with locking the database while
it logs its data).


> * UUIDs depend on the client-side implementation of a PRNG and entropy
> source,
> which may be flawed, produce repeatable results, etc.
>
and should be taken into consideration... something simple like 'rand()'
only has 32 bits of entropic chains for instance (given 0-0x best
case) you end up with only that many predictable chains... fortunatly
libuuid and windows libs reference more bits of entropy and are written to
fulfill the purpose.


> * UUIDs take more resources to use (memory, CPU cycles for comp, hash,
> etc.)
> which I think is right against what you'd want for SQLite
>
> This is true... but since numbers are just stored as strings anyway, the
failure of matching is probably the same number of characters (3-4 in the
lead) it will either fail early or match until its length.  It could be
more optimal if there was a binary representation of UUID available... but
given that the keys are random the indexes are generally well populated in
a balanced tree without getting too long on a side.  (although I assume
auto increment primary keys have 'tricks' to make them be a balanced sort
anyway)


> As a result, I don't think it's wise to use them, objectively speaking. It
> has
> very little to do with having used them in the past myself or not (OT:
> that is
> 

[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread R.Smith

On 2015-05-21 01:52 AM, Peter Aronson wrote:
> Now you're just getting silly.  What if the application sets all rowids, 
> everywhere to 1?  The fact is, the chance of collision on a UUID is pretty 
> astronomically low as long as a decent source of entropy is used (see 
> http://en.wikipedia.org

I think Keith's point (which I very much agree with) is that 
astronomically big is still not guaranteed - and ANY solution that 
relies on something not guaranteed is a bad solution. I'd much rather 
even ensure that similar ID's are used client-side, then KNOW that that 
is the case and implement a solution that understands this and deals 
with it (such as simply prepending a device-specific ID or some such) to 
ensure 100% secure uniqueness server-side - no need to rely on 
astronomically big randomnessessess.

> /wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates).  
> Yes, some application might not generate proper UUIDs, but that's true with 
> any scheme that needs to coordinate disconnected data editing or generation 
> on multiple machines.  There are lots of applications out there that use 
> UUIDs pretty successfully.  Are they a perfect solution?  Of course not, but 
> then, what is?  But for that particular problem domain they have proved 
> viable.
> Mind you, the original article linked was guilty of gross 
> over-generalization, and got the amount of randomess in a UUID wrong (it's 
> 122 bits, not 124).
> Peter
>
>



[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Simon Slavin

On 20 May 2015, at 11:18pm, Scott Doctor  wrote:

> I can see the issue. For instance, an invoice ID related to inventory, where 
> the invoice ID is the PK number. Two salesmen on their iPads take an order 
> where their local database assigns the same PK number. When merged you have 
> two invoices with the same ID. Seems a classic problem with non centralized 
> input. A second field, such as salesman ID, would need to squashed into that 
> invoices PK number to guarantee a uniqueness to the number, assuming every 
> salesman has a unique ID.

Yep.  That's the problem and that's one way to solve it.  The other is to give 
each device a short unique number, and have that number in the UUID the device 
can generate.  Given that UUIDs are normally expressed as Hex, the device code 
is normally three or four higits.

Simon.


[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Richard Hipp
On 5/20/15, R.Smith  wrote:
>
> ANY solution that
> relies on something not guaranteed is a bad solution.

Here is a disturbing truth:  There are no guarantees.

Those transistors that make up your CPU - they turn on and off when
current is applied to their gate *with a certain probability*.  The
trick is to make that probability very close to 1.0 so that they
overall system works reliably.  Sometimes the probability of success
is increased by running multiple subsystems in parallel and voting.
This further increases the likelihood of a correct answer, but it is
not a guarantee.

The polarity of a magnetic domain on your disk drive is read by a
passing read head with a certain probability, close to 1.0 but not
quite as close as you might imagine.  Sometimes bits are read
incorrectly.  Various error correcting codes increase the probability
of a correct read-out, though they do not achieve certainty.
Likewise, the charge on the floating gates in your SSD are read out
probabilistically.  Quite a few bits in the flash memory of your SSD
are devoted to correcting read errors, and increasing the probability
of a correct read.  But that probability never quite reaches 1.0.

The signals traveling through your network cable and bringing you this
message are read with a finite reliability.  Very sophisticated coding
techniques are used to bring the probability of detecting a corrupted
transmission to very near 1.0, so that bad packets can be discarded
and retried.  But those techniques are not perfect.  They are not
guaranteed.

Everything about your computer is based on probabilities.  Things are
engineered so that the probabilities are very close to 1.0, and we are
given the illusion that they are certainty.  But they are not.

Some people may say that such and such an algorithm is *guaranteed* to
give the right answer.  But that is true only under the assumption
that the algorithm runs on a fault-free computer.  And no such
computer exists, and so the assumption is false.

You can never make a perfect machine, or a perfect program, or a
perfect system.  The art of engineering is to get as close to
perfection as you can, so that nobody notices the imperfections.

The only guarantee you have is a negative: your system will eventually
fail.  Your job as a programmer, designer, or engineer, is to make the
expected time to failure as large as possible.

All that said, there exist scenarios where using a random number as an
ID comes closer to guaranteeing that the ID is unique than any other
technique you are likely to invent.

-- 
D. Richard Hipp
drh at sqlite.org


[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread J Decker
"Just to chime in here:  First off, how are you going to generate them? How
will you avoid collisions? Why should it be necessary to using that much
storage space for a primary key and what's the reason for not using
blazingly-fast integers?..."

chance of collision is slim... it's a truly large amount of bits...
There's libuuid for linux, and builtin functions for windows to create them.

I personally have come to like UUIDs... as I can build large related
transactions of data without storing them in the database and retrieving
what the auto-increment ID is to fixup the other related rows.  I can build
a complex transaction set without touching the database, and only on the
slim chance of a collision update keys.  (haven't yet ever hit that fixup
code except in test cases where I purposely chose bad keys)

Using the birthday paradox calculation(
http://en.wikipedia.org/wiki/Birthday_problem) , it takes "only after
generating 1 billion UUIDs every second for the next 100 years, the
probability of creating just one duplicate would be about 50%." (
http://en.wikipedia.org/wiki/Universally_unique_identifier) and that's not
even a real 50%...

Most tools support copy and paste if you really want to use a key and
select rows joined with a separate statement. Rarely do I ever have to
worry about having to 'remember a number' to do a select... I just do my
normal select on criteria with appropriate joins and forget the keys; or
select from appropriate views that already have the joins...

Some databases perform better if you use a 'sequential uuid' (MS SQL) the
others don't seem to care one way or another for performance... a squential
UUID uses the first long as like milliesecond timestamp that is always
forward-going, and retains the random parts for everything else (other than
the digit that's supposed to idenfity the source of the UUID creator... so
even changing that digit to not '4' puts out out of collision of most other
algorithms.

Once upon a time part of the standard was 'use the ethernet MAC address as
part of the number' but that was seen as an exploit that gave unscrupulous
people information, this was later patched and removed and only existed for
a few years near the turn of the century (a decade+ ago).

Oh and the other feature! of UUIDs is merging databases... if I have a
database deployed to 15 customers, each with their own ORGID I can easily
throw them in the same database without worrying about fixing up primary
keys and all records related to those keys.

Also given a badly designed schema, I can definitely tell what columns
relate to other columns, as the primary keys will only show up once, and
all other columns, even if badly named, can be matched that that's actually
the source key they reference.

-
I find that generally those who say 'yuck, don't do that' have never done
it.  And those that have, will never go back to simple integers again.


[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Peter Aronson
Never mind about the 122 vs 124 bit thing. ?I could have swore I read that... 


 On Wednesday, May 20, 2015 4:52 PM, Peter Aronson  
wrote:



 Now you're just getting silly. ?What if the application sets all rowids, 
everywhere to 1? ?The fact is, the chance of collision on a UUID is pretty 
astronomically low as long as a decent source of entropy is used 
(see?http://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates).
 ?Yes, some application might not generate proper UUIDs, but that's true with 
any scheme that needs to coordinate disconnected data editing or generation on 
multiple machines. ?There are lots of applications out there that use UUIDs 
pretty successfully. ?Are they a perfect solution? ?Of course not, but then, 
what is? ?But for that particular problem domain they have proved viable.
Mind you, the original article linked was guilty of gross over-generalization, 
and got the amount of randomess in a UUID wrong (it's 122 bits, not 124).
Peter? ?


? ? On Wednesday, May 20, 2015 4:38 PM, Keith Medcalf  
wrote:
? 



On the other hand, perhaps both tablets implement the same PRNG with the same 
seed.? You will then have the same HUID's generated on both and have the exact 
same problem.? Addressing the problem using Hope and Pray is not a very robust 
solution.? It would be much better to solve the problem using a deterministic 
solution than a prayer based one.? Even though God hears all prayers, the 
answer might be no ...

> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
> bounces at mailinglists.sqlite.org] On Behalf Of Steven M. McNeese
> Sent: Wednesday, 20 May, 2015 16:09
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] AUTOINC vs. UUIDs
> 
> You misunderstood. If an application running on a tablet for 2 users add
> rows to a local database with an auto increment key, each would get a key
> based on their database an let's say for grins they both start with a new
> clean database. User 1 gets key 1 and user 2 gets key 1. Now they push
> their data to a server each with key 1 but different data. That doesn't
> work. Both records needs to be added to the server database - a collection
> of all the data collected on the client tablets.
> 
> Sent from my iPhone
> 
> > On May 20, 2015, at 4:38 PM, Jean-Christophe Deschamps
>  wrote:
> >
> > At 23:24 20/05/2015, you wrote:
> >
> >> On Wed, May 20, 2015 at 2:20 PM, Steven M. McNeese
> >>  wrote:
> >> > Often times people will use GUIDs as primary keys when different
> systems need to generate
> >> > rows and then merge together. Like an application that works with
> offline clients that push the
> >> > data to a server when the connect. However there are other ways of
> accomplishing the same thing.
> >>
> >> For curiosity - Is there a site/blog post somewhere
> >> enumerating/listing these other ways ?
> >
> > I don't know, but let's say your rowids range from -9223372036854775807
> to 9223372036854775807, that's 18446744073709551614 possible rowids.
> >
> > Imagine that in 50 years, the total population on Earth will grow to
> (say) 50 billion people (I hope it won't!).
> >
> > 18 446 744 073 709 551 614 / 50 000 000 000 = 368 934 881.474 191 032 28
> >
> > That leaves you the possibility to assign a unique identifier to every
> potential client on the planet (and a big one) in nearly 369 million
> servers concurrently without the faintest risk of collision.
> >
> > At this rate, you may limit rowids to only positive integers ... and
> hire a large army of telemarketers.
> > ___
> > sqlite-users mailing list
> > sqlite-users at mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



? 
___
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users






[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Peter Aronson
Now you're just getting silly. ?What if the application sets all rowids, 
everywhere to 1? ?The fact is, the chance of collision on a UUID is pretty 
astronomically low as long as a decent source of entropy is used 
(see?http://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates).
 ?Yes, some application might not generate proper UUIDs, but that's true with 
any scheme that needs to coordinate disconnected data editing or generation on 
multiple machines. ?There are lots of applications out there that use UUIDs 
pretty successfully. ?Are they a perfect solution? ?Of course not, but then, 
what is? ?But for that particular problem domain they have proved viable.
Mind you, the original article linked was guilty of gross over-generalization, 
and got the amount of randomess in a UUID wrong (it's 122 bits, not 124).
Peter? ?


 On Wednesday, May 20, 2015 4:38 PM, Keith Medcalf  
wrote:




On the other hand, perhaps both tablets implement the same PRNG with the same 
seed.? You will then have the same HUID's generated on both and have the exact 
same problem.? Addressing the problem using Hope and Pray is not a very robust 
solution.? It would be much better to solve the problem using a deterministic 
solution than a prayer based one.? Even though God hears all prayers, the 
answer might be no ...

> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
> bounces at mailinglists.sqlite.org] On Behalf Of Steven M. McNeese
> Sent: Wednesday, 20 May, 2015 16:09
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] AUTOINC vs. UUIDs
> 
> You misunderstood. If an application running on a tablet for 2 users add
> rows to a local database with an auto increment key, each would get a key
> based on their database an let's say for grins they both start with a new
> clean database. User 1 gets key 1 and user 2 gets key 1. Now they push
> their data to a server each with key 1 but different data. That doesn't
> work. Both records needs to be added to the server database - a collection
> of all the data collected on the client tablets.
> 
> Sent from my iPhone
> 
> > On May 20, 2015, at 4:38 PM, Jean-Christophe Deschamps
>  wrote:
> >
> > At 23:24 20/05/2015, you wrote:
> >
> >> On Wed, May 20, 2015 at 2:20 PM, Steven M. McNeese
> >>  wrote:
> >> > Often times people will use GUIDs as primary keys when different
> systems need to generate
> >> > rows and then merge together. Like an application that works with
> offline clients that push the
> >> > data to a server when the connect. However there are other ways of
> accomplishing the same thing.
> >>
> >> For curiosity - Is there a site/blog post somewhere
> >> enumerating/listing these other ways ?
> >
> > I don't know, but let's say your rowids range from -9223372036854775807
> to 9223372036854775807, that's 18446744073709551614 possible rowids.
> >
> > Imagine that in 50 years, the total population on Earth will grow to
> (say) 50 billion people (I hope it won't!).
> >
> > 18 446 744 073 709 551 614 / 50 000 000 000 = 368 934 881.474 191 032 28
> >
> > That leaves you the possibility to assign a unique identifier to every
> potential client on the planet (and a big one) in nearly 369 million
> servers concurrently without the faintest risk of collision.
> >
> > At this rate, you may limit rowids to only positive integers ... and
> hire a large army of telemarketers.
> > ___
> > sqlite-users mailing list
> > sqlite-users at mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users






[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Scott Robison
On May 20, 2015 10:23 PM, "Keith Medcalf"  wrote:
>
>
> Fossil does not use UUID's.

No. But it uses hashes which can collide. Same principle. They are
convenient in the distributed environment.

Version 5 UUID algorithm is based on sha1 (truncated) and involves the
sorts of deterministic data parts you describe later.

> Artifact IDs used by fossil are the SHA-1 hash of the file contents, and
the checkin IDs are the SHA-1 hash of the check-in manifest contents.  They
are *NOT* random but rather, are 100% deterministic -- that is if you run
the sha-1 hash over the same input data you will ALWAYS get the same result.
>
> Whether or not the "definition" of a UUID includes a hash function or not
does not make it deterministic.  The application of a hash function cannot
increase the entropy of the underlying random data, nor can it make its
"Universal" designation more than a Hope and Prayer.  I will, however,
grant that in the case where the UUID is generated from a good hash
function applied against a combination of local unique identity and random
data (such as the FQDN of the machine, the current timestamp, and the
output of a good local whirlpool of entropy), *and* it is verified to be
locally unique, then it is *more likely* to be Universally unique than if
it is based directly (or by application of a hash function) against a
purely random source.
>
> Using a UUID as an prayerful means of generating unique identifiers is
ill-advised.  If you want a generated nonsensical key then apply a hash
function over the real record key and use that.  Triggers and Referential
Integrity constraints can ensure that the generated key is maintained in
sync with the changes to the key fields.  Then and only then will you be
able to merge or update data from multiple distributed databases into a
master.
>
> > -Original Message-
> > From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
> > bounces at mailinglists.sqlite.org] On Behalf Of Scott Robison
> > Sent: Wednesday, 20 May, 2015 21:42
> > To: General Discussion of SQLite Database
> > Subject: Re: [sqlite] AUTOINC vs. UUIDs
> >
> > On Wed, May 20, 2015 at 7:20 PM, R.Smith  wrote:
> >
> > >
> > > On 2015-05-21 01:52 AM, Peter Aronson wrote:
> > >
> > >> Now you're just getting silly.  What if the application sets all
> > rowids,
> > >> everywhere to 1?  The fact is, the chance of collision on a UUID is
> > pretty
> > >> astronomically low as long as a decent source of entropy is used (see
> > >> http://en.wikipedia.org
> > >>
> > >
> > > I think Keith's point (which I very much agree with) is that
> > > astronomically big is still not guaranteed - and ANY solution that
> > relies
> > > on something not guaranteed is a bad solution. I'd much rather even
> > ensure
> > > that similar ID's are used client-side, then KNOW that that is the
case
> > and
> > > implement a solution that understands this and deals with it (such as
> > > simply prepending a device-specific ID or some such) to ensure 100%
> > secure
> > > uniqueness server-side - no need to rely on astronomically big
> > > randomnessessess.
> > >
> >
> > Then I guess all the distributed version control systems that rely on
> > unique hash values (including fossil, the sqlite DVCS) are a bad
solution.
> > Note that two of the five defined standards for UUID are based on
hashes.
> > Okay, so modern hashes are longer than 128 bits, but that only reduces
the
> > probability of a collision, it does not eliminate it.
> >
> > "But that's not used as the primary key of a SQLite or other relational
> > table" you might say. Except it is the unique key for a conceptual
table.
> >
> > Certainly I do not agree with the originally linked article that integer
> > primary keys should almost always be avoided, and there was a lot of
> > exaggerating in the risks involved. Still, UUIDs (or other similarly
long
> > or longer hash or quality random number source based IDs) can be an
> > effective technique when used appropriately.
> >
> > --
> > Scott Robison
> > ___
> > sqlite-users mailing list
> > sqlite-users at mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] AUTOINC vs. UUIDs

2015-05-21 Thread Jean-Christophe Deschamps
At 22:50 20/05/2015, you wrote:

>Just to chime in here: I think using UUIDs for primary keys is an 
>absolutely
>terrible idea, instead. First off, how are you going to generate them? How
>will you avoid collisions? Why should it be necessary to using that much
>storage space for a primary key and what's the reason for not using
>blazingly-fast integers?...

Agreed. Also the article denounces the "information leakage" introduced 
by exposing rowids to applications, but keeps quiet about a significant 
drawback: replacing those integers typically shown in the leftmost 
column of a grid (which don't forcibly expose meaningful information) 
by UUIDs is going to eat a very significant screen space if used instead.

Finaly I can see many people easily remembering 80486 as an identifier 
of  but almost none able to do the same with their 
c0b656b1-7351-4dc2-84c8-62a2afb41e66 example UUID.

Granted the risk of accidental collision is minimal but still 
non-zero.  Yet their (careful) creation and, moreover, the repeated 
conversions from 128-bit blob to split hex ASCII for human 
"convenience" are a useless waste of cycles.

To those who say: "UUIDs don't have to be displayed to users", then why 
complain that INTEGER PRIMARY KEY leak information if they are not 
displayed either?

JcD