Re: [sqlite] Non-unique columns in unique index

2017-03-02 Thread Hick Gunter
Thank you, that is exactyy what I intended.

If sqlite would choose to use the covering index in cases where read speed 
(plus row decode) is likely to be better, then it would already "sort of" 
support the INCLUDE syntax requested by the OP with only minor changes.

Note that my investigation involved sqlite version 1.7.14.1; maybe the NGQP 
already handles this efficiently.

Also thanks to Keith for pointing out the AUTOMATIC label for ad-hoc generated 
indices.

BTW: My first name is Gunter

>-Ursprüngliche Nachricht-
>Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
>Auftrag von R Smith
>Gesendet: Donnerstag, 02. März 2017 16:10
>An: sqlite-users@mailinglists.sqlite.org
>Betreff: Re: [sqlite] Non-unique columns in unique index
>
>
>On 2017/03/02 4:44 PM, Keith Medcalf wrote:
>> On Thursday, 2 March, 2017 06:04, Hick Gunter <h...@scigames.at> wrote:
>
>I think what Hick tried to show was that if you have a covering Index on 
>fields F1 and F2 (Unique or not) and then have another index (Automatic or 
>not, but Unique) on just F1, and you then do a Query of the form:
>
>SELECT F1,F2 FROM T WHERE F1 = x AND F2 = y
>
>SQLite will use the covering Index (as expected), but if you drop one WHERE 
>term so as to end up with the form:
>
>SELECT F1,F2 FROM T WHERE F1 = x
>
>Then SQLite will use the other Index when the covering Index is really better 
>(for read-speed) because it contains all the fields referenced and we "know" 
>that F1 is Unique so the Covering Index must still be Unique >for F1. So an 
>Optimization would be that if we "know" F1 to be Unique, and all the fields 
>required (in the SELECT) are found in the covering Index, then using the 
>covering Index will be better.
>


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-02 Thread Deon Brewis
I live my life one "indexed by" at a time.

- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of R Smith
Sent: Thursday, March 2, 2017 7:10 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Non-unique columns in unique index


On 2017/03/02 4:44 PM, Keith Medcalf wrote:
> On Thursday, 2 March, 2017 06:04, Hick Gunter <h...@scigames.at> wrote:

I think what Hick tried to show was that if you have a covering Index on fields 
F1 and F2 (Unique or not) and then have another index (Automatic or not, but 
Unique) on just F1, and you then do a Query of the form:

SELECT F1,F2 FROM T WHERE F1 = x AND F2 = y

SQLite will use the covering Index (as expected), but if you drop one WHERE 
term so as to end up with the form:

SELECT F1,F2 FROM T WHERE F1 = x

Then SQLite will use the other Index when the covering Index is really better 
(for read-speed) because it contains all the fields referenced and we "know" 
that F1 is Unique so the Covering Index must still be Unique for F1. So an 
Optimization would be that if we "know" F1 to be Unique, and all the fields 
required (in the SELECT) are found in the covering Index, then using the 
covering Index will be better.

While I follow the suggestion, I'd like to point out that a covering Index 
might not be the best to use. More than one column could be Unique or the 
covering Index may contain a lot more fields than is referenced, it may not 
always be faster. Imagine these tables:

CREATE TABLE T (ID INTEGER PRIMARY KEY, F1, F2, F3, F4, F5, F6, F7, F8, F9); 
CREATE UNIQUE INDEX TU_1 ON T(F1, F2); CREATE INDEX TC_2 ON T(F1, F2, F3, F4, 
F5, F6, F7, F8, F9);

The query:
SELECT F1, F2 FROM T WHERE F1 = x AND F2 = y; will surely be much faster when 
using the implied Unique index, and further:

SELECT F3 FROM T WHERE F1 = x AND F2 = y; should still be faster using the 
(much smaller) Unique Index and reading
F3 after a lookup. If you are in doubt, imagine the same example with Fields 
going up to F999. At some field-count the Unique Index will be faster for any 
selection of field types. It's hard to imagine a safe tweak for the QP here, 
and having to "assess" whether all referenced fields are ALSO in Unique 
indices... sounds like an expensive step, but now I'm just guessing.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-02 Thread R Smith


On 2017/03/02 4:44 PM, Keith Medcalf wrote:

On Thursday, 2 March, 2017 06:04, Hick Gunter  wrote:


I think what Hick tried to show was that if you have a covering Index on 
fields F1 and F2 (Unique or not) and then have another index (Automatic 
or not, but Unique) on just F1, and you then do a Query of the form:


SELECT F1,F2 FROM T WHERE F1 = x AND F2 = y

SQLite will use the covering Index (as expected), but if you drop one 
WHERE term so as to end up with the form:


SELECT F1,F2 FROM T WHERE F1 = x

Then SQLite will use the other Index when the covering Index is really 
better (for read-speed) because it contains all the fields referenced 
and we "know" that F1 is Unique so the Covering Index must still be 
Unique for F1. So an Optimization would be that if we "know" F1 to be 
Unique, and all the fields required (in the SELECT) are found in the 
covering Index, then using the covering Index will be better.


While I follow the suggestion, I'd like to point out that a covering 
Index might not be the best to use. More than one column could be Unique 
or the covering Index may contain a lot more fields than is referenced, 
it may not always be faster. Imagine these tables:


CREATE TABLE T (ID INTEGER PRIMARY KEY, F1, F2, F3, F4, F5, F6, F7, F8, F9);
CREATE UNIQUE INDEX TU_1 ON T(F1, F2);
CREATE INDEX TC_2 ON T(F1, F2, F3, F4, F5, F6, F7, F8, F9);

The query:
SELECT F1, F2 FROM T WHERE F1 = x AND F2 = y;
will surely be much faster when using the implied Unique index, and further:

SELECT F3 FROM T WHERE F1 = x AND F2 = y;
should still be faster using the (much smaller) Unique Index and reading 
F3 after a lookup. If you are in doubt, imagine the same example with 
Fields going up to F999. At some field-count the Unique Index will be 
faster for any selection of field types. It's hard to imagine a safe 
tweak for the QP here, and having to "assess" whether all referenced 
fields are ALSO in Unique indices... sounds like an expensive step, but 
now I'm just guessing.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-02 Thread Hick Gunter
I tried to create a test table and two indices thus:

>create temp table test (id integer primary key, name text unique, bs integer, 
>data text);
>create unique index plain on test(name);
>create unique index cover on test(name,bs);

NB: The field name has a unique constraint

As long as the query mentions the additional field, sqlite chooses the covering 
index.

>explain query plan select name,bs from test where name='test' and bs=1;
>sele  order  from  deta
>  -    
>0 0  0 SEARCH TABLE test USING COVERING INDEX cover 
>(name=? AND bs=?) (~1 rows)

unfortunately if thsi field is dropped from the where clause but remains in the 
select list, sqlite misses the covering index

>explain query plan select name,bs from test where name='test';
>sele  order  from  deta
>  -    
>0 0  0 SEARCH TABLE test USING INDEX plain (name=?) (~1 
>rows)

not having an index at alls causes sqlite to create an automatic index

>drop index cover;
>drop index plain;
>explain query plan select name,bs from test where name='test';
>sele  order  from  deta
>  -    
>0 0  0 SEARCH TABLE test USING INDEX 
>sqlite_autoindex_test_1 (name=?) (~1 rows)

just the same if the index is not unique (even though the first field is)

>create index plain on test(name);
>create index cover on test(name,bs);
>explain query plan select name,bs from test where name='test';
>sele  order  from  deta
>  -    
>0 0  0 SEARCH TABLE test USING INDEX 
>sqlite_autoindex_test_1 (name=?) (~1 rows)

I guess such a schema would be quite unusual, so this optimization opportunity 
is missed

-Ursprüngliche Nachricht-
>"Any Index that starts with a Unique column will by definition be Unique."
>Sorry, yet, I admit the title of the thread is confusing. What I meant to say 
>is that it's TOO unique :). Adding the additional columns will allow 
>duplicates on the columns where duplicates should not be allowed.

I don't think so. There can only be one entry for the unique field, and the 
extra column captures the corresponding value of the one and only matching 
record.



___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-02 Thread R Smith



On 2017/03/02 2:29 PM, Deon Brewis wrote:

"This Query (for instance) will be exceedingly fast:
SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < 
range_end"

No, that's not covered. I've tried that before, that query is too slow when it 
isn't covered - the table is many GB's and the result is needed in low 
milliseconds. Also, I don't need an index directly on ExtraCol like specified 
below for the ( WHERE ExtraCol = ...) case . It's not useful a useful lookup by 
itself - only a useful result. So the only reason to include it in an index is 
to make it covered.


Ah yes, so you /do/ need every bit of work cycle saving you can get


"You cannot optimize for everything, pick your favourite thing and optimize for 
that."

This is a bit of a call to support INCLUDE columns in indexes. This 
optimization is very easy in SQL Server using that.


Well yes, so in SQLite you can optimize either for the speed or for the 
space, not for both. In MSSQL you can optimize for both using INCLUDE, 
and I can see how it is useful in specifically your case.


That said, I doubt it would make it into development code for SQLite at 
this point for the simple reason that it adds weight to the engine 
(adding INCLUDE) for a very fringe-case optimization.

Perhaps there are other reasons to add INCLUDE that are less fringe-case?


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-02 Thread Deon Brewis
"Any Index that starts with a Unique column will by definition be Unique."
Sorry, yet, I admit the title of the thread is confusing. What I meant to say 
is that it's TOO unique :). Adding the additional columns will allow duplicates 
on the columns where duplicates should not be allowed.


"This Query (for instance) will be exceedingly fast:
SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < 
range_end"

No, that's not covered. I've tried that before, that query is too slow when it 
isn't covered - the table is many GB's and the result is needed in low 
milliseconds. Also, I don't need an index directly on ExtraCol like specified 
below for the ( WHERE ExtraCol = ...) case . It's not useful a useful lookup by 
itself - only a useful result. So the only reason to include it in an index is 
to make it covered.


"You cannot optimize for everything, pick your favourite thing and optimize for 
that."

This is a bit of a call to support INCLUDE columns in indexes. This 
optimization is very easy in SQL Server using that.

- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of R Smith
Sent: Thursday, March 2, 2017 2:50 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Non-unique columns in unique index

Any Index that starts with a Unique column will by definition be Unique. 
Of course in your case you want the Uniqueness of only the first column to be 
enforced, but you want to lookup also using the second column (either or both). 
Why insist on having it in a covering Index though?

Why not just make one Unique index, and one other Index on the other column by 
itself? Like so:

+CREATE TABLE Blah(
 Id Integer Primary Key,
 UniqueCol blob, // 20 bytes fixed
 ExtraCol blob, // 12 bytes fixed
 UNIQUE(UniqueCol)
);  // ~36 bytes/row

CREATE INDEX Blah_ExtraIdx on Blah(ExtraCol);


This will be close to the same size of one covering Index.
The Query planner is clever enough to figure out it can use only the second 
Index to look up some queries, it can also use the Unique Index if needed for a 
query and it will use it to enforce uniqueness.

You get all the speed and a little extra overhead space, but much less than a 
covering Index.

This Query (for instance) will be exceedingly fast:
SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end

So would this:
SELECT * FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end AND 
ExtraCol = xxx;

And this:
SELECT * FROM Foo WHERE ExtraCol = xxx;

All of these will be super fast and your Indices will take up the least 
possible space.

There is some saving in cycles if you can read the bytes directly out of a 
covering Index rather than a big table, but this is no big table, it should be 
real quick. If you really really really need those few cycles saved, invest the 
MBs and make the covering Index additional. If space is a problem, use only the 
single Unique index.

You cannot optimize for everything, pick your favourite thing and optimize for 
that.

Cheers,
Ryan

On 2017/03/01 7:00 PM, Deon Brewis wrote:
> Is there way to add non-unique columns in a unique index?
>
> I would like to use the same index to enforce unique constraints, as well as 
> giving a covered result for other queries.
>
> Something like an 'INCLUDE' would also work (actually even better). E.g.
>
> CREATE UNIQUE INDEX indx  ON Foo(UniqueCol) INCLUDE (ExtraCol)
>
>
> If not, is there a way to efficiently implement a UNIQUE constraint in 
> a different way? (Trigger maybe?)
>
> -Deon
>   
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-02 Thread R Smith
Any Index that starts with a Unique column will by definition be Unique. 
Of course in your case you want the Uniqueness of only the first column 
to be enforced, but you want to lookup also using the second column 
(either or both). Why insist on having it in a covering Index though?


Why not just make one Unique index, and one other Index on the other 
column by itself? Like so:


+CREATE TABLE Blah(
Id Integer Primary Key,
UniqueCol blob, // 20 bytes fixed
ExtraCol blob, // 12 bytes fixed
UNIQUE(UniqueCol)
);  // ~36 bytes/row

CREATE INDEX Blah_ExtraIdx on Blah(ExtraCol);


This will be close to the same size of one covering Index.
The Query planner is clever enough to figure out it can use only the second 
Index to look up some queries, it can also use the Unique Index if needed for a 
query and it will use it to enforce uniqueness.

You get all the speed and a little extra overhead space, but much less than a 
covering Index.

This Query (for instance) will be exceedingly fast:
SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end

So would this:
SELECT * FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end AND 
ExtraCol = xxx;

And this:
SELECT * FROM Foo WHERE ExtraCol = xxx;

All of these will be super fast and your Indices will take up the least 
possible space.

There is some saving in cycles if you can read the bytes directly out of a 
covering Index rather than a big table, but this is no big table, it should be 
real quick. If you really really really need those few cycles saved, invest the 
MBs and make the covering Index additional. If space is a problem, use only the 
single Unique index.

You cannot optimize for everything, pick your favourite thing and optimize for 
that.

Cheers,
Ryan

On 2017/03/01 7:00 PM, Deon Brewis wrote:

Is there way to add non-unique columns in a unique index?

I would like to use the same index to enforce unique constraints, as well as 
giving a covered result for other queries.

Something like an 'INCLUDE' would also work (actually even better). E.g.

CREATE UNIQUE INDEX indx  ON Foo(UniqueCol) INCLUDE (ExtraCol)


If not, is there a way to efficiently implement a UNIQUE constraint in a 
different way? (Trigger maybe?)

-Deon
  
___

sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-01 Thread Keith Medcalf

On Wednesday, 1 March, 2017 14:12, Deon Brewis <de...@outlook.com> wrote:

> "But look how much space and processing time it would take up"
 
> Can you clarify what you mean by "space" ?
 
> The processing time argument I understand.

I doubt there is any "space" requirement at all.  In fact, since you are not 
carrying an extra index, the space requirement will be significantly reduced.  

Your only real concern would be that on INSERT and UPDATE operations the 
_prepare would take a little longer since it would have to retrieve and 
generate the trigger code for the statement.  You pay the price during 
statement prepare, but I doubt that the actual statement execution time will be 
much affected (the execution would have to do a "lookup" in the unique index 
anyway to determine whether the operation meets the unique requirement) as you 
are merely substitution one operation for another like operation.  Of course, 
the internal unique check is done entirely inline, whereas the trigger method 
is done by running additional VBDE code -- though of course the overall time to 
run the trigger may approximate the time taken to maintain an the extra index 
-- in which case you would save the space used by the extra index and have no 
(or a minor) execution time penalty on insert/update operations.

> I think this is one of those things that if the database engine doesn't
> internally support it, it can't really be emulated.
> 
> Sure would be nice to have INCLUDE columns support (here and in other
> places).
> 
> - Deon
> 
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Simon Slavin
> Sent: Wednesday, March 1, 2017 12:57 PM
> To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
> Subject: Re: [sqlite] Non-unique columns in unique index
> 
> 
> On 1 Mar 2017, at 8:21pm, David Raymond <david.raym...@tomtom.com> wrote:
> 
> > The trigger version you asked about would look something like the below
> I believe. More risky than having the two indexes, but should work.
> (Famous last words)
> 
> I have no doubt it would work.  But look how much space and processing
> time it would take up.  Far simpler and clearer just to create the two
> indexes.
> 
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-01 Thread David Raymond
SQLITE_STAT1.. 10.0% 

*** Page counts for all tables and indices separately *

BAR... 42030   21.5% 
FOO... 42030   21.5% 
BARINDEX.. 40410   20.7% 
FOOINDEX.. 40410   20.7% 
SQLITE_AUTOINDEX_FOO_1 30206   15.5% 
SQLITE_MASTER. 10.0% 
SQLITE_STAT1.. 10.0% 


-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Simon Slavin
Sent: Wednesday, March 01, 2017 4:38 PM
To: SQLite mailing list
Subject: Re: [sqlite] Non-unique columns in unique index


On 1 Mar 2017, at 9:11pm, Deon Brewis <de...@outlook.com> wrote:

> "But look how much space and processing time it would take up"
> 
> Can you clarify what you mean by "space" ?

Your triggers have to be stored.  Every time they’re activated (each time you 
insert a row) they have to be run.  That requires SQLite to grab enough memory 
and/or disk space to run the triggers and to construct any temporary indexes 
they require.

Also, it’s a nightmare to debug in comparison with just creating one extra 
index.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-01 Thread Deon Brewis
Right, you need to use the birthday attack algorithm for determining collision 
risk.

I wouldn't mind hashing - but there is an additional complication - the query I 
want the covered field in the index for is this:

SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end

So I would need a hashing algorithm that's usably small and doesn't collide, 
yet preserves the properties that if  a < b then  hash(a) < hash(b).

- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Paul Sanderson
Sent: Wednesday, March 1, 2017 2:19 PM
To: p...@sandersonforensics.com
Cc: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Non-unique columns in unique index

Hmm - a bit of google foo and refreshing of chances of collions means my
1:9,223,372,036,854,775,808 is way off

That is the chance of any two hashes colliding - you'll have lot sof hashes.

The basic idea might be sound though even if you stick with a full MD5 it 
should save a chunk of storage

Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence

On 1 March 2017 at 22:13, Paul Sanderson <sandersonforens...@gmail.com>
wrote:

> As a bit of an off the wall suggestion you could try an MD5 (or even a 
> partial MD5 - half of the bytes)
>
> CREATE table hashes (
> hash integer primary key; // just the first 64 bits of the hash of 
> uniquecol and extracol
> )
>
> as an integer primary key the hash would be an alias of the rowid and 
> so storage would be 8 bytes plus admin
>
> the chance of a random colliison based on a 64 bit hash would be (I 
> think)
> 1:9,223,372,036,854,775,808
>
> MD5 is broken but would work OK for this
>
> use a trigger to abort the insert into blah if the insert into hashes 
> fails.
>
>
> Paul
> www.sandersonforensics.com
> skype: r3scue193
> twitter: @sandersonforens
> Tel +44 (0)1326 572786 <+44%201326%20572786>
> http://sandersonforensics.com/forum/content.php?195-SQLite-
> Forensic-Toolkit -Forensic Toolkit for SQLite email from a work 
> address for a fully functional demo licence
>
> On 1 March 2017 at 21:54, Simon Slavin <slav...@bigfraud.org> wrote:
>
>>
>> On 1 Mar 2017, at 9:41pm, Deon Brewis <de...@outlook.com> wrote:
>>
>> > Yeah ok, but that is paltry compared with the gb's of diskspace 
>> > that
>> the actual second index takes up. But thanks for clarifying.
>>
>> Ah.  If it’s really GBs of disk space then I can see why you’d look 
>> for alternative solutions.
>>
>> But I have a 43 GB database file which could be 20 GB without an 
>> extra index.  I could have written that extra check in code, and 
>> reduced the file size, but I decided not to.  Because once I had 
>> developed procedures to handle a 20 GB file, I might was well be dealing 
>> with a 43 GB file anyway.
>>
>> Simon.
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-01 Thread Paul Sanderson
Hmm - a bit of google foo and refreshing of chances of collions means my
1:9,223,372,036,854,775,808 is way off

That is the chance of any two hashes colliding - you'll have lot sof hashes.

The basic idea might be sound though even if you stick with a full MD5 it
should save a chunk of storage

Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence

On 1 March 2017 at 22:13, Paul Sanderson 
wrote:

> As a bit of an off the wall suggestion you could try an MD5 (or even a
> partial MD5 - half of the bytes)
>
> CREATE table hashes (
> hash integer primary key; // just the first 64 bits of the hash of
> uniquecol and extracol
> )
>
> as an integer primary key the hash would be an alias of the rowid and so
> storage would be 8 bytes plus admin
>
> the chance of a random colliison based on a 64 bit hash would be (I think)
> 1:9,223,372,036,854,775,808
>
> MD5 is broken but would work OK for this
>
> use a trigger to abort the insert into blah if the insert into hashes
> fails.
>
>
> Paul
> www.sandersonforensics.com
> skype: r3scue193
> twitter: @sandersonforens
> Tel +44 (0)1326 572786 <+44%201326%20572786>
> http://sandersonforensics.com/forum/content.php?195-SQLite-
> Forensic-Toolkit -Forensic Toolkit for SQLite
> email from a work address for a fully functional demo licence
>
> On 1 March 2017 at 21:54, Simon Slavin  wrote:
>
>>
>> On 1 Mar 2017, at 9:41pm, Deon Brewis  wrote:
>>
>> > Yeah ok, but that is paltry compared with the gb's of diskspace that
>> the actual second index takes up. But thanks for clarifying.
>>
>> Ah.  If it’s really GBs of disk space then I can see why you’d look for
>> alternative solutions.
>>
>> But I have a 43 GB database file which could be 20 GB without an extra
>> index.  I could have written that extra check in code, and reduced the file
>> size, but I decided not to.  Because once I had developed procedures to
>> handle a 20 GB file, I might was well be dealing with a 43 GB file anyway.
>>
>> Simon.
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-01 Thread Paul Sanderson
As a bit of an off the wall suggestion you could try an MD5 (or even a
partial MD5 - half of the bytes)

CREATE table hashes (
hash integer primary key; // just the first 64 bits of the hash of
uniquecol and extracol
)

as an integer primary key the hash would be an alias of the rowid and so
storage would be 8 bytes plus admin

the chance of a random colliison based on a 64 bit hash would be (I think)
1:9,223,372,036,854,775,808

MD5 is broken but would work OK for this

use a trigger to abort the insert into blah if the insert into hashes fails.


Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence

On 1 March 2017 at 21:54, Simon Slavin  wrote:

>
> On 1 Mar 2017, at 9:41pm, Deon Brewis  wrote:
>
> > Yeah ok, but that is paltry compared with the gb's of diskspace that the
> actual second index takes up. But thanks for clarifying.
>
> Ah.  If it’s really GBs of disk space then I can see why you’d look for
> alternative solutions.
>
> But I have a 43 GB database file which could be 20 GB without an extra
> index.  I could have written that extra check in code, and reduced the file
> size, but I decided not to.  Because once I had developed procedures to
> handle a 20 GB file, I might was well be dealing with a 43 GB file anyway.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-01 Thread Simon Slavin

On 1 Mar 2017, at 9:41pm, Deon Brewis  wrote:

> Yeah ok, but that is paltry compared with the gb's of diskspace that the 
> actual second index takes up. But thanks for clarifying.

Ah.  If it’s really GBs of disk space then I can see why you’d look for 
alternative solutions.

But I have a 43 GB database file which could be 20 GB without an extra index.  
I could have written that extra check in code, and reduced the file size, but I 
decided not to.  Because once I had developed procedures to handle a 20 GB 
file, I might was well be dealing with a 43 GB file anyway.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-01 Thread Deon Brewis
Yeah ok, but that is paltry compared with the gb's of diskspace that the actual 
second index takes up. But thanks for clarifying.

-Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Simon Slavin
Sent: Wednesday, March 1, 2017 1:38 PM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Non-unique columns in unique index


On 1 Mar 2017, at 9:11pm, Deon Brewis <de...@outlook.com> wrote:

> "But look how much space and processing time it would take up"
> 
> Can you clarify what you mean by "space" ?

Your triggers have to be stored.  Every time they’re activated (each time you 
insert a row) they have to be run.  That requires SQLite to grab enough memory 
and/or disk space to run the triggers and to construct any temporary indexes 
they require.

Also, it’s a nightmare to debug in comparison with just creating one extra 
index.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-01 Thread Simon Slavin

On 1 Mar 2017, at 9:11pm, Deon Brewis  wrote:

> "But look how much space and processing time it would take up"
> 
> Can you clarify what you mean by "space" ?

Your triggers have to be stored.  Every time they’re activated (each time you 
insert a row) they have to be run.  That requires SQLite to grab enough memory 
and/or disk space to run the triggers and to construct any temporary indexes 
they require.

Also, it’s a nightmare to debug in comparison with just creating one extra 
index.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-01 Thread Deon Brewis
"But look how much space and processing time it would take up"

Can you clarify what you mean by "space" ?

The processing time argument I understand. 


I think this is one of those things that if the database engine doesn't 
internally support it, it can't really be emulated. 

Sure would be nice to have INCLUDE columns support (here and in other places).

- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Simon Slavin
Sent: Wednesday, March 1, 2017 12:57 PM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Non-unique columns in unique index


On 1 Mar 2017, at 8:21pm, David Raymond <david.raym...@tomtom.com> wrote:

> The trigger version you asked about would look something like the below I 
> believe. More risky than having the two indexes, but should work. (Famous 
> last words)

I have no doubt it would work.  But look how much space and processing time it 
would take up.  Far simpler and clearer just to create the two indexes.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-01 Thread Deon Brewis
Thanks, thought about it. But I have a ROWID requirement as well due to foreign 
keys referring back to this table via the ROWID.

I could in theory put that just the RowId in an additional index, but then the 
RowId index will just carry the entire UniqueCol again - back to the same 
problem, just reversed.

To throw the specific design out there:

This is a frequent read, infrequent update table:

CREATE TABLE Blah(
Id Integer Primary Key,
UniqueCol blob, // 20 bytes fixed
ExtraCol blob, // 12 bytes fixed
UNIQUE(UniqueCol)
);  // ~36 bytes/row

CREATE INDEX sqlite_autoindex_Resource_1 on Blah  ( // implicit index
UniqueCol,   // 20 bytes
 // Id -- implicit
)  // ~24 bytes/row

CREATE INDEX blahIndex on Blah ( // actual needed index for workload
UniqueCol, // 20 bytes
ExtraCol,  // 12 bytes
 // Id -- implicit
)  // ~36 bytes/row

So this 3rd index is exactly as big as the original table. I would love to get 
rid of at least the 2 vs. 3 redundancy here. 

- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Clemens Ladisch
Sent: Wednesday, March 1, 2017 9:58 AM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Non-unique columns in unique index

Deon Brewis wrote:
> Is there way to add non-unique columns in a unique index?
>
> I would like to use the same index to enforce unique constraints, as well as 
> giving a covered result for other queries.
>
> CREATE UNIQUE INDEX indx  ON Foo(UniqueCol) INCLUDE (ExtraCol)

CREATE TABLE Foo (
  UniqueCol PRIMARY KEY,
  ExtraCol,
  [...]
) WITHOUT ROWID;


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-unique columns in unique index

2017-03-01 Thread Simon Slavin

On 1 Mar 2017, at 5:00pm, Deon Brewis  wrote:

> Is there way to add non-unique columns in a unique index?

I don’t know of anything that operates in that way.

> If not, is there a way to efficiently implement a UNIQUE constraint in a 
> different way? (Trigger maybe?)

Not efficiently.  You could make a non-UNIQUE index which has the columns you 
want UNIQUE at the beginning.  Then you could include a check for uniqueness in 
your own code.  This would probably tempt SQLite to use that index when 
checking uniqueness.  But this strikes me as less efficient than just letting 
SQLite do its own thing.

But it would be far more efficient just to create two indexes, one for the 
UNIQUE and one for coverage.  Or just create the UNIQUE index and let SQLite 
get the other columns itself.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users