Re: [sqlite] Non-unique columns in unique index
Thank you, that is exactyy what I intended. If sqlite would choose to use the covering index in cases where read speed (plus row decode) is likely to be better, then it would already "sort of" support the INCLUDE syntax requested by the OP with only minor changes. Note that my investigation involved sqlite version 1.7.14.1; maybe the NGQP already handles this efficiently. Also thanks to Keith for pointing out the AUTOMATIC label for ad-hoc generated indices. BTW: My first name is Gunter >-Ursprüngliche Nachricht- >Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im >Auftrag von R Smith >Gesendet: Donnerstag, 02. März 2017 16:10 >An: sqlite-users@mailinglists.sqlite.org >Betreff: Re: [sqlite] Non-unique columns in unique index > > >On 2017/03/02 4:44 PM, Keith Medcalf wrote: >> On Thursday, 2 March, 2017 06:04, Hick Gunter <h...@scigames.at> wrote: > >I think what Hick tried to show was that if you have a covering Index on >fields F1 and F2 (Unique or not) and then have another index (Automatic or >not, but Unique) on just F1, and you then do a Query of the form: > >SELECT F1,F2 FROM T WHERE F1 = x AND F2 = y > >SQLite will use the covering Index (as expected), but if you drop one WHERE >term so as to end up with the form: > >SELECT F1,F2 FROM T WHERE F1 = x > >Then SQLite will use the other Index when the covering Index is really better >(for read-speed) because it contains all the fields referenced and we "know" >that F1 is Unique so the Covering Index must still be Unique >for F1. So an >Optimization would be that if we "know" F1 to be Unique, and all the fields >required (in the SELECT) are found in the covering Index, then using the >covering Index will be better. > ___ Gunter Hick Software Engineer Scientific Games International GmbH FN 157284 a, HG Wien Klitschgasse 2-4, A-1130 Vienna, Austria Tel: +43 1 80100 0 E-Mail: h...@scigames.at This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by return e-mail message and delete all copies of the original communication. Thank you for your cooperation. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
I live my life one "indexed by" at a time. - Deon -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of R Smith Sent: Thursday, March 2, 2017 7:10 AM To: sqlite-users@mailinglists.sqlite.org Subject: Re: [sqlite] Non-unique columns in unique index On 2017/03/02 4:44 PM, Keith Medcalf wrote: > On Thursday, 2 March, 2017 06:04, Hick Gunter <h...@scigames.at> wrote: I think what Hick tried to show was that if you have a covering Index on fields F1 and F2 (Unique or not) and then have another index (Automatic or not, but Unique) on just F1, and you then do a Query of the form: SELECT F1,F2 FROM T WHERE F1 = x AND F2 = y SQLite will use the covering Index (as expected), but if you drop one WHERE term so as to end up with the form: SELECT F1,F2 FROM T WHERE F1 = x Then SQLite will use the other Index when the covering Index is really better (for read-speed) because it contains all the fields referenced and we "know" that F1 is Unique so the Covering Index must still be Unique for F1. So an Optimization would be that if we "know" F1 to be Unique, and all the fields required (in the SELECT) are found in the covering Index, then using the covering Index will be better. While I follow the suggestion, I'd like to point out that a covering Index might not be the best to use. More than one column could be Unique or the covering Index may contain a lot more fields than is referenced, it may not always be faster. Imagine these tables: CREATE TABLE T (ID INTEGER PRIMARY KEY, F1, F2, F3, F4, F5, F6, F7, F8, F9); CREATE UNIQUE INDEX TU_1 ON T(F1, F2); CREATE INDEX TC_2 ON T(F1, F2, F3, F4, F5, F6, F7, F8, F9); The query: SELECT F1, F2 FROM T WHERE F1 = x AND F2 = y; will surely be much faster when using the implied Unique index, and further: SELECT F3 FROM T WHERE F1 = x AND F2 = y; should still be faster using the (much smaller) Unique Index and reading F3 after a lookup. If you are in doubt, imagine the same example with Fields going up to F999. At some field-count the Unique Index will be faster for any selection of field types. It's hard to imagine a safe tweak for the QP here, and having to "assess" whether all referenced fields are ALSO in Unique indices... sounds like an expensive step, but now I'm just guessing. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
On 2017/03/02 4:44 PM, Keith Medcalf wrote: On Thursday, 2 March, 2017 06:04, Hick Gunterwrote: I think what Hick tried to show was that if you have a covering Index on fields F1 and F2 (Unique or not) and then have another index (Automatic or not, but Unique) on just F1, and you then do a Query of the form: SELECT F1,F2 FROM T WHERE F1 = x AND F2 = y SQLite will use the covering Index (as expected), but if you drop one WHERE term so as to end up with the form: SELECT F1,F2 FROM T WHERE F1 = x Then SQLite will use the other Index when the covering Index is really better (for read-speed) because it contains all the fields referenced and we "know" that F1 is Unique so the Covering Index must still be Unique for F1. So an Optimization would be that if we "know" F1 to be Unique, and all the fields required (in the SELECT) are found in the covering Index, then using the covering Index will be better. While I follow the suggestion, I'd like to point out that a covering Index might not be the best to use. More than one column could be Unique or the covering Index may contain a lot more fields than is referenced, it may not always be faster. Imagine these tables: CREATE TABLE T (ID INTEGER PRIMARY KEY, F1, F2, F3, F4, F5, F6, F7, F8, F9); CREATE UNIQUE INDEX TU_1 ON T(F1, F2); CREATE INDEX TC_2 ON T(F1, F2, F3, F4, F5, F6, F7, F8, F9); The query: SELECT F1, F2 FROM T WHERE F1 = x AND F2 = y; will surely be much faster when using the implied Unique index, and further: SELECT F3 FROM T WHERE F1 = x AND F2 = y; should still be faster using the (much smaller) Unique Index and reading F3 after a lookup. If you are in doubt, imagine the same example with Fields going up to F999. At some field-count the Unique Index will be faster for any selection of field types. It's hard to imagine a safe tweak for the QP here, and having to "assess" whether all referenced fields are ALSO in Unique indices... sounds like an expensive step, but now I'm just guessing. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
I tried to create a test table and two indices thus: >create temp table test (id integer primary key, name text unique, bs integer, >data text); >create unique index plain on test(name); >create unique index cover on test(name,bs); NB: The field name has a unique constraint As long as the query mentions the additional field, sqlite chooses the covering index. >explain query plan select name,bs from test where name='test' and bs=1; >sele order from deta > - >0 0 0 SEARCH TABLE test USING COVERING INDEX cover >(name=? AND bs=?) (~1 rows) unfortunately if thsi field is dropped from the where clause but remains in the select list, sqlite misses the covering index >explain query plan select name,bs from test where name='test'; >sele order from deta > - >0 0 0 SEARCH TABLE test USING INDEX plain (name=?) (~1 >rows) not having an index at alls causes sqlite to create an automatic index >drop index cover; >drop index plain; >explain query plan select name,bs from test where name='test'; >sele order from deta > - >0 0 0 SEARCH TABLE test USING INDEX >sqlite_autoindex_test_1 (name=?) (~1 rows) just the same if the index is not unique (even though the first field is) >create index plain on test(name); >create index cover on test(name,bs); >explain query plan select name,bs from test where name='test'; >sele order from deta > - >0 0 0 SEARCH TABLE test USING INDEX >sqlite_autoindex_test_1 (name=?) (~1 rows) I guess such a schema would be quite unusual, so this optimization opportunity is missed -Ursprüngliche Nachricht- >"Any Index that starts with a Unique column will by definition be Unique." >Sorry, yet, I admit the title of the thread is confusing. What I meant to say >is that it's TOO unique :). Adding the additional columns will allow >duplicates on the columns where duplicates should not be allowed. I don't think so. There can only be one entry for the unique field, and the extra column captures the corresponding value of the one and only matching record. ___ Gunter Hick Software Engineer Scientific Games International GmbH FN 157284 a, HG Wien Klitschgasse 2-4, A-1130 Vienna, Austria Tel: +43 1 80100 0 E-Mail: h...@scigames.at This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by return e-mail message and delete all copies of the original communication. Thank you for your cooperation. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
On 2017/03/02 2:29 PM, Deon Brewis wrote: "This Query (for instance) will be exceedingly fast: SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end" No, that's not covered. I've tried that before, that query is too slow when it isn't covered - the table is many GB's and the result is needed in low milliseconds. Also, I don't need an index directly on ExtraCol like specified below for the ( WHERE ExtraCol = ...) case . It's not useful a useful lookup by itself - only a useful result. So the only reason to include it in an index is to make it covered. Ah yes, so you /do/ need every bit of work cycle saving you can get "You cannot optimize for everything, pick your favourite thing and optimize for that." This is a bit of a call to support INCLUDE columns in indexes. This optimization is very easy in SQL Server using that. Well yes, so in SQLite you can optimize either for the speed or for the space, not for both. In MSSQL you can optimize for both using INCLUDE, and I can see how it is useful in specifically your case. That said, I doubt it would make it into development code for SQLite at this point for the simple reason that it adds weight to the engine (adding INCLUDE) for a very fringe-case optimization. Perhaps there are other reasons to add INCLUDE that are less fringe-case? ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
"Any Index that starts with a Unique column will by definition be Unique." Sorry, yet, I admit the title of the thread is confusing. What I meant to say is that it's TOO unique :). Adding the additional columns will allow duplicates on the columns where duplicates should not be allowed. "This Query (for instance) will be exceedingly fast: SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end" No, that's not covered. I've tried that before, that query is too slow when it isn't covered - the table is many GB's and the result is needed in low milliseconds. Also, I don't need an index directly on ExtraCol like specified below for the ( WHERE ExtraCol = ...) case . It's not useful a useful lookup by itself - only a useful result. So the only reason to include it in an index is to make it covered. "You cannot optimize for everything, pick your favourite thing and optimize for that." This is a bit of a call to support INCLUDE columns in indexes. This optimization is very easy in SQL Server using that. - Deon -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of R Smith Sent: Thursday, March 2, 2017 2:50 AM To: sqlite-users@mailinglists.sqlite.org Subject: Re: [sqlite] Non-unique columns in unique index Any Index that starts with a Unique column will by definition be Unique. Of course in your case you want the Uniqueness of only the first column to be enforced, but you want to lookup also using the second column (either or both). Why insist on having it in a covering Index though? Why not just make one Unique index, and one other Index on the other column by itself? Like so: +CREATE TABLE Blah( Id Integer Primary Key, UniqueCol blob, // 20 bytes fixed ExtraCol blob, // 12 bytes fixed UNIQUE(UniqueCol) ); // ~36 bytes/row CREATE INDEX Blah_ExtraIdx on Blah(ExtraCol); This will be close to the same size of one covering Index. The Query planner is clever enough to figure out it can use only the second Index to look up some queries, it can also use the Unique Index if needed for a query and it will use it to enforce uniqueness. You get all the speed and a little extra overhead space, but much less than a covering Index. This Query (for instance) will be exceedingly fast: SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end So would this: SELECT * FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end AND ExtraCol = xxx; And this: SELECT * FROM Foo WHERE ExtraCol = xxx; All of these will be super fast and your Indices will take up the least possible space. There is some saving in cycles if you can read the bytes directly out of a covering Index rather than a big table, but this is no big table, it should be real quick. If you really really really need those few cycles saved, invest the MBs and make the covering Index additional. If space is a problem, use only the single Unique index. You cannot optimize for everything, pick your favourite thing and optimize for that. Cheers, Ryan On 2017/03/01 7:00 PM, Deon Brewis wrote: > Is there way to add non-unique columns in a unique index? > > I would like to use the same index to enforce unique constraints, as well as > giving a covered result for other queries. > > Something like an 'INCLUDE' would also work (actually even better). E.g. > > CREATE UNIQUE INDEX indx ON Foo(UniqueCol) INCLUDE (ExtraCol) > > > If not, is there a way to efficiently implement a UNIQUE constraint in > a different way? (Trigger maybe?) > > -Deon > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
Any Index that starts with a Unique column will by definition be Unique. Of course in your case you want the Uniqueness of only the first column to be enforced, but you want to lookup also using the second column (either or both). Why insist on having it in a covering Index though? Why not just make one Unique index, and one other Index on the other column by itself? Like so: +CREATE TABLE Blah( Id Integer Primary Key, UniqueCol blob, // 20 bytes fixed ExtraCol blob, // 12 bytes fixed UNIQUE(UniqueCol) ); // ~36 bytes/row CREATE INDEX Blah_ExtraIdx on Blah(ExtraCol); This will be close to the same size of one covering Index. The Query planner is clever enough to figure out it can use only the second Index to look up some queries, it can also use the Unique Index if needed for a query and it will use it to enforce uniqueness. You get all the speed and a little extra overhead space, but much less than a covering Index. This Query (for instance) will be exceedingly fast: SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end So would this: SELECT * FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end AND ExtraCol = xxx; And this: SELECT * FROM Foo WHERE ExtraCol = xxx; All of these will be super fast and your Indices will take up the least possible space. There is some saving in cycles if you can read the bytes directly out of a covering Index rather than a big table, but this is no big table, it should be real quick. If you really really really need those few cycles saved, invest the MBs and make the covering Index additional. If space is a problem, use only the single Unique index. You cannot optimize for everything, pick your favourite thing and optimize for that. Cheers, Ryan On 2017/03/01 7:00 PM, Deon Brewis wrote: Is there way to add non-unique columns in a unique index? I would like to use the same index to enforce unique constraints, as well as giving a covered result for other queries. Something like an 'INCLUDE' would also work (actually even better). E.g. CREATE UNIQUE INDEX indx ON Foo(UniqueCol) INCLUDE (ExtraCol) If not, is there a way to efficiently implement a UNIQUE constraint in a different way? (Trigger maybe?) -Deon ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
On Wednesday, 1 March, 2017 14:12, Deon Brewis <de...@outlook.com> wrote: > "But look how much space and processing time it would take up" > Can you clarify what you mean by "space" ? > The processing time argument I understand. I doubt there is any "space" requirement at all. In fact, since you are not carrying an extra index, the space requirement will be significantly reduced. Your only real concern would be that on INSERT and UPDATE operations the _prepare would take a little longer since it would have to retrieve and generate the trigger code for the statement. You pay the price during statement prepare, but I doubt that the actual statement execution time will be much affected (the execution would have to do a "lookup" in the unique index anyway to determine whether the operation meets the unique requirement) as you are merely substitution one operation for another like operation. Of course, the internal unique check is done entirely inline, whereas the trigger method is done by running additional VBDE code -- though of course the overall time to run the trigger may approximate the time taken to maintain an the extra index -- in which case you would save the space used by the extra index and have no (or a minor) execution time penalty on insert/update operations. > I think this is one of those things that if the database engine doesn't > internally support it, it can't really be emulated. > > Sure would be nice to have INCLUDE columns support (here and in other > places). > > - Deon > > -Original Message- > From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] > On Behalf Of Simon Slavin > Sent: Wednesday, March 1, 2017 12:57 PM > To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org> > Subject: Re: [sqlite] Non-unique columns in unique index > > > On 1 Mar 2017, at 8:21pm, David Raymond <david.raym...@tomtom.com> wrote: > > > The trigger version you asked about would look something like the below > I believe. More risky than having the two indexes, but should work. > (Famous last words) > > I have no doubt it would work. But look how much space and processing > time it would take up. Far simpler and clearer just to create the two > indexes. > > Simon. > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
SQLITE_STAT1.. 10.0% *** Page counts for all tables and indices separately * BAR... 42030 21.5% FOO... 42030 21.5% BARINDEX.. 40410 20.7% FOOINDEX.. 40410 20.7% SQLITE_AUTOINDEX_FOO_1 30206 15.5% SQLITE_MASTER. 10.0% SQLITE_STAT1.. 10.0% -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Simon Slavin Sent: Wednesday, March 01, 2017 4:38 PM To: SQLite mailing list Subject: Re: [sqlite] Non-unique columns in unique index On 1 Mar 2017, at 9:11pm, Deon Brewis <de...@outlook.com> wrote: > "But look how much space and processing time it would take up" > > Can you clarify what you mean by "space" ? Your triggers have to be stored. Every time they’re activated (each time you insert a row) they have to be run. That requires SQLite to grab enough memory and/or disk space to run the triggers and to construct any temporary indexes they require. Also, it’s a nightmare to debug in comparison with just creating one extra index. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
Right, you need to use the birthday attack algorithm for determining collision risk. I wouldn't mind hashing - but there is an additional complication - the query I want the covered field in the index for is this: SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end So I would need a hashing algorithm that's usably small and doesn't collide, yet preserves the properties that if a < b then hash(a) < hash(b). - Deon -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Paul Sanderson Sent: Wednesday, March 1, 2017 2:19 PM To: p...@sandersonforensics.com Cc: SQLite mailing list <sqlite-users@mailinglists.sqlite.org> Subject: Re: [sqlite] Non-unique columns in unique index Hmm - a bit of google foo and refreshing of chances of collions means my 1:9,223,372,036,854,775,808 is way off That is the chance of any two hashes colliding - you'll have lot sof hashes. The basic idea might be sound though even if you stick with a full MD5 it should save a chunk of storage Paul www.sandersonforensics.com skype: r3scue193 twitter: @sandersonforens Tel +44 (0)1326 572786 http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit -Forensic Toolkit for SQLite email from a work address for a fully functional demo licence On 1 March 2017 at 22:13, Paul Sanderson <sandersonforens...@gmail.com> wrote: > As a bit of an off the wall suggestion you could try an MD5 (or even a > partial MD5 - half of the bytes) > > CREATE table hashes ( > hash integer primary key; // just the first 64 bits of the hash of > uniquecol and extracol > ) > > as an integer primary key the hash would be an alias of the rowid and > so storage would be 8 bytes plus admin > > the chance of a random colliison based on a 64 bit hash would be (I > think) > 1:9,223,372,036,854,775,808 > > MD5 is broken but would work OK for this > > use a trigger to abort the insert into blah if the insert into hashes > fails. > > > Paul > www.sandersonforensics.com > skype: r3scue193 > twitter: @sandersonforens > Tel +44 (0)1326 572786 <+44%201326%20572786> > http://sandersonforensics.com/forum/content.php?195-SQLite- > Forensic-Toolkit -Forensic Toolkit for SQLite email from a work > address for a fully functional demo licence > > On 1 March 2017 at 21:54, Simon Slavin <slav...@bigfraud.org> wrote: > >> >> On 1 Mar 2017, at 9:41pm, Deon Brewis <de...@outlook.com> wrote: >> >> > Yeah ok, but that is paltry compared with the gb's of diskspace >> > that >> the actual second index takes up. But thanks for clarifying. >> >> Ah. If it’s really GBs of disk space then I can see why you’d look >> for alternative solutions. >> >> But I have a 43 GB database file which could be 20 GB without an >> extra index. I could have written that extra check in code, and >> reduced the file size, but I decided not to. Because once I had >> developed procedures to handle a 20 GB file, I might was well be dealing >> with a 43 GB file anyway. >> >> Simon. >> ___ >> sqlite-users mailing list >> sqlite-users@mailinglists.sqlite.org >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >> > > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
Hmm - a bit of google foo and refreshing of chances of collions means my 1:9,223,372,036,854,775,808 is way off That is the chance of any two hashes colliding - you'll have lot sof hashes. The basic idea might be sound though even if you stick with a full MD5 it should save a chunk of storage Paul www.sandersonforensics.com skype: r3scue193 twitter: @sandersonforens Tel +44 (0)1326 572786 http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit -Forensic Toolkit for SQLite email from a work address for a fully functional demo licence On 1 March 2017 at 22:13, Paul Sandersonwrote: > As a bit of an off the wall suggestion you could try an MD5 (or even a > partial MD5 - half of the bytes) > > CREATE table hashes ( > hash integer primary key; // just the first 64 bits of the hash of > uniquecol and extracol > ) > > as an integer primary key the hash would be an alias of the rowid and so > storage would be 8 bytes plus admin > > the chance of a random colliison based on a 64 bit hash would be (I think) > 1:9,223,372,036,854,775,808 > > MD5 is broken but would work OK for this > > use a trigger to abort the insert into blah if the insert into hashes > fails. > > > Paul > www.sandersonforensics.com > skype: r3scue193 > twitter: @sandersonforens > Tel +44 (0)1326 572786 <+44%201326%20572786> > http://sandersonforensics.com/forum/content.php?195-SQLite- > Forensic-Toolkit -Forensic Toolkit for SQLite > email from a work address for a fully functional demo licence > > On 1 March 2017 at 21:54, Simon Slavin wrote: > >> >> On 1 Mar 2017, at 9:41pm, Deon Brewis wrote: >> >> > Yeah ok, but that is paltry compared with the gb's of diskspace that >> the actual second index takes up. But thanks for clarifying. >> >> Ah. If it’s really GBs of disk space then I can see why you’d look for >> alternative solutions. >> >> But I have a 43 GB database file which could be 20 GB without an extra >> index. I could have written that extra check in code, and reduced the file >> size, but I decided not to. Because once I had developed procedures to >> handle a 20 GB file, I might was well be dealing with a 43 GB file anyway. >> >> Simon. >> ___ >> sqlite-users mailing list >> sqlite-users@mailinglists.sqlite.org >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users >> > > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
As a bit of an off the wall suggestion you could try an MD5 (or even a partial MD5 - half of the bytes) CREATE table hashes ( hash integer primary key; // just the first 64 bits of the hash of uniquecol and extracol ) as an integer primary key the hash would be an alias of the rowid and so storage would be 8 bytes plus admin the chance of a random colliison based on a 64 bit hash would be (I think) 1:9,223,372,036,854,775,808 MD5 is broken but would work OK for this use a trigger to abort the insert into blah if the insert into hashes fails. Paul www.sandersonforensics.com skype: r3scue193 twitter: @sandersonforens Tel +44 (0)1326 572786 http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit -Forensic Toolkit for SQLite email from a work address for a fully functional demo licence On 1 March 2017 at 21:54, Simon Slavinwrote: > > On 1 Mar 2017, at 9:41pm, Deon Brewis wrote: > > > Yeah ok, but that is paltry compared with the gb's of diskspace that the > actual second index takes up. But thanks for clarifying. > > Ah. If it’s really GBs of disk space then I can see why you’d look for > alternative solutions. > > But I have a 43 GB database file which could be 20 GB without an extra > index. I could have written that extra check in code, and reduced the file > size, but I decided not to. Because once I had developed procedures to > handle a 20 GB file, I might was well be dealing with a 43 GB file anyway. > > Simon. > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
On 1 Mar 2017, at 9:41pm, Deon Brewiswrote: > Yeah ok, but that is paltry compared with the gb's of diskspace that the > actual second index takes up. But thanks for clarifying. Ah. If it’s really GBs of disk space then I can see why you’d look for alternative solutions. But I have a 43 GB database file which could be 20 GB without an extra index. I could have written that extra check in code, and reduced the file size, but I decided not to. Because once I had developed procedures to handle a 20 GB file, I might was well be dealing with a 43 GB file anyway. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
Yeah ok, but that is paltry compared with the gb's of diskspace that the actual second index takes up. But thanks for clarifying. -Deon -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Simon Slavin Sent: Wednesday, March 1, 2017 1:38 PM To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org> Subject: Re: [sqlite] Non-unique columns in unique index On 1 Mar 2017, at 9:11pm, Deon Brewis <de...@outlook.com> wrote: > "But look how much space and processing time it would take up" > > Can you clarify what you mean by "space" ? Your triggers have to be stored. Every time they’re activated (each time you insert a row) they have to be run. That requires SQLite to grab enough memory and/or disk space to run the triggers and to construct any temporary indexes they require. Also, it’s a nightmare to debug in comparison with just creating one extra index. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
On 1 Mar 2017, at 9:11pm, Deon Brewiswrote: > "But look how much space and processing time it would take up" > > Can you clarify what you mean by "space" ? Your triggers have to be stored. Every time they’re activated (each time you insert a row) they have to be run. That requires SQLite to grab enough memory and/or disk space to run the triggers and to construct any temporary indexes they require. Also, it’s a nightmare to debug in comparison with just creating one extra index. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
"But look how much space and processing time it would take up" Can you clarify what you mean by "space" ? The processing time argument I understand. I think this is one of those things that if the database engine doesn't internally support it, it can't really be emulated. Sure would be nice to have INCLUDE columns support (here and in other places). - Deon -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Simon Slavin Sent: Wednesday, March 1, 2017 12:57 PM To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org> Subject: Re: [sqlite] Non-unique columns in unique index On 1 Mar 2017, at 8:21pm, David Raymond <david.raym...@tomtom.com> wrote: > The trigger version you asked about would look something like the below I > believe. More risky than having the two indexes, but should work. (Famous > last words) I have no doubt it would work. But look how much space and processing time it would take up. Far simpler and clearer just to create the two indexes. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
Thanks, thought about it. But I have a ROWID requirement as well due to foreign keys referring back to this table via the ROWID. I could in theory put that just the RowId in an additional index, but then the RowId index will just carry the entire UniqueCol again - back to the same problem, just reversed. To throw the specific design out there: This is a frequent read, infrequent update table: CREATE TABLE Blah( Id Integer Primary Key, UniqueCol blob, // 20 bytes fixed ExtraCol blob, // 12 bytes fixed UNIQUE(UniqueCol) ); // ~36 bytes/row CREATE INDEX sqlite_autoindex_Resource_1 on Blah ( // implicit index UniqueCol, // 20 bytes // Id -- implicit ) // ~24 bytes/row CREATE INDEX blahIndex on Blah ( // actual needed index for workload UniqueCol, // 20 bytes ExtraCol, // 12 bytes // Id -- implicit ) // ~36 bytes/row So this 3rd index is exactly as big as the original table. I would love to get rid of at least the 2 vs. 3 redundancy here. - Deon -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Clemens Ladisch Sent: Wednesday, March 1, 2017 9:58 AM To: sqlite-users@mailinglists.sqlite.org Subject: Re: [sqlite] Non-unique columns in unique index Deon Brewis wrote: > Is there way to add non-unique columns in a unique index? > > I would like to use the same index to enforce unique constraints, as well as > giving a covered result for other queries. > > CREATE UNIQUE INDEX indx ON Foo(UniqueCol) INCLUDE (ExtraCol) CREATE TABLE Foo ( UniqueCol PRIMARY KEY, ExtraCol, [...] ) WITHOUT ROWID; Regards, Clemens ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Non-unique columns in unique index
On 1 Mar 2017, at 5:00pm, Deon Brewiswrote: > Is there way to add non-unique columns in a unique index? I don’t know of anything that operates in that way. > If not, is there a way to efficiently implement a UNIQUE constraint in a > different way? (Trigger maybe?) Not efficiently. You could make a non-UNIQUE index which has the columns you want UNIQUE at the beginning. Then you could include a check for uniqueness in your own code. This would probably tempt SQLite to use that index when checking uniqueness. But this strikes me as less efficient than just letting SQLite do its own thing. But it would be far more efficient just to create two indexes, one for the UNIQUE and one for coverage. Or just create the UNIQUE index and let SQLite get the other columns itself. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users