The first problem with packing several integer fields into one is that you lose 
the capability of expressing NULL values in the packed fields without resorting 
to complicated and time consuming arithmetics modulo 257.

The next is that you lose the possibility of expressing values outside of the 
intended range. Even if, at inception, you think you will only need 3 bits 
(e.g. NULL and 7 discrete values), murphy's law guarantees that there will 
arise a need for an 8th discrete value and you will have to unpack and repack 
all of the rows. If you still have some bits left unused.

Another method of avoiding the "wasted" space of a rowid is to declare WITHOUT 
ROWID tables.

Arguably, comparing 8 bytes is faster than decoding n fields; but the record 
format of SQLite allows the range of the values to be compared first.

E.g. you have 8 subfields f1 to f8 and your constraint is "f1=1 and f2=2 and 
f3=4", which translates into "between 0x0102040000000000 and 0x010204ffffffffff"

The index cell would have a (partial) header of s1s2s3s4s5s6s7s8 and a 
(partial) payload of.[f1] [f2] [f3] [f4] [f5] [f6] [f7] [f8]. Analysis of the 
constraints yields that the values are all representable in 8 bits, giving a 
"signature" of s1=9 and s2=1 and s3=1 which serves to eliminate most of the 
keys by comparing 1 to 3 bytes, and needs to compare 1 or 2 more bytes (f2 and 
f3, because f1 is not present) to find a match.

Apart from the technical details, overly concerning yourself with the speed of 
query execution at the beginning of design can be considered "premature 
optimization". Getting the application to run correctly should be the primary 
focus. Once it does that, see if the performance is ok. If not, writing better 
queries that run efficiently is probably orders of magnitude more important 
than the difference between comparing "rowids" and "keys".

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von x
Gesendet: Donnerstag, 10. August 2017 13:19
An: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Betreff: Re: [sqlite] Packing integer primary key with field bits

Thanks for the replies. I’m not sure I agree with Gunter and Ryan though. I’m 
thinking about this more from the gain in speed rather than saving space.

To clarify, I’m suggesting replacing a compound key (made up of several integer 
cols) with an integer primary key (which sqlite will use rather than the row 
id). I have done my homework on this so I’m familiar with Gunter’s points 
regarding ‘between’ and ‘high end bits’ but will the between on a single 
integer key not be faster than matching on the first m fields of an n compound 
key? If an index is needed on any non-high bit col an expression index would 
work just as fast for lookups (I suppose inserts would be slower). The savings 
on space would contribute to the speed as each disk read would contain more 
records.

Even forgetting about keys, if you packed say 8 columns into one int64 column 
would you not be saving a minimum of 7 bits?


From: Hick Gunter<mailto:h...@scigames.at>
Sent: 10 August 2017 10:53
To: 'SQLite mailing list'<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Packing integer primary key with field bits

For the sake of the argument, let's try to devise a workable scheme for such an 
undertaking:

Lets assume you have a 32-bit "real rowid" and four 8-bit "value" fields.

How to distribute these in the 64-bit rowid?

Rrid low = MSB | v1 | v2 | v3 | v4 | r1r2r3r4 | LSB or Rrid high =  MSB | 
r1r2r3r4 | v1 | v2 | v3 | v4 | LSB?

Let's try to select a row by real rowid:

SELECT id FROM t WHERE id & 0xffff = 1234;

There is no index on "id & 0xffff" so this translates into a full table scan 
(even if limited to one row, it still requires on average 1/2 of the rows to be 
read). Creating an index is no option because we are trying to save space, 
remember?

SELECT id FROM t WHERE id BETWEEN (1234<<32) and (1234<<32 + 0xffff);

This only works with "real rowid high" ("real rowid low" requires 2^^32 ranges 
to work) and the lookup is fast, but the endpoints should better be calculated 
in the calling program.

So for a reasonably fast lookup by "real rowid", it needs to occupy the most 
significant bits.

Trying to select by one the values, by extension of the arguments given, will 
either be a nightmare, default to a full table scan or require a covering index 
(which requires more disk space than the 4-12 bytes per record we have "saved").

Additionally, if any of the statements returns more than one row for any "real 
rowid" then your table is shot up beyond repair...

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von x
Gesendet: Donnerstag, 10. August 2017 09:45
An: sqlite-users@mailinglists.sqlite.org
Betreff: [sqlite] Packing integer primary key with field bits

As in cramming numerous integer columns into a 64 bit integer.

Does anyone do this?

Is the speed gain worth the additional confusion of extracting the columns from 
the key?

How is it best accomplished (virtual table maybe)?
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to