Re: [sqlite] Packing integer primary key with field bits

2017-08-12 Thread x
Cheers Ryan.

From: R Smith<mailto:rsm...@rsweb.co.za>
Sent: 11 August 2017 11:14
To: 
sqlite-users@mailinglists.sqlite.org<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Packing integer primary key with field bits


On 2017/08/11 10:21 AM, x wrote:
> I’m wishing I had kept my mouth shut now 

Never do that.

We do see some questions asked ad-nauseum (or silly ones which could
have been answered by the fleetest of Google searches) and some people
probably find some irritation from it, but most of us don't and by far
the other 99% of questions are all valuable.

If you are wondering it, chances are someone else reading this also
wonders about it.

Case in point: Did you see how many people replied in other disconnected
threads about document typos (for instance) or further questions arising
from reading the very same file-format document we've discussed? Had you
kept your mouth shut, that might not have happened - so, thank you for
asking.

Good luck!
Ryan

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-11 Thread R Smith


On 2017/08/11 10:21 AM, x wrote:

I’m wishing I had kept my mouth shut now 


Never do that.

We do see some questions asked ad-nauseum (or silly ones which could 
have been answered by the fleetest of Google searches) and some people 
probably find some irritation from it, but most of us don't and by far 
the other 99% of questions are all valuable.


If you are wondering it, chances are someone else reading this also 
wonders about it.


Case in point: Did you see how many people replied in other disconnected 
threads about document typos (for instance) or further questions arising 
from reading the very same file-format document we've discussed? Had you 
kept your mouth shut, that might not have happened - so, thank you for 
asking.


Good luck!
Ryan

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-11 Thread x
I’m wishing I had kept my mouth shut now 

It’s starting to get a bit above my head as programming’s little more than a 
hobby for me. Although I’ve been using c++ builder for a long time I only got 
round to trying to learn c++ properly a couple of years back during a lengthy 
illness. I even had to ask on a forum how to run ‘main()’ so I could follow the 
online c++ lessons. This yearning for knowledge came about after the discovery 
of sqlite. After the demise of Paradox (anybody remember that) I hated having 
to use sql servers and finding this embedded gem, which created a single db 
file and whose code consisted of an amalgamation file, gave me a new lease of 
life. I still have a lot of problems with c code though.

This forum’s a god-send as well. Despite the number of sharp minds on here, and 
the complexities and incompatibilities they’re having to deal with, it’s 
generally very civilised and condescending free. If I started this thread on 
stackoverflow I’d probably have a minus reputation by now coupled with a severe 
inferiority complex.

If I decide to pursue this I’ll report back any findings. Thanks to all who 
contributed to the discussion.

Tom


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread Jens Alfke

> On Aug 10, 2017, at 5:53 PM, Richard Hipp  wrote:
> 
> See https://www.sqlite.org/src/file/ext/misc/compress.c 
> 

Thanks, that would be a good template to start from. But I recommend using 
Snappy instead of ZLib, unless file size is more important than performance.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread Richard Hipp
On 8/10/17, Jens Alfke  wrote:
>
> If I were going to use it, I’d implement a pair of SQL functions to
> compress/uncompress blobs,

See https://www.sqlite.org/src/file/ext/misc/compress.c

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread Jens Alfke

> On Aug 10, 2017, at 9:36 AM, Nico Williams  wrote:
> 
> Then proper compression (e.g., with zlib) will be much better for you
> than what you're trying to do.

zlib isn’t fast enough for usage like this. It’s better to use something like 
Snappy, which was designed to be very fast, at the expense of less compression. 
It works very well for database content.

If I were going to use it, I’d implement a pair of SQL functions to 
compress/uncompress blobs, then wrap calls to those around the SQL statements 
that read and write the column I wanted to compress.

(Of course this is total overkill for storing a couple of integers as discussed 
here.)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread R Smith

On 2017/08/10 5:54 PM, x wrote:

Ryan, I know what you’re saying about the list of bytes but I’d like it 
confirmed that the indexes are not stored in a format similar to the data.


Oh it's similar, just not the same. I looked it up:
http://sqlite.org/fileformat.html
Fascinating reading. Take special note of section 1.5 that discusses the 
B-Tree formats at length, and 2.1 for the Record format, which also 
clarifies exactly SQLite's varint type that explains how the Integer 
packing, which we've referred to in previous posts, happens.



Do you not think the cost of unpacking the bytes would be insignificant?


Never sure how to answer questions asked in the negative  YES, I 
don't think so...  or NO, I don't think so - which makes more sense?


Jokes aside, I agree, I do think it insignificant - I just don't think 
it's significantly more insignificant than the cost of SQLite unpacking 
the fields for you. My point is that it is already insignificant enough 
given the givens. Any improvement to be had has a very high EPR.


(That's my invented measure: *Effort-to-Pleasure Ratio*.  I strive to 
lower it for everything.)  :)


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread Bob Friesenhahn

On Thu, 10 Aug 2017, Clemens Ladisch wrote:


x wrote:

I’m thinking about this more from the gain in speed rather than saving space.


Database performance is usually limited by I/O, i.e., you gain speed by
saving space.


To be clear, database performance is usually limited by the number of 
I/O operations possible within a given amount of time (IOPS) rather 
than the bandwidth (bytes/second) available.


If the database is small enough to fit in the OS filesystem cache, 
then hardly any I/O operations to underlying store will be needed to 
satisfy requests.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread Nico Williams
On Thu, Aug 10, 2017 at 03:36:31PM +0200, Clemens Ladisch wrote:
> x wrote:
> > I’m thinking about this more from the gain in speed rather than saving 
> > space.
> 
> Database performance is usually limited by I/O, i.e., you gain speed by
> saving space.

Then proper compression (e.g., with zlib) will be much better for you
than what you're trying to do.

I don't know how to get SQLite3 DBs compressed though, short of writing
your own VFS (one might already exist) or hosting on ZFS (which works
very well).

Nico
-- 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread x
Ryan, I know what you’re saying about the list of bytes but I’d like it 
confirmed that the indexes are not stored in a format similar to the data.

Do you not think the cost of unpacking the bytes would be insignificant?

Another possibility that should be considered is that if the database is 
reduced considerably it might fit in memory or, at the vey least, memory caches 
will be able to hold a lot more records.

Tom

From: R Smith<mailto:rsm...@rsweb.co.za>
Sent: 10 August 2017 16:24
To: 
sqlite-users@mailinglists.sqlite.org<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Packing integer primary key with field bits



On 2017/08/10 4:51 PM, x wrote:
> Valid point on the intended range Gunter. I don’t know enough about sqlite to 
> fully understand your index cell paragraph. I thought the way sqlite worked 
> was e.g. to get the value of the 3rd column it had to read the lengths of 
> col1 & col2 so it knows where col 3 value starts (I’ve seen a comment that 
> the most retrieved cols should be at the start of the record)

This is true, but the cost of "reading" the bytes to get to the values
are no worse than what your cost would be for reading the Int64 and
unpacking the bytes from there... SQLite internally does the same thing
only without using an Int64, it just uses an array of bytes, but gleans
the values equally efficiently. If it was better to use Int64 values in
stead of a byte-array, SQLite would already be using it. By the way,
Int64 is just another way of saying "list of 8 bytes", which is just
another way of saying "list of 64 bits", there is nothing magically more
efficient about 64bit INT than a list of 8 bytes. Memory is memory.
Your idea might offer savings in that you predetermine the length of
stored values and ensure they fit into that list of 8 bytes, whereas
SQLite stores also a Length, but has the vastly superior property of
being able to store any size of any type of variable. Again, the speed
gain is dubious for the price it comes at.

To make matters worse, the real bottleneck is usually the storage layer,
and to read even 1 byte from a row, or Index, usually requires reading
an entire File-System Page (Typically ~4 Thousand bytes) but some
successive reads are usually gotten from the same page to make matters
more efficient. The point being, the reading of that File system page is
several magnitudes more time-intensive than unpacking the bytes. You
will not save anything much speed-wise, and very little space-wise, and
then only for smaller row-lengths.

By the way, the same goes for reading your own array of bytes from the
storage medium - the hope is of course that some good caching at both
the storage layer and your application layer would smooth out read/write
waits.

>   and that indexes are stored the same way.

This is not quite true. The Index is (usually) a B-Tree structure with
consistent size value indices. An Index is built for speed, not storage
efficiency. Adding an Index effectively copies the entire column(s) the
Index refers to into slightly less efficient (but more accessible)
storage - It's much worse space-wise than simply using separate fields
and no Index.
[Note: I am not 100% sure about the BTree storage format in SQLite's
case, I refer here to other DB Engines, but I would be surprised if an
SQLite BTree Index is stored exactly the same as the table Row format.]


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread x
Paul, The record size will be fairly small. Probably under 100 bytes. Agree 
with you on your reduced disk I/o point, particularly as the index size might 
be significantly smaller than the full blown index size.


From: Paul Sanderson<mailto:sandersonforens...@gmail.com>
Sent: 10 August 2017 16:15
To: SQLite mailing list<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Packing integer primary key with field bits

Space savings will depend very much on what other data is in the table.

If you have a 4096 byte page size and with an average record size of 1000
bytes then saving 7 bytes for each of the 4 records wont free up enough
space to fit a new record into that page. So savings in this scenario will
effectively be nil.

If on the otherhand the average record is 100 bytes you may well fit more
records into the page, conversely changing the page size to 64K would also
reduce the number of reads.

I suspect that biggest time savings may be gained by reducing disk I/O.

Better advice could possibly be given if we know the full table schema
including typical sizes for data in any fields/

Paul
www.sandersonforensics.com<http://www.sandersonforensics.com>
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence

On 10 August 2017 at 14:13, R Smith <rsm...@rsweb.co.za> wrote:

> On 2017/08/10 1:19 PM, x wrote:
>
>> Thanks for the replies. I’m not sure I agree with Gunter and Ryan though.
>> I’m thinking about this more from the gain in speed rather than saving
>> space.
>>
>> To clarify, I’m suggesting replacing a compound key (made up of several
>> integer cols) with an integer primary key (which sqlite will use rather
>> than the row id). I have done my homework on this so I’m familiar with
>> Gunter’s points regarding ‘between’ and ‘high end bits’ but will the
>> between on a single integer key not be faster than matching on the first m
>> fields of an n compound key? If an index is needed on any non-high bit col
>> an expression index would work just as fast for lookups (I suppose inserts
>> would be slower). The savings on space would contribute to the speed as
>> each disk read would contain more records.
>>
>
> Ok, if you require ALL the packed records all the time, and will always
> access it by the primary value (the first of the packed values) and is very
> very sure you won't ever need expanding the value range, then you might
> actually get a speed gain from it.
>
> Problem is, the gain will be minuscule, and the price is high. Lots of
> development time, loss of useful SQL aggregates and other functionality,
> possible future reworks... All of that for a very small speed gain?  If you
> are wanting that, why not simply use a custom structure and avoid SQLite
> completely? The speed gain will actually be significant then, and you're
> going to lose the SQL-ness of it anyway, so that shouldn't matter.
>
> A structured array mapped to a physical byte-stream will be several times
> faster than SQLite (or any other RDBMS for that matter).  SQL as supported
> by the average RDBMS is only really helpful when you are looking for
> SET-type relational data handling or very large data (and your use case is
> specifically not for large data). Most RDBMSes have great optimizations for
> speeding up resolving of relational-type questions and their every-day-use
> advantages are legion, they are however without exception NOT faster than -
> NOR intended to be faster than - simple byte/structured array handling.
>
> You might even find a synergy between using your own structured array
> together with an SQLite DB which only get accessed once you need more
> information than persists in the array itself - it's easy to make a pilot
> and test the speed gains. And please do that before investing the time to
> develop a fully fledged dual system.
>
>
>
>> Even forgetting about keys, if you packed say 8 columns into one int64
>> column would you not be saving a minimum of 7 bits?
>>
>
> No you won't, SQLite stores Integer much more efficiently. Unless you mean
> use ONLY the 64-bit index and not storing the values in separate fields in
> the DB at all, in which case yes, you will save a few bytes, possibly less
> than 7 though (I need to affirm the exact number, don't know off the
> top...).
>
> Cheers,
> Ryan
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users

Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread R Smith



On 2017/08/10 4:51 PM, x wrote:

Valid point on the intended range Gunter. I don’t know enough about sqlite to fully 
understand your index cell paragraph. I thought the way sqlite worked was e.g. to 
get the value of the 3rd column it had to read the lengths of col1 & col2 so it 
knows where col 3 value starts (I’ve seen a comment that the most retrieved cols 
should be at the start of the record)


This is true, but the cost of "reading" the bytes to get to the values 
are no worse than what your cost would be for reading the Int64 and 
unpacking the bytes from there... SQLite internally does the same thing 
only without using an Int64, it just uses an array of bytes, but gleans 
the values equally efficiently. If it was better to use Int64 values in 
stead of a byte-array, SQLite would already be using it. By the way, 
Int64 is just another way of saying "list of 8 bytes", which is just 
another way of saying "list of 64 bits", there is nothing magically more 
efficient about 64bit INT than a list of 8 bytes. Memory is memory.
Your idea might offer savings in that you predetermine the length of 
stored values and ensure they fit into that list of 8 bytes, whereas 
SQLite stores also a Length, but has the vastly superior property of 
being able to store any size of any type of variable. Again, the speed 
gain is dubious for the price it comes at.


To make matters worse, the real bottleneck is usually the storage layer, 
and to read even 1 byte from a row, or Index, usually requires reading 
an entire File-System Page (Typically ~4 Thousand bytes) but some 
successive reads are usually gotten from the same page to make matters 
more efficient. The point being, the reading of that File system page is 
several magnitudes more time-intensive than unpacking the bytes. You 
will not save anything much speed-wise, and very little space-wise, and 
then only for smaller row-lengths.


By the way, the same goes for reading your own array of bytes from the 
storage medium - the hope is of course that some good caching at both 
the storage layer and your application layer would smooth out read/write 
waits.



  and that indexes are stored the same way.


This is not quite true. The Index is (usually) a B-Tree structure with 
consistent size value indices. An Index is built for speed, not storage 
efficiency. Adding an Index effectively copies the entire column(s) the 
Index refers to into slightly less efficient (but more accessible) 
storage - It's much worse space-wise than simply using separate fields 
and no Index.
[Note: I am not 100% sure about the BTree storage format in SQLite's 
case, I refer here to other DB Engines, but I would be surprised if an 
SQLite BTree Index is stored exactly the same as the table Row format.]



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread Paul Sanderson
Space savings will depend very much on what other data is in the table.

If you have a 4096 byte page size and with an average record size of 1000
bytes then saving 7 bytes for each of the 4 records wont free up enough
space to fit a new record into that page. So savings in this scenario will
effectively be nil.

If on the otherhand the average record is 100 bytes you may well fit more
records into the page, conversely changing the page size to 64K would also
reduce the number of reads.

I suspect that biggest time savings may be gained by reducing disk I/O.

Better advice could possibly be given if we know the full table schema
including typical sizes for data in any fields/

Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence

On 10 August 2017 at 14:13, R Smith  wrote:

> On 2017/08/10 1:19 PM, x wrote:
>
>> Thanks for the replies. I’m not sure I agree with Gunter and Ryan though.
>> I’m thinking about this more from the gain in speed rather than saving
>> space.
>>
>> To clarify, I’m suggesting replacing a compound key (made up of several
>> integer cols) with an integer primary key (which sqlite will use rather
>> than the row id). I have done my homework on this so I’m familiar with
>> Gunter’s points regarding ‘between’ and ‘high end bits’ but will the
>> between on a single integer key not be faster than matching on the first m
>> fields of an n compound key? If an index is needed on any non-high bit col
>> an expression index would work just as fast for lookups (I suppose inserts
>> would be slower). The savings on space would contribute to the speed as
>> each disk read would contain more records.
>>
>
> Ok, if you require ALL the packed records all the time, and will always
> access it by the primary value (the first of the packed values) and is very
> very sure you won't ever need expanding the value range, then you might
> actually get a speed gain from it.
>
> Problem is, the gain will be minuscule, and the price is high. Lots of
> development time, loss of useful SQL aggregates and other functionality,
> possible future reworks... All of that for a very small speed gain?  If you
> are wanting that, why not simply use a custom structure and avoid SQLite
> completely? The speed gain will actually be significant then, and you're
> going to lose the SQL-ness of it anyway, so that shouldn't matter.
>
> A structured array mapped to a physical byte-stream will be several times
> faster than SQLite (or any other RDBMS for that matter).  SQL as supported
> by the average RDBMS is only really helpful when you are looking for
> SET-type relational data handling or very large data (and your use case is
> specifically not for large data). Most RDBMSes have great optimizations for
> speeding up resolving of relational-type questions and their every-day-use
> advantages are legion, they are however without exception NOT faster than -
> NOR intended to be faster than - simple byte/structured array handling.
>
> You might even find a synergy between using your own structured array
> together with an SQLite DB which only get accessed once you need more
> information than persists in the array itself - it's easy to make a pilot
> and test the speed gains. And please do that before investing the time to
> develop a fully fledged dual system.
>
>
>
>> Even forgetting about keys, if you packed say 8 columns into one int64
>> column would you not be saving a minimum of 7 bits?
>>
>
> No you won't, SQLite stores Integer much more efficiently. Unless you mean
> use ONLY the 64-bit index and not storing the values in separate fields in
> the DB at all, in which case yes, you will save a few bytes, possibly less
> than 7 though (I need to affirm the exact number, don't know off the
> top...).
>
> Cheers,
> Ryan
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread x
Valid point on the intended range Gunter. I don’t know enough about sqlite to 
fully understand your index cell paragraph. I thought the way sqlite worked was 
e.g. to get the value of the 3rd column it had to read the lengths of col1 & 
col2 so it knows where col 3 value starts (I’ve seen a comment that the most 
retrieved cols should be at the start of the record) and that indexes are 
stored the same way. If that’s the case would having to consider a single int 
not always be faster than considering the first few cols (although I suppose it 
only has to consider columns 2 and up to settle ties on col 1).

Ryan, you might be right on the structured array. Re the 8 cols packed 
together, I’m thinking it would save at least 7 bytes because sqlite would only 
require one ‘length byte’ (I’m unsure of the terminology), as opposed to 8, in 
the record.

Clemens, I was referring to bit shifting with respect to indexes homework. I’ve 
yet to test anything. Indeed, in this thread, I’m trying to determine if such 
testing is a waste of time.

Tom



From: Hick Gunter<mailto:h...@scigames.at>
Sent: 10 August 2017 13:40
To: 'SQLite mailing list'<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Packing integer primary key with field bits

The first problem with packing several integer fields into one is that you lose 
the capability of expressing NULL values in the packed fields without resorting 
to complicated and time consuming arithmetics modulo 257.

The next is that you lose the possibility of expressing values outside of the 
intended range. Even if, at inception, you think you will only need 3 bits 
(e.g. NULL and 7 discrete values), murphy's law guarantees that there will 
arise a need for an 8th discrete value and you will have to unpack and repack 
all of the rows. If you still have some bits left unused.

Another method of avoiding the "wasted" space of a rowid is to declare WITHOUT 
ROWID tables.

Arguably, comparing 8 bytes is faster than decoding n fields; but the record 
format of SQLite allows the range of the values to be compared first.

E.g. you have 8 subfields f1 to f8 and your constraint is "f1=1 and f2=2 and 
f3=4", which translates into "between 0x01020400 and 0x010204ff"

The index cell would have a (partial) header of s1s2s3s4s5s6s7s8 and a 
(partial) payload of.[f1] [f2] [f3] [f4] [f5] [f6] [f7] [f8]. Analysis of the 
constraints yields that the values are all representable in 8 bits, giving a 
"signature" of s1=9 and s2=1 and s3=1 which serves to eliminate most of the 
keys by comparing 1 to 3 bytes, and needs to compare 1 or 2 more bytes (f2 and 
f3, because f1 is not present) to find a match.

Apart from the technical details, overly concerning yourself with the speed of 
query execution at the beginning of design can be considered "premature 
optimization". Getting the application to run correctly should be the primary 
focus. Once it does that, see if the performance is ok. If not, writing better 
queries that run efficiently is probably orders of magnitude more important 
than the difference between comparing "rowids" and "keys".

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von x
Gesendet: Donnerstag, 10. August 2017 13:19
An: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Betreff: Re: [sqlite] Packing integer primary key with field bits

Thanks for the replies. I’m not sure I agree with Gunter and Ryan though. I’m 
thinking about this more from the gain in speed rather than saving space.

To clarify, I’m suggesting replacing a compound key (made up of several integer 
cols) with an integer primary key (which sqlite will use rather than the row 
id). I have done my homework on this so I’m familiar with Gunter’s points 
regarding ‘between’ and ‘high end bits’ but will the between on a single 
integer key not be faster than matching on the first m fields of an n compound 
key? If an index is needed on any non-high bit col an expression index would 
work just as fast for lookups (I suppose inserts would be slower). The savings 
on space would contribute to the speed as each disk read would contain more 
records.

Even forgetting about keys, if you packed say 8 columns into one int64 column 
would you not be saving a minimum of 7 bits?


From: Hick Gunter<mailto:h...@scigames.at>
Sent: 10 August 2017 10:53
To: 'SQLite mailing list'<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Packing integer primary key with field bits

For the sake of the argument, let's try to devise a workable scheme for such an 
undertaking:

Lets assume you have a 32-bit "real rowid" and four 8-bit "value" fields.

How to distribute these in the 64-bit rowid?

Rrid low = MSB | v1 | v2 | v3 | v4 | r1r2r3r4 | LSB or Rrid high =  MSB | 
r1r2r3r4 | v1 | 

Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread Clemens Ladisch
x wrote:
> I’m thinking about this more from the gain in speed rather than saving space.

Database performance is usually limited by I/O, i.e., you gain speed by
saving space.

> I have done my homework on this

So what are the results of your measurements?


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread R Smith

On 2017/08/10 1:19 PM, x wrote:

Thanks for the replies. I’m not sure I agree with Gunter and Ryan though. I’m 
thinking about this more from the gain in speed rather than saving space.

To clarify, I’m suggesting replacing a compound key (made up of several integer 
cols) with an integer primary key (which sqlite will use rather than the row 
id). I have done my homework on this so I’m familiar with Gunter’s points 
regarding ‘between’ and ‘high end bits’ but will the between on a single 
integer key not be faster than matching on the first m fields of an n compound 
key? If an index is needed on any non-high bit col an expression index would 
work just as fast for lookups (I suppose inserts would be slower). The savings 
on space would contribute to the speed as each disk read would contain more 
records.


Ok, if you require ALL the packed records all the time, and will always 
access it by the primary value (the first of the packed values) and is 
very very sure you won't ever need expanding the value range, then you 
might actually get a speed gain from it.


Problem is, the gain will be minuscule, and the price is high. Lots of 
development time, loss of useful SQL aggregates and other functionality, 
possible future reworks... All of that for a very small speed gain?  If 
you are wanting that, why not simply use a custom structure and avoid 
SQLite completely? The speed gain will actually be significant then, and 
you're going to lose the SQL-ness of it anyway, so that shouldn't matter.


A structured array mapped to a physical byte-stream will be several 
times faster than SQLite (or any other RDBMS for that matter).  SQL as 
supported by the average RDBMS is only really helpful when you are 
looking for SET-type relational data handling or very large data (and 
your use case is specifically not for large data). Most RDBMSes have 
great optimizations for speeding up resolving of relational-type 
questions and their every-day-use advantages are legion, they are 
however without exception NOT faster than - NOR intended to be faster 
than - simple byte/structured array handling.


You might even find a synergy between using your own structured array 
together with an SQLite DB which only get accessed once you need more 
information than persists in the array itself - it's easy to make a 
pilot and test the speed gains. And please do that before investing the 
time to develop a fully fledged dual system.





Even forgetting about keys, if you packed say 8 columns into one int64 column 
would you not be saving a minimum of 7 bits?


No you won't, SQLite stores Integer much more efficiently. Unless you 
mean use ONLY the 64-bit index and not storing the values in separate 
fields in the DB at all, in which case yes, you will save a few bytes, 
possibly less than 7 though (I need to affirm the exact number, don't 
know off the top...).


Cheers,
Ryan

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread Hick Gunter
The first problem with packing several integer fields into one is that you lose 
the capability of expressing NULL values in the packed fields without resorting 
to complicated and time consuming arithmetics modulo 257.

The next is that you lose the possibility of expressing values outside of the 
intended range. Even if, at inception, you think you will only need 3 bits 
(e.g. NULL and 7 discrete values), murphy's law guarantees that there will 
arise a need for an 8th discrete value and you will have to unpack and repack 
all of the rows. If you still have some bits left unused.

Another method of avoiding the "wasted" space of a rowid is to declare WITHOUT 
ROWID tables.

Arguably, comparing 8 bytes is faster than decoding n fields; but the record 
format of SQLite allows the range of the values to be compared first.

E.g. you have 8 subfields f1 to f8 and your constraint is "f1=1 and f2=2 and 
f3=4", which translates into "between 0x01020400 and 0x010204ff"

The index cell would have a (partial) header of s1s2s3s4s5s6s7s8 and a 
(partial) payload of.[f1] [f2] [f3] [f4] [f5] [f6] [f7] [f8]. Analysis of the 
constraints yields that the values are all representable in 8 bits, giving a 
"signature" of s1=9 and s2=1 and s3=1 which serves to eliminate most of the 
keys by comparing 1 to 3 bytes, and needs to compare 1 or 2 more bytes (f2 and 
f3, because f1 is not present) to find a match.

Apart from the technical details, overly concerning yourself with the speed of 
query execution at the beginning of design can be considered "premature 
optimization". Getting the application to run correctly should be the primary 
focus. Once it does that, see if the performance is ok. If not, writing better 
queries that run efficiently is probably orders of magnitude more important 
than the difference between comparing "rowids" and "keys".

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von x
Gesendet: Donnerstag, 10. August 2017 13:19
An: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Betreff: Re: [sqlite] Packing integer primary key with field bits

Thanks for the replies. I’m not sure I agree with Gunter and Ryan though. I’m 
thinking about this more from the gain in speed rather than saving space.

To clarify, I’m suggesting replacing a compound key (made up of several integer 
cols) with an integer primary key (which sqlite will use rather than the row 
id). I have done my homework on this so I’m familiar with Gunter’s points 
regarding ‘between’ and ‘high end bits’ but will the between on a single 
integer key not be faster than matching on the first m fields of an n compound 
key? If an index is needed on any non-high bit col an expression index would 
work just as fast for lookups (I suppose inserts would be slower). The savings 
on space would contribute to the speed as each disk read would contain more 
records.

Even forgetting about keys, if you packed say 8 columns into one int64 column 
would you not be saving a minimum of 7 bits?


From: Hick Gunter<mailto:h...@scigames.at>
Sent: 10 August 2017 10:53
To: 'SQLite mailing list'<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Packing integer primary key with field bits

For the sake of the argument, let's try to devise a workable scheme for such an 
undertaking:

Lets assume you have a 32-bit "real rowid" and four 8-bit "value" fields.

How to distribute these in the 64-bit rowid?

Rrid low = MSB | v1 | v2 | v3 | v4 | r1r2r3r4 | LSB or Rrid high =  MSB | 
r1r2r3r4 | v1 | v2 | v3 | v4 | LSB?

Let's try to select a row by real rowid:

SELECT id FROM t WHERE id & 0x = 1234;

There is no index on "id & 0x" so this translates into a full table scan 
(even if limited to one row, it still requires on average 1/2 of the rows to be 
read). Creating an index is no option because we are trying to save space, 
remember?

SELECT id FROM t WHERE id BETWEEN (1234<<32) and (1234<<32 + 0x);

This only works with "real rowid high" ("real rowid low" requires 2^^32 ranges 
to work) and the lookup is fast, but the endpoints should better be calculated 
in the calling program.

So for a reasonably fast lookup by "real rowid", it needs to occupy the most 
significant bits.

Trying to select by one the values, by extension of the arguments given, will 
either be a nightmare, default to a full table scan or require a covering index 
(which requires more disk space than the 4-12 bytes per record we have "saved").

Additionally, if any of the statements returns more than one row for any "real 
rowid" then your table is shot up beyond repair...

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von x

Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread x
Saving a minimum of 7 bytes I meant.

From: x<mailto:tam118...@hotmail.com>
Sent: 10 August 2017 12:19
To: SQLite mailing list<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Packing integer primary key with field bits

Thanks for the replies. I’m not sure I agree with Gunter and Ryan though. I’m 
thinking about this more from the gain in speed rather than saving space.

To clarify, I’m suggesting replacing a compound key (made up of several integer 
cols) with an integer primary key (which sqlite will use rather than the row 
id). I have done my homework on this so I’m familiar with Gunter’s points 
regarding ‘between’ and ‘high end bits’ but will the between on a single 
integer key not be faster than matching on the first m fields of an n compound 
key? If an index is needed on any non-high bit col an expression index would 
work just as fast for lookups (I suppose inserts would be slower). The savings 
on space would contribute to the speed as each disk read would contain more 
records.

Even forgetting about keys, if you packed say 8 columns into one int64 column 
would you not be saving a minimum of 7 bits?


From: Hick Gunter<mailto:h...@scigames.at>
Sent: 10 August 2017 10:53
To: 'SQLite mailing list'<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Packing integer primary key with field bits

For the sake of the argument, let's try to devise a workable scheme for such an 
undertaking:

Lets assume you have a 32-bit "real rowid" and four 8-bit "value" fields.

How to distribute these in the 64-bit rowid?

Rrid low = MSB | v1 | v2 | v3 | v4 | r1r2r3r4 | LSB or
Rrid high =  MSB | r1r2r3r4 | v1 | v2 | v3 | v4 | LSB?

Let's try to select a row by real rowid:

SELECT id FROM t WHERE id & 0x = 1234;

There is no index on "id & 0x" so this translates into a full table scan 
(even if limited to one row, it still requires on average 1/2 of the rows to be 
read). Creating an index is no option because we are trying to save space, 
remember?

SELECT id FROM t WHERE id BETWEEN (1234<<32) and (1234<<32 + 0x);

This only works with "real rowid high" ("real rowid low" requires 2^^32 ranges 
to work) and the lookup is fast, but the endpoints should better be calculated 
in the calling program.

So for a reasonably fast lookup by "real rowid", it needs to occupy the most 
significant bits.

Trying to select by one the values, by extension of the arguments given, will 
either be a nightmare, default to a full table scan or require a covering index 
(which requires more disk space than the 4-12 bytes per record we have "saved").

Additionally, if any of the statements returns more than one row for any "real 
rowid" then your table is shot up beyond repair...

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von x
Gesendet: Donnerstag, 10. August 2017 09:45
An: sqlite-users@mailinglists.sqlite.org
Betreff: [sqlite] Packing integer primary key with field bits

As in cramming numerous integer columns into a 64 bit integer.

Does anyone do this?

Is the speed gain worth the additional confusion of extracting the columns from 
the key?

How is it best accomplished (virtual table maybe)?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread x
Thanks for the replies. I’m not sure I agree with Gunter and Ryan though. I’m 
thinking about this more from the gain in speed rather than saving space.

To clarify, I’m suggesting replacing a compound key (made up of several integer 
cols) with an integer primary key (which sqlite will use rather than the row 
id). I have done my homework on this so I’m familiar with Gunter’s points 
regarding ‘between’ and ‘high end bits’ but will the between on a single 
integer key not be faster than matching on the first m fields of an n compound 
key? If an index is needed on any non-high bit col an expression index would 
work just as fast for lookups (I suppose inserts would be slower). The savings 
on space would contribute to the speed as each disk read would contain more 
records.

Even forgetting about keys, if you packed say 8 columns into one int64 column 
would you not be saving a minimum of 7 bits?


From: Hick Gunter<mailto:h...@scigames.at>
Sent: 10 August 2017 10:53
To: 'SQLite mailing list'<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Packing integer primary key with field bits

For the sake of the argument, let's try to devise a workable scheme for such an 
undertaking:

Lets assume you have a 32-bit "real rowid" and four 8-bit "value" fields.

How to distribute these in the 64-bit rowid?

Rrid low = MSB | v1 | v2 | v3 | v4 | r1r2r3r4 | LSB or
Rrid high =  MSB | r1r2r3r4 | v1 | v2 | v3 | v4 | LSB?

Let's try to select a row by real rowid:

SELECT id FROM t WHERE id & 0x = 1234;

There is no index on "id & 0x" so this translates into a full table scan 
(even if limited to one row, it still requires on average 1/2 of the rows to be 
read). Creating an index is no option because we are trying to save space, 
remember?

SELECT id FROM t WHERE id BETWEEN (1234<<32) and (1234<<32 + 0x);

This only works with "real rowid high" ("real rowid low" requires 2^^32 ranges 
to work) and the lookup is fast, but the endpoints should better be calculated 
in the calling program.

So for a reasonably fast lookup by "real rowid", it needs to occupy the most 
significant bits.

Trying to select by one the values, by extension of the arguments given, will 
either be a nightmare, default to a full table scan or require a covering index 
(which requires more disk space than the 4-12 bytes per record we have "saved").

Additionally, if any of the statements returns more than one row for any "real 
rowid" then your table is shot up beyond repair...

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von x
Gesendet: Donnerstag, 10. August 2017 09:45
An: sqlite-users@mailinglists.sqlite.org
Betreff: [sqlite] Packing integer primary key with field bits

As in cramming numerous integer columns into a 64 bit integer.

Does anyone do this?

Is the speed gain worth the additional confusion of extracting the columns from 
the key?

How is it best accomplished (virtual table maybe)?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread Bart Smissaert
Actually looking at the subject title it looks he was.
Sorry.

RBS

On Thu, Aug 10, 2017 at 11:43 AM, Bart Smissaert 
wrote:

> Not sure the OP wanted to touch the rowid/integer primary key.
> I think he just was contemplating putting a number of integers in one
> single, normal integer column.
> Might be mistaken there.
>
> RBS
>
> On Thu, Aug 10, 2017 at 10:53 AM, Hick Gunter  wrote:
>
>> For the sake of the argument, let's try to devise a workable scheme for
>> such an undertaking:
>>
>> Lets assume you have a 32-bit "real rowid" and four 8-bit "value" fields.
>>
>> How to distribute these in the 64-bit rowid?
>>
>> Rrid low = MSB | v1 | v2 | v3 | v4 | r1r2r3r4 | LSB or
>> Rrid high =  MSB | r1r2r3r4 | v1 | v2 | v3 | v4 | LSB?
>>
>> Let's try to select a row by real rowid:
>>
>> SELECT id FROM t WHERE id & 0x = 1234;
>>
>> There is no index on "id & 0x" so this translates into a full table
>> scan (even if limited to one row, it still requires on average 1/2 of the
>> rows to be read). Creating an index is no option because we are trying to
>> save space, remember?
>>
>> SELECT id FROM t WHERE id BETWEEN (1234<<32) and (1234<<32 + 0x);
>>
>> This only works with "real rowid high" ("real rowid low" requires 2^^32
>> ranges to work) and the lookup is fast, but the endpoints should better be
>> calculated in the calling program.
>>
>> So for a reasonably fast lookup by "real rowid", it needs to occupy the
>> most significant bits.
>>
>> Trying to select by one the values, by extension of the arguments given,
>> will either be a nightmare, default to a full table scan or require a
>> covering index (which requires more disk space than the 4-12 bytes per
>> record we have "saved").
>>
>> Additionally, if any of the statements returns more than one row for any
>> "real rowid" then your table is shot up beyond repair...
>>
>> -Ursprüngliche Nachricht-
>> Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
>> Im Auftrag von x
>> Gesendet: Donnerstag, 10. August 2017 09:45
>> An: sqlite-users@mailinglists.sqlite.org
>> Betreff: [sqlite] Packing integer primary key with field bits
>>
>> As in cramming numerous integer columns into a 64 bit integer.
>>
>> Does anyone do this?
>>
>> Is the speed gain worth the additional confusion of extracting the
>> columns from the key?
>>
>> How is it best accomplished (virtual table maybe)?
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>>
>> ___
>>  Gunter Hick
>> Software Engineer
>> Scientific Games International GmbH
>> FN 157284 a, HG Wien
>> Klitschgasse 2-4, A-1130 Vienna, Austria
>> Tel: +43 1 80100 0
>> E-Mail: h...@scigames.at
>>
>> This communication (including any attachments) is intended for the use of
>> the intended recipient(s) only and may contain information that is
>> confidential, privileged or legally protected. Any unauthorized use or
>> dissemination of this communication is strictly prohibited. If you have
>> received this communication in error, please immediately notify the sender
>> by return e-mail message and delete all copies of the original
>> communication. Thank you for your cooperation.
>>
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread Bart Smissaert
Not sure the OP wanted to touch the rowid/integer primary key.
I think he just was contemplating putting a number of integers in one
single, normal integer column.
Might be mistaken there.

RBS

On Thu, Aug 10, 2017 at 10:53 AM, Hick Gunter  wrote:

> For the sake of the argument, let's try to devise a workable scheme for
> such an undertaking:
>
> Lets assume you have a 32-bit "real rowid" and four 8-bit "value" fields.
>
> How to distribute these in the 64-bit rowid?
>
> Rrid low = MSB | v1 | v2 | v3 | v4 | r1r2r3r4 | LSB or
> Rrid high =  MSB | r1r2r3r4 | v1 | v2 | v3 | v4 | LSB?
>
> Let's try to select a row by real rowid:
>
> SELECT id FROM t WHERE id & 0x = 1234;
>
> There is no index on "id & 0x" so this translates into a full table
> scan (even if limited to one row, it still requires on average 1/2 of the
> rows to be read). Creating an index is no option because we are trying to
> save space, remember?
>
> SELECT id FROM t WHERE id BETWEEN (1234<<32) and (1234<<32 + 0x);
>
> This only works with "real rowid high" ("real rowid low" requires 2^^32
> ranges to work) and the lookup is fast, but the endpoints should better be
> calculated in the calling program.
>
> So for a reasonably fast lookup by "real rowid", it needs to occupy the
> most significant bits.
>
> Trying to select by one the values, by extension of the arguments given,
> will either be a nightmare, default to a full table scan or require a
> covering index (which requires more disk space than the 4-12 bytes per
> record we have "saved").
>
> Additionally, if any of the statements returns more than one row for any
> "real rowid" then your table is shot up beyond repair...
>
> -Ursprüngliche Nachricht-
> Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> Im Auftrag von x
> Gesendet: Donnerstag, 10. August 2017 09:45
> An: sqlite-users@mailinglists.sqlite.org
> Betreff: [sqlite] Packing integer primary key with field bits
>
> As in cramming numerous integer columns into a 64 bit integer.
>
> Does anyone do this?
>
> Is the speed gain worth the additional confusion of extracting the columns
> from the key?
>
> How is it best accomplished (virtual table maybe)?
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ___
>  Gunter Hick
> Software Engineer
> Scientific Games International GmbH
> FN 157284 a, HG Wien
> Klitschgasse 2-4, A-1130 Vienna, Austria
> Tel: +43 1 80100 0
> E-Mail: h...@scigames.at
>
> This communication (including any attachments) is intended for the use of
> the intended recipient(s) only and may contain information that is
> confidential, privileged or legally protected. Any unauthorized use or
> dissemination of this communication is strictly prohibited. If you have
> received this communication in error, please immediately notify the sender
> by return e-mail message and delete all copies of the original
> communication. Thank you for your cooperation.
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread Hick Gunter
For the sake of the argument, let's try to devise a workable scheme for such an 
undertaking:

Lets assume you have a 32-bit "real rowid" and four 8-bit "value" fields.

How to distribute these in the 64-bit rowid?

Rrid low = MSB | v1 | v2 | v3 | v4 | r1r2r3r4 | LSB or
Rrid high =  MSB | r1r2r3r4 | v1 | v2 | v3 | v4 | LSB?

Let's try to select a row by real rowid:

SELECT id FROM t WHERE id & 0x = 1234;

There is no index on "id & 0x" so this translates into a full table scan 
(even if limited to one row, it still requires on average 1/2 of the rows to be 
read). Creating an index is no option because we are trying to save space, 
remember?

SELECT id FROM t WHERE id BETWEEN (1234<<32) and (1234<<32 + 0x);

This only works with "real rowid high" ("real rowid low" requires 2^^32 ranges 
to work) and the lookup is fast, but the endpoints should better be calculated 
in the calling program.

So for a reasonably fast lookup by "real rowid", it needs to occupy the most 
significant bits.

Trying to select by one the values, by extension of the arguments given, will 
either be a nightmare, default to a full table scan or require a covering index 
(which requires more disk space than the 4-12 bytes per record we have "saved").

Additionally, if any of the statements returns more than one row for any "real 
rowid" then your table is shot up beyond repair...

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von x
Gesendet: Donnerstag, 10. August 2017 09:45
An: sqlite-users@mailinglists.sqlite.org
Betreff: [sqlite] Packing integer primary key with field bits

As in cramming numerous integer columns into a 64 bit integer.

Does anyone do this?

Is the speed gain worth the additional confusion of extracting the columns from 
the key?

How is it best accomplished (virtual table maybe)?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread R Smith

The "anyone" that does this already is called "SQLite".

It doesn't amalgamate the INT fields as such, but it stores INTEGER 
values in only as much bits as is needed, so it achieves the goal of 
being more memory-efficient. So a 0, 1 or 2 takes up much less space 
than say a 786587626 or other high-value integer.


It saves a lot more than is immediately evident, because it turns out in 
*most* cases, *most* tables with INT columns/values have *most* values 
in the low-end of the Integer value spectrum. Especially for Indexes, 
Key relations and the like.


The way you refer to, is simply a form of compression, and it has the 
advantage of saving space, but the common weakness of all compression 
algorithms - It takes CPU cycles, i.e. Work. You asked about "speed" 
gain - there is NO speed gain, there might be a bit of space gain, but 
there is a Speed loss. Perhaps you think that because a processor is 
natively 64bit (these days anyway) it will transfer 64bit integers just 
as fast as any lower-bitted integer, and you are correct about that, but 
if it involves ANY cycles to build that Integer (and it will involve 
MANY), the gain turns into a loss quickly.


Remember: In the CPU registers, something that looks very lean to your 
eye, like left-shifting 2 bytes packing it into a 16-bit INT [ex: i16 = 
(b1 << 8) + b2] actually expands to many low-level CPU operations 
involving moving the larger int into a register/adder, shifting it, 
adding, etc. (Repeat-extrapolate for larger INTs.)  As you can see, 
simply speeding the 8 bytes as-is into cpu/memory/device consecutively, 
using only an 1/8th of the highway, is much faster than first building 
the 64 bit INT and then sending that. The loss effectively repeats the 
day you read it back and needs to decipher.


Verdict: Don't do it. Ever.

Cheers,
Ryan

On 2017/08/10 9:44 AM, x wrote:

As in cramming numerous integer columns into a 64 bit integer.

Does anyone do this?

Is the speed gain worth the additional confusion of extracting the columns from 
the key?

How is it best accomplished (virtual table maybe)?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Packing integer primary key with field bits

2017-08-10 Thread Hick Gunter
Simple answer: Don't!

This sounds like a misguided attempt to save space in the disk image by messing 
around with the rowid of a table (which is what "integer primary key" declares 
the column to be an alias of).

Whatever you stuff in there needs to be unique and, if you intend to use 
foreign keys, stable (unless you want to incur anomalies on updates). 
Additionally, SQLite may assign a completely different rowid (i.e. clobber your 
"packed field values") when running the REPLACE conflict resolution algortihm.

SQLite is already very good at saving space on disk. The most common values 
(NULL, 0 and 1) are stored using only 1 byte in the record header and other 
integer values only use the minimal number of bytes required to store the value.


-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von x
Gesendet: Donnerstag, 10. August 2017 09:45
An: sqlite-users@mailinglists.sqlite.org
Betreff: [sqlite] Packing integer primary key with field bits

As in cramming numerous integer columns into a 64 bit integer.

Does anyone do this?

Is the speed gain worth the additional confusion of extracting the columns from 
the key?

How is it best accomplished (virtual table maybe)?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users