Ryan, I know what you’re saying about the list of bytes but I’d like it 
confirmed that the indexes are not stored in a format similar to the data.

Do you not think the cost of unpacking the bytes would be insignificant?

Another possibility that should be considered is that if the database is 
reduced considerably it might fit in memory or, at the vey least, memory caches 
will be able to hold a lot more records.

Tom

From: R Smith<mailto:rsm...@rsweb.co.za>
Sent: 10 August 2017 16:24
To: 
sqlite-users@mailinglists.sqlite.org<mailto:sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Packing integer primary key with field bits



On 2017/08/10 4:51 PM, x wrote:
> Valid point on the intended range Gunter. I don’t know enough about sqlite to 
> fully understand your index cell paragraph. I thought the way sqlite worked 
> was e.g. to get the value of the 3rd column it had to read the lengths of 
> col1 & col2 so it knows where col 3 value starts (I’ve seen a comment that 
> the most retrieved cols should be at the start of the record)

This is true, but the cost of "reading" the bytes to get to the values
are no worse than what your cost would be for reading the Int64 and
unpacking the bytes from there... SQLite internally does the same thing
only without using an Int64, it just uses an array of bytes, but gleans
the values equally efficiently. If it was better to use Int64 values in
stead of a byte-array, SQLite would already be using it. By the way,
Int64 is just another way of saying "list of 8 bytes", which is just
another way of saying "list of 64 bits", there is nothing magically more
efficient about 64bit INT than a list of 8 bytes. Memory is memory.
Your idea might offer savings in that you predetermine the length of
stored values and ensure they fit into that list of 8 bytes, whereas
SQLite stores also a Length, but has the vastly superior property of
being able to store any size of any type of variable. Again, the speed
gain is dubious for the price it comes at.

To make matters worse, the real bottleneck is usually the storage layer,
and to read even 1 byte from a row, or Index, usually requires reading
an entire File-System Page (Typically ~4 Thousand bytes) but some
successive reads are usually gotten from the same page to make matters
more efficient. The point being, the reading of that File system page is
several magnitudes more time-intensive than unpacking the bytes. You
will not save anything much speed-wise, and very little space-wise, and
then only for smaller row-lengths.

By the way, the same goes for reading your own array of bytes from the
storage medium - the hope is of course that some good caching at both
the storage layer and your application layer would smooth out read/write
waits.

>   and that indexes are stored the same way.

This is not quite true. The Index is (usually) a B-Tree structure with
consistent size value indices. An Index is built for speed, not storage
efficiency. Adding an Index effectively copies the entire column(s) the
Index refers to into slightly less efficient (but more accessible)
storage - It's much worse space-wise than simply using separate fields
and no Index.
[Note: I am not 100% sure about the BTree storage format in SQLite's
case, I refer here to other DB Engines, but I would be surprised if an
SQLite BTree Index is stored exactly the same as the table Row format.]


_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to