Re: [sqlite] Full covering index without table

2013-03-07 Thread Dominique Devienne
On Thu, Mar 7, 2013 at 11:50 AM, Richard Hipp  wrote:
> When a row is larger than a single page, the content spills into multiple
> pages where the pages are strung together as a linked list.  You have to
> read each page in order to follow the linked list.

That makes sense. Thanks for the precision.

> [...] That said, if you enable incremental vacuum mode
>
>  PRAGMA auto_vacuum=INCREMENTAL;
>  VACUUM;
>
> Then SQLite3 will store some additional information related to each page
> (specifically pointer map pages described at
> http://www.sqlite.org/fileformat2.html#ovflpgs) which will often enable it
> to follow the linked list of overflow pages without actually having to
read
> each page in the chain.  Can you rerun your performance measurements after
> switching to incremental vacuum mode and see if that makes a difference?

Makes a big difference! The vacuum itself was not free (7+ sec) , but sped
up the couple queries accessing the tail columns from 1,500+ ms to ~50 ms.
Not everyday you get a 30x improvement :)

Thanks a bunch for this insight. --DD

PS: FWIW, I'm running this on a Win64 Core i7 laptop, on the SSD drive.

C:\Users\DDevienne>sqlite3 toseekornottoseek.db
SQLite version 3.7.15.2 2013-01-09 11:53:05
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> PRAGMA auto_vacuum=INCREMENTAL;
sqlite> create table t (head number, big blob, tail text);
sqlite> insert into t values (1, randomblob(100*1024*1024), "100 MB");
sqlite> insert into t values (2, randomblob(100*1024*1024), "100 MB");
sqlite> insert into t values (3, randomblob(100*1024*1024), "100 MB");
sqlite> insert into t values (4, randomblob(100*1024*1024), "100 MB");
sqlite> insert into t values (5, randomblob(100*1024*1024), "100 MB");
sqlite> .timer ON
sqlite> VACUUM;
CPU Time: user 1.996813 sys 7.550448
sqlite> .q

C:\Users\DDevienne>sqlite3 toseekornottoseek.db
SQLite version 3.7.15.2 2013-01-09 11:53:05
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .timer ON

sqlite> select head, tail from t;
CPU Time: user 0.046800 sys 0.046800

sqlite> select tail from t;
CPU Time: user 0.046800 sys 0.046800
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-07 Thread Richard Hipp
On Thu, Mar 7, 2013 at 3:50 AM, Dominique Devienne wrote:

> On Tue, Mar 5, 2013 at 4:05 PM, Richard Hipp  wrote:
> > Both implementations allow for reading just the prefix of the content
> blob
> > in order to access earlier fields of a table, leaving the tail of the
> blob
> > unread on disk.
>
> This information sparked a bit of a debate among my colleagues, because
> since SQLite's back-end is paged, some of them assumed that pages
> corresponding (entirely) to un-selected (blob) columns would be
> skipped/seeked over, thus avoiding to be read from disk, even if followed
> by other columns that are selected.
>
> Am I reading you correctly that this assumption is incorrect?
>

When a row is larger than a single page, the content spills into multiple
pages where the pages are strung together as a linked list.  You have to
read each page in order to follow the linked list.  (When I was designing
the SQLite3 file format, 10 years ago, I analyzed every SQLite2 database
file I could lay my hands on and found that it was very unusual to have a
TEXT or BLOB column that exceeded a few hundred bytes in length, so this
was a reasonable thing to do back then.  Since that time, usage patterns
have changed, perhaps because SQLite3 is very efficient at handling large
BLOBs and TEXT fields even with its linked-list overflow chain
implementation.)

That said, if you enable incremental vacuum mode

 PRAGMA auto_vacuum=INCREMENTAL;
 VACUUM;

Then SQLite3 will store some additional information related to each page
(specifically pointer map pages described at
http://www.sqlite.org/fileformat2.html#ovflpgs) which will often enable it
to follow the linked list of overflow pages without actually having to read
each page in the chain.  Can you rerun your performance measurements after
switching to incremental vacuum mode and see if that makes a difference?



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-07 Thread Dominique Devienne
On Thu, Mar 7, 2013 at 9:50 AM, Dominique Devienne 
wrote:
> On Tue, Mar 5, 2013 at 4:05 PM, Richard Hipp  wrote:
> > Both implementations allow for reading just the prefix of the content
blob
> > in order to access earlier fields of a table, leaving the tail of the
blob
> > unread on disk.
>
> This information sparked a bit of a debate among my colleagues, because
since SQLite's back-end is paged, some of them assumed that pages
corresponding (entirely) to un-selected (blob) columns would be
skipped/seeked over, thus avoiding to be read from disk, even if followed
by other columns that are selected.
>
> Am I reading you correctly that this assumption is incorrect?

I guess I can answer that question myself, given the experiment below.

Selecting only from the row's head is fast (under the 15ms clock tick
here), including selecting the blob's length (or typeof, as you wrote).
Selecting the rows big blob or just the tail is slow, likely because
reading the whole row.
Selecting just leading bytes of the big blob is slow too, although
logically that's still a row prefix. I guess the engine doesn't "peek" at
substr(big, 0, N) to realize that's still a row prefix, and ends up reading
the whole row (maybe minus the tail, although that makes no difference),
even though only 10 bytes are requested.

Very informative. Thanks, --DD

C:\Users\DDevienne>sqlite3 toseekornottoseek.db
SQLite version 3.7.15.2 2013-01-09 11:53:05
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table t (head number, big blob, tail text);
sqlite> insert into t values (1, randomblob(100*1024*1024), "100 MB");
sqlite> insert into t values (2, randomblob(100*1024*1024), "100 MB");
sqlite> insert into t values (3, randomblob(100*1024*1024), "100 MB");
sqlite> insert into t values (4, randomblob(100*1024*1024), "100 MB");
sqlite> insert into t values (5, randomblob(100*1024*1024), "100 MB");
sqlite> .q

C:\Users\DDevienne>sqlite3 toseekornottoseek.db
SQLite version 3.7.15.2 2013-01-09 11:53:05
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .timer ON

sqlite> select head from t;
1
2
3
4
5
CPU Time: user 0.00 sys 0.00

sqlite> select head, tail from t;
1|100 MB
2|100 MB
3|100 MB
4|100 MB
5|100 MB
CPU Time: user 0.280802 sys 1.528810

sqlite> select tail from t;
100 MB
100 MB
100 MB
100 MB
100 MB
CPU Time: user 0.156001 sys 1.653611

sqlite> select length(big) from t;
104857600
104857600
104857600
104857600
104857600
CPU Time: user 0.00 sys 0.00

sqlite> select hex(substr(big, 34*1024*1024, 10)) from t;
D6DB92922E2E9FF91612
5DBD2F8E31B9DCE47BA3
EAE710AAC1D92AA8BE63
B9D549292AB82EBA21BD
CC979172DFA70D8A4420
CPU Time: user 0.343202 sys 1.809612

sqlite> select hex(substr(big, 0, 10)) from t;
DCA56B817BEF685AED
21D8D8D157AE5DDC3F
723BC3489CBA8146FF
1E31DF8816D2DD6E6D
4514A82E23F1BE638E
CPU Time: user 0.374402 sys 1.762811

sqlite> select typeof(big) from t;
blob
blob
blob
blob
blob
CPU Time: user 0.00 sys 0.015600
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-07 Thread Dominique Devienne
On Tue, Mar 5, 2013 at 4:05 PM, Richard Hipp  wrote:
> Both implementations allow for reading just the prefix of the content blob
> in order to access earlier fields of a table, leaving the tail of the blob
> unread on disk.

This information sparked a bit of a debate among my colleagues, because
since SQLite's back-end is paged, some of them assumed that pages
corresponding (entirely) to un-selected (blob) columns would be
skipped/seeked over, thus avoiding to be read from disk, even if followed
by other columns that are selected.

Am I reading you correctly that this assumption is incorrect?

Concretely, assuming a table with a few scalar columns i, t, r (thus fairly
small in byte size), a couple large to very large blob columns, b1, b2,
followed by a couple scalar columns again e1, e2 (at the end by "mistake"
or as the result of a add-column schema upgrade).

* That table contains a single row, starting on page 5. the i, t, r
values/cells for that row are on that page.
* The b1, b2 blobs of that row spread from page 5 to 875.
* The trailing e1, e2 scalars of that row as also on page 875.

If I select i, t, r, e1, e2, is SQLite reading the two pages (5, 875), or
the whole range (5 - 875)?

Thanks, --DD

PS: The table described above really exists in our software, is often a
large one ( > 1 GB, mostly because of the blobs), and the debate is whether
it's really worth to reorder the columns close to a release.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-05 Thread Richard Hipp
On Tue, Mar 5, 2013 at 12:11 PM, Tim Streater  wrote:

> On 05 Mar 2013 at 15:05, Richard Hipp  wrote:
>
> > Both implementations allow for reading just the prefix of the content
> blob
> > in order to access earlier fields of a table, leaving the tail of the
> blob
> > unread on disk.  So in all cases, it pays to put your frequently accessed
> > small fields early in your table, and your infrequently accessed
> > multi-megabyte BLOB columns at the end of the table.  That way you won't
> > have to read over a multi-megabyte BLOB just to get at the BOOLEAN value
> at
> > the end.
>
> This was interesting to read, and may result in me reordering some tables
> I have. But suppose one of my fields early in the tables is an integer
> whose value, so far, fits in 16 bits (say). What happens if a value in one
> row grows to require 24 or 32 bits to represent. Does that column get moved
> to the end of the row, past my large blobs?
>

The entire row is rewritten on any update.  So space for each integer can
be added as needed.

So, it also makes sense to store massive BLOBs in separate tables from
small integers and booleans, and do joins as needed, so that you can update
your integers and booleans without having to copy the huge BLOBs.

That said, the penalty for coping the huge BLOB is not all that great.  The
SQLite database that runs Fossil stores small integers together in the same
table with big BLOBs that hold checked-in file content.  And it sometimes
updates those integers without touching the blobs, causing the blobs to
have to be recopied.  And we've never had any performance problems (or at
least none in that particular area of the code).  So occasional updates
will be fine.  You probably only need to separate integers/booleans from
big BLOBs in extremely performance critical cases.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-05 Thread Tim Streater
On 05 Mar 2013 at 15:05, Richard Hipp  wrote: 

> Both implementations allow for reading just the prefix of the content blob
> in order to access earlier fields of a table, leaving the tail of the blob
> unread on disk.  So in all cases, it pays to put your frequently accessed
> small fields early in your table, and your infrequently accessed
> multi-megabyte BLOB columns at the end of the table.  That way you won't
> have to read over a multi-megabyte BLOB just to get at the BOOLEAN value at
> the end.

This was interesting to read, and may result in me reordering some tables I 
have. But suppose one of my fields early in the tables is an integer whose 
value, so far, fits in 16 bits (say). What happens if a value in one row grows 
to require 24 or 32 bits to represent. Does that column get moved to the end of 
the row, past my large blobs?

--
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-05 Thread Simon Slavin

On 5 Mar 2013, at 3:05pm, Richard Hipp  wrote:

> I think you have misunderstood too.  Both SQLite3 and SQLite4 are
> row-oriented databases.  The storage engines are very different, but they
> still store each row as a single big blob

Okay, yep.  I did misunderstand what was posted earlier.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-05 Thread Richard Hipp
On Tue, Mar 5, 2013 at 9:45 AM, Simon Slavin  wrote:

>
> On 5 Mar 2013, at 2:37pm, Richard Hipp  wrote:
>
> > On Tue, Mar 5, 2013 at 9:26 AM, Simon Slavin 
> wrote:
> >
> >> I've seen programs written by people who always do 'SELECT *" even when
> >> they want just a couple of fields.  That's going to be a lot more
> expensive
> >> under SQLite4.  Maybe it'll teach them to be better programmers.
> >
> > Why will it be any more expensive under SQLite4 than it is under SQLite3?
>
> I probably misunderstood.  My impression was that under SQLite3, all the
> fields of one row were stored 'near' one-another, so getting the value of
> one of them would read the others into cache, whereas under SQLite4 this
> wasn't true.
>

I think you have misunderstood too.  Both SQLite3 and SQLite4 are
row-oriented databases.  The storage engines are very different, but they
still store each row as a single big blob that embeds the individual
fields.  The encoding format of this blob in SQLite3 is described at (
http://www.sqlite.org/fileformat2.html#record_format) and the encoding
format for this blob in SQLite4 is described at (
http://www.sqlite.org/src4/doc/trunk/www/data_encoding.wiki).  The
encodings are similar in philosophy but different in detail.  The SQLite4
version is a little more efficient and more extensible.

Both implementations allow for reading just the prefix of the content blob
in order to access earlier fields of a table, leaving the tail of the blob
unread on disk.  So in all cases, it pays to put your frequently accessed
small fields early in your table, and your infrequently accessed
multi-megabyte BLOB columns at the end of the table.  That way you won't
have to read over a multi-megabyte BLOB just to get at the BOOLEAN value at
the end.  Both implementations also allow you to do things like typeof(x)
and length(x) without actually reading the entire content off of disk.


>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-05 Thread Simon Slavin

On 5 Mar 2013, at 2:37pm, Richard Hipp  wrote:

> On Tue, Mar 5, 2013 at 9:26 AM, Simon Slavin  wrote:
> 
>> I've seen programs written by people who always do 'SELECT *" even when
>> they want just a couple of fields.  That's going to be a lot more expensive
>> under SQLite4.  Maybe it'll teach them to be better programmers.
> 
> Why will it be any more expensive under SQLite4 than it is under SQLite3?

I probably misunderstood.  My impression was that under SQLite3, all the fields 
of one row were stored 'near' one-another, so getting the value of one of them 
would read the others into cache, whereas under SQLite4 this wasn't true.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-05 Thread Richard Hipp
On Tue, Mar 5, 2013 at 9:26 AM, Simon Slavin  wrote:

> I've seen programs written by people who always do 'SELECT *" even when
> they want just a couple of fields.  That's going to be a lot more expensive
> under SQLite4.  Maybe it'll teach them to be better programmers.
>

Why will it be any more expensive under SQLite4 than it is under SQLite3?
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-05 Thread Simon Slavin

On 5 Mar 2013, at 11:54am, Eleytherios Stamatogiannakis  
wrote:

> I'm not so sure how fast SQLite4 will be with respect to full scans. IMHO row 
> stores have an advantage when scanning over multiple columns.

I was thinking about that.  I've seen programs written by people who always do 
'SELECT *" even when they want just a couple of fields.  That's going to be a 
lot more expensive under SQLite4.  Maybe it'll teach them to be better 
programmers.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-05 Thread Eleytherios Stamatogiannakis
I'm not so sure how fast SQLite4 will be with respect to full scans. 
IMHO row stores have an advantage when scanning over multiple columns.


Concerning dropping the index and recreating it on the other side. Its 
doable but very bad from a performance point of view.


If you know how the other side will access the data, then the best 
option is to build in parallel the indexes over all the data chunks (on 
every cluster node), and then send the indexed chunks on the other side. 
Having doubled data (table+full covering index) in these chunks, halves 
the I/O bandwidth of the whole cluster.


Also i should point that our main use case is OLAP processing and not OLTP.

lefteris.

On 05/03/13 10:51, Nico Williams wrote:

SQLite4 gets this right...  Of course, it's not been released.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-05 Thread Nico Williams
SQLite4 gets this right...  Of course, it's not been released.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-05 Thread Dominique Devienne
On Tue, Mar 5, 2013 at 1:59 AM, Carlos Milon Silva 
wrote:
> Also, You could not rebuild the index from the index, if necessary.

I'm not sure what your point is Carlos. The table is the index, so there's
no index to drop or rebuild.

What Eleytherios is referring to, is known in Oracle as an
Index-Organized-Table (IOT).

They have their uses (e.g. http://stackoverflow.com/questions/3382939),
although it's just an optimization. --DD
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-04 Thread Carlos Milon Silva

Also, You could not rebuild the index from the index, if necessary.

Carlos.

Em 04/03/2013 12:44, Simon Slavin escreveu:

On 4 Mar 2013, at 4:13pm, Eleytherios Stamatogiannakis  wrote:


Is there a way in SQLite to have a full covering index on a table without also 
storing the duplicate table?

Can we ask why you care about this ?  Do you have a huge table which is taking a huge amount of 
space, and you're trying to fit it on a Flash Drive ?  Can you distinguish between "I think it 
could be smaller." and "It's just a little too big and that means I can't use SQLite for 
this." ?

The most compact way of carrying SQLite databases around is to use the shell 
tool to dump the database to a SQL text file, then use a compression utility 
(e.g. ZIP) to compress that text file.  But without knowing your situation I 
can't tell if that would help you.


A first shot toward a partial solution would be to declare all the columns on 
the table as primary keys:

create table t(a,b,c, primary key(a,b,c));

Sorry, but it doesn't help.  Even fields in the primary key are stored twice.

Simon.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-04 Thread Eleytherios Stamatogiannakis

On 04/03/13 18:44, Simon Slavin wrote:


On 4 Mar 2013, at 4:13pm, Eleytherios Stamatogiannakis  wrote:


Is there a way in SQLite to have a full covering index on a table without also 
storing the duplicate table?


Can we ask why you care about this ?  Do you have a huge table which is taking a huge amount of 
space, and you're trying to fit it on a Flash Drive ?  Can you distinguish between "I think it 
could be smaller." and "It's just a little too big and that means I can't use SQLite for 
this." ?



We are creating a distributed processing system in the spirit of Hadapt 
[1], but instead of using PostgreSQL we are using SQLite.


For the intermediate result tables (each one inside an SQLite DB) that 
we know how they will be accessed (and so we prepare their indexes), it 
is very wasteful to have to transfer twice the data (index + full table).


This kind of systems live and die by their I/O.


The most compact way of carrying SQLite databases around is to use the shell 
tool to dump the database to a SQL text file, then use a compression utility 
(e.g. ZIP) to compress that text file.  But without knowing your situation I 
can't tell if that would help you.


For streaming processing we have our own serialization format that is 
compressed on the fly with LZ4. These streams are opened on the other 
side as SQLite Virtual Tables. For store and forward type of processing, 
we use SQLite DBs also compressed on the fly with LZ4. On the other side 
we simply "attach" these DBs.



A first shot toward a partial solution would be to declare all the columns on 
the table as primary keys:

create table t(a,b,c, primary key(a,b,c));


Sorry, but it doesn't help.  Even fields in the primary key are stored twice.


I'm saddened to hear that. I thought that at least we had a partial 
solution with declaring all rows as a primary key...


Thank you for answering.

l.

[1] http://hadapt.com
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Full covering index without table

2013-03-04 Thread Simon Slavin

On 4 Mar 2013, at 4:13pm, Eleytherios Stamatogiannakis  wrote:

> Is there a way in SQLite to have a full covering index on a table without 
> also storing the duplicate table?

Can we ask why you care about this ?  Do you have a huge table which is taking 
a huge amount of space, and you're trying to fit it on a Flash Drive ?  Can you 
distinguish between "I think it could be smaller." and "It's just a little too 
big and that means I can't use SQLite for this." ?

The most compact way of carrying SQLite databases around is to use the shell 
tool to dump the database to a SQL text file, then use a compression utility 
(e.g. ZIP) to compress that text file.  But without knowing your situation I 
can't tell if that would help you.

> A first shot toward a partial solution would be to declare all the columns on 
> the table as primary keys:
> 
> create table t(a,b,c, primary key(a,b,c));

Sorry, but it doesn't help.  Even fields in the primary key are stored twice.

Simon.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users