Re: [sqlite] Full covering index without table
On Thu, Mar 7, 2013 at 11:50 AM, Richard Hippwrote: > When a row is larger than a single page, the content spills into multiple > pages where the pages are strung together as a linked list. You have to > read each page in order to follow the linked list. That makes sense. Thanks for the precision. > [...] That said, if you enable incremental vacuum mode > > PRAGMA auto_vacuum=INCREMENTAL; > VACUUM; > > Then SQLite3 will store some additional information related to each page > (specifically pointer map pages described at > http://www.sqlite.org/fileformat2.html#ovflpgs) which will often enable it > to follow the linked list of overflow pages without actually having to read > each page in the chain. Can you rerun your performance measurements after > switching to incremental vacuum mode and see if that makes a difference? Makes a big difference! The vacuum itself was not free (7+ sec) , but sped up the couple queries accessing the tail columns from 1,500+ ms to ~50 ms. Not everyday you get a 30x improvement :) Thanks a bunch for this insight. --DD PS: FWIW, I'm running this on a Win64 Core i7 laptop, on the SSD drive. C:\Users\DDevienne>sqlite3 toseekornottoseek.db SQLite version 3.7.15.2 2013-01-09 11:53:05 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> PRAGMA auto_vacuum=INCREMENTAL; sqlite> create table t (head number, big blob, tail text); sqlite> insert into t values (1, randomblob(100*1024*1024), "100 MB"); sqlite> insert into t values (2, randomblob(100*1024*1024), "100 MB"); sqlite> insert into t values (3, randomblob(100*1024*1024), "100 MB"); sqlite> insert into t values (4, randomblob(100*1024*1024), "100 MB"); sqlite> insert into t values (5, randomblob(100*1024*1024), "100 MB"); sqlite> .timer ON sqlite> VACUUM; CPU Time: user 1.996813 sys 7.550448 sqlite> .q C:\Users\DDevienne>sqlite3 toseekornottoseek.db SQLite version 3.7.15.2 2013-01-09 11:53:05 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> .timer ON sqlite> select head, tail from t; CPU Time: user 0.046800 sys 0.046800 sqlite> select tail from t; CPU Time: user 0.046800 sys 0.046800 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
On Thu, Mar 7, 2013 at 3:50 AM, Dominique Deviennewrote: > On Tue, Mar 5, 2013 at 4:05 PM, Richard Hipp wrote: > > Both implementations allow for reading just the prefix of the content > blob > > in order to access earlier fields of a table, leaving the tail of the > blob > > unread on disk. > > This information sparked a bit of a debate among my colleagues, because > since SQLite's back-end is paged, some of them assumed that pages > corresponding (entirely) to un-selected (blob) columns would be > skipped/seeked over, thus avoiding to be read from disk, even if followed > by other columns that are selected. > > Am I reading you correctly that this assumption is incorrect? > When a row is larger than a single page, the content spills into multiple pages where the pages are strung together as a linked list. You have to read each page in order to follow the linked list. (When I was designing the SQLite3 file format, 10 years ago, I analyzed every SQLite2 database file I could lay my hands on and found that it was very unusual to have a TEXT or BLOB column that exceeded a few hundred bytes in length, so this was a reasonable thing to do back then. Since that time, usage patterns have changed, perhaps because SQLite3 is very efficient at handling large BLOBs and TEXT fields even with its linked-list overflow chain implementation.) That said, if you enable incremental vacuum mode PRAGMA auto_vacuum=INCREMENTAL; VACUUM; Then SQLite3 will store some additional information related to each page (specifically pointer map pages described at http://www.sqlite.org/fileformat2.html#ovflpgs) which will often enable it to follow the linked list of overflow pages without actually having to read each page in the chain. Can you rerun your performance measurements after switching to incremental vacuum mode and see if that makes a difference? -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
On Thu, Mar 7, 2013 at 9:50 AM, Dominique Deviennewrote: > On Tue, Mar 5, 2013 at 4:05 PM, Richard Hipp wrote: > > Both implementations allow for reading just the prefix of the content blob > > in order to access earlier fields of a table, leaving the tail of the blob > > unread on disk. > > This information sparked a bit of a debate among my colleagues, because since SQLite's back-end is paged, some of them assumed that pages corresponding (entirely) to un-selected (blob) columns would be skipped/seeked over, thus avoiding to be read from disk, even if followed by other columns that are selected. > > Am I reading you correctly that this assumption is incorrect? I guess I can answer that question myself, given the experiment below. Selecting only from the row's head is fast (under the 15ms clock tick here), including selecting the blob's length (or typeof, as you wrote). Selecting the rows big blob or just the tail is slow, likely because reading the whole row. Selecting just leading bytes of the big blob is slow too, although logically that's still a row prefix. I guess the engine doesn't "peek" at substr(big, 0, N) to realize that's still a row prefix, and ends up reading the whole row (maybe minus the tail, although that makes no difference), even though only 10 bytes are requested. Very informative. Thanks, --DD C:\Users\DDevienne>sqlite3 toseekornottoseek.db SQLite version 3.7.15.2 2013-01-09 11:53:05 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> create table t (head number, big blob, tail text); sqlite> insert into t values (1, randomblob(100*1024*1024), "100 MB"); sqlite> insert into t values (2, randomblob(100*1024*1024), "100 MB"); sqlite> insert into t values (3, randomblob(100*1024*1024), "100 MB"); sqlite> insert into t values (4, randomblob(100*1024*1024), "100 MB"); sqlite> insert into t values (5, randomblob(100*1024*1024), "100 MB"); sqlite> .q C:\Users\DDevienne>sqlite3 toseekornottoseek.db SQLite version 3.7.15.2 2013-01-09 11:53:05 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> .timer ON sqlite> select head from t; 1 2 3 4 5 CPU Time: user 0.00 sys 0.00 sqlite> select head, tail from t; 1|100 MB 2|100 MB 3|100 MB 4|100 MB 5|100 MB CPU Time: user 0.280802 sys 1.528810 sqlite> select tail from t; 100 MB 100 MB 100 MB 100 MB 100 MB CPU Time: user 0.156001 sys 1.653611 sqlite> select length(big) from t; 104857600 104857600 104857600 104857600 104857600 CPU Time: user 0.00 sys 0.00 sqlite> select hex(substr(big, 34*1024*1024, 10)) from t; D6DB92922E2E9FF91612 5DBD2F8E31B9DCE47BA3 EAE710AAC1D92AA8BE63 B9D549292AB82EBA21BD CC979172DFA70D8A4420 CPU Time: user 0.343202 sys 1.809612 sqlite> select hex(substr(big, 0, 10)) from t; DCA56B817BEF685AED 21D8D8D157AE5DDC3F 723BC3489CBA8146FF 1E31DF8816D2DD6E6D 4514A82E23F1BE638E CPU Time: user 0.374402 sys 1.762811 sqlite> select typeof(big) from t; blob blob blob blob blob CPU Time: user 0.00 sys 0.015600 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
On Tue, Mar 5, 2013 at 4:05 PM, Richard Hippwrote: > Both implementations allow for reading just the prefix of the content blob > in order to access earlier fields of a table, leaving the tail of the blob > unread on disk. This information sparked a bit of a debate among my colleagues, because since SQLite's back-end is paged, some of them assumed that pages corresponding (entirely) to un-selected (blob) columns would be skipped/seeked over, thus avoiding to be read from disk, even if followed by other columns that are selected. Am I reading you correctly that this assumption is incorrect? Concretely, assuming a table with a few scalar columns i, t, r (thus fairly small in byte size), a couple large to very large blob columns, b1, b2, followed by a couple scalar columns again e1, e2 (at the end by "mistake" or as the result of a add-column schema upgrade). * That table contains a single row, starting on page 5. the i, t, r values/cells for that row are on that page. * The b1, b2 blobs of that row spread from page 5 to 875. * The trailing e1, e2 scalars of that row as also on page 875. If I select i, t, r, e1, e2, is SQLite reading the two pages (5, 875), or the whole range (5 - 875)? Thanks, --DD PS: The table described above really exists in our software, is often a large one ( > 1 GB, mostly because of the blobs), and the debate is whether it's really worth to reorder the columns close to a release. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
On Tue, Mar 5, 2013 at 12:11 PM, Tim Streaterwrote: > On 05 Mar 2013 at 15:05, Richard Hipp wrote: > > > Both implementations allow for reading just the prefix of the content > blob > > in order to access earlier fields of a table, leaving the tail of the > blob > > unread on disk. So in all cases, it pays to put your frequently accessed > > small fields early in your table, and your infrequently accessed > > multi-megabyte BLOB columns at the end of the table. That way you won't > > have to read over a multi-megabyte BLOB just to get at the BOOLEAN value > at > > the end. > > This was interesting to read, and may result in me reordering some tables > I have. But suppose one of my fields early in the tables is an integer > whose value, so far, fits in 16 bits (say). What happens if a value in one > row grows to require 24 or 32 bits to represent. Does that column get moved > to the end of the row, past my large blobs? > The entire row is rewritten on any update. So space for each integer can be added as needed. So, it also makes sense to store massive BLOBs in separate tables from small integers and booleans, and do joins as needed, so that you can update your integers and booleans without having to copy the huge BLOBs. That said, the penalty for coping the huge BLOB is not all that great. The SQLite database that runs Fossil stores small integers together in the same table with big BLOBs that hold checked-in file content. And it sometimes updates those integers without touching the blobs, causing the blobs to have to be recopied. And we've never had any performance problems (or at least none in that particular area of the code). So occasional updates will be fine. You probably only need to separate integers/booleans from big BLOBs in extremely performance critical cases. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
On 05 Mar 2013 at 15:05, Richard Hippwrote: > Both implementations allow for reading just the prefix of the content blob > in order to access earlier fields of a table, leaving the tail of the blob > unread on disk. So in all cases, it pays to put your frequently accessed > small fields early in your table, and your infrequently accessed > multi-megabyte BLOB columns at the end of the table. That way you won't > have to read over a multi-megabyte BLOB just to get at the BOOLEAN value at > the end. This was interesting to read, and may result in me reordering some tables I have. But suppose one of my fields early in the tables is an integer whose value, so far, fits in 16 bits (say). What happens if a value in one row grows to require 24 or 32 bits to represent. Does that column get moved to the end of the row, past my large blobs? -- Cheers -- Tim ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
On 5 Mar 2013, at 3:05pm, Richard Hippwrote: > I think you have misunderstood too. Both SQLite3 and SQLite4 are > row-oriented databases. The storage engines are very different, but they > still store each row as a single big blob Okay, yep. I did misunderstand what was posted earlier. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
On Tue, Mar 5, 2013 at 9:45 AM, Simon Slavinwrote: > > On 5 Mar 2013, at 2:37pm, Richard Hipp wrote: > > > On Tue, Mar 5, 2013 at 9:26 AM, Simon Slavin > wrote: > > > >> I've seen programs written by people who always do 'SELECT *" even when > >> they want just a couple of fields. That's going to be a lot more > expensive > >> under SQLite4. Maybe it'll teach them to be better programmers. > > > > Why will it be any more expensive under SQLite4 than it is under SQLite3? > > I probably misunderstood. My impression was that under SQLite3, all the > fields of one row were stored 'near' one-another, so getting the value of > one of them would read the others into cache, whereas under SQLite4 this > wasn't true. > I think you have misunderstood too. Both SQLite3 and SQLite4 are row-oriented databases. The storage engines are very different, but they still store each row as a single big blob that embeds the individual fields. The encoding format of this blob in SQLite3 is described at ( http://www.sqlite.org/fileformat2.html#record_format) and the encoding format for this blob in SQLite4 is described at ( http://www.sqlite.org/src4/doc/trunk/www/data_encoding.wiki). The encodings are similar in philosophy but different in detail. The SQLite4 version is a little more efficient and more extensible. Both implementations allow for reading just the prefix of the content blob in order to access earlier fields of a table, leaving the tail of the blob unread on disk. So in all cases, it pays to put your frequently accessed small fields early in your table, and your infrequently accessed multi-megabyte BLOB columns at the end of the table. That way you won't have to read over a multi-megabyte BLOB just to get at the BOOLEAN value at the end. Both implementations also allow you to do things like typeof(x) and length(x) without actually reading the entire content off of disk. > > Simon. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
On 5 Mar 2013, at 2:37pm, Richard Hippwrote: > On Tue, Mar 5, 2013 at 9:26 AM, Simon Slavin wrote: > >> I've seen programs written by people who always do 'SELECT *" even when >> they want just a couple of fields. That's going to be a lot more expensive >> under SQLite4. Maybe it'll teach them to be better programmers. > > Why will it be any more expensive under SQLite4 than it is under SQLite3? I probably misunderstood. My impression was that under SQLite3, all the fields of one row were stored 'near' one-another, so getting the value of one of them would read the others into cache, whereas under SQLite4 this wasn't true. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
On Tue, Mar 5, 2013 at 9:26 AM, Simon Slavinwrote: > I've seen programs written by people who always do 'SELECT *" even when > they want just a couple of fields. That's going to be a lot more expensive > under SQLite4. Maybe it'll teach them to be better programmers. > Why will it be any more expensive under SQLite4 than it is under SQLite3? -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
On 5 Mar 2013, at 11:54am, Eleytherios Stamatogiannakiswrote: > I'm not so sure how fast SQLite4 will be with respect to full scans. IMHO row > stores have an advantage when scanning over multiple columns. I was thinking about that. I've seen programs written by people who always do 'SELECT *" even when they want just a couple of fields. That's going to be a lot more expensive under SQLite4. Maybe it'll teach them to be better programmers. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
I'm not so sure how fast SQLite4 will be with respect to full scans. IMHO row stores have an advantage when scanning over multiple columns. Concerning dropping the index and recreating it on the other side. Its doable but very bad from a performance point of view. If you know how the other side will access the data, then the best option is to build in parallel the indexes over all the data chunks (on every cluster node), and then send the indexed chunks on the other side. Having doubled data (table+full covering index) in these chunks, halves the I/O bandwidth of the whole cluster. Also i should point that our main use case is OLAP processing and not OLTP. lefteris. On 05/03/13 10:51, Nico Williams wrote: SQLite4 gets this right... Of course, it's not been released. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
SQLite4 gets this right... Of course, it's not been released. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
On Tue, Mar 5, 2013 at 1:59 AM, Carlos Milon Silvawrote: > Also, You could not rebuild the index from the index, if necessary. I'm not sure what your point is Carlos. The table is the index, so there's no index to drop or rebuild. What Eleytherios is referring to, is known in Oracle as an Index-Organized-Table (IOT). They have their uses (e.g. http://stackoverflow.com/questions/3382939), although it's just an optimization. --DD ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
Also, You could not rebuild the index from the index, if necessary. Carlos. Em 04/03/2013 12:44, Simon Slavin escreveu: On 4 Mar 2013, at 4:13pm, Eleytherios Stamatogiannakiswrote: Is there a way in SQLite to have a full covering index on a table without also storing the duplicate table? Can we ask why you care about this ? Do you have a huge table which is taking a huge amount of space, and you're trying to fit it on a Flash Drive ? Can you distinguish between "I think it could be smaller." and "It's just a little too big and that means I can't use SQLite for this." ? The most compact way of carrying SQLite databases around is to use the shell tool to dump the database to a SQL text file, then use a compression utility (e.g. ZIP) to compress that text file. But without knowing your situation I can't tell if that would help you. A first shot toward a partial solution would be to declare all the columns on the table as primary keys: create table t(a,b,c, primary key(a,b,c)); Sorry, but it doesn't help. Even fields in the primary key are stored twice. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
On 04/03/13 18:44, Simon Slavin wrote: On 4 Mar 2013, at 4:13pm, Eleytherios Stamatogiannakiswrote: Is there a way in SQLite to have a full covering index on a table without also storing the duplicate table? Can we ask why you care about this ? Do you have a huge table which is taking a huge amount of space, and you're trying to fit it on a Flash Drive ? Can you distinguish between "I think it could be smaller." and "It's just a little too big and that means I can't use SQLite for this." ? We are creating a distributed processing system in the spirit of Hadapt [1], but instead of using PostgreSQL we are using SQLite. For the intermediate result tables (each one inside an SQLite DB) that we know how they will be accessed (and so we prepare their indexes), it is very wasteful to have to transfer twice the data (index + full table). This kind of systems live and die by their I/O. The most compact way of carrying SQLite databases around is to use the shell tool to dump the database to a SQL text file, then use a compression utility (e.g. ZIP) to compress that text file. But without knowing your situation I can't tell if that would help you. For streaming processing we have our own serialization format that is compressed on the fly with LZ4. These streams are opened on the other side as SQLite Virtual Tables. For store and forward type of processing, we use SQLite DBs also compressed on the fly with LZ4. On the other side we simply "attach" these DBs. A first shot toward a partial solution would be to declare all the columns on the table as primary keys: create table t(a,b,c, primary key(a,b,c)); Sorry, but it doesn't help. Even fields in the primary key are stored twice. I'm saddened to hear that. I thought that at least we had a partial solution with declaring all rows as a primary key... Thank you for answering. l. [1] http://hadapt.com ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Full covering index without table
On 4 Mar 2013, at 4:13pm, Eleytherios Stamatogiannakiswrote: > Is there a way in SQLite to have a full covering index on a table without > also storing the duplicate table? Can we ask why you care about this ? Do you have a huge table which is taking a huge amount of space, and you're trying to fit it on a Flash Drive ? Can you distinguish between "I think it could be smaller." and "It's just a little too big and that means I can't use SQLite for this." ? The most compact way of carrying SQLite databases around is to use the shell tool to dump the database to a SQL text file, then use a compression utility (e.g. ZIP) to compress that text file. But without knowing your situation I can't tell if that would help you. > A first shot toward a partial solution would be to declare all the columns on > the table as primary keys: > > create table t(a,b,c, primary key(a,b,c)); Sorry, but it doesn't help. Even fields in the primary key are stored twice. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users