Re: [sqlite] Query efficiency
First of all, I couldn't for the life of me work out why that new query would work, but I'm sure that's just a limit of my knowledge. I then realised that the database definition I had used was really very different from what I was trying to do, so I've had another go and then tried to use the query with the new schema. Unfortunately, I have failed miserably, so here's the schema I'm working with now. Note that table b isn't used, but I've created it to show that table ab is a list of references of a records to b records, and b records contain an a_id.

sqlite> create table a (a_id int primary key);
sqlite> create table b (a_id int, b_id int, primary key(a_id, b_id));
sqlite> insert into a values(1);
sqlite> insert into a values(2);
sqlite> create table aa (a_id1 int, a_id2 int, primary key(a_id1, a_id2));
sqlite> insert into aa values (1, 1);
sqlite> insert into aa values (1, 2);
sqlite> insert into aa values (2, 1);
sqlite> insert into aa values (3, 1);
sqlite> insert into aa values (1, 3);
sqlite> select * from aa;
1|1
1|2
2|1
3|1
1|3
sqlite> create table ab (a_id1 int, a_id2 int, b_id int, primary key(a_id1, a_id2, b_id));
sqlite> insert into ab values(1, 1, 1);
sqlite> insert into ab values(2, 1, 1);
sqlite> select * from aa where
   ...> a_id1 in (select distinct a_id1 from ab where ab.a_id2 = 1 and ab.b_id = 1) and
   ...> a_id2 in (select distinct a_id1 from ab where ab.a_id2 = 1 and ab.b_id = 1);
1|1
1|2
2|1
sqlite> select aa.* from aa ar, ab ab1, ab ab2;
Error: no such table: aa
sqlite> select aa.* from aa as ar, ab ab1, ab ab2;
Error: no such table: aa
sqlite>

At this point I got confused, but I realised I had asked about a very different schema and obviously must be doing something wrong, as this failed before adding any sort of where clause.

> From: Pavel Ivanov <paiva...@gmail.com>
> Subject: Re: [sqlite] Query efficiency
>
>> > That is, is leaving it to the
>> > query optimiser to figure out that I only need the sub select once the
>> > best thing to do?
> AFAIK, SQLite's optimizer is not smart enough to collapse two identical
> sub-queries and reuse a once-generated result.
>
>> > Is the select I'm doing where both a_id1 & 2 are "in" the exact same
>> > select the most efficient way to do this?
>
> I'd say that the following query will work faster in this particular
> case (with this set of tables and indexes):
>
> select ar.*
> from a_relation ar, ab ab1, ab ab2
> where ar.a_id1 = ab1.a_id
> and ab1.b_id = 1
> and ar.a_id2 = ab2.a_id
> and ab2.b_id = 1;
>
> But this query might not be transformable to your real case. Also,
> performance on the real schema could be different.
>
>
> Pavel
>
>
> On Tue, May 17, 2011 at 5:29 AM, Matthew Jones <matthew.jo...@hp.com> wrote:
>> > O.k. So this is a very cut down example but it illustrates the question:
>> >
>> > sqlite> create table a (a_id int primary key);
>> > sqlite> create table b (b_id int primary key);
>> > sqlite> create table ab (a_id int, b_id int, primary key(a_id, b_id));
>> > sqlite> create table a_relation (a_id1 int, a_id2, primary key(a_id1,
>> > a_id2));
>> > sqlite> select * from a_relation where
>> >    ...> a_id1 in (select a_id from ab where b_id = 1) and
>> >    ...> a_id2 in (select a_id from ab where b_id = 1);
>> >
>> > Is the select I'm doing where both a_id1 & 2 are "in" the exact same
>> > select the most efficient way to do this? That is, is leaving it to the
>> > query optimiser to figure out that I only need the sub select once the
>> > best thing to do?
>> >
>> > (The actual tables in question are a little more complicated and I have
>> > versions to cope with, but this is effectively what I'm doing in C++ [so I'm
>> > preparing and binding etc.]. The actual sub selects have a group by a_id
>> > to cope with multiple entries with different versions.)
>> >
>> > Thanks
>> >
>> > --
>> > Matthew Jones
>> > Hewlett-Packard Ltd

Thanks

--
Matthew Jones
Hewlett-Packard Ltd

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
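The "Error: no such table: aa" messages above come from aliasing: once `aa` is given the alias `ar`, only the alias is visible in the select list, so it must be `select ar.*`. A minimal sketch reproducing both the failure and a working join, using Python's stdlib sqlite3 module on the toy data from the session (the join conditions are my own adaptation of Pavel's query to the new ab schema):

```python
import sqlite3

# Toy data from the session above, in an in-memory database.
con = sqlite3.connect(":memory:")
con.executescript("""
create table aa (a_id1 int, a_id2 int, primary key(a_id1, a_id2));
create table ab (a_id1 int, a_id2 int, b_id int, primary key(a_id1, a_id2, b_id));
insert into aa values (1, 1), (1, 2), (2, 1), (3, 1), (1, 3);
insert into ab values (1, 1, 1), (2, 1, 1);
""")

# "select aa.* from aa ar" fails: the alias replaces the table name.
try:
    con.execute("select aa.* from aa ar").fetchall()
    alias_error = None
except sqlite3.OperationalError as exc:
    alias_error = str(exc)

# Referencing the alias works; this is the join adapted to the new schema.
rows = con.execute("""
    select ar.* from aa ar, ab ab1, ab ab2
    where ar.a_id1 = ab1.a_id1 and ab1.a_id2 = 1 and ab1.b_id = 1
      and ar.a_id2 = ab2.a_id1 and ab2.a_id2 = 1 and ab2.b_id = 1
""").fetchall()   # same rows as the IN version: 1|1, 1|2, 2|1
```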
[sqlite] Query efficiency
O.k. So this is a very cut down example, but it illustrates the question:

sqlite> create table a (a_id int primary key);
sqlite> create table b (b_id int primary key);
sqlite> create table ab (a_id int, b_id int, primary key(a_id, b_id));
sqlite> create table a_relation (a_id1 int, a_id2, primary key(a_id1, a_id2));
sqlite> select * from a_relation where
   ...> a_id1 in (select a_id from ab where b_id = 1) and
   ...> a_id2 in (select a_id from ab where b_id = 1);

Is the select I'm doing where both a_id1 & 2 are "in" the exact same select the most efficient way to do this? That is, is leaving it to the query optimiser to figure out that I only need the sub select once the best thing to do?

(The actual tables in question are a little more complicated and I have versions to cope with, but this is effectively what I'm doing in C++ [so I'm preparing and binding etc.]. The actual sub selects have a group by a_id to cope with multiple entries with different versions.)

Thanks

--
Matthew Jones
Hewlett-Packard Ltd
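The two forms can be compared on toy data (invented here, using Python's stdlib sqlite3 module): the doubled-IN query from the question, and a join form suggested elsewhere in the thread that references ab twice under different aliases instead of repeating the subselect.

```python
import sqlite3

# Invented sample data for the cut-down schema.
con = sqlite3.connect(":memory:")
con.executescript("""
create table ab (a_id int, b_id int, primary key(a_id, b_id));
create table a_relation (a_id1 int, a_id2 int, primary key(a_id1, a_id2));
insert into ab values (1, 1), (2, 1), (3, 2);
insert into a_relation values (1, 2), (2, 1), (1, 3), (3, 3);
""")

# The doubled-IN form from the question.
in_rows = con.execute("""
    select * from a_relation
    where a_id1 in (select a_id from ab where b_id = 1)
      and a_id2 in (select a_id from ab where b_id = 1)
""").fetchall()

# The join form: ab appears twice under different aliases, so the
# "subquery" is just two indexed lookups rather than a repeated select.
join_rows = con.execute("""
    select ar.* from a_relation ar, ab ab1, ab ab2
    where ar.a_id1 = ab1.a_id and ab1.b_id = 1
      and ar.a_id2 = ab2.a_id and ab2.b_id = 1
""").fetchall()
```

Both queries return the same rows on this data; which is faster on the real schema would need measuring.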
Re: [sqlite] Query question
Thanks guys. I knew it would be simple (and I should have known it already).

> select a, count(distinct b) from a group by a;
>
> or if you are checking a specific value of a:
>
> select count(distinct b) from a where a=10;

--
Matthew Jones
Hewlett-Packard Ltd
Long Down Avenue
Stoke Gifford
Bristol. BS34 8QZ
Tel: +44 (0) 117 312 7490
Email: matthew.jo...@hp.com

Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks, RG12 1HN. Registered No: 690597 England
The contents of this message and any attachments to it are confidential and may be legally privileged. If you have received this message in error, you should delete it from your system immediately and advise the sender. To any recipient of this message within HP, unless otherwise stated you should consider this message and attachments as "HP CONFIDENTIAL".

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Query question
A simple query question that I really ought to know the answer to but don't: given a table with multiple columns in it, how do I count how many entries there are with column A matching some value where column B is distinct?

sqlite> create table a (a, b);
sqlite> insert into a values (10, 1);
sqlite> insert into a values (10, 2);
sqlite> insert into a values (10, 2);
sqlite> insert into a values (11, 2);
sqlite> insert into a values (11, 2);
sqlite> insert into a values (11, 3);
sqlite> select * from a where a=10 group by b;
10|1
10|2
sqlite> select * from a where a=11 group by b;
11|2
11|3

How can I do the count equivalent of such a query to find out how many distinct values of b there are for a given a? (That is, get an answer of 2 in the above.)

Thanks

--
Matthew Jones
Hewlett-Packard Ltd
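The count(distinct ...) answer quoted earlier in the digest does exactly this; a minimal demonstration with the rows from the question, via Python's stdlib sqlite3 module:

```python
import sqlite3

# The table and rows from the question.
con = sqlite3.connect(":memory:")
con.execute("create table a (a, b)")
con.executemany("insert into a values (?, ?)",
                [(10, 1), (10, 2), (10, 2), (11, 2), (11, 2), (11, 3)])

# Number of distinct b values per value of a.
per_a = dict(con.execute("select a, count(distinct b) from a group by a"))
# → {10: 2, 11: 2}

# Or for one specific value of a.
for_10 = con.execute(
    "select count(distinct b) from a where a = 10").fetchone()[0]   # → 2
```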
Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected
> I'll give you another failure point that most people never see or think of.
>
> I used to manage numerous Linux systems with RAID-5. One time I had a drive
> fail, the spare kicked in, and then during the rebuild a 2nd drive
> failed...hosing the RAID (i.e. two failed disks).
>
> The problem was...normal disk access/backup only scanned the in-use blocks.
> The RAID resync scanned the entire disk, which had never been done.
>
> After that I put in a utility that did a nightly "dd if=/dev/md0 >/dev/null"
> job to force a scan of the entire disk set.
>
> This is one reason why they invented RAID6.
>
> There's just so many ways to fail...sigh...

Off topic, but an interesting read on disk failures:
http://blogs.sun.com/relling/entry/raid_recommendations_space_vs_mttdl

--
Matthew Jones
Hewlett-Packard Ltd
Re: [sqlite] SQLite type question
> Also, even with a signed rowid, if you keep all rowids positive you can
> still insert 1,000,000,000 rows per second continuously for 292 years before
> you run out.

Excellent. I think you should adopt that as a strapline, Richard.

--
Matthew Jones
Hewlett-Packard Ltd
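The 292-year figure checks out: there are 2^63 − 1 positive signed 64-bit rowids, and at 10^9 inserts per second the quick arithmetic (my own sanity check, not from the thread) gives:

```python
# 2**63 - 1 positive rowids, consumed at one billion inserts per second.
seconds = (2**63 - 1) / 1_000_000_000
years = seconds / (365 * 24 * 3600)    # → roughly 292.5 years
```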
[sqlite] LIKE with BLOB
From a web search (and abbreviated):

>> I have BLOBs in my schema and the data will often start with bytes of
>> 0 value. I'm having a tough time coming up with the proper SQL syntax to
>> select all the columns that start with 2 0's (or any zeros).
>
> SELECT * FROM mytable WHERE myblob LIKE X'0025';
> SELECT * FROM mytable WHERE quote(myblob) LIKE 'X''00%';

Now I have a column of blob data (always 20 bytes) and I would like to do a LIKE select on this column where I have the first 10 bytes, but they can be any value including, of course, the % character. Is this possible, or does the arbitrary nature of the data make this infeasible? If it is possible, how would I define a prepared statement so that I can just bind the (10 byte) value into it?

Thanks

--
Matthew Jones
Hewlett-Packard Ltd
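One alternative to LIKE (which is text-oriented and would need escaping of % and _): SQLite's substr() function operates bytewise on blobs, so a fixed-length prefix can be compared directly against a bound parameter, with no escaping at all. A sketch with made-up data, via Python's stdlib sqlite3 module:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table mytable (myblob blob)")

# A 10-byte prefix containing NUL bytes and a literal '%' character,
# plus 10 bytes of payload to make the usual 20-byte value.
prefix = b"\x00\x00%ABC\xff\x01\x02\x03"
con.execute("insert into mytable values (?)", (prefix + b"0123456789",))
con.execute("insert into mytable values (?)", (b"\x01" * 20,))

# substr() works bytewise on blobs; the 10-byte prefix is simply bound.
matches = con.execute(
    "select count(*) from mytable where substr(myblob, 1, 10) = ?",
    (prefix,)).fetchone()[0]               # → 1
```

In C the equivalent would be a prepared statement with sqlite3_bind_blob() for the parameter.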
[sqlite] UPDATE without a JOIN
I've seen various posts about how to get around the lack of UPDATE with a JOIN, but they all seem to refer to tables joined on a single column. I need to do something very similar but with a two-column primary key. E.g.

sqlite> create table TABLE1 (a int, b int, primary key(a, b));
sqlite> create table TABLE2 (x int, y int, z int, primary key(x, y));
sqlite> create table TABLE1_2 (a int, b int, x int, y int, primary key(a, b, x, y));

So I have a many-to-many relationship between tables 1 and 2. I now want to update column z in TABLE2 for all entries that match a particular row in TABLE1. Obviously, TABLE1_2 can be queried to give me the keys of all the rows in TABLE2 that need to be updated, but how do I actually do the update?

Thanks
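A correlated EXISTS subquery is one way to express this against a two-column key (a row-value `(x, y) IN (select x, y ...)` form also works on SQLite 3.15+). A sketch with invented sample data, using Python's stdlib sqlite3 module:

```python
import sqlite3

# Invented sample data: TABLE1_2 links row (5, 6) of TABLE1
# to rows (1, 1) and (2, 1) of TABLE2.
con = sqlite3.connect(":memory:")
con.executescript("""
create table TABLE2 (x int, y int, z int, primary key(x, y));
create table TABLE1_2 (a int, b int, x int, y int, primary key(a, b, x, y));
insert into TABLE2 values (1, 1, 0), (1, 2, 0), (2, 1, 0);
insert into TABLE1_2 values (5, 6, 1, 1), (5, 6, 2, 1);
""")

# Correlated EXISTS on the link table picks out the TABLE2 rows to touch.
con.execute("""
    update TABLE2 set z = 99
    where exists (select 1 from TABLE1_2 t
                  where t.a = ? and t.b = ?
                    and t.x = TABLE2.x and t.y = TABLE2.y)
""", (5, 6))

rows = con.execute("select x, y, z from TABLE2 order by x, y").fetchall()
# → [(1, 1, 99), (1, 2, 0), (2, 1, 99)]
```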
Re: [sqlite] Schema updates invalidating connection
> The only way for a database connection to know if the schema has changed is
> to open and read the database file. sqlite3_prepare() tries to avoid reading
> the database file in order to reduce contention, though, so it is unlikely
> to discover a database change. The change is only discovered when
> sqlite3_step() is run.
>
> Having sqlite3_prepare() avoid opening and reading the database file is a
> desirable optimization since it makes sqlite3_prepare() run faster in the
> common case where no schema change occurs.

A fair enough optimisation.

> The only way I know of to force sqlite3_prepare() to check the database
> schema is to close and reopen the database connection. But doing so will
> dramatically increase the amount of time it takes to prepare each statement.

When the step fails (with SQLITE_ERROR, not SQLITE_SCHEMA) the finalize is called and it returns SQLITE_SCHEMA. (If sqlite3_step could return SQLITE_SCHEMA the condition would be easier to detect.) Having finalized the previous statement, the next statement prepared and stepped works perfectly. Does this mean that sqlite3_finalize is closing and reopening the database under the hood? If so, does this mean that sqlite3_finalize is very expensive? If not, then there is something that sqlite3_finalize does that updates the connection's view of the schema. Would it be possible to do this prior to the prepare? (Should one wish to do so, and knowing that there is an overhead involved.)

> Your best approach is to modify your code so that you are prepared to deal
> with SQLITE_SCHEMA errors returned by sqlite3_step. You'll need this anyway
> for the rare case when the schema changes in between sqlite3_prepare and
> sqlite3_step.

And, presumably, the less rare case where subsequent steps are called to retrieve data. Indeed, this is easy to cope with from a user's perspective: an active query is terminated due to a schema change.
It just seems a little strange (to us) that a schema change by another thread/process that may have completed "ages" ago means that the next query we attempt fails. We can program for this: if the first sqlite3_step of a prepared statement fails then, like sqlite3_exec, we can retry the prepare and step. If it isn't the first step then we can terminate the select and somehow indicate that the schema has changed during the select.

-
To unsubscribe, send email to [EMAIL PROTECTED]
-
[sqlite] Schema updates invalidating connection
We have a problem that is obviously to do with our failing to understand sqlite properly. Using a test program (enclosed) we create a table in a new database and then wait (for user input). If we then open the same sqlite database using sqlite3 and create another table in the database, when we continue with our program it fails to insert data into the table it has created.

Now, we know that schema changes invalidate any previously prepared statements, but we always prepare and execute (step) our statements in one go. So what should our program do prior to preparing a statement to ensure that its view of the database schema is correct?

If we use sqlite3_exec instead it all works, as this has a built-in retry for the step failing and then the finalize returning SQLITE_SCHEMA. I thought this was to cater for the case of preparing before the schema change and then stepping afterwards, the documented failure, but it also covers our case.

I raised a bug at sqlite.org, ticket 1936, but have been told that I should post to the list to receive support. Obviously, we are missing something very basic here. Our problem occurs in a multi-threaded environment, but we have minimized the code in the attached example to remove any threading issues.
code follows (the "&" characters lost in transit restored):

#include <stdio.h>
#include <string.h>
#include <sqlite3.h>

/* Prepare, execute, finalize as an atomic operation */
void execute(struct sqlite3 *db, const char *sql)
{
    int status = 0;
    struct sqlite3_stmt *stmt;
    const char *tail = 0;

    fprintf(stdout, "%s\n", sql);
    if ((status = sqlite3_prepare(db, sql, strlen(sql), &stmt, &tail)) != SQLITE_OK) {
        fprintf(stderr, "sqlite3_prepare: %d\n", status);
    }
    status = sqlite3_step(stmt);
    if (status != SQLITE_DONE) {
        fprintf(stderr, "sqlite3_step: %d, %s\n", status, sqlite3_errmsg(db));
    }
    if ((status = sqlite3_finalize(stmt)) != SQLITE_OK) {
        fprintf(stderr, "sqlite3_finalize: %d, %s\n", status, sqlite3_errmsg(db));
    }
}

int main(int ac, char *av[])
{
    int status = 0;
    struct sqlite3 *db = 0;
    char buffer[256];

    if ((status = sqlite3_open("test.db", &db)) != SQLITE_OK) {
        fprintf(stderr, "sqlite3_open: %d\n", status);
    }

    sprintf(buffer, "create table %s (i int)", av[1]);
    execute(db, buffer);

    puts("type to continue:");
    getchar();

    sprintf(buffer, "insert into %s values (4)", av[1]);
    execute(db, buffer);

    if ((status = sqlite3_close(db)) != SQLITE_OK) {
        fprintf(stderr, "sqlite3_close: %d\n", status);
    }
    return 0;
}
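For comparison: later SQLite releases added sqlite3_prepare_v2(), which re-prepares transparently when the schema has changed, and Python's stdlib sqlite3 module is built on it, so the same two-connection scenario succeeds there. A sketch under that assumption (temp-file path is my own choice):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.db")

# First connection creates its table, as the C program does.
con1 = sqlite3.connect(path)
con1.execute("create table t1 (i int)")
con1.commit()

# A second connection changes the schema, as the sqlite3 shell did.
con2 = sqlite3.connect(path)
con2.execute("create table t2 (i int)")
con2.commit()

# con1's next statement still succeeds: prepare_v2-style preparation
# retries internally instead of surfacing SQLITE_SCHEMA to the caller.
con1.execute("insert into t1 values (4)")
con1.commit()
n = con1.execute("select count(*) from t1").fetchone()[0]   # → 1
```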
[sqlite] Virtual tables
Is there any straightforward way I could use SQLite3 with virtual tables? By that I mean SQLite contains all the table and index definitions but none of the data - actually it will contain the data for some tables but not others. I only need query access to the tables, so I need to intercept the fetching of data from these virtual tables and indices. From what I have read in the architecture, it seems like I want to intercept the btree implementation for these virtual tables, but that doesn't mean it has to be done that way.

Thanks