Re: [sqlite] without rowid and secondary indexes
> Le 10 avr. 2018 à 18:25, Clemens Ladischa écrit : > > Olivier Mascia wrote: >> Would you mean: >> >> select NAME,count(NAME) from SHEETS where SHEET>? group by NAME having NAME >> is not NULL; >> >> is more logically appropriate? > > Probably. The first query can give quite random results: > > delete from SHEETS; > insert into SHEETS(NODE, SHEET, NAME) values(1, 1, 'a'); > insert into SHEETS(NODE, SHEET, NAME) values(1, 3, 'a'); > select NAME,count(NAME) from SHEETS group by NAME having SHEET>2 and NAME is > not NULL; > a|2 > select NAME,count(NAME) from SHEETS where SHEET>2 group by NAME having NAME > is not NULL; > a|1 > delete from SHEETS; > insert into SHEETS(NODE, SHEET, NAME) values(1, 3, 'a'); > insert into SHEETS(NODE, SHEET, NAME) values(2, 1, 'a'); > select NAME,count(NAME) from SHEETS group by NAME having SHEET>2 and NAME is > not NULL; > select NAME,count(NAME) from SHEETS where SHEET>2 group by NAME having NAME > is not NULL; > a|1 > >> Or that I might even go for: >> >> select NAME,count(NAME) from SHEETS where SHEET>? and NAME is not NULL >> group by NAME; > > This always has the same result as the second query. (But "NAME is not NULL" > in the > WHERE clause applies to the table column, while in the HAVING clause, it > applies to > the result of the grouping. So the last query makes it more likely that > SQLite can > prove that it can use the index.) Thanks a lot: this proved wonderfully educative, especially the oddities arising from: > select NAME,count(NAME) from SHEETS group by NAME having SHEET>2 and NAME is > not NULL; -- Best Regards, Meilleures salutations, Met vriendelijke groeten, Olivier Mascia ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] without rowid and secondary indexes
Olivier Mascia wrote: >> Le 10 avr. 2018 à 13:20, Clemens Ladischa écrit : >>> select NAME,count(NAME) from SHEETS group by NAME having SHEET>? and NAME >>> is not NULL; >> >> The SHEET value is from some random row in the group, and the NAME value is >> the same for all rows in the group. Are you sure you do not want to use >> WHERE >> instead of HAVING? > > Would you mean: > > select NAME,count(NAME) from SHEETS where SHEET>? group by NAME having NAME > is not NULL; > > is more logically appropriate? Probably. The first query can give quite random results: delete from SHEETS; insert into SHEETS(NODE, SHEET, NAME) values(1, 1, 'a'); insert into SHEETS(NODE, SHEET, NAME) values(1, 3, 'a'); select NAME,count(NAME) from SHEETS group by NAME having SHEET>2 and NAME is not NULL; a|2 select NAME,count(NAME) from SHEETS where SHEET>2 group by NAME having NAME is not NULL; a|1 delete from SHEETS; insert into SHEETS(NODE, SHEET, NAME) values(1, 3, 'a'); insert into SHEETS(NODE, SHEET, NAME) values(2, 1, 'a'); select NAME,count(NAME) from SHEETS group by NAME having SHEET>2 and NAME is not NULL; select NAME,count(NAME) from SHEETS where SHEET>2 group by NAME having NAME is not NULL; a|1 > Or that I might even go for: > > select NAME,count(NAME) from SHEETS where SHEET>? and NAME is not NULL > group by NAME; This always has the same result as the second query. (But "NAME is not NULL" in the WHERE clause applies to the table column, while in the HAVING clause, it applies to the result of the grouping. So the last query makes it more likely that SQLite can prove that it can use the index.) Regards, Clemens ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] without rowid and secondary indexes
> Le 10 avr. 2018 à 13:20, Clemens Ladischa écrit : > >> select NAME,count(NAME) from SHEETS group by NAME having SHEET>? and NAME is >> not NULL; > > The SHEET value is from some random row in the group, and the NAME value is > the same for all rows in the group. Are you sure you do not want to use WHERE > instead of HAVING? Thanks for the hint, Clemens. Would you mean: select NAME,count(NAME) from SHEETS where SHEET>? group by NAME having NAME is not NULL; is more logically appropriate? I'm inclined to think yes and that was what you meant. Or that I might even go for: select NAME,count(NAME) from SHEETS where SHEET>? and NAME is not NULL group by NAME; On a test table with statistically representative values (only ~95'000 rows), I get the same plan in all three cases: --EQP-- 0,0,0,SEARCH TABLE SHEETS USING COVERING INDEX SHEETS_NAME (NAME>?) -- Best Regards, Meilleures salutations, Met vriendelijke groeten, Olivier Mascia ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] without rowid and secondary indexes
Olivier Mascia wrote: > [...] > So I should probably even better go with: > > create index ... where NAME is not NULL; > > as my only queries involving NAME have a 'where NAME is not NULL' restriction. To be sure, check with EXPLAIN QUERY PLAN whether the index is actually used. > select NAME,count(NAME) from SHEETS group by NAME having SHEET>? and NAME is > not NULL; The SHEET value is from some random row in the group, and the NAME value is the same for all rows in the group. Are you sure you do not want to use WHERE instead of HAVING? Regards, Clemens ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] without rowid and secondary indexes
Thanks! > In your example below, your PRIMARY KEY consists of two integer > instead of just a single rowid integer. So it will take up slightly > more space. Not a lot, but some. > > Will that make a difference in your application? I do not know. Then, I will indeed run some more tests, but I guess I'm OK with adding the index while keeping the table without rowid because few rows will actually have a NAME value, most (like in more than 95% of them) will be NULL. What's more, most queries are PK centric. So I should probably even better go with: create index if not exists SHEETS_NAME on SHEETS(NAME) where NAME is not NULL; as my only queries involving NAME have a 'where NAME is not NULL' restriction. The purpose of the index being only to optimize such occasional dictionary building queries: select NAME,count(NAME) from SHEETS group by NAME having SHEET>? and NAME is not NULL; -- Best Regards, Meilleures salutations, Met vriendelijke groeten, Olivier Mascia > Le 10 avr. 2018 à 12:36, Richard Hippa écrit : > > Every index is composed of the columns being indexed followed by the > primary key of the table. In a regular ROWID table, the "primary key" > is the rowid - an integer. In a WITHOUT ROWID table, the primary key > is whatever you declare the PRIMARY KEY to be. > > So whether or not there is more overhead in an index on a WITHOUT > ROWID table versus a normal rowid table depends on your choice of the > PRIMARY KEY. > > In your example below, your PRIMARY KEY consists of two integer > instead of just a single rowid integer. So it will take up slightly > more space. Not a lot, but some. > > Will that make a difference in your application? I do not know. I > suggest you try it and see. This is one of the great beauties of SQL > (any SQL not just SQLite). You can make a simple schema change like > this and try it out, without having to modify your application - > indeed without having to even recompile your application. Simply run > the same build of your application using two different versions of the > database - one with an ordinary rowid table and the other with a > WITHOUT ROWID table - and see which one gives you better performance - > for whatever definition of "performance" is important to you. > > On 4/10/18, Olivier Mascia wrote: >> Dear all, >> >> With https://www.sqlite.org/withoutrowid.html in mind, assuming something >> simple as: >> >> create table if not exists SHEETS( >> NODE integer not null, >> SHEET integer not null, >> NAME text, -- often NULL, sometimes short text value >> (keyword) >> ..., -- 2 or 3 other small properties >> primary key(NODE,SHEET) >> ) without rowid; >> -- makes sense because small row and natural primary key is not integer >> >> Now what if an index on NAME becomes valuable? >> Will such a secondary index have a higher storage cost or performance cost >> than with an ordinary table? >> >> create index if not exists SHEETS_NAME on SHEETS(NAME); >> >> -- >> Best Regards, Meilleures salutations, Met vriendelijke groeten, >> Olivier Mascia ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] without rowid and secondary indexes
Every index is composed of the columns being indexed followed by the primary key of the table. In a regular ROWID table, the "primary key" is the rowid - an integer. In a WITHOUT ROWID table, the primary key is whatever you declare the PRIMARY KEY to be. So whether or not there is more overhead in an index on a WITHOUT ROWID table versus a normal rowid table depends on your choice of the PRIMARY KEY. In your example below, your PRIMARY KEY consists of two integer instead of just a single rowid integer. So it will take up slightly more space. Not a lot, but some. Will that make a difference in your application? I do not know. I suggest you try it and see. This is one of the great beauties of SQL (any SQL not just SQLite). You can make a simple schema change like this and try it out, without having to modify your application - indeed without having to even recompile your application. Simply run the same build of your application using two different versions of the database - one with an ordinary rowid table and the other with a WITHOUT ROWID table - and see which one gives you better performance - for whatever definition of "performance" is important to you. On 4/10/18, Olivier Masciawrote: > Dear all, > > With https://www.sqlite.org/withoutrowid.html in mind, assuming something > simple as: > > create table if not exists SHEETS( > NODE integer not null, > SHEET integer not null, > NAME text, -- often NULL, sometimes short text value > (keyword) > ..., -- 2 or 3 other small properties > primary key(NODE,SHEET) > ) without rowid; > -- makes sense because small row and natural primary key is not integer > > Now what if an index on NAME becomes valuable? > Will such a secondary index have a higher storage cost or performance cost > than with an ordinary table? > > create index if not exists SHEETS_NAME on SHEETS(NAME); > > -- > Best Regards, Meilleures salutations, Met vriendelijke groeten, > Olivier Mascia > > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users