Re: [sqlite] Creating indexes on non-existent columns with double quotes
On Wed, Jan 25, 2017 at 9:55 AM, Simon Slavinwrote: > > On 25 Jan 2017, at 5:45pm, Ersin Akinci wrote: > >> Thanks Richard and Simon for your insights. I think I'm still missing >> a few things, though. >> >> 1. What does it mean when SQLite tries to create an index on a string? >> Simon suggested that it's creating a calculated index, but I'm not >> sure what means. (Does it just mean an "index" literally just on the >> word "yearz_doesnt_exist"? That seems quite silly.) > > SQLite supports indexes on calculations. Or other expressions. So you > should be able to do things like > > CREATE INDEX fred ON MyTable (CASE WHEN stock < 0 THEN 0 ELSE stock END) > > Since a constant string is an expression, and single quotes must be used to > quote strings, this means you can do something like > > CREATE INDEX mabel ON MyTable ('hello mom') > > It won’t do anything useful, but it should work. > >> 2. And regardless, why is there a different result when using single >> quotes vs. double quotes? > > The two quotes mean completely different things to SQLite. Single quotes are > used to quote strings. Double quotes are used to quote entity names (like > column names, table names, etc.). Properly speaking the version of your > command with double quotes should be rejected for the reason you expected > but, as DRH explained, for historical reasons if there’s no column with the > given name SQLite thinks you want the contents of the quotes as a string. > And that’s what it’s doing. > > Simon. > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users Aha. Makes sense. And as you noted in your bug report, Simon, the single quoted version should create a computed index, as well, but for some reason it doesn't. Thanks for the clarification! -Ersin -- Ersin Y. Akinci -- ersinakinci.com ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Creating indexes on non-existent columns with double quotes
On 25 Jan 2017, at 5:45pm, Ersin Akinciwrote: > Thanks Richard and Simon for your insights. I think I'm still missing > a few things, though. > > 1. What does it mean when SQLite tries to create an index on a string? > Simon suggested that it's creating a calculated index, but I'm not > sure what means. (Does it just mean an "index" literally just on the > word "yearz_doesnt_exist"? That seems quite silly.) SQLite supports indexes on calculations. Or other expressions. So you should be able to do things like CREATE INDEX fred ON MyTable (CASE WHEN stock < 0 THEN 0 ELSE stock END) Since a constant string is an expression, and single quotes must be used to quote strings, this means you can do something like CREATE INDEX mabel ON MyTable ('hello mom') It won’t do anything useful, but it should work. > 2. And regardless, why is there a different result when using single > quotes vs. double quotes? The two quotes mean completely different things to SQLite. Single quotes are used to quote strings. Double quotes are used to quote entity names (like column names, table names, etc.). Properly speaking the version of your command with double quotes should be rejected for the reason you expected but, as DRH explained, for historical reasons if there’s no column with the given name SQLite thinks you want the contents of the quotes as a string. And that’s what it’s doing. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Creating indexes on non-existent columns with double quotes
Thanks Richard and Simon for your insights. I think I'm still missing a few things, though. 1. What does it mean when SQLite tries to create an index on a string? Simon suggested that it's creating a calculated index, but I'm not sure what means. (Does it just mean an "index" literally just on the word "yearz_doesnt_exist"? That seems quite silly.) My assumption is that a calculated index is just an index on a view or some other kind of dynamically rendered column, but according to http://sqlite.1065341.n5.nabble.com/Computed-column-or-create-index-on-a-view-td89442.html, SQLite doesn't support indexes on views. 2. And regardless, why is there a different result when using single quotes vs. double quotes? -- Ersin Y. Akinci -- ersinakinci.com ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Creating indexes on non-existent columns with double quotes
On 23 Jan 2017, at 9:33pm, Ersin Akinciwrote: > CREATE INDEX index_reports_on_yearz_doesnt_exist ON reports > (yearz_doesnt_exist); > CREATE INDEX index_reports_on_yearz_doesnt_exist ON reports > ('yearz_doesnt_exist'); > CREATE INDEX index_reports_on_yearz_doesnt_exist ON reports > ("yearz_doesnt_exist"); > > The first two of these statements will fail with the error 'Error: no such > column: yearz_doesnt_exist', but the last one goes through. The first one is doing what you said. In the second index you are, technically, asking it to create a computed index on a constant string (that’s what the apostrophes mean). I don’t understand why this doesn’t work for me. I’m using SQLite 3.14.0. The third one /should/ do what the first one does, since double-quotes indicate entity-names. > After executing > the last statement, if we look at the output of index_list and index_info, > we get: > > sqlite> PRAGMA index_list(reports); > 0|index_reports_on_yearz_doesnt_exist|0|c|0 > sqlite> PRAGMA index_info(index_reports_on_yearz_doesnt_exist); > 0|-2| I don’t get that. Instead I get sqlite> CREATE INDEX index_reports_quotes ON reports ...> ("yearz_doesnt_exist"); sqlite> PRAGMA index_info('index_reports_quotes'); 0|-2| sqlite> PRAGMA index_xinfo('index_reports_quotes'); 0|-2||0|BINARY|1 1|-1||0|BINARY|0 The "-2" in each case is undocumented. I suspect that that it means 'calculated column' and it’s just not documented yet. [later] I see DRH has cleared this up. SQLite failed to find a column with that name and is interpreting the string as a text string. Therefore it is indeed creating a calculated index. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Creating indexes on non-existent columns with double quotes
On 1/23/17, Ersin Akinciwrote: > CREATE INDEX index_reports_on_yearz_doesnt_exist ON reports > ("yearz_doesnt_exist"); > > > I understand that the double quotation syntax is used to indicate > identifiers. Why am I allowed to create an index on a non-existent column > when I use double quotes? > SQLite has a (mis-)feature that double-quoted names that cannot be resolved to a table or column name are treated as strings. This was a very early design decision, made long before SQLite went viral and found itself running in everything device on the planet, and was intended to make SQLite more compatible with MySQL, which at the time was the most widely deployed database engine in the world. I regret that choice now, but I cannot undo it without breaking backwards compatibility. On the other hand, the ability to index an expression is a relatively new feature. Maybe we can disable the double-quote-to-string mis-feature for that one isolated case. Maybe. I'll look into it. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
Sorry - generally the sorts will be on one column - but they may choose at a later time to sort by another column. They will (but rarely - sort by two or more columns at the same time). On 26 November 2012 14:20, Clemens Ladischwrote: > Paul Sanderson wrote: > > My software creates a large table containing anything between about 250K > > and Millions of rows when first run, the indexes are created immediately > > after the table is populated and the tables do not change afterwards. > > > > The reason for the indexes is that the data is then reviewed by a user > and > > each user will have a different end result in mind and so will be looking > > at (lots of) different subsets of the data but will want pretty much > > instant results, i.e. a few seconds not minutes. I have no idea which > > columns, other than a few of the main columns, that they will need to do > > selects and sorts on. > > If they can doing sorts on more than one column, you need indexes for > all combinations of those columns. (For n columns, that's "only" n! > combinations. Database size might become an issue ...) > > > The size of the database is never an issue. Memory is unlikely to be an > > issue. The initial processing time (data loading and indexing) is not an > > issue. > > Then why not creating covering indexes for every combination of sort > columns? (This is essentially the same as creating lots of sorted > copies of the table.) > > > Regards, > Clemens > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- Paul www.sandersonforensics.com skype: r3scue193 twitter: @sandersonforens Tel +44 (0)1326 572786 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
Paul Sanderson wrote: > My software creates a large table containing anything between about 250K > and Millions of rows when first run, the indexes are created immediately > after the table is populated and the tables do not change afterwards. > > The reason for the indexes is that the data is then reviewed by a user and > each user will have a different end result in mind and so will be looking > at (lots of) different subsets of the data but will want pretty much > instant results, i.e. a few seconds not minutes. I have no idea which > columns, other than a few of the main columns, that they will need to do > selects and sorts on. If they can doing sorts on more than one column, you need indexes for all combinations of those columns. (For n columns, that's "only" n! combinations. Database size might become an issue ...) > The size of the database is never an issue. Memory is unlikely to be an > issue. The initial processing time (data loading and indexing) is not an > issue. Then why not creating covering indexes for every combination of sort columns? (This is essentially the same as creating lots of sorted copies of the table.) Regards, Clemens ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
Thanks for the replies - I'll try and read through them all thoroughly a bit later. But for now a bit of background. My software creates a large table containing anything between about 250K and Millions of rows when first run, the indexes are created immediately after the table is populated and the tables do not change afterwards. This process generally takes 10 minutes to an hour or so. The reason for the indexes is that the data is then reviewed by a user and each user will have a different end result in mind and so will be looking at (lots of) different subsets of the data but will want pretty much instant results, i.e. a few seconds not minutes. I have no idea which columns, other than a few of the main columns, that they will need to do selects and sorts on. It's unlikely to be all of them for a specific user but all users are very likely to use all columns but wont really know which until they start looking at the data and even then the desired end reult is sometimes fluid so requirements change. The size of the database is never an issue. Memory is unlikely to be an issue. The initial processing time (data loading and indexing) is not an issue. The whole target here is to drive the end user experience and reduce any wait time when they are at the keyboard to a minimum. I have considered just indexing the main columns and indexing other columns on demand (first use if you like) but can't see the benefit of that other than saving time and disk space at stage 1, but there is no need for this On 26 November 2012 01:52, Simon Slavinwrote: > > On 25 Nov 2012, at 11:30pm, "Jay A. Kreibich" wrote: > > > If you view an index as an optimization, then the idea > > is usually to increase overall performance, so that there is net win. > [snip] > > > > I disagree with this idea, as it implies there is a 1:1 exchange in > > read performance gains and write performance loss. > > To enlarge upon Jay's monotribe, I'll put these two together. > Optimization of a system sometimes doesn't mean optimizing /net/ > performance, it means speeding things up when time matters. I once got an > ovation from a roomfull of staff by deleting a couple of indexes, not > adding them. Because they spent most of their time entering data, but > reports were generated by other people they never met and didn't care > about. My changes reduced a two-second pause after hitting 'send' to half a > second, massively reducing false keystrokes. > > The report-needing people, on the other hand didn't care. It already took > the system 30 minutes to generate the reports: they pressed the button and > came back an hour later. Changing the 30 minutes to 50 minutes didn't > impact their workflow badly. Out of courtesy I explained why the > end-of-day reports were taking longer and nobody complained. (By the way > this sort of thing is why banks used to close to customers at 3:30pm even > though the staff still worked until 5pm.) > > This was decades ago on a pre-SQL system. One thing I like about SQL is > that the language used to change and consult the database doesn't say > anything about indexes (except in some non-standard extensions). So you > can add or delete indexes when you want without needing to change a line of > code in any of your software and having to deploy new versions of your > software. [insert dramatic evil genius music here] > > Simon. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
On Nov 26, 2012, at 12:30 AM, Jay A. Kreibichwrote: > "Using SQLite" (http://shop.oreilly.com/product/9780596521196.do) > has a very lengthy discussion of indexes and how they work, > specifically because it is difficult to generalize the use of > indexes. One must really look at each application and each database > (and the data in that database!) with a solid knowledge of what an > index can-- or cannot-- provide in order to find places where an > index will make a positive difference. All true, and well articulated. But, at the end of the day, SQLite only provides b-tree indexes and that is that. And b-trees are not that useful and/or detrimental for a wide range of use case. What about providing bitmap indexes? http://en.wikipedia.org/wiki/Bitmap_index Bitmap Index vs. B-tree Index: Which and When? http://www.oracle.com/technetwork/articles/sharma-indexes-093638.html Now, that would be a breakthrough :) ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
On 25 Nov 2012, at 11:30pm, "Jay A. Kreibich"wrote: > If you view an index as an optimization, then the idea > is usually to increase overall performance, so that there is net win. [snip] > > I disagree with this idea, as it implies there is a 1:1 exchange in > read performance gains and write performance loss. To enlarge upon Jay's monotribe, I'll put these two together. Optimization of a system sometimes doesn't mean optimizing /net/ performance, it means speeding things up when time matters. I once got an ovation from a roomfull of staff by deleting a couple of indexes, not adding them. Because they spent most of their time entering data, but reports were generated by other people they never met and didn't care about. My changes reduced a two-second pause after hitting 'send' to half a second, massively reducing false keystrokes. The report-needing people, on the other hand didn't care. It already took the system 30 minutes to generate the reports: they pressed the button and came back an hour later. Changing the 30 minutes to 50 minutes didn't impact their workflow badly. Out of courtesy I explained why the end-of-day reports were taking longer and nobody complained. (By the way this sort of thing is why banks used to close to customers at 3:30pm even though the staff still worked until 5pm.) This was decades ago on a pre-SQL system. One thing I like about SQL is that the language used to change and consult the database doesn't say anything about indexes (except in some non-standard extensions). So you can add or delete indexes when you want without needing to change a line of code in any of your software and having to deploy new versions of your software. [insert dramatic evil genius music here] Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
On Sun, Nov 25, 2012 at 12:41:21PM -0700, Keith Medcalf scratched on the wall: > > On Sunday, 25 November, 2012, 11:58, Jay A. Kreibich wrote: > > > each column is usually undesirable. A given SELECT can usually only > > use one index per query (or sub-query), so it rarely makes sense to > > stack up the indexes... adding unused indexes only slows down > > insert/update/deletes, as well as makes the file size much larger. > > Generally speaking indexes should be treated as a performance trade-off. Ideally they're a performance gain. That's kind of the point. If they're not an overall gain, you likely shouldn't be using them. > When you create an index (other than a UNIQUE index used to enforce a > constraint, or an index on a parent or child key in a foreign-key > relationship, where such an index may greatly increase INSERT or UPDATE > performance) you are "moving" execution time from the retrieval > processing to the maintenance processing of your application. Yes and no. If you view an index as an optimization, then the idea is usually to increase overall performance, so that there is net win. It is true that an index will (hopefully) increase the performance of many queries at the possible cost of additional maintenance processing, but the net change depends on the application's read/write ratios, and the maintenance patterns. In that sense, I'd say the indexes do not move execution *time*, so much as move and adjust execution *cost*. If the application's needs and access patterns are such that the refactored set of costs is lower, the indexes are usually seen as a correct and positive optimization. If the new set of costs results in lower overall performance, the index is seen as a negative thing. There is also nothing special about an automatic UNIQUE index. The costs are the same. The only difference is the motivation for creating it, accepting the write performance cost as the price of enforcing the constraint. Adding a UNIQUE constraint (and the index that goes with it) is not an optimization, but an application requirement. > When you add an index, you are (usually) optimizing retrieval and > query operations. The execution time saved during such query > operations does not disappear (it is not recovered). These processor > cycles and I/O operations are "removed" from retrieval operations > and "spent" when you perform updates to the database to maintain > the indexes. I disagree with this idea, as it implies there is a 1:1 exchange in read performance gains and write performance loss. That is very rarely the case. Even the relative costs of a single read vs a single write are not linear or particularly related. All of these things are highly dependent on the table size, the different indexes, and the operation being performed. It isn't about moving costs from category A to category B, it is about finding efficient work flows that work faster and better. We already know there are some situations when an index will provide significant performance benefits, and other cases when the same index may slow things down. A big part of what the query optimizer must do is identify these situations and pick the best use of the available resources-- in many situations that may be to ignore the index. That's not about shifting execution time, it is about getting rid of it, and picking the fastest path for the situation at hand. The balance is that an index opens up more options to the query optimizer, but there is also an associated cost-- and that cost should only be paid if the index is used from time to time. This balance is true of all operations, not just SELECTs. For example, a good index can make UPDATE and DELETE operations faster, just as it may make SELECTs faster. That kind of blows the theory of moving execution time around, since there can be advantages in both types of operations. Index optimization is in no way a zero-sum game. The bag of water is *not* fixed, and it is just as possible to remove water as it is to add it. > > Additionally, indexing any column (even one used heavily in a query > > filter) is not useful if the index doesn't reduce the working data > > set to ~10% of the rows. In other words, having an index on a column > > that has all one value (be it NULL or anything else) accomplishes > > nothing but significantly slowing down queries that use the index. > > Same is true, for example, of a True/False column with an even > > distribution. Generally, unless the index can be used to eliminate > > the vast majority of rows, it will slow down, rather than speed up, a > > query. > > The exception to this is, of course, where the index created is a > covering index because using a covering index, while it may not > necessarily reduce the number of rows significantly, eliminates the > accesses to the base table to retrieve data
Re: [sqlite] creating indexes on empty columns
On Nov 25, 2012, at 8:41 PM, Keith Medcalfwrote: > (ie, be careful not to just add water into the bag -- the objective is to > poke it around, not just add more water) +1 for the, hmmm, treading water metaphore ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
> On Sunday, 25 November, 2012, 11:58, Jay A. Kreibich wrote: > each column is usually undesirable. A given SELECT can usually only > use one index per query (or sub-query), so it rarely makes sense to > stack up the indexes... adding unused indexes only slows down > insert/update/deletes, as well as makes the file size much larger. Generally speaking indexes should be treated as a performance trade-off. When you create an index (other than a UNIQUE index used to enforce a constraint, or an index on a parent or child key in a foreign-key relationship, where such an index may greatly increase INSERT or UPDATE performance) you are "moving" execution time from the retrieval processing to the maintenance processing of your application. When you add an index, you are (usually) optimizing retrieval and query operations. The execution time saved during such query operations does not disappear (it is not recovered). These processor cycles and I/O operations are "removed" from retrieval operations and "spent" when you perform updates to the database to maintain the indexes. In other words, optimization does not make all operations faster -- the time required for your application to perform its functions can be viewed as a fixed size bag of water. When you optimize some operation by adding an index you are "pushing in" the bag of water in some specific place. It bulges out somewhere else, and some other operation becomes slower because now you have to perform additional operations elsewhere to maintain the indexes. The objective is to save a "huge" amount of processing time in one operation and spend that savings in "wee bits" distributed over many other operations where it is not noticeable. For example, adding a particular index may decrease the time to perform a certain query from 5 minutes to 5 seconds. However, it will add a few hundred milliseconds to each update or insert operation. If this trade-off is acceptable, then you have made a good trade-off. On the other hand, if the indexes you added to optimize the query increases each update/insert by a few seconds, the trade-off may not be acceptable. You may need to find an acceptable middle ground ... > Additionally, indexing any column (even one used heavily in a query > filter) is not useful if the index doesn't reduce the working data > set to ~10% of the rows. In other words, having an index on a column > that has all one value (be it NULL or anything else) accomplishes > nothing but significantly slowing down queries that use the index. > Same is true, for example, of a True/False column with an even > distribution. Generally, unless the index can be used to eliminate > the vast majority of rows, it will slow down, rather than speed up, a > query. The exception to this is, of course, where the index created is a covering index because using a covering index, while it may not necessarily reduce the number of rows significantly, eliminates the accesses to the base table to retrieve data that might be being used in further operations. Care needs to be taken to not prematurely add indexes that will add maintenance cost but not significantly improve query performance (ie, be careful not to just add water into the bag -- the objective is to poke it around, not just add more water). Adding indexes required to enforce or optimize constraint and foreign key operations are almost always required -- but other indexes should not be added until you are sure that they will actually be required because overuse of indexes can severely hamper overall performance. --- () ascii ribbon campaign against html e-mail /\ www.asciiribbon.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
On Sun, Nov 25, 2012 at 01:29:48PM +, Paul Sanderson scratched on the wall: > Yes NULL - > > I underastand that ecvery coumn needs to be read, that is self evident, > however my feeling (not tested) is that the process is much slower than it > needs to be, i..e the process of creating an index on a column whos values > are all NULL takes longer than just reading all of the columns - I have to > admit to not testing/timimg this though, I'll have a go when time permits. An index is essentially a specialized table that uses a different key. Tables and indexes use an almost identical data structure when stored. The main difference is that tables are sorted and keyed via the ROWID column, while indexes are sorted and keyed by one or more indexed columns. Every index has one row for each row in the table. So the question then becomes, if you were creating a one column table and inserted a large number of rows, would you expect a significant difference in the insert time if all the values were the same, or if they were different? > That does lead to another question. Is their a method of creating multiple > indexes at the same time, e.g. create an index on each (or specified) > column in a table in one pass - rather than do each column in turn. This > would save on the overhead of reading the entire table for each column. Scanning the table is not the expensive part of creating an index. Creating a large index requires a significant amount of memory to sort and rebalance the B-Tree used to store the index data. Doing more than one index at a time would make that problem much worse, and would likely be slower, overall, then creating the indexes one at a time. As others have pointed out, it might be worth backing and asking what you're trying to achieve with all these indexes. Putting an index on each column is usually undesirable. A given SELECT can usually only use one index per query (or sub-query), so it rarely makes sense to stack up the indexes... adding unused indexes only slows down insert/update/deletes, as well as makes the file size much larger. Additionally, indexing any column (even one used heavily in a query filter) is not useful if the index doesn't reduce the working data set to ~10% of the rows. In other words, having an index on a column that has all one value (be it NULL or anything else) accomplishes nothing but significantly slowing down queries that use the index. Same is true, for example, of a True/False column with an even distribution. Generally, unless the index can be used to eliminate the vast majority of rows, it will slow down, rather than speed up, a query. Indexes are not magic bullets, and using them properly requires understanding how they work and how they are used. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Intelligence is like underwear: it is important that you have it, but showing it to the wrong people has the tendency to make them feel uncomfortable." -- Angela Johnson ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
On 25 Nov 2012, at 6:13pm, Jay A. Kreibichwrote: > On Sun, Nov 25, 2012 at 04:56:44PM +, Simon Slavin scratched on the wall: >> >> In SQLite, all columns are in all indexes even if the column contains a >> NULL. NULL has a sorting order, and anything that does > > Rows, Simon, rows... not columns. Watch your terminology or your > answers will be more confusing than the questions. Whoops. Right. Thanks. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
On Sun, Nov 25, 2012 at 04:56:44PM +, Simon Slavin scratched on the wall: > > On 25 Nov 2012, at 1:29pm, Paul Sanderson> wrote: > > > I underastand that ecvery coumn needs to be read, that is self evident, > > however my feeling (not tested) is that the process is much slower than it > > needs to be, i..e the process of creating an index on a column whos values > > are all NULL takes longer than just reading all of the columns - I have to > > admit to not testing/timimg this though, I'll have a go when time permits. > > In SQLite, all columns are in all indexes even if the column contains a > NULL. NULL has a sorting order, and anything that does Rows, Simon, rows... not columns. Watch your terminology or your answers will be more confusing than the questions. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Intelligence is like underwear: it is important that you have it, but showing it to the wrong people has the tendency to make them feel uncomfortable." -- Angela Johnson ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
Paul Sanderson wrote: > I underastand that ecvery coumn needs to be read, that is self evident, > however my feeling (not tested) is that the process is much slower than it > needs to be, i..e the process of creating an index on a column whos values > are all NULL takes longer than just reading all of the columns There's also the time needed for sorting the values and constructing and writing the index. > I have to admit to not testing/timimg this though sqlite> .timer on sqlite> SELECT COUNT(*) FROM t; 4194304 CPU Time: user 0.008000 sys 0.00 sqlite> SELECT DISTINCT x FROM t; CPU Time: user 0.608000 sys 0.012000 sqlite> CREATE INDEX i ON t(x); CPU Time: user 6.592000 sys 0.064000 sqlite> DROP INDEX i; CPU Time: user 0.08 sys 0.004000 sqlite> UPDATE t SET x=1; CPU Time: user 4.556000 sys 0.052000 sqlite> SELECT DISTINCT x FROM t; 1 CPU Time: user 0.628000 sys 0.00 sqlite> CREATE INDEX i ON t(x); CPU Time: user 6.648000 sys 0.08 > That does lead to another question. Is their a method of creating multiple > indexes at the same time, e.g. create an index on each (or specified) > column in a table in one pass - rather than do each column in turn. This > would save on the overhead of reading the entire table for each column. No such commands exists. Just have enough file cache. :-) Regards, Clemens ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
On 25 Nov 2012, at 1:29pm, Paul Sandersonwrote: > I underastand that ecvery coumn needs to be read, that is self evident, > however my feeling (not tested) is that the process is much slower than it > needs to be, i..e the process of creating an index on a column whos values > are all NULL takes longer than just reading all of the columns - I have to > admit to not testing/timimg this though, I'll have a go when time permits. In SQLite, all columns are in all indexes even if the column contains a NULL. NULL has a sorting order, and anything that does SELECT * FROM myTable ORDER BY myIndexedColumn still has to return all rows. > That does lead to another question. Is their a method of creating multiple > indexes at the same time, e.g. create an index on each (or specified) > column in a table in one pass - rather than do each column in turn. This > would save on the overhead of reading the entire table for each column. It is very rare to need to index every column of a table. That is generally a sign that what you really want is some sort of three-column attribute store rather than a conventional database. You should know that any SELECT will only use either zero or one index: once you've used one index for sorting, the others are useless because they list rows in an unsorted order. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
On 25 Nov 2012, at 1:29pm, Paul Sandersonwrote: > That does lead to another question. Is their a method of creating multiple > indexes at the same time, e.g. create an index on each (or specified) > column in a table in one pass - rather than do each column in turn. This > would save on the overhead of reading the entire table for each column. Looking again at this I'm not sure I understand why you're indexing every column. Do you understand the difference between creating one index for each column and creating one index on all columns ? It is worth the huge increase in filesize ? Is the slow-down on data entry (where entering one row must modify many indexes) worth the speed-up in your SELECTs ? Do you understand that any SELECT will use only one index ? Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
Yes NULL - I underastand that ecvery coumn needs to be read, that is self evident, however my feeling (not tested) is that the process is much slower than it needs to be, i..e the process of creating an index on a column whos values are all NULL takes longer than just reading all of the columns - I have to admit to not testing/timimg this though, I'll have a go when time permits. That does lead to another question. Is their a method of creating multiple indexes at the same time, e.g. create an index on each (or specified) column in a table in one pass - rather than do each column in turn. This would save on the overhead of reading the entire table for each column. On 24 November 2012 14:40, Clemens Ladischwrote: > Paul Sanderson wrote: > > Whilst building a new app I created an index on every column some of > which > > were empty. > > And with "empty", you mean that every value in that column is NULL? > > > The database is reasonably large (400K rows) and I notcied that > > it seems to take as long to create an index on a column in which all the > > rows are empty as it does on one in which all the rows are unique. > > SQLite still has to read all records to determine what the values are, > and to create the index with all these values and RowIDs. That these > values all happen to have the same value does not change the algorithm. > > > Regards, > Clemens > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
Paul Sanderson wrote: > Whilst building a new app I created an index on every column some of which > were empty. And with "empty", you mean that every value in that column is NULL? > The database is reasonably large (400K rows) and I notcied that > it seems to take as long to create an index on a column in which all the > rows are empty as it does on one in which all the rows are unique. SQLite still has to read all records to determine what the values are, and to create the index with all these values and RowIDs. That these values all happen to have the same value does not change the algorithm. Regards, Clemens ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes on empty columns
Paul Sandersonwrote: > Whilst building a new app I created an index on every column some of which > were empty. The database is reasonably large (400K rows) and I notcied that > it seems to take as long to create an index on a column in which all the > rows are empty as it does on one in which all the rows are unique. > > I don't really have an idea of how sqlite works internally, but this seems > like an area where there could be some optimisation - or anm I totally off > track? What kind of optimization do you have in mind? How exactly do you propose determining that a column has empty values in all rows, any faster than looking at every row? Also, what constitutes an "empty value" - NULL, empty string, zero-length blob, integer 0, something else? -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 alenD wrote: > Should an index be created before insertions or after insertions?:super: What performance difference did you see after trying both options? Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAktc24gACgkQmOOfHg372QT9yQCfbIPLPg/HNZ1xqkv6cnbdDfxP 2KQAoKqgGHzqiz/MCHa90NyC4vfIhxvy =W1wk -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Creating Indexes
On Wed, 6 Aug 2008 16:45:35 -0400, Jeffrey Becker wrote: > On Wed, Aug 6, 2008 at 4:10 PM, > Kees Nuyt <[EMAIL PROTECTED]> wrote: >> Additionally: NodeID and ParentID shouldn't be blobs. >> Integer is the most suitable type for IDs. > >The blobs I'm inserting are actually a binary representation designed >_specifically_ to index properly. So yes they should be blobs. Aha, ok, all clear now. -- ( Kees Nuyt ) c[_] ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Creating Indexes
On Wed, Aug 6, 2008 at 4:10 PM, Kees Nuyt <[EMAIL PROTECTED]> wrote: > On Tue, 05 Aug 2008 17:22:05 -0500, you wrote: > >>Jeffrey Becker wrote: >>> I have a table 'SiteMap' defined as : >>> >>> Create Table SiteMap >>> ( >>> NodeID blob not null PRIMARY KEY, >>> Title text NOT NULL UNIQUE, >>> Url text NOT NULL >>> ); >>> >>> I'd like to index on the node's parent value as defined by the >>> expression ancestor(NodeID,1). 'ancestor' being a user defined >>> function. >>> However sqlite doesnt permit indicies on expressions, only columns. >>> Why is this? More importantly what's my best bet on achieveing >>> something similar? >> >>Create Table SiteMap >>( >> NodeID blob not null PRIMARY KEY, >> ParentID blob not null, >> Title text NOT NULL UNIQUE, >> Url text NOT NULL >>); >> >>Then index on (ParentID, NodeID) or something like that. >> >>-Steve > > Additionally: NodeID and ParentID shouldn't be blobs. > Integer is the most suitable type for IDs. > > http://article.gmane.org/gmane.comp.db.sqlite.general/17286/match=managing+trees+database > and > http://www.sitepoint.com/article/hierarchical-data-database > may help on maintaining trees in a relational database. > > Or search the net on "Joe Celko" and "nested sets". > -- > ( Kees Nuyt > ) > c[_] The blobs I'm inserting are actually a binary representation designed _specifically_ to index properly. So yes they should be blobs. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Creating Indexes
On Tue, 05 Aug 2008 17:22:05 -0500, you wrote: >Jeffrey Becker wrote: >> I have a table 'SiteMap' defined as : >> >> Create Table SiteMap >> ( >> NodeID blob not null PRIMARY KEY, >> Title text NOT NULL UNIQUE, >> Url text NOT NULL >> ); >> >> I'd like to index on the node's parent value as defined by the >> expression ancestor(NodeID,1). 'ancestor' being a user defined >> function. >> However sqlite doesnt permit indicies on expressions, only columns. >> Why is this? More importantly what's my best bet on achieveing >> something similar? > >Create Table SiteMap >( > NodeID blob not null PRIMARY KEY, > ParentID blob not null, > Title text NOT NULL UNIQUE, > Url text NOT NULL >); > >Then index on (ParentID, NodeID) or something like that. > >-Steve Additionally: NodeID and ParentID shouldn't be blobs. Integer is the most suitable type for IDs. http://article.gmane.org/gmane.comp.db.sqlite.general/17286/match=managing+trees+database and http://www.sitepoint.com/article/hierarchical-data-database may help on maintaining trees in a relational database. Or search the net on "Joe Celko" and "nested sets". -- ( Kees Nuyt ) c[_] ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Creating Indexes
Jeffrey Becker wrote: > I have a table 'SiteMap' defined as : > > Create Table SiteMap > ( > NodeID blob not null PRIMARY KEY, > Title text NOT NULL UNIQUE, > Url text NOT NULL > ); > > I'd like to index on the node's parent value as defined by the > expression ancestor(NodeID,1). 'ancestor' being a user defined > function. > However sqlite doesnt permit indicies on expressions, only columns. > Why is this? More importantly what's my best bet on achieveing > something similar? Create Table SiteMap ( NodeID blob not null PRIMARY KEY, ParentID blob not null, Title text NOT NULL UNIQUE, Url text NOT NULL ); Then index on (ParentID, NodeID) or something like that. -Steve ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] creating indexes in attached databases
On 8/19/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: "Jonathan Ellis" <[EMAIL PROTECTED]> wrote: > sqlite> attach 'foo.db' as foo; > sqlite> create table foo.bar (i int); > sqlite> create index foo.bar_i on foo.bar(i); Should be: create index foo.bar_i on bar(i); Thanks, that fixed it. - To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] creating indexes in attached databases
"Jonathan Ellis" <[EMAIL PROTECTED]> wrote: > sqlite> attach 'foo.db' as foo; > sqlite> create table foo.bar (i int); > sqlite> create index foo.bar_i on foo.bar(i); Should be: create index foo.bar_i on bar(i); > SQL error: near ".": syntax error > sqlite> > > Is there another way to do this? > > - > To unsubscribe, send email to [EMAIL PROTECTED] > - > > > . - To unsubscribe, send email to [EMAIL PROTECTED] -