Re: [sqlite] Bigger table and query optimization
2008/9/9 Igor Tandetnik <[EMAIL PROTECTED]>:
> "Stephen Oberholtzer" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
>> Idea: Submit a patch that allows LIKE expressions that start with a
>> fixed string (i.e. don't start with '%') to use the index to improve
>> performance. (SQL Server 2000 does this.)
>
> http://www.sqlite.org/optoverview.html
> 4.0 The LIKE optimization
>
> Igor Tandetnik
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Stephen,

Good idea (idea 2). Next week I'll try to code a patch including
reverse() and some other functions I feel SQLite could have. :-) And
I'll try to submit it too. About the LIKE indexes (the first idea):
reading Igor's link, I saw it's already supported (with some
restrictions).

Igor,

That doc is very useful. I read through it and found it very
interesting.

Thank you all!
Re: [sqlite] Bigger table and query optimization
On Tue, Sep 9, 2008 at 10:18 PM, Igor Tandetnik <[EMAIL PROTECTED]> wrote:
> "Stephen Oberholtzer" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
>> Idea: Submit a patch that allows LIKE expressions that start with a
>> fixed string (i.e. don't start with '%') to use the index to improve
>> performance. (SQL Server 2000 does this.)
>
> http://www.sqlite.org/optoverview.html
> 4.0 The LIKE optimization
>
> Igor Tandetnik

"For the LIKE operator, if case_sensitive_like mode is enabled then the
column must use the default BINARY collating sequence, or if
case_sensitive_like mode is disabled then the column must use the
built-in NOCASE collating sequence."

There's a flaw in this design -- which explains why, when I actually
tested it, the index wasn't used:

CREATE TABLE foo (id integer primary key, name text, name_back text);
INSERT INTO "foo" VALUES(1,'one','eno');
CREATE INDEX name_back_IX2 on foo(name_back collate nocase);
pragma case_sensitive_like=off;
explain query plan select name_back from foo where name_back like 'e%';

order  from  detail
-----  ----  ------
0      0     TABLE foo

The index "name_back_IX2" is collated NOCASE, which is the same
collation that LIKE uses -- but the index doesn't get used!

--
Stevie-O
Real programmers use COPY CON PROGRAM.EXE
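[Editor's note: the experiment above can be reproduced from any SQLite
driver. Below is a minimal sketch in Python using the same table and
index names from the message. Whether the index is actually used depends
on the SQLite version and the collation rules quoted above, so the plan
is printed for inspection rather than assumed.]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT, name_back TEXT);
    INSERT INTO foo VALUES (1, 'one', 'eno');
    CREATE INDEX name_back_IX2 ON foo(name_back COLLATE NOCASE);
""")
con.execute("PRAGMA case_sensitive_like = off")

# Inspect the query plan; on the 2008-era build discussed above this
# reported a full table scan despite the NOCASE index.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT name_back FROM foo WHERE name_back LIKE 'e%'"
).fetchall()
print(plan)

# The query itself returns the expected row either way.
rows = con.execute(
    "SELECT name_back FROM foo WHERE name_back LIKE 'e%'"
).fetchall()
print(rows)  # [('eno',)]
```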
Re: [sqlite] Bigger table and query optimization
"Stephen Oberholtzer" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Idea: Submit a patch that allows LIKE expressions that start with a > fixed > string (i.e. don't start with '%') to use the index to improve > performance. (SQL Server 2000 does this.) http://www.sqlite.org/optoverview.html 4.0 The LIKE optimization Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Bigger table and query optimization
On Tue, Sep 9, 2008 at 9:04 AM, Bruno Moreira Guedes <[EMAIL PROTECTED]> wrote:
> 2008/9/9 P Kishor <[EMAIL PROTECTED]>:
> > On 9/9/08, Bruno Moreira Guedes <[EMAIL PROTECTED]> wrote:
> >> Hi people!!
> >>
> >> I'm having problems with a big table. I'm doing a query like this:
> >>
> >> SELECT field FROM tables WHERE '%.' || somefield LIKE 'somestring'
> >
> > interesting variation on the syntax. Usually one would
> >
> > WHERE somefield LIKE '%.somestring'
>
> Yes... I need to get the rows where 'something' ENDS with
> '.[field-value]'. But I really don't think this solution is very good.

Store the field *twice* -- once normally, and once *backwards*. Put an
index on the backwards column, and when searching, do:

somefield_backwards >= 'gnitset' and somefield_backwards < 'gnitseu'

(Note that I advanced the last character from 't' to 'u' and used <
instead of <=.) This will enable SQLite to use the index on the
backwards column to efficiently find everything.

Idea: Submit a patch that allows LIKE expressions that start with a
fixed string (i.e. don't start with '%') to use the index to improve
performance. (SQL Server 2000 does this.)

Idea 2: Submit a patch adding a built-in 'reverse' function to assist
in this.

--
Stevie-O
Real programmers use COPY CON PROGRAM.EXE
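[Editor's note: the range trick above can be sketched as follows. The
`suffix_bounds` helper is hypothetical, written for illustration; it
derives the exclusive upper bound by advancing the last character, just
as the message does by hand with 'gnitset' / 'gnitseu'.]

```python
import sqlite3

def suffix_bounds(suffix):
    """Given a suffix to search for, return (lo, hi) such that reversed
    values v with lo <= v < hi are exactly those starting with the
    reversed suffix, i.e. original values ending with the suffix.
    (Simplified: assumes the last character is not the maximum code point.)"""
    lo = suffix[::-1]                      # 'testing' -> 'gnitset'
    hi = lo[:-1] + chr(ord(lo[-1]) + 1)    # 'gnitset' -> 'gnitseu'
    return lo, hi

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sometable (field TEXT, field_backwards TEXT);
    CREATE INDEX backwards_idx ON sometable(field_backwards);
""")
# Store each value twice: normally and backwards.
for value in ("testing", "resting", "tested"):
    con.execute("INSERT INTO sometable VALUES (?, ?)", (value, value[::-1]))

lo, hi = suffix_bounds("testing")
rows = con.execute(
    "SELECT field FROM sometable"
    " WHERE field_backwards >= ? AND field_backwards < ?",
    (lo, hi),
).fetchall()
print(rows)  # [('testing',)]
```

The two-sided range comparison is something SQLite's optimizer can
satisfy directly from the index on the backwards column.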
Re: [sqlite] Bigger table and query optimization
2008/9/9 Igor Tandetnik <[EMAIL PROTECTED]>:
> Bruno Moreira Guedes <[EMAIL PROTECTED]> wrote:
>> I can split my 'something' into st0, st1, st2, stN... I have a '.'
>> between these values, but the number of "st"s is undefined. I tried
>> to do "the necessary" number of comparisons, appending tokens:
>>
>> SELECT fields FROM sometable WHERE field = 'st0.st1.st2.st3' OR field
>> = 'st1.st2.st3' OR field = 'st2.st3' OR field = 'st3';
>>
>> Unfortunately, even after creating an INDEX, the performance
>> decreased instead of increasing.
>
> Try formulating it this way:
>
> SELECT fields FROM sometable WHERE field IN (
>    'st0.st1.st2.st3', 'st1.st2.st3', 'st2.st3', 'st3');
>
> As the number of pieces grows, it might be better to create and fill a
> temporary table with suffixes, then do
>
> SELECT fields FROM sometable WHERE field IN
>    (select suffix from suffixes);
>
> Igor Tandetnik

Hi all!!

Dennis,

Thank you for your idea. I did some testing, and it seems to be the
best solution. Unfortunately, my table has three fields (including the
special rowid field): two integers and a varchar. The recently added
data (about 500,000 rows) grew the database file by 32 MB, so I think
maintaining a 'reverse copy' of the varchar field would increase the
database file size too much, considering it'll grow about 10x in a few
days.

Igor,

Thank you as well. Using IN really solved the problem: the query runs
fast. It requires some manual string processing, because my application
has to generate the "list of something's tails", but since there aren't
many tails, that isn't a problem, and it meets my needs for now.

Thank you all. It's hard to develop database applications with little
database knowledge. Without you I couldn't have made it work.

Bruno M Guedes
Re: [sqlite] Bigger table and query optimization
Bruno Moreira Guedes <[EMAIL PROTECTED]> wrote:
> I can split my 'something' into st0, st1, st2, stN... I have a '.'
> between these values, but the number of "st"s is undefined. I tried to
> do "the necessary" number of comparisons, appending tokens:
>
> SELECT fields FROM sometable WHERE field = 'st0.st1.st2.st3' OR field
> = 'st1.st2.st3' OR field = 'st2.st3' OR field = 'st3';
>
> Unfortunately, even after creating an INDEX, the performance decreased
> instead of increasing.

Try formulating it this way:

SELECT fields FROM sometable WHERE field IN (
    'st0.st1.st2.st3', 'st1.st2.st3', 'st2.st3', 'st3');

As the number of pieces grows, it might be better to create and fill a
temporary table with suffixes, then do

SELECT fields FROM sometable WHERE field IN
    (select suffix from suffixes);

Igor Tandetnik
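[Editor's note: the IN formulation above can be driven from application
code by generating the suffix list. A minimal sketch; the `suffixes`
helper is hypothetical, written for illustration.]

```python
import sqlite3

def suffixes(dotted):
    """'st0.st1.st2' -> ['st0.st1.st2', 'st1.st2', 'st2']"""
    parts = dotted.split(".")
    return [".".join(parts[i:]) for i in range(len(parts))]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sometable (field TEXT)")
con.execute("CREATE INDEX field_idx ON sometable(field)")
con.executemany("INSERT INTO sometable VALUES (?)",
                [("st1.st2.st3",), ("st2.st3",), ("other",)])

tails = suffixes("st0.st1.st2.st3")
# Bind each suffix as a parameter of the IN list.
placeholders = ",".join("?" * len(tails))
rows = con.execute(
    "SELECT field FROM sometable WHERE field IN (%s)" % placeholders,
    tails,
).fetchall()
print(sorted(rows))  # [('st1.st2.st3',), ('st2.st3',)]
```

Each member of the IN list is an exact-equality probe, so the index on
`field` is used for every one.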
Re: [sqlite] Bigger table and query optimization
2008/9/9 Dennis Cote <[EMAIL PROTECTED]>:
> Bruno Moreira Guedes wrote:
>>
>> I can split my 'something' into st0, st1, st2, stN... I have a '.'
>> between these values, but the number of "st"s is undefined. I tried
>> to do "the necessary" number of comparisons, appending tokens:
>>
>> SELECT fields FROM sometable WHERE field = 'st0.st1.st2.st3' OR field
>> = 'st1.st2.st3' OR field = 'st2.st3' OR field = 'st3';
>
> You could try this
>
> select fields from sometable
> where substr(field, -length(:somestring)) = :somestring;

I need to check whether 'something' ends with '.' || field. But this
should still be useful. Thank you!

> This will still require a full table scan and will not use an index,
> but the overhead of testing if the field ends with the appropriate
> string should be as small as possible.
>
> Your main problem is there is no way to use an index to match the end
> of a string. If this is a common operation for your database, you may
> want to add a field that stores the strings in reverse order. You can
> then add an index on that string. What used to be the end of the
> string is now the beginning of the reversed field, and can be searched
> quickly using an index.
>
> You will need to create a user defined function to reverse the
> characters of a string.
>
> reverse('string') => 'gnirts'
>
> With this function you could add a new field to your database and
> create a suitable index on that field.
>
> alter table sometable add column reversed text;
> update sometable set reversed = reverse(field);
> create index field_reversed_idx on sometable(reversed);
>
> Now you can use the same function to reverse the string you are trying
> to match and use a like comparison to locate the strings quickly using
> the index (since the search string is now a prefix of the stored
> string).
>
> select field from sometable
> where reversed like reverse(:somestring) || '%';
>
> HTH
> Dennis Cote

Yes, this is the perfect solution!! I'll work on implementing it.
Thank you again!!

Thank you all,
Bruno
Re: [sqlite] Bigger table and query optimization
Bruno Moreira Guedes wrote:
>
> I can split my 'something' into st0, st1, st2, stN... I have a '.'
> between these values, but the number of "st"s is undefined. I tried to
> do "the necessary" number of comparisons, appending tokens:
>
> SELECT fields FROM sometable WHERE field = 'st0.st1.st2.st3' OR field
> = 'st1.st2.st3' OR field = 'st2.st3' OR field = 'st3';

You could try this

select fields from sometable
where substr(field, -length(:somestring)) = :somestring;

This will still require a full table scan and will not use an index,
but the overhead of testing if the field ends with the appropriate
string should be as small as possible.

Your main problem is there is no way to use an index to match the end
of a string. If this is a common operation for your database, you may
want to add a field that stores the strings in reverse order. You can
then add an index on that string. What used to be the end of the string
is now the beginning of the reversed field, and can be searched quickly
using an index.

You will need to create a user defined function to reverse the
characters of a string.

reverse('string') => 'gnirts'

With this function you could add a new field to your database and
create a suitable index on that field.

alter table sometable add column reversed text;
update sometable set reversed = reverse(field);
create index field_reversed_idx on sometable(reversed);

Now you can use the same function to reverse the string you are trying
to match and use a like comparison to locate the strings quickly using
the index (since the search string is now a prefix of the stored
string).

select field from sometable
where reversed like reverse(:somestring) || '%';

HTH
Dennis Cote
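[Editor's note: SQLite has no built-in reverse() function, but as
Dennis says, a user-defined function fills the gap. A sketch of the
whole recipe using Python's sqlite3 driver; whether the final LIKE
actually uses the index is subject to the collation caveats discussed
elsewhere in this thread.]

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Register the user-defined reverse() function Dennis describes.
con.create_function("reverse", 1,
                    lambda s: s[::-1] if s is not None else None)

con.executescript("""
    CREATE TABLE sometable (field TEXT);
    INSERT INTO sometable VALUES ('foo.testing'), ('foo.resting');
""")

# Add the reversed column, populate it, and index it.
con.execute("ALTER TABLE sometable ADD COLUMN reversed TEXT")
con.execute("UPDATE sometable SET reversed = reverse(field)")
con.execute("CREATE INDEX field_reversed_idx ON sometable(reversed)")

# A suffix search on `field` becomes a prefix search on `reversed`.
rows = con.execute(
    "SELECT field FROM sometable WHERE reversed LIKE reverse(?) || '%'",
    (".testing",),
).fetchall()
print(rows)  # [('foo.testing',)]
```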
Re: [sqlite] Bigger table and query optimization
2008/9/9 Igor Tandetnik <[EMAIL PROTECTED]>:
> "Bruno Moreira Guedes" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
>> Yes... I need to get the rows where 'something' ENDS with
>> '.[field-value]'. But I really don't think this solution is very
>> good.
>
> Can field-value contain a period? If not, you can just extract the
> portion of 'something' after the last period, then run a straight
> comparison: WHERE somefield='tail-of-somestring'. Such a comparison
> can then be sped up with an index.
>
> Igor Tandetnik

Igor,

I can split my 'something' into st0, st1, st2, stN... I have a '.'
between these values, but the number of "st"s is undefined. I tried to
do "the necessary" number of comparisons, appending tokens:

SELECT fields FROM sometable WHERE field = 'st0.st1.st2.st3' OR field
= 'st1.st2.st3' OR field = 'st2.st3' OR field = 'st3';

Unfortunately, even after creating an INDEX, the performance decreased
instead of increasing. Now I'm thinking about using length() to filter
which strings get compared... After some testing I gained a little
performance, but it's still very bad!!

I'm still looking for hints!!

Thank you,
Bruno M Guedes
Re: [sqlite] Bigger table and query optimization
"Bruno Moreira Guedes" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Yes... I need to get the rows where 'something' ENDS wifh > '.[field-value]'. But I really think this solution isn't very well. Can field-value contain a period? If not, you can just extract the portion of 'something' after the last period, then run a straight comparison: WHERE somefield='tail-of-somestring' . Such a comparison can then be sped up with an index. Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Bigger table and query optimization
2008/9/9 P Kishor <[EMAIL PROTECTED]>:
> On 9/9/08, Bruno Moreira Guedes <[EMAIL PROTECTED]> wrote:
>> Hi people!!
>>
>> I'm having problems with a big table. I'm doing a query like this:
>>
>> SELECT field FROM tables WHERE '%.' || somefield LIKE 'somestring'
>
> interesting variation on the syntax. Usually one would
>
> WHERE somefield LIKE '%.somestring'

Yes... I need to get the rows where 'something' ENDS with
'.[field-value]'. But I really don't think this solution is very good.

>> The table has about 500.000 registers. When the table is small, it's
>> very fast, but once it's populated it gets slow. I expected this.
>> What may I do to optimize this query?? Should an INDEX help me, if
>> it's comparing concatenated strings??
>
> LIKE doesn't use INDEXes, so you are really out of luck with big
> tables. Try some other approach. On big tables it will be slow,
> although 500K "registers" (is a "register" the same as "rows"?) is
> not much at all.

Yes, by "registers" I mean "rows". The 500K rows grew the database file
by about 34 MB. And I fear this is just 'a little', because I'll
probably import more big batches of data (each batch with about 500K
rows).

> I would try FTS.

Thank you for the help, I'll read the FTS docs right now!!
Re: [sqlite] Bigger table and query optimization
On 9/9/08, Bruno Moreira Guedes <[EMAIL PROTECTED]> wrote:
> Hi people!!
>
> I'm having problems with a big table. I'm doing a query like this:
>
> SELECT field FROM tables WHERE '%.' || somefield LIKE 'somestring'

interesting variation on the syntax. Usually one would

WHERE somefield LIKE '%.somestring'

> The table has about 500.000 registers. When the table is small, it's
> very fast, but once it's populated it gets slow. I expected this. What
> may I do to optimize this query?? Should an INDEX help me, if it's
> comparing concatenated strings??

LIKE doesn't use INDEXes, so you are really out of luck with big
tables. Try some other approach. On big tables it will be slow,
although 500K "registers" (is a "register" the same as "rows"?) is
not much at all.

I would try FTS.

--
Puneet Kishor http://punkish.eidesis.org/
Nelson Institute for Environmental Studies http://www.nelson.wisc.edu/
Open Source Geospatial Foundation (OSGeo) http://www.osgeo.org/
[sqlite] Bigger table and query optimization
Hi people!!

I'm having problems with a big table. I'm doing a query like this:

SELECT field FROM tables WHERE '%.' || somefield LIKE 'somestring'

The table has about 500.000 registers. When the table is small, it's
very fast, but once it's populated it gets slow. I expected this. What
may I do to optimize this query?? Should an INDEX help me, if it's
comparing concatenated strings??

Thank you all,
Bruno