Re: [sqlite] FTS5 MATCH using "string*" matches just the exact string
Hi Then I can do nothing about it. Sent from my iPhone > On Shahrivar 31, 1397 AP, at 7:50 PM, Dan Kennedy > wrote: > >> On 09/22/2018 02:11 PM, Maziar Parsijani wrote: >> Hi,Dan >> I have trimmed it here you can download it: >> https://drive.google.com/file/d/1D0xKF3DaOEbvfe9yFf0aFQvpo2nxWYqF/view?usp=sharing >> And the problem is searching with MATCH 'أعلم*' it must find 5 rows. >> it could be the same as I use LIKE "%أعلم%" > > Thaks for the db. > > As far as I can tell it's working correctly. > > The default FTS5 tokenizer splits text on whitespace and punctuation. And > 'token*' only matches tokens that begin with "token", not all tokens that > contain the string "token". > > In utf-8, your search string is X'D8A3D8B9D984D985'. > > So, for example, the LIKE pattern matches the text that, rendered in utf-8, > is: > > X'20D988D8A3D8B9D984D985D988D8A7' > > because it contains the search string starting at the third byte. But the > character before that is U+0648 - "ARABIC LETTER WAW", which is not > whitespace or punctuation. So the search string is not at the start of the > token and the query does not match. > > Dan. > > > > > >> >> >>> On Tue, Sep 11, 2018 at 8:35 PM Dan Kennedy wrote: >>> On 09/11/2018 01:26 PM, Maziar Parsijani wrote: Thanks for your answer Eric,Sandu About the first idea of Eric I have to say its correct for me when I do a test with another fts5 table with no huge rows but my database has that problem.And about the other possibility I tested that before to set "*"at the first and end of the match but I have got an error.Here if there is >>> no problem I will attach an example Arabic text database if you search for 'أعلم*' it must find 83 rows but it will find 49 which is the same with "أعلم" results.and if you put "*"at the both sides you will get error.I don't know if I did something wrong during creation but this is all I >>> have got. >>> >>> The "*" may only occur following a string in an fts5 expression, not >>> before: >>> >>> https://sqlite.org/fts5.html#fts5_prefix_queries >>> >>> If possible, can you trim the database down to just two rows that >>> demonstrate the problem and upload it somewhere? i.e. so that it >>> contains two rows that should both be matched by 'أعلم*', but for which >>> the current FTS5 only matches one. >>> >>> Even if the complete database makes the problem obvious to Arabic >>> readers, it will be quite difficult for non-readers to deal with. >>> >>> Thanks, >>> Dan. >>> >>> >>> >>> >>> On Tue, Sep 11, 2018 at 10:20 AM Sandu Buraga wrote: > In Arabic you have a different direction of the text. Might be that the > reason? > > Sandu > > Eric Minbiole schrieb am Di., 11. Sep. 2018, >>> 00:23: > >>> SELECT rowid,text >>> FROM table >>> WHERE table MATCH 'أعلم*'; >>> >>> And I have to add this that my data is Arabic text. >>> >>> This method must find words that contains 'أعلم' but it doesn't. What >>> should I do now. >> I just tried this in SQLite version 3.24.0 on a Mac, and it seems to > work. >> Please see my example below. In particular, the first select doesn't > have a >> *, so it doesn't find the word. The second example includes a trailing >>> *, >> so it performs a "starts with" search, and does find the match. (Note > that >> I simply doubled the letters of your original arabic text to make a > longer >> word.) >> >> MacBook-II:Programs eric$ ./sqlite3 >>> SQLite version 3.24.0 2018-06-04 19:24:41 >>> Enter ".help" for usage hints. >>> Connected to a transient in-memory database. >>> Use ".open FILENAME" to reopen on a persistent database. >>> sqlite> CREATE VIRTUAL TABLE fts USING fts5(doc); >>> sqlite> INSERT INTO fts (rowid, doc) VALUES (1, 'english text'); >>> sqlite> INSERT INTO fts (rowid, doc) VALUES (2, 'arabic أعلمأعلم'); >>> sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم'; >>> sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم*'; >>> 2 >>> sqlite> >> >> One possible explanation: You mentioned that you want a "contains" > search. >> However, the trailing asterisk in your example only results in a >>> "begins >> with" search. If you really want a "contains" search, you'll need to >>> put >> the * at both the beginning and end of the match word. E.g., "WHERE fts >> MATCH '*asdf*'; >> >> Hope this is helpful. >> >> ~Eric >> ___ >> sqlite-users mailing list >> sqlite-users@mailinglists.sqlite.org >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org >
Re: [sqlite] FTS5 MATCH using "string*" matches just the exact string
On 09/22/2018 02:11 PM, Maziar Parsijani wrote: Hi,Dan I have trimmed it here you can download it: https://drive.google.com/file/d/1D0xKF3DaOEbvfe9yFf0aFQvpo2nxWYqF/view?usp=sharing And the problem is searching with MATCH 'أعلم*' it must find 5 rows. it could be the same as I use LIKE "%أعلم%" Thaks for the db. As far as I can tell it's working correctly. The default FTS5 tokenizer splits text on whitespace and punctuation. And 'token*' only matches tokens that begin with "token", not all tokens that contain the string "token". In utf-8, your search string is X'D8A3D8B9D984D985'. So, for example, the LIKE pattern matches the text that, rendered in utf-8, is: X'20D988D8A3D8B9D984D985D988D8A7' because it contains the search string starting at the third byte. But the character before that is U+0648 - "ARABIC LETTER WAW", which is not whitespace or punctuation. So the search string is not at the start of the token and the query does not match. Dan. On Tue, Sep 11, 2018 at 8:35 PM Dan Kennedy wrote: On 09/11/2018 01:26 PM, Maziar Parsijani wrote: Thanks for your answer Eric,Sandu About the first idea of Eric I have to say its correct for me when I do a test with another fts5 table with no huge rows but my database has that problem.And about the other possibility I tested that before to set "*"at the first and end of the match but I have got an error.Here if there is no problem I will attach an example Arabic text database if you search for 'أعلم*' it must find 83 rows but it will find 49 which is the same with "أعلم" results.and if you put "*"at the both sides you will get error.I don't know if I did something wrong during creation but this is all I have got. The "*" may only occur following a string in an fts5 expression, not before: https://sqlite.org/fts5.html#fts5_prefix_queries If possible, can you trim the database down to just two rows that demonstrate the problem and upload it somewhere? i.e. so that it contains two rows that should both be matched by 'أعلم*', but for which the current FTS5 only matches one. Even if the complete database makes the problem obvious to Arabic readers, it will be quite difficult for non-readers to deal with. Thanks, Dan. On Tue, Sep 11, 2018 at 10:20 AM Sandu Buraga wrote: In Arabic you have a different direction of the text. Might be that the reason? Sandu Eric Minbiole schrieb am Di., 11. Sep. 2018, 00:23: SELECT rowid,text FROM table WHERE table MATCH 'أعلم*'; And I have to add this that my data is Arabic text. This method must find words that contains 'أعلم' but it doesn't. What should I do now. I just tried this in SQLite version 3.24.0 on a Mac, and it seems to work. Please see my example below. In particular, the first select doesn't have a *, so it doesn't find the word. The second example includes a trailing *, so it performs a "starts with" search, and does find the match. (Note that I simply doubled the letters of your original arabic text to make a longer word.) MacBook-II:Programs eric$ ./sqlite3 SQLite version 3.24.0 2018-06-04 19:24:41 Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database. sqlite> CREATE VIRTUAL TABLE fts USING fts5(doc); sqlite> INSERT INTO fts (rowid, doc) VALUES (1, 'english text'); sqlite> INSERT INTO fts (rowid, doc) VALUES (2, 'arabic أعلمأعلم'); sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم'; sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم*'; 2 sqlite> One possible explanation: You mentioned that you want a "contains" search. However, the trailing asterisk in your example only results in a "begins with" search. If you really want a "contains" search, you'll need to put the * at both the beginning and end of the match word. E.g., "WHERE fts MATCH '*asdf*'; Hope this is helpful. ~Eric ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS5 MATCH using "string*" matches just the exact string
Hi,Dan I have trimmed it here you can download it: https://drive.google.com/file/d/1D0xKF3DaOEbvfe9yFf0aFQvpo2nxWYqF/view?usp=sharing And the problem is searching with MATCH 'أعلم*' it must find 5 rows. it could be the same as I use LIKE "%أعلم%" On Tue, Sep 11, 2018 at 8:35 PM Dan Kennedy wrote: > On 09/11/2018 01:26 PM, Maziar Parsijani wrote: > > Thanks for your answer Eric,Sandu > > About the first idea of Eric I have to say its correct for me when I do a > > test with another fts5 table with no huge rows but my database has that > > problem.And about the other possibility I tested that before to set "*"at > > the first and end of the match but I have got an error.Here if there is > no > > problem I will attach an example Arabic text database if you search for > > 'أعلم*' it must find 83 rows but it will find 49 which is the same with > > "أعلم" results.and if you put "*"at the both sides you will get error.I > > don't know if I did something wrong during creation but this is all I > have > > got. > > The "*" may only occur following a string in an fts5 expression, not > before: > >https://sqlite.org/fts5.html#fts5_prefix_queries > > If possible, can you trim the database down to just two rows that > demonstrate the problem and upload it somewhere? i.e. so that it > contains two rows that should both be matched by 'أعلم*', but for which > the current FTS5 only matches one. > > Even if the complete database makes the problem obvious to Arabic > readers, it will be quite difficult for non-readers to deal with. > > Thanks, > Dan. > > > > > > > > > On Tue, Sep 11, 2018 at 10:20 AM Sandu Buraga > > wrote: > > > >> In Arabic you have a different direction of the text. Might be that the > >> reason? > >> > >> Sandu > >> > >> Eric Minbiole schrieb am Di., 11. Sep. 2018, > 00:23: > >> > SELECT rowid,text > FROM table > WHERE table MATCH 'أعلم*'; > > And I have to add this that my data is Arabic text. > > This method must find words that contains 'أعلم' but it doesn't. What > should I do now. > > > >>> I just tried this in SQLite version 3.24.0 on a Mac, and it seems to > >> work. > >>> Please see my example below. In particular, the first select doesn't > >> have a > >>> *, so it doesn't find the word. The second example includes a trailing > *, > >>> so it performs a "starts with" search, and does find the match. (Note > >> that > >>> I simply doubled the letters of your original arabic text to make a > >> longer > >>> word.) > >>> > >>> MacBook-II:Programs eric$ ./sqlite3 > SQLite version 3.24.0 2018-06-04 19:24:41 > Enter ".help" for usage hints. > Connected to a transient in-memory database. > Use ".open FILENAME" to reopen on a persistent database. > sqlite> CREATE VIRTUAL TABLE fts USING fts5(doc); > sqlite> INSERT INTO fts (rowid, doc) VALUES (1, 'english text'); > sqlite> INSERT INTO fts (rowid, doc) VALUES (2, 'arabic أعلمأعلم'); > sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم'; > sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم*'; > 2 > sqlite> > > >>> > >>> One possible explanation: You mentioned that you want a "contains" > >> search. > >>> However, the trailing asterisk in your example only results in a > "begins > >>> with" search. If you really want a "contains" search, you'll need to > put > >>> the * at both the beginning and end of the match word. E.g., "WHERE fts > >>> MATCH '*asdf*'; > >>> > >>> Hope this is helpful. > >>> > >>> ~Eric > >>> ___ > >>> sqlite-users mailing list > >>> sqlite-users@mailinglists.sqlite.org > >>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > >>> > >> ___ > >> sqlite-users mailing list > >> sqlite-users@mailinglists.sqlite.org > >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > >> > > ___ > > sqlite-users mailing list > > sqlite-users@mailinglists.sqlite.org > > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS5 MATCH using "string*" matches just the exact string
On 09/11/2018 01:26 PM, Maziar Parsijani wrote: Thanks for your answer Eric,Sandu About the first idea of Eric I have to say its correct for me when I do a test with another fts5 table with no huge rows but my database has that problem.And about the other possibility I tested that before to set "*"at the first and end of the match but I have got an error.Here if there is no problem I will attach an example Arabic text database if you search for 'أعلم*' it must find 83 rows but it will find 49 which is the same with "أعلم" results.and if you put "*"at the both sides you will get error.I don't know if I did something wrong during creation but this is all I have got. The "*" may only occur following a string in an fts5 expression, not before: https://sqlite.org/fts5.html#fts5_prefix_queries If possible, can you trim the database down to just two rows that demonstrate the problem and upload it somewhere? i.e. so that it contains two rows that should both be matched by 'أعلم*', but for which the current FTS5 only matches one. Even if the complete database makes the problem obvious to Arabic readers, it will be quite difficult for non-readers to deal with. Thanks, Dan. On Tue, Sep 11, 2018 at 10:20 AM Sandu Buraga wrote: In Arabic you have a different direction of the text. Might be that the reason? Sandu Eric Minbiole schrieb am Di., 11. Sep. 2018, 00:23: SELECT rowid,text FROM table WHERE table MATCH 'أعلم*'; And I have to add this that my data is Arabic text. This method must find words that contains 'أعلم' but it doesn't. What should I do now. I just tried this in SQLite version 3.24.0 on a Mac, and it seems to work. Please see my example below. In particular, the first select doesn't have a *, so it doesn't find the word. The second example includes a trailing *, so it performs a "starts with" search, and does find the match. (Note that I simply doubled the letters of your original arabic text to make a longer word.) MacBook-II:Programs eric$ ./sqlite3 SQLite version 3.24.0 2018-06-04 19:24:41 Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database. sqlite> CREATE VIRTUAL TABLE fts USING fts5(doc); sqlite> INSERT INTO fts (rowid, doc) VALUES (1, 'english text'); sqlite> INSERT INTO fts (rowid, doc) VALUES (2, 'arabic أعلمأعلم'); sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم'; sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم*'; 2 sqlite> One possible explanation: You mentioned that you want a "contains" search. However, the trailing asterisk in your example only results in a "begins with" search. If you really want a "contains" search, you'll need to put the * at both the beginning and end of the match word. E.g., "WHERE fts MATCH '*asdf*'; Hope this is helpful. ~Eric ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS5 MATCH using "string*" matches just the exact string
Thanks for your answer Eric,Sandu About the first idea of Eric I have to say its correct for me when I do a test with another fts5 table with no huge rows but my database has that problem.And about the other possibility I tested that before to set "*"at the first and end of the match but I have got an error.Here if there is no problem I will attach an example Arabic text database if you search for 'أعلم*' it must find 83 rows but it will find 49 which is the same with "أعلم" results.and if you put "*"at the both sides you will get error.I don't know if I did something wrong during creation but this is all I have got. On Tue, Sep 11, 2018 at 10:20 AM Sandu Buraga wrote: > In Arabic you have a different direction of the text. Might be that the > reason? > > Sandu > > Eric Minbiole schrieb am Di., 11. Sep. 2018, 00:23: > > > > SELECT rowid,text > > > FROM table > > > WHERE table MATCH 'أعلم*'; > > > > > > And I have to add this that my data is Arabic text. > > > > > > This method must find words that contains 'أعلم' but it doesn't. What > > > should I do now. > > > > > > > > I just tried this in SQLite version 3.24.0 on a Mac, and it seems to > work. > > Please see my example below. In particular, the first select doesn't > have a > > *, so it doesn't find the word. The second example includes a trailing *, > > so it performs a "starts with" search, and does find the match. (Note > that > > I simply doubled the letters of your original arabic text to make a > longer > > word.) > > > > MacBook-II:Programs eric$ ./sqlite3 > > > SQLite version 3.24.0 2018-06-04 19:24:41 > > > Enter ".help" for usage hints. > > > Connected to a transient in-memory database. > > > Use ".open FILENAME" to reopen on a persistent database. > > > sqlite> CREATE VIRTUAL TABLE fts USING fts5(doc); > > > sqlite> INSERT INTO fts (rowid, doc) VALUES (1, 'english text'); > > > sqlite> INSERT INTO fts (rowid, doc) VALUES (2, 'arabic أعلمأعلم'); > > > sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم'; > > > sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم*'; > > > 2 > > > sqlite> > > > > > > > One possible explanation: You mentioned that you want a "contains" > search. > > However, the trailing asterisk in your example only results in a "begins > > with" search. If you really want a "contains" search, you'll need to put > > the * at both the beginning and end of the match word. E.g., "WHERE fts > > MATCH '*asdf*'; > > > > Hope this is helpful. > > > > ~Eric > > ___ > > sqlite-users mailing list > > sqlite-users@mailinglists.sqlite.org > > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS5 MATCH using "string*" matches just the exact string
In Arabic you have a different direction of the text. Might be that the reason? Sandu Eric Minbiole schrieb am Di., 11. Sep. 2018, 00:23: > > SELECT rowid,text > > FROM table > > WHERE table MATCH 'أعلم*'; > > > > And I have to add this that my data is Arabic text. > > > > This method must find words that contains 'أعلم' but it doesn't. What > > should I do now. > > > > > I just tried this in SQLite version 3.24.0 on a Mac, and it seems to work. > Please see my example below. In particular, the first select doesn't have a > *, so it doesn't find the word. The second example includes a trailing *, > so it performs a "starts with" search, and does find the match. (Note that > I simply doubled the letters of your original arabic text to make a longer > word.) > > MacBook-II:Programs eric$ ./sqlite3 > > SQLite version 3.24.0 2018-06-04 19:24:41 > > Enter ".help" for usage hints. > > Connected to a transient in-memory database. > > Use ".open FILENAME" to reopen on a persistent database. > > sqlite> CREATE VIRTUAL TABLE fts USING fts5(doc); > > sqlite> INSERT INTO fts (rowid, doc) VALUES (1, 'english text'); > > sqlite> INSERT INTO fts (rowid, doc) VALUES (2, 'arabic أعلمأعلم'); > > sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم'; > > sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم*'; > > 2 > > sqlite> > > > > One possible explanation: You mentioned that you want a "contains" search. > However, the trailing asterisk in your example only results in a "begins > with" search. If you really want a "contains" search, you'll need to put > the * at both the beginning and end of the match word. E.g., "WHERE fts > MATCH '*asdf*'; > > Hope this is helpful. > > ~Eric > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] FTS5 MATCH using "string*" matches just the exact string
> SELECT rowid,text > FROM table > WHERE table MATCH 'أعلم*'; > > And I have to add this that my data is Arabic text. > > This method must find words that contains 'أعلم' but it doesn't. What > should I do now. > > I just tried this in SQLite version 3.24.0 on a Mac, and it seems to work. Please see my example below. In particular, the first select doesn't have a *, so it doesn't find the word. The second example includes a trailing *, so it performs a "starts with" search, and does find the match. (Note that I simply doubled the letters of your original arabic text to make a longer word.) MacBook-II:Programs eric$ ./sqlite3 > SQLite version 3.24.0 2018-06-04 19:24:41 > Enter ".help" for usage hints. > Connected to a transient in-memory database. > Use ".open FILENAME" to reopen on a persistent database. > sqlite> CREATE VIRTUAL TABLE fts USING fts5(doc); > sqlite> INSERT INTO fts (rowid, doc) VALUES (1, 'english text'); > sqlite> INSERT INTO fts (rowid, doc) VALUES (2, 'arabic أعلمأعلم'); > sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم'; > sqlite> SELECT rowid FROM fts WHERE fts MATCH 'أعلم*'; > 2 > sqlite> > One possible explanation: You mentioned that you want a "contains" search. However, the trailing asterisk in your example only results in a "begins with" search. If you really want a "contains" search, you'll need to put the * at both the beginning and end of the match word. E.g., "WHERE fts MATCH '*asdf*'; Hope this is helpful. ~Eric ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] FTS5 MATCH using "string*" matches just the exact string
Hi I am new here and I am not an expert on sqlite my question is about match option that I have read from https://sqlite.org/fts5.html but the problem is when I want to not find the exact word so I use : SELECT rowid,text FROM table WHERE table MATCH 'أعلم*'; And I have to add this that my data is Arabic text. This method must find words that contains 'أعلم' but it doesn't. What should I do now. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users