Re: [sqlite] How to know what terms were created using FTS
On Sat, Aug 18, 2012 at 10:00 AM, Mohit Sindhwani wrote:
> On 17/8/2012 7:14 PM, Dominique Pellé wrote:
>> This gives the tokens:
>>
>> sqlite> CREATE VIRTUAL TABLE ft USING fts4(x);
>> sqlite> INSERT INTO ft VALUES('hello world');
>> sqlite> INSERT INTO ft VALUES('hello there');
>>
>> sqlite> CREATE VIRTUAL TABLE ft_terms USING fts4aux(ft);
>>
>> sqlite> .header on
>> sqlite> SELECT * FROM ft_terms;
>> term|col|documents|occurrences
>> hello|*|2|2
>> hello|0|2|2
>> there|*|1|1
>> there|0|1|1
>> world|*|1|1
>> world|0|1|1
>
> Actually, I want to know:
> * terms for "hello world" are "hello" and "world"
> * terms for "hello there" are "hello" and "there"
> ...and so on.
>
> The aux table doesn't give an easy way to find that, as far as I can see.

Depending on how often you need to do this, perhaps you could just create
a table in the TEMP database or a memory database, and insert only a
single document.

-scott
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] How to know what terms were created using FTS
Hi Dominique,

Thanks!

On 17/8/2012 7:14 PM, Dominique Pellé wrote:
> This gives the tokens:
>
> sqlite> CREATE VIRTUAL TABLE ft USING fts4(x);
> sqlite> INSERT INTO ft VALUES('hello world');
> sqlite> INSERT INTO ft VALUES('hello there');
>
> sqlite> CREATE VIRTUAL TABLE ft_terms USING fts4aux(ft);
>
> sqlite> .header on
> sqlite> SELECT * FROM ft_terms;
> term|col|documents|occurrences
> hello|*|2|2
> hello|0|2|2
> there|*|1|1
> there|0|1|1
> world|*|1|1
> world|0|1|1

Actually, I want to know:
* terms for "hello world" are "hello" and "world"
* terms for "hello there" are "hello" and "there"
...and so on.

The aux table doesn't give an easy way to find that, as far as I can see.
However, thanks for taking the time to reply! I have received a couple
of other solutions that may also help!

Best Regards,
Mohit.
19/8/2012 | 12:59 AM.
Re: [sqlite] How to know what terms were created using FTS
On 17/8/2012 6:41 PM, Dan Kennedy wrote:
> On 08/17/2012 03:58 PM, Mohit Sindhwani wrote:
>> Hi Ralf,
>>
>> On 17/8/2012 3:50 PM, Ralf Junker wrote:
>>> On 17.08.2012 09:30, Mohit Sindhwani wrote:
>>>> We're using FTS4 and it works well for many things. One of the things
>>>> that we'd like to do is to see what terms are being created by the
>>>> tokenizer in use. What would be the easiest way to do that?
>>>>
>>>> I tried looking through the fts_aux table and the segments and content
>>>> tables, but nothing struck me directly as usable. Any suggestions?
>>>
>>> http://www.sqlite.org/fts3.html#fts4aux
>>
>> I did look at this - but I couldn't figure out a way that allowed me to
>> see what terms were created by the tokenizer for a particular expression.
>> Example "SOME TEXT" becomes "SOME", "TEXT" - is there a way to find that?
>
> You could, I suppose, obtain a handle for the table's tokenizer using
> the fts3_tokenizer() function:
>
> http://www.sqlite.org/fts3.html#section_8_1
>
> Then use it to tokenize your expression using the API in
> fts3_tokenizer.h. See static function "testFunc()" in fts3_tokenizer.c
> for an example.

Thanks, Dan... yes, I guess that should work. Thanks for pointing me to
the correct file to go into there.

Best Regards,
Mohit.
19/8/2012 | 12:57 AM.
Re: [sqlite] How to know what terms were created using FTS
Mohit Sindhwani wrote:
> Hi Ralf,
>
> On 17/8/2012 3:50 PM, Ralf Junker wrote:
>>
>> On 17.08.2012 09:30, Mohit Sindhwani wrote:
>>
>>> We're using FTS4 and it works well for many things. One of the things
>>> that we'd like to do is to see what terms are being created by the
>>> tokenizer in use. What would be the easiest way to do that?
>>>
>>> I tried looking through the fts_aux table and the segments and content
>>> tables, but nothing struck me directly as usable. Any suggestions?
>>
>> http://www.sqlite.org/fts3.html#fts4aux
>
> I did look at this - but I couldn't figure out a way that allowed me to see
> what terms were created by the tokenizer for a particular expression.
> Example "SOME TEXT" becomes "SOME", "TEXT" - is there a way to find that?
>
> Best Regards,
> Mohit.

This gives the tokens:

sqlite> CREATE VIRTUAL TABLE ft USING fts4(x);
sqlite> INSERT INTO ft VALUES('hello world');
sqlite> INSERT INTO ft VALUES('hello there');

sqlite> CREATE VIRTUAL TABLE ft_terms USING fts4aux(ft);

sqlite> .header on
sqlite> SELECT * FROM ft_terms;
term|col|documents|occurrences
hello|*|2|2
hello|0|2|2
there|*|1|1
there|0|1|1
world|*|1|1
world|0|1|1

Regards
-- Dominique
Re: [sqlite] How to know what terms were created using FTS
On 08/17/2012 03:58 PM, Mohit Sindhwani wrote:
> Hi Ralf,
>
> On 17/8/2012 3:50 PM, Ralf Junker wrote:
>> On 17.08.2012 09:30, Mohit Sindhwani wrote:
>>> We're using FTS4 and it works well for many things. One of the things
>>> that we'd like to do is to see what terms are being created by the
>>> tokenizer in use. What would be the easiest way to do that?
>>>
>>> I tried looking through the fts_aux table and the segments and content
>>> tables, but nothing struck me directly as usable. Any suggestions?
>>
>> http://www.sqlite.org/fts3.html#fts4aux
>
> I did look at this - but I couldn't figure out a way that allowed me to
> see what terms were created by the tokenizer for a particular expression.
> Example "SOME TEXT" becomes "SOME", "TEXT" - is there a way to find that?

You could, I suppose, obtain a handle for the table's tokenizer using
the fts3_tokenizer() function:

http://www.sqlite.org/fts3.html#section_8_1

Then use it to tokenize your expression using the API in
fts3_tokenizer.h. See static function "testFunc()" in fts3_tokenizer.c
for an example.

Dan.
Re: [sqlite] How to know what terms were created using FTS
Hi Ralf,

On 17/8/2012 3:50 PM, Ralf Junker wrote:
> On 17.08.2012 09:30, Mohit Sindhwani wrote:
>> We're using FTS4 and it works well for many things. One of the things
>> that we'd like to do is to see what terms are being created by the
>> tokenizer in use. What would be the easiest way to do that?
>>
>> I tried looking through the fts_aux table and the segments and content
>> tables, but nothing struck me directly as usable. Any suggestions?
>
> http://www.sqlite.org/fts3.html#fts4aux

I did look at this - but I couldn't figure out a way that allowed me to
see what terms were created by the tokenizer for a particular expression.
Example "SOME TEXT" becomes "SOME", "TEXT" - is there a way to find that?

Best Regards,
Mohit.
17/8/2012 | 4:57 PM.
Re: [sqlite] How to know what terms were created using FTS
On 17.08.2012 09:30, Mohit Sindhwani wrote:
> We're using FTS4 and it works well for many things. One of the things
> that we'd like to do is to see what terms are being created by the
> tokenizer in use. What would be the easiest way to do that?
>
> I tried looking through the fts_aux table and the segments and content
> tables, but nothing struck me directly as usable. Any suggestions?

http://www.sqlite.org/fts3.html#fts4aux

Ralf
[sqlite] How to know what terms were created using FTS
Hi,

We're using FTS4 and it works well for many things. One of the things
that we'd like to do is to see what terms are being created by the
tokenizer in use. What would be the easiest way to do that?

I tried looking through the fts_aux table and the segments and content
tables, but nothing struck me directly as usable. Any suggestions?

Best Regards,
Mohit.
17/8/2012 | 3:29 PM.