Re: [sqlite] How to know what terms were created using FTS

2012-08-27 Thread Scott Hess
On Sat, Aug 18, 2012 at 10:00 AM, Mohit Sindhwani wrote:
> On 17/8/2012 7:14 PM, Dominique Pellé wrote:
>> This gives the tokens:
>>
>> sqlite> CREATE VIRTUAL TABLE ft USING fts4(x);
>> sqlite> INSERT INTO ft VALUES('hello world');
>> sqlite> INSERT INTO ft VALUES('hello there');
>>
>> sqlite> CREATE VIRTUAL TABLE ft_terms USING fts4aux(ft);
>>
>> sqlite> .header on
>> sqlite> SELECT * FROM ft_terms;
>> term|col|documents|occurrences
>> hello|*|2|2
>> hello|0|2|2
>> there|*|1|1
>> there|0|1|1
>> world|*|1|1
>> world|0|1|1
>
> Actually, I want to know:
> * terms for "hello world" are "hello" and "world"
> * terms for "hello there" are "hello" and "there"
> ...and so on.
>
> The aux table doesn't give an easy way to find that, as far as I can see.

Depending on how often you need to do this, perhaps you could just
create a table in the TEMP database or a memory database, and insert
only a single document.
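
A minimal sketch of that idea in Python, assuming the interpreter's bundled
SQLite was built with FTS4 (most are, but it is not guaranteed). The helper
name terms_for and its tokenizer parameter are made up for illustration; the
ft and ft_terms table names are borrowed from Dominique's example. Each call
builds a throwaway in-memory database holding just the one document, so
fts4aux can only report that document's terms.

```python
import sqlite3


def terms_for(text, tokenizer="simple"):
    """Return the distinct terms FTS4 indexes for one piece of text.

    `terms_for` and `tokenizer` are illustrative names, not part of SQLite.
    """
    # The tokenizer name cannot be bound as a parameter, so validate it
    # before interpolating it into the CREATE statement.
    if not tokenizer.isidentifier():
        raise ValueError("unexpected tokenizer name: %r" % (tokenizer,))

    conn = sqlite3.connect(":memory:")  # throwaway database, one document
    try:
        conn.execute(
            "CREATE VIRTUAL TABLE ft USING fts4(x, tokenize=%s)" % tokenizer
        )
        conn.execute("CREATE VIRTUAL TABLE ft_terms USING fts4aux(ft)")
        conn.execute("INSERT INTO ft VALUES (?)", (text,))
        conn.commit()
        # col = '*' is the row aggregated over all columns, so each term
        # is returned once.
        rows = conn.execute(
            "SELECT term FROM ft_terms WHERE col = '*' ORDER BY term"
        )
        return [row[0] for row in rows]
    finally:
        conn.close()


print(terms_for("hello world"))  # ['hello', 'world']
```

The result is sorted and each distinct term appears once, so this shows which
terms the tokenizer produced, not where they occurred in the text. With the
default "simple" tokenizer, ASCII letters are folded to lower case, so
"SOME TEXT" comes back as "some" and "text".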

-scott
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to know what terms were created using FTS

2012-08-18 Thread Mohit Sindhwani

Hi Dominique,

Thanks!

On 17/8/2012 7:14 PM, Dominique Pellé wrote:
> This gives the tokens:
>
> sqlite> CREATE VIRTUAL TABLE ft USING fts4(x);
> sqlite> INSERT INTO ft VALUES('hello world');
> sqlite> INSERT INTO ft VALUES('hello there');
>
> sqlite> CREATE VIRTUAL TABLE ft_terms USING fts4aux(ft);
>
> sqlite> .header on
> sqlite> SELECT * FROM ft_terms;
> term|col|documents|occurrences
> hello|*|2|2
> hello|0|2|2
> there|*|1|1
> there|0|1|1
> world|*|1|1
> world|0|1|1


Actually, I want to know:
* terms for "hello world" are "hello" and "world"
* terms for "hello there" are "hello" and "there"
...and so on.

The aux table doesn't give an easy way to find that, as far as I can see.

However, thanks for taking the time to reply!  I have received a couple 
of other solutions that may also help!


Best Regards,
Mohit.
19/8/2012 | 12:59 AM.





Re: [sqlite] How to know what terms were created using FTS

2012-08-18 Thread Mohit Sindhwani

On 17/8/2012 6:41 PM, Dan Kennedy wrote:
> On 08/17/2012 03:58 PM, Mohit Sindhwani wrote:
>> Hi Ralf,
>>
>> On 17/8/2012 3:50 PM, Ralf Junker wrote:
>>> On 17.08.2012 09:30, Mohit Sindhwani wrote:
>>>> We're using FTS4 and it works well for many things.  One of the things
>>>> that we'd like to do is to see what terms are being created by the
>>>> tokenizer in use.  What would be the easiest way to do that?
>>>>
>>>> I tried looking through the fts_aux table and the segments and content
>>>> tables, but nothing struck me directly as usable.  Any suggestions?
>>>
>>> http://www.sqlite.org/fts3.html#fts4aux
>>
>> I did look at this - but I couldn't figure out a way that allowed me to
>> see what terms were created by the tokenizer for a particular
>> expression.  Example "SOME TEXT" becomes "SOME", "TEXT" - is there a way
>> to find that?
>
> You could, I suppose, obtain a handle for the table's tokenizer using
> the fts3_tokenizer() function:
>
>   http://www.sqlite.org/fts3.html#section_8_1
>
> Then use it to tokenize your expression using the API in
> fts3_tokenizer.h. See static function "testFunc()" in
> fts3_tokenizer.c for an example.



Thanks, Dan... yes, I guess that should work.  Thanks for pointing me to 
the correct file to go into there.


Best Regards,
Mohit.
19/8/2012 | 12:57 AM.



Re: [sqlite] How to know what terms were created using FTS

2012-08-17 Thread Dominique Pellé
Mohit Sindhwani wrote:

> Hi Ralf,
>
>
>
> On 17/8/2012 3:50 PM, Ralf Junker wrote:
>>
>> On 17.08.2012 09:30, Mohit Sindhwani wrote:
>>
>>> We're using FTS4 and it works well for many things.  One of the things
>>> that we'd like to do is to see what terms are being created by the
>>> tokenizer in use.  What would be the easiest way to do that?
>>>
>>> I tried looking through the fts_aux table and the segments and content
>>> tables, but nothing struck me directly as usable.  Any suggestions?
>>
>> http://www.sqlite.org/fts3.html#fts4aux
>
>
> I did look at this - but I couldn't figure out a way that allowed me to see
> what terms were created by the tokenizer for a particular expression.
> Example "SOME TEXT" becomes "SOME", "TEXT" - is there a way to find that?
>
> Best Regards,
> Mohit.

This gives the tokens:

sqlite> CREATE VIRTUAL TABLE ft USING fts4(x);
sqlite> INSERT INTO ft VALUES('hello world');
sqlite> INSERT INTO ft VALUES('hello there');

sqlite> CREATE VIRTUAL TABLE ft_terms USING fts4aux(ft);

sqlite> .header on
sqlite> SELECT * FROM ft_terms;
term|col|documents|occurrences
hello|*|2|2
hello|0|2|2
there|*|1|1
there|0|1|1
world|*|1|1
world|0|1|1
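
For anyone wanting the same listing from a program, here is a rough Python
sketch, assuming the sqlite3 module's underlying SQLite was built with FTS4.
The function name index_terms is invented for this example. It keeps only the
col = '*' rows, which aggregate over all columns, so each term shows up once
instead of twice as in the shell output above.

```python
import sqlite3


def index_terms(documents):
    """Map each indexed term to (documents, occurrences) via fts4aux.

    `index_terms` is an illustrative helper, not an SQLite API.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE ft USING fts4(x)")
        conn.execute("CREATE VIRTUAL TABLE ft_terms USING fts4aux(ft)")
        conn.executemany(
            "INSERT INTO ft VALUES (?)", [(doc,) for doc in documents]
        )
        conn.commit()
        # Skip the per-column rows (col = '0', '1', ...) and keep the
        # all-columns summary row for each term.
        rows = conn.execute(
            "SELECT term, documents, occurrences FROM ft_terms "
            "WHERE col = '*'"
        )
        return {term: (docs, occ) for term, docs, occ in rows}
    finally:
        conn.close()


print(index_terms(["hello world", "hello there"]))
# {'hello': (2, 2), 'there': (1, 1), 'world': (1, 1)}
```

As in the shell session, these counts are for the whole index; they do not
say which document a term came from.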

Regards
-- Dominique


Re: [sqlite] How to know what terms were created using FTS

2012-08-17 Thread Mohit Sindhwani

Hi Ralf,


On 17/8/2012 3:50 PM, Ralf Junker wrote:
> On 17.08.2012 09:30, Mohit Sindhwani wrote:
>> We're using FTS4 and it works well for many things.  One of the things
>> that we'd like to do is to see what terms are being created by the
>> tokenizer in use.  What would be the easiest way to do that?
>>
>> I tried looking through the fts_aux table and the segments and content
>> tables, but nothing struck me directly as usable.  Any suggestions?
>
> http://www.sqlite.org/fts3.html#fts4aux


I did look at this - but I couldn't figure out a way that allowed me to 
see what terms were created by the tokenizer for a particular 
expression.  Example "SOME TEXT" becomes "SOME", "TEXT" - is there a way 
to find that?


Best Regards,
Mohit.
17/8/2012 | 4:57 PM.





[sqlite] How to know what terms were created using FTS

2012-08-17 Thread Mohit Sindhwani

Hi,

We're using FTS4 and it works well for many things.  One of the things 
that we'd like to do is to see what terms are being created by the 
tokenizer in use.  What would be the easiest way to do that?


I tried looking through the fts_aux table and the segments and content 
tables, but nothing struck me directly as usable.  Any suggestions?


Best Regards,
Mohit.
17/8/2012 | 3:29 PM.

