[sqlite] Full text search without full phrase matches

2012-06-14 Thread pcarb
I had to implement something like this for comparing passages from statutes 
(see the Introduction in Douglas Hay and Paul Craven, *Masters, Servants and 
Magistrates in Britain and the Empire, 1562-1955* [UNC Press, 2004] for an 
illustration).

You need to isolate the keywords, in whatever order, count them, and measure 
the distances (number of words) between them.  SQLite is great for managing the 
tables of keywords, the lists of texts that contain them, and the tables of 
distances.  But it is not the best tool for breaking down the texts and 
extracting the keywords and distances.  I used Perl for that job, and found 
that I could easily adapt recipes from the Perl Cookbook and similar 
repositories to build my routines.  I wrote out the disaggregated lists of 
keywords, distances and texts as SQL tables and analysed them in SQLite.
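
The workflow described above can be sketched roughly as follows. This is 
not the Perl code mentioned in the message; it is a minimal illustration in 
Python, and the table names (texts, hits, distances), the helper functions, 
the sample texts and the ranking rule (more distinct keywords first, then 
smaller mean distance) are all assumptions made for the example.

```python
import re
import sqlite3
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_tables(conn, texts, keywords):
    """Store keyword positions and pairwise keyword distances per text."""
    # Hypothetical schema, chosen only for this sketch.
    conn.executescript("""
        CREATE TABLE texts(id INTEGER PRIMARY KEY, body TEXT);
        CREATE TABLE hits(text_id INTEGER, keyword TEXT, pos INTEGER);
        CREATE TABLE distances(text_id INTEGER, kw_a TEXT, kw_b TEXT,
                               dist INTEGER);
    """)
    wanted = {k.lower() for k in keywords}
    for text_id, body in enumerate(texts, start=1):
        conn.execute("INSERT INTO texts VALUES (?, ?)", (text_id, body))
        # Word position of every keyword occurrence in this text.
        hits = [(w, i) for i, w in enumerate(tokenize(body)) if w in wanted]
        conn.executemany("INSERT INTO hits VALUES (?, ?, ?)",
                         [(text_id, w, i) for w, i in hits])
        # Distance in words between each pair of different keywords.
        for (wa, ia), (wb, ib) in combinations(hits, 2):
            if wa != wb:
                conn.execute("INSERT INTO distances VALUES (?, ?, ?, ?)",
                             (text_id, wa, wb, abs(ib - ia)))


def rank(conn):
    """Rank texts: more distinct keywords first, then smaller mean distance."""
    return conn.execute("""
        SELECT h.text_id,
               COUNT(DISTINCT h.keyword) AS n_keywords,
               (SELECT AVG(d.dist) FROM distances d
                 WHERE d.text_id = h.text_id) AS mean_dist
          FROM hits h
         GROUP BY h.text_id
         ORDER BY n_keywords DESC, mean_dist ASC
    """).fetchall()


# Made-up sample data, for illustration only.
texts = [
    "the master shall pay the servant his wages",
    "wages are due",
    "the servant, having served a full year under his master, may claim wages",
]
conn = sqlite3.connect(":memory:")
build_tables(conn, texts, ["master", "servant", "wages"])
rows = rank(conn)
for row in rows:
    print(row)
```

A text that contains only one of the keywords has no distances recorded, so 
its mean distance comes back as NULL (None in Python); it still appears in 
the results, below the texts that match more keywords.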

Paul Craven
York University

--

Date: Wed, 13 Jun 2012 23:09:35 +0200
From: Philip Bennefall 
To: 
Subject: [sqlite] Full text search without full phrase matches
Message-ID: 
Content-Type: text/plain; charset="iso-8859-1"

Hi all,

I am new to this mailing list and to SQLite, so I wanted to start by thanking 
all of those who make this project a reality. It is a great tool.

Now, to my question. I am trying to use the full text search feature to find 
rough matches for a chat robot. Basically I want to match as many keywords as 
possible, but not necessarily all of them. The results should be sorted based 
on how many keywords were found in the phrase and how closely ordered they are 
to the query. In other words the ordering doesn't have to be exact, but the 
closer it is, the higher the result should rank. Similarly, even if only one or 
two words in the phrase are found it should match, but rank higher the more of 
the words that are present. I have read the reference and I see the NEAR 
statement and the matchinfo function, as well as the example of how to use it, 
but I cannot figure out how to apply this knowledge to my specific problem. 
Does anyone have any suggestions?

Thanks in advance for your help.

Kind regards,

Philip Bennefall
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

