Re: [sqlite] FTS2 suggestion

2007-08-29 Thread brian kruse
On 8/29/07, Scott Hess <[EMAIL PROTECTED]> wrote:
> What was fts3 will now be fts4. fts3 will now be
> fts2-with-rowid-fixed. fts3 is already in the tree, but with an
> #error at the top to force people to not use it without reading a
> comment. I was planning to turn that off this week (what

Re: [sqlite] FTS2 suggestion

2007-08-29 Thread Scott Hess
Hmm, and a clarification on the n-gram case ... there are no current plans to implement any n-gram capabilities in fts. This kind of thing has been discussed, but since it still seems like a nice-to-have type thing and not a must-have type thing, no time is being spent on it. I have somewhat of

Re: [sqlite] FTS2 suggestion

2007-08-29 Thread Scott Hess
What was fts3 will now be fts4. fts3 will now be fts2-with-rowid-fixed. fts3 is already in the tree, but with an #error at the top to force people to not use it without reading a comment. I was planning to turn that off this week (what with the SQLite 3.5 stuff going on, might as well!). The

Re: [sqlite] FTS2 suggestion

2007-08-29 Thread Scott Hess
A primary constraint of the porter algorithm in fts is that it's completely unencumbered open-source. That may-or-may-not make it a great stemmer, of course :-). One of the reasons it's in there in the first place is as an example of an alternative to the very basic "simple" fts tokenizer. One

Re: [sqlite] FTS2 suggestion

2007-08-29 Thread brian kruse
On 8/24/07, Scott Hess <[EMAIL PROTECTED]> wrote:
>
> My current focus for the next generation is international support
> (this is more of a Google Gears project, but with focus on SQLite so
> there is likely to be stuff checked in on the SQLite side), and more
> scalable/manageable indexing.

Re: [sqlite] FTS2 suggestion

2007-08-29 Thread Cesar D. Rodas
An n-gram is a sequence of N letters of a word or set of words... http://en.wikipedia.org/wiki/N-gram
On 29/08/2007, Uma Krishnan <[EMAIL PROTECTED]> wrote:
>
> Hello Scott,
>
> I have several clarifications with respect to full text search. I'm a
> newbie in open source development, so please
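To make the n-gram definition above concrete, here is a minimal sketch in Python of splitting a word into overlapping character n-grams. The function name `char_ngrams` is hypothetical and is not part of fts or SQLite; it only illustrates the definition.

```python
def char_ngrams(text, n):
    """Return the overlapping character n-grams of text, in order.

    Hypothetical helper for illustration only; not an fts API.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # A string shorter than n has no n-grams, so the range is empty.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(char_ngrams("sqlite", 3))  # ['sql', 'qli', 'lit', 'ite']
```

With n = 2 the same function yields bigrams, and with n = 3 trigrams.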

Re: [sqlite] FTS2 suggestion

2007-08-29 Thread Uma Krishnan
Hello Scott, I have several clarifications with respect to full text search. I'm a newbie in open source development, so please bear with me if some of the questions are irrelevant/obvious/nonsense. I was given to understand that the Porter stemming algorithm implemented in fts2 is not robust

Re: [sqlite] FTS2 suggestion

2007-08-24 Thread Scott Hess
The Porter stemmer is already in there. The main issue with Porter is that it's English only. There is no general game plan for fuzzy search at this time, though if someone wants to step into the breach, go for it! Even a prototype which demonstrates the concepts and problems but isn't

Re: [sqlite] FTS2 suggestion

2007-08-24 Thread Uma Krishnan
Would it not be more useful to first implement the Porter stemmer algorithm, and then to implement n-gram (as I understand it, n-gram is for cross-column fuzzy search?). What is the general game plan for FTS3 with regard to fuzzy search? Thanks in advance "Cesar D. Rodas" <[EMAIL PROTECTED]>

Re: [sqlite] FTS2 suggestion

2007-08-23 Thread Cesar D. Rodas
I
On 23/08/07, Russell Leighton <[EMAIL PROTECTED]> wrote:
>
>
> Could fts3 (the next fts) have the option to override the default
> 'match' function with one passed in (similar to the tokenizer)?
>
> The reason I ask is then the fts table could be used as smart index
> when the tokenizer is
>

Re: [sqlite] FTS2 suggestion

2007-08-23 Thread Russell Leighton
Could fts3 (the next fts) have the option to override the default 'match' function with one passed in (similar to the tokenizer)? The reason I ask is that the fts table could then be used as a smart index when the tokenizer is something like bigram, trigram, etc. and the 'match' function computes
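The snippet above is cut off before it says what the 'match' function would compute, so the following is only an assumption about the general idea: a hypothetical scoring function that compares the trigram sets of two strings using Jaccard overlap. The names `trigrams` and `trigram_similarity` are made up for this sketch; nothing here is an fts API or the poster's actual proposal.

```python
def trigrams(word):
    """Return the set of lowercase character trigrams of word."""
    w = word.lower()
    return {w[i:i + 3] for i in range(len(w) - 2)}


def trigram_similarity(a, b):
    """Jaccard overlap of the two trigram sets, between 0.0 and 1.0.

    Assumed scoring rule, chosen for illustration only.
    """
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    if not union:
        # Both inputs are shorter than three characters: nothing to compare.
        return 0.0
    return len(ta & tb) / len(union)


print(trigram_similarity("sqlite", "sqlit"))  # 0.75
```

A scoring function of this kind tolerates small spelling differences because a single changed letter only disturbs the few trigrams that contain it.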

Re: [sqlite] FTS2 suggestion

2007-08-23 Thread Scott Hess
It's all interesting, but categorization is hard. Not so hard to get some results, sort of hard to get quality results. Might work as a nice adjunct to fts, so that you can throw the search terms into the categorization engine and put up suggestions for re-running the search with a tighter

Re: [sqlite] FTS2 suggestion

2007-08-23 Thread Cesar D. Rodas
On 23/08/07, Scott Hess <[EMAIL PROTECTED]> wrote:
> On 8/20/07, Cesar D. Rodas <[EMAIL PROTECTED]> wrote:
> > As far as I know (I could be wrong), SQLite Full Text Search only matches whole
> > words, right? It could not be
> > And also no FT extension to db (as far as I know) is misspelling tolerant,
>

Re: [sqlite] FTS2 suggestion

2007-08-23 Thread Scott Hess
On 8/20/07, Cesar D. Rodas <[EMAIL PROTECTED]> wrote:
> As far as I know (I could be wrong), SQLite Full Text Search only matches whole
> words, right? It could not be
> And also no FT extension to db (as far as I know) is misspelling tolerant,
Yes, fts is matching exactly. There is some primitive