Re: Spelt, for better spelling correction

2007-03-22 Thread Martin Haye
To: Yonik Seeley <[EMAIL PROTECTED]> Cc: java-user@lucene.apache.org Sent: Wednesday, March 21, 2007 2:03:50 PM Subject: Re: Spelt, for better spelling correction The dictionary is generated from the corpus, with the result that a larger corpus gives better results. Wo

Re: Spelt, for better spelling correction

2007-03-21 Thread Otis Gospodnetic
Original Message From: Martin Haye <[EMAIL PROTECTED]> To: Yonik Seeley <[EMAIL PROTECTED]> Cc: java-user@lucene.apache.org Sent: Wednesday, March 21, 2007 2:03:50 PM Subject: Re: Spelt, for better spelling correction The dictionary

Re: Spelt, for better spelling correction

2007-03-21 Thread Martin Haye
The dictionary is generated from the corpus, with the result that a larger corpus gives better results. Words are queued up during an index run, and at the end are munged to create an optimized dictionary. It also supports incremental building, though the overhead would be too much for those appl
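
A minimal sketch of the queue-then-build flow described in this message, not Spelt's actual code. It assumes words are simply collected in a list during the index run and collapsed into a word-frequency table at the end. The names `WordQueue`, `add_document`, and `build_dictionary`, and the `min_count` cutoff, are hypothetical and chosen only for illustration.

```python
from collections import Counter
import re


class WordQueue:
    """Collects words seen during an index run (hypothetical helper, not Spelt's API)."""

    def __init__(self):
        self._queued = []

    def add_document(self, text):
        # Queue every lowercased ASCII word; no dictionary work happens yet.
        self._queued.extend(re.findall(r"[a-z]+", text.lower()))

    def build_dictionary(self, min_count=1):
        # End of the index run: collapse the queue into word -> frequency,
        # dropping words seen fewer than min_count times (an assumed cutoff).
        counts = Counter(self._queued)
        self._queued.clear()
        return {word: n for word, n in counts.items() if n >= min_count}


# Usage: two "documents" are indexed, then the dictionary is built once.
queue = WordQueue()
queue.add_document("Lucene indexes text")
queue.add_document("Lucene searches text")
dictionary = queue.build_dictionary(min_count=2)
```

Because the dictionary comes from the indexed text itself, a larger corpus yields more words and more reliable frequency counts, which matches the point made in the message.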

Re: Spelt, for better spelling correction

2007-03-20 Thread Otis Gospodnetic
Original Message From: Martin Haye <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Tuesday, March 20, 2007 6:24:43 PM Subject: Spelt, for better spelling correction As part of XTF, an open source publishing engine that uses Lucene, I develope

Re: Spelt, for better spelling correction

2007-03-20 Thread Yonik Seeley
Sounds interesting, Martin! Is the dictionary static, or is it generated from the corpus or from user queries? -Yonik On 3/20/07, Martin Haye <[EMAIL PROTECTED]> wrote: As part of XTF, an open source publishing engine that uses Lucene, I developed a new spelling correction engine specifically to

Spelt, for better spelling correction

2007-03-20 Thread Martin Haye
As part of XTF, an open source publishing engine that uses Lucene, I developed a new spelling correction engine specifically to provide "Did you mean..." links for misspelled queries. A small group and I are preparing this for submission as a contrib module to Lucene. And we're inviting interested