[ 
https://issues.apache.org/jira/browse/NUTCH-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated NUTCH-72:
-------------------------------


Bulk close of legacy issues:
http://www.lucidimagination.com/search/document/2738eeb014805854/clean_up_open_legacy_issues_in_jira

> Query basic filter with correction feature
> ------------------------------------------
>
>                 Key: NUTCH-72
>                 URL: https://issues.apache.org/jira/browse/NUTCH-72
>             Project: Nutch
>          Issue Type: New Feature
>          Components: searcher
>         Environment: lucene
>            Reporter: Christophe Noel
>         Attachments: querycorrectionplugin.zip
>
>
> This plugin extends the query-basic plugin with a correction feature.
> Lucene includes a FuzzyQuery feature, which searches not only for exactly 
> matching terms but also for very similar terms.
> This plugin is meant to be used in place of query-basic by people looking 
> for an easy way to correct users' queries.
> The Correction Query Plugin can be used in two ways:
> Solution 1: To search for very similar terms, add autocorrectionmod as the 
> first term of the query (example: 'nutch engine' -> 
> 'autocorrectionmod nutch engine').
> Solution 2: Create a new search.jsp page that includes a "correction" 
> checkbox (<input type="checkbox" name="autocorrection" value="true">), 
> which can automatically add 'autocorrectionmod' as the first term of the 
> query.
> FuzzyQuery has one big problem: it is very slow on large indexes.
> So the Correction Query Plugin is restricted as follows:
> - it is not useful for big indexes
> - it only applies to words of 5 characters or more
> - it only looks for words that share the first 2 characters (to improve 
> performance, this should be set to 3 or 4)
> - it only accepts words whose suffixes match at 65% or more (the algorithm 
> is Levenshtein)
> Please give your opinion about it.
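
The rules in the description above can be sketched in plain Java, without any Lucene dependency. This is not the code from querycorrectionplugin.zip: the class and method names are hypothetical, and the similarity formula (1 minus the Levenshtein distance divided by the longer suffix length) is an assumption, since the issue only says "65 % matching suffixes". The marker term, the 5-character minimum and the 2-character prefix are taken from the issue text.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and the similarity formula are assumptions.
final class CorrectionSketch {
    static final String MARKER = "autocorrectionmod"; // marker term from the issue
    static final int MIN_WORD_LENGTH = 5;      // shorter words are never corrected
    static final int PREFIX_LENGTH = 2;        // leading characters that must match
    static final double MIN_SIMILARITY = 0.65; // threshold applied to the suffixes

    /** True if the first term of the query is the marker (Solution 1). */
    static boolean correctionRequested(String query) {
        String[] terms = query.trim().split("\\s+");
        return terms.length > 0 && terms[0].equals(MARKER);
    }

    /** Prepends the marker when the checkbox is ticked (Solution 2). */
    static String applyCheckbox(String query, boolean autocorrection) {
        return autocorrection && !correctionRequested(query) ? MARKER + " " + query : query;
    }

    /** Returns the query terms with a leading marker removed, if present. */
    static List<String> searchTerms(String query) {
        List<String> terms = new ArrayList<>();
        for (String t : query.trim().split("\\s+")) {
            if (!t.isEmpty()) terms.add(t);
        }
        if (!terms.isEmpty() && terms.get(0).equals(MARKER)) terms.remove(0);
        return terms;
    }

    /** Classic two-row dynamic-programming Levenshtein distance. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    /** Applies the word-length, prefix and suffix-similarity rules to one indexed term. */
    static boolean isCandidate(String word, String term) {
        if (word.length() < MIN_WORD_LENGTH || term.length() < PREFIX_LENGTH) return false;
        if (!term.startsWith(word.substring(0, PREFIX_LENGTH))) return false;
        String wordSuffix = word.substring(PREFIX_LENGTH);
        String termSuffix = term.substring(PREFIX_LENGTH);
        int longest = Math.max(wordSuffix.length(), termSuffix.length());
        double similarity = 1.0 - (double) levenshtein(wordSuffix, termSuffix) / longest;
        return similarity >= MIN_SIMILARITY;
    }
}
```

With these assumptions, a query word such as "engine" would accept the indexed term "engime" (same first two characters, one edit in a four-character suffix) but reject "angine", because the prefix differs. Raising PREFIX_LENGTH to 3 or 4, as the description suggests, shrinks the set of terms that have to be compared at all.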

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
