Grant, I saw your comment and I agree it's probably best to re-query through a Search Handler, either the existing one with all other components turned off, or a new one just for this purpose. If you (or someone else) are not able to work on implementing it this way, then I can probably get a little time in a few weeks.

James Dyer
E-Commerce Systems
Ingram Book Company
(615) 213-4311

-----Original Message-----
From: Grant Ingersoll [mailto:gsing...@apache.org]
Sent: Friday, August 13, 2010 7:34 AM
To: dev@lucene.apache.org
Subject: Re: [jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality

Hi James,

Did you see my comments on the issue?

On Aug 11, 2010, at 12:28 AM, Dyer, James wrote:

> Tom,
>
> I'm going to also need this to work with 1.4.1 within the next month or two so if someone else doesn't back-port it to 1.4.1 then I probably will. I also would like to see this working with shards. The PossibilityIterator class likely can be made a lot simpler. If nobody else takes care of these items I will try to find time to do so myself prior to making it work with 1.4.1.
>
> James Dyer
> E-Commerce Systems
> Ingram Book Company
> (615) 213-4311
>
> -----Original Message-----
> From: Tom Phethean (JIRA) [mailto:j...@apache.org]
> Sent: Tuesday, August 10, 2010 10:01 AM
> To: dev@lucene.apache.org
> Subject: [jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality
>
>
> [ https://issues.apache.org/jira/browse/SOLR-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896903#action_12896903 ]
>
> Tom Phethean commented on SOLR-2010:
> ------------------------------------
>
> Ok, thanks. Do you know if there is a rough timescale on that?
>
>> Improvements to SpellCheckComponent Collate functionality
>> ---------------------------------------------------------
>>
>> Key: SOLR-2010
>> URL: https://issues.apache.org/jira/browse/SOLR-2010
>> Project: Solr
>> Issue Type: New Feature
>> Components: clients - java, spellchecker
>> Affects Versions: 1.4.1
>> Environment: Tested against trunk revision 966633
>> Reporter: James Dyer
>> Assignee: Grant Ingersoll
>> Priority: Minor
>> Attachments: SOLR-2010.patch, SOLR-2010.patch
>>
>>
>> Improvements to SpellCheckComponent Collate functionality
>> Our project requires a better Spell Check Collator. I'm contributing this as a patch to get suggestions for improvements and in case there is a broader need for these features.
>> 1. Only return collations that are guaranteed to result in hits if re-queried (applying original fq params also). This is especially helpful when there is more than one correction per query. The 1.4 behavior does not verify that a particular combination will actually return hits.
>> 2. Provide the option to get multiple collation suggestions.
>> 3. Provide extended collation results including the # of hits re-querying will return and a breakdown of each misspelled word and its correction.
>> This patch is similar to what is described in SOLR-507 item #1. Also, this patch provides a viable workaround for the problem discussed in SOLR-1074. A dictionary could be created that combines the terms from the multiple fields. The collator then would prune out any spurious suggestions this would cause.
>> This patch adds the following spellcheck parameters:
>> 1. spellcheck.maxCollationTries - maximum # of collation possibilities to try before giving up. Lower values ensure better performance. Higher values may be necessary to find a collation that can return results. Default is 0, which maintains backwards-compatible behavior (do not check collations).
>> 2. spellcheck.maxCollations - maximum # of collations to return. Default is 1, which maintains backwards-compatible behavior.
>> 3. spellcheck.collateExtendedResult - if true, returns an expanded response format detailing collations found. Default is false, which maintains backwards-compatible behavior. When true, output is like this (in context):
>> <lst name="spellcheck">
>>   <lst name="suggestions">
>>     <lst name="hopq">
>>       <int name="numFound">94</int>
>>       <int name="startOffset">7</int>
>>       <int name="endOffset">11</int>
>>       <arr name="suggestion">
>>         <str>hope</str>
>>         <str>how</str>
>>         <str>hope</str>
>>         <str>chops</str>
>>         <str>hoped</str>
>>         etc
>>       </arr>
>>     </lst>
>>     <lst name="faill">
>>       <int name="numFound">100</int>
>>       <int name="startOffset">16</int>
>>       <int name="endOffset">21</int>
>>       <arr name="suggestion">
>>         <str>fall</str>
>>         <str>fails</str>
>>         <str>fail</str>
>>         <str>fill</str>
>>         <str>faith</str>
>>         <str>all</str>
>>         etc
>>       </arr>
>>     </lst>
>>     <lst name="collation">
>>       <str name="collationQuery">Title:(how AND fails)</str>
>>       <int name="hits">2</int>
>>       <lst name="misspellingsAndCorrections">
>>         <str name="hopq">how</str>
>>         <str name="faill">fails</str>
>>       </lst>
>>     </lst>
>>     <lst name="collation">
>>       <str name="collationQuery">Title:(hope AND faith)</str>
>>       <int name="hits">2</int>
>>       <lst name="misspellingsAndCorrections">
>>         <str name="hopq">hope</str>
>>         <str name="faill">faith</str>
>>       </lst>
>>     </lst>
>>     <lst name="collation">
>>       <str name="collationQuery">Title:(chops AND all)</str>
>>       <int name="hits">1</int>
>>       <lst name="misspellingsAndCorrections">
>>         <str name="hopq">chops</str>
>>         <str name="faill">all</str>
>>       </lst>
>>     </lst>
>>   </lst>
>> </lst>
>> In addition, SOLRJ is updated to include SpellCheckResponse.getCollatedResults(), which will return the expanded Collation format. getCollatedResult(), which returns a single String, is retained for backwards-compatibility. Other APIs were not changed but will still work provided that spellcheck.collateExtendedResult is false.
>> This likely will not return valid results if using Shards. Rather, a more robust interaction with the index would be necessary than what exists in SpellCheckCollator.collate().
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search
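
For readers following the thread, the collation check described in the quoted issue can be sketched roughly as below. This is an illustration only, not the code in SOLR-2010.patch: the function names, the `count_hits` callback (standing in for the re-query against the index), the plain string replacement, and the order in which combinations are tried are all assumptions made for the example. The hit counts in the stand-in index simply mirror the sample response quoted above.

```python
from itertools import islice, product
from typing import Callable, Dict, List


def apply_corrections(query: str, corrections: Dict[str, str]) -> str:
    # Simplification: plain substring replacement. The sample response
    # carries start/end offsets, which a real collator would more likely use.
    for wrong, right in corrections.items():
        query = query.replace(wrong, right)
    return query


def collate(
    query: str,
    suggestions: Dict[str, List[str]],
    count_hits: Callable[[str], int],
    max_collation_tries: int = 0,
    max_collations: int = 1,
) -> List[dict]:
    """Hypothetical sketch: return up to max_collations rewritten queries.

    Assumes every misspelled word has at least one suggestion.
    """
    words = list(suggestions)
    if not words:
        return []

    if max_collation_tries <= 0:
        # Unchecked behaviour: take the first suggestion for each word
        # and do not verify that the combination returns hits.
        corrections = {w: suggestions[w][0] for w in words}
        return [{
            "collationQuery": apply_corrections(query, corrections),
            "hits": None,
            "misspellingsAndCorrections": corrections,
        }]

    verified = []
    # Assumed ordering: simple cartesian product of the suggestion lists.
    candidates = product(*(suggestions[w] for w in words))
    for combo in islice(candidates, max_collation_tries):
        corrections = dict(zip(words, combo))
        candidate = apply_corrections(query, corrections)
        hits = count_hits(candidate)  # stands in for re-querying the index
        if hits > 0:
            verified.append({
                "collationQuery": candidate,
                "hits": hits,
                "misspellingsAndCorrections": corrections,
            })
            if len(verified) >= max_collations:
                break
    return verified


# Stand-in for the index: query string -> number of hits.
fake_index = {
    "Title:(how AND fails)": 2,
    "Title:(hope AND faith)": 2,
    "Title:(chops AND all)": 1,
}
suggestions = {
    "hopq": ["hope", "how", "chops"],
    "faill": ["fall", "fails", "fail", "faith", "all"],
}
results = collate(
    "Title:(hopq AND faill)",
    suggestions,
    lambda q: fake_index.get(q, 0),
    max_collation_tries=20,
    max_collations=3,
)
for r in results:
    print(r["collationQuery"], r["hits"])
```

The two keyword arguments play the roles of spellcheck.maxCollationTries and spellcheck.maxCollations: the first bounds how many combinations are re-queried, the second bounds how many verified collations come back. With a low try limit the sketch can return nothing at all, which is the performance trade-off the issue describes.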