[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results
[ https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Dyer updated SOLR-4280:
-----------------------------
    Attachment: SOLR-4280.patch

Clean-up patch with slightly better testing and javadoc. Once I can run tests and precommit on it, I will commit this.

> spellcheck.maxResultsForSuggest based on filter query results
> -------------------------------------------------------------
>
> Key: SOLR-4280
> URL: https://issues.apache.org/jira/browse/SOLR-4280
> Project: Solr
> Issue Type: Improvement
> Components: spellchecker
> Reporter: Markus Jelsma
> Assignee: James Dyer
> Fix For: 4.9, Trunk
>
> Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch, SOLR-4280-trunk.patch, SOLR-4280.patch, SOLR-4280.patch, SOLR-4280.patch
>
>
> spellcheck.maxResultsForSuggest takes a fixed number but ideally should be able to take a ratio and calculate that against the maximum number of results the filter queries return.
> At least in our case this would certainly add a lot of value. 99% of our end-users search within one or more filters of which one is always unique. The number of documents for each of those unique filters varies significantly, ranging from 300 to 3,000,000 documents in which they search. The maxResultsForSuggest is set to a reasonably low value so it kind of works fine, but it sometimes leads to undesired suggestions for a large subcorpus that has more misspellings.
> Spun off from SOLR-4278.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results
[ https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Dyer updated SOLR-4280:
-----------------------------
    Fix Version/s: (was: 4.9) (was: Trunk) 5.5

> Assignee: James Dyer
> Fix For: 5.5
> Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch, SOLR-4280-trunk.patch, SOLR-4280.patch, SOLR-4280.patch, SOLR-4280.patch
[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results
[ https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated SOLR-4280:
--------------------------------
    Attachment: SOLR-4280.patch

Updated patch. Added SPELLCHECK_MAX_RESULTS_FOR_SUGGEST_FQ = spellcheck.maxResultsForSuggest.fq to take a filter query. It is only used if maxResultsForSuggest is a fraction.

> Fix For: 4.9, Trunk
> Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch, SOLR-4280-trunk.patch, SOLR-4280.patch, SOLR-4280.patch
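A request using the parameter from the message above might be assembled roughly as follows. This is only a sketch: spellcheck.maxResultsForSuggest.fq exists in the attached patch, not in a released Solr, and the query text, the site_id field, and the class and method names are hypothetical. It also assumes the .fq parameter names the filter whose result count the fraction is applied to, which the message does not spell out.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of Solr or of the patch.
final class SuggestFqRequestSketch {

    /** Joins parameters into a URL query string, URL-encoding each value. */
    static String toQueryString(Map<String, String> params) {
        return params.entrySet().stream()
            .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    /** Builds an illustrative parameter set; every value here is made up. */
    static Map<String, String> exampleParams() {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "solr spelchek");
        params.put("fq", "site_id:42");
        params.put("spellcheck", "true");
        // A fraction rather than an absolute hit count.
        params.put("spellcheck.maxResultsForSuggest", "0.05");
        // Patch-only parameter: the filter query the fraction is assumed to apply to.
        params.put("spellcheck.maxResultsForSuggest.fq", "site_id:42");
        return params;
    }

    public static void main(String[] args) {
        System.out.println("/select?" + toQueryString(exampleParams()));
    }
}
```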
[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results
[ https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Dyer updated SOLR-4280:
-----------------------------
    Attachment: SOLR-4280.patch

Here is an updated patch for Trunk. I've included unit tests and changed the javadoc to reflect the added functionality.

I've also modified how this gets triggered. Rather than introduce a new request parameter, the user passes in "spellcheck.maxResultsForSuggest" as a fraction between 0 and 1. So if the user wants the threshold for triggering suggestions to be 5% of the most-selective filter's result count, they would specify "spellcheck.maxResultsForSuggest=.05". If, for instance, the most-selective filter returns (by itself) 100 documents, then the effective maxResultsForSuggest is 5.

[~markus17] does this all sound right to you? Is this still a feature you want and would be interested in seeing committed?

> Fix For: 4.9, Trunk
> Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch, SOLR-4280-trunk.patch, SOLR-4280.patch
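The arithmetic in the message above (a fraction of the most-selective filter's result count) can be sketched as below. This is not the code from the attached patch. The class and method names are hypothetical, and two details are assumptions: that only values strictly between 0 and 1 are treated as fractions, and that the result is rounded down.

```java
// Hypothetical helper for illustration; not the implementation in the patch.
final class FractionalMaxResultsSketch {

    /**
     * Resolves a configured spellcheck.maxResultsForSuggest value to a hit count.
     * Assumption: a value strictly between 0 and 1 is a fraction of the
     * most-selective filter's result count; any other value is taken as an
     * absolute count. Rounding down is also an assumption.
     */
    static long effectiveMaxResults(double configured, long mostSelectiveFilterHits) {
        if (configured > 0.0 && configured < 1.0) {
            return (long) Math.floor(configured * mostSelectiveFilterHits);
        }
        return (long) configured;
    }

    public static void main(String[] args) {
        // The example from the message: 5% of a 100-document filter.
        System.out.println(effectiveMaxResults(0.05, 100));
        // An absolute value passes through unchanged.
        System.out.println(effectiveMaxResults(5, 100));
    }
}
```

Reusing the existing parameter keeps the request API unchanged, at the cost of making the meaning of the value depend on its magnitude.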
[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results
[ https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley updated SOLR-4280:
-------------------------------
    Fix Version/s: (was: 4.7) 4.8

Fix For: 4.8
Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch, SOLR-4280-trunk.patch
[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results
[ https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated SOLR-4280:
--------------------------------
    Attachment: SOLR-4280-trunk.patch

New patch. This patch now also works in a distributed environment.

Fix For: 4.5, 5.0
Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch, SOLR-4280-trunk.patch
[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results
[ https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated SOLR-4280:
--------------------------------
    Attachment: SOLR-4280-trunk.patch

I forgot I had a working patch lying around. Specify spellcheck.percentageResultsForSuggest=0.25 to force maxResultsForSuggest to be 25% of the smallest filterQuery DocSet. This allows maxResultsForSuggest to be adjusted dynamically based on the filters specified.

It doesn't seem to work in a distributed environment, although the parameters are passed along correctly. I haven't figured that out yet, but all shards return the same collation for undistributed requests. Tips?

Fix For: 4.4
Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch
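To illustrate the dynamic adjustment described in the message above, here is a minimal sketch that picks the smallest filter DocSet size and applies the percentage to it. It is not taken from the patch: the class and method names are hypothetical, the DocSet sizes are passed in as plain numbers rather than read from Solr, and truncating the product to a whole number is an assumption. The sizes 300 and 3,000,000 come from the issue description; the other sizes are made up.

```java
import java.util.List;

// Hypothetical helper for illustration; not the implementation in the patch.
final class SmallestFilterThresholdSketch {

    /** Returns the size of the smallest (most selective) filter DocSet. */
    static long smallestDocSetSize(List<Long> filterDocSetSizes) {
        return filterDocSetSizes.stream()
            .mapToLong(Long::longValue)
            .min()
            .orElseThrow(() -> new IllegalArgumentException("at least one filter query is required"));
    }

    /** Applies the percentage to the smallest DocSet size, truncating to a whole number. */
    static long maxResultsForSuggest(double percentage, List<Long> filterDocSetSizes) {
        return (long) (percentage * smallestDocSetSize(filterDocSetSizes));
    }

    public static void main(String[] args) {
        // A user searching within a 300-document filter.
        System.out.println(maxResultsForSuggest(0.25, List.of(300L, 120_000L)));
        // A user searching within a 3,000,000-document filter.
        System.out.println(maxResultsForSuggest(0.25, List.of(3_000_000L, 5_000_000L)));
    }
}
```

With the same percentage, the threshold scales with the subcorpus the user is actually searching, which a single fixed maxResultsForSuggest cannot do.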
[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results
[ https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated SOLR-4280:
--------------------------------
    Attachment: SOLR-4280-trunk-1.patch

Patch for trunk introducing a spellcheck.percentageResultsForSuggest parameter. It uses the filterCache to check the maximum number of possible results, so whether a term is considered misspelled depends on how large the maximum result set is and on the value of this parameter.

Since the filterCache cannot currently be retrieved from SolrIndexSearcher.getCache(), you'll have to hack into it and have it add the filterCache to the cacheMap somewhere in the constructor:

{code}
cacheMap.put(filterCache.name(), filterCache);
{code}

Fix For: 4.2, 5.0
Attachments: SOLR-4280-trunk-1.patch
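The effect of that one-line change can be shown with a toy model of a name-to-cache map. Nothing below is Solr code: CacheMapSketch and NamedCache are hypothetical stand-ins, and the sketch only assumes, as the message implies, that the name-based lookup consults a map in which the filterCache is not registered by default.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model for illustration; these are stand-ins, not Solr classes.
final class CacheMapSketch {

    /** Stand-in for a named cache; not Solr's real cache type. */
    static final class NamedCache {
        private final String name;

        NamedCache(String name) {
            this.name = name;
        }

        String name() {
            return name;
        }
    }

    private final Map<String, NamedCache> cacheMap = new HashMap<>();
    private final NamedCache filterCache = new NamedCache("filterCache");

    CacheMapSketch(boolean registerFilterCache) {
        if (registerFilterCache) {
            // The registration step quoted in the message above.
            cacheMap.put(filterCache.name(), filterCache);
        }
    }

    /** Name-based lookup; returns null for a cache that was never registered. */
    NamedCache getCache(String name) {
        return cacheMap.get(name);
    }

    public static void main(String[] args) {
        System.out.println(new CacheMapSketch(false).getCache("filterCache") == null);
        System.out.println(new CacheMapSketch(true).getCache("filterCache") != null);
    }
}
```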