[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results

2015-12-17 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer updated SOLR-4280:
-
Attachment: SOLR-4280.patch

Cleaned-up patch with slightly better tests and javadoc.  Once I can run the 
tests and precommit on it, I will commit this.

> spellcheck.maxResultsForSuggest based on filter query results
> -
>
> Key: SOLR-4280
> URL: https://issues.apache.org/jira/browse/SOLR-4280
> Project: Solr
>  Issue Type: Improvement
>  Components: spellchecker
>Reporter: Markus Jelsma
>Assignee: James Dyer
> Fix For: 4.9, Trunk
>
> Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch, 
> SOLR-4280-trunk.patch, SOLR-4280.patch, SOLR-4280.patch, SOLR-4280.patch
>
>
> spellcheck.maxResultsForSuggest takes a fixed number but ideally should be 
> able to take a ratio and calculate that against the maximum number of results 
> the filter queries return.
> At least in our case this would certainly add a lot of value. >99% of our 
> end-users search within one or more filters, of which one is always unique. 
> The number of documents for each of those unique filters varies significantly, 
> ranging from 300 to 3.000.000 documents in which they search. 
> maxResultsForSuggest is set to a reasonably low value, so it mostly works, 
> but it sometimes leads to undesired suggestions for a large subcorpus that 
> has more misspellings.
> Spun off from SOLR-4278.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results

2015-12-17 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer updated SOLR-4280:
-
Fix Version/s: (was: 4.9)
   (was: Trunk)
   5.5

> spellcheck.maxResultsForSuggest based on filter query results
> -
>
> Key: SOLR-4280
> URL: https://issues.apache.org/jira/browse/SOLR-4280
> Project: Solr
>  Issue Type: Improvement
>  Components: spellchecker
>Reporter: Markus Jelsma
>Assignee: James Dyer
> Fix For: 5.5
>
> Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch, 
> SOLR-4280-trunk.patch, SOLR-4280.patch, SOLR-4280.patch, SOLR-4280.patch
>
>






[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results

2015-12-09 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-4280:

Attachment: SOLR-4280.patch

Updated patch. Added the constant SPELLCHECK_MAX_RESULTS_FOR_SUGGEST_FQ 
(spellcheck.maxResultsForSuggest.fq), which takes a filter query. It is only 
used if maxResultsForSuggest is a fraction.
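
As an illustration, a request that uses both parameters might be assembled as 
below. This is only a sketch: the class and method names, the example query, 
and the site:example.org filter are made up for the example. Only the two 
spellcheck.maxResultsForSuggest parameter names come from the patch 
description, and spellcheck=true is the usual switch for the spellcheck 
component.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the attached patch. */
final class SpellcheckRequestSketch {

    /** Builds a URL-encoded query string carrying the fraction and its filter query. */
    static String buildQueryString(String q, String fraction, String thresholdFq) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);
        params.put("spellcheck", "true");
        // A value between 0 and 1 is meant to be read as a fraction.
        params.put("spellcheck.maxResultsForSuggest", fraction);
        // Filter query the fraction is calculated against (per the patch note).
        params.put("spellcheck.maxResultsForSuggest.fq", thresholdFq);

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildQueryString("helo world", "0.05", "site:example.org"));
    }
}
```

The resulting string would be appended to the URL of whichever request 
handler has the spellcheck component configured.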


> spellcheck.maxResultsForSuggest based on filter query results
> -
>
> Key: SOLR-4280
> URL: https://issues.apache.org/jira/browse/SOLR-4280
> Project: Solr
>  Issue Type: Improvement
>  Components: spellchecker
>Reporter: Markus Jelsma
> Fix For: 4.9, Trunk
>
> Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch, 
> SOLR-4280-trunk.patch, SOLR-4280.patch, SOLR-4280.patch
>
>






[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results

2015-12-07 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer updated SOLR-4280:
-
Attachment: SOLR-4280.patch

Here is an updated patch for Trunk.  I've included unit tests and updated the 
javadoc to reflect the added functionality.  I've also changed how this gets 
triggered.  Rather than introducing a new request parameter, the user passes 
"spellcheck.maxResultsForSuggest" as a fraction between 0 and 1.  So if the 
user wants suggestions to be triggered only when a query returns no more than 
5% of the most-selective filter's results, they would specify 
"spellcheck.maxResultsForSuggest=.05".  If, for instance, the most-selective 
filter returns (by itself) 100 documents, then the effective maximum number of 
hits that will still trigger spelling suggestions is 5.
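
The arithmetic above can be sketched roughly as follows. This is not the code 
from the attached patch: the class and method names are hypothetical, and the 
rounding (floor), the handling of a request without filters, and the 
"hits <= threshold" comparison are assumptions based on how the parameter is 
described here.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of the fractional threshold; not the patch itself. */
final class MaxResultsForSuggestSketch {

    /**
     * Resolves the configured value into an absolute hit threshold.
     * Values strictly between 0 and 1 are treated as a fraction of the
     * smallest (most selective) filter result count; any other value is
     * treated as a fixed hit count.
     */
    static int effectiveMaxResults(double configured, List<Integer> filterResultCounts) {
        boolean isFraction = configured > 0.0 && configured < 1.0;
        if (!isFraction || filterResultCounts.isEmpty()) {
            // Assumption: with no filters there is nothing to scale against,
            // so the value is simply truncated to an integer.
            return (int) configured;
        }
        int smallest = Integer.MAX_VALUE;
        for (int count : filterResultCounts) {
            smallest = Math.min(smallest, count);
        }
        // Assumption: round down to a whole number of documents.
        return (int) Math.floor(configured * smallest);
    }

    /** Assumption: suggestions are generated when hits do not exceed the threshold. */
    static boolean suggestionsTriggered(long hits, int threshold) {
        return hits <= threshold;
    }

    public static void main(String[] args) {
        // Two filters that match 100 and 2500 documents on their own.
        int threshold = effectiveMaxResults(0.05, Arrays.asList(100, 2500));
        System.out.println(threshold);                          // 5
        System.out.println(suggestionsTriggered(3, threshold)); // true
    }
}
```

With a fixed value such as 5, the same threshold would apply whether the 
most-selective filter matches 300 or 3.000.000 documents; with a fraction, 
the threshold scales with the filter.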

[~markus17] does this all sound right to you?  Is this still a feature you want 
and would be interested in seeing committed?

> spellcheck.maxResultsForSuggest based on filter query results
> -
>
> Key: SOLR-4280
> URL: https://issues.apache.org/jira/browse/SOLR-4280
> Project: Solr
>  Issue Type: Improvement
>  Components: spellchecker
>Reporter: Markus Jelsma
> Fix For: 4.9, Trunk
>
> Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch, 
> SOLR-4280-trunk.patch, SOLR-4280.patch
>
>






[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results

2014-03-15 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-4280:
---

Fix Version/s: (was: 4.7)
   4.8

 spellcheck.maxResultsForSuggest based on filter query results
 -

 Key: SOLR-4280
 URL: https://issues.apache.org/jira/browse/SOLR-4280
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Markus Jelsma
 Fix For: 4.8

 Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch, 
 SOLR-4280-trunk.patch








[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results

2013-08-20 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-4280:


Attachment: SOLR-4280-trunk.patch

New patch. This patch now also works in a distributed environment.

 spellcheck.maxResultsForSuggest based on filter query results
 -

 Key: SOLR-4280
 URL: https://issues.apache.org/jira/browse/SOLR-4280
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Markus Jelsma
 Fix For: 4.5, 5.0

 Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch, 
 SOLR-4280-trunk.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results

2013-07-16 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-4280:


Attachment: SOLR-4280-trunk.patch

I forgot I had a working patch lying around. Specify 
spellcheck.percentageResultsForSuggest=0.25 to force maxResultsForSuggest to be 
25% of the smallest filterQuery DocSet. This allows maxResultsForSuggest to be 
adjusted dynamically based on the filters specified. 

It doesn't seem to work in a distributed environment, although the parameters 
are passed along correctly. I haven't figured out why yet, but all shards 
return the same collation for undistributed requests. Tips?

 spellcheck.maxResultsForSuggest based on filter query results
 -

 Key: SOLR-4280
 URL: https://issues.apache.org/jira/browse/SOLR-4280
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Markus Jelsma
 Fix For: 4.4

 Attachments: SOLR-4280-trunk-1.patch, SOLR-4280-trunk.patch






[jira] [Updated] (SOLR-4280) spellcheck.maxResultsForSuggest based on filter query results

2013-02-14 Thread Markus Jelsma (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-4280:


Attachment: SOLR-4280-trunk-1.patch

Patch for trunk introducing spellcheck.percentageResultsForSuggest. It uses 
the filterCache to look up the maximum number of possible results, so whether a 
term is considered misspelled depends on the size of that maximum result set 
and the value of this parameter. 


Since the filterCache currently cannot be retrieved via 
SolrIndexSearcher.getCache(), you'll have to hack SolrIndexSearcher and have it 
add the filterCache to the cacheMap somewhere in the constructor.

{code}
// In the SolrIndexSearcher constructor: register filterCache so getCache() can return it
cacheMap.put(filterCache.name(), filterCache);
{code}

 spellcheck.maxResultsForSuggest based on filter query results
 -

 Key: SOLR-4280
 URL: https://issues.apache.org/jira/browse/SOLR-4280
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Markus Jelsma
 Fix For: 4.2, 5.0

 Attachments: SOLR-4280-trunk-1.patch


