[jira] Resolved: (SOLR-521) Allow StopFilterFactory to use StopFilter setEnablePositionIncrementsDefault function
[ https://issues.apache.org/jira/browse/SOLR-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man resolved SOLR-521.
---
Resolution: Fixed

> Allow StopFilterFactory to use StopFilter setEnablePositionIncrementsDefault function
> ---
>
> Key: SOLR-521
> URL: https://issues.apache.org/jira/browse/SOLR-521
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 1.3
> Reporter: Walter Ferrara
> Assignee: Hoss Man
> Priority: Trivial
> Fix For: 1.3
> Attachments: stopfilter.patch, stopfilter.patch
>
> Lucene StopFilter has a function, setEnablePositionIncrementsDefault, which when set means that "when a token is stopped (omitted), the position increment of the following token is incremented". Solr, however, has no setting in schema.xml to activate this.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-521) Allow StopFilterFactory to use StopFilter setEnablePositionIncrementsDefault function
[ https://issues.apache.org/jira/browse/SOLR-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604658#action_12604658 ]

Hoss Man commented on SOLR-521:
---

I was going to change the default, and i'd even already written up the CHANGES.txt verbiage to include, when i noticed that it caused 2 tests to fail: one for DisMax and one in ConvertedLegacyTest.

This wasn't a huge surprise, i figured the tests were just expecting "broken" behavior, but when i looked at the exact failures they were by no means obvious failures. In both cases doing "the right thing" had some subtle impacts on the matching/scoring of docs that made me realize changing the default is probably not in the best interests of existing users (if it caused problems like this in our simple unit tests, it could have some pretty serious impacts on real world cases).

FWIW, here's the verbiage i *was* going to add...

{quote}
A new "enablePositionIncrements" option has been added to the StopFilterFactory. The default value is "true", indicating that a "gap" should be left when a stop word is removed, which will affect how much slop is required in order for Phrase Queries to match. Users who wish to preserve previous behavior should add 'enablePositionIncrements="false"' to usages of StopFilterFactory in their schema.xml. Other users should consider reindexing to ensure consistency in behavior for all documents.
{quote}

> Allow StopFilterFactory to use StopFilter setEnablePositionIncrementsDefault function
> ---
>
> Key: SOLR-521
> URL: https://issues.apache.org/jira/browse/SOLR-521
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 1.3
> Reporter: Walter Ferrara
> Assignee: Hoss Man
> Priority: Trivial
> Fix For: 1.3
> Attachments: stopfilter.patch, stopfilter.patch
>
> Lucene StopFilter has a function, setEnablePositionIncrementsDefault, which when set means that "when a token is stopped (omitted), the position increment of the following token is incremented". Solr, however, has no setting in schema.xml to activate this.
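To make the "gap" described in the comment above concrete, here is a minimal stand-alone sketch of how token positions change when a stop word is removed. It is a simulation only and does not call Lucene or Solr: the class name StopGapSketch, the method positions, and the sample tokens are all hypothetical, and the only thing taken from the thread is the idea that the next token's position increment grows by the number of removed tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone simulation; it does not use the Lucene or Solr APIs.
class StopGapSketch {

    /** Returns the absolute position assigned to each token that survives stop-word removal. */
    static List<Integer> positions(String[] tokens, Set<String> stopWords,
                                   boolean enablePositionIncrements) {
        List<Integer> result = new ArrayList<>();
        int position = -1; // position of the last emitted token
        int skipped = 0;   // stop words dropped since the last emitted token
        for (String token : tokens) {
            if (stopWords.contains(token)) {
                skipped++;
                continue;
            }
            // The base increment is 1; with the option on, the skipped tokens leave a gap.
            int increment = 1 + (enablePositionIncrements ? skipped : 0);
            position += increment;
            result.add(position);
            skipped = 0;
        }
        return result;
    }

    public static void main(String[] args) {
        String[] tokens = {"wizard", "of", "oz"};
        Set<String> stops = new HashSet<>(Arrays.asList("of", "the"));
        System.out.println(positions(tokens, stops, false)); // [0, 1]
        System.out.println(positions(tokens, stops, true));  // [0, 2]
    }
}
```

With the gap in place, "wizard" and "oz" are no longer adjacent, which is why the draft CHANGES.txt text warns that the amount of slop a phrase query needs can change.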
[jira] Resolved: (SOLR-597) Need to remove SolrCore "caching" from SolrServlet
[ https://issues.apache.org/jira/browse/SOLR-597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man resolved SOLR-597.
---
Resolution: Fixed

Committed revision 667190.

> Need to remove SolrCore "caching" from SolrServlet
> ---
>
> Key: SOLR-597
> URL: https://issues.apache.org/jira/browse/SOLR-597
> Project: Solr
> Issue Type: Bug
> Affects Versions: 1.3
> Reporter: Hoss Man
> Assignee: Hoss Man
> Fix For: 1.3
>
> As discussed in this thread...
> http://www.nabble.com/Add-SolrCore.getSolrCore%28%29-to-SolrServlet.doGet-to-work-arround-Resin-bug--to17501487.html#a17515374
>
> SolrServlet currently calls SolrCore.getSolrCore() during its init method, and then caches that core for reuse on each request.
>
> Now that we have multicore support, and the decision was made that the singleton accessor should always return the more recently created core, this behavior is inconsistent with SolrUpdateServlet, which calls SolrCore.getSolrCore() on each request.
>
> One potential problem with this is that in a "mixed use" setup, where some requests are handled by the SolrDispatchFilter and some are handled by the SolrServlet, you'll get inconsistent results as cores are reloaded/renamed.
>
> Another problem that has been observed "in the wild" is that since some versions of Resin do not correctly load Filters before Servlets, the SolrServlet is constructing a core that only it ever sees before the DispatchFilter has a chance to construct the "normal" core.
>
> The consensus solution is to make SolrServlet refetch the SolrCore singleton on each request -- this means that heavily customized legacy setups that do not use the SolrDispatchFilter at all will not see initialization "lag" or errors until the first request. But this seems acceptable given that SolrServlet is already deprecated - anyone using SolrServlet in a customized application who upgrades will want to start using the DispatchFilter anyway.
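The difference between caching the core at init time and refetching it on each request, as described in SOLR-597 above, can be sketched roughly as follows. None of these classes are the real Solr ones: Core, CoreRegistry, CachingHandler and RefetchingHandler are hypothetical stand-ins, and the registry simply assumes that the singleton accessor returns whichever core was created last.

```java
// Hypothetical stand-ins; these are not the real SolrCore or SolrServlet classes.
class CoreRefetchSketch {

    /** Minimal stand-in for a core, identified only by a name. */
    static final class Core {
        final String name;
        Core(String name) { this.name = name; }
    }

    /** Stand-in for a singleton accessor that returns the most recently created core. */
    static final class CoreRegistry {
        private static volatile Core latest;
        static Core create(String name) { latest = new Core(name); return latest; }
        static Core getCore() { return latest; }
    }

    /** Old behaviour: look the core up once at init time and reuse it for every request. */
    static final class CachingHandler {
        private Core core;
        void init() { core = CoreRegistry.getCore(); }
        String handle() { return core.name; }
    }

    /** Behaviour after the fix: look the core up again on every request. */
    static final class RefetchingHandler {
        String handle() { return CoreRegistry.getCore().name; }
    }

    public static void main(String[] args) {
        CoreRegistry.create("core-v1");
        CachingHandler caching = new CachingHandler();
        caching.init();
        RefetchingHandler refetching = new RefetchingHandler();

        CoreRegistry.create("core-v2"); // simulates a reload that creates a newer core

        System.out.println(caching.handle());    // core-v1 (stale)
        System.out.println(refetching.handle()); // core-v2
    }
}
```

The caching handler keeps answering from the core it saw at init time, which is the inconsistency the issue describes for "mixed use" setups; the refetching handler pays for a lookup on each request but always sees the current core.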
[jira] Commented: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604639#action_12604639 ]

Erik Hatcher commented on SOLR-572:
---

the spell checker component handling build/reload seems highly awkward to me. suggestion component really should just do that... and wrap the other operations as a /spellchecker/rebuild kinda thing and not even necessarily componentize those operations since they don't really necessarily need to be hooked together with other operations as a single request.

anyway, just the overloading of a "component" to do managerial operations seems awkward. food for thought. not a -1 kinda thing though.

> Spell Checker as a Search Component
> ---
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
> Issue Type: New Feature
> Components: spellchecker
> Affects Versions: 1.3
> Reporter: Shalin Shekhar Mangar
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 1.3
> Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the following features:
> * Allow creating a spell index on a given field and make it possible to have multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion
[jira] Commented: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604617#action_12604617 ]

Sean Timm commented on SOLR-572:
---

It doesn't appear that you can get both extendedResults and count > 1. With the below URL, I get 1 suggestion for each misspelled term regardless of the value of spellcheck.count. If I set spellcheck.extendedResults=false, then I get the requested three suggestions for each term.

{noformat}
/solr/spellCheckCompRH/?q=waz+designatd+two+bee+Arvil+25+bye+Pres.+it+waz&version=2.2&start=0&rows=2&indent=on&spellcheck=true&fl=title,url,id,categories,score&hl=on&hl.fl=body&qt=dismax&spellcheck.extendedResults=true&spellcheck.count=3
{noformat}

> Spell Checker as a Search Component
> ---
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
> Issue Type: New Feature
> Components: spellchecker
> Affects Versions: 1.3
> Reporter: Shalin Shekhar Mangar
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 1.3
> Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the following features:
> * Allow creating a spell index on a given field and make it possible to have multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion
[jira] Assigned: (SOLR-595) support field level boosting to morelikethis handler.
[ https://issues.apache.org/jira/browse/SOLR-595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll reassigned SOLR-595:
---
Assignee: Grant Ingersoll

> support field level boosting to morelikethis handler.
> ---
>
> Key: SOLR-595
> URL: https://issues.apache.org/jira/browse/SOLR-595
> Project: Solr
> Issue Type: New Feature
> Components: search
> Affects Versions: 1.3
> Reporter: Thomas Morton
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 1.3
> Attachments: SOLR-595.patch
> Original Estimate: 3h
> Remaining Estimate: 3h
>
> Allow boosting to be specified for particular fields when using more like this.
> # Parse out "mlt.qf" parameters to get boosts in dismax-like format (existing code from DisMax param parse code used to produce a Map)
> # Iterate through mltquery terms, get boost by looking at field from which mltquery term came, and multiply boost specified in map by existing term boost.
> * If mlt.boost=false, then you get the same boost values as in map/mlt.qf parameters,
> * If mlt.boost=true then you get normalized boost multiplied by specified boost (which makes sense to me).
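The two steps listed in the issue (parse the field boosts into a Map, then multiply each term's existing boost by the boost of the field it came from) could look roughly like the sketch below. It is not the attached patch and does not use Solr's own parsing code: MltFieldBoostSketch and its methods are hypothetical, the "field^boost" syntax is assumed from the "dismax-like format" wording, and treating a field with no explicit boost as 1.0 is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the SOLR-595 patch and not Solr's parameter parsing code.
class MltFieldBoostSketch {

    /** Parses a string such as "title^2.0 body" into {title=2.0, body=1.0}. */
    static Map<String, Float> parseFieldBoosts(String qf) {
        Map<String, Float> boosts = new LinkedHashMap<>();
        for (String part : qf.trim().split("\\s+")) {
            if (part.isEmpty()) {
                continue; // empty input yields an empty map
            }
            int caret = part.indexOf('^');
            if (caret < 0) {
                boosts.put(part, 1.0f); // assumption: no explicit boost means 1.0
            } else {
                boosts.put(part.substring(0, caret),
                           Float.parseFloat(part.substring(caret + 1)));
            }
        }
        return boosts;
    }

    /** Multiplies a term's existing boost by the boost configured for its field. */
    static float combinedBoost(String field, float termBoost, Map<String, Float> fieldBoosts) {
        return termBoost * fieldBoosts.getOrDefault(field, 1.0f);
    }

    public static void main(String[] args) {
        Map<String, Float> boosts = parseFieldBoosts("title^2.0 body");
        // A term boost of 1.0 leaves just the field boost (the mlt.boost=false case).
        System.out.println(combinedBoost("title", 1.0f, boosts)); // 2.0
        // A normalized term boost is scaled by the field boost (the mlt.boost=true case).
        System.out.println(combinedBoost("title", 0.5f, boosts)); // 1.0
    }
}
```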
[jira] Created: (SOLR-598) DebugComponent should be last component of SearchHandler
DebugComponent should be last component of SearchHandler
---

Key: SOLR-598
URL: https://issues.apache.org/jira/browse/SOLR-598
Project: Solr
Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor

See http://lucene.markmail.org/message/qddlgc4h5vhxpv65?q=DebugComponent

The DebugComponent should be the last component in the SearchHandler hierarchy, in case an earlier, custom, component changes things in the result list, etc. (unless the user explicitly states all components)
Re: SearchComponent ordering in SearchHandler
On Jun 12, 2008, at 2:14 PM, Chris Hostetter wrote:

> I must confess: i've been completely ignorant of how SearchComponents are configured until now.
>
> It seems like a no brainer that debug should come last ... so i say go for it (just make sure it's documented). the caveat to that is that if the config specifies a *full* list of components we shouldn't implicitly tack debug on to the end in that case (just in the "last-components" case).

Cool.

> : Another alternative, is to allow the config to insert a component at a
> : specific place, but that is a bit problematic in that it requires one to know
> : the exact ordering of each of the components.
>
> I actually thought that was something that had been done also, but looking at it now i don't see it ... it doesn't seem like it would be too problematic. it's just like line numbers in BASIC -- you don't have to know exactly what the number is for most components, just setup the defaults so that the "pre" ones have negative numbers, and the "post" ones have positive numbers, and space them out in increments of "100" (with debug at 100 or something obscenely high)

Wow. BASIC. Hadn't thought about BASIC in a long time. Thanks for the flashback!

I'll work up a patch when I have a spare moment.
Re: Add SolrCore.getSolrCore() to SolrServlet.doGet to work arround Resin bug?
: I agree with Otis and prefer the cleaner approach.

https://issues.apache.org/jira/browse/SOLR-597

-Hoss
Re: SearchComponent ordering in SearchHandler
I must confess: i've been completely ignorant of how SearchComponents are configured until now.

It seems like a no brainer that debug should come last ... so i say go for it (just make sure it's documented). the caveat to that is that if the config specifies a *full* list of components we shouldn't implicitly tack debug on to the end in that case (just in the "last-components" case).

: Another alternative, is to allow the config to insert a component at a
: specific place, but that is a bit problematic in that it requires one to know
: the exact ordering of each of the components.

I actually thought that was something that had been done also, but looking at it now i don't see it ... it doesn't seem like it would be too problematic. it's just like line numbers in BASIC -- you don't have to know exactly what the number is for most components, just setup the defaults so that the "pre" ones have negative numbers, and the "post" ones have positive numbers, and space them out in increments of "100" (with debug at 100 or something obscenely high)

-Hoss
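The BASIC-line-number idea above can be sketched as a sorted map from numeric slots to component names. This is only an illustration of the proposal, not existing Solr configuration: the class ComponentOrderSketch, the slot numbers, and the custom component names are hypothetical, and putting debug at Integer.MAX_VALUE is just one reading of "something obscenely high".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; the names and slot numbers are illustrative, not Solr configuration.
class ComponentOrderSketch {

    /** Returns component names sorted by their numeric slot, lowest first. */
    static List<String> order(Map<Integer, String> slots) {
        // A TreeMap iterates its keys in ascending order.
        return new ArrayList<>(new TreeMap<>(slots).values());
    }

    public static void main(String[] args) {
        Map<Integer, String> slots = new TreeMap<>();
        slots.put(-100, "myPreComponent");     // a "pre" component gets a negative number
        slots.put(100, "query");               // defaults spaced out in increments of 100
        slots.put(200, "facet");
        slots.put(300, "highlight");
        slots.put(150, "myCustomComponent");   // slots in between two defaults
        slots.put(Integer.MAX_VALUE, "debug"); // assumed "obscenely high" slot so debug runs last
        System.out.println(order(slots));
        // [myPreComponent, query, myCustomComponent, facet, highlight, debug]
    }
}
```

Because the defaults are spaced out, a custom component can be placed between two of them without knowing, or renumbering, the full list, which is the point of the BASIC analogy.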
[jira] Resolved: (SOLR-596) NoSuchElementException when setting facet.count=0
[ https://issues.apache.org/jira/browse/SOLR-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-596.
---
Resolution: Fixed
Fix Version/s: 1.3

I just committed this. Thanks!

> NoSuchElementException when setting facet.count=0
> ---
>
> Key: SOLR-596
> URL: https://issues.apache.org/jira/browse/SOLR-596
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: 1.3
> Environment: Tomcat 5.5
> Reporter: Lars Kotthoff
> Priority: Minor
> Fix For: 1.3
> Attachments: SOLR-596.patch
>
> When requesting no facet counts, i.e. setting facet.count=0, a NoSuchElementException is thrown.
[jira] Created: (SOLR-597) Need to remove SolrCore "caching" from SolrServlet
Need to remove SolrCore "caching" from SolrServlet
---

Key: SOLR-597
URL: https://issues.apache.org/jira/browse/SOLR-597
Project: Solr
Issue Type: Bug
Affects Versions: 1.3
Reporter: Hoss Man
Assignee: Hoss Man
Fix For: 1.3

As discussed in this thread...
http://www.nabble.com/Add-SolrCore.getSolrCore%28%29-to-SolrServlet.doGet-to-work-arround-Resin-bug--to17501487.html#a17515374

SolrServlet currently calls SolrCore.getSolrCore() during its init method, and then caches that core for reuse on each request.

Now that we have multicore support, and the decision was made that the singleton accessor should always return the more recently created core, this behavior is inconsistent with SolrUpdateServlet, which calls SolrCore.getSolrCore() on each request.

One potential problem with this is that in a "mixed use" setup, where some requests are handled by the SolrDispatchFilter and some are handled by the SolrServlet, you'll get inconsistent results as cores are reloaded/renamed.

Another problem that has been observed "in the wild" is that since some versions of Resin do not correctly load Filters before Servlets, the SolrServlet is constructing a core that only it ever sees before the DispatchFilter has a chance to construct the "normal" core.

The consensus solution is to make SolrServlet refetch the SolrCore singleton on each request -- this means that heavily customized legacy setups that do not use the SolrDispatchFilter at all will not see initialization "lag" or errors until the first request. But this seems acceptable given that SolrServlet is already deprecated - anyone using SolrServlet in a customized application who upgrades will want to start using the DispatchFilter anyway.
[jira] Updated: (SOLR-572) Spell Checker as a Search Component
[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-572:
---
Attachment: SOLR-572.patch

Make getSpellChecker protected, add in JMX Stuff. Handle if the SpellingResult is null

> Spell Checker as a Search Component
> ---
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
> Issue Type: New Feature
> Components: spellchecker
> Affects Versions: 1.3
> Reporter: Shalin Shekhar Mangar
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 1.3
> Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the following features:
> * Allow creating a spell index on a given field and make it possible to have multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion