[jira] [Created] (SOLR-11310) QueryElevation doesn't work with Solr LTR
Rahul Babulal created SOLR-11310:

Summary: QueryElevation doesn't work with Solr LTR
Key: SOLR-11310
URL: https://issues.apache.org/jira/browse/SOLR-11310
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: contrib - LTR
Affects Versions: 6.6, 6.5.1
Reporter: Rahul Babulal
Priority: Critical

It looks like the QueryElevation component doesn't work when used together with Solr LTR. When the elevateIds param and the rq param for LTR are both passed in the Solr request, the elevated document is not always in the first position. The reranking done in the LTRRescorer doesn't take the elevations into account.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
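To make the expectation in SOLR-11310 concrete, here is a minimal sketch in Python. It is not Solr code: `pin_elevated` is a hypothetical helper and the document ids are made up. It only illustrates the ordering the report asks for, namely that documents named in elevateIds stay on top after the rq re-ranking has reordered the rest.

```python
def pin_elevated(reranked_ids, elevate_ids):
    """Return reranked_ids with the elevated ids moved to the front.

    Hypothetical helper for illustration only; this is not how Solr's
    QueryElevationComponent or LTRRescorer is implemented.
    """
    present = set(reranked_ids)
    # Keep elevated docs in the order given by elevateIds, dropping
    # duplicates and ids that are not in the result list at all.
    pinned = [doc_id for doc_id in dict.fromkeys(elevate_ids) if doc_id in present]
    pinned_set = set(pinned)
    # Everything else keeps the order produced by the re-ranker.
    rest = [doc_id for doc_id in reranked_ids if doc_id not in pinned_set]
    return pinned + rest


# Made-up example: the re-ranker pushed elevated doc "3" down to third place.
reranked = ["7", "9", "3", "4"]
print(pin_elevated(reranked, ["3"]))
```

In the reported behaviour the list would stay in the re-ranked order; the sketch shows the order the reporter expected instead.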
[jira] [Updated] (SOLR-10183) A real scaling normalizer in solr-ltr
[ https://issues.apache.org/jira/browse/SOLR-10183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rahul Babulal updated SOLR-10183:
-
Description:
The current solr-ltr plugin provides two different normalizer implementations, the minMax and the standard normalizer. The minMax normalizer doesn't seem to correctly scale the values to fall between the given min and max. The Solr scale function [1] appropriately scales the values to fall within the given range, but it cannot be used as it is, because it has performance problems and makes the scoring process really slow.

For example, if I have a data set [11,51,16,42,18,21] and we want to scale the values to the range 1 to 10, I would expect the max value in the data set (which is 51) to be normalized to 10, and the min value (which is 11) to be normalized to 1.

Here is sample output of the minMax normalizer vs Scaling normalizer
||Input||MinMax Normalizer||Scaling Normalizer||
|11.0|1.112|1.0|
|51.0|5.553|10.0|
|16.0|1.666|2.125|
|42.0|4.553|7.975|
|18.0|1.888|2.5749998|
|21.0|2.223|3.25|

[1] https://wiki.apache.org/solr/FunctionQuery#scale

was:
The current solr-ltr plugin provides two different normalizer implementations, the minMax and the standard normalizer. The minMax normalizer doesn't seem to correctly scale the values to fall between the given min and max. The Solr scale function [1] appropriately scales the values to fall within the given range, but it cannot be used as it is, because it has performance problems and makes the scoring process really slow.

For example, if I have a data set [11,51,16,42,18,21] and we want to scale the values to the range 1 to 10, I would expect the max value in the data set (which is 51) to be normalized to 10, and the min value (which is 11) to be normalized to 1.

Here is sample output of the minMax normalizer vs Scaling normalizer
||Input||MinMax Normalizer||Scaling Normalizer||
|11.0|1.112|10.8|
|51.0|5.553|10.0|
|16.0|1.666|10.7|
|42.0|4.553|10.18|
|18.0|1.888|10.66|
|21.0|2.223|10.6|

[1] https://wiki.apache.org/solr/FunctionQuery#scale

> A real scaling normalizer in solr-ltr
> -
>
> Key: SOLR-10183
> URL: https://issues.apache.org/jira/browse/SOLR-10183
> Project: Solr
> Issue Type: New Feature
> Security Level: Public (Default Security Level. Issues are Public)
> Affects Versions: 6.4.1
> Reporter: Rahul Babulal
> Labels: contrib_ltr, ltr
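The arithmetic behind the example data set [11,51,16,42,18,21] can be sketched as follows. This is a Python illustration, not the solr-ltr implementation. `scale_to_range` maps the smallest observed value to the target minimum and the largest to the target maximum, which is what the description expects. `fixed_min_max` is an assumption: the formula (v - min) / (max - min) with min=1 and max=10 is inferred from the MinMax Normalizer column, which it reproduces only to within about 0.003, and is not taken from the plugin source.

```python
def scale_to_range(values, target_min, target_max):
    """Linearly map values so min(values) -> target_min and max(values) -> target_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: every value is identical, so map all to target_min.
        return [float(target_min) for _ in values]
    span = target_max - target_min
    return [target_min + (v - lo) * span / (hi - lo) for v in values]


def fixed_min_max(values, min_param, max_param):
    """Compute (v - min) / (max - min) with min and max taken from configuration.

    Assumption: inferred from the MinMax Normalizer column of the sample
    output; it is not copied from the solr-ltr source.
    """
    return [(v - min_param) / (max_param - min_param) for v in values]


data = [11, 51, 16, 42, 18, 21]
print([round(x, 3) for x in scale_to_range(data, 1, 10)])
print([round(x, 3) for x in fixed_min_max(data, 1, 10)])
```

The first line printed should agree with the Scaling Normalizer column (for instance 16 maps to 1 + 5 * 9 / 40 = 2.125). The second shows why a normalizer that treats min and max as fixed parameters, rather than as the observed extremes of the data, leaves 51 well below 10.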
[jira] [Updated] (SOLR-10183) A real scaling normalizer in solr-ltr
[ https://issues.apache.org/jira/browse/SOLR-10183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rahul Babulal updated SOLR-10183:
-
Description:
The current solr-ltr plugin provides two different normalizer implementations, the minMax and the standard normalizer. The minMax normalizer doesn't seem to correctly scale the values to fall between the given min and max. The Solr scale function [1] appropriately scales the values to fall within the given range, but it cannot be used as it is, because it has performance problems and makes the scoring process really slow.

For example, if I have a data set [11,51,16,42,18,21] and we want to scale the values to the range 1 to 10, I would expect the max value in the data set (which is 51) to be normalized to 10, and the min value (which is 11) to be normalized to 1.

Here is sample output of the minMax normalizer vs Scaling normalizer
||Input||MinMax Normalizer||Scaling Normalizer||
|11.0|1.112|10.8|
|51.0|5.553|10.0|
|16.0|1.666|10.7|
|42.0|4.553|10.18|
|18.0|1.888|10.66|
|21.0|2.223|10.6|

[1] https://wiki.apache.org/solr/FunctionQuery#scale

was:
The current solr-ltr plugin provides two different normalizer implementations, the minMax and the standard normalizer. The minMax normalizer doesn't seem to correctly scale the values to fall between the given min and max. The Solr scale function [1] appropriately scales the values to fall within the given range, but it cannot be used as it is, because it has performance problems and makes the scoring process really slow.

For example, if I have a data set [11,51,16,42,18,21] and we want to scale the values to the range 1 to 10, I would expect the max value in the data set (which is 51) to be normalized to 10, and the min value (which is 11) to be normalized to 1.

||Input||MinMax Normalizer||Scaling Normalizer||
|11.0|1.112|10.8|
|51.0|5.553|10.0|
|16.0|1.666|10.7|
|42.0|4.553|10.18|
|18.0|1.888|10.66|
|21.0|2.223|10.6|

[1] https://wiki.apache.org/solr/FunctionQuery#scale

> A real scaling normalizer in solr-ltr
> -
>
> Key: SOLR-10183
> URL: https://issues.apache.org/jira/browse/SOLR-10183
> Project: Solr
> Issue Type: New Feature
> Security Level: Public (Default Security Level. Issues are Public)
> Affects Versions: 6.4.1
> Reporter: Rahul Babulal
> Labels: contrib_ltr, ltr
[jira] [Updated] (SOLR-10183) A real scaling normalizer in solr-ltr
[ https://issues.apache.org/jira/browse/SOLR-10183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rahul Babulal updated SOLR-10183:
-
Description:
The current solr-ltr plugin provides two different normalizer implementations, the minMax and the standard normalizer. The minMax normalizer doesn't seem to correctly scale the values to fall between the given min and max. The Solr scale function [1] appropriately scales the values to fall within the given range, but it cannot be used as it is, because it has performance problems and makes the scoring process really slow.

For example, if I have a data set [11,51,16,42,18,21] and we want to scale the values to the range 1 to 10, I would expect the max value in the data set (which is 51) to be normalized to 10, and the min value (which is 11) to be normalized to 1.

||Input||MinMax Normalizer||Scaling Normalizer||
|11.0|1.112|10.8|
|51.0|5.553|10.0|
|16.0|1.666|10.7|
|42.0|4.553|10.18|
|18.0|1.888|10.66|
|21.0|2.223|10.6|

[1] https://wiki.apache.org/solr/FunctionQuery#scale

was:
The current solr-ltr plugin provides two different normalizer implementations, the minMax and the standard normalizer. The minMax normalizer doesn't seem to correctly scale the values to fall between the given min and max. The Solr scale function [1] appropriately scales the values to fall within the given range, but it cannot be used as it is, because it has performance problems and makes the scoring process really slow.

[1] https://wiki.apache.org/solr/FunctionQuery#scale

> A real scaling normalizer in solr-ltr
> -
>
> Key: SOLR-10183
> URL: https://issues.apache.org/jira/browse/SOLR-10183
> Project: Solr
> Issue Type: New Feature
> Security Level: Public (Default Security Level. Issues are Public)
> Affects Versions: 6.4.1
> Reporter: Rahul Babulal
> Labels: contrib_ltr, ltr
[jira] [Created] (SOLR-10183) A real scaling normalizer in solr-ltr
Rahul Babulal created SOLR-10183:

Summary: A real scaling normalizer in solr-ltr
Key: SOLR-10183
URL: https://issues.apache.org/jira/browse/SOLR-10183
Project: Solr
Issue Type: New Feature
Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 6.4.1
Reporter: Rahul Babulal

The current solr-ltr plugin provides two different normalizer implementations, the minMax and the standard normalizer. The minMax normalizer doesn't seem to correctly scale the values to fall between the given min and max. The Solr scale function [1] appropriately scales the values to fall within the given range, but it cannot be used as it is, because it has performance problems and makes the scoring process really slow.

[1] https://wiki.apache.org/solr/FunctionQuery#scale
[jira] [Commented] (SOLR-5541) Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters
[ https://issues.apache.org/jira/browse/SOLR-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959339#comment-13959339 ]

Rahul Babulal commented on SOLR-5541:
-
FYI, there is a typo in the URL in the JIRA as well as in the documentation. The param elevatedIds should be elevateIds.
{code}
http://localhost:8983/solr/elevate?q=*:*&elevatedIds=3,4&excludeIds=6,8
{code}

Allow QueryElevationComponent to accept elevateIds and excludeIds as http parameters

Key: SOLR-5541
URL: https://issues.apache.org/jira/browse/SOLR-5541
Project: Solr
Issue Type: Improvement
Components: SearchComponents - other
Affects Versions: 4.6
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
Fix For: 4.7
Attachments: SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch, SOLR-5541.patch

The QueryElevationComponent currently uses an xml file to map query strings to elevateIds and excludeIds.
This ticket adds the ability to pass in elevateIds and excludeIds through two new http parameters, elevateIds and excludeIds.
This will allow more sophisticated business logic to be used in selecting which ids to elevate/exclude.
Proposed syntax:
http://localhost:8983/solr/elevate?q=*:*&elevatedIds=3,4&excludeIds=6,8
The elevateIds and excludeIds point to the unique document Id.
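For reference, a request using the corrected parameter name could be assembled like this. It is a small Python sketch using only the standard library; the host, handler path and ids are the sample values from the ticket, and `build_elevate_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Base URL and ids are the sample values from the ticket.
BASE_URL = "http://localhost:8983/solr/elevate"


def build_elevate_url(query, elevate_ids, exclude_ids):
    """Build a request URL that uses the corrected parameter name elevateIds."""
    params = {
        "q": query,
        "elevateIds": ",".join(elevate_ids),
        "excludeIds": ",".join(exclude_ids),
    }
    # safe="*:," leaves the match-all query and the comma-separated id
    # lists readable instead of percent-encoding them.
    return BASE_URL + "?" + urlencode(params, safe="*:,")


print(build_elevate_url("*:*", ["3", "4"], ["6", "8"]))
```

Building the query string from a dictionary keeps the `&` separators and the parameter names in one place, which makes a misspelling such as elevatedIds easier to spot.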
[jira] [Commented] (SOLR-3390) Highlighting issue with multi-word synonyms causes to highlight the wrong terms
[ https://issues.apache.org/jira/browse/SOLR-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259896#comment-13259896 ]

Rahul Babulal commented on SOLR-3390:
-
Thank you for the details. For now, I am setting the luceneMatchVersion to LUCENE_33. This sort of** fixes the highlighting issue. I am still testing to see whether that has any other side effects. Do you know of any problems with setting the luceneMatchVersion to LUCENE_33? I will keep an eye on this issue.

**The reason I say it only sort of works is that when I search for name (in my case: dns, domain name search), it matches the document that has dns in its index. That's because I have expand set to true.

Highlighting issue with multi-word synonyms causes to highlight the wrong terms
---
Key: SOLR-3390
URL: https://issues.apache.org/jira/browse/SOLR-3390
Project: Solr
Issue Type: Bug
Components: highlighter, query parsers
Affects Versions: 3.6
Environment: Windows 7. (Development machine, not the server)
Reporter: Rahul Babulal
Labels: highlighter, multi-word, solr, synonyms

I am using Solr 3.6, and when I have multi-word synonyms, the highlighting results have the wrong word highlighted.

If I have the below entry in the synonyms file:
dns, domain name system

If I index something like:
A sample dns entry explaining the details.

Searching for name (without quotes), in the highlight results/snippets I get:
A sample dns <em>entry</em> explaining the details.
(The token entry overlaps with the token name in the analysis.jsp)

Searching for system (without quotes), in the highlight results/snippets I get:
A sample dns entry <em>explaining</em> the details.
(The token explaining overlaps with the token system in the analysis.jsp)

Here is my schema field type:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-3390) Highlighting issue with multi-word synonyms causes to highlight the wrong terms
Rahul Babulal created SOLR-3390:
---
Summary: Highlighting issue with multi-word synonyms causes to highlight the wrong terms
Key: SOLR-3390
URL: https://issues.apache.org/jira/browse/SOLR-3390
Project: Solr
Issue Type: Bug
Components: highlighter, query parsers
Affects Versions: 3.6
Environment: Windows 7. (Development machine, not the server)
Reporter: Rahul Babulal

I am using Solr 3.6, and when I have multi-word synonyms, the highlighting results have the wrong word highlighted.

If I have the below entry in the synonyms file:
dns, domain name system

If I index something like:
A sample dns entry explaining the details.

Searching for name (without quotes), in the highlight results/snippets I get:
A sample dns <em>entry</em> explaining the details.
(The token entry overlaps with the token name in the analysis.jsp)

Searching for system (without quotes), in the highlight results/snippets I get:
A sample dns entry <em>explaining</em> the details.
(The token explaining overlaps with the token system in the analysis.jsp)

Here is my schema field type:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
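The overlap seen in analysis.jsp can be modelled with a toy position table. The Python sketch below is a simplified model, not the SynonymFilter source: it assumes that the words of an expanded multi-word synonym are spread over consecutive positions starting at the matched token, and that a hit on a synonym word is highlighted using whichever original token occupies the same position. The function names are hypothetical.

```python
def index_positions(tokens, synonyms):
    """Map each position to the terms found there after synonym expansion.

    Simplified model (an assumption): the words of a multi-word synonym
    occupy consecutive positions starting at the token they expand.
    """
    positions = {}
    for pos, token in enumerate(tokens, start=1):
        positions.setdefault(pos, []).append(token)
        for offset, word in enumerate(synonyms.get(token, [])):
            positions.setdefault(pos + offset, []).append(word)
    return positions


def highlighted_original(tokens, synonyms, query_term):
    """Return the original token that shares a position with query_term, if any."""
    for pos, terms in index_positions(tokens, synonyms).items():
        if query_term in terms:
            # Positions are 1-based; a synonym word may run past the last token.
            return tokens[pos - 1] if pos <= len(tokens) else None
    return None


tokens = ["a", "sample", "dns", "entry", "explaining", "the", "details"]
synonyms = {"dns": ["domain", "name", "system"]}
for term in ("domain", "name", "system"):
    print(term, "->", highlighted_original(tokens, synonyms, term))
```

Under these assumptions, name lands on the position of entry and system on the position of explaining, which matches the two snippets in the report.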
[jira] [Commented] (LUCENE-3668) offsets issues with multiword synonyms
[ https://issues.apache.org/jira/browse/LUCENE-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258775#comment-13258775 ]

Rahul Babulal commented on LUCENE-3668:
---
I'm using Solr 3.6, and with luceneMatchVersion=3.6 in my solrconfig.xml I'm still seeing issues with highlighting. However, using luceneMatchVersion=3.3 fixes my issue.

Issue details:
In my synonyms I have:
nhl, national hockey league

If I index:
Australian nhl team great

search-use-case 1: searching for hockey (without quotes), in my highlighted response snippets I get
Australian nhl <em>team</em> great.

search-use-case 2: searching for league (without quotes), in my highlighted response snippets I get
Australian nhl team <em>great</em>.

Here are my fieldType and field definitions:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
<field name="description" type="text_synonym" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" omitNorms="false"/>

offsets issues with multiword synonyms
--
Key: LUCENE-3668
URL: https://issues.apache.org/jira/browse/LUCENE-3668
Project: Lucene - Java
Issue Type: Bug
Components: modules/analysis
Reporter: Robert Muir
Assignee: Michael McCandless
Fix For: 3.6, 4.0
Attachments: LUCENE-3668.patch, LUCENE-3668_test.patch

as reported on the list, there are some strange offsets with FSTSynonyms, in the case of multiword synonyms.
as a workaround it was suggested to use the older synonym impl, but it has bugs too (just in a different way).