[jira] [Commented] (SOLR-13126) Inconsistent score in debug and result with multiple multiplicative boosts

Thomas Aglassinger (JIRA) Mon, 14 Jan 2019 09:49:05 -0800


    [ 
https://issues.apache.org/jira/browse/SOLR-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742343#comment-16742343
 ]


Thomas Aglassinger commented on SOLR-13126:
-------------------------------------------

We've been digging into this and managed to somewhat track the issue down 
although unfortunately our knowledge of the inner workings of Solr and Lucene 
in particular is not sufficient to fix it and provide a patch.

We did however add logging statements that showcase the difference in the 
scoring for some trivial queries. To make the logging easier to read we 
refactored several anonymous classes to inner classes with expressive names and 
added several {{toString()}} functions. The log messages are deliberately 
written at warning level so we can easily separate them from Solr's own info 
and debug messages.

If it helps we can make these changes available although it's not feasible to 
merge them because they are only debug hacks.

Here's what we found out so far:

As described in the initial issue description we can reproduce that the score 
of a query result is computed correctly in the explain segments but incorrectly 
in the actual result if only one of two multiplicative boost conditions match. 
We now further simplified our query by splitting it into 3 separate queries 
with a filter query on one specific document. The cases are:

 # name matches both boosts (netzteil and sony): Original Sony Vaio Netzteil
 # name matches one boost (netzteil but not sony): GS-Netzteil 20W schwarz
 # name matches no boost (neither netzteil nor sony): Camcorderband DV 100min 
(2)
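
For reference, the mismatch between result score and explain score from the original report can be checked mechanically when debugQuery is enabled. The following is only a sketch: it assumes the unique key field is {{id}} and that {{debug.explain}} holds the default plain-text explanation per document. The sample response is cut down to the one document and the values quoted in the issue description; the helper names are our own.

```python
import re

def explain_value(explain_text):
    """Return the leading number of an explain string such as '2.0 = product of:'."""
    match = re.match(r"\s*([0-9]+(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?)\s*=", explain_text)
    return float(match.group(1)) if match else None

def score_mismatches(response, tolerance=1e-6):
    """List (id, result score, explain score) for documents where the two differ."""
    explains = response.get("debug", {}).get("explain", {})
    mismatches = []
    for doc in response["response"]["docs"]:
        explained = explain_value(explains.get(doc["id"], ""))
        if explained is not None and abs(explained - doc["score"]) > tolerance:
            mismatches.append((doc["id"], doc["score"], explained))
    return mismatches

# Reduced sample using the values quoted in the issue description.
sample = {
    "response": {"docs": [
        {"id": "someProducts/Online/797856300000",
         "name_text_de": "GS-Netzteil 20W schwarz",
         "score": 1.0},
    ]},
    "debug": {"explain": {
        "someProducts/Online/797856300000": "\n2.0 = product of:\n  1.0 = boost\n",
    }},
}

for doc_id, result_score, explain_score in score_mismatches(sample):
    print(doc_id, "result:", result_score, "explain:", explain_score)
```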

Attached you will find the log files for these queries and the JSON of the queries 
themselves. This time we did not enable debugQuery in order to log only the 
incorrect score of the actual result.

Each request was executed on a freshly restarted server (local, no replication, 
no shards) to ensure caching does not pollute the findings.
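
A request of this kind can be assembled as sketched below. The {{q}}, {{ymb}}, {{yq}} and {{fl}} values are the ones from the issue description with the JSON escaping removed; the host, the core name {{mycore}} and the exact form of the filter query on a single document are assumptions and may differ from the attached JSON files.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical host and core name; adjust to the local test environment.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"

def build_query_url(doc_id):
    """Build the select URL for the boost query, restricted to one document."""
    params = {
        "q": "{!boost b=$ymb}(+{!lucene v=$yq})",
        "ymb": (r'product(query({!v="name_text_de\:Netzteil\^=2.0"},1),'
                r'query({!v="name_text_de\:Sony\^=3.0"},1))'),
        "yq": "*:*",
        "fl": "score,id,code_string,name_text_de",
        # Assumed filter that pins the request to a single document.
        "fq": 'id:"%s"' % doc_id,
    }
    return SOLR_SELECT + "?" + urlencode(params)

url = build_query_url("someProducts/Online/797856300000")
print(url)
```

Only the URL is built here; sending it (for example with {{urllib.request.urlopen}}) requires the running test server.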

We made the following observations:
 # Both matches: Lucene detects both matches with {{QueryDocValues.exists()}} 
and then computes scores for them using {{QueryDocValues.floatVal()}}. This seems 
to be called eventually by the scorer utilized by the result of 
{{org.apache.lucene.search.DoubleValues#withDefault()}}, based on a formerly 
anonymous class that we renamed to {{DoubleValues_DoubleValuesWithDefault()}}.
 # Single match: {{QueryDocValues.exists()}} detects one match and considers 
the other false (which seems correct). After that however it only seems to work 
with various variants of a constant score of 1.0, which in the end results in 
1.0. Notice that this query uses the same {{withDefault()}} as above but 
performs a very different computation mostly based on constant values. There is 
no call to {{QueryDocValues.floatVal()}}
 # No match: {{QueryDocValues.exists()}} does not find anything and results in 
a score of 1.0 as expected.
 # All logs seem to compute the score for a document with the ID -1, which 
utilizes {{QueryDocValues.floatVal()}}. As far as we understand this seems to 
be some initialization step independent of the actual query that happens only 
for the first query sent to the server.

Interestingly, when you compare the logs for single and no match, they are almost 
identical apart from the {{QueryDocValues.exists()}}, an additional 
{{BooleanWeight()}} and various {{toString()}} hashes.
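
A comparison like this is easier when the {{toString()}} hashes are masked first. The sketch below does that with the Python standard library; the two sample logs are made-up placeholder lines, not excerpts from the attached files, and the {{@hex}} pattern assumes Java's default {{Object.toString()}} format.

```python
import difflib
import re

# Java's default toString() appends "@" plus a hexadecimal hash code.
HASH = re.compile(r"@[0-9a-f]+")

def normalize(lines):
    """Mask object hashes so that only meaningful differences remain."""
    return [HASH.sub("@<hash>", line.rstrip()) for line in lines]

def log_diff(a_lines, b_lines):
    """Return only the added and removed lines between two normalized logs."""
    diff = difflib.ndiff(normalize(a_lines), normalize(b_lines))
    return [line for line in diff if line[:2] in ("+ ", "- ")]

# Placeholder log lines for illustration only.
single_match = ["ExampleScorer@1a2b3c exists=true", "ExampleScorer@1a2b3c score=1.0"]
no_match = ["ExampleScorer@9f8e7d exists=false", "ExampleScorer@9f8e7d score=1.0"]

for line in log_diff(single_match, no_match):
    print(line)
```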

Our expectation would have been that queries for single and both matches would 
have produced a fairly similar log using similar scorers but different scores 
(2.0 vs 6.0).
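
The expected values follow directly from the boost function in the issue description: each {{query()}} contributes its boost if the term matches and the default 1 otherwise, and {{product()}} multiplies the results. The sketch below models only this arithmetic; the tokenizer is a crude stand-in for the {{text_de}} analyzer, not Solr's actual analysis chain.

```python
import re

# Boost rules from the issue: term -> multiplicative boost.
BOOSTS = {"netzteil": 2.0, "sony": 3.0}

def tokens(name):
    """Crude stand-in for the analyzer: lowercase and split on non-alphanumerics."""
    return set(re.findall(r"[0-9a-zäöüß]+", name.lower()))

def expected_score(name, base_score=1.0):
    """Model product(query(term^=boost, 1), ...) applied to the base score of *:*."""
    terms = tokens(name)
    score = base_score
    for term, boost in BOOSTS.items():
        # A missing term falls back to the default value 1, preserving the score.
        score *= boost if term in terms else 1.0
    return score

for name in ("Original Sony Vaio Netzteil",
             "GS-Netzteil 20W schwarz",
             "Camcorderband DV 100min (2)"):
    print(name, "->", expected_score(name))
```

For the three test documents this yields 6.0, 2.0 and 1.0, whereas the actual result reports 1.0 for the single match.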

As we can reproduce these results consistently in a small testing environment 
we currently see the following options to proceed further:
 # With some hints on where to further dig into the source code we might be 
able to find the real culprit causing the inconsistent score. Any pointers?
 # We could make the solrconfig.xml, schema.xml and the core files for Solr 7.5 
available for someone else to debug who has a better grasp of the inner 
workings. Again, this is a small test environment with only a few documents, and 
we could probably reduce this further (e.g. by removing Solr fields unrelated 
to this issue).

Any help would be much appreciated,
 Thomas

> Inconsistent score in debug and result with multiple multiplicative boosts
> --------------------------------------------------------------------------
>
>                 Key: SOLR-13126
>                 URL: https://issues.apache.org/jira/browse/SOLR-13126
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: search
>    Affects Versions: 7.5.0
>         Environment: Reproduced with macOS 10.14.1, a quick test with Windows 
> 10 showed the same result.
>            Reporter: Thomas Aglassinger
>            Priority: Major
>         Attachments: debugQuery.json, 
> solr_match_neither_nextteil_nor_sony.json, 
> solr_match_neither_nextteil_nor_sony.txt, solr_match_netzteil_and_sony.json, 
> solr_match_netzteil_and_sony.txt, solr_match_netzteil_only.json, 
> solr_match_netzteil_only.txt
>
>
> Under certain circumstances search results from queries with multiple 
> multiplicative boosts using the Solr functions {{product()}} and {{query()}} 
> result in a score that is inconsistent with the one from the debugQuery 
> information. Also only the debug score is correct while the actual search 
> results show a wrong score.
> This seems somewhat similar to the behaviour described in 
> https://issues.apache.org/jira/browse/LUCENE-7132, though this issue has been 
> resolved a while ago.
> A little background: we are using Solr as a search platform for the 
> e-commerce framework SAP Hybris. There the shop administrator can create 
> multiplicative boost rules (see below for an example) where a value like 2.0 
> means that an item gets boosted to 200%. This works fine in the demo shop 
> distributed by SAP but breaks in our shop. We encountered the issue when 
> upgrading from Solr 7.2.1 / Hybris 6.7 to Solr 7.5 / Hybris 18.8.3 (which 
> would have been named Hybris 6.8 but the version naming scheme changed).
> We reduced the Solr query generated by Hybris to the relevant parts and could 
> reproduce the issue in the Solr admin without any Hybris connection.
> I attached the JSON result of a test query but here's a description of the 
> parts that seemed most relevant to me.
> The {{responseHeader.params}} reads (slightly rearranged):
> {code:java}
> "q":"{!boost b=$ymb}(+{!lucene v=$yq})",
> "ymb":"product(query({!v=\"name_text_de\\:Netzteil\\^=2.0\"},1),query({!v=\"name_text_de\\:Sony\\^=3.0\"},1))",
> "yq":"*:*",
> "sort":"score desc",
> "debugQuery":"true",
> // Added to keep the output small but probably unrelated to the actual issue
> "fl":"score,id,code_string,name_text_de",
> "fq":"catalogId:\"someProducts\"",
> "rows":"10",
> {code}
> This example boosts the German product name (field {{name_text_de}}) in case 
> it contains certain terms:
>  * "Netzteil" (power supply) is boosted to 200%
>  * "Sony" is boosted to 300%
> Consequently a product containing both terms should be boosted to 600%.
> Also the query function has the value 1 specified as default in case the name 
> does not contain the respective term resulting in a pseudo boost that 
> preserves the score.
> According to the debug information the parser used is the LuceneQParser, 
> which translates this to the following parsed query:
> {quote}FunctionScoreQuery(FunctionScoreQuery(+*:*, scored by 
> boost(product(query((ConstantScore(name_text_de:netzteil))^2.0,def=1.0),query((ConstantScore(name_text_de:sony))^3.0,def=1.0)))))
> {quote}
> And the translated boost is:
> {quote}org.apache.lucene.queries.function.valuesource.ProductFloatFunction:product(query((ConstantScore(name_text_de:netzteil))^2.0,def=1.0),query((ConstantScore(name_text_de:sony))^3.0,def=1.0))
> {quote}
> When taking a look at the search result, among others the following products 
> are included (see the JSON comments for an analysis of each result):
> {code:javascript}
>      {
>         "id":"someProducts/Online/test7111111",
>         "name_text_de":"Original Sony Vaio Netzteil",
>         "code_string":"test7111111",
>         // CORRECT, both "Netzteil" and "Sony" are included in the name
>         "score":6.0},
>       {
>         "id":"someProducts/Online/taxTestingProductThree",
>         "name_text_de":"Steuertestprodukt Zwei",
>         "code_string":"taxTestingProductThree",
>         // CORRECT, neither "Netzteil" nor "Sony" are included in the name
>         "score":1.0},
>       {
>         "id":"someProducts/Online/797856300000",
>         "name_text_de":"GS-Netzteil 20W schwarz",
>         "code_string":"797856300000",
>         // WRONG, "Netzteil" is part of the name; 
>         // note that we do split words on hyphen because 
>         // WordDelimiterGraphFilterFactory.generateWordParts="1"
>         "score":1.0},
> {code}
> So apparently the multiplicative boost works for product names where all the 
> boosted terms are included but fails if only one of the terms matches.
> There are also other products in the result that contain either "Netzteil" or 
> "Sony" but still get a score of 1.0 instead of 2.0 resp. 3.0.
> Surprisingly in the {{explain}} segment the score for the product with 
> "Netzteil" but without "Sony" correctly is 2.0:
> {code:java}
> 2.0 = product of:
>   1.0 = boost
>   2.0 = product of:
>     1.0 = *:*
>     2.0 = product(query((ConstantScore(name_text_de:netzteil))^2.0,def=1.0)=2.0,query((ConstantScore(name_text_de:sony))^3.0,def=1.0)=1.0)
> {code}
> The type definition of {{text_de}} in the {{schema.xml}} (which is used for 
> "name_text_de") includes the following filters:
> {code:xml}
> <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
>     <analyzer>
>         <tokenizer class="solr.WhitespaceTokenizerFactory" />
>         <filter class="solr.WordDelimiterGraphFilterFactory"
>                 preserveOriginal="1" generateWordParts="1" generateNumberParts="1"
>                 catenateWords="1" catenateNumbers="1" catenateAll="0"
>                 splitOnCaseChange="1" />
>         <filter class="solr.LowerCaseFilterFactory" />
>     </analyzer>
> </fieldType>
> {code}
> The {{solrconfig.xml}} is mostly taken from the Hybris defaults and AFAIK 
> does not do anything unusual. The following lines might be of interest:
> {code:xml}
> <luceneMatchVersion>7.5.0</luceneMatchVersion>
> <queryParser name="multiMaxScore" 
> class="de.hybris.platform.solr.search.MultiMaxScoreQParserPlugin"/>
> {code}
> To sum it up, my expectation would have been:
> * The scores in the result and explain sections are identical.
> * Names matching only one of the two multiplied boost terms receive the 
> respective single boost instead of the default score 1.0.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

