[ 
https://issues.apache.org/jira/browse/LUCENE-8592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16712958#comment-16712958
 ] 

Michael McCandless commented on LUCENE-8592:
--------------------------------------------

Phew this is evil, nice catch [~jim.ferenczi]! Thank you for adding reverse 
true/false testing through all the test cases. The patch looks good – moving 
the {{reverseMul}} logic to after the comparison.

This may have high impact? Any time the user sorts by an int field, reversed, 
and has missing values, their index will now fail {{CheckIndex}} on upgrade and 
require reindexing from their original docs.

Whether this fix is buggy depends on the return values of e.g. 
{{Integer.compareTo}}, {{String.compareTo}}, etc.? I.e. do those methods ever 
return {{Integer.MIN_VALUE}}. Looking at [Integer.java in at least 
JDK8|http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/lang/Integer.java#l1233]
 it seems we are good – it returns 0, 1, -1. And [String.java returns the 
difference of two 
{{chars}}|http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/lang/String.java#l1140].
 Hopefully all other native {{compareTo}} impls are similar.

Should we add a test case confirming {{CheckIndex}} detects this bug? Create a 
broken index and zip it up and commit that, with a test that unzips and 
confirms {{CheckIndex}} fails on it?

Instead of the (int) casts e.g. in {{return (int) 
globalOrds.get(readerValues.ordValue())}}, maybe we should switch to 
{{Math.toIntExact}}?  Because these are single values, it should never happen 
that the long ord exceeds positive int space, so the conversion should always 
work safely.

> MultiSorter#sort incorrectly sort Integer/Long#MIN_VALUE when the natural 
> sort is reversed
> ------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8592
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8592
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Jim Ferenczi
>            Priority: Major
>         Attachments: LUCENE-8592.patch
>
>
> MultiSorter#getComparableProviders on an integer or long field doesn't handle 
> MIN_VALUE correctly when the natural order is reversed. To handle reverse 
> sort we use the negation of the value but there is no check for overflows so 
> MIN_VALUE for ints and longs are always sorted first (even if the natural 
> order is reversed). 
> This method is used by index sorting when merging already sorted segments 
> together. This means that a sorted index can be incorrectly sorted if it uses 
> a reverse sort and a missing value set to MIN_VALUE (long or int or values 
> inside the segment that are equals to MIN_VALUE).
> This a bad bug because it affects the documents order inside segments and 
> only a reindex can restore the correct sort order. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to