[ 
https://issues.apache.org/jira/browse/LUCENE-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162647#comment-16162647
 ] 

Uwe Schindler commented on LUCENE-7966:
---------------------------------------

bq. One thing that we lose here is the verbosity of exception message (field 
names, doc ids)

Yes, but there is a reason to change to the new methods! Robert did not add 
that to the issue description. I directed him on the weekend during a private 
chat to the new methods (which was caused by my comment in a former issue with 
the missing check).

To make it short: Those methods are highly optimized by Hotspot. In fact they 
replaced the bounds checks in ByteBuffers and other places throughout the JDK 
to call those methods instead of hardcoded if/then/else statements. I also 
talked with RĂ©mi Forax about it this summer to get more information. He 
strongly recommends to use the new methods from Objects and Arrays instead of 
manually crafted if/then/else checks. The reason is: Those methods are 
intrinsics and are more or less replaced by highly optimized bounds check code. 
In addition, if you add several levels of bounds checks more magic is 
happening: a high level method checks bounds and call lower level method, which 
does bounds checks again (e.g., a ByteBuffer in the JDK), and this lower level 
method again accesses a Java array (that implcitely does bounds checks, too), - 
then Hotspot adds all calls of those Objects' index check methods to the 
internal AST and it can then safely eliminate all of them except one. And if it 
sees that a variable is unsigned it will also remove negative checks ... and so 
on. This was done, because they had very hard problems in eliminating those 
checks everywhere when somebody creates an if statement (there are tons of ways 
to do the same check with if statements!), the OpenJDK developers argued to use 
a intrinsic replacement method instead of hardcoded bounds checks with hundreds 
of variants. By that they only have to optimize a single "if statement" and can 
replace it by assembly code and remove duplicates.

Even shorter: If you use the Objects methods instead of if statements, you can 
add more safety to your code, as duplicate checks are eliminated. So we can 
even start to add checks into BytesRef. Or we can remove the stupid try/catch 
blocks in ByteBufferIndexInput.

> build mr-jar and use some java 9 methods if available
> -----------------------------------------------------
>
>                 Key: LUCENE-7966
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7966
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: general/build
>            Reporter: Robert Muir
>         Attachments: LUCENE-7966.patch, LUCENE-7966.patch
>
>
> See background: http://openjdk.java.net/jeps/238
> It would be nice to use some of the newer array methods and range checking 
> methods in java 9 for example, without waiting for lucene 10 or something. If 
> we build an MR-jar, we can start migrating our code to use java 9 methods 
> right now, it will use optimized methods from java 9 when thats available, 
> otherwise fall back to java 8 code.  
> This patch adds:
> {code}
> Objects.checkIndex(int,int)
> Objects.checkFromToIndex(int,int,int)
> Objects.checkFromIndexSize(int,int,int)
> Arrays.mismatch(byte[],int,int,byte[],int,int)
> Arrays.compareUnsigned(byte[],int,int,byte[],int,int)
> Arrays.equal(byte[],int,int,byte[],int,int)
> // did not add char/int/long/short/etc but of course its possible if needed
> {code}
> It sets these up in {{org.apache.lucene.future}} as 1-1 mappings to java 
> methods. This way, we can simply directly replace call sites with java 9 
> methods when java 9 is a minimum. Simple 1-1 mappings mean also that we only 
> have to worry about testing that our java 8 fallback methods work.
> I found that many of the current byte array methods today are willy-nilly and 
> very lenient for example, passing invalid offsets at times and relying on 
> compare methods not throwing exceptions, etc. I fixed all the instances in 
> core/codecs but have not looked at the problems with AnalyzingSuggester. Also 
> SimpleText still uses a silly method in ArrayUtil in similar crazy way, have 
> not removed that one yet.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to