[ https://issues.apache.org/jira/browse/AVRO-4045 ]


    David Mollitor deleted comment on AVRO-4045:
    --------------------------------------

was (Author: belugabehr):
During some micro-benchmarking, I found that there was a significant overhead 
to calling the JDK method Arrays#equals. For short strings, the difference in 
performance was two orders of magnitude. I expected some overhead, but was 
surprised by the final outcome.

However, for longer strings that share a long common prefix (e.g., 16+ 
characters), the vectorized method performed 50% better.

So, as a compromise: for short strings, use the existing method; for longer 
strings, use the vectorized methods.
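
A minimal sketch of that compromise, not the actual Avro patch. The class 
name HybridByteCompare, the helper compareLoop, and the cutoff of 16 bytes are 
all hypothetical; a real cutoff would have to come from benchmarking. Note 
that Arrays.compare treats bytes as signed, so the loop below does the same 
to keep both paths consistent. Arrays.compareUnsigned would be the 
counterpart if unsigned ordering is needed.

```java
import java.util.Arrays;

public final class HybridByteCompare {
    // Hypothetical cutoff, not taken from the Avro change; tune by benchmarking.
    private static final int VECTORIZED_THRESHOLD = 16;

    private HybridByteCompare() {
    }

    /** Compares two byte ranges lexicographically (signed bytes). */
    public static int compare(byte[] a, int aStart, int aLen,
                              byte[] b, int bStart, int bLen) {
        if (Math.min(aLen, bLen) < VECTORIZED_THRESHOLD) {
            // Short input: a plain loop avoids the call overhead of the JDK method.
            return compareLoop(a, aStart, aLen, b, bStart, bLen);
        }
        // Longer input: delegate to the JDK 9+ method, which may be vectorized.
        return Arrays.compare(a, aStart, aStart + aLen, b, bStart, bStart + bLen);
    }

    /** Simple byte-by-byte comparison with the same ordering as Arrays.compare. */
    public static int compareLoop(byte[] a, int aStart, int aLen,
                                  byte[] b, int bStart, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int cmp = Byte.compare(a[aStart + i], b[bStart + i]);
            if (cmp != 0) {
                return cmp;
            }
        }
        // All shared positions match: the shorter range sorts first.
        return aLen - bLen;
    }
}
```

Callers should rely only on the sign of the result, since that is all 
Arrays.compare documents.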

> Use JDK to Compare Byte Array Lexicographically 
> ------------------------------------------------
>
>                 Key: AVRO-4045
>                 URL: https://issues.apache.org/jira/browse/AVRO-4045
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.11.3
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> The JDK introduced a new compare function for byte arrays in JDK9.
> Leverage this vectorized and "IntrinsicCandidate" (via ArraysSupport.java) 
> implementation for Avro.
> [https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Arrays.html#compare(byte%5B%5D,int,int,byte%5B%5D,int,int)]
> https://github.com/openjdk/jdk17/blob/4afbcaf55383ec2f5da53282a1547bac3d099e9d/src/java.base/share/classes/java/util/Arrays.java#L5804



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
