Looks good!
Thanks
Maurizio
On 19/06/2020 11:56, Chris Hegarty wrote:
Paul, Maurizio,
This version incorporates all feedback so far.
https://cr.openjdk.java.net/~chegar/8247696/webrev.01/
Results on my machine:
Benchmark Mode Cnt Score Error Units
BulkOps.mismatch_large_bytebuffer avgt 30 88266.728
? 4083.476 ns/op
BulkOps.mismatch_large_segment avgt 30 86141.343
? 2450.450 ns/op
BulkOps.mismatch_small_bytebuffer avgt 30 6.360 ? 0.425 ns/op
BulkOps.mismatch_small_segment avgt 30 4.582 ? 1.040 ns/op
-Chris.
On 19 Jun 2020, at 00:35, Paul Sandoz <paul.san...@oracle.com
<mailto:paul.san...@oracle.com>> wrote:
Thanks Chris.
On Jun 18, 2020, at 2:57 AM, Maurizio Cimadamore
<maurizio.cimadam...@oracle.com
<mailto:maurizio.cimadam...@oracle.com>> wrote:
Thanks for looking at this Chris
On 17/06/2020 21:56, Paul Sandoz wrote:
Hi Chris,
AbstractMemorySegmentImpl
—
In vectorizedMismatchLarge:
163 if (remaining > 7)
164 throw new InternalError("remaining greater than
7: " + remaining);
165 i = length - remaining;
166 }
Should this check be an assert?
I suggested that to Chris, since sometimes asserts are enabled when
running tests, sometimes are not - in langtools we moved away from
using asserts as we realized that in certain cases they were
silently failing. I'm ok with whatever standard you feel comfortable
with though.
Automated running of tests enable assertions (-ea -esa), perhaps the
use is more commonly accepted in library code than compiler code. I
would favor so in this case if necessary (sometimes they are avoided
to reduce inline thresholds).
—
This fix prompted me to think more deeply about the implementation
of vectorizedMismatchLarge and its use of
vectorizedMismatch. Sorry, I should have thought about this more
throughly earlier on.
We need to refine the approach, not for this patch, but something
to follow up after. I think there are two issues.
1) The intrinsic method vectorizedMismatch could potentially bomb
out at any point and return the bitwise compliment of the
remaining number of elements to check.
Obviously, there is no point doing that arbitrarily but a stub
implementation for, say, x86 AVX-512 might decide bomb out for <
64 remaining elements, rather than apply vector operations on
smaller vector sizes or use a scalar loop. It does not today, but I
think we should guard against that potentially happening, otherwise
bad things can happen.
So your worry here is that we'll end up with an infinite loop, right?
Or more likely that an incorrect result is returned since tail
elements will be skipped over as the offset and size is updated.
If so, we could check remaining against previous remaining and bomb
out too if no further progress seem to be made?
I think it's better to always terminate the loop after the last
sub-range is checked, rather than unnecessarily calling twice.
I am not confident the vectorizedMismatch intrinsic stub has been
tested properly on very small lengths, since it's never directly
called in such cases, so also keeping "remaining > 7" is good too.
Paul.
I think the loop should exit when the last sub-range has been
checked. We should rely on other tests to ensure the intrinsic
method is operating efficiently.
2) This method only works when operating on byte arrays. It will
not work correctly if operating on short or long arrays, since we
are not adjusting the length and offsets accordingly by the scale.
It's probably easier to just rename this as
vectorizedMismatchLargeForBytes and drop the log2ArrayIndexScale
argument. Then expand later if need be. I still think the method
rightly belongs in ArraysSupport.
Yep - probably good idea to restrict on bytes, for now.
Maurizio
Paul.
On Jun 17, 2020, at 8:33 AM, Chris Hegarty
<chris.hega...@oracle.com <mailto:chris.hega...@oracle.com>> wrote:
The MemorySegment::mismatch implementation added vectorized
mismatch of long sizes. The implementation is trivial, but
the starting point for a more optimized implementation, if needed.
ArraysSupport::vectorizedMismatchLarge incorrectly returns the
bitwise complement of the offset of the first mismatch, where it
should return the bitwise complement of the number of remaining
pairs of elements to be checked in the tail of the two arrays. The
AbstractMemorySegmentImpl::mismatch masked this problem, since
it seamlessly compared the remaining tail, which is larger than it
should be.
Webrev:
https://cr.openjdk.java.net/~chegar/8247696/webrev.00/
I updated the exiting BulkOps micro-benchmark to cover mismatch.
Here are the results, compared to ByteBuffer::mismatch, on my machine:
Benchmark Mode Cnt Score Error
Units
BulkOps.mismatch_large_bytebuffer avgt 30 740186.973 ?
119314.207 ns/op
BulkOps.mismatch_large_segment avgt 30 683475.305 ?
76355.043 ns/op
BulkOps.mismatch_small_bytebuffer avgt 30 7.367 ? 0.523
ns/op
BulkOps.mismatch_small_segment avgt 30 4.140 ? 0.602
ns/op
-Chris.