Re: RFR [15] 8247696: Incorrect tail computation for large segments in AbstractMemorySegmentImpl::mismatch

Maurizio Cimadamore Fri, 19 Jun 2020 12:11:17 -0700

Looks good!

Thanks
Maurizio


On 19/06/2020 11:56, Chris Hegarty wrote:

Paul, Maurizio,

This version incorporates all feedback so far.

https://cr.openjdk.java.net/~chegar/8247696/webrev.01/
Results on my machine:

Benchmark                          Mode  Cnt      Score       Error  Units
BulkOps.mismatch_large_bytebuffer  avgt   30  88266.728 ±  4083.476  ns/op
BulkOps.mismatch_large_segment     avgt   30  86141.343 ±  2450.450  ns/op
BulkOps.mismatch_small_bytebuffer  avgt   30      6.360 ±     0.425  ns/op
BulkOps.mismatch_small_segment     avgt   30      4.582 ±     1.040  ns/op

-Chris.
On 19 Jun 2020, at 00:35, Paul Sandoz wrote:
Thanks Chris.
On Jun 18, 2020, at 2:57 AM, Maurizio Cimadamore wrote:
Thanks for looking at this Chris

On 17/06/2020 21:56, Paul Sandoz wrote:
Hi Chris,

AbstractMemorySegmentImpl
—

In vectorizedMismatchLarge:

            if (remaining > 7)
                throw new InternalError("remaining greater than 7: " + remaining);
            i = length - remaining;
        }

Should this check be an assert?
I suggested that to Chris, since sometimes asserts are enabled when running tests, sometimes are not - in langtools we moved away from using asserts as we realized that in certain cases they were silently failing. I'm ok with whatever standard you feel comfortable with though.
Automated running of tests enables assertions (-ea -esa); perhaps the use is more commonly accepted in library code than compiler code. I would favor so in this case if necessary (sometimes they are avoided to reduce inline thresholds).
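To make the trade-off above concrete, here is a minimal sketch of the two styles of check. The class and method names are hypothetical and are not taken from the webrev; only the message text and the "remaining > 7" condition come from the quoted code.

```java
// Hypothetical illustration only; not the code under review.
final class TailCheckSketch {

    // Explicit check: always active, whether or not the JVM runs with -ea.
    static long tailStartWithThrow(long length, long remaining) {
        if (remaining > 7)
            throw new InternalError("remaining greater than 7: " + remaining);
        return length - remaining;
    }

    // Assertion: only evaluated when assertions are enabled (-ea / -esa),
    // so a violation passes silently in a run without those flags.
    static long tailStartWithAssert(long length, long remaining) {
        assert remaining <= 7 : "remaining greater than 7: " + remaining;
        return length - remaining;
    }

    public static void main(String[] args) {
        System.out.println(tailStartWithThrow(100, 4));
        System.out.println(tailStartWithAssert(100, 4));
    }
}
```

With assertions disabled, tailStartWithAssert(100, 8) would simply return a tail start, while tailStartWithThrow(100, 8) fails in every configuration.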
—
This fix prompted me to think more deeply about the implementation of vectorizedMismatchLarge and its use of vectorizedMismatch. Sorry, I should have thought about this more thoroughly earlier on.
We need to refine the approach, not for this patch, but something to follow up after. I think there are two issues.
1) The intrinsic method vectorizedMismatch could potentially bomb out at any point and return the bitwise complement of the remaining number of elements to check.
Obviously, there is no point doing that arbitrarily, but a stub implementation for, say, x86 AVX-512 might decide to bomb out for < 64 remaining elements, rather than apply vector operations on smaller vector sizes or use a scalar loop. It does not today, but I think we should guard against that potentially happening, otherwise bad things can happen.
So your worry here is that we'll end up with an infinite loop, right?
Or more likely that an incorrect result is returned, since tail elements will be skipped over as the offset and size are updated.
If so, we could check remaining against previous remaining and bomb out too if no further progress seems to be made?
I think it's better to always terminate the loop after the last sub-range is checked, rather than unnecessarily calling twice.
I am not confident the vectorizedMismatch intrinsic stub has been tested properly on very small lengths, since it's never directly called in such cases, so also keeping "remaining > 7" is good too.
Paul.
I think the loop should exit when the last sub-range has been checked. We should rely on other tests to ensure the intrinsic method is operating efficiently.
2) This method only works when operating on byte arrays. It will not work correctly if operating on short or long arrays, since we are not adjusting the length and offsets accordingly by the scale. It's probably easier to just rename this as vectorizedMismatchLargeForBytes and drop the log2ArrayIndexScale argument. Then expand later if need be. I still think the method rightly belongs in ArraysSupport.
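The loop shape discussed under issue 1 could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the webrev: chunkMismatch is a hypothetical pure-Java stand-in for the vectorizedMismatch intrinsic, plain byte[] arrays replace the Object/offset addressing, and maxChunk is a made-up parameter standing in for Integer.MAX_VALUE so the behaviour can be tried on small inputs.

```java
// Hypothetical sketch; names and parameters are assumptions for illustration.
final class LargeMismatchSketch {

    // Stand-in for the intrinsic: checks whole 8-byte groups only. Returns the
    // index (relative to off) of the first mismatch, or the bitwise complement
    // of the number of tail bytes it left unchecked.
    static int chunkMismatch(byte[] a, byte[] b, int off, int size) {
        int checked = (size / 8) * 8;
        for (int i = 0; i < checked; i++) {
            if (a[off + i] != b[off + i]) return i;
        }
        return ~(size - checked);
    }

    // Walks the range in sub-ranges of at most maxChunk bytes and stops once
    // the last sub-range has been handed to the stand-in, however many tail
    // bytes it left unchecked. Returns the mismatch index, or the bitwise
    // complement of the number of bytes the caller still has to compare.
    static long mismatchLargeForBytes(byte[] a, byte[] b, long length, int maxChunk) {
        long off = 0;
        long remaining = length;
        boolean lastSubRange = false;
        while (remaining > 7 && !lastSubRange) {
            int size;
            if (remaining > maxChunk) {
                size = maxChunk;
            } else {
                size = (int) remaining;
                lastSubRange = true;
            }
            // The cast is safe here only because the sketch uses byte[] arrays.
            int r = chunkMismatch(a, b, (int) off, size);
            if (r >= 0) return off + r;
            int checked = size - ~r;   // bytes actually compared in this call
            off += checked;
            remaining -= checked;
        }
        return ~remaining;
    }
}
```

Because the offset only advances by the number of bytes actually compared, no tail bytes are skipped, and the lastSubRange flag ends the loop even if the stand-in were to return early without making progress.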
Yep - probably good idea to restrict on bytes, for now.
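As a small illustration of the scaling concern in issue 2: the element count compared by one call would have to be shifted by log2ArrayIndexScale before being added to a byte offset. The helper below is hypothetical and only shows the arithmetic; it is not part of ArraysSupport.

```java
// Hypothetical helper showing the element-to-byte conversion only.
final class ScaleSketch {

    // sizeInElements: elements passed to one call.
    // remainingElements: elements the call reported as unchecked.
    // Returns how many bytes the byte offset should advance.
    static long bytesChecked(int sizeInElements, int remainingElements, int log2ArrayIndexScale) {
        long elementsChecked = (long) sizeInElements - remainingElements;
        return elementsChecked << log2ArrayIndexScale;
    }

    public static void main(String[] args) {
        System.out.println(bytesChecked(10, 2, 0)); // byte elements
        System.out.println(bytesChecked(10, 2, 3)); // long elements
    }
}
```

For bytes the scale is 0, so elements and bytes coincide, which is why a bytes-only method can ignore the conversion.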

Maurizio
Paul.
On Jun 17, 2020, at 8:33 AM, Chris Hegarty wrote:
The MemorySegment::mismatch implementation added vectorized mismatch of long sizes. The implementation is trivial, but the starting point for a more optimized implementation, if needed. ArraysSupport::vectorizedMismatchLarge incorrectly returns the bitwise complement of the offset of the first mismatch, where it should return the bitwise complement of the number of remaining pairs of elements to be checked in the tail of the two arrays. The AbstractMemorySegmentImpl::mismatch masked this problem, since it seamlessly compared the remaining tail, which is larger than it should be.
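To illustrate the return-value contract described above, here is a minimal sketch of how a caller interprets the result. The method wordMismatch is a hypothetical pure-Java stand-in for the vectorized routine, and plain byte[] arrays are assumed; none of this is the actual JDK code.

```java
// Hypothetical sketch of the "index or ~remaining" contract.
final class MismatchContractSketch {

    // Compares whole 8-byte groups only. Returns the index of the first
    // mismatch, or the bitwise complement of the number of tail elements
    // that were not checked.
    static int wordMismatch(byte[] a, byte[] b, int length) {
        int checked = (length / 8) * 8;
        for (int i = 0; i < checked; i++) {
            if (a[i] != b[i]) return i;
        }
        return ~(length - checked);
    }

    // Caller side: a negative result means "no mismatch so far", and
    // ~result says how many tail elements are still to be compared.
    static int mismatch(byte[] a, byte[] b) {
        int length = Math.min(a.length, b.length);
        int r = wordMismatch(a, b, length);
        if (r >= 0) return r;
        int remaining = ~r;
        for (int i = length - remaining; i < length; i++) {
            if (a[i] != b[i]) return i;
        }
        return a.length == b.length ? -1 : length;
    }
}
```

If the routine returned the complement of an offset instead of the remaining count, the caller's "length - remaining" would start the tail scan in the wrong place. A caller that tolerates an oversized tail still gets the right answer, only with more scalar work.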
Webrev:
https://cr.openjdk.java.net/~chegar/8247696/webrev.00/
I updated the existing BulkOps micro-benchmark to cover mismatch. Here are the results, compared to ByteBuffer::mismatch, on my machine:

Benchmark                          Mode  Cnt       Score         Error  Units
BulkOps.mismatch_large_bytebuffer  avgt   30  740186.973 ± 119314.207  ns/op
BulkOps.mismatch_large_segment     avgt   30  683475.305 ±  76355.043  ns/op
BulkOps.mismatch_small_bytebuffer  avgt   30       7.367 ±      0.523  ns/op
BulkOps.mismatch_small_segment     avgt   30       4.140 ±      0.602  ns/op
-Chris.
