Re: RFR [15] 8247696: Incorrect tail computation for large segments in AbstractMemorySegmentImpl::mismatch

Paul Sandoz Thu, 18 Jun 2020 16:36:32 -0700

Thanks Chris.

> On Jun 18, 2020, at 2:57 AM, Maurizio Cimadamore 
> <maurizio.cimadam...@oracle.com> wrote:
> 
> Thanks for looking at this Chris
> 
> On 17/06/2020 21:56, Paul Sandoz wrote:
>> Hi Chris,
>> 
>> AbstractMemorySegmentImpl
>> —
>> 
>> In vectorizedMismatchLarge:
>> 
>>  163             if (remaining > 7)
>>  164                 throw new InternalError("remaining greater than 7: " + 
>> remaining);
>>  165             i = length - remaining;
>>  166         }
>> 
>> Should this check be an assert?
> I suggested that to Chris, since sometimes asserts are enabled when running 
> tests, sometimes are not - in langtools we moved away from using asserts as 
> we realized that in certain cases they were silently failing. I'm ok with 
> whatever standard you feel comfortable with though.


Automated running of tests enable assertions (-ea -esa), perhaps the use is 
more commonly accepted in library code than compiler code. I would favor so in 
this case if necessary (sometimes they are avoided to reduce inline thresholds).


>> 
>> —
>> 
>> This fix prompted me to think more deeply about the implementation of 
>> vectorizedMismatchLarge and its use of vectorizedMismatch. Sorry, I should 
>> have thought about this more throughly earlier on.
>> 
>> We need to refine the approach, not for this patch, but something to follow 
>> up after. I think there are two issues.
>> 
>> 1) The intrinsic method vectorizedMismatch could potentially bomb out at any 
>> point and return the bitwise compliment of the remaining number of elements 
>> to check.
>> 
>> Obviously, there is no point doing that arbitrarily but a stub 
>> implementation for, say, x86 AVX-512 might decide bomb out for  < 64 
>> remaining elements, rather than apply vector operations on smaller vector 
>> sizes or use a scalar loop. It does not today, but I think we should guard 
>> against that potentially happening, otherwise bad things can happen.
> 
> So your worry here is that we'll end up with an infinite loop, right?
> 

Or more likely that an incorrect result is returned since tail elements will be 
skipped over as the offset and size is updated.


> If so, we could check remaining against previous remaining and bomb out too 
> if no further progress seem to be made?
> 

I think it's better to always terminate the loop after the last sub-range is 
checked, rather than unnecessarily calling twice.

I am not confident the vectorizedMismatch intrinsic stub has been tested 
properly on very small lengths, since it's never directly called in such cases, 
so also keeping "remaining > 7" is good too. 

Paul.

>> 
>> I think the loop should exit when the last sub-range has been checked. We 
>> should rely on other tests to ensure the intrinsic method is operating 
>> efficiently.
>> 
>> 
>> 2) This method only works when operating on byte arrays. It will not work 
>> correctly if operating on short or long arrays, since we are not adjusting 
>> the length and offsets accordingly by the scale. It's probably easier to 
>> just rename this as vectorizedMismatchLargeForBytes and drop the 
>> log2ArrayIndexScale argument. Then expand later if need be. I still think 
>> the method rightly belongs in ArraysSupport.
> 
> Yep - probably good idea to restrict on bytes, for now.
> 
> Maurizio
> 
>> 
>> Paul.
>> 
>>> On Jun 17, 2020, at 8:33 AM, Chris Hegarty <chris.hega...@oracle.com> wrote:
>>> 
>>> The MemorySegment::mismatch implementation added vectorized mismatch of 
>>> long sizes. The implementation is trivial, but the starting point for a 
>>> more optimized implementation, if needed. 
>>> ArraysSupport::vectorizedMismatchLarge incorrectly returns the bitwise 
>>> complement of the offset of the first mismatch, where it should return the 
>>> bitwise complement of the number of remaining pairs of elements to be 
>>> checked in the tail of the two arrays. The 
>>> AbstractMemorySegmentImpl::mismatch masked this problem, since it 
>>> seamlessly compared the remaining tail, which is larger than it should be.
>>> 
>>> Webrev:
>>>  https://cr.openjdk.java.net/~chegar/8247696/webrev.00/
>>> 
>>> I updated the exiting BulkOps micro-benchmark to cover mismatch. Here are 
>>> the results, compared to ByteBuffer::mismatch, on my machine:
>>> 
>>> Benchmark                          Mode  Cnt        Score        Error  
>>> Units
>>> BulkOps.mismatch_large_bytebuffer  avgt   30   740186.973 ? 119314.207  
>>> ns/op
>>> BulkOps.mismatch_large_segment     avgt   30   683475.305 ?  76355.043  
>>> ns/op
>>> BulkOps.mismatch_small_bytebuffer  avgt   30        7.367 ?      0.523  
>>> ns/op
>>> BulkOps.mismatch_small_segment     avgt   30        4.140 ?      0.602  
>>> ns/op
>>> 
>>> 
>>> -Chris.

Re: RFR [15] 8247696: Incorrect tail computation for large segments in AbstractMemorySegmentImpl::mismatch

Reply via email to