Thanks Chris. > On Jun 18, 2020, at 2:57 AM, Maurizio Cimadamore > <maurizio.cimadam...@oracle.com> wrote: > > Thanks for looking at this Chris > > On 17/06/2020 21:56, Paul Sandoz wrote: >> Hi Chris, >> >> AbstractMemorySegmentImpl >> — >> >> In vectorizedMismatchLarge: >> >> 163 if (remaining > 7) >> 164 throw new InternalError("remaining greater than 7: " + >> remaining); >> 165 i = length - remaining; >> 166 } >> >> Should this check be an assert? > I suggested that to Chris, since sometimes asserts are enabled when running > tests, sometimes are not - in langtools we moved away from using asserts as > we realized that in certain cases they were silently failing. I'm ok with > whatever standard you feel comfortable with though.
Automated running of tests enable assertions (-ea -esa), perhaps the use is more commonly accepted in library code than compiler code. I would favor so in this case if necessary (sometimes they are avoided to reduce inline thresholds). >> >> — >> >> This fix prompted me to think more deeply about the implementation of >> vectorizedMismatchLarge and its use of vectorizedMismatch. Sorry, I should >> have thought about this more throughly earlier on. >> >> We need to refine the approach, not for this patch, but something to follow >> up after. I think there are two issues. >> >> 1) The intrinsic method vectorizedMismatch could potentially bomb out at any >> point and return the bitwise compliment of the remaining number of elements >> to check. >> >> Obviously, there is no point doing that arbitrarily but a stub >> implementation for, say, x86 AVX-512 might decide bomb out for < 64 >> remaining elements, rather than apply vector operations on smaller vector >> sizes or use a scalar loop. It does not today, but I think we should guard >> against that potentially happening, otherwise bad things can happen. > > So your worry here is that we'll end up with an infinite loop, right? > Or more likely that an incorrect result is returned since tail elements will be skipped over as the offset and size is updated. > If so, we could check remaining against previous remaining and bomb out too > if no further progress seem to be made? > I think it's better to always terminate the loop after the last sub-range is checked, rather than unnecessarily calling twice. I am not confident the vectorizedMismatch intrinsic stub has been tested properly on very small lengths, since it's never directly called in such cases, so also keeping "remaining > 7" is good too. Paul. >> >> I think the loop should exit when the last sub-range has been checked. We >> should rely on other tests to ensure the intrinsic method is operating >> efficiently. >> >> >> 2) This method only works when operating on byte arrays. It will not work >> correctly if operating on short or long arrays, since we are not adjusting >> the length and offsets accordingly by the scale. It's probably easier to >> just rename this as vectorizedMismatchLargeForBytes and drop the >> log2ArrayIndexScale argument. Then expand later if need be. I still think >> the method rightly belongs in ArraysSupport. > > Yep - probably good idea to restrict on bytes, for now. > > Maurizio > >> >> Paul. >> >>> On Jun 17, 2020, at 8:33 AM, Chris Hegarty <chris.hega...@oracle.com> wrote: >>> >>> The MemorySegment::mismatch implementation added vectorized mismatch of >>> long sizes. The implementation is trivial, but the starting point for a >>> more optimized implementation, if needed. >>> ArraysSupport::vectorizedMismatchLarge incorrectly returns the bitwise >>> complement of the offset of the first mismatch, where it should return the >>> bitwise complement of the number of remaining pairs of elements to be >>> checked in the tail of the two arrays. The >>> AbstractMemorySegmentImpl::mismatch masked this problem, since it >>> seamlessly compared the remaining tail, which is larger than it should be. >>> >>> Webrev: >>> https://cr.openjdk.java.net/~chegar/8247696/webrev.00/ >>> >>> I updated the exiting BulkOps micro-benchmark to cover mismatch. Here are >>> the results, compared to ByteBuffer::mismatch, on my machine: >>> >>> Benchmark Mode Cnt Score Error >>> Units >>> BulkOps.mismatch_large_bytebuffer avgt 30 740186.973 ? 119314.207 >>> ns/op >>> BulkOps.mismatch_large_segment avgt 30 683475.305 ? 76355.043 >>> ns/op >>> BulkOps.mismatch_small_bytebuffer avgt 30 7.367 ? 0.523 >>> ns/op >>> BulkOps.mismatch_small_segment avgt 30 4.140 ? 0.602 >>> ns/op >>> >>> >>> -Chris.