On Fri, 4 Oct 2024 10:41:46 GMT, Roman Kennke <rken...@openjdk.org> wrote:

>> @rkennke The small loop looks to me that it will run over the end of the 
>> array.
>> Say the haystack_len is 7, the index below would be 0 after the shrq 
>> instruction, and the  movq(XMM_TMP1, Address(haystack, index, 
>> Address::times_8)) in the loop will read 8 bytes i.e. one byte past the end 
>> of the array:
>>        // num_words (zero-based) = (haystack_len - 1) / 8;
>>       __ movq(index, haystack_len);
>>       __ subq(index, 1);
>>       __ shrq(index, LogBytesPerWord);
>> 
>>       __ bind(L_loop);
>>       __ movq(XMM_TMP1, Address(haystack, index, Address::times_8));
>>       __ movq(Address(rsp, index, Address::times_8), XMM_TMP1);
>>       __ subq(index, 1);
>>       __ jcc(Assembler::positive, L_loop);
>
> Yes, and that is intentional.
> 
> Say, haystack_len is 7, then the first block computes the adjustment of the 
> haystack, which is 8 - (7 % 8) = 1. We adjust the haystack pointer one byte 
> down, so that when we copy (multiple of) 8 bytes, we land on the last byte. 
> We do copy a few bytes that are preceding the array, which is part of the 
> object header and guaranteed to be >= 8 bytes.
> 
> Then we compute the number of words to copy, but make it 0-based. That is '0' 
> is 1 word, '1' is 2 words, etc. It makes the loop nicer. In this example we 
> get 0, which means we copy one word from the adjusted haystack, which is 
> correct.
> 
> Then comes the actual loop.
> 
> Afterwards we adjust the haystack pointer so that it points to the first 
> array element that we just copied onto the stack, ignoring the few garbage 
> bytes that we also copied.

@rkennke Thanks for the explanation. I attach here a fix which is an extension 
of existing way of copying while taking care of the smaller object header. Also 
there are two instances of this in the intrinsic so I have factored the new 
code into a method and call it from both the places. 
[indexof_fix.patch](https://github.com/user-attachments/files/17285239/indexof_fix.patch)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20677#discussion_r1790967275

Reply via email to