On Fri, 4 Oct 2024 10:41:46 GMT, Roman Kennke <rken...@openjdk.org> wrote:
>> @rkennke The small loop looks to me that it will run over the end of the >> array. >> Say the haystack_len is 7, the index below would be 0 after the shrq >> instruction, and the movq(XMM_TMP1, Address(haystack, index, >> Address::times_8)) in the loop will read 8 bytes i.e. one byte past the end >> of the array: >> // num_words (zero-based) = (haystack_len - 1) / 8; >> __ movq(index, haystack_len); >> __ subq(index, 1); >> __ shrq(index, LogBytesPerWord); >> >> __ bind(L_loop); >> __ movq(XMM_TMP1, Address(haystack, index, Address::times_8)); >> __ movq(Address(rsp, index, Address::times_8), XMM_TMP1); >> __ subq(index, 1); >> __ jcc(Assembler::positive, L_loop); > > Yes, and that is intentional. > > Say, haystack_len is 7, then the first block computes the adjustment of the > haystack, which is 8 - (7 % 8) = 1. We adjust the haystack pointer one byte > down, so that when we copy (multiple of) 8 bytes, we land on the last byte. > We do copy a few bytes that are preceding the array, which is part of the > object header and guaranteed to be >= 8 bytes. > > Then we compute the number of words to copy, but make it 0-based. That is '0' > is 1 word, '1' is 2 words, etc. It makes the loop nicer. In this example we > get 0, which means we copy one word from the adjusted haystack, which is > correct. > > Then comes the actual loop. > > Afterwards we adjust the haystack pointer so that it points to the first > array element that we just copied onto the stack, ignoring the few garbage > bytes that we also copied. @rkennke Thanks for the explanation. I attach here a fix which is an extension of existing way of copying while taking care of the smaller object header. Also there are two instances of this in the intrinsic so I have factored the new code into a method and call it from both the places. [indexof_fix.patch](https://github.com/user-attachments/files/17285239/indexof_fix.patch) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20677#discussion_r1790967275