github-actions[bot] commented on code in PR #63784:
URL: https://github.com/apache/doris/pull/63784#discussion_r3315370093
##########
be/src/exprs/function/url/find_symbols.h:
##########
@@ -330,7 +330,69 @@ inline const char* find_first_symbols_sse42(const char* const begin, const char*
return return_mode == ReturnMode::End ? end : nullptr;
}
-/// NOTE No SSE 4.2 implementation for find_last_symbols_or_null. Not worth to do.
+template <bool positive, ReturnMode return_mode>
+inline const char* find_last_symbols_sse2(const char* const begin, const char* const end,
+ const char* symbols, size_t num_chars) {
+ if (begin >= end) return return_mode == ReturnMode::End ? end : nullptr;
+
+ const char* pos = end;
+
+#if defined(__SSE2__)
+ const auto needles = mm_is_in_prepare(symbols, num_chars);
+ for (; pos - 16 >= begin; pos -= 16) {
+ __m128i bytes = _mm_loadu_si128(reinterpret_cast<const __m128i*>(pos - 16));
Review Comment:
This loop forms `pos - 16` before proving that there are 16 bytes left in
the range. For any runtime `SearchSymbols` call on a string shorter than 16
bytes (for example `trim_in('abc', ' ')` through the new ASCII rtrim path),
`pos` is `end` and `pos - 16` is outside the array before the comparison is
evaluated, which is undefined behavior even if the loop body is skipped. This
is distinct from the scalar-tail issue already raised: the UB happens in the
SIMD loop condition before reaching the tail. Please use the same safe form as
the SSE4.2 helper, e.g. `static_cast<size_t>(pos - begin) >= 16`, before
computing `pos - 16`.
```suggestion
for (; static_cast<size_t>(pos - begin) >= 16; pos -= 16) {
```
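
For illustration only, here is a minimal portable sketch of the length-first loop shape recommended above. It is not the Doris implementation: `find_last_byte_blocked` and `kBlock` are hypothetical names, it searches for a single byte instead of a symbol set, and a plain inner loop stands in for the SSE2 compare so the example builds on any target.

```cpp
#include <cstddef>

// Hypothetical helper (not part of find_symbols.h): returns a pointer to the
// last byte in [begin, end) equal to `needle`, or nullptr if there is none.
inline const char* find_last_byte_blocked(const char* begin, const char* end, char needle) {
    if (begin >= end) return nullptr;

    constexpr std::size_t kBlock = 16; // stands in for the SSE register width
    const char* pos = end;

    // Check the remaining length first. `pos - kBlock` is only formed once
    // at least kBlock bytes are known to lie in [begin, pos).
    for (; static_cast<std::size_t>(pos - begin) >= kBlock; pos -= kBlock) {
        const char* block = pos - kBlock;
        // Scan the block from its last byte to its first.
        for (std::size_t i = kBlock; i-- > 0;) {
            if (block[i] == needle) return block + i;
        }
    }

    // Scalar tail: fewer than kBlock bytes remain in [begin, pos).
    while (pos != begin) {
        --pos;
        if (*pos == needle) return pos;
    }
    return nullptr;
}
```

With a 3-byte input such as `"abc"`, the block loop's condition is false on the first check, so no pointer before `begin` is ever computed and the scalar tail handles the whole range.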
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]