github-actions[bot] commented on code in PR #63784:
URL: https://github.com/apache/doris/pull/63784#discussion_r3315370093


##########
be/src/exprs/function/url/find_symbols.h:
##########
@@ -330,7 +330,69 @@ inline const char* find_first_symbols_sse42(const char* const begin, const char*
     return return_mode == ReturnMode::End ? end : nullptr;
 }
 
-/// NOTE No SSE 4.2 implementation for find_last_symbols_or_null. Not worth to do.
+template <bool positive, ReturnMode return_mode>
+inline const char* find_last_symbols_sse2(const char* const begin, const char* const end,
+                                          const char* symbols, size_t num_chars) {
+    if (begin >= end) return return_mode == ReturnMode::End ? end : nullptr;
+
+    const char* pos = end;
+
+#if defined(__SSE2__)
+    const auto needles = mm_is_in_prepare(symbols, num_chars);
+    for (; pos - 16 >= begin; pos -= 16) {
+        __m128i bytes = _mm_loadu_si128(reinterpret_cast<const __m128i*>(pos - 16));

Review Comment:
   The loop condition evaluates `pos - 16` before establishing that at least 16
bytes remain in the range. For any runtime `SearchSymbols` call on a string
shorter than 16 bytes (for example `trim_in('abc', ' ')` through the new ASCII
rtrim path), `pos` equals `end`, so `pos - 16` points before the start of the
array as soon as the condition is evaluated. Forming that pointer is undefined
behavior even though the loop body is skipped. This is distinct from the
scalar-tail issue already raised: the UB occurs in the SIMD loop condition,
before the tail is reached. Please use the same safe form as the SSE4.2 helper,
e.g. `static_cast<size_t>(pos - begin) >= 16`, which checks the remaining
length before `pos - 16` is computed.
   
   ```suggestion
       for (; static_cast<size_t>(pos - begin) >= 16; pos -= 16) {
   ```
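
A minimal, self-contained sketch of the remaining-length form of the loop, for illustration only. `find_last_byte` and `self_check` are hypothetical names, not Doris functions, and a single-byte `_mm_cmpeq_epi8` comparison stands in for `mm_is_in_prepare` and the multi-symbol matching, whose definitions are not shown in this hunk. When `__SSE2__` is not defined, only the scalar loop is compiled.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper (not part of Doris): returns a pointer to the last
// occurrence of `needle` in [begin, end), or nullptr if there is none.
inline const char* find_last_byte(const char* begin, const char* end, char needle) {
    if (begin >= end) return nullptr;

    const char* pos = end;

#if defined(__SSE2__)
    const __m128i pattern = _mm_set1_epi8(needle);
    // Compare the remaining length, not the pointer `pos - 16`:
    // `pos - begin` is always valid here, while `pos - 16` may not be.
    for (; static_cast<size_t>(pos - begin) >= 16; pos -= 16) {
        // Safe: at least 16 bytes lie between `begin` and `pos`.
        const __m128i bytes = _mm_loadu_si128(reinterpret_cast<const __m128i*>(pos - 16));
        const int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(bytes, pattern));
        if (mask != 0) {
            // The highest set bit marks the last matching byte in this block.
            int idx = 15;
            while (((mask >> idx) & 1) == 0) --idx;
            return pos - 16 + idx;
        }
    }
#endif

    // Scalar tail: `pos > begin` is checked before `pos` is decremented,
    // so no pointer before `begin` is ever formed.
    while (pos > begin) {
        --pos;
        if (*pos == needle) return pos;
    }
    return nullptr;
}

// Exercises the short-input case described in the review comment.
inline void self_check() {
    const char s[] = "abc";
    assert(find_last_byte(s, s + 3, ' ') == nullptr);
    assert(find_last_byte(s, s + 3, 'c') == s + 2);
}
```

With a 3-byte input such as `"abc"`, the loop condition evaluates `3 >= 16`, the SIMD loop is skipped, and only the scalar tail runs, without `pos - 16` ever being computed.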



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

