lriggs commented on code in PR #50141:
URL: https://github.com/apache/arrow/pull/50141#discussion_r3391955764


##########
cpp/src/gandiva/precompiled/decimal_wrapper.cc:
##########
@@ -423,11 +423,23 @@ FORCE_INLINE
 char* castVARCHAR_decimal128_int64(int64_t context, int64_t x_high, uint64_t 
x_low,
                                    int32_t x_precision, int32_t x_scale,
                                    int64_t out_len_param, int32_t* out_length) 
{
+  if (out_len_param < 0) {
+    gdv_fn_context_set_error_msg(context, "Output buffer length can't be 
negative");
+    *out_length = 0;
+    return const_cast<char*>("");
+  }
   int32_t full_dec_str_len;
   char* dec_str =
       gdv_fn_dec_to_string(context, x_high, x_low, x_scale, &full_dec_str_len);
-  int32_t trunc_dec_str_len =
-      out_len_param < full_dec_str_len ? out_len_param : full_dec_str_len;
+  if (dec_str == nullptr) {
+    // Allocation failed upstream; error message is already set. Avoid copying 
from
+    // an invalid buffer with a non-zero length.
+    *out_length = 0;

Review Comment:
   Yes, it will be null. This nullptr is set when the memory allocation for the 
output string fails. The null part is actually handled at another level using 
the validity bits.
   
   null decimal input is handled correctly, and it's actually cleaner now than 
before.
   
   Tracing the slow path (now active because we added kCanReturnErrors) in 
llvm_generator.cc:797-832:
   
   For a null input, is_valid is false, so BuildIfElse takes the else_lambda 
(lines 821-829):
   
   castVARCHAR_decimal128_int64 is not called at all.
   It returns a dummy value: else_value = NullConstant(...) and, because utf8() 
is binary-like, else_value_len = i32_constant(0).
   Separately, the result validity is computed by the validity dex as the AND 
of the input validities (kResultNullIfNull, per expr_decomposer.cc:92. A null 
input → result validity bit = 0 → the output row is marked null, and the dummy 
value/length are masked out.
   
   So the consumer sees a proper null. Two things worth noting:
   
   Correctness was the same before our fix — the fast path also produced a null 
output for null rows (via the validity mask); the only difference was that it 
ran the function on garbage x_high/x_low and threw the result away. Now it's 
genuinely skipped.
   
   No risk from the 0-length dummy — the else branch returns a null pointer 
with length 0, and since the row is invalid in the bitmap, nothing reads it. 
This is unrelated to the arena 0-size behavior we just discussed (no 
arena_malloc call happens on this path at all).
   
   So: null input → function skipped → output null, exactly as expected.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to