lriggs opened a new pull request, #50141:
URL: https://github.com/apache/arrow/pull/50141
# [C++][Gandiva] Fix castVARCHAR(decimal128) native memory corruption /
SIGSEGV on allocation failure
### Rationale for this change
The Gandiva `castVARCHAR_decimal128_int64` path could corrupt native memory
and
crash the process (SIGSEGV) when the output-string arena allocation failed
(e.g. `CAST(decimal AS VARCHAR)` under memory pressure). Three independent
defects combined to cause this:
1. The `castVARCHAR` decimal128 registry entry was missing
`NativeFunction::kCanReturnErrors`, so generated code skipped the error
check
and ignored any error the function reported.
2. `gdv_fn_dec_to_string` set the output length to a positive value *before*
checking whether the allocation succeeded, then returned `nullptr` —
leaving
the caller to copy from an invalid buffer with a positive length.
3. `castVARCHAR_decimal128_int64` did not validate a negative requested
output
length and did not handle an upstream allocation failure.
### What changes are included in this PR?
- **`function_registry_string.cc`**: Add `NativeFunction::kCanReturnErrors`
to the
`castVARCHAR` `decimal128` entry so the generated code checks for and
propagates errors instead of assuming the function never fails.
- **`gdv_function_stubs.cc`** (`gdv_fn_dec_to_string`): Only write the output
length *after* a successful allocation. On allocation failure, set
`*dec_str_len = 0` and return an empty string so callers never copy from an
invalid buffer using a stale, positive length.
- **`precompiled/decimal_wrapper.cc`** (`castVARCHAR_decimal128_int64`):
- Reject a negative output length with a graceful error
(`"Output buffer length can't be negative"`) instead of using it as a
copy
size.
- Bail out safely (zero length, empty string) if the upstream
`gdv_fn_dec_to_string` call failed, since the error has already been set.
- **`tests/decimal_test.cc`**: Add `TestCastVarCharDecimalNegativeLength`, a
regression test that casts a decimal to varchar with a negative output
length
and asserts the query fails gracefully with the expected error message
rather
than crashing. This also exercises the `kCanReturnErrors` flag — without
it the
error would not propagate and the test would fail.
### Behavior change
Queries such as `CAST(decimal AS VARCHAR)` that previously crashed the
process
(SIGSEGV) under memory pressure now fail gracefully with an error message
about
the allocation failure / invalid length, and the rest of the system is
unaffected.
### Are these changes tested?
Yes, unit tests.
### Are there any user-facing changes?
No.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]