rich-t-kid-datadog commented on code in PR #7713:
URL: https://github.com/apache/arrow-rs/pull/7713#discussion_r2161820896
##########
arrow-cast/src/cast/run_array.rs:
##########
@@ -0,0 +1,152 @@
+use crate::cast::*;
+
+pub(crate) fn run_end_encoded_cast<K: RunEndIndexType>(
+ array: &dyn Array,
+ to_type: &DataType,
+ cast_options: &CastOptions,
+) -> Result<ArrayRef, ArrowError> {
+ match array.data_type() {
+ DataType::RunEndEncoded(_run_end_field, _values_field) => {
+ let run_array = array
+ .as_any()
+ .downcast_ref::<RunArray<K>>()
+ .ok_or_else(|| ArrowError::CastError("Expected
RunArray".to_string()))?;
+
+ let values = run_array.values();
+
+ match to_type {
+ // CASE 1: Stay as RunEndEncoded, cast only the values
+ DataType::RunEndEncoded(_target_run_end_field,
target_value_field) => {
Review Comment:
I'm not sure we should support this. The main reason is that the current
Arrow casting API is designed to accept a single target type, and adding
support for RunEndEncoded casting would make this implicit and potentially
confusing.For example, when casting to a RunEndEncoded type today, the call
looks like: `cast(array, DataType::RunEndEncoded(...))` This fits naturally
within Arrow’s existing cast(array, to_type) interface. However, if we were to
support casting between different REE index types (e.g., Int64 to Int16), we
would effectively need to pass two types: the run-end index type and the value
type. That would either:
- Require changes to the API signature, or
- Encode both types in a way that's less readable and harder to validate.
since there are only three valid run-end index types (Int16, Int32, Int64),
which are monotonically increasing in capacity, supporting downcasts (Int64 →
Int16) introduces issues. Due to Rust’s two’s complement truncation during
integer casting with **as**, attempting to cast a large run-end value from
Int64 to Int16 would silently wrap, resulting in a serious logical error on the
client side.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]