tustvold commented on code in PR #2456:
URL: https://github.com/apache/arrow-rs/pull/2456#discussion_r946037323


##########
arrow/src/compute/kernels/cast.rs:
##########
@@ -1254,6 +1258,24 @@ pub fn cast_with_options(
     }
 }
 
+/// Cast to string array to binary array
+fn cast_string_to_binary<OffsetSize>(array: &ArrayRef) -> Result<ArrayRef>
+where
+    OffsetSize: OffsetSizeTrait,
+{
+    let array = array
+        .as_any()
+        .downcast_ref::<GenericStringArray<OffsetSize>>()
+        .unwrap();
+
+    Ok(Arc::new(
+        array
+            .iter()
+            .map(|x| x.map(|data| data.as_bytes()))
+            .collect::<GenericBinaryArray<OffsetSize>>(),

Review Comment:
   Since this cast doesn't change the size of the offsets, it would be significantly 
faster to reuse the existing offsets and values buffers instead of iterating and rebuilding the array.
   
   Something like
   
   ```
   assert_eq!(array.data_type(), &DataType::Utf8);

   // Safety: Utf8 and Binary share the same offsets + values layout
   let data = unsafe {
       array.data().clone().into_builder().data_type(DataType::Binary).build_unchecked()
   };
   ```
   The same applies to LargeUtf8 and LargeBinary.
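
To illustrate why reusing the buffers is cheaper, here is a minimal std-only sketch. It does not use the arrow-rs types: `VarLenData`, `Kind`, `cast_by_copy` and `cast_by_reuse` are hypothetical stand-ins for the offsets + values layout, and null handling is left out. The point is only that the reuse path clones two reference-counted handles and changes a type tag, while the iterator path touches every byte.

```rust
use std::sync::Arc;

/// Hypothetical type tag standing in for DataType::Utf8 / DataType::Binary.
#[derive(Clone, Debug, PartialEq)]
enum Kind {
    Utf8,
    Binary,
}

/// Simplified model of a variable-length array: shared offsets and values buffers.
#[derive(Clone)]
struct VarLenData {
    kind: Kind,
    offsets: Arc<[i32]>,
    values: Arc<[u8]>,
}

impl VarLenData {
    /// Build a Utf8-tagged array from string slices.
    fn from_strs(items: &[&str]) -> Self {
        let mut offsets = vec![0i32];
        let mut values = Vec::new();
        for s in items {
            values.extend_from_slice(s.as_bytes());
            offsets.push(values.len() as i32);
        }
        Self { kind: Kind::Utf8, offsets: offsets.into(), values: values.into() }
    }

    /// Number of elements (one fewer than the number of offsets).
    fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Bytes of element `i`, sliced out of the shared values buffer.
    fn value(&self, i: usize) -> &[u8] {
        let (start, end) = (self.offsets[i] as usize, self.offsets[i + 1] as usize);
        &self.values[start..end]
    }
}

/// Slow path: visit every element and rebuild both buffers (like iter().collect()).
fn cast_by_copy(src: &VarLenData) -> VarLenData {
    let mut offsets = vec![0i32];
    let mut values = Vec::new();
    for i in 0..src.len() {
        values.extend_from_slice(src.value(i));
        offsets.push(values.len() as i32);
    }
    VarLenData { kind: Kind::Binary, offsets: offsets.into(), values: values.into() }
}

/// Fast path: keep the same buffers (Arc clones) and change only the type tag.
fn cast_by_reuse(src: &VarLenData) -> VarLenData {
    assert_eq!(src.kind, Kind::Utf8);
    VarLenData { kind: Kind::Binary, ..src.clone() }
}
```

In this model `cast_by_reuse` does constant work regardless of the array length, which is the same reason swapping the data type on the existing `ArrayData` should beat collecting through an iterator.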


