ethan-tyler opened a new issue, #9219:
URL: https://github.com/apache/arrow-rs/issues/9219

   **Describe the bug**
   
   Casting to Dictionary(_, Utf8View) or Dictionary(_, BinaryView) fails:
   
   `Cast error: Unsupported output type for dictionary packing: Utf8View`
   
   `can_cast_types` returns true for these combinations, so the check and the cast disagree and the failure only surfaces at runtime.
   
   **To Reproduce**
   
   ```rust
   // `Array` is imported so that `data_type()` is in scope.
   use arrow_array::{Array, BinaryArray, StringArray};
   use arrow_cast::{can_cast_types, cast};
   use arrow_schema::DataType;
   
   fn main() {
       // Utf8 -> Dictionary(Int32, Utf8View)
       let arr = StringArray::from(vec![Some("a"), Some("b"), Some("a")]);
       let target = DataType::Dictionary(
           Box::new(DataType::Int32),
           Box::new(DataType::Utf8View),
       );
       assert!(can_cast_types(arr.data_type(), &target)); // reported as castable
       assert!(cast(&arr, &target).is_err()); // but the cast returns Err
   
       // Binary -> Dictionary(Int32, BinaryView)
       let bin = BinaryArray::from(vec![Some(b"a".as_slice()), Some(b"b"), Some(b"a")]);
       let target = DataType::Dictionary(
           Box::new(DataType::Int32),
           Box::new(DataType::BinaryView),
       );
       assert!(can_cast_types(bin.data_type(), &target)); // reported as castable
       assert!(cast(&bin, &target).is_err()); // but the cast returns Err
   }
   ```
   
   **Expected behavior**
   
   Either the cast succeeds, or `can_cast_types` returns false. Currently `can_cast_types` returns true and `cast` then returns `Err` at runtime.
   
   **Use case**
   
   DataFusion dictionary-encodes partition columns for memory efficiency, and some pipelines require view types for schema alignment (`schema_force_view_types`). There is currently no way to get a Dictionary(K, Utf8View) array by casting.
   
   This isn't redundant with plain view arrays: a view array stores 16 bytes per row (the u128 view struct), while dictionary keys are typically 1-4 bytes each. For partition columns, which are constant per file, dictionary encoding can cut memory by roughly 4-8x while still satisfying view-typed schema requirements.
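   
   The arithmetic behind this estimate can be sketched roughly as follows, using only the standard library. The sketch assumes 16 bytes per view and one fixed-width key per row, and it ignores validity bitmaps and the out-of-line data buffers used for values longer than 12 bytes. The function names are hypothetical and are not part of arrow-rs.
   
   ```rust
   /// Size in bytes of one view entry (the u128 view struct).
   const VIEW_BYTES: usize = 16;
   
   /// Approximate size of the views buffer of a plain view array:
   /// one 16-byte view per row.
   fn view_array_bytes(rows: usize) -> usize {
       rows * VIEW_BYTES
   }
   
   /// Approximate size of a dictionary array with view-typed values:
   /// one key per row, plus one 16-byte view per distinct value.
   /// `key_bytes` is the key width, e.g. 2 for Int16 or 4 for Int32.
   fn dict_view_array_bytes(rows: usize, key_bytes: usize, distinct: usize) -> usize {
       rows * key_bytes + distinct * VIEW_BYTES
   }
   
   /// How many times smaller the dictionary layout is than plain views,
   /// under the assumptions above.
   fn savings_ratio(rows: usize, key_bytes: usize, distinct: usize) -> f64 {
       view_array_bytes(rows) as f64 / dict_view_array_bytes(rows, key_bytes, distinct) as f64
   }
   ```
   
   Under these assumptions, 1,000,000 rows holding a single distinct value take 16,000,000 bytes as plain views, 4,000,016 bytes with Int32 keys, and 2,000,016 bytes with Int16 keys, which is roughly the 4-8x range mentioned above.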
   
   **Additional context**
   
   Related to #7114, but that issue was about implicit coercion. This one is about explicit casting and a `can_cast_types` consistency bug.

