kosiew opened a new issue, #8403: URL: https://github.com/apache/arrow-rs/issues/8403
**Describe the bug**

Casting from `BinaryView` to `Utf8View` fails when encountering invalid UTF-8, even with `CastOptions.safe = true`. This behavior is inconsistent with other binary types in Arrow, which replace invalid UTF-8 sequences with `null` when `safe=true`.

**To Reproduce**

```rust
#[test]
fn test_arrow_cast_binaryview_to_utf8view_fails_with_invalid_utf8() {
    use arrow::compute::kernels::cast::{cast_with_options, CastOptions};
    use arrow_array::{ArrayRef, BinaryViewArray};
    use arrow_schema::DataType;
    use std::sync::Arc;

    let binary_data = vec![
        Some("valid".as_bytes()),
        Some(&[0xf0, 0x28, 0x8c, 0x28]), // invalid UTF-8 sequence
        Some("also_valid".as_bytes()),
    ];

    let binary_view_array: ArrayRef = Arc::new(BinaryViewArray::from(binary_data));

    // Try casting with safe=false (should fail).
    // `CastOptions::default()` sets `safe: true`, so request `safe: false` explicitly.
    let cast_options = CastOptions {
        safe: false,
        ..Default::default()
    };
    let result = cast_with_options(&binary_view_array, &DataType::Utf8View, &cast_options);
    assert!(
        result.is_err(),
        "Expected BinaryView->Utf8View cast to fail with safe=false"
    );
    assert!(
        result
            .unwrap_err()
            .to_string()
            .contains("Encountered non-UTF-8 data"),
        "Error should mention non-UTF-8 data"
    );

    // Try casting with safe=true (should still fail, but this is unexpected)
    let safe_cast_options = CastOptions {
        safe: true,
        ..Default::default()
    };
    let safe_result = cast_with_options(
        &binary_view_array,
        &DataType::Utf8View,
        &safe_cast_options,
    );
    assert!(
        safe_result.is_err(),
        "BinaryView->Utf8View cast fails even with safe=true (unlike other binary types)"
    );
    assert!(
        safe_result
            .unwrap_err()
            .to_string()
            .contains("Encountered non-UTF-8 data"),
        "Safe cast error should also mention non-UTF-8 data"
    );
}
```

**Expected behavior**

When using `CastOptions.safe = true`, invalid UTF-8 in a `BinaryView` array should result in `null` values in the resulting `Utf8View` array, not a hard failure, similar to how other binary array types behave.
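To make the expected semantics concrete, here is a minimal std-only sketch of the behavior being requested. It is not arrow-rs code: `cast_bytes_to_utf8` is a hypothetical helper, and `Vec<Option<String>>` is only a stand-in for a nullable string array. The error text mirrors the message checked in the reproduction above.

```rust
use std::str;

/// Hypothetical model of a binary-to-string cast (not the arrow-rs implementation).
/// - `safe = true`:  a value that is not valid UTF-8 becomes `None` (null).
/// - `safe = false`: the first invalid value aborts the whole cast with an error.
pub fn cast_bytes_to_utf8(
    values: &[Option<&[u8]>],
    safe: bool,
) -> Result<Vec<Option<String>>, String> {
    values
        .iter()
        .map(|value| match value {
            // Nulls in the input stay null in the output.
            None => Ok(None),
            Some(bytes) => match str::from_utf8(bytes) {
                Ok(s) => Ok(Some(s.to_owned())),
                // Safe mode: degrade the invalid entry to null.
                Err(_) if safe => Ok(None),
                // Unsafe mode: surface the failure to the caller.
                Err(e) => Err(format!("Encountered non-UTF-8 data: {e}")),
            },
        })
        .collect()
}
```

Under this model, the three-element input from the reproduction would yield `[Some("valid"), None, Some("also_valid")]` with `safe = true` and an error with `safe = false`.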
**Additional context**

This behavior appears inconsistent and surprising. In other binary array types, setting `safe=true` allows for graceful degradation (returning `null`s for invalid entries). However, `BinaryView` does not follow this pattern and fails even when safe casting is requested.

Let me know if this behavior is intentional, or if a fix would be welcomed!