alamb opened a new issue, #3152:
URL: https://github.com/apache/arrow-datafusion/issues/3152

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   It is a common code pattern in DataFusion to downcast an `ArrayRef` to a 
concrete arrow array type.
   
   There are actually at least *three* patterns to do so in the datafusion 
codebase https://github.com/apache/arrow-datafusion/search?l=Rust&q=downcast
   
   I think this leads to 
   1. Poor user experience (as a DataFusion bug might panic and quit the 
process, which is impolite)
   2.  less readable code
   2. less efficient feature development and review (as people have to figure 
out which of the three patterns to use and they can't just look at what 
convention is used in the rest of the codebase
   
   ### Pattern 1: `downcast_ref()`
   ```rust
   let array = array.as_any().downcast_ref::<Date32Array>().unwrap();
   ```
   `panic`s on error
   
   ### Pattern 2: use `as_XXX_array` [functions from 
arrow](https://docs.rs/arrow/20.0.0/arrow/array/index.html#functions)
   
   ```rust
   let array = as_boolean_array(&array);
   ```
   `panic`s on error
   
   ### Pattern 3: check type and return an error:
   ```rust
   let array = array.as_any().downcast_ref::<Date32Array>().ok_or(
       DataFusionError::Execution("Cast Date32Array error".to_string()),
   )?;
   ```
   
   Returns an error to the user (a better behavior I think)
   
   **Describe the solution you'd like**
   I would like to use one consistent pattern throughout the codebase.  
Specifically I propose functions that do the error checking / error message 
generation:
   
   ```rust
   /// Return `array` cast to `Int32Array` otherwise an error
   pub fn as_int32_array(array: &dyn Array) -> Result<&Int32Array, 
DataFusionError> {
     array.as_any().downcast_ref::<Int32Array>().ok_or(
       DataFusionError::Execution(format!("Expected a Int32Array, got: ", 
array.data_type()),
     )
   }
   
   /// And similar functions for the remaining array types
   ```
   
   And then change all dynamic casts in the codebase:
   1. If they are fallable, to use the new functions that make a nice error 
message
   2. If they are infallible / somewhere were passing out the result is hard, 
use arrow functions
   
   **Describe alternatives you've considered**
   Leave the status quo
   
   **Additional context**
   Inspired by @avantgardnerio 's suggestion  
https://github.com/apache/arrow-datafusion/pull/3110/files#r945847871 (but this 
is not the first time it has come up)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to