alamb opened a new issue #814:
URL: https://github.com/apache/arrow-rs/issues/814


   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   While working on validation for ArrayData 
(https://github.com/apache/arrow-rs/pull/810), I was trying to validate that 
the number of `Buffers` passed to a `UnionArray` was correct.
   
   However, the number of Buffers for a `UnionArray` differs depending on 
whether it has a "sparse" or "dense" layout (1 or 2, respectively), and that 
layout is not encoded in the type `DataType::Union`. This means we can't 
easily validate that the UnionArray has the correct number of buffers for 
its type.
   
   It also means that the UnionArray cannot be used without first checking 
how many buffers it has (that is, which layout it uses).
   
   **Describe the solution you'd like**
   I propose changing
   ```rust
   enum DataType {
   ..
     Union(Vec<Field>),
   ..
   }
   ```
   
   To something like
   
   ```rust
   enum UnionMode {
     Sparse,
     Dense
   }
   
   enum DataType {
   ..
     Union(UnionMode, Vec<Field>),
   ..
   }
   ```
   
   which is both consistent with the C++ implementation and allows 
`UnionArray` to be statically typechecked: 
https://github.com/apache/arrow/blob/661c7d749150905a63dd3b52e0a04dac39030d95/cpp/src/arrow/type.h#L1028
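
To illustrate how the buffer-count check could look once the mode is part of the type, here is a minimal, self-contained sketch. The `DataType` and `UnionMode` below are simplified local stand-ins (the field list is reduced to child names), and `expected_union_buffers` and `validate_buffer_count` are hypothetical helper names, not existing arrow-rs APIs.

```rust
// Simplified stand-ins for the proposed types; NOT the real arrow-rs definitions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UnionMode {
    Sparse,
    Dense,
}

#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int32,
    // Assumption: `Vec<Field>` is reduced to child names to keep this self-contained.
    Union(UnionMode, Vec<String>),
}

/// Number of buffers each union layout requires (1 for sparse, 2 for dense).
pub fn expected_union_buffers(mode: UnionMode) -> usize {
    match mode {
        UnionMode::Sparse => 1, // type ids only
        UnionMode::Dense => 2,  // type ids plus offsets
    }
}

/// Hypothetical validation helper: checks the buffer count against the type alone.
pub fn validate_buffer_count(data_type: &DataType, num_buffers: usize) -> Result<(), String> {
    match data_type {
        DataType::Union(mode, _) => {
            let expected = expected_union_buffers(*mode);
            if num_buffers == expected {
                Ok(())
            } else {
                Err(format!(
                    "{:?} union expects {} buffer(s), got {}",
                    mode, expected, num_buffers
                ))
            }
        }
        // Other types are out of scope for this sketch and are not checked here.
        _ => Ok(()),
    }
}
```

With the mode carried in `DataType::Union`, the expected count is derived from the type itself, so a caller no longer needs to inspect the buffers to learn which layout it is dealing with.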
   
   
   
   **Describe alternatives you've considered**
   Can keep the sparse/dense
   
   



