alamb opened a new pull request #10063:
URL: https://github.com/apache/arrow/pull/10063


   # Purpose
   
   This PR is a draft for comment / review. If people generally like this idea, 
I will polish up this PR with doc examples / comments and more tests for real 
review.
   
   # Rationale / Use case:
   
   While writing tests (both in IOx and in DataFusion) where I need a single 
`RecordBatch`, I often find myself doing something like this (copied directly 
from IOx source code):
   
   ```rust
   let schema = Arc::new(Schema::new(vec![
       ArrowField::new("float_field", ArrowDataType::Float64, true),
       ArrowField::new("time", ArrowDataType::Int64, true),
   ]));
   
   let float_array: ArrayRef = Arc::new(Float64Array::from(vec![10.1, 20.1, 30.1, 40.1]));
   let timestamp_array: ArrayRef = Arc::new(Int64Array::from(vec![1000, 2000, 3000, 4000]));
   
   let batch = RecordBatch::try_new(schema, vec![float_array, timestamp_array])
       .expect("created new record batch");
   ```
   
   This is annoying because I have to redundantly (and verbosely) encode the 
information that `float_field` is a Float64 in both the `Schema` and the 
`Float64Array`.
   
   I would much rather be able to construct `RecordBatch`es in a more 
Rust-like style that avoids the redundancy and reduces the amount of typing.
   
   
   # Proposal:
   
   Add `RecordBatch::append` so the following syntax can be supported:
   
   
   ```rust
   let float_array: ArrayRef = Arc::new(Float64Array::from(vec![10.1, 20.1, 30.1, 40.1]));
   let timestamp_array: ArrayRef = Arc::new(Int64Array::from(vec![1000, 2000, 3000, 4000]));
   
   let batch = RecordBatch::empty()
     .append("float_field", float_array).unwrap()
     .append("time", timestamp_array).unwrap();
   
   ```
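
To make the idea concrete, here is a minimal, self-contained sketch of how such an `append` could derive the schema from the column itself. It deliberately does not use the arrow crate: `DataType`, `Array`, `Field`, and `Batch` below are simplified, hypothetical stand-ins, and the choices to mark every field nullable and to report errors as `String` are assumptions for illustration, not what this PR implements.

```rust
use std::sync::Arc;

// Hypothetical stand-ins for Arrow's types, reduced to what the sketch needs.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Float64,
    Int64,
}

#[derive(Debug, PartialEq)]
enum Array {
    Float64(Vec<f64>),
    Int64(Vec<i64>),
}

impl Array {
    // The column already knows its own type, so the caller never repeats it.
    fn data_type(&self) -> DataType {
        match self {
            Array::Float64(_) => DataType::Float64,
            Array::Int64(_) => DataType::Int64,
        }
    }

    fn len(&self) -> usize {
        match self {
            Array::Float64(v) => v.len(),
            Array::Int64(v) => v.len(),
        }
    }
}

type ArrayRef = Arc<Array>;

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

#[derive(Debug)]
struct Batch {
    fields: Vec<Field>,
    columns: Vec<ArrayRef>,
}

impl Batch {
    fn empty() -> Self {
        Batch { fields: Vec::new(), columns: Vec::new() }
    }

    // Consumes the batch and returns a new one with the extra column.
    // The field's type is read from the column; only the name is supplied.
    fn append(mut self, name: &str, column: ArrayRef) -> Result<Self, String> {
        // Every column must have the same number of rows as the first one.
        if let Some(first) = self.columns.first() {
            if first.len() != column.len() {
                return Err(format!(
                    "column '{}' has {} rows, expected {}",
                    name,
                    column.len(),
                    first.len()
                ));
            }
        }
        self.fields.push(Field {
            name: name.to_string(),
            data_type: column.data_type(),
            nullable: true, // assumption: matches the nullable fields in the example above
        });
        self.columns.push(column);
        Ok(self)
    }
}

// Builds the same two-column batch as the proposal, without spelling out a schema.
fn example_batch() -> Result<Batch, String> {
    let float_array: ArrayRef = Arc::new(Array::Float64(vec![10.1, 20.1, 30.1, 40.1]));
    let timestamp_array: ArrayRef = Arc::new(Array::Int64(vec![1000, 2000, 3000, 4000]));
    Batch::empty()
        .append("float_field", float_array)?
        .append("time", timestamp_array)
}
```

Because `append` takes `self` by value and returns `Result<Self, _>`, calls chain naturally with `?` or `unwrap()`, and a column whose length disagrees with the existing ones is rejected at the point it is added.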
   
   # Existing APIs
   The existing APIs to create a `RecordBatch` from a `Schema` and 
`Vec<ArrayRef>` would not be changed as there are plenty of use cases where the 
Schema is known up front and should not be checked each time.
   



