alamb opened a new pull request #10063:
URL: https://github.com/apache/arrow/pull/10063
# Purpose
This PR is a draft for comment / review. If people generally like this idea,
I will polish up this PR with doc examples / comments and more tests for real
review.
# Rationale / Use case:
While writing tests (both in IOx and in DataFusion) where I need a single
`RecordBatch`, I often find myself doing something like this (copied directly
from IOx source code):
```rust
let schema = Arc::new(Schema::new(vec![
    ArrowField::new("float_field", ArrowDataType::Float64, true),
    ArrowField::new("time", ArrowDataType::Int64, true),
]));
let float_array: ArrayRef = Arc::new(Float64Array::from(vec![10.1, 20.1, 30.1, 40.1]));
let timestamp_array: ArrayRef = Arc::new(Int64Array::from(vec![1000, 2000, 3000, 4000]));
let batch = RecordBatch::try_new(schema, vec![float_array, timestamp_array])
    .expect("created new record batch");
```
This is annoying because I have to redundantly (and verbosely) encode the
information that `float_field` is a Float64 both in the `Schema` and in the
`Float64Array`.
I would much rather be able to construct `RecordBatch`es in a more Rust-like
style that avoids the redundancy and reduces the amount of typing.
# Proposal:
Add `RecordBatch::append` so the following syntax can be supported:
```rust
let float_array: ArrayRef = Arc::new(Float64Array::from(vec![10.1, 20.1, 30.1, 40.1]));
let timestamp_array: ArrayRef = Arc::new(Int64Array::from(vec![1000, 2000, 3000, 4000]));
let batch = RecordBatch::empty()
    .append("float_field", float_array).unwrap()
    .append("time", timestamp_array).unwrap();
```
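To make the idea concrete, here is a rough, std-only sketch of how such an `append` could derive each field from the array it is given. Everything in it is an assumption for illustration: `DataType`, `Array`, `Field`, and `RecordBatch` are simplified stand-ins rather than the arrow crate's types, the `String` error type and the always-nullable fields are placeholders, and `example_batch` is a hypothetical helper. The actual change in this PR may look different.

```rust
use std::sync::Arc;

// Simplified stand-ins so the sketch compiles on its own.
// These are NOT the arrow crate's definitions.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Float64,
    Int64,
}

/// A column that knows its own data type (stand-in for `ArrayRef`).
#[derive(Debug, Clone, PartialEq)]
pub enum Array {
    Float64(Vec<f64>),
    Int64(Vec<i64>),
}

impl Array {
    pub fn data_type(&self) -> DataType {
        match self {
            Array::Float64(_) => DataType::Float64,
            Array::Int64(_) => DataType::Int64,
        }
    }

    pub fn len(&self) -> usize {
        match self {
            Array::Float64(values) => values.len(),
            Array::Int64(values) => values.len(),
        }
    }
}

#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: DataType,
    pub nullable: bool,
}

#[derive(Debug, Clone, Default)]
pub struct RecordBatch {
    fields: Vec<Field>,
    columns: Vec<Arc<Array>>,
}

impl RecordBatch {
    /// A batch with no columns and no rows.
    pub fn empty() -> Self {
        Self::default()
    }

    /// Adds a column, taking the field's type from the array itself so the
    /// caller never states it twice. Fails if the row counts disagree.
    pub fn append(mut self, name: &str, array: Arc<Array>) -> Result<Self, String> {
        if let Some(first) = self.columns.first() {
            if first.len() != array.len() {
                return Err(format!(
                    "column '{}' has {} rows, expected {}",
                    name,
                    array.len(),
                    first.len()
                ));
            }
        }
        self.fields.push(Field {
            name: name.to_string(),
            data_type: array.data_type(),
            nullable: true, // assumption: appended fields are always nullable
        });
        self.columns.push(array);
        Ok(self)
    }

    pub fn fields(&self) -> &[Field] {
        &self.fields
    }

    pub fn num_rows(&self) -> usize {
        self.columns.first().map_or(0, |column| column.len())
    }
}

/// Hypothetical helper mirroring the proposed call syntax.
pub fn example_batch() -> Result<RecordBatch, String> {
    RecordBatch::empty()
        .append("float_field", Arc::new(Array::Float64(vec![10.1, 20.1, 30.1, 40.1])))?
        .append("time", Arc::new(Array::Int64(vec![1000, 2000, 3000, 4000])))
}
```

Because `append` takes `self` by value and returns a `Result`, calls chain with `?` or `unwrap()` as in the proposal, and a column of mismatched length surfaces as an error at the call that introduced it.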
# Existing APIs
The existing APIs to create a `RecordBatch` from a `Schema` and
`Vec<ArrayRef>` would not be changed, as there are plenty of use cases where the
`Schema` is known up front and should not be checked each time.