emcake opened a new issue, #4397:
URL: https://github.com/apache/arrow-rs/issues/4397
**Describe the bug**
Calling `filter_record_batch` on a `RecordBatch` that contains a timestamp
column with a timezone drops the timezone from the filtered batch's schema.
**To Reproduce**
A test that replicates this:
```rust
#[test]
fn filter_record_batch_maintains_timezones() -> Result<(), arrow::error::ArrowError> {
    let fields = vec![arrow::datatypes::Field::new(
        "timestamp",
        arrow::datatypes::DataType::Timestamp(
            arrow::datatypes::TimeUnit::Nanosecond,
            Some("UTC".to_owned().into()),
        ),
        false,
    )];
    let field_builders: Vec<Box<dyn arrow::array::ArrayBuilder>> =
        vec![Box::new(arrow::array::TimestampNanosecondBuilder::new())];
    let mut sa = arrow::array::StructBuilder::new(fields, field_builders);
    for i in 0..100 {
        sa.field_builder::<arrow::array::TimestampNanosecondBuilder>(0)
            .unwrap()
            .append_value(i);
        sa.append(true);
    }
    let struct_array = sa.finish();
    let rec: arrow::record_batch::RecordBatch = (&struct_array).into();
    let schema = rec.schema();
    let dt = schema.field(0);
    assert_eq!(
        &arrow::datatypes::DataType::Timestamp(
            arrow::datatypes::TimeUnit::Nanosecond,
            Some("UTC".to_owned().into())
        ),
        dt.data_type()
    );
    let filter: arrow::array::BooleanArray = vec![true; 100].into();
    let filtered = arrow::compute::filter_record_batch(&rec, &filter)?;
    let filtered_schema = filtered.schema();
    let filtered_dt = filtered_schema.field(0);
    assert_eq!(
        &arrow::datatypes::DataType::Timestamp(
            arrow::datatypes::TimeUnit::Nanosecond,
            Some("UTC".to_owned().into())
        ),
        filtered_dt.data_type()
    );
    Ok(())
}
```
**Expected behavior**
The test should pass: the timestamp field in the filtered batch's schema should keep its `UTC` timezone.
**Additional context**
Tested on arrow `40` and `41`.