crepererum commented on PR #5545:
URL:
https://github.com/apache/arrow-datafusion/pull/5545#issuecomment-1465873451
# Breaking Change
## Before
```rust
let file_scan_config = FileScanConfig {
    table_partition_cols: vec![
        (
            "group".to_owned(),
            DataType::Utf8,
        ),
        ...
    ],
    ...
};
let partitioned_file = PartitionedFile {
    partition_values: vec![
        ScalarValue::Utf8(Some("foo".to_owned())),
        ...
    ],
    ...
};
```
## After (exact)
If you want an exact conversion:
```rust
let file_scan_config = FileScanConfig {
    table_partition_cols: vec![
        (
            "group".to_owned(),
            DataType::Dictionary(
                Box::new(DataType::UInt16),
                Box::new(DataType::Utf8),
            ),
        ),
        ...
    ],
    ...
};
let partitioned_file = PartitionedFile {
    partition_values: vec![
        ScalarValue::Dictionary(
            Box::new(DataType::UInt16),
            Box::new(ScalarValue::Utf8(Some("foo".to_owned()))),
        ),
        ...
    ],
    ...
};
```
## After (alternative)
You may just decide that you don't want to dictionary-encode at all:
```rust
let file_scan_config = FileScanConfig {
    table_partition_cols: vec![
        (
            "group".to_owned(),
            DataType::Utf8,
        ),
        ...
    ],
    ...
};
let partitioned_file = PartitionedFile {
    partition_values: vec![
        ScalarValue::Utf8(Some("foo".to_owned())),
        ...
    ],
    ...
};
```
or that you want a different dictionary key type:
```rust
let file_scan_config = FileScanConfig {
    table_partition_cols: vec![
        (
            "group".to_owned(),
            DataType::Dictionary(
                Box::new(DataType::Int8),
                Box::new(DataType::Utf8),
            ),
        ),
        ...
    ],
    ...
};
let partitioned_file = PartitionedFile {
    partition_values: vec![
        ScalarValue::Dictionary(
            Box::new(DataType::Int8),
            Box::new(ScalarValue::Utf8(Some("foo".to_owned()))),
        ),
        ...
    ],
    ...
};
```
Note that in all cases, the types in `FileScanConfig::table_partition_cols`
and `PartitionedFile::partition_values` MUST be in sync. Strictly speaking, that
hasn't changed: before this PR, both were converted to UInt16 dictionary
types.
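
To illustrate the in-sync requirement, here is a minimal, self-contained sketch of such a check. It does not use the real DataFusion or arrow types: the `DataType` and `ScalarValue` enums below are simplified stand-ins that model only the variants shown in this comment, and `first_mismatch` is a hypothetical helper, not part of any DataFusion API.

```rust
// Simplified stand-ins for arrow's `DataType` and DataFusion's `ScalarValue`.
// Assumption: only the variants used in this comment are modeled.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int8,
    UInt16,
    Utf8,
    Dictionary(Box<DataType>, Box<DataType>),
}

#[derive(Debug, Clone, PartialEq)]
enum ScalarValue {
    Utf8(Option<String>),
    Dictionary(Box<DataType>, Box<ScalarValue>),
}

impl ScalarValue {
    /// Data type described by this stand-in value.
    fn data_type(&self) -> DataType {
        match self {
            ScalarValue::Utf8(_) => DataType::Utf8,
            ScalarValue::Dictionary(key, value) => {
                DataType::Dictionary(key.clone(), Box::new(value.data_type()))
            }
        }
    }
}

/// Hypothetical helper: index of the first partition value whose type differs
/// from the declared partition column type, or `None` if everything lines up.
/// If the two slices differ in length, the first unpaired index is returned.
fn first_mismatch(cols: &[(String, DataType)], values: &[ScalarValue]) -> Option<usize> {
    if cols.len() != values.len() {
        return Some(cols.len().min(values.len()));
    }
    cols.iter()
        .zip(values)
        .position(|((_, ty), v)| *ty != v.data_type())
}
```

With a column declared as `Dictionary(UInt16, Utf8)`, a plain `ScalarValue::Utf8` partition value would be reported as a mismatch, as would a dictionary value using a different key type such as `Int8`.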