jayzhan211 commented on code in PR #10221:
URL: https://github.com/apache/datafusion/pull/10221#discussion_r1580522622
##########
datafusion/expr/src/type_coercion/binary.rs:
##########
@@ -638,22 +638,26 @@ fn dictionary_coercion(
preserve_dictionaries: bool,
) -> Option<DataType> {
use arrow::datatypes::DataType::*;
+ let maybe_preserve_dictionaries =
+ |index_type: &Box<DataType>, value_type: DataType| -> DataType {
+ if preserve_dictionaries {
+ Dictionary(index_type.clone(), Box::new(value_type))
+ } else {
+ value_type
+ }
+ };
match (lhs_type, rhs_type) {
(
Dictionary(_lhs_index_type, lhs_value_type),
Dictionary(_rhs_index_type, rhs_value_type),
) => comparison_coercion(lhs_value_type, rhs_value_type),
Review Comment:
I think we should avoid `casting from value to dict for column`
Given the example, if we preserve dict, we still ends up casting column
(utf8) to Dict (i32,utf8), but in this case, we can cast the const from i64 to
utf8 and it is enough.
```
statement ok
create table test as values
('row1', arrow_cast(1, 'Utf8')),
('row2', arrow_cast(2, 'Utf8')),
('row3', arrow_cast(3, 'Utf8'))
;
# query using an string '1' which must be coerced into a dictionary string
query TT
SELECT * from test where column2 = arrow_cast(2, 'Dictionary(Int32,
Int64)');
----
row2 2
query TT
explain SELECT * from test where column2 = arrow_cast(2, 'Dictionary(Int32,
Int64)');
----
logical_plan
01)Filter: CAST(test.column2 AS Dictionary(Int32, Utf8)) = Dictionary(Int32,
Utf8("2"))
02)--TableScan: test projection=[column1, column2]
physical_plan
01)CoalesceBatchesExec: target_batch_size=8192
02)--FilterExec: CAST(column2@1 AS Dictionary(Int32, Utf8)) = 2
03)----MemoryExec: partitions=1, partition_sizes=[1]
statement ok
drop table test;
```
I think we should not preserve dict, but need specialization on comparing
dict vs non-dict case.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]