liukun4515 commented on issue #3031:
URL:
https://github.com/apache/arrow-datafusion/issues/3031#issuecomment-1217923520
> Here is a self contained reproducer for anyone following along:
>
> ```sql
> ❯ create table foo as select column1 as d from (values (1), (2));
> +---+
> | d |
> +---+
> | 1 |
> | 2 |
> +---+
> 2 rows in set. Query took 0.005 seconds.
> ❯ create table bar as select cast (d as decimal) as d from foo;
> +--------------+
> | d            |
> +--------------+
> | 1.0000000000 |
> | 2.0000000000 |
> +--------------+
> 2 rows in set. Query took 0.005 seconds.
> ❯ explain select * from bar where d = 1.4;
> +---------------+-----------------------------------------------------------------------------------+
> | plan_type     | plan                                                                              |
> +---------------+-----------------------------------------------------------------------------------+
> | logical_plan  | Projection: #bar.d                                                                |
> |               |   Filter: #bar.d = Float64(1.4)                                                   |
> |               |     TableScan: bar projection=[d]                                                 |
> | physical_plan | ProjectionExec: expr=[d@0 as d]                                                   |
> |               |   CoalesceBatchesExec: target_batch_size=4096                                     |
> |               |     FilterExec: CAST(d@0 AS Decimal128(38, 15)) = CAST(1.4 AS Decimal128(38, 15)) |
> |               |       RepartitionExec: partitioning=RoundRobinBatch(16)                           |
> |               |         MemoryExec: partitions=1, partition_sizes=[1]                             |
> |               |                                                                                   |
> +---------------+-----------------------------------------------------------------------------------+
> 2 rows in set. Query took 0.005 seconds.
> ```
>
> The FilterExec line above should not have the CAST operations in it
The `CAST`s are added when the physical expr/physical plan is created.
They follow a general rule for type coercion in binary comparisons.
For example:
INT32 < INT64 -> INT64
DECIMAL(10,2) < DOUBLE -> another decimal data type.
I think this all lives in the `comparison_binary_numeric_coercion` function.
This is just the general rule, and it works in all cases.
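To make the general rule concrete, here is a minimal sketch in Rust. It is not the code of `comparison_binary_numeric_coercion`: the `DataType` enum and the `comparison_type` function are hypothetical, and the decimal/double branch (precision 38, scale widened to at least 15) is an assumption chosen only so that a scale-10 decimal compared with a double yields the `Decimal128(38, 15)` seen in the plan above.

```rust
// Toy type model; hypothetical, not DataFusion's or Arrow's DataType.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
    Float64,
    Decimal { precision: u8, scale: u8 },
}

// Pick the common type both sides of a comparison are cast to.
// Returns None when this sketch has no rule for the pair.
fn comparison_type(lhs: DataType, rhs: DataType) -> Option<DataType> {
    use DataType::*;
    match (lhs, rhs) {
        // Same type on both sides: no cast needed.
        (a, b) if a == b => Some(a),
        // INT32 < INT64 -> INT64
        (Int32, Int64) | (Int64, Int32) => Some(Int64),
        // DECIMAL < DOUBLE -> another decimal type.
        // Assumption: precision 38 and a scale of at least 15.
        (Decimal { scale, .. }, Float64) | (Float64, Decimal { scale, .. }) => {
            Some(Decimal { precision: 38, scale: scale.max(15) })
        }
        _ => None,
    }
}
```

Because the rule only looks at the two types, it casts both sides even when one side is a literal, which is how the two `CAST`s end up in the `FilterExec` line.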
But in many use cases the filter just compares a column with a literal,
as in this issue. A new optimizer rule should handle this case, as
Spark does.
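A rough sketch of what such a rule could do, in Rust. This is not the draft PR: the `Expr` enum, `rescale`, and `unwrap_cast_in_eq` are hypothetical names, the model only covers decimal scales, and it only handles `CAST(column) = literal`. The idea is to convert the literal to the column's type once, at plan time, so the column no longer needs a cast for every row.

```rust
// Toy expression model; hypothetical, covers only decimal values.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    // A decimal column with the given scale.
    Column { name: String, scale: u32 },
    // A decimal literal stored as an unscaled integer plus a scale.
    Literal { unscaled: i128, scale: u32 },
    // A cast of the inner expression to a decimal with the given scale.
    Cast { expr: Box<Expr>, scale: u32 },
    Eq(Box<Expr>, Box<Expr>),
}

// Re-express an unscaled decimal value at another scale.
// Returns None on overflow or if digits would be lost.
fn rescale(unscaled: i128, from: u32, to: u32) -> Option<i128> {
    if to >= from {
        unscaled.checked_mul(10i128.checked_pow(to - from)?)
    } else {
        let factor = 10i128.checked_pow(from - to)?;
        if unscaled % factor == 0 {
            Some(unscaled / factor)
        } else {
            None
        }
    }
}

// Rewrite `CAST(col) = literal` to `col = literal'` when the literal
// can be represented exactly at the column's scale; otherwise return
// the expression unchanged.
fn unwrap_cast_in_eq(expr: Expr) -> Expr {
    if let Expr::Eq(left, right) = &expr {
        if let (Expr::Cast { expr: inner, .. }, Expr::Literal { unscaled, scale }) =
            (&**left, &**right)
        {
            if let Expr::Column { scale: col_scale, .. } = &**inner {
                if let Some(v) = rescale(*unscaled, *scale, *col_scale) {
                    return Expr::Eq(
                        inner.clone(),
                        Box::new(Expr::Literal { unscaled: v, scale: *col_scale }),
                    );
                }
            }
        }
    }
    expr
}
```

For the query in this issue, the literal 1.4 at scale 15 is 1400000000000000 unscaled. At the column's scale of 10 it becomes 14000000000, so the filter can compare `d` directly with that value and no cast on `d` is needed. A literal that cannot be represented at the column's scale is left alone in this sketch.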
I have filed a draft PR that adds a logical optimizer rule to do this, but it
may not be ready until tomorrow, because I need to review some of the plan
changes myself first. I think the rule can work well for us.
@alamb