[
https://issues.apache.org/jira/browse/IMPALA-10349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933533#comment-17933533
]
Csaba Ringhofer commented on IMPALA-10349:
------------------------------------------
This is also a major performance issue for geospatial functions. Because BINARY
is used instead of a dedicated geospatial type, a constant that cannot be encoded
as ASCII (which is highly likely, since these values consist mainly of doubles)
will not be constant-folded, and the expensive st_ functions have to be
evaluated for every row.
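
To illustrate why such values rarely pass an ASCII-only check, here is a minimal sketch. It assumes a hypothetical binary layout that simply stores two IEEE 754 doubles back to back; the class and method names are invented for this example and are not Impala code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo, not Impala code: shows that the raw bytes of doubles
// usually contain bytes >= 0x80, which an ASCII-only check rejects.
final class DoubleBytesDemo {

    // Assumed layout for illustration only: two little-endian doubles.
    static byte[] encodePoint(double x, double y) {
        return ByteBuffer.allocate(2 * Double.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putDouble(x)
                .putDouble(y)
                .array();
    }

    // Java bytes are signed, so any byte >= 0x80 shows up as negative.
    static boolean isAscii(byte[] bytes) {
        for (byte b : bytes) {
            if (b < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // 2.0 is 0x4000000000000000: every byte is below 0x80.
        System.out.println(isAscii(encodePoint(2.0, 2.0))); // true
        // 1.0 is 0x3FF0000000000000: the 0xF0 byte is not ASCII.
        System.out.println(isAscii(encodePoint(1.0, 2.0))); // false
    }
}
```

Only a few special values such as 2.0 happen to be ASCII-clean, so under this assumption most real coordinates would fall into the non-folded case.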
> Revisit constant folding on non-ASCII strings
> ---------------------------------------------
>
> Key: IMPALA-10349
> URL: https://issues.apache.org/jira/browse/IMPALA-10349
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Quanlong Huang
> Priority: Critical
>
> Constant folding may produce non-ASCII strings. In such cases, we currently
> abandon folding the constant. See the commit message of IMPALA-1788 or the
> code here:
> [https://github.com/apache/impala/blob/9672d945963e1ca3c8699340f92d7d6ce1d91c9f/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java#L274-L282]
> I think we should allow folding non-ASCII strings if they are legal UTF-8
> strings.
> Example where constant folding works:
> {code:java}
> Query: explain select * from functional.alltypes where string_col =
> substr('123', 1, 1)
> +-------------------------------------------------------------+
> | Explain String                                              |
> +-------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=32.00KB Threads=3 |
> | Per-Host Resource Estimates: Memory=160MB                   |
> | Codegen disabled by planner                                 |
> |                                                             |
> | PLAN-ROOT SINK                                              |
> | |                                                           |
> | 01:EXCHANGE [UNPARTITIONED]                                 |
> | |                                                           |
> | 00:SCAN HDFS [functional.alltypes]                          |
> |    HDFS partitions=24/24 files=24 size=478.45KB             |
> |    predicates: string_col = '1'                             |
> |    row-size=89B cardinality=730                             |
> +-------------------------------------------------------------+
> {code}
> Example where constant folding does not work:
> {code:java}
> Query: explain select * from functional.alltypes where string_col =
> substr('引擎', 1, 3)
> +-------------------------------------------------------------+
> | Explain String                                              |
> +-------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=32.00KB Threads=3 |
> | Per-Host Resource Estimates: Memory=160MB                   |
> | Codegen disabled by planner                                 |
> |                                                             |
> | PLAN-ROOT SINK                                              |
> | |                                                           |
> | 01:EXCHANGE [UNPARTITIONED]                                 |
> | |                                                           |
> | 00:SCAN HDFS [functional.alltypes]                          |
> |    HDFS partitions=24/24 files=24 size=478.45KB             |
> |    predicates: string_col = substr('引擎', 1, 3)              |
> |    row-size=89B cardinality=730                             |
> +-------------------------------------------------------------+
> {code}
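
The proposal in the quoted description, folding a result when it is a legal UTF-8 string rather than only when it is pure ASCII, could be checked roughly as follows. This is a sketch under stated assumptions: the class and method names are hypothetical and do not come from LiteralExpr.java; only the java.nio.charset calls are standard JDK API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not the actual Impala frontend code.
final class FoldableStringCheck {

    // The stricter condition: every byte is below 0x80.
    static boolean isAscii(byte[] bytes) {
        for (byte b : bytes) {
            if (b < 0) {
                return false;
            }
        }
        return true;
    }

    // The relaxed condition: the bytes decode as UTF-8 without errors.
    // REPORT makes the decoder throw instead of substituting U+FFFD.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // U+5F15 is the first character of the literal in the example above;
        // it occupies three bytes in UTF-8 (E5 BC 95).
        byte[] full = "\u5f15".getBytes(StandardCharsets.UTF_8);
        byte[] truncated = Arrays.copyOf(full, 2); // cut mid-character

        System.out.println(isAscii(full));          // false
        System.out.println(isValidUtf8(full));      // true
        System.out.println(isValidUtf8(truncated)); // false
    }
}
```

A three-byte prefix of the literal is not ASCII but is well-formed UTF-8, so it would be foldable under the relaxed rule, while a two-byte prefix would still be rejected. Arbitrary BINARY values, such as the geospatial case in the comment, would generally fail this check as well.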