[
https://issues.apache.org/jira/browse/IMPALA-10461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273642#comment-17273642
]
Sheng Wang commented on IMPALA-10461:
-------------------------------------
Hi [[email protected]],[~stigahuang],[~boroknagyz]
While reading the code, I found that the main cause is 'CAST('2021-01-26' AS
TIMESTAMP)'. FoldConstantsRule rewrites this CastExpr into a TimestampLiteral,
and SelectStmt.analyze() is then executed again after the rewrite. At that
point the bound check fails for 'resultExprs_', because Impala cannot
substitute the 'CastExpr' with a 'SlotRef'.
Reading further, I found that TimestampLiteral.localEquals fails during the
substitution phase. This is because something goes wrong when the
TimestampValue is generated in the BE. Finally, I found that when the
ScalarExprEvaluator is initialized in fe-support.cc, Impala creates a new
ExprValue called 'result_', which contains a TimestampValue, and the last four
padding bytes are not set to zero when timestamp_val is initialized in the
ExprValue ctor. So when we use this timestamp_val to store the value like this:
{code:java}
case TYPE_TIMESTAMP: {
impala_udf::TimestampVal v = expr.GetTimestampVal(this, row);
if (v.is_null) return nullptr;
result_.timestamp_val = TimestampValue::FromTimestampVal(v);
return &result_.timestamp_val;
}{code}
The value of these last four padding bytes is determined when timestamp_val is
initialized in the ExprValue ctor. When the query is executed several times,
the four padding bytes differ between runs, which causes the TimestampLiteral
equality check to fail.
If we add 'memset(&timestamp_val, 0, sizeof(timestamp_val))', or compare only
the first 12 bytes in TimestampLiteral.localEquals (since the last four bytes
are padding), we can solve this problem.
However, I cannot reproduce this problem in version 4.0. Does version 4.0
ensure that variables are set to 0 on initialization, or that padding bytes
are set to 0?
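To illustrate the padding issue described above, here is a minimal, self-contained sketch. It is not Impala code: RawSlot, MakeSlot, EqualAllBytes and EqualDataBytes are hypothetical names, and the layout (8 bytes of time plus 4 bytes of date, followed by 4 padding bytes) is an assumption based on the 12/16 byte split mentioned above. It simulates two 16-byte slots that hold the same timestamp but were created on top of different leftover memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed layout: 12 meaningful bytes followed by 4 padding bytes.
constexpr std::size_t kDataBytes = 12;
constexpr std::size_t kSlotBytes = 16;

// Hypothetical stand-in for the raw bytes of a serialized timestamp value.
struct RawSlot {
  unsigned char bytes[kSlotBytes];
};

// Builds a slot on top of simulated leftover memory ('garbage').
// If zero_first is true, the whole slot is cleared before the fields are
// written, which mirrors the proposed memset fix.
RawSlot MakeSlot(unsigned char garbage, bool zero_first,
                 std::int64_t time_ns, std::int32_t days) {
  RawSlot slot;
  std::memset(slot.bytes, garbage, kSlotBytes);  // simulate uninitialized memory
  if (zero_first) std::memset(slot.bytes, 0, kSlotBytes);
  // Only the first 12 bytes are written; the last 4 keep their old content.
  std::memcpy(slot.bytes, &time_ns, sizeof(time_ns));
  std::memcpy(slot.bytes + sizeof(time_ns), &days, sizeof(days));
  return slot;
}

// Compares all 16 bytes, padding included.
bool EqualAllBytes(const RawSlot& a, const RawSlot& b) {
  return std::memcmp(a.bytes, b.bytes, kSlotBytes) == 0;
}

// Compares only the 12 meaningful bytes, which mirrors the alternative fix.
bool EqualDataBytes(const RawSlot& a, const RawSlot& b) {
  return std::memcmp(a.bytes, b.bytes, kDataBytes) == 0;
}

// Walks through the three cases; the day number is an arbitrary example value.
void PaddingDemo() {
  RawSlot a = MakeSlot(0xAA, false, 0, 18653);
  RawSlot b = MakeSlot(0x55, false, 0, 18653);
  assert(!EqualAllBytes(a, b));  // same timestamp, different padding
  assert(EqualDataBytes(a, b));  // ignoring the padding makes them equal

  RawSlot c = MakeSlot(0xAA, true, 0, 18653);
  RawSlot d = MakeSlot(0x55, true, 0, 18653);
  assert(EqualAllBytes(c, d));   // zeroing first makes full comparison stable
}
```

Calling PaddingDemo() exercises both candidate fixes: zeroing the slot before writing, or restricting the comparison to the 12 data bytes. Either one removes the dependence on whatever bytes happened to be in memory beforehand.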
> Flaky exception when use cast to timestamp in query
> ---------------------------------------------------
>
> Key: IMPALA-10461
> URL: https://issues.apache.org/jira/browse/IMPALA-10461
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 2.12.0, Impala 3.4.0
> Reporter: Sheng Wang
> Priority: Major
>
> Recently, I found a problem when executing the query below in versions 3.4.0
> and 2.12.0:
> {code:java}
> create table test_table(dt STRING) partitioned by(day STRING) STORED AS
> PARQUET;
> SELECT
> (CASE WHEN (DAYS_ADD(CAST(CAST(TO_DATE(TO_TIMESTAMP(`t1`.`dt`,
> 'yyyy-MM-dd')) AS TIMESTAMP) AS TIMESTAMP), 7)
> > CAST('2021-01-26' AS TIMESTAMP))
> THEN 0 ELSE 1 END) `d1`
> FROM
> (SELECT dt FROM test_table
> WHERE day=to_date(days_sub(now(),1))
> GROUP BY dt) `t1`
> GROUP BY (CASE WHEN (DAYS_ADD(CAST(CAST(TO_DATE(TO_TIMESTAMP(`t1`.`dt`,
> 'yyyy-MM-dd')) AS TIMESTAMP) AS TIMESTAMP), 7)
> > CAST('2021-01-26' AS TIMESTAMP))
> THEN 0 ELSE 1 END)
> LIMIT 20;
> {code}
> The above query sometimes succeeds, but sometimes fails like this:
> {code:java}
> Create execute plan failed :Create execute plan failed
> :org.apache.impala.common.AnalysisException: select list expression not
> produced by aggregation output (missing from GROUP BY clause?): (CASE WHEN
> (days_add(CAST(CAST(to_date(to_timestamp(t1.dt, 'yyyy-MM-dd')) AS TIMESTAMP)
> AS TIMESTAMP), 7) > TIMESTAMP '2021-01-26 00:00:00') THEN 0 ELSE 1 END) d1 at
> org.apache.impala.analysis.SelectStmt$SelectAnalyzer.verifyAggregation(SelectStmt.java:832)
> at
> org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyze(SelectStmt.java:233){code}