[ 
https://issues.apache.org/jira/browse/IMPALA-10461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273642#comment-17273642
 ] 

Sheng Wang commented on IMPALA-10461:
-------------------------------------

Hi [[email protected]],[~stigahuang],[~boroknagyz]

While reading the code, I found that the root cause is 'CAST('2021-01-26' AS 
TIMESTAMP)'. FoldConstantsRule rewrites this CastExpr into a TimestampLiteral, 
and SelectStmt.analyze() is then executed again after the rewrite. At that 
point the bound check fails for 'resultExprs_', because Impala cannot 
substitute 'CastExpr' with 'SlotRef'.
Reading further, I found that TimestampLiteral.localEquals fails during the 
substitution phase, because something goes wrong when the TimestampValue is 
generated in the BE. When the ScalarExprEvaluator is initialized in 
fe-support.cc, Impala creates a new ExprValue called 'result_', which contains 
a TimestampValue. The last four padding bytes are not set to zero when 
timestamp_val is initialized in the ExprValue ctor. So when we use this 
timestamp_val to store the value like this:
{code:java}
    case TYPE_TIMESTAMP: {
      impala_udf::TimestampVal v = expr.GetTimestampVal(this, row);
      if (v.is_null) return nullptr;
      result_.timestamp_val = TimestampValue::FromTimestampVal(v);
      return &result_.timestamp_val;
    }{code}
The value of these last four padding bytes is determined when 'result_' is 
initialized in the ExprValue ctor. When the query is executed several times, 
the four padding bytes differ between runs, which causes the TimestampLiteral 
equality check to fail.
We can solve this problem either by adding 
'memset(&timestamp_val, 0, sizeof(timestamp_val))' or by comparing only the 
first 12 bytes in TimestampLiteral.localEquals, since the last four bytes are 
padding.
However, I cannot reproduce this problem in version 4.0. Does version 4.0 
ensure that variables are set to 0 on initialization, or that padding bytes 
are set to 0?
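
To illustrate the padding issue in isolation, here is a minimal standalone sketch. It is not Impala code: 'FakeTimestamp', 'Serialize', 'EqualAllBytes' and 'EqualDataBytes' are hypothetical names, and the struct layout (8 bytes + 4 bytes of data, 4 bytes of tail padding) is only an assumption that mirrors the 12 + 4 byte split described above. The 'fill' argument simulates whatever garbage the padding happened to hold.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 16-byte timestamp value:
// 12 bytes of real data followed by 4 bytes of tail padding.
struct alignas(8) FakeTimestamp {
  int64_t time_of_day_ns;  // 8 bytes
  int32_t day_number;      // 4 bytes
  // 4 bytes of tail padding follow here
};
static_assert(sizeof(FakeTimestamp) == 16, "expected 12 data bytes + 4 padding bytes");

// Number of bytes that actually carry data (everything before the padding).
constexpr std::size_t kDataBytes =
    offsetof(FakeTimestamp, day_number) + sizeof(int32_t);
static_assert(kDataBytes == 12, "expected 12 data bytes");

using Buf = std::array<unsigned char, sizeof(FakeTimestamp)>;

// Writes the two fields into a 16-byte buffer that was pre-filled with `fill`.
// The fill byte survives only in the padding, simulating uninitialized memory.
Buf Serialize(int64_t ns, int32_t day, unsigned char fill) {
  Buf buf;
  buf.fill(fill);
  std::memcpy(buf.data() + offsetof(FakeTimestamp, time_of_day_ns), &ns, sizeof(ns));
  std::memcpy(buf.data() + offsetof(FakeTimestamp, day_number), &day, sizeof(day));
  return buf;
}

// Compares all 16 bytes, padding included (the flaky comparison).
bool EqualAllBytes(const Buf& a, const Buf& b) {
  return std::memcmp(a.data(), b.data(), a.size()) == 0;
}

// Compares only the 12 data bytes and ignores the padding.
bool EqualDataBytes(const Buf& a, const Buf& b) {
  return std::memcmp(a.data(), b.data(), kDataBytes) == 0;
}

// Small self-check: same logical value, different padding garbage.
void CheckPaddingDemo() {
  Buf a = Serialize(0, 18653, 0xAA);
  Buf b = Serialize(0, 18653, 0x55);
  assert(!EqualAllBytes(a, b));  // padding differs, full compare fails
  assert(EqualDataBytes(a, b));  // data bytes match
  // Zeroed padding (the memset approach) makes the full compare stable.
  assert(EqualAllBytes(Serialize(0, 18653, 0x00), Serialize(0, 18653, 0x00)));
}
```

Both proposed fixes show up here: zeroing the whole value first makes a 16-byte comparison deterministic, while comparing only 'kDataBytes' makes the comparison independent of the padding.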

> Flaky exception when use cast to timestamp in query
> ---------------------------------------------------
>
>                 Key: IMPALA-10461
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10461
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 2.12.0, Impala 3.4.0
>            Reporter: Sheng Wang
>            Priority: Major
>
> Recently, I found a problem when executing the query below in versions 
> 3.4.0 and 2.12.0:
> {code:java}
> create table test_table(dt STRING) partitioned by(day STRING) STORED AS 
> PARQUET;
> SELECT
>       (CASE WHEN (DAYS_ADD(CAST(CAST(TO_DATE(TO_TIMESTAMP(`t1`.`dt`, 
> 'yyyy-MM-dd')) AS TIMESTAMP) AS TIMESTAMP), 7) 
>               > CAST('2021-01-26' AS TIMESTAMP))
>               THEN 0 ELSE 1 END) `d1`
> FROM
>  (SELECT dt FROM test_table
>   WHERE day=to_date(days_sub(now(),1))
>   GROUP BY dt) `t1`
> GROUP BY (CASE WHEN (DAYS_ADD(CAST(CAST(TO_DATE(TO_TIMESTAMP(`t1`.`dt`, 
> 'yyyy-MM-dd')) AS TIMESTAMP) AS TIMESTAMP), 7) 
>       > CAST('2021-01-26' AS TIMESTAMP))
>       THEN 0 ELSE 1 END)
> LIMIT 20;
> {code}
> The above query sometimes executes successfully, but sometimes fails like this:
> {code:java}
> Create execute plan failed :Create execute plan failed 
> :org.apache.impala.common.AnalysisException: select list expression not 
> produced by aggregation output (missing from GROUP BY clause?): (CASE WHEN 
> (days_add(CAST(CAST(to_date(to_timestamp(t1.dt, 'yyyy-MM-dd')) AS TIMESTAMP) 
> AS TIMESTAMP), 7) > TIMESTAMP '2021-01-26 00:00:00') THEN 0 ELSE 1 END) d1 at 
> org.apache.impala.analysis.SelectStmt$SelectAnalyzer.verifyAggregation(SelectStmt.java:832)
>  at 
> org.apache.impala.analysis.SelectStmt$SelectAnalyzer.analyze(SelectStmt.java:233){code}


