Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17168 )

Change subject: IMPALA-10564: Return error when inserting an invalid decimal 
value
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17168/7/be/src/udf/udf.h
File be/src/udf/udf.h:

http://gerrit.cloudera.org:8080/#/c/17168/7/be/src/udf/udf.h@29
PS7, Line 29: #include "gen-cpp/ErrorCodes_types.h" // for TErrorCode
> Sure, I think that's fine
Done


http://gerrit.cloudera.org:8080/#/c/17168/7/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/17168/7/common/thrift/ImpalaService.thrift@652
PS7, Line 652: false by default
> > If we set the default value for query option as true, the default
I debug the constant case. Both FE and BE evaluates the decimal expression. FE 
evaluates the constant decimal expression, and writes error message and calling 
stack to log file when it get decimal overflow issue. But it does not transfer 
the materialized decimal value to BE so BE has to evaluate the decimal 
expression again. FE generate different plan tree for expression with/without 
alias. For constant case, FragmentInstanceState::ExecInternal() call 
UnionNode::GetNext() before calling HdfsTableSink::Send(). UnionNode::GetNext() 
call UnionNode::GetNextConst(), then UnionNode::MaterializeExprs(), ... 
DecimalOperators::ScaleDecimalValue() to get expression value. If there is 
decimal overflow issue, the error is saved in RuntimeState object. When 
HdfsTableSink::Send() is called, it calls RuntimeState::CheckQueryState() and 
returns error if there is a recorded error. That's why the current code return 
error for decimal overflow with constant expression. For expression with alias, 
the expression is evaluated util the column values are going to be wrote to 
table in HdfsTextTableWriter::AppendRows(). So in this case, when 
RuntimeState::CheckQueryState() is called by HdfsTableSink::Send(), there is no 
error saved in RuntimeState object, it continue to call 
HdfsTableSink::WriteRowsToPartition(), then HdfsTextTableWriter::AppendRows() 
to write rows. But HdfsTextTableWriter::AppendRows() don't call 
RuntimeState::CheckQueryState() anymore, it inserts NULL values to the table 
even there is an error recorded in RuntimeState object. That's reason why the 
current behaviors for two cases are different. Both cases are handled by BE. 
The new query option should affect both cases. If I don't change the code in 
HdfsTableSink::Send(), the behaviors for the two cases will be different.

The alias is required to reproduce the customer issue. If using "create table 
as select cast(a*b*c as decimal(28,10)) from source" with big a, b, c values, 
Impala return error without inserting NULL since the calling path is different.



--
To view, visit http://gerrit.cloudera.org:8080/17168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Gerrit-Change-Number: 17168
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Sat, 20 Mar 2021 03:39:08 +0000
Gerrit-HasComments: Yes

Reply via email to