Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/17168 )
Change subject: IMPALA-10564: Return error when inserting an invalid decimal value ...................................................................... Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/17168/7/be/src/udf/udf.h File be/src/udf/udf.h: http://gerrit.cloudera.org:8080/#/c/17168/7/be/src/udf/udf.h@29 PS7, Line 29: #include "gen-cpp/ErrorCodes_types.h" // for TErrorCode > Sure, I think that's fine Done http://gerrit.cloudera.org:8080/#/c/17168/7/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/17168/7/common/thrift/ImpalaService.thrift@652 PS7, Line 652: false by default > > If we set the default value for query option as true, the default I debug the constant case. Both FE and BE evaluates the decimal expression. FE evaluates the constant decimal expression, and writes error message and calling stack to log file when it get decimal overflow issue. But it does not transfer the materialized decimal value to BE so BE has to evaluate the decimal expression again. FE generate different plan tree for expression with/without alias. For constant case, FragmentInstanceState::ExecInternal() call UnionNode::GetNext() before calling HdfsTableSink::Send(). UnionNode::GetNext() call UnionNode::GetNextConst(), then UnionNode::MaterializeExprs(), ... DecimalOperators::ScaleDecimalValue() to get expression value. If there is decimal overflow issue, the error is saved in RuntimeState object. When HdfsTableSink::Send() is called, it calls RuntimeState::CheckQueryState() and returns error if there is a recorded error. That's why the current code return error for decimal overflow with constant expression. For expression with alias, the expression is evaluated util the column values are going to be wrote to table in HdfsTextTableWriter::AppendRows(). So in this case, when RuntimeState::CheckQueryState() is called by HdfsTableSink::Send(), there is no error saved in RuntimeState object, it continue to call HdfsTableSink::WriteRowsToPartition(), then HdfsTextTableWriter::AppendRows() to write rows. But HdfsTextTableWriter::AppendRows() don't call RuntimeState::CheckQueryState() anymore, it inserts NULL values to the table even there is an error recorded in RuntimeState object. That's reason why the current behaviors for two cases are different. Both cases are handled by BE. The new query option should affect both cases. If I don't change the code in HdfsTableSink::Send(), the behaviors for the two cases will be different. The alias is required to reproduce the customer issue. If using "create table as select cast(a*b*c as decimal(28,10)) from source" with big a, b, c values, Impala return error without inserting NULL since the calling path is different. -- To view, visit http://gerrit.cloudera.org:8080/17168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 Gerrit-Change-Number: 17168 Gerrit-PatchSet: 7 Gerrit-Owner: Wenzhe Zhou <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Sat, 20 Mar 2021 03:39:08 +0000 Gerrit-HasComments: Yes
