Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/17168 )
Change subject: IMPALA-10564: Return error when inserting an invalid decimal value ...................................................................... Patch Set 7: (3 comments) http://gerrit.cloudera.org:8080/#/c/17168/7/be/src/exec/hdfs-text-table-writer.cc File be/src/exec/hdfs-text-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/17168/7/be/src/exec/hdfs-text-table-writer.cc@112 PS7, Line 112: state_->GetQueryStatus().IsInvalidDecimalError() Probably want to move this check to after the check if the column has decimal type - this involves reading an atomic variable so its pretty expensive to be doing once per row, better to at least only have to do it for decimal rows. http://gerrit.cloudera.org:8080/#/c/17168/7/be/src/udf/udf.h File be/src/udf/udf.h: http://gerrit.cloudera.org:8080/#/c/17168/7/be/src/udf/udf.h@29 PS7, Line 29: #include "gen-cpp/ErrorCodes_types.h" // for TErrorCode I'm not sure if you can do this. See the comment a few lines above this - this file is sometimes used in situations where we don't have a full Impala build available. I think its probably fine to just rely on string matching for this, instead of using error codes. http://gerrit.cloudera.org:8080/#/c/17168/7/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/17168/7/common/thrift/ImpalaService.thrift@652 PS7, Line 652: false by default > For constant decimal (e.g the expression without alias), our current code r I think this is a different situation than the constant decimal case, since for that case you fail the entire query before anything gets inserted so you're not messing up a users data, whereas in this case we're stopping after a partial insert, so its not always clear what the end state of the table will be, and I think preferring to just have nulls inserted makes a lot of sense from a user's perspective. I guess the argument though is that for decimal_v2 we're supposed to return an error rather than null for overflow, not doing that in this situation was an oversight, so theoretically this is making the behavior more consistent with what its supposed to be anyways. I still personally am reluctant to change the default behavior in situations like these where the previous behavior wasn't really incorrect (i.e. wasn't giving wrong results, causing crashes, etc.), cause you might be breaking users that rely on the behavior, but I'll leave it up to you. -- To view, visit http://gerrit.cloudera.org:8080/17168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98 Gerrit-Change-Number: 17168 Gerrit-PatchSet: 7 Gerrit-Owner: Wenzhe Zhou <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Fri, 19 Mar 2021 23:44:37 +0000 Gerrit-HasComments: Yes
