Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17168 )

Change subject: IMPALA-10564: Return error when inserting an invalid decimal 
value
......................................................................


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17168/7/be/src/exec/hdfs-text-table-writer.cc
File be/src/exec/hdfs-text-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/17168/7/be/src/exec/hdfs-text-table-writer.cc@112
PS7, Line 112: state_->GetQueryStatus().IsInvalidDecimalError()
Probably want to move this check to after the check if the column has decimal 
type - this involves reading an atomic variable so its pretty expensive to be 
doing once per row, better to at least only have to do it for decimal rows.


http://gerrit.cloudera.org:8080/#/c/17168/7/be/src/udf/udf.h
File be/src/udf/udf.h:

http://gerrit.cloudera.org:8080/#/c/17168/7/be/src/udf/udf.h@29
PS7, Line 29: #include "gen-cpp/ErrorCodes_types.h" // for TErrorCode
I'm not sure if you can do this. See the comment a few lines above this - this 
file is sometimes used in situations where we don't have a full Impala build 
available.

I think its probably fine to just rely on string matching for this, instead of 
using error codes.


http://gerrit.cloudera.org:8080/#/c/17168/7/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/17168/7/common/thrift/ImpalaService.thrift@652
PS7, Line 652: false by default
> For constant decimal (e.g the expression without alias), our current code r
I think this is a different situation than the constant decimal case, since for 
that case you fail the entire query before anything gets inserted so you're not 
messing up a users data, whereas in this case we're stopping after a partial 
insert, so its not always clear what the end state of the table will be, and I 
think preferring to just have nulls inserted makes a lot of sense from a user's 
perspective.

I guess the argument though is that for decimal_v2 we're supposed to return an 
error rather than null for overflow, not doing that in this situation was an 
oversight, so theoretically this is making the behavior more consistent with 
what its supposed to be anyways.

I still personally am reluctant to change the default behavior in situations 
like these where the previous behavior wasn't really incorrect (i.e. wasn't 
giving wrong results, causing crashes, etc.), cause you might be breaking users 
that rely on the behavior, but I'll leave it up to you.



--
To view, visit http://gerrit.cloudera.org:8080/17168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Gerrit-Change-Number: 17168
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Fri, 19 Mar 2021 23:44:37 +0000
Gerrit-HasComments: Yes

Reply via email to