Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3579: Strict handling of numeric overflow in text parsing
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/3150/1//COMMIT_MSG
Commit Message:

Line 13: invalid data, i.e. NULL is returned.
> this doesn't seem right.  when the option is set, error is returned (not NU
In the case that abort_on_error=false but strict_mode=true this is what we see, 
no?


Line 21:  However, integer and floating point parsing never returns
       :   an underflow today.
> what cases would be integer and floating point underflow?  For integer, we 
Yeah for floating point we don't handle underflows. For integers I'm talking 
about truncation so I guess I shouldn't refer to that as underflow since it's 
not really what it means (though it is similar, i.e. not enough precision).


http://gerrit.cloudera.org:8080/#/c/3150/1/be/src/exec/text-converter.cc
File be/src/exec/text-converter.cc:

Line 249: (though the latter are only possible for
        :       // DECIMAL which isn't codegen'd)
> is underflow only possible given the way the code works today or by definit
I thought we wanted to handle underflows in floats/doubles. Right now we don't 
really handle them, i.e. 9.9e-99999 is 0 w/ PARSE_SUCCESS. I also think we may 
want to consider handling truncating for integers (though I dunno if we'd want 
to call that underflow necessarily, perhaps like PARSE_TRUNCATE).
Anyway, that's not happening here so I'll take this out for now.


http://gerrit.cloudera.org:8080/#/c/3150/1/be/src/exec/text-converter.inline.h
File be/src/exec/text-converter.inline.h:

Line 171:     // TODO: add warnings
> what is this TODO about at this point?  shouldn't it be the caller would do
Was thinking in case we wanted to log something more specific, but I guess 
that's not really useful (at least not yet). I'll remove.


http://gerrit.cloudera.org:8080/#/c/3150/1/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

Line 200:   // Additional strict handling of invalid values in parsing.
> I think we'll want this option to have broader meaning soon enough, so mayb
Done


http://gerrit.cloudera.org:8080/#/c/3150/1/testdata/workloads/functional-query/queries/QueryTest/strict-mode-abort.test
File 
testdata/workloads/functional-query/queries/QueryTest/strict-mode-abort.test:

Line 32
> how about testing negative values for all these cases as well
Hm yeah, I think the best way to do that is introducing a new "bad data" file. 
However, I'm not as worried about handling both cases here because  
strict-mode.test (the other test file) tests that all values (both positive and 
negative) return parse errors, so this is more of a test that the 
abort_on_error code works on this path. I'm not sure it's worth adding a new 
bad data file (testdata/data/...) and then a new functional table to wrap it. 
What do you think?


-- 
To view, visit http://gerrit.cloudera.org:8080/3150
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7409c31ec0cb6fe0b2d9842b9f58fe1670914836
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.6.0_5.8.0
Gerrit-Owner: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-HasComments: Yes

Reply via email to