Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12163 )

Change subject: IMPALA-7087: Read Parquet decimal columns with lower 
precision/scale
......................................................................


Patch Set 1:

> (1 comment)
 >
 > Thanks for taking a look @Csaba! I've only responded to your
 > comment on overflow handling so far, as I want to get that behavior
 > locked down first.

Are you sure that the query will be aborted - isn't it only one warning / file? 
If there are "normal" files processed before the one with overflows, then some 
rows will be returned before running to the error. In the current 
implementation it is also possible that some rows are returned from the file 
before reaching the overflowing value.

I am also not completely sure about the best approach.

I would prefer the error to be non-fatal, so to print one warning/file and 
continue the query.

If no important use case is lost this way, then I would prefer to use the 
formula and skip files based on metadata instead of checking every row.

This would make it possible to print a useful error message, so that the user 
will know the precision needed to process the file, and can call ALTER TABLE to 
increase precision / decrease scale and make the file usable. If the error is 
printed based the first row, then this can become a long iterative process.


--
To view, visit http://gerrit.cloudera.org:8080/12163
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafc8efd12379a39756e3e70f022a81a636dadb61
Gerrit-Change-Number: 12163
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Comment-Date: Mon, 07 Jan 2019 18:22:45 +0000
Gerrit-HasComments: No

Reply via email to