Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3943: Adhere to abort_on_error when a Parquet file has no row groups.
......................................................................

Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3862/3/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 856: }
> the error handle for all of these would be the same, right? so i think we c

My original point was that there are different kinds of errors, ranging from corruption to no data (like this case), and that they might behave differently (e.g. this case is OK to warn on, but corrupt files are not). I was thinking that this case isn't as bad as a header that can't even be read, so I was asking whether we'd want to avoid putting everything in the abort_on_error category regardless of severity. (As an aside, by handling this case here, we shouldn't need to short-circuit in GetNext().)

That said, maybe the distinction between these errors isn't useful, and I see the benefit of handling all errors in a single way.

-- 
To view, visit http://gerrit.cloudera.org:8080/3862

Gerrit-MessageType: comment
Gerrit-Change-Id: I6aff766a1ce6376efb329bdde51c648149dfe08c
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Matthew Jacobs
Gerrit-Reviewer: Tim Armstrong
Gerrit-HasComments: Yes
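
For illustration only, the two options weighed in the comment (one uniform error path versus severity-dependent handling) could be sketched roughly as below. This is a minimal sketch, not Impala's scanner code: the names ScanErrorSeverity, ScanAction, UniformPolicy and TieredPolicy are hypothetical. It also assumes that in the tiered variant corruption still follows abort_on_error, which the comment does not spell out.

```cpp
// Hypothetical severity levels; this is not Impala's actual error model.
enum class ScanErrorSeverity {
  kBenign,   // e.g. a readable Parquet file that has no row groups
  kCorrupt,  // e.g. a header that cannot even be read
};

// What the scanner would do with the offending file.
enum class ScanAction { kWarnAndSkipFile, kAbortQuery };

// Option 1: every error is handled the same way and honors abort_on_error,
// regardless of how severe it is.
ScanAction UniformPolicy(ScanErrorSeverity /*severity*/, bool abort_on_error) {
  return abort_on_error ? ScanAction::kAbortQuery
                        : ScanAction::kWarnAndSkipFile;
}

// Option 2: benign conditions only ever produce a warning; abort_on_error
// is applied to corruption only (an assumption made for this sketch).
ScanAction TieredPolicy(ScanErrorSeverity severity, bool abort_on_error) {
  if (severity == ScanErrorSeverity::kBenign) {
    return ScanAction::kWarnAndSkipFile;
  }
  return abort_on_error ? ScanAction::kAbortQuery
                        : ScanAction::kWarnAndSkipFile;
}
```

The two policies differ only for a benign error with abort_on_error set: the uniform one aborts the query, the tiered one warns and skips the file. The uniform variant has a single code path, which is the benefit the comment concedes at the end.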
