Michael Ho has posted comments on this change. Change subject: IMPALA-3989: Display skew warning for poorly formatted Parquet files ......................................................................
Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/5400/6/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: PS6, Line 333: (split_start <= row_group_start && split_end >= row_group_end) > Why is this an invalid case? If I understand it correctly, this function's sole purpose is to return TRUE iff there is any overlap between the split_range and the given row group. It's at the discretion of the caller to determine what's invalid. It's called below in NextRowGroup() iff the mid point of the row group doesn't fall into the split range. -- To view, visit http://gerrit.cloudera.org:8080/5400 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Attila Jeges <[email protected]> Gerrit-Reviewer: Attila Jeges <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-HasComments: Yes
