Marcel Kornacker has posted comments on this change. Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input ......................................................................
Patch Set 16: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/4863/15/be/src/exec/hdfs-table-sink.cc File be/src/exec/hdfs-table-sink.cc: Line 299: // then the batch contains rows with the same key and can be written as a whole. then all rows in the batch have the same key http://gerrit.cloudera.org:8080/#/c/4863/13/be/src/exec/hdfs-table-sink.h File be/src/exec/hdfs-table-sink.h: Line 214: /// Maps all rows in 'batch' to partitions and appends them to their temporary Hdfs > Done. Though I think "must be" makes it more clear that it's a prerequisit good point, better revert then. http://gerrit.cloudera.org:8080/#/c/4863/16/be/src/exec/hdfs-table-writer.h File be/src/exec/hdfs-table-writer.h: Line 70: /// rows in the batch are appended. i find this confusing, because it comments on the intention of the caller (if multiple partitions then...), and it doesn't even make that clear. this function doesn't check for any partitions. it appends the rows in 'batch' that were selected via row_group_indices, and if the latter is empty, appends every row. -- To view, visit http://gerrit.cloudera.org:8080/4863 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0 Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zoltan Ivanfi <[email protected]> Gerrit-HasComments: Yes
