Tim Armstrong has posted comments on this change.

Change subject: IMPALA_2523: Make HdfsTableSink aware of clustered input
......................................................................
Patch Set 1:

(2 comments)

I didn't do a full pass but had thoughts on testing.

http://gerrit.cloudera.org:8080/#/c/4863/1//COMMIT_MSG
Commit Message:

PS1, Line 7: A_
- instead of _

Line 12:
RE: testing - off the top of my head, I think we need:

* Very large inserts with partitions spanning many row batches
* Inserts with a small number of rows per partition (e.g. 1, 2, 3).
* Tests that check that the expected # of files are created

It would also be good to add batch_size as a test dimension (if it isn't already for the insert tests) and test with some different batch sizes to hit more edge cases, e.g. 1, 16, default (1024).

--
To view, visit http://gerrit.cloudera.org:8080/4863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
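
For illustration, the batch-size and rows-per-partition combinations suggested in the Line 12 comment could be enumerated roughly as follows. This is only a sketch: the names build_test_matrix and batches_spanned are hypothetical and are not part of Impala's test framework, and the values are the ones mentioned in the comment (1024 being the default batch size it cites).

```python
import itertools

# Values taken from the review comment; the helper names are hypothetical.
BATCH_SIZES = [1, 16, 1024]
ROWS_PER_PARTITION = [1, 2, 3]


def build_test_matrix(batch_sizes=BATCH_SIZES,
                      rows_per_partition=ROWS_PER_PARTITION):
    """Return every (batch_size, rows_per_partition) combination as a dict."""
    return [
        {"batch_size": b, "rows_per_partition": r}
        for b, r in itertools.product(batch_sizes, rows_per_partition)
    ]


def batches_spanned(num_rows, batch_size):
    """Number of row batches needed to hold num_rows rows (ceiling division)."""
    return -(-num_rows // batch_size)
```

With batch_size set to 1, even a 3-row partition spans three row batches, which is one way small batch sizes can exercise the partition-spanning case without a very large insert.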
