Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-3452: S3: Disable Impala staging for INSERTs via flag for speedup
......................................................................

Patch Set 4:

(3 comments)

The S3_SKIP_INSERT_STAGING query option is enabled by default. For the test, I've disabled it.

http://gerrit.cloudera.org:8080/#/c/2905/4/be/src/exec/hdfs-table-sink.h
File be/src/exec/hdfs-table-sink.h:

Line 214: This is set if we want to
> Returns TRUE if the staging step should be skipped for this partition.

Done


http://gerrit.cloudera.org:8080/#/c/2905/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 865: } else if (!is_s3_path || !query_ctx_.request.query_options.s3_skip_insert_staging) {
> Is this even needed when the option is not set? i.e. does it do anything us

We generally wouldn't need to create "directories" on S3. However, the Hadoop S3A connector code always checks whether a path exists before doing an operation on that "directory" (this is unnecessary for us, but we have no way to work around it). So if we don't call CREATE_DIR, a lot of operations would fail. There's a comment about this above, at line 854.

Hadoop code for reference (this is for the rename() operation):
http://github.mtv.cloudera.com/CDH/hadoop/blob/2b855a685a8dd445c9e20139967b4c16aab3df12/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L529


Line 866: we set the S3_SKIP_INSERT_STAGING query option
> if the ... query option is set

Done


--
To view, visit http://gerrit.cloudera.org:8080/2905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iff9620d41ba0d5fb1aa0c9f4abb48866fc2b0698
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-HasComments: Yes
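
For reference, the condition quoted from coordinator.cc line 865 can be read as a small truth table. The following is only a minimal standalone sketch, not the Impala code: the function names `TakesCreateDirBranch` and `CheckAllCases` are hypothetical, the two plain bool parameters stand in for `is_s3_path` and the `s3_skip_insert_staging` query option, and it assumes the branch guarded by this condition is the one that issues CREATE_DIR.

```cpp
#include <cassert>

// Hypothetical helper (not in Impala): mirrors the boolean condition quoted
// from line 865. Assumption: a true result means the CREATE_DIR branch runs.
constexpr bool TakesCreateDirBranch(bool is_s3_path, bool s3_skip_insert_staging) {
  return !is_s3_path || !s3_skip_insert_staging;
}

// Walks through all four input combinations of the condition.
void CheckAllCases() {
  assert(TakesCreateDirBranch(false, false));  // non-S3 path, option unset
  assert(TakesCreateDirBranch(false, true));   // non-S3 path, option set
  assert(TakesCreateDirBranch(true, false));   // S3 path, option unset
  assert(!TakesCreateDirBranch(true, true));   // S3 path, option set
}
```

Under these assumptions, the only case that bypasses the branch is an S3 path with the option set, which matches the point above that CREATE_DIR is still needed on S3 when the option is not set.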
