Sailesh Mukil has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/2905

Change subject: IMPALA-3452: S3: Disable Impala staging for INSERTs via flag 
for speedup
......................................................................

IMPALA-3452: S3: Disable Impala staging for INSERTs via flag for speedup

INSERTs on S3 are slow because of double buffering: we buffer once
locally and once in a staging directory in S3 before moving the
file(s) to the final location. On HDFS, moving a file from the staging
directory to the final location is a quick rename, which is only a
metadata operation. S3 does not support renames, so the same move
becomes a full file copy.

This patch introduces a boolean flag "s3_skip_insert_staging" which
skips the staging step on S3 and lets the sinks write directly to the
final location.

TODO: Record average performance gains here.

Change-Id: Iff9620d41ba0d5fb1aa0c9f4abb48866fc2b0698
---
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
2 files changed, 34 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/05/2905/1
-- 
To view, visit http://gerrit.cloudera.org:8080/2905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iff9620d41ba0d5fb1aa0c9f4abb48866fc2b0698
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Sailesh Mukil <[email protected]>
