Henry Robinson has posted comments on this change.

Change subject: IMPALA-3452: S3: Disable Impala staging for INSERTs via flag 
for speedup
......................................................................


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/2905/2/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

Line 285: and the user has specified
        :   // so via the query option S3_SKIP_INSERT_STAGING, in order to 
achieve better
        :   // performance.
Talk less about how the option might have got set, and more about what skipping 
staging means: where the file will get written in the true and false cases.


Line 289: output_partition->final_hdfs_file_name_prefix
        :              << "." << output_partition->num_files
        :              << "." << output_partition->writer->file_extension();
how about creating a:

  string final_location = Substitute(...);

and reusing that below on line 339.


Line 394: and if the query is not an INSERT OVERWRITE
say why we can't skip for overwrites (and mention that overwrites aren't 
affected in the query option comment as well)


http://gerrit.cloudera.org:8080/#/c/2905/2/be/src/exec/hdfs-table-sink.h
File be/src/exec/hdfs-table-sink.h:

Line 57: /// Name of the temporary directory that files for this partition are 
staged to before
       :   /// the coordinator moves them to their permanent location once the 
query completes.
Needs an updated comment.


Line 91:   /// This is set if we want to skip the staging step for this 
partition.
Say more about why we would want to do this.


http://gerrit.cloudera.org:8080/#/c/2905/2/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

Line 213:   // If true, skips the staging step for INSERTs to S3.
Be more clear here: not every reader will know what the 'staging step' is. 
Something like: "If true, INSERT writes to S3 go directly to their final 
location rather than being copied there by the coordinator."


-- 
To view, visit http://gerrit.cloudera.org:8080/2905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iff9620d41ba0d5fb1aa0c9f4abb48866fc2b0698
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-HasComments: Yes

Reply via email to