[
https://issues.apache.org/jira/browse/IMPALA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Armstrong resolved IMPALA-3578.
-----------------------------------
Resolution: Won't Fix
Having HDFS + S3 co-existing is an unusual architecture, not worth doing.
> S3: Consider allowing table-sink to stage in HDFS when writing to S3
> --------------------------------------------------------------------
>
> Key: IMPALA-3578
> URL: https://issues.apache.org/jira/browse/IMPALA-3578
> Project: IMPALA
> Issue Type: Improvement
> Components: Perf Investigation
> Affects Versions: Impala 2.6.0
> Reporter: Sailesh Mukil
> Assignee: Sailesh Mukil
> Priority: Minor
> Labels: performance, s3
>
> If users do not want to skip the staging step on INSERTs to S3, we could
> allow the table sink to stage the temporary files in HDFS (if available) and
> make the coordinator move the files to S3 on FinalizeSuccessfulInsert().
> This could improve performance of INSERTs to S3, as writes to HDFS are
> currently faster than writes to S3. At present, when we do not skip the
> staging step, the sinks write to a temporary location in S3 and the
> coordinator copies these files to the final location in S3 (as S3 doesn't
> support the rename() operation). So this would bring down the number of
> writes to S3 from 2 to 1 per file.
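
To make the "2 to 1 writes per file" arithmetic in the description concrete, here is a toy model of the two staging strategies. It is only an illustrative sketch, not Impala code: the function and class names are hypothetical, and it assumes (as the description does) that the coordinator's copy within S3 counts as a second S3 write.

```python
from dataclasses import dataclass


@dataclass
class WriteCounts:
    """Number of file writes issued to each storage system (toy model)."""
    s3_writes: int = 0
    hdfs_writes: int = 0


def insert_with_s3_staging(num_files: int) -> WriteCounts:
    """Current behaviour as described in the issue: stage in S3, then copy."""
    counts = WriteCounts()
    for _ in range(num_files):
        counts.s3_writes += 1  # table sink writes the temporary file to S3
        counts.s3_writes += 1  # coordinator copies it to the final S3 location
    return counts


def insert_with_hdfs_staging(num_files: int) -> WriteCounts:
    """Proposed behaviour: stage in HDFS, then move the file to S3."""
    counts = WriteCounts()
    for _ in range(num_files):
        counts.hdfs_writes += 1  # table sink writes the temporary file to HDFS
        counts.s3_writes += 1    # coordinator writes the file once to S3
    return counts


if __name__ == "__main__":
    print(insert_with_s3_staging(10))
    print(insert_with_hdfs_staging(10))
```

Under these assumptions the proposal trades one S3 write per file for one HDFS write per file, which only helps on clusters where HDFS is available alongside S3.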