[ 
https://issues.apache.org/jira/browse/IMPALA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-3578.
-----------------------------------
    Resolution: Won't Fix

Having HDFS and S3 co-existing is an unusual architecture, so this is not worth doing.

> S3: Consider allowing table-sink to stage in HDFS when writing to S3
> --------------------------------------------------------------------
>
>                 Key: IMPALA-3578
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3578
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Perf Investigation
>    Affects Versions: Impala 2.6.0
>            Reporter: Sailesh Mukil
>            Assignee: Sailesh Mukil
>            Priority: Minor
>              Labels: performance, s3
>
> If users do not want to skip the staging step on INSERTs to S3, we could 
> allow the table sink to stage the temporary files in HDFS (if available) and 
> make the coordinator move the files to S3 on FinalizeSuccessfulInsert().
> This could improve the performance of INSERTs to S3, since writes to HDFS are 
> currently faster than writes to S3. Today, when we do not skip the staging 
> step, the sinks write to a temporary location in S3 and the coordinator copies 
> these files to the final location in S3 (as S3 doesn't support the rename() 
> operation). So this would bring down the number of writes to S3 from 2 to 1 
> per file.
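
The write-count argument in the description can be illustrated with a small model. This is only a sketch under stated assumptions: the `WriteCounts` class and both functions are hypothetical names invented for this note, not Impala code, and the model counts one write per file per step without measuring any real I/O.

```python
from dataclasses import dataclass


@dataclass
class WriteCounts:
    """Hypothetical tally of file writes per storage system."""
    s3_writes: int = 0
    hdfs_writes: int = 0


def count_writes_s3_staging(num_files: int) -> WriteCounts:
    """Staging in S3: each file is written to S3 twice."""
    counts = WriteCounts()
    for _ in range(num_files):
        counts.s3_writes += 1  # sink writes the temporary file in S3
        counts.s3_writes += 1  # coordinator copies it to the final S3 location
    return counts


def count_writes_hdfs_staging(num_files: int) -> WriteCounts:
    """Proposed staging in HDFS: each file is written to S3 once."""
    counts = WriteCounts()
    for _ in range(num_files):
        counts.hdfs_writes += 1  # sink writes the temporary file in HDFS
        counts.s3_writes += 1    # coordinator moves it to the final S3 location
    return counts


if __name__ == "__main__":
    print(count_writes_s3_staging(10))
    print(count_writes_hdfs_staging(10))
```

In this model the proposal halves the S3 writes but adds one HDFS write per file, so any gain depends on HDFS writes being cheaper than S3 writes, as the description assumes.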



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
