[ 
https://issues.apache.org/jira/browse/HADOOP-11183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318160#comment-14318160
 ] 

Steve Loughran commented on HADOOP-11183:
-----------------------------------------

# Markdown docs should go into src/site/markdown, with relevant links, as 
part of the overall patch.
# Please don't say bad things about HDFS performance, especially given ongoing 
work on erasure coding, TCP stack bypassing for local writes, etc. Best 
to leave all performance numbers out altogether.
# Consider that within an EC2 VM, memory storage should be faster than 
virtualized HDD, that SSD may change the values, etc.
# ...so just say: "may offer performance improvements, especially on operations 
with the object store 'close' to the writer, where 'close' is defined by 
network bandwidth and latency".

> Memory-based S3AOutputstream
> ----------------------------
>
>                 Key: HADOOP-11183
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11183
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.6.0
>            Reporter: Thomas Demoor
>            Assignee: Thomas Demoor
>         Attachments: HADOOP-11183.001.patch, HADOOP-11183.002.patch, 
> HADOOP-11183.003.patch, info-003.md, info-S3AFastOutputStream-sync.md
>
>
> Currently s3a buffers files on disk(s) before uploading. This JIRA 
> investigates adding a memory-based upload implementation.
> The motivation is evidently performance: this would be beneficial for users 
> with high network bandwidth to S3 (EC2?) or users that run Hadoop directly on 
> an S3-compatible object store (FYI: my contributions are made on behalf of 
> Amplidata). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
