[
https://issues.apache.org/jira/browse/HADOOP-13560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840561#comment-15840561
]
ASF GitHub Bot commented on HADOOP-13560:
-----------------------------------------
Github user mojodna commented on the issue:
https://github.com/apache/hadoop/pull/130
Done: https://issues.apache.org/jira/browse/HADOOP-14028
> S3ABlockOutputStream to support huge (many GB) file writes
> ----------------------------------------------------------
>
> Key: HADOOP-13560
> URL: https://issues.apache.org/jira/browse/HADOOP-13560
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.9.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HADOOP-13560-branch-2-001.patch,
> HADOOP-13560-branch-2-002.patch, HADOOP-13560-branch-2-003.patch,
> HADOOP-13560-branch-2-004.patch
>
>
> There's two output stream mechanisms in Hadooop 2.7.x, neither of which
> handle massive multi-GB files that well.
> "classic": buffer everything to HDD until to the close() operation; time to
> close becomes O(data); as is available disk space. Fails to exploit exploit
> idle bandwidth, and on EC2 VMs with not much HDD capacity (especially
> completing with HDFS storage), can fill up the disk.
> {{S3AFastOutputStream}} uploads data in partition-sized blocks, buffering via
> byte arrays. Avoids disk problems and as it writes as soon as the first
> partition is ready, close() time is O(outstanding-data). However: needs
> tuning to reduce amount of data buffered. Get it wrong, and the first clue
> you get may be that the process goes OOM or is killed by YARN. Which is a
> shame, as get it right and operations which generates lots of data, complete
> much faster, including distcp.
> This patch proposes a new output stream, a successor to both,
> {{S3ABlockOutputStream}}.
> # uses block upload model of S3AFastOutputStream
> # supports buffering via: HDD, heap and (recycled) byte buffer, offering a
> choice between memory and HDD use. HDD: no OOM problems on small JVMs/need to
> tune.
> # Uses the fast output stream mechanism of limiting queue size for data to
> upload. Even when buffering via HDD, you may need to limit that use.
> # lots of instrumentation to see what's being written.
> # good defaults out the box (e.g buffer to HDD, partition size to strike a
> good balance of early upload and scaleability)
> # robust against transient failures. The AWS SDK retries a PUT on failure;
> the entire block may need to be replayed, so HDD input cannot be buffered via
> {{java.io.BufferedInputStream}}. It has also surfaced in testing that if the
> final commit of a multipart option fails, it isn't retried —at least in the
> current SDK in use. Do that ourselves.
> # use roundrobin directory allocation, for most effective disk use
> # take an AWS SDK {{com.amazonaws.event.ProgressListener}} for progress
> callbacks, giving more detail on the operation. (It actually takes a
> {{org.apache.hadoop.util.Progressable}}, but if that also implements the AWS
> interface, that is used instead.
> All of this to come with scale tests
> * generate large files using all buffer mechanisms
> * Do a large copy/rname and verify that the copy really works, including
> metadata
> * be configurable with sizes up to muti-GB, which also means that the test
> timeouts need to be configurable to match the time it can take.
> * As they are slow, make them optional, using the {{-Dscale}} switch to
> enable.
> Verifying large file rename is important on its own, as it is needed for very
> large commit operations for committers using rename
> The goal here is to implement a single, object stream which can be used for
> all outputs, tuneable as to whether to use disk or memory, and on queue
> sizes, but otherwise be all that's needed. We can do future development on
> this, remove its predecessor {{S3AFastOutputStream}}, so keeping docs and
> testing down, and leave the original {{S3AOutputStream}} alone for regression
> testing/fallback.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]