[
https://issues.apache.org/jira/browse/HDDS-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
runzhiwang updated HDDS-3223:
-----------------------------
Summary: Improve s3g read 1GB object efficiency by 100 times (was:
Improve s3g read 1GB object efficiency by 10 times )
> Improve s3g read 1GB object efficiency by 100 times
> ----------------------------------------------------
>
> Key: HDDS-3223
> URL: https://issues.apache.org/jira/browse/HDDS-3223
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: runzhiwang
> Assignee: runzhiwang
> Priority: Critical
> Labels: pull-request-available
> Fix For: 0.6.0
>
> Attachments: screenshot-1.png
>
>
> *What's the problem?*
> Reading a 1000M object takes about 470 seconds, i.e. 2.2M/s, which is too slow.
> *What's the reason?*
> When a 1000M file is read, there are 50 GET requests, and each GET request reads 20M.
> For each GET, the call stack is:
> [IOUtils::copyLarge|https://github.com/apache/hadoop-ozone/blob/master/hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java#L262]
> ->
> [IOUtils::skipFully|https://github.com/apache/commons-io/blob/master/src/main/java/org/apache/commons/io/IOUtils.java#L1190]
> ->
> [IOUtils::skip|https://github.com/apache/commons-io/blob/master/src/main/java/org/apache/commons/io/IOUtils.java#L2064]
> ->
> [InputStream::read|https://github.com/apache/commons-io/blob/master/src/main/java/org/apache/commons/io/IOUtils.java#L1957].
> This means that the 50th GET request, which should read only 980M-1000M, first
> has to skip 0-980M, and to do that it also calls
> [InputStream::read|https://github.com/apache/commons-io/blob/master/src/main/java/org/apache/commons/io/IOUtils.java#L1957]
> on 0-980M. So the 1st GET request reads 0-20M, the 2nd GET request reads 0-40M,
> the 3rd GET request reads 0-60M, ..., and the 50th GET request reads 0-1000M. As a
> result, the GET requests become slower and slower from the 1st to the 50th.
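
To put a number on the pattern described above, here is a small back-of-the-envelope sketch. It is not code from the issue or from s3g: the class and method names are made up for illustration, and it assumes exactly 50 requests of 20M each, where request i reads everything from offset 0 up to the end of its own range.

```java
public class ReadAmplification {
    /**
     * Total megabytes pulled from the underlying stream when request i
     * (1-based) skips its first (i - 1) ranges by reading them and then
     * reads its own range.
     */
    static long totalMegabytesRead(int requests, long rangeMb) {
        long total = 0;
        for (int i = 1; i <= requests; i++) {
            total += i * rangeMb; // (i - 1) ranges skipped by reading + 1 range of payload
        }
        return total;
    }

    public static void main(String[] args) {
        long total = totalMegabytesRead(50, 20);
        long useful = 50L * 20L;
        // prints "25500 MB read for 1000 MB of payload"
        System.out.println(total + " MB read for " + useful + " MB of payload");
    }
}
```

Under these assumptions the stream hands out 25500M in order to deliver 1000M, roughly 25 times more data than the client asked for.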
> See [here|https://issues.apache.org/jira/browse/IO-203] for why IOUtils
> implements skip by reading rather than by a real skip, e.g. seek.
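
The difference between skipping by reading and a real seek can be illustrated with a self-contained sketch. Everything here is hypothetical: `CountingStream` is an in-memory stand-in for the object stream, `skipByReading` only mimics the idea of a read-based skip (it is not the commons-io source), and `seek` is an assumed method on the stand-in, not a claim about which API the fix uses.

```java
import java.io.IOException;
import java.io.InputStream;

public class SkipVersusSeek {
    /** Hypothetical in-memory stream that counts the bytes handed out by read(). */
    static class CountingStream extends InputStream {
        private final long length;
        private long pos = 0;
        long bytesRead = 0;

        CountingStream(long length) {
            this.length = length;
        }

        @Override
        public int read() {
            if (pos >= length) {
                return -1;
            }
            pos++;
            bytesRead++;
            return 0;
        }

        @Override
        public int read(byte[] b, int off, int len) {
            if (len == 0) {
                return 0;
            }
            if (pos >= length) {
                return -1;
            }
            int n = (int) Math.min(len, length - pos);
            pos += n;
            bytesRead += n;
            return n;
        }

        /** Hypothetical seek: moves the position without reading any data. */
        void seek(long newPos) {
            pos = newPos;
        }
    }

    /** Skips by reading into a scratch buffer; returns the number of bytes skipped. */
    static long skipByReading(InputStream in, long toSkip) throws IOException {
        byte[] scratch = new byte[8192];
        long remaining = toSkip;
        while (remaining > 0) {
            int n = in.read(scratch, 0, (int) Math.min(scratch.length, remaining));
            if (n < 0) {
                break; // end of stream before the requested offset
            }
            remaining -= n;
        }
        return toSkip - remaining;
    }

    public static void main(String[] args) throws IOException {
        long mb = 1024L * 1024L;

        CountingStream viaRead = new CountingStream(1000 * mb);
        skipByReading(viaRead, 980 * mb);
        System.out.println("read-based skip read " + viaRead.bytesRead / mb + " MB");

        CountingStream viaSeek = new CountingStream(1000 * mb);
        viaSeek.seek(980 * mb);
        System.out.println("seek read " + viaSeek.bytesRead / mb + " MB");
    }
}
```

In this sketch the read-based skip touches all 980M before the requested range, while the seek touches none of it, which is the behaviour the 50th GET request in the description is paying for.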