[jira] [Commented] (HADOOP-15027) Improvements for Hadoop read from AliyunOSS

wujinhu (JIRA) Sun, 12 Nov 2017 22:50:36 -0800

    [ https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16249160#comment-16249160 ]


wujinhu commented on HADOOP-15027:
----------------------------------

[~uncleGen]
Thanks for your comments.
1. I tested with the ossutil tool and the read speed is about 10MB+/s
(I will continue to verify this).
2 & 3.
I think it's OK for the thread pool to live in FileSystem. The thread pool is
only used to pre-fetch data from OSS. As the following code shows,
      if (item.buffer.length == 0) {
        // EOF
        item.ready.set(true);
      } else {
        this.readAheadExecutorService.execute(
            new AliyunOSSFileReaderTask(key, store, item));
      }
      cachedStreams.add(item);

each item is enqueued in both the thread pool (owned by the FileSystem) and
cachedStreams (each stream has its own queue).
If one input stream is slow, it only affects its own cachedStreams and does not
affect other streams.
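
To illustrate the point above, here is a minimal, self-contained sketch of the
pattern (not the code from the patch). The names ReadAheadSketch, Item, Stream,
prefetch and readAll are hypothetical, a byte array stands in for the OSS
object, and a CountDownLatch stands in for the ready flag. It only shows that
every item is handed to a shared pool and also appended to a queue owned by a
single stream, which the stream then drains in order.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ReadAheadSketch {

  /** One pre-fetched range (hypothetical stand-in for the patch's buffer item). */
  static final class Item {
    final int offset;
    final byte[] buffer;
    final CountDownLatch ready = new CountDownLatch(1);

    Item(int offset, int length) {
      this.offset = offset;
      this.buffer = new byte[length];
    }
  }

  /** One input stream: it shares the pool but owns its queue. */
  static final class Stream {
    private final ExecutorService sharedPool;   // shared, e.g. owned by the FileSystem
    private final byte[] source;                // assumption: stands in for the remote object
    private final Queue<Item> cachedItems = new ArrayDeque<>(); // per-stream queue

    Stream(ExecutorService sharedPool, byte[] source) {
      this.sharedPool = sharedPool;
      this.source = source;
    }

    /** Submit the fetch to the shared pool and record the item in this stream's queue. */
    void prefetch(int offset, int length) {
      Item item = new Item(offset, length);
      if (item.buffer.length == 0) {
        item.ready.countDown(); // EOF: nothing to fetch
      } else {
        sharedPool.execute(() -> {
          System.arraycopy(source, item.offset, item.buffer, 0, item.buffer.length);
          item.ready.countDown();
        });
      }
      cachedItems.add(item);
    }

    /** Schedule all chunks, then consume this stream's queue in order. */
    byte[] readAll(int chunkSize) {
      for (int off = 0; off < source.length; off += chunkSize) {
        prefetch(off, Math.min(chunkSize, source.length - off));
      }
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      Item item;
      while ((item = cachedItems.poll()) != null) {
        try {
          item.ready.await(); // blocks only on this stream's own items
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new IllegalStateException("interrupted while waiting for read-ahead", e);
        }
        out.write(item.buffer, 0, item.buffer.length);
      }
      return out.toByteArray();
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      byte[] a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
      byte[] b = {42, 43, 44};
      // Two streams share one pool; each drains only its own queue.
      System.out.println(Arrays.equals(a, new Stream(pool, a).readAll(4)));
      System.out.println(Arrays.equals(b, new Stream(pool, b).readAll(2)));
    } finally {
      pool.shutdown();
    }
  }
}
```

In this sketch a reader waits only on items in its own queue. The pool threads
themselves are still a shared resource, so the pool size remains a tuning
decision.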

4. I will fix the code style of these lines.
5. Yes, we can do a simple refactor if some modules have the same requirements.

I will add another patch to fix this.

> Improvements for Hadoop read from AliyunOSS
> -------------------------------------------
>
>                 Key: HADOOP-15027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15027
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/oss
>    Affects Versions: 3.0.0
>            Reporter: wujinhu
>            Assignee: wujinhu
>         Attachments: HADOOP-15027.001.patch, HADOOP-15027.002.patch
>
>
> Currently, read performance is poor when Hadoop reads from AliyunOSS. It 
> takes about 1 min to read 1GB from OSS.
> The AliyunOSSInputStream class uses a single thread to read data from 
> AliyunOSS, so we can refactor it to use multi-threaded pre-read to improve 
> performance.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

