[ https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248362#comment-16248362 ]
wujinhu edited comment on HADOOP-15027 at 11/11/17 6:40 AM:
------------------------------------------------------------

Hi Steve Loughran,

Thanks for the comments; your suggestions are very helpful. I will follow your suggestions about the thread pool and retry logic.

For random IO, it is true that my implementation will not work well. It seems HADOOP-14535 is similar to what the OS does. The operating system starts sequential read-ahead when either of the following conditions is satisfied:
* it is the first read from a file and the seek position is 0
* the current read and the previous read are contiguous in the file

Otherwise, the access is treated as random IO. I will take a look at these two issues and continue to improve this.

> Improvements for Hadoop read from AliyunOSS
> -------------------------------------------
>
>                 Key: HADOOP-15027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15027
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/oss
>    Affects Versions: 3.0.0
>            Reporter: wujinhu
>            Assignee: wujinhu
>         Attachments: HADOOP-15027.001.patch
>
>
> Currently, read performance is poor when Hadoop reads from AliyunOSS. It
> needs about 1 min to read 1 GB from OSS.
> Class AliyunOSSInputStream uses a single thread to read data from AliyunOSS,
> so we can refactor this by using multi-threaded pre-read to improve this.
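
The two read-ahead conditions listed in the comment above can be sketched roughly as follows. This is only an illustration of the rule as stated there, not code from HADOOP-15027.001.patch or from HADOOP-14535; the class name ReadPatternDetector and the method onRead are hypothetical, and a real input stream would also have to handle seeks, short reads, and concurrent access.

```java
/**
 * Hypothetical helper (not part of Hadoop or the attached patch): classifies
 * each read as sequential or random using the two conditions from the comment.
 */
final class ReadPatternDetector {
    // Offset just past the previous read, or -1 if nothing has been read yet.
    private long nextExpectedPos = -1;

    /**
     * Records a read of len bytes starting at pos and reports whether it
     * looks sequential, i.e. whether read-ahead would be worthwhile.
     */
    boolean onRead(long pos, int len) {
        if (pos < 0 || len < 0) {
            throw new IllegalArgumentException("pos and len must be non-negative");
        }
        // Condition 1: first read from the file, starting at offset 0.
        boolean firstReadAtStart = nextExpectedPos < 0 && pos == 0;
        // Condition 2: this read starts exactly where the previous one ended.
        boolean contiguous = nextExpectedPos >= 0 && pos == nextExpectedPos;
        nextExpectedPos = pos + len;
        return firstReadAtStart || contiguous;
    }

    public static void main(String[] args) {
        ReadPatternDetector detector = new ReadPatternDetector();
        System.out.println(detector.onRead(0, 4096));                // first read at offset 0
        System.out.println(detector.onRead(4096, 4096));             // continues the previous read
        System.out.println(detector.onRead(1 << 20, 4096));          // jumps elsewhere in the file
        System.out.println(detector.onRead((1 << 20) + 4096, 4096)); // contiguous again after the jump
    }
}
```

Under this sketch, a stream could start or continue its pre-read tasks when onRead returns true and fall back to fetching only the requested range when it returns false, so that random IO does not pay for data it never uses.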