[jira] [Comment Edited] (HADOOP-15027) AliyunOSS: Support multi-thread pre-read to improve read from Hadoop to Aliyun OSS performance
[ https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16321869#comment-16321869 ]

wujinhu edited comment on HADOOP-15027 at 1/11/18 3:17 PM:
---

Hi [~Sammi], here are some performance data. I use this tool (https://github.com/hortonworks/hive-testbench) to compare run time between this patch and current version (text file).
{code:java}
query         patch      current
query13.sql   241.591    440.524
query28.sql   1259.307   1943.949
query51.sql   469.618    722.904
query73.sql   216.596    414.75
query96.sql   268.869    476.473
{code}

was (Author: wujinhu):
Hi [~Sammi], here are some performance data. I use this tool (https://github.com/hortonworks/hive-testbench) to compare run time between this patch and current version.
{code:java}
query         patch      current
query13.sql   241.591    440.524
query28.sql   1259.307   1943.949
query51.sql   469.618    722.904
query73.sql   216.596    414.75
query96.sql   268.869    476.473
{code}

> AliyunOSS: Support multi-thread pre-read to improve read from Hadoop to Aliyun OSS performance
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-15027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15027
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/oss
>    Affects Versions: 3.0.0
>            Reporter: wujinhu
>            Assignee: wujinhu
>         Attachments: HADOOP-15027.001.patch, HADOOP-15027.002.patch,
>                      HADOOP-15027.003.patch, HADOOP-15027.004.patch, HADOOP-15027.005.patch,
>                      HADOOP-15027.006.patch, HADOOP-15027.007.patch, HADOOP-15027.008.patch,
>                      HADOOP-15027.009.patch, HADOOP-15027.010.patch, HADOOP-15027.011.patch,
>                      HADOOP-15027.012.patch
>
> Currently, AliyunOSSInputStream uses a single thread to read data from
> AliyunOSS, so we can do some refactoring by using multi-thread pre-read to
> improve read performance.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
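To make the run-time table in the comment above easier to read, the figures can be turned into a speedup ratio and a percentage saved per query. The following is only an illustrative sketch: the class and method names (`RunTimeComparison`, `speedup`, `percentSaved`) are made up for this example and are not part of the patch; the numbers are copied from the table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RunTimeComparison {
    /** Ratio of current run time to patched run time; values above 1 mean the patch is faster. */
    static double speedup(double patch, double current) {
        return current / patch;
    }

    /** Percentage of the current run time that the patch saves. */
    static double percentSaved(double patch, double current) {
        return (current - patch) / current * 100.0;
    }

    public static void main(String[] args) {
        // {patch, current} run times copied from the table in the comment.
        Map<String, double[]> runs = new LinkedHashMap<>();
        runs.put("query13.sql", new double[] {241.591, 440.524});
        runs.put("query28.sql", new double[] {1259.307, 1943.949});
        runs.put("query51.sql", new double[] {469.618, 722.904});
        runs.put("query73.sql", new double[] {216.596, 414.75});
        runs.put("query96.sql", new double[] {268.869, 476.473});

        for (Map.Entry<String, double[]> e : runs.entrySet()) {
            double patch = e.getValue()[0];
            double current = e.getValue()[1];
            System.out.printf("%-12s speedup=%.2fx saved=%.1f%%%n",
                    e.getKey(), speedup(patch, current), percentSaved(patch, current));
        }
    }
}
```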
[jira] [Comment Edited] (HADOOP-15027) AliyunOSS: Support multi-thread pre-read to improve read from Hadoop to Aliyun OSS performance
[ https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16321869#comment-16321869 ]

wujinhu edited comment on HADOOP-15027 at 1/11/18 8:44 AM:
---

Hi [~Sammi], here are some performance data. I use this tool (https://github.com/hortonworks/hive-testbench) to compare run time between this patch and current version.
{code:java}
query         patch      current
query13.sql   241.591    440.524
query28.sql   1259.307   1943.949
query51.sql   469.618    722.904
query73.sql   216.596    414.75
query96.sql   268.869    476.473
{code}

was (Author: wujinhu):
Hi [~Sammi], here are some performance data. I use this tool (https://github.com/hortonworks/hive-testbench) to compare run time between this patch and current version.
{code:java}
query         after      before
query13.sql   241.591    440.524
query28.sql   1259.307   1943.949
query51.sql   469.618    722.904
query73.sql   216.596    414.75
query96.sql   268.869    476.473
{code}
[jira] [Comment Edited] (HADOOP-15027) AliyunOSS: Support multi-thread pre-read to improve read from Hadoop to Aliyun OSS performance
[ https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16321869#comment-16321869 ]

wujinhu edited comment on HADOOP-15027 at 1/11/18 8:43 AM:
---

Hi [~Sammi], here are some performance data. I use this tool (https://github.com/hortonworks/hive-testbench) to compare run time between this patch and current version.
{code:java}
query         after      before
query13.sql   241.591    440.524
query28.sql   1259.307   1943.949
query51.sql   469.618    722.904
query73.sql   216.596    414.75
query96.sql   268.869    476.473
{code}

was (Author: wujinhu):
Hi [~Sammi], here are some performance data. I use this tool (https://github.com/hortonworks/hive-testbench) to compare run time between this patch and current version.
{code:java}
query         after      before
query13.sql   241.591    440.524
query28.sql   1259.307   1943.949
query51.sql   469.618    722.904
query73.sql   216.596    414.75
query96.sql   268.869    476.473
{code}
[jira] [Comment Edited] (HADOOP-15027) AliyunOSS: Support multi-thread pre-read to improve read from Hadoop to Aliyun OSS performance
[ https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16321869#comment-16321869 ]

wujinhu edited comment on HADOOP-15027 at 1/11/18 8:43 AM:
---

Hi [~Sammi], here are some performance data. I use this tool (https://github.com/hortonworks/hive-testbench) to compare run time between this patch and current version.
{code:java}
query         after      before
query13.sql   241.591    440.524
query28.sql   1259.307   1943.949
query51.sql   469.618    722.904
query73.sql   216.596    414.75
query96.sql   268.869    476.473
{code}

was (Author: wujinhu):
Hi [~Sammi], here are some performance data. I use this tool (https://github.com/hortonworks/hive-testbench) to compare run time between this patch and current version.
{code:java}
query         after      before
query13.sql   241.591    440.524
query28.sql   1259.307   1943.949
query51.sql   469.618    722.904
query73.sql   216.596    414.75
query96.sql   268.869    476.473
{code}
[jira] [Comment Edited] (HADOOP-15027) AliyunOSS: Support multi-thread pre-read to improve read from Hadoop to Aliyun OSS performance
[ https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16321869#comment-16321869 ]

wujinhu edited comment on HADOOP-15027 at 1/11/18 8:43 AM:
---

Hi [~Sammi], here are some performance data. I use this tool (https://github.com/hortonworks/hive-testbench) to compare run time between this patch and current version.
{code:java}
query         after      before
query13.sql   241.591    440.524
query28.sql   1259.307   1943.949
query51.sql   469.618    722.904
query73.sql   216.596    414.75
query96.sql   268.869    476.473
{code}

was (Author: wujinhu):
Hi [~Sammi], here are some performance data. I use this tool (https://github.com/hortonworks/hive-testbench) to compare run time between this patch and current version.
{code:java}
query         after      before
query13.sql   241.591    440.524
query28.sql   1259.307   1943.949
query51.sql   469.618    722.904
query73.sql   216.596    414.75
query96.sql   268.869    476.473
{code}
[jira] [Comment Edited] (HADOOP-15027) AliyunOSS: Support multi-thread pre-read to improve read from Hadoop to Aliyun OSS performance
[ https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16320103#comment-16320103 ]

wujinhu edited comment on HADOOP-15027 at 1/10/18 11:32 AM:

Change some default configurations.
{code:java}
-  public static final long MULTIPART_DOWNLOAD_SIZE_DEFAULT = 1024 * 1024;
+  public static final long MULTIPART_DOWNLOAD_SIZE_DEFAULT = 512 * 1024;

   public static final String MULTIPART_DOWNLOAD_THREAD_NUMBER_KEY =
       "fs.oss.multipart.download.threads";
-  public static final int MULTIPART_DOWNLOAD_THREAD_NUMBER_DEFAULT = 16;
+  public static final int MULTIPART_DOWNLOAD_THREAD_NUMBER_DEFAULT = 10;

   public static final String MAX_TOTAL_TASKS_KEY = "fs.oss.max.total.tasks";
   public static final int MAX_TOTAL_TASKS_DEFAULT = 128;

   public static final String MULTIPART_DOWNLOAD_AHEAD_PART_MAX_NUM_KEY =
       "fs.oss.multipart.download.ahead.part.max.number";
-  public static final int MULTIPART_DOWNLOAD_AHEAD_PART_MAX_NUM_DEFAULT = 8;
+  public static final int MULTIPART_DOWNLOAD_AHEAD_PART_MAX_NUM_DEFAULT = 4;
{code}

was (Author: wujinhu):
Change some default configuration.
{code:java}
-  public static final long MULTIPART_DOWNLOAD_SIZE_DEFAULT = 1024 * 1024;
+  public static final long MULTIPART_DOWNLOAD_SIZE_DEFAULT = 512 * 1024;

   public static final String MULTIPART_DOWNLOAD_THREAD_NUMBER_KEY =
       "fs.oss.multipart.download.threads";
-  public static final int MULTIPART_DOWNLOAD_THREAD_NUMBER_DEFAULT = 16;
+  public static final int MULTIPART_DOWNLOAD_THREAD_NUMBER_DEFAULT = 10;

   public static final String MAX_TOTAL_TASKS_KEY = "fs.oss.max.total.tasks";
   public static final int MAX_TOTAL_TASKS_DEFAULT = 128;

   public static final String MULTIPART_DOWNLOAD_AHEAD_PART_MAX_NUM_KEY =
       "fs.oss.multipart.download.ahead.part.max.number";
-  public static final int MULTIPART_DOWNLOAD_AHEAD_PART_MAX_NUM_DEFAULT = 8;
+  public static final int MULTIPART_DOWNLOAD_AHEAD_PART_MAX_NUM_DEFAULT = 4;
{code}
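For readers unfamiliar with the idea, the three defaults changed in the diff above (part size 512 KB, 10 download threads, at most 4 parts read ahead) can be pictured with a small stand-alone sketch of multi-thread pre-read. This is not the code from the HADOOP-15027 patches: `PreReadSketch`, `readAll`, and `fetchRange` are hypothetical names, the "remote object" is just an in-memory byte array standing in for a ranged read against OSS, and `fs.oss.max.total.tasks` is not modeled.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PreReadSketch {
    // Values taken from the new defaults in the diff; the constant names here are illustrative.
    static final int PART_SIZE = 512 * 1024;
    static final int THREADS = 10;
    static final int AHEAD_PARTS = 4;

    /** Hypothetical stand-in for a ranged read of [start, start + len) from the object store. */
    static byte[] fetchRange(byte[] remote, int start, int len) {
        return Arrays.copyOfRange(remote, start, start + len);
    }

    /** Reads the whole object sequentially while keeping up to aheadParts ranged reads in flight. */
    static byte[] readAll(byte[] remote, int partSize, int threads, int aheadParts) {
        if (partSize < 1 || threads < 1 || aheadParts < 1) {
            throw new IllegalArgumentException("partSize, threads and aheadParts must be >= 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Deque<Future<byte[]>> inFlight = new ArrayDeque<>();
        ByteArrayOutputStream out = new ByteArrayOutputStream(remote.length);
        int next = 0; // offset of the next part that has not been requested yet
        try {
            while (next < remote.length || !inFlight.isEmpty()) {
                // Top up the read-ahead window before blocking on the oldest part.
                while (inFlight.size() < aheadParts && next < remote.length) {
                    final int start = next;
                    final int len = Math.min(partSize, remote.length - start);
                    Callable<byte[]> task = () -> fetchRange(remote, start, len);
                    inFlight.add(pool.submit(task));
                    next += len;
                }
                // Consume parts strictly in request order so the caller sees a sequential stream.
                byte[] part = inFlight.remove().get();
                out.write(part, 0, part.length);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a part", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a ranged read failed", e);
        } finally {
            pool.shutdownNow();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] remote = new byte[3 * 1024 * 1024 + 123];
        for (int i = 0; i < remote.length; i++) {
            remote[i] = (byte) (i * 31);
        }
        byte[] copy = readAll(remote, PART_SIZE, THREADS, AHEAD_PARTS);
        System.out.println("bytes read: " + copy.length + ", identical: " + Arrays.equals(remote, copy));
    }
}
```

In this sketch a smaller part size and a smaller read-ahead window mean less data is fetched speculatively per stream, which is one plausible reason for lowering the defaults; the comment itself does not state the motivation.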
[jira] [Comment Edited] (HADOOP-15027) AliyunOSS: Support multi-thread pre-read to improve read from Hadoop to Aliyun OSS performance
[ https://issues.apache.org/jira/browse/HADOOP-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16320103#comment-16320103 ]

wujinhu edited comment on HADOOP-15027 at 1/10/18 11:28 AM:

Change some default configuration.
{code:java}
-  public static final long MULTIPART_DOWNLOAD_SIZE_DEFAULT = 1024 * 1024;
+  public static final long MULTIPART_DOWNLOAD_SIZE_DEFAULT = 512 * 1024;

   public static final String MULTIPART_DOWNLOAD_THREAD_NUMBER_KEY =
       "fs.oss.multipart.download.threads";
-  public static final int MULTIPART_DOWNLOAD_THREAD_NUMBER_DEFAULT = 16;
+  public static final int MULTIPART_DOWNLOAD_THREAD_NUMBER_DEFAULT = 10;

   public static final String MAX_TOTAL_TASKS_KEY = "fs.oss.max.total.tasks";
   public static final int MAX_TOTAL_TASKS_DEFAULT = 128;

   public static final String MULTIPART_DOWNLOAD_AHEAD_PART_MAX_NUM_KEY =
       "fs.oss.multipart.download.ahead.part.max.number";
-  public static final int MULTIPART_DOWNLOAD_AHEAD_PART_MAX_NUM_DEFAULT = 8;
+  public static final int MULTIPART_DOWNLOAD_AHEAD_PART_MAX_NUM_DEFAULT = 4;
{code}

was (Author: wujinhu):
Change some default configuration.
{code:java}
474c474
< index dd71842fb87..dedc038f3f7 100644
---
> index dd71842fb87..a1070277d33 100644
482c482
< +  public static final long MULTIPART_DOWNLOAD_SIZE_DEFAULT = 512 * 1024;
---
> +  public static final long MULTIPART_DOWNLOAD_SIZE_DEFAULT = 1024 * 1024;
486c486
< +  public static final int MULTIPART_DOWNLOAD_THREAD_NUMBER_DEFAULT = 10;
---
> +  public static final int MULTIPART_DOWNLOAD_THREAD_NUMBER_DEFAULT = 16;
493c493
< +  public static final int MULTIPART_DOWNLOAD_AHEAD_PART_MAX_NUM_DEFAULT = 4;
---
> +  public static final int MULTIPART_DOWNLOAD_AHEAD_PART_MAX_NUM_DEFAULT = 8;
{code}