
[jira] [Commented] (MAPREDUCE-6705) Task failing continuously on trunk

2016-05-30 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306488#comment-15306488
 ] 

Kai Zheng commented on MAPREDUCE-6705:
--

YarnChild uses TaskUmbilicalProtocol which still relies on WritableRpcEngine. 
This must be handled before removing the engine. It involves major development 
work rather than just a bug fix, so I guess we have to revert HADOOP-12579. Any 
comment?

> Task failing continuously on trunk
> --
>
> Key: MAPREDUCE-6705
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6705
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Kai Zheng
>Priority: Blocker
>
> Task attempts fail continuously when any MapReduce application is submitted.
> Run the job as below:
> {code}
> ./yarn jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples*.jar pi 
> -Dyarn.app.mapreduce.am.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" 
> -Dmapreduce.admin.user.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}"  1 1
> {code}
> {noformat}
> 2016-05-27 11:28:27,148 DEBUG [main] org.apache.hadoop.ipc.Client: getting 
> client out of cache: org.apache.hadoop.ipc.Client@291ae
> 2016-05-27 11:28:27,160 DEBUG [main] org.apache.hadoop.mapred.YarnChild: PID: 
> 22305
> 2016-05-27 11:28:27,160 INFO [main] org.apache.hadoop.mapred.YarnChild: 
> Sleeping for 0ms before retrying again. Got null now.
> 2016-05-27 11:28:27,161 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.reflect.UndeclaredThrowableException
>   at com.sun.proxy.$Proxy10.getTask(Unknown Source)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:136)
> Caused by: com.google.protobuf.ServiceException: Too many or few parameters 
> for request. Method: [getTask], Expected: 2, Actual: 1
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   ... 2 more
> 2016-05-27 11:28:27,161 DEBUG [main] org.apache.hadoop.ipc.Client: stopping 
> client from cache: org.apache.hadoop.ipc.Client@291ae
> 2016-05-27 11:28:27,161 DEBUG [main] org.apache.hadoop.ipc.Client: removing 
> client from cache: org.apache.hadoop.ipc.Client@291ae
> {noformat}
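The trace above has two layers: a checked ServiceException complaining about the argument count, wrapped in an UndeclaredThrowableException thrown by the JDK proxy. The following is a minimal sketch of that mechanism using only java.lang.reflect.Proxy. The names Umbilical, ServiceError and ArityCheckingHandler are hypothetical stand-ins for illustration and are not Hadoop classes; the real check sits in ProtobufRpcEngine$Invoker.invoke according to the trace, and the sketch only assumes that it requires exactly two arguments, as the message says.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.lang.reflect.UndeclaredThrowableException;

final class RpcArityDemo {
  /** Hypothetical stand-in for a Writable-style protocol: one plain argument. */
  interface Umbilical {
    String getTask(String context);
  }

  /** Hypothetical checked exception, playing the role of ServiceException. */
  static final class ServiceError extends Exception {
    ServiceError(String message) {
      super(message);
    }
  }

  /** Handler that insists on exactly two arguments, as the error message implies. */
  static final class ArityCheckingHandler implements InvocationHandler {
    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
      int actual = (args == null) ? 0 : args.length;
      if (actual != 2) {
        throw new ServiceError("Too many or few parameters for request. Method: ["
            + method.getName() + "], Expected: 2, Actual: " + actual);
      }
      return null;
    }
  }

  /** Calls getTask through the proxy and describes what was thrown. */
  static String callThroughProxy() {
    Umbilical umbilical = (Umbilical) Proxy.newProxyInstance(
        Umbilical.class.getClassLoader(),
        new Class<?>[] {Umbilical.class},
        new ArityCheckingHandler());
    try {
      umbilical.getTask("attempt");
      return "no exception";
    } catch (UndeclaredThrowableException e) {
      // getTask declares no checked exceptions, so the proxy wraps ServiceError.
      return e.getClass().getSimpleName() + " caused by: " + e.getCause().getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(callThroughProxy());
  }
}
```

The one-argument interface method never reaches the "two arguments" shape the handler expects, so every call fails the same way, which matches the continuous task failures reported here.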



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org

[jira] [Commented] (MAPREDUCE-6705) Task failing continuously on trunk

2016-05-30 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307061#comment-15307061
 ] 

Kai Zheng commented on MAPREDUCE-6705:
--

HADOOP-12579 was reverted to avoid this and MAPREDUCE-6706 was opened to 
migrate TaskUmbilicalProtocol. I guess this will be resolved as a duplicate? 
[~asuresh], any comment? Thanks!


[jira] [Assigned] (MAPREDUCE-6705) Task failing continuously on trunk

2016-05-29 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng reassigned MAPREDUCE-6705:


Assignee: Kai Zheng


[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO

2016-07-12 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6729:
-
Attachment: MAPREDUCE-6729.002.patch

Re-uploaded the same patch to trigger the build.

> Accurately compute the test execute time in DFSIO
> -
>
> Key: MAPREDUCE-6729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: benchmarks, performance, test
>Affects Versions: 2.9.0
>Reporter: mingleizhang
>Assignee: mingleizhang
>Priority: Minor
>  Labels: performance, test
> Attachments: MAPREDUCE-6729.001.patch, MAPREDUCE-6729.002.patch
>
>
> When DFSIO is used as a distributed I/O benchmark and a test writes or reads 
> a large number of files, the reported results can be imprecise. The existing 
> code deletes files before running a job, and that deletion is counted in the 
> measured execution time. With many files, the extra time skews the reported 
> execution time and throughput. We need to replace or improve this approach 
> so that the cleanup is not included in the measurement.
> {code}
> public static void testWrite() throws Exception {
>   FileSystem fs = cluster.getFileSystem();
>   long tStart = System.currentTimeMillis();
>   // writeTest() calls fs.delete(*,*), so the deletion time is
>   // included in execTime below.
>   bench.writeTest(fs);
>   long execTime = System.currentTimeMillis() - tStart;
>   bench.analyzeResult(fs, TestType.TEST_TYPE_WRITE, execTime);
> }
> private void writeTest(FileSystem fs) throws IOException {
>   Path writeDir = getWriteDir(config);
>   fs.delete(getDataDir(config), true);
>   fs.delete(writeDir, true);
>   runIOTest(WriteMapper.class, writeDir);
> }
> {code}
> [https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java]
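One way to keep cleanup out of the measured interval is to finish the deletion before the clock starts. The sketch below illustrates that ordering on the local filesystem with java.nio.file only. It is not the attached patch and does not use Hadoop's FileSystem API; TimedWriteSketch, timedWrite, writeFiles and deleteRecursively are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class TimedWriteSketch {
  /** Removes dir and everything under it; a no-op if dir does not exist. */
  static void deleteRecursively(Path dir) {
    if (!Files.exists(dir)) {
      return;
    }
    try (Stream<Path> walk = Files.walk(dir)) {
      // Reverse order so children are deleted before their parents.
      List<Path> paths =
          walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
      for (Path p : paths) {
        Files.delete(p);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** The workload being measured: writes fileCount files of fileSize bytes. */
  static void writeFiles(Path dir, int fileCount, int fileSize) {
    try {
      Files.createDirectories(dir);
      byte[] data = new byte[fileSize];
      for (int i = 0; i < fileCount; i++) {
        Files.write(dir.resolve("part-" + i), data);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Cleans up first, then times only the write phase. Returns nanoseconds. */
  static long timedWrite(Path dir, int fileCount, int fileSize) {
    deleteRecursively(dir);          // untimed: leftovers from earlier runs
    long start = System.nanoTime();  // the clock starts after the cleanup
    writeFiles(dir, fileCount, fileSize);
    return System.nanoTime() - start;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("dfsio-sketch").resolve("io_write");
    long nanos = timedWrite(dir, 10, 1024);
    System.out.println("write phase took " + nanos / 1_000_000.0 + " ms");
    deleteRecursively(dir.getParent());
  }
}
```

With this ordering, a directory holding many stale files slows down the run as a whole but no longer inflates the reported write time.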




[jira] [Updated] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO

2016-07-07 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6729:
-
Summary: Accurately compute the test execute time in DFSIO  (was: Hitting 
performance and error when lots of files to write or read)

> Accurately compute the test execute time in DFSIO
> -
>
> Key: MAPREDUCE-6729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: benchmarks, performance, test
>Reporter: mingleizhang
>Assignee: mingleizhang
>Priority: Minor
>  Labels: performance, test
> Attachments: MR-6729.txt
>
>

[jira] [Updated] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read

2016-07-07 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6729:
-
Assignee: mingleizhang  (was: mingleizhang)

> Hitting performance and error when lots of files to write or read
> -
>
> Key: MAPREDUCE-6729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: benchmarks, performance, test
>Reporter: mingleizhang
>Assignee: mingleizhang
>Priority: Minor
>  Labels: performance, test
> Attachments: MR-6729.txt
>
>

[jira] [Updated] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read

2016-07-07 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6729:
-
Assignee: mingleizhang

> Hitting performance and error when lots of files to write or read
> -
>
> Key: MAPREDUCE-6729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: benchmarks, performance, test
>Reporter: mingleizhang
>Assignee: mingleizhang
>Priority: Minor
>  Labels: performance, test
> Attachments: MR-6729.txt
>
>

[jira] [Commented] (MAPREDUCE-6729) Hitting performance and error when lots of files to write or read

2016-07-07 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366074#comment-15366074
 ] 

Kai Zheng commented on MAPREDUCE-6729:
--

Thanks [~mingleizhang] for reporting this and for the contribution! It sounds good.

[~ozawa], could you help take a look? This was found in a benchmark test we 
performed some time ago. Thanks!

> Hitting performance and error when lots of files to write or read
> -
>
> Key: MAPREDUCE-6729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: benchmarks, performance, test
>Reporter: mingleizhang
>Assignee: mingleizhang
>Priority: Minor
>  Labels: performance, test
> Attachments: MR-6729.txt
>
>

[jira] [Commented] (MAPREDUCE-6729) Accurately compute the test execute time in DFSIO

2016-07-07 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366096#comment-15366096
 ] 

Kai Zheng commented on MAPREDUCE-6729:
--

[~mingleizhang],

Would you please rename your patch like {{MAPREDUCE-6729-v1.patch}} and then 
submit it to trigger the Jenkins test?

> Accurately compute the test execute time in DFSIO
> -
>
> Key: MAPREDUCE-6729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6729
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: benchmarks, performance, test
>Reporter: mingleizhang
>Assignee: mingleizhang
>Priority: Minor
>  Labels: performance, test
> Attachments: MR-6729.txt
>
>

[jira] [Updated] (MAPREDUCE-6578) Add support for HDFS heterogeneous storage testing to TestDFSIO

2016-08-24 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6578:
-
Attachment: MAPREDUCE-6578.03.patch

Uploading the updated patch provided by [~Sammi]. Thanks!

> Add support for HDFS heterogeneous storage testing to TestDFSIO
> ---
>
> Key: MAPREDUCE-6578
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6578
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Attachments: MAPREDUCE-6578.00.patch, MAPREDUCE-6578.01.patch, 
> MAPREDUCE-6578.02.patch, MAPREDUCE-6578.03.patch
>
>
> HDFS heterogeneous storage allows users to store data blocks on different 
> storage media according to predefined storage policies. Only the 'Default' 
> policy is supported in the current TestDFSIO implementation. This issue adds 
> a new option to enable tests of other storage policies.




[jira] [Updated] (MAPREDUCE-6775) Fix MapReduce failures caused by default RPC engine changing

2016-09-07 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6775:
-
Attachment: MAPREDUCE-6775-v1.patch

Provided a fix to change back the default RPC engine.

> Fix MapReduce failures caused by default RPC engine changing
> 
>
> Key: MAPREDUCE-6775
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6775
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha2
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: MAPREDUCE-6775-v1.patch
>
>
> HADOOP-13218 changed the default RPC engine, which is inappropriate because 
> MAPREDUCE-6706, which adds support for TaskUmbilicalProtocol to use 
> ProtobufRpcEngine, isn't resolved yet.
> [~jlowe] reported the following errors:
> {noformat}
> 2016-09-07 17:51:56,296 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.reflect.UndeclaredThrowableException
>   at com.sun.proxy.$Proxy10.getTask(Unknown Source)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:137)
> Caused by: com.google.protobuf.ServiceException: Too many or few parameters 
> for request. Method: [getTask], Expected: 2, Actual: 1
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:199)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>   ... 2 more
> {noformat}




[jira] [Created] (MAPREDUCE-6775) Fix MapReduce failures caused by default RPC engine changing

2016-09-07 Thread Kai Zheng (JIRA)

Kai Zheng created MAPREDUCE-6775:


 Summary: Fix MapReduce failures caused by default RPC engine 
changing
 Key: MAPREDUCE-6775
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6775
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.0.0-alpha2
Reporter: Kai Zheng
Assignee: Kai Zheng


HADOOP-13218 changed the default RPC engine, which is inappropriate because 
MAPREDUCE-6706, which adds support for TaskUmbilicalProtocol to use 
ProtobufRpcEngine, isn't resolved yet.

[~jlowe] reported the following errors:
{noformat}
2016-09-07 17:51:56,296 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : java.lang.reflect.UndeclaredThrowableException
at com.sun.proxy.$Proxy10.getTask(Unknown Source)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:137)
Caused by: com.google.protobuf.ServiceException: Too many or few parameters for 
request. Method: [getTask], Expected: 2, Actual: 1
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:199)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
... 2 more
{noformat}
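To see why changing a default can break a protocol that nobody touched, here is a small sketch of a per-protocol engine lookup with a default fallback. It is an illustration only and does not call Hadoop's RPC class; EngineLookupSketch, Engine, setProtocolEngine and engineFor are hypothetical names, and the assumption is simply that a protocol with no explicit engine follows whatever the default is.

```java
import java.util.HashMap;
import java.util.Map;

final class EngineLookupSketch {
  /** Hypothetical labels for the two engines discussed in this issue. */
  enum Engine { WRITABLE, PROTOBUF }

  private final Map<String, Engine> explicit = new HashMap<>();
  private final Engine defaultEngine;

  EngineLookupSketch(Engine defaultEngine) {
    this.defaultEngine = defaultEngine;
  }

  /** Pins one protocol to an engine, regardless of the default. */
  void setProtocolEngine(String protocol, Engine engine) {
    explicit.put(protocol, engine);
  }

  /** Protocols without an explicit entry silently follow the default. */
  Engine engineFor(String protocol) {
    return explicit.getOrDefault(protocol, defaultEngine);
  }

  public static void main(String[] args) {
    EngineLookupSketch before = new EngineLookupSketch(Engine.WRITABLE);
    EngineLookupSketch after = new EngineLookupSketch(Engine.PROTOBUF);
    System.out.println(before.engineFor("TaskUmbilicalProtocol")); // WRITABLE
    System.out.println(after.engineFor("TaskUmbilicalProtocol"));  // PROTOBUF

    // Pinning the protocol makes it independent of the default.
    after.setProtocolEngine("TaskUmbilicalProtocol", Engine.WRITABLE);
    System.out.println(after.engineFor("TaskUmbilicalProtocol"));  // WRITABLE
  }
}
```

In this model there are two ways out: restore the old default, or pin the unmigrated protocol explicitly until it has been migrated.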




[jira] [Updated] (MAPREDUCE-6775) Fix MapReduce failures caused by default RPC engine changing

2016-09-07 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6775:
-
Status: Patch Available  (was: Open)


[jira] [Updated] (MAPREDUCE-6775) Fix MapReduce failures caused by default RPC engine changing

2016-09-07 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6775:
-
Resolution: Invalid
Status: Resolved  (was: Patch Available)

Resolved this as it will be handled in the original issue.


[jira] [Commented] (MAPREDUCE-6774) Add support for HDFS erasure code policy to TestDFSIO

2016-09-13 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15488709#comment-15488709
 ] 

Kai Zheng commented on MAPREDUCE-6774:
--

Thanks Sammi for the update! It looks good, with only some minor comments now.

1. Could you define constants for: {{test.io.erasure.code.policy}}, 
{{test.io.block.storage.policy}}?
2. I would like to see minor refinements to {{checkErasureCodePolicy}}, for 
example adding some line breaks and avoiding the {{else}}.

{code}
+  private boolean checkErasureCodePolicy(String erasureCodePolicyName,
+      FileSystem fs, TestType testType) throws IOException {
+    Collection<ErasureCodingPolicy> list =
+        ((DistributedFileSystem) fs).getAllErasureCodingPolicies();
+    boolean isValid = false;
+    int i = 0;
+    for (ErasureCodingPolicy ec : list) {
+      if (erasureCodePolicyName.equals(ec.getName())) {
+        isValid = true;
+        break;
+      }
+    }
+    if (!isValid) {
+      System.out.println("Invalid erasure code policy: " +
+          erasureCodePolicyName);
+      System.out.println("Current supported erasure code policy list: ");
+      for (ErasureCodingPolicy ec : list) {
+        System.out.println(ec.getName());
+      }
+      return false;
+    } else {
+      if (testType == TestType.TEST_TYPE_APPEND ||
+          testType == TestType.TEST_TYPE_TRUNCATE) {
+        System.out.println("So far append or truncate operation" +
+            " with erasureCodePolicy enabled is not supported");
+        return false;
+      }
+    }
+    config.set("test.io.erasure.code.policy", erasureCodePolicyName);
+    LOG.info("erasureCodePolicy = " + erasureCodePolicyName);
+    return true;
+  }
{code}
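The two suggestions could look roughly like the sketch below: a named constant for the configuration key, blank lines between the logical steps, and early returns in place of the {{else}}. This is a standalone illustration, not the patch. It takes the policy names as a plain Collection<String> and the configuration as a Map instead of using DistributedFileSystem and Configuration, and the constant name ERASURE_CODE_POLICY_NAME_KEY is an assumption.

```java
import java.util.Collection;
import java.util.Map;

final class ErasureCodePolicyCheckSketch {
  /** Named constant in place of the repeated string literal (name is assumed). */
  static final String ERASURE_CODE_POLICY_NAME_KEY = "test.io.erasure.code.policy";

  /** Stand-in for the TestType enum used by TestDFSIO. */
  enum TestType { TEST_TYPE_READ, TEST_TYPE_WRITE, TEST_TYPE_APPEND, TEST_TYPE_TRUNCATE }

  static boolean checkErasureCodePolicy(String erasureCodePolicyName,
      Collection<String> supportedPolicyNames, TestType testType,
      Map<String, String> config) {
    if (!supportedPolicyNames.contains(erasureCodePolicyName)) {
      System.out.println("Invalid erasure code policy: " + erasureCodePolicyName);
      System.out.println("Current supported erasure code policy list: ");
      for (String name : supportedPolicyNames) {
        System.out.println(name);
      }
      return false;
    }

    // Early return replaces the else branch of the original.
    if (testType == TestType.TEST_TYPE_APPEND
        || testType == TestType.TEST_TYPE_TRUNCATE) {
      System.out.println("So far append or truncate operation"
          + " with erasureCodePolicy enabled is not supported");
      return false;
    }

    config.put(ERASURE_CODE_POLICY_NAME_KEY, erasureCodePolicyName);
    return true;
  }
}
```

The unused counter {{i}} and the {{isValid}} flag disappear as a side effect, since the membership test and the early return make both unnecessary.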

> Add support for HDFS erasure code policy to TestDFSIO
> -
>
> Key: MAPREDUCE-6774
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6774
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: MAPREDUCE-6774-v1.patch, MAPREDUCE-6774-v2.patch, 
> MAPREDUCE-6774-v3.patch, MAPREDUCE-6774-v4.patch
>
>
> HDFS erasure code policy allows users to store directories and files under 
> predefined erasure code policies. Currently only 3x replication is supported 
> in the TestDFSIO implementation. This issue adds a new option to enable 
> tests of files with an erasure code policy enabled.




[jira] [Commented] (MAPREDUCE-6774) Add support for HDFS erasure code policy to TestDFSIO

2016-09-14 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491630#comment-15491630
 ] 

Kai Zheng commented on MAPREDUCE-6774:
--

Thanks Sammi for the update! The latest patch LGTM and +1.

> Add support for HDFS erasure code policy to TestDFSIO
> -
>
> Key: MAPREDUCE-6774
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6774
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: MAPREDUCE-6774-v1.patch, MAPREDUCE-6774-v2.patch, 
> MAPREDUCE-6774-v3.patch, MAPREDUCE-6774-v4.patch, MAPREDUCE-6774-v5.patch, 
> MAPREDUCE-6774-v6.patch
>
>
> HDFS erasure code policy allows users to store directories and files with 
> predefined erasure code policies. Currently only 3x replication is supported 
> in the TestDFSIO implementation. This is going to add a new option to enable 
> tests of files with an erasure code policy enabled. 




[jira] [Commented] (MAPREDUCE-6774) Add support for HDFS erasure code policy to TestDFSIO

2016-09-13 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486600#comment-15486600
 ] 

Kai Zheng commented on MAPREDUCE-6774:
--

Thanks [~Sammi] for the work!

It looks good overall. Suggestions:

{code}
boolean isStoragePolicyValid(String storagePolicy, FileSystem fs)
boolean isErasureCodePolicyValid(String erasureCodePolicyName, FileSystem fs, 
TestType testType) throws IOException
{code}
would be better as:
{code}
void checkStoragePolicy(String storagePolicy, FileSystem fs) throws Exception
void checkErasureCodePolicy(String erasureCodePolicyName, FileSystem fs, 
TestType testType) throws Exception
{code}
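
A minimal sketch of the shape suggested above, where the check throws instead of returning a boolean. It assumes a hypothetical {{checkPolicyName}} helper that receives the supported names directly; the real methods would query the {{FileSystem}}, and the policy names used here are only illustrative.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collection;

// Hypothetical stand-in for checkStoragePolicy / checkErasureCodePolicy:
// validation failures surface as exceptions rather than boolean results.
final class PolicyChecks {
  private PolicyChecks() {}

  static void checkPolicyName(String policyName, Collection<String> supported)
      throws IOException {
    if (!supported.contains(policyName)) {
      // The message carries what the boolean version printed to stdout.
      throw new IOException("Invalid policy: " + policyName
          + "; supported policies: " + supported);
    }
  }

  public static void main(String[] args) {
    Collection<String> supported = Arrays.asList("HOT", "COLD", "ONE_SSD");
    try {
      checkPolicyName("HOT", supported);    // passes silently
      checkPolicyName("BOGUS", supported);  // throws
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With this shape the caller no longer needs to translate a {{false}} result into an error code; it can let the exception propagate or handle it in one place.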


> Add support for HDFS erasure code policy to TestDFSIO
> -
>
> Key: MAPREDUCE-6774
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6774
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: MAPREDUCE-6774-v1.patch, MAPREDUCE-6774-v2.patch, 
> MAPREDUCE-6774-v3.patch
>
>
> HDFS erasure code policy allows users to store directories and files with 
> predefined erasure code policies. Currently only 3x replication is supported 
> in the TestDFSIO implementation. This is going to add a new option to enable 
> tests of files with an erasure code policy enabled. 




[jira] [Commented] (MAPREDUCE-6774) Add support for HDFS erasure code policy to TestDFSIO

2016-09-13 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486603#comment-15486603
 ] 

Kai Zheng commented on MAPREDUCE-6774:
--

Given the above, the following can then become:
{code}
+    if (storagePolicy != null) {
+      if (!isStoragePolicyValid(storagePolicy, fs)) {
+        return -1;
+      }
{code}
=>
{code}
+    if (storagePolicy != null) {
+      checkStoragePolicy(storagePolicy, fs);
+    }
{code}


> Add support for HDFS erasure code policy to TestDFSIO
> -
>
> Key: MAPREDUCE-6774
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6774
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: MAPREDUCE-6774-v1.patch, MAPREDUCE-6774-v2.patch, 
> MAPREDUCE-6774-v3.patch
>
>
> HDFS erasure code policy allows users to store directories and files with 
> predefined erasure code policies. Currently only 3x replication is supported 
> in the TestDFSIO implementation. This is going to add a new option to enable 
> tests of files with an erasure code policy enabled. 




[jira] [Commented] (MAPREDUCE-6706) Update TaskUmbilicalProtocol to use ProtobufRPCEngine

2016-09-18 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15500842#comment-15500842
 ] 

Kai Zheng commented on MAPREDUCE-6706:
--

Hi [~djp],

Thanks for your timely feedback; it's very helpful. As you can see from the 
discussion in HADOOP-12579, maintaining the old RPC engine is regarded as a 
burden, so in my understanding it would be great if we could remove it from 
the code base without concerns. That's why I want to proceed on this. Of 
course, if switching the engine incurs too much overhead or effort, as it may 
here, I'd be fine to stop and just have the old engine deprecated rather than 
removed.

> Update TaskUmbilicalProtocol to use ProtobufRPCEngine
> -
>
> Key: MAPREDUCE-6706
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6706
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Wei Zhou
>
> Currently, {{TaskUmbilicalProtocol}} still uses {{WritableRPCEngine}}. It 
> should be moved to {{ProtobufRPCEngine}} so that {{WritableRPCEngine}} can be 
> deprecated, as tracked by HADOOP-12579




[jira] [Updated] (MAPREDUCE-6706) Update TaskUmbilicalProtocol to use ProtobufRPCEngine

2016-09-17 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6706:
-
Assignee: Wei Zhou  (was: Kai Zheng)

> Update TaskUmbilicalProtocol to use ProtobufRPCEngine
> -
>
> Key: MAPREDUCE-6706
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6706
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Wei Zhou
>
> Currently, {{TaskUmbilicalProtocol}} still uses {{WritableRPCEngine}}. It 
> should be moved to {{ProtobufRPCEngine}} so that {{WritableRPCEngine}} can be 
> deprecated, as tracked by HADOOP-12579




[jira] [Updated] (MAPREDUCE-6774) Add support for HDFS erasure code policy to TestDFSIO

2016-09-16 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6774:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks Sammi for the contribution!

> Add support for HDFS erasure code policy to TestDFSIO
> -
>
> Key: MAPREDUCE-6774
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6774
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: SammiChen
>Assignee: SammiChen
> Fix For: 3.0.0-alpha2
>
> Attachments: MAPREDUCE-6774-v1.patch, MAPREDUCE-6774-v2.patch, 
> MAPREDUCE-6774-v3.patch, MAPREDUCE-6774-v4.patch, MAPREDUCE-6774-v5.patch, 
> MAPREDUCE-6774-v6.patch
>
>
> HDFS erasure code policy allows users to store directories and files with 
> predefined erasure code policies. Currently only 3x replication is supported 
> in the TestDFSIO implementation. This is going to add a new option to enable 
> tests of files with an erasure code policy enabled. 




[jira] [Resolved] (MAPREDUCE-6705) Task failing continuously on trunk

2016-09-16 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng resolved MAPREDUCE-6705.
--
Resolution: Duplicate

Resolved this as a duplicate.

> Task failing continuously on trunk
> --
>
> Key: MAPREDUCE-6705
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6705
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Kai Zheng
>Priority: Blocker
>
> Task attempt failing continuously. Submit any mapreduce application
> Run the job as below
> {code}
> ./yarn jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples*.jar pi 
> -Dyarn.app.mapreduce.am.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" 
> -Dmapreduce.admin.user.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}"  1 1
> {code}
> {noformat}
> 2016-05-27 11:28:27,148 DEBUG [main] org.apache.hadoop.ipc.Client: getting 
> client out of cache: org.apache.hadoop.ipc.Client@291ae
> 2016-05-27 11:28:27,160 DEBUG [main] org.apache.hadoop.mapred.YarnChild: PID: 
> 22305
> 2016-05-27 11:28:27,160 INFO [main] org.apache.hadoop.mapred.YarnChild: 
> Sleeping for 0ms before retrying again. Got null now.
> 2016-05-27 11:28:27,161 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.reflect.UndeclaredThrowableException
>   at com.sun.proxy.$Proxy10.getTask(Unknown Source)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:136)
> Caused by: com.google.protobuf.ServiceException: Too many or few parameters 
> for request. Method: [getTask], Expected: 2, Actual: 1
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   ... 2 more
> 2016-05-27 11:28:27,161 DEBUG [main] org.apache.hadoop.ipc.Client: stopping 
> client from cache: org.apache.hadoop.ipc.Client@291ae
> 2016-05-27 11:28:27,161 DEBUG [main] org.apache.hadoop.ipc.Client: removing 
> client from cache: org.apache.hadoop.ipc.Client@291ae
> {noformat}




[jira] [Commented] (MAPREDUCE-6705) Task failing continuously on trunk

2016-09-16 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497853#comment-15497853
 ] 

Kai Zheng commented on MAPREDUCE-6705:
--

Sorry for the late response, [~bibinchundatt]. Yes I think this should be 
closed.

> Task failing continuously on trunk
> --
>
> Key: MAPREDUCE-6705
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6705
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Kai Zheng
>Priority: Blocker
>
> Task attempt failing continuously. Submit any mapreduce application
> Run the job as below
> {code}
> ./yarn jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples*.jar pi 
> -Dyarn.app.mapreduce.am.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" 
> -Dmapreduce.admin.user.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}"  1 1
> {code}
> {noformat}
> 2016-05-27 11:28:27,148 DEBUG [main] org.apache.hadoop.ipc.Client: getting 
> client out of cache: org.apache.hadoop.ipc.Client@291ae
> 2016-05-27 11:28:27,160 DEBUG [main] org.apache.hadoop.mapred.YarnChild: PID: 
> 22305
> 2016-05-27 11:28:27,160 INFO [main] org.apache.hadoop.mapred.YarnChild: 
> Sleeping for 0ms before retrying again. Got null now.
> 2016-05-27 11:28:27,161 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.reflect.UndeclaredThrowableException
>   at com.sun.proxy.$Proxy10.getTask(Unknown Source)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:136)
> Caused by: com.google.protobuf.ServiceException: Too many or few parameters 
> for request. Method: [getTask], Expected: 2, Actual: 1
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   ... 2 more
> 2016-05-27 11:28:27,161 DEBUG [main] org.apache.hadoop.ipc.Client: stopping 
> client from cache: org.apache.hadoop.ipc.Client@291ae
> 2016-05-27 11:28:27,161 DEBUG [main] org.apache.hadoop.ipc.Client: removing 
> client from cache: org.apache.hadoop.ipc.Client@291ae
> {noformat}




[jira] [Commented] (MAPREDUCE-6706) Update TaskUmbilicalProtocol to use ProtobufRPCEngine

2016-09-16 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497879#comment-15497879
 ] 

Kai Zheng commented on MAPREDUCE-6706:
--

Thank you Arun!

> Update TaskUmbilicalProtocol to use ProtobufRPCEngine
> -
>
> Key: MAPREDUCE-6706
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6706
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Kai Zheng
>
> Currently, {{TaskUmbilicalProtocol}} still uses {{WritableRPCEngine}}. It 
> should be moved to {{ProtobufRPCEngine}} so that {{WritableRPCEngine}} can be 
> deprecated, as tracked by HADOOP-12579




[jira] [Commented] (MAPREDUCE-6706) Update TaskUmbilicalProtocol to use ProtobufRPCEngine

2016-09-16 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497869#comment-15497869
 ] 

Kai Zheng commented on MAPREDUCE-6706:
--

Hi [~asuresh],

Will you work on this? If you don't mind, I can help with it, as I want to 
proceed and get HADOOP-12579 done. This is the major remaining task before we 
can get rid of the old RPC engine. Thanks.

> Update TaskUmbilicalProtocol to use ProtobufRPCEngine
> -
>
> Key: MAPREDUCE-6706
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6706
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>
> Currently, {{TaskUmbilicalProtocol}} still uses {{WritableRPCEngine}}. It 
> should be moved to {{ProtobufRPCEngine}} so that {{WritableRPCEngine}} can be 
> deprecated, as tracked by HADOOP-12579




[jira] [Updated] (MAPREDUCE-6774) Add support for HDFS erasure code policy to TestDFSIO

2016-09-06 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6774:
-
Assignee: SammiChen

> Add support for HDFS erasure code policy to TestDFSIO
> -
>
> Key: MAPREDUCE-6774
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6774
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: MAPREDUCE-6774-v1.patch
>
>
> HDFS erasure code policy allows users to store directories and files with 
> predefined erasure code policies. Currently only 3x replication is supported 
> in the TestDFSIO implementation. This is going to add a new option to enable 
> tests of files with an erasure code policy enabled. 




[jira] [Updated] (MAPREDUCE-6578) Add support for HDFS heterogeneous storage testing to TestDFSIO

2016-08-22 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6578:
-
Attachment: MAPREDUCE-6578.01.patch

Rebased the patch.

> Add support for HDFS heterogeneous storage testing to TestDFSIO
> ---
>
> Key: MAPREDUCE-6578
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6578
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Attachments: MAPREDUCE-6578.00.patch, MAPREDUCE-6578.01.patch
>
>
> HDFS heterogeneous storage allows users to store data blocks on different 
> storage media according to predefined storage policies. Only the 'Default' 
> policy is supported in the current TestDFSIO implementation. This is going to 
> add a new option to enable tests of other storage policies.




[jira] [Updated] (MAPREDUCE-6578) Add support for HDFS heterogeneous storage testing to TestDFSIO

2016-08-22 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6578:
-
Status: Patch Available  (was: Open)

> Add support for HDFS heterogeneous storage testing to TestDFSIO
> ---
>
> Key: MAPREDUCE-6578
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6578
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Attachments: MAPREDUCE-6578.00.patch, MAPREDUCE-6578.01.patch
>
>
> HDFS heterogeneous storage allows users to store data blocks on different 
> storage media according to predefined storage policies. Only the 'Default' 
> policy is supported in the current TestDFSIO implementation. This is going to 
> add a new option to enable tests of other storage policies.




[jira] [Commented] (MAPREDUCE-6578) Add support for HDFS heterogeneous storage testing to TestDFSIO

2016-08-23 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434072#comment-15434072
 ] 

Kai Zheng commented on MAPREDUCE-6578:
--

Thanks [~zhouwei] for the work and [~Sammi] for the update.

It's a good idea to support the {{storagePolicy}} option to test with a given 
storage policy. The patch looks good overall. One comment: could we print the 
list of all supported policies in the usage message? Otherwise it's hard for 
users to use the option.

> Add support for HDFS heterogeneous storage testing to TestDFSIO
> ---
>
> Key: MAPREDUCE-6578
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6578
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Attachments: MAPREDUCE-6578.00.patch, MAPREDUCE-6578.01.patch, 
> MAPREDUCE-6578.02.patch
>
>
> HDFS heterogeneous storage allows users to store data blocks on different 
> storage media according to predefined storage policies. Only the 'Default' 
> policy is supported in the current TestDFSIO implementation. This is going to 
> add a new option to enable tests of other storage policies.




[jira] [Updated] (MAPREDUCE-6578) Add support for HDFS heterogeneous storage testing to TestDFSIO

2016-08-23 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6578:
-
Attachment: MAPREDUCE-6578.02.patch

[~Sammi] helped update the patch. Uploading it for her.

> Add support for HDFS heterogeneous storage testing to TestDFSIO
> ---
>
> Key: MAPREDUCE-6578
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6578
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Attachments: MAPREDUCE-6578.00.patch, MAPREDUCE-6578.01.patch, 
> MAPREDUCE-6578.02.patch
>
>
> HDFS heterogeneous storage allows users to store data blocks on different 
> storage media according to predefined storage policies. Only the 'Default' 
> policy is supported in the current TestDFSIO implementation. This is going to 
> add a new option to enable tests of other storage policies.




[jira] [Updated] (MAPREDUCE-6578) Add support for HDFS heterogeneous storage testing to TestDFSIO

2016-08-24 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6578:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha1
   Status: Resolved  (was: Patch Available)

Committed to 3.0.0 and trunk branches. Thanks [~zhouwei] and [~Sammi] for the 
contribution.

> Add support for HDFS heterogeneous storage testing to TestDFSIO
> ---
>
> Key: MAPREDUCE-6578
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6578
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Fix For: 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6578.00.patch, MAPREDUCE-6578.01.patch, 
> MAPREDUCE-6578.02.patch, MAPREDUCE-6578.03.patch, MAPREDUCE-6578.04.patch
>
>
> HDFS heterogeneous storage allows users to store data blocks on different 
> storage media according to predefined storage policies. Only the 'Default' 
> policy is supported in the current TestDFSIO implementation. This is going to 
> add a new option to enable tests of other storage policies.




[jira] [Commented] (MAPREDUCE-6578) Add support for HDFS heterogeneous storage testing to TestDFSIO

2016-08-24 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435019#comment-15435019
 ] 

Kai Zheng commented on MAPREDUCE-6578:
--

Please ignore the messy build message above. It was triggered after the patch 
had already been committed. I fixed the minor checkstyle issue (line too long) 
just before committing it.

> Add support for HDFS heterogeneous storage testing to TestDFSIO
> ---
>
> Key: MAPREDUCE-6578
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6578
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Fix For: 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6578.00.patch, MAPREDUCE-6578.01.patch, 
> MAPREDUCE-6578.02.patch, MAPREDUCE-6578.03.patch, MAPREDUCE-6578.04.patch
>
>
> HDFS heterogeneous storage allows users to store data blocks on different 
> storage media according to predefined storage policies. Only the 'Default' 
> policy is supported in the current TestDFSIO implementation. This is going to 
> add a new option to enable tests of other storage policies.




[jira] [Commented] (MAPREDUCE-6578) Add support for HDFS heterogeneous storage testing to TestDFSIO

2016-08-24 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434999#comment-15434999
 ] 

Kai Zheng commented on MAPREDUCE-6578:
--

+1 on the updated patch except for the minor style issue. Will commit the 
latest one, which fixes it.

> Add support for HDFS heterogeneous storage testing to TestDFSIO
> ---
>
> Key: MAPREDUCE-6578
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6578
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Attachments: MAPREDUCE-6578.00.patch, MAPREDUCE-6578.01.patch, 
> MAPREDUCE-6578.02.patch, MAPREDUCE-6578.03.patch, MAPREDUCE-6578.04.patch
>
>
> HDFS heterogeneous storage allows users to store data blocks on different 
> storage media according to predefined storage policies. Only the 'Default' 
> policy is supported in the current TestDFSIO implementation. This is going to 
> add a new option to enable tests of other storage policies.




[jira] [Updated] (MAPREDUCE-6578) Add support for HDFS heterogeneous storage testing to TestDFSIO

2016-08-24 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6578:
-
Attachment: MAPREDUCE-6578.04.patch

This fixes the minor style issue.

> Add support for HDFS heterogeneous storage testing to TestDFSIO
> ---
>
> Key: MAPREDUCE-6578
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6578
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Wei Zhou
>Assignee: Wei Zhou
> Attachments: MAPREDUCE-6578.00.patch, MAPREDUCE-6578.01.patch, 
> MAPREDUCE-6578.02.patch, MAPREDUCE-6578.03.patch, MAPREDUCE-6578.04.patch
>
>
> HDFS heterogeneous storage allows users to store data blocks on different 
> storage media according to predefined storage policies. Only the 'Default' 
> policy is supported in the current TestDFSIO implementation. This is going to 
> add a new option to enable tests of other storage policies.




[jira] [Commented] (MAPREDUCE-6780) Add support for HDFS directory with erasure code policy to TeraGen and TeraSort

2016-10-09 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559673#comment-15559673
 ] 

Kai Zheng commented on MAPREDUCE-6780:
--

Thanks [~Sammi] for the update following our offline review discussion. The 
latest patch LGTM and +1.

> Add support for HDFS directory with erasure code policy to TeraGen and 
> TeraSort
> ---
>
> Key: MAPREDUCE-6780
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6780
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: MAPREDUCE-6780-v1.patch, MAPREDUCE-6780-v2.patch
>
>
> So far, HDFS files with an erasure code policy don't support the hflush and 
> hsync operations. This task is going to find a way to support writing data to 
> erasure code policy files in TeraGen and TeraSort.




[jira] [Updated] (MAPREDUCE-6780) Add support for striping files in benchmarking of TeraGen and TeraSort

2016-10-09 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6780:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks [~Sammi] for the contribution!

> Add support for striping files in benchmarking of TeraGen and TeraSort
> --
>
> Key: MAPREDUCE-6780
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6780
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>Reporter: SammiChen
>Assignee: SammiChen
> Fix For: 3.0.0-alpha2
>
> Attachments: MAPREDUCE-6780-v1.patch, MAPREDUCE-6780-v2.patch
>
>
> So far, HDFS files with an erasure code policy don't support the hflush and 
> hsync operations. This task is going to find a way to support writing data to 
> erasure code policy files in TeraGen and TeraSort.




[jira] [Updated] (MAPREDUCE-6780) Add support for striping files in benchmarking of TeraGen and TeraSort

2016-10-09 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6780:
-
Summary: Add support for striping files in benchmarking of TeraGen and 
TeraSort  (was: Add support for HDFS directory with erasure code policy to 
TeraGen and TeraSort)

> Add support for striping files in benchmarking of TeraGen and TeraSort
> --
>
> Key: MAPREDUCE-6780
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6780
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: MAPREDUCE-6780-v1.patch, MAPREDUCE-6780-v2.patch
>
>
> So far, HDFS files with an erasure code policy don't support the hflush and 
> hsync operations. This task is going to find a way to support writing data to 
> erasure code policy files in TeraGen and TeraSort.




[jira] [Updated] (MAPREDUCE-6877) Assign map task preferentially to the data node where the split is on faster storage type

2017-04-18 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated MAPREDUCE-6877:
-
Description: It would be good to use SSD in HDFS to improve read/write 
performance. However, SSD costs more than HDD, so the ONE-SSD policy exists as 
a tradeoff to balance performance and cost. The problem is whether 
applications will read the replica on SSD or not. If applications didn’t 
preferentially read the replica on SSD, the advantage of SSD wouldn’t be 
fully utilized. MapReduce currently assigns tasks according to data locality 
only. The storage types of all the replicas of each split should also be 
taken into consideration in order to assign a map task preferentially to a 
node where its split is located on a faster storage type.  (was: SSD has been 
widely used in HDFS to improve reading/writing performance. However, SSD costs 
much more than HDD, so there is a tradeoff policy ONE-SSD to balance the 
performance and cost. But there occurs a problem whether applications will read 
the replication on SSD. If applications cannot read the replication on SSD, the 
advantage of SSD can no longer be utilized, which will lead to much poorer 
performance compared to ALL-SSD policy. The current MapReduce only assign tasks 
according to data locality. The storage types of all the replications of each 
split should also been taken into consideration in order to assign map task 
preferentially to a node where its split is located on a faster storage type.)

> Assign map task preferentially to the data node where the split is on faster 
> storage type
> -
>
> Key: MAPREDUCE-6877
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6877
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Tim Yao
>
> Using SSD in HDFS can improve read/write performance. However, SSD costs 
> more than HDD, so the ONE-SSD policy exists as a tradeoff between 
> performance and cost. The problem is whether applications will actually read 
> the replica on SSD. If applications do not preferentially read the replica 
> on SSD, the advantage of SSD is not fully utilized. MapReduce currently 
> assigns tasks according to data locality only. The storage types of all the 
> replicas of each split should also be taken into consideration, so that a 
> map task is preferentially assigned to a node where its split is located on 
> a faster storage type.
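
To make the proposal concrete, here is a minimal sketch of ranking a split's 
replica hosts by storage type, fastest first. It is an illustration only, not 
the patch for this issue and not a Hadoop API: the class StorageAwarePlacement, 
its Replica type, the local StorageType enum and the host names are all 
hypothetical, and the enum order (RAM_DISK, SSD, DISK, ARCHIVE) is an assumed 
speed ranking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not the MAPREDUCE-6877 patch, not a Hadoop API.
class StorageAwarePlacement {

    // Assumed ordering, fastest first, so ordinal() doubles as a speed rank.
    enum StorageType { RAM_DISK, SSD, DISK, ARCHIVE }

    // One replica of a split: the host holding it and the medium it sits on.
    static final class Replica {
        final String host;
        final StorageType type;

        Replica(String host, StorageType type) {
            this.host = host;
            this.type = type;
        }
    }

    // Returns the replica hosts ordered by storage speed, fastest first.
    // List.sort is stable, so replicas on the same storage type keep their
    // original relative order.
    static List<String> rankHosts(List<Replica> replicas) {
        List<Replica> sorted = new ArrayList<>(replicas);
        sorted.sort(Comparator.comparingInt((Replica r) -> r.type.ordinal()));
        List<String> hosts = new ArrayList<>();
        for (Replica r : sorted) {
            hosts.add(r.host);
        }
        return hosts;
    }

    public static void main(String[] args) {
        // A split stored under a ONE-SSD style layout: one replica on SSD,
        // the others on spinning disk.
        List<Replica> replicas = Arrays.asList(
            new Replica("node1", StorageType.DISK),
            new Replica("node2", StorageType.SSD),
            new Replica("node3", StorageType.DISK));
        System.out.println(rankHosts(replicas)); // prints [node2, node1, node3]
    }
}
```

A scheduler could try the hosts in this order when choosing where to run the 
map task for the split, so that the SSD replica is read whenever its node has 
capacity. All three hosts are still data-local; the ranking only breaks ties 
among them.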



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (MAPREDUCE-6877) Assign map task preferentially to the data node where the split is on faster storage type

2017-04-18 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng reassigned MAPREDUCE-6877:


Assignee: Tim Yao

> Assign map task preferentially to the data node where the split is on faster 
> storage type
> -
>
> Key: MAPREDUCE-6877
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6877
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Tim Yao
> Assignee: Tim Yao
>
> Using SSD in HDFS can improve read/write performance. However, SSD costs 
> more than HDD, so the ONE-SSD policy exists as a tradeoff between 
> performance and cost. The problem is whether applications will actually read 
> the replica on SSD. If applications do not preferentially read the replica 
> on SSD, the advantage of SSD is not fully utilized. MapReduce currently 
> assigns tasks according to data locality only. The storage types of all the 
> replicas of each split should also be taken into consideration, so that a 
> map task is preferentially assigned to a node where its split is located on 
> a faster storage type.



