[jira] [Comment Edited] (MAPREDUCE-6688) Store job configurations in Timeline Service v2

2016-05-29 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15265635#comment-15265635
 ] 

Varun Saxena edited comment on MAPREDUCE-6688 at 5/30/16 5:19 AM:
--

I actually wanted to put this point up for discussion but forgot to mention it.
Semantically, the choice between sync and async depends more on which entities 
we want to publish immediately than on whether they have to be merged. 
Are configs something that has to be published immediately as part of a sync 
put?

There is a fair argument for sending all entities together in one shot for a 
sync put, though. But we can convert the list to an array outside as well. And 
to build an array I would first have to use a list anyway (as the array size 
cannot be predetermined in some cases).

I guess you mean the same, but just to elaborate for others as well. 
The reason I loop through a list and put entities one by one, instead of 
turning the list into an array and publishing it in a single put call, is that 
entities are merged together for async calls. 
From what I remember of YARN-3367, we wait for up to 10 TimelineEntities 
objects before publishing. The key is that we wait for 10 TimelineEntities 
objects and not TimelineEntity ones. We do not check how many entities are 
wrapped inside a single TimelineEntities object. Correct me if I am wrong.
If I pass an array of 10 entities, all of them are wrapped in a single 
TimelineEntities object and hence count as a single addition to the queue. If I 
put them separately, they count as 10 additions to the queue. Hence I went with 
looping.
Now, the reason I chose 100k as the limit was the assumption that even if all 
10 entities go in a single call, the payload size will be 1M, which IMO is 
fine. If 1M is not fine, we can change the limit to something like 50k.
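
To make the counting argument concrete, here is a minimal sketch. It is a toy model, not the actual TimelineClient code: the `Batch` class stands in for a TimelineEntities wrapper, the queue is a plain list, and `ASSUMED_LIMIT_BYTES` reads "100k" as 100 * 1024 bytes. All of these names and values are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class QueueCountSketch {
    // Assumption: "100k" is taken to mean 100 * 1024 bytes per entity.
    static final int ASSUMED_LIMIT_BYTES = 100 * 1024;

    /** Hypothetical stand-in for a TimelineEntities wrapper object. */
    static final class Batch {
        final List<String> entityIds;

        Batch(String... ids) {
            this.entityIds = new ArrayList<>(Arrays.asList(ids));
        }
    }

    /** All entities passed as one array: one wrapper, so one queue addition. */
    static int additionsForSingleCall(String[] ids) {
        List<Batch> queue = new ArrayList<>();
        queue.add(new Batch(ids));
        return queue.size();
    }

    /** Entities put one by one: one wrapper per entity, so N queue additions. */
    static int additionsForLoop(String[] ids) {
        List<Batch> queue = new ArrayList<>();
        for (String id : ids) {
            queue.add(new Batch(id));
        }
        return queue.size();
    }

    /** Worst-case payload if every entity is at the size limit. */
    static int worstCasePayloadBytes(int entityCount, int limitBytes) {
        return entityCount * limitBytes;
    }

    public static void main(String[] args) {
        String[] ids = new String[10];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = "entity-" + i;
        }
        System.out.println("single call additions: " + additionsForSingleCall(ids));
        System.out.println("looped additions: " + additionsForLoop(ids));
        System.out.println("worst-case bytes: "
            + worstCasePayloadBytes(ids.length, ASSUMED_LIMIT_BYTES));
    }
}
```

In this model a queue that counts wrapper objects sees the array case as one addition and the looped case as ten, and ten entities at the assumed limit add up to roughly 1M.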

Would like to hear views of others on the same.

bq. This solution looks fine as of now but would require changes if we adopt a 
different approach for publishing metrics and configurations as per YARN-3401.
Even if we were to route our entities through the RM, we would likely do that 
based on entity type (i.e. route entities with a YARN entity type via the RM). 
That is one solution that comes to my mind for YARN-3401.
In that case the current structure of the code should work well.


> Store job configurations in Timeline Service v2
> ---
>
> Key: MAPREDUCE-6688
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6688
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
> 

[jira] [Assigned] (MAPREDUCE-6705) Task failing continuously on trunk

2016-05-29 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng reassigned MAPREDUCE-6705:


Assignee: Kai Zheng

> Task failing continuously on trunk
> --
>
> Key: MAPREDUCE-6705
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6705
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Kai Zheng
>Priority: Blocker
>
> Task attempts fail continuously when any MapReduce application is submitted.
> Run the job as below:
> {code}
> ./yarn jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples*.jar pi \
>   -Dyarn.app.mapreduce.am.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" \
>   -Dmapreduce.admin.user.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" 1 1
> {code}
> {noformat}
> 2016-05-27 11:28:27,148 DEBUG [main] org.apache.hadoop.ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@291ae
> 2016-05-27 11:28:27,160 DEBUG [main] org.apache.hadoop.mapred.YarnChild: PID: 22305
> 2016-05-27 11:28:27,160 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
> 2016-05-27 11:28:27,161 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.reflect.UndeclaredThrowableException
>   at com.sun.proxy.$Proxy10.getTask(Unknown Source)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:136)
> Caused by: com.google.protobuf.ServiceException: Too many or few parameters for request. Method: [getTask], Expected: 2, Actual: 1
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   ... 2 more
> 2016-05-27 11:28:27,161 DEBUG [main] org.apache.hadoop.ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@291ae
> 2016-05-27 11:28:27,161 DEBUG [main] org.apache.hadoop.ipc.Client: removing client from cache: org.apache.hadoop.ipc.Client@291ae
> {noformat}
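
For readers unfamiliar with the shape of this trace, the sketch below shows how a dynamic proxy can produce an UndeclaredThrowableException whose cause is a checked exception thrown by the invocation handler. It is an illustration only: `TaskProtocol`, the nested `ServiceException`, and the argument-count check are hypothetical stand-ins and do not reproduce the real Hadoop protocol or ProtobufRpcEngine code.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.lang.reflect.UndeclaredThrowableException;

public class UndeclaredThrowableSketch {
    /** Hypothetical checked exception standing in for protobuf's ServiceException. */
    static class ServiceException extends Exception {
        ServiceException(String message) {
            super(message);
        }
    }

    /** Hypothetical protocol; getTask declares no checked exceptions. */
    interface TaskProtocol {
        String getTask(String context);
    }

    /** Builds a proxy whose handler insists on exactly two arguments. */
    static TaskProtocol newProxy() {
        InvocationHandler handler = (proxy, method, args) -> {
            int actual = (args == null) ? 0 : args.length;
            if (actual != 2) {
                // A checked exception the interface method does not declare.
                throw new ServiceException(
                    "Too many or few parameters for request. Method: ["
                        + method.getName() + "], Expected: 2, Actual: " + actual);
            }
            return "task";
        };
        return (TaskProtocol) Proxy.newProxyInstance(
            TaskProtocol.class.getClassLoader(),
            new Class<?>[] {TaskProtocol.class},
            handler);
    }

    /** Calls the proxy with one argument and returns the exception it raises. */
    static Throwable callAndCapture() {
        try {
            newProxy().getTask("jvm-context");
            return null;
        } catch (UndeclaredThrowableException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        Throwable t = callAndCapture();
        System.out.println(t.getClass().getName());
        System.out.println("Caused by: " + t.getCause().getMessage());
    }
}
```

Because the interface method declares no checked exceptions, the JVM wraps the handler's exception in UndeclaredThrowableException, which matches the outer exception type in the log above.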



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6240) Hadoop client displays confusing error message

2016-05-29 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6240:
-
Status: Patch Available  (was: In Progress)

trying a jenkins rerun

> Hadoop client displays confusing error message
> --
>
> Key: MAPREDUCE-6240
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6240
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.7.0
>Reporter: Mohammad Kamrul Islam
>Assignee: Gera Shegalov
> Attachments: MAPREDUCE-6240-gera.001.patch, 
> MAPREDUCE-6240-gera.001.patch, MAPREDUCE-6240-gera.002.patch, 
> MAPREDUCE-6240.003.patch, MAPREDUCE-6240.004.patch, MAPREDUCE-6240.1.patch
>
>
> The Hadoop client often throws an exception with "java.io.IOException: Cannot 
> initialize Cluster. Please check your configuration for 
> mapreduce.framework.name and the correspond server addresses".
> This is a misleading, generic message for any cluster initialization 
> problem. It takes many hours of debugging to identify the root cause. A more 
> accurate error message would help resolve the problem quickly.
> In one such instance, the Oozie log showed the following exception, while the 
> root cause was a CNF that the Hadoop client did not return in the exception.
> {noformat}
> JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
>   at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:412)
>   at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:392)
>   at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:979)
>   at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1134)
>   at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228)
>   at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
>   at org.apache.oozie.command.XCommand.call(XCommand.java:281)
>   at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:323)
>   at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:252)
>   at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
>   at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
>   at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
>   at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
>   at org.apache.hadoop.mapred.JobClient.init(JobClient.java:470)
>   at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:449)
>   at org.apache.oozie.service.HadoopAccessorService$1.run(HadoopAccessorService.java:372)
>   at org.apache.oozie.service.HadoopAccessorService$1.run(HadoopAccessorService.java:370)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at org.apache.oozie.service.HadoopAccessorService.createJobClient(HadoopAccessorService.java:379)
>   at org.apache.oozie.action.hadoop.JavaActionExecutor.createJobClient(JavaActionExecutor.java:1185)
>   at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:927)
>   ... 10 more
> {noformat}
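
The problem described above, a root cause that never reaches the caller, is commonly addressed by chaining the original exception as the cause of the generic one. The sketch below illustrates that idea only; it is not the patch attached to this issue. The class, the `initialize` method, and the provider class name are hypothetical, and it assumes "CNF" refers to a ClassNotFoundException.

```java
import java.io.IOException;

public class ClusterInitSketch {
    static final String GENERIC_MESSAGE =
        "Cannot initialize Cluster. Please check your configuration for "
            + "mapreduce.framework.name and the correspond server addresses.";

    /** Hypothetical initialization step that may fail with ClassNotFoundException. */
    static void initialize(String providerClassName) throws IOException {
        try {
            Class.forName(providerClassName);
        } catch (ClassNotFoundException e) {
            // Keep the generic message but attach the root cause, so it shows
            // up as a "Caused by:" entry in the caller's stack trace.
            throw new IOException(GENERIC_MESSAGE, e);
        }
    }

    /** Runs initialize and returns the IOException it raises, or null on success. */
    static IOException tryInitialize(String providerClassName) {
        try {
            initialize(providerClassName);
            return null;
        } catch (IOException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        // "no.such.Provider" is a deliberately missing, made-up class name.
        IOException e = tryInitialize("no.such.Provider");
        System.out.println(e.getMessage());
        System.out.println("Caused by: " + e.getCause());
    }
}
```

With the cause attached, a log consumer such as the Oozie trace above would show the underlying failure instead of only the generic message.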






[jira] [Updated] (MAPREDUCE-6240) Hadoop client displays confusing error message

2016-05-29 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6240:
-
Status: In Progress  (was: Patch Available)




[jira] [Commented] (MAPREDUCE-6240) Hadoop client displays confusing error message

2016-05-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305813#comment-15305813
 ] 

Hadoop QA commented on MAPREDUCE-6240:
--

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 27s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 5s | trunk passed |
| +1 | compile | 1m 29s | trunk passed |
| +1 | checkstyle | 0m 25s | trunk passed |
| +1 | mvnsite | 0m 51s | trunk passed |
| +1 | mvneclipse | 0m 23s | trunk passed |
| +1 | findbugs | 1m 6s | trunk passed |
| +1 | javadoc | 0m 32s | trunk passed |
| 0 | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 40s | the patch passed |
| +1 | compile | 1m 27s | the patch passed |
| +1 | javac | 1m 27s | the patch passed |
| +1 | checkstyle | 0m 22s | the patch passed |
| +1 | mvnsite | 0m 47s | the patch passed |
| +1 | mvneclipse | 0m 19s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 15s | the patch passed |
| +1 | javadoc | 0m 29s | the patch passed |
| -1 | unit | 1m 56s | hadoop-mapreduce-client-core in the patch failed. |
| -1 | unit | 143m 42s | hadoop-mapreduce-client-jobclient in the patch failed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 163m 50s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.mapreduce.tools.TestCLI |
|   | hadoop.mapred.TestReduceFetch |
|   | hadoop.mapred.TestMerge |
|   | hadoop.mapreduce.TestMapReduceLazyOutput |
|   | hadoop.mapred.TestMRIntermediateDataEncryption |
|   | hadoop.mapred.TestLazyOutput |
|   | hadoop.mapreduce.TestLargeSort |
|   | hadoop.mapred.TestReduceFetchFromPartialMem |
|   | hadoop.mapreduce.v2.TestMRJobsWithProfiler |
|   | hadoop.mapreduce.lib.output.TestJobOutputCommitter |
|   | hadoop.mapreduce.security.ssl.TestEncryptedShuffle |
|   | hadoop.mapreduce.v2.TestMROldApiJobs |
|   | hadoop.mapred.TestJobCleanup |
|   | hadoop.mapreduce.v2.TestSpeculativeExecution |
|   | hadoop.mapred.TestClusterMRNotification |
|   | hadoop.mapreduce.security.TestUmbilicalProtocolWithJobToken |
|   | hadoop.mapreduce.v2.TestMRAMWithNonNormalizedCapabilities |
|   | hadoop.mapreduce.v2.TestMRJobs |
|   | hadoop.mapred.TestJobName |
|   | hadoop.mapreduce.TestMRJobClient |
|   | hadoop.mapred.TestClusterMapReduceTestCase |
|   | hadoop.mapred.TestAuditLogger |
|   | hadoop.mapreduce.security.TestMRCredentials |
|   | hadoop.mapred.TestMRTimelineEventHandling |
|   |