[jira] [Comment Edited] (MAPREDUCE-6688) Store job configurations in Timeline Service v2
[ https://issues.apache.org/jira/browse/MAPREDUCE-6688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265635#comment-15265635 ] Varun Saxena edited comment on MAPREDUCE-6688 at 5/30/16 5:19 AM:
--
I actually wanted to bring this point up for discussion; forgot to mention it. Sync or async is decided semantically on the basis of which entities we want to publish immediately, rather than on whether they have to be merged or not. Are configs something that have to be published immediately as part of a sync put? There can be a fair argument in favor of sending all entities together in one shot for a sync, though. But we can convert the list to an array outside as well, and for converting into an array I would have to use a list first anyway (as the array size cannot be predetermined in some cases). I guess you mean the same, but just to elaborate for others as well:

The reason I am looping through a list and putting entities one by one, instead of turning the list into an array and publishing in a single put call, is that entities are merged together for async calls. From what I remember of YARN-3367, we wait for up to 10 TimelineEntities objects before publishing. The key is that we wait for 10 TimelineEntities objects, not 10 TimelineEntity ones; we do not check how many entities are wrapped inside a single TimelineEntities object. Correct me if I am wrong. If I pass an array of 10 entities, all of those entities would be wrapped in a single TimelineEntities object and hence would count as a single addition to the queue. If I put them separately, they are counted as 10 additions to the queue. Hence I went with looping.

Now, the reason I chose 100k as the limit was that even if all 10 entities go in a single call, the payload size will be 1M, which IMO is fine. If 1M is not fine, we can change the limit to something like 50k (say). Would like to hear the views of others on this.

bq. 
This solution looks fine as of now but would require changes if we adopt a different approach for publishing metrics and configurations as per YARN-3401.

Even if we were to route our entities through the RM, we would likely do that based on entity type (i.e. route entities with a YARN entity type via the RM). That is one solution which comes to my mind for YARN-3401. In that case the current structure of the code should work well.

> Store job configurations in Timeline Service v2
> ---
>
> Key: MAPREDUCE-6688
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6688
> Project: Hadoop Map/Reduce
> Issue
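The batching and payload reasoning in the comment above can be sanity-checked with a small simulation. This is a toy model of the semantics as recalled from YARN-3367 (10 TimelineEntities wrappers awaited per publish, a proposed 100k per-put limit), not the actual TimelineV2Client code; all class and variable names here are illustrative.

```python
# Toy model of the async publish queue described above: the queue holds
# TimelineEntities *wrappers*, and the dispatcher counts wrappers, not
# the TimelineEntity objects inside them.
from collections import deque

WRAPPERS_PER_FLUSH = 10   # TimelineEntities objects awaited before publishing
PER_PUT_LIMIT_KB = 100    # proposed 100k config-size limit per put

class ToyTimelineDispatcher:
    def __init__(self):
        self.queue = deque()
        self.flushed_entity_counts = []   # entities published per flush

    def put(self, *entities):
        # One put call wraps all its entities in a single TimelineEntities
        # object, i.e. a single addition to the queue.
        self.queue.append(list(entities))
        if len(self.queue) >= WRAPPERS_PER_FLUSH:
            self._flush()

    def _flush(self):
        batch = [self.queue.popleft()
                 for _ in range(min(WRAPPERS_PER_FLUSH, len(self.queue)))]
        self.flushed_entity_counts.append(sum(len(w) for w in batch))

# Ten entities in one call: one wrapper, counts as a single queue addition.
one_shot = ToyTimelineDispatcher()
one_shot.put(*[f"entity-{i}" for i in range(10)])

# Ten entities put one by one: ten wrappers, which triggers a publish.
looped = ToyTimelineDispatcher()
for i in range(10):
    looped.put(f"entity-{i}")

# Worst-case payload per publish under the proposed limit:
worst_case_kb = WRAPPERS_PER_FLUSH * PER_PUT_LIMIT_KB   # 1000 KB, i.e. ~1M
```

With a 50k limit the same arithmetic gives a roughly 500 KB worst case, which is the alternative floated in the comment.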
[jira] [Assigned] (MAPREDUCE-6705) Task failing continuously on trunk
[ https://issues.apache.org/jira/browse/MAPREDUCE-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng reassigned MAPREDUCE-6705:
-
Assignee: Kai Zheng

> Task failing continuously on trunk
> --
>
> Key: MAPREDUCE-6705
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6705
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Kai Zheng
> Priority: Blocker
>
> Task attempt failing continuously. Submit any mapreduce application.
> Run the job as below
> {code}
> ./yarn jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples*.jar pi -Dyarn.app.mapreduce.am.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" -Dmapreduce.admin.user.env="HADOOP_MAPRED_HOME={{HADOOP_COMMON_HOME}}" 1 1
> {code}
> {noformat}
> 2016-05-27 11:28:27,148 DEBUG [main] org.apache.hadoop.ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@291ae
> 2016-05-27 11:28:27,160 DEBUG [main] org.apache.hadoop.mapred.YarnChild: PID: 22305
> 2016-05-27 11:28:27,160 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
> 2016-05-27 11:28:27,161 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.reflect.UndeclaredThrowableException
> 	at com.sun.proxy.$Proxy10.getTask(Unknown Source)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:136)
> Caused by: com.google.protobuf.ServiceException: Too many or few parameters for request. Method: [getTask], Expected: 2, Actual: 1
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> 	... 2 more
> 2016-05-27 11:28:27,161 DEBUG [main] org.apache.hadoop.ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@291ae
> 2016-05-27 11:28:27,161 DEBUG [main] org.apache.hadoop.ipc.Client: removing client from cache: org.apache.hadoop.ipc.Client@291ae
> {noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
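The ServiceException in the log above is the interesting part: the client dispatched the umbilical getTask call with one parameter while the server-side protocol declares two, which typically points to mismatched client and server protocol definitions on the classpath. A toy, framework-independent sketch of that kind of arity check follows; the names are illustrative and this is not the actual ProtobufRpcEngine code.

```python
# Toy RPC dispatcher that validates the declared vs. supplied parameter
# count before invoking a method, mimicking the "Expected: 2, Actual: 1"
# failure mode seen in the log above.
class RpcArityError(Exception):
    pass

class ToyRpcServer:
    def __init__(self):
        # Server-side protocol: method name -> expected parameter count.
        self.signatures = {"getTask": 2}

    def invoke(self, method, *args):
        expected = self.signatures[method]
        if len(args) != expected:
            raise RpcArityError(
                f"Too many or few parameters for request. "
                f"Method: [{method}], Expected: {expected}, Actual: {len(args)}")
        return "task"

server = ToyRpcServer()

# A client built against an older protocol definition sends one parameter:
try:
    server.invoke("getTask", "jvm-context")
    mismatch_message = None
except RpcArityError as e:
    mismatch_message = str(e)
```

A matching client, sending both parameters, goes through the same `invoke` path without raising.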
[jira] [Updated] (MAPREDUCE-6240) Hadoop client displays confusing error message
[ https://issues.apache.org/jira/browse/MAPREDUCE-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated MAPREDUCE-6240:
-
Status: Patch Available (was: In Progress)

trying a jenkins rerun

> Hadoop client displays confusing error message
> --
>
> Key: MAPREDUCE-6240
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6240
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client
> Affects Versions: 2.7.0
> Reporter: Mohammad Kamrul Islam
> Assignee: Gera Shegalov
> Attachments: MAPREDUCE-6240-gera.001.patch, MAPREDUCE-6240-gera.001.patch, MAPREDUCE-6240-gera.002.patch, MAPREDUCE-6240.003.patch, MAPREDUCE-6240.004.patch, MAPREDUCE-6240.1.patch
>
> The Hadoop client often throws an exception with "java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses".
> This is a misleading, generic message for any cluster-initialization problem, and it takes a lot of debugging hours to identify the root cause. A correct error message could resolve this problem quickly.
> In one such instance, the Oozie log showed the following exception, while the root cause was a ClassNotFoundException that the Hadoop client didn't return in the exception.
> {noformat}
> JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
> 	at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:412)
> 	at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:392)
> 	at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:979)
> 	at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1134)
> 	at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228)
> 	at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63)
> 	at org.apache.oozie.command.XCommand.call(XCommand.java:281)
> 	at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:323)
> 	at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:252)
> 	at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:744)
> Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
> 	at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
> 	at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
> 	at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
> 	at org.apache.hadoop.mapred.JobClient.init(JobClient.java:470)
> 	at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:449)
> 	at org.apache.oozie.service.HadoopAccessorService$1.run(HadoopAccessorService.java:372)
> 	at org.apache.oozie.service.HadoopAccessorService$1.run(HadoopAccessorService.java:370)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> 	at org.apache.oozie.service.HadoopAccessorService.createJobClient(HadoopAccessorService.java:379)
> 	at org.apache.oozie.action.hadoop.JavaActionExecutor.createJobClient(JavaActionExecutor.java:1185)
> 	at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:927)
> 	... 10 more
> {noformat}
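The complaint in this issue is that Cluster.initialize wraps whatever actually went wrong in a generic IOException without surfacing the underlying cause (here, a ClassNotFoundException from a missing provider). A minimal, framework-independent sketch of the difference between swallowing and chaining the root cause (the function and message below are illustrative, not the actual Hadoop code):

```python
# Sketch: swallowing vs. chaining the root cause when provider lookup fails.
def load_provider_or_fail(chain_cause: bool):
    try:
        raise ImportError("no ClientProtocolProvider for mapreduce.framework.name")
    except ImportError as root:
        generic = IOError("Cannot initialize Cluster. Please check your configuration")
        if chain_cause:
            raise generic from root   # root cause preserved in the traceback
        raise generic                 # root cause lost: this is the complaint

# With chaining, the ImportError is still reachable via __cause__:
try:
    load_provider_or_fail(chain_cause=True)
except IOError as e:
    assert isinstance(e.__cause__, ImportError)

# Without chaining, __cause__ is empty and the caller only sees the
# generic message, as in the Oozie log above:
try:
    load_provider_or_fail(chain_cause=False)
except IOError as e:
    assert e.__cause__ is None
```

The analogous fix in Java is passing the caught exception as the constructor's cause argument (or via initCause) so it appears as "Caused by:" in the stack trace.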
[jira] [Updated] (MAPREDUCE-6240) Hadoop client displays confusing error message
[ https://issues.apache.org/jira/browse/MAPREDUCE-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated MAPREDUCE-6240:
-
Status: In Progress (was: Patch Available)

> Hadoop client displays confusing error message
> --
>
> Key: MAPREDUCE-6240
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6240
> Project: Hadoop Map/Reduce
[jira] [Commented] (MAPREDUCE-6240) Hadoop client displays confusing error message
[ https://issues.apache.org/jira/browse/MAPREDUCE-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15305813#comment-15305813 ] Hadoop QA commented on MAPREDUCE-6240:
--
(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 27s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 5s | trunk passed |
| +1 | compile | 1m 29s | trunk passed |
| +1 | checkstyle | 0m 25s | trunk passed |
| +1 | mvnsite | 0m 51s | trunk passed |
| +1 | mvneclipse | 0m 23s | trunk passed |
| +1 | findbugs | 1m 6s | trunk passed |
| +1 | javadoc | 0m 32s | trunk passed |
| 0 | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 40s | the patch passed |
| +1 | compile | 1m 27s | the patch passed |
| +1 | javac | 1m 27s | the patch passed |
| +1 | checkstyle | 0m 22s | the patch passed |
| +1 | mvnsite | 0m 47s | the patch passed |
| +1 | mvneclipse | 0m 19s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 15s | the patch passed |
| +1 | javadoc | 0m 29s | the patch passed |
| -1 | unit | 1m 56s | hadoop-mapreduce-client-core in the patch failed. |
| -1 | unit | 143m 42s | hadoop-mapreduce-client-jobclient in the patch failed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 163m 50s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.mapreduce.tools.TestCLI |
| | hadoop.mapred.TestReduceFetch |
| | hadoop.mapred.TestMerge |
| | hadoop.mapreduce.TestMapReduceLazyOutput |
| | hadoop.mapred.TestMRIntermediateDataEncryption |
| | hadoop.mapred.TestLazyOutput |
| | hadoop.mapreduce.TestLargeSort |
| | hadoop.mapred.TestReduceFetchFromPartialMem |
| | hadoop.mapreduce.v2.TestMRJobsWithProfiler |
| | hadoop.mapreduce.lib.output.TestJobOutputCommitter |
| | hadoop.mapreduce.security.ssl.TestEncryptedShuffle |
| | hadoop.mapreduce.v2.TestMROldApiJobs |
| | hadoop.mapred.TestJobCleanup |
| | hadoop.mapreduce.v2.TestSpeculativeExecution |
| | hadoop.mapred.TestClusterMRNotification |
| | hadoop.mapreduce.security.TestUmbilicalProtocolWithJobToken |
| | hadoop.mapreduce.v2.TestMRAMWithNonNormalizedCapabilities |
| | hadoop.mapreduce.v2.TestMRJobs |
| | hadoop.mapred.TestJobName |
| | hadoop.mapreduce.TestMRJobClient |
| | hadoop.mapred.TestClusterMapReduceTestCase |
| | hadoop.mapred.TestAuditLogger |
| | hadoop.mapreduce.security.TestMRCredentials |
| | hadoop.mapred.TestMRTimelineEventHandling | |