[jira] [Created] (YARN-7673) ClassNotFoundException: org.apache.hadoop.yarn.server.api.DistributedSchedulingAMProtocol when using hadoop-client-minicluster
Jeff Zhang created YARN-7673:
Summary: ClassNotFoundException: org.apache.hadoop.yarn.server.api.DistributedSchedulingAMProtocol when using hadoop-client-minicluster
Key: YARN-7673
URL: https://issues.apache.org/jira/browse/YARN-7673
Project: Hadoop YARN
Issue Type: Bug
Reporter: Jeff Zhang

I'd like to use hadoop-client-minicluster for a Hadoop downstream project, but I encounter the following exception when starting the Hadoop minicluster. I checked hadoop-client-minicluster, and it indeed does not contain this class. Is this something that went missing when packaging the published jar?

{code}
java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/server/api/DistributedSchedulingAMProtocol
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.apache.hadoop.yarn.server.MiniYARNCluster.createResourceManager(MiniYARNCluster.java:851)
	at org.apache.hadoop.yarn.server.MiniYARNCluster.serviceInit(MiniYARNCluster.java:285)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
{code}

-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
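The "I checked the jar" step above can be reproduced by listing the published jar's contents; this is only a sketch, and the jar filename/version below is hypothetical:

```shell
# No output from grep means the class is not packaged in the jar.
jar tf hadoop-client-minicluster-3.0.0.jar \
  | grep 'yarn/server/api/DistributedSchedulingAMProtocol'
```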
[jira] [Resolved] (YARN-6364) How to set the resource queue when starting a Spark job on YARN
[ https://issues.apache.org/jira/browse/YARN-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang resolved YARN-6364.
--
Resolution: Invalid

> How to set the resource queue when starting a Spark job on YARN
> ---
>
> Key: YARN-6364
> URL: https://issues.apache.org/jira/browse/YARN-6364
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: sydt
>
> As we all know, YARN takes charge of resource management for Hadoop. When
> Zeppelin starts a Spark job in yarn-client mode, how can we set the designated
> resource queue on YARN so that Spark applications belonging to different users
> run in their respective YARN resource queues?
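For context on why this was resolved as Invalid: this is a Spark usage question rather than a YARN bug, since spark-submit already accepts a `--queue` flag (equivalently, the `spark.yarn.queue` property). A sketch, with a hypothetical queue name and jar path:

```shell
# Submit an application to a specific YARN queue.
# 'reports' and path/to/app.jar are illustrative placeholders.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --queue reports \
  path/to/app.jar
```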
[jira] [Resolved] (YARN-4182) Killed containers disappear on app attempt page
[ https://issues.apache.org/jira/browse/YARN-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang resolved YARN-4182.
--
Resolution: Duplicate

> Killed containers disappear on app attempt page
> ---
>
> Key: YARN-4182
> URL: https://issues.apache.org/jira/browse/YARN-4182
> Project: Hadoop YARN
> Issue Type: Bug
> Components: webapp
> Reporter: Jeff Zhang
> Attachments: 2015-09-18_1601.png
>
[jira] [Created] (YARN-4182) Killed containers disappear on app attempt page
Jeff Zhang created YARN-4182:
Summary: Killed containers disappear on app attempt page
Key: YARN-4182
URL: https://issues.apache.org/jira/browse/YARN-4182
Project: Hadoop YARN
Issue Type: Bug
Components: webapp
Reporter: Jeff Zhang
[jira] [Created] (YARN-3763) Support for fuzzy search in ATS
Jeff Zhang created YARN-3763:
Summary: Support for fuzzy search in ATS
Key: YARN-3763
URL: https://issues.apache.org/jira/browse/YARN-3763
Project: Hadoop YARN
Issue Type: Improvement
Components: timelineserver
Affects Versions: 2.7.0
Reporter: Jeff Zhang

Currently ATS only supports exact match. Fuzzy matching could be helpful when the entities in the ATS share a common prefix or suffix.
[jira] [Created] (YARN-3755) Log the command for launching containers
Jeff Zhang created YARN-3755:
Summary: Log the command for launching containers
Key: YARN-3755
URL: https://issues.apache.org/jira/browse/YARN-3755
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Jeff Zhang

In the ResourceManager, YARN logs the command for launching the AM, which is very useful. But there is no such log in the NM (NodeManager) log for launching containers, which makes it difficult to diagnose containers that fail to launch due to some issue in the commands. Although users can look at the commands in the container launch script file, that is an internal detail of YARN which users usually don't know about. From the user's perspective, they only know the command they specified when building the YARN application.

{code}
2015-06-01 16:06:42,245 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1433145984561_0001_01_01 : $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dtez.root.logger=info,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
{code}
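As a stopgap, the container launch script mentioned above can be dug out of the NodeManager's local directories. A sketch, assuming the default `${hadoop.tmp.dir}/nm-local-dir` layout (adjust the path for your `yarn.nodemanager.local-dirs` setting):

```shell
# Find and print one container launch script under the NM local dirs.
find /tmp/hadoop-*/nm-local-dir -name launch_container.sh 2>/dev/null \
  | head -n 1 | xargs -r cat
```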
[jira] [Created] (YARN-3171) Sort by application id doesn't work in ATS web ui
Jeff Zhang created YARN-3171:
Summary: Sort by application id doesn't work in ATS web ui
Key: YARN-3171
URL: https://issues.apache.org/jira/browse/YARN-3171
Project: Hadoop YARN
Issue Type: Bug
Components: timelineserver
Affects Versions: 2.6.0
Reporter: Jeff Zhang

The order doesn't change when I click the column header.
[jira] [Created] (YARN-3000) YARN_PID_DIR should be visible in yarn-env.sh
Jeff Zhang created YARN-3000:
Summary: YARN_PID_DIR should be visible in yarn-env.sh
Key: YARN-3000
URL: https://issues.apache.org/jira/browse/YARN-3000
Project: Hadoop YARN
Issue Type: Bug
Components: scripts
Affects Versions: 2.6.0
Reporter: Jeff Zhang
Priority: Minor

Currently YARN_PID_DIR only shows up in yarn-daemon.sh, which is not the place for users to set environment variables. IMO, yarn-env.sh is the place for users to set environment variables, just like hadoop-env.sh, so it would be better to put YARN_PID_DIR into yarn-env.sh (it can be added as a comment, just like YARN_RESOURCEMANAGER_HEAPSIZE).
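A sketch of what the proposed change might look like, following the commented-out convention yarn-env.sh already uses for heap sizes (the example directory is hypothetical):

```shell
# Proposed addition to etc/hadoop/yarn-env.sh:
# The directory where YARN daemon pid files are stored.
# export YARN_PID_DIR=/var/run/hadoop-yarn
```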
[jira] [Created] (YARN-2560) Diagnostics are delayed when passed to ApplicationReport
Jeff Zhang created YARN-2560:
Summary: Diagnostics are delayed when passed to ApplicationReport
Key: YARN-2560
URL: https://issues.apache.org/jira/browse/YARN-2560
Project: Hadoop YARN
Issue Type: Bug
Reporter: Jeff Zhang

The diagnostics of an application may be delayed in being passed to the ApplicationReport. Here's one example: the application status has changed to FAILED, but the diagnostics are still empty, and only the next call of getApplicationReport gets the diagnostics.

{code}
while (true) {
  appReport = yarnClient.getApplicationReport(appId);
  Thread.sleep(1000);
  LOG.info("AppStatus:" + appReport.getFinalApplicationStatus());
  LOG.info("Diagnostics:" + appReport.getDiagnostics());
}
{code}

*Output:*
{code}
AppStatus:FAILED
Diagnostics:            // empty
// get diagnostics on the next getApplicationReport
AppStatus:FAILED
Diagnostics: ...        // diagnostics info here
{code}
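The same lag can likely be observed from the CLI by polling the application status; the application id below is hypothetical, and the field names depend on the `yarn application -status` output format:

```shell
appId=application_1410000000000_0001   # hypothetical application id
while true; do
  yarn application -status "$appId" | grep -Ei 'state|diagnostics'
  sleep 1
done
```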
[jira] [Created] (YARN-2184) ResourceManager may fail due to name node in safe mode
Jeff Zhang created YARN-2184:
Summary: ResourceManager may fail due to name node in safe mode
Key: YARN-2184
URL: https://issues.apache.org/jira/browse/YARN-2184
Project: Hadoop YARN
Issue Type: Improvement
Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Jeff Zhang
Assignee: Jeff Zhang

If the history service is enabled in the ResourceManager, it will try to mkdir when the service is initialized. At that time the NameNode may still be in safe mode, which can cause the history service to fail and then bring down the ResourceManager. This is quite likely when the cluster is restarted, since the NameNode may stay in safe mode for a long time. Here are the error logs:

{code}
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /Users/jzhang/Java/lib/hadoop-2.4.0/logs/yarn/system/history/ApplicationHistoryDataRoot. Name node is in safe mode. The reported blocks 85 has reached the threshold 0.9990 of total blocks 85. The number of live datanodes 1 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 19 seconds.
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1195)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3564)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3540)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
	at org.apache.hadoop.ipc.Client.call(Client.java:1410)
	at org.apache.hadoop.ipc.Client.call(Client.java:1363)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
	at com.sun.proxy.$Proxy14.mkdirs(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
	at com.sun.proxy.$Proxy14.mkdirs(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:500)
	at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2553)
	at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2524)
	at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:827)
	at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:823)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:823)
	at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:816)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1815)
	at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.serviceInit(FileSystemApplicationHistoryStore.java:120)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	... 10 more
2014-06-20 11:06:25,220 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down ResourceManager at jzhangMBPr.local/192.168.100.152
{code}
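Until the history service retries on its own, one operational workaround is to gate startup on the NameNode leaving safe mode, which `hdfs dfsadmin` supports directly. A sketch (the start command is one example of what might follow):

```shell
# Block until the NameNode has left safe mode, then start the RM.
hdfs dfsadmin -safemode wait
sbin/yarn-daemon.sh start resourcemanager
```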
[jira] [Created] (YARN-1754) Container process is not really killed
Jeff Zhang created YARN-1754:
Summary: Container process is not really killed
Key: YARN-1754
URL: https://issues.apache.org/jira/browse/YARN-1754
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.2.0
Environment: Mac
Reporter: Jeff Zhang

I tested the following distributed shell example on my Mac: hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar -appname shell -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar -shell_command=sleep -shell_args=10 -num_containers=1

It starts 2 processes for one container: one is the shell process, the other is the real command I execute (here, sleep 10). When I then kill this application by running the command yarn application -kill app_id, it kills the shell process but does not kill the real command process. The reason is that YARN uses the kill command to kill the process, and kill does not kill the process's children; using pkill could resolve this issue. I also verified this case on CentOS, which behaves the same as Mac. IMHO, this is a very important case: it makes resource usage inconsistent and poses a potential security problem.