[jira] [Created] (YARN-7673) ClassNotFoundException: org.apache.hadoop.yarn.server.api.DistributedSchedulingAMProtocol when using hadoop-client-minicluster

2017-12-18 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-7673:


 Summary: ClassNotFoundException: 
org.apache.hadoop.yarn.server.api.DistributedSchedulingAMProtocol when using 
hadoop-client-minicluster
 Key: YARN-7673
 URL: https://issues.apache.org/jira/browse/YARN-7673
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jeff Zhang


I'd like to use hadoop-client-minicluster for a Hadoop downstream project, but I 
encounter the following exception when starting the Hadoop minicluster. I checked 
hadoop-client-minicluster, and it indeed does not contain this class. Is this 
something that was missed when packaging the published jar?

{code}
java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/server/api/DistributedSchedulingAMProtocol
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at org.apache.hadoop.yarn.server.MiniYARNCluster.createResourceManager(MiniYARNCluster.java:851)
    at org.apache.hadoop.yarn.server.MiniYARNCluster.serviceInit(MiniYARNCluster.java:285)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
{code}
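
For context, a minimal sketch (assuming only the published hadoop-client-minicluster 
jar on the test classpath; the class name is made up) of the kind of downstream code 
that hits this. MiniYARNCluster references DistributedSchedulingAMProtocol when 
creating the ResourceManager, so class loading fails during init():

{code}
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.MiniYARNCluster;

public class MiniClusterRepro {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();
    // 1 NodeManager, 1 local dir, 1 log dir
    MiniYARNCluster cluster = new MiniYARNCluster("repro", 1, 1, 1);
    cluster.init(conf);   // NoClassDefFoundError is thrown from serviceInit here
    cluster.start();
    cluster.stop();
  }
}
{code}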






[jira] [Resolved] (YARN-6364) How to set the resource queue when starting a Spark job on YARN

2017-03-20 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang resolved YARN-6364.
--
Resolution: Invalid

> How to set the resource queue when starting a Spark job on YARN
> ---
>
> Key: YARN-6364
> URL: https://issues.apache.org/jira/browse/YARN-6364
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: sydt
>
> As we all know, YARN takes charge of resource management for Hadoop. When 
> Zeppelin starts a Spark job in yarn-client mode, how do we set a designated 
> resource queue on YARN so that Spark applications belonging to different 
> users run in their respective YARN resource queues?
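
For reference, the queue is chosen on the Spark side rather than configured in 
YARN: spark-submit takes a --queue flag, and the same setting is available 
programmatically as spark.yarn.queue. A minimal hedged sketch ("myQueue" is a 
placeholder queue name):

{code}
import org.apache.spark.SparkConf;

public class QueueDemo {
  public static void main(String[] args) {
    // Equivalent to: spark-submit --queue myQueue ...
    SparkConf conf = new SparkConf()
        .setAppName("queue-demo")
        .set("spark.yarn.queue", "myQueue");
    // Pass this conf when building the SparkContext. In Zeppelin's
    // yarn-client mode, the same property can be set per user in the
    // Spark interpreter settings.
  }
}
{code}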






[jira] [Resolved] (YARN-4182) Killed containers disappear on app attempt page

2015-09-18 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang resolved YARN-4182.
--
Resolution: Duplicate

> Killed containers disappear on app attempt page
> ---
>
> Key: YARN-4182
> URL: https://issues.apache.org/jira/browse/YARN-4182
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Jeff Zhang
> Attachments: 2015-09-18_1601.png
>
>






[jira] [Created] (YARN-4182) Killed containers disappear on app attempt page

2015-09-18 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-4182:


 Summary: Killed containers disappear on app attempt page
 Key: YARN-4182
 URL: https://issues.apache.org/jira/browse/YARN-4182
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Reporter: Jeff Zhang








[jira] [Created] (YARN-3763) Support for fuzzy search in ATS

2015-06-02 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-3763:


 Summary: Support for fuzzy search in ATS
 Key: YARN-3763
 URL: https://issues.apache.org/jira/browse/YARN-3763
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Affects Versions: 2.7.0
Reporter: Jeff Zhang


Currently ATS only supports exact match. Sometimes a fuzzy match may be helpful 
when the entities in the ATS share a common prefix or suffix.
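
To make the request concrete, a small self-contained sketch (the entity ids are 
made up, and ATS has no such filter today) of what a prefix-style fuzzy match 
would return compared to exact match:

{code}
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PrefixMatchDemo {
  public static void main(String[] args) {
    List<String> entityIds = Arrays.asList(
        "tez_dag_1433145984561_0001_1",
        "tez_dag_1433145984561_0001_2",
        "mr_job_1433145984561_0007");
    // Exact match (today's ATS behavior) requires the full id; a fuzzy
    // (prefix) match would return both tez entities:
    List<String> hits = entityIds.stream()
        .filter(id -> id.startsWith("tez_dag_1433145984561_0001"))
        .collect(Collectors.toList());
    System.out.println(hits);
  }
}
{code}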





[jira] [Created] (YARN-3755) Log the command of launching containers

2015-06-01 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-3755:


 Summary: Log the command of launching containers
 Key: YARN-3755
 URL: https://issues.apache.org/jira/browse/YARN-3755
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jeff Zhang


In the ResourceManager, YARN logs the command for launching the AM, which is 
very useful. But there is no such log in the NM log for launching containers, 
which makes it difficult to diagnose when containers fail to launch due to some 
issue in the commands. Although users can look at the commands in the container 
launch script file, that is an internal detail of YARN that users usually don't 
know about. From the user's perspective, they only know the commands they 
specified when building the YARN application.

{code}
2015-06-01 16:06:42,245 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_1433145984561_0001_01_01 : $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx1024m -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dtez.root.logger=info,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
{code}
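
A hedged sketch of the requested logging (the helper class and call site are 
hypothetical; ContainerLaunchContext#getCommands() is the actual carrier of the 
user-specified commands, and a real patch would log from the NodeManager's 
container launcher):

{code}
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;

// Hypothetical helper; shown only to illustrate the proposed log line.
public final class ContainerCommandLogger {
  private static final Logger LOG =
      LoggerFactory.getLogger(ContainerCommandLogger.class);

  public static void logLaunchCommand(ContainerId id, ContainerLaunchContext ctx) {
    List<String> commands = ctx.getCommands();  // user-specified launch commands
    LOG.info("Command to launch container {} : {}", id, String.join(" ", commands));
  }
}
{code}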





[jira] [Created] (YARN-3171) Sort by application id doesn't work in the ATS web UI

2015-02-10 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-3171:


 Summary: Sort by application id doesn't work in the ATS web UI
 Key: YARN-3171
 URL: https://issues.apache.org/jira/browse/YARN-3171
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Affects Versions: 2.6.0
Reporter: Jeff Zhang


The order doesn't change when I click the column header.





[jira] [Created] (YARN-3000) YARN_PID_DIR should be visible in yarn-env.sh

2014-12-30 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-3000:


 Summary: YARN_PID_DIR should be visible in yarn-env.sh
 Key: YARN-3000
 URL: https://issues.apache.org/jira/browse/YARN-3000
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scripts
Affects Versions: 2.6.0
Reporter: Jeff Zhang
Priority: Minor


Currently YARN_PID_DIR only shows up in yarn-daemon.sh, which is not supposed to 
be the place for users to set environment variables. IMO, yarn-env.sh is the 
place for users to set environment variables, just like hadoop-env.sh, so it's 
better to put YARN_PID_DIR into yarn-env.sh (it can be added as a comment, just 
like YARN_RESOURCEMANAGER_HEAPSIZE).





[jira] [Created] (YARN-2560) Diagnostics is delayed in being passed to ApplicationReport

2014-09-16 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-2560:


 Summary: Diagnostics is delayed in being passed to ApplicationReport
 Key: YARN-2560
 URL: https://issues.apache.org/jira/browse/YARN-2560
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jeff Zhang


The diagnostics of an application may be passed to the ApplicationReport with a 
delay. Here's one example where the ApplicationStatus has changed to FAILED but 
the diagnostics is still empty; the next call to getApplicationReport gets the 
diagnostics.
{code}
while (true) {
    appReport = yarnClient.getApplicationReport(appId);
    Thread.sleep(1000);
    LOG.info("AppStatus: " + appReport.getFinalApplicationStatus());
    LOG.info("Diagnostics: " + appReport.getDiagnostics());
}
{code}

*Output:*
{code}
AppStatus:FAILED
Diagnostics: // empty

// get diagnostics for the next getApplicationReport
AppStatus:FAILED
Diagnostics: // diagnostics info here
{code}
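
A hedged client-side workaround sketch (the retry bound and sleep interval are 
assumptions, not from the issue): once a terminal status is seen, keep polling 
until the diagnostics arrive.

{code}
import java.io.IOException;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.exceptions.YarnException;

public final class DiagnosticsPoller {
  /** Re-fetches the report until diagnostics show up (at most ~10s). */
  public static ApplicationReport awaitDiagnostics(YarnClient yarnClient,
      ApplicationId appId) throws IOException, YarnException, InterruptedException {
    ApplicationReport report = yarnClient.getApplicationReport(appId);
    for (int i = 0; i < 10
        && report.getFinalApplicationStatus() == FinalApplicationStatus.FAILED
        && report.getDiagnostics().isEmpty(); i++) {
      Thread.sleep(1000);
      report = yarnClient.getApplicationReport(appId);
    }
    return report;
  }
}
{code}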





[jira] [Created] (YARN-2184) ResourceManager may fail due to name node in safe mode

2014-06-19 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-2184:


 Summary: ResourceManager may fail due to name node in safe mode
 Key: YARN-2184
 URL: https://issues.apache.org/jira/browse/YARN-2184
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Jeff Zhang
Assignee: Jeff Zhang


If the history service is enabled in the ResourceManager, it will try to mkdir 
when the service is initialized, and at that time the NameNode may still be in 
safe mode, which can cause the history service to fail and then bring down the 
ResourceManager. This is very likely to happen when the cluster is restarted, 
since the NameNode can stay in safe mode for a long time.

Here are the error logs:

{code}
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /Users/jzhang/Java/lib/hadoop-2.4.0/logs/yarn/system/history/ApplicationHistoryDataRoot. Name node is in safe mode.
The reported blocks 85 has reached the threshold 0.9990 of total blocks 85. The number of live datanodes 1 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 19 seconds.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1195)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3564)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3540)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

    at org.apache.hadoop.ipc.Client.call(Client.java:1410)
    at org.apache.hadoop.ipc.Client.call(Client.java:1363)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy14.mkdirs(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
    at com.sun.proxy.$Proxy14.mkdirs(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:500)
    at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2553)
    at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2524)
    at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:827)
    at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:823)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:823)
    at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:816)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1815)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.serviceInit(FileSystemApplicationHistoryStore.java:120)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    ... 10 more
2014-06-20 11:06:25,220 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ResourceManager at jzhangMBPr.local/192.168.100.152
{code}
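
A hedged sketch of the mitigation this report points toward (the polling 
interval is an assumption, and the committed fix may differ): wait for the 
NameNode to leave safe mode before the history store attempts mkdirs.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public final class SafeModeAwareMkdirs {
  public static void mkdirs(Configuration conf, Path dir) throws Exception {
    FileSystem fs = FileSystem.get(conf);
    if (fs instanceof DistributedFileSystem) {
      DistributedFileSystem dfs = (DistributedFileSystem) fs;
      while (dfs.isInSafeMode()) {  // mkdirs would be rejected until this clears
        Thread.sleep(1000);
      }
    }
    fs.mkdirs(dir);
  }
}
{code}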





[jira] [Created] (YARN-1754) Container process is not really killed

2014-02-23 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-1754:


 Summary: Container process is not really killed
 Key: YARN-1754
 URL: https://issues.apache.org/jira/browse/YARN-1754
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.2.0
 Environment: Mac
Reporter: Jeff Zhang


I tested the following distributed shell example on my Mac:

hadoop jar 
share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar -appname 
shell -jar 
share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar 
-shell_command=sleep -shell_args=10 -num_containers=1

It starts two processes for one container: one is the shell process, and the 
other is the real command I execute (here, sleep 10).

Then I kill the application by running the command yarn application -kill 
app_id.

That kills the shell process, but it won't kill the real command process. The 
reason is that YARN uses the kill command to kill the process, and kill does 
not kill child processes; using pkill could resolve this issue.

I also verified this case on CentOS, where the behavior is the same as on Mac. 
IMHO, this is a very important case: it makes resource usage inconsistent and 
poses a potential security problem.
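
A hedged stand-alone demonstration of the underlying behavior (timings are 
arbitrary): terminating only the shell process, which is what a plain kill 
does, leaves its child running.

{code}
public class OrphanDemo {
  public static void main(String[] args) throws Exception {
    // The trailing echo keeps the shell from exec-ing sleep directly,
    // so we really get two processes: the shell and its sleep child.
    Process shell = new ProcessBuilder("/bin/sh", "-c", "sleep 10; echo done")
        .start();
    Thread.sleep(500);
    shell.destroy();  // SIGTERM to the shell only, like `kill <pid>`
    // `pgrep sleep` still shows the child afterwards; pkill (or killing the
    // whole process group) is what actually cleans it up.
  }
}
{code}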



