[jira] [Commented] (YARN-2857) ConcurrentModificationException in ContainerLogAppender
[ https://issues.apache.org/jira/browse/YARN-2857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14211574#comment-14211574 ]

Mohammad Kamrul Islam commented on YARN-2857:
---------------------------------------------

Could someone with a binding vote please review it?

ConcurrentModificationException in ContainerLogAppender
-------------------------------------------------------

Key: YARN-2857
URL: https://issues.apache.org/jira/browse/YARN-2857
Project: Hadoop YARN
Issue Type: Bug
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Priority: Critical
Attachments: ContainerLogAppender.java, MAPREDUCE-6139-test.01.patch, MAPREDUCE-6139.1.patch, MAPREDUCE-6139.2.patch, MAPREDUCE-6139.3.patch, YARN-2857.3.patch

Context:
* Hadoop-2.3.0
* Using Oozie 4.0.1
* Pig version 0.11.x

The job is submitted by Oozie to launch a Pig script. The following exception traces were found in the MR task log.

In syslog:
{noformat}
2014-10-24 20:37:29,317 WARN [Thread-5] org.apache.hadoop.util.ShutdownHookManager: ShutdownHook '' failed, java.util.ConcurrentModificationException
java.util.ConcurrentModificationException
    at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
    at java.util.LinkedList$ListItr.next(LinkedList.java:888)
    at org.apache.hadoop.yarn.ContainerLogAppender.close(ContainerLogAppender.java:94)
    at org.apache.log4j.helpers.AppenderAttachableImpl.removeAllAppenders(AppenderAttachableImpl.java:141)
    at org.apache.log4j.Category.removeAllAppenders(Category.java:891)
    at org.apache.log4j.Hierarchy.shutdown(Hierarchy.java:471)
    at org.apache.log4j.LogManager.shutdown(LogManager.java:267)
    at org.apache.hadoop.mapred.TaskLog.syncLogsShutdown(TaskLog.java:286)
    at org.apache.hadoop.mapred.TaskLog$2.run(TaskLog.java:339)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
2014-10-24 20:37:29,395 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system...
{noformat}

In stderr:
{noformat}
java.util.ConcurrentModificationException
    at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
    at java.util.LinkedList$ListItr.next(LinkedList.java:888)
    at org.apache.hadoop.yarn.ContainerLogAppender.close(ContainerLogAppender.java:94)
    at org.apache.log4j.helpers.AppenderAttachableImpl.removeAllAppenders(AppenderAttachableImpl.java:141)
    at org.apache.log4j.Category.removeAllAppenders(Category.java:891)
    at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:759)
    at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)
    at org.apache.log4j.PropertyConfigurator.configure(PropertyConfigurator.java:440)
    at org.apache.pig.Main.configureLog4J(Main.java:740)
    at org.apache.pig.Main.run(Main.java:384)
    at org.apache.pig.PigRunner.run(PigRunner.java:49)
    at org.apache.oozie.action.hadoop.PigMain.runPigJob(PigMain.java:283)
    at org.apache.oozie.action.hadoop.PigMain.run(PigMain.java:223)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
    at org.apache.oozie.action.hadoop.PigMain.main(PigMain.java:76)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
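Both traces fail inside `LinkedList$ListItr.next`, called from `ContainerLogAppender.close`. One comes from a shutdown hook thread and the other from the main thread via `PropertyConfigurator`. That pattern is what a fail-fast iterator produces when the list is structurally modified while it is being walked. The following is a minimal sketch of the failure mode and one common remedy (snapshot under a lock, then iterate the copy). It assumes the appender buffers events in a `LinkedList`, as the trace suggests. `TailBufferSketch` and all of its methods are hypothetical names for illustration; this is neither the Hadoop source nor the attached patch.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for an appender that buffers the tail of a log.
class TailBufferSketch {
    private final LinkedList<String> tail = new LinkedList<>();
    private final List<String> flushed = new ArrayList<>();

    synchronized void append(String event) {
        tail.add(event);
    }

    // Unsafe variant: iterates the live list. Any structural change made
    // during the loop (simulated here by the callback) invalidates the
    // iterator, and the next call to next() throws.
    void unsafeClose(Runnable duringIteration) {
        for (String event : tail) {
            flushed.add(event);
            duringIteration.run();
        }
    }

    // Safer variant: copy and clear the buffer while holding the lock,
    // then iterate the private copy, which nobody else can modify.
    void safeClose(Runnable duringIteration) {
        List<String> snapshot;
        synchronized (this) {
            snapshot = new ArrayList<>(tail);
            tail.clear();
        }
        for (String event : snapshot) {
            flushed.add(event);
            duringIteration.run();
        }
    }

    // Returns true if the unsafe close fails the way the traces show.
    static boolean unsafeCloseThrows() {
        TailBufferSketch buf = new TailBufferSketch();
        buf.append("a");
        buf.append("b");
        try {
            buf.unsafeClose(() -> buf.append("late"));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Returns what the safe close flushed despite appends during the loop.
    static List<String> safeCloseFlushed() {
        TailBufferSketch buf = new TailBufferSketch();
        buf.append("a");
        buf.append("b");
        buf.safeClose(() -> buf.append("late"));
        return buf.flushed;
    }

    public static void main(String[] args) {
        System.out.println("unsafe close throws: " + unsafeCloseThrows());
        System.out.println("safe close flushed: " + safeCloseFlushed());
    }
}
```

The modification is triggered from a callback so that the failure is deterministic. In the reported job the same interleaving would come from a second thread, which makes it intermittent.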
[jira] [Commented] (YARN-2481) YARN should allow defining the location of java
[ https://issues.apache.org/jira/browse/YARN-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150339#comment-14150339 ]

Mohammad Kamrul Islam commented on YARN-2481:
---------------------------------------------

Very good to know, [~acmurthy]. Just to confirm my understanding: a user can override JAVA_HOME for each job, right? I assume it is done through the job conf?

YARN should allow defining the location of java
-----------------------------------------------

Key: YARN-2481
URL: https://issues.apache.org/jira/browse/YARN-2481
Project: Hadoop YARN
Issue Type: New Feature
Reporter: Abin Shahab

YARN currently uses the JAVA_HOME location on the host to launch containers. This does not work with Docker containers, which have their own filesystem namespace and OS. If the location of the Java binary in the container to be launched is configurable, YARN can launch containers that have Java in a different location than the host.
[jira] [Updated] (YARN-1962) Timeline server is enabled by default
[ https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam updated YARN-1962:
---------------------------------------

Attachment: YARN-1962.1.patch

Thanks [~vinodkv]. Patch added.

Timeline server is enabled by default
-------------------------------------

Key: YARN-1962
URL: https://issues.apache.org/jira/browse/YARN-1962
Project: Hadoop YARN
Issue Type: Sub-task
Affects Versions: 2.4.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Attachments: YARN-1962.1.patch

Since the Timeline server is not yet mature or secured, enabling it by default might create some confusion.
We were playing with 2.4.0 and found a lot of exceptions from the distributed shell example, all caused by connection refused errors. Btw, we didn't run the Timeline server (TS) because it is not secured yet.
It is possible to turn it off explicitly through the yarn-site config, but in my opinion this extra configuration change for a new service is not worthwhile at this point.
This JIRA is to turn it off by default.
If there is agreement, I can put up a simple patch for this.
{noformat}
14/04/17 23:24:33 ERROR impl.TimelineClientImpl: Failed to get the response from the timeline server.
com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: Connection refused
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
    at com.sun.jersey.api.client.Client.handle(Client.java:648)
    at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
    at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
    at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
    at sun.net.www.http.HttpClient.in
14/04/17 23:24:33 ERROR impl.TimelineClientImpl: Failed to get the response from the timeline server.
com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: Connection refused
    at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
    at com.sun.jersey.api.client.Client.handle(Client.java:648)
    at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670)
    at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
    at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:563)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPostingEntities(TimelineClientImpl.java:131)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:104)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.publishApplicationAttemptEvent(ApplicationMaster.java:1072)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:515)
    at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:281)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at
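The request in this issue amounts to flipping a default: clients should not try to post to the Timeline server unless the operator has switched it on. A minimal sketch of that behaviour follows, using a plain `Map` in place of the Hadoop configuration object. The property name `yarn.timeline-service.enabled` is an assumption here because the issue text only says "yarn-site config", and `TimelineToggleSketch` and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an "off unless enabled" switch for an optional service.
class TimelineToggleSketch {
    // Assumed property name; the issue text does not spell it out.
    static final String ENABLED_KEY = "yarn.timeline-service.enabled";
    // The default this issue argues for: disabled unless explicitly enabled.
    static final boolean DEFAULT_ENABLED = false;

    static boolean isTimelineEnabled(Map<String, String> conf) {
        String value = conf.get(ENABLED_KEY);
        return value == null ? DEFAULT_ENABLED : Boolean.parseBoolean(value.trim());
    }

    // Stand-in for the publishing path: skips the call entirely when the
    // service is disabled, so no connection attempt (and no stack trace) occurs.
    static String publish(Map<String, String> conf, String event) {
        if (!isTimelineEnabled(conf)) {
            return "skipped: " + event;
        }
        return "posted: " + event;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(publish(conf, "attempt-start"));
        conf.put(ENABLED_KEY, "true");
        System.out.println(publish(conf, "attempt-start"));
    }
}
```

With a default of `false`, a cluster that never starts the Timeline server would not log the `Connection refused` errors shown in the trace.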
[jira] [Commented] (YARN-1962) Timeline server is enabled by default
[ https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975885#comment-13975885 ]

Mohammad Kamrul Islam commented on YARN-1962:
---------------------------------------------

[~zjshen] Thanks for the feedback. I will upload a new patch.

Timeline server is enabled by default
-------------------------------------

Key: YARN-1962
URL: https://issues.apache.org/jira/browse/YARN-1962
Project: Hadoop YARN
Issue Type: Sub-task
Affects Versions: 2.4.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Attachments: YARN-1962.1.patch
[jira] [Updated] (YARN-1962) Timeline server is enabled by default
[ https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam updated YARN-1962:
---------------------------------------

Attachment: YARN-1962.2.patch

Patch with review comments.

Timeline server is enabled by default
-------------------------------------

Key: YARN-1962
URL: https://issues.apache.org/jira/browse/YARN-1962
Project: Hadoop YARN
Issue Type: Sub-task
Affects Versions: 2.4.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Attachments: YARN-1962.1.patch, YARN-1962.2.patch
[jira] [Commented] (YARN-1962) Timeline server is enabled by default
[ https://issues.apache.org/jira/browse/YARN-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976094#comment-13976094 ]

Mohammad Kamrul Islam commented on YARN-1962:
---------------------------------------------

Testing done:
1. Tested in a 2.4.0 cluster of 100 nodes with [~tthompso].
2. Ran the relevant unit tests, including the new one.

Timeline server is enabled by default
-------------------------------------

Key: YARN-1962
URL: https://issues.apache.org/jira/browse/YARN-1962
Project: Hadoop YARN
Issue Type: Sub-task
Affects Versions: 2.4.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Attachments: YARN-1962.1.patch, YARN-1962.2.patch
[jira] [Created] (YARN-1962) Timeline server is enabled by default
Mohammad Kamrul Islam created YARN-1962:
---------------------------------------

Summary: Timeline server is enabled by default
Key: YARN-1962
URL: https://issues.apache.org/jira/browse/YARN-1962
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
[jira] [Commented] (YARN-1894) RM shutdown due to java.net.UnknownHostException
[ https://issues.apache.org/jira/browse/YARN-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954629#comment-13954629 ]

Mohammad Kamrul Islam commented on YARN-1894:
---------------------------------------------

Thanks [~jianhe] and [~vinodkv] for pointing this out and taking care of it.

RM shutdown due to java.net.UnknownHostException
------------------------------------------------

Key: YARN-1894
URL: https://issues.apache.org/jira/browse/YARN-1894
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Fix For: 2.4.0

Background:
I started Hadoop 2.3 on my Mac in my office network and submitted a few jobs successfully. When I went home (new network), I submitted another job and it abruptly brought down the RM service.

Error in RM log:
{noformat}
2014-03-29 12:28:56,754 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager: storing RMDelegation token with sequence number: 3
2014-03-29 12:28:57,256 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
java.lang.IllegalArgumentException: java.net.UnknownHostException: mislam-mn.MY.OOFICE.DOMAIN
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
    at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:247)
    at org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:195)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.createContainerToken(LeafQueue.java:1294)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1342)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1208)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1167)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:868)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:642)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:556)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:696)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:740)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:88)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:543)
    at java.lang.Thread.run(Thread.java:695)
Caused by: java.net.UnknownHostException: mislam-mn.linkedin.biz
    ... 15 more
2014-03-29 12:28:57,259 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
2014-03-29 12:28:57,297 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:8088
2014-03-29 12:28:57,401 INFO org.apache.hadoop.ipc.Server: Stopping server on 8032
2014-03-29 12:28:57,473 INFO org.apache.hadoop.ipc.Server: Stopping server on 8033
.
{noformat}

Proposal:
---------
I believe the root cause is that I moved my machine from one network to another with the same RM service still running.
My point is: whatever the cause, the RM is a long-running core service and it should not exit this way. An appropriate error message should be sufficient.

If there is a consensus (or no disagreement), I can work on a patch.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
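The proposal above (report the error, keep the service alive) can be illustrated with a small event loop that isolates each event's failure. This is only a sketch under stated assumptions. `ResilientDispatcherSketch`, its event strings, and the `unresolvable:` prefix are hypothetical. The failure is simulated by throwing the same exception shape seen in the log (`IllegalArgumentException` wrapping `UnknownHostException`) rather than by a real DNS lookup. Whether it is safe for the real scheduler to continue after such an error is a design question for the issue itself.

```java
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: a dispatcher that records a per-event error and
// continues, instead of letting one bad event stop the whole service.
class ResilientDispatcherSketch {

    // Runs the handler for every event; a RuntimeException from one event
    // is turned into an error message and does not abort the loop.
    static List<String> dispatchAll(List<String> events, Consumer<String> handler) {
        List<String> errors = new ArrayList<>();
        for (String event : events) {
            try {
                handler.accept(event);
            } catch (RuntimeException e) {
                errors.add("Error in handling event " + event + ": " + e.getMessage());
            }
        }
        return errors;
    }

    // Simulated handler: events marked "unresolvable:" fail the way the
    // log shows, with an UnknownHostException wrapped in an unchecked exception.
    static void handle(String event) {
        String marker = "unresolvable:";
        if (event.startsWith(marker)) {
            String host = event.substring(marker.length());
            throw new IllegalArgumentException(new UnknownHostException(host));
        }
    }

    // Returns {number of events handled successfully, number of errors recorded}.
    static int[] runDemo() {
        List<String> handled = new ArrayList<>();
        List<String> errors = dispatchAll(
                Arrays.asList("NODE_UPDATE:n1", "unresolvable:host.invalid", "NODE_UPDATE:n2"),
                event -> {
                    handle(event);
                    handled.add(event);
                });
        return new int[] { handled.size(), errors.size() };
    }

    public static void main(String[] args) {
        int[] result = runDemo();
        System.out.println("handled=" + result[0] + " errors=" + result[1]);
    }
}
```

In this sketch the event after the failing one is still processed, which is the behaviour the proposal asks for.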
[jira] [Created] (YARN-1894) RM shutdown due to java.net.UnknownHostException
Mohammad Kamrul Islam created YARN-1894:

Summary: RM shutdown due to java.net.UnknownHostException
Key: YARN-1894
URL: https://issues.apache.org/jira/browse/YARN-1894
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam

Background:
I started Hadoop 2.3 on my Mac on my office network and submitted a few jobs successfully. When I went home (a new network), I submitted another job and it abruptly brought down the RM service.

Error in RM log:
{noformat}
2014-03-29 12:28:56,754 INFO org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager: storing RMDelegation token with sequence number: 3
2014-03-29 12:28:57,256 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
java.lang.IllegalArgumentException: java.net.UnknownHostException: mislam-mn.MY.OOFICE.DOMAIN
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
    at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerToken(BuilderUtils.java:247)
    at org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager.createContainerToken(RMContainerTokenSecretManager.java:195)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.createContainerToken(LeafQueue.java:1294)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1342)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1208)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1167)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:868)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:642)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:556)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:696)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:740)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:88)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:543)
    at java.lang.Thread.run(Thread.java:695)
Caused by: java.net.UnknownHostException: mislam-mn.linkedin.biz
    ... 15 more
2014-03-29 12:28:57,259 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
2014-03-29 12:28:57,297 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:8088
2014-03-29 12:28:57,401 INFO org.apache.hadoop.ipc.Server: Stopping server on 8032
2014-03-29 12:28:57,473 INFO org.apache.hadoop.ipc.Server: Stopping server on 8033
.
{noformat}

Proposal:
---
I believe the root cause is that I moved my machine from one network to another while the same RM service kept running. My point is: whatever the cause, the RM is a long-running core service and it should not exit this way. An appropriate error message should be sufficient. If there is a consensus (or no disagreement), I can work on a patch.
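Editorial sketch of the proposal in YARN-1894: the log shows an exception escaping the scheduler's event-processing thread, followed by "Exiting, bbye..". One possible shape of the requested behavior is a loop that records the failure of a single event and carries on. This is a minimal illustration in plain Java, not the ResourceManager's code; `TolerantEventLoop`, `Handler` and `dispatchAll` are hypothetical names, and whether it is safe to continue after a failed scheduler event is a design question this sketch does not answer.

```java
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these names come from the Hadoop code base. */
final class TolerantEventLoop {

    /** Stand-in for an event handler that may fail on an individual event. */
    interface Handler {
        void handle(String event) throws Exception;
    }

    /**
     * Passes every event to the handler. A failure on one event is recorded
     * as an error message and the loop continues with the next event,
     * instead of ending the whole process.
     */
    static List<String> dispatchAll(List<String> events, Handler handler) {
        List<String> errors = new ArrayList<>();
        for (String event : events) {
            try {
                handler.handle(event);
            } catch (Exception e) {
                errors.add("Error handling event " + event + ": " + e);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        Handler handler = event -> {
            if (event.equals("NODE_UPDATE")) {
                // Simulates the failed host lookup seen in the report.
                throw new IllegalArgumentException(
                        new UnknownHostException("host.invalid"));
            }
        };
        List<String> errors = dispatchAll(
                Arrays.asList("APP_ADDED", "NODE_UPDATE", "APP_REMOVED"), handler);
        // The service keeps running; the operator still sees what went wrong.
        errors.forEach(System.err::println);
    }
}
```

The error text keeps the event name and the nested cause, so the unknown host would still be visible in the log even though the process keeps running.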
[jira] [Assigned] (YARN-1818) When mapreduce.jobhistory.intermediate-done-dir isn't writable, application fails with generic error
[ https://issues.apache.org/jira/browse/YARN-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam reassigned YARN-1818:

Assignee: Mohammad Kamrul Islam

When mapreduce.jobhistory.intermediate-done-dir isn't writable, application fails with generic error

Key: YARN-1818
URL: https://issues.apache.org/jira/browse/YARN-1818
Project: Hadoop YARN
Issue Type: Bug
Components: applications
Affects Versions: 2.3.0
Reporter: Travis Thompson
Assignee: Mohammad Kamrul Islam

When trying to run an application and the permissions are wrong on {{mapreduce.jobhistory.intermediate-done-dir}}, the MapReduce AM fails with a non-descriptive error message:
{noformat}
Application application_1394227890066_0004 failed 2 times due to AM Container for appattempt_1394227890066_0004_02 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
    at org.apache.hadoop.util.Shell.run(Shell.java:418)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:279)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
main : command provided 1
main : user is tthompso
main : requested yarn user is tthompso
Container exited with a non-zero exit code 1
.Failing this attempt.. Failing the application.
{noformat}
When permissions are corrected on this dir, applications are able to run.
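Editorial sketch for YARN-1818: the complaint is that a permissions problem surfaces only as a generic exit code. A descriptive check could look roughly like the following. It is only an illustration under stated assumptions: it uses `java.nio.file` against a local path as a stand-in, whereas the real directory would normally live on the cluster file system and be checked through Hadoop's own file system API; `DoneDirCheck` and `describeProblem` are hypothetical names.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical sketch; a local-filesystem stand-in, not MapReduce AM code. */
final class DoneDirCheck {

    /**
     * Returns null when the directory looks usable, otherwise a message
     * that names the configuration property, the path and the problem.
     */
    static String describeProblem(String propertyName, Path dir) {
        if (!Files.exists(dir)) {
            return propertyName + " points to " + dir + ", which does not exist";
        }
        if (!Files.isDirectory(dir)) {
            return propertyName + " points to " + dir + ", which is not a directory";
        }
        if (!Files.isWritable(dir)) {
            return propertyName + " points to " + dir
                    + ", which is not writable by user " + System.getProperty("user.name");
        }
        return null;
    }

    public static void main(String[] args) {
        // Uses the JVM temp directory only so that the example runs anywhere.
        Path dir = Paths.get(System.getProperty("java.io.tmpdir"));
        String problem = describeProblem(
                "mapreduce.jobhistory.intermediate-done-dir", dir);
        System.out.println(problem == null ? "directory is usable" : problem);
    }
}
```

Reporting the property name alongside the path is the point of the sketch: it tells the user which setting to inspect, which the container exit code alone does not.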
[jira] [Updated] (YARN-1590) _HOST doesn't expand properly for RM, NM, ProxyServer and JHS
[ https://issues.apache.org/jira/browse/YARN-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam updated YARN-1590:

Attachment: YARN-1590.4.patch

New patch that addresses Vinod's comments.

_HOST doesn't expand properly for RM, NM, ProxyServer and JHS

Key: YARN-1590
URL: https://issues.apache.org/jira/browse/YARN-1590
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 3.0.0, 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Attachments: YARN-1590.1.patch, YARN-1590.2.patch, YARN-1590.3.patch, YARN-1590.4.patch

_HOST is not properly substituted when a VIP address is used. Currently it always uses the host name of the machine and disregards the VIP address. This is true mainly for the RM, NM, WebProxy, and JHS RPC services. It appears to work fine for webservice authentication. On the other hand, the same substitution works fine for the NN and SNN, in RPC as well as webservice.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1590) _HOST doesn't expand properly for RM, NM, ProxyServer and JHS
[ https://issues.apache.org/jira/browse/YARN-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891065#comment-13891065 ]

Mohammad Kamrul Islam commented on YARN-1590:

bq. It is true mainly for RM, NM, WebProxy, and JHS rpc service. Looks like it is working fine for webservice authentication.

This means that the RM UI (for example) works fine with _HOST. However, RM RPC communication with the NM (for example) doesn't work. We tested that this patch resolves the RM-NM communication failure.

_HOST doesn't expand properly for RM, NM, ProxyServer and JHS

Key: YARN-1590
URL: https://issues.apache.org/jira/browse/YARN-1590
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 3.0.0, 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Attachments: YARN-1590.1.patch, YARN-1590.2.patch, YARN-1590.3.patch, YARN-1590.4.patch

_HOST is not properly substituted when a VIP address is used. Currently it always uses the host name of the machine and disregards the VIP address. This is true mainly for the RM, NM, WebProxy, and JHS RPC services. It appears to work fine for webservice authentication. On the other hand, the same substitution works fine for the NN and SNN, in RPC as well as webservice.
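Editorial sketch for YARN-1590: the behavior described above comes down to which host name replaces the `_HOST` token in a principal of the form `service/_HOST@REALM`. The plain-Java illustration below shows the difference between substituting the machine's own name and substituting the host taken from the configured service address. It is not Hadoop's implementation and not the attached patch; `HostPattern`, `expand` and all host names are hypothetical.

```java
import java.util.Locale;

/** Hypothetical sketch of _HOST substitution; not taken from Hadoop. */
final class HostPattern {

    static final String HOST_TOKEN = "_HOST";

    /**
     * Replaces the _HOST token in a principal of the form
     * service/_HOST@REALM with the given host name, lower-cased.
     * Principals without the token are returned unchanged.
     */
    static String expand(String principal, String hostname) {
        String[] parts = principal.split("[/@]");
        if (parts.length != 3 || !HOST_TOKEN.equals(parts[1])) {
            return principal; // nothing to substitute
        }
        return parts[0] + "/" + hostname.toLowerCase(Locale.ROOT) + "@" + parts[2];
    }

    public static void main(String[] args) {
        String configured = "rm/_HOST@EXAMPLE.COM";
        // Substituting the machine's own name ignores the VIP ...
        System.out.println(expand(configured, "node17.example.com"));
        // ... while substituting the host of the configured service
        // address keeps the VIP in the resulting principal.
        System.out.println(expand(configured, "rm-vip.example.com"));
    }
}
```

In this sketch the caller decides which host name to pass in, which is where the reported behavior and the expected behavior differ.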
[jira] [Updated] (YARN-1622) 'bin/yarn' command doesn't behave like 'hadoop' and etc.
[ https://issues.apache.org/jira/browse/YARN-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam updated YARN-1622:

Attachment: YARN-1622.1.patch

Simple one-line patch in the script.

'bin/yarn' command doesn't behave like 'hadoop' and etc.

Key: YARN-1622
URL: https://issues.apache.org/jira/browse/YARN-1622
Project: Hadoop YARN
Issue Type: Bug
Components: client
Affects Versions: 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Attachments: YARN-1622.1.patch

There are a few issues with 'bin/yarn' and 'etc/hadoop/yarn-env.sh'. They are loosely related, but the fixes are minor and will go in the same files, so I combined them into one JIRA.

Issues are:
1. bin/yarn has a dangling 'fi' in the last line. Thanks to the shell for being so compliant!
2. YARN_ROOT_LOGGER is defined as INFO,DRFA in yarn-env.sh. That is why the 'bin/yarn' command doesn't show log messages in the client window by default, whereas 'bin/hadoop' shows them correctly (because HADOOP_ROOT_LOGGER is INFO,console by default). This inconsistent behavior needs to be addressed.
3. For each client command, yarn creates a log file at $YARN_LOG_DIR/yarn.log owned by the end user. In a multi-tenant environment, a second user cannot create their own yarn.log in the same place, which causes the exception pasted at the end. By default, it should write to $YARN_LOG_DIR/$USER/yarn.log instead.

Note: I plan to address only #1 and #2 in this JIRA. If the default YARN_ROOT_LOGGER is made consistent with 'bin/hadoop', issue #3 will not occur. The scope of this JIRA is to come close to the 'bin/hadoop' behavior.

Exception:
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /export/apps/hadoop/logs/yarn.log (Permission denied)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:142)
    at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
    at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
    at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:223)
    at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
    at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
    at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
    at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
    at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
    at org.apache.log4j.Logger.getLogger(Logger.java:104)
    at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:289)
    at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:109)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1116)
    at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:914)
    at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:604)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:336)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:310)
    at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
    at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:165)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
[jira] [Created] (YARN-1622) 'yarn' command doesn't behave like 'hadoop' and etc.
Mohammad Kamrul Islam created YARN-1622:

Summary: 'yarn' command doesn't behave like 'hadoop' and etc.
Key: YARN-1622
URL: https://issues.apache.org/jira/browse/YARN-1622
Project: Hadoop YARN
Issue Type: Bug
Components: client
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam

There are a few issues with 'bin/yarn' and 'etc/hadoop/yarn-env.sh'. They are loosely related, but the fixes are minor and will go in the same files, so I combined them into one JIRA.

Issues are:
1. bin/yarn has a dangling 'fi' in the last line. Thanks to the shell for being so compliant!
2. YARN_ROOT_LOGGER is defined as INFO,DRFA in yarn-env.sh. That is why the 'bin/yarn' command doesn't show log messages in the client window by default, whereas 'bin/hadoop' shows them correctly (because HADOOP_ROOT_LOGGER is INFO,console by default). This inconsistent behavior needs to be addressed.
3. For each client command, yarn creates a log file at $YARN_LOG_DIR/yarn.log owned by the end user. In a multi-tenant environment, a second user cannot create their own yarn.log in the same place, which causes the exception pasted at the end. By default, it should write to $YARN_LOG_DIR/$USER/yarn.log instead.

Note: I plan to address only #1 and #2 in this JIRA. If the default YARN_ROOT_LOGGER is made consistent with 'bin/hadoop', issue #3 will not occur. The scope of this JIRA is to come close to the 'bin/hadoop' behavior.

Exception:
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /export/apps/hadoop/logs/yarn.log (Permission denied)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:142)
    at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
    at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
    at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:223)
    at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
    at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
    at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
    at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
    at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
    at org.apache.log4j.Logger.getLogger(Logger.java:104)
    at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:289)
    at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:109)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1116)
    at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:914)
    at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:604)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:336)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:310)
    at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
    at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:165)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
[jira] [Updated] (YARN-1622) 'bin/yarn' command doesn't behave like 'hadoop' and etc.
[ https://issues.apache.org/jira/browse/YARN-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam updated YARN-1622:

Affects Version/s: 2.2.0
Summary: 'bin/yarn' command doesn't behave like 'hadoop' and etc. (was: 'yarn' command doesn't behave like 'hadoop' and etc. )

'bin/yarn' command doesn't behave like 'hadoop' and etc.

Key: YARN-1622
URL: https://issues.apache.org/jira/browse/YARN-1622
Project: Hadoop YARN
Issue Type: Bug
Components: client
Affects Versions: 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam

There are a few issues with 'bin/yarn' and 'etc/hadoop/yarn-env.sh'. They are loosely related, but the fixes are minor and will go in the same files, so I combined them into one JIRA.

Issues are:
1. bin/yarn has a dangling 'fi' in the last line. Thanks to the shell for being so compliant!
2. YARN_ROOT_LOGGER is defined as INFO,DRFA in yarn-env.sh. That is why the 'bin/yarn' command doesn't show log messages in the client window by default, whereas 'bin/hadoop' shows them correctly (because HADOOP_ROOT_LOGGER is INFO,console by default). This inconsistent behavior needs to be addressed.
3. For each client command, yarn creates a log file at $YARN_LOG_DIR/yarn.log owned by the end user. In a multi-tenant environment, a second user cannot create their own yarn.log in the same place, which causes the exception pasted at the end. By default, it should write to $YARN_LOG_DIR/$USER/yarn.log instead.

Note: I plan to address only #1 and #2 in this JIRA. If the default YARN_ROOT_LOGGER is made consistent with 'bin/hadoop', issue #3 will not occur. The scope of this JIRA is to come close to the 'bin/hadoop' behavior.

Exception:
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /export/apps/hadoop/logs/yarn.log (Permission denied)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:142)
    at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
    at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
    at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:223)
    at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
    at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
    at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
    at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
    at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
    at org.apache.log4j.Logger.getLogger(Logger.java:104)
    at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:289)
    at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:109)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1116)
    at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:914)
    at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:604)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:336)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:310)
    at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
    at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:165)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
[jira] [Commented] (YARN-1622) 'bin/yarn' command doesn't behave like 'hadoop' and etc.
[ https://issues.apache.org/jira/browse/YARN-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878118#comment-13878118 ]

Mohammad Kamrul Islam commented on YARN-1622:

Further analysis found that some claims made in the original description are no longer valid for the Apache version. Issues #2 and #3 are specific to our cluster, not Apache issues. However, issue #1 remains valid. In short, the scope of this JIRA is much more limited than I originally thought. I will put up a simple patch soon.

'bin/yarn' command doesn't behave like 'hadoop' and etc.

Key: YARN-1622
URL: https://issues.apache.org/jira/browse/YARN-1622
Project: Hadoop YARN
Issue Type: Bug
Components: client
Affects Versions: 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam

There are a few issues with 'bin/yarn' and 'etc/hadoop/yarn-env.sh'. They are loosely related, but the fixes are minor and will go in the same files, so I combined them into one JIRA.

Issues are:
1. bin/yarn has a dangling 'fi' in the last line. Thanks to the shell for being so compliant!
2. YARN_ROOT_LOGGER is defined as INFO,DRFA in yarn-env.sh. That is why the 'bin/yarn' command doesn't show log messages in the client window by default, whereas 'bin/hadoop' shows them correctly (because HADOOP_ROOT_LOGGER is INFO,console by default). This inconsistent behavior needs to be addressed.
3. For each client command, yarn creates a log file at $YARN_LOG_DIR/yarn.log owned by the end user. In a multi-tenant environment, a second user cannot create their own yarn.log in the same place, which causes the exception pasted at the end. By default, it should write to $YARN_LOG_DIR/$USER/yarn.log instead.

Note: I plan to address only #1 and #2 in this JIRA. If the default YARN_ROOT_LOGGER is made consistent with 'bin/hadoop', issue #3 will not occur. The scope of this JIRA is to come close to the 'bin/hadoop' behavior.
Exception:
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /export/apps/hadoop/logs/yarn.log (Permission denied)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:142)
    at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
    at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
    at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:223)
    at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
    at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
    at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
    at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:648)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:514)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
    at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
    at org.apache.log4j.Logger.getLogger(Logger.java:104)
    at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:289)
    at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:109)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1116)
    at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:914)
    at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:604)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:336)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:310)
    at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
    at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:165)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
[jira] [Commented] (YARN-1590) _HOST doesn't expand properly for RM, NM, ProxyServer and JHS
[ https://issues.apache.org/jira/browse/YARN-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868967#comment-13868967 ]

Mohammad Kamrul Islam commented on YARN-1590:

No new functionality was introduced. The patch fixes a bug in the way that is already implemented in other cases. I did test it in a cluster environment, though. Secondly, I ran the failed test case (mvn test -Dtest=TestRMWebServicesApps -Pnative) and it passed for me. It looks like a timing issue.

_HOST doesn't expand properly for RM, NM, ProxyServer and JHS

Key: YARN-1590
URL: https://issues.apache.org/jira/browse/YARN-1590
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 3.0.0, 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Attachments: YARN-1590.1.patch, YARN-1590.2.patch, YARN-1590.3.patch

_HOST is not properly substituted when a VIP address is used. Currently it always uses the host name of the machine and disregards the VIP address. This is true mainly for the RM, NM, WebProxy, and JHS RPC services. It appears to work fine for webservice authentication. On the other hand, the same substitution works fine for the NN and SNN, in RPC as well as webservice.
[jira] [Created] (YARN-1590) _HOST doesn't expand properly for RM, NM, ProxyServer and JHS
Mohammad Kamrul Islam created YARN-1590:

Summary: _HOST doesn't expand properly for RM, NM, ProxyServer and JHS
Key: YARN-1590
URL: https://issues.apache.org/jira/browse/YARN-1590
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: Mohammad Kamrul Islam

_HOST is not properly substituted when a VIP address is used. Currently it always uses the host name of the machine and disregards the VIP address. This is true mainly for the RM, NM, WebProxy, and JHS RPC services. It appears to work fine for webservice authentication. On the other hand, the same substitution works fine for the NN and SNN, in RPC as well as webservice.
[jira] [Updated] (YARN-1590) _HOST doesn't expand properly for RM, NM, ProxyServer and JHS
[ https://issues.apache.org/jira/browse/YARN-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam updated YARN-1590:

Affects Version/s: 2.2.0

_HOST doesn't expand properly for RM, NM, ProxyServer and JHS

Key: YARN-1590
URL: https://issues.apache.org/jira/browse/YARN-1590
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Mohammad Kamrul Islam

_HOST is not properly substituted when a VIP address is used. Currently it always uses the host name of the machine and disregards the VIP address. This is true mainly for the RM, NM, WebProxy, and JHS RPC services. It appears to work fine for webservice authentication. On the other hand, the same substitution works fine for the NN and SNN, in RPC as well as webservice.
[jira] [Assigned] (YARN-1590) _HOST doesn't expand properly for RM, NM, ProxyServer and JHS
[ https://issues.apache.org/jira/browse/YARN-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam reassigned YARN-1590:

Assignee: Mohammad Kamrul Islam

_HOST doesn't expand properly for RM, NM, ProxyServer and JHS

Key: YARN-1590
URL: https://issues.apache.org/jira/browse/YARN-1590
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam

_HOST is not properly substituted when a VIP address is used. Currently it always uses the host name of the machine and disregards the VIP address. This is true mainly for the RM, NM, WebProxy, and JHS RPC services. It appears to work fine for webservice authentication. On the other hand, the same substitution works fine for the NN and SNN, in RPC as well as webservice.
[jira] [Updated] (YARN-1590) _HOST doesn't expand properly for RM, NM, ProxyServer and JHS
[ https://issues.apache.org/jira/browse/YARN-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam updated YARN-1590:

Attachment: YARN-1590.1.patch

_HOST doesn't expand properly for RM, NM, ProxyServer and JHS

Key: YARN-1590
URL: https://issues.apache.org/jira/browse/YARN-1590
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Attachments: YARN-1590.1.patch

_HOST is not properly substituted when a VIP address is used. Currently it always uses the host name of the machine and disregards the VIP address. This is true mainly for the RM, NM, WebProxy, and JHS RPC services. It appears to work fine for webservice authentication. On the other hand, the same substitution works fine for the NN and SNN, in RPC as well as webservice.
[jira] [Updated] (YARN-1590) _HOST doesn't expand properly for RM, NM, ProxyServer and JHS
[ https://issues.apache.org/jira/browse/YARN-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam updated YARN-1590:

Attachment: YARN-1590.2.patch

_HOST doesn't expand properly for RM, NM, ProxyServer and JHS

Key: YARN-1590
URL: https://issues.apache.org/jira/browse/YARN-1590
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 3.0.0, 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Attachments: YARN-1590.1.patch, YARN-1590.2.patch

_HOST is not properly substituted when a VIP address is used. Currently it always uses the host name of the machine and disregards the VIP address. This is true mainly for the RM, NM, WebProxy, and JHS RPC services. It appears to work fine for webservice authentication. On the other hand, the same substitution works fine for the NN and SNN, in RPC as well as webservice.
[jira] [Updated] (YARN-1590) _HOST doesn't expand properly for RM, NM, ProxyServer and JHS
[ https://issues.apache.org/jira/browse/YARN-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam updated YARN-1590:

Attachment: YARN-1590.3.patch

_HOST doesn't expand properly for RM, NM, ProxyServer and JHS

Key: YARN-1590
URL: https://issues.apache.org/jira/browse/YARN-1590
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 3.0.0, 2.2.0
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Attachments: YARN-1590.1.patch, YARN-1590.2.patch, YARN-1590.3.patch

_HOST is not properly substituted when a VIP address is used. Currently it always uses the host name of the machine and disregards the VIP address. This is true mainly for the RM, NM, WebProxy, and JHS RPC services. It appears to work fine for webservice authentication. On the other hand, the same substitution works fine for the NN and SNN, in RPC as well as webservice.
[jira] [Commented] (YARN-49) Improve distributed shell application to work on a secure cluster
[ https://issues.apache.org/jira/browse/YARN-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747319#comment-13747319 ]

Mohammad Kamrul Islam commented on YARN-49:

[~ojoshi] do you have a WIP patch that I can use for the new Giraph AM? It doesn't need to work, though.

Improve distributed shell application to work on a secure cluster

Key: YARN-49
URL: https://issues.apache.org/jira/browse/YARN-49
Project: Hadoop YARN
Issue Type: Sub-task
Components: applications/distributed-shell
Reporter: Hitesh Shah
Assignee: Omkar Vinit Joshi

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-49) Improve distributed shell application to work on a secure cluster
[ https://issues.apache.org/jira/browse/YARN-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13745235#comment-13745235 ]

Mohammad Kamrul Islam commented on YARN-49:

I need it for the new Giraph AM with 2.1.x.

Improve distributed shell application to work on a secure cluster

Key: YARN-49
URL: https://issues.apache.org/jira/browse/YARN-49
Project: Hadoop YARN
Issue Type: Sub-task
Components: applications/distributed-shell
Reporter: Hitesh Shah
Assignee: Omkar Vinit Joshi