[jira] [Commented] (MAPREDUCE-6654) Possible NPE in JobHistoryEventHandler#handleEvent
[ https://issues.apache.org/jira/browse/MAPREDUCE-6654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595970#comment-16595970 ]

Junping Du commented on MAPREDUCE-6654:
---

[~sunilg], I don't have the bandwidth for this JIRA in the short term, so I am just dropping the version. If anyone has interest in taking it forward, please go ahead. PS: we haven't set up version 3.3 in JIRA; I would suggest adding it so we can move unfinished JIRAs from 3.2 to 3.3.

> Possible NPE in JobHistoryEventHandler#handleEvent
> --
>
> Key: MAPREDUCE-6654
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6654
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Xiao Chen
> Assignee: Junping Du
> Priority: Critical
> Attachments: MAPREDUCE-6654-v2.1.patch, MAPREDUCE-6654-v2.patch, MAPREDUCE-6654.patch
>
> I have seen an NPE thrown from {{JobHistoryEventHandler#handleEvent}}:
> {noformat}
> 2016-03-14 16:42:15,231 INFO [Thread-69] org.apache.hadoop.service.AbstractService: Service JobHistoryEventHandler failed in state STOPPED; cause: java.lang.NullPointerException
> java.lang.NullPointerException
>     at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:570)
>     at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:382)
>     at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>     at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>     at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
>     at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
>     at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1651)
>     at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1147)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:573)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:620)
> {noformat}
> In the version this exception is thrown from, the [line|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryEventHandler.java#L586] is:
> {code:java}mi.writeEvent(historyEvent);{code}
> IMHO, this may be caused by an exception in a previous step. Specifically, in a kerberized environment, when creating the event writer (which calls out to decrypt the EEK), the connection to KMS failed. Exception below:
> {noformat}
> 2016-03-14 16:41:57,559 ERROR [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Error JobHistoryEventHandler in handleEvent: EventType: AM_STARTED
> java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.read(SocketInputStream.java:152)
>     at java.net.SocketInputStream.read(SocketInputStream.java:122)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>     at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
>     at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
>     at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
>     at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:520)
>     at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:505)
>     at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:779)
>     at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:185)
>     at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:181)
>     at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:94)
>     at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:181)
>     at
[jira] [Updated] (MAPREDUCE-6654) Possible NPE in JobHistoryEventHandler#handleEvent
[ https://issues.apache.org/jira/browse/MAPREDUCE-6654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6654:
--
    Target Version/s:   (was: 3.2.0)

> Possible NPE in JobHistoryEventHandler#handleEvent
> --
>
> Key: MAPREDUCE-6654
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6654
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Xiao Chen
> Assignee: Junping Du
> Priority: Critical
> Attachments: MAPREDUCE-6654-v2.1.patch, MAPREDUCE-6654-v2.patch, MAPREDUCE-6654.patch
>
> I have seen an NPE thrown from {{JobHistoryEventHandler#handleEvent}}:
> {noformat}
> 2016-03-14 16:42:15,231 INFO [Thread-69] org.apache.hadoop.service.AbstractService: Service JobHistoryEventHandler failed in state STOPPED; cause: java.lang.NullPointerException
> java.lang.NullPointerException
>     at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:570)
>     at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:382)
>     at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>     at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>     at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
>     at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
>     at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1651)
>     at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1147)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:573)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:620)
> {noformat}
> In the version this exception is thrown from, the [line|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryEventHandler.java#L586] is:
> {code:java}mi.writeEvent(historyEvent);{code}
> IMHO, this may be caused by an exception in a previous step. Specifically, in a kerberized environment, when creating the event writer (which calls out to decrypt the EEK), the connection to KMS failed. Exception below:
> {noformat}
> 2016-03-14 16:41:57,559 ERROR [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Error JobHistoryEventHandler in handleEvent: EventType: AM_STARTED
> java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.read(SocketInputStream.java:152)
>     at java.net.SocketInputStream.read(SocketInputStream.java:122)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>     at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
>     at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323)
>     at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
>     at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:520)
>     at org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:505)
>     at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:779)
>     at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:185)
>     at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:181)
>     at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:94)
>     at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:181)
>     at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388)
>     at org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1420)
>     at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1522)
>     at
[jira] [Commented] (MAPREDUCE-7012) 3.0 deployment cannot work with old version MR tar ball, which breaks rolling upgrade
[ https://issues.apache.org/jira/browse/MAPREDUCE-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261476#comment-16261476 ]

Junping Du commented on MAPREDUCE-7012:
---

CC [~andrew.wang], [~vinodkv], [~jlowe].

> 3.0 deployment cannot work with old version MR tar ball, which breaks rolling upgrade
> ---
>
> Key: MAPREDUCE-7012
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7012
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: distributed-cache, mrv2
> Reporter: Junping Du
> Priority: Blocker
>
> I tried to deploy a 3.0 cluster with the 2.9 MR tar ball. The MR job failed with the following error:
> {noformat}
> 2017-11-21 12:42:50,911 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1511295641738_0003_01
> 2017-11-21 12:42:51,070 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2017-11-21 12:42:51,118 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> java.lang.RuntimeException: Unable to determine current user
>     at org.apache.hadoop.conf.Configuration$Resource.getRestrictParserDefault(Configuration.java:254)
>     at org.apache.hadoop.conf.Configuration$Resource.(Configuration.java:220)
>     at org.apache.hadoop.conf.Configuration$Resource.(Configuration.java:212)
>     at org.apache.hadoop.conf.Configuration.addResource(Configuration.java:888)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1638)
> Caused by: java.io.IOException: Exception reading /tmp/nm-local-dir/usercache/jdu/appcache/application_1511295641738_0003/container_e03_1511295641738_0003_01_01/container_tokens
>     at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:208)
>     at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:907)
>     at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:820)
>     at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:689)
>     at org.apache.hadoop.conf.Configuration$Resource.getRestrictParserDefault(Configuration.java:252)
>     ... 4 more
> Caused by: java.io.IOException: Unknown version 1 in token storage.
>     at org.apache.hadoop.security.Credentials.readTokenStorageStream(Credentials.java:226)
>     at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:205)
>     ... 8 more
> 2017-11-21 12:42:51,122 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.lang.RuntimeException: Unable to determine current user
> {noformat}
> I think it is due to a token incompatibility change between 2.9 and 3.0. As we claim "rolling upgrade" is supported in Hadoop 3, we should fix this before we ship 3.0; otherwise all running MR applications will get stuck during/after upgrade.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7012) 3.0 deployment cannot work with old version MR tar ball, which breaks rolling upgrade
Junping Du created MAPREDUCE-7012:
-

             Summary: 3.0 deployment cannot work with old version MR tar ball, which breaks rolling upgrade
                 Key: MAPREDUCE-7012
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7012
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: distributed-cache, mrv2
            Reporter: Junping Du
            Priority: Blocker

I tried to deploy a 3.0 cluster with the 2.9 MR tar ball. The MR job failed with the following error:
{noformat}
2017-11-21 12:42:50,911 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1511295641738_0003_01
2017-11-21 12:42:51,070 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-11-21 12:42:51,118 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
java.lang.RuntimeException: Unable to determine current user
    at org.apache.hadoop.conf.Configuration$Resource.getRestrictParserDefault(Configuration.java:254)
    at org.apache.hadoop.conf.Configuration$Resource.(Configuration.java:220)
    at org.apache.hadoop.conf.Configuration$Resource.(Configuration.java:212)
    at org.apache.hadoop.conf.Configuration.addResource(Configuration.java:888)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1638)
Caused by: java.io.IOException: Exception reading /tmp/nm-local-dir/usercache/jdu/appcache/application_1511295641738_0003/container_e03_1511295641738_0003_01_01/container_tokens
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:208)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:907)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:820)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:689)
    at org.apache.hadoop.conf.Configuration$Resource.getRestrictParserDefault(Configuration.java:252)
    ... 4 more
Caused by: java.io.IOException: Unknown version 1 in token storage.
    at org.apache.hadoop.security.Credentials.readTokenStorageStream(Credentials.java:226)
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:205)
    ... 8 more
2017-11-21 12:42:51,122 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.lang.RuntimeException: Unable to determine current user
{noformat}
I think it is due to a token incompatibility change between 2.9 and 3.0. As we claim "rolling upgrade" is supported in Hadoop 3, we should fix this before we ship 3.0; otherwise all running MR applications will get stuck during/after upgrade.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
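The "Unknown version 1 in token storage" failure above is a format-version mismatch: an older reader rejects credentials written by a newer writer. The shape of the problem can be sketched with a toy serializer. This is an illustrative stand-in, not Hadoop's actual `Credentials` wire format (the real constants, magic bytes, and layout differ).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Toy token-storage format: one leading version byte, then a token string.
// An "old" reader that only understands version 0 rejects version-1 data,
// mirroring the rolling-upgrade failure described in the report.
class TokenStorage {
    static final byte OLD_VERSION = 0;
    static final byte NEW_VERSION = 1;

    static byte[] write(byte version, String token) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeByte(version);   // format version comes first
            out.writeUTF(token);
            out.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Old-reader sketch: anything but version 0 is an error.
    static String readOld(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            byte version = in.readByte();
            if (version != OLD_VERSION) {
                throw new IllegalStateException(
                    "Unknown version " + version + " in token storage.");
            }
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A rolling-upgrade-friendly design would keep the old on-disk version as the default until every reader in the cluster understands the new one.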
[jira] [Updated] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-5889:
--
    Target Version/s: 2.8.3  (was: 2.8.1)

> Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)
> ---
>
> Key: MAPREDUCE-5889
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Akira Ajisaka
> Assignee: Akira Ajisaka
> Priority: Minor
> Attachments: MAPREDUCE-5889.3.patch, MAPREDUCE-5889.4.patch, MAPREDUCE-5889.5.patch, MAPREDUCE-5889.patch, MAPREDUCE-5889.patch
>
> {{FileInputFormat.setInputPaths(Job job, String commaSeparatedPaths)}} and {{FileInputFormat.addInputPaths(Job job, String commaSeparatedPaths)}} fail to parse commaSeparatedPaths if a comma is included in the file path. (e.g. Path: {{/path/file,with,comma}})
> We should deprecate these methods and document to use {{setInputPaths(Job job, Path... inputPaths)}} and {{addInputPaths(Job job, Path... inputPaths)}} instead.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
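The ambiguity behind this deprecation can be shown with a one-line sketch of naive comma splitting (illustrative only, not the real `FileInputFormat` parsing code): a single path that contains commas is indistinguishable from several paths.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the String-overload problem: splitting on "," cannot represent
// a path that itself contains a comma, such as /path/file,with,comma.
class InputPaths {
    static List<String> parseCommaSeparated(String commaSeparatedPaths) {
        return Arrays.asList(commaSeparatedPaths.split(","));
    }
}
```

The `Path...` varargs overloads avoid the problem entirely, because each path arrives as its own argument and no delimiter parsing is needed.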
[jira] [Updated] (MAPREDUCE-5392) "mapred job -history all" command throws IndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/MAPREDUCE-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-5392:
--
    Target Version/s: 2.8.3  (was: 2.8.1)

> "mapred job -history all" command throws IndexOutOfBoundsException
> --
>
> Key: MAPREDUCE-5392
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5392
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 2.0.5-alpha, 2.2.0, 3.0.0-alpha1
> Reporter: Shinichi Yamashita
> Assignee: Shinichi Yamashita
> Labels: BB2015-05-TBR
> Attachments: MAPREDUCE-5392.2.patch, MAPREDUCE-5392.3.patch, MAPREDUCE-5392.4.patch, MAPREDUCE-5392.5.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch, MAPREDUCE-5392.patch
>
> When I use the "all" option of the "mapred job -history" command, the following exception is thrown and the command does not work.
> {code}
> Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: -3
>     at java.lang.String.substring(String.java:1875)
>     at org.apache.hadoop.mapreduce.util.HostUtil.convertTrackerNameToHostName(HostUtil.java:49)
>     at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.getTaskLogsUrl(HistoryViewer.java:459)
>     at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.printAllTaskAttempts(HistoryViewer.java:235)
>     at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.print(HistoryViewer.java:117)
>     at org.apache.hadoop.mapreduce.tools.CLI.viewHistory(CLI.java:472)
>     at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:313)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1233)
> {code}
> This is because a node name recorded in the history file is not prefixed with "tracker_". Therefore, this patch makes modifications so that the history file can be read even if a node name is not prefixed with "tracker_". In addition, it fixes the URL of the displayed task log.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
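The negative-index failure described above comes from unconditionally stripping a "tracker_" prefix from a name that may not carry it. A defensive sketch (a hypothetical helper, not the actual `HostUtil.convertTrackerNameToHostName` code) checks for the prefix first:

```java
// Sketch: stripping "tracker_" from a name that lacks the prefix yields a
// negative substring index (the "String index out of range: -3" above).
// Checking startsWith() first tolerates both forms of the node name.
class TrackerNames {
    static final String PREFIX = "tracker_";

    static String toHostName(String trackerName) {
        String name = trackerName.startsWith(PREFIX)
            ? trackerName.substring(PREFIX.length())
            : trackerName;               // tolerate names without the prefix
        int colon = name.indexOf(':');   // drop any trailing ":port" part
        return colon >= 0 ? name.substring(0, colon) : name;
    }
}
```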
[jira] [Updated] (MAPREDUCE-6498) ClientServiceDelegate should not retry upon AccessControlException
[ https://issues.apache.org/jira/browse/MAPREDUCE-6498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6498:
--
    Target Version/s: 2.8.3  (was: 2.8.1)

> ClientServiceDelegate should not retry upon AccessControlException
> --
>
> Key: MAPREDUCE-6498
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6498
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Peng Zhang
> Assignee: Peng Zhang
> Attachments: MAPREDUCE-6498.1.patch
>
> The MapReduce client will retry forever when the remote AM throws AccessControlException:
> {code:title=MRClientService.java}
> if (job != null && !job.checkAccess(ugi, accessType)) {
>   throw new AccessControlException("User " + ugi.getShortUserName()
>       + " cannot perform operation " + accessType.name() + " on " + jobID);
> }
> {code}
> This issue is similar to MAPREDUCE-6285, which only handled {{AuthenticationException}} subclass of {{AccessControlException}}.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
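The fail-fast idea behind this issue can be sketched as a retry loop that treats authorization failures as permanent. This is a hypothetical structure, not the real `ClientServiceDelegate` logic, and it uses the JDK's `SecurityException` as a stand-in for Hadoop's `AccessControlException` so it stays self-contained.

```java
// Sketch: transient errors are retried up to a limit, but authorization
// failures (which will not heal on retry) are rethrown immediately.
class RetryingInvoker {
    interface Call<T> { T run() throws Exception; }

    static <T> T invoke(Call<T> call, int maxRetries) throws Exception {
        Exception last = null;
        for (int i = 0; i < maxRetries; i++) {
            try {
                return call.run();
            } catch (SecurityException e) {
                // Permanent: the user is simply not authorized. Fail fast.
                throw e;
            } catch (Exception e) {
                last = e;   // transient: remember and retry
            }
        }
        throw last;
    }
}
```

Without the fail-fast branch, a denied user would spin in the retry loop indefinitely, which is exactly the behavior reported.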
[jira] [Updated] (MAPREDUCE-6491) Environment variable handling assumes values should be appended
[ https://issues.apache.org/jira/browse/MAPREDUCE-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6491:
--
    Target Version/s: 2.8.3  (was: 2.8.1)

> Environment variable handling assumes values should be appended
> ---
>
> Key: MAPREDUCE-6491
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6491
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Jason Lowe
> Assignee: Dustin Cote
> Attachments: YARN-2369-10.patch, YARN-2369-1.patch, YARN-2369-2.patch, YARN-2369-3.patch, YARN-2369-4.patch, YARN-2369-5.patch, YARN-2369-6.patch, YARN-2369-7.patch, YARN-2369-8.patch, YARN-2369-9.patch
>
> When processing environment variables for a container context the code assumes that the value should be appended to any pre-existing value in the environment. This may be desired behavior for handling path-like environment variables such as PATH, LD_LIBRARY_PATH, CLASSPATH, etc. but it is a non-intuitive and harmful way to handle any variable that does not have path-like semantics.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
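The two merge policies at issue can be sketched side by side. This is an illustrative toy (not the actual YARN container-launch code): path-like variables are prepended onto the existing value, while plain variables are replaced outright.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the distinction described above: appending is right for
// PATH-style variables but harmful for variables with plain semantics.
class EnvMerge {
    static void apply(Map<String, String> env, String key, String value,
                      boolean pathLike) {
        String existing = env.get(key);
        if (pathLike && existing != null && !existing.isEmpty()) {
            env.put(key, value + ":" + existing);   // prepend, PATH-style
        } else {
            env.put(key, value);                    // plain override
        }
    }

    // Small driver so the behavior is easy to exercise in isolation.
    static String demo(String existing, String value, boolean pathLike) {
        Map<String, String> env = new HashMap<>();
        if (existing != null) {
            env.put("X", existing);
        }
        apply(env, "X", value, pathLike);
        return env.get("X");
    }
}
```

Treating every variable as path-like is what produces the surprising concatenated values the report warns about.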
[jira] [Updated] (MAPREDUCE-6864) Hadoop streaming creates 2 mappers when the input has only one block
[ https://issues.apache.org/jira/browse/MAPREDUCE-6864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6864:
--
    Target Version/s: 2.8.3  (was: 2.8.1)

> Hadoop streaming creates 2 mappers when the input has only one block
>
>
> Key: MAPREDUCE-6864
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6864
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 2.7.3
> Reporter: Daniel Templeton
>
> If a streaming job is run against input that is less than 2 blocks, 2 mappers will be created, both operating on the same split, both producing (duplicate) output. In some cases the second mapper will consistently fail. I've not seen the failure with input less than 10 bytes or more than a couple MB. I have seen it with a 4kB input.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6096) SummarizedJob class NPEs with some jhist files
[ https://issues.apache.org/jira/browse/MAPREDUCE-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6096:
--
    Target Version/s: 2.8.3  (was: 2.8.1)

> SummarizedJob class NPEs with some jhist files
> --
>
> Key: MAPREDUCE-6096
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6096
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobhistoryserver
> Reporter: zhangyubiao
> Assignee: zhangyubiao
> Labels: easyfix, patch
> Attachments: job_1446203652278_66705-1446308686422-dd_edw-insert+overwrite+table+bkactiv...dp%3D%27ACTIVE%27%28Stage-1446308802181-233-0-SUCCEEDED-bdp_jdw_corejob.jhist, MAPREDUCE-6096.patch, MAPREDUCE-6096-v1.patch, MAPREDUCE-6096-v2.patch
>
> When I parse job history files, I use the map-reduce-client-core project's org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser class and HistoryViewer$SummarizedJob to parse the history file (e.g. job_1408862281971_489761-1410883171851_XXX.jhist), and it throws an exception like:
> {noformat}
> Exception in thread "pool-1-thread-1" java.lang.NullPointerException
>     at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer$SummarizedJob.(HistoryViewer.java:626)
>     at com.jd.hadoop.log.parse.ParseLogService.getJobDetail(ParseLogService.java:70)
> {noformat}
> After looking at the SummarizedJob class, I found that attempt.getTaskStatus() is NULL. So I changed the order of attempt.getTaskStatus().equals(TaskStatus.State.FAILED.toString()) to TaskStatus.State.FAILED.toString().equals(attempt.getTaskStatus()), and it works well.
> So I wonder if we can change all attempt.getTaskStatus() calls to put TaskStatus.State.XXX.toString() first?

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
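The fix proposed above is the standard constant-first `equals()` idiom, shown here as a minimal standalone sketch (the `FAILED` constant stands in for `TaskStatus.State.FAILED.toString()`):

```java
// Putting the known constant on the left of equals() makes the comparison
// null-safe: FAILED.equals(null) is simply false, whereas
// taskStatus.equals(FAILED) throws an NPE when taskStatus is null.
class StatusCheck {
    static final String FAILED = "FAILED";

    static boolean isFailed(String taskStatus) {
        return FAILED.equals(taskStatus);
    }
}
```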
[jira] [Updated] (MAPREDUCE-6117) Hadoop ignores yarn.nodemanager.hostname for RPC listeners
[ https://issues.apache.org/jira/browse/MAPREDUCE-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6117:
--
    Target Version/s: 2.8.3  (was: 2.8.1)

> Hadoop ignores yarn.nodemanager.hostname for RPC listeners
> --
>
> Key: MAPREDUCE-6117
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6117
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client, task
> Affects Versions: 2.2.1, 2.4.1, 2.5.1
> Environment: Any mapreduce example with a standard cluster. In our case each node has four networks. It is important that all internode communication be done on a specific network.
> Reporter: Waldyn Benbenek
> Assignee: Waldyn Benbenek
> Labels: BB2015-05-RFC
> Attachments: MapReduce-534.patch, MAPREDUCE-6117.002.patch
>
> Original Estimate: 48h
> Time Spent: 384h
> Remaining Estimate: 0h
>
> The RPC listeners for an application use the hostname of the node as the binding address of the listener; they ignore yarn.nodemanager.hostname. In our setup we want all communication between nodes to be done via the network addresses we specify in yarn.nodemanager.hostname on each node.
> TaskAttemptListenerImpl.java and MRClientService.java are two places I have found where the default address is used rather than NM_host. The NodeManager hostname should be used for all communication between nodes, including the RPC listeners.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6101) on job submission, if input or output directories are encrypted, shuffle data should be encrypted at rest
[ https://issues.apache.org/jira/browse/MAPREDUCE-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6101:
--
    Target Version/s: 2.8.3  (was: 2.8.1)

> on job submission, if input or output directories are encrypted, shuffle data should be encrypted at rest
> -
>
> Key: MAPREDUCE-6101
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6101
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: job submission, security
> Affects Versions: 2.6.0
> Reporter: Alejandro Abdelnur
> Assignee: Arun Suresh
> Attachments: MAPREDUCE-6101.1.patch, MAPREDUCE-6101.2.patch
>
> Currently, shuffle data-at-rest encryption has to be set explicitly to take effect. If it is not set explicitly (ON or OFF) but the input or output HDFS directories of the job are in an encryption zone, we should set it to ON.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6941) The default setting doesn't work for MapReduce job
[ https://issues.apache.org/jira/browse/MAPREDUCE-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154675#comment-16154675 ]

Junping Du commented on MAPREDUCE-6941:
---

Sorry for missing the comments on this JIRA. I think Ray's comments make sense, and I just missed the discussion on MAPREDUCE-6704. +1 on resolving this issue.

> The default setting doesn't work for MapReduce job
> --
>
> Key: MAPREDUCE-6941
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6941
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 3.0.0-beta1
> Reporter: Junping Du
> Priority: Blocker
>
> On a Hadoop 3 cluster deployment (based on the current trunk branch) with default settings, MR jobs fail with the following exception:
> {noformat}
> 2017-08-16 13:00:03,846 INFO mapreduce.Job: Job job_1502913552390_0001 running in uber mode : false
> 2017-08-16 13:00:03,847 INFO mapreduce.Job: map 0% reduce 0%
> 2017-08-16 13:00:03,864 INFO mapreduce.Job: Job job_1502913552390_0001 failed with state FAILED due to: Application application_1502913552390_0001 failed 2 times due to AM Container for appattempt_1502913552390_0001_02 exited with exitCode: 1
> Failing this attempt.Diagnostics: [2017-08-16 13:00:02.963]Exception from container-launch.
> Container id: container_1502913552390_0001_02_01
> Exit code: 1
> Stack trace: ExitCodeException exitCode=1:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:994)
>     at org.apache.hadoop.util.Shell.run(Shell.java:887)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1212)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:295)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:455)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:275)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:90)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This is because the MapReduce-related jars are not added to the YARN classpath by default. To make MR jobs run successfully, we now need to add the following configuration to yarn-site.xml:
> {noformat}
> <property>
>   <name>yarn.application.classpath</name>
>   <value>
>     ...
>     /share/hadoop/mapreduce/*,
>     /share/hadoop/mapreduce/lib/*
>     ...
>   </value>
> </property>
> {noformat}
> But this config was not necessary in previous versions of Hadoop. We should fix this issue before the beta release; otherwise it will be a regression in configuration behavior.
> This could turn out to be more of a YARN issue (if so, we should move it), depending on how we finally fix it.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6931) Remove TestDFSIO "Total Throughput" calculation
[ https://issues.apache.org/jira/browse/MAPREDUCE-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151042#comment-16151042 ]

Junping Du commented on MAPREDUCE-6931:
---

This JIRA is marked as trivial, but we are in the 2.8.2 RC stage. In my practice (different RMs may have different practices), commits below major priority should be skipped at this stage, balancing the importance of the fix against the risk of careless code/merge.

> Remove TestDFSIO "Total Throughput" calculation
> ---
>
> Key: MAPREDUCE-6931
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6931
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: benchmarks, test
> Affects Versions: 2.8.0
> Reporter: Dennis Huo
> Assignee: Dennis Huo
> Priority: Trivial
> Fix For: 2.9.0, 3.0.0-beta1, 2.7.5, 2.8.3
>
> Attachments: MAPREDUCE-6931-001.patch
>
> The new "Total Throughput" line added in https://issues.apache.org/jira/browse/HDFS-9153 is currently calculated as {{toMB(size) / ((float)execTime)}} and claims to be in units of "MB/s", but {{execTime}} is in milliseconds; thus, the reported number is 1/1000x the actual value:
> {code:java}
> String resultLines[] = {
>     "- TestDFSIO - : " + testType,
>     "Date & time: " + new Date(System.currentTimeMillis()),
>     "Number of files: " + tasks,
>     " Total MBytes processed: " + df.format(toMB(size)),
>     " Throughput mb/sec: " + df.format(size * 1000.0 / (time * MEGA)),
>     "Total Throughput mb/sec: " + df.format(toMB(size) / ((float)execTime)),
>     " Average IO rate mb/sec: " + df.format(med),
>     " IO rate std deviation: " + df.format(stdDev),
>     " Test exec time sec: " + df.format((float)execTime / 1000),
>     "" };
> {code}
> The different calculated fields could also use toMB and a shared milliseconds-to-seconds conversion to make it easier to keep units consistent.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6937) Backport MAPREDUCE-6870 to branch-2 while preserving compatibility
[ https://issues.apache.org/jira/browse/MAPREDUCE-6937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149746#comment-16149746 ] Junping Du commented on MAPREDUCE-6937: --- Sounds like we forgot to commit this to branch-2.8.2. Just committed it. > Backport MAPREDUCE-6870 to branch-2 while preserving compatibility > -- > > Key: MAPREDUCE-6937 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6937 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Zhe Zhang >Assignee: Peter Bacsko > Fix For: 2.9.0, 2.8.2, 2.7.5 > > Attachments: MAPREDUCE-6870-branch-2.01.patch, > MAPREDUCE-6870-branch-2.02.patch, MAPREDUCE-6870-branch-2.7.03.patch, > MAPREDUCE-6870-branch-2.7.04.patch, MAPREDUCE-6870-branch-2.7.05.patch, > MAPREDUCE-6870_branch2.7.patch, MAPREDUCE-6870_branch2.7v2.patch, > MAPREDUCE-6870-branch-2.8.03.patch, MAPREDUCE-6870-branch-2.8.04.patch, > MAPREDUCE-6870_branch2.8.patch, MAPREDUCE-6870_branch2.8v2.patch > > > To maintain compatibility we need to disable this by default per discussion > on MAPREDUCE-6870. > Using a separate JIRA to correctly track incompatibilities. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6931) Remove TestDFSIO "Total Throughput" calculation
[ https://issues.apache.org/jira/browse/MAPREDUCE-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149704#comment-16149704 ] Junping Du commented on MAPREDUCE-6931: --- 2.8.2 is in the RC stage. Moving this to 2.8.3, given it only landed on branch-2.8. > Remove TestDFSIO "Total Throughput" calculation > --- > > Key: MAPREDUCE-6931 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6931 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: benchmarks, test >Affects Versions: 2.8.0 >Reporter: Dennis Huo >Assignee: Dennis Huo >Priority: Trivial > Fix For: 2.9.0, 3.0.0-beta1, 2.7.5, 2.8.3 > > Attachments: MAPREDUCE-6931-001.patch > > > The new "Total Throughput" line added in > https://issues.apache.org/jira/browse/HDFS-9153 is currently calculated as > {{toMB(size) / ((float)execTime)}} and claims to be in units of "MB/s", but > {{execTime}} is in milliseconds; thus, the reported number is 1/1000x the > actual value: > {code:java} > String resultLines[] = { > "- TestDFSIO - : " + testType, > "Date & time: " + new Date(System.currentTimeMillis()), > "Number of files: " + tasks, > " Total MBytes processed: " + df.format(toMB(size)), > " Throughput mb/sec: " + df.format(size * 1000.0 / (time * > MEGA)), > "Total Throughput mb/sec: " + df.format(toMB(size) / > ((float)execTime)), > " Average IO rate mb/sec: " + df.format(med), > " IO rate std deviation: " + df.format(stdDev), > " Test exec time sec: " + df.format((float)execTime / 1000), > "" }; > {code} > The different calculated fields can also use toMB and a shared > milliseconds-to-seconds conversion to make it easier to keep units consistent. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6931) Remove TestDFSIO "Total Throughput" calculation
[ https://issues.apache.org/jira/browse/MAPREDUCE-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6931: -- Fix Version/s: (was: 2.8.2) 2.8.3 > Remove TestDFSIO "Total Throughput" calculation > --- > > Key: MAPREDUCE-6931 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6931 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: benchmarks, test >Affects Versions: 2.8.0 >Reporter: Dennis Huo >Assignee: Dennis Huo >Priority: Trivial > Fix For: 2.9.0, 3.0.0-beta1, 2.7.5, 2.8.3 > > Attachments: MAPREDUCE-6931-001.patch > > > The new "Total Throughput" line added in > https://issues.apache.org/jira/browse/HDFS-9153 is currently calculated as > {{toMB(size) / ((float)execTime)}} and claims to be in units of "MB/s", but > {{execTime}} is in milliseconds; thus, the reported number is 1/1000x the > actual value: > {code:java} > String resultLines[] = { > "- TestDFSIO - : " + testType, > "Date & time: " + new Date(System.currentTimeMillis()), > "Number of files: " + tasks, > " Total MBytes processed: " + df.format(toMB(size)), > " Throughput mb/sec: " + df.format(size * 1000.0 / (time * > MEGA)), > "Total Throughput mb/sec: " + df.format(toMB(size) / > ((float)execTime)), > " Average IO rate mb/sec: " + df.format(med), > " IO rate std deviation: " + df.format(stdDev), > " Test exec time sec: " + df.format((float)execTime / 1000), > "" }; > {code} > The different calculated fields can also use toMB and a shared > milliseconds-to-seconds conversion to make it easier to keep units consistent. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
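The unit bug quoted above is easy to see in isolation. The following is a minimal, self-contained sketch (hypothetical values; {{toMB}} and {{MEGA}} only mirror the names in the quoted snippet, not TestDFSIO's actual code): dividing megabytes by a millisecond count yields MB/ms, i.e. 1/1000 of the MB/s figure the label claims.

```java
// Sketch of the TestDFSIO "Total Throughput" unit bug: execTime is in
// milliseconds, so dividing MB by it directly gives MB/ms, not MB/s.
public class ThroughputSketch {
    static final double MEGA = 1024.0 * 1024.0;            // bytes per MB
    static double toMB(long bytes) { return bytes / MEGA; }

    // Buggy form quoted in the issue: MB divided by milliseconds, labeled "mb/sec"
    static double buggyTotalThroughput(long size, long execTimeMs) {
        return toMB(size) / (float) execTimeMs;
    }

    // Unit-consistent form: convert milliseconds to seconds once, then divide
    static double fixedTotalThroughput(long size, long execTimeMs) {
        return toMB(size) / (execTimeMs / 1000.0);
    }

    public static void main(String[] args) {
        long size = 200L * 1024 * 1024;  // 200 MB processed (hypothetical)
        long execTimeMs = 4000L;         // 4 seconds, reported in milliseconds
        System.out.println(buggyTotalThroughput(size, execTimeMs)); // 0.05
        System.out.println(fixedTotalThroughput(size, execTimeMs)); // 50.0
    }
}
```

The same shared milliseconds-to-seconds conversion can then feed the "Test exec time sec" field, keeping all reported units consistent.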
[jira] [Created] (MAPREDUCE-6941) The default setting doesn't work for MapReduce job
Junping Du created MAPREDUCE-6941: - Summary: The default setting doesn't work for MapReduce job Key: MAPREDUCE-6941 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6941 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 3.0.0-beta1 Reporter: Junping Du Priority: Blocker On a Hadoop 3 cluster deployment (based on the current trunk branch) with default settings, MR jobs fail with the following exception: {noformat} 2017-08-16 13:00:03,846 INFO mapreduce.Job: Job job_1502913552390_0001 running in uber mode : false 2017-08-16 13:00:03,847 INFO mapreduce.Job: map 0% reduce 0% 2017-08-16 13:00:03,864 INFO mapreduce.Job: Job job_1502913552390_0001 failed with state FAILED due to: Application application_1502913552390_0001 failed 2 times due to AM Container for appattempt_1502913552390_0001_02 exited with exitCode: 1 Failing this attempt.Diagnostics: [2017-08-16 13:00:02.963]Exception from container-launch. Container id: container_1502913552390_0001_02_01 Exit code: 1 Stack trace: ExitCodeException exitCode=1: at org.apache.hadoop.util.Shell.runCommand(Shell.java:994) at org.apache.hadoop.util.Shell.run(Shell.java:887) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1212) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:295) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:455) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:275) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:90) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} 
This is because the MapReduce-related jars are not added to the YARN classpath by default. To make an MR job run successfully, we currently need to add the following configuration to yarn-site.xml: {noformat} yarn.application.classpath ... /share/hadoop/mapreduce/*, /share/hadoop/mapreduce/lib/* ... {noformat} But this configuration was not necessary in previous versions of Hadoop. We should fix this issue before the beta release; otherwise it will be a regression in required configuration. This could turn out to be more of a YARN issue (if so, we should move it), depending on how we finally fix it. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
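For readers of the archive: the {noformat} block above lost its XML tags in the email flattening. The intended yarn-site.xml entry has roughly this shape (a sketch only; the "..." stands for the other classpath entries elided in the original message, and the paths are abbreviated as given there):

```xml
<!-- Sketch of the workaround described above, not a complete classpath. -->
<property>
  <name>yarn.application.classpath</name>
  <value>
    ...
    /share/hadoop/mapreduce/*,
    /share/hadoop/mapreduce/lib/*
    ...
  </value>
</property>
```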
[jira] [Updated] (MAPREDUCE-6925) CLONE - Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6925: -- Fix Version/s: (was: 3.0.0-alpha1) (was: 2.9.0) > CLONE - Make Counter limits consistent across JobClient, MRAppMaster, and > YarnChild > --- > > Key: MAPREDUCE-6925 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6925 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, client, task >Affects Versions: 2.4.0 >Reporter: Gera Shegalov >Assignee: Gera Shegalov > > Currently, counter limits "mapreduce.job.counters.*" handled by > {{org.apache.hadoop.mapreduce.counters.Limits}} are initialized > asymmetrically: on the client side, and on the AM, job.xml is ignored whereas > it's taken into account in YarnChild. > It would be good to make the Limits job-configurable, such that max > counters/groups is only increased when needed. With the current Limits > implementation relying on static constants, it's going to be challenging for > tools that submit jobs concurrently without resorting to class loading > isolation. > The patch that I am uploading is not perfect but demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6925) CLONE - Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6925: -- Target Version/s: 2.9.0, 3.0.0-beta1 (was: 2.7.0) > CLONE - Make Counter limits consistent across JobClient, MRAppMaster, and > YarnChild > --- > > Key: MAPREDUCE-6925 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6925 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, client, task >Affects Versions: 2.4.0 >Reporter: Gera Shegalov >Assignee: Gera Shegalov > > Currently, counter limits "mapreduce.job.counters.*" handled by > {{org.apache.hadoop.mapreduce.counters.Limits}} are initialized > asymmetrically: on the client side, and on the AM, job.xml is ignored whereas > it's taken into account in YarnChild. > It would be good to make the Limits job-configurable, such that max > counters/groups is only increased when needed. With the current Limits > implementation relying on static constants, it's going to be challenging for > tools that submit jobs concurrently without resorting to class loading > isolation. > The patch that I am uploading is not perfect but demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6925) CLONE - Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6925: -- Hadoop Flags: (was: Reviewed) > CLONE - Make Counter limits consistent across JobClient, MRAppMaster, and > YarnChild > --- > > Key: MAPREDUCE-6925 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6925 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, client, task >Affects Versions: 2.4.0 >Reporter: Gera Shegalov >Assignee: Gera Shegalov > > Currently, counter limits "mapreduce.job.counters.*" handled by > {{org.apache.hadoop.mapreduce.counters.Limits}} are initialized > asymmetrically: on the client side, and on the AM, job.xml is ignored whereas > it's taken into account in YarnChild. > It would be good to make the Limits job-configurable, such that max > counters/groups is only increased when needed. With the current Limits > implementation relying on static constants, it's going to be challenging for > tools that submit jobs concurrently without resorting to class loading > isolation. > The patch that I am uploading is not perfect but demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-5875) Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108080#comment-16108080 ] Junping Du commented on MAPREDUCE-5875: --- bq. At a minimum, -beta1 will need to include a note that says it was removed or, preferably, removed but replaced with something better. I believe MAPREDUCE-6924 already addresses this. > Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild > --- > > Key: MAPREDUCE-5875 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5875 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, client, task >Affects Versions: 2.4.0 >Reporter: Gera Shegalov >Assignee: Gera Shegalov > Fix For: 3.0.0-alpha1 > > Attachments: MAPREDUCE-5875.v01.patch, MAPREDUCE-5875.v02.patch, > MAPREDUCE-5875.v03.patch, MAPREDUCE-5875.v04.patch, MAPREDUCE-5875.v05.patch, > MAPREDUCE-5875.v06.patch, MAPREDUCE-5875.v07.patch, MAPREDUCE-5875.v08.patch, > MAPREDUCE-5875.v09.patch > > > Currently, counter limits "mapreduce.job.counters.*" handled by > {{org.apache.hadoop.mapreduce.counters.Limits}} are initialized > asymmetrically: on the client side, and on the AM, job.xml is ignored whereas > it's taken into account in YarnChild. > It would be good to make the Limits job-configurable, such that max > counters/groups is only increased when needed. With the current Limits > implementation relying on static constants, it's going to be challenging for > tools that submit jobs concurrently without resorting to class loading > isolation. > The patch that I am uploading is not perfect but demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (MAPREDUCE-5875) Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108052#comment-16108052 ] Junping Du edited comment on MAPREDUCE-5875 at 7/31/17 10:05 PM: - Dropping 2.9.0 as the patch was reverted from branch-2. Just cloned MAPREDUCE-6925 for an improved fix. To be clear, this fix never landed in any non-alpha release, so we shouldn't claim the fix was released, nor call the revert a regression. was (Author: djp): Drop 2.9.0 as the patch get reverted from branch-2. Just clone MAPREDUCE-6924 for improved fix. To be clear, this fix never land in any non-alpha release, so we shouldn't claim the fix is released or any regression due to revert of this fix. > Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild > --- > > Key: MAPREDUCE-5875 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5875 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, client, task >Affects Versions: 2.4.0 >Reporter: Gera Shegalov >Assignee: Gera Shegalov > Fix For: 3.0.0-alpha1 > > Attachments: MAPREDUCE-5875.v01.patch, MAPREDUCE-5875.v02.patch, > MAPREDUCE-5875.v03.patch, MAPREDUCE-5875.v04.patch, MAPREDUCE-5875.v05.patch, > MAPREDUCE-5875.v06.patch, MAPREDUCE-5875.v07.patch, MAPREDUCE-5875.v08.patch, > MAPREDUCE-5875.v09.patch > > > Currently, counter limits "mapreduce.job.counters.*" handled by > {{org.apache.hadoop.mapreduce.counters.Limits}} are initialized > asymmetrically: on the client side, and on the AM, job.xml is ignored whereas > it's taken into account in YarnChild. > It would be good to make the Limits job-configurable, such that max > counters/groups is only increased when needed. With the current Limits > implementation relying on static constants, it's going to be challenging for > tools that submit jobs concurrently without resorting to class loading > isolation. > The patch that I am uploading is not perfect but demonstrates the issue. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-5875) Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108052#comment-16108052 ] Junping Du commented on MAPREDUCE-5875: --- Dropping 2.9.0 as the patch was reverted from branch-2. Just cloned MAPREDUCE-6924 for an improved fix. To be clear, this fix never landed in any non-alpha release, so we shouldn't claim the fix was released, nor call the revert a regression. > Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild > --- > > Key: MAPREDUCE-5875 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5875 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, client, task >Affects Versions: 2.4.0 >Reporter: Gera Shegalov >Assignee: Gera Shegalov > Fix For: 3.0.0-alpha1 > > Attachments: MAPREDUCE-5875.v01.patch, MAPREDUCE-5875.v02.patch, > MAPREDUCE-5875.v03.patch, MAPREDUCE-5875.v04.patch, MAPREDUCE-5875.v05.patch, > MAPREDUCE-5875.v06.patch, MAPREDUCE-5875.v07.patch, MAPREDUCE-5875.v08.patch, > MAPREDUCE-5875.v09.patch > > > Currently, counter limits "mapreduce.job.counters.*" handled by > {{org.apache.hadoop.mapreduce.counters.Limits}} are initialized > asymmetrically: on the client side, and on the AM, job.xml is ignored whereas > it's taken into account in YarnChild. > It would be good to make the Limits job-configurable, such that max > counters/groups is only increased when needed. With the current Limits > implementation relying on static constants, it's going to be challenging for > tools that submit jobs concurrently without resorting to class loading > isolation. > The patch that I am uploading is not perfect but demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-5875) Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-5875: -- Fix Version/s: (was: 2.9.0) > Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild > --- > > Key: MAPREDUCE-5875 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5875 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, client, task >Affects Versions: 2.4.0 >Reporter: Gera Shegalov >Assignee: Gera Shegalov > Fix For: 3.0.0-alpha1 > > Attachments: MAPREDUCE-5875.v01.patch, MAPREDUCE-5875.v02.patch, > MAPREDUCE-5875.v03.patch, MAPREDUCE-5875.v04.patch, MAPREDUCE-5875.v05.patch, > MAPREDUCE-5875.v06.patch, MAPREDUCE-5875.v07.patch, MAPREDUCE-5875.v08.patch, > MAPREDUCE-5875.v09.patch > > > Currently, counter limits "mapreduce.job.counters.*" handled by > {{org.apache.hadoop.mapreduce.counters.Limits}} are initialized > asymmetrically: on the client side, and on the AM, job.xml is ignored whereas > it's taken into account in YarnChild. > It would be good to make the Limits job-configurable, such that max > counters/groups is only increased when needed. With the current Limits > implementation relying on static constants, it's going to be challenging for > tools that submit jobs concurrently without resorting to class loading > isolation. > The patch that I am uploading is not perfect but demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-6925) CLONE - Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
Junping Du created MAPREDUCE-6925: - Summary: CLONE - Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild Key: MAPREDUCE-6925 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6925 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, client, task Affects Versions: 2.4.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Fix For: 2.9.0, 3.0.0-alpha1 Currently, counter limits "mapreduce.job.counters.*" handled by {{org.apache.hadoop.mapreduce.counters.Limits}} are initialized asymmetrically: on the client side, and on the AM, job.xml is ignored whereas it's taken into account in YarnChild. It would be good to make the Limits job-configurable, such that max counters/groups is only increased when needed. With the current Limits implementation relying on static constants, it's going to be challenging for tools that submit jobs concurrently without resorting to class loading isolation. The patch that I am uploading is not perfect but demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6199) AbstractCounters are not reset completely on deserialization
[ https://issues.apache.org/jira/browse/MAPREDUCE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6199: -- Fix Version/s: (was: 3.0.0-alpha1) (was: 2.8.0) > AbstractCounters are not reset completely on deserialization > > > Key: MAPREDUCE-6199 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6199 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: mr-6199.001.patch, mr-6199.001.patch, mr-6199.002.patch > > > AbstractCounters are partially reset on deserialization. This patch > completely resets them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Reopened] (MAPREDUCE-6199) AbstractCounters are not reset completely on deserialization
[ https://issues.apache.org/jira/browse/MAPREDUCE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du reopened MAPREDUCE-6199: --- > AbstractCounters are not reset completely on deserialization > > > Key: MAPREDUCE-6199 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6199 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: mr-6199.001.patch, mr-6199.001.patch, mr-6199.002.patch > > > AbstractCounters are partially reset on deserialization. This patch > completely resets them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-6199) AbstractCounters are not reset completely on deserialization
[ https://issues.apache.org/jira/browse/MAPREDUCE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du resolved MAPREDUCE-6199. --- Resolution: Won't Fix > AbstractCounters are not reset completely on deserialization > > > Key: MAPREDUCE-6199 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6199 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: mr-6199.001.patch, mr-6199.001.patch, mr-6199.002.patch > > > AbstractCounters are partially reset on deserialization. This patch > completely resets them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6199) AbstractCounters are not reset completely on deserialization
[ https://issues.apache.org/jira/browse/MAPREDUCE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107992#comment-16107992 ] Junping Du commented on MAPREDUCE-6199: --- MAPREDUCE-5875 was reverted, so the patch here is no longer needed. Reopening and resolving as Won't Fix. > AbstractCounters are not reset completely on deserialization > > > Key: MAPREDUCE-6199 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6199 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: mr-6199.001.patch, mr-6199.001.patch, mr-6199.002.patch > > > AbstractCounters are partially reset on deserialization. This patch > completely resets them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6286) A typo in HistoryViewer makes some code useless, which causes counter limits are not reset correctly.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6286: -- Fix Version/s: (was: 3.0.0-alpha1) (was: 2.9.0) > A typo in HistoryViewer makes some code useless, which causes counter limits > are not reset correctly. > - > > Key: MAPREDUCE-6286 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6286 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 2.6.0 >Reporter: zhihai xu >Assignee: zhihai xu > Attachments: MAPREDUCE-6286.000.patch > > > A typo in HistoryViewer makes some code useless and it causes counter limits > are not reset correctly. > The typo is > Limits.reset(conf); > We should use jobConf instead of conf. > With the typo, the following code becomes useless: > {code} > final Path jobConfPath = new Path(jobFile.getParent(), jobDetails[0] > + "_" + jobDetails[1] + "_" + jobDetails[2] + "_conf.xml"); > final Configuration jobConf = new Configuration(conf); > jobConf.addResource(fs.open(jobConfPath), jobConfPath.toString()); > {code} > The code wants to load the configuration from the Job configuration file and > reset the Limits based on the new configuration loaded from the Job > configuration file. But with the typo, the Limits are reset with the old > configuration. > So this typo is apparent. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Resolved] (MAPREDUCE-6286) A typo in HistoryViewer makes some code useless, which causes counter limits are not reset correctly.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du resolved MAPREDUCE-6286. --- Resolution: Won't Fix MAPREDUCE-5875 was reverted, so the patch here is no longer needed. Reopening and resolving as Won't Fix. > A typo in HistoryViewer makes some code useless, which causes counter limits > are not reset correctly. > - > > Key: MAPREDUCE-6286 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6286 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 2.6.0 >Reporter: zhihai xu >Assignee: zhihai xu > Fix For: 2.9.0, 3.0.0-alpha1 > > Attachments: MAPREDUCE-6286.000.patch > > > A typo in HistoryViewer makes some code useless and it causes counter limits > are not reset correctly. > The typo is > Limits.reset(conf); > We should use jobConf instead of conf. > With the typo, the following code becomes useless: > {code} > final Path jobConfPath = new Path(jobFile.getParent(), jobDetails[0] > + "_" + jobDetails[1] + "_" + jobDetails[2] + "_conf.xml"); > final Configuration jobConf = new Configuration(conf); > jobConf.addResource(fs.open(jobConfPath), jobConfPath.toString()); > {code} > The code wants to load the configuration from the Job configuration file and > reset the Limits based on the new configuration loaded from the Job > configuration file. But with the typo, the Limits are reset with the old > configuration. > So this typo is apparent. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Reopened] (MAPREDUCE-6286) A typo in HistoryViewer makes some code useless, which causes counter limits are not reset correctly.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du reopened MAPREDUCE-6286: --- > A typo in HistoryViewer makes some code useless, which causes counter limits > are not reset correctly. > - > > Key: MAPREDUCE-6286 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6286 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: client >Affects Versions: 2.6.0 >Reporter: zhihai xu >Assignee: zhihai xu > Fix For: 2.9.0, 3.0.0-alpha1 > > Attachments: MAPREDUCE-6286.000.patch > > > A typo in HistoryViewer makes some code useless and it causes counter limits > are not reset correctly. > The typo is > Limits.reset(conf); > We should use jobConf instead of conf. > With the typo, the following code becomes useless: > {code} > final Path jobConfPath = new Path(jobFile.getParent(), jobDetails[0] > + "_" + jobDetails[1] + "_" + jobDetails[2] + "_conf.xml"); > final Configuration jobConf = new Configuration(conf); > jobConf.addResource(fs.open(jobConfPath), jobConfPath.toString()); > {code} > The code wants to load the configuration from the Job configuration file and > reset the Limits based on the new configuration loaded from the Job > configuration file. But with the typo, the Limits are reset with the old > configuration. > So this typo is apparent. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
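The HistoryViewer typo described in MAPREDUCE-6286 is easy to reproduce in miniature. The sketch below uses hypothetical stand-ins for Hadoop's Configuration and Limits classes (plain maps and a static field, not the real APIs): resetting limits from the outer conf silently discards everything loaded into the per-job jobConf, which is exactly why the quoted loading code becomes useless.

```java
// Self-contained illustration of the Limits.reset(conf) vs Limits.reset(jobConf)
// typo. The names mirror the issue description; the types are simplified stand-ins.
import java.util.HashMap;
import java.util.Map;

public class LimitsTypoSketch {
    static int maxCounters;

    // Stand-in for Limits.reset(Configuration): read the limit, defaulting to 120
    static void reset(Map<String, Integer> conf) {
        maxCounters = conf.getOrDefault("mapreduce.job.counters.max", 120);
    }

    public static void main(String[] args) {
        Map<String, Integer> conf = new HashMap<>();        // process-wide defaults
        Map<String, Integer> jobConf = new HashMap<>(conf); // per-job copy
        jobConf.put("mapreduce.job.counters.max", 500);     // as if loaded from the job's _conf.xml

        reset(conf);     // the typo: the job-specific limit is silently dropped
        System.out.println(maxCounters); // 120
        reset(jobConf);  // the fix: the per-job limit takes effect
        System.out.println(maxCounters); // 500
    }
}
```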
[jira] [Commented] (MAPREDUCE-5875) Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107986#comment-16107986 ] Junping Du commented on MAPREDUCE-5875: --- As discussed in MAPREDUCE-6288, I have reverted this patch from trunk and branch-2. We need to reopen this JIRA for an improved solution, but it looks like I cannot reopen it myself; could someone with permission do so? > Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild > --- > > Key: MAPREDUCE-5875 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5875 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster, client, task >Affects Versions: 2.4.0 >Reporter: Gera Shegalov >Assignee: Gera Shegalov > Fix For: 2.9.0, 3.0.0-alpha1 > > Attachments: MAPREDUCE-5875.v01.patch, MAPREDUCE-5875.v02.patch, > MAPREDUCE-5875.v03.patch, MAPREDUCE-5875.v04.patch, MAPREDUCE-5875.v05.patch, > MAPREDUCE-5875.v06.patch, MAPREDUCE-5875.v07.patch, MAPREDUCE-5875.v08.patch, > MAPREDUCE-5875.v09.patch > > > Currently, counter limits "mapreduce.job.counters.*" handled by > {{org.apache.hadoop.mapreduce.counters.Limits}} are initialized > asymmetrically: on the client side, and on the AM, job.xml is ignored whereas > it's taken into account in YarnChild. > It would be good to make the Limits job-configurable, such that max > counters/groups is only increased when needed. With the current Limits > implementation relying on static constants, it's going to be challenging for > tools that submit jobs concurrently without resorting to class loading > isolation. > The patch that I am uploading is not perfect but demonstrates the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Assigned] (MAPREDUCE-5875) Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du reassigned MAPREDUCE-5875:
-
Assignee: Gera Shegalov (was: Junping Du)
[jira] [Assigned] (MAPREDUCE-5875) Make Counter limits consistent across JobClient, MRAppMaster, and YarnChild
[ https://issues.apache.org/jira/browse/MAPREDUCE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du reassigned MAPREDUCE-5875:
-
Assignee: Junping Du (was: Gera Shegalov)
[jira] [Resolved] (MAPREDUCE-6288) mapred job -status fails with AccessControlException
[ https://issues.apache.org/jira/browse/MAPREDUCE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du resolved MAPREDUCE-6288.
---
Resolution: Done

I have reverted MAPREDUCE-6286, MAPREDUCE-6199 and MAPREDUCE-5875 from trunk and branch-2. Resolving this jira as Done, since we will reopen MAPREDUCE-5875 for a better solution.

> mapred job -status fails with AccessControlException
> -
>
> Key: MAPREDUCE-6288
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6288
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.8.0
> Reporter: Robert Kanter
> Priority: Blocker
> Attachments: MAPREDUCE-6288.002.patch, MAPREDUCE-6288-gera-001.patch, MAPREDUCE-6288.patch
>
>
> After MAPREDUCE-5875, we're seeing this Exception when trying to do {{mapred job -status job_1427080398288_0001}}
> {noformat}
> Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=jenkins, access=EXECUTE, inode="/user/history/done":mapred:hadoop:drwxrwx---
> at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
> at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
> at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkTraverse(DefaultAuthorizationProvider.java:180)
> at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:137)
> at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6553)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6535)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6460)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1919)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1870)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1850)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1822)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:545)
> at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
> at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
> at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1213)
> at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1201)
> at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1191)
> at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:299)
> at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:265)
> at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:257)
> at
[jira] [Commented] (MAPREDUCE-6288) mapred job -status fails with AccessControlException
[ https://issues.apache.org/jira/browse/MAPREDUCE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16104470#comment-16104470 ] Junping Du commented on MAPREDUCE-6288:
---
Yep. We kept reverting it from the 2.7.x releases and 2.8, so this patch hasn't been released yet. I will revert them tomorrow if there are no further concerns from others.
[jira] [Commented] (MAPREDUCE-6288) mapred job -status fails with AccessControlException
[ https://issues.apache.org/jira/browse/MAPREDUCE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16104162#comment-16104162 ] Junping Du commented on MAPREDUCE-6288:
---
Thus, I would suggest we reopen MAPREDUCE-5875 for a better solution when we revert the original patches from trunk/branch-2.
[jira] [Commented] (MAPREDUCE-6288) mapred job -status fails with AccessControlException
[ https://issues.apache.org/jira/browse/MAPREDUCE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16104157#comment-16104157 ] Junping Du commented on MAPREDUCE-6288:
---
[~rkanter], I think everyone here agrees that MAPREDUCE-5875 indeed fixes a critical problem. But the problem it introduces is actually a blocker issue, and no one has committed the effort for a decent solution so far. [~andrew.wang] has already deferred this several times for 3.0-alpha, but I don't think we should repeat that practice for beta/GA: this will definitely become a real blocker there.
[jira] [Commented] (MAPREDUCE-6288) mapred job -status fails with AccessControlException
[ https://issues.apache.org/jira/browse/MAPREDUCE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16103827#comment-16103827 ] Junping Du commented on MAPREDUCE-6288:
---
This has been pending for a while. It sounds like no one objects to reverting MAPREDUCE-5875 from trunk and branch-2. I will go ahead and revert it (as well as MAPREDUCE-6286 and MAPREDUCE-6199) after 24 hours. If anyone here has a concern, please call it out before I revert them tomorrow.
[jira] [Created] (MAPREDUCE-6915) On branch-2 ResourceManager failed to start
Junping Du created MAPREDUCE-6915:
-
Summary: On branch-2 ResourceManager failed to start
Key: MAPREDUCE-6915
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6915
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.9
Reporter: Junping Du

On a build against branch-2, the ResourceManager fails to start because of the following error:
{noformat}
2017-07-16 23:33:15,688 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
java.lang.NoSuchMethodError: org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.ContainerAllocationExpirer.setMonitorInterval(I)V
at org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.ContainerAllocationExpirer.serviceInit(ContainerAllocationExpirer.java:44)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:684)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1005)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:285)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1283)
{noformat}
[jira] [Commented] (MAPREDUCE-6288) mapred job -status fails with AccessControlException
[ https://issues.apache.org/jira/browse/MAPREDUCE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16090497#comment-16090497 ] Junping Du commented on MAPREDUCE-6288:
---
bq. If nobody steps up to address this then the simplest path forward is to revert MAPREDUCE-5875 from branch-2 and trunk until someone can properly fix it.
+1. We should revert the patch from trunk/branch-2, just like we did in 2.7 and 2.8, until someone has a proper fix.
[jira] [Updated] (MAPREDUCE-5621) mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time
[ https://issues.apache.org/jira/browse/MAPREDUCE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-5621: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.2 Status: Resolved (was: Patch Available) Thanks [~jianhe] for the additional review. I have committed the patch to branch-2, branch-2.8 and branch-2.8.2. Thanks [~sinchii] for the patch contribution! > mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time > > > Key: MAPREDUCE-5621 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5621 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobhistoryserver >Affects Versions: 2.8.0 >Reporter: Shinichi Yamashita >Assignee: Shinichi Yamashita >Priority: Minor > Labels: BB2015-05-TBR > Fix For: 2.8.2 > > Attachments: MAPREDUCE-5621-branch-2.02.patch, > MAPREDUCE-5621-branch-2.patch, MAPREDUCE-5621.patch > > > mr-jobhistory-daemon.sh executes mkdir and chown commands to set up the log > output directory. > These always run whether or not the directory exists, and they run not only > when starting the daemon but also when stopping it. > An "if" guard like the one in hadoop-daemon.sh and yarn-daemon.sh should be added to control this. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
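The committed change is a shell-script guard, but the idea can be sketched in Java as well (the class and method names below are illustrative, not part of Hadoop or of mr-jobhistory-daemon.sh): do the directory setup work only when the log directory is actually missing, instead of unconditionally on every start and stop.

```java
import java.io.File;

// Hypothetical sketch of the guard the patch adds to mr-jobhistory-daemon.sh:
// only create (and, in the real script, chown) the log directory when it is
// missing, rather than running mkdir/chown on every invocation.
public class LogDirGuard {
    // Returns true if the directory had to be created, false if it already
    // existed and nothing was done (the equivalent of skipping mkdir/chown).
    public static boolean ensureLogDir(File dir) {
        if (dir.isDirectory()) {
            return false; // directory present: no work, matching the "if" guard
        }
        return dir.mkdirs(); // only here would the real script also chown
    }
}
```

The shell form of the same guard is simply `if [ ! -d "$DIR" ]; then mkdir -p "$DIR"; chown ...; fi`, which is what hadoop-daemon.sh and yarn-daemon.sh already do.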
[jira] [Updated] (MAPREDUCE-5621) mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time
[ https://issues.apache.org/jira/browse/MAPREDUCE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-5621: -- Attachment: MAPREDUCE-5621-branch-2.02.patch Fixed the warning issue in the 02 patch. > mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time > > > Key: MAPREDUCE-5621 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5621 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobhistoryserver >Affects Versions: 2.8.0 >Reporter: Shinichi Yamashita >Assignee: Shinichi Yamashita >Priority: Minor > Labels: BB2015-05-TBR > Attachments: MAPREDUCE-5621-branch-2.02.patch, > MAPREDUCE-5621-branch-2.patch, MAPREDUCE-5621.patch > > > mr-jobhistory-daemon.sh executes mkdir and chown commands to set up the log > output directory. > These always run whether or not the directory exists, and they run not only > when starting the daemon but also when stopping it. > An "if" guard like the one in hadoop-daemon.sh and yarn-daemon.sh should be added to control this.
[jira] [Updated] (MAPREDUCE-5621) mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time
[ https://issues.apache.org/jira/browse/MAPREDUCE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-5621: -- Affects Version/s: (was: 3.0.0-alpha1) Status: Patch Available (was: Reopened) > mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time > > > Key: MAPREDUCE-5621 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5621 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobhistoryserver >Affects Versions: 2.8.0 >Reporter: Shinichi Yamashita >Assignee: Shinichi Yamashita >Priority: Minor > Labels: BB2015-05-TBR > Attachments: MAPREDUCE-5621-branch-2.patch, MAPREDUCE-5621.patch > > > mr-jobhistory-daemon.sh executes mkdir and chown commands to set up the log > output directory. > These always run whether or not the directory exists, and they run not only > when starting the daemon but also when stopping it. > An "if" guard like the one in hadoop-daemon.sh and yarn-daemon.sh should be added to control this.
[jira] [Updated] (MAPREDUCE-5621) mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time
[ https://issues.apache.org/jira/browse/MAPREDUCE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-5621: -- Attachment: MAPREDUCE-5621-branch-2.patch Renamed the patch for branch-2. The latest patch LGTM. Will commit it in 24 hours if there are no further comments. > mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time > > > Key: MAPREDUCE-5621 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5621 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobhistoryserver >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Shinichi Yamashita >Assignee: Shinichi Yamashita >Priority: Minor > Labels: BB2015-05-TBR > Attachments: MAPREDUCE-5621-branch-2.patch, MAPREDUCE-5621.patch > > > mr-jobhistory-daemon.sh executes mkdir and chown commands to set up the log > output directory. > These always run whether or not the directory exists, and they run not only > when starting the daemon but also when stopping it. > An "if" guard like the one in hadoop-daemon.sh and yarn-daemon.sh should be added to control this.
[jira] [Reopened] (MAPREDUCE-5621) mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time
[ https://issues.apache.org/jira/browse/MAPREDUCE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du reopened MAPREDUCE-5621: --- Reopening this ticket as the issue still exists on branch-2. > mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time > > > Key: MAPREDUCE-5621 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5621 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobhistoryserver >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Shinichi Yamashita >Assignee: Shinichi Yamashita >Priority: Minor > Labels: BB2015-05-TBR > Attachments: MAPREDUCE-5621.patch > > > mr-jobhistory-daemon.sh executes mkdir and chown commands to set up the log > output directory. > These always run whether or not the directory exists, and they run not only > when starting the daemon but also when stopping it. > An "if" guard like the one in hadoop-daemon.sh and yarn-daemon.sh should be added to control this.
[jira] [Updated] (MAPREDUCE-5621) mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time
[ https://issues.apache.org/jira/browse/MAPREDUCE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-5621: -- Affects Version/s: 2.8.0 > mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time > > > Key: MAPREDUCE-5621 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5621 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobhistoryserver >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Shinichi Yamashita >Assignee: Shinichi Yamashita >Priority: Minor > Labels: BB2015-05-TBR > Attachments: MAPREDUCE-5621.patch > > > mr-jobhistory-daemon.sh executes mkdir and chown commands to set up the log > output directory. > These always run whether or not the directory exists, and they run not only > when starting the daemon but also when stopping it. > An "if" guard like the one in hadoop-daemon.sh and yarn-daemon.sh should be added to control this.
[jira] [Updated] (MAPREDUCE-5621) mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time
[ https://issues.apache.org/jira/browse/MAPREDUCE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-5621: -- Target Version/s: 2.9 > mr-jobhistory-daemon.sh doesn't have to execute mkdir and chown all the time > > > Key: MAPREDUCE-5621 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5621 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobhistoryserver >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Shinichi Yamashita >Assignee: Shinichi Yamashita >Priority: Minor > Labels: BB2015-05-TBR > Attachments: MAPREDUCE-5621.patch > > > mr-jobhistory-daemon.sh executes mkdir and chown commands to set up the log > output directory. > These always run whether or not the directory exists, and they run not only > when starting the daemon but also when stopping it. > An "if" guard like the one in hadoop-daemon.sh and yarn-daemon.sh should be added to control this.
[jira] [Updated] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2
[ https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6246: -- Fix Version/s: 3.0.0-beta1 > DBOutputFormat.java appending extra semicolon to query which is incompatible > with DB2 > - > > Key: MAPREDUCE-6246 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.4.1 > Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x > Platform: xSeries, pSeries > Browser: Firefox, IE > Security Settings: No Security, Flat file, LDAP, PAM > File System: HDFS, GPFS FPO >Reporter: ramtin >Assignee: Gergely Novák > Fix For: 3.0.0-beta1, 2.8.2 > > Attachments: MAPREDUCE-6246.002.patch, MAPREDUCE-6246.003.patch, > MAPREDUCE-6246.004.patch, MAPREDUCE-6246.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > DBOutputFormat is used for writing the output of MapReduce jobs to a database; > when used with DB2 JDBC drivers it fails with the following error: > com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, > SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, > DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at > com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127) > In the DBOutputFormat class, the constructQuery method generates an "INSERT > INTO" statement with a semicolon (";") at the end. > The semicolon is the ANSI SQL-92 statement terminator, but this feature is > disabled (OFF) by default in IBM DB2. > It can be turned ON for DB2 with the -t option > (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2). > But some products are already built on top of this default setting (OFF), so > turning this feature ON would make them error-prone. 
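The behavior at issue can be sketched as follows. This is an illustrative reimplementation, not the actual Hadoop DBOutputFormat source: it shows the parameterized INSERT statement being built without the trailing semicolon that DB2 rejects when its statement terminator is OFF (the default).

```java
// Illustrative sketch (not the actual Hadoop code) of a
// DBOutputFormat.constructQuery-style method with the fix applied:
// the statement ends after the closing parenthesis, with no ';'.
public class ConstructQuerySketch {
    public static String constructQuery(String table, String[] fieldNames) {
        StringBuilder query = new StringBuilder("INSERT INTO ").append(table);
        query.append(" (");
        for (int i = 0; i < fieldNames.length; i++) {
            if (i > 0) query.append(",");
            query.append(fieldNames[i]);
        }
        query.append(") VALUES (");
        for (int i = 0; i < fieldNames.length; i++) {
            if (i > 0) query.append(",");
            query.append("?"); // one JDBC bind parameter per field
        }
        query.append(")"); // no trailing ';': DB2-safe, and still valid JDBC
        return query.toString();
    }
}
```

Dropping the semicolon costs nothing on other databases, since JDBC drivers generally do not require a statement terminator on a single statement, which is why the patch removes it rather than making it configurable.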
[jira] [Updated] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2
[ https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6246: -- Resolution: Fixed Assignee: Gergely Novák (was: ramtin) Hadoop Flags: Reviewed Fix Version/s: 2.8.2 Status: Resolved (was: Patch Available) I have committed the patch to trunk, branch-2, branch-2.8 and branch-2.8.2. Thanks [~ramtinb] and [~GergelyNovak] for the patch contribution! > DBOutputFormat.java appending extra semicolon to query which is incompatible > with DB2 > - > > Key: MAPREDUCE-6246 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.4.1 > Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x > Platform: xSeries, pSeries > Browser: Firefox, IE > Security Settings: No Security, Flat file, LDAP, PAM > File System: HDFS, GPFS FPO >Reporter: ramtin >Assignee: Gergely Novák > Fix For: 2.8.2 > > Attachments: MAPREDUCE-6246.002.patch, MAPREDUCE-6246.003.patch, > MAPREDUCE-6246.004.patch, MAPREDUCE-6246.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > DBOutputFormat is used for writing the output of MapReduce jobs to a database; > when used with DB2 JDBC drivers it fails with the following error: > com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, > SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, > DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at > com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127) > In the DBOutputFormat class, the constructQuery method generates an "INSERT > INTO" statement with a semicolon (";") at the end. > The semicolon is the ANSI SQL-92 statement terminator, but this feature is > disabled (OFF) by default in IBM DB2. > It can be turned ON for DB2 with the -t option 
> (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2). > But some products are already built on top of this default setting (OFF), so > turning this feature ON would make them error-prone.
[jira] [Updated] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2
[ https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6246: -- Labels: (was: BB2015-05-RFC) Status: Patch Available (was: Open) Thanks [~GergelyNovak]! Submitting the patch to kick off Jenkins. > DBOutputFormat.java appending extra semicolon to query which is incompatible > with DB2 > - > > Key: MAPREDUCE-6246 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.4.1 > Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x > Platform: xSeries, pSeries > Browser: Firefox, IE > Security Settings: No Security, Flat file, LDAP, PAM > File System: HDFS, GPFS FPO >Reporter: ramtin >Assignee: ramtin > Attachments: MAPREDUCE-6246.002.patch, MAPREDUCE-6246.003.patch, > MAPREDUCE-6246.004.patch, MAPREDUCE-6246.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > DBOutputFormat is used for writing the output of MapReduce jobs to a database; > when used with DB2 JDBC drivers it fails with the following error: > com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, > SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, > DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at > com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127) > In the DBOutputFormat class, the constructQuery method generates an "INSERT > INTO" statement with a semicolon (";") at the end. > The semicolon is the ANSI SQL-92 statement terminator, but this feature is > disabled (OFF) by default in IBM DB2. > It can be turned ON for DB2 with the -t option > (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2). > But some products are already built on top of this default setting (OFF), so > turning this feature ON would make them error-prone. 
[jira] [Updated] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2
[ https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6246: -- Status: Open (was: Patch Available) > DBOutputFormat.java appending extra semicolon to query which is incompatible > with DB2 > - > > Key: MAPREDUCE-6246 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.4.1 > Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x > Platform: xSeries, pSeries > Browser: Firefox, IE > Security Settings: No Security, Flat file, LDAP, PAM > File System: HDFS, GPFS FPO >Reporter: ramtin >Assignee: ramtin > Labels: BB2015-05-RFC > Attachments: MAPREDUCE-6246.002.patch, MAPREDUCE-6246.003.patch, > MAPREDUCE-6246.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > DBOutputFormat is used for writing the output of MapReduce jobs to a database; > when used with DB2 JDBC drivers it fails with the following error: > com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, > SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, > DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at > com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127) > In the DBOutputFormat class, the constructQuery method generates an "INSERT > INTO" statement with a semicolon (";") at the end. > The semicolon is the ANSI SQL-92 statement terminator, but this feature is > disabled (OFF) by default in IBM DB2. > It can be turned ON for DB2 with the -t option > (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2). > But some products are already built on top of this default setting (OFF), so > turning this feature ON would make them error-prone. 
[jira] [Commented] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2
[ https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16067268#comment-16067268 ] Junping Du commented on MAPREDUCE-6246: --- Looks like the patch doesn't apply to trunk any more. [~ramtinb], would you rebase your patch against the latest trunk? Also, in your test for testORACLEConstructQuery(), I think we should get rid of the "db2" name and use oracle instead. The rest looks fine to me. > DBOutputFormat.java appending extra semicolon to query which is incompatible > with DB2 > - > > Key: MAPREDUCE-6246 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.4.1 > Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x > Platform: xSeries, pSeries > Browser: Firefox, IE > Security Settings: No Security, Flat file, LDAP, PAM > File System: HDFS, GPFS FPO >Reporter: ramtin >Assignee: ramtin > Labels: BB2015-05-RFC > Attachments: MAPREDUCE-6246.002.patch, MAPREDUCE-6246.003.patch, > MAPREDUCE-6246.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > DBOutputFormat is used for writing the output of MapReduce jobs to a database; > when used with DB2 JDBC drivers it fails with the following error: > com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, > SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, > DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at > com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127) > In the DBOutputFormat class, the constructQuery method generates an "INSERT > INTO" statement with a semicolon (";") at the end. > The semicolon is the ANSI SQL-92 statement terminator, but this feature is > disabled (OFF) by default in IBM DB2. > It can be turned ON for DB2 with the -t option 
> (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2). > But some products are already built on top of this default setting (OFF), so > turning this feature ON would make them error-prone.
[jira] [Commented] (MAPREDUCE-6246) DBOutputFormat.java appending extra semicolon to query which is incompatible with DB2
[ https://issues.apache.org/jira/browse/MAPREDUCE-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16067141#comment-16067141 ] Junping Du commented on MAPREDUCE-6246: --- Thanks [~ramtinb] for the patch contribution. The latest patch LGTM. Will commit it shortly if there are no further comments. > DBOutputFormat.java appending extra semicolon to query which is incompatible > with DB2 > - > > Key: MAPREDUCE-6246 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6246 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv1, mrv2 >Affects Versions: 2.4.1 > Environment: OS: RHEL 5.x, RHEL 6.x, SLES 11.x > Platform: xSeries, pSeries > Browser: Firefox, IE > Security Settings: No Security, Flat file, LDAP, PAM > File System: HDFS, GPFS FPO >Reporter: ramtin >Assignee: ramtin > Labels: BB2015-05-RFC > Attachments: MAPREDUCE-6246.002.patch, MAPREDUCE-6246.003.patch, > MAPREDUCE-6246.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > DBOutputFormat is used for writing the output of MapReduce jobs to a database; > when used with DB2 JDBC drivers it fails with the following error: > com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, > SQLSTATE=42601, SQLERRMC=;;,COUNT) VALUES (?,?);END-OF-STATEMENT, > DRIVER=4.16.53 at com.ibm.db2.jcc.am.fd.a(fd.java:739) at > com.ibm.db2.jcc.am.fd.a(fd.java:60) at com.ibm.db2.jcc.am.fd.a(fd.java:127) > In the DBOutputFormat class, the constructQuery method generates an "INSERT > INTO" statement with a semicolon (";") at the end. > The semicolon is the ANSI SQL-92 statement terminator, but this feature is > disabled (OFF) by default in IBM DB2. > It can be turned ON for DB2 with the -t option > (http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.cmd.doc/doc/r0010410.html?cp=SSEPGG_9.7.0%2F3-6-2-0-2). 
> But some products are already built on top of this default setting (OFF), so > turning this feature ON would make them error-prone.
[jira] [Commented] (MAPREDUCE-6897) Add Unit Test to make sure Job end notification get sent even appMaster stop get YarnRuntimeException
[ https://issues.apache.org/jira/browse/MAPREDUCE-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16052416#comment-16052416 ] Junping Du commented on MAPREDUCE-6897: --- Thanks [~GergelyNovak] for contributing the patch and [~raviprak] for the review! > Add Unit Test to make sure Job end notification get sent even appMaster stop > get YarnRuntimeException > - > > Key: MAPREDUCE-6897 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6897 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.9.0, 3.0.0-alpha4 >Reporter: Junping Du >Assignee: Gergely Novák >Priority: Minor > Labels: newbie++ > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: MAPREDUCE-6897.001.patch > > > In MAPREDUCE-6895, we fixed the issue that the job end notification was not > sent due to a YarnRuntimeException thrown in appMaster stop. We need to add a > unit test to make sure we won't run into the same issue again in the future.
[jira] [Updated] (MAPREDUCE-6895) Job end notification not send due to YarnRuntimeException
[ https://issues.apache.org/jira/browse/MAPREDUCE-6895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6895: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha4 2.9.0 Status: Resolved (was: Patch Available) > Job end notification not send due to YarnRuntimeException > - > > Key: MAPREDUCE-6895 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6895 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster >Affects Versions: 2.4.1, 2.8.0, 2.7.3 >Reporter: yunjiong zhao >Assignee: yunjiong zhao > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: MAPREDUCE-6895.001.patch, MAPREDUCE-6895.002.patch > > > MRAppMaster.this.stop() throws a YarnRuntimeException, as the log below shows, > which causes the job end notification not to be sent. > {quote} > 2017-05-24 12:14:02,165 WARN [Thread-693] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Graceful stop failed > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > java.nio.channels.ClosedChannelException > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:531) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:360) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1476) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1090) > at > 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:554) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:605) > Caused by: java.nio.channels.ClosedChannelException > at > org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1528) > at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:98) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58) > at java.io.DataOutputStream.write(DataOutputStream.java:107) > at > org.codehaus.jackson.impl.Utf8Generator._flushBuffer(Utf8Generator.java:1754) > at > org.codehaus.jackson.impl.Utf8Generator.flush(Utf8Generator.java:1088) > at org.apache.avro.io.JsonEncoder.flush(JsonEncoder.java:67) > at > org.apache.hadoop.mapreduce.jobhistory.EventWriter.write(EventWriter.java:67) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$MetaInfo.writeEvent(JobHistoryEventHandler.java:886) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:520) > ... 11 more > 2017-05-24 12:14:02,165 INFO [Thread-693] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Exiting MR AppMaster..GoodBye! > {quote}
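The shape of the fix — sending the job end notification even when the graceful service stop throws — can be sketched like this. This is a hypothetical simplification with illustrative names, not the actual MRAppMaster code:

```java
// Hypothetical simplification of the MAPREDUCE-6895 fix direction: a
// RuntimeException (e.g. a YarnRuntimeException wrapping a
// ClosedChannelException) thrown while stopping services must not
// prevent the job end notification from being sent.
public class ShutDownSketch {
    interface Notifier { void notifyJobEnd(); }

    public static void shutDownJob(Runnable stopServices, Notifier notifier) {
        try {
            stopServices.run(); // may throw, e.g. in JobHistoryEventHandler stop
        } catch (RuntimeException e) {
            // Log and keep going instead of letting the exception
            // propagate past the notification step.
            System.err.println("Graceful stop failed: " + e.getMessage());
        }
        notifier.notifyJobEnd(); // now reached even after a failed stop
    }
}
```

The key design point is the catch placement: the stop is best-effort during shutdown, while the notification is a contract with the job submitter, so the former must never abort the latter.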
[jira] [Commented] (MAPREDUCE-6895) Job end notification not send due to YarnRuntimeException
[ https://issues.apache.org/jira/browse/MAPREDUCE-6895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048705#comment-16048705 ] Junping Du commented on MAPREDUCE-6895: --- Just filed MAPREDUCE-6897 and committed the latest patch to trunk and branch-2. Thanks [~zhaoyunjiong] for the patch contribution and [~raviprak] and [~leftnoteasy] for the review and comments! > Job end notification not send due to YarnRuntimeException > - > > Key: MAPREDUCE-6895 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6895 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster >Affects Versions: 2.4.1, 2.8.0, 2.7.3 >Reporter: yunjiong zhao >Assignee: yunjiong zhao > Attachments: MAPREDUCE-6895.001.patch, MAPREDUCE-6895.002.patch > > > MRAppMaster.this.stop() throws a YarnRuntimeException, as the log below shows, > which causes the job end notification not to be sent. > {quote} > 2017-05-24 12:14:02,165 WARN [Thread-693] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Graceful stop failed > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > java.nio.channels.ClosedChannelException > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:531) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:360) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1476) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1090) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:554) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:605) > Caused by: java.nio.channels.ClosedChannelException > at > org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1528) > at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:98) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58) > at java.io.DataOutputStream.write(DataOutputStream.java:107) > at > org.codehaus.jackson.impl.Utf8Generator._flushBuffer(Utf8Generator.java:1754) > at > org.codehaus.jackson.impl.Utf8Generator.flush(Utf8Generator.java:1088) > at org.apache.avro.io.JsonEncoder.flush(JsonEncoder.java:67) > at > org.apache.hadoop.mapreduce.jobhistory.EventWriter.write(EventWriter.java:67) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$MetaInfo.writeEvent(JobHistoryEventHandler.java:886) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:520) > ... 11 more > 2017-05-24 12:14:02,165 INFO [Thread-693] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Exiting MR AppMaster..GoodBye! > {quote}
[jira] [Created] (MAPREDUCE-6897) Add Unit Test to make sure Job end notification get sent even appMaster stop get YarnRuntimeException
Junping Du created MAPREDUCE-6897: - Summary: Add Unit Test to make sure Job end notification get sent even appMaster stop get YarnRuntimeException Key: MAPREDUCE-6897 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6897 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Junping Du Priority: Minor In MAPREDUCE-6895, we fixed the issue where the job end notification was not sent because a YarnRuntimeException was thrown during appMaster stop. We need to add a unit test to make sure we won't run into the same issue again in the future.
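A sketch of the kind of unit test this new JIRA asks for: inject a failing stop() through a test double and assert the notification still fires. All names here are hypothetical; the real test would exercise MRAppMaster and the job-end notifier rather than these stand-ins:

```java
// Sketch of the test-injection approach: subclass the app-master-like
// object, force stop() to throw, and check the notification still happens.
public class StopFailureTestSketch {
    static class AppMaster {
        boolean notified = false;

        void stop() { /* stop services */ }

        void shutDownJob() {
            try {
                stop();
            } catch (RuntimeException e) {
                System.err.println("Graceful stop failed: " + e);
            }
            notified = true; // job-end notification
        }
    }

    // Test double: stop() always fails, as in the reported log.
    static class FailingStopAppMaster extends AppMaster {
        @Override
        void stop() { throw new RuntimeException("simulated YarnRuntimeException"); }
    }

    public static void main(String[] args) {
        FailingStopAppMaster am = new FailingStopAppMaster();
        am.shutDownJob();
        // The notification must fire even though stop() threw.
        System.out.println(am.notified);
    }
}
```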
[jira] [Commented] (MAPREDUCE-6895) Job end notification not send due to YarnRuntimeException
[ https://issues.apache.org/jira/browse/MAPREDUCE-6895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047201#comment-16047201 ] Junping Du commented on MAPREDUCE-6895: --- Adding a UT is also a nice idea. Given this fix here is simple and important, I am going to file a separated JIRA to track UT effort before commit the patch. [~raviprak], are you ok with that? > Job end notification not send due to YarnRuntimeException > - > > Key: MAPREDUCE-6895 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6895 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster >Affects Versions: 2.4.1, 2.8.0, 2.7.3 >Reporter: yunjiong zhao >Assignee: yunjiong zhao > Attachments: MAPREDUCE-6895.001.patch, MAPREDUCE-6895.002.patch > > > MRAppMaster.this.stop() throw out YarnRuntimeException as below log shows, it > caused job end notification not send. > {quote} > 2017-05-24 12:14:02,165 WARN [Thread-693] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Graceful stop failed > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > java.nio.channels.ClosedChannelException > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:531) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:360) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1476) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1090) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:554) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:605) > Caused by: java.nio.channels.ClosedChannelException > at > org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1528) > at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:98) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58) > at java.io.DataOutputStream.write(DataOutputStream.java:107) > at > org.codehaus.jackson.impl.Utf8Generator._flushBuffer(Utf8Generator.java:1754) > at > org.codehaus.jackson.impl.Utf8Generator.flush(Utf8Generator.java:1088) > at org.apache.avro.io.JsonEncoder.flush(JsonEncoder.java:67) > at > org.apache.hadoop.mapreduce.jobhistory.EventWriter.write(EventWriter.java:67) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$MetaInfo.writeEvent(JobHistoryEventHandler.java:886) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:520) > ... 11 more > 2017-05-24 12:14:02,165 INFO [Thread-693] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Exiting MR AppMaster..GoodBye! > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6895) Job end notification not send due to YarnRuntimeException
[ https://issues.apache.org/jira/browse/MAPREDUCE-6895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043574#comment-16043574 ] Junping Du commented on MAPREDUCE-6895: --- The patch LGTM as well. The only minor comments is: MRJobConfig.MR_JOB_END_NOTIFICATION_URL get duplicated checked - by shutdownJob() and notifier.notify(report). Given shutdownJob is the only non-test place to call notifier.notify(report), we can omit one check I think. Other looks fine. > Job end notification not send due to YarnRuntimeException > - > > Key: MAPREDUCE-6895 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6895 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: applicationmaster >Affects Versions: 2.4.1, 2.8.0, 2.7.3 >Reporter: yunjiong zhao >Assignee: yunjiong zhao > Attachments: MAPREDUCE-6895.001.patch > > > MRAppMaster.this.stop() throw out YarnRuntimeException as below log shows, it > caused job end notification not send. > {quote} > 2017-05-24 12:14:02,165 WARN [Thread-693] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Graceful stop failed > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > java.nio.channels.ClosedChannelException > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:531) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:360) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1476) > at > 
org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1090) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:554) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:605) > Caused by: java.nio.channels.ClosedChannelException > at > org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1528) > at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:98) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58) > at java.io.DataOutputStream.write(DataOutputStream.java:107) > at > org.codehaus.jackson.impl.Utf8Generator._flushBuffer(Utf8Generator.java:1754) > at > org.codehaus.jackson.impl.Utf8Generator.flush(Utf8Generator.java:1088) > at org.apache.avro.io.JsonEncoder.flush(JsonEncoder.java:67) > at > org.apache.hadoop.mapreduce.jobhistory.EventWriter.write(EventWriter.java:67) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$MetaInfo.writeEvent(JobHistoryEventHandler.java:886) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:520) > ... 11 more > 2017-05-24 12:14:02,165 INFO [Thread-693] > org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Exiting MR AppMaster..GoodBye! > {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6654) Possible NPE in JobHistoryEventHandler#handleEvent
[ https://issues.apache.org/jira/browse/MAPREDUCE-6654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6654: -- Target Version/s: 2.9.0 (was: 2.8.1) > Possible NPE in JobHistoryEventHandler#handleEvent > -- > > Key: MAPREDUCE-6654 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6654 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Xiao Chen >Assignee: Junping Du >Priority: Critical > Attachments: MAPREDUCE-6654.patch, MAPREDUCE-6654-v2.1.patch, > MAPREDUCE-6654-v2.patch > > > I have seen NPE thrown from {{JobHistoryEventHandler#handleEvent}}: > {noformat} > 2016-03-14 16:42:15,231 INFO [Thread-69] > org.apache.hadoop.service.AbstractService: Service JobHistoryEventHandler > failed in state STOPPED; cause: java.lang.NullPointerException > java.lang.NullPointerException > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:570) > at > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:382) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStop(MRAppMaster.java:1651) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.stop(MRAppMaster.java:1147) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:573) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:620) > {noformat} > In the version this exception is thrown, the 
> [line|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/jobhistory/JobHistoryEventHandler.java#L586] > is: > {code:java}mi.writeEvent(historyEvent);{code} > IMHO, this may be caused by an exception in a previous step. Specifically, in > the kerberized environment, when creating event writer which calls to decrypt > EEK, the connection to KMS failed. Exception below: > {noformat} > 2016-03-14 16:41:57,559 ERROR [eventHandlingThread] > org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Error > JobHistoryEventHandler in handleEvent: EventType: AM_STARTED > java.net.SocketTimeoutException: Read timed out > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.read(SocketInputStream.java:152) > at java.net.SocketInputStream.read(SocketInputStream.java:122) > at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) > at java.io.BufferedInputStream.read(BufferedInputStream.java:334) > at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687) > at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633) > at > sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323) > at > java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:520) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:505) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:779) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:185) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$3.call(LoadBalancingKMSClientProvider.java:181) > at > 
org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:94) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:181) > at > org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:388) > at > org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:1420) > at > org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1522) > at >
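One defensive fix consistent with the analysis above is a null check before the write, since the per-job event writer may never have been created when the earlier KMS call timed out. A simplified stand-in (MetaInfo and fileMap mimic, but are not, the JobHistoryEventHandler internals):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: if writer setup failed earlier, handleEvent skips the write
// instead of dereferencing a null MetaInfo during serviceStop().
public class HandlerSketch {
    static class MetaInfo {
        void writeEvent(String event) { /* write to the history file */ }
    }

    // jobId -> MetaInfo; the entry is absent when writer setup failed
    static Map<String, MetaInfo> fileMap = new HashMap<>();

    static boolean handleEvent(String jobId, String event) {
        MetaInfo mi = fileMap.get(jobId);
        if (mi == null) {
            // Writer was never created; drop the event rather than NPE.
            return false;
        }
        mi.writeEvent(event);
        return true;
    }

    public static void main(String[] args) {
        System.out.println(handleEvent("job_1", "AM_STARTED"));
    }
}
```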
[jira] [Updated] (MAPREDUCE-5889) Deprecate FileInputFormat.setInputPaths(Job, String) and FileInputFormat.addInputPaths(Job, String)
[ https://issues.apache.org/jira/browse/MAPREDUCE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-5889: -- Labels: (was: BB2015-05-TBR newbie) > Deprecate FileInputFormat.setInputPaths(Job, String) and > FileInputFormat.addInputPaths(Job, String) > --- > > Key: MAPREDUCE-5889 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5889 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka >Priority: Minor > Attachments: MAPREDUCE-5889.3.patch, MAPREDUCE-5889.4.patch, > MAPREDUCE-5889.5.patch, MAPREDUCE-5889.patch, MAPREDUCE-5889.patch > > > {{FileInputFormat.setInputPaths(Job job, String commaSeparatedPaths)}} and > {{FileInputFormat.addInputPaths(Job job, String commaSeparatedPaths)}} fail > to parse commaSeparatedPaths if a comma is included in the file path. (e.g. > Path: {{/path/file,with,comma}}) > We should deprecate these methods and document to use {{setInputPaths(Job > job, Path... inputPaths)}} and {{addInputPaths(Job job, Path... inputPaths)}} > instead.
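The comma pitfall quoted above can be reproduced with a plain comma split, which is essentially what the String-based overloads do (this sketch is illustrative; the real FileInputFormat parsing logic differs in detail):

```java
// Sketch: a comma-separated-paths parser breaks a single path that
// itself contains commas into multiple bogus input paths.
public class CommaPathSketch {
    static String[] splitCommaSeparatedPaths(String commaSeparatedPaths) {
        return commaSeparatedPaths.split(",");
    }

    public static void main(String[] args) {
        // One real file whose name contains commas...
        String path = "/path/file,with,comma";
        // ...is mis-parsed into three separate input paths.
        String[] parsed = splitCommaSeparatedPaths(path);
        System.out.println(parsed.length); // 3, not 1
    }
}
```

This is why the issue recommends the {{Path...}} variants, which take each path as a distinct argument and never re-parse the string.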
[jira] [Updated] (MAPREDUCE-6868) License check for jdiff output files should be ignored
[ https://issues.apache.org/jira/browse/MAPREDUCE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6868: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha3 2.8.1 2.9.0 Status: Resolved (was: Patch Available) I have committed the patch to trunk, branch-2 and branch-2.8. Thanks [~ajisakaa]! > License check for jdiff output files should be ignored > -- > > Key: MAPREDUCE-6868 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6868 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: build >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Fix For: 2.9.0, 2.8.1, 3.0.0-alpha3 > > Attachments: MAPREDUCE-6868.01.patch > > > The following commits added jdiff output for Hadoop 2.8.0 but ASF license > header is missing. > * > https://github.com/apache/hadoop/commit/6df029db36412b7219b64313dcbe1874dc1c8b0c > * > https://github.com/apache/hadoop/commit/d174c06b01e1f743d3111b9b760a9824d8106b86 > hadoop-mapreduce-project module does not have a setting to ignore the jdiff > output files, so the license check fails.
[jira] [Commented] (MAPREDUCE-6868) License check for jdiff output files should be ignored
[ https://issues.apache.org/jira/browse/MAPREDUCE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15942692#comment-15942692 ] Junping Du commented on MAPREDUCE-6868: --- Thanks [~ajisakaa] for reporting the issue and fixing here. +1 on the patch. Committing it. > License check for jdiff output files should be ignored > -- > > Key: MAPREDUCE-6868 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6868 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: build >Reporter: Akira Ajisaka >Assignee: Akira Ajisaka > Attachments: MAPREDUCE-6868.01.patch > > > The following commits added jdiff output for Hadoop 2.8.0 but ASF license > header is missing. > * > https://github.com/apache/hadoop/commit/6df029db36412b7219b64313dcbe1874dc1c8b0c > * > https://github.com/apache/hadoop/commit/d174c06b01e1f743d3111b9b760a9824d8106b86 > hadoop-mapreduce-project module does not have a setting to ignore the jdiff > output files, so the license check fails.
[jira] [Updated] (MAPREDUCE-6864) Hadoop streaming creates 2 mappers when the input has only one block
[ https://issues.apache.org/jira/browse/MAPREDUCE-6864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6864: -- Target Version/s: 2.8.1 > Hadoop streaming creates 2 mappers when the input has only one block > > > Key: MAPREDUCE-6864 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6864 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 2.7.3 >Reporter: Daniel Templeton > > If a streaming job is run against input that is less than 2 blocks, 2 mappers > will be created, both operating on the same split, both producing (duplicate) > output. In some cases the second mapper will consistently fail. I've not > seen the failure with input less than 10 bytes or more than a couple MB. I > have seen it with a 4kB input.
[jira] [Updated] (MAPREDUCE-6852) Job#updateStatus() failed with NPE due to race condition
[ https://issues.apache.org/jira/browse/MAPREDUCE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6852: -- Fix Version/s: 3.0.0-alpha3 > Job#updateStatus() failed with NPE due to race condition > > > Key: MAPREDUCE-6852 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6852 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Junping Du >Assignee: Junping Du > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: MAPREDUCE-6852.patch, MAPREDUCE-6852-v2.patch > > > Like MAPREDUCE-6762, we found this issue in a cluster where Pig query > occasionally failed on NPE - "Pig uses JobControl API to track MR job status, > but sometimes Job History Server failed to flush job meta files to HDFS which > caused the status update failed." Beside NPE in > o.a.h.mapreduce.Job.getJobName, we also get NPE in Job.updateStatus() and the > exception is as following: > {noformat} > Caused by: java.lang.NullPointerException > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:323) > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:320) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833) > at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320) > at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:604) > {noformat} > We found state here is null. However, we already check the job state to be > RUNNING as code below: > {noformat} > public boolean isComplete() throws IOException { > ensureState(JobState.RUNNING); > updateStatus(); > return status.isJobComplete(); > } > {noformat} > The only possible reason here is two threads are calling here for the same > time: ensure state first, then one thread update the state to null while the > other thread hit NPE issue here. > We should fix this NPE exception. 
[jira] [Commented] (MAPREDUCE-6852) Job#updateStatus() failed with NPE due to race condition
[ https://issues.apache.org/jira/browse/MAPREDUCE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892827#comment-15892827 ] Junping Du commented on MAPREDUCE-6852: --- Thanks Jian for review and commit! > Job#updateStatus() failed with NPE due to race condition > > > Key: MAPREDUCE-6852 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6852 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Junping Du >Assignee: Junping Du > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: MAPREDUCE-6852.patch, MAPREDUCE-6852-v2.patch > > > Like MAPREDUCE-6762, we found this issue in a cluster where Pig query > occasionally failed on NPE - "Pig uses JobControl API to track MR job status, > but sometimes Job History Server failed to flush job meta files to HDFS which > caused the status update failed." Beside NPE in > o.a.h.mapreduce.Job.getJobName, we also get NPE in Job.updateStatus() and the > exception is as following: > {noformat} > Caused by: java.lang.NullPointerException > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:323) > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:320) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833) > at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320) > at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:604) > {noformat} > We found state here is null. However, we already check the job state to be > RUNNING as code below: > {noformat} > public boolean isComplete() throws IOException { > ensureState(JobState.RUNNING); > updateStatus(); > return status.isJobComplete(); > } > {noformat} > The only possible reason here is two threads are calling here for the same > time: ensure state first, then one thread update the state to null while the > other thread hit NPE issue here. > We should fix this NPE exception. 
[jira] [Updated] (MAPREDUCE-6852) Job#updateStatus() failed with NPE due to race condition
[ https://issues.apache.org/jira/browse/MAPREDUCE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6852: -- Status: Patch Available (was: Open) > Job#updateStatus() failed with NPE due to race condition > > > Key: MAPREDUCE-6852 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6852 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Junping Du >Assignee: Junping Du > Attachments: MAPREDUCE-6852.patch, MAPREDUCE-6852-v2.patch > > > Like MAPREDUCE-6762, we found this issue in a cluster where Pig query > occasionally failed on NPE - "Pig uses JobControl API to track MR job status, > but sometimes Job History Server failed to flush job meta files to HDFS which > caused the status update failed." Beside NPE in > o.a.h.mapreduce.Job.getJobName, we also get NPE in Job.updateStatus() and the > exception is as following: > {noformat} > Caused by: java.lang.NullPointerException > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:323) > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:320) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833) > at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320) > at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:604) > {noformat} > We found state here is null. However, we already check the job state to be > RUNNING as code below: > {noformat} > public boolean isComplete() throws IOException { > ensureState(JobState.RUNNING); > updateStatus(); > return status.isJobComplete(); > } > {noformat} > The only possible reason here is two threads are calling here for the same > time: ensure state first, then one thread update the state to null while the > other thread hit NPE issue here. > We should fix this NPE exception. 
[jira] [Updated] (MAPREDUCE-6852) Job#updateStatus() failed with NPE due to race condition
[ https://issues.apache.org/jira/browse/MAPREDUCE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6852: -- Attachment: MAPREDUCE-6852-v2.patch Sounds reasonable. v2 patch incorporate comments here. > Job#updateStatus() failed with NPE due to race condition > > > Key: MAPREDUCE-6852 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6852 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Junping Du >Assignee: Junping Du > Attachments: MAPREDUCE-6852.patch, MAPREDUCE-6852-v2.patch > > > Like MAPREDUCE-6762, we found this issue in a cluster where Pig query > occasionally failed on NPE - "Pig uses JobControl API to track MR job status, > but sometimes Job History Server failed to flush job meta files to HDFS which > caused the status update failed." Beside NPE in > o.a.h.mapreduce.Job.getJobName, we also get NPE in Job.updateStatus() and the > exception is as following: > {noformat} > Caused by: java.lang.NullPointerException > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:323) > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:320) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833) > at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320) > at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:604) > {noformat} > We found state here is null. However, we already check the job state to be > RUNNING as code below: > {noformat} > public boolean isComplete() throws IOException { > ensureState(JobState.RUNNING); > updateStatus(); > return status.isJobComplete(); > } > {noformat} > The only possible reason here is two threads are calling here for the same > time: ensure state first, then one thread update the state to null while the > other thread hit NPE issue here. > We should fix this NPE exception. 
[jira] [Updated] (MAPREDUCE-6852) Job#updateStatus() failed with NPE due to race condition
[ https://issues.apache.org/jira/browse/MAPREDUCE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6852: -- Status: Open (was: Patch Available) > Job#updateStatus() failed with NPE due to race condition > > > Key: MAPREDUCE-6852 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6852 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Junping Du >Assignee: Junping Du > Attachments: MAPREDUCE-6852.patch > > > Like MAPREDUCE-6762, we found this issue in a cluster where Pig query > occasionally failed on NPE - "Pig uses JobControl API to track MR job status, > but sometimes Job History Server failed to flush job meta files to HDFS which > caused the status update failed." Beside NPE in > o.a.h.mapreduce.Job.getJobName, we also get NPE in Job.updateStatus() and the > exception is as following: > {noformat} > Caused by: java.lang.NullPointerException > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:323) > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:320) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833) > at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320) > at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:604) > {noformat} > We found state here is null. However, we already check the job state to be > RUNNING as code below: > {noformat} > public boolean isComplete() throws IOException { > ensureState(JobState.RUNNING); > updateStatus(); > return status.isJobComplete(); > } > {noformat} > The only possible reason here is two threads are calling here for the same > time: ensure state first, then one thread update the state to null while the > other thread hit NPE issue here. > We should fix this NPE exception. 
[jira] [Updated] (MAPREDUCE-6852) Job#updateStatus() failed with NPE due to race condition
[ https://issues.apache.org/jira/browse/MAPREDUCE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6852: -- Attachment: MAPREDUCE-6852.patch Upload a quick patch to fix this issue. Not include any unit test given this is a race condition case which is hard to do in unit test. > Job#updateStatus() failed with NPE due to race condition > > > Key: MAPREDUCE-6852 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6852 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Junping Du >Assignee: Junping Du > Attachments: MAPREDUCE-6852.patch > > > Like MAPREDUCE-6762, we found this issue in a cluster where Pig query > occasionally failed on NPE - "Pig uses JobControl API to track MR job status, > but sometimes Job History Server failed to flush job meta files to HDFS which > caused the status update failed." Beside NPE in > o.a.h.mapreduce.Job.getJobName, we also get NPE in Job.updateStatus() and the > exception is as following: > {noformat} > Caused by: java.lang.NullPointerException > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:323) > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:320) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833) > at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320) > at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:604) > {noformat} > We found state here is null. However, we already check the job state to be > RUNNING as code below: > {noformat} > public boolean isComplete() throws IOException { > ensureState(JobState.RUNNING); > updateStatus(); > return status.isJobComplete(); > } > {noformat} > The only possible reason here is two threads are calling here for the same > time: ensure state first, then one thread update the state to null while the > other thread hit NPE issue here. 
> We should fix this NPE exception.
[jira] [Updated] (MAPREDUCE-6852) Job#updateStatus() failed with NPE due to race condition
[ https://issues.apache.org/jira/browse/MAPREDUCE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6852: -- Status: Patch Available (was: Open) > Job#updateStatus() failed with NPE due to race condition > > > Key: MAPREDUCE-6852 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6852 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Junping Du >Assignee: Junping Du > Attachments: MAPREDUCE-6852.patch > > > Like MAPREDUCE-6762, we found this issue in a cluster where Pig query > occasionally failed on NPE - "Pig uses JobControl API to track MR job status, > but sometimes Job History Server failed to flush job meta files to HDFS which > caused the status update failed." Beside NPE in > o.a.h.mapreduce.Job.getJobName, we also get NPE in Job.updateStatus() and the > exception is as following: > {noformat} > Caused by: java.lang.NullPointerException > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:323) > at org.apache.hadoop.mapreduce.Job$1.run(Job.java:320) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833) > at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320) > at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:604) > {noformat} > We found state here is null. However, we already check the job state to be > RUNNING as code below: > {noformat} > public boolean isComplete() throws IOException { > ensureState(JobState.RUNNING); > updateStatus(); > return status.isJobComplete(); > } > {noformat} > The only possible reason here is two threads are calling here for the same > time: ensure state first, then one thread update the state to null while the > other thread hit NPE issue here. > We should fix this NPE exception. 
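The check-then-act race described in MAPREDUCE-6852 above can be illustrated with a small, self-contained sketch (plain Java, not the actual Hadoop patch; the class and field names are invented for illustration). Reading the volatile field once into a local removes the window in which another thread can null it between the state check and the dereference:

```java
// Toy model of the race in Job#isComplete()/updateStatus().
// "state" stands in for the Job status field that another thread can null.
class JobStatusHolder {
    private volatile String state = "RUNNING";

    // Racy pattern: state can become null between the check and the use,
    // which is exactly the NPE window described in the issue.
    boolean isCompleteRacy() {
        if (state == null) throw new IllegalStateException("job is not running");
        return state.equals("SUCCEEDED"); // NPE if another thread nulled state here
    }

    // Safe pattern: read the field once into a local, then check and use
    // that single consistent snapshot.
    boolean isCompleteSafe() {
        String s = state;
        if (s == null) throw new IllegalStateException("job is not running");
        return s.equals("SUCCEEDED");
    }

    // Simulates the concurrent update that triggers the race.
    void invalidate() { state = null; }
}
```

With the safe variant, a concurrent `invalidate()` can still surface as a clean `IllegalStateException`, but never as a `NullPointerException` from the dereference itself.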
[jira] [Created] (MAPREDUCE-6852) Job#updateStatus() failed with NPE due to race condition
Junping Du created MAPREDUCE-6852: - Summary: Job#updateStatus() failed with NPE due to race condition Key: MAPREDUCE-6852 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6852 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Junping Du Assignee: Junping Du Like MAPREDUCE-6762, we found this issue in a cluster where a Pig query occasionally failed with an NPE - "Pig uses the JobControl API to track MR job status, but sometimes the Job History Server failed to flush job meta files to HDFS, which caused the status update to fail." Besides the NPE in o.a.h.mapreduce.Job.getJobName, we also get an NPE in Job.updateStatus(); the exception is as follows: {noformat} Caused by: java.lang.NullPointerException at org.apache.hadoop.mapreduce.Job$1.run(Job.java:323) at org.apache.hadoop.mapreduce.Job$1.run(Job.java:320) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833) at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320) at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:604) {noformat} We found that state is null here. However, we already check that the job state is RUNNING, as in the code below: {noformat} public boolean isComplete() throws IOException { ensureState(JobState.RUNNING); updateStatus(); return status.isJobComplete(); } {noformat} The only possible explanation is that two threads reach this code at the same time: both pass the state check first, then one thread sets the state to null while the other hits the NPE. We should fix this NPE. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6404) Allow AM to specify a port range for starting its webapp
[ https://issues.apache.org/jira/browse/MAPREDUCE-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6404: -- Release Note: Add a new configuration - "yarn.app.mapreduce.am.webapp.port-range" to specify port-range for webapp launched by AM. > Allow AM to specify a port range for starting its webapp > > > Key: MAPREDUCE-6404 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6404 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: applicationmaster >Reporter: Varun Saxena >Assignee: Varun Saxena > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: MAPREDUCE-6404.01.patch, MAPREDUCE-6404.02.patch > > > Allow AM to specify a port range for starting its webapp -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
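For reference, the new property from the release note above would be configured in mapred-site.xml along these lines (the property name is from the release note; the port-range value is only an example):

{noformat}
<property>
  <name>yarn.app.mapreduce.am.webapp.port-range</name>
  <value>50100-50200</value>
</property>
{noformat}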
[jira] [Updated] (MAPREDUCE-6404) Allow AM to specify a port range for starting its webapp
[ https://issues.apache.org/jira/browse/MAPREDUCE-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6404: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha3 2.9.0 Status: Resolved (was: Patch Available) I have committed the patch to trunk and branch-2. Thanks [~varun_saxena] for the patch and [~Naganarasimha] for the comments! > Allow AM to specify a port range for starting its webapp > > > Key: MAPREDUCE-6404 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6404 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: applicationmaster >Reporter: Varun Saxena >Assignee: Varun Saxena > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: MAPREDUCE-6404.01.patch, MAPREDUCE-6404.02.patch > > > Allow AM to specify a port range for starting its webapp -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6404) Allow AM to specify a port range for starting its webapp
[ https://issues.apache.org/jira/browse/MAPREDUCE-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855394#comment-15855394 ] Junping Du commented on MAPREDUCE-6404: --- Thanks [~varun_saxena] for updating the patch which looks good to me! +1. Will commit it shortly. > Allow AM to specify a port range for starting its webapp > > > Key: MAPREDUCE-6404 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6404 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: applicationmaster >Reporter: Varun Saxena >Assignee: Varun Saxena > Attachments: MAPREDUCE-6404.01.patch, MAPREDUCE-6404.02.patch > > > Allow AM to specify a port range for starting its webapp -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6404) Allow AM to specify a port range for starting its webapp
[ https://issues.apache.org/jira/browse/MAPREDUCE-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853674#comment-15853674 ] Junping Du commented on MAPREDUCE-6404: --- Patch looks good - just some checkstyle, white space issues, should be easy to fix. > Allow AM to specify a port range for starting its webapp > > > Key: MAPREDUCE-6404 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6404 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: applicationmaster >Reporter: Varun Saxena >Assignee: Varun Saxena > Attachments: MAPREDUCE-6404.01.patch > > > Allow AM to specify a port range for starting its webapp -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6404) Allow AM to specify a port range for starting its webapp
[ https://issues.apache.org/jira/browse/MAPREDUCE-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6404: -- Status: Patch Available (was: In Progress) > Allow AM to specify a port range for starting its webapp > > > Key: MAPREDUCE-6404 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6404 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: applicationmaster >Reporter: Varun Saxena >Assignee: Junping Du > Attachments: MAPREDUCE-6404.01.patch > > > Allow AM to specify a port range for starting its webapp -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Assigned] (MAPREDUCE-6404) Allow AM to specify a port range for starting its webapp
[ https://issues.apache.org/jira/browse/MAPREDUCE-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du reassigned MAPREDUCE-6404: - Assignee: Junping Du (was: Varun Saxena) > Allow AM to specify a port range for starting its webapp > > > Key: MAPREDUCE-6404 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6404 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: applicationmaster >Reporter: Varun Saxena >Assignee: Junping Du > Attachments: MAPREDUCE-6404.01.patch > > > Allow AM to specify a port range for starting its webapp -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Assigned] (MAPREDUCE-6404) Allow AM to specify a port range for starting its webapp
[ https://issues.apache.org/jira/browse/MAPREDUCE-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du reassigned MAPREDUCE-6404: - Assignee: Varun Saxena (was: Junping Du) > Allow AM to specify a port range for starting its webapp > > > Key: MAPREDUCE-6404 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6404 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: applicationmaster >Reporter: Varun Saxena >Assignee: Varun Saxena > Attachments: MAPREDUCE-6404.01.patch > > > Allow AM to specify a port range for starting its webapp -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6404) Allow AM to specify a port range for starting its webapp
[ https://issues.apache.org/jira/browse/MAPREDUCE-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853487#comment-15853487 ] Junping Du commented on MAPREDUCE-6404: --- HADOOP-12097 has already been committed, so I am submitting the patch for this JIRA. > Allow AM to specify a port range for starting its webapp > > > Key: MAPREDUCE-6404 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6404 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: applicationmaster >Reporter: Varun Saxena >Assignee: Varun Saxena > Attachments: MAPREDUCE-6404.01.patch > > > Allow AM to specify a port range for starting its webapp -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6338) MR AppMaster does not honor ephemeral port range
[ https://issues.apache.org/jira/browse/MAPREDUCE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6338: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha3 2.9.0 Status: Resolved (was: Patch Available) I have committed the patch to trunk and branch-2 after fixing a checkstyle issue. Thanks [~bleuleon] for the patch contribution and everyone for the reviews and comments! > MR AppMaster does not honor ephemeral port range > > > Key: MAPREDUCE-6338 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6338 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am, mrv2 >Affects Versions: 2.6.0 >Reporter: Frank Nguyen >Assignee: Frank Nguyen > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: MAPREDUCE-6338.002.patch, MAPREDUCE-6338.003.patch > > > The MR AppMaster should only use port ranges defined in the > yarn.app.mapreduce.am.job.client.port-range property. On initial startup of > the MRAppMaster, it does use the port range defined in the property. > However, it also opens up a listener on a random ephemeral port. This is not > the Jetty listener. It is another listener opened by the MRAppMaster via > another thread and is recognized by the RM. Other nodes will try to > communicate to it via that random port. With firewall settings on, the MR > job will fail because the random port is not opened. This problem has caused > others to have all OS ephemeral ports opened to have MR jobs run. > This is related to MAPREDUCE-4079 -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6825) YARNRunner#createApplicationSubmissionContext method is longer than 150 lines
[ https://issues.apache.org/jira/browse/MAPREDUCE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15850821#comment-15850821 ] Junping Du commented on MAPREDUCE-6825: --- Thanks [~ctrezzo] for review! Patch LGTM too. However, there is unit test failure. [~GergelyNovak], can you check if this is related to our code change here? > YARNRunner#createApplicationSubmissionContext method is longer than 150 lines > - > > Key: MAPREDUCE-6825 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6825 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Trezzo >Assignee: Gergely Novák >Priority: Trivial > Labels: newbie > Attachments: MAPREDUCE-6825.001.patch, MAPREDUCE-6825.002.patch > > > bq. > ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java:341: > public ApplicationSubmissionContext createApplicationSubmissionContext(:3: > Method length is 249 lines (max allowed is 150). > {{YARNRunner#createApplicationSubmissionContext}} is longer than 150 lines > and needs to be refactored. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6338) MR AppMaster does not honor ephemeral port range
[ https://issues.apache.org/jira/browse/MAPREDUCE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6338: -- Status: Patch Available (was: Open) > MR AppMaster does not honor ephemeral port range > > > Key: MAPREDUCE-6338 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6338 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am, mrv2 >Affects Versions: 2.6.0 >Reporter: Frank Nguyen >Assignee: Frank Nguyen > Attachments: MAPREDUCE-6338.002.patch, MAPREDUCE-6338.003.patch > > > The MR AppMaster should only use port ranges defined in the > yarn.app.mapreduce.am.job.client.port-range property. On initial startup of > the MRAppMaster, it does use the port range defined in the property. > However, it also opens up a listener on a random ephemeral port. This is not > the Jetty listener. It is another listener opened by the MRAppMaster via > another thread and is recognized by the RM. Other nodes will try to > communicate to it via that random port. With firewall settings on, the MR > job will fail because the random port is not opened. This problem has caused > others to have all OS ephemeral ports opened to have MR jobs run. > This is related to MAPREDUCE-4079 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6338) MR AppMaster does not honor ephemeral port range
[ https://issues.apache.org/jira/browse/MAPREDUCE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15834489#comment-15834489 ] Junping Du commented on MAPREDUCE-6338: --- Forget my comments above, I double check that we don't involve any new configuration here so this approach looks fine to me. Set to patch available again and will review from there. > MR AppMaster does not honor ephemeral port range > > > Key: MAPREDUCE-6338 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6338 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am, mrv2 >Affects Versions: 2.6.0 >Reporter: Frank Nguyen >Assignee: Frank Nguyen > Attachments: MAPREDUCE-6338.002.patch, MAPREDUCE-6338.003.patch > > > The MR AppMaster should only use port ranges defined in the > yarn.app.mapreduce.am.job.client.port-range property. On initial startup of > the MRAppMaster, it does use the port range defined in the property. > However, it also opens up a listener on a random ephemeral port. This is not > the Jetty listener. It is another listener opened by the MRAppMaster via > another thread and is recognized by the RM. Other nodes will try to > communicate to it via that random port. With firewall settings on, the MR > job will fail because the random port is not opened. This problem has caused > others to have all OS ephemeral ports opened to have MR jobs run. > This is related to MAPREDUCE-4079 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6711) JobImpl fails to handle preemption events on state COMMITTING
[ https://issues.apache.org/jira/browse/MAPREDUCE-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6711: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha2 2.7.4 2.8.0 Status: Resolved (was: Patch Available) I have commit the patch to trunk, branch-2, branch-2.8 and branch-2.7. Thanks [~Prabhu Joseph] for the contribution and [~gtCarrera9] for reporting the issue! > JobImpl fails to handle preemption events on state COMMITTING > - > > Key: MAPREDUCE-6711 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6711 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Li Lu >Assignee: Prabhu Joseph > Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2 > > Attachments: MAPREDUCE-6711.1.patch, MAPREDUCE-6711.patch > > > When a MR app being preempted on COMMITTING state, we saw the following > exceptions in its log: > {code} > ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Can't handle this event > at current state > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > JOB_TASK_ATTEMPT_COMPLETED at COMMITTING > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1289) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1285) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182) > at > 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108) > at java.lang.Thread.run(Thread.java:744) > {code} > and > {code} > ERROR [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Can't handle this event > at current state > org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: > JOB_MAP_TASK_RESCHEDULED at COMMITTING > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1289) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1285) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108) > at java.lang.Thread.run(Thread.java:744) > {code} > Seems like we need to handle those preemption related events when the job is > being committed? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
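The fix direction suggested at the end of MAPREDUCE-6711 above can be sketched with a toy state machine (this is an illustration, not JobImpl's actual StateMachineFactory transition table; all names are invented): while in COMMITTING, the late task events that preemption can produce are tolerated as no-ops instead of raising InvalidStateTransitionException.

```java
import java.util.EnumSet;
import java.util.Set;

// Toy model: a job state machine that explicitly ignores late task events
// while COMMITTING, rather than treating them as invalid transitions.
class MiniJobStateMachine {
    enum State { RUNNING, COMMITTING, SUCCEEDED, ERROR }
    enum Event { TASK_ATTEMPT_COMPLETED, MAP_TASK_RESCHEDULED, COMMIT_COMPLETED }

    // Events that may still arrive after preemption; swallowed during commit.
    private static final Set<Event> IGNORED_WHILE_COMMITTING =
        EnumSet.of(Event.TASK_ATTEMPT_COMPLETED, Event.MAP_TASK_RESCHEDULED);

    private State state = State.RUNNING;

    void startCommit() { state = State.COMMITTING; }

    State handle(Event e) {
        if (state == State.COMMITTING) {
            if (e == Event.COMMIT_COMPLETED) {
                state = State.SUCCEEDED;
            } else if (!IGNORED_WHILE_COMMITTING.contains(e)) {
                // Without the ignore set, this is where the real state machine
                // threw InvalidStateTransitionException.
                state = State.ERROR;
            }
            // Ignored events leave the state unchanged.
        }
        return state;
    }
}
```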
[jira] [Updated] (MAPREDUCE-6670) TestJobListCache#testEviction sometimes fails on Windows with timeout
[ https://issues.apache.org/jira/browse/MAPREDUCE-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6670: -- Fix Version/s: (was: 2.9.0) > TestJobListCache#testEviction sometimes fails on Windows with timeout > - > > Key: MAPREDUCE-6670 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6670 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Affects Versions: 2.7.0, 2.8.0, 2.7.1, 2.7.2, 2.7.3 > Environment: OS: Windows Server 2012 > JDK: 1.7.0_79 >Reporter: Gergely Novák >Assignee: Gergely Novák >Priority: Minor > Fix For: 2.8.0, 2.7.3, 3.0.0-alpha1 > > Attachments: MAPREDUCE-6670.001.patch, MAPREDUCE-6670.002.patch > > > TestJobListCache#testEviction often needs more than 1000 ms to finish in > Windows environment. Increasing the timeout solves the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6801) Fix flaky TestKill.testKillJob()
[ https://issues.apache.org/jira/browse/MAPREDUCE-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6801: -- Fix Version/s: (was: 2.9.0) > Fix flaky TestKill.testKillJob() > > > Key: MAPREDUCE-6801 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6801 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mrv2 >Affects Versions: 3.0.0-alpha1 >Reporter: Haibo Chen >Assignee: Haibo Chen > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: mapreduce6801.001.patch, mapreduce6801.002.patch > > > TestKill.testKillJob often fails for the same reason with the following error > message: > {code} > 1 tests failed. > FAILED: org.apache.hadoop.mapreduce.v2.app.TestKill.testKillJob > Error Message: > Task state not correct expected: but was: > Stack Trace: > java.lang.AssertionError: Task state not correct expected: but > was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at > org.apache.hadoop.mapreduce.v2.app.TestKill.testKillJob(TestKill.java:84) > {code} > The root cause is that when the job is in KILLED state from an external view, > TaskKillEvents and TaskAttemptKillEvents placed on the event loop queue may > not have been processed by the dispatcher thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6541) Exclude scheduled reducer memory when calculating available mapper slots from headroom to avoid deadlock
[ https://issues.apache.org/jira/browse/MAPREDUCE-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6541: -- Fix Version/s: (was: 2.9.0) > Exclude scheduled reducer memory when calculating available mapper slots from > headroom to avoid deadlock > - > > Key: MAPREDUCE-6541 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6541 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Wangda Tan >Assignee: Varun Saxena > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: MAPREDUCE-6541.01.patch, MAPREDUCE-6541.02.patch > > > We saw a MR deadlock recently: > - When NM restarted by framework without enable recovery, containers running > on these nodes will be identified as "ABORTED", and MR AM will try to > reschedule "ABORTED" mapper containers. > - Since such lost mappers are "ABORTED" container, MR AM gives normal mapper > priority (priority=20) to such mapper requests. If there's any pending > reducer (priority=10) at the same time, mapper requests need to wait for > reducer requests satisfied. > - In our test, one mapper needs 700+ MB, reducer needs 1000+ MB, and RM > available resource = mapper-request = (700+ MB), only one job was running in > the system so scheduler cannot allocate more reducer containers AND MR-AM > thinks there're enough headroom for mapper so reducer containers will not be > preempted. > MAPREDUCE-6302 can solve most of the problems, but in the other hand, I think > we may need to exclude scheduled reducers resource when calculating > #available-mapper-slots from headroom. Which we can avoid excessive reducer > preemption. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-2631) Potential resource leaks in BinaryProtocol$TeeOutputStream.java
[ https://issues.apache.org/jira/browse/MAPREDUCE-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-2631: -- Fix Version/s: (was: 2.9.0) > Potential resource leaks in BinaryProtocol$TeeOutputStream.java > --- > > Key: MAPREDUCE-2631 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-2631 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 0.23.0 >Reporter: Ravi Teja Ch N V >Assignee: Sunil G > Fix For: 2.8.0, 3.0.0-alpha2 > > Attachments: 0001-MAPREDUCE-2631.patch, 0002-MAPREDUCE-2631.patch, > 0003-MAPREDUCE-2631.patch, 0004-MAPREDUCE-2631.patch, > MAPREDUCE-2631.0005.patch, MAPREDUCE-2631.0006.patch, > MAPREDUCE-2631.0007.patch, MAPREDUCE-2631.02.patch, MAPREDUCE-2631.1.patch, > MAPREDUCE-2631.2.patch, MAPREDUCE-2631.3.patch, MAPREDUCE-2631.patch > > > {code:title=BinaryProtocol$TeeOutputStream.java|borderStyle=solid} > public void close() throws IOException { > flush(); > file.close(); > out.close(); > } > {code} > In the above code, if the file.close() throws any exception out will not be > closed. > -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
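The leak in the quoted close() above can be fixed with a try/finally so that the second stream is closed even when the first close() throws. This is a sketch of the general pattern on a minimal stand-in class, not the actual MAPREDUCE-2631 patch:

```java
import java.io.IOException;
import java.io.OutputStream;

// Minimal tee stream illustrating the leak-free close() pattern.
class TeeOutputStream extends OutputStream {
    private final OutputStream file;
    private final OutputStream out;

    TeeOutputStream(OutputStream file, OutputStream out) {
        this.file = file;
        this.out = out;
    }

    @Override public void write(int b) throws IOException {
        file.write(b);
        out.write(b);
    }

    @Override public void flush() throws IOException {
        file.flush();
        out.flush();
    }

    @Override public void close() throws IOException {
        flush();
        try {
            file.close();
        } finally {
            out.close(); // runs even when file.close() throws
        }
    }
}
```

If both close() calls can throw and both exceptions matter, try-with-resources (which suppresses secondary exceptions) is the more idiomatic modern alternative.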
[jira] [Updated] (MAPREDUCE-6622) Add capability to set JHS job cache to a task-based limit
[ https://issues.apache.org/jira/browse/MAPREDUCE-6622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6622: -- Fix Version/s: (was: 2.9.0) > Add capability to set JHS job cache to a task-based limit > - > > Key: MAPREDUCE-6622 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6622 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: jobhistoryserver >Affects Versions: 2.7.2 >Reporter: Ray Chiang >Assignee: Ray Chiang >Priority: Critical > Labels: supportability > Fix For: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1 > > Attachments: MAPREDUCE-6622.001.patch, MAPREDUCE-6622.002.patch, > MAPREDUCE-6622.003.patch, MAPREDUCE-6622.004.patch, MAPREDUCE-6622.005.patch, > MAPREDUCE-6622.006.patch, MAPREDUCE-6622.007.patch, MAPREDUCE-6622.008.patch, > MAPREDUCE-6622.009.patch, MAPREDUCE-6622.010.patch, MAPREDUCE-6622.011.patch, > MAPREDUCE-6622.012.patch, MAPREDUCE-6622.013.patch, MAPREDUCE-6622.014.patch > > > When setting the property mapreduce.jobhistory.loadedjobs.cache.size the jobs > can be of varying size. This is generally not a problem when the jobs sizes > are uniform or small, but when the job sizes can be very large (say greater > than 250k tasks), then the JHS heap size can grow tremendously. > In cases, where multiple jobs are very large, then the JHS can lock up and > spend all its time in GC. However, since the cache is holding on to all the > jobs, not much heap space can be freed up. > By setting a property that sets a cap on the number of tasks allowed in the > cache and since the total number of tasks loaded is directly proportional to > the amount of heap used, this should help prevent the JHS from locking up. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
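The task-based cap described above (bounding the sum of task counts rather than the number of cached jobs) can be sketched with a small insertion-ordered cache. This is an illustration of the idea, not the actual JHS implementation; all names are invented:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Cache of jobId -> task count, bounded by total tasks across all entries.
// Oldest jobs are evicted until the total is back under the cap.
class TaskBoundedJobCache {
    private final long maxTotalTasks;
    private long totalTasks = 0;
    private final LinkedHashMap<String, Integer> jobs = new LinkedHashMap<>();

    TaskBoundedJobCache(long maxTotalTasks) { this.maxTotalTasks = maxTotalTasks; }

    void put(String jobId, int taskCount) {
        Integer old = jobs.remove(jobId);
        if (old != null) totalTasks -= old;
        jobs.put(jobId, taskCount);
        totalTasks += taskCount;
        // Evict oldest entries while over the cap, but always keep at least
        // the job that was just loaded (a single huge job must stay usable).
        Iterator<Map.Entry<String, Integer>> it = jobs.entrySet().iterator();
        while (totalTasks > maxTotalTasks && jobs.size() > 1) {
            Map.Entry<String, Integer> eldest = it.next();
            totalTasks -= eldest.getValue();
            it.remove();
        }
    }

    int size() { return jobs.size(); }
    long totalTasks() { return totalTasks; }
}
```

Since heap usage is roughly proportional to loaded tasks, bounding total tasks bounds heap, which is what prevents the GC lock-up described in the issue.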
[jira] [Updated] (MAPREDUCE-6577) MR AM unable to load native library without MR_AM_ADMIN_USER_ENV set
[ https://issues.apache.org/jira/browse/MAPREDUCE-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-6577: -- Fix Version/s: (was: 2.9.0) > MR AM unable to load native library without MR_AM_ADMIN_USER_ENV set > > > Key: MAPREDUCE-6577 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6577 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: mr-am >Affects Versions: 2.6.0 >Reporter: Sangjin Lee >Assignee: Sangjin Lee >Priority: Critical > Fix For: 2.8.0, 2.7.3, 2.6.4, 3.0.0-alpha1 > > Attachments: MAPREDUCE-6577.01.patch, MAPREDUCE-6577.02.patch, > MAPREDUCE-6577.03.patch, MAPREDUCE-6577.04.patch > > > If yarn.app.mapreduce.am.admin.user.env (or yarn.app.mapreduce.am.env) is not > configured to set LD_LIBRARY_PATH, MR AM will fail to load the native library: > {noformat} > 2015-12-15 21:29:22,473 WARN [main] org.apache.hadoop.util.NativeCodeLoader: > Unable to load native-hadoop library for your platform... using builtin-java > classes where applicable > {noformat} > As a result, any code that needs the hadoop native library in the MR AM will > fail. 
For example, an uber-AM with lz4 compression for the mapper task will > fail: > {noformat} > 2015-12-15 21:30:17,575 WARN [uber-SubtaskRunner] > org.apache.hadoop.mapred.LocalContainerLauncher: Exception running local > (uberized) 'child' : java.lang.RuntimeException: native lz4 library not > available > at > org.apache.hadoop.io.compress.Lz4Codec.getCompressorType(Lz4Codec.java:125) > at > org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:148) > at > org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:163) > at org.apache.hadoop.mapred.IFile$Writer.(IFile.java:114) > at org.apache.hadoop.mapred.IFile$Writer.(IFile.java:97) > at > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1602) > at > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1482) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:457) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at > org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:391) > at > org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:309) > at > org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:195) > at > org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:238) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-5883) "Total megabyte-seconds" in job counters is slightly misleading
[ https://issues.apache.org/jira/browse/MAPREDUCE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated MAPREDUCE-5883: -- Fix Version/s: 2.8.0 > "Total megabyte-seconds" in job counters is slightly misleading > --- > > Key: MAPREDUCE-5883 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-5883 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.4.0, 3.0.0-alpha1 >Reporter: Nathan Roberts >Assignee: Nathan Roberts >Priority: Minor > Labels: BB2015-05-TBR > Fix For: 2.8.0, 2.7.2, 2.6.3, 3.0.0-alpha1 > > Attachments: MAPREDUCE-5883.patch > > > The following counters are in milliseconds so "megabyte-seconds" might be > better stated as "megabyte-milliseconds" > MB_MILLIS_MAPS.name= Total megabyte-seconds taken by all map > tasks > MB_MILLIS_REDUCES.name=Total megabyte-seconds taken by all reduce > tasks > VCORES_MILLIS_MAPS.name= Total vcore-seconds taken by all map tasks > VCORES_MILLIS_REDUCES.name=Total vcore-seconds taken by all reduce > tasks -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
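The unit mismatch above is easy to see from how such a counter accumulates (illustrative arithmetic, not the exact Hadoop code): the runtime factor is in milliseconds, so the product is megabyte-milliseconds, and a true megabyte-seconds figure would require dividing by 1000.

```java
// Illustrative accumulation for an MB_MILLIS_MAPS-style counter.
class CounterMath {
    // container memory (MB) * task runtime (milliseconds) = MB-milliseconds
    static long mbMillis(long containerMemoryMb, long taskRuntimeMillis) {
        return containerMemoryMb * taskRuntimeMillis;
    }

    // Converting to actual megabyte-seconds means dividing by 1000.
    static long mbSeconds(long mbMillisValue) {
        return mbMillisValue / 1000;
    }
}
```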
[jira] [Updated] (MAPREDUCE-6387) Serialize the recently added Task#encryptedSpillKey field at the end
[ https://issues.apache.org/jira/browse/MAPREDUCE-6387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6387:
--
    Fix Version/s: 2.8.0

> Serialize the recently added Task#encryptedSpillKey field at the end
>
>
> Key: MAPREDUCE-6387
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6387
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.7.0
> Reporter: Arun Suresh
> Assignee: Arun Suresh
> Priority: Minor
> Fix For: 2.8.0, 2.7.1, 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6387.1.patch
>
>
> There was a recent addition of an {{encryptedSpillKey}} to the Task object.
> And when serialized, this field was written out somewhere in the middle. This
> caused deployments that do not use DistributedCache to push job jars before
> running the job to fail rolling upgrade.
> Although deploying via Distributed Cache is the recommended method, there
> might still be deployments that use the node local classpath to pick up the
> MR framework classes (e.g. for efficiency purposes, since this does not
> require the jar being copied to hdfs and then to all the nodes).
> Ensuring that it is the last field written and read when the Task object is
> serialized would alleviate this issue.
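The compatibility argument behind the fix can be sketched in a few lines. The class below is an illustrative stand-in, not the real `Task` class: when a newly added field is appended at the very end of the serialized form, a reader built against the old layout still decodes every field it knows about and simply never consumes the trailing bytes; inserting the field in the middle would shift every later field and break that old reader mid-rolling-upgrade.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class AppendFieldExample {
    // Writer at the *new* version: old fields keep their positions and the
    // new field (encryptedSpillKey here) is appended at the very end.
    static void writeNewLayout(DataOutput out) throws IOException {
        out.writeUTF("job_1");                  // old field #1
        out.writeInt(3);                        // old field #2
        byte[] encryptedSpillKey = {1, 2};      // newly added field, written last
        out.writeInt(encryptedSpillKey.length);
        out.write(encryptedSpillKey);
    }

    // Reader at the *old* version: knows nothing about the new field and
    // simply never reads the trailing bytes.
    static String readOldLayout(DataInput in) throws IOException {
        return in.readUTF() + "/" + in.readInt();
    }

    // New writer -> old reader round trip; succeeding is the whole point.
    static String roundTrip() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            writeNewLayout(new DataOutputStream(buf));
            return readOldLayout(new DataInputStream(
                    new ByteArrayInputStream(buf.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip()); // job_1/3
    }
}
```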
[jira] [Updated] (MAPREDUCE-6451) DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
[ https://issues.apache.org/jira/browse/MAPREDUCE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6451:
--
    Fix Version/s: 2.8.0

> DistCp has incorrect chunkFilePath for multiple jobs when strategy is dynamic
> -
>
> Key: MAPREDUCE-6451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6451
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: distcp
> Affects Versions: 2.6.0
> Reporter: Kuhu Shukla
> Assignee: Kuhu Shukla
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6451-v1.patch, MAPREDUCE-6451-v2.patch,
> MAPREDUCE-6451-v3.patch, MAPREDUCE-6451-v4.patch, MAPREDUCE-6451-v5.patch
>
>
> DistCp when used with dynamic strategy does not update the chunkFilePath and
> other static variables any time other than for the first job. This is seen
> when DistCp::run() is used.
> A single copy succeeds but multiple jobs finish successfully without any real
> copying.
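The failure mode here is a general bug class worth spelling out. The sketch below is an illustrative reduction only (the names mimic DistCp's dynamic-strategy state but this is not the real code): per-job state held in a static field is initialized by the first job and then silently reused by every later job in the same JVM, so subsequent jobs point at stale paths and "succeed" without doing real work.

```java
public class StaticChunkPathExample {
    static String chunkFilePath;                       // the problematic static

    // Buggy pattern: initialize only once, so the first job wins forever.
    static String initOnce(String jobStagingDir) {
        if (chunkFilePath == null) {
            chunkFilePath = jobStagingDir + "/chunks";
        }
        return chunkFilePath;
    }

    // Fixed pattern: re-derive the per-job state on every job submission.
    static String initPerJob(String jobStagingDir) {
        chunkFilePath = jobStagingDir + "/chunks";
        return chunkFilePath;
    }

    public static void main(String[] args) {
        System.out.println(initOnce("/staging/job1"));   // /staging/job1/chunks
        System.out.println(initOnce("/staging/job2"));   // stale: /staging/job1/chunks
        System.out.println(initPerJob("/staging/job2")); // /staging/job2/chunks
    }
}
```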
[jira] [Updated] (MAPREDUCE-6518) Set SO_KEEPALIVE on shuffle connections
[ https://issues.apache.org/jira/browse/MAPREDUCE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6518:
--
    Fix Version/s: 2.8.0

> Set SO_KEEPALIVE on shuffle connections
> ---
>
> Key: MAPREDUCE-6518
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6518
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2, nodemanager
> Affects Versions: 2.7.1
> Reporter: Nathan Roberts
> Assignee: Chang Li
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6518.4.patch, YARN-4052.2.patch,
> YARN-4052.3.patch, YARN-4052.patch
>
>
> Shuffle handler does not set SO_KEEPALIVE so we've seen cases where
> FDs/sockets get stuck in ESTABLISHED state indefinitely because the server
> did not see the client leave (network cut or otherwise).
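The idea behind the fix can be shown with plain `java.net` sockets (the actual patch applies to the Netty-based ShuffleHandler, where the equivalent channel option is configured instead): with SO_KEEPALIVE enabled, the kernel periodically probes an idle peer, so a connection whose client vanished is eventually reset rather than sitting in ESTABLISHED forever and pinning a file descriptor.

```java
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;

public class KeepAliveExample {
    // Enable SO_KEEPALIVE and report whether the option actually took effect.
    static boolean enableKeepAlive(Socket socket) {
        try {
            socket.setKeepAlive(true); // conceptually, the one-line fix
            return socket.getKeepAlive();
        } catch (SocketException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0); // ephemeral local port
             Socket client = new Socket("127.0.0.1", server.getLocalPort());
             Socket accepted = server.accept()) {
            System.out.println(enableKeepAlive(accepted)); // true
        }
    }
}
```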
[jira] [Updated] (MAPREDUCE-6497) Fix wrong value of JOB_FINISHED event in JobHistoryEventHandler
[ https://issues.apache.org/jira/browse/MAPREDUCE-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6497:
--
    Fix Version/s: 2.8.0

> Fix wrong value of JOB_FINISHED event in JobHistoryEventHandler
> ---
>
> Key: MAPREDUCE-6497
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6497
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.7.1
> Reporter: Shinichi Yamashita
> Assignee: Shinichi Yamashita
> Fix For: 2.8.0, 2.7.2, 2.6.2, 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6497.001.patch
>
>
> It seems that the "MAP_COUNTER_GROUPS" values use the total counter values.
> We should fix this to use the map counter values.
[jira] [Updated] (MAPREDUCE-6474) ShuffleHandler can possibly exhaust nodemanager file descriptors
[ https://issues.apache.org/jira/browse/MAPREDUCE-6474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6474:
--
    Fix Version/s: 2.8.0

> ShuffleHandler can possibly exhaust nodemanager file descriptors
>
>
> Key: MAPREDUCE-6474
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6474
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2, nodemanager
> Affects Versions: 2.5.0
> Reporter: Nathan Roberts
> Assignee: Kuhu Shukla
> Fix For: 2.8.0, 2.7.2, 3.0.0-alpha1
>
> Attachments: YARN-2410-v1.patch, YARN-2410-v10.patch,
> YARN-2410-v11.patch, YARN-2410-v2.patch, YARN-2410-v3.patch,
> YARN-2410-v4.patch, YARN-2410-v5.patch, YARN-2410-v6.patch,
> YARN-2410-v7.patch, YARN-2410-v8.patch, YARN-2410-v9.patch
>
>
> The async nature of the shufflehandler can cause it to open a huge number of
> file descriptors; when it runs out, it crashes.
> Scenario:
> Job with 6K reduces, slow start set to 0.95, about 40 map outputs per node.
> Let's say all 6K reduces hit a node at about the same time asking for their
> outputs. Each reducer will ask for all 40 map outputs over a single socket in
> a single request (not necessarily all 40 at once, but with coalescing it is
> likely to be a large number).
> sendMapOutput() will open the file for random reading and then perform an
> async transfer of the particular portion of this file. This will
> theoretically happen 6000*40=240,000 times, which will run the NM out of file
> descriptors and cause it to crash.
> The algorithm should be refactored a little to not open the fds until they're
> actually needed.
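The refactoring direction described in the last line of the report can be sketched as follows. This is illustrative only, not the real ShuffleHandler code, and `openFds` is a stand-in counter rather than an actual file descriptor: queue lightweight *descriptions* of map outputs and open each one only when its transfer is about to start, so queuing 240,000 pending transfers no longer means holding 240,000 descriptors at once.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Supplier;

public class LazyOpenExample {
    static int openFds = 0; // stand-in for real file descriptors

    // Eager pattern: every queued output costs an fd immediately.
    static Queue<Integer> queueEagerly(int outputs) {
        Queue<Integer> q = new ArrayDeque<>();
        for (int i = 0; i < outputs; i++) { openFds++; q.add(i); }
        return q;
    }

    // Lazy pattern: queue a supplier; the "open" happens only when polled.
    static Queue<Supplier<Integer>> queueLazily(int outputs) {
        Queue<Supplier<Integer>> q = new ArrayDeque<>();
        for (int i = 0; i < outputs; i++) {
            final int id = i;
            q.add(() -> { openFds++; return id; }); // deferred open
        }
        return q;
    }

    public static void main(String[] args) {
        queueEagerly(240_000);
        System.out.println(openFds);   // 240000 held at once: crash territory
        openFds = 0;
        Queue<Supplier<Integer>> lazy = queueLazily(240_000);
        lazy.poll().get();             // open only as each transfer starts
        System.out.println(openFds);   // 1
    }
}
```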