[jira] [Commented] (MAPREDUCE-7471) Hadoop mapred minicluster command line fails with class not found
[ https://issues.apache.org/jira/browse/MAPREDUCE-7471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812211#comment-17812211 ]

Xiaoqiao He commented on MAPREDUCE-7471:

Hi [~slfan1989], [~ayushsaxena], [~zhangshuyan], would you mind reviewing this? Thanks.

> Hadoop mapred minicluster command line fails with class not found
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-7471
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7471
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.3.5
>            Reporter: Duo Zhang
>            Priority: Major
>
> If you run
> ./bin/mapred minicluster
> it will fail with
> {noformat}
> Exception in thread "Listener at localhost/35325" java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer
>     at org.apache.hadoop.hdfs.MiniDFSCluster.isNameNodeUp(MiniDFSCluster.java:2648)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.isClusterUp(MiniDFSCluster.java:2662)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1510)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:989)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:588)
>     at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:530)
>     at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.start(MiniHadoopClusterManager.java:160)
>     at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.run(MiniHadoopClusterManager.java:132)
>     at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.main(MiniHadoopClusterManager.java:320)
> Caused by: java.lang.ClassNotFoundException: org.mockito.stubbing.Answer
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
>     ... 9 more
> {noformat}
> The failing line is
> https://github.com/apache/hadoop/blob/835403d872506c4fa76eb2d721f2d91f413473d5/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java#L2648
> This is because we rely on mockito in NameNodeAdapter, but mockito is not on
> our classpath, at least in our published hadoop-3.3.5 binary.
> There is a second problem: if we run the above command from outside the
> HADOOP_HOME directory, i.e., from another directory by typing the absolute
> path of the mapred command, it fails with
> {noformat}
> Exception in thread "main" java.lang.NoClassDefFoundError: org/junit/Assert
>     at org.apache.hadoop.test.GenericTestUtils.assertExists(GenericTestUtils.java:336)
>     at org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:280)
>     at org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:289)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.getBaseDirectory(MiniDFSCluster.java:3069)
>     at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.<init>(MiniDFSCluster.java:239)
>     at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.start(MiniHadoopClusterManager.java:157)
>     at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.run(MiniHadoopClusterManager.java:132)
>     at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.main(MiniHadoopClusterManager.java:320)
> Caused by: java.lang.ClassNotFoundException: org.junit.Assert
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
>     ... 8 more
> {noformat}
> This is simply because of this line:
> https://github.com/apache/hadoop/blob/835403d872506c4fa76eb2d721f2d91f413473d5/hadoop-common-project/hadoop-common/src/main/bin/hadoop-functions.sh#L601
> We should add the $HADOOP_TOOLS_HOME prefix to the default value of
> HADOOP_TOOLS_DIR.
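
For anyone trying to reproduce the two failures described in the issue, a small probe can tell whether the classes named in the stack traces are visible on a given classpath. This is only a diagnostic sketch: the class name ClasspathProbe and the helper isLoadable are made up for illustration and are not part of Hadoop. It would have to be run with the same classpath the mapred script builds to say anything about the minicluster.

```java
public class ClasspathProbe {

    /**
     * Returns true if the named class can be located by this class's loader.
     * The class is not initialized; we only check that it can be found.
     */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The two classes reported missing in the issue description.
        String[] names = {"org.mockito.stubbing.Answer", "org.junit.Assert"};
        for (String name : names) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```

Running it once from the install directory and once from another directory should show whether the working directory changes which jars are picked up.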
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-7311) Fix non-idempotent test in TestTaskProgressReporter
[ https://issues.apache.org/jira/browse/MAPREDUCE-7311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoqiao He updated MAPREDUCE-7311:
-----------------------------------
    Target Version/s:   (was: 3.2.2)
          Issue Type: Test  (was: Bug)

Thanks [~lzx404243]. Changed the issue type to Test and removed the target version. Please let me know if this blocks 3.2.2. Thanks.

> Fix non-idempotent test in TestTaskProgressReporter
> ---------------------------------------------------
>
>                 Key: MAPREDUCE-7311
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7311
>             Project: Hadoop Map/Reduce
>          Issue Type: Test
>            Reporter: Zhengxi Li
>            Priority: Minor
>         Attachments: MAPREDUCE-7311.001.patch
>
>
> The test
> {{org.apache.hadoop.mapred.TestTaskProgressReporter.testBytesWrittenRespectingLimit}}
> is not idempotent and fails if run twice in the same JVM, because it
> pollutes state shared among tests. It would be good to clean up this state
> pollution so that other tests do not fail in the future because of the
> shared state polluted by this test.
> h3. Details
> Running {{TestTaskProgressReporter.testBytesWrittenRespectingLimit}} twice
> results in the second run failing with the following assertion:
> {noformat}
> Assert.assertEquals(failFast, threadExited)
> {noformat}
> The root cause is that when {{testBytesWrittenRespectingLimit}} writes
> some bytes to the local file system, some counters are incremented. The
> problem is that the counters are not reset after the test is done. With this
> polluted shared state, the test's assumptions are broken, and the second run
> fails.
> PR link: https://github.com/apache/hadoop/pull/2500
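
The failure mode described above (a counter that survives between runs in the same JVM) can be shown in isolation. The sketch below is not the Hadoop test or the patch in the PR; SharedCounterPollution, BYTES_WRITTEN, testBody and reset are hypothetical names, and the static counter merely stands in for whatever shared statistics the real test touches.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SharedCounterPollution {

    // Stand-in for JVM-wide state shared by every test in the process.
    static final AtomicLong BYTES_WRITTEN = new AtomicLong();

    static void writeBytes(int n) {
        BYTES_WRITTEN.addAndGet(n);
    }

    /** A "test" that silently assumes the counter starts at zero. */
    static boolean testBody(long limit) {
        writeBytes(10);
        return BYTES_WRITTEN.get() <= limit;
    }

    /** The cleanup step that makes the test repeatable. */
    static void reset() {
        BYTES_WRITTEN.set(0);
    }

    public static void main(String[] args) {
        System.out.println(testBody(10)); // true: counter went from 0 to 10
        System.out.println(testBody(10)); // false: counter is now 20
        reset();
        System.out.println(testBody(10)); // true again after cleanup
    }
}
```

In a JUnit test the reset would normally live in a teardown method so that it runs even when the test body fails.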
[jira] [Updated] (MAPREDUCE-7310) Fix flaky test TestJobHistoryEventHandler.testSigTermedFunctionality
[ https://issues.apache.org/jira/browse/MAPREDUCE-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoqiao He updated MAPREDUCE-7310:
-----------------------------------
    Target Version/s:   (was: 3.2.2)
          Issue Type: Test  (was: Bug)

Thanks [~lzx404243]. Changed the issue type to Test and removed the target version. Please let me know if this blocks 3.2.2. Thanks.

> Fix flaky test TestJobHistoryEventHandler.testSigTermedFunctionality
> --------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7310
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7310
>             Project: Hadoop Map/Reduce
>          Issue Type: Test
>            Reporter: Zhengxi Li
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: MAPREDUCE-7310.001.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The test
> {{org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler.testSigTermedFunctionality}}
> is not idempotent and fails if run twice in the same JVM, because it
> pollutes state shared among tests. It would be good to clean up this state
> pollution so that other tests do not fail in the future because of the
> shared state polluted by this test.
> h3. Details
> Running {{TestJobHistoryEventHandler.testSigTermedFunctionality}} twice
> results in the second run failing with the {{NullPointerException}} shown
> below:
> {noformat}
> java.lang.NullPointerException
>     at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:460)
>     at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>     at org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler.testSigTermedFunctionality(TestJobHistoryEventHandler.java:933)
> {noformat}
> The root cause is that running
> {{TestJobHistoryEventHandler.testSigTermedFunctionality}} adds some
> entries to the static {{JobHistoryEventHandler.fileMap}}. The
> entries in {{fileMap}} are not cleaned up when the test is done,
> resulting in a NullPointerException in the second run when the stale
> object (added in the first run) in {{fileMap}} is accessed.
>
> PR link: https://github.com/apache/hadoop/pull/2499
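
A stale entry in a static map causing a NullPointerException on a later run can be reproduced in a few lines. The sketch below only illustrates that general pattern; it is an assumption about the shape of the problem, not a copy of JobHistoryEventHandler. The names StaticMapPollution, FILE_MAP, Entry and runTest are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class StaticMapPollution {

    /** Hypothetical per-job record whose writer is released on close. */
    static final class Entry {
        private StringBuilder writer = new StringBuilder();

        void close() {
            writer.append("closed"); // throws NullPointerException if already closed
            writer = null;
        }
    }

    // Static map shared by every run in the same JVM.
    static final Map<String, Entry> FILE_MAP = new HashMap<>();

    /** One "test run": register a job, then stop the service, which closes every entry. */
    static void runTest(String jobId, boolean cleanUp) {
        FILE_MAP.put(jobId, new Entry());
        for (Entry e : FILE_MAP.values()) {
            e.close();
        }
        if (cleanUp) {
            FILE_MAP.clear();
        }
    }

    public static void main(String[] args) {
        runTest("job_1", false);
        try {
            runTest("job_2", false); // also visits the stale job_1 entry
        } catch (NullPointerException e) {
            System.out.println("second run failed on a stale entry");
        }
    }
}
```

Clearing the map at the end of each run (cleanUp set to true) lets any number of runs succeed in the same JVM.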
[jira] [Commented] (MAPREDUCE-7309) Improve performance of reading resource request for mapper/reducers from config
[ https://issues.apache.org/jira/browse/MAPREDUCE-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17239978#comment-17239978 ]

Xiaoqiao He commented on MAPREDUCE-7309:

backport to branch-3.2.2.

> Improve performance of reading resource request for mapper/reducers from config
> -------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7309
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7309
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster
>    Affects Versions: 3.0.0, 3.1.0, 3.2.0, 3.3.0
>            Reporter: Wangda Tan
>            Assignee: Peter Bacsko
>            Priority: Major
>             Fix For: 3.2.2, 3.4.0, 3.1.5, 3.3.1
>
>         Attachments: MAPREDUCE-7309-003.patch, MAPREDUCE-7309-004.patch,
> MAPREDUCE-7309-005.patch, MAPREDUCE-7309-branch-3.1-001.patch,
> MAPREDUCE-7309-branch-3.2-001.patch, MAPREDUCE-7309-branch-3.3-001.patch,
> MAPREDUCE-7309.001.patch, MAPREDUCE-7309.002.patch
>
>
> This issue could affect all releases that include YARN-6927.
> Basically, we run a regex match repeatedly when we read the mapper/reducer
> resource request from config files. With a large config file and a large
> number of splits, this can take a long time.
> We saw the AM take hours to parse the config when we had 200k+ splits and a
> large config file (hundreds of KB).
> The problematic part is this:
> {noformat}
>   private void populateResourceCapability(TaskType taskType) {
>     String resourceTypePrefix =
>         getResourceTypePrefix(taskType);
>     boolean memorySet = false;
>     boolean cpuVcoresSet = false;
>     if (resourceTypePrefix != null) {
>       List<ResourceInformation> resourceRequests =
>           ResourceUtils.getRequestedResourcesFromConfig(conf,
>               resourceTypePrefix);
> {noformat}
> Inside {{ResourceUtils.getRequestedResourcesFromConfig()}}, we call
> {{Configuration.getValByRegex()}}, which goes through all property keys that
> come from the MapReduce job configuration (jobconf.xml). If the job config is
> large (e.g. because it is part of an MR pipeline and was populated by an
> earlier job), this runs a regexp match over all properties again and again.
> This is unnecessary, because all mappers share one config and all reducers
> share one config.
> We should do proper caching for pre-configured resource requests.
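
The caching idea in the last sentence can be sketched independently of Hadoop: scan the configuration once per prefix and reuse the result for every task. This is a minimal illustration under stated assumptions, not the attached patch. ResourceRequestCache and its methods are hypothetical, a plain Map stands in for the job configuration, and the property keys in the usage below are example values only.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Pattern;

public class ResourceRequestCache {

    private final Map<String, String> conf; // stand-in for the job configuration
    private final Map<String, Map<String, String>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger scans = new AtomicInteger();

    ResourceRequestCache(Map<String, String> conf) {
        this.conf = conf;
    }

    /** The expensive part: a regex match against every property key. */
    private Map<String, String> scan(String prefix) {
        scans.incrementAndGet();
        Pattern pattern = Pattern.compile("^" + Pattern.quote(prefix) + "[^.]+$");
        Map<String, String> matches = new TreeMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (pattern.matcher(e.getKey()).matches()) {
                matches.put(e.getKey(), e.getValue());
            }
        }
        return matches;
    }

    /** Scans at most once per prefix; later calls return the cached result. */
    Map<String, String> requestsFor(String prefix) {
        return cache.computeIfAbsent(prefix, this::scan);
    }

    int scanCount() {
        return scans.get();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new TreeMap<>();
        conf.put("mapreduce.map.resource.memory-mb", "1024"); // example keys
        conf.put("mapreduce.map.resource.vcores", "2");
        conf.put("some.other.key", "x");
        ResourceRequestCache cache = new ResourceRequestCache(conf);
        for (int task = 0; task < 200_000; task++) {
            cache.requestsFor("mapreduce.map.resource.");
        }
        System.out.println("full scans performed: " + cache.scanCount());
    }
}
```

With the cache in place the cost of the regex pass no longer grows with the number of task attempts, only with the number of distinct prefixes.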
[jira] [Commented] (MAPREDUCE-7304) Enhance the map-reduce Job end notifier to be able to notify the given URL via a custom class
[ https://issues.apache.org/jira/browse/MAPREDUCE-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17239977#comment-17239977 ]

Xiaoqiao He commented on MAPREDUCE-7304:

backport to branch-3.2.2.

> Enhance the map-reduce Job end notifier to be able to notify the given URL via a custom class
> ---------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7304
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7304
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>            Reporter: Daniel Fritsi
>            Assignee: Zoltán Erdmann
>            Priority: Major
>             Fix For: 3.2.2, 3.4.0, 3.1.5, 3.3.1
>
>         Attachments: MAPREDUCE-7304-001.patch, MAPREDUCE-7304-002.patch,
> MAPREDUCE-7304-003.patch, MAPREDUCE-7304-004.patch,
> MAPREDUCE-7304-branch-3.1-001.patch, MAPREDUCE-7304-branch-3.2-001.patch,
> MAPREDUCE-7304-branch-3.3-001.patch
>
>
> Currently
> {color:#0747a6}{{*org.apache.hadoop.mapreduce.v2.app.JobEndNotifier*}}{color}
> allows a very limited configuration on how the given Job end notification URL
> should be notified. We should enhance this, but instead of adding more
> *{color:#0747A6}{{mapreduce.job.end-notification.*}}{color}* properties to be
> able to configure the underlying HttpURLConnection, we should add a new
> property so users can use their own notifier class.
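
The proposal above (one property naming a user-supplied notifier class instead of many connection-tuning properties) follows a common plugin pattern, sketched here in plain Java. Everything in the sketch is hypothetical: the EndNotifier interface, the property key, and the loader are illustrations of the pattern, not the interface or property name introduced by the actual patch.

```java
import java.util.Properties;

public class CustomNotifierLoading {

    /** Hypothetical contract a user-supplied notifier would implement. */
    public interface EndNotifier {
        boolean notifyOnce(String url);
    }

    /** Default used when no custom class is configured; it only logs. */
    public static class LoggingNotifier implements EndNotifier {
        @Override
        public boolean notifyOnce(String url) {
            System.out.println("would notify " + url);
            return true;
        }
    }

    // Made-up key for illustration; not a real Hadoop property.
    static final String NOTIFIER_CLASS_KEY = "example.job.end-notification.notifier-class";

    /** Instantiates the configured notifier class via its no-argument constructor. */
    static EndNotifier load(Properties props) {
        String className = props.getProperty(NOTIFIER_CLASS_KEY, LoggingNotifier.class.getName());
        try {
            return Class.forName(className)
                    .asSubclass(EndNotifier.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot create notifier " + className, e);
        }
    }

    public static void main(String[] args) {
        EndNotifier notifier = load(new Properties());
        notifier.notifyOnce("http://example.invalid/job-finished");
    }
}
```

The benefit of this shape is that connection details such as proxies, authentication or retries stay inside the user's class rather than growing the set of configuration properties.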
[jira] [Assigned] (MAPREDUCE-7247) Modify HistoryServerRest.html content,change The job attempt id‘s datatype from string to int
[ https://issues.apache.org/jira/browse/MAPREDUCE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoqiao He reassigned MAPREDUCE-7247:
--------------------------------------
    Target Version/s: 3.3.0, 3.2.2, 2.10.1, 3.0.4  (was: 3.0.4, 3.3.0, 3.2.2, 2.10.1)
            Assignee: zhaoshengjie

Added [~zhaoshengjie] to the contributors list and assigned this issue to him.

> Modify HistoryServerRest.html content,change The job attempt id‘s datatype from string to int
> ---------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7247
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7247
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 3.2.1
>            Reporter: zhaoshengjie
>            Assignee: zhaoshengjie
>            Priority: Major
>             Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
>         Attachments: image-2019-10-29-14-46-17-354.png,
> image-2019-10-29-14-46-49-929.png
>
>
> In the documentation of the Job Attempts API
> http://history-server-http-address:port/ws/v1/history/mapreduce/jobs/\{jobid}/jobattempts
> at
> http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/HistoryServerRest.html#Job_Attempts_API,
> change the datatype of the job attempt id from string to int.
> !image-2019-10-29-14-46-17-354.png|width=508,height=126!
> !image-2019-10-29-14-46-49-929.png|width=465,height=315!
[jira] [Updated] (MAPREDUCE-7167) Extra LF ("\n") pushed directly to storage
[ https://issues.apache.org/jira/browse/MAPREDUCE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoqiao He updated MAPREDUCE-7167:
-----------------------------------
    Target Version/s: 3.2.3  (was: 3.2.2)

Updated the target version to 3.2.3 in preparation for the 3.2.2 release. Please let me know if this is a blocker for you. Thanks.

> Extra LF ("\n") pushed directly to storage
> ------------------------------------------
>
>                 Key: MAPREDUCE-7167
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7167
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 3.2.0
>            Reporter: Saurabh
>            Assignee: Saurabh
>            Priority: Major
>         Attachments: image-2018-11-28-19-23-52-972.png,
> image-2018-11-29-14-53-58-176.png, image-2018-11-29-14-54-28-254.png,
> nremoved.txt, nremoved.txt, patch1128.patch, patch1128.patch,
> patch1128trunk.patch, withn.txt, withn.txt
>
>
> JsonEncoder already adds the necessary newline after writing each object, as
> per
> [this|https://github.com/apache/avro/blob/39ec1a3f0addfce06869f705f7a17c03d538fe16/lang/java/avro/src/main/java/org/apache/avro/io/JsonEncoder.java#L77],
> so this patch removes the {{out.writeBytes("\n");}} call. Because the encoder
> is buffered, {{out.writeBytes}} writes directly to the output stream ahead of
> the encoder's buffered content, which can cause JSON errors in the output
> stream; hence it must be removed.
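
The ordering problem described above is easy to see with plain JDK streams. In the sketch below a BufferedWriter stands in for the buffered encoder; this is an analogy under that assumption and does not use Avro's JsonEncoder. The class and method names are hypothetical.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class BufferedInterleaving {

    /** Writes two records through a buffered writer, optionally adding a direct newline after each. */
    static String write(boolean extraDirectNewline) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            // Stand-in for a buffered encoder layered on top of 'out'.
            Writer encoder = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
            for (int i = 1; i <= 2; i++) {
                encoder.write("{\"id\":" + i + "}\n"); // stays in the buffer for now
                if (extraDirectNewline) {
                    out.write('\n'); // bypasses the buffer and reaches 'out' immediately
                }
            }
            encoder.flush(); // buffered records reach 'out' only here
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Show the newlines explicitly so the ordering is visible.
        System.out.println(write(true).replace("\n", "\\n"));
        System.out.println(write(false).replace("\n", "\\n"));
    }
}
```

With the direct write enabled, both extra newlines land in front of the records rather than after them, which is the kind of misplaced byte the issue describes. Without it, each record is followed by exactly one newline.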