[jira] [Commented] (MAPREDUCE-7471) Hadoop mapred minicluster command line fails with class not found

2024-01-30 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812211#comment-17812211
 ] 

Xiaoqiao He commented on MAPREDUCE-7471:


Hi [~slfan1989], [~ayushsaxena], [~zhangshuyan], would you mind taking a look 
here? Thanks.

> Hadoop mapred minicluster command line fails with class not found
> -
>
> Key: MAPREDUCE-7471
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7471
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 3.3.5
> Reporter: Duo Zhang
> Priority: Major
>
> If you run
> ./bin/mapred minicluster
> It will fail with
> {noformat}
> Exception in thread "Listener at localhost/35325" java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer
>   at org.apache.hadoop.hdfs.MiniDFSCluster.isNameNodeUp(MiniDFSCluster.java:2648)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.isClusterUp(MiniDFSCluster.java:2662)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1510)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:989)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:588)
>   at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:530)
>   at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.start(MiniHadoopClusterManager.java:160)
>   at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.run(MiniHadoopClusterManager.java:132)
>   at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.main(MiniHadoopClusterManager.java:320)
> Caused by: java.lang.ClassNotFoundException: org.mockito.stubbing.Answer
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
>   ... 9 more
> {noformat}
> The failing line is
> https://github.com/apache/hadoop/blob/835403d872506c4fa76eb2d721f2d91f413473d5/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java#L2648
> This is because we rely on Mockito in NameNodeAdapter, but Mockito is not on 
> our classpath, at least in our published hadoop-3.3.5 binary.
> There is a second problem: if we run the above command from a directory other 
> than HADOOP_HOME, i.e. by typing the absolute path of the mapred command, it 
> will fail with
> {noformat}
> Exception in thread "main" java.lang.NoClassDefFoundError: org/junit/Assert
>   at org.apache.hadoop.test.GenericTestUtils.assertExists(GenericTestUtils.java:336)
>   at org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:280)
>   at org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:289)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.getBaseDirectory(MiniDFSCluster.java:3069)
>   at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.<init>(MiniDFSCluster.java:239)
>   at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.start(MiniHadoopClusterManager.java:157)
>   at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.run(MiniHadoopClusterManager.java:132)
>   at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.main(MiniHadoopClusterManager.java:320)
> Caused by: java.lang.ClassNotFoundException: org.junit.Assert
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
>   ... 8 more
> {noformat}
> This is simply because of this line:
> https://github.com/apache/hadoop/blob/835403d872506c4fa76eb2d721f2d91f413473d5/hadoop-common-project/hadoop-common/src/main/bin/hadoop-functions.sh#L601
> We should add the $HADOOP_TOOLS_HOME prefix to the default value of 
> HADOOP_TOOLS_DIR.
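
A small probe can help confirm which of the classes named in the two stack traces are actually visible on a given classpath. The sketch below is only a diagnostic aid, not part of any proposed fix; the class name `ClasspathProbe` and its methods are hypothetical, and only the two probed class names come from the traces above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of Hadoop.
class ClasspathProbe {

  /** Returns true if the named class can be located (without initializing it). */
  static boolean isPresent(String className) {
    try {
      Class.forName(className, false, ClasspathProbe.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  /** Returns the subset of the given class names that cannot be loaded. */
  static List<String> missing(String... classNames) {
    List<String> result = new ArrayList<>();
    for (String name : classNames) {
      if (!isPresent(name)) {
        result.add(name);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // The two classes reported as missing in the stack traces above.
    for (String name : missing("org.mockito.stubbing.Answer", "org.junit.Assert")) {
      System.out.println("missing: " + name);
    }
  }
}
```

Running the probe with the same classpath that the `mapred` script assembles should show whether the Mockito and JUnit jars are being picked up from the tools directory.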



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7311) Fix non-idempotent test in TestTaskProgressReporter

2020-12-05 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated MAPREDUCE-7311:
---
Target Version/s:   (was: 3.2.2)
  Issue Type: Test  (was: Bug)

Thanks [~lzx404243]. Changed the issue type to Test and removed the target 
version. Please let me know if it blocks 3.2.2. Thanks.

> Fix non-idempotent test in TestTaskProgressReporter
> ---
>
> Key: MAPREDUCE-7311
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7311
> Project: Hadoop Map/Reduce
> Issue Type: Test
> Reporter: Zhengxi Li
> Priority: Minor
> Attachments: MAPREDUCE-7311.001.patch
>
>
> The test 
> {{`org.apache.hadoop.mapred.TestTaskProgressReporter.testBytesWrittenRespectingLimit`}}
>  is not idempotent and fails if run twice in the same JVM, because it 
> pollutes state shared among tests. It may be good to clean this state 
> pollution so that some other tests do not fail in the future due to the 
> shared state polluted by this test.
> h3. Details
> Running {{`TestTaskProgressReporter.testBytesWrittenRespectingLimit`}} twice 
> would result in the second run failing with the following assertion:
> {noformat}
> Assert.assertEquals(failFast, threadExited)
> {noformat}
> The root cause is that when `testBytesWrittenRespectingLimit` writes 
> some bytes to the local file system, some counters are incremented. The 
> problem is that these counters are not reset after the test is done. With this 
> polluted shared state, the test's assumptions no longer hold, so the second 
> run fails.
> PR link: https://github.com/apache/hadoop/pull/2500
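
The failure mode described above can be reproduced in miniature. The following is a minimal sketch, assuming a single JVM-wide byte counter and an arbitrary limit of 50; `SharedCounterDemo`, `BYTES_WRITTEN`, and the numbers are hypothetical stand-ins and do not reflect the actual test or the change in the linked PR.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a test that relies on JVM-wide statistics.
class SharedCounterDemo {

  // Shared by every "test" that runs in this JVM.
  static final AtomicLong BYTES_WRITTEN = new AtomicLong();

  // The test body writes 40 bytes and assumes the counter started at zero,
  // so it expects to stay under a limit of 50.
  static boolean testBody() {
    BYTES_WRITTEN.addAndGet(40);
    return BYTES_WRITTEN.get() <= 50;
  }

  // Same body, but the shared counter is reset once the test is done.
  static boolean testBodyWithCleanup() {
    try {
      return testBody();
    } finally {
      BYTES_WRITTEN.set(0);
    }
  }

  public static void main(String[] args) {
    System.out.println("run 1 passes: " + testBody());
    System.out.println("run 2 passes: " + testBody());
    BYTES_WRITTEN.set(0);
    System.out.println("with cleanup, run 1 passes: " + testBodyWithCleanup());
    System.out.println("with cleanup, run 2 passes: " + testBodyWithCleanup());
  }
}
```

Without the reset, the second run starts from 40 bytes already counted and exceeds the limit; with the reset in a `finally` block, each run sees a clean counter even if the body throws.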



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Updated] (MAPREDUCE-7310) Fix flaky test TestJobHistoryEventHandler.testSigTermedFunctionality

2020-12-05 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated MAPREDUCE-7310:
---
Target Version/s:   (was: 3.2.2)
  Issue Type: Test  (was: Bug)

Thanks [~lzx404243]. Changed the issue type to Test and removed the target 
version. Please let me know if it blocks 3.2.2. Thanks.

> Fix flaky test TestJobHistoryEventHandler.testSigTermedFunctionality
> 
>
> Key: MAPREDUCE-7310
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7310
> Project: Hadoop Map/Reduce
> Issue Type: Test
> Reporter: Zhengxi Li
> Priority: Minor
> Labels: pull-request-available
> Attachments: MAPREDUCE-7310.001.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The test 
> {{org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler.testSigTermedFunctionality}}
>  is not idempotent and fails if run twice in the same JVM, because it 
> pollutes state shared among tests. It may be good to clean this state 
> pollution so that some other tests do not fail in the future due to the 
> shared state polluted by this test.
> h3. Details
> Running `TestJobHistoryEventHandler.testSigTermedFunctionality` twice would 
> result in the second run failing with the following `NullPointerException`:
> {noformat}
>  java.lang.NullPointerException
>  at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:460)
>  at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>  at org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler.testSigTermedFunctionality(TestJobHistoryEventHandler.java:933)
> {noformat}
> The root cause is that running 
> {{TestJobHistoryEventHandler.testSigTermedFunctionality}} adds some 
> entries to the static {{JobHistoryEventHandler.fileMap}}. The 
> entries in the {{fileMap}} are not cleaned up when the test is done, 
> resulting in a NullPointerException in the second run when the stale 
> object (added in the first run) in the {{fileMap}} is accessed.
>  
> PR link: https://github.com/apache/hadoop/pull/2499
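
To illustrate how a stale entry in a static map can surface as a `NullPointerException` on a later run, here is a minimal sketch. Everything in it (`StaleStaticMapDemo`, `MetaInfo`, `FILE_MAP`, the job ids) is a hypothetical simplification; it is not the real `JobHistoryEventHandler` code and not the fix from the linked PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplification of a handler that keeps per-job state in a static map.
class StaleStaticMapDemo {

  static final class MetaInfo {
    // Released (set to null) when the handler stops.
    StringBuilder writer = new StringBuilder();
  }

  // Static, so entries survive from one test run to the next in the same JVM.
  static final Map<String, MetaInfo> FILE_MAP = new HashMap<>();

  // One "test run": register a job, then stop the handler, which touches
  // every entry in the map, including entries left behind by earlier runs.
  static void runOnce(String jobId) {
    FILE_MAP.put(jobId, new MetaInfo());
    for (MetaInfo info : FILE_MAP.values()) {
      info.writer.append("flushed"); // throws NullPointerException for a stale entry
      info.writer = null;            // resource released, but the entry stays in the map
    }
  }

  // Same run, but the static map is cleared afterwards.
  static void runOnceWithCleanup(String jobId) {
    try {
      runOnce(jobId);
    } finally {
      FILE_MAP.clear();
    }
  }

  public static void main(String[] args) {
    runOnce("job_1");
    try {
      runOnce("job_2");
    } catch (NullPointerException e) {
      System.out.println("second run failed on a stale entry");
    }
    FILE_MAP.clear();
    runOnceWithCleanup("job_1");
    runOnceWithCleanup("job_2");
    System.out.println("with cleanup, both runs completed");
  }
}
```

Clearing the map in a `finally` block (or in a test teardown method) keeps one run's leftovers from being visited by the next.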






[jira] [Commented] (MAPREDUCE-7309) Improve performance of reading resource request for mapper/reducers from config

2020-11-28 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17239978#comment-17239978
 ] 

Xiaoqiao He commented on MAPREDUCE-7309:


backport to branch-3.2.2.

> Improve performance of reading resource request for mapper/reducers from 
> config
> ---
>
> Key: MAPREDUCE-7309
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7309
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: applicationmaster
> Affects Versions: 3.0.0, 3.1.0, 3.2.0, 3.3.0
> Reporter: Wangda Tan
> Assignee: Peter Bacsko
> Priority: Major
> Fix For: 3.2.2, 3.4.0, 3.1.5, 3.3.1
>
> Attachments: MAPREDUCE-7309-003.patch, MAPREDUCE-7309-004.patch, 
> MAPREDUCE-7309-005.patch, MAPREDUCE-7309-branch-3.1-001.patch, 
> MAPREDUCE-7309-branch-3.2-001.patch, MAPREDUCE-7309-branch-3.3-001.patch, 
> MAPREDUCE-7309.001.patch, MAPREDUCE-7309.002.patch
>
>
> This issue could affect all releases that include YARN-6927. 
> Basically, we run a regex match repeatedly when we read the mapper/reducer 
> resource request from config files. With a large config file and a large 
> number of splits, this can take a long time.  
> We saw the AM take hours to parse the config with 200k+ splits and a 
> large config file (hundreds of KB). 
> The problematic part is this:
> {noformat}
>   private void populateResourceCapability(TaskType taskType) {
>     String resourceTypePrefix =
>         getResourceTypePrefix(taskType);
>     boolean memorySet = false;
>     boolean cpuVcoresSet = false;
>     if (resourceTypePrefix != null) {
>       List<ResourceInformation> resourceRequests =
>           ResourceUtils.getRequestedResourcesFromConfig(conf,
>               resourceTypePrefix);
> {noformat}
> Inside {{ResourceUtils.getRequestedResourcesFromConfig()}}, we call 
> {{Configuration.getValByRegex()}}, which goes through all property keys that 
> come from the MapReduce job configuration (jobconf.xml). If the job config is 
> large (e.g. because it is part of an MR pipeline and was populated by an 
> earlier job), this runs a regexp match over all properties again and again. 
> That is unnecessary, because all mappers share the same config and all 
> reducers share the same config.
> We should do proper caching for pre-configured resource requests.
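
The caching idea can be sketched as follows: run the regex scan once per resource-type prefix and reuse the result for every subsequent task. This is only an illustration under stated assumptions, not the committed patch. The class `ResourceRequestCache`, the plain `Map` standing in for the job configuration, and the property keys are all hypothetical, and the sketch assumes the configuration does not change after the first lookup.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Pattern;

// Hypothetical illustration of per-prefix caching; not the actual Hadoop change.
class ResourceRequestCache {

  // Counts full regex scans over the configuration, for demonstration only.
  static final AtomicInteger SCANS = new AtomicInteger();

  // One cached result per resource-type prefix (e.g. one for maps, one for reduces).
  private static final Map<String, Map<String, String>> CACHE = new ConcurrentHashMap<>();

  // The expensive step: match every property key against a regex.
  static Map<String, String> scanByRegex(Map<String, String> conf, String prefix) {
    SCANS.incrementAndGet();
    Pattern pattern = Pattern.compile(Pattern.quote(prefix) + "[^.]+");
    Map<String, String> matches = new TreeMap<>();
    for (Map.Entry<String, String> entry : conf.entrySet()) {
      if (pattern.matcher(entry.getKey()).matches()) {
        matches.put(entry.getKey(), entry.getValue());
      }
    }
    return Collections.unmodifiableMap(matches);
  }

  // Assumes conf is not modified after the first call for a given prefix.
  static Map<String, String> cachedRequests(Map<String, String> conf, String prefix) {
    return CACHE.computeIfAbsent(prefix, p -> scanByRegex(conf, p));
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("mapreduce.map.resource.memory-mb", "1024"); // illustrative keys
    conf.put("mapreduce.map.resource.vcores", "2");
    conf.put("unrelated.key", "x");
    for (int task = 0; task < 200_000; task++) {
      cachedRequests(conf, "mapreduce.map.resource.");
    }
    System.out.println("regex scans performed: " + SCANS.get());
  }
}
```

With the cache in place, the number of scans depends on the number of distinct prefixes rather than on the number of splits.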






[jira] [Commented] (MAPREDUCE-7304) Enhance the map-reduce Job end notifier to be able to notify the given URL via a custom class

2020-11-28 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17239977#comment-17239977
 ] 

Xiaoqiao He commented on MAPREDUCE-7304:


backport to branch-3.2.2.

> Enhance the map-reduce Job end notifier to be able to notify the given URL 
> via a custom class
> -
>
> Key: MAPREDUCE-7304
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7304
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Reporter: Daniel Fritsi
> Assignee: Zoltán Erdmann
> Priority: Major
> Fix For: 3.2.2, 3.4.0, 3.1.5, 3.3.1
>
> Attachments: MAPREDUCE-7304-001.patch, MAPREDUCE-7304-002.patch, 
> MAPREDUCE-7304-003.patch, MAPREDUCE-7304-004.patch, 
> MAPREDUCE-7304-branch-3.1-001.patch, MAPREDUCE-7304-branch-3.2-001.patch, 
> MAPREDUCE-7304-branch-3.3-001.patch
>
>
> Currently 
> {{org.apache.hadoop.mapreduce.v2.app.JobEndNotifier}} 
> allows only very limited configuration of how the given Job end notification 
> URL is notified. We should enhance this, but instead of adding more 
> {{mapreduce.job.end-notification.*}} properties to 
> configure the underlying HttpURLConnection, we should add a new 
> property so users can use their own notifier class.
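
The general pattern of selecting an implementation class through a configuration property might look like the sketch below. It is a generic illustration only: the interface `EndNotifier`, the property key `example.job.end-notification.custom-notifier-class`, and the two notifier classes are hypothetical names invented here, not the interface or property introduced by this issue.

```java
import java.util.Properties;

// Hypothetical illustration of a pluggable notifier; names are not from Hadoop.
class CustomNotifierDemo {

  /** Hypothetical plug-in contract: attempt one notification of the given URL. */
  interface EndNotifier {
    boolean notifyOnce(String url);
  }

  /** Stand-in for the built-in behaviour. */
  static class DefaultNotifier implements EndNotifier {
    @Override
    public boolean notifyOnce(String url) {
      System.out.println("default notification to " + url);
      return true;
    }
  }

  /** Example of a user-supplied implementation. */
  static class LoggingNotifier implements EndNotifier {
    @Override
    public boolean notifyOnce(String url) {
      System.out.println("custom notification to " + url);
      return true;
    }
  }

  // Hypothetical property key, chosen only for this example.
  static final String NOTIFIER_CLASS_KEY = "example.job.end-notification.custom-notifier-class";

  // Instantiates the configured class, or falls back to the default notifier.
  static EndNotifier create(Properties conf) {
    String className = conf.getProperty(NOTIFIER_CLASS_KEY);
    if (className == null) {
      return new DefaultNotifier();
    }
    try {
      Class<? extends EndNotifier> type =
          Class.forName(className).asSubclass(EndNotifier.class);
      return type.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException | ClassCastException e) {
      throw new IllegalStateException("Cannot create notifier " + className, e);
    }
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty(NOTIFIER_CLASS_KEY, LoggingNotifier.class.getName());
    create(conf).notifyOnce("http://example.invalid/job-end");
  }
}
```

The appeal of this design is that connection details such as timeouts, headers, or authentication live in the user's class instead of in an ever-growing set of configuration properties.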






[jira] [Assigned] (MAPREDUCE-7247) Modify HistoryServerRest.html content,change The job attempt id‘s datatype from string to int

2020-09-14 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He reassigned MAPREDUCE-7247:
--

Target Version/s: 3.3.0, 3.2.2, 2.10.1, 3.0.4  (was: 3.0.4, 3.3.0, 3.2.2, 
2.10.1)
Assignee: zhaoshengjie

Added [~zhaoshengjie] to the contributors list and assigned this issue to him.

> Modify HistoryServerRest.html content,change The job attempt id‘s datatype 
> from string to int
> -
>
> Key: MAPREDUCE-7247
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7247
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: documentation
> Affects Versions: 3.2.1
> Reporter: zhaoshengjie
> Assignee: zhaoshengjie
> Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: image-2019-10-29-14-46-17-354.png, 
> image-2019-10-29-14-46-49-929.png
>
>
> In the documentation of the Job Attempts API 
> http://history-server-http-address:port/ws/v1/history/mapreduce/jobs/\{jobid}/jobattempts
> at 
> http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/HistoryServerRest.html#Job_Attempts_API,
> change the job attempt id's datatype from string to int.
> !image-2019-10-29-14-46-17-354.png|width=508,height=126!
> !image-2019-10-29-14-46-49-929.png|width=465,height=315!






[jira] [Updated] (MAPREDUCE-7167) Extra LF ("\n") pushed directly to storage

2020-09-12 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated MAPREDUCE-7167:
---
Target Version/s: 3.2.3  (was: 3.2.2)

Updated the target version to 3.2.3 in preparation for the 3.2.2 release. 
Please let me know if this is a blocker for you. Thanks.

> Extra LF ("\n") pushed directly to storage
> --
>
> Key: MAPREDUCE-7167
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7167
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 3.2.0
> Reporter: Saurabh
> Assignee: Saurabh
> Priority: Major
> Attachments: image-2018-11-28-19-23-52-972.png, 
> image-2018-11-29-14-53-58-176.png, image-2018-11-29-14-54-28-254.png, 
> nremoved.txt, nremoved.txt, patch1128.patch, patch1128.patch, 
> patch1128trunk.patch, withn.txt, withn.txt
>
>
> JsonEncoder already adds the necessary newline after writing each object, as 
> per 
> [this|https://github.com/apache/avro/blob/39ec1a3f0addfce06869f705f7a17c03d538fe16/lang/java/avro/src/main/java/org/apache/avro/io/JsonEncoder.java#L77],
> so this patch removes the {{out.writeBytes("\n");}} call. Because the encoder 
> is buffered while out.writeBytes writes directly to the output stream, the 
> extra newline can cause JSON errors in the output, hence it must be removed.
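
The ordering problem behind this can be shown with plain JDK streams. The sketch below uses a `BufferedWriter` as a stand-in for the buffered encoder and a `ByteArrayOutputStream` as the underlying stream; it does not use Avro's `JsonEncoder`, and the class and method names are hypothetical.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of mixing buffered and direct writes to one stream.
class BufferedInterleaveDemo {

  static String writeRecords(boolean extraDirectNewline) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // Stand-in for a buffered encoder that wraps the same underlying stream.
    Writer encoder = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
    try {
      for (String record : new String[] {"{\"id\":1}", "{\"id\":2}"}) {
        encoder.write(record);
        encoder.write('\n');   // the encoder's own record separator (still buffered)
        if (extraDirectNewline) {
          out.write('\n');     // bypasses the buffer and reaches the stream first
        }
      }
      encoder.flush();         // buffered records only reach the stream here
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    System.out.println("encoder only:      " + writeRecords(false).replace("\n", "\\n"));
    System.out.println("with direct write: " + writeRecords(true).replace("\n", "\\n"));
  }
}
```

In this sketch the directly written newlines land ahead of the buffered records rather than after each one, which is the kind of misplaced output that mixing the two write paths can produce.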


