[jira] [Updated] (MAPREDUCE-7350) Replace Guava Lists usage by Hadoop's own Lists in hadoop-mapreduce-project

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7350:
--
  Component/s: common
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Replace Guava Lists usage by Hadoop's own Lists in hadoop-mapreduce-project
> ---
>
> Key: MAPREDUCE-7350
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7350
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7442) exception message is not intusive when accessing the job configuration web UI

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7442:
--
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> exception message is not intusive when accessing the job configuration web UI
> -
>
> Key: MAPREDUCE-7442
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7442
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Affects Versions: 3.4.0
>Reporter: Jiandan Yang 
>Assignee: Jiandan Yang 
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: image-2023-07-14-11-23-10-762.png
>
>
> I launched a Teragen job on a hadoop-3.3.4 cluster.
> The web UI returned an error when I clicked the job's Configuration link. The
> error page said "HTTP ERROR 500 java.lang.IllegalArgumentException: RFC6265
> Cookie values may not contain character: [ ]", and I could not find any
> solution from this error message.
> I found some additional stack traces in the AM log, and they show that
> yarn did not have permission on the staging directory. After I granted
> permission to yarn, I could access the configuration page.
> I think the problem is that the error page does not provide a useful or
> meaningful message.
> It would be better if the error page included a message such as "yarn does
> not have hdfs permission".
> The snapshot of the error page is as follows:
> !image-2023-07-14-11-23-10-762.png!
> The AM error logs are as follows:
> {code:java}
> 2023-07-14 11:20:08,218 ERROR [qtp1379757019-43] 
> org.apache.hadoop.yarn.webapp.View: Error while reading 
> hdfs://dmp/user/ubd_dmp_test/.staging/job_1689296289020_0006/job.xml
> org.apache.hadoop.security.AccessControlException: Permission denied: 
> user=yarn, access=EXECUTE, 
> inode="/user/ubd_dmp_test/.staging":ubd_dmp_test:ubd_dmp_test:drwx------
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:506)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:422)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:333)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermissionWithContext(FSPermissionChecker.java:370)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:240)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:713)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1892)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1910)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.resolvePath(FSDirectory.java:727)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:154)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2089)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:762)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:458)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:604)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:572)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:556)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1093)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1043)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:971)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2976)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at 

[jira] [Assigned] (MAPREDUCE-7441) Race condition in closing FadvisedFileRegion

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan reassigned MAPREDUCE-7441:
-

Assignee: Benjamin Teke

> Race condition in closing FadvisedFileRegion
> 
>
> Key: MAPREDUCE-7441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.4.0
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> This issue is similar to the one described in MAPREDUCE-7095, just for 
> FadvisedFileRegion.transferSuccessful. There are warning messages when 
> multiple threads are calling the transferSuccessful method:
> {code:java}
> 2023-05-25 08:41:57,288 WARN org.apache.hadoop.mapred.FadvisedFileRegion: 
> Failed to manage OS cache for 
> /hadoop/data04/yarn/nm/usercache/hive/appcache/application_1684916804740_8245/output/attempt_1684916804740_8245_1_00_001154_0_10003/file.out
> EBADF: Bad file descriptor
> at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
> at 
> org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:271)
> at 
> org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:148)
> at 
> org.apache.hadoop.mapred.FadvisedFileRegion.transferSuccessful(FadvisedFileRegion.java:163)
> at 
> org.apache.hadoop.mapred.ShuffleChannelHandler.lambda$sendMapOutput$0(ShuffleChannelHandler.java:516)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:557)
> {code}






[jira] [Assigned] (MAPREDUCE-7442) exception message is not intusive when accessing the job configuration web UI

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan reassigned MAPREDUCE-7442:
-

Assignee: Jiandan Yang 

> exception message is not intusive when accessing the job configuration web UI
> -
>
> Key: MAPREDUCE-7442
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7442
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Reporter: Jiandan Yang 
>Assignee: Jiandan Yang 
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: image-2023-07-14-11-23-10-762.png
>
>
> I launched a Teragen job on a hadoop-3.3.4 cluster.
> The web UI returned an error when I clicked the job's Configuration link. The
> error page said "HTTP ERROR 500 java.lang.IllegalArgumentException: RFC6265
> Cookie values may not contain character: [ ]", and I could not find any
> solution from this error message.
> I found some additional stack traces in the AM log, and they show that
> yarn did not have permission on the staging directory. After I granted
> permission to yarn, I could access the configuration page.
> I think the problem is that the error page does not provide a useful or
> meaningful message.
> It would be better if the error page included a message such as "yarn does
> not have hdfs permission".
> The snapshot of the error page is as follows:
> !image-2023-07-14-11-23-10-762.png!
> The AM error logs are as follows:
> {code:java}
> 2023-07-14 11:20:08,218 ERROR [qtp1379757019-43] 
> org.apache.hadoop.yarn.webapp.View: Error while reading 
> hdfs://dmp/user/ubd_dmp_test/.staging/job_1689296289020_0006/job.xml
> org.apache.hadoop.security.AccessControlException: Permission denied: 
> user=yarn, access=EXECUTE, 
> inode="/user/ubd_dmp_test/.staging":ubd_dmp_test:ubd_dmp_test:drwx------
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:506)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:422)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:333)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermissionWithContext(FSPermissionChecker.java:370)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:240)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:713)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1892)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1910)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.resolvePath(FSDirectory.java:727)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:154)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2089)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:762)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:458)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:604)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:572)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:556)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1093)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1043)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:971)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2976)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>

[jira] [Assigned] (MAPREDUCE-7375) JobSubmissionFiles don't set right permission after mkdirs

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan reassigned MAPREDUCE-7375:
-

Assignee: Zhang Dongsheng

> JobSubmissionFiles don't set right permission after mkdirs
> --
>
> Key: MAPREDUCE-7375
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7375
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 3.3.2
>Reporter: Zhang Dongsheng
>Assignee: Zhang Dongsheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5, 3.2.5
>
> Attachments: MAPREDUCE-7375.patch
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> JobSubmissionFiles provides getStagingDir to get the staging directory. If
> stagingArea is missing, the method creates a new directory with:
> {quote}fs.mkdirs(stagingArea, new FsPermission(JOB_DIR_PERMISSION));{quote}
> This appears to create the directory with JOB_DIR_PERMISSION, but the umask
> is applied to that permission. If the umask is too strict, the resulting
> permission may be 000 (if the umask is 700). So we should set the permission
> explicitly after creating the directory.
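
The umask arithmetic described above can be sketched in plain Java. This is only an illustration, not the patch: `UmaskDemo` and `applyUmask` are hypothetical names, and the requested mode 0700 is an assumption chosen to match the "000 if umask is 700" example in the report.

```java
class UmaskDemo {
    // Effective mode of a newly created directory:
    // the requested permission bits with the umask bits cleared.
    static int applyUmask(int requested, int umask) {
        return requested & ~umask & 0777;
    }

    public static void main(String[] args) {
        int requested = 0700; // assumed requested mode (owner rwx only)
        // A common umask leaves the owner bits untouched.
        System.out.println(String.format("%03o", applyUmask(requested, 0022))); // prints 700
        // A umask of 700 strips every requested bit.
        System.out.println(String.format("%03o", applyUmask(requested, 0700))); // prints 000
    }
}
```

Because the umask only affects the mode requested at creation time, setting the permission explicitly once the directory exists would sidestep the problem, which is what the report proposes.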






[jira] [Assigned] (MAPREDUCE-7372) MapReduce set permission too late in copyJar method

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan reassigned MAPREDUCE-7372:
-

Assignee: Zhang Dongsheng

> MapReduce set permission too late in copyJar method
> ---
>
> Key: MAPREDUCE-7372
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7372
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 3.3.1
>Reporter: Zhang Dongsheng
>Assignee: Zhang Dongsheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5, 3.2.5
>
> Attachments: MAPREDUCE-7372.patch
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When copyJar runs in JobResourceUploader, setPermission is called after
> setReplication, but setReplication needs the permission to be set first. So if
> we set a restrictive umask in the project, such as 0600, the MapReduce process
> will fail.
> In the patch file, I put setPermission before setReplication.






[jira] [Updated] (MAPREDUCE-7311) Fix non-idempotent test in TestTaskProgressReporter

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7311:
--
  Component/s: test
 Target Version/s: 3.2.4, 3.3.2, 3.4.0
Affects Version/s: 3.2.4
   3.3.2
   3.4.0

> Fix non-idempotent test in TestTaskProgressReporter
> ---
>
> Key: MAPREDUCE-7311
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7311
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>  Components: test
>Affects Versions: 3.4.0, 3.3.2, 3.2.4
>Reporter: Zhengxi Li
>Assignee: Zhengxi Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2, 3.2.4
>
> Attachments: MAPREDUCE-7311.001.patch, MAPREDUCE-7311.002.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The test 
> {{`org.apache.hadoop.mapred.TestTaskProgressReporter.testBytesWrittenRespectingLimit`}}
>  is not idempotent and fails if run twice in the same JVM, because it 
> pollutes state shared among tests. It may be good to clean this state 
> pollution so that some other tests do not fail in the future due to the 
> shared state polluted by this test.
> h3. Details
> Running {{`TestTaskProgressReporter.testBytesWrittenRespectingLimit`}} twice 
> would result in the second run failing with the following assertion:
> {noformat}
> Assert.assertEquals(failFast, threadExited)
> {noformat}
> The root cause for this is that when `testBytesWrittenRespectingLimit` writes 
> some bytes on the local file system, some counters are being incremented. The 
> problem is that, after the test is done, the counter is not reset. With this 
> polluted shared state, assumptions are broken, resulting in test failure in 
> the second run.
> PR link: https://github.com/apache/hadoop/pull/2500
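
The state pollution described above can be illustrated with a minimal plain-Java sketch. It is not the code from the PR: `SharedCounterDemo`, `BYTES_WRITTEN`, and `runTest` are hypothetical stand-ins for JVM-wide counters and for a test that assumes those counters start at zero.

```java
import java.util.concurrent.atomic.AtomicLong;

class SharedCounterDemo {
    // Hypothetical stand-in for a counter that outlives a single test run.
    static final AtomicLong BYTES_WRITTEN = new AtomicLong();

    // Simulated test body. Returns true if the test saw the clean state
    // it assumes, false if a previous run left the counter polluted.
    static boolean runTest(boolean resetAfter) {
        boolean startedClean = BYTES_WRITTEN.get() == 0;
        BYTES_WRITTEN.addAndGet(1024); // the test writes some bytes
        if (resetAfter) {
            BYTES_WRITTEN.set(0); // cleanup that makes the test idempotent
        }
        return startedClean;
    }
}
```

Without the reset, the second call to `runTest(false)` in the same JVM observes a non-zero counter; with the reset, every run starts from the same state.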






[jira] [Updated] (MAPREDUCE-7342) Stop RMService in TestClientRedirect.testRedirect()

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7342:
--
  Component/s: test
 Target Version/s: 3.2.4, 3.3.2, 3.4.0
Affects Version/s: 3.2.4
   3.3.2
   3.4.0

> Stop RMService in TestClientRedirect.testRedirect()
> ---
>
> Key: MAPREDUCE-7342
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7342
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>  Components: test
>Affects Versions: 3.4.0, 3.3.2, 3.2.4
>Reporter: Zhengxi Li
>Assignee: Zhengxi Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2, 3.2.4
>
> Attachments: MAPREDUCE-7342-master.001.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The test *{{org.apache.hadoop.mapred.TestClientRedirect.testRedirect}}* is 
> not idempotent and fails if run twice in the same JVM, because it pollutes 
> state shared among tests. It may be good to clean this state pollution 
> so that some other tests do not fail in the future due to the shared state 
> polluted by this test.
> h3. Detail
> Running *{{TestClientRedirect.testRedirect}}* twice would result in the 
> second run failing due to the following assertion error:
> {noformat}
> INFO  [main] service.AbstractService (AbstractService.java:noteFailure(267)) 
> - Service test failed in state STARTED
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.net.BindException: 
> Problem binding to [0.0.0.0:8054] java.net.BindException: Address already in 
> use
> {noformat}
> The root cause is that the RM server (listening on port 8054) is started in 
> the first run of this test, but hasn't been stopped when the test finishes. 
> In the second run, when the test is trying to start the RMService, it fails 
> because port 8054 is already in use, leading to the exception.
> PR link: https://github.com/apache/hadoop/pull/2968
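
The "Address already in use" failure can be reproduced with plain sockets. The sketch below is an illustration under assumptions, not the test's code: `PortReuseDemo` and `secondBindSucceeds` are hypothetical names, and an ephemeral loopback port stands in for port 8054.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

class PortReuseDemo {
    // Starts a listener, optionally stops it, then tries to start a second
    // listener on the same port. Returns true if the second bind succeeds.
    static boolean secondBindSucceeds(boolean stopFirst) {
        try {
            InetAddress loopback = InetAddress.getLoopbackAddress();
            ServerSocket first = new ServerSocket(0, 50, loopback); // 0 = pick a free port
            try {
                int port = first.getLocalPort();
                if (stopFirst) {
                    first.close(); // the cleanup step the first run skipped
                }
                try (ServerSocket second = new ServerSocket(port, 50, loopback)) {
                    return true;
                } catch (BindException e) {
                    return false; // port still held by the first listener
                }
            } finally {
                first.close(); // closing an already closed socket is a no-op
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `secondBindSucceeds(false)` mirrors the second test run described above, while `secondBindSucceeds(true)` mirrors a run in which the service was stopped first.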






[jira] [Updated] (MAPREDUCE-7446) NegativeArraySizeException when running MR jobs with large data size

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7446:
--
  Component/s: mrv1
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> NegativeArraySizeException when running MR jobs with large data size
> 
>
> Key: MAPREDUCE-7446
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7446
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: 3.4.0
>Reporter: Peter Szucs
>Assignee: Peter Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> We are using bit shifting to double the byte array in IFile's 
> [nextRawValue|https://github.infra.cloudera.com/CDH/hadoop/blob/bef14a39c7616e3b9f437a6fb24fc7a55a676b57/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/IFile.java#L437]
>  method to store the byte values in it. With a large dataset it can easily 
> happen that we shift into the leftmost (sign) bit when calculating the size 
> of the array, which yields a negative number as the array size and causes the 
> NegativeArraySizeException.
> It would be safer to expand the backing array by a factor of 1.5 and to 
> check that the new size does not exceed Integer's max value.
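
The overflow and the proposed growth rule can be sketched as follows. This is a minimal illustration, not the patch itself: `GrowthDemo`, `doubled`, and `grown` are hypothetical names, and the cap at `Integer.MAX_VALUE` follows the wording of the report (some JVMs limit array length slightly below that).

```java
class GrowthDemo {
    // Doubling by shift: once length reaches 2^30 the result wraps negative.
    static int doubled(int length) {
        return length << 1;
    }

    // Grow by about 1.5x using long arithmetic, never below the required
    // size and never above Integer.MAX_VALUE.
    static int grown(int currentLength, int required) {
        long candidate = currentLength + (long) (currentLength >> 1);
        long target = Math.max(candidate, required);
        return (int) Math.min(target, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        int big = 1 << 30;
        System.out.println(doubled(big) < 0);           // prints true
        System.out.println(grown(big, 0));              // prints 1610612736
        System.out.println(grown(Integer.MAX_VALUE, 0)); // prints 2147483647
    }
}
```

Doing the arithmetic in `long` before clamping is what keeps the intermediate value from wrapping around.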






[jira] [Updated] (MAPREDUCE-7434) Fix ShuffleHandler tests

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7434:
--
 Component/s: tets
Hadoop Flags: Reviewed
Target Version/s: 3.4.0
 Description: 
https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1143/testReport/junit/org.apache.hadoop.mapred/TestShuffleHandler/testMapFileAccess/

{code}
Error Message
Server returned HTTP response code: 500 for URL: 
http://127.0.0.1:13562/mapOutput?job=job_1_0001=0=attempt_1_0001_m_01_0
Stacktrace
java.io.IOException: Server returned HTTP response code: 500 for URL: 
http://127.0.0.1:13562/mapOutput?job=job_1_0001=0=attempt_1_0001_m_01_0
at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1902)
at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1500)
at 
org.apache.hadoop.mapred.TestShuffleHandler.testMapFileAccess(TestShuffleHandler.java:292)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:750)
Standard Output
12:04:17.466 [Time-limited test] DEBUG o.a.h.m.lib.MutableMetricsFactory - 
field org.apache.hadoop.metrics2.lib.MutableGaugeInt 
org.apache.hadoop.mapred.ShuffleHandler$ShuffleMetrics.shuffleConnections with 
annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, 
sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[# of 
current shuffle connections])
12:04:17.466 [Time-limited test] DEBUG o.a.h.m.lib.MutableMetricsFactory - 
field org.apache.hadoop.metrics2.lib.MutableCounterLong 
org.apache.hadoop.mapred.ShuffleHandler$ShuffleMetrics.shuffleOutputBytes with 
annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, 
sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, 
value=[Shuffle output in bytes])
12:04:17.466 [Time-limited test] DEBUG o.a.h.m.lib.MutableMetricsFactory - 
field org.apache.hadoop.metrics2.lib.MutableCounterInt 
org.apache.hadoop.mapred.ShuffleHandler$ShuffleMetrics.shuffleOutputsFailed 
with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, 
sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[# of 
failed shuffle outputs])
12:04:17.466 [Time-limited test] DEBUG o.a.h.m.lib.MutableMetricsFactory - 
field org.apache.hadoop.metrics2.lib.MutableCounterInt 
org.apache.hadoop.mapred.ShuffleHandler$ShuffleMetrics.shuffleOutputsOK with 
annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, 
sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[# of 
succeeeded shuffle outputs])
12:04:17.466 [Time-limited test] DEBUG o.a.h.m.impl.MetricsSystemImpl - 
ShuffleMetrics, Shuffle output metrics
12:04:17.467 [Time-limited test] DEBUG o.a.hadoop.service.AbstractService - 
Service: mapreduce_shuffle entered state INITED
12:04:17.477 [Time-limited test] DEBUG o.a.hadoop.service.AbstractService - 
Config has been overridden during init
12:04:17.478 [Time-limited test] INFO  org.apache.hadoop.mapred.IndexCache - 
IndexCache created with max memory = 10485760
12:04:17.479 [Time-limited test] DEBUG o.a.h.m.lib.MutableMetricsFactory - 
field org.apache.hadoop.metrics2.lib.MutableGaugeInt 
org.apache.hadoop.mapred.ShuffleHandler$ShuffleMetrics.shuffleConnections with 
annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, 
sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[# of 
current shuffle connections])
12:04:17.479 [Time-limited test] DEBUG o.a.h.m.lib.MutableMetricsFactory - 
field org.apache.hadoop.metrics2.lib.MutableCounterLong 
org.apache.hadoop.mapred.ShuffleHandler$ShuffleMetrics.shuffleOutputBytes with 
annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, 
sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, 
value=[Shuffle output in 

[jira] [Updated] (MAPREDUCE-7426) Fix typo in class StartEndTImesBase

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7426:
--
 Component/s: mrv2
Target Version/s: 3.4.0

> Fix typo in class StartEndTImesBase
> ---
>
> Key: MAPREDUCE-7426
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7426
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 3.3.4
>Reporter: Samrat Deb
>Assignee: Samrat Deb
>Priority: Trivial
>  Labels: newbie, pull-request-available
> Fix For: 3.4.0
>
>
> While going through the code, I found a typo in a variable name:
> - +slowTaskRelativeTresholds+ is misspelled and should be 
> +slowTaskRelativeThresholds+






[jira] [Updated] (MAPREDUCE-7369) MapReduce tasks timing out when spends more time on MultipleOutputs#close

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7369:
--
 Component/s: mrv1
  mrv2
Target Version/s: 3.3.5, 3.4.0

> MapReduce tasks timing out when spends more time on MultipleOutputs#close
> -
>
> Key: MAPREDUCE-7369
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7369
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 3.3.1
>Reporter: Prabhu Joseph
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
> Attachments: MAPREDUCE-7369.001.patch
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> MapReduce tasks time out when they spend a long time in MultipleOutputs#close. 
> MultipleOutputs#close takes longer when there are multiple files to be 
> closed and there is high latency in closing a stream.
> {code}
> 2021-11-01 02:45:08,312 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics 
> report from attempt_1634949471086_61268_m_001115_0: 
> AttemptID:attempt_1634949471086_61268_m_001115_0 Timed out after 300 secs
> {code}
> The MapReduce task timeout can be increased, but it is hard to choose the 
> right timeout value. The timeout can be disabled by setting it to 0, but that 
> might lead to hanging tasks not getting killed.
> The tasks send a ping every 3 seconds, but the pings are not honored by the 
> ApplicationMaster. It expects status information, which is not sent 
> during MultipleOutputs#close. This jira is to add a config that treats 
> the ping from a task as part of the task liveliness check in the 
> ApplicationMaster.
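
The liveliness check proposed above can be sketched in plain Java. Everything here is a hypothetical illustration rather than the ApplicationMaster's code: `LivenessDemo`, `timedOut`, and the `countPings` flag are invented names standing in for the new config.

```java
class LivenessDemo {
    // Returns true if the attempt should be treated as timed out.
    // countPings models the proposed config: when true, a ping counts
    // as a sign of life just like a status update.
    static boolean timedOut(long nowMs, long lastStatusMs, long lastPingMs,
                            long timeoutMs, boolean countPings) {
        if (timeoutMs <= 0) {
            return false; // a timeout of 0 disables the check
        }
        long lastSeen = countPings ? Math.max(lastStatusMs, lastPingMs) : lastStatusMs;
        return nowMs - lastSeen > timeoutMs;
    }
}
```

For example, with a 300-second timeout, a task whose last status update is older than the timeout but whose last ping arrived a second ago is reported as timed out when `countPings` is false and as alive when it is true.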






[jira] [Updated] (MAPREDUCE-7377) Remove unused imports in MapReduce project

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7377:
--
  Component/s: build
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Remove unused imports in MapReduce project
> --
>
> Key: MAPREDUCE-7377
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7377
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.4.0
>Reporter: Ashutosh Gupta
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> h3. Optimize Imports to keep code clean
>  # Remove any unused imports






[jira] [Updated] (MAPREDUCE-7376) AggregateWordCount fetches wrong results

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7376:
--
  Component/s: aggregate
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> AggregateWordCount fetches wrong results
> 
>
> Key: MAPREDUCE-7376
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7376
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: aggregate
>Affects Versions: 3.4.0
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Rather than counting the words, AggregateWordCount gives a single-line 
> output counting the number of rows.
> Wrong Result Looks Like:
> {noformat}
> hadoop-3.4.0-SNAPSHOT % bin/hdfs dfs -cat /testOut1/part-r-0
> record_count 2
> {noformat}
> Correct Should Look Like:
> {noformat}
> hadoop-3.4.0-SNAPSHOT % bin/hdfs dfs -cat /testOut1/part-r-0  
>  
> Bye   1
> Goodbye   1
> Hadoop    2
> Hello 2
> World 2
> {noformat}






[jira] [Updated] (MAPREDUCE-7368) DBOutputFormat.DBRecordWriter#write must throw exception when it fails

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7368:
--
 Component/s: mrv2
Hadoop Flags: Reviewed
Target Version/s: 3.3.1, 3.4.0

> DBOutputFormat.DBRecordWriter#write must throw exception when it fails
> --
>
> Key: MAPREDUCE-7368
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7368
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 3.3.1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When the 
> [DBRecordWriter#write|https://github.com/apache/hadoop/blob/91af256a5b44925e5dfdf333293251a19685ba2a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/db/DBOutputFormat.java#L120]
>  fails with an {{SQLException}}, the problem is not propagated but printed to 
> {{System.err}} instead. 
> {code:java}
> public void write(K key, V value) throws IOException {
>   try {
>     key.write(statement);
>     statement.addBatch();
>   } catch (SQLException e) {
>     e.printStackTrace();
>   }
> }
> {code}
> The consumer of this API has no way to tell that the write failed. Moreover, 
> the exception is not present in the logs, which makes the problem very hard 
> to debug and can easily lead to data corruption, since clients may assume 
> that everything went well.
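
One way to propagate the failure is to wrap the {{SQLException}} in an {{IOException}}, which the method signature already declares. The following is a minimal sketch under that assumption, not necessarily the committed patch. The SqlAction interface is a hypothetical stand-in for the JDBC calls so that the example runs without Hadoop or a database.

```java
import java.io.IOException;
import java.sql.SQLException;

public class WriteFailureSketch {
  /** Hypothetical stand-in for the JDBC work done inside write(). */
  interface SqlAction {
    void run() throws SQLException;
  }

  /** Runs the action and rethrows any SQLException as an IOException. */
  static void write(SqlAction action) throws IOException {
    try {
      action.run();
    } catch (SQLException e) {
      // Keep the original exception as the cause so it shows up in the logs.
      throw new IOException("Failed to add record to batch", e);
    }
  }
}
```

With this shape the caller sees the failure and the original SQLException remains available through getCause().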






[jira] [Updated] (MAPREDUCE-7353) Mapreduce job fails when NM is stopped

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7353:
--
  Component/s: task
 Hadoop Flags: Reviewed
 Target Version/s: 3.3.2, 3.2.3, 2.10.2, 3.4.0
Affects Version/s: 3.3.2
   3.2.3
   2.10.2
   3.4.0

> Mapreduce job fails when NM is stopped
> --
>
> Key: MAPREDUCE-7353
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7353
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: task
>Affects Versions: 3.4.0, 2.10.2, 3.2.3, 3.3.2
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0, 2.10.2, 3.2.3, 3.3.2
>
> Attachments: MAPREDUCE-7353.001.patch, MAPREDUCE-7353.002.patch
>
>
> The job fails because tasks fail due to too many fetch failures: 
> {code:java}
> Line 48048: 2021-06-02 16:25:02,002 | INFO  | ContainerLauncher #6 | 
> Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container 
> container_e03_1622107691213_1054_01_05 taskAttempt 
> attempt_1622107691213_1054_m_00_0 | ContainerLauncherImpl.java:394
>   Line 48053: 2021-06-02 16:25:02,002 | INFO  | ContainerLauncher #6 | 
> KILLING attempt_1622107691213_1054_m_00_0 | ContainerLauncherImpl.java:209
>   Line 58026: 2021-06-02 16:26:34,034 | INFO  | AsyncDispatcher event 
> handler | TaskAttempt killed because it ran on unusable node 
> node-group-1ZYEq0002:26009. AttemptId:attempt_1622107691213_1054_m_00_0 | 
> JobImpl.java:1401
>   Line 58030: 2021-06-02 16:26:34,034 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_00_0 of type TA_KILL | 
> TaskAttemptImpl.java:1390
>   Line 58035: 2021-06-02 16:26:34,034 | INFO  | RMCommunicator Allocator 
> | Killing taskAttempt:attempt_1622107691213_1054_m_00_0 because it is 
> running on unusable node:node-group-1ZYEq0002:26009 | 
> RMContainerAllocator.java:1066
>   Line 58043: 2021-06-02 16:26:34,034 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_00_0 of type TA_KILL | 
> TaskAttemptImpl.java:1390
>   Line 58054: 2021-06-02 16:26:34,034 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_00_0 of type 
> TA_DIAGNOSTICS_UPDATE | TaskAttemptImpl.java:1390
>   Line 58055: 2021-06-02 16:26:34,034 | INFO  | AsyncDispatcher event 
> handler | Diagnostics report from attempt_1622107691213_1054_m_00_0: 
> Container released on a *lost* node | TaskAttemptImpl.java:2649
>   Line 58057: 2021-06-02 16:26:34,034 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_00_0 of type TA_KILL | 
> TaskAttemptImpl.java:1390
>   Line 60317: 2021-06-02 16:26:57,057 | INFO  | AsyncDispatcher event 
> handler | Too many fetch-failures for output of task attempt: 
> attempt_1622107691213_1054_m_00_0 ... raising fetch failure to map | 
> JobImpl.java:2005
>   Line 60319: 2021-06-02 16:26:57,057 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_00_0 of type 
> TA_TOO_MANY_FETCH_FAILURE | TaskAttemptImpl.java:1390
>   Line 60320: 2021-06-02 16:26:57,057 | INFO  | AsyncDispatcher event 
> handler | attempt_1622107691213_1054_m_00_0 transitioned from state 
> SUCCESS_CONTAINER_CLEANUP to FAILED, event type is TA_TOO_MANY_FETCH_FAILURE 
> and nodeId=node-group-1ZYEq0002:26009 | TaskAttemptImpl.java:1411
>   Line 69487: 2021-06-02 16:30:02,002 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_00_0 of type 
> TA_DIAGNOSTICS_UPDATE | TaskAttemptImpl.java:1390
>   Line 69527: 2021-06-02 16:30:02,002 | INFO  | AsyncDispatcher event 
> handler | Diagnostics report from attempt_1622107691213_1054_m_00_0: 
> cleanup failed for container container_e03_1622107691213_1054_01_05 : 
> java.net.ConnectException: Call From node-group-1ZYEq0001/192.168.0.66 to 
> node-group-1ZYEq0002:26009 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
>   Line 69607: 2021-06-02 16:30:02,002 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_00_0 of type 
> TA_CONTAINER_CLEANED | TaskAttemptImpl.java:1390
>   Line 69609: 2021-06-02 16:30:02,002 | DEBUG | AsyncDispatcher event 
> handler | Processing attempt_1622107691213_1054_m_00_0 of type 
> TA_CONTAINER_CLEANED | TaskAttemptImpl.java:1390
>   Line 73645: 2021-06-02 16:23:56,056 | DEBUG | fetcher#9 | Fetcher 9 
> going to fetch from node-group-1ZYEq0002:26008 for: 
> [attempt_1622107691213_1054_m_00_0] | Fetcher.java:318
> 

[jira] [Updated] (MAPREDUCE-7320) ClusterMapReduceTestCase does not clean directories

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7320:
--
  Component/s: test
 Hadoop Flags: Reviewed
 Target Version/s: 3.2.3, 2.10.2, 3.3.1, 3.4.0
Affects Version/s: 3.2.3
   2.10.2
   3.3.1
   3.4.0

> ClusterMapReduceTestCase does not clean directories
> ---
>
> Key: MAPREDUCE-7320
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7320
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.4.0, 3.3.1, 2.10.2, 3.2.3
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.1, 2.10.2, 3.2.3
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Running JUnit tests that extend {{ClusterMapReduceTestCase}} generates many 
> directories without cleaning them up.
> For example:
> {code:bash}
> mvn test -Dtest=TestMRJobClient{code}
> generates the following directories:
> {code:bash}
> - target
>-+ ConfigurableMiniMRCluster_315090884
>-+ ConfigurableMiniMRCluster_1335188990
>-+ ConfigurableMiniMRCluster_1973037511
>-+ test-dir
> -+ dfs
> -+ hadopp-XYZ-01
> -+ hadopp-XYZ-02 
> -+ hadopp-XYZ-03
> {code}
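
A cleanup step of the kind the issue asks for could look like the sketch below, which deletes a directory tree after a test finishes. It uses only java.nio and is an illustration; the class and method names are hypothetical and it is not the change made in ClusterMapReduceTestCase.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class TestDirCleanupSketch {
  /** Recursively deletes dir and everything under it; no-op if it does not exist. */
  static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    try (Stream<Path> paths = Files.walk(dir)) {
      // Reverse order visits children before their parent directory.
      paths.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }
  }
}
```

A test base class could call such a helper from its tear-down method for each directory it created under target.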






[jira] [Updated] (MAPREDUCE-7051) Fix typo in MultipleOutputFormat

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7051:
--
  Component/s: mrv1
 Target Version/s: 3.3.1, 3.2.2, 3.4.0
Affects Version/s: 3.3.1
   3.2.2
   3.4.0

> Fix typo in MultipleOutputFormat
> 
>
> Key: MAPREDUCE-7051
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7051
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1
>Affects Versions: 3.2.2, 3.4.0, 3.3.1
>Reporter: ywheel
>Assignee: ywheel
>Priority: Trivial
>  Labels: newbie
> Fix For: 3.2.2, 3.4.0, 3.3.1
>
> Attachments: MAPREDUCE-7051.patch
>
>
> In org.apache.hadoop.mapred.lib.MultipleOutputFormat, there is a typo in the 
> Javadoc of the getInputFileBasedOutputFileName method. 
> "the outfile name based on a given anme and the input file name" should be 
> "the outfile name based on a given name and the input file name"






[jira] [Updated] (MAPREDUCE-7285) Junit class missing from hadoop-mapreduce-client-jobclient-*-tests jar

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7285:
--
  Component/s: test
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Junit class missing from hadoop-mapreduce-client-jobclient-*-tests jar
> --
>
> Key: MAPREDUCE-7285
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7285
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.4.0
>Reporter: Eric Badger
>Assignee: Masatake Iwasaki
>Priority: Major
> Fix For: 3.4.0
>
>
> {noformat}
> [ebadger@foo bin]$ $HADOOP_HOME/bin/hadoop jar 
> $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar
>  sleep -Dyarn.app.mapreduce.am.env="HADOOP_MAPRED_HOME=$HADOOP_HOME" 
> -Dmapreduce.admin.user.env="HADOOP_MAPRED_HOME=$HADOOP_HOME" -mt 1 -rt 1 -m 1 
> -r 1
> WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of 
> HADOOP_PREFIX.
> java.lang.NoClassDefFoundError: junit/framework/TestCase
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.test.MapredTestDriver.<init>(MapredTestDriver.java:109)
>   at 
> org.apache.hadoop.test.MapredTestDriver.<init>(MapredTestDriver.java:61)
>   at 
> org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:147)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.ClassNotFoundException: junit.framework.TestCase
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 21 more
> {noformat}
> The sleep job continues to run after the error and completes successfully, but 
> the error shouldn't be there. Something must have removed a jar or added an 
> unfulfilled dependency on junit.






[jira] [Updated] (MAPREDUCE-7281) Fix NoClassDefFoundError on 'mapred minicluster'

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7281:
--
  Component/s: scripts
 Target Version/s: 3.3.1, 3.4.0
Affects Version/s: 3.3.1
   3.4.0

> Fix NoClassDefFoundError on 'mapred minicluster'
> 
>
> Key: MAPREDUCE-7281
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7281
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 3.4.0, 3.3.1
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
> Fix For: 3.4.0, 3.3.1
>
>
> {noformat}
> $ bin/mapred minicluster
> 2020-06-17 12:01:29,133 INFO mapreduce.MiniHadoopClusterManager: Updated 0 
> configuration settings from command line.
> Exception in thread "main" java.lang.NoClassDefFoundError: org/junit/Assert
>   at 
> org.apache.hadoop.test.GenericTestUtils.assertExists(GenericTestUtils.java:298)
>   at 
> org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:242)
>   at 
> org.apache.hadoop.test.GenericTestUtils.getTestDir(GenericTestUtils.java:251)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.getBaseDirectory(MiniDFSCluster.java:2982)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster$Builder.<init>(MiniDFSCluster.java:224)
>   at 
> org.apache.hadoop.mapreduce.MiniHadoopClusterManager.start(MiniHadoopClusterManager.java:157)
>   at 
> org.apache.hadoop.mapreduce.MiniHadoopClusterManager.run(MiniHadoopClusterManager.java:132)
>   at 
> org.apache.hadoop.mapreduce.MiniHadoopClusterManager.main(MiniHadoopClusterManager.java:320)
> Caused by: java.lang.ClassNotFoundException: org.junit.Assert
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
>   ... 8 more
> {noformat}






[jira] [Updated] (MAPREDUCE-6826) Job fails with InvalidStateTransitonException: Invalid event: JOB_TASK_COMPLETED at SUCCEEDED/COMMITTING

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-6826:
--
 Component/s: mrv2
Hadoop Flags: Reviewed
Target Version/s: 3.3.1, 3.4.0

> Job fails with InvalidStateTransitonException: Invalid event: 
> JOB_TASK_COMPLETED at SUCCEEDED/COMMITTING
> 
>
> Key: MAPREDUCE-6826
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6826
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.7.2
>Reporter: Varun Saxena
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0, 3.3.1
>
> Attachments: MAPREDUCE-6826-001.patch, MAPREDUCE-6826-002.patch, 
> MAPREDUCE-6826-003.patch
>
>
> This happens if a container is preempted by the scheduler after the job starts 
> committing.
> This exception in turn leads to the application being marked as FAILED in 
> YARN.
> I think we can probably ignore the JOB_TASK_COMPLETED event while the JobImpl 
> state is COMMITTING or SUCCEEDED, as the job is in the process of finishing.
> Also, is there any point in attempting to schedule another task attempt if 
> the job is already in the COMMITTING or SUCCEEDED state?
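
The proposal to ignore the late event can be sketched with a small hand-written state machine. This is an illustration only: it does not use Hadoop's state machine classes, the JOB_COMMIT_COMPLETED event name and the single-task simplification are assumptions, and the class name is hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

public class JobStateSketch {
  enum JobState { RUNNING, COMMITTING, SUCCEEDED }
  enum JobEvent { JOB_TASK_COMPLETED, JOB_COMMIT_COMPLETED }

  // States in which a late JOB_TASK_COMPLETED is ignored instead of rejected.
  private static final Set<JobState> FINISHING =
      EnumSet.of(JobState.COMMITTING, JobState.SUCCEEDED);

  /** Returns the next state, or throws if the event is invalid for the state. */
  static JobState handle(JobState state, JobEvent event) {
    if (event == JobEvent.JOB_TASK_COMPLETED) {
      if (FINISHING.contains(state)) {
        return state; // job is already finishing: ignore the late event
      }
      // Simplification: assume this was the last outstanding task.
      return JobState.COMMITTING;
    }
    if (event == JobEvent.JOB_COMMIT_COMPLETED && state == JobState.COMMITTING) {
      return JobState.SUCCEEDED;
    }
    throw new IllegalStateException("Invalid event: " + event + " at " + state);
  }
}
```

Without the FINISHING check, the second JOB_TASK_COMPLETED would reach the final throw, which corresponds to the invalid-transition failure described above.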
> {noformat}
> 2016-12-23 09:10:38,642 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: 
> task_1482404625971_23910_m_04 Task Transitioned from RUNNING to SUCCEEDED
> 2016-12-23 09:10:38,642 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 5
> 2016-12-23 09:10:38,643 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: 
> job_1482404625971_23910Job Transitioned from RUNNING to COMMITTING
> 2016-12-23 09:10:38,644 INFO [ContainerLauncher #5] 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing 
> the event EventType: CONTAINER_REMOTE_CLEANUP for container 
> container_e55_1482404625971_23910_01_10 taskAttempt 
> attempt_1482404625971_23910_m_04_1
> 2016-12-23 09:10:38,644 INFO [ContainerLauncher #5] 
> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING 
> attempt_1482404625971_23910_m_04_1
> 2016-12-23 09:10:38,644 INFO [ContainerLauncher #5] 
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: 
> Opening proxy : linux-19:26009
> 2016-12-23 09:10:38,644 INFO [CommitterEvent Processor #4] 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing 
> the event EventType: JOB_COMMIT
> 2016-12-23 09:10:38,724 INFO [IPC Server handler 0 on 27113] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID : 
> jvm_1482404625971_23910_m_60473139527690 asked for a task
> 2016-12-23 09:10:38,724 INFO [IPC Server handler 0 on 27113] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID: 
> jvm_1482404625971_23910_m_60473139527690 is invalid and will be killed.
> 2016-12-23 09:10:38,797 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Calling handler for 
> JobFinishedEvent 
> 2016-12-23 09:10:38,797 INFO [AsyncDispatcher event handler] 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: 
> job_1482404625971_23910Job Transitioned from COMMITTING to SUCCEEDED
> 2016-12-23 09:10:38,798 INFO [Thread-93] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Job finished cleanly, 
> recording last MRAppMaster retry
> 2016-12-23 09:10:38,798 INFO [Thread-93] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify RMCommunicator 
> isAMLastRetry: true
> 2016-12-23 09:10:38,798 INFO [Thread-93] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: RMCommunicator notified 
> that shouldUnregistered is: true
> 2016-12-23 09:10:38,799 INFO [Thread-93] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Notify JHEH isAMLastRetry: 
> true
> 2016-12-23 09:10:38,799 INFO [Thread-93] 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: 
> JobHistoryEventHandler notified that forceJobCompletion is true
> 2016-12-23 09:10:38,799 INFO [Thread-93] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Calling stop for all the 
> services
> 2016-12-23 09:10:38,800 INFO [Thread-93] 
> org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopping 
> JobHistoryEventHandler. Size of the outstanding queue size is 1
> 2016-12-23 09:10:38,989 INFO [RMCommunicator Allocator] 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before 
> Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 
> AssignedReds:0 CompletedMaps:5 CompletedReds:0 ContAlloc:8 ContRel:0 
> HostLocal:0 RackLocal:0
> 2016-12-23 09:10:38,993 INFO [RMCommunicator Allocator] 
> 

[jira] [Updated] (MAPREDUCE-7272) TaskAttemptListenerImpl excessive log messages

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7272:
--
  Component/s: test
 Hadoop Flags: Reviewed
 Target Version/s: 2.10.1, 3.2.2, 3.1.4, 3.3.0, 3.4.0
Affects Version/s: 2.10.1
   3.2.2
   3.1.4
   3.3.0
   3.4.0

> TaskAttemptListenerImpl excessive log messages
> --
>
> Key: MAPREDUCE-7272
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7272
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.3.0, 3.1.4, 3.2.2, 2.10.1, 3.4.0
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Fix For: 2.8.6, 3.3.0, 2.9.3, 3.1.4, 3.2.2, 2.10.1, 3.4.0
>
> Attachments: MAPREDUCE-7272-branch-2.10.001.patch, 
> MAPREDUCE-7272-branch-2.10.002.patch, MAPREDUCE-7272-branch-2.10.003.patch, 
> MAPREDUCE-7272-branch-2.10.004.patch, MAPREDUCE-7272.001.patch, 
> MAPREDUCE-7272.002.patch, MAPREDUCE-7272.003.patch, MAPREDUCE-7272.004.patch
>
>
> {{TaskAttemptListenerImpl.statusUpdate()}} bloats the log files. 
> On every call, the listener uses {{LOG.info()}} to print out the progress of 
> the {{TaskAttempt}}.
> {code:java}
> taskAttemptStatus.progress = taskStatus.getProgress();
> LOG.info("Progress of TaskAttempt " + taskAttemptID + " is : "
> + taskStatus.getProgress());
> {code}
>  
> {code:bash}
> 2020-04-07 10:20:50,708 INFO [IPC Server handler 17 on 43926] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
> attempt_1586003420099_716645_m_007783_0 is : 0.40713295
> 2020-04-07 10:20:50,717 INFO [IPC Server handler 7 on 43926] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
> attempt_1586003420099_716645_m_020681_0 is : 0.55573714
> 2020-04-07 10:20:50,717 INFO [IPC Server handler 26 on 43926] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
> attempt_1586003420099_716645_m_024371_0 is : 0.54190344
> 2020-04-07 10:20:50,738 INFO [IPC Server handler 15 on 43926] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
> attempt_1586003420099_716645_m_033182_0 is : 0.50264555
> 2020-04-07 10:20:50,748 INFO [IPC Server handler 3 on 43926] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
> attempt_1586003420099_716645_m_022375_0 is : 0.5495565
> {code}
> After discussing this issue with [~nroberts], [~ebadger], and [~epayne], we 
> concluded that while it is helpful to log task progress, logging it on 
> every update is excessive.
>  This Jira suppresses the excessive logging from TaskAttemptListener 
> without affecting the frequency of progress updates. 
>  There are two flags:
>  * {{-Dmapreduce.task.log.progress.delta.threshold=0.10}}: the task 
> progress will be logged every 10% of delta progress. Default is 5%.
>  * {{-Dmapreduce.task.log.progress.wait.interval-seconds=120}}: the 
> listener will log the progress every 2 minutes. This is helpful for 
> long-running tasks that take a long time to reach the delta threshold. 
> Default is 1 minute.
> The listener will log as soon as either {{delta.threshold}} or 
> {{wait.interval-seconds}} is reached, whichever comes first. 
>  Enabling {{LOG.DEBUG}} for {{TaskAttemptListenerImpl}} will override 
> those two flags and log the task progress on every update.
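
The "whichever is reached first" rule can be sketched as a small throttle. This is an illustration in plain Java, not the code in TaskAttemptListenerImpl: the class and method names are hypothetical, and logging the very first update is an assumption.

```java
public class ProgressLogThrottle {
  private final float deltaThreshold; // e.g. 0.05f for 5% of progress
  private final long waitIntervalMs;  // e.g. 60_000L for one minute
  private float lastLoggedProgress;
  private long lastLoggedTimeMs;
  private boolean loggedOnce;

  ProgressLogThrottle(float deltaThreshold, long waitIntervalMs) {
    this.deltaThreshold = deltaThreshold;
    this.waitIntervalMs = waitIntervalMs;
  }

  /** Returns true if this update should be logged, and records it if so. */
  synchronized boolean shouldLog(float progress, long nowMs) {
    boolean due = !loggedOnce
        || progress - lastLoggedProgress >= deltaThreshold
        || nowMs - lastLoggedTimeMs >= waitIntervalMs;
    if (due) {
      loggedOnce = true;
      lastLoggedProgress = progress;
      lastLoggedTimeMs = nowMs;
    }
    return due;
  }
}
```

A caller would guard its LOG.info() call with shouldLog(progress, System.currentTimeMillis()) while still forwarding every progress update.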






[jira] [Updated] (MAPREDUCE-7468) Change add-opens flag's default value from true to false

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7468:
--
 Component/s: mrv2
Target Version/s: 3.4.0, 3.3.9, 3.3.7

> Change add-opens flag's default value from true to false
> 
>
> Key: MAPREDUCE-7468
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7468
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 3.4.0, 3.3.7
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>
> To avoid issues when a newer JobClient is used with Hadoop versions 
> without MAPREDUCE-7449, the default value of 
> mapreduce.jvm.add-opens-as-default should be false. Currently it is true, which 
> can cause failures if a newer JobClient is used to submit apps, as the 
> placeholder replacement won't happen during a container launch, resulting in 
> a failed submission.






[jira] [Updated] (MAPREDUCE-5608) Replace and deprecate mapred.tasktracker.indexcache.mb

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-5608:
--
 Component/s: mapreduce-client
Hadoop Flags: Reviewed
Target Version/s: 3.4.0

> Replace and deprecate mapred.tasktracker.indexcache.mb
> --
>
> Key: MAPREDUCE-5608
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5608
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mapreduce-client
>Affects Versions: 2.2.0
>Reporter: Sandy Ryza
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: BB2015-05-TBR, configuration, newbie, 
> pull-request-available
> Fix For: 3.4.0
>
> Attachments: MAPREDUCE-5608-002.patch, MAPREDUCE-5608.003.patch, 
> MAPREDUCE-5608.patch
>
>
> In MR2 mapred.tasktracker.indexcache.mb still works for configuring the size 
> of the shuffle service index cache.  As the tasktracker no longer exists, we 
> should replace this with something like mapreduce.shuffle.indexcache.mb. 
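
The usual pattern for such a rename is to read the new key first and fall back to the deprecated one. Below is a minimal sketch of that lookup using a plain Map instead of Hadoop's Configuration class. The new key name is only the one suggested above, and the class name, method name and default value are hypothetical.

```java
import java.util.Map;

public class DeprecatedKeySketch {
  static final String OLD_KEY = "mapred.tasktracker.indexcache.mb";
  // Hypothetical replacement; the issue only suggests "something like" this name.
  static final String NEW_KEY = "mapreduce.shuffle.indexcache.mb";

  /** Reads the new key first and falls back to the deprecated one. */
  static int indexCacheMb(Map<String, String> conf, int defaultMb) {
    String value = conf.get(NEW_KEY);
    if (value == null) {
      value = conf.get(OLD_KEY);
      if (value != null) {
        System.err.println(OLD_KEY + " is deprecated; use " + NEW_KEY);
      }
    }
    return value == null ? defaultMb : Integer.parseInt(value.trim());
  }
}
```

Existing configurations that still set the old key keep working, while a warning points users at the new name.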






[jira] [Updated] (MAPREDUCE-7390) Remove WhiteBox in mapreduce module.

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7390:
--
  Component/s: mrv2
 Hadoop Flags: Reviewed
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> Remove WhiteBox in mapreduce module.
> 
>
> Key: MAPREDUCE-7390
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7390
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> WhiteBox is deprecated; remove its usage from hadoop-mapreduce.






[jira] [Updated] (MAPREDUCE-7411) Use secure XML parser utils in MapReduce

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7411:
--
  Component/s: mrv1
   mrv2
 Hadoop Flags: Reviewed
 Target Version/s: 3.3.5, 3.4.0
Affects Version/s: 3.3.5
   3.4.0

> Use secure XML parser utils in MapReduce
> 
>
> Key: MAPREDUCE-7411
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7411
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1, mrv2
>Affects Versions: 3.4.0, 3.3.5
>Reporter: PJ Fanning
>Assignee: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> Uptake of HADOOP-18469
> If anyone is landing on this page following any security scanner alert, know 
> that there is no known issue here, just a centralisation of all construction 
> of XML parsers with lockdown of all the features.






[jira] [Updated] (MAPREDUCE-7370) Parallelize MultipleOutputs#close call

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7370:
--
 Component/s: mapreduce-client
Target Version/s: 3.3.6, 3.4.0

> Parallelize MultipleOutputs#close call
> --
>
> Key: MAPREDUCE-7370
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7370
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mapreduce-client
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> This call takes a long time when there are a lot of files to close and each 
> close has high latency. Parallelize the MultipleOutputs#close call to improve 
> the speed.
> {code}
>   public void close() throws IOException {
>     for (RecordWriter writer : recordWriters.values()) {
>       writer.close(null);
>     }
>   }
> {code}
> Idea is from [~ste...@apache.org]
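
A parallel close could be sketched as follows, using java.io.Closeable objects in place of Hadoop RecordWriters so the example is self-contained. The thread-pool approach, the class and method names, and the choice to rethrow only the first failure are assumptions, not the committed implementation.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCloseSketch {
  /** Closes all writers on a fixed-size pool and rethrows the first failure. */
  static void closeAll(Collection<? extends Closeable> writers, int threads)
      throws IOException {
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, threads));
    try {
      List<Future<?>> pending = new ArrayList<>();
      for (Closeable writer : writers) {
        // Returning null makes this a Callable, so close() may throw IOException.
        pending.add(pool.submit(() -> {
          writer.close();
          return null;
        }));
      }
      IOException first = null;
      for (Future<?> f : pending) {
        try {
          f.get(); // wait for every close, even after a failure
        } catch (ExecutionException e) {
          if (first == null) {
            first = new IOException("close failed", e.getCause());
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new IOException("interrupted while closing", e);
        }
      }
      if (first != null) {
        throw first;
      }
    } finally {
      pool.shutdown();
    }
  }
}
```

Waiting on every future before rethrowing means one failing writer does not leave the others unclosed.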






[jira] [Updated] (MAPREDUCE-7385) impove JobEndNotifier#httpNotification With recommended methods

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7385:
--
  Component/s: mrv1
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> impove JobEndNotifier#httpNotification With recommended methods
> ---
>
> Key: MAPREDUCE-7385
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7385
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv1
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> JobEndNotifier#httpNotification uses DefaultHttpClient, which has been 
> deprecated; use the recommended methods instead.
> JobEndNotifier#httpNotification
> {code:java}
> private static int httpNotification(String uri, int timeout)
>       throws IOException, URISyntaxException {
>     DefaultHttpClient client = new DefaultHttpClient();
>     client.getParams()
>         .setIntParameter(CoreConnectionPNames.SO_TIMEOUT, timeout)
>         .setLongParameter(ClientPNames.CONN_MANAGER_TIMEOUT, (long) timeout);
>     HttpGet httpGet = new HttpGet(new URI(uri));
>     httpGet.setHeader("Accept", "*/*");
>     return client.execute(httpGet).getStatusLine().getStatusCode();
>   } {code}
>  * CoreConnectionPNames.SO_TIMEOUT
>  * Use RequestConfig.setSocketTimeout instead
> {code:java}
> Deprecated. Defines the socket timeout (SO_TIMEOUT) in milliseconds, which is 
> the timeout for waiting for data or, put differently, a maximum period 
> inactivity between two consecutive data packets). A timeout value of zero is 
> interpreted as an infinite timeout. {code}
>  
>  * ClientPNames.CONN_MANAGER_TIMEOUT
>  * Use RequestConfig.setConnectionRequestTimeout instead
> {code:java}
> Deprecated. Defines the timeout in milliseconds used when retrieving an 
> instance of ManagedClientConnection from the ClientConnectionManager. {code}
>  
>  
>  






[jira] [Updated] (MAPREDUCE-7379) RMContainerRequestor#makeRemoteRequest has confusing log message

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7379:
--
  Component/s: mrv2
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> RMContainerRequestor#makeRemoteRequest has confusing log message
> 
>
> Key: MAPREDUCE-7379
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7379
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 3.4.0
>Reporter: Szilard Nemeth
>Assignee: Ashutosh Gupta
>Priority: Trivial
>  Labels: newbie, newbie++, pull-request-available
> Fix For: 3.4.0
>
> Attachments: YARN-9355.001.patch, YARN-9355.002.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor#makeRemoteRequest 
> has this log: 
> {code:java}
> if (ask.size() > 0 || release.size() > 0) {
>   LOG.info("getResources() for " + applicationId + ":" + " ask="
>   + ask.size() + " release= " + release.size() + " newContainers="
>   + allocateResponse.getAllocatedContainers().size()
>   + " finishedContainers=" + numCompletedContainers
>   + " resourcelimit=" + availableResources + " knownNMs="
>   + clusterNmCount);
> }
> {code}
> "getResources()" is printed because 
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator#getResources 
> invokes makeRemoteRequest. This is uninformative and error-prone: if 
> getResources is ever renamed, the log message will be outdated. 
> Moreover, it's not a good idea for a method to print the name of another 
> method that sits below it in the stack.
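
One way to address the concern above is to build the message from the logged values alone, so that it names no method at all. The following is a minimal sketch and not the committed patch: the class `AllocateLogSketch`, the method `format`, and the `applicationId=` prefix are hypothetical, and plain parameters stand in for the fields used in the original snippet.

```java
final class AllocateLogSketch {
  private AllocateLogSketch() {
  }

  /**
   * Builds the allocate summary line from the values themselves.
   * No calling method is named, so renaming a caller cannot make it stale.
   * (Hypothetical helper; parameter names mirror the fields in the issue.)
   */
  public static String format(String applicationId, int ask, int release,
      int newContainers, int finishedContainers, String resourceLimit,
      int knownNMs) {
    return "applicationId=" + applicationId
        + " ask=" + ask
        + " release=" + release
        + " newContainers=" + newContainers
        + " finishedContainers=" + finishedContainers
        + " resourcelimit=" + resourceLimit
        + " knownNMs=" + knownNMs;
  }

  public static void main(String[] args) {
    // Sample values only; a real caller would pass the live counters.
    System.out.println(format("application_1_0001", 2, 1, 3, 0,
        "<memory:1024, vCores:1>", 5));
  }
}
```

Keeping every field in `key=value` form also avoids the stray space in the original `" release= "` fragment.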






[jira] [Updated] (MAPREDUCE-7343) Increase the job name max length in mapred job -list

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7343:
--
 Component/s: mapreduce-client
Target Version/s: 3.4.0

> Increase the job name max length in mapred job -list 
> -
>
> Key: MAPREDUCE-7343
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7343
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mapreduce-client
>Affects Versions: 3.4.0
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Presently the job name length is capped at 20 characters, but in many cases 
> (Hive being one) job names exceed that limit, and the truncated name is of 
> little value.
>  
> Proposal: increase the length limit from 20 to 35 here:
> {code:java}
> writer.printf(dataPattern, job.getJobID().toString(),
> job.getJobName().substring(0, jobNameLength > 20 ? 20 : jobNameLength),
> {code}
>  
>  
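
The truncation in the snippet above can be sketched in isolation as follows. This is an illustration under stated assumptions, not the actual change: the class `JobNameTruncationSketch`, the constant `MAX_JOB_NAME_LENGTH`, and the sample job name are hypothetical. The original ternary `jobNameLength > 20 ? 20 : jobNameLength` is equivalent to `Math.min(jobNameLength, 20)`, which is the form used here with the proposed cap of 35.

```java
final class JobNameTruncationSketch {
  // Cap proposed in the issue text; the constant name is hypothetical.
  static final int MAX_JOB_NAME_LENGTH = 35;

  private JobNameTruncationSketch() {
  }

  /** Returns at most maxLength leading characters of jobName. */
  public static String truncate(String jobName, int maxLength) {
    return jobName.substring(0, Math.min(jobName.length(), maxLength));
  }

  public static void main(String[] args) {
    // A long, Hive-style job name (made up for illustration).
    String hiveStyleName = "INSERT OVERWRITE TABLE sales_summary SELECT region, total";
    System.out.println(truncate(hiveStyleName, MAX_JOB_NAME_LENGTH));
  }
}
```

Holding the cap in a named constant means a later change to the limit touches one line rather than two literals in the same expression.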






[jira] [Updated] (MAPREDUCE-7343) Increase the job name max length in mapred job -list

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7343:
--
Affects Version/s: 3.4.0

> Increase the job name max length in mapred job -list 
> -
>
> Key: MAPREDUCE-7343
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7343
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 3.4.0
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Presently the job name length is capped at 20 characters, but in many cases 
> (Hive being one) job names exceed that limit, and the truncated name is of 
> little value.
>  
> Proposal: increase the length limit from 20 to 35 here:
> {code:java}
> writer.printf(dataPattern, job.getJobID().toString(),
> job.getJobName().substring(0, jobNameLength > 20 ? 20 : jobNameLength),
> {code}
>  
>  






[jira] [Updated] (MAPREDUCE-7324) ClientHSSecurityInfo class is in wrong META-INF file

2024-02-11 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated MAPREDUCE-7324:
--
  Component/s: mapreduce-client
 Hadoop Flags: Reviewed
 Target Version/s: 3.2.3, 2.10.2, 3.3.1, 3.4.0
Affects Version/s: 3.2.3
   2.10.2
   3.3.1
   3.4.0

> ClientHSSecurityInfo class is in wrong META-INF file
> 
>
> Key: MAPREDUCE-7324
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7324
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mapreduce-client
>Affects Versions: 3.4.0, 3.3.1, 2.10.2, 3.2.3
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Fix For: 3.4.0, 3.3.1, 2.10.2, 3.2.3
>
> Attachments: MAPREDUCE-7324.001.patch
>
>
> {{ClientHSSecurityInfo}} is located in 
> {noformat}
> ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/resources/META-INF/services/org.apache.hadoop.security.SecurityInfo
> {noformat} 
> But the actual class exists in
> {noformat}
> hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common
> {noformat}
> Because of this, there is an ordering dependency between client-jobclient 
> and client-common that can cause failures if the ordering is wrong. Namely, 
> if client-common is on the classpath _after_ client-jobclient, the JVM won't 
> find {{ClientHSSecurityInfo}}.


