Apache Hadoop qbt Report: branch2.10+JDK7 on Linux/x86

2020-07-08 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86/742/

No changes




-1 overall


The following subsystems voted -1:
docker


Powered by Apache Yetus https://yetus.apache.org

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org

[jira] [Created] (YARN-10348) Allow RM to always cancel tokens after app completes

2020-07-08 Thread Jim Brennan (Jira)
Jim Brennan created YARN-10348:
--

 Summary: Allow RM to always cancel tokens after app completes
 Key: YARN-10348
 URL: https://issues.apache.org/jira/browse/YARN-10348
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 3.1.3, 2.10.0
Reporter: Jim Brennan
Assignee: Jim Brennan


(Note: this change was originally done on our internal branch by [~daryn]).

The RM currently has an option for a client to disable token cancellation 
when a job completes. This feature was an initial attempt to address the use 
case of a job launching sub-jobs (e.g., the oozie launcher) where the original 
job finishes before the sub-jobs complete: completion of the original job 
triggered premature cancellation of tokens still needed by the sub-jobs.

Many years ago, [~daryn] added a more robust implementation that ref-counts 
tokens ([YARN-3055]). It defers cancellation of a token until all apps using 
the token have completed, which removed the need for a client to specify 
cancel=false. Unfortunately, the config option was not removed.

We have seen cases where oozie "java actions" and some users were explicitly 
disabling token cancellation. This can lead to a buildup of defunct tokens that 
may overwhelm the ZK buffer used by the KDC's backing store, at which point the 
KMS fails to connect to ZK and is unable to issue/validate new tokens, 
rendering the KDC only able to authenticate pre-existing tokens. Production 
incidents have occurred due to the buffer size issue.

To avoid these issues, the RM should have the option to ignore/override the 
client's request to not cancel tokens.
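
The interaction between ref counting and an RM-side override can be sketched 
roughly as follows. This is only an illustration under stated assumptions, not 
the RM's actual token renewer code: the class TokenRefCounter, its methods, and 
the alwaysCancel switch are all hypothetical names made up for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of ref-counted token cancellation with an RM override. */
class TokenRefCounter {
    // token id -> ids of running apps that still use the token
    private final Map<String, Set<String>> appsByToken = new HashMap<>();
    // tokens for which at least one client asked to skip cancellation
    private final Set<String> clientOptedOut = new HashSet<>();
    // assumed RM-side switch: when true, client opt-outs are ignored
    private final boolean alwaysCancel;

    TokenRefCounter(boolean alwaysCancel) {
        this.alwaysCancel = alwaysCancel;
    }

    synchronized void appStarted(String appId, String tokenId, boolean cancelAtEnd) {
        appsByToken.computeIfAbsent(tokenId, k -> new HashSet<>()).add(appId);
        if (!cancelAtEnd) {
            clientOptedOut.add(tokenId);
        }
    }

    /** Returns true when the token should be cancelled now. */
    synchronized boolean appFinished(String appId, String tokenId) {
        Set<String> apps = appsByToken.get(tokenId);
        if (apps == null) {
            return false; // unknown token, nothing to cancel
        }
        apps.remove(appId);
        if (!apps.isEmpty()) {
            return false; // still referenced by another running app
        }
        appsByToken.remove(tokenId);
        boolean optedOut = clientOptedOut.remove(tokenId);
        // The last user is gone: cancel unless a client opted out and
        // the RM is not configured to override that request.
        return alwaysCancel || !optedOut;
    }

    public static void main(String[] args) {
        TokenRefCounter rm = new TokenRefCounter(true);
        rm.appStarted("launcher", "token-1", false); // client asked: do not cancel
        rm.appStarted("sub-job", "token-1", true);
        System.out.println(rm.appFinished("launcher", "token-1")); // sub-job still running
        System.out.println(rm.appFinished("sub-job", "token-1"));  // last user finished
    }
}
```

With the override enabled, the token is kept while the sub-job runs and is 
cancelled once the last app finishes, even though the launcher asked for 
cancel=false.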



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




Re: [VOTE] Release Apache Hadoop 3.1.4 (RC2)

2020-07-08 Thread Masatake Iwasaki

Thanks Steve and Prabhu for the information.

The cause turned out to be locking in CapacityScheduler#reinitialize.
I think the method is called after transitioning to the active state if RM-HA 
is enabled.


I filed YARN-10347 and created a PR.


Masatake Iwasaki


On 2020/07/08 16:33, Prabhu Joseph wrote:

Hi Masatake,

  The thread is waiting for a ReadLock; we need to check what the other
thread holding the WriteLock is blocked on.
Can you get three consecutive complete jstack dumps of the ResourceManager
during the issue?


> I got no issue if RM-HA is disabled.

It looks like the RM is not able to access the ZooKeeper State Store. Can you
check if there is any connectivity issue between the RM and ZooKeeper?

Thanks,
Prabhu Joseph


On Mon, Jul 6, 2020 at 2:44 AM Masatake Iwasaki 
wrote:


Thanks for putting this up, Gabor Bota.

I'm testing RC2 on a 3-node Docker cluster with NN-HA and RM-HA enabled.
The ResourceManager reproducibly blocks on submitApplication while launching
example MR jobs.
Has anyone run into the same issue?

The same configuration worked for 3.1.3.
I got no issue if RM-HA is disabled.


"IPC Server handler 1 on default port 8032" #167 daemon prio=5 os_prio=0
tid=0x7fe91821ec50 nid=0x3b9 waiting on condition [0x7fe901bac000]
   java.lang.Thread.State: WAITING (parking)
  at sun.misc.Unsafe.park(Native Method)
  - parking to wait for  <0x85d37a40> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
  at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.checkAndGetApplicationPriority(CapacityScheduler.java:2521)
  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:417)
  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:342)
  at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:678)
  at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:277)
  at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:563)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:527)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1015)
  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:943)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2943)


Masatake Iwasaki

On 2020/06/26 22:51, Gabor Bota wrote:

Hi folks,

I have put together a release candidate (RC2) for Hadoop 3.1.4.

The RC is available at:

http://people.apache.org/~gabota/hadoop-3.1.4-RC2/

The RC tag in git is here:
https://github.com/apache/hadoop/releases/tag/release-3.1.4-RC2
The maven artifacts are staged at
https://repository.apache.org/content/repositories/orgapachehadoop-1269/

You can find my public key at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
and http://keys.gnupg.net/pks/lookup?op=get=0xB86249D83539B38C

Please try the release and vote. The vote will run for 5 weekdays,
until July 6, 2020, 23:00 CET.

The release includes the revert of HDFS-14941, as it caused
HDFS-15421 (IBR leak causes standby NN to be stuck in safe mode).
(https://issues.apache.org/jira/browse/HDFS-15421)
The release includes HDFS-15323, as requested.
(https://issues.apache.org/jira/browse/HDFS-15323)

Thanks,
Gabor

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-10347) Fix double locking in CapacityScheduler#reinitialize in branch-3.1

2020-07-08 Thread Masatake Iwasaki (Jira)
Masatake Iwasaki created YARN-10347:
---

 Summary: Fix double locking in CapacityScheduler#reinitialize in 
branch-3.1
 Key: YARN-10347
 URL: https://issues.apache.org/jira/browse/YARN-10347
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacity scheduler
Affects Versions: 3.1.4
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki


Double locking blocks other threads in the ResourceManager that are waiting for the lock.
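
Why a double lock starves other threads can be shown with a small standalone 
sketch. It assumes the problem has the generic shape "write lock acquired 
twice, released once"; the class DoubleLockDemo and its method are hypothetical 
and are not taken from CapacityScheduler.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical demo: a write lock taken twice but released once stays held. */
class DoubleLockDemo {
    /**
     * Acquires the write lock the given number of times, releases it once,
     * then reports whether a reader on another thread can acquire the read lock.
     */
    static boolean readerCanEnter(int acquires) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        for (int i = 0; i < acquires; i++) {
            lock.writeLock().lock(); // reentrant: each call raises the hold count
        }
        lock.writeLock().unlock();   // a single unlock, as in one try/finally
        boolean[] entered = new boolean[1];
        Thread reader = new Thread(() -> {
            try {
                // Bounded wait so the demo terminates instead of parking forever.
                if (lock.readLock().tryLock(200, TimeUnit.MILLISECONDS)) {
                    entered[0] = true;
                    lock.readLock().unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return entered[0];
    }

    public static void main(String[] args) {
        System.out.println("single lock, reader enters: " + readerCanEnter(1));
        System.out.println("double lock, reader enters: " + readerCanEnter(2));
    }
}
```

A real reader that calls lock() instead of a timed tryLock() would park 
indefinitely in the second case, which matches the WAITING (parking) state 
seen in the ResourceManager thread dump.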



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Created] (YARN-10346) Add testcase for RMWebApp make external class pluggable

2020-07-08 Thread Bilwa S T (Jira)
Bilwa S T created YARN-10346:


 Summary: Add testcase for RMWebApp make external class pluggable
 Key: YARN-10346
 URL: https://issues.apache.org/jira/browse/YARN-10346
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bilwa S T
Assignee: Bilwa S T


Add a test case for YARN-8047.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Created] (YARN-10345) HsWebServices containerlogs does not honor ACLs for completed jobs

2020-07-08 Thread Prabhu Joseph (Jira)
Prabhu Joseph created YARN-10345:


 Summary: HsWebServices containerlogs does not honor ACLs for 
completed jobs
 Key: YARN-10345
 URL: https://issues.apache.org/jira/browse/YARN-10345
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 3.2.0, 3.4.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph
 Attachments: Screen Shot 2020-07-08 at 12.54.21 PM.png

HsWebServices containerlogs does not honor ACLs. A user who does not have 
permission to view a job is allowed to view the job logs from YARN UI2 through 
HsWebServices.

*Repro:*

Secure cluster + yarn.admin.acl=yarn,mapred + Root Queue ACLs set to " " + 
HistoryServer runs as mapred

1. Run a sample MR job as the systest user.
2. Once the job is complete, access the job logs as the hue user from YARN UI2.




The YARN CLI works fine and denies access as expected:
{code}
[hue@pjoseph-cm-2 /]$ 
[hue@pjoseph-cm-2 /]$ yarn logs -applicationId application_1594188841761_0002
WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
20/07/08 07:23:08 INFO client.RMProxy: Connecting to ResourceManager at rmhostname:8032
Permission denied: user=hue, access=EXECUTE, inode="/tmp/logs/systest":systest:hadoop:drwxrwx---
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:496)
{code}
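
The kind of check the web service would need before serving logs can be 
sketched as below. This is a simplified illustration only: LogAccessCheck and 
canViewLogs are hypothetical names, and the logic (owner, admin, or member of 
the job's view ACL) is an assumption about the intended policy, not Hadoop's 
actual ACL API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of a view-ACL check applied before serving job logs. */
class LogAccessCheck {
    // users listed as cluster admins, e.g. from yarn.admin.acl
    private final Set<String> admins;

    LogAccessCheck(String... admins) {
        this.admins = new HashSet<>(Arrays.asList(admins));
    }

    /** True if the caller may read logs of a job with the given owner and view ACL. */
    boolean canViewLogs(String caller, String owner, Set<String> viewAcl) {
        return caller.equals(owner)
            || admins.contains(caller)
            || viewAcl.contains(caller);
    }

    public static void main(String[] args) {
        // Mirrors the repro: admins are yarn and mapred, the job ACL is empty.
        LogAccessCheck check = new LogAccessCheck("yarn", "mapred");
        Set<String> emptyAcl = Collections.emptySet();
        System.out.println("hue: " + check.canViewLogs("hue", "systest", emptyAcl));
        System.out.println("systest: " + check.canViewLogs("systest", "systest", emptyAcl));
    }
}
```

In the repro above, the hue user is neither the owner nor an admin, so a check 
of this shape would reject the request the same way the CLI path does.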





--
This message was sent by Atlassian Jira
(v8.3.4#803005)




Re: [VOTE] Release Apache Hadoop 3.1.4 (RC2)

2020-07-08 Thread Prabhu Joseph
Hi Masatake,

 The thread is waiting for a ReadLock; we need to check what the other
thread holding the WriteLock is blocked on.
Can you get three consecutive complete jstack dumps of the ResourceManager
during the issue?
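
For reference, the same "who holds the lock" question can be asked from inside 
a JVM through the standard java.lang.management API. The sketch below is a toy 
reproduction, not ResourceManager code: LockOwnerFinder, its methods, and the 
thread names are made up for illustration, and it relies on the JVM reporting 
the owner of an exclusively held ReentrantReadWriteLock for a parked thread.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical helper: find the thread that owns the lock a waiter is parked on. */
class LockOwnerFinder {
    /** Returns the owner thread's name, or null if the waiter is not blocked on an owned lock. */
    static String ownerOfLockBlocking(Thread waiter) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo info = mx.getThreadInfo(waiter.getId());
        return info == null ? null : info.getLockOwnerName();
    }

    /** Toy setup: one thread holds the write lock while another waits for the read lock. */
    static String demo() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread writer = new Thread(() -> {
            lock.writeLock().lock();
            try {
                held.countDown();
                release.await(); // keep holding the write lock until told to stop
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.writeLock().unlock();
            }
        }, "writer-holding-lock");
        Thread reader = new Thread(() -> {
            lock.readLock().lock(); // parks while the writer holds the lock
            lock.readLock().unlock();
        }, "reader-waiting");
        String owner = null;
        try {
            writer.start();
            held.await();
            reader.start();
            // Poll until the reader has parked and the JVM reports the lock owner.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (owner == null && System.nanoTime() < deadline) {
                owner = ownerOfLockBlocking(reader);
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown(); // let both threads finish
        }
        return owner;
    }

    public static void main(String[] args) {
        System.out.println("read lock waiter is blocked by: " + demo());
    }
}
```

On a live ResourceManager the external equivalent is a thread dump; running 
jstack with the -l option additionally lists the ownable synchronizers each 
thread has locked, which helps locate the WriteLock holder.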

>> I got no issue if RM-HA is disabled.

It looks like the RM is not able to access the ZooKeeper State Store. Can you
check if there is any connectivity issue between the RM and ZooKeeper?

Thanks,
Prabhu Joseph

