Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2019-10-18 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/479/

No changes




-1 overall


The following subsystems voted -1:
docker


Powered by Apache Yetus 0.8.0   http://yetus.apache.org


Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2019-10-18 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1294/

[Oct 18, 2019 9:19:49 AM] (snemeth) YARN-9841. Capacity scheduler: add support for combined %user +
[Oct 18, 2019 3:25:02 PM] (weichiu) HADOOP-16152. Upgrade Eclipse Jetty version to 9.4.x. Contributed by
[Oct 18, 2019 8:26:20 PM] (weichiu) HADOOP-16579. Upgrade to Curator 4.2.0 and ZooKeeper 3.5.5 (#1656).
[Oct 18, 2019 11:10:32 PM] (eyang) YARN-9884. Make container-executor mount logic modular
[Oct 19, 2019 12:30:11 AM] (eyang) YARN-9875. Improve fair scheduler configuration store on HDFS.




-1 overall


The following subsystems voted -1:
docker


Powered by Apache Yetus 0.8.0   http://yetus.apache.org


[jira] [Created] (YARN-9920) YarnAuthorizationProvider AccessRequest has Null RemoteAddress in case of FairScheduler

2019-10-18 Thread Prabhu Joseph (Jira)
Prabhu Joseph created YARN-9920:
---

 Summary: YarnAuthorizationProvider AccessRequest has Null 
RemoteAddress in case of FairScheduler
 Key: YARN-9920
 URL: https://issues.apache.org/jira/browse/YARN-9920
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.3.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


YarnAuthorizationProvider AccessRequest has a null RemoteAddress in the case of FairScheduler. FSQueue#hasAccess uses Server.getRemoteAddress(), which is null when the call comes from RMWebServices or the EventDispatcher; it is populated only when the call is made by an IPC Server handler.

FSQueue#hasAccess is called from three places; paths (2) and (3) below see a null remote address.

1. IPC Server -> RMAppManager#createAndPopulateNewRMApp -> FSQueue#hasAccess -> Server.getRemoteAddress returns the correct remote IP.

2. IPC Server -> RMAppManager#createAndPopulateNewRMApp -> AppAddedSchedulerEvent -> EventDispatcher -> FairScheduler#addApplication -> FSQueue#hasAccess -> Server.getRemoteAddress returns null.

{code}
org.apache.hadoop.yarn.security.ConfiguredYarnAuthorizer.checkPermission(ConfiguredYarnAuthorizer.java:101)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.hasAccess(FSQueue.java:316)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:509)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1268)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:133)
at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
{code}

3. RMWebServices -> QueueACLsManager#checkAccess -> FSQueue#hasAccess -> Server.getRemoteAddress returns null.

{code}
org.apache.hadoop.yarn.security.ConfiguredYarnAuthorizer.checkPermission(ConfiguredYarnAuthorizer.java:101)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.hasAccess(FSQueue.java:316)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.checkAccess(FairScheduler.java:1610)
at org.apache.hadoop.yarn.server.resourcemanager.security.QueueACLsManager.checkAccess(QueueACLsManager.java:84)
at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.hasAccess(RMWebServices.java:270)
at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.getApps(RMWebServices.java:553)
{code}

Verified with CapacityScheduler, where it works fine.
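
A minimal sketch of one possible direction (hypothetical, names assumed; not the committed fix): resolve a non-null caller address for the ACL check, falling back to the local host when the call does not come through an IPC handler thread.

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;
import org.apache.hadoop.ipc.Server;

public final class CallerAddress {
  private CallerAddress() {}

  /** Returns a non-null address for AccessRequest construction. */
  static String resolve() {
    // Non-null only when invoked from an IPC Server handler (path 1).
    String remoteAddress = Server.getRemoteAddress();
    if (remoteAddress == null) {
      // Paths 2 and 3 (EventDispatcher, RMWebServices): fall back to
      // the RM's own address rather than passing null downstream.
      try {
        remoteAddress = InetAddress.getLocalHost().getHostAddress();
      } catch (UnknownHostException e) {
        remoteAddress = "127.0.0.1"; // last-resort default
      }
    }
    return remoteAddress;
  }
}
{code}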






[jira] [Created] (YARN-9919) QueueMetrics for observe only preemption

2019-10-18 Thread Prashant Golash (Jira)
Prashant Golash created YARN-9919:
-

 Summary: QueueMetrics for observe only preemption
 Key: YARN-9919
 URL: https://issues.apache.org/jira/browse/YARN-9919
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Prashant Golash


There should be a way to track how many containers would be preempted if preemption were enabled in "observe only" mode.

In the case of lazy preemption, this could be a bit involved, as the scheduler makes the decision.
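
A minimal sketch of how such a metric could be registered (hypothetical names; follows the metrics2 pattern QueueMetrics already uses):

{code}
import org.apache.hadoop.metrics2.lib.MetricsRegistry;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;

public class ObserveOnlyPreemptionMetrics {
  private final MetricsRegistry registry =
      new MetricsRegistry("ObserveOnlyPreemption");

  // Containers that *would* have been preempted in observe-only mode.
  private final MutableCounterLong wouldPreemptContainers =
      registry.newCounter("WouldPreemptContainers",
          "Containers that would be preempted in observe-only mode", 0L);

  public void incrWouldPreempt(int n) {
    wouldPreemptContainers.incr(n);
  }
}
{code}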






[jira] [Created] (YARN-9918) AggregatedAllocatedContainers metrics not getting reported for MR in 2.6.x

2019-10-18 Thread Prashant Golash (Jira)
Prashant Golash created YARN-9918:
-

 Summary: AggregatedAllocatedContainers metrics not getting 
reported for MR in 2.6.x
 Key: YARN-9918
 URL: https://issues.apache.org/jira/browse/YARN-9918
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Prashant Golash


One of our YARN clusters runs 2.6.x (CDH). I have observed that aggregated allocated container metrics are not reported for MR jobs. Some queues carry an MR-only workload, yet those queues always show 0 for "aggregatedAllocatedContainers".

Created this Jira to track the issue.






[jira] [Created] (YARN-9917) Rate limit RMWebServices API

2019-10-18 Thread Prashant Golash (Jira)
Prashant Golash created YARN-9917:
-

 Summary: Rate limit RMWebServices API
 Key: YARN-9917
 URL: https://issues.apache.org/jira/browse/YARN-9917
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Prashant Golash


In our cluster, Spark clients call getApplications without specifying any filters for some of their use cases.

When the number of applications is large and such calls are frequent, this puts GC pressure on the RM and can result in a failover.

Ideally, we should be able to rate-limit this API.

Any other ideas are also welcome.
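
A minimal sketch of one way to do it (hypothetical, not an existing YARN feature): throttle the endpoint with a Guava RateLimiter in a servlet filter, shedding load with HTTP 429 once the budget is exhausted.

{code}
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;
import com.google.common.util.concurrent.RateLimiter;

public class GetAppsThrottleFilter implements Filter {
  // Assumption: 10 requests/second cluster-wide; would be made configurable.
  private final RateLimiter limiter = RateLimiter.create(10.0);

  @Override public void init(FilterConfig conf) {}
  @Override public void destroy() {}

  @Override
  public void doFilter(ServletRequest req, ServletResponse resp,
      FilterChain chain) throws IOException, ServletException {
    if (!limiter.tryAcquire()) {
      // Reject instead of queueing, so the RM heap stays bounded.
      ((HttpServletResponse) resp).sendError(429,
          "Too many getApplications calls; retry later");
      return;
    }
    chain.doFilter(req, resp);
  }
}
{code}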






[jira] [Created] (YARN-9915) Fix FindBug issue in QueueMetrics

2019-10-18 Thread Prabhu Joseph (Jira)
Prabhu Joseph created YARN-9915:
---

 Summary: Fix FindBug issue in QueueMetrics
 Key: YARN-9915
 URL: https://issues.apache.org/jira/browse/YARN-9915
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.3.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


The following FindBugs issue appears in the trunk build:

{code}
org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.registerCustomResources()
 invokes inefficient new Long(long) constructor; use Long.valueOf(long) instead
Bug type DM_NUMBER_CTOR
In class org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics
In method 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.registerCustomResources()
Called method new Long(long)
Should call Long.valueOf(long) instead
At QueueMetrics.java:[line 468]
{code}
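
The fix suggested by the FindBugs message, as a small sketch:

{code}
public class LongBoxing {
  public static void main(String[] args) {
    long value = 42L;
    Long boxed = new Long(value);      // flagged: DM_NUMBER_CTOR, allocates on every call
    Long cached = Long.valueOf(value); // fix: reuses cached instances for small values
    System.out.println(boxed.equals(cached)); // true
  }
}
{code}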






[jira] [Created] (YARN-9916) Improving Async Dispatcher

2019-10-18 Thread Prashant Golash (Jira)
Prashant Golash created YARN-9916:
-

 Summary: Improving Async Dispatcher
 Key: YARN-9916
 URL: https://issues.apache.org/jira/browse/YARN-9916
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Prashant Golash


Currently, the async dispatcher works in a single-threaded model.

There is a separate queue for the scheduler handler, but not all handlers are non-blocking. In our cluster, this queue can sometimes grow to 16M events, which takes a long time to drain.

We should think about improving it:

 # Either add multiple dispatcher threads that pick events off the queue, which would require careful evaluation of event ordering (see the sketch below).
 # Or give all downstream handlers their own queue, like the scheduler handler (this also needs careful evaluation of out-of-order events).

Any other ideas are also welcome.
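
A minimal sketch of option 1 (hypothetical, names assumed): shard the dispatcher across N single-threaded workers, hashing on an ordering key (for example the application id) so that events for the same entity still run in order.

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ShardedDispatcher {
  private final ExecutorService[] workers;

  public ShardedDispatcher(int nThreads) {
    workers = new ExecutorService[nThreads];
    for (int i = 0; i < nThreads; i++) {
      // One thread per shard preserves ordering within a shard.
      workers[i] = Executors.newSingleThreadExecutor();
    }
  }

  /** Events with the same key run in order on the same worker. */
  public void dispatch(Object orderingKey, Runnable handler) {
    int shard = Math.floorMod(orderingKey.hashCode(), workers.length);
    workers[shard].execute(handler);
  }

  public void shutdown() {
    for (ExecutorService w : workers) {
      w.shutdown();
    }
  }
}
{code}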

 

 






[jira] [Created] (YARN-9914) Use separate configs for free disk space checking for full and not-full disks

2019-10-18 Thread Jim Brennan (Jira)
Jim Brennan created YARN-9914:
-

 Summary: Use separate configs for free disk space checking for 
full and not-full disks
 Key: YARN-9914
 URL: https://issues.apache.org/jira/browse/YARN-9914
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Reporter: Jim Brennan
Assignee: Jim Brennan


[YARN-3943] added separate configurations to the NodeManager health check for the disk-utilization full-disk check:

{{max-disk-utilization-per-disk-percentage}} - threshold for marking a good disk full

{{disk-utilization-watermark-low-per-disk-percentage}} - threshold for marking a full disk as not full.

On our clusters, we do not use these configs. We instead use {{min-free-space-per-disk-mb}} so we can specify the limit in MB rather than as percent utilization. With this parameter we have observed the same oscillation behavior described in [YARN-3943]. I would like to add an optional config that specifies a separate threshold for marking a full disk as not full:

{{min-free-space-per-disk-mb}} - threshold at which a good disk is marked full

{{disk-free-space-per-disk-high-watermark-mb}} - threshold at which a full disk 
is marked good.

So for example, we could set {{min-free-space-per-disk-mb = 5GB}}, which would 
cause a disk to be marked full when free space goes below 5GB, and 
{{disk-free-space-per-disk-high-watermark-mb = 10GB}} to keep the disk in the 
full state until free space goes above 10GB.
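
A sketch of what that could look like in yarn-site.xml (the second property is the proposal here and does not exist yet; the full property prefix is assumed from the existing config):

{code}
<!-- Mark a disk full when free space drops below 5 GB. -->
<property>
  <name>yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb</name>
  <value>5120</value>
</property>

<!-- Proposed: keep the disk marked full until free space exceeds 10 GB. -->
<property>
  <name>yarn.nodemanager.disk-health-checker.disk-free-space-per-disk-high-watermark-mb</name>
  <value>10240</value>
</property>
{code}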






[jira] [Created] (YARN-9913) In YARN ui2 attempt container tab, the Container's ElapsedTime of a running Application is incorrect when the browser and the YARN server are in different timezones.

2019-10-18 Thread jenny (Jira)
jenny created YARN-9913:
---

 Summary: In YARN ui2 attempt container tab, the Container's ElapsedTime of a running Application is incorrect when the browser and the YARN server are in different timezones.
 Key: YARN-9913
 URL: https://issues.apache.org/jira/browse/YARN-9913
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-ui-v2
Affects Versions: 3.2.1, 3.1.1
Reporter: jenny


In YARN ui2's attempt container tab, the Container's ElapsedTime of a running Application is incorrect when the browser and the YARN server are in different timezones.

(Screenshots of the Yarn UI2 view are attached to the Jira issue.)


