[jira] [Commented] (YARN-9738) Remove lock on ClusterNodeTracker#getNodeReport as it blocks application submission

2019-08-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905908#comment-16905908
 ] 

Hadoop QA commented on YARN-9738:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
56s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 51s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 36s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}104m 47s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}170m 19s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
 |
|   | hadoop.yarn.server.resourcemanager.scheduler.TestAbstractYarnScheduler |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9738 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977442/YARN-9738-002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 999b769a2c7b 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 
10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 454420e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/24548/artifact/out/p

[jira] [Commented] (YARN-9738) Remove lock on ClusterNodeTracker#getNodeReport as it blocks application submission

2019-08-13 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905952#comment-16905952
 ] 

Sunil Govindan commented on YARN-9738:
--

Hi [~BilwaST]

{{nodes}} is operated on under a read/write lock in ClusterNodeTracker. Converting 
it to a ConcurrentHashMap also impacts other code paths. If the writeLock and the 
ConcurrentHashMap are not used together carefully, it could cause redundant 
locking.

Thanks
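
As an illustration of the concern above, here is a minimal, hypothetical sketch 
(class and field names are placeholders, not the actual ClusterNodeTracker code) 
of why mixing a ConcurrentHashMap with the existing write lock can be redundant: 
paths that still hold the lock pay for the map's own synchronization as well, 
while lock-free readers may interleave with composite updates the lock was meant 
to keep atomic.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the trade-off discussed above; not the real ClusterNodeTracker.
public class NodeTrackerSketch {
  private final Map<String, String> nodes = new ConcurrentHashMap<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private long totalCapacity; // composite state updated together with the map

  // Write path: still needs the write lock because two pieces of state change
  // together, so the ConcurrentHashMap's internal locking is redundant here.
  public void addNode(String nodeId, String report, long capacity) {
    lock.writeLock().lock();
    try {
      nodes.put(nodeId, report);
      totalCapacity += capacity;
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Lock-free read path: fast, but it can observe the map between the two
  // updates in addNode, which is the consistency question raised above.
  public String getNodeReport(String nodeId) {
    return nodes.get(nodeId);
  }
}
{code}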

> Remove lock on ClusterNodeTracker#getNodeReport as it blocks application 
> submission
> ---
>
> Key: YARN-9738
> URL: https://issues.apache.org/jira/browse/YARN-9738
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9738-001.patch, YARN-9738-002.patch
>
>
> *Env:*
> Server OS: Ubuntu
> No. of cluster nodes: 9120 NMs
> Env mode: Secure
> *Preconditions:*
> ~9120 NMs were running
> ~1250 applications were in running state
> 35K applications were in pending state
> *Test Steps:*
> 1. Submit applications from 5 clients, each client with 2 threads, across a 
> total of 10 queues
> 2. As application submission increases, each distributed shell application 
> calls getClusterNodes
> *ClientRMService#getClusterNodes tries to get 
> ClusterNodeTracker#getNodeReport, where the map nodes is locked.*
> {quote}
> "IPC Server handler 36 on 45022" #246 daemon prio=5 os_prio=0 
> tid=0x7f75095de000 nid=0x1949c waiting on condition [0x7f74cff78000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x7f759f6d8858> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.getNodeReport(ClusterNodeTracker.java:123)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getNodeReport(AbstractYarnScheduler.java:449)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.createNodeReports(ClientRMService.java:1067)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getClusterNodes(ClientRMService.java:992)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getClusterNodes(ApplicationClientProtocolPBServiceImpl.java:313)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:589)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2792)
> {quote}
> *Instead, we can make nodes a ConcurrentHashMap and remove the read lock.*



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9676) Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected classes

2019-08-13 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906018#comment-16906018
 ] 

Peter Bacsko commented on YARN-9676:


+1 LGTM (non-binding)

> Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected 
> classes
> 
>
> Key: YARN-9676
> URL: https://issues.apache.org/jira/browse/YARN-9676
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
>
> During the development of the last items of YARN-6875, it was typically 
> difficult to extract information about the internal state of some log 
> aggregation related classes (e.g. {{AppLogAggregatorImpl}} and 
> {{LogAggregationFileController}}). 
> On my fork I added a few more messages to those classes, such as:
> - displaying the number of log aggregation cycles
> - displaying the names of the files currently considered for log aggregation 
> by containers
> - immediately displaying any exception caught (and sent to the RM in the 
> diagnostic messages) during the log aggregation process.
> Those messages were quite useful for debugging when an issue occurred, but 
> otherwise they flooded the NM log file with messages that are usually not 
> needed. I suggest adding (some of) these messages at DEBUG or TRACE level.
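
As an illustration of that proposal, here is a minimal sketch of guarded 
DEBUG/TRACE messages using SLF4J (the class, logger, and message contents are 
illustrative placeholders, not the actual AppLogAggregatorImpl changes):

{code:java}
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative only: guarded DEBUG/TRACE messages of the kind proposed above.
public class LogAggregationCycleSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(LogAggregationCycleSketch.class);

  void logCycle(int cycleCount, List<String> candidateFiles, Exception caught) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Log aggregation cycle #{} considering {} files",
          cycleCount, candidateFiles.size());
    }
    if (LOG.isTraceEnabled()) {
      LOG.trace("Files considered for aggregation: {}", candidateFiles);
    }
    if (caught != null) {
      // Surface exceptions immediately instead of only via RM diagnostics.
      LOG.debug("Exception caught during log aggregation cycle #" + cycleCount,
          caught);
    }
  }
}
{code}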



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9444) YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize yarn.io/gpu as a valid resource

2019-08-13 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9444:
-
Summary: YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't 
recognize yarn.io/gpu as a valid resource  (was: YARN Api Resource Utils's 
getRequestedResourcesFromConfig doesn't recognise yarn.io/gpu as a valid 
resource type)

> YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize 
> yarn.io/gpu as a valid resource
> --
>
> Key: YARN-9444
> URL: https://issues.apache.org/jira/browse/YARN-9444
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Minor
> Attachments: YARN-9444.001.patch
>
>
> The original issue was that the jobclient test did not send the requested 
> resource type when it was specified on the command line, e.g.:
> {code:java}
> hadoop jar hadoop-mapreduce-client-jobclient-tests.jar sleep 
> -Dmapreduce.reduce.resource.yarn.io/gpu=1  -m 10 -r 1 -mt 9
> {code}
> After some investigation, it turned out that this only affects resource types 
> whose names contain '.' characters, and that the root cause is the regexp in 
> the getRequestedResourcesFromConfig method:
> {code:java}
> "^" + Pattern.quote(prefix) + "[^.]+$"
> {code}
> This regexp explicitly forbids any dots in the resource type name, which is 
> inconsistent with the default resource types for GPU and FPGA, which are 
> yarn.io/gpu and yarn.io/fpga respectively.
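
A small self-contained sketch (not the actual ResourceUtils code; the prefix 
string is just an example) reproducing how the quoted pattern rejects resource 
names containing dots, such as yarn.io/gpu:

{code:java}
import java.util.regex.Pattern;

// Demonstrates the behaviour described above: the "[^.]+$" tail rejects any
// property suffix containing a dot, so "yarn.io/gpu" never matches.
public class ResourcePatternDemo {
  public static void main(String[] args) {
    String prefix = "mapreduce.reduce.resource.";
    Pattern p = Pattern.compile("^" + Pattern.quote(prefix) + "[^.]+$");

    System.out.println(p.matcher(prefix + "memory-mb").matches());   // true
    System.out.println(p.matcher(prefix + "yarn.io/gpu").matches()); // false - dot in name
  }
}
{code}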



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9741) [JDK11] TestAHSWebServices.testAbout fails

2019-08-13 Thread Adam Antal (JIRA)
Adam Antal created YARN-9741:


 Summary: [JDK11] TestAHSWebServices.testAbout fails
 Key: YARN-9741
 URL: https://issues.apache.org/jira/browse/YARN-9741
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineservice
Affects Versions: 3.2.0
Reporter: Adam Antal


On openjdk-11.0.2 TestAHSWebServices.testAbout[0] fails consistently with the 
following stack trace:
{noformat}
[ERROR] Tests run: 40, Failures: 6, Errors: 0, Skipped: 0, Time elapsed: 7.9 s 
<<< FAILURE! - in 
org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices
[ERROR] 
testAbout[0](org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices)
  Time elapsed: 0.241 s  <<< FAILURE!
org.junit.ComparisonFailure: expected: but 
was:
at org.junit.Assert.assertEquals(Assert.java:115)
at org.junit.Assert.assertEquals(Assert.java:144)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices.testAbout(TestAHSWebServices.java:333)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9742) [JDK11] TestTimelineWebServicesWithSSL.testPutEntities fails

2019-08-13 Thread Adam Antal (JIRA)
Adam Antal created YARN-9742:


 Summary: [JDK11] TestTimelineWebServicesWithSSL.testPutEntities 
fails
 Key: YARN-9742
 URL: https://issues.apache.org/jira/browse/YARN-9742
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineservice
Affects Versions: 3.2.0
Reporter: Adam Antal


Tested on openjdk-11.0.2 on a Mac.

Stack trace:
{noformat}
[ERROR] Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 8.206 s 
<<< FAILURE! - in 
org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL
[ERROR] 
testPutEntities(org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL)
  Time elapsed: 0.366 s  <<< ERROR!
com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: HTTPS 
hostname wrong:  should be <0.0.0.0>
at 
com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineJerseyRetryFilter$1.run(TimelineConnector.java:392)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineClientConnectionRetry.retryOn(TimelineConnector.java:335)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineJerseyRetryFilter.handle(TimelineConnector.java:405)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at 
com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:570)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPostingObject(TimelineWriter.java:152)
at 
org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL$TestTimelineClient$1.doPostingObject(TestTimelineWebServicesWithSSL.java:139)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:115)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:112)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPosting(TimelineWriter.java:112)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineWriter.putEntities(TimelineWriter.java:92)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:178)
at 
org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL.testPutEntities(TestTimelineWebServicesWithSSL.java:110)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter

[jira] [Created] (YARN-9743) [JDK11] TestTimelineWebServices.testContextFactory fails

2019-08-13 Thread Adam Antal (JIRA)
Adam Antal created YARN-9743:


 Summary: [JDK11] TestTimelineWebServices.testContextFactory fails
 Key: YARN-9743
 URL: https://issues.apache.org/jira/browse/YARN-9743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineservice
Affects Versions: 3.2.0
Reporter: Adam Antal


Tested on OpenJDK 11.0.2 on a Mac.

Stack trace:
{noformat}
[ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 36.016 
s <<< FAILURE! - in 
org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices
[ERROR] 
testContextFactory(org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices)
  Time elapsed: 1.031 s  <<< ERROR!
java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory
at 
java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
at 
java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Class.java:315)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.newContext(ContextFactory.java:85)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.createContext(ContextFactory.java:112)
at 
org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices.testContextFactory(TestTimelineWebServices.java:1039)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9728)  ResourceManager REST API can produce an illegal xml response

2019-08-13 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph reassigned YARN-9728:
---

Assignee: Prabhu Joseph

>  ResourceManager REST API can produce an illegal xml response
> -
>
> Key: YARN-9728
> URL: https://issues.apache.org/jira/browse/YARN-9728
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api, resourcemanager
>Affects Versions: 2.7.3
>Reporter: Thomas
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: IllegalResponseChrome.png
>
>
> When a Spark job throws an exception with a message containing a character 
> outside the range supported by XML 1.0, the application fails and the stack 
> trace is stored in the {{diagnostics}} field. So far, so good.
> But the issue occurs when we try to get the application information with the 
> ResourceManager REST API: the XML response will contain the illegal XML 1.0 
> character and will be invalid.
>  *+Examples of illegal characters in XML 1.0:+* 
>  * \u 
>  * \u0001
>  * \u0002
>  * \u0003
>  * \u0004
> _For more information about supported characters :_
> [https://www.w3.org/TR/xml/#charsets]
> *+Example of an illegal response from the ResourceManager REST API:+* 
> {code:xml}
> 
> 
>   application_1326821518301_0005
>   user1
>   job
>   a1
>   FINISHED
>   FAILED
>   100.0
>   History
>   
> http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5
>   Exception in thread "main" java.lang.Exception: \u0001
>   at com..main(JobWithSpecialCharMain.java:6)
>   [...]
> 
> {code}
>  
> *+Example of job to reproduce :+*
> {code:java}
> public class JobWithSpecialCharMain {
>  public static void main(String[] args) throws Exception {
>   throw new Exception("\u0001");
>  }
> }
> {code}
>  !IllegalResponseChrome.png! 
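
One possible mitigation, sketched below purely as an illustration (this is not 
the patch discussed in this issue), is to strip characters outside the XML 1.0 
character range before the diagnostics string is serialized:

{code:java}
// Illustrative helper: removes characters that are not allowed in XML 1.0
// (see https://www.w3.org/TR/xml/#charsets).
public class XmlSanitizer {
  public static String stripInvalidXml10(String input) {
    StringBuilder out = new StringBuilder(input.length());
    for (int i = 0; i < input.length(); ) {
      int cp = input.codePointAt(i);
      boolean valid = cp == 0x9 || cp == 0xA || cp == 0xD
          || (cp >= 0x20 && cp <= 0xD7FF)
          || (cp >= 0xE000 && cp <= 0xFFFD)
          || (cp >= 0x10000 && cp <= 0x10FFFF);
      if (valid) {
        out.appendCodePoint(cp);
      }
      i += Character.charCount(cp);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // "\u0001" is one of the illegal characters listed in the description.
    System.out.println(stripInvalidXml10("java.lang.Exception: \u0001end"));
  }
}
{code}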



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9728)  ResourceManager REST API can produce an illegal xml response

2019-08-13 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906026#comment-16906026
 ] 

Prabhu Joseph commented on YARN-9728:
-

Thanks [~tde].

>  ResourceManager REST API can produce an illegal xml response
> -
>
> Key: YARN-9728
> URL: https://issues.apache.org/jira/browse/YARN-9728
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api, resourcemanager
>Affects Versions: 2.7.3
>Reporter: Thomas
>Priority: Major
> Attachments: IllegalResponseChrome.png
>
>
> When a Spark job throws an exception with a message containing a character 
> outside the range supported by XML 1.0, the application fails and the stack 
> trace is stored in the {{diagnostics}} field. So far, so good.
> But the issue occurs when we try to get the application information with the 
> ResourceManager REST API: the XML response will contain the illegal XML 1.0 
> character and will be invalid.
>  *+Examples of illegal characters in XML 1.0:+* 
>  * \u 
>  * \u0001
>  * \u0002
>  * \u0003
>  * \u0004
> _For more information about supported characters :_
> [https://www.w3.org/TR/xml/#charsets]
> *+Example of an illegal response from the ResourceManager REST API:+* 
> {code:xml}
> 
> 
>   application_1326821518301_0005
>   user1
>   job
>   a1
>   FINISHED
>   FAILED
>   100.0
>   History
>   
> http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5
>   Exception in thread "main" java.lang.Exception: \u0001
>   at com..main(JobWithSpecialCharMain.java:6)
>   [...]
> 
> {code}
>  
> *+Example of job to reproduce :+*
> {code:java}
> public class JobWithSpecialCharMain {
>  public static void main(String[] args) throws Exception {
>   throw new Exception("\u0001");
>  }
> }
> {code}
>  !IllegalResponseChrome.png! 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9290) Invalid SchedulingRequest not rejected in Scheduler PlacementConstraintsHandler

2019-08-13 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9290:

Attachment: YARN-9290-006.patch

> Invalid SchedulingRequest not rejected in Scheduler 
> PlacementConstraintsHandler 
> 
>
> Key: YARN-9290
> URL: https://issues.apache.org/jira/browse/YARN-9290
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9290-001.patch, YARN-9290-002.patch, 
> YARN-9290-003.patch, YARN-9290-004.patch, YARN-9290-005.patch, 
> YARN-9290-006.patch
>
>
> A SchedulingRequest with an invalid namespace is not rejected in the Scheduler 
> PlacementConstraintsHandler. The RM keeps retrying allocateOnNode and logs the 
> exception each time. Such a request is rejected in the case of the 
> placement-processor handler.
> {code}
> 2019-02-08 16:51:27,548 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator:
>  Failed to query node cardinality:
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.InvalidAllocationTagsQueryException:
>  Invalid namespace prefix: notselfi, valid values are: 
> all,not-self,app-id,app-tag,self
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.TargetApplicationsNamespace.fromString(TargetApplicationsNamespace.java:277)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.TargetApplicationsNamespace.parse(TargetApplicationsNamespace.java:234)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.AllocationTags.createAllocationTags(AllocationTags.java:93)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfySingleConstraintExpression(PlacementConstraintsUtil.java:78)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfySingleConstraint(PlacementConstraintsUtil.java:240)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyConstraints(PlacementConstraintsUtil.java:321)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyAndConstraint(PlacementConstraintsUtil.java:272)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyConstraints(PlacementConstraintsUtil.java:324)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyConstraints(PlacementConstraintsUtil.java:365)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.checkCardinalityAndPending(SingleConstraintAppPlacementAllocator.java:355)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.precheckNode(SingleConstraintAppPlacementAllocator.java:395)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.precheckNode(AppSchedulingInfo.java:779)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.preCheckForNodeCandidateSet(RegularContainerAllocator.java:145)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:837)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:890)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:54)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:977)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:1173)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:795)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:623)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1630)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1624)
>   at 
> org.apache.hadoop.yarn.server.resourcemana

[jira] [Created] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE

2019-08-13 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-9744:
---

 Summary: RollingLevelDBTimelineStore.getEntityByTime fails with NPE
 Key: YARN-9744
 URL: https://issues.apache.org/jira/browse/YARN-9744
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Affects Versions: 3.2.0, 3.3.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


RollingLevelDBTimelineStore.getEntityByTime fails with NPE.

{code}
2019-08-07 12:58:55,990 WARN  ipc.Server (Server.java:logException(2433)) - IPC 
Server handler 0 on 10200, call 
org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from 
10.21.216.93:36392 Call#29446915 Retry#0
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786)
at 
org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614)
at 
org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045)
at 
org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168)
at 
org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172)
at 
org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE

2019-08-13 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9744:

Attachment: YARN-9744-001.patch

> RollingLevelDBTimelineStore.getEntityByTime fails with NPE
> --
>
> Key: YARN-9744
> URL: https://issues.apache.org/jira/browse/YARN-9744
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9744-001.patch
>
>
> RollingLevelDBTimelineStore.getEntityByTime fails with NPE.
> {code}
> 2019-08-07 12:58:55,990 WARN  ipc.Server (Server.java:logException(2433)) - 
> IPC Server handler 0 on 10200, call 
> org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from 
> 10.21.216.93:36392 Call#29446915 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786)
> at 
> org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614)
> at 
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045)
> at 
> org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168)
> at 
> org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172)
> at 
> org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9548) [Umbrella] Make YARN work well in elastic cloud environments

2019-08-13 Thread Junping Du (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906068#comment-16906068
 ] 

Junping Du commented on YARN-9548:
--

+1. 
I am quite interested in the autoscaling part. For horizontal scaling, we can 
leverage graceful decommission (YARN-914) to decommission/recommission nodes 
based on metrics monitoring. For vertical scaling, we can leverage dynamic 
resource allocation (YARN-291) to have a min/max resource setting on each node 
and update it according to each node's resource profile.

> [Umbrella] Make YARN work well in elastic cloud environments
> 
>
> Key: YARN-9548
> URL: https://issues.apache.org/jira/browse/YARN-9548
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Priority: Major
>
> YARN works well in static environments, and there isn't anything fundamentally 
> broken in YARN that stops us from making it work well in dynamic environments 
> like the cloud (public or private) as well.
> There are a few areas where we need to invest, though:
>  # Autoscaling
>  -- cluster level: add/remove nodes intelligently based on metrics and/or 
> admin plugins
>  -- node level: scale nodes up/down vertically?
>  # Smarter scheduling
> -- to pack containers as opposed to spreading them around to account for 
> nodes going away
> -- to account for speculative nodes like spot instances
>  # Handling nodes going away better
> -- by decommissioning sanely
> -- dealing with auxiliary services data
>  # And any installation helpers in this dynamic world - scripts, operators 
> etc.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9743) [JDK11] TestTimelineWebServices.testContextFactory fails

2019-08-13 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906090#comment-16906090
 ] 

Wei-Chiu Chuang commented on YARN-9743:
---

Hi Adam,
could you:
(1) try the latest OpenJDK 11.0.4, and
(2) list the configuration used? For example: build with JDK 11 + JDK 8 target + 
run on JDK 11, build with JDK 11 + JDK 11 target + run on JDK 11, or build with 
JDK 8 and run on JDK 11.

Thanks!
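
For context, the {{ClassNotFoundException}} in the quoted stack trace comes from 
a reflective lookup of the JDK-internal JAXB factory, which no longer exists on 
JDK 11 (JAXB was removed from the JDK by JEP 320). Below is a minimal, 
self-contained sketch of that lookup; the external fallback class name is an 
assumption about the standalone JAXB reference implementation, not something 
stated in this thread.

{code:java}
// Illustrative sketch of the failing lookup; the fallback class name is an
// assumption (external JAXB RI), not taken from this issue.
public class JaxbFactoryLookup {
  public static void main(String[] args) {
    String internal = "com.sun.xml.internal.bind.v2.ContextFactory"; // present on JDK 8
    String external = "com.sun.xml.bind.v2.ContextFactory";          // external JAXB RI

    try {
      Class.forName(internal);
      System.out.println("JDK-internal JAXB factory found (JDK 8 behaviour)");
    } catch (ClassNotFoundException e) {
      // On JDK 11 this branch is taken: JAXB was removed from the JDK, so the
      // implementation must come from an explicit dependency instead.
      System.out.println("Internal factory missing, would fall back to " + external);
    }
  }
}
{code}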

> [JDK11] TestTimelineWebServices.testContextFactory fails
> 
>
> Key: YARN-9743
> URL: https://issues.apache.org/jira/browse/YARN-9743
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineservice
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Major
>
> Tested on OpenJDK 11.0.2 on a Mac.
> Stack trace:
> {noformat}
> [ERROR] Tests run: 29, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 
> 36.016 s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices
> [ERROR] 
> testContextFactory(org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices)
>   Time elapsed: 1.031 s  <<< ERROR!
> java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
>   at 
> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
>   at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
>   at java.base/java.lang.Class.forName0(Native Method)
>   at java.base/java.lang.Class.forName(Class.java:315)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.newContext(ContextFactory.java:85)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.createContext(ContextFactory.java:112)
>   at 
> org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices.testContextFactory(TestTimelineWebServices.java:1039)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9133) Make tests more easy to comprehend in TestGpuResourceHandler

2019-08-13 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9133:
---
Attachment: YARN-9133.007.patch

> Make tests more easy to comprehend in TestGpuResourceHandler
> 
>
> Key: YARN-9133
> URL: https://issues.apache.org/jira/browse/YARN-9133
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9133.001.patch, YARN-9133.001.patch, 
> YARN-9133.002.patch, YARN-9133.003.patch, YARN-9133.004.patch, 
> YARN-9133.005.patch, YARN-9133.006.patch, YARN-9133.006.patch, 
> YARN-9133.007.patch
>
>
> Tests are not quite easy to read: 
> - Some more helper methods would improve readability.
> - Eliminating the boolean flag that controls whether Docker is used would also 
> improve readability and clarity.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE

2019-08-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906138#comment-16906138
 ] 

Hadoop QA commented on YARN-9744:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 42s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
21s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 48m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9744 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977471/YARN-9744-001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b5a6c8039e7a 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0b507d2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24551/testReport/ |
| Max. process+thread count | 417 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24551/cons

[jira] [Comment Edited] (YARN-7982) Do ACLs check while retrieving entity-types per application

2019-08-13 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906143#comment-16906143
 ] 

Prabhu Joseph edited comment on YARN-7982 at 8/13/19 12:32 PM:
---

Thanks [~abmodi] for reviewing.

1. {{TimelineReaderContext.getUserId()}} is not present if the user does not set 
the query parameter {{userId}} in the REST API 
{{TimelineReaderWebServices#getEntityTypes}}. The {{userId}} is fetched and set 
in the context at {{EntityTypes#readEntityTypes}} -> {{augmentparams}} from 
AppToFlowTable. The same context is reused so that 
{{TimelineReaderWebServices#getEntityTypes}} has the {{userId}} to check access.

2. In {{FileSystemTimelineReaderImpl}}, {{context.getUserId}} is null if the user 
does not set the query parameter {{userId}}; it is set from the flow run path instead.


was (Author: prabhu joseph):
Thanks [~abmodi] for reviewing.

1. The {{TimelineReaderContext.getUserId()}} does not present if user does not 
set Query Parameter {{userId}} in 
Rest API {{TimelineReaderWebServices#getEntityTypes}}. The {{userId}} is 
fetched and set in the context at 
{{EntityTypes#readEntityTypes}} -> {{augmentparams}} from AppToFlowTable. 
Reusing the same context so that the 
{{TimelineReaderWebServices#getEntityTypes}} will have userid to check access.

2. In {{FileSystemTimelineReadeImpl}}, {{context.getUserId}} is null if user 
does not set Query Parameter {{userId}}. 
Setting it from the Flow Run Path.



> Do ACLs check while retrieving entity-types per application
> ---
>
> Key: YARN-7982
> URL: https://issues.apache.org/jira/browse/YARN-7982
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-7982-001.patch, YARN-7982-002.patch, 
> YARN-7982-003.patch
>
>
> The REST endpoint {{/apps/$appid/entity-types}} retrieves all the entity types 
> for a given application. This needs to be guarded with an ACL check.
> {code}
> [yarn@yarn-ats-3 ~]$ curl 
> "http://yarn-ats-3:8198/ws/v2/timeline/apps/application_1552297011473_0002?user.name=ambari-qa1";
> {"exception":"ForbiddenException","message":"java.lang.Exception: User 
> ambari-qa1 is not allowed to read TimelineService V2 
> data.","javaClassName":"org.apache.hadoop.yarn.webapp.ForbiddenException"}
> [yarn@yarn-ats-3 ~]$ curl 
> "http://yarn-ats-3:8198/ws/v2/timeline/apps/application_1552297011473_0002/entity-types?user.name=ambari-qa1";
> ["YARN_APPLICATION_ATTEMPT","YARN_CONTAINER"]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7982) Do ACLs check while retrieving entity-types per application

2019-08-13 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906143#comment-16906143
 ] 

Prabhu Joseph commented on YARN-7982:
-

Thanks [~abmodi] for reviewing.

1. {{TimelineReaderContext.getUserId()}} is not present if the user does not set 
the query parameter {{userId}} in the REST API 
{{TimelineReaderWebServices#getEntityTypes}}. The {{userId}} is fetched and set 
in the context at {{EntityTypes#readEntityTypes}} -> {{augmentparams}} from 
AppToFlowTable. The same context is reused so that 
{{TimelineReaderWebServices#getEntityTypes}} has the {{userId}} to check access.

2. In {{FileSystemTimelineReaderImpl}}, {{context.getUserId}} is null if the user 
does not set the query parameter {{userId}}; it is set from the flow run path instead.



> Do ACLs check while retrieving entity-types per application
> ---
>
> Key: YARN-7982
> URL: https://issues.apache.org/jira/browse/YARN-7982
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-7982-001.patch, YARN-7982-002.patch, 
> YARN-7982-003.patch
>
>
> REST end point {{/apps/$appid/entity-types}} retrieves all the entity-types 
> for a given application. This needs to be guarded with an ACL check:
> {code}
> [yarn@yarn-ats-3 ~]$ curl 
> "http://yarn-ats-3:8198/ws/v2/timeline/apps/application_1552297011473_0002?user.name=ambari-qa1";
> {"exception":"ForbiddenException","message":"java.lang.Exception: User 
> ambari-qa1 is not allowed to read TimelineService V2 
> data.","javaClassName":"org.apache.hadoop.yarn.webapp.ForbiddenException"}
> [yarn@yarn-ats-3 ~]$ curl 
> "http://yarn-ats-3:8198/ws/v2/timeline/apps/application_1552297011473_0002/entity-types?user.name=ambari-qa1";
> ["YARN_APPLICATION_ATTEMPT","YARN_CONTAINER"]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability

2019-08-13 Thread Rick Moritz (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906148#comment-16906148
 ] 

Rick Moritz commented on YARN-1902:
---

This bug can also cause application crashes if the application handles 
"ContainerAllocated" events by stockpiling them and then scheduling tasks to 
those containers as they arrive. This usually leads to timeouts of the 
involved tokens, and to interesting guesswork about why the program logic is 
attempting to launch containers that have been assigned obsolete tokens.

I also wonder how this interacts with the recent addition of "opportunistic 
allocation".

Hadoop 3 would have been a great opportunity to close this bug :(

> Allocation of too many containers when a second request is done with the same 
> resource capability
> -
>
> Key: YARN-1902
> URL: https://issues.apache.org/jira/browse/YARN-1902
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.2.0, 2.3.0, 2.4.0
>Reporter: Sietse T. Au
>Assignee: Sietse T. Au
>Priority: Major
>  Labels: client
> Attachments: YARN-1902.patch, YARN-1902.v2.patch, YARN-1902.v3.patch
>
>
> Regarding AMRMClientImpl
> Scenario 1:
> Given a ContainerRequest x with Resource y, when addContainerRequest is 
> called z times with x, allocate is called and at least one of the z allocated 
> containers is started, then if another addContainerRequest call is done and 
> subsequently an allocate call to the RM, (z+1) containers will be allocated, 
> where 1 container is expected.
> Scenario 2:
> No containers are started between the allocate calls. 
> Analyzing debug logs of the AMRMClientImpl, I have found that (z+1) containers 
> are indeed requested in both scenarios, but that only in the second scenario 
> the correct behavior is observed.
> Looking at the implementation, I have found that this (z+1) request is caused 
> by the structure of the remoteRequestsTable. The consequence of 
> Map<Resource, ResourceRequestInfo> is that ResourceRequestInfo does not hold 
> any information about whether a request has been sent to the RM yet or not.
> There are workarounds for this, such as releasing the excess containers 
> received.
> The solution implemented is to initialize a new ResourceRequest in 
> ResourceRequestInfo when a request has been successfully sent to the RM.
> The patch includes a test in which scenario one is tested.
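
To make the scenario concrete, here is a minimal, self-contained sketch of the 
bookkeeping issue the description points at; the table and request classes are 
simplified stand-ins, not the actual AMRMClientImpl code.

{code:java}
import java.util.HashMap;
import java.util.Map;

public class DuplicateAskSketch {

  /** Stand-in for ResourceRequestInfo: tracks only a running total, not what was already sent. */
  static class RequestInfo {
    int numContainers;
  }

  /** Stand-in for remoteRequestsTable, keyed by resource capability. */
  static final Map<String, RequestInfo> remoteRequestsTable = new HashMap<>();

  static void addContainerRequest(String capability) {
    remoteRequestsTable.computeIfAbsent(capability, c -> new RequestInfo()).numContainers++;
  }

  /** Stand-in for allocate(): sends the whole table to the RM on every call. */
  static int allocate() {
    int asked = 0;
    for (RequestInfo info : remoteRequestsTable.values()) {
      asked += info.numContainers; // cannot tell what the RM has already been asked for
    }
    return asked;
  }

  public static void main(String[] args) {
    int z = 3;
    for (int i = 0; i < z; i++) {
      addContainerRequest("<memory:2048, vCores:1>");
    }
    System.out.println("first allocate asks for " + allocate() + " containers");  // z

    addContainerRequest("<memory:2048, vCores:1>"); // one more request later on
    System.out.println("second allocate asks for " + allocate() + " containers"); // z+1, not 1
  }
}
{code}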



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability

2019-08-13 Thread Rick Moritz (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906150#comment-16906150
 ] 

Rick Moritz commented on YARN-1902:
---

Oh, and Hadoop 2.7 can also be added to the affected version list.

> Allocation of too many containers when a second request is done with the same 
> resource capability
> -
>
> Key: YARN-1902
> URL: https://issues.apache.org/jira/browse/YARN-1902
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.2.0, 2.3.0, 2.4.0
>Reporter: Sietse T. Au
>Assignee: Sietse T. Au
>Priority: Major
>  Labels: client
> Attachments: YARN-1902.patch, YARN-1902.v2.patch, YARN-1902.v3.patch
>
>
> Regarding AMRMClientImpl
> Scenario 1:
> Given a ContainerRequest x with Resource y, when addContainerRequest is 
> called z times with x, allocate is called and at least one of the z allocated 
> containers is started, then if another addContainerRequest call is done and 
> subsequently an allocate call to the RM, (z+1) containers will be allocated, 
> where 1 container is expected.
> Scenario 2:
> No containers are started between the allocate calls. 
> Analyzing debug logs of the AMRMClientImpl, I have found that (z+1) containers 
> are indeed requested in both scenarios, but that only in the second scenario 
> the correct behavior is observed.
> Looking at the implementation, I have found that this (z+1) request is caused 
> by the structure of the remoteRequestsTable. The consequence of 
> Map<Resource, ResourceRequestInfo> is that ResourceRequestInfo does not hold 
> any information about whether a request has been sent to the RM yet or not.
> There are workarounds for this, such as releasing the excess containers 
> received.
> The solution implemented is to initialize a new ResourceRequest in 
> ResourceRequestInfo when a request has been successfully sent to the RM.
> The patch includes a test in which scenario one is tested.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9133) Make tests more easy to comprehend in TestGpuResourceHandler

2019-08-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906170#comment-16906170
 ] 

Hadoop QA commented on YARN-9133:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 53s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 2 unchanged - 3 fixed = 2 total (was 5) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 50s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
44s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9133 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977473/YARN-9133.007.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 1e7aaede5ba6 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0b507d2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24552/testReport/ |
| Max. process+thread count | 418 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24552/console |
| Powered by | Apache Yetu

[jira] [Updated] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE

2019-08-13 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9744:

Description: 
RollingLevelDBTimelineStore.getEntityByTime fails with NPE.

{code}
2019-08-07 12:58:55,990 WARN  ipc.Server (Server.java:logException(2433)) - IPC 
Server handler 0 on 10200, call 
org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from 
10.21.216.93:36392 Call#29446915 Retry#0
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786)
at 
org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614)
at 
org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045)
at 
org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168)
at 
org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172)
at 
org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
{code}


This affects the REST API used to get entities, for example:

curl http://pjosephdocker:8188/ws/v1/timeline/TEZ_APPLICATION 
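
For context, a minimal, self-contained sketch of the kind of null guard that 
avoids this class of NPE; the names and data structures are illustrative and 
are not the actual RollingLevelDBTimelineStore code or the YARN-9744 patch.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class NullSafeLookupSketch {

  /**
   * Iterate entities by time, skipping entries whose start-time lookup is
   * missing (null) instead of dereferencing them.
   */
  static List<String> entitiesWithStartTime(List<String> entityIds,
      Map<String, Long> startTimes) {
    List<String> result = new ArrayList<>();
    for (String id : entityIds) {
      Long startTime = startTimes.get(id); // may be null for unknown entities
      if (startTime == null) {
        continue; // skip instead of calling startTime.longValue() and failing
      }
      result.add(id + "@" + startTime);
    }
    return result;
  }
}
{code}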

  was:
RollingLevelDBTimelineStore.getEntityByTime fails with NPE.

{code}
2019-08-07 12:58:55,990 WARN  ipc.Server (Server.java:logException(2433)) - IPC 
Server handler 0 on 10200, call 
org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from 
10.21.216.93:36392 Call#29446915 Retry#0
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786)
at 
org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614)
at 
org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045)
at 
org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168)
at 
org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172)
at 
org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
{code}


> RollingLevelDBTimelineStore.getEntityByTime fails with NPE
> --
>
> Key: YARN-9744
> URL: https://issues.apache.org/jira/browse/YARN-9744
> 

[jira] [Commented] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE

2019-08-13 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906177#comment-16906177
 ] 

Prabhu Joseph commented on YARN-9744:
-

[~abmodi] Can you review this JIRA when you get time? It fixes the NPE thrown 
by RollingLevelDBTimelineStore.getEntityByTime.

> RollingLevelDBTimelineStore.getEntityByTime fails with NPE
> --
>
> Key: YARN-9744
> URL: https://issues.apache.org/jira/browse/YARN-9744
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9744-001.patch
>
>
> RollingLevelDBTimelineStore.getEntityByTime fails with NPE.
> {code}
> 2019-08-07 12:58:55,990 WARN  ipc.Server (Server.java:logException(2433)) - 
> IPC Server handler 0 on 10200, call 
> org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from 
> 10.21.216.93:36392 Call#29446915 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786)
> at 
> org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614)
> at 
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045)
> at 
> org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168)
> at 
> org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172)
> at 
> org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
> {code}
> This affects the REST API used to get entities, for example:
> curl http://pjosephdocker:8188/ws/v1/timeline/TEZ_APPLICATION 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9444) YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize yarn.io/gpu as a valid resource

2019-08-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906183#comment-16906183
 ] 

Hadoop QA commented on YARN-9444:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
35s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
26s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 15 unchanged - 0 fixed = 17 total (was 15) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 10s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
54s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
50s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}100m  8s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9444 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12964993/YARN-9444.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c9b6bcfe2a4b 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 
22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0b507d2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Bui

[jira] [Commented] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE

2019-08-13 Thread Abhishek Modi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906205#comment-16906205
 ] 

Abhishek Modi commented on YARN-9744:
-

Thanks [~Prabhu Joseph] for the patch. LGTM. Committed to trunk.

> RollingLevelDBTimelineStore.getEntityByTime fails with NPE
> --
>
> Key: YARN-9744
> URL: https://issues.apache.org/jira/browse/YARN-9744
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9744-001.patch
>
>
> RollingLevelDBTimelineStore.getEntityByTime fails with NPE.
> {code}
> 2019-08-07 12:58:55,990 WARN  ipc.Server (Server.java:logException(2433)) - 
> IPC Server handler 0 on 10200, call 
> org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from 
> 10.21.216.93:36392 Call#29446915 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786)
> at 
> org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614)
> at 
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045)
> at 
> org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168)
> at 
> org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172)
> at 
> org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
> {code}
> This affects the REST API used to get entities, for example:
> curl http://pjosephdocker:8188/ws/v1/timeline/TEZ_APPLICATION 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9728)  ResourceManager REST API can produce an illegal xml response

2019-08-13 Thread Thomas (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas updated YARN-9728:
-
Description: 
When a Spark job throws an exception with a message containing a character 
outside the range supported by XML 1.0, the application fails and the stack 
trace is stored in the {{diagnostics}} field. So far, so good.

But the issue occurs when we try to get application information with the 
ResourceManager REST API: the XML response will contain the illegal XML 1.0 
character and will be invalid.

 *+Examples of illegal characters in XML 1.0:+*
 * {{\u}}
 * {{\u0001}}
 * {{\u0002}}
 * {{\u0003}}
 * {{\u0004}}

_For more information about supported characters :_
 [https://www.w3.org/TR/xml/#charsets]

*+Example of illegal response from the Resource Manager API:+*
{code:xml}


  application_1326821518301_0005
  user1
  job
  a1
  FINISHED
  FAILED
  100.0
  History
  
http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5
  Exception in thread "main" java.lang.Exception: \u0001
at com..main(JobWithSpecialCharMain.java:6)

  [...]


{code}
 

*+Example of job to reproduce :+*
{code:java}
public class JobWithSpecialCharMain {

 public static void main(String[] args) throws Exception {
  throw new Exception("\u0001");
 }

}
{code}
!IllegalResponseChrome.png!
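
For reference, a minimal sketch of the kind of filtering that keeps such a 
response valid, dropping characters outside the XML 1.0 ranges linked above; 
this is an illustration only, not the actual ResourceManager fix.

{code:java}
public class XmlSanitizerSketch {

  /** Remove code points that are not legal in XML 1.0
   *  (legal: #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF). */
  static String stripIllegalXml10Chars(String s) {
    StringBuilder out = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); ) {
      int cp = s.codePointAt(i);
      boolean legal = cp == 0x9 || cp == 0xA || cp == 0xD
          || (cp >= 0x20 && cp <= 0xD7FF)
          || (cp >= 0xE000 && cp <= 0xFFFD)
          || (cp >= 0x10000 && cp <= 0x10FFFF);
      if (legal) {
        out.appendCodePoint(cp);
      }
      i += Character.charCount(cp);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    String diagnostics = "Exception in thread \"main\" java.lang.Exception: \u0001";
    System.out.println(stripIllegalXml10Chars(diagnostics)); // the \u0001 is removed
  }
}
{code}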

  was:
When a spark job throws an exception with a message containing a character out 
of the range supported by xml 1.0, then
the application fails and the stack trace will be stored into the 
{{diagnostics}} field. So far, so good.

But the issue occurred when we try to get application information with the 
ResourceManager REST API
The xml response will contain the illegal xml 1.0 char and will be invalid.

 *+Examples of illegals characters in xml 1.0 :+* 
 * \u 
 * \u0001
 * \u0002
 * \u0003
 * \u0004

_For more information about supported characters :_
[https://www.w3.org/TR/xml/#charsets]



*+Example of illegal response from the Ressource Manager API  :+* 
{code:xml}


  application_1326821518301_0005
  user1
  job
  a1
  FINISHED
  FAILED
  100.0
  History
  
http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5
  Exception in thread "main" java.lang.Exception: \u0001
at com..main(JobWithSpecialCharMain.java:6)

  [...]


{code}
 

*+Example of job to reproduce :+*
{code:java}
public class JobWithSpecialCharMain {

 public static void main(String[] args) throws Exception {
  throw new Exception("\u0001");
 }

}
{code}


 !IllegalResponseChrome.png! 


>  ResourceManager REST API can produce an illegal xml response
> -
>
> Key: YARN-9728
> URL: https://issues.apache.org/jira/browse/YARN-9728
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api, resourcemanager
>Affects Versions: 2.7.3
>Reporter: Thomas
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: IllegalResponseChrome.png
>
>
> When a Spark job throws an exception with a message containing a character 
> outside the range supported by XML 1.0, the application fails and the stack 
> trace is stored in the {{diagnostics}} field. So far, so good.
> But the issue occurs when we try to get application information with the 
> ResourceManager REST API: the XML response will contain the illegal XML 1.0 
> character and will be invalid.
>  *+Examples of illegal characters in XML 1.0:+*
>  * {{\u}}
>  * {{\u0001}}
>  * {{\u0002}}
>  * {{\u0003}}
>  * {{\u0004}}
> _For more information about supported characters :_
>  [https://www.w3.org/TR/xml/#charsets]
> *+Example of illegal response from the Resource Manager API:+*
> {code:xml}
> 
> 
>   application_1326821518301_0005
>   user1
>   job
>   a1
>   FINISHED
>   FAILED
>   100.0
>   History
>   
> http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5
>   Exception in thread "main" java.lang.Exception: \u0001
>   at com..main(JobWithSpecialCharMain.java:6)
>   [...]
> 
> {code}
>  
> *+Example of job to reproduce :+*
> {code:java}
> public class JobWithSpecialCharMain {
>  public static void main(String[] args) throws Exception {
>   throw new Exception("\u0001");
>  }
> }
> {code}
> !IllegalResponseChrome.png!



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9290) Invalid SchedulingRequest not rejected in Scheduler PlacementConstraintsHandler

2019-08-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906211#comment-16906211
 ] 

Hadoop QA commented on YARN-9290:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 9 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 24s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 0 new + 630 unchanged - 3 fixed = 630 total (was 633) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 18s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 79m 
27s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}128m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9290 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977467/YARN-9290-006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f2890caea321 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0b507d2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24549/testReport/ |
| Max. process+thread count | 901 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24549/console |
| Po

[jira] [Commented] (YARN-9676) Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected classes

2019-08-13 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906220#comment-16906220
 ] 

Adam Antal commented on YARN-9676:
--

Thanks!

> Add DEBUG and TRACE level messages to AppLogAggregatorImpl and connected 
> classes
> 
>
> Key: YARN-9676
> URL: https://issues.apache.org/jira/browse/YARN-9676
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
>
> During the development of the last items of YARN-6875, it was typically 
> difficult to extract information about the internal state of some log 
> aggregation related classes (e.g. {{AppLogAggregatorImpl}} and 
> {{LogAggregationFileController}}). 
> On my fork I added a few more messages to those classes like:
> - displaying the number of log aggregation cycles
> - displaying the names of the files currently considered for log aggregation 
> by containers
> - immediately displaying any exception caught (and sent to the RM in the 
> diagnostic messages) during the log aggregation process.
> Those messages were quite useful for debugging when an issue occurred, but 
> otherwise they flooded the NM log file with messages that are usually not 
> needed. I suggest adding (some of) these messages at DEBUG or TRACE level.
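
As a small illustration of the suggestion, a sketch of guarded SLF4J debug 
logging; the class and method names are made up and are not the actual 
AppLogAggregatorImpl code.

{code:java}
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LogAggregationDebugSketch {

  private static final Logger LOG =
      LoggerFactory.getLogger(LogAggregationDebugSketch.class);

  void onCycleFinished(int cycleCount, List<String> consideredFiles) {
    // Parameterized messages are cheap when DEBUG is disabled ...
    LOG.debug("Completed log aggregation cycle #{}", cycleCount);
    // ... and an explicit guard avoids building large argument strings.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Files considered in this cycle: {}", String.join(", ", consideredFiles));
    }
  }

  void onUploadFailure(Exception e) {
    // Surface the exception immediately instead of only in the RM diagnostics.
    LOG.debug("Exception caught during log aggregation", e);
  }
}
{code}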



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE

2019-08-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906224#comment-16906224
 ] 

Hudson commented on YARN-9744:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17100 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17100/])
YARN-9744. RollingLevelDBTimelineStore.getEntityByTime fails with NPE. (abmodi: 
rev b4097b96a39bad6214b01989e7f2fb37dad70793)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/RollingLevelDBTimelineStore.java


> RollingLevelDBTimelineStore.getEntityByTime fails with NPE
> --
>
> Key: YARN-9744
> URL: https://issues.apache.org/jira/browse/YARN-9744
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9744-001.patch
>
>
> RollingLevelDBTimelineStore.getEntityByTime fails with NPE.
> {code}
> 2019-08-07 12:58:55,990 WARN  ipc.Server (Server.java:logException(2433)) - 
> IPC Server handler 0 on 10200, call 
> org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from 
> 10.21.216.93:36392 Call#29446915 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786)
> at 
> org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614)
> at 
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045)
> at 
> org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168)
> at 
> org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172)
> at 
> org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
> {code}
> This affects the REST API used to get entities, for example:
> curl http://pjosephdocker:8188/ws/v1/timeline/TEZ_APPLICATION 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9133) Make tests more easy to comprehend in TestGpuResourceHandler

2019-08-13 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9133:
---
Attachment: YARN-9133.branch-3.2.001.patch

> Make tests more easy to comprehend in TestGpuResourceHandler
> 
>
> Key: YARN-9133
> URL: https://issues.apache.org/jira/browse/YARN-9133
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9133.001.patch, YARN-9133.001.patch, 
> YARN-9133.002.patch, YARN-9133.003.patch, YARN-9133.004.patch, 
> YARN-9133.005.patch, YARN-9133.006.patch, YARN-9133.006.patch, 
> YARN-9133.007.patch, YARN-9133.branch-3.2.001.patch
>
>
> Tests are not quite easy to read: 
> - Some more helper methods would improve readability.
> - Eliminating the boolean flag that controls if docker is used would also 
> improve readability and clarity.
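
Purely as an illustration of the kind of helper extraction meant here, a 
generic JUnit 4 sketch with a hypothetical system under test (this is not 
TestGpuResourceHandler itself):

{code:java}
import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.List;
import org.junit.Test;

public class HelperExtractionSketch {

  // Hypothetical system under test: assigns GPU minor numbers to a container.
  private static List<Integer> assignGpus(int requested) {
    return Arrays.asList(0, 1).subList(0, requested);
  }

  // Helper keeps each test focused on the expectation instead of the plumbing.
  private void assertAssignedGpus(int requested, Integer... expectedMinorNumbers) {
    assertEquals(Arrays.asList(expectedMinorNumbers), assignGpus(requested));
  }

  @Test
  public void testSingleGpuAssignment() {
    assertAssignedGpus(1, 0);
  }

  @Test
  public void testTwoGpuAssignment() {
    assertAssignedGpus(2, 0, 1);
  }
}
{code}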



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9133) Make tests more easy to comprehend in TestGpuResourceHandler

2019-08-13 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906266#comment-16906266
 ] 

Peter Bacsko commented on YARN-9133:


Uploaded the patch for branch-3.2. Again, the patch for branch-3.1 is not 
trivial; there are too many conflicts, so I'd skip it.

> Make tests more easy to comprehend in TestGpuResourceHandler
> 
>
> Key: YARN-9133
> URL: https://issues.apache.org/jira/browse/YARN-9133
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9133.001.patch, YARN-9133.001.patch, 
> YARN-9133.002.patch, YARN-9133.003.patch, YARN-9133.004.patch, 
> YARN-9133.005.patch, YARN-9133.006.patch, YARN-9133.006.patch, 
> YARN-9133.007.patch, YARN-9133.branch-3.2.001.patch
>
>
> Tests are not quite easy to read: 
> - Some more helper methods would improve readability.
> - Eliminating the boolean flag that controls if docker is used would also 
> improve readability and clarity.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9744) RollingLevelDBTimelineStore.getEntityByTime fails with NPE

2019-08-13 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906287#comment-16906287
 ] 

Prabhu Joseph commented on YARN-9744:
-

Thanks [~abmodi].

> RollingLevelDBTimelineStore.getEntityByTime fails with NPE
> --
>
> Key: YARN-9744
> URL: https://issues.apache.org/jira/browse/YARN-9744
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9744-001.patch
>
>
> RollingLevelDBTimelineStore.getEntityByTime fails with NPE.
> {code}
> 2019-08-07 12:58:55,990 WARN  ipc.Server (Server.java:logException(2433)) - 
> IPC Server handler 0 on 10200, call 
> org.apache.hadoop.yarn.api.ApplicationHistoryProtocolPB.getContainers from 
> 10.21.216.93:36392 Call#29446915 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntityByTime(RollingLevelDBTimelineStore.java:786)
> at 
> org.apache.hadoop.yarn.server.timeline.RollingLevelDBTimelineStore.getEntities(RollingLevelDBTimelineStore.java:614)
> at 
> org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.getEntities(EntityGroupFSTimelineStore.java:1045)
> at 
> org.apache.hadoop.yarn.server.timeline.TimelineDataManager.doGetEntities(TimelineDataManager.java:168)
> at 
> org.apache.hadoop.yarn.server.timeline.TimelineDataManager.getEntities(TimelineDataManager.java:138)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainers(ApplicationHistoryManagerOnTimelineStore.java:222)
> at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainers(ApplicationHistoryClientService.java:213)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationHistoryProtocolPBServiceImpl.getContainers(ApplicationHistoryProtocolPBServiceImpl.java:172)
> at 
> org.apache.hadoop.yarn.proto.ApplicationHistoryProtocol$ApplicationHistoryProtocolService$2.callBlockingMethod(ApplicationHistoryProtocol.java:201)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
> {code}
> This affects the REST API used to get entities, for example:
> curl http://pjosephdocker:8188/ws/v1/timeline/TEZ_APPLICATION 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9562) Add Java changes for the new RuncContainerRuntime

2019-08-13 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-9562:
--
Attachment: YARN-9562.003.patch

> Add Java changes for the new RuncContainerRuntime
> -
>
> Key: YARN-9562
> URL: https://issues.apache.org/jira/browse/YARN-9562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9562.001.patch, YARN-9562.002.patch, 
> YARN-9562.003.patch
>
>
> This JIRA will be used to add the Java changes for the new 
> RuncContainerRuntime. This will work off of YARN-9560 to use much of the 
> existing DockerLinuxContainerRuntime code once it is moved up into an 
> abstract class that can be extended. 
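
For orientation, a hedged sketch of the class-hierarchy idea being described; 
the names and methods below are made up, and the real classes and their APIs 
are defined by the YARN-9560/YARN-9562 patches.

{code:java}
// Made-up illustration of "move shared code into an abstract base, then extend it".
abstract class OciContainerRuntimeSketch {

  // Shared launch plumbing that both OCI-style runtimes can reuse.
  final void launchContainer(String containerId) {
    System.out.println("Launching " + containerId + " via " + runtimeName());
  }

  // Each concrete runtime only supplies what actually differs.
  abstract String runtimeName();
}

class DockerRuntimeSketch extends OciContainerRuntimeSketch {
  @Override
  String runtimeName() {
    return "docker";
  }
}

class RuncRuntimeSketch extends OciContainerRuntimeSketch {
  @Override
  String runtimeName() {
    return "runc";
  }
}
{code}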



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9562) Add Java changes for the new RuncContainerRuntime

2019-08-13 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906295#comment-16906295
 ] 

Eric Badger commented on YARN-9562:
---

[~eyang], yes, that's the error. I caught this error a while ago, but I guess I 
never uploaded a patch with the fix. Patch 003 fixes the issue.

> Add Java changes for the new RuncContainerRuntime
> -
>
> Key: YARN-9562
> URL: https://issues.apache.org/jira/browse/YARN-9562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9562.001.patch, YARN-9562.002.patch, 
> YARN-9562.003.patch
>
>
> This JIRA will be used to add the Java changes for the new 
> RuncContainerRuntime. This will work off of YARN-9560 to use much of the 
> existing DockerLinuxContainerRuntime code once it is moved up into an 
> abstract class that can be extended. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing

2019-08-13 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9217:
---
Attachment: YARN-9217.009.patch

> Nodemanager will fail to start if GPU is misconfigured on the node or GPU 
> drivers missing
> -
>
> Key: YARN-9217
> URL: https://issues.apache.org/jira/browse/YARN-9217
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9217.001.patch, YARN-9217.002.patch, 
> YARN-9217.003.patch, YARN-9217.004.patch, YARN-9217.005.patch, 
> YARN-9217.006.patch, YARN-9217.007.patch, YARN-9217.008.patch, 
> YARN-9217.009.patch
>
>
> The NodeManager will not start:
> 1. If autodiscovery is enabled:
>  * If the nvidia-smi path is misconfigured or the file does not exist
>  * If 0 GPUs are found
>  * If the file exists but does not point to an nvidia-smi binary
>  * If the binary is OK but there is an IOException
> 2. If the manually configured GPU devices are misconfigured:
>  * Any index:minor number format failure will cause a problem
>  * 0 configured devices will cause a problem
>  * NumberFormatException is not handled
> It would be a better option to add warnings about the configuration, set 0 
> available GPUs, and let the node work and run non-GPU jobs.
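
A minimal sketch of the proposed fallback behavior, with hypothetical names 
(this is not the actual GPU discovery code in the NodeManager):

{code:java}
import java.io.IOException;
import java.util.Collections;
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class GpuDiscoveryFallbackSketch {

  private static final Logger LOG =
      LoggerFactory.getLogger(GpuDiscoveryFallbackSketch.class);

  /** Hypothetical discovery call that can fail for any of the reasons listed above. */
  static List<Integer> runNvidiaSmi() throws IOException {
    throw new IOException("nvidia-smi not found");
  }

  /** Warn and fall back to zero GPUs instead of failing NodeManager startup. */
  static List<Integer> discoverGpusOrEmpty() {
    try {
      return runNvidiaSmi();
    } catch (IOException | NumberFormatException e) {
      LOG.warn("GPU discovery failed; continuing with 0 GPUs so the node can "
          + "still run non-GPU containers.", e);
      return Collections.emptyList();
    }
  }
}
{code}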



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9745) TestFairScheduler.testIncreaseQueueMaxRunningAppsOnTheFly fails intermittent

2019-08-13 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-9745:
---

 Summary: TestFairScheduler.testIncreaseQueueMaxRunningAppsOnTheFly 
fails intermittent
 Key: YARN-9745
 URL: https://issues.apache.org/jira/browse/YARN-9745
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler, test
Affects Versions: 3.3.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


TestFairScheduler.testIncreaseQueueMaxRunningAppsOnTheFly fails intermittently.

{code}
[ERROR] 
testIncreaseQueueMaxRunningAppsOnTheFly(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler)
  Time elapsed: 0.003 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 5000 
milliseconds
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.(FileOutputStream.java:213)
at java.io.FileOutputStream.(FileOutputStream.java:101)
at java.io.FileWriter.(FileWriter.java:63)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testIncreaseQueueSettingOnTheFlyInternal(TestFairScheduler.java:2394)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testIncreaseQueueMaxRunningAppsOnTheFly(TestFairScheduler.java:2357)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)

{code}
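
For context, a small sketch of how such a per-test timeout is declared in 
JUnit 4; whether the right fix is a larger timeout or moving the slow file 
write out of the timed path is exactly what this JIRA has to decide, so the 
snippet is illustrative only.

{code:java}
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class TimeoutSketchTest {

  // JUnit interrupts the test and reports TestTimedOutException (as in the
  // stack trace above) when the body runs longer than the declared timeout.
  @Test(timeout = 5000)
  public void testFinishesWithinFiveSeconds() {
    // Slow, environment-dependent work (for example writing a config file and
    // waiting for a reload) can intermittently exceed the limit on a busy host.
    assertTrue(doWorkThatShouldBeFast());
  }

  private boolean doWorkThatShouldBeFast() {
    return true;
  }
}
{code}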



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9133) Make tests more easy to comprehend in TestGpuResourceHandler

2019-08-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906330#comment-16906330
 ] 

Hadoop QA commented on YARN-9133:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
49s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
17s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
58s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 1 unchanged - 3 fixed = 1 total (was 4) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 30s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
33s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
42s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 75m 15s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:63396beab41 |
| JIRA Issue | YARN-9133 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977495/YARN-9133.branch-3.2.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 25412e954cd4 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 
08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.2 / c5f433b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24553/testReport/ |
| asflicense | 
https://builds.apache.org/job/PreCommit-YARN-Build/24553/artifact/out/patch-asflicense-problems.txt
 |
| Max. process+thread count | 306 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn

[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime

2019-08-13 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906360#comment-16906360
 ] 

Eric Yang commented on YARN-9561:
-

[~ebadger] This patch no longer applies to trunk.

{code}
[WARNING] 
/home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci.c:
 In function ‘run_oci_container’:
[WARNING] 
/home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci.c:849:8:
 error: ‘ERROR_OCI_RUN_FAILED’ undeclared (first use in this function)
[WARNING]rc = ERROR_OCI_RUN_FAILED;
[WARNING] ^
[WARNING] 
/home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci.c:849:8:
 note: each undeclared identifier is reported only once for each function it 
appears in
[WARNING] make[2]: *** 
[CMakeFiles/container.dir/main/native/container-executor/impl/oci/oci.c.o] 
Error 1
[WARNING] make[2]: *** Waiting for unfinished jobs
[WARNING] 
/home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci_reap.c:
 In function ‘reap_oci_layer_mounts_with_ctx’:
[WARNING] 
/home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci_reap.c:566:12:
 error: ‘ERROR_OCI_REAP_LAYER_MOUNTS_FAILED’ undeclared (first use in this 
function)
[WARNING]int rc = ERROR_OCI_REAP_LAYER_MOUNTS_FAILED;
[WARNING] ^
[WARNING] 
/home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci_reap.c:566:12:
 note: each undeclared identifier is reported only once for each function it 
appears in
[WARNING] 
/home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci_reap.c:
 In function ‘reap_oci_layer_mounts’:
[WARNING] 
/home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/oci/oci_reap.c:613:12:
 error: ‘ERROR_OCI_REAP_LAYER_MOUNTS_FAILED’ undeclared (first use in this 
function)
[WARNING]int rc = ERROR_OCI_REAP_LAYER_MOUNTS_FAILED;
[WARNING] ^
[WARNING] make[2]: *** 
[CMakeFiles/container.dir/main/native/container-executor/impl/oci/oci_reap.c.o] 
Error 1
[WARNING] make[1]: *** [CMakeFiles/container.dir/all] Error 2
[WARNING] make: *** [all] Error 2
{code}

Could you check? Thanks.

> Add C changes for the new RuncContainerRuntime
> --
>
> Key: YARN-9561
> URL: https://issues.apache.org/jira/browse/YARN-9561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9561.001.patch, YARN-9561.002.patch
>
>
> This JIRA will be used to add the C changes to the container-executor native 
> binary that are necessary for the new RuncContainerRuntime. There should be 
> no changes to existing code paths. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9442) container working directory has group read permissions

2019-08-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906403#comment-16906403
 ] 

Hudson commented on YARN-9442:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17103 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17103/])
YARN-9442. container working directory has group read permissions. (ebadger: 
rev 2ac029b949f041da2ee04da441c5f9f85e1f2c64)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c


> container working directory has group read permissions
> --
>
> Key: YARN-9442
> URL: https://issues.apache.org/jira/browse/YARN-9442
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.2
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: YARN-9442.001.patch, YARN-9442.002.patch, 
> YARN-9442.003.patch
>
>
> Container working directories are currently created with permissions 0750, 
> owned by the user and with the group set to the node manager group.
> Is there any reason why these directories need group read permissions?
> I have been testing with group read permissions removed and so far I haven't 
> encountered any problems.
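
For context, the actual change is in the native container-executor (C code), as the commit's file list above shows. The small Java sketch below is only an illustration of the permission difference being discussed, with hypothetical paths and names: the container working directory keeps full access for the container user (0700) instead of also granting the node manager group read/execute (0750).

{code:java}
// Illustration only (hypothetical paths); the real change lives in
// container-executor.c, not in Java.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermissions;

public final class ContainerDirPermsSketch {
  private ContainerDirPermsSketch() {
  }

  public static Path createContainerWorkDir(String baseDir, String containerId)
      throws IOException {
    Path dir = Paths.get(baseDir, containerId);
    // rwx------ (0700): the container user keeps full access, while the node
    // manager group loses the read/execute bits it had under 0750 (rwxr-x---).
    return Files.createDirectories(dir,
        PosixFilePermissions.asFileAttribute(
            PosixFilePermissions.fromString("rwx------")));
  }
}
{code}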



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing

2019-08-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906422#comment-16906422
 ] 

Hadoop QA commented on YARN-9217:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 33s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 16 unchanged - 2 fixed = 16 total (was 18) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  5s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 57s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 29s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9217 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977502/YARN-9217.009.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cea1da4f9d0f 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 274966e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/24555/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24555/testReport/ |
| Max. process+thread count | 448 (vs. ulimit of 1) |
| modul

[jira] [Updated] (YARN-9561) Add C changes for the new RuncContainerRuntime

2019-08-13 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-9561:
--
Attachment: YARN-9561.003.patch

> Add C changes for the new RuncContainerRuntime
> --
>
> Key: YARN-9561
> URL: https://issues.apache.org/jira/browse/YARN-9561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9561.001.patch, YARN-9561.002.patch, 
> YARN-9561.003.patch
>
>
> This JIRA will be used to add the C changes to the container-executor native 
> binary that are necessary for the new RuncContainerRuntime. There should be 
> no changes to existing code paths. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime

2019-08-13 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906441#comment-16906441
 ] 

Eric Badger commented on YARN-9561:
---

Yeah, another error code was added, which clashed with the ERROR_OCI_RUN_FAILED 
error code from the previous patch. I updated the patch to fix that, along with a 
few other things.

> Add C changes for the new RuncContainerRuntime
> --
>
> Key: YARN-9561
> URL: https://issues.apache.org/jira/browse/YARN-9561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Attachments: YARN-9561.001.patch, YARN-9561.002.patch, 
> YARN-9561.003.patch
>
>
> This JIRA will be used to add the C changes to the container-executor native 
> binary that are necessary for the new RuncContainerRuntime. There should be 
> no changes to existing code paths. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9562) Add Java changes for the new RuncContainerRuntime

2019-08-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906452#comment-16906452
 ] 

Hadoop QA commented on YARN-9562:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 10 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 34s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
37s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 283 new + 689 unchanged - 1 fixed = 972 total (was 690) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 17s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
16s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 generated 16 new + 0 unchanged - 0 fixed = 16 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 57s{color} 
| {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m  5s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}100m 13s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
|  |  Nullcheck of NodeManager.context at line 535 of value previously 
dereferenced in 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop()  At 
NodeManager.java:535 of value previously dereferenced in 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop()  At 
NodeManager.java:[line 532] |
|  |  Unused field:NodeManager.java |
|  |  Dead store to refreshHdfsCacheThread in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.ImageTagToManifestPlugin.serviceStart()
  At 
ImageTa

[jira] [Updated] (YARN-9442) container working directory has group read permissions

2019-08-13 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-9442:
--
Fix Version/s: 3.2.2
   3.1.3
   2.9.3
   3.3.0
   3.0.4
   2.10.0

+1 lgtm. I have committed this to trunk, branch-3.2, branch-3.1, branch-3.0, 
branch-2, and branch-2.9. 

[~Jim_Brennan], could you upload a branch-2.8 patch? There were some small 
naming conflicts that I cleaned up in branch-3.2 and branch-2, but this one is 
a little more involved. 

> container working directory has group read permissions
> --
>
> Key: YARN-9442
> URL: https://issues.apache.org/jira/browse/YARN-9442
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.2
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 2.10.0, 3.0.4, 3.3.0, 2.9.3, 3.1.3, 3.2.2
>
> Attachments: YARN-9442.001.patch, YARN-9442.002.patch, 
> YARN-9442.003.patch
>
>
> Container working directories are currently created with permissions 0750, 
> owned by the user and with the group set to the node manager group.
> Is there any reason why these directories need group read permissions?
> I have been testing with group read permissions removed and so far I haven't 
> encountered any problems.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9442) container working directory has group read permissions

2019-08-13 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906463#comment-16906463
 ] 

Jim Brennan commented on YARN-9442:
---

Thanks [~ebadger]!  I will put up a patch for 2.8.

 

> container working directory has group read permissions
> --
>
> Key: YARN-9442
> URL: https://issues.apache.org/jira/browse/YARN-9442
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.2
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 2.10.0, 3.0.4, 3.3.0, 2.9.3, 3.1.3, 3.2.2
>
> Attachments: YARN-9442.001.patch, YARN-9442.002.patch, 
> YARN-9442.003.patch
>
>
> Container working directories are currently created with permissions 0750, 
> owned by the user and with the group set to the node manager group.
> Is there any reason why these directories need group read permissions?
> I have been testing with group read permissions removed and so far I haven't 
> encountered any problems.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9442) container working directory has group read permissions

2019-08-13 Thread Jim Brennan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated YARN-9442:
--
Attachment: YARN-9442-branch-2.8.001.patch

> container working directory has group read permissions
> --
>
> Key: YARN-9442
> URL: https://issues.apache.org/jira/browse/YARN-9442
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.2
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 2.10.0, 3.0.4, 3.3.0, 2.9.3, 3.1.3, 3.2.2
>
> Attachments: YARN-9442-branch-2.8.001.patch, YARN-9442.001.patch, 
> YARN-9442.002.patch, YARN-9442.003.patch
>
>
> Container working directories are currently created with permissions 0750, 
> owned by the user and with the group set to the node manager group.
> Is there any reason why these directories need group read permissions?
> I have been testing with group read permissions removed and so far I haven't 
> encountered any problems.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9442) container working directory has group read permissions

2019-08-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906589#comment-16906589
 ] 

Hadoop QA commented on YARN-9442:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  8m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
19s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} branch-2.8 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} branch-2.8 passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} branch-2.8 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
41s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
20s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:b93746a |
| JIRA Issue | YARN-9442 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977536/YARN-9442-branch-2.8.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 2eec3dbb2182 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2.8 / 829afac |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| Multi-JDK versions |  /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 
/usr/lib/jvm/java-8-openjdk-amd64:1.8.0_222 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24557/testReport/ |
| asflicense | 
https://builds.apache.org/job/PreCommit-YARN-Build/24557/artifact/out/patch-asflicense-problems.txt
 |
| Max. process+thread count | 174 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24557/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> container working directory has group read permissions
> --
>
> Key: YARN-9442
> URL: https://issues.apache.org/jira/browse/YARN-9442
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
> 

[jira] [Updated] (YARN-9442) container working directory has group read permissions

2019-08-13 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-9442:
--
Fix Version/s: 2.8.6

Thanks, [~Jim_Brennan]! I committed the 2.8 patch to branch-2.8

> container working directory has group read permissions
> --
>
> Key: YARN-9442
> URL: https://issues.apache.org/jira/browse/YARN-9442
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.2
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 2.10.0, 3.0.4, 3.3.0, 2.8.6, 2.9.3, 3.1.3, 3.2.2
>
> Attachments: YARN-9442-branch-2.8.001.patch, YARN-9442.001.patch, 
> YARN-9442.002.patch, YARN-9442.003.patch
>
>
> Container working directories are currently created with permissions 0750, 
> owned by the user and with the group set to the node manager group.
> Is there any reason why these directories need group read permissions?
> I have been testing with group read permissions removed and so far I haven't 
> encountered any problems.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9442) container working directory has group read permissions

2019-08-13 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906625#comment-16906625
 ] 

Jim Brennan commented on YARN-9442:
---

Thanks [~ebadger]!

> container working directory has group read permissions
> --
>
> Key: YARN-9442
> URL: https://issues.apache.org/jira/browse/YARN-9442
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.2
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 2.10.0, 3.0.4, 3.3.0, 2.8.6, 2.9.3, 3.1.3, 3.2.2
>
> Attachments: YARN-9442-branch-2.8.001.patch, YARN-9442.001.patch, 
> YARN-9442.002.patch, YARN-9442.003.patch
>
>
> Container working directories are currently created with permissions 0750, 
> owned by the user and with the group set to the node manager group.
> Is there any reason why these directories need group read permissions?
> I have been testing with group read permissions removed and so far I haven't 
> encountered any problems.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime

2019-08-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906652#comment-16906652
 ] 

Hadoop QA commented on YARN-9561:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 13m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
60m 50s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 15m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 13m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  3s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}135m 13s{color} 
| {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 2s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}255m 14s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.web.TestWebHDFS |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9561 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977521/YARN-9561.003.patch |
| Optional Tests |  dupname  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 55c31dcaf6ed 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2ac029b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/24556/artifact/out/patch-unit-root.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24556/testReport/ |
| Max. process+thread count | 5092 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24556/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Add C changes for the new RuncContainerRuntime
> --
>
> Key: YARN-9561
> URL: https://issues.apache.org/jira/browse/YARN-9561
> Project: Hadoop YARN
>  Issue Type: Sub-ta

[jira] [Commented] (YARN-9562) Add Java changes for the new RuncContainerRuntime

2019-08-13 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906663#comment-16906663
 ] 

Eric Yang commented on YARN-9562:
-

[~ebadger] 
1. The NodeManager crashes if the defined image-tag-to-hash-files does not exist. 
It would be nice if this were a warning instead (a rough sketch of that behavior 
follows the log below).
{code}
java.lang.RuntimeException: Couldn't load any image-tag-to-hash-files
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.ImageTagToManifestPlugin.serviceStart(ImageTagToManifestPlugin.java:315)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.RuncContainerRuntime.start(RuncContainerRuntime.java:277)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.start(DelegatingLinuxContainerRuntime.java:283)
at 
org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.start(LinuxContainerExecutor.java:351)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:519)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:989)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1069)
2019-08-13 21:07:03,002 INFO org.apache.hadoop.service.AbstractService: Service 
NodeManager failed in state INITED
java.lang.RuntimeException: Couldn't load any image-tag-to-hash-files
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.ImageTagToManifestPlugin.serviceStart(ImageTagToManifestPlugin.java:315)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.RuncContainerRuntime.start(RuncContainerRuntime.java:277)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.start(DelegatingLinuxContainerRuntime.java:283)
at 
org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.start(LinuxContainerExecutor.java:351)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:519)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:989)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1069)
2019-08-13 21:07:03,003 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
Stopping NodeManager metrics system...
2019-08-13 21:07:03,003 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
NodeManager metrics system stopped.
2019-08-13 21:07:03,004 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: 
NodeManager metrics system shutdown complete.
2019-08-13 21:07:03,005 ERROR 
org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
NodeManager
java.lang.RuntimeException: Couldn't load any image-tag-to-hash-files
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.ImageTagToManifestPlugin.serviceStart(ImageTagToManifestPlugin.java:315)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.RuncContainerRuntime.start(RuncContainerRuntime.java:277)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.start(DelegatingLinuxContainerRuntime.java:283)
at 
org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.start(LinuxContainerExecutor.java:351)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:519)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:989)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1069)
2019-08-13 21:07:03,016 INFO 
org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:
{code}
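
A rough sketch of the warning-instead-of-crash behavior requested in point 1; the class and method names below are hypothetical, and this is not the current ImageTagToManifestPlugin code.

{code:java}
// Hypothetical sketch only, not NodeManager/ImageTagToManifestPlugin code.
import java.io.File;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ImageTagToHashLoaderSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(ImageTagToHashLoaderSketch.class);

  /** Returns true if at least one image-tag-to-hash file could be loaded. */
  public boolean loadImageTagToHashFiles(String... paths) {
    int loaded = 0;
    for (String path : paths) {
      if (new File(path).isFile()) {
        // ... parse tag-to-hash entries here ...
        loaded++;
      }
    }
    if (loaded == 0) {
      // Degrade gracefully instead of throwing a RuntimeException out of
      // serviceStart(), which currently brings the whole NodeManager down.
      LOG.warn("Couldn't load any image-tag-to-hash files from {} configured "
          + "location(s); runC containers will fail until a mapping file is "
          + "provided.", paths.length);
      return false;
    }
    return true;
  }
}
{code}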

2. When running a MapReduce job in a runC container, the patch still references 
an incorrect path:

{code}
java.io.IOException: java.util.concurrent.ExecutionException: 
java.io.FileNotFoundException: File does not exist: 
hdfs://eyang-1.openstacklocal:9000/user/yarn/null/config/9f38484d220fa527b1fb19747638497179500a1bed8bf0498eb788229229e6e1
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.HdfsManifestToResourcesPlugin.getResource(HdfsManifestToResourcesPlugin.java:180)
at 

[jira] [Assigned] (YARN-9106) Add option to graceful decommission to not wait for applications

2019-08-13 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reassigned YARN-9106:
-

Assignee: Mikayla Konst

> Add option to graceful decommission to not wait for applications
> 
>
> Key: YARN-9106
> URL: https://issues.apache.org/jira/browse/YARN-9106
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: Mikayla Konst
>Assignee: Mikayla Konst
>Priority: Major
> Attachments: YARN-9106.patch
>
>
> Add property 
> yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications.
> If true (the default), the resource manager waits for all containers, as well 
> as all applications associated with those containers, to finish before 
> gracefully decommissioning a node.
> If false, the resource manager only waits for containers, but not 
> applications, to finish. For map-only jobs or other jobs in which mappers do 
> not need to serve shuffle data, this allows nodes to be decommissioned as 
> soon as their containers are finished as opposed to when the job is done.
> Add property 
> yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters.
> If false, during graceful decommission, when the resource manager waits for 
> all containers on a node to finish, it will not wait for app master 
> containers to finish. Defaults to true. This property should only be set to 
> false if app master failure is recoverable.
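
A small, purely illustrative sketch of how the two proposed switches could be set from client or test code via the Hadoop Configuration API. The property names come from the description above; the class name and chosen values are hypothetical.

{code:java}
// Hypothetical sketch; property names are taken from the issue description.
import org.apache.hadoop.conf.Configuration;

public final class DecommissionConfigSketch {
  private DecommissionConfigSketch() {
  }

  public static Configuration mapOnlyJobFriendlyConf() {
    Configuration conf = new Configuration();
    // Do not hold decommissioning nodes for whole applications, only for their
    // still-running containers (useful for map-only jobs with no shuffle data).
    conf.setBoolean(
        "yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications",
        false);
    // Keep waiting for AM containers unless AM failure is known to be recoverable.
    conf.setBoolean(
        "yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters",
        true);
    return conf;
  }
}
{code}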



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9101) Recovery Container exitCode Not Right

2019-08-13 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reassigned YARN-9101:
-

Assignee: SuperbDong

> Recovery Container exitCode Not Right
> -
>
> Key: YARN-9101
> URL: https://issues.apache.org/jira/browse/YARN-9101
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.0.0
>Reporter: SuperbDong
>Assignee: SuperbDong
>Priority: Major
>  Labels: pull-request-available
> Attachments: YARN-9101.1.patch, YARN-9101.patch
>
>
> The exit code is correct when a container launches normally, but not when the 
> container is recovered.
> For example, the out-of-memory exit code is -104, and that exit code is lost 
> when the container is recovered after a NodeManager restart.
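
A purely hypothetical sketch of the idea (none of these names exist in the NodeManager): record the exit code when the container finishes and read it back during recovery, so a value such as -104 is not lost across an NM restart.

{code:java}
// Hypothetical sketch only -- not NodeManager code.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ContainerExitCodeStoreSketch {
  // Stand-in for the NM state store; a real implementation would persist to disk.
  private final Map<String, Integer> storedExitCodes = new ConcurrentHashMap<>();

  private static final int UNKNOWN_EXIT_CODE = -1000;

  /** Called when a container finishes: remember its exit code. */
  public void containerFinished(String containerId, int exitCode) {
    storedExitCodes.put(containerId, exitCode);
  }

  /** Called during recovery after an NM restart: restore the recorded exit code. */
  public int recoverExitCode(String containerId) {
    // Without this lookup the recovered container reports a default value
    // instead of the real one (e.g. -104 for an out-of-memory kill).
    return storedExitCodes.getOrDefault(containerId, UNKNOWN_EXIT_CODE);
  }
}
{code}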



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9610) HeartbeatCallBack in FederationInterceptor should clear AMRMToken in response from UAM before adding to asyncResponseSink

2019-08-13 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reassigned YARN-9610:
-

Assignee: Morty Zhong

> HeartbeatCallBack in FederationInterceptor should clear AMRMToken in response 
> from UAM before adding to asyncResponseSink 
> 
>
> Key: YARN-9610
> URL: https://issues.apache.org/jira/browse/YARN-9610
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: amrmproxy, federation
>Affects Versions: 3.2.0
>Reporter: Morty Zhong
>Assignee: Morty Zhong
>Priority: Major
> Attachments: YARN-9610.patch.1, YARN-9610.patch.2
>
>
> In federation, `allocate` is asynchronous: the response from each RM is cached 
> in `asyncResponseSink`.
> The final allocate response is merged from all RMs' allocate responses, and the 
> merge throws an exception when the AMRMToken from a UAM response is not null.
> However, setting the AMRMToken from the UAM response to null is not done within 
> the scope of the lock, so there is a chance the merge sees a UAM response whose 
> AMRMToken is still not null.
> Therefore we should clear the token before adding the response to 
> asyncResponseSink.
>  
>  
> {code:java}
> synchronized (asyncResponseSink) {
>   List responses = null;
>   if (asyncResponseSink.containsKey(subClusterId)) {
> responses = asyncResponseSink.get(subClusterId);
>   } else {
> responses = new ArrayList<>();
> asyncResponseSink.put(subClusterId, responses);
>   }
>   responses.add(response);
>   // Notify main thread about the response arrival
>   asyncResponseSink.notifyAll();
> }
> ...
> if (this.isUAM && response.getAMRMToken() != null) {
>   Token newToken = ConverterUtils
>   .convertFromYarn(response.getAMRMToken(), (Text) null);
>   // Do not further propagate the new amrmToken for UAM
>   response.setAMRMToken(null);
> ...{code}
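
A sketch of the ordering the description asks for, with the UAM AMRMToken cleared before the response is published to the sink. The types, field, and method names are assumed from the snippet above rather than copied from the actual FederationInterceptor code.

{code:java}
// Hedged sketch of the proposed ordering; names and types are assumptions.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;

public final class UamTokenClearingSketch {
  private UamTokenClearingSketch() {
  }

  public static void publish(Map<String, List<AllocateResponse>> asyncResponseSink,
      String subClusterId, AllocateResponse response, boolean isUAM) {
    if (isUAM && response.getAMRMToken() != null) {
      // Do not further propagate the new amrmToken for UAM. Clearing it here,
      // before the response becomes visible, means the merging thread can never
      // observe a non-null UAM token.
      response.setAMRMToken(null);
    }
    synchronized (asyncResponseSink) {
      asyncResponseSink
          .computeIfAbsent(subClusterId, k -> new ArrayList<>())
          .add(response);
      // Notify main thread about the response arrival.
      asyncResponseSink.notifyAll();
    }
  }
}
{code}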



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9106) Add option to graceful decommission to not wait for applications

2019-08-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906755#comment-16906755
 ] 

Hadoop QA commented on YARN-9106:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} YARN-9106 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9106 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12951275/YARN-9106.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24558/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Add option to graceful decommission to not wait for applications
> 
>
> Key: YARN-9106
> URL: https://issues.apache.org/jira/browse/YARN-9106
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: Mikayla Konst
>Assignee: Mikayla Konst
>Priority: Major
> Attachments: YARN-9106.patch
>
>
> Add property 
> yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications.
> If true (the default), the resource manager waits for all containers, as well 
> as all applications associated with those containers, to finish before 
> gracefully decommissioning a node.
> If false, the resource manager only waits for containers, but not 
> applications, to finish. For map-only jobs or other jobs in which mappers do 
> not need to serve shuffle data, this allows nodes to be decommissioned as 
> soon as their containers are finished as opposed to when the job is done.
> Add property 
> yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters.
> If false, during graceful decommission, when the resource manager waits for 
> all containers on a node to finish, it will not wait for app master 
> containers to finish. Defaults to true. This property should only be set to 
> false if app master failure is recoverable.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9101) Recovery Container exitCode Not Right

2019-08-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906804#comment-16906804
 ] 

Hadoop QA commented on YARN-9101:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 22m 
59s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 21s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 10 new + 6 unchanged - 0 fixed = 16 total (was 6) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 
55s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
49s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}100m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9101 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12951928/YARN-9101.1.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux d2d2fcf8aa97 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e6d240d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/24559/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24559/testReport/ |
| asflicense | 
https://builds.apache.org/job/PreCommit-YARN-Build/24559/artifact/out/patch-asflicense-problems.txt
 |
| Max. process+thread count | 401 (vs. ulimit of 1) |
| mod

[jira] [Updated] (YARN-9106) Add option to graceful decommission to not wait for applications

2019-08-13 Thread Zhankun Tang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang updated YARN-9106:
---
Issue Type: Sub-task  (was: Improvement)
Parent: YARN-914

> Add option to graceful decommission to not wait for applications
> 
>
> Key: YARN-9106
> URL: https://issues.apache.org/jira/browse/YARN-9106
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Mikayla Konst
>Assignee: Mikayla Konst
>Priority: Major
> Attachments: YARN-9106.patch
>
>
> Add property 
> yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications.
> If true (the default), the resource manager waits for all containers, as well 
> as all applications associated with those containers, to finish before 
> gracefully decommissioning a node.
> If false, the resource manager only waits for containers, but not 
> applications, to finish. For map-only jobs or other jobs in which mappers do 
> not need to serve shuffle data, this allows nodes to be decommissioned as 
> soon as their containers are finished as opposed to when the job is done.
> Add property 
> yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters.
> If false, during graceful decommission, when the resource manager waits for 
> all containers on a node to finish, it will not wait for app master 
> containers to finish. Defaults to true. This property should only be set to 
> false if app master failure is recoverable.
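
For illustration, the two properties described above could be configured in yarn-site.xml roughly as follows. This is only a sketch based on the property names and defaults quoted in this description, not on the attached patch, and the values are examples.

{code:xml}
<!-- Sketch only: property names and defaults are taken from the description
     above; the exact semantics are defined by the YARN-9106 patch. -->
<property>
  <name>yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications</name>
  <!-- false: decommission a node once its containers finish, without waiting
       for the associated applications to complete. -->
  <value>false</value>
</property>
<property>
  <name>yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters</name>
  <!-- true (the stated default): still wait for AM containers to finish. -->
  <value>true</value>
</property>
{code}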



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9746) Rm should only rewrite the jobConf passed by app when supporting multi-cluster token renew

2019-08-13 Thread Junfan Zhang (JIRA)
Junfan Zhang created YARN-9746:
--

 Summary: Rm should only rewrite the jobConf passed by app when 
supporting multi-cluster token renew
 Key: YARN-9746
 URL: https://issues.apache.org/jira/browse/YARN-9746
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Junfan Zhang


This issue links to YARN-5910.

To support multi-cluster delegation token renewal, the patch from YARN-5910 
works in most scenarios.

But when integrating with Oozie, we encountered some problems. An Oozie job 
carries multiple delegation tokens, including HDFS_DELEGATION_TOKEN (another 
cluster's HA token) and MR_DELEGATION_TOKEN (the Oozie MR launcher token). To 
support renewing the other cluster's token, YARN-5910 was applied and the 
related configuration was set as follows:
{code:xml}
<property>
  <name>mapreduce.job.send-token-conf</name>
  <value>dfs.namenode.kerberos.principal|dfs.nameservices|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$|^dfs.client.failover.proxy.provider.*$</value>
</property>
<property>
  <name>dfs.nameservices</name>
  <value>hadoop-clusterA-ns01,hadoop-clusterA-ns02,hadoop-clusterA-ns03,hadoop-clusterA-ns04,hadoop-clusterB-ns01,hadoop-clusterB-ns02,hadoop-clusterB-ns03,hadoop-clusterB-ns04</value>
</property>
<property>
  <name>dfs.ha.namenodes.hadoop-clusterB-ns01</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn1</name>
  <value>namenode01-clusterB.qiyi.hadoop:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn2</name>
  <value>namenode02-clusterB.qiyi.hadoop:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.hadoop-clusterB-ns01</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
{code}
However, the MR_DELEGATION_TOKEN couldn't be renewed because some configuration 
was missing. Although we can set the required configuration through the app, 
this is not a good idea. So I think the RM should only rewrite the jobConf 
passed by the app to solve this situation.
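
For reference, setting this from the application side (the workaround mentioned above) might look roughly like the sketch below. The property name and regex are the ones listed in the configuration above; the class and job names are purely illustrative.

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Hedged sketch of the client-side workaround: hand the cross-cluster HDFS
// settings to the RM through mapreduce.job.send-token-conf (see YARN-5910).
public class SendTokenConfExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    // Keys the RM should pick up from the job configuration when renewing
    // tokens; this mirrors the property block quoted above.
    conf.set("mapreduce.job.send-token-conf",
        "dfs.namenode.kerberos.principal|dfs.nameservices"
            + "|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$"
            + "|^dfs.client.failover.proxy.provider.*$");
    // "cross-cluster-example" is just an illustrative job name.
    Job job = Job.getInstance(conf, "cross-cluster-example");
    System.out.println("send-token-conf = "
        + job.getConfiguration().get("mapreduce.job.send-token-conf"));
  }
}
{code}

Having every application carry this by hand is exactly the burden the description calls out, which is why rewriting only the needed part of the jobConf on the RM side is proposed instead.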



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9106) Add option to graceful decommission to not wait for applications

2019-08-13 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906856#comment-16906856
 ] 

Sunil Govindan commented on YARN-9106:
--

wait-for-applications and wait-for-app-masters

Expecting the behaviour below:

1. wait-for-applications: the suggested default is TRUE. This means that even 
when the containers are done, the node still cannot be decommissioned, as some 
apps may still be running. This is true for MR, but how about other apps, such 
as services, Tez or Spark? I think we need to consider why we need to hold a 
node for longer based on the type of containers/apps each node has run.

2. wait-for-app-masters: this config will be helpful in order to force-kill AM 
containers and decommission a node faster. Thinking out loud, this is an 
aggressive config, however it is turned off by default. Hence I think it is 
fine to have this.

> Add option to graceful decommission to not wait for applications
> 
>
> Key: YARN-9106
> URL: https://issues.apache.org/jira/browse/YARN-9106
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Mikayla Konst
>Assignee: Mikayla Konst
>Priority: Major
> Attachments: YARN-9106.patch
>
>
> Add property 
> yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications.
> If true (the default), the resource manager waits for all containers, as well 
> as all applications associated with those containers, to finish before 
> gracefully decommissioning a node.
> If false, the resource manager only waits for containers, but not 
> applications, to finish. For map-only jobs or other jobs in which mappers do 
> not need to serve shuffle data, this allows nodes to be decommissioned as 
> soon as their containers are finished as opposed to when the job is done.
> Add property 
> yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters.
> If false, during graceful decommission, when the resource manager waits for 
> all containers on a node to finish, it will not wait for app master 
> containers to finish. Defaults to true. This property should only be set to 
> false if app master failure is recoverable.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9106) Add option to graceful decommission to not wait for applications

2019-08-13 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906857#comment-16906857
 ] 

Sunil Govindan commented on YARN-9106:
--

cc [~leftnoteasy] [~tangzhankun]

> Add option to graceful decommission to not wait for applications
> 
>
> Key: YARN-9106
> URL: https://issues.apache.org/jira/browse/YARN-9106
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Mikayla Konst
>Assignee: Mikayla Konst
>Priority: Major
> Attachments: YARN-9106.patch
>
>
> Add property 
> yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications.
> If true (the default), the resource manager waits for all containers, as well 
> as all applications associated with those containers, to finish before 
> gracefully decommissioning a node.
> If false, the resource manager only waits for containers, but not 
> applications, to finish. For map-only jobs or other jobs in which mappers do 
> not need to serve shuffle data, this allows nodes to be decommissioned as 
> soon as their containers are finished as opposed to when the job is done.
> Add property 
> yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters.
> If false, during graceful decommission, when the resource manager waits for 
> all containers on a node to finish, it will not wait for app master 
> containers to finish. Defaults to true. This property should only be set to 
> false if app master failure is recoverable.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-6003) yarn-ui build failure caused by debug 2.4.0

2019-08-13 Thread Akira Ajisaka (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved YARN-6003.
-
Resolution: Not A Problem

This issue is no longer a problem. Closing.

> yarn-ui build failure caused by debug 2.4.0
> ---
>
> Key: YARN-6003
> URL: https://issues.apache.org/jira/browse/YARN-6003
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build, yarn-ui-v2
>Reporter: Akira Ajisaka
>Priority: Minor
>
> The recent build failure: 
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/255/artifact/out/patch-compile-root.txt
> {noformat}
> /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/target/src/main/webapp/node_modules/debug/debug.js:126
>   debug.color = selectColor(namespae);
> ^
> ReferenceError: namespae is not defined
> at createDebug 
> (/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/target/src/main/webapp/node_modules/debug/debug.js:126:29)
> at Object. 
> (/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/target/src/main/webapp/node_modules/ember-cli/lib/models/project.js:16:43)
> at Module._compile (module.js:456:26)
> at Object.Module._extensions..js (module.js:474:10)
> at Module.load (module.js:356:32)
> at Function.Module._load (module.js:312:12)
> at Module.require (module.js:364:17)
> at require (module.js:380:17)
> at Object. 
> (/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/target/src/main/webapp/node_modules/ember-cli/lib/cli/index.js:4:21)
> at Module._compile (module.js:456:26)
> {noformat}
> debug@2.4.0 is broken: https://github.com/visionmedia/debug/issues/347
> Maybe we need to pin the version to 2.4.1 explicitly.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9746) Rm should only rewrite the jobConf passed by app when supporting multi-cluster token renew

2019-08-13 Thread Junfan Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junfan Zhang updated YARN-9746:
---
Attachment: YARN-9746-01.path

> Rm should only rewrite the jobConf passed by app when supporting 
> multi-cluster token renew
> --
>
> Key: YARN-9746
> URL: https://issues.apache.org/jira/browse/YARN-9746
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Junfan Zhang
>Priority: Major
> Attachments: YARN-9746-01.path
>
>
> This issue links to YARN-5910.
> To support multi-cluster delegation token renewal, the patch from YARN-5910 
> works in most scenarios.
> But when integrating with Oozie, we encountered some problems. An Oozie job 
> carries multiple delegation tokens, including HDFS_DELEGATION_TOKEN (another 
> cluster's HA token) and MR_DELEGATION_TOKEN (the Oozie MR launcher token). To 
> support renewing the other cluster's token, YARN-5910 was applied and the 
> related configuration was set as follows:
> {code:xml}
> <property>
>   <name>mapreduce.job.send-token-conf</name>
>   <value>dfs.namenode.kerberos.principal|dfs.nameservices|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$|^dfs.client.failover.proxy.provider.*$</value>
> </property>
> <property>
>   <name>dfs.nameservices</name>
>   <value>hadoop-clusterA-ns01,hadoop-clusterA-ns02,hadoop-clusterA-ns03,hadoop-clusterA-ns04,hadoop-clusterB-ns01,hadoop-clusterB-ns02,hadoop-clusterB-ns03,hadoop-clusterB-ns04</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.hadoop-clusterB-ns01</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn1</name>
>   <value>namenode01-clusterB.qiyi.hadoop:8020</value>
> </property>
> <property>
>   <name>dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn2</name>
>   <value>namenode02-clusterB.qiyi.hadoop:8020</value>
> </property>
> <property>
>   <name>dfs.client.failover.proxy.provider.hadoop-clusterB-ns01</name>
>   <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
> </property>
> {code}
> However, the MR_DELEGATION_TOKEN couldn't be renewed because some 
> configuration was missing. Although we can set the required configuration 
> through the app, this is not a good idea. So I think the RM should only 
> rewrite the jobConf passed by the app to solve this situation.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9746) Rm should only rewrite partial jobConf passed by app when supporting multi-cluster token renew

2019-08-13 Thread Junfan Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junfan Zhang updated YARN-9746:
---
Summary: Rm should only rewrite partial jobConf passed by app when 
supporting multi-cluster token renew  (was: Rm should only rewrite the jobConf 
passed by app when supporting multi-cluster token renew)

> Rm should only rewrite partial jobConf passed by app when supporting 
> multi-cluster token renew
> --
>
> Key: YARN-9746
> URL: https://issues.apache.org/jira/browse/YARN-9746
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Junfan Zhang
>Priority: Major
> Attachments: YARN-9746-01.path
>
>
> This issue links to YARN-5910.
> To support multi-cluster delegation token renewal, the patch from YARN-5910 
> works in most scenarios.
> But when integrating with Oozie, we encountered some problems. An Oozie job 
> carries multiple delegation tokens, including HDFS_DELEGATION_TOKEN (another 
> cluster's HA token) and MR_DELEGATION_TOKEN (the Oozie MR launcher token). To 
> support renewing the other cluster's token, YARN-5910 was applied and the 
> related configuration was set as follows:
> {code:xml}
> <property>
>   <name>mapreduce.job.send-token-conf</name>
>   <value>dfs.namenode.kerberos.principal|dfs.nameservices|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$|^dfs.client.failover.proxy.provider.*$</value>
> </property>
> <property>
>   <name>dfs.nameservices</name>
>   <value>hadoop-clusterA-ns01,hadoop-clusterA-ns02,hadoop-clusterA-ns03,hadoop-clusterA-ns04,hadoop-clusterB-ns01,hadoop-clusterB-ns02,hadoop-clusterB-ns03,hadoop-clusterB-ns04</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.hadoop-clusterB-ns01</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn1</name>
>   <value>namenode01-clusterB.qiyi.hadoop:8020</value>
> </property>
> <property>
>   <name>dfs.namenode.rpc-address.hadoop-clusterB-ns01.nn2</name>
>   <value>namenode02-clusterB.qiyi.hadoop:8020</value>
> </property>
> <property>
>   <name>dfs.client.failover.proxy.provider.hadoop-clusterB-ns01</name>
>   <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
> </property>
> {code}
> However, the MR_DELEGATION_TOKEN couldn't be renewed because some 
> configuration was missing. Although we can set the required configuration 
> through the app, this is not a good idea. So I think the RM should only 
> rewrite the jobConf passed by the app to solve this situation.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9080) Bucket Directories as part of ATS done accumulates

2019-08-13 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906884#comment-16906884
 ] 

Bibin A Chundatt commented on YARN-9080:


Thank you [~Prabhu Joseph] for working on this.

I have a query regarding this. Sorry to come in really late.

{code}
 while (iter.hasNext()) {
  FileStatus stat = iter.next();
  Path clusterTimeStampPath = stat.getPath();
  if (isValidClusterTimeStampDir(clusterTimeStampPath)) {
MutableBoolean appLogDirPresent = new MutableBoolean(false);
{code}
{{fs.getFileStatus(clusterTimeStampPath)}} in *isValidClusterTimeStampDir* 
creates an additional Namenode RPC call.

Can we pass the FileStatus instead of the Path to 
{{isValidClusterTimeStampDir}} to reduce the Namenode RPC calls?

Thoughts??
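
To make the suggestion concrete, a rough sketch of the signature change is below; the body of the check is an assumption for illustration only and is not taken from the actual EntityGroupFSTimelineStore code or the patch.

{code}
import org.apache.hadoop.fs.FileStatus;

// Illustrative sketch only: accept the FileStatus the cleanLogs loop already
// holds from its RemoteIterator, so the validity check needs no extra
// fs.getFileStatus() round trip to the Namenode.
class CleanLogsSketch {
  // Hypothetical helper mirroring isValidClusterTimeStampDir; the real
  // validity rules may differ.
  static boolean isValidClusterTimeStampDir(FileStatus stat) {
    String name = stat.getPath().getName();
    // Assumed check: a directory whose name is a purely numeric cluster timestamp.
    return stat.isDirectory() && !name.isEmpty()
        && name.chars().allMatch(Character::isDigit);
  }
}
{code}

The loop above would then call {{isValidClusterTimeStampDir(stat)}} directly instead of rebuilding the lookup from the Path.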









> Bucket Directories as part of ATS done accumulates
> --
>
> Key: YARN-9080
> URL: https://issues.apache.org/jira/browse/YARN-9080
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: 0001-YARN-9080.patch, 0002-YARN-9080.patch, 
> 0003-YARN-9080.patch, YARN-9080-004.patch, YARN-9080-005.patch, 
> YARN-9080-006.patch, YARN-9080-007.patch, YARN-9080-008.patch
>
>
> Have observed that older bucket directories (cluster_timestamp, bucket1 and 
> bucket2) accumulate under the ATS done directory. The cleanLogs part of 
> EntityLogCleaner removes only the app directories and not the bucket directories.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9080) Bucket Directories as part of ATS done accumulates

2019-08-13 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9080:

Attachment: YARN-9080.addendum-001.patch

> Bucket Directories as part of ATS done accumulates
> --
>
> Key: YARN-9080
> URL: https://issues.apache.org/jira/browse/YARN-9080
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: 0001-YARN-9080.patch, 0002-YARN-9080.patch, 
> 0003-YARN-9080.patch, YARN-9080-004.patch, YARN-9080-005.patch, 
> YARN-9080-006.patch, YARN-9080-007.patch, YARN-9080-008.patch, 
> YARN-9080.addendum-001.patch
>
>
> Have observed that older bucket directories (cluster_timestamp, bucket1 and 
> bucket2) accumulate under the ATS done directory. The cleanLogs part of 
> EntityLogCleaner removes only the app directories and not the bucket directories.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9080) Bucket Directories as part of ATS done accumulates

2019-08-13 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906907#comment-16906907
 ] 

Prabhu Joseph commented on YARN-9080:
-

Thanks [~bibinchundatt], I have adapted the changes in the addendum patch 
[^YARN-9080.addendum-001.patch]. Can you review the changes when you get time? 
Thanks.

> Bucket Directories as part of ATS done accumulates
> --
>
> Key: YARN-9080
> URL: https://issues.apache.org/jira/browse/YARN-9080
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: 0001-YARN-9080.patch, 0002-YARN-9080.patch, 
> 0003-YARN-9080.patch, YARN-9080-004.patch, YARN-9080-005.patch, 
> YARN-9080-006.patch, YARN-9080-007.patch, YARN-9080-008.patch, 
> YARN-9080.addendum-001.patch
>
>
> Have observed that older bucket directories (cluster_timestamp, bucket1 and 
> bucket2) accumulate under the ATS done directory. The cleanLogs part of 
> EntityLogCleaner removes only the app directories and not the bucket directories.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9080) Bucket Directories as part of ATS done accumulates

2019-08-13 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906913#comment-16906913
 ] 

Bibin A Chundatt commented on YARN-9080:


[~Prabhu Joseph] Thank you for updating the patch. Could you handle it in a new 
JIRA?

> Bucket Directories as part of ATS done accumulates
> --
>
> Key: YARN-9080
> URL: https://issues.apache.org/jira/browse/YARN-9080
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: 0001-YARN-9080.patch, 0002-YARN-9080.patch, 
> 0003-YARN-9080.patch, YARN-9080-004.patch, YARN-9080-005.patch, 
> YARN-9080-006.patch, YARN-9080-007.patch, YARN-9080-008.patch, 
> YARN-9080.addendum-001.patch
>
>
> Have observed that older bucket directories (cluster_timestamp, bucket1 and 
> bucket2) accumulate under the ATS done directory. The cleanLogs part of 
> EntityLogCleaner removes only the app directories and not the bucket directories.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9747) Reduce additional namenode call by EntityGroupFSTimelineStore#cleanLogs

2019-08-13 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-9747:
---

 Summary: Reduce additional namenode call by 
EntityGroupFSTimelineStore#cleanLogs
 Key: YARN-9747
 URL: https://issues.apache.org/jira/browse/YARN-9747
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Affects Versions: 3.3.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


EntityGroupFSTimelineStore#cleanLogs creates an additional Namenode RPC call.

{code}
cleanLogs:
 while (iter.hasNext()) {
  FileStatus stat = iter.next();
  Path clusterTimeStampPath = stat.getPath();
  if (isValidClusterTimeStampDir(clusterTimeStampPath)) {
    MutableBoolean appLogDirPresent = new MutableBoolean(false);
{code}

{{fs.getFileStatus(clusterTimeStampPath)}} in *isValidClusterTimeStampDir* 
creates an additional Namenode RPC call.

cc [~bibinchundatt]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9747) Reduce additional namenode call by EntityGroupFSTimelineStore#cleanLogs

2019-08-13 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9747:

Attachment: YARN-9747-001.patch

> Reduce additional namenode call by EntityGroupFSTimelineStore#cleanLogs
> ---
>
> Key: YARN-9747
> URL: https://issues.apache.org/jira/browse/YARN-9747
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9747-001.patch
>
>
> EntityGroupFSTimelineStore#cleanLogs creates an additional Namenode RPC call.
> {code}
> cleanLogs:
>  while (iter.hasNext()) {
>   FileStatus stat = iter.next();
>   Path clusterTimeStampPath = stat.getPath();
>   if (isValidClusterTimeStampDir(clusterTimeStampPath)) {
>     MutableBoolean appLogDirPresent = new MutableBoolean(false);
> {code}
> {{fs.getFileStatus(clusterTimeStampPath)}} in *isValidClusterTimeStampDir* 
> creates an additional Namenode RPC call.
> cc [~bibinchundatt]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9080) Bucket Directories as part of ATS done accumulates

2019-08-13 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906921#comment-16906921
 ] 

Prabhu Joseph commented on YARN-9080:
-

[~bibinchundatt] Yes, I have reported YARN-9747 and submitted a patch. Thanks.

> Bucket Directories as part of ATS done accumulates
> --
>
> Key: YARN-9080
> URL: https://issues.apache.org/jira/browse/YARN-9080
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: 0001-YARN-9080.patch, 0002-YARN-9080.patch, 
> 0003-YARN-9080.patch, YARN-9080-004.patch, YARN-9080-005.patch, 
> YARN-9080-006.patch, YARN-9080-007.patch, YARN-9080-008.patch, 
> YARN-9080.addendum-001.patch
>
>
> Have observed that older bucket directories (cluster_timestamp, bucket1 and 
> bucket2) accumulate under the ATS done directory. The cleanLogs part of 
> EntityLogCleaner removes only the app directories and not the bucket directories.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9735) Allow User Keytab to submit YARN Native Service

2019-08-13 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906926#comment-16906926
 ] 

Prabhu Joseph commented on YARN-9735:
-

Hi [~eyang], I missed checking the reason why YARN Native Service supports only 
a service principal with a hostname (YARN-8571). I have seen application users 
start testing YARN Native Service with their user keytab and hit the above 
issue. They then have to debug and find that it requires a service keytab, 
which has to be created and distributed for every host on Dev, Test and Prod 
clusters. I think this affects usability for new users.

I have tested with a user keytab and it also works fine. Do you think it is 
fine to allow a user keytab? If not, I will close this JIRA as invalid.

> Allow User Keytab to submit YARN Native Service 
> 
>
> Key: YARN-9735
> URL: https://issues.apache.org/jira/browse/YARN-9735
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-native-services
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> YARN Native Service launch fails on a secure cluster with a user keytab; it 
> allows only a service keytab. Most users have been seen testing their jobs 
> with a user keytab.
> {code}
> [ambari-qa@pjosephdocker-3 ~]$ yarn app -launch sleeper-service 
> /usr/hdp/3.0.1.0-187/hadoop-yarn/yarn-service-examples/sleeper/sleeper.json
> 19/08/03 17:17:04 ERROR client.ApiServiceClient: Kerberos principal 
> (ambari-qa-pjosephdoc...@docker.com) does  not contain a hostname.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9747) Reduce additional namenode call by EntityGroupFSTimelineStore#cleanLogs

2019-08-13 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906931#comment-16906931
 ] 

Bibin A Chundatt commented on YARN-9747:


Thank you [~Prabhu Joseph] for the patch.

+1 LGTM, will wait for the Jenkins results.



> Reduce additional namenode call by EntityGroupFSTimelineStore#cleanLogs
> ---
>
> Key: YARN-9747
> URL: https://issues.apache.org/jira/browse/YARN-9747
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9747-001.patch
>
>
> EntityGroupFSTimelineStore#cleanLogs creates an additional Namenode RPC call.
> {code}
> cleanLogs:
>  while (iter.hasNext()) {
>   FileStatus stat = iter.next();
>   Path clusterTimeStampPath = stat.getPath();
>   if (isValidClusterTimeStampDir(clusterTimeStampPath)) {
>     MutableBoolean appLogDirPresent = new MutableBoolean(false);
> {code}
> {{fs.getFileStatus(clusterTimeStampPath)}} in *isValidClusterTimeStampDir* 
> creates an additional Namenode RPC call.
> cc [~bibinchundatt]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org