[jira] [Commented] (YARN-3660) [GPG] Federation Global Policy Generator (service hook only)

2023-05-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17720075#comment-17720075
 ] 

ASF GitHub Bot commented on YARN-3660:
--

hadoop-yetus commented on PR #5625:
URL: https://github.com/apache/hadoop/pull/5625#issuecomment-1537036536

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 58s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  shelldocs  |   0m  0s |  |  Shelldocs was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  31m 52s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  20m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  16m 35s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |  15m  7s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   3m 52s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   8m 25s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   9m 15s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   7m 47s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +0 :ok: |  spotbugs  |   0m 19s |  |  branch/hadoop-project no spotbugs output file (spotbugsXml.xml)  |
   | +1 :green_heart: |  shadedclient  |  30m 40s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   6m 55s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  15m  2s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |  15m  2s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  14m 24s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  javac  |  14m 24s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   3m 33s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   8m 29s |  |  the patch passed  |
   | +1 :green_heart: |  shellcheck  |   0m 24s |  |  No new issues.  |
   | +1 :green_heart: |  javadoc  |   9m 21s |  |  the patch passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   8m  6s |  |  the patch passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +0 :ok: |  spotbugs  |   0m 24s |  |  hadoop-project has no data from spotbugs  |
   | -1 :x: |  spotbugs  |  10m 15s | [/new-spotbugs-hadoop-yarn-project_hadoop-yarn.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5625/2/artifact/out/new-spotbugs-hadoop-yarn-project_hadoop-yarn.html) |  hadoop-yarn-project/hadoop-yarn generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)  |
   | -1 :x: |  spotbugs  |   5m 45s | [/new-spotbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5625/2/artifact/out/new-spotbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server.html) |  hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)  |
   | -1 :x: |  spotbugs  |   9m 55s | [/new-spotbugs-hadoop-yarn-project.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5625/2/artifact/out/new-spotbugs-hadoop-yarn-project.html) |  hadoop-yarn-project generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)  |
   | +1 :green_heart: |  shadedclient  |  29m 11s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 25s |  |  hadoop-project in the patch passed.  |
   | -1 :x: |  unit  | 239m 12s | [/patch-unit-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5625/2/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn.txt) |  hadoop-yarn in the patch passed. 

[jira] [Commented] (YARN-11470) FederationStateStoreFacade Cache Support Guava Cache

2023-05-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17720052#comment-17720052
 ] 

ASF GitHub Bot commented on YARN-11470:
---

slfan1989 commented on PR #5609:
URL: https://github.com/apache/hadoop/pull/5609#issuecomment-1536970845

   @goiri Thank you very much for helping to review the code!




> FederationStateStoreFacade Cache Support Guava Cache
> 
>
> Key: YARN-11470
> URL: https://issues.apache.org/jira/browse/YARN-11470
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> FederationStateStoreFacade Cache Support Guava Cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-11477) [Federation] MemoryFederationStateStore Support Store ApplicationSubmitData

2023-05-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17720051#comment-17720051
 ] 

ASF GitHub Bot commented on YARN-11477:
---

slfan1989 commented on PR #5616:
URL: https://github.com/apache/hadoop/pull/5616#issuecomment-1536970701

   @goiri Thank you very much for helping to review the code!




> [Federation] MemoryFederationStateStore Support Store ApplicationSubmitData
> ---
>
> Key: YARN-11477
> URL: https://issues.apache.org/jira/browse/YARN-11477
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Before completing YARN-8898, we need to store ApplicationSubmitData in 
> FederationStateStore first. This Jira will store ApplicationSubmitData in 
> MemoryFederationStateStore.






[jira] [Commented] (YARN-11477) [Federation] MemoryFederationStateStore Support Store ApplicationSubmitData

2023-05-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17720001#comment-17720001
 ] 

ASF GitHub Bot commented on YARN-11477:
---

goiri merged PR #5616:
URL: https://github.com/apache/hadoop/pull/5616




> [Federation] MemoryFederationStateStore Support Store ApplicationSubmitData
> ---
>
> Key: YARN-11477
> URL: https://issues.apache.org/jira/browse/YARN-11477
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
>
> Before completing YARN-8898, we need to store ApplicationSubmitData in 
> FederationStateStore first. This Jira will store ApplicationSubmitData in 
> MemoryFederationStateStore.






[jira] [Resolved] (YARN-11477) [Federation] MemoryFederationStateStore Support Store ApplicationSubmitData

2023-05-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/YARN-11477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri resolved YARN-11477.

Fix Version/s: 3.4.0
   Resolution: Fixed

> [Federation] MemoryFederationStateStore Support Store ApplicationSubmitData
> ---
>
> Key: YARN-11477
> URL: https://issues.apache.org/jira/browse/YARN-11477
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Before completing YARN-8898, we need to store ApplicationSubmitData in 
> FederationStateStore first. This Jira will store ApplicationSubmitData in 
> MemoryFederationStateStore.






[jira] [Resolved] (YARN-11470) FederationStateStoreFacade Cache Support Guava Cache

2023-05-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/YARN-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri resolved YARN-11470.

Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> FederationStateStoreFacade Cache Support Guava Cache
> 
>
> Key: YARN-11470
> URL: https://issues.apache.org/jira/browse/YARN-11470
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> FederationStateStoreFacade Cache Support Guava Cache.






[jira] [Commented] (YARN-11470) FederationStateStoreFacade Cache Support Guava Cache

2023-05-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719998#comment-17719998
 ] 

ASF GitHub Bot commented on YARN-11470:
---

goiri merged PR #5609:
URL: https://github.com/apache/hadoop/pull/5609




> FederationStateStoreFacade Cache Support Guava Cache
> 
>
> Key: YARN-11470
> URL: https://issues.apache.org/jira/browse/YARN-11470
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
>
> FederationStateStoreFacade Cache Support Guava Cache.






[jira] [Commented] (YARN-3660) [GPG] Federation Global Policy Generator (service hook only)

2023-05-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719903#comment-17719903
 ] 

ASF GitHub Bot commented on YARN-3660:
--

hadoop-yetus commented on PR #5625:
URL: https://github.com/apache/hadoop/pull/5625#issuecomment-1536394965

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 54s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  shelldocs  |   0m  0s |  |  Shelldocs was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  21m 54s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  21m 48s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  15m 45s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |  15m  8s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +1 :green_heart: |  checkstyle  |   3m 44s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   8m 57s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   9m  3s |  |  trunk passed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   7m 45s |  |  trunk passed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09  |
   | +0 :ok: |  spotbugs  |   0m 20s |  |  branch/hadoop-project no spotbugs output file (spotbugsXml.xml)  |
   | +1 :green_heart: |  shadedclient  |  30m 21s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 10s |  |  Maven dependency ordering for patch  |
   | -1 :x: |  mvninstall  |   0m 11s | [/patch-mvninstall-hadoop-yarn-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5625/1/artifact/out/patch-mvninstall-hadoop-yarn-project.txt) |  hadoop-yarn-project in the patch failed.  |
   | -1 :x: |  mvninstall  |   0m 11s | [/patch-mvninstall-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5625/1/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn.txt) |  hadoop-yarn in the patch failed.  |
   | -1 :x: |  mvninstall  |   0m 13s | [/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5625/1/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server.txt) |  hadoop-yarn-server in the patch failed.  |
   | -1 :x: |  mvninstall  |   0m  9s | [/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-globalpolicygenerator.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5625/1/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-globalpolicygenerator.txt) |  hadoop-yarn-server-globalpolicygenerator in the patch failed.  |
   | -1 :x: |  compile  |   0m 13s | [/patch-compile-root-jdkUbuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5625/1/artifact/out/patch-compile-root-jdkUbuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.txt) |  root in the patch failed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.  |
   | -1 :x: |  javac  |   0m 13s | [/patch-compile-root-jdkUbuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5625/1/artifact/out/patch-compile-root-jdkUbuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.txt) |  root in the patch failed with JDK Ubuntu-11.0.18+10-post-Ubuntu-0ubuntu120.04.1.  |
   | -1 :x: |  compile  |   0m 13s | [/patch-compile-root-jdkPrivateBuild-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5625/1/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09.txt) |  root in the patch failed with JDK Private Build-1.8.0_362-8u362-ga-0ubuntu1~20.04.1-b09.  |
   | -1 :x: |  javac  |   0m 13s | 

[jira] [Commented] (YARN-11477) [Federation] MemoryFederationStateStore Support Store ApplicationSubmitData

2023-05-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719898#comment-17719898
 ] 

ASF GitHub Bot commented on YARN-11477:
---

slfan1989 commented on PR #5616:
URL: https://github.com/apache/hadoop/pull/5616#issuecomment-1536382648

   @goiri Can you help merge this PR into the trunk branch? Thank you very 
much! I will continue to follow up on YARN-11479.




> [Federation] MemoryFederationStateStore Support Store ApplicationSubmitData
> ---
>
> Key: YARN-11477
> URL: https://issues.apache.org/jira/browse/YARN-11477
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
>
> Before completing YARN-8898, we need to store ApplicationSubmitData in 
> FederationStateStore first. This Jira will store ApplicationSubmitData in 
> MemoryFederationStateStore.






[jira] [Updated] (YARN-3660) [GPG] Federation Global Policy Generator (service hook only)

2023-05-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated YARN-3660:
-
Labels: federation gpg pull-request-available  (was: federation gpg)

> [GPG] Federation Global Policy Generator (service hook only)
> 
>
> Key: YARN-3660
> URL: https://issues.apache.org/jira/browse/YARN-3660
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Carlo Curino
>Assignee: Botong Huang
>Priority: Major
>  Labels: federation, gpg, pull-request-available
> Attachments: YARN-3660-YARN-7402.v1.patch, 
> YARN-3660-YARN-7402.v2.patch, YARN-3660-YARN-7402.v3.patch, 
> YARN-3660-YARN-7402.v3.patch, YARN-3660-YARN-7402.v3.patch, 
> YARN-3660-YARN-7402.v4.patch
>
>
> In a federated environment, local impairments of one sub-cluster might 
> unfairly affect users/queues that are mapped to that sub-cluster. A 
> centralized component (GPG) runs out-of-band and edits the policies governing 
> how users/queues are allocated to sub-clusters. This allows us to enforce 
> global invariants (by dynamically updating locally-enforced invariants).






[jira] [Commented] (YARN-3660) [GPG] Federation Global Policy Generator (service hook only)

2023-05-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719829#comment-17719829
 ] 

ASF GitHub Bot commented on YARN-3660:
--

slfan1989 opened a new pull request, #5625:
URL: https://github.com/apache/hadoop/pull/5625

   
   
   ### Description of PR
   
   YARN-3660. [GPG] Federation Global Policy Generator.
   
   ### How was this patch tested?
   
   
   ### For code changes:
   
   - [ ] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




> [GPG] Federation Global Policy Generator (service hook only)
> 
>
> Key: YARN-3660
> URL: https://issues.apache.org/jira/browse/YARN-3660
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Carlo Curino
>Assignee: Botong Huang
>Priority: Major
>  Labels: federation, gpg
> Attachments: YARN-3660-YARN-7402.v1.patch, 
> YARN-3660-YARN-7402.v2.patch, YARN-3660-YARN-7402.v3.patch, 
> YARN-3660-YARN-7402.v3.patch, YARN-3660-YARN-7402.v3.patch, 
> YARN-3660-YARN-7402.v4.patch
>
>
> In a federated environment, local impairments of one sub-cluster might 
> unfairly affect users/queues that are mapped to that sub-cluster. A 
> centralized component (GPG) runs out-of-band and edits the policies governing 
> how users/queues are allocated to sub-clusters. This allows us to enforce 
> global invariants (by dynamically updating locally-enforced invariants).






[jira] [Updated] (YARN-11463) Node Labels root directory creation doesn't have a retry logic

2023-05-05 Thread Benjamin Teke (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Teke updated YARN-11463:
-
Fix Version/s: 3.4.0

> Node Labels root directory creation doesn't have a retry logic
> --
>
> Key: YARN-11463
> URL: https://issues.apache.org/jira/browse/YARN-11463
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Benjamin Teke
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> When CS is initialized, it'll [try to create the configured node labels root 
> dir|https://github.com/apache/hadoop/blob/7169ec450957e5602775c3cd6fe1bf0b95773dfb/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/store/AbstractFSNodeStore.java#L69].
> This, however, doesn't implement any kind of retry logic (in contrast to the 
> RM FS state store or ZK state store), so if the distributed file system is 
> unavailable at the exact moment CS tries to start, startup fails. Retry logic 
> could be implemented to improve the robustness of the startup process.
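
As a rough illustration of the retry logic suggested in the description, here is a minimal sketch in plain Java. It is not the patch for this issue and does not use any Hadoop API: `withRetries`, `maxAttempts` and `sleepMillis` are hypothetical names, and the lambda in `main` merely stands in for the root directory creation.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetryOnStartupSketch {

  /**
   * Runs the action up to maxAttempts times, sleeping between failed attempts.
   * Only IOException is retried; any other exception propagates immediately.
   */
  public static <T> T withRetries(Callable<T> action, int maxAttempts, long sleepMillis)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return action.call();
      } catch (IOException e) {
        last = e;
        if (attempt < maxAttempts) {
          Thread.sleep(sleepMillis); // fixed delay; a backoff policy could go here
        }
      }
    }
    throw last; // every attempt failed: surface the last I/O error
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    // Hypothetical stand-in for the directory creation: fails twice, then succeeds.
    boolean created = withRetries(() -> {
      if (++calls[0] < 3) {
        throw new IOException("file system unavailable");
      }
      return true;
    }, 5, 10L);
    System.out.println("created=" + created + " after " + calls[0] + " attempts");
  }
}
```

A bounded attempt count keeps a permanently unavailable file system from blocking startup forever; how many attempts and how long to wait would presumably be configurable in a real change.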






[jira] [Commented] (YARN-4754) Too many connection opened to TimelineServer while publishing entities

2023-05-05 Thread Kevin wang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719763#comment-17719763
 ] 

Kevin wang commented on YARN-4754:
--

[~hansonhe] Have you enabled Kerberos authentication?

> Too many connection opened to TimelineServer while publishing entities
> --
>
> Key: YARN-4754
> URL: https://issues.apache.org/jira/browse/YARN-4754
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rohith Sharma K S
>Priority: Critical
> Attachments: ConnectionLeak.rar
>
>
> It is observed that too many connections are kept open to 
> TimelineServer while publishing entities via SystemMetricsPublisher. This 
> sometimes causes a resource shortage for other processes or the RM itself.
> {noformat}
> tcp        0      0 10.18.99.110:3999       10.18.214.60:59265      ESTABLISHED 115302/java
> tcp        0      0 10.18.99.110:25001      :::*                    LISTEN      115302/java
> tcp        0      0 10.18.99.110:25002      :::*                    LISTEN      115302/java
> tcp        0      0 10.18.99.110:25003      :::*                    LISTEN      115302/java
> tcp        0      0 10.18.99.110:25004      :::*                    LISTEN      115302/java
> tcp        0      0 10.18.99.110:25005      :::*                    LISTEN      115302/java
> tcp        1      0 10.18.99.110:48866      10.18.99.110:8188       CLOSE_WAIT  115302/java
> tcp        1      0 10.18.99.110:48137      10.18.99.110:8188       CLOSE_WAIT  115302/java
> tcp        1      0 10.18.99.110:47553      10.18.99.110:8188       CLOSE_WAIT  115302/java
> tcp        1      0 10.18.99.110:48424      10.18.99.110:8188       CLOSE_WAIT  115302/java
> tcp        1      0 10.18.99.110:48139      10.18.99.110:8188       CLOSE_WAIT  115302/java
> tcp        1      0 10.18.99.110:48096      10.18.99.110:8188       CLOSE_WAIT  115302/java
> tcp        1      0 10.18.99.110:47558      10.18.99.110:8188       CLOSE_WAIT  115302/java
> tcp        1      0 10.18.99.110:49270      10.18.99.110:8188       CLOSE_WAIT  115302/java
> {noformat}
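
To quantify a listing like the one above, a small helper can tally socket states per remote port. The following is only a sketch: the class and method names are hypothetical, and it assumes each input line is a single whitespace-separated `netstat -anp` row (protocol, Recv-Q, Send-Q, local address, foreign address, state, PID/program).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SocketStateCountSketch {

  /** Counts TCP states for rows whose foreign address ends with ":port". */
  public static Map<String, Integer> countStates(List<String> netstatLines, int port) {
    Map<String, Integer> counts = new TreeMap<>();
    String suffix = ":" + port;
    for (String line : netstatLines) {
      String[] f = line.trim().split("\\s+");
      // Skip rows that are not TCP or lack the expected columns.
      if (f.length < 6 || !f[0].startsWith("tcp")) {
        continue;
      }
      // f[4] is the foreign address, f[5] the connection state.
      if (!f[4].endsWith(suffix)) {
        continue;
      }
      counts.merge(f[5], 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    // Sample rows taken from the listing in the issue description.
    List<String> sample = Arrays.asList(
        "tcp 0 0 10.18.99.110:3999 10.18.214.60:59265 ESTABLISHED 115302/java",
        "tcp 0 0 10.18.99.110:25001 :::* LISTEN 115302/java",
        "tcp 1 0 10.18.99.110:48866 10.18.99.110:8188 CLOSE_WAIT 115302/java",
        "tcp 1 0 10.18.99.110:48137 10.18.99.110:8188 CLOSE_WAIT 115302/java");
    System.out.println(countStates(sample, 8188)); // prints {CLOSE_WAIT=2}
  }
}
```

A steadily growing CLOSE_WAIT count towards port 8188 would be consistent with the client side not closing connections after the remote end has closed them.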






[jira] [Comment Edited] (YARN-11464) queue element is added to any other  leaf queue, it's queueType becomes QueueType.PARENT_QUEUE

2023-05-05 Thread Susheel Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719699#comment-17719699
 ] 

Susheel Gupta edited comment on YARN-11464 at 5/5/23 7:14 AM:
--

I think this is not just a test issue. There was a missing dot before 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}", which is one issue. 
However, when I added a dot ({color:#57d9a3}.{color}) before 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}" for the method call 
assertNoValueForQueues in 
{*}TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode{*}, the test failed 
again. This was because the queue root.admin.alice has 
"auto-queue-creation-v2.enabled" set to true ({*}even though we don't have the 
AQCv2 feature enabled to true for leaf queues{*}).

I tried to debug this and found that the XML file used for 
TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode is 
"{{{}apache/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-conversion.xml{}}}".

In this file, there are two queues, {{root.admin.alice}} and 
{{{}root.admin.bob{}}}. When they defined the properties for root.admin.alice, 
they added the reservation queue element, which is correct since it is a leaf 
queue. However, due to the reservation queue element in root.admin.alice, its 
queueType becomes QueueType.PARENT_QUEUE (I guess), and 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}" is set to true for 
alice, which is wrong. I checked by adding the reservation queue element for 
other queues as well, namely {{{}root.admin.bob, root.users.john, and 
root.users.joe{}}}. All of these queues have AQCv2 enabled to true when the 
reservation queue element is added.


was (Author: JIRAUSER295692):
I think this is not just a test issue. There was a missing dot before 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}", which is one issue. 
However, when I added a dot ({color:#57d9a3}.{color}) before 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}" for the method call 
assertNoValueForQueues in 
{*}TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode{*}, the test failed 
again. This was because the queue root.admin.alice has 
"auto-queue-creation-v2.enabled" set to true ({*}even though we don't have the 
AQCv2 feature enabled to true for leaf queues{*}).

I tried to debug this and found that the XML file used for 
TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode is 
"{{{}apache/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-conversion.xml{}}}".

In this file, there are two queues, {{root.admin.alice}} and 
{{{}root.admin.bob{}}}. When they defined the properties for root.admin.alice, 
they added the reservation queue element, which is correct since it is a leaf 
queue. However, due to the reservation queue element in root.admin.alice, its 
queueType becomes QueueType.PARENT_QUEUE, and 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}" is set to true for 
alice, which is wrong. I checked by adding the reservation queue element for 
other queues as well, namely {{{}root.admin.bob, root.users.john, and 
root.users.joe{}}}. All of these queues have AQCv2 enabled to true when the 
reservation queue element is added.

>  queue element is added to any other  leaf queue, it's queueType 
> becomes QueueType.PARENT_QUEUE
> 
>
> Key: YARN-11464
> URL: https://issues.apache.org/jira/browse/YARN-11464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.3.4
>Reporter: Susheel Gupta
>Priority: Major
>
> This testcase clearly reproduces the issue. There is a missing dot before 
> "auto-queue-creation-v2.enabled" for method call assertNoValueForQueues.
> {code:java}
> @Test
> public void testAutoCreateV2FlagsInWeightMode() {
>   converter = builder.withPercentages(false).build();
>   converter.convertQueueHierarchy(rootQueue);
>   assertTrue("root autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.admins autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.admins.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.users autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.users.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.misc autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.misc.auto-queue-creation-v2.enabled", false));
>   Set leafs = Sets.difference(ALL_QUEUES,
>   Sets.newHashSet("root",
>   "root.default",
>   "root.admins",
>   "root.users",
>   "root.misc"));
>   

[jira] [Commented] (YARN-11464) queue element is added to any other  leaf queue, it's queueType becomes QueueType.PARENT_QUEUE

2023-05-05 Thread Susheel Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719699#comment-17719699
 ] 

Susheel Gupta commented on YARN-11464:
--

I think this is not just a test issue. There was a missing dot before 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}", which is one issue. 
However, when I added a dot ({color:#57d9a3}.{color}) before 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}" for the method call 
assertNoValueForQueues in 
{*}TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode{*}, the test failed 
again. This was because the queue root.admin.alice has 
"auto-queue-creation-v2.enabled" set to true ({*}even though we don't have the 
AQCv2 feature enabled to true for leaf queues{*}).

I tried to debug this and found that the XML file used for 
TestFSQueueConverter#testAutoCreateV2FlagsInWeightMode is 
"{{{}apache/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-conversion.xml{}}}".

In this file, there are two queues, {{root.admin.alice}} and 
{{{}root.admin.bob{}}}. When they defined the properties for root.admin.alice, 
they added the reservation queue element, which is correct since it is a leaf 
queue. However, due to the reservation queue element in root.admin.alice, its 
queueType becomes QueueType.PARENT_QUEUE, and 
"{color:#57d9a3}auto-queue-creation-v2.enabled{color}" is set to true for 
alice, which is wrong. I checked by adding the reservation queue element for 
other queues as well, namely {{{}root.admin.bob, root.users.john, and 
root.users.joe{}}}. All of these queues have AQCv2 enabled to true when the 
reservation queue element is added.

>  queue element is added to any other  leaf queue, it's queueType 
> becomes QueueType.PARENT_QUEUE
> 
>
> Key: YARN-11464
> URL: https://issues.apache.org/jira/browse/YARN-11464
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.3.4
>Reporter: Susheel Gupta
>Priority: Major
>
> This testcase clearly reproduces the issue. There is a missing dot before 
> "auto-queue-creation-v2.enabled" for method call assertNoValueForQueues.
> {code:java}
> @Test
> public void testAutoCreateV2FlagsInWeightMode() {
>   converter = builder.withPercentages(false).build();
>   converter.convertQueueHierarchy(rootQueue);
>   assertTrue("root autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.admins autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.admins.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.users autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.users.auto-queue-creation-v2.enabled", false));
>   assertTrue("root.misc autocreate v2 flag",
>   csConfig.getBoolean(
>   PREFIX + "root.misc.auto-queue-creation-v2.enabled", false));
>   Set<String> leafs = Sets.difference(ALL_QUEUES,
>   Sets.newHashSet("root",
>   "root.default",
>   "root.admins",
>   "root.users",
>   "root.misc"));
>   assertNoValueForQueues(leafs, "auto-queue-creation-v2.enabled",
>   csConfig);
> } {code}






[jira] [Comment Edited] (YARN-4754) Too many connection opened to TimelineServer while publishing entities

2023-05-05 Thread hansonhe (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719655#comment-17719655
 ] 

hansonhe edited comment on YARN-4754 at 5/5/23 7:04 AM:


My production environment (hadoop-3.1.4) has the same problem when using 
TimelineV1.
(1) sh1-int-data-bigdata-dw-inv-prod-1 runs the timeline server. There are 
many connections in FIN_WAIT2 or TIME_WAIT:
root@sh1-int-data-bigdata-dw-inv-prod-1 ~ $ netstat -anp|grep 8188
tcp        0      0 10.2.51.214:8188        0.0.0.0:*               LISTEN      8949/java
tcp        0      0 10.2.51.214:8188        10.2.51.215:52490       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52498       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52538       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52552       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52556       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.22:34080        TIME_WAIT   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52540       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.22:34098        TIME_WAIT   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52562       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.22:34074        TIME_WAIT   -
tcp        0      0 10.2.51.214:8188        10.2.51.22:34076        TIME_WAIT   -
tcp        0      0 10.2.51.214:8188        10.2.51.22:34092        TIME_WAIT   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52496       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.22:34070        TIME_WAIT   -
tcp        0      0 10.2.51.214:8188        10.2.51.22:34068        TIME_WAIT   -
tcp        0      0 10.2.51.214:8188        10.2.51.22:34096        TIME_WAIT   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52508       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52494       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52510       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52520       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.214:58984       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52542       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52536       FIN_WAIT2   -
tcp        0      0 10.2.51.214:8188        10.2.51.22:34078        TIME_WAIT   -
tcp        0      0 10.2.51.214:58986       10.2.51.214:8188        TIME_WAIT   -
tcp        0      0 10.2.51.214:8188        10.2.51.22:34072        TIME_WAIT   -
tcp        0      0 10.2.51.214:8188        10.2.51.215:52512       FIN_WAIT2   -
tcp        1      0 10.2.51.214:58984       10.2.51.214:8188        CLOSE_WAIT  27743/java
tcp        0      0 10.2.51.214:8188        10.2.51.22:34082        TIME_WAIT   -

(2) sh1-int-data-bigdata-dw-inv-prod-2 runs the ResourceManager. There are 
many connections in CLOSE_WAIT, and their number grows to more than ten 
thousand:
root@sh1-int-data-bigdata-dw-inv-prod-2 ~ $ netstat -anp|grep 8188
tcp        1      0 10.2.51.215:52496       10.2.51.214:8188        CLOSE_WAIT  20846/java
tcp        1      0 10.2.51.215:52520       10.2.51.214:8188        CLOSE_WAIT  20846/java
tcp        1      0 10.2.51.215:52540       10.2.51.214:8188        CLOSE_WAIT  20846/java
tcp        1      0 10.2.51.215:52494       10.2.51.214:8188        CLOSE_WAIT  20846/java
tcp        1      0 10.2.51.215:52542       10.2.51.214:8188        CLOSE_WAIT  20846/java
tcp        0      0 10.2.51.215:52522       10.2.51.214:8188        TIME_WAIT   -
tcp        1      0 10.2.51.215:52510       10.2.51.214:8188        CLOSE_WAIT  20846/java
tcp        1      0 10.2.51.215:52536       10.2.51.214:8188        CLOSE_WAIT  20846/java
tcp        1      0 10.2.51.215:52498       10.2.51.214:8188        CLOSE_WAIT  20846/java
tcp        1      0 10.2.51.215:52556       10.2.51.214:8188        CLOSE_WAIT  20846/java
tcp        1      0 10.2.51.215:52538       10.2.51.214:8188        CLOSE_WAIT  20846/java
tcp        1      0 10.2.51.215:52562       10.2.51.214:8188        CLOSE_WAIT  20846/java

[jira] [Comment Edited] (YARN-4754) Too many connection opened to TimelineServer while publishing entities

2023-05-05 Thread hansonhe (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719655#comment-17719655
 ] 

hansonhe edited comment on YARN-4754 at 5/5/23 6:51 AM:


My production environment (hadoop-3.1.4) has the same problem when using TimelineV1.
(1) sh1-int-data-bigdata-dw-inv-prod-1 runs the timeline server. There are many 
connections in FIN_WAIT2 or TIME_WAIT.
root@sh1-int-data-bigdata-dw-inv-prod-1 ~ $ netstat -anp|grep 8188
tcp        0      0 10.2.51.214:8188        0.0.0.0:*               LISTEN      
8949/java           
tcp        0      0 10.2.51.214:8188        10.2.51.215:52490       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52498       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52538       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52552       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52556       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34080        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52540       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34098        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52562       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34074        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34076        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34092        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52496       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34070        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34068        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34096        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52508       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52494       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52510       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52520       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.214:58984       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52542       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52536       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34078        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:58986       10.2.51.214:8188        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34072        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52512       FIN_WAIT2   
-                   
tcp        1      0 10.2.51.214:58984       10.2.51.214:8188        CLOSE_WAIT  
27743/java          
tcp        0      0 10.2.51.214:8188        10.2.51.22:34082        TIME_WAIT   
-             

(2) sh1-int-data-bigdata-dw-inv-prod-2 runs the ResourceManager server. There are 
many CLOSE_WAIT connections, and the number has even grown to more than ten thousand.
root@sh1-int-data-bigdata-dw-inv-prod-2 ~ $ netstat -anp|grep 8188
tcp        1      0 10.2.51.215:52496       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52520       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52540       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52494       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52542       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        0      0 10.2.51.215:52522       10.2.51.214:8188        TIME_WAIT   
-                   
tcp        1      0 10.2.51.215:52510       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52536       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52498       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52556       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52538       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52562       10.2.51.214:8188        CLOSE_WAIT  
20846/java 
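
To put numbers on the listings above, the connections can be tallied per TCP state. The following is a minimal sketch, not part of Hadoop: the helper name `count_tcp_states` and the shortened sample are made up for illustration. It assumes the Linux `netstat -anp` column layout shown above and skips the wrapped PID lines.

```python
from collections import Counter


def count_tcp_states(netstat_text, port=8188):
    """Tally TCP states for connections whose local or remote port matches."""
    counts = Counter()
    suffix = ":" + str(port)
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, remote, state.
        # Wrapped PID lines ("20846/java", "-") and truncated rows are skipped.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if local.endswith(suffix) or remote.endswith(suffix):
            counts[state] += 1
    return counts


# Shortened sample in the same wrapped layout as the output pasted above.
sample = """\
tcp        0      0 10.2.51.214:8188        0.0.0.0:*               LISTEN
8949/java
tcp        0      0 10.2.51.214:8188        10.2.51.215:52490       FIN_WAIT2
-
tcp        1      0 10.2.51.215:52496       10.2.51.214:8188        CLOSE_WAIT
20846/java
tcp        1      0 10.2.51.215:52520       10.2.51.214:8188        CLOSE_WAIT
20846/java
tcp        0      0 10.2.51.215:52522       10.2.51.214:8188        TIME_WAIT
-
"""

for state, n in sorted(count_tcp_states(sample).items()):
    print(state, n)
```

Running this periodically against saved `netstat -anp` output from the ResourceManager host would show whether the CLOSE_WAIT count keeps growing over time.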

[jira] [Comment Edited] (YARN-4754) Too many connection opened to TimelineServer while publishing entities

2023-05-05 Thread hansonhe (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719655#comment-17719655
 ] 

hansonhe edited comment on YARN-4754 at 5/5/23 6:50 AM:


My production environment (hadoop-3.1.4) has the same problem when using TimelineV1.
(1) sh1-int-data-bigdata-dw-inv-prod-1 runs the timeline server. There are many 
connections in FIN_WAIT2 or TIME_WAIT.
root@sh1-int-data-bigdata-dw-inv-prod-1 ~ $ netstat -anp|grep 8188
tcp        0      0 10.2.51.214:8188        0.0.0.0:*               LISTEN      
8949/java           
tcp        0      0 10.2.51.214:8188        10.2.51.215:52490       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52498       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52538       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52552       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52556       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34080        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52540       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34098        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52562       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34074        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34076        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34092        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52496       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34070        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34068        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34096        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52508       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52494       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52510       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52520       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.214:58984       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52542       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52536       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34078        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:58986       10.2.51.214:8188        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34072        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52512       FIN_WAIT2   
-                   
tcp        1      0 10.2.51.214:58984       10.2.51.214:8188        CLOSE_WAIT  
27743/java          
tcp        0      0 10.2.51.214:8188        10.2.51.22:34082        TIME_WAIT   
-             

(2) sh1-int-data-bigdata-dw-inv-prod-2 runs the ResourceManager server. There are 
many CLOSE_WAIT connections, and the number has even grown to more than ten thousand.
root@sh1-int-data-bigdata-dw-inv-prod-2 ~ $ netstat -anp|grep 8188
tcp        1      0 10.2.51.215:52496       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52520       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52540       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52494       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52542       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        0      0 10.2.51.215:52522       10.2.51.214:8188        TIME_WAIT   
-                   
tcp        1      0 10.2.51.215:52510       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52536       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52498       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52556       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52538       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52562       10.2.51.214:8188        CLOSE_WAIT  
20846/java 

[jira] [Commented] (YARN-4754) Too many connection opened to TimelineServer while publishing entities

2023-05-05 Thread hansonhe (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719655#comment-17719655
 ] 

hansonhe commented on YARN-4754:


My production environment (hadoop-3.1.4) has the same problem when using TimelineV1.
(1) sh1-int-data-bigdata-dw-inv-prod-1 runs the timeline server. There are many 
connections in FIN_WAIT2 or TIME_WAIT.
root@sh1-int-data-bigdata-dw-inv-prod-1 ~ $ netstat -anp|grep 8188
tcp        0      0 10.2.51.214:8188        0.0.0.0:*               LISTEN      
8949/java           
tcp        0      0 10.2.51.214:8188        10.2.51.215:52490       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52498       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52538       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52552       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52556       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34080        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52540       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34098        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52562       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34074        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34076        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34092        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52496       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34070        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34068        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34096        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52508       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52494       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52510       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52520       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.214:58984       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52542       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52536       FIN_WAIT2   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34078        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:58986       10.2.51.214:8188        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.22:34072        TIME_WAIT   
-                   
tcp        0      0 10.2.51.214:8188        10.2.51.215:52512       FIN_WAIT2   
-                   
tcp        1      0 10.2.51.214:58984       10.2.51.214:8188        CLOSE_WAIT  
27743/java          
tcp        0      0 10.2.51.214:8188        10.2.51.22:34082        TIME_WAIT   
-             

(2) sh1-int-data-bigdata-dw-inv-prod-2 runs the ResourceManager server. There are 
many CLOSE_WAIT connections, and the number has even grown to more than ten thousand.
root@sh1-int-data-bigdata-dw-inv-prod-2 ~ $ netstat -anp|grep 8188
tcp        1      0 10.2.51.215:52496       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52520       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52540       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52494       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52542       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        0      0 10.2.51.215:52522       10.2.51.214:8188        TIME_WAIT   
-                   
tcp        1      0 10.2.51.215:52510       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52536       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52498       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52556       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52538       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        1      0 10.2.51.215:52562       10.2.51.214:8188        CLOSE_WAIT  
20846/java          
tcp        0      0