[jira] [Commented] (YARN-8757) [Submarine] Add Tensorboard component when --tensorboard is specified

2018-09-10 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609943#comment-16609943 ] Wangda Tan commented on YARN-8757: -- Added ver.1 patch which spin up a Tensorboard contain

[jira] [Updated] (YARN-8757) [Submarine] Add Tensorboard component when --tensorboard is specified

2018-09-10 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8757: - Attachment: YARN-8757.001.patch > [Submarine] Add Tensorboard component when --tensorboard is specified >

[jira] [Commented] (YARN-8757) [Submarine] Add Tensorboard component when --tensorboard is specified

2018-09-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16608611#comment-16608611 ] Wangda Tan commented on YARN-8757: -- Working on the patch now, will update patch shortly.

[jira] [Updated] (YARN-8757) [Submarine] Add Tensorboard component when --tensorboard is specified

2018-09-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8757: - Description: We need to have a Tensorboard component when --tensorboard is specified. And we need to set q

[jira] [Created] (YARN-8757) [Submarine] Add Tensorboard component when --tensorboard is specified

2018-09-09 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8757: Summary: [Submarine] Add Tensorboard component when --tensorboard is specified Key: YARN-8757 URL: https://issues.apache.org/jira/browse/YARN-8757 Project: Hadoop YARN

[jira] [Updated] (YARN-8698) [Submarine] Failed to reset Hadoop home environment when submitting a submarine job

2018-09-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8698: - Summary: [Submarine] Failed to reset Hadoop home environment when submitting a submarine job (was: [Subma

[jira] [Updated] (YARN-8756) [Submarine] Properly handle relative path for staging area

2018-09-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8756: - Attachment: YARN-8756.001.patch > [Submarine] Properly handle relative path for staging area > ---

[jira] [Created] (YARN-8756) [Submarine] Properly handle relative path for staging area

2018-09-09 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8756: Summary: [Submarine] Properly handle relative path for staging area Key: YARN-8756 URL: https://issues.apache.org/jira/browse/YARN-8756 Project: Hadoop YARN Issue Ty

[jira] [Commented] (YARN-8698) [Submarine] Failed to add hadoop dependencies in docker container when submitting a submarine job

2018-09-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16608545#comment-16608545 ] Wangda Tan commented on YARN-8698: -- +1, will commit the patch shortly. Thanks, > [Subm

[jira] [Commented] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized

2018-09-08 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16608213#comment-16608213 ] Wangda Tan commented on YARN-8513: -- [~hustnn], I agree that it is still a problem, but r

[jira] [Commented] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized

2018-09-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607495#comment-16607495 ] Wangda Tan commented on YARN-8513: -- Spent good amount of time to check the issue. I foun

[jira] [Commented] (YARN-5592) Add support for dynamic resource updates with multiple resource types

2018-09-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607393#comment-16607393 ] Wangda Tan commented on YARN-5592: -- [~sunilg], I think remove resource types gonna be h

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-04 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603707#comment-16603707 ] Wangda Tan commented on YARN-8569: -- Thanks [~eyang], And forgot to mention: if we're g

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-04 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16603680#comment-16603680 ] Wangda Tan commented on YARN-8569: -- [~eyang], How we can make it available prior to con

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16602350#comment-16602350 ] Wangda Tan commented on YARN-8569: -- And in implementation, AM should have ability to writ

[jira] [Comment Edited] (YARN-8569) Create an interface to provide cluster information to application

2018-09-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16602345#comment-16602345 ] Wangda Tan edited comment on YARN-8569 at 9/3/18 4:57 PM: -- [~eyan

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16602345#comment-16602345 ] Wangda Tan commented on YARN-8569: -- [~eyang], I still think it is a bad idea to support

[jira] [Commented] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized

2018-09-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16602326#comment-16602326 ] Wangda Tan commented on YARN-8513: -- And btw, I found a comment in LeafQueue: {code:java}

[jira] [Commented] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized

2018-09-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16602323#comment-16602323 ] Wangda Tan commented on YARN-8513: -- Interesting, it must be caused by CS allocation doesn

[jira] [Commented] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-08-29 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596926#comment-16596926 ] Wangda Tan commented on YARN-8468: -- [~bsteinbach], Thanks, I think it makes sense to nor

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-29 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16596014#comment-16596014 ] Wangda Tan commented on YARN-8569: -- [~eyang], {quote}Unless malicious user already hacke

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595949#comment-16595949 ] Wangda Tan commented on YARN-8569: -- [~eyang], As we discussed offline, the use case is n

[jira] [Commented] (YARN-8718) Merge related work for YARN-3409

2018-08-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595945#comment-16595945 ] Wangda Tan commented on YARN-8718: -- [~sunilg], the attached patch doesn't look correct.

[jira] [Commented] (YARN-8220) Running Tensorflow on YARN with GPU and Docker - Examples

2018-08-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595944#comment-16595944 ] Wangda Tan commented on YARN-8220: -- Thanks [~sunilg], I think we should close this JIRA.

[jira] [Commented] (YARN-8468) Limit container sizes per queue in FairScheduler

2018-08-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595712#comment-16595712 ] Wangda Tan commented on YARN-8468: -- 1) Is it sufficient to make changes like YARN-1582, I

[jira] [Commented] (YARN-7018) Interface for adding extra behavior to node heartbeats

2018-08-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595575#comment-16595575 ] Wangda Tan commented on YARN-7018: -- [~jlowe], given the fields need to be updated should

[jira] [Commented] (YARN-8722) Failed to get native service application status when security is enabled

2018-08-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595340#comment-16595340 ] Wangda Tan commented on YARN-8722: -- Thanks [~eyang], [~yuan_zac], are you able to *subm

[jira] [Commented] (YARN-8722) Failed to get native service application status when security is enabled

2018-08-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595185#comment-16595185 ] Wangda Tan commented on YARN-8722: -- [~eyang], [~billie.rinaldi], have we seen this issue

[jira] [Updated] (YARN-8722) Failed to get native service application status when security is enabled

2018-08-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8722: - Environment: (was: The environment context is as follows: 1) Security enabled. kerberos 2) Klist out

[jira] [Updated] (YARN-8722) Failed to get native service application status when security is enabled

2018-08-28 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8722: - Description: Can't get job status with the following command, after a submarine job is submitted. bin/yar

[jira] [Created] (YARN-8716) [Submarine] Support passing Kerberos principle tokens when launch training jobs.

2018-08-26 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8716: Summary: [Submarine] Support passing Kerberos principle tokens when launch training jobs. Key: YARN-8716 URL: https://issues.apache.org/jira/browse/YARN-8716 Project: Hadoop

[jira] [Created] (YARN-8713) [Submarine] Support deploy model serving for existing models

2018-08-24 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8713: Summary: [Submarine] Support deploy model serving for existing models Key: YARN-8713 URL: https://issues.apache.org/jira/browse/YARN-8713 Project: Hadoop YARN Issue

[jira] [Created] (YARN-8714) [Submarine] Support files/tarballs to be localized for a training job.

2018-08-24 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8714: Summary: [Submarine] Support files/tarballs to be localized for a training job. Key: YARN-8714 URL: https://issues.apache.org/jira/browse/YARN-8714 Project: Hadoop YARN

[jira] [Created] (YARN-8712) [Submarine] Support create models / versions for training result.

2018-08-24 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8712: Summary: [Submarine] Support create models / versions for training result. Key: YARN-8712 URL: https://issues.apache.org/jira/browse/YARN-8712 Project: Hadoop YARN

[jira] [Commented] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized

2018-08-24 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591878#comment-16591878 ] Wangda Tan commented on YARN-8513: -- [~hustnn], what is the cause of "Failed to accept all

[jira] [Commented] (YARN-8638) Allow linux container runtimes to be pluggable

2018-08-22 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589478#comment-16589478 ] Wangda Tan commented on YARN-8638: -- [~ccondit-target], Thanks for working on this ticke

[jira] [Commented] (YARN-8698) [Submarine] Failed to add hadoop dependencies in docker container when submitting a submarine job

2018-08-22 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589242#comment-16589242 ] Wangda Tan commented on YARN-8698: -- Thanks [~yuan_zac], added u to the contributor list,

[jira] [Assigned] (YARN-8698) [Submarine] Failed to add hadoop dependencies in docker container when submitting a submarine job

2018-08-22 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-8698: Assignee: Zac Zhou > [Submarine] Failed to add hadoop dependencies in docker container when > subm

[jira] [Updated] (YARN-8698) [Submarine] Failed to add hadoop dependencies in docker container when submitting a submarine job

2018-08-22 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8698: - Issue Type: Sub-task (was: Bug) Parent: YARN-8135 > [Submarine] Failed to add hadoop dependencies

[jira] [Updated] (YARN-8698) [Submarine] Failed to add hadoop dependencies in docker container when submitting a submarine job

2018-08-22 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8698: - Summary: [Submarine] Failed to add hadoop dependencies in docker container when submitting a submarine job

[jira] [Updated] (YARN-8675) Setting hostname of docker container breaks with "host" networking mode for Apps which do not run as a YARN service

2018-08-21 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8675: - Reporter: Yesha Vora (was: Suma Shivaprasad) > Setting hostname of docker container breaks with "host" ne

[jira] [Comment Edited] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized

2018-08-20 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586854#comment-16586854 ] Wangda Tan edited comment on YARN-8513 at 8/21/18 3:37 AM: --- Inte

[jira] [Commented] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized

2018-08-20 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586854#comment-16586854 ] Wangda Tan commented on YARN-8513: -- Interesting, [~cheersyang], I can only think about

[jira] [Assigned] (YARN-8679) [ATSv2] If HBase cluster is down for long time, high chances that NM ContainerManager dispatcher get blocked

2018-08-17 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-8679: Assignee: Wangda Tan (was: Rohith Sharma K S) > [ATSv2] If HBase cluster is down for long time, hi

[jira] [Commented] (YARN-8679) [ATSv2] If HBase cluster is down for long time, high chances that NM ContainerManager dispatcher get blocked

2018-08-17 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584322#comment-16584322 ] Wangda Tan commented on YARN-8679: -- [~rohithsharma], thanks for the patch. I'm a bit wo

[jira] [Updated] (YARN-8679) [ATSv2] If HBase cluster is down for long time, high chances that NM ContainerManager dispatcher get blocked

2018-08-17 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8679: - Attachment: YARN-8679.02.patch > [ATSv2] If HBase cluster is down for long time, high chances that NM > C

[jira] [Commented] (YARN-8677) Queue Management API - no errors thrown for wrong properties

2018-08-17 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584242#comment-16584242 ] Wangda Tan commented on YARN-8677: -- [~akhilpb], could u move these issues to sub jira of

[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue

2018-08-17 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584238#comment-16584238 ] Wangda Tan commented on YARN-8657: -- [~sunilg], I'm not quite sure if the patch changed

[jira] [Commented] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized

2018-08-16 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16583026#comment-16583026 ] Wangda Tan commented on YARN-8513: -- [~cyfdecyf], Could u upload logs/jstacks for 3.1.0

[jira] [Updated] (YARN-8667) Cleanup symlinks when container restarted by NM to solve issue "find: File system loop detected;" for tar ball artifacts.

2018-08-16 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8667: - Summary: Cleanup symlinks when container restarted by NM to solve issue "find: File system loop detected;"

[jira] [Updated] (YARN-8667) Cleanup symlinks when container restarted by NM to solve issue "find: File system loop detected;" for tar ball artifacts.

2018-08-16 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8667: - Target Version/s: 3.1.1, 3.2.0 Priority: Critical (was: Major) > Cleanup symlinks when contai

[jira] [Comment Edited] (YARN-8668) Inconsistency between capacity and fair scheduler in the aspect of computing node available resource

2018-08-15 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16581574#comment-16581574 ] Wangda Tan edited comment on YARN-8668 at 8/15/18 8:34 PM: --- Than

[jira] [Commented] (YARN-8668) Inconsistency between capacity and fair scheduler in the aspect of computing node available resource

2018-08-15 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16581574#comment-16581574 ] Wangda Tan commented on YARN-8668: -- Thanks [~Cyl] for reporting the issue, this is by des

[jira] [Updated] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue

2018-08-13 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8657: - Attachment: YARN-8657.001.patch > User limit calculation should be read-lock-protected within LeafQueue >

[jira] [Commented] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue

2018-08-13 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16579053#comment-16579053 ] Wangda Tan commented on YARN-8657: -- [~sunil.gov...@gmail.com], [~cheersyang], could u hel

[jira] [Created] (YARN-8657) User limit calculation should be read-lock-protected within LeafQueue

2018-08-13 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-8657: Summary: User limit calculation should be read-lock-protected within LeafQueue Key: YARN-8657 URL: https://issues.apache.org/jira/browse/YARN-8657 Project: Hadoop YARN

[jira] [Updated] (YARN-8647) Add a flag to disable move app between queues

2018-08-10 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8647: - Summary: Add a flag to disable move app between queues (was: Add a flag to disable move queue) > Add a f

[jira] [Updated] (YARN-8561) [Submarine] Add initial implementation: training job submission and job history retrieve.

2018-08-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8561: - Attachment: YARN-8561.005.patch > [Submarine] Add initial implementation: training job submission and job

[jira] [Commented] (YARN-8561) [Submarine] Add initial implementation: training job submission and job history retrieve.

2018-08-09 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16575464#comment-16575464 ] Wangda Tan commented on YARN-8561: -- Thanks [~sunilg] For your addition comments: 1. I t

[jira] [Commented] (YARN-8588) Logging improvements for better debuggability

2018-08-08 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16574042#comment-16574042 ] Wangda Tan commented on YARN-8588: -- +1, LGTM. thanks [~suma.shivaprasad] > Logging impro

[jira] [Updated] (YARN-8588) Logging improvements for better debuggability

2018-08-08 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8588: - Target Version/s: 3.2.0, 3.1.2 > Logging improvements for better debuggability > -

[jira] [Commented] (YARN-8407) Container launch exception in AM log should be printed in ERROR level

2018-08-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572265#comment-16572265 ] Wangda Tan commented on YARN-8407: -- +1 to the patch. Thanks [~yeshavora] > Container lau

[jira] [Commented] (YARN-8561) [Submarine] Add initial implementation: training job submission and job history retrieve.

2018-08-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572253#comment-16572253 ] Wangda Tan commented on YARN-8561: -- Attached ver.4 patch, fixed jenkins warnings. > [Sub

[jira] [Updated] (YARN-8561) [Submarine] Add initial implementation: training job submission and job history retrieve.

2018-08-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8561: - Attachment: YARN-8561.004.patch > [Submarine] Add initial implementation: training job submission and job

[jira] [Commented] (YARN-8629) Container cleanup fails while trying to delete Cgroups

2018-08-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572242#comment-16572242 ] Wangda Tan commented on YARN-8629: -- Ah forgot to mention, patch got committed to trunk/br

[jira] [Updated] (YARN-8407) Container launch exception in AM log should be printed in ERROR level

2018-08-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8407: - Target Version/s: 3.2.0, 3.1.2 > Container launch exception in AM log should be printed in ERROR level > -

[jira] [Commented] (YARN-8629) Container cleanup fails while trying to delete Cgroups

2018-08-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572206#comment-16572206 ] Wangda Tan commented on YARN-8629: -- +1, patch LGTM, thanks [~suma.shivaprasad]. > Contai

[jira] [Commented] (YARN-8561) [Submarine] Add initial implementation: training job submission and job history retrieve.

2018-08-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572152#comment-16572152 ] Wangda Tan commented on YARN-8561: -- Attached ver.3 patch which included help messages and

[jira] [Updated] (YARN-8561) [Submarine] Add initial implementation: training job submission and job history retrieve.

2018-08-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8561: - Attachment: YARN-8561.003.patch > [Submarine] Add initial implementation: training job submission and job

[jira] [Commented] (YARN-8561) [Submarine] Add initial implementation: training job submission and job history retrieve.

2018-08-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572087#comment-16572087 ] Wangda Tan commented on YARN-8561: -- Thanks [~sunilg], 1. Addressed. 2. I think we can re

[jira] [Updated] (YARN-8561) [Submarine] Add initial implementation: training job submission and job history retrieve.

2018-08-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8561: - Attachment: YARN-8561.002.patch > [Submarine] Add initial implementation: training job submission and job

[jira] [Updated] (YARN-8629) Container cleanup fails while trying to delete Cgroups

2018-08-07 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8629: - Target Version/s: 3.2.0, 3.1.2 Priority: Critical (was: Major) > Container cleanup fails whil

[jira] [Commented] (YARN-7089) Mark the log-aggregation-controller APIs as public

2018-08-06 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570961#comment-16570961 ] Wangda Tan commented on YARN-7089: -- +1, LGTM. > Mark the log-aggregation-controller APIs

[jira] [Commented] (YARN-8475) Should check the resource of assignment is greater than Resources.none() before submitResourceCommitRequest

2018-08-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568712#comment-16568712 ] Wangda Tan commented on YARN-8475: -- [~zhouyunfan], could u add more details to the bug? I

[jira] [Commented] (YARN-8136) Add version attribute to site doc examples and quickstart

2018-08-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568596#comment-16568596 ] Wangda Tan commented on YARN-8136: -- +1, LGTM, thanks [~eyang]. > Add version attribute

[jira] [Assigned] (YARN-8136) Add version attribute to site doc examples and quickstart

2018-08-03 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reassigned YARN-8136: Assignee: Eric Yang > Add version attribute to site doc examples and quickstart > -

[jira] [Updated] (YARN-8608) [UI2] No information available per application appAttempt about 'Total Outstanding Resource Requests'

2018-08-02 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8608: - Reporter: Sumana Sathish (was: Akhil PB) > [UI2] No information available per application appAttempt abou

[jira] [Updated] (YARN-8615) [UI2] Resource Usage tab shows only memory related info. No info available for vcores/gpu.

2018-08-02 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8615: - Reporter: Sumana Sathish (was: Akhil PB) > [UI2] Resource Usage tab shows only memory related info. No in

[jira] [Commented] (YARN-8200) Backport resource types/GPU features to branch-2

2018-08-01 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566119#comment-16566119 ] Wangda Tan commented on YARN-8200: -- [~jhung], thanks for sharing the result. Overall the

[jira] [Comment Edited] (YARN-8559) Expose mutable-conf scheduler's configuration in RM /scheduler-conf endpoint

2018-08-01 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566045#comment-16566045 ] Wangda Tan edited comment on YARN-8559 at 8/1/18 9:57 PM: -- Thanks

[jira] [Commented] (YARN-8559) Expose mutable-conf scheduler's configuration in RM /scheduler-conf endpoint

2018-08-01 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566045#comment-16566045 ] Wangda Tan commented on YARN-8559: -- Thanks [~cheersyang], latest patch LGTM. [~jhung],

[jira] [Commented] (YARN-8588) Logging improvements for better debuggability

2018-08-01 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16565926#comment-16565926 ] Wangda Tan commented on YARN-8588: -- [~suma.shivaprasad], could you help to take care of t

[jira] [Commented] (YARN-8606) Opportunistic scheduling doesnt work after failover

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564775#comment-16564775 ] Wangda Tan commented on YARN-8606: -- [~bibinchundatt],  Gotcha, fix make sense to me. +1

[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564589#comment-16564589 ] Wangda Tan commented on YARN-7494: -- [~sunilg], Thanks for updating the patch, some comm

[jira] [Commented] (YARN-8522) Application fails with InvalidResourceRequestException

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564437#comment-16564437 ] Wangda Tan commented on YARN-8522: -- LGTM +1, thanks [~Zian Chen], Will commit shortly, we

[jira] [Updated] (YARN-8522) Application fails with InvalidResourceRequestException

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8522: - Target Version/s: 3.2.0, 3.1.1 Priority: Critical (was: Major) > Application fails with Inval

[jira] [Updated] (YARN-8600) RegistryDNS hang when remote lookup does not reply

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8600: - Priority: Critical (was: Major) > RegistryDNS hang when remote lookup does not reply > --

[jira] [Updated] (YARN-8579) New AM attempt could not retrieve previous attempt component data

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8579: - Target Version/s: 3.2.0, 3.1.2 Fix Version/s: (was: 3.1.2) (was: 3.2.0

[jira] [Updated] (YARN-7512) Support service upgrade via YARN Service API and CLI

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7512: - Target Version/s: 3.1.2 (was: 3.1.1) > Support service upgrade via YARN Service API and CLI > ---

[jira] [Updated] (YARN-8399) NodeManager is giving 403 GSS exception post upgrade to 3.1 in secure mode

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8399: - Target Version/s: 2.10.0, 3.2.0, 3.0.3, 3.1.2 (was: 2.10.0, 3.2.0, 3.1.1, 3.0.3) > NodeManager is giving

[jira] [Updated] (YARN-8520) Document best practice for user management

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8520: - Target Version/s: 3.2.0, 3.1.2 (was: 3.2.0, 3.1.1) > Document best practice for user management > ---

[jira] [Updated] (YARN-8052) Move overwriting of service definition during flex to service master

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8052: - Target Version/s: 3.1.2 (was: 3.1.1) > Move overwriting of service definition during flex to service mast

[jira] [Updated] (YARN-8136) Add version attribute to site doc examples and quickstart

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8136: - Target Version/s: 3.1.2 (was: 3.1.1) > Add version attribute to site doc examples and quickstart > --

[jira] [Updated] (YARN-8399) NodeManager is giving 403 GSS exception post upgrade to 3.1 in secure mode

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8399: - Target Version/s: 2.10.0, 3.2.0, 3.0.3 (was: 2.10.0, 3.2.0, 3.0.3, 3.1.2) > NodeManager is giving 403 GSS

[jira] [Updated] (YARN-8161) ServiceState FLEX should be removed

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8161: - Target Version/s: 3.2.0, 3.1.2 (was: 3.2.0, 3.1.1) > ServiceState FLEX should be removed > --

[jira] [Updated] (YARN-8366) Expose debug log information when user intend to enable GPU without setting nvidia-smi path

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8366: - Target Version/s: 3.2.0, 3.1.2 (was: 3.2.0, 3.1.1) > Expose debug log information when user intend to ena

[jira] [Updated] (YARN-8453) Additional Unit tests to verify queue limit and max-limit with multiple resource types

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8453: - Target Version/s: 3.0.4, 3.1.2 (was: 3.1.1, 3.0.4) > Additional Unit tests to verify queue limit and max

[jira] [Updated] (YARN-8552) [DS] Container report fails for distributed containers

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-8552: - Target Version/s: 3.1.2 (was: 3.1.1) > [DS] Container report fails for distributed containers >

[jira] [Commented] (YARN-8301) Yarn Service Upgrade: Add documentation

2018-07-31 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564255#comment-16564255 ] Wangda Tan commented on YARN-8301: -- Committed to branch-3.1.1, thanks [~csingh]/[~eyang].
