[jira] [Commented] (YARN-10601) The Yarn client should use the UGI who created the Yarn client for obtaining a delegation token for the remote log dir
[ https://issues.apache.org/jira/browse/YARN-10601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276106#comment-17276106 ]

Prabhu Joseph commented on YARN-10601:
--------------------------------------

[~fritsi] Thanks for the details.

>> As you can see submitApplication is not invoked inside an ugi.doAs block

Why is submitApplication not invoked inside a ugi.doAs block? If log aggregation has to happen as the submitterUser, the job also has to be submitted by the submitterUser, right?

> The Yarn client should use the UGI who created the Yarn client for obtaining
> a delegation token for the remote log dir
> ----------------------------------------------------------------------------
>
>                 Key: YARN-10601
>                 URL: https://issues.apache.org/jira/browse/YARN-10601
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: log-aggregation
>    Affects Versions: 3.3.0, 3.4.0
>            Reporter: Daniel Fritsi
>            Priority: Critical
>
> It seems there was a bug introduced in YARN-10333 in this section of
> {{addLogAggregationDelegationToken}}:
> {code:java}
> Path remoteRootLogDir = fileController.getRemoteRootLogDir();
> FileSystem fs = remoteRootLogDir.getFileSystem(conf);
> final org.apache.hadoop.security.token.Token<?>[] finalTokens =
>     fs.addDelegationTokens(masterPrincipal, credentials);
> {code}
> {{remoteRootLogDir.getFileSystem}} simply does this:
> {code:java}
> public FileSystem getFileSystem(Configuration conf) throws IOException {
>   return FileSystem.get(this.toUri(), conf);
> }
> {code}
> As far as I know, it's customary to create a YarnClient instance via
> {{YarnClient.createYarnClient()}} in a UserGroupInformation.doAs block if
> you would like to use it with a different user than the current one. E.g.:
> {code:java}
> YarnClient yarnClient = ugi.doAs(new PrivilegedExceptionAction<YarnClient>() {
>   @Override
>   public YarnClient run() throws Exception {
>     YarnClient yarnClient = YarnClient.createYarnClient();
>     yarnClient.init(conf);
>     yarnClient.start();
>     return yarnClient;
>   }
> });
> {code}
> If this statement is correct, then I think YarnClient should save the
> {{UserGroupInformation.getCurrentUser()}} when the YarnClient is being
> created, and the {{remoteRootLogDir.getFileSystem(conf)}} call should be
> made inside a ugi.doAs block with that saved user.
> A more concrete example:
> {code:java}
> public YarnClient createYarnClient(UserGroupInformation ugi, Configuration conf)
>     throws Exception {
>   return ugi.doAs((PrivilegedExceptionAction<YarnClient>) () -> {
>     // Here I am the submitterUser (see below)
>     YarnClient yarnClient = YarnClient.createYarnClient();
>     yarnClient.init(conf);
>     yarnClient.start();
>     return yarnClient;
>   });
> }
>
> public void run() throws Exception {
>   // Here I am the serviceUser
>   // ...
>   Configuration conf = ...
>   // ...
>   UserGroupInformation ugi = getSubmitterUser();
>   // ...
>   YarnClient yarnClient = createYarnClient(ugi, conf);
>   // ...
>   ApplicationSubmissionContext context = ...
>   // ...
>   yarnClient.submitApplication(context);
> }
> {code}
> As you can see, {{submitApplication}} is not invoked inside a ugi.doAs
> block, and submitApplication is the method that eventually invokes
> {{addLogAggregationDelegationToken}}. That's why we need to save the UGI
> during the YarnClient creation and create the FileSystem instance inside a
> ugi.doAs with that saved user. Otherwise Yarn will try to get a delegation
> token with an incorrect user (serviceUser) instead of the submitterUser.
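To make the proposal concrete, here is a minimal, self-contained sketch of the "save the creating user, then re-enter its context later" idea. It deliberately does not use Hadoop: ModelUser is a hypothetical stand-in for UserGroupInformation (a thread-local current user plus a doAs method), and ModelClient is a hypothetical stand-in for YarnClient. It only illustrates why a call made outside doAs sees the serviceUser unless the client saved its creator; it is not the actual fix.

```java
import java.util.concurrent.Callable;

/** Hypothetical stand-in for a user identity; NOT Hadoop's UserGroupInformation. */
final class ModelUser {
    // Every thread starts out as "serviceUser", mirroring the example in the issue.
    private static final ThreadLocal<ModelUser> CURRENT =
        ThreadLocal.withInitial(() -> new ModelUser("serviceUser"));

    private final String name;

    ModelUser(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    static ModelUser getCurrentUser() {
        return CURRENT.get();
    }

    /** Runs the action with this user as the current user, then restores the previous user. */
    <T> T doAs(Callable<T> action) throws Exception {
        ModelUser previous = CURRENT.get();
        CURRENT.set(this);
        try {
            return action.call();
        } finally {
            CURRENT.set(previous);
        }
    }
}

/** Hypothetical stand-in for a client that remembers who created it. */
final class ModelClient {
    // Captured once, at construction time.
    private final ModelUser creator = ModelUser.getCurrentUser();

    /** Behaviour described in the issue: the token is requested as whoever calls submit. */
    String tokenOwnerWithoutSavedUser() {
        return ModelUser.getCurrentUser().getName();
    }

    /** Proposed behaviour: re-enter the creator's context before requesting the token. */
    String tokenOwnerWithSavedUser() throws Exception {
        return creator.doAs(() -> ModelUser.getCurrentUser().getName());
    }
}

final class SavedUserDemo {
    public static void main(String[] args) throws Exception {
        ModelUser submitter = new ModelUser("submitterUser");
        // The client is created inside doAs, as in the issue description.
        ModelClient client = submitter.doAs(ModelClient::new);
        // Both calls below happen outside doAs, i.e. as the serviceUser.
        System.out.println(client.tokenOwnerWithoutSavedUser()); // serviceUser
        System.out.println(client.tokenOwnerWithSavedUser());    // submitterUser
    }
}
```

In this model the caller never has to wrap the later call in doAs itself, which matches the reporter's point that submitApplication is typically called outside the block.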
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10603) Failed to reinitialize for recovered container
[ https://issues.apache.org/jira/browse/YARN-10603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276087#comment-17276087 ]

kyungwan nam edited comment on YARN-10603 at 2/1/21, 6:33 AM:
--------------------------------------------------------------

I've attached a patch. This patch works well in our cluster.
Please review and comment.
Thanks.

was (Author: kyungwan nam):
I've attached a patch.
Please review and comment.
Thanks

> Failed to reinitialize for recovered container
> ----------------------------------------------
>
>                 Key: YARN-10603
>                 URL: https://issues.apache.org/jira/browse/YARN-10603
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: kyungwan nam
>            Assignee: kyungwan nam
>            Priority: Major
>         Attachments: YARN-10603.001.patch
>
>
> A container reinitialization request does not work after the NM is restarted.
> I found the following problems:
> - When a recovered container is terminated, the container exits, because
> termination always raises either CONTAINER_EXITED_WITH_FAILURE or
> CONTAINER_EXITED_WITH_SUCCESS.
> - The container's *recoveredStatus* is set at the time of NM recovery and is
> never changed, even after the container is terminated.
> As a result, the newly reinitialized container is launched as a recovered
> container, which does not work.
[jira] [Updated] (YARN-10603) Failed to reinitialize for recovered container
[ https://issues.apache.org/jira/browse/YARN-10603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kyungwan nam updated YARN-10603:
--------------------------------
    Attachment: YARN-10603.001.patch

> Failed to reinitialize for recovered container
> ----------------------------------------------
>
>                 Key: YARN-10603
>                 URL: https://issues.apache.org/jira/browse/YARN-10603
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: kyungwan nam
>            Assignee: kyungwan nam
>            Priority: Major
>         Attachments: YARN-10603.001.patch
>
>
> A container reinitialization request does not work after the NM is restarted.
> I found the following problems:
> - When a recovered container is terminated, the container exits, because
> termination always raises either CONTAINER_EXITED_WITH_FAILURE or
> CONTAINER_EXITED_WITH_SUCCESS.
> - The container's *recoveredStatus* is set at the time of NM recovery and is
> never changed, even after the container is terminated.
> As a result, the newly reinitialized container is launched as a recovered
> container, which does not work.
[jira] [Created] (YARN-10603) Failed to reinitialize for recovered container
kyungwan nam created YARN-10603:
-----------------------------------

             Summary: Failed to reinitialize for recovered container
                 Key: YARN-10603
                 URL: https://issues.apache.org/jira/browse/YARN-10603
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: kyungwan nam
            Assignee: kyungwan nam

A container reinitialization request does not work after the NM is restarted.
I found the following problems:
- When a recovered container is terminated, the container exits, because termination always raises either CONTAINER_EXITED_WITH_FAILURE or CONTAINER_EXITED_WITH_SUCCESS.
- The container's *recoveredStatus* is set at the time of NM recovery and is never changed, even after the container is terminated.
As a result, the newly reinitialized container is launched as a recovered container, which does not work.
[jira] [Commented] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275871#comment-17275871 ]

Hadoop QA commented on YARN-10532:
----------------------------------

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 18s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 1s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 22m 39s | | trunk passed |
| +1 | compile | 0m 59s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 | compile | 0m 50s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 | checkstyle | 0m 50s | | trunk passed |
| +1 | mvnsite | 0m 54s | | trunk passed |
| +1 | shadedclient | 17m 12s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 40s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 0m 37s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| 0 | spotbugs | 1m 47s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| -1 | findbugs | 1m 45s | https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/564/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-warnings.html | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in trunk has 1 extant findbugs warnings. |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 0m 49s | | the patch passed |
| +1 | compile | 0m 54s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 |
| +1 | javac | 0m 54s | | the patch passed |
| +1 | compile | 0m 46s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
| +1 | javac | 0m 46s | | the patch passed |
| -0 | checkstyle | 0m 45s | https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/564/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 19 new + 305 unchanged - 0 fixed = 324 total (was 305) |
| +1 | mvnsite | 0m 48s | | the patch passed |
| +1 | whitespace | 0m 0s |
[jira] [Commented] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17275832#comment-17275832 ]

zhuqi commented on YARN-10532:
------------------------------

[~wangda] [~gandras]

I have uploaded a new patch to make the code clearer.

I have created a new policy, as [~wangda] suggested: "(maybe we can make it runnable by default so we don't have to create another config)".

The policy simply monitors each queue's last used time and deletes queues when needed. We can enable it by adding AutoDeletionForExpiredQueuePolicy to the conf "scheduler.monitor.policies".

I also handled deletion of ParentQueues that have no child queues.

And I removed the reinitialize-related logic; I think we don't need it when auto deletion is enabled by default.

Let me know if you have any other thoughts.

Thanks.

> Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is
> not being used
> -----------------------------------------------------------------------------
>
>                 Key: YARN-10532
>                 URL: https://issues.apache.org/jira/browse/YARN-10532
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: zhuqi
>            Priority: Major
>         Attachments: YARN-10532.001.patch, YARN-10532.002.patch,
> YARN-10532.003.patch, YARN-10532.004.patch, YARN-10532.005.patch,
> YARN-10532.006.patch, YARN-10532.007.patch, YARN-10532.008.patch
>
>
> It's better if we can delete auto-created queues when they are not in use for
> a period of time (like 5 mins). It will be helpful when we have a large
> number of auto-created queues (e.g. from 500 users), but only a small subset
> of queues are actively used.
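The policy described in this comment (track each queue's last used time, delete queues that have been idle too long) can be sketched roughly as follows. This is a toy model under stated assumptions, not the code from the attached patch: ExpiredQueueSweeper, markUsed, sweep, and exists are hypothetical names, time is passed in explicitly as milliseconds, and parent-queue cleanup is not modeled. The 5-minute expiry in the usage note comes from the issue description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of an expiry policy; these are NOT the scheduler classes from the patch. */
final class ExpiredQueueSweeper {
    private final long expiryMillis;
    // Queue path -> timestamp (ms) of the last time the queue was used.
    // A TreeMap keeps the iteration order, and therefore the result of sweep, deterministic.
    private final Map<String, Long> lastUsed = new TreeMap<>();

    ExpiredQueueSweeper(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    /** Records that the queue was used (for example, an application was submitted to it). */
    void markUsed(String queue, long nowMillis) {
        lastUsed.put(queue, nowMillis);
    }

    /** Removes every queue idle for at least expiryMillis and returns the removed paths. */
    List<String> sweep(long nowMillis) {
        List<String> deleted = new ArrayList<>();
        lastUsed.entrySet().removeIf(entry -> {
            boolean expired = nowMillis - entry.getValue() >= expiryMillis;
            if (expired) {
                deleted.add(entry.getKey());
            }
            return expired;
        });
        return deleted;
    }

    boolean exists(String queue) {
        return lastUsed.containsKey(queue);
    }
}
```

With a 5-minute expiry, a queue last used at minute 0 would be removed by a sweep at minute 6, while a queue last used at minute 4 would survive it. A periodic monitor would call sweep on a timer; how the real policy is scheduled and configured is defined by the patch, not by this sketch.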
[jira] [Updated] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhuqi updated YARN-10532:
-------------------------
    Attachment: YARN-10532.008.patch

> Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is
> not being used
> -----------------------------------------------------------------------------
>
>                 Key: YARN-10532
>                 URL: https://issues.apache.org/jira/browse/YARN-10532
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: zhuqi
>            Priority: Major
>         Attachments: YARN-10532.001.patch, YARN-10532.002.patch,
> YARN-10532.003.patch, YARN-10532.004.patch, YARN-10532.005.patch,
> YARN-10532.006.patch, YARN-10532.007.patch, YARN-10532.008.patch
>
>
> It's better if we can delete auto-created queues when they are not in use for
> a period of time (like 5 mins). It will be helpful when we have a large
> number of auto-created queues (e.g. from 500 users), but only a small subset
> of queues are actively used.
[jira] [Updated] (YARN-10602) YRAN job's state is FINISHED,the FinalStatus is UNDEFINED
[ https://issues.apache.org/jira/browse/YARN-10602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

欧自力 updated YARN-10602:
------------------------
    Issue Type: Improvement  (was: Bug)

> YRAN job's state is FINISHED,the FinalStatus is UNDEFINED
> ---------------------------------------------------------
>
>                 Key: YARN-10602
>                 URL: https://issues.apache.org/jira/browse/YARN-10602
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: api, resourcemanager, restapi
>    Affects Versions: 3.1.1
>         Environment: ||ResourceManager version:|3.1.1.3.1.5.0-152 |
> ||Hadoop version:|3.1.1.3.1.5.0-152|
>            Reporter: 欧自力
>            Priority: Major
>              Labels: patch
>         Attachments: UNDEFINED.png, UNDEFINED.txt
>
>
> When a Tez task finishes, the YARN API reports the state as FINISHED but the
> FinalStatus as UNDEFINED. Has anyone else had this problem?
> Please take a look. This is what I get when I query the status through
> http://rm:8088/ws/v1/cluster/apps/application_1612017156073_24137
> application_1612017156073_24137
> datadev
> HIVE-08babb5c-0a46-45db-892f-67aae26c4b57
> common
> FINISHED
> UNDEFINED
> 100.0
> History
> [http://wx12-dsj-master002:8088/proxy/application_1612017156073_24137/]
> Session stats:submittedDAGs=1, successfulDAGs=1, failedDAGs=0, killedDAGs=0
> 1612017156073
> TEZ
> hive_20210131142041_6adab368-2ffe-4469-ad96-58918b8f80a0,userid=datadev
> 0
> 1612074042309
> 1612074064373
> 22064
[jira] [Updated] (YARN-10602) YRAN job's state is FINISHED,the FinalStatus is UNDEFINED
[ https://issues.apache.org/jira/browse/YARN-10602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

欧自力 updated YARN-10602:
------------------------
    Remaining Estimate:     (was: 1h)
     Original Estimate:     (was: 1h)

> YRAN job's state is FINISHED,the FinalStatus is UNDEFINED
> ---------------------------------------------------------
>
>                 Key: YARN-10602
>                 URL: https://issues.apache.org/jira/browse/YARN-10602
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: api, resourcemanager, restapi
>    Affects Versions: 3.1.1
>         Environment: ||ResourceManager version:|3.1.1.3.1.5.0-152 |
> ||Hadoop version:|3.1.1.3.1.5.0-152|
>            Reporter: 欧自力
>            Priority: Major
>              Labels: patch
>         Attachments: UNDEFINED.png, UNDEFINED.txt
>
>
> When a Tez task finishes, the YARN API reports the state as FINISHED but the
> FinalStatus as UNDEFINED. Has anyone else had this problem?
> Please take a look. This is what I get when I query the status through
> http://rm:8088/ws/v1/cluster/apps/application_1612017156073_24137
> application_1612017156073_24137
> datadev
> HIVE-08babb5c-0a46-45db-892f-67aae26c4b57
> common
> FINISHED
> UNDEFINED
> 100.0
> History
> [http://wx12-dsj-master002:8088/proxy/application_1612017156073_24137/]
> Session stats:submittedDAGs=1, successfulDAGs=1, failedDAGs=0, killedDAGs=0
> 1612017156073
> TEZ
> hive_20210131142041_6adab368-2ffe-4469-ad96-58918b8f80a0,userid=datadev
> 0
> 1612074042309
> 1612074064373
> 22064