[jira] [Commented] (YARN-10057) Upgrade the dependencies managed by yarnpkg
[ https://issues.apache.org/jira/browse/YARN-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17003079#comment-17003079 ]

Hudson commented on YARN-10057:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17791 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17791/])
YARN-10057. Upgrade the dependencies managed by yarnpkg. (#1780) (GitHub: rev 40887c9b12a41e846cbe5cfe7ee050461442ebe1)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/yarn.lock

> Upgrade the dependencies managed by yarnpkg
> -------------------------------------------
>
>                 Key: YARN-10057
>                 URL: https://issues.apache.org/jira/browse/YARN-10057
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: build, yarn-ui-v2
>            Reporter: Akira Ajisaka
>            Assignee: Akira Ajisaka
>            Priority: Major
>             Fix For: 3.3.0
>
> Run "yarn upgrade" to update the dependencies managed by yarnpkg.
> Dependabot automatically created the following pull requests and this issue
> is to close them.
> * https://github.com/apache/hadoop/pull/1741
> * https://github.com/apache/hadoop/pull/1742
> * https://github.com/apache/hadoop/pull/1743
> * https://github.com/apache/hadoop/pull/1744

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-10057) Upgrade the dependencies managed by yarnpkg
[ https://issues.apache.org/jira/browse/YARN-10057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka resolved YARN-10057.
----------------------------------
    Fix Version/s: 3.3.0
       Resolution: Fixed

Merged the PR into trunk.

> Upgrade the dependencies managed by yarnpkg
> -------------------------------------------
>
>                 Key: YARN-10057
>                 URL: https://issues.apache.org/jira/browse/YARN-10057
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: build, yarn-ui-v2
>            Reporter: Akira Ajisaka
>            Assignee: Akira Ajisaka
>            Priority: Major
>             Fix For: 3.3.0
>
> Run "yarn upgrade" to update the dependencies managed by yarnpkg.
> Dependabot automatically created the following pull requests and this issue
> is to close them.
> * https://github.com/apache/hadoop/pull/1741
> * https://github.com/apache/hadoop/pull/1742
> * https://github.com/apache/hadoop/pull/1743
> * https://github.com/apache/hadoop/pull/1744
[jira] [Updated] (YARN-10059) Final states of failed-to-localize containers are not recorded in NM state store
[ https://issues.apache.org/jira/browse/YARN-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-10059:
----------------------------
    Attachment: YARN-10059.001.patch

> Final states of failed-to-localize containers are not recorded in NM state store
> --------------------------------------------------------------------------------
>
>                 Key: YARN-10059
>                 URL: https://issues.apache.org/jira/browse/YARN-10059
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Major
>         Attachments: YARN-10059.001.patch
>
> We recently found an issue where, after an NM restart, localizers were launched
> for many already-completed containers and exhausted the memory/CPU of that
> machine. These containers had all failed and completed while localizing on a
> non-existent local directory (itself caused by a separate problem), but their
> final states were never recorded in the NM state store.
> The process flow of a failed-to-localize container is as follows:
> {noformat}
> ResourceLocalizationService$LocalizerRunner#run
>   -> ContainerImpl$ResourceFailedTransition#transition
>        handles LOCALIZING -> LOCALIZATION_FAILED upon RESOURCE_FAILED,
>        dispatches LocalizationEventType.CLEANUP_CONTAINER_RESOURCES
>   -> ResourceLocalizationService#handleCleanupContainerResources
>        handles CLEANUP_CONTAINER_RESOURCES,
>        dispatches ContainerEventType.CONTAINER_RESOURCES_CLEANEDUP
>   -> ContainerImpl$LocalizationFailedToDoneTransition#transition
>        handles LOCALIZATION_FAILED -> DONE upon CONTAINER_RESOURCES_CLEANEDUP
> {noformat}
> There is currently no state-store update anywhere in this flow; one is
> required to avoid unnecessary localizations after the NM restarts.
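The flow quoted in the issue can be modeled with a minimal, self-contained sketch. This is not the real NM code: `StateStore` and `storeContainerCompleted` are hypothetical stand-ins for the NM state-store API, used only to show where a final-state record could be persisted during the LOCALIZATION_FAILED -> DONE transition (the step the issue says is missing today).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the failed-to-localize container flow described above.
public class LocalizationFailureFlow {

    enum State { LOCALIZING, LOCALIZATION_FAILED, DONE }

    // Hypothetical stand-in for NMStateStoreService: records final states.
    static class StateStore {
        final List<String> records = new ArrayList<>();
        void storeContainerCompleted(String containerId, State finalState) {
            records.add(containerId + "=" + finalState);
        }
    }

    State state = State.LOCALIZING;
    final String containerId;
    final StateStore store;

    LocalizationFailureFlow(String containerId, StateStore store) {
        this.containerId = containerId;
        this.store = store;
    }

    // RESOURCE_FAILED event: LOCALIZING -> LOCALIZATION_FAILED
    void onResourceFailed() {
        if (state == State.LOCALIZING) {
            state = State.LOCALIZATION_FAILED;
        }
    }

    // CONTAINER_RESOURCES_CLEANEDUP event: LOCALIZATION_FAILED -> DONE.
    // The point of the issue: persist the final state here so the NM does
    // not re-localize this container after a restart.
    void onResourcesCleanedUp() {
        if (state == State.LOCALIZATION_FAILED) {
            state = State.DONE;
            store.storeContainerCompleted(containerId, state);
        }
    }

    public static void main(String[] args) {
        StateStore store = new StateStore();
        LocalizationFailureFlow c =
            new LocalizationFailureFlow("container_1", store);
        c.onResourceFailed();
        c.onResourcesCleanedUp();
        System.out.println(store.records); // prints [container_1=DONE]
    }
}
```

Without the `storeContainerCompleted` call, a restarted NM would still see the container as pending and launch a fresh localizer for it, which matches the symptom reported above.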
[jira] [Updated] (YARN-10059) Final states of failed-to-localize containers are not recorded in NM state store
[ https://issues.apache.org/jira/browse/YARN-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-10059:
----------------------------
    Attachment: (was: YARN-10059.001.patch)

> Final states of failed-to-localize containers are not recorded in NM state store
> --------------------------------------------------------------------------------
>
>                 Key: YARN-10059
>                 URL: https://issues.apache.org/jira/browse/YARN-10059
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Major
>
> We recently found an issue where, after an NM restart, localizers were launched
> for many already-completed containers and exhausted the memory/CPU of that
> machine. These containers had all failed and completed while localizing on a
> non-existent local directory (itself caused by a separate problem), but their
> final states were never recorded in the NM state store.
> The process flow of a failed-to-localize container is as follows:
> {noformat}
> ResourceLocalizationService$LocalizerRunner#run
>   -> ContainerImpl$ResourceFailedTransition#transition
>        handles LOCALIZING -> LOCALIZATION_FAILED upon RESOURCE_FAILED,
>        dispatches LocalizationEventType.CLEANUP_CONTAINER_RESOURCES
>   -> ResourceLocalizationService#handleCleanupContainerResources
>        handles CLEANUP_CONTAINER_RESOURCES,
>        dispatches ContainerEventType.CONTAINER_RESOURCES_CLEANEDUP
>   -> ContainerImpl$LocalizationFailedToDoneTransition#transition
>        handles LOCALIZATION_FAILED -> DONE upon CONTAINER_RESOURCES_CLEANEDUP
> {noformat}
> There is currently no state-store update anywhere in this flow; one is
> required to avoid unnecessary localizations after the NM restarts.
[jira] [Updated] (YARN-10060) HistoryServer may recover too slowly since JobHistory init is slow when there are too many jobs
[ https://issues.apache.org/jira/browse/YARN-10060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhoukang updated YARN-10060:
----------------------------
    Description: 
As shown in the log below, it took more than 7 minutes before the service port started listening:

{code:java}
2019-12-24,20:01:37,272 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2019-12-24,20:01:47,354 INFO org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Initializing Existing Jobs...
2019-12-24,20:08:29,589 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server xxx. Will not attempt to authenticate using SASL (unknown error)
2019-12-24,20:08:29,589 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to xxx, initiating session
2019-12-24,20:08:29,590 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server xxx, sessionid = 0x66d1a13e596ddc9, negotiated timeout = 5000
2019-12-24,20:08:29,593 INFO org.apache.zookeeper.ZooKeeper: Session: 0x66d1a13e596ddc9 closed
2019-12-24,20:08:29,593 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2019-12-24,20:08:29,655 INFO org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage: CachedHistoryStorage Init
2019-12-24,20:08:29,681 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2019-12-24,20:08:29,715 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2019-12-24,20:08:29,800 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2019-12-24,20:08:29,943 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2019-12-24,20:08:29,943 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: JobHistoryServer metrics system started
2019-12-24,20:08:29,950 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2019-12-24,20:08:29,951 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
2019-12-24,20:08:29,952 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2019-12-24,20:08:30,015 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.jobhistory is not defined
2019-12-24,20:08:30,025 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2019-12-24,20:08:30,027 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context jobhistory
2019-12-24,20:08:30,027 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2019-12-24,20:08:30,030 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /jobhistory/*
2019-12-24,20:08:30,030 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2019-12-24,20:08:30,057 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 20901
2019-12-24,20:08:30,939 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app /jobhistory started at 20901
2019-12-24,20:08:31,177 INFO org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2019-12-24,20:08:31,187 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2019-12-24,20:08:31,187 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2019-12-24,20:08:31,189 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.HSClientProtocolPB to the server
2019-12-24,20:08:31,216 INFO org.apache.hadoop.mapreduce.v2.hs.HistoryClientService: Instantiated HistoryClientService at xxx
2019-12-24,20:08:31,344 INFO org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated log deletion started.
2019-12-24,20:08:31,690 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=xxx sessionTimeout=5000 watcher=org
{code}

The slow step is the synchronous scan of existing jobs in JobHistory#serviceInit (snippet truncated in the original message):

{code:java}
protected void serviceInit(Configuration conf) throws Exception {
  LOG.info("JobHistory Init");
  this.conf = conf;
  this.appID = ApplicationId.newInstance(0, 0);
  this.appAttemptID = RecordFactoryProvider.getRecordFactory(conf)
      .newRecordInstance(ApplicationAttemptId.class);
  moveThreadInterval = conf.getLong(
      JHAdminConfig.MR_HISTORY_MOVE_INTERVAL_MS,
      JHAdminConfig.DEFAULT_MR_HISTORY_MOVE_INTERVAL_MS);
  hsManager = createHistoryFileManager();
  hsManager.init(conf);
  try {
    hsManager.initExisting();
{code}
[jira] [Created] (YARN-10060) HistoryServer may recover too slowly since JobHistory init is slow when there are too many jobs
zhoukang created YARN-10060:
-------------------------------

             Summary: HistoryServer may recover too slowly since JobHistory init is slow when there are too many jobs
                 Key: YARN-10060
                 URL: https://issues.apache.org/jira/browse/YARN-10060
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: yarn
            Reporter: zhoukang
            Assignee: zhoukang

As shown in the log below, it took more than 7 minutes before the service port started listening:

{code:java}
2019-12-24,20:01:37,272 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2019-12-24,20:01:47,354 INFO org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager: Initializing Existing Jobs...
2019-12-24,20:08:29,589 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server zjy-hadoop-prc-ct07.bj/10.152.50.2:11000. Will not attempt to authenticate using SASL (unknown error)
2019-12-24,20:08:29,589 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to zjy-hadoop-prc-ct07.bj/10.152.50.2:11000, initiating session
2019-12-24,20:08:29,590 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server zjy-hadoop-prc-ct07.bj/10.152.50.2:11000, sessionid = 0x66d1a13e596ddc9, negotiated timeout = 5000
2019-12-24,20:08:29,593 INFO org.apache.zookeeper.ZooKeeper: Session: 0x66d1a13e596ddc9 closed
2019-12-24,20:08:29,593 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2019-12-24,20:08:29,655 INFO org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage: CachedHistoryStorage Init
2019-12-24,20:08:29,681 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2019-12-24,20:08:29,715 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2019-12-24,20:08:29,800 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2019-12-24,20:08:29,943 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2019-12-24,20:08:29,943 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: JobHistoryServer metrics system started
2019-12-24,20:08:29,950 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2019-12-24,20:08:29,951 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
2019-12-24,20:08:29,952 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
2019-12-24,20:08:30,015 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.jobhistory is not defined
2019-12-24,20:08:30,025 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2019-12-24,20:08:30,027 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context jobhistory
2019-12-24,20:08:30,027 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2019-12-24,20:08:30,030 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /jobhistory/*
2019-12-24,20:08:30,030 INFO org.apache.hadoop.http.HttpServer2: adding path spec: /ws/*
2019-12-24,20:08:30,057 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 20901
2019-12-24,20:08:30,939 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app /jobhistory started at 20901
2019-12-24,20:08:31,177 INFO org.apache.hadoop.yarn.webapp.WebApps: Registered webapp guice modules
2019-12-24,20:08:31,187 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2019-12-24,20:08:31,187 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2019-12-24,20:08:31,189 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.mapreduce.v2.api.HSClientProtocolPB to the server
2019-12-24,20:08:31,216 INFO org.apache.hadoop.mapreduce.v2.hs.HistoryClientService: Instantiated HistoryClientService at zjy-hadoop-prc-ct11.bj/10.152.50.42:20900
2019-12-24,20:08:31,344 INFO org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated log deletion started.
2019-12-24,20:08:31,690 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=zjyprc.observer.zk.hadoop.srv:11000 sessionTimeout=5000 watcher=org
{code}

The slow step is the synchronous scan of existing jobs in JobHistory#serviceInit (snippet truncated in the original message):

{code:java}
protected void serviceInit(Configuration conf) throws Exception {
  LOG.info("JobHistory Init");
  this.conf = conf;
  this.appID =
{code}
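One reading of the log above: the HistoryFileManager scan ("Initializing Existing Jobs...") runs synchronously inside serviceInit, so Jetty and the RPC endpoints cannot bind until the scan of every existing job completes. A minimal, self-contained sketch of the improvement the issue implies, assuming the scan can safely run on a background thread (the class and method names here are hypothetical stand-ins, not the real Hadoop classes):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: bind service ports first, then run the expensive
// "initialize existing jobs" scan asynchronously instead of blocking startup.
public class DeferredInitSketch {

    private final ExecutorService initPool = Executors.newSingleThreadExecutor();
    private final CountDownLatch scanDone = new CountDownLatch(1);
    volatile boolean portBound = false;

    void serviceStart() {
        // Bind listeners immediately (stand-in for Jetty/RPC startup).
        portBound = true;
        // Kick off the slow history scan on a background thread.
        initPool.submit(() -> {
            scanExistingJobs();  // was hsManager.initExisting() in the quoted code
            scanDone.countDown();
        });
    }

    void scanExistingJobs() {
        // Placeholder for the multi-minute HistoryFileManager scan.
    }

    boolean awaitScan(long timeout, TimeUnit unit) throws InterruptedException {
        return scanDone.await(timeout, unit);
    }

    void stop() {
        initPool.shutdown();
    }

    public static void main(String[] args) throws Exception {
        DeferredInitSketch s = new DeferredInitSketch();
        s.serviceStart();
        // The port is available right away, even if the scan is still running.
        System.out.println(s.portBound); // prints true
        s.awaitScan(5, TimeUnit.SECONDS);
        s.stop();
    }
}
```

The trade-off, which a real patch would have to handle, is that history queries arriving before the scan finishes would see an incomplete job list and need to either block or return a "still recovering" response.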
[jira] [Commented] (YARN-10058) Capacity Scheduler dispatcher hang when async thread crash
[ https://issues.apache.org/jira/browse/YARN-10058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002743#comment-17002743 ]

Hadoop QA commented on YARN-10058:
----------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec | 20m 11s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 17m 24s | trunk passed |
| +1 | compile | 0m 45s | trunk passed |
| +1 | checkstyle | 0m 38s | trunk passed |
| +1 | mvnsite | 0m 46s | trunk passed |
| +1 | shadedclient | 13m 11s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 11s | trunk passed |
| +1 | javadoc | 0m 34s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 42s | the patch passed |
| +1 | compile | 0m 38s | the patch passed |
| +1 | javac | 0m 38s | the patch passed |
| +1 | checkstyle | 0m 30s | the patch passed |
| +1 | mvnsite | 0m 39s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 43s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 17s | the patch passed |
| +1 | javadoc | 0m 29s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 81m 47s | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 | asflicense | 0m 30s | The patch does not generate ASF License warnings. |
|    |  | 153m 49s |  |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-10058 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12989410/0001-global-scheduling-standby-hang.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 2a032651e6a5 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 34ff7db |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25316/testReport/ |
| Max. process+thread count | 874 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25316/console |
| Powered by