[jira] [Commented] (YARN-10178) Global Scheduler async thread crash caused by 'Comparison method violates its general contract
[ https://issues.apache.org/jira/browse/YARN-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17463003#comment-17463003 ] Andras Gyori commented on YARN-10178: - Thank you [~epayne] for the thorough review, this is indeed convincing. Shall I ask someone to check it out and commit this? > Global Scheduler async thread crash caused by 'Comparison method violates its > general contract > -- > > Key: YARN-10178 > URL: https://issues.apache.org/jira/browse/YARN-10178 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.2.1 >Reporter: tuyu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10178.001.patch, YARN-10178.002.patch, > YARN-10178.003.patch, YARN-10178.004.patch, YARN-10178.005.patch, > YARN-10178.006.patch, YARN-10178.branch-2.10.001.patch > > > Global Scheduler Async Thread crash stack > {code:java} > ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received > RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, > Thread-6066574, that exited unexpectedly: java.lang.IllegalArgumentException: > Comparison method violates its general contract! >at > java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1462) > at java.util.Collections.sort(Collections.java:177) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.policy.PriorityUtilizationQueueOrderingPolicy.getAssignmentIterator(PriorityUtilizationQueueOrderingPolicy.java:221) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.sortAndGetChildrenAllocationIterator(ParentQueue.java:777) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:791) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:623) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1635) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1629) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1732) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1481) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:569) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:616) > {code} > JAVA 8 Arrays.sort default use timsort algo, and timsort has few require > {code:java} > 1.x.compareTo(y) != y.compareTo(x) > 2.x>y,y>z --> x > z > 3.x=y, x.compareTo(z) == y.compareTo(z) > {code} > if not Arrays paramters not satify this require,TimSort will throw > 'java.lang.IllegalArgumentException' > look at PriorityUtilizationQueueOrderingPolicy.compare function,we will know > Capacity Scheduler use this these queue resource usage to compare > {code:java} > AbsoluteUsedCapacity > UsedCapacity > ConfiguredMinResource > AbsoluteCapacity > {code} > In Capacity Scheduler Global Scheduler AsyncThread use > PriorityUtilizationQueueOrderingPolicy function to choose queue to assign > container,and construct a CSAssignment struct, and use > submitResourceCommitRequest function add CSAssignment to backlogs > ResourceCommitterService will tryCommit this CSAssignment,look tryCommit > function,there will update queue resource usage > {code:java} > public boolean tryCommit(Resource cluster, ResourceCommitRequest r, > boolean updatePending) { > long commitStart = System.nanoTime(); > ResourceCommitRequest request = > (ResourceCommitRequest) r; > > ... > boolean isSuccess = false; > if (attemptId != null) { > FiCaSchedulerApp app = getApplicationAttempt(attemptId); > // Required sanity check for attemptId - when async-scheduling enabled, > // proposal might be outdated if AM failover just finished > // and proposal queue was not be consumed in time > if (app != null &&
[jira] [Commented] (YARN-10178) Global Scheduler async thread crash caused by 'Comparison method violates its general contract
[ https://issues.apache.org/jira/browse/YARN-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17462883#comment-17462883 ] Eric Payne commented on YARN-10178: --- No significant difference between most of the performance test suites in TestCapacitySchedulerPerf when compared between with and without this patch. One suite, however, showed _improvement_ with the patch: With 10 threads enabled, 200 apps on 100% of 50 queues, this test showed an average 2x improvement after applying this patch. Since there doesn't seem to be any other noticeable differences among the other test suites, I give my +1. Thank you [~tuyu], [~zhuqi], [~gandras], and others for your good work on this issue. This is a vital fix for a serious flaw. > Global Scheduler async thread crash caused by 'Comparison method violates its > general contract > -- > > Key: YARN-10178 > URL: https://issues.apache.org/jira/browse/YARN-10178 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.2.1 >Reporter: tuyu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10178.001.patch, YARN-10178.002.patch, > YARN-10178.003.patch, YARN-10178.004.patch, YARN-10178.005.patch, > YARN-10178.006.patch, YARN-10178.branch-2.10.001.patch > > > Global Scheduler Async Thread crash stack > {code:java} > ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received > RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, > Thread-6066574, that exited unexpectedly: java.lang.IllegalArgumentException: > Comparison method violates its general contract! >at > java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1462) > at java.util.Collections.sort(Collections.java:177) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.policy.PriorityUtilizationQueueOrderingPolicy.getAssignmentIterator(PriorityUtilizationQueueOrderingPolicy.java:221) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.sortAndGetChildrenAllocationIterator(ParentQueue.java:777) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:791) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:623) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1635) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1629) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1732) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1481) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:569) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:616) > {code} > JAVA 8 Arrays.sort default use timsort algo, and timsort has few require > {code:java} > 1.x.compareTo(y) != y.compareTo(x) > 2.x>y,y>z --> x > z > 3.x=y, x.compareTo(z) == y.compareTo(z) > {code} > if not Arrays paramters not satify this require,TimSort will throw > 'java.lang.IllegalArgumentException' > look at PriorityUtilizationQueueOrderingPolicy.compare function,we will know > Capacity Scheduler use this these queue resource usage to compare > {code:java} > AbsoluteUsedCapacity > UsedCapacity > ConfiguredMinResource > AbsoluteCapacity > {code} > In Capacity Scheduler Global Scheduler AsyncThread use > PriorityUtilizationQueueOrderingPolicy function to choose queue to assign > container,and construct a CSAssignment struct, and use > submitResourceCommitRequest function add CSAssignment to backlogs > ResourceCommitterService will tryCommit this CSAssignment,look tryCommit > function,there will update queue resource usage > {code:java} > public boolean tryCommit(Resource cluster, ResourceCommitRequest r, > boolean updatePending) { > long commitStart =
[jira] [Commented] (YARN-10178) Global Scheduler async thread crash caused by 'Comparison method violates its general contract
[ https://issues.apache.org/jira/browse/YARN-10178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17462835#comment-17462835 ] Eric Payne commented on YARN-10178: --- [~gandras], I have reviewed the trunk and branch-2.10 patches. The code changes look good to me. I have manually tested the branch-2.10 patch multiple times in a 100-node cluster and found it to be stable with no thread crashes. I am currently running the TestCapacitySchedulerPerf performance tests with and without the patch, with and without threading enabled. I will post the results later today. > Global Scheduler async thread crash caused by 'Comparison method violates its > general contract > -- > > Key: YARN-10178 > URL: https://issues.apache.org/jira/browse/YARN-10178 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.2.1 >Reporter: tuyu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10178.001.patch, YARN-10178.002.patch, > YARN-10178.003.patch, YARN-10178.004.patch, YARN-10178.005.patch, > YARN-10178.006.patch, YARN-10178.branch-2.10.001.patch > > > Global Scheduler Async Thread crash stack > {code:java} > ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received > RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, > Thread-6066574, that exited unexpectedly: java.lang.IllegalArgumentException: > Comparison method violates its general contract! >at > java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1462) > at java.util.Collections.sort(Collections.java:177) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.policy.PriorityUtilizationQueueOrderingPolicy.getAssignmentIterator(PriorityUtilizationQueueOrderingPolicy.java:221) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.sortAndGetChildrenAllocationIterator(ParentQueue.java:777) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:791) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:623) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1635) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1629) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1732) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1481) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:569) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:616) > {code} > JAVA 8 Arrays.sort default use timsort algo, and timsort has few require > {code:java} > 1.x.compareTo(y) != y.compareTo(x) > 2.x>y,y>z --> x > z > 3.x=y, x.compareTo(z) == y.compareTo(z) > {code} > if not Arrays paramters not satify this require,TimSort will throw > 'java.lang.IllegalArgumentException' > look at PriorityUtilizationQueueOrderingPolicy.compare function,we will know > Capacity Scheduler use this these queue resource usage to compare > {code:java} > AbsoluteUsedCapacity > UsedCapacity > ConfiguredMinResource > AbsoluteCapacity > {code} > In Capacity Scheduler Global Scheduler AsyncThread use > PriorityUtilizationQueueOrderingPolicy function to choose queue to assign > container,and construct a CSAssignment struct, and use > submitResourceCommitRequest function add CSAssignment to backlogs > ResourceCommitterService will tryCommit this CSAssignment,look tryCommit > function,there will update queue resource usage > {code:java} > public boolean tryCommit(Resource cluster, ResourceCommitRequest r, > boolean updatePending) { > long commitStart = System.nanoTime(); > ResourceCommitRequest request = > (ResourceCommitRequest) r; > > ... > boolean isSuccess = false; > if (attemptId != null) { >
[jira] [Updated] (YARN-10503) Support queue capacity in terms of absolute resources with custom resourceType.
[ https://issues.apache.org/jira/browse/YARN-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-10503: -- Fix Version/s: 3.2.4 > Support queue capacity in terms of absolute resources with custom > resourceType. > --- > > Key: YARN-10503 > URL: https://issues.apache.org/jira/browse/YARN-10503 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Labels: pull-request-available > Fix For: 3.4.0, 3.3.1, 3.2.4 > > Attachments: YARN-10503-branch-3.3.010.patch, YARN-10503.001.patch, > YARN-10503.002.patch, YARN-10503.003.patch, YARN-10503.004.patch, > YARN-10503.005.patch, YARN-10503.006.patch, YARN-10503.007.patch, > YARN-10503.008.patch, YARN-10503.009.patch, YARN-10503.010.patch > > Time Spent: 1h > Remaining Estimate: 0h > > Now the absolute resources are memory and cores. > {code:java} > /** > * Different resource types supported. > */ > public enum AbsoluteResourceType { > MEMORY, VCORES; > }{code} > But in our GPU production clusters, we need to support more resourceTypes. > It's very import for cluster scaling when with different resourceType > absolute demands. > > This Jira will handle GPU first. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11025) Implement distributed decommissioning
[ https://issues.apache.org/jira/browse/YARN-11025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minni Mittal updated YARN-11025: Description: This Jira proposes to add support for accepting request from disributed sources for putting nodes in decomissioning state. It proposes to add configurable provider and consumer class interface in nodemanager. NM can receive request to put node in decomissioning from any distributed sources using the provider class implementation and consumer classes can utilize it to set node status. Corresponding changes will be done at RM side to update node staate while update event is called at DecomissioningNodesWatcher. was: This Jira proposes to add support for acceoting request from disributed sources for putting nodes in decomissioning state. It proposes to add configurable provider and consumer class interface in nodemanager. NM can receive request to put node in decomissioning from any distributed sources using the provider class implementation and consumer classes can utilize it to set node status. Corresponding changes will be done at RM side to update node staate while update event is called at DecomissioningNodesWatcher. > Implement distributed decommissioning > - > > Key: YARN-11025 > URL: https://issues.apache.org/jira/browse/YARN-11025 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Minni Mittal >Assignee: Minni Mittal >Priority: Major > > This Jira proposes to add support for accepting request from disributed > sources for putting nodes in decomissioning state. > It proposes to add configurable provider and consumer class interface in > nodemanager. NM can receive request to put node in decomissioning from any > distributed sources using the provider class implementation and consumer > classes can utilize it to set node status. > Corresponding changes will be done at RM side to update node staate while > update event is called at DecomissioningNodesWatcher. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11025) Implement distributed decommissioning
[ https://issues.apache.org/jira/browse/YARN-11025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minni Mittal updated YARN-11025: Description: This Jira proposes to add support for acceoting request from disributed sources for putting nodes in decomissioning state. It proposes to add configurable provider and consumer class interface in nodemanager. NM can receive request to put node in decomissioning from any distributed sources using the provider class implementation and consumer classes can utilize it to set node status. Corresponding changes will be done at RM side to update node staate while update event is called at DecomissioningNodesWatcher. > Implement distributed decommissioning > - > > Key: YARN-11025 > URL: https://issues.apache.org/jira/browse/YARN-11025 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Minni Mittal >Assignee: Minni Mittal >Priority: Major > > This Jira proposes to add support for acceoting request from disributed > sources for putting nodes in decomissioning state. > It proposes to add configurable provider and consumer class interface in > nodemanager. NM can receive request to put node in decomissioning from any > distributed sources using the provider class implementation and consumer > classes can utilize it to set node status. > Corresponding changes will be done at RM side to update node staate while > update event is called at DecomissioningNodesWatcher. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11050) Typo in method name: RMWebServiceProtocol#removeFromCluserNodeLabels
[ https://issues.apache.org/jira/browse/YARN-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-11050: -- Fix Version/s: 3.4.0 > Typo in method name: RMWebServiceProtocol#removeFromCluserNodeLabels > > > Key: YARN-11050 > URL: https://issues.apache.org/jira/browse/YARN-11050 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Trivial > Labels: newbie, pull-request-available > Fix For: 3.4.0 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-11047) ResourceManager and NodeManager unable to connect to Hbase when ATSv2 is enabled
[ https://issues.apache.org/jira/browse/YARN-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang resolved YARN-11047. Resolution: Fixed > ResourceManager and NodeManager unable to connect to Hbase when ATSv2 is > enabled > > > Key: YARN-11047 > URL: https://issues.apache.org/jira/browse/YARN-11047 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Minni Mittal >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > yarn resourcemanager command has following issue: > 2021-12-14 19:26:33,345 WARN [pool-28-thread-1] > storage.TimelineStorageMonitor (TimelineStorageMonitor.java:run(95)) - Got > failure attempting to read from HBase, assuming Storage is down > java.lang.RuntimeException: org.apache.hadoop.hbase.DoNotRetryIOException: > java.lang.NoSuchMethodError: > [com.google.common.net|http://com.google.common.net/].HostAndPort.getHostText()Ljava/lang/String; > at > org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(AbstractClientScanner.java:95) > at > org.apache.hadoop.yarn.server.timelineservice.storage.reader.TimelineEntityReader.readEntities(TimelineEntityReader.java:283) > at > org.apache.hadoop.yarn.server.timelineservice.storage.HBaseStorageMonitor.healthCheck(HBaseStorageMonitor.java:77) > at > org.apache.hadoop.yarn.server.timelineservice.storage.TimelineStorageMonitor$MonitorThread.run(TimelineStorageMonitor.java:89) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11047) ResourceManager and NodeManager unable to connect to Hbase when ATSv2 is enabled
[ https://issues.apache.org/jira/browse/YARN-11047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated YARN-11047: --- Fix Version/s: 3.4.0 > ResourceManager and NodeManager unable to connect to Hbase when ATSv2 is > enabled > > > Key: YARN-11047 > URL: https://issues.apache.org/jira/browse/YARN-11047 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Minni Mittal >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > yarn resourcemanager command has following issue: > 2021-12-14 19:26:33,345 WARN [pool-28-thread-1] > storage.TimelineStorageMonitor (TimelineStorageMonitor.java:run(95)) - Got > failure attempting to read from HBase, assuming Storage is down > java.lang.RuntimeException: org.apache.hadoop.hbase.DoNotRetryIOException: > java.lang.NoSuchMethodError: > [com.google.common.net|http://com.google.common.net/].HostAndPort.getHostText()Ljava/lang/String; > at > org.apache.hadoop.hbase.client.AbstractClientScanner$1.hasNext(AbstractClientScanner.java:95) > at > org.apache.hadoop.yarn.server.timelineservice.storage.reader.TimelineEntityReader.readEntities(TimelineEntityReader.java:283) > at > org.apache.hadoop.yarn.server.timelineservice.storage.HBaseStorageMonitor.healthCheck(HBaseStorageMonitor.java:77) > at > org.apache.hadoop.yarn.server.timelineservice.storage.TimelineStorageMonitor$MonitorThread.run(TimelineStorageMonitor.java:89) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org