[
https://issues.apache.org/jira/browse/ACCUMULO-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945855#comment-15945855
]
Josh Elser commented on ACCUMULO-4615:
--------------------------------------
bq. the monitoring was freaking out, showing different values for # tservers, #
tablets, # offline tables.
Yuck.
bq. If it takes longer than the configured time to gather information from all
the tablet servers, the thread pool stops and processing continues with
whatever has been collected
Maybe stats should be collected in the background on some interval instead of
on demand? If we don't get a response within some threshold, we
could fall back to the previous value?
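A minimal sketch of that idea, assuming a hypothetical `CachedStatsCollector` class and a `fetcher` function standing in for the per-tserver stats call (neither exists in Accumulo). Each refresh pass shares one deadline, and any server that does not answer in time keeps its previously recorded value instead of disappearing from the view:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical helper for illustration only; not part of Accumulo.
class CachedStatsCollector<S> {
  // Last successfully fetched value per server (fetcher must not return null).
  private final Map<String, S> lastKnown = new ConcurrentHashMap<>();
  private final ExecutorService workers = Executors.newCachedThreadPool(r -> {
    Thread t = new Thread(r, "stats-fetch");
    t.setDaemon(true); // slow fetches should not keep the JVM alive
    return t;
  });
  private final Function<String, S> fetcher; // assumed stand-in for the tserver RPC
  private final long timeoutMillis;

  CachedStatsCollector(Function<String, S> fetcher, long timeoutMillis) {
    this.fetcher = fetcher;
    this.timeoutMillis = timeoutMillis;
  }

  // One refresh pass; all servers share a single deadline.
  void refresh(List<String> servers) {
    List<Future<S>> pending = new ArrayList<>();
    for (String server : servers) {
      Callable<S> task = () -> fetcher.apply(server);
      pending.add(workers.submit(task));
    }
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    for (int i = 0; i < servers.size(); i++) {
      long remaining = Math.max(0L, deadline - System.nanoTime());
      try {
        lastKnown.put(servers.get(i), pending.get(i).get(remaining, TimeUnit.NANOSECONDS));
      } catch (TimeoutException | ExecutionException e) {
        pending.get(i).cancel(true); // no update: the previous value stays in place
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
  }

  // Copy of the current view, mixing fresh and fallen-back values.
  Map<String, S> snapshot() {
    return new HashMap<>(lastKnown);
  }
}
```

To run this in the background, `refresh` could be driven by `ScheduledExecutorService.scheduleWithFixedDelay`, with readers only ever calling `snapshot()`. A real version would probably also record the age of each value so that stale entries can be flagged or eventually dropped.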
FYI [~lstav] as this might be of interest to you in the monitor-reworking on
master.
> ThreadPool timeout when checking tserver stats is confusing
> -----------------------------------------------------------
>
> Key: ACCUMULO-4615
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4615
> Project: Accumulo
> Issue Type: Bug
> Components: master
> Affects Versions: 1.8.1
> Reporter: Michael Wall
> Priority: Minor
>
> If it takes longer than the configured time to gather information from all
> the tablet servers, the thread pool stops and processing continues with
> whatever has been collected. The code is at
> https://github.com/apache/accumulo/blob/1.8/server/master/src/main/java/org/apache/accumulo/master/Master.java#L1120
> and the default timeout is 6s. This does not appear to be an issue prior to 1.8.
> Best case, this was really confusing. The monitor page would show 30
> tservers, then 5 tservers. I didn't see any other negative effects;
> migrations and balancing did not appear to be affected. Worst case, though, I
> missed something and the master is making decisions based on incomplete
> information.
> [[email protected]] please add more info if needed.
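The behavior described in the issue can be reproduced with a simplified sketch. This is not the Accumulo code; `PartialGatherDemo`, `gather`, and `fetcher` are hypothetical names. The pattern is: submit one task per server, wait up to a timeout for the pool to finish, then continue with whatever results have arrived:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical illustration of the timeout pattern; not taken from Master.java.
class PartialGatherDemo {
  static Map<String, Integer> gather(List<String> servers,
      Function<String, Integer> fetcher, long timeoutMillis) {
    Map<String, Integer> results = new ConcurrentHashMap<>();
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, servers.size()));
    for (String server : servers) {
      pool.execute(() -> results.put(server, fetcher.apply(server)));
    }
    pool.shutdown(); // accept no new tasks, let submitted ones run
    try {
      // Returns false on timeout, but nothing here distinguishes that case.
      pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    // Snapshot first, then interrupt stragglers so they cannot alter the result.
    Map<String, Integer> collected = new HashMap<>(results);
    pool.shutdownNow();
    return collected; // may silently contain fewer entries than servers
  }
}
```

With 30 servers and a few slow responders, the returned map simply has fewer entries, which would match the monitor showing a fluctuating tserver count. Checking the boolean returned by `awaitTermination` and logging which servers were missing would at least make the partial result visible.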