> On Nov. 6, 2014, 5:47 p.m., kturner wrote:
> > server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java, line 250
> > <https://reviews.apache.org/r/27654/diff/3/?file=751140#file751140line250>
> >
> >     The compaction code remembers when it logged an exception and does not
> >     do it again. It also logs a message if the compaction becomes unstuck.
> >     An advantage I thought of w/ repeatedly logging, is that you could see the
> >     stack trace changing (or not).
> >
> >     The stack trace is a possible trace. By the time logging happens, the
> >     assignment could have completed and the thread could have moved on to other
> >     things.
>
> Josh Elser wrote:
>     Yeah, since these are running fairly regularly (order of seconds) a stuck
>     assignment could get really spammy. Like you point out, there could be value
>     gained from printing out the stack more than once. Maybe I could add some
>     backoff which only warns so often?
>
>     bq. By the time logging happens, the assignment could have completed and
>     the thread could have moved on to other things.
>
>     Do you think the message should be updated to be more clear about this? A
>     "Maybe you should look into this" type message?
>
> kturner wrote:
>     > a stuck assignment could get really spammy
>
>     I think that spam is probably ok as long as the default is high enough
>     such that when it does happen, its something to be concerned about. Could
>     make the timer check a little less frequently.
>
>     > Do you think the message should be updated to be more clear about this?
>
>     I think compaction code just says its a possible stack trace. I suppose
>     a good solution would be to have error codes, then user can look up error
>     code and get nitty gritty details. Can't really put too much info in log
>     message.
>
> Josh Elser wrote:
>     bq. Could make the timer check a little less frequently.
>
>     As long as we have a long threshold for warning about a stuck assignment,
>     we can easily make a longer period on the timer. The timer period dictates
>     the minimum stuck assignment time -- I can update the description with a
>     clarification.

I was thinking that once an assignment is considered stuck, every time the
timer fires a check (I think it's every 5 secs or 1 sec, not sure) it will log
another warning, which is where the spam comes from. Increasing the timer
period would produce less spam. The period could be a function of
tserver.assignment.duration.warning, like 1/4 or 1/2 of it.


- kturner


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27654/#review60185
-----------------------------------------------------------


On Nov. 6, 2014, 12:58 a.m., Josh Elser wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27654/
> -----------------------------------------------------------
>
> (Updated Nov. 6, 2014, 12:58 a.m.)
>
>
> Review request for accumulo.
>
>
> Bugs: ACCUMULO-3304
>     https://issues.apache.org/jira/browse/ACCUMULO-3304
>
>
> Repository: accumulo
>
>
> Description
> -------
>
> Watches assignments and reports when an assignment is running for longer than
> a configured time.
>
>
> Diffs
> -----
>
>   core/src/main/java/org/apache/accumulo/core/conf/Property.java 56f3d9c
>   server/tserver/src/main/java/org/apache/accumulo/tserver/ActiveAssignmentRunnable.java PRE-CREATION
>   server/tserver/src/main/java/org/apache/accumulo/tserver/RunnableStartedAt.java PRE-CREATION
>   server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java 94be0bb
>   server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java 935ffeb
>
> Diff: https://reviews.apache.org/r/27654/diff/
>
>
> Testing
> -------
>
> Very minimal.
>
>
> Thanks,
>
> Josh Elser
>
>
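
For reference, a rough sketch of the timer-period idea from the reply above: derive the check period as a fraction of the configured tserver.assignment.duration.warning value. The class name, method name, and the one-second floor are all hypothetical and are not taken from the patch; the sketch only illustrates the arithmetic.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; not part of the patch under review. */
public final class AssignmentWatchPeriod {

  // Assumed floor so a very small warning threshold cannot turn the
  // periodic check into a busy loop. The real value is a design choice.
  public static final long MIN_PERIOD_MILLIS = TimeUnit.SECONDS.toMillis(1);

  private AssignmentWatchPeriod() {}

  /**
   * Returns the period, in milliseconds, between stuck-assignment checks.
   *
   * @param warningMillis the configured warning threshold in milliseconds
   *        (stands in for tserver.assignment.duration.warning)
   * @param divisor 4 for "1/4 of the threshold", 2 for "1/2", and so on
   */
  public static long periodMillis(long warningMillis, int divisor) {
    if (warningMillis <= 0 || divisor <= 0) {
      throw new IllegalArgumentException("warningMillis and divisor must be positive");
    }
    return Math.max(MIN_PERIOD_MILLIS, warningMillis / divisor);
  }

  public static void main(String[] args) {
    // Example threshold only; the actual default is whatever Property.java defines.
    long warning = TimeUnit.MINUTES.toMillis(10);
    System.out.println("check every " + periodMillis(warning, 4) + " ms");
  }
}
```

With a period of 1/4 of the threshold, a stuck assignment would be reported at most about four times per threshold interval, and the first report would arrive no later than 1.25x the threshold after the assignment started.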
