keith-turner commented on code in PR #4558:
URL: https://github.com/apache/accumulo/pull/4558#discussion_r1608960778
##########
server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java:
##########
@@ -905,6 +909,19 @@ public void close(boolean saveState) throws IOException {
void initiateClose(boolean saveState) {
log.trace("initiateClose(saveState={}) {}", saveState, getExtent());
+ synchronized (this) {
+ if (closeState == CloseState.OPEN) {
+ closeRequestTime = System.nanoTime();
+ } else if (closeRequestTime != 0) {
Review Comment:
Is closeRequestTime ever expected to be zero when closeState is not
OPEN? If not, something like the following could be used to catch bugs:
```
} else {
  Preconditions.checkState(closeRequestTime != 0);
```
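A minimal, dependency-free sketch of that guard follows. `CloseTracker` is a hypothetical stand-in for the relevant part of `Tablet`, the local `checkState` helper mirrors the contract of Guava's `Preconditions.checkState(boolean)`, and the time is passed in as a parameter only to keep the sketch deterministic. It assumes the invariant the question asks about: once the state has left OPEN, a request time has been recorded.

```java
// Hypothetical, simplified stand-in for the close tracking in Tablet.
class CloseTracker {
  enum CloseState { OPEN, REQUESTED }

  private CloseState closeState;
  private long closeRequestTime;

  CloseTracker() {
    this(CloseState.OPEN, 0L);
  }

  // Lets a caller build an inconsistent state to see the check fire.
  CloseTracker(CloseState closeState, long closeRequestTime) {
    this.closeState = closeState;
    this.closeRequestTime = closeRequestTime;
  }

  // Local helper with the same contract as Guava's Preconditions.checkState(boolean).
  private static void checkState(boolean expression) {
    if (!expression) {
      throw new IllegalStateException();
    }
  }

  // nowNanos is a parameter (instead of System.nanoTime()) to keep the sketch deterministic.
  synchronized void initiateClose(long nowNanos) {
    if (closeState == CloseState.OPEN) {
      closeRequestTime = nowNanos;
      closeState = CloseState.REQUESTED;
    } else {
      // Assumed invariant: leaving OPEN always records a request time.
      checkState(closeRequestTime != 0);
    }
  }

  synchronized CloseState getCloseState() {
    return closeState;
  }
}
```

With this shape, a repeated close request on a consistent tracker is a no-op, while a non-OPEN tracker with a zero request time fails fast with an `IllegalStateException` instead of silently skipping the logging branch.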
##########
core/src/main/java/org/apache/accumulo/core/logging/ConditionalLogger.java:
##########
@@ -0,0 +1,193 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.accumulo.core.logging;
+
+import java.time.Duration;
+import java.util.Arrays;
+import java.util.List;
+import java.util.concurrent.ConcurrentMap;
+import java.util.function.BiFunction;
+
+import org.apache.accumulo.core.util.Pair;
+import org.slf4j.Logger;
+import org.slf4j.Marker;
+import org.slf4j.event.Level;
+import org.slf4j.helpers.AbstractLogger;
+
+import com.github.benmanes.caffeine.cache.Cache;
+import com.github.benmanes.caffeine.cache.Caffeine;
+
+/**
+ * Logger that wraps another Logger and only emits a log message once per the supplied duration.
+ *
+ */
+public abstract class ConditionalLogger extends AbstractLogger {
+
+ private static final long serialVersionUID = 1L;
+
+ /**
+ * A Logger implementation that will log a message at the supplied elevated level if it has not
+ * been seen in the supplied duration. For repeat occurrences the message will be logged at the
+ * level used in code (which is likely a lower level). Note that the first log message will be
+ * logged at the elevated level because it has not been seen before.
+ */
+ public static class EscalatingLogger extends DeduplicatingLogger {
+
+ private static final long serialVersionUID = 1L;
+ private final Level elevatedLevel;
+
+ public EscalatingLogger(Logger log, Duration threshold, Level elevatedLevel) {
+ super(log, threshold);
+ this.elevatedLevel = elevatedLevel;
+ }
+
+ @Override
+ protected void handleNormalizedLoggingCall(Level level, Marker marker, String messagePattern,
+ Object[] arguments, Throwable throwable) {
+
+ if (arguments == null) {
+ arguments = new Object[0];
+ }
+ if (!condition.apply(messagePattern, Arrays.asList(arguments))) {
+ delegate.atLevel(level).addMarker(marker).setCause(throwable).log(messagePattern,
+ arguments);
+ } else {
+ delegate.atLevel(elevatedLevel).addMarker(marker).setCause(throwable).log(messagePattern,
+ arguments);
+ }
+
+ }
+
+ }
+
+ /**
+ * A Logger implementation that will suppress duplicate messages within the supplied duration.
+ */
+ public static class DeduplicatingLogger extends ConditionalLogger {
+
+ private static final long serialVersionUID = 1L;
+
+ public DeduplicatingLogger(Logger log, Duration threshold) {
+ super(log, new BiFunction<>() {
+
+ private final Cache<Pair<String,List<Object>>,Boolean> cache =
+ Caffeine.newBuilder().expireAfterWrite(threshold).maximumSize(250).build();
Review Comment:
Once more than 250 tablets are stuck in the close state on a tablet
server, this will basically stop deduping. Not sure what to do about this.
The limit could be raised, but once the number of stuck tablets per tserver
exceeds it, everything will be logged again. 250 seems a bit low.
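The concern can be illustrated with a small sketch. This is not Caffeine: `BoundedDedupe` is a hypothetical stand-in that uses a `LinkedHashMap` with plain LRU eviction and no time-based expiry, so the exact counts will differ from Caffeine's eviction policy. It only shows how a size-bounded dedupe cache stops suppressing repeats once the number of distinct messages exceeds its capacity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a size-bounded dedupe cache (LRU via LinkedHashMap, NOT Caffeine).
class BoundedDedupe {
  private final Map<String, Boolean> seen;

  BoundedDedupe(int maxSize) {
    // accessOrder=true gives least-recently-used ordering.
    this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
        return size() > maxSize;
      }
    };
  }

  // Returns true when the message is not currently cached, i.e. it would be logged.
  boolean shouldLog(String message) {
    return seen.putIfAbsent(message, Boolean.TRUE) == null;
  }

  // Cycles through `distinct` messages for `rounds` rounds and counts how many are logged.
  static int countLogged(int maxSize, int distinct, int rounds) {
    BoundedDedupe dedupe = new BoundedDedupe(maxSize);
    int logged = 0;
    for (int r = 0; r < rounds; r++) {
      for (int i = 0; i < distinct; i++) {
        if (dedupe.shouldLog("tablet-" + i + " stuck closing")) {
          logged++;
        }
      }
    }
    return logged;
  }
}
```

In this model, `countLogged(250, 250, 3)` logs each message once (250 in total), while `countLogged(250, 251, 3)` logs every message in every round (753 in total), because each new entry evicts the one that is about to be requested next.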
##########
server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java:
##########
@@ -141,6 +143,7 @@
*/
public class Tablet extends TabletBase {
private static final Logger log = LoggerFactory.getLogger(Tablet.class);
+ private static final Logger DEDUPE_LOGGER = new DeduplicatingLogger(log, Duration.ofMinutes(5));
Review Comment:
Could name this according to its purpose rather than its implementation.
Maybe something like CLOSING_STUCK_LOGGER.
```suggestion
private static final Logger CLOSING_STUCK_LOGGER = new DeduplicatingLogger(log, Duration.ofMinutes(5));
```
##########
server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java:
##########
@@ -905,6 +909,19 @@ public void close(boolean saveState) throws IOException {
void initiateClose(boolean saveState) {
log.trace("initiateClose(saveState={}) {}", saveState, getExtent());
+ synchronized (this) {
+ if (closeState == CloseState.OPEN) {
+ closeRequestTime = System.nanoTime();
+ } else if (closeRequestTime != 0) {
+ long runningTime = Duration.ofNanos(System.nanoTime() - closeRequestTime).toMinutes();
+ if (runningTime >= 15) {
+ DEDUPE_LOGGER.info("Tablet {} close requested again, but has been closing for {} minutes",
+ this.extent, runningTime);
+ }
+ }
+ closeState = CloseState.REQUESTED;
Review Comment:
We only want to transition to REQUESTED when the current state is OPEN. A
thread could call this method after the close state has advanced past
REQUESTED, and the unconditional assignment would then move the state backwards.
```suggestion
if (closeState == CloseState.OPEN) {
  closeRequestTime = System.nanoTime();
  closeState = CloseState.REQUESTED;
} else if (closeRequestTime != 0) {
  long runningTime = Duration.ofNanos(System.nanoTime() - closeRequestTime).toMinutes();
  if (runningTime >= 15) {
    DEDUPE_LOGGER.info("Tablet {} close requested again, but has been closing for {} minutes",
        this.extent, runningTime);
  }
}
```
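To make the difference concrete, here is a small sketch contrasting the two behaviors. `CloseStateModel` and both method names are hypothetical, and `CLOSING` is used here only as a stand-in for any state past REQUESTED; the logging branch is left out to keep the focus on the transition.

```java
// Hypothetical model of the close-state handling; not the real Tablet class.
class CloseStateModel {
  // CLOSING stands in for any state past REQUESTED (an assumption of this sketch).
  enum CloseState { OPEN, REQUESTED, CLOSING }

  CloseState closeState = CloseState.OPEN;
  long closeRequestTime = 0;

  // Shape of the diff as posted: the state is always reset to REQUESTED.
  synchronized void initiateCloseUnguarded(long nowNanos) {
    if (closeState == CloseState.OPEN) {
      closeRequestTime = nowNanos;
    }
    closeState = CloseState.REQUESTED;
  }

  // Shape of the suggestion: only the OPEN -> REQUESTED transition happens here.
  synchronized void initiateCloseGuarded(long nowNanos) {
    if (closeState == CloseState.OPEN) {
      closeRequestTime = nowNanos;
      closeState = CloseState.REQUESTED;
    }
  }
}
```

Starting from `CLOSING`, the unguarded variant moves the model back to `REQUESTED`, while the guarded variant leaves it at `CLOSING`.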