[GitHub] ivakegg commented on a change in pull request #356: ACCUMULO-4777 Removed the unused sequence generator.
ivakegg commented on a change in pull request #356: ACCUMULO-4777 Removed the unused sequence generator. URL: https://github.com/apache/accumulo/pull/356#discussion_r163321826 ## File path: server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java ## @@ -400,24 +406,26 @@ private int write(final Collection sessions, boolean mincFinish, if (currentLogId == logId.get()) { // write the mutation to the logs - seq = seqGen.incrementAndGet(); - if (seq < 0) -throw new RuntimeException("Logger sequence generator wrapped! Onos!!!11!eleven"); - LoggerOperation lop = writer.write(copy, seq); + LoggerOperation lop = writer.write(copy); lop.await(); // double-check: did the log set change? success = (currentLogId == logId.get()); } } catch (DfsLogger.LogClosedException ex) { -log.debug("Logs closed while writing, retrying " + attempt); +log.debug("Logs closed while writing, retrying attempt " + writeRetry.retriesCompleted()); Review comment: That was the original coding. I believe the reason is that if another thread closed the current WAL log then it should immediately loop to write to the new WAL file instead. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ivakegg commented on a change in pull request #356: ACCUMULO-4777 Removed the unused sequence generator.
ivakegg commented on a change in pull request #356: ACCUMULO-4777 Removed the unused sequence generator. URL: https://github.com/apache/accumulo/pull/356#discussion_r163321467 ## File path: core/src/main/java/org/apache/accumulo/core/conf/Property.java ## @@ -263,13 +263,13 @@ "The maximum size for each write-ahead log. See comment for property tserver.memory.maps.max"), TSERV_WALOG_MAX_AGE("tserver.walog.max.age", "24h", PropertyType.TIMEDURATION, "The maximum age for each write-ahead log."), TSERV_WALOG_TOLERATED_CREATION_FAILURES("tserver.walog.tolerated.creation.failures", "50", PropertyType.COUNT, - "The maximum number of failures tolerated when creating a new WAL file within the period specified by tserver.walog.failures.period." Review comment: Precisely, that property does not exist. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ivakegg commented on a change in pull request #356: ACCUMULO-4777 Removed the unused sequence generator.
ivakegg commented on a change in pull request #356: ACCUMULO-4777 Removed the unused sequence generator. URL: https://github.com/apache/accumulo/pull/356#discussion_r161399250 ## File path: core/src/main/java/org/apache/accumulo/core/conf/Property.java ## @@ -263,8 +263,11 @@ "The maximum size for each write-ahead log. See comment for property tserver.memory.maps.max"), TSERV_WALOG_MAX_AGE("tserver.walog.max.age", "24h", PropertyType.TIMEDURATION, "The maximum age for each write-ahead log."), TSERV_WALOG_TOLERATED_CREATION_FAILURES("tserver.walog.tolerated.creation.failures", "50", PropertyType.COUNT, - "The maximum number of failures tolerated when creating a new WAL file within the period specified by tserver.walog.failures.period." - + " Exceeding this number of failures in the period causes the TabletServer to exit."), + "The maximum number of failures tolerated when creating a new WAL file." + + " Exceeding this number of failures consecutively trying to create a new WAL causes the TabletServer to exit."), + TSERV_WALOG_TOLERATED_WRITING_FAILURES("tserver.walog.tolerated.writing.failures", "1000", PropertyType.COUNT, Review comment: That does sound reasonable. The existing monitor might already be able to serve this function but a separate process would work as well. I am thinking for a bug release that I will remove the halting mechanism as that is really changing how things work. In 2.0.0 however we might want that and indeed may want a better way to configure this such as using time like you suggested. I will create a separate ticket for that. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] ivakegg commented on a change in pull request #356: ACCUMULO-4777 Removed the unused sequence generator.
ivakegg commented on a change in pull request #356: ACCUMULO-4777 Removed the unused sequence generator. URL: https://github.com/apache/accumulo/pull/356#discussion_r161241237 ## File path: server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java ## @@ -400,24 +405,34 @@ private int write(final Collection sessions, boolean mincFinish, if (currentLogId == logId.get()) { // write the mutation to the logs - seq = seqGen.incrementAndGet(); - if (seq < 0) -throw new RuntimeException("Logger sequence generator wrapped! Onos!!!11!eleven"); - LoggerOperation lop = writer.write(copy, seq); + LoggerOperation lop = writer.write(copy); lop.await(); // double-check: did the log set change? success = (currentLogId == logId.get()); } } catch (DfsLogger.LogClosedException ex) { -log.debug("Logs closed while writing, retrying " + attempt); +log.debug("Logs closed while writing, retrying " + writeRetry.retriesCompleted()); } catch (Exception t) { -if (attempt != 1) { - log.error("Unexpected error writing to log, retrying attempt " + attempt, t); +// We have more retries or we exceeded the maximum number of accepted failures +if (writeRetry.canRetry()) { + // Use the createRetry and record the time in which we did so + writeRetry.useRetry(); + + try { +// Backoff +writeRetry.waitForNextAttempt(); + } catch (InterruptedException e) { +Thread.currentThread().interrupt(); +throw new RuntimeException(e); + } +} else { + log.error("Repeatedly failed to write WAL. Going to exit tabletserver.", t); Review comment: Is a termination reasonable in this circumstance or should we retry forever as we were doing before? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services