anoopsjohn commented on a change in pull request #2168:
URL: https://github.com/apache/hbase/pull/2168#discussion_r462215252
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
##########
@@ -412,6 +422,24 @@ protected void doReplaceWriter(Path oldPath, Path newPath,
Writer nextWriter) th
}
}
+ private void closeWriter(Path path, boolean syncCloseCall) throws
IOException {
+ try {
+ TraceUtil.addTimelineAnnotation("closing writer");
+ writer.close();
+ TraceUtil.addTimelineAnnotation("writer closed");
+ } catch (IOException ioe) {
+ int errors = closeErrorCount.incrementAndGet();
+ boolean hasUnflushedEntries = isUnflushedEntries();
+ if (syncCloseCall && (hasUnflushedEntries || (errors >
this.closeErrorsTolerated))) {
Review comment:
Ideally when we writer close, there wont be any unflushed entries. We
have synced all writes happend to this wal writer. But as the code was having
it and to be on safe side, continue to use that. Ya in that case will go with
sync way of call so that we can act against and IOE in close call. Same with
that errors count. If consecutively those many failures, we will have to throw
IOE and RS will abort. But here one catch is there. This #errors based abort
wont be that strict from now on. Say if the errors tolerated is 1, we need to
throw back IOE if errors count is 2. But it is possible that one close req
came and given to a thread and before that is done another close req and given
to another thread and then a 3rd close req came by then 1st thread is done and
it was failed. So the 3rd req will be tried in sync way of close. By the time
it finishes, the 2nd close thread also failed to close. So the errors already
became 2 and 3d (sync way call) also failed which makes t
he count to 3. So instead of throw back IOE at errors=2, it will do now on
errors=3. But we are sure all edits are synced and no data loss. So I believe
this is ok.. Asycn wal is not having any such logic at all.
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
##########
@@ -436,6 +464,19 @@ protected void doShutdown() throws IOException {
this.writer.close();
this.writer = null;
}
+ closeExecutor.shutdown();
+ try {
+ if (!closeExecutor.awaitTermination(waitOnShutdownInSeconds,
TimeUnit.SECONDS)) {
+ LOG.error("We have waited " + waitOnShutdownInSeconds + " seconds but"
Review comment:
done
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]