ddupg opened a new pull request, #4756: URL: https://github.com/apache/hbase/pull/4756
In [WALEntryStream#readNextEntryAndRecordReaderPosition](https://github.com/apache/hbase/blob/308cd729d23329e6d8d4b9c17a645180374b5962/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/WALEntryStream.java#L257), it is possible to read uncommitted data. When we read beyond the committed file length, we reopen the input stream and seek back. In our usage, we found that the seek-back position may be exactly the current length of the file being written, which may cause an EOF. The thrown EOF is eventually caught by [ReplicationSourceWALReader.run](https://github.com/apache/hbase/blob/308cd729d23329e6d8d4b9c17a645180374b5962/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceWALReader.java#L158), but [totalBufferUsed](https://github.com/apache/hbase/blob/308cd729d23329e6d8d4b9c17a645180374b5962/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceWALReader.java#L78) is not cleaned up on that path. After a long run, all peers slow down and eventually block completely.
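The leak pattern described above can be illustrated with a small standalone sketch. This is not HBase code and not necessarily how this PR fixes the issue: the class `BufferQuotaSketch`, the method `readBatch`, and its parameters are hypothetical, and the static counter only plays the role of `totalBufferUsed`. It shows quota being reserved entry by entry, an EOF interrupting the batch, and the difference between releasing and not releasing the reserved bytes in the catch path.

```java
import java.io.EOFException;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for shared replication buffer accounting; not HBase code. */
public class BufferQuotaSketch {
    // Plays the role of totalBufferUsed: bytes reserved by batches not yet shipped.
    static final AtomicLong totalBufferUsed = new AtomicLong();

    /**
     * Simulates reading one batch. Quota is reserved per entry, and an EOF is
     * thrown when the loop reaches index failAt (use -1 for no failure).
     * Returns the bytes handed off to the shipper, or 0 if the batch was dropped.
     */
    static long readBatch(long[] entrySizes, int failAt, boolean releaseOnFailure) {
        long reserved = 0;
        try {
            for (int i = 0; i < entrySizes.length; i++) {
                if (i == failAt) {
                    throw new EOFException("seek landed at the end of the file being written");
                }
                totalBufferUsed.addAndGet(entrySizes[i]);
                reserved += entrySizes[i];
            }
            return reserved; // batch handed off; the shipper would release the quota later
        } catch (EOFException e) {
            if (releaseOnFailure) {
                // Give back what this partially built batch reserved.
                totalBufferUsed.addAndGet(-reserved);
            }
            return 0; // batch is dropped; without the release above, the quota leaks
        }
    }

    public static void main(String[] args) {
        long[] sizes = {100, 200, 300};

        readBatch(sizes, 2, false);
        System.out.println("leaky: " + totalBufferUsed.get()); // prints "leaky: 300"

        totalBufferUsed.set(0);
        readBatch(sizes, 2, true);
        System.out.println("fixed: " + totalBufferUsed.get()); // prints "fixed: 0"
    }
}
```

In the leaky variant, each failed batch leaves its reserved bytes in the counter, so repeated failures push the counter toward whatever limit gates new reads. That matches the symptom of peers slowing down and then blocking.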
