[GitHub] [hbase] ddupg opened a new pull request, #4756: HBASE-27354 EOF thrown by WALEntryStream causes replication blocking


ddupg opened a new pull request, #4756:
URL: https://github.com/apache/hbase/pull/4756


   In 
[WALEntryStream#readNextEntryAndRecordReaderPosition](https://github.com/apache/hbase/blob/308cd729d23329e6d8d4b9c17a645180374b5962/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/WALEntryStream.java#L257),
 it is possible to read uncommitted data. If we read beyond the committed 
file length, we reopen the inputStream and seek back.
   
   In our deployment, we found that the position we seek back to may be exactly 
the length of the file being written, which can cause an EOF. The thrown EOF is 
eventually caught by 
[ReplicationSourceWALReader.run](https://github.com/apache/hbase/blob/308cd729d23329e6d8d4b9c17a645180374b5962/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceWALReader.java#L158),
 but 
[totalBufferUsed](https://github.com/apache/hbase/blob/308cd729d23329e6d8d4b9c17a645180374b5962/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceWALReader.java#L78)
 is not cleaned up.
   
   Because the buffer usage that is never released keeps accumulating, after 
running for a long time all peers slow down and eventually block completely.
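
   The leak described above can be illustrated with a minimal, self-contained 
sketch. This is not HBase code: the class `BufferQuotaSketch`, the `QUOTA` value, 
and the methods `readBatch` and `runUntilBlocked` are hypothetical stand-ins, and 
only the name `totalBufferUsed` is borrowed from the description. It assumes a 
reader that reserves bytes against a shared counter, hits an EOF, and has that 
EOF swallowed by its run loop. Releasing the reservation on failure is one 
possible remedy, shown here only for contrast.

```java
import java.io.EOFException;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical, simplified model of a shared replication buffer quota; not HBase code. */
public class BufferQuotaSketch {
  /** Stand-in for the shared totalBufferUsed counter. */
  static final AtomicLong totalBufferUsed = new AtomicLong();
  /** Assumed quota in bytes, chosen only for this illustration. */
  static final long QUOTA = 1024;

  /** Reserves buffer for one batch, then fails with EOF before the batch is shipped. */
  static void readBatch(long batchSize, boolean releaseOnFailure) throws EOFException {
    totalBufferUsed.addAndGet(batchSize);
    try {
      // Simulates seeking to exactly the file length and then trying to read.
      throw new EOFException("seek position equals file length");
    } catch (EOFException e) {
      if (releaseOnFailure) {
        // Give the reserved bytes back so the shared counter does not drift upward.
        totalBufferUsed.addAndGet(-batchSize);
      }
      throw e;
    }
  }

  /** Returns how many iterations ran before the quota check stopped further reads. */
  static int runUntilBlocked(int maxIterations, long batchSize, boolean releaseOnFailure) {
    totalBufferUsed.set(0);
    for (int i = 0; i < maxIterations; i++) {
      if (totalBufferUsed.get() >= QUOTA) {
        return i; // quota exhausted: a real reader would wait here
      }
      try {
        readBatch(batchSize, releaseOnFailure);
      } catch (EOFException e) {
        // The run loop swallows the EOF and retries.
      }
    }
    return maxIterations;
  }

  public static void main(String[] args) {
    System.out.println("leaky: blocked after " + runUntilBlocked(100, 128, false) + " iterations");
    System.out.println("fixed: ran " + runUntilBlocked(100, 128, true) + " iterations");
  }
}
```

   With a batch size of 128 and a quota of 1024, the leaky variant stops after 
8 iterations, while the variant that releases on failure runs all 100.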


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
