Checking the WALs on HDFS, I see some very old ones, from about a year ago. Does 
anyone have any idea how to handle this issue in production?
 
-rw-r--r--   2 hbase hadoop   20684288 2023-10-09 08:26 /hbase/oldWALs/rzv-db14-hd.xxxx%2C16020%2C1674973593505.1696810047993
-rw-r--r--   2 hbase hadoop   15007744 2023-10-09 08:26 /hbase/oldWALs/rzv-db13-hd.xxxx%2C16020%2C1684871532555.1696811057371
-rw-r--r--   2 hbase hadoop      15872 2023-10-09 08:26 /hbase/oldWALs/rzv-db12-hd.xxxx%2C16020%2C1674973371058.1696813278286
-rw-r--r--   2 hbase hadoop   42594304 2023-10-09 08:27 /hbase/oldWALs/rzv-db09-hd.xxxx%2C16020%2C1674973354605.1696810476448
-rw-r--r--   2 hbase hadoop   13622784 2023-10-09 08:26 /hbase/oldWALs/rzv-db10-hd.xxxx%2C16020%2C1674973984596.1696810895708
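
The age of these files can also be read from the names themselves. This is a minimal Python sketch, assuming the usual naming of URL-encoded `host,port,startcode`, then a dot, then the WAL creation time, with both numbers in epoch milliseconds. The helper `parse_wal_name` is hypothetical and not part of HBase. Names with extra suffixes (for example meta WALs or multi-WAL provider ids) would need additional handling.

```python
from datetime import datetime, timezone
from urllib.parse import unquote


def parse_wal_name(name):
    """Split a WAL file name into server name parts and UTC timestamps.

    Assumes the layout <host>%2C<port>%2C<startcode>.<wal timestamp>,
    with both numeric values in epoch milliseconds.
    """
    # The last dot separates the encoded server name from the WAL timestamp.
    server_part, wal_ts = name.rsplit(".", 1)
    # %2C is a URL-encoded comma, so decoding yields "host,port,startcode".
    host, port, start_code = unquote(server_part).rsplit(",", 2)

    def to_utc(ms):
        return datetime.fromtimestamp(int(ms) / 1000, tz=timezone.utc)

    return {
        "host": host,
        "port": int(port),
        "server_started": to_utc(start_code),
        "wal_created": to_utc(wal_ts),
    }


if __name__ == "__main__":
    info = parse_wal_name(
        "rzv-db14-hd.xxxx%2C16020%2C1674973593505.1696810047993"
    )
    print(info["host"], info["server_started"], info["wal_created"])
```

For the first file in the listing this gives a WAL creation date of 2023-10-09 (UTC) and a region server start date of 2023-01-29 (UTC), which matches the "about a year old" observation.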
    On Thursday, 12 September 2024 at 09:30:46 CEST, Hamado Dene 
<hamadod...@yahoo.com> wrote:  
 
 Hi community,
Could anyone kindly assist me in resolving this issue I'm facing? 
Thank you in advance!
Hamado Dene
    On Wednesday, 11 September 2024 at 16:26:55 CEST, Hamado Dene 
<hamadod...@yahoo.com> wrote:  
 
 Hi HBase Community,
We are currently facing an issue in our production environment with HBase 
replication, and I would greatly appreciate any guidance or suggestions the 
community may have.

We are running HBase version 2.5.8, and in the logs, we consistently encounter 
the following warning:



2024-09-11T15:51:11,468 WARN  [RS_CLAIM_REPLICATION_QUEUE-regionserver/rzv-db09-hd:16020-0.replicationSource,replicav3-rzv-db13-hd.xxxx,16020,1684871532555-rzv-db09-hd.xxxx,16020,1696832789107-rzv-db09-hd.xxxx,16020,1696833033289-rzv-db13-hd.xxxx,16020,1722636062425-rzv-db13-hd.xxxx,16020,1722636803794-rzv-db12-hd.xxxx,16020,1722636800268.replicationSource.wal-reader.rzv-db13-hd.xxxx%2C16020%2C1684871532555,replicav3-rzv-db13-hd.xxxx,16020,1684871532555-rzv-db09-hd.xxxx,16020,1696832789107-rzv-db09-hd.xxxx,16020,1696833033289-rzv-db13-hd.xxxx,16020,1722636062425-rzv-db13-hd.xxxx,16020,1722636803794-rzv-db12-hd.xxxx,16020,1722636800268] regionserver.ReplicationSourceWALReader: Failed to read stream of replication entries
java.io.EOFException: Cannot seek after EOF
        at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1682) ~[hadoop-hdfs-client-2.10.2.jar:?]
        at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66) ~[hadoop-common-2.10.2.jar:?]
        at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.seekOnFs(ProtobufLogReader.java:527) ~[hbase-server-2.5.8.jar:2.5.8]
        at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.seek(ReaderBase.java:130) ~[hbase-server-2.5.8.jar:2.5.8]
        at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.seek(WALEntryStream.java:408) ~[hbase-server-2.5.8.jar:2.5.8]
        at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:339) ~[hbase-server-2.5.8.jar:2.5.8]
        at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:308) ~[hbase-server-2.5.8.jar:2.5.8]
        at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:298) ~[hbase-server-2.5.8.jar:2.5.8]
        at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:172) ~[hbase-server-2.5.8.jar:2.5.8]
        at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:102) ~[hbase-server-2.5.8.jar:2.5.8]
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.tryAdvanceStreamAndCreateWALBatch(ReplicationSourceWALReader.java:258) ~[hbase-server-2.5.8.jar:2.5.8]
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:145) ~[hbase-server-2.5.8.jar:2.5.8]


This error appears to stem from the replication WAL reader, and the "Cannot 
seek after EOF" message suggests a failure to read the replication entries. We 
suspect this may be affecting the replication flow between our region servers.

Has anyone encountered this problem before, or does anyone have insights into 
potential causes and solutions?


Thank you in advance for your assistance!

Hamado Dene    
