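The screenshot seems to be broken; I can't see it.

If the file is still being read with the tailing reader, that means it is the last file, and the tailing reader does not skip empty files.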
If a newer WAL file already exists, the tailing reader should no longer be used for this one, and in that case there is logic to skip the file directly when EOF is hit.

Is the file the tailing reader keeps reading really the last file? Or is it no longer the last file, yet it is still being read with the tailing reader?

sudo rm -rf /* <2326130...@qq.com.invalid> wrote on Mon, Jul 22, 2024 at 21:41:

> Teacher Zhang,
> Hello, and thank you for your reply. Replication is stuck. I picked one RS node; its replication status is shown in the screenshot below:
>
>
> The first file in the screenshot has this form: hdfs://coreHBaseProdHa/hbase/WALs/sh2-int-hbase-main-ha-2,16020,1720603345541/sh2-int-hbase-main-ha-2%2C16020%2C1720603345541.1720606991648
> The first file no longer exists.
> The second and third files point into the oldWals directory, and they do exist. Reading them with hbase wal -p <file> fails with the following error:
> Writer Classes: ProtobufLogWriter AsyncProtobufLogWriter SecureProtobufLogWriter SecureAsyncProtobufLogWriter
> Cell Codec Class: org.apache.hadoop.hbase.regionserver.wal.WALCellCodec
> Exception in thread "main" java.io.EOFException: EOF while reading message size
>     at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.parseDelimitedFrom(ProtobufUtil.java:3727)
>     at org.apache.hadoop.hbase.regionserver.wal.ProtobufWALStreamReader.next(ProtobufWALStreamReader.java:56)
>     at org.apache.hadoop.hbase.wal.WALStreamReader.next(WALStreamReader.java:42)
>     at org.apache.hadoop.hbase.wal.WALPrettyPrinter.processFile(WALPrettyPrinter.java:297)
>     at org.apache.hadoop.hbase.wal.WALPrettyPrinter.run(WALPrettyPrinter.java:516)
>     at org.apache.hadoop.hbase.wal.WALPrettyPrinter.main(WALPrettyPrinter.java:429)
> This looks like the error produced when reading an empty file.
> For other, normal WAL files, the hbase wal -p command runs fine and can parse the WAL contents. Please take another look when you have time. Thank you very much.
>
>
> ------------------ Original Message ------------------
> *From:* "user-zh" <palomino...@gmail.com>;
> *Sent:* Monday, July 22, 2024, 8:56 PM
> *To:* "user-zh"<user-zh@hbase.apache.org>;
> *Subject:* Re: hbase2.6.0 replicationSource WALReader exception while reading WAL
>
> Is replication stuck? The stream reader keeps tailing the file, so if it hits a half-written entry it may throw an exception, and it will retry. If replication is not stuck and can keep reading afterwards, there is no problem.
>
> You could also try reading that file with WALPrettyPrinter to see whether it can be read.
>
> leojie <leo...@apache.org> wrote on Mon, Jul 22, 2024 at 18:03:
>
> > Teacher Zhang,
> > Hello, I have a question. I have recently been testing hbase2.6.0. With replication enabled, the replication source thread, which now uses the ProtobufWALStreamReader class (new in 2.6.0) to read and parse WAL files, hits the following exception: InvalidProtocolBufferException$InvalidWireTypeException: Protocol message tag had invalid wire type.
> > I read the source code but did not fully understand it, since it involves the underlying Protocol serialization. Could this be caused by writing data through a lower-version hbase-client API (for example, hbase2.2.7)?
> > My environment is: hadoop3.3.6 hbase2.6.0
> > The full exception stack trace is as follows:
> > 2024-07-22T17:47:49,130 WARN [RS_CLAIM_REPLICATION_QUEUE-regionserver/sh2-int-hbase-main-ha-9:16020-0.replicationSource,test_hbase_258-tx1-int-hbase-main-prod-3,16020,1720602522464.replicationSource.wal-reader.tx1-int-hbase-main-prod-3%2C16020%2C1720602522464,test_hbase_258-tx1-int-hbase-main-prod-3,16020,1720602522464] wal.ProtobufWALStreamReader: Error while reading WALKey, originalPosition=0, currentPosition=81
> > org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException$InvalidWireTypeException: Protocol message tag had invalid wire type.
> >     at org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:119) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> >     at org.apache.hbase.thirdparty.com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:503) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> >     at org.apache.hbase.thirdparty.com.google.protobuf.GeneratedMessage$Builder.parseUnknownField(GeneratedMessage.java:770) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> >     at org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:2829) ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> >     at org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:4212) ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> >     at org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:4204) ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> >     at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:192) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> >     at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:209) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> >     at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:214) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> >     at org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> >     at org.apache.hbase.thirdparty.com.google.protobuf.GeneratedMessage.parseWithIOException(GeneratedMessage.java:321) ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
> >     at org.apache.hadoop.hbase.shaded.protobuf.generated.WALProtos$WALKey.parseFrom(WALProtos.java:2321) ~[hbase-protocol-shaded-2.6.0.jar:2.6.0]
> >     at org.apache.hadoop.hbase.regionserver.wal.ProtobufWALTailingReader.readWALKey(ProtobufWALTailingReader.java:128) ~[hbase-server-2.6.0.jar:2.6.0]
> >     at org.apache.hadoop.hbase.regionserver.wal.ProtobufWALTailingReader.next(ProtobufWALTailingReader.java:257) ~[hbase-server-2.6.0.jar:2.6.0]
> >     at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.readNextEntryAndRecordReaderPosition(WALEntryStream.java:490) ~[hbase-server-2.6.0.jar:2.6.0]
> >     at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.lastAttempt(WALEntryStream.java:306) ~[hbase-server-2.6.0.jar:2.6.0]
> >     at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:388) ~[hbase-server-2.6.0.jar:2.6.0]
> >     at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:130) ~[hbase-server-2.6.0.jar:2.6.0]
> >     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:153) ~[hbase-server-2.6.0.jar:2.6.0]
> > 2024-07-22T17:48:13,315 WARN [RS-EventLoopGroup-1-65] ipc.NettyRpcConnection: Exception encountered while connecting to the server tx1-int-hbase-main-prod-3:16020
> > org.apache.hbase.thirdparty.io.netty.channel.ConnectTimeoutException: connection timed out after 10000 ms: tx1-int-hbase-main-prod-3/127.0.0.1:16020
> >     at org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$2.run(AbstractEpollChannel.java:615) ~[hbase-shaded-netty-4.1.7.jar:?]
> >     at org.apache.hbase.thirdparty.io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[hbase-shaded-netty-4.1.7.jar:?]
> >     at org.apache.hbase.thirdparty.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[hbase-shaded-netty-4.1.7.jar:?]
> >     at org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) ~[hbase-shaded-netty-4.1.7.jar:?]
> >     at org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) ~[hbase-shaded-netty-4.1.7.jar:?]
> >     at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[hbase-shaded-netty-4.1.7.jar:?]
> >     at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:416) ~[hbase-shaded-netty-4.1.7.jar:?]
> >     at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[hbase-shaded-netty-4.1.7.jar:?]
> >     at org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[hbase-shaded-netty-4.1.7.jar:?]
> >     at org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[hbase-shaded-netty-4.1.7.jar:?]
> >     at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_202]
>