huaxiangsun commented on a change in pull request #1441: HBASE-24120 Flakey
Test: TestReplicationAdminWithClusters timeout
URL: https://github.com/apache/hbase/pull/1441#discussion_r405202118
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java
##########
@@ -445,7 +445,7 @@ private IOException extractHiddenEof(Exception ex) {
&& ex.getCause() != null && ex.getCause() instanceof IOException) {
ioEx = (IOException)ex.getCause();
}
- if (ioEx != null) {
+ if ((ioEx != null) && (ioEx.getMessage() != null)) {
Review comment:
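Added this check because the flaky test ran into the following
NullPointerException. The ClosedByInterruptException in the log carries no
message, so ioEx.getMessage() apparently returned null and the call on it
threw.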
`2020-04-07 03:30:03,677 WARN [RS_REFRESH_PEER-regionserver/asf905:0-0.replicationSource,2.replicationSource.wal-reader.asf905.gq1.ygridcore.net%2C41391%2C1586230117579,2] impl.BlockReaderFactory(768): I/O error constructing remote block reader.
java.nio.channels.ClosedByInterruptException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2881)
at org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:825)
at org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:750)
at org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:387)
at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:717)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:665)
at org.apache.hadoop.hdfs.DFSInputStream.seekToBlockSource(DFSInputStream.java:1697)
at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:915)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:950)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:996)
at java.io.DataInputStream.read(DataInputStream.java:149)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.PushbackInputStream.read(PushbackInputStream.java:186)
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:209)
at org.apache.hadoop.hbase.KeyValueUtil.createKeyValueFromInputStream(KeyValueUtil.java:716)
at org.apache.hadoop.hbase.codec.KeyValueCodecWithTags$KeyValueDecoder.parseCell(KeyValueCodecWithTags.java:81)
at org.apache.hadoop.hbase.codec.BaseDecoder.advance(BaseDecoder.java:68)
at org.apache.hadoop.hbase.wal.WALEdit.readFromCells(WALEdit.java:276)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:382)
at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:98)
at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:86)
at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.readNextEntryAndRecordReaderPosition(WALEntryStream.java:263)
at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:176)
at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:221)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:138)
2020-04-07 03:30:03,678 ERROR [RS_REFRESH_PEER-regionserver/asf905:0-0.replicationSource,2.replicationSource.wal-reader.asf905.gq1.ygridcore.net%2C41391%2C1586230117579,2] regionserver.ReplicationSource(397): Unexpected exception in RS_REFRESH_PEER-regionserver/asf905:0-0.replicationSource,2.replicationSource.wal-reader.asf905.gq1.ygridcore.net%2C41391%2C1586230117579,2 currentPath=hdfs://localhost:37359/user/jenkins/test-data/260e1f0f-a3fd-6192-b1d7-6568614aef58/WALs/asf905.gq1.ygridcore.net,41391,1586230117579/asf905.gq1.ygridcore.net%2C41391%2C1586230117579.1586230122806
java.lang.NullPointerException
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.extractHiddenEof(ProtobufLogReader.java:449)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:396)
at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:98)
at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:86)
at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.readNextEntryAndRecordReaderPosition(WALEntryStream.java:263)
at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:176)
at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:221)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:138)
`
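
For reference, the failure mode in the log can be reproduced outside HBase with a small standalone sketch. The class name `HiddenEofSketch` is hypothetical. The method body is a simplified guess at the shape of `extractHiddenEof`, based only on the hunk quoted above. The `contains("EOF")` test is an assumption about what the method does with the message, not a copy of the real code. The one fact it relies on is that `java.nio.channels.ClosedByInterruptException` has only a no-argument constructor, so its `getMessage()` returns null.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.channels.ClosedByInterruptException;

// Hypothetical standalone class; not the actual ProtobufLogReader.
public class HiddenEofSketch {

  // Returns the IOException that signals EOF, or null if ex is not an EOF.
  static IOException extractHiddenEof(Exception ex) {
    IOException ioEx = null;
    if (ex instanceof EOFException) {
      return (EOFException) ex;
    } else if (ex instanceof IOException) {
      ioEx = (IOException) ex;
    } else if (ex instanceof RuntimeException
        && ex.getCause() instanceof IOException) {
      ioEx = (IOException) ex.getCause();
    }
    // The added null check: without it, a message-less IOException
    // would make the getMessage() call chain below throw an NPE.
    if (ioEx != null && ioEx.getMessage() != null) {
      // Assumption: EOF is detected by a substring match on the message.
      if (ioEx.getMessage().contains("EOF")) {
        return ioEx;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    // ClosedByInterruptException is an IOException with a null message.
    IOException interrupted = new ClosedByInterruptException();
    System.out.println("message is null: " + (interrupted.getMessage() == null));
    System.out.println("treated as EOF: " + (extractHiddenEof(interrupted) != null));

    IOException eof = new IOException("Premature EOF");
    System.out.println("hidden EOF found: " + (extractHiddenEof(eof) == eof));
  }
}
```

With the guard in place, the interrupted read is simply reported as "not an EOF" rather than surfacing as a NullPointerException in the WAL reader thread.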