[ https://issues.apache.org/jira/browse/HBASE-12074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169745#comment-14169745 ]
Dima Spivak commented on HBASE-12074:
-------------------------------------
I've seen this a few times on branch-1 as well. Here is some stdout for anyone
who wants more detail:
{code}
2014-10-13 01:50:05,121 INFO [27] wal.FSHLog(895): Rolled WAL /root/hbase/hbase-server/target/test-data/a50fe909-a402-40d1-85ac-4b1cd45ce1a3/logs/hlog.1413165005081 with entries=270, filesize=289.07 KB; new WAL /root/hbase/hbase-server/target/test-data/a50fe909-a402-40d1-85ac-4b1cd45ce1a3/logs/hlog.1413165005100
2014-10-13 01:50:05,130 INFO [31] wal.FSHLog(895): Rolled WAL /root/hbase/hbase-server/target/test-data/a50fe909-a402-40d1-85ac-4b1cd45ce1a3/logs/hlog.1413165005100 with entries=20, filesize=21.50 KB; new WAL /root/hbase/hbase-server/target/test-data/a50fe909-a402-40d1-85ac-4b1cd45ce1a3/logs/hlog.1413165005123
2014-10-13 01:50:05,141 INFO [35] wal.FSHLog(895): Rolled WAL /root/hbase/hbase-server/target/test-data/a50fe909-a402-40d1-85ac-4b1cd45ce1a3/logs/hlog.1413165005123 with entries=70, filesize=75.01 KB; new WAL /root/hbase/hbase-server/target/test-data/a50fe909-a402-40d1-85ac-4b1cd45ce1a3/logs/hlog.1413165005131
2014-10-13 01:50:05,151 INFO [49] wal.FSHLog(895): Rolled WAL /root/hbase/hbase-server/target/test-data/a50fe909-a402-40d1-85ac-4b1cd45ce1a3/logs/hlog.1413165005131 with entries=140, filesize=149.93 KB; new WAL /root/hbase/hbase-server/target/test-data/a50fe909-a402-40d1-85ac-4b1cd45ce1a3/logs/hlog.1413165005143
2014-10-13 01:50:05,161 ERROR [sync.3] wal.FSHLog$SyncRunner(1306): Error syncing, request close of hlog
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:173)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1302)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:170)
    ... 2 more
2014-10-13 01:50:05,162 INFO [2] wal.FSHLog(895): Rolled WAL /root/hbase/hbase-server/target/test-data/a50fe909-a402-40d1-85ac-4b1cd45ce1a3/logs/hlog.1413165005143 with entries=185, filesize=198.10 KB; new WAL /root/hbase/hbase-server/target/test-data/a50fe909-a402-40d1-85ac-4b1cd45ce1a3/logs/hlog.1413165005152
2014-10-13 01:50:05,162 INFO [23] wal.TestLogRollingNoCluster$Appender(139): Caught exception from Appender:23
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:173)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1302)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:170)
    ... 2 more
2014-10-13 01:50:05,162 INFO [5] wal.TestLogRollingNoCluster$Appender(139): Caught exception from Appender:5
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:173)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1302)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:170)
    ... 2 more
2014-10-13 01:50:05,162 INFO [53] wal.TestLogRollingNoCluster$Appender(139): Caught exception from Appender:53
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:173)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1302)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:170)
    ... 2 more
2014-10-13 01:50:05,162 INFO [15] wal.TestLogRollingNoCluster$Appender(139): Caught exception from Appender:15
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:173)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1302)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:170)
    ... 2 more
2014-10-13 01:50:05,162 INFO [3] wal.TestLogRollingNoCluster$Appender(139): Caught exception from Appender:3
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:173)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SyncRunner.run(FSHLog.java:1302)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:170)
    ... 2 more
{code}
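For readers without the source at hand, here is a minimal sketch of how a sync that races with a close could produce an NullPointerException wrapped in an IOException like the ones in the traces above. Everything in it is a hypothetical stand-in: SketchWriter is not the real ProtobufLogWriter, and the assumption that close() nulls out the output field is mine, not something confirmed in this thread. The close-then-sync ordering is forced sequentially so the outcome is deterministic; in the test the same interleaving would come from a log roll on another thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class SyncCloseRaceSketch {

    /** Hypothetical stand-in for a WAL writer; NOT the real ProtobufLogWriter. */
    static class SketchWriter {
        // volatile so that a close() on another thread is visible to sync()
        private volatile OutputStream output = new ByteArrayOutputStream();

        void sync() throws IOException {
            try {
                // Dereferences the field directly, so a close() that has
                // already nulled it makes this line throw an NPE.
                this.output.flush();
            } catch (NullPointerException npe) {
                // Same wrapping pattern as the sync() quoted in the issue description.
                throw new IOException(npe);
            }
        }

        // Assumption: closing the writer clears the field.
        void close() throws IOException {
            OutputStream out = this.output;
            if (out != null) {
                out.close();
                this.output = null;
            }
        }
    }

    /** Returns true if sync() after close() fails with an IOException caused by an NPE. */
    static boolean syncAfterCloseFailsWithNpeCause() throws IOException {
        SketchWriter writer = new SketchWriter();
        writer.sync();   // succeeds while the writer is open
        writer.close();  // stands in for a roll that closes the old writer
        try {
            writer.sync();
            return false;
        } catch (IOException e) {
            return e.getCause() instanceof NullPointerException;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(syncAfterCloseFailsWithNpeCause());
    }
}
```

If the real writer behaves like this sketch, the IOException only says that a sync reached a writer that had already been closed. Whether the caller should tolerate that during a roll is the open question in this issue.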
> TestLogRollingNoCluster#testContendedLogRolling() failed
> --------------------------------------------------------
>
> Key: HBASE-12074
> URL: https://issues.apache.org/jira/browse/HBASE-12074
> Project: HBase
> Issue Type: Bug
> Reporter: Enis Soztutar
>
> TestLogRollingNoCluster#testContendedLogRolling() failed on a 0.98 run. I am
> trying to understand the context.
> The failure is this:
> {code}
> java.lang.AssertionError
>     at org.junit.Assert.fail(Assert.java:86)
>     at org.junit.Assert.assertTrue(Assert.java:41)
>     at org.junit.Assert.assertFalse(Assert.java:64)
>     at org.junit.Assert.assertFalse(Assert.java:74)
>     at org.apache.hadoop.hbase.regionserver.wal.TestLogRollingNoCluster.testContendedLogRolling(TestLogRollingNoCluster.java:80)
> {code}
> This happened because one of the Appenders calling FSHLog.sync() threw an
> IOException due to a concurrent close:
> {code}
> 4-09-23 16:36:39,530 FATAL [pool-1-thread-1-WAL.AsyncSyncer0] wal.FSHLog$AsyncSyncer(1246): Error while AsyncSyncer sync, request close of hlog
> java.io.IOException: java.lang.NullPointerException
>     at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:168)
>     at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241)
>     at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165)
>     ... 2 more
> 2014-09-23 16:36:39,531 INFO [32] wal.TestLogRollingNoCluster$Appender(137): Caught exception from Appender:32
> java.io.IOException: java.lang.NullPointerException
>     at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:168)
>     at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241)
>     at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165)
>     ... 2 more
> 2014-09-23 16:36:39,532 INFO [19] wal.TestLogRollingNoCluster$Appender(137): Caught exception from Appender:19
> java.io.IOException: java.lang.NullPointerException
>     at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:168)
>     at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1241)
>     at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:165)
>     ... 2 more
> {code}
> The code is:
> {code}
> public void sync() throws IOException {
>   try {
>     this.output.flush();
>     this.output.sync();
>   } catch (NullPointerException npe) {
>     // Concurrent close...
>     throw new IOException(npe);
>   }
> }
> {code}
> I think the test case was written exactly to catch this case:
> {code}
> * Spin up a bunch of threads and have them all append to a WAL. Roll the
> * WAL frequently to try and trigger NPE.
> {code}
> I am reporting this because I don't have much context. It may be an actual
> bug rather than a test issue.