[
https://issues.apache.org/jira/browse/HBASE-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595538#comment-13595538
]
Hadoop QA commented on HBASE-7982:
----------------------------------
{color:green}+1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12572481/7982v4.txt
against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author
tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new
or modified tests.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop
2.0 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any
warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/4708//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4708//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/4708//console
This message is automatically generated.
> TestReplicationQueueFailover* runs for a minute, spews 3/4million lines
> complaining 'Filesystem closed', has an NPE, and still passes?
> --------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-7982
> URL: https://issues.apache.org/jira/browse/HBASE-7982
> Project: HBase
> Issue Type: Bug
> Components: build
> Reporter: stack
> Assignee: Jeffrey Zhong
> Priority: Blocker
> Fix For: 0.95.0
>
> Attachments: 7982v3.txt, 7982v4.txt, hbase-7982-combined_1.patch,
> hbase-7982-combined.patch, hbase-7982-huge-logging.patch,
> hbase-7982-NPE_2.patch, hbase-7982-NPE.patch,
> hbase-7982-with-Stack-changes-02.patch
>
>
> I was trying to look at why the odd time Hudson OOMEs trying to make a report
> on 0.95 build #4 https://builds.apache.org/job/hbase-0.95/4/console:
> {code}
> ERROR: Failed to archive test reports
> hudson.util.IOException2: remote file operation failed:
> /home/jenkins/jenkins-slave/workspace/hbase-0.95 at
> hudson.remoting.Channel@151a4e3e:ubuntu3
> at hudson.FilePath.act(FilePath.java:861)
> at hudson.FilePath.act(FilePath.java:838)
> at hudson.tasks.junit.JUnitParser.parse(JUnitParser.java:87)
> at
> ...
> Caused by: java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapCharBuffer.<init>(HeapCharBuffer.java:57)
> at java.nio.CharBuffer.allocate(CharBuffer.java:329)
> at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:792)
> at java.nio.charset.Charset.decode(Charset.java:791)
> at hudson.tasks.junit.SuiteResult.<init>(SuiteResult.java:215)
> ...
> {code}
> We are trying to allocate a big buffer and failing.
> Looking at reports being generated, we have quite a few that are > 10MB in
> size:
> {code}
> durruti:0.95 stack$ find hbase-* -type f -size +10000k -exec ls -la {} \;
> -rw-r--r--@ 1 stack staff 11126492 Feb 27 06:14
> hbase-server/target/surefire-reports/org.apache.hadoop.hbase.backup.TestHFileArchiving-output.txt
> -rw-r--r--@ 1 stack staff 13296009 Feb 27 05:47
> hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestFromClientSide3-output.txt
> -rw-r--r--@ 1 stack staff 10541898 Feb 27 05:47
> hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestMultiParallel-output.txt
> -rw-r--r--@ 1 stack staff 25344601 Feb 27 05:51
> hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient-output.txt
> -rw-r--r--@ 1 stack staff 17966969 Feb 27 06:12
> hbase-server/target/surefire-reports/org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction-output.txt
> -rw-r--r--@ 1 stack staff 17699068 Feb 27 06:09
> hbase-server/target/surefire-reports/org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit-output.txt
> -rw-r--r--@ 1 stack staff 17701832 Feb 27 06:07
> hbase-server/target/surefire-reports/org.apache.hadoop.hbase.regionserver.wal.TestHLogSplitCompressed-output.txt
> -rw-r--r--@ 1 stack staff 717853709 Feb 27 06:17
> hbase-server/target/surefire-reports/org.apache.hadoop.hbase.replication.TestReplicationQueueFailover-output.txt
> -rw-r--r--@ 1 stack staff 563616793 Feb 27 06:17
> hbase-server/target/surefire-reports/org.apache.hadoop.hbase.replication.TestReplicationQueueFailoverCompressed-output.txt
> {code}
> ... with TestReplicationQueueFailover* being order of magnitude bigger than
> the others.
> Looking in the test I see both spewing between 800 and 900 thousand lines in
> about a minute. Here is their fixation:
> {code}
> 8908998 2013-02-27 06:17:48,176 ERROR
> [RegionServer:1;hemera.apache.org,35712,1361945801803.logSyncer]
> wal.FSHLog$LogSyncer(1012): Error while syncing, requesting close of hlog.
> 8908999 java.io.IOException: Filesystem closed
> 8909000 ,...at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:319)
> 8909001 ,...at org.apache.hadoop.hdfs.DFSClient.access$1200(DFSClient.java:78)
> 8909002 ,...at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3843)
> 8909003 ,...at
> org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
> 8909004 ,...at
> org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:999)
> 8909005 ,...at
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:248)
> 8909006 ,...at
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1120)
> 8909007 ,...at
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1058)
> 8909008 ,...at
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.sync(FSHLog.java:1228)
> 8909009 ,...at
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$LogSyncer.run(FSHLog.java:1010)
> 8909010 ,...at java.lang.Thread.run(Thread.java:722)
> 8909011 2013-02-27 06:17:48,176 FATAL
> [RegionServer:1;hemera.apache.org,35712,1361945801803.logSyncer]
> wal.FSHLog(1140): Could not sync. Requesting close of hlog
> 8909012 java.io.IOException: Filesystem closed
> 8909013 ,...at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:319)
> 8909014 ,...at org.apache.hadoop.hdfs.DFSClient.access$1200(DFSClient.java:78)
> 8909015 ,...at
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3843)
> 8909016 ,...at
> org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
> 8909017 ,...at
> org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:999)
> 8909018 ,...at
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:248)
> 8909019 ,...at
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1120)
> 8909020 ,...at
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1058)
> 8909021 ,...at
> org.apache.hadoop.hbase.regionserver.wal.FSHLog.sync(FSHLog.java:1228)
> 8909022 ,...at
> org.apache.hadoop.hbase.regionserver.wal.FSHLog$LogSyncer.run(FSHLog.java:1010)
> 8909023 ,...at java.lang.Thread.run(Thread.java:722)
> ...
> {code}
> These tests are 'succeeding'?
> I also see in both:
> {code}
> 3891 java.lang.NullPointerException
> 3892 ,...at
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.seek(SequenceFileLogReader.java:261)
> 3893 ,...at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.seek(ReplicationHLogReaderManager.java:103)
> 3894 ,...at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:415)
> 3895 ,...at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:333)
> {code}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira