[
https://issues.apache.org/jira/browse/HBASE-14262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704233#comment-14704233
]
Duo Zhang commented on HBASE-14262:
-----------------------------------
{quote}
If I revert HBASE-13065, TestAcidGuarantees passes on my local mac...
{quote}
This could happen, since reducing the heap size also increases the native
memory available to us, so we can create more threads...
I also had to deal with TestAcidGuarantees when the new AsyncRpcClient was
introduced. As I remember, I ended up creating a static EventLoopGroup and
making all AsyncRpcClient instances share it to reduce the number of threads...
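A rough sketch of that sharing idea, using only the JDK: a plain ExecutorService stands in for the Netty EventLoopGroup, and SharedPoolClient is a hypothetical stand-in for a client class, not the real AsyncRpcClient. The point is only that a static pool caps the thread count no matter how many clients a test creates.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical client used for illustration only; not the HBase AsyncRpcClient.
class SharedPoolClient {
    // Counts how many worker threads the shared pool has ever created.
    private static final AtomicInteger THREADS_CREATED = new AtomicInteger();

    // One pool per JVM, analogous in spirit to a static EventLoopGroup.
    // The pool size of 2 is an arbitrary value chosen for this sketch.
    static final ExecutorService SHARED_POOL = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r, "shared-rpc-" + THREADS_CREATED.incrementAndGet());
        t.setDaemon(true); // do not keep the JVM alive after tests finish
        return t;
    });

    // Every instance submits work to the same pool instead of owning threads.
    CompletableFuture<String> call(String request) {
        return CompletableFuture.supplyAsync(() -> "echo:" + request, SHARED_POOL);
    }

    static int threadsCreated() {
        return THREADS_CREATED.get();
    }

    public static void main(String[] args) {
        // Fifty clients, but the thread count stays bounded by the pool size.
        for (int i = 0; i < 50; i++) {
            new SharedPoolClient().call("r" + i).join();
        }
        System.out.println("threads created: " + threadsCreated());
    }
}
```

If each client owned its own pool instead, fifty clients in a mini-cluster test would mean at least fifty extra threads, which is the kind of growth that pushes a test run toward the native thread limit.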
So I suggest we check the jstack output first to see whether we really are
creating too many threads. Maybe we are right at the edge of the limit, so
sometimes it fails and sometimes it does not...
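For reference, a similar census can be taken from inside the JVM when jstack is not handy. This is only a sketch: ThreadCensus and its name-grouping rule (strip a trailing number) are assumptions made for illustration, not something that exists in HBase.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; groups live threads by name prefix.
class ThreadCensus {
    // Strip a trailing number and its separator, so that
    // "pool-1-thread-7" and "pool-1-thread-8" land in the same group.
    static String groupOf(String threadName) {
        return threadName.replaceAll("[-_#\\s]*\\d+$", "");
    }

    // Count live threads in this JVM per group.
    static Map<String, Integer> countByGroup() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(groupOf(t.getName()), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Live and peak counts show how close the run came to the limit.
        System.out.println("live=" + mx.getThreadCount()
                + " peak=" + mx.getPeakThreadCount());
        countByGroup().forEach((group, n) -> System.out.println(n + "\t" + group));
    }
}
```

A group with hundreds of members would point at the pool or client that is multiplying threads; a flat distribution would suggest the total is simply too high for the box.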
Thanks.
> Big Trunk unit tests failing with "OutOfMemoryError: unable to create new
> native thread"
> ----------------------------------------------------------------------------------------
>
> Key: HBASE-14262
> URL: https://issues.apache.org/jira/browse/HBASE-14262
> Project: HBase
> Issue Type: Bug
> Components: test
> Reporter: stack
> Assignee: stack
>
> The big unit tests are coming in with OOME: can't create native threads.
> I was also getting the OOME locally running on an MBP. git bisect got me to
> HBASE-13065, where we upped the test heap for TestDistributedLogSplitting
> back in Feb. Around the time that this went in, we had similar OOME issues,
> but then it was because we were running 32-bit JVMs. That does not seem to
> be the case here.
> A recent run failed all the below and most are OOME:
> {code}
> {color:red}-1 core tests{color}. The patch failed these unit tests:
>
> org.apache.hadoop.hbase.replication.TestReplicationEndpoint
>
> org.apache.hadoop.hbase.replication.TestPerTableCFReplication
>
> org.apache.hadoop.hbase.wal.TestBoundedRegionGroupingProvider
>
> org.apache.hadoop.hbase.replication.TestReplicationKillMasterRSCompressed
>
> org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpointNoMaster
>
> org.apache.hadoop.hbase.replication.TestReplicationKillSlaveRS
>
> org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpoint
> org.apache.hadoop.hbase.replication.TestMasterReplication
> org.apache.hadoop.hbase.mapred.TestTableMapReduce
>
> org.apache.hadoop.hbase.regionserver.TestRegionMergeTransactionOnCluster
> org.apache.hadoop.hbase.regionserver.TestRegionFavoredNodes
>
> org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
> org.apache.hadoop.hbase.zookeeper.TestZKLeaderManager
>
> org.apache.hadoop.hbase.master.TestMasterRestartAfterDisablingTable
> org.apache.hadoop.hbase.TestGlobalMemStoreSize
> org.apache.hadoop.hbase.wal.TestWALFiltering
>
> org.apache.hadoop.hbase.replication.TestReplicationSmallTests
>
> org.apache.hadoop.hbase.replication.TestReplicationSyncUpTool
> org.apache.hadoop.hbase.replication.TestReplicationWithTags
>
> org.apache.hadoop.hbase.master.procedure.TestTruncateTableProcedure
>
> org.apache.hadoop.hbase.replication.TestReplicationChangingPeerRegionservers
>
> org.apache.hadoop.hbase.wal.TestDefaultWALProviderWithHLogKey
>
> org.apache.hadoop.hbase.regionserver.TestPerColumnFamilyFlush
>
> org.apache.hadoop.hbase.snapshot.TestMobRestoreFlushSnapshotFromClient
>
> org.apache.hadoop.hbase.master.procedure.TestCreateTableProcedure
> org.apache.hadoop.hbase.wal.TestWALFactory
>
> org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedure
>
> org.apache.hadoop.hbase.replication.TestReplicationDisableInactivePeer
>
> org.apache.hadoop.hbase.master.procedure.TestAddColumnFamilyProcedure
>
> org.apache.hadoop.hbase.mapred.TestMultiTableSnapshotInputFormat
>
> org.apache.hadoop.hbase.master.procedure.TestEnableTableProcedure
>
> org.apache.hadoop.hbase.master.TestMasterFailoverBalancerPersistence
> org.apache.hadoop.hbase.TestStochasticBalancerJmxMetrics
> {color:red}-1 core zombie tests{color}. There are 16 zombie test(s):
> at org.apache.hadoop.hbase.security.visibility.TestVisibilityLabels.testVisibilityLabelsWithComplexLabels(TestVisibilityLabels.java:216)
> at org.apache.hadoop.hbase.mapred.TestTableInputFormat.testTableRecordReaderScannerFail(TestTableInputFormat.java:281)
> at org.apache.hadoop.hbase.replication.TestMultiSlaveReplication.testMultiSlaveReplication(TestMultiSlaveReplication.java:129)
> at org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileScanning(TestHRegion.java:3799)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)