[ 
https://issues.apache.org/jira/browse/HBASE-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14330016#comment-14330016
 ] 

zhangduo commented on HBASE-13065:
----------------------------------

[~stack] It looks like the HBase-TRUNK Jenkins job is running on a 32-bit 
machine? If so, there is only 4GB of address space. We increased the heap from 
1900m to 2800m, which leaves much less address space for native memory.
In TestAcidGuarantees, we create a connection per read/write thread but do not 
close it, so we create lots of AsyncRpcClient instances. Each instance has its 
own netty thread pool, so there are too many threads, and their stacks easily 
exceed the 4GB address space (VIRT, not RES) limit and make the test fail.
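
To illustrate the thread growth described above, here is a minimal JDK-only sketch. It does not use the HBase or netty APIs: PooledClient is a hypothetical stand-in for a connection whose RPC client owns its own thread pool, and the worker count and pool size are made-up numbers. It only compares how many pool threads get started when every worker creates its own client versus when all workers reuse one client that is closed afterwards; sharing is shown as one possible remedy, not necessarily the fix applied in HBase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionLeakSketch {
    // Counts every thread handed out by any client's thread factory.
    static final AtomicInteger THREADS_STARTED = new AtomicInteger();

    /** Hypothetical stand-in for a connection that owns its own thread pool. */
    static final class PooledClient implements AutoCloseable {
        private final ThreadPoolExecutor pool;

        PooledClient(int poolSize) {
            pool = new ThreadPoolExecutor(poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                    new LinkedBlockingQueue<Runnable>(), r -> {
                        THREADS_STARTED.incrementAndGet();
                        Thread t = new Thread(r);
                        t.setDaemon(true);
                        return t;
                    });
            // Start all pool threads eagerly so the count is deterministic.
            pool.prestartAllCoreThreads();
        }

        @Override
        public void close() {
            pool.shutdownNow();
        }
    }

    /** One client per worker: thread count grows as workers * poolSize. */
    static int threadsForPerWorkerClients(int workers, int poolSize) {
        THREADS_STARTED.set(0);
        List<PooledClient> clients = new ArrayList<>();
        try {
            for (int i = 0; i < workers; i++) {
                clients.add(new PooledClient(poolSize));
            }
            return THREADS_STARTED.get();
        } finally {
            // The sketch cleans up after counting; a test that never closes
            // its clients would keep all of these threads alive.
            for (PooledClient c : clients) {
                c.close();
            }
        }
    }

    /** One shared client: thread count stays at poolSize regardless of workers. */
    static int threadsForSharedClient(int workers, int poolSize) {
        THREADS_STARTED.set(0);
        try (PooledClient shared = new PooledClient(poolSize)) {
            // All 'workers' would reuse 'shared' here instead of creating their own.
            return THREADS_STARTED.get();
        }
    }

    public static void main(String[] args) {
        System.out.println("per-worker clients: " + threadsForPerWorkerClients(10, 4));
        System.out.println("shared client:      " + threadsForSharedClient(10, 4));
    }
}
```

Each started thread reserves virtual address space for its stack whether or not it does any work, which is why the per-worker pattern hurts most on a 32-bit JVM with a large heap.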

{noformat}
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:693)
        at io.netty.util.concurrent.SingleThreadEventExecutor.shutdownGracefully(SingleThreadEventExecutor.java:557)
        at io.netty.util.concurrent.MultithreadEventExecutorGroup.shutdownGracefully(MultithreadEventExecutorGroup.java:146)
        at io.netty.util.concurrent.AbstractEventExecutorGroup.shutdownGracefully(AbstractEventExecutorGroup.java:69)
        at org.apache.hadoop.hbase.ipc.AsyncRpcClient.close(AsyncRpcClient.java:253)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.internalClose(ConnectionManager.java:2373)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.close(ConnectionManager.java:2386)
        at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:1055)
        at org.apache.hadoop.hbase.TestAcidGuarantees.testGetAtomicity(TestAcidGuarantees.java:346)
{noformat}

I wonder why we run this Jenkins job on 32-bit machines. It seems 
PreCommit-Build runs on 64-bit machines.

> Increasing -Xmx when running TestDistributedLogSplitting
> --------------------------------------------------------
>
>                 Key: HBASE-13065
>                 URL: https://issues.apache.org/jira/browse/HBASE-13065
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>            Reporter: zhangduo
>            Assignee: zhangduo
>             Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.11
>
>         Attachments: 13065-fix.txt
>
>
> Found this in PreCommit Build reports
> https://builds.apache.org/job/PreCommit-HBASE-Build/12885/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.master.TestDistributedLogSplitting-output.txt
> {noformat}
> 2015-02-18 03:45:42,141 WARN  [RS:4;asf901:41265] util.Sleeper(97): We slept 59018ms instead of 1000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
> 2015-02-18 03:45:26,750 WARN  [JvmPauseMonitor] util.JvmPauseMonitor$Monitor(167): Detected pause in JVM or host machine (eg GC): pause of approximately 39767ms
> GC pool 'PS MarkSweep' had collection(s): count=65 time=47720ms
> {noformat}
> Maybe we should increase the max heap size since this test starts 6 
> regionservers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
