[
https://issues.apache.org/jira/browse/HBASE-14819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012594#comment-15012594
]
stack commented on HBASE-14819:
-------------------------------
bq. This is a failure to allocate PermGen, up it with -XX:MaxPermSize=XXXm,
e.g. 512. Only works with Java <= 7. Java 8 will accept the parameter but throw
a warning at JVM startup (PermGen went away in 8)
Yeah. I was thinking I could up the heap and PermGen would go up accordingly,
but I'm not sure what heap size we are running with... Some percentage of total
heap, but it may vary across build machines... need to figure out what's going
on here. I was going to set a heap size when we fork, but that might not be the
right thing to do (4G seems good).
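A minimal sketch, not part of the HBase build, of how the effective limits could be checked from inside a forked test JVM. The class name JvmMemoryReport is hypothetical. It uses only the standard java.lang.management API; the pool holding class metadata is typically reported under a name containing "Perm Gen" on Java 7 and "Metaspace" on Java 8, but the exact names depend on the JVM and collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: prints the heap ceiling and every memory pool's max size.
public class JvmMemoryReport {
    private static final long MB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use, in bytes. */
    public static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** One line per memory pool: name, type, and max size (or "undefined"). */
    public static String describePools() {
        StringBuilder sb = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            long max = usage.getMax(); // -1 means no defined maximum
            sb.append(pool.getName())
              .append(" (").append(pool.getType()).append("): max=")
              .append(max < 0 ? "undefined" : (max / MB) + "m")
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("max heap=" + (maxHeapBytes() / MB) + "m");
        System.out.print(describePools());
    }
}
```

Running this with the same JVM arguments the forked tests get would show whether the PermGen ceiling actually moves when only the heap size is raised.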
Then I tried running these locally so I could study them. The first one fails as
above with 2k+ threads (which is crazy). Let me add a 'fix' so that we use ONLY
1500 threads in the test...
The other ITs just fail for all kinds of reasons. TODO.
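A rough sketch of the kind of cap described above, assuming the test takes a requested thread count from somewhere. The class ThreadCap, the method capThreads, and the constant MAX_TEST_THREADS are hypothetical names for illustration, not HBase code; only the 1500 figure comes from the comment. The live and peak thread counts come from the standard ThreadMXBean and can help when studying a run locally.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: clamps a requested thread count and reports JVM thread usage.
public class ThreadCap {
    /** Upper bound taken from the comment above; not an HBase constant. */
    public static final int MAX_TEST_THREADS = 1500;

    /** Returns the requested count, limited to MAX_TEST_THREADS. */
    public static int capThreads(int requested) {
        if (requested < 1) {
            throw new IllegalArgumentException("requested must be >= 1: " + requested);
        }
        return Math.min(requested, MAX_TEST_THREADS);
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("capped=" + capThreads(2048));
        System.out.println("live threads=" + threads.getThreadCount()
            + ", peak=" + threads.getPeakThreadCount());
    }
}
```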
> hbase-it tests failing with OOME
> --------------------------------
>
> Key: HBASE-14819
> URL: https://issues.apache.org/jira/browse/HBASE-14819
> Project: HBase
> Issue Type: Sub-task
> Components: test
> Reporter: stack
> Attachments: Screen Shot 2015-11-16 at 11.37.41 PM.png
>
>
> Let me up the heap used when failsafe forks.
> Here is example OOME doing ITBLL:
> {code}
> 2015-11-16 03:09:15,073 INFO [Thread-694] actions.BatchRestartRsAction(69):
> Starting region server:asf905.gq1.ygridcore.net
> 2015-11-16 03:09:15,099 INFO [Thread-694] client.ConnectionUtils(104):
> regionserver/asf905.gq1.ygridcore.net/67.195.81.149:0 server-side HConnection
> retries=350
> 2015-11-16 03:09:15,099 INFO [Thread-694] ipc.SimpleRpcScheduler(128): Using
> deadline as user call queue, count=1
> 2015-11-16 03:09:15,101 INFO [Thread-694] ipc.RpcServer$Listener(607):
> regionserver/asf905.gq1.ygridcore.net/67.195.81.149:0: started 3 reader(s)
> listening on port=36114
> 2015-11-16 03:09:15,103 INFO [Thread-694] fs.HFileSystem(252): Added
> intercepting call to namenode#getBlockLocations so can do block reordering
> using class class org.apache.hadoop.hbase.fs.HFileSystem$ReorderWALBlocks
> 2015-11-16 03:09:15,104 INFO [Thread-694]
> zookeeper.RecoverableZooKeeper(120): Process identifier=regionserver:36114
> connecting to ZooKeeper ensemble=localhost:50139
> 2015-11-16 03:09:15,117 DEBUG [Thread-694-EventThread]
> zookeeper.ZooKeeperWatcher(554): regionserver:361140x0,
> quorum=localhost:50139, baseZNode=/hbase Received ZooKeeper Event, type=None,
> state=SyncConnected, path=null
> 2015-11-16 03:09:15,118 DEBUG [Thread-694] zookeeper.ZKUtil(492):
> regionserver:361140x0, quorum=localhost:50139, baseZNode=/hbase Set watcher
> on existing znode=/hbase/master
> 2015-11-16 03:09:15,119 DEBUG [Thread-694] zookeeper.ZKUtil(492):
> regionserver:361140x0, quorum=localhost:50139, baseZNode=/hbase Set watcher
> on existing znode=/hbase/running
> 2015-11-16 03:09:15,119 DEBUG [Thread-694-EventThread]
> zookeeper.ZooKeeperWatcher(638): regionserver:36114-0x1510e2c6f1d0029
> connected
> 2015-11-16 03:09:15,120 INFO [RpcServer.responder]
> ipc.RpcServer$Responder(926): RpcServer.responder: starting
> 2015-11-16 03:09:15,121 INFO [RpcServer.listener,port=36114]
> ipc.RpcServer$Listener(738): RpcServer.listener,port=36114: starting
> 2015-11-16 03:09:15,121 DEBUG [Thread-694] ipc.RpcExecutor(115): B.default
> Start Handler index=0 queue=0
> 2015-11-16 03:09:15,121 DEBUG [Thread-694] ipc.RpcExecutor(115): B.default
> Start Handler index=1 queue=0
> 2015-11-16 03:09:15,121 DEBUG [Thread-694] ipc.RpcExecutor(115): B.default
> Start Handler index=2 queue=0
> 2015-11-16 03:09:15,122 DEBUG [Thread-694] ipc.RpcExecutor(115): B.default
> Start Handler index=3 queue=0
> 2015-11-16 03:09:15,122 DEBUG [Thread-694] ipc.RpcExecutor(115): B.default
> Start Handler index=4 queue=0
> 2015-11-16 03:09:15,122 DEBUG [Thread-694] ipc.RpcExecutor(115): Priority
> Start Handler index=0 queue=0
> 2015-11-16 03:09:15,123 DEBUG [Thread-694] ipc.RpcExecutor(115): Priority
> Start Handler index=1 queue=1
> 2015-11-16 03:09:15,123 DEBUG [Thread-694] ipc.RpcExecutor(115): Priority
> Start Handler index=2 queue=0
> 2015-11-16 03:09:15,123 DEBUG [Thread-694] ipc.RpcExecutor(115): Priority
> Start Handler index=3 queue=1
> 2015-11-16 03:09:15,124 DEBUG [Thread-694] ipc.RpcExecutor(115): Priority
> Start Handler index=4 queue=0
> 2015-11-16 03:09:15,124 DEBUG [Thread-694] ipc.RpcExecutor(115): Replication
> Start Handler index=0 queue=0
> 2015-11-16 03:09:15,124 DEBUG [Thread-694] ipc.RpcExecutor(115): Replication
> Start Handler index=1 queue=0
> 2015-11-16 03:09:15,124 DEBUG [Thread-694] ipc.RpcExecutor(115): Replication
> Start Handler index=2 queue=0
> 2015-11-16 03:09:15,761 DEBUG [RS:0;asf905:36114]
> client.ConnectionManager$HConnectionImplementation(715): connection
> construction failed
> java.io.IOException: java.lang.OutOfMemoryError: PermGen space
> at org.apache.hadoop.hbase.client.RegistryFactory.getRegistry(RegistryFactory.java:43)
> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.setupRegistry(ConnectionManager.java:886)
> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:692)
> at org.apache.hadoop.hbase.client.ConnectionUtils$2.<init>(ConnectionUtils.java:154)
> at org.apache.hadoop.hbase.client.ConnectionUtils.createShortCircuitConnection(ConnectionUtils.java:154)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.createClusterConnection(HRegionServer.java:689)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.setupClusterConnection(HRegionServer.java:720)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:733)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:889)
> at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:156)
> at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:108)
> at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:140)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
> at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:334)
> at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:138)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.OutOfMemoryError: PermGen space
> at sun.misc.Unsafe.defineClass(Native Method)
> at sun.reflect.ClassDefiner.defineClass(ClassDefiner.java:63)
> at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:399)
> at sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:396)
> at java.security.AccessController.doPrivileged(Native Method)
> at sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:395)
> at sun.reflect.MethodAccessorGenerator.generateConstructor(MethodAccessorGenerator.java:94)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:48)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at java.lang.Class.newInstance(Class.java:383)
> at org.apache.hadoop.hbase.client.RegistryFactory.getRegistry(RegistryFactory.java:41)
> ... 17 more
> {code}