I see that your region server had 5188 store files across 121 stores; I'm 99% sure that's the cause of your OOME. Luckily for you, we've been working on this issue since last week. What you should do:

- Upgrade to HBase 0.19.1
- Apply the latest patch (the v3) in https://issues.apache.org/jira/browse/HBASE-1058

Then you should be good.

As to what caused this huge number of store files, I wouldn't be surprised if your data was uploaded sequentially; that would mean that whatever the number of regions in your table (hence the level of distribution), only 1 region gets the load at any time. This implies that another workaround for your problem would be to insert with a more randomized key pattern; see the sketch below my sig.

Thanks for trying either solution,

J-D
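P.S. Here's a minimal, untested sketch of what I mean by a randomized pattern, written against the 0.19 client API you're already using (BatchUpdate). The table name 'mytable', the column 'data:value', and the bucket count are placeholders, substitute your own. The idea is to prefix each sequential key with a hash bucket so that consecutive writes sort into different regions instead of all landing on the last one:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;

public class SaltedLoader {

  // Roughly match this to the number of regions you want the load spread over.
  private static final int BUCKETS = 32;

  // Prefix the sequential key with a stable two-digit bucket so that
  // consecutive keys no longer sort next to each other.
  static String saltedKey(String sequentialKey) {
    int bucket = Math.abs(sequentialKey.hashCode() % BUCKETS);
    return String.format("%02d-%s", bucket, sequentialKey);
  }

  public static void main(String[] args) throws Exception {
    HBaseConfiguration conf = new HBaseConfiguration();
    HTable table = new HTable(conf, "mytable");     // placeholder table name
    for (long i = 0; i < 1000000; i++) {
      BatchUpdate bu = new BatchUpdate(saltedKey("row-" + i));
      bu.put("data:value", ("v" + i).getBytes());   // placeholder column
      table.commit(bu);
    }
  }
}

The trade-off is that readers have to compute the same salt to fetch a key back, and scans won't return rows in the original sequential order.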
On Mon, Apr 13, 2009 at 8:28 AM, 11 Nov. <[email protected]> wrote:
> hi colleagues,
> We have recently been doing data inserts on a 32-node HBase cluster using
> the MapReduce framework, but the operation always fails because of
> regionserver exceptions. We run 4 map tasks simultaneously on each node and
> use the BatchUpdate() API to do the inserts.
> We have been suffering from this problem since last month; it only shows up
> on relatively large clusters at high concurrent insert rates. We are using
> hadoop-0.19.2 from current svn (the head revision as of last week), and
> hbase 0.19.0.
>
> Here is the hadoop-site.xml configuration file:
>
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://192.168.33.204:11004/</value>
>   </property>
>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>192.168.33.204:11005</value>
>   </property>
>
>   <property>
>     <name>dfs.secondary.http.address</name>
>     <value>0.0.0.0:51100</value>
>     <description>
>       The secondary namenode http server address and port.
>       If the port is 0 then the server will start on a free port.
>     </description>
>   </property>
>
>   <property>
>     <name>dfs.datanode.address</name>
>     <value>0.0.0.0:51110</value>
>     <description>
>       The address where the datanode server will listen to.
>       If the port is 0 then the server will start on a free port.
>     </description>
>   </property>
>
>   <property>
>     <name>dfs.datanode.http.address</name>
>     <value>0.0.0.0:51175</value>
>     <description>
>       The datanode http server address and port.
>       If the port is 0 then the server will start on a free port.
>     </description>
>   </property>
>
>   <property>
>     <name>dfs.datanode.ipc.address</name>
>     <value>0.0.0.0:11010</value>
>     <description>
>       The datanode ipc server address and port.
>       If the port is 0 then the server will start on a free port.
>     </description>
>   </property>
>
>   <property>
>     <name>dfs.datanode.handler.count</name>
>     <value>30</value>
>     <description>The number of server threads for the datanode.</description>
>   </property>
>
>   <property>
>     <name>dfs.namenode.handler.count</name>
>     <value>30</value>
>     <description>The number of server threads for the namenode.</description>
>   </property>
>
>   <property>
>     <name>mapred.job.tracker.handler.count</name>
>     <value>30</value>
>   </property>
>
>   <property>
>     <name>mapred.reduce.parallel.copies</name>
>     <value>30</value>
>   </property>
>
>   <property>
>     <name>dfs.http.address</name>
>     <value>0.0.0.0:51170</value>
>     <description>
>       The address and the base port where the dfs namenode web ui will
>       listen on.
>       If the port is 0 then the server will start on a free port.
>     </description>
>   </property>
>
>   <property>
>     <name>dfs.datanode.max.xcievers</name>
>     <value>8192</value>
>     <description>
>     </description>
>   </property>
>
>   <property>
>     <name>dfs.datanode.socket.write.timeout</name>
>     <value>0</value>
>     <description>
>     </description>
>   </property>
>
>   <property>
>     <name>dfs.datanode.https.address</name>
>     <value>0.0.0.0:50477</value>
>   </property>
>
>   <property>
>     <name>dfs.https.address</name>
>     <value>0.0.0.0:50472</value>
>   </property>
>
>   <property>
>     <name>mapred.job.tracker.http.address</name>
>     <value>0.0.0.0:51130</value>
>     <description>
>       The job tracker http server address and port the server will listen on.
>       If the port is 0 then the server will start on a free port.
>     </description>
>   </property>
>
>   <property>
>     <name>mapred.task.tracker.http.address</name>
>     <value>0.0.0.0:51160</value>
>     <description>
>       The task tracker http server address and port.
>       If the port is 0 then the server will start on a free port.
>     </description>
>   </property>
>
>   <property>
>     <name>mapred.map.tasks</name>
>     <value>3</value>
>   </property>
>
>   <property>
>     <name>mapred.reduce.tasks</name>
>     <value>2</value>
>   </property>
>
>   <property>
>     <name>mapred.tasktracker.map.tasks.maximum</name>
>     <value>4</value>
>     <description>
>       The maximum number of map tasks that will be run simultaneously by a
>       task tracker.
>     </description>
>   </property>
>
>   <property>
>     <name>dfs.name.dir</name>
>     <value>/data0/hbase/filesystem/dfs/name,/data1/hbase/filesystem/dfs/name,/data2/hbase/filesystem/dfs/name,/data3/hbase/filesystem/dfs/name</value>
>   </property>
>
>   <property>
>     <name>dfs.data.dir</name>
>     <value>/data0/hbase/filesystem/dfs/data,/data1/hbase/filesystem/dfs/data,/data2/hbase/filesystem/dfs/data,/data3/hbase/filesystem/dfs/data</value>
>   </property>
>
>   <property>
>     <name>fs.checkpoint.dir</name>
>     <value>/data0/hbase/filesystem/dfs/namesecondary,/data1/hbase/filesystem/dfs/namesecondary,/data2/hbase/filesystem/dfs/namesecondary,/data3/hbase/filesystem/dfs/namesecondary</value>
>   </property>
>
>   <property>
>     <name>mapred.system.dir</name>
>     <value>/data1/hbase/filesystem/mapred/system</value>
>   </property>
>
>   <property>
>     <name>mapred.local.dir</name>
>     <value>/data0/hbase/filesystem/mapred/local,/data1/hbase/filesystem/mapred/local,/data2/hbase/filesystem/mapred/local,/data3/hbase/filesystem/mapred/local</value>
>   </property>
>
>   <property>
>     <name>dfs.replication</name>
>     <value>3</value>
>   </property>
>
>   <property>
>     <name>hadoop.tmp.dir</name>
>     <value>/data1/hbase/filesystem/tmp</value>
>   </property>
>
>   <property>
>     <name>mapred.task.timeout</name>
>     <value>3600000</value>
>     <description>
>       The number of milliseconds before a task will be terminated if it
>       neither reads an input, writes an output, nor updates its status string.
>     </description>
>   </property>
>
>   <property>
>     <name>ipc.client.idlethreshold</name>
>     <value>4000</value>
>     <description>
>       Defines the threshold number of connections after which connections
>       will be inspected for idleness.
>     </description>
>   </property>
>
>   <property>
>     <name>ipc.client.connection.maxidletime</name>
>     <value>120000</value>
>     <description>
>       The maximum time in msec after which a client will bring down the
>       connection to the server.
>     </description>
>   </property>
>
>   <property>
>     <value>-Xmx256m -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode</value>
>   </property>
>
> </configuration>
>
> And here is the hbase-site.xml config file:
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <configuration>
>   <property>
>     <name>hbase.master</name>
>     <value>192.168.33.204:62000</value>
>     <description>The host and port that the HBase master runs at.
>       A value of 'local' runs the master and a regionserver in
>       a single process.
>     </description>
>   </property>
>
>   <property>
>     <name>hbase.rootdir</name>
>     <value>hdfs://192.168.33.204:11004/hbase</value>
>     <description>The directory shared by region servers.
>       Should be fully-qualified to include the filesystem to use.
>       E.g: hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR
>     </description>
>   </property>
>
>   <property>
>     <name>hbase.master.info.port</name>
>     <value>62010</value>
>     <description>The port for the hbase master web UI.
>       Set to -1 if you do not want the info server to run.
>     </description>
>   </property>
>
>   <property>
>     <name>hbase.master.info.bindAddress</name>
>     <value>0.0.0.0</value>
>     <description>The address for the hbase master web UI</description>
>   </property>
>
>   <property>
>     <name>hbase.regionserver</name>
>     <value>0.0.0.0:62020</value>
>     <description>The host and port a HBase region server runs at.</description>
>   </property>
>
>   <property>
>     <name>hbase.regionserver.info.port</name>
>     <value>62030</value>
>     <description>The port for the hbase regionserver web UI.
>       Set to -1 if you do not want the info server to run.
>     </description>
>   </property>
>
>   <property>
>     <name>hbase.regionserver.info.bindAddress</name>
>     <value>0.0.0.0</value>
>     <description>The address for the hbase regionserver web UI</description>
>   </property>
>
>   <property>
>     <name>hbase.regionserver.handler.count</name>
>     <value>20</value>
>   </property>
>
>   <property>
>     <name>hbase.master.lease.period</name>
>     <value>180000</value>
>   </property>
>
> </configuration>
>
> Here is a slice of the error log from one of the failed regionservers,
> which stopped responding after the OOM exception:
>
> 2009-04-13 15:20:26,077 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError, aborting.
> java.lang.OutOfMemoryError: Java heap space
> 2009-04-13 15:20:48,062 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0, regions=121, stores=121, storefiles=5188, storefileIndexSize=195, memcacheSize=214, usedHeap=4991, maxHeap=4991
> 2009-04-13 15:20:48,062 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 62020
> 2009-04-13 15:20:48,063 INFO org.apache.hadoop.hbase.regionserver.LogFlusher: regionserver/0:0:0:0:0:0:0:0:62020.logFlusher exiting
> 2009-04-13 15:20:48,201 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
> 2009-04-13 15:20:48,228 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@74f0bb4e, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@689939dc) from 192.168.33.206:47754: output error
> 2009-04-13 15:20:48,229 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:48,229 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 62020: exiting
> 2009-04-13 15:20:48,297 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
> 2009-04-13 15:20:48,552 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server /192.168.33.204:2181
> 2009-04-13 15:20:48,552 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.selectionkeyi...@480edf31
> java.io.IOException: TIMED OUT
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:837)
> 2009-04-13 15:20:48,555 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 62020, call batchUpdates([...@3509aa7f, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@d98930d) from 192.168.33.234:44367: error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2809)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2304)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-04-13 15:20:48,561 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@525a19ce, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@19544d9f) from 192.168.33.208:47852: output error
> 2009-04-13 15:20:48,561 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@483206fe, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@4c6932b9) from 192.168.33.221:37020: output error
> 2009-04-13 15:20:48,561 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:48,561 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 62020: exiting
> 2009-04-13 15:20:48,561 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:48,655 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 62020: exiting
> 2009-04-13 15:20:48,692 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@61af3c0e, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@378fed3c) from 192.168.34.1:35923: output error
> 2009-04-13 15:20:48,877 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@2c4ff8dd, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@365b8be5) from 192.168.34.3:39443: output error
> 2009-04-13 15:20:48,877 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 16 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:48,877 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 16 on 62020: exiting
> 2009-04-13 15:20:48,877 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@343d8344, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@32750027) from 192.168.33.236:45479: output error
> 2009-04-13 15:20:49,008 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 17 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,008 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 17 on 62020: exiting
> 2009-04-13 15:20:48,654 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@3ff34fed, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@7f047167) from 192.168.33.219:40059: output error
> 2009-04-13 15:20:48,654 ERROR com.cmri.hugetable.zookeeper.ZNodeWatcher: processNode /hugetable09/hugetable/acl.lock error!KeeperErrorCode = ConnectionLoss
> 2009-04-13 15:20:48,649 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@721d9b81, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@75cc6cae) from 192.168.33.254:51617: output error
> 2009-04-13 15:20:48,649 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 12 on 62020, call batchUpdates([...@655edc27, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@36c7b86f) from 192.168.33.238:51231: error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2809)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2304)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-04-13 15:20:48,648 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@3c853cce, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@4f5b176c) from 192.168.33.209:43520: output error
> 2009-04-13 15:20:49,225 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,226 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 62020: exiting
> 2009-04-13 15:20:48,648 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@3509aa7f, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@d98930d) from 192.168.33.234:44367: output error
> 2009-04-13 15:20:48,647 INFO org.mortbay.util.ThreadedServer: Stopping Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=62030]
> 2009-04-13 15:20:49,266 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,266 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 62020: exiting
> 2009-04-13 15:20:48,646 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 62020, call batchUpdates([...@2cc91b6, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@44724529) from 192.168.33.210:44154: error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2809)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2304)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-04-13 15:20:48,572 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@e8136e0, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@4539b390) from 192.168.33.217:60476: output error
> 2009-04-13 15:20:49,272 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@2cc91b6, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@44724529) from 192.168.33.210:44154: output error
> 2009-04-13 15:20:49,272 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,272 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 62020: exiting
> 2009-04-13 15:20:49,263 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@655edc27, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@36c7b86f) from 192.168.33.238:51231: output error
> 2009-04-13 15:20:49,225 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,068 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 14 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,345 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 14 on 62020: exiting
> 2009-04-13 15:20:49,048 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: java.lang.OutOfMemoryError: Java heap space
> 2009-04-13 15:20:49,484 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError, aborting.
> java.lang.OutOfMemoryError: Java heap space
>     at java.util.concurrent.ConcurrentHashMap$Values.iterator(ConcurrentHashMap.java:1187)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getGlobalMemcacheSize(HRegionServer.java:2863)
>     at org.apache.hadoop.hbase.regionserver.MemcacheFlusher.reclaimMemcacheMemory(MemcacheFlusher.java:260)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2307)
>     at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-04-13 15:20:49,488 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0, regions=121, stores=121, storefiles=5188, storefileIndexSize=195, memcacheSize=214, usedHeap=4985, maxHeap=4991
> 2009-04-13 15:20:49,489 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 15 on 62020, call batchUpdates([...@302bb17f, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@492218e) from 192.168.33.235:35276: error: java.io.IOException: java.lang.OutOfMemoryError: Java heap space
> java.io.IOException: java.lang.OutOfMemoryError: Java heap space
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1334)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1324)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2320)
>     at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>     at java.util.concurrent.ConcurrentHashMap$Values.iterator(ConcurrentHashMap.java:1187)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getGlobalMemcacheSize(HRegionServer.java:2863)
>     at org.apache.hadoop.hbase.regionserver.MemcacheFlusher.reclaimMemcacheMemory(MemcacheFlusher.java:260)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2307)
>     ... 5 more
> 2009-04-13 15:20:49,490 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([...@302bb17f, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@492218e) from 192.168.33.235:35276: output error
> 2009-04-13 15:20:49,047 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server listener on 62020
> 2009-04-13 15:20:49,493 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 15 on 62020 caught: java.nio.channels.ClosedChannelException
>     at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> Any suggestion is welcome! Thanks a lot!
