The client was doing this: Flushing 52428400 into <table>
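That is roughly a 50MB batch going out in a single flushCommits(). For anyone following along, here is a minimal sketch, assuming the 0.20-era client API, of how a client ends up flushing a buffer that size (the table, family, and qualifier names are made up; the actual caller here is the custom StripedHBaseTable in the trace below):

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BigFlushSketch {
      public static void main(String[] args) throws IOException {
        HBaseConfiguration conf = new HBaseConfiguration();
        HTable table = new HTable(conf, "some_table");   // hypothetical table name
        table.setAutoFlush(false);                       // buffer Puts client-side
        table.setWriteBufferSize(50L * 1024 * 1024);     // ~50MB, like the flush above
        byte[] value = new byte[10 * 1024];              // say, 10KB cells
        for (long i = 0; i < 5000; i++) {                // ~50MB total
          Put put = new Put(Bytes.toBytes(i));
          put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), value); // "f"/"q" made up
          table.put(put);   // queued; auto-flushes once the buffer fills
        }
        table.flushCommits(); // forces the rest out; the call in the trace below
      }
    }

The point being: with a write buffer that size, every flush hands the region server a request it has to read fully into memory before it can process it.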
On Wed, Aug 11, 2010 at 3:09 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> Here is the client-side stack trace:
>
> java.io.IOException: Call to
> us01-ciqps1-grid01.carrieriq.com/10.32.42.233:60020 failed on local
> exception: java.io.EOFException
> java.net.ConnectException: Connection refused
> java.net.ConnectException: Connection refused
> java.net.ConnectException: Connection refused
>
>   at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1037)
>   at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$3.doCall(HConnectionManager.java:1222)
>   at org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1144)
>   at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1230)
>   at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666)
>   at com.carrieriq.m2m.platform.mmp2.input.StripedHBaseTable.flushAllStripesNew(StripedHBaseTable.java:300)
>
> On Tue, Aug 10, 2010 at 11:01 PM, Ryan Rawson <ryano...@gmail.com> wrote:
>
>> Use a tool like YourKit to grovel through that heap; the open source
>> tools are not really there yet.
>>
>> But your stack trace tells a lot: the fatal allocation is in the RPC
>> layer. Either a client is sending a massive value, or you have a
>> semi-hostile network client sending bytes to your open socket which
>> are being interpreted as the buffer size to allocate. If you look at
>> the actual RPC code (any RPC code, really), there is often a 'length'
>> field which is then used to allocate a dynamic buffer.
>>
>> -ryan
>>
>> On Tue, Aug 10, 2010 at 10:55 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>> > The compressed file is still big:
>> > -rw-r--r-- 1 hadoop users 809768340 Aug 11 05:49 java_pid26972.hprof.gz
>> >
>> > If you can tell me specific things to look for in the dump, I would
>> > collect them (through jhat) and publish.
>> >
>> > Thanks
>> >
>> > On Tue, Aug 10, 2010 at 10:29 PM, Stack <st...@duboce.net> wrote:
>> >
>> >> On Tue, Aug 10, 2010 at 9:52 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>> >> > Here are the GC-related parameters:
>> >> > /usr/java/jdk1.6/bin/java -Xmx4000m -XX:+HeapDumpOnOutOfMemoryError
>> >> > -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode
>> >>
>> >> You have > 2 CPUs per machine, I take it? You could probably drop
>> >> the conservative -XX:+CMSIncrementalMode.
>> >>
>> >> > The heap dump is big:
>> >> > -rw------- 1 hadoop users 4146551927 Aug 11 03:59 java_pid26972.hprof
>> >> >
>> >> > Do you have an ftp server where I can upload it?
>> >>
>> >> Not really. I was hoping you could put a compressed version under an
>> >> http server somewhere that I could pull from. You might as well
>> >> include the GC log while you are at it.
>> >>
>> >> Thanks Ted,
>> >>
>> >> St.Ack
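To make Ryan's point concrete, here is a generic sketch of the length-prefixed read pattern he describes (this is not HBase's actual RPC code, and MAX_FRAME is an invented cap), plus the kind of sanity check that guards against it:

    import java.io.DataInputStream;
    import java.io.IOException;

    public class FrameReadSketch {
      // Naive length-prefixed framing: whatever four bytes arrive first
      // become the allocation size, so a confused or hostile peer can
      // request an arbitrarily large buffer and push the server toward
      // OutOfMemoryError.
      static byte[] readFrameNaive(DataInputStream in) throws IOException {
        int length = in.readInt();      // peer-controlled
        byte[] buf = new byte[length];  // allocates whatever the peer asked for
        in.readFully(buf);
        return buf;
      }

      // Defensive variant: reject implausible lengths before allocating.
      static final int MAX_FRAME = 64 * 1024 * 1024; // invented cap for this sketch

      static byte[] readFrameChecked(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0 || length > MAX_FRAME) {
          throw new IOException("Suspicious frame length: " + length);
        }
        byte[] buf = new byte[length];
        in.readFully(buf);
        return buf;
      }
    }

Either way the fatal allocation lands in the RPC layer; the heap dump should tell which case this is, a legitimately huge batch or a garbage length field.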
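And on the GC side, following Stack's suggestion just means relaunching with the same line minus the incremental flag:

    /usr/java/jdk1.6/bin/java -Xmx4000m -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC ...

St.Ack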