Is it doing a long GC? Is the aggregation taking longer than the client 60 second timeout (You are setting lease below but I don't think that will have an effect since it server-side)?
St.Ack On Fri, Sep 23, 2011 at 2:52 AM, Mayuresh <[email protected]> wrote: > Hi, > > I am using the AggregationClient to perform aggregate calculation on a > 10 M row hbase table which is spread across around 27 regions. > Currently all the regions are on a single region server. > > Here is briefly what the code is doing: > Leases leases = new Leases(40 * 60 * 1000, 5 * 1000); > leases.start(); > long startTime = System.currentTimeMillis(); > System.out.println("Start time: " + startTime); > Configuration conf = new Configuration(true); > AggregationClient aClient = new AggregationClient(conf); > > Scan scan = new Scan(); > scan.addColumn(TEST_FAMILY, TEST_QUALIFIER); > final ColumnInterpreter<Long, Long> ci = new LongColumnInterpreter(); > > double avg = aClient.avg(TEST_TABLE, ci, > scan); > leases.close(); > long endTime = System.currentTimeMillis(); > System.out.println("End time: " + endTime); > System.out.println("Avg is: " + avg); > System.out.println("Time taken is: " + (endTime - startTime)); > > However it is failing after some time with the following log messages > in the region server log: > 2011-09-23 14:21:25,649 DEBUG > org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for > region > Test,serv3-1315575650141,1316682342728.b3fe66994b0f2fb0643c88f6897474a1. > 2011-09-23 14:21:25,650 DEBUG > org.apache.hadoop.hbase.io.hfile.HFileReaderV2: On close of file > 15761056997240226 evicted 344 block(s) > 2011-09-23 14:21:25,650 DEBUG > org.apache.hadoop.hbase.regionserver.Store: closed data > 2011-09-23 14:21:25,650 INFO > org.apache.hadoop.hbase.regionserver.HRegion: Closed > Test,serv3-1315575650141,1316682342728.b3fe66994b0f2fb0643c88f6897474a1. > 2011-09-23 14:21:25,650 DEBUG > org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: > Closed region > Test,serv3-1315575650141,1316682342728.b3fe66994b0f2fb0643c88f6897474a1. > 2011-09-23 14:21:25,651 WARN org.apache.hadoop.ipc.HBaseServer: IPC > Server Responder, call execCoprocessor([B@6b6ab732, > getAvg(org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter@2b216ab6, > {"timeRange":[0,9223372036854775807],"batch":-1,"startRow":"serv1-1315571539741","stopRow":"serv5-1315575834916","totalColumns":0,"cacheBlocks":true,"families":{"data":[]},"maxVersions":1,"caching":-1})) > from 137.72.240.177:47563: output error > 2011-09-23 14:21:25,651 WARN org.apache.hadoop.ipc.HBaseServer: IPC > Server handler 9 on 60020 caught: > java.nio.channels.ClosedChannelException > at > sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133) > at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324) > at > org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1501) > at > org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:876) > at > org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:955) > at > org.apache.hadoop.hbase.ipc.HBaseServer$Call.sendResponseIfReady(HBaseServer.java:390) > at > org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1240) > > And the following in the zookeeper log: > 2011-09-23 14:19:50,978 INFO > org.apache.zookeeper.server.NIOServerCnxn: Established session > 0x1329577cad70004 with negotiated timeout 180000 for client > /137.72.240.177:45431 > 2011-09-23 14:21:11,028 WARN > org.apache.zookeeper.server.NIOServerCnxn: EndOfStreamException: > Unable to read additional data from client sessionid > 0x1329577cad70004, likely client has closed socket > 2011-09-23 14:21:11,029 INFO > org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection > for client /137.72.240.177:45431 which had sessionid 0x1329577cad70004 > 2011-09-23 14:24:42,000 INFO > org.apache.zookeeper.server.ZooKeeperServer: Expiring session > 0x1329577cad70004, timeout of 180000ms exceeded > 2011-09-23 14:24:42,001 INFO > org.apache.zookeeper.server.PrepRequestProcessor: Processed session > termination for sessionid: 0x1329577cad70004 > > Any ideas on how to resolve this? > > Thanks in advance. > > Regards, > Mayuresh >
