We use only plain gets and puts (sometimes batched). On the client we have increased hbase.client.keyvalue.maxsize to 536870912 bytes (512 MB). That is the only unusual thing I can see.
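For reference, a minimal sketch of how that client-side setting would look in hbase-site.xml. This assumes the value is set through the site file rather than programmatically on the client Configuration; the thread does not say which.

```xml
<!-- Client-side hbase-site.xml fragment (assumed location of the override). -->
<configuration>
  <property>
    <!-- Maximum size of a single KeyValue the client will accept, in bytes. -->
    <name>hbase.client.keyvalue.maxsize</name>
    <!-- 536870912 bytes = 512 MB, the value quoted above. -->
    <value>536870912</value>
  </property>
</configuration>
```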
I am about to send a zip file with the respective classes directly to your email address. I had probably better not post the code publicly. We will also try setting hbase.ipc.server.callqueue.handler.factor to 0 now. I will keep you posted.

Johannes

On Sun, Aug 24, 2014 at 1:06 AM, Stack <[email protected]> wrote:

> I am having trouble reproducing the stack overflow. Some particular
> response is triggering it (the code here has been around a while). Any
> particulars on how your client is accessing hbase? Anything unusual?
>
> If you were looking for something to try, set
> hbase.ipc.server.callqueue.handler.factor
> to 0. Multiple queues is what is new here. It should not make a difference
> but...
>
> St.Ack
>
> On Sat, Aug 23, 2014 at 1:23 PM, Johannes Schaback <
> [email protected]> wrote:
>
> > Thank you.
> >
> > From the proposed resolution I imagine that the RS would then die in case
> > of a handler error. So the question remains what error originally occurred
> > in the handler in the first place. The log of the entire lifecycle of the
> > RS (http://schabby.de/wp-content/uploads/2014/08/filtered.txt) does not
> > reveal much to me unfortunately. Do you find anything in there that hints
> > to something that may cause the handler to end up in the soon-to-be-fixed
> > recursion?
> >
> > @Ted, the line
> > "at org.apache.hadoop.hbase.CellUtil$1.advance(CellUtil.java:210)"
> > is all I can see unfortunately :(
> >
> > On Sat, Aug 23, 2014 at 9:43 PM, Andrew Purtell <[email protected]>
> > wrote:
> >
> > > On Sat, Aug 23, 2014 at 12:11 PM, Johannes Schaback <
> > > [email protected]> wrote:
> > >
> > > > Exception in thread "defaultRpcServer.handler=5,queue=2,port=60020" java.lang.StackOverflowError
> > > >     at org.apache.hadoop.hbase.CellUtil$1.advance(CellUtil.java:210)
> > > >     at org.apache.hadoop.hbase.CellUtil$1.advance(CellUtil.java:210)
> > > >     at org.apache.hadoop.hbase.CellUtil$1.advance(CellUtil.java:210)
> > > >     at org.apache.hadoop.hbase.CellUtil$1.advance(CellUtil.java:210)
> > > > (and so on...)
> > >
> > > That is the anonymous CellScanner instance we create from
> > > CellUtil#createCellScanner. See
> > > https://issues.apache.org/jira/browse/HBASE-11813
> > >
> > > > Filtering the .out file for "Exception" shows that several handlers
> > > > crashed like that:
> > > >
> > > > Exception in thread "defaultRpcServer.handler=5,queue=2,port=60020" java.lang.StackOverflowError
> > > > Exception in thread "defaultRpcServer.handler=18,queue=0,port=60020" java.lang.StackOverflowError
> > > > Exception in thread "defaultRpcServer.handler=23,queue=2,port=60020" java.lang.StackOverflowError
> > > > Exception in thread "defaultRpcServer.handler=24,queue=0,port=60020" java.lang.StackOverflowError
> > > > Exception in thread "defaultRpcServer.handler=2,queue=2,port=60020" java.lang.StackOverflowError
> > > > Exception in thread "defaultRpcServer.handler=11,queue=2,port=60020" java.lang.StackOverflowError
> > > > Exception in thread "defaultRpcServer.handler=25,queue=1,port=60020" java.lang.StackOverflowError
> > > > Exception in thread "defaultRpcServer.handler=20,queue=2,port=60020" java.lang.StackOverflowError
> > > > Exception in thread "defaultRpcServer.handler=19,queue=1,port=60020" java.lang.StackOverflowError
> > > > Exception in thread "defaultRpcServer.handler=15,queue=0,port=60020" java.lang.StackOverflowError
> > > > Exception in thread "defaultRpcServer.handler=1,queue=1,port=60020" java.lang.StackOverflowError
> > > > Exception in thread "defaultRpcServer.handler=7,queue=1,port=60020" java.lang.StackOverflowError
> > > > Exception in thread "defaultRpcServer.handler=4,queue=1,port=60020" java.lang.StackOverflowError
> > >
> > > We should fix this so the RegionServer aborts if it loses a handler to an
> > > Error.
> > >
> > > --
> > > Best regards,
> > >
> > >    - Andy
> > >
> > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > (via Tom White)
> >
> > --
> > LadenZeile.de <http://www.ladenzeile.de/>
> > powered by Visual Meta GmbH - www.visual-meta.com
> >
> > Tel.: +49 30 / 609 84 88 20
> > Fax: +49 30 / 609 84 88 21
> > E-Mail: [email protected]
> >
> > Visual Meta GmbH, Schützenstraße 25, 10117 Berlin
> > Geschäftsführer: Robert M. Maier, Johannes Schaback
> > Handelsregister HRB 115795 B, Amtsgericht Charlottenburg
> > USt-IdNr.: DE263760203
