Created https://issues.apache.org/jira/browse/HBASE-5035
On Wed, Dec 14, 2011 at 1:17 PM, Ted Yu <[email protected]> wrote:
> I am not sure.
> If you patch your build with the upcoming patch, we should be able to get
> more information.
>
> Thanks Shrijeet.
>
> On Wed, Dec 14, 2011 at 1:15 PM, Shrijeet Paliwal
> <[email protected]> wrote:
>
>> I will open the jira.
>>
>> > Was there region splitting / transition at the time of this problem ? I
>> > would assume the NPE is related to region transitions.
>>
>> I am not sure if that was happening. If it happens again, I will
>> check. But there was one more exception,
>> ArrayIndexOutOfBoundsException, which I mentioned
>> http://pastie.org/2987927 . Wonder if region transition theory can
>> explain that as well.
>>
>> On Wed, Dec 14, 2011 at 12:45 PM, Ted Yu <[email protected]> wrote:
>> > Shrijeet:
>> > When I remove the try/catch block, HCM compiles.
>> > Do you mind filing a JIRA for the issue so that other developers can
>> > comment ?
>> >
>> > Null check for regionInfo should be added.
>> >
>> > Was there region splitting / transition at the time of this problem ? I
>> > would assume the NPE is related to region transitions.
>> >
>> > Cheers
>> >
>> > On Wed, Dec 14, 2011 at 12:33 PM, Shrijeet Paliwal
>> > <[email protected]> wrote:
>> >
>> >> > The following is preventing us from knowing where the NPE came from:
>> >> > } catch (RuntimeException e) {
>> >> > throw new IOException(e);
>> >> > }
>> >> Seems to me there is scope for improving this block. I am trying to
>> >> understand the reasoning behind catching the runtime exception. If
>> >> we know that regionInfo can be null, maybe we can put a check and
>> >> throw a more meaningful error. What do you think?
>> >>
>> >> > I think you may even be able to reproduce the error by scanning .META.
>> >> > manually.
>> >> Hmm. You mean to say it was not a client problem, instead it was a
>> >> server problem?
>> >> I must add other clients talking to server (ones which
>> >> did not have JVM tunings I mentioned) did fine even during shitty
>> >> period seen by affected clients.
>> >>
>> >> On Wed, Dec 14, 2011 at 12:10 PM, Ted Yu <[email protected]> wrote:
>> >> > The following is preventing us from knowing where the NPE came from:
>> >> > } catch (RuntimeException e) {
>> >> > throw new IOException(e);
>> >> > }
>> >> > Most likely regionInfo was null.
>> >> >
>> >> > I think you may even be able to reproduce the error by scanning .META.
>> >> > manually.
>> >> >
>> >> > Cheers
>> >> >
>> >> > On Wed, Dec 14, 2011 at 11:28 AM, Shrijeet Paliwal
>> >> > <[email protected]> wrote:
>> >> >
>> >> >> Here https://gist.github.com/1478070
>> >> >>
>> >> >> On Wed, Dec 14, 2011 at 11:03 AM, Ted Yu <[email protected]> wrote:
>> >> >> > I was just saying that upgrading wouldn't incur any regression in your
>> >> >> > codebase.
>> >> >> > The major motive is to make code matching easier.
>> >> >> >
>> >> >> > Or maybe you can publish the patched HCM.
>> >> >> >
>> >> >> > On Wed, Dec 14, 2011 at 10:59 AM, Shrijeet Paliwal
>> >> >> > <[email protected]> wrote:
>> >> >> >
>> >> >> >> Hi Ted,
>> >> >> >> Thanks for replying.
>> >> >> >> Like I mentioned in the mail " Line numbers in stack trace may not
>> >> >> >> match with 0.90.3 branch because of extra patches we have. "
>> >> >> >> We already have 4508 backported. Curious why you thought of that issue?
>> >> >> >>
>> >> >> >> On Wed, Dec 14, 2011 at 10:56 AM, Ted Yu <[email protected]> wrote:
>> >> >> >> > Looking at the tip of 0.90, I didn't find the exact line of code where NPE
>> >> >> >> > was thrown.
>> >> >> >> > 0.90.5RC0 is available and it contains HBASE-4508. Is it possible to
>> >> >> >> > upgrade ?
>> >> >> >> > Cheers
>> >> >> >> >
>> >> >> >> > On Wed, Dec 14, 2011 at 10:07 AM, Shrijeet Paliwal
>> >> >> >> > <[email protected]> wrote:
>> >> >> >> >
>> >> >> >> >> For what it is worth, the client was doing Full GC every 10th second
>> >> >> >> >> while this was happening.
>> >> >> >> >> We recently increased new gen size on a few of the clients as part of
>> >> >> >> >> an experiment and all those clients suffer the situation I described
>> >> >> >> >> in the mail earlier.
>> >> >> >> >>
>> >> >> >> >> On Thu, Dec 8, 2011 at 1:13 PM, Shrijeet Paliwal
>> >> >> >> >> <[email protected]> wrote:
>> >> >> >> >> > Hi,
>> >> >> >> >> > Version: 0.90.3 + patches back ported
>> >> >> >> >> >
>> >> >> >> >> > The other day our client started spitting these two runtime exceptions. Not
>> >> >> >> >> > all clients connected to the cluster were under impact. Only 4 of them.
>> >> >> >> >> > While 3 of them were throwing NPE, one of them was
>> >> >> >> >> > throwing ArrayIndexOutOfBoundsException. The errors are :
>> >> >> >> >> >
>> >> >> >> >> > 1. http://pastie.org/2987926
>> >> >> >> >> > 2. http://pastie.org/2987927
>> >> >> >> >> >
>> >> >> >> >> > Clients did not recover from this and I had to bump them.
>> >> >> >> >> >
>> >> >> >> >> > I wish to understand, since we are catching runtime exception in this block
>> >> >> >> >> > of code - do we expect this kind of behavior. Also with the given stack
>> >> >> >> >> > trace I can not tell which line caused NPE or AIOBE.
>> >> >> >> >> >
>> >> >> >> >> > Thanks.
>> >> >> >> >> >
>> >> >> >> >> > -Shrijeet
>> >> >> >> >> > PS: Line numbers in stack trace may not match with 0.90.3 branch because of
>> >> >> >> >> > extra patches we have.
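
For reference, the null check and the "more meaningful error" suggested in the thread could look roughly like the sketch below. This is not the HCM code and not the eventual patch: the class `RegionLookupSketch`, the nested `RegionInfo` type, the `locate` method, and the "host:port" address format are all hypothetical stand-ins chosen for illustration. The only parts taken from the thread are the ideas of checking regionInfo for null up front and of keeping the original RuntimeException attached when it is rethrown as an IOException.

```java
import java.io.IOException;

// Hypothetical sketch; none of these names come from the HBase codebase.
class RegionLookupSketch {

    // Hypothetical stand-in for region metadata read from a .META. row.
    static final class RegionInfo {
        final String name;
        final String serverAddress; // assumed "host:port" format

        RegionInfo(String name, String serverAddress) {
            this.name = name;
            this.serverAddress = serverAddress;
        }
    }

    // Resolves a location string for the row, failing with a descriptive
    // IOException instead of an anonymous wrapped RuntimeException.
    static String locate(String row, RegionInfo regionInfo) throws IOException {
        // Explicit null check: report what was missing and for which row.
        if (regionInfo == null) {
            throw new IOException("regionInfo is null for row '" + row + "'");
        }
        try {
            String[] hostAndPort = regionInfo.serverAddress.split(":");
            int port = Integer.parseInt(hostAndPort[1]);
            return regionInfo.name + "@" + hostAndPort[0] + ":" + port;
        } catch (RuntimeException e) {
            // Add context to the message and keep the original exception as
            // the cause, so its type and stack trace remain available.
            throw new IOException(
                "Failed to locate region for row '" + row + "': " + e, e);
        }
    }

    public static void main(String[] args) {
        try {
            System.out.println(locate("row1", new RegionInfo("region-a", "host1:60020")));
            locate("row2", null);
        } catch (IOException e) {
            System.out.println("Lookup failed: " + e.getMessage());
        }
    }
}
```

With a malformed address such as "host1" (no port), the array access fails inside the try block and the resulting IOException carries the ArrayIndexOutOfBoundsException as its cause, so `getCause()` still shows which kind of runtime failure occurred.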
