Yes Ted, I will upload a patch soon.

On Wed, Dec 14, 2011 at 8:47 PM, <[email protected]> wrote:
> Are you going to upload a patch, Shrijeet ?
>
> Thanks
>
> On Dec 14, 2011, at 7:25 PM, Shrijeet Paliwal <[email protected]> wrote:
>
>> Created https://issues.apache.org/jira/browse/HBASE-5035
>>
>> On Wed, Dec 14, 2011 at 1:17 PM, Ted Yu <[email protected]> wrote:
>>> I am not sure.
>>> If you patch your build with the upcoming patch, we should be able to get
>>> more information.
>>>
>>> Thanks Shrijeet.
>>>
>>> On Wed, Dec 14, 2011 at 1:15 PM, Shrijeet Paliwal
>>> <[email protected]> wrote:
>>>
>>>> I will open the jira.
>>>>
>>>>> Was there region splitting / transition at the time of this problem ? I
>>>>> would assume the NPE is related to region transitions.
>>>>
>>>> I am not sure if that was happening. If it happens again, I will
>>>> check. But there was one more exception,
>>>> ArrayIndexOutOfBoundsException, which I mentioned:
>>>> http://pastie.org/2987927 . Wonder if the region transition theory can
>>>> explain that as well.
>>>>
>>>> On Wed, Dec 14, 2011 at 12:45 PM, Ted Yu <[email protected]> wrote:
>>>>> Shrijeet:
>>>>> When I remove the try/catch block, HCM compiles.
>>>>> Do you mind filing a JIRA for the issue so that other developers can
>>>>> comment ?
>>>>>
>>>>> Null check for regionInfo should be added.
>>>>>
>>>>> Was there region splitting / transition at the time of this problem ? I
>>>>> would assume the NPE is related to region transitions.
>>>>>
>>>>> Cheers
>>>>>
>>>>> On Wed, Dec 14, 2011 at 12:33 PM, Shrijeet Paliwal
>>>>> <[email protected]> wrote:
>>>>>
>>>>>>> The following is preventing us from knowing where the NPE came from:
>>>>>>> } catch (RuntimeException e) {
>>>>>>> throw new IOException(e);
>>>>>>> }
>>>>>> Seems to me there is scope for improving this block. I am trying to
>>>>>> understand the reasoning behind catching the runtime exception. If
>>>>>> we know that regionInfo can be null, maybe we can put in a check and
>>>>>> throw a more meaningful error. What do you think?
>>>>>>
>>>>>>> I think you may even be able to reproduce the error by scanning .META.
>>>>>>> manually.
>>>>>> Hmm. You mean to say it was not a client problem, instead it was a
>>>>>> server problem? I must add that other clients talking to the server (ones
>>>>>> that did not have the JVM tunings I mentioned) did fine even during the
>>>>>> shitty period seen by the affected clients.
>>>>>>
>>>>>> On Wed, Dec 14, 2011 at 12:10 PM, Ted Yu <[email protected]> wrote:
>>>>>>> The following is preventing us from knowing where the NPE came from:
>>>>>>> } catch (RuntimeException e) {
>>>>>>> throw new IOException(e);
>>>>>>> }
>>>>>>> Most likely regionInfo was null.
>>>>>>>
>>>>>>> I think you may even be able to reproduce the error by scanning .META.
>>>>>>> manually.
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> On Wed, Dec 14, 2011 at 11:28 AM, Shrijeet Paliwal
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>>> Here https://gist.github.com/1478070
>>>>>>>>
>>>>>>>> On Wed, Dec 14, 2011 at 11:03 AM, Ted Yu <[email protected]> wrote:
>>>>>>>>> I was just saying that upgrading wouldn't incur any regression in your
>>>>>>>>> codebase.
>>>>>>>>> The major motive is to make code matching easier.
>>>>>>>>>
>>>>>>>>> Or maybe you can publish the patched HCM.
>>>>>>>>>
>>>>>>>>> On Wed, Dec 14, 2011 at 10:59 AM, Shrijeet Paliwal
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Ted,
>>>>>>>>>> Thanks for replying.
>>>>>>>>>> Like I mentioned in the mail, "Line numbers in stack trace may not
>>>>>>>>>> match with 0.90.3 branch because of extra patches we have."
>>>>>>>>>> We already have 4508 backported. Curious why you thought of that issue?
>>>>>>>>>>
>>>>>>>>>> On Wed, Dec 14, 2011 at 10:56 AM, Ted Yu <[email protected]> wrote:
>>>>>>>>>>> Looking at the tip of 0.90, I didn't find the exact line of code where
>>>>>>>>>>> the NPE was thrown.
>>>>>>>>>>> 0.90.5RC0 is available and it contains HBASE-4508. Is it possible to
>>>>>>>>>>> upgrade ?
>>>>>>>>>>> Cheers
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Dec 14, 2011 at 10:07 AM, Shrijeet Paliwal
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> For what it is worth, the client was doing a Full GC every 10th second
>>>>>>>>>>>> while this was happening.
>>>>>>>>>>>> We recently increased the new gen size on a few of the clients as part of
>>>>>>>>>>>> an experiment, and all those clients suffered the situation I described
>>>>>>>>>>>> in the earlier mail.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Dec 8, 2011 at 1:13 PM, Shrijeet Paliwal
>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>> Version: 0.90.3 + patches back ported
>>>>>>>>>>>>>
>>>>>>>>>>>>> The other day our client started spitting these two runtime exceptions.
>>>>>>>>>>>>> Not all clients connected to the cluster were impacted, only 4 of them.
>>>>>>>>>>>>> While 3 of them were throwing NPE, one of them was
>>>>>>>>>>>>> throwing ArrayIndexOutOfBoundsException. The errors are:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. http://pastie.org/2987926
>>>>>>>>>>>>> 2. http://pastie.org/2987927
>>>>>>>>>>>>>
>>>>>>>>>>>>> Clients did not recover from this and I had to bump them.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I wish to understand: since we are catching the runtime exception in this
>>>>>>>>>>>>> block of code, do we expect this kind of behavior? Also, with the given
>>>>>>>>>>>>> stack trace I cannot tell which line caused the NPE or AIOBE.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Shrijeet
>>>>>>>>>>>>> PS: Line numbers in stack trace may not match with 0.90.3 branch because
>>>>>>>>>>>>> of extra patches we have.
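
For readers following the thread, here is a rough sketch of the two approaches being discussed: wrapping any RuntimeException in an IOException versus checking for null up front and throwing a more descriptive error. It is not the actual HConnectionManager code. The class name RegionLookupSketch, the methods lookupWrapped and lookupChecked, and the use of a plain String as a stand-in for regionInfo are all hypothetical, chosen only to keep the example self-contained.

```java
import java.io.IOException;

// Hypothetical illustration only; not taken from HBase.
class RegionLookupSketch {

    // The pattern quoted in the thread: every RuntimeException (NPE, AIOBE, ...)
    // surfaces to the caller as a generic IOException.
    static String lookupWrapped(String regionInfo) throws IOException {
        try {
            return regionInfo.trim(); // throws NullPointerException if regionInfo is null
        } catch (RuntimeException e) {
            throw new IOException(e);
        }
    }

    // The suggested alternative: check for null explicitly and say what was missing.
    static String lookupChecked(String regionInfo, String row) throws IOException {
        if (regionInfo == null) {
            throw new IOException("regionInfo is null for row '" + row + "'");
        }
        return regionInfo.trim();
    }

    public static void main(String[] args) {
        try {
            lookupWrapped(null);
        } catch (IOException e) {
            System.out.println("wrapped: " + e.getMessage());
        }
        try {
            lookupChecked(null, "row1");
        } catch (IOException e) {
            System.out.println("checked: " + e.getMessage());
        }
    }
}
```

In the wrapped version the caller has to dig into getCause() to learn what went wrong, while the checked version states the missing value directly in the message. Whether a null regionInfo should be an IOException or some other error type is a design choice left to the eventual patch.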
