Are you going to upload a patch, Shrijeet? Thanks
On Dec 14, 2011, at 7:25 PM, Shrijeet Paliwal <[email protected]> wrote:

> Created https://issues.apache.org/jira/browse/HBASE-5035
>
> On Wed, Dec 14, 2011 at 1:17 PM, Ted Yu <[email protected]> wrote:
>> I am not sure.
>> If you patch your build with the upcoming patch, we should be able to get
>> more information.
>>
>> Thanks Shrijeet.
>>
>> On Wed, Dec 14, 2011 at 1:15 PM, Shrijeet Paliwal
>> <[email protected]> wrote:
>>
>>> I will open the jira.
>>>
>>>> Was there region splitting / transition at the time of this problem ? I
>>>> would assume the NPE is related to region transitions.
>>>
>>> I am not sure if that was happening. If it happens again, I will
>>> check. But there was one more exception,
>>> ArrayIndexOutOfBoundsException, which I mentioned
>>> http://pastie.org/2987927 . Wonder if region transition theory can
>>> explain that as well.
>>>
>>> On Wed, Dec 14, 2011 at 12:45 PM, Ted Yu <[email protected]> wrote:
>>>> Shrijeet:
>>>> When I remove the try/catch block, HCM compiles.
>>>> Do you mind filing a JIRA for the issue so that other developers can
>>>> comment ?
>>>>
>>>> Null check for regionInfo should be added.
>>>>
>>>> Was there region splitting / transition at the time of this problem ? I
>>>> would assume the NPE is related to region transitions.
>>>>
>>>> Cheers
>>>>
>>>> On Wed, Dec 14, 2011 at 12:33 PM, Shrijeet Paliwal
>>>> <[email protected]> wrote:
>>>>
>>>>>> The following is preventing us from knowing where the NPE came from:
>>>>>> } catch (RuntimeException e) {
>>>>>>   throw new IOException(e);
>>>>>> }
>>>>> Seems to me there is scope for improving this block. I am trying to
>>>>> understand the reasoning behind catching the runtime exception. If
>>>>> we know that regionInfo can be null, maybe we can put in a check and
>>>>> throw a more meaningful error. What do you think?
>>>>>
>>>>>> I think you may even be able to reproduce the error by scanning .META.
>>>>>> manually.
>>>>> Hmm. You mean to say it was not a client problem, instead it was a
>>>>> server problem? I must add other clients talking to server (ones that
>>>>> did not have JVM tunings I mentioned) did fine even during shitty
>>>>> period seen by affected clients.
>>>>>
>>>>> On Wed, Dec 14, 2011 at 12:10 PM, Ted Yu <[email protected]> wrote:
>>>>>> The following is preventing us from knowing where the NPE came from:
>>>>>> } catch (RuntimeException e) {
>>>>>>   throw new IOException(e);
>>>>>> }
>>>>>> Most likely regionInfo was null.
>>>>>>
>>>>>> I think you may even be able to reproduce the error by scanning .META.
>>>>>> manually.
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> On Wed, Dec 14, 2011 at 11:28 AM, Shrijeet Paliwal
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> Here https://gist.github.com/1478070
>>>>>>>
>>>>>>> On Wed, Dec 14, 2011 at 11:03 AM, Ted Yu <[email protected]> wrote:
>>>>>>>> I was just saying that upgrading wouldn't incur any regression in your
>>>>>>>> codebase.
>>>>>>>> The major motive is to make code matching easier.
>>>>>>>>
>>>>>>>> Or maybe you can publish the patched HCM.
>>>>>>>>
>>>>>>>> On Wed, Dec 14, 2011 at 10:59 AM, Shrijeet Paliwal
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Ted,
>>>>>>>>> Thanks for replying.
>>>>>>>>> Like I mentioned in the mail " Line numbers in stack trace may not
>>>>>>>>> match with 0.90.3 branch because of extra patches we have. "
>>>>>>>>> We already have 4508 backported. Curious why you thought of that issue?
>>>>>>>>>
>>>>>>>>> On Wed, Dec 14, 2011 at 10:56 AM, Ted Yu <[email protected]> wrote:
>>>>>>>>>> Looking at the tip of 0.90, I didn't find the exact line of code where
>>>>>>>>>> NPE was thrown.
>>>>>>>>>> 0.90.5RC0 is available and it contains HBASE-4508. Is it possible to
>>>>>>>>>> upgrade ?
>>>>>>>>>> Cheers
>>>>>>>>>>
>>>>>>>>>> On Wed, Dec 14, 2011 at 10:07 AM, Shrijeet Paliwal
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> For what it is worth, the client was doing Full GC every 10th second
>>>>>>>>>>> while this was happening.
>>>>>>>>>>> We recently increased new gen size on a few of the clients as part of
>>>>>>>>>>> an experiment and all those clients suffer the situation I described
>>>>>>>>>>> in the mail earlier.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Dec 8, 2011 at 1:13 PM, Shrijeet Paliwal
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> Version: 0.90.3 + patches back ported
>>>>>>>>>>>>
>>>>>>>>>>>> The other day our client started spitting these two runtime exceptions.
>>>>>>>>>>>> Not all clients connected to the cluster were under impact. Only 4 of
>>>>>>>>>>>> them. While 3 of them were throwing NPE, one of them was
>>>>>>>>>>>> throwing ArrayIndexOutOfBoundsException. The errors are:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. http://pastie.org/2987926
>>>>>>>>>>>> 2. http://pastie.org/2987927
>>>>>>>>>>>>
>>>>>>>>>>>> Clients did not recover from this and I had to bump them.
>>>>>>>>>>>>
>>>>>>>>>>>> I wish to understand, since we are catching runtime exception in this
>>>>>>>>>>>> block of code - do we expect this kind of behavior? Also, with the given
>>>>>>>>>>>> stack trace I can not tell which line caused the NPE or AIOBE.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> -Shrijeet
>>>>>>>>>>>> PS: Line numbers in stack trace may not match with 0.90.3 branch
>>>>>>>>>>>> because of extra patches we have.
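
For readers following the thread, here is a minimal sketch of the kind of null check suggested above for regionInfo. It is not the HCM code and not the HBASE-5035 patch. The class RegionLookupSketch, the nested RegionInfo type, the locateRegion method, and the in-memory catalog map are all hypothetical names made up for illustration. The idea is only that an explicit check can name the missing entry in the error, rather than leaving a NullPointerException to be caught and rewrapped later.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from HBase.
final class RegionLookupSketch {

    /** Hypothetical stand-in for a region descriptor (not HBase's HRegionInfo). */
    static final class RegionInfo {
        final String regionName;

        RegionInfo(String regionName) {
            this.regionName = regionName;
        }
    }

    /**
     * Looks up the region for a row in a hypothetical in-memory catalog.
     * The explicit null check reports which row had no entry, instead of
     * letting a NullPointerException surface from the dereference below.
     */
    static String locateRegion(Map<String, RegionInfo> catalog, String row) throws IOException {
        RegionInfo regionInfo = catalog.get(row);
        if (regionInfo == null) {
            throw new IOException("No region info found in catalog for row '" + row + "'");
        }
        return regionInfo.regionName;
    }

    public static void main(String[] args) {
        Map<String, RegionInfo> catalog = new HashMap<>();
        catalog.put("row-a", new RegionInfo("region-1"));
        try {
            System.out.println(locateRegion(catalog, "row-a"));
            // This second lookup has no catalog entry and takes the null-check path.
            System.out.println(locateRegion(catalog, "row-missing"));
        } catch (IOException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```

Whether the real fix should throw, retry, or skip the row is a question for the JIRA; the sketch only shows the shape of the check.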
