Are you going to upload a patch, Shrijeet ?

Thanks



On Dec 14, 2011, at 7:25 PM, Shrijeet Paliwal <[email protected]> wrote:

> Created https://issues.apache.org/jira/browse/HBASE-5035
> 
> On Wed, Dec 14, 2011 at 1:17 PM, Ted Yu <[email protected]> wrote:
>> I am not sure.
>> If you patch your build with the upcoming patch, we should be able to get
>> more information.
>> 
>> Thanks Shrijeet.
>> 
>> On Wed, Dec 14, 2011 at 1:15 PM, Shrijeet Paliwal
>> <[email protected]> wrote:
>> 
>>> I will open the jira.
>>> 
>>>> Was there region splitting / transition at the time of this problem ? I
>>>> would assume the NPE is related to region transitions.
>>> 
>>> I am not sure if that was happening. If it happens again, I will
>>> check. But there was one more exception,
>>> ArrayIndexOutOfBoundsException, which I mentioned
>>> (http://pastie.org/2987927). I wonder if the region transition theory
>>> can explain that as well.
>>> 
>>> On Wed, Dec 14, 2011 at 12:45 PM, Ted Yu <[email protected]> wrote:
>>>> Shrijeet:
>>>> When I remove the try/catch block, HCM compiles.
>>>> Do you mind filing a JIRA for the issue so that other developers can
>>>> comment ?
>>>> 
>>>> Null check for regionInfo should be added.
>>>> 
>>>> Was there region splitting / transition at the time of this problem ? I
>>>> would assume the NPE is related to region transitions.
>>>> 
>>>> Cheers
>>>> 
>>>> On Wed, Dec 14, 2011 at 12:33 PM, Shrijeet Paliwal
>>>> <[email protected]> wrote:
>>>> 
>>>>>> The following is preventing us from knowing where the NPE came from:
>>>>>>          } catch (RuntimeException e) {
>>>>>>            throw new IOException(e);
>>>>>>          }
>>>>> Seems to me there is scope for improving this block. I am trying to
>>>>> understand the reasoning behind catching the runtime exception. If
>>>>> we know that regionInfo can be null, maybe we can put in a check and
>>>>> throw a more meaningful error. What do you think?
>>>>> 
>>>>>> I think you may even be able to reproduce the error by scanning .META.
>>>>>> manually.
>>>>> Hmm. You mean to say it was not a client problem, but a server
>>>>> problem? I must add that other clients talking to the server (the ones
>>>>> that did not have the JVM tunings I mentioned) did fine even during the
>>>>> shitty period seen by the affected clients.
>>>>> On Wed, Dec 14, 2011 at 12:10 PM, Ted Yu <[email protected]> wrote:
>>>>>> The following is preventing us from knowing where the NPE came from:
>>>>>>          } catch (RuntimeException e) {
>>>>>>            throw new IOException(e);
>>>>>>          }
>>>>>> Most likely regionInfo was null.
>>>>>> 
>>>>>> I think you may even be able to reproduce the error by scanning .META.
>>>>>> manually.
>>>>>> 
>>>>>> Cheers
>>>>>> 
>>>>>> On Wed, Dec 14, 2011 at 11:28 AM, Shrijeet Paliwal
>>>>>> <[email protected]> wrote:
>>>>>> 
>>>>>>> Here https://gist.github.com/1478070
>>>>>>> 
>>>>>>> On Wed, Dec 14, 2011 at 11:03 AM, Ted Yu <[email protected]> wrote:
>>>>>>>> I was just saying that upgrading wouldn't incur any regression in your
>>>>>>>> codebase.
>>>>>>>> The main motivation is to make code matching easier.
>>>>>>>> 
>>>>>>>> Or maybe you can publish the patched HCM.
>>>>>>>> 
>>>>>>>> On Wed, Dec 14, 2011 at 10:59 AM, Shrijeet Paliwal
>>>>>>>> <[email protected]> wrote:
>>>>>>>> 
>>>>>>>>> Hi Ted,
>>>>>>>>> Thanks for replying.
>>>>>>>>> Like I mentioned in the mail, "Line numbers in stack trace may not
>>>>>>>>> match with 0.90.3 branch because of extra patches we have."
>>>>>>>>> We already have 4508 backported. Curious why you thought of that issue?
>>>>>>>>> 
>>>>>>>>> On Wed, Dec 14, 2011 at 10:56 AM, Ted Yu <[email protected]> wrote:
>>>>>>>>>> Looking at the tip of 0.90, I didn't find the exact line of code where
>>>>>>>>>> NPE was thrown.
>>>>>>>>>> 0.90.5RC0 is available and it contains HBASE-4508. Is it possible to
>>>>>>>>>> upgrade ?
>>>>>>>>>> Cheers
>>>>>>>>>> 
>>>>>>>>>> On Wed, Dec 14, 2011 at 10:07 AM, Shrijeet Paliwal
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>>> For what it is worth, the client was doing Full GC every 10th second
>>>>>>>>>>> while this was happening.
>>>>>>>>>>> We recently increased new gen size on a few of the clients as part of
>>>>>>>>>>> an experiment, and all those clients suffered the situation I described
>>>>>>>>>>> in the earlier mail.
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, Dec 8, 2011 at 1:13 PM, Shrijeet Paliwal
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> Version: 0.90.3 + patches back ported
>>>>>>>>>>>> 
>>>>>>>>>>>> The other day our client started spitting these two runtime exceptions.
>>>>>>>>>>>> Not all clients connected to the cluster were impacted, only 4 of them.
>>>>>>>>>>>> While 3 of them were throwing NPE, one of them was
>>>>>>>>>>>> throwing ArrayIndexOutOfBoundsException. The errors are:
>>>>>>>>>>>> 
>>>>>>>>>>>> 1. http://pastie.org/2987926
>>>>>>>>>>>> 2. http://pastie.org/2987927
>>>>>>>>>>>> 
>>>>>>>>>>>> Clients did not recover from this and I had to bump them.
>>>>>>>>>>>> 
>>>>>>>>>>>> I wish to understand: since we are catching the runtime exception in this
>>>>>>>>>>>> block of code, do we expect this kind of behavior? Also, with the given
>>>>>>>>>>>> stack trace I cannot tell which line caused the NPE or AIOBE.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks.
>>>>>>>>>>>> 
>>>>>>>>>>>> -Shrijeet
>>>>>>>>>>>> PS: Line numbers in stack trace may not match with 0.90.3 branch because
>>>>>>>>>>>> of extra patches we have.
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>> 

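A minimal sketch of the explicit null check suggested in the thread, in plain Java. The names RegionLookupSketch, RegionInfo, and locate are hypothetical stand-ins and are not the actual HCM code; the catalog is modeled as a simple map, which is an assumption made only for illustration. It shows a lookup that fails with a message naming the row, rather than letting an NPE surface later and get rewrapped by a blanket catch of RuntimeException.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; these are NOT the HBase classes discussed in the thread.
final class RegionLookupSketch {

    /** Minimal stand-in for a region descriptor read from a catalog row. */
    static final class RegionInfo {
        final String regionName;

        RegionInfo(String regionName) {
            this.regionName = regionName;
        }
    }

    /**
     * Returns the region for a row, or throws an IOException that names the row
     * when the catalog has no usable entry (modeled here as a null value).
     */
    static RegionInfo locate(Map<String, RegionInfo> catalog, String row) throws IOException {
        RegionInfo info = catalog.get(row);
        if (info == null) {
            // Explicit check: fail with context here instead of letting an NPE
            // surface later and be wrapped by a blanket catch.
            throw new IOException("No region info found in catalog for row '" + row + "'");
        }
        return info;
    }

    public static void main(String[] args) {
        Map<String, RegionInfo> catalog = new HashMap<>();
        catalog.put("row-a", new RegionInfo("region-1"));
        catalog.put("row-b", null); // simulates a missing or unparsable entry

        try {
            System.out.println(locate(catalog, "row-a").regionName);
            locate(catalog, "row-b"); // throws, since the entry is null
        } catch (IOException e) {
            System.out.println("Lookup failed: " + e.getMessage());
        }
    }
}
```

Whether a null entry should be reported as an error or trigger a retry (for example during a region transition) is a design decision this sketch leaves open.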