[jira] [Comment Edited] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-25 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1651#comment-1651
 ] 

Gabor Bota edited comment on HADOOP-15593 at 7/25/18 11:52 AM:
---

* I added the runRenewalLoop flag purely for unit testing: I pass false when I 
don't want the loop to run. The flag has no other purpose.
* I'll add a new patch with [~xiaochen]'s solution:

{code:java}
  if (tgt.isDestroyed()) {
    // log and return;
  }
  try {
    tgtEndTime = tgt.getEndTime().getTime();
  } catch (NullPointerException npe) {
    // log and return;
  }
{code}
* I'll also correct {{metrics.renewalFailures.value()}} and 
{{metrics.renewalFailuresTotal.value()}} in the log message.
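
A minimal, self-contained sketch of the guard proposed above. It is not the actual patch: the {{Ticket}} interface is a hypothetical stand-in for the two {{KerberosTicket}} methods involved, and {{NO_END_TIME}}, {{endTimeOrSentinel}} and {{ticket}} are names invented for illustration. "Log and return" is modelled as returning a sentinel.

```java
import java.util.Date;

// Hypothetical stand-in for the two KerberosTicket methods the snippet uses.
interface Ticket {
  boolean isDestroyed();
  Date getEndTime();
}

final class TgtEndTimeGuard {
  // Hypothetical sentinel: "no usable end time, caller should log and return".
  static final long NO_END_TIME = -1L;

  static long endTimeOrSentinel(Ticket tgt) {
    if (tgt.isDestroyed()) {
      return NO_END_TIME; // destroyed ticket: nothing to renew
    }
    try {
      return tgt.getEndTime().getTime();
    } catch (NullPointerException npe) {
      return NO_END_TIME; // end time unavailable
    }
  }

  // Helper for the demo: a ticket that returns fixed answers.
  static Ticket ticket(final boolean destroyed, final Date endTime) {
    return new Ticket() {
      @Override public boolean isDestroyed() { return destroyed; }
      @Override public Date getEndTime() { return endTime; }
    };
  }

  public static void main(String[] args) {
    System.out.println(endTimeOrSentinel(ticket(false, new Date(1000L)))); // 1000
    System.out.println(endTimeOrSentinel(ticket(true, new Date(1000L))));  // -1
    System.out.println(endTimeOrSentinel(ticket(false, null)));            // -1
  }
}
```

In the real renewer the two sentinel branches would log and return from the loop instead of returning a value.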



was (Author: gabor.bota):
* I added the runRenewalLoop flag purely for unit testing: I pass false when I 
don't want the loop to run. The flag has no other purpose.
* I'll add a new patch with [~xiaochen]'s solution:

{code:java}
  if (tgt.isDestroyed()) {
    // log and return;
  }
  try {
    tgtEndTime = tgt.getEndTime().getTime();
  } catch (NullPointerException npe) {
    // log and return;
  }
{code}


> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
> Issue Type: Bug
> Components: security
> Affects Versions: 3.0.0
> Reporter: Wei-Chiu Chuang
> Assignee: Gabor Bota
> Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch, 
> HADOOP-15593.003.patch, HADOOP-15593.004.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-23 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553838#comment-16553838
 ] 

Xiao Chen edited comment on HADOOP-15593 at 7/24/18 5:57 AM:
-

Thanks [~gabor.bota] and [~eyang]. Working around the NPE sounds good to me (but 
sad). :)

I'm also looking at this particular code block:
{code}
  try {
    Date endTime = tgt.getEndTime();
    if (tgt != null && endTime != null && !tgt.isDestroyed()) {
      tgtEndTime = endTime.getTime();
    }
  } catch (NullPointerException npe) {
{code}
- Do we really need the tgt==null check at all? In what scenario can tgt be 
null here? (If it is needed, the check should happen before the 
{{getEndTime}} call, but it doesn't look possible to me that tgt can be null.)
- I suggest making the NPE try-catch wrap strictly the line we're trying to 
work around: tgt.getEndTime(). Then also add a pointer to the JDK issue 
JDK-8147772 in the comment, to save future readers the time of searching this 
jira. The comment should also explain that the NPE is only possible prior to 
the JDK fix.
- We also need a unit test for this. This can be done by using a mocked tgt.
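
The last two suggestions can be sketched roughly as follows, under stated assumptions: {{EndTimeSource}} is a hypothetical seam standing in for the ticket, {{safeEndTime}} is an invented helper name, and a hand-written lambda stub plays the role of the mocked tgt so that the example needs no mocking library. The JDK issue id in the comment is the one cited in this thread.

```java
import java.util.Date;

// Hypothetical seam exposing only the call that needs the workaround.
interface EndTimeSource {
  Date getEndTime();
}

final class NarrowNpeWorkaround {
  // Returns the end time, or null when it cannot be read.
  static Date safeEndTime(EndTimeSource tgt) {
    try {
      // Workaround for JDK-8147772 (id as cited in this thread): getEndTime()
      // may throw NullPointerException on JDKs without the fix. The try block
      // covers this single call and nothing else.
      return tgt.getEndTime();
    } catch (NullPointerException npe) {
      return null;
    }
  }

  public static void main(String[] args) {
    // Stub that simulates the failing ticket, in place of a mocked tgt.
    EndTimeSource broken = () -> { throw new NullPointerException("simulated"); };
    EndTimeSource healthy = () -> new Date(42L);

    System.out.println(safeEndTime(broken) == null);     // true
    System.out.println(safeEndTime(healthy).getTime());  // 42
  }
}
```

Keeping the try block this narrow means an unrelated NullPointerException elsewhere in the renewal logic would still surface instead of being swallowed.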


was (Author: xiaochen):
I'm also looking at this particular code block:
{code}
  try {
    Date endTime = tgt.getEndTime();
    if (tgt != null && endTime != null && !tgt.isDestroyed()) {
      tgtEndTime = endTime.getTime();
    }
  } catch (NullPointerException npe) {
{code}
- Do we really need the tgt==null check at all? In what scenario can tgt be 
null here? (If it is needed, the check should happen before the 
{{getEndTime}} call, but it doesn't look possible to me that tgt can be null.)
- I suggest making the NPE try-catch wrap strictly the line we're trying to 
work around: tgt.getEndTime(). Then also add a pointer to the JDK issue 
JDK-8147772 in the comment, to save future readers the time of searching this 
jira. The comment should also explain that the NPE is only possible prior to 
the JDK fix.
- We also need a unit test for this. This can be done by using a mocked tgt.

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
> Issue Type: Bug
> Components: security
> Affects Versions: 3.0.0
> Reporter: Wei-Chiu Chuang
> Assignee: Gabor Bota
> Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch, 
> HADOOP-15593.003.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.






[jira] [Comment Edited] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE

2018-07-20 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551442#comment-16551442
 ] 

Eric Yang edited comment on HADOOP-15593 at 7/21/18 12:04 AM:
--

What if we catch the NullPointerException and reset tgtEndTime to now? Whenever 
tgtEndTime is undefined for any reason, resetting it to now should have no ill 
effect within the scope of getNextTgtRenewalTime.
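
A rough sketch of this fallback, assuming the end time is read through a {{Supplier}} so the example stays self-contained. {{EndTimeFallback}} and {{tgtEndTimeOrNow}} are hypothetical names, and treating a null result the same way as an NPE is an extra assumption of this sketch, not something stated above.

```java
import java.util.Date;
import java.util.function.Supplier;

final class EndTimeFallback {
  // Returns the ticket end time in milliseconds, or `now` when it is undefined.
  static long tgtEndTimeOrNow(Supplier<Date> endTime, long now) {
    try {
      Date end = endTime.get();
      // Assumption of this sketch: a null end time also falls back to now.
      return end != null ? end.getTime() : now;
    } catch (NullPointerException npe) {
      return now; // the case proposed above: catch the NPE and reset to now
    }
  }

  public static void main(String[] args) {
    long now = 5_000L;
    System.out.println(tgtEndTimeOrNow(() -> new Date(9_000L), now)); // 9000
    System.out.println(tgtEndTimeOrNow(() -> null, now));             // 5000
    System.out.println(tgtEndTimeOrNow(
        () -> { throw new NullPointerException("simulated"); }, now)); // 5000
  }
}
```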


was (Author: eyang):
What if we catch the null pointer and reset tgtEndTime to now? Whenever 
tgtEndTime is undefined for any reason, resetting it to now should have no ill 
effect within the scope of getNextTgtRenewalTime.

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
> Issue Type: Bug
> Components: security
> Affects Versions: 3.0.0
> Reporter: Wei-Chiu Chuang
> Assignee: Gabor Bota
> Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within 
> an exception handler so the original exception was hidden, though it's likely 
> caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
> at javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
> at org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the 
> exception better.


