[jira] [Comment Edited] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE
[ https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1651#comment-1651 ]

Gabor Bota edited comment on HADOOP-15593 at 7/25/18 11:52 AM:
---

* I've added runRenewalLoop because of the unit testing. I just pass false if I don't want the loop to run. I don't use this flag for any other purpose.
* I'll add a new patch with [~xiaochen]'s solution:
{code:java}
if (tgt.isDestroyed()) {
  // log and return;
}

try {
  tgtEndTime = tgt.getEndTime().getTime();
} catch (NullPointerException npe) {
  // log and return;
}
{code}
* And correct {{metrics.renewalFailures.value()}} and {{metrics.renewalFailuresTotal.value()}} in the log message.

was (Author: gabor.bota):
* I've added runRenewalLoop because of the unit testing. I just pass false if I don't want the loop to run. I don't use this flag for any other purpose.
* I'll add a new patch with [~xiaochen]'s solution:
{code:java}
if (tgt.isDestroyed()) {
  // log and return;
}

try {
  tgtEndTime = tgt.getEndTime().getTime();
} catch (NullPointerException npe) {
  // log and return;
}
{code}

> UserGroupInformation TGT renewer throws NPE
> ---
>
> Key: HADOOP-15593
> URL: https://issues.apache.org/jira/browse/HADOOP-15593
> Project: Hadoop Common
> Issue Type: Bug
> Components: security
> Affects Versions: 3.0.0
> Reporter: Wei-Chiu Chuang
> Assignee: Gabor Bota
> Priority: Blocker
> Attachments: HADOOP-15593.001.patch, HADOOP-15593.002.patch, HADOOP-15593.003.patch, HADOOP-15593.004.patch
>
>
> Found the following NPE thrown in UGI tgt renewer. The NPE was thrown within an exception handler so the original exception was hidden, though it's likely caused by expired tgt.
> {noformat}
> 18/07/02 10:30:57 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[TGT Renewer for f...@example.com,5,main]
> java.lang.NullPointerException
>     at javax.security.auth.kerberos.KerberosTicket.getEndTime(KerberosTicket.java:482)
>     at org.apache.hadoop.security.UserGroupInformation$1.run(UserGroupInformation.java:894)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
> Suspect it's related to [https://bugs.openjdk.java.net/browse/JDK-8154889].
> The relevant code was added in HADOOP-13590. File this jira to handle the exception better.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
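The runRenewalLoop flag described in the comment above could look roughly like the sketch below. This is not the patch itself: the class name, the `renew` signature, and the `maxAttempts` bound are hypothetical, introduced only to show how passing false lets a unit test run a single renewal pass without entering the loop.

```java
class RenewalLoopSketch {

  /**
   * Runs renewOnce at least once. When runRenewalLoop is false the method
   * returns after that single pass, which is what a unit test would want.
   * maxAttempts is a hypothetical bound so that this sketch terminates.
   */
  static int renew(Runnable renewOnce, int maxAttempts, boolean runRenewalLoop) {
    int attempts = 0;
    do {
      renewOnce.run();
      attempts++;
    } while (runRenewalLoop && attempts < maxAttempts);
    return attempts;
  }

  public static void main(String[] args) {
    // Test-style call: exactly one pass, no looping.
    System.out.println(renew(() -> { }, 5, false)); // prints 1
    // Production-style call: keeps going until the (hypothetical) bound.
    System.out.println(renew(() -> { }, 5, true));  // prints 5
  }
}
```

Keeping the flag as a plain parameter, rather than a field, means the test does not need to reach into any renewer state to stop the loop.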
[jira] [Comment Edited] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE
[ https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553838#comment-16553838 ]

Xiao Chen edited comment on HADOOP-15593 at 7/24/18 5:57 AM:
---

Thanks [~gabor.bota] and [~eyang]. Working around the NPE sounds good to me (but sad). :)

I'm also looking at this particular code block:
{code}
try {
  Date endTime = tgt.getEndTime();
  if (tgt != null && endTime != null && !tgt.isDestroyed()) {
    tgtEndTime = endTime.getTime();
  }
} catch (NullPointerException npe) {
{code}
- Do we really need the tgt==null check at all? What's the scenario in which tgt can be null here? (If it's needed, the check should happen before the {{getEndTime}} call, but it doesn't look possible to me that tgt can be null.)
- I suggest making the NPE try-catch strictly around the line we're trying to work around: tgt.getEndTime(). Then also add a pointer to the JDK issue JDK-8147772 in the comment, to save future readers the time of searching this jira. The comment should also explain that the NPE is only possible prior to the JDK fix.
- We also need a unit test for this. This can be done by using a mocked tgt.

was (Author: xiaochen):
I'm also looking at this particular code block:
{code}
try {
  Date endTime = tgt.getEndTime();
  if (tgt != null && endTime != null && !tgt.isDestroyed()) {
    tgtEndTime = endTime.getTime();
  }
} catch (NullPointerException npe) {
{code}
- Do we really need the tgt==null check at all? What's the scenario in which tgt can be null here? (If it's needed, the check should happen before the {{getEndTime}} call, but it doesn't look possible to me that tgt can be null.)
- I suggest making the NPE try-catch strictly around the line we're trying to work around: tgt.getEndTime(). Then also add a pointer to the JDK issue JDK-8147772 in the comment, to save future readers the time of searching this jira. The comment should also explain that the NPE is only possible prior to the JDK fix.
- We also need a unit test for this. This can be done by using a mocked tgt.
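A minimal sketch of the narrower try-catch suggested in the comment above, together with a hand-written stub standing in for the mocked tgt. The `Ticket` interface, the `stub` helper, and the `UNKNOWN` sentinel are hypothetical names used so the example runs without a Kerberos setup or a mocking library; in the real code the object would be a `KerberosTicket`.

```java
import java.util.Date;

class TgtEndTimeSketch {

  /** Hypothetical stand-in for the two KerberosTicket methods used here. */
  interface Ticket {
    boolean isDestroyed();
    Date getEndTime();
  }

  /** Hypothetical sentinel meaning "end time not available". */
  static final long UNKNOWN = -1L;

  static long endTimeMillis(Ticket tgt) {
    if (tgt.isDestroyed()) {
      return UNKNOWN; // log and return
    }
    Date endTime;
    try {
      // Guard only this call. Per the discussion, getEndTime() can throw an
      // NPE on JDKs without the fix for JDK-8147772.
      endTime = tgt.getEndTime();
    } catch (NullPointerException npe) {
      return UNKNOWN; // log and return
    }
    return endTime == null ? UNKNOWN : endTime.getTime();
  }

  /** Builds a stub ticket; throwNpe simulates the pre-fix JDK behavior. */
  static Ticket stub(final boolean destroyed, final Date endTime, final boolean throwNpe) {
    return new Ticket() {
      @Override public boolean isDestroyed() { return destroyed; }
      @Override public Date getEndTime() {
        if (throwNpe) {
          throw new NullPointerException("simulated NPE from getEndTime");
        }
        return endTime;
      }
    };
  }

  public static void main(String[] args) {
    System.out.println(endTimeMillis(stub(false, new Date(1000L), false))); // prints 1000
    System.out.println(endTimeMillis(stub(true, new Date(1000L), false)));  // prints -1
    System.out.println(endTimeMillis(stub(false, null, true)));             // prints -1
  }
}
```

Because the try block wraps a single call, any NPE raised elsewhere in the renewer still surfaces instead of being swallowed by the workaround.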
[jira] [Comment Edited] (HADOOP-15593) UserGroupInformation TGT renewer throws NPE
[ https://issues.apache.org/jira/browse/HADOOP-15593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551442#comment-16551442 ]

Eric Yang edited comment on HADOOP-15593 at 7/21/18 12:04 AM:
---

What if we catch the null pointer exception and reset tgtEndTime to now? When tgtEndTime is undefined for any reason, resetting it to now should have no ill effect within the scope of getNextTgtRenewalTime.

was (Author: eyang):
What if we catch the null pointer and reset tgtEndTime to now? When tgtEndTime is undefined for any reason, resetting it to now should have no ill effect within the scope of getNextTgtRenewalTime.
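The fallback proposed in the comment above can be sketched as follows. The scheduling rule in `nextRenewal` is an assumption made for illustration only (the earlier of "end time minus a safety margin" and "now plus a retry delay"); it is not copied from getNextTgtRenewalTime, and all names and parameters here are hypothetical.

```java
import java.util.Date;
import java.util.function.Supplier;

class TgtFallbackSketch {

  /** Returns the ticket end time in ms, or now if it cannot be read. */
  static long endTimeOrNow(Supplier<Date> endTimeSource, long now) {
    try {
      Date endTime = endTimeSource.get();
      return endTime != null ? endTime.getTime() : now;
    } catch (NullPointerException npe) {
      return now; // the proposed fallback: treat the ticket as ending now
    }
  }

  /** Assumed scheduling rule, for illustration only. */
  static long nextRenewal(long tgtEndTime, long now, long marginMs, long retryDelayMs) {
    return Math.min(tgtEndTime - marginMs, now + retryDelayMs);
  }

  public static void main(String[] args) {
    long now = 10_000L;
    long fallback = endTimeOrNow(() -> { throw new NullPointerException(); }, now);
    // With the fallback, the next attempt is not pushed into the future.
    System.out.println(nextRenewal(fallback, now, 1_000L, 5_000L)); // prints 9000
    // With a readable end time, the retry delay decides.
    System.out.println(nextRenewal(50_000L, now, 1_000L, 5_000L));  // prints 15000
  }
}
```

Under this assumed rule, resetting the end time to now only makes the renewer try again sooner, which matches the comment's expectation that the fallback is harmless.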