[ https://issues.apache.org/jira/browse/HDFS-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321801#comment-14321801 ]

Hadoop QA commented on HDFS-7798:
---------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12698944/HDFS-7798.01.patch
  against trunk revision 3338f6d.

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:red}-1 eclipse:eclipse{color}.  The patch failed to build with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9587//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9587//console

This message is automatically generated.

> Checkpointing failure caused by shared KerberosAuthenticator
> ------------------------------------------------------------
>
>                 Key: HDFS-7798
>                 URL: https://issues.apache.org/jira/browse/HDFS-7798
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: security
>            Reporter: Chengbing Liu
>            Assignee: Chengbing Liu
>            Priority: Critical
>         Attachments: HDFS-7798.01.patch
>
>
> We have observed occasional checkpointing failures in a real cluster: the 
> standby NameNode was unable to upload the image to the active NameNode.
> After some digging, the root cause appears to be a shared 
> {{KerberosAuthenticator}} in {{URLConnectionFactory}}. The authenticator is 
> designed as a use-once instance and is not stateless: it holds attributes such 
> as {{HttpURLConnection}} and {{URL}}. When multiple threads call 
> {{URLConnectionFactory#openConnection(...)}}, they race on the state of the 
> shared authenticator, and the image upload fails.
> Therefore, as a first step that does not break the current API, I propose 
> creating a new {{KerberosAuthenticator}} instance for each connection so that 
> checkpointing works. Afterwards, we may consider making the {{Authenticator}} 
> design and implementation stateless, as {{ConnectionConfigurator}} is.
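
The race described above, and the proposed per-connection fix, can be sketched roughly as follows. This is only an illustration under stated assumptions: StatefulAuthenticator, ConnectionFactorySketch, PerConnectionAuthDemo and the example URL are hypothetical stand-ins, not Hadoop's actual KerberosAuthenticator or URLConnectionFactory, and not the code in the attached patch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a stateful, use-once authenticator. */
final class StatefulAuthenticator {
    // Per-request state, analogous to the URL/HttpURLConnection attributes.
    // volatile only so that cross-thread overwrites are visible in this demo.
    private volatile String url;

    /** Returns the URL this instance believes it is authenticating. */
    String authenticate(String requestUrl) {
        this.url = requestUrl;
        Thread.yield(); // widen the window in which another thread may overwrite url
        return this.url;
    }
}

/** Hypothetical factory; the Supplier decides shared vs. per-connection instances. */
final class ConnectionFactorySketch {
    private final Supplier<StatefulAuthenticator> authenticators;

    ConnectionFactorySketch(Supplier<StatefulAuthenticator> authenticators) {
        this.authenticators = authenticators;
    }

    String openConnection(String url) {
        return authenticators.get().authenticate(url);
    }
}

final class PerConnectionAuthDemo {
    /** Counts requests for which the authenticator ended up holding another thread's URL. */
    static int countMismatches(Supplier<StatefulAuthenticator> supplier,
                               int threads, int requestsPerThread) {
        ConnectionFactorySketch factory = new ConnectionFactorySketch(supplier);
        AtomicInteger mismatches = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.execute(() -> {
                for (int i = 0; i < requestsPerThread; i++) {
                    String url = "http://nn.example/imagetransfer?thread=" + id + "&req=" + i;
                    if (!url.equals(factory.openConnection(url))) {
                        mismatches.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return mismatches.get();
    }

    public static void main(String[] args) {
        StatefulAuthenticator shared = new StatefulAuthenticator();
        // One instance shared by all threads: the problematic setup.
        System.out.println("shared instance mismatches: "
                + countMismatches(() -> shared, 8, 20000));
        // A fresh instance per connection: the proposed first step.
        System.out.println("per-connection mismatches:  "
                + countMismatches(StatefulAuthenticator::new, 8, 20000));
    }
}
```

With a shared instance the first count is typically nonzero, although a race is not guaranteed to show up on every run. With a fresh instance per call no state is shared between threads, so the second count is always 0.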



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
