[ 
https://issues.apache.org/jira/browse/HDFS-11908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-11908:
-----------------------------------
    Attachment: HDFS-11908.HDFS-8707.000.patch

This patch is the simple fix: it sets the auth info on new connections.  Once I 
have some time in the next month or two I'd like to make some general 
improvements to the RPC code, including making sure these fields are set during 
initialization; it doesn't make sense to call a bunch of setters right after 
creating a new object.  I'd like to start with this patch because it's been well 
tested (externally) and solves a major problem with minimal code change, then 
add a test along with the improvements for HDFS-11807.  Right now the 
MiniDFSCluster CI tests don't cover HA or Kerberos auth.
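
As a rough illustration of the "set it during initialization" idea, here is a minimal sketch. It is not the libhdfs++ code: `AuthInfo`, `ConnectionConfig`, and this cut-down `RpcConnection` are hypothetical stand-ins, and the real class carries far more state. The point is only that a connection built this way cannot exist without its auth info.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for the auth settings a connection needs.
struct AuthInfo {
  enum class Method { kSimple, kKerberos };
  Method method = Method::kSimple;
  std::string delegation_token;  // empty when no token has been obtained yet
};

// Hypothetical bundle of everything that would otherwise be set via setters.
struct ConnectionConfig {
  AuthInfo auth_info;
  std::string cluster_name;
};

// Simplified connection: all settings arrive through the constructor and are
// immutable afterwards, so there is no "constructed but not yet configured"
// state for a failover path to trip over.
class RpcConnection {
 public:
  explicit RpcConnection(ConnectionConfig config) : config_(std::move(config)) {}

  const AuthInfo& auth_info() const { return config_.auth_info; }
  const std::string& cluster_name() const { return config_.cluster_name; }

 private:
  const ConnectionConfig config_;
};
```

With this shape, any code path that allocates a connection has to supply the auth info up front, which is the property the setter-based approach lacks.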

> libhdfs++: Authentication failure when first NN of kerberized HA cluster is 
> standby
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-11908
>                 URL: https://issues.apache.org/jira/browse/HDFS-11908
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: James Clampffer
>         Attachments: HDFS-11908.HDFS-8707.000.patch
>
>
> The library won't properly authenticate to a kerberized HA cluster if the 
> first namenode it tries to connect to is the standby.  RpcConnection ends up 
> attempting to use simple auth.
> Control flow to connect to NN for the first time:
> # RpcConnection constructed with a pointer to the RpcEngine as the only 
> argument
> # RpcConnection::Connect(server endpoints, auth_info, callback) called
> ** auth_info contains the SASL mechanism to use + the delegation token if we 
> already have one
> Control flow to connect to NN after failover:
> # RpcEngine::NewConnection called, allocates an RpcConnection exactly how 
> step 1 above would
> # RpcEngine::InitializeConnection called, sets event hooks and a string for 
> cluster name
> # RpcConnection::PreEnqueueRequests called to re-add the RPC messages that 
> didn't go through on the last call because of the standby exception
> # RpcConnection::ConnectAndFlush called to send RPC packets. This only takes 
> server endpoints, no auth info
> To fix:
> RpcEngine::InitializeConnection just needs to set RpcConnection::auth_info_ 
> from the existing RpcEngine::auth_info_.  Even better would be setting this in 
> the constructor, so that if an RpcConnection exists it can be expected to be 
> in a usable state.  I'll get a diff up once I sort out CI build failures.
> We also really need to get CI test coverage for HA and kerberos, because this 
> issue should not have been around for so long.
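
The failover path and the minimal fix described above can be sketched roughly as follows. This is a simplified model, not the actual libhdfs++ source: the class and method names mirror the ones mentioned in the description, but their signatures, the `AuthInfo` struct, the setters, and the `copy_auth_info` flag (which exists only to show the behaviour before and after the fix side by side) are assumptions made for illustration.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in: SASL mechanism plus an optional delegation token.
struct AuthInfo {
  enum class Method { kSimple, kKerberos };
  Method method = Method::kSimple;
  std::string delegation_token;
};

class RpcEngine;

// Simplified connection: constructed with only a pointer to the engine, so
// auth_info_ starts out as the default (simple auth).
class RpcConnection {
 public:
  explicit RpcConnection(RpcEngine* engine) : engine_(engine) {}

  void SetClusterName(std::string name) { cluster_name_ = std::move(name); }
  void SetAuthInfo(AuthInfo auth_info) { auth_info_ = std::move(auth_info); }

  const AuthInfo& auth_info() const { return auth_info_; }
  RpcEngine* engine() const { return engine_; }

 private:
  RpcEngine* engine_;
  std::string cluster_name_;
  AuthInfo auth_info_;
};

class RpcEngine {
 public:
  RpcEngine(AuthInfo auth_info, std::string cluster_name)
      : auth_info_(std::move(auth_info)),
        cluster_name_(std::move(cluster_name)) {}

  // Allocates a connection the same way the first-connect path does.
  std::shared_ptr<RpcConnection> NewConnection() {
    return std::make_shared<RpcConnection>(this);
  }

  // copy_auth_info == false models the behaviour described in this issue;
  // copy_auth_info == true models the proposed fix.
  void InitializeConnection(RpcConnection& conn, bool copy_auth_info) {
    conn.SetClusterName(cluster_name_);
    if (copy_auth_info) {
      conn.SetAuthInfo(auth_info_);  // the one-line fix
    }
  }

  // Failover path: the later ConnectAndFlush step takes no auth info, so the
  // connection must already carry it by the time this returns.
  std::shared_ptr<RpcConnection> ConnectionAfterFailover(bool copy_auth_info) {
    auto conn = NewConnection();
    InitializeConnection(*conn, copy_auth_info);
    return conn;
  }

 private:
  AuthInfo auth_info_;
  std::string cluster_name_;
};
```

In this model, a connection created after failover without the copy still reports simple auth even though the engine was configured for Kerberos, which matches the symptom in the description.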



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

