[ https://issues.apache.org/jira/browse/HADOOP-4707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696847#action_12696847 ]

Todd Lipcon commented on HADOOP-4707:
-------------------------------------

{quote}
Yep. A simpler option might be to spawn a thread on the datanode which calls 
Namenode.datanodeUp() every so often. Any preference?
{quote}

That is definitely simpler but seems ugly to me. I know heartbeats are the norm 
elsewhere in Hadoop, but I'd prefer not to introduce more.

{quote}
On my real 6-node test cluster, that value is the same as the value of 
org.apache.hadoop.hdfs.protocol.DatanodeInfo.name for every DatanodeInfo 
instance, so everything works. On a MiniDFSCluster, however, it is not 
— just as you found out in your case, Todd (classloader issues, perhaps?).
{quote}

I think I've figured out the issue here:

In ThriftUtils.createNamenodeClient, the datanode uses dfs.thrift.address to 
connect to the namenode, and NamenodePlugin.getAddress() reads the same 
configuration variable. So, with the default configuration of 0.0.0.0:9090, 
the NN plugin binds to all local interfaces, while the datanodes attempt to 
connect to 0.0.0.0 itself, i.e. to whatever local IP the wildcard resolves to.
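
To illustrate the collision (a rough sketch, not the actual plugin code; only 
the configuration key and the class names come from the discussion above):

{code:java}
import java.net.InetSocketAddress;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.NetUtils;
import org.apache.thrift.transport.TServerSocket;
import org.apache.thrift.transport.TSocket;

// Hypothetical standalone illustration of both sides reading the same key.
public class WildcardMismatch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    InetSocketAddress addr =
        NetUtils.createSocketAddr(conf.get("dfs.thrift.address", "0.0.0.0:9090"));

    // Server side (what NamenodePlugin does): binding to 0.0.0.0 is fine,
    // the Thrift server listens on every local interface.
    TServerSocket server = new TServerSocket(addr);

    // Client side (what ThriftUtils.createNamenodeClient does): connecting
    // to the same wildcard address is not fine; the socket goes to whatever
    // local IP 0.0.0.0 resolves to, not necessarily to the namenode.
    TSocket client = new TSocket(addr.getHostName(), addr.getPort());
    client.open();
  }
}
{code}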

It seems to me that the correct behaviour would be:

 - The NamenodePlugin continues to listen on dfs.thrift.address.
   - Having a :0 port on dfs.thrift.address seems inadvisable, since there's 
currently no way for the DatanodePlugin to locate the Thrift server in that 
case.
 - The DatanodePlugin looks at dfs.thrift.address. If it is a "wildcard" 
address (0.0.0.0), it uses only the port portion and locates the NN host using 
datanode.getNameNodeAddr().
 - Additionally, we should inspect the TTransport from the client for the 
hostname in datanodeUp/datanodeDown rather than taking those as parameters. The 
hostname registered in the DatanodeRegistration comes from the remote side of 
the Hadoop RPC socket, so it makes sense to use the same method for getting the 
hostname on the Thrift side. A rough sketch of both changes follows.
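
Something like this is what I'm picturing (a sketch only; 
datanode.getNameNodeAddr() is the accessor mentioned above, everything else is 
illustrative rather than the final patch):

{code:java}
import java.net.InetSocketAddress;
import java.net.Socket;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.NetUtils;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

// Hypothetical sketch of the two DatanodePlugin-side changes proposed above.
public class DatanodePluginSketch {

  /** Where the DatanodePlugin should connect to reach the NN Thrift server. */
  static InetSocketAddress thriftTarget(Configuration conf,
                                        InetSocketAddress nameNodeAddr) {
    InetSocketAddress addr =
        NetUtils.createSocketAddr(conf.get("dfs.thrift.address", "0.0.0.0:9090"));
    if (addr.getAddress() != null && addr.getAddress().isAnyLocalAddress()) {
      // Wildcard host: keep only the port, and borrow the host the datanode
      // already uses to reach the namenode over Hadoop RPC.
      return new InetSocketAddress(nameNodeAddr.getAddress(), addr.getPort());
    }
    return addr;
  }

  /** The caller's hostname, read off the Thrift transport itself, so that
   *  datanodeUp/datanodeDown need not take it as a parameter. Assumes the
   *  server exposes the per-connection TSocket to the handler. */
  static String remoteHostname(TTransport transport) {
    Socket sock = ((TSocket) transport).getSocket();
    return sock.getInetAddress().getHostAddress();
  }
}
{code}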

I'll work on hacking these up and see where I get.

> Improvements to Hadoop Thrift bindings
> --------------------------------------
>
>                 Key: HADOOP-4707
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4707
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/thriftfs
>    Affects Versions: 0.20.0
>         Environment: Tested under Linux x86-64
>            Reporter: Carlos Valiente
>            Priority: Minor
>         Attachments: all.diff, BlockManager.java, build_xml.diff, 
> DefaultBlockManager.java, DFSBlockManager.java, gen.diff, HADOOP-4707.diff, 
> HADOOP-4707.patch, hadoopfs_thrift.diff, hadoopthriftapi.jar, 
> HadoopThriftServer.java, HadoopThriftServer_java.diff, hdfs.py, 
> hdfs_py_venky.diff, libthrift.jar, libthrift.jar, libthrift.jar
>
>
> I have made the following changes to hadoopfs.thrift:
> #  Added namespaces for Python, Perl and C++.
> # Renamed parameters and struct members to camelCase versions to keep them 
> consistent (in particular FileStatus.{blockReplication, blockSize} vs 
> FileStatus.{block_replication, blocksize}).
> # Renamed ThriftHadoopFileSystem to FileSystem. From the perspective of a 
> Perl/Python/C++ user, 1) it is already clear that we're using Thrift, and 2) 
> the fact that we're dealing with Hadoop is already explicit in the namespace. 
>  The usage of generated code is more compact and (in my opinion) clearer:
> {quote}
>         *Perl*:
>         use HadoopFS;
>         my $client = HadoopFS::FileSystemClient->new(..);
>          _instead of:_
>         my $client = HadoopFS::ThriftHadoopFileSystemClient->new(..);
>         *Python*:
>         from hadoopfs import FileSystem
>         client = FileSystem.Client(..)
>         _instead of_
>         from hadoopfs import ThriftHadoopFileSystem
>         client = ThriftHadoopFileSystem.Client(..)
>         (See also the attached diff [^scripts_hdfs_py.diff] for the
>          new version of 'scripts/hdfs.py').
>         *C++*:
>         hadoopfs::FileSystemClient client(..);
>          _instead of_:
>         hadoopfs::ThriftHadoopFileSystemClient client(..);
> {quote}
> # Renamed ThriftHandle to FileHandle: As in 3, it is clear that we're dealing 
> with a Thrift object, and its purpose (to act as a handle for file 
> operations) is clearer.
> # Renamed ThriftIOException to IOException, to keep it simpler, and 
> consistent with MalformedInputException.
> # Added explicit version tags to fields of ThriftHandle/FileHandle, Pathname, 
> MalformedInputException and ThriftIOException/IOException, to improve 
> compatibility of existing clients with future versions of the interface which 
> might add new fields to those objects (like stack traces for the exception 
> types, for instance).
> Those changes are reflected in the attachment [^hadoopfs_thrift.diff].
> Changes in generated Java, Python, Perl and C++ code are also attached in 
> [^gen.diff]. They were generated by a Thrift checkout from trunk
> ([http://svn.apache.org/repos/asf/incubator/thrift/trunk/]) as of revision
> 719697, plus the following Perl-related patches:
> * [https://issues.apache.org/jira/browse/THRIFT-190]
> * [https://issues.apache.org/jira/browse/THRIFT-193]
> * [https://issues.apache.org/jira/browse/THRIFT-199]
> The Thrift jar file [^libthrift.jar] built from that Thrift checkout is also 
> attached, since it's needed to run the Java Thrift server.
> I have also added a new target to src/contrib/thriftfs/build.xml to build the 
> Java bindings needed for org.apache.hadoop.thriftfs.HadoopThriftServer.java 
> (see attachment [^build_xml.diff]), and modified HadoopThriftServer.java to 
> make use of the new bindings (see attachment [^HadoopThriftServer_java.diff]).
> The jar file [^lib/hadoopthriftapi.jar] is also included, although it can be 
> regenerated from the stuff under 'gen-java' and the new 'compile-gen' Ant 
> target.
> The whole changeset is also included as [^all.diff].

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
