[ 
https://issues.apache.org/jira/browse/HADOOP-4707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696715#action_12696715
 ] 

Carlos Valiente commented on HADOOP-4707:
-----------------------------------------

{quote}
1) If the NameNode restarts but the DataNodes stay up, the DataNodes don't 
re-register their Thrift ports with the NameNode. Calling getDatanodeReport 
then triggers an NPE in ThriftUtils.toThrift
[..]
Specifically, I think the DataNode needs a hook in DataNode.register() that 
calls through to plugins. Doing this in a generalized way on the HADOOP-5257 
plugin interface might be nice - some kind of "hook point" pubsub kind of 
interface. 
{quote}

Yep. A simpler option might be to spawn a thread on the datanode which calls 
{{Namenode.datanodeUp()}} every so often. Any preference?
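A rough sketch of what I have in mind, using a scheduled executor on the datanode side. The {{NamenodeStub}} interface and the {{datanodeUp()}} signature here are purely illustrative, not the actual HADOOP-5257 plugin API:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ThriftPortHeartbeat {

  /** Hypothetical callback standing in for Namenode.datanodeUp(). */
  public interface NamenodeStub {
    void datanodeUp(String dnName, int thriftPort);
  }

  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  /** Re-registers this datanode's Thrift port every periodSeconds,
   *  so a restarted namenode picks it up again without a DN restart. */
  public void start(final NamenodeStub namenode, final String dnName,
                    final int thriftPort, long periodSeconds) {
    scheduler.scheduleAtFixedRate(
        () -> namenode.datanodeUp(dnName, thriftPort),
        0, periodSeconds, TimeUnit.SECONDS);
  }

  public void stop() {
    scheduler.shutdownNow();
  }
}
```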

bq.2) I think the datanode hostnames need to be canonicalized somehow when 
inserting into the thriftPorts map. On a pseudodistributed cluster, I'm seeing 
getDatanodeReport fail to find the thriftPort since the DN is registering under 
the name 127.0.1.1, but then being looked up as 127.0.0.1 for whatever reason. 
I'll look into a solution for this.

I came across that same issue when writing the test suite. The "canonical" 
name, as far as the {{thriftPorts}} map is concerned, is 
{{org.apache.hadoop.thriftfs.DatanodePlugin.datanode.DataNode.dnRegistration.getName()}}.
 On my real 6-node test cluster, that value is the same as the value of
{{org.apache.hadoop.hdfs.protocol.DatanodeInfo.name}} for every
{{DatanodeInfo}} instance, so everything works. On a {{MiniDFSCluster}},
however, it is not, which is just what you found in your case, Todd
(classloader issues, perhaps?).

My workaround for the test suite was to set the property {{slave.host.name}} to 
the expected value.
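That is, something like the following in the test cluster's configuration (the value shown is just the one my {{MiniDFSCluster}} instance expected; adjust as needed):

```xml
<!-- hadoop-site.xml: force the DN to register under the same name
     that clients will later use to look up its Thrift port -->
<property>
  <name>slave.host.name</name>
  <value>127.0.0.1</value>
</property>
```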

bq.3) Lastly, I think the "Unknown Thrift port for Datanode" NPE is 
unnecessarily strict. I'd prefer for it to return a -1 or a 0 to indicate that 
the DN thrift server isn't running. This would require some extra checks 
elsewhere in the code before trying to contact a non-existent thrift server, 
but it enables getDatanodeReport to work even without the thrift plugin on the 
DNs.

Yep, makes sense. I propose setting the port to -1, and also doing something 
like this:

{code}
  public static Block toThrift(LocatedBlock block, String path,
      Map<String, Integer> thriftPorts) {
    if (block == null) {
      return new Block();
    }

    List<DatanodeInfo> nodes = new ArrayList<DatanodeInfo>();
    for (org.apache.hadoop.hdfs.protocol.DatanodeInfo n:
        block.getLocations()) {
      DatanodeInfo node = toThrift(n, thriftPorts);
      // Skip datanodes whose Thrift server is not running (port == -1).
      if (node.thriftPort != -1) {
        nodes.add(node);
      }
    }
    // [...]
  }
{code}

That way we return to the client the (possibly empty) list of all block 
locations accessible by the Thrift interface.
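On the lookup side, the -1 sentinel would replace the current NPE with something along these lines (a sketch only; the helper name is made up for illustration):

```java
import java.util.Map;

public class ThriftPortLookup {

  /** Returns the Thrift port registered for the given datanode name,
   *  or -1 when the DN plugin has not registered one, instead of
   *  failing with "Unknown Thrift port for Datanode". */
  public static int thriftPort(Map<String, Integer> thriftPorts,
                               String dnName) {
    Integer port = thriftPorts.get(dnName);
    return (port == null) ? -1 : port;
  }
}
```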

> Improvements to Hadoop Thrift bindings
> --------------------------------------
>
>                 Key: HADOOP-4707
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4707
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/thriftfs
>    Affects Versions: 0.20.0
>         Environment: Tested under Linux x86-64
>            Reporter: Carlos Valiente
>            Priority: Minor
>         Attachments: all.diff, BlockManager.java, build_xml.diff, 
> DefaultBlockManager.java, DFSBlockManager.java, gen.diff, HADOOP-4707.diff, 
> HADOOP-4707.patch, hadoopfs_thrift.diff, hadoopthriftapi.jar, 
> HadoopThriftServer.java, HadoopThriftServer_java.diff, hdfs.py, 
> hdfs_py_venky.diff, libthrift.jar, libthrift.jar, libthrift.jar
>
>
> I have made the following changes to hadoopfs.thrift:
> # Added namespaces for Python, Perl and C++.
> # Renamed parameters and struct members to camelCase versions to keep them 
> consistent (in particular FileStatus.{blockReplication, blockSize} vs 
> FileStatus.{block_replication, blocksize}).
> # Renamed ThriftHadoopFileSystem to FileSystem. From the perspective of a 
> Perl/Python/C++ user, 1) it is already clear that we're using Thrift, and 2) 
> the fact that we're dealing with Hadoop is already explicit in the namespace. 
>  The usage of generated code is more compact and (in my opinion) clearer:
> {quote}
>         *Perl*:
>         use HadoopFS;
>         my $client = HadoopFS::FileSystemClient->new(..);
>          _instead of:_
>         my $client = HadoopFS::ThriftHadoopFileSystemClient->new(..);
>         *Python*:
>         from hadoopfs import FileSystem
>         client = FileSystem.Client(..)
>         _instead of_
>         from hadoopfs import ThriftHadoopFileSystem
>         client = ThriftHadoopFileSystem.Client(..)
>         (See also the attached diff [^scripts_hdfs_py.diff] for the
>          new version of 'scripts/hdfs.py').
>         *C++*:
>         hadoopfs::FileSystemClient client(..);
>          _instead of_:
>         hadoopfs::ThriftHadoopFileSystemClient client(..);
> {quote}
> # Renamed ThriftHandle to FileHandle: As in 3, it is clear that we're dealing 
> with a Thrift object, and its purpose (to act as a handle for file 
> operations) is clearer.
> # Renamed ThriftIOException to IOException, to keep it simpler, and 
> consistent with MalformedInputException.
> # Added explicit version tags to fields of ThriftHandle/FileHandle, Pathname, 
> MalformedInputException and ThriftIOException/IOException, to improve 
> compatibility of existing clients with future versions of the interface which 
> might add new fields to those objects (like stack traces for the exception 
> types, for instance).
> Those changes are reflected in the attachment [^hadoopfs_thrift.diff].
> Changes in generated Java, Python, Perl and C++ code are also attached in 
> [^gen.diff]. They were generated by a Thrift checkout from trunk
> ([http://svn.apache.org/repos/asf/incubator/thrift/trunk/]) as of revision
> 719697, plus the following Perl-related patches:
> * [https://issues.apache.org/jira/browse/THRIFT-190]
> * [https://issues.apache.org/jira/browse/THRIFT-193]
> * [https://issues.apache.org/jira/browse/THRIFT-199]
> The Thrift jar file [^libthrift.jar] built from that Thrift checkout is also 
> attached, since it's needed to run the Java Thrift server.
> I have also added a new target to src/contrib/thriftfs/build.xml to build the 
> Java bindings needed for org.apache.hadoop.thriftfs.HadoopThriftServer.java 
> (see attachment [^build_xml.diff] and modified HadoopThriftServer.java to 
> make use of the new bindings (see attachment [^HadoopThriftServer_java.diff]).
> The jar file [^lib/hadoopthriftapi.jar] is also included, although it can be 
> regenerated from the stuff under 'gen-java' and the new 'compile-gen' Ant 
> target.
> The whole changeset is also included as [^all.diff].

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
