[ 
https://issues.apache.org/jira/browse/HADOOP-4707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704663#action_12704663
 ] 

Todd Lipcon commented on HADOOP-4707:
-------------------------------------

bq. The identd approach is far from perfect, definitely, but it is more than 
enough for my particular requirements.

Not to be a pain, but unfortunately it's completely inadequate for my 
particular requirements ;-) I need to be able to assume the identity of other 
users in the system.

bq. The Thrift IDL file, which I see as the main documentation source for the 
exposed service, becomes cluttered.

This is somewhat true. We can use the new-ish "thrift -gen html" to generate 
html documentation, though, which at least is slightly nicer to look at. I 
don't think adding a single common parameter to every call really clutters 
things that much, though, since the additional mental burden is "O(1)". Once 
you understand what the parameter means on one call, you understand it 
everywhere :)
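To illustrate, the per-call parameter pattern would look roughly like this from Python. This is just a sketch: FileSystemClient here is a plain stand-in for the Thrift-generated client, and the method names and the "requester" parameter name are hypothetical, not the actual generated API.

```python
# Hypothetical sketch of the "one common parameter per call" pattern.
# FileSystemClient is a stand-in for a Thrift-generated HDFS client.

class FileSystemClient:
    """Stand-in for a Thrift-generated HDFS client (not the real API)."""

    # Every RPC takes the same leading 'requester' argument, so the
    # extra mental burden is O(1): learn it once, apply it everywhere.
    def mkdir(self, requester, path):
        return "mkdir %s as %s" % (path, requester)

    def listStatus(self, requester, path):
        return "ls %s as %s" % (path, requester)

client = FileSystemClient()
print(client.mkdir("tlipcon", "/user/tlipcon/data"))
print(client.listStatus("www-data", "/tmp"))
```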

bq.  Although old clients would still be able to call the Thrift services, C++ 
clients generated from the new IDL source would get the extra argument in the 
signature, and the user code would be less readable.

Yes, it's slightly less readable, but on the other hand I assume that anyone 
using this service will be wrapping the thrift client in some kind of 
filesystem facade anyway. The change then is confined to a single module. The 
fact that existing C++ code will need changes doesn't concern me too much since 
(a) it's a very clear compilation-time error, not something subtle and hard to 
find, and (b) this hasn't been committed yet, so now is the one time we should 
feel especially free to break backwards compat!

bq. Then on the server side we could either start a 
org.apache.thrift.transport.TServerSocket or a 
org.apache.thrift.transport.TSecureServerSocket, depending on some 
configuration value

This doesn't really solve my issue. I don't care about authentication from an 
access control perspective. I care more about the ability to assume different 
user roles. One primary use case for this Thrift interface, as I see it, is so 
that web applications written in non-Java languages can access HDFS more 
easily. Web applications typically run as some user like www-data or nobody, 
which is problematic if they need to assume the identity of the end user 
(tlipcon) who has provided some authentication via a login form, etc.

The other option I considered was to add a su(...) RPC and use a specialized 
TTransport on the server side to hold the current UGI, but this breaks a lot of 
layer encapsulation properties. Namely, the TTransport cannot automatically 
reconnect to the server, since the UGI will be lost after the reconnect.
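A toy illustration of the layering problem (pure Python, all names hypothetical): any per-connection state established by an su(...)-style call silently vanishes when the transport reconnects underneath the caller.

```python
# Toy model of why holding the current UGI in the transport breaks:
# a transport that reconnects transparently loses per-connection state.

class StatefulTransport:
    """Hypothetical transport that remembers the UGI set by su(...)."""

    def __init__(self):
        self.current_ugi = None  # set by an su(...) RPC

    def su(self, ugi):
        self.current_ugi = ugi

    def reconnect(self):
        # An automatic reconnect opens a fresh connection; the server
        # no longer knows which user we had switched to.
        self.current_ugi = None

t = StatefulTransport()
t.su("tlipcon")
assert t.current_ugi == "tlipcon"
t.reconnect()                  # e.g. after a network blip
assert t.current_ugi is None   # the su() state is silently lost
```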


> Improvements to Hadoop Thrift bindings
> --------------------------------------
>
>                 Key: HADOOP-4707
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4707
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/thriftfs
>    Affects Versions: 0.20.0
>         Environment: Tested under Linux x86-64
>            Reporter: Carlos Valiente
>            Assignee: Carlos Valiente
>            Priority: Minor
>         Attachments: all.diff, BlockManager.java, build_xml.diff, 
> DefaultBlockManager.java, DFSBlockManager.java, gen.diff, 
> HADOOP-4707-55c046a.txt, hadoop-4707-6bc958.txt, HADOOP-4707.diff, 
> HADOOP-4707.patch, HADOOP-4707.patch, hadoopfs_thrift.diff, 
> hadoopthriftapi.jar, HadoopThriftServer.java, HadoopThriftServer_java.diff, 
> hdfs.py, hdfs_py_venky.diff, libthrift.jar, libthrift.jar, libthrift.jar
>
>
> I have made the following changes to hadoopfs.thrift:
> #  Added namespaces for Python, Perl and C++.
> # Renamed parameters and struct members to camelCase versions to keep them 
> consistent (in particular FileStatus{blockReplication,blockSize} vs 
> FileStatus.{block_replication,blocksize}).
> # Renamed ThriftHadoopFileSystem to FileSystem. From the perspective of a 
> Perl/Python/C++ user, 1) it is already clear that we're using Thrift, and 2) 
> the fact that we're dealing with Hadoop is already explicit in the namespace. 
>  The usage of generated code is more compact and (in my opinion) clearer:
> {quote}
>         *Perl*:
>         use HadoopFS;
>         my $client = HadoopFS::FileSystemClient->new(..);
>          _instead of:_
>         my $client = HadoopFS::ThriftHadoopFileSystemClient->new(..);
>         *Python*:
>         from hadoopfs import FileSystem
>         client = FileSystem.Client(..)
>         _instead of_
>         from hadoopfs import ThriftHadoopFileSystem
>         client = ThriftHadoopFileSystem.Client(..)
>         (See also the attached diff [^scripts_hdfs_py.diff] for the
>          new version of 'scripts/hdfs.py').
>         *C++*:
>         hadoopfs::FileSystemClient client(..);
>          _instead of_:
>         hadoopfs::ThriftHadoopFileSystemClient client(..);
> {quote}
> # Renamed ThriftHandle to FileHandle: As in 3, it is clear that we're dealing 
> with a Thrift object, and its purpose (to act as a handle for file 
> operations) is clearer.
> # Renamed ThriftIOException to IOException, to keep it simpler, and 
> consistent with MalformedInputException.
> # Added explicit version tags to fields of ThriftHandle/FileHandle, Pathname, 
> MalformedInputException and ThriftIOException/IOException, to improve 
> compatibility of existing clients with future versions of the interface which 
> might add new fields to those objects (like stack traces for the exception 
> types, for instance).
> Those changes are reflected in the attachment [^hadoopfs_thrift.diff].
> Changes in generated Java, Python, Perl and C++ code are also attached in 
> [^gen.diff]. They were generated by a Thrift checkout from trunk
> ([http://svn.apache.org/repos/asf/incubator/thrift/trunk/]) as of revision
> 719697, plus the following Perl-related patches:
> * [https://issues.apache.org/jira/browse/THRIFT-190]
> * [https://issues.apache.org/jira/browse/THRIFT-193]
> * [https://issues.apache.org/jira/browse/THRIFT-199]
> The Thrift jar file [^libthrift.jar] built from that Thrift checkout is also 
> attached, since it's needed to run the Java Thrift server.
> I have also added a new target to src/contrib/thriftfs/build.xml to build the 
> Java bindings needed for org.apache.hadoop.thriftfs.HadoopThriftServer.java 
> (see attachment [^build_xml.diff] and modified HadoopThriftServer.java to 
> make use of the new bindings (see attachment [^HadoopThriftServer_java.diff]).
> The jar file [^lib/hadoopthriftapi.jar] is also included, although it can be 
> regenerated from the stuff under 'gen-java' and the new 'compile-gen' Ant 
> target.
> The whole changeset is also included as [^all.diff].

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
