[ 
https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620033#action_12620033
 ] 

Nitay Joffe commented on HADOOP-3754:
-------------------------------------

This is great work, Dhruba!

I worked on a very similar system at Powerset and had a few questions/comments:

1) I wrote a C++ stream with boost::iostreams that uses the Thrift API 
underneath, so you can work with a standard stream yet read from and write to 
Hadoop. To make this work, I had to add a seek() method to the IDL. On 
input streams I allowed arbitrary seeking. On output streams I only 
allowed it to be called with an offset of 0 (no actual seeking), which Boost 
streams use to find the current position.

2) I had situations where the client did not close its files properly. 
Other users then could not see a file even though the writing was 
done, because the data does not appear until the file is closed. To fix 
this, I put a TimerTask on each fd that would time out (and close the 
file) after some period of inactivity. Whenever an operation was called on a 
Handle, its TimerTask was reset.
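
To make the seek() rules in point 1 concrete, here is a minimal Python sketch. 
It is only an illustration, not the Powerset code or the IDL in the patch: 
HandleSketch and its in-memory buffer are hypothetical, and reading "offset of 
0" as "offset 0 relative to the current position" is my assumption about how 
the position query arrives.

```python
import io


class HandleSketch:
    """Hypothetical stand-in for a server-side file handle (in-memory only)."""

    def __init__(self, writable, data=b""):
        self.writable = writable
        self._buf = io.BytesIO(data)
        if writable:
            # Output handles only ever append, so start at the end.
            self._buf.seek(0, io.SEEK_END)

    def read(self, n):
        if self.writable:
            raise IOError("handle is write-only")
        return self._buf.read(n)

    def write(self, chunk):
        if not self.writable:
            raise IOError("handle is read-only")
        return self._buf.write(chunk)

    def seek(self, offset, whence=io.SEEK_SET):
        """Return the position after the call."""
        if self.writable:
            # Assumption: the only legal call on an output handle is the
            # "where am I?" query, i.e. offset 0 from the current position.
            if offset != 0 or whence != io.SEEK_CUR:
                raise ValueError("output handles only support seek(0, SEEK_CUR)")
            return self._buf.tell()
        # Input handles may seek anywhere.
        return self._buf.seek(offset, whence)
```

An input handle accepts any seek(), while an output handle answers only the 
position query and rejects everything else, which is all a stream library 
needs in order to report the current write position.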

I would attach code, but we are currently being merged into Microsoft and are 
still figuring out how open source contributions will work.
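
As a rough illustration of the idle timeout in point 2 (this is not the 
Powerset code, only a minimal sketch), the idea looks roughly like this in 
Python. It swaps Java's TimerTask for threading.Timer, and TimedHandle, 
close_fn and idle_seconds are hypothetical names chosen for the example.

```python
import threading


class TimedHandle:
    """Hypothetical handle that closes itself after idle_seconds without use."""

    def __init__(self, close_fn, idle_seconds):
        self._close_fn = close_fn          # whatever actually closes the file
        self._idle_seconds = idle_seconds
        self._lock = threading.Lock()
        self._timer = None
        self.closed = False
        self._arm()

    def _arm(self):
        # Cancel any pending countdown and start a fresh one.
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            if self.closed:
                return
            self._timer = threading.Timer(self._idle_seconds, self.close)
            self._timer.daemon = True
            self._timer.start()

    def touch(self):
        """Call at the start of every operation on this handle."""
        self._arm()

    def close(self):
        # Safe to call from the timer thread or from the client; runs once.
        with self._lock:
            if self.closed:
                return
            self.closed = True
            if self._timer is not None:
                self._timer.cancel()
        self._close_fn()
```

Every read, write or seek on the handle would call touch() first, so an 
active client never hits the timeout, while an abandoned handle gets closed 
and its data becomes visible to other readers.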

Cheers,
-n

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code 
> generation for a variety of languages (Java, C++, Python, PHP, etc.). It would 
> be nice if the HDFS APIs were exposed through Thrift. That would allow 
> applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
