[ 
https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321910#comment-15321910
 ] 

Ming Ma commented on HDFS-9924:
-------------------------------

Based on the discussion so far, it seems the existing async API needs to
change. I am not sure, though, that the new async API has to be developed in a
feature branch. How about reverting the existing async API and then developing
the new one directly in trunk, and optionally in branch-2 (not branch-2.8),
assuming we go through a more thorough discussion this time and reach broader
consensus?

* The Java 8 future API is more flexible: it supports different kinds of
future composition and makes it easy to support dependent operations. In fact,
the discussion in HADOOP-12910 proposes this for trunk.

* HADOOP-12910 also discussed the API design for branch-2, but not for
branch-2.8. It appears the async requirement comes from Hive. If we leave the
API as it is in 2.8, does that mean Hive needs to hard-code the use of
AsyncDistributedFileSystem? Alternatively, Hive could use multiple threads to
interact with FileSystem, just as MapReduce does for split calculation; this
approach doesn't require any async support in 2.8. Either way, Hive will need
to change how it uses HDFS when upgrading from 2.8 to 2.9 if we plan to put the
new async API into 2.9.
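
To make the two points above concrete, here is a rough sketch of what they
could look like on the client side. It is only an illustration: AsyncFsSketch,
blockingLength, lengthAsync, renameThenLength and totalLength are hypothetical
names, and blockingLength is a placeholder for a blocking FileSystem call. None
of this is the AsyncDistributedFileSystem API or the API proposed in
HADOOP-12910. The thread pool stands in for the "multiple threads" option, and
the Java 8 CompletableFuture methods show composition of dependent and
independent operations.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

// Hypothetical sketch; not part of Hadoop and not the proposed API.
final class AsyncFsSketch {

    // Placeholder for a blocking call (for example a file status lookup).
    // Assumption: it simply returns the length of the path string.
    static long blockingLength(String path) {
        return path.length();
    }

    // "Multiple threads" option: run the blocking call on a caller-owned pool
    // and hand back a future, so no async support is needed underneath.
    static CompletableFuture<Long> lengthAsync(String path, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> blockingLength(path), pool);
    }

    // Dependent operations: the second call starts only after the first one
    // completes and uses its result. The first stage pretends to be a rename.
    static CompletableFuture<Long> renameThenLength(
            String src, String suffix, ExecutorService pool) {
        return CompletableFuture
                .supplyAsync(() -> src + suffix, pool)
                .thenCompose(dst -> lengthAsync(dst, pool));
    }

    // Independent operations: issue all calls at once, then combine results.
    static CompletableFuture<Long> totalLength(
            List<String> paths, ExecutorService pool) {
        List<CompletableFuture<Long>> futures = paths.stream()
                .map(p -> lengthAsync(p, pool))
                .collect(Collectors.toList());
        return CompletableFuture
                .allOf(futures.toArray(new CompletableFuture<?>[0]))
                // Every future is complete here, so join() does not block.
                .thenApply(ignored -> futures.stream()
                        .mapToLong(CompletableFuture::join)
                        .sum());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(
                    totalLength(Arrays.asList("/a", "/bb", "/ccc"), pool).join());
            System.out.println(
                    renameThenLength("/tmp/x", ".done", pool).join());
        } finally {
            pool.shutdown();
        }
    }
}
```

With a plain java.util.concurrent.Future, the caller would have to block in
get() between the two steps of renameThenLength; thenCompose expresses the
dependency without blocking the calling thread.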

> [umbrella] Asynchronous HDFS Access
> -----------------------------------
>
>                 Key: HDFS-9924
>                 URL: https://issues.apache.org/jira/browse/HDFS-9924
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Xiaobing Zhou
>         Attachments: AsyncHdfs20160510.pdf
>
>
> This is an umbrella JIRA for supporting Asynchronous HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked 
> until the method returns.  It is very slow if a client makes a large number 
> of independent calls in a single thread since each call has to wait until the 
> previous call is finished.  It is inefficient if a client needs to create a 
> large number of threads to invoke the calls.
> We propose adding a new API to support asynchronous calls, i.e. the caller is 
> not blocked.  The methods in the new API immediately return a Java Future 
> object.  The return value can be obtained by the usual Future.get() method.



