[
https://issues.apache.org/jira/browse/HDFS-9924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321910#comment-15321910
]
Ming Ma commented on HDFS-9924:
-------------------------------
Based on the discussion so far, it seems the existing async API needs to be
changed, but I am not sure the new async API has to be developed in a feature
branch. How about reverting the existing async API and then developing the new
async API directly in trunk, and optionally in branch-2 (not branch-2.8),
assuming we go through a more thorough discussion this time and reach broader
consensus?
* The Java 8 future API is more flexible: it supports different kinds of future
composition and makes it easy to support dependent operations. In fact, the
discussion in HADOOP-12910 proposes this for trunk.
* HADOOP-12910 also discussed the API design for branch-2, but not for
branch-2.8. It appears the async requirement comes from Hive. If we leave the
API as it is in 2.8, does that mean Hive needs to hard-code the use of
AsyncDistributedFileSystem? Alternatively, Hive could use multiple threads to
interact with FileSystem, just as MapReduce does for split calculation; this
approach doesn't require any async support in 2.8. Either way, Hive will need
to change how it uses HDFS when upgrading from 2.8 to 2.9 if we plan to put the
new async API into 2.9.
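
To illustrate the first point, here is a minimal sketch of a dependent
operation expressed with java.util.concurrent.CompletableFuture, assuming that
is what "Java 8 future API" refers to. The renameAsync and setPermissionAsync
methods and the in-memory PERMISSIONS map are hypothetical stand-ins invented
for this example; they are not the API of AsyncDistributedFileSystem or of any
HADOOP-12910 proposal.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

class AsyncComposeSketch {
    // Hypothetical in-memory "file system": path -> permission string.
    // This is an illustration only, NOT a Hadoop API.
    static final Map<String, String> PERMISSIONS = new ConcurrentHashMap<>();

    // Hypothetical async rename; completes with the destination path.
    static CompletableFuture<String> renameAsync(String src, String dst) {
        return CompletableFuture.supplyAsync(() -> {
            String perm = PERMISSIONS.remove(src);
            if (perm == null) {
                throw new IllegalStateException("no such path: " + src);
            }
            PERMISSIONS.put(dst, perm);
            return dst;
        });
    }

    // Hypothetical async setPermission.
    static CompletableFuture<Void> setPermissionAsync(String path, String perm) {
        return CompletableFuture.runAsync(() -> PERMISSIONS.put(path, perm));
    }

    // Dependent operations: setPermission is started only after the rename
    // has completed, and the caller thread is never blocked in between.
    static CompletableFuture<String> renameThenSetPermission(
            String src, String dst, String perm) {
        return renameAsync(src, dst)
            .thenCompose(renamed ->
                setPermissionAsync(renamed, perm).thenApply(ignored -> renamed));
    }

    public static void main(String[] args) {
        PERMISSIONS.put("/tmp/a", "rw-r--r--");
        String finalPath =
            renameThenSetPermission("/tmp/a", "/tmp/b", "rwxr-x---").join();
        System.out.println(finalPath + " " + PERMISSIONS.get(finalPath));
    }
}
```

With a plain java.util.concurrent.Future, the same ordering would require the
caller to block in get() on the rename before issuing setPermission, which is
the kind of limitation the composition methods avoid.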
> [umbrella] Asynchronous HDFS Access
> -----------------------------------
>
> Key: HDFS-9924
> URL: https://issues.apache.org/jira/browse/HDFS-9924
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: fs
> Reporter: Tsz Wo Nicholas Sze
> Assignee: Xiaobing Zhou
> Attachments: AsyncHdfs20160510.pdf
>
>
> This is an umbrella JIRA for supporting Asynchronous HDFS Access.
> Currently, all the API methods are blocking calls -- the caller is blocked
> until the method returns. It is very slow if a client makes a large number
> of independent calls in a single thread since each call has to wait until the
> previous call is finished. It is inefficient if a client needs to create a
> large number of threads to invoke the calls.
> We propose adding a new API to support asynchronous calls, i.e. the caller is
> not blocked. The methods in the new API immediately return a Java Future
> object. The return value can be obtained by the usual Future.get() method.
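
The calling pattern described in the issue text above can be sketched as
follows. This is only an illustration under stated assumptions: getLengthAsync
is a hypothetical method that returns a Future immediately, backed here by a
plain ExecutorService and a dummy computation rather than by any real HDFS
call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureGetSketch {
    // Hypothetical non-blocking call, NOT the HDFS API: returns a Future at
    // once. The "result" is just the length of the path string.
    static Future<Long> getLengthAsync(ExecutorService pool, String path) {
        return pool.submit(() -> (long) path.length());
    }

    // Issue all independent calls first, then collect the results.
    static long totalLength(List<String> paths) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (String path : paths) {
                pending.add(getLengthAsync(pool, path)); // does not block
            }
            long total = 0;
            for (Future<Long> future : pending) {
                total += future.get(); // blocks only until this call is done
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            // Simplified error handling for the sketch.
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("/a", "/bb", "/ccc")));
    }
}
```

The caller submits every independent call before waiting on any of them, so
no call has to wait for the previous one to finish, and the caller itself
needs only a single thread.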