[
https://issues.apache.org/jira/browse/HDFS-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948661#comment-14948661
]
Bob Hansen commented on HDFS-9144:
----------------------------------
A proposed structure to start:
__C-API__
libhdfs-compatible layer that is a thin wrapper around the C++-posixish-API
__C++-posixish-API__
Stateful quasi-posix API that will be familiar and easy to consume
Embodies sane default policies and strategies for common operations
implements all asynchronous operations
has synchronous helpers for all asynchronous operations
Wrapper around functional-API, below
FileSystem:
constructs with config object
open() returns a FileHandle
common NN operations
holds state for dead DNs
shared state and thread-safe (implement single lock for FS?)
owns and is wrapper around NameNodeConnection
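A minimal sketch of the "holds state for dead DNs" and "single lock" items above. The class name `DeadDataNodeTracker`, its methods, and the use of plain strings as DataNode identifiers are assumptions made for illustration; none of them come from the existing code.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>

// Hypothetical holder for the FileSystem's shared dead-DataNode state.
// DataNodes are identified by opaque strings here (an assumption).
class DeadDataNodeTracker {
 public:
  // Record a DataNode as dead. Safe to call from any thread.
  void MarkDead(const std::string& datanode_id) {
    assert(!datanode_id.empty());
    std::lock_guard<std::mutex> guard(lock_);
    dead_.insert(datanode_id);
  }

  // Query whether a DataNode has been marked dead.
  bool IsDead(const std::string& datanode_id) const {
    std::lock_guard<std::mutex> guard(lock_);
    return dead_.count(datanode_id) != 0;
  }

  // Copy the current set so that a read operation can consult it
  // without holding the lock for the duration of the read.
  std::set<std::string> Snapshot() const {
    std::lock_guard<std::mutex> guard(lock_);
    return dead_;
  }

 private:
  mutable std::mutex lock_;  // the "single lock" option from the outline
  std::set<std::string> dead_;
};
```

A snapshot taken before a later `MarkDead()` call does not see the new entry, which is the behaviour an ephemeral read operation working from a "snapshot of dead DN list" would rely on.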
FileHandle:
supports implicit position and streaming reads (posixy)
stateful and single-threaded with the exception of cancellation method
thread-safe cancel() method will cancel any outstanding I/Os and
deliver a cancellation error to each one's continuation
implements reliable reads and error recovery
Maintains a pointer to the posixy-FileSystem for operations on the dead
DN list
Owns block map
Read operation: will pick appropriate DNConnections and
Will eventually cache DNConnections
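The FileHandle items above (implicit position, streaming reads, thread-safe cancel()) could look roughly like the following toy. It is only a sketch: the file is an in-memory string rather than a set of DataNode connections, `RunPending()` is a stand-in for the asio event loop, and `IoStatus`, `AsyncRead`, `Cancel`, and `RunPending` are hypothetical names.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <deque>
#include <functional>
#include <string>
#include <utility>

enum class IoStatus { kOk, kCancelled };  // hypothetical error type

class FileHandle {
 public:
  using Handler = std::function<void(IoStatus, std::size_t)>;

  explicit FileHandle(std::string data) : data_(std::move(data)) {}

  // Queue a streaming read at the implicit position. Owner thread only.
  void AsyncRead(char* buf, std::size_t len, Handler handler) {
    assert(buf != nullptr);
    pending_.push_back(Request{buf, len, std::move(handler)});
  }

  // Safe to call from any thread: it only touches the atomic flag.
  void Cancel() { cancelled_.store(true); }

  // Stand-in for the event loop: completes queued reads on the owner thread.
  void RunPending() {
    while (!pending_.empty()) {
      Request req = std::move(pending_.front());
      pending_.pop_front();
      if (cancelled_.load()) {
        // Deliver the cancellation error to the read's continuation.
        req.handler(IoStatus::kCancelled, 0);
        continue;
      }
      const std::size_t n = std::min(req.len, data_.size() - position_);
      std::memcpy(req.buf, data_.data() + position_, n);
      position_ += n;  // a streaming read advances the implicit position
      req.handler(IoStatus::kOk, n);
    }
  }

  std::size_t position() const { return position_; }

 private:
  struct Request {
    char* buf;
    std::size_t len;
    Handler handler;
  };

  std::string data_;
  std::size_t position_ = 0;
  std::deque<Request> pending_;
  std::atomic<bool> cancelled_{false};
};
```

Keeping Cancel() down to a single atomic store is one way to make it the only method that needs to be thread-safe; everything else stays single-threaded, as proposed.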
__functional-API__
Low-level implementation of composable asynchronous blocks
NameNodeConnection:
Has all configuration params explicitly passed in/set
Owns TCP connection to NN
Encapsulates method call to Message construction
Refactoring of the current FileSystemImpl object
Thread-safe methods
May be connected or not
DataNodeConnection:
Owns TCP connection to the DN
Owns RpcEngine
Encapsulates method call to Message construction
Encapsulates connecting and handshaking to the DN
Thread-safe methods
May be connected or not
AsyncReadBlockOperation:
Ephemeral object; performs operation once and is done
Takes a DataNodeConnection, block extents as input
Connects DataNodeConnection if necessary and makes RPC calls to read
data
Single-threaded (although will have callbacks from asio and will call
into consumer handler from asio thread) outside of cancel() method
Encapsulation of current InputStreamImpl::AsyncReadBlock method and its
associated state
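To illustrate the "ephemeral object; performs operation once and is done" idea, here is a rough sketch with a fake connection that serves bytes from memory. `FakeDataNodeConnection`, `ReadBlockOperation`, `BlockExtent`, and `ReadStatus` are hypothetical names, and rejecting a second `Run()` is an assumed policy, not something the current InputStreamImpl does.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>

enum class ReadStatus { kOk, kAlreadyRun };  // hypothetical status codes

struct BlockExtent {
  std::size_t offset;  // offset within the block
  std::size_t length;  // number of bytes to read
};

// Fake connection that serves a block from memory instead of a socket.
class FakeDataNodeConnection {
 public:
  explicit FakeDataNodeConnection(std::string block)
      : block_(std::move(block)) {}

  bool connected() const { return connected_; }
  void Connect() { connected_ = true; ++connect_count_; }
  int connect_count() const { return connect_count_; }

  std::string Read(const BlockExtent& extent) const {
    assert(connected_);
    return block_.substr(extent.offset, extent.length);
  }

 private:
  std::string block_;
  bool connected_ = false;
  int connect_count_ = 0;
};

// Ephemeral operation: takes a connection and block extents, runs once.
class ReadBlockOperation {
 public:
  using Handler = std::function<void(ReadStatus, const std::string&)>;

  ReadBlockOperation(std::shared_ptr<FakeDataNodeConnection> conn,
                     BlockExtent extent)
      : conn_(std::move(conn)), extent_(extent) {}

  void Run(const Handler& handler) {
    if (finished_) {  // assumed policy: a second run is an error
      handler(ReadStatus::kAlreadyRun, std::string());
      return;
    }
    finished_ = true;
    if (!conn_->connected()) conn_->Connect();  // connect only if necessary
    handler(ReadStatus::kOk, conn_->Read(extent_));
  }

 private:
  std::shared_ptr<FakeDataNodeConnection> conn_;
  BlockExtent extent_;
  bool finished_ = false;
};
```

Because the operation only borrows the connection, a second operation on the same connection reuses it without reconnecting, which is what would make DNConnection caching in FileHandle possible later.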
PositionalReadOperation:
Ephemeral object; performs operation once and is done
Owns BlockReadOperation
Owns DNConnection
Given block map and snapshot of dead DN list, creates a new DN
connection and kicks off BlockReadOperation
Refactoring of current InputStreamImpl::PositionRead
Single-threaded outside of cancel() method
Cannot do DNConnection caching
Functional convenience object for those not using FileHandle
Some retry logic here?
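The first step of PositionalReadOperation ("given block map and snapshot of dead DN list, creates a new DN connection") amounts to a lookup plus a filter. A sketch under assumed types follows; `BlockLocation`, `FindBlock`, and `PickDataNode` are hypothetical, and DataNodes are again plain strings.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical block-map entry: one block and the DataNodes holding replicas.
struct BlockLocation {
  std::uint64_t offset;  // file offset at which the block starts
  std::uint64_t length;  // block length in bytes
  std::vector<std::string> datanodes;  // replica holders, preferred first
};

// Returns the block containing file_offset, or nullptr if it is past EOF.
const BlockLocation* FindBlock(const std::vector<BlockLocation>& block_map,
                               std::uint64_t file_offset) {
  for (const BlockLocation& block : block_map) {
    if (file_offset >= block.offset &&
        file_offset - block.offset < block.length) {
      return &block;
    }
  }
  return nullptr;
}

// Returns the first replica holder that is not in the dead-DN snapshot,
// or an empty string when every replica is on a dead DataNode.
std::string PickDataNode(const BlockLocation& block,
                         const std::set<std::string>& dead_snapshot) {
  for (const std::string& dn : block.datanodes) {
    if (dead_snapshot.count(dn) == 0) return dn;
  }
  return std::string();
}
```

The empty-string result is where the open question about retry logic would land: the caller has to decide whether to fail the read or refresh the block map and try again.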
> Refactor libhdfs into stateful/ephemeral objects
> ------------------------------------------------
>
> Key: HDFS-9144
> URL: https://issues.apache.org/jira/browse/HDFS-9144
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Affects Versions: HDFS-8707
> Reporter: Bob Hansen
> Assignee: Bob Hansen
>
> In discussion for other efforts, we decided that we should separate several
> concerns:
> * A posix-like FileSystem/FileHandle object (stream-based, positional reads)
> * An ephemeral ReadOperation object that holds the state for
> reads-in-progress, which consumes
> * An immutable FileInfo object which holds the block map and file size (and
> other metadata about the file that we assume will not change over the life of
> the file)