[ 
https://issues.apache.org/jira/browse/HDFS-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948661#comment-14948661
 ] 

Bob Hansen commented on HDFS-9144:
----------------------------------

A proposed structure to start:

__C-API__
libhdfs-compatible layer that is a thin wrapper around the C++-posixish-API

__C++-posixish-API__
Stateful quasi-posix API that will be familiar and easy to consume
Embodies sane default policies and strategies for common operations
Implements all asynchronous operations
Has synchronous helpers for all asynchronous operations
Wrapper around functional-API, below
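
To illustrate the "synchronous helpers for all asynchronous operations" 
idea, here is a minimal sketch of one way such a helper could wrap an 
asynchronous call. The names Status, ReadHandler, AsyncRead and SyncRead are 
hypothetical and not taken from the existing code; the AsyncRead stand-in 
completes inline, whereas a real one would call the handler from an I/O 
thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>

// Hypothetical status type; the real library's error type may differ.
struct Status {
  bool ok;
  std::string message;
};

using ReadHandler = std::function<void(const Status&, std::size_t)>;

// Stand-in for an asynchronous read. It completes inline here; a real
// implementation would invoke the handler later from an I/O thread.
inline void AsyncRead(char* buf, std::size_t len, const ReadHandler& handler) {
  for (std::size_t i = 0; i < len; ++i) buf[i] = 'x';
  handler(Status{true, ""}, len);
}

// Synchronous helper: start the async operation, then block until the
// handler has fired and hand its result back to the caller.
inline Status SyncRead(char* buf, std::size_t len, std::size_t* bytes_read) {
  assert(bytes_read != nullptr);
  std::mutex m;
  std::condition_variable cv;
  bool done = false;
  Status result{false, "not completed"};

  AsyncRead(buf, len, [&](const Status& s, std::size_t n) {
    std::lock_guard<std::mutex> guard(m);
    result = s;
    *bytes_read = n;
    done = true;
    cv.notify_one();  // Signalled under the lock so the waiter cannot race ahead.
  });

  std::unique_lock<std::mutex> lock(m);
  cv.wait(lock, [&] { return done; });
  return result;
}
```

Keeping the asynchronous form as the primitive means each synchronous helper 
is only a few lines of waiting logic layered on top.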

FileSystem:
        constructs with config object
        open() returns a FileHandle
        common NN operations
        holds state for dead DNs
        shared state and thread-safe (implement single lock for FS?)
        owns and is wrapper around NameNodeConnection

FileHandle:
        supports implicit position and streaming reads (posixy)
        stateful and single-threaded with the exception of cancellation method
        thread-safe cancel() method will cancel any outstanding I/Os and 
deliver a cancellation error to its continuation
        implements reliable reads and error recovery
        Maintains a pointer to the posixy-FileSystem for operations on the dead 
DN 
        Owns block map
        Read operation: will pick appropriate DNConnections and 
        Will eventually cache DNConnections
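
A rough sketch of how the FileSystem/FileHandle split described above might 
look. This is illustrative only: the method names (Open, Read, Cancel, 
ReportDeadDataNode, IsDead) are assumptions, the config object, 
NameNodeConnection and block map are omitted, and Read is a stand-in that 
only advances the implicit position.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <set>
#include <string>

class FileSystem;

// Stateful, single-threaded handle; only Cancel() may be called concurrently.
class FileHandle {
 public:
  FileHandle(FileSystem* fs, const std::string& path) : fs_(fs), path_(path) {
    assert(fs != nullptr);
  }

  // Stand-in streaming read: advances the implicit position.
  // Returns the number of bytes "read", or 0 once the handle is cancelled.
  std::size_t Read(std::size_t len) {
    if (cancelled_.load()) return 0;
    position_ += len;
    return len;
  }

  std::size_t Position() const { return position_; }
  const std::string& Path() const { return path_; }

  // Thread-safe: may be called while another thread is inside Read().
  void Cancel() { cancelled_.store(true); }

  // Forwards to the owning FileSystem, which holds the shared dead-DN state.
  void ReportDeadDataNode(const std::string& dn_id);

 private:
  FileSystem* fs_;
  std::string path_;
  std::size_t position_ = 0;
  std::atomic<bool> cancelled_{false};
};

// Shared, thread-safe object guarded by a single lock.
class FileSystem {
 public:
  std::unique_ptr<FileHandle> Open(const std::string& path) {
    return std::unique_ptr<FileHandle>(new FileHandle(this, path));
  }

  void AddDeadDataNode(const std::string& dn_id) {
    std::lock_guard<std::mutex> guard(lock_);
    dead_datanodes_.insert(dn_id);
  }

  bool IsDead(const std::string& dn_id) const {
    std::lock_guard<std::mutex> guard(lock_);
    return dead_datanodes_.count(dn_id) != 0;
  }

 private:
  mutable std::mutex lock_;
  std::set<std::string> dead_datanodes_;
};

inline void FileHandle::ReportDeadDataNode(const std::string& dn_id) {
  fs_->AddDeadDataNode(dn_id);
}
```

The raw back-pointer assumes the FileSystem outlives every FileHandle it 
opens; a shared_ptr would be one alternative if that cannot be guaranteed.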
        
__functional-API__
Low-level implementation of composable asynchronous blocks

NameNodeConnection: 
        Has all configuration params explicitly passed in/set
        Owns TCP connection to NN
        Encapsulates method call to Message construction
        Refactoring of the current FileSystemImpl object
        Thread-safe methods
        May be connected or not
        
DataNodeConnection:
        Owns TCP connection to the DN
        Owns RpcEngine
        Encapsulates method call to Message construction
        Encapsulates connecting and handshaking to the DN
        Thread-safe methods
        May be connected or not
        
AsyncReadBlockOperation: 
        Ephemeral object; performs operation once and is done
        Takes a DataNodeConnection, block extents as input
        Connects DataNodeConnection if necessary and makes RPC calls to read 
data
        Single-threaded (although it will have callbacks from asio and will call 
into the consumer handler from the asio thread) outside of the cancel() method
        Encapsulation of current InputStreamImpl::AsyncReadBlock method and its 
associated state
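
As a sketch of the "ephemeral object" shape described for 
AsyncReadBlockOperation: the code below is a simplified stand-in, not the 
current InputStreamImpl logic. DataNodeConnection here is a stub with a 
boolean flag, BlockExtent and the Handler signature are assumptions, and the 
"read" completes inline instead of going through asio.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>

// Stub connection: only tracks whether it is connected.
struct DataNodeConnection {
  bool connected = false;
  void Connect() { connected = true; }
};

// Hypothetical description of the byte range to read within a block.
struct BlockExtent {
  std::uint64_t offset;
  std::uint64_t length;
};

class AsyncReadBlockOperation {
 public:
  using Handler = std::function<void(bool ok, std::size_t bytes)>;

  AsyncReadBlockOperation(std::shared_ptr<DataNodeConnection> conn,
                          BlockExtent extent)
      : conn_(conn), extent_(extent) {}

  // Thread-safe: the only method that may be called from another thread.
  void Cancel() { cancelled_.store(true); }

  // Performs the operation once; calling Run() twice is a programming error.
  void Run(const Handler& handler) {
    assert(!started_);
    started_ = true;
    if (cancelled_.load()) {
      handler(false, 0);  // Deliver the cancellation to the continuation.
      return;
    }
    if (!conn_->connected) conn_->Connect();  // Connect only if necessary.
    handler(true, static_cast<std::size_t>(extent_.length));
  }

 private:
  std::shared_ptr<DataNodeConnection> conn_;
  BlockExtent extent_;
  bool started_ = false;
  std::atomic<bool> cancelled_{false};
};
```

Because all per-read state lives in the operation object, discarding it 
after the handler fires leaves nothing to clean up on the connection.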

PositionalReadOperation:
        Ephemeral object; performs operation once and is done
        Owns BlockReadOperation
        Owns DNConnection
        Given block map and snapshot of dead DN list, creates a new DN 
connection and kicks off BlockReadOperation
        Refactoring of current InputStreamImpl::PositionRead
        Single-threaded outside of cancel() method
        Cannot do DNConnection caching
        Functional convenience object for those not using FileHandle
        Some retry logic here?
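
One possible shape for the "given block map and snapshot of dead DN list" 
step is sketched below. BlockLocation, FindBlock and PickDataNode are 
hypothetical names, and the selection policy (first replica not in the dead 
snapshot) is an assumption rather than the policy in InputStreamImpl.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical block map entry: a byte range plus the DNs holding replicas.
struct BlockLocation {
  std::uint64_t offset;
  std::uint64_t length;
  std::vector<std::string> datanodes;
};

// Returns the block containing file_offset, or nullptr if it is out of range.
inline const BlockLocation* FindBlock(
    const std::vector<BlockLocation>& block_map, std::uint64_t file_offset) {
  for (const BlockLocation& block : block_map) {
    assert(block.length > 0);
    if (file_offset >= block.offset &&
        file_offset - block.offset < block.length) {
      return &block;
    }
  }
  return nullptr;
}

// Returns the first replica not in the dead-DN snapshot, or an empty string
// if every replica is considered dead.
inline std::string PickDataNode(const BlockLocation& block,
                                const std::set<std::string>& dead_snapshot) {
  for (const std::string& dn : block.datanodes) {
    if (dead_snapshot.count(dn) == 0) return dn;
  }
  return std::string();
}
```

An empty result from PickDataNode is one place where retry logic could hook 
in, for example by refreshing the block map before failing the read.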


> Refactor libhdfs into stateful/ephemeral objects
> ------------------------------------------------
>
>                 Key: HDFS-9144
>                 URL: https://issues.apache.org/jira/browse/HDFS-9144
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: HDFS-8707
>            Reporter: Bob Hansen
>            Assignee: Bob Hansen
>
> In discussion for other efforts, we decided that we should separate several 
> concerns:
> * A posix-like FileSystem/FileHandle object (stream-based, positional reads)
> * An ephemeral ReadOperation object that holds the state for 
> reads-in-progress, which consumes
> * An immutable FileInfo object which holds the block map and file size (and 
> other metadata about the file that we assume will not change over the life of 
> the file)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
