[
https://issues.apache.org/jira/browse/HDFS-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902640#comment-14902640
]
Bob Hansen commented on HDFS-9103:
----------------------------------
I agree that having a modular, thread-safe AsyncPreadSome is a Good Thing, and
the current mix of stateful and stateless code is ripe for error. I had a
discussion with one of our doc compatriots about naming
previously_excluded_datanodes (stateful and temporal) vs. excluded_datanodes
(stateless, ephemeral), and we came to the conclusion that PositionRead and
SyncReadSome should be in different classes.
AsyncReadBlock is not stateless as written - it pulls the ref to the
FileSystemImpl and file metadata from the state of the InputStreamImpl, as well
as storing its working data in the form of constructed continuations. It's a
good design for storing the state for a read effort - it would be cumbersome to
pass _all_ the state on the stack as discrete elements. This is a common
pattern in functional programming - the required state to get a job done
continues to grow and is eventually packed into state objects that get passed
around. It frequently turns into something akin to an inverted state machine,
which is why OO asynchronous systems normally end up with objects and events on
those objects.
The AsyncPreadSome was restored to mostly-stateless in patch 2.
I propose (perhaps in another jira) that we separate FileHandle (stream state
such as position, previously failed DataNodes, etc.), FileInfo (file length,
LocatedBlocks, etc.), and ReadOperation (ephemeral state for an async read such
as Continuations and refs to FileInfo) as a good model.
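A rough sketch of how that split might look. Every type, field, and function
name below is hypothetical and not taken from any of the attached patches; the
replica list is a plain vector of strings standing in for LocatedBlocks, and
the continuations a real ReadOperation would hold are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch only: these are not the types used in the patch.

// Immutable per-file metadata; safe to share across threads once built.
struct FileInfo {
  std::uint64_t length = 0;
  std::vector<std::string> replica_datanodes;  // stand-in for LocatedBlocks
};

// Long-lived stream state, owned by the user of the stateful read API.
struct FileHandle {
  std::shared_ptr<const FileInfo> info;
  std::uint64_t position = 0;
  std::set<std::string> previously_excluded_datanodes;
};

// Ephemeral state for one async read. It copies what it needs at
// construction, so it never reaches back into the FileHandle.
class ReadOperation {
 public:
  ReadOperation(std::shared_ptr<const FileInfo> info, std::uint64_t offset,
                std::set<std::string> excluded_datanodes)
      : info_(std::move(info)),
        offset_(offset),
        excluded_datanodes_(std::move(excluded_datanodes)) {
    assert(info_ != nullptr);
  }

  // Returns the first replica not excluded for this operation,
  // or nullptr when every replica has been excluded.
  const std::string* PickDataNode() const {
    for (const auto& dn : info_->replica_datanodes) {
      if (excluded_datanodes_.count(dn) == 0) return &dn;
    }
    return nullptr;
  }

  std::uint64_t offset() const { return offset_; }

 private:
  std::shared_ptr<const FileInfo> info_;
  std::uint64_t offset_;
  std::set<std::string> excluded_datanodes_;
};

// Stateful entry point: snapshot the handle's state into a fresh operation.
inline ReadOperation StartRead(const FileHandle& handle) {
  return ReadOperation(handle.info, handle.position,
                       handle.previously_excluded_datanodes);
}
```

Because the ReadOperation takes a snapshot, later changes to the FileHandle
(a seek, or another DataNode added to the excluded set) do not affect reads
already in flight; only the next StartRead sees them.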
> Retry reads on DN failure
> -------------------------
>
> Key: HDFS-9103
> URL: https://issues.apache.org/jira/browse/HDFS-9103
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Reporter: Bob Hansen
> Assignee: Bob Hansen
> Fix For: HDFS-8707
>
> Attachments: HDFS-9103.1.patch, HDFS-9103.2.patch,
> HDFS-9103.HDFS-8707.3.patch
>
>
> When AsyncPreadSome fails, add the failed DataNode to the excluded list and
> try again.