[
https://issues.apache.org/jira/browse/HDFS-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518134#comment-14518134
]
Zhe Zhang commented on HDFS-8272:
---------------------------------
Continuing my review:
The {{DFSInputStream}} changes look good to me. Only nit is maybe we should
pass {{DNAddrPair}} to {{getBlockReader}}?
In {{DFSStripedInputStream}}, {{getBlockReaderWithRetry}} is a great way to
improve readability. The main structural change is that we used to retry
outside the for loop; with the patch we retry in each iteration of the for
loop. In other words, we used to retry at most once for the encryption key and
at most once for the token for the entire block group; now we retry at most
once for each internal block. Since all internal blocks share the same
encryption key and token, maybe we should just refetch the key/token once for
the group?
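To illustrate the difference, here is a hypothetical sketch (the class and
method names are illustrative, not the actual HDFS code) comparing the two
retry placements, assuming every read fails with a stale-token error and an
RS(6,3) group of 9 internal blocks:

```java
// Illustrative sketch only; names and structure are assumptions,
// not the real DFSStripedInputStream code.
public class RetryPlacementSketch {
    static final int BLOCKS_PER_GROUP = 9; // e.g. RS(6,3): 6 data + 3 parity

    // Simulates a block reader that always fails with a stale-token error.
    static boolean tryRead(int block) { return false; }

    // Patch behavior: each internal block may trigger its own token refetch.
    static int retryPerBlock() {
        int refetches = 0;
        for (int b = 0; b < BLOCKS_PER_GROUP; b++) {
            if (!tryRead(b)) {
                refetches++;        // refetch token, then retry this block
                tryRead(b);
            }
        }
        return refetches;           // up to one refetch per internal block
    }

    // Suggested behavior: refetch at most once for the whole block group,
    // since all internal blocks share the same encryption key and token.
    static int retryPerGroup() {
        int refetches = 0;
        boolean refetched = false;
        for (int b = 0; b < BLOCKS_PER_GROUP; b++) {
            if (!tryRead(b) && !refetched) {
                refetches++;
                refetched = true;   // shared token: one refetch suffices
                tryRead(b);
            }
        }
        return refetches;           // at most one refetch for the group
    }

    public static void main(String[] args) {
        System.out.println(retryPerBlock());  // 9 refetch attempts
        System.out.println(retryPerGroup());  // 1 refetch attempt
    }
}
```

With per-block retry the worst case is one refetch per internal block (9 here);
retrying at the group level caps it at one, which is what the shared key/token
makes possible.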
One thing I haven't quite figured out is how the encryption key retry logic
works in the original {{DFSInputStream}}. After
{{dfsClient.clearDataEncryptionKey()}}, which code tries to refetch the
encryption key? If that logic is not in {{getBlockReader}}, I guess
{{getBlockReaderWithRetry}} won't be able to refetch the key?
> Erasure Coding: simplify the retry logic in DFSStripedInputStream
> -----------------------------------------------------------------
>
> Key: HDFS-8272
> URL: https://issues.apache.org/jira/browse/HDFS-8272
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Attachments: h8272-HDFS-7285.000.patch
>
>
> Currently in DFSStripedInputStream the retry logic is still the same as in
> DFSInputStream. More specifically, every failed read will try to search for
> another source node, and an exception is thrown when no new source node can
> be identified. This logic is not appropriate for the EC input stream and can
> be simplified.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)