[
https://issues.apache.org/jira/browse/HDFS-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14378484#comment-14378484
]
Zhe Zhang commented on HDFS-7782:
---------------------------------
Thanks for the insights Jing; they are really helpful.
# Hedged vs. parallel reading: yes, the current implementation does parallel
reading rather than hedged reading. Parallel reading is an important mode for the
striping layout. But sometimes the user might want to turn it off because of the
increased usage of sockets and bandwidth (the same concern as with hedged reading). My
initial thought was to leverage the hedged read config path. But you made a
great point that we should support real hedged read for striping layout by
reading from parity blocks. I will move the current
{{hedgedFetchBlockByteRange}} code to a new method, and add new config options
to turn on/off parallel reading.
# bq. We should try to avoid unnecessary data copy in the implementation.
That part of the code was developed for parallel reading. The motivation was to
give each reading thread a contiguous buffer. Then if you read more than one
cell from a DN, you only need one I/O operation to that DN. I will do the
following to address the buffer copy overhead:
#* Update positional non-hedged read to directly use the given byte array
#* In the new parallel read mode, check if the read is for more than 1 full
stripe of cells; if not, directly use the given byte array
# Good ideas about unit tests. I will add those.
# bq. I guess the read failure/timeout will be handled in a separate jira?
Yes that's the plan. The updated hedged read will also be added in that JIRA
since it involves parity data.
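The "more than 1 full stripe" check from the second item could be sketched roughly as below. This is only an illustration under assumptions, not code from the patch: the class {{StripedReadPlan}}, its method names, and the cell size and data block count used in {{main}} are hypothetical, and it assumes cells are placed round-robin across the data blocks. The idea is that a read can go straight into the caller's byte array when no DN serves more than one cell of the requested range, since each DN's bytes then land in one contiguous region of that array.

```java
// Hypothetical sketch; names and parameters are assumptions, not HDFS APIs.
class StripedReadPlan {

    /**
     * Counts how many cells of the logical range [offset, offset + len)
     * fall on each data block, assuming round-robin cell placement.
     */
    static int[] cellsPerBlock(long offset, int len, int cellSize, int numDataBlocks) {
        if (cellSize <= 0 || numDataBlocks <= 0 || offset < 0) {
            throw new IllegalArgumentException("invalid striping parameters");
        }
        int[] counts = new int[numDataBlocks];
        if (len <= 0) {
            return counts; // empty read touches no cells
        }
        long firstCell = offset / cellSize;
        long lastCell = (offset + len - 1) / cellSize;
        for (long cell = firstCell; cell <= lastCell; cell++) {
            counts[(int) (cell % numDataBlocks)]++;
        }
        return counts;
    }

    /**
     * True when every data block serves at most one cell of the range, so
     * each DN's bytes map to a single contiguous region of the caller's
     * array and no intermediate per-DN buffer is needed.
     */
    static boolean canReadDirectly(long offset, int len, int cellSize, int numDataBlocks) {
        for (int count : cellsPerBlock(offset, len, cellSize, numDataBlocks)) {
            if (count > 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int cellSize = 4;      // assumed tiny cell size, for illustration only
        int numDataBlocks = 3; // assumed number of data blocks per stripe
        System.out.println(canReadDirectly(0, 12, cellSize, numDataBlocks));
        System.out.println(canReadDirectly(0, 13, cellSize, numDataBlocks));
    }
}
```

The sketch checks per block rather than comparing the length against the stripe width, because a range that starts mid-cell can touch the same block twice even when it is no longer than one stripe. A plain length comparison may be good enough in practice; that choice is left open here.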
> Read a striping layout file from client side
> --------------------------------------------
>
> Key: HDFS-7782
> URL: https://issues.apache.org/jira/browse/HDFS-7782
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Li Bo
> Assignee: Zhe Zhang
> Attachments: HDFS-7782-000.patch, HDFS-7782-001.patch,
> HDFS-7782-002.patch, HDFS-7782-003.patch
>
>
> A client that wants to read a file should not need to know or handle which
> layout the file uses. This sub-task adds logic to DFSInputStream to support
> reading striping layout files.