[
https://issues.apache.org/jira/browse/HDFS-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832905#comment-13832905
]
Brandon Li commented on HDFS-5563:
----------------------------------
Thanks for the review.
{quote} Instead of using fromRead as parameter, how about using a parameter
in the opposite way like "toCache"? Also please add javadoc for this new
parameter. {quote}
Javadoc added. "fromRead" makes the meaning more obvious. For example, if the
commit is triggered, we should not update the stream access time.
{quote}It's better to use assertEquals(expected, actual) instead of
assertTrue(expected == actual value) in the unit test.{quote}
Done.
{quote} A possible optimization here may be to directly return the local
buffered data for the read request without calling hsync. This may be addressed
in future jiras.
{quote}
The cached data in the pending-write hashmap is currently not searchable. In the
future, we can consider using fixed-size cache blocks so that it's easy to find
the cached data for a read.
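
To make the fixed-size block idea concrete, here is a minimal sketch. It is not
the gateway's code: the class name BlockCacheSketch, the tiny BLOCK_SIZE, and
the per-byte validity flags are all assumptions made for illustration. The
point is only that a block index can be computed from the file offset, which
makes buffered data directly searchable by a read.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the NFS gateway's actual data structure.
class BlockCacheSketch {
    // Deliberately tiny so the example is easy to trace; a real cache
    // would use a much larger block size.
    static final int BLOCK_SIZE = 4;

    private static final class Block {
        final byte[] data = new byte[BLOCK_SIZE];
        final boolean[] valid = new boolean[BLOCK_SIZE]; // which bytes were written
    }

    // Block index (offset / BLOCK_SIZE) -> block contents.
    private final Map<Long, Block> blocks = new HashMap<>();

    // Buffer a write, splitting it across fixed-size blocks by offset.
    void write(long offset, byte[] src) {
        for (int i = 0; i < src.length; i++) {
            long pos = offset + i;
            Block b = blocks.computeIfAbsent(pos / BLOCK_SIZE, k -> new Block());
            int at = (int) (pos % BLOCK_SIZE);
            b.data[at] = src[i];
            b.valid[at] = true;
        }
    }

    // Return the requested range if every byte is cached, otherwise null
    // (the caller would then fall back to syncing and reading from HDFS).
    byte[] read(long offset, int len) {
        byte[] out = new byte[len];
        for (int i = 0; i < len; i++) {
            long pos = offset + i;
            Block b = blocks.get(pos / BLOCK_SIZE);
            int at = (int) (pos % BLOCK_SIZE);
            if (b == null || !b.valid[at]) {
                return null;
            }
            out[i] = b.data[at];
        }
        return out;
    }
}
```

A read that is fully covered by buffered writes is answered from the map; any
gap returns null, so the existing hsync path would still be needed as the
fallback.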
> NFS gateway should commit the buffered data when read request comes after
> write to the same file
> ------------------------------------------------------------------------------------------------
>
> Key: HDFS-5563
> URL: https://issues.apache.org/jira/browse/HDFS-5563
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: nfs
> Reporter: Brandon Li
> Assignee: Brandon Li
> Attachments: HDFS-5563.001.patch
>
>
> HDFS write is asynchronous and data may not be available to read immediately
> after write.
> One of the main reasons is that DFSClient doesn't flush data to DN until its
> local buffer is full.
> To work around this problem, when a read comes after a write to the same file,
> NFS gateway should sync the data so the read request can get the latest
> content. The drawback is that frequent hsync() calls can slow down data
> writes.
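
The behaviour described above can be sketched as follows. This is a
self-contained toy model, not the patch itself: SyncBeforeReadSketch, its two
in-memory buffers, and the syncCount field are hypothetical stand-ins for the
DFSClient buffer, the data visible to readers, and the number of hsync() calls.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical model of "sync buffered data when a read follows a write".
class SyncBeforeReadSketch {
    // Stands in for the client-side buffer that has not reached the DataNode.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    // Stands in for the data that readers can actually see.
    private final ByteArrayOutputStream visible = new ByteArrayOutputStream();
    // Counts how many times a sync was needed (each one costs write throughput).
    int syncCount = 0;

    // Writes are asynchronous: they only land in the local buffer.
    void write(byte[] data) {
        pending.write(data, 0, data.length);
    }

    // Plays the role of hsync(): push everything buffered to the visible store.
    private void sync() {
        byte[] buffered = pending.toByteArray();
        visible.write(buffered, 0, buffered.length);
        pending.reset();
        syncCount++;
    }

    // A read first commits any buffered writes so it sees the latest content.
    byte[] read() {
        if (pending.size() > 0) {
            sync();
        }
        return visible.toByteArray();
    }
}
```

In this model a read pays for a sync only when there is pending data, which is
also where the write slowdown comes from when reads and writes alternate
frequently.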