[ 
https://issues.apache.org/jira/browse/HDFS-11435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15876934#comment-15876934
 ] 

Arpit Agarwal edited comment on HDFS-11435 at 2/21/17 11:03 PM:
----------------------------------------------------------------

bq. Yiqun Lin, Yes, so far the thinking is on the lines of Jing Zhao proposal 
of enhancing the heartbeat protocol to let NameNode know about openforwrite 
file lengths. 
-Hi [~manojg], do you have a pointer to this proposal/discussion?-
Never mind, you were probably referring to this comment.
https://issues.apache.org/jira/browse/HDFS-11402?focusedCommentId=15872739&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15872739


was (Author: arpitagarwal):
bq. Yiqun Lin, Yes, so far the thinking is on the lines of Jing Zhao proposal 
of enhancing the heartbeat protocol to let NameNode know about openforwrite 
file lengths. 
-Hi [~manojg], do you have a pointer to this proposal/discussion?-
Never mind, you were probably referring to this comment.
https://issues.apache.org/jira/secure/EditComment!default.jspa?id=13041860&commentId=15872739

> NameNode should track open for write files lengths more frequent than on 
> newer block allocations
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11435
>                 URL: https://issues.apache.org/jira/browse/HDFS-11435
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>
> *Problem:*
> Currently the length of an open-for-write / under-construction file is 
> updated on the NameNode only in the following cases:
> # Block boundary: On block boundaries and upon allocation of a new Block, 
> the NameNode learns of the file growth and the file length catches up.
> # hsync(SyncFlag.UPDATE_LENGTH): When client apps invoke hsync on the 
> write stream with this special flag, DataNodes send an incremental block 
> report with the latest file length, which the NameNode uses to update its 
> metadata.
> # First hflush() on a new Block: When client apps call hflush() for the 
> first time on each new Block, DataNodes notify the NameNode of the latest 
> file length.
> # Output stream close: Forces DataNodes to update the NameNode about the 
> file length after data persistence and proper acknowledgements in the 
> pipeline.
> So, the lengths the NameNode holds for open-for-write files are usually a 
> lot less than the lengths seen by the DN/client. It is highly preferable 
> that the NameNode not lag in file lengths by the order of a Block size for 
> under-construction files, and that there be a more frequent, scalable 
> update mechanism for these open file lengths. 
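
The four update triggers listed in the description can be illustrated with a small toy model. This is only a sketch of the behaviour as the issue describes it, not HDFS code: the class `OpenFile`, its fields, the method `hsync_update_length`, and the tiny `BLOCK_SIZE` are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass

# Toy block size in bytes (assumption for readability; real HDFS blocks are far larger).
BLOCK_SIZE = 128


@dataclass
class OpenFile:
    """Toy model of one open-for-write file; hypothetical, not an HDFS class."""

    actual_len: int = 0             # length seen by the DataNodes / client
    namenode_len: int = 0           # length currently known to the NameNode
    flushed_in_block: bool = False  # has the current block seen an hflush() yet?

    def write(self, nbytes: int) -> None:
        """Append bytes; the NameNode only catches up when a block boundary is crossed."""
        blocks_before = self.actual_len // BLOCK_SIZE
        self.actual_len += nbytes
        blocks_after = self.actual_len // BLOCK_SIZE
        if blocks_after > blocks_before:
            # Trigger 1: a new block is allocated, so completed blocks become known.
            self.namenode_len = blocks_after * BLOCK_SIZE
            self.flushed_in_block = False

    def hflush(self) -> None:
        """Trigger 3: only the first hflush() on each new block updates the NameNode."""
        if not self.flushed_in_block:
            self.namenode_len = self.actual_len
            self.flushed_in_block = True

    def hsync_update_length(self) -> None:
        """Trigger 2: models hsync(SyncFlag.UPDATE_LENGTH), which always updates."""
        self.namenode_len = self.actual_len

    def close(self) -> None:
        """Trigger 4: closing the stream brings the NameNode fully up to date."""
        self.namenode_len = self.actual_len

    @property
    def lag(self) -> int:
        """Bytes written but not yet reflected in the NameNode's view."""
        return self.actual_len - self.namenode_len


f = OpenFile()
f.write(100)
print("after write:", f.lag)           # nothing has told the NameNode yet
f.hflush()
print("after first hflush:", f.lag)    # first hflush on this block updates it
f.write(20)
f.hflush()
print("after second hflush:", f.lag)   # later hflush calls do not update it
```

In this model the lag between writes can grow up to nearly one block, which is the gap the issue proposes to close with a more frequent update mechanism.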



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
