[ 
https://issues.apache.org/jira/browse/HDFS-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171883#comment-16171883
 ] 

Sean Mackrory commented on HDFS-10702:
--------------------------------------

The assumption of this feature is that the application is responsible for 
knowing when a dataset is stable enough to work on, and that the application 
assumes responsibility for any failures or inaccuracies caused by changes made 
after the minimum transaction ID. That said, I'd be all for testing the 
scenario above to verify exactly how it fails and that it only brings down the 
client, not all of HDFS. But if a file is deleted after the specified 
transaction and the application tries to access it, returning an exception 
would be the correct behavior.

I was actually wondering whether what you meant was that the block locations 
were out of date because the file had been re-replicated in a different 
configuration due to cluster health issues or decommissioning. Cluster state is 
distinct from an application knowing when it's safe to assume that a dataset is 
finalized, so that complicates the assumption somewhat.

But if it's a clearly stated assumption that this feature transfers 
responsibility for knowing that a dataset is complete to the client 
application, and we test that accessing a deleted file fails in a correct 
manner, would that address your concerns, [~mingma]?
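
To make the scenario concrete, here is a rough, self-contained sketch of the 
behavior I have in mind. It is a toy model, not HDFS code and not what the 
patch does: Namespace, staleList, open and runScenario are all hypothetical 
names, and the "standby" is just a copy of the active state taken at the 
minimum transaction ID. The point is only that a file deleted after that 
transaction still shows up in the stale listing, and that opening it fails 
with an exception that stays within the client.

```java
import java.io.FileNotFoundException;
import java.util.Set;
import java.util.TreeSet;

class StaleReadSketch {

    /** Toy namespace; a stand-in for a NameNode's view, not an HDFS class. */
    static class Namespace {
        final Set<String> files = new TreeSet<>();
        long lastTxId = 0;

        void create(String path) { files.add(path); lastTxId++; }
        void delete(String path) { files.remove(path); lastTxId++; }

        /** Copy of the current state, as a standby caught up to lastTxId would see it. */
        Namespace snapshot() {
            Namespace copy = new Namespace();
            copy.files.addAll(files);
            copy.lastTxId = lastTxId;
            return copy;
        }
    }

    /** Stale listing: allowed only if the standby has applied at least minTxId. */
    static Set<String> staleList(Namespace standby, long minTxId) {
        if (standby.lastTxId < minTxId) {
            throw new IllegalStateException("standby is behind txid " + minTxId);
        }
        return new TreeSet<>(standby.files);
    }

    /** Opening a file checks the current state, so a deleted file fails here. */
    static String open(Namespace active, String path) throws FileNotFoundException {
        if (!active.files.contains(path)) {
            throw new FileNotFoundException(path);
        }
        return "data:" + path;
    }

    /** Runs the scenario and reports what the client observed. */
    static String runScenario() {
        Namespace active = new Namespace();
        active.create("/data/part-0");   // txid 1
        active.create("/data/part-1");   // txid 2
        long minTxId = active.lastTxId;  // application considers the dataset stable here

        Namespace standby = active.snapshot();
        active.delete("/data/part-1");   // txid 3, not yet seen by the standby

        StringBuilder seen = new StringBuilder();
        for (String path : staleList(standby, minTxId)) {
            try {
                seen.append(open(active, path)).append(';');
            } catch (FileNotFoundException e) {
                // Only this client call fails; the toy namespaces are untouched.
                seen.append("missing:").append(path).append(';');
            }
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(runScenario());
    }
}
```

A real test would of course run against a mini cluster rather than a model 
like this, but the assertion would have the same shape: the stale listing 
succeeds, the access to the deleted file raises an exception, and nothing 
outside the client is affected.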

> Add a Client API and Proxy Provider to enable stale read from Standby
> ---------------------------------------------------------------------
>
>                 Key: HDFS-10702
>                 URL: https://issues.apache.org/jira/browse/HDFS-10702
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jiayi Zhou
>            Assignee: Sean Mackrory
>            Priority: Minor
>         Attachments: HDFS-10702.001.patch, HDFS-10702.002.patch, 
> HDFS-10702.003.patch, HDFS-10702.004.patch, HDFS-10702.005.patch, 
> HDFS-10702.006.patch, HDFS-10702.007.patch, HDFS-10702.008.patch, 
> StaleReadfromStandbyNN.pdf
>
>
> Currently, clients must always talk to the active NameNode when performing 
> any metadata operation, which means the active NameNode could be a bottleneck 
> for scalability. One way to solve this problem is to send read-only operations 
> to the Standby NameNode. The disadvantage is that the read might be stale. 
> Here, I'm thinking of adding a Client API to enable/disable stale reads from 
> the Standby, which gives the Client the power to set the staleness restriction.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
