[ 
https://issues.apache.org/jira/browse/HDFS-17768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-17768.
------------------------------------
    Fix Version/s: 3.5.0
       Resolution: Fixed

> Observer namenode network delay causing empty block location for 
> getBatchedListing
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-17768
>                 URL: https://issues.apache.org/jira/browse/HDFS-17768
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.4.1
>            Reporter: Dimas Shidqi Parikesit
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.5.0
>
>
> While testing the latest HDFS version (e8a64d0), we found a case similar to 
> HDFS-16732 in getBatchedListing. During a getBatchedListing call, if the 
> block report of the observer namenode is delayed, one or more of the listing 
> results can contain blocks without locations.
> Steps to reproduce this bug:
>  # Start a cluster with 1 observer namenode
>  # Create an empty file
>  # Inject network delay between the observer nn and the active nn to delay the 
> block report (or add a sleep to the BlockReportProcessingThread of the observer).
>  # Append to the file to add a block
>  # Send a batchedListPaths request using the client API
>  # Check that the result contains a block without a location
> In HDFS-16732 and HDFS-13924, a check was added to getBlockLocations, 
> getFileInfo, and getListing that verifies whether the returned blocks have 
> valid locations. Missing locations indicate that the observer namenode is not 
> up to date with the active namenode.
> We propose adding the same check to getBatchedListing. If any of the 
> sub-listings returns blocks without locations, the call throws 
> ObserverRetryOnActiveException and exits early. The entire 
> batched listing request is then retried on the active namenode.
> Your insights are very much appreciated. We will continue to follow up on this 
> issue until it is resolved.
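
The proposed check can be illustrated with a minimal, self-contained Java sketch. It is not the actual patch and does not use Hadoop classes: Block, FileEntry, RetryOnActiveException, listOnObserver, and listWithFallback are hypothetical stand-ins invented for this example, and RetryOnActiveException only plays the role that ObserverRetryOnActiveException plays in the description above. The sketch assumes a block is "without location" when its known location count is zero.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: every type below is a simplified, hypothetical stand-in,
// not a Hadoop class.
final class BatchedListingCheckSketch {

    /** Local stand-in for the role of ObserverRetryOnActiveException. */
    static final class RetryOnActiveException extends Exception {
        RetryOnActiveException(String message) { super(message); }
    }

    /** A block reduced to the number of locations the namenode knows. */
    static final class Block {
        final int locationCount;
        Block(int locationCount) { this.locationCount = locationCount; }
    }

    /** A listed file reduced to its path and its blocks. */
    static final class FileEntry {
        final String path;
        final List<Block> blocks;
        FileEntry(String path, List<Block> blocks) {
            this.path = path;
            this.blocks = blocks;
        }
    }

    /** Returns true if any block in the sub-listing has no known location. */
    static boolean hasMissingLocations(List<FileEntry> subListing) {
        for (FileEntry file : subListing) {
            for (Block block : file.blocks) {
                if (block.locationCount == 0) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Observer side: fail fast on the first sub-listing that looks stale. */
    static List<List<FileEntry>> listOnObserver(List<List<FileEntry>> subListings)
            throws RetryOnActiveException {
        for (int i = 0; i < subListings.size(); i++) {
            if (hasMissingLocations(subListings.get(i))) {
                throw new RetryOnActiveException(
                        "sub-listing " + i + " has a block without location");
            }
        }
        return subListings;
    }

    /** Client side: retry the whole batched request against the active view. */
    static List<List<FileEntry>> listWithFallback(
            List<List<FileEntry>> observerView, List<List<FileEntry>> activeView) {
        try {
            return listOnObserver(observerView);
        } catch (RetryOnActiveException e) {
            return activeView; // the entire request is served by the active
        }
    }

    public static void main(String[] args) {
        FileEntry stale = new FileEntry("/dir/f", Arrays.asList(new Block(0)));
        FileEntry fresh = new FileEntry("/dir/f", Arrays.asList(new Block(3)));
        List<List<FileEntry>> observerView = Arrays.asList(Arrays.asList(stale));
        List<List<FileEntry>> activeView = Arrays.asList(Arrays.asList(fresh));
        List<List<FileEntry>> result = listWithFallback(observerView, activeView);
        System.out.println("served by active: " + (result == activeView));
    }
}
```

In this sketch a single stale sub-listing discards the whole observer result, matching the "exit early, retry everything on active" behaviour described in the proposal. A file with no blocks at all, such as the empty file from the reproduction steps before the append, does not trigger the retry.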



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
