[ https://issues.apache.org/jira/browse/HADOOP-19543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anuj Modi resolved HADOOP-19543.
--------------------------------
    Fix Version/s: 3.5.0
                   3.4.2
       Resolution: Fixed

> ABFS: [FnsOverBlob] Remove Duplicates from Blob Endpoint Listing Across Iterations
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-19543
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19543
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.5.0, 3.4.1
>            Reporter: Anuj Modi
>            Assignee: Anuj Modi
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 3.5.0, 3.4.2
>
>
> On FNS-Blob, the List Blobs API is known to return duplicate entries for
> non-empty explicit directories. One entry corresponds to the directory itself
> and the other to the marker blob that the driver internally creates and
> maintains to mark that path as a directory. This behaviour was already known,
> and it was handled by removing such duplicate entries from the set of entries
> returned within the current list iteration.
> Because of a possible partition split, however, the two duplicate entries can
> be returned in separate iterations. That case is not handled, so the caller
> may get back a result containing duplicate entries, as happened here. The
> de-duplication logic was designed before partition splits were recognised as
> a possibility.
> This PR fixes this bug.
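
The failure mode described above is that de-duplication state is scoped to a single list iteration, so a directory entry and its marker blob that land on different pages both reach the caller. A minimal Java sketch of carrying that state across iterations follows. It is an illustration only, not the code from the actual patch: the class name CrossIterationDeduplicator and the method filterPage are hypothetical, entries are reduced to plain path strings, and the sketch assumes it is acceptable to remember every path seen so far.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not an actual ABFS driver class. */
final class CrossIterationDeduplicator {
  // Paths already handed to the caller in this or any earlier iteration.
  // Keeping this set alive between pages is what covers the case where
  // the two duplicate entries are split across iterations.
  private final Set<String> seenPaths = new HashSet<>();

  /**
   * Filters one page of listing results.
   *
   * @param pagePaths paths returned by a single list iteration
   * @return the paths not seen before, in their original order
   */
  List<String> filterPage(List<String> pagePaths) {
    List<String> unique = new ArrayList<>();
    for (String path : pagePaths) {
      // Set.add returns false when the path is already present.
      if (seenPaths.add(path)) {
        unique.add(path);
      }
    }
    return unique;
  }
}
```

One instance would be created per listing operation and filterPage called once per iteration. If the first page ends with "/dir" and the second page begins with "/dir" again, the second occurrence is dropped. A real implementation would likely bound the retained state rather than keep every path; that choice is outside what the issue describes.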