[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034879#comment-14034879 ]
Lei (Eddy) Xu commented on HDFS-5546:
-------------------------------------

[~daryn] Thank you very much for your detailed comments, and thanks to [~cmccabe] for the explanation!

As this patch deals with a _very rare_ case and only affects the CLI command {{ls}}, I would rather accept a slightly heavier load on the NN in such a rare case than crash a user-facing CLI program. Moreover, the overhead of looking up nonexistent files/dirs on the NN in this rare case is no more than that of a normal {{ls}} on this namespace when no sub-directories have been deleted.

If there were a data point showing how rarely such a race condition occurs, it would help us justify this design, though. Maybe we should print more verbose messages and tell users to re-run `ls -R`?

> race condition crashes "hadoop ls -R" when directories are moved/removed
> ------------------------------------------------------------------------
>
>                 Key: HDFS-5546
>                 URL: https://issues.apache.org/jira/browse/HDFS-5546
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Lei (Eddy) Xu
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: HDFS-5546.1.patch, HDFS-5546.2.000.patch,
> HDFS-5546.2.001.patch, HDFS-5546.2.002.patch
>
>
> This seems to be a rare race condition where we have a sequence of events
> like this:
> 1. org.apache.hadoop.shell.Ls calls DFS#getFileStatus on directory D.
> 2. someone deletes or moves directory D
> 3. org.apache.hadoop.shell.Ls calls PathData#getDirectoryContents(D), which
> calls DFS#listStatus(D). This throws FileNotFoundException.
> 4. ls command terminates with FNF
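
The sequence above, and the tolerant behaviour discussed in the comment, can be illustrated with a minimal sketch. It is only an analogy and not the code from the attached patches: it uses the JDK's java.nio.file API on a local file system instead of Hadoop's FileSystem/PathData classes, and the names TolerantLs, listRecursive, seen and vanished are hypothetical. The idea is that a directory which disappears between being discovered and being listed is recorded and skipped, so the recursive walk continues instead of aborting.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of Hadoop.
final class TolerantLs {

    /**
     * Recursively lists everything under dir. A directory that no longer
     * exists when we try to open it is added to vanished and skipped,
     * rather than terminating the whole listing.
     */
    static void listRecursive(Path dir, List<Path> seen, List<Path> vanished)
            throws IOException {
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            for (Path child : children) {
                seen.add(child);
                // Analogous to step 1: check the entry's type.
                if (Files.isDirectory(child)) {
                    // Analogous to step 3: the child may be gone by now.
                    listRecursive(child, seen, vanished);
                }
            }
        } catch (NoSuchFileException e) {
            // Analogous to the FileNotFoundException in step 3:
            // note the missing directory and keep going.
            vanished.add(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> seen = new ArrayList<>();
        List<Path> vanished = new ArrayList<>();
        listRecursive(root, seen, vanished);
        seen.forEach(System.out::println);
        for (Path p : vanished) {
            System.err.println("ls: " + p + " disappeared during listing; consider re-running");
        }
    }
}
```

Reporting the skipped directories on stderr corresponds to the suggestion of printing a more verbose message and asking the user to re-run the command.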