[ 
https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043956#comment-14043956
 ] 

Colin Patrick McCabe commented on HDFS-5546:
--------------------------------------------

I agree with a lot of what's been presented, but I also think our 
behavior should be consistent between "{{ls /a1/b /a2/b}}" and "{{ls 
/a\{1,2\}/b}}", and right now I can't see a good way to achieve that if we 
catch IOE (since the globber does not catch IOE).  On the other hand, if we 
catch FNF and continue when a top-level directory disappears on us, then we are 
making things more consistent, since the globber catches and ignores FNFs (when 
dealing with globs).

bq. Colin Patrick McCabe shouldn't the globStatus() be out of scope for this 
JIRA? Maybe we should open another related JIRA?

I'm not sure how the globber would report IOE other than throwing it.  We'd 
have to return a list of {{Option<FileStatus, IOException>}} or something?  It 
doesn't seem like the kind of change that could be made compatibly, since we'd 
need a new interface.
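
To make the "value or exception" idea concrete, here is a rough sketch of what 
such a per-entry result type might look like.  {{GlobResult}} and its methods 
are hypothetical names invented for illustration; nothing like this exists in 
Hadoop today, and a real version would wrap {{FileStatus}} rather than an 
arbitrary {{T}}.

```java
import java.io.IOException;

// Hypothetical type, not part of Hadoop: holds either a value or the
// IOException that was raised while trying to compute it.
final class GlobResult<T> {
    private final T value;
    private final IOException error;

    private GlobResult(T value, IOException error) {
        this.value = value;
        this.error = error;
    }

    // Wraps a successfully computed value.
    static <T> GlobResult<T> ok(T value) {
        return new GlobResult<>(value, null);
    }

    // Wraps the exception for an entry that could not be resolved.
    static <T> GlobResult<T> failed(IOException error) {
        return new GlobResult<>(null, error);
    }

    boolean isOk() {
        return error == null;
    }

    // Returns the value, or rethrows the stored exception.
    T get() throws IOException {
        if (error != null) {
            throw error;
        }
        return value;
    }

    // Returns the stored exception, or null for a successful result.
    IOException error() {
        return error;
    }
}
```

A caller would get back a list of these and decide per entry whether to 
report the error or use the value, which is why it would need a new method 
rather than a change to the existing return type.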

So overall I would lean towards just catching FNF at the top level, like the 
earlier patch did.  And maybe revisiting this later if we have better ideas 
about how to handle this in the globber as well.  [~daryn], [~eddyxu], does 
that make sense?  Or am I trying too hard to be consistent? :)
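
For reference, the "catch FNF at the top level and continue" behavior I have 
in mind is roughly the following.  This is only a sketch under assumptions: 
{{DirLister}} and {{TolerantLs}} are hypothetical stand-ins (the former for 
the {{DFS#listStatus}} call), not the code in the patch, and plain strings 
stand in for paths.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the directory listing call.
interface DirLister {
    List<String> list(String dir) throws IOException;
}

final class TolerantLs {
    // Lists each top-level argument in turn. A directory that has vanished
    // (FileNotFoundException) is recorded in 'missing' and skipped; any
    // other IOException still propagates to the caller.
    static List<String> listAll(List<String> args, DirLister lister,
                                List<String> missing) throws IOException {
        List<String> out = new ArrayList<>();
        for (String dir : args) {
            try {
                out.addAll(lister.list(dir));
            } catch (FileNotFoundException e) {
                missing.add(dir);
            }
        }
        return out;
    }
}
```

With this shape, "{{ls /a1/b /a2/b}}" still lists {{/a2/b}} if {{/a1/b}} is 
removed between the status check and the listing, while other IOEs are 
reported exactly as before.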

> race condition crashes "hadoop ls -R" when directories are moved/removed
> ------------------------------------------------------------------------
>
>                 Key: HDFS-5546
>                 URL: https://issues.apache.org/jira/browse/HDFS-5546
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Lei (Eddy) Xu
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: HDFS-5546.1.patch, HDFS-5546.2.000.patch, 
> HDFS-5546.2.001.patch, HDFS-5546.2.002.patch, HDFS-5546.2.003.patch, 
> HDFS-5546.2.004.patch
>
>
> This seems to be a rare race condition where we have a sequence of events 
> like this:
> 1. org.apache.hadoop.shell.Ls calls DFS#getFileStatus on directory D.
> 2. someone deletes or moves directory D
> 3. org.apache.hadoop.shell.Ls calls PathData#getDirectoryContents(D), which 
> calls DFS#listStatus(D). This throws FileNotFoundException.
> 4. ls command terminates with FNF



--
This message was sent by Atlassian JIRA
(v6.2#6252)
