[ 
https://issues.apache.org/jira/browse/BEAM-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15765711#comment-15765711
 ] 

Paul Findlay commented on BEAM-1190:
------------------------------------

[[email protected]] Correct me if I'm wrong, but isn't 
FileBasedSource.createReader basically already doing a stat for each file in 
the expanded list, swallowing any error, and leaving it for 
startImpl to blow up? We are just asking for the method to be non-final so we 
can handle the different subclasses of IOException appropriately (for our 
pipeline).

But I would love to know if there is any scary behaviour we haven't considered.
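
To make "treat the different sub-classes of IOException appropriately" 
concrete, here is a rough sketch using only java.nio.file. It is not Beam 
code: the class LenientOpen and the method openIfPresent are hypothetical 
names, and it assumes the files live on the default local filesystem. A 
missing file is skipped, while any other IOException still propagates.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Optional;

// Hypothetical helper, not part of Beam: illustrates per-subclass handling.
public class LenientOpen {

    // Returns an open channel, or Optional.empty() if the file has vanished.
    // Other failures (permissions, I/O errors) are deliberately rethrown.
    static Optional<ReadableByteChannel> openIfPresent(Path file) throws IOException {
        try {
            return Optional.of(FileChannel.open(file, StandardOpenOption.READ));
        } catch (NoSuchFileException e) {
            // The listing named a file that does not exist: skip it.
            return Optional.empty();
        }
    }
}
```

Whether silently skipping is right depends on the pipeline; some pipelines 
would rather fail loudly than read a partial input set.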

> FileBasedSource should ignore files that matched the glob but don't exist
> -------------------------------------------------------------------------
>
>                 Key: BEAM-1190
>                 URL: https://issues.apache.org/jira/browse/BEAM-1190
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Eugene Kirpichov
>            Assignee: Eugene Kirpichov
>
> See user issue:
> http://stackoverflow.com/questions/41251741/coping-with-eventual-consistency-of-gcs-bucket-listing
> We should, after globbing the files in FileBasedSource, individually stat 
> every file and remove those that don't exist, to account for the possibility 
> that glob yielded non-existing files due to eventual consistency.
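
The "stat every file and remove those that don't exist" step from the 
description could look roughly like the sketch below. It uses plain 
java.nio.file rather than Beam's own file utilities; GlobFilter and 
dropMissing are hypothetical names, and the glob result is assumed to arrive 
as a List<Path>.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Beam: filters a glob expansion by stat.
public class GlobFilter {

    // Keeps only the paths whose attributes can actually be read.
    static List<Path> dropMissing(List<Path> globbed) throws IOException {
        List<Path> existing = new ArrayList<>();
        for (Path p : globbed) {
            try {
                // readAttributes acts as the "stat"; it throws if p is gone.
                Files.readAttributes(p, BasicFileAttributes.class);
                existing.add(p);
            } catch (NoSuchFileException e) {
                // Listed by the glob but not present: drop it.
            }
        }
        return existing;
    }
}
```

Files.exists is avoided here on purpose: it returns false for any 
IOException, which would hide errors other than a missing file.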



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
