[
https://issues.apache.org/jira/browse/BEAM-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15765711#comment-15765711
]
Paul Findlay commented on BEAM-1190:
------------------------------------
[[email protected]] Correct me if I'm wrong, but isn't
FileBasedSource.createReader basically already doing a stat for each file in
the expanded list, swallowing the error if there is one, and leaving it for
startImpl to blow up? We are just asking for the method to be non-final so we
can treat the different subclasses of IOException appropriately (for our
pipeline).
But we would love to know if there is scary behaviour we haven't considered.
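
To illustrate what "treating the different subclasses of IOException
appropriately" could look like, here is a minimal sketch in plain java.nio. It
is not Beam's FileBasedSource API; the class TolerantOpen and the method
openIfPresent are hypothetical names. The assumption is that a missing file
should be skipped while every other I/O failure should still surface.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper, not part of Beam: shows one way to separate
// "file is gone" from other I/O failures when opening a file.
final class TolerantOpen {
    private TolerantOpen() {}

    /**
     * Opens a reader for the given file. Returns Optional.empty() if the file
     * does not exist (for example, a stale listing entry). Any other
     * IOException propagates to the caller unchanged.
     */
    static Optional<BufferedReader> openIfPresent(Path file) throws IOException {
        try {
            return Optional.of(Files.newBufferedReader(file, StandardCharsets.UTF_8));
        } catch (NoSuchFileException e) {
            // Assumption for this sketch: a vanished file is safe to skip.
            return Optional.empty();
        }
    }
}
```

A caller would skip the file when the result is empty and read from the
reader otherwise. Whether skipping is actually safe depends on the pipeline.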
> FileBasedSource should ignore files that matched the glob but don't exist
> -------------------------------------------------------------------------
>
> Key: BEAM-1190
> URL: https://issues.apache.org/jira/browse/BEAM-1190
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Reporter: Eugene Kirpichov
> Assignee: Eugene Kirpichov
>
> See user issue:
> http://stackoverflow.com/questions/41251741/coping-with-eventual-consistency-of-gcs-bucket-listing
> We should, after globbing the files in FileBasedSource, individually stat
> every file and remove those that don't exist, to account for the possibility
> that glob yielded non-existing files due to eventual consistency.
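
A rough sketch of the "stat every file and remove those that don't exist" step
described above, in plain java.nio rather than Beam's own file abstractions.
The names ExistingFileFilter and keepExisting are hypothetical, and the input
list stands in for whatever the glob expansion returned.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Beam: filters a glob expansion down to
// the entries that exist at the time of the check.
final class ExistingFileFilter {
    private ExistingFileFilter() {}

    /** Returns the paths from the expanded list that currently exist, in order. */
    static List<Path> keepExisting(List<Path> expanded) {
        return expanded.stream()
                .filter(p -> Files.exists(p)) // one existence check per path
                .collect(Collectors.toList());
    }
}
```

Note that the check is only a snapshot: a file can still disappear between
this check and the moment it is opened, so the read path would still need to
cope with a missing file.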