[ 
https://issues.apache.org/jira/browse/BEAM-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15765711#comment-15765711
 ] 

Paul Findlay edited comment on BEAM-1190 at 12/21/16 12:52 AM:
---------------------------------------------------------------

[~dhalp...@google.com] Correct me if I'm wrong, but isn't 
FileBasedSource.createReader already doing a stat for each file in 
the expanded list, swallowing the error if there is one, and leaving it to 
FileBasedReader.startImpl to blow up? We are only asking for the method to not 
be final, so that we can handle the different subclasses of IOException 
appropriately (for our pipeline).

But I would love to know if there is scary behaviour we haven't considered.
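
To make "treat the different sub-classes of IOException appropriately" concrete, here is a minimal sketch using plain java.nio rather than Beam's own file abstractions. The class StatOutcome, its Outcome enum and the stat method are hypothetical names for illustration only; they are not part of FileBasedSource, and what a pipeline should do with each outcome is a policy choice.

```java
import java.io.IOException;
import java.nio.file.AccessDeniedException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper: not Beam code, just an illustration of the idea.
final class StatOutcome {
    enum Outcome { OK, MISSING, DENIED, OTHER_ERROR }

    // Stat a single file and report which kind of failure occurred,
    // instead of swallowing every IOException the same way.
    static Outcome stat(Path file) {
        try {
            Files.readAttributes(file, BasicFileAttributes.class);
            return Outcome.OK;
        } catch (NoSuchFileException e) {
            // Listed by the glob but gone (or not yet visible): a pipeline might skip it.
            return Outcome.MISSING;
        } catch (AccessDeniedException e) {
            // Probably a configuration problem: a pipeline might fail fast.
            return Outcome.DENIED;
        } catch (IOException e) {
            // Anything else: possibly transient, a pipeline might retry.
            return Outcome.OTHER_ERROR;
        }
    }
}
```

The more specific catch clauses have to come before the general IOException one, since both NoSuchFileException and AccessDeniedException are subclasses of it.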


was (Author: p...@findlay.net.nz):
[~dhalp...@google.com] Correct me if I'm wrong.. but isn't 
FileBasedSource.createReader basically already doing a stat for each file in 
the expanded list but swallowing the error if there is one, and leaving it for 
startImpl to blow up? We are just asking for the method to not be final so we 
can treat the different sub-classes of IOException appropriately (for our 
pipeline).

But would love to know if there is scary behaviour we haven't considered.

> FileBasedSource should ignore files that matched the glob but don't exist
> -------------------------------------------------------------------------
>
>                 Key: BEAM-1190
>                 URL: https://issues.apache.org/jira/browse/BEAM-1190
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Eugene Kirpichov
>            Assignee: Eugene Kirpichov
>
> See user issue:
> http://stackoverflow.com/questions/41251741/coping-with-eventual-consistency-of-gcs-bucket-listing
> We should, after globbing the files in FileBasedSource, individually stat 
> every file and remove those that don't exist, to account for the possibility 
> that glob yielded non-existing files due to eventual consistency.
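
The proposal in the issue description (stat every globbed file and drop the ones that don't exist) could be sketched roughly as below. This assumes a local file system reached through java.nio; ExistingFilesFilter and keepExisting are hypothetical names, and the real change would go through whatever file abstraction FileBasedSource uses.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not Beam code, just an illustration of the proposal.
final class ExistingFilesFilter {
    // Keep only the globbed paths that exist at the time of the check.
    static List<Path> keepExisting(List<Path> globbed) {
        List<Path> existing = new ArrayList<>();
        for (Path p : globbed) {
            // Files.exists also returns false when existence cannot be
            // determined, so I/O errors are silently treated as "missing".
            if (Files.exists(p)) {
                existing.add(p);
            }
        }
        return existing;
    }
}
```

Note that a file can still disappear between this check and the actual read, so the reader has to cope with a missing file either way.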



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
