[ https://issues.apache.org/jira/browse/BEAM-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Pilloud resolved BEAM-4772.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 2.7.0

> TextIO.read transform does not respect .withEmptyMatchTreatment
> ---------------------------------------------------------------
>
>                 Key: BEAM-4772
>                 URL: https://issues.apache.org/jira/browse/BEAM-4772
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>    Affects Versions: 2.5.0
>            Reporter: Samuel Waggoner
>            Assignee: Kyle Winkelman
>            Priority: Major
>             Fix For: 2.7.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I modified the MinimalWordCount example to reproduce. I expect the read
> transform to read 0 lines rather than give an exception, since I used
> EmptyMatchTreatment.ALLOW. I see the same behavior with ALLOW_IF_WILDCARD.
> The EmptyMatchTreatment value seems to be ignored.
> {code:java}
> public class MinimalWordCount {
>   public static void main(String[] args) {
>     PipelineOptions options = PipelineOptionsFactory.create();
>     Pipeline p = Pipeline.create(options);
>     p.apply(TextIO.read()
>         .from("gs://apache-beam-samples/doesnotexist/*")
>         .withEmptyMatchTreatment(EmptyMatchTreatment.ALLOW))
>      .apply(TextIO.write().to("wordcounts"));
>     p.run().waitUntilFinish();
>   }
> }
> {code}
> {code:java}
> Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.io.FileNotFoundException: No files matched spec: gs://apache-beam-samples/doesnotexist/*
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
>   at org.apache.beam.examples.MinimalWordCount.main(MinimalWordCount.java:124)
> Caused by: java.io.FileNotFoundException: No files matched spec: gs://apache-beam-samples/doesnotexist/*
>   at org.apache.beam.sdk.io.FileSystems.maybeAdjustEmptyMatchResult(FileSystems.java:172)
>   at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:158)
>   at org.apache.beam.sdk.io.FileBasedSource.getEstimatedSizeBytes(FileBasedSource.java:222)
>   at org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$InputProvider.getInitialInputs(BoundedReadEvaluatorFactory.java:212)
>   at org.apache.beam.runners.direct.ReadEvaluatorFactory$InputProvider.getInitialInputs(ReadEvaluatorFactory.java:91)
>   at org.apache.beam.runners.direct.RootProviderRegistry.getInitialInputs(RootProviderRegistry.java:81){code}
> We see this behavior both when using DirectRunner and DataflowRunner



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
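
The behavior the reporter expected can be modeled in a few lines of plain Java. This is a stand-alone sketch, not Beam's implementation: the class EmptyMatchSketch, the method adjustEmptyMatch, and the simplified wildcard check are hypothetical names introduced only for illustration. The treatment names ALLOW and ALLOW_IF_WILDCARD and the exception message come from the report itself; DISALLOW is the third value of Beam's EmptyMatchTreatment enum.

```java
import java.io.FileNotFoundException;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-alone model; none of these types are Beam classes.
final class EmptyMatchSketch {

    // Same value names as Beam's EmptyMatchTreatment enum.
    enum Treatment { ALLOW, DISALLOW, ALLOW_IF_WILDCARD }

    // Assumption: a spec "has a wildcard" if it contains '*', '?' or '['.
    // The real glob detection in Beam may differ.
    static boolean hasWildcard(String spec) {
        return spec.contains("*") || spec.contains("?") || spec.contains("[");
    }

    // Returns the matched files unchanged when there are any. For an empty
    // match, either returns an empty list or throws, depending on the treatment.
    static List<String> adjustEmptyMatch(String spec, List<String> matched, Treatment treatment)
            throws FileNotFoundException {
        if (!matched.isEmpty()) {
            return matched;
        }
        boolean allowed = treatment == Treatment.ALLOW
                || (treatment == Treatment.ALLOW_IF_WILDCARD && hasWildcard(spec));
        if (!allowed) {
            // Message text mirrors the one in the stack trace of the report.
            throw new FileNotFoundException("No files matched spec: " + spec);
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) throws FileNotFoundException {
        String spec = "gs://apache-beam-samples/doesnotexist/*";
        List<String> noFiles = Collections.emptyList();

        // Expected by the reporter: both calls yield an empty result, no exception.
        System.out.println(adjustEmptyMatch(spec, noFiles, Treatment.ALLOW).size());
        System.out.println(adjustEmptyMatch(spec, noFiles, Treatment.ALLOW_IF_WILDCARD).size());

        // Observed in 2.5.0 for every treatment: the exception path below.
        try {
            adjustEmptyMatch(spec, noFiles, Treatment.DISALLOW);
        } catch (FileNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the bug amounts to the read path always behaving as if DISALLOW had been passed, whatever value was given to .withEmptyMatchTreatment.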