Amit Sela created BEAM-434:
------------------------------

             Summary: When examples write output to file it creates many output 
files instead of one
                 Key: BEAM-434
                 URL: https://issues.apache.org/jira/browse/BEAM-434
             Project: Beam
          Issue Type: Bug
          Components: examples-java
            Reporter: Amit Sela
            Assignee: Amit Sela
            Priority: Minor
             Fix For: 0.2.0-incubating


When using `TextIO.Write.to("/path/to/output")` without any restriction on the 
number of shards, the write may generate many output files, depending on your 
input. For WordCount, for example, you'll get as many output files as there are 
unique words in your input.

Since I think examples are expected to run in a friendly manner that lets you 
"see" what they do, rather than to optimize for performance, I suggest using 
`withoutSharding()` when writing the example output to a file.

Examples I could find that behave this way:
org.apache.beam.examples.WordCount
org.apache.beam.examples.complete.TfIdf
org.apache.beam.examples.cookbook.DeDupExample
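
Until the examples are changed, sharded output can still be inspected as a 
single file by concatenating the shards. The following is only a minimal 
sketch using the JDK, not Beam code: it assumes the shards sit in one local 
directory and follow the default `-SSSSS-of-NNNNN` suffix (for example 
`output-00000-of-00003`), and that the prefix contains no glob 
metacharacters. `ShardMerger` and `mergeShards` are hypothetical names.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of Beam.
final class ShardMerger {
    private ShardMerger() {}

    /**
     * Concatenates files named "<prefix>-SSSSS-of-NNNNN" found in dir into
     * target, in shard-index order, and returns the number of shards merged.
     */
    static int mergeShards(Path dir, String prefix, Path target) throws IOException {
        List<Path> shards = new ArrayList<>();
        // Assumption: five-digit shard index and shard count, as in the default suffix.
        String glob = prefix + "-?????-of-?????";
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path shard : stream) {
                shards.add(shard);
            }
        }
        // Zero-padded indices sort correctly in plain path order.
        Collections.sort(shards);
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path shard : shards) {
                Files.copy(shard, out); // append the shard's bytes to the merged file
            }
        }
        return shards.size();
    }
}
```

For instance, `ShardMerger.mergeShards(Paths.get("/path/to"), "output", 
Paths.get("/path/to/output.txt"))` would gather every `output-*-of-*` shard in 
that directory into one file. This is a workaround for reading results, not a 
replacement for the `withoutSharding()` change suggested above.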



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
