Indeed -- this is a clear area for improvement. Sources are usually less of an issue, since these resources are publicly accessible regardless of where or how you run the pipeline (locally, or with any runner). Sinks, on the other hand, require write access, which is often more problematic.
One correction, however: WordCount supports both GCS and local paths, with some exceptions depending on the runner. There are several efforts to improve this, most notably BEAM-59, which is assigned to Pei.

On Thu, Oct 27, 2016 at 8:17 AM, Jesse Anderson <[email protected]> wrote:

> Those tutorials help. I was going through the example code and had the same
> thought. We need to take a pass through the examples and remove some of the
> Google Cloud dependencies.
>
> On Thu, Oct 27, 2016, 5:13 PM Thomas Weise <[email protected]> wrote:
>
> > The Beam tutorials seem to address this:
> >
> > https://github.com/eljefe6a/beamexample/blob/master/README.md
> >
> > On Thu, Oct 27, 2016 at 8:04 AM, Manu Zhang <[email protected]> wrote:
> >
> > > Hey guys,
> > >
> > > I find Beam examples under the examples folder are not easy to run due to
> > > dependency on Google specific services. Even the MinimalWordCount
> > > <https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/MinimalWordCount.java>
> > > requires input and output to be on Google Cloud Storage. Others like
> > > WindowedWordCount
> > > <https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/WindowedWordCount.java>
> > > require BigQuery. I wouldn't expect newcomers to tweak IO themselves.
> > >
> > > Can we have more quick start examples that can be run anywhere?
> > >
> > > Thanks,
> > > Manu Zhang
