Indeed -- this is a clear area for improvement. Sources are usually not as
big an issue -- these resources are publicly accessible regardless of
where/how you run the pipeline (locally, or with any runner). On the other
hand, Sinks require write access, which is often more problematic.

One correction, however: WordCount supports both GCS and local paths, with
some exceptions depending on the runner.

There are several efforts to improve this, most notably BEAM-59, which is
assigned to Pei.

On Thu, Oct 27, 2016 at 8:17 AM, Jesse Anderson <[email protected]>
wrote:

> Those tutorials help. I was going through the example code and had the same
> thought. We need to take a pass through the examples and remove some of the
> Google Cloud dependencies.
>
> On Thu, Oct 27, 2016, 5:13 PM Thomas Weise <[email protected]> wrote:
>
> > The Beam tutorials seem to address this:
> >
> > https://github.com/eljefe6a/beamexample/blob/master/README.md
> >
> >
> > On Thu, Oct 27, 2016 at 8:04 AM, Manu Zhang <[email protected]>
> > wrote:
> >
> > > Hey guys,
> > >
> > > I find Beam examples under the examples folder are not easy to run due to
> > > dependency on Google specific services. Even the MinimalWordCount
> > > <https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/MinimalWordCount.java>
> > > requires input and output to be on Google Cloud Storage. Others like
> > > WindowedWordCount
> > > <https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/WindowedWordCount.java>
> > > require BigQuery. I wouldn't expect newcomers to tweak IO themselves.
> > >
> > > Can we have more quick start examples that can be run anywhere?
> > >
> > > Thanks,
> > > Manu Zhang
> > >
> >
>
