Presumably we'll eventually also run additional services alongside (like
Kafka) to have true integration tests for I/O connectors. What is the
common deployment in this case?

On Jul 28, 2016 06:35, "Amit Sela" <[email protected]> wrote:

> So what would be the preferred resource manager to test Flink on?
>
> On Thu, Jul 28, 2016, 16:34 Aljoscha Krettek <[email protected]> wrote:
>
> > Flink also has a standalone mode.
> >
> > On Thu, 28 Jul 2016 at 13:42 Ismaël Mejía <[email protected]> wrote:
> >
> > > Good subject. YARN is the de facto standard, at least from the point of
> > > view of the Big Data distributions (Cloudera, Hortonworks, etc.) and
> > > cloud offerings (e.g. AWS EMR, Azure HDInsight and Google Dataproc), and
> > > given that it is supported by both Spark and Flink, I think it is
> > > valuable to test the support for YARN. The question is: should the tests
> > > be run on 'Standalone OR YARN', or maybe we can have tests for
> > > 'Standalone AND YARN'?
> > >
> > > Ismael.
> > >
> > > On Thu, Jul 28, 2016 at 12:24 PM, Amit Sela <[email protected]> wrote:
> > >
> > > > Following a discussion I had with Kenneth and Dan here
> > > > <https://github.com/apache/incubator-beam/pull/711>, I want to raise
> > > > the issue of which resource manager we should use for ongoing tests
> > > > that will run on actual clusters (on top of local/in-mem tests).
> > > > If we plan to test all runners on all their supported resource
> > > > managers, great! But I guess this won't be the case, at least not at
> > > > the beginning.
> > > >
> > > > Spark can run its own (Standalone Mode) resource manager, use YARN, or
> > > > use Mesos. According to the latest survey
> > > > <http://go.databricks.com/hubfs/DataBricks_Surveys_-_Content/Spark-Survey-2015-Infographic.pdf>
> > > > by Databricks, Standalone is in the lead (48%), with YARN trailing it
> > > > (40%), while Mesos looks like the least favourite.
> > > > For Spark, I'd vote for Standalone as it is the most popular use case,
> > > > and it avoids the additional complexity of maintaining YARN on this
> > > > cluster.
> > > > Having said that, AFAIK Flink is a "first-class" YARN citizen (right?),
> > > > and I don't know what available resource managers can be used by other
> > > > runners, so I think runner authors should give their input here.
> > > >
> > > > *Summary:*
> > > > *Spark* - Standalone Mode or YARN (in that order).
> > > > *Flink* - ?
> > > > *Others* - ?
> > > >
> > > > Thanks,
> > > > Amit
> > > >
> > >
> >
>
