"Replacement for production-ish" is beyond a stretch phrasing, UX just isn’t 
there yet for average end user wanting push-button.

Until recently the focus was heavily on infrastructure folks and people 
building their own distros.  The project is now turning towards "end users", so 
anyone from ops to dev/data-hacker will be able to extract value and get moving 
easily.

If you are brave enough to give it a go in its current state, you can start 
with the puppet modules README:

https://github.com/apache/bigtop/tree/master/bigtop-deploy/puppet

It is currently limited (ie: no yarn or mesos variants, orchestration not added 
yet), but things will be stepping up a great deal heading out of the 1.0 
release.  If you do try it and run into issues, hop on the mailing list; the 
docs are another area that needs updating.

Thanks for the pointer to the JSON feed link, definitely handy for some smoke 
tests.


-----Original Message-----
From: Nicholas Chammas [mailto:nicholas.cham...@gmail.com] 
Sent: Tuesday, April 21, 2015 2:33 PM
To: n...@reactor8.com; Spark dev list
Subject: Re: Is spark-ec2 for production use?

Nate, could you point us to an example of how one would use Big Top as a "more 
production-ish" replacement for spark-ec2? I took a look at the project page 
<http://bigtop.apache.org/index.html>, but couldn't find any usage examples. 
Perhaps we can link to them from the spark-ec2 docs.

Regarding tests to validate that Spark was set up correctly, I am using the 
JSON feed from the Spark master web UI 
<http://stackoverflow.com/a/29659630/877069> for starters. Y'all might find it 
useful for the same purpose.
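
To make the idea concrete, here is a rough sketch of such a check in Python. 
It assumes the standalone master serves its state at <master web UI URL>/json 
and that the payload has a top-level "status" field plus a "workers" list 
whose entries each carry a "state" field. Those field names, and the helper 
names fetch_master_state and check_master_state, are assumptions for 
illustration, so verify them against your Spark version.

```python
import json
from urllib.request import urlopen


def fetch_master_state(master_ui_url, timeout=10):
    """Fetch and parse the JSON feed of a standalone Spark master web UI.

    Assumption: the feed lives at <master_ui_url>/json.
    """
    url = master_ui_url.rstrip("/") + "/json"
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def check_master_state(state, expected_workers):
    """Return a list of problems; an empty list means the check passed.

    Assumption: `state` has a "status" string and a "workers" list whose
    entries each have a "state" string ("ALIVE" when healthy).
    """
    problems = []
    status = state.get("status")
    if status != "ALIVE":
        problems.append("master status is %r, expected 'ALIVE'" % (status,))
    alive = [w for w in state.get("workers", []) if w.get("state") == "ALIVE"]
    if len(alive) < expected_workers:
        problems.append(
            "%d of %d expected workers are alive" % (len(alive), expected_workers)
        )
    return problems
```

Something like check_master_state(fetch_master_state("http://<master-host>:8080"), 4) 
could then run right after a deploy, with the host, port, and worker count 
filled in for your cluster.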

Nick

On Tue, Apr 21, 2015 at 5:21 PM <n...@reactor8.com> wrote:

> Several of the Bigtop folks got together last week at ApacheCon, and this 
> was a popular topic for next enhancements with spark related components 
> after getting 1.0 out the door.  Some leading topics were:
>
> -deployment of spark specific clusters
>      -spark standalone, hdfs
>      -spark over yarn, hdfs
>      -spark on mesos (talked to mesos folk about working to include in 
> bigtop post 1.0)
>      -the above plus variants of other bigtop components (ie: kafka, 
> zeppelin, demo data generators)
>
> One thing the group would like some help on is tests for spark 
> environments, so things can be validated post build/deploy and the CI 
> process enhanced; that way, if you choose to deploy via bigtop in 
> test/prod/etc you know things have gone through a certain amount of rigor 
> beforehand.
>
> Nate
>
> -----Original Message-----
> From: Patrick Wendell [mailto:pwend...@gmail.com]
> Sent: Tuesday, April 21, 2015 12:46 PM
> To: Nicholas Chammas
> Cc: Spark dev list
> Subject: Re: Is spark-ec2 for production use?
>
> It could be a good idea to document this a bit. The original goals 
> were to give people an easy way to get started with Spark and also to 
> provide a consistent environment for our own experiments and 
> benchmarking of Spark at the AMPLab. Over time I've noticed a huge 
> amount of scope increase in terms of what people want to do and I do 
> know that many companies run production infrastructure based on launching the 
> EC2 scripts.
>
> My feeling is that the general problem of deploying Spark with other 
> applications and frameworks is fairly well covered by projects which 
> specifically focus on packaging and automation (e.g. Whirr, BigTop, etc).
> So I'd like to see a narrower focus on just getting a vanilla Spark 
> cluster up and running and make it clear that customization and 
> extension of that functionality is really not in scope.
>
> This doesn't mean discouraging people from using it for production use 
> cases, but more that they shouldn't expect us to merge and maintain 
> things that seek to do broader integration with other technologies, 
> automation, etc.
>
> - Patrick
>
> On Tue, Apr 21, 2015 at 12:05 PM, Nicholas Chammas 
> <nicholas.cham...@gmail.com> wrote:
> > Is spark-ec2 intended for spinning up production Spark clusters?
> >
> > I think the answer is no.
> >
> > However, the docs for spark-ec2
> > <https://spark.apache.org/docs/latest/ec2-scripts.html> very much 
> > leave that possibility open, and indeed I see many people asking 
> > questions or opening issues that stem from some production use case 
> > they are trying to fit spark-ec2 to.
> >
> > Here's the latest example
> > <https://issues.apache.org/jira/browse/SPARK-6900?focusedCommentId=14504236&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14504236>
> > of someone using spark-ec2 to power their (presumably) production service.
> >
> > Shouldn't we actively discourage people from using spark-ec2 in this way?
> >
> > I understand there's no stopping people from doing what they want 
> > with it, and certainly the questions and issues we receive about 
> > spark-ec2 are still valid, even if they stem from discouraged use cases.
> >
> > From what I understand, spark-ec2 is intended for quick 
> > experimentation, one-off jobs, prototypes, and so forth.
> >
> > If that's the case, it's best to stress this in the docs.
> >
> > Nick
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org For 
> additional commands, e-mail: dev-h...@spark.apache.org
>


