Re: Removing the Mesos fine-grained mode

2016-01-20 Thread Iulian Dragoș
That'd be great, thanks Adam!

Re: Removing the Mesos fine-grained mode

2016-01-19 Thread Adam McElwee
Sorry, I never got a chance to circle back with the master logs for this. I
definitely can't share the job code, since it's used to build a pretty core
dataset for my company, but let me see if I can pull some logs together in
the next couple days.

Re: Removing the Mesos fine-grained mode

2016-01-19 Thread Iulian Dragoș
It would be good to get to the bottom of this.

Adam, could you share the Spark app that you're using to test this?

iulian


--
Iulian Dragos

--
Reactive Apps on the JVM
www.typesafe.com


Re: Removing the Mesos fine-grained mode

2015-11-30 Thread Timothy Chen
Hi Adam,

Thanks for the graphs and the tests; definitely interested in digging a
bit deeper to find out what could be the cause of this.

Do you have the spark driver logs for both runs?

Tim




Re: Removing the Mesos fine-grained mode

2015-11-30 Thread Adam McElwee
To eliminate any skepticism around whether CPU is a good performance metric
for this workload, I did a couple of comparison runs of an example job to
demonstrate a more universal change in performance metrics (stage/job time)
between coarse and fine-grained mode on Mesos.

The workload is identical here: pulling tgz archives from S3, parsing JSON
lines from the files, and ultimately creating documents to index into Solr.
The tasks are not inserting into Solr (just to let you know that there's no
network side-effect of the map task). The runs are on the exact same hardware
in EC2 (m2.4xlarge, with 68 GB of RAM and 45 GB executor memory) and the
exact same JVM, and the results do not depend on the order of running the
jobs: I get the same results whether I run coarse-grained or fine-grained
first. No other frameworks/tasks are running on the Mesos cluster during the
test. I see the same results whether it's a 3-node cluster or a 200-node
cluster.

With the CMS collector in fine-grained mode, the map stage takes roughly
2.9h, and coarse-grained mode takes 3.4h. Because both modes initially
start out performing similarly, the total execution time gap widens as the
job size grows. To put that another way, the difference is much smaller for
jobs/stages < 1 hour. When I submit this job for a much larger dataset that
takes 5+ hours, the difference in total stage time moves closer and closer
to roughly 20-30% longer execution time.

With the G1 collector in fine-grained mode, the map stage takes roughly
2.2h, and coarse-grained mode takes 2.7h. Again, the fine and coarse-grained
execution tests are on the exact same machines and the exact same dataset,
with only spark.mesos.coarse changed between true and false.
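
For concreteness, a minimal sketch of what that single-flag delta looks like
(the master URL, app name, and GC flag below are illustrative assumptions,
not our actual job):

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical sketch: the only difference between the two benchmark runs
// is spark.mesos.coarse; the GC choice (CMS vs G1) is fixed per experiment
// through the executor JVM options.
val conf = new SparkConf()
  .setMaster("mesos://mesos-master:5050")                 // illustrative master URL
  .setAppName("s3-tgz-to-solr-docs")                      // illustrative name
  .set("spark.mesos.coarse", "true")                      // "false" for fine-grained
  .set("spark.executor.memory", "45g")
  .set("spark.executor.extraJavaOptions", "-XX:+UseG1GC") // or -XX:+UseConcMarkSweepGC
val sc = new SparkContext(conf)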

Let me know if there's anything else I can provide here.

Thanks,
-Adam




Re: Removing the Mesos fine-grained mode

2015-11-23 Thread Jerry Lam
@Andrew Or

I assume you are referring to this ticket: [SPARK-5095]
https://issues.apache.org/jira/browse/SPARK-5095
Thank you!

Best Regards,

Jerry



Re: Removing the Mesos fine-grained mode

2015-11-23 Thread Jerry Lam
Hi Andrew,

Thank you for confirming this. I bring this up because I used fine-grained
mode before and it was a headache because of the memory issue. Therefore, I
switched to YARN with dynamic allocation. I was considering switching back to
Mesos with coarse-grained mode + dynamic allocation, but from what you
explain, I still cannot have more than 1 executor per slave. This sounds like
a deal breaker for me: if I have a slave with 100GB of RAM and a slave with
30GB, I cannot fully utilize the 100GB instance if I specify
spark.executor.memory = 20GB. The two slaves will each consume 20GB in this
case, even though there is 80GB left on the bigger machine. If I specify 90GB
for spark.executor.memory, the only active slave is the one with 100GB, so
the slave with 30GB will sit idle.
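
To make that arithmetic concrete, a small sketch (the numbers come from the
example above; nothing else is assumed):

// One executor per slave: each active slave uses exactly
// spark.executor.memory, and a slave that can't fit it sits idle.
val slaveRamGb = Seq(100, 30)

def usedRamGb(executorMemGb: Int): Int =
  slaveRamGb.map(ram => if (ram >= executorMemGb) executorMemGb else 0).sum

println(usedRamGb(20)) // 40 -> both slaves active, but 80 GB wasted on the big one
println(usedRamGb(90)) // 90 -> only the 100 GB slave runs; the 30 GB slave idles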

Do you know the link to the JIRA where I can receive updates on the feature
you mention? We intend to use Mesos, but it has proven difficult with our
tight budget constraints.

Best Regards,

Jerry





Re: Removing the Mesos fine-grained mode

2015-11-23 Thread Andrew Or
@Jerry Lam

> Can someone confirm if it is true that dynamic allocation on mesos "is
> designed to run one executor per slave with the configured amount of
> resources." I copied this sentence from the documentation. Does this mean
> there is at most 1 executor per node? Therefore, if you have a big
> machine, you need to allocate a fat executor on this machine in order to
> fully utilize it?


Mesos inherently does not support multiple executors per slave currently.
This is actually not related to dynamic allocation. There is, however, an
outstanding patch to add support for multiple executors per slave. When
that feature is merged, it will work well with dynamic allocation.
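
For context, the outstanding patch is the one tracked as SPARK-5095 (cited
elsewhere in this thread). A sketch of the kind of configuration it enables,
assuming the feature as it eventually shipped; the numbers are illustrative:

import org.apache.spark.SparkConf

// Sketch, assuming multiple-executors-per-slave support: capping each
// executor's size lets a big slave host several executors instead of
// one fat one.
val conf = new SparkConf()
  .set("spark.mesos.coarse", "true")
  .set("spark.executor.cores", "4")    // per-executor CPU cap; a 16-core slave could host 4
  .set("spark.executor.memory", "20g") // per-executor memory; a 100 GB slave could host several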




Re: Removing the Mesos fine-grained mode

2015-11-23 Thread Adam McElwee
On Mon, Nov 23, 2015 at 7:36 AM, Iulian Dragoș 
wrote:

> I'm not very familiar with Ganglia, and how it computes utilization. But
> one thing comes to mind: did you enable dynamic allocation on
> coarse-grained mode?

Dynamic allocation is definitely not enabled. The only delta between runs
is adding --conf "spark.mesos.coarse=true" to the job submission. Ganglia is
just pulling stats from procfs, and I've never seen it report bad results.
If I sample any of the 100-200 nodes in the cluster, dstat reflects the same
average CPU that I'm seeing reflected in Ganglia.



Re: Removing the Mesos fine-grained mode

2015-11-23 Thread Jerry Lam
Hi guys,

Can someone confirm whether it is true that dynamic allocation on Mesos "is
designed to run one executor per slave with the configured amount of
resources"? I copied this sentence from the documentation. Does this mean
there is at most 1 executor per node? Therefore, if you have a big machine,
you need to allocate a fat executor on this machine in order to fully
utilize it?

Best Regards,

Sent from my iPhone



Re: Removing the Mesos fine-grained mode

2015-11-23 Thread Iulian Dragoș
On Sat, Nov 21, 2015 at 3:37 AM, Adam McElwee  wrote:

> If this is best as a new thread, please let me know, and I'll try not to
> derail this conversation. Otherwise, details below:

I think it's ok to discuss it here.



I'm not very familiar with Ganglia, and how it computes utilization. But
one thing comes to mind: did you enable dynamic allocation on coarse-grained
mode?

iulian


Re: Removing the Mesos fine-grained mode

2015-11-20 Thread Adam McElwee
I've used fine-grained mode on our Mesos Spark clusters until this week,
mostly because it was the default. I started trying coarse-grained because
of the recent chatter on the mailing list about wanting to move the Mesos
execution path to coarse-grained only. The odd thing is, coarse-grained vs
fine-grained yields drastically different cluster utilization metrics for
any of our jobs that I've tried out this week.

If this is best as a new thread, please let me know, and I'll try not to
derail this conversation. Otherwise, details below:

We monitor our Spark clusters with Ganglia, and historically, we maintain
at least 90% CPU utilization across the cluster. Making a single
configuration change to use coarse-grained execution instead of
fine-grained consistently yields a CPU utilization pattern that starts
around 90% at the beginning of the job and then slowly decreases over the
next 1-1.5 hours to level out around 65% CPU utilization on the cluster.
Does anyone have a clue why I'd be seeing such a negative effect of
switching to coarse-grained mode? GC activity is comparable in both cases.
I've tried 1.5.2, as well as the 1.6.0 preview tag that's on GitHub.

Thanks,
-Adam


Re: Removing the Mesos fine-grained mode

2015-11-20 Thread Iulian Dragoș
This is a good point. We should probably document this better in the
migration notes. In the meantime:

http://spark.apache.org/docs/latest/running-on-mesos.html#dynamic-resource-allocation-with-mesos

Roughly, dynamic allocation lets Spark add and kill executors based on the
scheduling delay. The min and max number of executors can be configured.
Would this fit your use-case?
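
For reference, a minimal sketch of the relevant settings (assuming Spark
1.5/1.6, with the external shuffle service running on each slave as the page
above describes; the executor bounds are illustrative):

import org.apache.spark.SparkConf

// Sketch of dynamic allocation on coarse-grained Mesos; executors are
// added and killed between the configured bounds based on scheduling delay.
val conf = new SparkConf()
  .set("spark.mesos.coarse", "true")
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true")
  .set("spark.dynamicAllocation.minExecutors", "2")
  .set("spark.dynamicAllocation.maxExecutors", "50")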

iulian




--
Iulian Dragos

--
Reactive Apps on the JVM
www.typesafe.com


Re: Removing the Mesos fine-grained mode

2015-11-19 Thread Jo Voordeckers
As a recent fine-grained mode adopter, I'm now confused after reading this
and other resources from Spark Summit, the docs, etc., so can someone please
advise me for our use-case?

We'll have 1 or 2 streaming jobs and will run scheduled batch jobs, which
should take resources away from the streaming jobs and give 'em back upon
completion.

Can someone point me at the docs or a guide to set this up?

Thanks!

- Jo Voordeckers




Re: Removing the Mesos fine-grained mode

2015-11-19 Thread Heller, Chris
I was one that argued for fine-grain mode, and there is something I still 
appreciate about how fine-grain mode operates in terms of the way one would 
define a Mesos framework. That said, with dyn-allocation and Mesos support for 
both resource reservation, oversubscription and revocation, I think the 
direction is clear that the coarse mode is the proper way forward, and having 
the two code paths is just noise.

-Chris

From: Iulian Dragoș
Date: Thursday, November 19, 2015 at 6:42 AM
To: "dev@spark.apache.org"
Subject: Removing the Mesos fine-grained mode

Hi all,

Mesos is the only cluster manager that has a fine-grained mode, but it's more 
often than not problematic, and it's a maintenance burden. I'd like to suggest 
removing it in the 2.0 release.

A few reasons:

- code/maintenance complexity. The two modes duplicate a lot of functionality
(and sometimes code), which leads to subtle differences or bugs. See
SPARK-10444, and also this thread and MESOS-3202
- it's not widely used (Reynold's previous thread got very few responses from
people relying on it)
- similar functionality can be achieved with dynamic allocation +
coarse-grained mode

I suggest that Spark 1.6 already issues a warning if it detects fine-grained 
use, with removal in the 2.0 release.

Thoughts?

iulian
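
A warning of the kind suggested could be as small as the following sketch
(an illustration only, not the actual Spark patch; logWarning here stands in
for Spark's internal logging):

// Hypothetical sketch of the suggested 1.6 deprecation warning, emitted
// where the Mesos scheduler backend is selected.
if (!conf.getBoolean("spark.mesos.coarse", defaultValue = false)) {
  logWarning("Mesos fine-grained mode is deprecated and will be removed " +
    "in Spark 2.0; consider coarse-grained mode with dynamic allocation.")
}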



Re: Removing the Mesos fine-grained mode

2015-11-19 Thread Dean Wampler
Sounds like the right move. Simplifies things in important ways.

Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition (O'Reilly)
Typesafe
@deanwampler
http://polyglotprogramming.com
