Hi Adam,

Thanks for the graphs and the tests. I'm definitely interested in digging a
bit deeper to find out what could be causing this.

Do you have the Spark driver logs for both runs?

Tim

On Mon, Nov 30, 2015 at 9:06 AM, Adam McElwee <a...@mcelwee.me> wrote:
> To eliminate any skepticism around whether CPU utilization is a good
> performance metric for this workload, I did a couple of comparison runs of
> an example job to demonstrate a more universal change in performance metrics
> (stage/job time) between coarse and fine-grained mode on Mesos.
>
> The workload is identical here: pulling tgz archives from S3, parsing JSON
> lines from the files, and ultimately creating documents to index into Solr.
> The tasks are not inserting into Solr (just to note that there's no network
> side-effect of the map tasks). The runs are on the exact same hardware in
> EC2 (m2.4xlarge, with 68GB of RAM and 45G executor memory) and the exact
> same JVM, and the results don't depend on the order of the jobs, meaning I
> get the same results whether I run the coarse-grained or the fine-grained
> job first. No other frameworks/tasks are running on the Mesos cluster during
> the test. I see the same results whether it's a 3-node cluster or a 200-node
> cluster.
>
> With the CMS collector, the map stage takes roughly 2.9h in fine-grained
> mode and 3.4h in coarse-grained mode. Because both modes start out
> performing similarly, the total execution-time gap widens as the job size
> grows; to put that another way, the difference is much smaller for
> jobs/stages under an hour. When I submit this job against a much larger
> dataset that takes 5+ hours, the difference in total stage time approaches
> roughly 20-30% longer execution time in coarse-grained mode.
>
> With the G1 collector, the map stage takes roughly 2.2h in fine-grained mode
> and 2.7h in coarse-grained mode. Again, the fine and coarse-grained tests
> run on the exact same machines and the exact same dataset, with only
> spark.mesos.coarse flipped between true and false.
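>
> For reference, here's a rough sketch of the submission (master URL, jar, and
> job arguments elided; not our exact command, just the flags that matter):
>
>   spark-submit --master mesos://... \
>     --executor-memory 45g \
>     --conf spark.mesos.coarse=true \
>     --conf spark.executor.extraJavaOptions=-XX:+UseG1GC \
>     ...
>
> The fine-grained runs only flip spark.mesos.coarse to false, and the CMS
> runs swap the GC flag to -XX:+UseConcMarkSweepGC.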
>
> Let me know if there's anything else I can provide here.
>
> Thanks,
> -Adam
>
>
> On Mon, Nov 23, 2015 at 11:27 AM, Adam McElwee <a...@mcelwee.me> wrote:
>>
>>
>>
>> On Mon, Nov 23, 2015 at 7:36 AM, Iulian Dragoș
>> <iulian.dra...@typesafe.com> wrote:
>>>
>>>
>>>
>>> On Sat, Nov 21, 2015 at 3:37 AM, Adam McElwee <a...@mcelwee.me> wrote:
>>>>
>>>> I've used fine-grained mode on our Mesos Spark clusters until this week,
>>>> mostly because it was the default. I started trying coarse-grained mode
>>>> because of the recent chatter on the mailing list about wanting to move
>>>> the Mesos execution path to coarse-grained only. The odd thing is,
>>>> coarse-grained vs fine-grained seems to yield drastically different
>>>> cluster utilization metrics for any of our jobs that I've tried out this
>>>> week.
>>>>
>>>> If this is best as a new thread, please let me know, and I'll try not to
>>>> derail this conversation. Otherwise, details below:
>>>
>>>
>>> I think it's ok to discuss it here.
>>>
>>>>
>>>> We monitor our Spark clusters with ganglia, and historically, we maintain
>>>> at least 90% CPU utilization across the cluster. Making a single
>>>> configuration change to use coarse-grained execution instead of
>>>> fine-grained consistently yields a CPU utilization pattern that starts
>>>> around 90% at the beginning of the job and then slowly decreases over the
>>>> next 1-1.5 hours to level out around 65% utilization on the cluster. Does
>>>> anyone have a clue why I'd be seeing such a negative effect from
>>>> switching to coarse-grained mode? GC activity is comparable in both
>>>> cases. I've tried 1.5.2, as well as the 1.6.0 preview tag that's on
>>>> GitHub.
>>>
>>>
>>> I'm not very familiar with Ganglia or how it computes utilization, but one
>>> thing comes to mind: did you enable dynamic allocation in coarse-grained
>>> mode?
>>
>>
>> Dynamic allocation is definitely not enabled. The only delta between runs
>> is adding --conf "spark.mesos.coarse=true" to the job submission. Ganglia
>> is just pulling stats from procfs, and I've never seen it report bad
>> results. If I sample any of the 100-200 nodes in the cluster, dstat
>> reflects the same average CPU that I see in ganglia.
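>>
>> For example, sampling a node with something like
>>
>>   dstat -c 5
>>
>> (total CPU usage printed every 5 seconds; the interval is arbitrary) lines
>> up with what the ganglia graphs show for that host.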
>>>
>>>
>>> iulian
>>
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
