-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25184/
-----------------------------------------------------------
(Updated Oct. 15, 2014, 10:23 a.m.)
Review request for mesos, Adam B and Timothy St. Clair.
Bugs: MESOS-1746
https://issues.apache.org/jira/browse/MESOS-1746
Repository: mesos-git
Description
-------
Spark uses TaskStatus.data to transfer computed results. As a result, the
resident (RES) memory of mesos-master grows rapidly until the process is
killed by the OOM killer.
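
To illustrate the failure mode, here is a toy model in Python. It is not Mesos code: `TaskStatus`, `ToyMaster`, and `simulate` are hypothetical stand-ins, and the `strip_data` switch shows one possible mitigation assumed purely for illustration (dropping the payload from the copy the master keeps). The actual change is in the diff linked in this request.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List


@dataclass(frozen=True)
class TaskStatus:
    """Hypothetical stand-in for the Mesos TaskStatus message."""
    task_id: str
    state: str
    data: bytes = b""  # opaque payload set by the framework's executor


@dataclass
class ToyMaster:
    """Toy master that remembers every status update it forwards."""
    strip_data: bool
    statuses: Dict[str, List[TaskStatus]] = field(default_factory=dict)

    def status_update(self, status: TaskStatus) -> TaskStatus:
        # Keep a copy for bookkeeping; optionally drop the payload first.
        kept = replace(status, data=b"") if self.strip_data else status
        self.statuses.setdefault(status.task_id, []).append(kept)
        # The framework still receives the original update, payload included.
        return status

    def retained_bytes(self) -> int:
        # Total payload bytes the toy master is still holding on to.
        return sum(len(s.data) for group in self.statuses.values() for s in group)


def simulate(strip_data: bool, tasks: int = 100, payload_size: int = 1024) -> int:
    """Send one finished-task update per task and report retained payload bytes."""
    master = ToyMaster(strip_data=strip_data)
    for i in range(tasks):
        update = TaskStatus(f"task-{i}", "TASK_FINISHED", b"x" * payload_size)
        master.status_update(update)
    return master.retained_bytes()
```

In this model, `simulate(strip_data=False)` retains every payload, so memory grows with the number of updates, while `simulate(strip_data=True)` retains none and the forwarded update is unchanged.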
Diffs
-----
src/master/master.cpp cb46cec0674b3aa031450c5b4f48f4f8bb92767d
Diff: https://reviews.apache.org/r/25184/diff/
Testing (updated)
-------
Tested with Spark. The issue is very easy to reproduce (100% of the time):
when Spark uses Mesos as its resource manager, its executor driver puts the
task result into TaskStatus. For example, the result of a single task looks
like this:
14/08/22 13:29:18 INFO Executor: Serialized size of result for 248 is 17573033
That is roughly 17 MB. A Spark stage may consist of hundreds of tasks and
finish in tens of seconds, so mesos-master is soon killed by the OOM
killer.
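
The arithmetic behind this estimate can be sketched as follows. The result size comes from the log line above; the task count and memory limit in the usage note are hypothetical, and the calculation assumes the master keeps every payload it receives.

```python
RESULT_BYTES = 17_573_033  # serialized result size from the executor log line


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return n_bytes / (1024 * 1024)


def retained_bytes(tasks: int, result_bytes: int = RESULT_BYTES) -> int:
    """Payload bytes held if every task's result is kept (an assumption)."""
    return tasks * result_bytes


def stages_until_limit(limit_bytes: int, tasks_per_stage: int,
                       result_bytes: int = RESULT_BYTES) -> int:
    """Number of whole stages needed to reach or exceed limit_bytes."""
    per_stage = retained_bytes(tasks_per_stage, result_bytes)
    return -(-limit_bytes // per_stage)  # ceiling division
```

One result is about 16.76 MiB (`to_mib(RESULT_BYTES)`). With a hypothetical 200 tasks per stage, `retained_bytes(200)` is 3,514,606,600 bytes (about 3.3 GiB), so `stages_until_limit(8 * 1024**3, 200)` gives 3: a master with 8 GiB available would be exhausted within three such stages.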
Thanks,
Chengwei Yang