Hi,

I am reading the resource allocator code in Mesos and trying to
understand it.

Now, we have HierarchicalAllocatorProcess. From the code, I think the
master will each time offer resources first to the framework whose
"resource share" is the smallest. But I have a question about my use case.
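Just to check my understanding, here is a tiny sketch of what I think that
ordering means (a simplified model in Python, not the actual C++ allocator
code; the cluster totals and allocations below are made up):

    # Simplified model of DRF ordering, NOT the real Mesos allocator.
    # Dominant share = max over resources of (framework allocation / cluster total).
    cluster = {"cpus": 40.0, "mem": 160.0}  # hypothetical cluster totals

    allocations = {
        "hadoop": {"cpus": 20.0, "mem": 32.0},  # made-up current allocations
        "spark":  {"cpus": 4.0,  "mem": 64.0},
    }

    def dominant_share(alloc):
        return max(alloc[r] / cluster[r] for r in cluster)

    # Offers go first to the framework with the lowest dominant share.
    order = sorted(allocations, key=lambda f: dominant_share(allocations[f]))
    print(order)  # hadoop: max(0.5, 0.2) = 0.5; spark: max(0.1, 0.4) = 0.4
                  # -> ['spark', 'hadoop']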

We want to deploy hadoop-cdh3u5 and Spark on the same cluster of nodes with
Mesos. The Hadoop jobtracker will launch Mesos executors in place of
tasktrackers, but each executor may have several map and reduce slots. So
each executor may hold the resources for all of its mapper/reducer slots
even when there are not that many running tasks, and the executor will not
exit until all of its tasks are finished.

So, here comes the problem. If Hadoop always has some running jobs, those
jobs may not actually need all of the resources the jobtracker has
allocated for them (e.g. each executor only runs one mapper). In that
situation, the Spark jobs will not get enough resources.

I think I can set mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum to a small number (e.g. 1 or 2), so
that each node may run multiple Mesos executors for Hadoop. But then the
jobtracker will have to communicate with many more tasktrackers than in a
traditional Hadoop cluster, and I am afraid the jobtracker will become
overwhelmed.
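For reference, what I mean is something like this in mapred-site.xml (the
value 1 is just an illustration of shrinking the slots per tasktracker, not
a recommendation):

    <!-- mapred-site.xml: example only, one map and one reduce slot per tasktracker -->
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>1</value>
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>1</value>
    </property>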

Any ideas about this? I would be glad to hear any thoughts or advice.

Best

Guodong
