Hi, I am reading the resource allocator code in Mesos and trying to understand it. From the code, it looks like the HierarchicalAllocatorProcess has the master offer resources first to the framework whose "resource share" is currently the smallest.
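To check that I am reading it right, here is a rough Python sketch of my mental model (not the actual Mesos implementation, which is C++ and, as far as I can tell, orders frameworks with a DRF-style sorter on their dominant share; the function names and numbers below are made up for illustration):

    # Sketch only: a framework's share is its dominant share, i.e. the
    # largest fraction it holds of any single resource in the cluster.
    def dominant_share(allocated, total):
        return max(allocated[r] / total[r] for r in total)

    # Offers go to the framework with the smallest dominant share first.
    def next_framework(frameworks, total):
        return min(frameworks, key=lambda f: dominant_share(frameworks[f], total))

    total = {"cpus": 64.0, "mem": 256.0}
    frameworks = {
        "hadoop": {"cpus": 40.0, "mem": 160.0},  # executors holding idle slots
        "spark":  {"cpus": 4.0,  "mem": 16.0},
    }
    print(next_framework(frameworks, total))  # -> spark

Is that roughly how the allocation order works?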
I have a question about our use case. We want to deploy hadoop-cdh3u5 and Spark on the same cluster of nodes with Mesos. The Hadoop JobTracker launches Mesos executors in place of TaskTrackers, but each executor may have several map and reduce slots, so an executor can hold the resources for all of its slots even when it is not running that many tasks, and it will not exit until all of its tasks have finished.

Here is the problem: if Hadoop always has some submitted jobs running, those jobs may not need all of the resources the JobTracker has allocated, e.g. each executor may be running only one mapper. In that situation, the Spark jobs will not get enough resources.

I think I can set mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum to a small number (e.g. 1 or 2), so that each node may run multiple Mesos executors for Hadoop. But then the JobTracker will have to communicate with a lot of TaskTrackers (unlike a traditional Hadoop cluster), and I am afraid the JobTracker will become overwhelmed.
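Concretely, the change I have in mind is something like this in mapred-site.xml (just a sketch; the value 1 is only an example, and I have not verified how the Hadoop-on-Mesos port reads these properties):

    <!-- Cap each TaskTracker (i.e. each Mesos executor) at one map slot
         and one reduce slot, so executors stay small. Example values only. -->
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>1</value>
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>1</value>
    </property>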
Do you have any ideas about this? I would be glad to hear any advice.

Best,
Guodong