On the resource management front, Meng Zhu, Andrei Sekretenko, and I have been working on quota limits and enhancing multi-role framework support:
- A memory leak in the allocator was fixed: MESOS-9852
- Support for quota limits is well underway. The major pieces are in place, and it should be possible to try it out within the next few days. The plan is to push users towards using only quota limits, as future support for optimistic offers will likely not support quota guarantees.
- The /roles endpoint was fixed to expose all cases of "known" roles: MESOS-9888 <https://issues.apache.org/jira/browse/MESOS-9888>, MESOS-9890 <https://issues.apache.org/jira/browse/MESOS-9890>.
- The /roles endpoint and the roles table in the webui have been updated to display quota consumption, as well as breakdowns of allocated, offered, and reserved resources. See the gif here: https://reviews.apache.org/r/71059/file/1894/
- Several bugs were fixed for MULTI_ROLE framework support: MESOS-9856 <https://issues.apache.org/jira/browse/MESOS-9856>, MESOS-9870 <https://issues.apache.org/jira/browse/MESOS-9870>
- The v0 scheduler driver and Java/Python bindings are being updated to support multiple roles.

On the performance front:

- MESOS-9755 <https://issues.apache.org/jira/browse/MESOS-9755>: William Mahler and I looked into updating protobuf from our existing 3.5.x to 3.7.x in order to try using protobuf arenas in the master API, and we noticed a performance regression in the v0 /state endpoint. After looking into it, it appears to be a performance regression in the protobuf reflection code that we use to convert our in-memory protobuf to JSON. No issue has been filed yet with the protobuf community, but it's also worth trying out protobuf's built-in JSON conversion code to see how that compares (see MESOS-9896 <https://issues.apache.org/jira/browse/MESOS-9896>).

Feel free to reply with any questions or additional comments!

Ben