After multiple disk support was introduced, I tried to perform some initial
testing of the feature. In the process, I noticed that some existing
frameworks (e.g. Apache Aurora) are not safe to use with it.

What happens is that, because the framework has not been upgraded to the
proper binding version, the scheduler cannot see the type/mount part yet
and incorrectly thinks this is just another slice of scalar disk resource.
What's worse, in the process of handling statically reserved resources, the
scheduler usually merges multiple scalar resources.
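
To make the failure mode concrete, here is a minimal Python sketch of it.
This is not actual Aurora or Mesos code: the DiskResource class, the
"aurora" role, the /mnt/disk1 and /mnt/disk2 paths, and both helper
functions are hypothetical stand-ins I made up for illustration. It only
models a scheduler that cannot see the type/mount fields and then merges
scalar resources by role.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DiskResource:
    """Hypothetical, simplified stand-in for a Mesos disk resource."""
    role: str
    megabytes: float
    source_type: Optional[str] = None  # e.g. "MOUNT"; None means plain root disk
    mount_root: Optional[str] = None


def old_binding_view(res: DiskResource) -> DiskResource:
    """What a scheduler built against the old bindings effectively sees:
    the type/mount part is invisible, so only role and size remain."""
    return DiskResource(role=res.role, megabytes=res.megabytes)


def merge_scalars(resources: List[DiskResource]) -> List[DiskResource]:
    """Sum scalar disk resources that look identical apart from their size."""
    totals = defaultdict(float)
    for r in resources:
        totals[(r.role, r.source_type, r.mount_root)] += r.megabytes
    return [DiskResource(role, mb, st, mr)
            for (role, st, mr), mb in totals.items()]


# Assumed offer: two MOUNT disks and no root disk usable for the sandbox.
offer = [
    DiskResource("aurora", 4096.0, "MOUNT", "/mnt/disk1"),
    DiskResource("aurora", 4096.0, "MOUNT", "/mnt/disk2"),
]

# A scheduler that understands the new fields keeps the disks separate.
aware = merge_scalars(offer)

# A scheduler on the old bindings folds them into one plain disk resource.
merged = merge_scalars([old_binding_view(r) for r in offer])
```

In this sketch, `aware` still holds two distinct MOUNT disks, while `merged`
holds a single 8192 MB resource with no type/mount information, which the
scheduler would then treat as ordinary disk.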

The net result could be that the scheduler tries to tap into these MOUNT
disks and launches a task with such a resource. The Mesos master would
consider the resource invalid (because the machine doesn't really have
disk resources usable for the sandbox), so the launch task action triggers
an error.

I'm not exactly sure whether this is a problem on the Mesos side or the
framework side, but until the framework is upgraded and correctly
identifies the new protobuf fields, are there better ways to handle this?

Thanks.

-- 
Cheers,

Zhitao Li
