Hi,

> Chatted with BenM offline on this. There's another option that both of us
> agreed is probably better than any of the ones mentioned above.
> 
> The idea is to make `allocatable` return the portion of the input resources
> that are allocatable, and strip the unallocatable portion.
> 
> For example:
> 1) If the input resources are "cpus:0.001,gpus:1", the `allocatable` method
> will return "gpus:1".
> 2) If the input resources are "cpus:1,mem:1", the `allocatable` method will
> return "cpus:1".
> 3) If the input resources are "cpus:0.001,mem:1", the `allocatable` method
> will return an empty Resources object.
> 
> Basically, the algorithm is like the following:
> 
> allocatable = input
> foreach known resource type t: do
>  r = resources of type t from the input
>  if r is less than the min resource of type t; then
>    allocatable -= r
>  fi
> done
> return allocatable

I think that sounds like a faithful extension of the current behavior
(removing too-small resources from the offerable pool), but I feel we should
not just filter out any resource _kind_ below the minimum, but, inside a kind,
each group of _addable_ subresources that is below the minimum:

    allocatable : Resources = input
    for (resource : Resource) in input:
      if resource < min(resource.kind):
        allocatable -= resource

    return allocatable

This would have the effect of clumping together each distinguishable resource
we care about instead of accumulating, say, different disks which in sum are
potentially not that much more interesting to frameworks (they would prefer
more of a particular disk over smaller pieces scattered across multiple disks).
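
To make the difference between the two variants concrete, here is a rough
Python sketch. It is only an illustration, not the allocator code: the
`Resource` tuple, its `source` field (standing in for whatever makes two
resources of the same kind non-addable, e.g. a disk id), and the thresholds in
`MIN` are all made up for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Resource(NamedTuple):
    kind: str          # e.g. "cpus", "mem", "disk"
    amount: float
    source: str = ""   # hypothetical tag; resources add up only if it matches


# Hypothetical per-kind minimums, chosen only for illustration.
MIN = {"cpus": 0.01, "mem": 32.0, "disk": 100.0}


def allocatable_per_kind(resources):
    """Drop every known kind whose summed amount is below its minimum."""
    totals = defaultdict(float)
    for r in resources:
        totals[r.kind] += r.amount
    # Kinds without an entry in MIN are never filtered.
    return [r for r in resources if totals[r.kind] >= MIN.get(r.kind, 0.0)]


def allocatable_per_resource(resources):
    """Merge addable resources first, then drop each merged piece that is
    below the minimum of its kind."""
    totals = defaultdict(float)
    for r in resources:
        totals[(r.kind, r.source)] += r.amount
    return [
        Resource(kind, amount, source)
        for (kind, source), amount in totals.items()
        if amount >= MIN.get(kind, 0.0)
    ]


# Three small disks: together above the disk minimum, individually below it.
offer = [
    Resource("cpus", 1.0),
    Resource("disk", 40.0, "disk-1"),
    Resource("disk", 40.0, "disk-2"),
    Resource("disk", 40.0, "disk-3"),
]

per_kind = allocatable_per_kind(offer)          # keeps all three disks
per_resource = allocatable_per_resource(offer)  # keeps only the cpus
```

With the per-kind check the three 40-unit disks pass because their sum clears
the threshold; with the per-resource check none of them does, which is the
clumping effect I am after.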

@alexr
> If we are about to offer some of the resources from a particular agent, why
> would we filter anything at all? I doubt we should be concerned about the
> size of the offer representation travelling through the network. If
> available resources are "cpus:0.001,gpus:1" and we want to allocate GPU,
> what is the benefit of filtering CPU?
> 
> What about the following:
> allocatable(R)
> {
>  return true
>    iff (there exists r in R for which size(r) > MIN(type(r)))
> }

I think this is less about communication overhead and more a tool to help
make sure that offered resources are actually useful to frameworks. If we
completely removed the current behavior of clumping resources, it might take a
long time for frameworks to actually receive enough interesting resources.
While frameworks can use filters to prevent some offers, to filter out an offer
we currently always require that the filtered resources are a superset of the
resources we are about to offer. As the number of possible dimensions (e.g.,
resource kinds, labels, other fields) increases, it becomes harder and harder
for filters to be effective in this regard, and the allocator needs to step in.

https://en.wikipedia.org/wiki/Curse_of_dimensionality
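
As a small illustration of why the superset requirement gets weaker with more
dimensions, here is a simplified Python sketch. It is not the actual filter
implementation: resources are modelled as a plain dict keyed by a hypothetical
`(kind, tag)` pair, and `is_filtered` is a name I made up for the check.

```python
def is_filtered(filtered: dict, offered: dict) -> bool:
    """An offer is suppressed only if the filtered resources are a superset
    of the offered ones in every dimension."""
    return all(
        filtered.get(key, 0.0) >= amount for key, amount in offered.items()
    )


# Filter installed after a framework declined this offer.
declined = {("cpus", "*"): 0.5, ("mem", "*"): 64.0}

# A smaller offer along the same dimensions is suppressed ...
same_dims = is_filtered(declined, {("cpus", "*"): 0.5})

# ... but one extra dimension (here a tiny piece of some disk) is enough for
# the offer to get past the filter again.
extra_dim = is_filtered(
    declined, {("cpus", "*"): 0.5, ("disk", "disk-1"): 10.0}
)
```

Every additional dimension is another way for an otherwise uninteresting offer
to fall outside an existing filter.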


Cheers,

Benjamin
