James Xu created STORM-55:
-----------------------------

             Summary: Replace Storm's scheduler with a constraint logic 
programming engine
                 Key: STORM-55
                 URL: https://issues.apache.org/jira/browse/STORM-55
             Project: Apache Storm (Incubating)
          Issue Type: New Feature
            Reporter: James Xu


https://github.com/nathanmarz/storm/issues/383

CLP seems to be a great fit for Storm's resource scheduling. We want to be able 
to declaratively specify constraints, such as:

1. Topology A's slots should be <= 10 and as close to 10 as possible (minimize 
the delta between assigned slots and 10)
2. All topologies should use fewer than 200 CPUs and less than 600 GB of memory
3. Topology B should run at most 2 workers on each host
4. Each worker for topology C should run at most one task for component X and 
one task for component Y
5. Reassignment of running topologies in order to satisfy constraints should 
be kept to a minimum
6. Workers of a topology whose own constraints are already satisfied should be 
reassigned at most once every 10 minutes

The logic engine should then search efficiently for an assignment that 
satisfies these constraints and optimizes the stated goals.
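
As a rough illustration of the idea (not a proposal for the actual engine), 
the sketch below treats hard constraints as predicates over a candidate slot 
assignment and the "as close to 10 as possible" goal as a cost to minimize. It 
uses plain exhaustive search in Python rather than a real CLP solver. The 
12-slot cluster, the minimum of 4 slots for topology B, and all names are 
assumptions made up for the example.

```python
from itertools import product

# Hypothetical cluster: 12 free slots shared by topologies "A" and "B".
TOTAL_SLOTS = 12
TOPOLOGIES = ["A", "B"]

# Hard constraints: each is a predicate over a candidate {topology: slots} map.
HARD_CONSTRAINTS = [
    lambda a: a["A"] <= 10,                    # constraint 1: A uses at most 10 slots
    lambda a: a["B"] >= 4,                     # assumed: B needs at least 4 slots
    lambda a: sum(a.values()) <= TOTAL_SLOTS,  # assumed cluster capacity
]


def cost(assignment):
    """Soft goal from constraint 1: keep A as close to 10 slots as possible."""
    return abs(10 - assignment["A"])


def solve():
    """Try every combination of slot counts and keep the cheapest feasible one."""
    best = None
    for counts in product(range(TOTAL_SLOTS + 1), repeat=len(TOPOLOGIES)):
        candidate = dict(zip(TOPOLOGIES, counts))
        if all(check(candidate) for check in HARD_CONSTRAINTS):
            if best is None or cost(candidate) < cost(best):
                best = candidate
    return best


print(solve())
```

With these made-up numbers the only optimal assignment gives A 8 slots and B 
4 slots. A real constraint engine would prune the search space instead of 
enumerating it, which is the point of using CLP here.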

What's going to make this especially powerful is exposing the logic engine to 
users (so people can plug in functions that return new constraints), making it 
really easy for people to add sophisticated scheduling logic with minimal 
effort.
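
One way to picture the plug-in idea, purely as a sketch: users register 
functions that inspect a topology's configuration and return extra constraint 
predicates, and the scheduler collects them before searching. The registry, 
the decorator, and the "max_workers_per_host" config key below are all 
hypothetical and are not part of Storm's API.

```python
# Registry of user-supplied functions. Each takes a topology config (a dict)
# and returns a list of constraint predicates over a candidate assignment.
CONSTRAINT_PROVIDERS = []


def constraint_provider(fn):
    """Decorator that registers a constraint-producing function."""
    CONSTRAINT_PROVIDERS.append(fn)
    return fn


@constraint_provider
def max_workers_per_host(topology):
    """Constraint 3 style rule: cap the workers a topology runs on each host."""
    limit = topology.get("max_workers_per_host")  # hypothetical config key
    if limit is None:
        return []
    name = topology["name"]

    def check(assignment):
        # assignment maps host -> list of topology names, one entry per worker
        return all(workers.count(name) <= limit
                   for workers in assignment.values())

    return [check]


def collect_constraints(topologies):
    """Ask every registered provider for constraints on every topology."""
    return [c for t in topologies for p in CONSTRAINT_PROVIDERS for c in p(t)]


def satisfies(assignment, topologies):
    """True if the assignment passes every collected constraint."""
    return all(c(assignment) for c in collect_constraints(topologies))


topologies = [{"name": "B", "max_workers_per_host": 2}]
print(satisfies({"h1": ["B", "B"], "h2": ["B"]}, topologies))  # within the cap
print(satisfies({"h1": ["B", "B", "B"]}, topologies))          # exceeds the cap
```

The appeal of this shape is that adding a new scheduling rule means writing 
one small function; the search itself stays untouched.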

Clojure's core.logic may be a great fit for this. According to this thread it 
doesn't do minimization goals ( 
https://groups.google.com/group/clojure/browse_thread/thread/a050a4e6b390514a 
), but it may in the future.

-------------------------------------------
@nathanmarz: There also appears to be a fair amount of research in this area 
that we should look at: 
http://scholar.google.com/scholar?q=resource+scheduling+constraint+logic+programming&hl=en&as_sdt=0&as_vis=1&oi=scholart&sa=X&ei=X1uIUK6gA4rBigL5yoCgBg&ved=0CD0QgQMwAA


-------------------------------------------
bgedik: It would be good if the default scheduler understood placement 
constraints. It would be very helpful if placement constraints could be 
attached to the topology, relieving the developer from having to write a 
scheduler to get the placement she wants.

The placement constraints can come in various forms. Here are some examples:

Isolation constraint: The bolt/spout task cannot share its host with any other 
task
Co-location constraint: A group of bolt/spout tasks has to be placed on the 
same host
Ex-location constraint: A group of bolt/spout tasks cannot be placed on the 
same host
Pool placement: A group of bolt/spout tasks are sprayed over a pool of hosts
Explicit placement: A bolt/spout task is placed on a user-specified host

Pools can be defined using host properties. For instance, configuration files 
can be used to assign properties to hosts. As a specific example, some hosts 
can have GPUs installed. As an application developer, I can request 4 such 
hosts (with gpu=true property) to create a pool in my topology construction 
code (without hardcoding the identities of the hosts). Then I can spray some of 
my bolt instances across the hosts in my GPU pool.
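
To make the pool and placement ideas above concrete, here is a minimal sketch. 
It selects a pool by host property, sprays tasks across it round-robin, and 
checks the co-location, ex-location and isolation rules against the resulting 
placement. The host names, the property table, and every function name are 
assumptions for illustration; Storm does not provide these.

```python
from itertools import cycle

# Hypothetical host properties, as they might be loaded from config files.
HOSTS = {
    "h1": {"gpu": "true"},
    "h2": {"gpu": "false"},
    "h3": {"gpu": "true"},
    "h4": {"gpu": "true"},
}


def make_pool(hosts, required, size):
    """Pick `size` hosts whose properties include every required key/value."""
    matching = sorted(name for name, props in hosts.items()
                      if all(props.get(k) == v for k, v in required.items()))
    if len(matching) < size:
        raise ValueError("not enough hosts match " + repr(required))
    return matching[:size]


def spray(tasks, pool):
    """Spread tasks over the pool round-robin; returns {task: host}."""
    return {task: host for task, host in zip(tasks, cycle(pool))}


def co_located(placement, group):
    """Co-location: every task in the group is on the same host."""
    return len({placement[t] for t in group}) == 1


def ex_located(placement, group):
    """Ex-location: no two tasks in the group share a host."""
    hosts = [placement[t] for t in group]
    return len(set(hosts)) == len(hosts)


def isolated(placement, task):
    """Isolation: no other task is placed on this task's host."""
    return all(h != placement[task]
               for t, h in placement.items() if t != task)


gpu_pool = make_pool(HOSTS, {"gpu": "true"}, 2)
placement = spray(["bolt-1", "bolt-2", "bolt-3"], gpu_pool)
print(gpu_pool, placement)
```

Note that the topology code only names the property (gpu=true) and the pool 
size, never the hosts themselves, which is the point made above about not 
hardcoding host identities.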



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
