GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/5091

    [FLINK-7956] [flip6] Add support for queued scheduling with slot sharing to SlotPool

    ## What is the purpose of the change
    
    This commit adds support for queued scheduling with slot sharing to the
    SlotPool. The idea of slot sharing is that multiple tasks can run in the
    same slot. Queued scheduling means that a slot request need not be
    completed right away but can be completed at a later point in time. This
    makes it possible to start new TaskExecutors when no slots are left.
    
    The main component responsible for the management of shared slots is the
    SlotSharingManager. The SlotSharingManager internally maintains a tree-like
    structure which stores the SlotContext future of the underlying
    AllocatedSlot. Whenever this future is completed, any pending
    LogicalSlot instantiations are executed and sent to the slot requester.
    
    A shared slot is represented by a MultiTaskSlot which can harbour multiple
    TaskSlots. A TaskSlot can either be a MultiTaskSlot or a SingleTaskSlot.
    
    In order to represent co-location constraints, we first obtain a root
    MultiTaskSlot and then allocate a nested MultiTaskSlot in which the
    co-located tasks are allocated. The corresponding SlotRequestID is assigned
    to the CoLocationConstraint in order to make the TaskSlot retrievable for
    other tasks assigned to the same CoLocationConstraint.
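
The tree described above can be sketched roughly as follows. This is an illustrative model only, not the code in this PR: the class and method names mirror the description, plain strings stand in for `SlotRequestID` and `SlotContext`, and the constructor and `allocate*` signatures are assumptions. It shows a root MultiTaskSlot, a nested MultiTaskSlot for co-located tasks, and logical slots that are only completed once the shared slot context future completes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Illustrative sketch only; names follow the PR description, not Flink's actual classes. */
public class SlotSharingSketch {

    /** A node in the slot tree; every node shares the slot context future of the root. */
    abstract static class TaskSlot {
        final String requestId;                      // stands in for SlotRequestID
        final CompletableFuture<String> slotContext; // stands in for the SlotContext future

        TaskSlot(String requestId, CompletableFuture<String> slotContext) {
            this.requestId = requestId;
            this.slotContext = slotContext;
        }
    }

    /** Leaf node: yields one logical slot once the underlying slot is available. */
    static final class SingleTaskSlot extends TaskSlot {
        final CompletableFuture<String> logicalSlot;

        SingleTaskSlot(String requestId, CompletableFuture<String> slotContext) {
            super(requestId, slotContext);
            // The logical slot is created lazily, when the slot context arrives.
            this.logicalSlot = slotContext.thenApply(ctx -> requestId + "@" + ctx);
        }
    }

    /** Inner node: can hold SingleTaskSlots as well as nested MultiTaskSlots. */
    static final class MultiTaskSlot extends TaskSlot {
        final List<TaskSlot> children = new ArrayList<>();

        MultiTaskSlot(String requestId, CompletableFuture<String> slotContext) {
            super(requestId, slotContext);
        }

        SingleTaskSlot allocateSingleTaskSlot(String id) {
            SingleTaskSlot child = new SingleTaskSlot(id, slotContext);
            children.add(child);
            return child;
        }

        MultiTaskSlot allocateMultiTaskSlot(String id) {
            MultiTaskSlot child = new MultiTaskSlot(id, slotContext);
            children.add(child);
            return child;
        }
    }

    /** Builds a small tree and returns what the requesters observe. */
    static List<String> demo() {
        CompletableFuture<String> pending = new CompletableFuture<>();
        MultiTaskSlot root = new MultiTaskSlot("root", pending);
        SingleTaskSlot source = root.allocateSingleTaskSlot("source");
        // Nested MultiTaskSlot that would back a co-location constraint.
        MultiTaskSlot coLocated = root.allocateMultiTaskSlot("co-location");
        SingleTaskSlot head = coLocated.allocateSingleTaskSlot("head");

        boolean doneBefore = source.logicalSlot.isDone(); // queued: not yet fulfilled
        pending.complete("slot-0");                       // the underlying slot arrives

        List<String> result = new ArrayList<>();
        result.add(String.valueOf(doneBefore));
        result.add(source.logicalSlot.join());
        result.add(head.logicalSlot.join());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch, completing the single future at the root is enough to fulfil every queued request in the tree, which is the behaviour the description attributes to the SlotContext future.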
    
    This PR also moves the `SlotPool` components to `o.a.f.runtime.jobmaster.slotpool`.
    
    This PR is based on #5090 
    
    ## Brief change log
    
    - Add `SlotSharingManager` to manage shared slots
    - Rework `SlotPool` to use `SlotSharingManager`
    - Add `SlotPool#allocateMultiTaskSlot` to allocate a shared slot
    - Add `SlotPool#allocateCoLocatedMultiTaskSlot` to allocate a co-located slot
    - Move `SlotPool` components to `o.a.f.runtime.jobmaster.slotpool`
    
    ## Verifying this change
    
    - Port `SchedulerSlotSharingTest`, `SchedulerIsolatedTasksTest` and
    `ScheduleWithCoLocationHintTest` to run with `SlotPool`
    - Add `SlotSharingManagerTest`, `SlotPoolSlotSharingTest` and
    `SlotPoolCoLocationTest` 
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
      - The S3 file system connector: (no)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (no)
      - If yes, how is the feature documented? (not applicable)
    
    CC: @GJL 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink slotPoolSlots

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5091.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5091
    
----
commit d30dde83548dbeff4249f3b57b67cdb6247af510
Author: Till Rohrmann <[email protected]>
Date:   2017-11-14T22:50:52Z

    [FLINK-8078] Introduce LogicalSlot interface
    
    The LogicalSlot interface decouples the task deployment from the actual
    slot implementation which at the moment is Slot, SimpleSlot and SharedSlot.
    This is a helpful step to introduce a different slot implementation for
    Flip-6.

commit e5da9566a6fc8a36ac8b06bae911c0dff5554e5d
Author: Till Rohrmann <[email protected]>
Date:   2017-11-15T13:20:27Z

    [FLINK-8085] Thin out LogicalSlot interface
    
    Remove the isCanceled and isReleased methods and decouple the logical slot
    from the Execution by introducing a Payload interface which is set for a
    LogicalSlot. The Payload interface is implemented by the Execution and
    allows failing the implementation and obtaining a termination future.
    
    Introduce a proper Execution#releaseFuture which is completed once the
    Execution's assigned resource has been released.

commit 84d86bebe2f9f8395430e7c71dd2393ba117b44f
Author: Till Rohrmann <[email protected]>
Date:   2017-11-24T17:03:49Z

    [FLINK-8087] Decouple Slot from AllocatedSlot
    
    This commit introduces the SlotContext, which is an abstraction for the
    SimpleSlot to obtain the relevant slot information to do the communication
    with the TaskManager without relying on the AllocatedSlot, which is now
    only used by the SlotPool.

commit 80a3cc848a0c724a2bc09b1b967cc9e6ccec5942
Author: Till Rohrmann <[email protected]>
Date:   2017-11-24T17:06:10Z

    [FLINK-8088] Associate logical slots with the slot request id
    
    Previously, logical slots like the SimpleSlot and SharedSlot were associated
    with the actually allocated slot via the AllocationID. This, however, was
    sub-optimal because allocated slots can be re-used to fulfill other slot
    requests (logical slots) as well. Therefore, we should bind the logical
    slots to the id with the right lifecycle,
    which is the slot request id.
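
A minimal sketch of why the slot request id has the right lifecycle, under simplifying assumptions: plain strings stand in for `SlotRequestID` and `AllocationID`, and the class and its methods are hypothetical rather than taken from Flink. One allocated slot outlives the first logical slot and then serves a second one, so only the request id identifies a logical slot unambiguously.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical bookkeeping; not part of Flink. */
public class SlotRequestIdSketch {

    /** Live logical slots, keyed by slot request id, mapped to the allocated slot they use. */
    private final Map<String, String> allocationByRequestId = new HashMap<>();

    void assign(String slotRequestId, String allocationId) {
        allocationByRequestId.put(slotRequestId, allocationId);
    }

    /** Releases exactly one logical slot; the allocated slot itself stays around. */
    boolean release(String slotRequestId) {
        return allocationByRequestId.remove(slotRequestId) != null;
    }

    long logicalSlotsOn(String allocationId) {
        return allocationByRequestId.values().stream()
            .filter(allocationId::equals)
            .count();
    }

    static long demo() {
        SlotRequestIdSketch pool = new SlotRequestIdSketch();
        pool.assign("request-1", "allocation-1");
        pool.release("request-1");                // first logical slot ends
        pool.assign("request-2", "allocation-1"); // same allocated slot, new logical slot
        return pool.logicalSlotsOn("allocation-1");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```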

commit 3e4550c0607744b20893dc90c587b63e68e4de1e
Author: Till Rohrmann <[email protected]>
Date:   2017-11-13T14:42:07Z

    [FLINK-8089] Also check for other pending slot requests in offerSlot
    
    Not only check for a slot request with the right allocation id but also
    check whether we can fulfill other pending slot requests with an unclaimed
    offered slot before adding it to the list of available slots.
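
The order of checks described in this commit message can be sketched as below. This is a simplified model, not the actual `SlotPool#offerSlot`: the class, its fields and the string ids are assumptions, and a real pool would presumably also have to verify that the offered slot actually fits the pending request before using it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical, simplified pool; not Flink's SlotPool. */
public class OfferSlotSketch {

    /** Pending requests in arrival order, keyed by the allocation id they wait for. */
    private final Map<String, CompletableFuture<String>> pendingRequests = new LinkedHashMap<>();
    private final List<String> availableSlots = new ArrayList<>();

    CompletableFuture<String> requestSlot(String allocationId) {
        CompletableFuture<String> future = new CompletableFuture<>();
        pendingRequests.put(allocationId, future);
        return future;
    }

    void offerSlot(String allocationId) {
        // 1. A request is waiting for exactly this allocation id.
        CompletableFuture<String> exactMatch = pendingRequests.remove(allocationId);
        if (exactMatch != null) {
            exactMatch.complete(allocationId);
            return;
        }
        // 2. Otherwise use the unclaimed slot for the oldest other pending request.
        Iterator<CompletableFuture<String>> others = pendingRequests.values().iterator();
        if (others.hasNext()) {
            CompletableFuture<String> oldest = others.next();
            others.remove();
            oldest.complete(allocationId);
            return;
        }
        // 3. Nobody is waiting: keep the slot as available.
        availableSlots.add(allocationId);
    }

    static List<String> demo() {
        OfferSlotSketch pool = new OfferSlotSketch();
        CompletableFuture<String> first = pool.requestSlot("a1");
        CompletableFuture<String> second = pool.requestSlot("a2");
        pool.offerSlot("a2"); // exact match for the second request
        pool.offerSlot("a3"); // unclaimed slot, fulfils the first request
        pool.offerSlot("a4"); // no pending requests left, becomes available

        List<String> result = new ArrayList<>();
        result.add(second.join());
        result.add(first.join());
        result.addAll(pool.availableSlots);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```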

commit b04dda46aaf298d921929910574662970d9c5093
Author: Till Rohrmann <[email protected]>
Date:   2017-11-24T22:29:53Z

    [hotfix] Speed up RecoveryITCase

commit e512558917f9bb5005024630b8a015cd624164b4
Author: Till Rohrmann <[email protected]>
Date:   2017-11-24T17:08:38Z

    [FLINK-7956] [flip6] Add support for queued scheduling with slot sharing to SlotPool
    
    This commit adds support for queued scheduling with slot sharing to the
    SlotPool. The idea of slot sharing is that multiple tasks can run in the
    same slot. Queued scheduling means that a slot request need not be
    completed right away but can be completed at a later point in time. This
    makes it possible to start new TaskExecutors when no slots are left.
    
    The main component responsible for the management of shared slots is the
    SlotSharingManager. The SlotSharingManager internally maintains a tree-like
    structure which stores the SlotContext future of the underlying
    AllocatedSlot. Whenever this future is completed, any pending
    LogicalSlot instantiations are executed and sent to the slot requester.
    
    A shared slot is represented by a MultiTaskSlot which can harbour multiple
    TaskSlots. A TaskSlot can either be a MultiTaskSlot or a SingleTaskSlot.
    
    In order to represent co-location constraints, we first obtain a root
    MultiTaskSlot and then allocate a nested MultiTaskSlot in which the
    co-located tasks are allocated. The corresponding SlotRequestID is assigned
    to the CoLocationConstraint in order to make the TaskSlot retrievable for
    other tasks assigned to the same CoLocationConstraint.
    
    Port SchedulerSlotSharingTest, SchedulerIsolatedTasksTest and
    ScheduleWithCoLocationHintTest to run with SlotPool.
    
    Restructure SlotPool components.
    
    Add SlotSharingManagerTest, SlotPoolSlotSharingTest and
    SlotPoolCoLocationTest.

commit 6489c6769a40b70f49b827784c810f954c413361
Author: Till Rohrmann <[email protected]>
Date:   2017-11-27T08:29:54Z

    [hotfix] [tests] Speed up queryable state IT tests by removing sleep

----

