GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/5310

    [FLINK-8453] [flip6] Add SerializableExecutionGraphStore to Dispatcher

    ## What is the purpose of the change
    
    The SerializableExecutionGraphStore is responsible for storing completed
    jobs for historic job requests (e.g. from the web UI or from the client).
    The store is populated by the Dispatcher once a job has terminated.
    
    The FileSerializableExecutionGraphStore implementation persists all
    SerializableExecutionGraphs on disk in order to avoid OOM problems. It
    keeps only some of the stored graphs in memory, up to a configurable size.
    When the cache approaches this size, it evicts elements and reloads them
    only if they are requested again. Additionally, the
    FileSerializableExecutionGraphStore defines an expiration time after which
    the execution graphs are removed from disk. This prevents excessive use of
    disk resources.
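
The behaviour described above (persist everything on disk, cache a bounded number of entries in memory, expire old entries) can be illustrated with a minimal sketch. This is not the code from this PR: the class name `FileBackedStoreSketch`, the string job ids, the entry-count cache limit, the `.ser` file naming, and the injected clock are all assumptions made for illustration, and the sketch is not thread-safe.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical sketch; NOT the FileSerializableExecutionGraphStore from this PR. */
class FileBackedStoreSketch<V extends Serializable> {
    private final Path directory;
    private final long expirationMillis;
    private final LongSupplier clock;                            // injected so expiry can be tested
    private final Map<String, Long> storedAt = new HashMap<>();  // job id -> time of storage
    private final LinkedHashMap<String, V> cache;                // access-ordered LRU cache

    FileBackedStoreSketch(Path directory, int maxCachedEntries,
                          long expirationMillis, LongSupplier clock) {
        this.directory = directory;
        this.expirationMillis = expirationMillis;
        this.clock = clock;
        // Assumption: the "configurable size" is modelled as a maximum entry count.
        this.cache = new LinkedHashMap<String, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                return size() > maxCachedEntries;
            }
        };
    }

    /** Always writes the value to disk; the in-memory copy may be evicted later. */
    void put(String jobId, V graph) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(Files.newOutputStream(fileFor(jobId)))) {
            out.writeObject(graph);
        }
        storedAt.put(jobId, clock.getAsLong());
        cache.put(jobId, graph);
    }

    /** Returns the cached value, reloads it from disk, or returns null if unknown or expired. */
    V get(String jobId) throws IOException, ClassNotFoundException {
        removeExpired();
        if (!storedAt.containsKey(jobId)) {
            return null;
        }
        V cached = cache.get(jobId);
        if (cached != null) {
            return cached;
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(Files.newInputStream(fileFor(jobId)))) {
            @SuppressWarnings("unchecked")
            V loaded = (V) in.readObject();
            cache.put(jobId, loaded);
            return loaded;
        }
    }

    boolean isCached(String jobId) {
        return cache.containsKey(jobId);
    }

    /** Deletes every entry whose age has reached the expiration time, from cache and disk. */
    void removeExpired() throws IOException {
        long now = clock.getAsLong();
        Iterator<Map.Entry<String, Long>> it = storedAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (now - entry.getValue() >= expirationMillis) {
                cache.remove(entry.getKey());
                Files.deleteIfExists(fileFor(entry.getKey()));
                it.remove();
            }
        }
    }

    // Assumption: job ids are safe to use as file names.
    private Path fileFor(String jobId) {
        return directory.resolve(jobId + ".ser");
    }
}
```

In this sketch expiry is checked lazily on each `get`; a real store would more likely run the cleanup on a timer so that disk space is released even when nobody queries it.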
    
    This PR is based on #5309.
    
    ## Brief change log
    
    - Introduce `SerializableExecutionGraphStore` and `FileSerializableExecutionGraphStore`
    - Add `FileSerializableExecutionGraphStore` to `Dispatcher`
    - Store `SerializableExecutionGraphs` in corresponding `FileSerializableExecutionGraphStore`
    - Adapt `Dispatcher` to serve requests for historic jobs
    
    ## Verifying this change
    
    - Added `FileSerializableExecutionGraphStoreTest`
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
      - The S3 file system connector: (no)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (no)
      - If yes, how is the feature documented? (not applicable)
    
    cc @GJL 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink addHistoricJobView

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5310.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5310
    
----
commit a959b9411833e320065b328ed2fc936b58f911f4
Author: Till Rohrmann <trohrmann@...>
Date:   2018-01-16T17:45:53Z

    [FLINK-8449] [flip6] Extend OnCompletionActions to accept an SerializableExecutionGraph
    
    This commit introduces the SerializableExecutionGraph which extends the
    AccessExecutionGraph and adds serializability to it. Moreover, this commit
    changes the OnCompletionActions interface such that it accepts a
    SerializableExecutionGraph instead of a plain JobResult. This allows to
    archive the completed ExecutionGraph for further usage in the container
    component of the JobMasterRunner.

commit ca15b076c05ff940a12a240ba385e2434f93790b
Author: Till Rohrmann <trohrmann@...>
Date:   2018-01-18T14:02:36Z

    [hotfix] [tests] Let BucketingSink extend TestLogger

commit 21c25502fb6d07c6fb65f18100dc6d4ec23e9d93
Author: Till Rohrmann <trohrmann@...>
Date:   2018-01-17T14:01:57Z

    [FLINK-8450] [flip6] Make JobMaster/DispatcherGateway#requestJob type safe
    
    Let JobMasterGateway#requestJob and DispatcherGateway#requestJob return a
    CompletableFuture<SerializableExecutionGraph> instead of a
    CompletableFuture<AccessExecutionGraph>. In order to support the old code
    and the JobManagerGateway implementation we have to keep the return type
    in RestfulGateway. Once the old code has been removed, we should change
    this as well.
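
One way a narrower return type in the dispatcher gateway can coexist with a broader one in a shared parent interface is a wildcard return type in the parent. The sketch below only illustrates that Java mechanism; all type names ending in `Sketch` are hypothetical stand-ins, string job ids are an assumption, and this message does not say how `RestfulGateway` actually declares the method.

```java
import java.io.Serializable;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-ins for AccessExecutionGraph and SerializableExecutionGraph.
interface AccessGraphSketch {
    String getJobName();
}

interface SerializableGraphSketch extends AccessGraphSketch, Serializable {}

// The shared parent keeps the broad type. The wildcard is what lets
// sub-interfaces narrow it, because CompletableFuture<T> itself is invariant in T.
interface RestfulGatewaySketch {
    CompletableFuture<? extends AccessGraphSketch> requestJob(String jobId);
}

// The dispatcher-side gateway narrows the return type via a covariant override.
interface DispatcherGatewaySketch extends RestfulGatewaySketch {
    @Override
    CompletableFuture<SerializableGraphSketch> requestJob(String jobId);
}

final class ArchivedGraphSketch implements SerializableGraphSketch {
    private static final long serialVersionUID = 1L;
    private final String jobName;

    ArchivedGraphSketch(String jobName) {
        this.jobName = jobName;
    }

    @Override
    public String getJobName() {
        return jobName;
    }
}
```

Callers that only know the parent interface still receive an `AccessGraphSketch`, while callers of the dispatcher-side gateway get the serializable type without a cast.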

commit 7b7b0692582189b8e540e5ae022d351c45991e43
Author: Till Rohrmann <trohrmann@...>
Date:   2018-01-17T11:22:43Z

    [FLINK-8453] [flip6] Add SerializableExecutionGraphStore to Dispatcher
    
    The SerializableExecutionGraphStore is responsible for storing completed
    jobs for historic job requests (e.g. from the web ui or from the client).
    The store is populated by the Dispatcher once a job has terminated.
    
    The FileSerializableExecutionGraphStore implementation persists all
    SerializableExecutionGraphs on disk in order to avoid OOM problems. It only
    keeps some of the stored graphs in memory until it reaches a configurable
    size. Once coming close to this size, it will evict the elements and only
    reload them if requested again. Additionally, the
    FileSerializableExecutionGraphStore defines an expiration time after which
    the execution graphs will be removed from disk. This prevents excessive use
    of disk resources.

----

