[jira] [Commented] (FLINK-1579) Create a Flink History Server

ASF GitHub Bot (JIRA) Wed, 08 Feb 2017 02:47:48 -0800

    [ 
https://issues.apache.org/jira/browse/FLINK-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857822#comment-15857822
 ]


ASF GitHub Bot commented on FLINK-1579:
---------------------------------------

GitHub user zentol opened a pull request:

    https://github.com/apache/flink/pull/3286

    [FLINK-1579] [WIP] Implement Standalone HistoryServer

    This PR is a work-in-progress view over a standalone History Server (HS).
    
    JobManagers may send completed jobs to the HistoryServer for them to be 
archived. Upon receiving an ArchivedExecutionGraph the HS pre-computes all 
possible REST requests and writes them into files. The files are arranged in a 
directory structure corresponding to the REST API.
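
As a rough illustration of the idea described above (not the code from this PR), pre-computing responses into a directory tree that mirrors the REST API could be sketched as follows. The class `ArchiveSketch`, the methods `fileFor`, `writeArchive` and `serve`, and the `.json` file suffix are all assumptions made for this example. The suffix is one way to let `/jobs/abc` and `/jobs/abc/vertices` coexist on disk, since the first would otherwise have to be both a file and a directory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical sketch: none of these names come from the pull request.
public class ArchiveSketch {

    /** Maps a REST path such as "/jobs/abc" to the file "<root>/jobs/abc.json". */
    static Path fileFor(Path root, String restPath) {
        Path base = root.toAbsolutePath().normalize();
        // Drop the leading slash so the path resolves below the archive root.
        String relative = restPath.startsWith("/") ? restPath.substring(1) : restPath;
        Path target = base.resolve(relative + ".json").normalize();
        // Reject paths like "/../x" that would leave the archive directory.
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("path escapes archive root: " + restPath);
        }
        return target;
    }

    /** Writes every pre-computed response to the file that corresponds to its REST path. */
    static void writeArchive(Path root, Map<String, String> responsesByRestPath) throws IOException {
        for (Map.Entry<String, String> entry : responsesByRestPath.entrySet()) {
            Path target = fileFor(root, entry.getKey());
            Files.createDirectories(target.getParent());
            Files.write(target, entry.getValue().getBytes(StandardCharsets.UTF_8));
        }
    }

    /** Serving a request then reduces to reading the matching file. */
    static String serve(Path root, String restPath) throws IOException {
        return new String(Files.readAllBytes(fileFor(root, restPath)), StandardCharsets.UTF_8);
    }
}
```

With this layout, answering a request needs no access to the original job data: the request path alone determines which file to return.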
    
    The HS can be started by calling `./bin/historyserver.sh start`, similar to 
the JM/TM. Various config options exist for the HS that mostly mirror the 
web-ui/RPC options of the JM.
    
    The HS uses a slightly modified web-ui; basically it only shows the 
"Completed Jobs" page. To avoid duplicating everything, I've added 2 files, 
`index2.jade` and `index2.coffee`, to the build script. The resulting 
`index2.html` file will be loaded when the browser requests the `index.html`.
    
    In order to re-use the JSON generation code that previously was contained 
in various handlers, a giant utility `JsonUtils` class was created. This class 
now contains a variety of static methods that generate the JSON responses. As a 
result most handlers were reduced to one-liners, bar some sanity-checks.
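
The refactoring pattern described above might look roughly like the following sketch. It is not the `JsonUtils` class from this PR: `JsonUtilsSketch`, `JobSummary`, `jobOverviewJson` and `handleJobOverview` are hypothetical names, and the JSON is built with plain string concatenation only to keep the example free of dependencies.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: names and JSON layout are assumptions, not the PR's code.
public class JsonUtilsSketch {

    /** Minimal stand-in for the job data a handler would receive. */
    static final class JobSummary {
        final String id;
        final String name;
        final String state;

        JobSummary(String id, String name, String state) {
            this.id = id;
            this.name = name;
            this.state = state;
        }
    }

    /** Escapes backslashes and double quotes only; a real implementation needs full JSON escaping. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Shared static generator, usable by a live handler and by an archiver alike. */
    static String jobOverviewJson(List<JobSummary> jobs) {
        return jobs.stream()
                .map(j -> "{\"jid\":\"" + escape(j.id)
                        + "\",\"name\":\"" + escape(j.name)
                        + "\",\"state\":\"" + escape(j.state) + "\"}")
                .collect(Collectors.joining(",", "{\"jobs\":[", "]}"));
    }

    /** After the refactoring, a handler is little more than a delegation. */
    static String handleJobOverview(List<JobSummary> currentJobs) {
        return jobOverviewJson(currentJobs);
    }
}
```

Because the generator is a pure static function of its input, it can be unit-tested without starting a web server, which is what makes tests for the REST responses straightforward.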
    
    In regard to tests we verify that the HS creates all expected files upon 
receiving an ExecutionGraph.
    Furthermore, the newly created JsonUtils are mostly tested (the new 
checkpoint stats aren't tested); so we have tests for the REST responses now, 
which is neat.
    
    I'm not opening a proper PR yet as I have to go through all changes once 
again in detail, but it works (locally and on a cluster), so I wanted people to 
try it out and give me some feedback.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 1579_history_server_b

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3286.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3286
    
----
commit 0fdfeec0c86cba60d271d38cfbce7e4ae759b700
Author: zentol <[email protected]>
Date:   2016-10-17T10:55:19Z

    Add AccessExecutionVertex#getPriorExecutions()

commit 18c4cc6a9e8f3c9b772bcfe8f866e07d2f7304ce
Author: zentol <[email protected]>
Date:   2017-01-30T15:06:13Z

    [FLINK-5645] EG stuff

commit fcc4def5251086d4e37901c58bc47785e1d90788
Author: zentol <[email protected]>
Date:   2017-01-24T09:13:24Z

    [FLINK-1579] Implement History Server - Frontend

commit 2cc6b736c0c5c78903b85f9c1a9ccde8c3ee70b8
Author: zentol <[email protected]>
Date:   2016-10-21T12:29:30Z

    [FLINK-1579] Implement History Server - Backend

commit 0047ae53b9f2f79eee9ec7e76195559b32dbeb20
Author: zentol <[email protected]>
Date:   2017-02-08T08:58:01Z

    [FLINK-1579] Implement History Server - Backend - Tests

commit 730548a7d88c56a2cde235e3d7d92dbf676611b7
Author: zentol <[email protected]>
Date:   2017-02-08T08:58:22Z

    Use JsonUtils in handlers

commit adcc161e46f817e80301d1fb885cdef4a8679a71
Author: zentol <[email protected]>
Date:   2017-02-08T10:23:56Z

    Rebuild web-frontend

commit 3227fc2a12e8aeaaf111339833123da708ccea70
Author: zentol <[email protected]>
Date:   2017-02-08T10:24:14Z

    tmp streaming example with checkpointing

----


> Create a Flink History Server
> -----------------------------
>
>                 Key: FLINK-1579
>                 URL: https://issues.apache.org/jira/browse/FLINK-1579
>             Project: Flink
>          Issue Type: New Feature
>          Components: Distributed Coordination
>    Affects Versions: 0.9
>            Reporter: Robert Metzger
>            Assignee: Chesnay Schepler
>
> Right now it's not possible to analyze the job results for jobs that ran on 
> YARN, because we'll lose the information once the JobManager has stopped.
> Therefore, I propose to implement a "Flink History Server" which serves the 
> results from these jobs.
> I haven't started thinking about the implementation, but I suspect it 
> involves some JSON files stored in HDFS :)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
