[
https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190849#comment-14190849
]
Zhijie Shen commented on SPARK-1537:
------------------------------------
[~vanzin], thanks for introducing the YARN timeline server to Spark. Let me
briefly summarize the current status of the timeline server and address some of
the concerns raised here. Spark folks who are interested in this monitoring
service offered by YARN can go to YARN-1530 to read the design doc and follow
the latest progress.
1. The essential functions of the timeline service have been available since
Hadoop 2.4. Basically, the user can organize the app's history or metrics
according to the timeline data model and post them to the timeline server. Later
on, the user or an admin can come back and query this information to analyze how
the app ran. The essential APIs remain unchanged from 2.4 to the coming 2.6.
There should *NOT* be any incompatible API changes that will block this work.
Moreover, staying compatible is always a consideration for us when we add new
features in subsequent Hadoop releases.
2. It is *NOT* accurate to say that the timeline server is not production-ready.
In fact, Apache Tez has already integrated the timeline server for logging its
history information. In the coming Hadoop 2.6, MapReduce will also be able to
publish its history information to the timeline server. Moreover, within the
scope of YARN, a built-in generic history service on top of the timeline
service is available for YARN users to view all kinds of apps. Hence, with
several successful pioneers, Spark should be confident enough to take advantage
of this new YARN feature.
3. The YARN community is progressing quickly to improve the timeline server in
terms of security (coming in 2.6), high availability, scalability, better
client libs and so on. This work should not disturb Spark's initial attempt to
embrace the timeline server; rather, it will offer a better experience once
Spark is riding on it.
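To make point 1 a bit more concrete, here is a minimal sketch of what organizing
app history according to the timeline data model might look like. It only builds
the JSON body and does not contact any server. The field names (`entity`,
`entitytype`, `starttime`, `events`, `primaryfilters`, `otherinfo`) are my
assumption of the timeline server's JSON layout and should be checked against
the Hadoop docs. The entity type, event types, and application id are
hypothetical placeholders, not anything Spark actually defines.

```python
import json


def build_timeline_entity(entity_id, entity_type, start_ms, events):
    """Build one timeline entity as a plain dict.

    events is a list of (timestamp_ms, event_type, info_dict) tuples.
    Field names are assumed, not confirmed by this thread.
    """
    return {
        "entity": entity_id,
        "entitytype": entity_type,
        "starttime": start_ms,
        "events": [
            {"timestamp": ts, "eventtype": etype, "eventinfo": info}
            for ts, etype, info in events
        ],
        "primaryfilters": {},
        "otherinfo": {},
    }


def build_post_body(entities):
    """Wrap entities in the envelope that would be posted to the server."""
    return json.dumps({"entities": entities}, sort_keys=True)


# Hypothetical example values, for illustration only.
entity = build_timeline_entity(
    "application_0000000000000_0001",
    "HYPOTHETICAL_SPARK_APPLICATION",
    1000,
    [(1000, "APP_START", {"user": "alice"}), (2000, "APP_END", {})],
)
body = build_post_body([entity])
print(body)
```

Querying later would go the other way: fetch entities of the same type from the
timeline server and walk their events to reconstruct how the app ran.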
If you have other high-priority issues to work on, I think [~zhazhan] will be
able to help with this integration. Thanks!
> Add integration with Yarn's Application Timeline Server
> -------------------------------------------------------
>
> Key: SPARK-1537
> URL: https://issues.apache.org/jira/browse/SPARK-1537
> Project: Spark
> Issue Type: New Feature
> Components: YARN
> Reporter: Marcelo Vanzin
> Assignee: Marcelo Vanzin
>
> It would be nice to have Spark integrate with Yarn's Application Timeline
> Server (see YARN-321, YARN-1530). This would allow users running Spark on
> Yarn to have a single place to go for all their history needs, and avoid
> having to manage a separate service (Spark's built-in server).
> At the moment, there's a working version of the ATS in the Hadoop 2.4 branch,
> although there is still some ongoing work. But the basics are there, and I
> wouldn't expect them to change (much) at this point.