I have attached a proposal design document to the SPARK-969 JIRA for discussion. If not assigned to anyone, I would be interested in implementing it.

Best regards,
pillis
On Sun, Jan 19, 2014 at 10:06 PM, Pillis W <pillis.w...@gmail.com> wrote:
> Hi Junluan,
> I was thinking that the SparkContextData structure will mirror what the UI
> needs, as it is a good use case. It could look approximately like this:
>
> - SparkContextData
>   - StagesData
>     - SchedulingMode
>     - Stages[]
>   - StorageData
>   - EnvironmentData
>     - Map<category:String, Map<key:String, value:String>>
>   - ExecutorsData
>     - Executors[]
>
> Metrics can show the data structure in JMX as is.
>
> I was thinking the UI will look exactly the same as it does right now,
> except that there will be a combo box on the title bar to pick the
> SparkContext. Once picked, all the tabs and pages should look exactly the
> same. One additional piece of information that will need to be shown is
> the SparkContext state - running, stopped, etc.
>
> Hope that helps.
> Regards,
> Pillis
>
> On Thu, Jan 16, 2014 at 9:51 PM, Xia, Junluan <junluan....@intel.com> wrote:
>> Hi pillis,
>>
>> Would you mind submitting a more detailed design document? It could cover:
>> 1. what data structures will be exposed to external UI/metrics/third parties
>> 2. what the new UI looks like if you want to persist SparkContextData
>>    periodically
>> 3. .........
>>
>> Any other suggestions?
>>
>> -----Original Message-----
>> From: Pillis Work [mailto:pillis.w...@gmail.com]
>> Sent: Friday, January 17, 2014 9:07 AM
>> To: dev@spark.incubator.apache.org
>> Subject: Re: About Spark job web ui persist(JIRA-969)
>>
>> Hello,
>> If the changes are acceptable, I would like to request assignment of the
>> JIRA to me for implementation.
>> Regards,
>> pillis
>>
>> On Thu, Jan 16, 2014 at 9:28 AM, Pillis Work <pillis.w...@gmail.com> wrote:
>>> Hi Junluan,
>>> 1. Yes, we could persist to HDFS or any FS. I think at a minimum we
>>> should persist it to local disk - that keeps the core simple.
>>> We can think of HDFS interactions as level-2 functionality that can be
>>> implemented once we have a good local implementation.
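The SparkContextData hierarchy proposed above can be sketched as plain data classes. This is an illustrative model only (written in Python for brevity, though Spark itself is Scala); every field name simply mirrors the hierarchy in the mail and is not a real Spark API:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Field names follow the hierarchy proposed in the thread;
# this is an illustrative sketch, not the actual Spark API.

@dataclass
class StagesData:
    scheduling_mode: str = "FIFO"
    stages: List[dict] = field(default_factory=list)

@dataclass
class SparkContextData:
    name: str = ""
    state: str = "running"  # running, stopped, etc.
    stages: StagesData = field(default_factory=StagesData)
    storage: List[dict] = field(default_factory=list)
    # EnvironmentData: Map<category, Map<key, value>>
    environment: Dict[str, Dict[str, str]] = field(default_factory=dict)
    executors: List[dict] = field(default_factory=list)

ctx = SparkContextData(name="demo")
ctx.environment["JVM"] = {"java.version": "1.7"}
```

A structure like this is straightforward to expose through JMX as-is, as the mail suggests, because it is just nested maps and lists.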
>>> The persistence/hydration layer of a SparkContextData can be made
>>> pluggable as a next step.
>>> Also, as mentioned in a previous mail, SparkUI will now show multiple
>>> SparkContexts using data from SparkContextDatas.
>>>
>>> 2. Yes, we could.
>>>
>>> 3. Yes, SparkUI will need a rewrite to deal with SparkContextDatas
>>> (either live, or hydrated from historical JSON).
>>> Regards
>>>
>>> On Thu, Jan 16, 2014 at 8:15 AM, Xia, Junluan <junluan....@intel.com> wrote:
>>>> Hi Pillis,
>>>>
>>>> It sounds good.
>>>> 1. For SparkContextData, I think we could persist it in HDFS rather
>>>> than on local disk (one SparkUI service may show more than one
>>>> SparkContext).
>>>> 2. We could also treat SparkContextData as a metrics input
>>>> (MetricsSource); for a long-running Spark job, SparkContextData would
>>>> then show up in Ganglia/JMX.....
>>>> 3. If we persist SparkContextData periodically, we need to rewrite the
>>>> UI logic, as the Spark UI currently shows information for just one
>>>> timestamp.
>>>>
>>>> -----Original Message-----
>>>> From: Pillis Work [mailto:pillis.w...@gmail.com]
>>>> Sent: Thursday, January 16, 2014 5:37 PM
>>>> To: dev@spark.incubator.apache.org
>>>> Subject: Re: About Spark job web ui persist(JIRA-969)
>>>>
>>>> Hello,
>>>> I wanted to write down at a high level the changes I was thinking of.
>>>> Please feel free to critique and suggest changes.
>>>>
>>>> SparkContext:
>>>> SparkContext will no longer start the UI on startup. Instead, it will
>>>> launch a SparkContextObserver (which has the SparkListener trait), and
>>>> the observer will generate a SparkContextData instance and keep it up
>>>> to date. SparkContextData will have all the historical information
>>>> anyone needs. Stopping a SparkContext stops the SparkContextObserver.
>>>>
>>>> SparkContextData:
>>>> Has all historical information of a SparkContext run. Periodically
>>>> persists itself to disk as JSON.
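The "periodically persists itself to disk as JSON" idea could work roughly as below. This is a hypothetical sketch, not Spark code; the write-then-rename step is an added assumption so that a reader polling the folder never observes a half-written file:

```python
import json
import os

def persist(data, path):
    # Write to a temp file, then atomically rename into place, so a
    # poller never reads a partially written JSON file.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(data, f)
    os.replace(tmp, path)

def hydrate(path):
    # Rebuild the in-memory representation from the persisted JSON,
    # with no live SparkContext involved.
    with open(path) as f:
        return json.load(f)
```

The same file serves both purposes the mail describes: periodic persistence while the context runs, and hydration later for a UI or third-party consumer.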
>>>> Can hydrate itself from the same JSON.
>>>> SparkContextDatas are created without any UI usage. SparkContextData
>>>> can evolve independently of what the UI needs - for example, carrying
>>>> non-UI data needed for third-party integration.
>>>>
>>>> SparkUI:
>>>> No longer needs a SparkContext. It will need an array of
>>>> SparkContextDatas (obtained either by polling a folder or by other
>>>> means). UI pages will, at render time, access the appropriate
>>>> SparkContextData and produce HTML. SparkUI can be started and stopped
>>>> independently of SparkContexts, and multiple SparkContexts can be
>>>> shown in the UI.
>>>>
>>>> I have purposefully not gone into much detail. Please let me know if
>>>> any piece needs to be elaborated.
>>>> Regards,
>>>> Pillis
>>>>
>>>> On Mon, Jan 13, 2014 at 1:32 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>>>>> Pillis - I agree we need to decouple the representation from a
>>>>> particular history server. But why not provide a simple history
>>>>> server people can (optionally) run if they aren't using YARN or
>>>>> Mesos? For people running the standalone cluster scheduler this seems
>>>>> important. Giving them only a JSON dump isn't super consumable for
>>>>> most users.
>>>>>
>>>>> - Patrick
>>>>>
>>>>> On Mon, Jan 13, 2014 at 10:43 AM, Pillis Work <pillis.w...@gmail.com> wrote:
>>>>>> The listeners in SparkUI which update the counters can trigger saves
>>>>>> along the way. The save can be on a 500ms delay after the last
>>>>>> update, to batch changes. This solution would not require a save on
>>>>>> stop().
>>>>>>
>>>>>> On Mon, Jan 13, 2014 at 6:15 AM, Tom Graves <tgraves...@yahoo.com> wrote:
>>>>>>> So the downside to just saving stuff at the end is that if the app
>>>>>>> crashes or exits badly you don't have anything.
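Pillis's 500 ms batching idea above (save only after updates go quiet, so bursts of listener events produce one write) is a classic debounce. A minimal sketch with invented names, assuming a cancel-and-restart timer per update:

```python
import threading

class DebouncedSaver:
    """Batches rapid listener updates: the save fires only after the
    context has been quiet for `delay` seconds (500 ms in the thread).
    Class and method names are illustrative, not Spark's."""

    def __init__(self, save_fn, delay=0.5):
        self._save_fn = save_fn
        self._delay = delay
        self._timer = None
        self._lock = threading.Lock()

    def on_update(self):
        # Each update cancels any pending save and schedules a new one,
        # so only the last update in a burst actually triggers a write.
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._delay, self._save_fn)
            self._timer.daemon = True
            self._timer.start()
```

As the mail notes, this makes a save on stop() unnecessary in the common case, though a crash inside the quiet window would still lose the last burst of events.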
>>>>>>> Hadoop has taken the approach of saving events along the way. But
>>>>>>> Hadoop also uses that history file to start where it left off if
>>>>>>> something bad happens and it gets restarted. I don't think the
>>>>>>> latter really applies to Spark though.
>>>>>>>
>>>>>>> Does Mesos have a history server?
>>>>>>>
>>>>>>> Tom
>>>>>>>
>>>>>>> On Sunday, January 12, 2014 9:22 PM, Pillis Work <pillis.w...@gmail.com> wrote:
>>>>>>>> IMHO, from a pure Spark standpoint, I don't know if having a
>>>>>>>> dedicated history service makes sense as of now - considering that
>>>>>>>> cluster managers have their own history servers. Just showing the
>>>>>>>> UI of historical runs might be too thin a requirement for a full
>>>>>>>> service. Spark should store history information that can later be
>>>>>>>> exposed in the required ways.
>>>>>>>>
>>>>>>>> Since each SparkContext is the logical entry and exit point for
>>>>>>>> doing something useful in Spark, during its stop() it should
>>>>>>>> serialize that run's statistics into a JSON file - like
>>>>>>>> "sc_run_[name]_[start-time].json". When SparkUI.stop() is called,
>>>>>>>> it in turn asks its UI objects (which should implement a trait) to
>>>>>>>> provide either a flat or hierarchical Map of String key/value
>>>>>>>> pairs. This map (flat or hierarchical) is then serialized to a
>>>>>>>> configured path (the default being "var/history").
>>>>>>>>
>>>>>>>> With regards to Mesos or YARN, their applications can, during
>>>>>>>> shutdown, import this Spark history into their history servers -
>>>>>>>> by making API calls etc.
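The file-naming and folder conventions Pillis proposes ("sc_run_[name]_[start-time].json" under a configured path such as "var/history") might look like the sketch below. The timestamp format is my assumption; the thread leaves it unspecified:

```python
import os
import re
import time

def history_file_name(name, start_time):
    # Pattern from the thread: "sc_run_[name]_[start-time].json".
    # The timestamp encoding is an assumption; the mail does not fix one.
    stamp = time.strftime("%Y%m%d-%H%M%S", time.gmtime(start_time))
    return "sc_run_%s_%s.json" % (name, stamp)

def list_history_files(folder):
    # A SparkUI decoupled from live contexts could poll a folder like
    # "var/history" for these files (illustrative, not a real Spark API).
    pat = re.compile(r"^sc_run_.+\.json$")
    return sorted(f for f in os.listdir(folder) if pat.match(f))
```

Embedding the name and start time in the file name lets a history UI enumerate runs without opening each file, which fits the "polling a folder" approach mentioned earlier in the thread.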
>>>>>>>> This way Spark's history information is persisted independently of
>>>>>>>> the cluster framework, and cluster frameworks can import the
>>>>>>>> history when/as needed.
>>>>>>>> Hope this helps.
>>>>>>>> Regards,
>>>>>>>> pillis
>>>>>>>>
>>>>>>>> On Thu, Jan 9, 2014 at 6:13 AM, Tom Graves <tgraves...@yahoo.com> wrote:
>>>>>>>>> Note that it looks like we are planning on adding support for
>>>>>>>>> application-specific frameworks to YARN sooner rather than later.
>>>>>>>>> There is an initial design up here:
>>>>>>>>> https://issues.apache.org/jira/browse/YARN-1530. Note this has
>>>>>>>>> not been reviewed yet, so changes are likely, but it gives an
>>>>>>>>> idea of the general direction. If anyone has comments on how that
>>>>>>>>> might work with Spark, I encourage you to post to the JIRA.
>>>>>>>>>
>>>>>>>>> As Sandy mentioned, it would be very nice if the solution could
>>>>>>>>> be compatible with that.
>>>>>>>>>
>>>>>>>>> Tom
>>>>>>>>>
>>>>>>>>> On Wednesday, January 8, 2014 12:44 AM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>>>>>>>>>> Hey,
>>>>>>>>>>
>>>>>>>>>> YARN-321 is targeted for Hadoop 2.4. The minimum feature set
>>>>>>>>>> doesn't include application-specific data, so that probably
>>>>>>>>>> won't be part of 2.4 unless other things delay the release for a
>>>>>>>>>> while. There are no APIs for it yet, and pluggable UIs have been
>>>>>>>>>> discussed but not agreed upon. I think requirements from Spark
>>>>>>>>>> could be useful in helping shape what gets done there.
>>>>>>>>>> -Sandy
>>>>>>>>>>
>>>>>>>>>> On Tue, Jan 7, 2014 at 4:13 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>>>>>>>>>>> Hey Sandy,
>>>>>>>>>>>
>>>>>>>>>>> Do you know what the status is for YARN-321 and what version of
>>>>>>>>>>> YARN it's targeted for? Also, is there any kind of
>>>>>>>>>>> documentation or API for this? Does it control the presentation
>>>>>>>>>>> of the data itself (e.g. does it actually have its own UI)?
>>>>>>>>>>>
>>>>>>>>>>> @Tom - having an optional history server sounds like a good
>>>>>>>>>>> idea.
>>>>>>>>>>>
>>>>>>>>>>> One question is what format to use for storing the data and how
>>>>>>>>>>> the persisted format relates to XML/HTML generation in the live
>>>>>>>>>>> UI. One idea would be to add JSON as an intermediate format
>>>>>>>>>>> inside the current WebUI; then any JSON page could be persisted
>>>>>>>>>>> and rendered by the history server using the same code. Once a
>>>>>>>>>>> SparkContext exits, it could dump a series of named paths, each
>>>>>>>>>>> with a JSON file. The history server could then load those
>>>>>>>>>>> paths and pass them through the second rendering stage
>>>>>>>>>>> (JSON => XML) to create each page.
>>>>>>>>>>>
>>>>>>>>>>> It would be good if SPARK-969 had a good design doc before
>>>>>>>>>>> anyone starts working on it.
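Patrick's two-stage idea (each page first emits JSON; a shared second stage renders that JSON to markup, so the live UI and a history server reuse the same code) can be sketched as follows. Function names and the table-page shape are invented for illustration:

```python
import json

# Stage 1: a page produces an intermediate JSON representation.
# This is what could be persisted when a SparkContext exits.
def page_to_json(rows, columns):
    return json.dumps({"columns": columns, "rows": rows})

# Stage 2: shared rendering from JSON to markup, usable both by the
# live UI and by a history server reading persisted files.
def json_to_html(page_json):
    page = json.loads(page_json)
    head = "".join("<th>%s</th>" % c for c in page["columns"])
    body = "".join(
        "<tr>" + "".join("<td>%s</td>" % cell for cell in row) + "</tr>"
        for row in page["rows"])
    return "<table><tr>%s</tr>%s</table>" % (head, body)
```

The key property is that stage 2 takes only the JSON as input, so it cannot accidentally depend on a live SparkContext.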
>>>>>>>>>>> - Patrick
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jan 7, 2014 at 3:18 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>>>>>>>>>>>> As a side note, it would be nice to make sure that whatever is
>>>>>>>>>>>> done here will work with the YARN Application History Server
>>>>>>>>>>>> (YARN-321), a generic history server that functions similarly
>>>>>>>>>>>> to MapReduce's JobHistoryServer. It will eventually have the
>>>>>>>>>>>> ability to store application-specific data.
>>>>>>>>>>>>
>>>>>>>>>>>> -Sandy
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jan 7, 2014 at 2:51 PM, Tom Graves <tgraves...@yahoo.com> wrote:
>>>>>>>>>>>>> I don't think you want to save the HTML/XML files. I would
>>>>>>>>>>>>> rather see the info saved into a history file in something
>>>>>>>>>>>>> like a JSON format that could then be re-read, with the web
>>>>>>>>>>>>> UI displaying the info, hopefully without much change to the
>>>>>>>>>>>>> UI parts. For instance, the history server could read the
>>>>>>>>>>>>> file and populate the appropriate Spark data structures that
>>>>>>>>>>>>> the web UI already uses.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I would suggest making the history server an optional server
>>>>>>>>>>>>> that could be run on any node.
>>>>>>>>>>>>> That way, if the load on a particular node becomes too much,
>>>>>>>>>>>>> it could be moved, but you could also run it on the same node
>>>>>>>>>>>>> as the Master. All it really needs to know is where to get
>>>>>>>>>>>>> the history files from, and it needs access to that location.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hadoop actually has a history server for MapReduce which
>>>>>>>>>>>>> works very similarly to what I mention above. One thing to
>>>>>>>>>>>>> keep in mind here is security. You want to make sure that the
>>>>>>>>>>>>> history files can only be read by users who have the
>>>>>>>>>>>>> appropriate permissions. The history server itself could run
>>>>>>>>>>>>> as a superuser who has permission to serve up the files based
>>>>>>>>>>>>> on the ACLs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tuesday, January 7, 2014 8:06 AM, "Xia, Junluan" <junluan....@intel.com> wrote:
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>> The Spark job web UI is no longer available once the job is
>>>>>>>>>>>>>> over, but it is convenient for developers to debug with a
>>>>>>>>>>>>>> persisted job web UI. I have come up with a draft for this
>>>>>>>>>>>>>> issue:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. We could simply save the web pages in HTML/XML format
>>>>>>>>>>>>>> (stages/executors/storage/environment) to a certain location
>>>>>>>>>>>>>> when the job finishes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2.
>>>>>>>>>>>>>> But it is not easy for the user to review the job info with
>>>>>>>>>>>>>> #1, so we could build an extra job history service for
>>>>>>>>>>>>>> developers.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 3. But where would we build this history service? On the
>>>>>>>>>>>>>> Driver node or the Master node?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any suggestions about this improvement?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> regards,
>>>>>>>>>>>>>> Andrew