Hi Pillis,
Would you mind submitting a more detailed design document? It could cover:
1. what data structures will be exposed to external UI/metrics/third parties
2. what the new UI looks like if you want to persist SparkContextData periodically
3. .........
Any other suggestions?

-----Original Message-----
From: Pillis Work [mailto:pillis.w...@gmail.com]
Sent: Friday, January 17, 2014 9:07 AM
To: dev@spark.incubator.apache.org
Subject: Re: About Spark job web ui persist(JIRA-969)

Hello,
If the changes are acceptable, I would like to request assignment of the
JIRA to me for implementation.
Regards
pillis

On Thu, Jan 16, 2014 at 9:28 AM, Pillis Work <pillis.w...@gmail.com> wrote:
> Hi Junluan,
> 1. Yes, we could persist to HDFS or any FS. I think at a minimum we
> should persist it to local disk - that keeps the core simple.
> We can think of HDFS interactions as level-2 functionality that can be
> implemented once we have a good local implementation. The
> persistence/hydration layer of a SparkContextData can be made
> pluggable as a next step.
> Also, as mentioned in the previous mail, SparkUI will now show multiple
> SparkContexts using data from SparkContextDatas.
>
> 2. Yes, we could.
>
> 3. Yes, SparkUI will need a rewrite to deal with SparkContextDatas
> (either live, or hydrated from historical JSONs).
> Regards
>
> On Thu, Jan 16, 2014 at 8:15 AM, Xia, Junluan <junluan....@intel.com> wrote:
>
>> Hi Pillis
>>
>> It sounds good.
>> 1. For SparkContextData, I think we could persist it in HDFS, not on
>> local disk (one SparkUI service may show more than one SparkContext).
>> 2. We could also consider SparkContextData as a metrics
>> input (MetricsSource); for a long-running Spark job, SparkContextData
>> would be shown in Ganglia/JMX .....
>> 3. If we persist SparkContextData periodically, we need to rewrite
>> the UI logic, as the Spark UI currently shows information for only
>> one point in time.
>>
>> -----Original Message-----
>> From: Pillis Work [mailto:pillis.w...@gmail.com]
>> Sent: Thursday, January 16, 2014 5:37 PM
>> To: dev@spark.incubator.apache.org
>> Subject: Re: About Spark job web ui persist(JIRA-969)
>>
>> Hello,
>> I wanted to write down at a high level the changes I was thinking of.
>> Please feel free to critique and suggest changes.
>>
>> SparkContext:
>> SparkContext's start will no longer start the UI. Rather, it will
>> launch a SparkContextObserver (which has the SparkListener trait) that
>> will generate a SparkContextData instance. SparkContextObserver keeps
>> SparkContextData up to date. SparkContextData will have all the
>> historical information anyone needs. Stopping a SparkContext stops the
>> SparkContextObserver.
>>
>> SparkContextData:
>> Has all the historical information of a SparkContext run. Periodically
>> persists itself to disk as JSON. Can hydrate itself from the same JSON.
>> SparkContextDatas are created without any UI usage. SparkContextData
>> can evolve independently of what the UI needs - like having non-UI data
>> needed for third-party integration.
>>
>> SparkUI:
>> No longer needs a SparkContext. Will need an array of SparkContextDatas
>> (either by polling a folder or by other means). UI pages at render time
>> will access the appropriate SparkContextData and produce HTML. SparkUI
>> can be started and stopped independently of SparkContexts. Multiple
>> SparkContexts can be shown in the UI.
>>
>> I have purposefully not gone into much detail. Please let me know if
>> any piece needs to be elaborated.
>> Regards,
>> Pillis
>>
>> On Mon, Jan 13, 2014 at 1:32 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>>
>> > Pillis - I agree we need to decouple the representation from a
>> > particular history server. But why not provide a simple history
>> > server people can (optionally) run if they aren't using YARN or Mesos?
>> > For people running the standalone cluster scheduler this seems
>> > important. Giving them only a JSON dump isn't super consumable for
>> > most users.
>> >
>> > - Patrick
>> >
>> > On Mon, Jan 13, 2014 at 10:43 AM, Pillis Work <pillis.w...@gmail.com> wrote:
>> > > The listeners in SparkUI which update the counters can trigger
>> > > saves along the way.
>> > > The save can be on a 500ms delay after the last update, to batch
>> > > changes.
>> > > This solution would not require a save on stop().
>> > >
>> > > On Mon, Jan 13, 2014 at 6:15 AM, Tom Graves <tgraves...@yahoo.com> wrote:
>> > >
>> > >> So the downside to just saving stuff at the end is that if the app
>> > >> crashes or exits badly you don't have anything. Hadoop has taken the
>> > >> approach of saving events along the way. But Hadoop also uses that
>> > >> history file to start where it left off if something bad happens and
>> > >> it gets restarted.
>> > >> I don't think the latter really applies to Spark though.
>> > >>
>> > >> Does Mesos have a history server?
>> > >>
>> > >> Tom
>> > >>
>> > >> On Sunday, January 12, 2014 9:22 PM, Pillis Work <pillis.w...@gmail.com> wrote:
>> > >>
>> > >> IMHO from a pure Spark standpoint, I don't know if having a
>> > >> dedicated history service makes sense as of now - considering that
>> > >> cluster managers have their own history servers. Just showing a UI
>> > >> of historical runs might be too thin a requirement for a full
>> > >> service. Spark should store history information that can later be
>> > >> exposed in the required ways.
>> > >>
>> > >> Since each SparkContext is the logical entry and exit point for
>> > >> doing something useful in Spark, during its stop() it should
>> > >> serialize that run's statistics into a JSON file - like
>> > >> "sc_run_[name]_[start-time].json".
>> > >> When SparkUI.stop() is called, it in turn asks its UI objects
>> > >> (which should implement a trait) to provide either a flat or
>> > >> hierarchical Map of String key/value pairs. This map (flat or
>> > >> hierarchical) is then serialized to a configured path (the default
>> > >> being "var/history").
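Pillis's persistence scheme quoted above (a run's statistics as a flat map of String key/value pairs, written to a file named like "sc_run_[name]_[start-time].json" under a configured path) could be sketched roughly as follows. Python is used only for brevity, and every name in this sketch is illustrative rather than part of the proposal:

```python
import json
import re
from pathlib import Path

def persist_run(name, started_at_ms, stats, history_dir="var/history"):
    """Serialize one run's flat key/value statistics to a JSON file.

    The file name follows the pattern floated in the thread,
    "sc_run_[name]_[start-time].json"; the sanitization rule and the
    directory layout are assumptions made for this sketch.
    """
    Path(history_dir).mkdir(parents=True, exist_ok=True)
    safe_name = re.sub(r"[^A-Za-z0-9_-]", "_", name)
    path = Path(history_dir) / f"sc_run_{safe_name}_{started_at_ms}.json"
    path.write_text(json.dumps(stats, indent=2))
    return path

def hydrate_run(path):
    """Rebuild the statistics map from a previously persisted file."""
    return json.loads(Path(path).read_text())
```

Hydration is the inverse of persistence, so a history UI could rebuild a run's statistics from the file alone, without a live SparkContext.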
>> > >>
>> > >> With regards to Mesos or YARN, their applications during shutdown
>> > >> can have an API to import this Spark history into their history
>> > >> servers - by making API calls etc.
>> > >>
>> > >> This way Spark's history information is persisted independent of
>> > >> the cluster framework, and cluster frameworks can import the
>> > >> history when/as needed.
>> > >> Hope this helps.
>> > >> Regards,
>> > >> pillis
>> > >>
>> > >> On Thu, Jan 9, 2014 at 6:13 AM, Tom Graves <tgraves...@yahoo.com> wrote:
>> > >>
>> > >> > Note that it looks like we are planning on adding support for
>> > >> > application-specific frameworks to YARN sooner rather than later.
>> > >> > There is an initial design up here:
>> > >> > https://issues.apache.org/jira/browse/YARN-1530.
>> > >> > Note this has not been reviewed yet, so changes are likely, but it
>> > >> > gives an idea of the general direction. If anyone has comments on
>> > >> > how that might work with Spark, I encourage you to post to the
>> > >> > jira.
>> > >> >
>> > >> > As Sandy mentioned, it would be very nice if the solution could
>> > >> > be compatible with that.
>> > >> >
>> > >> > Tom
>> > >> >
>> > >> > On Wednesday, January 8, 2014 12:44 AM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>> > >> >
>> > >> > Hey,
>> > >> >
>> > >> > YARN-321 is targeted for Hadoop 2.4. The minimum feature set
>> > >> > doesn't include application-specific data, so that probably won't
>> > >> > be part of 2.4 unless other things delay the release for a while.
>> > >> > There are no APIs for it yet, and pluggable UIs have been
>> > >> > discussed but not agreed upon. I think requirements from Spark
>> > >> > could be useful in helping shape what gets done there.
>> > >> >
>> > >> > -Sandy
>> > >> >
>> > >> > On Tue, Jan 7, 2014 at 4:13 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>> > >> >
>> > >> > > Hey Sandy,
>> > >> > >
>> > >> > > Do you know what the status is for YARN-321 and what version
>> > >> > > of YARN it's targeted for? Also, is there any kind of
>> > >> > > documentation or API for this? Does it control the presentation
>> > >> > > of the data itself (e.g. does it actually have its own UI)?
>> > >> > >
>> > >> > > @Tom - having an optional history server sounds like a good idea.
>> > >> > >
>> > >> > > One question is what format to use for storing the data and
>> > >> > > how the persisted format relates to XML/HTML generation in
>> > >> > > the live UI. One idea would be to add JSON as an intermediate
>> > >> > > format inside of the current WebUI, and then any JSON page
>> > >> > > could be persisted and rendered by the history server using
>> > >> > > the same code. Once a SparkContext exits it could dump a series
>> > >> > > of named paths, each with a JSON file. Then the history server
>> > >> > > could load those paths and pass them through the second
>> > >> > > rendering stage (JSON => XML) to create each page.
>> > >> > >
>> > >> > > It would be good if SPARK-969 had a good design doc before
>> > >> > > anyone starts working on it.
>> > >> > >
>> > >> > > - Patrick
>> > >> > >
>> > >> > > On Tue, Jan 7, 2014 at 3:18 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>> > >> > > > As a side note, it would be nice to make sure that whatever is
>> > >> > > > done here will work with the YARN Application History Server
>> > >> > > > (YARN-321), a generic history server that functions similarly
>> > >> > > > to MapReduce's JobHistoryServer. It will eventually have the
>> > >> > > > ability to store application-specific data.
>> > >> > > >
>> > >> > > > -Sandy
>> > >> > > >
>> > >> > > > On Tue, Jan 7, 2014 at 2:51 PM, Tom Graves <tgraves...@yahoo.com> wrote:
>> > >> > > >
>> > >> > > >> I don't think you want to save the HTML/XML files. I would
>> > >> > > >> rather see the info saved into a history file in something
>> > >> > > >> like a JSON format that could then be re-read, with the web
>> > >> > > >> UI displaying the info, hopefully without much change to the
>> > >> > > >> UI parts. For instance, perhaps the history server could read
>> > >> > > >> the file and populate the appropriate Spark data structures
>> > >> > > >> that the web UI already uses.
>> > >> > > >>
>> > >> > > >> I would suggest making the history server an optional server
>> > >> > > >> that could be run on any node. That way if the load on a
>> > >> > > >> particular node becomes too much it could be moved, but you
>> > >> > > >> could also run it on the same node as the Master. All it
>> > >> > > >> really needs to know is where to get the history files from,
>> > >> > > >> and it needs access to that location.
>> > >> > > >>
>> > >> > > >> Hadoop actually has a history server for MapReduce which
>> > >> > > >> works very similarly to what I mention above. One thing to
>> > >> > > >> keep in mind here is security. You want to make sure that the
>> > >> > > >> history files can only be read by users who have the
>> > >> > > >> appropriate permissions. The history server itself could run
>> > >> > > >> as a superuser who has permission to serve up the files based
>> > >> > > >> on the ACLs.
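Tom's suggestion above (re-reading persisted JSON and rendering pages from it, rather than saving HTML directly) might look like this minimal sketch. Python is used for brevity; the "sc_run_*.json" file-name pattern and the flat key/value layout are assumptions carried over from earlier in the thread, not part of any agreed design:

```python
import html
import json
from pathlib import Path

def render_run_page(stats):
    """Second rendering stage: turn a hydrated stats map into an HTML table."""
    rows = "".join(
        f"<tr><td>{html.escape(k)}</td><td>{html.escape(str(v))}</td></tr>"
        for k, v in sorted(stats.items())
    )
    return f"<table>{rows}</table>"

def load_history(history_dir):
    """Scan the history folder for persisted runs and render a page per run."""
    pages = {}
    for f in sorted(Path(history_dir).glob("sc_run_*.json")):
        pages[f.stem] = render_run_page(json.loads(f.read_text()))
    return pages
```

Because rendering happens at read time, the same rendering code path could in principle serve both live and historical runs, along the lines Patrick describes.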
>> > >> > > >>
>> > >> > > >> On Tuesday, January 7, 2014 8:06 AM, "Xia, Junluan" <junluan....@intel.com> wrote:
>> > >> > > >>
>> > >> > > >> Hi all
>> > >> > > >> The Spark job web UI will not be available when the job is
>> > >> > > >> over, but it is convenient for developers to debug with a
>> > >> > > >> persisted job web UI. I have just come up with a draft for
>> > >> > > >> this issue.
>> > >> > > >>
>> > >> > > >> 1. We could simply save the web pages in HTML/XML format
>> > >> > > >> (stages/executors/storages/environment) to a certain
>> > >> > > >> location when the job finishes.
>> > >> > > >>
>> > >> > > >> 2. But it is not easy for users to review the job info with
>> > >> > > >> #1; we could build an extra job history service for
>> > >> > > >> developers.
>> > >> > > >>
>> > >> > > >> 3. But where would we build this history service? On the
>> > >> > > >> Driver node or the Master node?
>> > >> > > >>
>> > >> > > >> Any suggestions about this improvement?
>> > >> > > >>
>> > >> > > >> regards,
>> > >> > > >> Andrew
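Tom's security point earlier in the thread (history files readable only by users with the appropriate permissions, enforced by a superuser-style history server) amounts to an ACL filter in front of the page-serving path. A minimal sketch, assuming a simple per-run ACL map that is purely illustrative and not part of any proposal here:

```python
def readable_runs(run_acls, user):
    """Return the persisted runs this user may view.

    run_acls maps a run's file name to the set of users allowed to read
    it; "*" is used here as a wildcard meaning world-readable. A history
    server would apply such a filter before serving any run's page.
    """
    return sorted(
        name for name, acl in run_acls.items()
        if user in acl or "*" in acl
    )
```

The real check would consult the cluster's ACL mechanism rather than an in-memory map, but the shape of the filter is the same.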