Hi Pillis,
Would you mind submitting a more detailed design document? It could cover:
1. what data structures will be exposed to external UI/metrics/third parties
2. what the new UI looks like if you want to persist SparkContextData periodically
3. .........
Any other suggestions?

-----Original Message-----
From: Pillis Work [mailto:pillis.w...@gmail.com]
Sent: Friday, January 17, 2014 9:07 AM
To: dev@spark.incubator.apache.org
Subject: Re: About Spark job web ui persist(JIRA-969)

Hello,
If the changes are acceptable, I would like to request assignment of the
JIRA to me for implementation.
Regards
pillis

On Thu, Jan 16, 2014 at 9:28 AM, Pillis Work <pillis.w...@gmail.com> wrote:
> Hi Junluan,
> 1. Yes, we could persist to HDFS or any FS. I think at a minimum we
> should persist it to local disk - that keeps the core simple.
> We can think of HDFS interactions as level-2 functionality that can be
> implemented once we have a good local implementation. The
> persistence/hydration layer of a SparkContextData can be made
> pluggable as a next step.
> Also, as mentioned in the previous mail, SparkUI will now show multiple
> SparkContexts using data from SparkContextDatas.
>
> 2. Yes, we could.
>
> 3. Yes, SparkUI will need a rewrite to deal with SparkContextDatas
> (either live, or hydrated from historical JSONs).
> Regards
>
> On Thu, Jan 16, 2014 at 8:15 AM, Xia, Junluan <junluan....@intel.com> wrote:
>
>> Hi Pillis
>>
>> It sounds good.
>> 1. For SparkContextData, I think we could persist it in HDFS, not on
>> local disk (one SparkUI service may show more than one SparkContext).
>> 2. We could also consider SparkContextData as a metrics
>> input (MetricsSource); for a long-running Spark job, SparkContextData
>> would be shown in Ganglia/JMX .....
>> 3. If we persist SparkContextData periodically, we need to rewrite
>> the UI logic, as the Spark UI currently shows information for only
>> one point in time.
>>
>> -----Original Message-----
>> From: Pillis Work [mailto:pillis.w...@gmail.com]
>> Sent: Thursday, January 16, 2014 5:37 PM
>> To: dev@spark.incubator.apache.org
>> Subject: Re: About Spark job web ui persist(JIRA-969)
>>
>> Hello,
>> I wanted to write down at a high level the changes I was thinking of.
>> Please feel free to critique and suggest changes.
>>
>> SparkContext:
>> SparkContext's start will no longer start the UI. Rather, it will
>> launch a SparkContextObserver (which has the SparkListener trait) that
>> will generate a SparkContextData instance. SparkContextObserver keeps
>> SparkContextData up to date. SparkContextData will have all the
>> historical information anyone needs. Stopping a SparkContext stops the
>> SparkContextObserver.
>>
>> SparkContextData:
>> Has all the historical information of a SparkContext run. Periodically
>> persists itself to disk as JSON. Can hydrate itself from the same JSON.
>> SparkContextDatas are created without any UI usage. SparkContextData
>> can evolve independently of what the UI needs - like having non-UI data
>> needed for third-party integration.
>>
>> SparkUI:
>> No longer needs a SparkContext. Will need an array of SparkContextDatas
>> (either by polling a folder or by other means). UI pages at render time
>> will access the appropriate SparkContextData and produce HTML. SparkUI
>> can be started and stopped independently of SparkContexts. Multiple
>> SparkContexts can be shown in the UI.
>>
>> I have purposefully not gone into much detail. Please let me know if
>> any piece needs to be elaborated.
>> Regards,
>> Pillis
>>
>> On Mon, Jan 13, 2014 at 1:32 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>>
>> > Pillis - I agree we need to decouple the representation from a
>> > particular history server. But why not provide a simple history
>> > server people can (optionally) run if they aren't using YARN or Mesos?
>> > For people running the standalone cluster scheduler this seems
>> > important. Giving them only a JSON dump isn't super consumable for
>> > most users.
>> >
>> > - Patrick
>> >
>> > On Mon, Jan 13, 2014 at 10:43 AM, Pillis Work <pillis.w...@gmail.com> wrote:
>> > > The listeners in SparkUI which update the counters can trigger
>> > > saves along the way.
>> > > The save can be on a 500ms delay after the last update, to batch
>> > > changes.
>> > > This solution would not require a save on stop().
>> > >
>> > > On Mon, Jan 13, 2014 at 6:15 AM, Tom Graves <tgraves...@yahoo.com> wrote:
>> > >
>> > >> So the downside to just saving stuff at the end is that if the app
>> > >> crashes or exits badly you don't have anything. Hadoop has taken the
>> > >> approach of saving events along the way. But Hadoop also uses that
>> > >> history file to start where it left off if something bad happens and
>> > >> it gets restarted.
>> > >> I don't think the latter really applies to Spark though.
>> > >>
>> > >> Does Mesos have a history server?
>> > >>
>> > >> Tom
>> > >>
>> > >> On Sunday, January 12, 2014 9:22 PM, Pillis Work <pillis.w...@gmail.com> wrote:
>> > >>
>> > >> IMHO from a pure Spark standpoint, I don't know if having a
>> > >> dedicated history service makes sense as of now - considering that
>> > >> cluster managers have their own history servers. Just showing a UI
>> > >> of historical runs might be too thin a requirement for a full
>> > >> service. Spark should store history information that can later be
>> > >> exposed in the required ways.
>> > >>
>> > >> Since each SparkContext is the logical entry and exit point for
>> > >> doing something useful in Spark, during its stop() it should
>> > >> serialize that run's statistics into a JSON file - like
>> > >> "sc_run_[name]_[start-time].json".
>> > >> When SparkUI.stop() is called, it in turn asks its UI objects
>> > >> (which should implement a trait) to provide either a flat or
>> > >> hierarchical Map of String key/value pairs. This map (flat or
>> > >> hierarchical) is then serialized to a configured path (the default
>> > >> being "var/history").
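Pillis's persistence scheme quoted above (a run's statistics as a flat map of String key/value pairs, written to a file named like "sc_run_[name]_[start-time].json" under a configured path) could be sketched roughly as follows. Python is used only for brevity, and every name in this sketch is illustrative rather than part of the proposal:

```python
import json
import re
from pathlib import Path

def persist_run(name, started_at_ms, stats, history_dir="var/history"):
    """Serialize one run's flat key/value statistics to a JSON file.

    The file name follows the pattern floated in the thread,
    "sc_run_[name]_[start-time].json"; the sanitization rule and the
    directory layout are assumptions made for this sketch.
    """
    Path(history_dir).mkdir(parents=True, exist_ok=True)
    safe_name = re.sub(r"[^A-Za-z0-9_-]", "_", name)
    path = Path(history_dir) / f"sc_run_{safe_name}_{started_at_ms}.json"
    path.write_text(json.dumps(stats, indent=2))
    return path

def hydrate_run(path):
    """Rebuild the statistics map from a previously persisted file."""
    return json.loads(Path(path).read_text())
```

Hydration is the inverse of persistence, so a history UI could rebuild a run's statistics from the file alone, without a live SparkContext.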
>> > >>
>> > >> With regards to Mesos or YARN, their applications during shutdown
>> > >> can have an API to import this Spark history into their history
>> > >> servers - by making API calls etc.
>> > >>
>> > >> This way Spark's history information is persisted independent of
>> > >> the cluster framework, and cluster frameworks can import the
>> > >> history when/as needed.
>> > >> Hope this helps.
>> > >> Regards,
>> > >> pillis
>> > >>
>> > >> On Thu, Jan 9, 2014 at 6:13 AM, Tom Graves <tgraves...@yahoo.com> wrote:
>> > >>
>> > >> > Note that it looks like we are planning on adding support for
>> > >> > application-specific frameworks to YARN sooner rather than later.
>> > >> > There is an initial design up here:
>> > >> > https://issues.apache.org/jira/browse/YARN-1530.
>> > >> > Note this has not been reviewed yet, so changes are likely, but it
>> > >> > gives an idea of the general direction. If anyone has comments on
>> > >> > how that might work with Spark, I encourage you to post to the
>> > >> > jira.
>> > >> >
>> > >> > As Sandy mentioned, it would be very nice if the solution could
>> > >> > be compatible with that.
>> > >> >
>> > >> > Tom
>> > >> >
>> > >> > On Wednesday, January 8, 2014 12:44 AM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>> > >> >
>> > >> > Hey,
>> > >> >
>> > >> > YARN-321 is targeted for Hadoop 2.4. The minimum feature set
>> > >> > doesn't include application-specific data, so that probably won't
>> > >> > be part of 2.4 unless other things delay the release for a while.
>> > >> > There are no APIs for it yet, and pluggable UIs have been
>> > >> > discussed but not agreed upon. I think requirements from Spark
>> > >> > could be useful in helping shape what gets done there.
>> > >> >
>> > >> > -Sandy
>> > >> >
>> > >> > On Tue, Jan 7, 2014 at 4:13 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>> > >> >
>> > >> > > Hey Sandy,
>> > >> > >
>> > >> > > Do you know what the status is for YARN-321 and what version
>> > >> > > of YARN it's targeted for? Also, is there any kind of
>> > >> > > documentation or API for this? Does it control the presentation
>> > >> > > of the data itself (e.g. does it actually have its own UI)?
>> > >> > >
>> > >> > > @Tom - having an optional history server sounds like a good idea.
>> > >> > >
>> > >> > > One question is what format to use for storing the data and
>> > >> > > how the persisted format relates to XML/HTML generation in
>> > >> > > the live UI. One idea would be to add JSON as an intermediate
>> > >> > > format inside of the current WebUI, and then any JSON page
>> > >> > > could be persisted and rendered by the history server using
>> > >> > > the same code. Once a SparkContext exits it could dump a series
>> > >> > > of named paths, each with a JSON file. Then the history server
>> > >> > > could load those paths and pass them through the second
>> > >> > > rendering stage (JSON => XML) to create each page.
>> > >> > >
>> > >> > > It would be good if SPARK-969 had a good design doc before
>> > >> > > anyone starts working on it.
>> > >> > >
>> > >> > > - Patrick
>> > >> > >
>> > >> > > On Tue, Jan 7, 2014 at 3:18 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>> > >> > > > As a side note, it would be nice to make sure that whatever is
>> > >> > > > done here will work with the YARN Application History Server
>> > >> > > > (YARN-321), a generic history server that functions similarly
>> > >> > > > to MapReduce's JobHistoryServer. It will eventually have the
>> > >> > > > ability to store application-specific data.
>> > >> > > >
>> > >> > > > -Sandy
>> > >> > > >
>> > >> > > > On Tue, Jan 7, 2014 at 2:51 PM, Tom Graves <tgraves...@yahoo.com> wrote:
>> > >> > > >
>> > >> > > >> I don't think you want to save the HTML/XML files. I would
>> > >> > > >> rather see the info saved into a history file in something
>> > >> > > >> like a JSON format that could then be re-read, with the web
>> > >> > > >> UI displaying the info, hopefully without much change to the
>> > >> > > >> UI parts. For instance, perhaps the history server could read
>> > >> > > >> the file and populate the appropriate Spark data structures
>> > >> > > >> that the web UI already uses.
>> > >> > > >>
>> > >> > > >> I would suggest making the history server an optional server
>> > >> > > >> that could be run on any node. That way if the load on a
>> > >> > > >> particular node becomes too much it could be moved, but you
>> > >> > > >> could also run it on the same node as the Master. All it
>> > >> > > >> really needs to know is where to get the history files from,
>> > >> > > >> and it needs access to that location.
>> > >> > > >>
>> > >> > > >> Hadoop actually has a history server for MapReduce which
>> > >> > > >> works very similarly to what I mention above. One thing to
>> > >> > > >> keep in mind here is security. You want to make sure that the
>> > >> > > >> history files can only be read by users who have the
>> > >> > > >> appropriate permissions. The history server itself could run
>> > >> > > >> as a superuser who has permission to serve up the files based
>> > >> > > >> on the ACLs.
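Tom's suggestion above (re-reading persisted JSON and rendering pages from it, rather than saving HTML directly) might look like this minimal sketch. Python is used for brevity; the "sc_run_*.json" file-name pattern and the flat key/value layout are assumptions carried over from earlier in the thread, not part of any agreed design:

```python
import html
import json
from pathlib import Path

def render_run_page(stats):
    """Second rendering stage: turn a hydrated stats map into an HTML table."""
    rows = "".join(
        f"<tr><td>{html.escape(k)}</td><td>{html.escape(str(v))}</td></tr>"
        for k, v in sorted(stats.items())
    )
    return f"<table>{rows}</table>"

def load_history(history_dir):
    """Scan the history folder for persisted runs and render a page per run."""
    pages = {}
    for f in sorted(Path(history_dir).glob("sc_run_*.json")):
        pages[f.stem] = render_run_page(json.loads(f.read_text()))
    return pages
```

Because rendering happens at read time, the same rendering code path could in principle serve both live and historical runs, along the lines Patrick describes.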
>> > >> > > >>
>> > >> > > >> On Tuesday, January 7, 2014 8:06 AM, "Xia, Junluan" <junluan....@intel.com> wrote:
>> > >> > > >>
>> > >> > > >> Hi all
>> > >> > > >> The Spark job web UI will not be available when the job is
>> > >> > > >> over, but it is convenient for developers to debug with a
>> > >> > > >> persisted job web UI. I have just come up with a draft for
>> > >> > > >> this issue.
>> > >> > > >>
>> > >> > > >> 1. We could simply save the web pages in HTML/XML format
>> > >> > > >> (stages/executors/storages/environment) to a certain
>> > >> > > >> location when the job finishes.
>> > >> > > >>
>> > >> > > >> 2. But it is not easy for users to review the job info with
>> > >> > > >> #1; we could build an extra job history service for
>> > >> > > >> developers.
>> > >> > > >>
>> > >> > > >> 3. But where would we build this history service? On the
>> > >> > > >> Driver node or the Master node?
>> > >> > > >>
>> > >> > > >> Any suggestions about this improvement?
>> > >> > > >>
>> > >> > > >> regards,
>> > >> > > >> Andrew
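Tom's security point earlier in the thread (history files readable only by users with the appropriate permissions, enforced by a superuser-style history server) amounts to an ACL filter in front of the page-serving path. A minimal sketch, assuming a simple per-run ACL map that is purely illustrative and not part of any proposal here:

```python
def readable_runs(run_acls, user):
    """Return the persisted runs this user may view.

    run_acls maps a run's file name to the set of users allowed to read
    it; "*" is used here as a wildcard meaning world-readable. A history
    server would apply such a filter before serving any run's page.
    """
    return sorted(
        name for name, acl in run_acls.items()
        if user in acl or "*" in acl
    )
```

The real check would consult the cluster's ACL mechanism rather than an in-memory map, but the shape of the filter is the same.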