[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345615#comment-16345615 ] Marcelo Vanzin commented on SPARK-18085: If you want the short and dirty description: these changes decouple the application status data storage from the UI code, and allow different ways for storing the status data. Shipped with 2.3 are in-memory and disk storage options. This allows the SHS to use disk-based storage to use less memory and serve data more quickly when restarted. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin >Priority: Major > Labels: SPIP > Fix For: 2.3.0 > > Attachments: screenshot-1.png, screenshot-2.png, spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344339#comment-16344339 ] Alex Bozarth commented on SPARK-18085: -- [~vanzin] since this is complete and going into 2.3 I was hoping you could help write a short overview of the SPIP, like what's changed and why. Given how much this evolved during the project I'm not sure if the original pitch is the best description anymore and I would like to have a nice summary to describe the project's impact. Thanks > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin >Priority: Major > Labels: SPIP > Fix For: 2.3.0 > > Attachments: screenshot-1.png, screenshot-2.png, spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16164187#comment-16164187 ] jincheng commented on SPARK-18085: -- Well done! In fact , shs-ng/HEAD performs better than shs-ng/m9, especially in terms of page rending > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: screenshot-1.png, screenshot-2.png, spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16163585#comment-16163585 ] Marcelo Vanzin commented on SPARK-18085: I filed SPARK-21987 for the event log issue (lest we forget to do it). > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: screenshot-1.png, screenshot-2.png, spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16158997#comment-16158997 ] Marcelo Vanzin commented on SPARK-18085: [~jincheng] that is caused by SPARK-17701. The bug is still open but the patch has actually been committed, and it removes a property of {{SparkPlanInfo}} that makes Spark 2.3 unable to read event logs from earlier versions. Can you file a new bug with that information? Thanks. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: screenshot-1.png, screenshot-2.png, spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16158865#comment-16158865 ] Marcelo Vanzin commented on SPARK-18085: Do you really mean fixed, as in you're not seeing it anymore, or introduced? Anyway, I'll take a look; might be something that changed since I last rebased my branch. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: screenshot-1.png, screenshot-2.png, spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16158247#comment-16158247 ] jincheng commented on SPARK-18085: -- {code:java} com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field "metadata" (class org.apache.spark.sql.execution.SparkPlanInfo), not marked as ignorable (4 known properties: "simpleString", "nodeName", "children", "metrics"]) at [Source: {"Event":"org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionStart","executionId":0,"description":"json at NativeMethodAccessorImpl.java:0","details":"org.apache.spark.sql.DataFrameWriter.json(DataFrameWriter.scala:487)\nsun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\nsun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\nsun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\njava.lang.reflect.Method.invoke(Method.java:498)\npy4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\npy4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\npy4j.Gateway.invoke(Gateway.java:280)\npy4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\npy4j.commands.CallCommand.execute(CallCommand.java:79)\npy4j.GatewayConnection.run(GatewayConnection.java:214)\njava.lang.Thread.run(Thread.java:748)","physicalPlanDescription":"== Parsed Logical Plan ==\nRepartition 200, true\n+- LogicalRDD [uid#327L, gids#328]\n\n== Analyzed Logical Plan ==\nuid: bigint, gids: array\nRepartition 200, true\n+- LogicalRDD [uid#327L, gids#328]\n\n== Optimized Logical Plan ==\nRepartition 200, true\n+- LogicalRDD [uid#327L, gids#328]\n\n== Physical Plan ==\nExchange RoundRobinPartitioning(200)\n+- Scan ExistingRDD[uid#327L,gids#328]","sparkPlanInfo":{"nodeName":"Exchange","simpleString":"Exchange RoundRobinPartitioning(200)","children":[{"nodeName":"ExistingRDD","simpleString":"Scan ExistingRDD[uid#327L,gids#328]","children":[],"metadata":{},"metrics":[{"name":"number of output rows","accumulatorId":140,"metricType":"sum"}]}],"metadata":{},"metrics":[{"name":"data size total (min, med, max)","accumulatorId":139,"metricType":"size"}]},"time":1504837052948}; line: 1, column: 1622] (through reference chain: org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionStart["sparkPlanInfo"]->org.apache.spark.sql.execution.SparkPlanInfo["children"]->com.fasterxml.jackson.module.scala.deser.BuilderWrapper[0]->org.apache.spark.sql.execution.SparkPlanInfo["metadata"]) at com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException.from(UnrecognizedPropertyException.java:51) at com.fasterxml.jackson.databind.DeserializationContext.reportUnknownProperty(DeserializationContext.java:839) at com.fasterxml.jackson.databind.deser.std.StdDeserializer.handleUnknownProperty(StdDeserializer.java:1045) at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownProperty(BeanDeserializerBase.java:1352) at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownProperties(BeanDeserializerBase.java:1306) at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:399) at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1099) at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:296) at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:133) at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:245) at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:217) at com.fasterxml.jackson.module.scala.deser.SeqDeserializer.deserialize(SeqDeserializerModule.scala:76) at com.fasterxml.jackson.module.scala.deser.SeqDeserializer.deserialize(SeqDeserializerModule.scala:59) at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:520) at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeWithErrorWrapping(BeanDeserializer.java:463) at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:378) at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1099) at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:296) at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:133) at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:520)
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156784#comment-16156784 ] jincheng commented on SPARK-18085: -- thanks. it was ok! > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: screenshot-1.png, screenshot-2.png, spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154363#comment-16154363 ] Marcelo Vanzin commented on SPARK-18085: That should be fixed if you sync to my repo's 'shs-ng/HEAD'. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: screenshot-1.png, screenshot-2.png, spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154336#comment-16154336 ] Marcelo Vanzin commented on SPARK-18085: Ok I think I was able to reproduce that locally. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: screenshot-1.png, screenshot-2.png, spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16152409#comment-16152409 ] jincheng commented on SPARK-18085: -- *{color:red}Here is a picture of how it looks{color}* !screenshot-1.png! {color:red}*and I also tried in spark 2.0.it looks like this *{color} !screenshot-2.png! the code is located at : org.apache.spark.ui.jobs.TaskPagedTable.table(StagePage.scala:108) and it calls pagetable.pageData(PagedTable.scala:56) and throws an exception {code:java} def pageData(page: Int): PageData[T] = { val totalPages = (dataSize + pageSize - 1) / pageSize if (page <= 0 || page > totalPages) { throw new IndexOutOfBoundsException( s"Page $page is out of range. Please select a page number between 1 and $totalPages.") } val from = (page - 1) * pageSize val to = dataSize.min(page * pageSize) PageData(totalPages, sliceData(from, to)) } {code} it looks page=1 but totalPages = 0. so datasize + pagesize = 1. as {code:java} private[ui] abstract class PagedDataSource[T](val pageSize: Int) { if (pageSize <= 0) { throw new IllegalArgumentException("Page size must be positive") } {code} we did not meet this exception. so datasize = 0. this matches the case that no completed tasks, but instead all failed tasks should displayed just like spark 2.0. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: screenshot-1.png, screenshot-2.png, spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16148093#comment-16148093 ] Marcelo Vanzin commented on SPARK-18085: [~jincheng] do you have code that can reproduce that? The code in the exception hasn't really changed, so this is probably an artifact of how the new code is recording data from the application. For your description (when we clicking stages with no tasks successful) I haven't been able to reproduce this; a stage that has only failed tasks still renders fine with the code in my branch. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16141091#comment-16141091 ] jincheng commented on SPARK-18085: -- it is really a good idea to speed up history server with levelDB. I met a problem with following exception. {code:java} java.lang.IndexOutOfBoundsException: Page 1 is out of range. Please select a page number between 1 and 0. at org.apache.spark.ui.PagedDataSource.pageData(PagedTable.scala:56) at org.apache.spark.ui.PagedTable$class.table(PagedTable.scala:108) at org.apache.spark.ui.jobs.TaskPagedTable.table(StagePage.scala:703) at org.apache.spark.ui.jobs.StagePage.liftedTree1$1(StagePage.scala:296) at org.apache.spark.ui.jobs.StagePage.render(StagePage.scala:285) at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:82) at org.apache.spark.ui.WebUI$$anonfun$2.apply(WebUI.scala:82) at org.apache.spark.ui.JettyUtils$$anon$3.doGet(JettyUtils.scala:88) at javax.servlet.http.HttpServlet.service(HttpServlet.java:687) at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:845) at org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1689) at org.apache.spark.deploy.history.ApplicationCacheCheckFilter.doFilter(ApplicationCache.scala:438) at org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1676) at org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581) at org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) at org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511) at org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) at org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213) at org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) at org.spark_project.jetty.server.Server.handle(Server.java:524) at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:319) at org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:253) at org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273) at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:95) at org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) at org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) at org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) at java.lang.Thread.run(Thread.java:748) {code} it will occur when we clicking stages with no tasks successful. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125922#comment-16125922 ] Marcelo Vanzin commented on SPARK-18085: If your only problem is how fast to load a 100g event log, then correct, this change alone won't help you. I recommend you check the linked issues in this bug, some of which track things closer to what you want. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125150#comment-16125150 ] duyanghao commented on SPARK-18085: --- [~vanzin] so what you mean is that your project has done nothing about loading large event logs(100G+) and loading history summary page when the event log directory is large at the first time. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124731#comment-16124731 ] Marcelo Vanzin commented on SPARK-18085: [~duyanghao] that should all be explained in the document attached to this bug. I encourage you to read it if you're looking for details, or take a look at the work-in-progress code linked in many comments above. You're also welcome to run the code against your event logs and report any problems. Note that no part of this work is about speeding up the loading of logs; loading an event log from scratch will most probably become slower now, when writing data to disk. The goal here is to control memory usage of the SHS, and to only have to process event logs once. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124421#comment-16124421 ] duyanghao commented on SPARK-18085: --- [~vanzin] Could you have a summary of your progress(solved problems and unsolved problems) since we all have the same problems for loading large event logs(100G+) and loading history summary page when the event log directory is large. We all want to know how good your solution is(of course from your test results)? > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16090042#comment-16090042 ] Marcelo Vanzin commented on SPARK-18085: 👍 > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089332#comment-16089332 ] Reynold Xin commented on SPARK-18085: - [~vanzin] That's actually not true anymore. It was re-licensed last night. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088758#comment-16088758 ] Marcelo Vanzin commented on SPARK-18085: Just for historical purposes, since it's mentioned in the spec: RocksDB is off-limits. See LEGAL-303. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084451#comment-16084451 ] Ye Zhou commented on SPARK-18085: - I want to add my own testing experience with the codes from the HEAD of your repo. I have built and deployed this Spark History Server 2 weeks ago and set the log dir to our dev cluster where around 1.5k spark applications daily application logs will be produced. And we continuously generate the loads to it by rest call. It works well. Thanks for the work. I have gone through the codes partially and have some minor improvement ideas to fit our internal need especially if there are more than 50K log files in the log dir. Some of them might be also useful to upstream. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084369#comment-16084369 ] Marcelo Vanzin commented on SPARK-18085: [~kanjilal] the code is pretty much all written at this point, most of the help I'd need right now is with testing and code reviews. For those arriving now, all the code is at https://github.com/vanzin/spark/commits/shs-ng/HEAD, and you can check the sub-tasks for which patches are currently under review (since sending all at the same time would create a huge diff). > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084361#comment-16084361 ] Saikat Kanjilal commented on SPARK-18085: - [~vanzin] I would be interested in helping with this, I will first read the proposal and go through this thread and add my feedback, barring that are there areas where you need immediate help? > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084277#comment-16084277 ] Reynold Xin commented on SPARK-18085: - You should email dev@ to notify the list about a new SPIP. Thanks. > SPIP: Better History Server scalability for many / large applications > - > > Key: SPARK-18085 > URL: https://issues.apache.org/jira/browse/SPARK-18085 > Project: Spark > Issue Type: Umbrella > Components: Spark Core, Web UI >Affects Versions: 2.0.0 >Reporter: Marcelo Vanzin > Labels: SPIP > Attachments: spark_hs_next_gen.pdf > > > It's a known fact that the History Server currently has some annoying issues > when serving lots of applications, and when serving large applications. > I'm filing this umbrella to track work related to addressing those issues. > I'll be attaching a document shortly describing the issues and suggesting a > path to how to solve them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org