[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts
[ https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16103035#comment-16103035 ] Stavros Kontopoulos commented on SPARK-15142: - [~devaraj.k] Thnx. Btw I was not able to reproduce the problem with drivers being queued. After master was up then things went fine. I was able to launch more drivers. https://gist.github.com/skonto/0e23af643e7271e0125e321b04e630a2 > Spark Mesos dispatcher becomes unusable when the Mesos master restarts > -- > > Key: SPARK-15142 > URL: https://issues.apache.org/jira/browse/SPARK-15142 > Project: Spark > Issue Type: Bug > Components: Deploy, Mesos >Reporter: Devaraj K >Priority: Minor > Attachments: > spark-devaraj-org.apache.spark.deploy.mesos.MesosClusterDispatcher-1-stobdtserver5.out > > > While Spark Mesos dispatcher running if the Mesos master gets restarted then > Spark Mesos dispatcher will keep running and queues up all the submitted > applications and will not launch them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts
[ https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100079#comment-16100079 ] Devaraj K commented on SPARK-15142: --- bq. That means there is no way to detect the new master while the dispatcher is still alive, it must be restarted when the new master is up, correct? Yes, the https://github.com/apache/spark/pull/13143 doesn't handle discovery of new master when the dispatcher is still alive. You can reopen this jira and create a PR if you want to work on. > Spark Mesos dispatcher becomes unusable when the Mesos master restarts > -- > > Key: SPARK-15142 > URL: https://issues.apache.org/jira/browse/SPARK-15142 > Project: Spark > Issue Type: Bug > Components: Deploy, Mesos >Reporter: Devaraj K >Priority: Minor > Attachments: > spark-devaraj-org.apache.spark.deploy.mesos.MesosClusterDispatcher-1-stobdtserver5.out > > > While Spark Mesos dispatcher running if the Mesos master gets restarted then > Spark Mesos dispatcher will keep running and queues up all the submitted > applications and will not launch them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts
[ https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099923#comment-16099923 ] Stavros Kontopoulos commented on SPARK-15142: - [~devaraj.k] Great, so one question to clarify things, if the dispatcher is running and then the master is restarted, a DRIVER_ABORTED is fired and its handled properly? In that PR you mention it says: "I think we need to throw the exception and shutdown the dispatcher." and so later some other PRs like (https://github.com/apache/spark/pull/13072) handle the shutdown process is that right? > Spark Mesos dispatcher becomes unusable when the Mesos master restarts > -- > > Key: SPARK-15142 > URL: https://issues.apache.org/jira/browse/SPARK-15142 > Project: Spark > Issue Type: Bug > Components: Deploy, Mesos >Reporter: Devaraj K >Priority: Minor > Attachments: > spark-devaraj-org.apache.spark.deploy.mesos.MesosClusterDispatcher-1-stobdtserver5.out > > > While Spark Mesos dispatcher running if the Mesos master gets restarted then > Spark Mesos dispatcher will keep running and queues up all the submitted > applications and will not launch them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts
[ https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099535#comment-16099535 ] Devaraj K commented on SPARK-15142: --- [~skonto] Thanks for showing interest on this. I have already created a PR for SPARK-15359 which fixes this issue. > Spark Mesos dispatcher becomes unusable when the Mesos master restarts > -- > > Key: SPARK-15142 > URL: https://issues.apache.org/jira/browse/SPARK-15142 > Project: Spark > Issue Type: Bug > Components: Deploy, Mesos >Reporter: Devaraj K >Priority: Minor > Attachments: > spark-devaraj-org.apache.spark.deploy.mesos.MesosClusterDispatcher-1-stobdtserver5.out > > > While Spark Mesos dispatcher running if the Mesos master gets restarted then > Spark Mesos dispatcher will keep running and queues up all the submitted > applications and will not launch them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts
[ https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16098239#comment-16098239 ] Stavros Kontopoulos commented on SPARK-15142: - I will give it a shot. > Spark Mesos dispatcher becomes unusable when the Mesos master restarts > -- > > Key: SPARK-15142 > URL: https://issues.apache.org/jira/browse/SPARK-15142 > Project: Spark > Issue Type: Bug > Components: Deploy, Mesos >Reporter: Devaraj K >Priority: Minor > Attachments: > spark-devaraj-org.apache.spark.deploy.mesos.MesosClusterDispatcher-1-stobdtserver5.out > > > While Spark Mesos dispatcher running if the Mesos master gets restarted then > Spark Mesos dispatcher will keep running and queues up all the submitted > applications and will not launch them. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts
[ https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16031143#comment-16031143 ] Laurent Hoss commented on SPARK-15142: -- +1 to get this solved ! (ps: We are calling the dispatcher with requests generated from a *livy* container..) > Spark Mesos dispatcher becomes unusable when the Mesos master restarts > -- > > Key: SPARK-15142 > URL: https://issues.apache.org/jira/browse/SPARK-15142 > Project: Spark > Issue Type: Bug > Components: Deploy, Mesos >Reporter: Devaraj K >Priority: Minor > Attachments: > spark-devaraj-org.apache.spark.deploy.mesos.MesosClusterDispatcher-1-stobdtserver5.out > > > While Spark Mesos dispatcher running if the Mesos master gets restarted then > Spark Mesos dispatcher will keep running and queues up all the submitted > applications and will not launch them. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts
[ https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949450#comment-15949450 ] Jared commented on SPARK-15142: --- [~devaraj.k] are you still working on this problem? It seemed that MesosClusterDispatcher was unaware of master lost. > Spark Mesos dispatcher becomes unusable when the Mesos master restarts > -- > > Key: SPARK-15142 > URL: https://issues.apache.org/jira/browse/SPARK-15142 > Project: Spark > Issue Type: Bug > Components: Deploy, Mesos >Reporter: Devaraj K >Priority: Minor > Attachments: > spark-devaraj-org.apache.spark.deploy.mesos.MesosClusterDispatcher-1-stobdtserver5.out > > > While Spark Mesos dispatcher running if the Mesos master gets restarted then > Spark Mesos dispatcher will keep running and queues up all the submitted > applications and will not launch them. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts
[ https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275978#comment-15275978 ] Devaraj K commented on SPARK-15142: --- bq. Can you include the dispatcher logs? I have attached the dispatcher logs, but I don't see anything useful in those. bq. Does restarting the dispatcher fix the problem? Yes, It works fine after restarting the dispatcher. I suspect the dispatcher is loosing the connection with the mesos master after the mesos master restart and stops receiving the resource offerings. I think the dispatcher may need to re-register with the mesos master for connection loss. I will try creating a PR to fix this issue. > Spark Mesos dispatcher becomes unusable when the Mesos master restarts > -- > > Key: SPARK-15142 > URL: https://issues.apache.org/jira/browse/SPARK-15142 > Project: Spark > Issue Type: Bug > Components: Deploy, Mesos >Reporter: Devaraj K >Priority: Minor > Attachments: > spark-devaraj-org.apache.spark.deploy.mesos.MesosClusterDispatcher-1-stobdtserver5.out > > > While Spark Mesos dispatcher running if the Mesos master gets restarted then > Spark Mesos dispatcher will keep running and queues up all the submitted > applications and will not launch them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts
[ https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15274447#comment-15274447 ] Michael Gummelt commented on SPARK-15142: - Can you include the dispatcher logs? Does restarting the dispatcher fix the problem? > Spark Mesos dispatcher becomes unusable when the Mesos master restarts > -- > > Key: SPARK-15142 > URL: https://issues.apache.org/jira/browse/SPARK-15142 > Project: Spark > Issue Type: Bug > Components: Deploy, Mesos >Reporter: Devaraj K >Priority: Minor > > While Spark Mesos dispatcher running if the Mesos master gets restarted then > Spark Mesos dispatcher will keep running and queues up all the submitted > applications and will not launch them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts
[ https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15274445#comment-15274445 ] Michael Gummelt commented on SPARK-15142: - I can't understand this sentence. Can you reword this? > Spark Mesos dispatcher becomes unusable when the Mesos master restarts > -- > > Key: SPARK-15142 > URL: https://issues.apache.org/jira/browse/SPARK-15142 > Project: Spark > Issue Type: Bug > Components: Deploy, Mesos >Reporter: Devaraj K >Priority: Minor > > While Spark Mesos dispatcher running if the Mesos master gets restarted then > Spark Mesos dispatcher will keep running and queues up all the submitted > applications and will not launch them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts
[ https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272056#comment-15272056 ] Devaraj K commented on SPARK-15142: --- It is not running the queued applications after Mesos master comes up and these are staying in 'Queued Drivers:' forever, the newly submitted applications also will go and add up in queue. Only way I see here is restarting the Spark Mesos dispatcher for launching the newly submitted applications. > Spark Mesos dispatcher becomes unusable when the Mesos master restarts > -- > > Key: SPARK-15142 > URL: https://issues.apache.org/jira/browse/SPARK-15142 > Project: Spark > Issue Type: Bug > Components: Deploy, Mesos >Reporter: Devaraj K >Priority: Minor > > While Spark Mesos dispatcher running if the Mesos master gets restarted then > Spark Mesos dispatcher will keep running and queues up all the submitted > applications and will not launch them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts
[ https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272013#comment-15272013 ] Sagar commented on SPARK-15142: --- Spark Mesos dispatcher queues all the applications when Mesos master restarted running application lost their reference or I think after Mesos master restarts it includes previously queued up applications and start running them. > Spark Mesos dispatcher becomes unusable when the Mesos master restarts > -- > > Key: SPARK-15142 > URL: https://issues.apache.org/jira/browse/SPARK-15142 > Project: Spark > Issue Type: Bug > Components: Deploy, Mesos >Reporter: Devaraj K >Priority: Minor > > While Spark Mesos dispatcher running if the Mesos master gets restarted then > Spark Mesos dispatcher will keep running and queues up all the submitted > applications and will not launch them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org