Re: Running Spark master/slave instances in non Daemon mode
Hi Mike,

I can imagine the trouble that daemonization is causing, and I think that having a non-forking start script is a good idea. A simple, non-intrusive fix could be to change the "spark-daemon.sh" script to conditionally omit the "nohup &". Personally, I think the semantically correct approach would be to also rename "spark-daemon" to something else (since it won't necessarily start a background process anymore); however, that may have the potential to break things, in which case it is probably not worth the cosmetic rename.

best,
--Jakob
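Jakob's conditional-omit idea could be sketched roughly as follows. This is illustrative, not the actual spark-daemon.sh source: the command, log path, and the SPARK_NO_DAEMONIZE variable (the name Jeff proposed, not an existing Spark setting) are stand-ins.

```shell
#!/bin/sh
# Sketch of a conditional launch in the spirit of spark-daemon.sh.
# Command, log path, and SPARK_NO_DAEMONIZE are illustrative stand-ins.

log="${TMPDIR:-/tmp}/spark-sketch.log"
command="echo started"   # stand-in for the real JVM launch command

launch() {
  if [ -n "$SPARK_NO_DAEMONIZE" ]; then
    # Foreground: run the command directly so stdout/stderr stay
    # attached and a supervisor (systemd, Nomad exec, Docker) can
    # capture the logs itself.
    mode="foreground"
    $command
  else
    # Current behavior: detach with nohup, background the process,
    # and redirect output to the log file.
    mode="daemon"
    nohup $command >> "$log" 2>&1 < /dev/null &
  fi
}

SPARK_NO_DAEMONIZE=1
launch
```

With the variable unset, the script would fall through to the existing nohup behavior, so the default stays backward compatible.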
Re: Running Spark master/slave instances in non Daemon mode
Our particular use case is for Nomad, using the "exec" configuration described here: https://www.nomadproject.io/docs/drivers/exec.html. It's not exactly a container, just a cgroup. It performs a simple fork/exec of a command and binds to the output fds from that process, so daemonizing is causing us minor hardship, and it seems like an easy thing to make optional. We'd be happy to make the PR as well.

--Mike

Mike Ihbe
MustWin - Principal

m...@mustwin.com
mikeji...@gmail.com
skype: mikeihbe
Cell: 651.283.0815
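For context, a Nomad job along the lines Mike describes might look like the sketch below (job name and install path are hypothetical). The exec driver expects the command it runs to stay in the foreground; a script that forks a daemon and returns is seen as a task that finished immediately, and the forked process's output never reaches Nomad's log collection.

```hcl
job "spark-master" {
  datacenters = ["dc1"]

  group "master" {
    task "spark-master" {
      driver = "exec"

      config {
        # Hypothetical install path. Because this script daemonizes
        # and exits, the exec driver considers the task complete and
        # loses the forked process's stdout/stderr.
        command = "/opt/spark/sbin/start-master.sh"
      }
    }
  }
}
```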
Re: Running Spark master/slave instances in non Daemon mode
I'm curious, what kind of container solutions require foreground processes? Most init systems work fine with "starter" processes that run other processes. IIRC, systemd and start-stop-daemon have an option called "fork" that expects the main process to run another one in the background and only considers the former complete when the latter exits. I'm not against having a non-forking start script, I'm just wondering where you'd run into issues.

Regarding the logging, would it be an option to create a custom slf4j logger that uses the standard mechanisms exposed by the system?

best,
--Jakob
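The systemd behavior Jakob is recalling is the `Type=forking` service type. A hypothetical unit for the existing daemonizing script might look like this sketch (the paths and PID file location are illustrative, not taken from the Spark distribution):

```ini
[Unit]
Description=Spark standalone master (sketch)

[Service]
# Type=forking: systemd considers the service started once the
# initial script exits, and tracks the daemonized JVM via PIDFile.
Type=forking
ExecStart=/opt/spark/sbin/start-master.sh
PIDFile=/tmp/spark-master.pid

[Install]
WantedBy=multi-user.target
```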
Running Spark master/slave instances in non Daemon mode
Hi,

I recently tried deploying Spark master and slave instances to container-based environments such as Docker, Nomad, etc. There are two issues that I've found with how the startup scripts work. The sbin/start-master.sh and sbin/start-slave.sh scripts start a daemon by default, but this isn't as compatible with container deployments as one would think. The first issue is that the daemon runs in the background, and some container solutions require apps to run in the foreground; otherwise they consider the application not to be running and may shut down the task. The second issue is that logs don't seem to get integrated with the logging mechanism of the container solution. What is the possibility of adding additional flags or startup scripts to support running Spark in the foreground? It would be great if a flag like SPARK_NO_DAEMONIZE could be added, or another script for foreground execution.

Regards,

Jeff

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Running-Spark-master-slave-instances-in-non-Daemon-mode-tp19172.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
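To make the first issue concrete: when a container's entrypoint is a script that forks its real work and exits, PID 1 exits, and the runtime treats the task as finished even though the background process is still alive. A minimal sketch of the failure mode, with a `sleep` standing in for the Spark JVM that sbin/start-master.sh would launch:

```shell
#!/bin/sh
# Stand-in for a daemonizing start script: it backgrounds its real
# work and returns immediately, like sbin/start-master.sh does.
start_daemonized() {
  sleep 2 &            # stand-in for the long-running Spark process
  bg_pid=$!
  echo "daemon started with pid $bg_pid"
}

# As a container entrypoint, this script is PID 1. It exits as soon
# as start_daemonized returns, so the container runtime considers the
# task finished even though the background process is still running.
start_daemonized
echo "entrypoint exiting; the runtime would now stop the task"
```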