I do send terminal updates for the task: https://github.com/dgrnbrg/easypaas/blob/master/src/easypaas/core.clj#L126
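For readers without the repository open, here is a minimal Python sketch of the pattern in question: a watcher thread waits for the task's process to exit, sends a terminal status update, and only then stops the executor. It is not the easypaas code (which is Clojure) and does not use the Mesos bindings. `RecordingDriver`, `watch_task`, and the task id are hypothetical names made up for illustration, and the state strings simply mirror the Mesos task states mentioned in this thread.

```python
import subprocess
import sys
import threading


class RecordingDriver:
    """Hypothetical stand-in for an executor driver; it only records calls."""

    def __init__(self):
        self.events = []
        self.stopped = threading.Event()

    def send_status_update(self, task_id, state):
        self.events.append(("update", task_id, state))

    def stop(self):
        self.events.append(("stop",))
        self.stopped.set()


def watch_task(driver, task_id, argv):
    """Run argv in a watcher thread and report a terminal state when it exits."""

    def wait_and_report():
        returncode = subprocess.call(argv)
        state = "TASK_FINISHED" if returncode == 0 else "TASK_FAILED"
        # Send the terminal update first, and stop the driver only afterwards.
        driver.send_status_update(task_id, state)
        driver.stop()

    watcher = threading.Thread(target=wait_and_report)
    watcher.start()
    return watcher


if __name__ == "__main__":
    driver = RecordingDriver()
    # The child process here is a trivial Python command that exits with 0.
    watch_task(driver, "task-1", [sys.executable, "-c", "pass"]).join()
    print(driver.events)
```

The ordering inside `wait_and_report` is the point of the sketch: the terminal update is issued before the stop call. Whether a real driver has delivered that update by the time the process exits is a separate question that this stand-in cannot answer.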
The linked-to line spawns a new thread that waits for the underlying process
to finish, then submits the final task update and exits the executor.

On Thursday, June 13, 2013, Benjamin Mahler wrote:

> Ok I'll try to do one thing at a time here, the first thing I'm seeing is
> that you have an executor terminating.
>
> I0611 20:19:58.519618 48373 process_based_isolation_module.cpp:344] Telling
> slave of lost executor cc54e5a4-ca40-444b-9286-72212bf012b5 of framework
> 201305261216-3261142444-5050-56457-0006
>
> This is fine. We've actually changed this message since 0.12.0 to say
> "terminated" as opposed to "lost".
>
> However, this executor was running tasks! As a result, the slave considers
> these tasks as lost, and sends the appropriate status updates for them:
>
> I0611 20:19:58.519785 48401 slave.cpp:1065] Executor
> 'cc54e5a4-ca40-444b-9286-72212bf012b5' of framework
> 201305261216-3261142444-5050-56457-0006 has exited with status 0
> I0611 20:19:58.525691 48401 slave.cpp:842] Status update: task
> cc54e5a4-ca40-444b-9286-72212bf012b5 of framework
> 201305261216-3261142444-5050-56457-0006 is now in state TASK_LOST
>
> Since I see an exit status of 0, I'm assuming this is a clean shutdown of a
> custom executor that you've written? If so, you'll need to send terminal
> updates for the tasks you're running prior to shutting down the executor.
> E.g. TASK_FINISHED. Otherwise, the slave will consider all tasks running on
> the executor as LOST. Does that clear anything up?
>
>
> On Wed, Jun 12, 2013 at 4:39 PM, David Greenberg <[email protected]> wrote:
>
> > Sure, sorry I didn't post the link--I'm on a restricted network at work
> > that blocks uploading sites. Here it is:
> > https://www.dropbox.com/s/bhapvvq6kznlgyz/master_and_slave_logs.tar.bz2
> >
> > Currently, I'm trying to set up Hadoop and Spark on Mesos for ad-hoc data
> > analysis tasks.
> > I also wrote a Clojure fluent library for working with
> > Mesos, which I intend to use to build a new scheduler for a specific
> > problem at work on our 700 machine cluster. Some of the Clojure work will
> > be open source (EPL) once I've written better documentation and actually
> > had an opportunity to test it.
> >
> > Thanks!
> >
> >
> > On Wed, Jun 12, 2013 at 6:28 PM, Benjamin Mahler <[email protected]> wrote:
> >
> > > Can you link to the logs?
> > >
> > > Can you give us a little background about how you're using mesos? If you're
> > > using it for production jobs, I would recommend 0.12.0 once released as it
> > > has been vetted in production (at Twitter at least). We've also included
> > > instructions on how to upgrade from 0.11.0 to 0.12.0 on a running cluster.
> > >
> > >
> > > On Wed, Jun 12, 2013 at 7:07 AM, David Greenberg <[email protected]> wrote:
> > >
> > > > I am on 0.12 right now, git revision
> > > > 3758114ee4492dcbb784d5aac65d43ac54ddb439 (same as airbnb/chronos
> > > > recommends).
> > > >
> > > > The master and slave logs are 1.7MB bz2'ed, but apache.org's mailer
> > > > doesn't accept such large messages. I've sent them directly to Vinod, and I
> > > > can send them to anyone else who asks.
> > > >
> > > > I'm just running mesos w/ --conf, and the config is
> > > >
> > > > master = zk://iadv1.pit.mycompany.com:2181,iadv2.pit.mycompany.com:2181,
> > > > iadv3.pit.mycompany.com:2181,iadv4.pit.mycompany.com:2181,
> > > > iadv5.pit.mycompany.com:2181/mesos
> > > > zk = zk://iadv1.pit.mycompany.com:2181,iadv2.pit.mycompany.com:2181,
> > > > iadv3.pit.mycompany.com:2181,iadv4.pit.mycompany.com:2181,
> > > > iadv5.pit.mycompany.com:2181/mesos
> > > > log_dir = /data/scratch/local/mesos/logs
> > > > work_dir = /data/scratch/local/mesos/work
> > > >
> > > >
> > > > I would be happy to move to the latest version that's likely stable, but
> > > > even after reading all of the discussion over the past couple weeks on
> > > > 0.11, 0.12, and 0.13, I have no idea whether I should pick one of those,
> > > > HEAD, or some other commit.
> > > >
> > > > Thank you!
> > > >
> > > >
> > > > On Wed, Jun 12, 2013 at 10:01 AM, David Greenberg <[email protected]> wrote:
> > > >
> > > > > I am on 0.12 right now, git revision
> > > > > 3758114ee4492dcbb784d5aac65d43ac54ddb439 (same as airbnb/chronos
> > > > > recommends).
> > > > >
> > > > > I've attached the master and slave logs. I'm just running mesos w/ --conf,
> > > > > and the config is
> > > > >
> > > > > master = zk://iadv1.pit.mycompany.com:2181,
> > > > > iadv2.pit.mycompany.com:21 <http://iadv2.pit.mycompany.com:2181>
