tl;dr: *Reusing TaskIDs clashes with the mesos-agent recovery feature.*

Adam Bordelon wrote:
> Reusing taskIds may work if you're guaranteed to never be running two
> instances of the same taskId simultaneously
I've encountered another scenario where reusing TaskIDs is dangerous, even if
you meet the guarantee of never running two task instances with the same
TaskID simultaneously.

*Scenario leading to a problem*

Say you have a task with ID "T1" which terminates for some reason, so its
terminal status update gets recorded into the agent's current "run" in the
task's updates file:

  MESOS_WORK_DIR/meta/slaves/latest/frameworks/FRAMEWORK_ID/executors/EXECUTOR_ID/runs/latest/tasks/T1/task.updates

Then say a new task is launched with the same ID of T1, and it gets scheduled
under the same Executor on the same agent host. In that case the new task
reuses the same work_dir path, and thus inherits the already recorded
terminal status update in its task.updates file. So the updates file contains
a stream of updates that looks like this:

- TASK_RUNNING
- TASK_FINISHED
- TASK_RUNNING

Say you subsequently restart the mesos-slave/agent, expecting all tasks to
survive the restart via the recovery process. Unfortunately, T1 is
terminated, because the task recovery logic
<https://github.com/apache/mesos/blob/0.27.0/src/slave/slave.cpp#L5701-L5708>
[1] scans the current run's tasks' task.updates files for tasks with terminal
status updates, and then terminates any such tasks. So even though T1 was
actually running just fine, it gets terminated because at some point in its
previous incarnation a terminal status update was recorded.

*Leads to inconsistent state*

Compounding the problem, this termination is done without informing the
Executor, so the process underlying the task continues to run even though
Mesos thinks it is gone. That is really bad, since it leaves the host in a
different state than Mesos believes it is in. For example, if the task held a
port resource, Mesos now incorrectly thinks the port is free, so a framework
might try to launch a task/executor that uses the port, but that will fail
because the new process cannot bind to the port.

*Change recovery code or just update comments in mesos.proto?*

Perhaps this behavior could be considered a bug, and the recovery logic that
processes task status updates could be modified to ignore a terminal status
update if there is a subsequent TASK_RUNNING update in the task.updates file.
If that sounds like a desirable change, I'm happy to file a JIRA issue for it
and work on the fix myself.
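To make the proposed rule concrete, here is a minimal standalone sketch
(plain C++, not the actual slave.cpp recovery code; the TaskState enum and
the replayed stream are simplified stand-ins): a task only counts as
terminated if the *last* update replayed from task.updates is terminal, so a
terminal update followed by a later TASK_RUNNING no longer kills the task.

  #include <iostream>
  #include <vector>

  // Simplified stand-in for Mesos's task states.
  enum class TaskState {
    TASK_STAGING,
    TASK_RUNNING,
    TASK_FINISHED,
    TASK_FAILED,
    TASK_KILLED,
    TASK_LOST
  };

  bool isTerminal(TaskState s)
  {
    return s == TaskState::TASK_FINISHED ||
           s == TaskState::TASK_FAILED ||
           s == TaskState::TASK_KILLED ||
           s == TaskState::TASK_LOST;
  }

  // Replay the recorded updates in order; the task counts as terminated
  // only if the most recent update in the stream is terminal, so a
  // terminal update that is later followed by TASK_RUNNING is ignored.
  bool terminatedAfterReplay(const std::vector<TaskState>& updates)
  {
    bool terminated = false;
    for (TaskState s : updates) {
      terminated = isTerminal(s);
    }
    return terminated;
  }

  int main()
  {
    // The exact stream from the scenario above: the reused TaskID's
    // previous incarnation finished, then the new incarnation started.
    std::vector<TaskState> updates = {
        TaskState::TASK_RUNNING,
        TaskState::TASK_FINISHED,
        TaskState::TASK_RUNNING};

    // The current recovery behavior terminates on *any* terminal
    // update; the proposed rule keeps the task alive here.
    std::cout << (terminatedAfterReplay(updates) ? "terminate" : "keep")
              << std::endl;
    return 0;
  }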
If we think the recovery logic is fine as it is, then we should update these
comments
<https://github.com/apache/mesos/blob/0.27.0/include/mesos/mesos.proto#L63-L66>
[2] in mesos.proto, since they are incorrect given the behavior I just
described:

> A framework generated ID to distinguish a task. The ID must remain
> unique while the task is active. However, a framework can reuse an
> ID _only_ if a previous task with the same ID has reached a
> terminal state (e.g., TASK_FINISHED, TASK_LOST, TASK_KILLED, etc.).

*Conclusion*

It is dangerous indeed to reuse a TaskID for separate task runs, even if they
are guaranteed not to run concurrently.

- Erik

P.S. I encountered this problem while trying to use mesos-agent recovery with
the storm-mesos framework <https://github.com/mesos/storm> [3]. Notably, this
framework sets the TaskID to "<agenthostname>-<stormworkerport>" for the
storm worker tasks, so when a storm worker dies and is reborn on that host,
the TaskID gets reused. But then the task doesn't survive an agent restart
(even though the worker *process* does survive, putting us in an inconsistent
state!).

P.P.S. Being able to enable verbose logging in the mesos-slave/agent with the
GLOG_v=3 environment variable is *super* convenient! It would have taken me
*way* longer to figure this out if the verbose logging didn't exist.

P.P.P.S. To debug this, I wrote a tool
<https://github.com/erikdw/protoc-decode-lenprefix> [4] to decode
length-prefixed protobuf
<http://eli.thegreenplace.net/2011/08/02/length-prefix-framing-for-protocol-buffers>
[5] files such as task.updates. Here's an example of invoking the tool
(notably, it has the same syntax as "protoc --decode", but handles the
length-prefix headers):

  cat task.updates | \
    protoc-decode-lenprefix \
      --decode mesos.internal.StatusUpdateRecord \
      -I MESOS_CODE/src -I MESOS_CODE/include \
      MESOS_CODE/src/messages/messages.proto
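The framing itself is simple enough to walk by hand. Below is a minimal
standalone sketch of mine (not code from the tool or from Mesos), under the
assumption (worth verifying against stout's protobuf write helper) that each
record is a raw 4-byte host-endian uint32 length followed by that many bytes
of serialized message; it just reports each record's size, leaving the actual
protobuf decoding to protoc-decode-lenprefix:

  #include <cstdint>
  #include <fstream>
  #include <iostream>
  #include <vector>

  int main(int argc, char** argv)
  {
    if (argc != 2) {
      std::cerr << "usage: " << argv[0] << " <task.updates>" << std::endl;
      return 1;
    }

    std::ifstream in(argv[1], std::ios::binary);
    uint32_t size = 0;
    int record = 0;

    // Read the assumed 4-byte length prefix, then consume that many
    // bytes of payload (the serialized StatusUpdateRecord).
    while (in.read(reinterpret_cast<char*>(&size), sizeof(size))) {
      std::vector<char> payload(size);
      if (!in.read(payload.data(), size)) {
        std::cerr << "truncated record " << record << std::endl;
        return 1;
      }
      std::cout << "record " << record++ << ": " << size << " bytes"
                << std::endl;
    }
    return 0;
  }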
[1] https://github.com/apache/mesos/blob/0.27.0/src/slave/slave.cpp#L5701-L5708
[2] https://github.com/apache/mesos/blob/0.27.0/include/mesos/mesos.proto#L63-L66
[3] https://github.com/mesos/storm
[4] https://github.com/erikdw/protoc-decode-lenprefix
[5] http://eli.thegreenplace.net/2011/08/02/length-prefix-framing-for-protocol-buffers

On Sat, Jul 11, 2015 at 11:45 AM, CCAAT <[email protected]> wrote:

> I'd be most curious to see a working example of this idea, prefixes and
> all, for sleeping (long-term sleeping) nodes (slaves and masters).
>
> Anybody, do post what you have/are doing on this taskid reuse and
> reservations experimentation. Probably many are interested for a variety
> of reasons, including but not limited to security, auditing, and node
> diversification interests. My interests are in self-modifying code, which
> can be achieved whilst the nodes sleep, for some very interesting
> applications.
>
> James
>
> On 07/11/2015 06:01 AM, Adam Bordelon wrote:
>
>> Reusing taskIds may work if you're guaranteed to never be running two
>> instances of the same taskId simultaneously, but I could imagine a
>> particularly dangerous scenario where a master and slave experience a
>> network partition, so the master declares the slave lost and therefore
>> its tasks lost, and then the framework scheduler launches a new task
>> with the same taskId. However, the task is still running on the original
>> slave. When the slave reregisters and claims it is running that taskId,
>> or that that taskId has completed, the Mesos master may have a difficult
>> time reconciling which instance of the task is on which node and in
>> which status, since it expects only one instance to exist at a time.
>> You may be better off using a fixed taskId prefix and appending an
>> incrementing instance/trial number so that each run gets a unique ID
>> [a sketch of this scheme follows at the end of the thread]. Also note
>> that taskIds only need to be unique within a single frameworkId, so
>> don't worry about conflicting with other frameworks.
>> TL;DR: I wouldn't recommend it.
>>
>> On Fri, Jul 10, 2015 at 10:20 AM, Antonio Fernández
>> <[email protected]> wrote:
>>
>>> Sounds risky. Every task should have its own unique ID; collisions
>>> could happen, along with unexpected issues. I think monitoring when
>>> you can start a task again will be as hard as building a mechanism
>>> to discover its ID.
>>>
>>> On 10 Jul 2015, at 19:14, Jie Yu <[email protected]> wrote:
>>>
>>>> Re-using Task IDs is definitely not encouraged. As far as I know,
>>>> much of the Mesos code assumes Task IDs are unique. So I probably
>>>> won't risk that.
>>>>
>>>> On Fri, Jul 10, 2015 at 10:06 AM, Sargun Dhillon
>>>> <[email protected]> wrote:
>>>>
>>>>> Is reusing Task IDs good behaviour? Let's say that I have some
>>>>> singleton task - I'll call it a monitoring service. It's always
>>>>> going to be the same process, doing the same thing, and there will
>>>>> only ever be one around (per instance of a framework). Reading the
>>>>> protobuf doc, I learned this:
>>>>>
>>>>> /**
>>>>>  * A framework generated ID to distinguish a task. The ID must remain
>>>>>  * unique while the task is active. However, a framework can reuse an
>>>>>  * ID _only_ if a previous task with the same ID has reached a
>>>>>  * terminal state (e.g., TASK_FINISHED, TASK_LOST, TASK_KILLED, etc.).
>>>>>  */
>>>>> message TaskID {
>>>>>   required string value = 1;
>>>>> }
>>>>>
>>>>> Which makes me think that it's reasonable to just give this task the
>>>>> same taskID, and that every time I bring it from a terminal status to
>>>>> running once more, I can reuse the same ID. This also gives me the
>>>>> benefit of being able to more easily locate the task for a given
>>>>> framework, and I'm able to exploit Mesos for some weak guarantees
>>>>> that there won't be multiples of these running (don't worry, they
>>>>> lock in ZooKeeper, and concurrent runs don't do anything, they just
>>>>> fail).
>>>>>
>>>>> Opinions?
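Finally, a toy illustration of Adam's prefix-plus-counter suggestion from the
thread above (my sketch, not framework code). The "myhost-31000" prefix just
mirrors the storm-mesos "<agenthostname>-<stormworkerport>" convention as an
example, and a real scheduler would need to persist the counter, e.g. in
ZooKeeper, so that it survives failovers:

  #include <iostream>
  #include <string>

  // Keep a stable prefix so tasks stay easy to locate, but append an
  // incrementing run number so that no two runs ever share a TaskID.
  std::string nextTaskId(const std::string& prefix, int& runCounter)
  {
    // A plain int counter is for illustration only; a real framework
    // must persist this durably (e.g., in ZooKeeper) across failovers.
    return prefix + "-run" + std::to_string(++runCounter);
  }

  int main()
  {
    int runs = 0;
    std::cout << nextTaskId("myhost-31000", runs) << std::endl;
    // prints: myhost-31000-run1
    std::cout << nextTaskId("myhost-31000", runs) << std::endl;
    // prints: myhost-31000-run2
    return 0;
  }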

