[ https://issues.apache.org/jira/browse/MESOS-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15085704#comment-15085704 ]
Lei Xu commented on MESOS-4299:
-------------------------------

update: I stopped the slave, removed all files under the data_dir path, and restarted the slave, but it still shows the same logs as above. How can I clean up a slave node so that it joins the cluster as a new one?

> Slave lives in two different cluster at the same time with different slave id
> -----------------------------------------------------------------------------
>
>                 Key: MESOS-4299
>                 URL: https://issues.apache.org/jira/browse/MESOS-4299
>             Project: Mesos
>          Issue Type: Bug
>          Components: master, webui
>    Affects Versions: 0.25.0
>         Environment: Mesos 0.25.0
>            Reporter: Lei Xu
>
> I've migrated some nodes from Cluster A to Cluster B, and today I found that these nodes live in both Cluster A and Cluster B. Here is the {{/master/slaves}} response:
> {code}
> {
>     "slaves": [
>         {
>             "active": false,
>             "attributes": {
>                 "apps": "logstash",
>                 "colo": "cn5",
>                 "type": "prod"
>             },
>             "hostname": "l-bu128g5-10k10.ops.cn2.qunar.com",
>             "id": "3e7ba6b1-29fd-44e8-9be2-f72896054ac6-S2",
>             "offered_resources": {
>                 "cpus": 0,
>                 "disk": 0,
>                 "mem": 0
>             },
>             "pid": "slave(1)@10.90.5.19:5051",
>             "registered_time": 1451988622.66323,
>             "reserved_resources": {},
>             "resources": {
>                 "cpus": 32.0,
>                 "disk": 2728919.0,
>                 "mem": 128126.0,
>                 "ports": "[8100-10000, 31000-32000]"
>             },
>             "unreserved_resources": {
>                 "cpus": 32.0,
>                 "disk": 2728919.0,
>                 "mem": 128126.0,
>                 "ports": "[8100-10000, 31000-32000]"
>             },
>             "used_resources": {
>                 "cpus": 0,
>                 "disk": 0,
>                 "mem": 0
>             }
>         },
>         .....
> {code}
> And the following are the Mesos slave logs:
> {quote}
> I0105 18:36:22.683724 6452 slave.cpp:2248] Updated checkpointed resources from to
> I0105 18:37:09.900497 6459 slave.cpp:3926] Current disk usage 0.06%. Max allowed age: 1.798706758587755days
> I0105 18:37:22.678374 6453 slave.cpp:3146] Master marked the slave as disconnected but the slave considers itself registered! Forcing re-registration.
> I0105 18:37:22.678699 6453 slave.cpp:694] Re-detecting master
> I0105 18:37:22.678715 6471 status_update_manager.cpp:176] Pausing sending status updates
> I0105 18:37:22.678753 6453 slave.cpp:741] Detecting new master
> I0105 18:37:22.678977 6456 status_update_manager.cpp:176] Pausing sending status updates
> I0105 18:37:22.679047 6455 slave.cpp:705] New master detected at master@10.88.169.195:5050
> I0105 18:37:22.679108 6455 slave.cpp:768] Authenticating with master master@10.88.169.195:5050
> I0105 18:37:22.679136 6455 slave.cpp:773] Using default CRAM-MD5 authenticatee
> I0105 18:37:22.679239 6455 slave.cpp:741] Detecting new master
> I0105 18:37:22.679354 6464 authenticatee.cpp:115] Creating new client SASL connection
> I0105 18:37:22.680883 6461 authenticatee.cpp:206] Received SASL authentication mechanisms: CRAM-MD5
> I0105 18:37:22.680946 6461 authenticatee.cpp:232] Attempting to authenticate with mechanism 'CRAM-MD5'
> I0105 18:37:22.681759 6455 authenticatee.cpp:252] Received SASL authentication step
> I0105 18:37:22.682874 6454 authenticatee.cpp:292] Authentication success
> I0105 18:37:22.682986 6441 slave.cpp:836] Successfully authenticated with master master@10.88.169.195:5050
> I0105 18:37:22.684303 6454 slave.cpp:980] Re-registered with master master@10.88.169.195:5050
> I0105 18:37:22.684455 6454 slave.cpp:1016] Forwarding total oversubscribed resources
> I0105 18:37:22.684471 6468 status_update_manager.cpp:183] Resuming sending status updates
> I0105 18:37:22.684649 6454 slave.cpp:2152] Updating framework 20150610-204949-3299432458-5050-25057-0000 pid to scheduler-1bef8172-5068-44c6-93f5-e97a3910ed79@10.88.169.195:35708
> I0105 18:37:22.685025 6452 status_update_manager.cpp:183] Resuming sending status updates
> I0105 18:37:22.685117 6454 slave.cpp:2248] Updated checkpointed resources from to
> I0105 18:38:09.901587 6464 slave.cpp:3926] Current disk usage 0.06%. Max allowed age: 1.798706755730266days
> I0105 18:38:22.679468 6451 slave.cpp:3146] Master marked the slave as disconnected but the slave considers itself registered! Forcing re-registration.
> I0105 18:38:22.679739 6451 slave.cpp:694] Re-detecting master
> I0105 18:38:22.679754 6453 status_update_manager.cpp:176] Pausing sending status updates
> I0105 18:38:22.679785 6451 slave.cpp:741] Detecting new master
> I0105 18:38:22.680054 6461 slave.cpp:705] New master detected at master@10.88.169.195:5050
> I0105 18:38:22.680106 6470 status_update_manager.cpp:176] Pausing sending status updates
> I0105 18:38:22.680107 6461 slave.cpp:768] Authenticating with master master@10.88.169.195:5050
> I0105 18:38:22.680197 6461 slave.cpp:773] Using default CRAM-MD5 authenticatee
> I0105 18:38:22.680271 6461 slave.cpp:741] Detecting new master
> .................
> W0105 19:05:38.207882 6450 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0116 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 09:12:38.666767 6468 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0002 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 12:13:35.782218 6441 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0117 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 12:23:22.348956 6444 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0118 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 12:35:36.660111 6443 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0119 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 12:40:43.735994 6461 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0121 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 12:42:09.539126 6456 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0120 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 12:52:40.787961 6465 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0122 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 12:58:10.425287 6461 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0123 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 13:03:32.236495 6456 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0125 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 13:10:58.501510 6472 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0126 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 13:16:04.233232 6460 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0127 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 14:17:24.198786 6472 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0115 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 14:18:57.036814 6464 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0005 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 14:36:19.755764 6460 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0112 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> W0106 14:46:54.420217 6462 slave.cpp:1973] Ignoring shutdown framework message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0129 from master@10.90.12.29:5050 because it is not from the registered master (master@10.88.169.195:5050)
> {quote}
> It looks like the slave nodes still hold some metadata from Cluster A, but still register with Cluster B.
> Should we do some validation before a node joins a new cluster if the node has not been cleaned up?
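
The dual registration described in this issue could be checked for from the outside by comparing the {{/master/slaves}} responses of the two clusters. The following is only a minimal sketch in Python, not part of Mesos: the function names are hypothetical, the two responses are assumed to have already been fetched as JSON text, and the Cluster B slave id in the sample data is a made-up placeholder. It relies only on the "slaves", "pid" and "id" fields visible in the response above.

```python
import json


def agents_by_pid(slaves_response: str) -> dict:
    """Map each agent pid (e.g. 'slave(1)@10.90.5.19:5051') to its slave id."""
    doc = json.loads(slaves_response)
    return {s["pid"]: s["id"] for s in doc.get("slaves", [])}


def find_dual_registered(response_a: str, response_b: str) -> list:
    """Return (pid, id_in_a, id_in_b) for every agent pid listed by both masters."""
    a = agents_by_pid(response_a)
    b = agents_by_pid(response_b)
    return sorted((pid, a[pid], b[pid]) for pid in a.keys() & b.keys())


# Sample data: the first entry is taken from the report; which cluster it came
# from is not stated there. The second slave id is a hypothetical placeholder.
cluster_a = json.dumps({"slaves": [
    {"pid": "slave(1)@10.90.5.19:5051",
     "id": "3e7ba6b1-29fd-44e8-9be2-f72896054ac6-S2",
     "active": False},
]})
cluster_b = json.dumps({"slaves": [
    {"pid": "slave(1)@10.90.5.19:5051",
     "id": "HYPOTHETICAL-CLUSTER-B-ID-S0",
     "active": True},
]})

for pid, id_a, id_b in find_dual_registered(cluster_a, cluster_b):
    print(f"{pid} is listed by both masters: {id_a} / {id_b}")
```

Matching on the pid rather than the slave id is deliberate here, because the issue title says the same node appears with a different slave id in each cluster.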