[ 
https://issues.apache.org/jira/browse/MESOS-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15085704#comment-15085704
 ] 

Lei Xu commented on MESOS-4299:
-------------------------------

update:

I stopped the slave, removed all files under the data_dir path, and restarted 
the slave, but it still shows the same logs as above. How can I clean up a slave 
node so that it joins the cluster as a new one?
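
A rough way to confirm which masters the slave still hears from after a restart is to tally the "Ignoring shutdown framework message" warnings in its log. The sketch below is only an illustration: the function name `summarize_foreign_masters` is made up here, and it assumes each log entry sits on a single line, as in the original log file rather than the wrapped copy quoted in this mail.

```python
import re
from collections import Counter

# Matches the warning text seen in the quoted slave log. Assumes one log
# entry per line (the copy in this mail is hard-wrapped and would not match).
IGNORED = re.compile(
    r"Ignoring shutdown framework message for (?P<framework>\S+) "
    r"from (?P<sender>master@\S+) because it is not from the "
    r"registered master \((?P<registered>master@[^)]+)\)"
)


def summarize_foreign_masters(lines):
    """Count ignored shutdown messages per (sender, registered master) pair.

    `lines` is any iterable of log lines, e.g. an open file object.
    """
    counts = Counter()
    for line in lines:
        match = IGNORED.search(line)
        if match:
            counts[(match.group("sender"), match.group("registered"))] += 1
    return counts
```

Passing an open log file to `summarize_foreign_masters` returns a counter keyed by (sending master, registered master). If the count for a given pair keeps growing after the restart, the old master is presumably still addressing this slave.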

> Slave lives in two different cluster at the same time with different slave id
> -----------------------------------------------------------------------------
>
>                 Key: MESOS-4299
>                 URL: https://issues.apache.org/jira/browse/MESOS-4299
>             Project: Mesos
>          Issue Type: Bug
>          Components: master, webui
>    Affects Versions: 0.25.0
>         Environment: Mesos 0.25.0
>            Reporter: Lei Xu
>
> I've migrated some nodes from Cluster A to B, and today I found that these 
> nodes are registered in both Cluster A and B. Here is the {{/master/slaves}} 
> response:
> {code}
> {
>   "slaves": [
>     {
>       "active": false,
>       "attributes": {
>         "apps": "logstash",
>         "colo": "cn5",
>         "type": "prod"
>       },
>       "hostname": "l-bu128g5-10k10.ops.cn2.qunar.com",
>       "id": "3e7ba6b1-29fd-44e8-9be2-f72896054ac6-S2",
>       "offered_resources": {
>         "cpus": 0,
>         "disk": 0,
>         "mem": 0
>       },
>       "pid": "slave(1)@10.90.5.19:5051",
>       "registered_time": 1451988622.66323,
>       "reserved_resources": {},
>       "resources": {
>         "cpus": 32.0,
>         "disk": 2728919.0,
>         "mem": 128126.0,
>         "ports": "[8100-10000, 31000-32000]"
>       },
>       "unreserved_resources": {
>         "cpus": 32.0,
>         "disk": 2728919.0,
>         "mem": 128126.0,
>         "ports": "[8100-10000, 31000-32000]"
>       },
>       "used_resources": {
>         "cpus": 0,
>         "disk": 0,
>         "mem": 0
>       }
>     },
>     .....
> {code}
> The following are the Mesos slave logs:
> {quote}
> I0105 18:36:22.683724  6452 slave.cpp:2248] Updated checkpointed resources 
> from  to
> I0105 18:37:09.900497  6459 slave.cpp:3926] Current disk usage 0.06%. Max 
> allowed age: 1.798706758587755days
> I0105 18:37:22.678374  6453 slave.cpp:3146] Master marked the slave as 
> disconnected but the slave considers itself registered! Forcing 
> re-registration.
> I0105 18:37:22.678699  6453 slave.cpp:694] Re-detecting master
> I0105 18:37:22.678715  6471 status_update_manager.cpp:176] Pausing sending 
> status updates
> I0105 18:37:22.678753  6453 slave.cpp:741] Detecting new master
> I0105 18:37:22.678977  6456 status_update_manager.cpp:176] Pausing sending 
> status updates
> I0105 18:37:22.679047  6455 slave.cpp:705] New master detected at 
> master@10.88.169.195:5050
> I0105 18:37:22.679108  6455 slave.cpp:768] Authenticating with master 
> master@10.88.169.195:5050
> I0105 18:37:22.679136  6455 slave.cpp:773] Using default CRAM-MD5 
> authenticatee
> I0105 18:37:22.679239  6455 slave.cpp:741] Detecting new master
> I0105 18:37:22.679354  6464 authenticatee.cpp:115] Creating new client SASL 
> connection
> I0105 18:37:22.680883  6461 authenticatee.cpp:206] Received SASL 
> authentication mechanisms: CRAM-MD5
> I0105 18:37:22.680946  6461 authenticatee.cpp:232] Attempting to authenticate 
> with mechanism 'CRAM-MD5'
> I0105 18:37:22.681759  6455 authenticatee.cpp:252] Received SASL 
> authentication step
> I0105 18:37:22.682874  6454 authenticatee.cpp:292] Authentication success
> I0105 18:37:22.682986  6441 slave.cpp:836] Successfully authenticated with 
> master master@10.88.169.195:5050
> I0105 18:37:22.684303  6454 slave.cpp:980] Re-registered with master 
> master@10.88.169.195:5050
> I0105 18:37:22.684455  6454 slave.cpp:1016] Forwarding total oversubscribed 
> resources
> I0105 18:37:22.684471  6468 status_update_manager.cpp:183] Resuming sending 
> status updates
> I0105 18:37:22.684649  6454 slave.cpp:2152] Updating framework 
> 20150610-204949-3299432458-5050-25057-0000 pid to 
> scheduler-1bef8172-5068-44c6-93f5-e97a3910ed79@10.88.169.195:35708
> I0105 18:37:22.685025  6452 status_update_manager.cpp:183] Resuming sending 
> status updates
> I0105 18:37:22.685117  6454 slave.cpp:2248] Updated checkpointed resources 
> from  to
> I0105 18:38:09.901587  6464 slave.cpp:3926] Current disk usage 0.06%. Max 
> allowed age: 1.798706755730266days
> I0105 18:38:22.679468  6451 slave.cpp:3146] Master marked the slave as 
> disconnected but the slave considers itself registered! Forcing 
> re-registration.
> I0105 18:38:22.679739  6451 slave.cpp:694] Re-detecting master
> I0105 18:38:22.679754  6453 status_update_manager.cpp:176] Pausing sending 
> status updates
> I0105 18:38:22.679785  6451 slave.cpp:741] Detecting new master
> I0105 18:38:22.680054  6461 slave.cpp:705] New master detected at 
> master@10.88.169.195:5050
> I0105 18:38:22.680106  6470 status_update_manager.cpp:176] Pausing sending 
> status updates
> I0105 18:38:22.680107  6461 slave.cpp:768] Authenticating with master 
> master@10.88.169.195:5050
> I0105 18:38:22.680197  6461 slave.cpp:773] Using default CRAM-MD5 
> authenticatee
> I0105 18:38:22.680271  6461 slave.cpp:741] Detecting new master
> .................
> W0105 19:05:38.207882  6450 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0116 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 09:12:38.666767  6468 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0002 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 12:13:35.782218  6441 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0117 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 12:23:22.348956  6444 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0118 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 12:35:36.660111  6443 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0119 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 12:40:43.735994  6461 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0121 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 12:42:09.539126  6456 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0120 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 12:52:40.787961  6465 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0122 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 12:58:10.425287  6461 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0123 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 13:03:32.236495  6456 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0125 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 13:10:58.501510  6472 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0126 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 13:16:04.233232  6460 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0127 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 14:17:24.198786  6472 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0115 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 14:18:57.036814  6464 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0005 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 14:36:19.755764  6460 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0112 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> W0106 14:46:54.420217  6462 slave.cpp:1973] Ignoring shutdown framework 
> message for 3e7ba6b1-29fd-44e8-9be2-f72896054ac6-0129 from 
> master@10.90.12.29:5050 because it is not from the registered master 
> (master@10.88.169.195:5050)
> {quote}
> It looks like the slave nodes still hold some metadata from Cluster A, but 
> are nevertheless allowed to register with Cluster B.
> Should we do some validation before a node joins a new cluster if the node 
> has not been cleaned up?
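
To find every node affected by the double registration described above, the {{/master/slaves}} responses of the two clusters can be compared by hostname. This is a minimal sketch, not part of Mesos: the helper names `fetch_slaves` and `slaves_in_both` are hypothetical, it assumes the masters answer plain HTTP without authentication, and it relies only on the `slaves`, `hostname` and `id` fields visible in the response quoted above.

```python
import json
from urllib.request import urlopen


def fetch_slaves(master_url):
    """Return the "slaves" list from a master's /master/slaves endpoint.

    `master_url` is a base URL such as "http://host:5050" (assumed reachable
    over plain HTTP with no authentication).
    """
    with urlopen(master_url.rstrip("/") + "/master/slaves") as response:
        return json.load(response)["slaves"]


def slaves_in_both(slaves_a, slaves_b):
    """Map each hostname present in both lists to its (id_a, id_b) pair."""
    ids_a = {s["hostname"]: s["id"] for s in slaves_a}
    ids_b = {s["hostname"]: s["id"] for s in slaves_b}
    return {h: (ids_a[h], ids_b[h]) for h in ids_a.keys() & ids_b.keys()}
```

For example, `slaves_in_both(fetch_slaves("http://10.90.12.29:5050"), fetch_slaves("http://10.88.169.195:5050"))` would list the hosts known to both masters seen in the log, together with the slave id each cluster holds for them.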



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
