It's my personal "production" cluster, by the way.

Hello Stefan,
1.

The noout, nobackfill and norecover flags were set on the OSDs before shutting down:

$ ceph osd set noout

$ ceph osd set nobackfill

$ ceph osd set norecover
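
For reference, once the cluster is healthy again I would clear these flags with the corresponding unset commands:

$ ceph osd unset noout

$ ceph osd unset nobackfill

$ ceph osd unset norecover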



2.
[root@ceph-node1 ~]# systemctl  status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)
[root@ceph-node1 ~]# netstat  -antp|grep 6789
tcp        0      0 192.168.1.6:6789        0.0.0.0:*               LISTEN      474841/ceph-mon
[root@ceph-node1 ~]# netstat  -antp|grep 3300
[root@ceph-node1 ~]# 
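
So firewalld is inactive and the mon only listens on the v1 port 6789; nothing is bound to the msgr2 port 3300. If msgr2 should be in use, I assume it would normally be enabled with:

$ ceph mon enable-msgr2

but since the cluster does not answer client commands right now, I have not been able to try that.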



3. The OSD, MDS and MGR logs are empty!
[root@ceph-node1 ceph]# ls -lh *.log
-rw-------  1 ceph ceph    0 Dec 11 03:09 ceph.audit.log
-rw-------  1 ceph ceph 3.7K Dec 11 08:36 ceph.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-mds.ceph-node1.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-mgr.ceph-node1.log
-rw-r--r--  1 ceph ceph 2.2M Dec 11 14:42 ceph-mon.ceph-node1.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-osd.0.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-osd.10.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-osd.11.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-osd.1.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-osd.2.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-osd.3.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-osd.4.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-osd.5.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-osd.6.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-osd.7.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-osd.8.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-osd.9.log
-rw-r--r--. 1 ceph ceph    0 Dec  9 03:19 ceph-rgw-ceph-node1.rgw0.log
-rw-r--r--  1 root root    0 Dec 11 03:09 ceph-volume.log
-rw-r--r--  1 root root    0 Dec 11 03:09 ceph-volume-systemd.log
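
Since the on-disk daemon logs are empty, I assume the output may be going to journald instead; I can check there and raise debug levels before restarting the daemons (the unit names below are what I expect on this host, they might differ):

$ journalctl -u ceph-mds@ceph-node1 --since "2019-12-09"

$ journalctl -u ceph-osd@0 --since "2019-12-09"

and in /etc/ceph/ceph.conf, for example:

[osd]
    debug osd = 10
[mds]
    debug mds = 10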



4.
[root@ceph-node1 ceph]# ceph -s
The command just blocks, and eventually fails with error 111 (connection refused) after a few hours.
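
Because the CLI cannot reach the mon, I can still query the monitor locally through its admin socket (assuming the socket exists under /var/run/ceph/ on ceph-node1):

$ ceph daemon mon.ceph-node1 mon_status

$ ceph daemon mon.ceph-node1 quorum_status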


------------------ Original message ------------------
From: "Stefan Kooman" <ste...@bit.nl>
Date: Wednesday, December 11, 2019, 2:37 PM
To: "Cc??" <o...@qq.com>
Cc: "ceph-users" <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] ceph-mon is blocked after shutting down and ip address changed



Quoting Cc?? (o...@qq.com):
> ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)
> 
> os: CentOS Linux release 7.7.1908 (Core)
> single node ceph cluster with 1 mon, 1 mgr, 1 mds, 1 rgw and 12 osds, but only cephfs is used.
> ceph -s is blocked after shutting down the machine (192.168.0.104); the ip address then changed to 192.168.1.6
> 
> I created the monmap with the monmap tool, updated ceph.conf and the hosts file, and then started ceph-mon.
> The ceph-mon log:
> ...
> 2019-12-11 08:57:45.170 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1285.14s
> 2019-12-11 08:57:50.170 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1290.14s
> 2019-12-11 08:57:55.171 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1295.14s
> 2019-12-11 08:58:00.171 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1300.14s
> 2019-12-11 08:58:05.172 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1305.14s
> 2019-12-11 08:58:10.171 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1310.14s
> 2019-12-11 08:58:15.173 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1315.14s
> 2019-12-11 08:58:20.173 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1320.14s
> 2019-12-11 08:58:25.174 7f952cdac700  1 mon.ceph-node1@0(leader).mds e34 no beacon from mds.0.10 (gid: 4384 addr: [v2:192.168.0.104:6898/4084823750,v1:192.168.0.104:6899/4084823750] state: up:active) since 1325.14s
> ...
> 
> I changed the IP back to 192.168.0.104 yesterday, but it's all the same.

Just checking here: do you run a firewall? Is port 3300 open (besides
6789)?

What do you see in the logs of the MDS and the OSDs? There are timers
configured in the MON / OSD in case they cannot reach each other (in
time). OSDs might get marked out. But I'm unsure what the status of
your cluster is. Could you paste a "ceph -s"?

Gr. Stefan

P.S. BTW: is this running in production?

-- 
| BIT BV  https://www.bit.nl/            Kamer van Koophandel 09090351
| GPG: 0xD14839C6                        +31 318 648 688 / i...@bit.nl