On Tue, Apr 8, 2014 at 3:33 AM, Diedrich Ehlerding
<[email protected]> wrote:
> Hi,
>
>> Have you increased the verbosity for the monitors, restarted them,
>> and looked at the log output?
>
> First of all: the bug is still there, and the logs do not help. But
> I seem to have found a workaround (just for myself, not a general
> one).
>
> As for the bug:
>
> I appended "debug log=20" to ceph-deploy's generated ceph.conf, but
> I don't see much in the logs (and they do not get larger with this
> option). Here is one of the monitor logs from /var/lib/ceph; the
> others look identical.

You would probably need to up the verbosity for the monitors, so the
global section would look like this:

    debug mon = 20
    debug ms = 10

Then restart the mons and check the output.
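For example, with those two lines added under [global] in ceph.conf on
each mon host, restarting one monitor with the stock sysvinit script
(assuming that is what your distro uses with 0.72) would look roughly
like this, using the mon id from your log:

    /etc/init.d/ceph restart mon.hvrrzceph2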
> 2014-04-08 08:26:09.405714 7fd1a0a94780  0 ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 28842
> 2014-04-08 08:26:09.851227 7f66ecd06780  0 ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 28943
> 2014-04-08 08:26:09.933034 7f66ecd06780  0 mon.hvrrzceph2 does not exist in monmap, will attempt to join an existing cluster
> 2014-04-08 08:26:09.933417 7f66ecd06780  0 using public_addr 10.111.3.2:0/0 -> 10.111.3.2:6789/0
> 2014-04-08 08:26:09.934003 7f66ecd06780  1 mon.hvrrzceph2@-1(probing) e0 preinit fsid c847e327-1bc5-445f-9c7e-de0551bfde06
> 2014-04-08 08:26:09.934149 7f66ecd06780  1 mon.hvrrzceph2@-1(probing) e0 initial_members hvrrzceph1,hvrrzceph2,hvrrzceph3, filtering seed monmap
> 2014-04-08 08:26:09.937302 7f66ecd06780  0 mon.hvrrzceph2@-1(probing) e0 my rank is now 0 (was -1)
> 2014-04-08 08:26:09.938254 7f66e63c9700  0 -- 10.111.3.2:6789/0 >> 0.0.0.0:0/2 pipe(0x15fba00 sd=21 :0 s=1 pgs=0 cs=0 l=0 c=0x15c9c60).fault
> 2014-04-08 08:26:09.938442 7f66e61c7700  0 -- 10.111.3.2:6789/0 >> 10.112.3.2:6789/0 pipe(0x1605280 sd=22 :0 s=1 pgs=0 cs=0 l=0 c=0x15c99a0).fault
> 2014-04-08 08:26:09.939001 7f66ecd04700  0 -- 10.111.3.2:6789/0 >> 0.0.0.0:0/1 pipe(0x15fb280 sd=25 :0 s=1 pgs=0 cs=0 l=0 c=0x15c9420).fault
> 2014-04-08 08:26:09.939120 7f66e62c8700  0 -- 10.111.3.2:6789/0 >> 10.112.3.1:6789/0 pipe(0x1605780 sd=24 :0 s=1 pgs=0 cs=0 l=0 c=0x15c9b00).fault
> 2014-04-08 08:26:09.941140 7f66e60c6700  0 -- 10.111.3.2:6789/0 >> 10.112.3.3:6789/0 pipe(0x1605c80 sd=23 :0 s=1 pgs=0 cs=0 l=0 c=0x15c9840).fault
> 2014-04-08 08:27:09.934720 7f66e7bcc700  0 mon.hvrrzceph2@0(probing).data_health(0) update_stats avail 70% total 15365520 used 3822172 avail 10762804
> 2014-04-08 08:28:09.935036 7f66e7bcc700  0 mon.hvrrzceph2@0(probing).data_health(0) update_stats avail 70% total 15365520 used 3822172 avail 10762804
>
> Since ceph-deploy complained about not getting an answer from
> ceph-create-keys:
>
> [hvrrzceph3][DEBUG ] Starting ceph-create-keys on hvrrzceph3...
> [hvrrzceph3][WARNIN] No data was received after 7 seconds, disconnecting...
>
> I therefore tried to create the keys manually:
>
> hvrrzceph2:~ # ceph-create-keys --id client.admin
> admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
> INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
> admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
> INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
>
> [etc.]
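One thing I noticed in that last step: unless I am misreading it, the
--id argument to ceph-create-keys is meant to be the id of the local
monitor, not a client name; the tool polls that mon's admin socket
until the mon reaches quorum, and only then creates client.admin and
the bootstrap keys. So on hvrrzceph2 the invocation would presumably
be:

    ceph-create-keys --id hvrrzceph2

With "--id client.admin" it waits for an admin socket belonging to a
mon named "client.admin", which never appears, hence the endless "not
ready yet" loop. (It would keep looping here either way, though, since
your mons never reach quorum in the first place.)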
> As for the workaround: what I wanted to do is this: I have three
> servers, two NICs, and three IP addresses per server. The NICs are
> bonded; the bond has an IP address in one network (untagged), and
> additionally two tagged VLANs are also on the bond. The bug occurred
> when I tried to use a dedicated cluster network (i.e. one of the
> tagged VLANs) and another dedicated public network (the other tagged
> VLAN). At that time, I had both "cluster network" and "public
> network" set in ceph.conf.
>
> I now tried to leave "cluster network" and "public network" out of
> ceph.conf ... and now I could create the cluster.
>
> So it seems to be a network problem, as you (and Brian) supposed.
> However, ssh etc. work properly on all three networks. I don't
> really understand what's going on there, but at least I can continue
> to learn.
>
> Thank you.
>
> best regards
> Diedrich
>
> --
> Diedrich Ehlerding, Fujitsu Technology Solutions GmbH,
> FTS CE SC PS&IS W, Hildesheimer Str 25, D-30880 Laatzen
> Fon +49 511 8489-1806, Fax -251806, Mobil +49 173 2464758
> Firmenangaben: http://de.ts.fujitsu.com/imprint.html
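As for the networks: judging from your log above (the mon binds to
10.111.3.2:6789 and then faults trying to reach 10.112.3.x:6789), the
stanza you removed presumably looked something like this; the /24
masks are a guess on my part:

    [global]
    public network  = 10.111.3.0/24
    cluster network = 10.112.3.0/24

Monitors live entirely on the public network, so if cluster-network
addresses end up in the initial monmap (which is what the faults
against 10.112.3.x:6789 suggest), the mons can never find each other
and stay stuck in the probing state you saw.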
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com