Hi,

> 
> Have you increased the verbosity for the monitors, restarted them, and
> looked at the log output? 

First of all: the bug is still there, and the logs do not help. But I 
seem to have found a workaround (just for myself, not a general one).

As for the bug:

I appended "debug log=20" to ceph-deploy's generated ceph.conf but I 
dont see much in the logs (and they do not get larger bym this 
option). Here is one of the monitor logs form /var/lib/ceph; the 
other ones look identically.

2014-04-08 08:26:09.405714 7fd1a0a94780  0 ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 28842
2014-04-08 08:26:09.851227 7f66ecd06780  0 ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 28943
2014-04-08 08:26:09.933034 7f66ecd06780  0 mon.hvrrzceph2 does not exist in monmap, will attempt to join an existing cluster
2014-04-08 08:26:09.933417 7f66ecd06780  0 using public_addr 10.111.3.2:0/0 -> 10.111.3.2:6789/0
2014-04-08 08:26:09.934003 7f66ecd06780  1 mon.hvrrzceph2@-1(probing) e0 preinit fsid c847e327-1bc5-445f-9c7e-de0551bfde06
2014-04-08 08:26:09.934149 7f66ecd06780  1 mon.hvrrzceph2@-1(probing) e0  initial_members hvrrzceph1,hvrrzceph2,hvrrzceph3, filtering seed monmap
2014-04-08 08:26:09.937302 7f66ecd06780  0 mon.hvrrzceph2@-1(probing) e0  my rank is now 0 (was -1)
2014-04-08 08:26:09.938254 7f66e63c9700  0 -- 10.111.3.2:6789/0 >> 0.0.0.0:0/2 pipe(0x15fba00 sd=21 :0 s=1 pgs=0 cs=0 l=0 c=0x15c9c60).fault
2014-04-08 08:26:09.938442 7f66e61c7700  0 -- 10.111.3.2:6789/0 >> 10.112.3.2:6789/0 pipe(0x1605280 sd=22 :0 s=1 pgs=0 cs=0 l=0 c=0x15c99a0).fault
2014-04-08 08:26:09.939001 7f66ecd04700  0 -- 10.111.3.2:6789/0 >> 0.0.0.0:0/1 pipe(0x15fb280 sd=25 :0 s=1 pgs=0 cs=0 l=0 c=0x15c9420).fault
2014-04-08 08:26:09.939120 7f66e62c8700  0 -- 10.111.3.2:6789/0 >> 10.112.3.1:6789/0 pipe(0x1605780 sd=24 :0 s=1 pgs=0 cs=0 l=0 c=0x15c9b00).fault
2014-04-08 08:26:09.941140 7f66e60c6700  0 -- 10.111.3.2:6789/0 >> 10.112.3.3:6789/0 pipe(0x1605c80 sd=23 :0 s=1 pgs=0 cs=0 l=0 c=0x15c9840).fault
2014-04-08 08:27:09.934720 7f66e7bcc700  0 mon.hvrrzceph2@0(probing).data_health(0) update_stats avail 70% total 15365520 used 3822172 avail 10762804
2014-04-08 08:28:09.935036 7f66e7bcc700  0 mon.hvrrzceph2@0(probing).data_health(0) update_stats avail 70% total 15365520 used 3822172 avail 10762804
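
For reference, the following is the kind of setting I understand should 
really raise the monitor's verbosity (option names taken from the 
documentation; I am not sure "debug log" was the right knob, so please 
treat this as an assumption on my part):

# in ceph.conf on the monitor hosts
[mon]
    debug mon = 20
    debug ms = 1
    debug paxos = 10

or, at runtime via the monitor's admin socket (which in my case 
apparently never appears, see below):

ceph --admin-daemon /var/run/ceph/ceph-mon.hvrrzceph2.asok config set debug_mon 20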

Since ceph-deploy complained about not getting an answer from 
ceph-create-keys:

[hvrrzceph3][DEBUG ] Starting ceph-create-keys on hvrrzceph3...
[hvrrzceph3][WARNIN] No data was received after 7 seconds, disconnecting...

I tried to create the keys manually:

hvrrzceph2:~ # ceph-create-keys --id client.admin
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
INFO:ceph-create-keys:ceph-mon admin socket not ready yet.

[etc.]
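
If I read the init scripts correctly, ceph-create-keys expects the 
monitor's id (not client.admin) and then waits for that monitor's admin 
socket. So what I would check next is something like this (the id and 
the default socket path for this host are my assumptions):

# does the monitor's admin socket exist at all?
ls -l /var/run/ceph/
# if it does, ask the monitor for its status and quorum state
ceph --admin-daemon /var/run/ceph/ceph-mon.hvrrzceph2.asok mon_status
# and retry the key creation with the monitor id
ceph-create-keys --id hvrrzceph2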

As for the workaround: what I wanted to do is this: I have three 
servers, two NICs, and three IP addresses per server. The NICs are 
bonded; the bond has an IP address in one network (untagged), and 
additionally two tagged VLANs are on the bond. The bug occurred when I 
tried to use a dedicated cluster network (i.e. one of the tagged VLANs) 
and a dedicated public network (the other tagged VLAN). At that time, 
I had "cluster network" and "public network" set accordingly in 
ceph.conf.

I then tried leaving "cluster network" and "public network" out of 
ceph.conf ... and now I could create the cluster.
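
For completeness, the two lines I removed looked roughly like this 
(10.111.3.0/24 matches the public_addr in the log above; which of the 
two tagged VLANs carried the cluster network I am reconstructing from 
memory, so the second subnet is an assumption):

[global]
    public network  = 10.111.3.0/24
    cluster network = 10.112.3.0/24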

So it seems to be a network problem, as you (and Brian) supposed. 
However, ssh etc. work properly on all three networks. I don't really 
understand what is going on there, but at least I can continue to 
learn.
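
If anybody has an idea what else to look at: so far I have only 
verified ssh. What I plan to try next on the tagged VLANs is roughly 
this (untested so far; the addresses are taken from the log above, and 
the interface name bond0 is an assumption):

# can the monitor port be reached across the tagged VLAN?
nc -z -w 3 10.112.3.1 6789
# do full-size, unfragmented packets survive the bond/VLAN (MTU check)?
ping -c 3 -M do -s 1472 10.112.3.1
# VLAN and bonding details of the interface
ip -d link show bond0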

Thank you.

best regards
Diedrich 




-- 
Diedrich Ehlerding, Fujitsu Technology Solutions GmbH,
FTS CE SC PS&IS W, Hildesheimer Str 25, D-30880 Laatzen
Fon +49 511 8489-1806, Fax -251806, Mobil +49 173 2464758
Firmenangaben: http://de.ts.fujitsu.com/imprint.html
