Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster
I think I might have found the issue. Something is wrong with my crush map. I was just attempting to modify it:

microserver-1:~ # ceph osd getcrushmap -o /tmp/cm
got crush map from osdmap epoch 3937
microserver-1:~ # crushtool -d /tmp/cm -o /tmp/cm.txt
microserver-1:~ # vim /tmp/cm.txt
microserver-1:~ # crushtool -c /tmp/cm.txt -o /tmp/cm.new
microserver-1:~ # ceph osd setcrushmap -i /tmp/cm.new
Error EINVAL: Failed to parse crushmap: buffer::end_of_buffer
microserver-1:~ # crushtool -c /tmp/cm.txt -o /tmp/cm.new
microserver-1:~ # ceph osd setcrushmap -i /tmp/cm.new
Error EPERM: Failed to parse crushmap: error running crushmap through crushtool: (1) Operation not permitted

It's as though something is missing or broken in my crush map. This cluster has been around for at least two years and has been upgraded to each new version of Ceph.

-----Original Message-----
From: Malcolm Haak
Sent: Wednesday, 18 March 2015 12:53 PM
To: Malcolm Haak; Joao Eduardo Luis; ceph-users@lists.ceph.com
Subject: RE: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

Sorry to bump this one, but I have more hardware coming and I still cannot add another OSD to my cluster. Does anybody have any clues?

-----Original Message-----
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Malcolm Haak
Sent: Friday, 13 March 2015 10:05 AM
To: Joao Eduardo Luis; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

Sorry about this, I sent this at 1 AM last night and went to bed; I didn't realise the log was far too long and the email had been blocked. I've reattached all the requested files and trimmed the body of the email. Thank you again for looking at this.
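For what it's worth, the decompile / edit / recompile cycle attempted at the top of this message can be wrapped in a small script with a validation step before the map is injected. This is only a dry-run sketch (the run() wrapper just echoes each command; drop the echo to execute against a live cluster, with ceph and crushtool on the PATH), and crushtool's --test/--show-bad-mappings mode is the stock way to look for inputs that fail to map before committing a new map:

```shell
#!/bin/sh
# Dry-run sketch of the CRUSH map edit cycle.  run() only prints the
# command; remove the 'echo' to execute against a real cluster.
set -e
run() { echo "+ $*"; }

run ceph osd getcrushmap -o /tmp/cm            # fetch the binary map
run crushtool -d /tmp/cm -o /tmp/cm.txt        # decompile to text
run vim /tmp/cm.txt                            # edit
run crushtool -c /tmp/cm.txt -o /tmp/cm.new    # recompile
# Sanity-check the recompiled map before injecting it:
run crushtool -i /tmp/cm.new --test --show-bad-mappings
run ceph osd setcrushmap -i /tmp/cm.new
```

With the echoes removed, any bad mappings are reported before setcrushmap ever runs, which would surface a broken round-trip like the one above without touching the cluster.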
-----Original Message-----
From: Malcolm Haak
Sent: Friday, 13 March 2015 1:38 AM
To: 'Joao Eduardo Luis'; ceph-users@lists.ceph.com
Subject: RE: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

Ok. So, I've been doing things in the meantime, and as a result the osd is now requesting maps 3008 and 3009 instead of 2758/9. I've included the problem OSD's log file and attached all the osdmaps as requested.

Regards
Malcolm Haak

-----Original Message-----
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Joao Eduardo Luis
Sent: Friday, 13 March 2015 1:02 AM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

On 03/12/2015 05:16 AM, Malcolm Haak wrote:
> Sorry about all the unrelated grep issues. So I've rebuilt and reinstalled and it's still broken. On the working node, even with the new packages, everything works. On the new broken node, I've added a mon and it works. But I still cannot start an OSD on the new node. What else do you need from me? I'll get logs and run any number of tests. I've got data in this cluster already, and it's full, so I need to expand it; I've already got the hardware. Thanks in advance for even having a look.

Sam mentioned to me on IRC that the next step would be to grab the offending osdmaps. The easiest way to do that will be to stop a monitor and run 'ceph-monstore-tool' to obtain the full maps, then use 'ceph-kvstore-tool' to obtain the incrementals.
Given the osd is crashing on version 2759, the following would be best (assuming you have stopped a given monitor with id FOO, whose store is sitting at the default path /var/lib/ceph/mon/ceph-FOO):

ceph-monstore-tool /var/lib/ceph/mon/ceph-FOO get osdmap -- --version 2758 --out /tmp/osdmap.full.2758
ceph-monstore-tool /var/lib/ceph/mon/ceph-FOO get osdmap -- --version 2759 --out /tmp/osdmap.full.2759

(please note the '--' between 'osdmap' and '--version', as that is required for the tool to do its thing)

and then

ceph-kvstore-tool /var/lib/ceph/mon/ceph-FOO/store.db get osdmap 2758 out /tmp/osdmap.inc.2758
ceph-kvstore-tool /var/lib/ceph/mon/ceph-FOO/store.db get osdmap 2759 out /tmp/osdmap.inc.2759

Cheers!

  -Joao

-----Original Message-----
From: Samuel Just [mailto:sj...@redhat.com]
Sent: Wednesday, 11 March 2015 1:41 AM
To: Malcolm Haak; jl...@redhat.com
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

Joao, it looks like map 2759 is causing trouble. How would he get the full and incremental maps for that out of the mons?
-Sam

On Tue, 2015-03-10 at 14:12 +, Malcolm Haak wrote:
> Hi Samuel,
> The sha1? I'm going to admit ignorance as to what you are looking for. They are all running the same release, if that is what you are asking. Same tarball built into rpms using rpmbuild on both nodes; the only difference being that the other node has been upgraded and the problem node is fresh. I've added the requested config; here is the command line output:
>
> microserver-1:/etc # /etc/init.d/ceph start osd.3

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster
Hi all,

So the init script issue is sorted: my grep binary was not working correctly. I've replaced it and everything seems to be fine. Which now has me wondering if the binaries I generated are any good... the bad grep might have caused issues with the build. I'm going to recompile after some more sanity testing.

-----Original Message-----
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Malcolm Haak
Sent: Wednesday, 11 March 2015 8:56 PM
To: Samuel Just; jl...@redhat.com
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

I ran ceph-osd via the command line... It's not really given me much more to go on, except that it's hitting an early end of buffer for some reason.

Also I've hit another issue: the /etc/init.d/ceph script is not seeing my new mon (I decided to add more mons to see if it would help, since the mon map looks like it is the issue). The script starts the mon fine, and the new mon (on the same host as this problem osd) appears to be good. The issue is when you do /etc/init.d/ceph status: it tells you that mon.b is dead. It seems to be one of the greps that is failing. Specifically

grep -qwe -i.$daemon_id /proc/\$pid/cmdline

returns 1. What's odd is that the same grep works on the other node for mon.a; it just doesn't work on this node for mon.b. I'm wondering if there is something odd happening.

Anyway, here is the output of the manual start of ceph-osd:

# /usr/bin/ceph-osd -i 3 --pid-file /var/run/ceph/osd.3.pid -c /etc/ceph/ceph.conf --cluster ceph -f
starting osd.3 at :/0 osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
2015-03-11 20:38:56.401205 7f04221e6880 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
2015-03-11 20:38:56.418747 7f04221e6880 -1 osd.3 2757 log_to_monitors {default=true}
terminate called after throwing an instance of 'ceph::buffer::end_of_buffer'
  what():  buffer::end_of_buffer
*** Caught signal (Aborted) **
 in thread 7f041192a700
 ceph version 0.93 (bebf8e9a830d998eeaab55f86bb256d4360dd3c4)
 1: /usr/bin/ceph-osd() [0xac7cea]
 2: (()+0x10050) [0x7f04210f1050]
 3: (gsignal()+0x37) [0x7f041f5c40f7]
 4: (abort()+0x13a) [0x7f041f5c54ca]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f041fea9fe5]
 6: (()+0x63186) [0x7f041fea8186]
 7: (()+0x631b3) [0x7f041fea81b3]
 8: (()+0x633d2) [0x7f041fea83d2]
 9: (ceph::buffer::list::iterator::copy(unsigned int, char*)+0x137) [0xc2cea7]
 10: (OSDMap::decode_classic(ceph::buffer::list::iterator)+0x605) [0xb7b7b5]
 11: (OSDMap::decode(ceph::buffer::list::iterator)+0x8c) [0xb7bebc]
 12: (OSDMap::decode(ceph::buffer::list)+0x3f) [0xb7dfbf]
 13: (OSD::handle_osd_map(MOSDMap*)+0xd37) [0x6cd9a7]
 14: (OSD::_dispatch(Message*)+0x3eb) [0x6d0afb]
 15: (OSD::ms_dispatch(Message*)+0x257) [0x6d1007]
 16: (DispatchQueue::entry()+0x649) [0xc6fe09]
 17: (DispatchQueue::DispatchThread::entry()+0xd) [0xb9dd7d]
 18: (()+0x83a4) [0x7f04210e93a4]
 19: (clone()+0x6d) [0x7f041f673a4d]
2015-03-11 20:38:56.471624 7f041192a700 -1 *** Caught signal (Aborted) **
 in thread 7f041192a700
 ceph version 0.93 (bebf8e9a830d998eeaab55f86bb256d4360dd3c4)
 [backtrace identical to the one above]
 NOTE: a copy of the executable, or `objdump -rdS executable` is needed to interpret this.

  -308> 2015-03-11 20:38:56.401205 7f04221e6880 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
   -77> 2015-03-11 20:38:56.418747 7f04221e6880 -1 osd.3 2757 log_to_monitors {default=true}
     0> 2015-03-11 20:38:56.471624 7f041192a700 -1 *** Caught signal (Aborted) ** in thread 7f041192a700
 ceph version 0.93 (bebf8e9a830d998eeaab55f86bb256d4360dd3c4)
 1
Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster
I've no idea if this helps, but I was looking in the meta file of osd.3 to see if things there made any sense. I'm very much out of my depth. To me this looks like a bug; quite possibly a corner case, but a bug nonetheless. Anyway, I've included my crush map and what look like the osdmap files out of the osd that won't start. Cracking them open, it appears that the new osd.3 is not in the map at all... which might be correct, but I would have expected to see it in the layout. I've also added the current osdmap dump as well. If I'm asking in the wrong place, please let me know; I don't want to be wasting people's time.

-----Original Message-----
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Malcolm Haak
Sent: Thursday, 12 March 2015 4:16 PM
To: Samuel Just; jl...@redhat.com
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

Sorry about all the unrelated grep issues. So I've rebuilt and reinstalled and it's still broken. On the working node, even with the new packages, everything works. On the new broken node, I've added a mon and it works. But I still cannot start an OSD on the new node. What else do you need from me? I'll get logs and run any number of tests. I've got data in this cluster already, and it's full, so I need to expand it; I've already got the hardware. Thanks in advance for even having a look.

-----Original Message-----
From: Samuel Just [mailto:sj...@redhat.com]
Sent: Wednesday, 11 March 2015 1:41 AM
To: Malcolm Haak; jl...@redhat.com
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

Joao, it looks like map 2759 is causing trouble. How would he get the full and incremental maps for that out of the mons?
-Sam

On Tue, 2015-03-10 at 14:12 +, Malcolm Haak wrote:
> Hi Samuel,
> The sha1? I'm going to admit ignorance as to what you are looking for. They are all running the same release, if that is what you are asking. Same tarball built into rpms using rpmbuild on both nodes; the only difference being that the other node has been upgraded and the problem node is fresh. I've added the requested config; here is the command line output:
>
> microserver-1:/etc # /etc/init.d/ceph start osd.3
> === osd.3 ===
> Mounting xfs on microserver-1:/var/lib/ceph/osd/ceph-3
> 2015-03-11 01:00:13.492279 7f05b2f72700  1 -- :/0 messenger.start
> 2015-03-11 01:00:13.492823 7f05b2f72700  1 -- :/1002795 --> 192.168.0.10:6789/0 -- auth(proto 0 26 bytes epoch 0) v1 -- ?+0 0x7f05ac0290b0 con 0x7f05ac027c40
> 2015-03-11 01:00:13.510814 7f05b07ef700  1 -- 192.168.0.250:0/1002795 learned my addr 192.168.0.250:0/1002795
> 2015-03-11 01:00:13.527653 7f05abfff700  1 -- 192.168.0.250:0/1002795 <== mon.0 192.168.0.10:6789/0 1 ==== mon_map magic: 0 v1 ==== 191+0+0 (1112175541 0 0) 0x7f05aab0 con 0x7f05ac027c40
> 2015-03-11 01:00:13.527899 7f05abfff700  1 -- 192.168.0.250:0/1002795 <== mon.0 192.168.0.10:6789/0 2 ==== auth_reply(proto 1 0 (0) Success) v1 ==== 24+0+0 (3859410672 0 0) 0x7f05ae70 con 0x7f05ac027c40
> 2015-03-11 01:00:13.527973 7f05abfff700  1 -- 192.168.0.250:0/1002795 --> 192.168.0.10:6789/0 -- mon_subscribe({monmap=0+}) v2 -- ?+0 0x7f05ac029730 con 0x7f05ac027c40
> 2015-03-11 01:00:13.528124 7f05b2f72700  1 -- 192.168.0.250:0/1002795 --> 192.168.0.10:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 0x7f05ac029a50 con 0x7f05ac027c40
> 2015-03-11 01:00:13.528265 7f05b2f72700  1 -- 192.168.0.250:0/1002795 --> 192.168.0.10:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 0x7f05ac029f20 con 0x7f05ac027c40
> 2015-03-11 01:00:13.530359 7f05abfff700  1 -- 192.168.0.250:0/1002795 <== mon.0 192.168.0.10:6789/0 3 ==== mon_map magic: 0 v1 ==== 191+0+0 (1112175541 0 0) 0x7f05aab0 con 0x7f05ac027c40
> 2015-03-11 01:00:13.530548 7f05abfff700  1 -- 192.168.0.250:0/1002795 <== mon.0 192.168.0.10:6789/0 4 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (3648139960 0 0) 0x7f05afb0 con 0x7f05ac027c40
> 2015-03-11 01:00:13.531114 7f05abfff700  1 -- 192.168.0.250:0/1002795 <== mon.0 192.168.0.10:6789/0 5 ==== osd_map(3277..3277 src has 2757..3277) v3 ==== 5366+0+0 (3110999244 0 0) 0x7f05a0002800 con 0x7f05ac027c40
> 2015-03-11 01:00:13.531772 7f05abfff700  1 -- 192.168.0.250:0/1002795 <== mon.0 192.168.0.10:6789/0 6 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (3648139960 0 0) 0x7f05afb0 con 0x7f05ac027c40
> 2015-03-11 01:00:13.532186 7f05abfff700  1 -- 192.168.0.250:0/1002795 <== mon.0 192.168.0.10:6789/0 7 ==== osd_map(3277..3277 src has 2757..3277) v3 ==== 5366+0+0 (3110999244 0 0) 0x7f05a0001250 con 0x7f05ac027c40
> 2015-03-11 01:00:13.532260 7f05abfff700  1 -- 192.168.0.250:0/1002795 <== mon.0 192.168.0.10:6789/0 8 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (3648139960 0 0) 0x7f05afb0 con 0x7f05ac027c40
> 2015-03-11 01:00:13.556748 7f05b2f72700  1 -- 192.168.0.250:0/1002795 --> 192.168.0.10:6789/0
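As to what the crash in the earlier logs means: ceph::buffer::end_of_buffer is thrown when OSDMap decode tries to read past the end of the buffer it was handed, i.e. the map blob is shorter than its own encoding says it should be. A toy length-prefixed "decoder" (this is not Ceph's actual wire format, just the failure shape) makes the mechanism concrete:

```shell
#!/bin/sh
# Toy model of a length-prefixed decode: the first argument stands in for
# the length the blob's header promises, the second for the payload that
# is actually present.
decode_blob() {
    promised=$1
    payload=$2
    actual=${#payload}
    if [ "$actual" -lt "$promised" ]; then
        echo "end_of_buffer: header promises $promised bytes, only $actual present"
        return 1
    fi
    echo "$payload"
}

decode_blob 3 abc          # complete blob: prints abc
decode_blob 8 abc || true  # truncated blob: prints the end_of_buffer error
```

This is consistent with the osdmap blobs the osd is being fed being truncated, or their contents disagreeing with what the encoding header claims.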
Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster
On 03/12/2015 05:16 AM, Malcolm Haak wrote:
> Sorry about all the unrelated grep issues. So I've rebuilt and reinstalled and it's still broken. On the working node, even with the new packages, everything works. On the new broken node, I've added a mon and it works. But I still cannot start an OSD on the new node. What else do you need from me? I'll get logs and run any number of tests. I've got data in this cluster already, and it's full, so I need to expand it; I've already got the hardware. Thanks in advance for even having a look.

Sam mentioned to me on IRC that the next step would be to grab the offending osdmaps. The easiest way to do that will be to stop a monitor and run 'ceph-monstore-tool' to obtain the full maps, then use 'ceph-kvstore-tool' to obtain the incrementals.

Given the osd is crashing on version 2759, the following would be best (assuming you have stopped a given monitor with id FOO, whose store is sitting at the default path /var/lib/ceph/mon/ceph-FOO):

ceph-monstore-tool /var/lib/ceph/mon/ceph-FOO get osdmap -- --version 2758 --out /tmp/osdmap.full.2758
ceph-monstore-tool /var/lib/ceph/mon/ceph-FOO get osdmap -- --version 2759 --out /tmp/osdmap.full.2759

(please note the '--' between 'osdmap' and '--version', as that is required for the tool to do its thing)

and then

ceph-kvstore-tool /var/lib/ceph/mon/ceph-FOO/store.db get osdmap 2758 out /tmp/osdmap.inc.2758
ceph-kvstore-tool /var/lib/ceph/mon/ceph-FOO/store.db get osdmap 2759 out /tmp/osdmap.inc.2759

Cheers!

  -Joao

-----Original Message-----
From: Samuel Just [mailto:sj...@redhat.com]
Sent: Wednesday, 11 March 2015 1:41 AM
To: Malcolm Haak; jl...@redhat.com
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

Joao, it looks like map 2759 is causing trouble. How would he get the full and incremental maps for that out of the mons?
-Sam

On Tue, 2015-03-10 at 14:12 +, Malcolm Haak wrote:
> Hi Samuel,
> The sha1? I'm going to admit ignorance as to what you are looking for. They are all running the same release, if that is what you are asking. Same tarball built into rpms using rpmbuild on both nodes; the only difference being that the other node has been upgraded and the problem node is fresh. I've added the requested config; here is the command line output:
>
> microserver-1:/etc # /etc/init.d/ceph start osd.3
> === osd.3 ===
> Mounting xfs on microserver-1:/var/lib/ceph/osd/ceph-3
> [snip: startup log identical to the copy quoted earlier in the thread]
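Joao's two extraction steps above can also be looped over the epochs of interest. Again a dry-run sketch (run() only echoes each command; FOO and the epoch numbers are the placeholders from the instructions, and the monitor must be stopped first):

```shell
#!/bin/sh
# Dry-run: run() only prints the command; drop the echo to execute.
run() { echo "+ $*"; }

mon_store=/var/lib/ceph/mon/ceph-FOO    # stop this monitor first
for epoch in 2758 2759; do
    # Full map; the lone '--' is required so --version reaches the
    # 'get' subcommand instead of being parsed as a global option.
    run ceph-monstore-tool "$mon_store" get osdmap -- \
        --version "$epoch" --out "/tmp/osdmap.full.$epoch"
    # Incremental map, read straight out of the mon's key/value store.
    run ceph-kvstore-tool "$mon_store/store.db" get osdmap "$epoch" \
        out "/tmp/osdmap.inc.$epoch"
done
```

Extending the epoch list (e.g. to 3008 and 3009 once the osd starts asking for those) is then a one-line change.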
Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster
Sorry about this, I sent this at 1 AM last night and went to bed; I didn't realise the log was far too long and the email had been blocked. I've reattached all the requested files and trimmed the body of the email. Thank you again for looking at this.

-----Original Message-----
From: Malcolm Haak
Sent: Friday, 13 March 2015 1:38 AM
To: 'Joao Eduardo Luis'; ceph-users@lists.ceph.com
Subject: RE: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

Ok. So, I've been doing things in the meantime, and as a result the osd is now requesting maps 3008 and 3009 instead of 2758/9. I've included the problem OSD's log file and attached all the osdmaps as requested.

Regards
Malcolm Haak

-----Original Message-----
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Joao Eduardo Luis
Sent: Friday, 13 March 2015 1:02 AM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

[snip: Joao's osdmap-extraction instructions and the earlier quoted messages, all reproduced in full earlier in the thread]
> I've added the requested config; here is the command line output:
>
> microserver-1:/etc # /etc/init.d/ceph start osd.3

Attachments:
ceph-osd.3.log
osdmap.full.3008
osdmap.full.3009
osdmap.inc.3008
osdmap.inc.3009

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster
Sorry about all the unrelated grep issues. So I've rebuilt and reinstalled and it's still broken. On the working node, even with the new packages, everything works. On the new broken node, I've added a mon and it works. But I still cannot start an OSD on the new node. What else do you need from me? I'll get logs and run any number of tests. I've got data in this cluster already, and it's full, so I need to expand it; I've already got the hardware. Thanks in advance for even having a look.

-----Original Message-----
From: Samuel Just [mailto:sj...@redhat.com]
Sent: Wednesday, 11 March 2015 1:41 AM
To: Malcolm Haak; jl...@redhat.com
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

Joao, it looks like map 2759 is causing trouble. How would he get the full and incremental maps for that out of the mons?
-Sam

On Tue, 2015-03-10 at 14:12 +, Malcolm Haak wrote:
> Hi Samuel,
> The sha1? I'm going to admit ignorance as to what you are looking for. They are all running the same release, if that is what you are asking. Same tarball built into rpms using rpmbuild on both nodes; the only difference being that the other node has been upgraded and the problem node is fresh.
> I've added the requested config; here is the command line output:
>
> microserver-1:/etc # /etc/init.d/ceph start osd.3
> === osd.3 ===
> Mounting xfs on microserver-1:/var/lib/ceph/osd/ceph-3
> [snip: startup log identical to the copy quoted earlier in the thread]
> 2015-03-11 01:00:13.556748 7f05b2f72700  1 -- 192.168.0.250:0/1002795 --> 192.168.0.10:6789/0 -- mon_command({"prefix": "get_command_descriptions"} v 0) v1 -- ?+0 0x7f05ac016ac0 con 0x7f05ac027c40
> 2015-03-11 01:00:13.564968 7f05abfff700  1 -- 192.168.0.250:0/1002795 <== mon.0 192.168.0.10:6789/0 9 ==== mon_command_ack([{"prefix": "get_command_descriptions"}]=0 v0) v1 ==== 72+0+34995 (1092875540 0 1727986498) 0x7f05aa70 con 0x7f05ac027c40
> 2015-03-11 01:00:13.770122 7f05b2f72700  1 -- 192.168.0.250:0/1002795 --> 192.168.0.10:6789/0 -- mon_command({"prefix": "osd crush create-or-move", "args": ["host=microserver-1", "root=default"], "id": 3, "weight": 1.81} v 0) v1 -- ?+0 0x7f05ac016ac0 con 0x7f05ac027c40
> 2015-03-11 01:00:13.772299 7f05abfff700  1 -- 192.168.0.250:0/1002795 <== mon.0 192.168.0.10:6789/0 10 ==== mon_command_ack([{"prefix": "osd crush create-or-move", "args": ["host=microserver-1", "root=default"], "id": 3, "weight": 1.81}]=0 create-or-move updated item name 'osd.3' weight 1.81 at location {host=microserver-1,root=default
Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster
Can you reproduce this with

debug osd = 20
debug filestore = 20
debug ms = 1

on the crashing osd? Also, what sha1 are the other osds and mons running?
-Sam

----- Original Message -----
From: Malcolm Haak <malc...@sgi.com>
To: ceph-users@lists.ceph.com
Sent: Tuesday, March 10, 2015 3:28:26 AM
Subject: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster

Hi all,

I've just attempted to add a new node and OSD to an existing ceph cluster (it's a small one I use as a NAS at home, not like the big production ones I normally work on) and it seems to be throwing some odd errors. Just looking for where to poke it next; the log is below. It's a two-node cluster with 3 osds in node A and one osd in the new node (which is going to have more eventually; node one will be retired after node three gets added). And I've hit a weird snag: I was running 0.80, but I ran into the 'Invalid Command' bug on the new node, so I opted to jump to the latest code with the required patches already included. Please let me know what else you need.
This is the log content when attempting to start the new OSD:

2015-03-10 19:28:48.795318 7f0774108880  0 ceph version 0.93 (bebf8e9a830d998eeaab55f86bb256d4360dd3c4), process ceph-osd, pid 10810
2015-03-10 19:28:48.817803 7f0774108880  0 filestore(/var/lib/ceph/osd/ceph-3) backend xfs (magic 0x58465342)
2015-03-10 19:28:48.866862 7f0774108880  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_features: FIEMAP ioctl is supported and appears to work
2015-03-10 19:28:48.866920 7f0774108880  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2015-03-10 19:28:48.905069 7f0774108880  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2015-03-10 19:28:48.905467 7f0774108880  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-3) detect_feature: extsize is supported and kernel 3.18.3-1-desktop >= 3.5
2015-03-10 19:28:49.077872 7f0774108880  0 filestore(/var/lib/ceph/osd/ceph-3) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2015-03-10 19:28:49.078321 7f0774108880 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2015-03-10 19:28:49.078328 7f0774108880  1 journal _open /var/lib/ceph/osd/ceph-3/journal fd 19: 1073741824 bytes, block size 4096 bytes, directio = 1, aio = 0
2015-03-10 19:28:49.079721 7f0774108880  1 journal _open /var/lib/ceph/osd/ceph-3/journal fd 19: 1073741824 bytes, block size 4096 bytes, directio = 1, aio = 0
2015-03-10 19:28:49.080948 7f0774108880  0 cls cls/hello/cls_hello.cc:271: loading cls_hello
2015-03-10 19:28:49.094194 7f0774108880  0 osd.3 2757 crush map has features 33816576, adjusting msgr requires for clients
2015-03-10 19:28:49.094211 7f0774108880  0 osd.3 2757 crush map has features 33816576 was 8705, adjusting msgr requires for mons
2015-03-10 19:28:49.094217 7f0774108880  0 osd.3 2757 crush map has features 33816576, adjusting msgr requires for osds
2015-03-10 19:28:49.094235 7f0774108880  0 osd.3 2757 load_pgs
2015-03-10 19:28:49.094279 7f0774108880  0 osd.3 2757 load_pgs opened 0 pgs
2015-03-10 19:28:49.095121 7f0774108880 -1 osd.3 2757 log_to_monitors {default=true}
2015-03-10 19:28:49.134104 7f0774108880  0 osd.3 2757 done with init, starting boot process
2015-03-10 19:28:49.149994 7f076384c700 -1 *** Caught signal (Aborted) **
 in thread 7f076384c700

 ceph version 0.93 (bebf8e9a830d998eeaab55f86bb256d4360dd3c4)
 1: /usr/bin/ceph-osd() [0xac7cea]
 2: (()+0x10050) [0x7f0773013050]
 3: (gsignal()+0x37) [0x7f07714e60f7]
 4: (abort()+0x13a) [0x7f07714e74ca]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7f0771dcbfe5]
 6: (()+0x63186) [0x7f0771dca186]
 7: (()+0x631b3) [0x7f0771dca1b3]
 8: (()+0x633d2) [0x7f0771dca3d2]
 9: (ceph::buffer::list::iterator::copy(unsigned int, char*)+0x137) [0xc2cea7]
 10: (OSDMap::decode_classic(ceph::buffer::list::iterator&)+0x605) [0xb7b7b5]
 11: (OSDMap::decode(ceph::buffer::list::iterator&)+0x8c) [0xb7bebc]
 12: (OSDMap::decode(ceph::buffer::list&)+0x3f) [0xb7dfbf]
 13: (OSD::handle_osd_map(MOSDMap*)+0xd37) [0x6cd9a7]
 14: (OSD::_dispatch(Message*)+0x3eb) [0x6d0afb]
 15: (OSD::ms_dispatch(Message*)+0x257) [0x6d1007]
 16: (DispatchQueue::entry()+0x649) [0xc6fe09]
 17: (DispatchQueue::DispatchThread::entry()+0xd) [0xb9dd7d]
 18: (()+0x83a4) [0x7f077300b3a4]
 19: (clone()+0x6d) [0x7f0771595a4d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
  -135> 2015-03-10 19:28:48.790490 7f0774108880  5 asok(0x420) register_command perfcounters_dump hook 0x41b4030
  -134> 2015-03-10 19:28:48.790565 7f0774108880  5 asok(0x420) register_command 1 hook 0x41b4030
  -133> 2015-03-10 19:28:48.790571 7f0774108880  5 asok(0x420) register_command perf dump hook 0x41b4030
  -132
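The backtrace above aborts inside OSDMap::decode_classic() when the buffer runs out of bytes (buffer::end_of_buffer), i.e. while decoding an osdmap the new OSD received. A minimal sketch of how one might check that map offline, assuming epoch 2757 (the one osd.3 reports above) and illustrative /tmp paths:

```shell
# Fetch the full osdmap for the epoch the OSD was holding, then try to
# decode it outside the OSD process.  If osdmaptool also fails on it,
# the stored map itself is suspect rather than the fresh OSD build.
ceph osd getmap 2757 -o /tmp/osdmap.2757
osdmaptool /tmp/osdmap.2757 --print
```

This only sketches the idea; it needs a live cluster and a mon that still has epoch 2757 in its range (the log shows "src has 2757..3277").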
Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster
Hi Samuel,

The sha1? I'll admit ignorance as to what you're looking for. They are all running the same release, if that's what you're asking: the same tarball built into rpms using rpmbuild on both nodes. The only difference is that the other node has been upgraded over time while the problem node is fresh.

I've added the requested config. Here is the command-line output:

microserver-1:/etc # /etc/init.d/ceph start osd.3
=== osd.3 ===
Mounting xfs on microserver-1:/var/lib/ceph/osd/ceph-3
2015-03-11 01:00:13.492279 7f05b2f72700  1 -- :/0 messenger.start
2015-03-11 01:00:13.492823 7f05b2f72700  1 -- :/1002795 -- 192.168.0.10:6789/0 -- auth(proto 0 26 bytes epoch 0) v1 -- ?+0 0x7f05ac0290b0 con 0x7f05ac027c40
2015-03-11 01:00:13.510814 7f05b07ef700  1 -- 192.168.0.250:0/1002795 learned my addr 192.168.0.250:0/1002795
2015-03-11 01:00:13.527653 7f05abfff700  1 -- 192.168.0.250:0/1002795 == mon.0 192.168.0.10:6789/0 1 mon_map magic: 0 v1 191+0+0 (1112175541 0 0) 0x7f05aab0 con 0x7f05ac027c40
2015-03-11 01:00:13.527899 7f05abfff700  1 -- 192.168.0.250:0/1002795 == mon.0 192.168.0.10:6789/0 2 auth_reply(proto 1 0 (0) Success) v1 24+0+0 (3859410672 0 0) 0x7f05ae70 con 0x7f05ac027c40
2015-03-11 01:00:13.527973 7f05abfff700  1 -- 192.168.0.250:0/1002795 -- 192.168.0.10:6789/0 -- mon_subscribe({monmap=0+}) v2 -- ?+0 0x7f05ac029730 con 0x7f05ac027c40
2015-03-11 01:00:13.528124 7f05b2f72700  1 -- 192.168.0.250:0/1002795 -- 192.168.0.10:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 0x7f05ac029a50 con 0x7f05ac027c40
2015-03-11 01:00:13.528265 7f05b2f72700  1 -- 192.168.0.250:0/1002795 -- 192.168.0.10:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 0x7f05ac029f20 con 0x7f05ac027c40
2015-03-11 01:00:13.530359 7f05abfff700  1 -- 192.168.0.250:0/1002795 == mon.0 192.168.0.10:6789/0 3 mon_map magic: 0 v1 191+0+0 (1112175541 0 0) 0x7f05aab0 con 0x7f05ac027c40
2015-03-11 01:00:13.530548 7f05abfff700  1 -- 192.168.0.250:0/1002795 == mon.0 192.168.0.10:6789/0 4 mon_subscribe_ack(300s) v1 20+0+0 (3648139960 0 0) 0x7f05afb0 con 0x7f05ac027c40
2015-03-11 01:00:13.531114 7f05abfff700  1 -- 192.168.0.250:0/1002795 == mon.0 192.168.0.10:6789/0 5 osd_map(3277..3277 src has 2757..3277) v3 5366+0+0 (3110999244 0 0) 0x7f05a0002800 con 0x7f05ac027c40
2015-03-11 01:00:13.531772 7f05abfff700  1 -- 192.168.0.250:0/1002795 == mon.0 192.168.0.10:6789/0 6 mon_subscribe_ack(300s) v1 20+0+0 (3648139960 0 0) 0x7f05afb0 con 0x7f05ac027c40
2015-03-11 01:00:13.532186 7f05abfff700  1 -- 192.168.0.250:0/1002795 == mon.0 192.168.0.10:6789/0 7 osd_map(3277..3277 src has 2757..3277) v3 5366+0+0 (3110999244 0 0) 0x7f05a0001250 con 0x7f05ac027c40
2015-03-11 01:00:13.532260 7f05abfff700  1 -- 192.168.0.250:0/1002795 == mon.0 192.168.0.10:6789/0 8 mon_subscribe_ack(300s) v1 20+0+0 (3648139960 0 0) 0x7f05afb0 con 0x7f05ac027c40
2015-03-11 01:00:13.556748 7f05b2f72700  1 -- 192.168.0.250:0/1002795 -- 192.168.0.10:6789/0 -- mon_command({prefix: get_command_descriptions} v 0) v1 -- ?+0 0x7f05ac016ac0 con 0x7f05ac027c40
2015-03-11 01:00:13.564968 7f05abfff700  1 -- 192.168.0.250:0/1002795 == mon.0 192.168.0.10:6789/0 9 mon_command_ack([{prefix: get_command_descriptions}]=0 v0) v1 72+0+34995 (1092875540 0 1727986498) 0x7f05aa70 con 0x7f05ac027c40
2015-03-11 01:00:13.770122 7f05b2f72700  1 -- 192.168.0.250:0/1002795 -- 192.168.0.10:6789/0 -- mon_command({prefix: osd crush create-or-move, args: [host=microserver-1, root=default], id: 3, weight: 1.81} v 0) v1 -- ?+0 0x7f05ac016ac0 con 0x7f05ac027c40
2015-03-11 01:00:13.772299 7f05abfff700  1 -- 192.168.0.250:0/1002795 == mon.0 192.168.0.10:6789/0 10 mon_command_ack([{prefix: osd crush create-or-move, args: [host=microserver-1, root=default], id: 3, weight: 1.81}]=0 create-or-move updated item name 'osd.3' weight 1.81 at location {host=microserver-1,root=default} to crush map v3277) v1 256+0+0 (1191546821 0 0) 0x7f05a0001000 con 0x7f05ac027c40
create-or-move updated item name 'osd.3' weight 1.81 at location {host=microserver-1,root=default} to crush map
2015-03-11 01:00:13.776891 7f05b2f72700  1 -- 192.168.0.250:0/1002795 mark_down 0x7f05ac027c40 -- 0x7f05ac0239a0
2015-03-11 01:00:13.777212 7f05b2f72700  1 -- 192.168.0.250:0/1002795 mark_down_all
2015-03-11 01:00:13.778120 7f05b2f72700  1 -- 192.168.0.250:0/1002795 shutdown complete.
Starting Ceph osd.3 on microserver-1...
microserver-1:/etc #

Log file:

2015-03-11 01:00:13.876152 7f41a1ba4880  0 ceph version 0.93 (bebf8e9a830d998eeaab55f86bb256d4360dd3c4), process ceph-osd, pid 2840
2015-03-11 01:00:13.877059 7f41a1ba4880  1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6800/2840 need_addr=1
2015-03-11 01:00:13.877111 7f41a1ba4880  1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6801/2840 need_addr=1
2015-03-11 01:00:13.877140 7f41a1ba4880  1 accepter.accepter.bind my_inst.addr
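The startup output above shows the mon serving osdmap epoch 3277, and elsewhere in the thread crushtool fails to round-trip the crush map. A hedged sketch of checking whether the crush map embedded in that osdmap decompiles and recompiles cleanly (epoch 3277 and the /tmp paths are taken from the surrounding output; this assumes the cluster is reachable):

```shell
# Extract the crush map from the currently served osdmap and round-trip
# it through crushtool.  A failure here would point at the map contents
# rather than at the new node's build.
ceph osd getmap 3277 -o /tmp/osdmap.3277
osdmaptool /tmp/osdmap.3277 --export-crush /tmp/crush.3277
crushtool -d /tmp/crush.3277 -o /tmp/crush.3277.txt    # decompile
crushtool -c /tmp/crush.3277.txt -o /tmp/crush.3277.new  # recompile
```

If the recompile succeeds but `ceph osd setcrushmap` still rejects the result, the problem more likely sits in the stored osdmap history than in the crush text itself.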
Re: [ceph-users] Issues with fresh 0.93 OSD adding to existing cluster
Joao, it looks like map 2759 is causing trouble. How would he get the full and incremental maps for that out of the mons?
-Sam

On Tue, 2015-03-10 at 14:12 +, Malcolm Haak wrote:
> [...]
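Following up on Sam's question, a rough sketch of pulling epoch 2759 out of a stopped monitor with the tools Joao mentioned. The mon store path, the `--version`/`--out` flags, and the `osdmap` prefix/key layout here are assumptions based on later tool versions; the 0.93-era syntax may differ:

```shell
# Stop the monitor first so the store is quiescent.
/etc/init.d/ceph stop mon.0

# Full map for the suspect epoch (store path is illustrative):
ceph-monstore-tool /var/lib/ceph/mon/ceph-0 get osdmap -- --version 2759 --out /tmp/osdmap.2759_full

# Incremental for the same epoch, read straight from the key/value
# store (prefix "osdmap", key "2759" is an assumption):
ceph-kvstore-tool /var/lib/ceph/mon/ceph-0/store.db get osdmap 2759 out /tmp/osdmap.2759_inc

/etc/init.d/ceph start mon.0
```

With those two files in hand, attaching them to the thread (as Joao asked) would let the developers decode the exact map the OSD is choking on.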