Hi Mark,

This should be fixed by d55399ffec224206ea324e83bb8ead1e9ca1eddc in the 
'next' branch of ceph.git.  Can you test it out and see if that allows 
journal replay to complete?

Thanks!
sage

http://tracker.newdream.net/issues/1019
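For context on the assertion in the trace quoted below: elist is Ceph's intrusive, circularly linked list, and its item destructor requires the element to already be unlinked. A minimal sketch of that pattern (simplified and illustrative only; the names follow elist.h, but this is not the actual Ceph code):

```cpp
#include <cassert>

// Simplified sketch of an intrusive-list item in the style of
// Ceph's elist<T>::item.  An unlinked item points at itself.
struct item {
    item *_prev;
    item *_next;

    item() : _prev(this), _next(this) {}

    bool empty() const { return _prev == this; }
    bool is_on_list() const { return !empty(); }

    // Illustrative helper: splice 'other' into the list after this node.
    void insert_after(item *other) {
        other->_prev = this;
        other->_next = _next;
        _next->_prev = other;
        _next = other;
    }

    // Unlink this node and return it to the self-linked (empty) state.
    void remove_myself() {
        _next->_prev = _prev;
        _prev->_next = _next;
        _prev = _next = this;
    }

    // The destructor insists the element was unlinked first -- this is
    // the check that shows up in the trace as
    // "./include/elist.h: 39: FAILED assert(!is_on_list())".
    ~item() { assert(!is_on_list()); }
};
```

The trace shows this firing from ~MDSlaveUpdate() during ESlaveUpdate::replay(), i.e. an MDSlaveUpdate being destroyed while its item was still linked, which is what the referenced fix addresses.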



On Tue, 19 Apr 2011, Mark Nigh wrote:

> I have recently been working on exporting ceph over NFS. I have had stability 
> problems with NFS (ceph works, but NFS crashes), and most recently my 
> mds0 will not start after one of these NFS crashes.
> 
> My setup: 2 mds, 1 mon (located on mds0), and 5 osds, all running Ubuntu 10.10.
> 
> Here is the output when I try to start the mds0. Is there other debugging I 
> can turn on?
> 
> /etc/init.d/ceph start mds0
> 
> 2011-04-19 10:06:58.602640 7fb202fe4700 mds0.11 ms_handle_connect on 
> 10.6.1.93:6800/945
> ./include/elist.h: In function 'elist<T>::item::~item() [with T = 
> MDSlaveUpdate*]', in thread '0x7fb2004d5700'
> ./include/elist.h: 39: FAILED assert(!is_on_list())
>  ceph version 0.26 (commit:9981ff90968398da43c63106694d661f5e3d07d5)
>  1: (MDSlaveUpdate::~MDSlaveUpdate()+0x59) [0x4d9fe9]
>  2: (ESlaveUpdate::replay(MDS*)+0x422) [0x4d2772]
>  3: (MDLog::_replay_thread()+0xb90) [0x67f850]
>  4: (MDLog::ReplayThread::entry()+0xd) [0x4b89ed]
>  5: (()+0x7971) [0x7fb20564a971]
>  6: (clone()+0x6d) [0x7fb2042e692d]
>  ceph version 0.26 (commit:9981ff90968398da43c63106694d661f5e3d07d5)
>  1: (MDSlaveUpdate::~MDSlaveUpdate()+0x59) [0x4d9fe9]
>  2: (ESlaveUpdate::replay(MDS*)+0x422) [0x4d2772]
>  3: (MDLog::_replay_thread()+0xb90) [0x67f850]
>  4: (MDLog::ReplayThread::entry()+0xd) [0x4b89ed]
>  5: (()+0x7971) [0x7fb20564a971]
>  6: (clone()+0x6d) [0x7fb2042e692d]
> *** Caught signal (Aborted) **
>  in thread 0x7fb2004d5700
>  ceph version 0.26 (commit:9981ff90968398da43c63106694d661f5e3d07d5)
>  1: /usr/bin/cmds() [0x70fc38]
>  2: (()+0xfb40) [0x7fb205652b40]
>  3: (gsignal()+0x35) [0x7fb204233ba5]
>  4: (abort()+0x180) [0x7fb2042376b0]
>  5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fb204ad76bd]
>  6: (()+0xb9906) [0x7fb204ad5906]
>  7: (()+0xb9933) [0x7fb204ad5933]
>  8: (()+0xb9a3e) [0x7fb204ad5a3e]
>  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
> const*)+0x36a) [0x6f5eaa]
>  10: (MDSlaveUpdate::~MDSlaveUpdate()+0x59) [0x4d9fe9]
>  11: (ESlaveUpdate::replay(MDS*)+0x422) [0x4d2772]
>  12: (MDLog::_replay_thread()+0xb90) [0x67f850]
>  13: (MDLog::ReplayThread::entry()+0xd) [0x4b89ed]
>  14: (()+0x7971) [0x7fb20564a971]
>  15: (clone()+0x6d) [0x7fb2042e692d]
> 
> I am not sure why the IP address 0.0.0.0 shows up when starting mds0.
> 
> root@mds0:/var/log/ceph# /etc/init.d/ceph start mds0
> === mds.0 ===
> Starting Ceph mds.0 on mds0...
>  ** WARNING: Ceph is still under heavy development, and is only suitable for **
>  **          testing and review.  Do not trust it with important data.       **
> starting mds.0 at 0.0.0.0:6800/2994
> 
> Thanks for your assistance.
> 
> Mark Nigh
> Systems Architect
> [email protected]
>  (p) 314.392.6926
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to [email protected]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 