New issue created - http://tracker.ceph.com/issues/11027  

Regards.

Italo Santos
http://italosantos.com.br/


On Tuesday, March 3, 2015 at 9:23 PM, Loic Dachary wrote:

> Hi Yann,
>  
> That seems related to http://tracker.ceph.com/issues/10536, which appears to be 
> resolved. Could you create a new issue with a link to 10536? More logs and a 
> ceph report would also be useful to figure out why it resurfaced.
>  
> Thanks!
>  
>  
> On 04/03/2015 00:04, Yann Dupont wrote:
> >  
> > On 03/03/2015 22:03, Italo Santos wrote:
> > >  
> > > I realised that when the first OSD went down, the cluster was performing 
> > > a deep-scrub, and I found the below trace in the logs of osd.8. Can anyone 
> > > help me understand why osd.8, and other OSDs, unexpectedly go 
> > > down?
> >  
> > I'm afraid I've seen this too this afternoon on my test cluster, just after 
> > upgrading from 0.87 to 0.93. After an initially successful migration, some OSDs 
> > started to go down: all presented similar stack traces, with the magic word 
> > "scrub" in them:
> >  
> > ceph version 0.93 (bebf8e9a830d998eeaab55f86bb256d4360dd3c4)
> > 1: /usr/bin/ceph-osd() [0xbeb3dc]
> > 2: (()+0xf0a0) [0x7f8f3ca130a0]
> > 3: (gsignal()+0x35) [0x7f8f3b37d165]
> > 4: (abort()+0x180) [0x7f8f3b3803e0]
> > 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f8f3bbd389d]
> > 6: (()+0x63996) [0x7f8f3bbd1996]
> > 7: (()+0x639c3) [0x7f8f3bbd19c3]
> > 8: (()+0x63bee) [0x7f8f3bbd1bee]
> > 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x220) [0xcd74f0]
> > 10: (ReplicatedPG::issue_repop(ReplicatedPG::RepGather*, utime_t)+0x1fc) [0x97259c]
> > 11: (ReplicatedPG::simple_repop_submit(ReplicatedPG::RepGather*)+0x7a) [0x97344a]
> > 12: (ReplicatedPG::_scrub(ScrubMap&, std::map<hobject_t, std::pair<unsigned int, unsigned int>, std::less<hobject_t>, std::allocator<std::pair<hobject_t const, std::pair<unsigned int, unsigned int> > > > const&)+0x2e4d) [0x9a5ded]
> > 13: (PG::scrub_compare_maps()+0x658) [0x916378]
> > 14: (PG::chunky_scrub(ThreadPool::TPHandle&)+0x202) [0x917ee2]
> > 15: (PG::scrub(ThreadPool::TPHandle&)+0x3a3) [0x919f83]
> > 16: (OSD::ScrubWQ::_process(PG*, ThreadPool::TPHandle&)+0x13) [0x7eff93]
> > 17: (ThreadPool::worker(ThreadPool::WorkThread*)+0x629) [0xcc8c49]
> > 18: (ThreadPool::WorkThread::entry()+0x10) [0xccac40]
> > 19: (()+0x6b50) [0x7f8f3ca0ab50]
> > 20: (clone()+0x6d) [0x7f8f3b42695d]
> >  
> > As a temporary measure, noscrub and nodeep-scrub are now set for this 
> > cluster, and all is working fine right now.
> >  
> > So there is probably something wrong here. Need to investigate further.
> >  
> > Cheers,
> >  
> > _______________________________________________
> > ceph-users mailing list
> > [email protected] (mailto:[email protected])
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >  
>  
>  
> --  
> Loïc Dachary, Artisan Logiciel Libre
>  
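For anyone hitting the same crash, the workaround Yann describes can be applied with the stock ceph CLI (`ceph osd set`/`unset` cluster flags; this assumes admin credentials on the node you run it from):

```shell
# Temporarily stop all scrubbing cluster-wide; the flags persist until unset
ceph osd set noscrub
ceph osd set nodeep-scrub

# Once the bug is fixed and the cluster upgraded, re-enable scrubbing
ceph osd unset noscrub
ceph osd unset nodeep-scrub
```

Note that PGs already mid-scrub may finish their current chunk; the flags only prevent new scrubs from being scheduled.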

