Thanks Greg.

I quit playing with it because every time I restarted the cluster (service ceph -a restart), I lost more OSDs. The first time it was 1, the second time 10, the third time 13... All 13 down OSDs show the same stack trace.
- Travis

On Mon, Apr 29, 2013 at 11:56 AM, Gregory Farnum <[email protected]> wrote:

> This sounds vaguely familiar to me, and I see
> http://tracker.ceph.com/issues/4052, which is marked as "Can't
> reproduce" -- I think maybe this is fixed in "next" and "master", but
> I'm not sure. For more than that I'd have to defer to Sage or Sam.
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
> On Sat, Apr 27, 2013 at 6:43 PM, Travis Rhoden <[email protected]> wrote:
> > Hey folks,
> >
> > I'm helping put together a new test/experimental cluster, and hit this
> > today when bringing the cluster up for the first time (using mkcephfs).
> >
> > After doing the normal "service ceph -a start", I noticed one OSD was
> > down, and a lot of PGs were stuck creating. I tried restarting the down
> > OSD, but it wouldn't come up. It always had this error:
> >
> > -1> 2013-04-27 18:11:56.179804 b6fcd000 2 osd.1 0 boot
> > 0> 2013-04-27 18:11:56.402161 b6fcd000 -1 osd/PG.cc: In function
> > 'static epoch_t PG::peek_map_epoch(ObjectStore*, coll_t, hobject_t&,
> > ceph::bufferlist*)' thread b6fcd000 time 2013-04-27 18:11:56.399089
> > osd/PG.cc: 2556: FAILED assert(values.size() == 1)
> >
> > ceph version 0.60-401-g17a3859 (17a38593d60f5f29b9b66c13c0aaa759762c6d04)
> > 1: (PG::peek_map_epoch(ObjectStore*, coll_t, hobject_t&,
> > ceph::buffer::list*)+0x1ad) [0x2c3c0a]
> > 2: (OSD::load_pgs()+0x357) [0x28cba0]
> > 3: (OSD::init()+0x741) [0x290a16]
> > 4: (main()+0x1427) [0x2155c0]
> > 5: (__libc_start_main()+0x99) [0xb69bcf42]
> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> > needed to interpret this.
> >
> >
> > I then did a full cluster restart, and now I have ten OSDs down -- each
> > showing the same exception/failed assert.
> >
> > Anybody seen this?
> >
> > I know I'm running a weird version -- it's compiled from source, and was
> > provided to me. The OSDs are all on ARM, and the mon is x86_64. Just
> > looking to see if anyone has seen this particular stack trace of
> > load_pgs()/peek_map_epoch() before....
> >
> > - Travis
> >
> > _______________________________________________
> > ceph-users mailing list
> > [email protected]
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
