Hi Greg,

Thanks for the clarification. I believe the OSD was in the middle of a deep scrub (sorry for not mentioning this straight away), so could it have been a silent error that only surfaced during the scrub?

What's best practice when the store is corrupted like this?

Cheers,
Josef

Gregory Farnum wrote on 2014-06-14 02:21:
The OSD did a read off of the local filesystem and it got back the EIO
error code. That means the store got corrupted or something, so it
killed itself to avoid spreading bad data to the rest of the cluster.
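
To spell out the assert from the log, its condition can be read as a small predicate. This is only a sketch, not the FileStore code: the function name read_may_continue is hypothetical, its parameters simply mirror the three terms in the assert message, and it assumes that -5 is -EIO, as it is on Linux.

```cpp
#include <cerrno>

// Hypothetical helper, not part of Ceph. It mirrors the three terms of
//   assert(allow_eio || !m_filestore_fail_eio || got != -5)
// and returns true when the assert would pass, false when the OSD would abort.
// Assumption: -5 in the log is -EIO (EIO == 5 on Linux).
bool read_may_continue(bool allow_eio, bool filestore_fail_eio, int got) {
    return allow_eio               // the caller said it can tolerate EIO
        || !filestore_fail_eio     // failing on EIO is switched off
        || got != -EIO;            // the read returned something other than EIO
}
```

With the values implied by the backtrace (the caller does not allow EIO, failing on EIO is enabled, and the read returned -5), all three terms are false, so the assert fails and the OSD stops.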
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Fri, Jun 13, 2014 at 5:16 PM, Josef Johansson <jo...@oderland.se> wrote:
Hey,

Just examining what happened to an OSD that was just turned off. Data has
been moved away from it, so I'm hesitant to turn it back on.

Got the below in the logs, any clues to what the assert talks about?

Cheers,
Josef

-1 os/FileStore.cc: In function 'virtual int FileStore::read(coll_t, const
hobject_t&, uint64_t, size_t, ceph::bufferlist&, bool)' thread 7fdacb88c700
time 2014-06-11 21:13:54.036982
os/FileStore.cc: 2992: FAILED assert(allow_eio || !m_filestore_fail_eio ||
got != -5)

  ceph version 0.67.7 (d7ab4244396b57aac8b7e80812115bbd079e6b73)
  1: (FileStore::read(coll_t, hobject_t const&, unsigned long, unsigned long,
ceph::buffer::list&, bool)+0x653) [0x8ab6c3]
  2: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp,
std::allocator<OSDOp> >&)+0x350) [0x708230]
  3: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x86)
[0x713366]
  4: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x3095) [0x71acb5]
  5: (PG::do_request(std::tr1::shared_ptr<OpRequest>,
ThreadPool::TPHandle&)+0x3f0) [0x812340]
  6: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x2ea) [0x75c80a]
  7: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>,
ThreadPool::TPHandle&)+0x198) [0x770da8]
  8: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>,
std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG>
::_void_process(void*, ThreadPool::TPHandle&)+0xae) [0x7a89ce]
  9: (ThreadPool::worker(ThreadPool::WorkThread*)+0x68a) [0x9b5dea]
  10: (ThreadPool::WorkThread::entry()+0x10) [0x9b7040]
  11: (()+0x6b50) [0x7fdadffdfb50]
  12: (clone()+0x6d) [0x7fdade53b0ed]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
