Hmm. It seems that the cache pool quotas have not been set; at least I'm
sure I didn't set them. Maybe they have default values.

# ceph osd pool get-quota cache
quotas for pool 'cache':
  max objects: N/A
  max bytes  : N/A

But I did set target_max_bytes:

# ceph osd pool set cache target_max_bytes 1000000000000

Could that be the reason?
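
For completeness, here is how I plan to double-check the cache-tier settings
and, per the workaround quoted below, explicitly unset the quotas on the
'cache' pool (as I understand it, setting max_bytes/max_objects to 0 disables
the quota; the target_max_objects check is only there in case it happens to
be set):

# ceph osd pool get cache target_max_bytes
# ceph osd pool get cache target_max_objects
# ceph osd pool set-quota cache max_bytes 0
# ceph osd pool set-quota cache max_objects 0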

On Wed, Feb 24, 2016 at 4:08 PM, Alexey Sheplyakov <asheplya...@mirantis.com
> wrote:

> Hi,
>
> > 0> 2016-02-24 04:51:45.884445 7fd994825700 -1 osd/ReplicatedPG.cc: In
> function 'int ReplicatedPG::fill_in_copy_get(ReplicatedPG::OpContext*,
> ceph::buffer::list::iterator&, OSDOp&, ObjectContextRef&, bool)' thread
> 7fd994825700 time 2016-02-24 04:51:45.870995
> osd/ReplicatedPG.cc: 5558: FAILED assert(cursor.data_complete)
> > ceph version 0.80.11-8-g95c4287
> (95c4287b5d24b762bc8538633c5bb2918ecfe4dd)
>
> This one looks familiar: http://tracker.ceph.com/issues/13098
>
> A quick work around is to unset the cache pool quota:
>
> ceph osd pool set-quota $cache_pool_name max_bytes 0
> ceph osd pool set-quota $cache_pool_name max_objects 0
>
> The problem has been properly fixed in infernalis v9.1.0, and
> (partially) in hammer (v0.94.6 which will be released soon).
>
>  Best regards,
>       Alexey
>
>
> On Wed, Feb 24, 2016 at 5:37 AM, Alexander Gubanov <sht...@gmail.com>
> wrote:
> > Hi,
> >
> > Every time, 2 of 18 OSDs crash. I think it happens when PG replication
> > runs, because only 2 OSDs crash and they are the same ones every time.
> >
> > 0> 2016-02-24 04:51:45.884445 7fd994825700 -1 osd/ReplicatedPG.cc: In
> > function 'int ReplicatedPG::fill_in_copy_get(ReplicatedPG::OpContext*,
> > ceph::buffer::list::iterator&, OSDOp&, ObjectContextRef&, bool)' thread
> > 7fd994825700 time 2016-02-24 04:51:45.870995
> > osd/ReplicatedPG.cc: 5558: FAILED assert(cursor.data_complete)
> >
> >  ceph version 0.80.11-8-g95c4287
> (95c4287b5d24b762bc8538633c5bb2918ecfe4dd)
> >  1: (ReplicatedPG::fill_in_copy_get(ReplicatedPG::OpContext*,
> > ceph::buffer::list::iterator&, OSDOp&,
> std::tr1::shared_ptr<ObjectContext>&,
> > bool)+0xffc) [0x7c1f7c]
> >  2: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*,
> std::vector<OSDOp,
> > std::allocator<OSDOp> >&)+0x4171) [0x809f21]
> >  3: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x62)
> > [0x814622]
> >  4: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x5f8)
> [0x815098]
> >  5: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x3dd4)
> [0x81a3f4]
> >  6: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
> > ThreadPool::TPHandle&)+0x66d) [0x7b4ecd]
> >  7: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
> > std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3a5) [0x600ee5]
> >  8: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>,
> > ThreadPool::TPHandle&)+0x203) [0x61cba3]
> >  9: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>,
> > std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG>
> >>::_void_process(void*, ThreadPool::TPHandle&)+0xac) [0x660f2c]
> >  10: (ThreadPool::worker(ThreadPool::WorkThread*)+0xb20) [0xa7def0]
> >  11: (ThreadPool::WorkThread::entry()+0x10) [0xa7ede0]
> >  12: (()+0x7dc5) [0x7fd9ad03edc5]
> >  13: (clone()+0x6d) [0x7fd9abd2828d]
> >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to
> > interpret this.
> >
> > --- logging levels ---
> >    0/ 5 none
> >    0/ 1 lockdep
> >    0/ 1 context
> >    1/ 1 crush
> >    1/ 5 mds
> >    1/ 5 mds_balancer
> >    1/ 5 mds_locker
> >    1/ 5 mds_log
> >    1/ 5 mds_log_expire
> >    1/ 5 mds_migrator
> >    0/ 1 buffer
> >    0/ 1 timer
> >    0/ 1 filer
> >    0/ 1 striper
> >    0/ 1 objecter
> >    0/ 5 rados
> >    0/ 5 rbd
> >    0/ 5 journaler
> >    0/ 5 objectcacher
> >    0/ 5 client
> >    0/ 5 osd
> >    0/ 5 optracker
> >    0/ 5 objclass
> >    1/ 3 filestore
> >    1/ 3 keyvaluestore
> >    1/ 3 journal
> >    0/ 5 ms
> >    1/ 5 mon
> >    0/10 monc
> >    1/ 5 paxos
> >    0/ 5 tp
> >    1/ 5 auth
> >    1/ 5 crypto
> >    1/ 1 finisher
> >    1/ 5 heartbeatmap
> >    1/ 5 perfcounter
> >    1/ 5 rgw
> >    1/10 civetweb
> >    1/ 5 javaclient
> >    1/ 5 asok
> >    1/ 1 throttle
> >   -2/-2 (syslog threshold)
> >   -1/-1 (stderr threshold)
> >   max_recent     10000
> >   max_new         1000
> >   log_file /var/log/ceph/ceph-osd.3.log
> > --- end dump of recent events ---
> > 2016-02-24 04:51:45.944447 7fd994825700 -1 *** Caught signal (Aborted) **
> >  in thread 7fd994825700
> >
> >  ceph version 0.80.11-8-g95c4287
> (95c4287b5d24b762bc8538633c5bb2918ecfe4dd)
> >  1: /usr/bin/ceph-osd() [0x9a24f6]
> >  2: (()+0xf100) [0x7fd9ad046100]
> >  3: (gsignal()+0x37) [0x7fd9abc675f7]
> >  4: (abort()+0x148) [0x7fd9abc68ce8]
> >  5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7fd9ac56b9d5]
> >  6: (()+0x5e946) [0x7fd9ac569946]
> >  7: (()+0x5e973) [0x7fd9ac569973]
> >  8: (()+0x5eb93) [0x7fd9ac569b93]
> >  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> > const*)+0x1ef) [0xa8d9df]
> >  10: (ReplicatedPG::fill_in_copy_get(ReplicatedPG::OpContext*,
> > ceph::buffer::list::iterator&, OSDOp&,
> std::tr1::shared_ptr<ObjectContext>&,
> > bool)+0xffc) [0x7c1f7c]
> >  11: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*,
> std::vector<OSDOp,
> > std::allocator<OSDOp> >&)+0x4171) [0x809f21]
> >  12: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x62)
> > [0x814622]
> >  13: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x5f8)
> [0x815098]
> >  14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x3dd4)
> > [0x81a3f4]
> >  15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
> > ThreadPool::TPHandle&)+0x66d) [0x7b4ecd]
> >  16: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
> > std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3a5) [0x600ee5]
> >  17: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>,
> > ThreadPool::TPHandle&)+0x203) [0x61cba3]
> >  18: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>,
> > std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG>
> >>::_void_process(void*, ThreadPool::TPHandle&)+0xac) [0x660f2c]
> >  19: (ThreadPool::worker(ThreadPool::WorkThread*)+0xb20) [0xa7def0]
> >  20: (ThreadPool::WorkThread::entry()+0x10) [0xa7ede0]
> >  21: (()+0x7dc5) [0x7fd9ad03edc5]
> >  22: (clone()+0x6d) [0x7fd9abd2828d]
> >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to
> > interpret this.
> >
> > --- begin dump of recent events ---
> >     -5> 2016-02-24 04:51:45.904559 7fd995026700  5 -- op tracker -- ,
> seq:
> > 19230, time: 2016-02-24 04:51:45.904559, event: started, request:
> > osd_op(osd.13.12097:806246 rb.0.218d6.238e1f29.000000010db3@snapdir
> > [list-snaps] 3.94c2bed2
> ack+read+ignore_cache+ignore_overlay+map_snap_clone
> > e13252) v4
> >     -4> 2016-02-24 04:51:45.904598 7fd995026700  1 --
> 172.16.0.1:6801/419703
> > --> 172.16.0.3:6844/12260 -- osd_op_reply(806246
> > rb.0.218d6.238e1f29.000000010db3 [list-snaps] v0'0 uv27683057 ondisk =
> 0) v6
> > -- ?+0 0x9f90800 con 0x1b7838c0
> >     -3> 2016-02-24 04:51:45.904616 7fd995026700  5 -- op tracker -- ,
> seq:
> > 19230, time: 2016-02-24 04:51:45.904616, event: done, request:
> > osd_op(osd.13.12097:806246 rb.0.218d6.238e1f29.000000010db3@snapdir
> > [list-snaps] 3.94c2bed2
> ack+read+ignore_cache+ignore_overlay+map_snap_clone
> > e13252) v4
> >     -2> 2016-02-24 04:51:45.904637 7fd995026700  5 -- op tracker -- ,
> seq:
> > 19231, time: 2016-02-24 04:51:45.904637, event: reached_pg, request:
> > osd_op(osd.13.12097:806247 rb.0.218d6.238e1f29.000000010db3 [copy-get max
> > 8388608] 3.94c2bed2 ack+read+ignore_cache+ignore_overlay+map_snap_clone
> > e13252) v4
> >     -1> 2016-02-24 04:51:45.904673 7fd995026700  5 -- op tracker -- ,
> seq:
> > 19231, time: 2016-02-24 04:51:45.904673, event: started, request:
> > osd_op(osd.13.12097:806247 rb.0.218d6.238e1f29.000000010db3 [copy-get max
> > 8388608] 3.94c2bed2 ack+read+ignore_cache+ignore_overlay+map_snap_clone
> > e13252) v4
> >      0> 2016-02-24 04:51:45.944447 7fd994825700 -1 *** Caught signal
> > (Aborted) **
> >  in thread 7fd994825700
> >
> >  ceph version 0.80.11-8-g95c4287
> (95c4287b5d24b762bc8538633c5bb2918ecfe4dd)
> >  1: /usr/bin/ceph-osd() [0x9a24f6]
> >  2: (()+0xf100) [0x7fd9ad046100]
> >  3: (gsignal()+0x37) [0x7fd9abc675f7]
> >  4: (abort()+0x148) [0x7fd9abc68ce8]
> >  5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7fd9ac56b9d5]
> >  6: (()+0x5e946) [0x7fd9ac569946]
> >  7: (()+0x5e973) [0x7fd9ac569973]
> >  8: (()+0x5eb93) [0x7fd9ac569b93]
> >  9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> > const*)+0x1ef) [0xa8d9df]
> >  10: (ReplicatedPG::fill_in_copy_get(ReplicatedPG::OpContext*,
> > ceph::buffer::list::iterator&, OSDOp&,
> std::tr1::shared_ptr<ObjectContext>&,
> > bool)+0xffc) [0x7c1f7c]
> >  11: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*,
> std::vector<OSDOp,
> > std::allocator<OSDOp> >&)+0x4171) [0x809f21]
> >  12: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x62)
> > [0x814622]
> >  13: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x5f8)
> [0x815098]
> >  14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x3dd4)
> > [0x81a3f4]
> >  15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>,
> > ThreadPool::TPHandle&)+0x66d) [0x7b4ecd]
> >  16: (OSD::dequeue_op(boost::intrusive_ptr<PG>,
> > std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3a5) [0x600ee5]
> >  17: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>,
> > ThreadPool::TPHandle&)+0x203) [0x61cba3]
> >  18: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>,
> > std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG>
> >>::_void_process(void*, ThreadPool::TPHandle&)+0xac) [0x660f2c]
> >  19: (ThreadPool::worker(ThreadPool::WorkThread*)+0xb20) [0xa7def0]
> >  20: (ThreadPool::WorkThread::entry()+0x10) [0xa7ede0]
> >  21: (()+0x7dc5) [0x7fd9ad03edc5]
> >  22: (clone()+0x6d) [0x7fd9abd2828d]
> >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to
> > interpret this.
> >
> > --- logging levels ---
> >    0/ 5 none
> >    0/ 1 lockdep
> >    0/ 1 context
> >    1/ 1 crush
> >    1/ 5 mds
> >    1/ 5 mds_balancer
> >    1/ 5 mds_locker
> >    1/ 5 mds_log
> >    1/ 5 mds_log_expire
> >    1/ 5 mds_migrator
> >    0/ 1 buffer
> >    0/ 1 timer
> >    0/ 1 filer
> >    0/ 1 striper
> >    0/ 1 objecter
> >    0/ 5 rados
> >    0/ 5 rbd
> >    0/ 5 journaler
> >    0/ 5 objectcacher
> >    0/ 5 client
> >    0/ 5 osd
> >    0/ 5 optracker
> >    0/ 5 objclass
> >    1/ 3 filestore
> >    1/ 3 keyvaluestore
> >    1/ 3 journal
> >    0/ 5 ms
> >    1/ 5 mon
> >    0/10 monc
> >    1/ 5 paxos
> >    0/ 5 tp
> >    1/ 5 auth
> >    1/ 5 crypto
> >    1/ 1 finisher
> >    1/ 5 heartbeatmap
> >    1/ 5 perfcounter
> >    1/ 5 rgw
> >    1/10 civetweb
> >    1/ 5 javaclient
> >    1/ 5 asok
> >    1/ 1 throttle
> >   -2/-2 (syslog threshold)
> >   -1/-1 (stderr threshold)
> >   max_recent     10000
> >   max_new         1000
> >   log_file /var/log/ceph/ceph-osd.3.log
> > --- end dump of recent events ---
> >
> > --
> > Alexander Gubanov
> >
>



-- 
Alexander Gubanov
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
