Re: [ceph-users] Fwd: Hammer OSD memory increase when add new machine
Thanks. From the CERN 30PB cluster test, it seems the osdmap cache causes the memory increase; I'll test how these configs (osd_map_cache_size, osd_map_max_advance, etc.) influence the memory usage.

2016-11-08 22:48 GMT+08:00 zphj1987 <zphj1...@gmail.com>:
> I remember CERN had a 30PB test ceph cluster and the osds used more memory
> than usual, and they tuned osdmap epochs. If it is the osdmap that makes
> them use more memory, I think you could run a test with fewer osdmap
> epochs to see if anything changes.
>
> default mon_min_osdmap_epochs is 500
>
> zphj1987
>
> 2016-11-08 22:08 GMT+08:00 Sage Weil <s...@newdream.net>:
>>
>> > -- Forwarded message --
>> > From: Dong Wu <archer.wud...@gmail.com>
>> > Date: 2016-10-27 18:50 GMT+08:00
>> > Subject: Re: [ceph-users] Hammer OSD memory increase when add new machine
>> > To: huang jun <hjwsm1...@gmail.com>
>> > Cc: ceph-users <ceph-users@lists.ceph.com>
>> >
>> > 2016-10-27 17:50 GMT+08:00 huang jun <hjwsm1...@gmail.com>:
>> > > how do you add the new machine?
>> > > is it first added to the default ruleset and then you add the new
>> > > rule for this group?
>> > > do you have data pools using the default rule, and do these pools
>> > > contain data?
>> >
>> > we don't use the default ruleset; when we add a new group of machines,
>> > crush_location auto-generates root and chassis, then we add a new rule
>> > for this group.
>> >
>> > > 2016-10-27 17:34 GMT+08:00 Dong Wu <archer.wud...@gmail.com>:
>> > >> Hi all,
>> > >>
>> > >> We have a ceph cluster that only uses rbd. The cluster contains
>> > >> several groups of machines; each group contains several machines,
>> > >> and each machine has 12 SSDs, each SSD as an OSD (journal and data
>> > >> together). e.g.:
>> > >> group1: machine1~machine12
>> > >> group2: machine13~machine24
>> > >> ..
>> > >> each group is separated from the other groups, which means each
>> > >> group has separate pools.
>> > >>
>> > >> we use Hammer (0.94.6) compiled with jemalloc (4.2).
>> > >>
>> > >> We have found that when we add a new group of machines, the other
>> > >> groups' machines' memory increases by roughly 5% (OSD usage).
>> > >>
>> > >> each group's data is separated from the others', so backfill
>> > >> happens only within a group, not across groups.
>> > >> Why does adding a group of machines cause the others' memory to
>> > >> increase? Is this reasonable?
>>
>> It could be cached OSDMaps (they get slightly larger when you add OSDs),
>> but it's hard to say. It seems more likely that the pools and crush rules
>> aren't configured right and you're adding OSDs to the wrong group.
>>
>> If you look at the 'ceph daemon osd.NNN perf dump' output you can see,
>> among other things, how many PGs are on the OSD. Can you capture the
>> output before and after the change (and 5% memory footprint increase)?
>>
>> sage

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
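Sage's suggestion above, capturing 'ceph daemon osd.NNN perf dump' before and after the change, is easy to script. A minimal sketch (Python; it only assumes the dump is JSON with named sections of numeric counters, and the sample counter names below are illustrative, not a claim about the exact Hammer schema):

```python
def diff_perf(before, after, min_delta=0):
    """Compare two 'ceph daemon osd.N perf dump' JSON dicts and report
    numeric counters whose value changed by more than min_delta."""
    changes = {}
    for section, counters in after.items():
        for name, val in counters.items():
            if not isinstance(val, (int, float)):
                continue  # skip nested structs (latency avgcount/sum etc.)
            old = before.get(section, {}).get(name)
            if isinstance(old, (int, float)) and abs(val - old) > min_delta:
                changes["%s.%s" % (section, name)] = (old, val)
    return changes

# Made-up numbers: PG count stayed flat but a map-cache counter grew,
# which would point at cached OSDMaps rather than misplaced PGs.
before = {"osd": {"numpg": 120, "map_cache_size": 500}}
after  = {"osd": {"numpg": 120, "map_cache_size": 900}}
print(diff_perf(before, after))  # -> {'osd.map_cache_size': (500, 900)}
```

If numpg jumps at the same time, that would instead support Sage's "OSDs added to the wrong group" theory.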
[ceph-users] Fwd: Hammer OSD memory increase when add new machine
Any suggestions? Thanks.

-- Forwarded message --
From: Dong Wu <archer.wud...@gmail.com>
Date: 2016-10-27 18:50 GMT+08:00
Subject: Re: [ceph-users] Hammer OSD memory increase when add new machine
To: huang jun <hjwsm1...@gmail.com>
Cc: ceph-users <ceph-users@lists.ceph.com>

2016-10-27 17:50 GMT+08:00 huang jun <hjwsm1...@gmail.com>:
> how do you add the new machine?
> is it first added to the default ruleset and then you add the new rule
> for this group?
> do you have data pools using the default rule, and do these pools
> contain data?

we don't use the default ruleset; when we add a new group of machines,
crush_location auto-generates root and chassis, then we add a new rule
for this group.

> 2016-10-27 17:34 GMT+08:00 Dong Wu <archer.wud...@gmail.com>:
>> Hi all,
>>
>> We have a ceph cluster that only uses rbd. The cluster contains several
>> groups of machines; each group contains several machines, and each
>> machine has 12 SSDs, each SSD as an OSD (journal and data together).
>> e.g.:
>> group1: machine1~machine12
>> group2: machine13~machine24
>> ..
>> each group is separated from the other groups, which means each group
>> has separate pools.
>>
>> we use Hammer (0.94.6) compiled with jemalloc (4.2).
>>
>> We have found that when we add a new group of machines, the other
>> groups' machines' memory increases by roughly 5% (OSD usage).
>>
>> each group's data is separated from the others', so backfill happens
>> only within a group, not across groups.
>> Why does adding a group of machines cause the others' memory to
>> increase? Is this reasonable?
>
> --
> Thank you!
> HuangJun
[ceph-users] Hammer OSD memory increase when add new machine
Hi all,

We have a ceph cluster that only uses rbd. The cluster contains several
groups of machines; each group contains several machines, and each machine
has 12 SSDs, each SSD as an OSD (journal and data together). e.g.:
group1: machine1~machine12
group2: machine13~machine24
..
Each group is separated from the other groups, which means each group has
separate pools.

We use Hammer (0.94.6) compiled with jemalloc (4.2).

We have found that when we add a new group of machines, the other groups'
machines' memory increases by roughly 5% (OSD usage).

Each group's data is separated from the others', so backfill happens only
within a group, not across groups.
Why does adding a group of machines cause the others' memory to increase?
Is this reasonable?
[ceph-users] lsof ceph-osd find many "can't identify protocol"
Hi, cephers. Running lsof on my system, I find a lot of "can't identify
protocol" entries. Does this mean socket descriptors are leaking?

ceph-osd 5389 root 112u sock 0,7 0t0 295880018 can't identify protocol
ceph-osd 5389 root 136u sock 0,7 0t0 295572256 can't identify protocol
ceph-osd 5389 root 176u sock 0,7 0t0 292738022 can't identify protocol
ceph-osd 5389 root 240u sock 0,7 0t0 297919149 can't identify protocol
ceph-osd 5389 root 301u sock 0,7 0t0 313075907 can't identify protocol
ceph-osd 5389 root 351u sock 0,7 0t0 295314260 can't identify protocol
ceph-osd 5389 root 617u sock 0,7 0t0 296221898 can't identify protocol
ceph-osd 5389 root 657u sock 0,7 0t0 313075919 can't identify protocol
ceph-osd 5389 root 714u sock 0,7 0t0 295881042 can't identify protocol
ceph-osd 5389 root 743u sock 0,7 0t0 295904170 can't identify protocol
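A handful of such entries can be transient (sockets mid-teardown); the leak worry is when the count grows monotonically. To check, one can count them per process across repeated lsof runs. A small sketch (Python; field positions assumed to match the paste above):

```python
from collections import Counter

def count_unidentified(lsof_lines):
    """Count 'can't identify protocol' socket entries per (command, pid)
    from lsof output lines."""
    counts = Counter()
    for line in lsof_lines:
        if "can't identify protocol" in line:
            fields = line.split()
            counts[(fields[0], fields[1])] += 1  # COMMAND, PID columns
    return counts

sample = [
    "ceph-osd 5389 root 112u sock 0,7 0t0 295880018 can't identify protocol",
    "ceph-osd 5389 root 136u sock 0,7 0t0 295572256 can't identify protocol",
]
print(count_unidentified(sample))  # -> Counter({('ceph-osd', '5389'): 2})
```

Run it against `lsof -p 5389` output a few minutes apart: a steadily rising count for the same pid is the leak signature, a stable count is probably harmless churn.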
[ceph-users] after upgrade from 0.80.11 to 0.94.6, rbd cmd core dump
Hi all,

I upgraded my cluster from 0.80.11 to 0.94.6. Everything is OK except that
the rbd command core dumps on one host and succeeds on the others. I have
disabled auth in ceph.conf:

auth_cluster_required = none
auth_service_required = none
auth_client_required = none

Here is the core message:

$ sudo rbd ls
2016-03-25 16:00:43.043000 7f3ae6c13780 1 -- :/0 messenger.start
2016-03-25 16:00:43.043329 7f3ae6c13780 1 -- :/1008171 --> 10.180.0.46:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- ?+0 0x434a330 con 0x4349fc0
2016-03-25 16:00:43.043377 7f3ae6c13780 0 -- :/1008171 submit_message auth(proto 0 30 bytes epoch 0) v1
0000 : 00 00 00 00 00 00 00 00 ff ff 00 00 00 00 00 00 :
0010 : 00 00 00 00 00 00 1e 00 00 00 01 01 00 00 00 01 :
0020 : 00 00 00 08 00 00 00 05 00 00 00 61 64 6d 69 6e : ...admin
0030 : 00 00 00 00 00 00 00 00 00 00 00 00 :
2016-03-25 16:00:43.043450 7f3adb7fe700 1 monclient(hunting): continuing hunt
2016-03-25 16:00:43.043489 7f3adb7fe700 1 -- :/1008171 mark_down 0x4349fc0 -- 0x4349d30
2016-03-25 16:00:43.043614 7f3adb7fe700 1 -- :/1008171 --> 10.180.0.31:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- ?+0 0x7f39cc001060 con 0x7f39cc000cf0
2016-03-25 16:00:43.043648 7f3adb7fe700 0 -- :/1008171 submit_message auth(proto 0 30 bytes epoch 0) v1
0000 : 00 00 00 00 00 00 00 00 ff ff 00 00 00 00 00 00 :
0010 : 00 00 00 00 00 00 1e 00 00 00 01 01 00 00 00 01 :
0020 : 00 00 00 08 00 00 00 05 00 00 00 61 64 6d 69 6e : ...admin
0030 : 00 00 00 00 00 00 00 00 00 00 00 00 :
2016-03-25 16:00:43.043694 7f3ae6c13780 0 monclient(hunting): authenticate timed out after 2.47033e-321
*** Caught signal (Segmentation fault) **
in thread 7f3adbfff700
2016-03-25 16:00:43.043756 7f3adb7fe700 1 monclient(hunting): continuing hunt
2016-03-25 16:00:43.043749 7f3ae6c13780 0 librados: client.admin authentication error (110) Connection timed out
ceph version 0.94.6-2-gbb98b8f (bb98b8fcb0bb0bd3688310f6a1688736ef422b25)
1: rbd() [0x60408c]
2: (()+0xf8d0) [0x7f3ae4ea88d0]
3: rbd() [0x52b841]
4: (Mutex::~Mutex()+0x9b) [0x562a6b]
5: (Connection::~Connection()+0x6e) [0x7f3ae5550fce]
6: (Connection::~Connection()+0x9) [0x7f3ae5551049]
7: (Pipe::~Pipe()+0x90) [0x7f3ae553f330]
8: (Pipe::~Pipe()+0x9) [0x7f3ae553f4e9]
9: (SimpleMessenger::reaper()+0x8a9) [0x7f3aebf9]
10: (SimpleMessenger::reaper_entry()+0x88) [0x7f3ae5556b38]
11: (SimpleMessenger::ReaperThread::entry()+0xd) [0x7f3ae555ba8d]
12: (()+0x80a4) [0x7f3ae4ea10a4]
13: (clone()+0x6d) [0x7f3ae3a2d04d]
2016-03-25 16:00:43.045278 7f3adbfff700 -1 *** Caught signal (Segmentation fault) **
in thread 7f3adbfff700
ceph version 0.94.6-2-gbb98b8f (bb98b8fcb0bb0bd3688310f6a1688736ef422b25)
1: rbd() [0x60408c]
2: (()+0xf8d0) [0x7f3ae4ea88d0]
3: rbd() [0x52b841]
4: (Mutex::~Mutex()+0x9b) [0x562a6b]
5: (Connection::~Connection()+0x6e) [0x7f3ae5550fce]
6: (Connection::~Connection()+0x9) [0x7f3ae5551049]
7: (Pipe::~Pipe()+0x90) [0x7f3ae553f330]
8: (Pipe::~Pipe()+0x9) [0x7f3ae553f4e9]
9: (SimpleMessenger::reaper()+0x8a9) [0x7f3aebf9]
10: (SimpleMessenger::reaper_entry()+0x88) [0x7f3ae5556b38]
11: (SimpleMessenger::ReaperThread::entry()+0xd) [0x7f3ae555ba8d]
12: (()+0x80a4) [0x7f3ae4ea10a4]
13: (clone()+0x6d) [0x7f3ae3a2d04d]
NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.

--- begin dump of recent events ---
-39> 2016-03-25 16:00:43.036565 7f3ae6c13780 5 asok(0x42f1830) register_command perfcounters_dump hook 0x42f5000
-38> 2016-03-25 16:00:43.036596 7f3ae6c13780 5 asok(0x42f1830) register_command 1 hook 0x42f5000
-37> 2016-03-25 16:00:43.036608 7f3ae6c13780 5 asok(0x42f1830) register_command perf dump hook 0x42f5000
-36> 2016-03-25 16:00:43.036621 7f3ae6c13780 5 asok(0x42f1830) register_command perfcounters_schema hook 0x42f5000
-35> 2016-03-25 16:00:43.036630 7f3ae6c13780 5 asok(0x42f1830) register_command 2 hook 0x42f5000
-34> 2016-03-25 16:00:43.036634 7f3ae6c13780 5 asok(0x42f1830) register_command perf schema hook 0x42f5000
-33> 2016-03-25 16:00:43.036639 7f3ae6c13780 5 asok(0x42f1830) register_command perf reset hook 0x42f5000
-32> 2016-03-25 16:00:43.036643 7f3ae6c13780 5 asok(0x42f1830) register_command config show hook 0x42f5000
-31> 2016-03-25 16:00:43.036651 7f3ae6c13780 5 asok(0x42f1830) register_command config set hook 0x42f5000
-30> 2016-03-25 16:00:43.036654 7f3ae6c13780 5 asok(0x42f1830) register_command config get hook 0x42f5000
-29> 2016-03-25 16:00:43.036659 7f3ae6c13780 5 asok(0x42f1830) register_command config diff hook 0x42f5000
-28> 2016-03-25 16:00:43.036662 7f3ae6c13780 5 asok(0x42f1830) register_command log flush hook 0x42f5000
-27> 2016-03-25 16:00:43.036667 7f3ae6c13780 5 asok(0x42f1830) register_command log dump hook 0x42f5000
-26> 2016-03-25 16:00:43.036670 7f3ae6c13780 5 asok(0x42f1830)
Re: [ceph-users] why not add (offset,len) to pglog
Based on Yao Ning's PR, I proposed a new PR for this:
https://github.com/ceph/ceph/pull/8083

In this PR I also solved the following upgrade problem. Consider an
upgrade to this can_recover_partial version, e.g. for a pg 3.67 [0, 1, 2]:
1) first we update osd.0 (service ceph restart osd.0); it recovers
normally and everything goes on;
2) a write req (e.g. req1, which will write to obj1) is sent to the
primary (osd.0), and the pglog records the req;
3) then we update osd.1; sending req1 to osd.1 fails, but it is sent to
osd.2. While osd.2 is handling the req (in do_request), pg 3.67 starts
peering, and osd.2 calls can_discard_request and decides req1 should be
dropped;
4) so req1 only succeeded on osd.0; because min_size=2, osd.0 re-enqueues
req1;
5) during peering, the primary finds that req1's object obj1 is missing on
osd.1 and osd.2, so it recovers the object;
6) because osd.0 and osd.1 are already updated, osd.0 calculates the
partial data in prep_push_to_replica, and osd.1 handles the partial data
fine;
7) but osd.2 has not been updated; in osd.2's code (submit_push_data) it
removes the original object first and then writes the partial data from
osd.0, so the original data of the object is lost.

2016-01-22 19:40 GMT+08:00 Ning Yao <zay11...@gmail.com>:
> Great! Based on Sage's suggestion, we just add a flag
> can_recover_partial to indicate whether partial recovery is possible.
> And I proposed a new PR for this: https://github.com/ceph/ceph/pull/7325
> Please review and comment.
>
> Regards
> Ning Yao
>
> 2015-12-25 22:27 GMT+08:00 Sage Weil <s...@newdream.net>:
>> On Fri, 25 Dec 2015, Ning Yao wrote:
>>> Hi, Dong Wu,
>>>
>>> 1. As I am currently working on other things, this proposal has been
>>> abandoned for a long time.
>>> 2. This is a complicated task, as we need to consider a lot (not just
>>> writeOp, but also truncate, delete) and also the different effects on
>>> different backends (Replicated, EC).
>>> 3. I don't think it is a good time to redo this patch now, since
>>> BlueStore and KStore are in progress, and I'm afraid of bringing in
>>> side effects. We may prepare and propose the whole design at the next
>>> CDS.
>>> 4. Currently we already have some tricks to deal with recovery (like
>>> throttling the max recovery ops, setting the priority for recovery and
>>> so on). So this kind of patch may not solve the critical problem, just
>>> make things better, and I am not quite sure it will really bring a big
>>> improvement. Based on my previous test, it works excellently on slow
>>> disks (say hdd), and also for short-time maintenance. Otherwise, it
>>> will trigger the backfill process. So wait for Sage's opinion @sage
>>>
>>> If you are interested in this, we may cooperate to do it.
>>
>> I think it's a great idea. We didn't do it before only because it is
>> complicated. The good news is that if we can't conclusively infer exactly
>> which parts of the object need to be recovered from the log entry we can
>> always just fall back to recovering the whole thing. Also, the place
>> where this is currently most visible is RBD small writes:
>>
>> - osd goes down
>> - client sends a 4k overwrite and modifies an object
>> - osd comes back up
>> - client sends another 4k overwrite
>> - client io blocks while osd recovers 4mb
>>
>> So even if we initially ignore truncate and omap and EC and clones and
>> anything else complicated I suspect we'll get a nice benefit.
>>
>> I haven't thought about this too much, but my guess is that the hard part
>> is making the primary's missing set representation include a partial delta
>> (say, an interval_set<> indicating which ranges of the file have changed)
>> in a way that gracefully degrades to recovering the whole object if we're
>> not sure.
>>
>> In any case, we should definitely have the design conversation!
>>
>> sage
>>
>>> Regards
>>> Ning Yao
>>>
>>> 2015-12-25 14:23 GMT+08:00 Dong Wu <archer.wud...@gmail.com>:
>>> > Thanks, from this pull request I learned that this issue is not
>>> > completed; is there any new progress on it?
>>> >
>>> > 2015-12-25 12:30 GMT+08:00 Xinze Chi (信泽) <xmdx...@gmail.com>:
>>> >> Yeah, this is a good idea for recovery, but not for backfill.
>>> >> @YaoNing opened a pull request about this
>>> >> https://github.com/ceph/ceph/pull/3837 this year.
>>> >>
>>> >> 2015-12-25 11:16 GMT+08:00 Dong Wu <a
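Sage's idea of an interval_set<> delta that "gracefully degrades to recovering the whole object" can be sketched in a few lines. This is an illustration only, not the Ceph implementation (which would live in the C++ missing-set/pglog code): merge the (offset, len) extents recorded in the log entries, and fall back to the whole object whenever the delta is not conclusively known.

```python
def merge_extents(extents):
    """Union a list of (offset, len) write extents into sorted,
    disjoint (start, end) intervals -- a toy interval_set."""
    ivs = sorted((off, off + length) for off, length in extents)
    merged = []
    for start, end in ivs:
        if merged and start <= merged[-1][1]:
            # overlaps or touches the previous interval: extend it
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def recovery_set(log_extents, object_size, know_full_delta):
    """Partial delta if every relevant log entry carried (offset, len);
    otherwise fall back to recovering the whole object."""
    if not know_full_delta:
        return [(0, object_size)]
    return merge_extents(log_extents)

# Two 4 KB overwrites at offsets 0 and 8192 on a 4 MB object:
print(recovery_set([(0, 4096), (8192, 4096)], 4 << 20, True))
# -> [(0, 4096), (8192, 12288)]  i.e. recover 8 KB instead of 4 MB
```

For the RBD small-write case Sage describes, this is exactly the win: two 4k overwrites shrink the recovery payload from 4MB to 8KB, while any truncate/omap/EC uncertainty simply flips know_full_delta off.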
[ceph-users] how to downgrade when upgrade from firefly to hammer fail
Hi, cephers,

I want to upgrade my ceph cluster from firefly (0.80.11) to hammer. I
successfully installed the hammer deb packages on all my hosts, then
updated the monitors first, which succeeded. But when I restarted the OSDs
on one host to upgrade it, it failed: the OSDs cannot start up. I then
wanted to downgrade to firefly again to keep my cluster going, but after
reinstalling the firefly deb packages, I failed to start the OSDs on that
host. Here is the log:

2016-03-07 09:47:14.704242 7f2f11ba87c0 0 ceph version 0.80.11 (8424145d49264624a3b0a204aedb127835161070), process ceph-osd, pid 37459
2016-03-07 09:47:14.709159 7f2f11ba87c0 -1 filestore(/var/lib/ceph/osd/ceph-0) FileStore::mount : stale version stamp 4. Please run the FileStore update script before starting the OSD, or set filestore_update_to to 3
2016-03-07 09:47:14.709176 7f2f11ba87c0 -1 ** ERROR: error converting store /var/lib/ceph/osd/ceph-0: (22) Invalid argument
2016-03-07 09:47:18.385399 7f98478187c0 0 ceph version 0.80.11 (8424145d49264624a3b0a204aedb127835161070), process ceph-osd, pid 39041
2016-03-07 09:47:18.390320 7f98478187c0 -1 filestore(/var/lib/ceph/osd/ceph-0) FileStore::mount : stale version stamp 4. Please run the FileStore update script before starting the OSD, or set filestore_update_to to 3
2016-03-07 09:47:18.390337 7f98478187c0 -1 ** ERROR: error converting store /var/lib/ceph/osd/ceph-0: (22) Invalid argument

How can I downgrade to firefly successfully?
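For what it's worth, the error message itself names the escape hatch it checks. Whether a firefly OSD can safely mount a FileStore that hammer has already stamped as version 4 is not guaranteed (the usual advice is to roll forward rather than back, and to take a backup first), but the option the log mentions would be set in ceph.conf roughly like this:

```ini
[osd]
# Hammer bumped the on-disk FileStore version stamp to 4; firefly expects 3.
# The mount error suggests either running the FileStore update script or
# setting this knob -- treat it as an experiment, not a supported downgrade.
filestore update to = 3
```

If the OSDs still refuse to mount after this, finishing the hammer upgrade on that host and debugging why the hammer OSDs failed to start is likely the safer path.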
Re: [ceph-users] Read IO to object while new data still in journal
What I know is that librbd uses the applied callback. Here is the code:
send_write() calls librados::Rados::aio_create_completion with
rados_req_cb as cb_safe and NULL as cb_complete, and cb_safe here is just
the applied callback.

void AbstractWrite::send_write() {
  ldout(m_ictx->cct, 20) << "send_write " << this << " " << m_oid << " "
                         << m_object_off << "~" << m_object_len << dendl;
  m_state = LIBRBD_AIO_WRITE_FLAT;
  guard_write();
  add_write_ops(&m_write);
  assert(m_write.size() != 0);
  librados::AioCompletion *rados_completion =
    librados::Rados::aio_create_completion(this, NULL, rados_req_cb);
  int r = m_ictx->data_ctx.aio_operate(m_oid, rados_completion, &m_write,
                                       m_snap_seq, m_snaps);
  assert(r == 0);
  rados_completion->release();
}

librados::AioCompletion *librados::Rados::aio_create_completion(
    void *cb_arg, callback_t cb_complete, callback_t cb_safe)
{
  AioCompletionImpl *c;
  int r = rados_aio_create_completion(cb_arg, cb_complete, cb_safe, (void**)&c);
  assert(r == 0);
  return new AioCompletion(c);
}

Anything wrong?

Regards,
Dong Wu

2015-12-31 10:33 GMT+08:00 min fang <louisfang2...@gmail.com>:
> yes, the question here is, librbd uses the committed callback; as my
> understanding, when this callback returns, the librbd write is regarded
> as completed. So I can issue a read IO even if the data is not readable.
> In this case, I would like to know what data will be returned for the
> read IO?
>
> 2015-12-31 10:29 GMT+08:00 Dong Wu <archer.wud...@gmail.com>:
>>
>> there are two callbacks: committed and applied. committed means written
>> to all replicas' journals, applied means written to all replicas' file
>> systems. so when the applied callback returns to the client, it means
>> the data can be read.
>>
>> 2015-12-31 10:15 GMT+08:00 min fang <louisfang2...@gmail.com>:
>> > Hi, as my understanding, a write IO will commit data to the journal
>> > first, then give a safe callback to the ceph client. So it is
>> > possible that data is still in the journal when I send a read IO to
>> > the same area. So what data will be returned if the new data is
>> > still in the journal?
>> >
>> > Thanks.
Re: [ceph-users] Read IO to object while new data still in journal
There are two callbacks: committed and applied. Committed means the write
has reached all replicas' journals; applied means it has reached all
replicas' file systems. So when the applied callback returns to the
client, the data can be read.

2015-12-31 10:15 GMT+08:00 min fang:
> Hi, as my understanding, a write IO will commit data to the journal
> first, then give a safe callback to the ceph client. So it is possible
> that data is still in the journal when I send a read IO to the same
> area. So what data will be returned if the new data is still in the
> journal?
>
> Thanks.
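The two acks described above can be modeled in a few lines. This is a toy illustration of the ordering only, not the librados API (real clients receive these events through AioCompletion callbacks):

```python
class ReplicatedWrite:
    """Toy model: 'committed' = in every replica's journal (durable);
    'applied' = in every replica's filesystem (visible to reads)."""
    def __init__(self, replicas):
        self.replicas = set(replicas)
        self.journaled = set()
        self.applied = set()

    def on_journal(self, osd):
        self.journaled.add(osd)

    def on_apply(self, osd):
        self.applied.add(osd)

    @property
    def committed(self):       # the 'safe' callback would fire here
        return self.journaled == self.replicas

    @property
    def readable(self):        # the 'applied' callback: data readable
        return self.applied == self.replicas

w = ReplicatedWrite([1, 2, 3])
for osd in (1, 2, 3):
    w.on_journal(osd)
print(w.committed, w.readable)  # -> True False : durable, not yet applied
```

Note also that ops on one object are ordered at the primary, so a read racing an in-flight write is queued behind it rather than observing journal-only state; the gap between committed and readable matters mostly for reasoning, not for returning stale data.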
[ceph-users] how io works when backfill
Hi,

When we add or remove an osd, ceph backfills to rebalance data. e.g.:
- pg1.0 [1, 2, 3]
- add an osd (e.g. osd.7)
- ceph starts backfill, and pg1.0's osd set changes to [1, 2, 7]
- say [a, b, c, d, e] are the objects needing backfill to osd.7, and
  object a is being backfilled right now
- when a write io hits object a, the io must wait for the backfill of a
  to complete, then goes on.
- but if the io hits object b, which has not been backfilled yet, the io
  reaches osd.1, then osd.1 sends it to osd.2 and osd.7; but osd.7 does
  not have object b, so osd.7 needs to wait for object b to be
  backfilled, then write.

Is that right? Or does osd.1 only send the io to osd.2, not both?
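My reading of the backfill logic (hedged; the authoritative answer is the OSD's op-handling code, not this sketch) is that the deciding factor is the backfill target's last_backfill watermark. Roughly, with object ordering simplified to plain name comparison rather than the real hobject_t sort:

```python
def write_targets(obj, acting, backfill_target, last_backfill, in_flight):
    """Sketch of how a primary treats the backfill target for a write:
    - obj <= last_backfill: target already has it, replicate normally
    - obj currently being pushed: block the write until the push completes
    - obj > last_backfill: skip the target; backfill will carry the data
    Returns (osds the write is sent to, whether the write must wait)."""
    if obj in in_flight:
        return list(acting), True                       # wait, then retry
    if obj <= last_backfill:
        return list(acting) + [backfill_target], False  # target has obj
    return list(acting), False                          # target skipped

# osd.1 primary + osd.2 replica, osd.7 backfilling, currently pushing 'a'
print(write_targets("b", [1, 2], 7, last_backfill="a", in_flight={"a"}))
# -> ([1, 2], False)
```

So for object b, past the watermark, the primary would not replicate to osd.7 at all rather than making osd.7 wait; only a write to the object currently being pushed blocks.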
Re: [ceph-users] why not add (offset,len) to pglog
Thank you for your reply. I am looking forward to Sage's opinion too
@sage. I'll also keep up with BlueStore's and KStore's progress.

Regards

2015-12-25 14:48 GMT+08:00 Ning Yao <zay11...@gmail.com>:
> Hi, Dong Wu,
>
> 1. As I am currently working on other things, this proposal has been
> abandoned for a long time.
> 2. This is a complicated task, as we need to consider a lot (not just
> writeOp, but also truncate, delete) and also the different effects on
> different backends (Replicated, EC).
> 3. I don't think it is a good time to redo this patch now, since
> BlueStore and KStore are in progress, and I'm afraid of bringing in
> side effects. We may prepare and propose the whole design at the next
> CDS.
> 4. Currently we already have some tricks to deal with recovery (like
> throttling the max recovery ops, setting the priority for recovery and
> so on). So this kind of patch may not solve the critical problem, just
> make things better, and I am not quite sure it will really bring a big
> improvement. Based on my previous test, it works excellently on slow
> disks (say hdd), and also for short-time maintenance. Otherwise, it
> will trigger the backfill process. So wait for Sage's opinion @sage
>
> If you are interested in this, we may cooperate to do it.
>
> Regards
> Ning Yao
>
> 2015-12-25 14:23 GMT+08:00 Dong Wu <archer.wud...@gmail.com>:
>> Thanks, from this pull request I learned that this issue is not
>> completed; is there any new progress on it?
>>
>> 2015-12-25 12:30 GMT+08:00 Xinze Chi (信泽) <xmdx...@gmail.com>:
>>> Yeah, this is a good idea for recovery, but not for backfill.
>>> @YaoNing opened a pull request about this
>>> https://github.com/ceph/ceph/pull/3837 this year.
>>>
>>> 2015-12-25 11:16 GMT+08:00 Dong Wu <archer.wud...@gmail.com>:
>>>> Hi,
>>>> I have a doubt about the pglog: the pglog contains
>>>> (op, object, version) etc. When peering, the pglog is used to
>>>> construct the missing list, and then the whole object in the missing
>>>> list is recovered even if the data that differs among replicas is
>>>> much less than a whole object (e.g. 4MB).
>>>> Why not add (offset, len) to the pglog? If so, the missing list
>>>> could contain (object, offset, len), and we could reduce the
>>>> recovered data.
>>>
>>> --
>>> Regards,
>>> Xinze Chi
Re: [ceph-users] why not add (offset,len) to pglog
Thanks, from this pull request I learned that this issue is not completed;
is there any new progress on it?

2015-12-25 12:30 GMT+08:00 Xinze Chi (信泽) <xmdx...@gmail.com>:
> Yeah, this is a good idea for recovery, but not for backfill.
> @YaoNing opened a pull request about this
> https://github.com/ceph/ceph/pull/3837 this year.
>
> 2015-12-25 11:16 GMT+08:00 Dong Wu <archer.wud...@gmail.com>:
>> Hi,
>> I have a doubt about the pglog: the pglog contains (op, object, version)
>> etc. When peering, the pglog is used to construct the missing list, and
>> then the whole object in the missing list is recovered even if the data
>> that differs among replicas is much less than a whole object (e.g. 4MB).
>> Why not add (offset, len) to the pglog? If so, the missing list could
>> contain (object, offset, len), and we could reduce the recovered data.
>
> --
> Regards,
> Xinze Chi
[ceph-users] why not add (offset,len) to pglog
Hi,
I have a doubt about the pglog: the pglog contains (op, object, version)
etc. When peering, the pglog is used to construct the missing list, and
then the whole object in the missing list is recovered even if the data
that differs among replicas is much less than a whole object (e.g. 4MB).
Why not add (offset, len) to the pglog? If so, the missing list could
contain (object, offset, len), and we could reduce the recovered data.