Re: [ceph-users] EU mirror now supports rsync
Hi Wido,

I have had connection refused for a few days on your mirror using rsync (via IPv4). Is it a blacklist or an issue?

Thank you, Florent

On 04/09/2014 08:04 AM, Wido den Hollander wrote:
> Hi, I just enabled rsync on the eu.ceph.com mirror. eu.ceph.com mirrors from Ceph.com every 3 hours. Feel free to rsync all the contents to your local environment; it might be useful for some large deployments where you want to save external bandwidth by not having each machine fetch the Deb/RPM packages from the internet.
>
> Rsync is available over IPv4 and IPv6. Simply sync with this command:
>
> $ mkdir cephmirror
> $ rsync -avr --stats --progress eu.ceph.com::ceph cephmirror
>
> I ask you all to be gentle. It's a free service, so don't start hammering the server by setting your Cron to sync every 5 minutes. Once every couple of hours should be sufficient. Also, please don't all start syncing at the first minute of the hour. When setting up the Cron, select a random minute of the hour. This way the load on the system can be spread out.
>
> Should you have any questions or issues, let me know!
>
> -- Wido den Hollander 42on B.V. Ceph trainer and consultant

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
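Wido's advice (sync every few hours, at a random minute) can be baked into the cron entry when it is installed. A minimal sketch; the mirror directory /srv/cephmirror and the 3-hour schedule are examples, not from the original post:

```shell
# Pick a random minute once at install time, then sync every 3 hours at
# that minute so clients don't all hit the mirror at the top of the hour.
MINUTE=$((RANDOM % 60))
CRON_LINE="$MINUTE */3 * * * rsync -aq eu.ceph.com::ceph /srv/cephmirror"
echo "$CRON_LINE"   # append this to your crontab, e.g. via `crontab -e`
```

Picking the minute once (rather than at every run) keeps the schedule stable while still spreading load across mirror users.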
Re: [ceph-users] cephfs survey results
On Tue, 4 Nov 2014, Blair Bethwaite wrote:
> On 4 November 2014 01:50, Sage Weil s...@newdream.net wrote:
> > In the Ceph session at the OpenStack summit someone asked what the CephFS survey results looked like.
>
> Thanks Sage, that was me! Here's the link: https://www.surveymonkey.com/results/SM-L5JV7WXL/
>
> > In short, people want fsck, multimds, snapshots, quotas.
>
> TBH I'm a bit surprised by a couple of these and hope maybe you guys will apply a certain amount of filtering on this... fsck and quotas were there for me, but multimds and snapshots are what I'd consider icing features - they're nice to have, but not on the critical path to using cephfs instead of e.g. nfs in a production setting. I'd have thought stuff like small file performance and gateway support was much more relevant to uptake and a positive/pain-free UX. Interested to hear others' rationale here.

Yeah, I agree, and am taking the results with a grain of salt. I think the results are heavily influenced by the order they were originally listed (I wish surveymonkey would randomize it for each person or something). fsck is a clear #1. Everybody wants multimds, but I think very few actually need it at this point. We'll be merging a soft quota patch shortly, and things like performance (adding the inline data support to the kernel client, for instance) will probably compete with getting snapshots working (as part of a larger subvolume infrastructure).

That's my guess at least; for now, we're really focused on fsck and hard usability edges and haven't set priorities beyond that. We're definitely interested in hearing feedback on this strategy, and on people's experiences with giant so far...

sage
Re: [ceph-users] Ceph Giant not fixed ReplicatedPG::NotTrimming?
Hi Sam,

I resent the logs with the debug options: http://123.30.41.138/ceph-osd.21.log (sorry about my spam :D). I saw many missing objects :|

2014-11-04 15:26:02.205607 7f3ab11a8700 10 osd.21 pg_epoch: 106407 pg[24.7d7( v 106407'491583 lc 106401'491579 (105805'487042,106407'491583] local-les=106403 n=179 ec=25000 les/c 106403/106390 106402/106402/106402) [21,28,4] r=0 lpr=106402 pi=106377-106401/4 rops=1 crt=106401'491581 mlcod 106393'491097 active+recovering+degraded m=2 snaptrimq=[306~1,312~1]] recover_primary 675ea7d7/rbd_data.4930222ae8944a.0001/head//24 106401'491580 (missing) (missing head) (recovering) (recovering head)

2014-11-04 15:26:02.205642 7f3ab11a8700 10 osd.21 pg_epoch: 106407 pg[24.7d7( v 106407'491583 lc 106401'491579 (105805'487042,106407'491583] local-les=106403 n=179 ec=25000 les/c 106403/106390 106402/106402/106402) [21,28,4] r=0 lpr=106402 pi=106377-106401/4 rops=1 crt=106401'491581 mlcod 106393'491097 active+recovering+degraded m=2 snaptrimq=[306~1,312~1]] recover_primary d4d4bfd7/rbd_data.c6964d30a28220.035f/head//24 106401'491581 (missing) (missing head)

2014-11-04 15:26:02.237994 7f3ab29ab700 10 osd.21 pg_epoch: 106407 pg[24.7d7( v 106407'491583 lc 106401'491579 (105805'487042,106407'491583] local-les=106403 n=179 ec=25000 les/c 106403/106390 106402/106402/106402) [21,28,4] r=0 lpr=106402 pi=106377-106401/4 rops=2 crt=106401'491581 mlcod 106393'491097 active+recovering+degraded m=2 snaptrimq=[306~1,312~1]] got missing d4d4bfd7/rbd_data.c6964d30a28220.035f/head//24 v 106401'491581

Thanks Sam and All,
-- Tuan
HaNoi-Vietnam

On 11/04/2014 04:54 AM, Samuel Just wrote:
> Can you reproduce with
>     debug osd = 20
>     debug filestore = 20
>     debug ms = 1
> in the [osd] section of that osd's ceph.conf?
> -Sam

On Sun, Nov 2, 2014 at 9:10 PM, Ta Ba Tuan tua...@vccloud.vn wrote:
> Hi Sage, Samuel and All, I upgraded to GIANT, but that error still appears |: I'm trying to delete the related objects/volumes, but it is very hard to verify the missing objects :(.
Guide me to resolve it, please! (I've attached the detailed log.)

2014-11-03 11:37:57.730820 7f28fb812700 0 osd.21 105950 do_command r=0
2014-11-03 11:37:57.856578 7f28fc013700 -1 *** Caught signal (Segmentation fault) ** in thread 7f28fc013700
 ceph version 0.87-6-gdba7def (dba7defc623474ad17263c9fccfec60fe7a439f0)
 1: /usr/bin/ceph-osd() [0x9b6725]
 2: (()+0xfcb0) [0x7f291fc2acb0]
 3: (ReplicatedPG::trim_object(hobject_t const&)+0x395) [0x811b55]
 4: (ReplicatedPG::TrimmingObjects::react(ReplicatedPG::SnapTrim const&)+0x43e) [0x82b9be]
 5: (boost::statechart::simple_state<ReplicatedPG::TrimmingObjects, ReplicatedPG::SnapTrimmer, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0xc0) [0x870ce0]
 6: (boost::statechart::state_machine<ReplicatedPG::SnapTrimmer, ReplicatedPG::NotTrimming, std::allocator<void>, boost::statechart::null_exception_translator>::process_queued_events()+0xfb) [0x85618b]
 7: (boost::statechart::state_machine<ReplicatedPG::SnapTrimmer, ReplicatedPG::NotTrimming, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x1e) [0x85633e]
 8: (ReplicatedPG::snap_trimmer()+0x4f8) [0x7d5ef8]
 9: (OSD::SnapTrimWQ::_process(PG*)+0x14) [0x673ab4]
 10: (ThreadPool::worker(ThreadPool::WorkThread*)+0x48e) [0xa8fade]
 11: (ThreadPool::WorkThread::entry()+0x10) [0xa92870]
 12: (()+0x7e9a) [0x7f291fc22e9a]
 13: (clone()+0x6d) [0x7f291e5ed31d]
 NOTE: a copy of the executable, or `objdump -rdS executable` is needed to interpret this.
-9993 2014-11-03 11:37:47.689335 7f28fc814700 1 -- 172.30.5.2:6803/7606 -- 172.30.5.1:6886/3511 -- MOSDPGPull(6.58e 105950 [PullOp(87f82d8e/rbd_data.45e62779c99cf1.22b5/head//6, recovery_info: ObjectRecoveryInfo(87f82d8e/rbd_data.45e62779c99cf1.22b5/head//6@105938'11622009, copy_subset: [0~18446744073709551615], clone_subset: {}), recovery_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false))]) v2 -- ?+0 0x26c59000 con 0x22fbc420

-2 2014-11-03 11:37:57.853585 7f2902820700 5 osd.21 pg_epoch: 105950 pg[24.9e4( v 105946'113392 lc 105946'113391 (103622'109598,105946'113392] local-les=105948 n=88 ec=25000 les/c 105948/105943 105947/105947/105947) [21,112,33] r=0 lpr=105947 pi=105933-105946/4 crt=105946'113392 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded m=1 snaptrimq=[303~3,307~1]] enter Started/Primary/Active/Recovering

-1 2014-11-03 11:37:57.853735 7f28fc814700 1 -- 172.30.5.2:6803/7606 -- 172.30.5.9:6806/24552 -- MOSDPGPull(24.9e4 105950 [PullOp(5abb99e4/rbd_data.5dd32f2ae8944a.0165/head//24, recovery_info:
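For reference, the debug settings Sam asked for earlier in this thread go into the affected OSD's ceph.conf like this (restart the OSD daemon afterwards for them to take effect):

```ini
[osd]
    debug osd = 20
    debug filestore = 20
    debug ms = 1
```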
Re: [ceph-users] 0.87 rados df fault
Thanks for your answer Greg. Unfortunately, the three monitors were working perfectly for at least 30 minutes after the upgrade. I don't know their memory usage at the time.

What I did was: upgrade mons, upgrade osds, upgrade mds (single mds), upgrade fuse clients. I checked that everything was ok (health OK and data available). Then I started an rsync of around 7TB of data, mostly files between 100KB and 10MB, with 6TB of data already in CephFS. Currently the memory usage of my mons is around 110MB (on 1GB of memory and 1GB of swap). I'll keep an eye on this.

On another matter (maybe I should start another thread), sometimes I have:

health HEALTH_WARN mds0: Client wimi-recette-files-nginx:recette-files-rw failing to respond to cache pressure; mds0: Client wimi-prod-backupmanager:files-rw failing to respond to cache pressure

And two minutes later:

health HEALTH_OK

CephFS fuse clients only. But everything is working well, so I'm not so worried.

Regards,
-- Thomas Lemarchand, Cloud Solutions SAS - Responsable des systèmes d'information

On lun., 2014-11-03 at 09:57 -0800, Gregory Farnum wrote:
> On Mon, Nov 3, 2014 at 4:40 AM, Thomas Lemarchand thomas.lemarch...@cloud-solutions.fr wrote:
> > Update:
> > /var/log/kern.log.1:Oct 31 17:19:17 c-mon kernel: [17289149.746084] [21787] 0 21780 492110 185044 920 240143 0 ceph-mon
> > /var/log/kern.log.1:Oct 31 17:19:17 c-mon kernel: [17289149.746115] [13136] 0 13136 52172 1753 590 0 ceph
> > /var/log/kern.log.1:Oct 31 17:19:17 c-mon kernel: [17289149.746126] Out of memory: Kill process 21787 (ceph-mon) score 827 or sacrifice child
> > /var/log/kern.log.1:Oct 31 17:19:17 c-mon kernel: [17289149.746262] Killed process 21787 (ceph-mon) total-vm:1968440kB, anon-rss:740176kB, file-rss:0kB
> >
> > OOM kill. I have 1GB memory on my mons, and 1GB swap. It's the only mon that crashed. Is there a change in memory requirements from Firefly?
>
> There generally shouldn't be, but I don't think it's something we monitored closely.
More likely your monitor was running near its memory limit already, and restarting all the OSDs (and servicing the resulting changes) pushed it over the edge.
-Greg
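One simple way to "keep an eye on this" is to poll the resident memory of the ceph-mon process from cron. A minimal sketch; the ~700 MB alert threshold is an example loosely based on the anon-rss the OOM killer reported above, not a recommended value:

```shell
# Print ceph-mon resident memory in kB (summed, in case of multiple mons
# on one host) and warn when it crosses the example threshold.
RSS_KB=$(ps -o rss= -C ceph-mon | awk '{ sum += $1 } END { print sum + 0 }')
echo "ceph-mon rss: ${RSS_KB} kB"
if [ "$RSS_KB" -gt 716800 ]; then
    echo "WARNING: ceph-mon memory is high"
fi
```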
Re: [ceph-users] cephfs survey results
On 11/04/2014 10:02 AM, Sage Weil wrote:
> On Tue, 4 Nov 2014, Blair Bethwaite wrote:
> > On 4 November 2014 01:50, Sage Weil s...@newdream.net wrote:
> > > In the Ceph session at the OpenStack summit someone asked what the CephFS survey results looked like.
> >
> > Thanks Sage, that was me! Here's the link: https://www.surveymonkey.com/results/SM-L5JV7WXL/
> >
> > > In short, people want fsck, multimds, snapshots, quotas.
> >
> > TBH I'm a bit surprised by a couple of these and hope maybe you guys will apply a certain amount of filtering on this... fsck and quotas were there for me, but multimds and snapshots are what I'd consider icing features - they're nice to have, but not on the critical path to using cephfs instead of e.g. nfs in a production setting. I'd have thought stuff like small file performance and gateway support was much more relevant to uptake and a positive/pain-free UX. Interested to hear others' rationale here.
>
> Yeah, I agree, and am taking the results with a grain of salt. I think the results are heavily influenced by the order they were originally listed (I wish surveymonkey would randomize it for each person or something). fsck is a clear #1. Everybody wants multimds, but I think very few actually need it at this point. We'll be merging a soft quota patch shortly, and things like performance (adding the inline data support to the kernel client, for instance) will probably compete with getting snapshots working (as part of a larger subvolume infrastructure). That's my guess at least; for now, we're really focused on fsck and hard usability edges and haven't set priorities beyond that. We're definitely interested in hearing feedback on this strategy, and on people's experiences with giant so far...

I think the approach is correct. Everybody I talk to wants to kick out their NFS server, but you don't need multi MDS for that. Active/Standby is just fine.
Wido

> sage

-- Wido den Hollander 42on B.V. Ceph trainer and consultant
Phone: +31 (0)20 700 9902
Skype: contact42on
Re: [ceph-users] cephfs survey results
On Tue, 4 Nov 2014 10:36:07 +1100, Blair Bethwaite blair.bethwa...@gmail.com wrote:
> TBH I'm a bit surprised by a couple of these and hope maybe you guys will apply a certain amount of filtering on this... fsck and quotas were there for me, but multimds and snapshots are what I'd consider icing features - they're nice to have, but not on the critical path to using cephfs instead of e.g. nfs in a production setting. I'd have thought stuff like small file performance and gateway support was much more relevant to uptake and a positive/pain-free UX. Interested to hear others' rationale here.

Those are related; if small file performance is enough for one MDS to handle a high load with a lot of small files (the typical webserver case), having multiple active MDSes will be less of a priority. And if someone currently has OSDs on a bunch of relatively weak nodes, then again, an active-active MDS setup will be more interesting to them than to someone who can just buy a new fast machine for it.

-- Mariusz Gronczewski, Administrator
Efigence S. A., ul. Wołoska 9a, 02-583 Warszawa
T: [+48] 22 380 13 13
F: [+48] 22 380 13 14
E: mariusz.gronczew...@efigence.com
Re: [ceph-users] emperor - firefly 0.80.7 upgrade problem
On Monday, November 03, 2014 17:34:06 you wrote:
> If you have osds that are close to full, you may be hitting 9626. I pushed a branch based on v0.80.7 with the fix, wip-v0.80.7-9626.
> -Sam

Thanks Sam, I may have been hitting that as well. I certainly hit too_full conditions often. I am able to squeeze PGs off of the too_full OSD by reweighting, and then eventually all PGs get to where they want to be. Kind of silly that I have to do this manually, though. Could Ceph order the PG movements better? (Is this what your bug fix does, in effect?)

So, at the moment there are no PGs moving around the cluster, but not all are in active+clean. Also, there is one OSD which has blocked requests. The OSD seems idle, and restarting the OSD just results in a younger blocked request.

~# ceph -s
    cluster 7797e50e-f4b3-42f6-8454-2e2b19fa41d6
     health HEALTH_WARN 35 pgs down; 208 pgs incomplete; 210 pgs stuck inactive; 210 pgs stuck unclean; 1 requests are blocked 32 sec
     monmap e3: 3 mons at {mon01=128.104.164.197:6789/0,mon02=128.104.164.198:6789/0,mon03=144.92.180.139:6789/0}, election epoch 2996, quorum 0,1,2 mon01,mon02,mon03
     osdmap e115306: 24 osds: 24 up, 24 in
      pgmap v6630195: 8704 pgs, 7 pools, 6344 GB data, 1587 kobjects
            12747 GB used, 7848 GB / 20596 GB avail
                   2 inactive
                8494 active+clean
                 173 incomplete
                  35 down+incomplete

# ceph health detail
...
1 ops are blocked 8388.61 sec
1 ops are blocked 8388.61 sec on osd.15
1 osds have slow requests

From the log of the osd with the blocked request (osd.15):

2014-11-04 08:57:26.851583 7f7686331700 0 log [WRN] : 1 slow requests, 1 included below; oldest blocked for 3840.430247 secs
2014-11-04 08:57:26.851593 7f7686331700 0 log [WRN] : slow request 3840.430247 seconds old, received at 2014-11-04 07:53:26.421301: osd_op(client.11334078.1:592 rb.0.206609.238e1f29.000752e8 [read 512~512] 4.17df39a7 RETRY=1 retry+read e115304) v4 currently reached pg

Other requests (like PG scrubs) are happening without taking a long time on this OSD.
Also, this was one of the OSDs which I completely drained, removed from ceph, reformatted, and created again using ceph-deploy. So it is completely created by firefly 0.80.7 code.

As Greg requested, output of ceph scrub:

2014-11-04 09:25:58.761602 7f6c0e20b700 0 mon.mon01@0(leader) e3 handle_command mon_command({prefix: scrub} v 0) v1
2014-11-04 09:26:21.320043 7f6c0ea0c700 1 mon.mon01@0(leader).paxos(paxos updating c 11563072..11563575) accept timeout, calling fresh election
2014-11-04 09:26:31.264873 7f6c0ea0c700 0 mon.mon01@0(probing).data_health(2996) update_stats avail 38% total 6948572 used 3891232 avail 2681328
2014-11-04 09:26:33.529403 7f6c0e20b700 0 log [INF] : mon.mon01 calling new monitor election
2014-11-04 09:26:33.538286 7f6c0e20b700 1 mon.mon01@0(electing).elector(2996) init, last seen epoch 2996
2014-11-04 09:26:38.809212 7f6c0ea0c700 0 log [INF] : mon.mon01@0 won leader election with quorum 0,2
2014-11-04 09:26:40.215095 7f6c0e20b700 0 log [INF] : monmap e3: 3 mons at {mon01=128.104.164.197:6789/0,mon02=128.104.164.198:6789/0,mon03=144.92.180.139:6789/0}
2014-11-04 09:26:40.215754 7f6c0e20b700 0 log [INF] : pgmap v6630201: 8704 pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
2014-11-04 09:26:40.215913 7f6c0e20b700 0 log [INF] : mdsmap e1: 0/0/1 up
2014-11-04 09:26:40.216621 7f6c0e20b700 0 log [INF] : osdmap e115306: 24 osds: 24 up, 24 in
2014-11-04 09:26:41.227010 7f6c0e20b700 0 log [INF] : pgmap v6630202: 8704 pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
2014-11-04 09:26:41.367373 7f6c0e20b700 1 mon.mon01@0(leader).osd e115307 e115307: 24 osds: 24 up, 24 in
2014-11-04 09:26:41.437706 7f6c0e20b700 0 log [INF] : osdmap e115307: 24 osds: 24 up, 24 in
2014-11-04 09:26:41.471558 7f6c0e20b700 0 log [INF] : pgmap v6630203: 8704 pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
2014-11-04 09:26:41.497318 7f6c0e20b700 1 mon.mon01@0(leader).osd e115308 e115308: 24 osds: 24 up, 24 in
2014-11-04 09:26:41.533965 7f6c0e20b700 0 log [INF] : osdmap e115308: 24 osds: 24 up, 24 in
2014-11-04 09:26:41.553161 7f6c0e20b700 0 log [INF] : pgmap v6630204: 8704 pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
2014-11-04 09:26:42.701720 7f6c0e20b700 1 mon.mon01@0(leader).osd e115309 e115309: 24 osds: 24 up, 24 in
2014-11-04 09:26:42.953977 7f6c0e20b700 0 log [INF] : osdmap e115309: 24 osds: 24 up, 24 in
2014-11-04 09:26:45.776411 7f6c0e20b700 0 log [INF] : pgmap v6630205: 8704 pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete; 6344 GB data,
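When hunting for OSDs with blocked requests like the one above, the `ceph health detail` output can be filtered rather than read by eye. A minimal sketch of the idea:

```shell
# List the distinct OSDs named in "ops are blocked ... on osd.N" lines.
ceph health detail 2>/dev/null | grep -o 'on osd\.[0-9]*' | sort -u
```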
[ceph-users] RBD - possible to query used space of images/clones ?
Hi,

is there a way to query the used space of a RBD image created with format 2 (used for kvm)? Also, if I create a linked clone based on this image, how do I get the additional, individual used space of this clone? In zfs, I can query this kind of information by calling zfs get all (1). rbd info (2) shows not that much information about the image.

best regards
Danny

---

(1) Output of zfs get all from a Solaris system

root@storage19:~# zfs get all pool5/w2k8.dsk
NAME            PROPERTY              VALUE                  SOURCE
pool5/w2k8.dsk  available             75,3G                  -
pool5/w2k8.dsk  checksum              on                     default
pool5/w2k8.dsk  compression           off                    default
pool5/w2k8.dsk  compressratio         1.00x                  -
pool5/w2k8.dsk  copies                1                      default
pool5/w2k8.dsk  creation              Di. Mai 10 14:44 2011  -
pool5/w2k8.dsk  dedup                 off                    default
pool5/w2k8.dsk  encryption            off                    -
pool5/w2k8.dsk  keychangedate         -                      default
pool5/w2k8.dsk  keysource             none                   default
pool5/w2k8.dsk  keystatus             none                   -
pool5/w2k8.dsk  logbias               latency                default
pool5/w2k8.dsk  primarycache          all                    default
pool5/w2k8.dsk  readonly              off                    default
pool5/w2k8.dsk  referenced            17,4G                  -
pool5/w2k8.dsk  refreservation        none                   default
pool5/w2k8.dsk  rekeydate             -                      default
pool5/w2k8.dsk  reservation           none                   default
pool5/w2k8.dsk  secondarycache        all                    default
pool5/w2k8.dsk  sync                  standard               default
pool5/w2k8.dsk  type                  volume                 -
pool5/w2k8.dsk  used                  18,5G                  -
pool5/w2k8.dsk  usedbychildren        0                      -
pool5/w2k8.dsk  usedbydataset         17,4G                  -
pool5/w2k8.dsk  usedbyrefreservation  0                      -
pool5/w2k8.dsk  usedbysnapshots       1,15G                  -
pool5/w2k8.dsk  volblocksize          8K                     -
pool5/w2k8.dsk  volsize               25G                    local
pool5/w2k8.dsk  zoned                 off                    default

(2) Output of rbd info

[root@ceph-admin2 ~]# rbd info rbd/myimage-1
rbd image 'myimage-1':
        size 50000 MB in 12500 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.11e82ae8944a
        format: 2
        features: layering
Re: [ceph-users] cephfs survey results
Agreed, Multi-MDS is a nice-to-have but not required for full production use. TBH, stability and recovery will win over any IT person dealing with filesystems.

On Tue, Nov 4, 2014 at 7:33 AM, Mariusz Gronczewski mariusz.gronczew...@efigence.com wrote:
> On Tue, 4 Nov 2014 10:36:07 +1100, Blair Bethwaite blair.bethwa...@gmail.com wrote:
> > TBH I'm a bit surprised by a couple of these and hope maybe you guys will apply a certain amount of filtering on this... fsck and quotas were there for me, but multimds and snapshots are what I'd consider icing features - they're nice to have, but not on the critical path to using cephfs instead of e.g. nfs in a production setting. I'd have thought stuff like small file performance and gateway support was much more relevant to uptake and a positive/pain-free UX. Interested to hear others' rationale here.
>
> Those are related; if small file performance is enough for one MDS to handle a high load with a lot of small files (the typical webserver case), having multiple active MDSes will be less of a priority. And if someone currently has OSDs on a bunch of relatively weak nodes, then again, an active-active MDS setup will be more interesting to them than to someone who can just buy a new fast machine for it.
>
> -- Mariusz Gronczewski, Administrator, Efigence S. A., ul. Wołoska 9a, 02-583 Warszawa

-- Follow Me: @Scottix http://about.me/scottix scot...@gmail.com
Re: [ceph-users] Is there a negative relationship between storage utilization and ceph performance?
I'd say it's storage in general, though Ceph can be especially harsh on file systems (RBD can invoke particularly bad fragmentation in btrfs, for example, due to how COW works). Generally there are a lot of things that can cause slowdowns as your disks get full:

1) More objects spread across deeper PG directory trees
2) More disk fragmentation in general
3) Fragmentation that generates even more fragmentation during writes, once you no longer have contiguous space to store objects
4) A higher data/pagecache ratio (and more dentries/inodes to cache)
5) Disk heads moving farther across the disk during random IO
6) Differences between outer and inner track performance on some disks

There are probably other things I'm missing.

Mark

On 11/04/2014 01:56 PM, Andrey Korolyov wrote:
> On Tue, Nov 4, 2014 at 10:49 PM, Udo Lembke ulem...@polarzone.de wrote:
> > Hi, for a long time I've been looking for performance improvements for our ceph cluster. The last expansion brought better performance, because we added another node (with 12 OSDs). The storage utilization after that was 60%. Now we are again reaching 69% (the next nodes are waiting for installation) and the performance dropped! OK, we also changed the ceph version from 0.72.x to firefly, but I wonder if there is a relationship between utilization and performance?! The OSDs are xfs disks, but now I'm starting to use ext4 because of the bad fragmentation on an xfs filesystem (yes, I already use the mount option allocsize=4M). Has anybody seen the same effect? Udo
>
> AFAIR there is a specific point somewhere in the Ceph user guide advising not to exceed a commit ratio of 70% due to heavy performance impact. In practice, hot storage feels even a fifty-percent commit on xfs with default mount parameters, so as a rule of thumb you may aim for a commit no higher than 60 percent. For mixed or cold storage the numbers will vary, as average clat and write throughput will matter less.
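The utilization rules of thumb discussed above lend themselves to a trivial per-host check. A minimal sketch; the mount point is the conventional OSD data path and the 70% threshold echoes the figure mentioned in the thread, both of which you should adjust for your own cluster:

```shell
# Warn when an OSD data mount crosses 70% used (df -P for stable, POSIX
# single-line output; column 5 is the Capacity percentage).
OSD_MOUNT=/var/lib/ceph/osd/ceph-0
USED=$(df -P "$OSD_MOUNT" 2>/dev/null | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
if [ -n "$USED" ] && [ "$USED" -gt 70 ]; then
    echo "WARNING: $OSD_MOUNT is ${USED}% full"
fi
```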
Re: [ceph-users] cephfs survey results
On 04/11/14 22:02, Sage Weil wrote: On Tue, 4 Nov 2014, Blair Bethwaite wrote: On 4 November 2014 01:50, Sage Weil s...@newdream.net wrote: In the Ceph session at the OpenStack summit someone asked what the CephFS survey results looked like. Thanks Sage, that was me! Here's the link: https://www.surveymonkey.com/results/SM-L5JV7WXL/ In short, people want fsck, multimds, snapshots, quotas. TBH I'm a bit surprised by a couple of these and hope maybe you guys will apply a certain amount of filtering on this... fsck and quotas were there for me, but multimds and snapshots are what I'd consider icing features - they're nice to have, but not on the critical path to using cephfs instead of e.g. nfs in a production setting. I'd have thought stuff like small file performance and gateway support was much more relevant to uptake and a positive/pain-free UX. Interested to hear others' rationale here. Yeah, I agree, and am taking the results with a grain of salt. I think the results are heavily influenced by the order they were originally listed (I wish surveymonkey would randomize it for each person or something). fsck is a clear #1. Everybody wants multimds, but I think very few actually need it at this point. We'll be merging a soft quota patch shortly, and things like performance (adding the inline data support to the kernel client, for instance) will probably compete with getting snapshots working (as part of a larger subvolume infrastructure). That's my guess at least; for now, we're really focused on fsck and hard usability edges and haven't set priorities beyond that. We're definitely interested in hearing feedback on this strategy, and on people's experiences with giant so far...

Heh, not necessarily - I put multi mds in there, as we want the cephfs part to be similar to the rest of ceph in its availability. Maybe it's because we are looking at plugging it in with an OpenStack setup, and for that you want everything to 'just look after itself'.
If on the other hand we were wanting merely an nfs replacement, then sure, multi mds is not so important there.

regards
Mark
Re: [ceph-users] RBD - possible to query used space of images/clones ?
$ rbd diff rbd/myimage-1 | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'

-- Regards, Sébastien Han.

On 04 Nov 2014, at 16:57, Daniel Schwager daniel.schwa...@dtnet.de wrote:
> Hi, is there a way to query the used space of a RBD image created with format 2 (used for kvm)? Also, if I create a linked clone based on this image, how do I get the additional, individual used space of this clone? In zfs, I can query this kind of information by calling zfs get all. rbd info shows not that much information about the image.
> best regards Danny
> [...]
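The one-liner above can be wrapped into a small reusable function. A sketch, assuming the plain-text `rbd diff` output format (offset, length-in-bytes, type per line); the image name is just an example:

```shell
# Approximate the used space of an RBD image by summing the allocated
# extents that `rbd diff` reports. Lengths (column 2) are in bytes; the
# header line contributes 0 because awk coerces non-numeric text to 0.
rbd_used_mb() {
    rbd diff "$1" 2>/dev/null \
        | awk '{ sum += $2 } END { printf "%.1f MB\n", sum / 1024 / 1024 }'
}

rbd_used_mb rbd/myimage-1
```

Note this counts allocated extents, so it reflects actual usage rather than the provisioned size that `rbd info` reports.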
Re: [ceph-users] cephfs survey results
On 11/04/2014 03:11 PM, Mark Kirkwood wrote: On 04/11/14 22:02, Sage Weil wrote: On Tue, 4 Nov 2014, Blair Bethwaite wrote: On 4 November 2014 01:50, Sage Weil s...@newdream.net wrote: In the Ceph session at the OpenStack summit someone asked what the CephFS survey results looked like. Thanks Sage, that was me! Here's the link: https://www.surveymonkey.com/results/SM-L5JV7WXL/ In short, people want fsck, multimds, snapshots, quotas. TBH I'm a bit surprised by a couple of these and hope maybe you guys will apply a certain amount of filtering on this... fsck and quotas were there for me, but multimds and snapshots are what I'd consider icing features - they're nice to have, but not on the critical path to using cephfs instead of e.g. nfs in a production setting. I'd have thought stuff like small file performance and gateway support was much more relevant to uptake and a positive/pain-free UX. Interested to hear others' rationale here. Yeah, I agree, and am taking the results with a grain of salt. I think the results are heavily influenced by the order they were originally listed (I wish surveymonkey would randomize it for each person or something). fsck is a clear #1. Everybody wants multimds, but I think very few actually need it at this point. We'll be merging a soft quota patch shortly, and things like performance (adding the inline data support to the kernel client, for instance) will probably compete with getting snapshots working (as part of a larger subvolume infrastructure). That's my guess at least; for now, we're really focused on fsck and hard usability edges and haven't set priorities beyond that. We're definitely interested in hearing feedback on this strategy, and on people's experiences with giant so far...

Heh, not necessarily - I put multi mds in there, as we want the cephfs part to be similar to the rest of ceph in its availability.
> Maybe its because we are looking at plugging it in with an Openstack setup and for that you want everything to 'just look after itself'. If on the other hand we were wanting merely an nfs replacement, then sure multi mds not so important there.

Do you need active/active or is active/passive good enough?

> regards
> Mark
Re: [ceph-users] cephfs survey results
On 05/11/14 10:58, Mark Nelson wrote:
> On 11/04/2014 03:11 PM, Mark Kirkwood wrote:
>> Heh, not necessarily - I put multi-MDS in there, as we want the cephfs part to be similar to the rest of Ceph in its availability. Maybe it's because we are looking at plugging it into an OpenStack setup, and for that you want everything to 'just look after itself'. If, on the other hand, we merely wanted an NFS replacement, then sure, multi-MDS is not so important there.
>
> Do you need active/active, or is active/passive good enough?

That is of course a good question. We are certainly seeing active/active as much better - essentially because all the other bits are, and it avoids the need to wake people up to change things. Does that make it essential? I'm not 100% sure; it might just be a nice-to-have that is so nice that we'll wait for it to be there!

Cheers

Mark
Re: [ceph-users] cephfs survey results
On Wed, 5 Nov 2014, Mark Kirkwood wrote:
> On 04/11/14 22:02, Sage Weil wrote:
>> [...]
>
> Heh, not necessarily - I put multi-MDS in there, as we want the cephfs part to be similar to the rest of Ceph in its availability. Maybe it's because we are looking at plugging it into an OpenStack setup, and for that you want everything to 'just look after itself'. If, on the other hand, we merely wanted an NFS replacement, then sure, multi-MDS is not so important there.

Important clarification: multimds == multiple *active* MDSs. "single mds" means 1 active MDS and N standbys. One perfectly valid strategy, for example, is to run a ceph-mds on *every* node and let the mon pick whichever one is active. (That works as long as you have sufficient memory on all nodes.)

sage
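[Editor's sketch] Sage's "ceph-mds on every node" strategy needs no special configuration beyond starting extra daemons: any MDS daemon that the monitors do not choose as the active one automatically becomes a standby, and the mons promote a standby if the active fails. A hedged sketch of what that looks like (the daemon ids `a`, `b`, `c` are illustrative):

```
# Start an MDS daemon on each node (repeat per host; ids are arbitrary).
ceph-mds -i a
ceph-mds -i b
ceph-mds -i c

# The monitors pick one active MDS; the rest show up as standbys.
# Output will look something like:
#   e10: 1/1/1 up {0=a=up:active}, 2 up:standby
ceph mds stat
```

No manual "promotion" step is involved; failover is driven entirely by the monitors.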
Re: [ceph-users] cephfs survey results
On 05/11/14 11:47, Sage Weil wrote:
> Important clarification: multimds == multiple *active* MDSs. "single mds" means 1 active MDS and N standbys. One perfectly valid strategy, for example, is to run a ceph-mds on *every* node and let the mon pick whichever one is active. (That works as long as you have sufficient memory on all nodes.)

Righty - so I think I (plus a few others, perhaps) misunderstood the nature of the 'promotion mechanism' for the 1-active, several-standby design. I was under the (possibly wrong) impression that you needed to 'do something' to make a standby active. If not, then yeah, it would be fine - sorry!
Re: [ceph-users] cephfs survey results
+1 for fsck and snapshots - being able to have snapshot backups and protect against accidental deletion, etc. is something we are really looking forward to.

Thanks,

Shain

On 11/04/2014 04:02 AM, Sage Weil wrote:
> [...]

--
Shain Miley | Manager of Systems and Infrastructure, Digital Media | smi...@npr.org | 202.513.3649
Re: [ceph-users] osd down question
The OSDs heartbeat each other, and report back to the monitors if any other OSD fails to respond. An OSD that fails to respond is effectively down, since it's not doing the things that it's supposed to do.

It is possible for this process to cause problems. For example, I've had some OSDs on an overloaded node mark all of the other OSDs in the cluster down, because the overloaded node wasn't processing the heartbeat responses quickly enough. The solution there was to adjust mon osd min down reporters and mon osd min down reports so that a single node can't do that.

You might see something like "epoch XXX wrongly marked me down", followed by the OSD rejoining the cluster. That's a sign that the OSD was overloaded, but not down: once it was kicked out of the cluster, it caught up with the backlog and was able to rejoin. This shouldn't cause a chain reaction, though.

If you're not seeing that, then the OSD really is unresponsive and needs to be restarted. The other OSDs will start replicating its data automatically to make the cluster healthy again. This should not cause a chain reaction either. Only if your cluster is overloaded (very close to running out of CPU, RAM, or disk IO) can a failed OSD cause a chain reaction, as other OSDs pick up the failed OSD's workload.

On Mon, Nov 3, 2014 at 11:12 PM, 飞 duron...@qq.com wrote:
> hello, I have been running ceph v0.87 for one week. This week, many OSDs have been marked down, but when I run ps -ef | grep osd I can see the osd process, so the osd is not really down. Then I checked the osd log and saw many lines like "osd.XX from dead osd.YY, marking down". Does 0.87 check the other osd processes? If some osd is down, will the mon then mark the current one down as well? This could cause a chain reaction leading to failure of the entire cluster - is it a bug?
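[Editor's sketch] The two thresholds mentioned above are monitor-side options in ceph.conf. A hedged sketch - the values below are illustrative, not recommendations; the right numbers depend on how many OSDs share a host:

```
[mon]
    # How many distinct OSDs must report a peer down before the
    # monitors believe it.
    mon osd min down reporters = 9

    # How many total failure reports are required in aggregate.
    mon osd min down reports = 12
```

With `mon osd min down reporters` set above the number of OSDs on any one host, a single overloaded node can no longer mark the rest of the cluster down by itself.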
Re: [ceph-users] emperor - firefly 0.80.7 upgrade problem
Incomplete usually means the pgs do not have any complete copies. Did you previously have more osds?

-Sam

On Tue, Nov 4, 2014 at 7:37 AM, Chad Seys cws...@physics.wisc.edu wrote:
> On Monday, November 03, 2014 17:34:06 you wrote:
>> If you have osds that are close to full, you may be hitting 9626. I pushed a branch based on v0.80.7 with the fix, wip-v0.80.7-9626.
>> -Sam
>
> Thanks Sam. I may have been hitting that as well; I certainly hit too_full conditions often. I am able to squeeze PGs off of the too_full OSD by reweighting, and then eventually all PGs get to where they want to be. Kind of silly that I have to do this manually, though. Could Ceph order the PG movements better? (Is this what your bug fix does, in effect?)
>
> So, at the moment there are no PGs moving around the cluster, but not all are in active+clean. Also, there is one OSD which has blocked requests. The OSD seems idle, and restarting the OSD just results in a younger blocked request.
>
> ~# ceph -s
>     cluster 7797e50e-f4b3-42f6-8454-2e2b19fa41d6
>      health HEALTH_WARN 35 pgs down; 208 pgs incomplete; 210 pgs stuck inactive; 210 pgs stuck unclean; 1 requests are blocked > 32 sec
>      monmap e3: 3 mons at {mon01=128.104.164.197:6789/0,mon02=128.104.164.198:6789/0,mon03=144.92.180.139:6789/0}, election epoch 2996, quorum 0,1,2 mon01,mon02,mon03
>      osdmap e115306: 24 osds: 24 up, 24 in
>       pgmap v6630195: 8704 pgs, 7 pools, 6344 GB data, 1587 kobjects
>             12747 GB used, 7848 GB / 20596 GB avail
>                    2 inactive
>                 8494 active+clean
>                  173 incomplete
>                   35 down+incomplete
>
> # ceph health detail
> ...
> 1 ops are blocked > 8388.61 sec
> 1 ops are blocked > 8388.61 sec on osd.15
> 1 osds have slow requests
>
> From the log of the osd with the blocked request (osd.15):
>
> 2014-11-04 08:57:26.851583 7f7686331700 0 log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 3840.430247 secs
> 2014-11-04 08:57:26.851593 7f7686331700 0 log [WRN] : slow request 3840.430247 seconds old, received at 2014-11-04 07:53:26.421301: osd_op(client.11334078.1:592 rb.0.206609.238e1f29.000752e8 [read 512~512] 4.17df39a7 RETRY=1 retry+read e115304) v4 currently reached pg
>
> Other requests (like PG scrubs) are happening without taking a long time on this OSD. Also, this was one of the OSDs which I completely drained, removed from ceph, reformatted, and created again using ceph-deploy. So it is completely created by firefly 0.80.7 code.
>
> As Greg requested, output of ceph scrub:
>
> 2014-11-04 09:25:58.761602 7f6c0e20b700 0 mon.mon01@0(leader) e3 handle_command mon_command({"prefix": "scrub"} v 0) v1
> 2014-11-04 09:26:21.320043 7f6c0ea0c700 1 mon.mon01@0(leader).paxos(paxos updating c 11563072..11563575) accept timeout, calling fresh election
> 2014-11-04 09:26:31.264873 7f6c0ea0c700 0 mon.mon01@0(probing).data_health(2996) update_stats avail 38% total 6948572 used 3891232 avail 2681328
> 2014-11-04 09:26:33.529403 7f6c0e20b700 0 log [INF] : mon.mon01 calling new monitor election
> 2014-11-04 09:26:33.538286 7f6c0e20b700 1 mon.mon01@0(electing).elector(2996) init, last seen epoch 2996
> 2014-11-04 09:26:38.809212 7f6c0ea0c700 0 log [INF] : mon.mon01@0 won leader election with quorum 0,2
> 2014-11-04 09:26:40.215095 7f6c0e20b700 0 log [INF] : monmap e3: 3 mons at {mon01=128.104.164.197:6789/0,mon02=128.104.164.198:6789/0,mon03=144.92.180.139:6789/0}
> 2014-11-04 09:26:40.215754 7f6c0e20b700 0 log [INF] : pgmap v6630201: 8704 pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
> 2014-11-04 09:26:40.215913 7f6c0e20b700 0 log [INF] : mdsmap e1: 0/0/1 up
> 2014-11-04 09:26:40.216621 7f6c0e20b700 0 log [INF] : osdmap e115306: 24 osds: 24 up, 24 in
> 2014-11-04 09:26:41.227010 7f6c0e20b700 0 log [INF] : pgmap v6630202: 8704 pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
> 2014-11-04 09:26:41.367373 7f6c0e20b700 1 mon.mon01@0(leader).osd e115307 e115307: 24 osds: 24 up, 24 in
> 2014-11-04 09:26:41.437706 7f6c0e20b700 0 log [INF] : osdmap e115307: 24 osds: 24 up, 24 in
> 2014-11-04 09:26:41.471558 7f6c0e20b700 0 log [INF] : pgmap v6630203: 8704 pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
> 2014-11-04 09:26:41.497318 7f6c0e20b700 1 mon.mon01@0(leader).osd e115308 e115308: 24 osds: 24 up, 24 in
> 2014-11-04 09:26:41.533965 7f6c0e20b700 0 log [INF] : osdmap e115308: 24 osds: 24 up, 24 in
> 2014-11-04 09:26:41.553161 7f6c0e20b700 0 log [INF] : pgmap v6630204: 8704 pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete; 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
> 2014-11-04 09:26:42.701720 7f6c0e20b700 1 mon.mon01@0(leader).osd e115309 e115309: 24 osds: 24 up, 24 in
> 2014-11-04 09:26:42.953977
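[Editor's sketch] The manual "squeeze PGs off a too_full OSD by reweighting" step described above is usually done with the reweight command; a hedged sketch (the OSD id and weights are illustrative):

```
# Temporarily lower the weight of a too-full OSD so data moves
# off it to its peers.
ceph osd reweight 15 0.85

# Watch recovery/backfill progress until the cluster settles.
ceph -w

# Restore the weight once there is headroom again.
ceph osd reweight 15 1.0
```

Note that `ceph osd reweight` sets a temporary override weight (0-1), distinct from the permanent CRUSH weight set via `ceph osd crush reweight`.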
[ceph-users] osd troubleshooting
Hi, I'm trying to run osd troubleshooting commands.

Use case: stopping an osd without re-balancing.

# ceph osd set noout   // this command works.

But neither of the following works:

# stop ceph-osd id=1
(Error message: *no valid command found; 10 closest matches:* ...)

or

# ceph osd stop osd.1
(Error message: *stop: Unknown job: ceph-osd*)

Environment:
ceph: 0.80.7
OS: RHEL6.5
upstart-0.6.5-13.el6_5.3.x86_64
ceph-0.80.7-0.el6.x86_64
ceph-common-0.80.7-0.el6.x86_64

Thanks,
shiva
Re: [ceph-users] Ceph Giant not fixed ReplicatedPG::NotTrimming?
Can you upload the entire log file?

David

On Nov 4, 2014, at 1:03 AM, Ta Ba Tuan tua...@vccloud.vn wrote:

Hi Sam, I resent the logs with the debug options: http://123.30.41.138/ceph-osd.21.log (sorry about my spam :D). I saw many missing objects :|

2014-11-04 15:26:02.205607 7f3ab11a8700 10 osd.21 pg_epoch: 106407 pg[24.7d7( v 106407'491583 lc 106401'491579 (105805'487042,106407'491583] local-les=106403 n=179 ec=25000 les/c 106403/106390 106402/106402/106402) [21,28,4] r=0 lpr=106402 pi=106377-106401/4 rops=1 crt=106401'491581 mlcod 106393'491097 active+recovering+degraded m=2 snaptrimq=[306~1,312~1]] recover_primary 675ea7d7/rbd_data.4930222ae8944a.0001/head//24 106401'491580 (missing) (missing head) (recovering) (recovering head)
2014-11-04 15:26:02.205642 7f3ab11a8700 10 osd.21 pg_epoch: 106407 pg[24.7d7( v 106407'491583 lc 106401'491579 (105805'487042,106407'491583] local-les=106403 n=179 ec=25000 les/c 106403/106390 106402/106402/106402) [21,28,4] r=0 lpr=106402 pi=106377-106401/4 rops=1 crt=106401'491581 mlcod 106393'491097 active+recovering+degraded m=2 snaptrimq=[306~1,312~1]] recover_primary d4d4bfd7/rbd_data.c6964d30a28220.035f/head//24 106401'491581 (missing) (missing head)
2014-11-04 15:26:02.237994 7f3ab29ab700 10 osd.21 pg_epoch: 106407 pg[24.7d7( v 106407'491583 lc 106401'491579 (105805'487042,106407'491583] local-les=106403 n=179 ec=25000 les/c 106403/106390 106402/106402/106402) [21,28,4] r=0 lpr=106402 pi=106377-106401/4 rops=2 crt=106401'491581 mlcod 106393'491097 active+recovering+degraded m=2 snaptrimq=[306~1,312~1]] got missing d4d4bfd7/rbd_data.c6964d30a28220.035f/head//24 v 106401'491581

Thanks Sam and All,

--
Tuan
HaNoi-Vietnam

On 11/04/2014 04:54 AM, Samuel Just wrote:

Can you reproduce with

debug osd = 20
debug filestore = 20
debug ms = 1

in the [osd] section of that osd's ceph.conf?

-Sam

On Sun, Nov 2, 2014 at 9:10 PM, Ta Ba Tuan tua...@vccloud.vn wrote:

Hi Sage, Samuel, All,

I upgraded to Giant, but those errors still appear :| I'm trying to delete the related objects/volumes, but it's very hard to verify the missing objects :(. Please guide me on resolving it! (I'm attaching the detailed log.)

2014-11-03 11:37:57.730820 7f28fb812700 0 osd.21 105950 do_command r=0
2014-11-03 11:37:57.856578 7f28fc013700 -1 *** Caught signal (Segmentation fault) ** in thread 7f28fc013700
ceph version 0.87-6-gdba7def (dba7defc623474ad17263c9fccfec60fe7a439f0)
1: /usr/bin/ceph-osd() [0x9b6725]
2: (()+0xfcb0) [0x7f291fc2acb0]
3: (ReplicatedPG::trim_object(hobject_t const&)+0x395) [0x811b55]
4: (ReplicatedPG::TrimmingObjects::react(ReplicatedPG::SnapTrim const&)+0x43e) [0x82b9be]
5: (boost::statechart::simple_state<ReplicatedPG::TrimmingObjects, ReplicatedPG::SnapTrimmer, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0xc0) [0x870ce0]
6: (boost::statechart::state_machine<ReplicatedPG::SnapTrimmer, ReplicatedPG::NotTrimming, std::allocator<void>, boost::statechart::null_exception_translator>::process_queued_events()+0xfb) [0x85618b]
7: (boost::statechart::state_machine<ReplicatedPG::SnapTrimmer, ReplicatedPG::NotTrimming, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x1e) [0x85633e]
8: (ReplicatedPG::snap_trimmer()+0x4f8) [0x7d5ef8]
9: (OSD::SnapTrimWQ::_process(PG*)+0x14) [0x673ab4]
10: (ThreadPool::worker(ThreadPool::WorkThread*)+0x48e) [0xa8fade]
11: (ThreadPool::WorkThread::entry()+0x10) [0xa92870]
12: (()+0x7e9a) [0x7f291fc22e9a]
13: (clone()+0x6d) [0x7f291e5ed31d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

-9993> 2014-11-03 11:37:47.689335 7f28fc814700 1 -- 172.30.5.2:6803/7606 --> 172.30.5.1:6886/3511 -- MOSDPGPull(6.58e 105950 [PullOp(87f82d8e/rbd_data.45e62779c99cf1.22b5/head//6, recovery_info: ObjectRecoveryInfo(87f82d8e/rbd_data.45e62779c99cf1.22b5/head//6@105938'11622009, copy_subset: [0~18446744073709551615], clone_subset: {}), recovery_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false))]) v2 -- ?+0 0x26c59000 con 0x22fbc420
-2> 2014-11-03 11:37:57.853585 7f2902820700 5 osd.21 pg_epoch: 105950 pg[24.9e4( v 105946'113392 lc 105946'113391 (103622'109598,105946'113392] local-les=105948 n=88 ec=25000 les/c 105948/105943 105947/105947/105947) [21,112,33] r=0 lpr=105947 pi=105933-105946/4 crt=105946'113392 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded m=1
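[Editor's sketch] Sam's suggested debug settings translate to a ceph.conf fragment like the one below. Set them on the affected OSD only and revert afterwards, since level-20 logging is very verbose and can fill a disk quickly:

```
[osd]
    debug osd = 20
    debug filestore = 20
    debug ms = 1
```

Restart the OSD (or inject the settings with `ceph tell osd.N injectargs`) for the new levels to take effect.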
Re: [ceph-users] osd troubleshooting
Shiva,

You need to connect to the host where the OSD is located and stop it by invoking:

service ceph stop osd.1

I don't think there's a way to stop and start OSDs from an admin node, unless I missed a change that provides this functionality.

-Steve

On 11/04/2014 10:59 PM, shiva rkreddy wrote:
> [...]

--
Steve Anthony
LTS HPC Support Specialist
Lehigh University
sma...@lehigh.edu
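[Editor's sketch] Putting the pieces of this thread together, the stop-without-rebalancing workflow looks something like the following. The exact daemon-control command varies by distro and init system; on RHEL 6 the sysvinit form shown is the usual one, while Upstart-based setups use the commented alternative:

```
# On any admin node: tell the cluster not to mark down OSDs "out",
# which is what would trigger rebalancing.
ceph osd set noout

# On the host running osd.1: stop the daemon (sysvinit).
service ceph stop osd.1
# ...or, on Upstart-based setups:
#   stop ceph-osd id=1

# ... perform maintenance ...

# Restart the daemon and re-enable normal out-marking.
service ceph start osd.1
ceph osd unset noout
```

While noout is set the cluster will report HEALTH_WARN, which is expected; remember to unset it when maintenance is done.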
[ceph-users] Full backup/restore of Ceph cluster?
Hi folks,

I was wondering if anyone has a solution for performing a complete backup and restore of a Ceph cluster. A Google search came up with some articles/blog posts, some of which are old, and I don't really have a great idea of the feasibility of this. Here's what I've found:

http://ceph.com/community/blog/tag/backup/
http://ceph.com/docs/giant/rbd/rbd-snapshot/
http://t3491.file-systems-ceph-user.file-systemstalk.us/backups-t3491.html

Is RBD snapshotting what I'm looking for? Is this even possible? Any info is much appreciated!

Thanks,

Chris

Chris Armstrong
Head of Services
OpDemand / Deis.io

GitHub: https://github.com/deis/deis -- Docs: http://docs.deis.io/
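[Editor's sketch] RBD snapshots plus export-diff/import-diff are the usual building blocks for image-level backups, though they cover RBD images only, not a whole-cluster backup. A hedged sketch (the pool and image names are illustrative):

```
# Take a point-in-time snapshot of an image.
rbd snap create rbd/myimage@backup1

# Export the snapshot contents to a file (no --from-snap => full export).
rbd export-diff rbd/myimage@backup1 myimage-backup1.diff

# Later: take another snapshot and export only the changes since backup1.
rbd snap create rbd/myimage@backup2
rbd export-diff --from-snap backup1 rbd/myimage@backup2 myimage-backup2.diff

# Restore by replaying the diffs, in order, onto a target image
# (the target must already exist, e.g. created with `rbd create`).
rbd import-diff myimage-backup1.diff rbd/restored
rbd import-diff myimage-backup2.diff rbd/restored
```

The diffs can be shipped to another cluster or to offline storage, giving incremental backups without re-copying the whole image each time.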