Re: [ceph-users] Slow/Hung IOs

2015-01-06 Thread Christian Balzer
On Mon, 5 Jan 2015 22:36:29 + Sanders, Bill wrote: Hi Ceph Users, We've got a Ceph cluster we've built, and we're experiencing issues with slow or hung IOs, even when running 'rados bench' on the OSD cluster. Things start out great, ~600 MB/s, then it rapidly drops off as the test waits for
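A minimal way to reproduce this kind of test, sketched under assumptions (the pool name "testpool" and the 16 concurrent ops are placeholders, not taken from the thread):

    rados bench -p testpool 60 write -t 16 --no-cleanup   # write phase; keep objects for the read test
    rados bench -p testpool 60 seq -t 16                  # sequential read-back
    ceph health detail | grep -i blocked                  # check for slow/blocked requests while the bench runs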

[ceph-users] got XmlParseFailure when libs3 client accessing radosgw object gateway

2015-01-06 Thread Liu, Xuezhao
Hello, I am new to Ceph and have a problem with Ceph object gateway usage. I did not find enough hints by googling it, so I am sending an email here; thanks. I have a ceph server with the object gateway configured, and another client node to test object access. On the client, when I use s3cmd (a

Re: [ceph-users] full osdmaps in mon txns

2015-01-06 Thread Dan van der Ster
On Mon, Jan 5, 2015 at 10:12 AM, Dan van der Ster daniel.vanders...@cern.ch wrote: Hi Sage, On Tue, Dec 23, 2014 at 10:10 PM, Sage Weil sw...@redhat.com wrote: This fun issue came up again in the form of 10422: http://tracker.ceph.com/issues/10422 I think we have 3 main options:

Re: [ceph-users] What to do when a parent RBD clone becomes corrupted

2015-01-06 Thread Robert LeBlanc
Now that the holidays are over, I'm going to bump this message to see if there are any good ideas on this. Thanks, Robert LeBlanc On Thu, Dec 18, 2014 at 2:21 PM, Robert LeBlanc rob...@leblancnet.us wrote: Before we base thousands of VM image clones off of one or more snapshots, I want to

Re: [ceph-users] CRUSH question - failing to rebalance after failure test

2015-01-06 Thread Christopher Kunz
Am 05.01.15 um 16:37 schrieb Samuel Just: Can you post the output of 'ceph pg dump'? -Sam Hi, it's at https://www.christopher-kunz.de/tmp/pgdump.txt Regards, --ck ___ ceph-users mailing list ceph-users@lists.ceph.com

Re: [ceph-users] What to do when a parent RBD clone becomes corrupted

2015-01-06 Thread Gregory Farnum
On Thu, Dec 18, 2014 at 1:21 PM, Robert LeBlanc rob...@leblancnet.us wrote: Before we base thousands of VM image clones off of one or more snapshots, I want to test what happens when the snapshot becomes corrupted. I don't believe the snapshot will become corrupted through client access to the

Re: [ceph-users] OSDs with btrfs are down

2015-01-06 Thread Gregory Farnum
On Sun, Jan 4, 2015 at 8:10 AM, Lionel Bouton lionel+c...@bouton.name wrote: On 01/04/15 16:25, Jiri Kanicky wrote: Hi. I have been experiencing the same issues on both nodes over the past 2 days (never on both nodes at the same time). It seems the issue occurs after some time when copying a

Re: [ceph-users] rbd resize (shrink) taking forever and a day

2015-01-06 Thread Chen, Xiaoxi
When you shrink the RBD, most of the time is spent in librbd/internal.cc::trim_image(); in this function, the client will iterate over all unneeded objects (whether or not they exist) and delete them. So in this case, when Edwin shrinks his RBD from 650PB to 650GB, there are [ (650PB *
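For scale, a rough sketch of the truncated calculation above, assuming the default 4 MB RBD object size:

    # 650 PB expressed in 4 MB objects (default object size assumed):
    echo $(( 650 * 1024 * 1024 * 1024 / 4 ))
    # => 174483046400 (~1.7e11) objects that trim_image() would have to iterate over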

[ceph-users] Ceph status

2015-01-06 Thread Ajitha Robert
Hi all, I have installed Ceph using the ceph-deploy utility. I have created three VMs: one for monitor+MDS, and the other two for OSDs. The ceph admin node is another separate machine. The status and health of Ceph are shown below. Can you please suggest what I can infer from the status? I'm a beginner

Re: [ceph-users] Erasure Encoding Chunks Number of Hosts

2015-01-06 Thread Nick Fisk
Hi Loic, That's an interesting idea. I suppose the same could probably be achieved by just creating more CRUSH host buckets for each actual host and then treating the actual physical host as a chassis (Chassis-1 contains Host-1-A, Host-1-B... etc.). I was thinking about this some more and I don't

Re: [ceph-users] Slow/Hung IOs

2015-01-06 Thread Lindsay Mathieson
On Tue, 6 Jan 2015 12:07:26 AM Sanders, Bill wrote: 14 and 18 happened to show up during that run, but it's certainly not only those OSDs. It seems to vary with each run. Just from the runs I've done today, I've seen the following pairs of OSDs: Could your OSD nodes be paging? I know from

Re: [ceph-users] Cache tiers flushing logic

2015-01-06 Thread Gregory Farnum
On Tue, Dec 30, 2014 at 11:38 AM, Erik Logtenberg e...@logtenberg.eu wrote: Hi Erik, I have tiering working on a couple of test clusters. It seems to be working with Ceph v0.90 when I set: ceph osd pool set POOL hit_set_type bloom ceph osd pool set POOL hit_set_count 1 ceph osd pool set
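The preview cuts off mid-command; a hedged sketch of the kind of cache-tier settings being discussed, with a placeholder pool name and illustrative values rather than anything taken from the thread:

    ceph osd pool set cachepool hit_set_type bloom
    ceph osd pool set cachepool hit_set_count 1
    ceph osd pool set cachepool hit_set_period 3600
    ceph osd pool set cachepool target_max_bytes 100000000000
    ceph osd pool set cachepool cache_target_dirty_ratio 0.4
    ceph osd pool set cachepool cache_target_full_ratio 0.8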

Re: [ceph-users] Slow/Hung IOs

2015-01-06 Thread Gonzalo Aguilar Delgado
Hi, I just ran this test and found that my system is no better. But I use commodity hardware. The only difference is latency; you should look at it. Total time run: 62.412381 Total writes made: 919 Write size: 4194304 Bandwidth (MB/sec): 58.899 Stddev Bandwidth:

Re: [ceph-users] Building Ceph

2015-01-06 Thread Lincoln Bryant
Hi Pankaj, You can search for the lib using the 'yum provides' command, which accepts wildcards. [root@sl7 ~]# yum provides */lib64/libkeyutils* Loaded plugins: langpacks keyutils-libs-1.5.8-3.el7.x86_64 : Key utilities library Repo: sl Matched from: Filename:
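A cleaned-up version of the lookup quoted above (the install step is an assumption about the next action, not part of the quoted output):

    yum provides '*/lib64/libkeyutils*'   # wildcards are accepted
    yum install keyutils-libs             # the package the search turns up on EL7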

Re: [ceph-users] Different disk usage on different OSDs

2015-01-06 Thread Christian Balzer
On Mon, 5 Jan 2015 23:41:17 +0400 ivan babrou wrote: Rebalancing is almost finished, but things got even worse: http://i.imgur.com/0HOPZil.png Looking at that graph, only one OSD really kept growing and growing; everything else seems to be a lot denser, less varied than before, as one would

Re: [ceph-users] Different disk usage on different OSDs

2015-01-06 Thread Robert LeBlanc
Ceph currently isn't very smart about ordering the balancing operations. It can fill a disk before moving some things off of it, so if you are close to the toofull line, it can push that OSD over. I think there is a blueprint to help with this being worked on for Hammer. You have a couple of
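The reply is cut off before the options are listed; two knobs commonly used when an OSD approaches the toofull line are sketched below (the OSD id, weight and threshold are example values, and this is an assumption about where the advice was heading):

    ceph osd reweight 12 0.85              # temporarily lower one overfull OSD's weight
    ceph osd reweight-by-utilization 120   # or let Ceph reweight the most-utilized OSDs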

Re: [ceph-users] Marking a OSD a new in the OSDMap

2015-01-06 Thread Robert LeBlanc
I think because ceph-disk or ceph-deploy doesn't support --osd-uuid. On Wed, Dec 31, 2014 at 10:30 AM, Andrey Korolyov and...@xdel.ru wrote: On Wed, Dec 31, 2014 at 8:20 PM, Wido den Hollander w...@42on.com wrote: On 12/31/2014 05:54 PM, Andrey Korolyov wrote: On Wed, Dec 31, 2014 at 7:34

Re: [ceph-users] Different disk usage on different OSDs

2015-01-06 Thread ivan babrou
I deleted some old backups and GC is returning some disk space back. But cluster state is still bad: 2015-01-06 13:35:54.102493 mon.0 [INF] pgmap v4017947: 5832 pgs: 23 active+remapped+wait_backfill, 1 active+remapped+wait_backfill+backfill_toofull, 2 active+remapped+backfilling, 5806

Re: [ceph-users] Ceph status

2015-01-06 Thread Lincoln Bryant
Hi Ajitha, For one, it looks like you don't have enough OSDs for the number of replicas you have specified in the config file. What is the value of your 'osd pool default size' in ceph.conf? If it's 3, for example, then you need to have at least 3 hosts with 1 OSD each (with the default
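A quick way to check the replica settings Lincoln is referring to; the pool name 'rbd' and the size of 2 are examples only:

    ceph osd pool get rbd size       # current replica count for the pool
    ceph osd pool get rbd min_size   # minimum replicas needed to serve IO
    ceph osd pool set rbd size 2     # e.g. lower it for a two-OSD test setup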

Re: [ceph-users] rbd resize (shrink) taking forever and a day

2015-01-06 Thread Jake Young
On Monday, January 5, 2015, Chen, Xiaoxi xiaoxi.c...@intel.com wrote: When you shrink the RBD, most of the time is spent in librbd/internal.cc::trim_image(); in this function, the client will iterate over all unneeded objects (whether or not they exist) and delete them. So in this case,

Re: [ceph-users] rbd directory listing performance issues

2015-01-06 Thread Robert LeBlanc
I would think that the RBD mounter would cache the directory listing, which should always make it fast, unless there is so much memory pressure that it is dropping it frequently. How many entries are in your directory and in total on the RBD? ls | wc -l ; find . | wc -l What does your memory look
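The checks being suggested, written out as commands; the mount path is a placeholder and the sysctl at the end is an extra assumption about keeping dentries/inodes cached, not something stated in the thread:

    ls -f /mnt/rbd-mount | wc -l      # entry count without sorting
    find /mnt/rbd-mount | wc -l       # total entries on the RBD
    free -m                           # overall memory pressure
    slabtop -o | head -20             # how much dentry/inode cache is resident
    sysctl vm.vfs_cache_pressure=50   # assumption: make the kernel hold on to dentries longer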

Re: [ceph-users] rbd directory listing performance issues

2015-01-06 Thread Shain Miley
It does seem like the entries get cached for a certain period of time. Here is the memory listing for the rbd client server: root@cephmount1:~# free -m total used free shared buffers cached Mem: 11965 11816 149 3139

Re: [ceph-users] rbd directory listing performance issues

2015-01-06 Thread Robert LeBlanc
What fs are you running inside the RBD? On Tue, Jan 6, 2015 at 8:29 AM, Shain Miley smi...@npr.org wrote: Hello, We currently have a 12 node (3 monitor+9 OSD) ceph cluster, made up of 107 x 4TB drives formatted with xfs. The cluster is running ceph version 0.80.7: Cluster health: cluster

Re: [ceph-users] Different disk usage on different OSDs

2015-01-06 Thread ivan babrou
Restarting the OSD fixed the PGs that were stuck: http://i.imgur.com/qd5vuzV.png Still, OSD disk usage is very different, 150..250 GB. Shall I double the PGs again? On 6 January 2015 at 17:12, ivan babrou ibob...@gmail.com wrote: I deleted some old backups and GC is returning some disk space back. But
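If the PGs are doubled again, both pg_num and pgp_num have to be raised; the pool name and count below are placeholders, and raising them triggers data movement:

    ceph osd pool set mypool pg_num 2048
    ceph osd pool set mypool pgp_num 2048   # must follow pg_num for the rebalancing to take effect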

[ceph-users] rbd directory listing performance issues

2015-01-06 Thread Shain Miley
Hello, We currently have a 12 node (3 monitor+9 OSD) ceph cluster, made up of 107 x 4TB drives formatted with xfs. The cluster is running ceph version 0.80.7: Cluster health: cluster 504b5794-34bd-44e7-a8c3-0494cf800c23 health HEALTH_WARN crush map has legacy tunables monmap e1: 3
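The HEALTH_WARN about legacy tunables can be inspected and, if every client and kernel is recent enough, cleared; a generic sketch rather than advice specific to this cluster, since switching tunables triggers data movement:

    ceph osd crush show-tunables      # see which tunables profile is in effect
    ceph osd crush tunables optimal   # only if all clients support it; expect rebalancing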

Re: [ceph-users] rbd directory listing performance issues

2015-01-06 Thread Shain Miley
Robert, Thanks again for the help. I'll keep looking around. However, as you stated, it might be a matter of trying to reduce OSD latency instead of trying to find tuning options on the client. I've already increased the readahead values, played with the scheduler, and mount options... so I'm

Re: [ceph-users] rbd directory listing performance issues

2015-01-06 Thread Shain Miley
Christian, Each of the OSD server nodes is running on Dell R-720xd's with 64 GB of RAM. We have 107 OSDs, so I have not checked all of them... however, the ones I have checked with xfs_db have shown anywhere from 1% to 4% fragmentation. I'll try to upgrade the client server to 32 or 64 GB of
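The fragmentation numbers mentioned here come from the stock XFS tools; a sketch with example device and mount paths:

    xfs_db -r -c frag /dev/sdb1         # read-only fragmentation report for one OSD partition
    xfs_fsr /var/lib/ceph/osd/ceph-12   # online defragmentation, if it ever becomes a problem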

Re: [ceph-users] OSDs with btrfs are down

2015-01-06 Thread Lionel Bouton
On 01/06/15 02:36, Gregory Farnum wrote: [...] filestore btrfs snap controls whether to use btrfs snapshots to keep the journal and backing store in check. With that option disabled it handles things in basically the same way we do with xfs. filestore btrfs clone range I believe controls how

Re: [ceph-users] rbd directory listing performance issues

2015-01-06 Thread Shain Miley
Robert, xfs on the rbd image as well: /dev/rbd0 on /mnt/ceph-block-device-archive type xfs (rw) However, looking at the mount options, it does not look like I've enabled anything special there. Thanks, Shain Shain Miley | Manager of Systems and Infrastructure, Digital

Re: [ceph-users] OSDs with btrfs are down

2015-01-06 Thread Gregory Farnum
I'm afraid I don't know what would happen if you change those options. Hopefully we've set it up so things continue to work, but we definitely don't test it. -Greg On Tue, Jan 6, 2015 at 8:22 AM Lionel Bouton lionel+c...@bouton.name wrote: On 01/06/15 02:36, Gregory Farnum wrote: [...]

Re: [ceph-users] Different disk usage on different OSDs

2015-01-06 Thread Christian Balzer
On Tue, 6 Jan 2015 19:28:44 +0400 ivan babrou wrote: Restarting the OSD fixed the PGs that were stuck: http://i.imgur.com/qd5vuzV.png Good to hear that. Funny (not really) how often restarting OSDs fixes stuff like that. Still, OSD disk usage is very different, 150..250 GB. Shall I double the PGs again?

Re: [ceph-users] Erasure Encoding Chunks Number of Hosts

2015-01-06 Thread Nick Fisk
I've also had some luck with the following crush ruleset, erasure profile failure domain is set to OSD rule ecpool_test2 { ruleset 3 type erasure min_size 3 max_size 20 step set_chooseleaf_tries 5 step set_choose_tries 100 step take ceph1
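The ruleset in the preview is cut off; the usual way to experiment with such a rule is to decompile, edit, recompile and dry-run the CRUSH map before injecting it. A generic sketch (the rule number 3 is taken from the preview, the num-rep of 5 is only an example k+m):

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # edit crush.txt, e.g. the ecpool_test2 rule above, then:
    crushtool -c crush.txt -o crush.new
    crushtool -i crush.new --test --rule 3 --num-rep 5 --show-mappings | head
    ceph osd setcrushmap -i crush.new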

Re: [ceph-users] OSDs with btrfs are down

2015-01-06 Thread Lionel Bouton
On 01/06/15 18:26, Gregory Farnum wrote: I'm afraid I don't know what would happen if you change those options. Hopefully we've set it up so things continue to work, but we definitely don't test it. Thanks. That's not a problem: when the opportunity arises I'll just adapt my tests accordingly

Re: [ceph-users] rbd resize (shrink) taking forever and a day

2015-01-06 Thread Robert LeBlanc
Can't this be done in parallel? If the OSD doesn't have an object then it is a no-op and should be pretty quick. The number of outstanding operations could be limited to 100 or 1000, which would provide a balance between speed and performance impact if there is data to be trimmed. I'm not a big fan
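Not what librbd's trim does internally, but as an illustration of bounded parallel deletes, the objects behind an image prefix can be removed with a fixed number of workers. A destructive sketch with placeholder pool/image names, assuming xargs for the concurrency cap -- test images only:

    PREFIX=$(rbd info testpool/testimg | awk '/block_name_prefix/ {print $2}')
    rados -p testpool ls | grep "^${PREFIX}" | xargs -P 16 -n 1 rados -p testpool rm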

Re: [ceph-users] What to do when a parent RBD clone becomes corrupted

2015-01-06 Thread Robert LeBlanc
On Mon, Jan 5, 2015 at 6:01 PM, Gregory Farnum g...@gregs42.com wrote: On Thu, Dec 18, 2014 at 1:21 PM, Robert LeBlanc rob...@leblancnet.us wrote: Before we base thousands of VM image clones off of one or more snapshots, I want to test what happens when the snapshot becomes corrupted. I don't