Re: [ceph-users] HEALTH_WARN 6 requests are blocked

2015-05-12 Thread Irek Fasikhov
Patrick, at the moment you do not have any problems related to slow requests. 2015-05-12 8:56 GMT+03:00 Patrik Plank pat...@plank.me: So OK, understood. But what can I do if the scrubbing process has been hanging on one PG since last night: root@ceph01:~# ceph health detail HEALTH_OK
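The usual triage for blocked requests and a wedged scrub, as discussed above, can be sketched as a few CLI calls. This is only a sketch against a live cluster; the OSD id `12` is a placeholder:

```shell
# Narrow down blocked/slow requests and stuck scrubs (placeholder OSD id 12).
ceph health detail            # names the OSDs the blocked requests sit on
ceph pg dump_stuck            # PGs stuck inactive/unclean/stale
ceph pg dump | grep -i scrub  # PGs currently (deep-)scrubbing

# If one OSD holds the blocked ops, its in-flight queue often says why:
ceph daemon osd.12 dump_ops_in_flight

# Last resort: mark the OSD down so it re-peers; this usually clears a
# wedged scrub (the OSD rejoins on its own).
ceph osd down 12
```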

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-12 Thread Christian Balzer
Hello, I can only nod emphatically to what Robert said: don't issue repairs unless you a) don't care about the data or b) have verified that your primary OSD is good. See this for some details on how to establish which replica(s) are actually good or not:
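The verification step above matters because, on these releases, `ceph pg repair` copies the primary's copy over the replicas regardless of which one is corrupt. A sketch of checking replicas by hand (PG `3.2f` and the object name are placeholders; paths assume default FileStore layout):

```shell
# Find the inconsistent PG and its acting set (primary is listed first).
ceph health detail | grep inconsistent
ceph pg map 3.2f

# The OSD log of the scrubbing OSD names the object that failed:
grep ERR /var/log/ceph/ceph-osd.*.log

# On each OSD host in the acting set, checksum the on-disk copy:
find /var/lib/ceph/osd/ceph-*/current/3.2f_head -name '*objname*' \
    -exec md5sum {} \;

# Only if the primary's checksum agrees with the majority is it safe to run:
ceph pg repair 3.2f
```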

Re: [ceph-users] EC backend benchmark

2015-05-12 Thread Christian Balzer
Hello, Could you do another EC run with differing block sizes, as described here: http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2014-October/043949.html and look for write amplification? I'd suspect that by the very nature of EC and the additional local checksums, it (potentially)

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-12 Thread Christian Eichelmann
Hi Christian, Hi Robert, thank you for your replies! I was already expecting something like this, but I am seriously worried about it! Just assume that this happens at night. Our shift does not necessarily have enough knowledge to perform all the steps in Sébastien's article. And if we always

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-12 Thread Dan van der Ster
On Tue, May 12, 2015 at 1:07 AM, Anthony D'Atri a...@dreamsnake.net wrote: Agree that 99+% of the inconsistent PG's I see correlate directly to disk flern. Check /var/log/kern.log*, /var/log/messages*, etc. and I'll bet you find errors correlating. More to this... In the case that an

[ceph-users] Debian Jessie packages?

2015-05-12 Thread Henrik Korkuc
Hey, as Debian Jessie has been released for some time now, I'd like to ask: are there any plans to build newer Ceph packages for it? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Debian Jessie packages?

2015-05-12 Thread Alexandre DERUMIER
Hi, as Debian Jessie has been released for some time now, I'd like to ask: are there any plans to build newer Ceph packages for it? Yes, it's planned; I'm currently helping to create images for the build integration platform. If you want, I have built ceph packages for jessie:

Re: [ceph-users] New Calamari server

2015-05-12 Thread Alexandre DERUMIER
Hi, when you removed salt from the nodes, did you also remove the old master key /etc/salt/pki/minion/minion_master.pub? I had the same behavior as you when reinstalling the calamari server with salt previously installed on the ceph nodes (with an explicit error about the key in

Re: [ceph-users] questions about CephFS

2015-05-12 Thread Gregory Farnum
[ Adding ceph-users to the CC ] On Mon, May 11, 2015 at 8:22 PM, zhao.ming...@h3c.com zhao.ming...@h3c.com wrote: Hi: I've been learning CephFS recently, and now I have some questions about it. 1. I've seen that the typical configuration is 'single MDS', and found some resources on the Internet

Re: [ceph-users] Error in sys.exitfunc

2015-05-12 Thread Patrick McGarry
Moving this to ceph-users, where it belongs, for eyeballs and responses. On Mon, May 11, 2015 at 10:39 PM, 张忠波 zhangzhongbo2...@163.com wrote: Hi, when I run ceph-deploy, an error appears: Error in sys.exitfunc. I have found the same error message,

Re: [ceph-users] Civet RadosGW S3 not storing complete obects; civetweb logs stop after rotation

2015-05-12 Thread Mark Murphy
Hey Yehuda, I work with Sean on the dev side. We thought we should put together a short report on what we’ve been seeing in the hopes that the behavior might make some sense to you. We had originally noticed these issues a while ago with our first iteration of this particular Ceph deployment.

Re: [ceph-users] kernel version for rbd client and hammer tunables

2015-05-12 Thread Ilya Dryomov
On Tue, May 12, 2015 at 8:37 PM, Chad William Seys cws...@physics.wisc.edu wrote: Hi Ilya and all, Is it safe to use kernel 3.16.7 rbd with Hammer tunables? I've tried this on a test Hammer cluster and the client seems to work fine. I've also mounted cephfs on a Hammer cluster

Re: [ceph-users] about rgw region sync

2015-05-12 Thread Craig Lewis
Are you trying to set up replication on one cluster right now? Generally replication is set up between two different clusters, each having one zone, with both clusters in the same region. I can't think of a reason why two zones in one cluster wouldn't work; it's just more complicated to set up.

[ceph-users] Question regarding multipart object HEAD calls

2015-05-12 Thread Eric Beerman
Hello, When storing large, multipart objects in the Ceph Object Store (~100 GB and more), we have noticed that HEAD calls against the rados gateway for these objects are excessively slow - in fact, they are about the same as doing a GET on the object. Looking at the logs while this is

[ceph-users] RBD images -- parent snapshot missing (help!)

2015-05-12 Thread Pavel V. Kaygorodov
Hi! I have an RBD image (in pool volumes), made by OpenStack from a parent image (in pool images). Recently, I tried to decrease the number of PGs to avoid the new Hammer warning. I copied the images pool to another pool, deleted the original pool, and renamed the new pool to images. Ceph allowed me to do

[ceph-users] Re: about rgw region sync

2015-05-12 Thread 刘俊
No, I set up replication between two clusters; each cluster has one zone, and both clusters are in the same region. But I got some errors. On 2015-05-13 at 00:02, Craig Lewis cle...@centraldesktop.com wrote: Are you trying to setup replication on one cluster right now? Generally replication is setup between

Re: [ceph-users] RBD images -- parent snapshot missing (help!)

2015-05-12 Thread Tuomas Juntunen
Hi, I have been having this exact same problem for more than a week and have not found a way to do this either. Any help would be appreciated. Basically all of our guests are now down; even though they are not in production, we still need to get the data out of them. Br, Tuomas -Original

Re: [ceph-users] New Calamari server

2015-05-12 Thread Bruce McFarland
I am having a similar issue. The cluster is up, and salt is running on, and has accepted keys from, all nodes, including the monitor. I can issue salt and salt/ceph.py commands from the Calamari master, including 'salt \* ceph.get_heartbeats', which returns from all nodes including the monitor with the

Re: [ceph-users] Civet RadosGW S3 not storing complete obects; civetweb logs stop after rotation

2015-05-12 Thread Yehuda Sadeh-Weinraub
Hi, Thank you for a very thorough investigation. See my comments below: - Original Message - From: Mark Murphy murphyma...@uchicago.edu To: Yehuda Sadeh-Weinraub yeh...@redhat.com Cc: Sean Sullivan seapasu...@uchicago.edu, ceph-users@lists.ceph.com Sent: Tuesday, May 12, 2015

[ceph-users] Cluster always in WARN state, failing to respond to cache pressure

2015-05-12 Thread Cullen King
I'm operating a fairly small ceph cluster, currently three nodes (with plans to expand to five in the next couple of months) with more than adequate hardware. Node specs: 2x Xeon E5-2630 64gb ram 2x RAID1 SSD for system 2x 256gb SSDs for journals 4x 4tb drives for OSDs 1GbE for frontend (shared

Re: [ceph-users] kernel version for rbd client and hammer tunables

2015-05-12 Thread Ilya Dryomov
On Tue, May 12, 2015 at 10:38 PM, Chad William Seys cws...@physics.wisc.edu wrote: Hi Ilya and all, Thanks for explaining. I'm confused about what building a crushmap means. After running #ceph osd crush tunables hammer data migrated around the cluster, so something

Re: [ceph-users] kernel version for rbd client and hammer tunables

2015-05-12 Thread Chad William Seys
Hi Ilya and all, Thanks for explaining. I'm confused about what building a crushmap means. After running #ceph osd crush tunables hammer data migrated around the cluster, so something changed. I was expecting that 'straw' would be replaced by 'straw2'.

Re: [ceph-users] EC backend benchmark

2015-05-12 Thread Nick Fisk
Hi Somnath, Firstly, thank you for sharing these results. I suspect you are struggling to saturate anything due to the effects of serial latency; have you tried scaling the clients above 8? I noticed a similar ceiling, albeit at a much lower performance threshold, when using 1Gb

Re: [ceph-users] Cluster always in WARN state, failing to respond to cache pressure

2015-05-12 Thread Gregory Farnum
On Tue, May 12, 2015 at 12:03 PM, Cullen King cul...@ridewithgps.com wrote: I'm operating a fairly small ceph cluster, currently three nodes (with plans to expand to five in the next couple of months) with more than adequate hardware. Node specs: 2x Xeon E5-2630 64gb ram 2x RAID1 SSD for

Re: [ceph-users] RBD images -- parent snapshot missing (help!)

2015-05-12 Thread Pavel V. Kaygorodov
Hi! I have found a way to change a pool ID for image parent: list images: # rbd ls volumes 5a4b167d-2588-4c06-904c-347abf91d788_disk.swap volume-0ed965a0-53a5-4054-ad9c-3a432c8455d6 volume-1269b41a-4af0-499b-a16c-9bb6a5b98e70 volume-4094fbc1-9969-47aa-a0de-7026678b8e64

Re: [ceph-users] kernel version for rbd client and hammer tunables

2015-05-12 Thread Ilya Dryomov
On Tue, May 12, 2015 at 11:16 PM, Robert LeBlanc rob...@leblancnet.us wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 What is the difference between straw and straw2 buckets? Should we consider upgrading to straw2 buckets by dumping the CRUSH map and updating them? Well, straw bucket

Re: [ceph-users] EC backend benchmark

2015-05-12 Thread Somnath Roy
Nick thanks for your feedback. Please find my response inline. Regards Somnath -Original Message- From: Nick Fisk [mailto:n...@fisk.me.uk] Sent: Tuesday, May 12, 2015 1:02 PM To: Somnath Roy; 'Christian Balzer'; ceph-users@lists.ceph.com Subject: RE: [ceph-users] EC backend benchmark Hi

Re: [ceph-users] Ceph User Committee Vote

2015-05-12 Thread Patrick McGarry
I know Wido was looking to potentially get on the ballot here, so I am including him. I'm also including ceph-users and ceph-devel in case anyone else has strong feelings about leading the Ceph User Committee (CUC). Thanks. As Eric mentioned, if you are interested in taking on the role and

Re: [ceph-users] kernel version for rbd client and hammer tunables

2015-05-12 Thread Chad William Seys
No, pools use crush rulesets. straw and straw2 are bucket types (or algorithms). As an example, if you do ceph osd crush add-bucket foo rack on a cluster with firefly tunables, you will get a new straw bucket. The same after doing ceph osd crush tunables hammer will get you a new straw2
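Ilya's point above — tunables changes don't rewrite existing buckets, only newly created ones get straw2 — can be checked, and existing buckets converted, roughly like this (a sketch against a live cluster; file names are placeholders, and converting triggers data movement):

```shell
# Decompile the current CRUSH map and look at each bucket's algorithm.
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
grep alg crushmap.txt            # "alg straw" vs "alg straw2" per bucket

# To convert, edit "alg straw" to "alg straw2" in crushmap.txt, then:
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new   # expect some rebalancing afterwards
```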

Re: [ceph-users] Cluster always in WARN state, failing to respond to cache pressure

2015-05-12 Thread Cullen King
Thanks for the suggestions Greg. One thing I forgot to mention, restarting the main MDS service fixes the problem temporarily. Clearing inodes and dentries on the client with echo 2 | sudo tee /proc/sys/vm/drop_caches on the two cephfs clients that were failing to respond to cache pressure fixed

Re: [ceph-users] kernel version for rbd client and hammer tunables

2015-05-12 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 What is the difference between straw and straw2 buckets? Should we consider upgrading to straw2 buckets by dumping the CRUSH map and updating them? -BEGIN PGP SIGNATURE- Version: Mailvelope v0.13.1 Comment: https://www.mailvelope.com

Re: [ceph-users] export-diff exported only 4kb instead of 200-600gb

2015-05-12 Thread Ultral
If you run 'rbd info --pool RBD-01 CEPH_006__01__NA__0003__ESX__ALL_EXT', what is the output? size 2048 GB in 524288 objects order 22 (4096 kB objects) block_name_prefix: rb.0.19b1.238e1f29 format: 1 Does 'rbd diff' work against the image (i.e. more than a few

Re: [ceph-users] cache pool parameters and pressure

2015-05-12 Thread Gregory Farnum
On Tue, May 12, 2015 at 5:54 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: On 04/30/2015 07:50 PM, Gregory Farnum wrote: On Thu, Apr 30, 2015 at 2:03 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: So the cache is empty, but I get warning when I check the health:

[ceph-users] kernel version for rbd client and hammer tunables

2015-05-12 Thread Chad William Seys
Hi Ilya and all, Is it safe to use kernel 3.16.7 rbd with Hammer tunables? I've tried this on a test Hammer cluster and the client seems to work fine. I've also mounted cephfs on a Hammer cluster (and Hammer tunables) using kernel 3.16. It seems to work fine (but not much

Re: [ceph-users] [ceph-calamari] Does anyone understand Calamari??

2015-05-12 Thread Bruce McFarland
/var/log/salt/minion doesn't really look very interesting after that sequence. I issued salt oceton109 ceph.get_heartbeats from the master. The logs are much more interesting when I clear calamari and stop salt-minion. Looking at the endpoints from http://essperf2/api/v2/cluster doesn't show

Re: [ceph-users] [ceph-calamari] Does anyone understand Calamari??

2015-05-12 Thread Gregory Meno
Ideally I would like everything in /var/log/calamari. Be sure to set calamari.conf like so: [shadow_man@vpm107 ~]$ grep DEBUG /etc/calamari/calamari.conf log_level = DEBUG db_log_level = DEBUG log_level = DEBUG then restart cthulhu and apache, visit http://essperf3/api/v2/cluster and

Re: [ceph-users] [ceph-calamari] Does anyone understand Calamari??

2015-05-12 Thread Gregory Meno
All that looks fine. There must be some state where the cluster is known to calamari and it is failing to actually show it. If you have time to debug I would love to see the logs at debug level. If you don’t we could try cleaning out calamari’s state. sudo supervisorctl shutdown sudo service

Re: [ceph-users] [ceph-calamari] Does anyone understand Calamari??

2015-05-12 Thread Bruce McFarland
Which logs? I'm assuming /var/log/salt/minion, since the rest on the minions are relatively empty. Possibly cthulhu from the master? I'm running on Ubuntu 14.04 and don't have an httpd service; I had been starting/stopping apache2. Likewise there is no supervisord service, and I've been using

Re: [ceph-users] EC backend benchmark

2015-05-12 Thread Christian Balzer
Hello, On Tue, 12 May 2015 15:28:28 +0000 Somnath Roy wrote: Hi Christian, I wonder why you are saying EC will write more data than replication? There are 2 distinct things to look at here. 1. One is the overhead (increasing with smaller blocks) created by Ceph (and the filesystem) as per my

Re: [ceph-users] OSD in ceph.conf

2015-05-12 Thread Georgios Dimitrakakis
Robert, thanks a lot for the feedback! I was very worried about the same thing! Glad to know that Ceph's automagic takes care of everything :-P Best regards, George If you use ceph-disk (and I believe ceph-deploy) to create your OSDs, or you go through the manual steps to set up the

Re: [ceph-users] rbd unmap command hangs when there is no network connection with mons and osds

2015-05-12 Thread Sebastien Han
Should we put a timeout to the unmap command on the RBD RA in the meantime? On 08 May 2015, at 15:13, Vandeir Eduardo vandeir.edua...@gmail.com wrote: Wouldn't be better a configuration named (map|unmap)_timeout? Cause we are talking about a map/unmap of a RBD device, not a mount/unmount of
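A timeout in the resource agent's stop path, as proposed above, could be sketched with coreutils `timeout` (the 30-second bound and the device path are arbitrary placeholders, not values from the thread):

```shell
# Bound "rbd unmap" so a stop action cannot hang forever when the
# mons/osds are unreachable (30s and /dev/rbd0 are placeholders).
if ! timeout 30 rbd unmap /dev/rbd0; then
    echo "rbd unmap timed out; reporting failure" >&2
    exit 1   # let the cluster manager escalate (e.g. fence the node)
fi
```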

Re: [ceph-users] Btrfs defragmentation

2015-05-12 Thread Lionel Bouton
On 05/06/15 20:28, Lionel Bouton wrote: Hi, On 05/06/15 20:07, Timofey Titovets wrote: 2015-05-06 20:51 GMT+03:00 Lionel Bouton lionel+c...@bouton.name: Is there something that would explain why initially Btrfs creates the 4MB files with 128k extents (32 extents / file) ? Is it a bad thing

[ceph-users] Re: about rgw region sync

2015-05-12 Thread 刘俊
Please give me some advice, thanks. On 2015-05-13 at 00:29, 刘俊 316828...@qq.com wrote: No, I set up replication between two clusters; each cluster has one zone, both clusters are in the same region. But I got some errors. On 2015-05-13 at 00:02, Craig Lewis cle...@centraldesktop.com wrote: Are you trying to setup

Re: [ceph-users] Civet RadosGW S3 not storing complete obects; civetweb logs stop after rotation

2015-05-12 Thread Yehuda Sadeh-Weinraub
I opened issue #11604, and have a fix for the issue. I updated our test suite to cover the specific issue that you were hitting. We'll backport the fix to both hammer and firefly soon. Thanks! Yehuda - Original Message - From: Yehuda Sadeh-Weinraub yeh...@redhat.com To: Mark Murphy

Re: [ceph-users] [ceph-calamari] Does anyone understand Calamari??

2015-05-12 Thread Gregory Meno
Bruce, It is great to hear that salt is reporting status from all the nodes in the cluster. Let me see if I understand your question: You want to know what conditions cause us to recognize a working cluster? see

[ceph-users] about rgw region sync

2015-05-12 Thread TERRY
I built two ceph clusters. For the first cluster, I did the following steps: 1. create pools: sudo ceph osd pool create .us-east.rgw.root 64 64 sudo ceph osd pool create .us-east.rgw.control 64 64 sudo ceph osd pool create .us-east.rgw.gc 64 64 sudo ceph osd pool create .us-east.rgw.buckets 64 64 sudo

Re: [ceph-users] cache pool parameters and pressure

2015-05-12 Thread Kenneth Waegeman
On 04/30/2015 07:50 PM, Gregory Farnum wrote: On Thu, Apr 30, 2015 at 2:03 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: So the cache is empty, but I get warning when I check the health: health HEALTH_WARN mds0: Client cephtst.cubone.os failing to respond to cache

Re: [ceph-users] Cisco UCS Blades as MONs? Pros cons ...?

2015-05-12 Thread Christian Balzer
Hello, I'm not familiar with Cisco UCS gear (can you cite exact models?), but somehow the thought of buying compute gear from Cisco makes me think of having too much money or very steep discounts. ^o^ That said, I presume the chassis those blades are in have redundancy in terms of PSUs (we

[ceph-users] How to debug a ceph read performance problem?

2015-05-12 Thread changqian zuo
Hi guys, We have been running an OpenStack Havana environment with Ceph 0.72.2 as the block storage backend. Recently we tried to upgrade OpenStack to Juno. For testing, we deployed a Juno all-in-one node; this node shares the same Cinder volume rbd pool and Glance image rbd pool with the old

Re: [ceph-users] export-diff exported only 4kb instead of 200-600gb

2015-05-12 Thread Jason Dillaman
Very strange. I'll see if I can reproduce this on a giant release. If you run 'rbd info --pool RBD-01 CEPH_006__01__NA__0003__ESX__ALL_EXT', what is the output? I want to use the same settings as your image. Does 'rbd diff' work against the image (i.e. report more than a few kilobytes of deltas)?
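The checks suggested in this thread can be sketched as follows (pool and image names are from the thread; the snapshot name `snap1` is a placeholder, and a live cluster is assumed):

```shell
# Confirm the image's format and layout first.
rbd info --pool RBD-01 CEPH_006__01__NA__0003__ESX__ALL_EXT

# "rbd diff" should print offset/length/extent rows; near-empty output
# means export-diff will also produce a near-empty file.
rbd diff --pool RBD-01 CEPH_006__01__NA__0003__ESX__ALL_EXT | head

# Export the deltas up to a snapshot and sanity-check the file size;
# a ~4 KB file indicates no deltas were found.
rbd export-diff --pool RBD-01 \
    CEPH_006__01__NA__0003__ESX__ALL_EXT@snap1 diff1.bin
ls -lh diff1.bin
```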

[ceph-users] about rgw region sync

2015-05-12 Thread TERRY
Could I build one region using two clusters, each cluster having one zone, so that I can sync metadata and data from one cluster to another? I built two ceph clusters. For the first cluster, I did the following steps: 1. create pools: sudo ceph osd pool create .us-east.rgw.root 64 64 sudo ceph
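Yes — that is the intended federated layout on firefly/hammer-era radosgw. A rough sketch of its shape (region "us", master zone "us-east" on cluster A, secondary zone "us-west" on cluster B; all names and JSON files are placeholders, and a live cluster is assumed):

```shell
# On the master cluster: load a region map listing both zones
# (master_zone = us-east), then the master zone's own config.
radosgw-admin region set --infile us.json
radosgw-admin zone set --rgw-zone=us-east --infile us-east.json
radosgw-admin region default --rgw-region=us
radosgw-admin regionmap update

# On the secondary cluster: same region map, plus its own zone config.
radosgw-admin region set --infile us.json
radosgw-admin zone set --rgw-zone=us-west --infile us-west.json
radosgw-admin regionmap update

# Then run radosgw-agent between the two endpoints to sync
# metadata and data.
radosgw-agent -c region-data-sync.conf
```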

Re: [ceph-users] Rados Gateway and keystone

2015-05-12 Thread ghislain.chevalier
Thx Mark. I understand the specific parameters are mandatory for the S3 implementation, but as they are not for the swift implementation (I tested it...), it would be better to distinguish which parameters are mandatory according to the implementation. For the S3 implementation, the creation

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-12 Thread Anthony D'Atri
For me that's true about 1/3 the time, but often I do still have to repair the PG after removing the affected OSD. YMMV. Agree that 99+% of the inconsistent PG's I see correlate directly to disk flern. Check /var/log/kern.log*, /var/log/messages*, etc. and I'll bet you find errors

Re: [ceph-users] EC backend benchmark

2015-05-12 Thread Somnath Roy
Hi Christian, I wonder why you are saying EC will write more data than replication? Anyway, as you suggested, I will see how I can measure WA for EC vs replication. Thanks Regards Somnath -Original Message- From: Christian Balzer [mailto:ch...@gol.com] Sent: Monday, May 11, 2015 11:28

Re: [ceph-users] New Calamari server

2015-05-12 Thread Michael Kuriger
In my case, I did remove all salt keys. The salt portion of my install is working; it's just that the calamari server is not seeing the ceph cluster. Michael Kuriger Sr. Unix Systems Engineer mk7...@yp.com | 818-649-7235 On 5/12/15, 1:35 AM, Alexandre DERUMIER aderum...@odiso.com