[ceph-users] ceph can't recognize ext4 extended attributes when --mkfs --mkkey

2015-03-03 Thread wsnote
ceph version 0.80.1, System: CentOS 6.5
[root@dn1 osd.6]# mount
/dev/sde1 on /cache4 type ext4 (rw,noatime,user_xattr) —— osd.6
/dev/sdf1 on /cache5 type ext4 (rw,noatime,user_xattr) —— osd.7
/dev/sdg1 on /cache6 type ext4 (rw,noatime,user_xattr) —— osd.8
/dev/sdh1 on /cache7 type ext4

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
What is your replication count? 2015-03-03 15:14 GMT+03:00 Andrija Panic andrija.pa...@gmail.com: Hi Irek, yes, stopping the OSD (or setting it to OUT) resulted in only 3% of data degraded and moved/recovered. When I afterwards removed it from the Crush map with ceph osd crush rm id, that's when the

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
Since you have only three nodes in the cluster, I recommend you add the new nodes to the cluster first, and then delete the old ones. 2015-03-03 15:28 GMT+03:00 Irek Fasikhov malm...@gmail.com: What is your replication count? 2015-03-03 15:14 GMT+03:00 Andrija Panic andrija.pa...@gmail.com: Hi Irek,

[ceph-users] Objects, created with Rados Gateway, have incorrect UTC timestamp

2015-03-03 Thread Sergey Arkhipov
Hi, I have a problem with the timestamps of objects created in Rados Gateway. Timestamps are supposed to be in the UTC timezone, but instead I see a strange offset shift. The server with Rados Gateway uses the MSK timezone (GMT+3). NTP is set up and running correctly. Rados Gateway and Ceph have no objects (usage

Re: [ceph-users] Question regarding rbd cache

2015-03-03 Thread Jason Dillaman
librbd caches data at a buffer / block level. In a simplified example, if you are reading and writing random 4K blocks, the librbd cache would store only those individual 4K blocks. Behind the scenes, it is possible for adjacent block buffers to be merged together within the librbd cache.
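For reference, the client-side cache being described is controlled from the [client] section of ceph.conf; a minimal sketch (the size values are illustrative, not from this thread):
  [client]
  rbd cache = true
  rbd cache size = 33554432                  # total cache per image, 32 MB here
  rbd cache max dirty = 25165824             # dirty bytes allowed before writeback
  rbd cache writethrough until flush = true  # stay write-through until the guest issues a flush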

[ceph-users] cephfs filesystem layouts : authentication gotchas ?

2015-03-03 Thread SCHAER Frederic
Hi, I am attempting to test the cephfs filesystem layouts. I created a user with rights to write only in one pool:
client.puppet
 key: zzz
 caps: [mon] allow r
 caps: [osd] allow rwx pool=puppet
I also created another pool in which I would assume this user is allowed to do
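A rough sketch of how such a pool-restricted client and a directory layout pointing at that pool are usually set up (the mount point and the add_data_pool step are assumptions for this era of Ceph, not taken from the thread):
  # client that can only write to the 'puppet' pool
  ceph auth get-or-create client.puppet mon 'allow r' osd 'allow rwx pool=puppet'
  # register the pool as a cephfs data pool, then point a directory at it
  ceph mds add_data_pool puppet
  setfattr -n ceph.dir.layout.pool -v puppet /mnt/cephfs/puppet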

Re: [ceph-users] qemu-kvm and cloned rbd image

2015-03-03 Thread Jason Dillaman
Your procedure appears correct to me. Would you mind re-running your cloned image VM with the following ceph.conf properties:
[client]
rbd cache off
debug rbd = 20
log file = /path/writeable/by/qemu.$pid.log
If you recreate the issue, would you mind opening a ticket at

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Andrija Panic
Thanks Irek. Does this mean that after peering for each PG there will be a delay of 10 sec, meaning that every once in a while I will have 10 sec of the cluster NOT being stressed/overloaded, then the recovery takes place for that PG, then for another 10 sec the cluster is fine, and then stressed

Re: [ceph-users] More than 50% osds down, CPUs still busy; will the cluster recover without help?

2015-03-03 Thread Chris Murray
Ah yes, that's a good point :-) Thank you for your assistance Greg, I'm understanding a little more about how Ceph operates under the hood now. We're probably at a reasonable point for me to say I'll just switch the machines off and forget about them for a while. It's no great loss; I just

Re: [ceph-users] cephfs filesystem layouts : authentication gotchas ?

2015-03-03 Thread John Spray
On 03/03/2015 15:21, SCHAER Frederic wrote: By the way: it looks like the “ceph fs ls” command is inconsistent when the cephfs is mounted (I used a locally compiled kmod-ceph rpm):
[root@ceph0 ~]# ceph fs ls
name: cephfs_puppet, metadata pool: puppet_metadata, data pools: [puppet ]
(umount

[ceph-users] Rbd image's data deletion

2015-03-03 Thread Giuseppe Civitella
Hi all, what happens to the data contained in an rbd image when the image itself gets deleted? Is the data just unlinked, or is it destroyed in a way that makes it unreadable? thanks Giuseppe

[ceph-users] Question about rados bench

2015-03-03 Thread Tony Harris
Hi all, In my reading on the net about various implementations of Ceph, I came across this blog page (it really doesn't give a lot of good information, but it caused me to wonder): http://avengermojo.blogspot.com/2014/12/cubieboard-cluster-ceph-test.html Near the bottom, the person did a rados

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Andrija Panic
Thx Irek. The number of replicas is 3. I have 3 servers with 2 OSDs each on a 1G switch (1 OSD already decommissioned), which is further connected to a new 10G switch/network with 3 servers on it with 12 OSDs each. I'm decommissioning the 3 old nodes on the 1G network... So you suggest removing a whole node

Re: [ceph-users] problem in cephfs for remove empty directory

2015-03-03 Thread John Spray
On 03/03/2015 14:07, Daniel Takatori Ohara wrote:
$ ls test-daniel-old/
total 0
drwx-- 1 rmagalhaes BioInfoHSL Users 0 Mar 2 10:52 ./
drwx-- 1 rmagalhaes BioInfoHSL Users 773099838313 Mar 2 11:41 ../
$ rm -rf test-daniel-old/
rm: cannot remove ‘test-daniel-old/’:

Re: [ceph-users] problem in cephfs for remove empty directory

2015-03-03 Thread Gregory Farnum
On Tue, Mar 3, 2015 at 9:24 AM, John Spray john.sp...@redhat.com wrote: On 03/03/2015 14:07, Daniel Takatori Ohara wrote:
$ ls test-daniel-old/
total 0
drwx-- 1 rmagalhaes BioInfoHSL Users 0 Mar 2 10:52 ./
drwx-- 1 rmagalhaes BioInfoHSL Users 773099838313 Mar 2 11:41 ../

[ceph-users] Ceph Cluster Address

2015-03-03 Thread Garg, Pankaj
Hi, I have a ceph cluster that is contained within a rack (1 Monitor and 5 OSD nodes). I kept the same public and private address in the configuration. I do have 2 NICs and 2 valid IP addresses (one internal-only and one external) for each machine. Is it possible now to change the Public Network
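For context, the two networks are normally declared in the [global] section of ceph.conf roughly as below (subnets are placeholders, not from this thread); the monitor addresses recorded in the monmap also have to match the public network, which is the hard part when changing it after deployment:
  [global]
  public network  = 192.168.1.0/24   # client-facing / externally reachable network
  cluster network = 10.0.0.0/24      # OSD replication and heartbeat traffic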

[ceph-users] Unbalanced cluster

2015-03-03 Thread Matt Conner
Hi All, I have a cluster that I've been pushing data into in order to get an idea of how full it can get prior to ceph marking the cluster full. Unfortunately, each time I fill the cluster I end up with one disk that typically hits the full ratio (0.95) while all the other disks still have anywhere from
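One common way to even out such an outlier is to reweight by utilization, or to nudge the overfull OSD down by hand; a sketch (the threshold and OSD id are illustrative, not from this thread):
  # reweight OSDs whose utilization is more than 20% above the cluster average
  ceph osd reweight-by-utilization 120
  # or lower the weight of a single overfull OSD
  ceph osd reweight 17 0.9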

Re: [ceph-users] problem in cephfs for remove empty directory

2015-03-03 Thread Daniel Takatori Ohara
Hi John and Gregory, The version of the ceph client is 0.87 and the kernel is 3.13. The debug logs are attached. I saw this problem with an older kernel, but I didn't find a solution in the tracker. Thanks, Att. --- Daniel Takatori Ohara. System Administrator - Lab. of Bioinformatics Molecular

Re: [ceph-users] CephFS Attributes Question Marks

2015-03-03 Thread Scottix
I did a bit more testing. 1. I tried on a newer kernel and was not able to recreate the problem; maybe it is the kernel bug you mentioned, although it's not an exact replica of the load. 2. I haven't tried the debugging yet since I have to wait for the right moment. One thing I realized, and maybe it

[ceph-users] import-diff requires snapshot exists?

2015-03-03 Thread Steve Anthony
Hello, I've been playing with backing up images from my production site (running 0.87) to my backup site (running 0.87.1) using export/import and export-diff/import-diff. After initially exporting and importing the image (rbd/small to backup/small) I took a snapshot (called test1) on the

Re: [ceph-users] RadosGW do not populate log file

2015-03-03 Thread Italo Santos
After changing the ownership of the log file directory everything became fine. Thanks for your help. Regards, Italo Santos http://italosantos.com.br/ On Tuesday, March 3, 2015 at 00:35, zhangdongmao wrote: I have met this before. Because I use apache with rgw, radosgw is executed by the

Re: [ceph-users] Ceph Cluster Address

2015-03-03 Thread J-P Methot
I had to go through the same experience of changing the public network address and it's not easy. Ceph seems to keep a record of which IP address and port are associated with each OSD process. I was never able to find out where this record is kept or how to change it manually.

Re: [ceph-users] import-diff requires snapshot exists?

2015-03-03 Thread Steve Anthony
Jason, Ah, ok that makes sense. I was forgetting snapshots are read-only. Thanks! My plan was to do something like this. First, create a sync snapshot and seed the backup:
rbd snap create rbd/small@sync
rbd export rbd/small@sync ./foo
rbd import ./foo backup/small
rbd snap create
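The incremental runs that follow such a seed usually look like this sketch (the second snapshot name and diff file are assumptions, not from the thread); import-diff expects the --from-snap snapshot to already exist on the destination image, which is why the backup side also needs its own sync snapshot:
  rbd snap create rbd/small@sync2
  rbd export-diff --from-snap sync rbd/small@sync2 ./small.diff
  rbd import-diff ./small.diff backup/small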

[ceph-users] Unexpected OSD down during deep-scrub

2015-03-03 Thread Italo Santos
Hello everyone, I have a cluster with 5 hosts and 18 OSDs; today I faced an unexpected issue where multiple OSDs went down. The first OSD to go down was osd.8; a few minutes later another OSD went down on the same host, osd.1. So I tried to restart the OSDs (osd.8 and osd.1) but it doesn’t
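While investigating a crash like this, scrubbing can be paused cluster-wide so it does not keep taking OSDs down; a sketch:
  ceph osd set noscrub
  ceph osd set nodeep-scrub
  # ...restart the affected OSDs and collect logs...
  ceph osd unset noscrub
  ceph osd unset nodeep-scrub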

Re: [ceph-users] CephFS Attributes Question Marks

2015-03-03 Thread Scottix
Ya we are not at 0.87.1 yet, possibly tomorrow. I'll let you know if it still reports the same. Thanks John, --Scottie On Tue, Mar 3, 2015 at 2:57 PM John Spray john.sp...@redhat.com wrote: On 03/03/2015 22:35, Scottix wrote: I was testing a little bit more and decided to run the

Re: [ceph-users] v0.80.8 and librbd performance

2015-03-03 Thread Olivier Bonvalet
Is the kernel client affected by the problem? On Tuesday, 03 March 2015 at 15:19 -0800, Sage Weil wrote: Hi, This is just a heads up that we've identified a performance regression in v0.80.8 from previous firefly releases. A v0.80.9 is working its way through QA and should be out in a few

Re: [ceph-users] v0.80.8 and librbd performance

2015-03-03 Thread Olivier Bonvalet
On Tuesday, 03 March 2015 at 16:32 -0800, Sage Weil wrote: On Wed, 4 Mar 2015, Olivier Bonvalet wrote: Is the kernel client affected by the problem? Nope. The kernel client is unaffected.. the issue is in librbd. sage Ok, thanks for the clarification. So I have to dig!

Re: [ceph-users] CephFS Attributes Question Marks

2015-03-03 Thread Scottix
I was testing a little bit more and decided to run the cephfs-journal-tool. I ran across some errors:
$ cephfs-journal-tool journal inspect
2015-03-03 14:18:54.453981 7f8e29f86780 -1 Bad entry start ptr (0x2aebf6) at 0x2aeb32279b
2015-03-03 14:18:54.539060 7f8e29f86780 -1 Bad entry start ptr
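Before attempting any repair on a journal that inspects badly, it is prudent to keep a raw copy of it first; for example (the output file name is a placeholder):
  cephfs-journal-tool journal export backup.bin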

Re: [ceph-users] CephFS Attributes Question Marks

2015-03-03 Thread John Spray
On 03/03/2015 22:35, Scottix wrote: I was testing a little bit more and decided to run the cephfs-journal-tool. I ran across some errors:
$ cephfs-journal-tool journal inspect
2015-03-03 14:18:54.453981 7f8e29f86780 -1 Bad entry start ptr (0x2aebf6) at 0x2aeb32279b
2015-03-03

[ceph-users] v0.80.8 and librbd performance

2015-03-03 Thread Sage Weil
Hi, This is just a heads up that we've identified a performance regression in v0.80.8 from previous firefly releases. A v0.80.9 is working its way through QA and should be out in a few days. If you haven't upgraded yet you may want to wait. Thanks! sage

[ceph-users] Fwd: RPM Build Errors

2015-03-03 Thread Jesus Chavez (jeschave)
Has anyone on this DL had the thread error?
Checking for unpackaged file(s): /usr/lib/rpm/check-files /home/vagrant/rpmbuild/BUILDROOT/calamari-server-1.3-rc_23_g4c41db3.el7.x86_64
Wrote:

Re: [ceph-users] CephFS Attributes Question Marks

2015-03-03 Thread John Spray
On 03/03/2015 22:57, John Spray wrote: On 03/03/2015 22:35, Scottix wrote: I was testing a little bit more and decided to run the cephfs-journal-tool. I ran across some errors:
$ cephfs-journal-tool journal inspect
2015-03-03 14:18:54.453981 7f8e29f86780 -1 Bad entry start ptr (0x2aebf6)

Re: [ceph-users] Unexpected OSD down during deep-scrub

2015-03-03 Thread Yann Dupont
On 03/03/2015 22:03, Italo Santos wrote: I realised that when the first OSD went down, the cluster was performing a deep-scrub, and I found the below trace in the logs of osd.8; can anyone help me understand why osd.8, and other osds, unexpectedly go down? I'm afraid I've seen

Re: [ceph-users] v0.80.8 and librbd performance

2015-03-03 Thread Ken Dreyer
On 03/03/2015 04:19 PM, Sage Weil wrote: Hi, This is just a heads up that we've identified a performance regression in v0.80.8 from previous firefly releases. A v0.80.9 is working its way through QA and should be out in a few days. If you haven't upgraded yet you may want to wait.

Re: [ceph-users] Unexpected OSD down during deep-scrub

2015-03-03 Thread Loic Dachary
Hi Yann, That seems related to http://tracker.ceph.com/issues/10536 which seems to be resolved. Could you create a new issue with a link to 10536 ? More logs and ceph report would also be useful to figure out why it resurfaced. Thanks ! On 04/03/2015 00:04, Yann Dupont wrote: Le

[ceph-users] Clustering a few NAS into a Ceph cluster

2015-03-03 Thread Loic Dachary
Hi Ceph, Last weekend I discussed with a friend a use case many of us have already thought about: it would be cool to have a simple way to assemble Ceph-aware NAS boxes fresh from the store. I summarized the use case and interface we discussed here:

Re: [ceph-users] v0.80.8 and librbd performance

2015-03-03 Thread Sage Weil
On Wed, 4 Mar 2015, Olivier Bonvalet wrote: Is the kernel client affected by the problem? Nope. The kernel client is unaffected.. the issue is in librbd. sage On Tuesday, 03 March 2015 at 15:19 -0800, Sage Weil wrote: Hi, This is just a heads up that we've identified a performance

Re: [ceph-users] EC configuration questions...

2015-03-03 Thread Don Doerner
Loic, Thank you, I got it created. One of these days, I am going to have to try to understand some of the crush map details... Anyway, on to the next step! -don-

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Robert LeBlanc
I would be inclined to shut down both OSDs in a node and let the cluster recover. Once it is recovered, shut down the next two and let it recover. Repeat until all the OSDs are taken out of the cluster. Then I would set nobackfill and norecover, remove the hosts/disks from the CRUSH map, and then unset
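A sketch of that sequence for one node (OSD ids, the host name, and the sysvinit-style service command are placeholders; adapt to the actual init system):
  # stop the node's OSDs and wait for the cluster to recover
  service ceph stop osd.6
  service ceph stop osd.7
  # once HEALTH_OK again, pause data movement and drop them from CRUSH
  ceph osd set nobackfill
  ceph osd set norecover
  ceph osd crush remove osd.6
  ceph osd crush remove osd.7
  ceph auth del osd.6
  ceph auth del osd.7
  ceph osd rm 6
  ceph osd rm 7
  ceph osd crush remove node1
  ceph osd unset nobackfill
  ceph osd unset norecover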

[ceph-users] problem in cephfs for remove empty directory

2015-03-03 Thread Daniel Takatori Ohara
Hi, I have a problem when I try to remove an empty directory in cephfs. The directory is empty, but it seems to have files crashed in the MDS.
$ ls test-daniel-old/
total 0
drwx-- 1 rmagalhaes BioInfoHSL Users 0 Mar 2 10:52 ./
drwx-- 1 rmagalhaes BioInfoHSL Users 773099838313 Mar 2

Re: [ceph-users] Some long running ops may lock osd

2015-03-03 Thread Erdem Agaoglu
Looking further, I guess what I was trying to describe was a simplified version of the sharded threadpools released in Giant. Is it possible for that to be backported to Firefly? On Tue, Mar 3, 2015 at 9:33 AM, Erdem Agaoglu erdem.agao...@gmail.com wrote: Thank you folks for bringing that up. I had some

Re: [ceph-users] backfill_toofull, but OSDs not full

2015-03-03 Thread wsnote
ceph 0.80.1. The same question. I have deleted 1/4 of the data, but the problem didn't disappear. Does anyone have another way to solve it? At 2015-01-10 05:31:30, Udo Lembke ulem...@polarzone.de wrote: Hi, I had a similar effect two weeks ago - 1 PG backfill_toofull and due to reweighting and deletion there
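If the OSDs themselves are not actually near-full, one workaround is to raise the backfill-full threshold temporarily at runtime (the value is illustrative, not from this thread):
  ceph tell osd.\* injectargs '--osd-backfill-full-ratio 0.92'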

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
Hi. Use the osd_recovery_delay_start value, for example:
[root@ceph08 ceph]# ceph --admin-daemon /var/run/ceph/ceph-osd.94.asok config show | grep osd_recovery_delay_start
osd_recovery_delay_start: 10
2015-03-03 13:13 GMT+03:00 Andrija Panic andrija.pa...@gmail.com: Hi Guys, yesterday I removed 1
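The same value can also be changed at runtime on all OSDs without a restart, e.g.:
  ceph tell osd.\* injectargs '--osd-recovery-delay-start 10'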

[ceph-users] Problems with shadow objects

2015-03-03 Thread Butkeev Stas
Hello all, I have a ceph+RGW installation and have some problems with shadow objects. For example:
# rados ls -p .rgw.buckets | grep default.4507.1
default.4507.1__shadow_test_s3.2/2vO4WskQNBGMnC8MGaYPSLfGkhQY76U.1_5
default.4507.1__shadow_test_s3.2/2vO4WskQNBGMnC8MGaYPSLfGkhQY76U.2_2

[ceph-users] Understand RadosGW logs

2015-03-03 Thread Daniel Schneller
Hi! After realizing the problem with log rotation (see http://thread.gmane.org/gmane.comp.file-systems.ceph.user/17708) and fixing it, I now for the first time have some meaningful (and recent) logs to look at. While from an application perspective there seem to be no issues, I would like to

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
A large percentage comes from the rebuild of the cluster map (but a low percentage of degradation). If you had not run ceph osd crush rm id, the percentage would be low. In your case, the correct option is to remove the entire node rather than each disk individually. 2015-03-03 14:27 GMT+03:00 Andrija Panic

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Andrija Panic
Another question - I mentioned here 37% of objects being moved around - these are MISPLACED objects (degraded objects were 0.001%) after I removed 1 OSD from the crush map (out of 44 OSDs or so). Can anybody confirm this is normal behaviour - and are there any workarounds? I understand this is

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
osd_recovery_delay_start is the delay in seconds between recovery iterations (osd_recovery_max_active). It is described here: https://github.com/ceph/ceph/search?utf8=%E2%9C%93q=osd_recovery_delay_start 2015-03-03 14:27 GMT+03:00 Andrija Panic andrija.pa...@gmail.com: Another question - I

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Andrija Panic
Hi Irek, yes, stopping the OSD (or setting it to OUT) resulted in only 3% of data degraded and moved/recovered. When I afterwards removed it from the Crush map with ceph osd crush rm id, that's when the stuff with 37% happened. And thanks Irek for the help - could you kindly just let me know the preferred steps

[ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Andrija Panic
Hi Guys, yesterday I removed 1 OSD from the cluster (out of 42 OSDs), and it caused over 37% of the data to rebalance - let's say this is fine (this is when I removed it from the Crush Map). I'm wondering - I had previously set some throttling mechanism, but during the first 1h of rebalancing, my rate of
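The throttling knobs usually meant here are the backfill/recovery limits, which can be injected at runtime; a sketch with conservative, illustrative values:
  ceph tell osd.\* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'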