Re: [ceph-users] [Help: pool not responding] Now osd crash

2016-03-08 Thread Mario Giammarco
Hello, probably I have restarted the OSD too many times, or marked OSDs in/out too many times, but now I get this: root@proxmox-zotac:~# /usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f starting osd.1 at :/0 osd_data /var/lib/ceph/osd/ceph-1

Re: [ceph-users] yum install ceph on RHEL 7.2

2016-03-08 Thread Shinobu Kinjo
If you register subscription properly, you should be able to install the Ceph without the EPEL. Cheers, S On Wed, Mar 9, 2016 at 8:08 AM, Deneau, Tom wrote: > Yes, that is what lsb_release is showing... > >> -Original Message- >> From: Shinobu Kinjo

Re: [ceph-users] yum install ceph on RHEL 7.2

2016-03-08 Thread Shinobu Kinjo
Either way is true. It's up to you, anyhow. Cheers, S On Wed, Mar 9, 2016 at 1:24 PM, Ken Dreyer wrote: > On Tue, Mar 8, 2016 at 4:11 PM, Shinobu Kinjo wrote: >> If you register subscription properly, you should be able to install >> the Ceph without

Re: [ceph-users] yum install ceph on RHEL 7.2

2016-03-08 Thread Ken Dreyer
On Tue, Mar 8, 2016 at 4:11 PM, Shinobu Kinjo wrote: > If you register subscription properly, you should be able to install > the Ceph without the EPEL. The opposite is true (when installing upstream / ceph.com). We rely on EPEL for several things, like leveldb and
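Ken's point can be made concrete. A minimal sketch for a vanilla RHEL 7 host pulling the upstream ceph.com packages, where EPEL supplies dependencies such as leveldb; both release-RPM URLs are the commonly documented locations of the era, not taken from this thread, so verify them before use:

```shell
# Hedged sketch: upstream (ceph.com) install on RHEL 7.
# EPEL is needed for dependencies like leveldb; URLs are illustrative.
sudo yum install -y \
    https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo yum install -y \
    http://download.ceph.com/rpm-hammer/el7/noarch/ceph-release-1-1.el7.noarch.rpm
sudo yum install -y ceph
```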

Re: [ceph-users] 1 more way to kill OSD

2016-03-08 Thread Dzianis Kahanovich
Correction: not only MegaRAID SAS is affected. For the tested set: Affected: MegaRAID SAS 2108, Intel 82801JI. Unaffected: Intel C602. Both Intels in AHCI mode, so the specific hardware is possibly not important. Dzianis Kahanovich writes: > This issue was fixed by "xfs_repair -L". > > 1) Megaraid SAS (Intel's SATA still
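The "xfs_repair -L" fix mentioned above might look like this in practice; OSD id and device path are illustrative, and note that zeroing the log can discard recent metadata updates:

```shell
# Hedged sketch: recovering an OSD whose XFS log cannot be replayed.
# WARNING: xfs_repair -L zeroes the log and may lose recent metadata.
systemctl stop ceph-osd@1                     # OSD id 1 is illustrative
umount /var/lib/ceph/osd/ceph-1
xfs_repair -n /dev/sdb1                       # dry run first
xfs_repair -L /dev/sdb1                       # only if log replay fails
mount /dev/sdb1 /var/lib/ceph/osd/ceph-1
systemctl start ceph-osd@1
```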

Re: [ceph-users] Ceph Recovery Assistance, pgs stuck peering

2016-03-08 Thread David Zafman
I expected it to return to osd.36. Oh, if you set "noout" during this process then the pg won't move around when you down osd.36. I expected osd.36 to go down and back up quickly. Also, the pg 10.4f is the same situation, so try the same thing on osd.6. David On 3/8/16 1:05 PM, Ben Hines
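The procedure David describes can be sketched as follows (OSD ids are the ones from the thread; systemd unit names are an assumption about the deployment):

```shell
# Hedged sketch: bounce the primaries quickly without letting pgs remap.
ceph osd set noout                 # keep pgs in place while OSDs restart
systemctl restart ceph-osd@36      # down and back up quickly
systemctl restart ceph-osd@6       # same treatment for pg 10.4f
ceph osd unset noout
```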

Re: [ceph-users] yum install ceph on RHEL 7.2

2016-03-08 Thread Deneau, Tom
Yes, that is what lsb_release is showing... > -Original Message- > From: Shinobu Kinjo [mailto:shinobu...@gmail.com] > Sent: Tuesday, March 08, 2016 5:01 PM > To: Deneau, Tom > Cc: ceph-users > Subject: Re: [ceph-users] yum install ceph on RHEL 7.2 > > On Wed, Mar 9, 2016 at 7:52 AM,

Re: [ceph-users] yum install ceph on RHEL 7.2

2016-03-08 Thread Shinobu Kinjo
On Wed, Mar 9, 2016 at 7:52 AM, Deneau, Tom wrote: > Just checking... > > On vanilla RHEL 7.2 (x64), should I be able to yum install ceph without > adding the EPEL repository? Are you talking about this? # lsb_release -a ... Description: Red Hat Enterprise Linux Server

Re: [ceph-users] Can I rebuild object maps while VMs are running ?

2016-03-08 Thread Loris Cuoghi
On 07/03/2016 17:58, Jason Dillaman wrote: > Documentation of these new RBD features is definitely lacking and I've opened a tracker ticket to improve it [1]. > > [1] http://tracker.ceph.com/issues/15000 > Hey, thank you Jason! :) > That's disheartening to hear that your RBD images were

[ceph-users] threading requirements for librbd

2016-03-08 Thread Blair Bethwaite
Hi all, Not getting very far with this query internally (RH), so hoping someone familiar with the code can spare me the C++ pain... We've hit soft thread count ulimits a couple of times with different Ceph clusters. The clients (Qemu/KVM guests on both Ubuntu and RHEL hosts) have hit the limit

Re: [ceph-users] OSDs go down with infernalis

2016-03-08 Thread Adrien Gillard
Hello Yoann, I think I faced the same issue setting up my own cluster. If it is the same, it's one of the many issues people encounter(ed) during disk initialization. Could you please give the output of: - ll /dev/disk/by-partuuid/ - ll /var/lib/ceph/osd/ceph-* On Thu, Mar 3, 2016 at 3:42 PM,

Re: [ceph-users] how ceph osd handle ios sent from crashed ceph client

2016-03-08 Thread Jason Dillaman
librbd provides crash-consistent IO. It is still up to your application to provide its own consistency by adding barriers (flushes) where necessary. If you flush your IO, once that flush completes you are guaranteed that your previous IO is safely committed to disk. -- Jason Dillaman

[ceph-users] how ceph osd handle ios sent from crashed ceph client

2016-03-08 Thread louis
Hi, my process will use librbd for block IO. I want to know: if my process crashes, how does Ceph handle the IOs sent by my process before the crash? Thanks ___ ceph-users mailing list ceph-users@lists.ceph.com

Re: [ceph-users] threading requirements for librbd

2016-03-08 Thread Jason Dillaman
Are you interested in the max FD count or the max thread count? You mention both in your email. Right now qemu doesn't pool OSD connections when you have multiple RBD images connected to the same VM -- each image uses its own librbd/librados instance. Since each image, at the worst case, might

Re: [ceph-users] threading requirements for librbd

2016-03-08 Thread Dan van der Ster
Hi Blair! Last I heard you should budget for 2-3 fds per OSD. This only affects Glance in our cloud -- the hypervisors run unlimited as root. Here's our config in /etc/security/limits.d/91-nproc.conf: glance soft nofile 32768 glance hard nofile 32768 glance soft

Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-08 Thread Jeffrey McDonald
Resent to ceph-users to be under the message size limit On Tue, Mar 8, 2016 at 6:16 AM, Jeffrey McDonald wrote: > OK, this is done and I've observed the state change of 70.459 from > active+clean to active+clean+inconsistent after the first scrub. > > Files attached:

Re: [ceph-users] Can I rebuild object maps while VMs are running ?

2016-03-08 Thread Jason Dillaman
Thanks for the detailed feedback! Comments inline below: - Original Message - > From: "Loris Cuoghi" > To: "Jason Dillaman" > Cc: ceph-users@lists.ceph.com > Sent: Tuesday, March 8, 2016 5:57:30 AM > Subject: Re: [ceph-users] Can I rebuild

Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-08 Thread Samuel Just
Oh, for the pg with unfound objects, restart the primary, that should fix it. -Sam On Tue, Mar 8, 2016 at 6:44 AM, Jeffrey McDonald wrote: > Resent to ceph-users to be under the message size limit > > On Tue, Mar 8, 2016 at 6:16 AM, Jeffrey McDonald

Re: [ceph-users] OSDs go down with infernalis

2016-03-08 Thread Yoann Moulin
Hello Adrien, > I think I faced the same issue setting up my own cluster. If it is the same, > it's one of the many issues people encounter(ed) during disk initialization. > Could you please give the output of: > - ll /dev/disk/by-partuuid/ > - ll /var/lib/ceph/osd/ceph-* unfortunately, I already

[ceph-users] Does object map feature lock snapshots ?

2016-03-08 Thread Christoph Adomeit
Hi, I have installed ceph 9.2.1 on proxmox with kernel 4.2.8-1-pve. Afterwards I have enabled the features: rbd feature enable $IMG exclusive-lock rbd feature enable $IMG object-map rbd feature enable $IMG fast-diff During the night I have a cronjob which does a rbd snap create on each of my
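A nightly snapshot cronjob like the one described might look like this; the pool name and snapshot naming scheme are illustrative, not taken from the thread:

```shell
# Hedged sketch: snapshot every image in the pool once per night.
for img in $(rbd ls rbd); do
    rbd snap create rbd/${img}@nightly-$(date +%F)
done
```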

Re: [ceph-users] Does object map feature lock snapshots ?

2016-03-08 Thread Jason Dillaman
Is there any way for you to provide debug logs (i.e. debug rbd = 20) from your rbd CLI and qemu process when you attempt to create a snapshot? In v9.2.0, there was an issue [1] where the cache flush writeback from the snap create request was being blocked when the exclusive lock feature was
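For reference, a minimal sketch of what enabling that logging could look like in ceph.conf; the section placement, the `debug ms` companion setting, and the log path are illustrative (the rbd CLI also accepts --debug-rbd=20 directly):

```ini
[client]
    debug rbd = 20
    debug ms = 1
    log file = /var/log/ceph/client.$name.$pid.log
```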

Re: [ceph-users] OSDs go down with infernalis

2016-03-08 Thread Adrien Gillard
If you manually create your journal partition, you need to specify the correct Ceph partition GUID in order for the system and Ceph to identify the partition as a Ceph journal and assign the correct ownership and permissions at boot via udev. I used something like this to create the partition: sudo
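The truncated command might look roughly like this; the device, size, and partition number are illustrative, and the typecode below is the well-known "ceph journal" partition type GUID (an assumption, since the thread's own command is cut off):

```shell
# Hedged sketch: create a journal partition with the Ceph journal typecode
# so udev applies the expected ownership/permissions at boot.
sudo sgdisk --new=1:0:+10G \
    --typecode=1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 \
    --change-name=1:'ceph journal' /dev/sdb
```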

Re: [ceph-users] Ceph Recovery Assistance, pgs stuck peering

2016-03-08 Thread David Zafman
Ben, I haven't look at everything in your message, but pg 12.7a1 has lost data because of writes that went only to osd.73. The way to recover this is to force recovery to ignore this fact and go with whatever data you have on the remaining OSDs. I assume that having min_size 1, having
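The thread does not name the exact setting here, so the following is an assumption on my part: one option commonly used to force peering past writes that only reached a lost OSD is `osd_find_best_info_ignore_history_les`, applied per affected OSD and reverted afterwards:

```shell
# Hedged sketch: force the pg to peer while ignoring the missing history.
# Setting name is an assumption; osd id 73 comes from the thread.
ceph tell osd.73 injectargs '--osd_find_best_info_ignore_history_les=1'
ceph osd down 73        # force the pg to re-peer
# ...wait for peering to complete, then revert:
ceph tell osd.73 injectargs '--osd_find_best_info_ignore_history_les=0'
```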

[ceph-users] pg to RadosGW object list

2016-03-08 Thread Wade Holler
Hi All, I searched Google and whatnot but haven't found this yet. Does anyone know how to do PG -> applicable RadosGW object mapping? Best Regards, Wade
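One brute-force approach (a sketch, not an official tool): list every object in the RGW data pool and ask ceph where each one maps, keeping only those landing in the PG of interest. The pool name and pg id are illustrative:

```shell
# Hedged sketch: reverse PG -> object mapping by exhaustive listing.
# Can be very slow on large pools; pool/pg names are illustrative.
rados -p .rgw.buckets ls | while read obj; do
    ceph osd map .rgw.buckets "$obj"
done | grep "12\.7a1"      # keep objects that map to pg 12.7a1
```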

Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-08 Thread Samuel Just
That doesn't sound related. What is it? -Sam On Tue, Mar 8, 2016 at 12:15 PM, Jeffrey McDonald wrote: > One other oddity I've found is that ceph left 51 GB of data on each of the > OSDs on the retired hardware. Is that by design or could it indicate some > other problems?

Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-08 Thread Jeffrey McDonald
Will do, I'm in the midst of the xfs filesystem check so it will be a bit before I have the filesystem back. jeff On Tue, Mar 8, 2016 at 2:05 PM, Samuel Just wrote: > There are 3 other example pairs on that osd. Can you gather the same > information about these as well? >

Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-08 Thread Jeffrey McDonald
The filesystem check came back clean: xfs_repair -o ag_stride=8 -t 5 -n /dev/disk/by-id/md-uuid-07f46e30:e2734cbd:ae2f95b3:71ec3067 The other three sets:

Re: [ceph-users] Ceph Recovery Assistance, pgs stuck peering

2016-03-08 Thread Ben Hines
After making that setting, the pg appeared to start peering but then it actually changed the primary OSD to osd.100 - then went incomplete again. Perhaps it did that because another OSD had more data? I presume I need to set that value on each osd where the pg hops to. -Ben On Tue, Mar 8, 2016

Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-08 Thread Samuel Just
Yep On Tue, Mar 8, 2016 at 12:08 PM, Jeffrey McDonald wrote: > Will do, I'm in the midst of the xfs filesystem check so it will be a bit > before I have the filesystem back. > jeff > > On Tue, Mar 8, 2016 at 2:05 PM, Samuel Just wrote: >> >> There are 3

Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-08 Thread Samuel Just
The pgs are not actually inconsistent (that is, I think that all of the real objects are present and healthy). I think each of those pgs has one of these duplicate pairs confusing scrub (and also pg removal -- hence your ENOTEMPTY bug). Once we figure out what's going on, you'll have to clean

Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-08 Thread Jeffrey McDonald
I restarted the OSDs with the 'unfound' objects and now I have none, but I have 43 inconsistent PGs that I need to repair. I only see unfound files once I issue the 'pg repair'. How do I clear out the inconsistent states? ceph -s cluster 5221cc73-869e-4c20-950f-18824ddd6692 health
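One way to iterate over the inconsistent pgs is to scrape them from the health output; the awk field assumes `ceph health detail` lines of the form "pg <pgid> is active+clean+inconsistent, ...", so verify against your output first:

```shell
# Hedged sketch: repair every pg currently flagged inconsistent.
ceph health detail | awk '/inconsistent/ {print $2}' | while read pg; do
    ceph pg repair "$pg"    # triggers scrub+repair; watch 'ceph -w'
done
```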

[ceph-users] v10.0.4 released

2016-03-08 Thread Sage Weil
This is the fourth and last development release before Jewel. The next release will be a release candidate with the final set of features. Big items include RGW static website support, librbd journal framework, fixed mon sync of config-key data, C++11 updates, and bluestore/kstore. Note that,

Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-08 Thread Samuel Just
There are 3 other example pairs on that osd. Can you gather the same information about these as well?

Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-08 Thread Samuel Just
By "migrated", you mean you marked them out one at a time and let ceph recover them over to new nodes? -Sam On Tue, Mar 8, 2016 at 11:55 AM, Jeffrey McDonald wrote: > No it's not... historical reasons. ceph[1-3] were the nodes that were > retired. ceph0[1-9] are all new

Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-08 Thread Jeffrey McDonald
One other oddity I've found is that ceph left 51 GB of data on each of the OSDs on the retired hardware. Is that by design or could it indicate some other problems? The PGs there seem to now be remapped elsewhere. Regards, Jeff On Tue, Mar 8, 2016 at 2:09 PM, Samuel Just

Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-08 Thread Samuel Just
Yeah, that procedure should have isolated any filesystem issues. Are there still unfound objects? -sam On Tue, Mar 8, 2016 at 11:58 AM, Jeffrey McDonald wrote: > Yes, I used the crushmap, set them to 0, then let ceph migrate/remap them > over. I controlled the tempo of the

Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-08 Thread Samuel Just
ceph3 is not the same host as ceph03? -Sam On Tue, Mar 8, 2016 at 11:48 AM, Jeffrey McDonald wrote: > Hi Sam, > > 1) Are those two hardlinks to the same file? No: > > # find . -name '*fa202ec9b4b3b217275a*' -exec ls -ltr {} + > -rw-r--r-- 1 root root 0 Jan 23 21:49 >

[ceph-users] yum install ceph on RHEL 7.2

2016-03-08 Thread Deneau, Tom
Just checking... On vanilla RHEL 7.2 (x64), should I be able to yum install ceph without adding the EPEL repository? (looks like the version being installed is 0.94.6) -- Tom Deneau, AMD ___ ceph-users mailing list ceph-users@lists.ceph.com