Re: [ceph-users] Hammer Cache Tiering

2016-11-01 Thread Ashley Merrick
Hello, I already have journals on SSDs, but a journal is only designed for very short write bursts and is not going to help when someone is writing, for example, a 100GB backup file, where in my eyes an SSD tier set to cache writes only will allow the 100GB write to be completed much more quickly
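For reference, a writeback cache tier is normally attached to an existing pool with commands along these lines (a minimal sketch; the pool names are placeholders and the limits need tuning for the actual workload):

  ceph osd tier add rbd-pool ssd-cache                # attach the SSD pool as a tier of the backing pool
  ceph osd tier cache-mode ssd-cache writeback        # absorb writes (and reads) in the cache
  ceph osd tier set-overlay rbd-pool ssd-cache        # send client I/O through the cache tier
  ceph osd pool set ssd-cache hit_set_type bloom      # hit set tracking, needed by the tiering agent
  ceph osd pool set ssd-cache target_max_bytes 500000000000   # example flush/evict threshold

Note there is no write-only mode; a writeback tier also serves reads that hit cached objects.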

Re: [ceph-users] Hammer Cache Tiering

2016-11-01 Thread Christian Wuerdig
On Wed, Nov 2, 2016 at 5:19 PM, Ashley Merrick wrote: > Hello, > > Thanks for your reply, when you say latest version do you mean .6 and not .5? > > The use case is large-scale storage VMs, which may have a burst of high > writes during new storage being loaded onto the

Re: [ceph-users] Monitor troubles

2016-11-01 Thread Tracy Reed
On Tue, Nov 01, 2016 at 09:36:16PM PDT, Tracy Reed spake thusly: > I initially set up my ceph cluster on CentOS 7 with just one monitor. The > monitor runs on an OSD server (not ideal, will change soon). I've Sorry, forgot to add that I'm running the following ceph version from the ceph repo: #

[ceph-users] Monitor troubles

2016-11-01 Thread Tracy Reed
I initially set up my ceph cluster on CentOS 7 with just one monitor. The monitor runs on an OSD server (not ideal, will change soon). I've tested it quite a lot over the last couple of months and things have gone well. I knew I needed to add a couple more monitors, so I did the following:
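The exact steps are cut off above; for reference, on a ceph-deploy managed cluster adding monitors usually looks roughly like this (illustrative only; hostnames are placeholders):

  ceph-deploy mon add mon2    # run from the admin node, one new monitor host at a time
  ceph-deploy mon add mon3
  ceph mon stat               # confirm the new monitors have joined the quorum

The new monitor hosts also need to be reachable on the monitor port and listed in mon_host in ceph.conf on the clients.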

Re: [ceph-users] Hammer Cache Tiering

2016-11-01 Thread Ashley Merrick
Hello, Thanks for your reply, when you say latest version do you mean .6 and not .5? The use case is large-scale storage VMs, which may have a burst of high writes during new storage being loaded onto the environment; looking to place the SSD cache in front, currently with a replica of 3 and

Re: [ceph-users] Hammer Cache Tiering

2016-11-01 Thread Christian Balzer
Hello, On Tue, 1 Nov 2016 15:07:33 + Ashley Merrick wrote: > Hello, > > Currently using a Proxmox & Ceph cluster; they are running on > Hammer and looking to update to Jewel shortly. I know I can do a manual upgrade, > however I would like to keep what is tested well with Proxmox. >

Re: [ceph-users] [EXTERNAL] Re: pg stuck with unfound objects on non existing osd's

2016-11-01 Thread Will . Boege
Start with a rolling restart of just the OSDs, one system at a time, checking the status after each restart. On Nov 1, 2016, at 6:20 PM, Ronny Aasen wrote: Thanks for the suggestion. Is a rolling reboot sufficient, or must all OSDs
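Assuming systemd-managed OSDs, a rolling restart of one host at a time might look like this sketch (setting noout first avoids rebalancing while the daemons bounce):

  ceph osd set noout                  # don't mark restarting OSDs out
  # then, on each OSD host in turn:
  systemctl restart ceph-osd.target   # restarts all OSD daemons on that host
  ceph -s                             # wait for the PGs to settle before moving to the next host
  # once every host has been done:
  ceph osd unset noout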

Re: [ceph-users] pg stuck with unfound objects on non existing osd's

2016-11-01 Thread Ronny Aasen
Thanks for the suggestion. Is a rolling reboot sufficient, or must all OSDs be down at the same time? One is no problem; the other takes some scheduling. Ronny Aasen On 01.11.2016 21:52, c...@elchaka.de wrote: Hello Ronny, if it is possible for you, try to reboot all OSD nodes. I had

Re: [ceph-users] pg stuck with unfound objects on non existing osd's

2016-11-01 Thread ceph
Hello Ronny, if it is possible for you, try to reboot all OSD nodes. I had this issue on my test cluster and it became healthy after rebooting. HTH - Mehmet On 1 November 2016 19:55:07 CET, Ronny Aasen wrote: >Hello. > >I have a cluster stuck with 2 PGs stuck

Re: [ceph-users] Need help! Ceph backfill_toofull and recovery_wait+degraded

2016-11-01 Thread Udo Lembke
Hi again, and change the value with something like this: ceph tell osd.* injectargs '--mon_osd_full_ratio 0.96' Udo On 01.11.2016 21:16, Udo Lembke wrote: > Hi Marcus, > > for a quick fix you could perhaps increase mon_osd_full_ratio. > > What values do you have? > Please post the output of

Re: [ceph-users] Need help! Ceph backfill_toofull and recovery_wait+degraded

2016-11-01 Thread Udo Lembke
Hi Marcus, for a quick fix you could perhaps increase mon_osd_full_ratio. What values do you have? Please post the output of (on host ceph1, because of osd.0.asok): ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep full_ratio. After that it would be helpful to use on all hosts
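Put together, checking and then temporarily raising the ratio looks like this (a sketch; 0.96 leaves very little headroom, so it only buys time while data is rebalanced, and injectargs changes do not survive a daemon restart):

  # inspect the current values via the admin socket (on the host that owns osd.0)
  ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep full_ratio
  # raise the full ratio on all OSDs at runtime
  ceph tell osd.* injectargs '--mon_osd_full_ratio 0.96'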

Re: [ceph-users] Total free space in addition to MAX AVAIL

2016-11-01 Thread Stillwell, Bryan J
On 11/1/16, 1:45 PM, "Sage Weil" wrote: >On Tue, 1 Nov 2016, Stillwell, Bryan J wrote: >> I recently learned that 'MAX AVAIL' in the 'ceph df' output doesn't >> represent what I thought it did. It actually represents the amount of >> data that can be used before the first OSD

Re: [ceph-users] Need help! Ceph backfill_toofull and recovery_wait+degraded

2016-11-01 Thread David Turner
Your weights are very poorly managed. If you have a 1TB drive, its weight should be about 1; if you have an 8TB drive, its weight should be about 8. You have 4TB drives with a weight of 3.64 (which is good), but the new node you added with 4x 8TB drives has weights ranging from 3.3 to 5. The
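Correcting a CRUSH weight is one command per OSD; a sketch, assuming osd.12 is hypothetically one of the 8TB drives (8 TB is roughly 7.28 TiB, hence the conventional weight):

  ceph osd tree                        # check the current weight of each OSD
  ceph osd crush reweight osd.12 7.27  # set the CRUSH weight to the drive size in TiB

Each reweight triggers data movement, so on a nearly full cluster it is usually done in small steps.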

Re: [ceph-users] Total free space in addition to MAX AVAIL

2016-11-01 Thread Sage Weil
On Tue, 1 Nov 2016, Stillwell, Bryan J wrote: > I recently learned that 'MAX AVAIL' in the 'ceph df' output doesn't > represent what I thought it did. It actually represents the amount of > data that can be used before the first OSD becomes full, and not the sum > of all free space across a set

[ceph-users] Total free space in addition to MAX AVAIL

2016-11-01 Thread Stillwell, Bryan J
I recently learned that 'MAX AVAIL' in the 'ceph df' output doesn't represent what I thought it did. It actually represents the amount of data that can be used before the first OSD becomes full, and not the sum of all free space across a set of OSDs. This means that balancing the data with 'ceph
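The per-OSD spread behind that number can be seen by putting the pool view next to the OSD view, for example:

  ceph df       # pool-level usage, including MAX AVAIL
  ceph osd df   # per-OSD utilisation and variance (available since Hammer)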

Re: [ceph-users] Need help! Ceph backfill_toofull and recovery_wait+degraded

2016-11-01 Thread Ronny Aasen
If you have the default crushmap and osd pool default size = 3, then ceph creates 3 copies of each object and stores them on 3 separate nodes. So the best way to solve your space problems is to try to even out the space between your hosts, either by adding disks to ceph1, ceph2 and ceph3, or by

Re: [ceph-users] Uniquely identifying a Ceph client

2016-11-01 Thread Jason Dillaman
> Not using qemu in this scenario. Just rbd map && rbd lock. It's more > that I can't match the output from "rbd lock" against the output from > "rbd status", because they are using different librados instances. > I'm just trying to capture who has an image mapped and locked, and to > those not

[ceph-users] Need help! Ceph backfill_toofull and recovery_wait+degraded

2016-11-01 Thread Marcus Müller
Hi all, I have a big problem and I really hope someone can help me! We have been running a ceph cluster for a year now. The version is 0.94.7 (Hammer). Here is some info. Our osd map is: ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY -1 26.67998 root default

Re: [ceph-users] Uniquely identifying a Ceph client

2016-11-01 Thread Travis Rhoden
On Tue, Nov 1, 2016 at 11:45 AM, Sage Weil wrote: > On Tue, 1 Nov 2016, Travis Rhoden wrote: >> Hello, >> Is there a consistent, reliable way to identify a Ceph client? I'm looking >> for a string/ID (UUID, for example) that can be traced back to a client >> doing RBD maps. >>

[ceph-users] pg stuck with unfound objects on non existing osd's

2016-11-01 Thread Ronny Aasen
Hello. I have a cluster stuck with 2 PGs stuck undersized+degraded, with 25 unfound objects. # ceph health detail HEALTH_WARN 2 pgs degraded; 2 pgs recovering; 2 pgs stuck degraded; 2 pgs stuck unclean; 2 pgs stuck undersized; 2 pgs undersized; recovery 294599/149522370 objects degraded
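For a PG in this state, the usual first step is to see exactly which objects are unfound and which OSDs the PG still wants to probe, e.g. (the PG id 2.1a5 is a placeholder):

  ceph health detail           # lists the affected PG ids
  ceph pg 2.1a5 query          # shows the recovery state and which OSDs were probed
  ceph pg 2.1a5 list_missing   # lists the unfound objects and their possible locations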

Re: [ceph-users] Uniquely identifying a Ceph client

2016-11-01 Thread Sage Weil
On Tue, 1 Nov 2016, Travis Rhoden wrote: > Hello, > Is there a consistent, reliable way to identify a Ceph client? I'm looking > for a string/ID (UUID, for example) that can be traced back to a client > doing RBD maps. > > There are a couple of possibilities out there, but they aren't quite what

[ceph-users] Uniquely identifying a Ceph client

2016-11-01 Thread Travis Rhoden
Hello, Is there a consistent, reliable way to identify a Ceph client? I'm looking for a string/ID (UUID, for example) that can be traced back to a client doing RBD maps. There are a couple of possibilities out there, but they aren't quite what I'm looking for. When checking "rbd status", for
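For reference, the two outputs being compared are these (image and pool names are placeholders):

  rbd status rbd/myimage      # lists watchers: client address, client id and cookie
  rbd lock list rbd/myimage   # lists lockers: locker client id, lock id and address

The watcher entry comes from whichever librados session currently has the image open, which is why it does not necessarily match the session that took the lock.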

Re: [ceph-users] I need help building the source code can anyone help?

2016-11-01 Thread Kamble, Nitin A
Building Ceph is a bit of an involved process. What version are you trying to build? For building, are you following the README in the code? - Nitin On Oct 28, 2016, at 12:16 AM, 刘 畅 wrote: After I successfully run install-deps.sh, I try
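For a Jewel-era tree the README-documented flow is roughly the following (a sketch; the tag is only an example, and newer branches switched to a cmake build via do_cmake.sh):

  git clone --recursive https://github.com/ceph/ceph.git
  cd ceph
  git checkout v10.2.3        # example release tag
  ./install-deps.sh
  ./autogen.sh
  ./configure
  make -j$(nproc)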

Re: [ceph-users] After kernel upgrade OSD's on different disk.

2016-11-01 Thread David Turner
Peter nailed this on the head. You shouldn't set up your journals using /dev/sdx naming. You should use /dev/disk/by-partuuid or something similar. This way it will not matter what letter your drives are assigned on reboot. Your /dev/sdx letter assignments can change on a reboot regardless
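On a FileStore OSD the journal is just a symlink in the OSD data directory, so it can be checked and, with the OSD stopped, re-pointed at a stable name (a sketch; the OSD id and partuuid are placeholders):

  ls -l /var/lib/ceph/osd/ceph-2/journal    # see which device the journal symlink points at
  # stop the OSD, then replace a /dev/sdX target with the persistent name of the same partition:
  ln -sf /dev/disk/by-partuuid/6f3f910f-0000-4b0b-9e2f-3c2f1a6d0001 /var/lib/ceph/osd/ceph-2/journal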

[ceph-users] Hammer Cache Tiering

2016-11-01 Thread Ashley Merrick
Hello, Currently using a Proxmox & Ceph cluster; they are running on Hammer and looking to update to Jewel shortly. I know I can do a manual upgrade, however I would like to keep what is tested well with Proxmox. Looking to put an SSD cache tier in front, however I have seen and read there has

Re: [ceph-users] 10Gbit switch advice for small ceph cluster upgrade

2016-11-01 Thread Alexandre DERUMIER
>> We use Mellanox SX1036 and SX1012, which can function in 10 and 56GbE modes. It uses QSFP, Twinax or MPO, which terminates with LC fiber connections. While not dirt cheap or entry level, we like these as being considerably cheaper than even a decent SDN solution. We have been able

Re: [ceph-users] Question about writing a program that transfer snapshot diffs between ceph clusters

2016-11-01 Thread Wes Dillingham
You might want to have a look at this: https://github.com/camptocamp/ceph-rbd-backup/blob/master/ceph-rbd-backup.py I have a bash implementation of this, but it basically boils down to wrapping what Peter said: an export-diff to stdout piped to an import-diff on a different cluster. The
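The pipe described above, with no intermediate file, looks like this in its simplest form (a sketch; pool, image and snapshot names and the remote host are placeholders):

  # send only the delta between snap1 and snap2 straight into the remote cluster
  rbd export-diff --from-snap snap1 rbd/myimage@snap2 - \
    | ssh backuphost rbd import-diff - rbd/myimage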

[ceph-users] Integrating Ceph Jewel and Mitaka

2016-11-01 Thread fridifree
Hi everybody, I am trying to integrate Ceph with Mitaka and I get an error that cinder-volume cannot connect to the cluster: 2016-11-01 11:40:51.110 13762 ERROR oslo_service.service VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: Error connecting to ceph
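That error usually points at the rbd backend section in cinder.conf or at the keyring/secret available to cinder-volume; a typical backend stanza looks roughly like this (all values are placeholders for this environment):

  [ceph]
  volume_driver = cinder.volume.drivers.rbd.RBDDriver
  rbd_pool = volumes
  rbd_ceph_conf = /etc/ceph/ceph.conf
  rbd_user = cinder
  rbd_secret_uuid = 457eb676-33da-42ec-9a8c-9293d545c337

The cinder user's keyring (e.g. /etc/ceph/ceph.client.cinder.keyring) also has to be readable by the cinder-volume service.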

Re: [ceph-users] After kernel upgrade OSD's on different disk.

2016-11-01 Thread jan hugo prins
Below are the block IDs for the OSD drives. I have one journal disk in this system and, because I'm testing the setup at the moment, one disk has its journal local and the other 2 OSDs have the journal on the journal disk (/dev/sdb). There is also one journal too many, but this is because I took

Re: [ceph-users] Question about writing a program that transfer snapshot diffs between ceph clusters

2016-11-01 Thread Peter Maloney
On 11/01/16 10:22, Peter Maloney wrote: > On 11/01/16 06:57, xxhdx1985126 wrote: >> Hi, everyone. >> >> I'm trying to write a program based on the librbd API that transfers >> snapshot diffs between ceph clusters without the need for temporary >> storage, which is required if I use the "rbd

Re: [ceph-users] Question about writing a program that transfer snapshot diffs between ceph clusters

2016-11-01 Thread Peter Maloney
On 11/01/16 06:57, xxhdx1985126 wrote: > Hi, everyone. > > I'm trying to write a program based on the librbd API that transfers > snapshot diffs between ceph clusters without the need for temporary > storage, which is required if I use the "rbd export-diff" and "rbd > import-diff" pair. You

Re: [ceph-users] After kernel upgrade OSD's on different disk.

2016-11-01 Thread Peter Maloney
On 11/01/16 00:10, jan hugo prins wrote: > After the kernel upgrade, I also upgraded the cluster to 10.2.3 from > 10.2.2. > Let's hope I only hit a bug and that this bug is now fixed; on the other > hand, I think I also saw the issue with a 10.2.3 node, but I'm not sure. It's not a bug for disks