Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-09-02 Thread Dan Van Der Ster
Hi Sebastien, That sounds promising. Did you enable the sharded ops to get this result? Cheers, Dan On 02 Sep 2014, at 02:19, Sebastien Han sebastien@enovance.com wrote: Mark and all, Ceph IOPS performance has definitely improved with Giant. With this version: ceph version

[ceph-users] SSD journal deployment experiences

2014-09-04 Thread Dan Van Der Ster
perform adequately, that’d give us quite a few SSDs to build a dedicated high-IOPS pool. I’d also appreciate any other suggestions/experiences which might be relevant. Thanks! Dan -- Dan van der Ster || Data Storage Services || CERN IT Department

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Dan Van Der Ster
(n-1 or n-2) will be a bit too old from where we want to be, which I'm sure will work wonderfully on Red Hat, but how will n.1, n.2 or n.3 run? Robert LeBlanc On Thu, Sep 4, 2014 at 11:22 AM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi Robert, That's

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Dan van der Ster
to see what you decide to do and what your results are.   On Thu, Sep 4, 2014 at 12:12 PM, Dan Van Der Ster wrote:   I've just been reading the bcache docs. It's a pity the mirrored writes aren't implemented yet. Do you know if you can use an md RAID1 as a cache dev

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Dan van der Ster
Hi Stefan, September 4 2014 9:13 PM, Stefan Priebe s.pri...@profihost.ag wrote: Hi Dan, hi Robert, On 04.09.2014 21:09, Dan van der Ster wrote: Thanks again for all of your input. I agree with your assessment -- in our cluster we avg 3ms for a random (hot) 4k read already, but 40ms

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Dan van der Ster
the constant load very well. Cheers, Martin On Thu, Sep 4, 2014 at 6:21 PM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Dear Cephalopods, In a few weeks we will receive a batch of 200GB Intel DC S3700’s to augment our cluster, and I’d like to hear your practical experience and discuss

Re: [ceph-users] SSD journal deployment experiences

2014-09-04 Thread Dan van der Ster
Hi Craig, September 4 2014 11:50 PM, Craig Lewis cle...@centraldesktop.com wrote: On Thu, Sep 4, 2014 at 9:21 AM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: 1) How often are DC S3700's failing in your deployments? None of mine have failed yet. I am planning to monitor the wear

Re: [ceph-users] SSD journal deployment experiences

2014-09-05 Thread Dan Van Der Ster
Hi Christian, On 05 Sep 2014, at 03:09, Christian Balzer ch...@gol.com wrote: Hello, On Thu, 4 Sep 2014 14:49:39 -0700 Craig Lewis wrote: On Thu, Sep 4, 2014 at 9:21 AM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: 1) How often are DC S3700's failing in your deployments

Re: [ceph-users] SSD journal deployment experiences

2014-09-05 Thread Dan Van Der Ster
On 05 Sep 2014, at 10:30, Nigel Williams nigel.d.willi...@gmail.com wrote: On Fri, Sep 5, 2014 at 5:46 PM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: On 05 Sep 2014, at 03:09, Christian Balzer ch...@gol.com wrote: You might want to look into cache pools (and dedicated SSD servers

Re: [ceph-users] SSD journal deployment experiences

2014-09-05 Thread Dan Van Der Ster
On 05 Sep 2014, at 11:04, Christian Balzer ch...@gol.com wrote: Hello Dan, On Fri, 5 Sep 2014 07:46:12 + Dan Van Der Ster wrote: Hi Christian, On 05 Sep 2014, at 03:09, Christian Balzer ch...@gol.com wrote: Hello, On Thu, 4 Sep 2014 14:49:39 -0700 Craig Lewis wrote

Re: [ceph-users] SSD journal deployment experiences

2014-09-06 Thread Dan van der Ster
Hi Christian, Let's keep debating until a dev corrects us ;) September 6 2014 1:27 PM, Christian Balzer ch...@gol.com wrote: On Fri, 5 Sep 2014 09:42:02 + Dan Van Der Ster wrote: On 05 Sep 2014, at 11:04, Christian Balzer ch...@gol.com wrote: On Fri, 5 Sep 2014 07:46:12 + Dan Van

Re: [ceph-users] SSD journal deployment experiences

2014-09-06 Thread Dan van der Ster
September 6 2014 4:01 PM, Christian Balzer ch...@gol.com wrote: On Sat, 6 Sep 2014 13:07:27 + Dan van der Ster wrote: Hi Christian, Let's keep debating until a dev corrects us ;) For the time being, I give the recent: https://www.mail-archive.com/ceph-users@lists.ceph.com

Re: [ceph-users] SSD journal deployment experiences

2014-09-06 Thread Dan Van Der Ster
considered RAID 5 over your SSDs? Practically speaking, there's no performance downside to RAID 5 when your devices aren't IOPS-bound. On Sat Sep 06 2014 at 8:37:56 AM Christian Balzer ch...@gol.com wrote: On Sat, 6 Sep 2014 14:50:20 + Dan van der Ster wrote: September 6

Re: [ceph-users] NAS on RBD

2014-09-09 Thread Dan Van Der Ster
Hi Blair, On 09 Sep 2014, at 09:05, Blair Bethwaite blair.bethwa...@gmail.com wrote: Hi folks, In lieu of a prod ready Cephfs I'm wondering what others in the user community are doing for file-serving out of Ceph clusters (if at all)? We're just about to build a pretty large cluster -

Re: [ceph-users] NAS on RBD

2014-09-09 Thread Dan Van Der Ster
On 09 Sep 2014, at 16:39, Michal Kozanecki mkozane...@evertz.com wrote: On 9 September 2014 08:47, Blair Bethwaite blair.bethwa...@gmail.com wrote: On 9 September 2014 20:12, Dan Van Der Ster daniel.vanders...@cern.ch wrote: One thing I’m not comfortable with is the idea of ZFS checking

Re: [ceph-users] Ceph general configuration questions

2014-09-16 Thread Dan Van Der Ster
Hi, On 16 Sep 2014, at 16:46, shiva rkreddy shiva.rkre...@gmail.com wrote: 2. Has any one used SSD devices for Monitors. If so, can you please share the details ? Any specific changes to the configuration files? We use SSDs on our monitors — a spinning disk was

Re: [ceph-users] Still seing scrub errors in .80.5

2014-09-16 Thread Dan Van Der Ster
Hi Greg, I believe Marc is referring to the corruption triggered by set_extsize on xfs. That option was disabled by default in 0.80.4... See the thread firefly scrub error. Cheers, Dan From: Gregory Farnum g...@inktank.com Sent: Sep 16, 2014 8:15 PM To: Marc Cc: ceph-users@lists.ceph.com

Re: [ceph-users] Ceph general configuration questions

2014-09-17 Thread Dan Van Der Ster
, xfs, something else or doesn't matter? I think it doesn’t matter. We use xfs. Cheers, Dan On Tue, Sep 16, 2014 at 10:15 AM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi, On 16 Sep 2014, at 16:46, shiva rkreddy shiva.rkre...@gmail.com

Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU

2014-09-17 Thread Dan Van Der Ster
Hi Florian, On 17 Sep 2014, at 17:09, Florian Haas flor...@hastexo.com wrote: Hi Craig, just dug this up in the list archives. On Fri, Mar 28, 2014 at 2:04 AM, Craig Lewis cle...@centraldesktop.com wrote: In the interest of removing variables, I removed all snapshots on all pools,

Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU

2014-09-17 Thread Dan Van Der Ster
to e.g. 16, or to fix the loss of purged_snaps after backfilling. Actually, probably both of those are needed. But a real dev would know better. Cheers, Dan From: Florian Haas flor...@hastexo.com Sent: Sep 17, 2014 5:33 PM To: Dan Van Der Ster Cc: Craig Lewis cle...@centraldesktop.com; ceph-users

Re: [ceph-users] v0.67.11 dumpling released

2014-09-25 Thread Dan Van Der Ster
Hi Mike, On 25 Sep 2014, at 17:47, Mike Dawson mike.daw...@cloudapt.com wrote: On 9/25/2014 11:09 AM, Sage Weil wrote: v0.67.11 Dumpling === This stable update for Dumpling fixes several important bugs that affect a small set of users. We recommend that all Dumpling

[ceph-users] ceph osd replacement with shared journal device

2014-09-26 Thread Dan Van Der Ster
Hi, Apologies for this trivial question, but what is the correct procedure to replace a failed OSD that uses a shared journal device? Suppose you have 5 spinning disks (sde,sdf,sdg,sdh,sdi) and these each have a journal partition on sda (sda1-5). Now sde fails and is replaced with a new drive.
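The replacement procedure asked about here can be sketched as a dry-run script. The OSD id, device names and exact step order below are illustrative assumptions, not the list's confirmed answer; the script only prints the commands it would run.

```shell
# Hypothetical sketch: replace failed osd.4 (data on sde, journal on shared
# SSD partition sda1). Dry run: each step is printed, not executed.
OSD_ID=4
DATA_DEV=/dev/sde
JOURNAL_PART=/dev/sda1

PLAN=""
run() { PLAN="$PLAN$*;"; echo "+ $*"; }   # swap the body for "$@" to execute

# 1. Remove the dead OSD from the CRUSH map, auth database and OSD map.
run ceph osd out "$OSD_ID"
run ceph osd crush remove "osd.$OSD_ID"
run ceph auth del "osd.$OSD_ID"
run ceph osd rm "$OSD_ID"

# 2. Prepare the replacement disk, pointing ceph-disk at the existing
#    journal partition so the journal is re-initialized there.
run ceph-disk prepare "$DATA_DEV" "$JOURNAL_PART"
```

In real use you would check `ceph osd tree` between steps; later replies in this thread also discuss addressing the journal by a persistent name rather than /dev/sda1.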

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Dan Van Der Ster
Hi Wido, On 26 Sep 2014, at 23:14, Wido den Hollander w...@42on.com wrote: On 26-09-14 17:16, Dan Van Der Ster wrote: Hi, Apologies for this trivial question, but what is the correct procedure to replace a failed OSD that uses a shared journal device? Suppose you have 5 spinning disks

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Dan Van Der Ster
Hi, On 29 Sep 2014, at 10:01, Daniel Swarbrick daniel.swarbr...@profitbricks.com wrote: On 26/09/14 17:16, Dan Van Der Ster wrote: Hi, Apologies for this trivial question, but what is the correct procedure to replace a failed OSD that uses a shared journal device? I’m just curious

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Dan Van Der Ster
-disk prepare /dev/sde /dev/sda1 and try to coerce that to use the persistent name. Cheers, Dan Best of luck. Owen On 09/29/2014 10:24 AM, Dan Van Der Ster wrote: Hi, On 29 Sep 2014, at 10:01, Daniel Swarbrick daniel.swarbr...@profitbricks.com wrote: On 26/09/14 17:16

Re: [ceph-users] SSD MTBF

2014-09-29 Thread Dan Van Der Ster
Hi Emmanuel, This is interesting, because we’ve had sales guys telling us that those Samsung drives are definitely the best for a Ceph journal O_o ! The conventional wisdom has been to use the Intel DC S3700 because of its massive durability. Anyway, I’m curious what do the SMART counters say

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Dan Van Der Ster
On 29 Sep 2014, at 10:47, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi Owen, On 29 Sep 2014, at 10:33, Owen Synge osy...@suse.com wrote: Hi Dan, At least looking at upstream to get journals and partitions persistently working, this requires gpt partitions, and being able

Re: [ceph-users] SSD MTBF

2014-10-01 Thread Dan Van Der Ster
On 30 Sep 2014, at 16:38, Mark Nelson mark.nel...@inktank.com wrote: On 09/29/2014 03:58 AM, Dan Van Der Ster wrote: Hi Emmanuel, This is interesting, because we’ve had sales guys telling us that those Samsung drives are definitely the best for a Ceph journal O_o ! Our sales guys

Re: [ceph-users] CRUSH depends on host + OSD?

2014-10-15 Thread Dan van der Ster
Hi Chad, That sounds bizarre to me, and I can't reproduce it. I added an osd (which was previously not in the crush map) to a fake host=test: ceph osd crush create-or-move osd.52 1.0 rack=RJ45 host=test that resulted in some data movement of course. Then I removed that osd from the crush

Re: [ceph-users] CRUSH depends on host + OSD?

2014-10-15 Thread Dan van der Ster
Hi, October 15 2014 7:05 PM, Chad Seys cws...@physics.wisc.edu wrote: Hi Dan, I'm using Emperor (0.72). Though I would think CRUSH maps have not changed that much between versions? I'm using dumpling, with the hashpspool flag enabled, which I believe could have been the only difference. That

[ceph-users] converting legacy puppet-ceph configured OSDs to look like ceph-deployed OSDs

2014-10-15 Thread Dan van der Ster
Hi Ceph users, (sorry for the novel, but perhaps this might be useful for someone) During our current project to upgrade our cluster from disks-only to SSD journals, we've found it useful to convert our legacy puppet-ceph deployed cluster (using something like the enovance module) to one that

Re: [ceph-users] Lost monitors in a multi mon cluster

2014-10-24 Thread Dan van der Ster
Hi, October 24 2014 5:28 PM, HURTEVENT VINCENT vincent.hurtev...@univ-lyon1.fr wrote: Hello, I was running a multi-mon (3) Ceph cluster and, in a migration move, I reinstalled 2 of the 3 monitor nodes without properly deleting them from the cluster. So, there is only one monitor left

Re: [ceph-users] What a maximum theoretical and practical capacity in ceph cluster?

2014-10-27 Thread Dan van der Ster
Hi, October 27 2014 5:07 PM, Wido den Hollander w...@42on.com wrote: On 10/27/2014 04:30 PM, Mike wrote: Hello, My company is planning to build a big Ceph cluster for archiving and storing data. By requirements from customer - 70% of capacity is SATA, 30% SSD. First day data is storing

Re: [ceph-users] What a maximum theoretical and practical capacity in ceph cluster?

2014-10-28 Thread Dan Van Der Ster
On 28 Oct 2014, at 08:25, Robert van Leeuwen robert.vanleeu...@spilgames.com wrote: By now we have decided to use a SuperMicro SKU with 72 bays for HDD = 22 SSD + 50 SATA drives. Our racks can hold 10 of these servers, and 50 such racks in a Ceph cluster = 36000 OSDs. With 4TB SATA drives and

Re: [ceph-users] What a maximum theoretical and practical capacity in ceph cluster?

2014-10-28 Thread Dan Van Der Ster
On 28 Oct 2014, at 09:30, Christian Balzer ch...@gol.com wrote: On Tue, 28 Oct 2014 07:46:30 + Dan Van Der Ster wrote: On 28 Oct 2014, at 08:25, Robert van Leeuwen robert.vanleeu...@spilgames.com wrote: By now we have decided to use a SuperMicro SKU with 72 bays for HDD = 22 SSD + 50

Re: [ceph-users] Scrub proces, IO performance

2014-10-28 Thread Dan Van Der Ster
Hi, You should try the new osd_disk_thread_ioprio_class / osd_disk_thread_ioprio_priority options. Cheers, dan On 28 Oct 2014, at 09:27, Mateusz Skała mateusz.sk...@budikom.net wrote: Hello, We are using Ceph as a storage backend for KVM, used for hosting MS
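For reference, a minimal sketch of how those two options might look in ceph.conf. The idle class and priority value here are illustrative, not recommendations from the thread, and they only take effect when the OSD disks use the CFQ I/O scheduler.

```ini
[osd]
# Run the OSD disk thread (scrub, snap trim) at low I/O priority under CFQ.
osd_disk_thread_ioprio_class = idle
osd_disk_thread_ioprio_priority = 7
```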

[ceph-users] RHEL6.6 upgrade (selinux-policy-targeted) triggers slow requests

2014-10-29 Thread Dan Van Der Ster
Hi RHEL/CentOS users, This is just a heads up that we observe slow requests during the RHEL6.6 upgrade. The upgrade includes selinux-policy-targeted, which runs this during the update: /sbin/restorecon -i -f - -R -p -e /sys -e /proc -e /dev -e /mnt -e /var/tmp -e /home -e /tmp -e /dev

Re: [ceph-users] Delete pools with low priority?

2014-10-30 Thread Dan van der Ster
Hi Daniel, I can't remember if deleting a pool invokes the snap trimmer to do the actual work deleting objects. But if it does, then it is most definitely broken in everything except the latest releases (the current dumpling release doesn't have the fix yet). Given a release with those fixes (see

Re: [ceph-users] Delete pools with low priority?

2014-10-30 Thread Dan van der Ster
October 30 2014 11:32 AM, Daniel Schneller daniel.schnel...@centerdevice.com wrote: On 2014-10-30 10:14:44 +, Dan van der Ster said: Hi Daniel, I can't remember if deleting a pool invokes the snap trimmer to do the actual work deleting objects. But if it does, then it is most

Re: [ceph-users] rhel7 krbd backported module repo ?

2014-11-03 Thread Dan van der Ster
There's this one: http://gitbuilder.ceph.com/kmod-rpm-rhel7beta-x86_64-basic/ref/rhel7/x86_64/ But that hasn't been updated since July. Cheers, Dan On Mon Nov 03 2014 at 5:35:23 AM Alexandre DERUMIER aderum...@odiso.com wrote: Hi, I would like to known if a repository is available for

Re: [ceph-users] Typical 10GbE latency

2014-11-06 Thread Dan van der Ster
Between two hosts on an HP Procurve 6600, no jumbo frames: rtt min/avg/max/mdev = 0.096/0.128/0.151/0.019 ms Cheers, Dan On Thu Nov 06 2014 at 2:19:07 PM Wido den Hollander w...@42on.com wrote: Hello, While working at a customer I've ran into a 10GbE latency which seems high to me. I

Re: [ceph-users] PG inconsistency

2014-11-06 Thread Dan van der Ster
Hi, I've only ever seen (1), EIO to read a file. In this case I've always just killed / formatted / replaced that OSD completely -- that moves the PG to a new master and the new replication fixes the inconsistency. This way, I've never had to pg repair. I don't know if this is a best or even good

Re: [ceph-users] PG inconsistency

2014-11-06 Thread Dan van der Ster
IIRC, the EIO we had also correlated with a SMART status that showed the disk was bad enough for a warranty replacement -- so yes, I replaced the disk in these cases. Cheers, Dan On Thu Nov 06 2014 at 2:44:08 PM GuangYang yguan...@outlook.com wrote: Thanks Dan. By killed/formatted/replaced the

Re: [ceph-users] Reusing old journal block device w/ data causes FAILED assert(0)

2014-11-13 Thread Dan van der Ster
Hi, Did you mkjournal the reused journal? ceph-osd -i $ID --mkjournal Cheers, Dan On Thu Nov 13 2014 at 2:34:51 PM Anthony Alba ascanio.al...@gmail.com wrote: When I create a new OSD with a block device as journal that has existing data on it, ceph is causing FAILED assert. The block

Re: [ceph-users] Reusing old journal block device w/ data causes FAILED assert(0)

2014-11-13 Thread Dan van der Ster
Hi, On Thu Nov 13 2014 at 3:35:55 PM Anthony Alba ascanio.al...@gmail.com wrote: Ah no. On 13 Nov 2014 21:49, Dan van der Ster daniel.vanders...@cern.ch wrote: Hi, Did you mkjournal the reused journal? ceph-osd -i $ID --mkjournal Cheers, Dan No - however the man page states

[ceph-users] Client forward compatibility

2014-11-20 Thread Dan van der Ster
Hi all, What is compatibility/incompatibility of dumpling clients to talk to firefly and giant clusters? I know that tunables=firefly will prevent dumpling clients from talking to a firefly cluster, but how about the existence or not of erasure pools? Can a dumpling client talk to a Firefly/Giant

Re: [ceph-users] Client forward compatibility

2014-11-25 Thread Dan Van Der Ster
Hi Greg, On 24 Nov 2014, at 22:01, Gregory Farnum g...@gregs42.com wrote: On Thu, Nov 20, 2014 at 9:08 AM, Dan van der Ster daniel.vanders...@cern.ch wrote: Hi all, What is compatibility/incompatibility of dumpling clients to talk to firefly and giant clusters? We sadly don't have

Re: [ceph-users] Questions about osd journal configuration

2014-11-26 Thread Dan Van Der Ster
On 26 Nov 2014, at 13:47, Christian Balzer ch...@gol.com wrote: On Wed, 26 Nov 2014 05:37:43 -0600 Mark Nelson wrote: On 11/26/2014 04:05 AM, Yujian Peng wrote: [snip] Since the size of jornal partitions on SSDs is 10G, I want to set filestore max sync interval to 30 minutes. Is 30

Re: [ceph-users] Questions about osd journal configuration

2014-11-26 Thread Dan Van Der Ster
Hi, On 26 Nov 2014, at 17:07, Yujian Peng pengyujian5201...@126.com wrote: Thanks a lot! IOPS is a bottleneck in my cluster and the object disks are much slower than SSDs. I don't know whether SSDs will be used as caches if filestore_max_sync_interval is set to a big value. I will set

Re: [ceph-users] Questions about osd journal configuration

2014-11-26 Thread Dan Van Der Ster
On 26 Nov 2014, at 17:26, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi, On 26 Nov 2014, at 17:07, Yujian Peng pengyujian5201...@126.com wrote: Thanks a lot! IOPS is a bottleneck in my cluster and the object disks are much slower than SSDs. I don't know whether SSDs

[ceph-users] large reads become 512 byte reads on qemu-kvm rbd

2014-11-27 Thread Dan Van Der Ster
Hi all, We throttle (with qemu-kvm) rbd devices to 100 w/s and 100 r/s (and 80MB/s write and read). With fio we cannot exceed 51.2MB/s sequential or random reads, no matter the reading block size. (But with large writes we can achieve 80MB/s). I just realised that the VM subsystem is probably

Re: [ceph-users] large reads become 512 kbyte reads on qemu-kvm rbd

2014-11-27 Thread Dan Van Der Ster
, Dan On 27 Nov 2014 18:26, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi all, We throttle (with qemu-kvm) rbd devices to 100 w/s and 100 r/s (and 80MB/s write and read). With fio we cannot exceed 51.2MB/s sequential or random reads, no matter the reading block size. (But with large writes we

Re: [ceph-users] large reads become 512 kbyte reads on qemu-kvm rbd

2014-11-28 Thread Dan Van Der Ster
rule or similar that could set max_sectors_kb when an RBD device is attached? Cheers, Dan On 27 Nov 2014, at 20:29, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Oops, I was off by a factor of 1000 in my original subject. We actually have 4M and 8M reads
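One possible shape for such a rule, assuming the guest sees its RBD-backed disks as virtio devices (vd*); the file path, the match, and the 4096 value are all illustrative assumptions:

```
# /etc/udev/rules.d/99-vd-max-sectors.rules (hypothetical path)
ACTION=="add", SUBSYSTEM=="block", KERNEL=="vd[a-z]", ATTR{queue/max_sectors_kb}="4096"
```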

Re: [ceph-users] large reads become 512 kbyte reads on qemu-kvm rbd

2014-11-28 Thread Dan Van Der Ster
if this impacts performance? Like small block size performance, etc? Cheers From: Dan Van Der Ster daniel.vanders...@cern.ch To: ceph-users ceph-users@lists.ceph.com Sent: Friday, 28 November, 2014 1:33

Re: [ceph-users] Removing Snapshots Killing Cluster Performance

2014-12-01 Thread Dan Van Der Ster
Hi, Which version of Ceph are you using? This could be related: http://tracker.ceph.com/issues/9487 See ReplicatedPG: don't move on to the next snap immediately; basically, the OSD is getting into a tight loop trimming the snapshot objects. The fix above breaks out of that loop more frequently,

Re: [ceph-users] large reads become 512 kbyte reads on qemu-kvm rbd

2014-12-01 Thread Dan Van Der Ster
Hi Ilya, On 28 Nov 2014, at 17:56, Ilya Dryomov ilya.dryo...@inktank.com wrote: On Fri, Nov 28, 2014 at 5:46 PM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi Andrei, Yes, I’m testing from within the guest. Here is an example. First, I do 2MB reads when the max_sectors_kb=512

Re: [ceph-users] Largest Production Ceph Cluster

2014-04-03 Thread Dan Van Der Ster
Hi, On Apr 3, 2014 4:49 AM, Christian Balzer ch...@gol.com wrote: On Tue, 1 Apr 2014 14:18:51 + Dan Van Der Ster wrote: [snip] http://www.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern [snap] In that slide it says that replacing failed OSDs is automated via puppet. I'm

Re: [ceph-users] Small Production system with Openstack

2014-04-05 Thread Dan Van Der Ster
Hi, I'm not looking at your hardware in detail (except to say that you absolutely must have 3 monitors and that I don't know what a load balancer would be useful for in this setup), but perhaps the two parameters below may help you evaluate your system. To estimate the IOPS capacity of your
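As a worked example of the kind of estimate suggested above, a common back-of-envelope formula (an assumption here, not a quote from the thread) divides raw disk IOPS by the replica count and a journal write penalty:

```shell
DISKS=24            # number of spinning OSDs (illustrative)
DISK_IOPS=100       # ~ random IOPS of one 7.2k SATA drive
REPLICAS=3          # each client write lands on 3 OSDs
JOURNAL_PENALTY=2   # co-located journal doubles the writes on each OSD

read_iops=$(( DISKS * DISK_IOPS ))
write_iops=$(( DISKS * DISK_IOPS / (REPLICAS * JOURNAL_PENALTY) ))
echo "approx read IOPS:  $read_iops"    # 2400
echo "approx write IOPS: $write_iops"   # 400
```

With SSD journals the write penalty on the spinning disks roughly halves, which is much of the motivation for the journal threads elsewhere in this digest.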

Re: [ceph-users] ceph cluster health monitoring

2014-04-11 Thread Dan Van Der Ster
It’s pretty basic, but we run this hourly: https://github.com/cernceph/ceph-scripts/blob/master/ceph-health-cron/ceph-health-cron -- Dan van der Ster || Data Storage Services || CERN IT Department -- On 11 Apr 2014 at 09:12:13, Pavel V. Kaygorodov (pa...@inasan.ru

[ceph-users] RBD write access patterns and atime

2014-04-16 Thread Dan van der Ster
65536: 87783 131072: 87279 12288: 66735 49152: 50170 24576: 47794 262144: 45199 466944: 23064 So reads are mostly 512kB, which is probably some default read-ahead size. -- Dan van der Ster || Data Storage Services || CERN IT Department -- ___ ceph-users

Re: [ceph-users] RBD write access patterns and atime

2014-04-17 Thread Dan van der Ster
Hi, Gregory Farnum wrote: I forget which clients you're using — is rbd caching enabled? Yes, the clients are qemu-kvm-rhev with latest librbd from dumpling and rbd cache = true. Cheers, Dan -- Dan van der Ster || Data Storage Services || CERN IT Department

Re: [ceph-users] RBD write access patterns and atime

2014-04-17 Thread Dan van der Ster
The last num is the size of the write/read. Then run this: https://github.com/cernceph/ceph-scripts/blob/master/tools/rbd-io-stats.pl Cheers, Dan -- -- Dan van der Ster || Data Storage Services || CERN IT Department -- ___ ceph-users mailing list ceph

Re: [ceph-users] RBD write access patterns and atime

2014-04-17 Thread Dan van der Ster
isn't the file accesses leading to many small writes. Any other theories? Cheers, Dan -- Dan van der Ster || Data Storage Services || CERN IT Department -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph

Re: [ceph-users] OSD distribution unequally

2014-04-18 Thread Dan Van Der Ster
ceph osd reweight-by-utilization Is that still in 0.79? I'd start with reweight-by-utilization 200 and then adjust that number down until you get to 120 or so. Cheers, Dan On Apr 18, 2014 12:49 PM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: Hi, Some osds of our cluster filled up:

Re: [ceph-users] Pool with empty name recreated

2014-04-24 Thread Dan van der Ster
(d7ab4244396b57aac8b7e80812115bbd079e6b73) How can i delete it forever? -- Dan van der Ster || Data Storage Services || CERN IT Department -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Unable to add CEPH as Primary Storage - libvirt error undefined storage pool type

2014-04-28 Thread Dan van der Ster
On 28/04/14 14:54, Wido den Hollander wrote: On 04/28/2014 02:15 PM, Andrija Panic wrote: Thank you very much Wido, any suggestion on compiling libvirt with support (I already found a way) or perhaps use some prebuilt , that you would recommend ? No special suggestions, just make sure you

Re: [ceph-users] Unable to add CEPH as Primary Storage - libvirt error undefined storage pool type

2014-04-28 Thread Dan van der Ster
rbd enabled qemu, qemu-img etc from ceph.com site) I need just libvirt with rbd support? Thanks On 28 April 2014 15:05, Andrija Panic andrija.pa...@gmail.com wrote: Thanks Dan :) On 28 April 2014 15:02, Dan van der Ster

Re: [ceph-users] Replace journals disk

2014-05-06 Thread Dan Van Der Ster
I've followed this recipe successfully in the past: http://wiki.skytech.dk/index.php/Ceph_-_howto,_rbd,_lvm,_cluster#Add.2Fmove_journal_in_running_cluster On May 6, 2014 12:34 PM, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: Hi to all, I would like to replace a disk used as

Re: [ceph-users] v0.80 Firefly released

2014-05-07 Thread Dan van der Ster
Hi, Sage Weil wrote: *Primary affinity*: Ceph now has the ability to skew selection of OSDs as the primary copy, which allows the read workload to be cheaply skewed away from parts of the cluster without migrating any data. Can you please elaborate a bit on this one? I found the

[ceph-users] 0.67.7 rpms changed today??

2014-05-08 Thread Dan van der Ster
-0.67.7-0.el6.x86_64.rpm (568 kiB) Yet the dates haven't changed. Is that understood? It's not a malicious incident, is it? Cheers, Dan -- Dan van der Ster || Data Storage Services || CERN IT Department -- ___ ceph-users mailing list ceph-users

Re: [ceph-users] Bulk storage use case

2014-05-13 Thread Dan van der Ster
Hi, I think you're not getting many replies simply because those are rather large servers and not many have such hardware in prod. We run with 24x3TB drives, 64GB ram, one 10Gbit NIC. Memory-wise there are no problems. Throughput-wise, the bottleneck is somewhere between the NIC (~1GB/s) and

Re: [ceph-users] How to backup mon-data?

2014-05-23 Thread Dan Van Der Ster
to a safe place. Cheers, Dan -- Dan van der Ster || Data Storage Services || CERN IT Department -- On 23 May 2014, at 15:45, Fabian Zimmermann f.zimmerm...@xplosion.de wrote: Hello, I’m running a 3 node cluster with 2 hdd/osd and one mon on each node. Sadly the fsyncs done by mon-processes

Re: [ceph-users] Ceph and low latency kernel

2014-05-25 Thread Dan Van Der Ster
I very briefly tried kernel-rt from RH MRG, and it didn't make any noticeable difference. Though I didn't spend any time tuning things. Cheers, Dan On May 25, 2014 11:04 AM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: Hi, has anybody ever tried to use a low latency kernel for

Re: [ceph-users] v0.67.9 Dumpling released

2014-06-04 Thread Dan Van Der Ster
for osd_snap_trim_sleep = … ? Cheers, Dan -- Dan van der Ster || Data Storage Services || CERN IT Department -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] v0.67.9 Dumpling released

2014-06-04 Thread Dan Van Der Ster
On 04 Jun 2014, at 16:06, Sage Weil s...@inktank.com wrote: On Wed, 4 Jun 2014, Dan Van Der Ster wrote: Hi Sage, all, On 21 May 2014, at 22:02, Sage Weil s...@inktank.com wrote: * osd: allow snap trim throttling with simple delay (#6278, Sage Weil) Do you have some advice about how

Re: [ceph-users] v0.67.9 Dumpling released

2014-06-04 Thread Dan Van Der Ster
On 04 Jun 2014, at 16:06, Sage Weil s...@inktank.com wrote: You can adjust this on running OSDs with something like 'ceph daemon osd.NN config set osd_snap_trim_sleep .01' or with 'ceph tell osd.* injectargs -- --osd-snap-trim-sleep .01'. Thanks, trying that now. I noticed that using =
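To make such a value survive OSD restarts it can also go into ceph.conf; a minimal sketch, with 0.01 taken from the commands quoted above (whether that value suits a given cluster is workload-dependent):

```ini
[osd]
# Seconds to sleep between snapshot trim operations (throttles snap trimming).
osd_snap_trim_sleep = 0.01
```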

Re: [ceph-users] Problem installing ceph from package manager / ceph repositories

2014-06-10 Thread Dan Van Der Ster
I have not faced this, though I have done several ceph cluster installations with the package manager. I don’t want the EPEL version of Ceph. You probably need to tweak the repo priorities. We use priority=30 for epel.repo, priority=5 for ceph.repo. Cheers, Dan -- Dan van der Ster || Data Storage
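A sketch of the priorities setup described, assuming the yum-plugin-priorities package is installed so that lower numbers win; the baseurl shown is illustrative for dumpling on el6:

```ini
# /etc/yum.repos.d/ceph.repo (excerpt)
[ceph]
name=Ceph packages
baseurl=http://ceph.com/rpm-dumpling/el6/x86_64
enabled=1
priority=5

# /etc/yum.repos.d/epel.repo (excerpt)
[epel]
priority=30
```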

Re: [ceph-users] How to avoid deep-scrubbing performance hit?

2014-06-10 Thread Dan Van Der Ster
? - If OTOH a disk/op thread is switching between scrubbing and client IO responsibilities, could Ceph use ioprio_set to change the io priorities on the fly?? Cheers, Dan -- Dan van der Ster || Data Storage Services || CERN IT Department -- On 10 Jun 2014, at 00:22, Craig Lewis cle

Re: [ceph-users] PG Selection Criteria for Deep-Scrub

2014-06-11 Thread Dan Van Der Ster
Hi Greg, This tracker issue is relevant: http://tracker.ceph.com/issues/7288 Cheers, Dan On 11 Jun 2014, at 00:30, Gregory Farnum g...@inktank.com wrote: Hey Mike, has your manual scheduling resolved this? I think I saw another similar-sounding report, so a feature request to improve scrub

Re: [ceph-users] How to avoid deep-scrubbing performance hit?

2014-06-11 Thread Dan Van Der Ster
On 10 Jun 2014, at 11:59, Dan Van Der Ster daniel.vanders...@cern.ch wrote: One idea I had was to check the behaviour under different disk io schedulers, trying exploit thread io priorities with cfq. So I have a question for the developers about using ionice or ioprio_set to lower the IO

Re: [ceph-users] Throttle pool pg_num/pgp_num increase impact

2014-07-08 Thread Dan Van Der Ster
Hi Greg, We're also due for a similar splitting exercise in the not too distant future, and will also need to minimize the impact on latency. In addition to increasing pg_num in small steps and using a minimal max_backfills/recoveries configuration, I was planning to increase pgp_num very
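The gradual pgp_num increase described above can be sketched as a dry-run loop; the pool name, step size and target are assumptions, and the script prints the commands rather than running them:

```shell
# Hypothetical sketch: raise pgp_num in small steps, letting the cluster
# settle between increments. Dry run: commands are printed, not executed.
POOL=volumes
CUR=1024          # current pgp_num
TARGET=2048       # final pgp_num (pg_num assumed already raised to this)
STEP=64

PLAN=""
run() { PLAN="$PLAN$*;"; echo "+ $*"; }   # swap the body for "$@" to execute

while [ "$CUR" -lt "$TARGET" ]; do
    CUR=$(( CUR + STEP ))
    if [ "$CUR" -gt "$TARGET" ]; then CUR=$TARGET; fi
    run ceph osd pool set "$POOL" pgp_num "$CUR"
    run ceph health   # in real use: poll until HEALTH_OK before the next step
done
```

Small steps keep the amount of data moving at any one time bounded, which is the latency-protection goal stated above.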

Re: [ceph-users] Hang of ceph-osd -i (adding an OSD)

2014-07-09 Thread Dan Van Der Ster
Hi, On 09 Jul 2014, at 14:44, Robert van Leeuwen robert.vanleeu...@spilgames.com wrote: I cannot add a new OSD to a current Ceph cluster. It just hangs, here is the debug log: This is ceph 0.72.1 on CentOS. Found the issue: Although I installed the specific ceph (0.72.1) version the

Re: [ceph-users] Hang of ceph-osd -i (adding an OSD)

2014-07-09 Thread Dan Van Der Ster
On 09 Jul 2014, at 15:30, Robert van Leeuwen robert.vanleeu...@spilgames.com wrote: Which leveldb from where? 1.12.0-5 that tends to be in el6/7 repos is broken for Ceph. You need to remove the “basho fix” patch. 1.7.0 is the only readily available version that works, though it is so old

Re: [ceph-users] nf_conntrack overflow crashes OSDs

2014-08-08 Thread Dan Van Der Ster
this, and google isn’t helping. I suggest that this should be added to ceph.com docs. Cheers, Dan -- Dan van der Ster || Data Storage Services || CERN IT Department -- On 08 Aug 2014, at 10:46, Christian Kauhaus k...@gocept.com wrote: Hi, today I'd like to share a severe problem we've found

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Dan Van Der Ster
. BTW, do you throttle your clients? We found that its absolutely necessary, since without a throttle just a few active VMs can eat up the entire iops capacity of the cluster. Cheers, Dan -- Dan van der Ster || Data Storage Services || CERN IT Department -- On 08 Aug 2014, at 13:51, Andrija

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Dan Van Der Ster
On 8 August 2014 15:44, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi, Here’s what we do to identify our top RBD users. First, enable log level 10 for the filestore so you can see all the IOs coming from the VMs. Then use a script like this (used
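The script referred to is rbd-io-stats.pl in the cernceph/ceph-scripts repository; as a stand-in, the idea can be shown with a few lines of awk over fabricated log lines. The line format below is an assumption for illustration: filestore debug output names RBD objects with an rbd\udata.<image-prefix>.<object-number> pattern.

```shell
# Tally filestore writes per RBD image prefix from (fabricated) debug-10 lines.
result=$(awk '
  /filestore/ && /write/ {
    for (i = 1; i <= NF; i++)
      if (split($i, p, ".") >= 3 && p[1] ~ /rbd/)
        writes[p[2]]++            # p[2] is the image prefix
  }
  END { for (img in writes) print img, writes[img] }
' <<'EOF'
2014-08-08 10:00:01 filestore write rbd\udata.1f2a3b4c5d6e.0000000000000abc
2014-08-08 10:00:02 filestore write rbd\udata.1f2a3b4c5d6e.0000000000000abd
2014-08-08 10:00:03 filestore write rbd\udata.99ff00aa11bb.0000000000000001
EOF
)
echo "$result"
```

Mapping a prefix back to a volume is then a matter of matching it against the block_name_prefix field of `rbd info` for each image (an extra step not shown here).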

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-11 Thread Dan Van Der Ster
Hi, I changed the script to be a bit more flexible with the osd path. Give this a try again: https://github.com/cernceph/ceph-scripts/blob/master/tools/rbd-io-stats.pl Cheers, Dan -- Dan van der Ster || Data Storage Services || CERN IT Department -- On 11 Aug 2014, at 12:48, Andrija Panic

Re: [ceph-users] Serious performance problems with small file writes

2014-08-20 Thread Dan Van Der Ster
, then it might be that something else is reading the disks heavily. One thing to check is updatedb — we had to disable it from indexing /var/lib/ceph on our OSDs. Best Regards, Dan -- Dan van der Ster || Data Storage Services || CERN IT Department -- On 20 Aug 2014, at 16:39, Hugo Mills h.r.mi
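The updatedb exclusion mentioned above goes in /etc/updatedb.conf on RHEL-family hosts (a hedged fragment; the other paths shown are the stock defaults, and only /var/lib/ceph is the addition):

```
# /etc/updatedb.conf — keep mlocate's nightly scan off the OSD data dirs,
# otherwise it walks every object file on every data disk
PRUNEPATHS = "/afs /media /mnt /net /tmp /udev /var/lib/ceph"
```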

Re: [ceph-users] Serious performance problems with small file writes

2014-08-20 Thread Dan Van Der Ster
max ops/bytes, and the filestore wbthrottle xfs * options. (I’m not going to publish exact configs here because I haven’t finished tuning yet). Cheers, Dan Thanks a lot!! Best regards, German Anders On Wednesday 20/08/2014 at 11:51, Dan Van Der Ster wrote: Hi, Do you get slow

Re: [ceph-users] Serious performance problems with small file writes

2014-08-21 Thread Dan Van Der Ster
Hi Hugo, On 20 Aug 2014, at 17:54, Hugo Mills h.r.mi...@reading.ac.uk wrote: What are you using for OSD journals? On each machine, the three OSD journals live on the same ext4 filesystem on an SSD, which is also the root filesystem of the machine. Also check the CPU usage for the mons

Re: [ceph-users] MON running 'ceph -w' doesn't see OSD's booting

2014-08-21 Thread Dan Van Der Ster
Hi, You only have one OSD? I’ve seen similar strange things in test pools having only one OSD — and I kinda explained it by assuming that OSDs need peers (other OSDs sharing the same PG) to behave correctly. Install a second OSD and see how it goes... Cheers, Dan On 21 Aug 2014, at 02:59,

Re: [ceph-users] Serious performance problems with small file writes

2014-08-21 Thread Dan Van Der Ster
Hi Hugo, On 21 Aug 2014, at 14:17, Hugo Mills h.r.mi...@reading.ac.uk wrote: Not sure what you mean about colocated journal/OSD. The journals aren't on the same device as the OSDs. However, all three journals on each machine are on the same SSD. embarrassed I obviously didn’t drink

Re: [ceph-users] ulimit max user processes (-u) and non-root ceph clients

2013-09-20 Thread Dan Van Der Ster
On Sep 19, 2013, at 6:10 PM, Gregory Farnum g...@inktank.com wrote: On Wed, Sep 18, 2013 at 11:43 PM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: On Sep 18, 2013, at 11:50 PM, Gregory Farnum g...@inktank.com wrote: On Wed, Sep 18, 2013 at 6:33 AM, Dan Van Der Ster daniel.vanders

Re: [ceph-users] PG distribution scattered

2013-09-20 Thread Dan Van Der Ster
On Sep 19, 2013, at 3:43 PM, Mark Nelson mark.nel...@inktank.com wrote: If you set: osd pool default flag hashpspool = true Theoretically that will cause different pools to be distributed more randomly. The name seems to imply that it should be settable per pool. Is that possible now?
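For reference, the global default quoted above is a ceph.conf setting, and later releases also expose the flag per pool through the CLI (a hedged sketch; check your release’s documentation for when the per-pool form became available):

```
# ceph.conf fragment
[global]
osd pool default flag hashpspool = true

# Per-pool, on releases where the flag is settable after creation:
#   ceph osd pool set <poolname> hashpspool true
```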

[ceph-users] bit correctness and checksumming

2013-10-16 Thread Dan Van Der Ster
. If the checksum is already stored per object in the OSD, is this retrievable by librados? We have some applications which also need to know the checksum of the data and this would be handy if it was already calculated by Ceph. Thanks in advance! Dan van der Ster CERN

Re: [ceph-users] bit correctness and checksumming

2013-10-16 Thread Dan van der Ster
at the moment to check for myself, and the answer is relevant to this discussion anyway). Cheers, Dan Sage Weil s...@inktank.com wrote: On Wed, 16 Oct 2013, Dan Van Der Ster wrote: Hi all, There has been some confusion the past couple days at the CHEP conference during conversations about Ceph

Re: [ceph-users] bit correctness and checksumming

2013-10-16 Thread Dan van der Ster
On Wed, Oct 16, 2013 at 6:12 PM, Sage Weil s...@inktank.com wrote: 3. During deep scrub of an object with 2 replicas, suppose the checksum is different for the two objects -- which object wins? (I.e. if you store the checksum locally, this is trivial since the consistency of objects can be

Re: [ceph-users] Help with CRUSH maps

2013-10-31 Thread Dan van der Ster
On Thu, Oct 31, 2013 at 2:29 PM, Alexis GÜNST HORN alexis.gunsth...@outscale.com wrote: step take example step emit This is the problem, AFAICT. Just omit those two lines in both rules and it should work. Cheers, dan ___ ceph-users
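For comparison, a minimally well-formed replicated rule contains exactly one take/emit pair wrapped around the choose steps (a sketch with assumed bucket and type names):

```
rule example {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take example
    step chooseleaf firstn 0 type host
    step emit
}
```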

Re: [ceph-users] Help with CRUSH maps

2013-10-31 Thread Dan van der Ster
On Thu, Oct 31, 2013 at 2:29 PM, Alexis GÜNST HORN alexis.gunsth...@outscale.com wrote:
-11    0    drive hdd
-21    0    datacenter hdd-dc1
-102   0    room hdd-dc1-A
-503   0    host A-ceph-osd-2

Re: [ceph-users] Mapping rbd's on boot

2013-11-14 Thread Dan Van Der Ster
Hi, We’re trying the same, on SLC. We tried rbdmap but it seems to have some ubuntu-isms which cause errors. We also tried with rc.local, and you can map and mount easily, but at shutdown we’re seeing the still-mapped images blocking a machine from shutting down (libceph connection refused
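A minimal rc.local-style workaround (a sketch under the assumption that the pool, image, and mountpoint names are placeholders) is to map and mount at boot, and, crucially, to unmount and unmap from a shutdown script that runs before networking is torn down, so the still-mapped image cannot block the halt:

```
# /etc/rc.local (boot) — hypothetical pool/image names
rbd map mypool/myimage --id admin
mount /dev/rbd/mypool/myimage /mnt/myimage

# Shutdown counterpart, ordered before the network goes down:
umount /mnt/myimage
rbd unmap /dev/rbd/mypool/myimage
```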
