Re: [ceph-users] Blog post: storage server power consumption

2017-11-08 Thread Nick Fisk
Also look at the new WD 10TB Reds if you want very-low-use archive storage. Because they spin at 5400 RPM, they only use 2.8 W at idle. > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Jack > Sent: 06 November 2017 22:31 > To: ceph-users@lists.c

Re: [ceph-users] bluestore - wal,db on faster devices?

2017-11-08 Thread Nick Fisk
> -Original Message- > From: Mark Nelson [mailto:mnel...@redhat.com] > Sent: 08 November 2017 21:42 > To: n...@fisk.me.uk; 'Wolfgang Lendl' > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] bluestore - wal,db on faster devices? > > > >

Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-18 Thread Nick Fisk
le, say 3-4x over the total amount of RAM > in all of the nodes, helps you get a better idea of what the behavior is like > when those tricks are less effective. I think that's probably a more likely > scenario in most production environments, but it's up to you which worklo

Re: [ceph-users] ceph all-nvme mysql performance tuning

2017-11-27 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of German Anders Sent: 27 November 2017 14:44 To: Maged Mokhtar Cc: ceph-users Subject: Re: [ceph-users] ceph all-nvme mysql performance tuning Hi Maged, Thanks a lot for the response. We try with different number of t

Re: [ceph-users] what's the maximum number of OSDs per OSD server?

2017-12-10 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Igor Mendelev Sent: 10 December 2017 15:39 To: ceph-users@lists.ceph.com Subject: [ceph-users] what's the maximum number of OSDs per OSD server? Given that servers with 64 CPU cores (128 threads @ 2.7GHz) and up to 2TB RA

Re: [ceph-users] what's the maximum number of OSDs per OSD server?

2017-12-10 Thread Nick Fisk
software? Just make sure you size the nodes to a point that if one has to be taken offline for any reason, that you are happy with the resulting state of the cluster, including the peering when suddenly taking ~200 OSD’s offline/online. Nick On Sun, Dec 10, 2017 at 11:17 AM, Nic

[ceph-users] Odd object blocking IO on PG

2017-12-12 Thread Nick Fisk
Does anyone know what this object (0.ae78c1cf) might be? It's not your normal run-of-the-mill RBD object and I can't seem to find it in the pool using rados --all ls. It seems to be leaving the 0.1cf PG stuck in an activating+remapped state and blocking IO. Pool 0 is just a pure RBD pool with a ca
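
For anyone investigating something similar, a hedged sketch of the commands involved (PG id from above; pool name and OSD id are illustrative):

# Query the stuck PG for its state, up/acting sets and any blocking peers
ceph pg 0.1cf query | less
# List every object in the pool across all namespaces and search for it
rados -p rbd --all ls | grep ae78c1cf
# See which ops an OSD serving that PG is currently blocked on (osd.12 is an example id)
ceph daemon osd.12 dump_blocked_ops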

[ceph-users] Bluestore Compression not inheriting pool option

2017-12-12 Thread Nick Fisk
Hi All, Has anyone been testing the bluestore pool compression option? I have set compression=snappy on a RBD pool. When I add a new bluestore OSD, data is not being compressed when backfilling, confirmed by looking at the perf dump results. If I then set again the compression type on the pool to
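
For context, a hedged sketch of the settings and the check being described (pool name 'rbd' assumed, values illustrative):

# Compression on the pool needs both an algorithm and a mode
ceph osd pool set rbd compression_algorithm snappy
ceph osd pool set rbd compression_mode aggressive
# On the OSD host, confirm whether anything is actually being compressed
ceph daemon osd.0 perf dump | grep -i compress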

Re: [ceph-users] Odd object blocking IO on PG

2017-12-12 Thread Nick Fisk
ached) is not showing in the main status that it has been blocked from peering or that there are any missing objects. I've tried restarting all OSD's I can see relating to the PG in case they needed a bit of a nudge. > > On Tue, Dec 12, 2017 at 12:36 PM, Nick Fisk wrote: > >

Re: [ceph-users] Health Error : Request Stuck

2017-12-13 Thread Nick Fisk
Hi Karun, I too am experiencing something very similar, with a PG stuck in the activating+remapped state after re-introducing an OSD back into the cluster as Bluestore, although this new OSD is not the one listed against the PGs stuck activating. I also see the same thing as you, where the up set

Re: [ceph-users] Odd object blocking IO on PG

2017-12-13 Thread Nick Fisk
On Tue, Dec 12, 2017 at 12:33 PM Nick Fisk mailto:n...@fisk.me.uk> > wrote: > That doesn't look like an RBD object -- any idea who is > "client.34720596.1:212637720"? So I think these might be proxy ops from the cache tier, as there are also block ops on one of the

Re: [ceph-users] Health Error : Request Stuck

2017-12-13 Thread Nick Fisk
onest, not exactly sure its the correct way. P.S : I had upgraded to Luminous 12.2.2 yesterday. Karun Josy On Wed, Dec 13, 2017 at 4:31 PM, Nick Fisk mailto:n...@fisk.me.uk> > wrote: Hi Karun, I too am experiencing something very similar with a PG stuck in activatin

Re: [ceph-users] Odd object blocking IO on PG

2017-12-13 Thread Nick Fisk
alf Of Nick Fisk Sent: 13 December 2017 11:14 To: 'Gregory Farnum' Cc: 'ceph-users' Subject: Re: [ceph-users] Odd object blocking IO on PG On Tue, Dec 12, 2017 at 12:33 PM Nick Fisk mailto:n...@fisk.me.uk> > wrote: > That doesn't look like an RB

Re: [ceph-users] Bluestore Compression not inheriting pool option

2017-12-13 Thread Nick Fisk
Thanks for confirming, logged http://tracker.ceph.com/issues/22419 > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Stefan Kooman > Sent: 12 December 2017 20:35 > To: Nick Fisk > Cc: ceph-users@lists.ceph.com > Sub

Re: [ceph-users] Cache tiering on Erasure coded pools

2017-12-27 Thread Nick Fisk
Also carefully read the word-of-caution section in David's link (which is absent in the jewel version of the docs): a cache tier in front of an erasure-coded data pool for RBD is almost always a bad idea. I would say that statement is incorrect if using Bluestore. If using Bluestore, small
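
For reference, the Bluestore-era alternative being alluded to is putting RBD data straight onto the erasure-coded pool; a minimal sketch on Luminous (pool and image names assumed):

# Allow partial overwrites on the EC pool (requires all-Bluestore OSDs)
ceph osd pool set ec_data allow_ec_overwrites true
# Keep RBD metadata on a replicated pool and point the data at the EC pool - no cache tier needed
rbd create rbd/myimage --size 100G --data-pool ec_data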

[ceph-users] Linux Meltdown (KPTI) fix and how it affects performance?

2018-01-04 Thread Nick Fisk
Hi All, As the KPTI fix largely only affects performance where a large number of syscalls are made, which Ceph does a lot of, I was wondering if anybody has had a chance to perform any initial tests. I suspect small write latencies will be the worst affected? Although I'm thinking the back
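
A hedged way to get before/after numbers on a test node (parameter and message names as of early-2018 kernels; check your distro's documentation):

# Confirm whether page-table isolation is active on the running kernel
dmesg | grep -i isolation
# For an A/B test, boot once with the mitigation disabled by adding "nopti"
# to GRUB_CMDLINE_LINUX in /etc/default/grub, then (Debian/Ubuntu helper):
update-grub && reboot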

Re: [ceph-users] Cluster crash - FAILED assert(interval.last > last)

2018-01-11 Thread Nick Fisk
I take my hat off to you, well done for solving that!!! > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Zdenek Janda > Sent: 11 January 2018 13:01 > To: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Cluster crash - FAILED assert(int

Re: [ceph-users] Ubuntu 17.10 or Debian 9.3 + Luminous = random OS hang ?

2018-01-21 Thread Nick Fisk
How up to date is your VM environment? We saw something very similar last year with Linux VM’s running newish kernels. It turns out newer kernels supported a new feature of the vmxnet3 adapters which had a bug in ESXi. The fix was release last year some time in ESXi6.5 U1, or a workaround was to

Re: [ceph-users] What is the should be the expected latency of 10Gbit network connections

2018-01-22 Thread Nick Fisk
Anyone with 25G ethernet willing to do the test? Would love to see what the latency figures are for that. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Maged Mokhtar Sent: 22 January 2018 11:28 To: ceph-users@lists.ceph.com Subject: Re: [ceph-users] What is the shou
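
For anyone wanting to contribute numbers, a hedged example of how the round-trip latency is typically measured (not necessarily the exact commands used earlier in the thread; addresses are placeholders):

# Flood ping (as root) between two cluster nodes; read the min/avg/max rtt line
ping -f -q -c 100000 10.0.0.2
# Or, with a bare "qperf" already running on the far host:
qperf 10.0.0.2 tcp_lat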

Re: [ceph-users] OSD servers swapping despite having free memory capacity

2018-01-24 Thread Nick Fisk
I know this may be a bit vague, but it also suggests the "try a newer kernel" approach. We had constant problems with hosts mounting a number of RBD volumes formatted with XFS. The servers would start aggressively swapping even though the actual memory in use was nowhere near even 50%, and eventuall

Re: [ceph-users] BlueStore.cc: 9363: FAILED assert(0 == "unexpected error")

2018-01-26 Thread Nick Fisk
I can see this in the logs: 2018-01-25 06:05:56.292124 7f37fa6ea700 -1 log_channel(cluster) log [ERR] : full status failsafe engaged, dropping updates, now 101% full 2018-01-25 06:05:56.325404 7f3803f9c700 -1 bluestore(/var/lib/ceph/osd/ceph-9) _do_alloc_write failed to reserve 0x4000 2018-
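
For context, a hedged sketch of the commands for inspecting space and the ratios involved (values are examples; the real fix for an OSD past its failsafe is freeing space or adding capacity):

ceph df
ceph osd df tree                   # identify the OSDs that are actually full
# Luminous keeps the nearfull/full ratios as cluster-wide settings; the
# "failsafe" in the log above is the per-OSD osd_failsafe_full_ratio (default 0.97).
ceph osd set-nearfull-ratio 0.90
ceph osd set-full-ratio 0.95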

Re: [ceph-users] pgs down after adding 260 OSDs & increasing PGs

2018-01-29 Thread Nick Fisk
Hi Jake, I suspect you have hit an issue that a few others and I have hit in Luminous. By increasing the number of PGs before all the data has re-balanced, you have probably exceeded the hard PG-per-OSD limit. See this thread https://www.spinics.net/lists/ceph-users/msg41231.html Nick > -Orig
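
For reference, a hedged sketch of the Luminous limit involved and how it can be relaxed while the re-balance completes (defaults quoted from memory; raise values only temporarily):

# The hard cap is roughly mon_max_pg_per_osd (default 200) multiplied by
# osd_max_pg_per_osd_hard_ratio (default 2); above it OSDs refuse to activate new PGs.
ceph tell 'mon.*' injectargs '--mon_max_pg_per_osd 400'
ceph tell 'osd.*' injectargs '--osd_max_pg_per_osd_hard_ratio 4'
# If either is rejected at runtime (or the mon wildcard is not accepted),
# set the same values in ceph.conf and restart the daemons.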

Re: [ceph-users] RBD Watch Notify for snapshots

2016-08-22 Thread Nick Fisk
ink. Nick > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Nick > Fisk > Sent: 08 July 2016 09:58 > To: dilla...@redhat.com > Cc: 'ceph-users' > Subject: Re: [ceph-users] RBD Watch Notify for snapshots > > Thank

[ceph-users] udev rule to set readahead on Ceph RBD's

2016-08-22 Thread Nick Fisk
Hope it's useful to someone https://gist.github.com/fiskn/6c135ab218d35e8b53ec0148fca47bf6 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
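
The gist itself isn't reproduced here, but a rule along these lines is the usual shape (a sketch, not necessarily the gist's exact content; 4096 KB is an example value):

cat > /etc/udev/rules.d/99-rbd-readahead.rules <<'EOF'
# Bump readahead on RBD block devices as they appear
KERNEL=="rbd[0-9]*", ACTION=="add", ATTR{queue/read_ahead_kb}="4096"
EOF
udevadm control --reload-rules && udevadm trigger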

Re: [ceph-users] RBD Watch Notify for snapshots

2016-08-22 Thread Nick Fisk
> -Original Message- > From: Ilya Dryomov [mailto:idryo...@gmail.com] > Sent: 22 August 2016 14:53 > To: Nick Fisk > Cc: Jason Dillaman ; ceph-users > > Subject: Re: [ceph-users] RBD Watch Notify for snapshots > > On Mon, Aug 22, 2016 at 3:13 PM, Nic

Re: [ceph-users] udev rule to set readahead on Ceph RBD's

2016-08-22 Thread Nick Fisk
> -Original Message- > From: Ilya Dryomov [mailto:idryo...@gmail.com] > Sent: 22 August 2016 15:16 > To: Nick Fisk > Cc: ceph-users > Subject: Re: [ceph-users] udev rule to set readahead on Ceph RBD's > > On Mon, Aug 22, 2016 at 3:17 PM, Nick Fisk wrote: &g

Re: [ceph-users] RBD Watch Notify for snapshots

2016-08-22 Thread Nick Fisk
> -Original Message- > From: Ilya Dryomov [mailto:idryo...@gmail.com] > Sent: 22 August 2016 15:00 > To: Jason Dillaman > Cc: Nick Fisk ; ceph-users > Subject: Re: [ceph-users] RBD Watch Notify for snapshots > > On Fri, Jul 8, 2016 at 5:02 AM, Jason Dillaman

Re: [ceph-users] udev rule to set readahead on Ceph RBD's

2016-08-22 Thread Nick Fisk
> -Original Message- > From: Wido den Hollander [mailto:w...@42on.com] > Sent: 22 August 2016 18:22 > To: ceph-users ; n...@fisk.me.uk > Subject: Re: [ceph-users] udev rule to set readahead on Ceph RBD's > > > > Op 22 augustus 2016 om 15:17 schreef N

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-08-22 Thread Nick Fisk
> -Original Message- > From: Christian Balzer [mailto:ch...@gol.com] > Sent: 22 August 2016 03:00 > To: 'ceph-users' > Cc: Nick Fisk > Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance > > > Hello, > > On Sun,

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-08-22 Thread Nick Fisk
From: Alex Gorbachev [mailto:a...@iss-integration.com] Sent: 22 August 2016 20:30 To: Nick Fisk Cc: Wilhelm Redbrake ; Horace Ng ; ceph-users Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance On Sunday, August 21, 2016, Wilhelm Redbrake mailto:w...@globe.de

Re: [ceph-users] udev rule to set readahead on Ceph RBD's

2016-08-23 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alex > Gorbachev > Sent: 23 August 2016 16:43 > To: Wido den Hollander > Cc: ceph-users ; Nick Fisk > Subject: Re: [ceph-users] udev rule to set readahead on Ceph RBD&

Re: [ceph-users] udev rule to set readahead on Ceph RBD's

2016-08-23 Thread Nick Fisk
> -Original Message- > From: Wido den Hollander [mailto:w...@42on.com] > Sent: 23 August 2016 19:45 > To: Ilya Dryomov ; Nick Fisk > Cc: ceph-users > Subject: Re: [ceph-users] udev rule to set readahead on Ceph RBD's > > > > Op 23 augustus

Re: [ceph-users] udev rule to set readahead on Ceph RBD's

2016-08-24 Thread Nick Fisk
> -Original Message- > From: Wido den Hollander [mailto:w...@42on.com] > Sent: 24 August 2016 07:08 > To: Ilya Dryomov ; n...@fisk.me.uk > Cc: ceph-users > Subject: RE: [ceph-users] udev rule to set readahead on Ceph RBD's > > > > Op 23 augus

Re: [ceph-users] RBD Watch Notify for snapshots

2016-08-24 Thread Nick Fisk
> -Original Message- > From: Jason Dillaman [mailto:jdill...@redhat.com] > Sent: 23 August 2016 13:23 > To: Nick Fisk > Cc: ceph-users > Subject: Re: [ceph-users] RBD Watch Notify for snapshots > > Looks good. Since you are re-using the RBD header

Re: [ceph-users] Storcium has been certified by VMWare

2016-08-26 Thread Nick Fisk
Well done Alex, I know the challenges you have worked through to attain this. > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alex > Gorbachev > Sent: 26 August 2016 15:53 > To: scst-de...@lists.sourceforge.net; ceph-users > Subject: [ceph-

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-08-31 Thread Nick Fisk
From: w...@globe.de [mailto:w...@globe.de] Sent: 30 August 2016 18:40 To: n...@fisk.me.uk; 'Alex Gorbachev' Cc: 'Horace Ng' Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance Hi Nick, here are my answers and questions... On 30.08.16 at 19:

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-08-31 Thread Nick Fisk
omes along, you will be in a better place to take advantage of it. Kind Regards! On 31.08.16 at 09:51, Nick Fisk wrote: From: w...@globe.de <mailto:w...@globe.de> [mailto:w...@globe.de] Sent: 30 August 2016 18:40 To: n...@fisk.me.uk <mailto:n...@fisk.me.uk> ;

Re: [ceph-users] Slow Request on OSD

2016-09-01 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Wido > den Hollander > Sent: 01 September 2016 08:19 > To: Reed Dier > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Slow Request on OSD > > > > Op 31 augustus 2016 om 23:21 schre

Re: [ceph-users] vmware + iscsi + tgt + reservations

2016-09-02 Thread Nick Fisk
Have you disabled the vaai functions in ESXi? I can't remember off the top of my head, but one of them makes everything slow to a crawl. > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Oliver Dzombic > Sent: 02 September 2016 09:50 > To:
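
For anyone searching later, the VAAI primitives can be toggled from the ESXi shell; a hedged sketch (disable one at a time to find the offender, and re-enable afterwards):

esxcli system settings advanced set -o /DataMover/HardwareAcceleratedMove -i 0   # XCOPY / clone offload
esxcli system settings advanced set -o /DataMover/HardwareAcceleratedInit -i 0   # WRITE SAME / zeroing
esxcli system settings advanced set -o /VMFS3/HardwareAcceleratedLocking -i 0    # ATS locking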

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-09-04 Thread Nick Fisk
From: Alex Gorbachev [mailto:a...@iss-integration.com] Sent: 04 September 2016 04:45 To: Nick Fisk Cc: Wilhelm Redbrake ; Horace Ng ; ceph-users Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance On Saturday, September 3, 2016, Alex Gorbachev mailto:a...@iss

Re: [ceph-users] RBD Watch Notify for snapshots

2016-09-06 Thread Nick Fisk
atch_check function, do I need to call this periodically to check that the watch is still active? Thanks, Nick > -Original Message- > From: Jason Dillaman [mailto:jdill...@redhat.com] > Sent: 24 August 2016 15:54 > To: Nick Fisk > Cc: ceph-users > Subject: Re: [ceph-users] RB

Re: [ceph-users] Single Threaded performance for Ceph MDS

2016-09-06 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of John > Spray > Sent: 06 September 2016 13:44 > To: Wido den Hollander > Cc: ceph-users > Subject: Re: [ceph-users] Single Threaded performance for Ceph MDS > > On Tue, Sep 6, 2016 at 1:12 PM,

Re: [ceph-users] RBD Watch Notify for snapshots

2016-09-06 Thread Nick Fisk
Thanks for the hint, I will update my code. > -Original Message- > From: Jason Dillaman [mailto:jdill...@redhat.com] > Sent: 06 September 2016 14:44 > To: Nick Fisk > Cc: ceph-users > Subject: Re: [ceph-users] RBD Watch Notify for snapshots > > If you re

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-09-11 Thread Nick Fisk
From: Alex Gorbachev [mailto:a...@iss-integration.com] Sent: 11 September 2016 16:14 To: Nick Fisk Cc: Wilhelm Redbrake ; Horace Ng ; ceph-users Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance On Sun, Sep 4, 2016 at 4:48 PM, Nick Fisk mailto:n...@fisk.me.uk

Re: [ceph-users] Ceph + VMware + Single Thread Performance

2016-09-11 Thread Nick Fisk
> -Original Message- > From: Alex Gorbachev [mailto:a...@iss-integration.com] > Sent: 11 September 2016 03:17 > To: Nick Fisk > Cc: Wilhelm Redbrake ; Horace Ng ; > ceph-users > Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance > > C

Re: [ceph-users] Replacing a failed OSD

2016-09-15 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jim > Kilborn > Sent: 14 September 2016 20:30 > To: Reed Dier > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Replacing a failed OSD > > Reed, > > > > Thanks for the response.

[ceph-users] RBD Snapshots and osd_snap_trim_sleep

2016-09-19 Thread Nick Fisk
Hi, Does the osd_snap_trim_sleep throttle affect the deletion of RBD snapshots? I've done some searching but am seeing conflicting results on whether this only affects RADOS pool snapshots. I've just deleted a snapshot which comprised somewhere around 150k objects, and it brought the cluste
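
For reference, a hedged example of applying the throttle on a running cluster (the 0.1 s value is illustrative):

# Insert a sleep between snap-trim operations on every OSD
ceph tell 'osd.*' injectargs '--osd_snap_trim_sleep 0.1'
# Confirm the value now in effect on one OSD
ceph daemon osd.0 config get osd_snap_trim_sleep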

Re: [ceph-users] RBD Snapshots and osd_snap_trim_sleep

2016-09-19 Thread Nick Fisk
> -Original Message- > From: Dan van der Ster [mailto:d...@vanderster.com] > Sent: 19 September 2016 12:11 > To: Nick Fisk > Cc: ceph-users > Subject: Re: [ceph-users] RBD Snapshots and osd_snap_trim_sleep > > Hi Nick, > > I assume you had osd_snap_tri

Re: [ceph-users] capacity planning - iops

2016-09-19 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Matteo Dacrema Sent: 19 September 2016 15:24 To: ceph-users@lists.ceph.com Subject: [ceph-users] capacity planning - iops Hi All, I’m trying to estimate how many iops ( 4k direct random write ) my ceph cluster
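
For context, a hedged example of the kind of measurement being asked about, using fio's rbd engine against a scratch image (pool, image and runtime values are placeholders):

fio --name=4k-randwrite --ioengine=rbd --clientname=admin --pool=rbd \
    --rbdname=fio-test --rw=randwrite --bs=4k --direct=1 \
    --iodepth=32 --numjobs=1 --runtime=60 --time_based --group_reporting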

Re: [ceph-users] Snap delete performance impact

2016-09-22 Thread Nick Fisk
Hi Adrian, I have also hit this recently and have since increased the osd_snap_trim_sleep to try and stop this from happening again. However, I haven't had an opportunity to actually try and break it again yet, but your mail seems to suggest it might not be the silver bullet I was looking for.

Re: [ceph-users] rbd pool:replica size choose: 2 vs 3

2016-09-23 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Ja. > C.A. > Sent: 23 September 2016 09:50 > To: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] rbd pool:replica size choose: 2 vs 3 > > Hi > > with rep_size=2 and min_size=2, what dr

Re: [ceph-users] Snap delete performance impact

2016-09-23 Thread Nick Fisk
> much concurrent work. As they have inherited a setting targeted for > > SSDs, so I have wound that back to defaults on those machines see if it > > makes a difference. > > > > But I suspect going by the disk activity there is a lot of very small > > FS metadata u

Re: [ceph-users] rbd pool:replica size choose: 2 vs 3

2016-09-23 Thread Nick Fisk
mmmok. > > and, how would the affected PG recover, just replacing the affected OSD/DISK? > or would the affected PG migrate to other OSD/DISK? Yes, Ceph would start recovering the PGs to other OSDs. But until the PG again has at least min_size copies, IO will be blocked. >
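
For completeness, a hedged sketch of checking and setting these values (pool name assumed):

ceph osd pool get rbd size
ceph osd pool get rbd min_size
# The common recommendation: three copies, writes still allowed with two present
ceph osd pool set rbd size 3
ceph osd pool set rbd min_size 2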

Re: [ceph-users] Snap delete performance impact

2016-09-23 Thread Nick Fisk
is not enough to worry the current limit. Sent from my SAMSUNG Galaxy S7 on the Telstra Mobile Network Original message From: Nick Fisk mailto:n...@fisk.me.uk> > Date: 23/09/2016 7:26 PM (GMT+10:00) To: Adrian Saul mailto:adrian.s...@tpgtelecom.

Re: [ceph-users] ceph write performance issue

2016-09-29 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of min fang Sent: 29 September 2016 10:34 To: ceph-users Subject: [ceph-users] ceph write performance issue Hi, I created 40 osds ceph cluster with 8 PM863 960G SSD as journal. One ssd is used by 5 osd drives as journal.

Re: [ceph-users] Interested in Ceph, but have performance questions

2016-09-29 Thread Nick Fisk
Hi Gerald, I would say it’s definitely possible. I would invest in the networking to make sure you have enough bandwidth, and choose disks based on performance rather than capacity. Either lots of lower-capacity disks or SSDs would be best. The biggest challenge may be around t

Re: [ceph-users] production cluster down :(

2016-09-30 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Oliver Dzombic > Sent: 30 September 2016 14:16 > To: ceph-users@lists.ceph.com > Subject: [ceph-users] production cluster down :( > > Hi, > > we have: > > ceph version 10.2.2 > > heal

Re: [ceph-users] Blog post about Ceph cache tiers - feedback welcome

2016-10-02 Thread Nick Fisk
Hi Sascha, Good article. You might want to add a small section about these two variables: osd_agent_max_high_ops and osd_agent_max_ops. They control how many concurrent flushes happen at the high/low thresholds, i.e. you can set the low one to 1 to minimise the impact on client IO. Also the target_max
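
For reference, a hedged sketch of where these knobs live (the osd_agent_* values are OSD options, the target/cache settings are pool properties; pool name and numbers are illustrative):

ceph tell 'osd.*' injectargs '--osd_agent_max_ops 1 --osd_agent_max_high_ops 2'
ceph osd pool set hot-cache target_max_bytes 1099511627776   # 1 TiB
ceph osd pool set hot-cache cache_target_dirty_ratio 0.4
ceph osd pool set hot-cache cache_target_full_ratio 0.8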

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardware planning / agreement

2016-10-04 Thread Nick Fisk
Hi, Comments inline > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Denny Fuchs > Sent: 04 October 2016 14:43 > To: ceph-users@lists.ceph.com > Subject: [ceph-users] 6 Node cluster with 24 SSD per node: Hardware planning > / agreement >

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardware planning / agreement

2016-10-04 Thread Nick Fisk
g / agreement > > Hi, > > thanks for take a look :-) > > Am 04.10.2016 16:11, schrieb Nick Fisk: > > >> We have two goals: > >> > >> * High availability > >> * Short latency for our transaction services > > > > How Low? See belo

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-10-05 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Denny Fuchs > Sent: 05 October 2016 12:43 > To: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] 6 Node cluster with 24 SSD per node: > Hardwareplanning/ agreement > > hi, > > I get a

Re: [ceph-users] RBD with SSD journals and SAS OSDs

2016-10-17 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > William Josefsson > Sent: 17 October 2016 09:31 > To: Christian Balzer > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] RBD with SSD journals and SAS OSDs > > Thx Christian for he

Re: [ceph-users] RBD with SSD journals and SAS OSDs

2016-10-17 Thread Nick Fisk
1476.187 > cpu MHz : 2545.125 > cpu MHz : 2792.718 > cpu MHz : 2630.156 > cpu MHz : 3090.750 > cpu MHz : 2951.906 > cpu MHz : 2845.875 > cpu MHz : 2553.281 > cpu MHz : 2602.125 > cpu MHz : 2600.906 > cp

Re: [ceph-users] RBD with SSD journals and SAS OSDs

2016-10-20 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > William Josefsson > Sent: 20 October 2016 10:25 > To: Nick Fisk > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] RBD with SSD journals and SAS OSDs > &

[ceph-users] Ceph and TCP States

2016-10-21 Thread Nick Fisk
Hi, I'm just testing out using a Ceph client in a DMZ behind a FW from the main Ceph cluster. One thing I have noticed is that if the state table on the FW is emptied maybe by restarting it or just clearing the state table...etc. Then the Ceph client will hang for a long time as the TCP session
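
A couple of knobs relevant here, offered as a hedged starting point rather than the thread's conclusion (values illustrative):

# On the client, shorten how long the Ceph messenger waits before declaring a TCP
# session dead (default 900 s), e.g. in ceph.conf under [client]: ms tcp read timeout = 60
# More aggressive kernel keepalives also help dead sessions get noticed sooner:
sysctl -w net.ipv4.tcp_keepalive_time=60 net.ipv4.tcp_keepalive_intvl=10 net.ipv4.tcp_keepalive_probes=3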

Re: [ceph-users] Ceph and TCP States

2016-10-21 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Haomai Wang > Sent: 21 October 2016 15:28 > To: Nick Fisk > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Ceph and TCP States > > > > On Fr

Re: [ceph-users] Ceph and TCP States

2016-10-21 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Haomai Wang > Sent: 21 October 2016 15:40 > To: Nick Fisk > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Ceph and TCP States > > > > On Fr

Re: [ceph-users] cache tiering deprecated in RHCS 2.0

2016-10-23 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Zoltan Arnold Nagy > Sent: 22 October 2016 15:13 > To: ceph-users > Subject: [ceph-users] cache tiering deprecated in RHCS 2.0 > > Hi, > > The 2.0 release notes for Red Hat Ceph Storage dep

Re: [ceph-users] Three tier cache

2016-10-23 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Robert Sanders > Sent: 22 October 2016 03:44 > To: ceph-us...@ceph.com > Subject: [ceph-users] Three tier cache > > Hello, > > Is it possible to create a three level cache tier? Searching d

Re: [ceph-users] cache tiering deprecated in RHCS 2.0

2016-10-23 Thread Nick Fisk
From: Robert Sanders [mailto:rlsand...@gmail.com] Sent: 23 October 2016 16:32 To: n...@fisk.me.uk Cc: ceph-users Subject: Re: [ceph-users] cache tiering deprecated in RHCS 2.0 On Oct 23, 2016, at 4:32 AM, Nick Fisk mailto:n...@fisk.me.uk> > wrote: Unofficial answer but I susp

Re: [ceph-users] New cephfs cluster performance issues- Jewel - cache pressure, capability release, poor iostat await avg queue size

2016-10-24 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Christian Balzer > Sent: 24 October 2016 02:30 > To: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] New cephfs cluster performance issues- Jewel - > cache pressure, capability release,

Re: [ceph-users] Ceph and TCP States

2016-10-24 Thread Nick Fisk
> -Original Message- > From: Yan, Zheng [mailto:uker...@gmail.com] > Sent: 24 October 2016 10:19 > To: Gregory Farnum > Cc: Nick Fisk ; Zheng Yan ; Ceph Users > > Subject: Re: [ceph-users] Ceph and TCP States > > X-Assp-URIBL failed: 'ceph-users-ceph

Re: [ceph-users] Ceph and TCP States

2016-10-24 Thread Nick Fisk
> -Original Message- > From: Ilya Dryomov [mailto:idryo...@gmail.com] > Sent: 24 October 2016 10:33 > To: Nick Fisk > Cc: Yan, Zheng ; Gregory Farnum ; > Zheng Yan ; Ceph Users us...@lists.ceph.com> > Subject: Re: [ceph-users] Ceph and TCP States > > On

Re: [ceph-users] Ceph and TCP States

2016-10-24 Thread Nick Fisk
> -Original Message- > From: Ilya Dryomov [mailto:idryo...@gmail.com] > Sent: 24 October 2016 14:45 > To: Nick Fisk > Cc: Yan, Zheng ; Gregory Farnum ; > Zheng Yan ; Ceph Users us...@lists.ceph.com> > Subject: Re: [ceph-users] Ceph and TCP States > > On

Re: [ceph-users] 10Gbit switch advice for small ceph cluster upgrade

2016-10-27 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Oliver Dzombic > Sent: 27 October 2016 14:16 > To: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] 10Gbit switch advice for small ceph cluster upgrade > > Hi, > > i can recommand > >

[ceph-users] MDS Problems - Solved but reporting for benefit of others

2016-11-02 Thread Nick Fisk
Hi all, Just a bit of an outage with CephFS around the MDS's. I managed to get everything up and running again after a bit of head scratching and thought I would share here what happened. Cause: I believe the MDS's, which were running as VM's, suffered when the hypervisor ran out of RAM and starte

Re: [ceph-users] MDS Problems - Solved but reporting for benefit of others

2016-11-02 Thread Nick Fisk
be interested if that bug is related or if I have stumbled on something new. Nick > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Nick > Fisk > Sent: 02 November 2016 17:58 > To: 'Ceph Users' > Subject: [ceph-

[ceph-users] MDS Problems

2016-11-04 Thread Nick Fisk
gt; > > > >&, std::__cxx11::basic_string, std::allocator >, ceph::buffer::list&)+0xa8) [0x55c691a10db8] 8: (AdminSocket::do_accept()+0x1267) [0x55c691e1c987] 9: (AdminSocket::entry()+0x298) [0x55c691e1df48] 10: (()+0x770a) [0x7f925215070a] 11: (clone()+0x6d) [0x7f92506

Re: [ceph-users] MDS Problems

2016-11-04 Thread Nick Fisk
Hi John, thanks for your response > -Original Message- > From: John Spray [mailto:jsp...@redhat.com] > Sent: 04 November 2016 14:26 > To: n...@fisk.me.uk > Cc: Ceph Users > Subject: Re: [ceph-users] MDS Problems > > On Fri, Nov 4, 2016 at 2:54 PM, Nick Fisk

[ceph-users] Scrubbing not using Idle thread?

2016-11-08 Thread Nick Fisk
Hi, I have all the normal options set in ceph.conf (priority and class for the disk threads), however scrubs look like they are running at the standard BE/4 class in iotop. Running 10.2.3. E.g. PG dump (shows that OSD 1 will be scrubbing): pg_stat objects mip degr misp unf bytes log
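
For reference, the options referred to above, plus the Jewel-era knobs that now govern scrub impact in the unified op queue (a hedged sketch; values illustrative):

# Pre-Jewel style: run the disk thread (scrub/snap-trim) at idle I/O priority
# (only takes effect with the CFQ scheduler)
ceph tell 'osd.*' injectargs '--osd_disk_thread_ioprio_class idle --osd_disk_thread_ioprio_priority 7'
# Jewel onwards scrubbing runs in the unified op queue, so throttle it there instead
ceph tell 'osd.*' injectargs '--osd_scrub_sleep 0.1 --osd_scrub_priority 1'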

Re: [ceph-users] Scrubbing not using Idle thread?

2016-11-08 Thread Nick Fisk
> -Original Message- > From: Dan van der Ster [mailto:d...@vanderster.com] > Sent: 08 November 2016 08:38 > To: Nick Fisk > Cc: Ceph Users > Subject: Re: [ceph-users] Scrubbing not using Idle thread? > > Hi Nick, > > That's expected since jewel, w

Re: [ceph-users] MDS Problems - Solved but reporting for benefit of others

2016-11-09 Thread Nick Fisk
> -Original Message- > From: Gregory Farnum [mailto:gfar...@redhat.com] > Sent: 08 November 2016 22:55 > To: Nick Fisk > Cc: Ceph Users > Subject: Re: [ceph-users] MDS Problems - Solved but reporting for benefit of > others > > On Wed, Nov 2, 2016 at 2:49

Re: [ceph-users] 6 Node cluster with 24 SSD per node: Hardwareplanning/ agreement

2016-11-11 Thread Nick Fisk
Hi Matteo, > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Matteo Dacrema > Sent: 11 November 2016 10:57 > To: Christian Balzer > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] 6 Node cluster with 24 SSD per node: > Hardwarepl

[ceph-users] Ceph Blog Articles

2016-11-11 Thread Nick Fisk
Hi All, I've recently put together some articles around some of the performance testing I have been doing. The first explores the high level theory behind latency in a Ceph infrastructure and what we have managed to achieve. http://www.sys-pro.co.uk/ceph-write-latency/ The second explores som

Re: [ceph-users] Ceph Blog Articles

2016-11-11 Thread Nick Fisk
Hi, Yes, I specifically wanted to make sure the disk part of the infrastructure didn't affect the results; the main aims were to reduce the end-to-end latency in the journals and Ceph code by utilising fast CPUs and NVMe journals. SQL transaction logs are a good example where this low latency,
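
For anyone wanting to reproduce that sort of figure, a hedged example of a queue-depth-1 sync-write latency test on an RBD-backed filesystem (paths and sizes are placeholders, not the exact job used for the articles):

# Single-threaded 4k sync writes - latency-bound, similar to a DB transaction log
fio --name=qd1-sync-write --filename=/mnt/rbd/fio.test --size=1G \
    --rw=write --bs=4k --sync=1 --iodepth=1 --numjobs=1 \
    --runtime=60 --time_based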

Re: [ceph-users] Ceph Blog Articles

2016-11-14 Thread Nick Fisk
uot;fio" or something > else? Would you be willing to attach to the article the relevant part of > the benchmark tool configuration? >Thanks! > > Fulvio > > Original Message > Subject: [ceph-users] Ceph Blog Articles > From: N

Re: [ceph-users] Ceph Blog Articles

2016-11-14 Thread Nick Fisk
Hi Maged, I would imagine as soon as you start saturating the disks, the latency impact would make the savings from the fast CPU's pointless. Really you would only try and optimise the latency if you are using SSD based cluster. This was only done with spinning disks in our case with a low Que

Re: [ceph-users] Ceph Blog Articles

2016-11-14 Thread Nick Fisk
res are near peak utilization. > > Cheers /Maged > > ------ > From: "Nick Fisk" > Sent: Monday, November 14, 2016 11:41 AM > To: "'Maged Mokhtar'" ; > Subject: RE: [ceph-users] Ceph Blog Articles > > > Hi Maged, > >

Re: [ceph-users] Ceph Blog Articles

2016-11-14 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > William Josefsson > Sent: 14 November 2016 14:46 > To: Nick Fisk > Cc: Ceph Users > Subject: Re: [ceph-users] Ceph Blog Articles > > Hi Nick, I found the graph

[ceph-users] After OSD Flap - FAILED assert(oi.version == i->first)

2016-11-15 Thread Nick Fisk
Hi, I have two OSD's which are failing with an assert which looks related to missing objects. This happened after a large RBD snapshot was deleted causing several OSD's to start flapping as they experienced high load. Cluster is fully recovered and I don't need any help from a recovery perspecti

[ceph-users] Kernel 4.7 on OSD nodes

2016-11-15 Thread Nick Fisk
Hi All, Just a slight note of caution. I had been running the 4.7 kernel (With Ubuntu 16.04) on the majority of my OSD Nodes, as when I installed the cluster there was that outstanding panic bug with the 4.4 kernel. I have been experiencing a lot of flapping OSD's every time the cluster was p

Re: [ceph-users] Fwd: iSCSI Lun issue after MON Out Of Memory

2016-11-16 Thread Nick Fisk
I assume you mean you only had 1 mon and it crashed, so effectively the iSCSI suddenly went offline? I suspect somehow that you have corrupted the NTFS volume, are there any errors in the event log? You may be able to use some disk recovery tools to try and fix the FS. Maybe also try mou

Re: [ceph-users] ceph cluster having blocke requests very frequently

2016-11-16 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Thomas Danan Sent: 15 November 2016 21:14 To: Peter Maloney Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] ceph cluster having blocke requests very frequently Very interesting ... Any idea why optimal

Re: [ceph-users] ceph cluster having blocke requests very frequently

2016-11-16 Thread Nick Fisk
rmances in that case. We have just deactivated it and deleted all snapshots. Will notify you if it drastically reduce the blocked ops and consequently the IO freeze on client side. Thanks Thomas From: Nick Fisk [mailto:n...@fisk.me.uk] Sent: mercredi 16 novembre 2016 13:25 To: Thomas D

Re: [ceph-users] Help needed ! cluster unstable after upgrade from Hammer to Jewel

2016-11-16 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Vincent Godin Sent: 16 November 2016 18:02 To: ceph-users Subject: [ceph-users] Help needed ! cluster unstable after upgrade from Hammer to Jewel Hello, We now have a full cluster (Mon, OSD & Clients) in jewel 10.2.

Re: [ceph-users] how possible is that ceph cluster crash

2016-11-16 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Pedro Benites > Sent: 16 November 2016 17:51 > To: ceph-users@lists.ceph.com > Subject: [ceph-users] how possible is that ceph cluster crash > > Hi, > > I have a ceph cluster with 50 TB, w

Re: [ceph-users] ceph cluster having blocke requests very frequently

2016-11-16 Thread Nick Fisk
ration increased significantly. Also the number of impacted OSDs was much more important. Don’t really know what to conclude from all of this … Again we have checked Disk / network / and everything seems fine … Thomas From: Nick Fisk [mailto:n...@fisk.me.uk] Sent: mercredi 16 novembre 2016

Re: [ceph-users] ceph cluster having blocke requests very frequently

2016-11-17 Thread Nick Fisk
wing messages on secondary OSDs. 2016-11-15 03:53:04.298502 7ff9c434f700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7ff9bdb42700' had timed out after 15 Thomas From: Nick Fisk [mailto:n...@fisk.me.uk] Sent: mercredi 16 novembre 2016 22:13 To: Thomas Danan; n...@f

Re: [ceph-users] After OSD Flap - FAILED assert(oi.version == i->first)

2016-11-17 Thread Nick Fisk
Hi Sam, I've updated the ticket with logs from the wip run. Nick > -Original Message- > From: Samuel Just [mailto:sj...@redhat.com] > Sent: 15 November 2016 18:30 > To: Nick Fisk > Cc: Ceph Users > Subject: Re: [ceph-users] After OSD Flap - FAILED assert(oi

[ceph-users] I want to submit a PR - Can someone guide me

2016-11-18 Thread Nick Fisk
Hi All, I want to submit a PR to include fix in this tracker bug, as I have just realised I've been experiencing it. http://tracker.ceph.com/issues/9860 I understand that I would also need to update the debian/ceph-osd.* to get the file copied, however I'm not quite sure where this new file (/
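
For anyone else doing this for the first time, the usual Ceph GitHub workflow, sketched (branch name and commit text are placeholders; the tracker id is the one above):

git clone git@github.com:<your-github-user>/ceph.git && cd ceph
git checkout -b wip-9860
# make the change, then commit with a Signed-off-by line (git commit -s adds it)
git commit -as -m 'osd: short description of the fix

Fixes: http://tracker.ceph.com/issues/9860'
git push origin wip-9860
# finally open a pull request against ceph/ceph master on GitHub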
