Re: [ceph-users] v12.2.8 Luminous released

2018-09-05 Thread Adrian Saul
Can I confirm whether the bluestore compression assert issue is resolved in 12.2.8? https://tracker.ceph.com/issues/23540 I notice that it has a backport listed against 12.2.8, but neither the issue nor the backport is mentioned in the release notes.
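A quick way to check for yourself is to grep the commit range between the two tags — a sketch, assuming the backport commits actually reference the tracker number in their messages:

    git clone https://github.com/ceph/ceph.git && cd ceph
    # commits that are in v12.2.8 but not in v12.2.7
    git log --oneline v12.2.7..v12.2.8 | grep -i 23540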

Re: [ceph-users] Ceph talks from Mountpoint.io

2018-09-05 Thread linghucongsong
Thank you, Gregory, for providing these. By the way, where can we get the PDFs for these talks? Thanks again! At 2018-09-06 07:03:32, "Gregory Farnum" wrote: >Hey all, >Just wanted to let you know that all the talks from Mountpoint.io are >now available on YouTube. These are reasonably

[ceph-users] Ceph talks from Mountpoint.io

2018-09-05 Thread Gregory Farnum
Hey all, Just wanted to let you know that all the talks from Mountpoint.io are now available on YouTube. These are reasonably high-quality videos and include Ceph talks such as: "Bringing smart device failure prediction to Ceph" "Pains & Pleasures Testing the Ceph Distributed Storage Stack" "Ceph

Re: [ceph-users] No announce for 12.2.8 / available in repositories

2018-09-05 Thread Linh Vu
With more testing and checking, we realised that this had nothing to do with Ceph. One part of the upgrade accidentally changed the MTU of our VMs' tap interfaces from 9000 to 1500... Sorry for the false alarm, everyone! From: ceph-users on behalf of Linh Vu
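For anyone who hits similar symptoms, a quick sketch for checking and restoring the MTU — the tap device name (tap101i0 here) is an assumption, adjust to your setup:

    ip link show tap101i0 | grep -o 'mtu [0-9]*'
    ip link set dev tap101i0 mtu 9000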

Re: [ceph-users] Slow requests from bluestore osds

2018-09-05 Thread Tim Bishop
On Sat, Sep 01, 2018 at 12:45:06PM -0400, Brett Chancellor wrote: > Hi Cephers, > I am in the process of upgrading a cluster from Filestore to bluestore, > but I'm concerned about frequent warnings popping up against the new > bluestore devices. I'm frequently seeing messages like this, although

Re: [ceph-users] Slow requests from bluestore osds

2018-09-05 Thread Brett Chancellor
Mine is currently at 1000 due to the high number of pgs we had coming from Jewel. I do find it odd that only the bluestore OSDs have this issue. Filestore OSDs seem to be unaffected. On Wed, Sep 5, 2018, 3:43 PM Samuel Taylor Liston wrote: > Just a thought - have you looked at increasing your

Re: [ceph-users] Slow requests from bluestore osds

2018-09-05 Thread Samuel Taylor Liston
Just a thought - have you looked at increasing "mon_max_pg_per_osd", both on the mons and osds? I was having a similar issue while trying to add more OSDs to my cluster (12.2.7, CentOS 7.5, 3.10.0-862.9.1.el7.x86_64). I increased mine to 300 temporarily while adding OSDs and stopped
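For reference, a sketch of how that bump can be applied on Luminous — the value 300 is just what worked above, not a general recommendation:

    # raise the limit at runtime on the mons
    ceph tell mon.* injectargs '--mon_max_pg_per_osd 300'
    # and persist it in ceph.conf so it survives restarts:
    [global]
    mon_max_pg_per_osd = 300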

Re: [ceph-users] Slow requests from bluestore osds

2018-09-05 Thread Daniel Pryor
I've experienced the same thing during scrubbing and/or any kind of expansion activity. *Daniel Pryor* On Mon, Sep 3, 2018 at 2:13 AM Marc Schöchlin wrote: > Hi, > > we are also experiencing this type of behavior for some weeks on our not > so performance critical hdd pools. > We haven't spent

Re: [ceph-users] Slow requests from bluestore osds

2018-09-05 Thread Brett Chancellor
I'm running CentOS 7.5. If I turn off Spectre/Meltdown protection, then a security sweep will disconnect it from the network. -Brett On Wed, Sep 5, 2018 at 2:24 PM, Uwe Sauter wrote: > I'm also experiencing slow requests though I cannot point it to scrubbing. > > Which kernel do you run? Would

Re: [ceph-users] Slow requests from bluestore osds

2018-09-05 Thread Uwe Sauter
I'm also experiencing slow requests though I cannot point it to scrubbing. Which kernel do you run? Would you be able to test against the same kernel with Spectre/Meltdown mitigations disabled ("noibrs noibpb nopti nospectre_v2" as boot option)? Uwe Am 05.09.18 um 19:30 schrieb Brett
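For anyone willing to test this, a sketch of applying those boot options on a GRUB2 system — note the variable is GRUB_CMDLINE_LINUX on EL7, and this deliberately weakens the host's security, so test nodes only:

    # /etc/default/grub
    GRUB_CMDLINE_LINUX="... noibrs noibpb nopti nospectre_v2"
    # regenerate the config and reboot:
    grub2-mkconfig -o /boot/grub2/grub.cfg && reboot
    # verify the mitigation state after boot:
    cat /sys/devices/system/cpu/vulnerabilities/*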

Re: [ceph-users] [Ceph-community] How to setup Ceph OSD auto boot up on node reboot

2018-09-05 Thread David Turner
The magic sauce for getting Filestore OSDs to start on a node reboot is making sure that all of your udev rules are correct. In particular you need to have the correct type UUID set for all partitions. I haven't dealt with it in a long time, but I've written up a few good ML responses about it. On Tue,
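To make the "correct UUID" part concrete: ceph-disk's udev rules key on the GPT partition type GUID. A sketch for checking and fixing it — the data-partition type code below is from memory, so compare against a known-working OSD first:

    # inspect the partition type GUID that udev matches on
    sgdisk -i 1 /dev/sdX
    # set it to the ceph-disk "OSD data" type code if it is wrong
    sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d /dev/sdX
    # replay the udev rules without a reboot
    udevadm trigger --subsystem-match=block --action=add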

Re: [ceph-users] Slow requests from bluestore osds

2018-09-05 Thread Brett Chancellor
Marc, As with you, this problem manifests itself only when the bluestore OSD is involved in some form of deep scrub. Anybody have any insight on what might be causing this? -Brett On Mon, Sep 3, 2018 at 4:13 AM, Marc Schöchlin wrote: > Hi, > > we are also experiencing this type of behavior

Re: [ceph-users] v12.2.8 Luminous released

2018-09-05 Thread David Turner
I upgraded my home cephfs/rbd cluster to 12.2.8 during an OS upgrade to Ubuntu 18.04 and Proxmox 5.1 (Stretch). Everything is running well so far. On Wed, Sep 5, 2018 at 10:21 AM Dan van der Ster wrote: > Thanks for the release! > > We've updated some test clusters (rbd, cephfs) and it looks

Re: [ceph-users] RBD image "lightweight snapshots"

2018-09-05 Thread Alex Elder
On 08/09/2018 08:15 AM, Sage Weil wrote: > On Thu, 9 Aug 2018, Piotr Dałek wrote: >> Hello, >> >> At OVH we're heavily utilizing snapshots for our backup system. We think >> there's an interesting optimization opportunity regarding snapshots I'd like >> to discuss here. >> >> The idea is to

[ceph-users] Save the date: Ceph Day Berlin - November 12th

2018-09-05 Thread Danielle Womboldt
We’re bringing Ceph to you, conveniently co-located with OpenStack Summit Berlin! Join Ceph experts, our customers and partners, and active members of the Ceph community in a full-day event all about Ceph. You’ll hear from key members of the development community, storage industry experts, and

Re: [ceph-users] v12.2.8 Luminous released

2018-09-05 Thread Dan van der Ster
Thanks for the release! We've updated some test clusters (rbd, cephfs) and it looks good so far. -- dan On Tue, Sep 4, 2018 at 6:30 PM Abhishek Lekshmanan wrote: > > > We're glad to announce the next point release in the Luminous v12.2.X > stable release series. This release contains a range

Re: [ceph-users] ceph-fuse using excessive memory

2018-09-05 Thread Andras Pataki
Below are the performance counters. Some scientific workflows trigger this - some parts of them are quite data intensive - they process thousands of files over many hours to days. The 200GB ceph-fuse got there in about 3 days. I'm keeping the node alive for now in case we can extract some

Re: [ceph-users] ceph-fuse using excessive memory

2018-09-05 Thread Sage Weil
On Wed, 5 Sep 2018, Andras Pataki wrote: > Hi cephers, > > Every so often we have a ceph-fuse process that grows to rather large size (up > to eating up the whole memory of the machine). Here is an example of a 200GB > RSS size ceph-fuse instance: > > # ceph daemon

[ceph-users] ceph-fuse using excessive memory

2018-09-05 Thread Andras Pataki
Hi cephers, Every so often we have a ceph-fuse process that grows to a rather large size (up to eating up the whole memory of the machine). Here is an example of a 200GB RSS ceph-fuse instance: # ceph daemon /var/run/ceph/ceph-client.admin.asok dump_mempools { "bloom_filter": {
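For anyone reproducing this, a sketch of comparing what the mempools account for against the process RSS — a large gap would point at allocator overhead rather than the pools themselves:

    ceph daemon /var/run/ceph/ceph-client.admin.asok dump_mempools
    # compare the pool byte totals with the actual resident size:
    grep VmRSS /proc/$(pidof ceph-fuse)/status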

[ceph-users] Slow Ceph: Any plans on torrent-like transfers from OSDs ?

2018-09-05 Thread Alex Lupsa
Hi, I have a really small homelab 3-node ceph cluster on consumer hw - thanks to Proxmox for making it easy to deploy it. The problem I am having is very, very bad transfer rates, i.e. 20 MB/s for both read and write on 17 OSDs with
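Before looking at OSD fan-out, it is worth measuring raw cluster throughput separately from the VM path — a sketch, assuming a throwaway pool named bench:

    rados bench -p bench 30 write --no-cleanup
    rados bench -p bench 30 seq
    rados -p bench cleanup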

Re: [ceph-users] Upgrading ceph with HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent

2018-09-05 Thread Sean Purdy
On Wed, 5 Sep 2018, John Spray said: > On Wed, Sep 5, 2018 at 8:38 AM Marc Roos wrote: > > > > > > The advised solution is to upgrade ceph only in HEALTH_OK state. And I > > also read somewhere that it is bad to have your cluster for a long time in > > a HEALTH_ERR state. > > > > But why is this

[ceph-users] mgr/dashboard: Community branding & styling

2018-09-05 Thread Ernesto Puerta
Hi dashboard devels & users, You may find below a link to a PDF with the recommendations from Michael Celedonia & Ju Lim (in CC) on top of the current community branding, but just to summarize changes (http://tracker.ceph.com/issues/35688): - Login screen (http://tracker.ceph.com/issues/35689).

Re: [ceph-users] Upgrading ceph with HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent

2018-09-05 Thread John Spray
On Wed, Sep 5, 2018 at 8:38 AM Marc Roos wrote: > > > The advised solution is to upgrade ceph only in HEALTH_OK state. And I > also read somewhere that it is bad to have your cluster for a long time in > a HEALTH_ERR state. > > But why is this bad? Aside from the obvious (errors are bad things!),

Re: [ceph-users] Ceph-Deploy error on 15/71 stage

2018-09-05 Thread Eugen Block
Hi Jones, Just to make things clear: are you telling me that it is completely impossible to have a ceph "volume" on non-dedicated devices, sharing space with, for instance, the node's swap, boot or main partition? And so the only possible way to have a functioning ceph distributed filesystem
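For what it's worth, ceph-volume does accept a partition or logical volume rather than a whole disk, so sharing a device with the OS is possible, if not ideal — a sketch, assuming /dev/sda4 is a spare partition:

    ceph-volume lvm create --data /dev/sda4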

[ceph-users] Upgrading ceph with HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent

2018-09-05 Thread Marc Roos
The advised solution is to upgrade ceph only in HEALTH_OK state. And I also read somewhere that it is bad to have your cluster for a long time in a HEALTH_ERR state. But why is this bad? Why is this bad during upgrading? Can I quantify how bad it is? (like with large log/journal file?)
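For context, the usual sequence for clearing that particular HEALTH_ERR before an upgrade — a sketch, where <pgid> is taken from the health output:

    ceph health detail                          # names the inconsistent pg
    rados list-inconsistent-obj <pgid> --format=json-pretty
    ceph pg repair <pgid>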

Re: [ceph-users] mimic + cephmetrics + prometheus - working ?

2018-09-05 Thread Jan Fajerski
I'm not the expert when it comes to cephmetrics but I think (at least until very recently) cephmetrics relies on other exporters besides the mgr module and the node_exporter. On Mon, Aug 27, 2018 at 01:11:29PM -0400, Steven Vacaroaia wrote: Hi has anyone been able to use Mimic +

Re: [ceph-users] mimic - troubleshooting prometheus

2018-09-05 Thread Jan Fajerski
The prometheus plugin currently skips histogram perf counters. Their representation in ceph is not compatible with prometheus' approach (iirc). However, I believe most, if not all, of the perf counters should be exported as long-running averages. Look for metric pairs that are named
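A sketch of how such a sum/count pair is typically combined on the Prometheus side — the metric names here are assumptions, so check what your mgr actually exports:

    # average read latency per OSD over the last 5 minutes
    rate(ceph_osd_op_r_latency_sum[5m]) / rate(ceph_osd_op_r_latency_count[5m])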

Re: [ceph-users] How to secure Prometheus endpoints (mgr plugin and node_exporter)

2018-09-05 Thread Jan Fajerski
Hi Martin, hope this is still useful, despite the lag. On Fri, Jun 29, 2018 at 01:04:09PM +0200, Martin Palma wrote: Since Prometheus uses a pull model over HTTP for collecting metrics, what are the best practices to secure these HTTP endpoints? - With a reverse proxy with authentication? This
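A minimal reverse-proxy sketch along those lines, assuming the mgr prometheus module on its default port 9283 and an htpasswd file you have already created:

    server {
        listen 9443 ssl;
        ssl_certificate     /etc/nginx/certs/metrics.crt;
        ssl_certificate_key /etc/nginx/certs/metrics.key;
        auth_basic           "ceph metrics";
        auth_basic_user_file /etc/nginx/htpasswd;
        location / {
            proxy_pass http://127.0.0.1:9283/;
        }
    }

Prometheus itself can then authenticate using the basic_auth settings in its scrape config.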