Re: [ceph-users] ceph-fuse using excessive memory

2018-09-06 Thread Yan, Zheng
Could you please try making ceph-fuse use the simple messenger (add "ms type = simple" to the client section of ceph.conf). Regards, Yan, Zheng On Wed, Sep 5, 2018 at 10:09 PM Sage Weil wrote: > > On Wed, 5 Sep 2018, Andras Pataki wrote: > > Hi cephers, > > > > Every so often we have a ceph-fuse process
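
A minimal sketch of that change, assuming ceph.conf already has (or gets) a [client] section; only the "ms type" line comes from the suggestion above:

   [client]
       # switch the client messenger from async to simple, as suggested above
       ms type = simple

ceph-fuse would then have to be remounted to pick up the setting.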

Re: [ceph-users] Ceph and NVMe

2018-09-06 Thread Linh Vu
We have P3700s and Optane 900P (similar to P4800 but the workstation version and a lot cheaper) on R730xds, for WAL, DB and metadata pools for cephfs and radosgw. They perform great! From: ceph-users on behalf of Jeff Bailey Sent: Friday, 7 September 2018

[ceph-users] Safe to use RBD mounts for Docker volumes on containerized Ceph nodes

2018-09-06 Thread Jacob DeGlopper
I've seen the requirement not to mount RBD devices or CephFS filesystems on OSD nodes.  Does this still apply when the OSDs and clients using the RBD volumes are all in Docker containers? That is, is it possible to run a 3-server setup in production with both Ceph daemons (mon, mgr, and OSD)

Re: [ceph-users] Ceph and NVMe

2018-09-06 Thread Jeff Bailey
I haven't had any problems using 375GB P4800X's in R730 and R740xd machines for DB+WAL. The iDRAC whines a bit on the R740 but everything works fine. On 9/6/2018 3:09 PM, Steven Vacaroaia wrote: Hi , Just to add to this question, is anyone using Intel Optane DC P4800X on DELL R630 ...or any

Re: [ceph-users] Ceph talks from Mountpoint.io

2018-09-06 Thread Amye Scavarda
Still working on all of that! Never fear! -- amye On Thu, Sep 6, 2018 at 1:16 PM David Turner wrote: > They mentioned that they were going to send the slides to everyone that > had their badges scanned at the conference. I haven't seen that email come > out yet, though. > > On Thu, Sep 6, 2018

Re: [ceph-users] Ceph talks from Mountpoint.io

2018-09-06 Thread David Turner
They mentioned that they were going to send the slides to everyone that had their badges scanned at the conference. I haven't seen that email come out yet, though. On Thu, Sep 6, 2018 at 4:14 PM Gregory Farnum wrote: > Unfortunately I don't believe anybody collected the slide files, so they >

Re: [ceph-users] Ceph and NVMe

2018-09-06 Thread Steven Vacaroaia
Hi, just to add to this question, is anyone using the Intel Optane DC P4800X on a DELL R630, or any other server? Any gotchas / feedback / knowledge sharing will be greatly appreciated. Steven On Thu, 6 Sep 2018 at 14:59, Stefan Priebe - Profihost AG < s.pri...@profihost.ag> wrote: > Hello list, >

[ceph-users] Ceph and NVMe

2018-09-06 Thread Stefan Priebe - Profihost AG
Hello list, has anybody tested current NVMe performance with luminous and bluestore? Is this something which makes sense or just a waste of money? Greets, Stefan

Re: [ceph-users] CephFS on a mixture of SSDs and HDDs

2018-09-06 Thread Marc Roos
To add a data pool to an existing cephfs:

   ceph osd pool set fs_data.ec21 allow_ec_overwrites true
   ceph osd pool application enable fs_data.ec21 cephfs
   ceph fs add_data_pool cephfs fs_data.ec21

Then link the pool to the directory (ec21):

   setfattr -n ceph.dir.layout.pool -v fs_data.ec21 ec21
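
A quick way to confirm the layout took effect (a sketch, reusing the ec21 directory from above; getfattr ships with the attr package):

   getfattr -n ceph.dir.layout ec21
   # the output should list pool=fs_data.ec21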

Re: [ceph-users] CephFS on a mixture of SSDs and HDDs

2018-09-06 Thread Serkan Çoban
>Is there a way of doing this without running multiple filesystems within the >same cluster? Yes, have a look at the following link: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/ceph_file_system_guide/index#working-with-file-and-directory-layouts On Thu, Sep 6,

Re: [ceph-users] Ceph talks from Mountpoint.io

2018-09-06 Thread Gregory Farnum
Unfortunately I don't believe anybody collected the slide files, so they aren't available for public access. :( On Wed, Sep 5, 2018 at 8:16 PM xiangyang yu wrote: > Hi Greg, > Where can we download the talk ppt at mountpoint.io? > > Best wishes, > brandy > > Gregory Farnum 于2018年9月6日周四

[ceph-users] CephFS on a mixture of SSDs and HDDs

2018-09-06 Thread Vladimir Brik
Hello I am setting up a new ceph cluster (probably Mimic) made up of servers that have a mixture of solid state and spinning disks. I'd like CephFS to store data of some of our applications only on SSDs, and store data of other applications only on HDDs. Is there a way of doing this without
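
One possible approach, sketched with illustrative rule, pool and directory names (not taken from this thread), is to create per-device-class CRUSH rules, back separate CephFS data pools with them, and pin directory trees to those pools via file layouts:

   # CRUSH rules restricted to a device class
   ceph osd crush rule create-replicated ssd_rule default host ssd
   ceph osd crush rule create-replicated hdd_rule default host hdd

   # data pools backed by those rules, added to the filesystem
   ceph osd pool create cephfs_data_ssd 64 64 replicated ssd_rule
   ceph osd pool create cephfs_data_hdd 64 64 replicated hdd_rule
   ceph fs add_data_pool cephfs cephfs_data_ssd
   ceph fs add_data_pool cephfs cephfs_data_hdd

   # pin application directories to the desired pool
   setfattr -n ceph.dir.layout.pool -v cephfs_data_ssd /mnt/cephfs/fast-apps
   setfattr -n ceph.dir.layout.pool -v cephfs_data_hdd /mnt/cephfs/bulk-apps

New files created under each directory then land in the corresponding pool; existing files keep their old layout.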

Re: [ceph-users] v12.2.8 Luminous released

2018-09-06 Thread Igor Fedotov
Hi Adrian, yes, this issue has been fixed by https://github.com/ceph/ceph/pull/22909 Thanks, Igor On 9/6/2018 8:10 AM, Adrian Saul wrote: Can I confirm if this bluestore compression assert issue is resolved in 12.2.8? https://tracker.ceph.com/issues/23540 I notice that it has a backport

Re: [ceph-users] Rados performance inconsistencies, lower than expected performance

2018-09-06 Thread Alwin Antreich
On Thu, Sep 06, 2018 at 05:15:26PM +0200, Marc Roos wrote: > > It is idle, testing still, running a backup's at night on it. > How do you fill up the cluster so you can test between empty and full? > Do you have a "ceph df" from empty and full? > > I have done another test disabling new scrubs

Re: [ceph-users] mgr/dashboard: Community branding & styling

2018-09-06 Thread Ernesto Puerta
Thanks for the feedback, Erwan, John! We may follow up on the tracker issues. Kind Regards, Ernesto On Thu, Sep 6, 2018 at 3:54 PM Erwan Velu wrote: > > Cool stuff. > Feed the tickets to report my comments. > > Cheers, > > - Mail original - > De: "Ernesto Puerta" > À:

Re: [ceph-users] Rados performance inconsistencies, lower than expected performance

2018-09-06 Thread Marc Roos
It is idle, testing still, running backups at night on it. How do you fill up the cluster so you can test between empty and full? Do you have a "ceph df" from empty and full? I have done another test disabling new scrubs on the rbd.ssd pool (but still 3 on hdd) with: ceph tell osd.*
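
The exact command is cut off above; for reference, one common way to keep new scrubs from starting during a benchmark (an assumption, not necessarily what was used here, and these flags are cluster-wide rather than per pool) is to set the scrub flags and clear them afterwards:

   ceph osd set noscrub
   ceph osd set nodeep-scrub
   # ... run the benchmark ...
   ceph osd unset noscrub
   ceph osd unset nodeep-scrub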

Re: [ceph-users] failing to respond to cache pressure

2018-09-06 Thread Eugen Block
Hi, I would like to update this thread for others struggling with cache pressure. The last time we hit that message was more than three weeks ago (the workload has not changed), so it seems our current configuration fits our workload. Reducing client_oc_size to 100 MB (from the default 200
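
A minimal sketch of that client-side change, assuming it lives in the [client] section of ceph.conf; the option takes bytes, so 100 MB becomes:

   [client]
       # ceph-fuse object cacher size, reduced from the ~200 MB default
       client_oc_size = 104857600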

Re: [ceph-users] Rados performance inconsistencies, lower than expected performance

2018-09-06 Thread Menno Zonneveld
The benchmark does fluctuate quite a bit; that's why I run it for 180 seconds now, as then I do get consistent results. Your performance seems on par with what I'm getting with 3 nodes and 9 OSDs, not sure what to make of that. Are your machines actively used perhaps? Mine are mostly idle as

Re: [ceph-users] Rados performance inconsistencies, lower than expected performance

2018-09-06 Thread Menno Zonneveld
-Original message- > From:Alwin Antreich > Sent: Thursday 6th September 2018 16:27 > To: ceph-users > Cc: Menno Zonneveld > Subject: Re: [ceph-users] Rados performance inconsistencies, lower than > expected performance > > Hi, Hi! > On Thu, Sep 06, 2018 at 03:52:21PM +0200, Menno

Re: [ceph-users] Rados performance inconsistencies, lower than expected performance

2018-09-06 Thread Alwin Antreich
Hi, On Thu, Sep 06, 2018 at 03:52:21PM +0200, Menno Zonneveld wrote: > ah yes, 3x replicated with minimal 2. > > > my ceph.conf is pretty bare, just in case it might be relevant > > [global] >auth client required = cephx >auth cluster required = cephx >auth service

Re: [ceph-users] Rados performance inconsistencies, lower than expected performance

2018-09-06 Thread Marc Roos
I am on 4 nodes, mostly hdds, and 4x samsung sm863 480GB 2x E5-2660 2x LSI SAS2308 1x dual port 10Gbit (one used, and shared between cluster/client vlans) I have 5 pg's scrubbing, but I am not sure if there is any on the ssd pool. I am noticing a drop in the performance at the end of the

Re: [ceph-users] help needed

2018-09-06 Thread Darius Kasparavičius
Hello, I'm currently running a similar setup. It's running bluestore OSDs with 1 NVMe device for db/wal. That NVMe device is not large enough to support a 160GB db partition per osd, so I'm stuck with 50GB each. Currently I haven't had any issues with slowdowns or crashes. The cluster is
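
For reference, a sketch of how such an OSD might be created with its DB on a shared NVMe device (the device names and the pre-made ~50GB partition are assumptions, not details from this thread):

   # /dev/sdb is the data disk, /dev/nvme0n1p1 a ~50GB partition reserved for this OSD's DB/WAL
   ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1

With no separate --block.wal, the WAL is colocated on the DB partition.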

Re: [ceph-users] Rados performance inconsistencies, lower than expected performance

2018-09-06 Thread Menno Zonneveld
ah yes, 3x replicated with minimal 2.

my ceph.conf is pretty bare, just in case it might be relevant

[global]
   auth client required = cephx
   auth cluster required = cephx
   auth service required = cephx
   cluster network = 172.25.42.0/24
   fsid =

Re: [ceph-users] Rados performance inconsistencies, lower than expected performance

2018-09-06 Thread Marc Roos
Test pool is 3x replicated? -Original Message- From: Menno Zonneveld [mailto:me...@1afa.com] Sent: donderdag 6 september 2018 15:29 To: ceph-users@lists.ceph.com Subject: [ceph-users] Rados performance inconsistencies, lower than expected performance I've setup a CEPH cluster to

[ceph-users] Rados performance inconsistencies, lower than expected performance

2018-09-06 Thread Menno Zonneveld
I've set up a Ceph cluster to test things before going into production but I've run into some performance issues that I cannot resolve or explain. Hardware in use in each storage machine (x3): - dual 10Gbit Solarflare Communications SFC9020 (Linux bond, mtu 9000) - dual 10Gbit EdgeSwitch 16-Port
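
The thread does not show the exact benchmark invocation; a typical rados bench run of the kind discussed here (180-second write and random-read phases against an illustrative pool name) might look like:

   rados bench -p testpool 180 write --no-cleanup
   rados bench -p testpool 180 rand
   rados -p testpool cleanup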

Re: [ceph-users] help needed

2018-09-06 Thread Nick Fisk
If it helps, I'm seeing about 3GB of DB usage for a 3TB OSD that is about 60% full. This is with a pure RBD workload; I believe this can vary depending on what your Ceph use case is. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of David Turner Sent: 06 September 2018 14:09
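
One way to check actual DB usage on a running OSD (a sketch; the osd id is illustrative and the counter names can differ between releases) is the bluefs section of the OSD's perf counters:

   ceph daemon osd.0 perf dump bluefs
   # compare db_used_bytes against db_total_bytes in the output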

Re: [ceph-users] help needed

2018-09-06 Thread David Turner
The official Ceph documentation recommendation for a DB partition for a 4TB bluestore OSD would be 160GB each. The Samsung Evo Pro is not an enterprise-class SSD. A quick search of the ML will show which SSDs people are using. As was already suggested, the better option is an HBA as opposed to a
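
That figure follows from the commonly cited 4% sizing guideline; as plain arithmetic (an interpretation of the recommendation, not a quote from the documentation):

   4 TB x 4% = 4000 GB x 0.04 = 160 GB of block.db per OSD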

Re: [ceph-users] help needed

2018-09-06 Thread Muhammad Junaid
Thanks. Can you please clarify: if we use any other enterprise-class SSD for the journal, should we enable the write-back caching available on the RAID controller for the journal device, or connect it as write-through? Regards. On Thu, Sep 6, 2018 at 4:50 PM Marc Roos wrote: > > > > Do not use Samsung 850 PRO

Re: [ceph-users] Upgrading ceph with HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent

2018-09-06 Thread Marc Roos
> > > > > > > The advised solution is to upgrade ceph only in HEALTH_OK state. And I > > also read somewhere that it is bad to have your cluster for a long time in > > an HEALTH_ERR state. > > > > But why is this bad? > > Aside from the obvious (errors are bad things!), many people have >

Re: [ceph-users] Upgrading ceph with HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent

2018-09-06 Thread Marc Roos
Thanks, interesting to read. So in Luminous it is not really a problem. I was expecting to get into trouble with the monitors/mds, because my failover takes quite long, and I thought it was related to the damaged pg. Luminous: "When the past intervals tracking structure was rebuilt around exactly

Re: [ceph-users] help needed

2018-09-06 Thread Marc Roos
Do not use the Samsung 850 PRO for the journal. Just use an LSI Logic HBA (e.g. SAS2308). -Original Message- From: Muhammad Junaid [mailto:junaid.fsd...@gmail.com] Sent: donderdag 6 september 2018 13:18 To: ceph-users@lists.ceph.com Subject: [ceph-users] help needed Hi there Hope, every one

[ceph-users] Fixing a 12.2.5 reshard

2018-09-06 Thread Sean Purdy
Hi, we were on 12.2.5 when a bucket with versioning and 100k objects got stuck when autoreshard kicked in. We could download but not upload files. But after upgrading to 12.2.7 and then running a bucket check, it now shows twice as many objects, according to bucket limit check. How do I fix this?

[ceph-users] help needed

2018-09-06 Thread Muhammad Junaid
Hi there. Hope everyone is doing well. I need urgent help with a ceph cluster design. We are planning a 3-OSD-node cluster in the beginning. Details are as under:

Servers: 3 * DELL R720xd
OS Drives: 2 2.5" SSD
OSD Drives: 10 3.5" SAS 7200rpm 3/4 TB
Journal Drives: 2 SSDs Samsung 850 PRO 256GB

Re: [ceph-users] ceph-fuse using excessive memory

2018-09-06 Thread Andras Pataki
It looks like I have a process that can reproduce the problem at will.  Attached is a quick plot of the RSS memory usage of ceph-fuse over a period of 13-14 hours or so (the x axis is minutes, the y axis is bytes).  It looks like the process steadily grows up to about 200GB and then its memory

Re: [ceph-users] v12.2.8 Luminous released

2018-09-06 Thread Abhishek Lekshmanan
Adrian Saul writes: > Can I confirm if this bluestore compression assert issue is resolved in > 12.2.8? > > https://tracker.ceph.com/issues/23540 The PR itself from the backport issue is in the release notes, i.e. pr#22909, which references two tracker issues. Unfortunately, the script that

Re: [ceph-users] Ceph Luminous - journal setting

2018-09-06 Thread M Ranga Swami Reddy
Thank you. Yep, I am using the bluestore backend with the Luminous version. Thanks Swami On Tue, Sep 4, 2018 at 9:04 PM David Turner wrote: > > Are you planning on using bluestore or filestore? The settings for filestore > haven't changed. If you're planning to use bluestore there is a lot of >

Re: [ceph-users] Slow requests from bluestore osds

2018-09-06 Thread Marc Schöchlin
Hello Uwe, as described in my mail we are running 4.13.0-39. In conjunction with some later mails in this thread, it seems that this problem might be related to OS/microcode (Spectre) updates. I am planning a ceph/ubuntu upgrade in the next week because of various reasons; let's see what

Re: [ceph-users] [Ceph-community] How to setup Ceph OSD auto boot up on node reboot

2018-09-06 Thread Mateusz Skala (UST, POL)
Hi, if it's a problem with the UUID of the partition, you can use these commands:

   sgdisk --change-name={journal_partition_number}:'ceph journal' --typecode={journal_partition_number}:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/{journal_device}
   sgdisk
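
To verify the result, the partition could be inspected afterwards (the placeholders match the command above):

   sgdisk --info={journal_partition_number} /dev/{journal_device}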

Re: [ceph-users] SSD OSDs crashing after upgrade to 12.2.7

2018-09-06 Thread Caspar Smit
Hi, These reports are kind of worrying since we have a 12.2.5 cluster too, waiting to upgrade. Did you have any luck with upgrading to 12.2.8, or is it still the same behavior? Is there a bugtracker issue for this? Kind regards, Caspar Op di 4 sep. 2018 om 09:59 schreef Wolfgang Lendl <