[ceph-users] assertion error trying to start mds server

2017-10-10 Thread Bill Sharer
I've been in the process of updating my Gentoo-based cluster, both with new hardware and a somewhat postponed software update. This includes some major changes, including the switch from gcc 4.x to 5.4.0 on the existing hardware and the use of gcc 6.4.0 to make better use of AMD Ryzen on the new hardware. The

Re: [ceph-users] min_size & hybrid OSD latency

2017-10-10 Thread Christian Balzer
Hello, On Wed, 11 Oct 2017 00:05:26 +0200 Jack wrote: > Hi, > > I would like some information about the following > > Let's say I have a running cluster, with 4 OSDs: 2 SSDs, and 2 HDDs > My single pool has size=3, min_size=2 > > For a write-only pattern, I thought I would get SSDs performance

[ceph-users] RGW flush_read_list error

2017-10-10 Thread Travis Nielsen
In Luminous 12.2.1, when repeatedly running a GET on a large (1GB) file from RGW for an hour, the following error was hit intermittently a number of times. The first error was hit after 45 minutes and then the error happened frequently for the remainder of the test. ERROR: flush_read_list():

[ceph-users] min_size & hybrid OSD latency

2017-10-10 Thread Jack
Hi, I would like some information about the following. Let's say I have a running cluster with 4 OSDs: 2 SSDs and 2 HDDs. My single pool has size=3, min_size=2. For a write-only pattern, I thought I would get SSD performance levels, because the write would be acked as soon as min_size OSDs acked
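For context, the size/min_size settings under discussion map onto the usual pool commands (a minimal sketch; "mypool" is a placeholder name):

  ceph osd pool get mypool size
  ceph osd pool get mypool min_size
  ceph osd pool set mypool size 3
  ceph osd pool set mypool min_size 2

Worth noting: for replicated pools, RADOS acknowledges a write to the client only once every OSD in the acting set has committed it; min_size controls how many replicas must be available for the PG to accept I/O at all, not when the ack is sent.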

Re: [ceph-users] right way to recover a failed OSD (disk) when using BlueStore ?

2017-10-10 Thread Alejandro Comisario
Hi, I see some notes there that didn't exist on Jewel: http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#replacing-an-osd In my case, what I'm using right now on that OSD is this: root@ndc-cl-osd4:~# ls -lsah /var/lib/ceph/osd/ceph-104 total 64K 0 drwxr-xr-x 2 ceph ceph
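For reference, the Luminous replacement flow described on that page boils down to roughly the following (a rough sketch; OSD id 104 matches the path above, /dev/sdX is a placeholder, and the exact ceph-volume flags depend on the point release):

  ceph osd out 104
  systemctl stop ceph-osd@104
  ceph osd destroy 104 --yes-i-really-mean-it
  ceph-volume lvm zap /dev/sdX
  ceph-volume lvm create --bluestore --osd-id 104 --data /dev/sdX

"osd destroy" keeps the OSD id reserved so the replacement disk can reuse it, which avoids unnecessary data movement compared to a full "osd purge".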

Re: [ceph-users] Unable to restrict a CephFS client to a subdirectory

2017-10-10 Thread John Spray
On Tue, Oct 10, 2017 at 5:40 PM, Shawfeng Dong wrote: > Hi Yoann, > > I confirm too that your recipe works! > > We run CentOS 7: > [root@pulpo-admin ~]# uname -r > 3.10.0-693.2.2.el7.x86_64 > > Here were the old caps for user 'hydra': > # ceph auth get client.hydra > exported

Re: [ceph-users] Unable to restrict a CephFS client to a subdirectory

2017-10-10 Thread Shawfeng Dong
Hi Yoann, I confirm too that your recipe works! We run CentOS 7: [root@pulpo-admin ~]# uname -r 3.10.0-693.2.2.el7.x86_64 Here were the old caps for user 'hydra': # ceph auth get client.hydra exported keyring for client.hydra [client.hydra] key = AQ==
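For anyone searching the archive later, the recipe being confirmed amounts to caps of roughly this shape (a sketch; the filesystem name "cephfs", the path "/subdir" and the data pool "cephfs_data" are placeholders, not values from this thread):

  ceph fs authorize cephfs client.hydra /subdir rw
  # or set the caps explicitly:
  ceph auth caps client.hydra \
      mds 'allow rw path=/subdir' \
      mon 'allow r' \
      osd 'allow rw pool=cephfs_data'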

Re: [ceph-users] All replicas of pg 5.b got placed on the same host - how to correct?

2017-10-10 Thread Peter Linder
Probably chooseleaf also instead of choose. Konrad Riedel wrote (10 October 2017 17:05:52 CEST): >Hello Ceph-users, > >after switching to luminous I was excited about the great >crush-device-class feature - now we have 5 servers with 1x2TB NVMe >based OSDs, 3 of them

Re: [ceph-users] All replicas of pg 5.b got placed on the same host - how to correct?

2017-10-10 Thread Peter Linder
I think your failure domain within your rules is wrong: "step choose firstn 0 type osd" should be "step choose firstn 0 type host". On 10/10/2017 5:05 PM, Konrad Riedel wrote: > Hello Ceph-users, > > after switching to luminous I was excited about the great > crush-device-class feature - now we
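On Luminous the same effect can be had without hand-editing the CRUSH rule, by generating a device-class rule with a host failure domain (a sketch; the rule name, root and class name are assumptions):

  ceph osd crush rule create-replicated replicated_nvme default host nvme
  ceph osd pool set mypool crush_rule replicated_nvme

If the rule is written by hand instead, the step should read "step chooseleaf firstn 0 type host" so that each replica lands on a different host.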

[ceph-users] All replicas of pg 5.b got placed on the same host - how to correct?

2017-10-10 Thread Konrad Riedel
Hello Ceph-users, after switching to luminous I was excited about the great crush-device-class feature - now we have 5 servers with 1x 2TB NVMe-based OSDs, 3 of them additionally with 4 HDDs per server. (we have only three 400G NVMe disks for block.wal and block.db and therefore can't

Re: [ceph-users] Ceph-mgr summarize recovery counters

2017-10-10 Thread John Spray
On Tue, Oct 10, 2017 at 3:50 PM, Benjeman Meekhof wrote: > Hi John, > > Thanks for the guidance! Is pg_status something we should expect to > find in Luminous (12.2.1)? It doesn't seem to exist. We do have a > 'pg_summary' object which contains a list of every PG and

Re: [ceph-users] Ceph-mgr summarize recovery counters

2017-10-10 Thread Benjeman Meekhof
Hi John, Thanks for the guidance! Is pg_status something we should expect to find in Luminous (12.2.1)? It doesn't seem to exist. We do have a 'pg_summary' object which contains a list of every PG and current state (active, etc) but nothing about I/O. Calls to self.get('pg_status') in our

Re: [ceph-users] rgw resharding operation seemingly won't end

2017-10-10 Thread Ryan Leimenstoll
Thanks for the response, Yehuda. Status: [root@objproxy02 UMobjstore]# radosgw-admin reshard status --bucket=$bucket_name [ { "reshard_status": 1, "new_bucket_instance_id": "8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.47370206.1", "num_shards": 4 } ] I cleared the flag
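For anyone hitting the same state, the reshard bookkeeping can be inspected, and on recent Luminous builds cleared, roughly like this (a sketch; the bucket name is a placeholder and the cancel subcommand may not be available on every point release):

  radosgw-admin reshard list
  radosgw-admin reshard status --bucket=mybucket
  radosgw-admin reshard cancel --bucket=mybucket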

Re: [ceph-users] killing ceph-disk [was Re: ceph-volume: migration and disk partition support]

2017-10-10 Thread Willem Jan Withagen
On 10-10-2017 14:21, Alfredo Deza wrote: > On Tue, Oct 10, 2017 at 8:14 AM, Willem Jan Withagen wrote: >> On 10-10-2017 13:51, Alfredo Deza wrote: >>> On Mon, Oct 9, 2017 at 8:50 PM, Christian Balzer wrote: Hello, (pet peeve alert) On

Re: [ceph-users] ceph-volume: migration and disk partition support

2017-10-10 Thread Alfredo Deza
On Tue, Oct 10, 2017 at 3:28 AM, Stefan Kooman wrote: > Hi, > > Quoting Alfredo Deza (ad...@redhat.com): >> Hi, >> >> Now that ceph-volume is part of the Luminous release, we've been able >> to provide filestore support for LVM-based OSDs. We are making use of >> LVM's powerful
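As a concrete illustration of the LVM-based flow under discussion (a sketch; the VG/LV names are placeholders, and subcommands and flags vary slightly between ceph-volume releases):

  # one-shot: prepare + activate a filestore OSD backed by LVM
  ceph-volume lvm create --filestore --data vg0/osd-data --journal vg0/osd-journal
  # or the two-step variant
  ceph-volume lvm prepare --filestore --data vg0/osd-data --journal vg0/osd-journal
  ceph-volume lvm activate <osd-id> <osd-fsid>
  # inspect what ceph-volume knows about
  ceph-volume lvm list

Because the metadata lives in LVM tags rather than GPT partition GUIDs, activation no longer depends on UDEV rules firing in the right order at boot.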

Re: [ceph-users] killing ceph-disk [was Re: ceph-volume: migration and disk partition support]

2017-10-10 Thread Alfredo Deza
On Tue, Oct 10, 2017 at 8:14 AM, Willem Jan Withagen wrote: > On 10-10-2017 13:51, Alfredo Deza wrote: >> On Mon, Oct 9, 2017 at 8:50 PM, Christian Balzer wrote: >>> >>> Hello, >>> >>> (pet peeve alert) >>> On Mon, 9 Oct 2017 15:09:29 + (UTC) Sage Weil wrote:

Re: [ceph-users] how to debug (in order to repair) damaged MDS (rank)?

2017-10-10 Thread Daniel Baumann
On 10/10/2017 02:10 PM, John Spray wrote: > Yes. worked, rank 6 is back and cephfs up again. thank you very much. > Do a final ls to make sure you got all of them -- it is > dangerous to leave any fragments behind. will do. > BTW opened http://tracker.ceph.com/issues/21749 for the underlying

Re: [ceph-users] killing ceph-disk [was Re: ceph-volume: migration and disk partition support]

2017-10-10 Thread Willem Jan Withagen
On 10-10-2017 13:51, Alfredo Deza wrote: > On Mon, Oct 9, 2017 at 8:50 PM, Christian Balzer wrote: >> >> Hello, >> >> (pet peeve alert) >> On Mon, 9 Oct 2017 15:09:29 + (UTC) Sage Weil wrote: >> >>> To put this in context, the goal here is to kill ceph-disk in mimic. Right,

Re: [ceph-users] how to debug (in order to repair) damaged MDS (rank)?

2017-10-10 Thread John Spray
On Tue, Oct 10, 2017 at 12:30 PM, Daniel Baumann wrote: > Hi John, > > thank you very much for your help. > > On 10/10/2017 12:57 PM, John Spray wrote: >> A) Do a "rados -p ls | grep "^506\." or similar, to >> get a list of the objects > > done, gives me these: > >

Re: [ceph-users] killing ceph-disk [was Re: ceph-volume: migration and disk partition support]

2017-10-10 Thread Alfredo Deza
On Mon, Oct 9, 2017 at 8:50 PM, Christian Balzer wrote: > > Hello, > > (pet peeve alert) > On Mon, 9 Oct 2017 15:09:29 + (UTC) Sage Weil wrote: > >> To put this in context, the goal here is to kill ceph-disk in mimic. >> >> One proposal is to make it so new OSDs can *only* be

Re: [ceph-users] how to debug (in order to repair) damaged MDS (rank)?

2017-10-10 Thread Daniel Baumann
Hi John, thank you very much for your help. On 10/10/2017 12:57 PM, John Spray wrote: > A) Do a "rados -p ls | grep "^506\." or similar, to > get a list of the objects done, gives me these: 506. 506.0017 506.001b 506.0019 506.001a 506.001c
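Spelled out, the cleanup John is describing looks roughly like this (a sketch; the metadata pool name is an assumption, and <object> stands for each name returned by the grep):

  rados -p cephfs_metadata ls | grep '^506\.'
  rados -p cephfs_metadata get <object> ./<object>.bak   # keep a backup copy first
  rados -p cephfs_metadata rm <object>
  ceph mds repaired 6                                    # then retry the damaged rank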

[ceph-users] BlueStore Cache Ratios

2017-10-10 Thread Jorge Pinilla López
I've been reading about BlueStore and came across the BlueStore cache and its ratios, which I couldn't fully understand. Are ratios of .99 KV, .01 metadata and .0 data right? They seem a little disproportionate. Also, .99 KV with a cache of 3GB for SSD means that almost the whole 3GB would be used for KV
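The ratios being asked about correspond to the bluestore_cache_* options, which can be checked per OSD via the admin socket (a sketch; osd.0 is a placeholder and must be local to the host you run this on):

  ceph daemon osd.0 config get bluestore_cache_size_ssd
  ceph daemon osd.0 config get bluestore_cache_kv_ratio
  ceph daemon osd.0 config get bluestore_cache_meta_ratio

Whatever fraction is left after the KV and metadata ratios is what gets used for caching object data.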

Re: [ceph-users] Unable to restrict a CephFS client to a subdirectory

2017-10-10 Thread Yoann Moulin
>> I am trying to follow the instructions at: >> http://docs.ceph.com/docs/master/cephfs/client-auth/ >> to restrict a client to a subdirectory of Ceph filesystem, but always get >> an error. >> >> We are running the latest stable release of Ceph (v12.2.1) on CentOS 7 >> servers. The user

Re: [ceph-users] Unable to restrict a CephFS client to a subdirectory

2017-10-10 Thread John Spray
On Tue, Oct 10, 2017 at 2:22 AM, Shawfeng Dong wrote: > Dear all, > > I am trying to follow the instructions at: > http://docs.ceph.com/docs/master/cephfs/client-auth/ > to restrict a client to a subdirectory of Ceph filesystem, but always get > an error. > > We are running the

Re: [ceph-users] how to debug (in order to repair) damaged MDS (rank)?

2017-10-10 Thread John Spray
On Tue, Oct 10, 2017 at 10:28 AM, Daniel Baumann wrote: > Hi all, > > unfortunately I'm still struggling to bring cephfs back up after one of > the MDS has been marked "damaged" (see messages from Monday). > > 1. When I mark the rank as "repaired", this is what I get in the

[ceph-users] advice on number of objects per OSD

2017-10-10 Thread Alexander Kushnirenko
Hi, Are there any recommendations on the limit at which OSD performance starts to decline because of a large number of objects? Or perhaps a procedure for how to find this number (Luminous)? My understanding is that the recommended object size is 10-100 MB, but is there any performance hit due

[ceph-users] 1 MDSs behind on trimming (was Re: clients failing to advance oldest client/flush tid)

2017-10-10 Thread John Spray
On Tue, Oct 10, 2017 at 3:48 AM, Nigel Williams wrote: > On 9 October 2017 at 19:21, Jake Grimmett wrote: >> HEALTH_WARN 9 clients failing to advance oldest client/flush tid; >> 1 MDSs report slow requests; 1 MDSs behind on trimming (This is
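For readers hitting the same warning, the usual first steps are to look at the MDS log segment counters and, if needed, raise the trimming limits (a sketch; the MDS name and the value are assumptions, not advice from this thread):

  ceph daemon mds.<name> perf dump mds_log
  ceph tell mds.<name> injectargs '--mds_log_max_segments=128'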

[ceph-users] how to debug (in order to repair) damaged MDS (rank)?

2017-10-10 Thread Daniel Baumann
Hi all, unfortunately I'm still struggling to bring cephfs back up after one of the MDS has been marked "damaged" (see messages from Monday). 1. When I mark the rank as "repaired", this is what I get in the monitor log (leaving unrelated leveldb compaction chatter aside): 2017-10-10

Re: [ceph-users] ceph-volume: migration and disk partition support

2017-10-10 Thread Dan van der Ster
On Fri, Oct 6, 2017 at 6:56 PM, Alfredo Deza wrote: > Hi, > > Now that ceph-volume is part of the Luminous release, we've been able > to provide filestore support for LVM-based OSDs. We are making use of > LVM's powerful mechanisms to store metadata which allows the process > to

Re: [ceph-users] ceph-volume: migration and disk partition support

2017-10-10 Thread Stefan Kooman
Hi, Quoting Alfredo Deza (ad...@redhat.com): > Hi, > > Now that ceph-volume is part of the Luminous release, we've been able > to provide filestore support for LVM-based OSDs. We are making use of > LVM's powerful mechanisms to store metadata which allows the process > to no longer rely on UDEV

Re: [ceph-users] Unable to restrict a CephFS client to a subdirectory

2017-10-10 Thread Yoann Moulin
Hello, > I am trying to follow the instructions at: > http://docs.ceph.com/docs/master/cephfs/client-auth/ > to restrict a client to a subdirectory of Ceph filesystem, but always get an > error. > > We are running the latest stable release of Ceph (v12.2.1) on CentOS 7 > servers. The user