Re: [ceph-users] init mon fail since use service rather than systemctl

2018-06-21 Thread xiang....@sky-data.cn
Thanks very much - Original Message - From: "Alfredo Deza" To: "xiang dai" Cc: "ceph-users" Sent: Thursday, June 21, 2018 8:42:34 PM Subject: Re: [ceph-users] init mon fail since use service rather than systemctl On Thu, Jun 21, 2018 at 8:41 AM, wrote: > I met below issue: > > INFO:

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-21 Thread Brad Hubbard
That seems like an authentication issue? Try running it like so... $ ceph --debug_monc 20 --debug_auth 20 pg 18.2 query On Thu, Jun 21, 2018 at 12:18 AM, Andrei Mikhailovsky wrote: > Hi Brad, > > Yes, but it doesn't show much: > > ceph pg 18.2 query > Error EPERM: problem getting command
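A minimal sketch of the debug invocation suggested above, assuming the command is run with a keyring the cluster actually accepts; the extra monc/auth output should show where the EPERM comes from:

    # re-run the pg query with monitor-client and auth debugging turned up,
    # capturing the output for later inspection
    ceph --debug_monc 20 --debug_auth 20 pg 18.2 query 2>&1 | tee pg-18.2-query.log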

Re: [ceph-users] lacp bonding | working as expected..?

2018-06-21 Thread mj
Hi Jacob, Thanks for your reply. But I'm not sure I completely understand it. :-) On 06/21/2018 09:09 PM, Jacob DeGlopper wrote: In your example, where you see one link being used, I see an even source IP paired with an odd destination port number for both transfers, or is that a search and

[ceph-users] Centos kernel

2018-06-21 Thread Steven Vacaroaia
Hi, Just wondering if you would recommend using the newest kernel on CentOS (i.e. after installing the regular CentOS kernel (3.10.0-862), enable elrepo-kernel and install 4.17) or simply stay with the regular one. Many thanks, Steven
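For reference, a hedged sketch of the ELRepo route on CentOS 7 (repository and package names as commonly used at the time; verify against elrepo.org before running):

    # enable the ELRepo repository and install the mainline (kernel-ml) series
    yum install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
    yum --enablerepo=elrepo-kernel install kernel-ml
    # make the new kernel the default boot entry, then reboot into it
    grub2-set-default 0
    reboot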

Re: [ceph-users] lacp bonding | working as expected..?

2018-06-21 Thread Jacob DeGlopper
Consider trying some variation in source and destination IP addresses and port numbers - unless you force it, iperf3 at least tends to pick only even port numbers for the ephemeral source port, which leads to all traffic being balanced to one link. In your example, where you see one link
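A hedged illustration of the point about port selection, assuming an iperf3 server reachable at 192.168.1.10 with listeners on ports 5201-5204 (all addresses and ports are placeholders): varying both the server port and the client port changes the layer3+4 hash input, so the streams can land on different bond slaves.

    # confirm the bond is hashing on layer3+4 at all
    grep "Transmit Hash Policy" /proc/net/bonding/bond0

    # start four parallel streams with deliberately varied (and odd) client ports
    for i in 1 2 3 4; do
        iperf3 -c 192.168.1.10 -p $((5200 + i)) --cport $((52001 + 2 * i)) -t 30 &
    done
    wait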

Re: [ceph-users] MDS: journaler.pq decode error

2018-06-21 Thread John Spray
On Thu, Jun 21, 2018 at 4:39 PM Benjeman Meekhof wrote: > > I do have one follow-up related question: While doing this I took > offline all the standby MDS, and max_mds on our cluster is at 1. Were > I to enable multiple MDS would they all actively split up processing > the purge queue? When

Re: [ceph-users] MDS: journaler.pq decode error

2018-06-21 Thread Benjeman Meekhof
I do have one follow-up related question: While doing this I took offline all the standby MDS, and max_mds on our cluster is at 1. Were I to enable multiple MDS would they all actively split up processing the purge queue? We have not yet at this point ever allowed multi active MDS but plan to
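For context, a hedged sketch of how a second active MDS would be enabled on a Mimic cluster (the filesystem name "cephfs" is a placeholder); whether additional ranks share the purge-queue work is exactly the question put to the list here:

    # allow two active MDS ranks; a standby will be promoted to fill rank 1
    ceph fs set cephfs max_mds 2
    # watch the ranks come up
    ceph fs status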

Re: [ceph-users] MDS: journaler.pq decode error

2018-06-21 Thread Benjeman Meekhof
Thanks very much John! Skipping over the corrupt entry by setting a new expire_pos seems to have worked. The journal expire_pos is now advancing and pools are being purged. It has a little while to go to catch up to current write_pos but the journal inspect command gives an 'OK' for overall
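A hedged sketch of the Mimic-era purge-queue journal commands involved (the rank "cephfs:0" and the offset are placeholders; on a live cluster the header should only be rewritten with the MDS stopped and after backing up the journal):

    # inspect the purge queue journal and its header
    cephfs-journal-tool --rank=cephfs:0 --journal=purge_queue journal inspect
    cephfs-journal-tool --rank=cephfs:0 --journal=purge_queue header get
    # advance expire_pos past a corrupt entry (offset is illustrative only)
    cephfs-journal-tool --rank=cephfs:0 --journal=purge_queue header set expire_pos <new_offset>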

Re: [ceph-users] Designating an OSD as a spare

2018-06-21 Thread Wido den Hollander
On 06/21/2018 03:35 PM, Drew Weaver wrote: > Yes, > > Eventually however you would probably want to replace that physical disk > that has died and sometimes with remote deployments it is nice to not > have to do that instantly which is how enterprise arrays and support > contracts have

Re: [ceph-users] Designating an OSD as a spare

2018-06-21 Thread Drew Weaver
Yes. Eventually, however, you would probably want to replace that physical disk that has died, and with remote deployments it is sometimes nice not to have to do that instantly, which is how enterprise arrays and support contracts have worked for decades. I understand your point from a purely

Re: [ceph-users] Designating an OSD as a spare

2018-06-21 Thread Paul Emmerich
Spare disks are bad design. There is no point in having a disk that is not being used. Ceph will automatically remove a dead disk from the cluster after 15 minutes, backfilling the data onto other disks. Paul 2018-06-21 14:54 GMT+02:00 Drew Weaver : > Does anyone know if it is possible to
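The automatic mark-out behaviour referred to here is controlled by a mon option; a hedged ceph.conf sketch (the interval shown is illustrative, check your cluster's configured value):

    [mon]
    # seconds a down OSD is tolerated before it is marked out and backfill starts
    mon osd down out interval = 900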

Re: [ceph-users] "ceph pg scrub" does not start

2018-06-21 Thread Jake Grimmett
On 21/06/18 10:14, Wido den Hollander wrote: Hi Wido, >> Note the date stamps, the scrub command appears to be ignored >> >> Any ideas on why this is happening, and what we can do to fix the error? > > Are any of the OSDs involved with that PG currently doing recovery? If > so, they will ignore
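A hedged sketch of how one might check whether recovery is what is holding the scrub back (the OSD id and PG id are placeholders; the config query runs on the host carrying that OSD):

    # scrubs are skipped while an OSD is recovering unless this option is enabled
    ceph daemon osd.<id> config get osd_scrub_during_recovery
    # check the PG's current state and overall recovery activity
    ceph pg <pgid> query | grep -i -m1 '"state"'
    ceph -s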

Re: [ceph-users] MDS: journaler.pq decode error

2018-06-21 Thread John Spray
On Wed, Jun 20, 2018 at 2:17 PM Benjeman Meekhof wrote: > > Thanks for the response. I was also hoping to be able to debug better > once we got onto Mimic. We just finished that upgrade yesterday and > cephfs-journal-tool does find a corruption in the purge queue though > our MDS continues to

[ceph-users] Designating an OSD as a spare

2018-06-21 Thread Drew Weaver
Does anyone know if it is possible to designate an OSD as a spare so that if a disk dies in a host no administrative action needs to be immediately taken to remedy the situation? Thanks, -Drew

Re: [ceph-users] CentOS Dojo at CERN

2018-06-21 Thread Dan van der Ster
On Thu, Jun 21, 2018 at 2:41 PM Kai Wagner wrote: > > On 20.06.2018 17:39, Dan van der Ster wrote: > > And BTW, if you can't make it to this event we're in the early days of > > planning a dedicated Ceph + OpenStack Days at CERN around May/June > > 2019. > > More news on that later... > Will that

Re: [ceph-users] init mon fail since use service rather than systemctl

2018-06-21 Thread Alfredo Deza
On Thu, Jun 21, 2018 at 8:41 AM, wrote: > I met below issue: > > INFO: initialize ceph mon ... > [ceph_deploy.conf][DEBUG ] found configuration file at: > /root/.cephdeploy.conf > [ceph_deploy.cli][INFO ] Invoked (1.5.25): /usr/bin/ceph-deploy > --overwrite-conf mon create-initial >

Re: [ceph-users] CentOS Dojo at CERN

2018-06-21 Thread Kai Wagner
On 20.06.2018 17:39, Dan van der Ster wrote: > And BTW, if you can't make it to this event we're in the early days of > planning a dedicated Ceph + OpenStack Days at CERN around May/June > 2019. > More news on that later... Will that be during a CERN maintenance window? *that would raise my

[ceph-users] init mon fail since use service rather than systemctl

2018-06-21 Thread xiang . dai
I met below issue: INFO: initialize ceph mon ... [ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf [ceph_deploy.cli][INFO ] Invoked (1.5.25): /usr/bin/ceph-deploy --overwrite-conf mon create-initial [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts
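On a systemd-based host the monitor is normally driven through its systemd unit rather than the legacy service wrapper; a hedged sketch (the mon id, usually the short hostname, is a placeholder):

    # enable and start the monitor unit directly with systemctl
    systemctl enable ceph-mon@<mon-id>
    systemctl start ceph-mon@<mon-id>
    systemctl status ceph-mon@<mon-id>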

[ceph-users] MDS reports metadata damage

2018-06-21 Thread Hennen, Christian
Dear Community, here at ZIMK at the University of Trier we operate a Ceph Luminous cluster as a filer for an HPC environment via CephFS (BlueStore backend). During setup last year we made the mistake of not configuring the RAID as JBOD, so initially the 3 nodes only housed 1 OSD each. Currently,
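For reference, a hedged sketch of how the reported damage can be listed from the MDS (the mds name is a placeholder; entries should only be removed once the underlying cause is understood):

    # list what the MDS has flagged as damaged metadata
    ceph tell mds.<name> damage ls
    # remove a specific damage table entry by id (use with care)
    ceph tell mds.<name> damage rm <damage_id>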

[ceph-users] "ceph pg scrub" does not start

2018-06-21 Thread Jake Grimmett
Dear All, A bad disk controller appears to have damaged our cluster... # ceph health HEALTH_ERR 10 scrub errors; Possible data damage: 10 pgs inconsistent probing to find bad pg... # ceph health detail HEALTH_ERR 10 scrub errors; Possible data damage: 10 pgs inconsistent OSD_SCRUB_ERRORS 10
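A hedged sketch of the usual way to pin down which PGs and objects are inconsistent (pool and PG ids are placeholders):

    # list inconsistent PGs per pool, then the objects inside one PG
    rados list-inconsistent-pg <pool_name>
    rados list-inconsistent-obj <pgid> --format=json-pretty
    # once the cause is understood, an individual PG can be repaired
    ceph pg repair <pgid>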

Re: [ceph-users] PG status is "active+undersized+degraded"

2018-06-21 Thread Burkhard Linke
Hi, On 06/21/2018 05:14 AM, dave.c...@dell.com wrote: Hi all, I have set up a Ceph cluster in my lab recently; the configuration per my understanding should be okay, 4 OSDs across 3 nodes, 3 replicas, but a couple of PGs are stuck in state "active+undersized+degraded", I think this should be very
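A hedged sketch of the commands usually used to see why a 3-replica PG cannot find a third OSD when only 4 OSDs are spread unevenly across 3 hosts (pool and PG ids are placeholders):

    # how the OSDs are distributed over hosts (the default failure domain)
    ceph osd tree
    # confirm the pool's replica count
    ceph osd pool get <pool_name> size
    # the up/acting set of a stuck PG shows where the third copy is missing
    ceph pg <pgid> query | grep -A5 '"up"'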