Re: [ceph-users] Command ceph osd df hangs

2019-11-21 Thread Thomas Schneider
Hi, issue solved! I stopped the active MGR service and waited until the standby MGR became active. Then I started the (previously stopped) MGR service again in order to have 2 standbys. Thanks Eugen. On 21.11.2019 at 15:23, Eugen Block wrote: > Hi, > > check if the active MGR is hanging. > I had this
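For reference, the failover described here looks roughly like this, assuming systemd-managed daemons; the MGR host and name are placeholders:

  # see which MGR is active and which are standby
  ceph -s | grep mgr
  # stop the active MGR so a standby takes over
  systemctl stop ceph-mgr@<mgr-host>
  # once the standby is active, start the old daemon again as a standby
  systemctl start ceph-mgr@<mgr-host>
  # alternatively, trigger a failover without touching systemd
  ceph mgr fail <mgr-name>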

Re: [ceph-users] RBD Mirror DR Testing

2019-11-21 Thread Jason Dillaman
On Thu, Nov 21, 2019 at 10:16 AM Vikas Rana wrote: > > Thanks Jason. > We are just mounting and verifying the directory structure to make sure it > looks good. > > My understanding was that in 12.2.10 we can't mount the DR snapshot because the RBD > image is non-primary. Is this wrong? You have

Re: [ceph-users] RBD Mirror DR Testing

2019-11-21 Thread Vikas Rana
Thanks Jason. We are just mounting and verifying the directory structure to make sure it looks good. My understanding was that in 12.2.10 we can't mount the DR snapshot because the RBD image is non-primary. Is this wrong? Thanks, -Vikas -Original Message- From: Jason Dillaman Sent:
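One approach that is often suggested for this kind of check (a sketch only, with placeholder pool/image/snapshot names; not necessarily what Jason goes on to recommend in his reply) is to snapshot the primary image, let the snapshot replicate, and map it read-only on the DR site:

  # on the primary site: create a snapshot that rbd-mirror will replicate
  rbd snap create mypool/myimage@dr-test
  # on the DR site, once the snapshot has arrived: map it read-only
  rbd map mypool/myimage@dr-test --read-only
  # mount the device printed by rbd map; for XFS, norecovery avoids log replay
  mount -o ro,norecovery /dev/rbd0 /mnt/dr-test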

Re: [ceph-users] Scaling out

2019-11-21 Thread Alfredo De Luca
Thanks heaps Nathan. That's what we thought and wanted to implement, but I wanted to double-check with the community. Cheers On Thu, Nov 21, 2019 at 2:42 PM Nathan Fish wrote: > The default crush rule uses "host" as the failure domain, so in order > to deploy on one host you will need to

Re: [ceph-users] Replace bad db for bluestore

2019-11-21 Thread Nathan Fish
A power outage shouldn't corrupt your DB unless you are doing dangerous async writes. Sharing an SSD between several OSDs on the same host is normal, and it is not an issue as long as you have planned for the failure of entire hosts. On Thu, Nov 21, 2019 at 9:57 AM 展荣臻(信泰) wrote: > > > > In general db is

Re: [ceph-users] RBD Mirror DR Testing

2019-11-21 Thread Jason Dillaman
On Thu, Nov 21, 2019 at 9:56 AM Jason Dillaman wrote: > > On Thu, Nov 21, 2019 at 8:49 AM Vikas Rana wrote: > > > > Thanks Jason for such a quick response. We are on 12.2.10. > > > > Checksumming a 200TB image will take a long time. > > How would mounting an RBD image and scanning the image be

Re: [ceph-users] Replace bad db for bluestore

2019-11-21 Thread 展荣臻(信泰)
In general the DB is located on an SSD, and 4~5 or more OSDs share the same SSD. Consider a situation where the DB is broken due to a power outage in the data center; that would be a disaster. > You should design your cluster and crush rules such that a failure of > a single OSD is not a problem.

Re: [ceph-users] RBD Mirror DR Testing

2019-11-21 Thread Jason Dillaman
On Thu, Nov 21, 2019 at 8:49 AM Vikas Rana wrote: > > Thanks Jason for such a quick response. We are on 12.2.10. > > Checksumming a 200TB image will take a long time. How would mounting an RBD image and scanning the image be faster? Are you only using a small percentage of the image? > To test
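If checksumming is the route taken, a quick sketch of comparing both sides without mounting anything (pool/image/snapshot names are placeholders; this reads the full image, which is why it is slow for 200TB):

  # run against the same mirrored snapshot on the primary and the DR cluster,
  # then compare the two hashes
  rbd export mypool/myimage@dr-test - | md5sum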

Re: [ceph-users] Replace bad db for bluestore

2019-11-21 Thread 展荣臻(信泰)
Thanks for your reply. If the DB is broken but the block device is good, can I assume the data is still OK? Re-creating the OSD causes a lot more data migration than only replacing the DB would. > > if it's no longer readable: no, the OSD is lost, you'll have to re-create it > > Paul > > -- > Paul Emmerich > >
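For what it's worth, re-creating a single OSD while keeping its ID limits the migration to that OSD's own PGs. A rough sketch, assuming ceph-volume, with the OSD ID and device paths as placeholders:

  # mark the broken OSD as destroyed, keeping its ID and CRUSH position
  ceph osd destroy 12 --yes-i-really-mean-it
  # wipe the old devices
  ceph-volume lvm zap /dev/sdX --destroy
  # re-create the OSD with the same ID and a fresh DB device
  ceph-volume lvm create --osd-id 12 --data /dev/sdX --block.db /dev/nvme0n1p1
  # backfill then restores the data from the surviving replicas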

Re: [ceph-users] Command ceph osd df hangs

2019-11-21 Thread Eugen Block
Hi, check if the active MGR is hanging. I had this when testing pg_autoscaler; after some time every command would hang. Restarting the MGR helped for a short period of time, then I disabled pg_autoscaler. This is an upgraded cluster, currently on Nautilus. Regards, Eugen Quoting
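Disabling the autoscaler as described here can be done either globally or per pool, roughly like this (the pool name is a placeholder):

  # disable the mgr module entirely
  ceph mgr module disable pg_autoscaler
  # or switch it off for a single pool
  ceph osd pool set <pool> pg_autoscale_mode off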

[ceph-users] Command ceph osd df hangs

2019-11-21 Thread Thomas Schneider
Hi, the command ceph osd df does not return any output. Based on the strace output there's a timeout. [...] mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f53006b9000 brk(0x55c2579b6000) = 0x55c2579b6000 brk(0x55c2579d7000) =

Re: [ceph-users] Cannot enable pg_autoscale_mode

2019-11-21 Thread Thomas Schneider
Update: the issue is solved. The output of "ceph osd dump" showed that the required setting was incorrect, i.e. it still said require_osd_release luminous. After executing ceph osd require-osd-release nautilus I can enable pg_autoscale_mode on any pool. THX On 21.11.2019 at 13:51, Paul Emmerich wrote: > "ceph

Re: [ceph-users] RBD Mirror DR Testing

2019-11-21 Thread Vikas Rana
Thanks Jason for such a quick response. We are on 12.2.10. Checksumming a 200TB image will take a long time. To test the DR copy by mounting it, these are the steps I'm planning to follow: 1. Demote the Prod copy and promote the DR copy 2. Do we have to recreate the rbd mirror relationship going
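The stock rbd-mirror commands behind those steps look roughly like this (pool/image names are placeholders; whether this is the right way to test is what the rest of the thread discusses):

  # step 1, on the production cluster: demote the primary image
  rbd mirror image demote mypool/myimage
  # then, on the DR cluster: promote its copy
  rbd mirror image promote mypool/myimage
  # to go back later, demote the DR copy, promote prod again, and if the
  # DR copy has diverged (split-brain), resync it from the new primary
  rbd mirror image resync mypool/myimage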

Re: [ceph-users] Replace bad db for bluestore

2019-11-21 Thread Nathan Fish
You should design your cluster and crush rules such that a failure of a single OSD is not a problem. Preferably such that losing any 1 host isn't a problem either. On Thu, Nov 21, 2019 at 6:32 AM zhanrzh...@teamsun.com.cn wrote: > > Hi,all > Suppose the db of bluestore can't read/write,are

Re: [ceph-users] Scaling out

2019-11-21 Thread Nathan Fish
The default crush rule uses "host" as the failure domain, so in order to deploy on one host you will need to make a crush rule that specifies "osd". Then simply adding more hosts with OSDs will result in automatic rebalancing. Once you have enough hosts to satisfy the crush rule (3 for replicated
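A sketch of that rule, and of switching back once there are enough hosts (rule and pool names are placeholders):

  # replicated rule that only requires distinct OSDs, not distinct hosts
  ceph osd crush rule create-replicated rep-osd default osd
  ceph osd pool set mypool crush_rule rep-osd
  # later, with 3+ hosts, move back to a host-level failure domain
  ceph osd crush rule create-replicated rep-host default host
  ceph osd pool set mypool crush_rule rep-host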

Re: [ceph-users] RBD Mirror DR Testing

2019-11-21 Thread Jason Dillaman
On Thu, Nov 21, 2019 at 8:29 AM Vikas Rana wrote: > > Hi all, > > > > We have a 200TB RBD image which we are replicating using RBD mirroring. > > We want to test the DR copy and make sure that we have a consistent copy in > case the primary site is lost. > > > > We did it previously and promoted the

[ceph-users] RBD Mirror DR Testing

2019-11-21 Thread Vikas Rana
Hi all, We have a 200TB RBD image which we are replicating using RBD mirroring. We want to test the DR copy and make sure that we have a consistent copy in case the primary site is lost. We did it previously and promoted the DR copy, which broke the mirroring relationship with the primary, and we had to resync

Re: [ceph-users] Cannot enable pg_autoscale_mode

2019-11-21 Thread Thomas Schneider
Looks like the flag is not correct. root@ld3955:~# ceph osd dump | grep nautilus root@ld3955:~# ceph osd dump | grep require require_min_compat_client luminous require_osd_release luminous On 21.11.2019 at 13:51, Paul Emmerich wrote: > "ceph osd dump" shows you if the flag is set > > > Paul >

Re: [ceph-users] Cannot enable pg_autoscale_mode

2019-11-21 Thread Paul Emmerich
"ceph osd dump" shows you if the flag is set Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Thu, Nov 21, 2019 at 1:18 PM Thomas Schneider <74cmo...@gmail.com>

Re: [ceph-users] Replace bad db for bluestore

2019-11-21 Thread Paul Emmerich
if it's no longer readable: no, the OSD is lost, you'll have to re-create it Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Thu, Nov 21, 2019 at 12:18 PM

[ceph-users] Scaling out

2019-11-21 Thread Alfredo De Luca
Hi all. We are doing some tests on how to scale out nodes on Ceph Nautilus. Basically we want to try to install Ceph on one node and then scale up to 2+ nodes. How can we do so? Every node has 6 disks; maybe we can use the crushmap to achieve this? Any thoughts/ideas/recommendations? Cheers --

Re: [ceph-users] Cannot enable pg_autoscale_mode

2019-11-21 Thread Thomas Schneider
Hello Paul, I didn't skip this step. Actually I'm sure that everything in the cluster is on Nautilus, because I had issues with SLES 12SP2 clients that failed to connect due to outdated client tools that could not connect to Nautilus. Would it make sense to execute ceph osd require-osd-release

[ceph-users] Replace bad db for bluestore

2019-11-21 Thread zhanrzh...@teamsun.com.cn
Hi all, suppose the BlueStore DB can't be read or written: are there methods to replace the bad DB with a new one in Luminous? If not, I dare not deploy Ceph with BlueStore in my production environment. zhanrzh...@teamsun.com.cn

Re: [ceph-users] Cannot enable pg_autoscale_mode

2019-11-21 Thread Paul Emmerich
Did you skip the last step of the OSD upgrade during the Nautilus upgrade? ceph osd require-osd-release nautilus Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90

[ceph-users] Cannot enable pg_autoscale_mode

2019-11-21 Thread Thomas Schneider
Hi, I'm trying to enable pg_autoscale_mode on a specific pool of my cluster; however, this returns an error. root@ld3955:~# ceph osd pool set ssd pg_autoscale_mode on Error EINVAL: must set require_osd_release to nautilus or later before setting pg_autoscale_mode The error message is clear, but my
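As the rest of the thread shows, the resolution boils down to the following (the pool name ssd is taken from the command above):

  # confirm what the cluster currently requires
  ceph osd dump | grep require_osd_release
  # only once every OSD is actually running Nautilus:
  ceph osd require-osd-release nautilus
  # after that, the original command succeeds
  ceph osd pool set ssd pg_autoscale_mode on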

[ceph-users] bucket policies with Principal (arn) on a subuser-level

2019-11-21 Thread Francois Scheurer
Dear All, Is it possible to define S3 bucket policies with the Principal ("arn:aws:iam:::user/parentusera") on a subuser level instead of a user level? I did a test with Nautilus (14.2.4-373) with a user 'parentusera' and a subuser 'subusera'. radosgw-admin user info --uid=parentusera {
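For context, a minimal sketch of the user-level policy under test, applied with s3cmd; the bucket name is a placeholder, and whether RGW will also honour a subuser-level Principal is exactly the open question:

  # write the policy (bucket name "testbucket" is hypothetical)
  cat > policy.json <<'EOF'
  {
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {"AWS": ["arn:aws:iam:::user/parentusera"]},
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::testbucket", "arn:aws:s3:::testbucket/*"]
    }]
  }
  EOF
  # attach it to the bucket
  s3cmd setpolicy policy.json s3://testbucket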