Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-11 Thread Brett Chancellor
I did try running sudo ceph-bluestore-tool --out-dir /mnt/ceph bluefs-export, but it died after writing out 93 GB and filling up my root partition. On Thu, Jul 11, 2019 at 3:32 PM Brett Chancellor wrote: > We moved the .rgw.meta data pool over to SSD to try to improve > performance; during the
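A minimal sketch of retrying the export against a filesystem with enough free space (OSD id and mount point are placeholders, not from the original report); ceph-bluestore-tool takes the stopped OSD's data path and a writable --out-dir:

  # check the target has enough free space for the BlueFS contents first
  df -h /mnt/bigdisk
  # with the OSD stopped, export its BlueFS files to the larger filesystem
  sudo ceph-bluestore-tool bluefs-export \
      --path /var/lib/ceph/osd/ceph-0 \
      --out-dir /mnt/bigdisk/bluefs-osd0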

Re: [ceph-users] "session established", "io error", "session lost, hunting for new mon" solution/fix

2019-07-11 Thread Marc Roos
Anyone know why I would get these? Is it not strange to get them in a 'standard' setup? -Original Message- Subject: [ceph-users] "session established", "io error", "session lost, hunting for new mon" solution/fix I have on a cephfs client again (luminous cluster, centos7, only

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-11 Thread Brett Chancellor
We moved the .rgw.meta data pool over to SSD to try to improve performance; during the backfill the SSDs began dying en masse. Log attached to this case: https://tracker.ceph.com/issues/40741 Right now the SSDs won't come up with either allocator and the cluster is pretty much dead. What are the
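For context, a rough sketch of how the BlueStore allocator is normally switched per OSD before retrying a start (the two values shown are the allocators available in Nautilus; whether either helps in this crash is exactly what is in question, and the OSD id is a placeholder):

  # /etc/ceph/ceph.conf
  [osd]
  bluestore_allocator = bitmap    # or "stupid", the alternative allocator

  # then retry one of the failed SSD OSDs and watch how far it gets
  sudo systemctl start ceph-osd@12
  journalctl -u ceph-osd@12 -n 100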

Re: [ceph-users] P1 production down - 4 OSDs down will not start 14.2.1 nautilus

2019-07-11 Thread Edward Kalk
Production has been restored; it just took about 26 minutes for Linux to let me execute the OSD start command this time. The longest yet. sudo systemctl start ceph-osd@X (Yes, this has happened to us about 4 times now.) -Ed > On Jul 11, 2019, at 11:38 AM, Edward Kalk wrote: > > Rebooted node

Re: [ceph-users] memory usage of: radosgw-admin bucket rm

2019-07-11 Thread Harald Staub
Created https://tracker.ceph.com/issues/40700 (sorry, forgot to mention). On 11.07.19 16:41, Matt Benjamin wrote: I don't think one has been created yet. Eric Ivancich and Mark Kogan of my team are investigating this behavior. Matt On Thu, Jul 11, 2019 at 10:40 AM Paul Emmerich wrote: Is

Re: [ceph-users] memory usage of: radosgw-admin bucket rm [EXT]

2019-07-11 Thread Matthew Vernon
On 11/07/2019 15:40, Paul Emmerich wrote: Is there already a tracker issue? I'm seeing the same problem here. Started deletion of a bucket with a few hundred million objects a week ago or so and I've now noticed that it's also leaking memory and probably going to crash. Going to investigate

Re: [ceph-users] memory usage of: radosgw-admin bucket rm

2019-07-11 Thread Matt Benjamin
I don't think one has been created yet. Eric Ivancich and Mark Kogan of my team are investigating this behavior. Matt On Thu, Jul 11, 2019 at 10:40 AM Paul Emmerich wrote: > > Is there already a tracker issue? > > I'm seeing the same problem here. Started deletion of a bucket with a few >

[ceph-users] 14.2.1 Nautilus OSDs crash

2019-07-11 Thread Edward Kalk
http://tracker.ceph.com/issues/38724 ^ This bug seems related; I've added notes to it. Triggers seem to be a node reboot, or removing or adding an OSD. There seem to be backport duplicates for Mimic and Luminous: Copied to RADOS - Backport #39692

Re: [ceph-users] memory usage of: radosgw-admin bucket rm

2019-07-11 Thread Paul Emmerich
Is there already a tracker issue? I'm seeing the same problem here. Started deletion of a bucket with a few hundred million objects a week ago or so and I've now noticed that it's also leaking memory and probably going to crash. Going to investigate this further... Paul -- Paul Emmerich
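For reference, a sketch of the kind of removal being discussed (bucket name is a placeholder); --purge-objects deletes the objects along with the bucket, which is the long-running, memory-hungry part:

  radosgw-admin bucket rm --bucket=big-bucket --purge-objects
  # progress can be sanity-checked from another shell with:
  radosgw-admin bucket stats --bucket=big-bucket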

Re: [ceph-users] RGW Beast crash 14.2.1

2019-07-11 Thread Casey Bodley
On 7/11/19 3:28 AM, EDH - Manuel Rios Fernandez wrote: Hi Folks, Last night RGW crashed for no apparent reason using Beast as the frontend. We worked around it by turning civetweb back on. Should this be reported to the tracker? Please do. It looks like this crashed during startup. Can you please include the rgw_frontends
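Illustrative only (the actual values from this cluster are what Casey is asking for): typical rgw_frontends settings for Beast, and the civetweb fallback, might look like this, assuming the default 7480 port:

  # /etc/ceph/ceph.conf
  [client.rgw.ceph-rgw03]
  rgw_frontends = beast port=7480
  # the fallback the poster switched back to:
  # rgw_frontends = civetweb port=7480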

Re: [ceph-users] shutdown down all monitors

2019-07-11 Thread Nathan Fish
The monitors determine quorum, so stopping all monitors will immediately stop IO to prevent split-brain. I would not recommend shutting down all mons at once in production, though it *should* come back up fine. If you really need to, shut them down in a certain order, and bring them back up in the
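A minimal sketch of the orderly stop/start being described, assuming three monitors named mon1..mon3 (hostnames are placeholders):

  # stop the monitors one at a time
  sudo systemctl stop ceph-mon@mon3
  sudo systemctl stop ceph-mon@mon2
  sudo systemctl stop ceph-mon@mon1
  # bring them back in reverse order and confirm quorum re-forms
  sudo systemctl start ceph-mon@mon1
  sudo systemctl start ceph-mon@mon2
  sudo systemctl start ceph-mon@mon3
  ceph -s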

[ceph-users] Libceph clock drift or I guess kernel clock drift issue

2019-07-11 Thread Marc Roos
I noticed that dmesg -T gives an incorrect time; the messages have a time in the future compared to the system time. Not sure if this is a libceph issue or a kernel issue. [Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session lost, hunting for new mon [Thu Jul 11 10:41:22 2019]
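dmesg -T reconstructs wall-clock times from the boot time plus the raw kernel timestamp, so it can drift after NTP corrections or suspend; a quick way to measure the offset (harmless test message, nothing Ceph-specific):

  date
  echo "clock check" | sudo tee /dev/kmsg   # inject a marker into the kernel log
  dmesg -T | tail -1                        # compare its timestamp with the date above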

[ceph-users] "session established", "io error", "session lost, hunting for new mon" solution/fix

2019-07-11 Thread Marc Roos
I have this on a cephfs client again (Luminous cluster, CentOS 7, only 32 OSDs!). Wanted to share the 'fix'. [Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session established [Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 io error [Thu Jul 11 12:16:09 2019] libceph: mon0

Re: [ceph-users] What's the best practice for Erasure Coding

2019-07-11 Thread Frank Schilder
Oh dear. Every occurrence of stripe_* is wrong :) It should be stripe_count (option --stripe-count in rbd create) everywhere in my text. Which choices are legal depends on the restrictions on stripe_count*stripe_unit (=stripe_size=stripe_width?) imposed by Ceph. I believe all of this ends up
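A small sketch of the options being referred to (pool/image names and sizes are placeholders): --stripe-unit and --stripe-count are the rbd create options in question, and rbd info shows the resulting layout:

  rbd create rbd/test-img --size 100G \
      --object-size 4M --stripe-unit 4M --stripe-count 1   # "trivial" striping
  rbd info rbd/test-img   # inspect the resulting image layout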

Re: [ceph-users] writable snapshots in cephfs? GDPR/DSGVO

2019-07-11 Thread Lars Täuber
Thu, 11 Jul 2019 10:24:16 +0200 "Marc Roos" ==> ceph-users , lmb : > What about creating snaps on a 'lower level' in the directory structure > so you do not need to remove files from a snapshot as a workaround? Thanks for the idea. This might be a solution for our use case. Regards, Lars

Re: [ceph-users] writable snapshots in cephfs? GDPR/DSGVO

2019-07-11 Thread Lars Täuber
Thu, 11 Jul 2019 10:21:16 +0200 Lars Marowsky-Bree ==> ceph-users@lists.ceph.com : > On 2019-07-10T09:59:08, Lars Täuber wrote: > > > Hi everybody! > > > > Is it possible to make snapshots in cephfs writable? > > We need to remove files because of this General Data Protection Regulation > >

Re: [ceph-users] What's the best practice for Erasure Coding

2019-07-11 Thread Lars Marowsky-Bree
On 2019-07-11T09:46:47, Frank Schilder wrote: > Striping with stripe units other than 1 is something I also tested. I found > that with EC pools non-trivial striping should be avoided. Firstly, EC is > already a striped format and, secondly, striping on top of that with > stripe_unit>1 will

Re: [ceph-users] shutdown down all monitors

2019-07-11 Thread Wido den Hollander
On 7/11/19 11:42 AM, Marc Roos wrote: > > > Can I temporarily shut down all my monitors? This only affects new > connections, right? Existing ones will keep running? > You can, but it will completely shut down your whole Ceph cluster. All I/O will pause until the MONs are back and have reached

Re: [ceph-users] What's the best practice for Erasure Coding

2019-07-11 Thread Frank Schilder
Striping with stripe units other than 1 is something I also tested. I found that with EC pools non-trivial striping should be avoided. Firstly, EC is already a striped format and, secondly, striping on top of that with stripe_unit>1 will make every write an ec_overwrite, because now shards are
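To make the stripe_width relationship concrete, a hedged example (profile name and k/m values are placeholders, not Frank's setup): an EC profile's stripe_width is k times the per-shard chunk size, and it is visible on the pool:

  ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=host
  ceph osd erasure-code-profile get ec-4-2
  # after creating a pool with this profile, its stripe_width shows up in:
  ceph osd pool ls detail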

[ceph-users] shutdown down all monitors

2019-07-11 Thread Marc Roos
Can I temporarily shut down all my monitors? This only affects new connections, right? Existing ones will keep running?

Re: [ceph-users] Luminous cephfs maybe not as stable as expected?

2019-07-11 Thread Marc Roos
I decided to restart osd.0, after which the load on the cephfs client and on all OSD nodes dropped. After this I still have on the first server [@~]# cat /sys/kernel/debug/ceph/0f1701f5-453a-4a3b-928d-f652a2bbbcb0.client357431 0/osdc REQUESTS 0 homeless 0 LINGER REQUESTS BACKOFFS [@~]# cat

Re: [ceph-users] Luminous cephfs maybe not as stable as expected?

2019-07-11 Thread Marc Roos
Forgot to add these [@ ~]# cat /sys/kernel/debug/ceph/0f1701f5-453a-4a3b-928d-f652a2bbbcb0.client357431 0/osdc REQUESTS 0 homeless 0 LINGER REQUESTS BACKOFFS [@~]# cat /sys/kernel/debug/ceph/0f1701f5-453a-4a3b-928d-f652a2bbbcb0.client358422 4/osdc REQUESTS 38 homeless 0 317841 osd0
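For anyone following along, the check being used here is the kernel client's debugfs view of in-flight OSD requests; a sketch of scanning all client instances at once (the cluster fsid and client ids will differ):

  # on the cephfs client node (requires root and a mounted debugfs)
  for f in /sys/kernel/debug/ceph/*/osdc; do
      echo "== $f"; sudo cat "$f"
  done
  # a REQUESTS count that stays non-zero (38 on osd0 above) points at the
  # OSD whose restart cleared the hang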

[ceph-users] Luminous cephfs maybe not as stable as expected?

2019-07-11 Thread Marc Roos
Maybe this requires some attention. I have a default CentOS 7 (maybe not the most recent kernel though), Ceph Luminous setup, i.e. no custom kernels. This is the 2nd or 3rd time that a VM has gone into high load (151) and stopped its services. I have two VMs both mounting the same 2 cephfs

Re: [ceph-users] writable snapshots in cephfs? GDPR/DSGVO

2019-07-11 Thread Marc Roos
What about creating snaps on a 'lower level' in the directory structure so you do not need to remove files from a snapshot as a workaround? -Original Message- From: Lars Marowsky-Bree [mailto:l...@suse.com] Sent: Thursday 11 July 2019 10:21 To: ceph-users@lists.ceph.com Subject:
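A sketch of the per-directory layout being suggested, using the .snap mechanism CephFS already provides (paths are placeholders, and snapshots must be enabled on the filesystem): snapshotting each subdirectory separately means removing one snapshot only affects that subtree:

  # create a snapshot of just one subdirectory
  mkdir /mnt/cephfs/projects/projA/.snap/daily-2019-07-11
  # dropping the data later only requires removing that snapshot
  rmdir /mnt/cephfs/projects/projA/.snap/daily-2019-07-11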

Re: [ceph-users] writable snapshots in cephfs? GDPR/DSGVO

2019-07-11 Thread Lars Marowsky-Bree
On 2019-07-10T09:59:08, Lars Täuber wrote: > Hi everybody! > > Is it possible to make snapshots in cephfs writable? > We need to remove files because of this General Data Protection Regulation > also from snapshots. Removing data from existing WORM storage is tricky, snapshots being a

Re: [ceph-users] What's the best practice for Erasure Coding

2019-07-11 Thread Lars Marowsky-Bree
On 2019-07-09T07:27:28, Frank Schilder wrote: > Small addition: > > This result holds for rbd bench. It seems to imply good performance for > large-file IO on cephfs, since cephfs will split large files into many > objects of size object_size. Small-file IO is a different story. > > The

[ceph-users] RGW Beast crash 14.2.1

2019-07-11 Thread EDH - Manuel Rios Fernandez
Hi Folks, Last night RGW crashed for no apparent reason using Beast as the frontend. We worked around it by turning civetweb back on. Should this be reported to the tracker? Regards, Manuel. CentOS 7.6, Linux ceph-rgw03 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux