[ceph-users] Re: Balancing PGs across OSDs

2019-12-02 Thread Harald Staub
Hi all Something to try: ceph config set mgr mgr/balancer/upmap_max_iterations 20 (Default is 100.) Cheers Harry On 03.12.19 08:02, Lars Täuber wrote: BTW: The osdmaptool doesn't see anything to do either: $ ceph osd getmap -o om $ osdmaptool om --upmap /tmp/upmap.sh --upmap-pool
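
A minimal sketch of the tuning suggested above, plus the related deviation knob (the deviation value below is an assumption, not taken from the mail):

$ ceph config set mgr mgr/balancer/upmap_max_iterations 20
$ ceph config set mgr mgr/balancer/upmap_max_deviation 1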

[ceph-users] Re: Balancing PGs across OSDs

2019-12-02 Thread Lars Täuber
BTW: The osdmaptool doesn't see anything to do either: $ ceph osd getmap -o om $ osdmaptool om --upmap /tmp/upmap.sh --upmap-pool cephfs_data osdmaptool: osdmap file 'om' writing upmap command output to: /tmp/upmap.sh checking for upmap cleanups upmap, max-count 100, max deviation 0.01 limiting
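
When osdmaptool does find something to optimize, the output file consists of plain `ceph osd pg-upmap-items ...` commands; a sketch of generating and applying it, assuming the osdmap was dumped as above:

$ osdmaptool om --upmap /tmp/upmap.sh --upmap-pool cephfs_data
$ bash /tmp/upmap.sh   # apply the suggested pg-upmap-items mappings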

[ceph-users] Re: Balancing PGs across OSDs

2019-12-02 Thread Lars Täuber
Hi Konstantin, Tue, 3 Dec 2019 10:01:34 +0700 Konstantin Shalygin ==> Lars Täuber , ceph-users@ceph.io : > Please paste your `ceph osd df tree`, `ceph osd pool ls detail`, `ceph osd crush rule dump`. here it comes: $ ceph osd df tree ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA

[ceph-users] Re: Changing failure domain

2019-12-02 Thread Konstantin Shalygin
On 12/2/19 5:56 PM, Francois Legrand wrote: For replica, what is the best way to change crush profile ? Is it to create a new replica profile, and set this profile as crush rulest for the pool (something like ceph osd pool set {pool-name} crush_ruleset my_new_rule) ? Indeed. Then you can
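
A sketch of the replicated-rule switch being discussed, with placeholder rule and pool names; note that on Luminous and later the pool option is called crush_rule rather than the older crush_ruleset:

$ ceph osd crush rule create-replicated my_new_rule default host
$ ceph osd pool set {pool-name} crush_rule my_new_rule
$ ceph -s   # watch the resulting data movement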

[ceph-users] Re: Balancing PGs across OSDs

2019-12-02 Thread Konstantin Shalygin
On 12/2/19 5:55 PM, Lars Täuber wrote: Here we have a similar situation. After adding some OSDs to the cluster the PGs are not equally distributed over the OSDs. The balancing mode is set to upmap. The docs https://docs.ceph.com/docs/master/rados/operations/balancer/#modes say: "This CRUSH

[ceph-users] Re: Possible data corruption with 14.2.3 and 14.2.4

2019-12-02 Thread Paul Emmerich
On Mon, Dec 2, 2019 at 4:55 PM Simon Ironside wrote: > > Any word on 14.2.5? Nervously waiting here . . . real soon, the release is 99% done (check the corresponding thread on the devel mailing list) Paul > > Thanks, > Simon. > > On 18/11/2019 11:29, Simon Ironside wrote: > > > I will sit

[ceph-users] Re: Can min_read_recency_for_promote be -1

2019-12-02 Thread Paul Emmerich
I've recently configured something like this for a backup cluster with these settings: ceph osd pool set cache_test hit_set_type bloom ceph osd pool set cache_test hit_set_count 1 ceph osd pool set cache_test hit_set_period 7200 ceph osd pool set cache_test target_max_bytes 1 ceph osd
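
For context, a sketch of the tier wiring these pool settings usually sit on top of, assuming a base pool named ec_base and a cache pool named cache_test (both placeholders; the truncated target_max_bytes value above is left as a placeholder as well):

$ ceph osd tier add ec_base cache_test
$ ceph osd tier cache-mode cache_test writeback
$ ceph osd tier set-overlay ec_base cache_test
$ ceph osd pool set cache_test hit_set_type bloom
$ ceph osd pool set cache_test hit_set_count 1
$ ceph osd pool set cache_test hit_set_period 7200
$ ceph osd pool set cache_test target_max_bytes <bytes>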

[ceph-users] Re: Can min_read_recency_for_promote be -1

2019-12-02 Thread Romit Misra
Hi Robert, I am not quite sure if I get your question correctly, but what I understand is that you want the inbound writes to land on the cache tier, which presumably would be on faster media, possibly an SSD. From there you would want it to trickle down to the base tier, which is an EC pool

[ceph-users] Can min_read_recency_for_promote be -1

2019-12-02 Thread Robert LeBlanc
I'd like to configure a cache tier to act as a write buffer, so that if writes come in, it promotes objects, but reads never promote an object. We have a lot of cold data so we would like to tier down to an EC pool (CephFS) after a period of about 30 days to save space. The storage tier and the
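
For reference, a sketch of setting and inspecting the knobs in question on a placeholder cache pool; whether -1 is accepted is exactly the open question here, so the values are illustrative only:

$ ceph osd pool set cache_test min_read_recency_for_promote 1
$ ceph osd pool set cache_test min_write_recency_for_promote 0
$ ceph osd pool get cache_test min_read_recency_for_promote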

[ceph-users] Re: ceph node crashed with these errors "kernel: ceph: build_snap_context" (maybe now it is urgent?)

2019-12-02 Thread Marc Roos
Yes Luis, good guess!! ;) -----Original Message----- On Mon, Dec 02, 2019 at 10:27:21AM +0100, Marc Roos wrote: > > I have been asking before[1]. Since

[ceph-users] Re: Possible data corruption with 14.2.3 and 14.2.4

2019-12-02 Thread Simon Ironside
Any word on 14.2.5? Nervously waiting here . . . Thanks, Simon. On 18/11/2019 11:29, Simon Ironside wrote: I will sit tight and wait for 14.2.5. Thanks again, Simon.

[ceph-users] Re: ceph node crashed with these errors "kernel: ceph: build_snap_context" (maybe now it is urgent?)

2019-12-02 Thread Marc Roos
I can confirm that removing all the snapshots seems to resolve the problem. A - I would propose a redesign so that only snapshots below the mountpoint are taken into account, not snapshots in the entire filesystem. That should fix a lot of issues. B - That reminds me
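
For reference, CephFS snapshots are managed through the hidden .snap directory; a sketch with a placeholder mount point:

$ ls /mnt/cephfs/.snap                # list snapshots taken at this directory
$ rmdir /mnt/cephfs/.snap/mysnap      # remove one of them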

[ceph-users] Re: ceph node crashed with these errors "kernel: ceph: build_snap_context" (maybe now it is urgent?)

2019-12-02 Thread Marc Roos
>> ISTR there were some anti-spam measures put in place. Is your account waiting for manual approval? If so, David should be able to help. > Yes if I remember correctly I get waiting approval when I try to log in. Dec 1 03:14:36 c04

[ceph-users] Re: ceph node crashed with these errors "kernel: ceph: build_snap_context" (maybe now it is urgent?)

2019-12-02 Thread Ilya Dryomov
On Mon, Dec 2, 2019 at 1:23 PM Marc Roos wrote: > > > > I guess this is related? kworker 100% > > > [Mon Dec 2 13:05:27 2019] SysRq : Show backtrace of all active CPUs > [Mon Dec 2 13:05:27 2019] sending NMI to all CPUs: > [Mon Dec 2 13:05:27 2019] NMI backtrace for cpu 0 skipped: idling at pc

[ceph-users] Re: ceph node crashed with these errors "kernel: ceph: build_snap_context" (maybe now it is urgent?)

2019-12-02 Thread Ilya Dryomov
On Mon, Dec 2, 2019 at 12:48 PM Marc Roos wrote: > Hi Ilya, >> ISTR there were some anti-spam measures put in place. Is your account waiting for manual approval? If so, David should be able to help. > Yes if I remember correctly I get waiting approval when I try to log

[ceph-users] Re: atime with cephfs

2019-12-02 Thread Oliver Freyermuth
On 2019-12-02 14:22, Nathan Fish wrote: You may be thinking of "lazytime". "relatime" only updates atime when updating mtime, to prevent being inconsistent. I was thinking about the behaviour of relatime on kernels since 2.6.30 (quoting mount(8)): "Update inode access

[ceph-users] Re: atime with cephfs

2019-12-02 Thread Nathan Fish
You may be thinking of "lazytime". "relatime" only updates atime when updating mtime, to prevent being inconsistent. On Mon, Dec 2, 2019 at 4:46 AM Oliver Freyermuth wrote: > > Dear Cephers, > > we are currently mounting CephFS with relatime, using the FUSE client > (version 13.2.6): >

[ceph-users] Multi-site RadosGW with multiple placement targets

2019-12-02 Thread Tobias Urdin
Hello, I'm trying to wrap my head around how having a multi-site (two zones in one zonegroup) with multiple placement targets but only wanting to replicate some placement targets would work. Can you set up a zonegroup with two zones and have it replicate only the placement targets that the
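
A sketch of inspecting the placement targets and zone configuration involved (zone and zonegroup names are placeholders):

$ radosgw-admin zonegroup get --rgw-zonegroup=default
$ radosgw-admin zonegroup placement list --rgw-zonegroup=default
$ radosgw-admin zone get --rgw-zone=zone-a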

[ceph-users] Re: ceph node crashed with these errors "kernel: ceph: build_snap_context" (maybe now it is urgent?)

2019-12-02 Thread Marc Roos
I guess this is related? kworker 100% [Mon Dec 2 13:05:27 2019] SysRq : Show backtrace of all active CPUs [Mon Dec 2 13:05:27 2019] sending NMI to all CPUs: [Mon Dec 2 13:05:27 2019] NMI backtrace for cpu 0 skipped: idling at pc 0xb0581e94 [Mon Dec 2 13:05:27 2019] NMI backtrace

[ceph-users] Re: ceph node crashed with these errors "kernel: ceph: build_snap_context" (maybe now it is urgent?)

2019-12-02 Thread Marc Roos
Hi Ilya, > ISTR there were some anti-spam measures put in place. Is your account waiting for manual approval? If so, David should be able to help. Yes if I remember correctly I get waiting approval when I try to log in. >> Dec 1 03:14:36 c04 kernel: ceph:

[ceph-users] Re: Dual network board setup info

2019-12-02 Thread Rodrigo Severo - Fábrica
On Sun, Dec 1, 2019 at 09:05, Erdem Agaoglu wrote: > > Hi Rodrigo, > > The fact that you're getting logs from mon and the function name set_mon_vals > suggests that you made use of mon provided centralized config, like `ceph > config set osd cluster_network x.x.x.x` but it seems
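
A sketch of checking whether an option really comes from the mon-provided config store, and removing it there if so (option name taken from the quoted example):

$ ceph config get osd cluster_network
$ ceph config dump | grep cluster_network
$ ceph config rm osd cluster_network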

[ceph-users] nautilus radosgw fails with pre jewel buckets - index objects not at right place

2019-12-02 Thread Ingo Reimann
Hi, 2 years after my issue https://tracker.ceph.com/issues/22928 the next one fires back. The Problem: Old Buckets have their index and data in rgw.buckets: root@cephrgw01:~# radosgw-admin metadata get bucket:testtesttesty { "key":
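
A sketch of the usual commands for chasing a bucket's metadata, instance and index location (bucket name copied from the mail, the marker value is a placeholder):

$ radosgw-admin metadata get bucket:testtesttesty
$ radosgw-admin metadata get bucket.instance:testtesttesty:<marker>
$ radosgw-admin bucket stats --bucket=testtesttesty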

[ceph-users] Re: ceph node crashed with these errors "kernel: ceph: build_snap_context" (maybe now it is urgent?)

2019-12-02 Thread Ilya Dryomov
On Mon, Dec 2, 2019 at 10:27 AM Marc Roos wrote: > > > I have been asking before[1]. Since Nautilus upgrade I am having these, > with a total node failure as a result(?). Was not expecting this in my > 'low load' setup. Maybe now someone can help resolving this? I am also > waiting quite some

[ceph-users] Re: Changing failure domain

2019-12-02 Thread Francois Legrand
Thanks. For replica, what is the best way to change the crush profile? Is it to create a new replica profile, and set this profile as the crush ruleset for the pool (something like ceph osd pool set {pool-name} crush_ruleset my_new_rule)? For erasure coding, I would thus have to change the profile
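
For the erasure-coded case, a sketch under the usual constraints (k and m of an existing pool cannot be changed; names and values below are placeholders): create a new profile with the desired failure domain, derive a rule from it, and point the pool at that rule:

$ ceph osd erasure-code-profile set my_ec_profile k=4 m=2 crush-failure-domain=host
$ ceph osd crush rule create-erasure my_ec_rule my_ec_profile
$ ceph osd pool set {pool-name} crush_rule my_ec_rule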

[ceph-users] Re: Balancing PGs across OSDs

2019-12-02 Thread Lars Täuber
Hi there! Here we have a similar situation. After adding some OSDs to the cluster the PGs are not equally distributed over the OSDs. The balancing mode is set to upmap. The docs https://docs.ceph.com/docs/master/rados/operations/balancer/#modes say: "This CRUSH mode will optimize the placement
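
A sketch of the commands typically used to see what the balancer itself thinks of the current distribution (these only report, they change nothing):

$ ceph balancer status
$ ceph balancer eval
$ ceph osd df tree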

[ceph-users] Disable pgmap messages? Still having this Bug #39646

2019-12-02 Thread Marc Roos
I am getting these every 2 seconds, does it make sense to log this? log_channel(cluster) log [DBG] : pgmap v32653: 384 pgs: 2 active+clean+scrubbing+deep, 382 active+clean; log_channel(cluster) log [DBG] : pgmap v32653: 384 pgs: 2 active+clean+scrubbing+deep, 382 active+clean;
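
One commonly used way to quiet these (a sketch, not a confirmed fix for the linked bug) is to raise the cluster log file level on the mons so that DBG-level pgmap entries are no longer written:

$ ceph config set mon mon_cluster_log_file_level info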

[ceph-users] Ceph on CentOS 8?

2019-12-02 Thread Jan Kasprzak
Hello, Ceph users, does anybody use Ceph on the recently released CentOS 8? Apparently there are no el8 packages either at download.ceph.com or in the native CentOS package tree. I am thinking about upgrading my cluster to C8 (because of other software running on it apart from Ceph). Do

[ceph-users] atime with cephfs

2019-12-02 Thread Oliver Freyermuth
Dear Cephers, we are currently mounting CephFS with relatime, using the FUSE client (version 13.2.6): ceph-fuse on /cephfs type fuse.ceph-fuse (rw,relatime,user_id=0,group_id=0,allow_other) For the first time, I wanted to use atime to identify old unused data. My expectation with
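
Assuming atime is actually maintained by the client in use (which is exactly what this thread is probing), a sketch of the kind of query intended, with a placeholder mount point:

$ find /cephfs -type f -atime +30 -printf '%A@ %p\n'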

[ceph-users] ceph node crashed with these errors "kernel: ceph: build_snap_context" (maybe now it is urgent?)

2019-12-02 Thread Marc Roos
I have been asking before[1]. Since the Nautilus upgrade I am having these, with a total node failure as a result(?). Was not expecting this in my 'low load' setup. Maybe now someone can help resolve this? I have also been waiting quite some time to get access to https://tracker.ceph.com/issues.

[ceph-users] Re: ERROR: osd init failed: (13) Permission denied

2019-12-02 Thread Marc Roos
I have gpt. Disks are created with the old ceph-disk tool. Why is such a major thing not being handled for 3 months? I was scared shitless when this node went down and did not come up. [@c04 tmp]# fdisk -l /dev/sdb WARNING: fdisk GPT support is currently new, and therefore in an
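
A sketch of the usual first check when an OSD refuses to start with EACCES after a reboot of a ceph-disk node (device names and OSD id are hypothetical): verify that the ceph user still owns the OSD directory and the underlying device nodes:

$ ls -l /var/lib/ceph/osd/ceph-0/
$ ls -l /dev/sdb1 /dev/sdb2
$ chown ceph:ceph /dev/sdb2   # only if ownership was lost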

[ceph-users] Re: ERROR: osd init failed: (13) Permission denied

2019-12-02 Thread Marco Gaiarin
Hi, Marc Roos! You wrote: > Node does not start osds! Why do I have this error? Previous boot was > just fine (upgraded recently to nautilus) See if this is your case: https://tracker.ceph.com/issues/41777 -- dott. Marco Gaiarin