[ceph-users] Re: Disable signature url in ceph rgw
On Fri, Dec 08, 2023 at 10:41:59AM +0100, marc@singer.services wrote:
> Hi Ceph users
>
> We are using Ceph Pacific (16) in this specific deployment.
>
> In our use case we do not want our users to be able to generate signature v4
> URLs because they bypass the policies that we set on buckets (e.g. IP
> restrictions).
> Currently we have a sidecar reverse proxy running that filters requests with
> signature-URL-specific request parameters.
> This is obviously not very efficient and we are looking to replace this
> somehow in the future.
>
> 1. Is there an option in RGW to disable these signed URLs (e.g. returning
> status 403)?
> 2. If not, is this planned, or would it make sense to add it as a
> configuration option?
> 3. Or is the behaviour of not respecting bucket policies in RGW with
> signature v4 URLs a bug, and should they actually be applied?

Trying to clarify your ask: you want ALL requests, including presigned URLs,
to be subject to the IP restrictions encoded in your bucket policy, i.e.
auth = (signature AND IP-list)?

That should be possible with bucket policy (see the sketch below). Can you
post the current bucket policy that you have? (Redact the IPs, userids,
bucket name, and any paths with distinct values, but otherwise keep it
complete.)

You cannot fundamentally stop anybody from generating presigned URLs,
because that's purely a client-side operation. Generating presigned URLs
requires an access key and secret key, at which point the holder can make
presigned or regular authenticated requests alike.

P.S. What stops your users from changing the bucket policy?

--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation President & Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
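A minimal sketch of the kind of policy being suggested here. The bucket name
"examplebucket" and the range 192.0.2.0/24 are placeholders, and the exact
condition support should be verified against your RGW release. A blanket
Deny conditioned on the source address applies to presigned-URL requests as
well, because they authenticate as the same principal:

---snip---
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyOutsideAllowedRange",
    "Effect": "Deny",
    "Principal": {"AWS": ["*"]},
    "Action": "s3:*",
    "Resource": [
      "arn:aws:s3:::examplebucket",
      "arn:aws:s3:::examplebucket/*"
    ],
    "Condition": {
      "NotIpAddress": {"aws:SourceIp": "192.0.2.0/24"}
    }
  }]
}
---snip---

To illustrate that presigning is purely client-side: the AWS CLI computes
the URL locally from the access/secret key without contacting RGW at all
(endpoint and names are placeholders):

---snip---
# No request is sent to RGW here; only *using* the resulting URL hits RGW,
# which is where bucket policy is evaluated.
aws --endpoint-url https://rgw.example.com s3 presign \
    s3://examplebucket/some/object --expires-in 3600
---snip---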
[ceph-users] Re: MDS recovery with existing pools
Some more information on the damaged CephFS; apparently the journal is damaged:

---snip---
# cephfs-journal-tool --rank=storage:0 --journal=mdlog journal inspect
2023-12-08T15:35:22.922+0200 7f834d0320c0 -1 Missing object 200.000527c4
2023-12-08T15:35:22.938+0200 7f834d0320c0 -1 Bad entry start ptr (0x149f140067f) at 0x149f1174595
2023-12-08T15:35:22.942+0200 7f834d0320c0 -1 Bad entry start ptr (0x149f1400e66) at 0x149f1174d7c
2023-12-08T15:35:22.954+0200 7f834d0320c0 -1 Bad entry start ptr (0x149f1401642) at 0x149f1175558
2023-12-08T15:35:22.970+0200 7f834d0320c0 -1 Bad entry start ptr (0x149f1401e29) at 0x149f1175d3f
2023-12-08T15:35:22.974+0200 7f834d0320c0 -1 Bad entry start ptr (0x149f1402610) at 0x149f1176526
2023-12-08T15:35:22.978+0200 7f834d0320c0 -1 Missing object 200.000527ca
2023-12-08T15:35:22.978+0200 7f834d0320c0 -1 Missing object 200.000527cb
2023-12-08T15:35:22.994+0200 7f834d0320c0 -1 Bad entry start ptr (0x149f30008f4) at 0x149f2d7480a
2023-12-08T15:35:22.998+0200 7f834d0320c0 -1 Bad entry start ptr (0x149f3000ced) at 0x149f2d74c03
Overall journal integrity: DAMAGED
Objects missing:
  0x527c4
  0x527ca
  0x527cb
Corrupt regions:
  0x149f0d73f16-149f1174595
  0x149f1174595-149f1174d7c
  0x149f1174d7c-149f1175558
  0x149f1175558-149f1175d3f
  0x149f1175d3f-149f1176526
  0x149f1176526-149f2d7480a
  0x149f2d7480a-149f2d74c03
  0x149f2d74c03-

# cephfs-journal-tool --rank=storage:0 --journal=purge_queue journal inspect
2023-12-08T15:35:57.691+0200 7f331621e0c0 -1 Missing object 500.0dc6
Overall journal integrity: DAMAGED
Objects missing:
  0xdc6
Corrupt regions:
  0x3718522e9-
---snip---

A backup isn't possible:

---snip---
# cephfs-journal-tool --rank=storage:0 journal export backup.bin
2023-12-08T15:42:07.643+0200 7fde6a24f0c0 -1 Missing object 200.000527c4
2023-12-08T15:42:07.659+0200 7fde6a24f0c0 -1 Bad entry start ptr (0x149f140067f) at 0x149f1174595
2023-12-08T15:42:07.667+0200 7fde6a24f0c0 -1 Bad entry start ptr (0x149f1400e66) at 0x149f1174d7c
2023-12-08T15:42:07.675+0200 7fde6a24f0c0 -1 Bad entry start ptr (0x149f1401642) at 0x149f1175558
2023-12-08T15:42:07.687+0200 7fde6a24f0c0 -1 Bad entry start ptr (0x149f1401e29) at 0x149f1175d3f
2023-12-08T15:42:07.699+0200 7fde6a24f0c0 -1 Bad entry start ptr (0x149f1402610) at 0x149f1176526
2023-12-08T15:42:07.699+0200 7fde6a24f0c0 -1 Missing object 200.000527ca
2023-12-08T15:42:07.699+0200 7fde6a24f0c0 -1 Missing object 200.000527cb
2023-12-08T15:42:07.707+0200 7fde6a24f0c0 -1 Bad entry start ptr (0x149f30008f4) at 0x149f2d7480a
2023-12-08T15:42:07.707+0200 7fde6a24f0c0 -1 Bad entry start ptr (0x149f3000ced) at 0x149f2d74c03
2023-12-08T15:42:07.707+0200 7fde6a24f0c0 -1 journal_export: Journal not readable, attempt object-by-object dump with `rados`
Error ((5) Input/output error)
---snip---

Does it make sense to continue with the advanced disaster recovery [3] by
running (all of) these steps?

cephfs-journal-tool event recover_dentries summary
cephfs-journal-tool [--rank=N] journal reset
cephfs-table-tool all reset session
ceph fs reset <fs_name> --yes-i-really-mean-it
cephfs-table-tool 0 reset session
cephfs-table-tool 0 reset snap
cephfs-table-tool 0 reset inode
cephfs-journal-tool --rank=0 journal reset
cephfs-data-scan init

Fortunately, I haven't had to run through this procedure very often, so I'd
appreciate any comments on what the best approach would be here (a possible
object-by-object dump via `rados` is sketched after the quoted message
below). Thanks!

Eugen

[3] https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#disaster-recovery-experts

Zitat von Eugen Block:

> I was able to (almost) reproduce the issue in a (Pacific) test cluster.
> I rebuilt the monmap from the OSDs, brought everything back up, and
> started the MDS recovery as described in [1]:
>
> ceph fs new <fs_name> <metadata_pool> <data_pool> --force --recover
>
> Then I added two MDS daemons, which went into standby:
>
> ---snip---
> Started Ceph mds.cephfs.pacific.uexvvq for 1b0afda4-2221-11ee-87be-fa163eed040c.
> Dez 08 12:51:53 pacific conmon[100493]: debug 2023-12-08T11:51:53.086+0000 7ff5f589b900  0 set uid:gid to 167:167 (ceph:ceph)
> Dez 08 12:51:53 pacific conmon[100493]: debug 2023-12-08T11:51:53.086+0000 7ff5f589b900  0 ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable), process ceph-md>
> Dez 08 12:51:53 pacific conmon[100493]: debug 2023-12-08T11:51:53.086+0000 7ff5f589b900  1 main not setting numa affinity
> Dez 08 12:51:53 pacific conmon[100493]: debug 2023-12-08T11:51:53.086+0000 7ff5f589b900  0 pidfile_write: ignore empty --pid-file
> Dez 08 12:51:53 pacific conmon[100493]: starting mds.cephfs.pacific.uexvvq at
> Dez 08 12:51:53 pacific conmon[100493]: debug 2023-12-08T11:51:53.102+0000 7ff5e37be700  1 mds.cephfs.pacific.uexvvq Updating MDS map to version 2 from mon.0
> Dez 08 12:51:53 pacific conmon[100493]: debug 2023-12-08T11:51:53.802+0000 7ff5e37be700  1 mds.cephfs.pacific.uexvvq Updating MDS map to version 3 from mon.
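As referenced above, a hedged sketch of the object-by-object dump that
cephfs-journal-tool suggests when `journal export` fails. The metadata pool
name "storage_metadata" is an assumption (substitute the real pool); the
200.* and 500.* prefixes correspond to rank 0's mdlog and purge_queue
objects seen in the errors:

---snip---
# Dump whatever journal objects are still readable, one object at a time;
# missing or corrupt objects then fail individually instead of aborting
# the whole export.
mkdir -p journal-dump
for obj in $(rados -p storage_metadata ls | grep -E '^(200|500)\.'); do
    rados -p storage_metadata get "$obj" "journal-dump/$obj" \
        || echo "failed: $obj"
done
---snip---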
[ceph-users] Re: MDS recovery with existing pools
I was able to (almost) reproduce the issue in a (Pacific) test cluster. I
rebuilt the monmap from the OSDs, brought everything back up, and started
the MDS recovery as described in [1]:

ceph fs new <fs_name> <metadata_pool> <data_pool> --force --recover

Then I added two MDS daemons, which went into standby:

---snip---
Started Ceph mds.cephfs.pacific.uexvvq for 1b0afda4-2221-11ee-87be-fa163eed040c.
Dez 08 12:51:53 pacific conmon[100493]: debug 2023-12-08T11:51:53.086+0000 7ff5f589b900  0 set uid:gid to 167:167 (ceph:ceph)
Dez 08 12:51:53 pacific conmon[100493]: debug 2023-12-08T11:51:53.086+0000 7ff5f589b900  0 ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable), process ceph-md>
Dez 08 12:51:53 pacific conmon[100493]: debug 2023-12-08T11:51:53.086+0000 7ff5f589b900  1 main not setting numa affinity
Dez 08 12:51:53 pacific conmon[100493]: debug 2023-12-08T11:51:53.086+0000 7ff5f589b900  0 pidfile_write: ignore empty --pid-file
Dez 08 12:51:53 pacific conmon[100493]: starting mds.cephfs.pacific.uexvvq at
Dez 08 12:51:53 pacific conmon[100493]: debug 2023-12-08T11:51:53.102+0000 7ff5e37be700  1 mds.cephfs.pacific.uexvvq Updating MDS map to version 2 from mon.0
Dez 08 12:51:53 pacific conmon[100493]: debug 2023-12-08T11:51:53.802+0000 7ff5e37be700  1 mds.cephfs.pacific.uexvvq Updating MDS map to version 3 from mon.0
Dez 08 12:51:53 pacific conmon[100493]: debug 2023-12-08T11:51:53.802+0000 7ff5e37be700  1 mds.cephfs.pacific.uexvvq Monitors have assigned me to become a standby.
---snip---

But as soon as I ran

pacific:~ # ceph fs set cephfs joinable true
cephfs marked joinable; MDS may join as newly active.

one MDS daemon became active and the FS is available now (the flow is
condensed into a sketch at the end of this message). So apparently the
"Advanced" steps from [2] usually aren't necessary, but are they in this
case? I'm still trying to find an explanation for the purge_queue errors.

Zitat von Eugen Block:

> Hi,
>
> following up on the previous thread (After hardware failure tried to
> recover ceph and followed instructions for recovery using OSDS), we were
> able to get ceph back into a healthy state (including the unfound
> object). Now the CephFS needs to be recovered, and I'm having trouble
> fully understanding the docs [1] on what the next steps would be. We ran
> the following, which according to [1] sets the state to existing but
> failed:
>
> ceph fs new <fs_name> <metadata_pool> <data_pool> --force --recover
>
> But how to continue from here? Should we expect an active MDS at this
> point or not? Because the "ceph fs status" output still shows rank 0 as
> failed. We then tried:
>
> ceph fs set <fs_name> joinable true
>
> But apparently it was already joinable; nothing changed. Before doing
> anything (destructive) from the advanced options [2] I wanted to ask the
> community how to get on from here. I pasted the mds logs at the bottom;
> I'm not really sure if the current state is expected or not. Apparently
> the journal recovers but the purge_queue does not:
>
> mds.0.41 Booting: 2: waiting for purge queue recovered
> mds.0.journaler.pq(ro) _finish_probe_end write_pos = 14797504512 (header had 14789452521). recovered.
> mds.0.purge_queue operator(): open complete
> mds.0.purge_queue operator(): recovering write_pos
> monclient: get_auth_request con 0x55c280bc5c00 auth_method 0
> monclient: get_auth_request con 0x55c280ee0c00 auth_method 0
> mds.0.journaler.pq(ro) _finish_read got error -2
> mds.0.purge_queue _recover: Error -2 recovering write_pos
> mds.0.purge_queue _go_readonly: going readonly because internal IO failed: No such file or directory
> mds.0.journaler.pq(ro) set_readonly
> mds.0.41 unhandled write error (2) No such file or directory, force readonly...
> mds.0.cache force file system read-only
> force file system read-only
>
> Is this expected because the "--recover" flag prevents an active MDS, or
> not? Before running "ceph mds rmfailed ..." and/or "ceph fs reset
> <fs_name>" with the --yes-i-really-mean-it flag I'd like to ask for your
> input. In which case should we run those commands? The docs are not
> really clear to me. Any input is highly appreciated!
>
> Thanks!
> Eugen
>
> [1] https://docs.ceph.com/en/latest/cephfs/recover-fs-after-mon-store-loss/
> [2] https://docs.ceph.com/en/latest/cephfs/administration/#advanced-cephfs-admin-settings
>
> ---snip---
> Dec 07 15:35:48 node02 bash[692598]: debug   -90> 2023-12-07T13:35:47.730+0000 7f4cd855f700  1 mds.storage.node02.hemalk Updating MDS map to version 41 from mon.0
> Dec 07 15:35:48 node02 bash[692598]: debug   -89> 2023-12-07T13:35:47.730+0000 7f4cd855f700  4 mds.0.purge_queue operator(): data pool 3 not found in OSDMap
> Dec 07 15:35:48 node02 bash[692598]: debug   -88> 2023-12-07T13:35:47.730+0000 7f4cd855f700  5 asok(0x55c27fe86000) register_command objecter_requests hook 0x55c27fe16310
> Dec 07 15:35:48 node02 bash[692598]: debug   -87> 2023-12-07T13:35:47.730+0000 7f4cd855f700 10 monclient: _renew_subs
> Dec 07 15:35:48 node02 bash[692598]: debug   -86> 2023-12-07T13:35:47.730+0000 7f4cd855f700 10 monclient: _send
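As referenced above, the non-destructive sequence that brought the test
cluster's FS back, condensed into a hedged sketch. The fs and pool names
are placeholders, and the orch step assumes a cephadm deployment; otherwise
start the MDS daemons however the cluster manages them:

---snip---
# 1. Recreate the file system on the existing pools without touching data;
#    --recover keeps rank 0 from becoming active until explicitly allowed.
ceph fs new <fs_name> <metadata_pool> <data_pool> --force --recover
# 2. Deploy MDS daemons; they should register as standby.
ceph orch apply mds <fs_name> --placement=2
# 3. Allow an MDS to take rank 0; it should go active if the journal is intact.
ceph fs set <fs_name> joinable true
---snip---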
[ceph-users] Re: ceph fs (meta) data inconsistent
Hi Xiubo,

I will update the case. I'm afraid this will have to wait a little bit,
though. I'm too occupied for a while and also don't have a test cluster
that would help speed things up. I will update you; please keep the
tracker open.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

From: Xiubo Li
Sent: Tuesday, December 5, 2023 1:58 AM
To: Frank Schilder; Gregory Farnum
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: ceph fs (meta) data inconsistent

Frank,

By using your script I still couldn't reproduce it. Locally my python
version is 3.9.16, and I didn't have other VMs to test other python
versions.

Could you check the tracker and provide the debug logs?

Thanks
- Xiubo

On 12/1/23 21:08, Frank Schilder wrote:
> Hi Xiubo,
>
> I uploaded a test script with session output showing the issue. When I
> look at your scripts, I can't see the stat-check on the second host
> anywhere. Hence, I don't really know what you are trying to compare.
>
> If you want me to run your test scripts on our system for comparison,
> please include the part executed on the second host explicitly in an
> ssh-command. Running your scripts alone in their current form will not
> reproduce the issue.
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> From: Xiubo Li
> Sent: Monday, November 27, 2023 3:59 AM
> To: Frank Schilder; Gregory Farnum
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Re: ceph fs (meta) data inconsistent
>
> On 11/24/23 21:37, Frank Schilder wrote:
>> Hi Xiubo,
>>
>> thanks for the update. I will test your scripts in our system next week.
>> Something important: running both scripts on a single client will not
>> produce a difference. You need 2 clients. The inconsistency is between
>> clients, not on the same client. For example:
>
> Frank,
>
> Yeah, I did this with 2 different kclients.
>
> Thanks
>
>> Setup: host1 and host2 with a kclient mount to a cephfs under /mnt/kcephfs
>>
>> Test 1
>> - on host1: execute shutil.copy2
>> - execute ls -l /mnt/kcephfs/ on host1 and host2: same result
>>
>> Test 2
>> - on host1: shutil.copy
>> - execute ls -l /mnt/kcephfs/ on host1 and host2: file size=0 on host2
>>   while correct on host1
>>
>> Your scripts only show output of one host, but the inconsistency
>> requires two hosts for observation. The stat information is updated on
>> host1, but not synchronized to host2 in the second test. In case you
>> can't reproduce that, I will append results from our system to the case.
>>
>> Also it would be important to know the python and libc versions. We
>> observe this only for newer versions of both.
>>
>> Best regards,
>> =
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> From: Xiubo Li
>> Sent: Thursday, November 23, 2023 3:47 AM
>> To: Frank Schilder; Gregory Farnum
>> Cc: ceph-users@ceph.io
>> Subject: Re: [ceph-users] Re: ceph fs (meta) data inconsistent
>>
>> I just raised one tracker to follow this:
>> https://tracker.ceph.com/issues/63510
>>
>> Thanks
>>
>> - Xiubo
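A minimal two-host version of the check described in the quoted tests, as a
hedged sketch. The mount point and file names are assumptions, and both
hosts need their own kclient mount of the same file system:

---snip---
# on host1 (kclient mount at /mnt/kcephfs):
python3 -c "import shutil; shutil.copy('/tmp/src', '/mnt/kcephfs/testfile')"
ls -l /mnt/kcephfs/testfile              # size is correct here
# check from host2 (separate kclient mount of the same fs), via ssh as
# Frank requests:
ssh host2 ls -l /mnt/kcephfs/testfile    # reported issue: size shows 0 here
# Repeating with shutil.copy2 instead of shutil.copy reportedly yields
# matching sizes on both hosts (Test 1 vs. Test 2 above).
---snip---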
[ceph-users] Disable signature url in ceph rgw
Hi Ceph users

We are using Ceph Pacific (16) in this specific deployment.

In our use case we do not want our users to be able to generate signature v4
URLs because they bypass the policies that we set on buckets (e.g. IP
restrictions). Currently we have a sidecar reverse proxy running that
filters requests with signature-URL-specific request parameters (a sketch
of this approach follows below). This is obviously not very efficient and
we are looking to replace this somehow in the future.

1. Is there an option in RGW to disable these signed URLs (e.g. returning
   status 403)?
2. If not, is this planned, or would it make sense to add it as a
   configuration option?
3. Or is the behaviour of not respecting bucket policies in RGW with
   signature v4 URLs a bug, and should they actually be applied?

Thank you for your help and let me know if you have any questions

Marc Singer
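For context, a hedged sketch of the kind of sidecar filter described above,
assuming an nginx reverse proxy in front of RGW ("rgw_backend" is a
placeholder upstream). X-Amz-Signature is the standard SigV4 query-string
authentication parameter; SigV2 presigned URLs carry
Signature=/AWSAccessKeyId=/Expires= instead:

---snip---
# Reject query-string-authenticated (presigned) requests before they
# reach RGW; header-signed requests pass through untouched.
location / {
    if ($args ~* "(^|&)(X-Amz-Signature|Signature)=") {
        return 403;
    }
    proxy_pass http://rgw_backend;
}
---snip---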
[ceph-users] Re: How to replace a disk with minimal impact on performance
> Based on our observation of the impact of the balancer on the
> performance of the entire cluster, we have drawn conclusions that we
> would like to discuss with you.
>
> - A newly created pool should be balanced before being handed over
>   to the user. This, I believe, is quite evident.

I think this question might contain a lot of hidden assumptions, so it's
hard to respond to in a correct manner. Using rgw means you get some
7-10-13 different pools depending on whether you use swift, s3, or all at
the same time. In this case, only one or a few of those pools need care
before doing bulk work; the rest are quite fine being very small and ..
"unbalanced".

> - When replacing a disk, it is advisable to exchange it directly
>   for a new one. As soon as the OSD replacement occurs, the balancer
>   should be invoked to realign any improperly placed PGs during the disk
>   outage and disk recovery.

Not that I think the default behaviours are optimal in any way, but the
above text seems to describe what actually does happen: even if the
balancer is not involved, the normal crush "repairs" of an imbalanced
cluster will even the data out once the new OSD is in place.

> Perhaps an even better method is to pause recovery and backfilling
> before removing the disk, remove the disk itself, promptly add a new
> one, and then resume recovery and backfilling. It's essential to
> perform all of this as quickly as possible (using a script).

Here I would just say: set norebalance (and noout if you must stop the
whole OSD host) before removing the old and adding the new OSD; then, when
the new OSD is created and started, unset the options and let it repair
back to the newly added OSD (a sketch follows below).

> Ad. We are using a community balancer developed by Jonas Jelton because
> the built-in one does not meet our requirements.

We sometimes use the python or go upmap remapper scripts/programs to have
the cluster be less sad while moving a small number of PGs at a time, but
that is more or less just for convenience and to let scrubs run on the
non-moving PGs if the data movements are expected to take long calendar
time.

--
May the most significant bit of your life be positive.
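As referenced above, a hedged sketch of that flag sequence. OSD id 12 and
the device path are placeholders; adjust for how your cluster deploys OSDs:

---snip---
# Stop data movement before touching the disk.
ceph osd set norebalance
ceph osd set noout        # only if the whole OSD host has to go down
# Retire the old OSD but keep its id free for the replacement.
ceph osd destroy 12 --yes-i-really-mean-it
# ... physically swap the disk ...
# Redeploy on the new device, reusing the old id.
ceph-volume lvm create --osd-id 12 --data /dev/sdX
# Unset the flags and let backfill repopulate the new OSD.
ceph osd unset noout
ceph osd unset norebalance
---snip---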