[ceph-users] Re: quincy v17.2.1 QE Validation status

2022-06-15 Thread Neha Ojha
On Wed, Jun 15, 2022 at 7:23 AM Venky Shankar  wrote:
>
> On Tue, Jun 14, 2022 at 10:51 PM Yuri Weinstein  wrote:
> >
> > Details of this release are summarized here:
> >
> > https://tracker.ceph.com/issues/55974
> > Release Notes - https://github.com/ceph/ceph/pull/46576
> >
> > Seeking approvals for:
> >
> > rados - Neha, Travis, Ernesto, Adam

All the rados tests have passed, so this looks good! However, some
users are waiting for a remediation for
https://tracker.ceph.com/issues/53729, and we want to make the offline
tool available in 17.2.1 by means of
https://github.com/ceph/ceph/pull/46706. Given that this PR is
restricted to a new option being added to the ceph-objectstore-tool,
the testing requirement is minimal - just running the rados suite on
the PR should be enough and we can merge it by the end of this week.

If anybody has any objections, please let me know.

Thanks,
Neha

>
> > rgw - Casey
> > fs - Venky, Gerg
>
> fs approved.
>
> > orch - Adam
> > rbd - Ilya, Deepika
> > krbd  Ilya, Deepika
> > upgrade/octopus-x - Casey
> >
> > Please reply to this email with approval and/or trackers of known 
> > issues/PRs to address them.
> >
> > Josh, David - it's ready for LRC upgrade if you'd like.
> >
> > Thx
> > YuriW
> > ___
> > Dev mailing list -- d...@ceph.io
> > To unsubscribe send an email to dev-le...@ceph.io
>
>
>
> --
> Cheers,
> Venky
>
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io
>

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rfc: Accounts in RGW

2022-06-15 Thread Casey Bodley
(oops, i had cc'ed this to the old ceph-users list)

On Wed, Jun 15, 2022 at 1:56 PM Casey Bodley  wrote:
>
> On Mon, May 11, 2020 at 10:20 AM Abhishek Lekshmanan  
> wrote:
> >
> >
> > The basic premise is for an account to be a container for users, and
> > also for related functionality like roles & groups. This converges with
> > the AWS concept of an account, where the AWS account can further create
> > IAM users, roles or groups. Every account can have a root user or
> > user(s) with permissions to administer the creation of users and allot
> > quotas within an account. These can be implemented with a new account
> > cap. The IAM set of APIs already has a large subset of functionality to
> > summarize accounts and inspect/create users, roles or groups. Every
> > account would also store the membership of its users, groups and roles
> > (similar to a user's buckets), though we'd ideally limit it to < 10k
> > users/roles or groups per account.
> >
> > In order to deal with the currently used tenants, which namespace
> > buckets but also currently stand in for the account id in the policy
> > language & ARNs, we'd have a tenant_id attribute in the account which,
> > if set, will prevent cross-tenant users from being added. This is not
> > enforced when the tenant id isn't set, so accounts without this field
> > can potentially add users across tenants; this is one of the cases
> > where we expect the account owner to know what they are doing.
> > We'd transition away from tenant:user in the Policy principal to
> > account-id:user, so if users with different tenants are in the same
> > account we'd expect the user to change their policies to use the
> > account ids.
> >
> > In terms of regular operation IO costs, the user info would have an
> > account id attribute, and if non-empty we'd have to read the account
> > root user policies and/or public access configuration, though other
> > attributes like the list of users/roles and groups would only be read
> > for the necessary IAM/admin APIs.
> >
> > Quotas
> > ~~
> > For quotas we can implement one of the following ways:
> > - a user_bytes/buckets quota, which would be allotted to every user
> > - a total account quota, in which case it is the responsibility of the
> >   account user to allot a quota upon user creation
> >
> > For operations themselves, though, it is the user quota that comes into play.
> >
> > APIs
> > 
> > - creating an account itself should be available via the admin tooling/APIs
> > - Ideally, creation of a root user under an account would still have to be
> >   explicit, though we could consider adding this to the account creation
> >   process itself to simplify things.
> > - For further user creation and management, we could start implementing the
> >   IAM set of APIs in the future, though we already have admin APIs for user
> >   creation and the like, and we could allow a user with account caps to do
> >   these operations.
> >
> > Deviations
> > ~~
> > Some APIs, like list buckets in AWS, list all the buckets in the account
> > and not just those of the specific IAM user; we'd probably still list only
> > the user's buckets, though we could consider the AWS behaviour for the
> > account root user.
> >
> > Wrt the OpenStack Swift APIs, we'd still keep the current user_id -> swift
> > account id mapping, so no breakage is expected for end-user APIs; the
> > account stats and related APIs would behave like the older version, where
> > it is still the user's summary that is displayed.
> >
> > Comments on whether this is the right direction?
> >
> > --
> > Abhishek
> > ___
> > Dev mailing list -- d...@ceph.io
> > To unsubscribe send an email to dev-le...@ceph.io
> >
>
> this project has been revived in
> https://github.com/ceph/ceph/pull/46373 and we've been talking through
> the design in our weekly refactoring meeting
>
> Abhishek shared a good summary of the design above. the only major
> changes we've made are in its interaction with swift tenants:
> - accounts will be strictly namespaced by tenant, so an account can't
> mix users from different tenants
> - add a unique account ID, separate from the account name, for use in
> IAM policy. use a specific, documented format to disambiguate account
> IDs from tenant names
>
> the account features we're planning to start with are:
> - radosgw-admin commands and /admin/ APIs to add/remove/list the users
> and roles under an account
> - support for IAM principals like ACCOUNTID/username,
> ACCOUNTID/rolename and ACCOUNTID/* in addition to tenant/...
> - ListAllMyBuckets lists all buckets under the user's account, not
> only those owned by the user
> - account quotas that limit objects/bytes, on top of existing user/bucket 
> quotas
>
> eventually we'd like to add:
> - IAM APIs for account and user management by 'account root users'
> without global admin caps
> - support for groups under account
>
> i'd love to hear feedback from the 
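
As a concrete illustration, a bucket policy using the proposed account-scoped
principal could look roughly like the sketch below. The account ID format, the
names and the endpoint are made up for the example and are not the final
design; the policy is applied here with the stock AWS CLI pointed at an RGW
endpoint.

cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam::RGW11111111111111111:user/alice"]},
    "Action": ["s3:GetObject", "s3:ListBucket"],
    "Resource": ["arn:aws:s3:::testbucket", "arn:aws:s3:::testbucket/*"]
  }]
}
EOF
aws --endpoint-url http://rgw.example.com:8000 s3api put-bucket-policy \
    --bucket testbucket --policy file://policy.json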

[ceph-users] Re: Changes to Crush Weight Causing Degraded PGs instead of Remapped

2022-06-15 Thread Wesley Dillingham
I have found that I can only reproduce it on clusters built initially on
pacific. My cluster which went nautilus to pacific does not reproduce the
issue. My working theory is it is related to rocksdb sharding:

https://docs.ceph.com/en/quincy/rados/configuration/bluestore-config-ref/#rocksdb-sharding
OSDs deployed in Pacific or later use RocksDB sharding by default. If Ceph is
upgraded to Pacific from a previous version, sharding is off.
To enable sharding and apply the Pacific defaults, stop an OSD and run

ceph-bluestore-tool \
  --path  \
  --sharding="m(3) p(3,0-12) O(3,0-13)=block_cache={type=binned_lru} L P" \
  reshard
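
For reference, a minimal offline workflow sketch (the OSD id, data path and the
non-containerized systemd unit name are assumptions; adjust accordingly for
cephadm/containerized deployments):

systemctl stop ceph-osd@12

# show the sharding currently applied to this OSD's RocksDB
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-12 show-sharding

# apply the Pacific default sharding (same command as above)
ceph-bluestore-tool \
  --path /var/lib/ceph/osd/ceph-12 \
  --sharding="m(3) p(3,0-12) O(3,0-13)=block_cache={type=binned_lru} L P" \
  reshard

systemctl start ceph-osd@12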


Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Tue, Jun 14, 2022 at 11:31 AM Wesley Dillingham 
wrote:

> I have made https://tracker.ceph.com/issues/56046 regarding the issue I
> am observing.
>
> Respectfully,
>
> *Wes Dillingham*
> w...@wesdillingham.com
> LinkedIn 
>
>
> On Tue, Jun 14, 2022 at 5:32 AM Eugen Block  wrote:
>
>> I found the thread I was referring to [1]. The report was very similar
>> to yours, apparently the balancer seems to cause the "degraded"
>> messages, but the thread was not concluded. Maybe a tracker ticket
>> should be created if it doesn't already exist, I didn't find a ticket
>> related to that in a quick search.
>>
>> [1]
>>
>> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/H4L5VNQJKIDXXNY2TINEGUGOYLUTT5UL/
>>
>> Zitat von Wesley Dillingham :
>>
>> > Thanks for the reply. I believe regarding "0" vs "0.0" it's the same
>> > difference. I will note it's not just changing crush weights which induces
>> > this situation. Introducing upmaps manually or via the balancer also causes
>> > the PGs to be degraded instead of the expected remapped PG state.
>> >
>> > Respectfully,
>> >
>> > *Wes Dillingham*
>> > w...@wesdillingham.com
>> > LinkedIn 
>> >
>> >
>> > On Mon, Jun 13, 2022 at 9:27 PM Szabo, Istvan (Agoda) <
>> > istvan.sz...@agoda.com> wrote:
>> >
>> >> Isn’t it the correct syntax like this?
>> >>
>> >> ceph osd crush reweight osd.1 0.0 ?
>> >>
>> >> Istvan Szabo
>> >> Senior Infrastructure Engineer
>> >> ---
>> >> Agoda Services Co., Ltd.
>> >> e: istvan.sz...@agoda.com
>> >> ---
>> >>
>> >> On 2022. Jun 14., at 0:38, Wesley Dillingham 
>> >> wrote:
>> >>
>> >> ceph osd crush reweight osd.1 0
>> >>
>> >>
>> >> --
>> >> This message is confidential and is for the sole use of the intended
>> >> recipient(s). It may also be privileged or otherwise protected by
>> copyright
>> >> or other legal rules. If you have received it by mistake please let us
>> know
>> >> by reply email and delete it from your system. It is prohibited to copy
>> >> this message or disclose its content to anyone. Any confidentiality or
>> >> privilege is not waived or lost by any mistaken delivery or
>> unauthorized
>> >> disclosure of the message. All messages sent to and from Agoda may be
>> >> monitored to ensure compliance with company policies, to protect the
>> >> company's interests and to remove potential malware. Electronic
>> messages
>> >> may be intercepted, amended, lost or deleted, or contain viruses.
>> >>
>> > ___
>> > ceph-users mailing list -- ceph-users@ceph.io
>> > To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>>
>>
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] host disk used by osd container

2022-06-15 Thread Tony Liu
Hi,

"df -h" on the OSD host shows 187G is being used.
"du -sh /" shows 36G. bluefs_buffered_io is enabled here.
What's taking that 150G of disk space, cache?
Then where is that cache file? Is there any way to configure it to be smaller?

# free -h
              total        used        free      shared  buff/cache   available
Mem:          187Gi        28Gi       4.4Gi       4.1Gi       154Gi       152Gi
Swap:         8.0Gi        82Mi       7.9Gi

# df -h
FilesystemSize  Used Avail Use% Mounted on
devtmpfs   94G 0   94G   0% /dev
tmpfs  94G 0   94G   0% /dev/shm
tmpfs  94G  4.2G   90G   5% /run
tmpfs  94G 0   94G   0% /sys/fs/cgroup
/dev/mapper/vg0-root  215G  187G   29G  87% /
/dev/sdk2 239M  150M   72M  68% /boot
/dev/sdk1 250M  6.9M  243M   3% /boot/efi
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/bc4904d8da14dd9ab0fbc49ae60f20ba4a3cbf8f361c0ed13e818e0d65e22531/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/617494be5e05d5f91d1d08aad6b6ace8f335a346ca9ea868dc2bc7fd07906901/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/3b039d5daffaf212d3384afc30b5bf75353fd215b238101b9bfba4050638eab5/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/507c24e65c7cd075b5e1ab4901f8e198263c85265b3e4610606dc3dfd4dad0b5/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/6856e867322a73cb1d0e203a2c12f8516bd76fa3866a945b199e477396704f76/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/5bb197c41ba584981d1767e377bff84cd13750476a63f26206f58b274d854739/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/1c2b16f94ffda06fc277c6906e6df8bd150de16c80a1ba7f113f0774ad8a5de1/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/58de1f5b8e3638e94cbc55f02f690937295e8714dfea44f155271df70093a69f/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/be5d6ac02ab83436b18c475f43df48732c0b2b5c73732237064631deb2d5243f/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/e59810bd48f0667bd3f91dcc65ec1b51227314754dfbcc7ba8dee376bdcd4c0a/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/78326ce8e1cf36680eaa56e744b4ea97f1b358adac17eacaf67b88937dd5e876/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/6a53cf14e33b69c418794514fbd35f5257c553f5a9b0ead62e03b76163112de4/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/efec8e5382be117acdbfc81e9d9a9fbc62e289c2a9fcdfa4c53868de50faf420/merged
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/eca247de3d54f43372961146b84b485d7c5715d1784afae83e44763717ecf552/merged
tmpfs  19G 0   19G   0% /run/user/0
overlay   215G  187G   29G  87% 
/var/lib/docker/overlay2/bfdf90bdc15c9059d9436caddb1d927788ae9eeff15df631ed150cae966528eb/merged
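
This is what I plan to run next to narrow down where that space is going (the
paths are only examples; "lsof +L1" lists files that have been deleted but are
still held open, whose space is counted by df but not by du):

# per-directory usage on the root filesystem only (don't cross mount points)
du -xsh /var/lib/docker /var/lib/ceph /var/log 2>/dev/null

# deleted-but-still-open files on /, largest last (size is the 7th column)
lsof +L1 / 2>/dev/null | sort -k7 -n | tail -20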
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: quincy v17.2.1 QE Validation status

2022-06-15 Thread Venky Shankar
On Tue, Jun 14, 2022 at 10:51 PM Yuri Weinstein  wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/55974
> Release Notes - https://github.com/ceph/ceph/pull/46576
>
> Seeking approvals for:
>
> rados - Neha, Travis, Ernesto, Adam
> rgw - Casey
> fs - Venky, Gerg

fs approved.

> orch - Adam
> rbd - Ilya, Deepika
> krbd  Ilya, Deepika
> upgrade/octopus-x - Casey
>
> Please reply to this email with approval and/or trackers of known issues/PRs 
> to address them.
>
> Josh, David - it's ready for LRC upgrade if you'd like.
>
> Thx
> YuriW
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io



-- 
Cheers,
Venky

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd resize thick provisioned image

2022-06-15 Thread Ilya Dryomov
On Wed, Jun 15, 2022 at 3:21 PM Frank Schilder  wrote:
>
> Hi Eugen,
>
> in essence I would like the property "thick provisioned" to be sticky after 
> creation and apply to any other operation that would be affected.
>
> To answer the use-case question: this is a disk image on a pool designed for 
> predictable high-performance. On images on this pool we need to avoid any 
> latency spikes due to on-demand allocation of non-provisioned objects. It is 
> kind of strange that the rbd cli API is incomplete with regard to the thick 
> provision property.

Hi Frank,

Yeah, it appears to be an omission.  I filed [1] to get that addressed
one day.

RBD "thick provisioning" is a bit of an oddball feature.  There are other
issues with it: bluestore compression, if enabled, would interfere with
it, for example.  In general, the "thickness" (really just a bunch of
zeroes written to the image on your behalf) isn't safeguarded against
any kind of compression-like optimization on the backend.

>
> I'm not sure if a flatten will have the desired effect. It just merges all
> snapshots, which does not require allocating unallocated objects if they are
> not present in any snapshot. An un-sparsify of the image would do that. Did
> anyone find a reasonable workaround, except maybe a dd after the end of the
> existing objects? Or a dd of the disk image onto itself?

"dd if=/dev/zero bs= ..." for the grown area is
a perfectly fine workaround.
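
For example, roughly (the image name, sizes and 4M block size are illustrative;
this assumes krbd and an image that was 100G before being grown to 200G):

rbd resize --size 200G rbd/thickimage
DEV=$(rbd device map rbd/thickimage)
# write zeroes only over the newly added range, starting at the old 100G boundary
dd if=/dev/zero of="$DEV" bs=4M seek=$((100 * 1024 / 4)) count=$((100 * 1024 / 4)) \
   oflag=direct status=progress
rbd device unmap "$DEV"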

[1] https://tracker.ceph.com/issues/56064

Thanks,

Ilya

>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: Eugen Block 
> Sent: 15 June 2022 14:54:54
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: rbd resize thick provisioned image
>
> So basically, you need the reverse sparsify command, right? ;-)
> I only found several mailing list threads asking why someone would want
> thick-provisioning, but it happened eventually. I suppose cloning and
> flattening the resulting image is not a desirable workaround.
>
>
> Zitat von Frank Schilder :
>
> > Hi all,
> >
> > I need to increase the size of images created with
> > --thick-provision. Using resize will just change the provisioned
> > size, but not allocate/initialize the additional space. I seem to be
> > unable to find an option that will maintain thick provisioning of an
> > image when resizing.
> >
> > Is there a way to resize thick provisioned images properly, that is,
> > maintaining thick provisioning?
> >
> > Thanks and best regards,
> > =
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd resize thick provisioned image

2022-06-15 Thread Eugen Block

So basically, you need the reverse sparsify command, right? ;-)
I only found several mailing list threads asking why someone would want
thick-provisioning, but it happened eventually. I suppose cloning and
flattening the resulting image is not a desirable workaround.



Zitat von Frank Schilder :


Hi all,

I need to increase the size of images created with  
--thick-provision. Using resize will just change the provisioned  
size, but not allocate/initialize the additional space. I seem to be  
unable to find an option that will maintain thick provisioning of an  
image when resizing.


Is there a way to resize thick provisioned images properly, that is,  
maintaining thick provisioning?


Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: quincy v17.2.1 QE Validation status

2022-06-15 Thread Casey Bodley
On Tue, Jun 14, 2022 at 1:21 PM Yuri Weinstein  wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/55974
> Release Notes - https://github.com/ceph/ceph/pull/46576
>
> Seeking approvals for:
>
> rados - Neha, Travis, Ernesto, Adam
> rgw - Casey
> fs - Venky, Gerg
> orch - Adam
> rbd - Ilya, Deepika
> krbd  Ilya, Deepika
> upgrade/octopus-x - Casey

rgw and upgrade approved

>
> Please reply to this email with approval and/or trackers of known issues/PRs 
> to address them.
>
> Josh, David - it's ready for LRC upgrade if you'd like.
>
> Thx
> YuriW
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: quincy v17.2.1 QE Validation status

2022-06-15 Thread Ilya Dryomov
On Tue, Jun 14, 2022 at 7:21 PM Yuri Weinstein  wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/55974
> Release Notes - https://github.com/ceph/ceph/pull/46576
>
> Seeking approvals for:
>
> rados - Neha, Travis, Ernesto, Adam
> rgw - Casey
> fs - Venky, Gerg
> orch - Adam
> rbd - Ilya, Deepika
> krbd  Ilya, Deepika

rbd and krbd approved.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Multi-active MDS cache pressure

2022-06-15 Thread Eugen Block

Hi *,

I finally caught some debug logs during the cache pressure warnings.  
In the meantime I had doubled the mds_cache_memory_limit to 128 GB,
which decreased the number of cache pressure messages significantly, but
they still appear a few times per day.


Turning on debug logs for a few seconds results in a 1 GB file, but I  
found this message:


2022-06-15 10:07:34.254 7fdbbd44a700  2 mds.beacon.stmailmds01b-8  
Session chead015:cephfs_client (2757628057) is not releasing caps fast  
enough. Recalled caps at 390118 > 262144 (mds_recall_warning_threshold).


So now I know which limit is reached here, the question is what to do  
about it? Should I increase the mds_recall_warning_threshold (default  
256k) or should I maybe increase mds_recall_max_caps (currently at  
60k, default is 50k)? Any other suggestions? I'd appreciate any  
comments.
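
Concretely, if I go that route it would be something like the following (the
values are just examples, not recommendations; on this Nautilus cluster the
settings could equally go into ceph.conf):

# raise the recall warning threshold above the 256k default (example value)
ceph config set mds mds_recall_warning_threshold 524288
# allow the MDS to recall more caps per client session (example value)
ceph config set mds mds_recall_max_caps 100000
# verify what the MDS daemons actually picked up
ceph config get mds mds_recall_warning_threshold
ceph config get mds mds_recall_max_caps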


Thanks,
Eugen


Zitat von Eugen Block :


Hi,

I'm currently debugging a recurring issue with multi-active MDS.  
The cluster is still on Nautilus and can't be upgraded at this time.  
There have been many discussions about "cache pressure" and I was  
able to find the right settings a couple of times, but before I  
change too much in this setup I'd like to ask for your opinion. I'll  
add some information at the end.
So we have 16 active MDS daemons spread over 2 servers for one  
cephfs (8 daemons per server) with mds_cache_memory_limit = 64GB,  
the MDS servers are mostly idle except for some short peaks. Each of  
the MDS daemons uses around 2 GB according to 'ceph daemon mds.  
cache status', so we're nowhere near the 64GB limit. There are  
currently 25 servers that mount the cephfs as clients.
Watching the ceph health I can see that the reported clients with  
cache pressure change, so they are not actually stuck but just don't  
respond as quickly as the MDS would like them to (I assume). For  
some of the mentioned clients I see high values for  
.recall_caps.value in the 'daemon session ls' output (at the bottom).


The docs basically state this:
When the MDS needs to shrink its cache (to stay within  
mds_cache_size), it sends messages to clients to shrink their  
caches too. The client is unresponsive to MDS requests to release  
cached inodes. Either the client is unresponsive or has a bug


To me it doesn't seem like the MDS servers are near the cache size  
limit, so it has to be the clients, right? In a different setup it  
helped to decrease the client_oc_size from 200MB to 100MB, but then  
there's also client_cache_size with 16K default. I'm not sure what  
the best approach would be here. I'd appreciate any comments on how  
to size the various cache/caps/threshold configurations.


Thanks!
Eugen


---snip---
# ceph daemon mds. session ls

"id": 2728101146,
"entity": {
  "name": {
"type": "client",
"num": 2728101146
  },
[...]
"nonce": 1105499797
  }
},
"state": "open",
"num_leases": 0,
"num_caps": 16158,
"request_load_avg": 0,
"uptime": 1118066.210318422,
"requests_in_flight": 0,
"completed_requests": [],
"reconnecting": false,
"recall_caps": {
  "value": 788916.8276369586,
  "halflife": 60
},
"release_caps": {
  "value": 8.814981576458962,
  "halflife": 60
},
"recall_caps_throttle": {
  "value": 27379.27162576508,
  "halflife": 1.5
},
"recall_caps_throttle2o": {
  "value": 5382.261925615086,
  "halflife": 0.5
},
"session_cache_liveness": {
  "value": 12.91841737465921,
  "halflife": 300
},
"cap_acquisition": {
  "value": 0,
  "halflife": 10
},
[...]
"used_inos": [],
"client_metadata": {
  "features": "0x3bff",
  "entity_id": "cephfs_client",


# ceph fs status

cephfs - 25 clients
==
+--+++---+---+---+
| Rank | State  |  MDS   |Activity   |  dns  |  inos |
+--+++---+---+---+
|  0   | active | stmailmds01d-3 | Reqs:   89 /s |  375k |  371k |
|  1   | active | stmailmds01d-4 | Reqs:   64 /s |  386k |  383k |
|  2   | active | stmailmds01a-3 | Reqs:9 /s |  403k |  399k |
|  3   | active | stmailmds01a-8 | Reqs:   23 /s |  393k |  390k |
|  4   | active | stmailmds01a-2 | Reqs:   36 /s |  391k |  387k |
|  5   | active | stmailmds01a-4 | Reqs:   57 /s |  394k |  390k |
|  6   | active | stmailmds01a-6 | Reqs:   50 /s |  395k |  391k |
|  7   | active | stmailmds01d-5 | Reqs:   37 /s |  384k |  380k |
|  8   | active | stmailmds01a-5 | Reqs:   39 /s |  397k |  394k |
|  9   | active |  stmailmds01a  | Reqs:   23 /s |  400k |  396k |
|  10  | active | stmailmds01d-8 | Reqs:   74 /s |  402k |  399k |
|  11  | active | stmailmds01d-6 | Reqs:   37 /s |  399k |  395k |
|  12  | active |  stmailmds01d  | Reqs:   36 /s |  394k |  390k |
|  13  | active | stmailmds01d-7 | Reqs:   80 /s |  397k |  393k |
|  14  | active | stmailmds01d-2 

[ceph-users] MDS error handle_find_ino_reply failed with -116

2022-06-15 Thread Denis Polom

Hi,

I have Ceph Pacific 16.2.9 with CephFS and 4 MDS (2 active, 2 standby-replay)


==
RANK  STATE   MDS  ACTIVITY DNS    INOS   DIRS CAPS
 0    active  mds3  Reqs:   31 /s   162k   159k  69.5k 177k
 1    active  mds1  Reqs:    4 /s  31.0k  28.7k  10.6k 20.7k
1-s   standby-replay  mds2  Evts:   35 /s  23.3k  17.7k   264 0
0-s   standby-replay  mds4  Evts:   49 /s   173k   159k  69.5k 0
    POOL   TYPE USED  AVAIL
upload_hdd_metadata  metadata  1421M  4101G
  upload_hdd_data  data 915G  6151G
MDS version: ceph version 16.2.9 
(4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)



In MDS logs which handles rank 0 are a lot of error messages:

2022-06-15T12:19:53.600+0200 7f3bb3130700  0 mds.0.cache 
handle_find_ino_reply failed with -116 on #0x20008559a07, retrying
2022-06-15T12:19:53.600+0200 7f3bb3130700  0 mds.0.cache 
handle_find_ino_reply failed with -116 on #0x20008559a07, retrying
2022-06-15T12:19:53.600+0200 7f3bb3130700  0 mds.0.cache 
handle_find_ino_reply failed with -116 on #0x20008559a07, retrying
2022-06-15T12:19:53.600+0200 7f3bb3130700  0 mds.0.cache 
handle_find_ino_reply failed with -116 on #0x20008559a07, retrying
2022-06-15T12:19:53.600+0200 7f3bb3130700  0 mds.0.cache 
handle_find_ino_reply failed with -116 on #0x20008559a07, retrying
2022-06-15T12:19:53.600+0200 7f3bb3130700  0 mds.0.cache 
handle_find_ino_reply failed with -116 on #0x20008559a07, retrying
2022-06-15T12:19:53.600+0200 7f3bb3130700  0 mds.0.cache 
handle_find_ino_reply failed with -116 on #0x20008559a07, retrying
2022-06-15T12:19:53.600+0200 7f3bb3130700  0 mds.0.cache 
handle_find_ino_reply failed with -116 on #0x20008559a07, retrying
2022-06-15T12:19:53.600+0200 7f3bb3130700  0 mds.0.cache 
handle_find_ino_reply failed with -116 on #0x20008559a07, retrying
2022-06-15T12:19:53.604+0200 7f3bb3130700  0 mds.0.cache 
handle_find_ino_reply failed with -116 on #0x20008559a07, retrying



It appears on any MDS that keeps rank 0.

I tried to find out what this error means and how to handle it, but no luck.
To me it looks like the MDS cannot find the path to an inode, is that correct?
How can I fix it?
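
The only thing I could decode so far is the errno itself (a quick check,
assuming python3 is available on the node):

python3 -c 'import errno, os; print(errno.errorcode[116], "-", os.strerror(116))'
# on Linux this prints: ESTALE - Stale file handle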


Thanks


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph.pub not persistent over reboots?

2022-06-15 Thread Thomas Roth

Hi all,


while setting up a system with cephadm under Quincy, I bootstrapped from host A, added mons on hosts B 
and C, and rebooted host A.

Afterwards, ceph seemed to be in a healthy state (no OSDs yet, of course), but my host A 
was "offline".

I was afraid I had run into https://tracker.ceph.com/issues/51027, but no, my host A simply lacked the 
ceph.pub key.


Since this is Quincy (no support for non-root users), I had provided the other mons by 'ssh-copy-id -f 
-i /etc/ceph/ceph.pub root@hostB' etc. but not my host A.

After making up for that, my host A found itself to be online again ;-)


Then I prepared three machines to host OSDs. For some reason, one of them showed only the locked 
'/dev/sdX' devices and not the LVs that I intended to use as OSDs.
I rebooted, which did not change anything, then copied the key ('ssh-copy-id -f -i /etc/ceph/ceph.pub 
root@fileserverA'), which fixed everything, and now I am wondering if I should write a cron job that 
periodically copies the key to all involved machines...
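
For now I just re-push the key by hand, roughly like this (the host name is an
example; I am assuming 'ceph cephadm get-pub-key' returns the same key that
bootstrap wrote to /etc/ceph/ceph.pub):

# fetch the cluster's cephadm public key from the mons and push it to a host
ceph cephadm get-pub-key > /tmp/ceph.pub
ssh-copy-id -f -i /tmp/ceph.pub root@fileserverA
# sanity check that cephadm can reach the host again
ceph cephadm check-host fileserverA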



Where could the keys get lost? Is this a container-feature?


Is it really true that sites using cephadm never reboot their nodes? Can't 
really believe that.


Regards
Thomas


--

Thomas Roth
Department: Informationstechnologie
Location: SB3 2.291


GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Dr. Ulrich Breuer, Jörg Blaurock
Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
State Secretary / Staatssekretär Dr. Volkmar Dietz

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Announcing go-ceph v0.16.0

2022-06-15 Thread Konstantin Shalygin
Whoops, already corrected...

Sent from my iPhone

> On 15 Jun 2022, at 09:51, Konstantin Shalygin  wrote:
> 
> The link with mistype, I think
> 
> https://github.com/ceph/go-ceph/releases/tag/v0.16.0
> 
> 
> k
> Sent from my iPhone
> 
>>> On 14 Jun 2022, at 23:37, John Mulligan  
>>> wrote:
>>> 
>> On Tuesday, June 14, 2022 4:29:59 PM EDT John Mulligan wrote:
>>> I'm happy to announce another release of the go-ceph API library. This is a
>>> regular release following our every-two-months release cadence.
>>> 
>>> https://github.com/ceph/go-ceph/releases/tag/v0.64.0
>>> 
>> 
>> Eventually I was bound to typo that link. The correct link is 
>> https://github.com/ceph/go-ceph/releases/tag/v0.16.0
>> 
>> Apologies for any confusion.
>> 
>> 
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Announcing go-ceph v0.16.0

2022-06-15 Thread Konstantin Shalygin
The link with mistype, I think

https://github.com/ceph/go-ceph/releases/tag/v0.16.0


k
Sent from my iPhone

> On 14 Jun 2022, at 23:37, John Mulligan  wrote:
> 
> On Tuesday, June 14, 2022 4:29:59 PM EDT John Mulligan wrote:
>> I'm happy to announce another release of the go-ceph API library. This is a
>> regular release following our every-two-months release cadence.
>> 
>> https://github.com/ceph/go-ceph/releases/tag/v0.64.0
>> 
> 
> Eventually I was bound to typo that link. The correct link is 
> https://github.com/ceph/go-ceph/releases/tag/v0.16.0
> 
> Apologies for any confusion.
> 
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD crash with "no available blob id" and check for Zombie blobs

2022-06-15 Thread Konstantin Shalygin
The other fixes landed in Nautilus and later releases.
I suggest you upgrade to Nautilus as soon as possible; it is a very stable
release (14.2.22).
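
FWIW, the offline check the fsck/repair PR quoted below refers to is run with
the OSD stopped, roughly like this (the OSD id and data path are examples; the
zombie-blob detection/repair in fsck only exists in releases that carry those
fixes):

systemctl stop ceph-osd@7
# consistency check of the OSD's bluestore (add --deep to validate object data too)
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-7 fsck
# attempt to repair what fsck reported
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-7 repair
systemctl start ceph-osd@7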


k
Sent from my iPhone

> On 14 Jun 2022, at 12:13, tao song  wrote:
> 
> 
>   Thanks, we have backported some PRs to 12.2.12, but the problem remains. Are
> there any other fixes?
> eg:
>> os/bluestore: apply garbage collection against excessive blob count growth
>>  https://github.com/ceph/ceph/pull/28229 
>> AND the ceph-bluestore-tool fsck/repair
>> https://github.com/ceph/ceph/pull/38050 
> 
> 
> Konstantin Shalygin  于2022年6月14日周二 15:37写道:
>> Hi,
>> 
>> Many of the fixes for "zombie blobs" landed in the last Nautilus release.
>> I suggest upgrading to the latest Nautilus version.
>> 
>> 
>> k
>> 
>> > On 14 Jun 2022, at 10:23, tao song  wrote:
>> > 
>> > I have an old 12.2.12 cluster running bluestore, using iscsi + RBD in EC
>> > pools (k:m=2:1) with the ec_overwrites flag. Multiple OSD crashes occurred
>> > due to the assert (0 == "no available blob id").
>> > The problem occurs periodically when the RBD volume is cyclically
>> > overwritten.
>> > 
>> > 2022-05-24 22:08:19.950550 7fcb41894700  1 osd.171 pg_epoch: 47676
>> >> pg[4.1467s2( v 44365'8455207 (44045'8453660,44365'8455207]
>> >> local-lis/les=44415/44416 n=21866 ec=16123/349 lis/c 47665/44415 les/c/f
>> >> 47666/44416/4714 47676/47676/39511)
>> >> [115,16,171]/[115,2147483647,2147483647]p115(0) r=-1 lpr=47676
>> >> pi=[44415,47676)/2 crt=44260'8455206 lcod 0'0 remapped NOTIFY mbc={}]
>> >> state: transitioning to Stray
>> >> 2022-05-24 22:08:20.834007 7fcb41894700  1 osd.171 pg_epoch: 47677
>> >> pg[4.1467s2( v 44365'8455207 (44045'8453660,44365'8455207]
>> >> local-lis/les=44415/44416 n=21866 ec=16123/349 lis/c 47665/44415 les/c/f
>> >> 47666/44416/4714 47676/47677/39511) [115,16,171]p115(0) r=2 lpr=47677
>> >> pi=[44415,47677)/2 crt=44260'8455206 lcod 0'0 unknown NOTIFY mbc={}]
>> >> start_peering_interval up [115,16,171] -> [115,16,171], acting
>> >> [115,2147483647,2147483647] -> [115,16,171], acting_primary 115(0) -> 115,
>> >> up_primary 115(0) -> 115, role -1 -> 2, features acting 
>> >> 4611087853746454523
>> >> upacting 4611087853746454523
>> >> 2022-05-24 22:08:20.834073 7fcb41894700  1 osd.171 pg_epoch: 47677
>> >> pg[4.1467s2( v 44365'8455207 (44045'8453660,44365'8455207]
>> >> local-lis/les=44415/44416 n=21866 ec=16123/349 lis/c 47665/44415 les/c/f
>> >> 47666/44416/4714 47676/47677/39511) [115,16,171]p115(0) r=2 lpr=47677
>> >> pi=[44415,47677)/2 crt=44260'8455206 lcod 0'0 unknown NOTIFY mbc={}]
>> >> state: transitioning to Stray
>> >> 2022-05-24 22:08:22.097055 7fcb3a085700 -1
>> >> /ceph-12.2.12/src/os/bluestore/BlueStore.cc: In function 'bid_t
>> >> BlueStore::ExtentMap::allocate_spanning_blob_id()' thread 7fcb3a085700 
>> >> time
>> >> 2022-05-24 22:08:22.091806
>> >> /ceph-12.2.12/src/os/bluestore/BlueStore.cc: 2083: FAILED assert(0 == "no
>> >> available blob id")
>> >> 
>> >> ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous
>> >> (stable)
>> >> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>> >> const*)+0x110) [0x560e41dbd520]
>> >> 2: (()+0x8fce4e) [0x560e41c15e4e]
>> >> 3: (BlueStore::ExtentMap::reshard(KeyValueDB*,
>> >> std::shared_ptr)+0x13da) [0x560e41c6fc6a]
>> >> 4: (BlueStore::_txc_write_nodes(BlueStore::TransContext*,
>> >> std::shared_ptr)+0x1ab) [0x560e41c7131b]
>> >> 5: (BlueStore::queue_transactions(ObjectStore::Sequencer*,
>> >> std::vector> >> std::allocator >&,
>> >> boost::intrusive_ptr, ThreadPool::TPHandle*)+0x3fd)
>> >> [0x560e41c8cc4d]
>> >> 6:
>> >> (PrimaryLogPG::queue_transactions(std::vector> >> std::allocator >&,
>> >> boost::intrusive_ptr)+0x65) [0x560e419efac5]
>> >> 7: (ECBackend::handle_sub_write(pg_shard_t,
>> >> boost::intrusive_ptr, ECSubWrite&, ZTracer::Trace const&,
>> >> Context*)+0x631) [0x560e41b18331]
>> >> 8: (ECBackend::_handle_message(boost::intrusive_ptr)+0x349)
>> >> [0x560e41b29ba9]
>> >> 9: (PGBackend::handle_message(boost::intrusive_ptr)+0x50)
>> >> [0x560e41a255f0]
>> >> 10: (PrimaryLogPG::do_request(boost::intrusive_ptr&,
>> >> ThreadPool::TPHandle&)+0x59c) [0x560e4198f97c]
>> >> 11: (OSD::dequeue_op(boost::intrusive_ptr,
>> >> boost::intrusive_ptr, ThreadPool::TPHandle&)+0x3f9)
>> >> [0x560e4180af59]
>> >> 12: (PGQueueable::RunVis::operator()(boost::intrusive_ptr
>> >> const&)+0x57) [0x560e41a9ac27]
>> >> 13: (OSD::ShardedOpWQ::_process(unsigned int,
>> >> ceph::heartbeat_handle_d*)+0xfce) [0x560e4183a20e]
>> >> 14: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x83f)
>> >> [0x560e41dc304f]
>> >> 15: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x560e41dc4fe0]
>> >> 16: (()+0x7dd5) [0x7fcb5913fdd5]
>> >> 17: (clone()+0x6d) [0x7fcb5822fead]
>> >> NOTE: a copy of the executable, or `objdump -rdS ` is needed
>> >> to interpret this.
>> >> 
>> > 
>> > Some would not restart, hitting the "no available blob id" assertion. We
>> > adjusted the following parameters to ensure that the OSD can be