Wei-Chiu Chuang created HDDS-11655:
--------------------------------------
Summary: Prune redundant KeyInfo fields from PurgeDirectories request
Key: HDDS-11655
URL: https://issues.apache.org/jira/browse/HDDS-11655
Project: Apache Ozone
Issue Type: Improvement
Reporter: Wei-Chiu Chuang
We recently saw a case where the OM Ratis log exceeded the gRPC buffer size limit.
During the investigation, we found that a number of requests carry very bloated data
structures, so bloated that the Ratis log rolled every 20 seconds. Among them,
PurgeDirectories is the biggest offender. Using the command
{code:bash}
ozone debug ratislogparser om --segment-path=<ratislog>
{code}
I found that the keys to be deleted carried a lot of redundant info. Ideally each
entry just needs volumeName, bucketName and keyName, but instead it carries the whole
KeyInfo object. File ACLs add to the problem: there appears to be no
limit on how many ACLs a file can have, and we did see files with hundreds of
ACLs (that is a separate problem, which I will open another Jira to track).
This is one example:
{noformat}
(t:1, i:14744), STATEMACHINELOGENTRY, cmdType: PurgeDirectories clientId:
"client-698E60F7C348" purgeDirectoriesRequest {
deletedPath { volumeId: 13835058055282163456 bucketId: 9223372036854776064
deletedDir:
"/-4611686018427388160/-9223372036854775552/-9223372036854774014/241030230000/-9223372036854653695"
deletedSubFiles { volumeName: "s3v" bucketName:
"cloudera-health-monitoring-ozone-basic-canary-bucket" keyName:
".Trash/cloudera-scm/241030230000/cloudera-health-monitoring-ozone-basic-canary-key"
dataSize: 63 type: RATIS factor: THREE keyLocationList { version: 0
keyLocations { blockID { containerBlockID { containerID: 3 localID:
113750153625600078 } blockCommitSequenceId: 50 } offset: 0 length: 63
createVersion: 0 partNumber: 0 } isMultipartKey: false } creationTime:
1730325604743 modificationTime: 1730325606257 latestVersion: 0 acls { type:
USER name: "cloudera-scm" rights: "\200" aclScope: ACCESS } acls { type: GROUP
name: "cloudera-scm" rights: "\200" aclScope: ACCESS } acls { type: GROUP name:
"wheel" rights: "\200" aclScope: ACCESS } objectID: 9223372036854896897
updateID: 479 parentID: 9223372036854897921 isFile: true } ...
{noformat}
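To make the proposed pruning concrete, here is a minimal sketch of reducing a full key record to just the three names mentioned above. All classes here (Acl, KeyInfo, KeyRef, PurgePruneSketch) are hypothetical stand-ins written for illustration; they are not the real Ozone protobuf or OM classes, and whether the purge path needs any other field (for example block locations) is not established by this description.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: these classes are hypothetical stand-ins,
// not the real Ozone protobuf or OM classes.
class PurgePruneSketch {

  /** Simplified ACL entry, mirroring the "acls { ... }" blocks in the log. */
  static final class Acl {
    final String type;
    final String name;

    Acl(String type, String name) {
      this.type = type;
      this.name = name;
    }
  }

  /** Simplified KeyInfo holding a few of the fields seen in the log entry. */
  static final class KeyInfo {
    final String volumeName;
    final String bucketName;
    final String keyName;
    final long dataSize;
    final List<Acl> acls;

    KeyInfo(String volumeName, String bucketName, String keyName,
        long dataSize, List<Acl> acls) {
      this.volumeName = volumeName;
      this.bucketName = bucketName;
      this.keyName = keyName;
      this.dataSize = dataSize;
      this.acls = acls;
    }
  }

  /** Minimal reference: the three names the description says are needed. */
  static final class KeyRef {
    final String volumeName;
    final String bucketName;
    final String keyName;

    KeyRef(String volumeName, String bucketName, String keyName) {
      this.volumeName = volumeName;
      this.bucketName = bucketName;
      this.keyName = keyName;
    }
  }

  /** Copies only the identifying names; ACLs and other fields are dropped. */
  static KeyRef prune(KeyInfo info) {
    return new KeyRef(info.volumeName, info.bucketName, info.keyName);
  }

  /** Prunes every entry of a (hypothetical) deletedSubFiles list. */
  static List<KeyRef> pruneAll(List<KeyInfo> infos) {
    List<KeyRef> refs = new ArrayList<>(infos.size());
    for (KeyInfo info : infos) {
      refs.add(prune(info));
    }
    return refs;
  }

  public static void main(String[] args) {
    List<Acl> acls = new ArrayList<>();
    acls.add(new Acl("USER", "cloudera-scm"));
    acls.add(new Acl("GROUP", "cloudera-scm"));
    acls.add(new Acl("GROUP", "wheel"));
    // Bucket and key names below are made up for the example.
    KeyInfo full = new KeyInfo("s3v", "example-bucket", ".Trash/example-key", 63L, acls);
    KeyRef ref = prune(full);
    System.out.println(ref.volumeName + "/" + ref.bucketName + "/" + ref.keyName);
  }
}
```

The point of the sketch is only that the pruned record's size no longer depends on how many ACLs or block locations the original key carried.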