[
https://issues.apache.org/jira/browse/HDDS-12882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17946893#comment-17946893
]
Ivan Andika edited comment on HDDS-12882 at 5/13/25 1:26 AM:
-------------------------------------------------------------
After researching the S3 listObjectsV2 behavior, limiting the number of keys
in the {{OzoneKeyIterator}} will cause correctness issues in S3G when the
"delimiter" parameter is used.
See TestBucketList#listWithContinuationTokenDirBreak for a counter-example.
Suppose the bucket contains these keys:
"test/dir1/file1",
"test/dir1/file2",
"test/dir1/file3",
"test/dir2/file4",
"test/dir2/file5",
"test/dir2/file6",
"test/dir3/file7",
"test/file8"
The user sends a ListObjectsV2 request with these parameters:
* delimiter: "/"
* maxKeys: 2
* prefix: "test/"
Expected results (note: the total number of entries in CommonPrefixes and
Contents should be maxKeys):
* CommonPrefixes: ["test/dir1", "test/dir2"]
** This requires S3G to skip the keys "test/dir1/file2" and "test/dir1/file3",
since the CommonPrefix "test/dir1" was already recorded when S3G saw the first
key under "test/dir1/"
* Contents: []
From this example, we see that we cannot simply use "maxKeys" as the maximum
number of keys that OzoneKeyIterator will ask OM to return. This means that
the current S3G implementation is already correct.
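The grouping behavior above can be sketched as follows. This is a minimal,
hypothetical simulation of ListObjectsV2 delimiter grouping, not actual
S3G/Ozone code (note that real S3 returns common prefixes with the trailing
delimiter, e.g. "test/dir1/"). It shows that filling a page of maxKeys=2
results requires scanning 4 raw keys, which is why maxKeys cannot bound the
number of keys fetched from OM:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical simulation of S3 ListObjectsV2 delimiter grouping.
public class ListV2Sim {
    final Set<String> commonPrefixes = new LinkedHashSet<>();
    final List<String> contents = new ArrayList<>();
    int rawKeysConsumed = 0;

    void list(List<String> keys, String prefix, String delimiter, int maxKeys) {
        for (String key : keys) {
            if (!key.startsWith(prefix)) {
                continue;
            }
            rawKeysConsumed++;
            int idx = key.indexOf(delimiter, prefix.length());
            if (idx >= 0) {
                // Every key under "prefix<dir>/" rolls up into one CommonPrefix,
                // so later keys under the same directory are skipped.
                commonPrefixes.add(key.substring(0, idx + 1));
            } else {
                contents.add(key);
            }
            if (commonPrefixes.size() + contents.size() == maxKeys) {
                break; // the page is full
            }
        }
    }

    public static void main(String[] args) {
        ListV2Sim sim = new ListV2Sim();
        sim.list(List.of(
            "test/dir1/file1", "test/dir1/file2", "test/dir1/file3",
            "test/dir2/file4", "test/dir2/file5", "test/dir2/file6",
            "test/dir3/file7", "test/file8"), "test/", "/", 2);
        System.out.println("CommonPrefixes:   " + sim.commonPrefixes);
        System.out.println("Contents:         " + sim.contents);
        System.out.println("Raw keys scanned: " + sim.rawKeysConsumed);
    }
}
```

Running this prints CommonPrefixes [test/dir1/, test/dir2/], an empty
Contents, and 4 raw keys scanned, matching the counter-example above.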
We could configure "ozone.client.list.cache" to a value lower than 1000
(maybe 500). That way, every listObjects with maxKeys=1 sent by S3A would
only load 500 entries instead of 1000. However, this is not recommended,
since users who actually want to list directories would then send more RPC
requests to OM, which might cause RPC slowness.
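For reference, that workaround would look roughly like this in ozone-site.xml
(a config sketch only; 500 is the illustrative value from above, not a
recommendation, and lowering it affects all list operations of this client):

```xml
<!-- Sketch: lowers the per-RPC listStatus batch size for the Ozone client.
     500 is the illustrative value discussed above. -->
<property>
  <name>ozone.client.list.cache</name>
  <value>500</value>
</property>
```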
> S3G should consider S3 listObjects max-keys parameter
> -----------------------------------------------------
>
> Key: HDDS-12882
> URL: https://issues.apache.org/jira/browse/HDDS-12882
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major
>
> We have a situation where OM crashed with a heap-space OOM because an S3
> user kept sending many listObjectsV2 requests with the max-keys=1 parameter
> as a way to check the object's owner (although GetObjectAcl would be a
> better alternative).
> We see that the KeyIterator implementation always sends listStatus with the
> maxKeys specified by ozone.client.list.cache (default is 1000). The default
> ozone.client.list.cache of 1000 is also the same as the "max-keys" specified
> by S3G, so normally this is fine. But if there are 1000 files inside a
> directory, the S3 client will send 1000 S3 listObjectsV2 requests (with
> max-keys=1), and OM has to load 1000 * 1000 = 1,000,000 OmKeyInfo entries
> into its memory. This causes OM to run out of heap space.
> We need to take this max-keys sent by S3 into account. One possible idea is
> to use the S3 max-keys if it is smaller than the configured
> ozone.client.list.cache. This way, OM will only load the necessary OmKeyInfo
> entries into memory.
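The idea in the description's last paragraph could be sketched as below. This
is a hypothetical illustration with invented names (batchSize, listCacheSize,
s3MaxKeys), not Ozone code, and the comment above shows it cannot be applied
naively when a delimiter is used:

```java
// Hypothetical sketch of the proposed fix from the issue description:
// cap the per-RPC batch size at the S3 max-keys value when it is smaller
// than ozone.client.list.cache. Not actual Ozone code.
public class ListBatchSize {
    static int batchSize(int listCacheSize, int s3MaxKeys) {
        // Use the S3 max-keys when it is smaller, so OM never loads more
        // entries than the S3 request can return.
        return Math.min(listCacheSize, s3MaxKeys);
    }

    public static void main(String[] args) {
        // A max-keys=1 probe would load 1 OmKeyInfo entry instead of 1000.
        System.out.println(batchSize(1000, 1));
        // A huge max-keys never exceeds the configured cache size.
        System.out.println(batchSize(1000, 5000));
    }
}
```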
--
This message was sent by Atlassian Jira
(v8.20.10#820010)