[
https://issues.apache.org/jira/browse/HDDS-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Siyao Meng updated HDDS-8920:
-----------------------------
Description:
Gabor found that because `HddsClientUtils#isSupportedCharacter` calls
`Character.isLowerCase` and `Character.isDigit`, which are Unicode-aware,
neither the Ozone client nor Ozone Manager filters out Unicode (non-ASCII)
characters, so names containing them pass the filter. e.g. with three
[U+FF5A|https://www.compart.com/en/unicode/U+FF5A] characters:
{code}
[root@gimre-sp4-1 ~]# ozone sh volume create ｚｚｚ
23/06/23 16:16:44 INFO rpc.RpcClient: Creating Volume: ｚｚｚ, with root as owner
and space quota set to -1 bytes, counts quota set to -1
{code}
while according to S3 [bucket naming
rules|https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html]
this wouldn't be allowed:
{code}
Bucket names can consist only of lowercase letters, numbers, dots (.), and
hyphens (-).
{code}
And such a name is indeed blocked by awscli:
{code}
$ aws s3api --endpoint-url https://s3g:9879 --ca-bundle cacerts.pem
create-bucket --bucket ｚｚｚ
Parameter validation failed:
Invalid bucket name "ｚｚｚ": Bucket name must match the regex
"^[a-zA-Z0-9.\-_]{1,255}$"
$ aws s3api --endpoint-url https://s3g:9879 --ca-bundle cacerts.pem
delete-bucket --bucket ｚｚｚ
Parameter validation failed:
Invalid bucket name "ｚｚｚ": Bucket name must match the regex
"^[a-zA-Z0-9.\-_]{1,255}$"
$ aws --version
aws-cli/1.15.57 Python/2.7.18 Darwin/22.5.0 botocore/1.10.56
{code}
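The behavior described at the top of this report can be reproduced in isolation. The following is a minimal sketch, assuming the check reduces to the two JDK calls named above; the class and method names are hypothetical and are not taken from HddsClientUtils:

```java
// Hypothetical demo class, not part of Ozone.
final class UnicodeAwareCheckDemo {

  // Unicode-aware check in the style described in this issue:
  // accepts any character the JDK classifies as lowercase or as a digit.
  static boolean unicodeAwareLowerOrDigit(char c) {
    return Character.isLowerCase(c) || Character.isDigit(c);
  }

  // ASCII-only alternative: accepts only a-z and 0-9.
  static boolean asciiLowerOrDigit(char c) {
    return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
  }

  public static void main(String[] args) {
    char fullwidthZ = '\uFF5A'; // FULLWIDTH LATIN SMALL LETTER Z
    System.out.println(unicodeAwareLowerOrDigit(fullwidthZ)); // prints "true"
    System.out.println(asciiLowerOrDigit(fullwidthZ));        // prints "false"
  }
}
```

U+FF5A is a lowercase letter as far as Unicode is concerned, so the JDK predicates accept it, while an explicit ASCII range comparison rejects it.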
TODO:
1. Confirm whether such Unicode characters should indeed be blocked.
2. Enhance resource name checking (for volumes and buckets) on both the client
and server side, e.g. use a regex, or some form of normalization like
[Punycode|https://www.punycoder.com/].
3. Mitigate the impact on existing users who already have such volumes or
buckets in their systems, e.g. by making the new check optional and not
enforced on older clusters when they are upgraded, or by disallowing such
Unicode characters only during new volume and bucket creation (but not in
operations on existing volume and bucket names that have such characters).
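TODO items 2 and 3 could be combined roughly as follows. This is only a sketch under stated assumptions: the class, the method names, the `strictCheckEnabled` flag and the pattern are all hypothetical, the pattern covers only the character set from the S3 rule quoted above, and any other naming rules (such as length limits) are left out:

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not the actual Ozone validation code.
final class ResourceNameCheckSketch {

  // Assumed character set: ASCII lowercase letters, digits, dots and hyphens.
  private static final Pattern ASCII_NAME = Pattern.compile("[a-z0-9.-]+");

  // Regex-based check (TODO item 2): the whole name must match.
  static boolean isStrictlyValid(String name) {
    return name != null && ASCII_NAME.matcher(name).matches();
  }

  // Mitigation (TODO item 3): the strict check is optional, and when it is
  // enabled it applies only to creation, so existing names stay usable.
  static boolean isAllowed(String name, boolean isCreate,
      boolean strictCheckEnabled) {
    if (!strictCheckEnabled || !isCreate) {
      return true;
    }
    return isStrictlyValid(name);
  }

  public static void main(String[] args) {
    String fullwidth = "\uFF5A\uFF5A\uFF5A"; // three U+FF5A characters
    System.out.println(isAllowed(fullwidth, true, true));  // prints "false"
    System.out.println(isAllowed(fullwidth, false, true)); // prints "true"
    System.out.println(isAllowed("zzz", true, true));      // prints "true"
  }
}
```

With the flag off, behavior is unchanged from today, which is one way an upgraded cluster could avoid breaking access to names that already exist.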
cc [~swamirishi] [~hemantk] [~ppogde]
was:
Gabor found that because `HddsClientUtils#isSupportedCharacter` calls
`Character.isLowerCase` and `Character.isDigit`, which are Unicode-aware,
neither the Ozone client nor Ozone Manager filters out Unicode (non-ASCII)
characters, so names containing them pass the filter. e.g. with three
[U+FF5A|https://www.compart.com/en/unicode/U+FF5A] characters:
{code}
[root@gimre-sp4-1 ~]# ozone sh volume create ｚｚｚ
23/06/23 16:16:44 INFO rpc.RpcClient: Creating Volume: ｚｚｚ, with root as owner
and space quota set to -1 bytes, counts quota set to -1
{code}
while according to S3 [bucket naming
rules|https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html]
this wouldn't be allowed:
{code}
Bucket names can consist only of lowercase letters, numbers, dots (.), and
hyphens (-).
{code}
And such a name is indeed blocked by awscli:
{code}
$ aws s3api --endpoint-url https://s3g:9879 --ca-bundle cacerts.pem
create-bucket --bucket ｚｚｚ
Parameter validation failed:
Invalid bucket name "ｚｚｚ": Bucket name must match the regex
"^[a-zA-Z0-9.\-_]{1,255}$"
$ aws s3api --endpoint-url https://s3g:9879 --ca-bundle cacerts.pem
delete-bucket --bucket ｚｚｚ
Parameter validation failed:
Invalid bucket name "ｚｚｚ": Bucket name must match the regex
"^[a-zA-Z0-9.\-_]{1,255}$"
$ aws --version
aws-cli/1.15.57 Python/2.7.18 Darwin/22.5.0 botocore/1.10.56
{code}
TODO:
1. Confirm whether such Unicode characters should indeed be blocked.
2. Enhance resource name checking (for volumes and buckets) on both the client
and server side, e.g. use a regex, or some form of normalization like
[Punycode|https://www.punycoder.com/].
3. Mitigate the impact on existing users who already have such volumes or
buckets in their systems, e.g. by making the new check optional and not
enforced on older clusters when they are upgraded.
cc [~swamirishi] [~hemantk] [~ppogde]
> Ozone is supporting unicode volume and bucket names, potentially
> unintentionally
> --------------------------------------------------------------------------------
>
> Key: HDDS-8920
> URL: https://issues.apache.org/jira/browse/HDDS-8920
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Siyao Meng
> Assignee: Duong
> Priority: Major