[ 
https://issues.apache.org/jira/browse/HDDS-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng updated HDDS-8920:
-----------------------------
    Description: 
Gabor found that because `HddsClientUtils#isSupportedCharacter` calls 
`Character.isLowerCase` and `Character.isDigit`, which are Unicode-aware, 
neither the Ozone client nor Ozone Manager filters out non-ASCII Unicode 
characters, so names containing them pass validation. For example, with three 
[U+FF5A|https://www.compart.com/en/unicode/U+FF5A] (FULLWIDTH LATIN SMALL 
LETTER Z):

{code}
[root@gimre-sp4-1 ~]# ozone sh volume create ｚｚｚ
23/06/23 16:16:44 INFO rpc.RpcClient: Creating Volume: ｚｚｚ, with root as owner 
and space quota set to -1 bytes, counts quota set to -1
{code}

According to the S3 [bucket naming 
rules|https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html], 
such a name is not allowed:

{code}
Bucket names can consist only of lowercase letters, numbers, dots (.), and 
hyphens (-).
{code}

And it is indeed blocked by awscli:

{code}
$ aws s3api --endpoint-url https://s3g:9879 --ca-bundle cacerts.pem 
create-bucket --bucket ｚｚｚ

Parameter validation failed:
Invalid bucket name "ｚｚｚ": Bucket name must match the regex 
"^[a-zA-Z0-9.\-_]{1,255}$"

$ aws --version
aws-cli/1.15.57 Python/2.7.18 Darwin/22.5.0 botocore/1.10.56
{code}
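
For illustration, a minimal Java sketch of why a Unicode-aware check lets 
U+FF5A through while an ASCII-only check does not. `UnicodeCheckDemo` and both 
helper methods are hypothetical names written for this example, not the actual 
`HddsClientUtils` code; only the two `Character` calls come from the 
description above.

```java
// Sketch only: class and method names are hypothetical, not Ozone code.
class UnicodeCheckDemo {

  // Unicode-aware check built from the two calls named in the description.
  static boolean isUnicodeLowerOrDigit(char c) {
    return Character.isLowerCase(c) || Character.isDigit(c);
  }

  // ASCII-only alternative (assumption: only a-z and 0-9 are intended).
  static boolean isAsciiLowerOrDigit(char c) {
    return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
  }

  public static void main(String[] args) {
    char fullwidthZ = '\uFF5A'; // FULLWIDTH LATIN SMALL LETTER Z
    System.out.println(isUnicodeLowerOrDigit(fullwidthZ)); // true
    System.out.println(isAsciiLowerOrDigit(fullwidthZ));   // false
    System.out.println(isAsciiLowerOrDigit('z'));          // true
  }
}
```

The fullwidth letter has Unicode general category Ll, so 
`Character.isLowerCase` accepts it even though it is not in `a-z`.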

TODO:

1. Confirm whether such Unicode characters should indeed be blocked.
2. Enhance resource name checking (for volumes and buckets) on both the client 
and the server side, e.g. with a regex or some form of normalization such as 
[Punycode|https://www.punycoder.com/].
3. Mitigate the impact on existing users who already have such volumes or 
buckets in their systems, e.g. by making the new check optional and not 
enforcing it on older clusters after an upgrade.
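
One possible shape for items 2 and 3, as a rough sketch rather than a proposed 
patch. `ResourceNameCheckSketch`, `isValidName` and the `strictAscii` flag are 
hypothetical; the character set is taken from the S3 rule quoted above, length 
limits are deliberately left out, and the non-strict branch only approximates 
the current Unicode-aware behaviour.

```java
import java.util.regex.Pattern;

// Sketch only: names and the strictAscii switch are assumptions, not Ozone APIs.
class ResourceNameCheckSketch {

  // ASCII lowercase letters, digits, dots and hyphens; no length rule applied.
  private static final Pattern ASCII_NAME = Pattern.compile("[a-z0-9.-]+");

  static boolean isValidName(String name, boolean strictAscii) {
    if (name == null || name.isEmpty()) {
      return false;
    }
    if (strictAscii) {
      // New behaviour: the whole name must match the ASCII-only pattern.
      return ASCII_NAME.matcher(name).matches();
    }
    // Legacy-style behaviour (approximation): Unicode-aware per-character test,
    // kept available so existing names on upgraded clusters stay usable.
    for (int i = 0; i < name.length(); i++) {
      char c = name.charAt(i);
      boolean ok = Character.isLowerCase(c) || Character.isDigit(c)
          || c == '.' || c == '-';
      if (!ok) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    String fullwidth = "\uFF5A\uFF5A\uFF5A";
    System.out.println(isValidName("zzz", true));      // true
    System.out.println(isValidName(fullwidth, true));  // false
    System.out.println(isValidName(fullwidth, false)); // true
  }
}
```

Passing the flag explicitly keeps the stricter check opt-in, which is one way 
to avoid breaking clusters that already hold such names.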

cc [~swamirishi] [~hemantk] [~ppogde]

  was:
Gabor found that because `HddsClientUtils#isSupportedCharacter` calls 
`Character.isLowerCase` and `Character.isDigit` which are Unicode-aware, Ozone 
client or Ozone Manager is not really filtering out those Unicode (non-letter) 
characters and can successfully pass the filter. e.g. with three 
[U+FF5A|https://www.compart.com/en/unicode/U+FF5A]:

{code}
[root@gimre-sp4-1 ~]# ozone sh volume create ｚｚｚ
23/06/23 16:16:44 INFO rpc.RpcClient: Creating Volume: ｚｚｚ, with root as owner 
and space quota set to -1 bytes, counts quota set to -1
{code}

while according to S3 [bucket naming 
rules|https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html]
 this wouldn't be allowed:

{code}
Bucket names can consist only of lowercase letters, numbers, dots (.), and 
hyphens (-).
{code}

And is indeed blocked by awscli:

{code}
$ aws s3api --endpoint-url https://s3g:9879 --ca-bundle cacerts.pem 
create-bucket --bucket ｚｚｚ

Parameter validation failed:
Invalid bucket name "ｚｚｚ": Bucket name must match the regex 
"^[a-zA-Z0-9.\-_]{1,255}$"

$ aws --version
aws-cli/1.15.57 Python/2.7.18 Darwin/22.5.0 botocore/1.10.56
{code}

TODO:

1. Confirm if indeed such unicode chars shall be blocked
2. Enhance resource name checking (for volume and bucket) on both client and 
server side (use regex, or use some form of normalization like 
[Punycode|https://www.punycoder.com/])
3. Mitigate impact on existing users when they already have such volumes or 
buckets in their system. (e.g. by making the new check optional and not 
enforced on older clusters when upgraded)

cc [~swamirishi] [~hemantk] [~ppogde]


> Ozone is supporting unicode volume and bucket names, potentially 
> unintentionally
> --------------------------------------------------------------------------------
>
>                 Key: HDDS-8920
>                 URL: https://issues.apache.org/jira/browse/HDDS-8920
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Siyao Meng
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)