[jira] [Updated] (HDDS-6321) Avoid refresh pipeline for key lookup in checkAcls

UENISHI Kota (Jira) Tue, 15 Feb 2022 00:44:04 -0800


     [ 
https://issues.apache.org/jira/browse/HDDS-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


UENISHI Kota updated HDDS-6321:
-------------------------------
    Description: 
Under the native Ozone authorizer, every ACL check calls 
[keyManager.checkAccess|#L162]. KeyManagerImpl#checkAccess in turn [calls 
getFileStatus()|https://github.com/apache/ozone/blob/76aa27e7c05196ae00cba540efce4bb7529e5d15/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java#L1804],
 which eventually [calls pipeline refresh()|#L2011]. The pipeline refresh is 
not needed here, because the ACL check only reads the key's ACLs and does not 
need block information. It adds an external RPC call to SCM, which is 
unnecessary overhead on every object get.
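
The call chain described above can be modeled in a few lines. This is a minimal, self-contained sketch and not Ozone code: ScmStub, KeyManagerSketch, getObject, and the made-up ACL string are hypothetical, and only the names checkAccess, getFileStatus, and GetContainerWithPipeline are borrowed from the description. It counts how many SCM round trips a single object get triggers when the ACL check also refreshes the pipeline.

```java
import java.util.List;

/** Hypothetical stand-in for SCM; counts pipeline lookups instead of doing RPCs. */
class ScmStub {
    int pipelineLookups = 0;

    String getContainerWithPipeline(long containerId) {
        pipelineLookups++; // each call models one external RPC to SCM
        return "pipeline-for-" + containerId;
    }
}

/** Hypothetical model of the call chain, not the real KeyManagerImpl. */
class KeyManagerSketch {
    private final ScmStub scm;

    KeyManagerSketch(ScmStub scm) {
        this.scm = scm;
    }

    /** Models getFileStatus(): always refreshes the pipeline before returning. */
    List<String> getFileStatus(String key) {
        scm.getContainerWithPipeline(1L); // only useful to callers that read blocks
        return List.of("user:alice:r");   // made-up ACL, for illustration only
    }

    /** Models checkAccess(): needs only the ACLs, but pays for the refresh anyway. */
    boolean checkAccess(String key, String acl) {
        return getFileStatus(key).contains(acl);
    }

    /** Models the key lookup that follows a successful ACL check. */
    String lookupKey(String key) {
        return scm.getContainerWithPipeline(1L);
    }

    /** One object get = one ACL check + one key lookup. */
    String getObject(String key, String acl) {
        if (!checkAccess(key, acl)) {
            throw new SecurityException("access denied: " + key);
        }
        return lookupKey(key);
    }
}

class CallChainDemo {
    public static void main(String[] args) {
        ScmStub scm = new ScmStub();
        new KeyManagerSketch(scm).getObject("vol/bucket/key", "user:alice:r");
        System.out.println("SCM calls per object get: " + scm.pipelineLookups);
    }
}
```

In this model one object get reaches SCM twice, and the first of those calls returns data that the ACL check never uses.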

We observed this issue in our production cluster as an estimated 50% increase 
in latency, based on a wall-clock profile:

!Screenshot_2022-02-15_17-35-18.png|width=739,height=452!

Our monitoring also shows twice as many key lookups on OM, which increases the 
number of GetContainerWithPipeline calls to SCM.

!29843180-8924-11ec-8ad5-5b5a8342f2d3.png|width=797,height=245!
!2b4df500-8924-11ec-927a-de3d8adc6fe0.png|width=798,height=239!

 

I'm not sure how to fix this issue with respect to 
{color:#6e7781}HDDS-3658{color}. The cleanest way would be to reuse the 
refreshPipeline flag again, but it would be a hassle to consider every case 
that uses getFileStatus(). HDDS-5450 may give us some hints.
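
To make the flag idea concrete, here is a minimal sketch that assumes a boolean refreshPipeline parameter on a simplified getFileStatus. Everything in it (FlaggedKeyManager, CountingScm, FileStatusSketch, the method signatures, the made-up ACL string) is hypothetical; it does not reflect the actual KeyManagerImpl API or the changes made in HDDS-3658 or HDDS-5450. It only shows how the ACL path could opt out of the refresh while other callers keep it.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for SCM that counts calls instead of doing RPCs. */
class CountingScm {
    final AtomicInteger calls = new AtomicInteger();

    String getContainerWithPipeline(long containerId) {
        calls.incrementAndGet(); // models one external RPC
        return "pipeline-" + containerId;
    }
}

/** Hypothetical, simplified file status: an ACL plus an optional pipeline. */
class FileStatusSketch {
    final String acl;
    final String pipeline; // null when the caller skipped the refresh

    FileStatusSketch(String acl, String pipeline) {
        this.acl = acl;
        this.pipeline = pipeline;
    }
}

/** Hypothetical key manager where callers choose whether to refresh. */
class FlaggedKeyManager {
    private final CountingScm scm;

    FlaggedKeyManager(CountingScm scm) {
        this.scm = scm;
    }

    FileStatusSketch getFileStatus(String key, boolean refreshPipeline) {
        // Contact SCM only when the caller actually needs block locations.
        String pipeline = refreshPipeline ? scm.getContainerWithPipeline(1L) : null;
        return new FileStatusSketch("user:alice:r", pipeline); // made-up ACL
    }

    /** The ACL check reads only the ACL, so it passes refreshPipeline = false. */
    boolean checkAccess(String key, String acl) {
        return getFileStatus(key, false).acl.equals(acl);
    }
}

class FlagDemo {
    public static void main(String[] args) {
        CountingScm scm = new CountingScm();
        FlaggedKeyManager km = new FlaggedKeyManager(scm);

        km.checkAccess("vol/bucket/key", "user:alice:r");
        System.out.println("SCM calls after ACL check: " + scm.calls.get());

        km.getFileStatus("vol/bucket/key", true);
        System.out.println("SCM calls after refreshing lookup: " + scm.calls.get());
    }
}
```

The sketch leaves out the hard part noted above: every existing caller of getFileStatus() would still have to be reviewed to decide which value of the flag it needs.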

  was:
In every ACL check under native Ozone authorizer, it calls 
[keyManager.checkAccess|#L162].] KeyManagerImpl#checkAccess [calls 
getFileStatus() as 
well|https://github.com/apache/ozone/blob/76aa27e7c05196ae00cba540efce4bb7529e5d15/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java#L1804],
 which finally [calls pipeline refresh()|#L2011].] Pipeline refresh is not 
needed here because it just obtains key ACL and no need for blocks. This causes 
additional external RPC call to SCM, which is unnecessary overhead on each 
object-get.

We observed this issue in our production cluster, as 50% increase of latency 
estimated from wall clock profile:

!Screenshot_2022-02-15_17-35-18.png!

Also, our monitoring shows 2x lookup key to OM, which increases SCM call count 
of GetContainerWithPipeline.

!29843180-8924-11ec-8ad5-5b5a8342f2d3.png!
!2b4df500-8924-11ec-927a-de3d8adc6fe0.png!

 

I'm not sure how to fix this issue regarding {color:#6e7781}HDDS-3658{color} . 
Cleanest way would be re-utilizing again refreshPipeline flag, but it'd be a 
hustle to consider all cases using getFileStatus(). HDDS-5450 may be give us 
some hints.


> Avoid refresh pipeline for key lookup in checkAcls
> --------------------------------------------------
>
>                 Key: HDDS-6321
>                 URL: https://issues.apache.org/jira/browse/HDDS-6321
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: Ozone Manager
>    Affects Versions: 1.2.0
>         Environment: OM setup with Native Ozone Authorizer
>            Reporter: UENISHI Kota
>            Priority: Major
>         Attachments: 29843180-8924-11ec-8ad5-5b5a8342f2d3.png, 
> 2b4df500-8924-11ec-927a-de3d8adc6fe0.png, Screenshot_2022-02-15_17-35-18.png


