István Fajth created HDDS-11472:
-----------------------------------

             Summary: Multiple IOzoneAuthorizer instances may be created at install snapshot failure
                 Key: HDDS-11472
                 URL: https://issues.apache.org/jira/browse/HDDS-11472
             Project: Apache Ozone
          Issue Type: Bug
          Components: Ozone Manager
            Reporter: István Fajth


If installing a Ratis snapshot fails after the metadata manager has been 
stopped, then as part of the flow, OzoneManager#installCheckpoint calls the 
OzoneManager#reloadOMState() method.

As part of the reloadOMState call, we re-create the IAccessAuthorizer instance 
with the help of OzoneAuthorizerFactory, but the old instance may remain 
running, because even if we replace the reference in OzoneManager, other 
places may still reference the old object.
When the authorizer object refers to a significant amount of data, a 
repeating failure like the one described in HDDS-10300 can fill up the heap 
of Ozone Manager. Internally we ran into this with an Atlas+Ranger+Ozone 
setup, where the plugin refers to ~2GB worth of objects in the heap, and was 
present multiple times in a heap dump. This condition led to a crash due to 
long GC pauses.
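
The retention pattern described above can be illustrated with a minimal, 
self-contained Java sketch. It is not Ozone code and not a proposed fix: 
HeavyAuthorizer, Owner, strayReferences and the closeOldOnReload flag are 
hypothetical names made up for the illustration, and the byte array only 
stands in for whatever state a real authorizer plugin holds. It shows how 
repeated reloads pile up instances when the old object stays referenced 
elsewhere, and how releasing the old instance's state before replacing it 
would bound the retained memory.

```java
import java.util.ArrayList;
import java.util.List;

public class AuthorizerReloadSketch {

    // Hypothetical stand-in for an authorizer plugin with a large in-memory cache.
    static final class HeavyAuthorizer implements AutoCloseable {
        private byte[] policyCache;

        HeavyAuthorizer(int cacheBytes) {
            this.policyCache = new byte[cacheBytes];
        }

        long retainedBytes() {
            return policyCache == null ? 0 : policyCache.length;
        }

        @Override
        public void close() {
            // Drop the heavy state so it can be collected even if
            // some component still holds a reference to this object.
            policyCache = null;
        }
    }

    // Hypothetical stand-in for the component that owns the authorizer reference.
    static final class Owner {
        private final int cacheBytes;
        private final boolean closeOldOnReload;
        // Simulates "other places" that still reference earlier instances.
        final List<HeavyAuthorizer> strayReferences = new ArrayList<>();
        private HeavyAuthorizer current;

        Owner(int cacheBytes, boolean closeOldOnReload) {
            this.cacheBytes = cacheBytes;
            this.closeOldOnReload = closeOldOnReload;
            reload(); // create the initial instance
        }

        // Simulates a state reload that re-creates the authorizer.
        void reload() {
            if (current != null && closeOldOnReload) {
                current.close();
            }
            current = new HeavyAuthorizer(cacheBytes);
            strayReferences.add(current);
        }

        // Sum of the heavy state still reachable through any reference.
        long totalRetainedBytes() {
            long sum = 0;
            for (HeavyAuthorizer a : strayReferences) {
                sum += a.retainedBytes();
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        Owner leaky = new Owner(1024, false);
        Owner careful = new Owner(1024, true);
        // Three repeated reloads, as with a recurring snapshot install failure.
        for (int i = 0; i < 3; i++) {
            leaky.reload();
            careful.reload();
        }
        System.out.println("without close: " + leaky.totalRetainedBytes() + " bytes");
        System.out.println("with close:    " + careful.totalRetainedBytes() + " bytes");
    }
}
```

In this sketch the owner that never closes the old instance keeps the state 
of all four instances reachable after three reloads, while the one that 
closes before replacing keeps only the current instance's state. Whether the 
real IAccessAuthorizer implementations can be closed or reused this way is 
an assumption that would need to be checked against the Ozone code.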



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

