[ 
https://issues.apache.org/jira/browse/HDDS-14841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated HDDS-14841:
-------------------------------
    Description: 
We have encountered incidents caused by createFakeDirIfShould in getFileStatus. 
createFakeDirIfShould creates a RocksDB iterator, which can take a long time 
when the keyTable has a lot of tombstones.  This causes OM to get stuck: writes 
on the same bucket are blocked, which in turn holds up all the pending write 
transactions in the OM Ratis applier.

Let's move createFakeDirIfShould outside of the lock to prevent this. There is 
some tradeoff in terms of consistency, but since createFakeDirIfShould should 
not be the normal case, we can live with this.

  was:
We have encountered large incidents caused by createFakeDirIfShould in 
getFileStatus since createFakeDirIfShould creates a RocksDB iterator and it 
might take a long time when the keyTable has a lot of tombstones.  This causes 
OM to be stuck since writes on the same bucket will be held, which in turns 
held all the pending write transactions in OM Ratis applier.

Let's move createFakeDirIfShould outside of the lock to prevent this. There is 
some tradeoff in terms of consistency, but since createFakeDirIfShould should 
not be the normal case, we can contend with this.


> Run createFakeDirIfShould outside the lock to prevent OM stuck
> --------------------------------------------------------------
>
>                 Key: HDDS-14841
>                 URL: https://issues.apache.org/jira/browse/HDDS-14841
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>
> We have encountered incidents caused by createFakeDirIfShould in 
> getFileStatus. createFakeDirIfShould creates a RocksDB iterator, which can 
> take a long time when the keyTable has a lot of tombstones.  This causes OM 
> to get stuck: writes on the same bucket are blocked, which in turn holds up 
> all the pending write transactions in the OM Ratis applier.
>
> Let's move createFakeDirIfShould outside of the lock to prevent this. There 
> is some tradeoff in terms of consistency, but since createFakeDirIfShould 
> should not be the normal case, we can live with this.
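
The lock-scope change proposed above can be illustrated with a minimal Java 
sketch. This is not the Ozone code: the class, the in-memory map standing in 
for the keyTable, the read/write lock standing in for the bucket lock, and the 
helper hasKeyWithPrefix are all hypothetical names chosen for illustration. 
The exact-key lookup stays under the read lock, while the potentially slow 
prefix scan (the part that plays the role of createFakeDirIfShould) runs after 
the lock is released, so writers on the same bucket are not blocked by it.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class FakeDirLookupSketch {
    // Hypothetical stand-in for the keyTable: a sorted, thread-safe map.
    private final ConcurrentSkipListMap<String, String> keyTable =
        new ConcurrentSkipListMap<>();
    // Hypothetical stand-in for the per-bucket lock.
    private final ReentrantReadWriteLock bucketLock = new ReentrantReadWriteLock();

    void putKey(String key) {
        bucketLock.writeLock().lock();
        try {
            keyTable.put(key, "metadata");
        } finally {
            bucketLock.writeLock().unlock();
        }
    }

    // Returns a short description of what was found, or null if nothing matches.
    String getFileStatus(String key) {
        String exact;
        bucketLock.readLock().lock();
        try {
            // Cheap point lookup: fine to do while holding the lock.
            exact = keyTable.get(key);
        } finally {
            bucketLock.readLock().unlock();
        }
        if (exact != null) {
            return "FILE:" + key;
        }
        // Potentially slow scan, done WITHOUT the bucket lock. A concurrent
        // write may land between the lookup above and this scan; that is the
        // consistency tradeoff mentioned in the description.
        if (hasKeyWithPrefix(key + "/")) {
            return "FAKE_DIR:" + key;
        }
        return null;
    }

    // Seeks to the first key >= prefix and checks whether it starts with prefix.
    private boolean hasKeyWithPrefix(String prefix) {
        String next = keyTable.ceilingKey(prefix);
        return next != null && next.startsWith(prefix);
    }

    public static void main(String[] args) {
        FakeDirLookupSketch om = new FakeDirLookupSketch();
        om.putKey("a/b/file1");
        System.out.println(om.getFileStatus("a/b/file1"));
        System.out.println(om.getFileStatus("a/b"));
        System.out.println(om.getFileStatus("a/x"));
    }
}
```

In this sketch a writer calling putKey only ever waits for the short point 
lookup, never for the prefix scan.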


