[ https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17215148#comment-17215148 ]

Yiqun Lin commented on HDDS-4308:
---------------------------------

{quote}This might not be complete, I believe. If two threads acquire the 
copied object and update it outside the lock, we have the issue again. I 
think the whole operation should be performed under the volume lock. (As we 
update in-memory, it should be quick.) But I agree that it might have a 
performance impact across buckets when key writes happen.
{quote}
Using the volume lock during a bucket operation makes the logic a little 
complex. The current PR change does the following (sketched in code below 
the list):
 1) acquire bucket lock
 2) release bucket lock
 3) acquire volume lock
    update volume usedBytes usage
 4) release volume lock
 5) acquire bucket lock again (to finish the remaining operation)
 6) release bucket lock
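
For illustration only, that sequence reads roughly like the sketch below. 
{{bucketLock}} and {{volumeLock}} are plain stand-in locks, not the actual 
OzoneManagerLock API:
{code:java}
import java.util.concurrent.locks.ReentrantLock;

public class LockSequenceSketch {
  // Stand-ins for the OM lock manager, for illustration only.
  private final ReentrantLock bucketLock = new ReentrantLock();
  private final ReentrantLock volumeLock = new ReentrantLock();

  void handleKeyWrite() {
    bucketLock.lock();            // 1) acquire bucket lock
    try {
      // validate the bucket/key part of the request
    } finally {
      bucketLock.unlock();        // 2) release bucket lock
    }

    volumeLock.lock();            // 3) acquire volume lock
    try {
      // update volume usedBytes on the cached OmVolumeArgs
    } finally {
      volumeLock.unlock();        // 4) release volume lock
    }

    bucketLock.lock();            // 5) acquire bucket lock again
    try {
      // finish the remaining bucket-level work
    } finally {
      bucketLock.unlock();        // 6) release bucket lock
    }
  }
}
{code}
The two separate bucket-lock windows are what makes the logic complex.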

Can we instead just make the method OMKeyRequest#getVolumeInfo thread-safe 
and have it return a copied object? That should be enough for the current 
issue and keeps the logic simpler.
 Like:
{code:java}
  public static synchronized OmVolumeArgs getVolumeInfo(
      OMMetadataManager omMetadataManager, String volume) {
    // Return a private copy of the cached OmVolumeArgs so callers never
    // mutate the shared cache instance.
    return omMetadataManager.getVolumeTable().getCacheValue(
        new CacheKey<>(omMetadataManager.getVolumeKey(volume)))
        .getCacheValue().copyObject();
  }
{code}
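
For context, a hypothetical caller-side sketch of how the copy would be used; 
the usedBytes accessors and the buffer parameter are assumptions for 
illustration, not the exact OM API:
{code:java}
  // Hypothetical caller sketch: each request mutates and flushes its own
  // private copy, so the double buffer never holds the shared cache object.
  public static void updateVolumeUsage(OMMetadataManager omMetadataManager,
      String volume, long deltaBytes, List<OmVolumeArgs> doubleBufferEntries) {
    OmVolumeArgs volumeCopy = getVolumeInfo(omMetadataManager, volume);
    // Accessor names are assumptions for illustration.
    volumeCopy.setUsedBytes(volumeCopy.getUsedBytes() + deltaBytes);
    doubleBufferEntries.add(volumeCopy);  // flushes this private snapshot
  }
{code}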

> Fix issue with quota update
> ---------------------------
>
>                 Key: HDDS-4308
>                 URL: https://issues.apache.org/jira/browse/HDDS-4308
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Bharat Viswanadham
>            Assignee: mingchao zhao
>            Priority: Blocker
>              Labels: pull-request-available
>
> Currently volumeArgs uses getCacheValue and puts the same object into the 
> doubleBuffer; this might cause an issue.
> Let's take the below scenario:
> Initial VolumeArgs quotaBytes -> 10000
> 1. T1 -> updates VolumeArgs, subtracting 1000, and puts this updated 
> volumeArgs into the DoubleBuffer.
> 2. T2 -> updates VolumeArgs, subtracting 2000, and has not yet been added 
> to the double buffer.
> *Now, at the end of flushing these transactions, our DB should have 7000 
> as bytes used.*
> Now T1 is picked up by the double buffer and, when it commits, because it 
> uses the cached object put into the doubleBuffer, it flushes to the DB with 
> the updated value from T2 (as it is a cache object) and writes bytesUsed 
> as 7000.
> Now the OM has restarted, and the DB only has transactions up to T1. (We 
> get this info from the TransactionInfo Table, 
> https://issues.apache.org/jira/browse/HDDS-3685.)
> Now T2 is replayed again; as it was not committed to the DB, the DB will 
> again be subtracted by 2000, and the DB will have 5000.
> But after T2 the value should be 7000, so the DB is in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into the 
> double buffer, this can cause this kind of issue. 
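
A stripped-down, single-threaded sketch of the hazard the description walks 
through (class and field names are made up for illustration; this is not the 
OM code):
{code:java}
import java.util.ArrayList;
import java.util.List;

public class SharedCacheObjectDemo {
  // Stand-in for the cached OmVolumeArgs; the field name is hypothetical.
  static class VolumeArgs {
    long usedBytes;
  }

  public static void main(String[] args) {
    VolumeArgs cached = new VolumeArgs();
    cached.usedBytes = 10000;                      // initial value
    List<VolumeArgs> doubleBuffer = new ArrayList<>();

    // T1: subtracts 1000 and enqueues the SAME cached object.
    cached.usedBytes -= 1000;                      // now 9000
    doubleBuffer.add(cached);

    // T2: subtracts 2000 from the same shared object before T1 flushes.
    cached.usedBytes -= 2000;                      // now 7000

    // Flushing T1 writes 7000 (T2's update leaked in), not 9000.
    long db = doubleBuffer.get(0).usedBytes;

    // After an OM restart, the DB only holds transactions up to T1,
    // so T2 is replayed and subtracts 2000 again.
    db -= 2000;
    System.out.println("DB after replay: " + db);  // 5000; 7000 is correct
  }
}
{code}
Returning a copy from the cache (as proposed above) means T2's mutation can 
no longer leak into the object T1 already handed to the double buffer.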


