[
https://issues.apache.org/jira/browse/HDDS-2477?focusedWorklogId=343028&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343028
]
ASF GitHub Bot logged work on HDDS-2477:
----------------------------------------
Author: ASF GitHub Bot
Created on: 14/Nov/19 00:24
Start Date: 14/Nov/19 00:24
Worklog Time Spent: 10m
Work Description: bharatviswa504 commented on pull request #159:
HDDS-2477. TableCache cleanup issue for OM non-HA.:
URL: https://github.com/apache/hadoop-ozone/pull/159
## What changes were proposed in this pull request?
In the OM non-HA case, the ratisTransactionLogIndex is generated by
OmProtocolServersideTranslatorPB.java, and validateAndUpdateCache is called
from multiple handler threads. Consider a case where a thread holding index 10
has added its entry to the doubleBuffer while indexes 0-9 have not been added
yet. The doubleBuffer flush thread flushes and calls cleanup, which removes
all cache entries with an epoch less than 10. It should not remove entries
that are put into the cache later and are still in the process of being
flushed to the DB. This causes inconsistency for some OM requests.
Example:
4 threads committing 4 parts.
1st thread - part 1 - ratis index - 3
2nd thread - part 2 - ratis index - 2
3rd thread - part 3 - ratis index - 1
The first thread gets the lock and puts OmMultipartInfo (with part 1) into the
doubleBuffer and the cache. Cleanup is then called to remove all cache entries
with an epoch less than 3. In the meantime, the 2nd and 3rd threads put parts 2
and 3 into OmMultipartInfo, in both the cache and the doubleBuffer. The cleanup
triggered for the first thread might remove those entries, because it is
called with index 3.
Now the 4th part upload arrives. When it commits the multipart upload and
reads the multipart info, it gets only part 1 in OmMultipartInfo, because the
OmMultipartInfo with parts 1, 2, 3 is still in the process of being committed
to the DB. After the 4th part upload completes, the DB and the cache will hold
only parts 1 and 4. The information for parts 2 and 3 is lost.
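
The interleaving above can be replayed deterministically in a single thread. The following is only a rough sketch under simplifying assumptions: `MultipartLossSketch`, `commitPart`, `flush` and `cleanupUpTo` are hypothetical names, the cache and DB are plain maps, and cleanup is assumed to evict every entry whose epoch is less than or equal to the given index. It is not the actual OM code.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.TreeSet;

// Single-threaded replay of the race; all names here are hypothetical.
class MultipartLossSketch {
  static final String KEY = "upload-1";

  // A cached or queued value: part numbers known so far plus the writer's epoch.
  static final class Entry {
    final TreeSet<Integer> parts;
    final long epoch;
    Entry(TreeSet<Integer> parts, long epoch) {
      this.parts = parts;
      this.epoch = epoch;
    }
  }

  final Map<String, TreeSet<Integer>> db = new HashMap<>();
  final Map<String, Entry> cache = new HashMap<>();
  final Queue<Entry> doubleBuffer = new ArrayDeque<>();

  // Read-modify-write: read from the cache, else the DB; add the part;
  // then put the result into the cache and queue it for flushing.
  void commitPart(int part, long epoch) {
    TreeSet<Integer> parts = new TreeSet<>();
    Entry cached = cache.get(KEY);
    if (cached != null) {
      parts.addAll(cached.parts);
    } else if (db.containsKey(KEY)) {
      parts.addAll(db.get(KEY));
    }
    parts.add(part);
    Entry updated = new Entry(parts, epoch);
    cache.put(KEY, updated);
    doubleBuffer.add(updated);
  }

  // Write up to n queued entries to the DB, oldest first.
  void flush(int n) {
    for (int i = 0; i < n && !doubleBuffer.isEmpty(); i++) {
      db.put(KEY, doubleBuffer.poll().parts);
    }
  }

  // Threshold cleanup (assumed semantics): evict entries with epoch <= given epoch,
  // regardless of whether they have been flushed.
  void cleanupUpTo(long epoch) {
    cache.values().removeIf(e -> e.epoch <= epoch);
  }

  static TreeSet<Integer> replay() {
    MultipartLossSketch om = new MultipartLossSketch();
    om.commitPart(1, 3);         // thread 1: part 1, index 3
    om.flush(1);                 // flush thread writes {1} to the DB
    om.commitPart(2, 2);         // thread 2: part 2, index 2, cache holds {1,2}
    om.commitPart(3, 1);         // thread 3: part 3, index 1, cache holds {1,2,3}
    om.cleanupUpTo(3);           // cleanup for index 3 evicts the unflushed {1,2,3}
    om.commitPart(4, 4);         // part 4: cache miss, reads {1} from DB, writes {1,4}
    om.flush(Integer.MAX_VALUE); // remaining entries reach the DB in queue order
    return om.db.get(KEY);       // final DB state for the upload
  }
}
```

In this replay the last write to reach the DB is the one built from the stale read, so `replay()` ends with parts 1 and 4 only.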
So for the non-HA case, cleanup will be called with the list of epochs that
need to be cleaned up.
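
As an illustration of the idea, cleanup driven by an explicit list of flushed epochs could look roughly like the sketch below. This is an assumption-laden toy model, not the patch itself: `EpochListCleanupSketch` and its methods are hypothetical, and the cache is reduced to a map from key to the epoch that wrote it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified cache; not the actual Ozone TableCache API.
class EpochListCleanupSketch {
  // key -> epoch (transaction index) of the request that last wrote the entry
  private final Map<String, Long> epochByKey = new ConcurrentHashMap<>();

  void put(String key, long epoch) {
    epochByKey.put(key, epoch);
  }

  boolean contains(String key) {
    return epochByKey.containsKey(key);
  }

  // Evict only entries whose epoch appears in the flushed list.
  // Entries with a lower epoch that have not been flushed yet stay cached.
  void cleanup(List<Long> flushedEpochs) {
    Set<Long> flushed = new HashSet<>(flushedEpochs);
    epochByKey.values().removeIf(flushed::contains);
  }

  static boolean[] demo() {
    EpochListCleanupSketch cache = new EpochListCleanupSketch();
    cache.put("flushed", 10L);          // index 10 was added and flushed first
    cache.put("pending", 7L);           // index 7 arrives later, not flushed yet
    cache.cleanup(Arrays.asList(10L));  // flush thread reports only epoch 10
    boolean flushedEvicted = !cache.contains("flushed");
    boolean pendingKept = cache.contains("pending");
    cache.cleanup(Arrays.asList(7L));   // later flush covers epoch 7
    boolean pendingEvictedLater = !cache.contains("pending");
    return new boolean[] {flushedEvicted, pendingKept, pendingEvictedLater};
  }
}
```

With this shape, an entry leaves the cache only after the flush that actually wrote its epoch, independent of the order in which handler threads were assigned indexes.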
## What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-2477
## How was this patch tested?
Added UT.
Issue Time Tracking
-------------------
Worklog Id: (was: 343028)
Remaining Estimate: 0h
Time Spent: 10m
> TableCache cleanup issue for OM non-HA
> --------------------------------------
>
> Key: HDDS-2477
> URL: https://issues.apache.org/jira/browse/HDDS-2477
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Bharat Viswanadham
> Assignee: Bharat Viswanadham
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h