jvarenina commented on a change in pull request #7145:
URL: https://github.com/apache/geode/pull/7145#discussion_r763045860
##########
File path: geode-core/src/main/java/org/apache/geode/internal/cache/Oplog.java
##########
@@ -1303,16 +1305,18 @@ void close(DiskRegion dr) {
// while a krf is being created can not close a region
lockCompactor();
try {
- addUnrecoveredRegion(dr.getId());
- DiskRegionInfo dri = getDRI(dr);
- if (dri != null) {
- long clearCount = dri.clear(null);
- if (clearCount != 0) {
- totalLiveCount.addAndGet(-clearCount);
- // no need to call handleNoLiveValues because we now have an
- // unrecovered region.
+ if (!isDrfOnly()) {
Review comment:
We have put the `isDrfOnly()` check after `lockCompactor()` on purpose, because of
the following case:
The Oplog files (.crf, .drf and .krf) are in the process of being compacted and
`lockCompactor()` is held. If we checked `isDrfOnly()` before trying to acquire the
lock, it would most probably return false, because all Oplog files (.crf, .drf and
.krf) exist during compaction (the files are deleted as the last step of compaction,
so the check can only return true during that short window). After the check, the
thread would continue and block (on `lockCompactor()`) until compaction is done.
Compaction could then produce a .drf-only file and release the lock, and in that
case the thread would resume and execute unnecessary code.
Because we put the `isDrfOnly()` check after `lockCompactor()`, the thread first
blocks until compaction is done. If compaction produces only a .drf file, then
`isDrfOnly()` is evaluated after acquiring the lock and the unnecessary code is
skipped.
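To make the ordering concrete, here is a minimal sketch of the resulting shape of
`close(DiskRegion)` (not the exact diff; it assumes `unlockCompactor()` is the
counterpart of `lockCompactor()` used elsewhere in `Oplog`):
```
// Sketch only: evaluate isDrfOnly() while holding the compactor lock, so the check
// reflects the state *after* any in-progress compaction has finished.
void close(DiskRegion dr) {
  lockCompactor(); // blocks here until a running compaction releases the lock
  try {
    if (!isDrfOnly()) {
      // only a non drf-only oplog can still carry unrecovered/live data
      addUnrecoveredRegion(dr.getId());
      // ... clear the DiskRegionInfo and adjust totalLiveCount as before ...
    }
  } finally {
    unlockCompactor(); // assumed counterpart of lockCompactor()
  }
}
```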
We think that it is safe to skip `addUnrecoveredRegion()` when the Oplog is a
.drf-only file. This is because a .drf-only Oplog cannot contain live entries and
therefore cannot contain unrecovered live region entries. We would expect that
unrecovered live entries for a region are only possible while a .crf file is
available (keep in mind that the .crf file is deleted by compaction when only the
.drf file remains). And that would only happen when one of the multiple regions that
have been using this disk-store Oplog is not yet recovered. In that case it is not
possible to perform compaction, because geode has not loaded the unrecovered
region's data from the .crf file into memory and does not have references to the
live entries needed for compaction (please check the linked list
`DiskRegionInfoWithList.liveEntries`). We are not sure whether we got this case
right, or whether there is something we have misunderstood.
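The invariant we rely on could be written down as a (hypothetical) sanity check;
this method does not exist in the code and it assumes `unrecoveredRegionCount` is an
atomic counter like `totalLiveCount`:
```
// Hypothetical sanity check, only to express the argument above in code:
// a drf-only oplog has been fully compacted, so it carries no live entries and
// no regions that would still need addUnrecoveredRegion() bookkeeping.
private void checkDrfOnlyInvariant() {
  if (isDrfOnly()) {
    assert totalLiveCount.get() <= 0
        : "drf-only oplog unexpectedly reports live entries";
    assert unrecoveredRegionCount.get() == 0
        : "drf-only oplog unexpectedly has unrecovered regions";
  }
}
```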
We also think that it is not necessary to add an additional `isDrfOnly()` check in
the places where `unrecoveredRegionCount > 0` is checked (e.g. `hasNoLiveValues()`
and `needsCompaction()`). If compaction produced only a .drf file, then both
`totalLiveCount` and `unrecoveredRegionCount` will be 0 for that Oplog object. If
`unrecoveredRegionCount` is greater than 0, then compaction will not be done in the
first place, and the .crf would not be deleted.
1. Function `hasNoLiveValues()` -> This function already checks whether
`totalLiveCount.get() <= 0`. This condition will always be **true** for .drf-only
files.
2. Function `needsCompaction()` -> I'm not sure in which case geode would try to
mark a .drf-only file as ready for compaction. Even if this happens, the compactor
thread will check whether `hasNoLiveValues()` is true and then call
`handleNoLiveValues()`. This function will then (if this is the oldest .drf-only
Oplog) delete all .drf-only files that don't have a .crf file with an older Id.
Otherwise it would just call `handleEmpty()`, which would cancel creation of the
.krf and simply return, since a .drf-only Oplog is already closed. I'm not sure
whether we would break something if we added this additional check in
`needsCompaction()`. Please check the code below (a sketch of such a guard follows
after the snippet):
```
  void handleNoLiveValues() {
    if (!doneAppending) {
      return;
    }
    if (hasNoLiveValues()) {
      if (LocalRegion.ISSUE_CALLBACKS_TO_CACHE_OBSERVER) {
        if (calledByCompactorThread()) {
          // after compaction, remove the oplog from the list & destroy it
          CacheObserverHolder.getInstance().beforeDeletingCompactedOplog(this);
        } else {
          CacheObserverHolder.getInstance().beforeDeletingEmptyOplog(this);
        }
      }
      if (isOldest()) {
        if (calledByCompactorThread()) {
          // do it in this compactor thread
          handleEmptyAndOldest(true);
        } else {
          // schedule another thread to do it
          getParent().executeDiskStoreTask(() -> handleEmptyAndOldest(false));
        }
      } else {
        if (calledByCompactorThread()) {
          // do it in this compactor thread
          handleEmpty(true);
        } else {
          // schedule another thread to do it
          getParent().executeDiskStoreTask(() -> handleEmpty(false));
        }
      }
    }
  }
```
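If we did decide to add the guard anyway, a minimal sketch of what it might look
like is below (hypothetical code, not the current `needsCompaction()`;
`remainingNeedsCompactionChecks()` is just a placeholder for the method's existing
body):
```
// Hypothetical extra guard, not current code: bail out early for drf-only oplogs.
// As argued above this should be redundant, because a drf-only oplog already has
// totalLiveCount == 0 and unrecoveredRegionCount == 0.
boolean needsCompaction() {
  if (isDrfOnly()) {
    // no .crf means no live entries left to copy forward
    return false;
  }
  // placeholder for the existing checks (unrecoveredRegionCount, totalLiveCount, ...)
  return remainingNeedsCompactionChecks();
}
```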