keith-turner commented on code in PR #6060:
URL: https://github.com/apache/accumulo/pull/6060#discussion_r2728961270
##########
server/manager/src/main/java/org/apache/accumulo/manager/compaction/coordinator/commit/CommitCompaction.java:
##########
@@ -211,13 +223,51 @@ private void updateTabletForCompaction(TCompactionStats
stats, ExternalCompactio
// scan
ecm.getJobFiles().forEach(tabletMutator::putScan);
}
- ecm.getJobFiles().forEach(tabletMutator::deleteFile);
+ for (StoredTabletFile file : ecm.getJobFiles()) {
+ // Check if file is shared
+ DataFileValue fileMetadata = tablet.getFilesMap().get(file);
+ boolean isShared = fileMetadata != null && fileMetadata.isShared();
+
+ if (isShared) {
+ // File is shared - must use GC delete markers
+ LOG.debug("File {} is shared, will create GC delete marker",
file.getFileName());
+ filesToDeleteViaGc.add(file);
+ } else {
+ // File is not shared - try to delete directly
+ try {
+ Path filePath = new Path(file.getMetadataPath());
+ boolean deleted = ctx.getVolumeManager().deleteRecursively(filePath);
Review Comment:
Deleting files here is not correct because the conditional mutation to the
metadata table has not yet been made. Only after successfully making the
conditional mutation will we know what was shared or not. Need to do something
like the following.
1. Modify conditional mutation to check that the file values (this checks
the new shared field) are the same for the tablet. This atomically ensures
that what is thought to be shared is actually shared when the metadata table
update happens. Currently the conditional mutation only checks the set of
files, not their values.
2. After successfully submitting the condtional mutation look at the table
metadata and the list of files compacting (from commitData) . Compute a set of
files that is in commitData.inputPaths and is marked as not shared in the
tablet metadata. If a file is not in tablet metadata, add it to the set because
its shared status is unknown (this can happen because of failure and retry).
This is the set of compacting files that are known for sure to not be shared,
in the case of failure and retry this set should be empty.
3. If the conditional mutation was not successful, this set above should be
empty.
4. Pass the set computed above to PutGcCandidates
5. In PutGcCandidates delete non shared files and add gc delete markers
for shared files. Doing the actual file deletes in PutGcCandidates makes
reasoning about retries (caused by process dying in middle of fate step) much
easier.
There is existing code for comparing the files values, but its not exactly
what we need. Would probably need to create a new
`requireFiles(Map<StoredTabletFile, DataFileValue> files)` method that uses
code similar to
[this](https://github.com/apache/accumulo/blob/3dfe28e8e1774a143c86edbeb645ea3ac301cdd5/server/base/src/main/java/org/apache/accumulo/server/metadata/ConditionalTabletMutatorImpl.java#L242-L247)
in its impl.
##########
server/base/src/main/java/org/apache/accumulo/server/util/MetadataTableUtil.java:
##########
@@ -380,6 +428,8 @@ public static void cloneTable(ServerContext context,
TableId srcTableId, TableId
}
}
+ markSourceFilesAsShared(srcTableId, tableId, context, bw);
Review Comment:
Marking the source tablets as shared here has race conditions, these may not
be the files in the clone. Also marking them as shared must be done using a
conditional writer that only changes the marking if the file exists in the
tablet w/ the same value (this check must be done by the conditional mutation).
The following conditions must be true for clone to avoid race conditions.
1. Files are marked as shared before cloning.
2. After copying the tablets file they must still exist in the source
tablet.
Need to rework the code to do the following algorithm for each tablet.
1. Before cloning mark all files in the source tablet as shared using a
conditional mutation.
2. Copy the source tablets files to the destination tablet.
3. Read the source and desitation tablets and ensure the destitnations
tablet has a subset of the sources files. If so this tablet was successfully
cloned and we know the GC did not delete any files because the source still has
them. Done with tablet and do not need to next step.
4. There was a concurrent change w/ the source tablets files, possible that
GC even deleted some of them. Delete the destination tablet and go back to
step 1.
The current code does this w/o step 1. Also the current code batches tablet
read/writes for efficiency. The code would probably be much easier to
understand w/o the batching, but doing one tablet at a time w/o batching would
be really slow. So need to modify the code work in step 1 and still do the
batching.
##########
server/manager/src/main/java/org/apache/accumulo/manager/tableOps/bulkVer2/LoadFiles.java:
##########
@@ -402,6 +404,32 @@ static long loadFiles(Loader loader, BulkInfo bulkInfo,
Path bulkDir,
String fmtTid = fateId.getTxUUIDStr();
log.trace("{}: Started loading files at row: {}", fmtTid, startRow);
+ List<Map.Entry<KeyExtent,Bulk.Files>> allEntries = new ArrayList<>();
+ while (lmi.hasNext()) {
Review Comment:
Instead of pulling all of this into memory, it may be better to make two
passes over the file. The first pass computes the shared files and the 2nd
pass calls this method with the shared files.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]