Re: [I] Investigate removing table locks for FATE ops [accumulo]

via GitHub Thu, 26 Oct 2023 09:29:41 -0700


keith-turner commented on issue #3823:
URL: https://github.com/apache/accumulo/issues/3823#issuecomment-1781457175

The interaction of merge with other operations would need the most attention
if removing table locks. Currently table locks prevent other operations
from running concurrently with merge.

## User compactions

User compactions create two tablet metadata columns, one for running
compactions and one for completed compactions. Without table locks, merge
would need to handle these in some way. These columns cannot be copied during
a merge operation. Conceptually, merge would want to either wait on
compactions or cancel them. A nice behavior would be for merge to cancel
running compactions and let existing completed compaction markers be processed
before starting to merge.

* Merge places an operation id on each tablet and, in the same mutation,
deletes any entries for running compactions.
* After placing the operation ids, merge waits for all tablets to have no
location and no compaction completed markers.

The above would make merge wait on completed compactions to be acknowledged
without waiting on running compactions or allowing new compactions to start
until after the merge is complete.
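
A minimal sketch of the two steps above, using a hypothetical in-memory `Tablet` model rather than Accumulo's real metadata classes. The names `reserve`, `tryStartCompaction` and `readyToMerge` are made up for illustration, and the rule that a compaction cannot start on a tablet with an operation id is an assumption about how new compactions would be kept out.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of tablet metadata; not Accumulo's actual API.
final class MergeReserveSketch {

  static final class Tablet {
    String operationId; // null when no operation holds the tablet
    String location;    // null when the tablet is not hosted
    final Set<String> runningCompactions = new HashSet<>();
    final Set<String> completedMarkers = new HashSet<>();
  }

  // Step 1: set the operation id and delete running compaction entries
  // in one update. Returns false if another operation holds the tablet.
  static boolean reserve(Tablet t, String opId) {
    if (t.operationId != null && !t.operationId.equals(opId)) {
      return false;
    }
    t.operationId = opId;
    t.runningCompactions.clear();
    return true;
  }

  // Assumption: a compaction may only start when no operation id is present.
  static boolean tryStartCompaction(Tablet t, String compactionId) {
    if (t.operationId != null) {
      return false;
    }
    t.runningCompactions.add(compactionId);
    return true;
  }

  // Step 2: merge proceeds only when every tablet carries its operation id,
  // has no location, and has no completed compaction markers left.
  static boolean readyToMerge(List<Tablet> tablets, String opId) {
    for (Tablet t : tablets) {
      if (!opId.equals(t.operationId) || t.location != null
          || !t.completedMarkers.isEmpty()) {
        return false;
      }
    }
    return true;
  }
}
```

In this model the merge would poll `readyToMerge` until the tablets are unloaded and the completed markers have been acknowledged and removed by whatever processes them.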

## Bulk import

Bulk import adds loaded markers to tablets. During merge these loaded
markers would need to be handled. Currently bulk import does the following:

1. Checks that its load mapping aligns with actual split points in the
table and if not throws a concurrent merge exception. This checks for a merge
that happened between file inspection (to create the load mapping) and
acquiring the table lock.
2. Starts loading files into tablets with the assumption that a concurrent
merge will not happen because of the table lock. Writes load columns to
individual tablets.
3. After all tablets are loaded, deletes the loaded mappings from each
tablet.

The reason step 1 is done is that files need to be imported into the exact
tablets specified. However, now that files can have ranges, if the load
mapping does not align with the table's splits we may be able to add ranges to
the bulk loaded files to handle this situation. The merge operation could also
adjust these ranges on loaded mappings like it does for file mappings. This
would allow bulk import to detect ranges that are not completely covered.
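
A rough sketch of what adding a range to a bulk loaded file could look like: clip the range the load mapping expected to the extent of the tablet that actually exists. Everything here is hypothetical (`RowRange`, `clip`, rows as plain strings); it only assumes the usual tablet extent convention of an exclusive previous end row and an inclusive end row, with `null` meaning unbounded.

```java
// Hypothetical helper; rows are plain strings and null means unbounded.
final class LoadRangeSketch {

  // Row range (prevEndRow, endRow]: exclusive lower bound, inclusive upper.
  static final class RowRange {
    final String prevEndRow;
    final String endRow;

    RowRange(String prevEndRow, String endRow) {
      this.prevEndRow = prevEndRow;
      this.endRow = endRow;
    }
  }

  // Larger of two lower bounds, where null is negative infinity.
  static String maxLower(String a, String b) {
    if (a == null) return b;
    if (b == null) return a;
    return a.compareTo(b) >= 0 ? a : b;
  }

  // Smaller of two upper bounds, where null is positive infinity.
  static String minUpper(String a, String b) {
    if (a == null) return b;
    if (b == null) return a;
    return a.compareTo(b) <= 0 ? a : b;
  }

  // Range to attach to a file loaded into `tablet` when the load mapping
  // expected `mapped`. Returns null when the two do not overlap.
  static RowRange clip(RowRange mapped, RowRange tablet) {
    String lo = maxLower(mapped.prevEndRow, tablet.prevEndRow);
    String hi = minUpper(mapped.endRow, tablet.endRow);
    if (lo != null && hi != null && lo.compareTo(hi) >= 0) {
      return null;
    }
    return new RowRange(lo, hi);
  }
}
```

If a merge turned the mapped tablet `(c, f]` into part of `(a, m]`, the file would be loaded into the merged tablet with range `(c, f]` instead of failing the import.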

Adding ranges to the load mapping may also remove the need to ever throw a
concurrent merge exception in the bulk API, as concurrent merges could now be
handled. So maybe [this
check](https://github.com/apache/accumulo/blob/c44216d7643c50830e0619a2037d4a8d20c15a01/server/manager/src/main/java/org/apache/accumulo/manager/tableOps/bulkVer2/PrepBulkImport.java#L181)
could be modified or removed.

## Split tablets

Split and merge both set operation ids on tablets, so this case is easy.
They can wait on each other via the operation id.

## Merge

Concurrent merge operations that run on the same table with overlapping
ranges could deadlock if acquiring operation ids is not done with care. If
merge operations observe other merge operations in the metadata table, one of
them could remove its operation ids to let the other proceed. This could be
the merge operation with the highest fate tx id. Relinquishing opids would only
need to be done when a merge operation is in the phase of acquiring operation
ids on all of its tablets. Once a merge operation is past the acquisition
phase, it can just complete and no longer needs to worry about relinquishing.
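
The acquisition phase could look something like the sketch below. It is only an illustration: the metadata table is modeled as a plain map from tablet to the fate tx id holding it, and `acquireAll` and `Result` are invented names. The tie-break follows the suggestion above that the merge with the highest fate tx id backs off.

```java
import java.util.List;
import java.util.Map;

// Hypothetical model: opIds maps a tablet to the fate tx id holding it.
final class MergeAcquireSketch {

  enum Result { ACQUIRED, RELINQUISHED, WAIT }

  // One pass of the acquisition phase for the merge with id myTxId.
  static Result acquireAll(Map<String, Long> opIds, List<String> tablets, long myTxId) {
    for (String tablet : tablets) {
      Long holder = opIds.get(tablet);
      if (holder != null && holder != myTxId) {
        if (myTxId > holder) {
          // Highest tx id backs off: drop every op id placed so far so the
          // other operation can finish acquiring.
          opIds.values().removeIf(id -> id == myTxId);
          return Result.RELINQUISHED;
        }
        // Lower tx id keeps what it holds and retries this pass later.
        return Result.WAIT;
      }
      opIds.put(tablet, myTxId);
    }
    // Past this point the merge never needs to relinquish.
    return Result.ACQUIRED;
  }
}
```

A caller would retry on `WAIT` or `RELINQUISHED` after a pause. A real implementation would need conditional mutations so that checking the holder and setting the operation id happen atomically per tablet.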

## Clone

The clone code makes repeated passes over the source and destination
metadata until convergence is confirmed. This clone code is currently split
tolerant; it would also need to be made merge tolerant.

## Delete table

If delete table acquires operation ids on a tablet, then it will mostly be
fine with concurrent merges. However when a merge operation is in its
acquisition phase, it should relinquish any operation ids it has obtained if it
sees delete table operation ids.

## Offline table

Currently taking a table offline waits for any running merge to complete
and, once the table is offline, prevents any merge from starting. Without
table locks this would be hard to achieve without placing information in each
tablet for an offline table. Maybe tablets could have a state of enabled or
disabled (using terminology from #3860). This per tablet state could have the
following properties.

* Can only transition to disabled when there are no operation ids and no
location.
* A merge operation cannot set an operation id on a tablet if its state is
disabled.
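
The two properties above amount to a pair of guarded updates. Here is a small sketch with a hypothetical `Tablet` model; the `State` enum and method names are invented for illustration, and a real version would express each guard as a conditional mutation on the metadata table.

```java
// Hypothetical per-tablet state model; not Accumulo's actual API.
final class TabletStateSketch {

  enum State { ENABLED, DISABLED }

  static final class Tablet {
    State state = State.ENABLED;
    String operationId; // null when no operation holds the tablet
    String location;    // null when the tablet is not hosted
  }

  // A tablet can only become disabled when it has no operation id and no
  // location.
  static boolean disable(Tablet t) {
    if (t.operationId != null || t.location != null) {
      return false;
    }
    t.state = State.DISABLED;
    return true;
  }

  // Merge cannot set an operation id on a disabled tablet, or on one that
  // another operation already holds.
  static boolean setMergeOperationId(Tablet t, String opId) {
    if (t.state == State.DISABLED || t.operationId != null) {
      return false;
    }
    t.operationId = opId;
    return true;
  }
}
```

With both guards in place, disabling waits out a running merge (its operation id blocks the transition) and a disabled tablet blocks any new merge, which is the behavior the table lock gives today.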

Maybe the APIs for taking a table online/offline go away and are replaced
by the concept of enabling and disabling ranges of a table.

## Export table

Exporting a table currently depends on the table being offline, so the
conflict of offline with merge should consider export.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
