Sourabh Badhya created HIVE-27018:
-------------------------------------
Summary: Aborted transaction cleanup outside compaction process
Key: HIVE-27018
URL: https://issues.apache.org/jira/browse/HIVE-27018
Project: Hive
Issue Type: Improvement
Reporter: Sourabh Badhya
Assignee: Sourabh Badhya
Aborted transaction processing is tightly integrated into the compaction
pipeline and consists of three main stages: Initiator, Compactor (Worker), and
Cleaner. This could be simplified by doing all of the abort-cleanup work on the
Cleaner side.
*Potential Benefits -*
There are major advantages to implementing this on the Cleaner side -
1) Currently, an aborted txn in the TXNS table blocks cleaning of the
TXN_TO_WRITE_ID table, since nothing above MIN(aborted txnid) gets cleaned in
the current implementation. With abort handling on the Cleaner side, the
Cleaner regularly checks for and cleans aborted records in the TXN_COMPONENTS
table, which in turn lets the AcidTxnCleanerService clean the aborted txns in
the TXNS table.
2) The Initiator and Worker do nothing for tables which contain only aborted
directories; it is the Cleaner that removes those directories. Hence all
Initiator and Worker operations for such tables are wasted work, and moving
abort cleanup to the Cleaner avoids them.
3) Aborted dynamic-partition (DP) writes are currently skipped by the Worker,
so once again it is the Cleaner that deletes the aborted directories. All
Initiator and Worker operations for such entries are likewise wasted work and
are avoided.
*Proposed solution -*
*Implement logic to handle aborted transactions exclusively in Cleaner.*
Implement logic to fetch the TXN_COMPONENTS entries associated with
transactions in the aborted state and send the required information to the
Cleaner. The Cleaner must then remove the aborted delta/delete-delta
directories, using the aborted directories reported in the AcidState of the
table/partition.
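A hypothetical sketch of the metastore query such a fetch could run (column
names follow the standard metastore schema, but the actual query in the
implementation may differ):

```sql
-- Sketch only: find tables/partitions that have components written by
-- aborted transactions ('a' is the aborted state in TXNS.TXN_STATE).
SELECT tc.TC_DATABASE, tc.TC_TABLE, tc.TC_PARTITION
FROM TXN_COMPONENTS tc
JOIN TXNS t ON tc.TC_TXNID = t.TXN_ID
WHERE t.TXN_STATE = 'a'
GROUP BY tc.TC_DATABASE, tc.TC_TABLE, tc.TC_PARTITION;
```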
To improve code modularity, it is also better to separate the entities that
provide compaction-cleanup and abort-cleanup information. The Cleaner can be
divided into separate entities -
*1) Handler* - This entity fetches the data from the metastore DB from relevant
tables and converts it into a request entity called CleaningRequest. It would
also do SQL operations post cleanup (postprocess). Every type of cleaning
request is provided by a separate handler.
*2) Filesystem remover* - This entity consumes the cleaning requests produced
by the various handlers and deletes the corresponding directories from the
filesystem.
*This division allows for dynamic extensibility of cleanup from multiple
handlers. Every handler is responsible for providing cleaning requests from a
specific source.*
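A minimal sketch of this split, assuming the names Handler, CleaningRequest
and FilesystemRemover from the description above (the actual Hive classes and
signatures may differ):

```java
import java.util.List;

// Hypothetical sketch of the proposed Cleaner split; not the actual Hive code.
class CleaningRequest {
    final String table;
    final List<String> obsoleteDirs;  // e.g. aborted delta/delete-delta dirs
    CleaningRequest(String table, List<String> obsoleteDirs) {
        this.table = table;
        this.obsoleteDirs = obsoleteDirs;
    }
}

// One handler per cleaning-request source (compaction queue, aborted txns, ...).
interface Handler {
    List<CleaningRequest> findReadyToClean();   // read the metastore tables
    void postprocess(CleaningRequest request);  // SQL cleanup after deletion
}

// The remover is source-agnostic: it only consumes requests and deletes paths.
class FilesystemRemover {
    int clean(List<Handler> handlers) {
        int deleted = 0;
        for (Handler h : handlers) {
            for (CleaningRequest req : h.findReadyToClean()) {
                // A real implementation would delete req.obsoleteDirs on HDFS here.
                deleted += req.obsoleteDirs.size();
                h.postprocess(req);
            }
        }
        return deleted;
    }
}
```

With this shape, adding a new cleanup source (such as aborted-txn cleanup)
means adding a new Handler implementation, without touching the remover.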
This solution is also resilient: in the event of an abrupt metastore shutdown,
the Cleaner can still see the relevant entries in the metastore DB and retry
the cleaning task for those entries.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)