github-actions[bot] commented on code in PR #63310:
URL: https://github.com/apache/doris/pull/63310#discussion_r3427248747


##########
fe/fe-core/src/main/java/org/apache/doris/catalog/CatalogRecycleBin.java:
##########
@@ -574,7 +587,128 @@ public void replayEraseTable(long tableId) {
         }
     }
 
-    private void erasePartition(long currentTimeMs, int keepNum) {
+    /**
+     * Erase all partitions belonging to the given table.
+     * This handles partitions that were not expired when erasePartition() 
collected
+     * its expired IDs, but became expired later due to time advancement 
during processing.
+     *
+     * <p>This method acquires its own locks with fine granularity:
+     * <ul>
+     *   <li>Read lock to collect partition IDs</li>
+     *   <li>Write lock for each individual partition cleanup</li>
+     * </ul>
+     *
+     * @param tableId the table ID whose partitions need to be erased
+     */
+    private void erasePartitionsForTable(long tableId) {
+        // 1. Collect orphan partition IDs under read lock (fast, non-blocking)
+        List<Long> partitionIds = new ArrayList<>();
+        readLock();
+        try {
+            for (Map.Entry<Long, RecyclePartitionInfo> entry : 
idToPartition.entrySet()) {
+                if (entry.getValue().getTableId() == tableId) {
+                    partitionIds.add(entry.getKey());
+                }
+            }
+        } finally {
+            readUnlock();
+        }
+
+        // 2. Clean each orphan partition with individual write lock (fine 
granularity)
+        for (Long partitionId : partitionIds) {
+            writeLock();
+            try {
+                RecyclePartitionInfo pInfo = idToPartition.remove(partitionId);
+                if (pInfo == null) {
+                    continue;
+                }
+                Partition partition = pInfo.getPartition();
+                Env.getCurrentEnv().onErasePartition(partition);
+                idToRecycleTime.remove(partitionId);
+
+                dbTblIdPartitionNameToIds.computeIfPresent(
+                        Pair.of(pInfo.getDbId(), pInfo.getTableId()), (pair, 
partitionMap) -> {
+                            partitionMap.computeIfPresent(partition.getName(), 
(name, idSet) -> {
+                                idSet.remove(partitionId);
+                                return idSet.isEmpty() ? null : idSet;
+                            });
+                            return partitionMap.isEmpty() ? null : 
partitionMap;
+                        });
+
+                
Env.getCurrentEnv().getEditLog().logErasePartition(partitionId);
+                LOG.info("erase orphan partition[{}] when erasing table[{}]", 
partitionId, tableId);
+            } finally {
+                writeUnlock();
+            }
+        }
+    }
+
+    /**
+     * Erase all tables belonging to the given database.
+     * This handles tables that were not expired when eraseTable() collected
+     * its expired IDs, but became expired later due to time advancement 
during processing.
+     *
+     * <p>This method acquires its own locks with fine granularity:
+     * <ul>
+     *   <li>Read lock to collect table IDs</li>
+     *   <li>For each table, erase its partitions then erase the table 
itself</li>
+     * </ul>
+     *
+     * @param dbInfo the RecycleDatabaseInfo containing the database and its 
table metadata
+     */
+    private void eraseTablesForDatabase(RecycleDatabaseInfo dbInfo) {
+        long dbId = dbInfo.getDb().getId();
+        Set<String> tableNames = Sets.newHashSet(dbInfo.getTableNames());
+        Set<Long> tableIdSet = Sets.newHashSet(dbInfo.getTableIds());
+
+        // 1. Collect orphan table IDs under read lock
+        List<Long> tableIds = new ArrayList<>();
+        readLock();
+        try {
+            for (Map.Entry<Long, RecycleTableInfo> entry : 
idToTable.entrySet()) {
+                RecycleTableInfo tableInfo = entry.getValue();
+                if (tableInfo.getDbId() == dbId
+                        && tableNames.contains(tableInfo.getTable().getName())

Review Comment:
   This still skips recycled tables from the same database if they were dropped 
before the database drop. Concrete sequence: drop table `t_old` in db `d`, then 
later drop database `d`. `RecycleDatabaseInfo` is built from `db.getTables()` 
at the database drop, so it does not contain `t_old`. When the db expires, this 
helper skips `t_old`, then `eraseDatabase()` removes/logs the db, leaving 
`idToTable` with a dbId that is no longer in the catalog or `idToDatabase`. 
That table is no longer recoverable and will be rebuilt from image as a 
recycle-bin table whose database is missing. For database parent erasure, 
please erase all recycled tables/partitions with this `dbId` before logging 
`OP_ERASE_DB`, or otherwise explicitly clean the remaining children by dbId. 
This is distinct from the existing thread's current-table drop-db case because 
it covers tables that were already recycled before the database drop and 
therefore are not in `dbInfo.getTableIds()`.



##########
fe/fe-core/src/main/java/org/apache/doris/catalog/CatalogRecycleBin.java:
##########
@@ -467,6 +479,7 @@ private void eraseTable(long currentTimeMs, int keepNum) {
                     if (tableInfo == null) {
                         continue;
                     }
+                    erasePartitionsForTable(tableId);

Review Comment:
   This fixes the normal expired-table path, but the parallel 
`eraseTableWithSameName()` path still removes a recycled table without calling 
`erasePartitionsForTable`. A concrete case is: drop partition `p` from table 
`t`, then drop multiple tables named `t` in the same db so 
`max_same_name_catalog_trash_num` evicts the older table after min latency. 
`erasePartition()` can leave `p` because it is not expired and not over the 
partition keep count, then `eraseTableWithSameName()` removes/logs the table 
and leaves `p` with a missing parent table. Please apply the same partition 
cascade before removing/logging the table in the same-name path, and add a test 
for that path.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to