This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch branch-2.1
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/branch-2.1 by this push:
     new 6eba030897d [fix](chore) path gc should consider tablet migration (#30095) (#30548)
6eba030897d is described below

commit 6eba030897dc3a7e7ca5f4f0b24c24e88c6fe503
Author: zhannngchen <[email protected]>
AuthorDate: Tue Jan 30 12:03:21 2024 +0800

    [fix](chore) path gc should consider tablet migration (#30095) (#30548)
    
    Background:
    
    Migration creates a new tablet in a different DataDir, and the old tablet is moved to TabletManager::_shutdown_tablets.
    The migration task does not copy data in stale rowsets to the new tablet, so after migration the new tablet does not contain the stale rowsets of the old tablet.
    The path GC process checks every path to determine whether it belongs to a useless tablet or a useless rowset. If it does, the data of that tablet or rowset is removed.
    
    The issue:
    
    When path GC gets a stale rowset path from the data dir of the old tablet, it extracts the tablet id and rowset id.
    It then checks whether the tablet id exists in TabletManager, and the answer is yes.
    It gets the tablet instance, which is the new tablet, and checks whether the stale rowset id from the old tablet path exists in the new tablet instance. The answer is no.
    The path GC process treats the rowset as useless, since it cannot find anyone holding a reference to it, and deletes the data of this stale rowset.
    But some queries may still hold a reference to this stale rowset, so the deletion causes query failures.
    
    Solution:
    
    The lifecycle of all rowsets in a shutdown tablet should be tied to the lifecycle of that tablet.
    We need to differentiate the old tablet from the new one created by the migration task while performing path GC.
---
 be/src/olap/data_dir.cpp | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/be/src/olap/data_dir.cpp b/be/src/olap/data_dir.cpp
index 24dc88169b8..351e7fd992e 100644
--- a/be/src/olap/data_dir.cpp
+++ b/be/src/olap/data_dir.cpp
@@ -694,7 +694,14 @@ void DataDir::_perform_path_gc_by_tablet(std::vector<std::string>& tablet_paths)
             std::swap(*forward, *backward);
             continue;
         }
-        if (auto tablet = _tablet_manager->get_tablet(tablet_id); !tablet) {
+        auto tablet = _tablet_manager->get_tablet(tablet_id);
+        if (!tablet || tablet->data_dir() != this) {
+            if (tablet) {
+                LOG(INFO) << "The tablet in path " << path
+                          << " is not same with the running one: " << tablet->data_dir()->_path
+                          << "/" << tablet->tablet_path()
+                          << ", might be the old tablet after migration, try to move it to trash";
+            }
             _tablet_manager->try_delete_unused_tablet_path(this, tablet_id, schema_hash, path);
             --backward;
             std::swap(*forward, *backward);
@@ -740,6 +747,12 @@ void DataDir::_perform_path_gc_by_rowset(const std::vector<std::string>& tablet_
             continue;
         }
 
+        if (tablet->data_dir() != this) {
+            // Current running tablet is not in same data_dir, maybe it's a tablet after migration,
+            // will be reclaimed in the next time `_perform_path_gc_by_tablet`
+            continue;
+        }
+
         bool exists;
         std::vector<io::FileInfo> files;
         auto st = io::global_local_filesystem()->list(path, true, &files, &exists);
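
The decision logic introduced by the two hunks above can be sketched in isolation. This is a minimal model and not the Doris implementation: `DataDir`, `Tablet`, and `TabletManager` here are simplified stand-ins, and `GcAction`, `gc_tablet_path`, and `gc_rowset_path` are hypothetical names invented for illustration. It also assumes that rowset GC skips a path when no running tablet is found, which the diff context suggests but does not show in full.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>

// Simplified stand-ins for the real Doris classes (assumptions, not the real API).
struct DataDir {
    std::string path;
};

struct Tablet {
    int64_t id;
    const DataDir* dir;                          // data dir the running tablet lives in
    std::unordered_set<std::string> rowset_ids;  // rowsets the tablet still references
};

class TabletManager {
public:
    void add(std::shared_ptr<Tablet> t) {
        const int64_t id = t->id;
        _running[id] = std::move(t);
    }

    // Returns the running tablet for this id, or nullptr if none exists.
    std::shared_ptr<Tablet> get_tablet(int64_t id) const {
        auto it = _running.find(id);
        if (it == _running.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    std::unordered_map<int64_t, std::shared_ptr<Tablet>> _running;
};

// Hypothetical outcome labels for a single GC decision.
enum class GcAction { Keep, Skip, DeleteRowset, TryDeleteTabletPath };

// Tablet-level GC: a path whose running tablet is missing, or lives in another
// data dir, is handed to the unused-tablet cleanup instead of being kept.
GcAction gc_tablet_path(const TabletManager& mgr, const DataDir* self, int64_t tablet_id) {
    auto tablet = mgr.get_tablet(tablet_id);
    if (!tablet || tablet->dir != self) {
        return GcAction::TryDeleteTabletPath;
    }
    return GcAction::Keep;
}

// Rowset-level GC: never judge rowsets against a tablet from another data dir.
GcAction gc_rowset_path(const TabletManager& mgr, const DataDir* self, int64_t tablet_id,
                        const std::string& rowset_id) {
    auto tablet = mgr.get_tablet(tablet_id);
    if (!tablet) {
        return GcAction::Skip;  // assumption: left to tablet-level GC
    }
    if (tablet->dir != self) {
        return GcAction::Skip;  // old tablet after migration, reclaimed as a whole later
    }
    if (tablet->rowset_ids.count(rowset_id) > 0) {
        return GcAction::Keep;
    }
    return GcAction::DeleteRowset;
}
```

With a running tablet registered in the new data dir, a stale rowset path found under the old data dir yields `Skip` rather than `DeleteRowset`, so queries that still hold the rowset are unaffected until the old tablet path is reclaimed as a unit.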

