amorynan commented on code in PR #53083:
URL: https://github.com/apache/doris/pull/53083#discussion_r2209600872


##########
be/src/vec/common/schema_util.cpp:
##########
@@ -292,23 +328,33 @@ void update_least_schema_internal(const std::map<PathInData, DataTypes>& subcolu
             path_set->insert(tuple_paths[i]);
         }
     }
+    return Status::OK();
 }
 
-void update_least_common_schema(const std::vector<TabletSchemaSPtr>& schemas,
-                                TabletSchemaSPtr& common_schema, int32_t variant_col_unique_id,
-                                std::set<PathInData>* path_set) {
+Status update_least_common_schema(const std::vector<TabletSchemaSPtr>& schemas,
+                                  TabletSchemaSPtr& common_schema, int32_t variant_col_unique_id,
+                                  std::set<PathInData>* path_set) {
     // Types of subcolumns by path from all tuples.
     std::map<PathInData, DataTypes> subcolumns_types;
+
+    // Collect all paths first to enable batch checking
+    std::vector<PathInData> all_paths;
+
     for (const TabletSchemaSPtr& schema : schemas) {
         for (const TabletColumnPtr& col : schema->columns()) {
             // Get subcolumns of this variant
             if (col->has_path_info() && col->parent_unique_id() > 0 &&
                 col->parent_unique_id() == variant_col_unique_id) {
                 subcolumns_types[*col->path_info_ptr()].push_back(
                         DataTypeFactory::instance().create_data_type(*col, col->is_nullable()));
+                all_paths.push_back(*col->path_info_ptr());

Review Comment:
   I run the check within groups of the same path, which reduces the time
   complexity to O(n + m^2) and is more reliable than hashing. By the way,
   SipHash is quite slow.
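
   The grouping idea can be illustrated with a minimal, standalone sketch. It is
   not the code from this PR. It assumes that n is the total number of collected
   paths and m is the size of one group, which the comment does not spell out.
   `Path`, `group_key`, `conflicts`, and `find_conflicts` are hypothetical names,
   plain dotted strings stand in for `PathInData`, and the conflict rule (one
   path being a strict ancestor of another) is only a placeholder for whatever
   check the PR actually performs.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for PathInData: a dotted path such as "a.b.c".
using Path = std::string;

// Hypothetical grouping key: the first component of the path.
// If the path has no '.', the whole path is the key.
std::string group_key(const Path& path) {
    return path.substr(0, path.find('.'));
}

// Placeholder rule: `shorter` is a strict ancestor of `longer`,
// e.g. "a.b" is an ancestor of "a.b.c" but not of "a.bc".
bool is_ancestor(const Path& shorter, const Path& longer) {
    return longer.size() > shorter.size() &&
           longer.compare(0, shorter.size(), shorter) == 0 &&
           longer[shorter.size()] == '.';
}

bool conflicts(const Path& a, const Path& b) {
    return is_ancestor(a, b) || is_ancestor(b, a);
}

// Bucket all paths in one pass, then compare pairs only inside each bucket.
// Paths are compared directly, so no hash collision can hide a conflict.
std::vector<std::pair<Path, Path>> find_conflicts(const std::vector<Path>& all_paths) {
    std::map<std::string, std::vector<Path>> groups;
    for (const Path& path : all_paths) {
        groups[group_key(path)].push_back(path);
    }

    std::vector<std::pair<Path, Path>> result;
    for (const auto& entry : groups) {
        const std::vector<Path>& group = entry.second;
        assert(!group.empty()); // every bucket was created by a push_back
        for (std::size_t i = 0; i < group.size(); ++i) {
            for (std::size_t j = i + 1; j < group.size(); ++j) {
                if (conflicts(group[i], group[j])) {
                    result.emplace_back(group[i], group[j]);
                }
            }
        }
    }
    return result;
}
```

   The pairwise loop only ever sees paths from the same bucket, so its cost
   depends on group size rather than on the total number of paths. The ordered
   `std::map` used here adds a logarithmic factor to the grouping pass. The
   sketch uses it only to avoid hashing altogether.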



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]