csun5285 commented on code in PR #53083:
URL: https://github.com/apache/doris/pull/53083#discussion_r2209173764
##########
be/src/vec/common/schema_util.cpp:
##########
@@ -292,23 +328,33 @@ void update_least_schema_internal(const std::map<PathInData, DataTypes>& subcolu
path_set->insert(tuple_paths[i]);
}
}
+ return Status::OK();
}
-void update_least_common_schema(const std::vector<TabletSchemaSPtr>& schemas,
- TabletSchemaSPtr& common_schema, int32_t variant_col_unique_id,
- std::set<PathInData>* path_set) {
+Status update_least_common_schema(const std::vector<TabletSchemaSPtr>& schemas,
+ TabletSchemaSPtr& common_schema, int32_t variant_col_unique_id,
+ std::set<PathInData>* path_set) {
// Types of subcolumns by path from all tuples.
std::map<PathInData, DataTypes> subcolumns_types;
+
+ // Collect all paths first to enable batch checking
+ std::vector<PathInData> all_paths;
+
for (const TabletSchemaSPtr& schema : schemas) {
for (const TabletColumnPtr& col : schema->columns()) {
// Get subcolumns of this variant
if (col->has_path_info() && col->parent_unique_id() > 0 &&
col->parent_unique_id() == variant_col_unique_id) {
subcolumns_types[*col->path_info_ptr()].push_back(
DataTypeFactory::instance().create_data_type(*col, col->is_nullable()));
+ all_paths.push_back(*col->path_info_ptr());
Review Comment:
Isn't this a bit slow? Consider using a std::unordered_map<string, UInt128>
that maps path.get_path() to path.get_parts_hash(). Identical paths should
produce the same hash; if the hashes differ, then is_nested or
anonymous_array_level differs.
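
A minimal sketch of the suggested check, assuming it only needs the path string and the parts hash. `Hash128`, `PathEntry`, and `find_conflicting_path` are hypothetical names made up for illustration; they stand in for `UInt128`, `PathInData`, and whatever helper the PR ends up with, and are not the actual Doris types or API.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for UInt128: two 64-bit halves compared for equality.
struct Hash128 {
    uint64_t high = 0;
    uint64_t low = 0;
    bool operator==(const Hash128& other) const {
        return high == other.high && low == other.low;
    }
};

// Hypothetical stand-in for PathInData, holding only what the check needs.
struct PathEntry {
    std::string path;    // assumed to be what path.get_path() returns
    Hash128 parts_hash;  // assumed to be what path.get_parts_hash() returns
};

// Returns the first path string that appears with two different parts hashes,
// or std::nullopt if every path string maps to exactly one hash.
// Each entry costs one hash-map lookup, so the whole pass is O(n) on average.
std::optional<std::string> find_conflicting_path(const std::vector<PathEntry>& entries) {
    std::unordered_map<std::string, Hash128> seen;
    seen.reserve(entries.size());
    for (const PathEntry& entry : entries) {
        // try_emplace leaves the stored hash untouched if the key already exists.
        auto [it, inserted] = seen.try_emplace(entry.path, entry.parts_hash);
        if (!inserted && !(it->second == entry.parts_hash)) {
            return entry.path;  // same string, different hash
        }
    }
    return std::nullopt;
}
```

With this shape there is no need to collect `all_paths` into a vector and compare afterwards: the map can be filled inside the existing loop over `schema->columns()`, and a mismatch can be reported as soon as it is seen.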
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]