Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/19838 )
Change subject: IMPALA-11534: Skip reloading file metadata for some ALTER_TABLE events ...................................................................... Patch Set 3: (6 comments) Thanks for the changes, looks good! Few more comments. http://gerrit.cloudera.org:8080/#/c/19838/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19838/3//COMMIT_MSG@17 PS3, Line 17: If any of these are changed, the file metadata reloading : can be skipped. What happens with the table property that was changed? While the table metadata is not reloaded, we still update the given properties, right? http://gerrit.cloudera.org:8080/#/c/19838/2/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/19838/2/be/src/catalog/catalog-server.cc@136 PS2, Line 136: "to whitelist the table properties that are supposed t > Actually both the compaction-related properties are not required, since hiv Shouldn't we keep transactional and add transactional_properties? I think that adding properties that rarely change is ok, even if it is not needed actually. The critical part is to avoid reload on properties that can change often, e.g. on every insert. http://gerrit.cloudera.org:8080/#/c/19838/2/be/src/catalog/catalog-server.cc@136 PS2, Line 136: "to whitelist the table properties that are supposed to refresh file metadata" : "when these properties are changed."); > I see that there are some query options that can be modified at run time. B I didn't mean to use a query option here. The idea is to have a special table property name that is not used for anything else than force reloading a table. This could act as a "safety valve" for users to cause reloading a table (without altering its semantics) even with default flags. http://gerrit.cloudera.org:8080/#/c/19838/3/common/thrift/BackendGflags.thrift File common/thrift/BackendGflags.thrift: http://gerrit.cloudera.org:8080/#/c/19838/3/common/thrift/BackendGflags.thrift@257 PS3, Line 257: nit: extra line http://gerrit.cloudera.org:8080/#/c/19838/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/19838/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3900 PS2, Line 3900: return false; > If the whitelisted config is empty, we'll never skip file reload. The reaso I see, agree with handling it like this. Can you add a comment about it here and also mention this in the flag's description in catalog-server.cc ? http://gerrit.cloudera.org:8080/#/c/19838/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3904 PS2, Line 3904: if (isFieldSchemaChanged(beforeTable, afterTable) || : !beforeTable.getOwner().equals(afterTable.getOwner()) || : !isCustomTblPropsChanged(beforeTable, afterTable)) { : return true; : } > if any of the conditions are true then we can skip reloading file metadata. So if one alter table event changes the field schema or the owner, then it is guaranteed that the table properties can not change in this event? If this is the case, then the first two checks seem redundant to me, as isCustomTblPropsChanged would also return false in this case because no table properties were changed at all. -- To view, visit http://gerrit.cloudera.org:8080/19838 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia66b96a7c4b7f50fbf46b2e02296cd29a47347b6 Gerrit-Change-Number: 19838 Gerrit-PatchSet: 3 Gerrit-Owner: Sai Hemanth Gantasala <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]> Gerrit-Comment-Date: Fri, 19 May 2023 08:05:38 +0000 Gerrit-HasComments: Yes
