Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14307 )
Change subject: IMPALA-7506: support global INVALIDATE METADATA in local catalog mode ...................................................................... Patch Set 8: (8 comments) Thanks for your comments! Sorry for uploading the buggy patch set 8. I finally revert the changes for IMPALA-9062. > Is it possible to use this solution for catalog v1 as well, and remove the > CatalogObjectVersionSet thing? I think that thing is buggy/complex, maybe > would be nice to just have one implementation to worry about. (ok to do that > in a follow-up if it's not trivial, or if you think it's a risky change, > maybe we shouldn't do it at all) I think it needs a careful design to remove CatalogObjectVersionSet in catalog-v1. The reason we introduced it is step-by-step: 1) We don't want topic update blocks concurrent DDL/DMLs (IMPALA-5058). So when gathering topic update, we read current catalog version (as toVersion) and free the version lock to let concurrent DDL/DMLs acquire it. 2) Some rapidly changed tables are skipped in topic update since their catalog version exceeds toVersion. Note that if a table is skipped due to version too large in 2 times (controlled by MAX_NUM_SKIPPED_TOPIC_UPDATES), we'll force it to be collected once. 3) Coordinators still have stale cache for these tables until receiving them in topic update. CatalogObjectVersionSet is used to know whether there are stale cached tables. If so, "invalidate metadata" still needs to wait. It might be feasible that we simply reset the cache in v1 when receiving lastResetStartVersion changed, i.e. all tables come back to IncompleteTable and associate them with the version as lastResetStartVersion. But we are lack of test coverage on concurrent DDL/DMLs to support such a big change. I plan to add enough test coverage before uploading a new patch. Thanks again for your comments! http://gerrit.cloudera.org:8080/#/c/14307/8/be/src/util/impalad-metrics.h File be/src/util/impalad-metrics.h: http://gerrit.cloudera.org:8080/#/c/14307/8/be/src/util/impalad-metrics.h@119 PS8, Line 119: /// Minimal catalog object version in local catalog cache of Coordinator. > this isn't actually true -- it's no longer the minimum, but rather a lower Yeah, done. http://gerrit.cloudera.org:8080/#/c/14307/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/14307/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@604 PS8, Line 604: lastResetStartVersion > nit, I think its better to make this variable as final (same for fromVersio Good point! http://gerrit.cloudera.org:8080/#/c/14307/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@763 PS8, Line 763: catalog.setCatalog(new TCatalog(catalogServiceId_, ctx.lastResetStartVersion)); > Just a thought to further optimize this. If we know that the lastResetStart Yeah, good point! I'd like to have a try. http://gerrit.cloudera.org:8080/#/c/14307/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1135 PS8, Line 1135: catalogTbl.setTable(new TTable(tbl.db_.getName(), tbl.name_)); : catalogTbl.setCatalog_version(tbl.getCatalogVersion()); > I think its really good that we are not serializing the whole table object Yes, these changes do introduce bugs. The GVO failure is due to I moved the checking of "tbl.getCatalogVersion() <= ctx.fromVersion" before acquiring the table lock... I finally closed IMPALA-9062 since it's wrong. Though we just need the catalog version in catalog-v2, we still need to acquire the table lock. It's possible that a concurrent thread is holding the table lock and bumping its version into the (fromVersion, toVersion] range. E.g. in AlterTable, we increase and get a new catalog version before modifying the table. Table version is only bumped when operation succeeds. The table lock is held during the operation. So without waiting on the table lock, the table version we see may lower than fromVersion but is going to bump into the range. We will end up missing this table forever. http://gerrit.cloudera.org:8080/#/c/14307/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1138 PS8, Line 1138: return > Do we need to add this to topicUpdateEntry before returning? Otherwise, any Updating topicUpdateLog_ is done inside ctx.addCatalogObject(). However, I'm going to remove this if-clause since we do need the table lock... http://gerrit.cloudera.org:8080/#/c/14307/8/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1587 PS8, Line 1587: lastResetStartVersion_ = startVersion; > What if there are two concurrent calls to INVALIDATE METADATA? is it possib Nice find! We should avoid lastResetStartVersion_ going downward. I'm thinking adding more test coverage on concurrent DDL/DMLs to avoid bugs like this. http://gerrit.cloudera.org:8080/#/c/14307/8/tests/authorization/test_grant_revoke.py File tests/authorization/test_grant_revoke.py: http://gerrit.cloudera.org:8080/#/c/14307/8/tests/authorization/test_grant_revoke.py@364 PS8, Line 364: self.client.execute("grant role {0} to group foobar".format(role_name)) > why were these changes necessary? I found the bug in line 372 and 375 so want to fix them by the way since this test also uses "invalidate metadata"... For these two lines, they are needed otherwise user "foobar" and "FOOBAR" don't have privilege to create database below. http://gerrit.cloudera.org:8080/#/c/14307/8/tests/authorization/test_ranger.py File tests/authorization/test_ranger.py: http://gerrit.cloudera.org:8080/#/c/14307/8/tests/authorization/test_ranger.py@a60 PS8, Line 60: > so we don't need --use_local_catalog passed to catalogd anymore? Yes, catalogd only cares the catalog_topic_mode. -- To view, visit http://gerrit.cloudera.org:8080/14307 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib61a7ab1ffa062620ffbc2dadc34bd7a8ca9e549 Gerrit-Change-Number: 14307 Gerrit-PatchSet: 8 Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Bharath Vissapragada <bhara...@apache.org> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com> Gerrit-Comment-Date: Tue, 29 Oct 2019 13:36:32 +0000 Gerrit-HasComments: Yes