Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16549 )
Change subject: IMPALA-6671: Skip locked tables from topic updates ...................................................................... Patch Set 7: (9 comments) http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1304 PS6, Line 1304: ce we don't have the table lock yet and the table could be changed during the : // execution below. > Maybe use a local variable - canSkipUpdate? The relationship between this c Done. Added changed the wording of the comment which hopefully makes it easier to understand. http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1339 PS6, Line 1339: the tabl > nit: hdfsTable Done http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1349 PS6, Line 1349: * @param ctx GetCatalogDeltaContext where the table is add if the lock is acquired. > Just to make sure I understand correctly, in the case when Thanks for the question, while explaining it you I found a bug with logic and updated the the patch. The main idea is here as follows: the hdfstable.getLastVersionSeenByTopicUpdate() returns the version of the table when topic update thread last inspected the table to evaluate if it needs to go in a topic update. If this is not the same as the table version, it means we have skipped table due to lock contention before and the table was updated before we came here again to add to the topic update. The reason we need to do this is to distinguish the case when a table is locked for long time and topic update thread visited it more than once during this time v/s topic thread thread visited the table and the table happened to be locked again by a different CatalogOperation. Without doing this we will hit the limit maxSkippedUpdatesLockContention too soon and the patch will not really be of much help. For example: Case 1: Time t1: table is locked for a refresh Time t2: topic update tried to add the table but skipped it due to lock contention. It increments the counter by 1 and sets this tblVersion in setLastVersionSeenByTopicUpdate() Time t3: topic update tried again. This time though, the table version has not changed yet since refresh is not complete. It skips the table without incrementing the counter. Time t4: topic update tries again. Same as above. Time t5: topic update tries again. Same as above. Time t6: refresh completes. Time t7: topic update tries now. lock is acquired and it adds the table to the update. Case 2: Time t1: table is locked for a refresh Time t2: topic update tried to add the table but skipped it due to lock contention. It increments the counter by 1 and sets this tblVersion in setLastVersionSeenByTopicUpdate() Time t3: topic update tried again. This time though, the table version has not changed yet since refresh is not complete. It skips the table without incrementing the counter. Time t4: topic update tries again. Same as above. Time t5: topic update tries again. Same as above. Time t6: refresh completes. Time t7: a second refresh is issued which gets the table lock. Time t8: topic update tries now. the table is locked and getLastVersionSeenByTopicUpdate() will not be same as tblVersion now and hence we increment the counter by 1 again. Time t9: refresh completes. Time t10: third refresh is issued which gets the table lock. time t11: topic update tries now. Now we detect the table has been skipped twice before and we block until we acquire the lock. Hope this helps. http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1394 PS6, Line 1394: } else { > I'm not currently quite grokking the need to have all the bits of this sync The main objective here is that if the topic update thread decides to skip the table, it must also make sure that a concurrent setCatalogVersion() on this table should set the version outside the current topic update window. Otherwise, if there is a race between the two, it is possible that a table modification gets leaked from the topic updates. I will try to clean this up more. If you have any suggestions let me know as well. I don't think we really need to update the pending version in CatalogOpExecutor but let me think some more about it. http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1397 PS6, Line 1397: > nit: at most Done http://gerrit.cloudera.org:8080/#/c/16549/6/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1398 PS6, Line 1398: > nit: extra e Done http://gerrit.cloudera.org:8080/#/c/16549/4/tests/metadata/test_topic_update_frequency.py File tests/metadata/test_topic_update_frequency.py: http://gerrit.cloudera.org:8080/#/c/16549/4/tests/metadata/test_topic_update_frequency.py@15 PS4, Line 15: import time > flake8: F401 'threading' imported but unused Done http://gerrit.cloudera.org:8080/#/c/16549/4/tests/metadata/test_topic_update_frequency.py@18 PS4, Line 18: > flake8: F401 'time.sleep' imported but unused Done http://gerrit.cloudera.org:8080/#/c/16549/4/tests/metadata/test_topic_update_frequency.py@20 PS4, Line 20: class TestTopicUpdateFrequency(ImpalaTestSuite): > flake8: F401 'tests.beeswax.impala_beeswax.ImpalaBeeswaxException' imported Done -- To view, visit http://gerrit.cloudera.org:8080/16549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic657b96edbcdc94c6b906e7ca59291f4e4715655 Gerrit-Change-Number: 16549 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Shant Hovsepian <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]> Gerrit-Comment-Date: Fri, 23 Oct 2020 21:58:31 +0000 Gerrit-HasComments: Yes
