Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14799 )
Change subject: IMPALA-9101: Add support for detecting self-events on partition events ...................................................................... Patch Set 2: (33 comments) Did a first round review. Most of the comments are nits :) Still need some time to digest this patch deeper. Thanks! http://gerrit.cloudera.org:8080/#/c/14799/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14799/2//COMMIT_MSG@9 PS2, Line 9: This commit redoes some of the self-event detection logic, specifically for the partition Please wrap the lines to fix the 72 character limit (to be consistent with other commits). There's an interesting blog about explaining the 72 character limit: https://preslav.me/2015/02/21/what-s-with-the-50-72-rule/ http://gerrit.cloudera.org:8080/#/c/14799/2//COMMIT_MSG@42 PS2, Line 42: [TODO] Is it done? http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@52 PS2, Line 52: import org.apache.impala.catalog.events.SelfEventContext; nit: keep the import list sorted http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@397 PS2, Line 397: .equals(((MetastoreEventsProcessor) metastoreEventProcessor_).getStatus()); Looks like we check ACTIVE in many places. It may worth to add a function isActive() in MetastoreEventsProcessor. http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@817 PS2, Line 817: a nit: an http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@831 PS2, Line 831: versionNumber == -1 Is this safe enough or should be something like "versionNumber < 0" ? http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@868 PS2, Line 868: a nit: an http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@872 PS2, Line 872: LOG.warn(String.format("Partition %s not found during self-event " I think we need to log the table full name too since partName is just something like "year=2019/month=1" http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2617 PS2, Line 2617: } else { : LOG.info(String.format("Partition metadata for %s was not refreshed since " : + "it does not exist in metastore anymore", : hdfsTable.getFullName() + " " + partitionName)); In this case, I think we can return hdfsTable.toMinimalTCatalogObject() since the whole Thrift object won't be used. I left my reason here: https://gerrit.cloudera.org/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#3983 http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@623 PS2, Line 623: 20 Just be interested, is there a reason we choose 20? http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java File fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java: http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java@44 PS2, Line 44: private static final Logger LOG = LoggerFactory.getLogger(InFlightEvents.class); nit: move it above to be together with "private static final int DEFAULT_MAX_NUMBER_OF_INFLIGHT_EVENTS" ? http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@567 PS2, Line 567: incomplete change to unloaded? http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@577 PS2, Line 577: T nit: T => t http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1379 PS2, Line 1379: //TODO refresh all the partition together instead of looping one by one Could you create a JIRA for this follow-up? http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1532 PS2, Line 1532: , nit: need space http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java File fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java: http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/catalog/events/SelfEventContext.java@48 PS2, Line 48: parameters Could you add a comment about this variable? Looks like it could be tblproperties of a table or something similar for a db. http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@37 PS2, Line 37: import javax.annotation.Nullable; nit: move it one line above to be together with the "java" packages? http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3525 PS2, Line 3525: a nit: an http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3982 PS2, Line 3982: // TODO is the partition was not really refresed because the partSpec typos: is => if, refresed => refreshed http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3983 PS2, Line 3983: // was wrong, do we still need to send back the table? I think we won't send back the table to coordinators in this case since the catalog version is unchanged (the table is not modified). Then it won't be collected in topic update since its version is not in range. However, we can do a minor optimization in reloadPartition() that if the partSpec references a partition that is non-exists in both HMS and catalog, we can return hdfsTable.toMinimalTCatalogObject() instead of the whole hdfsTable.toTCatalogObject(). http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4182 PS2, Line 4182: nit: empty line http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java File fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java: http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@858 PS2, Line 858: numberOfInvalidatesBefore need rename to numberOfRefreshesBefore http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@927 PS2, Line 927: an nit: a http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2204 PS2, Line 2204: Assert. nit: don't need the class name since this static function is imported. http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2226 PS2, Line 2226: Assert. nit: don't need class name http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2230 PS2, Line 2230: Assert. nit: don't need class name http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2233 PS2, Line 2233: Assert. nit: don't need class name http://gerrit.cloudera.org:8080/#/c/14799/2/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2243 PS2, Line 2243: unfinished? http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py File tests/custom_cluster/test_event_processing.py: http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@17 PS2, Line 17: import random : import string nit: be together with the other imports and leave one empty line after the file comments? http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@94 PS2, Line 94: nit: should be two spaces indent inside this block. http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@215 PS2, Line 215: time.sleep(build_flavor_timeout(2, slow_build_timeout=4)) Can do the same change as EventProcessorUtils.wait_for_event_processing() that waits for 'catalog.curr-version' instead. http://gerrit.cloudera.org:8080/#/c/14799/2/tests/custom_cluster/test_event_processing.py@302 PS2, Line 302: db_name, tbl_name) Also add coverage for "alter table xxx recover partitions" statement here? http://gerrit.cloudera.org:8080/#/c/14799/2/tests/util/event_processor_utils.py File tests/util/event_processor_utils.py: http://gerrit.cloudera.org:8080/#/c/14799/2/tests/util/event_processor_utils.py@21 PS2, Line 21: import logging nit: be together with the other imports and leave one empty line after the file comments? -- To view, visit http://gerrit.cloudera.org:8080/14799 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c Gerrit-Change-Number: 14799 Gerrit-PatchSet: 2 Gerrit-Owner: Vihang Karajgaonkar <[email protected]> Gerrit-Reviewer: Anurag Mantripragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Mon, 16 Dec 2019 07:01:01 +0000 Gerrit-HasComments: Yes
