Hello Quanlong Huang, Riza Suminto, Sai Hemanth Gantasala, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/22997

to look at the new patch set (#22).

Change subject: IMPALA-13801: Support greatest synced event with hierarchical metastore event processing
......................................................................

IMPALA-13801: Support greatest synced event with hierarchical metastore event processing

This is a follow-up jira/commit to IMPALA-12709. IMPALA-12152 and
IMPALA-12785 are affected when the hierarchical metastore event
processing feature is enabled.

The following changes are incorporated in this patch:

1. Added creationTime_ and dispatchTime_ fields to the MetastoreEvent
   class to store the current time in milliseconds. They are used to
   calculate:
   a) Event dispatch time (time between the creation of a
      MetastoreEvent object and the moment the event is moved to
      inProgressLog_ of EventExecutorService after it is dispatched to
      a DbEventExecutor).
   b) Event schedule delays incurred at DbEventExecutors and
      TableEventExecutors (time between the event being moved to
      EventExecutorService's inProgressLog_ and the start of its
      processing at the appropriate DbEventExecutor and
      TableEventExecutor).
   c) Event process time from the EventExecutorService point of view
      (time spent in inProgressLog_ before the event is moved to
      processedLog_).
   Logs are added to show the event dispatch time, schedule delays and
   process time from the EventExecutorService point of view for each
   event. A log is also added to show the time taken by the event's
   processIfEnabled().

2. Added isDelimiter_ field to the MetastoreEvent class to indicate
   whether the event is a delimiter event. It is set only when
   hierarchical event processing is enabled. A delimiter is a kind of
   metastore event that does not require event processing. A delimiter
   event can be:
   a) A CommitTxnEvent that does not have any write event info for a
      given transaction.
   b) An AbortTxnEvent that does not have write ids for a given
      transaction.
   c) An IgnoredEvent.
   An event is determined to be a delimiter and marked as such in
   EventExecutorService#dispatch(). Delimiter events are not queued to
   a DbEventExecutor for processing. They are only maintained in
   inProgressLog_ to preserve continuity and correctness in
   synchronization tracking. Delimiter events are removed from
   inProgressLog_ when their preceding non-delimiter metastore event is
   removed from inProgressLog_.

3. Greatest synced event id is computed from the dispatched events
   (inProgressLog_) and processed events (processedLog_) tree maps.
   The greatest synced event is the latest event such that all events
   with an id less than or equal to its id are definitely synced.

4. Lag is calculated as the difference between the latest event time
   on HMS and the greatest synced event time. It is shown in the log.

5. Greatest synced event id is used in the IMPALA-12152 changes. When
   the greatest synced event id becomes greater than or equal to
   waitForEventId, all the required events are definitely synced.

6. The event processor is paused gracefully when paused with the
   command from IMPALA-12785. This ensures that all the events fetched
   from HMS in the current batch are processed before the event
   processor is fully paused. It is necessary to process the current
   batch because certain events, such as AllocWriteIdEvent,
   AbortTxnEvent and CommitTxnEvent, update table write ids in the
   catalog upon metastore event object creation, and the table write
   ids are later updated on the appropriate table object when the
   events are processed. Pausing abruptly in the middle of the current
   batch can therefore leave the write ids on table objects in an
   inconsistent state.

7. Added greatest synced event id and event time to the events
   processor metrics, and updated the descriptions of the lag, pending
   events, last synced event id and event time metrics.

8. Atomically update the event queue and increment the outstanding
   event count in the enqueue methods of both DbProcessor and
   TableProcessor, so that the respective process methods do not
   process the event until it has been added to the queue and the
   outstanding event count has been incremented. Otherwise, the event
   can get processed and the outstanding event count decremented
   before the count is incremented in the enqueue method.

9. Refactored the DbEventExecutor, DbProcessor, TableEventExecutor and
   TableProcessor classes to propagate the exception that occurred
   during event processing along with the event. EventProcessException
   is a wrapper added to hold a reference to the event being processed
   and the exception that occurred.

10. Added AcidTableWriteInfo helper class to store the table, write
    ids and partitions for the transaction id received in
    CommitTxnEvent.

Testing:
- Added new tests and executed the existing end-to-end tests.
- Executed the existing tests with hierarchical event processing
  enabled.

Change-Id: I26240f36aaf85125428dc39a66a2a1e4d3197e85
---
M be/src/util/event-metrics.cc
M be/src/util/event-metrics.h
M common/thrift/JniCatalog.thrift
M common/thrift/metrics.json
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/DbBarrierEvent.java
M fe/src/main/java/org/apache/impala/catalog/events/DbEventExecutor.java
M fe/src/main/java/org/apache/impala/catalog/events/EventExecutorService.java
A fe/src/main/java/org/apache/impala/catalog/events/EventProcessException.java
M fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/RenameTableBarrierEvent.java
M fe/src/main/java/org/apache/impala/catalog/events/TableEventExecutor.java
M fe/src/test/java/org/apache/impala/catalog/events/EventExecutorServiceTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M tests/util/event_processor_utils.py
22 files changed, 1,404 insertions(+), 452 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/22997/22
--
To view, visit http://gerrit.cloudera.org:8080/22997
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I26240f36aaf85125428dc39a66a2a1e4d3197e85
Gerrit-Change-Number: 22997
Gerrit-PatchSet: 22
Gerrit-Owner: Anonymous Coward <k.venureddy2...@gmail.com>
Gerrit-Reviewer: Anonymous Coward <k.venureddy2...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com>
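
For readers of items 2, 3 and 5 of the commit message, the following is a minimal sketch of how a greatest synced event id could be derived from two tree maps. It is not the code in EventExecutorService. The class SyncedEventTracker and its methods dispatch(), markProcessed() and isSynced() are hypothetical names, and treating a delimiter as synced immediately when nothing is in progress is an assumption of this sketch.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; not the actual EventExecutorService code. */
final class SyncedEventTracker {
  // Dispatched events that are not yet done: event id -> isDelimiter.
  private final TreeMap<Long, Boolean> inProgressLog = new TreeMap<>();
  // Events that are done: event id -> isDelimiter.
  // A real implementation would also need to trim this map; omitted here.
  private final TreeMap<Long, Boolean> processedLog = new TreeMap<>();
  private long greatestSyncedEventId;

  SyncedEventTracker(long lastSyncedEventId) {
    greatestSyncedEventId = lastSyncedEventId;
  }

  /** Records a dispatched event. Delimiters are never queued for processing. */
  synchronized void dispatch(long eventId, boolean isDelimiter) {
    if (isDelimiter && inProgressLog.isEmpty()) {
      // Assumption: with nothing pending, a delimiter counts as synced right away.
      processedLog.put(eventId, true);
    } else {
      inProgressLog.put(eventId, isDelimiter);
    }
    refresh();
  }

  /** Marks a non-delimiter event as processed. */
  synchronized void markProcessed(long eventId) {
    Boolean isDelimiter = inProgressLog.remove(eventId);
    if (isDelimiter == null) return; // unknown or already completed
    processedLog.put(eventId, isDelimiter);
    // Release the delimiters that directly follow the removed event.
    Iterator<Map.Entry<Long, Boolean>> it =
        inProgressLog.tailMap(eventId, false).entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<Long, Boolean> entry = it.next();
      if (!entry.getValue()) break; // next non-delimiter event: stop
      processedLog.put(entry.getKey(), true);
      it.remove();
    }
    refresh();
  }

  /** Advances the greatest synced id to the last id with no pending predecessor. */
  private void refresh() {
    Long candidate;
    if (inProgressLog.isEmpty()) {
      candidate = processedLog.isEmpty() ? null : processedLog.lastKey();
    } else {
      // Only events below the oldest in-progress event are definitely synced.
      candidate = processedLog.lowerKey(inProgressLog.firstKey());
    }
    if (candidate != null && candidate > greatestSyncedEventId) {
      greatestSyncedEventId = candidate;
    }
  }

  synchronized long greatestSyncedEventId() {
    return greatestSyncedEventId;
  }

  /** Item 5: all required events are synced once this returns true. */
  synchronized boolean isSynced(long waitForEventId) {
    return greatestSyncedEventId >= waitForEventId;
  }
}
```

In this sketch, processing event 102 while event 101 is still in progress does not advance the greatest synced id, because an older event is still pending. Once 101 completes, the id jumps past both events and any delimiters released along the way.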
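
The race described in item 8 can be illustrated with a small sketch. It assumes a single lock guarding both the queue and the counter, which is only one possible way to make the two updates atomic; the patch may do it differently. EventQueueSketch, enqueue(), processNext() and outstandingEventCount() are hypothetical names, not the DbProcessor or TableProcessor API.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical sketch of an enqueue that updates queue and counter together. */
final class EventQueueSketch<E> {
  private final Object lock = new Object();
  private final Queue<E> events = new ArrayDeque<>();
  private long outstandingEventCount;

  /** Adds the event and bumps the counter as one atomic step. */
  void enqueue(E event) {
    synchronized (lock) {
      events.add(event);
      outstandingEventCount++;
    }
  }

  /**
   * Processes the next event, if any. Because polling takes the same lock as
   * enqueue(), an event can only be seen here after its count was incremented,
   * so the decrement below can never run before the matching increment.
   */
  boolean processNext(Consumer<E> handler) {
    E event;
    synchronized (lock) {
      event = events.poll();
    }
    if (event == null) return false;
    try {
      handler.accept(event);
    } finally {
      synchronized (lock) {
        outstandingEventCount--;
      }
    }
    return true;
  }

  long outstandingEventCount() {
    synchronized (lock) {
      return outstandingEventCount;
    }
  }
}
```

The handler runs outside the lock so that a slow event does not block producers. Only the queue update and the counter update share the critical section.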
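
The three durations listed under item 1 reduce to simple subtractions over the two new timestamps. The sketch below spells them out, assuming all timestamps are in milliseconds from the same clock. EventTimingSketch and its method names are hypothetical and do not mirror the MetastoreEvent fields beyond creationTime_ and dispatchTime_.

```java
/** Hypothetical sketch of the timing arithmetic from item 1. */
final class EventTimingSketch {
  private final long creationTimeMs; // taken when the event object is created
  private long dispatchTimeMs;       // taken when the event enters the in-progress log

  EventTimingSketch(long creationTimeMs) {
    this.creationTimeMs = creationTimeMs;
  }

  void markDispatched(long nowMs) {
    dispatchTimeMs = nowMs;
  }

  /** a) Time from object creation until the event is in the in-progress log. */
  long dispatchDurationMs() {
    return dispatchTimeMs - creationTimeMs;
  }

  /** b) Time from entering the in-progress log until an executor starts processing. */
  long scheduleDelayMs(long processingStartMs) {
    return processingStartMs - dispatchTimeMs;
  }

  /** c) Time spent in the in-progress log before moving to the processed log. */
  long processDurationMs(long movedToProcessedLogMs) {
    return movedToProcessedLogMs - dispatchTimeMs;
  }
}
```

For an event created at 1000 ms, dispatched at 1040 ms, picked up by an executor at 1100 ms and moved to the processed log at 1250 ms, the sketch gives a dispatch time of 40 ms, a schedule delay of 60 ms and a process time of 210 ms.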