Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/23194

to look at the new patch set (#2).

Change subject: IMPALA-14220 (part 2): Delay AcceptRequest until catalog is stable
......................................................................

IMPALA-14220 (part 2): Delay AcceptRequest until catalog is stable

CatalogD availability has improved now that reading is_active_ no
longer requires holding catalog_lock_. However, during a failover
scenario, requests may slip into the CatalogD that is transitioning
from passive to active and obtain stale metadata.

This patch improves the situation in two steps. First, it adds a new
mutex, ha_transition_lock_, that AcceptRequest() must obtain in HA mode.
This mutex protects both CatalogServer::WaitHATransition() and
CatalogServer::UpdateActiveCatalogd(). WaitHATransition() will only exit
and return to AcceptRequest() after the initial metadata reset is
complete or min_catalog_version_to_serve_ is met.

Second, it increments the catalog version by
CATALOG_VERSION_INCREMENT_ON_RESET (100) on every global reset
(Invalidate Metadata). CatalogServer::MarkPendingMetadataReset() matches
this logic by incrementing min_catalog_version_to_serve_ before setting
the triggered_first_reset_ flag to false, which in turn wakes up the
TriggerResetMetadata thread. AcceptRequest() will delay incoming
requests until the catalog version is greater than or equal to
min_catalog_version_to_serve_.

Rename WaitForCatalogReady() to
WaitCatalogReadinessForWorkloadManagement(), since this wait mechanism
is specific to Workload Management initialization and has stricter
requirements.

Change is_active_ and triggered_first_reset_ to volatile boolean. Also
inline CatalogServer::IsActive().

Testing:
Added test_metadata_after_failover_with_delayed_reset and
test_metadata_after_failover_with_hms_sync.

Change-Id: I370d21319335318e441ec3c3455bac4227803900
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalogd-main.cc
M be/src/catalog/workload-management-init.cc
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M tests/custom_cluster/test_catalogd_ha.py
7 files changed, 150 insertions(+), 62 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/23194/2
--
To view, visit http://gerrit.cloudera.org:8080/23194
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I370d21319335318e441ec3c3455bac4227803900
Gerrit-Change-Number: 23194
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
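
For readers skimming the commit message above, the version-gating idea (hold requests until the catalog version reaches min_catalog_version_to_serve_) might be sketched roughly as follows. This is only an illustration and not the code in the patch: the class CatalogGate, its method names, the single mutex, the condition variable, and the timeout parameter are all hypothetical. Only the increment of 100 and the "version >= minimum" condition come from the description.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the gating described in the commit message.
// It is NOT Impala's CatalogServer; names and locking layout are assumptions.
class CatalogGate {
 public:
  // Mirrors CATALOG_VERSION_INCREMENT_ON_RESET (100) from the description.
  static constexpr int64_t kVersionIncrementOnReset = 100;

  // Becoming active: raise the serving floor before the reset is triggered,
  // so requests arriving in between cannot see pre-reset metadata.
  void MarkPendingReset() {
    std::lock_guard<std::mutex> l(mu_);
    min_version_to_serve_ = version_ + kVersionIncrementOnReset;
  }

  // Global reset finished: bump the version by the same amount and wake waiters.
  void FinishReset() {
    {
      std::lock_guard<std::mutex> l(mu_);
      version_ += kVersionIncrementOnReset;
      assert(version_ >= min_version_to_serve_);
    }
    cv_.notify_all();
  }

  // Request path: block until the version reaches the floor.
  // Returns false if the (assumed) timeout expires first.
  bool WaitUntilServable(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> l(mu_);
    return cv_.wait_for(l, timeout,
                        [this] { return version_ >= min_version_to_serve_; });
  }

  int64_t version() {
    std::lock_guard<std::mutex> l(mu_);
    return version_;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  int64_t version_ = 0;
  int64_t min_version_to_serve_ = 0;
};
```

In this sketch a request that arrives between MarkPendingReset() and FinishReset() waits instead of reading stale state, which is the behavior the two new tests are described as covering. The actual patch uses ha_transition_lock_ and WaitHATransition(); how those are implemented is not visible in this message.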