Hello Quanlong Huang, k.venureddy2...@gmail.com, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/23382

to look at the new patch set (#3).

Change subject: IMPALA-14400: Fix deadlock in 
CatalogServiceCatalog.getDbProperty()
......................................................................

IMPALA-14400: Fix deadlock in CatalogServiceCatalog.getDbProperty()

IMPALA-13850 (part 4) modified CatalogServiceCatalog.getDb() to delay
looking up the catalog cache until the initial reset() is complete.
EventProcessor can start processing events before reset() happens and
obtain versionLock_.readLock() when calling
CatalogServiceCatalog.getDbProperty(). Later on, it hits a deadlock
when attempting to obtain versionLock_.writeLock() through getDb() /
waitInitialResetCompletion(). This lock upgrade from read to write is
unsafe.
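
The upgrade problem can be reproduced in isolation. The sketch below
assumes versionLock_ behaves like a fair
java.util.concurrent.locks.ReentrantReadWriteLock; the class and method
names are hypothetical and are not part of this change. It uses
tryLock() rather than lock() so that the failed upgrade reports false
instead of blocking forever.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical standalone demo; not code from CatalogServiceCatalog.
class LockUpgradeDemo {
  // Returns true if the write lock can be taken while this thread
  // already holds the read lock (a read-to-write upgrade).
  static boolean tryUpgrade() {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    lock.readLock().lock();
    try {
      // A blocking lock() here would never return: the writer waits
      // for all readers to leave, and this thread is one of them.
      boolean upgraded = lock.writeLock().tryLock();
      if (upgraded) lock.writeLock().unlock();
      return upgraded;
    } finally {
      lock.readLock().unlock();
    }
  }

  // Returns true if the read lock can be taken while this thread
  // already holds the write lock (a write-to-read downgrade).
  static boolean tryDowngrade() {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    lock.writeLock().lock();
    try {
      boolean downgraded = lock.readLock().tryLock();
      if (downgraded) lock.readLock().unlock();
      return downgraded;
    } finally {
      lock.writeLock().unlock();
    }
  }

  public static void main(String[] args) {
    System.out.println("upgrade possible: " + tryUpgrade());
    System.out.println("downgrade possible: " + tryDowngrade());
  }
}
```

With ReentrantReadWriteLock, tryUpgrade() returns false and
tryDowngrade() returns true, which matches the documented behavior that
downgrading is supported but upgrading is not.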

This patch mitigates the issue by adding
CatalogServiceCatalog.getDbNoWait(), which does not call
waitInitialResetCompletion(). All calls to getDb() from inside
CatalogServiceCatalog that begin by acquiring versionLock_.readLock()
are replaced with getDbNoWait(). This is OK because nothing except
EventProcessor interacts with the catalog cache concurrently. For
EventProcessor itself, having getDbNoWait() return null is also OK
because reset() will pause EventProcessor and populate the cache soon
after.
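
The split between a waiting and a non-waiting lookup can be pictured
with a toy model. This is only a sketch under stated assumptions, not
the actual CatalogServiceCatalog code: the ToyCatalog class, the
CountDownLatch used to model the initial reset, and the map-based cache
are all hypothetical. In the real code the wait goes through
versionLock_.writeLock(), as described above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Toy model only; names other than getDb, getDbNoWait, getDbProperty,
// reset and versionLock_ are hypothetical.
class ToyCatalog {
  private final ReentrantReadWriteLock versionLock_ =
      new ReentrantReadWriteLock(true);
  // Hypothetical stand-in for "initial reset() is complete".
  private final CountDownLatch initialReset_ = new CountDownLatch(1);
  // Database name -> properties; stands in for the catalog cache.
  private final Map<String, Map<String, String>> dbs_ =
      new ConcurrentHashMap<>();

  // Waits for the initial reset. Callers must not already hold
  // versionLock_.readLock(), otherwise they can block reset().
  Map<String, String> getDb(String name) throws InterruptedException {
    initialReset_.await();
    return dbs_.get(name);
  }

  // Never waits; may return null if the cache is not populated yet.
  Map<String, String> getDbNoWait(String name) {
    return dbs_.get(name);
  }

  // Holds the read lock, so it uses the non-waiting lookup.
  String getDbProperty(String db, String key) {
    versionLock_.readLock().lock();
    try {
      Map<String, String> props = getDbNoWait(db);
      return props == null ? null : props.get(key);
    } finally {
      versionLock_.readLock().unlock();
    }
  }

  // Repopulates the cache under the write lock, then marks the
  // initial reset as done.
  void reset(Map<String, Map<String, String>> fresh) {
    versionLock_.writeLock().lock();
    try {
      dbs_.clear();
      dbs_.putAll(fresh);
    } finally {
      versionLock_.writeLock().unlock();
    }
    initialReset_.countDown();
  }
}
```

In this model, getDbProperty() returns null before the first reset()
and the stored value afterwards, without ever waiting while the read
lock is held.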

Skip calling catalog_.startEventsProcessor() in the JniCatalog
constructor. Instead, let CatalogServiceCatalog.reset() start it at the
end of cache population.

Remove acquireVersionReadLock() since it has only one call site. Thus,
lock acquisition and release are clearly visible within the method.

Testing:
Increased TRIGGER_RESET_METADATA_DELAY from 1s to 3s in
test_metadata_after_failover_with_delayed_reset. It was easy to hit the
deadlock with a 3s delay before the patch. The deadlock no longer
occurs after the patch.
Ran and passed test_catalogd_ha.py and test_restart_services.py
exhaustively.

Change-Id: I3162472ea9531add77886bf1d0d73460ff34d07a
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M tests/custom_cluster/test_catalogd_ha.py
3 files changed, 36 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/23382/3
--
To view, visit http://gerrit.cloudera.org:8080/23382
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3162472ea9531add77886bf1d0d73460ff34d07a
Gerrit-Change-Number: 23382
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <k.venureddy2...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
