Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14307 )

Change subject: IMPALA-7506: support global INVALIDATE METADATA in local 
catalog mode
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14307/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/14307/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1104
PS5, Line 1104:     if (ctx.collectFullUpdates || tbl.getCatalogVersion() <= 
ctx.toVersion) {
> I'm a little curious why this approach instead of tracking the min versions
I've tried tracking the min catalog versions in local-catalog mode coordinator 
(Solution 2 in the doc: 
https://docs.google.com/document/d/1-AoigzsQPgSGosW4vtVP8E7TiwoJJfLkAq3Rd6KvKBg).
 But finally I found we need to replace Guava cache with something else that 
supports holding an external lock while evicting items and calling the 
RemovalListener.onRemoval(). The reason is:

- To delete catalog versions due to cache eviction resulting from timed 
expiration or exceeding a maximum size, we have to use RemovalListener. Then we 
only call CatalogObjectVersionSet.removeVersion() in our customized 
RemovalListener.onRemoval().
- We have no way to introduce a lock protecting the deletion of the cache item 
and the deletion of its catalog version.
- Without the lock, there's a chance for a version not being removed forever:
- - Thread 1 in CatalogdMetaProvider.loadWithCaching() put a new catalog object 
(with version v1) into the Guava cache
- - Thread 2 in CatalogdMetaProvider.invalidateCacheForObject() remove this 
catalog object
- - Thread 2 in RemovalListener.onRemoval() calls 
CatalogObjectVersionSet.removeVersion(v1). It's ignored since v1 hasn't been 
added yet.
- - Thread 1 in CatalogdMetaProvider.loadWithCaching() calls 
CatalogObjectVersionSet.addVersion(v1). v1 will permanantly exist in the 
version set, causing INVALIDATE METADATA to hang.

One approach to fix this case is adding a wait in 
CatalogObjectVersionSet.removeVersion() until the version is added. But the 
code paths of legacy catalog mode assumes that a version can be removed several 
times, and CatalogObjectVersionSet.removeVersion can be called on an unexisting 
version. Fixsing the assumption is also complex. Guava doc also mentions that 
we should avoid performing blocking calls or synchronizing on shared resources 
in the RemovalListener.

I've also written down other problems in the doc. The main obstacle is the lack 
of protection on modifying Guava cache and the version set in the same time. 
Any thoughts are welcome.



--
To view, visit http://gerrit.cloudera.org:8080/14307
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib61a7ab1ffa062620ffbc2dadc34bd7a8ca9e549
Gerrit-Change-Number: 14307
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Comment-Date: Mon, 14 Oct 2019 10:20:54 +0000
Gerrit-HasComments: Yes

Reply via email to