[ 
https://issues.apache.org/jira/browse/IMPALA-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572224#comment-16572224
 ] 

Balazs Jeszenszky commented on IMPALA-7168:
-------------------------------------------

To rephrase, the issue here is that the new subscriber upon joining will have a 
catalog version of 0, which gets propagated as the minimum topic version for 
the catalog topic. If the new subscriber fails to process the initial update 
and keeps re-requesting it (locking its catalog version at 0), SYNC_DDL queries 
will hang.
Until it has processed an initial catalog update, the coordinator will not 
serve any queries, so its metadata staleness isn't relevant for the 
purposes of SYNC_DDL. Maybe it's enough to just ignore 0 values for the minimum 
subscriber topic version?
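
To make the suggestion concrete, here is a rough sketch of the idea. It is 
not Impala code: the function name, the dict of per-subscriber versions, and 
the ignore_uninitialized flag are all made up for illustration, and the real 
statestore bookkeeping may look quite different.

```python
from typing import Mapping, Optional

# Assumption for this sketch: a subscriber that has not yet processed any
# catalog update reports version 0, as described in the comment above.
UNINITIALIZED_VERSION = 0


def min_subscriber_topic_version(
    versions: Mapping[str, int], ignore_uninitialized: bool = True
) -> Optional[int]:
    """Return the smallest topic version across subscribers (hypothetical helper).

    With ignore_uninitialized=True, subscribers still at version 0 are
    skipped, so a subscriber that keeps failing its initial update cannot
    pin the minimum at 0. Returns None if no subscriber qualifies.
    """
    candidates = [
        v
        for v in versions.values()
        if not (ignore_uninitialized and v == UNINITIALIZED_VERSION)
    ]
    return min(candidates) if candidates else None


# node_b joined but never got past its initial update.
subscribers = {"node_a": 42, "node_b": 0}
print(min_subscriber_topic_version(subscribers, ignore_uninitialized=False))  # 0
print(min_subscriber_topic_version(subscribers))  # 42
```

With the 0 entry skipped, a SYNC_DDL wait that compares against this minimum 
would no longer be held back by a subscriber that is not serving queries yet.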

> DML query may hang if CatalogUpdateCallback() encounters repeated error
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-7168
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7168
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 2.9.0, Impala 2.10.0, Impala 2.11.0, Impala 3.0, 
> Impala 2.12.0
>            Reporter: Pranay Singh
>            Priority: Major
>
> DML queries or INSERT will hang if 
> exec_env_->frontend()->UpdateCatalogCache() in 
> ImpalaServer::CatalogUpdateCallback encounters a repeated error like ENOMEM. 
> This happens with SYNC_DDL set to 1 when the coordinator node is waiting for 
> its catalog version to become current.
> The scenario looks like this: let's say there are two coordinator nodes, 
> Node A and Node B, 
> and catalogd and statestored are running on Node C.
> a) CREATE TABLE is executed on Node A, with SYNC_DDL set to 1. The thread 
> running the query is going to block in 
> impala::ImpalaServer::ProcessCatalogUpdateResult(), waiting for its catalog 
> version to become current.
> b) Meanwhile, statestored running on Node C would call 
> ImpalaServer::CatalogUpdateCallback on Node B via thrift RPC to do a delta 
> topic update, which would not happen if we encounter repeated errors, say 
> because the frontend is low on memory (low JVM heap situation).
> c) In such a case Node A will wait indefinitely for its catalog 
> version to become current, until Node B is shut down voluntarily.
> Note: This is a case where Node B is reachable (heartbeat is fine), but the 
> node is in a bad, non-working state.



