[jira] [Commented] (IMPALA-7597) "show partitions" does not retry on InconsistentMetadataFetchException

[ https://issues.apache.org/jira/browse/IMPALA-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656111#comment-16656111 ]

ASF subversion and git services commented on IMPALA-7597:
---------------------------------------------------------

Commit 5cc49c343f8558602af2663e3bb519da6d9852cc in impala's branch refs/heads/master from [~vercego]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=5cc49c3 ]

IMPALA-7597: wraps retries around Frontend metadata operations.

When configured to use the local catalog, concurrent metadata reads and
writes can cause the CatalogMetaProvider to throw an
InconsistentMetadataFetchException. Queries have been wrapped with a retry
loop, but the other frontend methods, such as listing table or partition
information, can also fail with the same error. These errors were seen under
a workload that concurrently adds and shows partitions.

This change wraps the call-sites (primarily in Frontend.java) that acquire a
Catalog and so have a chance of throwing an
InconsistentMetadataFetchException.

Testing:
- added a test that checks whether inconsistent metadata exceptions are seen
  in a concurrent workload.

Change-Id: I43a21571d54a7966c5c68bea1fe69dbc62be2a0b
Reviewed-on: http://gerrit.cloudera.org:8080/11608
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> "show partitions" does not retry on InconsistentMetadataFetchException
> ----------------------------------------------------------------------
>
>                 Key: IMPALA-7597
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7597
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.1.0
>            Reporter: bharath v
>            Assignee: Vuk Ercegovac
>            Priority: Critical
>
> IMPALA-7530 added retries in case LocalCatalog throws
> InconsistentMetadataFetchException. These retries apply to all code paths
> taking {{Frontend#createExecRequest()}}.
> "show partitions" additionally takes {{Frontend#getTableStats()}} and aborts
> the first time it sees InconsistentMetadataFetchException.
> We need to make sure all the queries (especially DDLs) retry if they hit this
> exception.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

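As an illustration of the retry wrapping that the commit message describes, here is a minimal Java sketch. It is not the code from the Impala commit: the class MetadataRetry, the helper withRetries, and the nested InconsistentMetadataException are hypothetical stand-ins, and the retry budget is passed as a plain argument, where Impala reads it from the --local_catalog_max_fetch_retries flag introduced by IMPALA-7599.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class MetadataRetry {
  /** Hypothetical stand-in for Impala's InconsistentMetadataFetchException. */
  static final class InconsistentMetadataException extends RuntimeException {
    InconsistentMetadataException(String msg) { super(msg); }
  }

  private MetadataRetry() {}

  /**
   * Runs op and retries it when it throws InconsistentMetadataException.
   * maxRetries counts retries after the first attempt, so op runs at most
   * maxRetries + 1 times. Any other exception propagates immediately.
   */
  static <T> T withRetries(int maxRetries, Supplier<T> op) {
    if (maxRetries < 0) {
      throw new IllegalArgumentException("maxRetries must be >= 0");
    }
    int attempt = 0;
    while (true) {
      try {
        return op.get();
      } catch (InconsistentMetadataException e) {
        // Retry budget exhausted: surface the last failure to the caller.
        if (attempt >= maxRetries) throw e;
        attempt++;
      }
    }
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    // Simulated metadata operation that fails twice before succeeding.
    String partitions = withRetries(40, () -> {
      if (calls.incrementAndGet() < 3) {
        throw new InconsistentMetadataException("version mismatch");
      }
      return "year=2018/month=9";
    });
    System.out.println(partitions + " fetched after " + calls.get() + " calls");
  }
}
```

Each call-site that acquires a catalog would pass its own operation as the Supplier, which keeps the retry policy in one place rather than repeating the loop at every call-site.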
[jira] [Commented] (IMPALA-7597) "show partitions" does not retry on InconsistentMetadataFetchException

[ https://issues.apache.org/jira/browse/IMPALA-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16633295#comment-16633295 ]

ASF subversion and git services commented on IMPALA-7597:
---------------------------------------------------------

Commit 6f7b162154daae4614a6f1da0be920394478b123 in impala's branch refs/heads/master from [~vercego]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=6f7b162 ]

IMPALA-7599: make the number of local cache retries configurable

Under heavy read/write load, the number of retries needed for queries in
order to skip over inconsistent metadata exceptions needs to be set higher.
This change makes the number of retries configurable. It can be set with the
newly added flag --local_catalog_max_fetch_retries.

In addition, this change increases the default from 10 to 40, which was
sufficient when handling several workloads with high read/write load.

Follow-up change for IMPALA-7597 will make use of this configuration when
retrying for cases other than analyzing queries.

Made several fixes to exception messages.

Testing:
- manual tests
- added an e2e test that sets the flag and checks for inconsistent metadata

Change-Id: I4f14d5a8728f3cb07c7710589c44c2cd52478ba8
Reviewed-on: http://gerrit.cloudera.org:8080/11539
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

[jira] [Commented] (IMPALA-7597) "show partitions" does not retry on InconsistentMetadataFetchException

[ https://issues.apache.org/jira/browse/IMPALA-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629491#comment-16629491 ]

Vuk Ercegovac commented on IMPALA-7597:
---------------------------------------

The issue reported here is one example of an InconsistentMetadataFetchException
thrown by code that is not under the retry loop of createExecRequest. Working
backwards, all of these are thrown from sendRequest in CatalogMetaProvider when
a fetch from catalogd either 1) does not find an expected object (e.g., a
database has been deleted and we are now fetching its list of table names,
which is no longer valid) or 2) finds that versions mismatch due to an
interleaved write. Such inconsistencies are possible at every step of the
schema hierarchy, e.g., list dbs, get db info, list table names, load table,
load table col stats, list partitions, load partition(s), list functions, load
function.

With the push architecture ("v1"), many of these operations would succeed, but
with potentially stale data. For example, if the table is present locally, its
partitions are also present, so "show partitions" would complete. With the pull
architecture ("v2"), if a new partition is added or the table is dropped after
the table is cached but before the partitions are fetched, the change is
reported as an exception.

While the exception reflects a more current state, it is a change in behavior
from "v1". With "v1", a stale result can be returned, and only a follow-up
operation fails. For example, listing the tables of a database that was listed
(via show databases) but has since been dropped would just result in an error
stating that the database does not exist.

For queries, we chose to retry explicitly. An option here is to retry for all
such operations. We can do so with a retrying wrapper that has the same
interface (similar to the hms retrying client). However, that may be too
heavyweight an approach. For example, getCatalogMetrics (and its callers)
should be able to proceed when such an exception arises; it is for internal
book-keeping and can be skipped.

An alternative is to provide a wrapper that retries and can easily be obtained.
A first thought is to add something alongside getCatalog in Frontend, e.g.,
getRetryableCatalog, and to use it where needed.

Further alternatives include making the exception checked, which was pointed
out in a todo (along with it being viral). Another approach is to make v2's
cache more coarse-grained. For example, a database can include all its table
names and functions (avoids the double check).

In addition, a way to test this is needed. An initial thought is to inject time
delays and check that at least one such inconsistency is encountered and
retried per operation.

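The "retrying wrapper with the same interface" option from the comment above could look roughly like the following Java sketch, built on java.lang.reflect.Proxy. Everything named here is hypothetical and much reduced: CatalogView is not Impala's catalog interface, retrying is not the proposed getRetryableCatalog, and InconsistentMetadataException only stands in for InconsistentMetadataFetchException. For testing, FlakyCatalog injects a fixed number of failures instead of the time delays suggested in the comment, which keeps the example deterministic.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.List;

final class RetryingCatalogDemo {
  /** Hypothetical stand-in for Impala's InconsistentMetadataFetchException. */
  static final class InconsistentMetadataException extends RuntimeException {
    InconsistentMetadataException(String msg) { super(msg); }
  }

  /** Hypothetical, much-reduced catalog interface. */
  interface CatalogView {
    List<String> listPartitions(String table);
  }

  /** Fault injector: fails the first N calls, then returns fixed data. */
  static final class FlakyCatalog implements CatalogView {
    private int remainingFailures;
    int calls = 0;

    FlakyCatalog(int failures) { this.remainingFailures = failures; }

    @Override
    public List<String> listPartitions(String table) {
      calls++;
      if (remainingFailures > 0) {
        remainingFailures--;
        throw new InconsistentMetadataException("interleaved write on " + table);
      }
      return Arrays.asList("year=2018/month=9", "year=2018/month=10");
    }
  }

  /** Returns a CatalogView whose methods retry on inconsistent metadata. */
  static CatalogView retrying(CatalogView delegate, int maxRetries) {
    InvocationHandler handler = (proxy, method, args) -> {
      int attempt = 0;
      while (true) {
        try {
          return method.invoke(delegate, args);
        } catch (InvocationTargetException e) {
          // Reflection wraps the delegate's exception; unwrap it first.
          Throwable cause = e.getCause();
          boolean retryable = cause instanceof InconsistentMetadataException;
          if (!retryable || attempt >= maxRetries) throw cause;
          attempt++;
        }
      }
    };
    return (CatalogView) Proxy.newProxyInstance(
        CatalogView.class.getClassLoader(),
        new Class<?>[] {CatalogView.class},
        handler);
  }

  public static void main(String[] args) {
    FlakyCatalog flaky = new FlakyCatalog(2);
    CatalogView catalog = retrying(flaky, 40);
    System.out.println(catalog.listPartitions("t") + " after " + flaky.calls + " calls");
  }
}
```

Callers that can tolerate a failure, such as the getCatalogMetrics example in the comment, would keep using the unwrapped object, while callers that need a consistent answer would opt in to the retrying view.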
[jira] [Commented] (IMPALA-7597) "show partitions" does not retry on InconsistentMetadataFetchException

[ https://issues.apache.org/jira/browse/IMPALA-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621334#comment-16621334 ]

bharath v commented on IMPALA-7597:
-----------------------------------

Adrian, I raised a separate jira for it, IMPALA-7599

> "show partitions" does not retry on InconsistentMetadataFetchException
> ----------------------------------------------------------------------
>
>                 Key: IMPALA-7597
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7597
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.1.0
>            Reporter: bharath v
>            Assignee: Vuk Ercegovac
>            Priority: Critical
>
> IMPALA-7530 added retries in case LocalCatalog throws
> InconsistentMetadataFetchException. These retries apply to all code paths
> taking {{Frontend#createExecRequest()}}.
> "show partitions" does not take this path and aborts the first time it sees
> InconsistentMetadataFetchException. (It takes {{Frontend#getTableStats()}})
> We need to make sure all the queries retry if they hit this exception.

[jira] [Commented] (IMPALA-7597) "show partitions" does not retry on InconsistentMetadataFetchException

[ https://issues.apache.org/jira/browse/IMPALA-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621330#comment-16621330 ]

Adrian Ng commented on IMPALA-7597:
-----------------------------------

As part of this fix, we should make the number of retries attempted before
throwing InconsistentMetadataFetchException configurable. Even with 10
retries, we are hitting this exception with our workload.