[
https://issues.apache.org/jira/browse/IMPALA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16862718#comment-16862718
]
Robbie Zhang edited comment on IMPALA-7093 at 6/19/19 1:48 PM:
---------------------------------------------------------------
Thank you, [~tarmstrong]!
I found that the problem is in
[ImpaladCatalog.updateCatalog()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L191].
This function always adds top-level catalog objects first. When we run
'invalidate metadata', it adds the database objects and then the
table/view/function objects. At that point the new database objects are still
empty: they contain no table/view/function objects. After
[ImpaladCatalog.addDb()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L391]
replaces the existing database objects with the new ones, the existing
table/view/function objects are missing until ImpaladCatalog.updateCatalog()
adds them back. If the impalad compiles a query while the
table/view/function objects are missing, the query fails with an
AnalysisException. The error message varies with the type of query. For
example, 'desc table' reports 'Could not resolve path', 'select * from table'
reports 'Could not resolve table reference', and 'insert into table' reports
'Table does not exist'.
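To make the window concrete, here is a toy model of the update order described above. It is only a sketch: the class and method names (CatalogWindowSketch, replaceDb, addTable, tableExists) are hypothetical and are not Impala's classes. It assumes nothing beyond what the comment states, namely that the database object is swapped first and its children are added afterwards.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of the update order described above. Every name here is hypothetical. */
class CatalogWindowSketch {
  /** A database that only tracks the names of its tables. */
  static final class Db {
    final Map<String, Boolean> tables = new ConcurrentHashMap<>();
  }

  private final Map<String, Db> dbs = new ConcurrentHashMap<>();

  /** What query analysis needs: the table must be reachable through its database. */
  boolean tableExists(String dbName, String tableName) {
    Db db = dbs.get(dbName);
    return db != null && db.tables.containsKey(tableName);
  }

  /** Step 1 of the update: the new database object arrives without children. */
  void replaceDb(String dbName) {
    dbs.put(dbName, new Db());
  }

  /** Step 2 of the update: the child objects are added one by one. */
  void addTable(String dbName, String tableName) {
    dbs.get(dbName).tables.put(tableName, Boolean.TRUE);
  }

  /** Replays one update and records whether the table is visible at each point. */
  public static boolean[] replayUpdate() {
    CatalogWindowSketch catalog = new CatalogWindowSketch();
    catalog.replaceDb("default");
    catalog.addTable("default", "test");
    boolean before = catalog.tableExists("default", "test");

    catalog.replaceDb("default");         // the database is replaced first
    boolean during = catalog.tableExists("default", "test");

    catalog.addTable("default", "test");  // the table is added back afterwards
    boolean after = catalog.tableExists("default", "test");
    return new boolean[] {before, during, after};
  }

  public static void main(String[] args) {
    boolean[] seen = replayUpdate();
    System.out.println("before=" + seen[0] + " during=" + seen[1] + " after=" + seen[2]);
  }
}
```

In this model the lookup between the two steps returns false, which corresponds to a query being analyzed inside the window.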
I can reproduce this issue by running two scripts. The first script keeps
running 'invalidate metadata':
{code:bash}
#!/bin/bash
# Invalidate the catalog metadata in a tight loop.
while true
do
  shell/impala-shell -q "invalidate metadata"
done
{code}
After I start the first script, I run the second script which keeps running a
query:
{code:bash}
#!/bin/bash
# Run a query in a loop and stop at the first run that reports no result.
while true
do
  #shell/impala-shell -q "desc test" 2>&1 | tee test.output
  #shell/impala-shell -q "select * from test" 2>&1 | tee test.output
  shell/impala-shell -q "insert overwrite test(i) values(1)" 2>&1 | tee test.output
  # A successful query prints a "Fetched ..." or "Modified ..." line.
  n=$(grep -cE "Fetched |Modified " test.output)
  if [ "$n" -lt 1 ]; then
    exit
  fi
done
{code}
The more table/view/function objects there are, the longer they stay missing
and the more easily the second script hits the AnalysisException. I created
thousands of tables on my cluster. Sometimes the second script hits the
AnalysisException within a couple of minutes, and sometimes it takes nearly
half an hour, but the failure is reproducible.
I changed ImpaladCatalog.java as follows. So far I haven't seen the
AnalysisException again, so the issue seems to be gone.
{code:java}
diff --git a/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java b/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
index 13cb620..23a7d68 100644
--- a/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
+++ b/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
@@ -20,6 +20,8 @@ package org.apache.impala.catalog;
 import java.nio.ByteBuffer;
 import java.util.ArrayDeque;
 import java.util.Set;
+import java.util.Map;
+import java.util.List;
 import java.util.concurrent.atomic.AtomicLong;
 import java.util.concurrent.atomic.AtomicReference;
@@ -388,6 +390,19 @@ public class ImpaladCatalog extends Catalog implements FeCatalog {
         existingDb.getCatalogVersion() < catalogVersion) {
       Db newDb = Db.fromTDatabase(thriftDb);
       newDb.setCatalogVersion(catalogVersion);
+      if (existingDb != null) {
+        // Migrate all existing tables/views/functions to newDb. Otherwise they
+        // will disappear temporarily.
+        for (Table tbl: existingDb.getTables()) {
+          newDb.addTable(tbl);
+        }
+        Map<String, List<Function>> functions = existingDb.getAllFunctions();
+        for (List<Function> fns: functions.values()) {
+          for (Function f: fns) {
+            newDb.addFunction(f);
+          }
+        }
+      }
       addDb(newDb);
       if (existingDb != null) {
         CatalogObjectVersionSet.INSTANCE.updateVersions(
{code}
Adding a lock to Catalog is another solution, but that change would be more
complex. One possible problem with my change is that if the new database object
has fewer table/view/function objects than the existing one, the deleted
objects might be left in the Catalog forever. According to my tests, the
deleted objects should be in sequencer.getDeletedObjects() and will be removed
by
[ImpaladCatalog.removeCatalogObject()|https://github.com/apache/impala/blob/ab908d54c22861967f693428ec7d9f6d7008607f/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java#L229].
So I think my change is fine. Please correct me if I'm wrong.
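The reasoning above can be checked against a small model. This is a sketch under stated assumptions, not Impala code: the names (CatalogMigrationSketch, replaceDbKeepingTables, removeTables) are hypothetical, and removeTables merely stands in for the deletion pass that, per the observation above, processes the deleted objects after the update.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of the proposed change. Every name here is hypothetical. */
class CatalogMigrationSketch {
  /** A database that only tracks the names of its tables. */
  static final class Db {
    final Map<String, Boolean> tables = new ConcurrentHashMap<>();
  }

  private final Map<String, Db> dbs = new ConcurrentHashMap<>();

  boolean tableExists(String dbName, String tableName) {
    Db db = dbs.get(dbName);
    return db != null && db.tables.containsKey(tableName);
  }

  /** Replaces the database, carrying the existing tables over before the swap. */
  void replaceDbKeepingTables(String dbName) {
    Db newDb = new Db();
    Db existingDb = dbs.get(dbName);
    if (existingDb != null) {
      newDb.tables.putAll(existingDb.tables);
    }
    dbs.put(dbName, newDb);
  }

  void addTable(String dbName, String tableName) {
    dbs.get(dbName).tables.put(tableName, Boolean.TRUE);
  }

  /** Stand-in for the deletion pass: drops every table named in the list. */
  void removeTables(String dbName, List<String> deleted) {
    Db db = dbs.get(dbName);
    if (db == null) {
      return;
    }
    for (String tableName : deleted) {
      db.tables.remove(tableName);
    }
  }

  /** Replays an update in which one table survives and one was dropped upstream. */
  public static boolean[] replayUpdate() {
    CatalogMigrationSketch catalog = new CatalogMigrationSketch();
    catalog.replaceDbKeepingTables("default");
    catalog.addTable("default", "kept");
    catalog.addTable("default", "dropped");

    catalog.replaceDbKeepingTables("default");
    boolean keptDuring = catalog.tableExists("default", "kept");
    boolean staleDuring = catalog.tableExists("default", "dropped");

    catalog.addTable("default", "kept");
    catalog.removeTables("default", Arrays.asList("dropped"));
    boolean keptAfter = catalog.tableExists("default", "kept");
    boolean staleAfter = catalog.tableExists("default", "dropped");
    return new boolean[] {keptDuring, staleDuring, keptAfter, staleAfter};
  }
}
```

In this model the surviving table stays visible throughout, while the dropped table lingers only until the deletion pass runs. Whether the real deletion pass always covers every stale object is exactly the open question raised above.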
> Tables briefly appear to not exist after INVALIDATE METADATA or catalog
> restart
> -------------------------------------------------------------------------------
>
> Key: IMPALA-7093
> URL: https://issues.apache.org/jira/browse/IMPALA-7093
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 2.12.0, Impala 2.13.0
> Reporter: Todd Lipcon
> Priority: Major
>
> I'm doing some stress testing of Impala 2.13 (recent snapshot build) and hit
> the following sequence:
> {code}
> {"query": "SHOW TABLES in consistency_test", "type": "call", "id": 3}
> {"type": "response", "id": 3, "results": [["t1"]]}
> {"query": "INVALIDATE METADATA", "type": "call", "id": 7}
> {"type": "response", "id": 7}
> {"query": "DESCRIBE consistency_test.t1", "type": "call", "id": 9}
> {"type": "response", "id": 9, "error": "AnalysisException: Could not resolve path: 'consistency_test.t1'\n"}
> {code}
> i.e. 'SHOW TABLES' shows that a table exists, but then shortly after an
> INVALIDATE METADATA, an attempt to describe a table indicates that the table
> does not exist. This is a single-threaded test case against a single impalad.
> I also saw similar behavior after a catalogd restart: queries issued to an
> impalad shortly afterwards could transiently report that tables do not exist
> when in fact they do.