[
https://issues.apache.org/jira/browse/IMPALA-12788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Quanlong Huang updated IMPALA-12788:
------------------------------------
Description:
This was identified by an internal S3 build that doesn't launch HBase. Some
tests still run queries on HBase tables, e.g.
TestDdlStatements::test_alter_set_column_stats, but they don't fail even if
the table can't be loaded correctly. Catalogd logs show that the connection
failure to HBase is ignored:
{noformat}
I0203 14:12:33.687620 20673 TableLoadingMgr.java:71] Loading metadata for
table: functional_hbase.alltypes
I0203 14:12:33.687674 24282 TableLoader.java:76] Loading metadata for:
functional_hbase.alltypes (background load)
I0203 14:12:33.687706 20673 TableLoadingMgr.java:73] Remaining items in queue:
0. Loads in progress: 1
I0203 14:12:33.690941 26564 JniCatalog.java:257] execDdl request: DROP_DATABASE
test_compute_stats_9c95c5d8 issued by jenkins
I0203 14:12:33.691668 24282 Table.java:218] createEventId_ for table:
functional_hbase.alltypes set to: -1
......
W0203 14:13:06.941573 1978 ReadOnlyZKClient.java:193] 0x65bc7c50 to
localhost:2181 failed for get of /hbase/hbaseid, code = CONNECTIONLOSS, retries
= 30, give up
W0203 14:13:06.947460 24282 ConnectionImplementation.java:641] Retrieve cluster
id failed
Java exception follows:
java.util.concurrent.ExecutionException:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode =
ConnectionLoss for /hbase/hbaseid
at
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
at
org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:639)
at
org.apache.hadoop.hbase.client.ConnectionImplementation.<init>(ConnectionImplementation.java:325)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at
org.apache.hadoop.hbase.client.ConnectionFactory.lambda$createConnection$0(ConnectionFactory.java:231)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at
org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:325)
at
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:230)
at
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:130)
at
org.apache.impala.catalog.FeHBaseTable$Util$ConnectionHolder.getConnection(FeHBaseTable.java:722)
at
org.apache.impala.catalog.FeHBaseTable$Util.getHBaseTable(FeHBaseTable.java:126)
at org.apache.impala.catalog.HBaseTable.load(HBaseTable.java:112)
at org.apache.impala.catalog.TableLoader.load(TableLoader.java:144)
at
org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:245)
at
org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:242)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
at
org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$ZKTask$1.exec(ReadOnlyZKClient.java:195)
at
org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient.run(ReadOnlyZKClient.java:340)
... 1 more
I0203 14:13:07.058998 24282 TableLoader.java:175] Loaded metadata for:
functional_hbase.alltypes (33371ms)
I0203 14:13:07.866829 21368 catalog-server.cc:403] A catalog update with 9
entries is assembled. Catalog version: 6192 Last sent catalog version: 6181
I0203 14:13:07.870369 21344 catalog-server.cc:816] Collected update:
1:TABLE:functional_hbase.alltypes, version=6193, original size=3855, compressed
size=1471
I0203 14:13:07.872047 21344 catalog-server.cc:816] Collected update:
1:CATALOG_SERVICE_ID, version=6193, original size=60, compressed
size=58{noformat}
This is problematic since impalad thinks the table is correctly loaded and
tries to load it again when applying the catalog update. That second load can
block the statestore subscriber thread for a long time, and other DDL queries
are blocked as well since they can't acquire the catalog update lock.
We've seen TestAsyncLoadData.test_async_load time out on S3 (IMPALA-11285),
and this is the cause.
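To make the blocking concrete, here is a rough sketch (hypothetical names, not
the actual Impala code) of the path seen in the stack trace below: the
statestore subscriber thread applies the whole topic update under one lock, so
the HBase connection retries inside Table.fromThrift() happen while that lock
is held:
{code:java}
// Rough sketch with hypothetical names (catalogUpdateLock_, updatedObjects).
// The statestore subscriber thread runs this while holding the catalog
// update lock that DDL responses also need to acquire.
synchronized (catalogUpdateLock_) {
  for (TCatalogObject obj : updatedObjects) {
    if (obj.isSetTable()) {
      // For an HBase table, Table.fromThrift() calls
      // HBaseTable.loadFromThrift(), which opens an HBase connection.
      // With HBase down, the ZooKeeper client retries ~30 times (~35s in
      // the logs below) before giving up, all while the lock is held, so
      // concurrent DDL queries stall behind this update.
      db.addTable(Table.fromThrift(db, obj.getTable()));
    }
  }
}
{code}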
Here are logs showing impalad blocked applying the catalog update for the
HBase table:
{noformat}
I0203 14:13:09.359010 3636 Frontend.java:1917]
db4f57572baab787:ebdb853600000000] Analyzing query: load data inpath
'/test-warehouse/test_load_staging_beeswax_True' into table
test_async_load_898a2f19.test_load_nopart_beeswax_True db: functional
...
I0203 14:13:42.188225 4881 ClientCnxn.java:1246] Socket error occurred:
localhost/0:0:0:0:0:0:0:1:2181: Connection refused
W0203 14:13:42.288529 4880 ReadOnlyZKClient.java:189] 0x43325be0 to
localhost:2181 failed for get of /hbase/hbaseid, code = CONNECTIONLOSS, retries
= 29
I0203 14:13:43.288617 4881 ClientCnxn.java:1111] Opening socket connection to
server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL
(unknown error)
I0203 14:13:43.288892 4881 ClientCnxn.java:1246] Socket error occurred:
localhost/127.0.0.1:2181: Connection refused
W0203 14:13:43.389173 4880 ReadOnlyZKClient.java:189] 0x43325be0 to
localhost:2181 failed for get of /hbase/hbaseid, code = CONNECTIONLOSS, retries
= 30
I0203 14:13:44.389231 4881 ClientCnxn.java:1111] Opening socket connection to
server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL
(unknown error)
I0203 14:13:44.389554 4881 ClientCnxn.java:1246] Socket error occurred:
localhost/127.0.0.1:2181: Connection refused
W0203 14:13:44.489856 4880 ReadOnlyZKClient.java:193] 0x43325be0 to
localhost:2181 failed for get of /hbase/hbaseid, code = CONNECTIONLOSS, retries
= 30, give up
W0203 14:13:44.500921 22023 ConnectionImplementation.java:641] Retrieve cluster
id failed
Java exception follows:
java.util.concurrent.ExecutionException:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode =
ConnectionLoss for /hbase/hbaseid
at
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
at
org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:639)
at
org.apache.hadoop.hbase.client.ConnectionImplementation.<init>(ConnectionImplementation.java:325)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at
org.apache.hadoop.hbase.client.ConnectionFactory.lambda$createConnection$0(ConnectionFactory.java:231)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at
org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:325)
at
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:230)
at
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:130)
at
org.apache.impala.catalog.FeHBaseTable$Util$ConnectionHolder.getConnection(FeHBaseTable.java:722)
at
org.apache.impala.catalog.FeHBaseTable$Util.getHBaseTable(FeHBaseTable.java:126)
at
org.apache.impala.catalog.HBaseTable.loadFromThrift(HBaseTable.java:139)
at org.apache.impala.catalog.Table.fromThrift(Table.java:538)
at
org.apache.impala.catalog.ImpaladCatalog.addTable(ImpaladCatalog.java:474)
at
org.apache.impala.catalog.ImpaladCatalog.addCatalogObject(ImpaladCatalog.java:329)
at
org.apache.impala.catalog.ImpaladCatalog.updateCatalog(ImpaladCatalog.java:258)
at
org.apache.impala.service.FeCatalogManager$CatalogdImpl.updateCatalogCache(FeCatalogManager.java:114)
at
org.apache.impala.service.Frontend.updateCatalogCache(Frontend.java:513)
at
org.apache.impala.service.JniFrontend.updateCatalogCache(JniFrontend.java:185)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
at
org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$ZKTask$1.exec(ReadOnlyZKClient.java:195)
at
org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient.run(ReadOnlyZKClient.java:340)
at java.lang.Thread.run(Thread.java:748)
I0203 14:13:44.585079 22023 impala-server.cc:2060] Catalog topic update applied
with version: 6193 new min catalog object version: 2
... // After this, the table test_load_nopart_beeswax_true from the LOAD DATA
statement can be added
I0203 14:13:44.586282 4723 ImpaladCatalog.java:228]
db4f57572baab787:ebdb853600000000] Adding:
TABLE:test_async_load_898a2f19.test_load_nopart_beeswax_true version: 6207
size: 3866 {noformat}
The bug: loading an HBase table should fail if catalogd fails to connect to
HBase, but currently it does not.
was:
This was identified by an internal S3 build that doesn't launch HBase. Some
tests still run queries on HBase tables, e.g.
TestDdlStatements::test_alter_set_column_stats, but they don't fail even if
the table can't be loaded correctly. Catalogd logs show that the connection
failure to HBase is ignored:
{noformat}
I0203 14:12:33.687620 20673 TableLoadingMgr.java:71] Loading metadata for
table: functional_hbase.alltypes
I0203 14:12:33.687674 24282 TableLoader.java:76] Loading metadata for:
functional_hbase.alltypes (background load)
I0203 14:12:33.687706 20673 TableLoadingMgr.java:73] Remaining items in queue:
0. Loads in progress: 1
I0203 14:12:33.690941 26564 JniCatalog.java:257] execDdl request: DROP_DATABASE
test_compute_stats_9c95c5d8 issued by jenkins
I0203 14:12:33.691668 24282 Table.java:218] createEventId_ for table:
functional_hbase.alltypes set to: -1
......
W0203 14:13:06.941573 1978 ReadOnlyZKClient.java:193] 0x65bc7c50 to
localhost:2181 failed for get of /hbase/hbaseid, code = CONNECTIONLOSS, retries
= 30, give up
W0203 14:13:06.947460 24282 ConnectionImplementation.java:641] Retrieve cluster
id failed
Java exception follows:
java.util.concurrent.ExecutionException:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode =
ConnectionLoss for /hbase/hbaseid
at
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
at
org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:639)
at
org.apache.hadoop.hbase.client.ConnectionImplementation.<init>(ConnectionImplementation.java:325)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at
org.apache.hadoop.hbase.client.ConnectionFactory.lambda$createConnection$0(ConnectionFactory.java:231)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at
org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:325)
at
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:230)
at
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:130)
at
org.apache.impala.catalog.FeHBaseTable$Util$ConnectionHolder.getConnection(FeHBaseTable.java:722)
at
org.apache.impala.catalog.FeHBaseTable$Util.getHBaseTable(FeHBaseTable.java:126)
at org.apache.impala.catalog.HBaseTable.load(HBaseTable.java:112)
at org.apache.impala.catalog.TableLoader.load(TableLoader.java:144)
at
org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:245)
at
org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:242)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
at
org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$ZKTask$1.exec(ReadOnlyZKClient.java:195)
at
org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient.run(ReadOnlyZKClient.java:340)
... 1 more
I0203 14:13:07.058998 24282 TableLoader.java:175] Loaded metadata for:
functional_hbase.alltypes (33371ms)
I0203 14:13:07.866829 21368 catalog-server.cc:403] A catalog update with 9
entries is assembled. Catalog version: 6192 Last sent catalog version: 6181
I0203 14:13:07.870369 21344 catalog-server.cc:816] Collected update:
1:TABLE:functional_hbase.alltypes, version=6193, original size=3855, compressed
size=1471
I0203 14:13:07.872047 21344 catalog-server.cc:816] Collected update:
1:CATALOG_SERVICE_ID, version=6193, original size=60, compressed
size=58{noformat}
This is problematic since impalad thinks the table is correctly loaded and
tries to load it again when applying the catalog update. That second load can
block the statestore subscriber thread for a long time, and other DDL queries
are blocked as well since they can't acquire the catalog update lock.
We've seen TestAsyncLoadData.test_async_load time out on S3 (IMPALA-11285),
and this is the cause.
Here are logs showing impalad blocked applying the catalog update for the
HBase table:
{noformat}
I0203 14:13:09.359010 3636 Frontend.java:1917]
db4f57572baab787:ebdb853600000000] Analyzing query: load data inpath
'/test-warehouse/test_load_staging_beeswax_True' into table
test_async_load_898a2f19.test_load_nopart_beeswax_True db: functional
...
I0203 14:13:42.188225 4881 ClientCnxn.java:1246] Socket error occurred:
localhost/0:0:0:0:0:0:0:1:2181: Connection refused
W0203 14:13:42.288529 4880 ReadOnlyZKClient.java:189] 0x43325be0 to
localhost:2181 failed for get of /hbase/hbaseid, code = CONNECTIONLOSS, retries
= 29
I0203 14:13:43.288617 4881 ClientCnxn.java:1111] Opening socket connection to
server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL
(unknown error)
I0203 14:13:43.288892 4881 ClientCnxn.java:1246] Socket error occurred:
localhost/127.0.0.1:2181: Connection refused
W0203 14:13:43.389173 4880 ReadOnlyZKClient.java:189] 0x43325be0 to
localhost:2181 failed for get of /hbase/hbaseid, code = CONNECTIONLOSS, retries
= 30
I0203 14:13:44.389231 4881 ClientCnxn.java:1111] Opening socket connection to
server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL
(unknown error)
I0203 14:13:44.389554 4881 ClientCnxn.java:1246] Socket error occurred:
localhost/127.0.0.1:2181: Connection refused
W0203 14:13:44.489856 4880 ReadOnlyZKClient.java:193] 0x43325be0 to
localhost:2181 failed for get of /hbase/hbaseid, code = CONNECTIONLOSS, retries
= 30, give up
W0203 14:13:44.500921 22023 ConnectionImplementation.java:641] Retrieve cluster
id failed
Java exception follows:
java.util.concurrent.ExecutionException:
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode =
ConnectionLoss for /hbase/hbaseid
at
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
at
org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:639)
at
org.apache.hadoop.hbase.client.ConnectionImplementation.<init>(ConnectionImplementation.java:325)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at
org.apache.hadoop.hbase.client.ConnectionFactory.lambda$createConnection$0(ConnectionFactory.java:231)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at
org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:325)
at
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:230)
at
org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:130)
at
org.apache.impala.catalog.FeHBaseTable$Util$ConnectionHolder.getConnection(FeHBaseTable.java:722)
at
org.apache.impala.catalog.FeHBaseTable$Util.getHBaseTable(FeHBaseTable.java:126)
at
org.apache.impala.catalog.HBaseTable.loadFromThrift(HBaseTable.java:139)
at org.apache.impala.catalog.Table.fromThrift(Table.java:538)
at
org.apache.impala.catalog.ImpaladCatalog.addTable(ImpaladCatalog.java:474)
at
org.apache.impala.catalog.ImpaladCatalog.addCatalogObject(ImpaladCatalog.java:329)
at
org.apache.impala.catalog.ImpaladCatalog.updateCatalog(ImpaladCatalog.java:258)
at
org.apache.impala.service.FeCatalogManager$CatalogdImpl.updateCatalogCache(FeCatalogManager.java:114)
at
org.apache.impala.service.Frontend.updateCatalogCache(Frontend.java:513)
at
org.apache.impala.service.JniFrontend.updateCatalogCache(JniFrontend.java:185)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
at
org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$ZKTask$1.exec(ReadOnlyZKClient.java:195)
at
org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient.run(ReadOnlyZKClient.java:340)
at java.lang.Thread.run(Thread.java:748)
I0203 14:13:44.585079 22023 impala-server.cc:2060] Catalog topic update applied
with version: 6193 new min catalog object version: 2
... // After this, the table test_load_nopart_beeswax_true from the LOAD DATA
statement can be added
I0203 14:13:44.586282 4723 ImpaladCatalog.java:228]
db4f57572baab787:ebdb853600000000] Adding:
TABLE:test_async_load_898a2f19.test_load_nopart_beeswax_true version: 6207
size: 3866 {noformat}
The bug: loading an HBase table should fail if catalogd fails to connect to
HBase, but currently it does not. It did fail before this change (IMPALA-7322):
[https://gerrit.cloudera.org/c/13786/8/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java]
After IMPALA-7322, the exception thrown by Util.getHBaseTable(hbaseTableName_)
is ignored:
{code:java}
try {
  hbaseTableName_ = Util.getHBaseTableName(getMetaStoreTable());
  // Warm up the connection and verify the table exists.
  Util.getHBaseTable(hbaseTableName_).close();
  columnFamilies_ = null;
  cols = Util.loadColumns(msTable_);
} finally {
  storageMetadataLoadTime_ = storageLoadTimer.stop();
}
{code}
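For contrast, a minimal sketch of how the load path could surface the failure
instead, assuming Util.getHBaseTable() throws IOException on connection
failure and reusing the existing TableLoadingException (the actual fix may
differ):
{code:java}
try {
  hbaseTableName_ = Util.getHBaseTableName(getMetaStoreTable());
  // Warm up the connection and verify the table exists.
  Util.getHBaseTable(hbaseTableName_).close();
  columnFamilies_ = null;
  cols = Util.loadColumns(msTable_);
} catch (IOException e) {
  // Sketch: rethrow so catalogd marks the load as failed instead of
  // silently treating the table as loaded.
  throw new TableLoadingException(
      "Failed to connect to HBase for table " + hbaseTableName_, e);
} finally {
  storageMetadataLoadTime_ = storageLoadTimer.stop();
}
{code}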
> HBaseTable still gets loaded even if HBase is down
> --------------------------------------------------
>
> Key: IMPALA-12788
> URL: https://issues.apache.org/jira/browse/IMPALA-12788
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 4.0.0, Impala 3.4.0, Impala 3.4.1, Impala 4.1.0,
> Impala 4.2.0, Impala 4.1.1, Impala 4.1.2, Impala 4.3.0
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
>