[
https://issues.apache.org/jira/browse/IMPALA-14279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Surya Hebbar closed IMPALA-14279.
---------------------------------
> TestCatalogdHA.test_metadata_after_failover_with_hms_sync failing
> -----------------------------------------------------------------
>
> Key: IMPALA-14279
> URL: https://issues.apache.org/jira/browse/IMPALA-14279
> Project: IMPALA
> Issue Type: Bug
> Reporter: Surya Hebbar
> Assignee: Riza Suminto
> Priority: Major
>
> [https://master-03.jenkins.cloudera.com/job/impala-asf-master-core-s3-data-cache/532]
>
> Error Message
> {code}
> assert 'Error making an RPC call to Catalog server' in "Query
> 954dacb0ba3c1f89:f9b9ec4800000000 failed:\nLocalCatalogException: Could not
> load table names for database 'test... for
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:26000
> (connect() failed: Connection refused)\n\n\n" + where "Query
> 954dacb0ba3c1f89:f9b9ec4800000000 failed:\nLocalCatalogException: Could not
> load table names for database 'test... for
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:26000
> (connect() failed: Connection refused)\n\n\n" = str(HiveServer2Error("Query
> 954dacb0ba3c1f89:f9b9ec4800000000
> failed:\nLocalCatalo...and-1669.vpc.cloudera.com:26000 (connect() failed:
> Connection refused)\n\n\n",)){code}
> Stacktrace
> {code}
> custom_cluster/test_catalogd_ha.py:548: in
> test_metadata_after_failover_with_hms_sync
> self._test_metadata_after_failover(unique_database, skip_func_test=True)
> custom_cluster/test_catalogd_ha.py:644: in _test_metadata_after_failover
> assert "Error making an RPC call to Catalog server" in str(e)
> E assert 'Error making an RPC call to Catalog server' in "Query
> 954dacb0ba3c1f89:f9b9ec4800000000 failed:\nLocalCatalogException: Could not
> load table names for database 'test... for
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:26000
> (connect() failed: Connection refused)\n\n\n"
> E + where "Query 954dacb0ba3c1f89:f9b9ec4800000000
> failed:\nLocalCatalogException: Could not load table names for database
> 'test... for
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:26000
> (connect() failed: Connection refused)\n\n\n" = str(HiveServer2Error("Query
> 954dacb0ba3c1f89:f9b9ec4800000000
> failed:\nLocalCatalo...and-1669.vpc.cloudera.com:26000 (connect() failed:
> Connection refused)\n\n\n",))
> {code}
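The assertion at test_catalogd_ha.py:644 is a plain substring check on the error text. A minimal sketch of why it fails here, assuming an abbreviated stand-in for the reported message (the full text is elided in the report, so the `...` below is kept as-is) and a hypothetical helper name `contains_expected`:

```python
# Substring the test expects to find in the error text (taken from the stack trace).
EXPECTED = "Error making an RPC call to Catalog server"

# Abbreviated stand-in for the reported message; the report itself elides
# the middle of the string, so it is not reconstructed here.
observed = (
    "Query 954dacb0ba3c1f89:f9b9ec4800000000 failed:\n"
    "LocalCatalogException: Could not load table names for database ... "
    "(connect() failed: Connection refused)\n"
)


def contains_expected(message, expected=EXPECTED):
    """Hypothetical helper: True when the expected substring occurs in the error text."""
    return expected in message


# The visible fragments of the observed message do not contain the expected substring.
print(contains_expected(observed))
```

The query did fail during the failover, but with a LocalCatalogException wrapping a refused connection rather than the wording the test looks for.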
> Standard Error
> {code}
> – 2025-07-29 17:31:15,423 INFO MainThread: Starting cluster with command:
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/bin/start-impala-cluster.py
> '--state_store_args=--statestore_update_frequency_ms=50
> --statestore_priority_update_frequency_ms=50
> --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3
> --log_dir=/data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests
> --log_level=1 '--impalad_args=--use_local_catalog=true '
> '--state_store_args=--use_subscriber_id_as_catalogd_priority=true '
> '--catalogd_args=--catalogd_ha_reset_metadata_on_failover=false
> --catalog_topic_mode=minimal
> --debug_actions=catalogd_event_processing_delay:SLEEP@1000 '
> --enable_catalogd_ha --impalad_args=--default_query_options=
> 17:31:15 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
> 17:31:15 MainThread: Starting State Store logging to
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests/statestored.INFO
> 17:31:15 MainThread: Starting Catalog Service logging to
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
> 17:31:15 MainThread: Starting Catalog Service logging to
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests/catalogd_node1.INFO
> 17:31:15 MainThread: Starting Impala Daemon logging to
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests/impalad.INFO
> 17:31:15 MainThread: Starting Impala Daemon logging to
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
> 17:31:15 MainThread: Starting Impala Daemon logging to
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
> 17:31:18 MainThread: Found 3 impalad/1 statestored/2 catalogd process(es)
> 17:31:18 MainThread: Waiting for Impalad webserver port 25000
> 17:31:18 MainThread: Waiting for Impalad webserver port 25000
> 17:31:19 MainThread: Waiting for Impalad webserver port 25000
> 17:31:19 MainThread: Waiting for Impalad webserver port 25000
> 17:31:20 MainThread: Waiting for Impalad webserver port 25000
> 17:31:20 MainThread: Waiting for Impalad webserver port 25000
> 17:31:20 MainThread: Waiting for Impalad webserver port 25001
> 17:31:20 MainThread: Waiting for Impalad webserver port 25002
> 17:31:22 MainThread: Waiting for coordinator client services - hs2 port:
> 21050 hs2-http port: 28000 beeswax port: 21000
> 17:31:24 MainThread: Waiting for coordinator client services - hs2 port:
> 21051 hs2-http port: 28001 beeswax port: 21001
> 17:31:26 MainThread: Waiting for coordinator client services - hs2 port:
> 21052 hs2-http port: 28002 beeswax port: 21002
> 17:31:26 MainThread: Getting num_known_live_backends from
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:25000
> 17:31:26 MainThread: num_known_live_backends has reached value: 3
> 17:31:26 MainThread: Getting num_known_live_backends from
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:25001
> 17:31:26 MainThread: num_known_live_backends has reached value: 3
> 17:31:26 MainThread: Getting num_known_live_backends from
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:25002
> 17:31:26 MainThread: num_known_live_backends has reached value: 3
> 17:31:26 MainThread: Total wait: 8.62s
> 17:31:26 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3
> executors).
> – 2025-07-29 17:31:26,682 DEBUG MainThread: Found 3 impalad/1 statestored/2
> catalogd process(es)
> – 2025-07-29 17:31:26,682 INFO MainThread: Getting metric:
> statestore.live-backends from
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:25010
> – 2025-07-29 17:31:26,685 INFO MainThread: Metric 'statestore.live-backends'
> has reached desired value: 5. total_wait: 0s
> – 2025-07-29 17:31:26,685 DEBUG MainThread: Getting num_known_live_backends
> from impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:25000
> – 2025-07-29 17:31:26,687 INFO MainThread: num_known_live_backends has
> reached value: 3
> – 2025-07-29 17:31:26,687 DEBUG MainThread: Getting num_known_live_backends
> from impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:25001
> – 2025-07-29 17:31:26,688 INFO MainThread: num_known_live_backends has
> reached value: 3
> – 2025-07-29 17:31:26,689 DEBUG MainThread: Getting num_known_live_backends
> from impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:25002
> – 2025-07-29 17:31:26,690 INFO MainThread: num_known_live_backends has
> reached value: 3
> – 2025-07-29 17:31:26,690 INFO MainThread: beeswax:
> set
> client_identifier=custom_cluster/test_catalogd_ha.py::TestCatalogdHA::()::test_metadata_after_failover_with_hms_sync;
> – 2025-07-29 17:31:26,691 INFO MainThread: beeswax: connected to
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:21000 with
> beeswax
> – 2025-07-29 17:31:26,691 INFO MainThread: hs2:
> set
> client_identifier=custom_cluster/test_catalogd_ha.py::TestCatalogdHA::()::test_metadata_after_failover_with_hms_sync;
> – 2025-07-29 17:31:26,691 INFO MainThread: hs2: connected to
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:21050 with
> impyla hs2
> – 2025-07-29 17:31:26,691 INFO MainThread: hs2-http:
> set
> client_identifier=custom_cluster/test_catalogd_ha.py::TestCatalogdHA::()::test_metadata_after_failover_with_hms_sync;
> – 2025-07-29 17:31:26,691 INFO MainThread: hs2-http: connected to
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:28000 with
> impyla hs2-http
> – 2025-07-29 17:31:26,692 INFO MainThread: hs2:
> set
> client_identifier=custom_cluster/test_catalogd_ha.py::TestCatalogdHA::()::test_metadata_after_failover_with_hms_sync;
> – 2025-07-29 17:31:26,692 INFO MainThread: hs2: connected to
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:21050 with
> impyla hs2
> – 2025-07-29 17:31:26,692 INFO MainThread: hs2:
> set
> client_identifier=custom_cluster/test_catalogd_ha.py::TestCatalogdHA::()::test_metadata_after_failover_with_hms_sync;
> – 2025-07-29 17:31:26,692 INFO MainThread: hs2: set_configuration:
> set sync_ddl=False;
> – 2025-07-29 17:31:26,693 INFO MainThread: hs2: executing against Impala at
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:21050.
> session: e047f08b2d0426cb:ae5b15c12058cda8 main_cursor: True user: None
> DROP DATABASE IF EXISTS `test_metadata_after_failover_with_hms_sync_86ff433c`
> CASCADE;
> – 2025-07-29 17:31:27,048 INFO MainThread: 2c45ae9c61b72080:9cb4d20a00000000:
> query started
> – 2025-07-29 17:31:27,050 INFO MainThread: 2c45ae9c61b72080:9cb4d20a00000000:
> getting log for operation
> – 2025-07-29 17:31:27,050 INFO MainThread: 2c45ae9c61b72080:9cb4d20a00000000:
> getting runtime profile operation
> – 2025-07-29 17:31:27,051 INFO MainThread: 2c45ae9c61b72080:9cb4d20a00000000:
> closing query for operation
> – 2025-07-29 17:31:29,032 INFO MainThread: hs2: executing against Impala at
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:21050.
> session: e047f08b2d0426cb:ae5b15c12058cda8 main_cursor: True user: None
> CREATE DATABASE `test_metadata_after_failover_with_hms_sync_86ff433c`;
> – 2025-07-29 17:31:29,594 INFO MainThread: ff4ce57017836e4a:c7876aad00000000:
> query started
> – 2025-07-29 17:31:29,595 INFO MainThread: ff4ce57017836e4a:c7876aad00000000:
> getting log for operation
> – 2025-07-29 17:31:29,595 INFO MainThread: ff4ce57017836e4a:c7876aad00000000:
> getting runtime profile operation
> – 2025-07-29 17:31:29,596 INFO MainThread: ff4ce57017836e4a:c7876aad00000000:
> closing query for operation
> – 2025-07-29 17:31:29,596 INFO MainThread: Created database
> "test_metadata_after_failover_with_hms_sync_86ff433c" for test ID
> "custom_cluster/test_catalogd_ha.py::TestCatalogdHA::()::test_metadata_after_failover_with_hms_sync"
> – 2025-07-29 17:31:29,596 INFO MainThread: hs2: closing 1 sync and 0 async
> hs2 connections to:
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:21050
> – 2025-07-29 17:31:29,630 INFO MainThread: hs2: executing against Impala at
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:21050.
> session: 9e4b294339981440:95bec133cb1d0b97 main_cursor: True user: None
> create function
> test_metadata_after_failover_with_hms_sync_86ff433c.identity_tmp(bigint)
> returns bigint location '/test-warehouse/libTestUdfs.so' symbol='Identity';
> – 2025-07-29 17:31:30,019 INFO MainThread: 0342ef29b927720a:60dee5b000000000:
> query started
> – 2025-07-29 17:31:30,019 INFO MainThread: 0342ef29b927720a:60dee5b000000000:
> getting log for operation
> – 2025-07-29 17:31:30,019 INFO MainThread: 0342ef29b927720a:60dee5b000000000:
> getting runtime profile operation
> – 2025-07-29 17:31:30,020 INFO MainThread: 0342ef29b927720a:60dee5b000000000:
> closing query for operation
> – 2025-07-29 17:31:30,020 INFO MainThread: hs2: executing against Impala at
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:21050.
> session: 9e4b294339981440:95bec133cb1d0b97 main_cursor: True user: None
> select test_metadata_after_failover_with_hms_sync_86ff433c.identity_tmp(10);
> – 2025-07-29 17:31:30,253 INFO MainThread: f14f5625b7e34d1a:43188cd200000000:
> query started
> – 2025-07-29 17:31:30,254 INFO MainThread: f14f5625b7e34d1a:43188cd200000000:
> getting log for operation
> – 2025-07-29 17:31:30,254 INFO MainThread: f14f5625b7e34d1a:43188cd200000000:
> getting runtime profile operation
> – 2025-07-29 17:31:30,255 INFO MainThread: f14f5625b7e34d1a:43188cd200000000:
> closing query for operation
> – 2025-07-29 17:31:30,255 INFO MainThread: hs2: executing against Impala at
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:21050.
> session: 9e4b294339981440:95bec133cb1d0b97 main_cursor: True user: None
> create table test_metadata_after_failover_with_hms_sync_86ff433c.tbl(i int);
> – 2025-07-29 17:31:30,723 INFO MainThread: d6485426e08364ae:a1cd3c2100000000:
> query started
> – 2025-07-29 17:31:30,724 INFO MainThread: d6485426e08364ae:a1cd3c2100000000:
> getting log for operation
> – 2025-07-29 17:31:30,724 INFO MainThread: d6485426e08364ae:a1cd3c2100000000:
> getting runtime profile operation
> – 2025-07-29 17:31:30,724 INFO MainThread: d6485426e08364ae:a1cd3c2100000000:
> closing query for operation
> – 2025-07-29 17:31:30,747 INFO MainThread: Found PID 1614928 for
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/be/build/latest/service/catalogd
> -logbufsecs=5 -v=1 -max_log_files=0 -log_rotation_match_pid=true
> -log_filename=catalogd
> -log_dir=/data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests
> -kudu_master_hosts localhost --catalogd_ha_reset_metadata_on_failover=false
> --catalog_topic_mode=minimal
> --debug_actions=catalogd_event_processing_delay:SLEEP@1000
> -catalog_service_port=26000 -state_store_subscriber_port=23020
> -webserver_port=25020 -enable_catalogd_ha=true
> – 2025-07-29 17:31:30,768 INFO MainThread: Killing <CatalogdProcess PID:
> 1614928
> (/data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/be/build/latest/service/catalogd
> -logbufsecs=5 -v=1 -max_log_files=0 -log_rotation_match_pid=true
> -log_filename=catalogd
> -log_dir=/data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/logs/custom_cluster_tests
> -kudu_master_hosts localhost --catalogd_ha_reset_metadata_on_failover=false
> --catalog_topic_mode=minimal
> --debug_actions=catalogd_event_processing_delay:SLEEP@1000
> -catalog_service_port=26000 -state_store_subscriber_port=23020
> -webserver_port=25020 -enable_catalogd_ha=true)> with signal 9
> – 2025-07-29 17:31:30,801 INFO MainThread: Getting metric:
> catalog-server.active-status from
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:25021
> – 2025-07-29 17:31:30,804 INFO MainThread: Waiting for metric value
> 'catalog-server.active-status'=True. Current value: False. total_wait: 0s
> – 2025-07-29 17:31:30,804 INFO MainThread: Sleeping 1s before next retry.
> – 2025-07-29 17:31:31,805 INFO MainThread: Getting metric:
> catalog-server.active-status from
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:25021
> – 2025-07-29 17:31:31,808 INFO MainThread: Metric
> 'catalog-server.active-status' has reached desired value: True. total_wait:
> 1.00432705879s
> – 2025-07-29 17:31:31,811 INFO MainThread: hs2: executing against Impala at
> impala-ec2-redhat86-m6i-4xlarge-ondemand-1669.vpc.cloudera.com:21050.
> session: 9e4b294339981440:95bec133cb1d0b97 main_cursor: True user: None
> describe test_metadata_after_failover_with_hms_sync_86ff433c.tbl;
> {code}
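The last log lines show the test polling `catalog-server.active-status` on port 25021 once per second until it reports True, then issuing `describe` about 3 ms later. The refused connection in the error is to port 26000, the catalog service port of the catalogd that was just killed. A minimal sketch of that polling pattern, assuming a hypothetical `wait_for_metric` helper with an injected `fetch` callable (this is not the Impala test framework's own code):

```python
import time


def wait_for_metric(fetch, expected, interval_s=1.0, timeout_s=30.0, sleep=time.sleep):
    """Poll fetch() until it returns `expected`; return the number of polls made.

    `fetch` is a hypothetical zero-argument callable that reads the metric,
    for example from a daemon's debug web page.
    """
    waited = 0.0
    polls = 0
    while True:
        polls += 1
        if fetch() == expected:
            return polls
        if waited >= timeout_s:
            raise TimeoutError("metric did not reach %r within %ss" % (expected, timeout_s))
        sleep(interval_s)  # corresponds to "Sleeping 1s before next retry" in the log
        waited += interval_s


# Mirrors the log: first read is False, second read is True.
readings = iter([False, True])
print(wait_for_metric(lambda: next(readings), True, sleep=lambda s: None))
```

A loop like this only confirms what the polled daemon reports about itself; it says nothing about whether other daemons have already observed the change.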