[ 
https://issues.apache.org/jira/browse/IMPALA-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17950616#comment-17950616
 ] 

ASF subversion and git services commented on IMPALA-13850:
----------------------------------------------------------

Commit be16a02fa8f98da09d572d8250363896ae10b9e7 in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=be16a02fa ]

IMPALA-13850 (part 2): Fix bug found by test_restart_services.py

This patch stabilizes test_restart_catalogd_with_local_catalog in
test_restart_services.py after the first part of IMPALA-13850 was merged.

IMPALA-13850 (part 1) makes local catalog mode send statestore updates
twice: the first announces its availability and service id, while
the second is the full topic update. There is a brief window in which
CatalogD accepts getCatalogObject() requests before the very first
CatalogServiceCatalog.reset() is initiated and obtains the write lock. A
request that gets through in that window might see an empty catalog, which
results in query failures because the db/table does not exist.

This patch blocks CatalogServiceThriftIf.AcceptRequest() until
CatalogServiceCatalog.reset() is initiated. Catalog version 100 is used to
signal that the initial reset has begun. Later in part 3, when we implement
in-place metadata cache reset, AcceptRequest() can unblock faster when
reset() releases the write lock during catalog cache initialization.
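
As an illustration only, the gating idea described above can be sketched
with a condition variable. This is not Impala's implementation: the class
CatalogGate, its methods, and the "version at least 100" comparison are
assumptions made for the sketch. Only the use of catalog version 100 as the
"initial reset has begun" signal comes from the text above.

```python
import threading

# From the commit message: catalog version 100 signals that the initial
# reset has begun. Treating the check as ">=" is an assumption.
INITIAL_RESET_VERSION = 100


class CatalogGate:
    """Hypothetical gate that holds requests until the initial reset starts."""

    def __init__(self):
        self._cond = threading.Condition()
        self._version = 0  # assumed starting version for this sketch

    def set_version(self, version):
        # Called by the (hypothetical) reset path once it bumps the version.
        with self._cond:
            self._version = max(self._version, version)
            self._cond.notify_all()

    def accept_request(self, timeout=None):
        # Block until the version shows that the initial reset has begun.
        # Returns True if the gate opened, False if the timeout expired.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._version >= INITIAL_RESET_VERSION,
                timeout=timeout,
            )
```

In this sketch, a request handler would call accept_request() before
serving a request, and the reset path would call set_version(100) once the
initial reset has been initiated.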

Testing:
- Looped test_restart_catalogd_with_local_catalog 100 times; all runs passed.

Change-Id: I97f6f692506de0bbf2e1445f83bed824dc8298fd
Reviewed-on: http://gerrit.cloudera.org:8080/22844
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> Catalogd should not start metadata operation until initialization is done if 
> HA is enabled
> ------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-13850
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13850
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Wenzhe Zhou
>            Assignee: Riza Suminto
>            Priority: Critical
>
> In a case reported by a user, the catalogd initialization failed to complete. 
> Log messages showed that catalog HA was enabled. catalogd was blocked 
> trying to acquire "CatalogServer.catalog_lock_" when calling 
> CatalogServer::UpdateActiveCatalogd() during statestore subscriber 
> registration.
> Log messages showed that an IM command was issued before catalogd tried to 
> register with the statestore.
> {code:java}
> I0310 12:21:34.093617     1 CatalogServiceCatalog.java:2188] Invalidated all 
> metadata.
> I0310 12:21:34.094341     1 thrift-server.cc:419] ThriftServer 
> 'StatestoreSubscriber' started on port: 23020
> I0310 12:21:34.094341  1816 TAcceptQueueServer.cpp:329] 
> connection_setup_thread_pool_size is set to 2
> I0310 12:21:34.094586     1 thrift-util.cc:198] TSocket::open() error on 
> socket (after THRIFT_POLL) <Host: localhost Port: 23020>: Connection refused
> I0310 12:21:34.094790     1 statestore-subscriber.cc:745] Starting statestore 
> subscriber
> {code}
> We should not allow any metadata operation until initialization is done. When 
> HA is enabled, catalog-server should not hold "CatalogServer.catalog_lock_" 
> for a long time before the active catalogd is assigned.


