[ 
https://issues.apache.org/jira/browse/IMPALA-11729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith resolved IMPALA-11729.
------------------------------------
    Fix Version/s: Impala 4.5.0
       Resolution: Fixed

> Investigate and improve impalad startup time
> --------------------------------------------
>
>                 Key: IMPALA-11729
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11729
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Csaba Ringhofer
>            Priority: Minor
>              Labels: ramp-up
>             Fix For: Impala 4.5.0
>
>
> impalad startup takes several seconds, a few of them before it even tries to 
> connect to statestored. From a test run (release mode) with a parallel 
> catalogd startup:
> {code}
> I1113 21:02:17.334743  4363 logging.cc:247] stdout will be logged to this 
> file.
> I1113 21:02:18.968991  4363 JniFrontend.java:141] Java Input arguments:
> I1113 21:02:19.887519  4363 exec-env.cc:467] Starting statestore subscriber 
> service
> {code}
> After connecting to the statestore, coordinators need to wait for the initial 
> catalog update, and processing it takes time depending on the number of 
> catalog objects:
> {code}
> I1113 21:02:19.888423  4363 Frontend.java:1618] Waiting for local catalog to 
> be initialized, attempt: 0
> I1113 21:02:21.888621  4363 Frontend.java:1618] Waiting for local catalog to 
> be initialized, attempt: 1
> I1113 21:02:23.888849  4363 Frontend.java:1614] Local catalog initialized 
> after: 4000 ms.
> I1113 21:02:23.890105  4363 impala-server.cc:3103] Impala has started.
> {code}
> Meanwhile, catalogd takes about 2 seconds before it even tries to connect to HMS:
> {code}
> I1113 21:02:17.289606  4281 logging.cc:247] stdout will be logged to this 
> file.
> I1113 21:02:19.023339  4281 HiveMetaStoreClient.java:720] Trying to connect 
> to metastore with URI (thrift://localhost:9083) in binary transport mode
> I1113 21:02:21.671665  5028 catalog-server.cc:400] A catalog update with 1647 
> entries is assembled. Catalog version: 1649 Last sent catalog version: 0
> {code}
> The statestore starts up quickly, well before other components try to connect 
> to it:
> {code}
> I1113 21:02:17.263167  4262 logging.cc:247] stdout will be logged to this 
> file.
> I1113 21:02:17.268682  4262 thrift-server.cc:419] ThriftServer 
> 'StatestoreService' started on port: 24000
> I1113 21:02:19.670817  4285 TAcceptQueueServer.cpp:355] New connection to 
> server StatestoreService from client <Host: 127.0.0.1 Port: 44156>
> {code}
> While these 6 seconds at impalad, with ~2 seconds waiting for the initial 
> catalog update, are not very bad, making startup quicker would be visible in 
> test run times (custom cluster tests restart the cluster a lot) and in 
> autoscaling scenarios. Finding out what takes the time during startup would 
> also be a nice ramp-up task.
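
As a starting point for finding out where the time goes, the gaps between consecutive log lines can be computed from the glog timestamps. The sketch below is only an illustration, not part of Impala: the log lines are a subset of the impalad excerpt above with wrapped lines joined, the function names are made up, and because glog omits the year an arbitrary year (2000) is assumed for parsing.

```python
import re
from datetime import datetime

# glog prefix: severity letter, MMDD, HH:MM:SS.microseconds, thread id, file:line]
GLOG_RE = re.compile(r"^[IWEF](\d{4} \d{2}:\d{2}:\d{2}\.\d{6})\s+\d+\s+([^\]]+)\]")

# Subset of the impalad log excerpt from the description (wrapped lines joined).
IMPALAD_LOG = """\
I1113 21:02:17.334743  4363 logging.cc:247] stdout will be logged to this file.
I1113 21:02:18.968991  4363 JniFrontend.java:141] Java Input arguments:
I1113 21:02:19.887519  4363 exec-env.cc:467] Starting statestore subscriber service
I1113 21:02:23.888849  4363 Frontend.java:1614] Local catalog initialized after: 4000 ms.
I1113 21:02:23.890105  4363 impala-server.cc:3103] Impala has started.
"""


def parse_milestones(log_text, assumed_year=2000):
    """Return (timestamp, source location) for every glog-formatted line."""
    milestones = []
    for line in log_text.splitlines():
        match = GLOG_RE.match(line)
        if match is None:
            continue  # continuation line or non-glog output
        # glog omits the year, so an assumed one is prepended before parsing.
        stamp = datetime.strptime(
            f"{assumed_year}{match.group(1)}", "%Y%m%d %H:%M:%S.%f"
        )
        milestones.append((stamp, match.group(2)))
    return milestones


def phase_durations(milestones):
    """Seconds elapsed between each pair of consecutive milestones."""
    return [
        (f"{prev_loc} -> {loc}", (stamp - prev_stamp).total_seconds())
        for (prev_stamp, prev_loc), (stamp, loc) in zip(milestones, milestones[1:])
    ]


milestones = parse_milestones(IMPALAD_LOG)
durations = phase_durations(milestones)
total = (milestones[-1][0] - milestones[0][0]).total_seconds()

for label, seconds in durations:
    print(f"{seconds:8.3f}s  {label}")
print(f"{total:8.3f}s  total")
```

Running the same functions over the full impalad and catalogd logs would show which gaps between adjacent log lines are largest and therefore worth profiling first.
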
> The startup logic is single-threaded, so I see the most potential in moving 
> some independent tasks to separate threads. It is also possible that we are 
> doing some completely unnecessary tasks in some scenarios (e.g. executor-only 
> impalad) or that some tasks could be safely moved to a later point when they 
> are actually needed.
> Initialization is driven mainly from here:
> https://github.com/apache/impala/blob/master/be/src/service/impalad-main.cc
> https://github.com/apache/impala/blob/master/be/src/catalog/catalogd-main.cc
> but probably most of the time is spent in Java code.
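
The idea in the description of moving independent startup tasks to separate threads can be sketched generically as follows. This is a minimal illustration under stated assumptions, not Impala's startup code and not the change that resolved this issue: the task names and durations are invented, each task is a stand-in that only sleeps, and the tasks are assumed to have no dependencies on each other.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def make_task(name, seconds):
    """Build a stand-in startup task that just sleeps for `seconds`."""
    def task():
        time.sleep(seconds)
        return name
    return task


# Hypothetical, mutually independent startup steps (names and durations invented).
TASKS = [
    make_task("init_jvm", 0.10),
    make_task("load_flags", 0.05),
    make_task("start_webserver", 0.05),
]


def run_serial(tasks):
    """Run tasks one after another, as a single-threaded startup would."""
    start = time.perf_counter()
    done = [task() for task in tasks]
    return done, time.perf_counter() - start


def run_parallel(tasks):
    """Run independent tasks on separate threads and wait for all of them."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(task) for task in tasks]
        # result() blocks until the task finishes and re-raises any exception,
        # so startup continues only after every task has succeeded.
        done = [future.result() for future in futures]
    return done, time.perf_counter() - start


serial_done, serial_secs = run_serial(TASKS)
parallel_done, parallel_secs = run_parallel(TASKS)
print(f"serial:   {serial_secs:.2f}s {serial_done}")
print(f"parallel: {parallel_secs:.2f}s {parallel_done}")
```

With independent tasks the parallel wall time approaches the duration of the slowest task rather than the sum of all of them. Tasks that depend on each other would still need explicit ordering, which is why identifying the truly independent steps comes first.
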



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
