[
https://issues.apache.org/jira/browse/IMPALA-12024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760243#comment-17760243
]
ASF subversion and git services commented on IMPALA-12024:
----------------------------------------------------------
Commit 27d37a60c954d7734cfaea35e73939d5c7681d55 in impala's branch
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=27d37a60c ]
IMPALA-12024: Add catalog profile for createTable
DDL/DMLs might spend a long time in the execution in catalogd.
Currently, the profiles just have a counter of CatalogOpExecTimer. We
need more items to show the details of how they are executed in
catalogd.
As the first step, this patch adds the profile timeline for how the
createTable DDL is executed in Catalog Server. Also added more events in
the existing "Query Timeline" for when the CatalogDdlRequest finished
and when the catalog updates are applied.
To implement this, a 'profile' field is added in TDdlExecResponse to
carry the execution counters and timeline in catalogd. Currently, we
just uses the timeline of it. We can add more counters in the future.
Several methods add a timeline parameter to mark the progress in them.
Timeline events are added after each RPC finished.
Here is an example when HMS is hanging for 26s in a CTAS. I used gdb to
attach to HMS as the JIRA description mentioned. In the timeline, we can
see the time is spent in the first HMS RPC that fetching the current HMS
event id:
Catalog Server Operation: 26s560ms
- Got metastoreDdlLock: 163.977us (163.977us)
- Got Metastore client: 166.339us (2.362us)
- Got current Metastore event id 8355270: 26s494ms (26s494ms)
- Created table in Metastore: 26s558ms (63.507ms)
- Fetched event batch from Metastore: 26s559ms (1.155ms)
- Created table in catalog cache: 26s560ms (1.164ms)
- DDL finished: 26s560ms (65.538us)
Query Compilation: 164.257ms
- Metadata of all 1 tables cached: 10.535ms (10.535ms)
- Analysis finished: 118.324ms (107.788ms)
- Authorization finished (noop): 118.489ms (164.626us)
- Value transfer graph computed: 118.830ms (341.792us)
- Single node plan created: 150.150ms (31.319ms)
- Runtime filters computed: 150.254ms (103.529us)
- Distributed plan created: 151.832ms (1.578ms)
- Planning finished: 164.257ms (12.425ms)
Query Timeline: 27s304ms
- Query submitted: 129.658us (129.658us)
- Planning finished: 170.224ms (170.095ms)
- CatalogDdlRequest finished: 26s731ms (26s561ms)
- Applied catalog updates from DDL: 26s740ms (8.752ms)
- Submit for admission: 26s740ms (22.233us)
- Completed admission: 26s740ms (286.295us)
- Ready to start on 3 backends: 26s740ms (155.916us)
- All 3 execution backends (3 fragment instances) started: 26s751ms
(10.864ms)
- Last row fetched: 26s920ms (168.226ms)
- Released admission control resources: 26s920ms (27.635us)
- DML data written: 26s920ms (126.369us)
- Applied catalog updates from DDL: 26s985ms (65.354ms)
- DML Metastore update finished: 26s985ms (30.343us)
- Rows available: 26s985ms (27.050us)
- Unregister query: 27s304ms (318.661ms)
An example of creating a Kudu table:
Catalog Server Operation: 1s730ms
- Got Metastore client: 113.403us (113.403us)
- Got current Metastore event id 8355276: 974.500us (861.097us)
- Got Kudu client: 212.123ms (211.148ms)
- Got kuduDdlLock: 212.128ms (4.680us)
- Checked table existence in Kudu: 850.786ms (638.658ms)
- Created table in Kudu: 1s623ms (772.379ms)
- Got metastoreDdlLock: 1s623ms (397.305us)
- Got Metastore client: 1s623ms (7.813us)
- Checked table existence in Metastore: 1s648ms (25.154ms)
- Created table in Metastore: 1s725ms (76.348ms)
- Fetched event batch from Metastore: 1s728ms (3.004ms)
- Created table in catalog cache: 1s730ms (2.024ms)
- DDL finished: 1s730ms (84.448us)
An example of creating an Iceberg table:
Catalog Server Operation: 1s573ms
- Got Metastore client: 141.799us (141.799us)
- Checked table existence in Metastore: 2.957ms (2.815ms)
- Got current Metastore event id 8355277: 3.669ms (712.475us)
- Created table using Iceberg Catalog HIVE_CATALOG: 1s379ms (1s375ms)
- Fetched event batch from Metastore: 1s381ms (2.188ms)
- Created table in catalog cache: 1s382ms (1.556ms)
- Set Iceberg table owner in Metastore: 1s573ms (190.262ms)
- DDL finished: 1s573ms (59.176us)
Tests:
- Add e2e tests to verify the DDL timeline events exist in profile
Change-Id: I3ebf591625e71391a5b23f56ddca8f0ae97b1efa
Reviewed-on: http://gerrit.cloudera.org:8080/20368
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> For CTAS, include time to create the table in the query profile timeline
> ------------------------------------------------------------------------
>
> Key: IMPALA-12024
> URL: https://issues.apache.org/jira/browse/IMPALA-12024
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 4.3.0
> Reporter: Joe McDonnell
> Assignee: Quanlong Huang
> Priority: Major
> Attachments: profile_7347b5216e8840b9_a19c3b0800000000.txt
>
>
> If the metastore is slow, catalog operations to create / drop tables can take
> a long time. For a CTAS statement, we don't have a clear indication in the
> profile that this is happening. Instead, it shows up as a gap between
> planning finished and submitting for admission:
> {noformat}
> Query Timeline: 12s696ms
> - Query submitted: 64.303us (64.303us)
> - Planning finished: 117.816ms (117.752ms)
> - Submit for admission: 12s360ms (12s243ms)
> ...{noformat}
> This may apply for other statement types.
> To reproduce with CTAS on an Impala devenv:
> {noformat}
> 1. Start Impala
> 2. In impala-shell, run "select * from functional.alltypes"
> 3. Find the metastore pid (jps -v | grep metastore)
> 4. sudo gdb attach ${metastore_pid}
> 5. In impala-shell, run "create table foo as select * from
> functional.alltypes"
> .. This will hang because it can't create the table, wait a bit
> 6. Back in the gdb session, detach from metastore and let the CTAS
> finish{noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]