[
https://issues.apache.org/jira/browse/IMPALA-9101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Quanlong Huang updated IMPALA-9101:
-----------------------------------
Description:
In {{CatalogOpExecutor.alterTable()}}, we call
{{addVersionsForInflightEvents()}} whether or not the AlterTable operation
changes anything. If nothing changes, no HMS RPCs are sent, so no event is
generated. The event processor ends up waiting on a nonexistent self-event.
All subsequent self-events are then treated as outside events, and unnecessary
REFRESH/INVALIDATE operations are performed on this table.
Code:
{code:java}
private void alterTable(TAlterTableParams params, TDdlExecResponse response)
    throws ImpalaException {
  ....
  tryLock(tbl);
  // Get a new catalog version to assign to the table being altered.
  long newCatalogVersion = catalog_.incrementAndGetCatalogVersion();
  addCatalogServiceIdentifiers(tbl, catalog_.getCatalogServiceId(),
      newCatalogVersion);
  ....
  // now that HMS alter operation has succeeded, add this version to list of
  // inflight events in catalog table if event processing is enabled
  catalog_.addVersionsForInflightEvents(tbl, newCatalogVersion);
  // ^ We should check before calling this.
}
{code}
Reproduce:
{code:sql}
create table testtbl (col int) partitioned by (p1 int, p2 int);
alter table testtbl add partition (p1=2,p2=6);
alter table testtbl add if not exists partition (p1=2,p2=6);
-- After this point, can't detect self-events on this table
alter table testtbl add partition (p1=2,p2=7);
{code}
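The reproduction steps above can be replayed against a toy model. This is only a sketch, not Impala's implementation: the class {{SelfEventSketch}}, its methods, and the flag {{checkBeforeRegistering}} are hypothetical names, and the matching rule (an event counts as a self-event only if it carries the oldest pending in-flight version) is an assumption made to illustrate "waiting on a nonexistent self-event".

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: names and the matching rule are hypothetical, not Impala's API.
class SelfEventSketch {
  // Versions of DDLs issued by this catalog that are still expected as HMS events.
  private final Deque<Long> inflightVersions = new ArrayDeque<>();
  private long catalogVersion = 0;

  // Simulates an ALTER TABLE. hmsChanged is false for a no-op such as
  // ADD IF NOT EXISTS PARTITION on an existing partition (no HMS RPC, no event).
  // Returns the version the resulting HMS event would carry, or -1 if no event.
  long alterTable(boolean hmsChanged, boolean checkBeforeRegistering) {
    long newVersion = ++catalogVersion;
    // Without the check, the version is registered even when nothing changed.
    if (hmsChanged || !checkBeforeRegistering) {
      inflightVersions.addLast(newVersion);
    }
    return hmsChanged ? newVersion : -1;
  }

  // Assumed rule: only an event carrying the oldest pending version is a self-event.
  boolean isSelfEvent(long eventVersion) {
    Long expected = inflightVersions.peekFirst();
    if (expected != null && expected == eventVersion) {
      inflightVersions.removeFirst();
      return true;
    }
    return false;
  }

  // Replays the three ALTER TABLE statements and reports whether the event for
  // the last ADD PARTITION is recognized as a self-event.
  static boolean lastEventRecognized(boolean checkBeforeRegistering) {
    SelfEventSketch tbl = new SelfEventSketch();
    long v1 = tbl.alterTable(true, checkBeforeRegistering);  // add partition (p1=2,p2=6)
    tbl.isSelfEvent(v1);                                     // event arrives and matches
    tbl.alterTable(false, checkBeforeRegistering);           // add if not exists: no-op
    long v3 = tbl.alterTable(true, checkBeforeRegistering);  // add partition (p1=2,p2=7)
    return tbl.isSelfEvent(v3);
  }

  public static void main(String[] args) {
    System.out.println("unconditional registration: " + lastEventRecognized(false));
    System.out.println("guarded registration: " + lastEventRecognized(true));
  }
}
```

In this model, the no-op statement leaves a version in the queue that no event will ever carry, so the following real event is misclassified as an outside event. Registering the version only when HMS was actually changed avoids the stale entry.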
Catalogd logs:
{code:bash}
I1029 07:41:15.310956 8546 HdfsTable.java:630] Loaded file and block metadata
for default.testtbl partitions: p1=2/p2=6
I1029 07:41:15.892410 8321 MetastoreEventsProcessor.java:480] Received 1
events. Start event id : 11463
I1029 07:41:15.895717 8321 MetastoreEvents.java:396] EventId: 11464 EventType:
ADD_PARTITION Creating event 11464 of type ADD_PARTITION on table
default.testtbl
I1029 07:41:15.940225 8321 MetastoreEvents.java:241] Total number of events
received: 1 Total number of events filtered out: 0
I1029 07:41:15.940414 8321 MetastoreEvents.java:385] EventId: 11464 EventType:
ADD_PARTITION Not processing the event as it is a self-event
#### Correctly recognize self-event ^^^^
I1029 07:41:16.829824 8329 catalog-server.cc:641] Collected update:
1:TABLE:default.testtbl, version=1385, original size=4438, compressed size=1216
I1029 07:41:16.831853 8329 catalog-server.cc:641] Collected update:
1:CATALOG_SERVICE_ID, version=1385, original size=60, compressed size=58
I1029 07:41:18.827137 8339 catalog-server.cc:337] A catalog update with 2
entries is assembled. Catalog version: 1385 Last sent catalog version: 1384
#### No events for adding partition p1=2,p2=6 again. But we still bump the
catalog version.
I1029 07:45:38.900974 8329 catalog-server.cc:641] Collected update:
1:CATALOG_SERVICE_ID, version=1386, original size=60, compressed size=58
I1029 07:45:40.899353 8339 catalog-server.cc:337] A catalog update with 1
entries is assembled. Catalog version: 1386 Last sent catalog version: 1385
#### Creating partition p1=2,p2=7
I1029 07:45:48.827221 8546 HdfsTable.java:630] Loaded file and block metadata
for default.testtbl partitions: p1=2/p2=7
I1029 07:45:48.904234 8329 catalog-server.cc:641] Collected update:
1:TABLE:default.testtbl, version=1387, original size=4886, compressed size=1251
I1029 07:45:48.905262 8329 catalog-server.cc:641] Collected update:
1:CATALOG_SERVICE_ID, version=1387, original size=60, compressed size=58
I1029 07:45:49.523567 8321 MetastoreEventsProcessor.java:480] Received 1
events. Start event id : 11464
I1029 07:45:49.524150 8321 MetastoreEvents.java:396] EventId: 11465 EventType:
ADD_PARTITION Creating event 11465 of type ADD_PARTITION on table
default.testtbl
I1029 07:45:49.527262 8321 MetastoreEvents.java:241] Total number of events
received: 1 Total number of events filtered out: 0
I1029 07:45:49.530278 8321 MetastoreEvents.java:385] EventId: 11465 EventType:
ADD_PARTITION Trying to refresh 1 partitions added to table default.testtbl in
the event
I1029 07:45:49.531026 8321 CatalogServiceCatalog.java:2572] Refreshing
partition metadata: default.testtbl p1=2/p2=7 (processing ADD_PARTITION event
from HMS)
#### Unnecessary REFRESH ^^^^
I1029 07:45:49.604936 8321 HdfsTable.java:630] Loaded file and block metadata
for default.testtbl partitions: p1=2/p2=7
I1029 07:45:49.605069 8321 CatalogServiceCatalog.java:2594] Refreshed
partition metadata: default.testtbl p1=2/p2=7
I1029 07:45:49.605273 8321 MetastoreEvents.java:385] EventId: 11465 EventType:
ADD_PARTITION Refreshed 1 partitions of table default.testtbl
I1029 07:45:50.901763 8339 catalog-server.cc:337] A catalog update with 2
entries is assembled. Catalog version: 1387 Last sent catalog version: 1386
I1029 07:45:50.904940 8329 catalog-server.cc:641] Collected update:
1:TABLE:default.testtbl, version=1388, original size=4886, compressed size=1251
I1029 07:45:50.905792 8329 catalog-server.cc:641] Collected update:
1:CATALOG_SERVICE_ID, version=1388, original size=60, compressed size=58
I1029 07:45:52.902602 8339 catalog-server.cc:337] A catalog update with 2
entries is assembled. Catalog version: 1388 Last sent catalog version: 1387
{code}
> Unnecessary REFRESH due to wrong self-event detection
> -----------------------------------------------------
>
> Key: IMPALA-9101
> URL: https://issues.apache.org/jira/browse/IMPALA-9101
> Project: IMPALA
> Issue Type: Bug
> Reporter: Quanlong Huang
> Priority: Minor