Quanlong Huang created IMPALA-14677:
---------------------------------------
Summary: Self-event evaluation of batched events only checks the
parameters of the first event
Key: IMPALA-14677
URL: https://issues.apache.org/jira/browse/IMPALA-14677
Project: IMPALA
Issue Type: Bug
Components: Catalog
Reporter: Quanlong Huang
In BatchPartitionEvent, when the batched event type is not InsertEvent, the
self-event check uses only the parameters of the first event (read from
baseEvent_ or from the first event's msTbl_). If only the first several events
in the batch are self-events, the non-self-events that follow them will be
incorrectly skipped.
{code:java}
protected SelfEventContext getSelfEventContext() {
  List<List<TPartitionKeyValue>> partitionKeyValues = new ArrayList<>();
  List<Long> eventIds = new ArrayList<>();
  // We treat insert event as a special case since the self-event context for an
  // insert event is generated differently using the eventIds.
  boolean isInsertEvent = baseEvent_ instanceof InsertEvent;
  for (T event : batchedEvents_) {
    partitionKeyValues.add(
        getTPartitionSpecFromHmsPartition(event.msTbl_,
            event.getPartitionForBatching()));
    eventIds.add(event.getEventId());
  }
  return new SelfEventContext(dbName_, tblName_, partitionKeyValues,
      // This might be wrong if not all the events have the same
      // catalog-service-id and catalog-version in the parameters.
      baseEvent_.getPartitionForBatching().getParameters(),
      isInsertEvent ? eventIds : null);
}{code}
[https://github.com/apache/impala/blob/83036b13e5ff5531444af91fd9e46a05d5547455/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L3291]
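To illustrate the failure mode, here is a minimal standalone sketch. It is not Impala code: the class, the method names, the parameter keys ("catalog-service-id", "catalog-version"), and the set of in-flight versions are simplified stand-ins assumed for this example. It compares a check that only looks at the first event's parameters with one that looks at every event in the batch.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of batched self-event evaluation; not the Impala classes.
class BatchSelfEventSketch {
  // Illustrative parameter keys, assumed for this sketch only.
  static final String SERVICE_ID_KEY = "catalog-service-id";
  static final String VERSION_KEY = "catalog-version";

  // An event counts as a self-event here if its parameters carry this
  // catalog's service id and a version that is still in flight.
  static boolean isSelfEvent(Map<String, String> params, String localServiceId,
      Set<Long> inFlightVersions) {
    String serviceId = params.get(SERVICE_ID_KEY);
    String version = params.get(VERSION_KEY);
    if (serviceId == null || version == null) return false;
    return serviceId.equals(localServiceId)
        && inFlightVersions.contains(Long.parseLong(version));
  }

  // Mirrors the reported behaviour: only the first event's parameters are used.
  static boolean firstOnly(List<Map<String, String>> batch, String localServiceId,
      Set<Long> inFlightVersions) {
    return isSelfEvent(batch.get(0), localServiceId, inFlightVersions);
  }

  // One possible alternative: the batch is a self-event only if every event is.
  static boolean allEvents(List<Map<String, String>> batch, String localServiceId,
      Set<Long> inFlightVersions) {
    return batch.stream()
        .allMatch(p -> isSelfEvent(p, localServiceId, inFlightVersions));
  }

  public static void main(String[] args) {
    // The first event was generated by this catalog; the second was not.
    List<Map<String, String>> batch = List.of(
        Map.of(SERVICE_ID_KEY, "svc-1", VERSION_KEY, "10"),
        Map.of(SERVICE_ID_KEY, "svc-2", VERSION_KEY, "7"));
    Set<Long> inFlight = Set.of(10L);
    // The first-only check reports the whole batch as a self-event, so the
    // second (external) event would be skipped.
    System.out.println("first-only check: " + firstOnly(batch, "svc-1", inFlight));
    System.out.println("per-event check:  " + allEvents(batch, "svc-1", inFlight));
  }
}
```

With this batch the first-only check returns true while the per-event check returns false. Whether the actual fix should evaluate each event separately or split the batch at the first non-self-event is left open here.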
CC [~VenuReddy] , [~hemanth619]