Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20145 )

Change subject: IMPALA-12256: Fix DDLs losing partition-level create event ids 
in reloads
......................................................................


Patch Set 4: Code-Review+1

(3 comments)

Thanks for the analysis and the prompt fix Quanlong!

I only have some minor questions mostly because I know very little around this 
area.

http://gerrit.cloudera.org:8080/#/c/20145/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/20145/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@1326
PS4, Line 1326: loss
nit: Should we use "lose" instead?


http://gerrit.cloudera.org:8080/#/c/20145/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/20145/4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2904
PS4, Line 2904:         
partBuilder.setCreateEventId(oldPartition.getCreateEventId());
              :         
partBuilder.setLastCompactionId(oldPartition.getLastCompactionId());
I have 2 questions.

1. At 
https://gerrit.cloudera.org/c/20145/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java#1329
 we also populate the field 'lastRefreshEventId_'. I wonder if we also need to 
pick up the value of 'lastRefreshEventId_' like the following. Or we do not 
really need to do this because we already populate the field 
'lastRefreshEventId_' somewhere else and this field will be available when the 
event processor is processing a reload event (e.g., at 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L3891)?

        partBuilder.setLastRefreshEventId(oldPartition.getLastRefreshEventId());

2. Here we also pick up the value of 'lastCompactionId_'. Will this piece of 
information be required to fix the issue reported in this  JIRA? Or we just 
want to keep this field in sync with the same field in the partition that was 
just updated by the DDL command?


http://gerrit.cloudera.org:8080/#/c/20145/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20145/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4544
PS4, Line 4544:     // if the partition has been created since the event was 
generated, skip
              :     // the stale event.
              :     boolean isStale = hdfsPartition.getCreateEventId() > 
eventId;
              :     LOG.info("{} partition {} of table {} since it's create 
event id {} is {} than " +
              :             "eventid {}",
              :         isStale ? "Not dropping" : "Dropping",
              :         hdfsPartition.getPartitionName(), 
hdfsTable.getFullName(),
              :         hdfsPartition.getCreateEventId(), isStale ? "higher" : 
"not higher",
              :         eventId);
              :     return !isStale;
              :
At 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L3891
 we have the following check within 
CatalogServiceCatalog.java#isPartitionLoadedAfterEvent(). I wonder if it would 
also be good for us to log the reload event?

      HdfsPartition hdfsPartition = getHdfsPartition(dbName, tableName, 
msPartition);
      if (hdfsPartition.getLastRefreshEventId() > eventId) {
        return true;
      }



--
To view, visit http://gerrit.cloudera.org:8080/20145
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I052faa093bda69fb16db0d424c1478bba103dad9
Gerrit-Change-Number: 20145
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Comment-Date: Sun, 02 Jul 2023 00:06:33 +0000
Gerrit-HasComments: Yes

Reply via email to