Sourabh Goyal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
......................................................................


Patch Set 12:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/Catalog.java@99
PS12, Line 99:   protected final ConcurrentHashMap<Long, Set<TableWriteId>> 
txnToWriteIds_ =
nit: It would be good to document why we need to maintain this list and why 
getting writeId list from commit/Abort txn is not enough. Please ignore if it 
is already documented somewhere else.


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/Catalog.java@776
PS12, Line 776:     return txnToWriteIds_.getOrDefault(txnId, 
Collections.emptySet());
return an immutable set so that caller can't modify it?


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2585
PS12, Line 2585:         LOG.debug("Not reloading the table {}.{} since it is 
recreated.",
nit: Can add more details to the debug message like eventId and table's 
createEventId.


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3558
PS12, Line 3558:       tbl = getTable(dbName, tblName);
Looks like we skip adding writeIds for an incomplete table. If yes, consider 
the following scenario:
1. AllocateWriteId event was processed on a table which was incomplete at that 
time.
2. Abort/Commit Txn event was processed on a fully loaded table. Since open 
write ids were not added in #1, we would not mark those as committed in table.

For allocate write Id event, should we reload a table if it is incomplete 
before processing the event?


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3564
PS12, Line 3564:         || (createEventId > 0 && tbl.getCreateEventId() != 
createEventId)) {
If I understand correctly the createEventId check is to make sure we process 
event on a table with createEventId less than this event. If so, we can pass 
the current event id in addWriteIdsToTable and check

tbl.createEventId() > 0 && currentEventId > tbl.getCreateEventId()

Also if would be helpful to log the reasons of not adding writeIds separately.


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3580
PS12, Line 3580:         LOG.debug("Not adding write ids to table {}.{} since 
its valid write id list is "
nit: can add non-acid to acid table upgrade details in the log message.


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2737
PS12, Line 2737:     Preconditions.checkArgument(partsFromEvent != null
nit: since this api is modifying the table. It would be good to add a 
preconditions check for table's writeLock be held by current thread. Same for 
other variations of reloadPartitions.


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2771
PS12, Line 2771:       Map<Partition, HdfsPartition> hmsPartsToHdfsParts, 
boolean loadFileMetadata)
nit: add loadFileMetadata in the method description.


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2901
PS12, Line 2901:     return validWriteIds_;
Exposing a reference to a mutable write Id list can introduce subtle race 
condition issues specially when MutableWriteIdList is not thread safe. Should 
we return an immutable copy of validWriteIds_ here?


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2913
PS12, Line 2913:     Preconditions.checkState(validWriteIds_ != null);
Before modifying mutableValidWriteIdList, we should assert a check that 
writeLock is held by current thread. That way we can ensure that validWriteIds_ 
is modified only under table lock.


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2922
PS12, Line 2922:         throw new CatalogException("Unknown write id status " 
+ status);
nit: Add full table name in the exception?


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/TableWriteId.java
File fe/src/main/java/org/apache/impala/catalog/TableWriteId.java:

http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/catalog/TableWriteId.java@32
PS12, Line 32:   private Long createEventId_;
nit: data type to  primitive type i.e long ? Should consume less memory.


http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java
File 
fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java:

http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java@248
PS11, Line 248:           highWatermark);
We should assert a check here that give witeId should be in exceptions list and 
be open. Otherwise user should get an error that it is trying to add a open 
write id which is already aborted or committed.


http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java@252
PS11, Line 252:       exceptions.add(currentId);
IMO, we should update the abortedBits list as well (to keep exceptions and 
abortedBits in sync) and not rely on caller of this method to set the bits 
appropriately


http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java@289
PS11, Line 289:           + "them as open first", highWatermark, maxWriteId);
looks like we missed updating open write ids list? Also it would be good to 
update MutableValidReaderWriteIdListTest to include the scenarios introduced in 
this patch.


http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/hive/common/MutableValidReaderWriteIdList.java@299
PS11, Line 299:         Preconditions.checkState(!abortedBits.get(idx));
nit: it would be good to add an error message describing that writeId was 
expected to be open but is aborted. Same for other preconditions checks.


http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4269
PS11, Line 4269:       LOG.info("Could not reload {} partitions of table {}", 
partsFromEvent.size(),
shouldn't we rethrow this exception?


http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4353
PS11, Line 4353:       // set write id as committed before reload the 
partitions so that we can get
> The writeIdList is used to refresh file metadata. If a write id is not comm
Errors in event processor don't render catalogD unusable. I was referring to 
the following scenario for a transactional table:
1. Commit Txn event updated write id list
2. reloadPartitionsFromEvent threw an error

As a result of #2, the table write id list is updated but file metadata is 
still pointing to the one before this method call. As a result of it, any 
subsequent getTable() requests might skip table reload if the cached 
writeIdList  > writeIdList passed in getTableRequest and silently return a 
stale filemetadata.

If table goes into an inconsistent state from other places in the code, that 
should not mean we introduce one more place of inconsistency specially if that 
is avoidable.

IMHO we should catch exception from reloadPartitionsFromEvent and revert adding 
of openWriteIds to table's writeIdList.

Thoughts?


http://gerrit.cloudera.org:8080/#/c/17858/11/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4365
PS11, Line 4365:       LOG.info("Could not reload {} partitions of table {}", 
partsFromEvent.size(),
should we not rethrow exception? Also change info to error.


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4342
PS12, Line 4342:         if (!writeIdList.isWriteIdValid(writeId)) {
if the writeId is not committed, it means it could either be open or aborted. 
Should we also check if it is a open txn? Because only then we can commit it.


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2499
PS12, Line 2499:     testInsertIntoTransactionalTable("test_transactional", 
true, false);
nit: can include test name in the table name for example: 
test_abort_transactional, test_abort_transactional_part. Same for other tests.


http://gerrit.cloudera.org:8080/#/c/17858/12/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@2818
PS12, Line 2818:     params.put("transactional", "true");
instead of setting table params, you can enhance getTestTable i.e pass table 
type (external/managed) and set that in tbl object.



--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 12
Gerrit-Owner: Yu-Wen Lai <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Fucun Chu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Sourabh Goyal <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Reviewer: Yu-Wen Lai <[email protected]>
Gerrit-Comment-Date: Tue, 02 Nov 2021 17:22:18 +0000
Gerrit-HasComments: Yes

Reply via email to