Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16602 )

Change subject: IMPALA-10243: ConcurrentModificationException during parallel 
INSERTs
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16602/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16602/1//COMMIT_MSG@9
PS1, Line 9: ConcurrentModificationException
Can you add the full callstack somewhere,e.g. in the Jira? It would be good to 
see where it was thrown exactly.


http://gerrit.cloudera.org:8080/#/c/16602/1//COMMIT_MSG@15
PS1, Line 15: This can potentially happen in CatalogOpExecutor.updateCatalog() 
which
            : updates the catalog version and unsets table column statistics.
tldr: I would prefer to threat msTable as immutable and deepcopy it where we 
modify it, not where we just read it

This happens in unsetTableColStats(), right?

IMO the real bug is not in toThrift(), but in updateCatalog(), as it calls 
unsetTableColStats with the "original" msTable instead of a copy. The general 
pattern in CatalogOpExecutor is that if the msTable is modified, then it is 
deep copied first, and the change will be "applied" later when 
loadTableMetadata() reloads the schema from HMS. This is also problematic 
because the table in the catalog will change even if updateCatalog() will fail 
somewhere later and will deviate from the HMS version.

My idea for the fix is to do a deepCopy of msTable early in updateCatalog() and 
apply all changes to that copy. loadTableMetadata() will be called later and 
will reload the table at 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L4619

Note that it is possible that the whole unsetTableColStats stuff is no longer 
necessary, as Hive know uses the "engine" field in stats and lives in a 
separate world than Impala, so we shouldn't be able to cause correctness issues 
by inserting rows without invalidating stats.



--
To view, visit http://gerrit.cloudera.org:8080/16602
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie656925d764d5eb26c318703ca425529ecf7a3a3
Gerrit-Change-Number: 16602
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Thu, 15 Oct 2020 10:23:48 +0000
Gerrit-HasComments: Yes

Reply via email to