Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/22036 )
Change subject: IMPALA-13518: Show target name of COMMIT_TXN events in logs
......................................................................

Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/22036/1/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/22036/1/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@918
PS1, Line 918: tableNames_.add(tableWriteId.getDbName() + "." + tableWriteId.getTblName());
> What kind of memory overhead might this add?
We will have at most 1000 COMMIT_TXN events in memory (EVENTS_BATCH_SIZE_PER_RPC), so this might add 1000 table names. I think that's trivial.

Note that a COMMIT_TXN event usually has just 0 or 1 tables. I can't find a Hive statement that modifies two tables at once. The only way I can find to generate a COMMIT_TXN event with multiple tables is to use the HMS APIs to open a transaction and modify multiple tables within it, which is what I added in the new test. I don't think this is actually used in practice, e.g. in Spark.

--
To view, visit http://gerrit.cloudera.org:8080/22036
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4a7cb5e716453290866a4c3e74c0d269f621144f
Gerrit-Change-Number: 22036
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Comment-Date: Fri, 03 Jan 2025 02:11:38 +0000
Gerrit-HasComments: Yes
