Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21922 )

Change subject: IMPALA-13438  Batch the `addHmsPartitions` operations in 
`alterTableRecoverPartitions`.
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21922/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21922/2//COMMIT_MSG@20
PS2, Line 20: An analysis of the memory dump using the MemoryAnalyzer revealed that the temporary object
            : contained a massive number of FieldSchema objects (2000 columns * 50,000 partitions),
            : which overwhelmed memory resources.
> I have reproduced the issue on the latest master branch. Although IMPALA-11
I see. I checked the code: when the event processor is disabled, the Partition 
objects come from MetaStoreUtil.addPartitions():
https://github.com/apache/impala/blob/2535e79491078a0353dbeed1a094e91366906149/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L5319-L5321

In that method, each partition's FieldSchema list is replaced with a reference 
to the table's list, so we shouldn't see lots of FieldSchema objects.
https://github.com/apache/impala/blob/2535e79491078a0353dbeed1a094e91366906149/fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java#L248
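
To illustrate why sharing the reference matters, here is a minimal sketch. It 
does not use the real HMS classes: Column and Part are hypothetical stand-ins 
for FieldSchema and Partition, shareTableSchema() is a hypothetical helper that 
only mimics the idea of pointing partitions at the table's list, and the sizes 
are scaled down from the 2000 columns * 50,000 partitions in the commit message.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class SharedSchemaSketch {
  // Hypothetical stand-in for the HMS FieldSchema class.
  static final class Column {
    final String name;
    final String type;
    Column(String name, String type) { this.name = name; this.type = type; }
  }

  // Hypothetical stand-in for an HMS Partition; only the column list matters here.
  static final class Part {
    List<Column> cols;
    Part(List<Column> cols) { this.cols = cols; }
  }

  // Builds a fresh list of n distinct Column objects.
  static List<Column> makeCols(int n) {
    List<Column> cols = new ArrayList<>(n);
    for (int i = 0; i < n; i++) cols.add(new Column("c" + i, "string"));
    return cols;
  }

  // Points every partition at the table-level list, so the per-partition
  // copies become unreachable and can be garbage collected.
  static void shareTableSchema(List<Column> tableCols, List<Part> parts) {
    for (Part p : parts) p.cols = tableCols;
  }

  // Counts distinct Column instances (by identity) reachable from the partitions.
  static int countDistinctColumns(List<Part> parts) {
    Map<Column, Boolean> seen = new IdentityHashMap<>();
    for (Part p : parts) {
      for (Column c : p.cols) seen.put(c, Boolean.TRUE);
    }
    return seen.size();
  }

  public static void main(String[] args) {
    int numCols = 20;   // scaled down from 2000
    int numParts = 50;  // scaled down from 50,000
    List<Column> tableCols = makeCols(numCols);
    List<Part> parts = new ArrayList<>();
    // Each partition starts with its own private copy of the schema.
    for (int i = 0; i < numParts; i++) parts.add(new Part(makeCols(numCols)));

    System.out.println("before: " + countDistinctColumns(parts));  // 20 * 50 = 1000
    shareTableSchema(tableCols, parts);
    System.out.println("after: " + countDistinctColumns(parts));   // 20
  }
}
```

With private copies, the number of live column objects grows as columns * 
partitions; with the shared list it stays at the column count regardless of 
how many partitions there are.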

However, when the event processor is enabled, the Partition objects are 
extracted from HMS events instead:
https://github.com/apache/impala/blob/2535e79491078a0353dbeed1a094e91366906149/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L5343
In the event-processor case, the partition list returned by 
MetaStoreUtil.addPartitions() is not used as the result. I think we should 
modify getPartitionsFromEvent() to use the table-level list of FieldSchema:
https://github.com/apache/impala/blob/2535e79491078a0353dbeed1a094e91366906149/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L2479


http://gerrit.cloudera.org:8080/#/c/21922/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/21922/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2483
PS2, Line 2483:           partitionToEventId.put(part, eventId);
As commented above, I think we should trim the Partitions here using 
MetaStoreUtil.replaceSchemaFromTable(), so that the per-partition FieldSchema 
objects are no longer referenced and can be GCed.
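
A rough sketch of the shape of the change, under stated assumptions: Part and 
AddPartitionEvent are hypothetical stand-ins for the HMS Partition and the 
add-partition event, collectTrimmed() is a hypothetical name, and the 
assignment to p.cols stands in for the call to 
MetaStoreUtil.replaceSchemaFromTable(), whose real signature is not reproduced 
here. Only the partitionToEventId.put(part, eventId) line is taken from the 
patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TrimEventPartitionsSketch {
  // Hypothetical stand-in for an HMS Partition carrying its own column list.
  static final class Part {
    final String name;
    List<String> cols;
    Part(String name, List<String> cols) { this.name = name; this.cols = cols; }
  }

  // Hypothetical stand-in for an add-partition event received from HMS.
  static final class AddPartitionEvent {
    final long eventId;
    final List<Part> parts;
    AddPartitionEvent(long eventId, List<Part> parts) {
      this.eventId = eventId;
      this.parts = parts;
    }
  }

  // Collects partitions from events, swapping each partition's private column
  // list for the table-level one before the partition is stored in the map.
  static Map<Part, Long> collectTrimmed(
      List<AddPartitionEvent> events, List<String> tableCols) {
    Map<Part, Long> partitionToEventId = new LinkedHashMap<>();
    for (AddPartitionEvent event : events) {
      long eventId = event.eventId;
      for (Part part : event.parts) {
        part.cols = tableCols;  // stand-in for the schema-replacement call
        partitionToEventId.put(part, eventId);
      }
    }
    return partitionToEventId;
  }

  public static void main(String[] args) {
    List<String> tableCols = List.of("c0", "c1", "c2");
    List<Part> parts = new ArrayList<>();
    // Each partition arrives with its own copy of the column list.
    parts.add(new Part("p=1", new ArrayList<>(tableCols)));
    parts.add(new Part("p=2", new ArrayList<>(tableCols)));
    List<AddPartitionEvent> events = List.of(new AddPartitionEvent(42L, parts));

    Map<Part, Long> result = collectTrimmed(events, tableCols);
    for (Map.Entry<Part, Long> e : result.entrySet()) {
      System.out.println(e.getKey().name + " eventId=" + e.getValue()
          + " sharesTableCols=" + (e.getKey().cols == tableCols));
    }
  }
}
```

Trimming before the put() means the map never holds a partition that still 
references its own copy of the schema.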



--
To view, visit http://gerrit.cloudera.org:8080/21922
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13aaad8a915f75fbe808bf96b1cf891312b1a592
Gerrit-Change-Number: 21922
Gerrit-PatchSet: 2
Gerrit-Owner: zhangqianqiong <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: zhangqianqiong <[email protected]>
Gerrit-Comment-Date: Mon, 14 Oct 2024 06:57:57 +0000
Gerrit-HasComments: Yes
