zhangqianqiong created IMPALA-13438:
---------------------------------------

             Summary: In alterTableRecoverPartitions, we should batch the 
addHmsPartitions operations.
                 Key: IMPALA-13438
                 URL: https://issues.apache.org/jira/browse/IMPALA-13438
             Project: IMPALA
          Issue Type: Improvement
          Components: Catalog
    Affects Versions: Impala 4.4.1, Impala 4.4.0, Impala 4.3.0, Impala 4.1.2, 
Impala 4.1.1, Impala 4.2.0, Impala 4.1.0
            Reporter: zhangqianqiong
            Assignee: zhangqianqiong


After the change 'IMPALA-10502: Handle CREATE/DROP events correctly' was 
merged, the {{alterTableRecoverPartitions}} method changed from batching the 
{{add_partitions}} calls to invoking {{addHmsPartitions}} once for all 
partitions. For tables with a huge number of partitions, this creates a huge 
temporary {{List<Partition>}} object and can lead to an 
{{OutOfMemoryError}}.

In my test environment, the catalogd JVM {{Xmx}} was set to 2GB. Running the 
end-to-end test {{custom_cluster/test_wide_table_operations.py}} on a table 
with 2000 columns and 50,000 partitions caused catalogd to hit a Java heap 
space {{OutOfMemoryError}} during the {{recover partitions}} operation.

An analysis of the memory dump with MemoryAnalyzer showed that the temporary 
object held a massive number of {{FieldSchema}} objects (2000 columns * 50,000 
partitions = 100 million), which exhausted the heap.

To resolve this issue, we propose batching the {{addHmsPartitions}} calls so 
that the temporary objects can be released after each batch. This change was 
tested and verified to resolve the {{OutOfMemoryError}} when handling a large 
number of partitions.
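
A minimal sketch of the batching idea, not the actual patch: the class name 
{{BatchedPartitionAdd}}, the helper {{forEachBatch}}, and the batch size of 500 
are hypothetical and chosen only for illustration. The callback stands in for 
the code that would build the HMS partition objects for one batch and pass them 
to {{addHmsPartitions}}, so at most one batch of temporary objects is reachable 
at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchedPartitionAdd {
  // Hypothetical batch size; the value used by the real fix is not stated
  // in this issue.
  public static final int MAX_PARTITIONS_PER_BATCH = 500;

  /**
   * Splits 'items' into consecutive batches of at most 'batchSize' elements
   * and hands each batch to 'addBatch'. Returns the number of batches.
   *
   * 'items' should hold lightweight descriptors (e.g. partition names). The
   * heavy per-partition objects are meant to be built inside 'addBatch', so
   * they become unreachable once the callback returns.
   */
  public static <T> int forEachBatch(
      List<T> items, int batchSize, Consumer<List<T>> addBatch) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    int numBatches = 0;
    for (int start = 0; start < items.size(); start += batchSize) {
      int end = Math.min(start + batchSize, items.size());
      // subList() returns a view, so no extra copy of the input is made.
      addBatch.accept(items.subList(start, end));
      numBatches++;
    }
    return numBatches;
  }

  public static void main(String[] args) {
    List<String> partitionNames = new ArrayList<>();
    for (int i = 0; i < 1200; i++) partitionNames.add("p=" + i);

    List<Integer> batchSizes = new ArrayList<>();
    // In catalogd, the callback would build the HMS partitions for this
    // batch and add them; here it only records the batch size.
    int numBatches = forEachBatch(partitionNames, MAX_PARTITIONS_PER_BATCH,
        batch -> batchSizes.add(batch.size()));

    // Prints: 3 batches: [500, 500, 200]
    System.out.println(numBatches + " batches: " + batchSizes);
  }
}
```

With the table from the test above and this assumed batch size, the number of 
{{FieldSchema}} objects alive at once would drop from 2000 * 50,000 to roughly 
2000 * 500, provided each batch's objects are not retained after the call.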



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
