[ 
https://issues.apache.org/jira/browse/IMPALA-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang resolved IMPALA-11580.
-------------------------------------
    Fix Version/s: Impala 4.2.0
       Resolution: Fixed

> Memory leak in legacy catalog mode when applying incremental partition updates
> ------------------------------------------------------------------------------
>
>                 Key: IMPALA-11580
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11580
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 4.0.0, Impala 4.1.0
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>             Fix For: Impala 4.2.0
>
>
> Since IMPALA-3127, catalogd propagates incremental metadata updates at the 
> partition level. In the legacy catalog mode, while applying these updates, 
> impalad reuses the existing partition objects and moves them to a new 
> HdfsTable object. However, the partition objects are immutable, so their 
> references to the old table object remain unchanged. The JVM cannot collect 
> the stale table objects because the partitions still hold active references 
> to them.
> To reproduce the issue, create a partitioned table and add new partitions to 
> it at a rate close to the catalog update frequency (2s by default):
> {code:sql}
> impala-shell> drop table if exists my_part_tbl;
> impala-shell> create external table my_part_tbl (id int) partitioned by (p int) stored as textfile;
> {code}
> Add a partition every 2s:
> {code:bash}
> for i in `seq 1000`; do impala-shell.sh -q "alter table my_part_tbl add partition (p=$i)"; sleep 2; done
> {code}
> Then monitor the live table objects in impalad JVM:
> {code:bash}
> for p in `pidof impalad`; do echo PID=$p; jmap -histo:live $p | grep 'org.apache.impala.catalog.HdfsTable$'; done
> {code}
> Only one impalad keeps its HdfsTable instance count unchanged. The count in 
> the other two impalads keeps growing.
> {noformat}
> $ for p in `pidof impalad`; do echo PID=$p; jmap -histo:live $p | grep 'org.apache.impala.catalog.HdfsTable$'; done
> PID=27677
>  136:            14           3360  org.apache.impala.catalog.HdfsTable
> PID=27671
>  136:            14           3360  org.apache.impala.catalog.HdfsTable
> PID=27668
>  474:             1            240  org.apache.impala.catalog.HdfsTable
> $ for p in `pidof impalad`; do echo PID=$p; jmap -histo:live $p | grep 'org.apache.impala.catalog.HdfsTable$'; done
> PID=27677
>  113:            21           5040  org.apache.impala.catalog.HdfsTable
> PID=27671
>  113:            21           5040  org.apache.impala.catalog.HdfsTable
> PID=27668
>  474:             1            240  org.apache.impala.catalog.HdfsTable
> {noformat}
> This only happens in the legacy catalog mode and doesn't occur in the 
> local-catalog mode. To work around this, set the startup flag 
> {{--enable_incremental_metadata_updates=false}} on catalogd to disable 
> incremental catalog updates.
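
The retention pattern described in the quoted report can be sketched in a few lines of Java. This is only an illustration under stated assumptions: SketchTable, SketchPartition and StaleTableLeakSketch are hypothetical stand-ins, not Impala's real classes, and the reuse=false branch is just one conceivable way to avoid the stale back-references, not necessarily the fix that went into Impala 4.2.0. Each simulated update adds one partition and builds a new table object, then the sketch counts how many table objects stay reachable from the latest one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration only; not Impala's real classes.
final class SketchTable {
    final List<SketchPartition> partitions = new ArrayList<>();
}

final class SketchPartition {
    final SketchTable table; // immutable back-reference to the table that created it
    final int id;

    SketchPartition(SketchTable table, int id) {
        this.table = table;
        this.id = id;
    }
}

final class StaleTableLeakSketch {
    // Applies `updates` simulated incremental updates, each adding one partition.
    // reuse=true moves the existing partition objects into the new table (the
    // leaky pattern); reuse=false re-creates them so they point at the new table.
    static SketchTable applyUpdates(int updates, boolean reuse) {
        SketchTable current = new SketchTable();
        for (int i = 1; i <= updates; i++) {
            SketchTable next = new SketchTable();
            for (SketchPartition p : current.partitions) {
                next.partitions.add(reuse ? p : new SketchPartition(next, p.id));
            }
            next.partitions.add(new SketchPartition(next, i));
            current = next;
        }
        return current;
    }

    // Counts the distinct table objects reachable from `root` through partition
    // back-references, i.e. the tables a garbage collector would have to keep.
    static int reachableTables(SketchTable root) {
        Set<SketchTable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<SketchTable> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            SketchTable t = todo.pop();
            if (!seen.add(t)) {
                continue; // already visited
            }
            for (SketchPartition p : t.partitions) {
                todo.push(p.table);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("reused partitions:     " + reachableTables(applyUpdates(5, true)));
        System.out.println("re-created partitions: " + reachableTables(applyUpdates(5, false)));
    }
}
```

With reused partitions, every table version that introduced a partition stays pinned by that partition, so the reachable count grows with the number of updates. That matches the growing HdfsTable counts in the jmap output above. With re-created partitions, only the latest table remains reachable.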



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
