[ 
https://issues.apache.org/jira/browse/IMPALA-13491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18050093#comment-18050093
 ] 

Quanlong Huang commented on IMPALA-13491:
-----------------------------------------

The default should be 0 to match the existing unlimited behavior. Admins can 
tune it when catalogd hits OOM due to too many concurrent reloads. The OOM heap 
dump will show how many concurrent loads were running. The flag should be set 
to a value smaller than that count.
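
To make the proposed semantics concrete, here is a minimal sketch of a limiter 
in which 0 means unlimited and any positive value caps the number of loads 
running at once. The class name LoadLimiter and its methods are hypothetical 
and are not taken from the Impala code base; the actual patch may be structured 
quite differently.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical limiter for illustration; not Impala's actual implementation. */
final class LoadLimiter {
    // null means "unlimited", which corresponds to a flag value of 0.
    private final Semaphore permits;

    LoadLimiter(int maxConcurrentLoads) {
        if (maxConcurrentLoads < 0) {
            throw new IllegalArgumentException("limit must be >= 0");
        }
        // A fair semaphore serves waiting loads in arrival order.
        this.permits = maxConcurrentLoads == 0
            ? null
            : new Semaphore(maxConcurrentLoads, true);
    }

    /** Runs one load/refresh, blocking first if the limit is already reached. */
    <T> T run(Supplier<T> load) {
        if (permits == null) {
            return load.get(); // unlimited: existing behavior
        }
        // Uninterruptible acquire keeps the sketch free of checked exceptions;
        // real code would probably want an interruptible or timed acquire.
        permits.acquireUninterruptibly();
        try {
            return load.get();
        } finally {
            permits.release(); // always return the permit, even on failure
        }
    }

    /** Free slots right now; Integer.MAX_VALUE when unlimited. */
    int availablePermits() {
        return permits == null ? Integer.MAX_VALUE : permits.availablePermits();
    }

    public static void main(String[] args) {
        LoadLimiter limiter = new LoadLimiter(2);
        String result = limiter.run(() -> "loaded");
        System.out.println(result + ", free slots: " + limiter.availablePermits());
    }
}
```

With a default of 0 the semaphore is never created, so clusters that do not 
set the flag see no change in behavior.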

A better way is to provide a formula that calculates how many concurrent 
loads/reloads can be supported for a given JVM heap size. It should consider 
the metadata scale of the large tables, e.g. number of partitions, files, etc. 
Admins would then set the flag based on the formula. This needs more discussion 
and we can track it in another JIRA.
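
As a starting point for that discussion only, the shape of such a formula could 
be sketched as below: divide the share of the heap reserved for loading by an 
estimated per-load footprint derived from partition and file counts. Every name 
and constant here is an assumption made for illustration. In particular, the 
per-partition and per-file byte costs are placeholders, not measured Impala 
values, and would have to be calibrated from real heap dumps.

```java
/** Hypothetical sizing helper; all constants are unmeasured placeholders. */
final class ConcurrentLoadEstimate {
    // Assumed in-heap cost per metadata object. NOT measured Impala values.
    static final long ASSUMED_BYTES_PER_PARTITION = 2_048;
    static final long ASSUMED_BYTES_PER_FILE = 512;

    /** Rough working memory for loading one table of the given scale. */
    static long estimateLoadBytes(long partitions, long files) {
        return partitions * ASSUMED_BYTES_PER_PARTITION
            + files * ASSUMED_BYTES_PER_FILE;
    }

    /**
     * Number of loads of a "typical large table" that fit into the fraction
     * of the heap reserved for loading. Never returns 0, because 0 would
     * mean "unlimited" under the proposed flag semantics.
     */
    static int maxConcurrentLoads(long heapBytes, double loadFraction,
                                  long partitions, long files) {
        long budget = (long) (heapBytes * loadFraction);
        long perLoad = Math.max(1, estimateLoadBytes(partitions, files));
        long n = budget / perLoad;
        return (int) Math.max(1, Math.min(n, Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        // Heap size of the current JVM; catalogd's -Xmx would be used in practice.
        long heap = Runtime.getRuntime().maxMemory();
        System.out.println(maxConcurrentLoads(heap, 0.5, 100_000, 1_000_000));
    }
}
```

Under these placeholder costs, an 8 GiB heap with half of it reserved for 
loading and tables of 100,000 partitions and 1,000,000 files gives a limit of 
5. The number itself is not meaningful until the constants are calibrated.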

> Add config on catalogd for controlling the number of concurrent 
> loading/refresh commands
> ----------------------------------------------------------------------------------------
>
>                 Key: IMPALA-13491
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13491
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Manish Maheshwari
>            Assignee: Arnab Karmakar
>            Priority: Critical
>
> When running Table Loading or Refresh commands, catalogd requires working 
> memory in proportion to the number of tables being refreshed. While we have a 
> table level lock, we don't have a config to control concurrent load/refresh 
> operations.
> In case of customers that run refresh in parallel in multiple threads, the 
> number of load/refresh commands can cause OOM on the catalog due to running 
> out of working memory.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

