[
https://issues.apache.org/jira/browse/TEZ-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106408#comment-14106408
]
Siddharth Seth commented on TEZ-1471:
-------------------------------------
{code}
+ * In current Local Mode, large amount of input data may lead to JVM out of
memory since all TEZ components are running in single JVM. Make sure the JVM
heap size is at least as TWICE as your input data size, at the same time,
larger than the required memory size that all your concurrent tasks need if
they are running in parallel. Keep
TezConfiguration.TEZ_AM_INLINE_TASK_EXECUTION_MAX_TASKS to be 1 (in default) if
you have limited memory since this pararmeter is to control the number of
concurrent task running in Local Mode.
{code}
[~airbots] - instead of asking users to set their Xmx value, can we just
mention that the data size should be kept small, and that
TezConfiguration.TEZ_AM_INLINE_TASK_EXECUTION_MAX_TASKS (tez.am.inline.task.execution.max-tasks)
should not be changed (defaults to 1)?
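To illustrate the "should not be changed" advice, here is a minimal sketch of a check that the setting is still at its default. It uses java.util.Properties as a stand-in for the real configuration object, and the class and method names are hypothetical; only the property name and its default of 1 come from the text above.

```java
import java.util.Properties;

// Hypothetical helper, not part of Tez. java.util.Properties stands in for
// the real configuration object purely for illustration.
final class InlineTaskLimitCheck {
    // Property name and default value as quoted in the comment above.
    static final String MAX_TASKS_KEY = "tez.am.inline.task.execution.max-tasks";
    static final int DEFAULT_MAX_TASKS = 1;

    private InlineTaskLimitCheck() {}

    /** Returns the configured limit, falling back to the default when unset. */
    static int configuredMaxTasks(Properties conf) {
        String raw = conf.getProperty(MAX_TASKS_KEY, String.valueOf(DEFAULT_MAX_TASKS));
        return Integer.parseInt(raw.trim());
    }

    /** True when the limit is still at its default of one task at a time. */
    static boolean isDefault(Properties conf) {
        return configuredMaxTasks(conf) == DEFAULT_MAX_TASKS;
    }
}
```

A local-mode launcher could call isDefault(...) and log a warning when it returns false, rather than asking the user to size the heap.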
The bit about running 2 threads is a good point, since many of the runtime
parameters base their memory usage on the available heap size. Sounds like this
needs another jira under TEZ-684 to fix it (either by modifying the memory
distributor, or by changing the available memory returned by Tez*Context).
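A rough sketch of the second option, assuming the simplest possible policy: divide the JVM heap evenly among the tasks running concurrently in the single local-mode JVM, so that each task sizes its buffers from its share rather than from the whole heap. The class and method names are hypothetical and this is not how Tez currently computes available memory.

```java
// Hypothetical helper, not part of Tez: an even split of the JVM heap across
// the tasks that run concurrently inside one local-mode JVM.
final class LocalModeMemoryBudget {
    private LocalModeMemoryBudget() {}

    /** Heap that each task may assume when concurrentTasks share one JVM. */
    static long perTaskBytes(long jvmMaxHeapBytes, int concurrentTasks) {
        if (jvmMaxHeapBytes <= 0) {
            throw new IllegalArgumentException("heap size must be positive");
        }
        if (concurrentTasks < 1) {
            throw new IllegalArgumentException("need at least one task");
        }
        return jvmMaxHeapBytes / concurrentTasks;
    }

    public static void main(String[] args) {
        // Runtime.maxMemory() reports the most heap this JVM will try to use.
        long heap = Runtime.getRuntime().maxMemory();
        for (int tasks = 1; tasks <= 2; tasks++) {
            System.out.println(tasks + " task(s): "
                + perTaskBytes(heap, tasks) + " bytes each");
        }
    }
}
```

With two concurrent tasks, each one sees half the heap, which avoids both tasks sizing themselves as if they owned all of it.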
> Additional supplement for TEZ local mode document
> -------------------------------------------------
>
> Key: TEZ-1471
> URL: https://issues.apache.org/jira/browse/TEZ-1471
> Project: Apache Tez
> Issue Type: Sub-task
> Affects Versions: 0.4.0
> Reporter: Chen He
> Assignee: Chen He
> Attachments: TEZ-1471.patch
>
>
> some supplements for Local mode document