[
https://issues.apache.org/jira/browse/TEZ-776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359459#comment-14359459
]
Bikas Saha commented on TEZ-776:
--------------------------------
This jira presents a solution for reducing the AM memory consumed by stored
events, and the solution has been shown to solve the problem. Some concerns
were raised around CPU usage. Extensive CPU profiling, both in simulation and
on real clusters, has provided sufficient evidence that CPU usage is neither a
practical problem nor significantly higher than in the current state of
affairs. The API changes are incompatible but essentially follow the
existing pattern. They only narrow the return value of the plugin from the
entire routing range to that of a specific task. Thus the change to plugin code
should be minimal and may even simplify things. Older plugins can still be used
by setting a configuration, so the two can be used interchangeably without code
changes. Hence, the proposed changes should be reviewed on their own merit and
proof points.
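To illustrate the narrowing described above, here is a minimal sketch in Java.
All names in it (RoutingSketch, RangeRouter, OnDemandRouter, routeToAll,
routeToTask) are hypothetical and are not taken from the Tez API or from the
attached patches. It assumes a simple all-to-all connection pattern and only
contrasts a plugin that answers for the entire range of destination tasks with
one that answers for a single task.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Tez code base.
final class RoutingSketch {

    // Assumed older style: one call returns routing for the entire range of destination tasks.
    interface RangeRouter {
        // Maps destination task index -> input index for every destination task.
        Map<Integer, Integer> routeToAll(int sourceTask, int numDestTasks);
    }

    // Assumed on-demand style: one call answers for a single destination task.
    interface OnDemandRouter {
        // Returns the input index at which destTask sees the event from sourceTask.
        int routeToTask(int sourceTask, int destTask);
    }

    // All-to-all pattern (an assumption): every destination reads every source,
    // and the input index equals the source task index.
    static final class AllToAllRange implements RangeRouter {
        @Override
        public Map<Integer, Integer> routeToAll(int sourceTask, int numDestTasks) {
            Map<Integer, Integer> routes = new HashMap<>();
            for (int dest = 0; dest < numDestTasks; dest++) {
                routes.put(dest, sourceTask); // one entry per destination task
            }
            return routes;
        }
    }

    // Same pattern, but nothing is materialized for tasks that did not ask.
    static final class AllToAllOnDemand implements OnDemandRouter {
        @Override
        public int routeToTask(int sourceTask, int destTask) {
            return sourceTask;
        }
    }

    public static void main(String[] args) {
        int entries = new AllToAllRange().routeToAll(3, 1000).size();
        int index = new AllToAllOnDemand().routeToTask(3, 7);
        System.out.println("range call entries: " + entries);
        System.out.println("on-demand input index: " + index);
    }
}
```

In this sketch the range-style call builds one entry per destination task for
every event, while the on-demand call returns a single value when a specific
task asks for it. The per-task logic inside the plugin stays the same.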
IMO, it is neither reasonable nor justifiable to tie acceptance of this patch
to acceptance of some yet-to-be-built concept. Other approaches can be
implemented in full, tested, profiled and verified, and then presented to the
community to be discussed and evaluated on their own merits. But until that
process is followed, it would be premature to block the work done in this
jira.
At this point, I believe that the on-demand approach proposed here has been
sufficiently validated and tested by multiple people independently. Hence, I
request committers and watchers to review this jira so that we can take it
to resolution.
> Reduce AM mem usage caused by storing TezEvents
> -----------------------------------------------
>
> Key: TEZ-776
> URL: https://issues.apache.org/jira/browse/TEZ-776
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Siddharth Seth
> Assignee: Bikas Saha
> Attachments: TEZ-776.ondemand.1.patch, TEZ-776.ondemand.2.patch,
> TEZ-776.ondemand.3.patch, TEZ-776.ondemand.4.patch, TEZ-776.ondemand.5.patch,
> TEZ-776.ondemand.patch, With_Patch_AM_hotspots.png,
> With_Patch_AM_profile.png, Without_patch_AM_CPU_Usage.png,
> events-problem-solutions.txt, with_patch_jmc_output_of_AM.png,
> without_patch_jmc_output_of_AM.png
>
>
> This is open ended at the moment.
> A fair chunk of the AM heap is taken up by TezEvents (specifically
> DataMovementEvents - 64 bytes per event).
> Depending on the connection pattern - this puts limits on the number of tasks
> that can be processed.