[
https://issues.apache.org/jira/browse/AMBARI-24638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617559#comment-16617559
]
Hudson commented on AMBARI-24638:
---------------------------------
FAILURE: Integrated in Jenkins build Ambari-trunk-Commit #9962 (See
[https://builds.apache.org/job/Ambari-trunk-Commit/9962/])
AMBARI-24638. Ambari-agent process consuming more memory. (aonishuk) (aonishuk:
[https://gitbox.apache.org/repos/asf?p=ambari.git&a=commit&h=4db8904e67cd814bd775243717e049d95e2f92e1])
* (edit) ambari-common/src/main/python/ambari_ws4py/websocket.py
> Ambari-agent process memory leak
> --------------------------------
>
> Key: AMBARI-24638
> URL: https://issues.apache.org/jira/browse/AMBARI-24638
> Project: Ambari
> Issue Type: Bug
> Affects Versions: 2.7.0
> Reporter: Andrew Onischuk
> Assignee: Andrew Onischuk
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.7.2
>
> Attachments: AMBARI-24638.patch, AMBARI-24638.patch
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> One agent process began consuming memory rapidly at some point and grew to
> ~27GB of RSS before we eventually restarted it. This happened after about a
> month of running 10 ambari-agent nodes.
> [root@andrew2-1n01 ~]# ps aux | grep ambari_agent
> root 39955 0.0 0.0 47580 6024 ? S Aug17 0:00 /usr/bin/python /usr/lib/ambari-agent/lib/ambari_agent/AmbariAgent.py start
> root 39959 20.4 10.2 31623096 27154348 ? Sl Aug17 7645:55 /usr/bin/python /usr/lib/ambari-agent/lib/ambari_agent/main.py start
> Just before the growth in memory usage began, this exception appeared:
> ERROR 2018-09-11 10:56:59,716 websocket.py:552 - Websocket connection was closed with an exception
> Traceback (most recent call last):
>   File "/usr/lib/ambari-agent/lib/ambari_ws4py/websocket.py", line 549, in run
>     if not self.once():
>   File "/usr/lib/ambari-agent/lib/ambari_ws4py/websocket.py", line 428, in once
>     if not self.process(self.buf[:requested]):
>   File "/usr/lib/ambari-agent/lib/ambari_ws4py/websocket.py", line 483, in process
>     self.reading_buffer_size = s.parser.send(bytes) or DEFAULT_READING_SIZE
> ValueError: generator already executing
> This exception was not seen on any other node, or on this node at any other
> time during the month, so I suspect it is the root cause.
> The error means that the generator is being used by multiple threads at once,
> so I will upload a fix that guards this call with a thread lock.
> This fix is only a guess and may or may not work, and there is no real way to
> test it, but we should definitely try it.
>
> This was also observed in version ambari-2.7.1.0-73.
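
For illustration only, here is a minimal sketch of the failure mode described
above and of the kind of thread lock the description proposes. It is not the
committed patch and not ambari_ws4py code: slow_parser, LockedSender,
demo_unlocked and demo_locked are hypothetical names invented for this sketch,
and the toy generator only stands in for the real frame parser. The first demo
has one thread call send() while another thread is still executing inside the
generator, which should raise the same ValueError as in the traceback. The
second demo routes every send() through one lock so the calls cannot overlap.

```python
import threading


def slow_parser(entered, release):
    """Toy stand-in for a parser generator (hypothetical, not ws4py code)."""
    wanted = 1
    while True:
        data = yield wanted
        entered.set()      # signal that a thread is now running inside the generator
        release.wait()     # stay "executing" until the caller lets us continue
        wanted = len(data)


class LockedSender:
    """Serialize send() calls on one generator (hypothetical helper)."""

    def __init__(self, gen):
        self._gen = gen
        self._lock = threading.Lock()

    def send(self, data):
        # Only one thread at a time may resume the generator.
        with self._lock:
            return self._gen.send(data)


def demo_unlocked():
    """Return the error message raised by an overlapping send(), or None."""
    entered, release = threading.Event(), threading.Event()
    gen = slow_parser(entered, release)
    next(gen)  # prime the generator up to its first yield
    worker = threading.Thread(target=gen.send, args=(b"ab",))
    worker.start()
    entered.wait()  # the worker is now blocked inside the generator body
    try:
        gen.send(b"cd")  # second thread resumes a generator that is running
        return None
    except ValueError as exc:
        return str(exc)
    finally:
        release.set()
        worker.join()


def demo_locked(workers=4, calls=100):
    """Return the list of ValueError messages seen when sends go through the lock."""
    entered, release = threading.Event(), threading.Event()
    release.set()  # the parser never blocks in this demo
    gen = slow_parser(entered, release)
    next(gen)
    sender = LockedSender(gen)
    errors = []

    def work():
        for _ in range(calls):
            try:
                sender.send(b"ab")
            except ValueError as exc:
                errors.append(str(exc))

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


if __name__ == "__main__":
    print("unlocked:", demo_unlocked())
    print("locked errors:", demo_locked())
```

A lock of this kind only prevents the concurrent send() calls. Whether that
also stops the memory growth is, as the description says, still unverified.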
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)