Anyway, in the meantime I will try to work on a patch to see if I can fix the 
issue. Hopefully I can send you a pull request on GitHub by the end 
of the week.

Kaiwen

> On May 17, 2015, at 6:08 PM, Kaiwen Xu <[email protected]> wrote:
> 
> Hi Roberto,
> 
> Thank you very much for the response and the patch! I tried it out in 
> single-threaded mode, and it seems to work well. However, I am still noticing 
> the issues in multi-threaded mode that I mentioned earlier.
> 
> After some digging, it turns out that since we are using reload-on-rss (which 
> is lower than evil-reload-on-rss), the worker process hits the reload-on-rss 
> limit first, so one of the worker's threads will call wait_for_threads(). 
> However, if the thread that calls wait_for_threads() happens to be a thread 
> spawned by the main thread (which is actually very likely), it will call 
> pthread_cancel() and pthread_join() on the main thread as well. This seems to 
> make the worker process appear as a zombie (at least on Linux). Once the 
> process becomes a zombie, /proc/self/stat shows it as using “0” memory, which 
> prevents it from ever being killed by evil-reload-on-rss. Meanwhile, the 
> remaining thread of that process, stuck in a loop, can still continue 
> consuming more memory.
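
For reference, here is a minimal sketch of how the state and RSS fields can be pulled out of a /proc/<pid>/stat line on Linux. It is only an illustration in Python, not the code uwsgi uses; the helper names parse_stat and rss_bytes are made up for this example. Field positions follow the proc(5) layout (state is field 3, rss in pages is field 24).

```python
import os


def parse_stat(stat_line):
    """Return (state, rss_pages) from the text of /proc/<pid>/stat."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the LAST ')' before tokenizing.
    rest = stat_line.rsplit(")", 1)[1].split()
    state = rest[0]             # field 3: process state, 'Z' means zombie
    rss_pages = int(rest[21])   # field 24: resident set size, in pages
    return state, rss_pages


def rss_bytes(pid="self"):
    """Read state and RSS in bytes for a pid (hypothetical helper, Linux only)."""
    with open("/proc/%s/stat" % pid) as f:
        state, pages = parse_stat(f.read())
    return state, pages * os.sysconf("SC_PAGE_SIZE")
```

A line whose state is 'Z' and whose rss is 0 would match the behavior described above, where the limit check never fires.
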
> 
> I am wondering whether the zombification of the process is intentional, since 
> it appears to be causing issues.
> 
> Thanks,
> Kaiwen
> 
>> On May 16, 2015, at 8:52 PM, Roberto De Ioris <[email protected]> wrote:
>> 
>> 
>>> Hi all,
>>> 
>>> We have a setup of uwsgi running in emperor mode, so that we can have
>>> multiple applications (mostly Python, a few Ruby) running on a single
>>> machine. For most applications, we use reload-on-rss = 256 and
>>> evil-reload-on-rss = 512 in case the application misbehaves.
>>> 
>>> Recently we found that the machine had extremely high RAM usage, and
>>> the process consuming it turned out to be one of the uwsgi worker
>>> processes, which belongs to a Python program and was consuming about 16G
>>> of memory. Not sure why the uwsgi master process didn’t kill the worker
>>> based on the evil-reload-on-rss setting, we ran gdb on the worker process.
>>> After some quick digging, we realized that the worker process was actually
>>> stuck in the Python interpreter, which most likely means that some Python
>>> code went bad, ran into an infinite loop, and kept allocating memory.
>>> 
>>> In order to test it, we wrote a simple Python wsgi program which does
>>> exactly that, and we get the same behavior: the application keeps
>>> allocating memory until it’s killed by Linux's OOM killer.
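
The test program itself was not posted, so the following is only a guess at what such a WSGI app could look like. The names leak and _hoard, the chunk size, and the optional limit argument (added so the loop can be exercised safely) are all assumptions for illustration.

```python
def leak(chunks, chunk_size=1024 * 1024, limit=None):
    """Append chunk_size-byte blocks to chunks; loops forever when limit is None."""
    count = 0
    while limit is None or count < limit:
        # Keeping a reference in the list stops the memory from being freed.
        chunks.append(b"x" * chunk_size)
        count += 1
    return count


_hoard = []  # module-level list, so allocations outlive the request


def application(environ, start_response):
    # With the default limit=None this call never returns, so the request
    # never ends and the response below is never sent.
    leak(_hoard)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"unreachable\n"]
```

Do not run this outside a throwaway machine or container; with no limit it grows until something kills the process.
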
>>> 
>>> Also, we found that even if we enable threads (i.e. set “threads = 8”), we
>>> still get that behavior most of the time; only occasionally is it
>>> killed by uwsgi’s evil-reload-on-rss.
>>> 
>>> After some digging into uwsgi, we found out that the master process
>>> kills the worker process based on the RAM usage reported by the worker
>>> process (in uwsgi_master_check_workers_deadline() of
>>> core/master_checks.c). The worker process only updates its RAM usage
>>> after a request ends (in uwsgi_close_request() of core/utils.c), so if
>>> the request never ends, the RAM usage is never updated, which means the
>>> worker will never be killed by the master process. After enabling
>>> threads, in theory, other threads of the worker process should run
>>> uwsgi_close_request() and update the RAM usage, but that is not the case.
>>> I am not sure why this happens; it still needs more digging.
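
The reporting gap described in this paragraph can be modeled in a few lines. This is a toy model only: the Worker class and master_should_kill are invented for illustration and do not mirror uwsgi's actual data structures. Values are in MB, matching the evil-reload-on-rss = 512 setting mentioned earlier.

```python
class Worker:
    """Toy worker: actual RSS can grow at any time, reported RSS only on request end."""

    def __init__(self):
        self.actual_rss = 0     # what the process really uses (MB)
        self.reported_rss = 0   # what the master can see (MB)

    def close_request(self):
        # Stands in for the update that happens when a request ends.
        self.reported_rss = self.actual_rss


def master_should_kill(worker, evil_limit):
    # The master only ever looks at the value the worker last reported.
    return worker.reported_rss > evil_limit
```

A worker whose request never finishes never calls close_request(), so master_should_kill() keeps returning False no matter how large actual_rss gets.
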
>>> 
>>> I am wondering, shouldn’t the master process check the worker’s RAM
>>> usage itself when doing an evil reload? Is there any reason why it’s
>>> not doing that?
>>> 
>>> Thanks,
>>> Kaiwen Xu
>>> 
>> 
>> 
>> Hi Kaiwen, yes, your analysis is right. Unfortunately, getting the memory
>> usage of external processes is not very portable.
>> 
>> Btw this patch should help in your situation:
>> 
>> https://github.com/unbit/uwsgi/commit/27ea1203251a843355c9d6db39e0c2c3b480697a
>> 
>> Basically, a thread is started for every worker that periodically scans
>> its memory usage.
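
The idea of a per-worker thread that periodically checks memory can be sketched as follows. This is not the code from the commit (uwsgi is written in C); start_rss_monitor, read_rss, and on_exceed are hypothetical names, and the callbacks are passed in so the sketch stays independent of how RSS is actually measured.

```python
import threading


def start_rss_monitor(read_rss, limit_bytes, on_exceed, interval=1.0):
    """Poll read_rss() every interval seconds; call on_exceed(rss) once over the limit."""
    stop = threading.Event()

    def loop():
        while not stop.is_set():
            rss = read_rss()
            if rss > limit_bytes:
                on_exceed(rss)
                return
            # Sleeps for interval, but wakes up early if stop is set.
            stop.wait(interval)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop, thread
```

The key property is that the check no longer depends on a request finishing, so a request stuck in a loop can still be noticed.
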
>> 
>> Let me know how it works for you
>> 
>> 
>> 
>> -- 
>> Roberto De Ioris
>> http://unbit.com
>> _______________________________________________
>> uWSGI mailing list
>> [email protected]
>> http://lists.unbit.it/cgi-bin/mailman/listinfo/uwsgi
