Anyway, meanwhile I will try to work on a patch to see if I can fix the issue I am seeing. Hopefully I can send you a pull request on GitHub by the end of the week.
Kaiwen

> On May 17, 2015, at 6:08 PM, Kaiwen Xu <[email protected]> wrote:
>
> Hi Roberto,
>
> Thank you very much for the response and the patch! I tried it out without
> single-threaded mode, it seems to work well. However, I am still noticing
> some issues in multi-threaded mode that I mentioned earlier.
>
> After some digging, it turns out that since we are using reload-on-rss (which
> is lower than evil-reload-on-rss), the worker process hits the reload-on-rss
> limit first, so one of the threads of the worker will start to call
> wait_for_threads(). However, if the thread that calls wait_for_threads()
> happens to be a thread spawned by the main thread (which is actually very
> likely), it will call pthread_cancel() and pthread_join() on the main
> thread as well. This seems to cause the worker process to appear as a
> zombie (at least on Linux). Once the process becomes a zombie,
> /proc/self/stat will show it is using “0” memory, which prevents it from ever
> being killed by evil-reload-on-rss. And the remaining thread of that process,
> which is stuck in a loop, can still continue consuming more memory.
>
> I am wondering if zombification of the process is on purpose or not? Since it
> appears to be causing issues.
>
> Thanks,
> Kaiwen
>
>> On May 16, 2015, at 8:52 PM, Roberto De Ioris <[email protected]> wrote:
>>
>>
>>> Hi all,
>>>
>>> We have a setup of uwsgi running in emperor mode, so that we can have
>>> multiple applications (mostly Python, a few Ruby) running on a single
>>> machine. For most of the application setups, we use reload-on-rss = 256 and
>>> evil-reload-on-rss = 512 in case the application misbehaves.
>>>
>>> Recently we found out the machine was having extremely high ram usage, and
>>> the high ram consuming process ended up being one of the uwsgi worker
>>> processes, which belongs to a Python program and was consuming about 16G of
>>> memory. Not sure why the uwsgi master process didn’t kill the worker based
>>> on the evil-reload-on-rss setting, we ran gdb on the worker process. After
>>> some quick digging, we realized that the worker process is actually stuck in
>>> the Python interpreter, which most likely means some of the
>>> Python code has gone bad, and it’s running into an infinite loop and keeps
>>> allocating memory.
>>>
>>> In order to test it, we wrote a simple Python wsgi program which does
>>> exactly that. And we get the same behavior, the application keeps
>>> allocating memory until it’s killed by Linux's OOM killer.
>>>
>>> Also we found that even if we enable threads (i.e. set “threads = 8”), we
>>> still get that behavior most of the time, only a few other times, it’s
>>> killed by uwsgi’s evil-reload-on-rss.
>>>
>>> After some digging into uwsgi, we found out that the master process
>>> kills the worker process based on the ram usage reported by the worker
>>> process (in uwsgi_master_check_workers_deadline() of
>>> core/master_checks.c). And the worker process only updates its ram usage
>>> after a request ends (in uwsgi_close_request() of core/utils.c), but if
>>> the request never ends, it means the ram usage is never updated, which
>>> means it will never be killed by the master process. So after enabling
>>> threads, in theory, other threads of the worker process should run
>>> uwsgi_close_request() and update the ram, but it’s not the case, I am not
>>> sure why it’s happening, and it still needs more digging.
>>>
>>> I am wondering shouldn’t it be the case that the master process checks
>>> the worker’s ram usage when doing evil reload? Is there any reason why it’s
>>> not doing that?
>>>
>>> Thanks,
>>> Kaiwen Xu
>>>
>>
>>
>> Hi Kaiwen, yes your analysis is right, unfortunately getting memory usage
>> of external processes is not very portable.
>>
>> Btw this patch should help in your situation:
>>
>> https://github.com/unbit/uwsgi/commit/27ea1203251a843355c9d6db39e0c2c3b480697a
>>
>> basically a thread is started for every worker that periodically scans for
>> memory usage
>>
>> Let me know how it works for you
>>
>>
>>
>> --
>> Roberto De Ioris
>> http://unbit.com
>> _______________________________________________
>> uWSGI mailing list
>> [email protected]
>> http://lists.unbit.it/cgi-bin/mailman/listinfo/uwsgi
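
For anyone who wants to check the zombie observation from the thread by hand, here is a minimal sketch of reading the resident set size out of a /proc/<pid>/stat line. It is not uwsgi's implementation. The function names are made up for illustration, and it assumes the Linux layout in which state is field 3 and rss (counted in pages) is field 24. A process in state "Z" is expected to report 0 there, which matches the “0” memory described above.

```python
import os


def rss_bytes_from_stat(stat_line, page_size=4096):
    """Return the resident set size in bytes from one /proc/<pid>/stat line.

    Hypothetical helper for illustration; page_size defaults to 4096 only
    so the function can be exercised on a hand-written line.
    """
    # Field 2 (comm) may itself contain spaces or parentheses, so split
    # on the last ')' and count the remaining fields from there.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 24 (rss, in pages) is rest[21].
    return int(rest[21]) * page_size


def state_from_stat(stat_line):
    """Return the one-letter process state, e.g. 'S', 'R' or 'Z' (zombie)."""
    return stat_line.rsplit(")", 1)[1].split()[0]


def read_rss(pid):
    """Read the RSS of a live process on Linux (assumes /proc is mounted)."""
    with open("/proc/%d/stat" % pid) as f:
        line = f.read()
    return rss_bytes_from_stat(line, os.sysconf("SC_PAGE_SIZE"))
```

Calling read_rss() on the pid of a stuck worker and comparing it with state_from_stat() on the same line should show whether the master is looking at a zombie entry rather than at the thread that is still allocating.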
