Re: [modwsgi] Process killed kills my cache

Julien Delafontaine Wed, 20 Apr 2016 02:25:21 -0700

I tried again, this time generating the matrix right after the app has started 
(in the ready() hook, before any request) and storing the result in a Django 
cache. But when I access the cache in a request, the key is not found and the 
matrix is recreated... So the simple solution does not work.


On Wednesday, 20 April 2016 at 05:20:41 UTC+2, Graham Dumpleton wrote:
>
> That is exactly what the --service-script option in mod_wsgi-express is for.
>
> You give it a script and it runs it in a managed daemon process.
>
> I haven’t tested this to see if it still works, and it is a more elaborate 
> example than you need, but see tasks.py at:
>
>     https://gist.github.com/GrahamDumpleton/f5c6e4da0c18a860d5ee379c2f4b6fd5
>
> Run mod_wsgi-express, supplying the argument:
>
>     --service-script tasks /some/path/tasks.py
>
> Then use the tasks-queue-client.py at:
>
>     https://gist.github.com/GrahamDumpleton/f94c94ab7e4ad37a53438bf079a26be0
>
> as the WSGI script file. Thus:
>
>     mod_wsgi-express --service-script tasks /some/path/tasks.py /some/path/tasks-queue-client.py
>
> Fire requests at it and watch what happens.
>
> Graham
>
> On 19 Apr 2016, at 11:38 PM, Jason Garber <[email protected]> wrote:
>
> Just a thought... run a separate mod_wsgi-express instance with 1 process 
> in daemon mode. Set it to port 15888 or something.
>
> Then communicate with it via HTTP from your other web processes.
>
> If you cannot get mod_wsgi-express to behave this way, just write your own 
> Python script that always runs and listens for either HTTP or even simpler 
> connections. It will handle the cache and can use multithreading to handle 
> concurrent requests as needed.
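
The standalone script suggested above could look roughly like the following. This is only a minimal sketch using the Python standard library, not something taken from the thread: `MATRIX`, `QueryHandler`, `start_server`, `query_row` and the `/count?row=N` URL are hypothetical names, and the tiny list of lists stands in for the real numpy matrix. The point it illustrates is that the large structure stays in one process, and the web processes only exchange small queries and small answers with it.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical stand-in for the expensive matrix: built once, kept in this
# process's memory, and never serialized as a whole.
MATRIX = [[(r * c) % 3 == 0 for c in range(8)] for r in range(100)]


class QueryHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Example query: /count?row=6 returns the number of True cells in row 6.
        params = parse_qs(urlparse(self.path).query)
        try:
            row = int(params["row"][0])
            body = {"row": row, "true_count": sum(MATRIX[row])}
            status = 200
        except (KeyError, ValueError, IndexError):
            body = {"error": "bad or missing 'row' parameter"}
            status = 400
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_server(port=0):
    """Serve on localhost from a daemon thread (port 0 picks a free port)."""
    server = ThreadingHTTPServer(("127.0.0.1", port), QueryHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query_row(port, row):
    """What a web process would do: ask for a small answer over HTTP."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", f"/count?row={row}")
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

In a real deployment the server would run in its own long-lived process (calling `serve_forever()` directly on a fixed port such as the 15888 mentioned above) rather than in a thread, so that restarts of the web processes do not touch it.
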
> On Apr 19, 2016 9:01 AM, "Julien Delafontaine" <[email protected]> wrote:
>
>> Sure. I have a million-by-thousand boolean matrix (numpy) that takes a 
>> few MB of my RAM, which is fine with me. Now when I try to store it in, or 
>> retrieve it from, a Redis cache, it has to be converted to a string, 
>> which takes more than 10 seconds either way (ndarray.tobytes).
>>
>> With the in-memory version, the user has the answer in less than a 
>> second, as expected in a reactive web service. The problem with that (and 
>> with all available Django caches...) is that the cached object belongs to 
>> its process, so if I have N Apache processes, I need to generate N copies 
>> of that cache, and it can time out, and it gets killed together with the 
>> process.
>>
>> All I want is an in-memory cache that is independent of the Apache 
>> processes, and I can't believe I have to build it myself. This is no longer a 
>> mod_wsgi problem, though.
>>
>>
>> On Tuesday, 19 April 2016 at 14:43:14 UTC+2, Jason Garber wrote:
>>>
>>> Hi Julien, 
>>>
>>> This conversation points to some improvements that could be made in the 
>>> data structures. It is hard to picture what you are doing that your efforts 
>>> would not be better rewarded by fitting your problem cleanly into Redis.  
>>> Can you shed any light on specifics of your data structures?
>>>
>>> Thanks!
>>> Jason
>>> On Apr 19, 2016 8:37 AM, "Julien Delafontaine" <[email protected]> 
>>> wrote:
>>>
>>>> One problem is that the app needs to be fully loaded, i.e. models etc. 
>>>> I know that in the latest versions of Django there is a hook 
>>>> <https://docs.djangoproject.com/en/dev/ref/applications/#django.apps.AppConfig.ready>
>>>> that lets you run things after everything is loaded, but it did not work 
>>>> as well as expected in practice. I'll try with a delay of a few seconds.
>>>>
>>>> I see what you mean with the mini cache server. If I don't find a 
>>>> simpler way, I'll try something like this because it does exactly what I 
>>>> need: a kind of Memcached without serialization.
>>>>
>>>>
>>>> On Tuesday, 19 April 2016 at 12:48:18 UTC+2, Graham Dumpleton wrote:
>>>>>
>>>>> One can always fire off the creation of the cache as a side effect of 
>>>>> the WSGI script file being loaded. You can even do it in a background 
>>>>> thread while still handling requests. So initial requests may be slow as 
>>>>> the cache populates, but once it is loaded things should be good.
>>>>>
>>>>> See a problem with doing it that way?
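
The background-thread idea described above might be sketched as follows. This is a minimal illustration, not code from the thread: `build_matrix`, `get_matrix` and the toy WSGI `application` are hypothetical, and the short sleep stands in for the slow database work. With Django, the thread would presumably have to be started only after the application object has been created, so that models are loaded.

```python
import threading
import time

_cache = {}
_ready = threading.Event()


def build_matrix():
    """Hypothetical stand-in for the slow matrix construction."""
    time.sleep(0.2)
    return [[True, False], [False, True]]


def _populate():
    _cache["matrix"] = build_matrix()
    _ready.set()  # wake up any request waiting for the cache


# Started at import time, i.e. when the WSGI script file is loaded.
threading.Thread(target=_populate, daemon=True).start()


def get_matrix(timeout=None):
    """Block until the cache is populated, then return the matrix."""
    if not _ready.wait(timeout):
        raise TimeoutError("matrix cache is still being built")
    return _cache["matrix"]


def application(environ, start_response):
    # Early requests wait here while the cache populates; later ones do not.
    matrix = get_matrix(timeout=60)
    body = str(sum(map(sum, matrix))).encode("ascii")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Because the build runs outside any request, it is not the request itself that takes a long time; a request only waits (up to the chosen timeout) if it arrives before the build has finished.
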
>>>>>
>>>>> On 19 Apr 2016, at 8:45 PM, Julien Delafontaine <[email protected]> 
>>>>> wrote:
>>>>>
>>>>> When a process is started, I pull blobs out of a database, put their 
>>>>> data in a matrix, and keep the matrix in memory (because 
>>>>> [de-]serialization for usual caches is slow) so that computations using 
>>>>> that matrix are very quick. The construction of the matrix takes time, 
>>>>> though, and can time out if the database is big. 
>>>>>
>>>>> What I do is trigger the first call to that controller myself (with 
>>>>> a curl) so that users don't see it later and only get the quick 
>>>>> responses. But if the cache gets erased, it becomes slow for them as well.
>>>>> I have a --request-timeout set to 90s, but apparently it is still not 
>>>>> enough.
>>>>>
>>>>> An improvement would be to store the computed matrix in a persistent 
>>>>> cache, and load that only when the app starts (in my experience this takes 
>>>>> double the amount of memory and still a dozen seconds to deserialize).
>>>>>
>>>>>
>>>>> On Tuesday, 19 April 2016 at 11:33:36 UTC+2, Graham Dumpleton wrote:
>>>>>>
>>>>>> Can you explain more about what the long running requests are doing?
>>>>>>
>>>>>> The timeout can be extended by using an option like:
>>>>>>
>>>>>>     --request-timeout=300
>>>>>>
>>>>>> It would help to understand the need for long running requests, and I can 
>>>>>> perhaps suggest a better way.
>>>>>>
>>>>>> Graham
>>>>>>
>>>>>> On 19 Apr 2016, at 7:30 PM, Julien Delafontaine <[email protected]> 
>>>>>> wrote:
>>>>>>
>>>>>> I have long requests on purpose. In the same process I am building the 
>>>>>> cache for several elements. It can take time (at startup only), and for 
>>>>>> one item it occasionally times out. So it would fit the scenario where, 
>>>>>> when this one times out, the process is restarted and all the previously 
>>>>>> computed data is lost... This is extremely annoying :( Time to set up a 
>>>>>> persistent cache, maybe.
>>>>>>
>>>>>> Thanks a lot!
>>>>>>
>>>>>> On Tuesday, 19 April 2016 at 11:15:42 UTC+2, Graham Dumpleton wrote:
>>>>>>>
>>>>>>> When using mod_wsgi-express it does run daemon mode. So with that 
>>>>>>> configuration you should have two persistent processes. The processes 
>>>>>>> should not be recycled under normal circumstances.
>>>>>>>
>>>>>>> The only way with the default configuration that processes could be 
>>>>>>> recycled is if you have stuck requests and eventually trip the request 
>>>>>>> timeout. For a multithreaded process the process restart would kick in 
>>>>>>> only when the length of all active requests (across the total number of 
>>>>>>> request slots) averaged 60 seconds.
>>>>>>>
>>>>>>> So if you had only one stuck request, and it was stuck for 5 minutes, 
>>>>>>> then the process would finally be forcibly restarted. If it had two stuck 
>>>>>>> requests that started at the same time, it would restart after 2.5 
>>>>>>> minutes. If there were five stuck requests in the same process, then 
>>>>>>> after 60 seconds. It is a weird calculation, but the only thing that 
>>>>>>> makes half sense in a multithreaded application.
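
The arithmetic in the examples above can be reproduced with a few lines. This is a sketch of the rule as described in the message, not mod_wsgi's actual code: it assumes 5 request slots (from the `--threads 5` used in this thread), a 60-second timeout, stuck requests that all start at the same moment, and idle slots counting as zero. The function name is hypothetical.

```python
def seconds_until_restart(stuck_requests, threads=5, request_timeout=60.0):
    """Time until the request duration, averaged over all slots, hits the timeout.

    Assumes every stuck request started at the same moment and idle slots
    contribute zero, as in the examples given in the message.
    """
    if not 1 <= stuck_requests <= threads:
        raise ValueError("stuck_requests must be between 1 and threads")
    # After t seconds the average is (stuck_requests * t) / threads;
    # solve for the t at which it reaches request_timeout.
    return request_timeout * threads / stuck_requests


for n in (1, 2, 5):
    print(n, "stuck request(s):", seconds_until_restart(n), "seconds")
```

For 1, 2 and 5 stuck requests this gives 300, 150 and 60 seconds, matching the 5 minutes, 2.5 minutes and 60 seconds quoted above.
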
>>>>>>>
>>>>>>> To work out whether forced process restarts are occurring because of 
>>>>>>> the timeout, add the:
>>>>>>>
>>>>>>>     --log-level info
>>>>>>>
>>>>>>> option. With this mod_wsgi will log more details about process 
>>>>>>> restarts and why they were triggered. You can then look in the logs to 
>>>>>>> confirm if this is what is happening.
>>>>>>>
>>>>>>> Do you know if you are seeing requests that never seem to finish? Or 
>>>>>>> does your application run with very long requests on purpose?
>>>>>>>
>>>>>>> Graham
>>>>>>>
>>>>>>> On 19 Apr 2016, at 6:34 PM, Julien Delafontaine <[email protected]> 
>>>>>>> wrote:
>>>>>>>
>>>>>>> I am using mod_wsgi-express:
>>>>>>>
>>>>>>>     mod_wsgi-express setup-server ${baseDir}/project/wsgi.py \
>>>>>>>         --port=8887 --user myapp --server-root=${remoteDir}/mod_wsgi-server \
>>>>>>>         --processes 2 --threads 5;
>>>>>>>
>>>>>>> Then
>>>>>>>
>>>>>>>     ${remoteDir}/mod_wsgi-server/apachectl restart
>>>>>>>
>>>>>>> This sets up the configuration itself, it seems. I thought 
>>>>>>> mod_wsgi-express would run daemon mode by default?
>>>>>>>
>>>>>>>
>>>>>>> On Tuesday, 19 April 2016 at 10:19:01 UTC+2, Graham Dumpleton wrote:
>>>>>>>>
>>>>>>>> Sounds like you are using embedded mode rather than daemon mode. In 
>>>>>>>> embedded mode Apache will recycle processes.
>>>>>>>>
>>>>>>>> How do you have it configured? Are you using 
>>>>>>>> WSGIDaemonProcess/WSGIProcessGroup directives at all?
>>>>>>>>
>>>>>>>> Graham
>>>>>>>>
>>>>>>>> On 19 Apr 2016, at 6:12 PM, Julien Delafontaine <[email protected]> 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I have a multi-process mod_wsgi application that stores some 
>>>>>>>> cache data in memory. Each process naturally gets its own instance of 
>>>>>>>> that cache. Now it seems that after some time processes get 
>>>>>>>> killed/restarted/whatever, so that the cache has to be reinitialized 
>>>>>>>> every time this happens. How can I control it?
>>>>>>>>
>>>>>>>> Ideally I'd like to start 2 Apache/mod_wsgi processes, initialize 
>>>>>>>> the cache on each, and let the app run forever without needing to 
>>>>>>>> recompute the cache. Is that possible?
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>> Groups "modwsgi" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>> send an email to [email protected].
>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>> Visit this group at https://groups.google.com/group/modwsgi.
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>
