Hi Graham,

Thank you for the detailed response. I understand and fully support your decision to focus on core, not UI or DB stuff. I think the priority here is to ensure that the data points are exposed in a way that lets applications do what they want with them, including (and maybe especially!) a New Relic plugin.

Decoupling would provide a lot of tangible benefits. I think we should make the configuration straightforward. I am thinking out loud here:

    WSGIDaemonProcess app1 threads=10 processes=2
    WSGIMonitorDaemonProcess app1 handler=/path/to/mystats.py stat_a=10 stat_b=20 stat_c=1

The above instructs mod_wsgi to create a process and thread, load /path/to/mystats.py, call the setup() function, and report stat_a every 10 seconds, stat_b every 20 seconds and stat_c every 1 second. I would think the stats should be grouped into useful groupings so you don't have to specify each one, but could if you wanted to.

/path/to/mystats.py:

    def setup(config):
        # Set up the environment, spawn threads, whatever you want.
        # Return the stats collector callback.
        return onstat

    def onstat(name, value, interval):
        # This function will be called every time there is a statistic to report.
        # It would likely queue them up in memory for a number of seconds and
        # then make a web service call.
        pass

Please take the above just as a concept, not as a proposal. What are your thoughts on creating a Pythonic interface to the stats scoreboard?

Thanks!
Jason

On Mon, Jun 9, 2014 at 7:58 AM, Graham Dumpleton <[email protected]> wrote:

> On 09/06/2014, at 1:50 PM, Jason Garber <[email protected]> wrote:
>
> Can you comment on how this mechanism could provide data to a custom
> monitoring application and the plans for extending it to cover daemon mode
> process information?
>
> What this feature relies on is an ability within mod_wsgi itself to
> provide a snapshot of what is called the Apache scoreboard.
>
> This Apache scoreboard is a shared memory segment used by Apache to keep
> track of the status of each worker thread across all Apache processes.
> Apache itself uses this information to determine how busy the Apache child
> processes are that the worker threads run in and depending on the MPM
> settings will use the information to increase or decrease the number of
> child processes running.
>
> If mod_status is loaded, additional information about the number of
> requests handled by Apache is also kept within the scoreboard. Enable
> ExtendedStatus and even more information is tracked.
>
> As mod_status itself does by exposing an external URL such as
> /server-status, the plugin uses the data to create a picture of what
> Apache is doing.
>
> The difference between what the plugin is doing and what can be done from
> an external monitoring system using the exposed URL, is that the plugin can
> poll on a regular 1 second interval without needing to do a web request
> which would itself be reflected in server traffic. By being able to poll
> more frequently it can build up a better picture, including using extra
> detail available directly from the scoreboard to grab sample data on actual
> requests and so generate response time averages and percentiles.
>
> The short polling also allows the number of active child processes to be
> monitored closely and so be able to derive metrics for things such as
> process churn for the child processes, as well as more accurate metrics
> about server restarts and capacity utilisation.
>
> The intention in monitoring mod_wsgi is for mod_wsgi to have its own form
> of scoreboard using shared memory which tracks a snapshot of what is going
> on across all processes, whether the WSGI application is running in
> embedded mode or daemon mode.
>
> This would allow tracking of details such as response time measured within
> the WSGI application independent of front end time, the queueing time which
> is how long between when Apache accepted the request and the WSGI
> application got to handle it, plus separate measures of capacity
> utilisation for each mod_wsgi daemon process group. Other metrics which
> could be thrown into this might include queue depth for daemon processes,
> queue timeout rate, daemon connection failure rates etc.
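
To check my understanding of the queueing time and capacity utilisation measures described above, here is a rough sketch of the arithmetic. The RequestSample record, its field names and the definition of capacity utilisation as busy thread time divided by available thread time are all my own assumptions for illustration, not anything mod_wsgi exposes today.

```python
from dataclasses import dataclass


@dataclass
class RequestSample:
    # Hypothetical per-request record; all times are in seconds.
    accepted_at: float   # when Apache accepted the request
    started_at: float    # when the WSGI application began handling it
    finished_at: float   # when the WSGI application finished


def queue_time(sample):
    # Time between Apache accepting the request and the WSGI
    # application getting to handle it.
    return sample.started_at - sample.accepted_at


def application_time(sample):
    # Response time measured within the WSGI application only.
    return sample.finished_at - sample.started_at


def capacity_utilisation(samples, threads, window_start, window_end):
    # Assumed definition: fraction of the available thread time in the
    # window that was spent handling requests.
    available = threads * (window_end - window_start)
    busy = 0.0
    for s in samples:
        # Only count the part of each request that falls inside the window.
        start = max(s.started_at, window_start)
        end = min(s.finished_at, window_end)
        if end > start:
            busy += end - start
    return busy / available if available > 0 else 0.0
```

With one such calculation per daemon process group, the utilisation figure would be separate for each group, which is how I read the paragraph above.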
>
> If a module such as psutil were available, then possibly the plugin could
> also track and report per-process memory usage, CPU usage and the number
> of process context switches. In essence, anything I can think of that would
> help to supplement data on throughput and response times to work out
> whether changing the processes/threads mix is actually having some form of
> positive effect.
>
> So my plans for future work in trying to achieve all that are as follows.
>
> 1. Refactor the current plugin code, which doesn't have a clean separation
> between deriving the metrics and reporting them up to New Relic, such that
> there is a distinct layer between the two.
>
> 2. Implement the equivalent of a scoreboard for mod_wsgi itself in order
> to be able to accumulate the additional information required and enhance
> the metrics generation code and the current New Relic plugin to match.
>
> 3. Create an optionally enabled internal consumer of the metrics which
> would retain a working history of metrics for a period of 30 minutes, but
> only within the collecting process itself, this process being a dedicated
> daemon process group set up to collect the data. Part of this would involve
> a minimal REST API to retrieve raw metric data from the process in some way.
>
> 4. Create as a proof of concept an extension for Django Debug Toolbar
> which can query the historical data from the in memory cache using the REST
> API. I intend doing this purely as an example to support a talk I
> will be giving at PyCon AU in August on how Python web application toolbars
> work. Part of the talk will be about the usefulness or applicability of
> debug toolbars to a production environment, and I can see this proof of
> concept helping me to illustrate some points about the problem of a debug
> toolbar being of use in a multi host deployment.
>
> That is as much as I have planned at this point.
>
> Things I have no intention of doing are the following.
>
> 1. Creating any plugin to report data to any other charting system such as
> Graphite.
>
> 2. Creating a database for long term persistence of data.
>
> 3. Creating any chart visualisation system of my own to view the metric
> data, beyond any minimal experiments I may do to support the Django Debug
> Toolbar experiment.
>
> The reason I am not doing any of these is that they are outside of my area
> of expertise. I have never used tools such as Graphite. I am not a database
> person, nor am I a front end web developer or Javascript developer.
>
> I well know from my work at New Relic how much time and effort needs to go
> into creating a professional production quality backend system for retaining
> and visualising metric data and even if I had the skills in those areas it
> would be an amazingly huge time suck which would totally dwarf any time I
> am even able to spend on progressing mod_wsgi itself.
>
> Since my experience lies in the area of Apache, WSGI servers and
> instrumenting for and collecting metric data, I will keep to that area.
> Doing so is just the most practical thing I can do as that is where I will
> be most productive and can do the most.
>
> Those areas are also the ones I am interested in and enjoy working in. I
> don't find database and front end web design to be that interesting and
> given that my impetus for doing any work these days is a personal
> requirement or because I enjoy the technical challenge in a specific
> problem, then I will as a result be staying well clear of those areas.
>
> Personally I have no issue if others want to pursue those things I have no
> interest in and certainly the way I intend refactoring the code would allow
> anyone to develop their own plugins to get the metrics out and into some
> other system.
>
> In saying that, please don't take this as me saying 'patches welcome' and
> otherwise buzz off.
> I hate the way that some Open Source projects will say
> that when they don't have time to do something themselves. Reality is that
> I am time poor and I simply need to focus my time in the best way I can.
>
> If you are genuinely interested in trying to fill in those areas where I
> feel I can't do a good job or don't have the time, I will not stop you nor
> make it difficult and will actually be as accommodating as I can and make it
> easier for you to get the data out and also advise on what would be the
> best way to do something.
>
> What I simply am not in a position to do is lead such an initiative. My
> own priorities and interest will always take precedence and I have come to
> learn that I must do that if I am to avoid becoming burnt out again in
> respect of the work I do on mod_wsgi. So it is due to a measure of self
> preservation that I take this stance.
>
> Hope that all makes sense and gives you a better idea of where I am
> heading and why I am restricting myself to that.
>
> Graham
>
> --
> You received this message because you are subscribed to the Google Groups
> "modwsgi" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/modwsgi.
> For more options, visit https://groups.google.com/d/optout.
