On 2017-07-30 13:30, Peter Booth wrote:

It appears that you have a lot of data that could help in this
analysis.
How frequently is the status page being queried? Does every status
datapoint get recorded
or is munin showing some rolled up rrd data?

The nginx status page is queried every 5 minutes (the default Munin polling interval), and Munin stores the raw metrics in an RRD database. But Munin is not important to this issue: I get the same values if I query the status page directly.


If you open the status page in a browser do the numbers report match
what you see with netstat?

Waiting does:

# netstat -n | grep -E "tcp4|tcp6" | grep ESTABLISHED | wc -l \
  && echo "----------------------------" \
  && fetch -qo - http://10.0.0.4/nginx_status

      82
----------------------------
Active connections: 89
server accepts handled requests
 669843 669843 3158515
Reading: 0 Writing: 22 Waiting: 82

And I ran it a few times with several minutes in between, the above is just an example from the last run. This is inside the nginx jail, so grepping tcp4|tcp6 shows only connections to the nginx server.

Now, the part I don't quite understand is whether Active = Reading + Writing + Waiting. The above certainly doesn't seem to suggest so.
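
To make that comparison repeatable, the status output can be reduced to one line that shows the three per-state counters, their sum, and how far the sum is from Active. This is only a minimal sketch: the function name status_summary is made up for illustration, and it assumes the four-line stub_status layout shown above. It does not explain the difference, it only makes it visible.

```shell
# Hypothetical helper: reads nginx stub_status output on stdin and prints
# the counters, the sum Reading + Writing + Waiting, and sum - Active.
status_summary() {
    awk '
        /^Active connections:/ { active = $3 }
        /^Reading:/            { reading = $2; writing = $4; waiting = $6 }
        END {
            sum = reading + writing + waiting
            printf "active=%d reading=%d writing=%d waiting=%d sum=%d diff=%d\n",
                   active, reading, writing, waiting, sum, sum - active
        }
    '
}
```

It would be fed from the same command used above, for example: fetch -qo - http://10.0.0.4/nginx_status | status_summary. For the sample output above, the sum is 0 + 22 + 82 = 104 against 89 Active, a difference of 15.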



Do you have a hypothesis that explains
why the graph could jump back to 12/13, rather than spend a few days
increasing linearly in the way it did from
the 18th to the 23rd?

Bots crawling the sites, pacing themselves over a longer time frame, so there's no correlation with the daily sinusoid caused by live visitors. We do have a lot of resources on all those sites to crawl through. They're all real estate agency sites, and there are tens of thousands of pages with hundreds of thousands of images. And looking at the logs, quite a number of requests come from bots (the ones decent enough to say they're bots).

We've deviated a bit into assuming this is a bug or some unexpected behavior (my fault for suggesting it in the beginning). All I wanted to do was check which IPs nginx considers itself "Writing" to. The only reason this caught my attention was the apparently "flat" appearance of Writing, but now, thinking about bots, this could be quite normal.
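
The status page only exposes totals, so it cannot say which peers are counted as "Writing". A rough approximation is to tally established connections per remote address from the netstat output and look for peers that stay at the top across runs. The sketch below assumes FreeBSD's netstat -n layout (foreign address in the fifth column, port appended after a final dot, state in the sixth); the function name peers_by_count is hypothetical. It counts all established connections per peer, not only the ones nginx reports as Writing.

```shell
# Hypothetical helper: reads `netstat -n` output on stdin and prints
# "<count> <remote address>" for ESTABLISHED TCP connections, busiest first.
peers_by_count() {
    awk '
        $1 ~ /^tcp[46]$/ && $6 == "ESTABLISHED" {
            peer = $5
            sub(/\.[0-9]+$/, "", peer)   # drop the trailing .port
            count[peer]++
        }
        END { for (p in count) print count[p], p }
    ' | sort -rn
}
```

Run inside the nginx jail as netstat -n | peers_by_count. Peers that keep appearing at the top could then be matched against the bot user agents in the access logs.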


How long was nginx down for? If you graph only the “writing”
variable for just 23rd July does the length of
time that the # of writing connections is thought to be 0 make sense?

It was only restarted. It appears the "offending" connections started showing up less than an hour later.



I wonder whether what you are seeing could be a side-effect of the
server being in a FreeBSD jail?

I doubt it. I used to see this when the server was on Debian Jessie, but it was much less noticeable. Then again, back then we had much less traffic and much less content.



Do any of the other nginx sites in other jails exhibit the same
behavior?

There is only one instance of nginx running on the server. Individual sites are only running php-fpm or uwsgi in their jails.


In FreeBSD jails is there an equivalent of Dom0 in a Xen hypervisor? A
parent or root OS?

FreeBSD jails are OS-level virtualization. It's basically similar to containers on Linux but with more isolation (it's not just namespacing).


If so, do you see all connections on all jails when you log into it? I'm
wondering if you are hitting some ulimit or
resource shortage on the host as a whole?

I don't think it's that, as limits are far above the current demands for traffic, and there's nothing logged about potential resource exhaustion.



Thanks for helping me figure this out.


--
Vlad K.
_______________________________________________
nginx mailing list
nginx@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx
