Today we had a small crisis,
but I don't know why it started ... or why it ended!

My uwsgi.log was full of:

"SIGPIPE: writing to a closed pipe/socket/fd (probably the client
disconnected) on request"

"IOError: write error"

"uwsgi_response_writev_headers_and_body_do(): Broken pipe [core/writer.c
line 296]"

"DAMN ! worker 3 (pid: 21418) died, killed by signal 9 :( trying respawn
..."


I restarted the uwsgi service with no result: uwsgi.log was again full of
the same messages!

In the browser I got the nginx "Bad Gateway" message.

The uwsgi host showed no CPU or memory pressure.

So I crossed my fingers and tried a full restart of the host.
And... all fine!

So far so good, ... but why?!?

Here is another important piece of information:

At about the time the crisis started, I got a NetEye warning about
postgres:

Notification Type: PROBLEM
Service: POSTGRES_LOCKS
Additional Info: POSTGRES_LOCKS WARNING: DB postgres (host:192.168.100.164)
total locks: 1397

And at about the time the crisis ended (after I restarted the uwsgi host),
I got a NetEye recovery notice:

Notification Type: RECOVERY
Service: POSTGRES_LOCKS
Additional Info: POSTGRES_LOCKS OK: DB postgres (host:192.168.100.164)
total=571

But which is the cause and which is the effect?

One last piece of information:
During the same period I noticed a very large number of page requests
from a single IP.
I guess it is a script automating some user actions on the site.
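
To put a number on those requests, a minimal sketch like the one below
could count hits per client IP in the nginx access log. It assumes the
default "combined" log format, where the client address is the first
whitespace-separated field; the function name, the log path in the comment
and the sample lines are hypothetical.

```python
from collections import Counter


def top_clients(log_lines, n=5):
    """Return the n most frequent client IPs as (ip, count) pairs.

    Assumes the client address is the first whitespace-separated field,
    as in nginx's default "combined" log format.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split(None, 1)
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    # Hypothetical sample lines; in practice pass an open file object,
    # e.g. open("/var/log/nginx/access.log") (path is an assumption).
    sample = [
        '203.0.113.7 - - [01/Jan/2015:10:00:00 +0000] "GET /a/ HTTP/1.1" 200 512',
        '203.0.113.7 - - [01/Jan/2015:10:00:01 +0000] "GET /b/ HTTP/1.1" 200 512',
        '198.51.100.2 - - [01/Jan/2015:10:00:02 +0000] "GET /a/ HTTP/1.1" 200 512',
    ]
    for ip, hits in top_clients(sample):
        print(ip, hits)
```

Comparing the top entry for the crisis window with a normal day should
tell whether that IP really stands out.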

OK, so we have 3 ingredients:
1) uwsgi: Broken pipe, worker restart
2) postgres lock warnings
3) a script hitting my django pages

My plan is as follows:
a) ask the uwsgi list for ideas ... and here I am :-)
b) schedule a detailed query on postgres to determine exactly which
objects are locked (the current NetEye monitor gives me only a count)
c) enable some sort of DoS filter (rate limiting) on nginx to stop
unwanted HTTP "scripting"
(http://nginx.org/en/docs/http/ngx_http_limit_req_module.html)
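
For step (b), here is a minimal sketch of the kind of query I have in
mind: it reads the pg_locks system view and groups the rows by relation
and lock mode. It assumes a DB-API 2.0 cursor (for example one from
psycopg2) already connected to the affected database; the function names
are hypothetical, and relation names resolve only for tables in the
database the cursor is connected to.

```python
from collections import Counter

# pg_locks (view) and pg_class (catalog) are standard in PostgreSQL.
LOCKS_SQL = """
SELECT c.relname, l.locktype, l.mode, l.granted, l.pid
FROM pg_locks l
LEFT JOIN pg_class c ON c.oid = l.relation
"""


def fetch_locks(cursor):
    """Run the lock query on an already-open DB-API cursor (assumption)."""
    cursor.execute(LOCKS_SQL)
    return cursor.fetchall()


def summarize_locks(rows):
    """Count locks per (target, mode, granted), most frequent first."""
    counts = Counter()
    for relname, locktype, mode, granted, pid in rows:
        # Non-relation locks (e.g. transactionid) have no relation name,
        # so fall back to the lock type in parentheses.
        target = relname if relname is not None else "(%s)" % locktype
        counts[(target, mode, granted)] += 1
    return counts.most_common()
```

Logging the first few entries of summarize_locks(fetch_locks(cursor))
every minute or so should show which tables accumulate locks when the
NetEye count rises, and whether any of them are waiting (granted false).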

But step (a) is, well, step (A)!
I admit I dream of something like:
"Hi, you should just change the parameter
'cabina-telefonica-per-super-uwsgi' to 1 and all will be ok"

So here I am: is there any "super-uwsgi" parameter that I am not aware
of? :-)

Otherwise, what do you think?
Do you have any hint, link, advice or anything else?

Thank you so much for your attention!
Marco

P.S. I had a similar crisis some months ago:
the same uwsgi messages and the same postgres NetEye notice.
I tried a uwsgi restart, but it had no effect.
Everything was fine again after about 15 minutes, without the need for a
full host restart.
_______________________________________________
uWSGI mailing list
[email protected]
http://lists.unbit.it/cgi-bin/mailman/listinfo/uwsgi