*Background:*
When the JMX exporter is overloaded (say 20-30 qps), we have observed that 
some requests take more than 20 seconds to serve, which is longer than the 
client-side request timeout. As a result, the agent tries to send responses 
on connections that the client has already closed, which has two 
consequences: (a) it results in errors and (b) it sometimes leads to the 
socket channel blocking indefinitely on the write syscall. Eventually, all 
threads of the HTTP server in the Prometheus agent get stuck and no more 
requests can be served. However, the thread accepting connections is still 
active, so new connections are created but never actually used. Since all 
request threads of the HTTP server are stuck, these connections are never 
closed by the server, resulting in a long backlog of CLOSE_WAIT sockets 
waiting to be closed.

*Proposed Solution*

*1. Limit connections*
We want to limit the number of connections to the exporter agents. There is 
no native way for the JMX exporter to enforce such a limit, so as a 
workaround we will be adding iptables rules for the JMX port.
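
For discussion, here is a rough sketch of what an in-process limit could 
look like if the exporter had one. It is not existing jmx-exporter code: 
BoundedHandler and maxInFlight are hypothetical names, and it caps 
concurrent in-flight requests rather than TCP connections, so it is not a 
replacement for the iptables rule. It only assumes the JDK's built-in 
com.sun.net.httpserver API.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import java.io.IOException;
import java.util.concurrent.Semaphore;

/**
 * Hypothetical wrapper (not part of jmx-exporter): rejects requests with
 * HTTP 503 once a fixed number of requests are already being served.
 */
final class BoundedHandler implements HttpHandler {
    private final HttpHandler delegate;
    private final Semaphore permits;

    BoundedHandler(HttpHandler delegate, int maxInFlight) {
        this.delegate = delegate;
        this.permits = new Semaphore(maxInFlight);
    }

    @Override
    public void handle(HttpExchange exchange) throws IOException {
        if (!permits.tryAcquire()) {
            // Over the limit: answer immediately instead of tying up a thread.
            try {
                exchange.sendResponseHeaders(503, -1); // -1 means no response body
            } finally {
                exchange.close();
            }
            return;
        }
        try {
            delegate.handle(exchange);
        } finally {
            // Always return the permit, even if the delegate throws.
            permits.release();
        }
    }

    /** Number of requests that could still be admitted right now. */
    int availablePermits() {
        return permits.availablePermits();
    }
}
```

Such a wrapper would be registered in place of the real handler, for 
example server.createContext("/metrics", new BoundedHandler(realHandler, 4)), 
where realHandler and the limit of 4 are placeholders.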

*2. Adding timeouts to requests*
This can easily be achieved with JVM settings, but it would be nice to 
document these settings in the jmx-exporter documentation.

*-Dsun.net.httpserver.maxReqTime=20 -Dsun.net.httpserver.maxRspTime=20*
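
When the exporter runs as a Java agent, the flags above go on the JVM 
command line. For a process that creates the HTTP server itself, the same 
properties could in principle be set from code, as in the sketch below. 
TimeoutDefaults and applyIfUnset are hypothetical names, and the sketch 
assumes the JDK reads these properties once, when its HTTP server classes 
are first loaded, so the call has to happen before the first 
HttpServer.create(...).

```java
/**
 * Hypothetical helper (not part of jmx-exporter): sets the JDK HTTP server
 * timeout properties unless they were already given on the command line.
 */
final class TimeoutDefaults {
    private static final String MAX_REQ_TIME = "sun.net.httpserver.maxReqTime";
    private static final String MAX_RSP_TIME = "sun.net.httpserver.maxRspTime";

    private TimeoutDefaults() {}

    /** Values are in seconds, matching the -D flags in the proposal. */
    static void applyIfUnset(int seconds) {
        setIfUnset(MAX_REQ_TIME, seconds);
        setIfUnset(MAX_RSP_TIME, seconds);
    }

    private static void setIfUnset(String key, int seconds) {
        // An explicit -D flag wins over this default.
        if (System.getProperty(key) == null) {
            System.setProperty(key, Integer.toString(seconds));
        }
    }
}
```

Calling TimeoutDefaults.applyIfUnset(20) early in main would then match the 
20-second values proposed above.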

*Please let me know if this idea makes sense to the community. I can work on 
the design.*


Thanks
Brajesh Kumar
