Benoit Tellier created JAMES-3107:
-------------------------------------

             Summary: Log request when P99 is exceeded
                 Key: JAMES-3107
                 URL: https://issues.apache.org/jira/browse/JAMES-3107
             Project: James Server
          Issue Type: New Feature
          Components: Metrics
            Reporter: Benoit Tellier


Given our current tooling, I struggle to correctly review slow requests from
James.

My current procedure is:
  - In Grafana, identify the timestamp of a spike
  - Dig through the logs in Kibana until I find something that could correspond
  - Pray and hope my analysis stands.

This is time-consuming, hard to do, and unreliable.

Identifying slow queries is important as it can point us to the critical paths
to optimize.

Hence I propose to log an info message when p99 is exceeded for high-level
functions (JMAP methods, IMAP processors, matchers, mailets and overall
processing, mailbox listeners, and remote delivery).

In order to avoid log spamming, I propose to only log when a per-function
threshold is exceeded (defaulting to 100ms).
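
To make the idea concrete, here is a minimal sketch of what such a
threshold-based timer could look like. It is only an illustration: the class
name SlowOperationLogger, its methods, and the message format are hypothetical
and not existing James code, and the Consumer<String> sink stands in for
whatever logger would really be used (for example an SLF4J logger at INFO
level).

```java
import java.time.Duration;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper, not an existing James class.
final class SlowOperationLogger {
    // Default threshold taken from the proposal (100ms).
    static final Duration DEFAULT_THRESHOLD = Duration.ofMillis(100);

    // Stand-in for a real logger; receives one line per slow operation.
    private final Consumer<String> sink;

    SlowOperationLogger(Consumer<String> sink) {
        this.sink = sink;
    }

    // Times the operation and reports it only if it exceeds the threshold.
    <T> T time(String name, Duration threshold, Supplier<T> operation) {
        long start = System.nanoTime();
        try {
            return operation.get();
        } finally {
            report(name, threshold, Duration.ofNanos(System.nanoTime() - start));
        }
    }

    // Same as above, using the default threshold.
    <T> T time(String name, Supplier<T> operation) {
        return time(name, DEFAULT_THRESHOLD, operation);
    }

    // Kept separate so the threshold check can be exercised without real delays.
    // Returns true when a message was emitted.
    boolean report(String name, Duration threshold, Duration elapsed) {
        if (elapsed.compareTo(threshold) <= 0) {
            return false;
        }
        sink.accept(String.format("Slow operation %s took %d ms (threshold %d ms)",
            name, elapsed.toMillis(), threshold.toMillis()));
        return true;
    }
}
```

A caller would wrap a high-level function in time(...), passing its own
threshold where the default does not fit. Operations at or under the threshold
produce no log line at all, which is what keeps the volume down.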

I believe it will help us come up with more meaningful performance analysis
and better fixes, for the greater good of our production platforms.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
