[ 
https://issues.apache.org/jira/browse/OOZIE-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Kolesnikov updated OOZIE-3566:
------------------------------------
    Description: 
webservices.requests sampler based on 
{code:java}
org.apache.oozie.servlet.JsonRestServlet#TOTAL_REQUESTS_SAMPLER_COUNTER{code}
may produce wrong results if
{code:java}
org.apache.oozie.servlet.JsonRestServlet#service{code}
rises an Exception during request URL validation.

 

TOTAL_REQUESTS_SAMPLER_COUNTER is incremented in try block and decremented in 
finally block right after that. But as validateRestUrl comes before the counter 
is incremented it may lead to situation when counter is decremented without 
being incremented before (in case of validation error).

 

Attached image is an example graph of webservices.requests being drawn in 
Grafana via Graphite. Pay attention to Y-axis. While most nodes publish 
positive numbers close to 0, some others seems to be shifted off by 1, with 
results close to -1, util two of these nodes jump to numbers close to -2 around 
14:57. At this exact time both servers have URL validation errors in their 
logs, like this:
{code:java}
2019-11-29 14:57:29,394 WARN V2JobServlet:523 - USER[-] GROUP[-] TOKEN[-] 
APP[-] JOB[-] ACTION[-] URL[GET <SANITIZED>] error[E0301], E0301: Invalid 
resource [<SANITIZED>]{code}

  was:
webservices.requests sampler based on 
{code:java}
org.apache.oozie.servlet.JsonRestServlet#TOTAL_REQUESTS_SAMPLER_COUNTER{code}
may produce wrong results if
{code:java}
org.apache.oozie.servlet.JsonRestServlet#service{code}
rises an Exception during request URL validation.

 

TOTAL_REQUESTS_SAMPLER_COUNTER is incremented in try block and decremented in 
finally block right after that. But as validateRestUrl comes before the counter 
is incremented it may lead to situation when counter is decremented without 
being incremented before.

 

Attached image is an example graph of webservices.requests being drawn in 
Grafana via Graphite. Pay attention to Y-axis. While most nodes publish 
positive numbers close to 0, some others seems to be shifted off by 1, with 
results close to -1, util two of these nodes jump to numbers close to -2 around 
14:57. At this exact time both servers have URL validation errors in their 
logs, like this:
{code:java}
2019-11-29 14:57:29,394 WARN V2JobServlet:523 - USER[-] GROUP[-] TOKEN[-] 
APP[-] JOB[-] ACTION[-] URL[GET <SANITIZED>] error[E0301], E0301: Invalid 
resource [<SANITIZED>]{code}


> Wrong numbers provided by Oozie monitoring webservices.requests sampler
> -----------------------------------------------------------------------
>
>                 Key: OOZIE-3566
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3566
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Maxim Kolesnikov
>            Priority: Major
>         Attachments: unnamed.png
>
>
> webservices.requests sampler based on 
> {code:java}
> org.apache.oozie.servlet.JsonRestServlet#TOTAL_REQUESTS_SAMPLER_COUNTER{code}
> may produce wrong results if
> {code:java}
> org.apache.oozie.servlet.JsonRestServlet#service{code}
> rises an Exception during request URL validation.
>  
> TOTAL_REQUESTS_SAMPLER_COUNTER is incremented in try block and decremented in 
> finally block right after that. But as validateRestUrl comes before the 
> counter is incremented it may lead to situation when counter is decremented 
> without being incremented before (in case of validation error).
>  
> Attached image is an example graph of webservices.requests being drawn in 
> Grafana via Graphite. Pay attention to Y-axis. While most nodes publish 
> positive numbers close to 0, some others seems to be shifted off by 1, with 
> results close to -1, util two of these nodes jump to numbers close to -2 
> around 14:57. At this exact time both servers have URL validation errors in 
> their logs, like this:
> {code:java}
> 2019-11-29 14:57:29,394 WARN V2JobServlet:523 - USER[-] GROUP[-] TOKEN[-] 
> APP[-] JOB[-] ACTION[-] URL[GET <SANITIZED>] error[E0301], E0301: Invalid 
> resource [<SANITIZED>]{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to