[ 
https://issues.apache.org/jira/browse/SOLR-14683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17226729#comment-17226729
 ] 

Andrzej Bialecki commented on SOLR-14683:
-----------------------------------------

Prometheus best practices recommend "avoiding missing metrics" (as if that were 
always possible... what about e.g. metrics that go missing due to network 
connectivity problems?), and recommend reporting 0 or NaN for missing numeric 
metrics:

{quote}
Avoid missing metrics
Time series that are not present until something happens are difficult to deal 
with, as the usual simple operations are no longer sufficient to correctly 
handle them. To avoid this, export 0 (or NaN, if 0 would be misleading) for any 
time series you know may exist in advance.

Most Prometheus client libraries (including Go, Java, and Python) will 
automatically export a 0 for you for metrics with no labels.
{quote}

For frequently occurring events, where the average value of the metric may be 
high, reporting 0 WILL skew the stats more than reporting NaN. Reporting NaN 
also clearly indicates that the data is not available, as opposed to 0, which 
may be a legitimate value of the metric.
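
To illustrate the skew with made-up numbers, here is a minimal Python sketch. 
The sample values and the helper names ({{with_placeholder}}, 
{{mean_ignoring_nan}}) are hypothetical and not taken from Solr or Prometheus; 
the sketch only assumes that a consumer can recognize NaN and skip it, which it 
cannot do for 0.

```python
import math
from statistics import mean

# Hypothetical samples of a latency-like gauge (ms); the third sample is missing.
observed = [120.0, 130.0, None, 125.0]


def with_placeholder(samples, placeholder):
    """Replace missing samples (None) with the given placeholder value."""
    return [placeholder if s is None else s for s in samples]


def mean_ignoring_nan(samples):
    """Average only the samples that are real numbers; NaN marks 'no data'."""
    valid = [s for s in samples if not math.isnan(s)]
    return mean(valid) if valid else math.nan


zero_filled = with_placeholder(observed, 0.0)
nan_filled = with_placeholder(observed, math.nan)

# A 0 placeholder is indistinguishable from a real reading and drags the mean down.
print(mean(zero_filled))              # 93.75
# A NaN placeholder can be detected and excluded by the consumer.
print(mean_ignoring_nan(nan_filled))  # 125.0
```

The higher the typical value of the metric, the larger the gap between the two 
results becomes.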

The problem is that the JSON standard does not define a serialization for NaN; 
only extensions such as JSON5 (http://json5.org) do. The current JSON standard, 
ECMA-404, says "Numeric values that cannot be represented as sequences of 
digits (such as Infinity and NaN) are not permitted."

So the only standard option left in JSON to indicate that the data is missing 
is to return {{null}}.
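
As a rough illustration of the difference (a sketch using Python's standard 
{{json}} module, not Solr's actual serialization code; the metric names and the 
{{sanitize}} helper are assumptions made up for this example): a lenient 
serializer emits a bare NaN token that strict parsers reject, while mapping 
non-finite values to {{null}} keeps the output within the standard.

```python
import json
import math


def sanitize(value):
    """Recursively replace non-finite floats (NaN, +/-Infinity) with None."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {k: sanitize(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(v) for v in value]
    return value


# Hypothetical metrics snapshot; the names are illustrative only.
metrics = {"INDEX.sizeInBytes": math.nan, "QUERY.requests": 42}

# By default Python emits a bare NaN token, which ECMA-404 does not permit.
lenient = json.dumps(metrics)

# With allow_nan=False the serializer refuses to emit NaN at all.
try:
    json.dumps(metrics, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc

# After mapping NaN to None, strict serialization succeeds and yields null.
strict = json.dumps(sanitize(metrics), allow_nan=False)
print(strict)  # {"INDEX.sizeInBytes": null, "QUERY.requests": 42}
```

A client then only needs to treat {{null}} as "no data", rather than knowing a 
per-metric sentinel value.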

> Review the metrics API to ensure consistent placeholders for missing values
> ---------------------------------------------------------------------------
>
>                 Key: SOLR-14683
>                 URL: https://issues.apache.org/jira/browse/SOLR-14683
>             Project: Solr
>          Issue Type: Improvement
>          Components: metrics
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>
> Spin-off from SOLR-14657. Some gauges can legitimately be missing or in an 
> unknown state at some points in time, e.g. during SolrCore startup or shutdown.
> Currently the API returns placeholders with either impossible values for 
> numeric gauges (such as index size -1) or empty maps / strings for other 
> non-numeric gauges.
> [~hossman] noticed that the values for these placeholders may be misleading, 
> depending on how the user treats them - if the client has no special logic to 
> treat them as "missing values" it may erroneously treat them as valid data. 
> E.g. numeric values of -1 or 0 may severely skew averages and produce 
> misleading peaks / valleys in metrics histories.
> On the other hand returning a literal {{null}} value instead of the expected 
> number may also cause unexpected client issues - although in this case it's 
> clearer that there's actually no data available, so long-term this may be a 
> better strategy than returning impossible values, even if it means that the 
> client should learn to handle {{null}} values appropriately.



