On Mon, Feb 22, 2021 at 5:40 AM Ben Kochie <[email protected]> wrote:

> The problem I have with the histogram approach (and this is partly due to
>> the current way histograms work in Prometheus) is that I don't know the
>> distribution a priori.
>>
>> I let smokeping_prober run for a few days against several IP addresses.
>> For a particular one, after 250+ thousand observations, it's telling me
>> that the round trip time is somewhere between 51.2 ms and 102.4 ms. Using
>> the sum and the count from histogram data I can derive an average (the
>> mean, not the median) over a short window, and it's giving me ~ 60 ms. I
>> happen to know
>> (from the individual observations) that the 95th percentile is also ~ 60
>> ms, and that's pretty much the 50th percentile (the spread of the
>> observations is very small). The actual min/max/avg from observations is
>> something like 59.1 / 59.7 / 59.4 ms. If I use the data from the histogram
>> the 50th percentile comes out as ~ 77 ms and the 95th percentile as ~ 100
>> ms. I must be missing something, because I don't see how I would extract
>> the min / max / dev from the available data. I do understand that the
>> standard deviation for this data is unusually small (compared to what you'd
>> expect to see in the wild), but still...
>>
>
> The default histogram buckets in the smokeping_prober cover latency
> durations from localhost to the moon and back. It's relatively easy to
> adjust the buckets, and easy enough to get within a reasonable range for
> your network expectations.
>
> Without knowing exactly which queries you're running, it's hard to say
> where the discrepancy comes from. If you're using the histogram count/sum,
> this will give you
> the mean value.
>

histogram_quantile(0.95, rate(smokeping_response_duration_seconds_bucket[1m]))
histogram_quantile(0.50, rate(smokeping_response_duration_seconds_bucket[1m]))
histogram_quantile(0.05, rate(smokeping_response_duration_seconds_bucket[1m]))
increase(smokeping_response_duration_seconds_sum[1m])
  / increase(smokeping_response_duration_seconds_count[1m])

and yes, I'm using the default buckets, but that's what I said before: I
don't know the distribution a priori. Ideally I would generate buckets
centered around the expected mean, but that mean is wildly different
depending on the target IP address. So I'm left with either defining too
many buckets, or buckets too wide to give good estimates for the quantities
above, when my original problem was, in principle, to provide a reasonable
guesstimate for packet loss and variance...


> There is one known issue with the smokeping_prober right now that I need
> to fix: the ping library's handling of sequence numbers is broken, so they
> don't wrap correctly.
>
>
>>
>> I also have to think of the data size. For 1 ICMP packet every 1 second,
>> I'm at (order of magnitude) 100 MB of data per target per month. Reducing
>> this to 5 packets every 60 seconds I'm down to 10 MB (order of magnitude).
>> This doesn't sound like much for a single target but it does add up.
>>
>
> Yes, this is going to be an issue no matter what you do. I don't see how
> this relates to any mode of operation.
>

I'm sorry I wasn't clear enough...

With the way smokeping_prober works, I can send one packet per second and
that produces ~ 100 MB / target / month in traffic.

With what I wrote initially, one burst of 5 packets every 60 seconds, I'm
down to 10 MB / target / month.

I could run smokeping_prober with a ping interval of 12 seconds, and I
would get the same 10 MB / target / month, but then I go back to my
original question: what do I gain by doing this vs adding functionality to
blackbox_exporter to send multiple packets per probe?
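The back-of-the-envelope arithmetic, for completeness (the ~40 bytes per
probe is an assumed on-the-wire size, actual ICMP echo sizes vary; it's the
12x interval ratio that drives the reduction):

```python
# Rough per-target monthly traffic estimate. BYTES_PER_PROBE = 40 is an
# assumption; only the ratio between the two probing intervals matters.
SECONDS_PER_MONTH = 30 * 24 * 3600   # ~2.6 million
BYTES_PER_PROBE = 40

one_per_second = SECONDS_PER_MONTH * BYTES_PER_PROBE / 1e6
five_per_minute = (SECONDS_PER_MONTH / 60) * 5 * BYTES_PER_PROBE / 1e6
print(f"~{one_per_second:.0f} MB vs ~{five_per_minute:.0f} MB per target per month")
# prints "~104 MB vs ~9 MB per target per month"
```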

Thanks again,

Marcelo

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Developers" group.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/CABiJYgasSNUCox48tLEfwE%3DmjEN6Z2BYk%2B441GBwXt%3DijJ10Vg%40mail.gmail.com.
