On Mon, Oct 26, 2020 at 8:21 PM JuanPablo AJ <jpabl...@gmail.com> wrote:

> Jesper,
> thanks a lot for your email, your answer was a hand in the dark forest of
> doubts.
>
> I will start trying the load generator wrk2.
>
> About "instrument, profile, observe", yes, I added the gops agent but
> until now I don't have any conclusion related to that information.
>
>
I'm a proponent of adding metrics to the code running on your production
systems. If the system has low load, you can certainly pay the overhead of
such metrics. If the system has high load, you can always sample and only
measure a fraction of the requests (1%, say). I'm happy to pay a cost of
5-10% on my production systems if the sacrifice means I know what is going
on. Observability is formally defined as a way to determine the state of a
system based on its outputs[0]. If you start emitting metrics alongside
your genuine program output, you stand a far better chance of figuring out
what is going on inside the system. Also, metrics tend to be proactive:
problems can show themselves in metrics long before the critical threshold
of system failure is hit.

Some good algorithms and data structures which your metrics package could
employ, either directly or as a variant thereof:

* Vitter's algorithm R. It is related to a Fisher-Yates shuffle in a
peculiar and interesting way. Though you may have to drop or decay the
reservoir unless you are measuring the whole window.
* Gil Tene's HdrHistogram. This essentially builds a histogram on an
observation about floating point numbers: if we regard the exponent as
buckets, each containing a set of mantissa buckets, we can quickly
increment a bucket (a few nanoseconds). And the exponential nature means we
have high resolution close to 0 and less resolution away from 0. But this
is often what one wants: if something takes 5 minutes, you often don't care
if it was 5 minutes and 34 microseconds, so the approximation is sound.
HdrHistogram also supports some nice algebraic properties such as merging
(it forms a commutative monoid with the empty histogram as neutral element,
and merging as the composition operation).
* HyperLogLog-based data structure ideas: accept approximate values in
exchange for much smaller data storage needs.
* Decay ideas: If you keep a pair of (value, timestamp), you can decay the
value over time according to some curve you decide. Keep an array of these
and you can track top popular items efficiently. Periodically go through
the array and weed out any value which has decayed below a noise floor, to
keep the array small.
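
As an illustration of the reservoir item in the list above, here is a
minimal sketch of Algorithm R in Go. The type and method names are my own
and the seed parameter is only there to make runs repeatable; it keeps a
fixed-size uniform sample of a stream and does no dropping or decaying of
the reservoir. It is not safe for concurrent use as written.

```go
package main

import "math/rand"

// Reservoir keeps a uniform random sample of at most `size` values from
// a stream of unknown length, following Vitter's Algorithm R.
type Reservoir struct {
	samples []float64
	size    int
	seen    int64
	rng     *rand.Rand
}

// NewReservoir returns an empty reservoir; size must not be negative.
func NewReservoir(size int, seed int64) *Reservoir {
	return &Reservoir{
		samples: make([]float64, 0, size),
		size:    size,
		rng:     rand.New(rand.NewSource(seed)),
	}
}

// Add offers one observation to the reservoir.
func (r *Reservoir) Add(v float64) {
	r.seen++
	if len(r.samples) < r.size {
		// The first `size` observations are kept unconditionally.
		r.samples = append(r.samples, v)
		return
	}
	// Pick j uniformly from [0, seen). The new value replaces slot j
	// only when j falls inside the reservoir, which happens with
	// probability size/seen.
	if j := r.rng.Int63n(r.seen); j < int64(r.size) {
		r.samples[j] = v
	}
}

// Samples returns a copy of the current sample.
func (r *Reservoir) Samples() []float64 {
	return append([]float64(nil), r.samples...)
}

// Seen returns how many observations have been offered so far.
func (r *Reservoir) Seen() int64 { return r.seen }
```

Percentiles can then be estimated by sorting the result of Samples().
Memory stays bounded by size no matter how many observations arrive.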
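
To illustrate the exponent/mantissa bucketing from the histogram item in
the list above, here is a toy sketch. It is not how HdrHistogram itself is
implemented (as far as I know the real thing works on integer values and
flat arrays, which is where the nanosecond increments come from). The
names, the map-based storage and the choice of 16 sub-buckets per power of
two are all assumptions made for the example.

```go
package main

import "math"

// subBuckets is the number of linear (mantissa) slots per power of two.
const subBuckets = 16

// bucket identifies one cell: a binary exponent plus a mantissa slot.
type bucket struct {
	exp int
	sub int
}

// Histogram counts positive observations in exponent/mantissa buckets.
type Histogram struct {
	counts map[bucket]uint64
	total  uint64
}

// NewHistogram returns the empty histogram, the neutral element of Merge.
func NewHistogram() *Histogram {
	return &Histogram{counts: make(map[bucket]uint64)}
}

// bucketOf maps v > 0 to its cell. Frexp splits v into frac * 2^exp with
// frac in [0.5, 1); frac is then cut into subBuckets equal slices.
func bucketOf(v float64) bucket {
	frac, exp := math.Frexp(v)
	return bucket{exp: exp, sub: int((frac - 0.5) * 2 * subBuckets)}
}

// Record adds one observation. This sketch silently ignores zero,
// negative, NaN and infinite values.
func (h *Histogram) Record(v float64) {
	if !(v > 0) || math.IsInf(v, 0) {
		return
	}
	h.counts[bucketOf(v)]++
	h.total++
}

// Merge adds the counts of o into h. Since it is plain per-bucket
// addition, it is associative and commutative.
func (h *Histogram) Merge(o *Histogram) {
	for b, c := range o.counts {
		h.counts[b] += c
	}
	h.total += o.total
}

// CountAt returns how many observations share a bucket with v.
func (h *Histogram) CountAt(v float64) uint64 {
	if !(v > 0) || math.IsInf(v, 0) {
		return 0
	}
	return h.counts[bucketOf(v)]
}

// Total returns the number of recorded observations.
func (h *Histogram) Total() uint64 { return h.total }
```

With 16 slots per power of two, 1.0 and 1.01 share a bucket and 300 lands
in the bucket covering [288, 304), so bucket width grows with the value
while relative error stays bounded. Because Merge is just addition,
per-goroutine or per-host histograms can be combined in any order.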

I'm not saying you should implement these things yourself. I'm saying that
a good metrics package will do that for you, and you should use one. The
key is to figure out which metrics your application needs and then add
those. The SRE handbooks I linked earlier have some good starting
points on what to measure. But nothing beats having knowledge of the
internals of a system so you can add the better metrics yourself.
At-a-glance blackbox metrics are nice. However, they often simply tell you
that something is wrong, but not what.

In general, descriptive statistics is the tool you need to understand
system behavior in the modern world. Infrastructures are simply too complex
nowadays. For more pin-point understanding, a profiler might work really
well, but the more concurrency a system has, the harder it is to glean
anything meaningful from a profile[1].



[0] Hat tip to Charity Majors for recognizing this from control theory.
[1] This is the same reason debuggers can have a hard time in a distributed
setting. Your program is halted, but half of the program lives behind an
API not under your control. And the timeout is lurking.

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CAGrdgiW3YS8VTstp7gx8WN4hEqN%3DioUbEBahb7g7jC%3DnXEGGKw%40mail.gmail.com.