On Sun, Jul 31, 2016 at 5:31 AM, <almeida.pedro...@gmail.com> wrote:

> It's a new area to me, store tons of GB in a GC language.


Set up an SLA for the service:

* 99th percentile: 99 of 100 requests are under 5ms processing time.
* 99.99th percentile: 9,999 of 10,000 requests are under 25ms processing
time.
* 99.9999th percentile: 999,999 of 1,000,000 requests are under 40ms
processing time.

By setting percentile latency targets like the above, you avoid several
problems:

* If you say "NO request may be slower than 10ms" you are making a claim
you cannot guarantee. There is always a larger doomsday scenario which you
didn't account for. And engineering a system with enough headroom to never
hit a doomsday scenario is almost always a waste of programming resources.

* Even in the no-doomsday game, many requests will arrive in spikes. You
generally don't have enough cores to process all of them at once, so you
will have to queue them, which means your SLA needs room for queueing
latency. Most people just cram this under the idea of "if my service is
blazing fast, problems don't happen". That strategy is only viable in the
most naive systems. Stability comes from proactive queue management and
clever spike handling. This is one of the places where Go's preemption
capabilities tend to help.
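
Proactive queue management can be as simple as a bounded channel in
front of the workers: shed the load you can't serve inside the SLA
instead of letting queueing latency grow without bound. A sketch (the
`Server` type and names are my own):

```go
package main

import (
	"errors"
	"fmt"
)

var ErrOverloaded = errors.New("overloaded: request shed")

// Server puts a bounded queue in front of its workers. Rather than
// letting a spike build unbounded queueing latency, it rejects the
// requests it cannot serve in time.
type Server struct {
	queue chan func()
}

func NewServer(queueDepth, workers int) *Server {
	s := &Server{queue: make(chan func(), queueDepth)}
	for i := 0; i < workers; i++ {
		go func() {
			for job := range s.queue {
				job()
			}
		}()
	}
	return s
}

// Submit enqueues a job, or fails fast when the queue is full.
func (s *Server) Submit(job func()) error {
	select {
	case s.queue <- job:
		return nil
	default:
		return ErrOverloaded
	}
}

func main() {
	s := NewServer(2, 0) // zero workers here, so the demo queue fills at once
	for i := 0; i < 3; i++ {
		if err := s.Submit(func() {}); err != nil {
			fmt.Println("request", i, "shed:", err)
		} else {
			fmt.Println("request", i, "queued")
		}
	}
}
```

The queue depth is where your SLA shows up: depth times per-request
service time is the extra latency you've agreed to tolerate.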

* You establish a baseline, GC'ed language or not. When I sustain-loaded
Varnish with 30k req/s for 5 minutes, its 99th percentile was well into
the "several seconds" ballpark. The reason is that Varnish's default 500
threads can't keep up, and queue buildup drives the latency. Note that
Varnish has no GC and uses mmap()'ed files. In other words, ripping out
the GC is not a sufficient condition for solving latency problems, and I
don't think it is a necessary condition either.

* You get a good acceptance criterion. You should also do ballpark napkin
math on the desired latency levels: how many microseconds do you have
per request to burn on the machine? If this looks completely impossible
from the get-go, you need to adjust your SLA latencies.

Also, the size of the heap is not everything. Large blocks of memory with
no pointers tend to be fast to scan, so the "pointer density" will tell you
a lot about the latencies. But to make this work, you need to conduct
experiments.
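
One such experiment: compare forced-GC time over a pointer-free heap
against a pointer-dense heap of similar footprint. A rough sketch
(absolute timings vary by machine and Go version; the shape of the
difference is the point):

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// timeGC forces a full collection and reports how long it took,
// which is dominated by scanning live pointers.
func timeGC() time.Duration {
	start := time.Now()
	runtime.GC()
	return time.Since(start)
}

func main() {
	// Pointer-free heap: one large byte slice. The collector knows
	// it cannot contain pointers and skips its contents entirely.
	flat := make([]byte, 1<<28) // 256 MiB
	fmt.Println("pointer-free heap GC:", timeGC())
	runtime.KeepAlive(flat)
	flat = nil

	// Pointer-dense heap of similar footprint: millions of small
	// objects the collector must trace one by one.
	dense := make([]*[64]byte, 1<<22) // ~4M pointers
	for i := range dense {
		dense[i] = new([64]byte)
	}
	fmt.Println("pointer-dense heap GC:", timeGC())
	runtime.KeepAlive(dense)
}
```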

If you have lots of additional machinery to burn, you can also use an old
trick: send the request to N servers, with a 2ms delay between each send.
Take the first response that arrives and have its server cancel the
request at the others. This can often hide a latency spike by bounding
the extra wait from above by the 2ms stagger.



-- 
J.
