On Sun, Jul 31, 2016 at 5:31 AM, <almeida.pedro...@gmail.com> wrote:

> It's a new area to me, storing tons of GB in a GC language.
Set up an SLA for the service:

* 99th percentile: 99 of 100 requests complete in under 5ms processing time.
* 99.99th percentile: 9,999 of 10,000 requests complete in under 25ms processing time.
* 99.9999th percentile: 999,999 of 1,000,000 requests complete in under 40ms processing time.

By setting tiered latency targets like the above, you avoid several problems:

* If you say "NO request may be slower than 10ms", you are making a claim you cannot guarantee. There is always a larger doomsday scenario you didn't account for, and engineering a system with enough headroom to never hit a doomsday scenario is almost always a waste of programming resources.

* Even in the no-doomsday game, many requests will arrive in spikes. You generally don't have enough cores to process all of these at once, so you will have to queue them, which means your SLA needs to budget for queueing latency. Most people just cram this under the idea of "if my service is blazing fast, problems don't happen". That strategy is only viable in the most naive systems. Stability comes from proactive queue management and clever spike handling. This is one of the places where Go's preemption capabilities tend to help.

* You establish a baseline, GC'ed language or not. When I sustain-loaded Varnish with 30k req/s for 5 minutes, its 99th percentile was well into the "several seconds" ballpark. The reason is that Varnish's default 500 threads can't keep up, and a latency queue builds up. I note that Varnish has no GC and uses mmap()'ed files. In other words, ripping out the GC is not a sufficient condition for solving latency problems, and I don't think it is a necessary condition either.

* You get a good acceptance criterion.

You should also do ballpark napkin math on the desired latency levels: how many microseconds do you have per request to burn on the machine? If this looks completely impossible from the get-go, you need to adjust your SLA latencies.

Also, the size of the heap is not everything.
Large blocks of memory with no pointers tend to be fast to scan, so the "pointer density" will tell you a lot about the latencies. But to make this work, you need to run experiments.

If you have lots of additional machinery to burn, you can also use an old trick: send the request to N servers, with a 2ms delay between each send. Pick the first response that arrives and have the winning server cancel the request at the others. This can often hide a latency spike, since the extra wait before a backup copy fires is bounded from above by 2ms.

-- J.