Hi all,

As discussed a few times in the past, we can enable the Wiki on the github repository. A few of us thought it would be a nice alternative to the obsolete architecture manual because it would allow a number of people to contribute to various areas with relative ease.
So today I gave it some deeper thought and figured that not only should the architecture manual go there, but also some recommendations or guidelines on how to test certain things. In order to help get started with something, I'm proposing that we build a plan vaguely following the one below (I'm totally open to criticism, feel free to chime in). I'd rather avoid having placeholders that will never be filled ("this site is under construction", for the old ones like me), and I'm not sure we can save drafts, so maybe we can simply place this into a separate document that serves as the initial plan for reference. If we manage to go far enough with this, we'll finally be able to kill the architecture manual 12 years after its last update.

Please share ideas / dos / don'ts. However, please keep in mind that wish lists are best served when the requester offers to handle them him/herself ;-)

Thanks,
Willy

---------------------------------------------------------------

Project organization
--------------------

Team

Places
  - haproxy.org
  - mailing list
  - discourse
  - github

Release cycle
  - development
  - stable
  - LTS

Contributing code
  - read CONTRIBUTING
  - read coding-style
  - read git log

Participating with no code
  - read problem reports
  - review / adjust patches
  - help others
  - contribute to the wiki
  - test the code
  - suggest use cases
  - report issues, gdb traces
  - bisect issues

Architecture manual
-------------------

Presentation

How a proxy works

Terminology
  - client
  - server
  - frontend / service
  - backend / farm
  - active / backup
  - connection, session, transaction, request, response

Topologies
  - edge + short silos
  - central LB + a bunch of servers, multiple layers
  - service clusters (stacks of [haproxy + servers])
  - sidecar

Setting up HA for haproxy
  - keepalived / ucarp / pacemaker ?
  - LVS
  - ECMP
  - ELB

Common use cases
  1) as a basic proxy
     - IPv6 to IPv4 gatewaying
     - port filtering
     - TLS enforcement / cert validation
     - protocol inspection, e.g. HTTP+SSH, SMTP banner delay
     - authentication
     - transparent proxying
     - logging / anomaly detection / time measurement
     - DoS protection (stick tables, tarpit)
     - traffic aggregation (multiple interfaces attachment)
     - traffic limitation (maxconn)
  2) as an accelerating proxy
     - TLS offloading
     - traffic compression
     - response caching
  3) as a load balancer
     - classical stateless L7 LB
     - classical stateful L7 LB
     - when to use round robin -> short requests / web applications
     - when to use least conn -> long sessions
     - when to use first -> ephemeral VMs, fast scale-in/scale-out
     - when to use hashing -> affinity (e.g. caches)
     - consistent vs map-based hashing
     - persistence vs hashing
     - inbound vs outbound load balancing
     - backup server(s)
     - grouping traffic to a single server (active/backup for databases)

Advanced use cases
  - providing TLS to Varnish (in + out)
  - caching clusters with consistent hashing and small object caching
  - H2 in front of Nginx (max-reuse)
  - using priorities to speed up critical parts of a site
  - service discovery via DNS, CLI, Lua
  - managing certificates at scale / let's encrypt
  - tuning for extreme loads, pitfalls
  - accessing services inside Linux containers using namespaces
  - multi-site abuser eviction (stick-tables + peers)

Scripting in Lua

On the fly management
  - stats page
  - CLI
  - signals
  - master-worker
  - agent-check
  - add-acl/del-acl
  - DNS

Operating system specificities
  - Linux >= 3.9 : SO_REUSEPORT
  - Linux >= 4.2 : IP_BIND_ADDRESS_NO_PORT

Performance considerations
  - orders of magnitude for a few typical metrics
  - cost of processing for various operations
  - cost of traversal for various topologies
  - optimizing for lowest latency
  - optimizing for highest throughput
  - optimizing for TCO

Benchmarks
----------

Principles
  - what
  - why
  - when
  - beware of audience

Conducting a benchmark
  - define purpose
  - define expected metrics
  - define ideal conditions
  - take note of real conditions
  - ensure reproducibility / minimise noise
  - problems are part of the process
  - report

Archived results
  - one per page: date, title, report

Testing new features
--------------------

---------------------------------------------------------------
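PS: to give an idea of the level of detail I'd like the "use cases" pages to reach, here's the kind of minimal sketch I have in mind for illustrating the frontend/backend and active/backup terminology. All names and addresses below are placeholders, not recommendations:

```
# Minimal sketch only: one HTTP frontend ("service") feeding a
# round robin backend ("farm") with a backup server. Names and
# addresses are made up for the example.
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend www
    bind :80
    default_backend app

backend app
    balance roundrobin
    server app1  192.0.2.10:8080 check
    server app2  192.0.2.11:8080 check
    server spare 192.0.2.12:8080 check backup   # only used once app1/app2 are down
```

Each wiki page could start from such a skeleton and then explain when to pick another balance algorithm (leastconn, first, hashing...) as listed in the plan above.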