https://bz.apache.org/bugzilla/show_bug.cgi?id=69986

            Bug ID: 69986
           Summary: O(1) hash-based vhost name lookup in
                    update_server_from_aliases()
           Product: Apache httpd-2
           Version: 2.5-HEAD
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Core
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

Created attachment 40160
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=40160&action=edit
vhost-hash-lookup.patch

On servers with many name-based virtual hosts (shared hosting, CDN
configurations), update_server_from_aliases() performs a linear scan over all
ServerName/ServerAlias entries on every request. With N names in total across
all vhosts, this is O(N) work per request.

The attached patch adds an apr_hash_t built at config time in
ap_fini_vhost_config() that maps lowercased hostnames to server_rec pointers.
Exact ServerName and non-wildcard ServerAlias entries go into the hash. On each
request, we try apr_hash_get() first - if it hits and the port matches, we
return in O(1). On a miss, the existing linear scan runs unchanged, so wildcard
aliases (*.example.com) and any other edge cases are handled exactly as before.
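
To illustrate the control flow only, here is a minimal standalone sketch of
"hash first, linear scan on miss". It is not the attached patch: the vhost
struct, the fixed-size open-addressing table, and every function name are
hypothetical stand-ins for server_rec and apr_hash_t, and the wildcard match
is simplified to a leading "*" suffix test rather than httpd's full pattern
matching.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for server_rec; not an httpd type. */
typedef struct vhost {
    const char *name;   /* lowercase exact name, or "*.suffix" wildcard */
    int id;
} vhost;

#define NBUCKETS 64     /* sketch assumes far fewer entries than buckets */

typedef struct slot {
    const char *key;
    const vhost *vh;
} slot;

static slot table[NBUCKETS];    /* zero-initialized: key == NULL is empty */

static unsigned hash_str(const char *s)
{
    unsigned h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Config time: index exact names only; wildcards stay scan-only. */
static void index_exact(const vhost *vh)
{
    unsigned i = hash_str(vh->name);
    assert(strchr(vh->name, '*') == NULL);
    while (table[i].key != NULL)        /* linear probing */
        i = (i + 1) % NBUCKETS;
    table[i].key = vh->name;
    table[i].vh = vh;
}

static const vhost *hash_find(const char *host)
{
    unsigned i = hash_str(host);
    while (table[i].key != NULL) {
        if (strcmp(table[i].key, host) == 0)
            return table[i].vh;
        i = (i + 1) % NBUCKETS;
    }
    return NULL;
}

/* Simplified matcher: "*.example.net" matches any host with that suffix. */
static int name_matches(const char *pattern, const char *host)
{
    if (pattern[0] == '*') {
        size_t plen = strlen(pattern + 1), hlen = strlen(host);
        return hlen >= plen && strcmp(host + hlen - plen, pattern + 1) == 0;
    }
    return strcmp(pattern, host) == 0;
}

/* Request time: O(1) attempt first, then the unchanged O(N) scan. */
static const vhost *find_vhost(const vhost *all, size_t n, const char *host)
{
    const vhost *hit = hash_find(host);
    size_t i;
    if (hit != NULL)
        return hit;
    for (i = 0; i < n; i++)
        if (name_matches(all[i].name, host))
            return &all[i];
    return NULL;
}
```

In this sketch a wildcard name is never indexed, so a host that only matches
a wildcard always takes the scan path, which mirrors the fallback described
above.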

Since multiple vhosts can share the same hostname on different ports, the hash
stores a linked list of vhost_hash_entry structs per key. The fast path
iterates candidates from the hash and validates each against the connection's
vhost_lookup_data chain, checking the port binding before accepting a match.
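
The per-key chaining and port validation might look roughly like the sketch
below. Only the name vhost_hash_entry comes from this report; its fields, the
server and name_chain structs, and both helper functions are assumptions made
for illustration. In particular, name_chain only loosely models the
connection's lookup chain, and treating host_port 0 as "any port" is an
assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for server_rec. */
typedef struct server {
    const char *hostname;
    unsigned short port;
} server;

/* Name taken from the report; the field layout is assumed. */
typedef struct vhost_hash_entry {
    const server *s;
    struct vhost_hash_entry *next;
} vhost_hash_entry;

/* Loose model of one link in the connection's lookup chain (assumed). */
typedef struct name_chain {
    const server *server;
    unsigned short host_port;   /* 0 is treated as "any port" here */
    const struct name_chain *next;
} name_chain;

/* Prepend an entry to the chain stored under one hostname key.
 * The caller provides storage for e (a pool allocation in httpd terms). */
static vhost_hash_entry *chain_push(vhost_hash_entry *head,
                                    vhost_hash_entry *e, const server *s)
{
    e->s = s;
    e->next = head;
    return e;
}

/* Outer loop: hash candidates. Inner loop: the connection's chain.
 * A candidate is accepted only if it is bound to this connection and
 * its port restriction, if any, matches the request port. */
static const server *pick_candidate(const vhost_hash_entry *cands,
                                    const name_chain *conn_chain,
                                    unsigned short port)
{
    const vhost_hash_entry *e;
    const name_chain *src;

    for (e = cands; e != NULL; e = e->next) {
        for (src = conn_chain; src != NULL; src = src->next) {
            if (src->server != e->s)
                continue;
            if (src->host_port != 0 && src->host_port != port)
                continue;
            return e->s;
        }
    }
    return NULL;    /* caller falls back to the linear scan */
}
```

Returning NULL rather than the first candidate keeps the fast path
conservative: anything it cannot validate is left to the existing scan.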

The hash is allocated from the config pool (p), so it is automatically freed
and rebuilt on graceful restart. It is populated once during configuration and
is strictly read-only during request processing - no locking is needed under
any MPM.

Host header comparison is case-insensitive: keys are lowercased at insert time
via ap_str_tolower() (per RFC 7230 section 2.7.3), and Apache normalizes the
incoming Host header to lowercase before update_server_from_aliases() is
called.
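
As a small illustration of why lowercasing on both sides is enough, the
sketch below normalizes a key in place and then compares with plain strcmp().
str_tolower() here is a local stand-in written for this example, not
ap_str_tolower() itself, and host_matches_key() is a hypothetical helper.

```c
#include <ctype.h>
#include <string.h>

/* Local stand-in for an in-place ASCII lowercase helper. */
static void str_tolower(char *s)
{
    for (; *s != '\0'; ++s)
        *s = (char)tolower((unsigned char)*s);
}

/* Hypothetical helper: stored_key is already lowercase (done at insert
 * time); host is normalized here, so a byte-wise compare suffices. */
static int host_matches_key(const char *stored_key, char *host)
{
    str_tolower(host);
    return strcmp(stored_key, host) == 0;
}
```

Because both the stored key and the incoming name are normalized before the
lookup, the hash function and the key comparison can stay case-sensitive.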

What changed

One file: server/vhost.c, ~75 lines added.

- vhost_hash_entry struct + vhost_hash_insert() helper: chained hash entries
supporting multiple servers per hostname (different port bindings).
- ap_init_vhost_config(): initializes vhost_name_hash to NULL.
- ap_fini_vhost_config(): builds the hash after config parsing. Iterates
main_s->next through all virtual hosts, inserting each ServerName and
non-wildcard ServerAlias. Logs entry count at APLOG_DEBUG level.
- update_server_from_aliases(): hash lookup before the existing linear scan.
Iterates hash candidates (outer loop) and validates against vhost_lookup_data
with port check (inner loop). Falls through to linear scan on miss.
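
The build step described above can be sketched as a walk over the vhost list
that keeps every ServerName and skips wildcard aliases. The vserver struct
and collect_hash_keys() are hypothetical; a real implementation would insert
into the hash instead of filling an array. The wildcard test assumes that a
name containing '*' or '?' is a pattern.

```c
#include <stddef.h>

/* Hypothetical stand-in for server_rec with its name lists flattened. */
typedef struct vserver {
    const char *server_name;
    const char *const *aliases;     /* NULL-terminated, may be NULL */
    const struct vserver *next;
} vserver;

/* Assumption: a name containing '*' or '?' is a wildcard pattern. */
static int is_wildcard(const char *name)
{
    for (; *name != '\0'; ++name)
        if (*name == '*' || *name == '?')
            return 1;
    return 0;
}

/* Walk main_s->next onward (the main server itself is skipped) and
 * collect the names that would become hash keys. Returns the count,
 * which is what a debug log line would report. */
static size_t collect_hash_keys(const vserver *main_s,
                                const char **out, size_t cap)
{
    size_t n = 0, i;
    const vserver *s;

    for (s = main_s->next; s != NULL; s = s->next) {
        if (s->server_name != NULL && n < cap)
            out[n++] = s->server_name;
        for (i = 0; s->aliases != NULL && s->aliases[i] != NULL; i++) {
            if (is_wildcard(s->aliases[i]))
                continue;           /* left to the linear scan */
            if (n < cap)
                out[n++] = s->aliases[i];
        }
    }
    return n;
}
```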

Testing

Tested with all three MPMs (prefork, event, worker):

- Exact ServerName match
- ServerAlias match
- Case-insensitive matching (e.g., Host: WWW.EXAMPLE.COM)
- Unknown host fallback to default vhost
- Wildcard ServerAlias fallback (*.example.com)
- Multiple vhosts sharing hostname on different ports
- Stress test under concurrency (ab -c 100)
- Graceful restart: hash rebuilt correctly, no stale pointers

No regressions, no leaks (verified with Valgrind), no segfaults.

Benchmarks with ab on a config with 500 name-based vhosts show ~15-20%
throughput improvement on the hash-hit path vs. the linear scan. The gain grows
with vhost count. On smaller configs the difference is negligible.
