agronaught opened a new pull request, #13173:
URL: https://github.com/apache/cloudstack/pull/13173
This PR adds the IPv6 equivalent of `fw_router_routing()` to the systemvm
Virtual Router's network configuration, so that return traffic for VR-initiated
IPv6 connections (BGP to upstream PE peers, NTP, DNS lookups, etc.) is allowed
back through the `ip6_firewall fw_input` chain.
### Problem
The systemvm VR's nftables `ip6 ip6_firewall fw_input` chain is created with
`policy=drop` and only ICMPv6 accept rules. The IPv4 INPUT chain has the
equivalent `iifname "eth2" ct state established,related accept` rule (added by
`fw_router_routing()` in `CsAddress.py`); the IPv6 path has no such rule.
Effect: any v6 connection the VR itself initiates outbound has its return
traffic silently dropped at the v6 INPUT hook before TCP processes it. For
Isolated IPv6 ROUTED networks this is fatal — BGP IPv6 sessions cannot reach
`Established`, tenant `/64` prefixes are never advertised upstream, and VMs in
the network are unreachable from the IPv6 internet.
#10970 added the equivalent rule to the FORWARD chain (covering tenant VM
return traffic) but explicitly removed it from the INPUT chain in its second
commit. This PR completes that fix for VR-originated traffic.
### Behavioural change
Before this PR, IPv6 BGP sessions from VRs in `IsolatedV6RoutedFiltered`
(and similar Routed v6) network offerings stay in `Connect` state indefinitely.
After this PR, sessions reach `Established` within seconds of VR start and
prefix advertisements work normally.
The change is additive and behind the existing `is_routed()` / `is_vpc()`
gating — only routed, non-VPC networks see new INPUT rules. No change for
existing v4 paths, v4 NATted networks, or VPC networks.
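
As a rough illustration of the gating described above (not the PR's actual diff), the sketch below shows how a rule list could be built so that only routed, non-VPC networks receive the new INPUT rule. The helper name `v6_router_input_rules`, its parameters, and the dict layout are assumptions made for this example; only the rule text and the `is_routed()` / `is_vpc()` conditions come from the description.

```python
def v6_router_input_rules(is_routed, is_vpc, public_dev="eth2"):
    """Hypothetical helper: rules to append to the ip6 fw_input chain.

    Illustration only; the real change lives in CsAddress.py and may be
    structured differently.
    """
    # Same gating as described above: only routed, non-VPC networks.
    if is_vpc or not is_routed:
        return []
    rule = 'iifname "%s" ct state established,related counter accept' % public_dev
    return [{"chain": "fw_input", "rule": rule}]


# Routed isolated network: one INPUT rule for VR-initiated return traffic.
routed = v6_router_input_rules(is_routed=True, is_vpc=False)
# VPC and NATted networks: nothing is added.
vpc = v6_router_input_rules(is_routed=True, is_vpc=True)
natted = v6_router_input_rules(is_routed=False, is_vpc=False)
```
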
Fixes: #13171
### Types of changes
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
- [ ] Build/CI
- [ ] Test (unit or integration test code)
### Feature/Enhancement Scale or Bug Severity
#### Feature/Enhancement Scale
- [ ] Major
- [ ] Minor
#### Bug Severity
- [ ] BLOCKER
- [ ] Critical
- [x] Major
- [ ] Minor
- [ ] Trivial
Justification for Major: any operator wanting to ship the
`IsolatedV6RoutedFiltered` offering (or any IPv6 Routed isolated network with
the `Firewall` service) for production tenant workloads is blocked. The
workaround requires per-VR `nft` rule injection that is wiped on every tenant
firewall rule change, making the offering unusable as a customer product
without a downstream patch like this one.
### Screenshots (if appropriate)
N/A — kernel-level firewall change.
### How Has This Been Tested?
Verified end-to-end on Apache CloudStack 4.22.0.0, KVM hypervisor (Ubuntu
24.04 hosts), with:
- Zone configured for BGP Routed networks (ASN range, BGP peers, IPv6 guest
prefix `/48`)
- Tenant network using `IsolatedV6RoutedFiltered` offering
- Two independent fresh VRs in two different tenant networks
**Before the patch:**

    vtysh -c "show bgp ipv6 unicast summary"
    Neighbor State/PfxRcd
    2400:88e0:ffff:258::2 Connect 0
    2400:88e0:ffff:258::3 Connect 0

Hypervisor-side packet capture on the underlay confirms PE responds with
SYN-ACK, but the VR's TCP stack never delivers it to FRR. Kernel `TCPMD5*`
counters stay at zero — drop happens at netfilter before TCP processes the
segment. Inside the VR:

    $ nft list table ip6 ip6_firewall
    table ip6 ip6_firewall {
        chain fw_input {
            type filter hook input priority filter; policy drop;
            icmpv6 type { ... } accept
        }
        ...
    }

No `ct state established,related accept` rule.
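
The check above can also be scripted. The following is a minimal sketch that scans `nft list table` output for a conntrack accept rule in a given chain. The function names are hypothetical, the parsing is deliberately simple (it assumes the one-rule-per-line layout shown above), and the sample text in the usage lines is abbreviated.

```python
def chain_rules(nft_output, chain):
    """Return the lines found inside `chain` in `nft list table` output."""
    rules, depth, inside = [], 0, False
    for raw in nft_output.splitlines():
        line = raw.strip()
        if not inside:
            # Look for the opening line of the chain, e.g. "chain fw_input {".
            if line.startswith("chain %s " % chain) and line.endswith("{"):
                inside, depth = True, 1
            continue
        # Track brace depth so inline sets like "{ ... }" do not end the chain.
        depth += line.count("{") - line.count("}")
        if depth <= 0:
            break
        rules.append(line)
    return rules


def accepts_return_traffic(nft_output, chain="fw_input"):
    """True if the chain has an accept rule matching established traffic."""
    return any(
        "ct state" in rule and "established" in rule and rule.endswith("accept")
        for rule in chain_rules(nft_output, chain)
    )


before = """table ip6 ip6_firewall {
  chain fw_input {
    type filter hook input priority filter; policy drop;
    icmpv6 type { ... } accept
  }
}"""
after = before.replace(
    "icmpv6 type { ... } accept",
    'icmpv6 type { ... } accept\n'
    '    iifname "eth2" ct state established,related counter accept',
)
```

With these samples, `accepts_return_traffic(before)` is false and `accepts_return_traffic(after)` is true.
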
**After the patch:**

    vtysh -c "show bgp ipv6 unicast summary"
    Neighbor State/PfxRcd
    2400:88e0:ffff:258::2 Established 1
    2400:88e0:ffff:258::3 Established 1

`fw_input` now includes the new rule with active counters:

    iifname "eth2" ct state established,related counter packets ... bytes ... accept

Verified end-to-end: SSH from public IPv6 internet to a VM inside the
v6-routed network succeeds. Reachability survives subsequent tenant firewall
rule updates (the rule is rebuilt from `nft_ipv6_fw` on every
`IpTablesExecutor.process()` cycle).
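
To illustrate why a hand-injected rule disappears while a declared one survives, here is a toy model of a flush-and-rebuild cycle. `FakeV6Chain` and its method are hypothetical stand-ins written for this example; they are not the real `IpTablesExecutor` code, which is only assumed to behave this way based on the description above.

```python
class FakeV6Chain:
    """Toy stand-in for the VR's ip6 fw_input chain (illustration only)."""

    def __init__(self):
        self.rules = []

    def flush_and_rebuild(self, declared_rules):
        # Each cycle discards the live chain and re-emits only declared rules.
        self.rules = list(declared_rules)


RETURN_RULE = 'iifname "eth2" ct state established,related counter accept'

# Workaround: the rule is injected by hand and is not in the declared list.
chain = FakeV6Chain()
chain.rules.append(RETURN_RULE)
chain.flush_and_rebuild(declared_rules=[])
injected_survives = RETURN_RULE in chain.rules

# Patched behaviour: the rule is part of the declared list (nft_ipv6_fw).
chain = FakeV6Chain()
chain.flush_and_rebuild(declared_rules=[RETURN_RULE])
chain.flush_and_rebuild(declared_rules=[RETURN_RULE])
declared_survives = RETURN_RULE in chain.rules
```

In this model `injected_survives` ends up false and `declared_survives` ends up true, which mirrors the difference between the per-VR workaround and the patched rule list.
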
#### How did you try to break this feature and the system with this change?
- **Tenant firewall rule churn**: added/removed tenant ingress rules via
`cmk createIpv6FirewallRule` / `deleteIpv6FirewallRule` repeatedly after the
patch. `IpTablesExecutor.process()` flushes and rebuilds the v6 table each
time; the new INPUT rule is re-emitted on every cycle because it's now in
`nft_ipv6_fw`. Counters resume; BGP stays Established.
- **VR reboot**: rebooted the VR (`cmk rebootRouter`). After the reboot
pulls fresh `cloud-scripts.tgz`, the patched `CsAddress.py` runs in the rebuilt
VR and the rule is in place from boot. BGP establishes within ~30s of VR ready.
- **Non-routed networks**: confirmed `is_routed()` gating means standard
Isolated v4 networks and VPC networks see no new rules in either chain — no
behaviour change for them.
- **Cross-account / cross-domain**: verified the rule fires per-VR (each
tenant network's VR gets its own rule with its own `eth2` reference and per-VR
counter), with no cross-tenant traffic leakage.
Tested with both single-tenant and multi-tenant network deployments.
Validated the change on ACS 4.22.0.0; the same code path exists at the HEAD of
the `4.20` branch, per inspection.