Hi everyone,

CC-ing the ovn-kubernetes mailing list as I know there's interest in this there too.

OVN currently has a couple of tools that help trace, track, or simulate what would happen to packets within OVN, for example:

1. ovn-trace
2. ovs-appctl ofproto/trace ... | ovn-detrace

They're both really useful and provide lots of information, but with both of them it's quite hard to get an overview of the end-to-end packet processing in OVN for a given packet. Therefore both solutions have disadvantages when trying to troubleshoot production deployments. Some examples:

a. ovn-trace does not take into account any potential issues with translating logical flows to openflow, so if there's a bug in the translation we'll not be able to detect it by looking at ovn-trace output. There is the --ovs switch, but the user would have to somehow determine on which hypervisor to query for the openflows corresponding to logical flows/SB entities.

b. "ovs-appctl ofproto/trace ... | ovn-detrace" works quite well when used on a single node, but as soon as traffic gets tunneled to a different hypervisor the user has to figure out the changes that were performed on the packet on the source hypervisor and adapt the packet/flow to include the tunnel information to be used when running ofproto/trace on the destination hypervisor.

c. both ovn-trace and ofproto/trace support minimal hints to specify the new conntrack state after conntrack recirculation, but that turns out not to be enough even in simple scenarios when NAT is involved [0].

In a production deployment one of the scenarios one would have to troubleshoot is:

"Given this OVN deployment on X nodes, why does this specific packet/traffic received on logical port P1 reach (or fail to reach) port P2?"

Assuming that point "c" above is addressed somehow (there are a few suggestions on how to do that [1]), it's still quite a lot of work for the engineer doing the troubleshooting to gather all the interesting information. One would probably do something like:

1. connect to the node running the southbound database and get the chassis where the logical port is bound:

    chassis=$(ovn-sbctl --bare --columns chassis list port_binding P1)
    hostname=$(ovn-sbctl --bare --columns hostname list chassis $chassis)

2. connect to $hostname and determine the OVS ofport id of the interface corresponding to P1:

    in_port=$(ovs-vsctl --bare --columns ofport find interface external_ids:iface-id=P1)
    iface=$(ovs-vsctl --bare --columns name find interface external_ids:iface-id=P1)

3. get a hexdump of the packet to be traced (or the flow), for example, on $hostname:

    flow=$(tcpdump -xx -c 1 -i $iface $pkt_filter | ovs-tcpundump)

4. run ofproto/trace on $hostname (potentially piping output to ovn-detrace):

    ovs-appctl ofproto/trace br-int in_port=$in_port $flow | ovn-detrace --ovnnb=$NB_CONN --ovnsb=$SB_CONN

5. In the best case the packet is fully processed on the current node (e.g., is dropped or forwarded out a local VIF).

6. In the worst case the packet needs to be tunneled to a remote hypervisor for egress on a remote VIF. The engineer needs to identify in the ofproto/trace output the metadata that would be passed through the tunnel along with the packet and also the changes that would happen to the packet payload (e.g. NAT) on the local hypervisor.

7. Determine the hostname of the chassis hosting the remote tunnel destination based on "tun_dst" from the ofproto/trace output at step 4 above:

    chassis_name=$(ovn-sbctl --bare --columns chassis_name find encap ip=$tun_dst)
    hostname=$(ovn-sbctl --bare --columns hostname find chassis name=$chassis_name)

8. Rerun the ofproto/trace on the remote chassis (basically go back to step 4 above).

My initial thought was that all the work above can be automated, as all the information we need is either in the Southbound DB or in OVS DB on the hypervisors, and the output of ofproto/trace contains all the packet modifications and tunnel information we need.
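
As a rough illustration of that idea, the database lookups listed above could be wrapped in small shell functions. This is only a sketch: the function names (port_hostname, port_ofport, tunnel_hostname) and the OVN_SBCTL/OVS_VSCTL override variables are made up for this example, and the commands are the same ones quoted above. It does not attempt the hard parts (conntrack state and parsing the ofproto/trace output).

```shell
#!/bin/sh
# Sketch only. Function names and the OVN_SBCTL/OVS_VSCTL variables are
# hypothetical; the underlying commands are the ones quoted in this mail.

# Allow the binaries to be overridden (e.g. to add --db=... or for testing).
OVN_SBCTL=${OVN_SBCTL:-ovn-sbctl}
OVS_VSCTL=${OVS_VSCTL:-ovs-vsctl}

# Print the hostname of the chassis where logical port $1 is bound.
# Must run where the Southbound DB is reachable.
port_hostname() {
    chassis=$($OVN_SBCTL --bare --columns chassis list port_binding "$1") || return 1
    if [ -z "$chassis" ]; then
        echo "logical port $1 is not bound to any chassis" >&2
        return 1
    fi
    $OVN_SBCTL --bare --columns hostname list chassis "$chassis"
}

# Print the OVS ofport of the interface backing logical port $1.
# Must run on the hypervisor hosting the port.
port_ofport() {
    $OVS_VSCTL --bare --columns ofport find interface "external_ids:iface-id=$1"
}

# Print the hostname of the chassis that owns tunnel endpoint IP $1
# (the "tun_dst" value taken from the ofproto/trace output).
tunnel_hostname() {
    chassis_name=$($OVN_SBCTL --bare --columns chassis_name find encap "ip=$1") || return 1
    if [ -z "$chassis_name" ]; then
        echo "no encap found for tunnel IP $1" >&2
        return 1
    fi
    $OVN_SBCTL --bare --columns hostname find chassis "name=$chassis_name"
}
```

With these sourced on the Southbound DB node, port_hostname P1 prints the hypervisor to connect to first, and tunnel_hostname "$tun_dst" prints the next hypervisor once the trace shows the packet being tunneled.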

I had started working on a tool, "ovn-global-trace", that would do all the work above, but I hit a few blocking issues:

- point "c" above, i.e., conntrack related packet modifications: this will require some work in OVS ofproto/trace to either support additional conntrack hints or to actually run the trace against conntrack on the node.

- if we choose to query conntrack during ofproto/trace we'd probably need a way to also update the conntrack records the trace is run against. This would be useful for cases when we troubleshoot session establishment, e.g., with TCP: first run a trace for the SYN packet, then run a trace for the SYN-ACK packet in the other direction, but for this second trace we'd need the conntrack entry to have been created by the initial trace.

- ofproto/trace output is plain text: while a tool could parse the information from the text output, it would probably be easier if ofproto/trace dumped the trace information in a structured way (e.g., json).

It would be great to get some feedback from the community about other aspects that I might have missed regarding end-to-end packet tracing and how we could aggregate the current utilities into a single, easier-to-use tool like I was hoping "ovn-global-trace" would end up being.

Thanks,
Dumitru

[0] https://patchwork.ozlabs.org/project/openvswitch/patch/1578648883-1145-1-git-send-email-dce...@redhat.com/
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2020-January/366571.html