[...]
Some good alternatives to using the lflow UUID as a datapath identifier would be
* The value of the OXM_METADATA register.
* The southbound datapath_binding UUID.
If we use any of these identifiers, we would know which datapath dropped the
packet but not which lflow in its pipeline did it, right?
OK, I see your point here now. For some reason I was thinking that the sample()
would capture the current OF flow where the sample was sent, but that's not how
it works.
As was discussed before, logical flows can be coalesced if datapath groups are
enabled. So a logical flow UUID is ambiguous when trying to determine which
logical datapath dropped the packet.
One idea would be to implicitly disable logical datapath groups when drop
debugging is enabled. This way, logical flow UUIDs are not ambiguous. One
downside is that the southbound database will become larger, likely resulting in
OVSDB and ovn-controller incurring more CPU and taking longer to process the
data. Since this is an active debugging scenario, that may not be a problem. The
other potential downside would be if removing logical datapath groups causes the
problem to present itself differently. That could turn a simple issue into a
heisenbug.
Another approach would be to encode data directly into the ObservationPointID.
From what I can tell, the key things you need in order to know where a packet
was dropped are:
* Logical datapath
* OF table
* OF priority
Logical datapaths are 24 bits. I'm not sure what the maximum size for OF tables
is, but with OVN, 8 bits is plenty. Priorities are 16 bits. Therefore, 48 bits
would be necessary to encode all of that. Since the ObservationPointID is only
32 bits, that makes this idea not work as-is. You'd have to include another
field, potentially.
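
To make the bit budget concrete, here is a rough sketch in Python (for brevity
only; none of this is OVN code, and pack_drop_location is a hypothetical
helper). It packs the three fields side by side and shows that the result needs
48 bits, so it cannot fit in a 32-bit ObservationPointID:

```python
# Field widths taken from the discussion above. The 8-bit table width is the
# "plenty for OVN" estimate, not a protocol limit.
DATAPATH_BITS = 24
TABLE_BITS = 8
PRIORITY_BITS = 16
OBS_POINT_ID_BITS = 32


def pack_drop_location(datapath, table, priority):
    """Hypothetical helper: pack datapath | table | priority into one integer."""
    if not 0 <= datapath < (1 << DATAPATH_BITS):
        raise ValueError("datapath does not fit in 24 bits")
    if not 0 <= table < (1 << TABLE_BITS):
        raise ValueError("table does not fit in 8 bits")
    if not 0 <= priority < (1 << PRIORITY_BITS):
        raise ValueError("priority does not fit in 16 bits")
    return (
        (datapath << (TABLE_BITS + PRIORITY_BITS))
        | (table << PRIORITY_BITS)
        | priority
    )


required_bits = DATAPATH_BITS + TABLE_BITS + PRIORITY_BITS
fits = required_bits <= OBS_POINT_ID_BITS  # False: 48 bits > 32 bits
```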
If the logical datapaths are only 24 bits I think we can squeeze the information
in. What we can specify in the sample action is:
ObservationPointID: 32bit
ObservationDomainID: 32bit
Let's say for each IPFIX "application" that OVN might want to implement in the
future (drop-debugging is just one of them) we define a "base_obs_domain_id" where:
0 < base_obs_domain_id < 256
and we make the controller insert:
obs_domain_id = (base_obs_domain_id << 24 | logical_datapath)
In addition, for ovn-debug we use:
obs_point_id = lflow_uuid
This would allow for 255 different non-overlapping uses of IPFIX by OVN plus 16Mi
observation_domain_ids for external users to use (base_obs_domain_id = 0).
For drop debugging we'll always know the logical flow that caused the drop and
the logical datapath it belongs to.
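
As a rough sketch of the obs_domain_id layout proposed above (Python for
brevity; the function names and the DROP_DEBUG_BASE value are hypothetical, not
existing OVN code), the encoding and the matching decode on the collector side
could look like this:

```python
DATAPATH_BITS = 24
DATAPATH_MASK = (1 << DATAPATH_BITS) - 1

# Hypothetical base value for the drop-debugging "application".
DROP_DEBUG_BASE = 1


def encode_obs_domain_id(base_obs_domain_id, logical_datapath):
    """Build the 32-bit ObservationDomainID: 8-bit base, 24-bit datapath."""
    # Base 0 is left to external users, so OVN applications use 1..255.
    if not 0 < base_obs_domain_id < 256:
        raise ValueError("base_obs_domain_id must be in 1..255")
    if not 0 <= logical_datapath <= DATAPATH_MASK:
        raise ValueError("logical_datapath must fit in 24 bits")
    return (base_obs_domain_id << DATAPATH_BITS) | logical_datapath


def decode_obs_domain_id(obs_domain_id):
    """Split a 32-bit ObservationDomainID back into (base, logical_datapath)."""
    return obs_domain_id >> DATAPATH_BITS, obs_domain_id & DATAPATH_MASK
```

A collector could then tell OVN samples from external ones by checking whether
the decoded base is non-zero.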
Would this work?
Where can I find those unique 24 bits that identify the datapath? Are they
available during action encoding (i.e., through struct ovnact_encode_params)?
Thanks for the patience and guidance.
--
Adrián Moreno