JannePeltonen replied on github web page:
include/odp/api/spec/packet_io.h
line 21
@@ -256,11 +256,27 @@ typedef struct odp_pktio_param_t {
* belong to time synchronization protocol (PTP).
*
* Packet input checksum checking may be enabled or disabled. When it is
- * enabled, implementation will verify checksum correctness on incoming packets
- * and depending on drop configuration either deliver erroneous packets with
- * appropriate flags set (e.g. odp_packet_has_l3_error()) or drop those.
- * When packet dropping is enabled, application will never receive a packet
- * with the specified error and may avoid to check the error flag.
+ * enabled, implementation will attempt to verify checksum correctness on
+ * incoming packets and depending on drop configuration either deliver
erroneous
+ * packets with appropriate flags set (e.g. odp_packet_has_l3_error(),
+ * odp_packet_l3_chksum_status()) or drop those. When packet dropping is
+ * enabled, application will never receive a packet with the specified error
+ * and may avoid to check the error flag.
+ *
+ * If checksum checking is enabled, IPv4 header checksum checking is always
+ * done for packets that do not have IP options and L4 checksum checking
+ * is done for unfragmented packets that do not have IPv4 options or IPv6
+ * extension headers. In other cases checksum checking may or may not
+ * be done. For example, L4 checksum of fragmented packets is typically
+ * not checked.
Comment:
In the RX side there are valid use cases where we both care and do not care
about the L4 checksum. If a system is both forwarding and terminating IP
traffic and neither clearly dominates the other, then we want to be able to
take advantage of HW checksum checking without unnecessarily doing SW checksum
checking when it is not needed.
ODP implementation cannot determine which packets are going to be forwarded
(and for which checking the L4 checksum would be a waste of resources) and
which are locally received (and L4 checksum must be checked). The application
knows it.
It would be good if an application would be able to request "cheap" (i.e. done
in HW, not in SW) L4 checksum checking for all incoming packets and later
decide whether the result of the checksum check is actually used and whether to
fall back to the much more costly SW calculation (once it is known for which
packets the check is needed.
Maybe the API proposal is not clear enough of the intention of enabling this,
but this is what we have needed the past in actual products and this is what we
have implemented (in totally different context than ODP).
Maybe one possibility would be to have ODP always check the L4 checksum but do
the SW fallback only when the checksum result is being requested. If the
application checks the L4 checksum result only when it knows that it needs to
check it, the SW calculation would not be done in vain for other packets.
Even for applications that only terminate all traffic and thus need to check
the L4 checksum, it would make sense to postpone the SW fallback until it is
known that the checksum result is really needed (and the packet did not fail
any other checks, i.e. had proper L3 checksum, the right IP addresses and ports
etc).
> None(JannePeltonen) wrote:
> OK. It may be worthwhile to include that clarification here so that
> applications have assurance that this is not applicable to IPsec inline
> output.
>> None(JannePeltonen) wrote:
>> ODP APIs say nothing about what HW can or cannot do since the APIs don't
>> presume any implementation model. They describe the services that ODP
>> implementations provide in a platform-optimal manner. Ideally that means in
>> HW, but if a given platform requires SW to provide that service in general,
>> or in a specific case, then the implementation is in a better position to do
>> that optimally than the application.
>>
>> If the intent here is to request only services that are HW accelerated, then
>> they should be recast in that light. But in the case of checksumming, it's
>> not as if this can be skipped if it's "too hard" to do in HW. So I'd rather
>> see ODP do it in SW once than requiring every application to duplicate this
>> effort.
>>> JannePeltonen wrote
>>> There are 3 cases:
>>>
>>> 1. Csum was done + success
>>> 2. Csum was done + error
>>> 3. Csum was not done
>>>
>>>
>>> First thing you will usually do with a packet (for N number of reasons):
>>>
>>> If (odp_packet_has_error())
>>> odp_packet_free(pkt);
>>>
>>> This will remove packets with âCsum was done + errorâ.
>>> The only question remaining: âCsum was done + successâ or âCsum was
>>> not doneâ
>>>
>>> More, the implementation will likely put this info in
>>> odp_packet_hdr(pkt)->p.input_flags, so it makes sense to look like the
>>> other APIs from there.
>>>> None(JannePeltonen) wrote:
>>>> The application determines whether L3 checksum processing is needed or
>>>> not. If it says it is needed, then I don't see why it's unreasonable to
>>>> expect the ODP implementation to do it. Is requiring every application to
>>>> do this in SW somehow better than having ODP do it commonly in SW?
>>>>> None(JannePeltonen) wrote:
>>>>> @JannePeltonen I thought that's why we were providing the per-packet
>>>>> override capability here--so that applications can say "While I normally
>>>>> want checksums added, for this packet that's not needed". Or "Please
>>>>> don't add checksums by default, however for this packet please do so".
>>>>> The point is, if the application determines that checksum processing is
>>>>> needed for a particular packet (or interface) then that work needs to be
>>>>> done. The ODP implementation is in a better position than the application
>>>>> to determine how best to do that work on this platform since it has
>>>>> direct access to the platform-specific HW, specialized instructions,
>>>>> etc., and if it needs to fall back to SW can do so using instruction
>>>>> sequences optimized to its microarchitecture.
>>>>>
>>>>> On the RX side, again we either don't care about checksum validation or
>>>>> else we do. If we do then that work has to be done. So I don't see what
>>>>> advantage is gained by requiring every application to have a "backup
>>>>> plan" in this area.
>>>>>
>>>>> Also, since ODP requires the application to have correctly set the L3 and
>>>>> L4 offset values, that's all the information required to perform these
>>>>> calculations as specified by the relevant RFCs independent of the rest of
>>>>> the packet structure. So again, it's cleaner to have ODP do that once
>>>>> rather than requiring every application to duplicate that effort.
>>>>>> None(JannePeltonen) wrote:
>>>>>> Why you'd prefer two calls instead of one? With ABI compat, the number
>>>>>> of function calls per packet matters. Also, "has checksum been checked"
>>>>>> and "checksum check result" are likely reading the same bits in packet
>>>>>> header anyway. So, it makes sense to do both checks in one go.
>>>>>>> JannePeltonen wrote
>>>>>>> Instead of this I would prefer to have in
>>>>>>> ./include/odp/api/spec/packet_flags.h /
>>>>>>> ./platform/linux-generic/odp_packet_flags.c something like:
>>>>>>>
>>>>>>> int odp_packet_has_l3_csum(odp_packet_t pkt);
>>>>>>> int odp_packet_has_l4_csum(odp_packet_t pkt);
>>>>>>>
>>>>>>> * @retval non-zero operation was performed
>>>>>>> * @retval 0 operation was not performed
>>>>>>>
>>>>>>> For completeness, we may add explicit csum operation result call but
>>>>>>> likely odp_packet_has_error() or odp_packet_has_l3_error() or
>>>>>>> odp_packet_has_l4_error() will be preferred.
>>>>>>>
>>>>>>> int odp_packet_has_l3_csum_error(odp_packet_t pkt);
>>>>>>> int odp_packet_has_l4_csum_error(odp_packet_t pkt);
>>>>>>>> JannePeltonen wrote
>>>>>>>> It would not be good to require ODP to always check the L4 checksum if
>>>>>>>> checksum checking is enabled in the config, because then some
>>>>>>>> implementations might have to fall back to SW checking for some
>>>>>>>> packets and doing the check in SW would be a waste of resources in
>>>>>>>> case the application is not interested in the L4 checksum of all
>>>>>>>> packets (e.g. an application that both terminates and forwards IP
>>>>>>>> packets needs to check the L4 checksum of the locally received packets
>>>>>>>> but should ignore the L4 checksum of forwarded packets.
>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>> This talks **application** passing fragments to ODP for L4 checksum
>>>>>>>>> insertion. When ODP IPsec inline creates fragments, application does
>>>>>>>>> not have control on those (no need for API spec) and obviously
>>>>>>>>> implementation must not trick itself (it's an implementation internal
>>>>>>>>> bug to create a fragment and ask output HW to insert L4 checksum for
>>>>>>>>> those).
>>>>>>>>>
>>>>>>>>> Application passes full packet to IPSEC with fragmentation and L4
>>>>>>>>> checksum offload requests - implementation inserts L4 checksum, does
>>>>>>>>> fragmentation if needed, passes new packets back to user or directly
>>>>>>>>> to output (in case of inline).
>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>> The intention is best effort on top of the minimum requirement that
>>>>>>>>>> all basic IP packets must be checked. The wording says that quite
>>>>>>>>>> clearly already (what needs to change ?). Again it's a fact that not
>>>>>>>>>> all HW will check non-basic IP packets, or if they do (e.g. with
>>>>>>>>>> firmware / SW) packet rate may be much lower for _all_ packets.
>>>>>>>>>>
>>>>>>>>>> Typically, these non-basic packets are a small fraction of all
>>>>>>>>>> packets and belong to the slow path anyway.
>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>> That's RFC for IP stack not for NIC HW. ODP just pass information
>>>>>>>>>>> between app and HW. IP stack is an app. So, the app (IP stack) must
>>>>>>>>>>> do checksum check if HW didn't do it. We leave option for the HW to
>>>>>>>>>>> not do it, since not all HW do it for packets with
>>>>>>>>>>> options/extensions.
>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>> About the values, we can change those later if there's evidence
>>>>>>>>>>>> from various implementations that OK == 0 would be better than
>>>>>>>>>>>> UNKNOWN == 0. If there's no measurable performance difference,
>>>>>>>>>>>> UNKNOWN == 0 is more robust as init value.
>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>> These three values are needed. For example, we cannot dictate
>>>>>>>>>>>>> that all HW must be able to calculate checksums when there are
>>>>>>>>>>>>> extension headers. Still some HW can do that. So, even if
>>>>>>>>>>>>> application enabled checksumming, it depends on the packet and on
>>>>>>>>>>>>> the implementation if check was done or not. We just dictate that
>>>>>>>>>>>>> basic packets (no options / extensions) must always be checked
>>>>>>>>>>>>> when checksumming is enabled.
>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>> OK
>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>> Then perhaps the 0 enum should be `ODP_PACKET_CHECKSUM_NORMAL`
>>>>>>>>>>>>>>> where "normal" is interpreted as OK if checksum offload was
>>>>>>>>>>>>>>> requested and "unknown" if it was not requested. The enum is
>>>>>>>>>>>>>>> then simply either NORMAL or BAD. The point is the only thing
>>>>>>>>>>>>>>> the application really wants to know is whether the checksum is
>>>>>>>>>>>>>>> bad or not. If it's disabled checksumming for whatever reason
>>>>>>>>>>>>>>> then presumably it doesn't care.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If we reduce the enum to only two values, then it can simply
>>>>>>>>>>>>>>> collapse into a "bad" bit that's set if we know the checksum is
>>>>>>>>>>>>>>> incorrect and left as 0 otherwise. The 0 means it's OK if we
>>>>>>>>>>>>>>> are checking checksums and unknown if we don't care about them.
>>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>>> Agreed, however whether IPsec fragments the output for a given
>>>>>>>>>>>>>>>> packet may not be known in advance by the application. As
>>>>>>>>>>>>>>>> worded, this would seem to imply that I cannot request L4
>>>>>>>>>>>>>>>> checksum processing for inline IPsec output on the off chance
>>>>>>>>>>>>>>>> it may need to be fragmented. I doubt if that is your intent
>>>>>>>>>>>>>>>> here.
>>>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>>>> Again, TCP checksums are not optional and their presence is
>>>>>>>>>>>>>>>>> not conditional upon such things. If the intent here is to
>>>>>>>>>>>>>>>>> request best-effort HW offload (as opposed to requiring the
>>>>>>>>>>>>>>>>> ODP implementation provide this service however it chooses)
>>>>>>>>>>>>>>>>> then this wording needs to change to reflect that
>>>>>>>>>>>>>>>>> distinction. I'd take the position that applications should
>>>>>>>>>>>>>>>>> never have to worry about checksums unless they are being
>>>>>>>>>>>>>>>>> overridden for specific diagnostic or other purposes. In
>>>>>>>>>>>>>>>>> normal operation applications should be able to rely on the
>>>>>>>>>>>>>>>>> ODP implementation taking care of them (preferably in HW, but
>>>>>>>>>>>>>>>>> otherwise in SW if needed).
>>>>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>>>>> Not according to [RFC
>>>>>>>>>>>>>>>>>> 791](https://tools.ietf.org/html/rfc791):
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>> The Header Checksum provides a verification that the
>>>>>>>>>>>>>>>>>> information used
>>>>>>>>>>>>>>>>>> in processing internet datagram has been transmitted
>>>>>>>>>>>>>>>>>> correctly. The
>>>>>>>>>>>>>>>>>> data may contain errors. If the header checksum fails,
>>>>>>>>>>>>>>>>>> the internet
>>>>>>>>>>>>>>>>>> datagram is discarded at once by the entity which detects
>>>>>>>>>>>>>>>>>> the error.
>>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This requirement is independent of whether or not IP options
>>>>>>>>>>>>>>>>>> exist.
>>>>>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>>>>>> Not easily for IPv6, the fragmentation extension header may
>>>>>>>>>>>>>>>>>>> be behind a list of other extension headers. Also it does
>>>>>>>>>>>>>>>>>>> not make sense to ask L4 checksum for a packet that it
>>>>>>>>>>>>>>>>>>> cannot be done. Usually even HW assisted checksum insertion
>>>>>>>>>>>>>>>>>>> is ordered by SW on per packet basis. So, if application
>>>>>>>>>>>>>>>>>>> does not order correctly, ODP would need to parse packet
>>>>>>>>>>>>>>>>>>> and find out if it can be ordered or not.
>>>>>>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>>>>>>> L4 check includes fields from IP header. If the IP header
>>>>>>>>>>>>>>>>>>>> is not in basic form (has options/extensions) HW may not
>>>>>>>>>>>>>>>>>>>> be able to do the L4 checksum check.
>>>>>>>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>>>>>>>> Basic IP packets checksum checking is a MUST, support for
>>>>>>>>>>>>>>>>>>>>> options is a MAY.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> We don't want to force every implementation to check also
>>>>>>>>>>>>>>>>>>>>> packets with IP options, since not all HW can do it. User
>>>>>>>>>>>>>>>>>>>>> finds out if the check was done from the packet.
>>>>>>>>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>>>>>>>>> Not in this patch, since statistics are defined to work
>>>>>>>>>>>>>>>>>>>>>> as RFCs specify. We don't want to redefine how
>>>>>>>>>>>>>>>>>>>>>> packet_io_stats_t works in here.
>>>>>>>>>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>>>>>>>>>> OK and UNKNOWN are the most common cases: OK when
>>>>>>>>>>>>>>>>>>>>>>> checking is enabled, UNKNOWN when checking disabled.
>>>>>>>>>>>>>>>>>>>>>>> E.g. a router would not need check L4 checksum, so IPv4
>>>>>>>>>>>>>>>>>>>>>>> check would be on, but all L4 checksum checks off.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Initialization is more robust with UNKNOWN == 0.
>>>>>>>>>>>>>>>>>>>>>>> Odp-linux would save these along the other packet flags
>>>>>>>>>>>>>>>>>>>>>>> (init would be flags.all_bits = 0). On the other hand,
>>>>>>>>>>>>>>>>>>>>>>> HW specific implementation may not store this enum
>>>>>>>>>>>>>>>>>>>>>>> anywhere, but directly mask HW specific packet
>>>>>>>>>>>>>>>>>>>>>>> descriptor flags to produce the status. In that case,
>>>>>>>>>>>>>>>>>>>>>>> the values do not matter too much.
>>>>>>>>>>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Can't the interface detect the IPfrag itself? How does
>>>>>>>>>>>>>>>>>>>>>>>> this effect IPsec inline output fragmentation support?
>>>>>>>>>>>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> L4 checksums being skipped for IP fragments is
>>>>>>>>>>>>>>>>>>>>>>>>> normal, but why the caveat about IP options or IPv6
>>>>>>>>>>>>>>>>>>>>>>>>> extension headers? A TCP or UCP checksum is well
>>>>>>>>>>>>>>>>>>>>>>>>> defined independent of these L3 considerations.
>>>>>>>>>>>>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> What is the relevance of IP options to checksum
>>>>>>>>>>>>>>>>>>>>>>>>>> processing? These two are orthogonal.
>>>>>>>>>>>>>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> Worth mentioning anything about statistics here?
>>>>>>>>>>>>>>>>>>>>>>>>>>> Presumably such drops are accumulated there.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> None(JannePeltonen) wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Since ODP_PACKET_CHKSUM_OK is the most likely
>>>>>>>>>>>>>>>>>>>>>>>>>>>> case, wouldn't it be better to use 0 for that
>>>>>>>>>>>>>>>>>>>>>>>>>>>> enum? In almost all cases HW is going to "do the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> right thing" so SW should only need to initialize
>>>>>>>>>>>>>>>>>>>>>>>>>>>> this to something else if that's not the case.
https://github.com/Linaro/odp/pull/167#discussion_r137996055
updated_at 2017-09-11 07:23:40