Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page:
include/odp/api/spec/packet_io.h
line 58
@@ -314,13 +330,27 @@ typedef union odp_pktin_config_opt_t {
* Packet output configuration options listed in a bit field structure. Packet
* output checksum insertion may be enabled or disabled. When it is enabled,
* implementation will calculate and insert checksum into every outgoing packet
- * by default. Application may use a packet metadata flag to disable checksum
- * insertion per packet bases. For correct operation, packet metadata must
- * provide valid offsets for the appropriate protocols. For example, UDP
- * checksum calculation needs both L3 and L4 offsets (to access IP and UDP
- * headers). When application (e.g. a switch) does not modify L3/L4 data and
- * thus checksum does not need to be updated, output checksum insertion should
- * be disabled for optimal performance.
+ * by default. Application may disable checksum insertion (e.g.
+ * odp_packet_l4_chksum_insert()) on per packet basis. For correct operation,
+ * packet metadata must provide valid offsets for the appropriate protocols.
+ * For example, UDP checksum calculation needs both L3 and L4 offsets (to
access
+ * IP and UDP headers). When application (e.g. a switch) does not modify L3/L4
+ * data and thus checksum does not need to be updated, output checksum
insertion
+ * should be disabled for optimal performance.
+ *
+ * Packet flags (odp_packet_has_*()) are ignored for the purpose of checksum
+ * insertion in packet output.
+ *
+ * UDP, TCP and SCTP checksum insertion must not be requested for IP fragments.
+ * Use checksum override function (odp_packet_l4_chksum_insert()) to disable
+ * checksumming when sending a fragment through a packet IO interface that has
+ * the relevant L4 checksum insertion enabled.
Comment:
OK. It may be worthwhile to include that clarification here so that
applications have assurance that this is not applicable to IPsec inline output.
> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> ODP APIs say nothing about what HW can or cannot do since the APIs don't
> presume any implementation model. They describe the services that ODP
> implementations provide in a platform-optimal manner. Ideally that means in
> HW, but if a given platform requires SW to provide that service in general,
> or in a specific case, then the implementation is in a better position to do
> that optimally than the application.
>
> If the intent here is to request only services that are HW accelerated, then
> they should be recast in that light. But in the case of checksumming, it's
> not as if this can be skipped if it's "too hard" to do in HW. So I'd rather
> see ODP do it in SW once than requiring every application to duplicate this
> effort.
>> Bill-Fischofer-Linaro wrote
>> There are 3 cases:
>>
>> 1. Csum was done + success
>> 2. Csum was done + error
>> 3. Csum was not done
>>
>>
>> First thing you will usually do with a packet (for N number of reasons):
>>
>> If (odp_packet_has_error())
>> odp_packet_free(pkt);
>>
>> This will remove packets with âCsum was done + errorâ.
>> The only question remaining: âCsum was done + successâ or âCsum was
>> not doneâ
>>
>> More, the implementation will likely put this info in
>> odp_packet_hdr(pkt)->p.input_flags, so it makes sense to look like the other
>> APIs from there.
>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>> The application determines whether L3 checksum processing is needed or not.
>>> If it says it is needed, then I don't see why it's unreasonable to expect
>>> the ODP implementation to do it. Is requiring every application to do this
>>> in SW somehow better than having ODP do it commonly in SW?
>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>> @JannePeltonen I thought that's why we were providing the per-packet
>>>> override capability here--so that applications can say "While I normally
>>>> want checksums added, for this packet that's not needed". Or "Please don't
>>>> add checksums by default, however for this packet please do so". The point
>>>> is, if the application determines that checksum processing is needed for a
>>>> particular packet (or interface) then that work needs to be done. The ODP
>>>> implementation is in a better position than the application to determine
>>>> how best to do that work on this platform since it has direct access to
>>>> the platform-specific HW, specialized instructions, etc., and if it needs
>>>> to fall back to SW can do so using instruction sequences optimized to its
>>>> microarchitecture.
>>>>
>>>> On the RX side, again we either don't care about checksum validation or
>>>> else we do. If we do then that work has to be done. So I don't see what
>>>> advantage is gained by requiring every application to have a "backup plan"
>>>> in this area.
>>>>
>>>> Also, since ODP requires the application to have correctly set the L3 and
>>>> L4 offset values, that's all the information required to perform these
>>>> calculations as specified by the relevant RFCs independent of the rest of
>>>> the packet structure. So again, it's cleaner to have ODP do that once
>>>> rather than requiring every application to duplicate that effort.
>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>> Why you'd prefer two calls instead of one? With ABI compat, the number of
>>>>> function calls per packet matters. Also, "has checksum been checked" and
>>>>> "checksum check result" are likely reading the same bits in packet header
>>>>> anyway. So, it makes sense to do both checks in one go.
>>>>>> Bill-Fischofer-Linaro wrote
>>>>>> Instead of this I would prefer to have in
>>>>>> ./include/odp/api/spec/packet_flags.h /
>>>>>> ./platform/linux-generic/odp_packet_flags.c something like:
>>>>>>
>>>>>> int odp_packet_has_l3_csum(odp_packet_t pkt);
>>>>>> int odp_packet_has_l4_csum(odp_packet_t pkt);
>>>>>>
>>>>>> * @retval non-zero operation was performed
>>>>>> * @retval 0 operation was not performed
>>>>>>
>>>>>> For completeness, we may add explicit csum operation result call but
>>>>>> likely odp_packet_has_error() or odp_packet_has_l3_error() or
>>>>>> odp_packet_has_l4_error() will be preferred.
>>>>>>
>>>>>> int odp_packet_has_l3_csum_error(odp_packet_t pkt);
>>>>>> int odp_packet_has_l4_csum_error(odp_packet_t pkt);
>>>>>>> Bill-Fischofer-Linaro wrote
>>>>>>> It would not be good to require ODP to always check the L4 checksum if
>>>>>>> checksum checking is enabled in the config, because then some
>>>>>>> implementations might have to fall back to SW checking for some packets
>>>>>>> and doing the check in SW would be a waste of resources in case the
>>>>>>> application is not interested in the L4 checksum of all packets (e.g.
>>>>>>> an application that both terminates and forwards IP packets needs to
>>>>>>> check the L4 checksum of the locally received packets but should ignore
>>>>>>> the L4 checksum of forwarded packets.
>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>> This talks **application** passing fragments to ODP for L4 checksum
>>>>>>>> insertion. When ODP IPsec inline creates fragments, application does
>>>>>>>> not have control on those (no need for API spec) and obviously
>>>>>>>> implementation must not trick itself (it's an implementation internal
>>>>>>>> bug to create a fragment and ask output HW to insert L4 checksum for
>>>>>>>> those).
>>>>>>>>
>>>>>>>> Application passes full packet to IPSEC with fragmentation and L4
>>>>>>>> checksum offload requests - implementation inserts L4 checksum, does
>>>>>>>> fragmentation if needed, passes new packets back to user or directly
>>>>>>>> to output (in case of inline).
>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>> The intention is best effort on top of the minimum requirement that
>>>>>>>>> all basic IP packets must be checked. The wording says that quite
>>>>>>>>> clearly already (what needs to change ?). Again it's a fact that not
>>>>>>>>> all HW will check non-basic IP packets, or if they do (e.g. with
>>>>>>>>> firmware / SW) packet rate may be much lower for _all_ packets.
>>>>>>>>>
>>>>>>>>> Typically, these non-basic packets are a small fraction of all
>>>>>>>>> packets and belong to the slow path anyway.
>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>> That's RFC for IP stack not for NIC HW. ODP just pass information
>>>>>>>>>> between app and HW. IP stack is an app. So, the app (IP stack) must
>>>>>>>>>> do checksum check if HW didn't do it. We leave option for the HW to
>>>>>>>>>> not do it, since not all HW do it for packets with
>>>>>>>>>> options/extensions.
>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>> About the values, we can change those later if there's evidence
>>>>>>>>>>> from various implementations that OK == 0 would be better than
>>>>>>>>>>> UNKNOWN == 0. If there's no measurable performance difference,
>>>>>>>>>>> UNKNOWN == 0 is more robust as init value.
>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>> These three values are needed. For example, we cannot dictate that
>>>>>>>>>>>> all HW must be able to calculate checksums when there are
>>>>>>>>>>>> extension headers. Still some HW can do that. So, even if
>>>>>>>>>>>> application enabled checksumming, it depends on the packet and on
>>>>>>>>>>>> the implementation if check was done or not. We just dictate that
>>>>>>>>>>>> basic packets (no options / extensions) must always be checked
>>>>>>>>>>>> when checksumming is enabled.
>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>> OK
>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>> Then perhaps the 0 enum should be `ODP_PACKET_CHECKSUM_NORMAL`
>>>>>>>>>>>>>> where "normal" is interpreted as OK if checksum offload was
>>>>>>>>>>>>>> requested and "unknown" if it was not requested. The enum is
>>>>>>>>>>>>>> then simply either NORMAL or BAD. The point is the only thing
>>>>>>>>>>>>>> the application really wants to know is whether the checksum is
>>>>>>>>>>>>>> bad or not. If it's disabled checksumming for whatever reason
>>>>>>>>>>>>>> then presumably it doesn't care.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If we reduce the enum to only two values, then it can simply
>>>>>>>>>>>>>> collapse into a "bad" bit that's set if we know the checksum is
>>>>>>>>>>>>>> incorrect and left as 0 otherwise. The 0 means it's OK if we are
>>>>>>>>>>>>>> checking checksums and unknown if we don't care about them.
>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>> Agreed, however whether IPsec fragments the output for a given
>>>>>>>>>>>>>>> packet may not be known in advance by the application. As
>>>>>>>>>>>>>>> worded, this would seem to imply that I cannot request L4
>>>>>>>>>>>>>>> checksum processing for inline IPsec output on the off chance
>>>>>>>>>>>>>>> it may need to be fragmented. I doubt if that is your intent
>>>>>>>>>>>>>>> here.
>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>>> Again, TCP checksums are not optional and their presence is
>>>>>>>>>>>>>>>> not conditional upon such things. If the intent here is to
>>>>>>>>>>>>>>>> request best-effort HW offload (as opposed to requiring the
>>>>>>>>>>>>>>>> ODP implementation provide this service however it chooses)
>>>>>>>>>>>>>>>> then this wording needs to change to reflect that distinction.
>>>>>>>>>>>>>>>> I'd take the position that applications should never have to
>>>>>>>>>>>>>>>> worry about checksums unless they are being overridden for
>>>>>>>>>>>>>>>> specific diagnostic or other purposes. In normal operation
>>>>>>>>>>>>>>>> applications should be able to rely on the ODP implementation
>>>>>>>>>>>>>>>> taking care of them (preferably in HW, but otherwise in SW if
>>>>>>>>>>>>>>>> needed).
>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>>>> Not according to [RFC
>>>>>>>>>>>>>>>>> 791](https://tools.ietf.org/html/rfc791):
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>> The Header Checksum provides a verification that the
>>>>>>>>>>>>>>>>> information used
>>>>>>>>>>>>>>>>> in processing internet datagram has been transmitted
>>>>>>>>>>>>>>>>> correctly. The
>>>>>>>>>>>>>>>>> data may contain errors. If the header checksum fails, the
>>>>>>>>>>>>>>>>> internet
>>>>>>>>>>>>>>>>> datagram is discarded at once by the entity which detects
>>>>>>>>>>>>>>>>> the error.
>>>>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This requirement is independent of whether or not IP options
>>>>>>>>>>>>>>>>> exist.
>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>>>>> Not easily for IPv6, the fragmentation extension header may
>>>>>>>>>>>>>>>>>> be behind a list of other extension headers. Also it does
>>>>>>>>>>>>>>>>>> not make sense to ask L4 checksum for a packet that it
>>>>>>>>>>>>>>>>>> cannot be done. Usually even HW assisted checksum insertion
>>>>>>>>>>>>>>>>>> is ordered by SW on per packet basis. So, if application
>>>>>>>>>>>>>>>>>> does not order correctly, ODP would need to parse packet and
>>>>>>>>>>>>>>>>>> find out if it can be ordered or not.
>>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>>>>>> L4 check includes fields from IP header. If the IP header
>>>>>>>>>>>>>>>>>>> is not in basic form (has options/extensions) HW may not be
>>>>>>>>>>>>>>>>>>> able to do the L4 checksum check.
>>>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>>>>>>> Basic IP packets checksum checking is a MUST, support for
>>>>>>>>>>>>>>>>>>>> options is a MAY.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> We don't want to force every implementation to check also
>>>>>>>>>>>>>>>>>>>> packets with IP options, since not all HW can do it. User
>>>>>>>>>>>>>>>>>>>> finds out if the check was done from the packet.
>>>>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>>>>>>>> Not in this patch, since statistics are defined to work
>>>>>>>>>>>>>>>>>>>>> as RFCs specify. We don't want to redefine how
>>>>>>>>>>>>>>>>>>>>> packet_io_stats_t works in here.
>>>>>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>>>>>>>>> OK and UNKNOWN are the most common cases: OK when
>>>>>>>>>>>>>>>>>>>>>> checking is enabled, UNKNOWN when checking disabled.
>>>>>>>>>>>>>>>>>>>>>> E.g. a router would not need check L4 checksum, so IPv4
>>>>>>>>>>>>>>>>>>>>>> check would be on, but all L4 checksum checks off.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Initialization is more robust with UNKNOWN == 0.
>>>>>>>>>>>>>>>>>>>>>> Odp-linux would save these along the other packet flags
>>>>>>>>>>>>>>>>>>>>>> (init would be flags.all_bits = 0). On the other hand,
>>>>>>>>>>>>>>>>>>>>>> HW specific implementation may not store this enum
>>>>>>>>>>>>>>>>>>>>>> anywhere, but directly mask HW specific packet
>>>>>>>>>>>>>>>>>>>>>> descriptor flags to produce the status. In that case,
>>>>>>>>>>>>>>>>>>>>>> the values do not matter too much.
>>>>>>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>>>>>>>>>> Can't the interface detect the IPfrag itself? How does
>>>>>>>>>>>>>>>>>>>>>>> this effect IPsec inline output fragmentation support?
>>>>>>>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>>>>>>>>>>> L4 checksums being skipped for IP fragments is normal,
>>>>>>>>>>>>>>>>>>>>>>>> but why the caveat about IP options or IPv6 extension
>>>>>>>>>>>>>>>>>>>>>>>> headers? A TCP or UCP checksum is well defined
>>>>>>>>>>>>>>>>>>>>>>>> independent of these L3 considerations.
>>>>>>>>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> What is the relevance of IP options to checksum
>>>>>>>>>>>>>>>>>>>>>>>>> processing? These two are orthogonal.
>>>>>>>>>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> Worth mentioning anything about statistics here?
>>>>>>>>>>>>>>>>>>>>>>>>>> Presumably such drops are accumulated there.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> Since ODP_PACKET_CHKSUM_OK is the most likely case,
>>>>>>>>>>>>>>>>>>>>>>>>>>> wouldn't it be better to use 0 for that enum? In
>>>>>>>>>>>>>>>>>>>>>>>>>>> almost all cases HW is going to "do the right
>>>>>>>>>>>>>>>>>>>>>>>>>>> thing" so SW should only need to initialize this to
>>>>>>>>>>>>>>>>>>>>>>>>>>> something else if that's not the case.
https://github.com/Linaro/odp/pull/167#discussion_r137809086
updated_at 2017-09-08 14:46:47