> -----Original Message-----
> From: lng-odp [mailto:[email protected]] On Behalf Of Ola Liljedahl
> Sent: Friday, June 02, 2017 1:41 PM
> To: Savolainen, Petri (Nokia - FI/Espoo) <[email protected]>;
> Honnappa Nagarahalli <[email protected]>
> Cc: Elo, Matias (Nokia - FI/Espoo) <[email protected]>; nd <[email protected]>;
> Kevin Wang <[email protected]>; Honnappa Nagarahalli
> <[email protected]>; [email protected]
> Subject: Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
>
> On 02/06/2017, 12:17, "Savolainen, Petri (Nokia - FI/Espoo)"
> <[email protected]> wrote:
>
> >> -----Original Message-----
> >> From: lng-odp [mailto:[email protected]] On Behalf Of
> >> Honnappa Nagarahalli
> >> Sent: Thursday, June 01, 2017 11:30 PM
> >> To: Ola Liljedahl <[email protected]>
> >> Cc: [email protected]; Honnappa Nagarahalli
> >> <[email protected]>; Elo, Matias (Nokia - FI/Espoo)
> >> <[email protected]>; Kevin Wang <[email protected]>; nd <[email protected]>
> >> Subject: Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
> >>
> >> On 1 June 2017 at 15:20, Ola Liljedahl <[email protected]> wrote:
> >> >
> >> > On 01/06/2017, 22:15, "Honnappa Nagarahalli"
> >> > <[email protected]> wrote:
> >> >
> >> >> On 1 June 2017 at 15:09, Ola Liljedahl <[email protected]> wrote:
> >> >>>
> >> >>> On 01/06/2017, 21:03, "Bill Fischofer" <[email protected]>
> >> >>> wrote:
> >> >>>
> >> >>>> On Thu, Jun 1, 2017 at 10:59 AM, Honnappa Nagarahalli
> >> >>>> <[email protected]> wrote:
> >> >>>>> On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo)
> >> >>>>> <[email protected]> wrote:
> >> >>>>>>
> >> >>>>>>> On 31 May 2017, at 23:53, Bill Fischofer
> >> >>>>>>> <[email protected]> wrote:
> >> >>>>>>>
> >> >>>>>>> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
> >> >>>>>>> <[email protected]> wrote:
> >> >>>>>>>>
> >> >>>>>>>>>>> What's the purpose of calling ord_enq_multi() here?
> >> >>>>>>>>>>> To save (stash) packets if the thread is out-of-order?
> >> >>>>>>>>>>> And when the thread is in-order, it is re-enqueueing the
> >> >>>>>>>>>>> packets, which again will invoke
> >> >>>>>>>>>>> pktout_enqueue/pktout_enq_multi, but this time
> >> >>>>>>>>>>> ord_enq_multi() will not save the packets; instead they
> >> >>>>>>>>>>> will actually be transmitted by odp_pktout_send()?
> >> >>>>>>>>>>
> >> >>>>>>>>>> Since transmitting packets may fail, out-of-order packets
> >> >>>>>>>>>> cannot be stashed here.
> >> >>>>>>>>>
> >> >>>>>>>>> You mean that the TX queue of the pktio might be full, so not
> >> >>>>>>>>> all packets will actually be enqueued for transmission.
> >> >>>>>>>>
> >> >>>>>>>> Yep.
> >> >>>>>>>>
> >> >>>>>>>>> This is an interesting case, but is it a must to know how
> >> >>>>>>>>> many packets are actually accepted? Packets can always be
> >> >>>>>>>>> dropped without notice; the question is from which point this
> >> >>>>>>>>> is acceptable. If packets enqueued onto a pktout (egress)
> >> >>>>>>>>> queue are accepted, this means that they must also be put
> >> >>>>>>>>> onto the driver TX queue (as done by odp_pktout_send)?
> >> >>>>>>>>
> >> >>>>>>>> Currently, the packet_io/queue APIs don't say anything about
> >> >>>>>>>> packets possibly being dropped after successfully calling
> >> >>>>>>>> odp_queue_enq() to a pktout event queue. So to be consistent
> >> >>>>>>>> with standard odp_queue_enq() operations, I think it is better
> >> >>>>>>>> to return the number of events actually accepted to the TX
> >> >>>>>>>> queue.
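[Editor's note] The convention Matias argues for above — the enqueue call reports back how many packets the TX queue actually accepted — can be modeled stand-alone. This is only an illustrative sketch: `tx_ring_t` and `pktout_enq_multi_model()` are invented names, not ODP or linux-generic code, and a bounded ring stands in for the NIC TX queue.

```c
#include <assert.h>

/* Illustrative stand-alone model: a bounded "ring" standing in for a NIC
 * TX queue. Enqueue accepts as many packets as currently fit and reports
 * that count back, matching the odp_queue_enq_multi()-style convention
 * discussed in the thread. */
#define TX_RING_SIZE 4

typedef struct {
    int pkts[TX_RING_SIZE]; /* queued packet "handles" */
    int count;              /* number of packets waiting for TX */
} tx_ring_t;

/* Returns the number of packets actually accepted (0..num); the caller
 * keeps ownership of pkts[accepted..num-1] and may retry or drop them. */
static int pktout_enq_multi_model(tx_ring_t *ring, const int pkts[], int num)
{
    int space = TX_RING_SIZE - ring->count;
    int accepted = num < space ? num : space;

    for (int i = 0; i < accepted; i++)
        ring->pkts[ring->count++] = pkts[i];

    return accepted;
}
```

With this convention a partial return value is not an error; it is the signal the application uses to decide its queue-full policy.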
> >> >>>>>>>>
> >> >>>>>>>> To have more leeway, one option would be to modify the API
> >> >>>>>>>> documentation to state that packets may still be dropped after
> >> >>>>>>>> a successful odp_queue_enq() call before reaching the NIC. If
> >> >>>>>>>> the application would like to be sure that the packets are
> >> >>>>>>>> actually sent, it should use odp_pktout_send() instead.
> >> >>>>>>>
> >> >>>>>>> Ordered queues simply say that packets will be delivered to the
> >> >>>>>>> next queue in the pipeline in the order they originated from
> >> >>>>>>> their source queue. What happens after that depends on the
> >> >>>>>>> attributes of the target queue. If the target queue is an exit
> >> >>>>>>> point from the application, then this is outside of ODP's scope.
> >> >>>>>>
> >> >>>>>> My point was that with stashing the application has no way of
> >> >>>>>> knowing if an ordered pktout enqueue call actually succeeded. In
> >> >>>>>> the case of parallel and atomic queues it does. So my question
> >> >>>>>> is, is this acceptable?
> >> >>>>>
> >> >>>>> Also, currently, it is not possible for the application to have a
> >> >>>>> consistent 'wait/drop on destination queue full' policy for all
> >> >>>>> the queue types.
> >> >>>>
> >> >>>> Today applications have no way of knowing whether packets sent to
> >> >>>> a pktout_queue or tm_queue actually make it to the wire or whether
> >> >>>> they are vaporized as soon as they hit the wire, so there's no
> >> >>>> change here. An RC of 0 simply says that the packet was "accepted"
> >> >>>> for transmission and hence the caller no longer owns that packet
> >> >>>> handle. You need higher-level protocols to track end-to-end
> >> >>>> transmission and receipt.
> >> >>>> All that ordered queues say is that packets being sent to TX
> >> >>>> queues will have those TX calls made in the same order as the
> >> >>>> source queue they originated from.
> >> >>>>
> >> >>>> The only way to track packet disposition today is to (a) create a
> >> >>>> reference to the packet you want to transmit, (b) verify that
> >> >>>> odp_packet_has_ref(original_pkt) > 0, indicating that an actual
> >> >>>> reference was created, (c) transmit that reference, and (d) note
> >> >>>> when odp_packet_has_ref(original_pkt) returns to 0. That confirms
> >> >>>> that the reference has exited the scope of this ODP instance,
> >> >>>> since a "successful" transmission will free that reference.
> >> >>>
> >> >>> Doesn't this just confirm that the reference has been freed? You
> >> >>> don't know if this was due to the packet actually being transmitted
> >> >>> on the wire or if it was dropped before that (which would also free
> >> >>> the reference).
> >> >>>
> >> >>> Back to my original question: how far into the "machine" can we
> >> >>> return (to SW) absolute knowledge of the states of packets?
> >> >>>
> >> >>> With normal queues (including scheduled queues), a successful
> >> >>> enqueue guarantees that the packet (event) was actually enqueued.
> >> >>> But pktio egress queues are not normal queues; they are essentially
> >> >>> representations of a network interface's TX queues, but they also
> >> >>> maintain the order restoration function for events enqueued to a
> >> >>> queue when processing an ordered queue.
> >> >>
> >> >> Why is this specific to pktio egress queues? This problem exists
> >> >> even with normal/scheduled queues. The issue is due to the source
> >> >> queue being an ordered queue.
> >> >
> >> > I don't think it is acceptable to silently drop events enqueued to
> >> > normal queues. That could lead to bad behaviour in applications
> >> > (e.g. resource leaks).
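[Editor's note] Bill's reference-tracking recipe (steps (a)-(d) above) and Ola's objection to it can be shown with a toy reference counter. The real ODP calls (odp_packet_ref_static(), odp_packet_has_ref(), odp_pktout_send()) are replaced here by invented stand-ins so the logic is self-contained; this is a model of the argument, not of the ODP implementation.

```c
#include <assert.h>

/* Toy model: refs counts outstanding references to a packet. */
typedef struct {
    int refs;
} pkt_t;

static void pkt_ref(pkt_t *p)     { p->refs++; }          /* (a) create a reference */
static int  pkt_has_ref(pkt_t *p) { return p->refs > 0; } /* (b)/(d) probe          */
static void pkt_tx(pkt_t *p)      { p->refs--; }          /* (c) successful TX frees the ref */
static void pkt_drop(pkt_t *p)    { p->refs--; }          /* ...but a drop frees it too      */
```

Both pkt_tx() and pkt_drop() leave pkt_has_ref() at 0, which is exactly Ola's point: watching the reference count go to zero cannot distinguish a packet that reached the wire from one dropped on the way.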
> >> > So out-of-order enqueue of events when processing an ordered queue
> >> > must be reliable, and I think we have some extra work to do here.
> >> >
> >> > Enqueue of packets to an egress queue is different, since the egress
> >> > queue represents a network interface. Transmission over networks is
> >> > never guaranteed to succeed.
> >>
> >> I think we should treat the network interface as any other stage in
> >> the packet processing pipeline. The enqueue API is informing the
> >> previous stage of the application that it has successfully enqueued
> >> the packets to the next stage (in this case the network interface).
> >> The network interface has the freedom to drop the packets if required
> >> (for example, if there is traffic management at the interface level).
> >
> > We have the TM API for using traffic management features of the HW.
> > This discussion is about normal packet out. My opinion is that it's
> > not acceptable
>
> Not acceptable for the ODP specification or for some/all ODP
> applications? As I wrote earlier, what might be formally acceptable (in
> the specification) might not be very useful to some/all applications,
> but this shouldn't necessarily limit the specification and specific
> implementations; there could be other trade-offs that make
> implementations which behave in a less-than-ideal way (in this regard)
> useful (to some applications). An implementation that is 100%
> functionally correct but dog slow is not useful. I think we need to
> separate between specification and implementations.
>
> > for packet output to first tell the application that a packet was
> > "accepted for transmission" and then drop it silently. Packet out
> > (it's a simple function) should be able to determine if a packet can
> > be accepted for transmission, and if it's accepted the packet will
> > eventually go out.
>
> Obviously, packet out is not so simple to implement when considering
> order restoration etc.
The original linux-generic implementation was wrong.
Ordering in the Linux-generic implementation was accidentally broken in
January when the new ordered queue implementation was added. I suppose it
worked before that.

> > Otherwise (e.g. the NIC TX queue is currently full),
>
> This separation between the ODP pktout (egress) queue and the NIC TX
> queue is part of the linux-generic implementation, so I am not sure what
> conclusions we can draw for the architecture. Conceptually, the ODP
> egress queue is the TX queue for the interface; everything beyond that
> is part of the implementation. From the ODP application perspective, the
> link starts at the ODP egress queue and ends at the ODP ingress queue.
> In between, packets can be dropped.
>
> > it must tell the application that the packet was "not accepted for
> > delivery". The application then decides if it wants to retry later,
> > or drop, or whatever.
> >
> > Obviously the pktout call cannot guarantee that a packet will reach
> > its destination on the other side of the network, but it should be
> > able to guarantee that a packet will be sent (the NIC queue has free
> > space) or cannot be sent (== the NIC queue does not have free space).
> >
> > An event queue towards pktout is just an extra abstraction over a
> > direct pktout queue (queue_enq == pktout_send).
>
> Not true in the case when processing an ordered queue, which must
> perform order restoration of enqueued events, while packets passed
> directly to odp_pktout_send() are not subject to order restoration.
>
> > -Petri
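[Editor's note] The application-side "retry later, or drop, or whatever" policy Petri describes could be sketched as below. send_with_retry() and accept_two() are invented names; the callback stands in for odp_pktout_send()/odp_queue_enq_multi(), both of which return the number of packets actually accepted, and plain ints stand in for packet handles.

```c
#include <assert.h>

/* Retry a partially accepted send a bounded number of times, then give
 * up and let the caller drop whatever remains. */
static int send_with_retry(int (*send_fn)(const int *pkts, int num),
                           const int pkts[], int num, int max_tries)
{
    int sent = 0;

    for (int t = 0; t < max_tries && sent < num; t++) {
        int n = send_fn(pkts + sent, num - sent);

        if (n < 0)
            break; /* hard error: give up on the remainder */
        sent += n;
    }
    /* The caller still owns pkts[sent..num-1] and must drop (free) them. */
    return sent;
}

/* Example backend that accepts at most two packets per call, mimicking a
 * nearly full NIC TX queue. */
static int accept_two(const int *pkts, int num)
{
    (void)pkts;
    return num < 2 ? num : 2;
}
```

Note that this only works if the enqueue call reliably reports the accepted count, which is precisely why silent drops after a "successful" enqueue are contentious in the thread above.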
