> -----Original Message-----
> From: Ola Liljedahl [mailto:[email protected]]
> Sent: Friday, June 02, 2017 1:54 PM
> To: Peltonen, Janne (Nokia - FI/Espoo) <[email protected]>; Savolainen, Petri (Nokia - FI/Espoo) <[email protected]>; Honnappa Nagarahalli <[email protected]>
> Cc: Elo, Matias (Nokia - FI/Espoo) <[email protected]>; nd <[email protected]>; Kevin Wang <[email protected]>; Honnappa Nagarahalli <[email protected]>; [email protected]
> Subject: Re: Suspected SPAM - Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
>
> >> -----Original Message-----
> >> From: lng-odp [mailto:[email protected]] On Behalf Of Savolainen, Petri (Nokia - FI/Espoo)
> >> Sent: Friday, June 02, 2017 1:18 PM
> >> To: Honnappa Nagarahalli <[email protected]>; Ola Liljedahl <[email protected]>
> >> Cc: Elo, Matias (Nokia - FI/Espoo) <[email protected]>; nd <[email protected]>; Kevin Wang <[email protected]>; Honnappa Nagarahalli <[email protected]>; [email protected]
> >> Subject: Suspected SPAM - Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
> >>
> >> > -----Original Message-----
> >> > From: lng-odp [mailto:[email protected]] On Behalf Of Honnappa Nagarahalli
> >> > Sent: Thursday, June 01, 2017 11:30 PM
> >> > To: Ola Liljedahl <[email protected]>
> >> > Cc: [email protected]; Honnappa Nagarahalli <[email protected]>; Elo, Matias (Nokia - FI/Espoo) <[email protected]>; Kevin Wang <[email protected]>; nd <[email protected]>
> >> > Subject: Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
> >> >
> >> > On 1 June 2017 at 15:20, Ola Liljedahl <[email protected]> wrote:
> >> > >
> >> > > On 01/06/2017, 22:15, "Honnappa Nagarahalli" <[email protected]> wrote:
> >> > >
> >> > >>On 1 June 2017 at 15:09, Ola Liljedahl <[email protected]> wrote:
> >> > >>>
> >> > >>> On 01/06/2017, 21:03, "Bill Fischofer" <[email protected]> wrote:
> >> > >>>
> >> > >>>>On Thu, Jun 1, 2017 at 10:59 AM, Honnappa Nagarahalli <[email protected]> wrote:
> >> > >>>>> On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo) <[email protected]> wrote:
> >> > >>>>>>
> >> > >>>>>>> On 31 May 2017, at 23:53, Bill Fischofer <[email protected]> wrote:
> >> > >>>>>>>
> >> > >>>>>>> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo) <[email protected]> wrote:
> >> > >>>>>>>>
> >> > >>>>>>>>>>> What's the purpose of calling ord_enq_multi() here? To save (stash) packets if the thread is out-of-order?
> >> > >>>>>>>>>>> And when the thread is in-order, it is re-enqueueing the packets, which again will invoke pktout_enqueue/pktout_enq_multi, but this time ord_enq_multi() will not save the packets; instead they will actually be transmitted by odp_pktout_send()?
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> Since transmitting packets may fail, out-of-order packets cannot be stashed here.
> >> > >>>>>>>>> You mean that the TX queue of the pktio might be full so not all packets will actually be enqueued for transmission.
> >> > >>>>>>>>
> >> > >>>>>>>> Yep.
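For concreteness, a minimal sketch of what a full TX queue looks like to the caller, assuming an already configured pktio with direct pktout queues: odp_pktout_send() returns how many packets the TX queue actually accepted, and the caller keeps ownership of the rest. The drop policy below is only an example.

#include <odp_api.h>

/* Minimal sketch: send a burst and handle a full TX queue. The caller
 * keeps ownership of any packets that were not accepted. */
static void send_burst(odp_pktout_queue_t pktout, odp_packet_t pkt[], int num)
{
        int sent = odp_pktout_send(pktout, pkt, num);

        if (sent < 0)
                sent = 0; /* send failed, nothing was accepted */

        /* Example policy only: drop what the TX queue did not accept.
         * An application could equally well stash and retry later. */
        if (sent < num)
                odp_packet_free_multi(&pkt[sent], num - sent);
}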
> >> > >>>>>>>>
> >> > >>>>>>>>> This is an interesting case, but is it a must to know how many packets are actually accepted? Packets can always be dropped without notice; the question is from which point this is acceptable. If packets enqueued onto a pktout (egress) queue are accepted, does this mean that they must also be put onto the driver TX queue (as done by odp_pktout_send)?
> >> > >>>>>>>>>
> >> > >>>>>>>> Currently, the packet_io/queue APIs don't say anything about packets possibly being dropped after successfully calling odp_queue_enq() to a pktout event queue. So, to be consistent with standard odp_queue_enq() operations, I think it is better to return the number of events actually accepted to the TX queue.
> >> > >>>>>>>>
> >> > >>>>>>>> To have more leeway, one option would be to modify the API documentation to state that packets may still be dropped after a successful odp_queue_enq() call before reaching the NIC. If the application would like to be sure that the packets are actually sent, it should use odp_pktout_send() instead.
> >> > >>>>>>>
> >> > >>>>>>> Ordered queues simply say that packets will be delivered to the next queue in the pipeline in the order they originated from their source queue. What happens after that depends on the attributes of the target queue. If the target queue is an exit point from the application, then this is outside of ODP's scope.
> >> > >>>>>>
> >> > >>>>>> My point was that with stashing the application has no way of knowing if an ordered pktout enqueue call actually succeeded. In the case of parallel and atomic queues it does. So my question is: is this acceptable?
> >> > >>>>>>
> >> > >>>>> Also, currently, it is not possible for the application to have a consistent 'wait/drop on destination queue full' policy for all the queue types.
> >> > >>>>
> >> > >>>>Today applications have no way of knowing whether packets sent to a pktout_queue or tm_queue actually make it to the wire or whether they are vaporized as soon as they hit the wire, so there's no change here. An RC of 0 simply says that the packet was "accepted" for transmission and hence the caller no longer owns that packet handle. You need higher-level protocols to track end-to-end transmission and receipt. All that ordered queues say is that packets being sent to TX queues will have those TX calls made in the same order as the source queue they originated from.
> >> > >>>>
> >> > >>>>The only way to track packet disposition today is to (a) create a reference to the packet you want to transmit, (b) verify that odp_packet_has_ref(original_pkt) > 0, indicating that an actual reference was created, (c) transmit that reference, and (d) note when odp_packet_has_ref(original_pkt) returns to 0.
> >> > >>>>That confirms that the reference has exited the scope of this ODP instance, since a "successful" transmission will free that reference.
> >> > >>> Doesn't this just confirm that the reference has been freed? But you don't know if this was due to the packet actually being transmitted on the wire or if it was dropped before that (which would also free the reference).
> >> > >>>
> >> > >>> Back to my original question: how far into the "machine" can we return (to SW) absolute knowledge of the states of packets?
> >> > >>>
> >> > >>> With normal queues (including scheduled queues), a successful enqueue guarantees that the packet (event) was actually enqueued. But pktio egress queues are not normal queues; they are essentially representations of a network interface's TX queues, but also maintain the order restoration function of events enqueued to a queue when processing an ordered queue.
> >> > >>>
> >> > >>Why is this specific to pktio egress queues? This problem exists even with normal/scheduled queues. The issue is due to the source queue being an ordered queue.
> >> > > I don't think it is acceptable to silently drop events enqueued to normal queues. That could lead to bad behaviour in applications (e.g. resource leaks). So out-of-order enqueue of events when processing an ordered queue must be reliable, and I think we have some extra work to do here.
> >> > >
> >> > > Enqueue of packets to an egress queue is different, since the egress queue represents a network interface. Transmission over networks is never guaranteed to succeed.
> >> > >
> >> > I think we should treat the network interface as any other stage in the packet processing pipeline. The enqueue API is informing the previous stage of the application that it has successfully enqueued the packets to the next stage (in this case the network interface). The network interface has the freedom to drop the packets if required (for example, if there is traffic management at the interface level).
> >> >
> >> We have the TM API for using traffic management features of the HW. This discussion is about normal packet out. My opinion is that it's not acceptable for packet output to first tell the application that a packet was "accepted for transmission" and then drop it silently. Packet out (it's a simple function) should be able to determine if a packet can be accepted for transmission, and if it's accepted the packet will eventually go out. Otherwise (e.g. the NIC TX queue is currently full), it must tell the application that the packet was "not accepted for delivery". The application then decides if it wants to retry later, or drop, or whatever.
> >>
> >> Obviously a pktout call cannot guarantee that a packet will reach its destination on the other side of the network, but it should be able to guarantee that a packet will be sent (the NIC queue has free space) or cannot be sent (== the NIC queue does not have free space).
> >>
> >> An event queue towards pktout is just an extra abstraction over a direct pktout queue (queue_enq == pktout_send).
> >>
> >> -Petri
> >>
> >FWIW, I agree with Petri.
> >If an application wants to do traffic management itself as opposed to using TM (e.g. there are special requirements that the TM cannot meet), it may need reliable information on whether an output packet really goes into the NIC queue or not. With DPDK we get the info and make use of it. It would be nice if that would work with ODP too.
>
> Yes, but you can do that by using odp_pktout_send() directly.
>
> If you are using the scheduler and ordered queues, there is not necessarily any correlation between the call to odp_queue_enq() and when packets are actually enqueued to the egress queue and thus put onto the driver TX queue.
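To make Bill's (a)-(d) reference-tracking idea quoted above concrete, here is a minimal sketch, assuming an ODP version that supports static packet references (odp_packet_ref_static()). As Ola points out, step (d) only proves that the reference was freed, not that the packet made it onto the wire.

#include <odp_api.h>

/* Sketch of the (a)-(d) technique quoted above. */
static int send_and_track(odp_pktout_queue_t pktout, odp_packet_t pkt)
{
        /* (a) create a reference to the packet we want to transmit */
        odp_packet_t ref = odp_packet_ref_static(pkt);

        /* (b) verify that an actual reference was created */
        if (ref == ODP_PACKET_INVALID || odp_packet_has_ref(pkt) == 0)
                return -1;

        /* (c) transmit the reference */
        if (odp_pktout_send(pktout, &ref, 1) != 1) {
                odp_packet_free(ref);
                return -1;
        }

        /* (d) wait until the reference has been freed. Caveat: this only
         * shows the reference left this ODP instance; it may have been
         * transmitted or dropped on the way. A real application would poll
         * from its main loop instead of spinning here. */
        while (odp_packet_has_ref(pkt))
                odp_cpu_pause();

        return 0;
}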
As I stated before, the important thing here is whether a packet was accepted for delivery or not. Accepted means that it will be sent out eventually. Reordering is a feature of the event type pktout queue. The implementation just needs to make sure that it does not overbook any potential "mid" queues between the event queue enqueue and the output interface. Once a packet is accepted for delivery, it is just a matter of *when* it is sent out.

I think that even with the "scalable scheduler" the synchronization overhead is not as dramatic as speculated on this thread.

-Petri
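A hypothetical sketch of what "not overbooking" the mid queues could look like inside an implementation: reserve TX capacity when the event queue accepts a packet, and release it when the corresponding TX ring slot frees up, so that the deferred (order-restored) send cannot fail later. The tx_credit_t type and function names are illustrative only, not part of any ODP API.

#include <odp_api.h>
#include <stdint.h>

/* Hypothetical: one credit per free slot in the NIC TX ring. */
typedef struct {
        odp_atomic_u32_t credits; /* initialized to the TX ring depth */
} tx_credit_t;

/* Called when a packet is enqueued to the pktout event queue. The enqueue
 * is rejected unless TX capacity can be reserved, so a later in-order send
 * is guaranteed to find room. */
static int tx_credit_acquire(tx_credit_t *tc)
{
        uint32_t old = odp_atomic_load_u32(&tc->credits);

        do {
                if (old == 0)
                        return -1; /* would overbook: reject the enqueue */
        } while (!odp_atomic_cas_u32(&tc->credits, &old, old - 1));

        return 0;
}

/* Called when the NIC has completed transmission and the TX ring slot is
 * free again. */
static void tx_credit_release(tx_credit_t *tc)
{
        odp_atomic_add_u32(&tc->credits, 1);
}

With a scheme like this, the event queue can reject packets up front, and the actual odp_pktout_send() after order restoration is guaranteed to find room.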
