>> -----Original Message-----
>> From: lng-odp [mailto:[email protected]] On Behalf Of Savolainen,
>> Petri (Nokia - FI/Espoo)
>> Sent: Friday, June 02, 2017 1:18 PM
>> To: Honnappa Nagarahalli <[email protected]>; Ola Liljedahl
>> <[email protected]>
>> Cc: Elo, Matias (Nokia - FI/Espoo) <[email protected]>; nd
>> <[email protected]>; Kevin Wang <[email protected]>; Honnappa Nagarahalli
>> <[email protected]>; [email protected]
>> Subject: Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
>>
>> > -----Original Message-----
>> > From: lng-odp [mailto:[email protected]] On Behalf Of
>> > Honnappa Nagarahalli
>> > Sent: Thursday, June 01, 2017 11:30 PM
>> > To: Ola Liljedahl <[email protected]>
>> > Cc: [email protected]; Honnappa Nagarahalli
>> > <[email protected]>; Elo, Matias (Nokia - FI/Espoo)
>> > <[email protected]>; Kevin Wang <[email protected]>; nd <[email protected]>
>> > Subject: Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
>> >
>> > On 1 June 2017 at 15:20, Ola Liljedahl <[email protected]> wrote:
>> > >
>> > > On 01/06/2017, 22:15, "Honnappa Nagarahalli"
>> > > <[email protected]> wrote:
>> > >
>> > >> On 1 June 2017 at 15:09, Ola Liljedahl <[email protected]> wrote:
>> > >>>
>> > >>> On 01/06/2017, 21:03, "Bill Fischofer" <[email protected]>
>> > >>> wrote:
>> > >>>
>> > >>>> On Thu, Jun 1, 2017 at 10:59 AM, Honnappa Nagarahalli
>> > >>>> <[email protected]> wrote:
>> > >>>>> On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo)
>> > >>>>> <[email protected]> wrote:
>> > >>>>>>
>> > >>>>>>> On 31 May 2017, at 23:53, Bill Fischofer
>> > >>>>>>> <[email protected]> wrote:
>> > >>>>>>>
>> > >>>>>>> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
>> > >>>>>>> <[email protected]> wrote:
>> > >>>>>>>>
>> > >>>>>>>>>>> What's the purpose of calling ord_enq_multi() here? To save
>> > >>>>>>>>>>> (stash) packets if the thread is out-of-order? And when the
>> > >>>>>>>>>>> thread is in-order, it is re-enqueueing the packets, which
>> > >>>>>>>>>>> again will invoke pktout_enqueue/pktout_enq_multi, but this
>> > >>>>>>>>>>> time ord_enq_multi() will not save the packets; instead
>> > >>>>>>>>>>> they will actually be transmitted by odp_pktout_send()?
>> > >>>>>>>>>>
>> > >>>>>>>>>> Since transmitting packets may fail, out-of-order packets
>> > >>>>>>>>>> cannot be stashed here.
>> > >>>>>>>>>
>> > >>>>>>>>> You mean that the TX queue of the pktio might be full, so not
>> > >>>>>>>>> all packets will actually be enqueued for transmission.
>> > >>>>>>>>
>> > >>>>>>>> Yep.
>> > >>>>>>>>
>> > >>>>>>>>> This is an interesting case, but is it a must to know how
>> > >>>>>>>>> many packets are actually accepted? Packets can always be
>> > >>>>>>>>> dropped without notice; the question is from which point on
>> > >>>>>>>>> this is acceptable. If packets enqueued onto a pktout
>> > >>>>>>>>> (egress) queue are accepted, does this mean that they must
>> > >>>>>>>>> also be put onto the driver TX queue (as done by
>> > >>>>>>>>> odp_pktout_send)?
>> > >>>>>>>>
>> > >>>>>>>> Currently, the packet_io/queue APIs don't say anything about
>> > >>>>>>>> packets possibly being dropped after a successful
>> > >>>>>>>> odp_queue_enq() call to a pktout event queue. So, to be
>> > >>>>>>>> consistent with standard odp_queue_enq() operations, I think
>> > >>>>>>>> it is better to return the number of events actually accepted
>> > >>>>>>>> to the TX queue.
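>> > >>>>>>>>
>> > >>>>>>>> For illustration, that count-based contract would let the
>> > >>>>>>>> caller do something like this (sketch only; pktout_ev_queue,
>> > >>>>>>>> events and num are placeholder names, and the accepted-count
>> > >>>>>>>> return semantics are the proposal, not the current spec):
>> > >>>>>>>>
>> > >>>>>>>>     int accepted = odp_queue_enq_multi(pktout_ev_queue,
>> > >>>>>>>>                                        events, num);
>> > >>>>>>>>     if (accepted < 0)
>> > >>>>>>>>         accepted = 0; /* error: nothing was accepted */
>> > >>>>>>>>
>> > >>>>>>>>     /* Events [accepted..num-1] are still owned by the
>> > >>>>>>>>      * application: retry them later or free them. */
>> > >>>>>>>>     for (int i = accepted; i < num; i++)
>> > >>>>>>>>         odp_packet_free(odp_packet_from_event(events[i]));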
>> > >>>>>>>>
>> > >>>>>>>> To have more leeway, one option would be to modify the API
>> > >>>>>>>> documentation to state that packets may still be dropped
>> > >>>>>>>> after a successful odp_queue_enq() call, before reaching the
>> > >>>>>>>> NIC. If the application would like to be sure that the
>> > >>>>>>>> packets are actually sent, it should use odp_pktout_send()
>> > >>>>>>>> instead.
>> > >>>>>>>
>> > >>>>>>> Ordered queues simply say that packets will be delivered to
>> > >>>>>>> the next queue in the pipeline in the order they originated
>> > >>>>>>> from their source queue. What happens after that depends on
>> > >>>>>>> the attributes of the target queue. If the target queue is an
>> > >>>>>>> exit point from the application, then this is outside of ODP's
>> > >>>>>>> scope.
>> > >>>>>>
>> > >>>>>> My point was that with stashing the application has no way of
>> > >>>>>> knowing whether an ordered pktout enqueue call actually
>> > >>>>>> succeeded. In the case of parallel and atomic queues it does.
>> > >>>>>> So my question is, is this acceptable?
>> > >>>>>
>> > >>>>> Also, currently, it is not possible for the application to have
>> > >>>>> a consistent 'wait/drop on destination queue full' policy for
>> > >>>>> all the queue types.
>> > >>>>
>> > >>>> Today applications have no way of knowing whether packets sent
>> > >>>> to a pktout_queue or tm_queue actually make it to the wire or
>> > >>>> whether they are vaporized as soon as they hit the wire, so
>> > >>>> there's no change here. An RC of 0 simply says that the packet
>> > >>>> was "accepted" for transmission and hence the caller no longer
>> > >>>> owns that packet handle. You need higher-level protocols to track
>> > >>>> end-to-end transmission and receipt. All that ordered queues say
>> > >>>> is that packets being sent to TX queues will have those TX calls
>> > >>>> made in the same order as the source queue they originated from.
>> > >>>>
>> > >>>> The only way to track packet disposition today is to (a) create
>> > >>>> a reference to the packet you want to transmit, (b) verify that
>> > >>>> odp_packet_has_ref(original_pkt) > 0, indicating that an actual
>> > >>>> reference was created, (c) transmit that reference, and (d) note
>> > >>>> when odp_packet_has_ref(original_pkt) returns to 0. That confirms
>> > >>>> that the reference has exited the scope of this ODP instance,
>> > >>>> since a "successful" transmission will free that reference.
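>> > >>>>
>> > >>>> In code, steps (a)-(d) would look roughly like this (illustrative
>> > >>>> sketch only; pkt and pktout_ev_queue are assumed to exist, return
>> > >>>> codes are unchecked, and a real application would not busy-wait):
>> > >>>>
>> > >>>>     odp_packet_t ref = odp_packet_ref_static(pkt);      /* (a) */
>> > >>>>
>> > >>>>     if (ref != ODP_PACKET_INVALID &&
>> > >>>>         odp_packet_has_ref(pkt) > 0) {                  /* (b) */
>> > >>>>         odp_queue_enq(pktout_ev_queue,
>> > >>>>                       odp_packet_to_event(ref));        /* (c) */
>> > >>>>
>> > >>>>         while (odp_packet_has_ref(pkt) > 0)             /* (d) */
>> > >>>>             odp_cpu_pause(); /* ref freed => it has left this
>> > >>>>                               * ODP instance */
>> > >>>>     }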
>> > >>>
>> > >>> Doesn't this just confirm that the reference has been freed? You
>> > >>> don't know whether this was because the packet was actually
>> > >>> transmitted on the wire or because it was dropped before that
>> > >>> (which would also free the reference).
>> > >>>
>> > >>> Back to my original question: how far into the "machine" can we
>> > >>> return (to SW) absolute knowledge of the states of packets?
>> > >>>
>> > >>> With normal queues (including scheduled queues), a successful
>> > >>> enqueue guarantees that the packet (event) was actually enqueued.
>> > >>> But pktio egress queues are not normal queues; they are
>> > >>> essentially representations of a network interface's TX queues,
>> > >>> but they also maintain the order restoration function for events
>> > >>> enqueued while processing an ordered queue.
>> > >>
>> > >> Why is this specific to pktio egress queues? This problem exists
>> > >> even with normal/scheduled queues. The issue is due to the source
>> > >> queue being an ordered queue.
>> > >
>> > > I don't think it is acceptable to silently drop events enqueued to
>> > > normal queues. That could lead to bad behaviour in applications
>> > > (e.g. resource leaks). So out-of-order enqueue of events when
>> > > processing an ordered queue must be reliable, and I think we have
>> > > some extra work to do here.
>> > >
>> > > Enqueue of packets to an egress queue is different, since the
>> > > egress queue represents a network interface. Transmission over
>> > > networks is never guaranteed to succeed.
>> >
>> > I think we should treat the network interface as any other stage in
>> > the packet processing pipeline. The enqueue API is informing the
>> > previous stage of the application that it has successfully enqueued
>> > the packets to the next stage (in this case the network interface).
>> > The network interface has the freedom to drop the packets if required
>> > (for example, if there is traffic management at the interface level).
>>
>> We have the TM API for using traffic management features of the HW.
>> This discussion is about normal packet out. My opinion is that it's
>> not acceptable for packet output to first tell the application that a
>> packet was "accepted for transmission" and then drop it silently.
>> Packet out (it's a simple function) should be able to determine
>> whether a packet can be accepted for transmission, and if it's
>> accepted the packet will eventually go out. Otherwise (e.g. the NIC TX
>> queue is currently full), it must tell the application that the packet
>> was "not accepted for delivery". The application then decides whether
>> it wants to retry later, drop, or whatever.
>>
>> Obviously a pktout call cannot guarantee that a packet will reach its
>> destination on the other side of the network, but it should be able to
>> guarantee that a packet will be sent (the NIC queue has free space) or
>> cannot be sent (== the NIC queue does not have free space).
>>
>> An event queue towards pktout is just an extra abstraction over a
>> direct pktout queue (queue_enq == pktout_send).
>>
>> -Petri
>
> FWIW, I agree with Petri. If an application wants to do traffic
> management itself as opposed to using TM (e.g. there are special
> requirements that the TM cannot meet), it may need reliable information
> on whether an output packet really goes into the NIC queue or not.
> With DPDK we get the info and make use of it. It would be nice if that
> would work with ODP too.

Yes, but you can do that by using odp_pktout_send() directly.
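Something along these lines (untested sketch; pktout, pkt_tbl and num are
placeholder names, and a real application would bound the retries):

    int num_sent = 0;

    while (num_sent < num) {
        int ret = odp_pktout_send(pktout, &pkt_tbl[num_sent],
                                  num - num_sent);
        if (ret < 0)
            break; /* error: caller still owns the remaining packets */
        if (ret == 0)
            continue; /* NIC TX queue full: retry, back off, or drop */

        num_sent += ret;
    }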
If you are using the scheduler and ordered queues, there is not
necessarily any correlation between the call to odp_queue_enq() and when
the packets are actually enqueued to the egress queue and thus put onto
the driver TX queue.

> I am not sure that allowing the enqueue to pktout to succeed but then
> silently drop the packet would make the implementation significantly
> easier, at least in the linux-generic ODP.
>
> Janne

Implementations come and go. The question is what the ODP
architecture/specification should say about this.
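To make the ordered-queue point above concrete, the stashing behaviour
being discussed amounts to something like this (pseudo-C sketch of the
idea only, with invented helper names; this is not the actual patch
code):

    static int pktout_enq_multi(queue_entry_t *q, odp_packet_t pkt_tbl[],
                                int num)
    {
        if (!thread_is_in_order(q)) {
            /* Out of order: stash now, transmit later. A TX-queue-full
             * failure at that later point can no longer be reported to
             * the caller, which is the crux of this thread. */
            stash_for_later(q, pkt_tbl, num);
            return num; /* "accepted", yet TX may still fail */
        }

        /* In order: send immediately; the return value is meaningful. */
        return odp_pktout_send(q->pktout, pkt_tbl, num);
    }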
