Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On 2 June 2017 at 05:17, Savolainen, Petri (Nokia - FI/Espoo) <petri.savolai...@nokia.com> wrote:
> [...]
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On 02/06/2017, 12:53, "Peltonen, Janne (Nokia - FI/Espoo)" wrote:

>>> for packet output to first tell application that a packet was "accepted
>>> for transmission" and then drop it silently. Packet out (it's a simple
>>> function) should be able to determine if packet can be accepted for
>>> transmission and if it's accepted the packet will eventually go out.
>>
>> Obviously, packet out is not so simple to implement when considering order
>> restoration etc. The original linux-generic implementation was wrong.
>
> Ordering in the Linux-generic implementation went accidentally broken
> in January when the new ordered queue implementation was added. I suppose
> it worked before that.

OK, so remove the word "original" from my statement above.
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On 02/06/2017, 12:17, "Savolainen, Petri (Nokia - FI/Espoo)" <petri.savolai...@nokia.com> wrote:
> [...]
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On 01/06/2017, 22:15, "Honnappa Nagarahalli" wrote:
> [...]
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On 1 June 2017 at 15:09, Ola Liljedahl wrote:
> [...]
>
> With normal queues (including scheduled queues), a successful enqueue
> guarantees that the packet (event) was actually enqueued. But pktio egress
> queues are not normal queues, they are essentially representations of a
> network interface's TX queues but also maintain the order restoration
> function of events enqueued to a queue when processing an ordered queue.

Why is this specific to pktio egress queues. This
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On 01/06/2017, 21:03, "Bill Fischofer" wrote:
> [...]
>
> The only way to track packet disposition today is to (a) create a
> reference to the packet you want to transmit, (b) verify that
> odp_packet_has_ref(original_pkt) > 0, indicating that an actual
> reference was created, (c) transmit that reference, and (d) note when
> odp_packet_has_ref(original_pkt) returns to 0. That confirms that the
> reference has exited the scope of this ODP instance since a
> "successful" transmission will free that reference.

Doesn't this just confirm that the reference has been freed? But you don't
know if this was due to the packet actually being transmitted on the wire
or if it was dropped before that (which would also free the reference).

Back to my original question, how far into the "machine" can we return
(to SW) absolute knowledge of the states of packets?

With normal queues (including scheduled queues), a successful enqueue
guarantees that the packet (event) was actually enqueued. But pktio egress
queues are not normal queues, they are essentially representations of a
network interface's TX queues but also maintain the order restoration
function of events enqueued to a queue when processing an ordered queue.

I interpret your comments Bill as even if enqueue to a pktio egress queue
is successful (the packet handle is no longer owned by the application),
the implementation from that moment on can do whatever it wants with the
packets (as long as
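Ola's objection can be made concrete with a tiny mock. This is not ODP code: the packet struct, `mock_packet_ref()`, `mock_packet_has_ref()` and the transmit stub below are invented stand-ins for `odp_packet_ref()` / `odp_packet_has_ref()`, used only to show that a reference count falling to zero cannot distinguish a packet transmitted on the wire from one dropped before it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an ODP packet with a reference count. */
typedef struct {
	int refs;     /* extra references held on the packet */
	bool on_wire; /* did the frame actually leave the NIC? */
} mock_packet_t;

/* Stand-in for odp_packet_ref(): create one extra reference. */
static void mock_packet_ref(mock_packet_t *pkt)
{
	pkt->refs++;
}

/* Stand-in for odp_packet_has_ref(): non-zero while references remain. */
static int mock_packet_has_ref(const mock_packet_t *pkt)
{
	return pkt->refs;
}

/* Transmit path: the reference is freed whether the frame made it out
 * or was dropped by a full TX queue, so the observer cannot tell the
 * two outcomes apart. */
static void mock_tx(mock_packet_t *pkt, bool tx_queue_full)
{
	pkt->on_wire = !tx_queue_full;
	pkt->refs--; /* freed in both cases */
}
```

Both a transmitted and a dropped packet end with `mock_packet_has_ref() == 0`, which is exactly the ambiguity raised above.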
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On Thu, Jun 1, 2017 at 10:59 AM, Honnappa Nagarahalli wrote:
> [...]
>
> Also, currently, it is not possible for the application to have a
> consistent 'wait/drop on destination queue full' policy for all the
> queue types.

Today applications have no way of knowing whether packets sent to a
pktout_queue or tm_queue actually make it to the wire or whether they
are vaporized as soon as they hit the wire, so there's no change here.
An RC of 0 simply says that the packet was "accepted" for transmission
and hence the caller no longer owns that packet handle. You need
higher-level protocols to track end-to-end transmission and receipt.
All that ordered queues say is that packets being sent to TX queues
will have those TX calls made in the same order as the source queue
they originated from.

The only way to track packet disposition today is to (a) create a
reference to the packet you want to transmit, (b) verify that
odp_packet_has_ref(original_pkt) > 0, indicating that an actual
reference was created, (c) transmit that reference, and (d) note when
odp_packet_has_ref(original_pkt) returns to 0. That confirms that the
reference has exited the scope of this ODP instance since a
"successful" transmission will free that reference.
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo) wrote:
> [...]
>
> My point was that with stashing the application has no way of knowing if an
> ordered pktout enqueue call actually succeed. In case of parallel and atomic
> queues it does. So my question is, is this acceptable?

Also, currently, it is not possible for the application to have a
consistent 'wait/drop on destination queue full' policy for all the
queue types.
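For illustration, a 'wait on TX queue full' policy can be layered on top of the documented `odp_pktout_send()` semantics, which returns how many of the first `num` packets were accepted. Everything in this sketch is hypothetical: `send_fn_t`, `send_all()`, the stub sender and the retry bound are not part of the ODP API, they only model the policy being discussed.

```c
#include <assert.h>

/* Hypothetical function-pointer type standing in for odp_pktout_send():
 * returns how many of the first 'num' packets were accepted, < 0 on error.
 * Packets are modeled as plain ints here. */
typedef int (*send_fn_t)(const int *pkts, int num);

/* Sketch of a 'wait on full' policy: retry the unsent tail until all
 * packets are accepted or the retry budget runs out. Returns the number
 * of packets eventually accepted (the caller still owns the rest). */
static int send_all(send_fn_t send_fn, const int *pkts, int num,
		    int max_retries)
{
	int sent = 0;

	while (sent < num && max_retries-- > 0) {
		int n = send_fn(pkts + sent, num - sent);

		if (n < 0)
			break; /* hard error: give up */
		sent += n;
	}
	return sent;
}

/* Example stub sender: accepts at most 2 packets per call, as if the
 * TX queue drains slowly. */
static int stub_send(const int *pkts, int num)
{
	(void)pkts;
	return num > 2 ? 2 : num;
}
```

A 'drop on full' policy is the degenerate case `max_retries == 1`: whatever is not accepted on the first call is freed by the caller.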
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
> On 31 May 2017, at 23:53, Bill Fischoferwrote: > > On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo) > wrote: >> > What’s the purpose of calling ord_enq_multi() here? To save (stash) > packets if the thread is out-of-order? > And when the thread is in-order, it is re-enqueueing the packets which > again will invoke pktout_enqueue/pktout_enq_multi but this time > ord_enq_multi() will not save the packets, instead they will actually be > transmitted by odp_pktout_send()? > Since transmitting packets may fail, out-of-order packets cannot be stashed here. >>> You mean that the TX queue of the pktio might be full so not all packets >>> will actually be enqueued for transmission. >> >> Yep. >> >>> This is an interesting case but is it a must to know how many packets are >>> actually accepted? Packets can always be dropped without notice, the >>> question is from which point this is acceptable. If packets enqueued onto >>> a pktout (egress) queue are accepted, this means that they must also be >>> put onto the driver TX queue (as done by odp_pktout_send)? >>> >> >> Currently, the packet_io/queue APIs don't say anything about packets being >> possibly dropped after successfully calling odp_queue_enq() to a pktout >> event queue. So to be consistent with standard odp_queue_enq() operations I >> think it is better to return the number of events actually accepted to the >> TX queue. >> >> To have more leeway one option would be to modify the API documentation to >> state that packets may still be dropped after a successful odp_queue_enq() >> call >> before reaching the NIC. If the application would like to be sure that the >> packets are actually sent, it should use odp_pktout_send() instead. > > Ordered queues simply say that packets will be delivered to the next > queue in the pipeline in the order they originated from their source > queue. What happens after that depends on the attributes of the target > queue. 
If the target queue is an exit point from the application, then > this is outside of ODP's scope. My point was that with stashing the application has no way of knowing whether an ordered pktout enqueue call actually succeeded. In the case of parallel and atomic queues it does. So my question is: is this acceptable?
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
We are in the process of implementing a new queue interface which is more modular. Kevin 2017-05-23 16:22 GMT+08:00 Savolainen, Petri (Nokia - FI/Espoo) < petri.savolai...@nokia.com>: > > > diff --git a/platform/linux-generic/include/odp_queue_internal.h > > b/platform/linux-generic/include/odp_queue_internal.h > > index 560f826e..cbe065dc 100644 > > --- a/platform/linux-generic/include/odp_queue_internal.h > > +++ b/platform/linux-generic/include/odp_queue_internal.h > > @@ -19,16 +19,24 @@ extern "C" { > > #endif > > > > #include > > -#include > > -#include > > -#include > > -#include > > +#include > > +#include > > #include > > #include > > #include > > #include > > + > > #include > > > > +#include > > +#include > > +#include > > +#include > > +#ifdef ODP_SCHEDULE_SCALABLE > > +#include > > +#include > > +#endif > > + > > #define QUEUE_MULTI_MAX CONFIG_BURST_SIZE > > > > #define QUEUE_STATUS_FREE 0 > > @@ -37,8 +45,6 @@ extern "C" { > > #define QUEUE_STATUS_NOTSCHED 3 > > #define QUEUE_STATUS_SCHED4 > > > > - > > -/* forward declaration */ > > union queue_entry_u; > > > > typedef int (*enq_func_t)(union queue_entry_u *, odp_buffer_hdr_t *); > > @@ -49,6 +55,47 @@ typedef int (*enq_multi_func_t)(union queue_entry_u *, > > typedef int (*deq_multi_func_t)(union queue_entry_u *, > > odp_buffer_hdr_t **, int); > > > > +#ifdef ODP_SCHEDULE_SCALABLE > > The queue interface still needs to be improved, so that there's no > conditional compilation like this piece of code here.
> > > > + > > +struct queue_entry_s { > > + sched_elem_t sched_elem; > > + > > + odp_ticketlock_t lock ODP_ALIGNED_CACHE; > > + int status; > > + > > + enq_func_t enqueue ODP_ALIGNED_CACHE; > > + deq_func_t dequeue; > > + enq_multi_func_t enqueue_multi; > > + deq_multi_func_t dequeue_multi; > > + > > + uint32_t index; > > + odp_queue_thandle; > > + odp_queue_type_t type; > > + odp_queue_param_t param; > > + odp_pktin_queue_t pktin; > > + odp_pktout_queue_t pktout; > > + char name[ODP_QUEUE_NAME_LEN]; > > +}; > > + > > +int _odp_queue_deq(sched_elem_t *q, odp_buffer_hdr_t *buf_hdr[], int > > num); > > +int _odp_queue_deq_sc(sched_elem_t *q, odp_event_t *evp, int num); > > +int _odp_queue_deq_mc(sched_elem_t *q, odp_event_t *evp, int num); > > + > > +/* Round up memory size to next cache line size to > > + * align all memory addresses on cache line boundary. > > + */ > > +static inline void *shm_pool_alloc_align(_odp_ishm_pool_t *pool, > uint32_t > > size) > > +{ > > + void *addr; > > + > > + addr = _odp_ishm_pool_alloc(pool, ROUNDUP_CACHE_LINE(size)); > > + ODP_ASSERT(((uintptr_t)addr & (ODP_CACHE_LINE_SIZE - 1)) == 0); > > + > > + return addr; > > +} > > + > > +#else > > > > > > diff --git a/platform/linux-generic/odp_packet_io.c b/platform/linux- > > generic/odp_packet_io.c > > index d8cae15c..717e7e33 100644 > > --- a/platform/linux-generic/odp_packet_io.c > > +++ b/platform/linux-generic/odp_packet_io.c > > @@ -5,23 +5,25 @@ > > */ > > #include > > > > -#include > > -#include > > -#include > > #include > > -#include > > -#include > > +#include > > #include > > #include > > #include > > -#include > > +#include > > + > > +#include > > #include > > -#include > > -#include > > -#include > > #include > > + > > +#include > > +#include > > #include > > -#include > > +#include > > +#include > > +#include > > +#include > > +#include > > > > #include > > #include > > @@ -472,7 +474,6 @@ int odp_pktio_start(odp_pktio_t hdl) > > return -1; > > } > > } > > - > > 
sched_fn->pktio_start(pktio_to_id(hdl), num, index); > > } > > > > @@ -554,7 +555,6 @@ static inline int pktin_recv_buf(odp_pktin_queue_t > > queue, > > odp_packet_t packets[num]; > > odp_packet_hdr_t *pkt_hdr; > > odp_buffer_hdr_t *buf_hdr; > > - odp_buffer_t buf; > > int i; > > int pkts; > > int num_rx = 0; > > @@ -564,9 +564,7 @@ static inline int pktin_recv_buf(odp_pktin_queue_t > > queue, > > for (i = 0; i < pkts; i++) { > > pkt = packets[i]; > > pkt_hdr = odp_packet_hdr(pkt); > > - buf = _odp_packet_to_buffer(pkt); > > - buf_hdr = buf_hdl_to_hdr(buf); > > - > > + buf_hdr = > > buf_hdl_to_hdr(_odp_packet_to_buffer(pkt)); > > > Do not include unnecessary style changes like this one (and the one above) > into this patch. These are just cosmetic and obscure the functional > change you are trying to achieve. These waste reviewer time to figure out > why this change was needed - just to realize that it was not actually > needed. > > Also e.g. in these cases, I like the original style better. > > > > > if
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On Wed, May 31, 2017 at 3:53 PM, Bill Fischofer wrote: > On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo) > wrote: >> > What’s the purpose of calling ord_enq_multi() here? To save (stash) > packets if the thread is out-of-order? > And when the thread is in-order, it is re-enqueueing the packets which > again will invoke pktout_enqueue/pktout_enq_multi but this time > ord_enq_multi() will not save the packets, instead they will actually be > transmitted by odp_pktout_send()? > Since transmitting packets may fail, out-of-order packets cannot be stashed here. >>> You mean that the TX queue of the pktio might be full so not all packets >>> will actually be enqueued for transmission. >> >> Yep. >> >>> This is an interesting case but is it a must to know how many packets are >>> actually accepted? Packets can always be dropped without notice, the >>> question is from which point this is acceptable. If packets enqueued onto >>> a pktout (egress) queue are accepted, this means that they must also be >>> put onto the driver TX queue (as done by odp_pktout_send)? >>> >> >> Currently, the packet_io/queue APIs don't say anything about packets being >> possibly dropped after successfully calling odp_queue_enq() to a pktout >> event queue. So to be consistent with standard odp_queue_enq() operations I >> think it is better to return the number of events actually accepted to the >> TX queue. >> >> To have more leeway one option would be to modify the API documentation to >> state that packets may still be dropped after a successful odp_queue_enq() >> call >> before reaching the NIC. If the application would like to be sure that the >> packets are actually sent, it should use odp_pktout_send() instead. > > Ordered queues simply say that packets will be delivered to the next > queue in the pipeline in the order they originated from their source > queue. What happens after that depends on the attributes of the target > queue.
If the target queue is an exit point from the application, then > this is outside of ODP's scope. > > Ethernet is a lossy protocol, so just because packets are > "transmitted" in order doesn't mean that they can't be dropped or > reordered by some other node en route to their final destination. This > is why TCP has ACK packets to confirm receipt. All that ODP says is > that it will do its part to preserve order and will not introduce > reorderings itself. So I don't see ODP ordered queue processing > needing to treat pktout queues any differently than it would for any > other target queue. The original oversight was simply that reorder > processing was not being used at all for these targets. Forgot to add that odp_tm_queues have the same need. I don't recall if the new ordered queue logic covers them as well. > >> >> -Matias >>
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo) wrote: > What’s the purpose of calling ord_enq_multi() here? To save (stash) packets if the thread is out-of-order? And when the thread is in-order, it is re-enqueueing the packets which again will invoke pktout_enqueue/pktout_enq_multi but this time ord_enq_multi() will not save the packets, instead they will actually be transmitted by odp_pktout_send()? >>> >>> Since transmitting packets may fail, out-of-order packets cannot be >>> stashed here. >> You mean that the TX queue of the pktio might be full so not all packets >> will actually be enqueued for transmission. > > Yep. > >> This is an interesting case but is it a must to know how many packets are >> actually accepted? Packets can always be dropped without notice, the >> question is from which point this is acceptable. If packets enqueued onto >> a pktout (egress) queue are accepted, this means that they must also be >> put onto the driver TX queue (as done by odp_pktout_send)? >> > > Currently, the packet_io/queue APIs don't say anything about packets being > possibly dropped after successfully calling odp_queue_enq() to a pktout > event queue. So to be consistent with standard odp_queue_enq() operations I > think it is better to return the number of events actually accepted to the TX > queue. > > To have more leeway one option would be to modify the API documentation to > state that packets may still be dropped after a successful odp_queue_enq() > call > before reaching the NIC. If the application would like to be sure that the > packets are actually sent, it should use odp_pktout_send() instead. Ordered queues simply say that packets will be delivered to the next queue in the pipeline in the order they originated from their source queue. What happens after that depends on the attributes of the target queue. If the target queue is an exit point from the application, then this is outside of ODP's scope.
Ethernet is a lossy protocol, so just because packets are "transmitted" in order doesn't mean that they can't be dropped or reordered by some other node en route to their final destination. This is why TCP has ACK packets to confirm receipt. All that ODP says is that it will do its part to preserve order and will not introduce reorderings itself. So I don't see ODP ordered queue processing needing to treat pktout queues any differently than it would for any other target queue. The original oversight was simply that reorder processing was not being used at all for these targets. > > -Matias >
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
>>> What’s the purpose of calling ord_enq_multi() here? To save (stash) >>> packets if the thread is out-of-order? >>> And when the thread is in-order, it is re-enqueueing the packets which >>> again will invoke pktout_enqueue/pktout_enq_multi but this time >>> ord_enq_multi() will not save the packets, instead they will actually be >>> transmitted by odp_pktout_send()? >>> >> >> Since transmitting packets may fail, out-of-order packets cannot be >> stashed here. > You mean that the TX queue of the pktio might be full so not all packets > will actually be enqueued for transmission. Yep. > This is an interesting case but is it a must to know how many packets are > actually accepted? Packets can always be dropped without notice, the > question is from which point this is acceptable. If packets enqueued onto > a pktout (egress) queue are accepted, this means that they must also be > put onto the driver TX queue (as done by odp_pktout_send)? > Currently, the packet_io/queue APIs don't say anything about packets being possibly dropped after successfully calling odp_queue_enq() to a pktout event queue. So to be consistent with standard odp_queue_enq() operations I think it is better to return the number of events actually accepted to the TX queue. To have more leeway one option would be to modify the API documentation to state that packets may still be dropped after a successful odp_queue_enq() call before reaching the NIC. If the application would like to be sure that the packets are actually sent, it should use odp_pktout_send() instead. -Matias
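The semantics Matias argues for — enqueue reports how many packets were actually accepted, since the driver TX queue may be full — can be illustrated with a toy bounded TX ring. This is a hypothetical sketch, not ODP code: `tx_ring_t` and `tx_ring_enq` are invented names standing in for a driver TX queue behind odp_pktout_send().

```c
#include <string.h>

/* Toy bounded TX ring illustrating "return the number of events actually
 * accepted": a nearly-full queue accepts only part of a burst.
 * tx_ring_t and tx_ring_enq are invented names, not ODP API. */
#define TX_RING_SIZE 4

typedef struct {
    int pkts[TX_RING_SIZE];
    int count;              /* packets currently queued for TX */
} tx_ring_t;

/* Enqueue up to num packets; return how many were actually accepted
 * (may be less than num when the ring lacks room). */
static int tx_ring_enq(tx_ring_t *r, const int *pkts, int num)
{
    int room = TX_RING_SIZE - r->count;
    int n = num < room ? num : room;

    memcpy(&r->pkts[r->count], pkts, (size_t)n * sizeof(int));
    r->count += n;
    return n;
}
```

A caller that treats a short return as full success silently drops packets; surfacing the count lets the application retry the remainder or account for the drop explicitly — which is exactly why stashing out-of-order packets (where no count can be returned to the caller) is problematic here.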
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On 31/05/2017, 12:18, "Elo, Matias (Nokia - FI/Espoo)" wrote: > >> On 31 May 2017, at 12:04, Ola Liljedahl wrote: >> >> >> >> On 31/05/2017, 10:38, "Peltonen, Janne (Nokia - FI/Espoo)" >> wrote: >> >>> >>> >>> Ola Liljedahl wrote: On 23/05/2017, 16:49, "Peltonen, Janne (Nokia - FI/Espoo)" wrote: > >> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[], >> + int num, int *ret) >> +{ >> +(void)queue_index; >> +(void)p_buf_hdr; >> +(void)num; >> +(void)ret; >> +return 0; >> +} > > How is packet order maintained when enqueuing packets read from an ordered > queue to a pktout queue? Matias' recent fix uses the ord_enq_multi > scheduler > function for that, but this version does not do any ordering. Or is >the > ordering guaranteed by some other means? The scalable scheduler standard queue enqueue function also handles ordered queues. odp_queue_scalable.c can refer to the same thread-specific data as odp_schedule_scalable.c so we don't need this internal interface. We could perhaps adapt the code to use this interface but I think this interface is just an artefact of the implementation of the default queues/scheduler. >>> >>> The problem is that odp_pktout_queue_config() sets qentry->s.enqueue >>> to pktout_enqueue() and that does not have any of the scalable >>>scheduler >>> specific magic that odp_queue_scalable.c:queue_enq{_multi}() has. So >>> ordering does not happen for pktout queues even if it works for other >>> queues, right? >> This must be a recent change, it doesn’t look like that in the working >> branch we are using. >> I see the code when changing to the master branch. >> The code in pktout_enqueue() does look like a hack: >>if (sched_fn->ord_enq_multi(qentry->s.index, (void **)buf_hdr, >> len, )) >> A cast to “void **”??? >> >> What’s the purpose of calling ord_enq_multi() here? To save (stash) >> packets if the thread is out-of-order?
>> And when the thread is in-order, it is re-enqueueing the packets which >> again will invoke pktout_enqueue/pktout_enq_multi but this time >> ord_enq_multi() will not save the packets, instead they will actually be >> transmitted by odp_pktout_send()? >> > >Since transmitting packets may fail, out-of-order packets cannot be >stashed here. You mean that the TX queue of the pktio might be full so not all packets will actually be enqueued for transmission. This is an interesting case but is it a must to know how many packets are actually accepted? Packets can always be dropped without notice, the question is from which point this is acceptable. If packets enqueued onto a pktout (egress) queue are accepted, this means that they must also be put onto the driver TX queue (as done by odp_pktout_send)? >With the current scheduler implementation sched_fn->ord_enq_multi() waits >until >in-order and always returns 0 (in case of pktout queue). After this >odp_pktout_send() >is called. > >-Matias >
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
> On 31 May 2017, at 12:04, Ola Liljedahl wrote: > > > > On 31/05/2017, 10:38, "Peltonen, Janne (Nokia - FI/Espoo)" > wrote: > >> >> >> Ola Liljedahl wrote: >>> On 23/05/2017, 16:49, "Peltonen, Janne (Nokia - FI/Espoo)" >>> wrote: >>> >>> > +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[], > + int num, int *ret) > +{ > + (void)queue_index; > + (void)p_buf_hdr; > + (void)num; > + (void)ret; > + return 0; > +} How is packet order maintained when enqueuing packets read from an >>> ordered queue to a pktout queue? Matias' recent fix uses the ord_enq_multi scheduler function for that, but this version does not do any ordering. Or is the ordering guaranteed by some other means? >>> The scalable scheduler standard queue enqueue function also handles >>> ordered queues. odp_queue_scalable.c can refer to the same >>> thread-specific >>> data as odp_schedule_scalable.c so we don't need this internal >>> interface. >>> We could perhaps adapt the code to use this interface but I think this >>> interface is just an artefact of the implementation of the default >>> queues/scheduler. >> >> The problem is that odp_pktout_queue_config() sets qentry->s.enqueue >> to pktout_enqueue() and that does not have any of the scalable scheduler >> specific magic that odp_queue_scalable.c:queue_enq{_multi}() has. So >> ordering does not happen for pktout queues even if it works for other >> queues, right? > This must be a recent change, it doesn’t look like that in the working > branch we are using. > I see the code when changing to the master branch. > The code in pktout_enqueue() does look like a hack: >if (sched_fn->ord_enq_multi(qentry->s.index, (void **)buf_hdr, > len, )) > A cast to “void **”??? > > What’s the purpose of calling ord_enq_multi() here? To save (stash) > packets if the thread is out-of-order?
> And when the thread is in-order, it is re-enqueueing the packets which > again will invoke pktout_enqueue/pktout_enq_multi but this time > ord_enq_multi() will not save the packets, instead they will actually be > transmitted by odp_pktout_send()? > Since transmitting packets may fail, out-of-order packets cannot be stashed here. With the current scheduler implementation sched_fn->ord_enq_multi() waits until in-order and always returns 0 (in case of pktout queue). After this odp_pktout_send() is called. -Matias
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On 31/05/2017, 10:38, "Peltonen, Janne (Nokia - FI/Espoo)" wrote: > > >Ola Liljedahl wrote: >> On 23/05/2017, 16:49, "Peltonen, Janne (Nokia - FI/Espoo)" >> wrote: >> >> >> > >> >> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[], >> >> + int num, int *ret) >> >> +{ >> >> + (void)queue_index; >> >> + (void)p_buf_hdr; >> >> + (void)num; >> >> + (void)ret; >> >> + return 0; >> >> +} >> > >> >How is packet order maintained when enqueuing packets read from an >>ordered >> >queue to a pktout queue? Matias' recent fix uses the ord_enq_multi >> >scheduler >> >function for that, but this version does not do any ordering. Or is the >> >ordering guaranteed by some other means? >> The scalable scheduler standard queue enqueue function also handles >> ordered queues. odp_queue_scalable.c can refer to the same >>thread-specific >> data as odp_schedule_scalable.c so we don't need this internal >>interface. >> We could perhaps adapt the code to use this interface but I think this >> interface is just an artefact of the implementation of the default >> queues/scheduler. > >The problem is that odp_pktout_queue_config() sets qentry->s.enqueue >to pktout_enqueue() and that does not have any of the scalable scheduler >specific magic that odp_queue_scalable.c:queue_enq{_multi}() has. So >ordering does not happen for pktout queues even if it works for other >queues, right? This must be a recent change, it doesn’t look like that in the working branch we are using. I see the code when changing to the master branch. The code in pktout_enqueue() does look like a hack: if (sched_fn->ord_enq_multi(qentry->s.index, (void **)buf_hdr, len, )) A cast to “void **”??? What’s the purpose of calling ord_enq_multi() here? To save (stash) packets if the thread is out-of-order?
And when the thread is in-order, it is re-enqueueing the packets which again will invoke pktout_enqueue/pktout_enq_multi but this time ord_enq_multi() will not save the packets, instead they will actually be transmitted by odp_pktout_send()? > > Janne > >> >> > >> >> +static void order_lock(void) >> >> +{ >> >> +} >> >> + >> >> +static void order_unlock(void) >> >> +{ >> >> +} >> > >> >Is it ok that these are no-ops? tm_enqueue() seems to use these. >> No these ought to be implemented. We have fixed that now. Thanks. >> >> >> -- Ola >> >> Ola Liljedahl, Networking System Architect, ARM >> Phone: +46 706 866 373 Skype: ola.liljedahl >> >> >> >> >> > >> >> + >> >> +const schedule_fn_t schedule_scalable_fn = { >> >> + .pktio_start= pktio_start, >> >> + .thr_add= thr_add, >> >> + .thr_rem= thr_rem, >> >> + .num_grps = num_grps, >> >> + .init_queue = init_queue, >> >> + .destroy_queue = destroy_queue, >> >> + .sched_queue= sched_queue, >> >> + .ord_enq_multi = ord_enq_multi, >> >> + .init_global= schedule_init_global, >> >> + .term_global= schedule_term_global, >> >> + .init_local = schedule_init_local, >> >> + .term_local = schedule_term_local, >> >> + .order_lock = order_lock, >> >> + .order_unlock = order_unlock, >> >> +}; >> > >> >Janne >> > >> > >
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
Ola Liljedahl wrote: > On 23/05/2017, 16:49, "Peltonen, Janne (Nokia - FI/Espoo)" >wrote: > > > > > >> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[], > >> + int num, int *ret) > >> +{ > >> + (void)queue_index; > >> + (void)p_buf_hdr; > >> + (void)num; > >> + (void)ret; > >> + return 0; > >> +} > > > >How is packet order maintained when enqueuing packets read from an ordered > >queue to a pktout queue? Matias' recent fix uses the ord_enq_multi > >scheduler > >function for that, but this version does not do any ordering. Or is the > >ordering guaranteed by some other means? > The scalable scheduler standard queue enqueue function also handles > ordered queues. odp_queue_scalable.c can refer to the same thread-specific > data as odp_schedule_scalable.c so we don't need this internal interface. > We could perhaps adapt the code to use this interface but I think this > interface is just an artefact of the implementation of the default > queues/scheduler. The problem is that odp_pktout_queue_config() sets qentry->s.enqueue to pktout_enqueue() and that does not have any of the scalable scheduler specific magic that odp_queue_scalable.c:queue_enq{_multi}() has. So ordering does not happen for pktout queues even if it works for other queues, right? Janne > > > > >> +static void order_lock(void) > >> +{ > >> +} > >> + > >> +static void order_unlock(void) > >> +{ > >> +} > > > >Is it ok that these are no-ops? tm_enqueue() seems to use these. > No these ought to be implemented. We have fixed that now. Thanks.
> > > -- Ola > > Ola Liljedahl, Networking System Architect, ARM > Phone: +46 706 866 373 Skype: ola.liljedahl > > > > > > > >> + > >> +const schedule_fn_t schedule_scalable_fn = { > >> + .pktio_start= pktio_start, > >> + .thr_add= thr_add, > >> + .thr_rem= thr_rem, > >> + .num_grps = num_grps, > >> + .init_queue = init_queue, > >> + .destroy_queue = destroy_queue, > >> + .sched_queue= sched_queue, > >> + .ord_enq_multi = ord_enq_multi, > >> + .init_global= schedule_init_global, > >> + .term_global= schedule_term_global, > >> + .init_local = schedule_init_local, > >> + .term_local = schedule_term_local, > >> + .order_lock = order_lock, > >> + .order_unlock = order_unlock, > >> +}; > > > > Janne > > > >
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On 23/05/2017, 16:49, "Peltonen, Janne (Nokia - FI/Espoo)" wrote: > >> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[], >> + int num, int *ret) >> +{ >> +(void)queue_index; >> +(void)p_buf_hdr; >> +(void)num; >> +(void)ret; >> +return 0; >> +} > >How is packet order maintained when enqueuing packets read from an ordered >queue to a pktout queue? Matias' recent fix uses the ord_enq_multi >scheduler >function for that, but this version does not do any ordering. Or is the >ordering guaranteed by some other means? The scalable scheduler standard queue enqueue function also handles ordered queues. odp_queue_scalable.c can refer to the same thread-specific data as odp_schedule_scalable.c so we don't need this internal interface. We could perhaps adapt the code to use this interface but I think this interface is just an artefact of the implementation of the default queues/scheduler. > >> +static void order_lock(void) >> +{ >> +} >> + >> +static void order_unlock(void) >> +{ >> +} > >Is it ok that these are no-ops? tm_enqueue() seems to use these. No, these ought to be implemented. We have fixed that now. Thanks. -- Ola Ola Liljedahl, Networking System Architect, ARM Phone: +46 706 866 373 Skype: ola.liljedahl > >> + >> +const schedule_fn_t schedule_scalable_fn = { >> +.pktio_start= pktio_start, >> +.thr_add= thr_add, >> +.thr_rem= thr_rem, >> +.num_grps = num_grps, >> +.init_queue = init_queue, >> +.destroy_queue = destroy_queue, >> +.sched_queue= sched_queue, >> +.ord_enq_multi = ord_enq_multi, >> +.init_global= schedule_init_global, >> +.term_global= schedule_term_global, >> +.init_local = schedule_init_local, >> +.term_local = schedule_term_local, >> +.order_lock = order_lock, >> +.order_unlock = order_unlock, >> +}; > > Janne > >
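For readers following the interface debate: schedule_fn_t is the usual C pattern of a const table of function pointers, filled in per scheduler implementation and dispatched through a single pointer chosen at init time. A minimal self-contained sketch of the pattern — `sched_ops_t` and the stub names are invented here; the real ODP signatures and members differ:

```c
#include <stddef.h>

/* Ops-table pattern: each scheduler implementation provides a const
 * struct of function pointers; the core calls through sched_fn.
 * All names below are illustrative, not the ODP definitions. */
typedef struct {
    int  (*ord_enq_multi)(unsigned queue_index, void *hdr[], int num, int *ret);
    void (*order_lock)(void);
    void (*order_unlock)(void);
} sched_ops_t;

static int stub_ord_enq_multi(unsigned qi, void *hdr[], int num, int *ret)
{
    (void)qi; (void)hdr; (void)num; (void)ret;
    return 0;   /* 0 = "not stashed here, caller proceeds to send" */
}

static void stub_order_lock(void) {}
static void stub_order_unlock(void) {}

static const sched_ops_t scalable_ops = {
    .ord_enq_multi = stub_ord_enq_multi,
    .order_lock    = stub_order_lock,
    .order_unlock  = stub_order_unlock,
};

/* Selected once at init; every caller then dispatches indirectly. */
static const sched_ops_t *sched_fn = &scalable_ops;
```

The thread's complaint is visible in miniature: if `ord_enq_multi`, `order_lock`, and `order_unlock` are no-op stubs, any caller (such as pktout enqueue or tm_enqueue) that relies on them for ordering silently gets no ordering at all.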
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
Honnappa Nagarahalli wrote: > On 23 May 2017 at 09:49, Peltonen, Janne (Nokia - FI/Espoo) >wrote: > > > >> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[], > >> + int num, int *ret) > >> +{ > >> + (void)queue_index; > >> + (void)p_buf_hdr; > >> + (void)num; > >> + (void)ret; > >> + return 0; > >> +} > > > > How is packet order maintained when enqueuing packets read from an ordered > > queue to a pktout queue? Matias' recent fix uses the ord_enq_multi scheduler > > function for that, but this version does not do any ordering. Or is the > > ordering guaranteed by some other means? > > > > We have been running TM related test cases and they are working fine. > So, I would assume we do not need to do anything. The packet ordering question above is not specific to TM but is relevant also when TM is not used. And a test case may not reveal ordering problems unless you are specifically looking for it and somehow making sure that the right order does not happen by accident. Here is a simple use case for packet ordering at output: the application gets incoming packets from a single pktin through a single ordered, scheduled queue. The packets get spread to multiple threads by the scheduler. The threads output the packets to the same pktout queue used in queued mode. The output packets on the wire should be in the same order as they were received, even if the worker threads processed the packets at very different speeds. > > What is the use case from TM? Does TM use ODP queues? I am not familiar with the TM, but it looks like the scheduler may want to ensure correct packet order using the order_lock function and therefore that function being a no-op in the scalable scheduler looked alarming to me. Janne > > >> +static void order_lock(void) > >> +{ > >> +} > >> + > >> +static void order_unlock(void) > >> +{ > >> +} > > > > Is it ok that these are no-ops? tm_enqueue() seems to use these.
> > > >> + > >> +const schedule_fn_t schedule_scalable_fn = { > >> + .pktio_start= pktio_start, > >> + .thr_add= thr_add, > >> + .thr_rem= thr_rem, > >> + .num_grps = num_grps, > >> + .init_queue = init_queue, > >> + .destroy_queue = destroy_queue, > >> + .sched_queue= sched_queue, > >> + .ord_enq_multi = ord_enq_multi, > >> + .init_global= schedule_init_global, > >> + .term_global= schedule_term_global, > >> + .init_local = schedule_init_local, > >> + .term_local = schedule_term_local, > >> + .order_lock = order_lock, > >> + .order_unlock = order_unlock, > >> +}; > > > > Janne > > > >
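Janne's use case — many workers finishing packets at different speeds while the wire order must match ingress order — is the classic reorder-buffer problem. Below is a single-threaded sketch of just the release logic; `reorder_t` and `reorder_complete` are invented names, and a real scheduler additionally needs atomics and a sliding window rather than a fixed array:

```c
#include <stdbool.h>

/* Reorder stage: workers may complete packets in any order, but the
 * egress releases them strictly in original sequence order.
 * Single-threaded simulation; not the ODP implementation. */
#define WINDOW 8

typedef struct {
    bool done[WINDOW];   /* completion flags indexed by sequence number */
    int  head;           /* next sequence number allowed to egress */
} reorder_t;

/* Mark seq complete; return how many packets can now be released in
 * order (0 if an earlier packet is still outstanding). */
static int reorder_complete(reorder_t *r, int seq)
{
    int released = 0;

    r->done[seq] = true;
    while (r->head < WINDOW && r->done[r->head]) {
        r->head++;
        released++;
    }
    return released;
}
```

A thread whose sequence number is not yet at `head` is exactly the "out-of-order" thread discussed above: it must either stash its packets or wait, because releasing them immediately would reorder the wire.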
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
On 23 May 2017 at 09:49, Peltonen, Janne (Nokia - FI/Espoo) wrote: > >> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[], >> + int num, int *ret) >> +{ >> + (void)queue_index; >> + (void)p_buf_hdr; >> + (void)num; >> + (void)ret; >> + return 0; >> +} > > How is packet order maintained when enqueuing packets read from an ordered > queue to a pktout queue? Matias' recent fix uses the ord_enq_multi scheduler > function for that, but this version does not do any ordering. Or is the > ordering guaranteed by some other means? > We have been running TM related test cases and they are working fine. So, I would assume we do not need to do anything. What is the use case from TM? Does TM use ODP queues? >> +static void order_lock(void) >> +{ >> +} >> + >> +static void order_unlock(void) >> +{ >> +} > > Is it ok that these are no-ops? tm_enqueue() seems to use these. > >> + >> +const schedule_fn_t schedule_scalable_fn = { >> + .pktio_start= pktio_start, >> + .thr_add= thr_add, >> + .thr_rem= thr_rem, >> + .num_grps = num_grps, >> + .init_queue = init_queue, >> + .destroy_queue = destroy_queue, >> + .sched_queue= sched_queue, >> + .ord_enq_multi = ord_enq_multi, >> + .init_global= schedule_init_global, >> + .term_global= schedule_term_global, >> + .init_local = schedule_init_local, >> + .term_local = schedule_term_local, >> + .order_lock = order_lock, >> + .order_unlock = order_unlock, >> +}; > > Janne > >
Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
> diff --git a/platform/linux-generic/include/odp_queue_internal.h > b/platform/linux-generic/include/odp_queue_internal.h > index 560f826e..cbe065dc 100644 > --- a/platform/linux-generic/include/odp_queue_internal.h > +++ b/platform/linux-generic/include/odp_queue_internal.h > @@ -19,16 +19,24 @@ extern "C" { > #endif > > #include > -#include > -#include > -#include > -#include > +#include > +#include > #include > #include > #include > #include > + > #include > > +#include > +#include > +#include > +#include > +#ifdef ODP_SCHEDULE_SCALABLE > +#include > +#include > +#endif > + > #define QUEUE_MULTI_MAX CONFIG_BURST_SIZE > > #define QUEUE_STATUS_FREE 0 > @@ -37,8 +45,6 @@ extern "C" { > #define QUEUE_STATUS_NOTSCHED 3 > #define QUEUE_STATUS_SCHED4 > > - > -/* forward declaration */ > union queue_entry_u; > > typedef int (*enq_func_t)(union queue_entry_u *, odp_buffer_hdr_t *); > @@ -49,6 +55,47 @@ typedef int (*enq_multi_func_t)(union queue_entry_u *, > typedef int (*deq_multi_func_t)(union queue_entry_u *, > odp_buffer_hdr_t **, int); > > +#ifdef ODP_SCHEDULE_SCALABLE The queue interface still needs to be improved, so that there's no conditional compilation like this piece of code here.
> + > +struct queue_entry_s { > + sched_elem_t sched_elem; > + > + odp_ticketlock_t lock ODP_ALIGNED_CACHE; > + int status; > + > + enq_func_t enqueue ODP_ALIGNED_CACHE; > + deq_func_t dequeue; > + enq_multi_func_t enqueue_multi; > + deq_multi_func_t dequeue_multi; > + > + uint32_t index; > + odp_queue_thandle; > + odp_queue_type_t type; > + odp_queue_param_t param; > + odp_pktin_queue_t pktin; > + odp_pktout_queue_t pktout; > + char name[ODP_QUEUE_NAME_LEN]; > +}; > + > +int _odp_queue_deq(sched_elem_t *q, odp_buffer_hdr_t *buf_hdr[], int > num); > +int _odp_queue_deq_sc(sched_elem_t *q, odp_event_t *evp, int num); > +int _odp_queue_deq_mc(sched_elem_t *q, odp_event_t *evp, int num); > + > +/* Round up memory size to next cache line size to > + * align all memory addresses on cache line boundary. > + */ > +static inline void *shm_pool_alloc_align(_odp_ishm_pool_t *pool, uint32_t > size) > +{ > + void *addr; > + > + addr = _odp_ishm_pool_alloc(pool, ROUNDUP_CACHE_LINE(size)); > + ODP_ASSERT(((uintptr_t)addr & (ODP_CACHE_LINE_SIZE - 1)) == 0); > + > + return addr; > +} > + > +#else > diff --git a/platform/linux-generic/odp_packet_io.c b/platform/linux- > generic/odp_packet_io.c > index d8cae15c..717e7e33 100644 > --- a/platform/linux-generic/odp_packet_io.c > +++ b/platform/linux-generic/odp_packet_io.c > @@ -5,23 +5,25 @@ > */ > #include > > -#include > -#include > -#include > #include > -#include > -#include > +#include > #include > #include > #include > -#include > +#include > + > +#include > #include > -#include > -#include > -#include > #include > + > +#include > +#include > #include > -#include > +#include > +#include > +#include > +#include > +#include > > #include > #include > @@ -472,7 +474,6 @@ int odp_pktio_start(odp_pktio_t hdl) > return -1; > } > } > - > sched_fn->pktio_start(pktio_to_id(hdl), num, index); > } > > @@ -554,7 +555,6 @@ static inline int pktin_recv_buf(odp_pktin_queue_t > queue, > odp_packet_t packets[num]; > odp_packet_hdr_t *pkt_hdr; > 
odp_buffer_hdr_t *buf_hdr; > - odp_buffer_t buf; > int i; > int pkts; > int num_rx = 0; > @@ -564,9 +564,7 @@ static inline int pktin_recv_buf(odp_pktin_queue_t > queue, > for (i = 0; i < pkts; i++) { > pkt = packets[i]; > pkt_hdr = odp_packet_hdr(pkt); > - buf = _odp_packet_to_buffer(pkt); > - buf_hdr = buf_hdl_to_hdr(buf); > - > + buf_hdr = > buf_hdl_to_hdr(_odp_packet_to_buffer(pkt)); Do not include unnecessary style changes like this one (and the one above) into this patch. These are just cosmetic and obscure the functional change you are trying to achieve. These waste reviewer time to figure out why this change was needed - just to realize that it was not actually needed. Also e.g. in these cases, I like the original style better. > if (pkt_hdr->p.input_flags.dst_queue) { > queue_entry_t *dst_queue; > int ret; > @@ -584,11 +582,12 @@ static inline int pktin_recv_buf(odp_pktin_queue_t > queue, > > int pktout_enqueue(queue_entry_t *qentry, odp_buffer_hdr_t *buf_hdr) > { > - odp_packet_t pkt = _odp_packet_from_buffer(buf_hdr- > >handle.handle); > + odp_packet_t pkt; > int len = 1; > int nbr; > > - nbr = odp_pktout_send(qentry->s.pktout, , len); > + pkt =
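The shm_pool_alloc_align() helper quoted in this patch relies on rounding every allocation size up to a cache-line multiple, so that each allocation from the pool starts on a line boundary and the next one does too. A sketch of the round-up arithmetic, assuming a 64-byte line (the real code takes the size from ODP_CACHE_LINE_SIZE):

```c
#include <stdint.h>

/* Round x up to the next multiple of the cache line size. Works because
 * the line size is a power of two: adding (line - 1) and then masking
 * off the low bits clears any remainder. 64 bytes is an assumption
 * here; ODP uses ODP_CACHE_LINE_SIZE. */
#define CACHE_LINE 64u
#define ROUNDUP_CACHE_LINE(x) \
    (((uint64_t)(x) + CACHE_LINE - 1) & ~(uint64_t)(CACHE_LINE - 1))
```

With sizes rounded this way, a simple bump-pointer allocator that starts aligned stays aligned after every allocation, which is what the `ODP_ASSERT` on the returned address in shm_pool_alloc_align() checks.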
[lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
Signed-off-by: Brian Brooks
Signed-off-by: Kevin Wang
Signed-off-by: Honnappa Nagarahalli
Signed-off-by: Ola Liljedahl
---
 platform/linux-generic/Makefile.am                 |   15 +-
 .../include/odp/api/plat/schedule_types.h          |    4 +-
 .../linux-generic/include/odp_config_internal.h    |   17 +-
 .../linux-generic/include/odp_queue_internal.h     |  132 +-
 platform/linux-generic/include/odp_schedule_if.h   |   23 +-
 .../linux-generic/include/odp_schedule_scalable.h  |  137 ++
 .../include/odp_schedule_scalable_config.h         |   42 +
 .../include/odp_schedule_scalable_ordered.h        |  131 ++
 platform/linux-generic/m4/odp_schedule.m4          |   55 +-
 platform/linux-generic/odp_classification.c        |    4 +-
 platform/linux-generic/odp_packet_io.c             |   71 +-
 platform/linux-generic/odp_queue.c                 |    2 +-
 platform/linux-generic/odp_queue_scalable.c        |  946 ++
 platform/linux-generic/odp_schedule_if.c           |   36 +-
 platform/linux-generic/odp_schedule_scalable.c     | 1942
 .../linux-generic/odp_schedule_scalable_ordered.c  |  280 +++
 platform/linux-generic/odp_traffic_mngr.c          |    7 +-
 17 files changed, 3756 insertions(+), 88 deletions(-)
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable_config.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable_ordered.h
 create mode 100644 platform/linux-generic/odp_queue_scalable.c
 create mode 100644 platform/linux-generic/odp_schedule_scalable.c
 create mode 100644 platform/linux-generic/odp_schedule_scalable_ordered.c

diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index 7ad63d59..5290993a 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -173,6 +173,9 @@ noinst_HEADERS = \
 		  ${srcdir}/include/odp_queue_internal.h \
 		  ${srcdir}/include/odp_ring_internal.h \
 		  ${srcdir}/include/odp_schedule_if.h \
+		  ${srcdir}/include/odp_schedule_scalable.h \
+		  ${srcdir}/include/odp_schedule_scalable_config.h \
+		  ${srcdir}/include/odp_schedule_scalable_ordered.h \
 		  ${srcdir}/include/odp_sorted_list_internal.h \
 		  ${srcdir}/include/odp_shm_internal.h \
 		  ${srcdir}/include/odp_time_internal.h \
@@ -227,13 +230,9 @@ __LIB__libodp_linux_la_SOURCES = \
 			   pktio/ring.c \
 			   odp_pkt_queue.c \
 			   odp_pool.c \
-			   odp_queue.c \
 			   odp_rwlock.c \
 			   odp_rwlock_recursive.c \
-			   odp_schedule.c \
 			   odp_schedule_if.c \
-			   odp_schedule_sp.c \
-			   odp_schedule_iquery.c \
 			   odp_shared_memory.c \
 			   odp_sorted_list.c \
 			   odp_spinlock.c \
@@ -262,6 +261,14 @@ if ARCH_IS_X86
 __LIB__libodp_linux_la_SOURCES += arch/@ARCH_DIR@/cpu_flags.c
 endif

+if ODP_SCHEDULE_SCALABLE
+__LIB__libodp_linux_la_SOURCES += odp_queue_scalable.c odp_schedule_scalable.c \
+				  odp_schedule_scalable_ordered.c
+else
+__LIB__libodp_linux_la_SOURCES += odp_queue.c odp_schedule.c odp_schedule_sp.c \
+				  odp_schedule_iquery.c
+endif
+
 if HAVE_PCAP
 __LIB__libodp_linux_la_SOURCES += pktio/pcap.c
 endif
diff --git a/platform/linux-generic/include/odp/api/plat/schedule_types.h b/platform/linux-generic/include/odp/api/plat/schedule_types.h
index 535fd6d0..4e75f9ee 100644
--- a/platform/linux-generic/include/odp/api/plat/schedule_types.h
+++ b/platform/linux-generic/include/odp/api/plat/schedule_types.h
@@ -18,6 +18,8 @@
 extern "C" {
 #endif

+#include
+
 /** @addtogroup odp_scheduler
  *  @{
  */
@@ -44,7 +46,7 @@ typedef int odp_schedule_sync_t;
 typedef int odp_schedule_group_t;

 /* These must be kept in sync with thread_globals_t in odp_thread.c */
-#define ODP_SCHED_GROUP_INVALID -1
+#define ODP_SCHED_GROUP_INVALID ((odp_schedule_group_t)-1)
 #define ODP_SCHED_GROUP_ALL     0
 #define ODP_SCHED_GROUP_WORKER  1
 #define ODP_SCHED_GROUP_CONTROL 2
diff --git a/platform/linux-generic/include/odp_config_internal.h b/platform/linux-generic/include/odp_config_internal.h
index dadd59e7..6cc844f3 100644
--- a/platform/linux-generic/include/odp_config_internal.h
+++ b/platform/linux-generic/include/odp_config_internal.h
@@ -7,9 +7,7 @@
 #ifndef ODP_CONFIG_INTERNAL_H_
 #define ODP_CONFIG_INTERNAL_H_

-#ifdef __cplusplus
-extern "C" {
-#endif
+#include

 /*
  * Maximum number of pools
@@ -22,6 +20,13 @@ extern "C" {
 #define