Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-06-02 Thread Honnappa Nagarahalli
On 2 June 2017 at 05:17, Savolainen, Petri (Nokia - FI/Espoo)
<petri.savolai...@nokia.com> wrote:
>
>
>> -Original Message-
>> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
>> Honnappa Nagarahalli
>> Sent: Thursday, June 01, 2017 11:30 PM
>> To: Ola Liljedahl <ola.liljed...@arm.com>
>> Cc: lng-odp@lists.linaro.org; Honnappa Nagarahalli
>> <honnappa.nagaraha...@arm.com>; Elo, Matias (Nokia - FI/Espoo)
>> <matias@nokia.com>; Kevin Wang <kevin.w...@arm.com>; nd <n...@arm.com>
>> Subject: Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
>>
>> On 1 June 2017 at 15:20, Ola Liljedahl <ola.liljed...@arm.com> wrote:
>> >
>> >
>> >
>> >
>> > On 01/06/2017, 22:15, "Honnappa Nagarahalli"
>> > <honnappa.nagaraha...@linaro.org> wrote:
>> >
>> >>On 1 June 2017 at 15:09, Ola Liljedahl <ola.liljed...@arm.com> wrote:
>> >>>
>> >>>
>> >>> On 01/06/2017, 21:03, "Bill Fischofer" <bill.fischo...@linaro.org>
>> >>>wrote:
>> >>>
>> >>>>On Thu, Jun 1, 2017 at 10:59 AM, Honnappa Nagarahalli
>> >>>><honnappa.nagaraha...@linaro.org> wrote:
>> >>>>> On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo)
>> >>>>> <matias@nokia.com> wrote:
>> >>>>>>
>> >>>>>>> On 31 May 2017, at 23:53, Bill Fischofer
>> <bill.fischo...@linaro.org>
>> >>>>>>>wrote:
>> >>>>>>>
>> >>>>>>> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
>> >>>>>>> <matias@nokia.com> wrote:
>> >>>>>>>>
>> >>>>>>>>>>> What's the purpose of calling ord_enq_multi() here? To save
>> >>>>>>>>>>>(stash)
>> >>>>>>>>>>> packets if the thread is out-of-order?
>> >>>>>>>>>>> And when the thread is in-order, it is re-enqueueing the
>> packets
>> >>>>>>>>>>>which
>> >>>>>>>>>>> again will invoke pktout_enqueue/pktout_enq_multi but this
>> time
>> >>>>>>>>>>> ord_enq_multi() will not save the packets, instead they will
>> >>>>>>>>>>>actually be
>> >>>>>>>>>>> transmitted by odp_pktout_send()?
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> Since transmitting packets may fail, out-of-order packets
>> cannot
>> >>>>>>>>>>be
>> >>>>>>>>>> stashed here.
>> >>>>>>>>> You mean that the TX queue of the pktio might be full so not all
>> >>>>>>>>>packets
>> >>>>>>>>> will actually be enqueued for transmission.
>> >>>>>>>>
>> >>>>>>>> Yep.
>> >>>>>>>>
>> >>>>>>>>> This is an interesting case but is it a must to know how many
>> >>>>>>>>>packets are
>> >>>>>>>>> actually accepted? Packets can always be dropped without notice,
>> >>>>>>>>>the
>> >>>>>>>>> question is from which point this is acceptable. If packets
>> >>>>>>>>>enqueued onto
>> >>>>>>>>> a pktout (egress) queue are accepted, this means that they must
>> >>>>>>>>>also be
>> >>>>>>>>> put onto the driver TX queue (as done by odp_pktout_send)?
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>> Currently, the packet_io/queue APIs don't say anything about
>> >>>>>>>>packets
>> >>>>>>>>being
>> >>>>>>>> possibly dropped after successfully calling odp_queue_enq() to a
>> >>>>>>>>pktout
>> >>>>>>>> event queue. So to be consistent with standard odp_queue_enq()
>> >>>>>>>>operations I
>> >>>>>>>> think it is better to return the number of events actually accepted to the TX queue.

Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-06-02 Thread Ola Liljedahl

On 02/06/2017, 12:53, "Peltonen, Janne (Nokia - FI/Espoo)"
 wrote:

>>> for packet output to first tell the application that a packet was "accepted
>>>for transmission" and then drop it silently. Packet out (it's a simple
>>>function) should be able to determine if a packet can be accepted for
>>>transmission, and if it's accepted the packet will eventually go out.
>>Obviously, packet out is not so simple to implement when considering
>>order
>>restoration etc. The original linux-generic implementation was wrong.
>
>Ordering in the Linux-generic implementation was accidentally broken
>in January when the new ordered queue implementation was added. I suppose
>it worked before that.
OK so remove the word "original" from my statement above.



Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-06-02 Thread Ola Liljedahl


On 02/06/2017, 12:17, "Savolainen, Petri (Nokia - FI/Espoo)"
<petri.savolai...@nokia.com> wrote:

>
>
>> -Original Message-
>> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
>> Honnappa Nagarahalli
>> Sent: Thursday, June 01, 2017 11:30 PM
>> To: Ola Liljedahl <ola.liljed...@arm.com>
>> Cc: lng-odp@lists.linaro.org; Honnappa Nagarahalli
>> <honnappa.nagaraha...@arm.com>; Elo, Matias (Nokia - FI/Espoo)
>> <matias....@nokia.com>; Kevin Wang <kevin.w...@arm.com>; nd <n...@arm.com>
>> Subject: Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
>> 
>> On 1 June 2017 at 15:20, Ola Liljedahl <ola.liljed...@arm.com> wrote:
>> >
>> >
>> >
>> >
>> > On 01/06/2017, 22:15, "Honnappa Nagarahalli"
>> > <honnappa.nagaraha...@linaro.org> wrote:
>> >
>> >>On 1 June 2017 at 15:09, Ola Liljedahl <ola.liljed...@arm.com> wrote:
>> >>>
>> >>>
>> >>> On 01/06/2017, 21:03, "Bill Fischofer" <bill.fischo...@linaro.org>
>> >>>wrote:
>> >>>
>> >>>>On Thu, Jun 1, 2017 at 10:59 AM, Honnappa Nagarahalli
>> >>>><honnappa.nagaraha...@linaro.org> wrote:
>> >>>>> On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo)
>> >>>>> <matias@nokia.com> wrote:
>> >>>>>>
>> >>>>>>> On 31 May 2017, at 23:53, Bill Fischofer
>> <bill.fischo...@linaro.org>
>> >>>>>>>wrote:
>> >>>>>>>
>> >>>>>>> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
>> >>>>>>> <matias@nokia.com> wrote:
>> >>>>>>>>
>> >>>>>>>>>>> What's the purpose of calling ord_enq_multi() here? To save
>> >>>>>>>>>>>(stash)
>> >>>>>>>>>>> packets if the thread is out-of-order?
>> >>>>>>>>>>> And when the thread is in-order, it is re-enqueueing the
>> packets
>> >>>>>>>>>>>which
>> >>>>>>>>>>> again will invoke pktout_enqueue/pktout_enq_multi but this
>> time
>> >>>>>>>>>>> ord_enq_multi() will not save the packets, instead they will
>> >>>>>>>>>>>actually be
>> >>>>>>>>>>> transmitted by odp_pktout_send()?
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> Since transmitting packets may fail, out-of-order packets
>> cannot
>> >>>>>>>>>>be
>> >>>>>>>>>> stashed here.
>> >>>>>>>>> You mean that the TX queue of the pktio might be full so not
>>all
>> >>>>>>>>>packets
>> >>>>>>>>> will actually be enqueued for transmission.
>> >>>>>>>>
>> >>>>>>>> Yep.
>> >>>>>>>>
>> >>>>>>>>> This is an interesting case but is it a must to know how many
>> >>>>>>>>>packets are
>> >>>>>>>>> actually accepted? Packets can always be dropped without
>>notice,
>> >>>>>>>>>the
>> >>>>>>>>> question is from which point this is acceptable. If packets
>> >>>>>>>>>enqueued onto
>> >>>>>>>>> a pktout (egress) queue are accepted, this means that they
>>must
>> >>>>>>>>>also be
>> >>>>>>>>> put onto the driver TX queue (as done by odp_pktout_send)?
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>> Currently, the packet_io/queue APIs don't say anything about
>> >>>>>>>>packets
>> >>>>>>>>being
>> >>>>>>>> possibly dropped after successfully calling odp_queue_enq() to
>>a
>> >>>>>>>>pktout
>> >>>>>>>> event queue. So to be consistent with standard odp_queue_enq()
>> >>>>>>>>operations I think it is better to return the number of events actually accepted to the TX queue.

Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-06-02 Thread Savolainen, Petri (Nokia - FI/Espoo)


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
> Honnappa Nagarahalli
> Sent: Thursday, June 01, 2017 11:30 PM
> To: Ola Liljedahl <ola.liljed...@arm.com>
> Cc: lng-odp@lists.linaro.org; Honnappa Nagarahalli
> <honnappa.nagaraha...@arm.com>; Elo, Matias (Nokia - FI/Espoo)
> <matias@nokia.com>; Kevin Wang <kevin.w...@arm.com>; nd <n...@arm.com>
> Subject: Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
> 
> On 1 June 2017 at 15:20, Ola Liljedahl <ola.liljed...@arm.com> wrote:
> >
> >
> >
> >
> > On 01/06/2017, 22:15, "Honnappa Nagarahalli"
> > <honnappa.nagaraha...@linaro.org> wrote:
> >
> >>On 1 June 2017 at 15:09, Ola Liljedahl <ola.liljed...@arm.com> wrote:
> >>>
> >>>
> >>> On 01/06/2017, 21:03, "Bill Fischofer" <bill.fischo...@linaro.org>
> >>>wrote:
> >>>
> >>>>On Thu, Jun 1, 2017 at 10:59 AM, Honnappa Nagarahalli
> >>>><honnappa.nagaraha...@linaro.org> wrote:
> >>>>> On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo)
> >>>>> <matias@nokia.com> wrote:
> >>>>>>
> >>>>>>> On 31 May 2017, at 23:53, Bill Fischofer
> <bill.fischo...@linaro.org>
> >>>>>>>wrote:
> >>>>>>>
> >>>>>>> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
> >>>>>>> <matias@nokia.com> wrote:
> >>>>>>>>
> >>>>>>>>>>> What's the purpose of calling ord_enq_multi() here? To save
> >>>>>>>>>>>(stash)
> >>>>>>>>>>> packets if the thread is out-of-order?
> >>>>>>>>>>> And when the thread is in-order, it is re-enqueueing the
> packets
> >>>>>>>>>>>which
> >>>>>>>>>>> again will invoke pktout_enqueue/pktout_enq_multi but this
> time
> >>>>>>>>>>> ord_enq_multi() will not save the packets, instead they will
> >>>>>>>>>>>actually be
> >>>>>>>>>>> transmitted by odp_pktout_send()?
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Since transmitting packets may fail, out-of-order packets
> cannot
> >>>>>>>>>>be
> >>>>>>>>>> stashed here.
> >>>>>>>>> You mean that the TX queue of the pktio might be full so not all
> >>>>>>>>>packets
> >>>>>>>>> will actually be enqueued for transmission.
> >>>>>>>>
> >>>>>>>> Yep.
> >>>>>>>>
> >>>>>>>>> This is an interesting case but is it a must to know how many
> >>>>>>>>>packets are
> >>>>>>>>> actually accepted? Packets can always be dropped without notice,
> >>>>>>>>>the
> >>>>>>>>> question is from which point this is acceptable. If packets
> >>>>>>>>>enqueued onto
> >>>>>>>>> a pktout (egress) queue are accepted, this means that they must
> >>>>>>>>>also be
> >>>>>>>>> put onto the driver TX queue (as done by odp_pktout_send)?
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Currently, the packet_io/queue APIs don't say anything about
> >>>>>>>>packets
> >>>>>>>>being
> >>>>>>>> possibly dropped after successfully calling odp_queue_enq() to a
> >>>>>>>>pktout
> >>>>>>>> event queue. So to be consistent with standard odp_queue_enq()
> >>>>>>>>operations I
> >>>>>>>> think it is better to return the number of events actually
> accepted
> >>>>>>>>to the TX queue.
> >>>>>>>>
> >>>>>>>> To have more leeway one option would be to modify the API
> >>>>>>>>documentation to
> >>>>>>>> state that packets may still be dropped after a successful odp_queue_enq() call before reaching the NIC.

Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-06-01 Thread Ola Liljedahl




On 01/06/2017, 22:15, "Honnappa Nagarahalli"
 wrote:

>On 1 June 2017 at 15:09, Ola Liljedahl  wrote:
>>
>>
>> On 01/06/2017, 21:03, "Bill Fischofer" 
>>wrote:
>>
>>>On Thu, Jun 1, 2017 at 10:59 AM, Honnappa Nagarahalli
>>> wrote:
 On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo)
  wrote:
>
>> On 31 May 2017, at 23:53, Bill Fischofer 
>>wrote:
>>
>> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
>>  wrote:
>>>
>> What's the purpose of calling ord_enq_multi() here? To save
>>(stash)
>> packets if the thread is out-of-order?
>> And when the thread is in-order, it is re-enqueueing the packets
>>which
>> again will invoke pktout_enqueue/pktout_enq_multi but this time
>> ord_enq_multi() will not save the packets, instead they will
>>actually be
>> transmitted by odp_pktout_send()?
>>
>
> Since transmitting packets may fail, out-of-order packets cannot
>be
> stashed here.
 You mean that the TX queue of the pktio might be full so not all
packets
 will actually be enqueued for transmission.
>>>
>>> Yep.
>>>
 This is an interesting case but is it a must to know how many
packets are
 actually accepted? Packets can always be dropped without notice,
the
 question is from which point this is acceptable. If packets
enqueued onto
 a pktout (egress) queue are accepted, this means that they must
also be
 put onto the driver TX queue (as done by odp_pktout_send)?

>>>
>>> Currently, the packet_io/queue APIs don't say anything about
>>>packets
>>>being
>>> possibly dropped after successfully calling odp_queue_enq() to a
>>>pktout
>>> event queue. So to be consistent with standard odp_queue_enq()
>>>operations I
>>> think it is better to return the number of events actually accepted
>>>to the TX queue.
>>>
>>> To have more leeway one option would be to modify the API
>>>documentation to
>>> state that packets may still be dropped after a successful
>>>odp_queue_enq() call
>>> before reaching the NIC. If the application would like to be sure
>>>that the
>>> packets are actually sent, it should use odp_pktout_send() instead.
>>
>> Ordered queues simply say that packets will be delivered to the next
>> queue in the pipeline in the order they originated from their source
>> queue. What happens after that depends on the attributes of the
>>target
>> queue. If the target queue is an exit point from the application,
>>then
>> this is outside of ODP's scope.
>
> My point was that with stashing the application has no way of knowing
>if an
> ordered pktout enqueue call actually succeeded. In case of parallel and
>atomic
> queues it does. So my question is, is this acceptable?
>
 Also, currently, it is not possible for the application to have a
 consistent 'wait/drop on destination queue full' policy for all the
 queue types.
>>>
>>>Today applications have no way of knowing whether packets sent to a
>>>pktout_queue or tm_queue actually make it to the wire or whether they
>>>are vaporized as soon as they hit the wire, so there's no change here.
>>>An RC of 0 simply says that the packet was "accepted" for transmission
>>>and hence the caller no longer owns that packet handle. You need
>>>higher-level protocols to track end-to-end transmission and receipt.
>>>All that ordered queues say is that packets being sent to TX queues
>>>will have those TX calls made in the same order as the source queue
>>>they originated from.
>>>
>>>The only way to track packet disposition today is to (a) create a
>>>reference to the packet you want to transmit, (b) verify that
>>>odp_packet_has_ref(original_pkt) > 0, indicating that an actual
>>>reference was created, (c) transmit that reference, and (d) note when
>>>odp_packet_has_ref(original_pkt) returns to 0. That confirms that the
>>>reference has exited the scope of this ODP instance since a
>>>"successful" transmission will free that reference.
>> Doesn't this just confirm that the reference has been freed? But you
>>don't
>> know if this was due to the packet actually being transmitted on the
>>wire
>> or if it was dropped before that (which would also free the reference).
>>
>> Back to my original question, how far into the "machine" can we return
>> (to SW) absolute knowledge of the states of packets?
>>
>> With normal queues (including scheduled queues), a successful enqueue
>> guarantees that the packet (event) was actually enqueued. But pktio
>>egress
>> queues are not normal queues, they are essentially representations of a network interface's TX queues but also maintain the order restoration function of events enqueued to a queue when processing an ordered queue.

Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-06-01 Thread Honnappa Nagarahalli
On 1 June 2017 at 15:09, Ola Liljedahl  wrote:
>
>
> On 01/06/2017, 21:03, "Bill Fischofer"  wrote:
>
>>On Thu, Jun 1, 2017 at 10:59 AM, Honnappa Nagarahalli
>> wrote:
>>> On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo)
>>>  wrote:

> On 31 May 2017, at 23:53, Bill Fischofer 
>wrote:
>
> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
>  wrote:
>>
> What's the purpose of calling ord_enq_multi() here? To save
>(stash)
> packets if the thread is out-of-order?
> And when the thread is in-order, it is re-enqueueing the packets
>which
> again will invoke pktout_enqueue/pktout_enq_multi but this time
> ord_enq_multi() will not save the packets, instead they will
>actually be
> transmitted by odp_pktout_send()?
>

 Since transmitting packets may fail, out-of-order packets cannot be
 stashed here.
>>> You mean that the TX queue of the pktio might be full so not all
>>>packets
>>> will actually be enqueued for transmission.
>>
>> Yep.
>>
>>> This is an interesting case but is it a must to know how many
>>>packets are
>>> actually accepted? Packets can always be dropped without notice, the
>>> question is from which point this is acceptable. If packets
>>>enqueued onto
>>> a pktout (egress) queue are accepted, this means that they must
>>>also be
>>> put onto the driver TX queue (as done by odp_pktout_send)?
>>>
>>
>> Currently, the packet_io/queue APIs don't say anything about packets
>>being
>> possibly dropped after successfully calling odp_queue_enq() to a
>>pktout
>> event queue. So to be consistent with standard odp_queue_enq()
>>operations I
>> think it is better to return the number of events actually accepted
>>to the TX queue.
>>
>> To have more leeway one option would be to modify the API
>>documentation to
>> state that packets may still be dropped after a successful
>>odp_queue_enq() call
>> before reaching the NIC. If the application would like to be sure
>>that the
>> packets are actually sent, it should use odp_pktout_send() instead.
>
> Ordered queues simply say that packets will be delivered to the next
> queue in the pipeline in the order they originated from their source
> queue. What happens after that depends on the attributes of the target
> queue. If the target queue is an exit point from the application, then
> this is outside of ODP's scope.

 My point was that with stashing the application has no way of knowing
if an
 ordered pktout enqueue call actually succeeded. In case of parallel and
atomic
 queues it does. So my question is, is this acceptable?

>>> Also, currently, it is not possible for the application to have a
>>> consistent 'wait/drop on destination queue full' policy for all the
>>> queue types.
>>
>>Today applications have no way of knowing whether packets sent to a
>>pktout_queue or tm_queue actually make it to the wire or whether they
>>are vaporized as soon as they hit the wire, so there's no change here.
>>An RC of 0 simply says that the packet was "accepted" for transmission
>>and hence the caller no longer owns that packet handle. You need
>>higher-level protocols to track end-to-end transmission and receipt.
>>All that ordered queues say is that packets being sent to TX queues
>>will have those TX calls made in the same order as the source queue
>>they originated from.
>>
>>The only way to track packet disposition today is to (a) create a
>>reference to the packet you want to transmit, (b) verify that
>>odp_packet_has_ref(original_pkt) > 0, indicating that an actual
>>reference was created, (c) transmit that reference, and (d) note when
>>odp_packet_has_ref(original_pkt) returns to 0. That confirms that the
>>reference has exited the scope of this ODP instance since a
>>"successful" transmission will free that reference.
> Doesn't this just confirm that the reference has been freed? But you don't
> know if this was due to the packet actually being transmitted on the wire
> or if it was dropped before that (which would also free the reference).
>
> Back to my original question, how far into the "machine" can we return
> (to SW) absolute knowledge of the states of packets?
>
> With normal queues (including scheduled queues), a successful enqueue
> guarantees that the packet (event) was actually enqueued. But pktio egress
> queues are not normal queues, they are essentially representations of a
> network interface's TX queues but also maintain the order restoration
> function of events enqueued to a queue when processing an ordered queue.
>
Why is this specific to pktio egress queues?

Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-06-01 Thread Ola Liljedahl


On 01/06/2017, 21:03, "Bill Fischofer"  wrote:

>On Thu, Jun 1, 2017 at 10:59 AM, Honnappa Nagarahalli
> wrote:
>> On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo)
>>  wrote:
>>>
 On 31 May 2017, at 23:53, Bill Fischofer 
wrote:

 On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
  wrote:
>
 What's the purpose of calling ord_enq_multi() here? To save
(stash)
 packets if the thread is out-of-order?
 And when the thread is in-order, it is re-enqueueing the packets
which
 again will invoke pktout_enqueue/pktout_enq_multi but this time
 ord_enq_multi() will not save the packets, instead they will
actually be
 transmitted by odp_pktout_send()?

>>>
>>> Since transmitting packets may fail, out-of-order packets cannot be
>>> stashed here.
>> You mean that the TX queue of the pktio might be full so not all
>>packets
>> will actually be enqueued for transmission.
>
> Yep.
>
>> This is an interesting case but is it a must to know how many
>>packets are
>> actually accepted? Packets can always be dropped without notice, the
>> question is from which point this is acceptable. If packets
>>enqueued onto
>> a pktout (egress) queue are accepted, this means that they must
>>also be
>> put onto the driver TX queue (as done by odp_pktout_send)?
>>
>
> Currently, the packet_io/queue APIs don't say anything about packets
>being
> possibly dropped after successfully calling odp_queue_enq() to a
>pktout
> event queue. So to be consistent with standard odp_queue_enq()
>operations I
> think it is better to return the number of events actually accepted
>to the TX queue.
>
> To have more leeway one option would be to modify the API
>documentation to
> state that packets may still be dropped after a successful
>odp_queue_enq() call
> before reaching the NIC. If the application would like to be sure
>that the
> packets are actually sent, it should use odp_pktout_send() instead.

 Ordered queues simply say that packets will be delivered to the next
 queue in the pipeline in the order they originated from their source
 queue. What happens after that depends on the attributes of the target
 queue. If the target queue is an exit point from the application, then
 this is outside of ODP's scope.
>>>
>>> My point was that with stashing the application has no way of knowing
>>>if an
>>> ordered pktout enqueue call actually succeeded. In case of parallel and
>>>atomic
>>> queues it does. So my question is, is this acceptable?
>>>
>> Also, currently, it is not possible for the application to have a
>> consistent 'wait/drop on destination queue full' policy for all the
>> queue types.
>
>Today applications have no way of knowing whether packets sent to a
>pktout_queue or tm_queue actually make it to the wire or whether they
>are vaporized as soon as they hit the wire, so there's no change here.
>An RC of 0 simply says that the packet was "accepted" for transmission
>and hence the caller no longer owns that packet handle. You need
>higher-level protocols to track end-to-end transmission and receipt.
>All that ordered queues say is that packets being sent to TX queues
>will have those TX calls made in the same order as the source queue
>they originated from.
>
>The only way to track packet disposition today is to (a) create a
>reference to the packet you want to transmit, (b) verify that
>odp_packet_has_ref(original_pkt) > 0, indicating that an actual
>reference was created, (c) transmit that reference, and (d) note when
>odp_packet_has_ref(original_pkt) returns to 0. That confirms that the
>reference has exited the scope of this ODP instance since a
>"successful" transmission will free that reference.
Doesn't this just confirm that the reference has been freed? But you don't
know if this was due to the packet actually being transmitted on the wire
or if it was dropped before that (which would also free the reference).

Back to my original question, how far into the "machine" can we return
(to SW) absolute knowledge of the states of packets?

With normal queues (including scheduled queues), a successful enqueue
guarantees that the packet (event) was actually enqueued. But pktio egress
queues are not normal queues, they are essentially representations of a
network interface's TX queues but also maintain the order restoration
function of events enqueued to a queue when processing an ordered queue.

I interpret your comments Bill as even if enqueue to a pktio egress queue
is successful (the packet handle is no longer owned by the application),
the implementation from that moment on can do whatever it wants with the
packets (as long as 

Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-06-01 Thread Bill Fischofer
On Thu, Jun 1, 2017 at 10:59 AM, Honnappa Nagarahalli
 wrote:
> On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo)
>  wrote:
>>
>>> On 31 May 2017, at 23:53, Bill Fischofer  wrote:
>>>
>>> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
>>>  wrote:

>>> What’s the purpose of calling ord_enq_multi() here? To save (stash)
>>> packets if the thread is out-of-order?
>>> And when the thread is in-order, it is re-enqueueing the packets which
>>> again will invoke pktout_enqueue/pktout_enq_multi but this time
>>> ord_enq_multi() will not save the packets, instead they will actually be
>>> transmitted by odp_pktout_send()?
>>>
>>
>> Since transmitting packets may fail, out-of-order packets cannot be
>> stashed here.
> You mean that the TX queue of the pktio might be full so not all packets
> will actually be enqueued for transmission.

 Yep.

> This is an interesting case but is it a must to know how many packets are
> actually accepted? Packets can always be dropped without notice, the
> question is from which point this is acceptable. If packets enqueued onto
> a pktout (egress) queue are accepted, this means that they must also be
> put onto the driver TX queue (as done by odp_pktout_send)?
>

 Currently, the packet_io/queue APIs don't say anything about packets being
 possibly dropped after successfully calling odp_queue_enq() to a pktout
 event queue. So to be consistent with standard odp_queue_enq() operations I
 think it is better to return the number of events actually accepted to the 
 TX queue.

 To have more leeway one option would be to modify the API documentation to
 state that packets may still be dropped after a successful odp_queue_enq() 
 call
 before reaching the NIC. If the application would like to be sure that the
 packets are actually sent, it should use odp_pktout_send() instead.
>>>
>>> Ordered queues simply say that packets will be delivered to the next
>>> queue in the pipeline in the order they originated from their source
>>> queue. What happens after that depends on the attributes of the target
>>> queue. If the target queue is an exit point from the application, then
>>> this is outside of ODP's scope.
>>
>> My point was that with stashing the application has no way of knowing if an
>> ordered pktout enqueue call actually succeeded. In case of parallel and atomic
>> queues it does. So my question is, is this acceptable?
>>
> Also, currently, it is not possible for the application to have a
> consistent 'wait/drop on destination queue full' policy for all the
> queue types.

Today applications have no way of knowing whether packets sent to a
pktout_queue or tm_queue actually make it to the wire or whether they
are vaporized as soon as they hit the wire, so there's no change here.
An RC of 0 simply says that the packet was "accepted" for transmission
and hence the caller no longer owns that packet handle. You need
higher-level protocols to track end-to-end transmission and receipt.
All that ordered queues say is that packets being sent to TX queues
will have those TX calls made in the same order as the source queue
they originated from.

The only way to track packet disposition today is to (a) create a
reference to the packet you want to transmit, (b) verify that
odp_packet_has_ref(original_pkt) > 0, indicating that an actual
reference was created, (c) transmit that reference, and (d) note when
odp_packet_has_ref(original_pkt) returns to 0. That confirms that the
reference has exited the scope of this ODP instance since a
"successful" transmission will free that reference.

>>


Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-06-01 Thread Honnappa Nagarahalli
On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo)
 wrote:
>
>> On 31 May 2017, at 23:53, Bill Fischofer  wrote:
>>
>> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
>>  wrote:
>>>
>> What’s the purpose of calling ord_enq_multi() here? To save (stash)
>> packets if the thread is out-of-order?
>> And when the thread is in-order, it is re-enqueueing the packets which
>> again will invoke pktout_enqueue/pktout_enq_multi but this time
>> ord_enq_multi() will not save the packets, instead they will actually be
>> transmitted by odp_pktout_send()?
>>
>
> Since transmitting packets may fail, out-of-order packets cannot be
> stashed here.
 You mean that the TX queue of the pktio might be full so not all packets
 will actually be enqueued for transmission.
>>>
>>> Yep.
>>>
 This is an interesting case but is it a must to know how many packets are
 actually accepted? Packets can always be dropped without notice, the
 question is from which point this is acceptable. If packets enqueued onto
 a pktout (egress) queue are accepted, this means that they must also be
 put onto the driver TX queue (as done by odp_pktout_send)?

>>>
>>> Currently, the packet_io/queue APIs don't say anything about packets being
>>> possibly dropped after successfully calling odp_queue_enq() to a pktout
>>> event queue. So to be consistent with standard odp_queue_enq() operations I
>>> think it is better to return the number of events actually accepted to the 
>>> TX queue.
>>>
>>> To have more leeway one option would be to modify the API documentation to
>>> state that packets may still be dropped after a successful odp_queue_enq() 
>>> call
>>> before reaching the NIC. If the application would like to be sure that the
>>> packets are actually sent, it should use odp_pktout_send() instead.
>>
>> Ordered queues simply say that packets will be delivered to the next
>> queue in the pipeline in the order they originated from their source
>> queue. What happens after that depends on the attributes of the target
>> queue. If the target queue is an exit point from the application, then
>> this is outside of ODP's scope.
>
> My point was that with stashing the application has no way of knowing if an
> ordered pktout enqueue call actually succeeded. In case of parallel and atomic
> queues it does. So my question is, is this acceptable?
>
Also, currently, it is not possible for the application to have a
consistent 'wait/drop on destination queue full' policy for all the
queue types.
>


Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-06-01 Thread Elo, Matias (Nokia - FI/Espoo)

> On 31 May 2017, at 23:53, Bill Fischofer  wrote:
> 
> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
>  wrote:
>> 
> What’s the purpose of calling ord_enq_multi() here? To save (stash)
> packets if the thread is out-of-order?
> And when the thread is in-order, it is re-enqueueing the packets which
> again will invoke pktout_enqueue/pktout_enq_multi but this time
> ord_enq_multi() will not save the packets, instead they will actually be
> transmitted by odp_pktout_send()?
> 
 
 Since transmitting packets may fail, out-of-order packets cannot be
 stashed here.
>>> You mean that the TX queue of the pktio might be full so not all packets
>>> will actually be enqueued for transmission.
>> 
>> Yep.
>> 
>>> This is an interesting case but is it a must to know how many packets are
>>> actually accepted? Packets can always be dropped without notice, the
>>> question is from which point this is acceptable. If packets enqueued onto
>>> a pktout (egress) queue are accepted, this means that they must also be
>>> put onto the driver TX queue (as done by odp_pktout_send)?
>>> 
>> 
>> Currently, the packet_io/queue APIs don't say anything about packets being
>> possibly dropped after successfully calling odp_queue_enq() to a pktout
>> event queue. So to be consistent with standard odp_queue_enq() operations I
>> think it is better to return the number of events actually accepted to the 
>> TX queue.
>> 
>> To have more leeway one option would be to modify the API documentation to
>> state that packets may still be dropped after a successful odp_queue_enq() 
>> call
>> before reaching the NIC. If the application would like to be sure that the
>> packets are actually sent, it should use odp_pktout_send() instead.
> 
> Ordered queues simply say that packets will be delivered to the next
> queue in the pipeline in the order they originated from their source
> queue. What happens after that depends on the attributes of the target
> queue. If the target queue is an exit point from the application, then
> this is outside of ODP's scope.

My point was that with stashing the application has no way of knowing if an
ordered pktout enqueue call actually succeeded. In case of parallel and atomic
queues it does. So my question is, is this acceptable?




Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-31 Thread Kevin Wang
We are in the process of implementing a new queue interface which is more
modular.

Kevin

2017-05-23 16:22 GMT+08:00 Savolainen, Petri (Nokia - FI/Espoo) <
petri.savolai...@nokia.com>:

>
> > diff --git a/platform/linux-generic/include/odp_queue_internal.h
> > b/platform/linux-generic/include/odp_queue_internal.h
> > index 560f826e..cbe065dc 100644
> > --- a/platform/linux-generic/include/odp_queue_internal.h
> > +++ b/platform/linux-generic/include/odp_queue_internal.h
> > @@ -19,16 +19,24 @@ extern "C" {
> >  #endif
> >
> >  #include 
> > -#include 
> > -#include 
> > -#include 
> > -#include 
> > +#include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> > +
> >  #include 
> >
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#ifdef ODP_SCHEDULE_SCALABLE
> > +#include 
> > +#include 
> > +#endif
> > +
> >  #define QUEUE_MULTI_MAX CONFIG_BURST_SIZE
> >
> >  #define QUEUE_STATUS_FREE 0
> > @@ -37,8 +45,6 @@ extern "C" {
> >  #define QUEUE_STATUS_NOTSCHED 3
> >  #define QUEUE_STATUS_SCHED4
> >
> > -
> > -/* forward declaration */
> >  union queue_entry_u;
> >
> >  typedef int (*enq_func_t)(union queue_entry_u *, odp_buffer_hdr_t *);
> > @@ -49,6 +55,47 @@ typedef int (*enq_multi_func_t)(union queue_entry_u *,
> >  typedef  int (*deq_multi_func_t)(union queue_entry_u *,
> >   odp_buffer_hdr_t **, int);
> >
> > +#ifdef ODP_SCHEDULE_SCALABLE
>
> The queue interface still needs to be improved, so that there's no
> conditional compilation like this piece of code here.
>
>
> > +
> > +struct queue_entry_s {
> > + sched_elem_t sched_elem;
> > +
> > + odp_ticketlock_t lock ODP_ALIGNED_CACHE;
> > + int  status;
> > +
> > + enq_func_t   enqueue ODP_ALIGNED_CACHE;
> > + deq_func_t   dequeue;
> > + enq_multi_func_t enqueue_multi;
> > + deq_multi_func_t dequeue_multi;
> > +
> > + uint32_t   index;
> > + odp_queue_thandle;
> > + odp_queue_type_t   type;
> > + odp_queue_param_t  param;
> > + odp_pktin_queue_t  pktin;
> > + odp_pktout_queue_t pktout;
> > + char   name[ODP_QUEUE_NAME_LEN];
> > +};
> > +
> > +int _odp_queue_deq(sched_elem_t *q, odp_buffer_hdr_t *buf_hdr[], int
> > num);
> > +int _odp_queue_deq_sc(sched_elem_t *q, odp_event_t *evp, int num);
> > +int _odp_queue_deq_mc(sched_elem_t *q, odp_event_t *evp, int num);
> > +
> > +/* Round up memory size to next cache line size to
> > + * align all memory addresses on cache line boundary.
> > + */
> > +static inline void *shm_pool_alloc_align(_odp_ishm_pool_t *pool,
> uint32_t
> > size)
> > +{
> > + void *addr;
> > +
> > + addr = _odp_ishm_pool_alloc(pool, ROUNDUP_CACHE_LINE(size));
> > + ODP_ASSERT(((uintptr_t)addr & (ODP_CACHE_LINE_SIZE - 1)) == 0);
> > +
> > + return addr;
> > +}
> > +
> > +#else
>
>
>
>
> > diff --git a/platform/linux-generic/odp_packet_io.c b/platform/linux-
> > generic/odp_packet_io.c
> > index d8cae15c..717e7e33 100644
> > --- a/platform/linux-generic/odp_packet_io.c
> > +++ b/platform/linux-generic/odp_packet_io.c
> > @@ -5,23 +5,25 @@
> >   */
> >  #include 
> >
> > -#include 
> > -#include 
> > -#include 
> >  #include 
> > -#include 
> > -#include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > -#include 
> > +#include 
> > +
> > +#include 
> >  #include 
> > -#include 
> > -#include 
> > -#include 
> >  #include 
> > +
> > +#include 
> > +#include 
> >  #include 
> > -#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> >
> >  #include 
> >  #include 
> > @@ -472,7 +474,6 @@ int odp_pktio_start(odp_pktio_t hdl)
> >   return -1;
> >   }
> >   }
> > -
> >   sched_fn->pktio_start(pktio_to_id(hdl), num, index);
> >   }
> >
> > @@ -554,7 +555,6 @@ static inline int pktin_recv_buf(odp_pktin_queue_t
> > queue,
> >   odp_packet_t packets[num];
> >   odp_packet_hdr_t *pkt_hdr;
> >   odp_buffer_hdr_t *buf_hdr;
> > - odp_buffer_t buf;
> >   int i;
> >   int pkts;
> >   int num_rx = 0;
> > @@ -564,9 +564,7 @@ static inline int pktin_recv_buf(odp_pktin_queue_t
> > queue,
> >   for (i = 0; i < pkts; i++) {
> >   pkt = packets[i];
> >   pkt_hdr = odp_packet_hdr(pkt);
> > - buf = _odp_packet_to_buffer(pkt);
> > - buf_hdr = buf_hdl_to_hdr(buf);
> > -
> > + buf_hdr =
> > buf_hdl_to_hdr(_odp_packet_to_buffer(pkt));
>
>
> Do not include unnecessary style changes like this one (and the one above)
> into this patch. These are just cosmetic and obscure the functional
> change you are trying to achieve. These waste reviewers' time figuring out
> why this change was needed - just to realize that it was not actually
> needed.
>
> Also e.g. in these cases, I like the original style better.
>
>
>
> >   if 

Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-31 Thread Bill Fischofer
On Wed, May 31, 2017 at 3:53 PM, Bill Fischofer
 wrote:
> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
>  wrote:
>>
> What’s the purpose of calling ord_enq_multi() here? To save (stash)
> packets if the thread is out-of-order?
> And when the thread is in-order, it is re-enqueueing the packets which
> again will invoke pktout_enqueue/pktout_enq_multi but this time
> ord_enq_multi() will not save the packets, instead they will actually be
> transmitted by odp_pktout_send()?
>

 Since transmitting packets may fail, out-of-order packets cannot be
 stashed here.
>>> You mean that the TX queue of the pktio might be full so not all packets
>>> will actually be enqueued for transmission.
>>
>> Yep.
>>
>>> This is an interesting case but is it a must to know how many packets are
>>> actually accepted? Packets can always be dropped without notice, the
>>> question is from which point this is acceptable. If packets enqueued onto
>>> a pktout (egress) queue are accepted, this means that they must also be
>>> put onto the driver TX queue (as done by odp_pktout_send)?
>>>
>>
>> Currently, the packet_io/queue APIs don't say anything about packets being
>> possibly dropped after successfully calling odp_queue_enq() to a pktout
>> event queue. So to be consistent with standard odp_queue_enq() operations I
>> think it is better to return the number of events actually accepted to the 
>> TX queue.
>>
>> To have more leeway one option would be to modify the API documentation to
>> state that packets may still be dropped after a successful odp_queue_enq() 
>> call
>> before reaching the NIC. If the application would like to be sure that the
>> packets are actually sent, it should use odp_pktout_send() instead.
>
> Ordered queues simply say that packets will be delivered to the next
> queue in the pipeline in the order they originated from their source
> queue. What happens after that depends on the attributes of the target
> queue. If the target queue is an exit point from the application, then
> this is outside of ODP's scope.
>
> Ethernet is a lossy protocol, so just because packets are
> "transmitted" in order doesn't mean that they can't be dropped or
> reordered by some other node en route to their final destination. This
> is why TCP has ACK packets to confirm receipt. All that ODP says is
> that it will do its part to preserve order and will not introduce
> reorderings itself.  So I don't see ODP ordered queue processing
> needing to treat pktout queues any differently than it would for any
> other target queue. The original oversight was simply that reorder
> processing was not being used at all for these targets.

Forgot to add that odp_tm_queues have the same need. I don't recall if
the new ordered queue logic covers them as well.

>
>>
>> -Matias
>>


Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-31 Thread Bill Fischofer
On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
 wrote:
>
 What’s the purpose of calling ord_enq_multi() here? To save (stash)
 packets if the thread is out-of-order?
 And when the thread is in-order, it is re-enqueueing the packets which
 again will invoke pktout_enqueue/pktout_enq_multi but this time
 ord_enq_multi() will not save the packets, instead they will actually be
 transmitted by odp_pktout_send()?

>>>
>>> Since transmitting packets may fail, out-of-order packets cannot be
>>> stashed here.
>> You mean that the TX queue of the pktio might be full so not all packets
>> will actually be enqueued for transmission.
>
> Yep.
>
>> This is an interesting case but is it a must to know how many packets are
>> actually accepted? Packets can always be dropped without notice, the
>> question is from which point this is acceptable. If packets enqueued onto
>> a pktout (egress) queue are accepted, this means that they must also be
>> put onto the driver TX queue (as done by odp_pktout_send)?
>>
>
> Currently, the packet_io/queue APIs don't say anything about packets being
> possibly dropped after successfully calling odp_queue_enq() to a pktout
> event queue. So to be consistent with standard odp_queue_enq() operations I
> think it is better to return the number of events actually accepted to the TX 
> queue.
>
> To have more leeway one option would be to modify the API documentation to
> state that packets may still be dropped after a successful odp_queue_enq() 
> call
> before reaching the NIC. If the application would like to be sure that the
> packets are actually sent, it should use odp_pktout_send() instead.

Ordered queues simply say that packets will be delivered to the next
queue in the pipeline in the order they originated from their source
queue. What happens after that depends on the attributes of the target
queue. If the target queue is an exit point from the application, then
this is outside of ODP's scope.

Ethernet is a lossy protocol, so just because packets are
"transmitted" in order doesn't mean that they can't be dropped or
reordered by some other node en route to their final destination. This
is why TCP has ACK packets to confirm receipt. All that ODP says is
that it will do its part to preserve order and will not introduce
reorderings itself.  So I don't see ODP ordered queue processing
needing to treat pktout queues any differently than it would for any
other target queue. The original oversight was simply that reorder
processing was not being used at all for these targets.

>
> -Matias
>


Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-31 Thread Elo, Matias (Nokia - FI/Espoo)

>>> What’s the purpose of calling ord_enq_multi() here? To save (stash)
>>> packets if the thread is out-of-order?
>>> And when the thread is in-order, it is re-enqueueing the packets which
>>> again will invoke pktout_enqueue/pktout_enq_multi but this time
>>> ord_enq_multi() will not save the packets, instead they will actually be
>>> transmitted by odp_pktout_send()?
>>> 
>> 
>> Since transmitting packets may fail, out-of-order packets cannot be
>> stashed here.
> You mean that the TX queue of the pktio might be full so not all packets
> will actually be enqueued for transmission.

Yep. 

> This is an interesting case but is it a must to know how many packets are
> actually accepted? Packets can always be dropped without notice, the
> question is from which point this is acceptable. If packets enqueued onto
> a pktout (egress) queue are accepted, this means that they must also be
> put onto the driver TX queue (as done by odp_pktout_send)?
> 

Currently, the packet_io/queue APIs don't say anything about packets being
possibly dropped after successfully calling odp_queue_enq() to a pktout
event queue. So to be consistent with standard odp_queue_enq() operations I
think it is better to return the number of events actually accepted to the TX 
queue.

To have more leeway one option would be to modify the API documentation to
state that packets may still be dropped after a successful odp_queue_enq() call
before reaching the NIC. If the application would like to be sure that the
packets are actually sent, it should use odp_pktout_send() instead.
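
A rough sketch of that contract from the application side, assuming
standard odp_queue_enq_multi() semantics where the caller keeps
ownership of any events not accepted (send_burst() is a hypothetical
helper, not an ODP API):

#include <odp_api.h>

static int send_burst(odp_queue_t pktout_q, odp_packet_t pkts[], int num)
{
	odp_event_t ev[num];
	int i, sent;

	for (i = 0; i < num; i++)
		ev[i] = odp_packet_to_event(pkts[i]);

	/* Return value is the number of events actually accepted to
	 * the TX queue (or < 0 on error). */
	sent = odp_queue_enq_multi(pktout_q, ev, num);
	if (sent < 0)
		sent = 0;

	/* The caller still owns the rest: drop (or retry) them */
	for (i = sent; i < num; i++)
		odp_event_free(ev[i]);

	return sent;
}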

-Matias



Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-31 Thread Ola Liljedahl
On 31/05/2017, 12:18, "Elo, Matias (Nokia - FI/Espoo)"
 wrote:


>
>> On 31 May 2017, at 12:04, Ola Liljedahl  wrote:
>> 
>> 
>> 
>> On 31/05/2017, 10:38, "Peltonen, Janne (Nokia - FI/Espoo)"
>>  wrote:
>> 
>>> 
>>> 
>>> Ola Liljedahl wrote:
 On 23/05/2017, 16:49, "Peltonen, Janne (Nokia - FI/Espoo)"
  wrote:
 
 
> 
>> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[],
>> + int num, int *ret)
>> +{
>> +(void)queue_index;
>> +(void)p_buf_hdr;
>> +(void)num;
>> +(void)ret;
>> +return 0;
>> +}
> 
> How is packet order maintained when enqueuing packets read from an
 ordered
> queue to a pktout queue? Matias' recent fix uses the ord_enq_multi
> scheduler
> function for that, but this version does not do any ordering. Or is
>the
> ordering guaranteed by some other means?
 The scalable scheduler standard queue enqueue function also handles
 ordered queues. odp_queue_scalable.c can refer to the same
 thread-specific
 data as odp_schedule_scalable.c so we don't need this internal
 interface.
 We could perhaps adapt the code to use this interface but I think this
 interface is just an artefact of the implementation of the default
 queues/scheduler.
>>> 
>>> The problem is that odp_pktout_queue_config() sets qentry->s.enqueue
>>> to pktout_enqueue() and that does not have any of the scalable
>>>scheduler
>>> specific magic that odp_queue_scalable.c:queue_enq{_multi}() has. So
>>> ordering does not happen for pktout queues even if it works for other
>>> queues, right?
>> This must be a recent change, it doesn’t look like that in the working
>> branch we are using.
>> I see the code when changing to the master branch.
>> The code in pktout_enqueue() does look like a hack:
>>if (sched_fn->ord_enq_multi(qentry->s.index, (void **)buf_hdr,
>> len, ))
>> A cast to “void **”???
>> 
>> What’s the purpose of calling ord_enq_multi() here? To save (stash)
>> packets if the thread is out-of-order?
>> And when the thread is in-order, it is re-enqueueing the packets which
>> again will invoke pktout_enqueue/pktout_enq_multi but this time
>> ord_enq_multi() will not save the packets, instead they will actually be
>> transmitted by odp_pktout_send()?
>> 
>
>Since transmitting packets may fail, out-of-order packets cannot be
>stashed here.
You mean that the TX queue of the pktio might be full so not all packets
will actually be enqueued for transmission.
This is an interesting case but is it a must to know how many packets are
actually accepted? Packets can always be dropped without notice, the
question is from which point this is acceptable. If packets enqueued onto
a pktout (egress) queue are accepted, this means that they must also be
put onto the driver TX queue (as done by odp_pktout_send)?


>With the current scheduler implementation sched_fn->ord_enq_multi() waits
>until
>in-order and always returns 0 (in case of pktout queue). After this
>odp_pktout_send()
>is called.
>
>-Matias
>



Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-31 Thread Elo, Matias (Nokia - FI/Espoo)

> On 31 May 2017, at 12:04, Ola Liljedahl  wrote:
> 
> 
> 
> On 31/05/2017, 10:38, "Peltonen, Janne (Nokia - FI/Espoo)"
>  wrote:
> 
>> 
>> 
>> Ola Liljedahl wrote:
>>> On 23/05/2017, 16:49, "Peltonen, Janne (Nokia - FI/Espoo)"
>>>  wrote:
>>> 
>>> 
 
> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[],
> +  int num, int *ret)
> +{
> + (void)queue_index;
> + (void)p_buf_hdr;
> + (void)num;
> + (void)ret;
> + return 0;
> +}
 
 How is packet order maintained when enqueuing packets read from an
>>> ordered
 queue to a pktout queue? Matias' recent fix uses the ord_enq_multi
 scheduler
 function for that, but this version does not do any ordering. Or is the
 ordering guaranteed by some other means?
>>> The scalable scheduler standard queue enqueue function also handles
>>> ordered queues. odp_queue_scalable.c can refer to the same
>>> thread-specific
>>> data as odp_schedule_scalable.c so we don't need this internal
>>> interface.
>>> We could perhaps adapt the code to use this interface but I think this
>>> interface is just an artefact of the implementation of the default
>>> queues/scheduler.
>> 
>> The problem is that odp_pktout_queue_config() sets qentry->s.enqueue
>> to pktout_enqueue() and that does not have any of the scalable scheduler
>> specific magic that odp_queue_scalable.c:queue_enq{_multi}() has. So
>> ordering does not happen for pktout queues even if it works for other
>> queues, right?
> This must be a recent change, it doesn’t look like that in the working
> branch we are using.
> I see the code when changing to the master branch.
> The code in pktout_enqueue() does look like a hack:
>if (sched_fn->ord_enq_multi(qentry->s.index, (void **)buf_hdr,
> len, ))
> A cast to “void **”???
> 
> What’s the purpose of calling ord_enq_multi() here? To save (stash)
> packets if the thread is out-of-order?
> And when the thread is in-order, it is re-enqueueing the packets which
> again will invoke pktout_enqueue/pktout_enq_multi but this time
> ord_enq_multi() will not save the packets, instead they will actually be
> transmitted by odp_pktout_send()?
> 

Since transmitting packets may fail, out-of-order packets cannot be stashed 
here.
With the current scheduler implementation sched_fn->ord_enq_multi() waits until
in-order and always returns 0 (in case of pktout queue). After this 
odp_pktout_send()
is called.

-Matias



Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-31 Thread Ola Liljedahl


On 31/05/2017, 10:38, "Peltonen, Janne (Nokia - FI/Espoo)"
 wrote:

>
>
>Ola Liljedahl wrote:
>> On 23/05/2017, 16:49, "Peltonen, Janne (Nokia - FI/Espoo)"
>>  wrote:
>> 
>> 
>> >
>> >> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[],
>> >> +  int num, int *ret)
>> >> +{
>> >> + (void)queue_index;
>> >> + (void)p_buf_hdr;
>> >> + (void)num;
>> >> + (void)ret;
>> >> + return 0;
>> >> +}
>> >
>> >How is packet order maintained when enqueuing packets read from an
>>ordered
>> >queue to a pktout queue? Matias' recent fix uses the ord_enq_multi
>> >scheduler
>> >function for that, but this version does not do any ordering. Or is the
>> >ordering guaranteed by some other means?
>> The scalable scheduler standard queue enqueue function also handles
>> ordered queues. odp_queue_scalable.c can refer to the same
>>thread-specific
>> data as odp_schedule_scalable.c so we don't need this internal
>>interface.
>> We could perhaps adapt the code to use this interface but I think this
>> interface is just an artefact of the implementation of the default
>> queues/scheduler.
>
>The problem is that odp_pktout_queue_config() sets qentry->s.enqueue
>to pktout_enqueue() and that does not have any of the scalable scheduler
>specific magic that odp_queue_scalable.c:queue_enq{_multi}() has. So
>ordering does not happen for pktout queues even if it works for other
>queues, right?
This must be a recent change, it doesn’t look like that in the working
branch we are using.
I see the code when changing to the master branch.
The code in pktout_enqueue() does look like a hack:
if (sched_fn->ord_enq_multi(qentry->s.index, (void **)buf_hdr,
len, ))
A cast to “void **”???

What’s the purpose of calling ord_enq_multi() here? To save (stash)
packets if the thread is out-of-order?
And when the thread is in-order, it is re-enqueueing the packets which
again will invoke pktout_enqueue/pktout_enq_multi but this time
ord_enq_multi() will not save the packets, instead they will actually be
transmitted by odp_pktout_send()?
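
For reference, a self-contained sketch of the control flow being asked
about; thread_in_order(), stash_packets() and driver_tx_send() are
invented stand-ins, not the actual linux-generic internals:

#include <stdbool.h>
#include <stdint.h>

static bool thread_in_order(uint32_t qi) { (void)qi; return true; }
static void stash_packets(void *b[], int n) { (void)b; (void)n; }
static int driver_tx_send(void *b[], int n) { (void)b; return n; }

/* Returns nonzero when the packets were consumed (stashed) because the
 * thread is out-of-order; *ret then reports them as accepted. */
static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[],
			 int num, int *ret)
{
	if (!thread_in_order(queue_index)) {
		stash_packets(p_buf_hdr, num);
		*ret = num;
		return 1;
	}
	return 0;	/* in-order: caller proceeds to transmit */
}

static int pktout_enq_sketch(uint32_t queue_index, void *p_buf_hdr[],
			     int num)
{
	int nbr;

	if (ord_enq_multi(queue_index, p_buf_hdr, num, &nbr))
		return nbr;	/* stashed; re-enqueued once in-order */

	/* in-order: the odp_pktout_send() role */
	return driver_tx_send(p_buf_hdr, num);
}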



>
>   Janne
>
>> 
>> >
>> >> +static void order_lock(void)
>> >> +{
>> >> +}
>> >> +
>> >> +static void order_unlock(void)
>> >> +{
>> >> +}
>> >
>> >Is it ok that these are no-ops? tm_enqueue() seems to use these.
>> No these ought to be implemented. We have fixed that now. Thanks.
>> 
>> 
>> -- Ola
>> 
>> Ola Liljedahl, Networking System Architect, ARM
>> Phone: +46 706 866 373  Skype: ola.liljedahl
>> 
>> 
>> 
>> 
>> >
>> >> +
>> >> +const schedule_fn_t schedule_scalable_fn = {
>> >> + .pktio_start= pktio_start,
>> >> + .thr_add= thr_add,
>> >> + .thr_rem= thr_rem,
>> >> + .num_grps   = num_grps,
>> >> + .init_queue = init_queue,
>> >> + .destroy_queue  = destroy_queue,
>> >> + .sched_queue= sched_queue,
>> >> + .ord_enq_multi  = ord_enq_multi,
>> >> + .init_global= schedule_init_global,
>> >> + .term_global= schedule_term_global,
>> >> + .init_local = schedule_init_local,
>> >> + .term_local = schedule_term_local,
>> >> + .order_lock = order_lock,
>> >> + .order_unlock   = order_unlock,
>> >> +};
>> >
>> >Janne
>> >
>> >
>



Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-31 Thread Peltonen, Janne (Nokia - FI/Espoo)


Ola Liljedahl wrote:
> On 23/05/2017, 16:49, "Peltonen, Janne (Nokia - FI/Espoo)"
>  wrote:
> 
> 
> >
> >> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[],
> >> +   int num, int *ret)
> >> +{
> >> +  (void)queue_index;
> >> +  (void)p_buf_hdr;
> >> +  (void)num;
> >> +  (void)ret;
> >> +  return 0;
> >> +}
> >
> >How is packet order maintained when enqueuing packets read from an ordered
> >queue to a pktout queue? Matias' recent fix uses the ord_enq_multi
> >scheduler
> >function for that, but this version does not do any ordering. Or is the
> >ordering guaranteed by some other means?
> The scalable scheduler standard queue enqueue function also handles
> ordered queues. odp_queue_scalable.c can refer to the same thread-specific
> data as odp_schedule_scalable.c so we don't need this internal interface.
> We could perhaps adapt the code to use this interface but I think this
> interface is just an artefact of the implementation of the default
> queues/scheduler.

The problem is that odp_pktout_queue_config() sets qentry->s.enqueue
to pktout_enqueue() and that does not have any of the scalable scheduler
specific magic that odp_queue_scalable.c:queue_enq{_multi}() has. So
ordering does not happen for pktout queues even if it works for other
queues, right?

Janne

> 
> >
> >> +static void order_lock(void)
> >> +{
> >> +}
> >> +
> >> +static void order_unlock(void)
> >> +{
> >> +}
> >
> >Is it ok that these are no-ops? tm_enqueue() seems to use these.
> No these ought to be implemented. We have fixed that now. Thanks.
> 
> 
> -- Ola
> 
> Ola Liljedahl, Networking System Architect, ARM
> Phone: +46 706 866 373  Skype: ola.liljedahl
> 
> 
> 
> 
> >
> >> +
> >> +const schedule_fn_t schedule_scalable_fn = {
> >> +  .pktio_start= pktio_start,
> >> +  .thr_add= thr_add,
> >> +  .thr_rem= thr_rem,
> >> +  .num_grps   = num_grps,
> >> +  .init_queue = init_queue,
> >> +  .destroy_queue  = destroy_queue,
> >> +  .sched_queue= sched_queue,
> >> +  .ord_enq_multi  = ord_enq_multi,
> >> +  .init_global= schedule_init_global,
> >> +  .term_global= schedule_term_global,
> >> +  .init_local = schedule_init_local,
> >> +  .term_local = schedule_term_local,
> >> +  .order_lock = order_lock,
> >> +  .order_unlock   = order_unlock,
> >> +};
> >
> > Janne
> >
> >



Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-30 Thread Ola Liljedahl
On 23/05/2017, 16:49, "Peltonen, Janne (Nokia - FI/Espoo)"
 wrote:


>
>> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[],
>> + int num, int *ret)
>> +{
>> +(void)queue_index;
>> +(void)p_buf_hdr;
>> +(void)num;
>> +(void)ret;
>> +return 0;
>> +}
>
>How is packet order maintained when enqueuing packets read from an ordered
>queue to a pktout queue? Matias' recent fix uses the ord_enq_multi
>scheduler
>function for that, but this version does not do any ordering. Or is the
>ordering guaranteed by some other means?
The scalable scheduler standard queue enqueue function also handles
ordered queues. odp_queue_scalable.c can refer to the same thread-specific
data as odp_schedule_scalable.c so we don't need this internal interface.
We could perhaps adapt the code to use this interface but I think this
interface is just an artefact of the implementation of the default
queues/scheduler.
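
Simplified, the common enqueue path does something like this (the
thread-state type and field names below are placeholders, not the
actual code):

    /* Sketch: queue_enq() consults the thread-local scheduler state;
     * an out-of-order thread stashes the event on its reorder context
     * instead of enqueuing it directly. */
    static int queue_enq_sketch(queue_entry_t *queue,
                                odp_buffer_hdr_t *buf_hdr)
    {
            sched_scalable_thread_state_t *ts = sched_ts;

            if (ts != NULL && ts->out_of_order)
                    return ordered_stash(ts, queue, buf_hdr);

            return plain_enq(queue, buf_hdr);
    }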

>
>> +static void order_lock(void)
>> +{
>> +}
>> +
>> +static void order_unlock(void)
>> +{
>> +}
>
>Is it ok that these are no-ops? tm_enqueue() seems to use these.
No, these ought to be implemented. We have fixed that now. Thanks.
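
Conceptually the fix is a wait-for-turn, something like this (all names
below are illustrative; the actual patch differs in detail):

    /* Sketch of order_lock(): block until this thread's sequence
     * number in the reorder context becomes current, so the caller
     * (e.g. tm_enqueue()) runs its critical section in packet order. */
    static void order_lock_sketch(void)
    {
            sched_scalable_thread_state_t *ts = sched_ts;
            reorder_context_t *rctx = ts->rctx;

            while (__atomic_load_n(&rctx->cur_seq, __ATOMIC_ACQUIRE) !=
                   ts->my_seq)
                    odp_cpu_pause();
    }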


-- Ola

Ola Liljedahl, Networking System Architect, ARM
Phone: +46 706 866 373  Skype: ola.liljedahl




>
>> +
>> +const schedule_fn_t schedule_scalable_fn = {
>> +.pktio_start= pktio_start,
>> +.thr_add= thr_add,
>> +.thr_rem= thr_rem,
>> +.num_grps   = num_grps,
>> +.init_queue = init_queue,
>> +.destroy_queue  = destroy_queue,
>> +.sched_queue= sched_queue,
>> +.ord_enq_multi  = ord_enq_multi,
>> +.init_global= schedule_init_global,
>> +.term_global= schedule_term_global,
>> +.init_local = schedule_init_local,
>> +.term_local = schedule_term_local,
>> +.order_lock = order_lock,
>> +.order_unlock   = order_unlock,
>> +};
>
>   Janne
>
>



Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-24 Thread Peltonen, Janne (Nokia - FI/Espoo)
Honnappa Nagarahalli wrote:
> On 23 May 2017 at 09:49, Peltonen, Janne (Nokia - FI/Espoo)
>  wrote:
> >
> >> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[],
> >> +  int num, int *ret)
> >> +{
> >> + (void)queue_index;
> >> + (void)p_buf_hdr;
> >> + (void)num;
> >> + (void)ret;
> >> + return 0;
> >> +}
> >
> > How is packet order maintained when enqueuing packets read from an ordered
> > queue to a pktout queue? Matias' recent fix uses the ord_enq_multi scheduler
> > function for that, but this version does not do any ordering. Or is the
> > ordering guaranteed by some other means?
> >
> 
> We have been running TM related test cases and they are working fine.
> So, I would assume we do not need to do anything.

The packet ordering question above is not specific to TM but is
relevant also when TM is not used. And a test case may not reveal
ordering problems unless you are specifically looking for them and
somehow making sure that the right order does not happen by accident.

Here is a simple use case for packet ordering at output: Application
gets incoming packets from single pktin through single ordered and
scheduled queue. The packets get spread to multiple threads by the
scheduler. The threads output the packets to the same pktout queue
used in queued mode. The output packets on the wire should be in the
same order as they were received, even if the worker threads processed
the packets at very different speeds.
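
In code the setup is as simple as this (sketch; pktio is an opened
odp_pktio_t and pktout_evq one of its event queues from
odp_pktout_event_queue()):

    /* Single pktin delivered through one ordered, scheduled queue. */
    odp_pktin_queue_param_t pin;

    odp_pktin_queue_param_init(&pin);
    pin.num_queues = 1;
    pin.queue_param.type = ODP_QUEUE_TYPE_SCHED;
    pin.queue_param.sched.sync = ODP_SCHED_SYNC_ORDERED;
    odp_pktin_queue_config(pktio, &pin);

    /* In each worker thread: */
    odp_event_t ev = odp_schedule(NULL, ODP_SCHED_WAIT);
    /* ...process the packet, fast or slow, then: */
    odp_queue_enq(pktout_evq, ev); /* wire order must match input order */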

> 
> What is the use case from TM? Does TM use ODP queues?

I am not familiar with the TM, but it looks like the scheduler may want to
ensure correct packet order using the order_lock function and therefore
that function being a no-op in the scalable scheduler looked alarming to me.

Janne

> 
> >> +static void order_lock(void)
> >> +{
> >> +}
> >> +
> >> +static void order_unlock(void)
> >> +{
> >> +}
> >
> > Is it ok that these are no-ops? tm_enqueue() seems to use these.
> >
> >> +
> >> +const schedule_fn_t schedule_scalable_fn = {
> >> + .pktio_start= pktio_start,
> >> + .thr_add= thr_add,
> >> + .thr_rem= thr_rem,
> >> + .num_grps   = num_grps,
> >> + .init_queue = init_queue,
> >> + .destroy_queue  = destroy_queue,
> >> + .sched_queue= sched_queue,
> >> + .ord_enq_multi  = ord_enq_multi,
> >> + .init_global= schedule_init_global,
> >> + .term_global= schedule_term_global,
> >> + .init_local = schedule_init_local,
> >> + .term_local = schedule_term_local,
> >> + .order_lock = order_lock,
> >> + .order_unlock   = order_unlock,
> >> +};
> >
> > Janne
> >
> >


Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-23 Thread Honnappa Nagarahalli
On 23 May 2017 at 09:49, Peltonen, Janne (Nokia - FI/Espoo)
 wrote:
>
>> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[],
>> +  int num, int *ret)
>> +{
>> + (void)queue_index;
>> + (void)p_buf_hdr;
>> + (void)num;
>> + (void)ret;
>> + return 0;
>> +}
>
> How is packet order maintained when enqueuing packets read from an ordered
> queue to a pktout queue? Matias' recent fix uses the ord_enq_multi scheduler
> function for that, but this version does not do any ordering. Or is the
> ordering guaranteed by some other means?
>

We have been running TM related test cases and they are working fine.
So, I would assume we do not need to do anything.

What is the use case from TM? Does TM use ODP queues?

>> +static void order_lock(void)
>> +{
>> +}
>> +
>> +static void order_unlock(void)
>> +{
>> +}
>
> Is it ok that these are no-ops? tm_enqueue() seems to use these.
>
>> +
>> +const schedule_fn_t schedule_scalable_fn = {
>> + .pktio_start= pktio_start,
>> + .thr_add= thr_add,
>> + .thr_rem= thr_rem,
>> + .num_grps   = num_grps,
>> + .init_queue = init_queue,
>> + .destroy_queue  = destroy_queue,
>> + .sched_queue= sched_queue,
>> + .ord_enq_multi  = ord_enq_multi,
>> + .init_global= schedule_init_global,
>> + .term_global= schedule_term_global,
>> + .init_local = schedule_init_local,
>> + .term_local = schedule_term_local,
>> + .order_lock = order_lock,
>> + .order_unlock   = order_unlock,
>> +};
>
> Janne
>
>


Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-23 Thread Savolainen, Petri (Nokia - FI/Espoo)

> diff --git a/platform/linux-generic/include/odp_queue_internal.h
> b/platform/linux-generic/include/odp_queue_internal.h
> index 560f826e..cbe065dc 100644
> --- a/platform/linux-generic/include/odp_queue_internal.h
> +++ b/platform/linux-generic/include/odp_queue_internal.h
> @@ -19,16 +19,24 @@ extern "C" {
>  #endif
> 
>  #include 
> -#include 
> -#include 
> -#include 
> -#include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
> +
>  #include 
> 
> +#include 
> +#include 
> +#include 
> +#include 
> +#ifdef ODP_SCHEDULE_SCALABLE
> +#include 
> +#include 
> +#endif
> +
>  #define QUEUE_MULTI_MAX CONFIG_BURST_SIZE
> 
>  #define QUEUE_STATUS_FREE 0
> @@ -37,8 +45,6 @@ extern "C" {
>  #define QUEUE_STATUS_NOTSCHED 3
>  #define QUEUE_STATUS_SCHED4
> 
> -
> -/* forward declaration */
>  union queue_entry_u;
> 
>  typedef int (*enq_func_t)(union queue_entry_u *, odp_buffer_hdr_t *);
> @@ -49,6 +55,47 @@ typedef int (*enq_multi_func_t)(union queue_entry_u *,
>  typedef  int (*deq_multi_func_t)(union queue_entry_u *,
>   odp_buffer_hdr_t **, int);
> 
> +#ifdef ODP_SCHEDULE_SCALABLE

The queue interface still needs to be improved so that there is no
conditional compilation like this piece of code here.
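
I.e. the same pattern that schedule_fn_t already uses: a function table
selected at runtime instead of an #ifdef in a shared header. Roughly
(queue_fn_t and its members are illustrative names only):

    /* Sketch: keep implementation-specific queue behaviour behind a
     * function table chosen once at init time, so both queue
     * implementations can live in the same build. */
    typedef struct {
            int (*enq)(union queue_entry_u *queue,
                       odp_buffer_hdr_t *buf_hdr);
            int (*deq_multi)(union queue_entry_u *queue,
                             odp_buffer_hdr_t **buf_hdr, int num);
    } queue_fn_t;

    extern const queue_fn_t queue_default_fn;
    extern const queue_fn_t queue_scalable_fn;

    /* Selected in odp_init_global(), e.g. based on configuration. */
    extern const queue_fn_t *queue_fn;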


> +
> +struct queue_entry_s {
> + sched_elem_t sched_elem;
> +
> + odp_ticketlock_t lock ODP_ALIGNED_CACHE;
> + int  status;
> +
> + enq_func_t   enqueue ODP_ALIGNED_CACHE;
> + deq_func_t   dequeue;
> + enq_multi_func_t enqueue_multi;
> + deq_multi_func_t dequeue_multi;
> +
> + uint32_t   index;
> + odp_queue_thandle;
> + odp_queue_type_t   type;
> + odp_queue_param_t  param;
> + odp_pktin_queue_t  pktin;
> + odp_pktout_queue_t pktout;
> + char   name[ODP_QUEUE_NAME_LEN];
> +};
> +
> +int _odp_queue_deq(sched_elem_t *q, odp_buffer_hdr_t *buf_hdr[], int num);
> +int _odp_queue_deq_sc(sched_elem_t *q, odp_event_t *evp, int num);
> +int _odp_queue_deq_mc(sched_elem_t *q, odp_event_t *evp, int num);
> +
> +/* Round up memory size to next cache line size to
> + * align all memory addresses on cache line boundary.
> + */
> +static inline void *shm_pool_alloc_align(_odp_ishm_pool_t *pool, uint32_t size)
> +{
> + void *addr;
> +
> + addr = _odp_ishm_pool_alloc(pool, ROUNDUP_CACHE_LINE(size));
> + ODP_ASSERT(((uintptr_t)addr & (ODP_CACHE_LINE_SIZE - 1)) == 0);
> +
> + return addr;
> +}
> +
> +#else




> diff --git a/platform/linux-generic/odp_packet_io.c b/platform/linux-
> generic/odp_packet_io.c
> index d8cae15c..717e7e33 100644
> --- a/platform/linux-generic/odp_packet_io.c
> +++ b/platform/linux-generic/odp_packet_io.c
> @@ -5,23 +5,25 @@
>   */
>  #include 
> 
> -#include 
> -#include 
> -#include 
>  #include 
> -#include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
> +
> +#include 
>  #include 
> -#include 
> -#include 
> -#include 
>  #include 
> +
> +#include 
> +#include 
>  #include 
> -#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> 
>  #include 
>  #include 
> @@ -472,7 +474,6 @@ int odp_pktio_start(odp_pktio_t hdl)
>   return -1;
>   }
>   }
> -
>   sched_fn->pktio_start(pktio_to_id(hdl), num, index);
>   }
> 
> @@ -554,7 +555,6 @@ static inline int pktin_recv_buf(odp_pktin_queue_t
> queue,
>   odp_packet_t packets[num];
>   odp_packet_hdr_t *pkt_hdr;
>   odp_buffer_hdr_t *buf_hdr;
> - odp_buffer_t buf;
>   int i;
>   int pkts;
>   int num_rx = 0;
> @@ -564,9 +564,7 @@ static inline int pktin_recv_buf(odp_pktin_queue_t
> queue,
>   for (i = 0; i < pkts; i++) {
>   pkt = packets[i];
>   pkt_hdr = odp_packet_hdr(pkt);
> - buf = _odp_packet_to_buffer(pkt);
> - buf_hdr = buf_hdl_to_hdr(buf);
> -
> + buf_hdr = buf_hdl_to_hdr(_odp_packet_to_buffer(pkt));


Do not include unnecessary style changes like this one (and the one above) in
this patch. They are just cosmetic and obscure the functional change you are
trying to achieve. They waste reviewer time: one has to figure out why the
change was needed, only to realize that it was not.

Also e.g. in these cases, I like the original style better.



>   if (pkt_hdr->p.input_flags.dst_queue) {
>   queue_entry_t *dst_queue;
>   int ret;
> @@ -584,11 +582,12 @@ static inline int pktin_recv_buf(odp_pktin_queue_t
> queue,
> 
>  int pktout_enqueue(queue_entry_t *qentry, odp_buffer_hdr_t *buf_hdr)
>  {
> - odp_packet_t pkt = _odp_packet_from_buffer(buf_hdr->handle.handle);
> + odp_packet_t pkt;
>   int len = 1;
>   int nbr;
> 
> - nbr = odp_pktout_send(qentry->s.pktout, &pkt, len);
> + pkt = _odp_packet_from_buffer(buf_hdr->handle.handle);

[lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-22 Thread Brian Brooks
Signed-off-by: Brian Brooks 
Signed-off-by: Kevin Wang 
Signed-off-by: Honnappa Nagarahalli 
Signed-off-by: Ola Liljedahl 
---
 platform/linux-generic/Makefile.am |   15 +-
 .../include/odp/api/plat/schedule_types.h  |4 +-
 .../linux-generic/include/odp_config_internal.h|   17 +-
 .../linux-generic/include/odp_queue_internal.h |  132 +-
 platform/linux-generic/include/odp_schedule_if.h   |   23 +-
 .../linux-generic/include/odp_schedule_scalable.h  |  137 ++
 .../include/odp_schedule_scalable_config.h |   42 +
 .../include/odp_schedule_scalable_ordered.h|  131 ++
 platform/linux-generic/m4/odp_schedule.m4  |   55 +-
 platform/linux-generic/odp_classification.c|4 +-
 platform/linux-generic/odp_packet_io.c |   71 +-
 platform/linux-generic/odp_queue.c |2 +-
 platform/linux-generic/odp_queue_scalable.c|  946 ++
 platform/linux-generic/odp_schedule_if.c   |   36 +-
 platform/linux-generic/odp_schedule_scalable.c | 1942 
 .../linux-generic/odp_schedule_scalable_ordered.c  |  280 +++
 platform/linux-generic/odp_traffic_mngr.c  |7 +-
 17 files changed, 3756 insertions(+), 88 deletions(-)
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable_config.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable_ordered.h
 create mode 100644 platform/linux-generic/odp_queue_scalable.c
 create mode 100644 platform/linux-generic/odp_schedule_scalable.c
 create mode 100644 platform/linux-generic/odp_schedule_scalable_ordered.c

diff --git a/platform/linux-generic/Makefile.am 
b/platform/linux-generic/Makefile.am
index 7ad63d59..5290993a 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -173,6 +173,9 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_queue_internal.h \
  ${srcdir}/include/odp_ring_internal.h \
  ${srcdir}/include/odp_schedule_if.h \
+ ${srcdir}/include/odp_schedule_scalable.h \
+ ${srcdir}/include/odp_schedule_scalable_config.h \
+ ${srcdir}/include/odp_schedule_scalable_ordered.h \
  ${srcdir}/include/odp_sorted_list_internal.h \
  ${srcdir}/include/odp_shm_internal.h \
  ${srcdir}/include/odp_time_internal.h \
@@ -227,13 +230,9 @@ __LIB__libodp_linux_la_SOURCES = \
   pktio/ring.c \
   odp_pkt_queue.c \
   odp_pool.c \
-  odp_queue.c \
   odp_rwlock.c \
   odp_rwlock_recursive.c \
-  odp_schedule.c \
   odp_schedule_if.c \
-  odp_schedule_sp.c \
-  odp_schedule_iquery.c \
   odp_shared_memory.c \
   odp_sorted_list.c \
   odp_spinlock.c \
@@ -262,6 +261,14 @@ if ARCH_IS_X86
 __LIB__libodp_linux_la_SOURCES += arch/@ARCH_DIR@/cpu_flags.c
 endif
 
+if ODP_SCHEDULE_SCALABLE
+__LIB__libodp_linux_la_SOURCES += odp_queue_scalable.c odp_schedule_scalable.c \
+ odp_schedule_scalable_ordered.c
+else
+__LIB__libodp_linux_la_SOURCES += odp_queue.c odp_schedule.c odp_schedule_sp.c \
+ odp_schedule_iquery.c
+endif
+
 if HAVE_PCAP
 __LIB__libodp_linux_la_SOURCES += pktio/pcap.c
 endif
diff --git a/platform/linux-generic/include/odp/api/plat/schedule_types.h 
b/platform/linux-generic/include/odp/api/plat/schedule_types.h
index 535fd6d0..4e75f9ee 100644
--- a/platform/linux-generic/include/odp/api/plat/schedule_types.h
+++ b/platform/linux-generic/include/odp/api/plat/schedule_types.h
@@ -18,6 +18,8 @@
 extern "C" {
 #endif
 
+#include 
+
 /** @addtogroup odp_scheduler
  *  @{
  */
@@ -44,7 +46,7 @@ typedef int odp_schedule_sync_t;
 typedef int odp_schedule_group_t;
 
 /* These must be kept in sync with thread_globals_t in odp_thread.c */
-#define ODP_SCHED_GROUP_INVALID -1
+#define ODP_SCHED_GROUP_INVALID ((odp_schedule_group_t)-1)
 #define ODP_SCHED_GROUP_ALL 0
 #define ODP_SCHED_GROUP_WORKER  1
 #define ODP_SCHED_GROUP_CONTROL 2
diff --git a/platform/linux-generic/include/odp_config_internal.h 
b/platform/linux-generic/include/odp_config_internal.h
index dadd59e7..6cc844f3 100644
--- a/platform/linux-generic/include/odp_config_internal.h
+++ b/platform/linux-generic/include/odp_config_internal.h
@@ -7,9 +7,7 @@
 #ifndef ODP_CONFIG_INTERNAL_H_
 #define ODP_CONFIG_INTERNAL_H_
 
-#ifdef __cplusplus
-extern "C" {
-#endif
+#include 
 
 /*
  * Maximum number of pools
@@ -22,6 +20,13 @@ extern "C" {
 #define