That's exactly what metadata is for. This particular piece of metadata would be settable only by the scheduler, so there would be no application-accessible setter for it, but I see no harm in making the getter available. That would also potentially eliminate the need for the second argument to odp_schedule(), since the application could always retrieve the source queue from the returned buffer if desired.
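
To make the idea concrete, here is a rough sketch of the getter-only arrangement. Every name in it (mock_buffer_t, mock_queue_enq(), mock_schedule(), mock_buffer_last_queue()) is a hypothetical stand-in invented for illustration, not part of the ODP API, and the one-slot "scheduler" is only there so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; none of this is ODP API. */
typedef int mock_queue_t;
#define MOCK_QUEUE_INVALID (-1)

typedef struct {
    int payload;
    mock_queue_t last_queue; /* metadata written only by the scheduler */
} mock_buffer_t;

/* Toy scheduler state: a single pending buffer and the queue it sits on. */
static mock_buffer_t *pending_buf = NULL;
static mock_queue_t pending_queue = MOCK_QUEUE_INVALID;

/* Enqueue: the application names only the target queue and the buffer. */
void mock_queue_enq(mock_queue_t queue, mock_buffer_t *buf)
{
    assert(buf != NULL);
    pending_buf = buf;
    pending_queue = queue;
}

/* Schedule: no output argument for the source queue. The scheduler
 * records the source queue in the buffer metadata before returning it. */
mock_buffer_t *mock_schedule(void)
{
    mock_buffer_t *buf = pending_buf;

    if (buf != NULL) {
        buf->last_queue = pending_queue;
        pending_buf = NULL;
        pending_queue = MOCK_QUEUE_INVALID;
    }
    return buf;
}

/* Getter exposed to the application; deliberately no matching setter. */
mock_queue_t mock_buffer_last_queue(const mock_buffer_t *buf)
{
    assert(buf != NULL);
    return buf->last_queue;
}
```

An application that cares about the source queue would call mock_buffer_last_queue() on whatever mock_schedule() returned; one that doesn't care simply never asks.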

On Wed, Oct 15, 2014 at 2:14 PM, Ola Liljedahl <[email protected]> wrote:

> I think the ODP implementation is supposed to remember the source queue
> for an atomically scheduled packet. Same for ordered queues. No need to
> expose this to the application and less risk of doing it wrong.
>
> -- Ola
>
> On 15 October 2014 20:57, Bill Fischofer <[email protected]> wrote:
>
>> Thanks, Alex. This does imply that we're missing a critical bit of
>> buffer metadata. Since odp_queue_enq() only specifies a target queue
>> and buffer, there needs to be some way to relate this call to the
>> queue that the buffer was previously sourced from so that the
>> atomic/ordered semantics can be maintained.
>>
>> Should we have a last_queue field in the buffers that gets set by
>> odp_schedule() and referenced as part of subsequent enq operations?
>>
>> Bill
>>
>> On Wed, Oct 15, 2014 at 1:44 PM, Alexandru Badicioiu
>> <[email protected]> wrote:
>>
>>> Bill, I have the same understanding as yours regarding these
>>> aspects. Free calls should be aware of the source queue of a buffer
>>> to inform the scheduler that the context should be released too.
>>> The same for enqueue calls. I'm not sure of the use of explicit
>>> release context calls - eventually any buffer/event/packet (i.e.
>>> entity delivered by the scheduler) would be enqueued or freed so
>>> the scheduler will be informed.
>>>
>>> Alex
>>>
>>> On 15 October 2014 20:23, Bill Fischofer <[email protected]>
>>> wrote:
>>>
>>>> If I call odp_schedule() and get back an event associated with an
>>>> atomic queue, my understanding is that I owe the implementation a
>>>> subsequent call to odp_schedule_release_atomic() to dispose of it.
>>>> Similarly, if I receive an event from an ordered queue, I need to
>>>> enq that event somewhere else or otherwise there will be a gap in
>>>> the downstream order that will cause a stall.
>>>>
>>>> The interplay between queues and the scheduler is why we need a
>>>> design that spells this out in detail. That's where what is and is
>>>> not API vs. implementation is also spelled out. Right now my
>>>> understanding is we have the following types of queues and their
>>>> scheduling implications/interactions:
>>>>
>>>> - Parallel: Anything on the queue can be given to anyone without
>>>>   restriction. There are no restrictions relating to subsequent
>>>>   event disposal or downstream processing.
>>>>
>>>> - Atomic: Anyone can get something from a queue, but once they get
>>>>   it the queue is not able to give out subsequent events to anyone
>>>>   else until the queue is either explicitly (via
>>>>   odp_schedule_release_atomic()) or implicitly (via a subsequent
>>>>   enq of the event to some other queue) made available for
>>>>   rescheduling.
>>>>
>>>> - Ordered: Anything on the queue can be given to anyone without
>>>>   restriction, however subsequent enqs of those events onto other
>>>>   queues must be order preserving. This implies that if an
>>>>   application wishes to dispose of an event without a subsequent
>>>>   enq it needs to inform the scheduler of this to prevent
>>>>   downstream stalls. So it appears there needs to be an ordered
>>>>   equivalent to odp_schedule_release_atomic() that we currently
>>>>   don't have.
>>>>
>>>> Is this understanding correct? If not what is the correct way to
>>>> view this? In any event, we just need to write this out and get
>>>> agreement as to what the meanings and conventions are associated
>>>> with this.
>>>>
>>>> On Wed, Oct 15, 2014 at 11:42 AM, Ola Liljedahl
>>>> <[email protected]> wrote:
>>>>
>>>>> It should be part of the implementation. But it is not. Because
>>>>> prescheduled events might not be processed if the thread the
>>>>> events are prefetched to stops calling schedule() and processing
>>>>> those events. And the corresponding queues (if atomic) will be
>>>>> locked forever... Also problems with ordered queues as later
>>>>> packets will also be stalled until those prefetched packets are
>>>>> released.
>>>>>
>>>>> On 15 October 2014 17:54, Bill Fischofer
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Whether or not events are prefetched is an implementation
>>>>>> consideration that is part of the implementation, not the
>>>>>> application, no? Again, I don't think this is something we need
>>>>>> to worry about for ODP v1.0. It should be properly addressed in
>>>>>> a wider context post-v1.0.
>>>>>>
>>>>>> On Wed, Oct 15, 2014 at 10:51 AM, Ola Liljedahl
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> What if a thread wants to stop consuming and processing events?
>>>>>>> We don't (can't) leave some events prescheduled (and stashed in
>>>>>>> some per-core "portal") after the thread has stopped
>>>>>>> processing. So a thread must be able to stop prefetching and
>>>>>>> then consume and process all remaining (prefetched) events
>>>>>>> before it completely stops processing. How would this work on
>>>>>>> Freescale or TI ODP implementations?
>>>>>>>
>>>>>>> On 15 October 2014 15:54, Savolainen, Petri (NSN - FI/Espoo)
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>>> System will deadlock if your application decides to step out
>>>>>>>> from the schedule loop, and a throughput optimized scheduler
>>>>>>>> has already pre-scheduled a number of buffers to that core
>>>>>>>> (== locked a number of atomic queues).
>>>>>>>>
>>>>>>>> Application has to be sure that scheduler has not locked
>>>>>>>> anything for that core before stepping out of the schedule
>>>>>>>> loop. Typically, it’s impossible for the HW scheduler to
>>>>>>>> rewind scheduling decision afterwards (when application tells
>>>>>>>> it wants to exit).
>>>>>>>>
>>>>>>>> -Petri
>>>>>>>>
>>>>>>>> *From:* ext Bill Fischofer [mailto:[email protected]]
>>>>>>>> *Sent:* Wednesday, October 15, 2014 4:38 PM
>>>>>>>> *To:* Savolainen, Petri (NSN - FI/Espoo)
>>>>>>>> *Cc:* ext Alexandru Badicioiu; Ola Liljedahl;
>>>>>>>> [email protected]
>>>>>>>> *Subject:* Re: [lng-odp] odp_schedule() vs. odp_schedule_one()
>>>>>>>>
>>>>>>>> It's not clear why you'd want to expose implementation
>>>>>>>> considerations through the API. That's what DPDK does and it
>>>>>>>> gets them into all sorts of portability trouble.
>>>>>>>> odp_schedule() is how a thread discovers the next thing it's
>>>>>>>> supposed to do. From that standpoint there doesn't appear to
>>>>>>>> be any application-visible distinction between odp_schedule()
>>>>>>>> and odp_schedule_one(). In both cases, the application gets a
>>>>>>>> buffer, as well as the queue it was drawn from. That's all the
>>>>>>>> application needs to know--everything else is
>>>>>>>> behind-the-scenes implementation mechanics that will vary from
>>>>>>>> platform to platform.
>>>>>>>>
>>>>>>>> On Wed, Oct 15, 2014 at 8:13 AM, Savolainen, Petri
>>>>>>>> (NSN - FI/Espoo) <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> It’s not only push vs pull. It can be also “pull many” vs
>>>>>>>> “pull one”. Alex, I think your HW supports both: pull many or
>>>>>>>> pull only one.
>>>>>>>>
>>>>>>>> Global scheduling == SoC level scheduling, not scheduling from
>>>>>>>> e.g. per core level stash of (pre-scheduled) buffers/queues.
>>>>>>>>
>>>>>>>> The first goal of the function is to streamline application
>>>>>>>> main loop when application has to step out of the schedule
>>>>>>>> loop often (e.g. in addition to ODP scheduler, poll a third
>>>>>>>> party lib). So instead of ...
>>>>>>>>
>>>>>>>> main_odp_loop
>>>>>>>> {
>>>>>>>>     odp_schedule_resume()
>>>>>>>>
>>>>>>>>     buf = odp_schedule(...)
>>>>>>>>
>>>>>>>>     <process it>
>>>>>>>>
>>>>>>>>     odp_schedule_pause()
>>>>>>>>
>>>>>>>>     while ( (buf = odp_schedule(...)) != INVALID)
>>>>>>>>     {
>>>>>>>>         <process it>
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     odp_schedule_release_atomic()
>>>>>>>>
>>>>>>>>     return
>>>>>>>> }
>>>>>>>>
>>>>>>>> ... you can do ...
>>>>>>>>
>>>>>>>> main_odp_loop
>>>>>>>> {
>>>>>>>>     buf = odp_schedule_one(...)
>>>>>>>>
>>>>>>>>     <process it>
>>>>>>>>
>>>>>>>>     odp_schedule_release_atomic()
>>>>>>>>
>>>>>>>>     return
>>>>>>>> }
>>>>>>>>
>>>>>>>> The second goal is to optimize for QoS response time. It could
>>>>>>>> be handled with another call that tells ODP to optimize for
>>>>>>>> QoS instead of throughput.
>>>>>>>>
>>>>>>>> -Petri
>>>>>>>>
>>>>>>>> From: [email protected]
>>>>>>>> [mailto:[email protected]] On Behalf Of ext Alexandru
>>>>>>>> Badicioiu
>>>>>>>> Sent: Wednesday, October 15, 2014 3:52 PM
>>>>>>>> To: Ola Liljedahl
>>>>>>>> Cc: [email protected]
>>>>>>>> Subject: Re: [lng-odp] odp_schedule() vs. odp_schedule_one()
>>>>>>>>
>>>>>>>> The documentation suggests that these two calls can be used in
>>>>>>>> the same application which may be a problem also for platforms
>>>>>>>> which do support both modes, but not at the same time or
>>>>>>>> without re-initialization, re-configuration, etc. By modes I
>>>>>>>> mean PUSH (odp_schedule()), when the scheduler runs
>>>>>>>> independently of the application and pushes frames to the
>>>>>>>> application, and PULL (odp_schedule_one()) when the scheduler
>>>>>>>> runs when the application decides and the application pulls
>>>>>>>> the frames from the scheduler.
>>>>>>>> Also the term "global scheduling" is confusing and may not
>>>>>>>> reflect the reality of the HW.
>>>>>>>>
>>>>>>>> Alex
>>>>>>>>
>>>>>>>> On 15 October 2014 15:15, Ola Liljedahl
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>>  * Schedule one buffer
>>>>>>>>  *
>>>>>>>>  * Like odp_schedule(), but is guaranteed to schedule only one
>>>>>>>>  * buffer at a time. Each call will perform global scheduling
>>>>>>>>  * and will reserve one buffer per thread in maximum. When
>>>>>>>>  * called after other schedule functions, returns locally
>>>>>>>>  * stored buffers (if any) first, and then continues in the
>>>>>>>>  * global scheduling mode.
>>>>>>>>  *
>>>>>>>>  * This function optimises priority scheduling (over
>>>>>>>>  * throughput).
>>>>>>>>
>>>>>>>> As Taras commented, some implementations will not be able to
>>>>>>>> truly schedule only one event at a time. Scheduler
>>>>>>>> implementations could use a pipelined design where events are
>>>>>>>> scheduled in advance so that the next event can be prefetched
>>>>>>>> while the current event is being processed. This will limit
>>>>>>>> concurrent processing (e.g. an idle core could have received
>>>>>>>> that second event and processed it concurrently, which would
>>>>>>>> have reduced latency for that event).
>>>>>>>>
>>>>>>>> odp_schedule_one() has the same functionality as
>>>>>>>> odp_schedule(). However it is supposed to guarantee only one
>>>>>>>> event at a time is scheduled in order to prioritize latency to
>>>>>>>> the potential detriment of throughput.
>>>>>>>>
>>>>>>>> We question whether odp_schedule_one() actually has to
>>>>>>>> guarantee only one event at a time. The functionality provided
>>>>>>>> is the same for these two calls. One call is focused on
>>>>>>>> throughput (and minimizing overhead, e.g. by allowing
>>>>>>>> prescheduling and prefetching), the other is focused on
>>>>>>>> latency (at the cost of overhead). An ODP implementation could
>>>>>>>> use the same implementation for both functions (some ODP
>>>>>>>> implementations will always schedule events in advance, other
>>>>>>>> implementations will always only schedule one event at a
>>>>>>>> time). odp_schedule_one() just hints to the ODP implementation
>>>>>>>> that latency and concurrent processing are more important, but
>>>>>>>> this is not a strict requirement.
>>>>>>>>
>>>>>>>> Maybe we only need one schedule call and possibly use a
>>>>>>>> different mechanism to hint the ODP scheduler whether to
>>>>>>>> optimize for throughput (e.g. preschedule/prefetch) or
>>>>>>>> latency.
>>>>>>>>
>>>>>>>> --Ola
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> lng-odp mailing list
>>>>>>>> [email protected]
>>>>>>>> http://lists.linaro.org/mailman/listinfo/lng-odp
