Hi Honnappa,

Thanks a lot for the detailed explanation. So I understand there is already a
plan to improve the linux-generic timer, and that I can also expect a hardware
timer implementation.

To answer your questions, here is some information about our project:

1. It is a multicore network firewall that targets Intel x86 (with DPDK) and
Cavium ARM (with Octeon NP hardware support) platforms. It can also run in a
VM environment with ODP DPDK support.
2. The number of cores varies by platform, from 2 worker cores to 32+ worker
cores.

And of course, it would be nice if I could help by putting more thought and
practice into improving the timer system.

Best Regards,
 
Mario


-----Original Message-----
From: Honnappa Nagarahalli [mailto:[email protected]] 
Sent: April 8, 2017 1:13
To: Bill Fischofer <[email protected]>
Cc: Mario (Miao) Mao <[email protected]>; Zhong Chen <[email protected]>; 
[email protected]
Subject: Re: [lng-odp] linux-gen: timer: ODP general timer timeout machenism on 
massive timers usage

On 7 April 2017 at 10:33, Bill Fischofer <[email protected]> wrote:
> Thanks for your note Mario. As an aside, please subscribe to the ODP 
> mailing list otherwise I have to approve each post you make manually, 
> which will only add delay. Please see 
> https://lists.linaro.org/mailman/listinfo/lng-odp for that. We also 
> have a weekly public call on Tuesdays at 15:00 UTC (that's 8:00AM 
> PDT). Just go to http://meetings.opendataplane.org to join us.
>
>
> On Fri, Apr 7, 2017 at 3:24 AM, Mario (Miao) Mao <[email protected]> wrote:
>> Hi All,
>>
>> We have a project to adapt an existing firewall implementation to the ODP
>> framework. The current implementation needs to create a massive number of
>> timers, one linked to each flow element. So there is a requirement to run up
>> to a million timers to trigger the flow events.
>>
Definitely a very good problem to solve for timer implementation in ODP.
Currently we are working on improving the timer implementation. Please take a 
look at the presentation I made at BUD17:
http://connect.linaro.org/resource/bud17/bud17-320/.
It talks about the current state, how the problem can be divided and the work 
we are doing to address some parts of the issue.

>> A straightforward solution for us is to use the ODP timer API directly for
>> the adaptation. We need to create an individual timer for each flow element
>> and attach a timeout event to it. The advantage is that the ODP event
>> scheduling queue mechanism helps with lockless flow processing, because the
>> timeout event is placed into the same queue as the packets belonging to that
>> flow. The prototype of this solution works, and it fits our original
>> processing flow perfectly.
>>

>> However, after carefully checking the ODP timer trigger mechanism, we found
>> that there could be a performance issue when a massive number of flow timers
>> is created. The reason is that the current ODP timer implementation seems to
>> run a full loop over all active timers, regardless of whether a timer will
>> expire in the coming tick. So in the worst case, the timer_thread needs to
>> check a million timer tick buffer entries in each resolution interval. I
>> worry that this will hurt system performance or cause timer events to be
>> lost.
>
> odp-linux is a general reference implementation designed for clarity 
> rather than performance. Since ODP supports multiple implementations, 
> the expectation is that each platform will provide a conforming 
> implementation that is optimized to it. For example, on platforms that 
> have a dedicated hardware timer engine the ODP APIs would be mapped 
> directly to that hardware by the implementation.
>
> For general production-grade operation we are developing an odp-cloud 
> implementation that is intended to be far more scalable and efficient.
> You're welcome to participate in that definition/development.
>
>>
>> So, as a general thought, would it be better to distribute active timer
>> entries across different tick buckets instead of putting them all in one
>> large tick buffer pool? In each resolution interval, the timer system would
>> check only the expired tick bucket and flush the timeout events attached to
>> the expired timers. For example, with a resolution of 1/10 second and
>> relative timeouts of up to 120 seconds, a bucket table of size 1200 is
>> needed, and the work per tick could be reduced to roughly 1/1200 of the
>> current mechanism, assuming expirations are spread evenly. The trade-off is
>> the extra memory for the tick bucket table plus a more complex timer
>> add/reset algorithm than the current one.
>
> That may be a promising approach and sounds similar to the work 
> currently ongoing on providing an optimized scalable software 
> scheduler intended to be one of the centerpieces of odp-cloud. Please 
> feel free to develop these thoughts further and join us to help 
> realize them.
>
This is one of the parts that was on our road-map to optimize. Can you provide 
more information about your system? Can I assume that it is a multi-core 
general CPU kind of environment? How many cores do you expect in the system?

>>
>> Let me know if I have misunderstood the current implementation. Thanks a
>> lot for any comments.
>>
>> Mario Mao
>> Software Engineer
>> SonicWALL
>> office +86 21 65100909 Ext: 42415
>>
