On 12/05/2021 03:11, Flavio Leitner wrote:
> Hi,
> 
> On Fri, Apr 30, 2021 at 11:31:26AM -0400, Mark Gray wrote:
>> This series proposes a new method of distributing upcalls
>> to user space threads attempting to resolve a number of
>> issues with the current method.
>>
> 
> I ran some tests with the old v10, current master and this RFC,
> including the kernel (based on 5.11.0), on a 28-core system.
> 

Thanks Flavio

> The old v10 had the issue of not scaling up in case of a high
> load of upcalls. The test sends a burst of UDP packets which
> causes upcalls. The table below shows how many packets could
> be sent without increasing the upcall loss counter.
>                v10       master     rfc
> packets        2k5       >55k       10k
> 
> So, it reproduced the same old v10 value. For branch master,
> the limit could not be determined due to a test limitation. It is
> at least above 55k (last time I think it was 63k). The RFC patch
> resulted in a better number than v10, though the test should be
> using only one thread, as with v10. I think that keeping the CPU
> context could explain the difference.

As this patch distributes packets to different kernel space threads (and
hence user space threads) based on a flow hash, a single flow will only
ever be handled by one user space thread. I think this is what you are
seeing here? "master" currently spreads a single flow across multiple
user space threads, which performs better in this test, but it means
that upcalls can be processed out of order, which is incorrect and
undesired. I think the RFC behaviour is ok because in real-world
scenarios there will always be multiple flows, so they will get
distributed between user space threads. A single flow consuming the
throughput of a single thread is probably only going to be seen in
benchmarks?
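
To illustrate the point about hash-based distribution, here is a minimal
sketch. It is not the kernel or ovs-vswitchd code: pick_handler(), the
5-tuple layout and the use of CRC32 are all assumptions made for the
example, and the hash and dispatch logic in the actual patches differ.
It only shows that a stable per-flow hash pins one flow to one handler,
while many flows spread across handlers.

```python
import zlib
from collections import Counter

N_HANDLERS = 8  # hypothetical number of user space handler threads


def pick_handler(flow_key: tuple, n_handlers: int) -> int:
    """Map a flow key to a handler index with a stable hash.

    Hypothetical helper: CRC32 stands in for whatever flow hash the
    datapath really uses.
    """
    data = "|".join(map(str, flow_key)).encode()
    return zlib.crc32(data) % n_handlers


# One flow (same 5-tuple) sent 1000 times: every upcall hits one handler,
# so ordering within the flow is preserved but only one thread is busy.
single_flow = ("10.0.0.1", "10.0.0.2", 17, 5000, 6000)
one = Counter(pick_handler(single_flow, N_HANDLERS) for _ in range(1000))

# 1000 distinct flows (varying source port): upcalls spread over handlers.
many_flows = [("10.0.0.1", "10.0.0.2", 17, 5000 + i, 6000) for i in range(1000)]
many = Counter(pick_handler(f, N_HANDLERS) for f in many_flows)

print("single flow, handlers used:", len(one))
print("many flows, handlers used:", len(many))
```

The single-flow case corresponds to the one-thread burst test above, and
the many-flows case is closer to what I would expect in real traffic.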
> 
> Running the test with 8 parallel threads sending one burst of
> UDP packets each resulted in the following table:
>   Branch   missed       lost   
>    v10     52018        50288
>   master   52022        0
>    RFC     52021        0
> 

This looks good!

> Now the wake ups, one thread:
>   Branch   wake    processing
>   master   20+       16+
>    RFC     3         1
> 

This looks great!

> Column wake: number of different threads receiving
>        sched:sched_wakeup or irq:softirq_entry.
> Column processing: number of CPUs with double digits
>        usage.
> 
> And 8 parallel threads:
>   Branch   wake    processing
>   master   20+       20+
>    RFC     10        8+
> 
> The results show that this new patch-set addressed the main
> thundering herd issue and the scalability issue I reported
> during V10 review.

Great!

> 
> Unfortunately, I can only review the patches next week.
> 

No problem. Thanks again for the independent benchmarking.
> Thanks,
> fbl
> 
