> >>              DP_PACKET_BATCH_FOR_EACH (packet, packets_) {
> >> +                p->output_pkts_rxqs[dp_packet_batch_size(&p->output_pkts)] =
> >> +                                                            pmd->ctx.last_rxq;
> >>                  dp_packet_batch_add(&p->output_pkts, packet);
> >>              }
> >>              return;
> >> -------------------------------------------------------------------
> >>
> >> What do you think?
> >>
> 
> Thanks Ilya. I didn't test or check the cycle counts, but on first review
> it looks good. I had something similar in mind about saving the rxqs
> that contributed packets. I'm not sure if there's any performance hit
> from adding the cycles used on a per-packet basis - if so, we could do
> something a bit less accurate. In my original email I also missed the
> case where a flush is done with packets from multiple rxqs but the
> cycles are attributed solely to one rxq; this patch fixes that too.
> 
> I'm not sure yet how much rework this means for the patches Jan sent for
> PMD debug or the ones I sent for balance stats. It seems a lot of people
> are finishing up for the holidays in the next day or two, so I think we
> have some more time for discussion and testing before it is merged anyway.
> 
> btw, Bhanu/Ilya thanks for tx batching :-)
> 
> Kevin.
> 

I also had a quick glance at the incremental patch and it looks correct: it 
addresses both errors, the wrongly attributed and the entirely unaccounted tx 
cycles. I haven't tested it, though, and have no feedback regarding the 
performance penalty, if any.
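
For context, the flush-side counterpart to the stored per-packet rxqs could 
look roughly like the sketch below. This is only an illustration of the idea, 
not the actual patch; the helper rxq_add_proc_cycles and the variables 
tx_cycles and output_cnt are placeholders, not OVS API.

    /* Hypothetical sketch: split the measured send cycles evenly over
     * the packets and attribute them back to the rx queues that were
     * recorded per packet at enqueue time. */
    uint64_t cycles_per_pkt = tx_cycles / output_cnt;
    int i;

    for (i = 0; i < output_cnt; i++) {
        if (p->output_pkts_rxqs[i]) {
            /* Placeholder for whatever per-rxq cycle counter update
             * the real patch performs. */
            rxq_add_proc_cycles(p->output_pkts_rxqs[i], cycles_per_pkt);
        }
    }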

I had a cheaper solution in mind that accepts the inaccuracy of accounting tx 
cycles during rx batch processing (which statistically biases the load 
distribution slightly towards the more heavily loaded rx queues). Instead of 
attributing the tx cycles spent outside the rx batching based on a stored rx 
queue per packet, I would simply estimate the actual processing cycles per rx 
queue per interval by distributing the PMD's total processing cycles in the 
interval over the polled rx queues, weighted by their counted processing 
cycles (see the sketch below). After all, Kevin's algorithm only needs 
comparable relative rx queue loads across the PMDs to function. In this 
context I would also suggest excluding idle polling cycles from the algorithm, 
because they are not load related and always eat up the PMD's total cycle 
budget anyhow.
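
To illustrate the estimation, here is a minimal sketch (the function name and 
the cycle arrays are made up for the example, not OVS code). It scales each rx 
queue's counted cycles so that the estimates sum to the PMD's busy, i.e. 
non-idle, cycles for the interval:

    /* Hypothetical sketch: distribute the PMD's total busy cycles over
     * the polled rx queues, weighted by their counted processing
     * cycles.  Idle polling cycles are excluded up front by passing
     * only the busy portion of the interval. */
    static void
    estimate_rxq_cycles(uint64_t *rxq_cycles, int n_rxqs,
                        uint64_t pmd_busy_cycles)
    {
        uint64_t counted_sum = 0;
        int i;

        for (i = 0; i < n_rxqs; i++) {
            counted_sum += rxq_cycles[i];
        }
        if (!counted_sum) {
            return; /* No measured load; all estimates stay zero. */
        }
        for (i = 0; i < n_rxqs; i++) {
            rxq_cycles[i] = pmd_busy_cycles * rxq_cycles[i] / counted_sum;
        }
    }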

Finally, the order of merging our patches matters. I'm sure my PMD performance 
metrics refactoring conflicts with both Ilya's and Kevin's latest changes. I 
don't know which order will cause the least rework. But we should decide on an 
order and work accordingly to avoid unnecessary churn.

BR, Jan