On 12/21/2017 12:27 PM, Jan Scheurich wrote:
>>>> I had a cheaper solution in mind that would accept the inaccuracy
>>>> of the accounting of tx cycles during rx batch processing (it
>>>> statistically slightly biases the load distribution towards the
>>>> more heavily loaded rx queues). And instead of attributing the tx
>>>> cycles outside the rx batching based on a stored rx queue per
>>>> packet, I would simply estimate the actual processing cycles per
>>>> rx queue per interval by distributing the total PMD processing
>>>> cycles in the interval over the polled rx queues, weighted by
>>>> their counted processing cycles.
>>>
>>> I'm not sure I fully understand your idea, but batches will likely
>>> be sent on one of the next polling cycles, not on the cycle where
>>> they were received. Moreover, send cycles are very different for
>>> different port types, and packets from a single rx queue may have
>>> different destinations (different types of output ports). I'm not
>>> sure it is possible to correctly count cycles in this case.
>>
>> I agree, tx processing can be very different depending on
>> destination, especially with larger packet sizes where some paths
>> may have a copy.
>
> I understand. There can be a big difference between physical and
> vhostuser ports, especially if the copy passes over the QPI. If the
> per-packet tx accounting is not causing too much overhead, it would
> be best.
>
> My suggestion for simplification was not aiming at the same accuracy
> as Ilya's patch. Perhaps rebalancing doesn't really require that
> accuracy. Often the new rx queue distribution will behave differently
> from the prediction anyway, because of changed batching
> characteristics.
>
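To make the estimate concrete, here is a minimal C sketch of the
weighting Jan describes: distribute the PMD's total processing cycles
for an interval over its polled rx queues in proportion to the cycles
counted against each queue. All names here are hypothetical, for
illustration only; this is not code from any of the patches.

#include <stdint.h>

static void
estimate_rxq_cycles(uint64_t total_pmd_proc_cycles,
                    const uint64_t *counted_cycles, /* per-rxq counted */
                    uint64_t *estimated_cycles,     /* per-rxq output */
                    int n_rxqs)
{
    uint64_t sum = 0;

    for (int i = 0; i < n_rxqs; i++) {
        sum += counted_cycles[i];
    }
    if (!sum) {
        return;             /* Idle interval: nothing to distribute. */
    }
    for (int i = 0; i < n_rxqs; i++) {
        /* Weight each rxq's share of the PMD total by its counted
         * load. Assumes the product does not overflow 64 bits at
         * per-interval cycle counts. */
        estimated_cycles[i] = total_pmd_proc_cycles
                              * counted_cycles[i] / sum;
    }
}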
Sure, that is why I had suggested perhaps making no assumption about
future batch opportunities and just adding the full cost of tx to each
rxq that has at least one packet in the flush. OTOH, the stats for the
current situation could then look worse than things actually are. Also,
it might not really simplify the code, as Ilya's patch just gives each
rxq some cycles per packet in a simple loop, whereas giving a full
cycle count per rxq would mean checking for duplicate rxqs etc. I guess
it's debatable. Anyway, I think both those schemes are a fix for
attributing cycles to the wrong rxq, or not counting them at all.

>>
>>> Have I missed something?
>>>
>>>> After all, Kevin's algorithm only needs comparable relative rx
>>>> queue loads across the PMDs to function. In this context I would
>>>> also suggest excluding idle polling cycles from the algorithm,
>>>> because they are not load related and always eat up the total
>>>> cycle budget of the PMD anyhow.
>>
>> Idle polling per rxq is not counted in the balance algorithm (for
>> the reason you say), but it's needed to calculate the total cycles
>> over the correct time period for providing the user with an rxq
>> percentage-of-pmd-used stat. Counting (2/3) and calculations (3/3)
>> are in the balance stats patches for reference.
>
> Sorry, I didn't check the patches carefully. I just assumed so,
> because you do count idle polling cycles per rx queue. But the total
> number of cycles per interval could easily be obtained from the PMD
> cycles. Perhaps a chance to simplify?
>

Good question, it had been my first thought too. The issue was that
the full PMD cycles are not in sync with the rxq measurement intervals
and are also controlled by the user with pmd-stats-clear. It was
simpler to extend the rxq counting code a bit to capture idle cycles
than to try to synchronize with the full PMD cycles, and I didn't want
to change their behaviour. Actually, with the rework to the data
structure, I think measuring idle/proc cycles for an rxq simplified
the existing code a little too.

thanks,
Kevin.

> BR, Jan
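For reference, a minimal C sketch contrasting the two tx accounting
schemes discussed above: per-packet shares in a simple loop, versus the
full flush cost charged to every rxq present in the flush, which needs
duplicate tracking. Names and the MAX_RXQS bound are hypothetical; this
is not code from Ilya's patch.

#include <stdbool.h>
#include <stdint.h>

#define MAX_RXQS 32         /* Hypothetical bound for the sketch. */

/* Scheme A: charge each packet's rxq an equal per-packet share of the
 * flush cost (a simple loop over the batch). */
static void
charge_tx_per_packet(uint64_t flush_cycles, const int *pkt_rxq,
                     int n_pkts, uint64_t *rxq_cycles)
{
    if (n_pkts <= 0) {
        return;
    }
    uint64_t per_pkt = flush_cycles / n_pkts;

    for (int i = 0; i < n_pkts; i++) {
        rxq_cycles[pkt_rxq[i]] += per_pkt;
    }
}

/* Scheme B: charge the full flush cost to every distinct rxq that
 * contributed at least one packet to the batch. */
static void
charge_tx_full_cost(uint64_t flush_cycles, const int *pkt_rxq,
                    int n_pkts, uint64_t *rxq_cycles)
{
    bool seen[MAX_RXQS] = { false };

    for (int i = 0; i < n_pkts; i++) {
        if (!seen[pkt_rxq[i]]) {
            seen[pkt_rxq[i]] = true;
            rxq_cycles[pkt_rxq[i]] += flush_cycles;
        }
    }
}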

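And a minimal sketch of the per-rxq percentage-of-pmd-used stat Kevin
mentions, again with hypothetical names: idle cycles do not feed the
balance weighting, but they complete the denominator for the same
measurement interval.

#include <stdint.h>

/* "% of PMD used" for one rxq over one interval: the rxq's processing
 * cycles divided by the PMD's total (processing + idle) cycles for the
 * same interval, which is why idle cycles must also be counted per
 * interval. */
static double
rxq_pmd_usage_pct(uint64_t rxq_proc_cycles, uint64_t pmd_proc_cycles,
                  uint64_t pmd_idle_cycles)
{
    uint64_t total = pmd_proc_cycles + pmd_idle_cycles;

    return total ? 100.0 * (double) rxq_proc_cycles / (double) total
                 : 0.0;
}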