Dear David,

A bit of history. We worked on VPP for a decade before making any serious effort to multi-thread it. The first scheme I tried was to break the graph into reconfigurable pipeline stages. Effective partitioning of the graph is highly workload-dependent, and it can change in a heartbeat. The resulting system runs at the speed of the slowest pipeline stage. In terms of easily measured inter-thread handoff cost, it's not awful: 2-3 clocks/pkt. But handing vectors of packets between threads can cause a festival of cache-coherence traffic, and that can easily undo the positive effects of DDIO (packet-data DMA into the cache hierarchy).

We actually use the scheme you describe in a very fine-grained way: dual- and quad-loop graph dispatch functions process 2 or 4 packets at the same time. Until we run out of registers, a superscalar CPU can "do the same thing to 2 or 4 packets at the same time" pretty effectively. Including memory-hierarchy stalls, VPP averages more than two instructions retired per clock cycle.

At the graph-node level, I can't see how to leverage this technique. Presenting [identical] vectors to 2 (or more) nodes running on multiple threads would mean that (1) the parallelized subgraph would run at the speed of the slowest node, (2) you'd pay the handoff costs already discussed above, (3) you'd need an expensive algorithm to make sure that all vector replicas were finished before reentering sequential processing, and (4) none of the graph nodes we've ever constructed are free of ordering constraints. Every node alters packet state in a meaningful way, or it wouldn't be worth having. (😉)

We've had considerable success with flow-hashing across a set of identical graph replicas [worker threads], even when available hardware RSS hashing is not useful [think about NATted UDP traffic].

Hope this is of some interest.
Thanks… Dave

From: vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] On Behalf Of David Bainbridge
Sent: Friday, January 26, 2018 12:39 PM
To: vpp-dev@lists.fd.io
Subject: [vpp-dev] VPP Graph Optimization

I have just started to read up on VPP/FD.io, and I have a question about graph optimization. I was wondering whether (as I suspect) this has already been thought about and either planned or decided against.

The documentation I found on VPP essentially says that VPP uses batch processing and runs all packets in a vector through one step before proceeding to the next step. The claim is that this provides better overall throughput because of instruction caching. I was wondering whether optimizing the graph to identify where concurrency can be leveraged has been considered, as well as where you could process the vector by two steps with an offset. If this is possible, then steps could be pinned to cores, and perhaps both concurrency and instruction caching could be leveraged.

For example, assume the following graph: [graph figure not preserved]

In this graph, steps B and C can be done concurrently, as they don't "modify" the vector. Steps D and E can't be done concurrently, but as they don't require look-back/look-forward, they can be done at an offset. What I am suggesting is: if there are enough cores, then steps could be pinned to cores to achieve the benefits of instruction caching, and after step A is complete, steps B and C could be done concurrently. After B and C are complete, D can be started, and as D completes processing of a packet, it can then be processed by E (i.e., the entire vector does not need to be processed by D before processing by E is started).

I make no argument that this doesn't increase complexity; it also introduces coordination costs that don't exist today. To be fair, offset processing could be viewed as splitting the original large vector into smaller vectors and processing the smaller vectors from start to finish (almost dynamic optimization based on dynamic vector resizing).
Just curious to hear others' thoughts, and whether some of this has been thought through or experimented with. As I said, just thinking off the cuff and wondering; not fully thought through.

with respect [avèk respè],
/david
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev