On Thu, Jun 8, 2017 at 6:20 PM, Ben Pfaff <[email protected]> wrote:
>
> On Thu, Jun 08, 2017 at 04:21:17PM -0700, Han Zhou wrote:
> > On Thu, Jun 8, 2017 at 4:14 PM, Ben Pfaff <[email protected]> wrote:
> > >
> > > On Tue, Jun 06, 2017 at 05:24:19PM -0700, Han Zhou wrote:
> > > > On Tue, Jun 6, 2017 at 3:56 PM, Ben Pfaff <[email protected]> wrote:
> > > > >
> > > > > On Thu, May 25, 2017 at 05:26:47PM -0700, Han Zhou wrote:
> > > > > > This patch introduces multi-threading for ovn-controller and use
> > > > > > dedicated thread for packet-in processing as a start. It decouples
> > > > > > packet-in processing and ovs flow computing, so that packet-in inputs
> > > > > > won't trigger flow recomputing, and flow computing won't block
> > > > > > packet-in processing. In large scale environment this largely reduces
> > > > > > CPU cost and improves performance.
> > > > >
> > > > > Won't this double the load on the southbound database server, as well as
> > > > > the bandwidth to and from it? We already have a bottleneck there.
> > > >
> > > > Ben, yes this is the trade-off. Here are the considerations:
> > > >
> > > > 1. The bottle-neck in ovn-controller is easier to hit (you don't even need
> > > > many number of HVs to hit it)
> > > > 2. The bottle-neck of southbound DB do exist when number of HV increases
> > > > but since you are already working on the ovsdb clustering I suppose it will
> > > > be resolved.
> > > >
> > > > However I agree that this is not ideal. Alternatively we can spin-up a
> > > > dedicated thread for SB IDL processing and other "worker" thread just read
> > > > the data with proper locking. This will be more complicated but should be
> > > > doable, what do you think?
> > >
> > > I spent a little time thinking about this. I think that the approach
> > > that you're proposing is probably practical. Do you want to try to
> > > experiment with it and see whether it's reasonably possible?
> >
> > It basically needs to separate reads and writes for SB IDL from the
> > xxx_run() functions, which may be tricky. But if there is no other way
> > around I'll go down this path.
>
> I thought you were proposing some very coarse-grained locking. Maybe
> you should describe what you mean in a little more detail.
Sorry for the confusion. Let me describe my thoughts and the proposed options here.

Each thread is inherently a separate controller. The problem is that each iteration in each controller is an ovsdb transaction cycle, and each has its own iteration pace, so we can't just use a big lock to force them to iterate at the same pace, because that would make the multi-threading meaningless. Because of this, for a shared IDL, only one controller should own the life-cycle of the ovsdb transaction: run the IDL, make changes, and commit the transaction. The other controller could just read the IDL data while it is not being updated by the transaction owner. However, in our situation both controllers need to "write" SB data: the main controller writes port-bindings and nb_cfg, and pinctrl writes mac-bindings, which makes the situation more complicated. Here are 3 options based on these considerations:

[Option 1]: On top of the current patch, which blindly duplicates the IDL connection, we can optimize it so that each controller monitors only the data it is interested in; for example, pinctrl monitors dns, mac-binding and port-binding, but not logical-flows.
Pros: clean, easy to maintain.
Cons: double IDL connections. For data both controllers are interested in (i.e. port-bindings) there is wasted bandwidth.

[Option 2]: For the data a controller needs to write (e.g. mac-bindings for pinctrl, port-bindings for the main controller), each controller still uses its own IDL. But for read-only data (e.g. port-bindings for pinctrl), only the main controller monitors it, and pinctrl just reads it by copying it once in each pinctrl iteration, with a lock to make sure it is not being written to during the copy. (For the copying, we need a deep copy of the IDL structures.) A rough sketch of what I mean is appended at the end of this mail.
Pros: no wasted bandwidth, although there are still double connections for the ovsdb server to handle.
Cons: still double IDL connections. More complex than Option 1.

[Option 3]: Add a 3rd thread, dedicated to running the IDL and committing transactions. The transaction boundary would not be the logical iterations of the controllers, since it is not reasonable to sync the pace of different controllers; transactions would instead be based on a predefined batch size. The other controller threads would not write to the IDL directly, but would send write tasks to the IDL thread through a producer/consumer mechanism (also sketched at the end of this mail).
Pros: only one IDL connection per HV.
Cons: the most complex option.

Option 3 sounds cool, but we are not sure the performance gain justifies the complexity and the pain of maintaining it. It seems reasonable to me for each "logical" controller to have its own IDL connection, each controlling its own interested data. Option 2 is the one I feel comfortable with, but compared to Option 1, the major gain is saving the bandwidth for getting port-binding updates, and I wonder whether that really matters much for scalability. So personally I would suggest going with just Option 1 at this point as a start. With the upcoming clustering support in ovsdb we can see whether we can reach a big enough scale, or whether we need to optimize further with Option 2, or even Option 3, in an incremental way.

What do you think?

Thanks,
Han
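---
Appendix: rough sketches, just to illustrate the two mechanisms above. Nothing
here is tested, and all of the struct/function names below are made up for
illustration only; they don't exist in the tree.

Option 2 sketch: the main controller thread, which owns the SB IDL, refreshes
a deep-copied snapshot of the Port_Binding rows under a mutex after each
ovsdb_idl_run(); pinctrl only ever reads the snapshot, under the same mutex.
To keep the sketch short, only logical_port and tunnel_key are copied here.

/* Sketch only: hypothetical names (pb_copy, update_pb_snapshot(), ...),
 * illustrating the Option 2 idea of the main controller thread deep-copying
 * the Port_Binding rows into a snapshot that pinctrl reads under a lock. */

#include <stdlib.h>

#include "openvswitch/list.h"    /* ovs_list */
#include "ovs-thread.h"          /* ovs_mutex */
#include "ovsdb-idl.h"
#include "ovn/lib/ovn-sb-idl.h"  /* generated SB IDL (sbrec_*) */
#include "util.h"                /* xmalloc, xstrdup */

static struct ovs_mutex snapshot_mutex = OVS_MUTEX_INITIALIZER;
static struct ovs_list pb_snapshot = OVS_LIST_INITIALIZER(&pb_snapshot);

/* Deep copy of the few Port_Binding fields pinctrl needs. */
struct pb_copy {
    struct ovs_list node;
    char *logical_port;
    int64_t tunnel_key;
};

/* Called by the main controller thread (the IDL owner) once per iteration,
 * after ovsdb_idl_run(), so that pinctrl never touches live IDL rows. */
void
update_pb_snapshot(struct ovsdb_idl *ovnsb_idl)
{
    ovs_mutex_lock(&snapshot_mutex);

    /* Throw away the previous snapshot. */
    struct pb_copy *pb, *next;
    LIST_FOR_EACH_SAFE (pb, next, node, &pb_snapshot) {
        ovs_list_remove(&pb->node);
        free(pb->logical_port);
        free(pb);
    }

    /* Deep-copy the current rows. */
    const struct sbrec_port_binding *binding;
    SBREC_PORT_BINDING_FOR_EACH (binding, ovnsb_idl) {
        struct pb_copy *copy = xmalloc(sizeof *copy);
        copy->logical_port = xstrdup(binding->logical_port);
        copy->tunnel_key = binding->tunnel_key;
        ovs_list_push_back(&pb_snapshot, &copy->node);
    }

    ovs_mutex_unlock(&snapshot_mutex);
}

/* Called from the pinctrl thread whenever it needs port-binding data. */
void
pinctrl_use_pb_snapshot(void)
{
    ovs_mutex_lock(&snapshot_mutex);
    struct pb_copy *pb;
    LIST_FOR_EACH (pb, node, &pb_snapshot) {
        /* ... look up logical_port / tunnel_key as needed ... */
    }
    ovs_mutex_unlock(&snapshot_mutex);
}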

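Option 3 sketch: the producer/consumer queue for write tasks. The worker
threads (main controller, pinctrl) queue tasks; the dedicated IDL thread,
which owns the single SB connection, drains the queue and commits the whole
batch in one transaction. Again, every name here (idl_write_task,
idl_thread_queue_write(), idl_thread_run_once()) is hypothetical.

/* Sketch only: hypothetical names, illustrating the Option 3 idea of a
 * dedicated IDL thread committing write tasks queued by the other threads. */

#include <stdlib.h>

#include "openvswitch/list.h"
#include "ovs-thread.h"
#include "ovsdb-idl.h"
#include "seq.h"

struct idl_write_task {
    struct ovs_list node;
    void (*execute)(struct ovsdb_idl_txn *, void *aux); /* performs the writes */
    void *aux;
};

static struct ovs_mutex queue_mutex = OVS_MUTEX_INITIALIZER;
static struct ovs_list write_queue = OVS_LIST_INITIALIZER(&write_queue);
static struct seq *queue_seq;   /* initialized once with seq_create() */

/* Producer side: called from the main controller or pinctrl thread. */
void
idl_thread_queue_write(struct idl_write_task *task)
{
    ovs_mutex_lock(&queue_mutex);
    ovs_list_push_back(&write_queue, &task->node);
    ovs_mutex_unlock(&queue_mutex);
    seq_change(queue_seq);      /* wake up the IDL thread */
}

/* Consumer side: one iteration of the dedicated IDL thread.  Whatever tasks
 * are pending become one batch, committed in a single transaction. */
void
idl_thread_run_once(struct ovsdb_idl *idl)
{
    ovsdb_idl_run(idl);

    struct ovs_list batch = OVS_LIST_INITIALIZER(&batch);
    ovs_mutex_lock(&queue_mutex);
    if (!ovs_list_is_empty(&write_queue)) {
        ovs_list_splice(&batch, ovs_list_front(&write_queue), &write_queue);
    }
    ovs_mutex_unlock(&queue_mutex);

    if (!ovs_list_is_empty(&batch)) {
        struct ovsdb_idl_txn *txn = ovsdb_idl_txn_create(idl);
        struct idl_write_task *task, *next;
        LIST_FOR_EACH_SAFE (task, next, node, &batch) {
            task->execute(txn, task->aux);
            ovs_list_remove(&task->node);
            free(task);
        }
        ovsdb_idl_txn_commit_block(txn);   /* could also commit asynchronously */
        ovsdb_idl_txn_destroy(txn);
    }
}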