Hi Mark, thank you very much for the feedback!!
On Fri, Feb 23, 2018 at 10:56 AM, Mark Michelson <mmich...@redhat.com>
> Hi Han,
> First off, great work! This is fantastic to see, and I'm super excited to
see such a drop in CPU usage with this patch.
> Since this is an RFC patch, I won't go too deep into nitty-gritty details
on this. Rather, I'll look at it from a high level.
> The API design idea is great. I like the use of a directed graph since it
clearly indicates dependencies. The patch progression is done really well
since it allows for us to easily understand each individual change. It also
makes it easy to see additional incremental changes that could be added in
> My main concern is that the specific setup you have used here results in
data encapsulation not being very strong. For instance, the en_flow_output
node consistently looks inside the en_runtime_data node for input data. It
appears from my quick look that en_flow_output is treating this data as
immutable, but I'm not 100% sure if I'm correct. There's nothing in place
to stop it from modifying that data if it wanted to, though.
Agree. It is also my concern for the lack of strong encapsulation. The
principle to follow is that each node take input node data as immutable,
and write its own data. I will think about how to make this explicitly from
code perspective. I didn't have a good idea on this. Do you have any
> Also it is common for nodes to modify global data rather than just
modifying node-local data(e.g. making calls to ofctrl_add_flow(), which
modifies a global hmap).
flow-table can be considered as data of en_flow_output. Since
en_flow_output is the leaf node of the DAG and no other engine node depends
on it, so I didn't think it is quit necessary to maintain its own local
data. But I agree it is better to be clear from the engine node
perspective. I can move the flow-table to en_flow_output data, and update
ofctrl module as consumer of that data.
> Ideally, this entire thing would be doable without the need for the
engine_get_input() function. Each node would operate independently on its
own data. Doing this would aid in the ability to reason about what each
node is responsible for, and it would open the door for introducing
parallel processing of sibling nodes in the DAG.
Sorry that I didn't get this point. Each node would operate on its own data
but they also depends on their inputs, and that's why engine_get_input() is
here for. Did I misunderstand your point?
For parallel processing, I am a little hesitating, because the goal is to
reduce the CPU cost on HVs (to save CPU for real workloads). The processing
engine should be non-blocking, so if we need parallel processing in the
engine, it means we are requesting too much CPU on the HV. I hope we don't
need parallel processing for flow computing.
> All that being said, I would be happy to see a change like this go in
as-is (pending more in-depth code review, of course) if the data
encapsulation work would creep the scope out too much for the initial
> On 02/20/2018 01:04 PM, Han Zhou wrote:
>> ovn-controller currently recomputes everything when there are any changes
>> of input, which leads to high CPU usages and slow in end-to-end flow
>> enforcement in response to changes. It even wastes CPU to recompute flows
>> for unrelated inputs such as pinctrl events.
>> This patch series implements incremental processing in ovn-controller to
>> solve above problems. There has been a similar attempt of solve the
>> earlier but was reverted (see commit: 926c34fd). This patch series takes
>> a different approach with an incremental processing engine, to make the
>> dependencies clear and easier to maintain. The engine is a DAG
>> dependencies between different nodes. Each node maintains its own data,
>> depends on its inputs and the data can also be inputs of other nodes.
>> node implements a method to recompute its data based on all the inputs,
>> also implements methods to handle changes of different inputs
>> The engine will be responsible to try incremental processing for each
>> based on the dependencies or fallback to recompute when changes cannot be
>> handled incrementally.
>> This patch series can incrementally process the most common changes:
>> logical flows and port bindings from OVNSB. It can be expanded further
>> more fine grained incremental processing gradually.
>> With the patch series, the CPU time of ovn-controller in ovn-scale-test
>> for 500 lports creating and binding on 50 HVs decreased 90%.
>> This is RFC version to get feedback to see if there is any major issue of
>> this approach, before refining it future for formal review. There are
>> two test cases failed and debugging ongoing.
>> Han Zhou (8):
>> ovn-controller: Incremental processing engine
>> ovn-controller: Track OVSDB changes
>> ovn-controller: Initial use of incremental engine in main
>> ovn-controller: Split SB inputs as separate incremental engine nodes
>> ovn-controller: split ovs_idl inputs in incremental engine
>> ovn-controller: Incremental logical flow processing
>> ovn-controller: runtime_data change handler for SB port-binding
>> ovn-controller: port-binding incremental processing for physical flows
>> include/ovn/actions.h | 3 +
>> ovn/controller/bfd.c | 4 +-
>> ovn/controller/binding.c | 101 +++++++-
>> ovn/controller/binding.h | 6 +
>> ovn/controller/encaps.c | 12 +-
>> ovn/controller/lflow.c | 104 ++++++--
>> ovn/controller/lflow.h | 10 +-
>> ovn/controller/ofctrl.c | 226 ++++++++++++-----
>> ovn/controller/ofctrl.h | 16 +-
>> ovn/controller/ovn-controller.c | 546
>> ovn/controller/ovn-controller.h | 5 +
>> ovn/controller/physical.c | 140 +++++++----
>> ovn/controller/physical.h | 8 +-
>> ovn/lib/actions.c | 6 +-
>> ovn/lib/automake.mk | 4 +-
>> ovn/lib/extend-table.c | 31 ++-
>> ovn/lib/extend-table.h | 9 +-
>> ovn/lib/inc-proc-eng.c | 97 +++++++
>> ovn/lib/inc-proc-eng.h | 118 +++++++++
>> 19 files changed, 1143 insertions(+), 303 deletions(-)
>> create mode 100644 ovn/lib/inc-proc-eng.c
>> create mode 100644 ovn/lib/inc-proc-eng.h
dev mailing list