On Thu, Dec 8, 2011 at 10:48 AM, Pravin B Shelar <[email protected]> wrote:
> diff --git a/datapath/datapath.c b/datapath/datapath.c
> index c86c20b..50fd00e 100644
> --- a/datapath/datapath.c
> +++ b/datapath/datapath.c
> +static void __rehash_flow_table(void)
> +{
> + struct datapath *dp;
> +
> + list_for_each_entry(dp, &dps, list_node) {
> + struct flow_table *old_table = genl_dereference(dp->table);
> + struct flow_table *new_table;
> +
> + new_table = ovs_flow_tbl_rehash(old_table);
> +
> + if (!IS_ERR(new_table)) {
Can you drop the blank line above to keep it consistent with upstream?
> @@ -1581,6 +1611,13 @@ static struct genl_ops dp_datapath_genl_ops[] = {
> .policy = datapath_policy,
> .doit = ovs_dp_cmd_set,
> },
> +#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,35)
> + { .cmd = OVS_DP_CMD_REHASH,
> + .flags = 0,
Since this shouldn't be called by userspace at all, I don't see a
reason to make it permissive about who can do it. There shouldn't be
an impact on traffic but rehashing can cause a load spike. It seems
best to restrict this to CAP_NET_ADMIN so only already trusted users
can do this.
> +#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,35)
> +static void rehash_flow_table(struct work_struct *work)
> +{
> + struct nlmsghdr *nlh;
> + struct ovs_header *hdr;
> + struct sk_buff *skb;
> +
> + skb = genlmsg_new(NLMSG_DEFAULT_SIZE, GFP_ATOMIC);
Since this is running in work queue, can't this be GFP_KERNEL?
> + if (!skb)
> + goto out;
> +
> + hdr = genlmsg_put(skb, 0, 0, &dp_datapath_genl_family,
> + 0, OVS_DP_CMD_REHASH);
> + if (!hdr) {
> + kfree_skb(skb);
> + return;
> + }
I don't think that's necessary to check the return value of
genlmsg_put() because it is just returning an offset into the buffer
that you already allocated. However, if it were to return a NULL
value then this would stop the work queue.
> + nlh = nlmsg_hdr(skb);
> + nlh->nlmsg_flags |= NLM_F_REQUEST;
Can't you just put this into flags argument of genlmsg_put?
> @@ -2069,8 +2144,9 @@ static int __init dp_init(void)
> if (err < 0)
> goto error_unreg_notifier;
>
> - return 0;
> + schedule_delayed_work(&rehash_flow_wq, REHASH_FLOW_INTERVAL);
>
> + return 0;
> error_unreg_notifier:
Can you keep the spacing after return consistent with upstream?
> static void dp_cleanup(void)
> {
> rcu_barrier();
> + cancel_delayed_work_sync(&rehash_flow_wq);
> + ovs_workqueues_exit();
Can we keep these in the same order as they are upstream - i.e.
cancel_delayed_work_sync before the rcu_barrier and
ovs_worqueues_exit() at the end?
This is more than just cosmetic - it actually results in a functional
difference. I don't think it actually makes a difference at the
moment because all datapaths need to have been removed already and
therefore nothing should really be running but the rehashing can
generate RCU garbage and tunneling uses the workqueues so I don't
think it's really good to shutdown those components first.
> diff --git a/datapath/flow.c b/datapath/flow.c
> index 78dea3a..bb9533a 100644
> --- a/datapath/flow.c
> +++ b/datapath/flow.c
> +static struct flow_table *__flow_tbl_rehash(struct flow_table *table,
> + int n_buckets)
Can you keep the line breaks the same as upstream?
> diff --git a/datapath/flow.h b/datapath/flow.h
> index 36e738d..1a91d68 100644
> --- a/datapath/flow.h
> +++ b/datapath/flow.h
> @@ -96,7 +96,7 @@ struct sw_flow_key {
>
> struct sw_flow {
> struct rcu_head rcu;
> - struct hlist_node hash_node;
> + struct hlist_node hash_node[2];
Can you please keep the same spacing as the upstream version?
> diff --git a/datapath/linux/compat/include/linux/workqueue.h
> b/datapath/linux/compat/include/linux/workqueue.h
> index 01c6345..2754f60 100644
> --- a/datapath/linux/compat/include/linux/workqueue.h
> +++ b/datapath/linux/compat/include/linux/workqueue.h
[...]
> -#include <linux/timer.h>
> +struct workqueue_struct;
Is this actually used by anything?
> diff --git a/datapath/linux/compat/workqueue.c
> b/datapath/linux/compat/workqueue.c
> new file mode 100644
> index 0000000..f74a08f
> --- /dev/null
> +++ b/datapath/linux/compat/workqueue.c
> +static int worker_thread(void *dummy)
> +{
> + DEFINE_WAIT(wait);
> +
> + for (;;) {
> + prepare_to_wait(&more_work, &wait, TASK_INTERRUPTIBLE);
> + if (!kthread_should_stop() && list_empty(&workq))
> + schedule();
> + finish_wait(&more_work, &wait);
Isn't this basically an open-coded version of wait_event_interruptible?
> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
> index 0578b5f..21f1060 100644
> --- a/include/linux/openvswitch.h
> +++ b/include/linux/openvswitch.h
> @@ -66,7 +66,8 @@ enum ovs_datapath_cmd {
> OVS_DP_CMD_NEW,
> OVS_DP_CMD_DEL,
> OVS_DP_CMD_GET,
> - OVS_DP_CMD_SET
> + OVS_DP_CMD_SET,
> + OVS_DP_CMD_REHASH,
I'd really like to avoid exposing this to userspace since it is
entirely internal to the kernel. Can we just give this a high number
and keep it inside of datapath.c?
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev