[ovs-dev] [PATCH] bridge: Fix bug where IDL wakeup causes 100% CPU.

2014-09-29 Thread Joe Stringer
Commit 9c537baf613a16e (bridge: Refactor the stats and status update.)
inadvertently changed the way that the status transaction is destroyed,
which could cause the main thread to constantly wake up.

The bug occurs when a transaction returns TXN_INCOMPLETE, then there are
no subsequent changes to connectivity or 'status_txn_try_again'.
ovsdb_idl_run() processes the transaction's reply and updates the status
to TXN_SUCCESS. The logic in status_update_wait() wakes up the main
thread instantly so that the transaction can be cleaned up, however the
logic to clean it up in run_status_update() is inaccessible until the
next connectivity change. This happens every time through the main loop,
causing the main thread to spin at 100%.

This patch fixes the behaviour  by ensuring that ovsdb_idl_txn_commit()
gets a chance to run whenever there is an ongoing status transaction.

Steps to reproduce: Unload/Reload kernel module  restart ovs-vswitchd.

Signed-off-by: Joe Stringer joestrin...@nicira.com
---
I considered implementing this fix using ovsdb_idl_get_seqno() for the
entire logic block, but concluded that it doesn't make sense to update
the status whenever there is any kind of status change to any ongoing
transaction. Rather, it makes sense that if there is a status
transaction outstanding, to finish that transaction and clean it up when
it is done.
---
 vswitchd/bridge.c |   18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index 5ba8d64..61896d3 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -2656,16 +2656,13 @@ run_stats_update(void)
 static void
 run_status_update(void)
 {
-uint64_t seq;
-
-/* Check the need to update status. */
-seq = seq_read(connectivity_seq_get());
-if (seq != connectivity_seqno || status_txn_try_again) {
-enum ovsdb_idl_txn_status status;
+if (!status_txn) {
+uint64_t seq;
 
 /* Rate limit the update.  Do not start a new update if the
  * previous one is not done. */
-if (!status_txn) {
+seq = seq_read(connectivity_seq_get());
+if (seq != connectivity_seqno || status_txn_try_again) {
 struct bridge *br;
 
 connectivity_seqno = seq;
@@ -2687,6 +2684,13 @@ run_status_update(void)
 }
 }
 }
+}
+
+/* Make progress on the transaction. If it finishes, then destroy the
+ * transaction. Otherwise, keep it so that we can check progress during
+ * the next run through the main loop. */
+if (status_txn) {
+enum ovsdb_idl_txn_status status;
 
 status = ovsdb_idl_txn_commit(status_txn);
 if (status != TXN_INCOMPLETE) {
-- 
1.7.10.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] bridge: Fix bug where IDL wakeup causes 100% CPU.

2014-09-29 Thread Alex Wang
Thx a lot for the fix, that was a bad move~

Acked-by: Alex Wang al...@nicira.com

On Mon, Sep 29, 2014 at 3:09 AM, Joe Stringer joestrin...@nicira.com
wrote:

 Commit 9c537baf613a16e (bridge: Refactor the stats and status update.)
 inadvertently changed the way that the status transaction is destroyed,
 which could cause the main thread to constantly wake up.

 The bug occurs when a transaction returns TXN_INCOMPLETE, then there are
 no subsequent changes to connectivity or 'status_txn_try_again'.
 ovsdb_idl_run() processes the transaction's reply and updates the status
 to TXN_SUCCESS. The logic in status_update_wait() wakes up the main
 thread instantly so that the transaction can be cleaned up, however the
 logic to clean it up in run_status_update() is inaccessible until the
 next connectivity change. This happens every time through the main loop,
 causing the main thread to spin at 100%.

 This patch fixes the behaviour  by ensuring that ovsdb_idl_txn_commit()
 gets a chance to run whenever there is an ongoing status transaction.

 Steps to reproduce: Unload/Reload kernel module  restart ovs-vswitchd.

 Signed-off-by: Joe Stringer joestrin...@nicira.com
 ---
 I considered implementing this fix using ovsdb_idl_get_seqno() for the
 entire logic block, but concluded that it doesn't make sense to update
 the status whenever there is any kind of status change to any ongoing
 transaction. Rather, it makes sense that if there is a status
 transaction outstanding, to finish that transaction and clean it up when
 it is done.
 ---
  vswitchd/bridge.c |   18 +++---
  1 file changed, 11 insertions(+), 7 deletions(-)

 diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
 index 5ba8d64..61896d3 100644
 --- a/vswitchd/bridge.c
 +++ b/vswitchd/bridge.c
 @@ -2656,16 +2656,13 @@ run_stats_update(void)
  static void
  run_status_update(void)
  {
 -uint64_t seq;
 -
 -/* Check the need to update status. */
 -seq = seq_read(connectivity_seq_get());
 -if (seq != connectivity_seqno || status_txn_try_again) {
 -enum ovsdb_idl_txn_status status;
 +if (!status_txn) {
 +uint64_t seq;

  /* Rate limit the update.  Do not start a new update if the
   * previous one is not done. */
 -if (!status_txn) {
 +seq = seq_read(connectivity_seq_get());
 +if (seq != connectivity_seqno || status_txn_try_again) {
  struct bridge *br;

  connectivity_seqno = seq;
 @@ -2687,6 +2684,13 @@ run_status_update(void)
  }
  }
  }
 +}
 +
 +/* Make progress on the transaction. If it finishes, then destroy the
 + * transaction. Otherwise, keep it so that we can check progress
 during
 + * the next run through the main loop. */
 +if (status_txn) {
 +enum ovsdb_idl_txn_status status;

  status = ovsdb_idl_txn_commit(status_txn);
  if (status != TXN_INCOMPLETE) {
 --
 1.7.10.4


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] bridge: Fix bug where IDL wakeup causes 100% CPU.

2014-09-29 Thread Joe Stringer
Thanks for review, I reworded the commit message and comment and pushed to
master:

Commit 9c537baf613a16e (bridge: Refactor the stats and status update.)
inadvertently changed the way that the status transaction is destroyed,
which could cause the main thread to constantly wake up.

The bug occurs when ovsdb_idl_txn_commit() returns TXN_INCOMPLETE and
there are no further changes to connectivity or 'status_txn_try_again'.

- ovsdb_idl_run() receives the transaction reply and updates the
  transaction status to TXN_SUCCESS.
- status_update_wait() detects that the transaction is in progress, and
  the status is TXN_SUCCESS so wakes up the main thread immediately.
- run_status_update() is meant to destroy the transaction now that it is
  finished, however the logic is never run because there were no
changes.
- Repeat the wakeup every time the main loop runs.

This patch fixes the behaviour by ensuring that ovsdb_idl_txn_commit()
gets a chance to run whenever there is an ongoing status transaction.

Bug was found by unloading and reloading the kernel module, then
immediately restarting ovs-vswitchd.

On 30 September 2014 05:45, Alex Wang al...@nicira.com wrote:

 Thx a lot for the fix, that was a bad move~

 Acked-by: Alex Wang al...@nicira.com

 On Mon, Sep 29, 2014 at 3:09 AM, Joe Stringer joestrin...@nicira.com
 wrote:

 Commit 9c537baf613a16e (bridge: Refactor the stats and status update.)
 inadvertently changed the way that the status transaction is destroyed,
 which could cause the main thread to constantly wake up.

 The bug occurs when a transaction returns TXN_INCOMPLETE, then there are
 no subsequent changes to connectivity or 'status_txn_try_again'.
 ovsdb_idl_run() processes the transaction's reply and updates the status
 to TXN_SUCCESS. The logic in status_update_wait() wakes up the main
 thread instantly so that the transaction can be cleaned up, however the
 logic to clean it up in run_status_update() is inaccessible until the
 next connectivity change. This happens every time through the main loop,
 causing the main thread to spin at 100%.

 This patch fixes the behaviour  by ensuring that ovsdb_idl_txn_commit()
 gets a chance to run whenever there is an ongoing status transaction.

 Steps to reproduce: Unload/Reload kernel module  restart ovs-vswitchd.

 Signed-off-by: Joe Stringer joestrin...@nicira.com
 ---
 I considered implementing this fix using ovsdb_idl_get_seqno() for the
 entire logic block, but concluded that it doesn't make sense to update
 the status whenever there is any kind of status change to any ongoing
 transaction. Rather, it makes sense that if there is a status
 transaction outstanding, to finish that transaction and clean it up when
 it is done.
 ---
  vswitchd/bridge.c |   18 +++---
  1 file changed, 11 insertions(+), 7 deletions(-)

 diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
 index 5ba8d64..61896d3 100644
 --- a/vswitchd/bridge.c
 +++ b/vswitchd/bridge.c
 @@ -2656,16 +2656,13 @@ run_stats_update(void)
  static void
  run_status_update(void)
  {
 -uint64_t seq;
 -
 -/* Check the need to update status. */
 -seq = seq_read(connectivity_seq_get());
 -if (seq != connectivity_seqno || status_txn_try_again) {
 -enum ovsdb_idl_txn_status status;
 +if (!status_txn) {
 +uint64_t seq;

  /* Rate limit the update.  Do not start a new update if the
   * previous one is not done. */
 -if (!status_txn) {
 +seq = seq_read(connectivity_seq_get());
 +if (seq != connectivity_seqno || status_txn_try_again) {
  struct bridge *br;

  connectivity_seqno = seq;
 @@ -2687,6 +2684,13 @@ run_status_update(void)
  }
  }
  }
 +}
 +
 +/* Make progress on the transaction. If it finishes, then destroy the
 + * transaction. Otherwise, keep it so that we can check progress
 during
 + * the next run through the main loop. */
 +if (status_txn) {
 +enum ovsdb_idl_txn_status status;

  status = ovsdb_idl_txn_commit(status_txn);
  if (status != TXN_INCOMPLETE) {
 --
 1.7.10.4



___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev