Issue #2842 has been updated by tkusumi.

Status changed from New to Closed

10:28 (dillon) tkusumi: I looked at that one and followed up on bugs.  The 
patch looks ok but I would rather it not be applied.  The btree code is 
extremely sensitive and it is locking all the children as a safety
10:29 (tkusumi) dillon: ok then i'll not commit it.  i wanted to make sure if 
you see this as too sensitive.
10:29 (dillon) its definitely too sensitive :-)
10:29 (dillon) I have had nothing but trouble trying to optimize the btree 
operations in hammer1
10:30 (dillon) splits don't happen all that often, so there's no performance 
benefit to optimizing the locking there.  Its best for it to just lock 
everything
10:31 (tkusumi) dillon:  "I looked at that one and followed up on bugs"  i 
don't see your followup on bug#2842.  am i missing something ?
10:31 (tkusumi) i only see my post.
10:31 (dillon) hrm.  I replied to the email.  I don't know why it didn't 
automatically add it to the ticket
10:31 (dillon) grumble. something else I probably have to fix in the bug 
reporting system
10:33 (tkusumi) the other thing i thought about this func is that shouldn't you 
be doing  
10:33 (tkusumi) error = hammer_cursor_upgrade(cursor)
10:33 (tkusumi) before
10:33 (tkusumi) hammer_node_lock_init(&lockroot, cursor->node);    error = 
hammer_btree_lock_children(cursor, 1, split, &lockroot, NULL);
10:33 (tkusumi) ?
10:34 (tkusumi) i mean lock parent before its children.
10:34 (dillon) what line ?
10:34 (tkusumi) the beginning part of btree_split_internal().
10:35 (tkusumi) 1453
10:35 (tkusumi) rebalance and reblock gets node exlock before node's children's 
lock.
10:35 (tkusumi) but this split internal gets children locks first and then node.
10:36 (dillon) ah. there's deadlock detection in that sequence, so no.  The 
reason I don't upgrade the parent first in this situation is because it can 
cause stalls in other threads due to the time it takes to lock the children
10:36 (tkusumi) (no actually reblock does)
10:37 (dillon) the cursor is already locked shared
10:37 (dillon) The upgrade can fail regardless of order due to already being 
locked shared


----------------------------------------
Submit #2842: [PATCH] sys/vfs/hammer: Don't ex-lock all children on internal 
split
http://bugs.dragonflybsd.org/issues/2842#change-12706

* Author: tkusumi
* Status: Closed
* Priority: Normal
* Assignee: 
* Category: 
* Target version: 
----------------------------------------
Matt, please review this.
The diff is also attached to this PR.


=====
Date: Sun, 6 Sep 2015 00:17:00 +0900
Subject: [PATCH] sys/vfs/hammer: Don't ex-lock all children on internal split

hammer_split_internal() ex-locks all children of the node
to split via hammer_btree_lock_children(), but it actually
only needs to lock children at (index >= split).

Children at (index >= 0 && index < split) are not touched
by internal split and don't get affected by this operation.
In the following example, internal split does nothing to
E with * and their child nodes.

This commit adds an extra argument int start_index to
hammer_btree_lock_children() to specify from which child
this function needs to ex-lock.

=====
before split
I
0 1 2 3 ...     n
E E E E E E E E E R
           /
          /
         I                 62
         0 ... s ...     61|
         E E E E ... E E E R
         * * *

=====
after split
I                 n+1
0 1 2 3 ...   p   |
E E E E E E E E E E R
           /   \
          /     \
         I       \
         0 ... s  \
         E E E R   \
         * * *      I
                    0 1 2 ...
                    E E E ... R
---
 sys/vfs/hammer/hammer.h           |  1 +
 sys/vfs/hammer/hammer_btree.c     | 20 +++++++++++---------
 sys/vfs/hammer/hammer_rebalance.c |  2 +-
 sys/vfs/hammer/hammer_reblock.c   |  2 +-
 4 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/sys/vfs/hammer/hammer.h b/sys/vfs/hammer/hammer.h
index c47889f..b35f2e9 100644
--- a/sys/vfs/hammer/hammer.h
+++ b/sys/vfs/hammer/hammer.h
@@ -1265,6 +1265,7 @@ void      hammer_btree_lcache_init(hammer_mount_t hmp, 
hammer_node_lock_t lcache,
                        int depth);
 void   hammer_btree_lcache_free(hammer_mount_t hmp, hammer_node_lock_t lcache);
 int    hammer_btree_lock_children(hammer_cursor_t cursor, int depth,
+                       int start_index,
                        hammer_node_lock_t parent,
                        hammer_node_lock_t lcache);
 void   hammer_btree_lock_copy(hammer_cursor_t cursor,
diff --git a/sys/vfs/hammer/hammer_btree.c b/sys/vfs/hammer/hammer_btree.c
index a34e6ca..cbd356a 100644
--- a/sys/vfs/hammer/hammer_btree.c
+++ b/sys/vfs/hammer/hammer_btree.c
@@ -1450,13 +1450,8 @@ btree_split_internal(hammer_cursor_t cursor)
        int i;
        const int esize = sizeof(*elm);
 
-       hammer_node_lock_init(&lockroot, cursor->node);
-       error = hammer_btree_lock_children(cursor, 1, &lockroot, NULL);
-       if (error)
-               goto done;
        if ((error = hammer_cursor_upgrade(cursor)) != 0)
-               goto done;
-       ++hammer_stats_btree_splits;
+               return(error);
 
        /*
         * Calculate the split point.  If the insertion point is at the
@@ -1485,6 +1480,12 @@ btree_split_internal(hammer_cursor_t cursor)
                        --split;
        }
 
+       hammer_node_lock_init(&lockroot, cursor->node);
+       error = hammer_btree_lock_children(cursor, 1, split, &lockroot, NULL);
+       if (error)
+               goto done;
+       ++hammer_stats_btree_splits;
+
        /*
         * If we are at the root of the filesystem, create a new root node
         * with 1 element and split normally.  Avoid making major
@@ -2631,7 +2632,7 @@ hammer_btree_lcache_free(hammer_mount_t hmp, 
hammer_node_lock_t lcache)
  * is usually aliased from a cursor.
  */
 int
-hammer_btree_lock_children(hammer_cursor_t cursor, int depth,
+hammer_btree_lock_children(hammer_cursor_t cursor, int depth, int start_index,
                           hammer_node_lock_t parent,
                           hammer_node_lock_t lcache)
 {
@@ -2657,7 +2658,7 @@ hammer_btree_lock_children(hammer_cursor_t cursor, int 
depth,
         * pre-get the children before trying to lock the mess.  This is
         * only done one-level deep for now.
         */
-       for (i = 0; i < ondisk->count; ++i) {
+       for (i = start_index; i < ondisk->count; ++i) {
                ++hammer_stats_btree_elements;
                elm = &ondisk->elms[i];
                child = hammer_get_node(cursor->trans,
@@ -2670,7 +2671,7 @@ hammer_btree_lock_children(hammer_cursor_t cursor, int 
depth,
        /*
         * Do it for real
         */
-       for (i = 0; error == 0 && i < ondisk->count; ++i) {
+       for (i = start_index; error == 0 && i < ondisk->count; ++i) {
                ++hammer_stats_btree_elements;
                elm = &ondisk->elms[i];
 
@@ -2712,6 +2713,7 @@ hammer_btree_lock_children(hammer_cursor_t cursor, int 
depth,
                                        error = hammer_btree_lock_children(
                                                        cursor,
                                                        depth - 1,
+                                                       0,
                                                        item,
                                                        lcache);
                                }
diff --git a/sys/vfs/hammer/hammer_rebalance.c 
b/sys/vfs/hammer/hammer_rebalance.c
index 28247bd..0788d6d 100644
--- a/sys/vfs/hammer/hammer_rebalance.c
+++ b/sys/vfs/hammer/hammer_rebalance.c
@@ -285,7 +285,7 @@ rebalance_node(struct hammer_ioc_rebalance *rebal, 
hammer_cursor_t cursor,
        error = hammer_cursor_upgrade(cursor);
        if (error)
                goto done;
-       error = hammer_btree_lock_children(cursor, 2, &lockroot, lcache);
+       error = hammer_btree_lock_children(cursor, 2, 0, &lockroot, lcache);
        if (error)
                goto done;
 
diff --git a/sys/vfs/hammer/hammer_reblock.c b/sys/vfs/hammer/hammer_reblock.c
index 21aef3d..c002367 100644
--- a/sys/vfs/hammer/hammer_reblock.c
+++ b/sys/vfs/hammer/hammer_reblock.c
@@ -571,7 +571,7 @@ hammer_reblock_int_node(struct hammer_ioc_reblock *reblock,
        int error;
 
        hammer_node_lock_init(&lockroot, cursor->node);
-       error = hammer_btree_lock_children(cursor, 1, &lockroot, NULL);
+       error = hammer_btree_lock_children(cursor, 1, 0, &lockroot, NULL);
        if (error)
                goto done;
 
-- 
2.4.6



---Files--------------------------------
0001-sys-vfs-hammer-Don-t-ex-lock-all-children-on-interna.patch (5.16 KB)


-- 
You have received this notification because you have either subscribed to it, 
or are involved in it.
To change your notification preferences, please click here: 
http://bugs.dragonflybsd.org/my/account

Reply via email to