Issue #2842 has been updated by tkusumi. Status changed from New to Closed
10:28 (dillon) tkusumi: I looked at that one and followed up on bugs. The patch looks ok but I would rather it not be applied. The btree code is extremely sensitive and it is locking all the children as a safety 10:29 (tkusumi) dillon: ok then i'll not commit it. i wanted to make sure if you see this as too sensitive. 10:29 (dillon) its definitely too sensitive :-) 10:29 (dillon) I have had nothing but trouble trying to optimize the btree operations in hammer1 10:30 (dillon) splits don't happen all that often, so there's no performance benefit to optimizing the locking there. Its best for it to just lock everything 10:31 (tkusumi) dillon: "I looked at that one and followed up on bugs" i don't see your followup on bug#2842. am i missing something ? 10:31 (tkusumi) i only see my post. 10:31 (dillon) hrm. I replied to the email. I don't know why it didn't automatically add it to the ticket 10:31 (dillon) grumble. something else I probably have to fix in the bug reporting system 10:33 (tkusumi) the other thing i thought about this func is that shouldn't you be doing 10:33 (tkusumi) error = hammer_cursor_upgrade(cursor) 10:33 (tkusumi) before 10:33 (tkusumi) hammer_node_lock_init(&lockroot, cursor->node); error = hammer_btree_lock_children(cursor, 1, split, &lockroot, NULL); 10:33 (tkusumi) ? 10:34 (tkusumi) i mean lock parent before its children. 10:34 (dillon) what line ? 10:34 (tkusumi) the beginning part of btree_split_internal(). 10:35 (tkusumi) 1453 10:35 (tkusumi) rebalance and reblock gets node exlock before node's children's lock. 10:35 (tkusumi) but this split internal gets children locks first and then node. 10:36 (dillon) ah. there's deadlock detection in that sequence, so no. The reason I don't upgrade the parent first in this situation is because it can cause stalls in other threads due to the time it takes to lock the children 10:36 (tkusumi) (no actually reblock does) 10:37 (dillon) the cursor is already locked shared 10:37 (dillon) The upgrade can fail regardless of order due to already being locked shared ---------------------------------------- Submit #2842: [PATCH] sys/vfs/hammer: Don't ex-lock all children on internal split http://bugs.dragonflybsd.org/issues/2842#change-12706 * Author: tkusumi * Status: Closed * Priority: Normal * Assignee: * Category: * Target version: ---------------------------------------- Matt, please review this. The diff is also attached to this PR. ===== Date: Sun, 6 Sep 2015 00:17:00 +0900 Subject: [PATCH] sys/vfs/hammer: Don't ex-lock all children on internal split hammer_split_internal() ex-locks all children of the node to split via hammer_btree_lock_children(), but it actually only needs to lock children at (index >= split). Children at (index >= 0 && index < split) are not touched by internal split and don't get affected by this operation. In the following example, internal split does nothing to E with * and their child nodes. This commit adds an extra argument int start_index to hammer_btree_lock_children() to specify from which child this function needs to ex-lock. ===== before split I 0 1 2 3 ... n E E E E E E E E E R / / I 62 0 ... s ... 61| E E E E ... E E E R * * * ===== after split I n+1 0 1 2 3 ... p | E E E E E E E E E E R / \ / \ I \ 0 ... s \ E E E R \ * * * I 0 1 2 ... E E E ... R --- sys/vfs/hammer/hammer.h | 1 + sys/vfs/hammer/hammer_btree.c | 20 +++++++++++--------- sys/vfs/hammer/hammer_rebalance.c | 2 +- sys/vfs/hammer/hammer_reblock.c | 2 +- 4 files changed, 14 insertions(+), 11 deletions(-) diff --git a/sys/vfs/hammer/hammer.h b/sys/vfs/hammer/hammer.h index c47889f..b35f2e9 100644 --- a/sys/vfs/hammer/hammer.h +++ b/sys/vfs/hammer/hammer.h @@ -1265,6 +1265,7 @@ void hammer_btree_lcache_init(hammer_mount_t hmp, hammer_node_lock_t lcache, int depth); void hammer_btree_lcache_free(hammer_mount_t hmp, hammer_node_lock_t lcache); int hammer_btree_lock_children(hammer_cursor_t cursor, int depth, + int start_index, hammer_node_lock_t parent, hammer_node_lock_t lcache); void hammer_btree_lock_copy(hammer_cursor_t cursor, diff --git a/sys/vfs/hammer/hammer_btree.c b/sys/vfs/hammer/hammer_btree.c index a34e6ca..cbd356a 100644 --- a/sys/vfs/hammer/hammer_btree.c +++ b/sys/vfs/hammer/hammer_btree.c @@ -1450,13 +1450,8 @@ btree_split_internal(hammer_cursor_t cursor) int i; const int esize = sizeof(*elm); - hammer_node_lock_init(&lockroot, cursor->node); - error = hammer_btree_lock_children(cursor, 1, &lockroot, NULL); - if (error) - goto done; if ((error = hammer_cursor_upgrade(cursor)) != 0) - goto done; - ++hammer_stats_btree_splits; + return(error); /* * Calculate the split point. If the insertion point is at the @@ -1485,6 +1480,12 @@ btree_split_internal(hammer_cursor_t cursor) --split; } + hammer_node_lock_init(&lockroot, cursor->node); + error = hammer_btree_lock_children(cursor, 1, split, &lockroot, NULL); + if (error) + goto done; + ++hammer_stats_btree_splits; + /* * If we are at the root of the filesystem, create a new root node * with 1 element and split normally. Avoid making major @@ -2631,7 +2632,7 @@ hammer_btree_lcache_free(hammer_mount_t hmp, hammer_node_lock_t lcache) * is usually aliased from a cursor. */ int -hammer_btree_lock_children(hammer_cursor_t cursor, int depth, +hammer_btree_lock_children(hammer_cursor_t cursor, int depth, int start_index, hammer_node_lock_t parent, hammer_node_lock_t lcache) { @@ -2657,7 +2658,7 @@ hammer_btree_lock_children(hammer_cursor_t cursor, int depth, * pre-get the children before trying to lock the mess. This is * only done one-level deep for now. */ - for (i = 0; i < ondisk->count; ++i) { + for (i = start_index; i < ondisk->count; ++i) { ++hammer_stats_btree_elements; elm = &ondisk->elms[i]; child = hammer_get_node(cursor->trans, @@ -2670,7 +2671,7 @@ hammer_btree_lock_children(hammer_cursor_t cursor, int depth, /* * Do it for real */ - for (i = 0; error == 0 && i < ondisk->count; ++i) { + for (i = start_index; error == 0 && i < ondisk->count; ++i) { ++hammer_stats_btree_elements; elm = &ondisk->elms[i]; @@ -2712,6 +2713,7 @@ hammer_btree_lock_children(hammer_cursor_t cursor, int depth, error = hammer_btree_lock_children( cursor, depth - 1, + 0, item, lcache); } diff --git a/sys/vfs/hammer/hammer_rebalance.c b/sys/vfs/hammer/hammer_rebalance.c index 28247bd..0788d6d 100644 --- a/sys/vfs/hammer/hammer_rebalance.c +++ b/sys/vfs/hammer/hammer_rebalance.c @@ -285,7 +285,7 @@ rebalance_node(struct hammer_ioc_rebalance *rebal, hammer_cursor_t cursor, error = hammer_cursor_upgrade(cursor); if (error) goto done; - error = hammer_btree_lock_children(cursor, 2, &lockroot, lcache); + error = hammer_btree_lock_children(cursor, 2, 0, &lockroot, lcache); if (error) goto done; diff --git a/sys/vfs/hammer/hammer_reblock.c b/sys/vfs/hammer/hammer_reblock.c index 21aef3d..c002367 100644 --- a/sys/vfs/hammer/hammer_reblock.c +++ b/sys/vfs/hammer/hammer_reblock.c @@ -571,7 +571,7 @@ hammer_reblock_int_node(struct hammer_ioc_reblock *reblock, int error; hammer_node_lock_init(&lockroot, cursor->node); - error = hammer_btree_lock_children(cursor, 1, &lockroot, NULL); + error = hammer_btree_lock_children(cursor, 1, 0, &lockroot, NULL); if (error) goto done; -- 2.4.6 ---Files-------------------------------- 0001-sys-vfs-hammer-Don-t-ex-lock-all-children-on-interna.patch (5.16 KB) -- You have received this notification because you have either subscribed to it, or are involved in it. To change your notification preferences, please click here: http://bugs.dragonflybsd.org/my/account