Re: [RFC PATCH 5/7] Btrfs: add btrfs_compare_trees function

2012-07-05 Thread Alex Lyakas
Alexander,

 +   if (advance_left  !left_end_reached) {
 +   ret = tree_advance(left_root, left_path, 
 left_level,
 +   left_root_level,
 +   advance_left != ADVANCE_ONLY_NEXT,
 +   left_key);
 +   if (ret  0)
 +   left_end_reached = ADVANCE;
 +   advance_left = 0;
 +   }
 +   if (advance_right  !right_end_reached) {
 +   ret = tree_advance(right_root, right_path, 
 right_level,
 +   right_root_level,
 +   advance_right != ADVANCE_ONLY_NEXT,
 +   right_key);
 +   if (ret  0)
 +   right_end_reached = ADVANCE;
 +   advance_right = 0;
 +   }
 Do you think it's worth it to put a check/warning/smth before that,
 that either advance_right or advance_left is non-zero, or we have
 reached ends in both trees?


 Having the left or right end reached before the other sides end is
 reached is something that is completely normal and expected.
What I meant was that when we start the loop, either advance_left!=0
or advance_right!=0. So I thought it would be good to notice that.
However, on the very first loop iteration, both of them are zero, so I
was wrong.


 +   } else if (left_level == right_level) {
 ...
 +   } else if (left_level  right_level) {
 +   advance_right = ADVANCE;
 +   } else {
 +   advance_left = ADVANCE;
 +   }
 Can you pls explain why it is correct?
 Why if we are on lower level in the newer tree than we are in the
 older tree, we need to advance the older tree? I.e., why this
 implies that we are on the lower key in the older tree? (And
 vice-versa). I.e., how difference in levels indicates relation between
 keys?
 Difference in levels has no relation to the keys. These advances
 basically try to keep the two trees positions in-sync. The compare
 always tries to get both trees to a point where they are at the same
 level, as only then we can compare keys. Also, the two trees may have
 different root levels, this code also handles that case.

Can you pls tell me if I understand your algorithm correctly:
Basically, we need to get to the leaf levels and compare the items in
the leafs. Only when we are on the leaf level, we can safely signal
deletions and additions of items, not on upper levels.
There is only one optimization: we want to find nodes that are shared,
and such nodes can be only on the same level. To make this
optimization happen, we try to always match the levels of the tree.
This is the purpose of:
} else if (left_level  right_level) {
advance_right = ADVANCE;
} else {
advance_left = ADVANCE;
}

Note: I think that instead of comparing levels, we could always
compare keys and ADVANCE the lower key. (Because on ADVANCing we never
loose information, we just get closer to leafs, so we don't skip
anything.) But then there is less chance of optimization. Does this
make sense? So what you said that we can compare keys only on the same
level...we can always compare them, correct?

I will now study the rest of your patchset.

Thanks!
Alex.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 5/7] Btrfs: add btrfs_compare_trees function

2012-07-05 Thread Alexander Block
On Thu, Jul 5, 2012 at 2:19 PM, Alex Lyakas
alex.bolshoy.bt...@gmail.com wrote:
 Alexander,

 +   if (advance_left  !left_end_reached) {
 +   ret = tree_advance(left_root, left_path, 
 left_level,
 +   left_root_level,
 +   advance_left != ADVANCE_ONLY_NEXT,
 +   left_key);
 +   if (ret  0)
 +   left_end_reached = ADVANCE;
 +   advance_left = 0;
 +   }
 +   if (advance_right  !right_end_reached) {
 +   ret = tree_advance(right_root, right_path, 
 right_level,
 +   right_root_level,
 +   advance_right != ADVANCE_ONLY_NEXT,
 +   right_key);
 +   if (ret  0)
 +   right_end_reached = ADVANCE;
 +   advance_right = 0;
 +   }
 Do you think it's worth it to put a check/warning/smth before that,
 that either advance_right or advance_left is non-zero, or we have
 reached ends in both trees?


 Having the left or right end reached before the other sides end is
 reached is something that is completely normal and expected.
 What I meant was that when we start the loop, either advance_left!=0
 or advance_right!=0. So I thought it would be good to notice that.
 However, on the very first loop iteration, both of them are zero, so I
 was wrong.


 +   } else if (left_level == right_level) {
 ...
 +   } else if (left_level  right_level) {
 +   advance_right = ADVANCE;
 +   } else {
 +   advance_left = ADVANCE;
 +   }
 Can you pls explain why it is correct?
 Why if we are on lower level in the newer tree than we are in the
 older tree, we need to advance the older tree? I.e., why this
 implies that we are on the lower key in the older tree? (And
 vice-versa). I.e., how difference in levels indicates relation between
 keys?
 Difference in levels has no relation to the keys. These advances
 basically try to keep the two trees positions in-sync. The compare
 always tries to get both trees to a point where they are at the same
 level, as only then we can compare keys. Also, the two trees may have
 different root levels, this code also handles that case.

 Can you pls tell me if I understand your algorithm correctly:
 Basically, we need to get to the leaf levels and compare the items in
 the leafs. Only when we are on the leaf level, we can safely signal
 deletions and additions of items, not on upper levels.
 There is only one optimization: we want to find nodes that are shared,
 and such nodes can be only on the same level. To make this
 optimization happen, we try to always match the levels of the tree.
 This is the purpose of:
 } else if (left_level  right_level) {
 advance_right = ADVANCE;
 } else {
 advance_left = ADVANCE;
 }

Sounds like you understood it.
 Note: I think that instead of comparing levels, we could always
 compare keys and ADVANCE the lower key. (Because on ADVANCing we never
 loose information, we just get closer to leafs, so we don't skip
 anything.) But then there is less chance of optimization. Does this
 make sense? So what you said that we can compare keys only on the same
 level...we can always compare them, correct?
Hmm I think I don't understand what you mean. When we are at level 0,
advancing will in most cases mean that we only get to the next
(slot+1) item. Only in case we are on the last item for this leaf, we
change levels. In that case, the first ADVANCE will go as much levels
up until its possible to got to the next node at that level. After
this, the next ADVANCEs will go down the tree again until we are at
the same level as the other tree is.

Imagine this tree:
R __
/ \  |
   /   \ |
  AB  C
 / \/ \ / \
D  E F  G H  I

It would iterate in this order:
R(slot 0)  - down
A(slot 0)  - down
D(slot 0..nr) - upnext
A(slot 1)  - down
E(slot 0..nr) - upnext
R(slot 1)  - down
B(slot 0)  - down
F(slot 0..nr) - upnext
B(slot 1)  - down
G(slot 0..nr) - upnext
R(slot 2)  - down
C(slot 0)  - down
H(slot 0..nr) - upnext
C(slot 1)  - down
I(slot 0..nr)   - upnext
done because upnext can't advance anymore.


 I will now study the rest of your patchset.

 Thanks!
 Alex.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 5/7] Btrfs: add btrfs_compare_trees function

2012-07-05 Thread Alex Lyakas
On Thu, Jul 5, 2012 at 3:47 PM, Alexander Block abloc...@googlemail.com wrote:
 On Thu, Jul 5, 2012 at 2:19 PM, Alex Lyakas
 alex.bolshoy.bt...@gmail.com wrote:
 Alexander,

 +   if (advance_left  !left_end_reached) {
 +   ret = tree_advance(left_root, left_path, 
 left_level,
 +   left_root_level,
 +   advance_left != ADVANCE_ONLY_NEXT,
 +   left_key);
 +   if (ret  0)
 +   left_end_reached = ADVANCE;
 +   advance_left = 0;
 +   }
 +   if (advance_right  !right_end_reached) {
 +   ret = tree_advance(right_root, right_path, 
 right_level,
 +   right_root_level,
 +   advance_right != 
 ADVANCE_ONLY_NEXT,
 +   right_key);
 +   if (ret  0)
 +   right_end_reached = ADVANCE;
 +   advance_right = 0;
 +   }
 Do you think it's worth it to put a check/warning/smth before that,
 that either advance_right or advance_left is non-zero, or we have
 reached ends in both trees?


 Having the left or right end reached before the other sides end is
 reached is something that is completely normal and expected.
 What I meant was that when we start the loop, either advance_left!=0
 or advance_right!=0. So I thought it would be good to notice that.
 However, on the very first loop iteration, both of them are zero, so I
 was wrong.


 +   } else if (left_level == right_level) {
 ...
 +   } else if (left_level  right_level) {
 +   advance_right = ADVANCE;
 +   } else {
 +   advance_left = ADVANCE;
 +   }
 Can you pls explain why it is correct?
 Why if we are on lower level in the newer tree than we are in the
 older tree, we need to advance the older tree? I.e., why this
 implies that we are on the lower key in the older tree? (And
 vice-versa). I.e., how difference in levels indicates relation between
 keys?
 Difference in levels has no relation to the keys. These advances
 basically try to keep the two trees positions in-sync. The compare
 always tries to get both trees to a point where they are at the same
 level, as only then we can compare keys. Also, the two trees may have
 different root levels, this code also handles that case.

 Can you pls tell me if I understand your algorithm correctly:
 Basically, we need to get to the leaf levels and compare the items in
 the leafs. Only when we are on the leaf level, we can safely signal
 deletions and additions of items, not on upper levels.
 There is only one optimization: we want to find nodes that are shared,
 and such nodes can be only on the same level. To make this
 optimization happen, we try to always match the levels of the tree.
 This is the purpose of:
 } else if (left_level  right_level) {
 advance_right = ADVANCE;
 } else {
 advance_left = ADVANCE;
 }

 Sounds like you understood it.
 Note: I think that instead of comparing levels, we could always
 compare keys and ADVANCE the lower key. (Because on ADVANCing we never
 loose information, we just get closer to leafs, so we don't skip
 anything.) But then there is less chance of optimization. Does this
 make sense? So what you said that we can compare keys only on the same
 level...we can always compare them, correct?
 Hmm I think I don't understand what you mean. When we are at level 0,
 advancing will in most cases mean that we only get to the next
 (slot+1) item. Only in case we are on the last item for this leaf, we
 change levels. In that case, the first ADVANCE will go as much levels
 up until its possible to got to the next node at that level. After
 this, the next ADVANCEs will go down the tree again until we are at
 the same level as the other tree is.

 Imagine this tree:
 R __
 / \  |
/   \ |
   AB  C
  / \/ \ / \
 D  E F  G H  I

 It would iterate in this order:
 R(slot 0)  - down
 A(slot 0)  - down
 D(slot 0..nr) - upnext
 A(slot 1)  - down
 E(slot 0..nr) - upnext
 R(slot 1)  - down
 B(slot 0)  - down
 F(slot 0..nr) - upnext
 B(slot 1)  - down
 G(slot 0..nr) - upnext
 R(slot 2)  - down
 C(slot 0)  - down
 H(slot 0..nr) - upnext
 C(slot 1)  - down
 I(slot 0..nr)   - upnext
 done because upnext can't advance anymore.

Thanks for the nice ASCII graphics!

Yes, if one of the scan heads is on level 0, and the other one is on a
different level, then yes, we need to ADVANCE the other one, we cannot
compare keys at this point. We need to get the other head down to
level 0, and then compare keys and items. I overlooked 

Re: [RFC PATCH 5/7] Btrfs: add btrfs_compare_trees function

2012-07-04 Thread Alex Lyakas
Hi Alex,

 +   spin_lock(left_root-root_times_lock);
 +   ctransid = btrfs_root_ctransid(left_root-root_item);
 +   spin_unlock(left_root-root_times_lock);
 +   if (ctransid != left_start_ctransid)
 +   left_start_ctransid = 0;
 +
 +   spin_lock(right_root-root_times_lock);
 +   ctransid = 
 btrfs_root_ctransid(right_root-root_item);
 +   spin_unlock(right_root-root_times_lock);
 +   if (ctransid != right_start_ctransid)
 +   left_start_ctransid = 0;
Shouldn't it be here right_start_ctransid=0? Otherwise,
right_start_ctransid is pretty useless in this function.

 +
 +   if (!left_start_ctransid || !right_start_ctransid) {
 +   WARN(1, KERN_WARNING
 +   btrfs: btrfs_compare_tree detected 
 +   a change in one of the trees while 
 +   iterating. This is probably a 
 +   bug.\n);
 +   ret = -EIO;
 +   goto out;
 +   }

I am reading the code have more questions (and comments), but will
send them all later.

Alex.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 5/7] Btrfs: add btrfs_compare_trees function

2012-07-04 Thread Alex Lyakas
Hi Alex,

 +static int tree_compare_item(struct btrfs_root *left_root,
 +struct btrfs_path *left_path,
 +struct btrfs_path *right_path,
 +char *tmp_buf)
 +{
 +   int cmp;
 +   int len1, len2;
 +   unsigned long off1, off2;
 +
 +   len1 = btrfs_item_size_nr(left_path-nodes[0], left_path-slots[0]);
 +   len2 = btrfs_item_size_nr(right_path-nodes[0], right_path-slots[0]);
 +   if (len1 != len2)
 +   return 1;
 +
 +   off1 = btrfs_item_ptr_offset(left_path-nodes[0], 
 left_path-slots[0]);
 +   off2 = btrfs_item_ptr_offset(right_path-nodes[0],
 +   right_path-slots[0]);
 +
 +   read_extent_buffer(left_path-nodes[0], tmp_buf, off1, len1);
 +
 +   cmp = memcmp_extent_buffer(right_path-nodes[0], tmp_buf, off2, len1);
 +   if (cmp)
 +   return 1;
 +   return 0;
 +}
It might be worth to note in the comment, that tmp_buff should be
large enough to hold the item from the left tree. Can it happen that
the right tree has a different leafsize?

 +   /*
 +* Strategy: Go to the first items of both trees. Then do
 +*
 +* If both trees are at level 0
 +*   Compare keys of current items
 +* If left  right treat left item as new, advance left tree
 +*   and repeat
 +* If left  right treat right item as deleted, advance right tree
 +*   and repeat
 +* If left == right do deep compare of items, treat as changed if
 +*   needed, advance both trees and repeat
 +* If both trees are at the same level but not at level 0
 +*   Compare keys of current nodes/leafs
 +* If left  right advance left tree and repeat
 +* If left  right advance right tree and repeat
 +* If left == right compare blockptrs of the next nodes/leafs
 +*   If they match advance both trees but stay at the same level
 +* and repeat
 +*   If they don't match advance both trees while allowing to go
 +* deeper and repeat
 +* If tree levels are different
 +*   Advance the tree that needs it and repeat
 +*
 +* Advancing a tree means:
 +*   If we are at level 0, try to go to the next slot. If that's not
 +*   possible, go one level up and repeat. Stop when we found a level
 +*   where we could go to the next slot. We may at this point be on a
 +*   node or a leaf.
 +*
 +*   If we are not at level 0 and not on shared tree blocks, go one
 +*   level deeper.
 +*
 +*   If we are not at level 0 and on shared tree blocks, go one slot 
 to
 +*   the right if possible or go up and right.
 +*/
According to the strategy and to the code later, left tree is
treated as newer one, while right as older one, correct? Do you
think it would be more intuitive to make it the other way around,
although I guess this is a matter of personal taste. I had to draw the
leafs reversed to keep going:
R   L
- -
| | | | | | | |
- -


 +   if (advance_left  !left_end_reached) {
 +   ret = tree_advance(left_root, left_path, left_level,
 +   left_root_level,
 +   advance_left != ADVANCE_ONLY_NEXT,
 +   left_key);
 +   if (ret  0)
 +   left_end_reached = ADVANCE;
 +   advance_left = 0;
 +   }
 +   if (advance_right  !right_end_reached) {
 +   ret = tree_advance(right_root, right_path, 
 right_level,
 +   right_root_level,
 +   advance_right != ADVANCE_ONLY_NEXT,
 +   right_key);
 +   if (ret  0)
 +   right_end_reached = ADVANCE;
 +   advance_right = 0;
 +   }
Do you think it's worth it to put a check/warning/smth before that,
that either advance_right or advance_left is non-zero, or we have
reached ends in both trees?


 +   } else if (left_level == right_level) {
...
 +   } else if (left_level  right_level) {
 +   advance_right = ADVANCE;
 +   } else {
 +   advance_left = ADVANCE;
 +   }
Can you pls explain why it is correct?
Why if we are on lower level in the newer tree than we are in the
older tree, we need to advance the older tree? I.e., why this
implies that we are on the lower key in the older tree? (And
vice-versa). I.e., how difference in levels indicates relation between
keys?

Thanks,
  Alex.
--
To unsubscribe from this list: send the 

Re: [RFC PATCH 5/7] Btrfs: add btrfs_compare_trees function

2012-07-04 Thread Alexander Block
On Wed, Jul 4, 2012 at 8:27 PM, Alex Lyakas
alex.bolshoy.bt...@gmail.com wrote:
 Hi Alex,

 +   spin_lock(left_root-root_times_lock);
 +   ctransid = 
 btrfs_root_ctransid(left_root-root_item);
 +   spin_unlock(left_root-root_times_lock);
 +   if (ctransid != left_start_ctransid)
 +   left_start_ctransid = 0;
 +
 +   spin_lock(right_root-root_times_lock);
 +   ctransid = 
 btrfs_root_ctransid(right_root-root_item);
 +   spin_unlock(right_root-root_times_lock);
 +   if (ctransid != right_start_ctransid)
 +   left_start_ctransid = 0;
 Shouldn't it be here right_start_ctransid=0? Otherwise,
 right_start_ctransid is pretty useless in this function.

Hmm you're right, it should be right_start_ctransid. However...the
code was working by accident because the next if does check for left
and right :)
Fixed that in my git repo.
 +
 +   if (!left_start_ctransid || !right_start_ctransid) {
 +   WARN(1, KERN_WARNING
 +   btrfs: btrfs_compare_tree detected 
 +   a change in one of the trees while 
 +   iterating. This is probably a 
 +   bug.\n);
 +   ret = -EIO;
 +   goto out;
 +   }

 I am reading the code have more questions (and comments), but will
 send them all later.

 Alex.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 5/7] Btrfs: add btrfs_compare_trees function

2012-07-04 Thread Alexander Block
On Wed, Jul 4, 2012 at 9:13 PM, Alex Lyakas
alex.bolshoy.bt...@gmail.com wrote:
 Hi Alex,

 +static int tree_compare_item(struct btrfs_root *left_root,
 +struct btrfs_path *left_path,
 +struct btrfs_path *right_path,
 +char *tmp_buf)
 +{
 +   int cmp;
 +   int len1, len2;
 +   unsigned long off1, off2;
 +
 +   len1 = btrfs_item_size_nr(left_path-nodes[0], left_path-slots[0]);
 +   len2 = btrfs_item_size_nr(right_path-nodes[0], 
 right_path-slots[0]);
 +   if (len1 != len2)
 +   return 1;
 +
 +   off1 = btrfs_item_ptr_offset(left_path-nodes[0], 
 left_path-slots[0]);
 +   off2 = btrfs_item_ptr_offset(right_path-nodes[0],
 +   right_path-slots[0]);
 +
 +   read_extent_buffer(left_path-nodes[0], tmp_buf, off1, len1);
 +
 +   cmp = memcmp_extent_buffer(right_path-nodes[0], tmp_buf, off2, 
 len1);
 +   if (cmp)
 +   return 1;
 +   return 0;
 +}
 It might be worth to note in the comment, that tmp_buff should be
 large enough to hold the item from the left tree. Can it happen that
 the right tree has a different leafsize?

This function is only to be used for for the tree compare function and
there we allocate a buffer of root-leafsize, so definitely all items
should fit. As far as I know, Chris (please correct me if I'm wrong)
once guaranteed that ALL trees in a FS will have the same leaf size
and this will ever be the case.
 +   /*
 +* Strategy: Go to the first items of both trees. Then do
 +*
 +* If both trees are at level 0
 +*   Compare keys of current items
 +* If left  right treat left item as new, advance left tree
 +*   and repeat
 +* If left  right treat right item as deleted, advance right 
 tree
 +*   and repeat
 +* If left == right do deep compare of items, treat as changed if
 +*   needed, advance both trees and repeat
 +* If both trees are at the same level but not at level 0
 +*   Compare keys of current nodes/leafs
 +* If left  right advance left tree and repeat
 +* If left  right advance right tree and repeat
 +* If left == right compare blockptrs of the next nodes/leafs
 +*   If they match advance both trees but stay at the same level
 +* and repeat
 +*   If they don't match advance both trees while allowing to go
 +* deeper and repeat
 +* If tree levels are different
 +*   Advance the tree that needs it and repeat
 +*
 +* Advancing a tree means:
 +*   If we are at level 0, try to go to the next slot. If that's not
 +*   possible, go one level up and repeat. Stop when we found a level
 +*   where we could go to the next slot. We may at this point be on a
 +*   node or a leaf.
 +*
 +*   If we are not at level 0 and not on shared tree blocks, go one
 +*   level deeper.
 +*
 +*   If we are not at level 0 and on shared tree blocks, go one slot 
 to
 +*   the right if possible or go up and right.
 +*/
 According to the strategy and to the code later, left tree is
 treated as newer one, while right as older one, correct? Do you
 think it would be more intuitive to make it the other way around,
 although I guess this is a matter of personal taste. I had to draw the
 leafs reversed to keep going:
 R   L
 - -
 | | | | | | | |
 - -


To be honest...I always preferred the way you suggested in the past
when I thought about compares. But for some reason, I didn't even
think about that and just implemented that function in single
flow...it took days until I've even noticed that I swapped left/right
in my head :D I now would like to stay with that, as all the btrfs
send code uses left/right in this way and I never had the problem with
mixing that up again. If people like, I have nothing against changing
that later if someone wants to, but that's nothing I would like to do
myself.
 +   if (advance_left  !left_end_reached) {
 +   ret = tree_advance(left_root, left_path, left_level,
 +   left_root_level,
 +   advance_left != ADVANCE_ONLY_NEXT,
 +   left_key);
 +   if (ret  0)
 +   left_end_reached = ADVANCE;
 +   advance_left = 0;
 +   }
 +   if (advance_right  !right_end_reached) {
 +   ret = tree_advance(right_root, right_path, 
 right_level,
 +   right_root_level,
 +   advance_right != ADVANCE_ONLY_NEXT,
 +   

Re: [RFC PATCH 5/7] Btrfs: add btrfs_compare_trees function

2012-07-04 Thread David Sterba
On Wed, Jul 04, 2012 at 10:18:34PM +0200, Alexander Block wrote:
  It might be worth to note in the comment, that tmp_buff should be
  large enough to hold the item from the left tree. Can it happen that
  the right tree has a different leafsize?
 
 This function is only to be used for for the tree compare function and
 there we allocate a buffer of root-leafsize, so definitely all items
 should fit. As far as I know, Chris (please correct me if I'm wrong)
 once guaranteed that ALL trees in a FS will have the same leaf size
 and this will ever be the case.

Not only leaves are of the same size in all trees, but also nodes, since
the metadata bigblocks patches.

david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html