date:20110818

btrfs and mainline and git

2011-08-18 Thread Anand Jain



Hello,

1.
 I normally copy btrfs into the mainline tree and run make.
 However, with the recent btrfs release it is failing with
 the following. Any ideas?

-
# make
scripts/kconfig/conf --silentoldconfig Kconfig
fs/btrfs/Kconfig:6: syntax error
fs/Kconfig:9: missing end statement for this entry
fs/Kconfig:5: missing end statement for this entry
fs/btrfs/Kconfig:5: invalid statement
arch/x86/Kconfig:253: recursive inclusion detected. Inclusion path:
  current file : 'arch/x86/Kconfig'
  included from: 'fs/btrfs/Kconfig:11'
  included from: 'fs/Kconfig:36'
  included from: 'arch/x86/Kconfig:2145'
make[2]: *** [silentoldconfig] Error 1
make[1]: *** [silentoldconfig] Error 2
make: *** No rule to make target `include/config/auto.conf', needed by `include/config/kernel.release'.  Stop.

#
-


2.
 Looks like the official way is to use git merge. What are the
 recommended git (merge?) steps to integrate btrfs into the mainline?

Thanks for your time,
-Anand
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] Btrfs: check if there is enough space for balancing smarter

2011-08-18 Thread liubo

On 08/19/2011 10:46 AM, David Sterba wrote:
> Hi,
> 
> too late, already pulled
> 
> On Wed, Aug 03, 2011 at 06:15:25PM +0800, Liu Bo wrote:
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -6682,6 +6682,10 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 
>> bytenr)
>>  struct btrfs_space_info *space_info;
>>  struct btrfs_fs_devices *fs_devices = root->fs_info->fs_devices;
>>  struct btrfs_device *device;
>> +u64 min_free;
> ^^^
> 
>> +int index;
>> +int dev_nr = 0;
>> +int dev_min = 1;
>>  int full = 0;
>>  int ret = 0;
>>  
>> @@ -6728,9 +6733,29 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 
>> bytenr)
>>  if (full)
>>  goto out;
>>  
>> +/*
>> + * index:
>> + *  0: raid10
>> + *  1: raid1
>> + *  2: dup
>> + *  3: raid0
>> + *  4: single
>> + */
>> +index = get_block_group_index(block_group);
>> +if (index == 0) {
>> +dev_min = 4;
>> +min_free /= 2;
>> +} else if (index == 1) {
>> +dev_min = 2;
>> +} else if (index == 2) {
>> +min_free *= 2;
>> +} else if (index == 3) {
>> +dev_min = fs_devices->rw_devices;
>> +min_free /= dev_min;
> ^^^
> 
> 64bit division will break 32bit builds, can you please convert it to
> do_div ? the other is 'div-by-power-of-2' which will most probably be
> converted to shifts.
> 

This is my fault, sorry.  Will fix it soon.


thanks,
liubo

> 
> david
> 
>> +}
>> +
>>  mutex_lock(&root->fs_info->chunk_mutex);
>>  list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
>> -u64 min_free = btrfs_block_group_used(&block_group->item);
>>  u64 dev_offset;
>>  
>>  /*
>> @@ -6741,7 +6766,11 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 
>> bytenr)
>>  ret = find_free_dev_extent(NULL, device, min_free,
>> &dev_offset, NULL);
>>  if (!ret)
>> +dev_nr++;
>> +
>> +if (dev_nr >= dev_min)
>>  break;
>> +
>>  ret = -1;
>>  }
>>  }
>> -- 
>> 1.6.5.2
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


Re: [PATCH] Btrfs: check if there is enough space for balancing smarter

2011-08-18 Thread David Sterba

Hi,

too late, already pulled

On Wed, Aug 03, 2011 at 06:15:25PM +0800, Liu Bo wrote:
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -6682,6 +6682,10 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 
> bytenr)
>   struct btrfs_space_info *space_info;
>   struct btrfs_fs_devices *fs_devices = root->fs_info->fs_devices;
>   struct btrfs_device *device;
> + u64 min_free;
^^^

> + int index;
> + int dev_nr = 0;
> + int dev_min = 1;
>   int full = 0;
>   int ret = 0;
>  
> @@ -6728,9 +6733,29 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 
> bytenr)
>   if (full)
>   goto out;
>  
> + /*
> +  * index:
> +  *  0: raid10
> +  *  1: raid1
> +  *  2: dup
> +  *  3: raid0
> +  *  4: single
> +  */
> + index = get_block_group_index(block_group);
> + if (index == 0) {
> + dev_min = 4;
> + min_free /= 2;
> + } else if (index == 1) {
> + dev_min = 2;
> + } else if (index == 2) {
> + min_free *= 2;
> + } else if (index == 3) {
> + dev_min = fs_devices->rw_devices;
> + min_free /= dev_min;
^^^

64-bit division will break 32-bit builds; can you please convert it to
do_div()? The other one is a division by a power of 2, which will most
probably be converted to a shift.


david
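
For reference, a minimal userspace sketch of the conversion asked for
above. In the kernel, do_div(n, base) divides the 64-bit n in place by a
32-bit base and returns the remainder. The helper sketch_do_div() below
is a hypothetical stand-in that takes a pointer (the real do_div() is a
macro that takes n by name), and min_free_per_device() is an
illustrative name, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel's do_div() macro:
 * divides *n in place by a 32-bit base and returns the remainder.
 */
static uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
	uint32_t rem;

	assert(base != 0);
	rem = (uint32_t)(*n % base);
	*n /= base;
	return rem;
}

/*
 * Illustrative only: split the space a raid0 block group needs
 * evenly across dev_min devices, as the "index == 3" branch does.
 */
static uint64_t min_free_per_device(uint64_t min_free, uint32_t dev_min)
{
	sketch_do_div(&min_free, dev_min);	/* remainder is discarded */
	return min_free;
}
```

The quotient ends up in the first argument, so a call like
min_free_per_device(1024, 4) yields 256 without using a plain 64-bit
"/" at the call site.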

> + }
> +
>   mutex_lock(&root->fs_info->chunk_mutex);
>   list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
> - u64 min_free = btrfs_block_group_used(&block_group->item);
>   u64 dev_offset;
>  
>   /*
> @@ -6741,7 +6766,11 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 
> bytenr)
>   ret = find_free_dev_extent(NULL, device, min_free,
>  &dev_offset, NULL);
>   if (!ret)
> + dev_nr++;
> +
> + if (dev_nr >= dev_min)
>   break;
> +
>   ret = -1;
>   }
>   }
> -- 
> 1.6.5.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: kernel oops -> no mount (different problem than others)

2011-08-18 Thread cwillu

You might try mounting it "-o ro" as a stopgap to regain readonly access.

Judging from the boot log, the error itself appears to be ENOSPC, in
which case there's no readily available quick fix; I expect a
developer to chime in any second now :p

> From the logs it is listing a transid error but NOT that it is
> expecting a different one, simply
>
> device label 1TB devid 1 transid 248472 /dev/sdb

That particular line is the normal listing of devices, which is
expected and completely normal.

kernel oops -> no mount (different problem than others)

2011-08-18 Thread Liam

Hi.
I've been using btrfs on an external drive for more than a year, but now I cannot mount it.
From the logs it is listing a transid error, but NOT that it is
expecting a different one, simply

device label 1TB devid 1 transid 248472 /dev/sdb

I've been having this problem now for more than 1 month and have not
managed to get anywhere with it.
Any help would be appreciated.

Attached is the (edited) boot log.
On Fedora 15 w/ updates-testing enabled.
[   84.596465] device label 1TB devid 1 transid 248469 /dev/sdb
[  226.128630] ------------[ cut here ]------------
[  226.128666] WARNING: at fs/btrfs/extent-tree.c:5685 btrfs_alloc_free_block+0xca/0x27c [btrfs]()
[  226.128668] Hardware name: 4384FM4
[  226.128669] Modules linked in: fuse ebtable_nat ebtables ipt_MASQUERADE 
iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge 8021q garp stp llc sunrpc 
acpi_cpufreq freq_table mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 
ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state 
nf_conntrack rfcomm bnep snd_hda_codec_hdmi snd_hda_codec_conexant 
snd_hda_intel snd_hda_codec snd_hwdep thinkpad_acpi intel_ips snd_seq 
snd_seq_device snd_pcm iTCO_wdt iTCO_vendor_support virtio_net snd_timer arc4 
iwlagn e1000e uvcvideo videodev media snd_page_alloc v4l2_compat_ioctl32 btusb 
bluetooth snd mac80211 cfg80211 i2c_i801 rfkill kvm_intel kvm microcode joydev 
soundcore ipv6 btrfs zlib_deflate libcrc32c usb_storage uas sdhci_pci sdhci 
firewire_ohci firewire_core mmc_core crc_itu_t mxm_wmi wmi i915 drm_kms_helper 
drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
[  226.128706] Pid: 758, comm: mount Not tainted 2.6.40.3-0.fc15.x86_64 #1
[  226.128707] Call Trace:
[  226.128716]  [] warn_slowpath_common+0x83/0x9b
[  226.128719]  [] warn_slowpath_null+0x1a/0x1c
[  226.128727]  [] btrfs_alloc_free_block+0xca/0x27c [btrfs]
[  226.128740]  [] ? map_extent_buffer+0x6e/0xa8 [btrfs]
[  226.128747]  [] __btrfs_cow_block+0xfc/0x30c [btrfs]
[  226.128756]  [] ? btrfs_buffer_uptodate+0x47/0x58 [btrfs]
[  226.128763]  [] ? read_block_for_search+0x94/0x397 [btrfs]
[  226.128770]  [] btrfs_cow_block+0xfb/0x145 [btrfs]
[  226.128777]  [] btrfs_search_slot+0x14d/0x4b6 [btrfs]
[  226.128785]  [] lookup_inline_extent_backref+0x99/0x31e [btrfs]
[  226.128789]  [] ? kmem_cache_alloc+0x44/0x10b
[  226.128797]  [] __btrfs_free_extent+0xc0/0x558 [btrfs]
[  226.128802]  [] ? _raw_spin_lock+0xe/0x10
[  226.128804]  [] ? virt_to_head_page+0xe/0x31
[  226.128806]  [] ? kfree+0x4d/0xda
[  226.128818]  [] ? btrfs_delayed_ref_lock+0x3f/0x9d [btrfs]
[  226.128826]  [] run_clustered_refs+0x615/0x672 [btrfs]
[  226.128838]  [] ? btrfs_find_ref_cluster+0x5a/0x145 [btrfs]
[  226.128846]  [] btrfs_run_delayed_refs+0xd1/0x193 [btrfs]
[  226.128856]  [] __btrfs_end_transaction+0x8e/0x1f0 [btrfs]
[  226.128866]  [] btrfs_end_transaction+0x15/0x17 [btrfs]
[  226.128877]  [] btrfs_evict_inode+0x171/0x20e [btrfs]
[  226.128880]  [] evict+0x77/0x117
[  226.128882]  [] iput+0x130/0x138
[  226.128893]  [] btrfs_orphan_cleanup+0x1ee/0x2c0 [btrfs]
[  226.128903]  [] open_ctree+0x11a2/0x13ff [btrfs]
[  226.128910]  [] btrfs_mount+0x233/0x496 [btrfs]
[  226.128913]  [] ? pcpu_next_pop+0x3d/0x4a
[  226.128915]  [] ? pcpu_alloc+0x7f7/0x833
[  226.128919]  [] mount_fs+0x69/0x155
[  226.128921]  [] ? __alloc_percpu+0x10/0x12
[  226.128923]  [] vfs_kern_mount+0x63/0x9d
[  226.128926]  [] do_kern_mount+0x4d/0xdf
[  226.128928]  [] do_mount+0x63c/0x69f
[  226.128930]  [] ? memdup_user+0x55/0x7d
[  226.128932]  [] ? strndup_user+0x3b/0x51
[  226.128934]  [] sys_mount+0x88/0xc2
[  226.128936]  [] system_call_fastpath+0x16/0x1b
[  226.128938] ---[ end trace b471214728831853 ]---
[  226.128991] ------------[ cut here ]------------
[  226.129000] WARNING: at fs/btrfs/extent-tree.c:5685 btrfs_alloc_free_block+0xca/0x27c [btrfs]()
[  226.129002] Hardware name: 4384FM4
[  226.129002] Modules linked in: fuse ebtable_nat ebtables ipt_MASQUERADE 
iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge 8021q garp stp llc sunrpc 
acpi_cpufreq freq_table mperf ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 
ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state 
nf_conntrack rfcomm bnep snd_hda_codec_hdmi snd_hda_codec_conexant 
snd_hda_intel snd_hda_codec snd_hwdep thinkpad_acpi intel_ips snd_seq 
snd_seq_device snd_pcm iTCO_wdt iTCO_vendor_support virtio_net snd_timer arc4 
iwlagn e1000e uvcvideo videodev media snd_page_alloc v4l2_compat_ioctl32 btusb 
bluetooth snd mac80211 cfg80211 i2c_i801 rfkill kvm_intel kvm microcode joydev 
soundcore ipv6 btrfs zlib_deflate libcrc32c usb_storage uas sdhci_pci sdhci 
firewire_ohci firewire_core mmc_core crc_itu_t mxm_wmi wmi i915 drm_kms_helper 
drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
[  226.129031] Pid: 758, comm: mount Tainted: GW   2.6.40.3-0.fc15.x86_64 #1
[  226.129032] Call Trace:
[  226.129035]  [] warn_slowpath_common+0x83/0x9b
[  226.129038]  [] warn_slowpath_null+0x1a/0x1c

[PATCH 8/8] btrfs: make del_ptr() and btrfs_del_leaf() void

2011-08-18 Thread Mark Fasheh

From: Mark Fasheh 

Since fixup_low_keys() has been made void, del_ptr() always returns zero. We
can then make it void as well. This allows us in turn to make
btrfs_del_leaf() void as the only return value it was previously catching
was from del_ptr(). This winds up removing a couple of un-needed BUG_ON(ret)
lines.

Signed-off-by: Mark Fasheh 
---
 fs/btrfs/ctree.c |   40 +---
 1 files changed, 13 insertions(+), 27 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index b42caec..32f8030 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -36,8 +36,8 @@ static int balance_node_right(struct btrfs_trans_handle *trans,
  struct btrfs_root *root,
  struct extent_buffer *dst_buf,
  struct extent_buffer *src_buf);
-static int del_ptr(struct btrfs_trans_handle *trans, struct btrfs_root *root,
-  struct btrfs_path *path, int level, int slot);
+static void del_ptr(struct btrfs_trans_handle *trans, struct btrfs_root *root,
+   struct btrfs_path *path, int level, int slot);
 
 struct btrfs_path *btrfs_alloc_path(void)
 {
@@ -1005,10 +1005,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
if (btrfs_header_nritems(right) == 0) {
clean_tree_block(trans, root, right);
btrfs_tree_unlock(right);
-   wret = del_ptr(trans, root, path, level + 1, pslot +
-  1);
-   if (wret)
-   ret = wret;
+   del_ptr(trans, root, path, level + 1, pslot + 1);
root_sub_used(root, right->len);
btrfs_free_tree_block(trans, root, right, 0, 1);
free_extent_buffer(right);
@@ -1046,9 +1043,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
if (btrfs_header_nritems(mid) == 0) {
clean_tree_block(trans, root, mid);
btrfs_tree_unlock(mid);
-   wret = del_ptr(trans, root, path, level + 1, pslot);
-   if (wret)
-   ret = wret;
+   del_ptr(trans, root, path, level + 1, pslot);
root_sub_used(root, mid->len);
btrfs_free_tree_block(trans, root, mid, 0, 1);
free_extent_buffer(mid);
@@ -3673,12 +3668,11 @@ int btrfs_insert_item(struct btrfs_trans_handle *trans, struct btrfs_root
  * the tree should have been previously balanced so the deletion does not
  * empty a node.
  */
-static int del_ptr(struct btrfs_trans_handle *trans, struct btrfs_root *root,
-  struct btrfs_path *path, int level, int slot)
+static void del_ptr(struct btrfs_trans_handle *trans, struct btrfs_root *root,
+   struct btrfs_path *path, int level, int slot)
 {
struct extent_buffer *parent = path->nodes[level];
u32 nritems;
-   int ret = 0;
 
nritems = btrfs_header_nritems(parent);
if (slot != nritems - 1) {
@@ -3701,7 +3695,6 @@ static int del_ptr(struct btrfs_trans_handle *trans, struct btrfs_root *root,
fixup_low_keys(trans, root, path, &disk_key, level + 1);
}
btrfs_mark_buffer_dirty(parent);
-   return ret;
 }
 
 /*
@@ -3714,17 +3707,13 @@ static int del_ptr(struct btrfs_trans_handle *trans, struct btrfs_root *root,
  * The path must have already been setup for deleting the leaf, including
  * all the proper balancing.  path->nodes[1] must be locked.
  */
-static noinline int btrfs_del_leaf(struct btrfs_trans_handle *trans,
-  struct btrfs_root *root,
-  struct btrfs_path *path,
-  struct extent_buffer *leaf)
+static noinline void btrfs_del_leaf(struct btrfs_trans_handle *trans,
+   struct btrfs_root *root,
+   struct btrfs_path *path,
+   struct extent_buffer *leaf)
 {
-   int ret;
-
WARN_ON(btrfs_header_generation(leaf) != trans->transid);
-   ret = del_ptr(trans, root, path, 1, path->slots[1]);
-   if (ret)
-   return ret;
+   del_ptr(trans, root, path, 1, path->slots[1]);
 
/*
 * btrfs_free_extent is expensive, we want to make sure we
@@ -3735,7 +3724,6 @@ static noinline int btrfs_del_leaf(struct btrfs_trans_handle *trans,
root_sub_used(root, leaf->len);
 
btrfs_free_tree_block(trans, root, leaf, 0, 1);
-   return 0;
 }
 /*
  * delete the item at the leaf level in path.  If that empties
@@ -3792,8 +3780,7 @@ int btrfs_del_items(struct btrfs_trans_handle *trans, struct btrfs_root *root,
} else {
btrfs_set_path_blocking(path);
clean_tree_block(tr

[PATCH 5/8] btrfs: Don't BUG_ON errors in __finish_chunk_alloc()

2011-08-18 Thread Mark Fasheh

From: Mark Fasheh 

All callers of __finish_chunk_alloc() BUG_ON() its return value, so it's trivial
for us to always bubble up any errors caught in __finish_chunk_alloc() to be
caught there.

Signed-off-by: Mark Fasheh 
---
 fs/btrfs/volumes.c |7 ++-
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 53875ae..5d166c2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2600,16 +2600,13 @@ static int __finish_chunk_alloc(struct btrfs_trans_handle *trans,
key.offset = chunk_offset;
 
ret = btrfs_insert_item(trans, chunk_root, &key, chunk, item_size);
-   BUG_ON(ret);
-
-   if (map->type & BTRFS_BLOCK_GROUP_SYSTEM) {
+   if (ret == 0 && map->type & BTRFS_BLOCK_GROUP_SYSTEM) {
ret = btrfs_add_system_chunk(trans, chunk_root, &key, chunk,
 item_size);
-   BUG_ON(ret);
}
 
kfree(chunk);
-   return 0;
+   return ret;
 }
 
 /*
-- 
1.7.6
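
The control flow this patch ends up with can be sketched in plain
userspace C as follows. All names here (insert_item(),
add_system_chunk(), finish_alloc(), SKETCH_FLAG_SYSTEM) are hypothetical
stand-ins, not btrfs functions. The second step runs only if the first
succeeded, cleanup happens on every path, and the first error is handed
back to the caller instead of being trapped locally.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define SKETCH_FLAG_SYSTEM 0x2	/* hypothetical stand-in for a chunk type bit */

/* Hypothetical steps; each returns 0 on success or a negative errno. */
static int insert_item(int fail)      { return fail ? -ENOSPC : 0; }
static int add_system_chunk(int fail) { return fail ? -EFBIG : 0; }

static int finish_alloc(unsigned type, int fail_insert, int fail_sys)
{
	char *chunk = malloc(16);
	int ret;

	if (!chunk)
		return -ENOMEM;

	ret = insert_item(fail_insert);
	/* Run the second step only if the first one succeeded. */
	if (ret == 0 && (type & SKETCH_FLAG_SYSTEM))
		ret = add_system_chunk(fail_sys);

	free(chunk);		/* cleanup runs on success and on error */
	assert(ret <= 0);	/* convention: 0 or a negative errno */
	return ret;		/* the caller decides how to handle it */
}
```

With this shape there is a single exit point, so the buffer is freed
exactly once no matter which step failed.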


[PATCH 7/8] btrfs: make fixup_low_keys() void

2011-08-18 Thread Mark Fasheh

From: Mark Fasheh 

This is trivial - fixup_low_keys always returns zero so we can make it void.
As a result, we can then make setup_items_for_insert() void too which lets
us cut out a couple of BUG_ON(ret) lines.

Signed-off-by: Mark Fasheh 
---
 fs/btrfs/ctree.c |   59 ++---
 fs/btrfs/ctree.h |8 +++---
 fs/btrfs/delayed-inode.c |6 +---
 3 files changed, 25 insertions(+), 48 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 84e4053e..b42caec 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -1874,16 +1874,12 @@ done:
  * This is used after shifting pointers to the left, so it stops
  * fixing up pointers when a given leaf/node is not in slot 0 of the
  * higher levels
- *
- * If this fails to write a tree block, it returns -1, but continues
- * fixing up the blocks in ram so the tree is consistent.
  */
-static int fixup_low_keys(struct btrfs_trans_handle *trans,
- struct btrfs_root *root, struct btrfs_path *path,
- struct btrfs_disk_key *key, int level)
+static void fixup_low_keys(struct btrfs_trans_handle *trans,
+  struct btrfs_root *root, struct btrfs_path *path,
+  struct btrfs_disk_key *key, int level)
 {
int i;
-   int ret = 0;
struct extent_buffer *t;
 
for (i = level; i < BTRFS_MAX_LEVEL; i++) {
@@ -1896,7 +1892,6 @@ static int fixup_low_keys(struct btrfs_trans_handle *trans,
if (tslot != 0)
break;
}
-   return ret;
 }
 
 /*
@@ -2524,7 +2519,6 @@ static noinline int __push_leaf_left(struct btrfs_trans_handle *trans,
u32 old_left_nritems;
u32 nr;
int ret = 0;
-   int wret;
u32 this_item_size;
u32 old_left_item_size;
 
@@ -2630,9 +2624,7 @@ static noinline int __push_leaf_left(struct btrfs_trans_handle *trans,
clean_tree_block(trans, root, right);
 
btrfs_item_key(right, &disk_key, 0);
-   wret = fixup_low_keys(trans, root, path, &disk_key, 1);
-   if (wret)
-   ret = wret;
+   fixup_low_keys(trans, root, path, &disk_key, 1);
 
/* then fixup the leaf pointer in the path */
if (path->slots[0] < push_items) {
@@ -2988,12 +2980,8 @@ again:
free_extent_buffer(path->nodes[0]);
path->nodes[0] = right;
path->slots[0] = 0;
-   if (path->slots[1] == 0) {
-   wret = fixup_low_keys(trans, root,
-   path, &disk_key, 1);
-   if (wret)
-   ret = wret;
-   }
+   if (path->slots[1] == 0)
+   fixup_low_keys(trans, root, path, &disk_key, 1);
}
btrfs_mark_buffer_dirty(right);
return ret;
@@ -3209,10 +3197,9 @@ int btrfs_duplicate_item(struct btrfs_trans_handle *trans,
return ret;
 
path->slots[0]++;
-   ret = setup_items_for_insert(trans, root, path, new_key, &item_size,
-item_size, item_size +
-sizeof(struct btrfs_item), 1);
-   BUG_ON(ret);
+   setup_items_for_insert(trans, root, path, new_key, &item_size,
+  item_size, item_size +
+  sizeof(struct btrfs_item), 1);
 
leaf = path->nodes[0];
memcpy_extent_buffer(leaf,
@@ -3515,7 +3502,7 @@ int btrfs_insert_some_items(struct btrfs_trans_handle *trans,
ret = 0;
if (slot == 0) {
btrfs_cpu_key_to_disk(&disk_key, cpu_key);
-   ret = fixup_low_keys(trans, root, path, &disk_key, 1);
+   fixup_low_keys(trans, root, path, &disk_key, 1);
}
 
if (btrfs_leaf_free_space(root, leaf) < 0) {
@@ -3533,17 +3520,16 @@ out:
  * to save stack depth by doing the bulk of the work in a function
  * that doesn't call btrfs_search_slot
  */
-int setup_items_for_insert(struct btrfs_trans_handle *trans,
-  struct btrfs_root *root, struct btrfs_path *path,
-  struct btrfs_key *cpu_key, u32 *data_size,
-  u32 total_data, u32 total_size, int nr)
+void setup_items_for_insert(struct btrfs_trans_handle *trans,
+   struct btrfs_root *root, struct btrfs_path *path,
+   struct btrfs_key *cpu_key, u32 *data_size,
+   u32 total_data, u32 total_size, int nr)
 {
struct btrfs_item *item;
int i;
u32 nritems;
unsigned int data_end;
struct btrfs_disk_key disk_key;
-   int ret;
struct extent_buffer *leaf;
int slot;
 
@@ -3604,10 +3590,9 @@ int setup_items_f

[PATCH 6/8] btrfs: fix error check of btrfs_lookup_dentry()

2011-08-18 Thread Mark Fasheh

From: Tsutomu Itoh 

Clean up btrfs_lookup_dentry() to never return NULL, but ERR_PTR(-ENOENT)
instead. This keeps the return value convention consistent.

Callers who pass to d_instantiate() require a trivial update.

create_snapshot() in particular looks like it can also lose a BUG_ON(!inode)
which is not really needed - there seems less harm in returning ENOENT to
userspace at that point in the stack than there is to crash the machine.

Mark: Fixed conflicts against latest tree, gave the patch a more thorough
description.

Signed-off-by: Tsutomu Itoh 
Signed-off-by: Mark Fasheh 
---
 fs/btrfs/inode.c |   12 ++--
 fs/btrfs/ioctl.c |   11 +--
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7028c0c..9f3a85d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4027,7 +4027,7 @@ struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry)
return ERR_PTR(ret);
 
if (location.objectid == 0)
-   return NULL;
+   return ERR_PTR(-ENOENT);
 
if (location.type == BTRFS_INODE_ITEM_KEY) {
inode = btrfs_iget(dir->i_sb, &location, root, NULL);
@@ -4085,7 +4085,15 @@ static void btrfs_dentry_release(struct dentry *dentry)
 static struct dentry *btrfs_lookup(struct inode *dir, struct dentry *dentry,
   struct nameidata *nd)
 {
-   return d_splice_alias(btrfs_lookup_dentry(dir, dentry), dentry);
+   struct inode *inode = btrfs_lookup_dentry(dir, dentry);
+   if (IS_ERR(inode)) {
+   if (PTR_ERR(inode) == -ENOENT)
+   inode = NULL;
+   else
+   return ERR_CAST(inode);
+   }
+
+   return d_splice_alias(inode, dentry);
 }
 
 unsigned char btrfs_filetype_table[] = {
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index b3440f5..692eac2 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -325,6 +325,7 @@ static noinline int create_subvol(struct btrfs_root *root,
struct btrfs_root *new_root;
struct dentry *parent = dentry->d_parent;
struct inode *dir;
+   struct inode *inode;
int ret;
int err;
u64 objectid;
@@ -435,7 +436,13 @@ static noinline int create_subvol(struct btrfs_root *root,
 
BUG_ON(ret);
 
-   d_instantiate(dentry, btrfs_lookup_dentry(dir, dentry));
+   inode = btrfs_lookup_dentry(dir, dentry);
+   if (IS_ERR(inode)) {
+   ret = PTR_ERR(inode);
+   goto fail;
+   }
+
+   d_instantiate(dentry, inode);
 fail:
if (async_transid) {
*async_transid = trans->transid;
@@ -505,7 +512,7 @@ static int create_snapshot(struct btrfs_root *root, struct dentry *dentry,
ret = PTR_ERR(inode);
goto fail;
}
-   BUG_ON(!inode);
+
d_instantiate(dentry, inode);
ret = 0;
 fail:
-- 
1.7.6
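
For readers unfamiliar with the convention used here: the kernel encodes
small negative errno values in the pointer itself via ERR_PTR(), IS_ERR()
and PTR_ERR(). Below is a rough userspace approximation. The sketch_*
helpers and the lookup()/lookup_or_null() functions are hypothetical and
simplified, not the kernel definitions and not code from this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
static inline void *sketch_err_ptr(long err)
{
	assert(err < 0 && err >= -SKETCH_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* Decode the errno again. */
static inline long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* True if p lies in the top SKETCH_MAX_ERRNO addresses, i.e. is an error. */
static inline int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct inode_sketch { int ino; };
static struct inode_sketch the_inode = { 42 };

/* Never returns NULL: "not found" is reported as an encoded -ENOENT. */
static struct inode_sketch *lookup(int objectid)
{
	if (objectid < 0)
		return sketch_err_ptr(-EIO);
	if (objectid == 0)
		return sketch_err_ptr(-ENOENT);
	return &the_inode;
}

/*
 * A caller that wants NULL for "not found": -ENOENT is mapped back to
 * NULL, every other error pointer is passed through unchanged.
 */
static struct inode_sketch *lookup_or_null(int objectid)
{
	struct inode_sketch *inode = lookup(objectid);

	if (sketch_is_err(inode) && sketch_ptr_err(inode) == -ENOENT)
		return NULL;
	return inode;
}
```

Callers that cannot accept NULL check for an error pointer and bail out,
while a caller like lookup_or_null() translates only the one error it
treats as "no entry".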


[PATCH 4/8] btrfs: make insert_ptr() void

2011-08-18 Thread Mark Fasheh

From: Mark Fasheh 

insert_ptr() always returns zero, so all the extra error handling can go
away.  This makes it trivial to also make copy_for_split() a void function
as its only return value was from insert_ptr(). Finally, this all makes the
BUG_ON(ret) in split_leaf() meaningless so I removed that.

Signed-off-by: Mark Fasheh 
---
 fs/btrfs/ctree.c |   59 -
 1 files changed, 18 insertions(+), 41 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 5064930..84e4053e 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -2134,12 +2134,10 @@ static noinline int insert_new_root(struct btrfs_trans_handle *trans,
  *
  * slot and level indicate where you want the key to go, and
  * blocknr is the block the key points to.
- *
- * returns zero on success and < 0 on any error
  */
-static int insert_ptr(struct btrfs_trans_handle *trans, struct btrfs_root
- *root, struct btrfs_path *path, struct btrfs_disk_key
- *key, u64 bytenr, int slot, int level)
+static void insert_ptr(struct btrfs_trans_handle *trans, struct btrfs_root
+  *root, struct btrfs_path *path, struct btrfs_disk_key
+  *key, u64 bytenr, int slot, int level)
 {
struct extent_buffer *lower;
int nritems;
@@ -2163,7 +2161,6 @@ static int insert_ptr(struct btrfs_trans_handle *trans, struct btrfs_root
btrfs_set_node_ptr_generation(lower, slot, trans->transid);
btrfs_set_header_nritems(lower, nritems + 1);
btrfs_mark_buffer_dirty(lower);
-   return 0;
 }
 
 /*
@@ -2184,7 +2181,6 @@ static noinline int split_node(struct btrfs_trans_handle *trans,
struct btrfs_disk_key disk_key;
int mid;
int ret;
-   int wret;
u32 c_nritems;
 
c = path->nodes[level];
@@ -2241,11 +2237,8 @@ static noinline int split_node(struct btrfs_trans_handle *trans,
btrfs_mark_buffer_dirty(c);
btrfs_mark_buffer_dirty(split);
 
-   wret = insert_ptr(trans, root, path, &disk_key, split->start,
- path->slots[level + 1] + 1,
- level + 1);
-   if (wret)
-   ret = wret;
+   insert_ptr(trans, root, path, &disk_key, split->start,
+  path->slots[level + 1] + 1, level + 1);
 
if (path->slots[level] >= mid) {
path->slots[level] -= mid;
@@ -2735,18 +2728,16 @@ out:
  *
  * returns 0 if all went well and < 0 on failure.
  */
-static noinline int copy_for_split(struct btrfs_trans_handle *trans,
-  struct btrfs_root *root,
-  struct btrfs_path *path,
-  struct extent_buffer *l,
-  struct extent_buffer *right,
-  int slot, int mid, int nritems)
+static noinline void copy_for_split(struct btrfs_trans_handle *trans,
+   struct btrfs_root *root,
+   struct btrfs_path *path,
+   struct extent_buffer *l,
+   struct extent_buffer *right,
+   int slot, int mid, int nritems)
 {
int data_copy_size;
int rt_data_off;
int i;
-   int ret = 0;
-   int wret;
struct btrfs_disk_key disk_key;
 
nritems = nritems - mid;
@@ -2774,12 +2765,9 @@ static noinline int copy_for_split(struct btrfs_trans_handle *trans,
}
 
btrfs_set_header_nritems(l, mid);
-   ret = 0;
btrfs_item_key(right, &disk_key, 0);
-   wret = insert_ptr(trans, root, path, &disk_key, right->start,
- path->slots[1] + 1, 1);
-   if (wret)
-   ret = wret;
+   insert_ptr(trans, root, path, &disk_key, right->start,
+  path->slots[1] + 1, 1);
 
btrfs_mark_buffer_dirty(right);
btrfs_mark_buffer_dirty(l);
@@ -2797,8 +2785,6 @@ static noinline int copy_for_split(struct btrfs_trans_handle *trans,
}
 
BUG_ON(path->slots[0] < 0);
-
-   return ret;
 }
 
 /*
@@ -2987,12 +2973,8 @@ again:
if (split == 0) {
if (mid <= slot) {
btrfs_set_header_nritems(right, 0);
-   wret = insert_ptr(trans, root, path,
- &disk_key, right->start,
- path->slots[1] + 1, 1);
-   if (wret)
-   ret = wret;
-
+   insert_ptr(trans, root, path, &disk_key, right->start,
+  path->slots[1] + 1, 1);
btrfs_tree_unlock(path->nodes[0]);
free_extent_buffer(path->nodes[0]);
path->nodes[0] = right;
@@ -3000,12 +2982,8 @@ again:
path->slots[1] +=

[PATCH 3/8] btrfs: Don't BUG_ON kzalloc error in btrfs_lookup_csums_range()

2011-08-18 Thread Mark Fasheh

From: Mark Fasheh 

Unfortunately it isn't enough to just exit here - the kzalloc() happens in a
loop and the allocated items are added to a linked list whose head is passed
in from the caller.

To fix the BUG_ON() and also provide the semantic that the list passed in is
only modified on success, I create a function-local temporary list that we add
items to. If no error is met, that list is spliced to the caller's at the
end of the function. Otherwise the list will be walked and all items freed
before the error value is returned.

I did a simple test on this patch by forcing an error at the kzalloc() point
and verifying that when this hits (git clone seemed to exercise this), the
function throws the proper error. Unfortunately but predictably, we later
hit a BUG_ON(ret) type line that still hasn't been fixed up ;)

Signed-off-by: Mark Fasheh 
---
 fs/btrfs/file-item.c |   15 +--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index b910694..679fbff 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -284,6 +284,7 @@ int btrfs_lookup_csums_range(struct btrfs_root *root, u64 start, u64 end,
struct btrfs_ordered_sum *sums;
struct btrfs_sector_sum *sector_sum;
struct btrfs_csum_item *item;
+   LIST_HEAD(tmplist);
unsigned long offset;
int ret;
size_t size;
@@ -358,7 +359,10 @@ int btrfs_lookup_csums_range(struct btrfs_root *root, u64 start, u64 end,
MAX_ORDERED_SUM_BYTES(root));
sums = kzalloc(btrfs_ordered_sum_size(root, size),
GFP_NOFS);
-   BUG_ON(!sums);
+   if (!sums) {
+   ret = -ENOMEM;
+   goto fail;
+   }
 
sector_sum = sums->sums;
sums->bytenr = start;
@@ -380,12 +384,19 @@ int btrfs_lookup_csums_range(struct btrfs_root *root, u64 start, u64 end,
offset += csum_size;
sector_sum++;
}
-   list_add_tail(&sums->list, list);
+   list_add_tail(&sums->list, &tmplist);
}
path->slots[0]++;
}
ret = 0;
 fail:
+   while (ret < 0 && !list_empty(&tmplist)) {
+   sums = list_entry(&tmplist, struct btrfs_ordered_sum, list);
+   list_del(&sums->list);
+   kfree(sums);
+   }
+   list_splice_tail(&tmplist, list);
+
btrfs_free_path(path);
return ret;
 }
-- 
1.7.6
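
The "only modify the caller's list on success" idea from the description
can be sketched in userspace C. This uses a hand-rolled singly linked
list rather than the kernel's list_head API, and every name in it is
hypothetical. Note that the cleanup loop pops the first element of the
temporary list on each pass (the equivalent of tmplist.next with the
kernel list API), so it terminates once the list is empty.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct node {
	int value;
	struct node *next;
};

/* Hypothetical list with head and tail pointers so appends keep order. */
struct list {
	struct node *head;
	struct node *tail;
};

static void list_append(struct list *l, struct node *n)
{
	n->next = NULL;
	if (l->tail)
		l->tail->next = n;
	else
		l->head = n;
	l->tail = n;
}

/*
 * Collect `count` items into `out`. Allocation number `fail_at` is made
 * to fail (pass -1 for no simulated failure); on failure everything
 * gathered so far is freed and `out` is left untouched.
 */
static int collect(struct list *out, int count, int fail_at)
{
	struct list tmp = { NULL, NULL };
	int i;

	assert(out != NULL);

	for (i = 0; i < count; i++) {
		struct node *n = (i == fail_at) ? NULL : malloc(sizeof(*n));

		if (!n)
			goto fail;
		n->value = i;
		list_append(&tmp, n);
	}

	/* Success: splice the temporary list onto the caller's list. */
	if (tmp.head) {
		if (out->tail)
			out->tail->next = tmp.head;
		else
			out->head = tmp.head;
		out->tail = tmp.tail;
	}
	return 0;

fail:
	/* Error: pop and free the first element until tmp is empty. */
	while (tmp.head) {
		struct node *n = tmp.head;

		tmp.head = n->next;
		free(n);
	}
	return -ENOMEM;
}
```

A failed call therefore leaves the caller's list exactly as it was,
which is the guarantee the patch description asks for.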


[PATCH 2/8] btrfs: Don't BUG_ON() errors in update_ref_for_cow()

2011-08-18 Thread Mark Fasheh

From: Mark Fasheh 

The only caller of update_ref_for_cow() is __btrfs_cow_block() which was
originally ignoring any return values. update_ref_for_cow() however doesn't
look like a candidate to become a void function - there are a few places
where errors can occur.

So instead I changed update_ref_for_cow() to bubble all errors up (instead
of BUG_ON). __btrfs_cow_block() was then updated to catch and BUG_ON() any
errors from update_ref_for_cow(). The end effect is that we have no change
in behavior, but about 8 different places where a BUG_ON(ret) was removed.

Obviously a future patch will have to address the BUG_ON() in
__btrfs_cow_block().

Signed-off-by: Mark Fasheh 
---
 fs/btrfs/ctree.c |   31 +--
 1 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 011cab3..5064930 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -331,7 +331,8 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans,
if (btrfs_block_can_be_shared(root, buf)) {
ret = btrfs_lookup_extent_info(trans, root, buf->start,
   buf->len, &refs, &flags);
-   BUG_ON(ret);
+   if (ret)
+   return ret;
BUG_ON(refs == 0);
} else {
refs = 1;
@@ -351,14 +352,18 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans,
 root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID) &&
!(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF)) {
ret = btrfs_inc_ref(trans, root, buf, 1);
-   BUG_ON(ret);
+   if (ret)
+   return ret;
 
if (root->root_key.objectid ==
BTRFS_TREE_RELOC_OBJECTID) {
ret = btrfs_dec_ref(trans, root, buf, 0);
-   BUG_ON(ret);
+   if (ret)
+   return ret;
+
ret = btrfs_inc_ref(trans, root, cow, 1);
-   BUG_ON(ret);
+   if (ret)
+   return ret;
}
new_flags |= BTRFS_BLOCK_FLAG_FULL_BACKREF;
} else {
@@ -368,14 +373,16 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans,
ret = btrfs_inc_ref(trans, root, cow, 1);
else
ret = btrfs_inc_ref(trans, root, cow, 0);
-   BUG_ON(ret);
+   if (ret)
+   return ret;
}
if (new_flags != 0) {
ret = btrfs_set_disk_extent_flags(trans, root,
  buf->start,
  buf->len,
  new_flags, 0);
-   BUG_ON(ret);
+   if (ret)
+   return ret;
}
} else {
if (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF) {
@@ -384,9 +391,12 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans,
ret = btrfs_inc_ref(trans, root, cow, 1);
else
ret = btrfs_inc_ref(trans, root, cow, 0);
-   BUG_ON(ret);
+   if (ret)
+   return ret;
+
ret = btrfs_dec_ref(trans, root, buf, 1);
-   BUG_ON(ret);
+   if (ret)
+   return ret;
}
clean_tree_block(trans, root, buf);
*last_ref = 1;
@@ -415,7 +425,7 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
 {
struct btrfs_disk_key disk_key;
struct extent_buffer *cow;
-   int level;
+   int level, ret;
int last_ref = 0;
int unlock_orig = 0;
u64 parent_start;
@@ -467,7 +477,8 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
(unsigned long)btrfs_header_fsid(cow),
BTRFS_FSID_SIZE);
 
-   update_ref_for_cow(trans, root, buf, cow, &last_ref);
+   ret = update_ref_for_cow(trans, root, buf, cow, &last_ref);
+   BUG_ON(ret);
 
if (root->ref_cows)
btrfs_reloc_cow_block(trans, root, buf, cow);
-- 
1.7.6
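
As an illustration of the pattern this patch applies, here is a minimal
userspace sketch. It is not the kernel code: the helpers step_lookup() and
step_inc_ref(), the function update_refs_sketch() and the injected failures
are all hypothetical. Only the "if (ret) return ret;" shape is taken from the
diff above.

```c
#include <errno.h>

/* Hypothetical stand-ins for the helpers a real callee would invoke. */
static int step_lookup(int inject_error)
{
	return inject_error ? -EIO : 0;
}

static int step_inc_ref(int inject_error)
{
	return inject_error ? -ENOMEM : 0;
}

/*
 * Callee: every failure is handed back to the caller instead of being
 * asserted on in place. The first failing step decides the return value.
 */
static int update_refs_sketch(int fail_lookup, int fail_inc)
{
	int ret;

	ret = step_lookup(fail_lookup);
	if (ret)
		return ret;

	ret = step_inc_ref(fail_inc);
	if (ret)
		return ret;

	return 0;
}
```

With this shape the caller is the only place left that has to decide what to
do with a failure, which is where the single remaining BUG_ON(ret) in
__btrfs_cow_block() sits after the patch.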


[PATCH 1/8] btrfs: Don't BUG_ON errors from btrfs_create_subvol_root()

2011-08-18 Thread Mark Fasheh

From: Mark Fasheh 

This is called from only one place - create_subvol() which passes errors
safely back out to its caller, btrfs_mksubvol where they are handled.

Additionally, btrfs_create_subvol_root() itself BUGs needlessly on an error
return from btrfs_update_inode(). Since create_subvol() was fixed to catch
errors we can bubble this one up too.

Signed-off-by: Mark Fasheh 
---
 fs/btrfs/inode.c |3 +--
 fs/btrfs/ioctl.c |2 ++
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 15fceef..7028c0c 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6722,10 +6722,9 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle *trans,
btrfs_i_size_write(inode, 0);
 
err = btrfs_update_inode(trans, new_root, inode);
-   BUG_ON(err);
 
iput(inode);
-   return 0;
+   return err;
 }
 
 struct inode *btrfs_alloc_inode(struct super_block *sb)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 7cf0133..b3440f5 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -411,6 +411,8 @@ static noinline int create_subvol(struct btrfs_root *root,
btrfs_record_root_in_trans(trans, new_root);
 
ret = btrfs_create_subvol_root(trans, new_root, new_dirid);
+   if (ret)
+   goto fail;
/*
 * insert the directory item
 */
-- 
1.7.6
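
The two halves of this patch can be sketched in userspace as follows. This is
an illustration under assumptions, not the btrfs code: update_inode_sketch(),
create_root_sketch(), create_subvol_sketch() and the dir_items counter are
hypothetical names, and malloc()/free() merely stand in for taking and
dropping the inode reference.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the inode update that may fail. */
static int update_inode_sketch(int inject_error)
{
	return inject_error ? -ENOSPC : 0;
}

/*
 * Callee: drop the reference on both paths, then return the real
 * error code instead of asserting and returning 0 unconditionally.
 */
static int create_root_sketch(int inject_error)
{
	int err;
	char *inode = malloc(16);	/* stands in for the inode reference */

	if (!inode)
		return -ENOMEM;

	err = update_inode_sketch(inject_error);

	free(inode);			/* like iput(): runs on success and failure */
	return err;
}

/*
 * Caller: skip the follow-up work when the callee failed and leave
 * through the common exit label.
 */
static int create_subvol_sketch(int inject_error, int *dir_items)
{
	int ret;

	ret = create_root_sketch(inject_error);
	if (ret)
		goto fail;

	(*dir_items)++;			/* "insert the directory item" only on success */
fail:
	return ret;
}
```

In the sketch, a failed update leaves dir_items untouched and the caller sees
the original error code, which is the behaviour the "goto fail" hunk aims for.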


[PATCH 0/8] btrfs: Error handling fixes

2011-08-18 Thread Mark Fasheh

Hi,

The following are assorted fixes to error handling from all parts of the
Btrfs code.  Every patch in this series stands on its own, with the
exception of the last patch which relies on the one before it (so patches 7
and 8 can be considered a pair).  I also included in this series an
uncommitted patch from Tsutomu Itoh which was a better version of a patch I
had written. He should be cc'd on that mail.

For the most part, I'm still concentrating on eliminating sites where we
BUG_ON(ret) instead of bubbling errors up the stack. The patches were
tested using some simple file system commands and a background kernel build.

A git branch with these patches is available:

git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/btrfs-error-handling.git for_mason


Thanks,
--Mark

Re: [GIT PULL] Btrfs updates

2011-08-18 Thread Sage Weil

Hi Chris, Josef,

Can some form of the clone ioctl transaction start reservation fix go in 
soon as well?  That hits a BUG_ON every time.

Thanks!
sage


On Thu, 18 Aug 2011, Chris Mason wrote:

> Hi everyone,
> 
> The for-linus branch of the btrfs-unstable tree:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git for-linus
> 
> Has our current pull request for 3.1-rc.  The for-linus branch includes
> 3.1-rc2 because it has two fixes from Dan Carpenter and Jeff Mahoney that
> only apply to 3.1.
> 
> The master branch of btrfs-unstable is based on 3.0, and includes
> everything except those two changes.
> 
> This is a variety pack of fixes.  We do have another fix pending for a
> race in our readdir optimizations, but Josef will work that out with Al
> Viro and send it in later this week (or early next).
> 
> Chris Mason (1) commits (+17/-0):
> Btrfs: force unplugs when switching from high to regular priority bios
> 
> Dan Carpenter (2) commits (+10/-4):
> btrfs: unlock on error in btrfs_file_llseek() (+8/-4)
> btrfs: memory leak in btrfs_add_inode_defrag() (+2/-0)
> 
> Jeff Mahoney (1) commits (+8/-4):
> btrfs: btrfs_permission's RO check shouldn't apply to device nodes
> 
> Josef Bacik (2) commits (+43/-2):
> Btrfs: set i_size properly when fallocating (+14/-0)
> Btrfs: detect wether a device supports discard (+29/-2)
> 
> Li Zefan (1) commits (+2/-4):
> Btrfs: use plain page_address() in header fields setget functions
> 
> Miao Xie (2) commits (+12/-6):
> Btrfs: fix uninitialized sync_pending (+1/-1)
> Btrfs: fix wrong free space information (+11/-5)
> 
> Sage Weil (1) commits (+4/-0):
> Btrfs: truncate pages from clone ioctl target range
> 
> Tsutomu Itoh (1) commits (+16/-10):
> Btrfs: forced readonly when btrfs_drop_snapshot() fails
> 
> liubo (3) commits (+72/-14):
> Btrfs: check if there is enough space for balancing smarter (+35/-6)
> Btrfs: fix a bug of balance on full multi-disk partitions (+13/-4)
> Btrfs: fix an oops of log replay (+24/-4)
> 
> Total: (14) commits (+183/-43)
> 
>  fs/btrfs/ctree.h|   10 ++---
>  fs/btrfs/extent-tree.c  |   75 +-
>  fs/btrfs/file.c |   28 ++--
>  fs/btrfs/free-space-cache.c |   16 ++---
>  fs/btrfs/inode.c|   12 --
>  fs/btrfs/ioctl.c|4 ++
>  fs/btrfs/tree-log.c |   28 ++--
>  fs/btrfs/volumes.c  |   51 +++--
>  fs/btrfs/volumes.h  |2 +
>  9 files changed, 183 insertions(+), 43 deletions(-)

Re: Honest timeline for btrfsck

2011-08-18 Thread Hugo Mills

On Thu, Aug 18, 2011 at 04:50:08PM -0400, Chris Mason wrote:
> I've been working non-stop on this.  Currently fsck has four parts:

   This all looks like great stuff. Can't wait to try it out...

   One thing strikes me for purposes of automated testing and
regression testing: Do you have tools or techniques for breaking a
filesystem in specific ways?

> 1) mount -o recovery mode.  I've posted smaller forms of these patches
> in the past that bypass log tree replay.  The new versions have code to
> create stub roots for trees that can't be read (like the extent
> allocation tree) and will allow the mount to proceed.

   I can see that this will deal with some kinds of breakage, like the
log tree being missing, but most of the other trees are kind of
important for minor things like finding your data. :)

   How useful or reliable is it to ignore missing trees that aren't
the log tree? I'd have thought that if you were missing one of the 6
main trees, you'd have a pretty much unreadable FS.

> 2) fsck that scans for older roots.  This takes advantage of older
> copies of metadata to look for consistent tree roots on disk.  The
> downside is that it is currently very slow.  I'm trying to speed it up
> by limiting the search to only the metadata block groups and a few other
> tricks.

   If this is in decent shape, it's probably worth it to release it in
its current form anyway (and possibly request a moratorium on extra
patches until you've finished the optimisation). I suspect that
there's a number of people out there who wouldn't mind the speed
issues just to get a filesystem back.

> 3) fsck that fixes the extent allocation tree and the chunk tree.  This
> is where I've been spending most of my time.  The problem is that it
> tends to recover some filesystems and badly break others.  While I'm
> fixing up the corner cases that work poorly, I'm adding an undo log to
> the fsck code so that you can get the FS back into its original state if
> you don't like the result of the fsck.

> 4) The rest of the corruptions can be dealt with fairly well from the
> kernel.  I have a series of patches to make the extent allocation tree
> less strict about reference counts and other rules, basically allowing
> the FS to limp along instead of crash.

   Is that going to be always-on, with stubs to highlight where
subsequent patches can add the requisite healing code in later
revisions, or as a mount flag like -o recovery?

> These four things together are basically my minimal set of features
> required for fedora and our own internal projects at Oracle to start
> treating us as a production filesystem.
> 
> There are always bugs to fix, and I have #1 and #2 mostly ready.  I had
> hoped to get #1 out the door before I left on vacation and I still might
> post it tonight.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- "You know,  the British have always been nice to mad people." ---  


Re: Honest timeline for btrfsck

2011-08-18 Thread Chris Mason

Excerpts from Yalonda Gishtaka's message of 2011-08-17 21:09:37 -0400:
> Chris Mason  oracle.com> writes:
> 
> > 
> > Aside from making sure the kernel code is stable, btrfsck is all I'm
> > working on right now.  I do expect a release in the next two weeks that
> > can recover your data (and many others).
> > 
> > Thanks,
> > Chris
> > --
> 
> 
> Chris,
> 
> We're all on the edge of our seats.  Can you provide an updated ETA on the 
> release of the first functional btrfsck tool?  No pressure or anything ;)

Hi everyone,

I've been working non-stop on this.  Currently fsck has four parts:

1) mount -o recovery mode.  I've posted smaller forms of these patches
in the past that bypass log tree replay.  The new versions have code to
create stub roots for trees that can't be read (like the extent
allocation tree) and will allow the mount to proceed.

2) fsck that scans for older roots.  This takes advantage of older
copies of metadata to look for consistent tree roots on disk.  The
downside is that it is currently very slow.  I'm trying to speed it up
by limiting the search to only the metadata block groups and a few other
tricks.

3) fsck that fixes the extent allocation tree and the chunk tree.  This
is where I've been spending most of my time.  The problem is that it
tends to recover some filesystems and badly break others.  While I'm
fixing up the corner cases that work poorly, I'm adding an undo log to
the fsck code so that you can get the FS back into its original state if
you don't like the result of the fsck.

4) The rest of the corruptions can be dealt with fairly well from the
kernel.  I have a series of patches to make the extent allocation tree
less strict about reference counts and other rules, basically allowing
the FS to limp along instead of crash.

These four things together are basically my minimal set of features
required for fedora and our own internal projects at Oracle to start
treating us as a production filesystem.

There are always bugs to fix, and I have #1 and #2 mostly ready.  I had
hoped to get #1 out the door before I left on vacation and I still might
post it tonight.

-chris
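
One way to picture item 2 (scanning for older roots) is as a selection over
candidate root blocks found on disk. The sketch below is an assumption about
the general idea only, not btrfsck's actual algorithm: struct root_candidate,
its fields and pick_newest_consistent() are hypothetical, and the
"consistent" flag stands in for whatever verification the real tool performs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one tree root found while scanning the disk. */
struct root_candidate {
	uint64_t bytenr;	/* where the root block was found */
	uint64_t generation;	/* generation recorded in the block */
	int consistent;		/* nonzero if the tree below it verified cleanly */
};

/*
 * Return the index of the newest consistent candidate, or -1 if none
 * of the candidates is usable.
 */
static int pick_newest_consistent(const struct root_candidate *c, size_t n)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!c[i].consistent)
			continue;
		if (best < 0 || c[i].generation > c[best].generation)
			best = (int)i;
	}
	return best;
}
```

The expensive part in practice is producing the candidate list, which is why
limiting the scan to the metadata block groups matters for speed; the
selection itself is a single pass.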

Re: [PATCH 5/5] btrfs: use fs netlink interface for ENOSPC conditions

2011-08-18 Thread Lukas Czerner

On Thu, 18 Aug 2011, David Sterba wrote:

> Hi,
> 
> I see you are mixing a cleanup while adding a new feature.
> 
> On Thu, Aug 18, 2011 at 02:18:26PM +0200, Lukas Czerner wrote:
> > Register fs netlink interface and send proper warning if ENOSPC is
> > encountered. Note that we differentiate between enospc for metadata and
> > enospc for data.
> > 
> > Signed-off-by: Lukas Czerner 
> > Cc: Chris Mason 
> > Cc: linux-btrfs@vger.kernel.org
> > ---
> >  fs/btrfs/extent-tree.c |   13 ++---
> >  fs/btrfs/super.c   |1 +
> >  2 files changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> > index 66bac22..a47d9b9 100644
> > --- a/fs/btrfs/extent-tree.c
> > +++ b/fs/btrfs/extent-tree.c
> > @@ -3029,13 +3029,14 @@ int btrfs_check_data_free_space(struct inode *inode, u64 bytes)
> 
> hm, if diff is correct with the context, you are changing function
> btrfs_check_data_free_space and reporting it as metadata ENOSPC later.
> Although it's called for metadata (or I should say non-user file blocks,
> like space cache, ino cache) block allocation, it's called from
> btrfs_fallocate too. Seems that there has to be additional flag to say
> if it's really metadata or data.

Actually diff got the context wrong. Sometimes it has some difficulties
if there are labels involved.

> 
> >  {
> > struct btrfs_space_info *data_sinfo;
> > struct btrfs_root *root = BTRFS_I(inode)->root;
> > +   struct btrfs_fs_info *fs_info = root->fs_info;
> 
> change unrelated to ENOSPC-netlink

Not really. I am using fs_info to get reference to the last_bdev and
it is useful for other places as well.

> 
> > u64 used;
> > int ret = 0, committed = 0, alloc_chunk = 1;
> >  
> > /* make sure bytes are sectorsize aligned */
> > bytes = (bytes + root->sectorsize - 1) & ~((u64)root->sectorsize - 1);
> >  
> > -   if (root == root->fs_info->tree_root ||
> > +   if (root == fs_info->tree_root ||
> 
> here
> 
> > BTRFS_I(inode)->location.objectid == BTRFS_FREE_INO_OBJECTID) {
> > alloc_chunk = 0;
> > committed = 1;
> > @@ -3070,7 +3071,7 @@ alloc:
> > if (IS_ERR(trans))
> > return PTR_ERR(trans);
> >  
> > -   ret = do_chunk_alloc(trans, root->fs_info->extent_root,
> > +   ret = do_chunk_alloc(trans, fs_info->extent_root,
> 
> here
> 
> >  bytes + 2 * 1024 * 1024,
> >  alloc_target,
> >  CHUNK_ALLOC_NO_FORCE);
> > @@ -3100,7 +3101,7 @@ alloc:
> > /* commit the current transaction and try again */
> >  commit_trans:
> > if (!committed &&
> > -   !atomic_read(&root->fs_info->open_ioctl_trans)) {
> > +   !atomic_read(&fs_info->open_ioctl_trans)) {
> 
> here
> 
> > committed = 1;
> > trans = btrfs_join_transaction(root);
> > if (IS_ERR(trans))
> > @@ -3111,6 +3112,8 @@ commit_trans:
> > goto again;
> > }
> >  
> > +   fs_nl_send_warning(fs_info->fs_devices->latest_bdev->bd_dev,
> > +  FS_NL_ENOSPC_WARN);
> 
> or is it due to this line being too long with root-> ? :)

exactly :) But as I said fs_info is referenced in other places as well so
it is useful anyway.

> 
> > return -ENOSPC;
> > }
> > data_sinfo->bytes_may_use += bytes;
> > @@ -3522,6 +3525,10 @@ again:
> > }
> >  
> >  out:
> > +   if (unlikely(-ENOSPC == ret)) {
> 
> 'unlikely' is not needed here, it does not tell the compiler anything it
> wouldn't already know, static branch prediction will give low probability
> to this check anyway

Ok, but I would like to know why you think that. It is not only in
the error path, and there are actually several paths to this condition, so
maybe I am being dense. Could you explain it a bit for me?

Thanks!
-Lukas

> 
> > +   dev_t bdev = root->fs_info->fs_devices->latest_bdev->bd_dev;
> > +   fs_nl_send_warning(bdev, FS_NL_META_ENOSPC_WARN);
> > +   }
> > if (flushing) {
> > spin_lock(&space_info->lock);
> > space_info->flush = 0;
> > diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> > index 15634d4..8ac9e01 100644
> > --- a/fs/btrfs/super.c
> > +++ b/fs/btrfs/super.c
> > @@ -1266,6 +1266,7 @@ static int __init init_btrfs_fs(void)
> > if (err)
> > goto unregister_ioctl;
> >  
> > +   init_fs_nl_family();
> > printk(KERN_INFO "%s loaded\n", BTRFS_BUILD_VERSION);
> > return 0;
> 
> 
> david
> 
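
For readers following the unlikely() question: in the kernel, likely() and
unlikely() are thin wrappers around the compiler's __builtin_expect(), which
only passes a branch-probability hint to GCC or Clang and never changes the
result of the test. The sketch below is a userspace illustration; the _sketch
suffixed macros, finish_reservation_sketch() and the warnings_sent counter
are hypothetical names, and the counter stands in for sending a warning.

```c
#include <errno.h>

/* Same shape as the kernel macros; requires GCC or Clang. */
#define likely_sketch(x)	__builtin_expect(!!(x), 1)
#define unlikely_sketch(x)	__builtin_expect(!!(x), 0)

/* Counts how many times the (hypothetical) warning would have been sent. */
static int warnings_sent;

/*
 * The hint tells the compiler to lay out the code so that the ENOSPC
 * branch is off the straight-line path. The branch is still taken
 * whenever the condition is true.
 */
static int finish_reservation_sketch(int ret)
{
	if (unlikely_sketch(ret == -ENOSPC))
		warnings_sent++;

	return ret;
}
```

Whether the annotation buys anything at a given call site depends on what the
compiler would have guessed without it, which is the point under discussion
in this thread.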


Re: Filesystem corrupt after renaming snapshots.

2011-08-18 Thread Josef Bacik

On Thu, Aug 18, 2011 at 07:53:46PM +0200, David Sterba wrote:
> Hi,
> 
> On Wed, Aug 10, 2011 at 08:38:59PM +1200, Ralph Loader wrote:
> > Hi,
> > 
> > Recently I suffered from a badly corrupted btrfs filesystem.
> > 
> > I had several snapshots in /snap that I moved into / (using /bin/mv).
> > After that, attempting to ls the snapshot resulted in the
> > ls process hanging.  There were syslog messages:
> > 
> > Aug  7 20:56:42 i kernel: [  111.882816] ------------[ cut here ]------------
> > Aug  7 20:56:42 i kernel: [  111.882896] WARNING: at fs/btrfs/inode.c:2408 
> > btrfs_orphan_cleanup+0x1bf/0x2c0 [btrfs]()
> > Aug  7 20:56:42 i kernel: [  111.882903] Hardware name: GA-MA790GP-DS4H
> > Aug  7 20:56:42 i kernel: [  111.882907] Modules linked in: fuse 
> > ipt_MASQUERADE xt_state nf_nat_h323 nf_conntrack_h323 nf_nat_pptp 
> > nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat_tftp 
> > nf_conntrack_tftp nf_nat_sip nf_conntrack_sip nf_nat_irc nf_conntrack_irc 
> > nf_nat_ftp nf_conntrack_ftp iptable_nat nf_nat nf_conntrack_ipv4 
> > nf_conntrack nf_defrag_ipv4 ppdev parport_pc lp parport bnep bluetooth 
> > k8temp it87 cpufreq_ondemand hwmon_vid powernow_k8 freq_table mperf arc4 
> > rt73usb crc_itu_t rt2x00usb rt2x00lib mac80211 cfg80211 rfkill ftdi_sio 
> > snd_hda_codec_hdmi uvcvideo snd_hda_codec_realtek snd_hda_intel videodev 
> > snd_hda_codec snd_seq snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi 
> > snd_seq_device media snd_pcm snd_timer snd soundcore v4l2_compat_ioctl32 
> > sp5100_tco e100 snd_page_alloc i2c_piix4 k10temp edac_core edac_mce_amd 
> > r8169 shpchp mii serio_raw virtio_net kvm_amd kvm btrfs zlib_deflate 
> > libcrc32c pata_acpi ata_generic pata_atiixp wmi radeon ttm drm_kms_helper 
> > drm i2c_algo_bit i2c_core [last
> > Aug  7 20:56:42 i kernel: unloaded: scsi_wait_scan]
> > Aug  7 20:56:42 i kernel: [  111.883125] Pid: 1552, comm: ls Not tainted 
> > 2.6.40-4.fc15.x86_64 #1
> > Aug  7 20:56:42 i kernel: [  111.883135] Call Trace:
> 
> I've probably hit the same problem, though no apparent fs corruption
> happened. The partition is used as TEST_DIR for xfstests or fs_mark or
> ..., i.e. the one that is never mkfs'ed, so the files just pile up.
> Eventually the free space ran out, and I can now reliably trigger
> the same warning in fs/btrfs/inode.c with some non-mainline patches.
> 
> This means chris' (for-linus) and josef's (for-chris) branches on top of
> linus-rc2. On bare linus-rc2 the warning does not show.
> 
> 
> Following traces are from xfstests/083:
> 
> Initially, there is a bunch of
> 
> [  479.487424] Could not get space for a delete, will truncate on mount
>

Yeah I need to fix how we reserve space for truncates, this is working out worse
than I planned.
 
> and traces:
> 
> [  480.148082] ------------[ cut here ]------------
> [  480.153233] WARNING: at fs/btrfs/extent-tree.c:3885 
> btrfs_free_block_groups+0x2ac/0x320 [btrfs]()
> [  480.162656] Hardware name: Santa Rosa platform
> [  480.162660] Modules linked in: aoe btrfs
> [  480.162668] Pid: 5600, comm: umount Tainted: GW 3.1.0-rc2-default+ 
> #109
> [  480.162672] Call Trace:
> [  480.162683]  [] warn_slowpath_common+0x7f/0xc0
> [  480.162689]  [] warn_slowpath_null+0x1a/0x20
> [  480.162708]  [] btrfs_free_block_groups+0x2ac/0x320 
> [btrfs]
> [  480.162729]  [] close_ctree+0x1e9/0x390 [btrfs]
> [  480.162736]  [] ? dispose_list+0x4f/0x60
> [  480.162750]  [] btrfs_put_super+0x1d/0x30 [btrfs]
> [  480.162757]  [] generic_shutdown_super+0x62/0xe0
> [  480.162763]  [] kill_anon_super+0x16/0x30
> [  480.162768]  [] ? deactivate_super+0x42/0x70
> [  480.162774]  [] deactivate_locked_super+0x45/0x80
> [  480.162779]  [] deactivate_super+0x4a/0x70
> [  480.162785]  [] mntput_no_expire+0xa2/0xf0
> [  480.162791]  [] sys_umount+0x6f/0x390
> [  480.162798]  [] system_call_fastpath+0x16/0x1b
> 
> [  480.162823] WARNING: at fs/btrfs/extent-tree.c:3886 
> btrfs_free_block_groups+0x31a/0x320 [btrfs]()
> [  480.162826] Hardware name: Santa Rosa platform
> [  480.162829] Modules linked in: aoe btrfs
> [  480.162836] Pid: 5600, comm: umount Tainted: GW 3.1.0-rc2-default+ 
> #109
> [  480.162839] Call Trace:
> [  480.162844]  [] warn_slowpath_common+0x7f/0xc0
> [  480.162851]  [] warn_slowpath_null+0x1a/0x20
> [  480.162869]  [] btrfs_free_block_groups+0x31a/0x320 
> [btrfs]
> [  480.162889]  [] close_ctree+0x1e9/0x390 [btrfs]
> [  480.162895]  [] ? dispose_list+0x4f/0x60
> [  480.162909]  [] btrfs_put_super+0x1d/0x30 [btrfs]
> [  480.162915]  [] generic_shutdown_super+0x62/0xe0
> [  480.162921]  [] kill_anon_super+0x16/0x30
> [  480.162926]  [] ? deactivate_super+0x42/0x70
> [  480.162932]  [] deactivate_locked_super+0x45/0x80
> [  480.162937]  [] deactivate_super+0x4a/0x70
> [  480.162943]  [] mntput_no_expire+0xa2/0xf0
> [  480.162948]  [] sys_umount+0x6f/0x390
> [  480.162954]  [] system_call_fastpath+0x16/0x1b
> 
> 
> 3882 static void release_global_block_rsv

Re: [PATCH 5/5] btrfs: use fs netlink interface for ENOSPC conditions

2011-08-18 Thread David Sterba

Hi,

I see you are mixing a cleanup while adding a new feature.

On Thu, Aug 18, 2011 at 02:18:26PM +0200, Lukas Czerner wrote:
> Register fs netlink interface and send proper warning if ENOSPC is
> encountered. Note that we differentiate between enospc for metadata and
> enospc for data.
> 
> Signed-off-by: Lukas Czerner 
> Cc: Chris Mason 
> Cc: linux-btrfs@vger.kernel.org
> ---
>  fs/btrfs/extent-tree.c |   13 ++---
>  fs/btrfs/super.c   |1 +
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 66bac22..a47d9b9 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3029,13 +3029,14 @@ int btrfs_check_data_free_space(struct inode *inode, u64 bytes)

hm, if diff is correct with the context, you are changing function
btrfs_check_data_free_space and reporting it as metadata ENOSPC later.
Although it's called for metadata (or I should say non-user file blocks,
like space cache, ino cache) block allocation, it's called from
btrfs_fallocate too. Seems that there has to be additional flag to say
if it's really metadata or data.

>  {
>   struct btrfs_space_info *data_sinfo;
>   struct btrfs_root *root = BTRFS_I(inode)->root;
> + struct btrfs_fs_info *fs_info = root->fs_info;

change unrelated to ENOSPC-netlink

>   u64 used;
>   int ret = 0, committed = 0, alloc_chunk = 1;
>  
>   /* make sure bytes are sectorsize aligned */
>   bytes = (bytes + root->sectorsize - 1) & ~((u64)root->sectorsize - 1);
>  
> - if (root == root->fs_info->tree_root ||
> + if (root == fs_info->tree_root ||

here

>   BTRFS_I(inode)->location.objectid == BTRFS_FREE_INO_OBJECTID) {
>   alloc_chunk = 0;
>   committed = 1;
> @@ -3070,7 +3071,7 @@ alloc:
>   if (IS_ERR(trans))
>   return PTR_ERR(trans);
>  
> - ret = do_chunk_alloc(trans, root->fs_info->extent_root,
> + ret = do_chunk_alloc(trans, fs_info->extent_root,

here

>bytes + 2 * 1024 * 1024,
>alloc_target,
>CHUNK_ALLOC_NO_FORCE);
> @@ -3100,7 +3101,7 @@ alloc:
>   /* commit the current transaction and try again */
>  commit_trans:
>   if (!committed &&
> - !atomic_read(&root->fs_info->open_ioctl_trans)) {
> + !atomic_read(&fs_info->open_ioctl_trans)) {

here

>   committed = 1;
>   trans = btrfs_join_transaction(root);
>   if (IS_ERR(trans))
> @@ -3111,6 +3112,8 @@ commit_trans:
>   goto again;
>   }
>  
> + fs_nl_send_warning(fs_info->fs_devices->latest_bdev->bd_dev,
> +FS_NL_ENOSPC_WARN);

or is it due to this line being too long with root-> ? :)

>   return -ENOSPC;
>   }
>   data_sinfo->bytes_may_use += bytes;
> @@ -3522,6 +3525,10 @@ again:
>   }
>  
>  out:
> + if (unlikely(-ENOSPC == ret)) {

'unlikely' is not needed here, it does not tell the compiler anything it
wouldn't already know, static branch prediction will give low probability
to this check anyway

> + dev_t bdev = root->fs_info->fs_devices->latest_bdev->bd_dev;
> + fs_nl_send_warning(bdev, FS_NL_META_ENOSPC_WARN);
> + }
>   if (flushing) {
>   spin_lock(&space_info->lock);
>   space_info->flush = 0;
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 15634d4..8ac9e01 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -1266,6 +1266,7 @@ static int __init init_btrfs_fs(void)
>   if (err)
>   goto unregister_ioctl;
>  
> + init_fs_nl_family();
>   printk(KERN_INFO "%s loaded\n", BTRFS_BUILD_VERSION);
>   return 0;


david
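
The "make sure bytes are sectorsize aligned" line in the quoted hunk rounds
a byte count up to the next multiple of the sector size. The mask trick only
works when the sector size is a power of two. A minimal sketch follows;
round_up_pow2() is a hypothetical helper name, not a kernel function.

```c
#include <stdint.h>

/*
 * Round bytes up to the next multiple of sectorsize.
 * Assumes sectorsize is a nonzero power of two, so that
 * (sectorsize - 1) is a mask of the low bits to clear.
 */
static uint64_t round_up_pow2(uint64_t bytes, uint64_t sectorsize)
{
	return (bytes + sectorsize - 1) & ~(sectorsize - 1);
}
```

With a 4096-byte sector, 1 rounds up to 4096, 4096 stays 4096 and 4097
becomes 8192.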

[GIT PULL] Btrfs updates

2011-08-18 Thread Chris Mason

Hi everyone,

The for-linus branch of the btrfs-unstable tree:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git for-linus

Has our current pull request for 3.1-rc.  The for-linus branch includes
3.1-rc2 because it has two fixes from Dan Carpenter and Jeff Mahoney that
only apply to 3.1.

The master branch of btrfs-unstable is based on 3.0, and includes
everything except those two changes.

This is a variety pack of fixes.  We do have another fix pending for a
race in our readdir optimizations, but Josef will work that out with Al
Viro and send it in later this week (or early next).

Chris Mason (1) commits (+17/-0):
Btrfs: force unplugs when switching from high to regular priority bios

Dan Carpenter (2) commits (+10/-4):
btrfs: unlock on error in btrfs_file_llseek() (+8/-4)
btrfs: memory leak in btrfs_add_inode_defrag() (+2/-0)

Jeff Mahoney (1) commits (+8/-4):
btrfs: btrfs_permission's RO check shouldn't apply to device nodes

Josef Bacik (2) commits (+43/-2):
Btrfs: set i_size properly when fallocating (+14/-0)
Btrfs: detect wether a device supports discard (+29/-2)

Li Zefan (1) commits (+2/-4):
Btrfs: use plain page_address() in header fields setget functions

Miao Xie (2) commits (+12/-6):
Btrfs: fix uninitialized sync_pending (+1/-1)
Btrfs: fix wrong free space information (+11/-5)

Sage Weil (1) commits (+4/-0):
Btrfs: truncate pages from clone ioctl target range

Tsutomu Itoh (1) commits (+16/-10):
Btrfs: forced readonly when btrfs_drop_snapshot() fails

liubo (3) commits (+72/-14):
Btrfs: check if there is enough space for balancing smarter (+35/-6)
Btrfs: fix a bug of balance on full multi-disk partitions (+13/-4)
Btrfs: fix an oops of log replay (+24/-4)

Total: (14) commits (+183/-43)

 fs/btrfs/ctree.h|   10 ++---
 fs/btrfs/extent-tree.c  |   75 +-
 fs/btrfs/file.c |   28 ++--
 fs/btrfs/free-space-cache.c |   16 ++---
 fs/btrfs/inode.c|   12 --
 fs/btrfs/ioctl.c|4 ++
 fs/btrfs/tree-log.c |   28 ++--
 fs/btrfs/volumes.c  |   51 +++--
 fs/btrfs/volumes.h  |2 +
 9 files changed, 183 insertions(+), 43 deletions(-)

Re: Filesystem corrupt after renaming snapshots.

2011-08-18 Thread David Sterba

Hi,

On Wed, Aug 10, 2011 at 08:38:59PM +1200, Ralph Loader wrote:
> Hi,
> 
> Recently I suffered from a badly corrupted btrfs filesystem.
> 
> I had several snapshots in /snap that I moved into / (using /bin/mv).
> After that, attempting to ls the snapshot resulted in the
> ls process hanging.  There were syslog messages:
> 
> Aug  7 20:56:42 i kernel: [  111.882816] ------------[ cut here ]------------
> Aug  7 20:56:42 i kernel: [  111.882896] WARNING: at fs/btrfs/inode.c:2408 
> btrfs_orphan_cleanup+0x1bf/0x2c0 [btrfs]()
> Aug  7 20:56:42 i kernel: [  111.882903] Hardware name: GA-MA790GP-DS4H
> Aug  7 20:56:42 i kernel: [  111.882907] Modules linked in: fuse 
> ipt_MASQUERADE xt_state nf_nat_h323 nf_conntrack_h323 nf_nat_pptp 
> nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat_tftp 
> nf_conntrack_tftp nf_nat_sip nf_conntrack_sip nf_nat_irc nf_conntrack_irc 
> nf_nat_ftp nf_conntrack_ftp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack 
> nf_defrag_ipv4 ppdev parport_pc lp parport bnep bluetooth k8temp it87 
> cpufreq_ondemand hwmon_vid powernow_k8 freq_table mperf arc4 rt73usb 
> crc_itu_t rt2x00usb rt2x00lib mac80211 cfg80211 rfkill ftdi_sio 
> snd_hda_codec_hdmi uvcvideo snd_hda_codec_realtek snd_hda_intel videodev 
> snd_hda_codec snd_seq snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi 
> snd_seq_device media snd_pcm snd_timer snd soundcore v4l2_compat_ioctl32 
> sp5100_tco e100 snd_page_alloc i2c_piix4 k10temp edac_core edac_mce_amd r8169 
> shpchp mii serio_raw virtio_net kvm_amd kvm btrfs zlib_deflate libcrc32c 
> pata_acpi ata_generic pata_atiixp wmi radeon ttm drm_kms_helper drm 
> i2c_algo_bit i2c_core [last
> Aug  7 20:56:42 i kernel: unloaded: scsi_wait_scan]
> Aug  7 20:56:42 i kernel: [  111.883125] Pid: 1552, comm: ls Not tainted 
> 2.6.40-4.fc15.x86_64 #1
> Aug  7 20:56:42 i kernel: [  111.883135] Call Trace:

I've probably hit the same problem, though no apparent fs corruption
happened. The partition is used as TEST_DIR for xfstests or fs_mark or
..., i.e. the one that is never mkfs'ed, so the files just pile up.
Eventually the free space ran out, and I can now reliably trigger
the same warning in fs/btrfs/inode.c with some non-mainline patches.

This means chris' (for-linus) and josef's (for-chris) branches on top of
linus-rc2. On bare linus-rc2 the warning does not show.


Following traces are from xfstests/083:

Initially, there is a bunch of

[  479.487424] Could not get space for a delete, will truncate on mount

and traces:

[  480.148082] ------------[ cut here ]------------
[  480.153233] WARNING: at fs/btrfs/extent-tree.c:3885 
btrfs_free_block_groups+0x2ac/0x320 [btrfs]()
[  480.162656] Hardware name: Santa Rosa platform
[  480.162660] Modules linked in: aoe btrfs
[  480.162668] Pid: 5600, comm: umount Tainted: GW 3.1.0-rc2-default+ 
#109
[  480.162672] Call Trace:
[  480.162683]  [] warn_slowpath_common+0x7f/0xc0
[  480.162689]  [] warn_slowpath_null+0x1a/0x20
[  480.162708]  [] btrfs_free_block_groups+0x2ac/0x320 [btrfs]
[  480.162729]  [] close_ctree+0x1e9/0x390 [btrfs]
[  480.162736]  [] ? dispose_list+0x4f/0x60
[  480.162750]  [] btrfs_put_super+0x1d/0x30 [btrfs]
[  480.162757]  [] generic_shutdown_super+0x62/0xe0
[  480.162763]  [] kill_anon_super+0x16/0x30
[  480.162768]  [] ? deactivate_super+0x42/0x70
[  480.162774]  [] deactivate_locked_super+0x45/0x80
[  480.162779]  [] deactivate_super+0x4a/0x70
[  480.162785]  [] mntput_no_expire+0xa2/0xf0
[  480.162791]  [] sys_umount+0x6f/0x390
[  480.162798]  [] system_call_fastpath+0x16/0x1b

[  480.162823] WARNING: at fs/btrfs/extent-tree.c:3886 btrfs_free_block_groups+0x31a/0x320 [btrfs]()
[  480.162826] Hardware name: Santa Rosa platform
[  480.162829] Modules linked in: aoe btrfs
[  480.162836] Pid: 5600, comm: umount Tainted: GW 3.1.0-rc2-default+ #109
[  480.162839] Call Trace:
[  480.162844]  [] warn_slowpath_common+0x7f/0xc0
[  480.162851]  [] warn_slowpath_null+0x1a/0x20
[  480.162869]  [] btrfs_free_block_groups+0x31a/0x320 [btrfs]
[  480.162889]  [] close_ctree+0x1e9/0x390 [btrfs]
[  480.162895]  [] ? dispose_list+0x4f/0x60
[  480.162909]  [] btrfs_put_super+0x1d/0x30 [btrfs]
[  480.162915]  [] generic_shutdown_super+0x62/0xe0
[  480.162921]  [] kill_anon_super+0x16/0x30
[  480.162926]  [] ? deactivate_super+0x42/0x70
[  480.162932]  [] deactivate_locked_super+0x45/0x80
[  480.162937]  [] deactivate_super+0x4a/0x70
[  480.162943]  [] mntput_no_expire+0xa2/0xf0
[  480.162948]  [] sys_umount+0x6f/0x390
[  480.162954]  [] system_call_fastpath+0x16/0x1b


3882 static void release_global_block_rsv(struct btrfs_fs_info *fs_info)
3883 {
3884 block_rsv_release_bytes(&fs_info->global_block_rsv, NULL, (u64)-1);

3885 WARN_ON(fs_info->delalloc_block_rsv.size > 0);
3886 WARN_ON(fs_info->delalloc_block_rsv.reserved > 0);

3887 WARN_ON(fs_info->trans_block_rsv.size > 0);
3888 WARN_ON(fs_info->trans_block_rsv.reserved > 0);
3889

Re: processes stuck in llseek

2011-08-18 Thread Chris Mason

Excerpts from Mitch Harder's message of 2011-08-18 12:53:33 -0400:
> On Thu, Aug 18, 2011 at 11:00 AM, Chris Mason  wrote:
> > Excerpts from Mitch Harder's message of 2011-08-18 11:40:17 -0400:
> >> On Tue, Aug 16, 2011 at 8:23 PM, Li Zefan  wrote:
> >> > Dan Merillat wrote:
> >> >> On Tue, Aug 16, 2011 at 8:51 AM, Chris Mason  
> >> >> wrote:
> >> >>> Excerpts from Dan Merillat's message of 2011-08-15 23:59:50 -0400:
> >> >>
> >> >>> Dan Carpenter sent a patch for this, I'll get it queued up for rc3.
> >> >>
> >> >> Can you send it?  I'd like to test it to see if it fixes my system.
> >> >
> >> > Here it is.
> >> >
> >> > http://marc.info/?l=linux-btrfs&m=131176036219732&w=2
> >> >
> >>
> >> Doesn't this patch rely on Josef's SEEK_HOLE/SEEK_DATA patch set which
> >> isn't in the kernel yet?
> >>
> >> http://marc.info/?l=linux-btrfs&m=130927580606177&w=2
> >
> > It does, but the hang was reported on 3.1-rc1, which does have Josef's
> > code.
> >
> 
> Thanks.
> 
> That gives me some insights regarding the differences between the
> 'for-linus' and the 'for-linus-merged' branches.

for-linus is usually what I send him to pull, and master is usually the
stable things against the last release (3.0 as of today).

for-linus-merged is used when there is a conflict between his current
tree and my for-linus branch.  Linus almost never uses this directly,
since he really likes to resolve conflicts himself.  This is mostly
because he wants to see what the conflicts are and make sure the
integration is done correctly.

But I still provide a for-linus-merged just so we can double check 
the results of the conflict resolution.

-chris

Re: processes stuck in llseek

2011-08-18 Thread Mitch Harder

On Thu, Aug 18, 2011 at 11:00 AM, Chris Mason  wrote:
> Excerpts from Mitch Harder's message of 2011-08-18 11:40:17 -0400:
>> On Tue, Aug 16, 2011 at 8:23 PM, Li Zefan  wrote:
>> > Dan Merillat wrote:
>> >> On Tue, Aug 16, 2011 at 8:51 AM, Chris Mason  
>> >> wrote:
>> >>> Excerpts from Dan Merillat's message of 2011-08-15 23:59:50 -0400:
>> >>
>> >>> Dan Carpenter sent a patch for this, I'll get it queued up for rc3.
>> >>
>> >> Can you send it?  I'd like to test it to see if it fixes my system.
>> >
>> > Here it is.
>> >
>> > http://marc.info/?l=linux-btrfs&m=131176036219732&w=2
>> >
>>
>> Doesn't this patch rely on Josef's SEEK_HOLE/SEEK_DATA patch set which
>> isn't in the kernel yet?
>>
>> http://marc.info/?l=linux-btrfs&m=130927580606177&w=2
>
> It does, but the hang was reported on 3.1-rc1, which does have Josef's
> code.
>

Thanks.

That gives me some insights regarding the differences between the
'for-linus' and the 'for-linus-merged' branches.

Re: processes stuck in llseek

2011-08-18 Thread Chris Mason

Excerpts from Mitch Harder's message of 2011-08-18 11:40:17 -0400:
> On Tue, Aug 16, 2011 at 8:23 PM, Li Zefan  wrote:
> > Dan Merillat wrote:
> >> On Tue, Aug 16, 2011 at 8:51 AM, Chris Mason  
> >> wrote:
> >>> Excerpts from Dan Merillat's message of 2011-08-15 23:59:50 -0400:
> >>
> >>> Dan Carpenter sent a patch for this, I'll get it queued up for rc3.
> >>
> >> Can you send it?  I'd like to test it to see if it fixes my system.
> >
> > Here it is.
> >
> > http://marc.info/?l=linux-btrfs&m=131176036219732&w=2
> >
> 
> Doesn't this patch rely on Josef's SEEK_HOLE/SEEK_DATA patch set which
> isn't in the kernel yet?
> 
> http://marc.info/?l=linux-btrfs&m=130927580606177&w=2

It does, but the hang was reported on 3.1-rc1, which does have Josef's
code.

-chris

Re: processes stuck in llseek

2011-08-18 Thread Mitch Harder

On Tue, Aug 16, 2011 at 8:23 PM, Li Zefan  wrote:
> Dan Merillat wrote:
>> On Tue, Aug 16, 2011 at 8:51 AM, Chris Mason  wrote:
>>> Excerpts from Dan Merillat's message of 2011-08-15 23:59:50 -0400:
>>
>>> Dan Carpenter sent a patch for this, I'll get it queued up for rc3.
>>
>> Can you send it?  I'd like to test it to see if it fixes my system.
>
> Here it is.
>
> http://marc.info/?l=linux-btrfs&m=131176036219732&w=2
>

Doesn't this patch rely on Josef's SEEK_HOLE/SEEK_DATA patch set which
isn't in the kernel yet?

http://marc.info/?l=linux-btrfs&m=130927580606177&w=2

Re: Applications using fsync cause hangs for several seconds every few minutes

2011-08-18 Thread Chris Mason

Excerpts from Andrew Guertin's message of 2011-08-11 21:13:18 -0400:
> On 08/09/2011 05:29 PM, Andrew Guertin wrote:
> > I have not tried 3.1-rc1, but plan to soon.
> 
> I've tested now, this does still occur in 3.1-rc1.

Ok, I had high hopes that the btrfs changes in rc1 would fix this.

Could you please try with the deadline elevator instead of the cfq
default?

-chris
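
For anyone wanting to try the suggestion above: the I/O scheduler can be switched per device at runtime by writing the scheduler name to the device's `queue/scheduler` file in sysfs. The following is only a minimal sketch in C; the helper name `set_io_scheduler` is made up for illustration, the device name `sda` in the usage note is an assumption, and writing the real sysfs file requires root.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Write an I/O scheduler name (e.g. "deadline") to a scheduler control file.
 * Returns 0 on success or a negative errno value on failure.
 * set_io_scheduler() is a hypothetical helper, not part of any library.
 */
int set_io_scheduler(const char *sched_file, const char *name)
{
    FILE *f;
    int err = 0;

    assert(sched_file != NULL && name != NULL);
    if (strlen(name) == 0)
        return -EINVAL;

    f = fopen(sched_file, "w");
    if (!f)
        return errno ? -errno : -EIO;

    if (fputs(name, f) == EOF)
        err = -EIO;
    /* Buffered data is flushed on close, so a rejected write shows up here. */
    if (fclose(f) == EOF && !err)
        err = errno ? -errno : -EIO;
    return err;
}
```

A call such as `set_io_scheduler("/sys/block/sda/queue/scheduler", "deadline")` would select the deadline elevator for `sda`; reading the same file back lists the available schedulers with the active one in square brackets.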

[PATCH 5/5] btrfs: use fs netlink interface for ENOSPC conditions

2011-08-18 Thread Lukas Czerner

Register fs netlink interface and send proper warning if ENOSPC is
encountered. Note that we differentiate between enospc for metadata and
enospc for data.

Signed-off-by: Lukas Czerner 
Cc: Chris Mason 
Cc: linux-btrfs@vger.kernel.org
---
 fs/btrfs/extent-tree.c |   13 ++---
 fs/btrfs/super.c   |1 +
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 66bac22..a47d9b9 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3029,13 +3029,14 @@ int btrfs_check_data_free_space(struct inode *inode, u64 bytes)
 {
struct btrfs_space_info *data_sinfo;
struct btrfs_root *root = BTRFS_I(inode)->root;
+   struct btrfs_fs_info *fs_info = root->fs_info;
u64 used;
int ret = 0, committed = 0, alloc_chunk = 1;
 
/* make sure bytes are sectorsize aligned */
bytes = (bytes + root->sectorsize - 1) & ~((u64)root->sectorsize - 1);
 
-   if (root == root->fs_info->tree_root ||
+   if (root == fs_info->tree_root ||
BTRFS_I(inode)->location.objectid == BTRFS_FREE_INO_OBJECTID) {
alloc_chunk = 0;
committed = 1;
@@ -3070,7 +3071,7 @@ alloc:
if (IS_ERR(trans))
return PTR_ERR(trans);
 
-   ret = do_chunk_alloc(trans, root->fs_info->extent_root,
+   ret = do_chunk_alloc(trans, fs_info->extent_root,
 bytes + 2 * 1024 * 1024,
 alloc_target,
 CHUNK_ALLOC_NO_FORCE);
@@ -3100,7 +3101,7 @@ alloc:
/* commit the current transaction and try again */
 commit_trans:
if (!committed &&
-   !atomic_read(&root->fs_info->open_ioctl_trans)) {
+   !atomic_read(&fs_info->open_ioctl_trans)) {
committed = 1;
trans = btrfs_join_transaction(root);
if (IS_ERR(trans))
@@ -3111,6 +3112,8 @@ commit_trans:
goto again;
}
 
+   fs_nl_send_warning(fs_info->fs_devices->latest_bdev->bd_dev,
+  FS_NL_ENOSPC_WARN);
return -ENOSPC;
}
data_sinfo->bytes_may_use += bytes;
@@ -3522,6 +3525,10 @@ again:
}
 
 out:
+   if (unlikely(-ENOSPC == ret)) {
+   dev_t bdev = root->fs_info->fs_devices->latest_bdev->bd_dev;
+   fs_nl_send_warning(bdev, FS_NL_META_ENOSPC_WARN);
+   }
if (flushing) {
spin_lock(&space_info->lock);
space_info->flush = 0;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 15634d4..8ac9e01 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1266,6 +1266,7 @@ static int __init init_btrfs_fs(void)
if (err)
goto unregister_ioctl;
 
+   init_fs_nl_family();
printk(KERN_INFO "%s loaded\n", BTRFS_BUILD_VERSION);
return 0;
 
-- 
1.7.4.4
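
A side note on the context of the first hunk above: the "make sure bytes are sectorsize aligned" line rounds the request up to a whole number of sectors with a mask, which only works when the sector size is a power of two. A small userspace sketch of the same arithmetic follows; the function name `align_to_sectorsize` and the 4096-byte sector size used to exercise it are assumptions for illustration, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round bytes up to the next multiple of sectorsize, using the same
 * expression as the patch context:
 *     (bytes + sectorsize - 1) & ~(sectorsize - 1)
 * Only valid when sectorsize is a power of two.
 */
uint64_t align_to_sectorsize(uint64_t bytes, uint64_t sectorsize)
{
    /* A power of two has exactly one bit set. */
    assert(sectorsize != 0 && (sectorsize & (sectorsize - 1)) == 0);
    return (bytes + sectorsize - 1) & ~(sectorsize - 1);
}
```

With a 4096-byte sector, a 1-byte request becomes a 4096-byte reservation, and an already aligned request is left unchanged.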


Re: Applications using fsync cause hangs for several seconds every few minutes

2011-08-18 Thread Andrew Guertin

On 08/18/2011 03:29 AM, Andrew Guertin wrote:
> I have not seen slowdowns on 2.6.38. More specifically, I observe the
> following behaviors after commit 4e69b59:
> 
> * Many processes occasionally hang for a short time
> * When this happens, my cpu monitor shows a short burst of cpu activity
> (100% of 1 core) followed by a longer period of IO
> * When this happens, iotop shows [btrfs-submit-0] and [btrfs-transacti]
> at the top of the list
> * Behavior slowly increases in duration (and frequency?) over time, and
> goes away with a reboot
> * Heavy IO makes behavior appear faster
> 
> ... and the following behaviors before commit 4e69b59:
> 
> * Occasional spikes of IO on cpu monitor concurrent with
> [btrfs-submit-0] and [btrfs-transacti] at top of iotop
> * No hangs, even when that occurs
> 
> I wasn't taking notes or anything though, so I'm not 100% certain I was
> observing or interpreting or remembering everything correctly.

I've investigated a little more, and have a few things to add:

Before commit 4e69b59:

* In the IO spikes where [btrfs-submit-0] and [btrfs-transacti] are at
the top of iotop, there is no short burst of cpu activity preceding them

* When running gentoo's emerge --sync (which IIRC is mainly an rsync of
~200MB of small files), output appears to pause during these spikes. I
wasn't able to tell if output stopped entirely or just slowed down.

--Andrew

Re: snapshot ctime // Re: [RFC] btrfs auto snapshot

2011-08-18 Thread Anand Jain




On 08/17/2011 11:56 PM, Jérôme Poulin wrote:
> On Wed, Aug 17, 2011 at 11:13 AM, Roman Mamedov wrote:
>> So until someone cares about snapshot ctime enough to fix this, btrfs will
>> not be a convenient FS to work with timed snapshotting/cleanup.
>
> Isn't the ctime the creation date of the original folder?

 It would be nice to have the snapshot time returned by
 btrfs-progs.

 something like
  btrfs get stime 

Cheers, Anand

Re: [RFC] btrfs auto snapshot

2011-08-18 Thread Anand Jain




David,


> I think that you need to be careful not to impose your idea of when to
> take snapshots and how long to keep them onto the design. For example
> why take snapshots every 15 minutes? Why not every 10 or every hour?

 crontab is changeable by the admin anyway, so I think we can have that
 flexibility.

> Why treat monthly snapshots as special when it does not fit into most
> working weeks? Would weekly be more logical? What about 2 weekly (When
> I worked at Nokia, internal releases were done on Tuesday of each even
> numbered week, so we would have wanted the snapshot taken on that day
> to be retained longer than snapshots taken on other days, or Tuesdays
> in odd numbered weeks.)

 Agreed, weekly is more important. (I had that in mind but missed it
 when writing, sorry for that.)

> I think a more flexible design would be to allow the user to specify
> (via a config file for each subvolume) a label for each type of snapshot
> and how long to keep snapshots depending on when they are taken. This
> can be done using syntax similar to crontab:

 Simple and nice idea. Thanks for explaining, I will try to get this
 in the initial release.

Cheers, Anand


Re: [RFC] btrfs auto snapshot

2011-08-18 Thread Anand Jain




 Thanks MgE. snapper is cool and does most of the stuff required here.

 However, the challenging part will be to keep the number of tools
 (to manage btrfs) limited to 1 or 2 at most. (Too many tools to manage
 btrfs are likely to confuse.)

Cheers, Anand

On 08/17/2011 09:31 PM, Matthias G. Eckermann wrote:

Hello Anand and all,

On 2011-08-17 T 10:15 +0800 Anand Jain wrote:


  Appears that no one is working on the auto-snapshot feature for
  btrfs, so here I am implementing the same.


thanks for bringing this up! The group of features you are listing is
indeed of high interest for people using btrfs.

That said, not only have other people thought about this, but a lot of
your questions have already been implemented in "snapper", an open
source infrastructure developed as part of openSUSE and SUSE Linux
Enterprise.

Please see:
http://en.opensuse.org/Portal:Snapper
http://en.opensuse.org/openSUSE:Snapper_install
http://lizards.opensuse.org/2011/04/01/introducing-snapper/

Source code is here:
http://gitorious.org/opensuse/snapper

"snapper" will be part of openSUSE 12.1 and SUSE Linux Enterprise 11
Service Pack 2, and is available as part of the respective Beta
releases and Milestones already.

snapper's concept in short:
- shared library to make the functionality available to
   other tools as well
- libsnapper is implemented on top of the btrfsprogs
- cmdline tool "snapper"
- global configuration file
/etc/sysconfig/snapper
- one configuration file per subvolume to be snapshotted
/etc/snapper/configs/
   I call this a "single configuration" going forward.
   Here also policies for time based snapshotting and
   cleanup are to be configured.
- Integration into SUSE's management framework (YaST2/zypper),
   however, "snapper" should work independent of those,
   i.e. usable on other distributions easily.


  Below is a draft on the feature list.  Any comments / questions /
  suggestions are welcome, please do let me know.


Let me go through the single features quickly and list the matching
snapper functionality.


  btrfs auto snapshot feature will include:
  Initially:
  - configurable timely snapshots


Yes. Configured per single configuration


  - uses services and crontab to schedule


Yes.


  - Gnome integration


I see more of a need for integration into systems management frameworks.


  - snapshot rollback and cleanups


Yes. Rules for cleanups (time based, number of snapshots)
per single configuration.


  - snapshot trashing based on available space


// not yet done.


  - snapshot destination will be subvol/.btrfs/snapshot@  and


snapshot destination is "/.snapshots//",


snapshot/.btrfs/snapshot@  for subvolume and snapshot
respectively


Timestamp and Description of a snapshot are stored in a small XML
file "/.snapshots//info.xml". One small file per snapshot.

[...]


  Challenges:
- rollback per file or dir instead of entire snapshot-rollback ?


snapper implements  "rollback" on a FILE level only.

To differentiate this way of "rolling back" from jumping
into another snapshot, we call it
"undochange"
for now. This keeps the option to also manage a full
per snapshot-rollback at a later point in time.

[...]

  modify the snapshot - do we need to implement a kind of read-only
  snapshot ?


snapper treats snapshots as read-only snapshots, i.e. when doing a
rollback - ahem, I should say "undochange" - only the "master" volume
will be changed, not the single snapshots.  We are aware that this has
pros and cons. But that's another discussion.

I hope that this is a starting point for you.

Enjoy "snapper".

so long -
MgE



Re: [PATCH 1/3] Btrfs: fix wrong nbytes information of the inode

2011-08-18 Thread Miao Xie

On Thu, 18 Aug 2011 16:11:31 +0800, Miao Xie wrote:
> If we write some data into a data hole of a file (with no preallocation for
> this hole), Btrfs will allocate some disk space and update nbytes of the
> inode, but the other element, disk_i_size, needn't be updated. In this case,
> we must update the inode metadata even though disk_i_size is not changed
> (btrfs_ordered_update_i_size() returns 1).
> 
>  # mkfs.btrfs /dev/sdb1
>  # mount /dev/sdb1 /mnt
>  # touch /mnt/a
>  # truncate -s 856002 /mnt/a
>  # dd if=/dev/zero of=/mnt/a bs=4K count=1 conv=nocreat,notrunc
>  # umount /mnt
>  # btrfsck /dev/sdb1
>  root 5 inode 257 errors 400
>  found 32768 bytes used err is 1
> 
> Signed-off-by: Miao Xie 

Reported-by: Tsutomu Itoh 
Tested-by: Tsutomu Itoh 

> ---
>  fs/btrfs/inode.c |4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 18d08f4..634dd797 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -1786,7 +1786,7 @@ static int btrfs_finish_ordered_io(struct inode *inode, u64 start, u64 end)
> &ordered_extent->list);
>  
>   ret = btrfs_ordered_update_i_size(inode, 0, ordered_extent);
> - if (!ret) {
> + if (!ret || !test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags)) {
>   ret = btrfs_update_inode(trans, root, inode);
>   BUG_ON(ret);
>   }
> @@ -5788,7 +5788,7 @@ again:
>  
>   add_pending_csums(trans, inode, ordered->file_offset, &ordered->list);
>   ret = btrfs_ordered_update_i_size(inode, 0, ordered);
> - if (!ret)
> + if (!ret || !test_bit(BTRFS_ORDERED_PREALLOC, &ordered->flags))
>   btrfs_update_inode(trans, root, inode);
>   ret = 0;
>  out_unlock:


[PATCH 3/3] Btrfs: fix unclosed transaction handle in btrfs_cont_expand()

2011-08-18 Thread Miao Xie

btrfs_cont_expand() forgot to close the transaction handle before
it jumps out of the while loop. Fix it.

Signed-off-by: Miao Xie 
---
 fs/btrfs/inode.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 634dd797..ee57b40 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3510,15 +3510,19 @@ int btrfs_cont_expand(struct inode *inode, loff_t oldsize, loff_t size)
err = btrfs_drop_extents(trans, inode, cur_offset,
 cur_offset + hole_size,
 &hint_byte, 1);
-   if (err)
+   if (err) {
+   btrfs_end_transaction(trans, root);
break;
+   }
 
err = btrfs_insert_file_extent(trans, root,
btrfs_ino(inode), cur_offset, 0,
0, hole_size, 0, hole_size,
0, 0, 0);
-   if (err)
+   if (err) {
+   btrfs_end_transaction(trans, root);
break;
+   }
 
btrfs_drop_extent_cache(inode, hole_start,
last_byte - 1, 0);
-- 
1.7.4

[PATCH 2/3] Btrfs: fix the file extent gap when doing direct IO

2011-08-18 Thread Miao Xie

When we write some data to a place that is beyond the end of the file
in direct I/O mode, a data hole will be created, and Btrfs should insert
a file extent item that points to this hole into the fs tree. But unfortunately
Btrfs forgets to do it.

The following is a simple way to reproduce it:
 # mkfs.btrfs /dev/sdc2
 # mount /dev/sdc2 /test4
 # touch /test4/a
 # dd if=/dev/zero of=/test4/a seek=8 count=1 bs=4K oflag=direct conv=nocreat,notrunc
 # umount /test4
 # btrfsck /dev/sdc2
 root 5 inode 257 errors 100

Reported-by: Tsutomu Itoh 
Signed-off-by: Miao Xie 
Tested-by: Tsutomu Itoh 
---
 fs/btrfs/file.c |   16 ++--
 1 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 010aec8..a9c4636 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1073,12 +1073,6 @@ static noinline int prepare_pages(struct btrfs_root *root, struct file *file,
start_pos = pos & ~((u64)root->sectorsize - 1);
last_pos = ((u64)index + num_pages) << PAGE_CACHE_SHIFT;
 
-   if (start_pos > inode->i_size) {
-   err = btrfs_cont_expand(inode, i_size_read(inode), start_pos);
-   if (err)
-   return err;
-   }
-
 again:
for (i = 0; i < num_pages; i++) {
pages[i] = find_or_create_page(inode->i_mapping, index + i,
@@ -1336,6 +1330,7 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
struct inode *inode = fdentry(file)->d_inode;
struct btrfs_root *root = BTRFS_I(inode)->root;
loff_t *ppos = &iocb->ki_pos;
+   u64 start_pos;
ssize_t num_written = 0;
ssize_t err = 0;
size_t count, ocount;
@@ -1384,6 +1379,15 @@ static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
file_update_time(file);
BTRFS_I(inode)->sequence++;
 
+   start_pos = round_down(pos, root->sectorsize);
+   if (start_pos > i_size_read(inode)) {
+   err = btrfs_cont_expand(inode, i_size_read(inode), start_pos);
+   if (err) {
+   mutex_unlock(&inode->i_mutex);
+   goto out;
+   }
+   }
+
if (unlikely(file->f_flags & O_DIRECT)) {
num_written = __btrfs_direct_write(iocb, iov, nr_segs,
   pos, ppos, count, ocount);
-- 
1.7.4
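
To make the check that the patch moves into btrfs_file_aio_write() concrete: the write position is rounded down to a sector boundary, and if that lies beyond the current file size, the range from the old size up to the rounded position is what btrfs_cont_expand() is asked to cover. A userspace sketch of that arithmetic follows; the function names and the 4096-byte sector size used to exercise it are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Round pos down to a multiple of sectorsize (power of two only). */
static uint64_t round_down_pow2(uint64_t pos, uint64_t sectorsize)
{
    assert(sectorsize != 0 && (sectorsize & (sectorsize - 1)) == 0);
    return pos & ~(sectorsize - 1);
}

/*
 * Number of bytes between the current file size and the sector-aligned
 * start of a write at pos, or 0 if the write does not start beyond EOF.
 */
uint64_t hole_before_write(uint64_t i_size, uint64_t pos, uint64_t sectorsize)
{
    uint64_t start_pos = round_down_pow2(pos, sectorsize);

    return start_pos > i_size ? start_pos - i_size : 0;
}
```

For the reproducer above (empty file, `seek=8` with `bs=4K`), the write starts at byte 32768, so with 4096-byte sectors the gap that needs a hole extent is the full 32768 bytes.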

[PATCH 1/3] Btrfs: fix wrong nbytes information of the inode

2011-08-18 Thread Miao Xie

If we write some data into a data hole of a file (with no preallocation for
this hole), Btrfs will allocate some disk space and update nbytes of the
inode, but the other element, disk_i_size, needn't be updated. In this case,
we must update the inode metadata even though disk_i_size is not changed
(btrfs_ordered_update_i_size() returns 1).

 # mkfs.btrfs /dev/sdb1
 # mount /dev/sdb1 /mnt
 # touch /mnt/a
 # truncate -s 856002 /mnt/a
 # dd if=/dev/zero of=/mnt/a bs=4K count=1 conv=nocreat,notrunc
 # umount /mnt
 # btrfsck /dev/sdb1
 root 5 inode 257 errors 400
 found 32768 bytes used err is 1

Signed-off-by: Miao Xie 
---
 fs/btrfs/inode.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 18d08f4..634dd797 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1786,7 +1786,7 @@ static int btrfs_finish_ordered_io(struct inode *inode, u64 start, u64 end)
  &ordered_extent->list);
 
ret = btrfs_ordered_update_i_size(inode, 0, ordered_extent);
-   if (!ret) {
+   if (!ret || !test_bit(BTRFS_ORDERED_PREALLOC, &ordered_extent->flags)) {
ret = btrfs_update_inode(trans, root, inode);
BUG_ON(ret);
}
@@ -5788,7 +5788,7 @@ again:
 
add_pending_csums(trans, inode, ordered->file_offset, &ordered->list);
ret = btrfs_ordered_update_i_size(inode, 0, ordered);
-   if (!ret)
+   if (!ret || !test_bit(BTRFS_ORDERED_PREALLOC, &ordered->flags))
btrfs_update_inode(trans, root, inode);
ret = 0;
 out_unlock:
-- 
1.7.4
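
The changed condition in both hunks reads: update the inode if disk_i_size was updated (return value 0), or if the ordered extent is not a preallocated one. A tiny userspace model of that predicate is sketched below; the bit number and the `toy_`/`should_update_inode` names are placeholders invented for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit number for the "preallocated" flag (not the kernel value). */
#define TOY_ORDERED_PREALLOC 2

/* Minimal stand-in for a test_bit()-style helper on a single flags word. */
static bool toy_test_bit(unsigned int nr, unsigned long flags)
{
    assert(nr < 8 * sizeof(flags));
    return (flags >> nr) & 1UL;
}

/*
 * Mirror of the patched condition:
 *     !ret || !test_bit(BTRFS_ORDERED_PREALLOC, &flags)
 * update_ret is the result of the i_size update step (0 means
 * disk_i_size changed, non-zero means it did not).
 */
bool should_update_inode(int update_ret, unsigned long flags)
{
    return !update_ret || !toy_test_bit(TOY_ORDERED_PREALLOC, flags);
}
```

In the reproducer, the 4K write lands inside the truncated file, so disk_i_size stays the same and the update step returns 1; because the extent is not preallocated, the predicate is still true and the new nbytes value gets written.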

Re: Applications using fsync cause hangs for several seconds every few minutes

2011-08-18 Thread youagree

On 08/18/2011 09:29 AM, Andrew Guertin wrote:
> * Many processes occasionally hang for a short time
> * When this happens, my cpu monitor shows a short burst of cpu activity
> (100% of 1 core) followed by a longer period of IO
> * When this happens, iotop shows [btrfs-submit-0] and [btrfs-transacti]
> at the top of the list
> * Behavior slowly increases in duration (and frequency?) over time, and
> goes away with a reboot
> * Heavy IO makes behavior appear faster
> 
> ... and the following behaviors before commit 4e69b59:
> 
> * Occasional spikes of IO on cpu monitor concurrent with
> [btrfs-submit-0] and [btrfs-transacti] at top of iotop
> * No hangs, even when that occurs

Yes, exactly that happened in my case too. Yours is a much more precise
description! I did not diagnose 2.6.38 further because I just wanted to
establish a known-good version, and at first sight (2 days uptime) my HDD
behavior showed that it cannot be good if _any_ HDD thrashing appears at
all in the first place...

I was able to work with the computer during those IO spikes on 2.6.38
too, although it was observable that the HDD was being thrashed
(meanwhile, the LED was almost constantly lit). But it didn't cause other
programs to be unresponsive, I confirm...

> I wasn't taking notes or anything though, so I'm not 100% certain I was
> observing or interpreting or remembering everything correctly.
> 
> --Andrew


Re: Applications using fsync cause hangs for several seconds every few minutes

2011-08-18 Thread Andrew Guertin


On 08/17/2011 10:41 PM, Anand Jain wrote:
> Dave,
>
> good to have a test case on the 3.0 kernel. Do you have btrfs as
> root fs? And can you show how you are using btrfs? Mainly I would
> need 'btrfs fi show'; let me try if I can reproduce.
>
> Thanks, Anand


Personally, I find that large compiles are very "useful" in making the issue 
occur sooner. I'm on gentoo, so when I was bisecting, I'd often just emerge 
openoffice and let it run for a while.


For observing, the best way I found was to run JOSM (Java OpenStreetMap editor). 
Browsing around a map is very interactive, so it's immediately noticeable when 
it hangs, and downloading map tiles all the time uses a lot of IO. In-browser 
map applications would probably work too.


My filesystem is partitioned with a small ext2 /boot as sda1, a 2GB swap as 
sda2, and the remaining space as btrfs / on sda3.

btrfs fi show gives:
Label: none  uuid: 28559ad8-7db8-402b-a93d-27ec9c5e943b
	Total devices 1 FS bytes used 102.83GB
	devid    1 size 144.90GB used 144.90GB path /dev/sda3

Btrfs v0.19-35-g1b444cd-dirty

--Andrew

Re: Applications using fsync cause hangs for several seconds every few minutes

2011-08-18 Thread Andrew Guertin


On 08/18/2011 02:44 AM, youagree wrote:
> Also, a patch by Josef Bacik was an attempt at fixing this, but no one
> reported testing it on an affected system. It did not eliminate
> the slowdowns for me:
>
> PLEASE TEST: Everybody who is seeing weird and long hangs
> news://news.gmane.org:119/4e36c47e.70...@redhat.com

I had not seen this (actually, I had skimmed it but not thought it was
relevant). I will try it as soon as I get a chance.

> The HDD thrashing appeared on all other kernel versions I tried, higher
> than 2.6.37.
> Initially, I had been looking for a latest known good kernel (to
> prepare a proper git bisect as cwillu advised) and at first I also felt
> like 2.6.38 does not show this miserable behaviour. But later it turned
> out this was only for approximately 2 days of uptime. Given enough time,
> the lock-ups appeared on 2.6.38 too, although they were not as
> apparent as on later kernel versions, and the individual lockups took
> much less time with 2.6.38 running for 2 days (binary Sabayon Linux
> repository kernel).


I have not seen slowdowns on 2.6.38. More specifically, I observe the
following behaviors after commit 4e69b59:

* Many processes occasionally hang for a short time
* When this happens, my cpu monitor shows a short burst of cpu activity
(100% of 1 core) followed by a longer period of IO
* When this happens, iotop shows [btrfs-submit-0] and [btrfs-transacti]
at the top of the list
* Behavior slowly increases in duration (and frequency?) over time, and
goes away with a reboot
* Heavy IO makes behavior appear faster

... and the following behaviors before commit 4e69b59:

* Occasional spikes of IO on cpu monitor concurrent with
[btrfs-submit-0] and [btrfs-transacti] at top of iotop
* No hangs, even when that occurs

I wasn't taking notes or anything though, so I'm not 100% certain I was 
observing or interpreting or remembering everything correctly.


--Andrew
