Re: How to fasten btrfs?

2011-01-25 Thread Hubert Kario
On Tuesday 25 of January 2011 07:45:02 Magicloud Magiclouds wrote:
 Hi,
   I am using 2.6.36.3 kernel with btrfs, 512MB memory and a very slow
 disk, no special options for mounting btrfs except noatime. Now I
 found it very slow. When I rm a 5GB movie, it took 20 secs.

Try mounting with nodatacow, and defragment the directory the file resides in.

Regards
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: version

2011-01-25 Thread Helmut Hullen
Hello, Chris,

You wrote on 25.01.11:

 Regard the difference between df and btrfs filesystem df.

 I suspect this is fixed in 2.6.38 with the following commit.

 BE WARNED: there are some fairly hairy changes to the pathname
 lookup code to replace the BKL with RCU (not specific to btrfs)
 and so if you are tempted to try it (currently 2.6.38-rc2) only
 do so on a system that you don't care about data on and/or have
 very good incremental backups of which you trust...


 commit 6d07bcec969af335d4e35b3921131b7929bd634e
 Author: Miao Xie mi...@cn.fujitsu.com
 Date:   Wed Jan 5 10:07:31 2011 +

 btrfs: fix wrong free space information of btrfs

I've tried 2.6.38-rc2 - new problems.

mkfs.btrfs /dev/sdb
mount /dev/sdb /mnt/btr
btrfs device add /dev/sdc

cp dir_with_6_GByte /mnt/btr

leads to

---

# btrfs filesystem show
Label: none  uuid: 4a8c2c48-6c0e-4a97-8286-d1f7d930f9a8
Total devices 2 FS bytes used 6.53GB
devid    1 size 7.51GB used 805.50MB path /dev/sdb
devid    2 size 37.27GB used 7.00GB path /dev/sdc

Btrfs Btrfs v0.19

# btrfs filesystem df /mnt/btr
Data: total=7.01GB, used=6.52GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=384.75MB, used=9.84MB
Metadata: total=8.00MB, used=0.00

# df -t btrfs
Filesystem     Type  1K-blocks    Used Available Use% Mounted on
/dev/sdb       btrfs  46963224 6861804  39303836  15% /mnt/btr

---
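
A quick check of the numbers above, offered as a hedged sketch rather than a statement about how df is implemented: the 1K-block total printed by df (46963224) comes out very close to the sum of the two device sizes listed by btrfs filesystem show (7.51GB + 37.27GB), whereas btrfs filesystem df only counts the chunks allocated so far. The helper names kib_to_gib, device_sum_gib and print_comparison are made up for this illustration, and reading the tools' "GB" as 2^30 bytes is an assumption.

```c
#include <stdio.h>

/* Hypothetical helper: convert a count of 1 KiB blocks to GiB (2^30 bytes). */
double kib_to_gib(unsigned long long kib)
{
	return (double)kib / (1024.0 * 1024.0);
}

/* Sum of the per-device sizes from "btrfs filesystem show" above, in GiB. */
double device_sum_gib(void)
{
	return 7.51 + 37.27;
}

/* Print both totals side by side so they can be compared by eye. */
void print_comparison(void)
{
	printf("df total:     %.2f GiB\n", kib_to_gib(46963224ULL));
	printf("device sizes: %.2f GiB\n", device_sum_gib());
}
```

Calling print_comparison() from a main() shows both totals at roughly 44.8, which suggests that the df total here reflects raw device capacity and not allocated chunks.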

And then:

btrfs filesystem balance /mnt/btr

crashes with the dmesg lines

--- dmesg ---

bio too big device sdc (256 > 240)
bio too big device sdc (256 > 240)
bio too big device sdc (256 > 240)
bio too big device sdc (256 > 240)
------------[ cut here ]------------
kernel BUG at fs/btrfs/volumes.c:2097!
invalid opcode:  [#1]
last sysfs file: 
/sys/devices/pci:00/:00:07.1/host1/target1:0:0/1:0:0:0/block/sdb/dev
Modules linked in: sg nf_nat_ftp nf_conntrack_ftp ipt_MASQUERADE iptable_nat 
nf_nat xt_DSCP xt_multiport xt_recent nf_conntrack_ipv4 nf_defrag_ipv4 xt_state 
nf_conntrack xt_tcpudp ipt_REJECT iptable_filter iptable_mangle ip_tables 
xt_iprange x_tables nfsd exportfs 8139too 8139cp savagefb fb_ddc i2c_algo_bit 
vgastate i2c_piix4 piix e100 mii intel_agp intel_gtt agpgart cmd64x video 
thermal_sys ac battery yenta_socket pcmcia_rsrc pcmcia pcmcia_core 
thinkpad_acpi hwmon rfkill nvram fuse

Pid: 16501, comm: btrfs Not tainted 2.6.38-rc2-ODS #1 26478EG/26478EG
EIP: 0060:[c1235264] EFLAGS: 00010282 CPU: 0
EIP is at btrfs_balance+0x2d4/0x2e0
EAX: fffb EBX: cd57 ECX: d7cd4090 EDX: 
ESI: d097c070 EDI: cf9f3c00 EBP: d37dde9c ESP: d37dde38
 DS: 007b ES: 007b FS:  GS: 00e0 SS: 0068
Process btrfs (pid: 16501, ti=d37dc000 task=cd5d6f00 task.ti=d37dc000)
Stack:
 99cc 0001 00e4 0010  0100 cd56e000 cf9f67d8
 aea58000 0001  99cc 0001 0100a3c8  00e4
 0199cc00  0001 e400   cfa50700 ffea
Call Trace:
 [c123bcf1] btrfs_ioctl+0x2e1/0x9d0
 [c123ba10] ? btrfs_ioctl+0x0/0x9d0
 [c10c3f65] do_vfs_ioctl+0x85/0x590
 [c10206db] ? do_page_fault+0x17b/0x380
 [c10b554b] ? do_sys_open+0xdb/0x110
 [c10c44f7] sys_ioctl+0x87/0x90
 [c1753d0c] syscall_call+0x7/0xb
Code: 1b ff ff ff 89 f0 e8 cc 75 fb ff 8b 55 b4 8b 82 10 01 00 00 05 74 19 00 
00 e8 09 dc 51 00 e9 70 fd ff ff 31 db eb dd 85 c0 74 9d 0f 0b 0f 0b 0f 0b 0f 
0b 0f 0b 66 90 55 89 e5 56 53 83 ec 34 3e
EIP: [c1235264] btrfs_balance+0x2d4/0x2e0 SS:ESP 0068:d37dde38
---[ end trace 8dcdbc0f75858a35 ]---

---


Best regards!
Helmut


Re: How to fasten btrfs?

2011-01-25 Thread Hubert Kario
On Tuesday, January 25, 2011 15:28:11 Magicloud Magiclouds wrote:
 So you are saying that even for an unlink operation, the system walks
 through all of the file's data blocks? I thought unlink wouldn't have to
 write much data.

No, but the fewer extents a file has, the less data has to be changed 
when it is removed.
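
To see how fragmented a file is before and after defragmenting, one can ask the kernel for its extent count. This is a minimal sketch, assuming a Linux system and a filesystem that implements the FIEMAP ioctl; the function name count_extents is made up here and is not part of any btrfs tool. With fm_extent_count set to 0 the kernel only reports how many extents the file maps to, without returning them.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, FIEMAP_* constants */

/*
 * Hypothetical helper: return the number of extents backing the file at
 * path, or -1 if the file cannot be opened or FIEMAP is not supported.
 */
long count_extents(const char *path)
{
	struct fiemap fm;
	int fd, ret;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	memset(&fm, 0, sizeof(fm));
	fm.fm_start = 0;
	fm.fm_length = FIEMAP_MAX_OFFSET;  /* cover the whole file */
	fm.fm_flags = FIEMAP_FLAG_SYNC;    /* flush dirty data before mapping */
	fm.fm_extent_count = 0;            /* count only, return no extents */

	ret = ioctl(fd, FS_IOC_FIEMAP, &fm);
	close(fd);
	if (ret < 0)
		return -1;

	return (long)fm.fm_mapped_extents;
}
```

A large count for the 5GB file would be consistent with the slow rm; comparing the count before and after a defragment run shows whether it helped.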

 
 On Tue, Jan 25, 2011 at 6:30 PM, Hubert Kario h...@qbs.com.pl wrote:
  On Tuesday 25 of January 2011 07:45:02 Magicloud Magiclouds wrote:
  Hi,
I am using 2.6.36.3 kernel with btrfs, 512MB memory and a very slow
  disk, no special options for mounting btrfs except noatime. Now I
  found it very slow. When I rm a 5GB movie, it took 20 secs.
  
  try mounting with nodatacow and defragment the directory the file resides
  in
  
  Regards
  --
  Hubert Kario
  QBS - Quality Business Software
  02-656 Warszawa, ul. Ksawerów 30/85
  tel. +48 (22) 646-61-51, 646-74-24
  www.qbs.com.pl

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


Re: flush-btrfs-1 hangs when building openwrt

2011-01-25 Thread Josef Bacik
On Tue, Jan 25, 2011 at 09:51:01AM +0100, Daniel Poelzleithner wrote:
 Hi,
 
 Since update to 2.6.37 I can't build openwrt on my btrfs buildroot anymore.
 I'm not sure if this is related to the other flush-btrfs-1 thread.
 
 plenty of diskspace is free:
 
 /dev/mapper/cruor-build
97G   68G   27G  73% /opt/build
 
 It always hangs when openwrt builds the ext4 image and runs tune2fs on it.
 
 /opt/build/fahrenheit/openwrt/staging_dir/host/bin/tune2fs -O
 extents,uninit_bg,dir_index
 /opt/build/fahrenheit/openwrt/build_dir/linux-x86_kvm_guest/root.ext4
 tune2fs 1.41.13 (13-Dec-2010)
 
 the processes can't be killed.
 
 alt-sysrq-t does not show anything, nor is there an oops.
 
 I put the openwrt config I'm using at https://gist.github.com/794593
 , maybe it is reproducible.
 
 Linux cruor 2.6.37 #2 SMP Thu Jan 20 02:09:59 CET 2011 x86_64 GNU/Linux
 

What does sysrq+w show when it's hanging?  Also, could you give the exact steps
to reproduce?  I went to the openwrt site to try and build, but it seems like
there are a lot of moving parts.  If you can just tell me what to download and
what you run, I can try to reproduce it locally.  Thanks,

Josef


Re: Synching a Backup Server

2011-01-25 Thread Kaspar Schleiser

On 01/22/2011 02:55 PM, Hubert Kario wrote:

It looks like ZFS, Btrfs, and LVM should work in similar manners, but
the overloaded terminology (pool, volume, sub-volume, filesystem are
different in all three) and new terminology that's only in Btrfs is
confusing.


With btrfs you need to have *a* filesystem, once you have it, you can add and
remove disks/partitions from it, no need to use 'mkfs.btrfs', just 'btrfs'.


That's just a design decision, right? There's no need for a default or 
root subvolume.


It should be rather easy to change btrfs so that you first have to 
create a storage pool which combines disks for btrfs, and on top of 
that you can create filesystems which are just subvolumes.


The creation of a storage pool could be very similar to the current 
mkfs, just without the creation of a root subvolume.


A new, simpler mkfs would then just create a subvolume on top of the 
storage pool that can be mounted.


Regards,
Kaspar


Re: Synching a Backup Server

2011-01-25 Thread Hubert Kario
On Tuesday, January 25, 2011 18:29:35 Kaspar Schleiser wrote:
 On 01/22/2011 02:55 PM, Hubert Kario wrote:
  It looks like ZFS, Btrfs, and LVM should work in similar manners, but
  the overloaded terminology (pool, volume, sub-volume, filesystem are
  different in all three) and new terminology that's only in Btrfs is
  confusing.
  
  With btrfs you need to have *a* filesystem, once you have it, you can add
  and remove disks/partitions from it, no need to use 'mkfs.btrfs', just
  'btrfs'.
 
 That's just a design decision, right? There's no need for a default or
 root subvolume.
 
 It should be rather easy to change btrfs so that you first have to
 create a storage pool which combines disks for btrfs, and on top of
 that you can create filesystems which are just subvolumes.
 
 The creation of a storage pool could be very similar to the current
 mkfs, just without the creation of a root subvolume.
 
 A new, simpler mkfs would then just create a subvolume on top of the
 storage pool that can be mounted.
 
 Regards,
 Kaspar

I'm not sure, but for btrfs to support storage pools the way ZFS does would 
require a change in the disk layout.

Besides, I don't see *why* this should be done...

And as far as I know ZFS doesn't support different redundancy levels for 
different files residing in the same directory. You can have 
~/1billion$-project.tar.gz with triple redundancy and ~/temp.video.mkv with no 
redundancy with btrfs...

Regards,
-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


[PATCH] Btrfs: fix formatting in file.c

2011-01-25 Thread Josef Bacik
Sorry, but these were bugging me.  Just cleanup some of the formatting in
file.c.

Signed-off-by: Josef Bacik jo...@redhat.com
---
 fs/btrfs/file.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index c800d58..a754865 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -75,9 +75,8 @@ static noinline int btrfs_copy_from_user(loff_t pos, int 
num_pages,
total_copied += copied;
 
/* Return to btrfs_file_aio_write to fault page */
-   if (unlikely(copied == 0)) {
+   if (unlikely(copied == 0))
break;
-   }
 
if (unlikely(copied < PAGE_CACHE_SIZE - offset)) {
offset += copied;
@@ -149,13 +148,14 @@ static noinline int dirty_and_release_pages(struct 
btrfs_trans_handle *trans,
ClearPageChecked(p);
set_page_dirty(p);
}
-   if (end_pos > isize) {
+
+   /*
+* we've only changed i_size in ram, and we haven't updated
+* the disk i_size.  There is no need to log the inode
+* at this time.
+*/
+   if (end_pos > isize)
i_size_write(inode, end_pos);
-   /* we've only changed i_size in ram, and we haven't updated
-* the disk i_size.  There is no need to log the inode
-* at this time.
-*/
-   }
return 0;
 }
 
-- 
1.6.6.1



Re: Synching a Backup Server

2011-01-25 Thread Hubert Kario
On Tuesday, January 25, 2011 18:59:39 Freddie Cash wrote:
 On Tue, Jan 25, 2011 at 9:43 AM, Hubert Kario h...@qbs.com.pl wrote:
  Besides, I don't see *why* this should be done...
  
  And as far as I know ZFS doesn't support different redundancy levels for
  different files residing in the same directory. You can have
  ~/1billion$-project.tar.gz with triple redundancy and ~/temp.video.mkv
  with no redundancy with btrfs...
 
 With ZFS, redundancy (mirror, raidz1, raidz2, raidz3) is done at the
 storage pool layer, and affects the entire pool.  You can mix and
 match redundancy levels (combine mirror vdevs and raidz vdevs in the
 same pool), but there's no way to control what data blocks go to which
 vdev, as it's all just one giant pool of storage.
 
 However, there is a copies property for each filesystem that affects
 how many copies of data blocks are stored, to increase the redundancy
 for that filesystem.  For example, you can create a storage pool using
 2 mirror vdevs (4 drives; equivalent to a RAID10 setup); then create a
 filesystem with copies=2.  Thus, any blocks written to that filesystem
 will be stored twice, each of which is then striped across the two
 vdevs, and then mirrored to each disk in the vdevs, potentially
 leading to 4 (or more) blocks of data written to disk.
 
 This is similar to using Linux md to create RAID arrays underneath LVM
 volume groups.  The redundancy is managed via md; the filesystems just
 see a collection of blocks to write to.
 
 The big difference (from what I understand) between ZFS and Btrfs is
 the layering.  ZFS separates storage management from filesystem
 management, so redundancy happens at lower layers and the filesystem
 just sends blocks to the pool.  Whereas Btrfs combines them into one,
 so that redundancy is managed at the filesystem level and can be
 changed on a per-directory (or per-sub-volume?) basis, with the
 filesystem handling the writes and the redundancy.

Right now you can't change the RAID level at all, but there are hooks planned 
to enable selecting the RAID level on a per-file basis.

btrfs allows for better management of space and less over-provisioning.

So I'd say that management of storage space with btrfs is even easier than 
with ZFS:

The admin sets the default redundancy level for the whole file system (let's 
say that it's a 4-disk system) to RAID1 with two copies.
After setting up the system, the admin sets the redundancy level in the 
directories with databases to RAID10.
Users storing big files use RAID5 for some files.

One of the drives fails; the admin removes the drive from the set and 
schedules a rebalance.

The set is smaller, but all redundancy is preserved.

New drives arrive and are added to the FS. The FS is rebalanced a second time 
to achieve better performance (the space would be usable even without it).

 
 I don't pretend to understand all the intricacies of how Btrfs works
 (I'm working on it), but the layering in ZFS is very nice and easy to
 work with in comparison.  Interesting how ZFS is considered the
 rampant layering violation, though.  ;)  :)  :D

btrfs is much simpler from the user's point of view :)

As for the rampant layering violation: most of the code that deals with stored 
data isn't concerned with the RAID level, in contrast with ZFS. In other words, 
it's in the code, not the interface.

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl


full btrfs partition, became unmountable (+ a solution that thankfully worked for me)

2011-01-25 Thread Cyrille Chépélov
Hello all,

Last Friday, the /var and /home partition on one of my appliances became
full. This should normally not be much of a problem, except that after
the incident, I had been unable to mount the partition back again.

The appliance runs 2.6.32 as provided by Debian during the last two
months. 
The rescue computer runs 2.6.37; both exhibited the same behaviour at
mount: an infinite loop-and-abort cycle (I unfortunately did not write
down the exact messages, but in a nutshell, there was not enough free
space to replay the log, so it aborted).

After pulling the SD card (yes) to break the loop, I ended up with a
corrupt file system. Any attempt to mount, debug or fsck (using
btrfs-tools 0.19+20100601 as shipped by Debian, or compiled from git
1b444cd2e6ab8dcafdd) aborted with the following message:
btrfs-debug-tree: disk-io.c:741: open_ctree_fd: Assertion
`!(!tree_root->node)' failed.

After much scavenging on the disk image, I finally managed to recover,
using the (dirty) patch attached here. Since apparently other people had
similar issues, I'm posting it in the hope it might be useful.

-- Cyrille

PS: Chris, if btrfs-images of before and after my butcher fix would
be useful to you, just let me know. 
diff --git a/btrfsck.c b/btrfsck.c
index 63e44d1..1e6bc32 100644
--- a/btrfsck.c
+++ b/btrfsck.c
@@ -2823,13 +2823,17 @@ int main(int ac, char **av)
 	u64 bytenr = 0;
 	int ret;
 	int num;
+	int emergency_super = 0;
 
 	while(1) {
 		int c;
-		c = getopt(ac, av, "s:");
+		c = getopt(ac, av, "es:");
 		if (c < 0)
 			break;
 		switch(c) {
+			case 'e':
+emergency_super = 1;
+break;
 			case 's':
 num = atol(optarg);
 bytenr = btrfs_sb_offset(num);
@@ -2861,6 +2865,12 @@ int main(int ac, char **av)
 	if (root == NULL)
 		return 1;
 
+	if (root->fs_info->emergency_root_tree && (!emergency_super)) {
+		printf("DANGEROUS: had to use a synthetic super. Please run with '-e' flag if you know why you do it.\n");
+		printf("  ... and have perfect backups.\n");
+		return 1;
+	}
+
 	ret = check_extents(root);
 	if (ret)
 		goto out;
@@ -2869,6 +2879,27 @@ int main(int ac, char **av)
 		goto out;
 
 	ret = check_root_refs(root, &root_cache);
+	if (ret)
+		goto out;
+
+
+	if (root->fs_info->emergency_root_tree) {
+		printf("DANGEROUS: had to use a scavenged root. Apparently could figure out the primary trees ?\n"
+		"  Now writing supers, knock wood.\n");
+		free_root_recs(&root_cache);
+		close_ctree(root);
+		
+		cache_tree_init(&root_cache);
+		root = open_ctree(av[optind], bytenr, 1 /* WRITES! */);
+
+		ret = write_all_supers(root);
+		if (ret) {
+			printf("Error writing superblocks.\n");		
+			goto out;			
+		}
+		printf("Wrote back superblocks.\n");
+	}
+
 out:
 	free_root_recs(&root_cache);
 	close_ctree(root);
@@ -2897,6 +2928,7 @@ out:
 	printf("file data blocks allocated: %llu\n referenced %llu\n",
 		(unsigned long long)data_bytes_allocated,
 		(unsigned long long)data_bytes_referenced);
+
 	printf("%s\n", BTRFS_BUILD_VERSION);
 	return ret;
 }
diff --git a/ctree.h b/ctree.h
index b79e238..7439d87 100644
--- a/ctree.h
+++ b/ctree.h
@@ -728,6 +728,7 @@ struct btrfs_fs_info {
 	struct list_head space_info;
 	int system_allocs;
 	int readonly;
+	int emergency_root_tree;
 };
 
 /*
diff --git a/disk-io.c b/disk-io.c
index a6e1000..0dd7310 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -41,8 +41,11 @@ static int check_tree_block(struct btrfs_root *root, struct extent_buffer *buf)
 	struct btrfs_fs_devices *fs_devices;
 	int ret = 1;
 
-	if (buf->start != btrfs_header_bytenr(buf))
+	if (buf->start != btrfs_header_bytenr(buf)) {
+		fprintf(stderr, "start place mismatch, buf says %llu  btrfs_hdr says %llu\n",
+			buf->start, btrfs_header_bytenr(buf));
 		return ret;
+	}
 
 	fs_devices = root->fs_info->fs_devices;
 	while (fs_devices) {
@@ -204,16 +207,26 @@ struct extent_buffer *read_tree_block(struct btrfs_root *root, u64 bytenr,
 		eb->dev_bytenr = multi->stripes[0].physical;
 		kfree(multi);
 		ret = read_extent_from_disk(eb);
-		if (ret == 0 && check_tree_block(root, eb) == 0 &&
-		csum_tree_block(root, eb, 1) == 0 &&
-		verify_parent_transid(eb->tree, eb, parent_transid) == 0) {
-			btrfs_set_buffer_uptodate(eb);
-			return eb;
+		if (ret == 0) {
+			
+		if (check_tree_block(root, eb) == 0) {
+			/* fprintf(stderr, "checked tree block %p for %p -- %llu \n", root, eb, eb->start);*/
+			if (csum_tree_block(root, eb, 1) == 0) {
+/* fprintf(stderr, "tree block csum %p for %p--%llu is OK\n", root, eb, eb->start); */
+if (verify_parent_transid(eb->tree, eb, parent_transid) == 0) {
+	/*fprintf(stderr, "tree block %p--%llu has correct transid, setting uptodate\n",  eb, eb->start);*/
+
+	btrfs_set_buffer_uptodate(eb);
+	return eb;
+}
+			}	
+		   }
+
 		}
 		num_copies = btrfs_num_copies(&root->fs_info->mapping_tree,
 	  eb->start, eb->len);
 		if (num_copies == 1) {
-			break;
+			//break;
 		}
 		mirror_num++;
 		if (mirror_num > num_copies) {
@@ -581,7 +594,7 @@ struct btrfs_root 

[PATCH] Btrfs: simplify our write path

2011-01-25 Thread Josef Bacik
Our aio_write function is huge and kind of hard to follow at times.  So this
patch fixes this by breaking the buffered and direct write paths out into
separate functions so it's a little clearer what's going on.  I've also fixed
some wrong typing that we had and added the ability to handle getting an error
back from btrfs_set_extent_delalloc.  Tested this with xfstests and everything
came out fine.  Thanks,

Signed-off-by: Josef Bacik jo...@redhat.com
---
 fs/btrfs/file.c |  384 ++-
 1 files changed, 208 insertions(+), 176 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index a754865..c0f312a 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -45,14 +45,14 @@
  * and be replaced with calls into generic code.
  */
 static noinline int btrfs_copy_from_user(loff_t pos, int num_pages,
-int write_bytes,
+size_t write_bytes,
 struct page **prepared_pages,
 struct iov_iter *i)
 {
size_t copied = 0;
+   size_t total_copied = 0;
int pg = 0;
int offset = pos & (PAGE_CACHE_SIZE - 1);
-   int total_copied = 0;
 
while (write_bytes > 0) {
size_t count = min_t(size_t,
@@ -116,13 +116,12 @@ static noinline void btrfs_drop_pages(struct page 
**pages, size_t num_pages)
  * this also makes the decision about creating an inline extent vs
  * doing real data extents, marking pages dirty and delalloc as required.
  */
-static noinline int dirty_and_release_pages(struct btrfs_trans_handle *trans,
-  struct btrfs_root *root,
-  struct file *file,
-  struct page **pages,
-  size_t num_pages,
-  loff_t pos,
-  size_t write_bytes)
+static noinline int dirty_and_release_pages(struct btrfs_root *root,
+   struct file *file,
+   struct page **pages,
+   size_t num_pages,
+   loff_t pos,
+   size_t write_bytes)
 {
int err = 0;
int i;
@@ -140,7 +139,8 @@ static noinline int dirty_and_release_pages(struct 
btrfs_trans_handle *trans,
end_of_last_block = start_pos + num_bytes - 1;
err = btrfs_set_extent_delalloc(inode, start_pos, end_of_last_block,
NULL);
-   BUG_ON(err);
+   if (err)
+   return err;
 
for (i = 0; i < num_pages; i++) {
struct page *p = pages[i];
@@ -839,119 +839,32 @@ again:
return 0;
 }
 
-static ssize_t btrfs_file_aio_write(struct kiocb *iocb,
-   const struct iovec *iov,
-   unsigned long nr_segs, loff_t pos)
+static noinline ssize_t __btrfs_buffered_write(struct file *file,
+  struct iov_iter *i,
+  loff_t pos)
 {
-   struct file *file = iocb->ki_filp;
struct inode *inode = fdentry(file)->d_inode;
struct btrfs_root *root = BTRFS_I(inode)->root;
struct page *pinned[2];
struct page **pages = NULL;
-   struct iov_iter i;
-   loff_t *ppos = &iocb->ki_pos;
-   loff_t start_pos;
-   ssize_t num_written = 0;
-   ssize_t err = 0;
-   size_t count;
-   size_t ocount;
-   int ret = 0;
-   int nrptrs;
unsigned long first_index;
unsigned long last_index;
-   int will_write;
-   int buffered = 0;
-   int copied = 0;
-   int dirty_pages = 0;
-
-   will_write = ((file->f_flags & O_DSYNC) || IS_SYNC(inode) ||
- (file->f_flags & O_DIRECT));
-
-   pinned[0] = NULL;
-   pinned[1] = NULL;
-
-   start_pos = pos;
-
-   vfs_check_frozen(inode->i_sb, SB_FREEZE_WRITE);
-
-   mutex_lock(&inode->i_mutex);
-
-   err = generic_segment_checks(iov, &nr_segs, &ocount, VERIFY_READ);
-   if (err)
-   goto out;
-   count = ocount;
-
-   current->backing_dev_info = inode->i_mapping->backing_dev_info;
-   err = generic_write_checks(file, &pos, &count, S_ISBLK(inode->i_mode));
-   if (err)
-   goto out;
-
-   if (count == 0)
-   goto out;
-
-   err = file_remove_suid(file);
-   if (err)
-   goto out;
-
-   /*
-* If BTRFS flips readonly due to some impossible error
-* (fs_info->fs_state now has BTRFS_SUPER_FLAG_ERROR),
-* although we have opened a file as writable, we have
-* to stop this write operation to ensure FS consistency.
-*/
-   if (root->fs_info->fs_state &

[PATCH] Btrfs: fix how we deal with the pages array in the write path

2011-01-25 Thread Josef Bacik
Really we don't need to memset the pages array at all, since we know how many
pages we're going to use in the array and pass that around.  So don't memset,
just trust we're not idiots and we pass num_pages around properly.

Signed-off-by: Josef Bacik jo...@redhat.com
---
 fs/btrfs/file.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index c0f312a..f4eef63 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -95,8 +95,6 @@ static noinline void btrfs_drop_pages(struct page **pages, 
size_t num_pages)
 {
size_t i;
for (i = 0; i < num_pages; i++) {
-   if (!pages[i])
-   break;
/* page checked is some magic around finding pages that
 * have been modified without going through btrfs_set_page_dirty
 * clear it here
@@ -788,7 +786,6 @@ static noinline int prepare_pages(struct btrfs_root *root, 
struct file *file,
return err;
}
 
-   memset(pages, 0, num_pages * sizeof(struct page *));
 again:
for (i = 0; i < num_pages; i++) {
pages[i] = grab_cache_page(inode->i_mapping, index + i);
@@ -910,8 +907,6 @@ static noinline ssize_t __btrfs_buffered_write(struct file 
*file,
 
WARN_ON(num_pages > nrptrs);
 again:
-   memset(pages, 0, sizeof(struct page *) * nrptrs);
-
/*
 * Fault pages before locking them in prepare_pages
 * to avoid recursive lock
@@ -926,6 +921,11 @@ again:
if (ret)
break;
 
+   /*
+* This is going to setup the pages array with the number of
+* pages we want, so we don't really need to worry about the
+* contents of pages from loop to loop
+*/
ret = prepare_pages(root, file, pages, num_pages,
pos, first_index, last_index,
write_bytes);
-- 
1.6.6.1



[PATCH] Btrfs: handle no memory properly in prepare_pages

2011-01-25 Thread Josef Bacik
Instead of doing a BUG_ON(1) in prepare_pages if grab_cache_page() fails, just
loop through the pages we've already grabbed and unlock and release them, then
return -ENOMEM like we should.  Thanks,

Signed-off-by: Josef Bacik jo...@redhat.com
---
 fs/btrfs/file.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index f4eef63..416eee8 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -790,8 +790,12 @@ again:
for (i = 0; i < num_pages; i++) {
pages[i] = grab_cache_page(inode->i_mapping, index + i);
if (!pages[i]) {
-   err = -ENOMEM;
-   BUG_ON(1);
+   int c;
+   for (c = i - 1; c >= 0; c--) {
+   unlock_page(pages[c]);
+   page_cache_release(pages[c]);
+   }
+   return -ENOMEM;
}
wait_on_page_writeback(pages[i]);
}
-- 
1.6.6.1
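
The unwind-on-failure pattern in this patch can be tried outside the kernel. What follows is a user-space sketch only, not the btrfs code: get_object() and put_object() are hypothetical stand-ins for grab_cache_page() and for unlock_page() plus page_cache_release(), and the allocs_left counter exists purely to simulate running out of memory.

```c
#include <errno.h>
#include <stdlib.h>

/* Simulation state, hypothetical and for this sketch only. */
int live_objects;      /* objects currently held */
int allocs_left = -1;  /* successful allocations remaining; -1 = no limit */

/* Stand-in for grab_cache_page(): returns NULL once the budget is used up. */
void *get_object(void)
{
	void *p;

	if (allocs_left == 0)
		return NULL;
	if (allocs_left > 0)
		allocs_left--;
	p = malloc(64);
	if (p)
		live_objects++;
	return p;
}

/* Stand-in for unlock_page() followed by page_cache_release(). */
void put_object(void *p)
{
	free(p);
	live_objects--;
}

/*
 * Fill objs[0..n-1]. If grabbing entry i fails, release entries i-1
 * down to 0 and return -ENOMEM, so the caller has nothing to undo.
 * Returns 0 when all n entries were obtained.
 */
int grab_objects(void **objs, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		objs[i] = get_object();
		if (!objs[i]) {
			int c;

			for (c = i - 1; c >= 0; c--)
				put_object(objs[c]);
			return -ENOMEM;
		}
	}
	return 0;
}
```

Setting allocs_left to 2 and asking for 4 objects makes grab_objects() return -ENOMEM with live_objects back at 0, which is the property the patch restores: a failed call leaves nothing locked or referenced.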



[PATCH 3/3] btrfs: fix missing break in switch phrase

2011-01-25 Thread liubo

There is a missing break in switch, fix it.

Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
---
 fs/btrfs/print-tree.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/print-tree.c b/fs/btrfs/print-tree.c
index 0d126be..fb2605d 100644
--- a/fs/btrfs/print-tree.c
+++ b/fs/btrfs/print-tree.c
@@ -260,6 +260,7 @@ void btrfs_print_leaf(struct btrfs_root *root, struct 
extent_buffer *l)
 #else
BUG();
 #endif
+   break;
case BTRFS_BLOCK_GROUP_ITEM_KEY:
bi = btrfs_item_ptr(l, i,
struct btrfs_block_group_item);
-- 
1.6.5.2
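
For readers less used to C switch semantics, here is a small user-space sketch of the bug class this patch fixes. The enum and function names are hypothetical and are not taken from print-tree.c; only the fall-through behavior is the point.

```c
/* Hypothetical item kinds; these are not the real btrfs key constants. */
enum item_kind { KIND_A, KIND_B };

/*
 * Buggy variant: KIND_A has no break, so control continues into the
 * KIND_B branch. Returns how many branches were executed.
 */
int branches_run_buggy(enum item_kind k)
{
	int n = 0;

	switch (k) {
	case KIND_A:
		n++;
		/* fall through - this is the bug */
	case KIND_B:
		n++;
		break;
	}
	return n;
}

/* Fixed variant: every case ends with break, so exactly one branch runs. */
int branches_run_fixed(enum item_kind k)
{
	int n = 0;

	switch (k) {
	case KIND_A:
		n++;
		break;
	case KIND_B:
		n++;
		break;
	}
	return n;
}
```

With KIND_A the buggy variant runs two branches and the fixed one runs a single branch; in the patched function the missing break sits before the BTRFS_BLOCK_GROUP_ITEM_KEY case, so that case's code also ran for the case before it.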