Re: gcc inlining heuristics was Re: [PATCH -v7][RFC]: mutex: implement adaptive spinning

2009-01-21 Thread Andi Kleen
 GCC 4.3.2. Maybe i missed something obvious?

The typical use case of restrict is to tell it that multiple given
arrays are independent and then give the loop optimizer 
more freedom to handle expressions in the loop that
accesses these arrays.

Since there are no loops in the list functions nothing changed.

Ok presumably there are some other optimizations which 
rely on that alias information too, but again the list_*
stuff is probably too simple to trigger any of them.

-Andi

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: gcc inlining heuristics was Re: [PATCH -v7][RFC]: mutex: implement adaptive spinning

2009-01-21 Thread Nick Piggin
On Wed, Jan 21, 2009 at 09:54:02AM +0100, Andi Kleen wrote:
  GCC 4.3.2. Maybe i missed something obvious?
 
 The typical use case of restrict is to tell it that multiple given
 arrays are independent and then give the loop optimizer 
 more freedom to handle expressions in the loop that
 accesses these arrays.
 
 Since there are no loops in the list functions nothing changed.
 
 Ok presumably there are some other optimizations which 
 rely on that alias information too, but again the list_*
 stuff is probably too simple to trigger any of them.

Any function that does several interleaved loads and stores
through different pointers could have much more freedom to
move loads early and stores late. Big OOOE CPUs won't care
so much, but embedded and things (including in-order x86)
are very important users of the kernel.

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: gcc inlining heuristics was Re: [PATCH -v7][RFC]: mutex: implement adaptive spinning

2009-01-21 Thread Nick Piggin
On Wed, Jan 21, 2009 at 10:20:49AM +0100, Andi Kleen wrote:
 On Wed, Jan 21, 2009 at 09:52:08AM +0100, Nick Piggin wrote:
  On Wed, Jan 21, 2009 at 09:54:02AM +0100, Andi Kleen wrote:
GCC 4.3.2. Maybe i missed something obvious?
   
   The typical use case of restrict is to tell it that multiple given
   arrays are independent and then give the loop optimizer 
   more freedom to handle expressions in the loop that
   accesses these arrays.
   
   Since there are no loops in the list functions nothing changed.
   
   Ok presumably there are some other optimizations which 
   rely on that alias information too, but again the list_*
   stuff is probably too simple to trigger any of them.
  
  Any function that does several interleaved loads and stores
  through different pointers could have much more freedom to
  move loads early and stores late. 
 
 For once that would require more live registers. It's not
 a clear and obvious win. Especially not if you have
 only very little registers, like on 32bit x86.
 
 Then it would typically increase code size.

The point is that the compiler is then free to do it. If things
slow down after the compiler gets *more* information, then that
is a problem with the compiler heuristics rather than the
information we give it.

 
 Then x86s tend to have very very fast L1 caches and
 if something is not in L1 on reads then the cost of fetching
 something for a read dwarfs the few cycles you can typically
 get out of this.

Well most architectures have L1 caches of several cycles. And
L2 miss typically means going to L2 which in some cases the
compiler is expected to attempt to cover as much as possible
(eg in-order architectures).

If the caches are missed completely, then especially with an
in-order architecture, you want to issue as many parallel loads
as possible during the stall. If the compiler can't resolve
aliases, then it simply won't be able to bring some of those
loads forward.


 And lastly even on a in order system stores can 
 be typically queued without stalling, so it doesn't
 hurt to do them early.

Store queues are, what? On the order of tens of entries for
big power hungry x86? I'd guess much smaller for low power
in-order x86 and ARM etc. These can definitely fill up and
stall, so you still want to get loads out early if possible.

Even a lot of OOOE CPUs I think won't have the best alias
anaysis, so all else being equal, it wouldn't hurt them to
move loads earlier.


 Also at least x86 gcc normally doesn't do scheduling 
 beyond basic blocks, so any if () shuts it up.

I don't think any of this is a reason not to use restrict, though.
But... there are so many places we could add it to the kernel, and
probably so few where it makes much difference. Maybe it should be
able to help some critical core code, though.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Warning and BUG with btrfs and corrupted image

2009-01-21 Thread Eric Sesterhenn
* Pavel Machek (pa...@suse.cz) wrote:
 On Tue 2009-01-20 18:34:55, Eric Sesterhenn wrote:
  * Dave Chinner (da...@fromorbit.com) wrote:
   On Tue, Jan 20, 2009 at 11:15:03AM +0100, Eric Sesterhenn wrote:
* Dave Chinner (da...@fromorbit.com) wrote:
 On Tue, Jan 20, 2009 at 07:31:50AM +0100, Eric Sesterhenn wrote:
  * Pavel Machek (pa...@suse.cz) wrote:
   Does ext2/3 and vfat survive that kind of attacks? Those are 'in
   production' and should survive it...
  
  I regularly (once or twice a week) test 100 corrupted images of 
  vfat, udf, msdos, swap, iso9660, ext2, ext3, ext4, minix, bfs, befs,
  hfs, hfs+, qnx4, affs and cramfs on each of my two test machines.
 
 Any reason you are not testing XFS in that set?

So far the responses from xfs folks have been disappointing, if you are
interested in bugreports i can send you some.
   
   Sure I am.  It would be good if you could start testing XFS along
   with all the other filesystems and report anything you find.
  
  Ok, i wont report stuff with only xfs-internal backtraces from
  xfs_error_report() or are they interesting to you?
  
  This occurs during mount, box is dead afterwards
  Image can be found here :
  http://www.cccmz.de/~snakebyte/xfs.11.img.bz2
  I see this every ~10 images, which makes further testing hard :)
 
 BTW have you considered trying to run fsck.* on such images? I had lot
 of fun with e2fsck/e3fsck/fsck.vfat.

Not yet, but if one assumes the people writing the userspace code are
the same writing the kernel code, and probably make the same errors in
both versions this might be interesting.

For userspace a more advanced approach than random bit flipping is
possible with tools like bunny (http://code.google.com/p/bunny-the-fuzzer/)
Bunny instrumentates the code, and then tries to change the input file
based on the feedback it gets from the instrumentation. I fired up
an instance testing ext2, lets see if it finishes until evening :-)

Maybe someone is brave enough to test this approach with user mode
linux... :)

Greetings, Eric
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: gcc inlining heuristics was Re: [PATCH -v7][RFC]: mutex: implement adaptive spinning

2009-01-21 Thread Andi Kleen
 The point is that the compiler is then free to do it. If things
 slow down after the compiler gets *more* information, then that
 is a problem with the compiler heuristics rather than the
 information we give it.

The point was the -Os disables it typically then.
(not always, compiler heuristics are far from perfect)

 
  
  Then x86s tend to have very very fast L1 caches and
  if something is not in L1 on reads then the cost of fetching
  something for a read dwarfs the few cycles you can typically
  get out of this.
 
 Well most architectures have L1 caches of several cycles. And
 L2 miss typically means going to L2 which in some cases the
 compiler is expected to attempt to cover as much as possible
 (eg in-order architectures).

L2 cache is so much slower that scheduling a few instructions
more doesn't help much.

 stall, so you still want to get loads out early if possible.
 
 Even a lot of OOOE CPUs I think won't have the best alias
 anaysis, so all else being equal, it wouldn't hurt them to
 move loads earlier.

Hmm, but if the load is nearby it won't matter if a 
store is in the middle, because the CPU will just execute
over it.

The real big win is if you do some computation inbetween,
but at least for typical list manipulation there isn't 
really any.

  Also at least x86 gcc normally doesn't do scheduling 
  beyond basic blocks, so any if () shuts it up.
 
 I don't think any of this is a reason not to use restrict, though.
 But... there are so many places we could add it to the kernel, and
 probably so few where it makes much difference. Maybe it should be
 able to help some critical core code, though.

Frankly I think it would be another unlikely().

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Fix stop searching test in replace_one_extent

2009-01-21 Thread Yan Zheng
Hello,

replace_one_extent search tree leaves for references to
a given extent. It stop searching if goes beyond the last
possible position. The last possible position is computed
by adding the starting offset of a found file extent to 
the full size of the extent. The code uses physical size
of the extent as the full size. This is incorrect since
btrfs supported compression. The fix is get the full size
from ram_bytes field of file extent item. Thank you,

Signed-off-by: Yan Zheng zheng@oracle.com

---
diff -urp 1/fs/btrfs/extent-tree.c 2/fs/btrfs/extent-tree.c
--- 1/fs/btrfs/extent-tree.c2009-01-07 07:16:42.762364700 +0800
+++ 2/fs/btrfs/extent-tree.c2009-01-08 15:14:47.0 +0800
@@ -,7 +4445,7 @@ static noinline int replace_one_extent(s
u64 lock_end = 0;
u64 num_bytes;
u64 ext_offset;
-   u64 first_pos;
+   u64 search_end = (u64)-1;
u32 nritems;
int nr_scaned = 0;
int extent_locked = 0;
@@ -4452,7 +4453,6 @@ static noinline int replace_one_extent(s
int ret;
 
memcpy(key, leaf_key, sizeof(key));
-   first_pos = INT_LIMIT(loff_t) - extent_key-offset;
if (ref_path-owner_objectid != BTRFS_MULTIPLE_OBJECTIDS) {
if (key.objectid  ref_path-owner_objectid ||
(key.objectid == ref_path-owner_objectid 
@@ -4501,7 +4501,7 @@ next:
if ((key.objectid  ref_path-owner_objectid) ||
(key.objectid == ref_path-owner_objectid 
 key.type  BTRFS_EXTENT_DATA_KEY) ||
-   (key.offset = first_pos + extent_key-offset))
+   key.offset = search_end)
break;
}
 
@@ -4534,8 +4534,10 @@ next:
num_bytes = btrfs_file_extent_num_bytes(leaf, fi);
ext_offset = btrfs_file_extent_offset(leaf, fi);
 
-   if (first_pos  key.offset - ext_offset)
-   first_pos = key.offset - ext_offset;
+   if (search_end == (u64)-1) {
+   search_end = key.offset - ext_offset +
+   btrfs_file_extent_ram_bytes(leaf, fi);
+   }
 
if (!extent_locked) {
lock_start = key.offset;
@@ -4724,7 +4726,7 @@ next:
}
 skip:
if (ref_path-owner_objectid != BTRFS_MULTIPLE_OBJECTIDS 
-   key.offset = first_pos + extent_key-offset)
+   key.offset = search_end)
break;
 
cond_resched();


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Fix infinite loop in btrfs_extent_post_op

2009-01-21 Thread Yan Zheng
Hello,

btrfs_extent_post_op calls finish_current_insert and
del_pending_extents. They both may enter infinite loop.
finish_current_insert enters infinite loop if it only
finds some backrefs to update. The infinite loop in
del_pending_extents is due to a variable not properly
set. Thank you,

Signed-off-by: Yan Zheng zheng@oracle.com

---
diff -urp 1/fs/btrfs/extent-tree.c 2/fs/btrfs/extent-tree.c
--- 1/fs/btrfs/extent-tree.c2009-01-07 07:16:42.762364700 +0800
+++ 2/fs/btrfs/extent-tree.c2009-01-20 13:36:07.0 +0800
@@ -2159,7 +2159,8 @@ again:
ret = find_first_extent_bit(info-extent_ins, search, start,
end, EXTENT_WRITEBACK);
if (ret) {
-   if (skipped  all  !num_inserts) {
+   if (skipped  all  !num_inserts 
+   list_empty(update_list)) {
skipped = 0;
search = 0;
continue;
@@ -2547,6 +2548,7 @@ again:
if (ret) {
if (all  skipped  !nr) {
search = 0;
+   skipped = 0;
continue;
}
mutex_unlock(info-extent_ins_mutex);


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Progs: update convert for uninitialized block groups

2009-01-21 Thread Yan Zheng
Hello,

There is a new feature 'uninitialized block groups' in ext4.
Block and inode bitmaps in uninitialized block groups are
uninitialized. This confuses the converter. The fix is call
ext2fs_new_inode for each block group at open time. It set
up uninitialized block and inode bitmaps appropriately.

This patch also contains some other minor fixes. Thank you,

Signed-off-by: Yan Zheng zheng@oracle.com

---
diff -urp btrfs-progs-unstable/convert.c btrfs-progs/convert.c
--- btrfs-progs-unstable/convert.c  2009-01-13 18:55:53.214449273 +0800
+++ btrfs-progs/convert.c   2009-01-21 16:54:37.0 +0800
@@ -54,6 +54,7 @@ static int open_ext2fs(const char *name,
 {
errcode_t ret;
ext2_filsys ext2_fs;
+   ext2_ino_t ino;
ret = ext2fs_open(name, 0, 0, 0, unix_io_manager, ext2_fs);
if (ret) {
fprintf(stderr, ext2fs_open: %s\n, error_message(ret));
@@ -71,6 +72,17 @@ static int open_ext2fs(const char *name,
error_message(ret));
goto fail;
}
+   /*
+* search each block group for a free inode. this set up
+* uninit block/inode bitmaps appropriately.
+*/
+   ino = 1;
+   while (ino = ext2_fs-super-s_inodes_count) {
+   ext2_ino_t foo;
+   ext2fs_new_inode(ext2_fs, ino, 0, NULL, foo);
+   ino += EXT2_INODES_PER_GROUP(ext2_fs-super);
+   }
+
*ret_fs = ext2_fs;
return 0;
 fail:
@@ -1929,6 +1941,7 @@ static int relocate_one_reference(struct
key.offset = 0;
}
btrfs_init_path(path);
+recow:
ret = btrfs_search_slot(trans, root, key, path, -1, 1);
if (ret  0)
goto fail;
@@ -1936,8 +1949,17 @@ static int relocate_one_reference(struct
leaf = path.nodes[0];
nritems = btrfs_header_nritems(leaf);
while (1) {
-   if (path.slots[0] = nritems)
-   break;
+   if (path.slots[0] = nritems) {
+   ret = btrfs_next_leaf(root, path);
+   if (ret  0)
+   goto fail;
+   if (ret  0)
+   break;
+   btrfs_item_key_to_cpu(path.nodes[0], key,
+ path.slots[0]);
+   btrfs_release_path(root, path);
+   goto recow;
+   }
btrfs_item_key_to_cpu(leaf, key, path.slots[0]);
if (key.objectid != objectid ||
key.type != BTRFS_EXTENT_DATA_KEY)
@@ -2048,7 +2070,7 @@ static int relocate_one_reference(struct
if (data.num_blocks  0) {
ret = record_file_blocks(trans, root, objectid, inode,
 data.first_block, data.disk_block,
-data.num_blocks, 0);
+data.num_blocks, datacsum);
if (ret)
goto fail;
}
@@ -2218,7 +2240,7 @@ next:
ret = btrfs_commit_transaction(trans, cur_root);
BUG_ON(ret);
 
-   if (num_refs  0  pass++  4)
+   if (num_refs  0  pass++  16)
goto again;
 
ret = (num_refs  0) ? -1 : 0;
@@ -2537,6 +2559,7 @@ int do_rollback(const char *devname, int
 {
int fd;
int ret;
+   int i;
struct btrfs_root *root;
struct btrfs_root *ext2_root;
struct btrfs_root *chunk_root;
@@ -2654,6 +2677,10 @@ int do_rollback(const char *devname, int
struct btrfs_file_extent_item);
if (btrfs_file_extent_type(leaf, fi) != BTRFS_FILE_EXTENT_REG)
break;
+   if (btrfs_file_extent_compression(leaf, fi) ||
+   btrfs_file_extent_encryption(leaf, fi) ||
+   btrfs_file_extent_other_encoding(leaf, fi))
+   break;
 
bytenr = btrfs_file_extent_disk_bytenr(leaf, fi);
/* skip holes and direct mapped extents */
@@ -2750,6 +2777,15 @@ next_extent:
goto fail;
}
 
+   /* zero btrfs super block mirrors */
+   memset(buf, 0, sectorsize);
+   for (i = 1 ; i  BTRFS_SUPER_MIRROR_MAX; i++) {
+   bytenr = btrfs_sb_offset(i);
+   if (bytenr = total_bytes)
+   break;
+   ret = pwrite(fd, buf, sectorsize, bytenr);
+   }
+
sb_bytenr = (u64)-1;
/* copy all relocated blocks back */
while(1) {

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: check return value for kthread_run() correctly

2009-01-21 Thread Chris Mason
On Tue, 2009-01-20 at 00:28 +0800, Qinghuang Feng wrote:
 kthread_run() returns the kthread or ERR_PTR(-ENOMEM), not NULL.
 

Thanks, got it.

-chris


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Patch] bit-radix.c: fix declarations

2009-01-21 Thread Chris Mason
On Mon, 2009-01-19 at 22:19 +0800, Américo Wang wrote:
 bit-radix.c should #include bit-radix.h, and add a missing declaration
 for find_next_bit().
 
 Signed-off-by: WANG Cong wangc...@zeuux.org
 

I fixed this a little differently, the bit-radix code wasn't being used
anymore.  It's gone now ;)

-chris


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Patch] version.sh: clean up the code

2009-01-21 Thread Chris Mason
On Mon, 2009-01-19 at 21:51 +0800, Américo Wang wrote:
 - In bash, integer comparation should use '-eq', not '=='.
 - Remove code for Mercurial, since btrfs now uses git.
 

version.sh should be removed from the mainline sources and only live in
the standalone tree.  So, I'll apply this one a bit later.

-chris


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Patch] Btrfs: use BTRFS_VOL_NAME_MAX for struct btrfs_ioctl_vol_args

2009-01-21 Thread Chris Mason
On Mon, 2009-01-19 at 21:12 +0800, Américo Wang wrote:
 On Mon, Jan 19, 2009 at 08:03:37AM -0500, Josef Bacik wrote:
 On Mon, Jan 19, 2009 at 08:57:32PM +0800, Américo Wang wrote:
  
  I found userspace tool, btrfsctl, uses BTRFS_VOL_NAME_MAX, and
  it also looks that this one is more proper.
  
  Kill BTRFS_PATH_NAME_MAX since no one will use it.
  
 
 Nope, BTRFS_PATH_NAME_MAX is specifically used for the ioctl stuff, makes the
 arguments 4k aligned, this patch is incorrect.  Thanks,
 
 Ok, then what is BTRFS_VOL_NAME_MAX? :)

Right now it is only used in the progs.  The disk format doesn't really
have a max there, it is just to keep names usable.  But, we should add a
check in the kernel ioctl side, are you interested in sending a patch
for it?

-chris


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Misleading error message in btrfsctl

2009-01-21 Thread Chris Mason
On Sun, 2009-01-18 at 09:12 -0500, Calvin Walton wrote:
 On Sun, 2009-01-18 at 14:40 +0100, Peter Klotz wrote:
  Calling btrfsctl results in this message when module btrfs.ko is not loaded:
  
  r...@asus:/root# btrfsctl -A /dev/sda
  ioctl:: Bad file descriptor
 ..
  The output with applied patch:
  
  r...@asus:/root# btrfsctl -A /dev/sda
  failed to open /dev/btrfs-control
 
 It would probably be better to use perror() instead of fprintf, in order
 to give the actual error message. I often find it useful to know why the
 device failed to open - is it missing, permission issues? I've updated
 the patch:
 

Thanks

-chris


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] ioctl to fetch csums of file extents

2009-01-21 Thread Chris Mason
On Fri, 2008-12-19 at 13:14 -0800, Yehuda Sadeh Weinraub wrote:
 This adds a new ioctl that requests for the csums of file's extent.
 The csums of contiguous extents can also be requested. The call will
 not return anything for extents that don't have csum information for
 their data, or that the csum list is not available.
 Presumably, the user calls fiemap first in order to get the list of
 extents. The max number of requested csum items is specified in the
 request and is updated according to the actual number of returned
 items.
 

The swapfile discussions convinced me we're better off taking fiemap
support now and disabling bmap entirely, at least until we can fix the
swapfile code to work properly with btrfs.

So, I'd like to take this patch, but there is line wrapping and other
whitespace problems.  Could you please send it with a mail client that
doesn't mangle? ;)

(Attaching it is fine too, at least on this list)

Thanks!
-chris


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] ioctl to fetch csums of file extents

2009-01-21 Thread Yehuda Sadeh Weinraub

 So, I'd like to take this patch, but there is line wrapping and other
 whitespace problems.  Could you please send it with a mail client that
 doesn't mangle? ;)

 (Attaching it is fine too, at least on this list)


Attached both patches.

Thanks,
Yehuda
From d7768ad789685150eb5d6fc94f1133fa6089bbeb Mon Sep 17 00:00:00 2001
From: Yehuda Sadeh yeh...@hq.newdream.net
Date: Fri, 19 Dec 2008 12:56:56 -0800
Subject: [PATCH 1/1] btrfs fiemap support


Signed-off-by: Yehuda Sadeh yeh...@hq.newdream.net
---
 fs/btrfs/extent_io.c |   84 ++
 fs/btrfs/extent_io.h |2 +
 fs/btrfs/inode.c |7 
 3 files changed, 93 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 25ce2d1..a27bb39 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2900,6 +2900,89 @@ out:
 	return sector;
 }
 
+int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
+		__u64 start, __u64 len, get_extent_t *get_extent)
+{
+	int ret;
+	u64 off = start;
+	u64 max = start + len;
+	u32 flags = 0;
+	u64 disko = 0;
+	struct extent_map *em = NULL;
+	int end = 0;
+	u64 em_start = 0, em_len = 0;
+	ret = 0;
+
+	lock_extent(BTRFS_I(inode)-io_tree, start, start + len,
+		GFP_NOFS);
+	em = get_extent(inode, NULL, 0, off, max - off, 0);
+	if (!em)
+		goto out;
+	if (IS_ERR(em)) {
+		ret = PTR_ERR(em);
+		goto out;
+	}
+	while (!end) {
+		off = em-start + em-len;
+
+		em_start = em-start;
+		em_len = em-len;
+
+		disko = 0;
+		flags = 0;
+
+		switch (em-block_start) {
+		case EXTENT_MAP_LAST_BYTE:
+			end = 1;
+			flags |= FIEMAP_EXTENT_LAST;
+			break;
+		case EXTENT_MAP_HOLE:
+			flags |= FIEMAP_EXTENT_UNWRITTEN;
+			break;
+		case EXTENT_MAP_INLINE:
+			flags |= (FIEMAP_EXTENT_DATA_INLINE |
+  FIEMAP_EXTENT_NOT_ALIGNED);
+			break;
+		case EXTENT_MAP_DELALLOC:
+			flags |= (FIEMAP_EXTENT_DELALLOC |
+  FIEMAP_EXTENT_UNKNOWN);
+			break;
+		default:
+			disko = em-block_start;
+			break;
+		}
+		if (test_bit(EXTENT_FLAG_COMPRESSED, em-flags))
+			flags |= FIEMAP_EXTENT_ENCODED;
+
+		free_extent_map(em);
+
+		if (!end) {
+			em = get_extent(inode, NULL, 0, off, max - off, 0);
+			if (!em)
+goto out;
+			if (IS_ERR(em)) {
+ret = PTR_ERR(em);
+goto out;
+			}
+		}
+		if (test_bit(EXTENT_FLAG_VACANCY, em-flags)) {
+			flags |= FIEMAP_EXTENT_LAST;
+			end = 1;
+		}
+
+		ret = fiemap_fill_next_extent(fieinfo, em_start, disko,
+	em_len, flags);
+		if (ret)
+			goto out_free;
+	}
+out_free:
+	free_extent_map(em);
+out:
+	unlock_extent(BTRFS_I(inode)-io_tree, start, start + len,
+			GFP_NOFS);
+	return ret;
+}
+
 static inline struct page *extent_buffer_page(struct extent_buffer *eb,
 	  unsigned long i)
 {
@@ -3786,3 +3869,4 @@ out:
 	return ret;
 }
 EXPORT_SYMBOL(try_release_extent_buffer);
+
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index c5b483a..e80c6d9 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -193,6 +193,8 @@ int extent_commit_write(struct extent_io_tree *tree,
 			unsigned from, unsigned to);
 sector_t extent_bmap(struct address_space *mapping, sector_t iblock,
 		get_extent_t *get_extent);
+int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
+		__u64 start, __u64 len, get_extent_t *get_extent);
 int set_range_dirty(struct extent_io_tree *tree, u64 start, u64 end);
 int set_state_private(struct extent_io_tree *tree, u64 start, u64 private);
 int get_state_private(struct extent_io_tree *tree, u64 start, u64 *private);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 068bad4..ff182be 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4171,6 +4171,12 @@ static sector_t btrfs_bmap(struct address_space *mapping, sector_t iblock)
 	return extent_bmap(mapping, iblock, btrfs_get_extent);
 }
 
+static int btrfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
+		__u64 start, __u64 len)
+{
+	return extent_fiemap(inode, fieinfo, start, len, btrfs_get_extent);
+}
+
 int btrfs_readpage(struct file *file, struct page *page)
 {
 	struct extent_io_tree *tree;
@@ -5023,6 +5029,7 @@ static struct inode_operations btrfs_file_inode_operations = {
 	.removexattr	= btrfs_removexattr,
 	.permission	= btrfs_permission,
 	.fallocate	= btrfs_fallocate,
+	.fiemap		= btrfs_fiemap,
 };
 static struct inode_operations btrfs_special_inode_operations = {
 	.getattr	= btrfs_getattr,
-- 
1.5.6.5

From bca297ff9a136b740054d024a527d2a2d52ff945 Mon Sep 17 00:00:00 2001
From: Yehuda Sadeh yeh...@hq.newdream.net
Date: Fri, 19 Dec 2008 12:57:33 -0800
Subject: [PATCH 1/1] ioctl to fetch csums of file extents


Signed-off-by: Yehuda Sadeh yeh...@hq.newdream.net
---
 fs/btrfs/ioctl.c |  101 ++
 fs/btrfs/ioctl.h |9 +
 2 files changed, 110 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index ab429fe..18a57c1 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@