[PATCH] btrfs/ctree.h: trivial fixup the comment for struct btrfs_dev_extent with the right fields' names

2012-03-13 Thread Wang Sheng-Hui

Signed-off-by: Wang Sheng-Hui 
---
 fs/btrfs/ctree.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 80b6486..a515e4e 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -597,9 +597,9 @@ struct btrfs_extent_ref_v0 {
 } __attribute__ ((__packed__));
 
 
-/* dev extents record free space on individual devices.  The owner
+/* dev extents record free space on individual devices.  The chunk_tree
  * field points back to the chunk allocation mapping tree that allocated
- * the extent.  The chunk tree uuid field is a way to double check the owner
+ * the extent.  The chunk_tree_uuid field is a way to double check the owner
  */
 struct btrfs_dev_extent {
__le64 chunk_tree;
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: kernel BUG at fs/btrfs/transaction.c:1337!

2012-03-13 Thread Anand Jain



> $sudo btrfsck /home/not-a-user/broken-btrfs.img
> bad block 29933568
> bad block 44224512
> parent transid verify failed on 54566912 wanted 3532 found 3475

These aren't related to the original problem in the subject.
But they could be an after-effect of the panic, if some IOs didn't
make it to the disk, depending on how the system was halted.
The same BUG has been reported on this mailing list before,
  'BUG at fs/btrfs/transaction.c:1337!'
however the stack traces in those reports are different.

Here it has something to do with the type of IO going to the underlying
SSD.  SSDs generally give better performance for random access, however
their performance isn't as good when it comes to sequential access.
So it may provide some clues if we knew the sizes of the uploads
and downloads mentioned in the statement below...

> It seems not.  The oops occurs when I'm surfing the Internet using
> chromium and downloading and uploading using transmission.  Suddenly it
> switched to console and printed these lines.  My root filesystem / is
> on a 16GB SSD whose device file is /dev/sdb1.  The mount option is
> ssd,compress.

Further, you can find the queue depth with
# cat /sys/block/sd< >/queue/nr_requests

It may be that this queue was 90% full, but at the moment that is of
little help unless we know more from the logs, and probably from
the dump.

If you want to try to reproduce this, I would recommend:
 - create sequential IO with a file size equal to that of the files
   which were being downloaded / uploaded.
 - do not change any of the parameters from your original boot disk.

 If this triggers the problem, then after the reboot enable the
 coredump, recreate the problem, and collect the dump.  This may help.
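
A minimal sketch of the sequential-IO step above, assuming plain POSIX
open()/write()/fsync().  The helper name write_sequential, the chunk
size and the fill pattern are made up for illustration and do not come
from the original report.  With the compress mount option a constant
fill pattern compresses very well, so the amount of data that actually
reaches the SSD may be far smaller than the file size.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define SEQ_CHUNK 65536	/* bytes per write(); an arbitrary choice */

/*
 * Hypothetical helper: write `total` bytes to `path` front to back,
 * then fsync() so the data is pushed out to the device.
 * Returns 0 on success, -1 on any error.
 */
static int write_sequential(const char *path, uint64_t total)
{
	static char buf[SEQ_CHUNK];
	uint64_t done = 0;
	int fd;

	assert(path != NULL);
	memset(buf, 0xa5, sizeof(buf));	/* arbitrary non-zero pattern */

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	while (done < total) {
		size_t want = sizeof(buf);
		ssize_t n;

		if (total - done < want)
			want = (size_t)(total - done);
		n = write(fd, buf, want);
		if (n <= 0) {
			close(fd);
			return -1;
		}
		done += (uint64_t)n;
	}

	if (fsync(fd) != 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}
```

Calling it with the size of one of the transferred files, on the same
filesystem and with the same mount options, would approximate the
workload described in the report.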


Thanks, Anand



Re: getdents - ext4 vs btrfs performance

2012-03-13 Thread Ted Ts'o
On Wed, Mar 14, 2012 at 10:48:17AM +0800, Yongqiang Yang wrote:
> What if we use inode number as the hash value?  Does it work?

The whole point of using the tree structure is to accelerate filename
-> inode number lookups.  So the namei lookup doesn't have the inode
number; the whole point is to use the filename to look up the inode
number.  So we can't use the inode number as the hash value since
that's what we are trying to find out.

We could do this if we have two b-trees, one indexed by filename and
one indexed by inode number, which is what JFS (and I believe btrfs)
does.

- Ted


Re: getdents - ext4 vs btrfs performance

2012-03-13 Thread Yongqiang Yang
What if we use inode number as the hash value?  Does it work?

Yongqiang.


Re: getdents - ext4 vs btrfs performance

2012-03-13 Thread Ted Ts'o
On Tue, Mar 13, 2012 at 04:22:52PM -0400, Phillip Susi wrote:
> 
> I think a format change would be preferable to runtime sorting.

Are you volunteering to spearhead the design and coding of such a
thing?  Run-time sorting is backwards compatible, and a heck of a lot
easier to code and test...

The reality is we'd probably want to implement run-time sorting
*anyway*, for the vast majority of people who don't want to convert to
a new incompatible file system format.  (Even if you can do the
conversion using e2fsck --- which is possible, but it would be even
more code to write --- system administrators tend to be very
conservative about such things, since they might need to boot an older
kernel, or use a rescue CD that doesn't have an up-to-date kernel or
file system utilities, etc.)

> So the index nodes contain the hash ranges for the leaf block, but
> the leaf block only contains the regular directory entries, not a
> hash for each name?  That would mean that adding or removing names
> would require moving around the regular directory entries wouldn't
> it?

They aren't sorted in the leaf block, so we only need to move around
regular directory entries when we do a node split (and at the moment
we don't support shrinking directories), so we don't have to worry
about the reverse case.

> I would think that hash collisions are rare enough that reading a
> directory block you end up not needing once in a blue moon would be
> chalked up under "who cares".  So just stick with hash, offset pairs
> to map the hash to the normal directory entry.

With a 64-bit hash, and if we were actually going to implement this as
a new incompatible feature, you're probably right in terms of
accepting the extra directory block search.

We would still have to implement the case where hash collisions *do*
exist, though, and make sure the right thing happens in that case.
Even if the chance of that happening is 1 in 2**32, with enough
deployed systems (i.e., every Android handset, etc.) it's going to
happen in real life.
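
To put rough numbers on the "enough deployed systems" argument, here
is a small sketch that computes the expected number of colliding name
pairs in one directory.  It assumes an idealised, uniformly
distributed hash, which a real filename hash only approximates; the
function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Expected number of pairs of names that share a hash value in a
 * directory holding `names` entries, for a uniform `bits`-bit hash:
 *
 *     E = names * (names - 1) / 2 / 2^bits
 *
 * For small E this is also a good estimate of the probability of
 * seeing at least one collision in that directory.
 */
static double expected_colliding_pairs(uint64_t names, unsigned bits)
{
	double space;

	assert(bits >= 1 && bits <= 64);
	if (names < 2)
		return 0.0;

	/* 2^bits as a double; 1 << 64 would overflow, so special-case it */
	space = (bits == 64) ? 18446744073709551616.0
			     : (double)(UINT64_C(1) << bits);

	return (double)names * (double)(names - 1) / 2.0 / space;
}
```

Multiplying the per-directory value by the number of such directories
across all deployed systems gives the expected number of collisions in
the field.  That product can stay far from negligible even when the
per-directory value is tiny.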

- Ted






[PATCH] Btrfs-progs: make print-tree.c aware of free space cache

2012-03-13 Thread Ilya Dryomov
This adds proper formatting for free space and inode cache items in
btrfs-debug-tree output.

Signed-off-by: Ilya Dryomov 
---
 ctree.h  |   29 +
 print-tree.c |   52 +++-
 2 files changed, 72 insertions(+), 9 deletions(-)

diff --git a/ctree.h b/ctree.h
index 141ec59..147c3cb 100644
--- a/ctree.h
+++ b/ctree.h
@@ -256,6 +256,13 @@ struct btrfs_chunk {
/* additional stripes go here */
 } __attribute__ ((__packed__));
 
+struct btrfs_free_space_header {
+   struct btrfs_disk_key location;
+   __le64 generation;
+   __le64 num_entries;
+   __le64 num_bitmaps;
+} __attribute__ ((__packed__));
+
 static inline unsigned long btrfs_chunk_item_size(int num_stripes)
 {
BUG_ON(num_stripes == 0);
@@ -1432,6 +1439,28 @@ static inline void btrfs_set_dir_item_key(struct 
extent_buffer *eb,
write_eb_member(eb, item, struct btrfs_dir_item, location, key);
 }
 
+/* struct btrfs_free_space_header */
+BTRFS_SETGET_FUNCS(free_space_entries, struct btrfs_free_space_header,
+  num_entries, 64);
+BTRFS_SETGET_FUNCS(free_space_bitmaps, struct btrfs_free_space_header,
+  num_bitmaps, 64);
+BTRFS_SETGET_FUNCS(free_space_generation, struct btrfs_free_space_header,
+  generation, 64);
+
+static inline void btrfs_free_space_key(struct extent_buffer *eb,
+   struct btrfs_free_space_header *h,
+   struct btrfs_disk_key *key)
+{
+   read_eb_member(eb, h, struct btrfs_free_space_header, location, key);
+}
+
+static inline void btrfs_set_free_space_key(struct extent_buffer *eb,
+   struct btrfs_free_space_header *h,
+   struct btrfs_disk_key *key)
+{
+   write_eb_member(eb, h, struct btrfs_free_space_header, location, key);
+}
+
 /* struct btrfs_disk_key */
 BTRFS_SETGET_STACK_FUNCS(disk_key_objectid, struct btrfs_disk_key,
 objectid, 64);
diff --git a/print-tree.c b/print-tree.c
index fc134c0..face47a 100644
--- a/print-tree.c
+++ b/print-tree.c
@@ -94,6 +94,7 @@ static void print_chunk(struct extent_buffer *eb, struct 
btrfs_chunk *chunk)
  (unsigned long long)btrfs_stripe_offset_nr(eb, chunk, i));
}
 }
+
 static void print_dev_item(struct extent_buffer *eb,
   struct btrfs_dev_item *dev_item)
 {
@@ -276,8 +277,29 @@ static void print_root_ref(struct extent_buffer *leaf, int 
slot, char *tag)
   namelen, namebuf);
 }
 
-static void print_key_type(u8 type)
+static void print_free_space_header(struct extent_buffer *leaf, int slot)
 {
+   struct btrfs_free_space_header *header;
+   struct btrfs_disk_key location;
+
+   header = btrfs_item_ptr(leaf, slot, struct btrfs_free_space_header);
+   btrfs_free_space_key(leaf, header, &location);
+   printf("\t\tlocation ");
+   btrfs_print_key(&location);
+   printf("\n");
+   printf("\t\tcache generation %llu entries %llu bitmaps %llu\n",
+  (unsigned long long)btrfs_free_space_generation(leaf, header),
+  (unsigned long long)btrfs_free_space_entries(leaf, header),
+  (unsigned long long)btrfs_free_space_bitmaps(leaf, header));
+}
+
+static void print_key_type(u64 objectid, u8 type)
+{
+   if (type == 0 && objectid == BTRFS_FREE_SPACE_OBJECTID) {
+   printf("UNTYPED");
+   return;
+   }
+
switch (type) {
case BTRFS_INODE_ITEM_KEY:
printf("INODE_ITEM");
@@ -362,10 +384,10 @@ static void print_key_type(u8 type)
};
 }
 
-static void print_objectid(unsigned long long objectid, u8 type)
+static void print_objectid(u64 objectid, u8 type)
 {
if (type == BTRFS_DEV_EXTENT_KEY) {
-   printf("%llu", objectid); /* device id */
+   printf("%llu", (unsigned long long)objectid); /* device id */
return;
}
 
@@ -415,6 +437,12 @@ static void print_objectid(unsigned long long objectid, u8 
type)
case BTRFS_EXTENT_CSUM_OBJECTID:
printf("EXTENT_CSUM");
break;
+   case BTRFS_FREE_SPACE_OBJECTID:
+   printf("FREE_SPACE");
+   break;
+   case BTRFS_FREE_INO_OBJECTID:
+   printf("FREE_INO");
+   break;
case BTRFS_MULTIPLE_OBJECTIDS:
printf("MULTIPLE");
break;
@@ -425,19 +453,19 @@ static void print_objectid(unsigned long long objectid, 
u8 type)
}
/* fall-thru */
default:
-   printf("%llu", objectid);
+   printf("%llu", (unsigned long long)objectid);
}
 }
 
 void btrfs_print_key(struct btrfs_disk_key *disk_key)
 {
-   u8 type;
+   u64 objectid = btrfs_disk_key_objectid(disk_key);
+   u8 type = btrfs_disk_k

[PATCH] Btrfs-progs: allow dup for data chunks in mixed mode

2012-03-13 Thread Ilya Dryomov
Before commit a46e7ff2 was merged it was possible to create dup for
data+metadata chunks (mixed mode) by giving -m raid1 -d raid1 -M to
mkfs.  a46e7ff2 purposefully disabled behind the scenes profile
upgrading/downgrading, so give users a chance to pick dup explicitly and
bail if dup for data is requested in normal mode.

Signed-off-by: Ilya Dryomov 
---
 mkfs.c |   16 
 1 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/mkfs.c b/mkfs.c
index d3f45bd..6d3ef29 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -258,17 +258,23 @@ static int create_raid_groups(struct btrfs_trans_handle 
*trans,
 
if (metadata_profile & ~allowed) {
fprintf(stderr, "unable to create FS with metadata "
-   "profile %llu (%llu devices)\n", metadata_profile,
+   "profile %llu (have %llu devices)\n", metadata_profile,
num_devices);
exit(1);
}
if (data_profile & ~allowed) {
fprintf(stderr, "unable to create FS with data "
-   "profile %llu (%llu devices)\n", data_profile,
+   "profile %llu (have %llu devices)\n", data_profile,
num_devices);
exit(1);
}
 
+   /* allow dup'ed data chunks only in mixed mode */
+   if (!mixed && (data_profile & BTRFS_BLOCK_GROUP_DUP)) {
+   fprintf(stderr, "dup for data is allowed only in mixed mode\n");
+   exit(1);
+   }
+
if (allowed & metadata_profile) {
u64 meta_flags = BTRFS_BLOCK_GROUP_METADATA;
 
@@ -329,7 +335,7 @@ static void print_usage(void)
fprintf(stderr, "options:\n");
fprintf(stderr, "\t -A --alloc-start the offset to start the FS\n");
fprintf(stderr, "\t -b --byte-count total number of bytes in the FS\n");
-   fprintf(stderr, "\t -d --data data profile, raid0, raid1, raid10 or 
single\n");
+   fprintf(stderr, "\t -d --data data profile, raid0, raid1, raid10, dup 
or single\n");
fprintf(stderr, "\t -l --leafsize size of btree leaves\n");
fprintf(stderr, "\t -L --label set a label\n");
fprintf(stderr, "\t -m --metadata metadata profile, values like data 
profile\n");
@@ -355,10 +361,12 @@ static u64 parse_profile(char *s)
return BTRFS_BLOCK_GROUP_RAID1;
} else if (strcmp(s, "raid10") == 0) {
return BTRFS_BLOCK_GROUP_RAID10;
+   } else if (strcmp(s, "dup") == 0) {
+   return BTRFS_BLOCK_GROUP_DUP;
} else if (strcmp(s, "single") == 0) {
return 0;
} else {
-   fprintf(stderr, "Unknown option %s\n", s);
+   fprintf(stderr, "Unknown profile %s\n", s);
print_usage();
}
/* not reached */
-- 
1.7.9.1



Re: getdents - ext4 vs btrfs performance

2012-03-13 Thread Phillip Susi

On 3/13/2012 3:53 PM, Ted Ts'o wrote:

> Because that would be a format change.


I think a format change would be preferable to runtime sorting.


> What we have today is not a hash table; it's a hashed tree, where we
> use a fixed-length key for the tree based on the hash of the file
> name.  Currently the leaf nodes of the tree are the directory blocks
> themselves; that is, the lowest level of the index blocks tells you to
> look at directory block N, where that directory contains the directory
> indexes for those file names which are in a particular range (say,
> between 0x2325777A and 0x2325801).


So the index nodes contain the hash ranges for the leaf block, but the 
leaf block only contains the regular directory entries, not a hash for 
each name?  That would mean that adding or removing names would require 
moving around the regular directory entries wouldn't it?



> If we aren't going to change the ordering of the directory entries,
> that means we would need to change things so the leaf nodes contain
> the actual directory file names themselves, so that we know whether or
> not we've hit the correct entry or not before we go to read in a
> specific directory block (otherwise, you'd have problems dealing with
> hash collisions).  But in that case, instead of storing the pointer to
> the directory entry, since the bulk of the size of a directory entry
> is the filename itself, you might as well store the inode number in
> the tree itself, and be done with it.


I would think that hash collisions are rare enough that reading a 
directory block you end up not needing once in a blue moon would be 
chalked up under "who cares".  So just stick with hash, offset pairs to 
map the hash to the normal directory entry.




Re: getdents - ext4 vs btrfs performance

2012-03-13 Thread Ted Ts'o
On Tue, Mar 13, 2012 at 03:05:59PM -0400, Phillip Susi wrote:
> Why not just separate the hash table from the conventional, mostly
> in inode order directory entries?  For instance, the first 200k of
> the directory could be the normal entries that would tend to be in
> inode order ( and e2fsck -D would reorder ), and the last 56k of the
> directory would contain the hash table.  Then readdir() just walks
> the directory like normal, and namei() can check the hash table.

Because that would be a format change.

What we have today is not a hash table; it's a hashed tree, where we
use a fixed-length key for the tree based on the hash of the file
name.  Currently the leaf nodes of the tree are the directory blocks
themselves; that is, the lowest level of the index blocks tells you to
look at directory block N, where that directory contains the directory
indexes for those file names which are in a particular range (say,
between 0x2325777A and 0x2325801).
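
The lookup described above can be sketched as a search over hash
ranges.  This is a deliberately simplified, single-level illustration:
struct index_entry and lookup_block are hypothetical names, and the
layout is not the real on-disk ext3/ext4 index format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One index entry: names whose hash is >= `hash` (and below the next
 * entry's hash) live in directory block `block`.
 */
struct index_entry {
	uint32_t hash;
	uint32_t block;
};

/*
 * Binary search for the last entry whose hash is <= `target`.
 * `idx` must be sorted by hash and idx[0].hash must be 0, so that
 * every possible hash value falls into exactly one range.
 */
static uint32_t lookup_block(const struct index_entry *idx, size_t count,
			     uint32_t target)
{
	size_t lo = 0, hi = count;	/* the answer lies in [lo, hi) */

	assert(idx != NULL && count > 0 && idx[0].hash == 0);

	while (hi - lo > 1) {
		size_t mid = lo + (hi - lo) / 2;

		if (idx[mid].hash <= target)
			lo = mid;
		else
			hi = mid;
	}
	return idx[lo].block;
}
```

The index only narrows the search to one directory block.  The block
itself still has to be scanned for the name, which is why two names
with the same hash need special handling.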

If we aren't going to change the ordering of the directory entries,
that means we would need to change things so the leaf nodes contain
the actual directory file names themselves, so that we know whether or
not we've hit the correct entry or not before we go to read in a
specific directory block (otherwise, you'd have problems dealing with
hash collisions).  But in that case, instead of storing the pointer to
the directory entry, since the bulk of the size of a directory entry
is the filename itself, you might as well store the inode number in
the tree itself, and be done with it.

And in that case, since you are replicating the directory information
twice over, and it's going to be an incompatible format change anyway,
you might as well just store the second copy of the directory entries
in *another* btree, except this one is indexed by inode number, and
then you use the second tree for readdir(), and you make the
telldir/seekdir cookie be the inode number.  That way readdir() will
always return results in a stat() optimized order, even in the face of
directory fragmentation and file system aging.

**This** is why telldir/seekdir is so evil; in order to do things
correctly in terms of the semantics of readdir() in the face of
telldir/seekdir and file names getting inserted and deleted into the
btree, and the possibility for tree splits/rotations, etc., and the
fact that the cookie is only 32 or 64 bits, you essentially either
need to just do something stupid and have a linear directory a la ext2
and V7 Unix, or you need to store the directory information twice over
in redundant b-trees.

Or, userspace needs to do the gosh-darned sorting by inode, or we do
some hack such as only sorting the inodes using non-swapping kernel
memory if the directory is smaller than some arbitrary size, such as
256k or 512k.
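
For the userspace option, a minimal sketch of "sort by inode before
touching the files" might look like the following.  It assumes POSIX
opendir()/readdir() and a d_ino field in struct dirent; the struct and
function names are made up for illustration, and readdir() errors are
treated the same as end-of-directory for brevity.

```c
#include <assert.h>
#include <dirent.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct name_ino {
	uint64_t ino;
	char name[256];
};

/* qsort() comparator: ascending inode number */
static int cmp_ino(const void *a, const void *b)
{
	const struct name_ino *x = a, *y = b;

	return (x->ino > y->ino) - (x->ino < y->ino);
}

/*
 * Read every entry of `path` and sort the result by inode number.
 * On success stores a malloc()ed array in *out (caller frees) and
 * returns the number of entries; returns -1 on error.
 */
static long read_dir_sorted(const char *path, struct name_ino **out)
{
	struct name_ino *v = NULL, *tmp;
	size_t n = 0, cap = 0;
	struct dirent *de;
	DIR *d;

	assert(path != NULL && out != NULL);
	d = opendir(path);
	if (!d)
		return -1;

	while ((de = readdir(d)) != NULL) {
		if (n == cap) {
			cap = cap ? cap * 2 : 64;
			tmp = realloc(v, cap * sizeof(*v));
			if (!tmp) {
				free(v);
				closedir(d);
				return -1;
			}
			v = tmp;
		}
		v[n].ino = (uint64_t)de->d_ino;
		snprintf(v[n].name, sizeof(v[n].name), "%s", de->d_name);
		n++;
	}
	closedir(d);

	if (n > 0)
		qsort(v, n, sizeof(*v), cmp_ino);
	*out = v;
	return (long)n;
}
```

A caller would then stat() or open() the names in array order, so that
inode reads are issued in roughly increasing on-disk order rather than
in hash order.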

- Ted


Re: getdents - ext4 vs btrfs performance

2012-03-13 Thread Phillip Susi

On 3/9/2012 11:48 PM, Ted Ts'o wrote:

> I suspect the best optimization for now is probably something like
> this:
>
> 1) Since the vast majority of directories are less than (say) 256k
> (this would be a tunable value), for directories which are less than
> this threshold size, the entire directory is sucked in after the first
> readdir() after an opendir() or rewinddir().  The directory contents
> are then sorted by inode number (or loaded into an rbtree ordered by
> inode number), and returned back to userspace in the inode order via
> readdir().  The directory contents will be released on a closedir() or
> rewinddir().


Why not just separate the hash table from the conventional, mostly in 
inode order directory entries?  For instance, the first 200k of the 
directory could be the normal entries that would tend to be in inode 
order ( and e2fsck -D would reorder ), and the last 56k of the directory 
would contain the hash table.  Then readdir() just walks the directory 
like normal, and namei() can check the hash table.




Re: NOCOW + compress-force = bug

2012-03-13 Thread Jeff Mahoney
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/13/2012 02:11 PM, Jeff Mahoney wrote:
> On 02/16/2012 12:58 PM, Chris Mason wrote:
>> On Thu, Feb 16, 2012 at 07:55:15PM +0600, Roman Mamedov wrote:
>>> Hello,
>>> 
>>> Please be aware that there seems to be a possible problem with 
>>> using NOCOW flag on files situated on a filesystem mounted
>>> with compress-force(=lzo, in my case).
>>> 
>>> Since experimenting with NOCOW, I started regularly hitting
>>> this BUG at extent-tree.c:5813
>>> 
>>> 5813 BUG_ON(!(flags & 
>>> BTRFS_BLOCK_FLAG_FULL_BACKREF));
>>> 
>>> I was unable to make netconsole work over a bridged interface,
>>> so can only post screenshots of this OOPS: 
>>> http://romanrm.ru/pics/2012/2012-02-16-btrfs-bug-1.jpg 
>>> http://romanrm.ru/pics/2012/2012-02-16-btrfs-bug-2.jpg
>>> 
>>> This happened four times already, and always on snapshot
>>> creation (but not every case). I have hourly snapshots in
>>> crontab, and only one case out of about ten fails with this
>>> problem. Did not try to deliberately reproduce it yet by
>>> manually making snapshots very often, etc.
> 
>> Interesting, NOCOW and compression don't really mix.  We always
>> cow for compression.  I'll try to reproduce it.
> 
> I hit this one today without nocow or compression. The only thing 
> non-default was that I mounted with -ossd. The backing store was a
> 1GB non-sparse loopback file on tmpfs.
> 
> I had kdump enabled and with 16GB, I wasn't waiting around for the 
> dump to complete. If it happens again, I'll have a full stack
> trace. My test case was filling the disk while making snapshots.

Well that didn't take long.

[  626.100684] ------------[ cut here ]------------
[  626.104053] kernel BUG at
/usr/src/packages/BUILD/kernel-default-3.0.23/linux-3.0/fs/btrfs/extent-tree.c:6091!
[  626.104053] invalid opcode:  [#1] SMP
[  626.104053] CPU 8
[  626.104053] Modules linked in: btrfs zlib_deflate crc32c libcrc32c
autofs4 edd nfs lockd fscache auth_rpcgss nfs_acl sunrpc ipv6 ipv6_lib
af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave
powernow_k8 mperf microcode fuse loop dm_mod igb i2c_piix4 i2c_core
k10temp sg dca rtc_cmos pcspkr button serio_raw ext3 jbd mbcache
ohci_hcd ehci_hcd usbcore sd_mod usb_common crc_t10dif processor
thermal_sys hwmon ata_generic ahci libahci pata_atiixp libata scsi_mod
[  626.104053] Supported: Yes
[  626.104053]
[  626.104053] Pid: 14214, comm: btrfs Not tainted
3.0.23-0.0.0.0.4dd40bc-default #1 HP ProLiant DL165 G7
[  626.104053] RIP: 0010:[]  []
alloc_reserved_tree_block+0x1e3/0x1f0 [btrfs]
[  626.104053] RSP: 0018:88020766fae8  EFLAGS: 00010246
[  626.104053] RAX: 880434b76000 RBX: 0e2c RCX:

[  626.104053] RDX: 8800 RSI:  RDI:
880404690678
[  626.104053] RBP: 880436e7f670 R08: 88020766faa8 R09:
1000
[  626.104053] R10: 0e2b R11: b000 R12:
880404690678
[  626.104053] R13: 0d9d R14: 880432ac15a0 R15:
880437a01c80
[  626.104053] FS:  7fabeaf13740() GS:88043fc0()
knlGS:
[  626.104053] CS:  0010 DS:  ES:  CR0: 8005003b
[  626.104053] CR2: 7fabea61bb50 CR3: 000402dc1000 CR4:
06e0
[  626.104053] DR0:  DR1:  DR2:

[  626.104053] DR3:  DR6: 0ff0 DR7:
0400
[  626.104053] Process btrfs (pid: 14214, threadinfo 88020766e000,
task 8802389260c0)
[  626.104053] Stack:
[  626.104053]  88020766fb28  
04d3e000
[  626.104053]  8800b220c000 8804357bd000 0033
88043977c540
[  626.104053]  880437a01c80 880432ac15a0 8800b220c000
0001
[  626.104053] Call Trace:
[  626.104053]  [] run_delayed_tree_ref+0xfc/0x150
[btrfs]
[  626.104053]  [] run_clustered_refs+0xce/0x310 [btrfs]
[  626.104053]  []
btrfs_run_delayed_refs+0x139/0x2e0 [btrfs]
[  626.104053]  []
btrfs_commit_transaction+0x433/0x8a0 [btrfs]
[  626.104053]  [] create_snapshot+0x1a1/0x1c0 [btrfs]
[  626.104053]  [] btrfs_mksubvol+0x150/0x1e0 [btrfs]
[  626.104053]  []
btrfs_ioctl_snap_create_transid+0x16b/0x1a0 [btrfs]
[  626.104053]  []
btrfs_ioctl_snap_create_v2+0x108/0x110 [btrfs]
[  626.104053]  [] btrfs_ioctl+0x697/0x7d0 [btrfs]
[  626.104053]  [] do_vfs_ioctl+0x8b/0x3b0
[  626.104053]  [] sys_ioctl+0xa1/0xb0
[  626.104053]  [] system_call_fastpath+0x16/0x1b
[  626.104053] DWARF2 unwinder stuck at system_call_fastpath+0x16/0x1b
[  626.104053]
[  626.104053] Leftover inexact backtrace:
[  626.104053]
[  626.104053] Code: 4c 89 e7 e8 50 76 02 00 e9 6b ff ff ff 48 8b 44
24 78 48 c7 c7 30 94 43 a0 48 8b 50 09 48 8b 30 31 c0 e8 47 2b 07 e1
0f 0b eb fe <0f> 0b eb fe 66 0f 1f 84 00 00 00 00 00 41 57 41 56 41 55
41 54
[  626.104053] RIP  []
alloc_reserved_tree_block+0x1e3/0x1f0 [btrfs]
[  626.104053]  RSP 


- -- 
Jeff Mahoney
SUS

Re: kernel BUG at fs/btrfs/transaction.c:1337!

2012-03-13 Thread qasdfgtyuiop
This information might be useful:
$sudo btrfsck /home/not-a-user/broken-btrfs.img
bad block 29933568
bad block 44224512
parent transid verify failed on 54566912 wanted 3532 found 3475
parent transid verify failed on 54566912 wanted 3532 found 3475
Extent back ref already exists for 40439808 parent 0 root 5
Extent back ref already exists for 117432320 parent 0 root 5
Extent back ref already exists for 117600256 parent 0 root 5
Extent back ref already exists for 49078272 parent 0 root 5
Extent back ref already exists for 164507648 parent 0 root 5
Extent back ref already exists for 166637568 parent 0 root 5
Extent back ref already exists for 61587456 parent 0 root 5
Extent back ref already exists for 117633024 parent 0 root 5
Extent back ref already exists for 250806272 parent 0 root 5
Extent back ref already exists for 35176448 parent 0 root 5
Extent back ref already exists for 117395456 parent 0 root 5
Extent back ref already exists for 250949632 parent 0 root 5
Extent back ref already exists for 117628928 parent 0 root 5
Extent back ref already exists for 280928256 parent 0 root 5
Extent back ref already exists for 34058240 parent 0 root 5
Extent back ref already exists for 63078400 parent 0 root 5
Extent back ref already exists for 117325824 parent 0 root 5
Extent back ref already exists for 319053824 parent 0 root 5
Extent back ref already exists for 117391360 parent 0 root 5
Extent back ref already exists for 5888 parent 0 root 5
Extent back ref already exists for 139710464 parent 0 root 5
Extent back ref already exists for 117592064 parent 0 root 5
Extent back ref already exists for 34013184 parent 0 root 5
Extent back ref already exists for 294502400 parent 0 root 5
Extent back ref already exists for 70004736 parent 0 root 5
Extent back ref already exists for 48816128 parent 0 root 5
Extent back ref already exists for 53764096 parent 0 root 5
Extent back ref already exists for 5456105472 parent 0 root 5
Extent back ref already exists for 57430016 parent 0 root 5
Extent back ref already exists for 62701568 parent 0 root 5
Extent back ref already exists for 34869248 parent 0 root 5
Extent back ref already exists for 34877440 parent 0 root 5
Extent back ref already exists for 5456109568 parent 0 root 5
Extent back ref already exists for 5456113664 parent 0 root 5
Extent back ref already exists for 35168256 parent 0 root 5
Extent back ref already exists for 5456117760 parent 0 root 5
Extent back ref already exists for 51052544 parent 0 root 5
Extent back ref already exists for 33845248 parent 0 root 5
Extent back ref already exists for 33849344 parent 0 root 5
Extent back ref already exists for 62627840 parent 0 root 5
Extent back ref already exists for 5456121856 parent 0 root 5
Extent back ref already exists for 30162944 parent 0 root 5
Extent back ref already exists for 307396608 parent 0 root 5
Extent back ref already exists for 5470208000 parent 0 root 5
Extent back ref already exists for 30547968 parent 0 root 5
Extent back ref already exists for 47996928 parent 0 root 5
Extent back ref already exists for 5470183424 parent 0 root 5
Extent back ref already exists for 35127296 parent 0 root 5
Extent back ref already exists for 59133952 parent 0 root 5
Extent back ref already exists for 33841152 parent 0 root 5
Extent back ref already exists for 5432545280 parent 0 root 5
Extent back ref already exists for 5456003072 parent 0 root 5
Extent back ref already exists for 38998016 parent 0 root 5
Extent back ref already exists for 30568448 parent 0 root 5
Extent back ref already exists for 33296384 parent 0 root 5
Extent back ref already exists for 5470212096 parent 0 root 5
Extent back ref already exists for 5470216192 parent 0 root 5
Extent back ref already exists for 41598976 parent 0 root 5
Extent back ref already exists for 30572544 parent 0 root 5
Extent back ref already exists for 338010112 parent 0 root 5
Extent back ref already exists for 176168960 parent 0 root 5
Extent back ref already exists for 290648064 parent 0 root 5
Extent back ref already exists for 42500096 parent 0 root 5
Extent back ref already exists for 59551744 parent 0 root 5
Extent back ref already exists for 59772928 parent 0 root 5
Extent back ref already exists for 65953792 parent 0 root 5
Extent back ref already exists for 117329920 parent 0 root 5
Extent back ref already exists for 164323328 parent 0 root 5
Extent back ref already exists for 44875776 parent 0 root 5
Extent back ref already exists for 30584832 parent 0 root 5
Extent back ref already exists for 332926976 parent 0 root 5
Extent back ref already exists for 333000704 parent 0 root 5
parent transid verify failed on 51126272 wanted 3532 found 3413
parent transid verify failed on 51126272 wanted 3532 found 3413
parent transid verify failed on 51126272 wanted 3532 found 3413
parent transid verify failed on 51126272 wanted 3532 found 3413
Ignoring transid failure
leaf 51126272 items 42 free space 803 generation 3413 owner 2
fs uuid 51221b80-6228-4621-bc28-66ab07bd8

Re: NOCOW + compress-force = bug

2012-03-13 Thread Jeff Mahoney
On 02/16/2012 12:58 PM, Chris Mason wrote:
> On Thu, Feb 16, 2012 at 07:55:15PM +0600, Roman Mamedov wrote:
>> Hello,
>> 
>> Please be aware that there seems to be a possible problem with
>> using NOCOW flag on files situated on a filesystem mounted with
>> compress-force(=lzo, in my case).
>> 
>> Since experimenting with NOCOW, I started regularly hitting this
>> BUG at extent-tree.c:5813
>> 
>> 5813 BUG_ON(!(flags &
>> BTRFS_BLOCK_FLAG_FULL_BACKREF));
>> 
>> I was unable to make netconsole work over a bridged interface, so
>> can only post screenshots of this OOPS: 
>> http://romanrm.ru/pics/2012/2012-02-16-btrfs-bug-1.jpg 
>> http://romanrm.ru/pics/2012/2012-02-16-btrfs-bug-2.jpg
>> 
>> This happened four times already, and always on snapshot creation
>> (but not every case). I have hourly snapshots in crontab, and
>> only one case out of about ten fails with this problem. Did not
>> try to deliberately reproduce it yet by manually making snapshots
>> very often, etc.
> 
> Interesting, NOCOW and compression don't really mix.  We always cow
> for compression.  I'll try to reproduce it.

I hit this one today without nocow or compression. The only thing
non-default was that I mounted with -ossd. The backing store was a 1GB
non-sparse loopback file on tmpfs.

I had kdump enabled and with 16GB, I wasn't waiting around for the
dump to complete. If it happens again, I'll have a full stack trace.
My test case was filling the disk while making snapshots.

-Jeff


-- 
Jeff Mahoney
SUSE Labs


Re: kernel BUG at fs/btrfs/transaction.c:1337!

2012-03-13 Thread qasdfgtyuiop
I'm sorry but I don't know how to get the kernel dump.  It seems that
the kernel dump is not enabled for my kernel:
# CONFIG_CRASH_DUMP is not set

On 3/13/12, Anand Jain  wrote:
>
>
>   These logs don't have the traces of the BUG_ON() below.
>   The stack in the dmesg below has 'btrfs_congested_fn',
>   which generally reports a block-device queue-nearly-full condition.
>   We would need logs to confirm anything further. Was there
>   a kernel dump generated when BUG_ON was called? That should
>   help.
>
> Thanks, Anand
>
>
> On 03/13/12 13:42, qasdfgtyuiop wrote:
>> All my logs are attached.
>> I have reinstalled my system; the old filesystem has been backed up as an
>> img file using dd if=/dev/sdb1 of=broken-btrfs.img bs=1G.  I mounted
>> the broken filesystem and copied the log files out.  While copying I
>> got these error messages:
>> cp: reading `log/messages.log': Input/output error
>> cp: failed to extend `/home/gaoxiang/tests/log/messages.log': Input/output error
>> cp: reading `log/kernel.log': Input/output error
>> cp: failed to extend `/home/gaoxiang/tests/log/kernel.log': Input/output error
>> cp: reading `log/errors.log': Input/output error
>> cp: failed to extend `/home/gaoxiang/tests/log/errors.log': Input/output error
>> cp: reading `log/everything.log': Input/output error
>> cp: failed to extend `/home/gaoxiang/tests/log/everything.log': Input/output error
>>
>>
>> On Mon, Mar 12, 2012 at 3:09 PM, Anand Jain  wrote:
>>>
>>>   Could you also post a few lines of dmesg logged _before_ the logs below?
>>>
>>> Thanks,  -Anand
>>>
>>>
>>>
>>> On Saturday 10,March,2012 06:26 PM, qasdfgtyuiop wrote:

 [11558.527680] [ cut here ]
 [11558.527708] kernel BUG at fs/btrfs/transaction.c:1337!
 [11558.527730] invalid opcode:  [#1] PREEMPT SMP
>>> [11558.527764] CPU 1
 [11558.527776] Modules linked in: loop nls_cp437 vfat fat dm_mod xfs
 exportfs jfs usb_storage uas fuse ext4 jbd2 mbcache snd_hda_codec_hdmi
 snd_hda_codec_realtek arc4 iwlwifi snd_hda_intel snd_hda_codec
 uvcvideo snd_hwdep nouveau iTCO_wdt i2c_i801 jmb38x_ms broadcom
 snd_pcm i915 videodev tg3 sdhci_pci v4l2_compat_ioctl32 mac80211 sdhci
 ttm i2c_algo_bit snd_page_alloc intel_agp snd_timer mxm_wmi serio_raw
 drm_kms_helper btusb media bluetooth drm psmouse evdev joydev pcspkr
 crc16 mei(C) iTCO_vendor_support snd libphy intel_ips memstick
 mmc_core cfg80211 soundcore intel_gtt i2c_core thermal battery
 ideapad_laptop sparse_keymap rfkill wmi ac video processor button
 btrfs crc32c libcrc32c zlib_deflate sd_mod sr_mod cdrom usbhid hid
 ahci libahci libata scsi_mod ehci_hcd usbcore usb_common
 [11558.528323]
 [11558.528333] Pid: 125, comm: btrfs-transacti Tainted: G C
 3.2.9-1-ARCH #1 LENOVO   IdeaPad Y460/KL2
 [11558.528389] RIP: 0010:[]  []
 btrfs_commit_transaction+0x879/0x880 [btrfs]
 [11558.528439] RSP: 0018:8801af1dfde0  EFLAGS: 00010282
 [11558.528460] RAX: fffb RBX: 8801afa91690 RCX:
 
 [11558.528484] RDX: 8801af1dfce8 RSI: 03c1 RDI:
 8801afa916f0
 [11558.528510] RBP: 8801af1dfe70 R08: 2000 R09:
 
 [11558.528536] R10:  R11: 0001 R12:
 8801b13ab000
 [11558.528561] R13: 8801afa90c18 R14: 8801afa91708 R15:
 0218
 [11558.528586] FS:  () GS:8801bbc8()
 knlGS:
 [11558.528616] CS:  0010 DS:  ES:  CR0: 8005003b
 [11558.528637] CR2: 7fb7899244e8 CR3: 01805000 CR4:
 06e0
 [11558.528663] DR0:  DR1:  DR2:
 
 [11558.528688] DR3:  DR6: 0ff0 DR7:
 0400
 [11558.528714] Process btrfs-transacti (pid: 125, threadinfo
 8801af1de000, task 8801aff2ce60)
 [11558.528744] Stack:
 [11558.528755]  8801af1dfe10  
 8801aff2ce60
 [11558.528791]  81088d80 8801af1dfe08 8801af1dfe08
 a014ccd4
 [11558.528827]   af2b0560 8801aff2ce60
 8801b13ab000
 [11558.528863] Call Trace:
 [11558.528879]  [] ? abort_exclusive_wait+0xb0/0xb0
 [11558.528910]  [] ? start_transaction+0x94/0x2b0
 [btrfs]
 [11558.528940]  [] transaction_kthread+0x26d/0x290
 [btrfs]
 [11558.528971]  [] ? btrfs_congested_fn+0xd0/0xd0
 [btrfs]
 [11558.528996]  [] kthread+0x8c/0xa0
 [11558.529018]  [] kernel_thread_helper+0x4/0x10
 [11558.529040]  [] ? kthread_worker_fn+0x190/0x190
 [11558.529064]  [] ? gs_change+0x13/0x13
 [11558.529083] Code: ff ff e9 44 f9 ff ff 0f 0b 0f 0b 0f 0b be fc 04
 00 00 48 c7 c7 fa 00 1a a0 e8 84 98 f1 e0 e9 3c fb ff ff 0f 0b 0f 0b
 0f 0b 0f 0b<0f> 

Re: btrfs encryption problems

2012-03-13 Thread 810d4rk
I'm not having any luck getting the encrypted btrfs back since the drive
was unplugged during a write operation. The experimental fsck with the
repair option reports that no valid btrfs was found, and mount gives
this:
sudo mount -t btrfs /dev/dm-1 /media/
mount: wrong fs type, bad option, bad superblock on
/dev/mapper/udisks-luks-uuid-269300fe-1329-42f8-b7fa-4a399a71d56f-uid1000,
missing codepage or helper program, or other error

Is there nothing more that can be done about this?


On 23 December 2011 14:25, 810d4rk <810d...@gmail.com> wrote:
>> I've been using btrfs and luks encryption on my Acer netbook for about
>> a year now.  I haven't had an unmountable corruption on that computer
>> yet.
>>
>> What are your goals now?
>
> I would like to recover the data that is not in the backup.
>
>> Are you trying to recover data from this disk, or are you trying to
>> accomplish some debugging?
>
> I am trying to recover the data, also I think this bug needs to be fixed.
>
>> Reviewing the thread, I don't see where you've run btrfsck on the
>> /dev/mapper/.
>>
>> Although btrfsck won't fix anything, it might give some insight as to
>> the extent of the corruption.
>
> I have run it now and it says "No valid Btrfs found"
>
>
> --
> Thanks



-- 
Thanks


Re: kernel BUG at fs/btrfs/transaction.c:1337!

2012-03-13 Thread Anand Jain



 These logs don't have the traces of the BUG_ON() below.
 The stack in the dmesg below has 'btrfs_congested_fn',
 which generally reports a block-device queue-nearly-full condition.
 We would need logs to confirm anything further. Was there
 a kernel dump generated when BUG_ON was called? That should
 help.

Thanks, Anand


On 03/13/12 13:42, qasdfgtyuiop wrote:

All my logs are attached.
I have reinstalled my system; the old filesystem has been backed up as an
img file using dd if=/dev/sdb1 of=broken-btrfs.img bs=1G.  I mounted
the broken filesystem and copied the log files out.  While copying I
got these error messages:
cp: reading `log/messages.log': Input/output error
cp: failed to extend `/home/gaoxiang/tests/log/messages.log': Input/output error
cp: reading `log/kernel.log': Input/output error
cp: failed to extend `/home/gaoxiang/tests/log/kernel.log': Input/output error
cp: reading `log/errors.log': Input/output error
cp: failed to extend `/home/gaoxiang/tests/log/errors.log': Input/output error
cp: reading `log/everything.log': Input/output error
cp: failed to extend `/home/gaoxiang/tests/log/everything.log': Input/output error


On Mon, Mar 12, 2012 at 3:09 PM, Anand Jain  wrote:


  Could you also post a few lines of dmesg logged _before_ the logs below?

Thanks,  -Anand



On Saturday 10,March,2012 06:26 PM, qasdfgtyuiop wrote:


[11558.527680] [ cut here ]
[11558.527708] kernel BUG at fs/btrfs/transaction.c:1337!
[11558.527730] invalid opcode:  [#1] PREEMPT SMP

[11558.527764] CPU 1

[11558.527776] Modules linked in: loop nls_cp437 vfat fat dm_mod xfs
exportfs jfs usb_storage uas fuse ext4 jbd2 mbcache snd_hda_codec_hdmi
snd_hda_codec_realtek arc4 iwlwifi snd_hda_intel snd_hda_codec
uvcvideo snd_hwdep nouveau iTCO_wdt i2c_i801 jmb38x_ms broadcom
snd_pcm i915 videodev tg3 sdhci_pci v4l2_compat_ioctl32 mac80211 sdhci
ttm i2c_algo_bit snd_page_alloc intel_agp snd_timer mxm_wmi serio_raw
drm_kms_helper btusb media bluetooth drm psmouse evdev joydev pcspkr
crc16 mei(C) iTCO_vendor_support snd libphy intel_ips memstick
mmc_core cfg80211 soundcore intel_gtt i2c_core thermal battery
ideapad_laptop sparse_keymap rfkill wmi ac video processor button
btrfs crc32c libcrc32c zlib_deflate sd_mod sr_mod cdrom usbhid hid
ahci libahci libata scsi_mod ehci_hcd usbcore usb_common
[11558.528323]
[11558.528333] Pid: 125, comm: btrfs-transacti Tainted: G C
3.2.9-1-ARCH #1 LENOVO   IdeaPad Y460/KL2
[11558.528389] RIP: 0010:[]  []
btrfs_commit_transaction+0x879/0x880 [btrfs]
[11558.528439] RSP: 0018:8801af1dfde0  EFLAGS: 00010282
[11558.528460] RAX: fffb RBX: 8801afa91690 RCX:

[11558.528484] RDX: 8801af1dfce8 RSI: 03c1 RDI:
8801afa916f0
[11558.528510] RBP: 8801af1dfe70 R08: 2000 R09:

[11558.528536] R10:  R11: 0001 R12:
8801b13ab000
[11558.528561] R13: 8801afa90c18 R14: 8801afa91708 R15:
0218
[11558.528586] FS:  () GS:8801bbc8()
knlGS:
[11558.528616] CS:  0010 DS:  ES:  CR0: 8005003b
[11558.528637] CR2: 7fb7899244e8 CR3: 01805000 CR4:
06e0
[11558.528663] DR0:  DR1:  DR2:

[11558.528688] DR3:  DR6: 0ff0 DR7:
0400
[11558.528714] Process btrfs-transacti (pid: 125, threadinfo
8801af1de000, task 8801aff2ce60)
[11558.528744] Stack:
[11558.528755]  8801af1dfe10  
8801aff2ce60
[11558.528791]  81088d80 8801af1dfe08 8801af1dfe08
a014ccd4
[11558.528827]   af2b0560 8801aff2ce60
8801b13ab000
[11558.528863] Call Trace:
[11558.528879]  [] ? abort_exclusive_wait+0xb0/0xb0
[11558.528910]  [] ? start_transaction+0x94/0x2b0
[btrfs]
[11558.528940]  [] transaction_kthread+0x26d/0x290
[btrfs]
[11558.528971]  [] ? btrfs_congested_fn+0xd0/0xd0
[btrfs]
[11558.528996]  [] kthread+0x8c/0xa0
[11558.529018]  [] kernel_thread_helper+0x4/0x10
[11558.529040]  [] ? kthread_worker_fn+0x190/0x190
[11558.529064]  [] ? gs_change+0x13/0x13
[11558.529083] Code: ff ff e9 44 f9 ff ff 0f 0b 0f 0b 0f 0b be fc 04
00 00 48 c7 c7 fa 00 1a a0 e8 84 98 f1 e0 e9 3c fb ff ff 0f 0b 0f 0b
0f 0b 0f 0b<0f>0b 0f 0b 0f 1f 00 55 48 89 e5 53 48 83 ec 08 66 66 66
66 90
[11558.529389] RIP  []
btrfs_commit_transaction+0x879/0x880 [btrfs]
[11558.529425]  RSP
[11558.592012] ---[ end trace e0456c287e012690 ]---


[PATCH] Btrfs: introduce free_extent_buffer_stale

2012-03-13 Thread Josef Bacik
Because btrfs COWs, we can end up with extent buffers that are no longer
necessary just sitting around in memory.  So instead of evicting these pages,
we could end up evicting things we actually care about.  Thus we have
free_extent_buffer_stale for use when we are freeing tree blocks.  This makes
it so that the ref for the eb being in the radix tree is dropped as soon as
possible, and the eb is then freed when the refcount hits 0 instead of waiting
to be released by releasepage.  Thanks,

Signed-off-by: Josef Bacik 
---
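Archive note, not part of the patch: the "inc_not_zero dance" used in
btrfs_root_node() below can be illustrated with a small userspace sketch.
This is only an analogue under stated assumptions. It uses C11 atomics,
while the body of extent_buffer_inc_not_zero() is not shown in this mail,
and struct buf, buf_inc_not_zero() and buf_put() are hypothetical names
made up for the illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an extent buffer; only the refcount matters. */
struct buf {
	atomic_int refs;
};

/*
 * Take a reference only if the object is still live (refs > 0).
 * Returns true on success, false if the count had already reached zero,
 * in which case the caller must look the object up again.
 */
static bool buf_inc_not_zero(struct buf *b)
{
	int old = atomic_load(&b->refs);

	while (old != 0) {
		/* On failure, old is refreshed with the current value; retry. */
		if (atomic_compare_exchange_weak(&b->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool buf_put(struct buf *b)
{
	return atomic_fetch_sub(&b->refs, 1) == 1;
}
```

A reader that gets false back from buf_inc_not_zero() has raced with the
final put, which is the case where the loop in btrfs_root_node() calls
synchronize_rcu() and tries again with the new root node.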
 fs/btrfs/ctree.c   |   31 +++---
 fs/btrfs/disk-io.c |   14 +
 fs/btrfs/extent-tree.c |4 -
 fs/btrfs/extent_io.c   |  157 ++--
 fs/btrfs/extent_io.h   |5 +-
 5 files changed, 152 insertions(+), 59 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 0639a55..fbf0f12 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -156,10 +156,23 @@ struct extent_buffer *btrfs_root_node(struct btrfs_root *root)
 {
struct extent_buffer *eb;
 
-   rcu_read_lock();
-   eb = rcu_dereference(root->node);
-   extent_buffer_get(eb);
-   rcu_read_unlock();
+   while (1) {
+   rcu_read_lock();
+   eb = rcu_dereference(root->node);
+
+   /*
+* RCU really hurts here, we could free up the root node because
+* it was cow'ed but we may not get the new root node yet so do
+* the inc_not_zero dance and if it doesn't work then
+* synchronize_rcu and try again.
+*/
+   if (extent_buffer_inc_not_zero(eb)) {
+   rcu_read_unlock();
+   break;
+   }
+   rcu_read_unlock();
+   synchronize_rcu();
+   }
return eb;
 }
 
@@ -504,7 +517,7 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
}
if (unlock_orig)
btrfs_tree_unlock(buf);
-   free_extent_buffer(buf);
+   free_extent_buffer_stale(buf);
btrfs_mark_buffer_dirty(cow);
*cow_ret = cow;
return 0;
@@ -959,7 +972,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
root_sub_used(root, mid->len);
btrfs_free_tree_block(trans, root, mid, 0, 1, 0);
/* once for the root ptr */
-   free_extent_buffer(mid);
+   free_extent_buffer_stale(mid);
return 0;
}
if (btrfs_header_nritems(mid) >
@@ -1016,7 +1029,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
ret = wret;
root_sub_used(root, right->len);
btrfs_free_tree_block(trans, root, right, 0, 1, 0);
-   free_extent_buffer(right);
+   free_extent_buffer_stale(right);
right = NULL;
} else {
struct btrfs_disk_key right_key;
@@ -1056,7 +1069,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans,
ret = wret;
root_sub_used(root, mid->len);
btrfs_free_tree_block(trans, root, mid, 0, 1, 0);
-   free_extent_buffer(mid);
+   free_extent_buffer_stale(mid);
mid = NULL;
} else {
/* update the parent key to reflect our changes */
@@ -3781,7 +3794,9 @@ static noinline int btrfs_del_leaf(struct btrfs_trans_handle *trans,
 
root_sub_used(root, leaf->len);
 
+   extent_buffer_get(leaf);
btrfs_free_tree_block(trans, root, leaf, 0, 1, 0);
+   free_extent_buffer_stale(leaf);
return 0;
 }
 /*
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b8f2284..a3c9166 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -925,21 +925,9 @@ static int btree_readpage(struct file *file, struct page *page)
 
 static int btree_releasepage(struct page *page, gfp_t gfp_flags)
 {
-   struct extent_map_tree *map;
-   struct extent_io_tree *tree;
-   int ret;
-
if (PageWriteback(page) || PageDirty(page))
return 0;
-
-   tree = &BTRFS_I(page->mapping->host)->io_tree;
-   map = &BTRFS_I(page->mapping->host)->extent_tree;
-
-   ret = try_release_extent_state(map, tree, page, gfp_flags);
-   if (!ret)
-   return 0;
-
-   return try_release_extent_buffer(tree, page);
+   return try_release_extent_buffer(page, gfp_flags);
 }
 
 static void btree_invalidatepage(struct page *page, unsigned long offset)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index e5111d5..4f7cc56 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5011,10 +5011,6 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
if (is_data) {
ret = btr

Re: [PATCH] Btrfs: convert refs and pages_reading to ints

2012-03-13 Thread Josef Bacik
On Tue, Mar 13, 2012 at 10:51:04AM +0100, Jan Schmidt wrote:
> Hi Josef,
> 
> On 09.03.2012 17:06, Josef Bacik wrote:
> > I need to be able to safely deal with refs in my next patch, so convert refs and
> 
> Did I miss your next patch?
> 
> > pages_reading to ints and introduce an eb_lock spinlock so I can use this to
> > safely manipulate the refs count when marking eb's as stale.  Thanks,
> 
> I don't see what makes this version safer, are you synchronizing
> eb->refs with eb->pages_reading? This would be strange, because
> eb->pages_reading sounds like it could never be != 0 when eb->refs goes
> down to 0. Could you extend your description a bit, please?
>

So I'm introducing eb_lock to protect eb local stuff, such as refs and
pages_reading.  The idea is that since I have to have a spin_lock anyway I might
as well use it for pages_reading as well, especially since further down the line
(like what I'm currently working on) I need to change it to cover pages in write
as well and I need to do it in a way that atomic won't help.
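
To make the idea concrete, here is a rough userspace sketch of one lock
covering both counters. It is not the kernel code: a pthread mutex stands
in for the spinlock, struct eb_counts and its helpers are hypothetical
names, and the "last ref and no pages in flight" check is just one assumed
example of a decision that needs both values at once.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace analogue; field names mirror the quoted diff. */
struct eb_counts {
	pthread_mutex_t lock;	/* stands in for the kernel spinlock */
	int refs;
	int pages_reading;
};

static void eb_counts_init(struct eb_counts *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->refs = 1;
	c->pages_reading = 0;
}

/* Take a reference unless the count has already dropped to zero. */
static bool eb_counts_get(struct eb_counts *c)
{
	bool ok;

	pthread_mutex_lock(&c->lock);
	ok = c->refs > 0;
	if (ok)
		c->refs++;
	pthread_mutex_unlock(&c->lock);
	return ok;
}

/*
 * Drop a reference. Reports true only when this was the last reference
 * and no pages are still being read; both counters are examined under
 * the same lock, so the answer is consistent.
 */
static bool eb_counts_put(struct eb_counts *c)
{
	bool last;

	pthread_mutex_lock(&c->lock);
	c->refs--;
	last = (c->refs == 0 && c->pages_reading == 0);
	pthread_mutex_unlock(&c->lock);
	return last;
}
```

With two separate atomics, each counter could be updated safely on its
own, but a check that combines them could see one value change between
the two reads.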
 
> > diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
> > index f030a2d..d11e872 100644
> > --- a/fs/btrfs/extent_io.h
> > +++ b/fs/btrfs/extent_io.h
> > @@ -128,8 +128,9 @@ struct extent_buffer {
> > unsigned long map_len;
> > unsigned long bflags;
> > struct extent_io_tree *tree;
> > -   atomic_t refs;
> > -   atomic_t pages_reading;
> > +   spinlock_t eb_lock;
> 
> If you need that one, then please use another name for this. We already
> have the extent buffer's rwlock, it'll only be a matter of time until
> somebody (me) confuses eb->lock and eb->eb_lock. I'd like something
> representing its purpose (which I didn't catch yet). eb->refs_lock or
> eb->pages_lock might be appropriate.
> 

pages_lock sounds fine to me.  Thanks,

Josef


bcache with SSD instead of battery powered raid cards

2012-03-13 Thread Kiran Patil
Hi,

Is anybody using bcache with an SSD instead of battery-powered RAID cards
with Btrfs?

Hard drives are cheap and big, SSDs are fast but small and expensive.
Wouldn't it be nice if you could transparently get the advantages of
both? With Bcache, you can have your cake and eat it too.

Bcache is a patch for the Linux kernel to use SSDs to cache other
block devices. It's analogous to L2Arc for ZFS, but Bcache also does
writeback caching, and it's filesystem agnostic. It's designed to be
switched on with a minimum of effort, and to work well without
configuration on any setup. By default it won't cache sequential IO,
just the random reads and writes that SSDs excel at. It's meant to be
suitable for desktops, servers, high end storage arrays, and perhaps
even embedded.

http://bcache.evilpiepirate.org/

http://news.gmane.org/gmane.linux.kernel.bcache.devel

Thanks,
Kiran.


Re: [PATCH] Btrfs: convert refs and pages_reading to ints

2012-03-13 Thread Jan Schmidt
Hi Josef,

On 09.03.2012 17:06, Josef Bacik wrote:
> I need to be able to safely deal with refs in my next patch, so convert refs and

Did I miss your next patch?

> pages_reading to ints and introduce an eb_lock spinlock so I can use this to
> safely manipulate the refs count when marking eb's as stale.  Thanks,

I don't see what makes this version safer, are you synchronizing
eb->refs with eb->pages_reading? This would be strange, because
eb->pages_reading sounds like it could never be != 0 when eb->refs goes
down to 0. Could you extend your description a bit, please?

> diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
> index f030a2d..d11e872 100644
> --- a/fs/btrfs/extent_io.h
> +++ b/fs/btrfs/extent_io.h
> @@ -128,8 +128,9 @@ struct extent_buffer {
>   unsigned long map_len;
>   unsigned long bflags;
>   struct extent_io_tree *tree;
> - atomic_t refs;
> - atomic_t pages_reading;
> + spinlock_t eb_lock;

If you need that one, then please use another name for this. We already
have the extent buffer's rwlock, it'll only be a matter of time until
somebody (me) confuses eb->lock and eb->eb_lock. I'd like something
representing its purpose (which I didn't catch yet). eb->refs_lock or
eb->pages_lock might be appropriate.

> + int refs;
> + int pages_reading;
>   struct list_head leak_list;
>   struct rcu_head rcu_head;
>   pid_t lock_owner;

Thanks,
-Jan


Re: btrfs oops (autodefrag related?)

2012-03-13 Thread Avi Kivity
On 03/13/2012 02:04 AM, Chris Mason wrote:
> On Mon, Mar 12, 2012 at 09:32:54PM +0200, Avi Kivity wrote:
> > Because I'm such a btrfs fanboi I'm running btrfs on my /, all past
> > experience notwithstanding.  In an attempt to recover some performance,
> > I enabled autodefrag, and got this in return:
>
> Hi Avi,
>
> This one was fixed in the 3.3 series.  You can pull from my for-linus
> repo for a commit against 3.2.
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus
>
> The individual fix is here:
>
> http://git.kernel.org/?p=linux/kernel/git/mason/linux-btrfs.git;a=commit;h=87826df0ec36fc28884b4ddbb3f3af41c4c2008f
>
>

Thanks.  Suggest queueing it for -stable.

-- 
error compiling committee.c: too many arguments to function
