Re: can't access diagrams on wiki

2012-01-24 Thread Arne Jansen
On 25.01.2012 03:37, Anand Jain wrote:
> 
>> The wiki on kernel.org is in read-only mode
>> [1] http://btrfs.ipv5.de/
> 
>  Is the wiki still in read-only mode? I am able to log in,
>  but there isn't any link to create a new page.
> 

You mean the wiki mentioned above? You have to confirm
your email address to create and edit pages. This was necessary
to slow down spammers :(

-Arne

> thanks, Anand
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG - btrfs] kernel oops in extent_range_uptodate

2012-01-24 Thread Mitch Harder
On Tue, Jan 24, 2012 at 10:24 AM, Vincent Vanackere
 wrote:
> On 01/20/2012 09:54 PM, Mitch Harder wrote:
>>
>> On Fri, Jan 20, 2012 at 10:48 AM, Vincent Vanackere
>>   wrote:
>>>
>>> On 01/19/2012 05:24 PM, Mitch Harder wrote:
>>>>
>>>> On Thu, Jan 19, 2012 at 8:42 AM, Vincent Vanackere
>>>>     wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> With the most current git kernel
>>>>> (90a4c0f51e8e44111a926be6f4c87af3938a79c3)
>>>>> I'm still getting the same reproducible kernel panic when trying to
>>>>> read a particular file stored on a btrfs filesystem (as seen in the
>>>>> log there are indeed disk media errors on this disk).
>>>>> I'd like the "software" part of this to be fixed - btrfs should
>>>>> definitely not oops even in case of media error - before sending the
>>>>> disk to RMA. Is there anything I can do to make progress on this ?
>>>>>
>>>> Is this kernel compiled with "Compile the kernel with debug info" (in
>>>> the "Kernel hacking  --->" configuration section)?
>>>>
>>>> It would be nice to have the specific line of code passing the NULL
>>>> pointer.
>>>
>>>
>>> The kernel was compiled with debug information, but modern Linux
>>> distributions make it really hard to keep your debug information,
>>> it seems :-(
>>
>> I see where the find_get_page(...) function called in
>> extent_range_uptodate has the potential to return a NULL value.
>>
>> Could you try the following patch, and if it solves your oops and
>> shows the included warning in your dmesg log, I'll simplify the patch
>> to drop the printk and submit it to the list.
>>
>> I only included the printk since your current error log is ambiguous
>> regarding the specific point where we're getting the NULL pointer
>> dereference, but I'll pull it out if it works.
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index 9d09a4f..35c3a2a 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -3909,6 +3909,13 @@ int extent_range_uptodate(struct extent_io_tree *tree,
>>        while (start <= end) {
>>                index = start >> PAGE_CACHE_SHIFT;
>>                page = find_get_page(tree->mapping, index);
>> +               if (unlikely(!page)) {
>> +                       if (printk_ratelimit())
>> +                               printk(KERN_WARNING
>> +                                      "btrfs: NULL page in "
>> +                                      "extent_range_uptodate()\n");
>> +                       return 1;
>> +               }
>>                uptodate = PageUptodate(page);
>>                page_cache_release(page);
>>                if (!uptodate) {
>
>
> Indeed your patch helps. No kernel panic any more... but it looks like the
> task doesn't finish and there's another problem to solve now :
>
>
> sd 5:0:0:0: [sdd] Unhandled sense code
> sd 5:0:0:0: [sdd]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> sd 5:0:0:0: [sdd]  Sense Key : Medium Error [current] [descriptor]
> Descriptor sense data with sense descriptors (in hex):
>        72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
>        70 2f dc 61
> sd 5:0:0:0: [sdd]  Add. Sense: Unrecovered read error - auto reallocate failed
> sd 5:0:0:0: [sdd] CDB: Read(10): 28 00 70 2f dc 5f 00 00 08 00
> end_request: I/O error, dev sdd, sector 1882184801
> ata6: EH complete
> btrfs: NULL page in extent_range_uptodate()
> btrfs: NULL page in extent_range_uptodate()
> btrfs bad tree block start 959241011200 959241011200
> INFO: task cat:3099 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> cat             D 8180c600     0  3099   3002 0x
>  8801f2b0f618 0086 8801f2b0f5d8 880221018770
>  880222c65b80 8801f2b0ffd8 8801f2b0ffd8 8801f2b0ffd8
>  8802241816e0 880222c65b80 8801f2b0f5e8 88022fd13e88
> Call Trace:
>  [] ? __lock_page+0x70/0x70
>  [] schedule+0x3f/0x60
>  [] io_schedule+0x8f/0xd0
>  [] sleep_on_page+0xe/0x20
>  [] __wait_on_bit+0x5f/0x90
>  [] wait_on_page_bit+0x78/0x80
>  [] ? autoremove_wake_function+0x40/0x40
>  [] read_extent_buffer_pages+0x471/0x4d0 [btrfs]
>  [] ? verify_parent_transid+0x160/0x160 [btrfs]
>  [] btree_read_extent_buffer_pages.isra.99+0x8a/0xc0 [btrfs]
>  [] read_tree_block+0x41/0x60 [btrfs]
>  [] read_block_for_search.isra.34+0xf3/0x3d0 [btrfs]
>  [] btrfs_search_slot+0x300/0x8a0 [btrfs]
>  [] btrfs_lookup_csum+0x74/0x170 [btrfs]
>  [] __btrfs_lookup_bio_sums+0x1af/0x3b0 [btrfs]
>  [] btrfs_lookup_bio_sums+0x16/0x20 [btrfs]
>  [] btrfs_submit_bio_hook+0x140/0x170 [btrfs]
>  [] ? btrfs_real_readdir+0x720/0x720 [btrfs]
>  [] submit_one_bio+0x6a/0xa0 [btrfs]
>  [] extent_readpages+0xe4/0x100 [btrfs]
>  [] ? btrfs_real_readdir+0x720/0x720 [btrfs]
>  [] btrfs_readpages+0x1f/0x30 [btrfs]
>  [] __do_page_cache_readahead+0x1af/0x250
>  [] ra_submit+0x21/0x30
>  [] ondemand_readahead+0x115/0x230
>  [] ? __do_fault+0x419/0x530
>  [] page_cache_sync_readahead+0x31/0x50
>

Re: can't access diagrams on wiki

2012-01-24 Thread Anand Jain



The wiki on kernel.org is in read-only mode
[1] http://btrfs.ipv5.de/


 Is the wiki still in read-only mode? I am able to log in,
 but there isn't any link to create a new page.

thanks, Anand


Re: ANN: linux-kernel-lzo-2.06.20120123 - update LZO to v2.06

2012-01-24 Thread Andi Kleen
On Mon, Jan 23, 2012 at 05:19:40PM +0100, Markus F.X.J. Oberhumer wrote:
> Hi,
> 
> I've prepared a small package that updates the LZO version in the Linux
> kernel to LZO v2.06.

I ran benchmarks on the new miniLZO and LZ4 on 64bit. LZ4 is generally slower 
than snappy/lzo in the micro benchmarks. The new LZO is better than the old
one, but still loses to snappy most of the time (but often by very
small amounts only)

It will be worth checking the new LZO with the full distribution boot test.

I agree it's definitely a good idea to update the kernel version.
However I must say it would be a major project to bring it up
to kernel coding standards.

snappy is still interesting, but much less so than it was before.

-Andi


Re: [RFC] improve space utilization on off-sized raid devices

2012-01-24 Thread Thomas Schmidt
On Thursday 01 December 2011 09:55:27 Arne Jansen wrote:
> As RAID0 is already not a strict 'all disks or none', I like the idea to
> have it even more dynamic to reach full optimization. But I'd like to see
> some properties conserved:
>  a) In case of even size disks, the stripes should always be full size, not
>     n - 1
>  b) Minor variations in the used space per disk due to metadata chunks
> should not lead to deviation from a)
>  c) The algorithms should not give weird results under unconventional
> setups. Some theoretical background would be nice 

Resent because it did not appear on the ML for about 4h.
KMail's acting up.

Sorry to only get back to you now, I must have missed your mail somehow.

The problem is the shrinking stripe width with unmatched devices. Once it hits 
devs_min-1 it's over. My solution is to try to keep the stripe width constant.
The sorting then takes care of selecting the right devices.

It's simply: space / min-height = max-width
a) is dictated by math
Since circumstances change (add, rm devs, rounding, ...) it is calculated again 
at every allocation. The result is then rounded to the nearest multiple of 
devs_increment. This takes care of b).

The code may look weird but should be identical to the mathematical
floor(space / min-height + increment/2) if considered together with the
round-down already present in the line after my patch.

The two ifs should safeguard against weird stuff by limiting the result to sane 
values.
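
The arithmetic can be checked outside the kernel. What follows is only a minimal userspace sketch, assuming the device list is already sorted so that the first entry is the one with the most available space. The function name pick_ndevs and its parameters are made up for illustration and are not part of volumes.c.

```c
#include <stdint.h>

/*
 * Userspace sketch of the stripe-width calculation from the patch.
 * pick_ndevs() is a hypothetical helper, not a function in volumes.c.
 *
 * fs_total_avail: sum of available space over all candidate devices
 * largest_avail:  available space on the largest device (must be > 0)
 * ndevs:          number of candidate devices
 */
static int pick_ndevs(uint64_t fs_total_avail, uint64_t largest_avail,
                      int ndevs, int devs_min, int devs_increment)
{
    /* floor(total / largest + increment / 2), computed in integers */
    uint64_t opt_ndevs = (fs_total_avail * 2 +
                          (uint64_t)devs_increment * largest_avail) /
                         (largest_avail * 2);

    /* the two safeguards: never below devs_min, never above ndevs */
    if (opt_ndevs < (uint64_t)devs_min)
        opt_ndevs = (uint64_t)devs_min;
    if ((uint64_t)ndevs > opt_ndevs)
        ndevs = (int)opt_ndevs;

    /* round down to number of usable stripes */
    ndevs -= ndevs % devs_increment;
    return ndevs;
}
```

With example values devs_min = 2 and devs_increment = 1, three devices offering 300, 100 and 100 units give pick_ndevs(500, 300, 3, 2, 1) == 2. Four equal devices of 100 units give 4, and so do four devices of 100, 99, 99 and 98 units, which is what properties a) and b) ask for.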

I include an updated patch below. It's again written for and tested with 3.0.0 
but diff3 worked nicely for applying it to 3.3-rc1.

--- volumes.c.orig  2012-01-20 16:59:31.0 +0100
+++ volumes.c   2012-01-24 11:24:07.261401805 +0100
@@ -2329,6 +2329,8 @@
u64 stripe_size;
u64 num_bytes;
int ndevs;
+   u64 fs_total_avail = 0;
+   int opt_ndevs;
int i;
int j;
 
@@ -2448,6 +2450,7 @@
devices_info[ndevs].total_avail = total_avail;
devices_info[ndevs].dev = device;
++ndevs;
+   fs_total_avail += total_avail;
}
 
/*
@@ -2456,6 +2459,16 @@
sort(devices_info, ndevs, sizeof(struct btrfs_device_info),
 btrfs_cmp_device_info, NULL);
 
+   /*
+   * do not allocate space on all devices
+   * instead balance free space to maximise space utilization
+   */
+   opt_ndevs = (fs_total_avail*2 + devs_increment*devices_info[0].total_avail) / (devices_info[0].total_avail*2);
+   if (opt_ndevs < devs_min)
+   opt_ndevs = devs_min;
+   if (ndevs > opt_ndevs)
+   ndevs = opt_ndevs;
+
/* round down to number of usable stripes */
ndevs -= ndevs % devs_increment;



Re: Btrfs slowdown with ceph (how to reproduce)

2012-01-24 Thread Martin Mailand

Hi Chris,
Great to hear that. Could you ping me once you have fixed it, so that I can
retry it?


-martin

On 24.01.2012 20:40, Chris Mason wrote:

On Tue, Jan 24, 2012 at 08:15:58PM +0100, Martin Mailand wrote:

Hi
I tried the branch on one of my ceph osd, and there is a big
difference in the performance.
The average request size stayed high, but after around an hour the
kernel crashed.

IOstat
http://pastebin.com/xjuriJ6J

Kernel trace
http://pastebin.com/SYE95GgH


Aha, this I know how to fix.  Thanks for trying it out.

-chris


Re: Btrfs slowdown with ceph (how to reproduce)

2012-01-24 Thread Chris Mason
On Tue, Jan 24, 2012 at 08:15:58PM +0100, Martin Mailand wrote:
> Hi
> I tried the branch on one of my ceph osd, and there is a big
> difference in the performance.
> The average request size stayed high, but after around an hour the
> kernel crashed.
> 
> IOstat
> http://pastebin.com/xjuriJ6J
> 
> Kernel trace
> http://pastebin.com/SYE95GgH

Aha, this I know how to fix.  Thanks for trying it out.

-chris


Re: Btrfs slowdown with ceph (how to reproduce)

2012-01-24 Thread Martin Mailand

Hi
I tried the branch on one of my ceph osd, and there is a big difference 
in the performance.
The average request size stayed high, but after around an hour the kernel
crashed.


IOstat
http://pastebin.com/xjuriJ6J

Kernel trace
http://pastebin.com/SYE95GgH

-martin

On 23.01.2012 19:50, Chris Mason wrote:

On Mon, Jan 23, 2012 at 01:19:29PM -0500, Josef Bacik wrote:

On Fri, Jan 20, 2012 at 01:13:37PM +0100, Christian Brunner wrote:

As you might know, I have been seeing btrfs slowdowns in our ceph
cluster for quite some time. Even with the latest btrfs code for 3.3
I'm still seeing these problems. To make things reproducible, I've now
written a small test, that imitates ceph's behavior:

On a freshly created btrfs filesystem (2 TB size, mounted with
"noatime,nodiratime,compress=lzo,space_cache,inode_cache") I'm opening
100 files. After that I'm doing random writes on these files with a
sync_file_range after each write (each write has a size of 100 bytes)
and ioctl(BTRFS_IOC_SYNC) after every 100 writes.

After approximately 20 minutes, write activity suddenly increases
fourfold and the average request size decreases (see chart in the
attachment).

You can find IOstat output here: http://pastebin.com/Smbfg1aG

I hope that you are able to trace down the problem with the test
program in the attachment.
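
The attached test program is not reproduced in this thread. As a rough illustration of the workload described above, here is a minimal sketch and not the original program. The function name run_workload, the file naming, the SYNC_FILE_RANGE_WRITE flag and the offset range are all assumptions. BTRFS_IOC_SYNC is defined locally so the sketch does not depend on a particular btrfs header being installed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for sync_file_range() */
#endif
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

#ifndef BTRFS_IOC_SYNC
#define BTRFS_IOC_SYNC _IO(0x94, 8) /* same value as in the btrfs ioctl header */
#endif

#define WRITE_SIZE 100

/*
 * Hypothetical reproducer sketch. Opens nfiles files under dir, then issues
 * nwrites random 100-byte writes, each followed by sync_file_range(), with a
 * btrfs sync ioctl after every sync_every writes. file_size must be larger
 * than WRITE_SIZE. Returns the number of completed writes, or -1 if a file
 * could not be opened.
 */
static long run_workload(const char *dir, int nfiles, long nwrites,
                         long sync_every, long file_size)
{
    char path[4096];
    char buf[WRITE_SIZE];
    long done = -1;
    int opened = 0;
    int *fds = calloc((size_t)nfiles, sizeof(*fds));

    if (!fds)
        return -1;
    memset(buf, 'x', sizeof(buf));

    for (opened = 0; opened < nfiles; opened++) {
        snprintf(path, sizeof(path), "%s/repro-%d", dir, opened);
        fds[opened] = open(path, O_CREAT | O_RDWR, 0600);
        if (fds[opened] < 0)
            goto out;
    }

    for (done = 0; done < nwrites; done++) {
        int fd = fds[rand() % nfiles];
        off_t off = (off_t)(rand() % (file_size - WRITE_SIZE));

        if (pwrite(fd, buf, sizeof(buf), off) != (ssize_t)sizeof(buf))
            break;
        /* start writeback for this range only; errors are ignored here */
        (void)sync_file_range(fd, off, sizeof(buf), SYNC_FILE_RANGE_WRITE);
        /* fails with ENOTTY if dir is not on btrfs; ignored in this sketch */
        if ((done + 1) % sync_every == 0)
            (void)ioctl(fds[0], BTRFS_IOC_SYNC);
    }
out:
    while (opened-- > 0)
        close(fds[opened]);
    free(fds);
    return done;
}
```

To approximate the numbers given above, one would call it with 100 files and a sync every 100 writes on a directory inside the btrfs mount, for example run_workload("/mnt/test", 100, 1000000, 100, 1 << 20), where the mount point, write count and file size are placeholders. The files are left in place afterwards.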


Ran it, saw the problem, tried the dangerdonteveruse branch in Chris's tree and
formatted the fs with 64k node and leaf sizes and the problem appeared to go
away.  So surprise surprise fragmentation is biting us in the ass.  If you can
try running that branch with 64k node and leaf sizes with your ceph cluster and
see how that works out.  Course you should only do that if you don't mind if you
lose everything :).  Thanks,



Please keep in mind this branch is only out there for development, and
it really might have huge flaws.  scrub doesn't work with it correctly
right now, and the IO error recovery code is probably broken too.

Long term though, I think the bigger block sizes are going to make a
huge difference in these workloads.

If you use the very dangerous code:

mkfs.btrfs -l 64k -n 64k /dev/xxx

(-l is leaf size, -n is node size).

64K is the max right now, 32K may help just as much at a lower CPU cost.

-chris



[PATCH] Btrfs: fix warning for 32-bit build of fs/btrfs/check-integrity.c

2012-01-24 Thread Stefan Behrens
There were four warnings on the 32-bit build; this patch fixes them.

Signed-off-by: Stefan Behrens 
---
 fs/btrfs/check-integrity.c |   11 ++++++-----
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c
index ad0b3ba..b669a7d 100644
--- a/fs/btrfs/check-integrity.c
+++ b/fs/btrfs/check-integrity.c
@@ -1662,7 +1662,7 @@ static void btrfsic_process_written_block(struct btrfsic_dev_state *dev_state,
block = btrfsic_block_hashtable_lookup(bdev, dev_bytenr,
   &state->block_hashtable);
if (NULL != block) {
-   u64 bytenr;
+   u64 bytenr = 0;
struct list_head *elem_ref_to;
struct list_head *tmp_ref_to;
 
@@ -2777,9 +2777,10 @@ int btrfsic_submit_bh(int rw, struct buffer_head *bh)
printk(KERN_INFO
   "submit_bh(rw=0x%x, blocknr=%lu (bytenr %llu),"
   " size=%lu, data=%p, bdev=%p)\n",
-  rw, bh->b_blocknr,
-  (unsigned long long)dev_bytenr, bh->b_size,
-  bh->b_data, bh->b_bdev);
+  rw, (unsigned long)bh->b_blocknr,
+  (unsigned long long)dev_bytenr,
+  (unsigned long)bh->b_size, bh->b_data,
+  bh->b_bdev);
btrfsic_process_written_block(dev_state, dev_bytenr,
  bh->b_data, bh->b_size, NULL,
  NULL, bh, rw);
@@ -2844,7 +2845,7 @@ void btrfsic_submit_bio(int rw, struct bio *bio)
printk(KERN_INFO
   "submit_bio(rw=0x%x, bi_vcnt=%u,"
   " bi_sector=%lu (bytenr %llu), bi_bdev=%p)\n",
-  rw, bio->bi_vcnt, bio->bi_sector,
+  rw, bio->bi_vcnt, (unsigned long)bio->bi_sector,
   (unsigned long long)dev_bytenr,
   bio->bi_bdev);
 
-- 
1.7.3.4
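
The warnings come from format specifiers that do not match the width of the argument on 32-bit builds. A minimal userspace sketch of the same casting pattern, using snprintf instead of printk, could look like this. The names demo_sector_t and format_submit are invented for the example and do not appear in check-integrity.c.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for a kernel type whose width depends on the build configuration. */
typedef uint64_t demo_sector_t;

/*
 * Hypothetical helper: each argument is cast to exactly the type its
 * conversion specifier names, so the call is warning-free regardless of
 * how wide demo_sector_t or size_t happen to be.
 */
static int format_submit(char *buf, size_t len, demo_sector_t blocknr,
                         uint64_t bytenr, size_t size)
{
    return snprintf(buf, len, "blocknr=%lu (bytenr %llu), size=%lu",
                    (unsigned long)blocknr,
                    (unsigned long long)bytenr,
                    (unsigned long)size);
}
```

Note that a cast to unsigned long silences the warning but truncates values that do not fit in 32 bits on a 32-bit build, so for a debug message that is a trade-off rather than a full fix.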



Re: [RFC] improve space utilization on off-sized raid devices

2012-01-24 Thread Thomas Schmidt
On Thursday 01 December 2011 09:55:27 Arne Jansen wrote:
> As RAID0 is already not a strict 'all disks or none', I like the idea to
> have it even more dynamic to reach full optimization. But I'd like to see
> some properties conserved:
>  a) In case of even size disks, the stripes should always be full size, not
>     n - 1
>  b) Minor variations in the used space per disk due to metadata chunks
> should not lead to deviation from a)
>  c) The algorithms should not give weird results under unconventional
> setups. Some theoretical background would be nice :)

Sorry to only get back to you now, I must have missed your mail somehow.

The problem is the shrinking stripe width with unmatched devices. Once it hits 
devs_min-1 it's over. My solution is to try to keep the stripe width constant.
The sorting then takes care of selecting the right devices.

It's simply: space / min-height = max-width
a) is dictated by math
Since circumstances change (add, rm devs, rounding, ...) it is calculated again 
at every allocation. The result is then rounded to the nearest multiple of 
devs_increment. This takes care of b).

The code may look weird but should be identical to the mathematical
floor(space / min-height + increment/2) if considered together with the
round-down already present in the line after my patch.

The two ifs should safeguard against weird stuff by limiting the result to sane 
values.

I include an updated patch below. It's again written for and tested with 3.0.0 
but diff3 worked nicely for applying it to 3.3-rc1.

--- volumes.c.orig  2012-01-20 16:59:31.0 +0100
+++ volumes.c   2012-01-24 11:24:07.261401805 +0100
@@ -2329,6 +2329,8 @@
u64 stripe_size;
u64 num_bytes;
int ndevs;
+   u64 fs_total_avail = 0;
+   int opt_ndevs;
int i;
int j;
 
@@ -2448,6 +2450,7 @@
devices_info[ndevs].total_avail = total_avail;
devices_info[ndevs].dev = device;
++ndevs;
+   fs_total_avail += total_avail;
}
 
/*
@@ -2456,6 +2459,16 @@
sort(devices_info, ndevs, sizeof(struct btrfs_device_info),
 btrfs_cmp_device_info, NULL);
 
+   /*
+   * do not allocate space on all devices
+   * instead balance free space to maximise space utilization
+   */
+   opt_ndevs = (fs_total_avail*2 + devs_increment*devices_info[0].total_avail) / (devices_info[0].total_avail*2);
+   if (opt_ndevs < devs_min)
+   opt_ndevs = devs_min;
+   if (ndevs > opt_ndevs)
+   ndevs = opt_ndevs;
+
/* round down to number of usable stripes */
ndevs -= ndevs % devs_increment;



Re: Updated btrfs/crypto snappy interface ready for merging

2012-01-24 Thread Hugo Chevrain
> evergreen writes:
> > 
> 
> Although there is LessFS's Mark Ruijter :
> http://www.lessfs.com/wordpress/?p=684&cpage=1#comment-3114
> 
> "On average the speeds appear to be 48% faster then snappy."
> Sounds promising...
> 
> --

Well, it seems that the performance is really good.
Apparently, LZ4 has displaced Snappy as speed champion for LessFS.

http://www.lessfs.com/wordpress/?p=688





[PATCH] Btrfs: advance window_start if we're using a bitmap

2012-01-24 Thread Josef Bacik
If we span a long area in a bitmap we could end up taking a lot of time
searching to the next free area if we're searching from the original
window_start, so advance window_start in order to make sure we don't do any
superfluous searching.  Thanks,

Signed-off-by: Josef Bacik 
---
 fs/btrfs/free-space-cache.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 43e75f2..c2f2059 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -2251,6 +2251,7 @@ u64 btrfs_alloc_from_cluster(struct btrfs_block_group_cache *block_group,
 offset_index);
continue;
}
+   cluster->window_start += bytes;
} else {
ret = entry->offset;
 
-- 
1.7.5.2
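
To illustrate why moving the search start matters, here is a toy model of the idea and not the free-space-cache code. It assumes a small array standing in for the bitmap; struct demo_cluster, demo_alloc and the steps counter are hypothetical names used only for this sketch.

```c
#include <stddef.h>

#define NBITS 64

struct demo_cluster {
    unsigned char used[NBITS]; /* 1 = allocated, 0 = free */
    size_t window_start;       /* slot where the next search begins */
};

/*
 * Toy allocator: find `count` (>= 1) consecutive free slots at or after
 * window_start, mark them used and return the first slot, or -1 if no
 * such run exists. When `advance` is non-zero, window_start is moved
 * forward by `count` after a successful allocation, as in the patch.
 * *steps accumulates how many slots the search had to inspect.
 */
static long demo_alloc(struct demo_cluster *c, size_t count, int advance,
                       size_t *steps)
{
    size_t run = 0;
    size_t i;

    for (i = c->window_start; i < NBITS; i++) {
        (*steps)++;
        run = c->used[i] ? 0 : run + 1;
        if (run == count) {
            size_t first = i + 1 - count;
            size_t j;

            for (j = first; j <= i; j++)
                c->used[j] = 1;
            if (advance)
                c->window_start += count;
            return (long)first;
        }
    }
    return -1;
}
```

In this model, three allocations of 4 slots from an empty cluster inspect 24 slots in total when the search always restarts at the original window_start, and 12 slots when window_start is advanced after each allocation.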



Re: [BUG - btrfs] kernel oops in extent_range_uptodate

2012-01-24 Thread Vincent Vanackere

On 01/20/2012 09:54 PM, Mitch Harder wrote:

On Fri, Jan 20, 2012 at 10:48 AM, Vincent Vanackere
  wrote:

On 01/19/2012 05:24 PM, Mitch Harder wrote:

On Thu, Jan 19, 2012 at 8:42 AM, Vincent Vanackere
wrote:

Hi,

With the most current git kernel
(90a4c0f51e8e44111a926be6f4c87af3938a79c3)
I'm still getting the same reproducible kernel panic when trying to read
a particular file stored on a btrfs filesystem (as seen in the log there
are indeed disk media errors on this disk).
I'd like the "software" part of this to be fixed - btrfs should
definitely not oops even in case of media error - before sending the disk
to RMA. Is there anything I can do to make progress on this ?


Is this kernel compiled with "Compile the kernel with debug info" (in
the "Kernel hacking  --->" configuration section)?

It would be nice to have the specific line of code passing the NULL
pointer.


The kernel was compiled with debug information, but modern Linux distributions
make it really hard to keep your debug information, it seems :-(

I see where the find_get_page(...) function called in
extent_range_uptodate has the potential to return a NULL value.

Could you try the following patch, and if it solves your oops and
shows the included warning in your dmesg log, I'll simplify the patch
to drop the printk and submit it to the list.

I only included the printk since your current error log is ambiguous
regarding the specific point where we're getting the NULL pointer
dereference, but I'll pull it out if it works.

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 9d09a4f..35c3a2a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3909,6 +3909,13 @@ int extent_range_uptodate(struct extent_io_tree *tree,
while (start <= end) {
index = start >> PAGE_CACHE_SHIFT;
page = find_get_page(tree->mapping, index);
+   if (unlikely(!page)) {
+   if (printk_ratelimit())
+   printk(KERN_WARNING
+  "btrfs: NULL page in "
+  "extent_range_uptodate()\n");
+   return 1;
+   }
uptodate = PageUptodate(page);
page_cache_release(page);
if (!uptodate) {
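
The shape of the check in the patch above can be tried in userspace with a mock page cache. This is only a sketch under that assumption: struct demo_page, demo_find_page and demo_range_uptodate are invented stand-ins, and the early return of 1 simply mirrors what the patch does, without the printk.

```c
#include <stddef.h>

struct demo_page {
    int uptodate;
};

/* Stand-in for find_get_page(): NULL when nothing is cached at index. */
static struct demo_page *demo_find_page(struct demo_page **cache, size_t n,
                                        size_t index)
{
    return index < n ? cache[index] : NULL;
}

/*
 * Walk the pages in [first, last]. Return 0 as soon as a page that is not
 * up to date is found, and 1 otherwise. A missing page ends the walk with
 * 1, like the patch, instead of being dereferenced.
 */
static int demo_range_uptodate(struct demo_page **cache, size_t n,
                               size_t first, size_t last)
{
    size_t index;

    for (index = first; index <= last; index++) {
        struct demo_page *page = demo_find_page(cache, n, index);

        if (!page)
            return 1; /* same early return as the patch */
        if (!page->uptodate)
            return 0;
    }
    return 1;
}
```

The point of the sketch is only that the lookup result is tested before it is used; whether 1 is the right answer for a missing page is a separate question for the real function.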


Indeed your patch helps. No kernel panic any more... but it looks like 
the task doesn't finish and there's another problem to solve now :


sd 5:0:0:0: [sdd] Unhandled sense code
sd 5:0:0:0: [sdd]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 5:0:0:0: [sdd]  Sense Key : Medium Error [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
70 2f dc 61
sd 5:0:0:0: [sdd]  Add. Sense: Unrecovered read error - auto reallocate failed
sd 5:0:0:0: [sdd] CDB: Read(10): 28 00 70 2f dc 5f 00 00 08 00
end_request: I/O error, dev sdd, sector 1882184801
ata6: EH complete
btrfs: NULL page in extent_range_uptodate()
btrfs: NULL page in extent_range_uptodate()
btrfs bad tree block start 959241011200 959241011200
INFO: task cat:3099 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cat D 8180c600 0  3099   3002 0x
 8801f2b0f618 0086 8801f2b0f5d8 880221018770
 880222c65b80 8801f2b0ffd8 8801f2b0ffd8 8801f2b0ffd8
 8802241816e0 880222c65b80 8801f2b0f5e8 88022fd13e88
Call Trace:
 [] ? __lock_page+0x70/0x70
 [] schedule+0x3f/0x60
 [] io_schedule+0x8f/0xd0
 [] sleep_on_page+0xe/0x20
 [] __wait_on_bit+0x5f/0x90
 [] wait_on_page_bit+0x78/0x80
 [] ? autoremove_wake_function+0x40/0x40
 [] read_extent_buffer_pages+0x471/0x4d0 [btrfs]
 [] ? verify_parent_transid+0x160/0x160 [btrfs]
 [] btree_read_extent_buffer_pages.isra.99+0x8a/0xc0 [btrfs]
 [] read_tree_block+0x41/0x60 [btrfs]
 [] read_block_for_search.isra.34+0xf3/0x3d0 [btrfs]
 [] btrfs_search_slot+0x300/0x8a0 [btrfs]
 [] btrfs_lookup_csum+0x74/0x170 [btrfs]
 [] __btrfs_lookup_bio_sums+0x1af/0x3b0 [btrfs]
 [] btrfs_lookup_bio_sums+0x16/0x20 [btrfs]
 [] btrfs_submit_bio_hook+0x140/0x170 [btrfs]
 [] ? btrfs_real_readdir+0x720/0x720 [btrfs]
 [] submit_one_bio+0x6a/0xa0 [btrfs]
 [] extent_readpages+0xe4/0x100 [btrfs]
 [] ? btrfs_real_readdir+0x720/0x720 [btrfs]
 [] btrfs_readpages+0x1f/0x30 [btrfs]
 [] __do_page_cache_readahead+0x1af/0x250
 [] ra_submit+0x21/0x30
 [] ondemand_readahead+0x115/0x230
 [] ? __do_fault+0x419/0x530
 [] page_cache_sync_readahead+0x31/0x50
 [] generic_file_aio_read+0x438/0x780
 [] do_sync_read+0xd2/0x110
 [] ? security_file_permission+0x93/0xb0
 [] ? rw_verify_area+0x61/0xf0
 [] vfs_read+0xb0/0x180
 [] sys_read+0x4a/0x90
 [] system_call_fastpath+0x16/0x1b


Re: 3.2-rc4: scrubbing locks up the kernel, then hung tasks on boot

2012-01-24 Thread Martin Steigerwald
On Tuesday, 24 January 2012, Arne Jansen wrote:
> On 24.01.2012 11:16, Martin Steigerwald wrote:
> > On Tuesday, 24 January 2012, Arne Jansen wrote:
> >> On 21.01.2012 11:49, Martin Steigerwald wrote:
> >>> On Tuesday, 20 December 2011, Martin Steigerwald wrote:
> >>> 
> >>> When I do btrfs scrub start / the machine locks immediately up
> >>> hard.
> >> 
> >> Can you please give me the output of sysrq-w when the machine is
> >> locked up?
> > 
> > What would be the easiest way to do that?
> > 
> > The machine is locked up, means, no mouse, no keyboard, no ping - as
> > far as I remember I tested a ping, but I can verify -, no nothing.
> 
> Hm, I always use a (hardware) serial console, but I think on default
> loglevel the sysrq-output goes to dmesg only, so you have to adjust
> the loglevel also. Normally I have a ssh into the box running with a
> tail -f /var/log/messages. This locks up very seldom.
> Have you tried some sysrq combinations? It might work, even when the
> keyboard looks locked up in all other respects.

OK, so I may try the SSH way; maybe I get something before the network becomes
nonfunctional. Or maybe I didn't even test ssh/ping and the network still
works. But I think I tried accessing the machine via network; I usually do
this in such cases.

If that does not work, a USB-to-serial adapter might work.

But then, as far as I remember, even audio locked up hard, replaying the same
sample over and over again, so I do not know whether the kernel will do
anything at all.

Aside from that, I got the idea to try scrubbing with kernel 3.1 to
verify whether that's a regression.

Well, I will try some things and report back. Since the / filesystem on that
ThinkPad T23 is the only one, it might have some weird issues. It's one of
my oldest BTRFS filesystems.

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: 3.2-rc4: scrubbing locks up the kernel, then hung tasks on boot

2012-01-24 Thread Arne Jansen
On 24.01.2012 11:16, Martin Steigerwald wrote:
> On Tuesday, 24 January 2012, Arne Jansen wrote:
>> On 21.01.2012 11:49, Martin Steigerwald wrote:
>>> On Tuesday, 20 December 2011, Martin Steigerwald wrote:
>>>
>>> When I do btrfs scrub start / the machine locks immediately up hard.
>>
>> Can you please give me the output of sysrq-w when the machine is locked
>> up?
> 
> What would be the easiest way to do that?
> 
> The machine is locked up, means, no mouse, no keyboard, no ping - as far 
> as I remember I tested a ping, but I can verify -, no nothing.

Hm, I always use a (hardware) serial console, but I think on default loglevel
the sysrq-output goes to dmesg only, so you have to adjust the loglevel also.
Normally I have a ssh into the box running with a tail -f /var/log/messages.
This locks up very seldom.
Have you tried some sysrq combinations? It might work, even when the
keyboard looks locked up in all other respects.

> 
> Serial console with USB serial adapter?
> 
> Thanks,



Re: 3.2-rc4: scrubbing locks up the kernel, then hung tasks on boot

2012-01-24 Thread Martin Steigerwald
On Tuesday, 24 January 2012, Arne Jansen wrote:
> On 21.01.2012 11:49, Martin Steigerwald wrote:
> > On Tuesday, 20 December 2011, Martin Steigerwald wrote:
> > 
> > When I do btrfs scrub start / the machine locks immediately up hard.
> 
> Can you please give me the output of sysrq-w when the machine is locked
> up?

What would be the easiest way to do that?

The machine is locked up, means, no mouse, no keyboard, no ping - as far 
as I remember I tested a ping, but I can verify -, no nothing.

Serial console with USB serial adapter?

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: 3.2-rc4: scrubbing locks up the kernel, then hung tasks on boot

2012-01-24 Thread Arne Jansen
On 21.01.2012 11:49, Martin Steigerwald wrote:
> On Tuesday, 20 December 2011, Martin Steigerwald wrote:
> 
> When I do btrfs scrub start / the machine locks immediately up hard.

Can you please give me the output of sysrq-w when the machine is locked up?

Thanks,
Arne

> 
> Then usually on next boot it stops on space_cache enabled message, but not 
> the one for /, but the one for /home which is mounted later.
> 
> When I then boot with 3.1 it works. BTRFS redos the space_cache then while 
> the machine takes ages to boot - I mean ages - 10 minutes till KDM prompt 
> is no problem there.
> 
> I thought I just mention it here.
> 
> Since I got no hints on what to do, I probably redo both filesystems on the 
> machine. Should that not work out, I switch the box to Ext4.
> 
> btrfs filesystem scrub works on my ThinkPad T520 with 64-bit debian and 
> Intel SSD 320 and one 2,5 inch external drive as well as a 3,5 inch 
> external backup drive both via eSATA, so this seems to be no principal 
> issue. It also works on a workstation at work which has 32-bit debian as 
> well.
> 
> Thanks,
