Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-03-04 Thread Dave Chinner
On Mon, Mar 04, 2013 at 08:32:45AM +0000, Tony Lu wrote:
> Thanks for following up.
> 
> My apologies, I just found that it is one change I made before
> that causes this problem. This change forces mkfs.xfs to format
> xfs partitions with a sectorsize no smaller than 4096 bytes. I
> made it because of a bug in earlier versions of xfs, which used
> (struct page *)->private (a long) as a bitmap to represent each
> block's state within a page (a page could be 64K or larger, in
> which case 128 bits or more are needed to represent each block's
> state within the page).

You do realise that bug doesn't affect x86-64 platforms as they
don't support 64k pages?

> This is reproducible on 2.6.38.6 kernel on X86. But I do not get
> why this change makes the xfs log inconsistent during
> mount/cp/umount operations.

Neither do I, and I don't care to look any further because the
problem is of your own making. In future, please check first that
the bug you are reporting is reproducible on a current upstream
kernel and userspace.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-03-04 Thread Tony Lu
Thanks for following up.

My apologies, I just found that it is one change I made before that causes
this problem. This change forces mkfs.xfs to format xfs partitions with a
sectorsize no smaller than 4096 bytes. I made it because of a bug in earlier
versions of xfs, which used (struct page *)->private (a long) as a bitmap to
represent each block's state within a page (a page could be 64K or larger, in
which case 128 bits or more are needed to represent each block's state within
the page).

This is reproducible on 2.6.38.6 kernel on X86. But I do not get why this 
change makes the xfs log inconsistent during mount/cp/umount operations.

diff -dur xfsprogs-3.1.4.ori/include/xfs_alloc_btree.h xfsprogs-3.1.4/include/xfs_alloc_btree.h
--- xfsprogs-3.1.4.ori/include/xfs_alloc_btree.h	2010-01-30 03:46:13.000000000 +0800
+++ xfsprogs-3.1.4/include/xfs_alloc_btree.h	2013-03-04 16:11:41.000000000 +0800
@@ -59,7 +59,7 @@
 #define XFS_MAX_BLOCKSIZE_LOG  16  /* i.e. 65536 bytes */
 #define XFS_MIN_BLOCKSIZE  (1 << XFS_MIN_BLOCKSIZE_LOG)
 #define XFS_MAX_BLOCKSIZE  (1 << XFS_MAX_BLOCKSIZE_LOG)
-#define XFS_MIN_SECTORSIZE_LOG 9   /* i.e. 512 bytes */
+#define XFS_MIN_SECTORSIZE_LOG 12  /* i.e. 512 bytes */
 #define XFS_MAX_SECTORSIZE_LOG 15  /* i.e. 32768 bytes */
 #define XFS_MIN_SECTORSIZE (1 << XFS_MIN_SECTORSIZE_LOG)
 #define XFS_MAX_SECTORSIZE (1 << XFS_MAX_SECTORSIZE_LOG)
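
The arithmetic behind this change can be sketched in C. This is only an illustration: the helper names `state_bits_needed`, `fits_in_word` and `long_bits` are hypothetical and are not taken from XFS or xfsprogs. It assumes one state bit per block, which is what the 128-bit figure above implies.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: one state bit per block within a page. */
unsigned long state_bits_needed(unsigned long page_size,
                                unsigned long block_size)
{
    assert(block_size != 0 && page_size % block_size == 0);
    return page_size / block_size;
}

/* Hypothetical helper: non-zero when a bitmap of `bits` bits fits
 * in a machine word that is `word_bits` bits wide. */
int fits_in_word(unsigned long bits, unsigned long word_bits)
{
    return bits <= word_bits;
}

/* Width of `long` on the build machine (64 on LP64 Linux targets). */
unsigned long long_bits(void)
{
    return sizeof(long) * CHAR_BIT;
}
```

With a 64K page, `state_bits_needed(65536, 1UL << 9)` is 128, which does not fit in a 64-bit `long`, while `state_bits_needed(65536, 1UL << 12)` is 16, which does. This assumes the filesystem block is at least as large as the sector, so that raising the minimum sector size also raises the minimum block size.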

Thanks
-Tony

>-----Original Message-----
>From: Mark Tinguely [mailto:tingu...@sgi.com]
>Sent: Saturday, March 02, 2013 4:24 AM
>To: Tony Lu
>Cc: Alex Elder; linux-kernel@vger.kernel.org; Chris Metcalf; x...@oss.sgi.com;
>Ben Myers; Dave Chinner; linux-fsde...@vger.kernel.org
>Subject: Re: [PATCH] xfs: Fix possible truncation of log data in
>xlog_bread_noalign()
>
>On 03/01/13 09:51, Mark Tinguely wrote:
>> On 02/26/13 01:28, Tony Lu wrote:
>>> I get a reliable way to reproduce this bug. The logprint and metadump
>>> are attached.
>>>
>>> Kernel version: 2.6.38.8
>>> Mkfs.xfs version: xfsprogs-3.1.1
>>> mkfs.xfs -s size=4096 /dev/sda1
>>>
>>> Run the following mount-cp-umount script to reproduce:
>>> #!/bin/sh
>>> device=/dev/sda1
>>> mount_point=/mnt
>>> times=10
>>>
>>> for ((num=1;num<=$times;num++))
>>> do
>>> echo "$num mount $device $mount_point"
>>> mount $device $mount_point
>>>
>>> echo "cp -rf /bin $mount_point/$num"
>>> cp -rf /bin $mount_point/$num
>>>
>>> echo "$num umount $device $mount_point"
>>> umount $mount_point
>>>
>>> #num=$(($num + 1))
>>> done
>>>
>>> After several times of mount/cp/umount, this xfs crashes, and the xfs
>>> partition can not be mounted any more. Here is the output of console.
>>> -sh-4.1# ./umount-test
>>> 1 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/1
>>> 1 umount /dev/sda1 /mnt
>>> 2 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/2
>>> 2 umount /dev/sda1 /mnt
>>> 3 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/3
>>> 3 umount /dev/sda1 /mnt
>>> 4 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/4
>>> 4 umount /dev/sda1 /mnt
>>> 5 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>>> Ending XFS recovery on filesystem: sda1 (logdev: internal)cp -rf /bin
>>> /mnt/5
>>> 5 umount /dev/sda1 /mnt
>>> 6 mount /dev/sda1 /mnt
>>>
>>> XFS mounting filesystem sda1
>>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>>> Ending XFS recovery on filesystem: sda1 (logdev: internal)Interrupt
>>> cp -rf /bin /mnt/6
>>> 6 umount /dev/sda1 /mnt
>>> 7 mount /dev/sda1 /mnt
>>>
>>> XFS mounting filesystem sda1
>>> cp -rf /bin /mnt/7
>>> 7 umount /dev/sda1 /mnt
>>> Interrupt
>>> 8 mount /dev/sda1 /mnt
>>> XFS mounting filesystem sda1
>>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>>> XFS: xlog_recover_process_data: bad clientid
>>> XFS: log mount/recovery failed: error 5
>>> XFS: log mount failed
>>>
>>> Thanks
>>> -Tony
>>
>> It works fine on a 2.6.32 machine I had sitting around - and I never
>> required log recovery.
>>
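>> I think you need to answer Dave's question as to why your unmounts
>> are requiring recovery?
>>
>> Are there errors in the /var/log/messages?
>>
>> I downloaded the Linux 2.6.38.8 source and will take a look to see
>> if I can recreate the problem.
>>
>> --Mark.
>
>I could not reproduce the problem on a vanilla install. XFS shutdown and
>remounted cleanly running your script (several iterations looping set to
>100).
>
>I started fsstress on another XFS partition on the same disk to see if I
>could force a shutdown race. With CONFIG_XFS_DEBUG=y, I could trigger
>other ASSERTs on the fsstress partition so I never stayed up long enough
>to cause a shutdown race.
>
>Not wanting to patch that version of Linux/XFS, I am bailing here. If
>you want to turn on the XFS debug it may point out why your filesystem
>is not shutting down cleanly.
>
>--Mark.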

Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-03-01 Thread Mark Tinguely

On 03/01/13 09:51, Mark Tinguely wrote:
> On 02/26/13 01:28, Tony Lu wrote:
>> I get a reliable way to reproduce this bug. The logprint and metadump
>> are attached.
>>
>> Kernel version: 2.6.38.8
>> Mkfs.xfs version: xfsprogs-3.1.1
>> mkfs.xfs -s size=4096 /dev/sda1
>>
>> Run the following mount-cp-umount script to reproduce:
>> #!/bin/sh
>> device=/dev/sda1
>> mount_point=/mnt
>> times=10
>>
>> for ((num=1;num<=$times;num++))
>> do
>> echo "$num mount $device $mount_point"
>> mount $device $mount_point
>>
>> echo "cp -rf /bin $mount_point/$num"
>> cp -rf /bin $mount_point/$num
>>
>> echo "$num umount $device $mount_point"
>> umount $mount_point
>>
>> #num=$(($num + 1))
>> done
>>
>> After several times of mount/cp/umount, this xfs crashes, and the xfs
>> partition can not be mounted any more. Here is the output of console.
>> -sh-4.1# ./umount-test
>> 1 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/1
>> 1 umount /dev/sda1 /mnt
>> 2 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/2
>> 2 umount /dev/sda1 /mnt
>> 3 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/3
>> 3 umount /dev/sda1 /mnt
>> 4 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/4
>> 4 umount /dev/sda1 /mnt
>> 5 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>> Ending XFS recovery on filesystem: sda1 (logdev: internal)cp -rf /bin
>> /mnt/5
>> 5 umount /dev/sda1 /mnt
>> 6 mount /dev/sda1 /mnt
>>
>> XFS mounting filesystem sda1
>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>> Ending XFS recovery on filesystem: sda1 (logdev: internal)Interrupt
>> cp -rf /bin /mnt/6
>> 6 umount /dev/sda1 /mnt
>> 7 mount /dev/sda1 /mnt
>>
>> XFS mounting filesystem sda1
>> cp -rf /bin /mnt/7
>> 7 umount /dev/sda1 /mnt
>> Interrupt
>> 8 mount /dev/sda1 /mnt
>> XFS mounting filesystem sda1
>> Starting XFS recovery on filesystem: sda1 (logdev: internal)
>> XFS: xlog_recover_process_data: bad clientid
>> XFS: log mount/recovery failed: error 5
>> XFS: log mount failed
>>
>> Thanks
>> -Tony
>
> It works fine on a 2.6.32 machine I had sitting around - and I never
> required log recovery.
>
> I think you need to answer Dave's question as to why your unmounts
> are requiring recovery?
>
> Are there errors in the /var/log/messages?
>
> I downloaded the Linux 2.6.38.8 source and will take a look to see
> if I can recreate the problem.
>
> --Mark.

I could not reproduce the problem on a vanilla install. XFS shutdown and
remounted cleanly running your script (several iterations looping set to
100).

I started fsstress on another XFS partition on the same disk to see if I
could force a shutdown race. With CONFIG_XFS_DEBUG=y, I could trigger
other ASSERTs on the fsstress partition so I never stayed up long enough
to cause a shutdown race.

Not wanting to patch that version of Linux/XFS, I am bailing here. If
you want to turn on the XFS debug it may point out why your filesystem
is not shutting down cleanly.

--Mark.


Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-03-01 Thread Mark Tinguely

On 02/26/13 01:28, Tony Lu wrote:
> I get a reliable way to reproduce this bug. The logprint and metadump are
> attached.
>
> Kernel version: 2.6.38.8
> Mkfs.xfs version: xfsprogs-3.1.1
> mkfs.xfs -s size=4096 /dev/sda1
>
> Run the following mount-cp-umount script to reproduce:
> #!/bin/sh
> device=/dev/sda1
> mount_point=/mnt
> times=10
>
> for ((num=1;num<=$times;num++))
> do
>  echo "$num mount $device $mount_point"
>  mount $device $mount_point
>
>  echo "cp -rf /bin $mount_point/$num"
>  cp -rf /bin $mount_point/$num
>
>  echo "$num umount $device $mount_point"
>  umount $mount_point
>
> #num=$(($num + 1))
> done
>
> After several times of mount/cp/umount, this xfs crashes, and the xfs partition
> can not be mounted any more. Here is the output of console.
> -sh-4.1# ./umount-test
> 1 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/1
> 1 umount /dev/sda1 /mnt
> 2 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/2
> 2 umount /dev/sda1 /mnt
> 3 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/3
> 3 umount /dev/sda1 /mnt
> 4 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/4
> 4 umount /dev/sda1 /mnt
> 5 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> Starting XFS recovery on filesystem: sda1 (logdev: internal)
> Ending XFS recovery on filesystem: sda1 (logdev: internal)cp -rf /bin /mnt/5
> 5 umount /dev/sda1 /mnt
> 6 mount /dev/sda1 /mnt
>
> XFS mounting filesystem sda1
> Starting XFS recovery on filesystem: sda1 (logdev: internal)
> Ending XFS recovery on filesystem: sda1 (logdev: internal)Interrupt
> cp -rf /bin /mnt/6
> 6 umount /dev/sda1 /mnt
> 7 mount /dev/sda1 /mnt
>
> XFS mounting filesystem sda1
> cp -rf /bin /mnt/7
> 7 umount /dev/sda1 /mnt
> Interrupt
> 8 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> Starting XFS recovery on filesystem: sda1 (logdev: internal)
> XFS: xlog_recover_process_data: bad clientid
> XFS: log mount/recovery failed: error 5
> XFS: log mount failed
>
> Thanks
> -Tony

It works fine on a 2.6.32 machine I had sitting around - and I never
required log recovery.

I think you need to answer Dave's question as to why your unmounts
are requiring recovery?

Are there errors in the /var/log/messages?

I downloaded the Linux 2.6.38.8 source and will take a look to see
if I can recreate the problem.

--Mark.



Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-26 Thread Dave Chinner
On Tue, Feb 26, 2013 at 07:28:19AM +0000, Tony Lu wrote:
> I get a reliable way to reproduce this bug. The logprint and metadump are 
> attached.
> 
> Kernel version: 2.6.38.8

This is important, because this:

> 4 umount /dev/sda1 /mnt
> 5 mount /dev/sda1 /mnt
> XFS mounting filesystem sda1
> Starting XFS recovery on filesystem: sda1 (logdev: internal)
> Ending XFS recovery on filesystem: sda1 (logdev: internal)

indicates that the unmount record is either not being written, is
being written when the log has not been fully flushed, or log
recovery is not finding it. You need to copy out the log
first to determine what the state of the log is before you mount the
filesystem - that way if log recovery is run you can see whether it
was supposed to run. (i.e. a clean log should never run recovery,
and unmount should always leave a clean log).

Either way, I'm more than 10,000 iterations into a run of 100k
iterations of this script on 3.8.0, and I have not seen a single log
recovery attempt occur. That implies you are seeing a bug in 2.6.38
that has since been fixed. It would be a good idea for you to
upgrade the system to a 3.8 kernel and determine if you can still
reproduce the problem on your system - that way we'll know if the
bug really has been fixed or not.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-24 Thread Dave Chinner
On Sun, Feb 24, 2013 at 04:46:30AM +0000, Tony Lu wrote:
> >> For example, if xlog_bread_noalign() wants to read blocks from #1
> >> to # 9, in which case the passed parameter blk_no is 1, and nbblks
> >> is 8, sectBBsize is 8, after the round down and round up
> >> operations, we get blk_no as 0, and nbblks as still 8. We
> >> definitely lose the last block of the log data.
> >
> >Yes, I fully understand that. But I also understand how the log
> >works and that this behaviour *should not happen*. That's why
> >I'm asking questions about what the problem you are trying to fix.
> 
> I am not sure about this, since I saw many reads on
> non-sector-align blocks even when successfully mounting good XFS
> partitions.

I didn't say that non-sector align reads should not be attempted by
log recovery - it's obvious from the on disk format of the log that
we have to parse it in chunks of 512 bytes to make sense of its
contents, and that leads to the 512 byte reads and other subsequent
unaligned reads.

*However*

Seeing that there are unaligned reads occurring does not mean that
the structures in the log should be unaligned. Your test output
indicated a log record header at an unaligned block address, and
that's incorrect. It doesn't matter what the rest of the log
recovery code does with non-aligned IO - the fact is that your debug
implies that the contents of the log are corrupt, and that implies a
deeper problem.

> And also there is code in xlog_write_log_records() which handles
> non-sector-align reads and writes.

Yes, it does handle it, but that doesn't mean that it is correct to
pass unaligned block ranges to it.

Cheers,

Dave.

-- 
Dave Chinner
da...@fromorbit.com


RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-23 Thread Tony Lu
>> For example, if xlog_bread_noalign() wants to read blocks from #1
>> to # 9, in which case the passed parameter blk_no is 1, and nbblks
>> is 8, sectBBsize is 8, after the round down and round up
>> operations, we get blk_no as 0, and nbblks as still 8. We
>> definitely lose the last block of the log data.
>
>Yes, I fully understand that. But I also understand how the log
>works and that this behaviour *should not happen*. That's why
>I'm asking questions about what the problem you are trying to fix.

I am not sure about this, since I saw many reads on non-sector-align blocks 
even when successfully mounting good XFS partitions. 
-sh-4.1# mount /dev/sda3 /home/
XFS (sda3): Mounting Filesystem
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=61447,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
...
xlog_bread_noalign:blk_no=8695,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=4600,nbblks=4096,l_sectBBsize=8
xlog_bread_noalign:blk_no=8184,nbblks=512,l_sectBBsize=8

Also, there is code in xlog_write_log_records() which handles
non-sector-aligned reads and writes.

/* We may need to do a read at the start to fill in part of
 * the buffer in the starting sector not covered by the first
 * write below.
 */
balign = round_down(start_block, sectbb);
if (balign != start_block) {
	error = xlog_bread_noalign(log, start_block, 1, bp);
	if (error)
		goto out_put_bp;

	j = start_block - balign;
}
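
To illustrate the snippet above, here is a minimal userspace sketch of the
offset arithmetic only. It is not kernel code: head_skip_bbs() is a
hypothetical name, and sect_bb stands in for the log sector size in basic
blocks (l_sectBBsize in the debug output).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): how many basic blocks of
 * the first log sector precede start_block. This mirrors
 * "j = start_block - balign" in the snippet above.
 */
static uint32_t head_skip_bbs(uint64_t start_block, uint32_t sect_bb)
{
	/* Round start_block down to the start of its sector. */
	uint64_t balign = start_block - (start_block % sect_bb);

	return (uint32_t)(start_block - balign);
}
```

With a sector size of 8 basic blocks, head_skip_bbs(8695, 8) returns 7 (a
mid-sector start), while head_skip_bbs(4600, 8) returns 0 (already aligned).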

>Ramdisks don't persist over a reboot, so you must have had some
>other way of reproducing the problem. Can you tell me how you
>reproduced it on a ramdisk? Better yet, send me a script that
>reproduces the problem?

I will try to reproduce it. Basically it is a loop of mounting, creating many
files, and unmounting.

Thanks
-Tony


Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-23 Thread Dave Chinner
On Sat, Feb 23, 2013 at 07:06:10AM +, Tony Lu wrote:
> >From: Dave Chinner [mailto:da...@fromorbit.com]
> >On Fri, Feb 22, 2013 at 08:12:52AM +, Tony Lu wrote:
> >> I encountered the following panic when using xfs partitions as rootfs,
> >> which is due to the truncated log data read by xlog_bread_noalign(). We
> >> should extend the buffer by one extra log sector to ensure there's enough
> >> space to accommodate requested log data, which we indeed did in
> >> xlog_get_bp(), but we forgot to do in xlog_bread_noalign().
> >
> >We've never done that round up in xlog_bread_noalign(). It shouldn't
> >be necessary as xlog_get_bp() and xlog_bread_noalign() are doing
> >fundamentally different things. That is, xlog_get_bp() is ensuring
> >the buffer is large enough for the upcoming IO that will be
> >requested, while xlog_bread_noalign() is simply ensuring what it is
> >passed is correctly aligned to device sector boundaries.
> 
> I set the sector size as 4096 when making the xfs filesystem.
> -sh-4.1# mkfs.xfs -s size=4096 -f /dev/sda3

.

> In this case, xlog_bread_noalign() needs to do such round up and
> round down frequently. And it is used to ensure what it is passed
> is aligned to the log sector size, but not the device sector
> boundaries.

If you have a 4k sector device, then the log sector size is the same
as the physical device. Hence the log code assumes that if you have
a specific log sector size, it is operating on a device that has the
physical IO constraints of that sector size.

> >So, if you have to fudge an extra block for xlog_bread_noalign(),
> >that implies that what xlog_bread_noalign() was passed was
> >probably not correct. It also implies that you are using sector
> >sizes larger than 512 bytes, because that's the only time this
> >might matter. Put simply, this:
> 
> While debugging, I found when it crashed, the blk_no was not aligned
> to the log sector size and nbblks was aligned to the log sector
> size, which makes sense.

Actually, it doesn't. The log writes done by the kernel are supposed
to be aligned and padded to sector size, which means that we should
never see unaligned block numbers when reading log buffer headers
back off disk. i.e. when you run mkfs.xfs -s size=4096, you end up
with a log stripe unit of 4096 bytes, which means it pads
every write to 4096 byte boundaries rather than 512 byte boundaries.
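
A minimal sketch of that padding rule, assuming a log stripe unit given in
bytes. pad_to_lsunit() is a hypothetical name used for illustration, not an
XFS function.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pad a log write of `len` bytes up to the next
 * multiple of the log stripe unit `lsunit` (any non-zero value).
 */
static uint32_t pad_to_lsunit(uint32_t len, uint32_t lsunit)
{
	uint32_t rem = len % lsunit;

	return rem ? len + (lsunit - rem) : len;
}
```

For a 5000 byte write, pad_to_lsunit(5000, 4096) gives 8192, whereas padding
to 512 bytes gives 5120. If every write starts on a stripe-unit boundary and
is padded like this, the next record header also lands on a boundary.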

> Starting XFS recovery on filesystem: ram0 (logdev: internal)

Ok, you're using a ramdisk and not a real 4k sector device. Hence it
won't fail if we do 512 byte aligned IO rather than 4k aligned IO.

> xlog_bread_noalign--before round down/up: blk_no=0xf4d,nbblks=0x1
> xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x4
> xlog_bread_noalign--before round down/up: blk_no=0xf4d,nbblks=0x1
> xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x4
> xlog_bread_noalign--before round down/up: blk_no=0xf4e,nbblks=0x3f
> xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x40
> XFS: xlog_recover_process_data: bad clientid
>
> For example, if xlog_bread_noalign() wants to read blocks from #1
> to # 9, in which case the passed parameter blk_no is 1, and nbblks
> is 8, sectBBsize is 8, after the round down and round up
> operations, we get blk_no as 0, and nbblks as still 8. We
> definitely lose the last block of the log data.

Yes, I fully understand that. But I also understand how the log
works and that this behaviour *should not happen*. That's why
I'm asking questions about what problem you are trying to fix.

The issue here is that the log buffer write was not aligned to the
underlying sector size. That is, what we see here is a header block
read, followed by the log buffer data read. The header size is
determined by the iclogbuf size - a 512 byte block implies default
32k iclogbuf size - and the following data region read of 63 blocks
also indicates a 32k iclogbuf size.

IOWs, what we have here is a 32k log buffer write apparently at a
sector-unaligned block address (0xf4d = 3917 which is not a multiple
of 8). This is why log recovery went wrong: a fundamental
architectural assumption the log is built around has somehow been
violated.
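
The arithmetic in the two paragraphs above can be checked with a small
sketch. BBSIZE is the 512 byte basic block used throughout this thread; the
helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define BBSIZE 512u	/* one basic block, in bytes */

/* True if blk_no sits on a log sector boundary of sect_bb basic blocks. */
static bool bb_is_sector_aligned(uint64_t blk_no, uint32_t sect_bb)
{
	return blk_no % sect_bb == 0;
}

/* Size in bytes of a log record made of header and data basic blocks. */
static uint32_t record_bytes(uint32_t hdr_bbs, uint32_t data_bbs)
{
	return (hdr_bbs + data_bbs) * BBSIZE;
}
```

bb_is_sector_aligned(0xf4d, 8) is false (0xf4d is 3917), and
record_bytes(1, 63) is 32768, i.e. one header block plus 63 data blocks
make up a 32k iclogbuf.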

That is, the log recovery failure does not appear to be a problem
with the sector alignment done by xlog_bread_noalign() - it appears
to be a failure with the alignment of log buffer IO written to the
log. That's a far more serious problem than a log recovery problem,
but I can't see how that could occur and so I need a test case that
reproduces the recovery failure for deeper analysis.

> I was using the 2.6.38.6 kernel, and using xfs as a rootfs
> partition. After untarring the rootfs files onto the xfs partition
> and trying to reboot from it, the panic occasionally occurred.

Ramdisks don't persist over a reboot, so you must have had some
other way of reproducing the problem. Can you tell me how you
reproduced it on a ramdisk? Better yet, send me a script that
reproduces the problem?

RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-23 Thread Tony Lu
>-Original Message-
>From: Ben Myers [mailto:b...@sgi.com]
>
>Hi Tony,
>
>On Fri, Feb 22, 2013 at 08:12:52AM +, Tony Lu wrote:
>> I encountered the following panic when using xfs partitions as rootfs, which
>> is due to the truncated log data read by xlog_bread_noalign(). We should
>> extend the buffer by one extra log sector to ensure there's enough space to
>> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
>> forgot to do in xlog_bread_noalign().
>>
>> XFS mounting filesystem sda2
>> Starting XFS recovery on filesystem: sda2 (logdev: internal)
>> XFS: xlog_recover_process_data: bad clientid
>> XFS: log mount/recovery failed: error 5
>> XFS: log mount failedVFS: Cannot open root device "sda2" or unknown-block(8,)
>> Please append a correct "root=" boot option; here are the available partitio:
>> 0800   156290904 sda  driver: sd
>>   080131463271 sda1 ----
>>   080231463302 sda2 ----
>>   080331463302 sda3 ----
>>   0804   1 sda4 ----
>>   080510490413 sda5 ----
>>   080651407968 sda6 ----
>> Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,)
>>
>> Starting stack dump of tid 1, pid 1 (swapper) on cpu 35 at cycle 42273138234
>>   frame 0: 0xfff70016e5a0 dump_stack+0x0/0x20 (sp 0xfe03fbedfe88)
>>   frame 1: 0xfff7004af470 panic+0x150/0x3a0 (sp 0xfe03fbedfe88)
>>   frame 2: 0xfff700881e88 mount_block_root+0x2c0/0x4c8 (sp 0xfe03fbe)
>>   frame 3: 0xfff700882390 prepare_namespace+0x250/0x358 (sp 0xfe03fb)
>>   frame 4: 0xfff700880778 kernel_init+0x4c8/0x520 (sp 0xfe03fbedffb0)
>>   frame 5: 0xfff70011ecb8 start_kernel_thread+0x18/0x20 (sp 0xfe03fb)
>> Stack dump complete
>>
>> Signed-off-by: Zhigang Lu <z...@tilera.com>
>> Reviewed-by: Chris Metcalf <cmetc...@tilera.com>
>
>Looks fine to me.  I'll pull it in after some testing.
>
>Do you happen to have a metadump of this filesystem?
>
>Reviewed-by: Ben Myers <b...@sgi.com>

Sorry, I did not keep a metadump of it, but I kept some debugging output from
when I debugged and fixed it a year ago.

Starting XFS recovery on filesystem: ram0 (logdev: internal)
xlog_bread_noalign--before round down/up: blk_no=0xf4d,nbblks=0x1
xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x4
xlog_bread_noalign--before round down/up: blk_no=0xf4d,nbblks=0x1
xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x4
xlog_bread_noalign--before round down/up: blk_no=0xf4e,nbblks=0x3f
xlog_bread_noalign--after round down/up: blk_no=0xf4c,nbblks=0x40
XFS: xlog_recover_process_data: bad clientid
Assertion failed: 0, file: /home/scratch/zlu/zlu-main/sys/linux/source/fs/xfs/xfs_log_recover.c, line: 2852
BUG: failure at /home/scratch/zlu/zlu-main/sys/linux/source/fs/xfs/support/debug.c:100/assfail()!
Kernel panic - not syncing: BUG!

Starting stack dump of tid 843, pid 843 (mount) on cpu 1 at cycle 345934778384
  frame 0: 0xfff7001380a0 dump_stack+0x0/0x20 (sp 0xfe43e55df7b0)
  frame 1: 0xfff7003b5470 panic+0x150/0x3a0 (sp 0xfe43e55df7b0)
  frame 2: 0xfff700824cf0 assfail+0x80/0x80 (sp 0xfe43e55df858)
  frame 3: 0xfff70037c7c0 xlog_recover_process_data+0x598/0x698 (sp 0xfe43e55df868)
  frame 4: 0xfff7002c55e8 xlog_do_recovery_pass+0x810/0x908 (sp 0xfe43e55df8e8)
  frame 5: 0xfff70068f0d8 xlog_do_log_recovery+0xc8/0x1d8 (sp 0xfe43e55dfa48)
  frame 6: 0xfff70054cf60 xlog_do_recover+0x48/0x380 (sp 0xfe43e55dfa88)
  frame 7: 0xfff7006fdbf0 xlog_recover+0x138/0x170 (sp 0xfe43e55dfac0)
  frame 8: 0xfff7005b2d70 xfs_log_mount+0x150/0x2e8 (sp 0xfe43e55dfb00)
  frame 9: 0xfff700269830 xfs_mountfs+0x510/0xb20 (sp 0xfe43e55dfb38)
  frame 10: 0xfff700486930 xfs_fs_fill_super+0x2e0/0x3f0 (sp 0xfe43e55dfba8)
  frame 11: 0xfff7000950c8 mount_bdev+0x168/0x2d0 (sp 0xfe43e55dfbe0)
  frame 12: 0xfff700071e08 vfs_kern_mount+0x110/0x408 (sp 0xfe43e55dfc50)
  frame 13: 0xfff7000badf8 do_kern_mount+0x68/0x1e0 (sp 0xfe43e55dfc98)
  frame 14: 0xfff700046470 do_mount+0x200/0x878 (sp 0xfe43e55dfcd8)
  frame 15: 0xfff7000c8050 sys_mount+0xd0/0x1a0 (sp 0xfe43e55dfd60)
  frame 16: 0xfff7001a2c30 handle_syscall+0x280/0x340 (sp 0xfe43e55dfdc0)
  <syscall while in user mode>
  frame 17: 0xd46688 libc-2.12.so[c2+1d] (sp 0x1ddf4b0)
  frame 18: 0x160 mount[155+2] (sp 0x1ddf4b0)
  frame 19: 0x1557dc0 mount[155+2] (sp 0x1ddf500)
  frame 20: 0x1558a80 mount[155+2] (sp 0x1ddf858)
  frame 21: 0x1559a60 mount[155+2] (sp 0x1ddf930)
  frame 22: 0xc3e5e8 libc-2.12.so[c2+1d] (sp 0x1ddfaf8)
Stack dump complete
Client requested halt.

Thanks
-Tony

RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-22 Thread Tony Lu
>From: Dave Chinner [mailto:da...@fromorbit.com]
>On Fri, Feb 22, 2013 at 08:12:52AM +, Tony Lu wrote:
>> I encountered the following panic when using xfs partitions as rootfs, which
>> is due to the truncated log data read by xlog_bread_noalign(). We should
>> extend the buffer by one extra log sector to ensure there's enough space to
>> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
>> forgot to do in xlog_bread_noalign().
>
>We've never done that round up in xlog_bread_noalign(). It shouldn't
>be necessary as xlog_get_bp() and xlog_bread_noalign() are doing
>fundamentally different things. That is, xlog_get_bp() is ensuring
>the buffer is large enough for the upcoming IO that will be
>requested, while xlog_bread_noalign() is simply ensuring what it is
>passed is correctly aligned to device sector boundaries.

I set the sector size as 4096 when making the xfs filesystem.
-sh-4.1# mkfs.xfs -s size=4096 -f /dev/sda3

In this case, xlog_bread_noalign() needs to do such round up and round down 
frequently. And it is used to ensure what it is passed is aligned to the log 
sector size, but not the device sector boundaries.

Here is the debug info I added when mounting this xfs partition.
-sh-4.1# mount /dev/sda3 /home/
XFS (sda3): Mounting Filesystem
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=61447,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
...
xlog_bread_noalign:blk_no=8695,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=4600,nbblks=4096,l_sectBBsize=8
xlog_bread_noalign:blk_no=8184,nbblks=512,l_sectBBsize=8

>So, if you have to fudge an extra block for xlog_bread_noalign(),
>that implies that what xlog_bread_noalign() was passed was probably
>not correct. It also implies that you are using sector sizes larger
>than 512 bytes, because that's the only time this might matter. Put
>simply, this:

While debugging, I found when it crashed, the blk_no was not aligned to the log
sector size and nbblks was aligned to the log sector size, which makes sense.

For example, suppose xlog_bread_noalign() wants to read blocks #1 to #9, so
the passed blk_no is 1, nbblks is 8, and sectBBsize is 8. After the round down
and round up operations, blk_no becomes 0 and nbblks is still 8, so we
definitely lose the last block of the log data.
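
The example above can be reproduced with a small userspace model of the
rounding. This is a sketch, not the kernel function: bb_round_down(),
bb_round_up() and bb_blocks_lost() are hypothetical helpers, and the
alignment is assumed to be a power of two.

```c
#include <stdint.h>

/* Hypothetical stand-ins for round_down()/round_up(); align must be 2^n. */
static uint64_t bb_round_down(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

static uint64_t bb_round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Number of requested basic blocks that fall past the end of the read
 * once blk_no is rounded down and nbblks is rounded up independently.
 */
static uint64_t bb_blocks_lost(uint64_t blk_no, uint64_t nbblks,
			       uint64_t sect_bb)
{
	uint64_t req_end = blk_no + nbblks;	/* first block not requested */
	uint64_t rd_start = bb_round_down(blk_no, sect_bb);
	uint64_t rd_end = rd_start + bb_round_up(nbblks, sect_bb);

	return req_end > rd_end ? req_end - rd_end : 0;
}
```

bb_blocks_lost(1, 8, 8) is 1, matching the example, while an aligned start
such as bb_blocks_lost(0, 8, 8) loses nothing. The model says nothing about
whether an unaligned start should ever reach this code.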

>> XFS mounting filesystem sda2
>> Starting XFS recovery on filesystem: sda2 (logdev: internal)
>> XFS: xlog_recover_process_data: bad clientid
>> XFS: log mount/recovery failed: error 5
>> XFS: log mount failed
>
>Is not sufficient information for me to determine if you've correctly
>analysed the problem you were seeing and that this is the correct
>fix for it. I don't even know what kernel you are seeing this on, or
>how you are reproducing it.

I was using the 2.6.38.6 kernel, with xfs as the rootfs partition. After
untarring the rootfs files onto the xfs partition and trying to reboot from
it, the panic occasionally occurred.

>
>Note that I'm not saying the fix isn't necessary or correct, just
>that I cannot review it based on this commit message.  Given that this
>code is essentially unchanged in behaviour since the large sector
>size support was added in 2003(*), understanding how it is
>deficient is a critical part of the review process.
>
>Information you need to provide so I have a chance of reviewing
>whether it is correct or not:
>
>   - what kernel you saw this on,
>   - what the filesystem configuration was
>   - what workload reproduced this problem (a test case would
> be nice, and xfstest even better)
>   - the actual contents of the log that led to the short read
> during recovery
>   - whether xfs_logprint was capable of parsing the log
> correctly
>   - where in the actual log recovery process the failure
> occurred (e.g. was it trying to recover transactions from
> a section of a wrapped log?)

I wish I could provide the corrupted log for you, but I probably cannot find
it, since I fixed this bug a year ago. I came across this fix recently while
cleaning up my code, so I thought I should contribute it back to the community.

>IOWs, please show your working so we can determine if this is the
>root cause of the problem you are seeing. :)
>
>(*)
>http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=f14e527f411712f89178c31370b5d733ea1d0280
>
>FWIW, I think your change might need work - there's the possibility
>that it can round up the length beyond the end of the log if we ask
>to read up to the last sector of the log (i.e. blkno + blklen ==
>end of log) and then round up blklen by one sector.
>
Good catch, you are right about this. To avoid this possibility, I changed the
patch a little, as follows.
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -171,6 +171,7 @@ xlog_bread_noalign(
struct xfs_buf  *bp)
 {

Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-22 Thread Dave Chinner
On Fri, Feb 22, 2013 at 08:12:52AM +, Tony Lu wrote:
> I encountered the following panic when using xfs partitions as rootfs, which
> is due to the truncated log data read by xlog_bread_noalign(). We should
> extend the buffer by one extra log sector to ensure there's enough space to
> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
> forgot to do in xlog_bread_noalign().

We've never done that round up in xlog_bread_noalign(). It shouldn't
be necessary as xlog_get_bp() and xlog_bread_noalign() are doing
fundamentally different things. That is, xlog_get_bp() is ensuring
the buffer is large enough for the upcoming IO that will be
requested, while xlog_bread_noalign() is simply ensuring what it is
passed is correctly aligned to device sector boundaries.
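
A simplified model of the distinction drawn above, assuming xlog_get_bp()
adds one spare log sector to the buffer (as the patch description says)
while xlog_bread_noalign() only rounds the I/O length to whole sectors. The
function names are illustrative, not the kernel's, and sect_bb is assumed to
be a power of two.

```c
#include <stdint.h>

/* Round x up to a multiple of align (align must be a power of two). */
static uint32_t round_up_bb(uint32_t x, uint32_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Buffer size in basic blocks: a multi-block request may start mid-sector,
 * so one spare sector is reserved before rounding to whole sectors.
 */
static uint32_t model_get_bp_bbs(uint32_t nbblks, uint32_t sect_bb)
{
	if (nbblks > 1 && sect_bb > 1)
		nbblks += sect_bb;
	return round_up_bb(nbblks, sect_bb);
}

/* I/O length in basic blocks: only rounded to whole sectors. */
static uint32_t model_bread_bbs(uint32_t nbblks, uint32_t sect_bb)
{
	return round_up_bb(nbblks, sect_bb);
}
```

For an 8 block request with 8 block sectors, model_get_bp_bbs(8, 8) is 16
and model_bread_bbs(8, 8) is 8: the buffer has room for a straddling read,
but the read itself is not widened.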

So, if you have to fudge an extra block for xlog_bread_noalign(),
that implies that what xlog_bread_noalign() was passed was probably
not correct. It also implies that you are using sector sizes larger
than 512 bytes, because that's the only time this might matter. Put
simply, this:

> XFS mounting filesystem sda2
> Starting XFS recovery on filesystem: sda2 (logdev: internal)
> XFS: xlog_recover_process_data: bad clientid
> XFS: log mount/recovery failed: error 5
> XFS: log mount failed

Is not sufficient information for me to determine if you've correctly
analysed the problem you were seeing and that this is the correct
fix for it. I don't even know what kernel you are seeing this on, or
how you are reproducing it.

Note that I'm not saying the fix isn't necessary or correct, just
that I cannot review it based on this commit message.  Given that this
code is essentially unchanged in behaviour since the large sector
size support was added in 2003(*), understanding how it is
deficient is a critical part of the review process.

Information you need to provide so I have a chance of reviewing
whether it is correct or not:

- what kernel you saw this on,
- what the filesystem configuration was
- what workload reproduced this problem (a test case would
  be nice, and xfstest even better)
- the actual contents of the log that led to the short read
  during recovery
- whether xfs_logprint was capable of parsing the log
  correctly
- where in the actual log recovery process the failure
  occurred (e.g. was it trying to recover transactions from
  a section of a wrapped log?)

IOWs, please show your working so we can determine if this is the
root cause of the problem you are seeing. :)

(*) 
http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=f14e527f411712f89178c31370b5d733ea1d0280

FWIW, I think your change might need work - there's the possibility
that it can round up the length beyond the end of the log if we ask
to read up to the last sector of the log (i.e. blkno + blklen ==
end of log) and then round up blklen by one sector.
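
One way to picture that boundary case is a hypothetical sketch that widens a
read to whole sectors but clamps it at the end of the log. This is not the
revised patch from this thread; clamped_read_bbs() and log_bbs are assumed
names, and the log size is assumed to be a multiple of the sector size.

```c
#include <stdint.h>

/*
 * Hypothetical helper: length in basic blocks of a sector-aligned read
 * that covers [blk_no, blk_no + nbblks) without passing log_bbs, the
 * size of the log in basic blocks.
 */
static uint32_t clamped_read_bbs(uint64_t blk_no, uint32_t nbblks,
				 uint32_t sect_bb, uint64_t log_bbs)
{
	uint64_t start = blk_no - (blk_no % sect_bb);	/* round down */
	uint64_t end = blk_no + nbblks;	/* first block not requested */
	uint64_t rem = end % sect_bb;

	if (rem)
		end += sect_bb - rem;	/* cover the final partial sector */
	if (end > log_bbs)
		end = log_bbs;		/* never read past the end of the log */

	return (uint32_t)(end - start);
}
```

In a 64 block log with 8 block sectors, clamped_read_bbs(1, 8, 8, 64) is 16,
while a read that ends exactly at the end of the log,
clamped_read_bbs(56, 8, 8, 64), stays at 8.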

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-22 Thread Ben Myers
Hi Tony,

On Fri, Feb 22, 2013 at 08:12:52AM +, Tony Lu wrote:
> I encountered the following panic when using xfs partitions as rootfs, which
> is due to the truncated log data read by xlog_bread_noalign(). We should
> extend the buffer by one extra log sector to ensure there's enough space to
> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
> forgot to do in xlog_bread_noalign().
> 
> XFS mounting filesystem sda2
> Starting XFS recovery on filesystem: sda2 (logdev: internal)
> XFS: xlog_recover_process_data: bad clientid
> XFS: log mount/recovery failed: error 5
> XFS: log mount failedVFS: Cannot open root device "sda2" or unknown-block(8,)
> Please append a correct "root=" boot option; here are the available partitio:
> 0800   156290904 sda  driver: sd
>   080131463271 sda1 ----
>   080231463302 sda2 ----
>   080331463302 sda3 ----
>   0804   1 sda4 ----
>   080510490413 sda5 ----
>   080651407968 sda6 ----
> Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,)
> 
> Starting stack dump of tid 1, pid 1 (swapper) on cpu 35 at cycle 42273138234
>   frame 0: 0xfff70016e5a0 dump_stack+0x0/0x20 (sp 0xfe03fbedfe88)
>   frame 1: 0xfff7004af470 panic+0x150/0x3a0 (sp 0xfe03fbedfe88)
>   frame 2: 0xfff700881e88 mount_block_root+0x2c0/0x4c8 (sp 0xfe03fbe)
>   frame 3: 0xfff700882390 prepare_namespace+0x250/0x358 (sp 0xfe03fb)
>   frame 4: 0xfff700880778 kernel_init+0x4c8/0x520 (sp 0xfe03fbedffb0)
>   frame 5: 0xfff70011ecb8 start_kernel_thread+0x18/0x20 (sp 0xfe03fb)
> Stack dump complete
> 
> Signed-off-by: Zhigang Lu 
> Reviewed-by: Chris Metcalf 

Looks fine to me.  I'll pull it in after some testing.

Do you happen to have a metadump of this filesystem?

Reviewed-by: Ben Myers 

Thanks!
-Ben


Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-22 Thread Dave Chinner
On Fri, Feb 22, 2013 at 08:12:52AM +, Tony Lu wrote:
> I encountered the following panic when using xfs partitions as rootfs, which
> is due to the truncated log data read by xlog_bread_noalign(). We should
> extend the buffer by one extra log sector to ensure there's enough space to
> accommodate requested log data, which we indeed did in xlog_get_bp(), but we
> forgot to do in xlog_bread_noalign().

We've never done that round up in xlog_bread_noalign(). It shouldn't
be necessary as xlog_get_bp() and xlog_bread_noalign() are doing
fundamentally different things. That is, xlog_get_bp() is ensuring
the buffer is large enough for the upcoming IO that will be
requested, while xlog_bread_noalign() is simply ensuring what it is
passed is correctly aligned to device sector boundaries.

So, if you have to fudge an extra block for xlog_bread_noalign(),
that implies that what xlog_bread_noalign() was passed was probably
not correct. It also implies that you are using sector sizes larger
than 512 bytes, because that's the only time this might matter. Put
simply, this:

> XFS mounting filesystem sda2
> Starting XFS recovery on filesystem: sda2 (logdev: internal)
> XFS: xlog_recover_process_data: bad clientid
> XFS: log mount/recovery failed: error 5
> XFS: log mount failed

is not sufficient information for me to determine whether you've correctly
analysed the problem you were seeing and whether this is the correct
fix for it. I don't even know what kernel you are seeing this on, or
how you are reproducing it.

Note that I'm not saying the fix isn't necessary or correct, just
that I cannot review it based on this commit message.  Given that this
code is essentially unchanged in behaviour since the large sector
size support was added in 2003(*), understanding how it is
deficient is a critical part of the review process.

Information you need to provide so I have a chance of reviewing
whether it is correct or not:

- what kernel you saw this on,
- what the filesystem configuration was,
- what workload reproduced this problem (a test case would
  be nice, and an xfstest even better),
- the actual contents of the log that led to the short read
  during recovery,
- whether xfs_logprint was capable of parsing the log
  correctly, and
- where in the actual log recovery process the failure
  occurred (e.g. was it trying to recover transactions from
  a section of a wrapped log?)

IOWs, please show your working so we can determine if this is the
root cause of the problem you are seeing. :)

(*) 
http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=f14e527f411712f89178c31370b5d733ea1d0280

FWIW, I think your change might need work - there's the possibility
that it can round up the length beyond the end of the log if we ask
to read up to the last sector of the log (i.e. blkno + blklen ==
end of log) and then round up blklen by one sector.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


RE: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-22 Thread Tony Lu
From: Dave Chinner [mailto:da...@fromorbit.com]
> On Fri, Feb 22, 2013 at 08:12:52AM +, Tony Lu wrote:
> > I encountered the following panic when using xfs partitions as rootfs, which
> > is due to the truncated log data read by xlog_bread_noalign(). We should
> > extend the buffer by one extra log sector to ensure there's enough space to
> > accommodate requested log data, which we indeed did in xlog_get_bp(), but we
> > forgot to do in xlog_bread_noalign().
>
> We've never done that round up in xlog_bread_noalign(). It shouldn't
> be necessary as xlog_get_bp() and xlog_bread_noalign() are doing
> fundamentally different things. That is, xlog_get_bp() is ensuring
> the buffer is large enough for the upcoming IO that will be
> requested, while xlog_bread_noalign() is simply ensuring what it is
> passed is correctly aligned to device sector boundaries.

I set the sector size to 4096 when making the xfs filesystem:
-sh-4.1# mkfs.xfs -s size=4096 -f /dev/sda3

In this case, xlog_bread_noalign() has to round down and round up
frequently. Note that the rounding aligns what it is passed to the log
sector size, not to the device sector boundaries.

Here is the debug info I added when mounting this xfs partition.
-sh-4.1# mount /dev/sda3 /home/
XFS (sda3): Mounting Filesystem
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=61447,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=0,nbblks=1,l_sectBBsize=8
...
xlog_bread_noalign:blk_no=8695,nbblks=1,l_sectBBsize=8
xlog_bread_noalign:blk_no=4600,nbblks=4096,l_sectBBsize=8
xlog_bread_noalign:blk_no=8184,nbblks=512,l_sectBBsize=8
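
For reference, l_sectBBsize=8 in the output above is consistent with the 4096-byte sector size: XFS counts log blocks in 512-byte basic blocks, so 4096 / 512 = 8. A minimal sketch of that conversion follows; `MODEL_BBSHIFT` and `sector_bytes_to_bbs()` are illustrative names, not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* log2 of the 512-byte basic block size used for log block numbers. */
#define MODEL_BBSHIFT 9

/* Convert a power-of-two sector size in bytes to a count of basic blocks. */
static int64_t sector_bytes_to_bbs(int64_t sector_bytes)
{
	assert(sector_bytes >= (1 << MODEL_BBSHIFT));
	assert((sector_bytes & (sector_bytes - 1)) == 0);
	return sector_bytes >> MODEL_BBSHIFT;
}
```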

> So, if you have to fudge an extra block for xlog_bread_noalign(),
> that implies that what xlog_bread_noalign() was passed was probably
> not correct. It also implies that you are using sector sizes larger
> than 512 bytes, because that's the only time this might matter. Put
> simply, this:

While debugging, I found that when it crashed, blk_no was not aligned to the
log sector size while nbblks was aligned to it, which makes sense.

For example, suppose xlog_bread_noalign() wants to read blocks #1 through #8
(that is, up to but not including #9), so the parameters passed are blk_no = 1
and nbblks = 8, with sectBBsize = 8. After the round-down and round-up
operations, blk_no becomes 0 and nbblks is still 8, so the read covers blocks
#0 through #7. We definitely lose the last block of the log data.
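
The arithmetic in this example can be checked with a small standalone C sketch. It is not the kernel code: `struct bb_range`, `model_read_range()` and `model_missing_blocks()` are hypothetical names, and the rounding is assumed to be a plain power-of-two round-down of the start block and an independent round-up of the length, as described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a read range in 512-byte basic blocks (BBs). */
struct bb_range {
	int64_t start;	/* first BB actually read */
	int64_t count;	/* number of BBs actually read */
};

/*
 * Assumed behaviour under discussion: round the start down and the
 * length up to the log sector size, each independently of the other.
 * sect must be a power of two.
 */
static struct bb_range model_read_range(int64_t blk_no, int64_t nbblks,
					int64_t sect)
{
	struct bb_range r;

	assert(sect > 0 && (sect & (sect - 1)) == 0);
	r.start = blk_no & ~(sect - 1);
	r.count = (nbblks + sect - 1) & ~(sect - 1);
	return r;
}

/* Number of requested BBs at the tail that the modelled read misses. */
static int64_t model_missing_blocks(int64_t blk_no, int64_t nbblks,
				    int64_t sect)
{
	struct bb_range r = model_read_range(blk_no, nbblks, sect);
	int64_t want_end = blk_no + nbblks;	/* exclusive end requested */
	int64_t read_end = r.start + r.count;	/* exclusive end read */

	return want_end > read_end ? want_end - read_end : 0;
}
```

Calling `model_missing_blocks(1, 8, 8)` gives 1, matching the lost block in the example, whereas an aligned request such as blk_no = 4600, nbblks = 4096 from the debug output loses nothing.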

> > XFS mounting filesystem sda2
> > Starting XFS recovery on filesystem: sda2 (logdev: internal)
> > XFS: xlog_recover_process_data: bad clientid
> > XFS: log mount/recovery failed: error 5
> > XFS: log mount failed
>
> is not sufficient information for me to determine whether you've correctly
> analysed the problem you were seeing and whether this is the correct
> fix for it. I don't even know what kernel you are seeing this on, or
> how you are reproducing it.

I was using the 2.6.38.6 kernel, with xfs as the rootfs partition. After
untarring the rootfs files onto the xfs partition and trying to reboot from
it, the panic occurred occasionally.


> Note that I'm not saying the fix isn't necessary or correct, just
> that I cannot review it based on this commit message.  Given that this
> code is essentially unchanged in behaviour since the large sector
> size support was added in 2003(*), understanding how it is
> deficient is a critical part of the review process.
>
> Information you need to provide so I have a chance of reviewing
> whether it is correct or not:
>
> - what kernel you saw this on,
> - what the filesystem configuration was,
> - what workload reproduced this problem (a test case would
>   be nice, and an xfstest even better),
> - the actual contents of the log that led to the short read
>   during recovery,
> - whether xfs_logprint was capable of parsing the log
>   correctly, and
> - where in the actual log recovery process the failure
>   occurred (e.g. was it trying to recover transactions from
>   a section of a wrapped log?)

I wish I could provide the corrupted log, but I probably cannot find it any
more, since I fixed this bug a year ago. I came across the fix recently while
cleaning up my code, and thought I should contribute it back to the community.

> IOWs, please show your working so we can determine if this is the
> root cause of the problem you are seeing. :)
>
> (*)
> http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=f14e527f411712f89178c31370b5d733ea1d0280
>
> FWIW, I think your change might need work - there's the possibility
> that it can round up the length beyond the end of the log if we ask
> to read up to the last sector of the log (i.e. blkno + blklen ==
> end of log) and then round up blklen by one sector.

Good catch, you are right about this. To avoid this possibility, I changed the
patch slightly, as follows.
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -171,6 +171,7 @@ xlog_bread_noalign(
struct xfs_buf  *bp)
 {
int error;
+   xfs_daddr_t orig_blk_no = blk_no;