Re: [RFC] Simplifying kernel configuration for distro issues

2012-07-22 Thread David Greaves
On 18/07/12 10:55, Tom Gundersen wrote:
> Linus,
> 
>> The point I'm slowly getting to is that I would actually love to have
>> *distro* Kconfig-files, where the distribution would be able to say
>> "These are the minimums I *require* to work". So we'd have a "Distro"
>> submenu, where you could pick the distro(s) you use, and then pick
>> which release, and we'd have something like
> 
> As someone working on one of the smaller distributions (Arch), I think
> it would be even better if rather than having "distro" entries, we'd
> have "application" entries. I.e., entries for applications that have
> specific kernel requirements/suggestions (udev, systemd, upstart,
> bootchart, pulseaudio, networkmanager, etc). If applications have soft
> requirements, they could have sub-entries explaining the benefit of
> enabling each.

Also coming from a 'very small distro' position: I had this problem a few months
ago... my solution was this:

https://github.com/lbt/mer-kernel-check/blob/master/mer_verify_kernel_config#L127

So I'd very much appreciate something along the lines of a statement of what
various low-level services need and why, since that way we can share work between
distros and package maintainers, and offer this kind of ability to our users too.
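For illustration, the kind of per-application check the linked script performs can be sketched in a few lines of shell. The option list below is an assumption for illustration only; the real requirements belong to each project:

```shell
#!/bin/sh
# Sketch of a per-application kernel config checker, in the spirit of the
# mer_verify_kernel_config script linked above.  The option list here is
# illustrative, not an authoritative statement of systemd's requirements.

SYSTEMD_NEEDS="CONFIG_CGROUPS=y CONFIG_INOTIFY_USER=y CONFIG_FHANDLE=y"

# check_app <config-file> <option>...
# Accepts a plain .config or a gzipped /proc/config.gz.
check_app() {
    cfg="$1"; shift
    status=0
    for want in "$@"; do
        case "$cfg" in
            *.gz) zcat "$cfg" ;;
            *)    cat  "$cfg" ;;
        esac | grep -q "^$want\$" || { echo "MISSING: $want"; status=1; }
    done
    return $status
}

# Example: check the running kernel, if it exposes its configuration.
if [ -r /proc/config.gz ]; then
    check_app /proc/config.gz $SYSTEMD_NEEDS \
        && echo "systemd: ok" \
        || echo "systemd: kernel config incomplete"
fi
```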

David

-- 
"Don't worry, you'll be fine; I saw it work in a cartoon once..."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.23.1: mdadm/raid5 hung/d-state

2007-11-04 Thread David Greaves
Michael Tokarev wrote:
> Justin Piszcz wrote:
>> On Sun, 4 Nov 2007, Michael Tokarev wrote:
> []
>>> The next time you come across something like that, do a SysRq-T dump and
>>> post that.  It shows a stack trace of all processes - and in particular,
>>> where exactly each task is stuck.
> 
>> Yes I got it before I rebooted, ran that and then dmesg > file.
>>
>> Here it is:
>>
>> [1172609.665902]  80747dc0 80747dc0 80747dc0 
>> 80744d80
>> [1172609.668768]  80747dc0 81015c3aa918 810091c899b4 
>> 810091c899a8
> 
> That's only a partial list.  All the kernel threads - which are most important
> in this context - aren't shown.  You ran out of dmesg buffer, and the most
> interesting entries were at the beginning.  If your /var/log partition is
> working, the stuff should be in /var/log/kern.log or equivalent.  If it's
> not working, there is a way to capture the info still, by stopping syslogd,
> cat'ing /proc/kmsg to some tmpfs file and scp'ing it elsewhere.

or netconsole is actually pretty easy and incredibly useful in this kind of
situation even if there's no disk at all :)
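For anyone wanting to try this, a rough recipe; the ports, addresses, interface and MAC below are placeholders for your own network:

```shell
# On the ailing box: log kernel messages over UDP to another machine.
# Syntax: netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac]
modprobe netconsole \
    netconsole=6665@192.168.0.2/eth0,6666@192.168.0.3/00:11:22:33:44:55

# On the receiving machine: just listen for the UDP stream.
nc -l -u -p 6666        # or "nc -u -l 6666", depending on your netcat

# Then a SysRq-T task dump on the ailing box arrives intact, with no
# dependency on a working local disk or syslog:
echo t > /proc/sysrq-trigger
```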

David



Re: [RFD] Layering: Use-Case Composers (was: DRBD - what is it, anyways? [compare with e.g. NBD + MD raid])

2007-08-13 Thread David Greaves

[EMAIL PROTECTED] wrote:
>> Would this just be relevant to network devices or would it improve
>> support for jostled usb and sata hot-plugging I wonder?


> good question, I suspect that some of the error handling would be
> similar (for devices that are unreachable not hanging the system for
> example), but a lot of the rest would be different (do you really want
> to try to auto-resync to a drive that you _think_ just reappeared,
Well, omit 'think' and the answer may be "yes". A lot of systems are quite 
simple and RAID is common on the desktop now. If jostled USB fits into this 
category - then "yes".


> what
> if it's a different drive? how can you be sure?
And that's the key, isn't it? We have the RAID device UUID and the superblock 
info. Isn't that enough? If not then, given the work involved, an extended 
superblock wouldn't be unreasonable.
And I suspect the capability of devices would need recording in the superblock 
too? eg 'retry-on-fail'
I can see how md would fail a device but may now periodically retry it. If a 
retry shows that it's back then it would validate it (UUID) and then resync it.
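The manual version of that retry/validate cycle exists today; a sketch, with example device and array names:

```shell
# md has failed /dev/sdc1 out of /dev/md0; the drive later reappears.
# Check that it really is the same member: the array UUID lives in the
# superblock on each component.
mdadm --examine /dev/sdc1 | grep -i uuid
mdadm --detail  /dev/md0  | grep -i uuid

# If the UUIDs match, put it back; with a write-intent bitmap this can be
# a quick catch-up rather than a full resync.
mdadm /dev/md0 --re-add /dev/sdc1
```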


> ) the error rate of a
> network is going to be significantly higher than for USB or SATA drives
> (although I suppose iscsi would be similar)


I do agree - I was looking for value-add for the existing subsystem. If this 
benefits existing RAID users then it's more likely to be attractive.


David


Re: [RFD] Layering: Use-Case Composers (was: DRBD - what is it, anyways? [compare with e.g. NBD + MD raid])

2007-08-13 Thread David Greaves

[EMAIL PROTECTED] wrote:
> per the message below MD (or DM) would need to be modified to work
> reasonably well with one of the disk components being over an unreliable
> link (like a network link)
>
> are the MD/DM maintainers interested in extending their code in this
> direction? or would they prefer to keep it simpler by being able to
> continue to assume that the raid components are connected over a highly
> reliable connection?
>
> if they are interested in adding (and maintaining) this functionality
> then there is a real possibility that NBD+MD/DM could eliminate the need
> for DRBD. however if they are not interested in adding all the code to
> deal with the network type issues, then the argument that DRBD should
> not be merged because you can do the same thing with MD/DM + NBD is
> invalid and can be dropped/ignored
>
> David Lang


As a user I'd like to see md/nbd be extended to cope with unreliable links.
I think md could be better in handling link exceptions. My unreliable memory 
recalls sporadic issues with hot-plug leaving md hanging and certain lower level 
errors (or even very high latency) causing unsatisfactory behaviour in what is 
supposed to be a fault 'tolerant' subsystem.
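As an aside, the combination being discussed is assemblable today; a sketch, where host, port and device names are placeholders:

```shell
# Attach a remote disk as a local block device via nbd
# (classic nbd-client syntax; newer versions use named exports with -N).
nbd-client remote-host 2000 /dev/nbd0

# Mirror a local partition with the remote one; --write-mostly tells md
# to prefer the local half for reads.
mdadm --create /dev/md0 --level=1 --raid-devices=2 \
    /dev/sda2 --write-mostly /dev/nbd0
```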



Would this just be relevant to network devices or would it improve support for 
jostled usb and sata hot-plugging I wonder?


David



Re: [RFD] Layering: Use-Case Composers (was: DRBD - what is it, anyways? [compare with e.g. NBD + MD raid])

2007-08-13 Thread David Greaves

Paul Clements wrote:
> Well, if people would like to see a timeout option, I actually coded up
> a patch a couple of years ago to do just that, but I never got it into
> mainline because you can do almost as well by doing a check at
> user-level (I basically ping the nbd connection periodically and if it
> fails, I kill -9 the nbd-client).



Yes please.
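For reference, that user-level check might look something like this; the device path, interval and timeout are assumptions:

```shell
#!/bin/sh
# User-level nbd watchdog, sketching the check Paul describes.
DEV="${NBD_DEV:-/dev/nbd0}"     # assumed device path

# Treat a small bounded read as a ping; it fails (or times out) if the
# server side has gone away.
nbd_ping() {
    timeout 5 dd if="$1" of=/dev/null bs=512 count=1 2>/dev/null
}

# Only loop when invoked with "run", so the function stays reusable.
if [ "$1" = "run" ]; then
    while sleep 10; do
        nbd_ping "$DEV" || { echo "nbd ping failed"; pkill -9 -x nbd-client; }
    done
fi
```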

David



Re: Problem recovering a failed RAID5 array with 4-drives. --RESOLVED

2007-07-16 Thread David Greaves

James wrote:
I don't know the original order of the array before all the problems 
started. 
Is there a way to determine the original order? 

No, unless you have some old kernel logs of the last time it assembled
the array properly.
The one thing that "--create" does destroy is the information about
any previous array that the drives were a part of.

The order that --detail is showing now is the order that appeared after 
issuing the command as it is in the email. (ie: a b c d)

Odd.  I cannot reproduce it.
I suggest you try different arrangements (of the 3 good drives and the
word 'missing') until you find one that 'fsck -n' likes.

NeilBrown




I don't understand how the order of --detail was different than the command 
line on my system, however


YOU ARE A LIFE SAVER!!!

After going through 21 combinations, beginning to lose all hope and plummeting 
into eternal despair, combo 22 worked. The array is up and working. All the 
data (1.3Tb) is there and I'm probably the happiest character on the mail 
list today. 


Thanks a bunch for your help.


Funnily enough, someone else was having a similar problem on the linux-raid list 
at the same time.


Here's a script that may be useful to others in this predicament - a hell of a 
lot quicker than doing it by hand...


The 'is the filesystem safe' test probably wants improving from a read-only 
mount...

http://linux-raid.osdl.org/index.php/RAID_Recovery
http://linux-raid.osdl.org/index.php/Permute_array.pl
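For the record, the core of the permutation approach is just the loop below. It is destructive (--create rewrites the superblocks), so it is only for arrays that are already lost; the device names and geometry are examples:

```shell
# For each candidate order of the good drives plus the word "missing",
# recreate the array metadata and see whether the filesystem looks sane.
# --assume-clean avoids a resync; --run skips the confirmation prompt.
for order in "/dev/sda1 /dev/sdb1 /dev/sdc1 missing" \
             "/dev/sdb1 /dev/sda1 /dev/sdc1 missing"; do   # ...and so on
    mdadm --stop /dev/md0 2>/dev/null
    mdadm --create /dev/md0 --assume-clean --run --level=5 \
          --raid-devices=4 $order
    if fsck -n /dev/md0 >/dev/null 2>&1; then
        echo "candidate order: $order"
    fi
done
```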

David


Re: SMART problems in 2.6.22

2007-07-09 Thread David Greaves

Bruce Allen wrote:
> Hi David,
>
>> http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg164863.html
>>
>> This is mine and although it's a 'real' problem, it is something
>> that's easy to hack around by having the suspend script turn on smart
>> after it is resumed. (Of course I can't use resume until a skge wol
>> bug is fixed so I won't see/test this unless asked to.)
>>
>> The smart init scripts run '-s on' when the system boots anyway for my
>> system - this problem only occurs for me during suspend/resume. Maybe
>> smartd should detect that as Alan says.


> OK, that should be easy to do.  So let's forget about the 'SMART
> disabled' issue.  This is easy to fix in multiple ways and is not a LKML
> issue.

Sure.


> David: can you reproduce the more serious problem
> http://article.gmane.org/gmane.linux.utilities.smartmontools/4712
> reported by Jan Dvorak?

Sorry, I haven't seen that problem.

David


Re: SMART problems in 2.6.22

2007-07-09 Thread David Greaves

Hi Bruce
> From some of the earlier threads that I missed (below) I have the
> impression that the problem may be a very simple one, namely that
> starting with 2.6.22 one needs to run a command to enable SMART when a
> box is first booted -- the kernel no longer does this as part of the
> init/setup of the disks. But that is NOT consistent with the first two
> reports above, which show 'SMART ENABLED'.
>
> Here are some of the earlier threads that I completely missed:
>
> http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg164863.html
This is mine and although it's a 'real' problem, it is something that's easy to 
hack around by having the suspend script turn on smart after it is resumed. (Of 
course I can't use resume until a skge wol bug is fixed so I won't see/test this 
unless asked to.)


The smart init scripts run '-s on' when the system boots anyway for my system - 
this problem only occurs for me during suspend/resume. Maybe smartd should 
detect that as Alan says.
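The hack-around amounts to a resume hook along these lines; the hook location and device list vary by distro and are examples here:

```shell
#!/bin/sh
# e.g. /etc/pm/sleep.d/95smart -- re-enable SMART after resume, since the
# drives lose the setting over a suspend/resume cycle.
case "$1" in
    resume|thaw)
        for dev in /dev/sda /dev/sdb; do
            smartctl -s on "$dev"
        done
        ;;
esac
```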


Please let me know if there's anything else you need.

David




Re: [linux-lvm] 2.6.22-rc5 XFS fails after hibernate/resume

2007-07-02 Thread David Greaves

Rafael J. Wysocki wrote:

On Monday, 2 July 2007 16:32, David Greaves wrote:

Rafael J. Wysocki wrote:

On Monday, 2 July 2007 12:56, Tejun Heo wrote:

David Greaves wrote:

Tejun Heo wrote:

It's really weird tho.  The PHY RDY status changed events are coming
from the device which is NOT used while resuming

There is an obvious problem there though Tejun (the errors even when sda
isn't involved in the OS boot) - can I start another thread about that
issue/bug later? I need to reshuffle partitions so I'd rather get the
hibernate working first and then go back to it if that's OK?

Yeah, sure.  The problem is that we don't know whether or how those two
are related.  It would be great if there's a way to verify memory image
read from hibernation is intact.  Rafael, any ideas?

Well, s2disk has an option to compute an MD5 checksum of the image during
the hibernation and verify it while reading the image.

(Assuming you mean the mainline version)

Sounds like a good thing to try next...
Couldn't see anything on this in ../Documentation/power/*
How do I enable it?


Add 'compute checksum = y' to the s2disk's configuration file.


Ah, right - that's uswsusp isn't it? Which isn't what I'm having problems with 
AFAIK?


My suspend procedure is:

xfs_freeze -f /scratch
sync
echo platform > /sys/power/disk
echo disk > /sys/power/state
xfs_freeze -u /scratch

Which should work (actually it should work without the sync/xfs_freeze too).

So to debug the problem I'd like to minimally extend this process rather than 
replace it with another approach.


I take it there isn't an 'echo y > /sys/power/do_image_checksum'?

David




Re: [linux-lvm] 2.6.22-rc5 XFS fails after hibernate/resume

2007-07-02 Thread David Greaves

Rafael J. Wysocki wrote:

On Monday, 2 July 2007 12:56, Tejun Heo wrote:

David Greaves wrote:

Tejun Heo wrote:

It's really weird tho.  The PHY RDY status changed events are coming
from the device which is NOT used while resuming

There is an obvious problem there though Tejun (the errors even when sda
isn't involved in the OS boot) - can I start another thread about that
issue/bug later? I need to reshuffle partitions so I'd rather get the
hibernate working first and then go back to it if that's OK?

Yeah, sure.  The problem is that we don't know whether or how those two
are related.  It would be great if there's a way to verify memory image
read from hibernation is intact.  Rafael, any ideas?


Well, s2disk has an option to compute an MD5 checksum of the image during
the hibernation and verify it while reading the image.

(Assuming you mean the mainline version)

Sounds like a good thing to try next...
Couldn't see anything on this in ../Documentation/power/*
How do I enable it?



 Still, s2disk/resume
aren't very easy to install and configure ...


I have it working fine on 2 other machines now so that doesn't appear to be a 
problem.


David


Re: [linux-pm] Re: [linux-lvm] 2.6.22-rc4 XFS fails after hibernate/resume

2007-06-29 Thread David Greaves

Rafael J. Wysocki wrote:

On Friday, 29 June 2007 09:54, David Greaves wrote:

David Chinner wrote:

On Fri, Jun 29, 2007 at 08:40:00AM +0100, David Greaves wrote:

What happens if a filesystem is frozen and I hibernate?
Will it be thawed when I resume?

If you froze it yourself, then you'll have to thaw it yourself.
So hibernate will not attempt to re-freeze a frozen fs and, during resume, it 
will only thaw filesystems that were frozen by the suspend?


Right now it doesn't freeze (or thaw) any filesystems.  It just sync()s them
before creating the hibernation image.

Thanks. Yes I realise that :)
I wasn't clear, I should have said:
So hibernate should not attempt to re-freeze a frozen fs and, during resume, it
should only thaw filesystems that were frozen by the suspend.



However, the fact that you've seen corruption with the XFS filesystems frozen
before the hibernation indicates that the problem occurs on a lower level.
And that was why I chimed in - I don't think freezing fixes the problem (though 
it may make sense for other reasons).


David




Re: [linux-lvm] 2.6.22-rc5 XFS fails after hibernate/resume

2007-06-29 Thread David Greaves

David Greaves wrote:

been away, back now...

again...

David Greaves wrote:
When I move the swap/resume partition to a different controller (ie when 
I broke the / mirror and used the freed space) the problem seems to go 
away.

No, it's not gone away - but it's taking longer to show up.
I can try and put together a test loop that does work, hibernates, resumes and 
repeats but since I know it crashes at some point there doesn't seem much point 
unless I'm looking for something.
There's not much in the logs - is there any other instrumentation that people 
could suggest?
DaveC, given this is happening without (obvious) libata errors do you think it 
may be something in the XFS/md/hibernate area?


If there's anything to be tried then I'll also move to 2.6.22-rc6.


> Tejun Heo wrote:
>> It's really weird tho.  The PHY RDY status changed events are coming
>> from the device which is NOT used while resuming

There is an obvious problem there though Tejun (the errors even when sda isn't 
involved in the OS boot) - can I start another thread about that issue/bug 
later? I need to reshuffle partitions so I'd rather get the hibernate working 
first and then go back to it if that's OK?


David



Re: [linux-pm] Re: [linux-lvm] 2.6.22-rc4 XFS fails after hibernate/resume

2007-06-29 Thread David Greaves

David Chinner wrote:

On Fri, Jun 29, 2007 at 08:40:00AM +0100, David Greaves wrote:

What happens if a filesystem is frozen and I hibernate?
Will it be thawed when I resume?


If you froze it yourself, then you'll have to thaw it yourself.


So hibernate will not attempt to re-freeze a frozen fs and, during resume, it 
will only thaw filesystems that were frozen by the suspend?


David



Re: [linux-pm] Re: [linux-lvm] 2.6.22-rc4 XFS fails after hibernate/resume

2007-06-29 Thread David Greaves

David Chinner wrote:

On Fri, Jun 29, 2007 at 12:16:44AM +0200, Rafael J. Wysocki wrote:

There are two solutions possible, IMO.  One would be to make these workqueues
freezable, which is possible, but hacky and Oleg didn't like that very much.
The second would be to freeze XFS from within the hibernation code path,
using freeze_bdev().


The second is much more likely to work reliably. If freezing the
filesystem leaves something in an inconsistent state, then it's
something I can reproduce and debug without needing to
suspend/resume.

FWIW, don't forget you need to thaw the filesystem on resume.


I've been a little distracted recently - sorry. I'll re-read the thread and see 
if there are any test actions I need to complete.


I do know that the corruption problems I've been having:
a) only happen after hibernate/resume
b) only ever happen on one of 2 XFS filesystems
c) happen even when the script does xfs_freeze;sync;hibernate;xfs_thaw
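The sequence in (c) can be sketched as a dry run — `run` only echoes each step, so nothing here touches a real filesystem; the mount point and the hibernate trigger are assumptions, not details from this thread:

```shell
#!/bin/sh
# Dry-run sketch of the freeze/sync/hibernate/thaw cycle from point (c).
# 'run' echoes instead of executing; drop it to perform the real steps.
run() { echo "+ $*"; }

FS=/data                               # XFS mount point (assumption)

run xfs_freeze -f "$FS"                # flush and block new writes
run sync                               # push remaining dirty data
run "echo disk > /sys/power/state"     # one way to trigger hibernation
# ...machine powers down, then resumes here...
run xfs_freeze -u "$FS"                # thaw: resume will not do this for you
```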

What happens if a filesystem is frozen and I hibernate?
Will it be thawed when I resume?

David











Re: limits on raid

2007-06-22 Thread David Greaves

Bill Davidsen wrote:

David Greaves wrote:

[EMAIL PROTECTED] wrote:

On Fri, 22 Jun 2007, David Greaves wrote:
If you end up 'fiddling' in md because someone specified 
--assume-clean on a raid5 [in this case just to save a few minutes 
*testing time* on system with a heavily choked bus!] then that adds 
*even more* complexity and exception cases into all the stuff you 
described.


A "few minutes?" Are you reading the times people are seeing with 
multi-TB arrays? Let's see, 5TB at a rebuild rate of 20MB... three days. 

Yes. But we are talking initial creation here.
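For scale, the quoted figure is easy to check (reading "20MB" as a 20 MB/s rebuild rate — an assumption):

```python
# Back-of-envelope rebuild time: 5 TB resynced at 20 MB/s.
array_bytes = 5 * 10**12      # 5 TB
rate = 20 * 10**6             # 20 MB/s (assumed meaning of "20MB")
days = array_bytes / rate / 86400
print(f"{days:.1f} days")     # ~2.9 days, i.e. roughly three
```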

And as soon as you believe that the array is actually "usable" you cut 
that rebuild rate, perhaps in half, and get dog-slow performance from 
the array. It's usable in the sense that reads and writes work, but for 
useful work it's pretty painful. You either fail to understand the 
magnitude of the problem or wish to trivialize it for some reason.

I do understand the problem and I'm not trying to trivialise it :)

I _suggested_ that it's worth thinking about things rather than jumping in to 
say "oh, we can code up a clever algorithm that keeps track of what stripes have 
valid parity and which don't and we can optimise the read/copy/write for valid 
stripes and use the raid6 type read-all/write-all for invalid stripes and then 
we can write a bit extra on the check code to set the bitmaps.."
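That caricatured algorithm is small enough to sketch — a hypothetical per-stripe valid-parity bit choosing between reconstruct-write and read-modify-write. All names here are illustrative; none of this is md's actual code:

```python
from functools import reduce

def xor(blocks):
    """XOR equal-length byte blocks column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

class Stripe:
    def __init__(self, data):
        self.data = data                  # list of data blocks
        self.parity = b"\x00" * len(data[0])
        self.parity_valid = False         # the proposed per-stripe bitmap bit

    def write(self, idx, block):
        if self.parity_valid:
            # read-modify-write: XOR out the old data, XOR in the new block
            self.parity = xor([self.parity, self.data[idx], block])
            self.data[idx] = block
        else:
            # reconstruct-write: recompute parity over the whole stripe,
            # then set this stripe's bit so later writes take the fast path
            self.data[idx] = block
            self.parity = xor(self.data)
            self.parity_valid = True

s = Stripe([b"\x0f", b"\xf0", b"\x33"])
s.write(0, b"\xaa")                       # first touch: reconstruct path
s.write(1, b"\x55")                       # later writes: cheap RMW path
assert s.parity == xor(s.data)            # parity stays consistent
```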


Phew - and that lets us run the array at semi-degraded performance (raid6-like) 
for 3 days rather than either waiting before we put it into production or 
running it very slowly.

Now we run this system for 3 years and we saved 3 days - hmmm IS IT WORTH IT?

What happens in those 3 years when we have a disk fail? The solution doesn't 
apply then - it's 3 days to rebuild - like it or not.


By delaying parity computation until the first write to a stripe only 
the growth of a filesystem is slowed, and all data are protected without 
waiting for the lengthy check. The rebuild speed can be set very low, 
because on-demand rebuild will do most of the work.

I am not saying you are wrong.
I ask merely if the balance of benefit outweighs the balance of complexity.

If the benefit were 24x7 then sure - eg using hardware assist in the raid calcs 
- very useful indeed.


I'm very much for the fs layer reading the lower block structure so I 
don't have to fiddle with arcane tuning parameters - yes, *please* 
help make xfs self-tuning!


Keeping life as straightforward as possible low down makes the upwards 
interface more manageable and that goal more realistic... 


Those two paragraphs are mutually exclusive. The fs can be simple 
because it rests on a simple device, even if the "simple device" is 
provided by LVM or md. And LVM and md can stay simple because they rest 
on simple devices, even if they are provided by PATA, SATA, nbd, etc. 
Independent layers make each layer more robust. If you want to 
compromise the layer separation, some approach like ZFS with full 
integration would seem to be promising. Note that layers allow 
specialized features at each point, trading integration for flexibility.


That's a simplistic summary.
You *can* loosely couple the layers. But you can enrich the interface and 
tightly couple them too - XFS is capable (I guess) of understanding md more 
fully than say ext2.
XFS would still work on a less 'talkative' block device where performance wasn't 
as important (USB flash maybe, dunno).



My feeling is that full integration and independent layers each have 
benefits, as you connect the layers to expose operational details you 
need to handle changes in those details, which would seem to make layers 
more complex.

Agreed.

What I'm looking for here is better performance in one 
particular layer, the md RAID5 layer. I like to avoid unnecessary 
complexity, but I feel that the current performance suggests room for 
improvement.


I agree there is room for improvement.
I suggest that it may be more fruitful to write a tool called "raid5prepare"
that writes zeroes/ones as appropriate to all component devices and then you can 
use --assume-clean without concern. That could look to see if the devices are 
scsi or whatever and take advantage of the hyperfast block writes that can be done.


David


Re: limits on raid

2007-06-22 Thread David Greaves

[EMAIL PROTECTED] wrote:

On Fri, 22 Jun 2007, David Greaves wrote:

That's not a bad thing - until you look at the complexity it brings - 
and then consider the impact and exceptions when you do, eg hardware 
acceleration? md information fed up to the fs layer for xfs? simple 
long term maintenance?


Often these problems are well worth the benefits of the feature.

I _wonder_ if this is one where the right thing is to "just say no" :)
so for several reasons I don't see this as something that's deserving of 
an automatic 'no'


David Lang


Err, re-read it, I hope you'll see that I agree with you - I actually just meant 
the --assume-clean workaround stuff :)


If you end up 'fiddling' in md because someone specified --assume-clean on a 
raid5 [in this case just to save a few minutes *testing time* on system with a 
heavily choked bus!] then that adds *even more* complexity and exception cases 
into all the stuff you described.


I'm very much for the fs layer reading the lower block structure so I don't have 
to fiddle with arcane tuning parameters - yes, *please* help make xfs self-tuning!


Keeping life as straightforward as possible low down makes the upwards interface 
more manageable and that goal more realistic...


David


Re: limits on raid

2007-06-22 Thread David Greaves

Neil Brown wrote:

On Thursday June 21, [EMAIL PROTECTED] wrote:
I didn't get a comment on my suggestion for a quick and dirty fix for 
--assume-clean issues...


Bill Davidsen wrote:
How about a simple solution which would get an array on line and still 
be safe? All it would take is a flag which forced reconstruct writes 
for RAID-5. You could set it with an option, or automatically if 
someone puts --assume-clean with --create, leave it in the superblock 
until the first "repair" runs to completion. And for repair you could 
make some assumptions about bad parity not being caused by error but 
just unwritten.


It is certainly possible, and probably not a lot of effort.  I'm not
really excited about it though.

So if someone were to submit a patch that did the right stuff, I would
probably accept it, but I am unlikely to do it myself.


Thought 2: I think the unwritten bit is easier than you think, you 
only need it on parity blocks for RAID5, not on data blocks. When a 
write is done, if the bit is set do a reconstruct, write the parity 
block, and clear the bit. Keeping a bit per data block is madness, and 
appears to be unnecessary as well.


Where do you propose storing those bits?  And how many would you cache
in memory?  And what performance hit would you suffer for accessing
them?  And would it be worth it?
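For a sense of scale, Neil's "how many bits" question has a quick answer under illustrative assumptions (5 TB array, 4 data disks, 64 KiB chunks — numbers invented for the example, not taken from the thread):

```python
# One bit per parity block: how much bitmap does that cost?
array_bytes = 5 * 10**12            # 5 TB of data (illustrative)
data_disks = 4                      # data disks per stripe (illustrative)
chunk = 64 * 1024                   # 64 KiB chunk size (illustrative)
stripe_data = data_disks * chunk    # data covered by each parity block
nbits = array_bytes // stripe_data  # one bit per stripe
print(nbits, "bits ~=", nbits // (8 * 1024 * 1024), "MiB of bitmap")
```

A few MiB is cacheable, but every write path now consults (and possibly dirties) it, which is the performance question Neil is raising.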


Sometimes I think one of the problems with Linux is that it tries to do 
everything for everyone.


That's not a bad thing - until you look at the complexity it brings - and then 
consider the impact and exceptions when you do, eg hardware acceleration? md 
information fed up to the fs layer for xfs? simple long term maintenance?


Often these problems are well worth the benefits of the feature.

I _wonder_ if this is one where the right thing is to "just say no" :)

David








Re: [linux-lvm] 2.6.22-rc5 XFS fails after hibernate/resume

2007-06-21 Thread David Greaves

been away, back now...

Tejun Heo wrote:

David Greaves wrote:

Tejun Heo wrote:

How reproducible is the problem?  Does the problem go away or occur more
often if you change the drive you write the memory image to?

I don't think there should be activity on the sda drive during resume
itself.

[I broke my / md mirror and am using some of that for swap/resume for now]

I did change the swap/resume device to sdd2 (different controller,
onboard sata_via) and there was no EH during resume. The system seemed
OK, wrote a few Gb of video and did a kernel compile.
I repeated this test, no EH during resume, no problems.
I even ran xfs_fsr, the defragment utility, to stress the fs.

I retain this configuration and try again tonight but it looks like
there _may_ be a link between EH during resume and my problems...
Having retained this new configuration for a couple of days now I haven't had 
any problems.

This is good but not really ideal since / isn't mirrored anymore :(


Of course, I don't understand why it *should* EH during resume, it
doesn't during boot or normal operation...


EH occurs during boot, suspend and resume all the time.  It just runs in
quiet mode to avoid disturbing the users too much.  In your case, EH is
kicking in due to actual exception conditions so it's being verbose to
give clue about what's going on.
I was trying to say that I don't actually see any errors being handled in normal 
operation.
I'm not sure if you are saying that these PHY RDY events are normally handled 
quietly (which would explain it).




It's really weird tho.  The PHY RDY status changed events are coming
from the device which is NOT used while resuming
yes - but the erroring device which is not being used is on the same controller 
as the device with the in-use resume partition.



and it's before any
actual PM events are triggered.  Your kernel just boots, swsusp realizes
it's resuming and tries to read memory image from the swap device.

yes


While reading, the disk controller raises consecutive PHY readiness
changed interrupts.  EH recovers them alright but the end result seems
to indicate that the loaded image is corrupt.

Yes, that's consistent with what I'm seeing.

When I move the swap/resume partition to a different controller (ie when I broke 
the / mirror and used the freed space) the problem seems to go away.


I am seeing messages in dmesg though:
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ata1.00: configured for UDMA/100
ata2.00: revalidation failed (errno=-2)
ata2: failed to recover some devices, retrying in 5 secs
sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)

sd 0:0:0:0: resuming
sd 0:0:0:0: [sda] Starting disk
ATA: abnormal status 0x7F on port 0x00019807
ATA: abnormal status 0x7F on port 0x00019007
ATA: abnormal status 0x7F on port 0x00019007
ATA: abnormal status 0x7F on port 0x00019807

ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ata1.00: configured for UDMA/100
ata2.00: revalidation failed (errno=-2)
ata2: failed to recover some devices, retrying in 5 secs



So, there's no device suspend/resume code involved at all.  The kernel
just booted and is trying to read data from the drive.  Please try with
only the first drive attached and see what happens.

That's kinda hard; swap and root are on different drives...

Does it help that although the errors above appear, the system seems OK when I 
just use the other controller?


I have to be cautious what I do with this machine as it's the wife's active 
desktop box.


David


Re: limits on raid

2007-06-21 Thread David Greaves

[EMAIL PROTECTED] wrote:

On Thu, 21 Jun 2007, David Chinner wrote:
one of the 'killer features' of zfs is that it does checksums of every 
file on disk. so many people don't consider the disk infallible.


several other filesystems also do checksums

both bitkeeper and git do checksums of files to detect disk corruption


How different is that to raid1/5/6 being set to a 'paranoid' "read-verify" mode 
(as per Dan's recent email) where a read reads from _all_ spindles and verifies 
(and with R6 maybe corrects) the stripe before returning it?


Doesn't solve DaveC's issue about the fs doing redundancy but isn't that 
essentially just fs level mirroring?
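A toy version of that "paranoid" mode — read every spindle and verify the stripe's parity before returning any data. This is purely illustrative, not md's implementation:

```python
from functools import reduce

def xor(blocks):
    """XOR equal-length byte blocks column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def paranoid_read(data_blocks, parity):
    """Read all spindles, verify parity, and only then return the data."""
    if xor(data_blocks) != parity:
        raise IOError("stripe parity mismatch: possible silent corruption")
    return data_blocks

blocks = [b"\x01", b"\x02", b"\x04"]
ok = paranoid_read(blocks, xor(blocks))   # clean stripe passes
try:
    paranoid_read(blocks, b"\x00")        # corruption is caught, not returned
except IOError as e:
    print("detected:", e)
```

The cost is obvious: every read touches every spindle, which is why it would have to be an opt-in mode.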


David


Re: limits on raid

2007-06-21 Thread David Greaves

Neil Brown wrote:


This isn't quite right.

Thanks :)


Firstly, it is mdadm which decided to make one drive a 'spare' for
raid5, not the kernel.
Secondly, it only applies to raid5, not raid6 or raid1 or raid10.

For raid6, the initial resync (just like the resync after an unclean
shutdown) reads all the data blocks, and writes all the P and Q
blocks.
raid5 can do that, but it is faster to read all but one disk, and
write to that one disk.


How about this:

Initial Creation

When mdadm asks the kernel to create a raid array the most noticeable activity 
is what's called the "initial resync".


Raid level 0 doesn't have any redundancy so there is no initial resync.

For raid levels 1,4,6 and 10 mdadm creates the array and starts a resync. The 
raid algorithm then reads the data blocks and writes the appropriate 
parity/mirror (P+Q) blocks across all the relevant disks. There is some sample 
output in a section below...


For raid5 there is an optimisation: mdadm takes one of the disks and marks it as 
'spare'; it then creates the array in degraded mode. The kernel marks the spare 
disk as 'rebuilding' and starts to read from the 'good' disks, calculates the 
parity to determine what should be on the spare disk, and then just writes it.


Once all this is done the array is clean and all disks are active.

This can take quite some time and the array is not fully resilient whilst this is 
happening (it is, however, fully usable).
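In other words, the rebuild computes spare = XOR(good disks) in one sequential write pass. A toy check that this leaves a clean stripe (block contents invented for the example):

```python
from functools import reduce

def xor(blocks):
    """XOR equal-length byte blocks column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Toy model: three 'good' disks are only read; the 'spare' gets their XOR.
good = [bytes([v] * 8) for v in (0x11, 0x22, 0x44)]
spare = xor(good)                          # written sequentially to the spare
# A clean RAID5 stripe XORs to zero across all members:
assert xor(good + [spare]) == b"\x00" * 8
print("spare block:", spare.hex())
```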






Also is raid4 like raid5 or raid6 in this respect?




Re: limits on raid

2007-06-21 Thread David Greaves

[EMAIL PROTECTED] wrote:

On Thu, 21 Jun 2007, David Chinner wrote:
one of the 'killer features' of zfs is that it does checksums of every 
file on disk, so many people don't consider the disk infallible.


several other filesystems also do checksums

both bitkeeper and git do checksums of files to detect disk corruption


How different is that to raid1/5/6 being set to a 'paranoid' read-verify mode 
(as per Dan's recent email) where a read reads from _all_ spindles and verifies 
(and with R6 maybe corrects) the stripe before returning it?


Doesn't solve DaveC's issue about the fs doing redundancy but isn't that 
essentially just fs level mirroring?
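The read-verify idea referred to above could be sketched like this (a hypothetical illustration, not md's actual interface; raid6 could additionally use its P+Q parity to correct a bad copy rather than just detect it):

```python
def paranoid_read(mirrors, offset, length):
    """Read [offset, offset+length) from every mirror and cross-check.

    'mirrors' is a list of byte-indexable stand-ins for the spindles.
    Only return data when all copies agree; otherwise raise, as a
    checksumming filesystem would flag the corruption.
    """
    copies = [bytes(m[offset:offset + length]) for m in mirrors]
    if any(c != copies[0] for c in copies[1:]):
        raise IOError("mirror mismatch at offset %d" % offset)
    return copies[0]

disk_a = bytearray(b"hello world")
disk_b = bytearray(b"hello world")

assert paranoid_read([disk_a, disk_b], 0, 5) == b"hello"

disk_b[1] ^= 0x01              # simulate silent corruption on one spindle
try:
    paranoid_read([disk_a, disk_b], 0, 5)
except IOError:
    pass                       # detected instead of silently returned
```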


David


Re: [linux-lvm] 2.6.22-rc5 XFS fails after hibernate/resume

2007-06-21 Thread David Greaves

been away, back now...

Tejun Heo wrote:

David Greaves wrote:

Tejun Heo wrote:

How reproducible is the problem?  Does the problem go away or occur more
often if you change the drive you write the memory image to?

I don't think there should be activity on the sda drive during resume
itself.

[I broke my / md mirror and am using some of that for swap/resume for now]

I did change the swap/resume device to sdd2 (different controller,
onboard sata_via) and there was no EH during resume. The system seemed
OK, wrote a few Gb of video and did a kernel compile.
I repeated this test, no EH during resume, no problems.
I even ran xfs_fsr, the defragment utility, to stress the fs.

I'll retain this configuration and try again tonight, but it looks like
there _may_ be a link between EH during resume and my problems...
Having retained this new configuration for a couple of days now I haven't had 
any problems.

This is good but not really ideal since / isn't mirrored anymore :(


Of course, I don't understand why it *should* EH during resume, it
doesn't during boot or normal operation...


EH occurs during boot, suspend and resume all the time.  It just runs in
quiet mode to avoid disturbing the users too much.  In your case, EH is
kicking in due to actual exception conditions so it's being verbose to
give clue about what's going on.
I was trying to say that I don't actually see any errors being handled in normal 
operation.
I'm not sure if you are saying that these PHY RDY events are normally handled 
quietly (which would explain it).




It's really weird tho.  The PHY RDY status changed events are coming
from the device which is NOT used while resuming
yes - but the erroring device which is not being used is on the same controller 
as the device with the in-use resume partition.



and it's before any
actual PM events are triggered.  Your kernel just boots, swsusp realizes
it's resuming and tries to read memory image from the swap device.

yes


While reading, the disk controller raises consecutive PHY readiness
changed interrupts.  EH recovers them alright but the end result seems
to indicate that the loaded image is corrupt.

Yes, that's consistent with what I'm seeing.

When I move the swap/resume partition to a different controller (ie when I broke 
the / mirror and used the freed space) the problem seems to go away.


I am seeing messages in dmesg though:
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ata1.00: configured for UDMA/100
ata2.00: revalidation failed (errno=-2)
ata2: failed to recover some devices, retrying in 5 secs
sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)

sd 0:0:0:0: resuming
sd 0:0:0:0: [sda] Starting disk
ATA: abnormal status 0x7F on port 0x00019807
ATA: abnormal status 0x7F on port 0x00019007
ATA: abnormal status 0x7F on port 0x00019007
ATA: abnormal status 0x7F on port 0x00019807

ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ATA: abnormal status 0xD0 on port 0xf881e0c7
ata1.00: configured for UDMA/100
ata2.00: revalidation failed (errno=-2)
ata2: failed to recover some devices, retrying in 5 secs



So, there's no device suspend/resume code involved at all.  The kernel
just booted and is trying to read data from the drive.  Please try with
only the first drive attached and see what happens.

That's kinda hard; swap and root are on different drives...

Does it help that although the errors above appear, the system seems OK when I 
just use the other controller?


I have to be cautious what I do with this machine as it's the wife's active 
desktop box <grin>.


David


Re: [linux-lvm] 2.6.22-rc5 XFS fails after hibernate/resume

2007-06-19 Thread David Greaves

Rafael J. Wysocki wrote:

This is on 2.6.22-rc5


Is the Tejun's patch

http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.22-rc5/patches/30-block-always-requeue-nonfs-requests-at-the-front.patch

applied on top of that?


2.6.22-rc5 includes it.

(but, when I was testing rc4, I did apply this patch)

David


Re: [linux-lvm] 2.6.22-rc5 XFS fails after hibernate/resume

2007-06-19 Thread David Greaves

Tejun Heo wrote:

Hello,

again...


David Greaves wrote:

Good :)

Now, not so good :)


Oh, crap.  :-)




So I hibernated last night and resumed this morning.
Before hibernating I froze and sync'ed. After resume I thawed it. (Sorry
Dave)

Here are some photos of the screen during resume. This is not 100%
reproducable - it seems to occur only if the system is shutdown for
30mins or so.

Tejun, I wonder if error handling during resume is problematic? I got
the same errors in 2.6.21. I have never seen these (or any other libata)
errors other than during resume.

http://www.dgreaves.com/pub/2.6.22-rc5-resume-failure.jpg
(hard to read, here's one from 2.6.21
http://www.dgreaves.com/pub/2.6.21-resume-failure.jpg


Your controller is repeatedly reporting PHY readiness changed exception.
 Are you reading the system image from the device attached to the first
SATA port?


Yes if you mean 1st as in the one after the zero-th ...

resume=/dev/sdb4
haze:~# swapon -s
Filename        Type            Size    Used    Priority
/dev/sdb4       partition       1004020 0       -1

dmesg snippet below...

sda is part of the /scratch xfs array though. SMART doesn't show any problems 
and of course all is well other than during a resume.


sda/b are on sata_sil (a cheap plugin pci card)




I _think_ I've only seen the xfs problem when a resume shows these errors.


The error handling itself tries very hard to ensure that there is no
data corruption in case of errors.  All commands which experience
exceptions are retried but if the drive itself is doing something
stupid, there's only so much the driver can do.

How reproducible is the problem?  Does the problem go away or occur more
often if you change the drive you write the memory image to?


I don't think there should be activity on the sda drive during resume itself.

[I broke my / md mirror and am using some of that for swap/resume for now]

I did change the swap/resume device to sdd2 (different controller, onboard 
sata_via) and there was no EH during resume. The system seemed OK, wrote a few 
Gb of video and did a kernel compile.

I repeated this test, no EH during resume, no problems.
I even ran xfs_fsr, the defragment utility, to stress the fs.

I'll retain this configuration and try again tonight, but it looks like there _may_ 
be a link between EH during resume and my problems...


Of course, I don't understand why it *should* EH during resume, it doesn't 
during boot or normal operation...


Any more tests you'd like me to try?

David


dmesg snippet...

sata_sil 0000:00:0a.0: version 2.2
ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 16 (level, low) -> IRQ 18
scsi0 : sata_sil
PM: Adding info for No Bus:host0
scsi1 : sata_sil
PM: Adding info for No Bus:host1
ata1: SATA max UDMA/100 cmd 0xf881e080 ctl 0xf881e08a bmdma 0xf881e000 irq 0
ata2: SATA max UDMA/100 cmd 0xf881e0c0 ctl 0xf881e0ca bmdma 0xf881e008 irq 0
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: ATA-7: Maxtor 6B200M0, BANC1980, max UDMA/100
ata1.00: 390721968 sectors, multi 0: LBA48
ata1.00: configured for UDMA/100
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata2.00: ata_hpa_resize 1: sectors = 312581808, hpa_sectors = 312581808
ata2.00: ATA-6: ST3160023AS, 3.18, max UDMA/133
ata2.00: 312581808 sectors, multi 0: LBA48
ata2.00: ata_hpa_resize 1: sectors = 312581808, hpa_sectors = 312581808
ata2.00: configured for UDMA/100
PM: Adding info for No Bus:target0:0:0
scsi 0:0:0:0: Direct-Access ATA  Maxtor 6B200M0   BANC PQ: 0 ANSI: 5
PM: Adding info for scsi:0:0:0:0
sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO 
or FUA

sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO 
or FUA

 sda: sda1
sd 0:0:0:0: [sda] Attached SCSI disk
sd 0:0:0:0: Attached scsi generic sg0 type 0
PM: Adding info for No Bus:target1:0:0
scsi 1:0:0:0: Direct-Access ATA  ST3160023AS  3.18 PQ: 0 ANSI: 5
PM: Adding info for scsi:1:0:0:0
sd 1:0:0:0: [sdb] 312581808 512-byte hardware sectors (160042 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO 
or FUA

sd 1:0:0:0: [sdb] 312581808 512-byte hardware sectors (160042 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO 
or FUA

 sdb: sdb1 sdb2 sdb3 sdb4
sd 1:0:0:0: [sdb] Attached SCSI disk
sd 1:0:0:0: Attached scsi generic sg1 type 0

Re: [linux-lvm] 2.6.22-rc5 XFS fails after hibernate/resume

2007-06-19 Thread David Greaves

David Greaves wrote:

I'm going to have to do some more testing...

done



David Chinner wrote:

On Mon, Jun 18, 2007 at 08:49:34AM +0100, David Greaves wrote:

David Greaves wrote:
So doing:
xfs_freeze -f /scratch
sync
echo platform > /sys/power/disk
echo disk > /sys/power/state
# resume
xfs_freeze -u /scratch

Works (for now - more usage testing tonight)


Verrry interesting.

Good :)

Now, not so good :)



What you were seeing was an XFS shutdown occurring because the free space
btree was corrupted. IOWs, the process of suspend/resume has resulted
in either bad data being written to disk, the correct data not being
written to disk or the cached block being corrupted in memory.

That's the kind of thing I was suspecting, yes.

If you run xfs_check on the filesystem after it has shut down after a resume,
can you tell us if it reports on-disk corruption? Note: do not run xfs_repair
to check this - it does not check the free space btrees; instead it simply
rebuilds them from scratch. If xfs_check reports an error, then run xfs_repair
to fix it up.

OK, I can try this tonight...



This is on 2.6.22-rc5

So I hibernated last night and resumed this morning.
Before hibernating I froze and sync'ed. After resume I thawed it. (Sorry Dave)

Here are some photos of the screen during resume. This is not 100% reproducable 
- it seems to occur only if the system is shutdown for 30mins or so.


Tejun, I wonder if error handling during resume is problematic? I got the same 
errors in 2.6.21. I have never seen these (or any other libata) errors other 
than during resume.


http://www.dgreaves.com/pub/2.6.22-rc5-resume-failure.jpg
(hard to read, here's one from 2.6.21
http://www.dgreaves.com/pub/2.6.21-resume-failure.jpg

I _think_ I've only seen the xfs problem when a resume shows these errors.


Ok, to try and cause a problem I ran a make and got this back at once:
make: stat: Makefile: Input/output error
make: stat: clean: Input/output error
make: *** No rule to make target `clean'.  Stop.
make: stat: GNUmakefile: Input/output error
make: stat: makefile: Input/output error


I caught the first dmesg this time:

Filesystem "dm-0": XFS internal error xfs_btree_check_sblock at line 334 of file 
fs/xfs/xfs_btree.c.  Caller 0xc01b58e1

 [<c0104f6a>] show_trace_log_lvl+0x1a/0x30
 [<c0105c52>] show_trace+0x12/0x20
 [<c0105d15>] dump_stack+0x15/0x20
 [<c01daddf>] xfs_error_report+0x4f/0x60
 [<c01cd736>] xfs_btree_check_sblock+0x56/0xd0
 [<c01b58e1>] xfs_alloc_lookup+0x181/0x390
 [<c01b5b06>] xfs_alloc_lookup_le+0x16/0x20
 [<c01b30c1>] xfs_free_ag_extent+0x51/0x690
 [<c01b4ea4>] xfs_free_extent+0xa4/0xc0
 [<c01bf739>] xfs_bmap_finish+0x119/0x170
 [<c01e3f4a>] xfs_itruncate_finish+0x23a/0x3a0
 [<c02046a2>] xfs_inactive+0x482/0x500
 [<c0210ad4>] xfs_fs_clear_inode+0x34/0xa0
 [<c017d777>] clear_inode+0x57/0xe0
 [<c017d8e5>] generic_delete_inode+0xe5/0x110
 [<c017da77>] generic_drop_inode+0x167/0x1b0
 [<c017cedf>] iput+0x5f/0x70
 [<c01735cf>] do_unlinkat+0xdf/0x140
 [<c0173640>] sys_unlink+0x10/0x20
 [<c01040a4>] syscall_call+0x7/0xb
 ===
xfs_force_shutdown(dm-0,0x8) called from line 4258 of file fs/xfs/xfs_bmap.c. 
Return address = 0xc021101e
Filesystem "dm-0": Corruption of in-memory data detected.  Shutting down 
filesystem: dm-0

Please umount the filesystem, and rectify the problem(s)

so I cd'ed out of /scratch and umounted.

I then tried the xfs_check.

haze:~# xfs_check /dev/video_vg/video_lv
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_check.  If you are unable to mount the filesystem, then use
the xfs_repair -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
haze:~# mount /scratch/
haze:~# umount /scratch/
haze:~# xfs_check /dev/video_vg/video_lv

Message from [EMAIL PROTECTED] at Tue Jun 19 08:47:30 2007 ...
haze kernel: Bad page state in process 'xfs_db'

Message from [EMAIL PROTECTED] at Tue Jun 19 08:47:30 2007 ...
haze kernel: page:c1767bc0 flags:0x80010008 mapping: mapcount:-64 
count:0

Message from [EMAIL PROTECTED] at Tue Jun 19 08:47:30 2007 ...
haze kernel: Trying to fix it up, but a reboot is needed

Message from [EMAIL PROTECTED] at Tue Jun 19 08:47:30 2007 ...
haze kernel: Backtrace:

Message from [EMAIL PROTECTED] at Tue Jun 19 08:47:30 2007 ...
haze kernel: Bad page state in process 'syslogd'

Message from [EMAIL PROTECTED] at Tue Jun 19 08:47:30 2007 ...
haze kernel: page:c1767cc0 flags:0x80010008 mapping: mapcount:-64 
count:0

Message from [EMAIL PROTECTED] at Tue Jun 19 08:47:30 2007 ...
haze kernel: Trying to fix it up, but a reboot is needed

Message from [EMAIL PROTECTED] at Tue Jun 19 08:47:30 2007 ...
haze kernel: Backtrace:

ugh. Try again
haze:~# xfs_check /dev/video_vg/video_lv
haze:~#

whilst running a top reported this as roughly the peak memory usage:
 8759 root  18   0  479m 474m  876 R  2.0 46.9   0:02.49 xfs_db
so it looks like it didn't run out of memory (machine has 1Gb).

xfs freeze/umount problem

2007-06-19 Thread David Greaves

David Chinner wrote:
> FWIW, I'm on record stating that "sync" is not sufficient to quiesce an XFS
> filesystem for a suspend/resume to work safely and have argued that the only
> safe thing to do is freeze the filesystem before suspend and thaw it after
> resume.

Whilst testing a potential bug in another thread I accidentally found that 
unmounting a filesystem that I'd just frozen would hang.


As the saying goes: "Well, duh!!"

I could eventually run an unfreeze but the mount was still hung. This led to an 
unclean shutdown.


OK, it may not be bright but it seems like this shouldn't happen; umount should 
either unfreeze and work or fail ("Attempt to umount a frozen filesystem.") if 
the fs is frozen.
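The suggested umount behaviour, sketched with hypothetical helpers (this is an illustration of the desired semantics, not the real mount(8) or kernel interface):

```python
class Filesystem:
    """Stand-in for a mounted filesystem's freeze/mount state."""
    def __init__(self):
        self.frozen = False
        self.mounted = True

def safe_umount(fs, thaw_first=False):
    """Either refuse cleanly on a frozen fs, or thaw and then unmount -
    never hang."""
    if fs.frozen:
        if not thaw_first:
            raise OSError("Attempt to umount a frozen filesystem.")
        fs.frozen = False          # the xfs_freeze -u equivalent
    fs.mounted = False

fs = Filesystem()
fs.frozen = True
try:
    safe_umount(fs)                # fails with an error instead of hanging
except OSError:
    pass
safe_umount(fs, thaw_first=True)   # or: unfreeze and work
assert not fs.mounted and not fs.frozen
```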


Is this a kernel bug/misfeature or a (u)mount one?
Suggestions as to the best place to report it if not in the cc's?

David








Re: [linux-lvm] 2.6.22-rc4 XFS fails after hibernate/resume

2007-06-18 Thread David Greaves

OK, just a quick ack

When I resumed tonight (having done a freeze/thaw over the suspend) some libata 
errors threw up during the resume and there was an eventual hard hang. Maybe I 
spoke too soon?


I'm going to have to do some more testing...

David Chinner wrote:

On Mon, Jun 18, 2007 at 08:49:34AM +0100, David Greaves wrote:

David Greaves wrote:
So doing:
xfs_freeze -f /scratch
sync
echo platform > /sys/power/disk
echo disk > /sys/power/state
# resume
xfs_freeze -u /scratch

Works (for now - more usage testing tonight)


Verrry interesting.

Good :)



What you were seeing was an XFS shutdown occurring because the free space
btree was corrupted. IOWs, the process of suspend/resume has resulted
in either bad data being written to disk, the correct data not being
written to disk or the cached block being corrupted in memory.

That's the kind of thing I was suspecting, yes.


If you run xfs_check on the filesystem after it has shut down after a resume,
can you tell us if it reports on-disk corruption? Note: do not run xfs_repair
to check this - it does not check the free space btrees; instead it simply
rebuilds them from scratch. If xfs_check reports an error, then run xfs_repair
to fix it up.

OK, I can try this tonight...


FWIW, I'm on record stating that "sync" is not sufficient to quiesce an XFS
filesystem for a suspend/resume to work safely and have argued that the only
safe thing to do is freeze the filesystem before suspend and thaw it after
resume. This is why I originally asked you to test that with the other problem
that you reported. Up until this point in time, there's been no evidence to
prove either side of the argument..

Cheers,

Dave.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-lvm] 2.6.22-rc4 XFS fails after hibernate/resume

2007-06-18 Thread David Greaves

David Greaves wrote:

David Robinson wrote:

David Greaves wrote:

This isn't a regression.

I was seeing these problems on 2.6.21 (but 22 was in -rc so I waited 
to try it).
I tried 2.6.22-rc4 (with Tejun's patches) to see if it had improved - 
no.


Note this is a different (desktop) machine to that involved my recent 
bugs.


The machine will work for days (continually powered up) without a 
problem and then exhibits a filesystem failure within minutes of a 
resume.





OK, that gave me an idea.

Freeze the filesystem
md5sum the lvm
hibernate
resume
md5sum the lvm



So the lvm and below looks OK...

I'll see how it behaves now the filesystem has been frozen/thawed over 
the hibernate...



And it appears to behave well. (A few hours compile/clean cycling kernel builds 
on that filesystem were OK).



Historically I've done:
sync
echo platform > /sys/power/disk
echo disk > /sys/power/state
# resume

and had filesystem corruption (only on this machine, my other hibernating xfs 
machines don't have this problem)


So doing:
xfs_freeze -f /scratch
sync
echo platform > /sys/power/disk
echo disk > /sys/power/state
# resume
xfs_freeze -u /scratch

Works (for now - more usage testing tonight)
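For anyone wanting to script the sequence above, a minimal sketch (the mount point is this thread's; run as root, and note the function name is mine):

```shell
#!/bin/sh
# Sketch of the freeze / suspend / thaw sequence from this thread as one
# script. Mount point defaults to /scratch as used above; adjust to taste.
hibernate_with_freeze() {
    fs=${1:-/scratch}
    xfs_freeze -f "$fs" || return 1      # quiesce XFS, flush its journal
    sync                                 # flush remaining dirty pages
    echo platform > /sys/power/disk
    echo disk > /sys/power/state         # suspends here; returns on resume
    xfs_freeze -u "$fs"                  # thaw after resume
}
```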

David
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: 2.6.22-rc4 XFS fails after hibernate/resume

2007-06-17 Thread David Greaves

Rafael J. Wysocki wrote:

On Saturday, 16 June 2007 21:56, David Greaves wrote:

This isn't a regression.

I was seeing these problems on 2.6.21 (but 22 was in -rc so I waited to try it).
I tried 2.6.22-rc4 (with Tejun's patches) to see if it had improved - no.

Note this is a different (desktop) machine to that involved my recent bugs.

The machine will work for days (continually powered up) without a problem and 
then exhibits a filesystem failure within minutes of a resume.


I know xfs/raid are OK with hibernate. Is lvm?

The root filesystem is xfs on raid1 and that doesn't seem to have any problems.


What is the partition that's showing problems?  How's it set up, on how many
drives etc.?

I did put that in the OP :)
Here's a recap...
/dev/mapper/video_vg-video_lv on /scratch type xfs (rw)

md1 : active raid5 sdd1[0] sda1[2] sdc1[1]
  390716672 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

haze:~# pvdisplay
  --- Physical volume ---
  PV Name   /dev/md1
  VG Name   video_vg
  PV Size   372.62 GB / not usable 3.25 MB
  Allocatable   yes (but full)
  PE Size (KByte)   4096
  Total PE  95389
  Free PE   0
  Allocated PE  95389
  PV UUID   IUig5k-460l-sMZc-23Iz-MMFl-Cfh9-XuBMiq



Also, is the dmesg output below from right after the resume?


It runs OK for a few minutes - just enough to think "hey, maybe it'll work this 
time".  Not more than an hour of normal use.

Then you notice when some app fails because the filesystem went away.
The dmesg comes from that point.

David
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-lvm] 2.6.22-rc4 XFS fails after hibernate/resume

2007-06-17 Thread David Greaves

David Robinson wrote:

David Greaves wrote:

This isn't a regression.

I was seeing these problems on 2.6.21 (but 22 was in -rc so I waited 
to try it).

I tried 2.6.22-rc4 (with Tejun's patches) to see if it had improved - no.

Note this is a different (desktop) machine to that involved my recent 
bugs.


The machine will work for days (continually powered up) without a 
problem and then exhibits a filesystem failure within minutes of a 
resume.


I know xfs/raid are OK with hibernate. Is lvm?


I have LVM working with hibernate w/o any problems (w/ ext3). If there 
were a problem it wouldn't be with LVM but with device-mapper, and I 
doubt there's a problem with either. The stack trace shows that you're 
within XFS code (but it's likely it's hibernate).


Thanks - that's good to know.
The suspicion arises because I have xfs on raid1 as root and have *never* had a 
problem with that filesystem. It's *always* xfs on lvm on raid5. I also have 
another system (previously discussed) that reliably hibernated xfs on raid6.


(Clearly raid5 is in my suspect list)


You can easily check whether it's LVM/device-mapper:

1) check "dmsetup table" - it should be the same before hibernating and 
after resuming.


2) read directly from the LV - ie, "dd if=/dev/mapper/video_vg-video_lv 
of=/dev/null bs=10M count=200".


If dmsetup shows the same info and you can read directly from the LV I 
doubt it would be a LVM/device-mapper problem.


OK, that gave me an idea.

Freeze the filesystem
md5sum the lvm
hibernate
resume
md5sum the lvm

so:


haze:~# xfs_freeze -f /scratch/

Without this sync, the next two md5sums differed..
haze:~# sync
haze:~# dd if=/dev/video_vg/video_lv bs=10M count=200 | md5sum
200+0 records in
200+0 records out
2097152000 bytes (2.1 GB) copied, 41.2495 seconds, 50.8 MB/s
f42539366bb4269623fa4db14e8e8be2  -
haze:~# dd if=/dev/video_vg/video_lv bs=10M count=200 | md5sum
200+0 records in
200+0 records out
2097152000 bytes (2.1 GB) copied, 41.8111 seconds, 50.2 MB/s
f42539366bb4269623fa4db14e8e8be2  -


haze:~# echo platform > /sys/power/disk
haze:~# echo disk > /sys/power/state


haze:~# dd if=/dev/video_vg/video_lv bs=10M count=200 | md5sum
200+0 records in
200+0 records out
2097152000 bytes (2.1 GB) copied, 42.0478 seconds, 49.9 MB/s
f42539366bb4269623fa4db14e8e8be2  -
haze:~# xfs_freeze -u /scratch/
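The same before/after comparison can be wrapped in a small helper (the function name and MiB default are mine; the device path is from this thread):

```shell
# Helper for the before/after comparison above: read the first N MiB of a
# device (or file) and print its md5. Name and defaults are illustrative.
lv_checksum() {
    dd if="$1" bs=1M count="${2:-2000}" 2>/dev/null | md5sum | cut -d' ' -f1
}
# Intended use on this thread's setup:
#   xfs_freeze -f /scratch && sync
#   before=$(lv_checksum /dev/video_vg/video_lv)
#   ... hibernate, resume ...
#   after=$(lv_checksum /dev/video_vg/video_lv)
#   [ "$before" = "$after" ] || echo "corruption below the filesystem"
```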

So the lvm and below looks OK...

I'll see how it behaves now the filesystem has been frozen/thawed over the 
hibernate...


David

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



2.6.22-rc4 XFS fails after hibernate/resume

2007-06-16 Thread David Greaves

This isn't a regression.

I was seeing these problems on 2.6.21 (but 22 was in -rc so I waited to try it).
I tried 2.6.22-rc4 (with Tejun's patches) to see if it had improved - no.

Note this is a different (desktop) machine to that involved my recent bugs.

The machine will work for days (continually powered up) without a problem and 
then exhibits a filesystem failure within minutes of a resume.


I know xfs/raid are OK with hibernate. Is lvm?

The root filesystem is xfs on raid1 and that doesn't seem to have any problems.

System info:

/dev/mapper/video_vg-video_lv on /scratch type xfs (rw)

haze:~# vgdisplay
  --- Volume group ---
  VG Name   video_vg
  System ID
  Formatlvm2
  Metadata Areas1
  Metadata Sequence No  19
  VG Access read/write
  VG Status resizable
  MAX LV0
  Cur LV1
  Open LV   1
  Max PV0
  Cur PV1
  Act PV1
  VG Size   372.61 GB
  PE Size   4.00 MB
  Total PE  95389
  Alloc PE / Size   95389 / 372.61 GB
  Free  PE / Size   0 / 0
  VG UUID   I2gW2x-aHcC-kqzs-Efpd-Q7TE-dkWf-KpHSO7

haze:~# pvdisplay
  --- Physical volume ---
  PV Name   /dev/md1
  VG Name   video_vg
  PV Size   372.62 GB / not usable 3.25 MB
  Allocatable   yes (but full)
  PE Size (KByte)   4096
  Total PE  95389
  Free PE   0
  Allocated PE  95389
  PV UUID   IUig5k-460l-sMZc-23Iz-MMFl-Cfh9-XuBMiq

md1 : active raid5 sdd1[0] sda1[2] sdc1[1]
  390716672 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]



00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge 
(rev 80)

00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
00:0a.0 Mass storage controller: Silicon Image, Inc. SiI 3112 
[SATALink/SATARaid] Serial ATA Controller (rev 02)
00:0b.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit 
Ethernet Controller (rev 12)
00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID 
Controller (rev 80)
00:0f.1 IDE interface: VIA Technologies, Inc. 
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:10.0 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller 
(rev 81)
00:10.1 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller 
(rev 81)
00:10.2 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller 
(rev 81)
00:10.3 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller 
(rev 81)

00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge 
[KT600/K8T800/K8T890 South]
00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 
AC97 Audio Controller (rev 60)
00:11.6 Communication controller: VIA Technologies, Inc. AC'97 Modem Controller 
(rev 80)

00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)
01:00.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 PRO] 
(rev 01)



tail end of info from dmesg:

k_prepare_write+0x272/0x490
 [] xfs_iomap+0x391/0x4b0
 [] xfs_bmap+0x0/0x10
 [] xfs_map_blocks+0x47/0x90
 [] xfs_page_state_convert+0x3dc/0x7b0
 [] xfs_ilock+0x71/0xa0
 [] xfs_iunlock+0x85/0x90
 [] xfs_vm_writepage+0x60/0xf0
 [] __writepage+0x8/0x30
 [] write_cache_pages+0x1ff/0x320
 [] __writepage+0x0/0x30
 [] generic_writepages+0x20/0x30
 [] do_writepages+0x2b/0x50
 [] __filemap_fdatawrite_range+0x72/0x90
 [] xfs_file_fsync+0x0/0x80
 [] filemap_fdatawrite+0x23/0x30
 [] do_fsync+0x4e/0xb0
 [] __do_fsync+0x25/0x40
 [] syscall_call+0x7/0xb
 ===
Filesystem "dm-0": XFS internal error xfs_btree_check_sblock at line 334 of file 
fs/xfs/xfs_btree.c.  Caller 0xc01b27be

 [] xfs_btree_check_sblock+0x5b/0xd0
 [] xfs_alloc_lookup+0x17e/0x390
 [] xfs_alloc_lookup+0x17e/0x390
 [] xfs_alloc_ag_vextent_near+0x59/0xa30
 [] xfs_alloc_ag_vextent+0x8d/0x100
 [] xfs_alloc_vextent+0x223/0x450
 [] xfs_bmap_btalloc+0x400/0x770
 [] xfs_iext_bno_to_ext+0x9d/0x1d0
 [] xfs_bmapi+0x10bd/0x1490
 [] xlog_grant_log_space+0x22e/0x2b0
 [] xfs_log_reserve+0xc0/0xe0
 [] xfs_iomap_write_allocate+0x27f/0x4f0
 [] __block_prepare_write+0x421/0x490
 [] __block_prepare_write+0x272/0x490
 [] xfs_iomap+0x391/0x4b0
 [] xfs_bmap+0x0/0x10
 [] xfs_map_blocks+0x47/0x90
 [] xfs_page_state_convert+0x3dc/0x7b0
 [] xfs_ilock+0x71/0xa0
 [] xfs_iunlock+0x85/0x90
 [] xfs_vm_writepage+0x60/0xf0
 [] __writepage+0x8/0x30
 [] write_cache_pages+0x1ff/0x320
 [] __writepage+0x0/0x30
 [] generic_writepages+0x20/0x30
 [] do_writepages+0x2b/0x50
 [] __filemap_fdatawrite_range+0x72/0x90
 [] xfs_file_fsync+0x0/0x80
 [] filemap_fdatawrite+0x23/0x30
 [] do_fsync+0x4e/0xb0
 [] __do_fsync+0x25/0x40
 [] syscall_call+0x7/0xb
 ===
Filesystem "dm-0": XFS internal error 

Re: limits on raid

2007-06-16 Thread David Greaves

[EMAIL PROTECTED] wrote:

On Sat, 16 Jun 2007, Neil Brown wrote:

I want to test several configurations, from a 45 disk raid6 to a 45 disk 
raid0. at 2-3 days per test (or longer, depending on the tests) this 
becomes a very slow process.
Are you suggesting the code that is written to enhance data integrity is 
optimised (or even touched) to support this kind of test scenario?

Seriously? :)

also, when a rebuild is slow enough (and has enough of a performance 
impact) it's not uncommon to want to operate in degraded mode just long 
enough to get to a maintenance window and then recreate the array and 
reload from backup.


so would mdadm --remove the rebuilding disk help?

David

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: limits on raid

2007-06-16 Thread David Greaves

Neil Brown wrote:

On Friday June 15, [EMAIL PROTECTED] wrote:
 

  As I understand the way
raid works, when you write a block to the array, it will have to read all
the other blocks in the stripe and recalculate the parity and write it out.


Your understanding is incomplete.


Does this help?
[for future reference so you can paste a url and save the typing for code :) ]

http://linux-raid.osdl.org/index.php/Initial_Array_Creation

David



Initial Creation

When mdadm asks the kernel to create a raid array the most noticeable activity 
is what's called the "initial resync".


The kernel takes one (or two for raid6) disks and marks them as 'spare'; it then 
creates the array in degraded mode. It then marks spare disks as 'rebuilding' 
and starts to read from the 'good' disks, calculate the parity and determines 
what should be on any spare disks and then writes it. Once all this is done the 
array is clean and all disks are active.


This can take quite a time and the array is not fully resilient whilst this is 
happening (it is however fully useable).


--assume-clean

Some people have noticed the --assume-clean option in mdadm and speculated that 
this can be used to skip the initial resync. Which it does. But this is a bad 
idea in some cases - and a *very* bad idea in others.


raid5

For raid5 especially it is NOT safe to skip the initial sync. The raid5 
implementation optimises use of the component disks and it is possible for all 
updates to be "read-modify-write" updates which assume the parity is correct. If 
it is wrong, it stays wrong. Then when you lose a drive, the parity blocks are 
wrong so the data you recover using them is wrong. In other words - you will get 
data corruption.


For raid5 on an array with more than 3 drives, if you attempt to write a single 
block, it will:


* read the current value of the block, and the parity block.
* "subtract" the old value of the block from the parity, and "add" the new 
value.

* write out the new data and the new parity.

If the parity was wrong before, it will still be wrong. If you then lose a 
drive, you lose your data.
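A toy demonstration of that point, with one "byte" per disk and XOR standing in for the parity arithmetic (all values made up for illustration):

```shell
# Two data "disks" and a parity value that is deliberately wrong, as if
# the initial sync had been skipped with --assume-clean.
d0=5; d1=9
parity=$(( d0 ^ d1 ^ 1 ))            # correct parity would be d0 ^ d1

# Read-modify-write update of d0: XOR out the old value, XOR in the new.
# Note the existing parity error is carried along untouched.
new=7
parity=$(( parity ^ d0 ^ new ))
d0=$new

# Now lose "disk" d1 and rebuild it from the survivors:
rebuilt=$(( parity ^ d0 ))
echo "real d1=$d1 rebuilt d1=$rebuilt"   # prints: real d1=9 rebuilt d1=8
```

The rebuilt value differs from the real one: that is the silent data corruption described above.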


linear, raid0,1,10

These raid levels do not need an initial sync.

linear and raid0 have no redundancy.

raid1 always writes all data to all disks.

raid10 always writes all data to all relevant disks.


Other raid levels

Probably the most noticeable effect for the other raid levels is that if you 
don't sync first, then every check will find lots of errors. (Of course you 
could 'repair' instead of 'check'. Or do that once. Or something.)


For raid6 it is also safe to not sync first, though with the same caveat. Raid6 
always updates parity by reading all blocks in the stripe that aren't known and 
calculating P and Q. So the first write to a stripe will make P and Q correct 
for that stripe. This is current behaviour. There is no guarantee it will never 
change (so theoretically one day you may upgrade your kernel and suffer data 
corruption on an old raid6 array).


Summary

In summary, it is safe to use --assume-clean on a raid1 or raid10, though a 
"repair" is recommended before too long. For other raid levels it is best avoided.


Potential 'Solutions'

There have been 'solutions' suggested including the use of bitmaps to 
efficiently store 'not yet synced' information about the array. It would be 
possible to have a 'this is not initialised' flag on the array, and if that is 
not set, always do a reconstruct-write rather than a read-modify-write. But the 
first time you have an unclean shutdown you are going to resync all the parity 
anyway (unless you have a bitmap) so you may as well resync at the start. So 
essentially, at the moment, there is no interest in implementing this since the 
added complexity is not justified.


What's the problem anyway?

First of all RAID is all about being safe with your data.

And why is it such a big deal anyway? The initial resync doesn't stop you from 
using the array. If you wanted to put an array into production instantly and 
couldn't afford any slowdown due to resync, then you might want to skip the 
initial resync but is that really likely?


So what is --assume-clean for then?

Disaster recovery. If you want to build an array from components that used to be 
in a raid then this stops the kernel from scribbling on them. As the man page says :


"Use this only if you really know what you are doing."
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3

2007-06-16 Thread David Greaves

Krzysztof Halasa wrote:

David Greaves <[EMAIL PROTECTED]> writes:


How hard would it be to reprogramm the flash?

The flash contains hashes signed by the company's private key.

The kernel contains the public key. It can decrypt the hashes but the
private key isn't available to encrypt them. So although you can put a
new application onto the system, you can't create a signed hash to
write to the flash.


Then how hard would it be to reprogram the flash, to get rid of all
this crap? Or to just put your public key there.

Do they at least use BGA type of flash chips so you can't attach
a clip and have to use something more demanding?


Stop trying to technically crack my 5-minute fag-packet design - that's easy and 
boring :)


Tivo have solved this problem - use their solution - but do it on something more 
general purpose.


Help fix it - the point is more "is this feasible". And if it is, "does it 
matter?"


David


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3

2007-06-16 Thread David Greaves

Krzysztof Halasa wrote:

David Greaves <[EMAIL PROTECTED]> writes:


This 5 minute design undoubtedly has flaws but it shows a direction:
A basically standard 'De11' PC with some flash.
A Tivoised boot system so only signed kernels boot.
A modified kernel that only runs (FOSS) executables whose signed hash
lives in the flash.


How hard would it be to reprogramm the flash?


The flash contains hashes signed by the company's private key.

The kernel contains the public key. It can decrypt the hashes but the private 
key isn't available to encrypt them. So although you can put a new application 
onto the system, you can't create a signed hash to write to the flash.


The kernel only runs the executable if the hash is valid.
You can re-write the kernel to avoid this check - but the hardware is Tivoised - 
so you can't run it.
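In openssl terms the sign/verify split could look like this. A sketch only: the key file names are mine, and nothing here is taken from an actual Tivo implementation.

```shell
# Toy model of the signed-hash scheme described above. The vendor holds
# the private key and signs each binary's hash; the device carries only
# the public key, so it can verify entries but users cannot mint new
# ones. Key file names are hypothetical.
sign_binary()   { openssl dgst -sha256 -sign vendor_priv.pem \
                      -out "$1.sig" "$1"; }
verify_binary() { openssl dgst -sha256 -verify vendor_pub.pem \
                      -signature "$1.sig" "$1"; }
# Vendor side:  sign_binary /bin/app      (needs vendor_priv.pem)
# Device side:  verify_binary /bin/app    (public key only)
```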


I am not suggesting the kernel should go down the GPLV2 route - I am wondering 
if this is a viable scenario or one of Schneier's "movie-plot" threats :)


David

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3

2007-06-16 Thread David Greaves

Krzysztof Halasa wrote:

David Greaves [EMAIL PROTECTED] writes:


This 5 minute design undoubtedly has flaws but it shows a direction:
A basically standard 'De11' PC with some flash.
A Tivoised boot system so only signed kernels boot.
A modified kernel that only runs (FOSS) executables whose signed hash
lives in the flash.


How hard would it be to reprogramm the flash?


The flash contains hashes signed by the companies private key.

The kernel contains the public key. It can decrypt the hashes but the private 
key isn't available to encrypt them. So although you can put a new application 
onto the system, you can't create a signed hash to write to the flash.


The kernel only runs the executable if the hash is valid.
You can re-write the kernel to avoid this check - but the hardware is Tivoised - 
so you can't run it.


I am not suggesting the kernel should go down the GPLV2 route - I am wondering 
if this is a viable scenario or one of Schneiers'  movie-plot threats :)


David

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3

2007-06-16 Thread David Greaves

Krzysztof Halasa wrote:

David Greaves [EMAIL PROTECTED] writes:


How hard would it be to reprogramm the flash?

The flash contains hashes signed by the companies private key.

The kernel contains the public key. It can decrypt the hashes but the
private key isn't available to encrypt them. So although you can put a
new application onto the system, you can't create a signed hash to
write to the flash.



Then how hard would it be to reprogram the flash, to get rid of all
this crap? Or to just put your public key there.

Do they at least use BGA type of flash chips so you can't attach
a clip and have to use something more demanding?


Stop trying to technically crack my 5-minute fag-packet design - that's easy and 
boring :)


Tivo have solved this problem - use their solution - but do it on something more 
general purpose.


Help fix it - the point is more is this feasible. And if it is, does it 
matter?


David


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: limits on raid

2007-06-16 Thread David Greaves

Neil Brown wrote:

On Friday June 15, [EMAIL PROTECTED] wrote:
 

  As I understand the way
raid works, when you write a block to the array, it will have to read all
the other blocks in the stripe and recalculate the parity and write it out.


Your understanding is incomplete.


Does this help?
[for future reference so you can paste a url and save the typing for code :) ]

http://linux-raid.osdl.org/index.php/Initial_Array_Creation

David



Initial Creation

When mdadm asks the kernel to create a raid array the most noticeable activity 
is what's called the initial resync.


The kernel takes one (or two for raid6) disks and marks them as 'spare'; it then 
creates the array in degraded mode. It then marks spare disks as 'rebuilding' 
and starts to read from the 'good' disks, calculate the parity and determines 
what should be on any spare disks and then writes it. Once all this is done the 
array is clean and all disks are active.


This can take quite a time and the array is not fully resilient whilst this is 
happening (it is however fully useable).


--assume-clean

Some people have noticed the --assume-clean option in mdadm and speculated that 
this can be used to skip the initial resync. Which it does. But this is a bad 
idea in some cases - and a *very* bad idea in others.


raid5

For raid5 especially it is NOT safe to skip the initial sync. The raid5 
implementation optimises use of the component disks and it is possible for all 
updates to be read-modify-write updates which assume the parity is correct. If 
it is wrong, it stays wrong. Then when you lose a drive, the parity blocks are 
wrong so the data you recover using them is wrong. In other words - you will get 
data corruption.


For raid5 on an array with more than 3 drives, if you attempt to write a single 
block, it will:


* read the current value of the block, and the parity block.
* subtract the old value of the block from the parity, and add the new value.
* write out the new data and the new parity.

If the parity was wrong before, it will still be wrong. If you then lose a 
drive, you lose your data.
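For illustration only (a minimal Python sketch, not mdadm or kernel code), the 
read-modify-write steps above amount to an XOR update of the parity block. Note 
that the other data blocks in the stripe are never read, which is exactly why 
stale parity stays stale:

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR together byte blocks of equal length."""
    return bytes(reduce(lambda a, b: [x ^ y for x, y in zip(a, b)], blocks))

def rmw_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """RAID5-style read-modify-write: XOR the old block out of the parity
    and XOR the new block in.  The other stripe members are not consulted,
    so if old_parity was already wrong, the result is wrong too."""
    return xor_blocks(old_parity, old_data, new_data)
```

If old_parity really is the XOR of all data blocks, rmw_parity gives the same 
answer as recomputing parity from scratch; if it isn't, the error persists - 
which is the corruption risk described above.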


linear, raid0,1,10

These raid levels do not need an initial sync.

linear and raid0 have no redundancy.

raid1 always writes all data to all disks.

raid10 always writes all data to all relevant disks.


Other raid levels

Probably the most noticeable effect for the other raid levels is that if you 
don't sync first, then every check will find lots of errors. (Of course you 
could 'repair' instead of 'check'. Or do that once. Or something.)


For raid6 it is also safe to not sync first, though with the same caveat. Raid6 
always updates parity by reading all blocks in the stripe that aren't known and 
calculating P and Q. So the first write to a stripe will make P and Q correct 
for that stripe. This is current behaviour. There is no guarantee it will never 
change (so theoretically one day you may upgrade your kernel and suffer data 
corruption on an old raid6 array).
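As a sketch (Python, illustrative only - real raid6 also maintains a 
Reed-Solomon Q block, omitted here), a reconstruct-write recomputes parity from 
every data block in the stripe, so whatever parity was on disk before is simply 
overwritten:

```python
from functools import reduce

def reconstruct_parity(stripe_data: list) -> bytes:
    """Recompute XOR parity (the P block) from all data blocks in the
    stripe.  The previous on-disk parity is never read, so a stale P
    becomes correct after the first write that touches the stripe."""
    return bytes(reduce(lambda a, b: [x ^ y for x, y in zip(a, b)], stripe_data))
```

This is the contrast with the raid5 read-modify-write path above: here the old 
parity cannot contaminate the new one, at the cost of reading the whole stripe.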


Summary

In summary, it is safe to use --assume-clean on a raid1 or raid10, though a 
repair is recommended before too long. For other raid levels it is best avoided.


Potential 'Solutions'

There have been 'solutions' suggested including the use of bitmaps to 
efficiently store 'not yet synced' information about the array. It would be 
possible to have a 'this is not initialised' flag on the array, and if that is 
not set, always do a reconstruct-write rather than a read-modify-write. But the 
first time you have an unclean shutdown you are going to resync all the parity 
anyway (unless you have a bitmap) so you may as well resync at the start. So 
essentially, at the moment, there is no interest in implementing this since the 
added complexity is not justified.


What's the problem anyway?

First of all RAID is all about being safe with your data.

And why is it such a big deal anyway? The initial resync doesn't stop you from 
using the array. If you wanted to put an array into production instantly and 
couldn't afford any slowdown due to resync, then you might want to skip the 
initial resync but is that really likely?


So what is --assume-clean for then?

Disaster recovery. If you want to build an array from components that used to be 
in a raid, then this stops the kernel from scribbling on them. As the man page says:


Use this only if you really know what you are doing.


Re: limits on raid

2007-06-16 Thread David Greaves

[EMAIL PROTECTED] wrote:

On Sat, 16 Jun 2007, Neil Brown wrote:

I want to test several configurations, from a 45 disk raid6 to a 45 disk 
raid0. at 2-3 days per test (or longer, depending on the tests) this 
becomes a very slow process.
Are you suggesting the code that is written to enhance data integrity is 
optimised (or even touched) to support this kind of test scenario?

Seriously? :)

also, when a rebuild is slow enough (and has enough of a performance 
impact) it's not uncommon to want to operate in degraded mode just long 
enough to get to a maintenance window and then recreate the array and 
reload from backup.


so would mdadm --remove the rebuilding disk help?

David



2.6.22-rc4 XFS fails after hibernate/resume

2007-06-16 Thread David Greaves

This isn't a regression.

I was seeing these problems on 2.6.21 (but 22 was in -rc so I waited to try it).
I tried 2.6.22-rc4 (with Tejun's patches) to see if it had improved - no.

Note this is a different (desktop) machine to that involved my recent bugs.

The machine will work for days (continually powered up) without a problem and 
then exhibits a filesystem failure within minutes of a resume.


I know xfs/raid are OK with hibernate. Is lvm?

The root filesystem is xfs on raid1 and that doesn't seem to have any problems.

System info:

/dev/mapper/video_vg-video_lv on /scratch type xfs (rw)

haze:~# vgdisplay
  --- Volume group ---
  VG Name   video_vg
  System ID
  Formatlvm2
  Metadata Areas1
  Metadata Sequence No  19
  VG Access read/write
  VG Status resizable
  MAX LV0
  Cur LV1
  Open LV   1
  Max PV0
  Cur PV1
  Act PV1
  VG Size   372.61 GB
  PE Size   4.00 MB
  Total PE  95389
  Alloc PE / Size   95389 / 372.61 GB
  Free  PE / Size   0 / 0
  VG UUID   I2gW2x-aHcC-kqzs-Efpd-Q7TE-dkWf-KpHSO7

haze:~# pvdisplay
  --- Physical volume ---
  PV Name   /dev/md1
  VG Name   video_vg
  PV Size   372.62 GB / not usable 3.25 MB
  Allocatable   yes (but full)
  PE Size (KByte)   4096
  Total PE  95389
  Free PE   0
  Allocated PE  95389
  PV UUID   IUig5k-460l-sMZc-23Iz-MMFl-Cfh9-XuBMiq

md1 : active raid5 sdd1[0] sda1[2] sdc1[1]
  390716672 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]



00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge 
(rev 80)

00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
00:0a.0 Mass storage controller: Silicon Image, Inc. SiI 3112 
[SATALink/SATARaid] Serial ATA Controller (rev 02)
00:0b.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit 
Ethernet Controller (rev 12)
00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID 
Controller (rev 80)
00:0f.1 IDE interface: VIA Technologies, Inc. 
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:10.0 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller 
(rev 81)
00:10.1 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller 
(rev 81)
00:10.2 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller 
(rev 81)
00:10.3 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller 
(rev 81)

00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge 
[KT600/K8T800/K8T890 South]
00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 
AC97 Audio Controller (rev 60)
00:11.6 Communication controller: VIA Technologies, Inc. AC'97 Modem Controller 
(rev 80)

00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)
01:00.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 PRO] 
(rev 01)



tail end of info from dmesg:

k_prepare_write+0x272/0x490
 [c01e7c81] xfs_iomap+0x391/0x4b0
 [c020e3c0] xfs_bmap+0x0/0x10
 [c0207067] xfs_map_blocks+0x47/0x90
 [c020847c] xfs_page_state_convert+0x3dc/0x7b0
 [c01dff61] xfs_ilock+0x71/0xa0
 [c01dfed5] xfs_iunlock+0x85/0x90
 [c0208980] xfs_vm_writepage+0x60/0xf0
 [c014cd78] __writepage+0x8/0x30
 [c014d1bf] write_cache_pages+0x1ff/0x320
 [c014cd70] __writepage+0x0/0x30
 [c014d300] generic_writepages+0x20/0x30
 [c014d33b] do_writepages+0x2b/0x50
 [c0148592] __filemap_fdatawrite_range+0x72/0x90
 [c020b260] xfs_file_fsync+0x0/0x80
 [c0148893] filemap_fdatawrite+0x23/0x30
 [c018691e] do_fsync+0x4e/0xb0
 [c01869a5] __do_fsync+0x25/0x40
 [c0103fb4] syscall_call+0x7/0xb
 ===
Filesystem dm-0: XFS internal error xfs_btree_check_sblock at line 334 of file 
fs/xfs/xfs_btree.c.  Caller 0xc01b27be

 [c01cb73b] xfs_btree_check_sblock+0x5b/0xd0
 [c01b27be] xfs_alloc_lookup+0x17e/0x390
 [c01b27be] xfs_alloc_lookup+0x17e/0x390
 [c01b0d19] xfs_alloc_ag_vextent_near+0x59/0xa30
 [c01b177d] xfs_alloc_ag_vextent+0x8d/0x100
 [c01b1f93] xfs_alloc_vextent+0x223/0x450
 [c01bf7d0] xfs_bmap_btalloc+0x400/0x770
 [c01e183d] xfs_iext_bno_to_ext+0x9d/0x1d0
 [c01c483d] xfs_bmapi+0x10bd/0x1490
 [c01edace] xlog_grant_log_space+0x22e/0x2b0
 [c01edf60] xfs_log_reserve+0xc0/0xe0
 [c01e918f] xfs_iomap_write_allocate+0x27f/0x4f0
 [c0188861] __block_prepare_write+0x421/0x490
 [c01886b2] __block_prepare_write+0x272/0x490
 [c01e7c81] xfs_iomap+0x391/0x4b0
 [c020e3c0] xfs_bmap+0x0/0x10
 [c0207067] xfs_map_blocks+0x47/0x90
 [c020847c] xfs_page_state_convert+0x3dc/0x7b0
 [c01dff61] xfs_ilock+0x71/0xa0
 [c01dfed5] xfs_iunlock+0x85/0x90
 [c0208980] xfs_vm_writepage+0x60/0xf0
 [c014cd78] __writepage+0x8/0x30
 [c014d1bf] write_cache_pages+0x1ff/0x320
 

2.6.22-rc4 hibernate disables skge wol

2007-06-15 Thread David Greaves

I've started a new thread here since the old one got somewhat hijacked.

Rafael J. Wysocki wrote:
> On Friday, 1 June 2007 23:23, David Greaves wrote:
>> Not a regression though, it does it in 2.6.21
>>
>> If I cause the system to save state to disk then whilst off it no longer
>> responds to g-wol.
>
> Can you please try with the hibernation and suspend patch series from
>
> http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.22-rc3/patches/
>
> applied?
>
> Greetings,
> Rafael
>
>
Now, IIRC, some time ago you asked me to try some patches :)

So I applied them (and Tejun's fixes as per the old thread) to rc4 (which seems 
to include a couple already).


Hibernate/resume works but although WOL works on an init 0, it doesn't work on a 
hibernated system :(


David



Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3

2007-06-15 Thread David Greaves

Linus Torvalds wrote:


On Fri, 15 Jun 2007, David Greaves wrote:

Surely it's more:
  bad == go away and don't use future improvements to our software anymore
please.
??


Well, with the understanding that I don't think that what Tivo did was bad 
in the first place, let me tackle that question *anyway*.


The answer is: Not necessarily.

I do agree with what you say here. Maybe a summary:
Babies, bathwater...
When you have a hammer (license) everything looks like a nail...

See? You don't actually have to like Tivo to see downsides to trying to 
stop them. Because these kinds of things have consequences *outside* of 
just stopping Tivo.


My concern is around embedded type systems and maybe even the 'trusted' 
frameworks etc.


I _think_ I can see a completely opensource system that the end user cannot 
modify _in any way_. Which kinda defeats the point (to me) of opensource.


This 5 minute design undoubtedly has flaws but it shows a direction:
A basically standard 'De11' PC with some flash.
A Tivoised boot system so only signed kernels boot.
A modified kernel that only runs (FOSS) executables whose signed hash lives in 
the flash.


Do we (you) _want_ to prevent this?

Do we trust in 'the market' to prevent this?

Do we use license tools?

David



Re: [PATCH] block: always requeue !fs requests at the front

2007-06-15 Thread David Greaves

David, please test this.  Jens, does it look okay?


Phew!

Works for me.

I applied it to 2.6.22-rc4 (along with 
sata_promise_use_TF_interface_for_polling_NODATA_commands.patch) hibernate and 
resume worked.


Thanks for digging it out Tejun (and everyone else!) :)

David


Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3

2007-06-15 Thread David Greaves

Daniel Hazelton wrote:

Now for a different PoV:
Do I think Tivoisation is bad for the community ?
Of course I think it is but your mileage may vary.


And I happen to agree with you. What I disagree with is taking steps to 
make "bad == illegal". I also have a problem with doing things that force my 
viewpoint on other people.


Surely it's more:
  bad == go away and don't use future improvements to our software anymore 
please.
??

*If* you think it's bad (Linus doesn't as far as the kernel goes) then isn't it 
reasonable to exclude 'bad for the community' from the community?


This isn't retroactive - they can continue to use any V2 software they had, they 
wouldn't be able to use V3 developments.


That seems to me to be a very, very reasonable thing to do (and very much *not* 
bad == illegal IMHO)


David

PS well, I was just seeing if anyone had fixed my libata/md bug yet but this 
seemed more interesting.

PPS and Tejun has, I'm off...




Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)

2007-06-14 Thread David Greaves

Tejun Heo wrote:

They're waiting for the commands they issued to complete.  ata_aux is
trying to revalidate the scsi device after libata EH finished waking up
the port and hibernate is trying to resume scsi disk device.  ata_aux is
issuing either TEST UNIT READY or START STOP.  hibernate is issuing
START STOP.

This can be caused by one of the followings.

1. SCSI EH thread (ATA EH runs off it) for the SCSI device hasn't
finished yet.  All commands are deferred while EH is in progress.

2. request_queue is stuck - somehow somebody forgot to kick the queue at
some point.

3. command is stuck somewhere in SCSI/ATA land.

#1 doesn't seem to be the case as all scsi_eh threads seems idle.  I'm
looking at the code but can't find anything which could cause #2 or #3.
 Also, these code paths are traveled really frequently.

I'm also trying to reproduce the problem here with xfs over RAID-6 array
but haven't been successful yet.

David, do you store the hibernation image on the RAID-6 array?

No, swap is on a pata disk.


 Can you post the captured kernel log when it locks up?


Sure... this was still on the serial terminal screen from the sysrq-t trace from 
this morning:


[run hibernate script here]

swsusp: Basic memory bitmaps created
Stopping tasks ... done.
Shrinking memory... done (0 pages freed)
Freed 0 kbytes in 0.04 seconds (0.00 MB/s)
sd 5:0:0:0: [sdf] Synchronizing SCSI cache
sd 4:0:0:0: [sde] Synchronizing SCSI cache
sd 3:0:0:0: [sdd] Synchronizing SCSI cache
sd 2:0:0:0: [sdc] Synchronizing SCSI cache
sd 1:0:0:0: [sdb] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Synchronizing SCSI cache
pnp: Device 00:09 disabled.
pnp: Device 00:08 activated.
pnp: Device 00:09 activated.
pnp: Failed to activate device 00:0a.
pnp: Failed to activate device 00:0b.
ATA: abnormal status 0x7F on port 0x0001a407
ATA: abnormal status 0x7F on port 0x0001a407
ATA: abnormal status 0x7F on port 0x0001b007
ATA: abnormal status 0x7F on port 0x0001b007
ata5.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata5.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata5.00: configured for UDMA/133
sd 0:0:0:0: [sda] Starting disk
ata6.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
sd 1:0:0:0: [sdb] Starting disk
sd 2:0:0:0: [sdc] Starting disk
ata6.00: ata_hpa_resize 1: sectors = 781422768, hpa_sectors = 781422768
ata6.00: configured for UDMA/133
sd 3:0:0:0: [sdd] Starting disk
sd 4:0:0:0: [sde] Starting disk
sd 5:0:0:0: [sdf] Starting disk
sd 4:0:0:0: [sde] 490234752 512-byte hardware sectors (251000 MB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO 
or FUA

sd 5:0:0:0: [sdf] 781422768 512-byte hardware sectors (400088 MB)
sd 5:0:0:0: [sdf] Write Protect is off
sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO 
or FUA
Saving image data pages (36338 pages) ...  19%<6>skge eth0: Link is up at 1000 
Mbps, full duplex, flow control both

done
Wrote 145352 kbytes in 8.49 seconds (17.12 MB/s)
S|
md: stopping all md devices.
md: md0 still in use.
sd 5:0:0:0: [sdf] Synchronizing SCSI cache
sd 5:0:0:0: [sdf] Stopping disk
sd 4:0:0:0: [sde] Synchronizing SCSI cache
sd 4:0:0:0: [sde] Stopping disk
sd 3:0:0:0: [sdd] Synchronizing SCSI cache
sd 3:0:0:0: [sdd] Stopping disk
sd 2:0:0:0: [sdc] Synchronizing SCSI cache
sd 2:0:0:0: [sdc] Stopping disk
sd 1:0:0:0: [sdb] Synchronizing SCSI cache
sd 1:0:0:0: [sdb] Stopping disk
sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Stopping disk
Shutdown: hdb
Shutdown: hda
ACPI: PCI interrupt for device :00:09.0 disabled

[power off/on]

Linux version 2.6.21-g9666f400-dirty ([EMAIL PROTECTED]) (gcc version 3.3.5 
(Debian 1:3.3.5-13)) #23 Wed Jun 13 22:51:26 BST 2007

BIOS-provided physical RAM map:
 BIOS-e820:  - 0009c400 (usable)
 BIOS-e820: 0009c400 - 000a (reserved)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 3fffc000 (usable)
 BIOS-e820: 3fffc000 - 3000 (ACPI data)
 BIOS-e820: 3000 - 4000 (ACPI NVS)
 BIOS-e820: fec0 - fec01000 (reserved)
 BIOS-e820: fee0 - fee01000 (reserved)
 BIOS-e820:  - 0001 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
Zone PFN ranges:
  DMA 0 -> 4096
  Normal   4096 ->   229376
  HighMem229376 ->   262140
early_node_map[1] active PFN ranges
0:0 ->   262140
DMI 2.3 present.
ACPI: RSDP 000F62A0, 0014 (r0 ASUS  )
ACPI: RSDT 3FFFC000, 0030 (r1 ASUS   A7V600   42302E31 MSFT 31313031)
ACPI: FACP 3FFFC0B2, 0074 (r1 ASUS   A7V600   42302E31 MSFT 31313031)
ACPI: DSDT 3FFFC126, 2C4F (r1   ASUS A7V600   1000 MSFT  10B)
ACPI: FACS 3000, 0040
ACPI: BOOT 3FFFC030, 0028 (r1 ASUS   A7V600   42302E31 MSFT 31313031)
ACPI: APIC 3FFFC058, 005A (r1 ASUS   A7V600   


Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)

2007-06-13 Thread David Greaves

Linus Torvalds wrote:


On Wed, 13 Jun 2007, David Greaves wrote:

git-bisect bad
9666f4009c22f6520ac3fb8a19c9e32ab973e828 is first bad commit
commit 9666f4009c22f6520ac3fb8a19c9e32ab973e828
Author: Tejun Heo <[EMAIL PROTECTED]>
Date:   Fri May 4 21:27:47 2007 +0200

libata: reimplement suspend/resume support using sdev->manage_start_stop

Good.


Ok, good. So the bug is apparently in the generic SCSI layer start/stop
handling. I'm not entirely surprised, most people would never have 
triggered it (I _think_ it's disabled by default for all devices, and that 
the libata-scsi.c change was literally the first thing to ever enable it 
by default for anything!)



So here's a sysrq-t from a failed resume. Ask if you'd like anything else...


I'm not seeing anything really obvious. The traces would probably look 
better if you enabled CONFIG_FRAME_POINTER, though. That should cut down 
on some of the noise and make the traces a bit more readable.


I can do that...

"hibernate" is definitely stuck on the new code: it's in the 
"sd_start_stop_device()" call-chain, but I note that ata_aux at the same 
time is also doing some sd_spinup_disk logic as part of rescanning. Maybe 
that's part of the confusion: trying to rescan the bus at the same time 
upper layers (who already *know* the disks that are there) are trying to 
spin up the devices.


Tejun? Jeff?


SysRq : Show State

 freesibling
  task PCstack   pid father child younger older
init  D 28D10C50 0 1  0 (NOTLB)
   c1941ea0 0082 46706775 28d10c50 46706775 28d10c50 1000 f64b4000
   c1941e80 c04250e0 d5151017 0018 2e2e  f7cb15b0 c192eb3c
   0073 0207 d5157d78 0018 c1941ea0   c1941f08
Call Trace:
 [] refrigerator+0x3f/0x50
 [] get_signal_to_deliver+0x226/0x230
 [] do_signal+0x5b/0x120
 [] do_notify_resume+0x3d/0x40
 [] work_notifysig+0x13/0x19
 ===
kthreadd  S C192E530 0 2  0 (L-TLB)
   c1943fd0 0046  c192e530 c01175f0  f6967ed4 
   0003 0292 f6967eb8  c1943fc0 0292 f7122090 c192e63c
   c1943fc0 0060 d3a629f3 000b c1943fd0 c042d298  
Call Trace:
 [] kthreadd+0x74/0xa0
 [] kernel_thread_helper+0x7/0x4c
 ===
ksoftirqd/0   S C04903C8 0 3  2 (L-TLB)
   c1945fb0 0046  c04903c8 c1945f70 0046 c1932550 c192e140
   c1945f80 c011ee7c c6c1e024  c1945fa0 c011ec21 f6ba0ad0 c192e13c
   c1945fa0 0db7 1ba6eda3 0009 c1945fb0  c011ef80 fffc
Call Trace:
 [] ksoftirqd+0x7b/0x90
 [] kthread+0x67/0x70
 [] kernel_thread_helper+0x7/0x4c
 ===
watchdog/0S C0364CA5 0 4  2 (L-TLB)
   c1947fb0 0046 c1943f70 c0364ca5 c1947f70 0292 c048b0e0 c1932a50
   ac6b4e00 11ef f7d635a7 0008 c1947fb0  c1932550 c1932b5c
   c1947fa0 0a3e 7beb1af2 0004 c1947fb0  c0140310 fffc
Call Trace:
 [] watchdog+0x4e/0x70
 [] kthread+0x67/0x70
 [] kernel_thread_helper+0x7/0x4c
 ===
events/0  R running 0 5  2 (L-TLB)
khelper   S  0 6  2 (L-TLB)
   c194bf60 0046   c194bf20 0001 f6bf8160 c0127a40
   c194bf30 c0127a67 f6bf8160 c1914c20 c194bf60 c0127ebd f65890b0 c193215c
   ad284200 08d8 0b3c4782 000f 0246 c1914c20 c194bf88 c1914c28
Call Trace:
 [] worker_thread+0xec/0xf0
 [] kthread+0x67/0x70
 [] kernel_thread_helper+0x7/0x4c
 ===
kblockd/0 S F7EA2438 035  2 (L-TLB)
   c19a7f60 0046 c19146e0 f7ea2438 c19a7f20 c021ac9c c19146e0 c021acc0
   c19a7f30 c021acce 2bfa8b70 0009 c19a7f60 c0127ebd c19b3a50 c19616dc
   006e 0047 e0f8a696 0010 0246 c19146e0 c19a7f88 c19146e8
Call Trace:
 [] worker_thread+0xec/0xf0
 [] kthread+0x67/0x70
 [] kernel_thread_helper+0x7/0x4c
 ===
kacpidS C192E530 036  2 (L-TLB)
   c19a9f60 0046 c048b0e0 c192e530 c19a9f20 c0116be1 c192e530 7c841b53
   c19a9f40 c0116d2b 7c842264 0004 0087  c192e530 c19611dc
   0078 013f 7c8423a7 0004 0246 c19145e0 c19a9f88 c19145e8
Call Trace:
 [] worker_thread+0xec/0xf0
 [] kthread+0x67/0x70
 [] kernel_thread_helper+0x7/0x4c
 ===
kacpi_notify  S C192E530 037  2 (L-TLB)
   c19abf60 0046 c048b0e0 c192e530 c19abf20 c0116be1 c192e530 7c844295
   c19abf40 c0116d2b 7cc769c0 0004 015e  c1932050 c1969b3c
   0078 0328 7cc76cfd 0004 0246 c19145a0 c19abf88 c19145a8
Call Trace:
 [] worker_thread+0xec/0xf0
 [] kthread+0x67/0x70
 [] kernel_thread_helper+0x7/0x4c
 ===
ata/0 S C192E530 0   121  2 (L-TLB)
   c19f5f60 0046 00

Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)

2007-06-13 Thread David Greaves

Linus Torvalds wrote:


On Wed, 13 Jun 2007, David Greaves wrote:

git-bisect bad
9666f4009c22f6520ac3fb8a19c9e32ab973e828 is first bad commit
commit 9666f4009c22f6520ac3fb8a19c9e32ab973e828
Author: Tejun Heo [EMAIL PROTECTED]
Date:   Fri May 4 21:27:47 2007 +0200

libata: reimplement suspend/resume support using sdev->manage_start_stop

Good.


Ok, good. So the bug is apparently in the generic SCSI layer start/stop
handling. I'm not entirely surprised, most people would never have 
triggered it (I _think_ it's disabled by default for all devices, and that 
the libata-scsi.c change was literally the first thing to ever enable it 
by default for anything!)
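The bisect workflow quoted above can be reproduced end-to-end on a toy repository. The sketch below is purely illustrative (the repository, file names, and "bug" are fabricated); it is not the actual libata bisection, just the same `git bisect run` mechanics driven by an exit code:

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
git config user.email tester@example.com
git config user.name tester
# Build five commits; the "regression" lands in commit 4.
for i in 1 2 3 4 5; do
  echo "rev $i" > file
  if [ "$i" -ge 4 ]; then echo "bug" >> file; fi
  git add file
  git commit -qm "commit $i"
done
# Mark HEAD bad and the root commit good, then let bisect drive the search.
git bisect start HEAD "$(git rev-list --max-parents=0 HEAD)"
# The run command exits 1 (bad) when the bug is present, 0 (good) otherwise.
git bisect run sh -c '! grep -q bug file' > bisect.out 2>&1
grep "first bad commit" bisect.out
git bisect reset >/dev/null
```

`git bisect run` automates exactly what was done by hand in the thread: test, mark good/bad, repeat until the first bad commit is isolated.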



So here's a sysrq-t from a failed resume. Ask if you'd like anything else...


I'm not seeing anything really obvious. The traces would probably look 
better if you enabled CONFIG_FRAME_POINTER, though. That should cut down 
on some of the noise and make the traces a bit more readable.


I can do that...

hibernate is definitely stuck on the new code: it's in the 
sd_start_stop_device() call-chain, but I note that ata_aux at the same 
time is also doing some sd_spinup_disk logic as part of rescanning. Maybe 
that's part of the confusion: trying to rescan the bus at the same time 
upper layers (who already *know* the disks that are there) are trying to 
spin up the devices.


Tejun? Jeff?


SysRq : Show State

 freesibling
  task PCstack   pid father child younger older
init  D 28D10C50 0 1  0 (NOTLB)
   c1941ea0 0082 46706775 28d10c50 46706775 28d10c50 1000 f64b4000
   c1941e80 c04250e0 d5151017 0018 2e2e  f7cb15b0 c192eb3c
   0073 0207 d5157d78 0018 c1941ea0   c1941f08
Call Trace:
 [c013b1ff] refrigerator+0x3f/0x50
 [c0124646] get_signal_to_deliver+0x226/0x230
 [c0103cab] do_signal+0x5b/0x120
 [c0103dad] do_notify_resume+0x3d/0x40
 [c0103f6e] work_notifysig+0x13/0x19
 ===
kthreadd  S C192E530 0 2  0 (L-TLB)
   c1943fd0 0046  c192e530 c01175f0  f6967ed4 
   0003 0292 f6967eb8  c1943fc0 0292 f7122090 c192e63c
   c1943fc0 0060 d3a629f3 000b c1943fd0 c042d298  
Call Trace:
 [c012b454] kthreadd+0x74/0xa0
 [c01049fb] kernel_thread_helper+0x7/0x4c
 ===
ksoftirqd/0   S C04903C8 0 3  2 (L-TLB)
   c1945fb0 0046  c04903c8 c1945f70 0046 c1932550 c192e140
   c1945f80 c011ee7c c6c1e024  c1945fa0 c011ec21 f6ba0ad0 c192e13c
   c1945fa0 0db7 1ba6eda3 0009 c1945fb0  c011ef80 fffc
Call Trace:
 [c011effb] ksoftirqd+0x7b/0x90
 [c012b247] kthread+0x67/0x70
 [c01049fb] kernel_thread_helper+0x7/0x4c
 ===
watchdog/0S C0364CA5 0 4  2 (L-TLB)
   c1947fb0 0046 c1943f70 c0364ca5 c1947f70 0292 c048b0e0 c1932a50
   ac6b4e00 11ef f7d635a7 0008 c1947fb0  c1932550 c1932b5c
   c1947fa0 0a3e 7beb1af2 0004 c1947fb0  c0140310 fffc
Call Trace:
 [c014035e] watchdog+0x4e/0x70
 [c012b247] kthread+0x67/0x70
 [c01049fb] kernel_thread_helper+0x7/0x4c
 ===
events/0  R running 0 5  2 (L-TLB)
khelper   S  0 6  2 (L-TLB)
   c194bf60 0046   c194bf20 0001 f6bf8160 c0127a40
   c194bf30 c0127a67 f6bf8160 c1914c20 c194bf60 c0127ebd f65890b0 c193215c
   ad284200 08d8 0b3c4782 000f 0246 c1914c20 c194bf88 c1914c28
Call Trace:
 [c012808c] worker_thread+0xec/0xf0
 [c012b247] kthread+0x67/0x70
 [c01049fb] kernel_thread_helper+0x7/0x4c
 ===
kblockd/0 S F7EA2438 035  2 (L-TLB)
   c19a7f60 0046 c19146e0 f7ea2438 c19a7f20 c021ac9c c19146e0 c021acc0
   c19a7f30 c021acce 2bfa8b70 0009 c19a7f60 c0127ebd c19b3a50 c19616dc
   006e 0047 e0f8a696 0010 0246 c19146e0 c19a7f88 c19146e8
Call Trace:
 [c012808c] worker_thread+0xec/0xf0
 [c012b247] kthread+0x67/0x70
 [c01049fb] kernel_thread_helper+0x7/0x4c
 ===
kacpidS C192E530 036  2 (L-TLB)
   c19a9f60 0046 c048b0e0 c192e530 c19a9f20 c0116be1 c192e530 7c841b53
   c19a9f40 c0116d2b 7c842264 0004 0087  c192e530 c19611dc
   0078 013f 7c8423a7 0004 0246 c19145e0 c19a9f88 c19145e8
Call Trace:
 [c012808c] worker_thread+0xec/0xf0
 [c012b247] kthread+0x67/0x70
 [c01049fb] kernel_thread_helper+0x7/0x4c
 ===
kacpi_notify  S C192E530 037  2 (L-TLB)
   c19abf60 0046 c048b0e0 c192e530 c19abf20 c0116be1 c192e530 7c844295
   c19abf40 c0116d2b 7cc769c0 0004 015e  c1932050 c1969b3c
   0078 0328 7cc76cfd 0004 0246 c19145a0 c19abf88 c19145a8
Call Trace:
 [c012808c] worker_thread+0xec/0xf0
 [c012b247] kthread

Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)

2007-06-12 Thread David Greaves

Pavel Machek wrote:

Hi!

cu:~# mount -oremount,ro /huge
cu:~# /usr/net/bin/hibernate
[this works and resumes]

cu:~# mount -oremount,rw /huge
cu:~# /usr/net/bin/hibernate
[this works and resumes too !]

cu:~# touch /huge/tst
cu:~# /usr/net/bin/hibernate
[but this doesn't even hibernate]


This is very probably separate problem... and you should have enough
data in dmesg to do something with it.


What makes you say it's a different problem - it's hanging at the same point 
visually - it's just that one is pre suspend, one is post suspend.


It all feels very related to me - the behaviour all hinges around the same patch 
too.


I'll take a look in dmesg though...

David

PS, looks like some mail holdups somewhere...
Received: from spitz.ucw.cz (gprs189-60.eurotel.cz [160.218.189.60])
by mail.ukfsn.org (Postfix) with ESMTP id A9125E6AE9
for <[EMAIL PROTECTED]>; Tue, 12 Jun 2007 15:41:23 +0100 (BST)
Received: by spitz.ucw.cz (Postfix, from userid 0)
id E05FC279F2; Sun, 10 Jun 2007 18:43:48 + (UTC)
Date: Sun, 10 Jun 2007 18:43:48 +
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)

2007-06-12 Thread David Greaves

[RESEND since I sent this late last friday and it's probably been buried by 
now.]

I had this as a PS, then I thought, we could all be wasting our time...

I don't like these "Section mismatch" warnings but that's because I'm paranoid
rather than because I know what they mean. I'll be happier when someone says
"That's OK, I know about them, they're not the problem"

WARNING: arch/i386/kernel/built-in.o(.text+0x968f): Section mismatch: reference
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9781): Section mismatch: reference
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9786): Section mismatch: reference
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0xa25c): Section mismatch: reference
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa303): Section mismatch: reference
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa31b): Section mismatch: reference
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa344): Section mismatch: reference
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.exit.text+0x19): Section mismatch:
reference to .init.text: (between 'cache_remove_dev' and 'powernow_k6_exit')
WARNING: arch/i386/kernel/built-in.o(.data+0x2160): Section mismatch: reference
to .init.text: (between 'thermal_throttle_cpu_notifier' and 'mce_work')
WARNING: kernel/built-in.o(.text+0x14502): Section mismatch: reference to
.init.text: (between 'kthreadd' and 'init_waitqueue_head')

I'm paranoid because Andrew Morton said a couple of weeks ago:

Could the people who write these bugs, please, like, fix them?
It's not trivial noise.  These things lead to kernel crashes.


Anyhow...

David Chinner wrote:

sync just guarantees that metadata changes are logged and data is
on disk - it doesn't stop the filesystem from doing anything after
the sync...

No, but there are no apps accessing the filesystem. It's just available for NFS
serving. Seems safer before potentially hanging the machine?


Also I made these changes to the kernel:
cu:/boot# diff config-2.6.22-rc4-TejuTst-dbg3-dirty
config-2.6.22-rc4-TejuTst-dbg1-dirty
3,4c3,4
< # Linux kernel version: 2.6.22-rc4-TejuTst-dbg3
< # Thu Jun  7 20:00:34 2007
---

# Linux kernel version: 2.6.22-rc4-TejuTst3
# Thu Jun  7 10:59:21 2007

242,244c242
< CONFIG_PM_DEBUG=y
< CONFIG_DISABLE_CONSOLE_SUSPEND=y
< # CONFIG_PM_TRACE is not set
---

# CONFIG_PM_DEBUG is not set


positive: I can now get sysrq-t :)
negative: if I build skge into the kernel the behaviour changes so I can't run
netconsole

Just to be sure I tested and this kernel suspends/restores with /huge unmounted.
It also hangs without an umount so the behaviour is the same.


Ok, so a clean inode is sufficient to prevent hibernate from working.

So, what's different between a sync and a remount?

do_remount_sb() does:

599 shrink_dcache_sb(sb);
600 fsync_super(sb);

of which a sync does neither. sync does what fsync_super() does in a
different sort of way, but does not call sync_blockdev() on each
block device. Those look like the two main differences between
sync and remount - remount trims the dentry cache and syncs the blockdev,
sync doesn't.


What about freezing the filesystem?

cu:~# xfs_freeze -f /huge
cu:~# /usr/net/bin/hibernate
[but this doesn't even hibernate - same as the 'touch']


I suspect that the frozen filesystem might cause other problems
in the hibernate process. However, while a freeze calls sync_blockdev()
it does not trim the dentry cache.
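For anyone reproducing the freeze test above: xfs_freeze must be paired with a thaw, or the filesystem stays blocked for writers. A minimal sketch (the mount point matches the one used in this thread; run as root):

```shell
# Freeze: flushes dirty data and syncs the block device, then blocks
# all new writes until the filesystem is thawed. Note that, as
# discussed above, a freeze does NOT trim the dentry cache.
xfs_freeze -f /huge

# ... hibernate attempt / snapshot would go here ...

# Thaw: without this, any process writing to /huge hangs indefinitely.
xfs_freeze -u /huge
```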

So, rather than a remount before hibernate, let's see if we can
remove the dentries some other way to determine if removing excess
dentries/inodes from the caches makes a difference. Can you do:

# touch /huge/foo
# sync
# echo 1 > /proc/sys/vm/drop_caches
# hibernate

success


# touch /huge/bar
# sync
# echo 2 > /proc/sys/vm/drop_caches
# hibernate

success


# touch /huge/baz
# sync
# echo 3 > /proc/sys/vm/drop_caches
# hibernate

success

So I added
# touch /huge/bork
# sync
# hibernate

And it still succeeded - sigh.

So I thought a bit and did:
rm /huge/b* /huge/foo


Clean boot
# touch /huge/bar
# sync
# echo 2 > /proc/sys/vm/drop_caches
# hibernate

hangs on suspend (sysrq-b doesn't work)


Clean boot
# touch /huge/baz
# sync
# echo 3 > /proc/sys/vm/drop_caches
# hibernate

hangs on suspend (sysrq-b doesn't work)

So I rebooted and hibernated to make sure I'm not having random behaviour - yep,
hang on resume (as per usual).

Now I wonder if any other mounts have an effect...
reboot and umount /dev/hdb2 xfs fs, - hang on hibernate


I'm confused. I'm going to order Chinese takeaway and then find a serial 
cable...

David
PS 2.6.21.1 works fine.


Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)

2007-06-08 Thread David Greaves

I had this as a PS, then I thought, we could all be wasting our time...

I don't like these "Section mismatch" warnings but that's because I'm paranoid 
rather than because I know what they mean. I'll be happier when someone says 
"That's OK, I know about them, they're not the problem"


WARNING: arch/i386/kernel/built-in.o(.text+0x968f): Section mismatch: reference 
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9781): Section mismatch: reference 
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9786): Section mismatch: reference 
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0xa25c): Section mismatch: reference 
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa303): Section mismatch: reference 
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa31b): Section mismatch: reference 
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa344): Section mismatch: reference 
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.exit.text+0x19): Section mismatch: 
reference to .init.text: (between 'cache_remove_dev' and 'powernow_k6_exit')
WARNING: arch/i386/kernel/built-in.o(.data+0x2160): Section mismatch: reference 
to .init.text: (between 'thermal_throttle_cpu_notifier' and 'mce_work')
WARNING: kernel/built-in.o(.text+0x14502): Section mismatch: reference to 
.init.text: (between 'kthreadd' and 'init_waitqueue_head')



Andrew Morton said a couple of weeks ago:
> Could the people who write these bugs, please, like, fix them?
> It's not trivial noise.  These things lead to kernel crashes.

Anyhow...

David Chinner wrote:

sync just guarantees that metadata changes are logged and data is
on disk - it doesn't stop the filesystem from doing anything after
the sync...
No, but there are no apps accessing the filesystem. It's just available for NFS 
serving. Seems safer before potentially hanging the machine?



Also I made these changes to the kernel:
cu:/boot# diff config-2.6.22-rc4-TejuTst-dbg3-dirty 
config-2.6.22-rc4-TejuTst-dbg1-dirty

3,4c3,4
< # Linux kernel version: 2.6.22-rc4-TejuTst-dbg3
< # Thu Jun  7 20:00:34 2007
---
> # Linux kernel version: 2.6.22-rc4-TejuTst3
> # Thu Jun  7 10:59:21 2007
242,244c242
< CONFIG_PM_DEBUG=y
< CONFIG_DISABLE_CONSOLE_SUSPEND=y
< # CONFIG_PM_TRACE is not set
---
> # CONFIG_PM_DEBUG is not set

positive: I can now get sysrq-t :)
negative: if I build skge into the kernel the behaviour changes so I can't run 
netconsole


Just to be sure I tested and this kernel suspends/restores with /huge unmounted.
It also hangs without an umount so the behaviour is the same.


Ok, so a clean inode is sufficient to prevent hibernate from working.

So, what's different between a sync and a remount?

do_remount_sb() does:

599 shrink_dcache_sb(sb);
600 fsync_super(sb);

of which a sync does neither. sync does what fsync_super() does in a
different sort of way, but does not call sync_blockdev() on each
block device. Those look like the two main differences between
sync and remount - remount trims the dentry cache and syncs the blockdev,
sync doesn't.


What about freezing the filesystem?

cu:~# xfs_freeze -f /huge
cu:~# /usr/net/bin/hibernate
[but this doesn't even hibernate - same as the 'touch']


I suspect that the frozen filesystem might cause other problems
in the hibernate process. However, while a freeze calls sync_blockdev()
it does not trim the dentry cache.

So, rather than a remount before hibernate, let's see if we can
remove the dentries some other way to determine if removing excess
dentries/inodes from the caches makes a difference. Can you do:

# touch /huge/foo
# sync
# echo 1 > /proc/sys/vm/drop_caches
# hibernate

success


# touch /huge/bar
# sync
# echo 2 > /proc/sys/vm/drop_caches
# hibernate

success


# touch /huge/baz
# sync
# echo 3 > /proc/sys/vm/drop_caches
# hibernate

success

So I added
# touch /huge/bork
# sync
# hibernate

And it still succeeded - sigh.

So I thought a bit and did:
rm /huge/b* /huge/foo

> Clean boot
> # touch /huge/bar
> # sync
> # echo 2 > /proc/sys/vm/drop_caches
> # hibernate
hangs on suspend (sysrq-b doesn't work)

> Clean boot
> # touch /huge/baz
> # sync
> # echo 3 > /proc/sys/vm/drop_caches
> # hibernate
hangs on suspend (sysrq-b doesn't work)

So I rebooted and hibernated to make sure I'm not having random behaviour - yep, 
hang on resume (as per usual).


Now I wonder if any other mounts have an effect...
reboot and umount /dev/hdb2 xfs fs, - hang on hibernate


I'm confused. I'm going to order Chinese takeaway and then find a serial 
cable...

David
PS 2.6.21.1 works fine.




Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)

2007-06-07 Thread David Greaves

Mark Lord wrote:

Tejun Heo wrote:


Can you set up serial console and/or netconsole (not sure whether this
would work tho)?


Since he has good console output already, capturable by digicam,
I think a better approach might be to provide a patch with extra 
instrumentation..

You know.. progress messages and the like, so we can see at what step
things stop working.  Or would that not help ?

David, does scrollback work on your dead console?


Hmm, scrollback doesn't currently _do_ anything.

But the messages didn't scroll there, they just appear (as the memory is 
restored I assume). The same messages appear during the fail-to-suspend case too.


Linus said at one point:
> Ok, it wasn't a hidden oops. The DISABLE_CONSOLE_SUSPEND=y thing sometimes
> shows oopses that are otherwise hidden, but at other times it just causes
> more problems (hard hangs when trying to display something on a device
> that is suspended, or behind a bridge that got suspended).

> In your case, the screen output just shows normal resume output, and it
> apparently just hung for some unknown reason. It *may* be worth trying to
> do a SysRQ + 't' thing to see what tasks are running (or rather, not
> running), but since you won't be able to capture it, it's probably not
> going to be useful.

So I've since removed DISABLE_CONSOLE_SUSPEND=y
Should I put it back?

I was actually doing the netconsole anyway - but skge is currently a module - 
I've avoided making any changes to the config during all these tests but what 
the heck...


And wouldn't you know it.
Get netconsole working (ie new kernel with skge builtin) and I get the hang on 
suspend. Here's the netconsole output...


swsusp: Basic memory bitmaps created
Stopping tasks ... done.
Shrinking memory... done (0 pages freed)
Freed 0 kbytes in 0.03 seconds (0.00 MB/s)
Suspending console(s)
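For reference, a typical netconsole setup looks like the fragment below. The addresses, interface, and MAC are hypothetical placeholders; the `netconsole=` parameter format itself comes from the kernel's netconsole documentation:

```shell
# Load netconsole as a module (or pass netconsole=... on the kernel
# command line when built in). Parameter format:
#   netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac]
# All addresses below are made-up examples.
modprobe netconsole \
    netconsole=6665@192.168.0.5/eth0,6666@192.168.0.2/00:11:22:33:44:55

# On the receiving machine, capture the messages with netcat:
nc -u -l 6666
```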


Given that moving something from module to builtin changes the behaviour I 
thought I'd bring these warnings up again (Andrew or Alan mentioned similar 
warnings being problems in another thread...)

Now, I have mentioned these before but there's been a lot going on so here you 
go:

  MODPOST vmlinux
WARNING: arch/i386/kernel/built-in.o(.text+0x968f): Section mismatch: reference 
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9781): Section mismatch: reference 
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9786): Section mismatch: reference 
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0xa25c): Section mismatch: reference 
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa303): Section mismatch: reference 
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa31b): Section mismatch: reference 
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa344): Section mismatch: reference 
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.exit.text+0x19): Section mismatch: 
reference to .init.text: (between 'cache_remove_dev' and 'powernow_k6_exit')
WARNING: arch/i386/kernel/built-in.o(.data+0x2160): Section mismatch: reference 
to .init.text: (between 'thermal_throttle_cpu_notifier' and 'mce_work')
WARNING: kernel/built-in.o(.text+0x14502): Section mismatch: reference to 
.init.text: (between 'kthreadd' and 'init_waitqueue_head')



David
PS Gotta go - back in a couple of hours - let me know if there are any more 
tests to try.



Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)

2007-06-07 Thread David Greaves

David Chinner wrote:

On Thu, Jun 07, 2007 at 11:30:05AM +0100, David Greaves wrote:

Tejun Heo wrote:

Hello,

David Greaves wrote:

Just to be clear. This problem is where my system won't resume after s2d
unless I umount my xfs over raid6 filesystem.

This is really weird.  I don't see how xfs mount can affect this at all.

Indeed.
It does :)


Ok, so lets determine if it really is XFS.

Seems like a good next step...


Does the lockup happen with a
different filesystem on the md device? Or if you can't test that, does
any other XFS filesystem you have show the same problem?

It's a rather full 1.2Tb raid6 array - can't reformat it - sorry :)
I only noticed the problem when I umounted the fs during tests to prevent 
corruption - and it worked. I'm doing a sync each time it hibernates (see below) 
and a couple of paranoia xfs_repairs haven't shown any problems.


I do have another xfs filesystem on /dev/hdb2 (mentioned when I noticed the 
md/XFS correlation). It doesn't seem to have/cause any problems.



If it is xfs that is causing the problem, what happens if you
remount read-only instead of unmounting before shutting down?

Yes, I'm happy to try these tests.
nb, the hibernate script is:
ethtool -s eth0 wol g
sync
echo platform > /sys/power/disk
echo disk > /sys/power/state

So there has always been a sync before any hibernate.
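[The sequence above can be wrapped as a small function; the POWER_DIR override is an addition of mine so the writes can be redirected for a dry run — by default it targets the real /sys/power files and must run as root.]

```shell
# The hibernate sequence from this thread, wrapped in a function.
# POWER_DIR is my addition (defaults to the real sysfs path); point it
# at a scratch directory to dry-run the file writes.
hibernate_to_disk() {
    dir=${POWER_DIR:-/sys/power}
    ethtool -s eth0 wol g || echo "warning: could not arm wake-on-LAN" >&2
    sync
    echo platform > "$dir/disk"
    echo disk > "$dir/state"   # with real sysfs this blocks until resume
}
```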


cu:~# mount -oremount,ro /huge
cu:~# mount
/dev/hda2 on / type xfs (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
usbfs on /proc/bus/usb type usbfs (rw)
tmpfs on /dev/shm type tmpfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
nfsd on /proc/fs/nfsd type nfsd (rw)
/dev/hda1 on /boot type ext3 (rw)
/dev/md0 on /huge type xfs (ro)
/dev/hdb2 on /scratch type xfs (rw)
tmpfs on /dev type tmpfs (rw,size=10M,mode=0755)
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
cu:(pid2862,port1022) on /net type nfs 
(intr,rw,port=1022,toplvl,map=/usr/share/am-utils/amd.net,noac)

elm:/space on /amd/elm/root/space type nfs (rw,vers=3,proto=tcp)
elm:/space-backup on /amd/elm/root/space-backup type nfs (rw,vers=3,proto=tcp)
elm:/usr/src on /amd/elm/root/usr/src type nfs (rw,vers=3,proto=tcp)
cu:~# /usr/net/bin/hibernate
[this works and resumes]

cu:~# mount -oremount,rw /huge
cu:~# /usr/net/bin/hibernate
[this works and resumes too !]

cu:~# touch /huge/tst
cu:~# /usr/net/bin/hibernate
[but this doesn't even hibernate]




> What about freezing the filesystem?
cu:~# xfs_freeze -f /huge
cu:~# /usr/net/bin/hibernate
[but this doesn't even hibernate - same as the 'touch']

Nb the screen looks like this:
http://www.dgreaves.com/pub/2.6.21-rc4-ptched-suspend-failure.jpg
whether it hangs on suspend or resume.

So I wouldn't say it *is* XFS at fault - but there certainly seems to be an 
interaction...

At least it's easily reproducible :) Shame about the sysrq

I can think of other permutations of freeze/ro/writing tests but I'm just 
thrashing really. Happy for you to tell me what to try next ...
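[The remaining permutations can be scripted; a sketch only, using the mount point and hibernate wrapper from this thread. The DRY_RUN guard is my addition so the steps can be reviewed before running them as root.]

```shell
# Sketch of the remaining freeze/ro/rw permutations. DRY_RUN=1 (the
# default) only prints each step; set DRY_RUN=0 to really run them as
# root. /huge and /usr/net/bin/hibernate are as used in this thread.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# 1. read-only and frozen: no writes possible at all
run mount -o remount,ro /huge
run xfs_freeze -f /huge
run /usr/net/bin/hibernate
run xfs_freeze -u /huge

# 2. read-write, but synced and idle
run mount -o remount,rw /huge
run sync
run /usr/net/bin/hibernate

# 3. read-write with a write issued just before suspending
run touch /huge/tst
run /usr/net/bin/hibernate
```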



David


Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)

2007-06-07 Thread David Greaves

Duane Griffin wrote:

On 07/06/07, David Greaves <[EMAIL PROTECTED]> wrote:

> How hard does the machine freeze?  Can you use sysrq?  If so, please
> dump sysrq-t.
I suspect there is a problem writing to the consoles...

I recompiled (rc4+patch) with sysrq support, suspended, resumed and tried
sysrq-t but got no output.

I *can* change VTs and see the various login prompts, bitmap messages and the
console messages. Caps/Num lock lights work.

Fearing incompetence I tried sysrq-s sysrq-u sysrq-b and got a reboot so sysrq
is OK.


Try sysrq-9 before the sysrq-t. Probably the messages are not being
printed to console with your default output level.


Good idea :)
Didn't work :(
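[For reference, the keys tried in this subthread map as follows, paraphrasing Documentation/sysrq.txt; a reference sketch, not the kernel's own table.]

```shell
# What each sysrq key used in this subthread does, paraphrasing the
# kernel's Documentation/sysrq.txt.
sysrq_key() {
    case "$1" in
        t) echo "dump a task list with backtraces" ;;
        s) echo "emergency sync of all filesystems" ;;
        u) echo "remount all filesystems read-only" ;;
        b) echo "reboot immediately" ;;
        [0-9]) echo "set console log level to $1" ;;
        *) echo "unknown" ;;
    esac
}

for k in 9 t s u b; do
    printf 'sysrq-%s: %s\n' "$k" "$(sysrq_key "$k")"
done
```

The console loglevel can also be raised without the keyboard, via `dmesg -n 8` or `echo 8 > /proc/sys/kernel/printk`.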

Cheers

David


Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)

2007-06-07 Thread David Greaves

Tejun Heo wrote:

Hello,

David Greaves wrote:

Just to be clear. This problem is where my system won't resume after s2d
unless I umount my xfs over raid6 filesystem.


This is really weird.  I don't see how xfs mount can affect this at all.

Indeed.
It does :)


How hard does the machine freeze?  Can you use sysrq?  If so, please
dump sysrq-t.

I suspect there is a problem writing to the consoles...

I recompiled (rc4+patch) with sysrq support, suspended, resumed and tried 
sysrq-t but got no output.


I *can* change VTs and see the various login prompts, bitmap messages and the 
console messages. Caps/Num lock lights work.


Fearing incompetence I tried sysrq-s sysrq-u sysrq-b and got a reboot so sysrq 
is OK.


Any suggestions on how to see more? Or what to try next?

Any other kernel debug options to set?

David
PS Back in a couple of hours...








Re: 2.6.22-rc3 hibernate(?) fails totally - regression (xfs on raid6)

2007-06-06 Thread David Greaves

Tejun Heo wrote:

Hello,

David Greaves wrote:

Linus Torvalds wrote:
It would be interesting to see what triggered it, since it apparently 
worked before. So yes, a bisection would be great.

Tejun, all the problematic patches are yours - so adding you.


Ouch

<grin> that's what everyone says!

Just to be clear. This problem is where my system won't resume after s2d unless 
I umount my xfs over raid6 filesystem.



given the first patch identified is
9666f4009c22f6520ac3fb8a19c9e32ab973e828: "libata: reimplement suspend/resume
support using sdev->manage_start_stop"
That seems a good candidate...


9ce3075c20d458040138690edfdf6446664ec3ee works, right?

Yes
git reset --hard ec4883b015c3212f6f6d04fb2ff45f528492f598
vi Makefile
make oldconfig
make && make install && make modules_install && update-grub
init 6


 Can you test
9666f4009c22f6520ac3fb8a19c9e32ab973e828 by removing
ata_scsi_device_suspend/resume callbacks from sata_via.c?   Just delete
all lines referencing those two functions.  There were one or two
fallouts from the conversion.


Yes, after I posted I realised that Andrews patch fixed the compile failure :)

git reset --hard 9666f4009c22f6520ac3fb8a19c9e32ab973e828

diff --git a/drivers/ata/sata_via.c b/drivers/ata/sata_via.c
index 939c924..bad87b5 100644
--- a/drivers/ata/sata_via.c
+++ b/drivers/ata/sata_via.c
@@ -117,8 +117,6 @@ static struct scsi_host_template svia_sht = {
.slave_destroy  = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
 #ifdef CONFIG_PM
-   .suspend= ata_scsi_device_suspend,
-   .resume = ata_scsi_device_resume,
 #endif
 };

So now this compiles but it does cause the problem:

umount /huge
echo platform > /sys/power/disk
echo disk > /sys/power/state
# resumes fine

mount /huge
echo platform > /sys/power/disk
echo disk > /sys/power/state
# won't resume

FWIW, /huge is:
/dev/md0 on /huge type xfs (rw)
cu:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4]
md0 : active raid6 sdf1[0] sde1[1] sdd1[2] sdc1[3] sdb1[4] sda1[5] hdb1[6]
  1225557760 blocks level 6, 256k chunk, algorithm 2 [7/7] [UUU]
  bitmap: 0/234 pages [0KB], 512KB chunk

unused devices: <none>



How many drives do you have?

8 in total
2 pata : VIA vt8237
2 sata on sata_via
4 sata on sata_promise
+1 pata cdrom

  Behavior difference introduced by the

reimplementation is serialization of resume sequence, so it takes more
time.  My test machine had problems resuming if resume took too long
even with the previous implementation.  It didn't matter whether the
long resuming sequence is caused by too many controllers or explicit
ssleep().  If time needed for resume sequence is over certain threshold,
machine hangs while resuming.  I thought it was a BIOS glitch and didn't
dig into it but you might be seeing the same issue.

given the mount/umount thing this sounds unlikely... but what do I know?

resume does throw up:
ATA: abnormal status 0x7F on port 0x0001b007
ATA: abnormal status 0x7F on port 0x0001b007
ATA: abnormal status 0x7F on port 0x0001a407
ATA: abnormal status 0x7F on port 0x0001a407

which I've not noticed before... oh, alright, I'll check...
reboots to 2.6.21, suspend, resume...
nope, not output on resume in 2.6.21



Please post dmesg too.  Thanks.



Here is:
 dmesg from 2.6.22-9666f4009c22f6520ac3fb8a19c9e32ab973e828 (ie with sata_via 
fix)
 dmesg from resume of above when /huge is unmounted
 dmesg from resume of 2.6.21

Linux version 2.6.21-TejunTst2-g9666f400-dirty ([EMAIL PROTECTED]) (gcc 
version 3.3.5 (Debian 1:3.3.5-13)) #13 Wed Jun 6 10:16:03 BST 2007

BIOS-provided physical RAM map:
 BIOS-e820:  - 0009c400 (usable)
 BIOS-e820: 0009c400 - 000a (reserved)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 3fffc000 (usable)
 BIOS-e820: 3fffc000 - 3000 (ACPI data)
 BIOS-e820: 3000 - 4000 (ACPI NVS)
 BIOS-e820: fec0 - fec01000 (reserved)
 BIOS-e820: fee0 - fee01000 (reserved)
 BIOS-e820:  - 0001 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
Entering add_active_range(0, 0, 262140) 0 entries of 256 used
Zone PFN ranges:
  DMA 0 -> 4096
  Normal   4096 ->   229376
  HighMem229376 ->   262140
early_node_map[1] active PFN ranges
0:0 ->   262140
On node 0 totalpages: 262140
  DMA zone: 32 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4064 pages, LIFO batch:0
  Normal zone: 1760 pages used for memmap
  Normal zone: 223520 pages, LIFO batch:31
  HighMem zone: 255 pages used for memmap
  HighMem zone: 32509 pages, LIFO batch:7
DMI 2.3 present.
ACPI: RSDP 000F62A0, 0014 (r0 ASUS  )
ACPI: RSDT 3FFFC000, 0030 (r1 ASUS   A7V600   42302E31 MSFT 31313031)
ACPI: FACP 3FFFC0B2, 0074 (r1 ASUS   A7V600   42302E31 MSFT 31313031)


Re: Linux 2.6.22-rc4 - sata_promise regression since -rc3

2007-06-05 Thread David Greaves
Linus Torvalds wrote:
> So -rc4 is out there now, hopefully shrinking the regression list further. 

> I'd ask that people involved with the known regressions please test 
> whether they got fixed, and if you wrote a patch and it's still pending, 
> please make sure to push it upstream..

[Tejun, Jeff, added you since the bisect points to your patch.]

Sorry, mail glitch means I lost a couple of emails...

I said:
Compile warnings and a new regression: hang on boot during sata_promise
detection...

It turns out that the hang times out; it does boot after a while. It's missing 4
of my SATA disks though.
[in turn this means I can't test the hibernate regression against -rc4. But
testing that regression against a862b5c8cd5d847779a049a5fc8cf5b1e6f5fa07 shows
it is still there. Do I get a bonus for finding 2 regressions?]

I also bisected and got:
Bisecting: 0 revisions left to test after this
[464cf177df7727efcc5506322fc5d0c8b896f545] libata: always use polling SETXFER

According to marc, Mikael said:
Please give us some details about your sata_promise problem:
- describe your hardware (Promise chip version, mainboard, chipset, etc)
I have a Promise TX-4 and onboard via-sata.
0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host
Bridge (rev 80)
0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
0000:00:0d.0 Unknown mass storage controller: Promise Technology, Inc. PDC20318
(SATA150 TX4) (rev 02)
0000:00:0f.1 IDE interface: VIA Technologies, Inc.
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)

- which was the last kernel version prior to 2.6.22-rc4 that worked
2.6.22-rc3
- the kernel messages up to the hang, if you can capture them
Easier once I learned patience...

sata_promise 0000:00:0d.0: version 2.07
ACPI: PCI Interrupt 0000:00:0d.0[A] -> GSI 16 (level, low) -> IRQ 17
scsi0 : sata_promise
scsi1 : sata_promise
scsi2 : sata_promise
scsi3 : sata_promise
ata1: SATA max UDMA/133 cmd 0xf880a200 ctl 0xf880a238 bmdma 0x irq 0
ata2: SATA max UDMA/133 cmd 0xf880a280 ctl 0xf880a2b8 bmdma 0x irq 0
ata3: SATA max UDMA/133 cmd 0xf880a300 ctl 0xf880a338 bmdma 0x irq 0
ata4: SATA max UDMA/133 cmd 0xf880a380 ctl 0xf880a3b8 bmdma 0x irq 0
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata1.00: ATA-7: Maxtor 6B250S0, BANC19J0, max UDMA/133
ata1.00: 490234752 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata1.00: qc timeout (cmd 0xef)
ata1.00: failed to set xfermode (err_mask=0x4)
ata1: failed to recover some devices, retrying in 5 secs
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata1.00: qc timeout (cmd 0xef)
ata1.00: failed to set xfermode (err_mask=0x4)
ata1.00: limiting speed to UDMA/133:PIO3
ata1: failed to recover some devices, retrying in 5 secs
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata1.00: qc timeout (cmd 0xef)
ata1.00: failed to set xfermode (err_mask=0x4)
ata1.00: disabled
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata2.00: ATA-7: Maxtor 7Y250M0, YAR51EW0, max UDMA/133
ata2.00: 490234752 sectors, multi 0: LBA48
ata2.00: qc timeout (cmd 0xef)
ata2.00: failed to set xfermode (err_mask=0x4)
ata2: failed to recover some devices, retrying in 5 secs
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata2.00: qc timeout (cmd 0xef)
ata2.00: failed to set xfermode (err_mask=0x4)
ata2.00: limiting speed to UDMA/133:PIO3
ata2: failed to recover some devices, retrying in 5 secs
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata2.00: qc timeout (cmd 0xef)
ata2.00: failed to set xfermode (err_mask=0x4)
ata2.00: disabled
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata3.00: ATA-7: Maxtor 7Y250M0, YAR51EW0, max UDMA/133
ata3.00: 490234752 sectors, multi 0: LBA48
ata3.00: qc timeout (cmd 0xef)
ata3.00: failed to set xfermode (err_mask=0x4)
ata3: failed to recover some devices, retrying in 5 secs
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata3.00: qc timeout (cmd 0xef)
ata3.00: failed to set xfermode (err_mask=0x4)
ata3.00: limiting speed to UDMA/133:PIO3
ata3: failed to recover some devices, retrying in 5 secs
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata3.00: qc timeout (cmd 0xef)
ata3.00: failed to set xfermode (err_mask=0x4)
ata3.00: disabled
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata4.00: 

Re: Linux 2.6.22-rc4

2007-06-05 Thread David Greaves
Linus Torvalds wrote:
> So -rc4 is out there now, hopefully shrinking the regression list further. 
> 
> The diffstat (for those that look at those kinds of things) tells the 
> story: lots of small stuff to random files. I think the single biggest 
> file change was the patch-checking script, along with some sparc64 fixes. 
> But the bulk of it all is just a lot of small random things.
> 
> Shortlog appended to give kind of an overview, nothing really stands out 
> there. Mostly driver fixes, with some architecture updates.
> 
> I'd ask that people involved with the known regressions please test 
> whether they got fixed, and if you wrote a patch and it's still pending, 
> please make sure to push it upstream..
> 
>   Linus

Compile warnings and a new regression: hang on boot during sata_promise
detection... :(

I have to go out now, I'll get more details on my return.

make mrproper
cp ../linux-2.6.21.1/.config .
make oldconfig (accept defaults)
make

scripts/kconfig/conf -s arch/i386/Kconfig
drivers/input/keyboard/Kconfig:170:warning: 'select' used by config symbol
'KEYBOARD_ATARI' refers to undefined symbol 'ATARI_KBD_CORE'
drivers/input/mouse/Kconfig:182:warning: 'select' used by config symbol
Re: 2.6.22-rc3 hibernate(?) disables SMART on ide

2007-06-05 Thread David Greaves
Mark Lord wrote:
> That's odd.  Could you try that again,
> with the latest (either v7.3 or v7.4) version of hdparm
> (from sourceforge) ?

Using Debian's 7.3 via apt-get experimental - is that OK or would you like me to
compile the upstream?

2.6.21.1

cu:~# hdparm -V
hdparm v7.3
cu:~# hdparm -K1 /dev/hda

/dev/hda:
 setting drive keep features to 1 (on)
 HDIO_DRIVE_CMD(keepsettings) failed: Input/output error

dmesg:
skge eth0: Link is up at 1000 Mbps, full duplex, flow control both
hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
hda: drive_cmd: error=0x04 { DriveStatusError }
ide: failed opcode was: 0xef

David
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 2.6.22-rc4

2007-06-05 Thread David Greaves
Linus Torvalds wrote:
> So -rc4 is out there now, hopefully shrinking the regression list further. 
> 
> The diffstat (for those that look at those kinds of things) tells the 
> story: lots of small stuff to random files. I think the single biggest 
> file change was the patch-checking script, along with some sparc64 fixes. 
> But the bulk of it all is just a lot of small random things.
> 
> Shortlog appended to give kind of an overview, nothing really stands out 
> there. Mostly driver fixes, with some architecture updates.
> 
> I'd ask that people involved with the known regressions please test 
> whether they got fixed, and if you wrote a patch and it's still pending, 
> please make sure to push it upstream..
> 
>   Linus

Compile warnings and a new regression: hang on boot during sata_promise
detection... :(

I have to go out now, I'll get more details on my return.

make mrproper
cp ../linux-2.6.21.1/.config .
make oldconfig (accept defaults)
make

scripts/kconfig/conf -s arch/i386/Kconfig
drivers/input/keyboard/Kconfig:170:warning: 'select' used by config symbol
'KEYBOARD_ATARI' refers to undefined symbol 'ATARI_KBD_CORE'
drivers/input/mouse/Kconfig:182:warning: 'select' used by config symbol
'MOUSE_ATARI' refers to undefined symbol 'ATARI_KBD_CORE'
  CHK include/linux/version.h

[And I gave my Amiga away a year ago :) ]

  CC  kernel/power/pm.o
kernel/power/pm.c:205: warning: `pm_register' is deprecated (declared at
kernel/power/pm.c:64)
kernel/power/pm.c:205: warning: `pm_register' is deprecated (declared at
kernel/power/pm.c:64)
kernel/power/pm.c:206: warning: `pm_send_all' is deprecated (declared at
kernel/power/pm.c:180)
kernel/power/pm.c:206: warning: `pm_send_all' is deprecated (declared at
kernel/power/pm.c:180)

  CC  fs/xfs/linux-2.6/xfs_lrw.o
fs/xfs/linux-2.6/xfs_lrw.c: In function `xfs_iozero':
fs/xfs/linux-2.6/xfs_lrw.c:162: warning: `memclear_highpage_flush' is deprecated
(declared at include/linux/highmem.h:115)

  CC  drivers/base/dd.o
drivers/base/dd.c:211: warning: `device_probe_drivers' defined but not used

  CC  drivers/pci/search.o
drivers/pci/search.c: In function `pci_find_slot':
drivers/pci/search.c:99: warning: `pci_find_device' is deprecated (declared at
include/linux/pci.h:477)
drivers/pci/search.c: At top level:
drivers/pci/search.c:434: warning: `pci_find_device' is deprecated (declared at
drivers/pci/search.c:241)
drivers/pci/search.c:434: warning: `pci_find_device' is deprecated (declared at
drivers/pci/search.c:241)

  LD  vmlinux
  SYSMAP  System.map
  SYSMAP  .tmp_System.map
  MODPOST vmlinux
WARNING: arch/i386/kernel/built-in.o(.text+0x968f): Section mismatch: reference
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9781): Section mismatch: reference
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0x9786): Section mismatch: reference
to .init.text: (between 'mtrr_bp_init' and 'mtrr_ap_init')
WARNING: arch/i386/kernel/built-in.o(.text+0xa25c): Section mismatch: reference
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa303): Section mismatch: reference
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa31b): Section mismatch: reference
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.text+0xa344): Section mismatch: reference
to .init.text: (between 'get_mtrr_state' and 'mtrr_wrmsr')
WARNING: arch/i386/kernel/built-in.o(.exit.text+0x19): Section mismatch:
reference to .init.text: (between 'cache_remove_dev' and 'powernow_k6_exit')
WARNING: arch/i386/kernel/built-in.o(.data+0x2160): Section mismatch: reference
to .init.text: (between 'thermal_throttle_cpu_notifier' and 'mce_work')
WARNING: kernel/built-in.o(.text+0x14482): Section mismatch: reference to
.init.text: (between 'kthreadd' and 'init_waitqueue_head')


David

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 2.6.22-rc4 - sata_promise regression since -rc3

2007-06-05 Thread David Greaves
Linus Torvalds wrote:
> So -rc4 is out there now, hopefully shrinking the regression list further. 
> 
> I'd ask that people involved with the known regressions please test 
> whether they got fixed, and if you wrote a patch and it's still pending, 
> please make sure to push it upstream..

[Tejun, Jeff, added you since the bisect points to your patch.]

Sorry, mail glitch means I lost a couple of emails...

I said:
Compile warnings and a new regression: hang on boot during sata_promise
detection...

It turns out that the hang times out; it does boot after a while. It's missing 4
of my SATA disks though.
[in turn this means I can't test the hibernate regression against -rc4. But
testing that regression against a862b5c8cd5d847779a049a5fc8cf5b1e6f5fa07 shows
it is still there. Do I get a bonus for finding 2 regressions?]

I also bisected and got:
Bisecting: 0 revisions left to test after this
[464cf177df7727efcc5506322fc5d0c8b896f545] libata: always use polling SETXFER

According to marc, Mikael said:
> Please give us some details about your sata_promise problem:
> - describe your hardware (Promise chip version, mainboard, chipset, etc)
I have a Promise TX-4 and onboard via-sata.
0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host
Bridge (rev 80)
0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
0000:00:0d.0 Unknown mass storage controller: Promise Technology, Inc. PDC20318
(SATA150 TX4) (rev 02)
0000:00:0f.1 IDE interface: VIA Technologies, Inc.
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)

> - which was the last kernel version prior to 2.6.22-rc4 that worked
2.6.22-rc3
> - the kernel messages up to the hang, if you can capture them
Easier once I learned patience...

sata_promise :00:0d.0: version 2.07
ACPI: PCI Interrupt 0000:00:0d.0[A] -> GSI 16 (level, low) -> IRQ 17
scsi0 : sata_promise
scsi1 : sata_promise
scsi2 : sata_promise
scsi3 : sata_promise
ata1: SATA max UDMA/133 cmd 0xf880a200 ctl 0xf880a238 bmdma 0x irq 0
ata2: SATA max UDMA/133 cmd 0xf880a280 ctl 0xf880a2b8 bmdma 0x irq 0
ata3: SATA max UDMA/133 cmd 0xf880a300 ctl 0xf880a338 bmdma 0x irq 0
ata4: SATA max UDMA/133 cmd 0xf880a380 ctl 0xf880a3b8 bmdma 0x irq 0
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata1.00: ATA-7: Maxtor 6B250S0, BANC19J0, max UDMA/133
ata1.00: 490234752 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata1.00: qc timeout (cmd 0xef)
ata1.00: failed to set xfermode (err_mask=0x4)
ata1: failed to recover some devices, retrying in 5 secs
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata1.00: qc timeout (cmd 0xef)
ata1.00: failed to set xfermode (err_mask=0x4)
ata1.00: limiting speed to UDMA/133:PIO3
ata1: failed to recover some devices, retrying in 5 secs
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata1.00: qc timeout (cmd 0xef)
ata1.00: failed to set xfermode (err_mask=0x4)
ata1.00: disabled
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata2.00: ATA-7: Maxtor 7Y250M0, YAR51EW0, max UDMA/133
ata2.00: 490234752 sectors, multi 0: LBA48
ata2.00: qc timeout (cmd 0xef)
ata2.00: failed to set xfermode (err_mask=0x4)
ata2: failed to recover some devices, retrying in 5 secs
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata2.00: qc timeout (cmd 0xef)
ata2.00: failed to set xfermode (err_mask=0x4)
ata2.00: limiting speed to UDMA/133:PIO3
ata2: failed to recover some devices, retrying in 5 secs
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata2.00: qc timeout (cmd 0xef)
ata2.00: failed to set xfermode (err_mask=0x4)
ata2.00: disabled
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata3.00: ATA-7: Maxtor 7Y250M0, YAR51EW0, max UDMA/133
ata3.00: 490234752 sectors, multi 0: LBA48
ata3.00: qc timeout (cmd 0xef)
ata3.00: failed to set xfermode (err_mask=0x4)
ata3: failed to recover some devices, retrying in 5 secs
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata3.00: qc timeout (cmd 0xef)
ata3.00: failed to set xfermode (err_mask=0x4)
ata3.00: limiting speed to UDMA/133:PIO3
ata3: failed to recover some devices, retrying in 5 secs
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ata_hpa_resize 1: sectors = 490234752, hpa_sectors = 490234752
ata3.00: qc timeout (cmd 0xef)
ata3.00: failed to set xfermode (err_mask=0x4)
ata3.00: disabled
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata4.00: ata_hpa_resize 
