Re: Suspend code ordering (again)
Rafael J. Wysocki wrote:
On Wednesday, 26 of December 2007, Linus Torvalds wrote:
On Tue, 25 Dec 2007, Rafael J. Wysocki wrote:

the ACPI specification between versions 1.0x and 2.0. Namely, while ACPI 2.0 and later wants us to put devices into low power states before calling _PTS, ACPI 1.0x wants us to do that after calling _PTS. Since we're following the 2.0 and later specifications right now, we're not doing the right thing for the (strictly) ACPI 1.0x-compliant systems. We ought to be able to fix things on the high level, by calling _PTS earlier on systems that claim to be ACPI 1.0x-compliant. That will require us to modify the generic suspend code quite a bit and will need to be tested for some time.

That's insane. Are you really saying that ACPI wants totally different orderings for different versions of the spec?

Yes, I am.

And does Windows really do that?

I don't know.

Please don't make lots of modifications to the generic suspend code. The only thing that is worth doing is to just have a firmware callback before the "device_suspend()" thing (and then on an ACPI-1.0 system, call _PTS *there*), and on an ACPI-2.0 system, call _PTS *after* device_suspend().

Yes, that's what I'm going to do, but I need to untangle some ACPI code for this purpose.

Still, the fact is, some (most, I think) drivers *should* put themselves into D3 only in "late_suspend()", so if ACPI-2.0 really expects _PTS to be called after that, we're just screwed.

Well, section 9.1.6 of ACPI 2.0 specifies the suspend ordering directly and says exactly that _PTS is to be executed after putting devices into respective D states.

I would not take those sections as gospel, they're really an example only. It's quite possible that Windows does not follow that ordering. Also, as was pointed out, pre-Vista versions of Windows follow ACPI 1.0 and Vista follows 3.0, so 2.0 doesn't really matter since BIOS people won't test against it.
1.0 specifies that _PTS is to be called before suspending devices and 3.0 says that the AML must not depend on any specific device power state, so in both cases it should be safe to call _PTS before suspending, no? -- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
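The two orderings under discussion can be put side by side in a toy model. This is only a sketch in Python, not kernel code: the function name `suspend_sequence` and the `always_early` switch are hypothetical, and the step strings are just labels for the _PTS call and the device_suspend() phase mentioned above.

```python
def suspend_sequence(acpi_major, sleep_state=3, always_early=False):
    """Return the order of the two suspend steps as a list of labels.

    acpi_major   -- major ACPI version the firmware claims (1, 2, 3, ...)
    always_early -- hypothetical switch modelling the suggestion to call
                    _PTS before device suspend regardless of version
    """
    pts = "_PTS(%d)" % sleep_state
    devices = "device_suspend()"
    # ACPI 1.0x: _PTS first, then devices go to their D states.
    # ACPI 2.0 section 9.1.6 as quoted in the thread: devices first, then _PTS.
    pts_first = always_early or acpi_major < 2
    return [pts, devices] if pts_first else [devices, pts]
```

With `always_early=True` the version check drops out entirely, which is the simplification being argued for: one ordering that ACPI 1.0 requires and that ACPI 3.0 firmware is not supposed to depend on.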
Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
Arjan van de Ven wrote:

2) [non-minor] h.
[EMAIL PROTECTED] ~]$ lspci -n | wc -l
23
So I would have to perform 23 sysfs twiddles, before I could obtain a full and unabridged 'lspci -vvvxxx'?

not you as human, but "lspci" ought to yes. For the userspace interface, the most-often-used knob for diagnostic purposes will be the easiest one. And the easiest one is an option to lspci. Nothing more nothing less.

Making a global knob in kernel space is a lot more tricky, and in addition really there's enough cases where userspace wants the one device anyway.

Doing the "for each device I'm about to dump" in lspci is pretty much as hard as doing the global one (if not simpler).

So then if you have a system where MMCONFIG doesn't work and you're not using any devices that require extended config space, then doing lspci -vvvxxx will blow up the machine? Yuck. Still don't like this approach. It seems like (partially) covering up problems rather than solving them.

-- Robert Hancock
Re: [patch?] s2ram + P4 + tsc = annoyance
Mike Galbraith wrote:

Greetings,

s2ram recently became useful here, except for the kernel's annoying habit of disabling my P4's perfectly good TSC.

[ 107.894470] CPU 1 is now offline
[ 107.894474] SMP alternatives: switching to UP code
[ 107.895832] CPU0 attaching sched-domain:
[ 107.895836] domain 0: span 1
[ 107.895838] groups: 1
[ 107.896097] CPU1 is down
[3.726156] Intel machine check architecture supported.
[3.726165] Intel machine check reporting enabled on CPU#0.
[3.726167] CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
[3.726170] CPU0: Thermal monitoring enabled
[3.726175] Back to C!
[3.726708] Force enabled HPET at resume
[3.726775] Enabling non-boot CPUs ...
[3.727049] CPU0 attaching NULL sched-domain.
[3.727165] SMP alternatives: switching to SMP code
[3.727858] Booting processor 1/1 eip 3000
[3.727862] CPU 1 irqstacks, hard=b042f000 soft=b042d000
[3.738173] Initializing CPU#1
[3.798912] Calibrating delay using timer specific routine.. 5986.12 BogoMIPS (lpj=2993061)
[3.798920] CPU: After generic identify, caps: bfebfbff 4400
[3.798931] CPU: Trace cache: 12K uops, L1 D cache: 8K
[3.798934] CPU: L2 cache: 512K
[3.798936] CPU: Physical Processor ID: 0
[3.798938] CPU: After all inits, caps: bfebfbff b080 4400
[3.798946] Intel machine check architecture supported.
[3.798952] Intel machine check reporting enabled on CPU#1.
[3.798955] CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
[3.798959] CPU1: Thermal monitoring enabled
[3.799161] CPU1: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 09
[3.799187] checking TSC synchronization [CPU#0 - CPU#1]:
[3.819181] Measured 63588552840 cycles TSC warp between CPUs, turning off TSC clock.
[3.819184] Marking TSC unstable due to: check_tsc_sync_source failed.

I wonder why I'm the only guy in the galaxy experiencing this. Does everybody else's clock continue to move forward across resume or something? Anyway, I asked it to please stop doing that, and it complied without even exploding (unlike crabby APICs).

Are we missing some logic to resync the TSCs after resume, or something?

-- Robert Hancock
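For a sense of scale, the warp figure in the log can be converted to time. A rough calculation in Python, assuming the TSC ticks at the nominal 3.00 GHz printed for the CPU (the real rate is only approximately that); the helper name `cycles_to_seconds` is made up for the example.

```python
def cycles_to_seconds(cycles, hz):
    """Convert a TSC cycle count to seconds at an assumed tick rate."""
    return cycles / hz

WARP_CYCLES = 63588552840   # from the "TSC warp between CPUs" log line
NOMINAL_HZ = 3.0e9          # assumption: nominal clock of the 3.00GHz P4

warp_seconds = cycles_to_seconds(WARP_CYCLES, NOMINAL_HZ)
# roughly 21 seconds' worth of cycles
```

That is many orders of magnitude beyond ordinary skew between two logical CPUs, which fits the idea that the counters did not come back from resume with a common value. The log alone does not say which counter moved.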
Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
Linus Torvalds wrote:
On Thu, 27 Dec 2007, Jeff Garzik wrote:

2) [non-minor] h.
[EMAIL PROTECTED] ~]$ lspci -n | wc -l
23
So I would have to perform 23 sysfs twiddles, before I could obtain a full and unabridged 'lspci -vvvxxx'?

Or you force it on with pci=mmconfig or something at boot-time. But yes.

The *fact* is that MMCONFIG has not just been globally broken, but broken on a per-device basis. I don't know why (and quite frankly, I doubt anybody does), but the PCI device ID corruption happened only for a specific set of devices. Whether it was a timing issue with particular devices or whether it was a timing issue with some particular bridge (and could affect any devices behind that bridge), who knows...

It almost certainly was brought on by a borderline (or broken) northbridge, but it apparently only affected specific devices - which makes me suspect that it wasn't *entirely* due to just the northbridge, and it was a combination of things.

Pointer to such a report? The only single-device problems I'm aware of are with some devices within the K8 integrated northbridge, which we already handle. Other than that, the only non-global problems I'm aware of are devices behind host bridges which can't receive/handle MMCONFIG requests, in which case the problem would be bus-wide.

-- Robert Hancock
Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in
Linus Torvalds wrote:

But as mentioned, there were other reports too of the exact same bug (with different PCI devices, but the same vendor == 0001 bogosity). Googling for lspci Unknown device 0001: mmconfig shows reports like these:

http://lkml.org/lkml/2007/10/29/500
http://madwifi.org/ticket/1587
http://www.nvnews.net/vbulletin/showthread.php?t=103271
http://naoya.g.hatena.ne.jp/naoya/20070529/1180436756
http://bbs.archlinux.org/viewtopic.php?id=34321

... which all seem to be due to this same bug with different cards (but the common theme seems to be an ATI northbridge).

This isn't an example of a per-device breakage, though. It only shows up on some devices, but the cause is apparently the chipset. Those devices work fine on other boards. As mentioned later, it appears that CRS stuff might be related to this problem, but if it couldn't be fixed, I think the only sane solution would be to blacklist MMCONFIG support on that chipset.

-- Robert Hancock
Re: Suspend code ordering (again)
Rafael J. Wysocki wrote:

Also, as was pointed out, pre-Vista versions of Windows follow ACPI 1.0 and Vista follows 3.0, so 2.0 doesn't really matter since BIOS people won't test against it. 1.0 specifies that _PTS is to be called before suspending devices and 3.0 says that the AML must not depend on any specific device power state, so in both cases it should be safe to call _PTS before suspending, no?

Well, IMO, if we take one option only (whichever that is) and there are systems that follow the other one, they will likely break. Apart from this, there are BIOSes that openly claim ACPI 2.0 support (for example, the one in my HP nx6325 does that) and they may actually prefer the post-ACPI-1.0 ordering even if they work with the pre-ACPI-2.0 one.

I doubt they would prefer the later ordering in any way that matters, if the Windows version they were designed for uses the earlier ordering. It would be best if somebody could manage to find out what ordering Windows XP (and Windows Vista, for good measure) actually use, then we could just use that. Virtual machine trickery might be an option - the only complication being that it'll be using the DSDT for the fake machine and not the real one..

-- Robert Hancock
Re: HSM violation errors
Jeff Mitchell wrote:

I'm seeing errors in dmesg and the like. It appears to be somewhat similar to the issue reported here: http://kerneltrap.org/mailarchive/linux-kernel/2007/8/25/164711 except that my machine doesn't freeze, and everything seems normal -- hopefully nothing like silent corruption is going on. Also it's on brand new hardware...Intel ICH8 mobile chipset with AHCI. Output from dmesg, hdparm -I /dev/sda and hdparm --drq-hsm-error /dev/sda is below...please let me know if there's anything else that would be of use (and, of course, if this is something I should be worried about :-) ). Thanks. Jeff

dmesg:
ata1.00: exception Emask 0x2 SAct 0xfffd SErr 0x0 action 0x2 frozen
ata1.00: spurious completions during NCQ issue=0x1 SAct=0xfffd FIS=005040a1:0002

You didn't say what kernel you were using, but in the latest kernels this spurious completion check was removed since it was broken, so this error shouldn't happen anymore.

-- Robert Hancock
Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
Linus Torvalds wrote:

IMO, we should check which version of the specification we're supposed to follow, on the basis of FADT contents, for example, and follow this one.

No, we should try to figure out what Windows does. *If* windows checks the version, we should do that too. But we should absolutely *not* just assume that the documentation is an accurate picture of reality. Does anybody know how we could find out?

Linus

Well, it seems that if one had a checked (debug) build of Windows (or at least the acpi.sys driver) installed, as well as a copy of the Microsoft ASL compiler, they could compile and temporarily override the DSDT with a hacked one that would output what the device power states were in some fashion (maybe through the kernel debugger). Some info about this here:

http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295b40c8/TW04015_WINHEC2004.ppt

I suspect that might require more Windows hacking skill and/or motivation than one might be likely to find on this list, though :-)

-- Robert Hancock
Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
Carlos Corbacho wrote:
On Tuesday 25 December 2007 13:26:12 Rafael J. Wysocki wrote:

Well, citing from the ACPI 2.0 specification, section 9.1.6 Transitioning from the Working to the Sleeping State (which is what we're discussing here):

3. OSPM places all device drivers into their respective Dx state. If the device is enabled for wake, it enters the Dx state associated with the wake capability. If the device is not enabled to wake the system, it enters the D3 state.
4. OSPM executes the _PTS control method, passing an argument that indicates the desired sleeping state (1, 2, 3, or 4 representing S1, S2, S3, and S4).

My opinion is that we should follow this part of the specification and so we do.

This is that same section from ACPI 1.0B:

3. The OS executes the Prepare To Sleep (_PTS) control method, passing an argument that indicates the desired sleeping state (1, 2, 3, or 4 representing S1, S2, S3, and S4).
4. The OS places all device drivers into their respective Dx state. If the device is enabled for wakeup, it enters the Dx state associated with the wakeup capability. If the device is not enabled to wakeup the system, it enters the D3 state.

The DSDTs in question also claim ACPI 1.0 compatibility.

You're wrong, sorry.

No, I'm not entirely wrong - read the 1.0 spec, and read section 7.3.2 of the ACPI 2.0 spec.

* ACPI 1.0 is very clear - we are breaking the 1.0 spec
* ACPI 2.0 is contradictory - section 7.3.2 repeats 1.0 ad verbatim (which is what I quote in reply to Robert Hancock), but as you point out, 9.3.2 says the opposite.

So, 1.0 and 3.0 are very clear and rather different on this, and 2.0 is contradictory (and I presume this is one of the points ACPI 3.0 set out to clean up). I will rescind my point on ACPI 2.0 - I don't know what we should or shouldn't be doing there, the spec is unclear. But for ACPI 1.0, we are doing the wrong thing.

Correct me if I'm wrong, but it appears ACPI 1.0 wants _PTS called before any devices are suspended, ACPI 2.0 is contradictory, and ACPI 3.0 says that you can't assume anything about device state. My guess is that unless Windows has different behavior depending on ACPI version, it probably has called _PTS before suspending devices all along. Therefore it would likely be safest to emulate that behavior, no?

-- Robert Hancock
Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
Carlos Corbacho wrote:
On Monday 24 December 2007 18:34:21 Linus Torvalds wrote:
On Mon, 24 Dec 2007, Rafael J. Wysocki wrote:

Well, having considered that for a longer while, I think the AML code is referring to a device that we have suspended already, and since it's in a low power state, it just can't handle the reference. If that is the case, we'll have to find the device (that should be possible using some code instrumentation) and move the suspending of it into the late stage.

Yes. My own experimentation (in device_suspend(), calling _PTS() in the AML after each suspend_device() runs, until one device causes it to hang) points to ohci_hcd being the culprit here (with or without any devices attached). With the ohci_hcd module unloaded, the machine suspends just fine[1].

Of course, I'm at a complete loss as to why suspending OHCI would cause a problem for an IO port write.

The name of the operation region, SMIP, suggests that the BIOS has an SMI trap on that port. In that case, writing to that port will result in the BIOS taking control. We have little idea what it could be doing. Could be it's trying to access the OHCI controller which has been suspended already. This sounds kind of like the Toshiba laptops that go nuts somewhere if the AHCI SATA controller gets put into suspend state before the system suspends..

The ACPI spec has the following to say about the _PTS method:

"The platform must not make any assumptions about the state of the machine when _PTS is called. For example, operation region accesses that require devices to be configured and enabled may not succeed, as these devices may be in a non-decoding state due to plug and play or power management operations."

I would guess some BIOS writers failed to heed this..

NOTE! This following patch is just for discussion, and while I think it's conceptually a good thing to try, I don't think it will help Carlos' problem. But removing the "pci_set_power_state()" in agp_nvidia_suspend() might.

nvidia-agp cannot be built on x86-64, so it's not the culprit in this case.

Yeah, and this is a PCI Express system not AGP, so it wouldn't load anyway.

-- Robert Hancock
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
Loic Prylli wrote:

I just realized one thing: the bar sizing code in pci_read_bases() (that writes 0x in the bars) does not seem to disable the PCI_COMMAND_MEM/PCI_COMMAND_IO bits in the cmd register before manipulating the BARs. And it seems nobody else ensures they are disabled at this point either (or am I missing something?).

No you're not missing anything. This problem causes many machines to break horribly when MMCONFIG is enabled. There's a patch in -mm to fix this. (It special-cases the case of host bridges and doesn't disable the decode bits for those, since some are known to do crazy things if you do that.)

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc6/2.6.24-rc6-mm1/broken-out/pci-disable-decoding-during-sizing-of-bars.patch

Touching the bars while they are enabled would be buggy behaviour from our part, and something trivial to fix. And it might well fix that particular problem (it's fair play from the machine to crash if we create a decoding conflict, simply disabling the cmd bits in pci_read_bases() should remove that conflict).

FWIW, to partially answer your last question, Windows does disable mem-space and/or IO-space when sizing the bars of a device (I have some traces of configuration-space-access taken on a Windows machine for one of the PCI busses).

Good to know. There was some speculation that it did not.

-- Robert Hancock
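The sequence being described (turn off decoding, probe the BAR, restore everything) can be sketched against a simulated device. This is a toy model in Python, not the -mm patch and not pci_read_bases() itself: `FakeDevice` and `size_bar` are hypothetical names, only a single 32-bit memory BAR is modelled, and the host-bridge special case mentioned above is left out. The register offsets and command bits are the standard PCI ones, and the probe value is the usual all-ones write.

```python
PCI_COMMAND = 0x04          # command register offset in config space
PCI_COMMAND_IO = 0x1        # I/O space decode enable
PCI_COMMAND_MEMORY = 0x2    # memory space decode enable
PCI_BASE_ADDRESS_0 = 0x10   # first BAR

class FakeDevice:
    """Simulated device with one 32-bit memory BAR (size is a power of two)."""
    def __init__(self, bar_base, bar_size):
        # Only address bits above the BAR size are writable.
        self.size_mask = ~(bar_size - 1) & 0xFFFFFFF0
        self.regs = {
            PCI_COMMAND: PCI_COMMAND_IO | PCI_COMMAND_MEMORY,
            PCI_BASE_ADDRESS_0: bar_base & self.size_mask,
        }
        self.bar_writes_while_decoding = 0

    def read(self, offset):
        return self.regs[offset]

    def write(self, offset, value):
        if offset == PCI_BASE_ADDRESS_0:
            # Record the risky case: BAR touched while the device still decodes.
            if self.regs[PCI_COMMAND] & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY):
                self.bar_writes_while_decoding += 1
            value &= self.size_mask
        self.regs[offset] = value

def size_bar(dev):
    """Return the BAR size, probing only while decoding is disabled."""
    cmd = dev.read(PCI_COMMAND)
    dev.write(PCI_COMMAND, cmd & ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY))
    original = dev.read(PCI_BASE_ADDRESS_0)
    dev.write(PCI_BASE_ADDRESS_0, 0xFFFFFFFF)      # all-ones probe
    probe = dev.read(PCI_BASE_ADDRESS_0)
    dev.write(PCI_BASE_ADDRESS_0, original)        # put the address back
    dev.write(PCI_COMMAND, cmd)                    # re-enable decoding last
    mask = probe & 0xFFFFFFF0                      # drop the low type bits
    return (~mask & 0xFFFFFFFF) + 1
```

The order matters: the command register is restored only after the original BAR value is back, so the device never decodes the temporary all-ones address. That temporary address is where the decoding conflict described above would otherwise come from.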
Re: [Bug 9528] x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
Linus Torvalds wrote:
> On Sun, 23 Dec 2007, Carlos Corbacho wrote:
>> Fix suspend-to-RAM on nForce 4 (CK804) boards by increasing PCIBIOS_MIN_IO.
>>
>> Fixes kernel bugzilla #9528
>>
>> Problem: Linus' patch (52ade9b3b97fd3bea42842a056fe0786c28d0555) to re-order suspend (and fix fallout from Rafael's earlier suspend reordering work) broke suspend-to-RAM on nForce 4 (CK804) boards.
>>
>> Why: After debugging _PTS() in the DSDT, it turns out these nVidia boards are trying to write to an IO port > 0x1000 (0x142E) during suspend. Before the re-ordering, we got away with this.
>
> Very interesting.
>
> HOWEVER. I'd much rather figure out what the magic IO resource is that clashes. It's almost certainly some hidden and undocumented (or badly documented) ACPI IO area that the kernel doesn't know about, because it's not a regular PCI BAR resource, but some northbridge (or southbridge) magic register range.
>
> Those ranges *should* be reserved by the BIOS in the ACPI tables, but this would definitely not be the first time that doesn't happen.

I'm having trouble sorting out which report is for which BIOS (and some of them don't have any dmesg posted), but I believe in these cases that memory region is indeed reported as reserved by the BIOS, and no PCI resources should end up allocated there. So I'm not sure why fiddling with PCIBIOS_MIN_IO would have any effect (other than by accident).

I wonder if this is the culprit (from Arthur Erhardt's dmesg):

pnpacpi: exceeded the max number of mem resources: 12
pnpacpi: exceeded the max number of mem resources: 12

which means we're ignoring some of the memory reservations. I wonder if some IO reservations are also being ignored?

Why do we have this silly hard limit on the number of resources anyway? If we just ignore random reservations provided by the BIOS, we shouldn't be surprised if things break randomly. This warning at the very least should be much louder (i.e. "Warning: This problem may break your system").

> But the right fix would be for us to just figure out what the range is as a PCI quirk, and just know to avoid it on purpose, rather than just being lucky and happening to avoid it because PCIBIOS_MIN_IO just happens to be bigger than the particular address.
>
> So can you:
>
> - show what your /proc/ioports contains (*with* the bug triggering, i.e. non-working suspend, so we see what it is that actually ends up using that area)
> - send out 'dmesg' for a boot (same deal)
> - add "lspci -xxxvv" output to the deal too.
>
> and also make them part of the bugzilla history (I'm cc'ing bugzilla here, and added the bug number to the subject, so hopefully this thread ends up being archived there too).
>
>> There was some previous work in the PCIBIOS_MIN_IO area over two years ago (71db63acff69618b3d9d3114bd061938150e146b) which bumped this to 0x4000, but this was reverted (2ba84684e8cf6f980e4e95a2300f53a505eb794e) after causing new and entirely different problems on another nForce board.
>
> The problem here is classic: these magic ranges tend to be *different* on different boards (because they don't tend to be fixed by hardware, they are programmed regions set up by firmware), so trying to change PCIBIOS_MIN_IO to avoid a problem on one board is almost certain to just introduce it on another board instead.
>
> On *your* particular board, 0x142E is used for something, but on somebody else's board it might be 0x162E, and now changing PCIBIOS_MIN_IO to 0x1500 might make that other board hang instead.
>
> So you seem to have debugged this very successfully, and I'm wondering if you might be able to find out where that 0x142e comes from, and we could fix it for *all* boards using that chipset by just figuring out what the *hardware* rules (rather than the random firmware setup that will be different on different boards) for that chipset actually are!

I suspect it's board specific. Looking at the DSDT for my A8N-SLI Deluxe, that SMIP region is defined at 0x442E (and is reported as reserved). This BIOS doesn't write there in the _PTS method like the ones in the report apparently do, though.

-- 
Robert Hancock, Saskatoon, SK, Canada
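The /proc/ioports check requested in the message above can be partly automated. The following is only a sketch, not something posted in the thread: it assumes the usual "start-end : name" line layout of /proc/ioports, the helper name find_owners is made up for illustration, and the sample listing is invented (it is not taken from any of the bug reports).

```python
import re

# One /proc/ioports line looks like "  1400-14ff : some owner";
# leading indentation indicates nesting under the previous, wider range.
LINE_RE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*(.*)$")


def find_owners(ioports_text, port):
    """Return (start, end, name) for every listed range that contains `port`.

    Entries come back in file order, so an enclosing range appears before
    the ranges nested inside it.
    """
    owners = []
    for line in ioports_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        start, end = int(m.group(2), 16), int(m.group(3), 16)
        if start <= port <= end:
            owners.append((start, end, m.group(4).strip()))
    return owners


# Invented sample listing, used only to exercise the parser.
SAMPLE = """\
0000-0cf7 : PCI Bus 0000:00
1000-107f : 0000:00:01.1
1400-14ff : 0000:00:0a.0
  1400-14ff : forcedeth
4000-407f : motherboard
"""

if __name__ == "__main__":
    for start, end, name in find_owners(SAMPLE, 0x142E):
        print(f"{start:04x}-{end:04x} : {name}")
```

On a real system the text would come from reading /proc/ioports on a boot where suspend fails; an empty result for 0x142E would suggest nothing the kernel knows about has claimed that port.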
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
Loic Prylli wrote:
> I just realized one thing: the BAR sizing code in pci_read_bases() (that writes 0xffffffff in the BARs) does not seem to disable the PCI_COMMAND_MEM/PCI_COMMAND_IO bits in the cmd register before manipulating the BARs. And it seems nobody else ensures they are disabled at this point either (or am I missing something?).

No, you're not missing anything. This problem causes many machines to break horribly when MMCONFIG is enabled. There's a patch in -mm to fix this. (It special-cases host bridges and doesn't disable the decode bits for those, since some are known to do crazy things if you do that.)

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc6/2.6.24-rc6-mm1/broken-out/pci-disable-decoding-during-sizing-of-bars.patch

> Touching the BARs while they are enabled would be buggy behaviour on our part, and something trivial to fix. And it might well fix that particular problem (it's fair play from the machine to crash if we create a decoding conflict; simply disabling the cmd bits in pci_read_bases() should remove that conflict).
>
> FWIW, to partially answer your last question, Windows does disable mem-space and/or IO-space when sizing the BARs of a device (I have some traces of configuration-space accesses taken on a Windows machine for one of the PCI busses).

Good to know. There was some speculation that it did not.

-- 
Robert Hancock, Saskatoon, SK, Canada
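To make the sizing sequence discussed above concrete, here is a small simulation. It is a sketch under stated assumptions, not the kernel's pci_read_bases() and not the -mm patch: FakeDevice is a toy model with a single 32-bit memory BAR, and size_bar is a hypothetical helper. The register offsets and command bits are the standard PCI ones (command register at 0x04, first BAR at 0x10, I/O decode bit 0x1, memory decode bit 0x2).

```python
PCI_COMMAND = 0x04          # command register offset
PCI_BASE_ADDRESS_0 = 0x10   # first BAR offset
PCI_COMMAND_IO = 0x1        # I/O space decode enable
PCI_COMMAND_MEMORY = 0x2    # memory space decode enable


class FakeDevice:
    """Toy device with one 32-bit memory BAR; not a model of any real chip."""

    def __init__(self, bar_size, bar_base):
        self.size_mask = ~(bar_size - 1) & 0xFFFFFFF0
        self.cmd = PCI_COMMAND_IO | PCI_COMMAND_MEMORY
        self.bar = bar_base & self.size_mask
        self.bar_written_while_decoding = False

    def read(self, offset):
        return self.cmd if offset == PCI_COMMAND else self.bar

    def write(self, offset, value):
        if offset == PCI_COMMAND:
            self.cmd = value & 0xFFFF
            return
        if self.cmd & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY):
            # This is the hazard from the thread: the device would briefly
            # decode whatever address the sizing write put in the BAR.
            self.bar_written_while_decoding = True
        # Address bits below the BAR size are hard-wired to zero.
        self.bar = value & self.size_mask


def size_bar(dev):
    """Size BAR 0 with decoding switched off for the duration."""
    orig_cmd = dev.read(PCI_COMMAND)
    dev.write(PCI_COMMAND, orig_cmd & ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY))
    orig_bar = dev.read(PCI_BASE_ADDRESS_0)
    dev.write(PCI_BASE_ADDRESS_0, 0xFFFFFFFF)   # all ones
    probe = dev.read(PCI_BASE_ADDRESS_0)        # writable bits read back as 1
    dev.write(PCI_BASE_ADDRESS_0, orig_bar)     # restore the address
    dev.write(PCI_COMMAND, orig_cmd)            # re-enable decoding
    return (~(probe & 0xFFFFFFF0) + 1) & 0xFFFFFFFF


if __name__ == "__main__":
    dev = FakeDevice(bar_size=0x10000, bar_base=0xE0000000)
    print(hex(size_bar(dev)), dev.bar_written_while_decoding)
```

The message above notes that the -mm patch leaves host bridges alone; this toy does not model that special case.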
Re: x86: Increase PCIBIOS_MIN_IO to 0x1500 to fix nForce 4 suspend-to-RAM
Carlos Corbacho wrote:
> On Monday 24 December 2007 18:34:21 Linus Torvalds wrote:
>> On Mon, 24 Dec 2007, Rafael J. Wysocki wrote:
>>> Well, having considered that for a longer while, I think the AML code is referring to a device that we have suspended already, and since it's in a low power state, it just can't handle the reference.
>>>
>>> If that is the case, we'll have to find the device (that should be possible using some code instrumentation) and move the suspending of it into the late stage.
>>
>> Yes.
>
> My own experimentation (in device_suspend(), calling _PTS() in the AML after each suspend_device() runs, until one device causes it to hang) points to ohci_hcd being the culprit here (with or without any devices attached). With the ohci_hcd module unloaded, the machine suspends just fine[1].
>
> Of course, I'm at a complete loss as to why suspending OHCI would cause a problem for an IO port write.

The name of the operation region, SMIP, suggests that the BIOS has an SMI trap on that port. In that case, writing to that port will result in the BIOS taking control. We have little idea what it could be doing. It could be trying to access the OHCI controller, which has been suspended already. This sounds kind of like the Toshiba laptops that go nuts somewhere if the AHCI SATA controller gets put into suspend state before the system suspends.

The ACPI spec has the following to say about the _PTS method:

"The platform must not make any assumptions about the state of the machine when _PTS is called. For example, operation region accesses that require devices to be configured and enabled may not succeed, as these devices may be in a non-decoding state due to plug and play or power management operations."

I would guess some BIOS writers failed to heed this.

>> NOTE! This following patch is just for discussion, and while I think it's conceptually a good thing to try, I don't think it will help Carlos' problem. But removing the pci_set_power_state() in agp_nvidia_suspend() might.
>
> nvidia-agp cannot be built on x86-64, so it's not the culprit in this case.

Yeah, and this is a PCI Express system, not AGP, so it wouldn't load anyway.

-- 
Robert Hancock, Saskatoon, SK, Canada
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
Loic Prylli wrote:
> On 12/20/2007 6:21 PM, Tony Camuso wrote:
>> And the MMCONFIG problem with enterprise systems and workstations, where we do control the BIOS (for the most part), is due to known bugs in certain versions of certain chipsets, HT1000, AMD8132, among them, not the BIOS.
>
> The lack of MMCONFIG support is indeed because some HyperTransport chipsets lack that support. But there are some BIOSes out there that are advertising support for all busses in their MCFG ACPI attribute (even the busses managed by some amd8131 in a mixed nvidia-ck804/amd8131 motherboard), and the BIOS seems at least faulty for advertising a capability that does not exist.

This didn't really occur to me before for some reason. But yes, the MCFG table lists the buses to which each MMCONFIG region is applicable. If there are entire buses which MMCONFIG cannot access, the BIOS should not be indicating they are accessible via MMCONFIG in the ACPI MCFG table. If it is, then it's truly a BIOS bug.

Unless of course Linux isn't handling what the MCFG table is indicating properly. Then it's our bug. It would be good to verify this on one of the systems involved.

One of the things this patch (currently in -mm) does is dump out the segment and starting/ending buses for each MCFG configuration listed. The dmesg from this patch applied on such a system would tell you which is the case:

http://git.kernel.org/?p=linux/kernel/git/x86/linux-2.6-x86.git;a=commit;h=e18c985289ee356f06dbc953281a3c140a02fbb3

-- 
Robert Hancock, Saskatoon, SK, Canada
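As a companion to the point about MCFG listing bus ranges, the sketch below decodes the allocation entries from a raw MCFG table and answers whether a given bus is claimed. It is illustrative only and is not the -mm patch referenced above. It assumes the standard layout (36-byte ACPI table header, 8 reserved bytes, then 16-byte entries holding base address, segment group, start bus, and end bus); parse_mcfg, bus_covered, and make_table are hypothetical helper names, and the demo table is synthetic.

```python
import struct

ACPI_HEADER_LEN = 36    # standard ACPI table header
MCFG_RESERVED_LEN = 8   # reserved bytes between the header and the first entry
# Per entry: 64-bit base address, 16-bit segment group, start bus, end bus,
# followed by 4 reserved bytes.
ENTRY = struct.Struct("<QHBB4x")


def parse_mcfg(table):
    """Return a list of (base, segment, start_bus, end_bus) tuples."""
    signature, length = struct.unpack_from("<4sI", table, 0)
    if signature != b"MCFG":
        raise ValueError("not an MCFG table")
    entries = []
    offset = ACPI_HEADER_LEN + MCFG_RESERVED_LEN
    while offset + ENTRY.size <= length:
        entries.append(ENTRY.unpack_from(table, offset))
        offset += ENTRY.size
    return entries


def bus_covered(entries, segment, bus):
    """True if some entry claims MMCONFIG access for this segment and bus."""
    return any(seg == segment and start <= bus <= end
               for _base, seg, start, end in entries)


def make_table(entries):
    """Build a synthetic MCFG blob for the demo (checksum and OEM fields zero)."""
    body = b"".join(ENTRY.pack(*e) for e in entries)
    length = ACPI_HEADER_LEN + MCFG_RESERVED_LEN + len(body)
    header = struct.pack("<4sI", b"MCFG", length) + bytes(ACPI_HEADER_LEN - 8)
    return header + bytes(MCFG_RESERVED_LEN) + body


if __name__ == "__main__":
    table = make_table([(0xE0000000, 0, 0x00, 0x3F)])
    for base, seg, start, end in parse_mcfg(table):
        print(f"segment {seg}: buses {start:#04x}-{end:#04x} at {base:#x}")
    print(bus_covered(parse_mcfg(table), 0, 0x40))
```

If a table on one of the affected boards claimed a bus that sits behind a bridge without MMCONFIG support, that would point at the BIOS rather than at the kernel's handling of the table.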
Re: [patch] Make MMCONFIG space (extended PCI config space) a driver opt-in issue
Arjan van de Ven wrote:
> Hi,
>
> Linus really wants the extended (4 KB) PCI configuration space (using the MCFG ACPI table etc.) to be opt-in, since there are many issues with it and most drivers don't even use/need it. The idea behind opt-in is that if you don't use it, you don't get to suffer the bugs...
>
> Booted on my 64-bit test machine; sadly it has a defunct BIOS that doesn't have a working MCFG.
>
> From: Arjan van de Ven <[EMAIL PROTECTED]>
> Subject: Make MMCONFIG space a driver opt-in
>
> There are many issues with using the extended PCI configuration space (CPU, chipset and most of all BIOS bugs). Meanwhile, the vast majority of drivers and devices don't even use/need to use the memory-mapped access methods, since they don't use the config space beyond the traditional 256 bytes.
>
> This patch makes accessing the extended config space a driver choice, via the pci_enable_ext_config(pdev) API function; drivers that want/need the extended configuration space should call this. (A separate patch will be posted to add this function call to the driver that uses this.)

I don't really like this approach. Whether MMCONFIG works or not has nothing to do with the device itself; it's an attribute of the machine, and possibly of the bus it's been plugged into. This patch might prevent problems in some cases, but it's equally likely to just delay problems until somebody plugs in a device that tries to use extended config space. Nor do I really like the approach of limiting MMCONFIG accesses to ones beyond a certain address in the config space, for a similar reason.

The detection of whether MMCONFIG works or not has to work properly (and I think we're pretty close, or at least we know what we need to do to get there, like fixing the stupid MMCONFIG/PCI BAR sizing overlap problem, and likely Tony Camuso's patch or something like it, to disable MMCONFIG accesses to devices behind certain broken host bridges). Once that works, this patch really serves no purpose.

-- 
Robert Hancock, Saskatoon, SK, Canada
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
Tony Camuso wrote:
> Robert Hancock wrote:
>> First off, I would like to see confirmation from the horses' mouths here (namely AMD, ServerWorks/Broadcom, and whoever else) that there is no other way to get around this problem than disabling MMCONFIG for accesses behind those chips.
>
> I happen to have this one stored in my desktop.
>
> From the AMD-8132™ HyperTransport™ PCI-X® 2.0 Tunnel Revision Guide
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/30801.pdf
>
> 79 AMD-8132™ Tunnel Lacks Extended Configuration Space Memory-Mapped I/O Base Address Register
>
> Description
> Current AMD processors do not natively support PCI-defined extended configuration space. A memory mapped I/O base address register (MMIO BAR) is required in chipset devices to support extended configuration space. The AMD-8132 does not have this MMIO BAR.
>
> Potential Effect On System
> The AMD-8132 is a PCI-X® Mode 2 capable device and requires the MMIO BAR to support extended configuration space. Using a device which does have this MMIO BAR and an AMD-8132 on the same HyperTransport™ link of the processor may cause firmware/software problems.
>
> The base configuration space of the AMD-8132 and PCI(-X) devices attached to it are accessible using only the mechanism defined in PCI 2.3. Registers of PCI-X Mode 2 devices attached to the AMD-8132 in the extended configuration space are not accessible. The AMD-8132 has no registers in the extended configuration space.
>
> Suggested Workaround
> It is strongly recommended that system designers do not connect the AMD-8132 and devices that use extended configuration space MMIO BARs (ex: HyperTransport-to-PCI Express® bridges) to the same processor HyperTransport link.
>
> Fix Planned
> No

That does sound fairly definitive. I have to wonder why certain system designers then didn't follow their strong recommendation.

-- 
Robert Hancock, Saskatoon, SK, Canada
Re: [Fwd: Re: [PATCH 0/5]PCI: x86 MMCONFIG]
Tony Camuso wrote:
> Greg KH wrote:
>> Sure, I realize this, but it solves the problem in one way for broken hardware, such that it at least allows it to work, right?
>>
>> It also provides a better incentive for the manufacturer to fix their BIOS, which, as you are on-site at HP, it would seem odd that they would just not do that instead of trying to work around this in the kernel...
>>
>> thanks,
>>
>> greg k-h
>
> I don't think that many OEMs have that much control over the BIOS in their "value lines". :)
>
> And the MMCONFIG problem with enterprise systems and workstations, where we do control the BIOS (for the most part), is due to known bugs in certain versions of certain chipsets, HT1000, AMD8132, among them, not the BIOS.
>
> Anyway, we are devising better ways to deal with these anomalies than blacklists and telling customers to use "pci=nommconf".
>
> And we're bringing them to the community for discussion, improvement, and, we hope, acceptance.

First off, I would like to see confirmation from the horses' mouths here (namely AMD, ServerWorks/Broadcom, and whoever else) that there is no other way to get around this problem than disabling MMCONFIG for accesses behind those chips. The case of the device built into the K8 northbridge that's unreachable by MMCONFIG kind of makes sense, since the northbridge is what's translating the MMCONFIG memory access into config accesses. It seems bizarre to me that a bridge chip could possibly have such a problem. The MMCONFIG access should get translated into a configuration space access in the northbridge, and from that point on there's no difference between an MMCONFIG and a type1 access.

Look at MSI for another example: we recently had a patch from NVIDIA posted to enable HyperTransport MSI mapping bits on some chipsets so that MSI would function, because the BIOS failed to set them up properly. Are we sure there's not a similar BIOS configuration issue that could ideally be fixed in the BIOS, or else fixed up in the kernel?

-- 
Robert Hancock, Saskatoon, SK, Canada
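The argument that an MMCONFIG access and a type1 access end up as the same configuration cycle can be illustrated by how each mechanism encodes the target. This is a sketch for illustration, not kernel code: mmconfig_address and type1_address are hypothetical helper names, and the base address in the demo is arbitrary. The bit layouts are the standard ones: MMCONFIG uses base + (bus << 20 | device << 15 | function << 12 | register) with 4096 bytes per function, while the type1 mechanism writes an enable bit plus bus/device/function/register to port 0xCF8 and can reach only the first 256 bytes.

```python
def _check_bdf(bus, dev, fn):
    if not (0 <= bus <= 255 and 0 <= dev <= 31 and 0 <= fn <= 7):
        raise ValueError("bus/device/function out of range")


def mmconfig_address(base, bus, dev, fn, reg):
    """Physical address of a config register in the memory-mapped window."""
    _check_bdf(bus, dev, fn)
    if not 0 <= reg < 4096:
        raise ValueError("MMCONFIG reaches 4096 bytes of config space per function")
    return base + ((bus << 20) | (dev << 15) | (fn << 12) | reg)


def type1_address(bus, dev, fn, reg):
    """Value written to the CONFIG_ADDRESS port (0xCF8) for a type1 access."""
    _check_bdf(bus, dev, fn)
    if not 0 <= reg < 256:
        raise ValueError("type1 reaches only the first 256 bytes")
    return 0x80000000 | (bus << 16) | (dev << 11) | (fn << 8) | (reg & 0xFC)


if __name__ == "__main__":
    # Same target (bus 1, device 2, function 3, register 0x10), two encodings.
    print(hex(mmconfig_address(0xE0000000, 1, 2, 3, 0x10)))
    print(hex(type1_address(1, 2, 3, 0x10)))
```

Both encodings carry the same bus, device, function, and register; the visible difference is the register range, which is why only MMCONFIG can reach the extended space.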
Re: [lm-sensors] 2.6.24-rc4 hwmon it87 probe fails
Carlos Corbacho wrote:
> On Thursday 20 December 2007 00:20:21 Bjorn Helgaas wrote:
>> I suspect the manufacturers would say "Oh, the sensors? The BIOS isn't broken, you're just supposed to use WMI or some (undocumented) ACPI device to get at those."
>
> It's quite possible - can we have DSDTs for the boards in question so we can quickly check if this is a possibility? (Basically, to see if they have PNP0C14 devices - if they don't, then I'm afraid it's nothing to do with WMI.)
>
> -Carlos

It's quite possible that the BIOS accesses the device either from ACPI AML or possibly even from SMI. In that case it would be quite reasonable for the BIOS to reserve that region to prevent another driver from loading and trying to take conflicting control of the device. One has to be careful before assuming that any such reservation is bogus.

-- 
Robert Hancock, Saskatoon, SK, Canada
Re: [PATCH 0/5]PCI: x86 MMCONFIG
Greg KH wrote:
> On Wed, Dec 19, 2007 at 05:17:46PM -0500, [EMAIL PROTECTED] wrote:
>> OVERVIEW
>> ===
>> The patches should be applied in sequence to obviate any possible build problems. The patch-set was built against 2.6.24-rc5.
>>
>> Description
>> ===
>> There exist devices that do not respond correctly to PCI MMCONFIG accesses in x86 platforms.
>
> What devices are these? Do you have reports of them somewhere?
>
>> This patch-set detects the problem by comparing an MMCONFIG read to a Legacy PCI config read of the vendor/device dword of every device discovered during the PCI probing sequence. A miscompare means that a device does not correctly respond to MMCONFIG accesses. When the patch code detects this condition, the bus that serves this device, and all subordinate buses, will be programmed to use Legacy PCI Config accesses.
>>
>> This patch-set DOES NOT detect devices that generate machine checks against MMCONFIG accesses. For such systems, "pci=nommconf" is required in the boot command.
>
> That sounds like this patchset can cause bad side effects on hardware that currently works just fine. That is not a good thing to be adding to the kernel, right?
>
> I think we need more details on why this patch is needed.

Also, we already have something like this in arch/x86/pci/mmconfig-shared.c, in the unreachable_devices function. This attempts to detect devices where MMCONFIG cannot access the configuration space (one of these would be at least one device in the AMD K8 built-in northbridge). If this is not sufficient, I would suggest expanding that mechanism instead of adding all this new code.

-- 
Robert Hancock, Saskatoon, SK, Canada
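The detection scheme described in the quoted patch overview (compare an MMCONFIG read of the vendor/device dword with a legacy read, and fall back to legacy access for the affected bus and everything below it) can be modelled in a few lines. This is only a model of the description, not the patch set itself and not unreachable_devices(): the accessor callbacks, the children map, and all ID values are hypothetical stand-ins. As the overview says, a device that raises a machine check on MMCONFIG access cannot be caught this way.

```python
def buses_needing_legacy(devices, read_mmconfig, read_legacy, children):
    """Return the set of buses that should fall back to legacy config access.

    devices       -- iterable of (bus, dev, fn) tuples found during probing
    read_mmconfig -- callable(bus, dev, fn, reg) -> dword via MMCONFIG
    read_legacy   -- callable(bus, dev, fn, reg) -> dword via legacy access
    children      -- dict mapping a bus number to its directly subordinate buses
    """
    bad = set()

    def mark(bus):
        # Mark this bus and, recursively, every bus below it.
        if bus in bad:
            return
        bad.add(bus)
        for child in children.get(bus, ()):
            mark(child)

    for bus, dev, fn in devices:
        # Register 0 holds the vendor/device ID dword.
        if read_mmconfig(bus, dev, fn, 0) != read_legacy(bus, dev, fn, 0):
            mark(bus)
    return bad


# Arbitrary ID dwords for the demo; they do not describe real hardware.
IDS = {(0, 0, 0): 0x11111111, (2, 1, 0): 0x22222222, (3, 0, 0): 0x33333333}
TOPOLOGY = {0: [1, 2], 2: [3]}


def fake_legacy(bus, dev, fn, reg):
    return IDS[(bus, dev, fn)]


def fake_mmconfig(bus, dev, fn, reg):
    # Pretend MMCONFIG reads of devices on bus 2 return all ones.
    return 0xFFFFFFFF if bus == 2 else IDS[(bus, dev, fn)]


if __name__ == "__main__":
    print(sorted(buses_needing_legacy(IDS, fake_mmconfig, fake_legacy, TOPOLOGY)))
```

In the demo, bus 3 is flagged even though its own device compares equal, because it sits below bus 2; that mirrors the "all subordinate buses" wording in the overview.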
Re: [RFC/PATCH] 2.6.24-rcx: Make sys_poll() wait at least timeout ms
Karsten Wiese wrote:
> On Wednesday, 19 December 2007, Robert Hancock wrote:
>> That seems fishy. What is your value of HZ and what is the timeout value that was passed in the bad case?
>
> HZ set to 250, timeout to 4ms. Time spent in poll() taken by clock_gettime(CLOCK_MONOTONIC, &time) before and after the poll() call: i.e. 62us. Time measured with hpet gave 166us once.

msecs_to_jiffies (kernel/time.c) has this:

    #if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ)
        /*
         * HZ is equal to or smaller than 1000, and 1000 is a nice
         * round multiple of HZ, divide with the factor between them,
         * but round upwards:
         */
        return (m + (MSEC_PER_SEC / HZ) - 1) / (MSEC_PER_SEC / HZ);

With HZ=250 and m=4 this gives 7/4 or only 1 jiffy, which is not more than 4ms, but if we are already near the end of the current jiffy it could be much less than that (potentially almost no time at all).

Maybe we could convert poll to use a hrtimer for this instead?

-- Robert Hancock
Re: Out of memory and no killable processes: 2.6.22-2-686-bigmem
Nico Schottelius wrote:
> Hello!
>
> We are running Debian with 2.6.22-2-686-bigmem on Dell Blade 1955 hardware and get a Kernel Panic with oom + message that there are no processes left to kill:
>
> http://home.schottelius.org/~nico/unix/linux/oom_no_killable-2.6.22-1.jpeg
>
> Anyone an idea, what's the cause for that?
>
> This error happened on two of those machines. What I can see in our analysis done with munin is that the number of open inodes and inode table size decreased within some days from 40k to next to zero.
>
> Munin uses
>
>     awk '{print "used.value " $1-$2 "\nmax.value " $1}' < /proc/sys/fs/inode-nr
>
> to log those values (happened on both machines).
>
> Thanks for any hint and CC as usual, please.

How much RAM is in these machines? If you're running tons of memory, it really is better to run a 64-bit kernel if possible. I believe there are some cases where low memory can be pretty easily exhausted on machines with lots of high memory.

-- Robert Hancock
Re: PCI resource problems caused by improper address rounding
Linus Torvalds wrote:
> On Mon, 17 Dec 2007, Chuck Ebbert wrote:
>> Looks like a commit that I can't find in git due to the arch merge has broken PCI address assignment. This patch by Richard Henderson against 2.6.23 fixes it for x86_64:
>>
>> --- linux-2.6.23.x86_64/arch/x86_64/kernel/e820.c 2007-10-09 13:31:38.0 -0700
>> +++ linux-2.6.23.x86_64-rth/arch/x86_64/kernel/e820.c 2007-12-15 12:37:44.0 -0800
>> @@ -718,8 +718,8 @@ __init void e820_setup_gap(void)
>>  	while ((gapsize >> 4) > round)
>>  		round += round;
>>  	/* Fun with two's complement */
>> -	pci_mem_start = (gapstart + round) & -round;
>> +	pci_mem_start = (gapstart + round - 1) & -round;
>
> No, it's very much meant to be that way.
>
> We do *not* want to have the PCI memory abut the end of memory exactly. So it leaves a gap in between "gapstart" and the actual start of PCI memory addressing very much on purpose.
>
> In fact, the very commit (it's f0eca9626c6becb6fc56106b2e4287c6c784af3d in the kernel tree) you mention actually explicitly *explains* that, although maybe it's a bit indirect: if you start allocating PCI resources directly after the end-of-RAM thing, you can easily end up using addresses that are actually inside the magic stolen system RAM that is being used for UMA video etc.
>
> So you very much want to have a buffer in between the end-of-RAM and the actual start of the region we try to allocate in.
>
> So why do you want them to be close, anyway?
>
> Linus
>
> PS. On a different topic: if you do
>
>     git log --follow arch/x86/kernel/e820_64.c
>
> you'd see the history past the renames in git. Or just do a "git blame -C" which will also follow renames (and copies).

That patch is from the 2.6.14 era - I don't think we even did PnP ACPI resource reservation handling then? It could be that the BIOS was trying to tell us that UMA memory region is reserved using PnP ACPI reservations, but we just ignored it.

It seems rather arbitrary in how much it leaves unused - and in this case, likely prevents us from using the nice big open gap that the BIOS presumably expected the graphics card to be mapped into.

I suspect this buffer space insertion is really not needed at this point. The patch description is likely technically correct in that the BIOS should have reserved it in E820, but (according to MS comments in a presentation I read) Windows doesn't use E820 for anything other than figuring out where RAM is, it uses PnP ACPI for figuring out areas it needs to avoid. Since BIOS writers test against that behavior, there are surely lots of systems where ignoring PnP ACPI reservations and relying only on E820 would result in things really going blammo (like mapping things over MMCONFIG tables for instance). So disabling it on modern machines is really not an option. And if it's enabled, you likely wouldn't hit the problem it tries to fix.

-- Robert Hancock
Re: PCI resource problems caused by improper address rounding
Linus Torvalds wrote:
> On Tue, 18 Dec 2007, Chuck Ebbert wrote:
>> On 12/18/2007 04:09 PM, Linus Torvalds wrote:
>>> I wonder what the heck is the point of that pnp entry. Just for fun, can you try to just disable CONFIG_PNP, and see if it all works then?
>>
>> pnpacpi=off should work. PnP is also trying (and failing) to reserve all physical memory.
>
> Yeah, that really is a pretty confused-looking pnp table thing. But I have absolutely zero idea how PnP is even supposed to work - the whole thing is just a total hack for Windows, afaik.
>
> The sad part is that *normally* the right thing to do about almost any BIOS information is what we do right now: just avoid that magic address range like the plague, because we have no clue what the heck the BIOS is up to. But it looks like in this particular case, some of the problems may arise exactly *because* we avoid that range.
>
> It would be good to know what Windows does. If ACPI is found, does it perhaps just ignore all the PnP entries these days?
>
> Linus

ACPI is where those PnP entries are coming from (on any modern system anyway). They do show up in Device Manager as devices with resources (the one that reserves all of system RAM on my machine is labeled "System board", others like the one that reserves the MMCONFIG aperture are "Motherboard resources" - the name is based on the PNP device ID, I believe).

It could be that Windows is stupid enough that it will map things over top of physical RAM if the BIOS doesn't explicitly reserve it like that. I suspect based on some comments in Microsoft documents that Windows uses the E820 table to figure out where the RAM is, and ACPI/PnP information to figure out where IO mappings are, but may not really combine those two pieces of information into one overall map like Linux does, which would explain why it needs to reserve all physical RAM.

(As mentioned in another post, I would guess the BIOS is reserving that memory range since it's the MMCONFIG aperture.)

-- Robert Hancock
Re: PCI resource problems caused by improper address rounding
Linus Torvalds wrote:
> On Tue, 18 Dec 2007, Richard Henderson wrote:
>> I've added dmesg, /proc/iomem, and lspci -v output to that bug. Basically, we have
>>
>>     c000-cfff : free
>>     ddf0-dfef : PCI Bus #04
>>     e000-efff : pnp 00:0b
>>     f000-fedf : less than 256MB
>
> Gaah. That really is very unlucky. That 256M only goes at one point in the low 4GB, but the thing is, it fits perfectly well above it, and dammit, that resource is explicitly a 64-bit resource for a really good reason.
>
> However, I wonder about that
>
>     e000-efff : pnp 00:0b
>
> thing. I actually suspect that that whole allocation is literally *meant* for that 256MB graphics aperture, but the kernel explicitly avoids it because it's listed in the PnP tables.

That is probably the MMCONFIG aperture; in that case any attempt to map the graphics BAR there will have disastrous results. (This BIOS has an MCFG table, though it looks like this Fedora kernel has MMCONFIG disabled, so we can't tell what it actually contains.)

-- Robert Hancock
Re: Memory Read Error
shashi59 wrote:
> I am a newbie to the Linux kernel. How can I read the memory area like the range between to . When I read that area directly it shows an error like this: "unable to handle kernel paging request at virtual address ". So I don't know how to solve this error. Please, anyone, help me.

First off, why are you trying to do this, and how? Without such details it's impossible to answer this question.

-- Robert Hancock
Re: [RFC/PATCH] 2.6.24-rcx: Make sys_poll() wait at least timeout ms
Karsten Wiese wrote:
> Hi,
>
> while playing with jackd on 2.6.24-rcx, I found poll() timing out too early. That is: earlier than its timeout argument specified. Setting poll()'s timeout argument to "required timeout" + "1 jiffy in ms" fixed it. Patch below should fix it too. Correct? Untested.
>
> Otherwise 2.6.24-rc5 ticks just fine here, thanks.
>
> Karsten
> ->
>
> Make sys_poll() wait at least timeout ms
>
> schedule_timeout(jiffies) waits for at least jiffies - 1. Add 1 jiffie to the timeout_jiffies calculated in sys_poll() to wait at least timeout_msecs, like poll() manpage says.
>
> Signed-off-by: Karsten Wiese <[EMAIL PROTECTED]>
> ---
>  fs/select.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/fs/select.c b/fs/select.c
> index 47f4792..5633fe9 100644
> --- a/fs/select.c
> +++ b/fs/select.c
> @@ -739,7 +739,7 @@ asmlinkage long sys_poll(struct pollfd __user *ufds, unsigned int nfds,
>  			timeout_jiffies = -1;
>  		else
>  #endif
> -			timeout_jiffies = msecs_to_jiffies(timeout_msecs);
> +			timeout_jiffies = msecs_to_jiffies(timeout_msecs) + 1;
>  	} else {
>  		/* Infinite (< 0) or no (0) timeout */
>  		timeout_jiffies = timeout_msecs;

That seems fishy. What is your value of HZ and what is the timeout value that was passed in the bad case?

-- Robert Hancock
Re: [PATCH] x86_64: fix problems due to use of "outb" to port 80 on some AMD64x2 laptops, etc.
Ingo Molnar wrote:
> * H. Peter Anvin <[EMAIL PROTECTED]> wrote:
>> Pavel Machek wrote:
>>>> this is also something for v2.6.24 merging.
>>>
>>> As much as I like this patch, I do not think it is suitable for .24. Too risky, I'd say.
>>
>> No kidding! We're talking about removing a hack that has been successful on thousands of pieces of hardware over 15 years because it
>  ^[*]
>> breaks ONE machine.
>
> [*] "- none of which needs it anymore -"
>
> there, fixed it for you ;-)
>
> So let's keep this in perspective: this is a hack that only helps on a very low number of systems. (the PIT of one PII era chipset is known to be affected)
>
> unfortunately this hack's side-effects are mis-used by an unknown number of drivers to mask PCI posting bugs. We want to figure out those bugs (safely and carefully) and we want to remove this hack from modern machines that don't need it. Doing anything else would be superstition.

Are there any known examples of such drivers? It doesn't seem to make much sense. PCI IO writes are not posted on any known system (the spec allows them to be posted in the host bus bridge, but if they were they could only be flushed by a read, not a write) and PCI MMIO writes are only guaranteed to flush by doing a read from that device, not by other random port accesses. I suppose using the _p versions of port accesses might happen to mask such problems on certain machines.

-- Robert Hancock
Re: [PATCH] x86_64: fix problems due to use of "outb" to port 80 on some AMD64x2 laptops, etc.
David P. Reed wrote:
> PS: If I have time, I may try to build Rene's port 80 test for Windows and run it under WinXP on this machine (I still have a crappy little partition that boots it). If it freezes the same way, it's almost certainly a design "feature", and if it doesn't freeze, we might suspect that there is compensating logic in either Windows ACPI code or some way that Windows "sets up" the machine.

You'd have to replace the iopl call with an equivalent one for Windows (seems like NtSetInformationProcess(ProcessUserModeIOPL) might do what you need).

-- Robert Hancock
Re: New question on that sata controller
Gene Heskett wrote:

Greetings; When I asked about a sata controller earlier this week, I gave a link to it. Unforch (maybe) when it actually arrived, the cards box showed a silicon image chip, and the card had a via. So much for getting what I ordered... The required module then was sata_via, not sata_uli, and it seems to be working ok. However, this one claims its a raid controller according to an lspci -v:

01:0a.0 RAID bus controller: VIA Technologies, Inc. VT6421 IDE RAID Controller (rev 50)
Subsystem: VIA Technologies, Inc. VT6421 IDE RAID Controller
Flags: bus master, medium devsel, latency 32, IRQ 19
I/O ports at 9400 [size=16]
I/O ports at 9800 [size=16]
I/O ports at 9c00 [size=16]
I/O ports at a000 [size=16]
I/O ports at a400 [size=32]
I/O ports at a800 [size=256]
[virtual] Expansion ROM at e900 [disabled] [size=64K]
Capabilities: [e0] Power Management version 2

I just noted that the Expansion ROM is disabled, but I didn't see any jumpers to enable it on the card prior to installing it. Does anyone know how this is supposed to work? I would like to make it directly bootable but I believe this has to be 'enabled' for that.

It's usually normal for it to be disabled after boot, I believe. Are you getting anything showing up on boot indicating its BIOS is active?

I cannot find any references to this particular chip in a 'make xconfig' for 2.6.24-rc5. Should this be a concern, or is this one a 'Just Works(TM)' chipset? This card has 3 sata port connectors and one ide fitted.

Two rather pleasant side effects of going to the Biostar.tw site and finding a newer bios and installing it on an M7NCD Pro mobo are:
1: FSB now running at 400MHZ, was 333 before as it was not at all stable at 400 and I have been told the XP-2800 Athlon only supports 333 and AMD's site agrees.
2: CPU temps are down around 13F. CPU speed still the same at 2079MHZ according to dmesg.

The reduced temps at a higher FSB indicates better interface timing, and if it runs the rest of the night at 400 without a self reboot or crash, I'll leave it there.

-- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/
Re: Could not set non-blocking flag with 2.6.24-rc5
Tino Keitel wrote:

Hi folks, I often build Debian packages inside a chroot. Today I discovered a failure during an "aptitude update", which is a command to download new package lists for the package management. In strace, the lines around the failure look like this:

99% [Working]) = 14 14
[pid 5986] select(6, [3 4 5], [], NULL, {0, 50}) = 0 (Timeout)
[pid 5986] gettimeofday({1197576353, 670510}, NULL) = 0
[pid 5986] rt_sigprocmask(SIG_BLOCK, [WINCH], [], 8) = 0
[pid 5986] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
99% [Working]) = 14 14
[pid 5986] select(6, [3 4 5], [], NULL, {0, 50}) = 0 (Timeout)
[pid 5986] gettimeofday({1197576354, 173902}, NULL) = 0
[pid 5986] rt_sigprocmask(SIG_BLOCK, [WINCH], [], 8) = 0
[pid 5986] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
99% [Working]) = 14 14
[pid 5986] select(6, [3 4 5], [], NULL, {0, 50} <unfinished ...>
[pid 5988] <... select resumed> ) = 1 (in [3], left {105, 0})
[pid 5988] read(3, "", 56559) = 0
[pid 5988] fcntl64(-1, F_GETFL)= -1 EBADF (Bad file descriptor)
[pid 5988] fcntl64(-1, F_SETFL, O_ACCMODE|O_CREAT|O_EXCL|O_NOCTTY|O_TRUNC|O_APPEND|O_SYNC|O_ASYNC|O_DIRECT|O_LARGEFILE|O_DIRECTORY|O_NOFOLLOW|O_NOATIME|0xfff8003c) = -1 EBADF (Bad file descriptor)
[pid 5988] write(2, ""..., 41FATAL -> Could not set non-blocking flag ) = 41
[pid 5988] write(2, ""..., 19Bad file descriptor) = 19
[pid 5988] write(2, ""..., 1 ) = 1
[pid 5988] exit_group(100) = ?
Process 5988 detached

This happened with a kernel after 2.6.24-rc5 (4af75653031c6d454b4ace47c1536f0d2e727e3e). I rebooted into 2.6.23.8 and it worked. Now I rebooted into 2.6.24-rc5 again and was able to reproduce the failure, so it looks like a kernel issue to me.

With this part of strace output it seems like an obvious userspace bug (calling fcntl on a -1 file descriptor). Could be some other change in behavior or timing difference is triggering the bug, however.

-- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/
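The trace above shows fcntl being called on descriptor -1 twice in a row. As a hedged illustration of how user-space code can avoid that pattern (this is not aptitude's actual code, and the helper name `set_nonblocking` is made up for the sketch), a defensive version checks both the descriptor and the F_GETFL result before calling F_SETFL:

```c
#include <errno.h>
#include <fcntl.h>

/*
 * Set O_NONBLOCK on fd. Returns 0 on success, -1 with errno set on failure.
 * The descriptor and the F_GETFL result are both checked before use, so an
 * error return is never passed on to F_SETFL as if it were a flag word.
 */
static int set_nonblocking(int fd)
{
    int flags;

    if (fd < 0) {
        errno = EBADF;          /* same error the kernel reported in the trace */
        return -1;
    }
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;              /* errno already set by fcntl */
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return 0;
}
```

A caller that gets -1 here still has to find out why it was holding an invalid descriptor in the first place, which is the part that may depend on timing.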
Re: Strange ATA problems
Tejun Heo wrote:

Dec 14 01:06:33 fermat kernel: ata1: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 status 0x400 next cpb count 0x0 next cpb idx 0x0
Dec 14 01:06:33 fermat kernel: ata1: CPB 0: ctl_flags 0x1f, resp_flags 0x2
Dec 14 01:06:33 fermat kernel: ata1: CPB 1: ctl_flags 0x1f, resp_flags 0x2
Dec 14 01:06:33 fermat kernel: ata1: CPB 2: ctl_flags 0x1f, resp_flags 0x2
Dec 14 01:06:33 fermat kernel: ata1: CPB 3: ctl_flags 0x1f, resp_flags 0x2
Dec 14 01:06:33 fermat kernel: ata1: CPB 4: ctl_flags 0x1f, resp_flags 0x2
Dec 14 01:06:33 fermat kernel: ata1: CPB 5: ctl_flags 0x1f, resp_flags 0x2
Dec 14 01:06:33 fermat kernel: ata1: CPB 6: ctl_flags 0x1f, resp_flags 0x2
Dec 14 01:06:33 fermat kernel: ata1: CPB 7: ctl_flags 0x1f, resp_flags 0x2
Dec 14 01:06:33 fermat kernel: ata1: CPB 8: ctl_flags 0x1f, resp_flags 0x2
Dec 14 01:06:33 fermat kernel: ata1: CPB 9: ctl_flags 0x1f, resp_flags 0x2
Dec 14 01:06:33 fermat kernel: ata1: CPB 10: ctl_flags 0x1f, resp_flags 0x2

CPB flags stuck at 0x2 indicates that the controller issued the command to the drive and is waiting for completion. Usually seems to indicate some kind of SATA communication problem.

If your USB cdrom is bus powered and you yanked it, it could have caused fluctuation in power which in turn can cause disruption on serial ATA bus leading to transmission error and timeouts. There are other possibilities but this kind of thing does happen often with SATA. Those highspeed low-voltage serial links are very susceptible to interferences.

Well,.. it actually "worked" again when I unplugged it, but the errors from the cdrom above are probably unrelated..

As long as EH recovered it properly, there's nothing to worry about.

What does that mean?

That means unless the problem continues to occur repeatedly, you don't have to worry about it.

Yes, if it didn't recur, was likely just a transient glitch.

-- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/
Re: ATA ACPI needs "Mr interpreter, would you please shut up?" flag
Tejun Heo wrote:

Hello, all. During 2.6.24-rc1, libata enabled ATA-ACPI support by default and there have been a lot of regression reports stemming from it. I have patchset ready to fix most of the problems. With these patches applied, libata should be able to cope with most failures pretty well. There is one remaining issue tho.

libata caches the result of _GTM during controller for later use. The primary use is to peek at how BIOS configured the controller. Some controllers (pata_via and pata_amd) lack proper cable detection and BIOS configured values are used as reference. This caching is done before any other operation is performed on the port to avoid caching corrupted data.

Problem is that _GTM implementation on certain BIOSen crap themselves if invoked on empty channels. However, as written above, because initial _GTM caching is done before any actual operation is performed on the port, libata can't determine whether the port is occupied or not when trying to cache _GTM result. Unfortunately, VIA PATA is on both categories - it needs _GTM caching but can't cope with _GTM invocation on empty ports. Yay!

I seem to have lost the thread/bug report where we decided that one board always choked on an empty channel. Maybe it's not that and it's just another case of the same issue where our resetting default timing values on the controller before calling _GTM would choke the _GTM method?

-- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/
Re: PCI resource unavailable on mips
Jon Dufresne wrote:

Hi, I've done a bit of linux driver development on x86 in the past. Currently I am working on my first ever linux driver for a mips box. I started by testing the device in an x86 box and got it reasonable stable and am now testing it in the mips box. There appears to be a major problem, one unlike I have ever seen before.

My PCI device has three BARS. This can be confirmed by the Technical documentation and the x86 code. When the pci device is first probed, I run a loop to printk out the bar information, this is just as a sanity check. Here is the output on the x86:

Bar0:PHYS=e000 LEN=0400
Bar1:PHYS=efa0 LEN=0020
Bar2:PHYS=e800 LEN=0400

So, two 64MB BARs and a 2MB one?

but here is the output on the mips:

Bar0:PHYS=2000 LEN=0400
Bar1:PHYS=2400 LEN=0020
Bar2:PHYS= LEN=

notice, BAR2 has no valid information on the mips. I tried to run "pci_enable_device" before printing this information, as suggested by LDD but it did not help. Has anyone seen a problem like this before and any idea how I can get BAR2 a proper address? If I examine the config space directly there is an address in BAR2's register, however it isn't in the 0x2000 range like the other two, instead it is 0x1c00. Also if I do a ``cat /proc/iomem'' I correctly see BAR0 and BAR1 in the output, but not BAR2.

Any PCI resource allocation errors in dmesg during the boot process? Could be the kernel wasn't able to find a place to map all of the BARs.

-- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/
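For readers checking the "two 64MB BARs and a 2MB one" reading of the output above: the length of a memory BAR comes from the value the BAR returns after all ones have been written to it. The sketch below shows that arithmetic in C. The function name `mem_bar_size` is made up for this example, and the read-back values in the usage note are assumptions chosen to match 64MB and 2MB, since the hex values in the archived message appear truncated.

```c
#include <stdint.h>

/*
 * Size in bytes of a 32-bit memory BAR, computed from the value read back
 * after writing 0xFFFFFFFF to it. The low 4 bits hold type and prefetch
 * flags rather than address bits, so they are masked off first.
 * Returns 0 if the BAR is unimplemented (no writable address bits).
 */
static uint32_t mem_bar_size(uint32_t readback)
{
    uint32_t mask = readback & ~UINT32_C(0xF);

    if (mask == 0)
        return 0;
    return ~mask + 1;   /* lowest writable address bit equals the size */
}
```

Under these assumptions a read-back of 0xFC000000 corresponds to a 0x04000000-byte (64MB) BAR and 0xFFE00000 to a 0x00200000-byte (2MB) BAR. The size is a property of the device, so it should be the same on the x86 and the mips box; only the assigned addresses differ.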
Re: Possible issue with dangling PCI BARs
Benjamin Herrenschmidt wrote:
On Thu, 2007-12-13 at 14:05 +1100, Benjamin Herrenschmidt wrote:
On Thu, 2007-12-13 at 14:00 +1100, Benjamin Herrenschmidt wrote:

.../... (oops, sent too fast)

So not only we can have a dangling BAR, but nothing prevent us to actually go turn IO or MEM decoding on in case it wasn't already the case on that device.

And I was about to say before I clicked "send".. can't we do something like writing all ff's into the BAR at the same time as we clear res->start ? Isn't that supposed to pretty much disable decoding on that BAR ? Or not... Probably still better than leaving it to whatever dangling value it had no ?

Ok, reading some other threads, it seems that writing all ff's will not be a very good alternative on x86 machines where MMCONFIG sits up there... I suppose there is nothing totally safe that can be done, thanks to Intel not thinking about making BARs individually enable/disable'able (or size-able without interrupting access, among other numerous fuckups in the PCI spec).

So if a BAR is left dangling, I think we -must- disable MEM and IO decoding on the whole device. In fact, the whole trick of passing a bitmask of required BARs to pci_enable_device_bars() in the first place doesn't fly. Yuck.

We could do a bit better than that - a common use case with pci_enable_device_bars would be where the device has some IO space that we don't care about because we only want to use MMIO space. If we only want to enable MMIO BARs then we don't need to enable IO decoding, and in that case it doesn't matter if we failed to find space for the IO space and it overlaps something else. It looks like we already handle the "not enabling IO decoding" part in this case, except that it doesn't look like we ever would disable the decoding if it was already enabled.

For the case where you say "I want to enable decoding for this MMIO BAR, but not that one", though, I don't see an obvious way to provide that guarantee with certainty. Normally, one would expect that if a BAR is mapped safely outside the decode window of a PCI bridge it's behind, that it won't ever see the requests and can't respond to them. However, the Intel chipset MMCONFIG overlap fiasco appears to show that this is not always the case and in some cases the device can see and respond to requests outside of the bridge's decode window (with higher decode priority than the MMCONFIG aperture, even)..

-- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/
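The policy discussed in this message can be sketched in a few lines of C. This is only an illustration of the reasoning, not the kernel's `pci_enable_device_bars()` implementation: the `bar_info` structure and the function `decode_bits_for_bars` are hypothetical, and only the two command-register bit values (I/O decode = bit 0, memory decode = bit 1) come from the PCI specification.

```c
#include <stdint.h>

#define CMD_IO      0x1u   /* PCI command register bit 0: I/O space decode */
#define CMD_MEMORY  0x2u   /* PCI command register bit 1: memory space decode */

/* Hypothetical, simplified view of one BAR for this sketch. */
struct bar_info {
    int is_io;      /* 1 = I/O BAR, 0 = memory BAR */
    int assigned;   /* 1 = has a valid, non-conflicting address */
    int present;    /* 1 = BAR is implemented by the device */
};

/*
 * Work out which decode bits may be set for the BARs selected in want_mask.
 * Decoding is enabled per address space, not per BAR, so a space is refused
 * if ANY implemented BAR of that space is unassigned (dangling), even one
 * the driver did not ask for. A dangling BAR in a space that nobody needs
 * is harmless as long as that space stays disabled.
 * Returns 0 and stores the bits in *cmd, or -1 if a needed space is unsafe.
 */
static int decode_bits_for_bars(const struct bar_info *bars, int nbars,
                                unsigned want_mask, uint16_t *cmd)
{
    uint16_t need = 0, unsafe = 0;
    int i;

    for (i = 0; i < nbars; i++) {
        uint16_t bit = bars[i].is_io ? CMD_IO : CMD_MEMORY;

        if (!bars[i].present)
            continue;
        if (want_mask & (1u << i))
            need |= bit;
        if (!bars[i].assigned)
            unsafe |= bit;
    }
    if (need & unsafe)
        return -1;
    *cmd = need;
    return 0;
}
```

A caller following the suggestion above would also clear any decode bit that is already set in the command register but absent from `*cmd`, which is the step the message notes is currently missing.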
Re: mmaping an IO port device
Aras Vaichas wrote:

Hi, Can I implement mmap with an io port connected device on an x86 based CPU?

Background: I've got a device driver which can be compiled for either x86 or ARM. The driver provides an interface to an FPGA via either an IO port (0x180) on the x86 or as a memory mapped SRAM-like device (0x3000) on the ARM. To get myself an "address" for ioread calls I use:

FPGA_base = (u32) ioremap_nocache(FPGA_REG_IO_BASE, SZ_4K)

for both CPU types. FPGA_REG_IO_BASE is set to either 0x180 or 0x3000 for x86 and ARM respectively. I then call ioread16(FPGA_base + FPGA_register) for both x86 and ARM and it all works perfectly. No problems there.

My problem is that I am now moving from ioctl calls to a mmap interface. This isn't a problem with ARM as I can pass (0x3000 >> PAGE_SHIFT) to remap_pfn_range() in the .mmap fops function but I can't pass 0x180 because ... well, it's obvious. Is there a trick?

Aras

It's impossible to mmap an IO port area on x86 since IO ports are not accessible as part of the normal memory space. The only way to get access to IO ports in userspace is to use iopl (which requires root privileges) and then executing inl/outl, etc. instructions directly.

-- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/
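A rough sketch of the iopl route described in the reply, for Linux on x86 with glibc's `<sys/io.h>`. The helper name `fpga_read16` is hypothetical, the port base 0x180 is taken from the question, and the non-x86 branch exists only so the sketch still compiles elsewhere.

```c
#include <errno.h>
#include <stdint.h>

#if defined(__linux__) && (defined(__i386__) || defined(__x86_64__))
#include <sys/io.h>   /* iopl(), inw() */

/*
 * Read one 16-bit register at I/O port base+reg from user space.
 * Returns 0 on success, or -1 with errno set if the kernel refused to
 * raise the I/O privilege level (iopl needs root / CAP_SYS_RAWIO).
 */
static int fpga_read16(unsigned short base, unsigned short reg, uint16_t *out)
{
    if (iopl(3) != 0)
        return -1;
    *out = inw((unsigned short)(base + reg));
    return 0;
}
#else
/* No port I/O instructions on this target; report that to the caller. */
static int fpga_read16(unsigned short base, unsigned short reg, uint16_t *out)
{
    (void)base;
    (void)reg;
    (void)out;
    errno = ENOSYS;
    return -1;
}
#endif
```

This gives the process direct access to every port, so it is a privileged debugging aid rather than a replacement for the driver's ioctl interface.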
Re: Iomega ZIP-100 drive unsupported with jmicron JMB361 chip?
(linux-ide cc'ed)

trash can wrote:

-BEGIN PGP SIGNED MESSAGE- Hash: SHA1

I have tolerated this problem for a year and do not post to this list in haste. I have posted on forums and searched the community over the past year. I have looked at the list archive on gossamer-threads.com for solutions. With Fedora Core 6 unsupported (the last kernel for which my zip drive worked), it is time for my last attempt at a solution. Please CC: any response as I have not joined the list. I have compiled a kernel-debug RPM and can run this if its output would help. Thank you for any time you might devote to this problem.

motherboard: MSI P965 Platinum/Intel P965 Express Chipset Based (MS-7238 series)
Fedora 8 : kernel 2.6.23.1-42.fc8
Iomega Zip drive internal Model Z100ATAPI

lspci
03:00.0 SATA controller: JMicron Technologies, Inc. JMB361 AHCI/IDE (rev 02)
03:00.1 IDE interface: JMicron Technologies, Inc. JMB361 AHCI/IDE (rev 02)

# lsmod | grep ata
pata_jmicron 8257 0
ata_generic 8901 0
ata_piix 16709 0
libata 99633 4 ahci,pata_jmicron,ata_generic,ata_piix
scsi_mod 119757 4 sr_mod,sg,libata,sd_mod

I have recently changed the BIOS setting for the SATA#1 Controller from [IDE] to [AHCI] with no effect. I assume AHCI is correct?

AHCI is better, yes. It shouldn't be relevant to this problem though.

Text below attached as text.txt for readability.

from dmesg:
libata version 2.21 loaded.
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: [EMAIL PROTECTED]
PCI: Enabling device :03:00.1 ( -> 0001)
ACPI: PCI Interrupt :03:00.1[B] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device :03:00.1 to 64
scsi0 : pata_jmicron
scsi1 : pata_jmicron
ata1: PATA max UDMA/100 cmd 0x0001cc00 ctl 0x0001c882 bmdma 0x0001c400 irq 17
ata2: PATA max UDMA/100 cmd 0x0001c800 ctl 0x0001c482 bmdma 0x0001c408 irq 17
ata1.00: ATAPI: LITE-ON DVDRW SOHW-1693S, KS0B, max UDMA/66
ata1.01: ATAPI: IOMEGA ZIP 100 ATAPI, 05.H, max MWDMA1, CDB intr
ata1.00: configured for UDMA/66
ata1.01: configured for MWDMA1
scsi 0:0:0:0: CD-ROM LITE-ON DVDRW SOHW-1693S KS0B PQ: 0 ANSI: 5
scsi 0:0:1:0: Direct-Access IOMEGA ZIP 100 05.H PQ: 0 ANSI: 5
sd 0:0:1:0: [sda] 196608 512-byte hardware sectors (101 MB)
sd 0:0:1:0: [sda] Write Protect is off
sd 0:0:1:0: [sda] Mode Sense: 00 40 00 00
sd 0:0:1:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:1:0: [sda] 196608 512-byte hardware sectors (101 MB)
sd 0:0:1:0: [sda] Write Protect is off
sd 0:0:1:0: [sda] Mode Sense: 00 40 00 00
sd 0:0:1:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda:<6>sd 0:0:1:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
sd 0:0:1:0: [sda] Sense Key : Hardware Error [current]
sd 0:0:1:0: [sda] Add. Sense: Scsi parity error
end_request: I/O error, dev sda, sector 0
Buffer I/O error on device sda, logical block 0

If a disk is inserted into the drive (/var/log/messages)
Dec 10 14:22:53 localhost kernel: sd 0:0:1:0: [sda] Spinning up disk.<5>sd 0:0:1:0: [sda] Spinning up diskready
Dec 10 14:22:53 localhost kernel: sd 0:0:1:0: [sda] 196608 512-byte hardware sectors (101 MB)
Dec 10 14:22:53 localhost kernel: sd 0:0:1:0: [sda] Write Protect is off
Dec 10 14:22:53 localhost kernel: sd 0:0:1:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 10 14:22:53 localhost kernel: sd 0:0:1:0: [sda] 196608 512-byte hardware sectors (101 MB)
Dec 10 14:22:53 localhost kernel: sd 0:0:1:0: [sda] Write Protect is off
Dec 10 14:22:53 localhost kernel: sd 0:0:1:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 10 14:22:53 localhost kernel: sda:<6>sd 0:0:1:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
Dec 10 14:22:53 localhost kernel: sd 0:0:1:0: [sda] Sense Key : Hardware Error [current]
Dec 10 14:22:53 localhost kernel: sd 0:0:1:0: [sda] Add. Sense: Scsi parity error
Dec 10 14:22:53 localhost kernel: end_request: I/O error, dev sda, sector 0
Dec 10 14:22:53 localhost kernel: printk: 42 messages suppressed.
Dec 10 14:22:53 localhost kernel: Buffer I/O error on device sda, logical block 0

That is rather curious. There's no sign of any libata error handling going on.. Maybe the drive is actually returning that error code in the ATAPI CDB, or at least we think it is? You are sure that this drive still works with older kernels using drivers/ide, and that the hardware didn't break at some point, I assume?

-- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/
Re: 2.6.24-rc4-git5: Reported regressions from 2.6.23
Tejun Heo wrote:

Robert Hancock wrote:

And you're quite right in your comment that we are often too quick to blacklist hardware instead of looking into why it really is failing. ACPI is one of those areas where we often just need to figure out how to be bug-to-bug compatible with what Windows is doing.

In the spirit of not blacklisting without looking deep into ACPI code, can somebody familiar with ASL take a look at comment 11 of bug 9320?

http://bugzilla.kernel.org/show_bug.cgi?id=9320#c11

This is libata calling _GTM to find out how the BIOS configured the device to determine cable type. Thanks.

I suspect it's somewhat similar (though perhaps a different cause): the code is trying to look up a value (presumably register contents) in a table using Match, gets a value that's not in the table (which makes Match return the Ones value, meaning "not found"), and so the lookup of the corresponding output value with that index fails. We'd need the full ASL dump to know exactly what's going on there.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
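The failure mode suspected above can be modelled in a few lines. This is only a rough Python sketch of the ASL semantics, not the firmware's code: the function names are made up, Ones is assumed to be a 32-bit all-ones integer, the register table is the one quoted elsewhere in this thread, and the output table is purely hypothetical.

```python
# Rough model of an ASL Match() followed by Index(); all names are illustrative.
ONES = 0xFFFFFFFF  # ASL "Ones" (32-bit integer width assumed)


def asl_match_eq(package, value):
    """Return the index of the first element equal to value, or ONES if none."""
    for i, element in enumerate(package):
        if element == value:
            return i
    return ONES


def asl_index(package, index):
    """Fetch package[index]; an out-of-range index is an error, as in AML."""
    if index >= len(package):
        raise IndexError("index 0x%X is beyond end of package" % index)
    return package[index]


REGISTER_VALUES = [0x20, 0x31, 0x65, 0xA8]  # register table quoted in this thread
OUTPUT_VALUES = [120, 180, 240, 600]        # hypothetical output table


def lookup(register_contents):
    """Translate a register value via the two tables, as the AML seems to do."""
    return asl_index(OUTPUT_VALUES, asl_match_eq(REGISTER_VALUES, register_contents))
```

With these assumptions, `lookup(0x31)` succeeds, while `lookup(0x99)` raises because Match reported "not found" and the resulting index is far past the end of the table.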
Re: 2.6.24-rc4-git5: Reported regressions from 2.6.23
Andreas Mohr wrote:

On Mon, Dec 10, 2007 at 01:04:31AM +0100, Andreas Mohr wrote:

IOW, it seems very likely that _GTM on these BIOSes (VIA chipsets) isn't actually wrongly implemented but simply expects IDE controller values to have been set up "differently". Or... one could possibly even infer from this that - maybe - the _GTM invocation spot is wrong, it should be done somewhere different during bootup. Or whatever.

"Whatever" indeed:

There's an ASL Match() for a "PMPT" (Primary Master PorT) PCI register, and the possible register values are:

Package (0x04) { 0x20, 0x31, 0x65, 0xA8 },

and from

OperationRegion (CFG2, PCI_Config, 0x40, 0x20)
Field (CFG2, DWordAcc, NoLock, Preserve)
{
    Offset (0x08),
    SSPT, 8,
    SMPT, 8,
    PSPT, 8,
    PMPT, 8,
    Offset (0x10),
    ...

we can infer that at PCI_Config offset 0x48 those values should be located.

However after bootup or resume there are:

# lspci -s 00:11.1 -xxx
00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00: 06 11 71 05 07 00 90 02 06 8a 01 01 00 20 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 01 e4 00 00 00 00 00 00 00 00 00 00 06 11 71 05
30: 00 00 00 00 c0 00 00 00 00 00 00 00 ff 01 00 00
40: 0b 32 09 0a 18 1c c0 00 99 99 20 20 ff 00 a8 20
50: 07 07 f6 f1 14 03 00 00 a8 a8 a8 a8 00 00 00 00
60: 00 02 00 00 00 00 00 00 00 02 00 00 00 00 00 00
70: 02 01 00 00 00 00 00 00 82 01 00 00 00 00 00 00
80: 00 e0 a1 1f 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 06 00 71 05 06 11 71 05 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 07 00 00 00 00 00 00 00 00 00

As one can see, the relevant values for SSPT, SMPT, PSPT and PMPT are 99 99 20 20, which are not quite entirely valid judging from the array above, and this is because the secondary port is unused, as can also be seen from my bootup log:

scsi0 : pata_via
scsi1 : pata_via
ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xe400 irq 14
ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xe408 irq 15
ata1.00: ATA-5: WDC WD1200JB-00CRA1, 17.07W17, max UDMA/100
ata1.00: 234441648 sectors, multi 16: LBA
ata1.01: ATAPI: TOSHIBA DVD-ROM SD-M1612, 1004, max UDMA/33
Switched to high resolution mode on CPU 0
ata1.00: configured for UDMA/100
ata1.01: configured for UDMA/33
ACPI Exception (exoparg2-0442): AE_AML_PACKAGE_LIMIT, Index (0) is beyond end of object [20070126]
ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PCI0.IDE0.GTM_] (Node df80b9a8), AE_AML_PACKAGE_LIMIT
ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PCI0.IDE0.CHN1._GTM] (Node df80b8d0), AE_AML_PACKAGE_LIMIT
ata2: ACPI get timing mode failed (AE 0x300d)

Manually tweaking the values to 20 20 20 20 truly does skip the _GTM failure message on suspend - only to reappear right on resume due to 99 99 20 20 combo happening again. If I don't tweak, I get _GTM failure at both suspend and resume.

As such one can conclude that this BIOS is rather very confused when being called for _GTM on an entirely unused controller port. And this is either because the BIOS is dumb or because ACPI doesn't really expect anyone to call _GTM on an unused physical port. I'd bet on the latter... (however I haven't found ACPI 3.0b explicitly mentioning this somewhere yet)

Andreas Mohr

Probably Windows doesn't call _GTM on a port with no devices connected, and so the BIOS people never tested that case. Likely we can just avoid doing this - if no devices are connected the timing settings for that channel are irrelevant.

And you're quite right in your comment that we are often too quick to blacklist hardware instead of looking into why it really is failing. ACPI is one of those areas where we often just need to figure out how to be bug-to-bug compatible with what Windows is doing.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
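The register check Andreas does by eye can be reproduced mechanically. The following is a small Python sketch, not anything from libata: it takes the 0x40 row of the lspci dump quoted in this message, picks out the four bytes at offsets 0x48 to 0x4b in the field order given by the ASL (SSPT, SMPT, PSPT, PMPT), and tests each against the Match() package. The helper name `decode_timing_registers` is made up for illustration.

```python
# Decode the four timing registers from the quoted lspci row and compare
# them with the values the BIOS's Match() package accepts.
VALID_VALUES = (0x20, 0x31, 0x65, 0xA8)            # Package (0x04) from the ASL
ROW_40 = "0b 32 09 0a 18 1c c0 00 99 99 20 20 ff 00 a8 20"  # lspci row "40:"
FIELD_NAMES = ("SSPT", "SMPT", "PSPT", "PMPT")     # field order at offset 0x48


def decode_timing_registers(row_text, row_base=0x40, first_offset=0x48):
    """Map each field name to its byte from one 16-byte lspci -xxx row."""
    data = bytes.fromhex(row_text)                 # fromhex skips the spaces
    start = first_offset - row_base
    return dict(zip(FIELD_NAMES, data[start:start + len(FIELD_NAMES)]))


registers = decode_timing_registers(ROW_40)
unmatched = [name for name, value in registers.items() if value not in VALID_VALUES]

for name, value in registers.items():
    verdict = "in table" if value in VALID_VALUES else "NOT in table"
    print("%s = 0x%02x (%s)" % (name, value, verdict))
```

Only the two secondary-channel fields fall outside the table, which fits the observation that the failure is tied to the unused secondary port.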
Re: Bug: get EXT3-fs error Allocating block in system zone
Marco Gatti wrote:

Linus Torvalds wrote:

Was there a dmesg out there somewhere? With 4G of RAM, you probably have some of it above the 4GB mark (because of RAM remapping etc, and the PCI decode hole in the low 4GB). It does sound like this is a DMA problem, and your controller cannot correctly DMA to the upper 4GB. So what controller/driver, what's the dmesg, and let's see if we can fix it by adding a DMA mask to it to limit it to the low 32 bits.

Controller / drivers: it's a board with intel Q35 chipset. The southbridge has an ICH9
Intel Gigabit 82566DM-2 => e1000
Intel matrix storage SATA => ahci.c
Intel graphics media accelerator => not added to kernel
Intel Audio => Intel HD Audio AC97

I just got "EXT3-fs error Allocating block in system zone" in dmesg with 4 or more GBs of RAM. I listed boot up dmesg to get an idea of dma config with different amount of RAM. Thanks for your help.

The obvious suspect with a filesystem problem would be the disk controller driver, AHCI here. However, the controller appears to set the flag to indicate that it supports 64-bit DMA, so it should be fine, unless it lies of course (which we know that ATI SB600 chipset does, but I don't believe Intel is known to). Could still be a DMA mapping bug that only shows up when IOMMU is used. However, AHCI is a pretty well tested driver.

dmesg with 2GB:
..
ahci :00:1f.2: version 2.3
ACPI: PCI Interrupt :00:1f.2[B] -> GSI 19 (level, low) -> IRQ 19
ahci :00:1f.2: nr_ports (6) and implemented port map (0xf) don't match
ahci :00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0xf impl SATA mode
ahci :00:1f.2: flags: 64bit ncq sntf led clo pmp pio slum part
PCI: Setting latency timer of device :00:1f.2 to 64
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
ata1: SATA max UDMA/133 cmd 0xc2334100 ctl 0x bmdma 0x irq 316
ata2: SATA max UDMA/133 cmd 0xc2334180 ctl 0x bmdma 0x irq 316
ata3: SATA max UDMA/133 cmd 0xc2334200 ctl 0x bmdma 0x irq 316
ata4: SATA max UDMA/133 cmd 0xc2334280 ctl 0x bmdma 0x00000000 irq 316

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
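To make the "DMA mask" idea in this exchange concrete, here is a minimal Python sketch of the arithmetic only; it is not kernel code. The helper names are invented, and the example addresses assume a layout in which part of the 4 GB of RAM has been remapped above the 4 GB boundary, as Linus describes.

```python
# A DMA mask with n bits set describes the highest bus address a device can
# reach. A buffer is usable only if its last byte is still within the mask.
def dma_bit_mask(bits):
    """All-ones mask covering `bits` address bits (hypothetical helper)."""
    return (1 << bits) - 1


def buffer_reachable(address, length, mask):
    """True if every byte of [address, address + length) fits under mask."""
    return address + length - 1 <= mask


MASK_32 = dma_bit_mask(32)   # device limited to the low 4 GB
MASK_64 = dma_bit_mask(64)   # device claims full 64-bit DMA

low_buffer = 0xBFFFF000      # assumed example: just below a 3 GB PCI hole
high_buffer = 0x120000000    # assumed example: RAM remapped above 4 GB
```

Under these assumptions `buffer_reachable(high_buffer, 4096, MASK_32)` is false, so a controller that only handles 32-bit addresses correctly would need such buffers bounced or remapped, whereas with `MASK_64` both buffers are acceptable.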
Re: [PATCH] sata_nv: fix ADMA ATAPI issues with memory over 4GB (v3)
Jeff Garzik wrote:

Robert Hancock wrote:

This fixes some problems with ATAPI devices on nForce4 controllers in ADMA mode on systems with memory located above 4GB. We need to delay setting the 64-bit DMA mask until the PRD table and padding buffer are allocated so that they don't get allocated above 4GB and break legacy mode (which is needed for ATAPI devices).

Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

This is a bit nasty :/ I would consider setting the consistent DMA mask to 32-bit, and setting the overall mask to 64-bit. Seems like that would solve the problem?

Also, does this need to be rebased on top of what I just pushed upstream?

Jeff

Jeff, ping on this one? This (or one like it) really should make it into 2.6.24.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: 2.6.24-rc4-git5: Reported regressions from 2.6.23
Matthew Garrett wrote:

On Sat, Dec 08, 2007 at 02:20:01AM -0800, Andrew Morton wrote:

On Sat, 8 Dec 2007 11:12:57 +0100 Andreas Mohr <[EMAIL PROTECTED]> wrote:

ACPI Exception (exoparg2-0442): AE_AML_PACKAGE_LIMIT, Index (0) is beyond end of object [20070126]
ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PCI0.IDE0.GTF_] (Node c180b990), AE_AML_PACKAGE_LIMIT
ACPI Error (psparse-0537): Method parse/execution failed [\_SB_.PCI0.IDE0.CHN0.DRV1._GTF] (Node c180b888), AE_AML_PACKAGE_LIMIT
ata1.01: _GTF evaluation failed (AE 0x300d)

037f6bb79f753c014bc84bca0de9bf98bb5ab169 ought to have fixed this?

I should think it should have.

I think we're too aggressive about disabling the libata ACPI support, even. One of my laptop's _GTF commands on resume is a DEVICE CONFIGURATION FREEZE LOCK command, which gets rejected by the drive (maybe it worked on the original Hitachi disk, but I've upgraded it to a newer Samsung). I'd say if the drive returns command aborted on one of these, we should just ignore that command and continue to the next one without trying to retry or disabling the ACPI support entirely.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
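The policy suggested above (skip a _GTF command the drive aborts, keep going with the rest) can be sketched as a toy model. This is Python for illustration only and has nothing to do with the actual libata code paths; the function names, the fake drive, and the particular command list are all hypothetical.

```python
# Toy model: run a list of _GTF-style commands and skip any the drive aborts,
# instead of retrying or abandoning the remaining commands.
def run_gtf_commands(commands, execute):
    """Return (accepted, skipped) after trying every command exactly once."""
    accepted, skipped = [], []
    for command in commands:
        if execute(command):          # True means the drive accepted it
            accepted.append(command)
        else:                         # "command aborted": note it and move on
            skipped.append(command)
    return accepted, skipped


def fake_drive(command):
    """Hypothetical drive that rejects only DEVICE CONFIGURATION FREEZE LOCK."""
    return command != "DEVICE CONFIGURATION FREEZE LOCK"


# Hypothetical command list; only the FREEZE LOCK entry comes from the message.
GTF_COMMANDS = [
    "SET FEATURES",
    "DEVICE CONFIGURATION FREEZE LOCK",
    "SECURITY FREEZE LOCK",
]

accepted, skipped = run_gtf_commands(GTF_COMMANDS, fake_drive)
```

The point of the design is that one rejected command does not prevent the later ones from being issued, and nothing is disabled globally.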
Re: 2.6.24-rc4-mm1 and Very Slow PCMCIA Compact Flash
Zan Lynx wrote:

On Fri, 2007-12-07 at 15:22 -0800, Andrew Morton wrote:

On Fri, 07 Dec 2007 23:09:43 + Zan Lynx <[EMAIL PROTECTED]> wrote:

On Fri, 2007-12-07 at 15:02 -0800, Andrew Morton wrote:

On Fri, 07 Dec 2007 20:38:24 + Zan Lynx <[EMAIL PROTECTED]> wrote:

While I'm reporting problems I'll get this one out there. I normally use a USB-2 memory card reader but I also have a PCMCIA CompactFlash adapter that I use occasionally. During the MM series kernels 2.6.22 and 23 (I am pretty sure) this didn't work at all. I don't know about vanilla since I don't run that. Now with MM kernels 2.6.24 rc1-4 the PCMCIA adapter works again, but I only get read rates of 1.6 MB/s. When it used to work in 2.6.20 I got at least 16 MB/s. The card itself is capable of 30+ in the USB-2 reader.

[cut]

Oh, OK. Hopefully the ata guys can help out with this.

I don't know if it is actually strictly a regression? Did libata ever support that device in any earlier kernels? That could be why it didn't work for a few kernel versions.

I reconfigured for a libata-only system a while back. And, since I usually use the USB-2 flash reader I didn't care much about the PCMCIA. I will try reverting that patch later tonight, in a few hours.

It looks like pata_pcmcia is always PIO mode 0:

/**
 * pcmcia_init_one - attach a PCMCIA interface
 * @pdev: pcmcia device
 *
 * Register a PCMCIA IDE interface. Such interfaces are PIO 0 and
 * shared IRQ.
 */

I assume that with old IDE this would use ide_cs.c, but I'm drawing a blank on what modes that supports.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
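A quick back-of-the-envelope check shows why PIO mode 0 is a plausible explanation for the slow reads. The sketch below is plain Python arithmetic; it assumes the standard ATA minimum cycle times (600 ns for PIO 0, 120 ns for PIO 4) and one 16-bit word per cycle, and it ignores command overhead, so real throughput will be lower than these ceilings.

```python
# Theoretical peak transfer rate for ATA PIO modes: one 16-bit word per cycle.
PIO_CYCLE_NS = {0: 600, 4: 120}   # minimum cycle times from the ATA spec
BYTES_PER_CYCLE = 2               # 16-bit data register


def pio_peak_mb_per_s(mode):
    """Upper bound on throughput in MB/s (10**6 bytes) for a PIO mode."""
    cycle_seconds = PIO_CYCLE_NS[mode] * 1e-9
    return BYTES_PER_CYCLE / cycle_seconds / 1e6


pio0_peak = pio_peak_mb_per_s(0)  # roughly 3.3 MB/s
pio4_peak = pio_peak_mb_per_s(4)  # roughly 16.7 MB/s
```

The reported 1.6 MB/s sits under the PIO 0 ceiling, and the "at least 16 MB/s" seen on 2.6.20 is close to the PIO 4 ceiling, which would be consistent with the older driver using a faster mode. That last inference is a guess from the numbers, not something confirmed in the thread.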
Re: x86_64 dynticks not working prev: cpuidle, dynticks compatible or no?
Ed Sweetman wrote:

System is idle now, previously it was doing something i couldn't halt at the time. I'm looking at "Local timer interrupts" in the "Loc:" section of /proc/interrupts. Across 1 second while the system is pretty much idle, i still get 300 interrupts. My HZ variable is set to 300 in the kernel config, so this is expected but I was under the assumption that dynticks/tickless being compiled in would cause that to be much lower. Am I reading the wrong section of /proc/interrupts to verify if dynticks is working or not? Again, i see no difference in cpu temp at all.

Try running powertop ( http://www.lesswatts.org/projects/powertop/ ) and see what it reports.

I don't think dynticks will generally save huge amounts of power on a typical desktop machine. The big gains come from being able to stay in deep sleep C-states (C2/C3) for longer periods of time, but most desktop machines only enable sleep states down to C1.

In case it helps, this is an athlon64 x2 with apic functioning and both cores active in 64bit mode. dmesg is below.

not related : Some additional notes: it87 is my lm_sensor, it doesn't work in this kernel, yet it did in 2.6.22. Perhaps enabling high precision timers changed something in acpi land. I enabled tcp dma offloading in this kernel, i get debugging output related to it, error is at the last line. No corruption or otherwise bad behavior. Transferring via cifs at 9.7MB/sec "incoming" took about 15% of one cpu... I never bothered to check if that is the norm but i suspect i'll be removing that feature as it seems to not play nice with the kernel.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
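The measurement Ed describes (how many local timer interrupts arrive per second) can be scripted rather than eyeballed. The following Python sketch assumes the row is labelled "LOC:" or "Loc:" and lists one counter per CPU before the description text; the function names are made up, and powertop remains the more informative tool.

```python
import time


def local_timer_counts(interrupts_text):
    """Return the per-CPU counters from the LOC row of /proc/interrupts text."""
    for line in interrupts_text.splitlines():
        label, colon, rest = line.partition(":")
        if colon and label.strip().upper() == "LOC":
            counts = []
            for token in rest.split():
                if not token.isdigit():   # description text starts here
                    break
                counts.append(int(token))
            return counts
    raise ValueError("no LOC row found")


def rates(before, after, seconds):
    """Interrupts per second for each CPU between two samples."""
    return [(b - a) / seconds for a, b in zip(before, after)]


def sample_rates(path="/proc/interrupts", interval=1.0):
    """Read the file twice, `interval` seconds apart, and return the rates."""
    with open(path) as f:
        before = local_timer_counts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = local_timer_counts(f.read())
    return rates(before, after, interval)
```

Calling `sample_rates()` on an idle Linux machine gives one number per CPU. A value close to HZ (300 in Ed's configuration) on every CPU would suggest the periodic tick is still running; values well below HZ would suggest tickless idle is taking effect.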
Re: Possible EXT2 race
linux-os (Dick Johnson) wrote:
> On Fri, 7 Dec 2007, Dave Jones wrote:
>> On Fri, Dec 07, 2007 at 08:15:42AM -0500, linux-os (Dick Johnson) wrote:
>>> Dec 7 04:05:55 chaos kernel: sd 0:0:1:0: [sdb] Add. Sense: Peripheral device write fault
>>
>> This sounds more like a hardware problem.
>>
>> Dave
>
> There was an attempt to write beyond the end of the device because everything in the file-system was getting trashed. I can read/write the 5 year-old SCSI physical drive with no errors from both within linux and through the Adaptec BIOS. This problem only occurs when I attempt to truncate a file that is being written by another task.

That SCSI error code doesn't sound like a reasonable one for the drive getting a bad block address. The more typical one in that case would be "Logical block address out of range", or maybe the catch-all "Invalid field in CDB". "Peripheral device write fault", especially as a deferred error (i.e. after the drive already returned a normal completion for the data, and then is reporting the failure to actually write to the media on the next command), really sounds like a drive problem.

> And the kernel is supposed to trap those at the disk layer, like these are saying it is, _after_ that error occurs:
>
> Dec 7 04:08:13 chaos kernel: attempt to access beyond end of device
> Dec 7 04:08:13 chaos kernel: sdb1: rw=0, want=29687515944, limit=33736437

-- 
Robert Hancock
Saskatoon, SK, Canada
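To put the quoted "beyond end of device" line in perspective, a small sketch can compare the requested position with the partition limit. The helper names are hypothetical, and it assumes `want` and `limit` are expressed in the same unit.

```python
import re


def parse_beyond_end(line):
    """Extract (want, limit) from an 'attempt to access beyond end of device' detail line."""
    match = re.search(r"want=(\d+), limit=(\d+)", line)
    if match is None:
        raise ValueError("line does not contain want=/limit= fields")
    return int(match.group(1)), int(match.group(2))


def overshoot_factor(want, limit):
    """How many times larger the requested position is than the device limit."""
    return want / limit


want, limit = parse_beyond_end("sdb1: rw=0, want=29687515944, limit=33736437")
print(f"request lies at about {overshoot_factor(want, limit):.0f}x the partition size")
```

A request that far past the end looks like a garbage block number rather than an off-by-one, which fits the report that filesystem metadata was already trashed. It does not by itself say whether the corruption or the drive's write fault came first.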
Re: 2.6.24-rc4-mm1 and Very Slow PCMCIA Compact Flash
Zan Lynx wrote:
> On Fri, 2007-12-07 at 15:22 -0800, Andrew Morton wrote:
>> On Fri, 07 Dec 2007 23:09:43 + Zan Lynx wrote:
>>> On Fri, 2007-12-07 at 15:02 -0800, Andrew Morton wrote:
>>>> On Fri, 07 Dec 2007 20:38:24 + Zan Lynx wrote:
>>>>> While I'm reporting problems I'll get this one out there. I normally use a USB-2 memory card reader but I also have a PCMCIA CompactFlash adapter that I use occasionally. During the MM series kernels 2.6.22 and 23 (I am pretty sure) this didn't work at all. I don't know about vanilla since I don't run that. Now with MM kernels 2.6.24 rc1-4 the PCMCIA adapter works again, but I only get read rates of 1.6 MB/s. When it used to work in 2.6.20 I got at least 16 MB/s. The card itself is capable of 30+ in the USB-2 reader.
>>>
>>> [cut]
>>
>> Oh, OK. Hopefully the ata guys can help out with this. I don't know if it actually strictly a regression? Did libata ever support that device in any earlier kernels?
>
> That could be why it didn't work for a few kernel versions. I reconfigured for a libata-only system a while back. And, since I usually use the USB-2 flash reader I didn't care much about the PCMCIA. I will try reverting that patch later tonight, in a few hours.

It looks like pata_pcmcia is always PIO mode 0:

/**
 * pcmcia_init_one - attach a PCMCIA interface
 * @pdev: pcmcia device
 *
 * Register a PCMCIA IDE interface. Such interfaces are PIO 0 and
 * shared IRQ.
 */

I assume that with old IDE this would use ide_cs.c, but I'm drawing a blank on what modes that supports..

-- 
Robert Hancock
Saskatoon, SK, Canada
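The PIO mode 0 observation can be checked against the reported numbers with a little arithmetic. This is a rough sketch: the cycle times are the minimum cycle times defined for ATA PIO modes 0 to 4, a 16-bit data port is assumed, and `pio_peak_mb_per_s` is a name made up here.

```python
# Minimum cycle time per 16-bit transfer, in nanoseconds, for ATA PIO modes 0-4.
PIO_CYCLE_NS = {0: 600, 1: 383, 2: 240, 3: 180, 4: 120}
BYTES_PER_CYCLE = 2  # assumption: 16-bit data port


def pio_peak_mb_per_s(mode):
    """Theoretical peak transfer rate of a PIO mode in MB/s (1 MB = 10**6 bytes)."""
    cycle_seconds = PIO_CYCLE_NS[mode] * 1e-9
    return BYTES_PER_CYCLE / cycle_seconds / 1e6


for mode in sorted(PIO_CYCLE_NS):
    print(f"PIO mode {mode}: {pio_peak_mb_per_s(mode):.1f} MB/s peak")
```

The 1.6 MB/s Zan reports sits under the PIO 0 ceiling of about 3.3 MB/s, while the 16 MB/s seen on 2.6.20 is close to the PIO 4 ceiling. That would be consistent with the old driver having run the card in a faster mode, though the thread does not confirm which mode ide_cs actually used.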
Re: RFC: outb 0x80 in inb_p, outb_p harmful on some modern AMD64 with MCP51 laptops
David P. Reed wrote:
> After much, much testing (months, off and on, pursuing hypotheses), I've discovered that the use of "outb al,0x80" instructions to "delay" after inb and outb instructions causes solid freezes on my HP dv9000z laptop, when ACPI is enabled. It takes a fair number of out's to 0x80, but the hard freeze is reliably reproducible by writing a driver that solely does a loop of 50 outb's to 0x80 and calling it in a loop 1000 times from user space. !!!
>
> The serious impact is that the /dev/rtc and /dev/nvram devices are very unreliable - thus "hwclock" freezes very reliably while looping waiting for a new second value and calling "cat /dev/nvram" in a loop freezes the machine if done a few times in a row.
>
> This is reproducible, but requires a fair number of outb's to the 0x80 diagnostic port, and seems to require ACPI to be on.
>
> io_64.h is the source of these particular instructions, via the CMOS_READ and CMOS_WRITE macros, which are defined in mc146818_64.h. (I wonder if the same problem occurs in 32-bit mode).
>
> I'm happy to complete and test a patch, but I'm curious what the right approach ought to be. I have to say I have no clue as to what ACPI is doing on this chipset (nvidia MCP51) that would make port 80 do this. A raw random guess is that something is logging POST codes, but if so, not clear what is problematic in ACPI mode.
>
> ANy help/suggestions? Changing the delay instruction sequence from the outb to short jumps might be the safe thing. But Linus, et al. may have experience with that on other architectures like older Pentiums etc.

The fact that these "pausing" calls are needed in the first place seems rather cheesy. If there's hardware that's unable to respond to IO port writes as fast as possible, then surely there's a better solution (udelay calls, etc.) than trying to stall the IOs by an arbitrary and hardware-dependent amount of time. Does any remotely recent hardware even need this?

-- 
Robert Hancock
Saskatoon, SK, Canada
Re: [Timers SMP] can this machine be helped?
Guennadi Liakhovetski wrote:
> On Tue, 4 Dec 2007, Robert Hancock wrote:
>> Guennadi Liakhovetski wrote:
>>> I've got an old 2xP-II @ 400MHz Compaq AP400 system, which I'm still using. It has many peculiarities, so, I wouldn't be surprised if the answer to my questions would be "sorry, the patient is rather dead than alive".
>>
>> How about disabling ACPI entirely, acpi=off on kernel command line? I wouldn't be surprised to see a lot of ACPI stuff broken on an older machine like that..
>
> See above - it's an SMP.
>
> Thanks
> Guennadi
> ---
> Guennadi Liakhovetski

On a machine that old, you shouldn't need ACPI to detect both CPUs, it should be able to use MPS..

-- 
Robert Hancock
Saskatoon, SK, Canada
Re: [Timers SMP] can this machine be helped?
Guennadi Liakhovetski wrote:
> Hi,
>
> I've got an old 2xP-II @ 400MHz Compaq AP400 system, which I'm still using. It has many peculiarities, so, I wouldn't be surprised if the answer to my questions would be "sorry, the patient is rather dead than alive".
>
> Some of the problems lie in ACPI area, I tried some time ago to fix the ACPI tables for these machine, but never got enough time for that. So I'm still booting with
>
> acpi=noirq
>
> Another problem is its battery is dead and it's hard soldered to the mainboard (Compaq)... It might also have some problems with one of its 3 SCSI busses.
>
> I compiled a .24-ish kernel for it with CONFIG_NO_HZ and CONFIG_HIGH_RES_TIMERS. To get the system boot at least sometimes I have to specify nohz=off. Then I get
>
> * Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
> * this clock source is slow. Consider trying other clock sources
>
> Without this parameter it hangs usually between
>
> Time: acpi_pm clocksource has been installed.
>
> and
>
> Switched to high resolution mode on CPU 1
> Switched to high resolution mode on CPU 0
>
> Tried booting with clocksource=tsc then I've got
>
> Marking TSC unstable due to: possible TSC halt in C2.
>
> And then a few of these:
>
> BUG: soft lockup - CPU#0 stuck for 13s! [swapper:0]
> Pid: 0, comm: swapper Not tainted (2.6.24-rc2-g8c086340 #3)
> EIP: 0060:[<c0233d33>] EFLAGS: 0283 CPU: 0
> EIP is at acpi_processor_idle+0x2ae/0x477
> EAX: EBX: feab ECX: 0001 EDX: 0001
> ESI: c7c5f2d0 EDI: 00122d9f EBP: c03ddfa8 ESP: c03ddf90
> DS: 007b ES: 007b FS: 00d8 GS: SS: 0068
> CR0: 8005003b CR2: 081dcf88 CR3: 07e46000 CR4: 02d0
> DR0: DR1: DR2: DR3:
> DR6: 0ff0 DR7: 0400
> [<c01053fa>] show_trace_log_lvl+0x1a/0x30
> [<c0105f42>] show_trace+0x12/0x20
> [<c01024fc>] show_regs+0x1c/0x20
> [<c014fabb>] softlockup_tick+0x11b/0x150
> [<c01311f2>] run_local_timers+0x12/0x20
> [<c013168f>] update_process_times+0x2f/0x60
> [<c014597a>] tick_sched_timer+0x6a/0xe0
> [<c013fba0>] hrtimer_interrupt+0x120/0x1a0
> [<c0119ff5>] smp_apic_timer_interrupt+0x55/0x90
> [<c0104e70>] apic_timer_interrupt+0x28/0x30
> [<c0102624>] cpu_idle+0x84/0xf0
> [<c0316a7d>] rest_init+0x5d/0x60
> [<c03e1a7f>] start_kernel+0x2af/0x2f0
> [<>] run_init_process+0x3feff000/0x20
> ===
>
> so, is there any way I can still reasonably use this system? Which configuration / command-line parameters should I try? If needed can provide complete dmesg (with nohz=off or with clocksource=tsc) and .config.

How about disabling ACPI entirely, acpi=off on kernel command line? I wouldn't be surprised to see a lot of ACPI stuff broken on an older machine like that..

-- 
Robert Hancock
Saskatoon, SK, Canada
Re: Kernel 2.6.23.9 / P35 Chipset + WD 750GB Drives (reset port)
Justin Piszcz wrote:
> The badblocks did not do anything; however, when I built a software raid 5 and the performed a dd:
>
> /usr/bin/time dd if=/dev/zero of=fill_disk bs=1M
>
> I saw this somewhere along the way:
>
> [30189.967531] RAID5 conf printout:
> [30189.967576] --- rd:3 wd:3
> [30189.967617] disk 0, o:1, dev:sdc1
> [30189.967660] disk 1, o:1, dev:sdd1
> [30189.967716] disk 2, o:1, dev:sde1
> [42332.936615] ata5.00: exception Emask 0x2 SAct 0x7000 SErr 0x0 action 0x2 frozen
> [42332.936706] ata5.00: spurious completions during NCQ issue=0x0 SAct=0x7000 FIS=004040a1:0800
> [42332.936804] ata5.00: cmd 61/08:60:6f:4d:2a/00:00:27:00:00/40 tag 12 cdb 0x0 data 4096 out
> [42332.936805] res 40/00:74:0f:49:2a/00:00:27:00:00/40 Emask 0x2 (HSM violation)
> [42332.936977] ata5.00: cmd 61/08:68:77:4d:2a/00:00:27:00:00/40 tag 13 cdb 0x0 data 4096 out
> [42332.936981] res 40/00:74:0f:49:2a/00:00:27:00:00/40 Emask 0x2 (HSM violation)
> [42332.937162] ata5.00: cmd 61/00:70:0f:49:2a/04:00:27:00:00/40 tag 14 cdb 0x0 data 524288 out
> [42332.937163] res 40/00:74:0f:49:2a/00:00:27:00:00/40 Emask 0x2 (HSM violation)
> [42333.240054] ata5: soft resetting port
> [42333.494462] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [42333.506592] ata5.00: configured for UDMA/133
> [42333.506652] ata5: EH complete
> [42333.506741] sd 4:0:0:0: [sde] 1465149168 512-byte hardware sectors (750156 MB)
> [42333.506834] sd 4:0:0:0: [sde] Write Protect is off
> [42333.506887] sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
> [42333.506905] sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
> Next test, I will turn off NCQ and try to make the problem re-occur. If anyone else has any thoughts here..? I ran long smart tests on all 3 disks, they all ran successfully.
>
> Perhaps these drives need to be NCQ BLACKLISTED with the P35 chipset?

The problem won't recur with NCQ off, because spurious completions are impossible in that case. It was originally thought that these AHCI spurious NCQ completions were busted NCQ implementations on the drives, but I think the current theory is that it's some other timing problem or some such, given the number of drives across all makers which are reported to do this. I believe Tejun is investigating?

-- 
Robert Hancock
Saskatoon, SK, Canada
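For reading the quoted error, it helps to know that SAct is a bitmask with one bit per outstanding NCQ tag. A minimal sketch of the decoding follows; `active_tags` is a name made up for illustration.

```python
def active_tags(sact):
    """Return the NCQ tag numbers whose bits are set in a 32-bit SAct value."""
    return [tag for tag in range(32) if sact & (1 << tag)]


print(active_tags(0x7000))
```

For `SAct 0x7000` this yields tags 12, 13 and 14, which are exactly the three commands listed in the log (`tag 12`, `tag 13`, `tag 14`).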
Re: solid state drive access and context switching
Chris Friesen wrote:
> Over on comp.os.linux.development.system someone asked an interesting question, and I thought I'd mention it here.
>
> Given a fast low-latency solid state drive, would it ever be beneficial to simply wait in the kernel for synchronous read/write calls to complete? The idea is that you could avoid at least two task context switches, and if the data access can be completed at less cost than those context switches it could be an overall win.
>
> Has anyone played with this concept?

I don't think most SSDs are fast enough that it would really be worth avoiding the context switch for.. I could be wrong though.

-- 
Robert Hancock
Saskatoon, SK, Canada
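The break-even condition Chris describes can be written down directly. This is only a toy model: the function name and the numbers used with it are hypothetical, and it ignores the useful work another task could do on the CPU while the caller sleeps.

```python
def busy_wait_wins(device_latency_us, context_switch_us, switches_avoided=2):
    """True if spinning until the I/O completes costs less than the avoided context switches."""
    return device_latency_us < switches_avoided * context_switch_us


# Hypothetical figures, for illustration only.
print(busy_wait_wins(device_latency_us=5, context_switch_us=3))
print(busy_wait_wins(device_latency_us=100, context_switch_us=3))
```

With these made-up figures, waiting only pays off when the device answers within a few microseconds, which is the regime the reply above doubts most SSDs reach.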
Re: [PATCH] sata_nv: fix ADMA ATAPI issues with memory over 4GB (v3)
Jeff Garzik wrote:
> Robert Hancock wrote:
>> This fixes some problems with ATAPI devices on nForce4 controllers in ADMA mode on systems with memory located above 4GB. We need to delay setting the 64-bit DMA mask until the PRD table and padding buffer are allocated so that they don't get allocated above 4GB and break legacy mode (which is needed for ATAPI devices).
>>
>> Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>
>
> This is a bit nasty :/
>
> I would consider setting the consistent DMA mask to 32-bit, and setting the overall mask to 64-bit. Seems like that would solve the problem?

The issue with that is that it would also constrain the ADMA CPB/PRD table allocation to 32-bit, which I'd rather avoid having to do. There are dual-socket Opteron boxes like HP xw9300 that use this controller, and limiting the allocation to 32-bit could force a non-optimal node allocation for the table memory.

These types of devices really want a version of dma_alloc_coherent that allows overriding the DMA mask for specific allocations to make this cleaner. I'm sure this isn't the only device that has different DMA mask requirements for different consistent memory allocations..

This patch does have the advantage of being confirmed to fix the reporter's problem (https://bugzilla.redhat.com/show_bug.cgi?id=351451), which there's something to be said for this late in the .24-rc series..

> Also, does this need to be rebased on top of what I just pushed upstream?

I don't think so.. this change is independent from the "sata_nv: don't use legacy DMA in ADMA mode (v3)" patch you just merged.

-- 
Robert Hancock
Saskatoon, SK, Canada
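The constraint under discussion comes down to whether a buffer's bus address fits under a given DMA mask. The sketch below models only that arithmetic in Python; it is not kernel code, and the helper names and example addresses are hypothetical.

```python
def dma_bit_mask(bits):
    """Mask with the low 'bits' bits set, e.g. 32 -> 0xFFFFFFFF."""
    return (1 << bits) - 1


def buffer_fits_mask(bus_addr, size, mask):
    """True if every byte of the buffer is addressable under the mask."""
    return bus_addr + size - 1 <= mask


# Hypothetical table addresses, for illustration only.
below_4g = 0xFFFFF000
above_4g = 0x1_0000_0000

print(buffer_fits_mask(below_4g, 0x1000, dma_bit_mask(32)))
print(buffer_fits_mask(above_4g, 0x1000, dma_bit_mask(32)))
print(buffer_fits_mask(above_4g, 0x1000, dma_bit_mask(64)))
```

A table allocated while the 64-bit mask is in effect may land above 4GB and then fail the 32-bit check, which is the situation the patch avoids by allocating the legacy-mode PRD table and padding buffer before raising the mask.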
Re: Kernel 2.6.23.9 / P35 Chipset + WD 750GB Drives (reset port)
Justin Piszcz wrote:
> I am putting a new machine together and I have dual raptor raid 1 for the root, which works just fine under all stress tests. Then I have the WD 750 GiB drive (not RE2, desktop ones for ~150-160 on sale nowadays).
>
> I ran the following:
>
> dd if=/dev/zero of=/dev/sdc
> dd if=/dev/zero of=/dev/sdd
> dd if=/dev/zero of=/dev/sde
>
> (as it is always a very good idea to do this with any new disk)
>
> And sometime along the way(?) (I had gone to sleep and let it run), this occurred:
>
> [42880.680144] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x401 action 0x2 frozen
> [42880.680231] ata3.00: irq_stat 0x00400040, connection status changed
> [42880.680290] ata3.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 512 in
> [42880.680292] res 40/00:ac:d8:64:54/00:00:57:00:00/40 Emask 0x10 (ATA bus error)
> [42881.841899] ata3: soft resetting port
> [42885.966320] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [42915.919042] ata3.00: qc timeout (cmd 0xec)
> [42915.919094] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x5)
> [42915.919149] ata3.00: revalidation failed (errno=-5)
> [42915.919206] ata3: failed to recover some devices, retrying in 5 secs
> [42920.912458] ata3: hard resetting port
> [42926.411363] ata3: port is slow to respond, please be patient (Status 0x80)
> [42930.943080] ata3: COMRESET failed (errno=-16)
> [42930.943130] ata3: hard resetting port
> [42931.399628] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [42931.413523] ata3.00: configured for UDMA/133
> [42931.413586] ata3: EH pending after completion, repeating EH (cnt=4)
> [42931.413655] ata3: EH complete
> [42931.413719] sd 2:0:0:0: [sdc] 1465149168 512-byte hardware sectors (750156 MB)
> [42931.413809] sd 2:0:0:0: [sdc] Write Protect is off
> [42931.413856] sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> [42931.413867] sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
> Usually when I see this sort of thing with another box I have full of raptors, it was due to a bad raptor and I never saw it again after I replaced the disk that it happened on, but that was using the Intel P965 chipset. For this board, it is a Gigabyte GSP-P35-DS4 (Rev 2.0) and I have all of the drives (2 raptors, 3 750s) connected to the Intel ICH9 Southbridge.
>
> I am going to do some further testing but does this indicate a bad drive? Bad cable? Bad connector?

Could be any of the above.

> As you can see above, /dev/sdc stopped responding for a little bit and then the kernel reset the port.

It looks like the first thing that happened is that the controller reported it lost the SATA link, and then the drive didn't respond until it was bashed with a few hard resets.

> Why is this though? What is the likely root cause? Should I replace the drive? Obviously this is not normal and cannot be good at all. The idea is to put these drives in a RAID5, and if one is going to time out, that is going to cause the array to go degraded and thus be worthless in a RAID5 configuration.
>
> Can anyone offer any insight here? Thank you, Justin.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
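The SStatus and SErr values in the log can be decoded by hand, which helps when comparing this trace against later ones. The following is a minimal sketch in C, assuming the kernel prints both registers in hexadecimal, that SStatus uses the DET/SPD/IPM field layout from the Serial ATA specification, and that the SErr bit names match those in libata's include/linux/libata.h. The function names (sstatus_det, spd_name, print_link_state, and so on) are hypothetical and exist only for this illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* SStatus fields (assumed layout: DET bits 0-3, SPD bits 4-7, IPM bits 8-11). */
unsigned sstatus_det(uint32_t s) { return s & 0xf; }        /* device detection */
unsigned sstatus_spd(uint32_t s) { return (s >> 4) & 0xf; } /* negotiated speed */
unsigned sstatus_ipm(uint32_t s) { return (s >> 8) & 0xf; } /* power state      */

/* Hypothetical helper: turn the SPD generation number into text. */
const char *spd_name(unsigned spd)
{
    switch (spd) {
    case 1:  return "1.5 Gbps";
    case 2:  return "3.0 Gbps";
    default: return "unknown";
    }
}

/* A few SError bits, named as in libata (only those relevant here). */
#define SERR_DATA_RECOVERED (1u << 0)   /* recovered data error   */
#define SERR_PROTOCOL       (1u << 10)  /* protocol violation     */
#define SERR_PHYRDY_CHG     (1u << 16)  /* PHY ready state change */

/* Print a human-readable summary of the two register values. */
void print_link_state(uint32_t sstatus, uint32_t serror)
{
    printf("DET=%u SPD=%u (%s) IPM=%u\n",
           sstatus_det(sstatus), sstatus_spd(sstatus),
           spd_name(sstatus_spd(sstatus)), sstatus_ipm(sstatus));
    if (serror & SERR_DATA_RECOVERED)
        puts("SErr: recovered data error");
    if (serror & SERR_PROTOCOL)
        puts("SErr: protocol violation");
    if (serror & SERR_PHYRDY_CHG)
        puts("SErr: PHY ready change");
}
```

Calling print_link_state(0x123, 0x401) with the values from the trace reports DET=3 (device present, link established), SPD=2 (3.0 Gbps, matching the "SATA link up 3.0 Gbps" line) and IPM=1 (active). Under the assumed bit names, 0x401 has the recovered-data and protocol bits set. Neither bit by itself distinguishes a bad drive from a bad cable or connector.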
Re: Possibly SATA related freeze killed networking and RAID
Phillip Susi wrote:
> Tejun Heo wrote:
>> Agreed. "Nobody cared" on ATA controllers is usually very effective at taking the whole machine down. Is there any reason why we don't turn on irqpoll on turned off IRQs automatically?
>
> Why does a single spurious interrupt cause it to be shut down? I can see if the interrupt is stuck on and keeps interrupting constantly, but if it's just the occasional spurious interrupt, why not just ignore it and move on?

I'm not certain offhand, but I think there may be such a threshold. However, an occasional spurious interrupt isn't likely. For a level-triggered interrupt, an unhandled interrupt will keep interrupting forever since nobody knows how to clear it (until we decide to disable the IRQ entirely).

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
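The kind of threshold mentioned above can be illustrated with a small user-space model. This is a sketch only, not the kernel's code: the window of 100000 interrupts and the limit of 99900 unhandled ones are modeled on the accounting in kernel/irq/spurious.c and should be treated as assumptions, and the names irq_stats, note_irq and simulate are hypothetical.

```c
#include <stdbool.h>

/* Assumed constants, modeled on kernel/irq/spurious.c. */
#define IRQ_WINDOW          100000u  /* interrupts per accounting window */
#define IRQ_UNHANDLED_LIMIT  99900u  /* unhandled count that trips it    */

struct irq_stats {
    unsigned int count;      /* interrupts seen in the current window */
    unsigned int unhandled;  /* of those, how many nobody claimed     */
    bool disabled;           /* set once the line has been shut off   */
};

/* Account for one interrupt; returns true if the line was just disabled. */
bool note_irq(struct irq_stats *st, bool handled)
{
    if (st->disabled)
        return false;
    if (!handled)
        st->unhandled++;
    if (++st->count < IRQ_WINDOW)
        return false;

    /* End of window: decide, then start a fresh window. */
    bool stuck = st->unhandled > IRQ_UNHANDLED_LIMIT;
    st->count = 0;
    st->unhandled = 0;
    if (stuck)
        st->disabled = true;
    return stuck;
}

/*
 * Hypothetical driver: deliver n interrupts where every `period`-th one
 * is unhandled (period == 0 means all are handled, 1 means none are).
 */
bool simulate(unsigned int n, unsigned int period)
{
    struct irq_stats st = {0};
    for (unsigned int i = 1; i <= n; i++)
        note_irq(&st, period == 0 || i % period != 0);
    return st.disabled;
}
```

In this model an occasional stray interrupt (say one in a thousand) never trips the limit, while a stuck level-triggered line, where every interrupt goes unhandled, is disabled at the end of the first window. That matches the distinction drawn in the reply.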
Re: How to map user space's virtual memory into kernel logical address space
Maitre Bart wrote:
> A given app is allocating a large amount of memory (~10M) with malloc(). It passes this pointer to the kernel (device driver) via a custom ioctl. I would like the driver to work on that memory with a pointer (as if it was allocated with vmalloc) as well as the user space too (upon return of the syscall).
>
> Is there a way to map a user space's virtual memory range into the kernel logical address space?
>
> As far as I learned from my readings, using the user-space pointer directly in kernel space will not work. Of course, copy_from_user() is out of the question for efficiency purposes.
>
> ioremap() is pretty close to what I wish to do except that it accepts a physical address and I don't know how to get it from a user space pointer. And since a physical address is required, I assume the range is considered contiguous, which is not really the case for malloc().
>
> mmap()/remap_pfn_range() are interesting but I don't know how to get a kernel pointer out of them.
>
> kmap() does the job for a single page (and anyway, I wouldn't know how to feed it with a struct page from the userland pointer).
>
> get_user_pages() looks promising but it seems I have to call kmap() on each page, so it looks like I cannot operate on the buffer with a single pointer.
>
> Does anyone know if it is possible? And if so, how can I do it?

10MB is an awfully big mapping to put into kernel virtual memory space. I suspect it might be easier to allocate the memory in the kernel and map it in from userspace, but then you have the same problem (and 10MB is awfully big for vmalloc). Is there a good reason why you have to be able to do this? There's likely a better way.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
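To put the "one page at a time" concern in numbers, the sketch below computes how many pages a user buffer spans, which is the count a driver would have to pin with get_user_pages() and then map individually. It is plain user-space C and makes several assumptions: a 4 KiB page size, and the hypothetical names SKETCH_PAGE_SHIFT, pages_spanned and page_array_bytes, none of which come from the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size of 4 KiB; real code would use the kernel's PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12

/* Number of pages touched by the byte range [start, start + len). */
size_t pages_spanned(uintptr_t start, size_t len)
{
    if (len == 0)
        return 0;
    uintptr_t first = start >> SKETCH_PAGE_SHIFT;
    uintptr_t last  = (start + len - 1) >> SKETCH_PAGE_SHIFT;
    return (size_t)(last - first + 1);
}

/* Bytes needed for an array holding one page pointer per spanned page. */
size_t page_array_bytes(uintptr_t start, size_t len)
{
    return pages_spanned(start, len) * sizeof(void *);
}
```

A page-aligned 10 MiB buffer spans 2560 pages under these assumptions, and a buffer that starts a few bytes into a page spans 2561, since a malloc() pointer is not guaranteed to be page-aligned. Each of those pages may sit anywhere in physical memory, which is why a single contiguous kernel pointer over the whole buffer does not come for free.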