Re: Problem with ata layer in 2.6.24
Kasper Sandberg wrote: > to put some timeline perspective into this. > i believe it was in 2005 i assembled the system, and when i realized it > was faulty, on old ide driver, i stopped using it - that miht have been > in beginning of 2006. then for almost a year i werent using it, hoping > to somehow fix it, but in january 2007 i think it was, atleast in the > very beginning of 2007, i hit upon the idea of trying libata, and ever > since the system has been running 24/7 - doing these errors around 2 > times a day. > > i have multiple times reported my problems to lkml, but nothing has > happened, i also tried to aproeach jgarzik direcly, but he was not > interested. > > i really hope this can be solved now, its a huge problem > > my fileserver has an asus k8v motherboard, with via chipset (k8t880 i > think it is, or something like it). currently using the promise > controller again(strangely enough all the timeouts seems to happen here, > and when the ITE was on, there, not the onboard one), in conjunction > with the onboard via. Timeouts are nasty to debug. It can be caused by whole range of different problems including transmission errors, bad power, faulty drive, mishandled media error, IRQ misrouting, dumb hardware bug. It's basically 'uh... I told the controller to do something but it never called me back'. If you see timeouts on multiple devices connected to different controllers, the chance is that you have problem somewhere else. The most likely culprit is bad power. Please... * Post the result of 'lspci -nn' and kernel log including full boot log and error messages. * Try to isolate the problem. ie. Does removing several number of drives fix the problem? If the problem is localized to certain device, what happens if you move it? Does the problem follow the drive or stay with the port? If the failing drives are SATA, it's a good idea to power some of the failing drives with a separate PSU and see whether anything is different. By trying to isolate the hardware problem, more can be learned about the error condition and even when the problem actually isn't hardware problem, it gives us much deeper insight of the problem and clues regarding where to look. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Kasper Sandberg wrote: to put some timeline perspective into this. i believe it was in 2005 i assembled the system, and when i realized it was faulty, on old ide driver, i stopped using it - that miht have been in beginning of 2006. then for almost a year i werent using it, hoping to somehow fix it, but in january 2007 i think it was, atleast in the very beginning of 2007, i hit upon the idea of trying libata, and ever since the system has been running 24/7 - doing these errors around 2 times a day. i have multiple times reported my problems to lkml, but nothing has happened, i also tried to aproeach jgarzik direcly, but he was not interested. i really hope this can be solved now, its a huge problem my fileserver has an asus k8v motherboard, with via chipset (k8t880 i think it is, or something like it). currently using the promise controller again(strangely enough all the timeouts seems to happen here, and when the ITE was on, there, not the onboard one), in conjunction with the onboard via. Timeouts are nasty to debug. It can be caused by whole range of different problems including transmission errors, bad power, faulty drive, mishandled media error, IRQ misrouting, dumb hardware bug. It's basically 'uh... I told the controller to do something but it never called me back'. If you see timeouts on multiple devices connected to different controllers, the chance is that you have problem somewhere else. The most likely culprit is bad power. Please... * Post the result of 'lspci -nn' and kernel log including full boot log and error messages. * Try to isolate the problem. ie. Does removing several number of drives fix the problem? If the problem is localized to certain device, what happens if you move it? Does the problem follow the drive or stay with the port? If the failing drives are SATA, it's a good idea to power some of the failing drives with a separate PSU and see whether anything is different. By trying to isolate the hardware problem, more can be learned about the error condition and even when the problem actually isn't hardware problem, it gives us much deeper insight of the problem and clues regarding where to look. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: I doubt libata has that capability now, or ever will, cuz these ide/atapi devices are generally dumber than rocks about that. But any device claiming to be scsi-II is supposed to be able to do those sorts of things while the cpu is off crunching numbers for BOINC or whatever. .. The CD/DVD drives all all "MMC" devices internally, which means they speak a SCSI command protocol. Regardless of the electrical or optical interface. Linux is software, and the software protocol is exactly the same for them, no matter what the cable/bus type happens to be. Cheers -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Problem with ata layer in 2.6.24
> From: [EMAIL PROTECTED] >>if this works then it really needs to move and be renamed. I am compiling >> with DEV_SR set. >> > That fixed me right up, Adam, & k3b is once again as happy as a clam. Fixed it for me too. I just realized the default config in 2.6.24 is way different than the default config in 2.6.23. If I remember correctly there was talk of separating the libata and scsi code. This was awhile ago. I am not a kernel programmer, only a user, but either the scsi and libata kconfig menus should be joined and made generic, or options like cdrom support should be in both kconfig menus. Alan says libata is scsi with an accent so maybe merging the two isn't as bad as it sounds. Just my $0.02 cents, probably worth less in this case. Adam _ Connect and share in new ways with Windows Live. http://www.windowslive.com/share.html?ocid=TXT_TAGHM_Wave2_sharelife_012008-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
> By the linux software definition maybe. But I've defined scsi as that which > uses a 50 wire cable using 50 contact centronics connectors since the > mid '70's, and which often needs a ready supply of nubile virgins t 25, 50 or 68, with multiple voltage levels, plus of course it might be over fibre or copper FC loop and .. SCSI is a protocol. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Alan Cox wrote: >> That could stand to be moved or renamed, it is well buried in the menu for >> the REAL scsi stuffs, which I don't have any of. > >Yes you do - USB storage and ATAPI are SCSI By the linux software definition maybe. But I've defined scsi as that which uses a 50 wire cable using 50 contact centronics connectors since the mid '70's, and which often needs a ready supply of nubile virgins to sacrifice to make it work, particularly with the old resistor pack terminations & psu's whose 5 volt line is only 4.85 volts due to old age. That's what I call REAL scsi. Its also a REAL PITA if the terms aren't active. You can call what you are doing 'scsi' because you are using much the same command structure, and that is good, but its not the real thing with all its hardware warts and/or capabilities. For one thing, this version usually works. :) Furinstance, you can tell 2 scsi devices on the same controller to talk to each other, moving files from one to the other, and the host controller can then goto sleep & the cpu isn't involved until the devices send it a wakeup to advise the controller that the transfer has been done, and the controller may or may not then interrupt and advise the cpu. You can do that with separate controllers too as long as they have a compatible DMA channel available to both. I doubt libata has that capability now, or ever will, cuz these ide/atapi devices are generally dumber than rocks about that. But any device claiming to be scsi-II is supposed to be able to do those sorts of things while the cpu is off crunching numbers for BOINC or whatever. But that puts my mild objections to classifying this as 'scsi' in a more understandable context. :-) -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) When some people decide it's time for everyone to make big changes, it means that they want you to change first. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
> I've seen a lot of verbosity out of SCSI messages, but I haven't seen a > straightforward interpretation of the problem in there. It's all > information useful for debugging, not information useful for system > administration. It tells you what is going on. Unfortunately that frequently requires some basic knowledge of how to interpret the error report. Drive interface behaviour simply doesn't boil down to a fault light on the dashboard or a "tighten the cable". For most common fault types you'll get errors most administrators should find meaningful - like "Media error" > On the other hand, bringing the system down because a device is > misbehaving is a poor idea. I've personally recovered most of the data off Hence we have RAID and SATA hotplug. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tue, 29 Jan 2008, Alan Cox wrote: > > The SCSI error reporting really ought to include a simple interpretation > > of the error for end users ("The drive doesn't support this command" "A > > sector's data got lost" "The drive timed out" "The drive failed" "The > > drive is entirely gone"). There's too much similarity between the message > > you get when you try a SMART test that doesn't apply to the drive and what > > you get when the drive is broken. > > That would be the SCSI verbose messages option. I think the Eric > Youngdale consortium added it about Linux 1.2. Nowdays its always built > that way. I've seen a lot of verbosity out of SCSI messages, but I haven't seen a straightforward interpretation of the problem in there. It's all information useful for debugging, not information useful for system administration. > > And it's possible that the error recovery is suboptimal in some cases. It > > seems to like resetting drives too much; perhaps if it keeps seeing the > > same problem and resetting the drive, it should decide that the drive's > > error reporting is just bad and just ignore that error like the old IDE > > did (but, in this case, after saying what it's doing). > > Nothing like casually praying the users data hasn't gone for a walk is > there. If we don't act on them the users don't report them until > something really bad occurs so that isn't an option. On the other hand, bringing the system down because a device is misbehaving is a poor idea. I've personally recovered most of the data off of a dying drive because the system was willing to let me keep using the drive anyway; IIRC, the drive didn't work at all after a reboot, so I would have lost all the data instead of only a little had the system insisted on a perfectly functioning drive in order to use it at all. There ought to be some middle ground between doing nothing until the computer really breaks and breaking the computer before then, but that's an issue not specific to libata. -Daniel *This .sig left intentionally blank* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
> That could stand to be moved or renamed, it is well buried in the menu for > the > REAL scsi stuffs, which I don't have any of. Yes you do - USB storage and ATAPI are SCSI -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
> The SCSI error reporting really ought to include a simple interpretation > of the error for end users ("The drive doesn't support this command" "A > sector's data got lost" "The drive timed out" "The drive failed" "The > drive is entirely gone"). There's too much similarity between the message > you get when you try a SMART test that doesn't apply to the drive and what > you get when the drive is broken. That would be the SCSI verbose messages option. I think the Eric Youngdale consortium added it about Linux 1.2. Nowdays its always built that way. > And it's possible that the error recovery is suboptimal in some cases. It > seems to like resetting drives too much; perhaps if it keeps seeing the > same problem and resetting the drive, it should decide that the drive's > error reporting is just bad and just ignore that error like the old IDE > did (but, in this case, after saying what it's doing). Nothing like casually praying the users data hasn't gone for a walk is there. If we don't act on them the users don't report them until something really bad occurs so that isn't an option. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Jeff Garzik wrote: >Gene Heskett wrote: >> On Tuesday 29 January 2008, Jeff Garzik wrote: >>> Gene Heskett wrote: Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) >>> >>> I think you mean /dev/scdx not /dev/sdx. Make sure you have the 'sr' >>> driver compiled and load (CONFIG_BLK_DEV_SR). >> >> That menu item COULD be moved, I don't have any REAL scsi stuff, so I >> didn't look there. My bad, with help from hiding it like that. :-) >> >>> The bios-for-dev-access thing definitely won't help, and may hurt (by >>> taking over the device you wanted to test). >> >> Ok, if BLK_DEV_SR fails, I'll take that back out. I'm heating the room >> making kernels here. :) > >I can say with 100% certainty that 'sr' is required in order to use your >dvd writer with libata. :) > > Jeff And as usual, you are 100% correct, thanks. And now back to our regularly scheduled testing for 'exception Emask' errors. :) -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Main's Law: For every action there is an equal and opposite government program. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Mark Lord wrote: rgheck wrote: Alan Cox wrote: not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used. We don't see very many libata problems at the distro level and they for the most part boil down to - sata_nv with >4GB of RAM, knowing being worked on, no old IDE driver anyway Is this >4GB or >=4GB? I've seen contradictory reports, and I've got 4GB. .. For all practical purposes, most memory over 3GB (or sometimes even 2GB) on a 32-bit x86 system is treated as >4GB by the motherboard. Because it's not the amount of *memory* that matters so much, but rather the amount of *used address space*. Video cards, PCI devices, other motherboard resources etc.. can all subtract from the available address space, leaving much less than 4GB for your RAM. Right. So it looks like I do have this issue, though I haven't seen any actual problems on 24. Is there a known workaround? rh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
rgheck wrote: Mark Lord wrote: rgheck wrote: Alan Cox wrote: not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used. We don't see very many libata problems at the distro level and they for the most part boil down to - sata_nv with >4GB of RAM, knowing being worked on, no old IDE driver anyway Is this >4GB or >=4GB? I've seen contradictory reports, and I've got 4GB. .. For all practical purposes, most memory over 3GB (or sometimes even 2GB) on a 32-bit x86 system is treated as >4GB by the motherboard. Because it's not the amount of *memory* that matters so much, but rather the amount of *used address space*. Video cards, PCI devices, other motherboard resources etc.. can all subtract from the available address space, leaving much less than 4GB for your RAM. Right. So it looks like I do have this issue, though I haven't seen any actual problems on 24. Is there a known workaround? .. For now, the workaround is to not enable the RAM above 4GB. Your kernel .config file should therefore have these two lines: CONFIG_HIGHMEM4G=y # CONFIG_HIGHMEM64G is not set Later, once the issue is fixed at the driver level (soon), you can get your high memory back again by enabling CONFIG_HIGHMEM64G, though this will cost a few percent of performance in the extra page table overhead it creates. Cheers -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tue, 29 Jan 2008, Alan Cox wrote: > > not one problem but lots---is sufficiently widespread that a Mini HOWTO, > > say, would be really welcome and, I'm guessing, widely used. > > We don't see very many libata problems at the distro level and they for > the most part boil down to > > - error messages looking different - Most bugs I get are things like > media errors (timeout looks different, UNC report looks different) The SCSI error reporting really ought to include a simple interpretation of the error for end users ("The drive doesn't support this command" "A sector's data got lost" "The drive timed out" "The drive failed" "The drive is entirely gone"). There's too much similarity between the message you get when you try a SMART test that doesn't apply to the drive and what you get when the drive is broken. > - faulty hardware being picked up because we actually do real error > checking now. We now check for and give some devices more slack while > still doing error checking. Both IDE layers also added blacklists for > stuff like the TSScorp DVD drives. Qemu has now had its bugs patched. I think this is the big source of unhappy users (and, of course, they all look the same and the reports stay findable by Google, so it looks a lot worse than it is). People getting this problem in distro kernels probably really do want to have a way to report it with enough detail from logs to get it dealt with and then switch back to old IDE until the fix propagates through. And it's possible that the error recovery is suboptimal in some cases. It seems to like resetting drives too much; perhaps if it keeps seeing the same problem and resetting the drive, it should decide that the drive's error reporting is just bad and just ignore that error like the old IDE did (but, in this case, after saying what it's doing). -Daniel *This .sig left intentionally blank* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
rgheck wrote: Alan Cox wrote: not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used. We don't see very many libata problems at the distro level and they for the most part boil down to - sata_nv with >4GB of RAM, knowing being worked on, no old IDE driver anyway Is this >4GB or >=4GB? I've seen contradictory reports, and I've got 4GB. .. For all practical purposes, most memory over 3GB (or sometimes even 2GB) on a 32-bit x86 system is treated as >4GB by the motherboard. Because it's not the amount of *memory* that matters so much, but rather the amount of *used address space*. Video cards, PCI devices, other motherboard resources etc.. can all subtract from the available address space, leaving much less than 4GB for your RAM. -ml -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tue, 29 Jan 2008, Alan Cox wrote: > > things in the kernel that refer to SCSI probably should say "storage" (or > > "ATA", really, but that would make the acronyms confusing). > > SCSI is a command protocol. It is what your CD-ROM drive and USB storage > devices talk (albeit with a bit of an accent). Among other things, yes. But SCSI standards also specify electrical interfaces that aren't at all related to the electrical interfaces used by a lot of devices, and a lot of the places the kernel uses the term suggest that it's also talking about the electrical interface (or, at least, connector shape). For example, it's misleading to talk about "SCSI CDROM support" meaning the command protocol when hardly anybody has ever seen a CDROM drive that doesn't use the SCSI command protocol, but most people know about both SCSI-connector and PATA-connector CDROM drives. -Daniel *This .sig left intentionally blank* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: On Tuesday 29 January 2008, Mark Lord wrote: Gene Heskett wrote: .. Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) .. It should show up as /dev/scd0 or something very similar. Tisn't. Darnit. .. It requires CONFIG_SCSI, CONFIG_BLK_DEV_SD, CONFIG_BLK_DEV_SR, in the kernel .config. The _SR one ("SCSI Reader") is for CD/DVD support. Cheers -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Daniel Barkalow wrote: >On Tue, 29 Jan 2008, Gene Heskett wrote: >> >For starters, enable CONFIG_BLK_DEV_SR. >> >> That could stand to be moved or renamed, it is well buried in the menu for >> the REAL scsi stuffs, which I don't have any of. Enabled & building now. > >The "SCSI support type (disk, tape, CD-ROM)" section of that menu actually >applies to all ATA-command-set devices that don't use the old IDE code. >For example, usb-storage uses "SCSI disk" out of that section, and >I've only seen "Probe all LUNs on each SCSI device" be needed for a >particular USB card reader with two slots. At this point, most of the >things in the kernel that refer to SCSI probably should say "storage" (or >"ATA", really, but that would make the acronyms confusing). > >Incidentally, you should be able to save debugging time for problems like >missing "sr" by building it as a module, which will build really quickly >and not require a reboot to test. > > -Daniel >*This .sig left intentionally blank* I did, Daniel, but while that has worked, its not been 100% foolproof in the past, so I just waste the 9 minutes building a new kernel as cheap insurance. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Mal: "If it's Alliance trouble you got, you might want to consider another ship. Some onboard here fought for the Independents." --Episode #8, "Out of Gas" -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Adam Turk wrote: >I just found this thread and it looks like it will fix my problem too. I > have an IDE cd-rw drive and 2 SCSI hard drives. My ide cd-rw drive hasn't > been showing up. I looked at setting scsi cdrom support > (CONFIG_BLOCK_DEV_SR) but it doesn't mention anything about ide drives > using libata. I know the drive is being detecting by looking at dmesg: >ata_piix :00:07.1: version 2.12 >scsi1 : ata_piix >scsi2 : ata_piix >ata1: PATA max UDMA/33 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14 >ata2: PATA max UDMA/33 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15 >ata1.00: ATAPI: Memorex 52MAXX 3252AJ1, 4WS2, max UDMA/33 >ata1.00: configured for UDMA/33 >ata2: port disabled. ignoring. >scsi 1:0:0:0: CD-ROMMemorex 52MAXX 3252AJ1 4WS2 PQ: 0 ANSI: 5 >if this works then it really needs to move and be renamed. I am compiling > with DEV_SR set. > >Just my $0.02 but may be worth more or less, >Adam > That fixed me right up, Adam, & k3b is once again as happy as a clam. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Main's Law: For every action there is an equal and opposite government program. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
> things in the kernel that refer to SCSI probably should say "storage" (or > "ATA", really, but that would make the acronyms confusing). SCSI is a command protocol. It is what your CD-ROM drive and USB storage devices talk (albeit with a bit of an accent). Alan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
I just found this thread and it looks like it will fix my problem too. I have an IDE cd-rw drive and 2 SCSI hard drives. My ide cd-rw drive hasn't been showing up. I looked at setting scsi cdrom support (CONFIG_BLOCK_DEV_SR) but it doesn't mention anything about ide drives using libata. I know the drive is being detecting by looking at dmesg: ata_piix :00:07.1: version 2.12 scsi1 : ata_piix scsi2 : ata_piix ata1: PATA max UDMA/33 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14 ata2: PATA max UDMA/33 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15 ata1.00: ATAPI: Memorex 52MAXX 3252AJ1, 4WS2, max UDMA/33 ata1.00: configured for UDMA/33 ata2: port disabled. ignoring. scsi 1:0:0:0: CD-ROMMemorex 52MAXX 3252AJ1 4WS2 PQ: 0 ANSI: 5 if this works then it really needs to move and be renamed. I am compiling with DEV_SR set. Just my $0.02 but may be worth more or less, Adam _ Need to know the score, the latest news, or you need your HotmailĀ®-get your "fix". http://www.msnmobilefix.com/Default.aspx-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tue, 29 Jan 2008, Gene Heskett wrote: > >For starters, enable CONFIG_BLK_DEV_SR. > > That could stand to be moved or renamed, it is well buried in the menu for > the > REAL scsi stuffs, which I don't have any of. Enabled & building now. The "SCSI support type (disk, tape, CD-ROM)" section of that menu actually applies to all ATA-command-set devices that don't use the old IDE code. For example, usb-storage uses "SCSI disk" out of that section, and I've only seen "Probe all LUNs on each SCSI device" be needed for a particular USB card reader with two slots. At this point, most of the things in the kernel that refer to SCSI probably should say "storage" (or "ATA", really, but that would make the acronyms confusing). Incidentally, you should be able to save debugging time for problems like missing "sr" by building it as a module, which will build really quickly and not require a reboot to test. -Daniel *This .sig left intentionally blank* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: On Tuesday 29 January 2008, Jeff Garzik wrote: Gene Heskett wrote: Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) I think you mean /dev/scdx not /dev/sdx. Make sure you have the 'sr' driver compiled and load (CONFIG_BLK_DEV_SR). That menu item COULD be moved, I don't have any REAL scsi stuff, so I didn't look there. My bad, with help from hiding it like that. :-) The bios-for-dev-access thing definitely won't help, and may hurt (by taking over the device you wanted to test). Ok, if BLK_DEV_SR fails, I'll take that back out. I'm heating the room making kernels here. :) I can say with 100% certainty that 'sr' is required in order to use your dvd writer with libata. :) Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
> Is this >4GB or >=4GB? I've seen contradictory reports, and I've got 4GB. Depends how the memory is mapped. Any memory physically above the 4GB boundary Alan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Jeff Garzik wrote: >Gene Heskett wrote: >> Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number >> when dmesg says its found ok at ata2.00? I've turned on an option that >> says something about using the bios for device access this build, but I'll >> be surprised if that's it. :) > >I think you mean /dev/scdx not /dev/sdx. Make sure you have the 'sr' >driver compiled and load (CONFIG_BLK_DEV_SR). > That menu item COULD be moved, I don't have any REAL scsi stuff, so I didn't look there. My bad, with help from hiding it like that. :-) >The bios-for-dev-access thing definitely won't help, and may hurt (by >taking over the device you wanted to test). > Ok, if BLK_DEV_SR fails, I'll take that back out. I'm heating the room making kernels here. :) Thanks Jeff. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Life sucks, but death doesn't put out at all. -- Thomas J. Kopp -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Alan Cox wrote: not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used. We don't see very many libata problems at the distro level and they for the most part boil down to - sata_nv with >4GB of RAM, knowing being worked on, no old IDE driver anyway Is this >4GB or >=4GB? I've seen contradictory reports, and I've got 4GB. Richard -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Mikael Pettersson wrote: >Gene Heskett writes: > > On Tuesday 29 January 2008, Alan Cox wrote: > > >> As slight change here, I was going to use the same .config as > > >> 2.6.24-rc8, but just discovered that neither rc8 nor final is finding > > >> the drivers for my > > > > > >If it is not finding a driver that is nothing to do with libata. It > > > means it's not being loaded by the distribution, or the distribution > > > kernel is too old (2.6.22) for the hardware - in which case see the > > > Fedora respins which are on 2.6.23.something right now. > > > > > >Alan > > > > Home built kernel Alan. But you are as good as anyone to tell me what I > > need to turn on in order for this dvdwriter to be enabled: > > [ 28.862478] ata2.00: ATAPI: LITE-ON DVDRW SHM-165H6S, HS06, max > > UDMA/66 > > [ 28.908647] ata2.00: limited to UDMA/33 due to 40-wire cable > > [ 29.081253] ata2.00: configured for UDMA/33 > > > > it has had several 80 wire cables tried, hasn't fixed this, and does not > > seem to effect its operation when it does work. > > > > [ 29.132405] scsi 1:0:0:0: CD-ROMLITE-ON DVDRW SHM-165H6S > > HS06 PQ: 0 ANSI: 5 > > [ 43.450795] scsi 1:0:0:0: Attached scsi generic sg1 type 5 > > --- > > No further mention of it in dmesg, and k3b cannot find the drive at any > > /dev/sgX address. > > > > .config attached, what else do I need to turn on? > >... > > > # CONFIG_BLK_DEV_SR is not set > >For starters, enable CONFIG_BLK_DEV_SR. That could stand to be moved or renamed, it is well buried in the menu for the REAL scsi stuffs, which I don't have any of. Enabled & building now. Thanks. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) An air of FRENCH FRIES permeates my nostrils!! -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) I think you mean /dev/scdx not /dev/sdx. Make sure you have the 'sr' driver compiled and load (CONFIG_BLK_DEV_SR). The bios-for-dev-access thing definitely won't help, and may hurt (by taking over the device you wanted to test). Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Mark Lord wrote: Gene Heskett wrote: .. Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) .. It should show up as /dev/scd0 or something very similar. Does it appear as /dev/sr0? Try ll /dev/s* and see what you get. Anyway, these /dev/ entries are produced by udev, not by libata. rh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett writes: > On Tuesday 29 January 2008, Alan Cox wrote: > >> As slight change here, I was going to use the same .config as 2.6.24-rc8, > >> but just discovered that neither rc8 nor final is finding the drivers for > >> my > > > >If it is not finding a driver that is nothing to do with libata. It means > >it's not being loaded by the distribution, or the distribution kernel is > >too old (2.6.22) for the hardware - in which case see the Fedora respins > >which are on 2.6.23.something right now. > > > >Alan > > Home built kernel Alan. But you are as good as anyone to tell me what I > need to turn on in order for this dvdwriter to be enabled: > [ 28.862478] ata2.00: ATAPI: LITE-ON DVDRW SHM-165H6S, HS06, max UDMA/66 > > [ 28.908647] ata2.00: limited to UDMA/33 due to 40-wire cable > [ 29.081253] ata2.00: configured for UDMA/33 > > it has had several 80 wire cables tried, hasn't fixed this, and does not > seem to effect its operation when it does work. > > [ 29.132405] scsi 1:0:0:0: CD-ROMLITE-ON DVDRW SHM-165H6S > HS06 PQ: 0 ANSI: 5 > > [ 43.450795] scsi 1:0:0:0: Attached scsi generic sg1 type 5 > --- > No further mention of it in dmesg, and k3b cannot find the drive at any > /dev/sgX address. > > .config attached, what else do I need to turn on? ... > # CONFIG_BLK_DEV_SR is not set For starters, enable CONFIG_BLK_DEV_SR. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Mark Lord wrote: >Gene Heskett wrote: >>.. >> Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number >> when dmesg says its found ok at ata2.00? I've turned on an option that >> says something about using the bios for device access this build, but I'll >> be surprised if that's it. :) > >.. > >It should show up as /dev/scd0 or something very similar. Tisn't. Darnit. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) clock speed -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: .. Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) .. It should show up as /dev/scd0 or something very similar. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
> As slight change here, I was going to use the same .config as 2.6.24-rc8, but > just discovered that neither rc8 nor final is finding the drivers for my If it is not finding a driver that is nothing to do with libata. It means it's not being loaded by the distribution, or the distribution kernel is too old (2.6.22) for the hardware - in which case see the Fedora respins which are on 2.6.23.something right now. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Florian Attenberger wrote: >On Mon, 28 Jan 2008 14:13:21 -0500 > >Gene Heskett <[EMAIL PROTECTED]> wrote: >> >> I had to reboot early this morning due to a freezeup, and I had a >> >> bunch of these in the messages log: >> >> == >> >> Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask >> >> 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 27 19:42:11 coyote kernel: >> >> [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 >> >> dma 4096 out Jan 27 19:42:11 coyote kernel: [42461.915974] res >> >> 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 27 19:42:11 >> >> coyote kernel: [42461.915978] ata1.00: status: { DRDY } Jan 27 19:42:11 >> >> coyote kernel: [42461.916005] ata1: soft resetting link Jan 27 19:42:12 >> >> coyote kernel: [42462.078216] ata1.00: configured for UDMA/100 Jan 27 >> >> 19:42:12 coyote kernel: [42462.078232] ata1: EH complete >> >> Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] >> >> 390721968 512-byte hardware sectors (200050 MB) Jan 27 19:42:12 coyote >> >> kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off Jan 27 >> >> 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: >> >> enabled, read cache: enabled, doesn't support DPO or FUA >> >> === > >I had this error too, or maybe only a similar one, and another, neither >of which of i still have the error output laying around, so I'm posting both >fixes, that i found here on lkml: >1) disabling ncq like that: >"echo 1 > /sys/block/sda/device/queue_depth" Interesting.. >2) this patch: libata_drain_fifo_on_stuck_drq_hsm.patch >( applies to 2.6.24 too ) > >Signed-off-by: Mark Lord <[EMAIL PROTECTED]> >--- > >--- old/drivers/ata/libata-sff.c 2007-09-28 09:29:22.0 -0400 >+++ linux/drivers/ata/libata-sff.c 2007-09-28 09:39:44.0 -0400 >@@ -420,6 +420,28 @@ > ap->ops->irq_on(ap); > } > >+static void ata_drain_fifo(struct ata_port *ap, struct ata_queued_cmd *qc) >+{ >+ u8 stat = ata_chk_status(ap); >+ /* >+ * Try to clear stuck DRQ if necessary, >+ * by reading/discarding up to two sectors worth of data. >+ */ >+ if ((stat & ATA_DRQ) && (!qc || qc->dma_dir != DMA_TO_DEVICE)) { >+ unsigned int i; >+ unsigned int limit = qc ? qc->sect_size : ATA_SECT_SIZE; >+ >+ printk(KERN_WARNING "Draining up to %u words from data FIFO.\n", >+ limit); >+ for (i = 0; i < limit ; ++i) { >+ ioread16(ap->ioaddr.data_addr); >+ if (!(ata_chk_status(ap) & ATA_DRQ)) >+ break; >+ } >+ printk(KERN_WARNING "Drained %u/%u words.\n", i, limit); >+ } >+} >+ > /** > *ata_bmdma_drive_eh - Perform EH with given methods for BMDMA controller > *@ap: port to handle error for >@@ -476,7 +498,7 @@ > } > > ata_altstatus(ap); >- ata_chk_status(ap); >+ ata_drain_fifo(ap, qc); > ap->ops->irq_clear(ap); > > spin_unlock_irqrestore(ap->lock, flags); >- This too. Thanks Florian. I'll keep these in mind as there may be more than one cat in need of skinning here. See a couple of posts I made to lkml this morning for the investigation I'm doing re the kernel argument 'acpi_use_timer_override', experimental builds under way right now. Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Ah, sweet Springtime, when a young man lightly turns his fancy over! -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Alan Cox wrote: >> not one problem but lots---is sufficiently widespread that a Mini HOWTO, >> say, would be really welcome and, I'm guessing, widely used. > >We don't see very many libata problems at the distro level and they for >the most part boil down to > >- error messages looking different - Most bugs I get are things like >media errors (timeout looks different, UNC report looks different) > >- broken hardware - I've closed a whole raft of bugs that turn out to be >new PC systems where even the BIOS doesn't see the drives > >- faulty hardware being picked up because we actually do real error >checking now. We now check for and give some devices more slack while >still doing error checking. Both IDE layers also added blacklists for >stuff like the TSScorp DVD drives. Qemu has now had its bugs patched. > >- sata_nv with >4GB of RAM, knowing being worked on, no old IDE driver >anyway > >- pata_ali MWDMA with ATAPI, PIO works fine, all a bit of a mystery and >as it affects only a few chip variants hard to figure out. Workaround >libata.dma=1 > >- CS handling. On a few boxes using cable select (particularly on one >drive and not the other) shows up a problem, normally a failed SRST. >That's still under investigation. > >- Promise timeouts. The old IDE times out then polls the device and finds >the IRQ was never sent and then recovers so the user sees a short stall >but no errors. The new libata doesn't do this and pdc202xx_old thus >produces some error messages on some boxes. Backup polling is on my todo >list. As slight change here, I was going to use the same .config as 2.6.24-rc8, but just discovered that neither rc8 nor final is finding the drivers for my dvd writer while using libata, so its not useable. So I've enable a couple of things in the 2.6.24 build that aren't in the 2.6.24-rc8. When I find the magic twanger, I'll rebuild -rc8 with it too. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) River: "He didn't lie down. They never lie down." --"Serenity" -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Alan Cox wrote: >> not one problem but lots---is sufficiently widespread that a Mini HOWTO, >> say, would be really welcome and, I'm guessing, widely used. > >We don't see very many libata problems at the distro level and they for >the most part boil down to > >- error messages looking different - Most bugs I get are things like >media errors (timeout looks different, UNC report looks different) > >- broken hardware - I've closed a whole raft of bugs that turn out to be >new PC systems where even the BIOS doesn't see the drives > >- faulty hardware being picked up because we actually do real error >checking now. We now check for and give some devices more slack while >still doing error checking. Both IDE layers also added blacklists for >stuff like the TSScorp DVD drives. Qemu has now had its bugs patched. > >- sata_nv with >4GB of RAM, knowing being worked on, no old IDE driver >anyway > >- pata_ali MWDMA with ATAPI, PIO works fine, all a bit of a mystery and >as it affects only a few chip variants hard to figure out. Workaround >libata.dma=1 > >- CS handling. On a few boxes using cable select (particularly on one >drive and not the other) shows up a problem, normally a failed SRST. >That's still under investigation. > >- Promise timeouts. The old IDE times out then polls the device and finds >the IRQ was never sent and then recovers so the user sees a short stall >but no errors. The new libata doesn't do this and pdc202xx_old thus >produces some error messages on some boxes. Backup polling is on my todo >list. I have not had a problem, no errors at all, since I rebooted to 2.6.24-rc8 with the added argument in the kernel line in grub (from dmesg): [0.00] Kernel command line: ro root=/dev/VolGroup00/LogVol00 acpi_use_timer_override rhgb quiet which causes dmesg to log, some time later: [ 27.581823] ENABLING IO-APIC IRQs [ 27.582014] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1 [ 27.592017] ..MP-BIOS bug: 8254 timer not connected to IO-APIC [ 27.592068] ...trying to set up timer (IRQ0) through the 8259A ... failed. [ 27.592071] ...trying to set up timer as Virtual Wire IRQ... works. [ 27.703623] Brought up 1 CPUs This was about noonish yesterday, and the logs have been silent regarding this 'exception Emask' error since then. The drive itself has also passed a smartctl -t long test with no errors since then. Now, the last boot that had the problem was to 2.6.24, which did NOT have that 'acpi_use_timer_override' argument, and its dmesg logged: [ 24.934176] ENABLING IO-APIC IRQs [ 24.934367] ..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1 [ 25.045973] Brought up 1 CPUs Now, my question is, did the use of that argument, while it looked like it failed, cause the setup code to do something correct that the default path didn't do? Is this the clue we're all looking for? Since libata is apparently the path taken by TPTB, I'm going to build and boot to a 2.6.24 using libata, but add that argument to grubs kernel line in only one of 2 copies of that stanza. Wish me luck. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) The intelligence of any discussion diminishes with the square of the number of participants. -- Adam Walinsky -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
> not one problem but lots---is sufficiently widespread that a Mini HOWTO, > say, would be really welcome and, I'm guessing, widely used. We don't see very many libata problems at the distro level and they for the most part boil down to - error messages looking different - Most bugs I get are things like media errors (timeout looks different, UNC report looks different) - broken hardware - I've closed a whole raft of bugs that turn out to be new PC systems where even the BIOS doesn't see the drives - faulty hardware being picked up because we actually do real error checking now. We now check for and give some devices more slack while still doing error checking. Both IDE layers also added blacklists for stuff like the TSScorp DVD drives. Qemu has now had its bugs patched. - sata_nv with >4GB of RAM, knowing being worked on, no old IDE driver anyway - pata_ali MWDMA with ATAPI, PIO works fine, all a bit of a mystery and as it affects only a few chip variants hard to figure out. Workaround libata.dma=1 - CS handling. On a few boxes using cable select (particularly on one drive and not the other) shows up a problem, normally a failed SRST. That's still under investigation. - Promise timeouts. The old IDE times out then polls the device and finds the IRQ was never sent and then recovers so the user sees a short stall but no errors. The new libata doesn't do this and pdc202xx_old thus produces some error messages on some boxes. Backup polling is on my todo list. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
As slight change here, I was going to use the same .config as 2.6.24-rc8, but just discovered that neither rc8 nor final is finding the drivers for my If it is not finding a driver that is nothing to do with libata. It means it's not being loaded by the distribution, or the distribution kernel is too old (2.6.22) for the hardware - in which case see the Fedora respins which are on 2.6.23.something right now. Alan -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett writes: On Tuesday 29 January 2008, Alan Cox wrote: As slight change here, I was going to use the same .config as 2.6.24-rc8, but just discovered that neither rc8 nor final is finding the drivers for my If it is not finding a driver that is nothing to do with libata. It means it's not being loaded by the distribution, or the distribution kernel is too old (2.6.22) for the hardware - in which case see the Fedora respins which are on 2.6.23.something right now. Alan Home built kernel Alan. But you are as good as anyone to tell me what I need to turn on in order for this dvdwriter to be enabled: [ 28.862478] ata2.00: ATAPI: LITE-ON DVDRW SHM-165H6S, HS06, max UDMA/66 [ 28.908647] ata2.00: limited to UDMA/33 due to 40-wire cable [ 29.081253] ata2.00: configured for UDMA/33 it has had several 80 wire cables tried, hasn't fixed this, and does not seem to effect its operation when it does work. [ 29.132405] scsi 1:0:0:0: CD-ROMLITE-ON DVDRW SHM-165H6S HS06 PQ: 0 ANSI: 5 [ 43.450795] scsi 1:0:0:0: Attached scsi generic sg1 type 5 --- No further mention of it in dmesg, and k3b cannot find the drive at any /dev/sgX address. .config attached, what else do I need to turn on? ... # CONFIG_BLK_DEV_SR is not set For starters, enable CONFIG_BLK_DEV_SR. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Mikael Pettersson wrote: Gene Heskett writes: On Tuesday 29 January 2008, Alan Cox wrote: As slight change here, I was going to use the same .config as 2.6.24-rc8, but just discovered that neither rc8 nor final is finding the drivers for my If it is not finding a driver that is nothing to do with libata. It means it's not being loaded by the distribution, or the distribution kernel is too old (2.6.22) for the hardware - in which case see the Fedora respins which are on 2.6.23.something right now. Alan Home built kernel Alan. But you are as good as anyone to tell me what I need to turn on in order for this dvdwriter to be enabled: [ 28.862478] ata2.00: ATAPI: LITE-ON DVDRW SHM-165H6S, HS06, max UDMA/66 [ 28.908647] ata2.00: limited to UDMA/33 due to 40-wire cable [ 29.081253] ata2.00: configured for UDMA/33 it has had several 80 wire cables tried, hasn't fixed this, and does not seem to effect its operation when it does work. [ 29.132405] scsi 1:0:0:0: CD-ROMLITE-ON DVDRW SHM-165H6S HS06 PQ: 0 ANSI: 5 [ 43.450795] scsi 1:0:0:0: Attached scsi generic sg1 type 5 --- No further mention of it in dmesg, and k3b cannot find the drive at any /dev/sgX address. .config attached, what else do I need to turn on? ... # CONFIG_BLK_DEV_SR is not set For starters, enable CONFIG_BLK_DEV_SR. That could stand to be moved or renamed, it is well buried in the menu for the REAL scsi stuffs, which I don't have any of. Enabled building now. Thanks. -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) An air of FRENCH FRIES permeates my nostrils!! -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) I think you mean /dev/scdx not /dev/sdx. Make sure you have the 'sr' driver compiled and load (CONFIG_BLK_DEV_SR). The bios-for-dev-access thing definitely won't help, and may hurt (by taking over the device you wanted to test). Jeff -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Mark Lord wrote: Gene Heskett wrote: .. Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) .. It should show up as /dev/scd0 or something very similar. Does it appear as /dev/sr0? Try ll /dev/s* and see what you get. Anyway, these /dev/ entries are produced by udev, not by libata. rh -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Mark Lord wrote: Gene Heskett wrote: .. Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) .. It should show up as /dev/scd0 or something very similar. Tisn't. Darnit. -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) clock speed -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: .. Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) .. It should show up as /dev/scd0 or something very similar. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Alan Cox wrote: not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used. We don't see very many libata problems at the distro level and they for the most part boil down to - error messages looking different - Most bugs I get are things like media errors (timeout looks different, UNC report looks different) - broken hardware - I've closed a whole raft of bugs that turn out to be new PC systems where even the BIOS doesn't see the drives - faulty hardware being picked up because we actually do real error checking now. We now check for and give some devices more slack while still doing error checking. Both IDE layers also added blacklists for stuff like the TSScorp DVD drives. Qemu has now had its bugs patched. - sata_nv with 4GB of RAM, knowing being worked on, no old IDE driver anyway - pata_ali MWDMA with ATAPI, PIO works fine, all a bit of a mystery and as it affects only a few chip variants hard to figure out. Workaround libata.dma=1 - CS handling. On a few boxes using cable select (particularly on one drive and not the other) shows up a problem, normally a failed SRST. That's still under investigation. - Promise timeouts. The old IDE times out then polls the device and finds the IRQ was never sent and then recovers so the user sees a short stall but no errors. The new libata doesn't do this and pdc202xx_old thus produces some error messages on some boxes. Backup polling is on my todo list. As slight change here, I was going to use the same .config as 2.6.24-rc8, but just discovered that neither rc8 nor final is finding the drivers for my dvd writer while using libata, so its not useable. So I've enable a couple of things in the 2.6.24 build that aren't in the 2.6.24-rc8. When I find the magic twanger, I'll rebuild -rc8 with it too. -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) River: He didn't lie down. They never lie down. --Serenity -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Florian Attenberger wrote: On Mon, 28 Jan 2008 14:13:21 -0500 Gene Heskett [EMAIL PROTECTED] wrote: I had to reboot early this morning due to a freezeup, and I had a bunch of these in the messages log: == Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY } Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100 Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA === I had this error too, or maybe only a similar one, and another, neither of which of i still have the error output laying around, so I'm posting both fixes, that i found here on lkml: 1) disabling ncq like that: echo 1 /sys/block/sda/device/queue_depth Interesting.. 2) this patch: libata_drain_fifo_on_stuck_drq_hsm.patch ( applies to 2.6.24 too ) Signed-off-by: Mark Lord [EMAIL PROTECTED] --- --- old/drivers/ata/libata-sff.c 2007-09-28 09:29:22.0 -0400 +++ linux/drivers/ata/libata-sff.c 2007-09-28 09:39:44.0 -0400 @@ -420,6 +420,28 @@ ap-ops-irq_on(ap); } +static void ata_drain_fifo(struct ata_port *ap, struct ata_queued_cmd *qc) +{ + u8 stat = ata_chk_status(ap); + /* + * Try to clear stuck DRQ if necessary, + * by reading/discarding up to two sectors worth of data. + */ + if ((stat ATA_DRQ) (!qc || qc-dma_dir != DMA_TO_DEVICE)) { + unsigned int i; + unsigned int limit = qc ? qc-sect_size : ATA_SECT_SIZE; + + printk(KERN_WARNING Draining up to %u words from data FIFO.\n, + limit); + for (i = 0; i limit ; ++i) { + ioread16(ap-ioaddr.data_addr); + if (!(ata_chk_status(ap) ATA_DRQ)) + break; + } + printk(KERN_WARNING Drained %u/%u words.\n, i, limit); + } +} + /** *ata_bmdma_drive_eh - Perform EH with given methods for BMDMA controller *@ap: port to handle error for @@ -476,7 +498,7 @@ } ata_altstatus(ap); - ata_chk_status(ap); + ata_drain_fifo(ap, qc); ap-ops-irq_clear(ap); spin_unlock_irqrestore(ap-lock, flags); - This too. Thanks Florian. I'll keep these in mind as there may be more than one cat in need of skinning here. See a couple of posts I made to lkml this morning for the investigation I'm doing re the kernel argument 'acpi_use_timer_override', experimental builds under way right now. Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) Ah, sweet Springtime, when a young man lightly turns his fancy over! -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Alan Cox wrote: not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used. We don't see very many libata problems at the distro level and they for the most part boil down to - error messages looking different - Most bugs I get are things like media errors (timeout looks different, UNC report looks different) - broken hardware - I've closed a whole raft of bugs that turn out to be new PC systems where even the BIOS doesn't see the drives - faulty hardware being picked up because we actually do real error checking now. We now check for and give some devices more slack while still doing error checking. Both IDE layers also added blacklists for stuff like the TSScorp DVD drives. Qemu has now had its bugs patched. - sata_nv with 4GB of RAM, knowing being worked on, no old IDE driver anyway - pata_ali MWDMA with ATAPI, PIO works fine, all a bit of a mystery and as it affects only a few chip variants hard to figure out. Workaround libata.dma=1 - CS handling. On a few boxes using cable select (particularly on one drive and not the other) shows up a problem, normally a failed SRST. That's still under investigation. - Promise timeouts. The old IDE times out then polls the device and finds the IRQ was never sent and then recovers so the user sees a short stall but no errors. The new libata doesn't do this and pdc202xx_old thus produces some error messages on some boxes. Backup polling is on my todo list. I have not had a problem, no errors at all, since I rebooted to 2.6.24-rc8 with the added argument in the kernel line in grub (from dmesg): [0.00] Kernel command line: ro root=/dev/VolGroup00/LogVol00 acpi_use_timer_override rhgb quiet which causes dmesg to log, some time later: [ 27.581823] ENABLING IO-APIC IRQs [ 27.582014] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1 [ 27.592017] ..MP-BIOS bug: 8254 timer not connected to IO-APIC [ 27.592068] ...trying to set up timer (IRQ0) through the 8259A ... failed. [ 27.592071] ...trying to set up timer as Virtual Wire IRQ... works. [ 27.703623] Brought up 1 CPUs This was about noonish yesterday, and the logs have been silent regarding this 'exception Emask' error since then. The drive itself has also passed a smartctl -t long test with no errors since then. Now, the last boot that had the problem was to 2.6.24, which did NOT have that 'acpi_use_timer_override' argument, and its dmesg logged: [ 24.934176] ENABLING IO-APIC IRQs [ 24.934367] ..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1 [ 25.045973] Brought up 1 CPUs Now, my question is, did the use of that argument, while it looked like it failed, cause the setup code to do something correct that the default path didn't do? Is this the clue we're all looking for? Since libata is apparently the path taken by TPTB, I'm going to build and boot to a 2.6.24 using libata, but add that argument to grubs kernel line in only one of 2 copies of that stanza. Wish me luck. -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) The intelligence of any discussion diminishes with the square of the number of participants. -- Adam Walinsky -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used. We don't see very many libata problems at the distro level and they for the most part boil down to - error messages looking different - Most bugs I get are things like media errors (timeout looks different, UNC report looks different) - broken hardware - I've closed a whole raft of bugs that turn out to be new PC systems where even the BIOS doesn't see the drives - faulty hardware being picked up because we actually do real error checking now. We now check for and give some devices more slack while still doing error checking. Both IDE layers also added blacklists for stuff like the TSScorp DVD drives. Qemu has now had its bugs patched. - sata_nv with 4GB of RAM, knowing being worked on, no old IDE driver anyway - pata_ali MWDMA with ATAPI, PIO works fine, all a bit of a mystery and as it affects only a few chip variants hard to figure out. Workaround libata.dma=1 - CS handling. On a few boxes using cable select (particularly on one drive and not the other) shows up a problem, normally a failed SRST. That's still under investigation. - Promise timeouts. The old IDE times out then polls the device and finds the IRQ was never sent and then recovers so the user sees a short stall but no errors. The new libata doesn't do this and pdc202xx_old thus produces some error messages on some boxes. Backup polling is on my todo list. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Alan Cox wrote: not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used. We don't see very many libata problems at the distro level and they for the most part boil down to - sata_nv with 4GB of RAM, knowing being worked on, no old IDE driver anyway Is this 4GB or =4GB? I've seen contradictory reports, and I've got 4GB. Richard -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
things in the kernel that refer to SCSI probably should say storage (or ATA, really, but that would make the acronyms confusing). SCSI is a command protocol. It is what your CD-ROM drive and USB storage devices talk (albeit with a bit of an accent). Alan -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Adam Turk wrote: I just found this thread and it looks like it will fix my problem too. I have an IDE cd-rw drive and 2 SCSI hard drives. My ide cd-rw drive hasn't been showing up. I looked at setting scsi cdrom support (CONFIG_BLOCK_DEV_SR) but it doesn't mention anything about ide drives using libata. I know the drive is being detecting by looking at dmesg: ata_piix :00:07.1: version 2.12 scsi1 : ata_piix scsi2 : ata_piix ata1: PATA max UDMA/33 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14 ata2: PATA max UDMA/33 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15 ata1.00: ATAPI: Memorex 52MAXX 3252AJ1, 4WS2, max UDMA/33 ata1.00: configured for UDMA/33 ata2: port disabled. ignoring. scsi 1:0:0:0: CD-ROMMemorex 52MAXX 3252AJ1 4WS2 PQ: 0 ANSI: 5 if this works then it really needs to move and be renamed. I am compiling with DEV_SR set. Just my $0.02 but may be worth more or less, Adam That fixed me right up, Adam, k3b is once again as happy as a clam. -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) Main's Law: For every action there is an equal and opposite government program. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Daniel Barkalow wrote: On Tue, 29 Jan 2008, Gene Heskett wrote: For starters, enable CONFIG_BLK_DEV_SR. That could stand to be moved or renamed, it is well buried in the menu for the REAL scsi stuffs, which I don't have any of. Enabled building now. The SCSI support type (disk, tape, CD-ROM) section of that menu actually applies to all ATA-command-set devices that don't use the old IDE code. For example, usb-storage uses SCSI disk out of that section, and I've only seen Probe all LUNs on each SCSI device be needed for a particular USB card reader with two slots. At this point, most of the things in the kernel that refer to SCSI probably should say storage (or ATA, really, but that would make the acronyms confusing). Incidentally, you should be able to save debugging time for problems like missing sr by building it as a module, which will build really quickly and not require a reboot to test. -Daniel *This .sig left intentionally blank* I did, Daniel, but while that has worked, its not been 100% foolproof in the past, so I just waste the 9 minutes building a new kernel as cheap insurance. -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) Mal: If it's Alliance trouble you got, you might want to consider another ship. Some onboard here fought for the Independents. --Episode #8, Out of Gas -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tue, 29 Jan 2008, Gene Heskett wrote: For starters, enable CONFIG_BLK_DEV_SR. That could stand to be moved or renamed, it is well buried in the menu for the REAL scsi stuffs, which I don't have any of. Enabled building now. The SCSI support type (disk, tape, CD-ROM) section of that menu actually applies to all ATA-command-set devices that don't use the old IDE code. For example, usb-storage uses SCSI disk out of that section, and I've only seen Probe all LUNs on each SCSI device be needed for a particular USB card reader with two slots. At this point, most of the things in the kernel that refer to SCSI probably should say storage (or ATA, really, but that would make the acronyms confusing). Incidentally, you should be able to save debugging time for problems like missing sr by building it as a module, which will build really quickly and not require a reboot to test. -Daniel *This .sig left intentionally blank* -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
I just found this thread and it looks like it will fix my problem too. I have an IDE cd-rw drive and 2 SCSI hard drives. My ide cd-rw drive hasn't been showing up. I looked at setting scsi cdrom support (CONFIG_BLOCK_DEV_SR) but it doesn't mention anything about ide drives using libata. I know the drive is being detecting by looking at dmesg: ata_piix :00:07.1: version 2.12 scsi1 : ata_piix scsi2 : ata_piix ata1: PATA max UDMA/33 cmd 0x1f0 ctl 0x3f6 bmdma 0xffa0 irq 14 ata2: PATA max UDMA/33 cmd 0x170 ctl 0x376 bmdma 0xffa8 irq 15 ata1.00: ATAPI: Memorex 52MAXX 3252AJ1, 4WS2, max UDMA/33 ata1.00: configured for UDMA/33 ata2: port disabled. ignoring. scsi 1:0:0:0: CD-ROMMemorex 52MAXX 3252AJ1 4WS2 PQ: 0 ANSI: 5 if this works then it really needs to move and be renamed. I am compiling with DEV_SR set. Just my $0.02 but may be worth more or less, Adam _ Need to know the score, the latest news, or you need your HotmailĀ®-get your fix. http://www.msnmobilefix.com/Default.aspx-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: On Tuesday 29 January 2008, Jeff Garzik wrote: Gene Heskett wrote: Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) I think you mean /dev/scdx not /dev/sdx. Make sure you have the 'sr' driver compiled and load (CONFIG_BLK_DEV_SR). That menu item COULD be moved, I don't have any REAL scsi stuff, so I didn't look there. My bad, with help from hiding it like that. :-) The bios-for-dev-access thing definitely won't help, and may hurt (by taking over the device you wanted to test). Ok, if BLK_DEV_SR fails, I'll take that back out. I'm heating the room making kernels here. :) I can say with 100% certainty that 'sr' is required in order to use your dvd writer with libata. :) Jeff -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Is this 4GB or =4GB? I've seen contradictory reports, and I've got 4GB. Depends how the memory is mapped. Any memory physically above the 4GB boundary Alan -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: On Tuesday 29 January 2008, Mark Lord wrote: Gene Heskett wrote: .. Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) .. It should show up as /dev/scd0 or something very similar. Tisn't. Darnit. .. It requires CONFIG_SCSI, CONFIG_BLK_DEV_SD, CONFIG_BLK_DEV_SR, in the kernel .config. The _SR one (SCSI Reader) is for CD/DVD support. Cheers -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
rgheck wrote: Alan Cox wrote: not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used. We don't see very many libata problems at the distro level and they for the most part boil down to - sata_nv with 4GB of RAM, knowing being worked on, no old IDE driver anyway Is this 4GB or =4GB? I've seen contradictory reports, and I've got 4GB. .. For all practical purposes, most memory over 3GB (or sometimes even 2GB) on a 32-bit x86 system is treated as 4GB by the motherboard. Because it's not the amount of *memory* that matters so much, but rather the amount of *used address space*. Video cards, PCI devices, other motherboard resources etc.. can all subtract from the available address space, leaving much less than 4GB for your RAM. -ml -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tue, 29 Jan 2008, Alan Cox wrote: things in the kernel that refer to SCSI probably should say storage (or ATA, really, but that would make the acronyms confusing). SCSI is a command protocol. It is what your CD-ROM drive and USB storage devices talk (albeit with a bit of an accent). Among other things, yes. But SCSI standards also specify electrical interfaces that aren't at all related to the electrical interfaces used by a lot of devices, and a lot of the places the kernel uses the term suggest that it's also talking about the electrical interface (or, at least, connector shape). For example, it's misleading to talk about SCSI CDROM support meaning the command protocol when hardly anybody has ever seen a CDROM drive that doesn't use the SCSI command protocol, but most people know about both SCSI-connector and PATA-connector CDROM drives. -Daniel *This .sig left intentionally blank* -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tue, 29 Jan 2008, Alan Cox wrote: not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used. We don't see very many libata problems at the distro level and they for the most part boil down to - error messages looking different - Most bugs I get are things like media errors (timeout looks different, UNC report looks different) The SCSI error reporting really ought to include a simple interpretation of the error for end users (The drive doesn't support this command A sector's data got lost The drive timed out The drive failed The drive is entirely gone). There's too much similarity between the message you get when you try a SMART test that doesn't apply to the drive and what you get when the drive is broken. - faulty hardware being picked up because we actually do real error checking now. We now check for and give some devices more slack while still doing error checking. Both IDE layers also added blacklists for stuff like the TSScorp DVD drives. Qemu has now had its bugs patched. I think this is the big source of unhappy users (and, of course, they all look the same and the reports stay findable by Google, so it looks a lot worse than it is). People getting this problem in distro kernels probably really do want to have a way to report it with enough detail from logs to get it dealt with and then switch back to old IDE until the fix propagates through. And it's possible that the error recovery is suboptimal in some cases. It seems to like resetting drives too much; perhaps if it keeps seeing the same problem and resetting the drive, it should decide that the drive's error reporting is just bad and just ignore that error like the old IDE did (but, in this case, after saying what it's doing). -Daniel *This .sig left intentionally blank* -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tue, 29 Jan 2008, Alan Cox wrote: The SCSI error reporting really ought to include a simple interpretation of the error for end users (The drive doesn't support this command A sector's data got lost The drive timed out The drive failed The drive is entirely gone). There's too much similarity between the message you get when you try a SMART test that doesn't apply to the drive and what you get when the drive is broken. That would be the SCSI verbose messages option. I think the Eric Youngdale consortium added it about Linux 1.2. Nowdays its always built that way. I've seen a lot of verbosity out of SCSI messages, but I haven't seen a straightforward interpretation of the problem in there. It's all information useful for debugging, not information useful for system administration. And it's possible that the error recovery is suboptimal in some cases. It seems to like resetting drives too much; perhaps if it keeps seeing the same problem and resetting the drive, it should decide that the drive's error reporting is just bad and just ignore that error like the old IDE did (but, in this case, after saying what it's doing). Nothing like casually praying the users data hasn't gone for a walk is there. If we don't act on them the users don't report them until something really bad occurs so that isn't an option. On the other hand, bringing the system down because a device is misbehaving is a poor idea. I've personally recovered most of the data off of a dying drive because the system was willing to let me keep using the drive anyway; IIRC, the drive didn't work at all after a reboot, so I would have lost all the data instead of only a little had the system insisted on a perfectly functioning drive in order to use it at all. There ought to be some middle ground between doing nothing until the computer really breaks and breaking the computer before then, but that's an issue not specific to libata. -Daniel *This .sig left intentionally blank* -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
I've seen a lot of verbosity out of SCSI messages, but I haven't seen a straightforward interpretation of the problem in there. It's all information useful for debugging, not information useful for system administration. It tells you what is going on. Unfortunately that frequently requires some basic knowledge of how to interpret the error report. Drive interface behaviour simply doesn't boil down to a fault light on the dashboard or a tighten the cable. For most common fault types you'll get errors most administrators should find meaningful - like Media error On the other hand, bringing the system down because a device is misbehaving is a poor idea. I've personally recovered most of the data off Hence we have RAID and SATA hotplug. Alan -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
That could stand to be moved or renamed, it is well buried in the menu for the REAL scsi stuffs, which I don't have any of. Yes you do - USB storage and ATAPI are SCSI -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Jeff Garzik wrote: Gene Heskett wrote: On Tuesday 29 January 2008, Jeff Garzik wrote: Gene Heskett wrote: Does anyone know why my dvdwriter isn't being assigned a '/dev/sdx' number when dmesg says its found ok at ata2.00? I've turned on an option that says something about using the bios for device access this build, but I'll be surprised if that's it. :) I think you mean /dev/scdx not /dev/sdx. Make sure you have the 'sr' driver compiled and load (CONFIG_BLK_DEV_SR). That menu item COULD be moved, I don't have any REAL scsi stuff, so I didn't look there. My bad, with help from hiding it like that. :-) The bios-for-dev-access thing definitely won't help, and may hurt (by taking over the device you wanted to test). Ok, if BLK_DEV_SR fails, I'll take that back out. I'm heating the room making kernels here. :) I can say with 100% certainty that 'sr' is required in order to use your dvd writer with libata. :) Jeff And as usual, you are 100% correct, thanks. And now back to our regularly scheduled testing for 'exception Emask' errors. :) -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) Main's Law: For every action there is an equal and opposite government program. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Mark Lord wrote: rgheck wrote: Alan Cox wrote: not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used. We don't see very many libata problems at the distro level and they for the most part boil down to - sata_nv with 4GB of RAM, knowing being worked on, no old IDE driver anyway Is this 4GB or =4GB? I've seen contradictory reports, and I've got 4GB. .. For all practical purposes, most memory over 3GB (or sometimes even 2GB) on a 32-bit x86 system is treated as 4GB by the motherboard. Because it's not the amount of *memory* that matters so much, but rather the amount of *used address space*. Video cards, PCI devices, other motherboard resources etc.. can all subtract from the available address space, leaving much less than 4GB for your RAM. Right. So it looks like I do have this issue, though I haven't seen any actual problems on 24. Is there a known workaround? rh -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
rgheck wrote: Mark Lord wrote: rgheck wrote: Alan Cox wrote: not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used. We don't see very many libata problems at the distro level and they for the most part boil down to - sata_nv with 4GB of RAM, knowing being worked on, no old IDE driver anyway Is this 4GB or =4GB? I've seen contradictory reports, and I've got 4GB. .. For all practical purposes, most memory over 3GB (or sometimes even 2GB) on a 32-bit x86 system is treated as 4GB by the motherboard. Because it's not the amount of *memory* that matters so much, but rather the amount of *used address space*. Video cards, PCI devices, other motherboard resources etc.. can all subtract from the available address space, leaving much less than 4GB for your RAM. Right. So it looks like I do have this issue, though I haven't seen any actual problems on 24. Is there a known workaround? .. For now, the workaround is to not enable the RAM above 4GB. Your kernel .config file should therefore have these two lines: CONFIG_HIGHMEM4G=y # CONFIG_HIGHMEM64G is not set Later, once the issue is fixed at the driver level (soon), you can get your high memory back again by enabling CONFIG_HIGHMEM64G, though this will cost a few percent of performance in the extra page table overhead it creates. Cheers -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
The SCSI error reporting really ought to include a simple interpretation of the error for end users (The drive doesn't support this command A sector's data got lost The drive timed out The drive failed The drive is entirely gone). There's too much similarity between the message you get when you try a SMART test that doesn't apply to the drive and what you get when the drive is broken. That would be the SCSI verbose messages option. I think the Eric Youngdale consortium added it about Linux 1.2. Nowdays its always built that way. And it's possible that the error recovery is suboptimal in some cases. It seems to like resetting drives too much; perhaps if it keeps seeing the same problem and resetting the drive, it should decide that the drive's error reporting is just bad and just ignore that error like the old IDE did (but, in this case, after saying what it's doing). Nothing like casually praying the users data hasn't gone for a walk is there. If we don't act on them the users don't report them until something really bad occurs so that isn't an option. Alan -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Tuesday 29 January 2008, Alan Cox wrote: That could stand to be moved or renamed, it is well buried in the menu for the REAL scsi stuffs, which I don't have any of. Yes you do - USB storage and ATAPI are SCSI By the linux software definition maybe. But I've defined scsi as that which uses a 50 wire cable using 50 contact centronics connectors since the mid '70's, and which often needs a ready supply of nubile virgins to sacrifice to make it work, particularly with the old resistor pack terminations psu's whose 5 volt line is only 4.85 volts due to old age. That's what I call REAL scsi. Its also a REAL PITA if the terms aren't active. You can call what you are doing 'scsi' because you are using much the same command structure, and that is good, but its not the real thing with all its hardware warts and/or capabilities. For one thing, this version usually works. :) Furinstance, you can tell 2 scsi devices on the same controller to talk to each other, moving files from one to the other, and the host controller can then goto sleep the cpu isn't involved until the devices send it a wakeup to advise the controller that the transfer has been done, and the controller may or may not then interrupt and advise the cpu. You can do that with separate controllers too as long as they have a compatible DMA channel available to both. I doubt libata has that capability now, or ever will, cuz these ide/atapi devices are generally dumber than rocks about that. But any device claiming to be scsi-II is supposed to be able to do those sorts of things while the cpu is off crunching numbers for BOINC or whatever. But that puts my mild objections to classifying this as 'scsi' in a more understandable context. :-) -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) When some people decide it's time for everyone to make big changes, it means that they want you to change first. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
By the linux software definition maybe. But I've defined scsi as that which uses a 50 wire cable using 50 contact centronics connectors since the mid '70's, and which often needs a ready supply of nubile virgins t 25, 50 or 68, with multiple voltage levels, plus of course it might be over fibre or copper FC loop and .. SCSI is a protocol. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: Problem with ata layer in 2.6.24
From: [EMAIL PROTECTED] if this works then it really needs to move and be renamed. I am compiling with DEV_SR set. That fixed me right up, Adam, k3b is once again as happy as a clam. Fixed it for me too. I just realized the default config in 2.6.24 is way different than the default config in 2.6.23. If I remember correctly there was talk of separating the libata and scsi code. This was awhile ago. I am not a kernel programmer, only a user, but either the scsi and libata kconfig menus should be joined and made generic, or options like cdrom support should be in both kconfig menus. Alan says libata is scsi with an accent so maybe merging the two isn't as bad as it sounds. Just my $0.02 cents, probably worth less in this case. Adam _ Connect and share in new ways with Windows Live. http://www.windowslive.com/share.html?ocid=TXT_TAGHM_Wave2_sharelife_012008-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: I doubt libata has that capability now, or ever will, cuz these ide/atapi devices are generally dumber than rocks about that. But any device claiming to be scsi-II is supposed to be able to do those sorts of things while the cpu is off crunching numbers for BOINC or whatever. .. The CD/DVD drives all all MMC devices internally, which means they speak a SCSI command protocol. Regardless of the electrical or optical interface. Linux is software, and the software protocol is exactly the same for them, no matter what the cable/bus type happens to be. Cheers -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Mon, 28 Jan 2008 14:13:21 -0500 Gene Heskett <[EMAIL PROTECTED]> wrote: > >> I had to reboot early this morning due to a freezeup, and I had a > >> bunch of these in the messages log: > >> == > >> Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 > >> SAct 0x0 SErr 0x0 action 0x2 frozen Jan 27 19:42:11 coyote kernel: > >> [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma > >> 4096 out Jan 27 19:42:11 coyote kernel: [42461.915974] res > >> 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 27 19:42:11 > >> coyote kernel: [42461.915978] ata1.00: status: { DRDY } Jan 27 19:42:11 > >> coyote kernel: [42461.916005] ata1: soft resetting link Jan 27 19:42:12 > >> coyote kernel: [42462.078216] ata1.00: configured for UDMA/100 Jan 27 > >> 19:42:12 coyote kernel: [42462.078232] ata1: EH complete > >> Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 > >> 512-byte hardware sectors (200050 MB) Jan 27 19:42:12 coyote kernel: > >> [42462.114230] sd 0:0:0:0: [sda] Write Protect is off Jan 27 19:42:12 > >> coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read > >> cache: enabled, doesn't support DPO or FUA > >> === I had this error too, or maybe only a similar one, and another, neither of which of i still have the error output laying around, so I'm posting both fixes, that i found here on lkml: 1) disabling ncq like that: "echo 1 > /sys/block/sda/device/queue_depth" 2) this patch: libata_drain_fifo_on_stuck_drq_hsm.patch ( applies to 2.6.24 too ) Signed-off-by: Mark Lord <[EMAIL PROTECTED]> --- --- old/drivers/ata/libata-sff.c2007-09-28 09:29:22.0 -0400 +++ linux/drivers/ata/libata-sff.c 2007-09-28 09:39:44.0 -0400 @@ -420,6 +420,28 @@ ap->ops->irq_on(ap); } +static void ata_drain_fifo(struct ata_port *ap, struct ata_queued_cmd *qc) +{ + u8 stat = ata_chk_status(ap); + /* +* Try to clear stuck DRQ if necessary, +* by reading/discarding up to two sectors worth of data. +*/ + if ((stat & ATA_DRQ) && (!qc || qc->dma_dir != DMA_TO_DEVICE)) { + unsigned int i; + unsigned int limit = qc ? qc->sect_size : ATA_SECT_SIZE; + + printk(KERN_WARNING "Draining up to %u words from data FIFO.\n", + limit); + for (i = 0; i < limit ; ++i) { + ioread16(ap->ioaddr.data_addr); + if (!(ata_chk_status(ap) & ATA_DRQ)) + break; + } + printk(KERN_WARNING "Drained %u/%u words.\n", i, limit); + } +} + /** * ata_bmdma_drive_eh - Perform EH with given methods for BMDMA controller * @ap: port to handle error for @@ -476,7 +498,7 @@ } ata_altstatus(ap); - ata_chk_status(ap); + ata_drain_fifo(ap, qc); ap->ops->irq_clear(ap); spin_unlock_irqrestore(ap->lock, flags); - -- Florian Attenberger <[EMAIL PROTECTED]> pgpqZfRawkKTf.pgp Description: PGP signature
Re: Problem with ata layer in 2.6.24
On Mon, Jan 28, 2008 at 08:31:57PM -0500, Gene Heskett wrote: > > In my script, its one line: > mkinitrd -f initrd-$VER.img $VER && \ > > where $VER is the shell variable I edit to = the version number, located at > the top of the script. > > Unforch, its failing: > No module pata_amd found for kernel 2.6.24, aborting. mkinitrd is just a shell script. Even if its options, and there is a quite a number of these, do not allow to influence a choice of modules in a desired manner, it is pretty trivial to make yourself a custom version of it and just hardwire there a fixed list of modules to use instead of relying on general mechanisms which are trying hard to guess what you may need. That way your regular 'mkinitrd' will build something to boot with libata and 'mkinird.ide' will use IDE modules for that purpose using the same "core" kernel. If you are using distribution kernels, as opposed to your own configuration, it is quite likely that you will need to install 'kernel-devel' package and recompile and add required IDE modules yourself as those may be not provided. This is done the same way like for any other "external" module. Michal -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Mon, 2008-01-28 at 23:49 -0500, Gene Heskett wrote: > On Monday 28 January 2008, Kasper Sandberg wrote: > [...] > > > >I can invalidate this theory... > >i helped a guy on irc debug this problem, and he had ati. I tried having > >him stop using fglrx, and go to r300.. same problem, and same problem > >even with vesa.. :) > > > No Kasper, you are validating it, that it is not nvidia related, which is > what > I was also saying. yeah thats what i mean - i can invalidate the theory that all the affected boxes run nvidia. > > >also, i have this on my fileserver with .20, which doesent even run X, > >or module support in kernel :) > > That far back? Although ISTR I saw it happen once only when I was running > 2.6.18-somethingorother. Yes im afraid so.. i will now provide some complete details, as i feel they are relevant. the thing is, i run 6x300gb disks, IDE, in raid5. i have both an onboard via ide controller, and then i bought a promise pdc 202 new thingie. i had problem however.. after a bit of time, i would get DMA reset error thing, and it all kindof went NUTS. it was as if all data access were skewed, and as you might imagine, this made everything fail badly. i purchased an ITE based controller for the drives on the promise, but exactly the same thing happened. the errors i got was: hdf: dma_intr: bad DMA status (dma_stat=75) hdf: dma_intr: status=0x50 { DriveReady SeekComplete } ide: failed opcode was: unknown --- i then found new hope, when i heard that libata provided much better error handling, so i upgraded to .20. this made my box usable. the error happens once or twice a day, the disk led will turn on constantly, and all IO freezes for about half a minute, where it returns PROPERLY(thank you libata!). as far as i can tell, the only side effect is that i get those messages like described here, and flooded with on google. to put some timeline perspective into this. i believe it was in 2005 i assembled the system, and when i realized it was faulty, on old ide driver, i stopped using it - that miht have been in beginning of 2006. then for almost a year i werent using it, hoping to somehow fix it, but in january 2007 i think it was, atleast in the very beginning of 2007, i hit upon the idea of trying libata, and ever since the system has been running 24/7 - doing these errors around 2 times a day. i have multiple times reported my problems to lkml, but nothing has happened, i also tried to aproeach jgarzik direcly, but he was not interested. i really hope this can be solved now, its a huge problem my fileserver has an asus k8v motherboard, with via chipset (k8t880 i think it is, or something like it). currently using the promise controller again(strangely enough all the timeouts seems to happen here, and when the ITE was on, there, not the onboard one), in conjunction with the onboard via. > >> complaint. Again, fix the nv driver so it will run my screen & I'll be -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Kasper Sandberg wrote: [...] >> >We have no way of debugging that module, so please try 2.6.24 without it. >> >> Sorry, I can't do this and have a working machine. The nv driver has >> suffered bit rot or something since the FC2 days when it COULD run a 19" >> crt at 1600x1200, and will not drive this 20" wide screen lcd 1680x1050 >> monitor at more than 800x600, which is absolutely butt ugly fuzzy, looking >> like a jpg compressed to 10%. The system is not usable on a day to basis >> without the nvidia driver. >> >> Fix the nv driver so it will run this screen at its native resolution and >> I'll be glad to run it even if it won't run google earth, which I do use >> from time to time. Now, if in all the hits you can get from google on >> this, currently 14,800 just for 'exception Emask', apparently caused by a >> timeout, if 100% of the complainers are running nvidia drivers also, then >> I see a legit > >I can invalidate this theory... >i helped a guy on irc debug this problem, and he had ati. I tried having >him stop using fglrx, and go to r300.. same problem, and same problem >even with vesa.. :) > No Kasper, you are validating it, that it is not nvidia related, which is what I was also saying. >also, i have this on my fileserver with .20, which doesent even run X, >or module support in kernel :) That far back? Although ISTR I saw it happen once only when I was running 2.6.18-somethingorother. >> complaint. Again, fix the nv driver so it will run my screen & I'll be >> glad to switch. I can see the reason, sure, but the machine must be >> capable of doing its common day to day stuff, while using that driver, >> like running kde for kmail, and browsers that work. >> >> >If the problems persist, please try to capture a complete log from the >> >failing kernel -- the interesting bits are everything from initial boot >> >up to and including the first few errors. You may need to increase the >> >kernel's log buffer size if the log gets truncated >> > (CONFIG_LOG_BUF_SHIFT). >> >> If by log you mean /var/log/messages, I have several megabytes of those. >> If you mean a live dmesg capture taken right now, its attached. It >> contains several of these at the bottom. I long ago made the kernel log >> buffer bigger, cuz it couldn't even show the start immediately after the >> boot, and even the dump to syslog was truncated. >> >> >There are no pata_amd changes from 2.6.24-rc7 to 2.6.24 final. >> >> That is what I was afraid of. I've done some limited grepping in that >> branch of the kernel tree, and cannot seem to locate where this EH handler >> is being invoked from. >> >> There is 2 lines of interest in the dmesg: >> >> [0.00] Nvidia board detected. Ignoring ACPI timer override. >> [0.00] If you got timer trouble try acpi_use_timer_override >> >> But I have NDI what it means, kernel argument/xconfig option? >> >> I've also done some googling, and it appears this problem is fairly >> widespread since the switchover to libata was encouraged. A stock fedora >> F8 kernel suffers the same freezes and eventually locks up, but does it >> without the error messages being logged, it just freezes, feeling >> identical to this in the minutes before the total freeze. I've tried 2 of >> those too, but the newest one won't even run X. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) bureaucrat, n: A politician who has tenure. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Mon, 2008-01-28 at 11:35 -0500, Gene Heskett wrote: > On Monday 28 January 2008, Mikael Pettersson wrote: > >Gene Heskett writes: > > > On Monday 28 January 2008, Peter Zijlstra wrote: > > > >On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote: > > > >> 1. Wrong mailing list; use linux-ide (@vger) instead. > > > > > > > >What, and keep all us other interested people in the dark? > > > > > > As a test, I tried rebooting to the latest fedora kernel and found it > > > kills X, so I'm back to the second to last fedora version ATM, and the > > > third 'smartctl -t lng /dev/sda' in 24 hours is running now. The first > > > two completed with no errors. > > > > > > I've added the linux-ide list to refresh those people of the problem, > > > the logs are being spammed by this message stanza: > > > > > > Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask > > > 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 28 04:46:25 coyote kernel: > > > [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma > > > 176128 out Jan 28 04:46:25 coyote kernel: [26550.290029] res > > > 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 28 04:46:25 > > > coyote kernel: [26550.290032] ata1.00: status: { DRDY } Jan 28 04:46:25 > > > coyote kernel: [26550.290060] ata1: soft resetting link Jan 28 04:46:25 > > > coyote kernel: [26550.452301] ata1.00: configured for UDMA/100 Jan 28 > > > 04:46:25 coyote kernel: [26550.452318] ata1: EH complete > > > Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 > > > 512-byte hardware sectors (200050 MB) Jan 28 04:46:25 coyote kernel: > > > [26550.456151] sd 0:0:0:0: [sda] Write Protect is off Jan 28 04:46:25 > > > coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, > > > read cache: enabled, doesn't support DPO or FUA > > > >It's not obvious from this incomplete dmesg log what HW or driver > >is behind ata1, but if the 2.6.24-rc7 kernel matches the 2.6.24 one, > > > >it should be pata_amd driving a WDC disk: > > > [ 30.702887] pata_amd :00:09.0: version 0.3.10 > > > [ 30.703052] PCI: Setting latency timer of device :00:09.0 to 64 > > > [ 30.703188] scsi0 : pata_amd > > > [ 30.709313] scsi1 : pata_amd > > > [ 30.710076] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 > > > irq 14 [ 30.710079] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma > > > 0xf008 irq 15 [ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, > > > 15.05R15, max UDMA/100 [ 30.864756] ata1.00: 390721968 sectors, multi > > > 16: LBA48 > > > [ 30.871629] ata1.00: configured for UDMA/100 > > > >Unfortunately we also see: > > > [ 48.285456] nvidia: module license 'NVIDIA' taints kernel. > > > [ 48.549725] ACPI: PCI Interrupt :02:00.0[A] -> Link [APC4] -> GSI > > > 19 (level, high) -> IRQ 20 [ 48.550149] NVRM: loading NVIDIA UNIX x86 > > > Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007 > > > >We have no way of debugging that module, so please try 2.6.24 without it. > > Sorry, I can't do this and have a working machine. The nv driver has > suffered > bit rot or something since the FC2 days when it COULD run a 19" crt at > 1600x1200, and will not drive this 20" wide screen lcd 1680x1050 monitor at > more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg > compressed to 10%. The system is not usable on a day to basis without the > nvidia driver. > > Fix the nv driver so it will run this screen at its native resolution and > I'll > be glad to run it even if it won't run google earth, which I do use from time > to time. Now, if in all the hits you can get from google on this, currently > 14,800 just for 'exception Emask', apparently caused by a timeout, if 100% of > the complainers are running nvidia drivers also, then I see a legit I can invalidate this theory... i helped a guy on irc debug this problem, and he had ati. I tried having him stop using fglrx, and go to r300.. same problem, and same problem even with vesa.. :) also, i have this on my fileserver with .20, which doesent even run X, or module support in kernel :) > complaint. Again, fix the nv driver so it will run my screen & I'll be glad > to switch. I can see the reason, sure, but the machine must be capable of > doing its common day to day stuff, while using that driver, like running kde > for kmail, and browsers that work. > > >If the problems persist, please try to capture a complete log from the > >failing kernel -- the interesting bits are everything from initial boot > >up to and including the first few errors. You may need to increase the > >kernel's log buffer size if the log gets truncated (CONFIG_LOG_BUF_SHIFT). > > If by log you mean /var/log/messages, I have several megabytes of those. > If you mean a live dmesg capture taken right now, its attached. It contains > several of these at the bottom. I long ago made the kernel log buffer > bigger, cuz it couldn't even show the
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Mark Lord wrote: >Gene Heskett wrote: >.. > >> That's ok, dd seemed to do the job also. > >.. > >The two programs operate entirely differently from each other, >so it may still be worth trying the make_bad_sector utility there. > >dd goes through the regular kernel I/O calls, >whereas make_bad_sector sends raw ATA commands >directly (more or less) to the drive. > Humm, if it (the sector error) continues. I'm rather convinced that was a one time transient item caused by doing so many hardware resets. It has not repeated in subsequent stanzas of this error. Several times it went away while the drives long self test was in progress, and the resets that go with the reboot, or one of these errors seems to stop the long test, which from my reading, should resume with no delay, but maybe that only applies to a powerdown restart, which I haven't been doing. The last such error was about 11 hours ago now. I just started another long test, which if ok, should clear the stuff its showing now because the test was interrupted. It has passed that test twice before in the last 36 hours. Thanks Mark. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) You are a fluke of the universe; you have no right to be here. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: On Monday 28 January 2008, Gene Heskett wrote: On Monday 28 January 2008, Robert Hancock wrote: [...] Check the /etc/modprobe.conf file, a lot of distributions use this to generate the initrd. If there's references to pata_amd it'll try and include it. Bingo! Thanks Robert, I'll try it again with that line commented. I wasn't aware of that connection at all. Yup, it worked, I feel a reboot coming on. :) But it didn't work, apparently commenting that line out needs to be balanced by adding another line telling it amd74xx is the 'hostadapter', not necessarily scsi. Can this be made more universal so I don't have to edit /etc/modprobe.conf? .. You could really do it like Linus (and me), and not bother with modules for critical services like hard disks. Just build them *into* the core kernel (select "y" or "checkmark" rather than "m" or "dot" for modules). This eliminates a ton of crap that can fail, and may also make your kernel a micro-MIP faster (core memory is often mapped without page table entries, whereas loaded modules use page tables.. slower, slightly). Linus just edits the /boot/grub/menu.lst, and clones an existing boot entry for the new kernel, editing the "kernel" line to match the name of the file that got installed in /boot by "make install" (from the kernel directory). He just leaves the ramdisk/initrd line as-was --> wrong version, but that's okay. I totally get rid of them here, but that requires hardcoding the root=/dev/ part on the "kernel" line. No big deal, it works just fine that way. Cheers -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: .. That's ok, dd seemed to do the job also. .. The two programs operate entirely differently from each other, so it may still be worth trying the make_bad_sector utility there. dd goes through the regular kernel I/O calls, whereas make_bad_sector sends raw ATA commands directly (more or less) to the drive. -ml -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Gene Heskett wrote: >On Monday 28 January 2008, Robert Hancock wrote: >[...] > >>Check the /etc/modprobe.conf file, a lot of distributions use this to >>generate the initrd. If there's references to pata_amd it'll try and >>include it. > >Bingo! Thanks Robert, I'll try it again with that line commented. I wasn't >aware of that connection at all. Yup, it worked, I feel a reboot coming >on. :) But it didn't work, apparently commenting that line out needs to be balanced by adding another line telling it amd74xx is the 'hostadapter', not necessarily scsi. Can this be made more universal so I don't have to edit /etc/modprobe.conf? Thanks. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Because we don't think about future generations, they will never forget us. -- Henrik Tikkanen -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Mon, 28 Jan 2008, Gene Heskett wrote: > On Monday 28 January 2008, Daniel Barkalow wrote: > >On Mon, 28 Jan 2008, Gene Heskett wrote: > >> On Monday 28 January 2008, Daniel Barkalow wrote: > >> >Building this and installing it along with the appropriate initrd (which > >> >might be handled by Fedora's install scripts) > >> > >> Or mine, which I've been using for years. > > > >You're ahead of a surprising number of people, including me, if you > >understand making initrds. > > In my script, its one line: > mkinitrd -f initrd-$VER.img $VER && \ > > where $VER is the shell variable I edit to = the version number, located at > the top of the script. > > Unforch, its failing: > No module pata_amd found for kernel 2.6.24, aborting. > > This is with pata_amd turned off and its counterpart under ATA/RLL/etc turned > on. So something is still dependent on it. That looks like something in the guts of the initrd; it probably thinks you need pata_amd and it's unhappy that you don't have it. Actually, another thing to try is making the ATA/etc one be "y" and pata_amd be "m". Most likely, this should lead to the ATA one claiming the drive before the module is loaded (but the module would be loaded later, to avoid upsetting the initrd); you should be able to tell from dmesg (or /dev, for that matter) which one got it, and I think built-in drivers will claim everything they can before an initrd gets loaded. > I do have one sata drive, on an accessory card in the box, so I need the > rest of the sata_sil and friends stuff. Assuming it isn't picking up your hard drive, which it isn't, that shouldn't matter. -Daniel *This .sig left intentionally blank* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Robert Hancock wrote: [...] >Check the /etc/modprobe.conf file, a lot of distributions use this to >generate the initrd. If there's references to pata_amd it'll try and >include it. Bingo! Thanks Robert, I'll try it again with that line commented. I wasn't aware of that connection at all. Yup, it worked, I feel a reboot coming on. :) -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) If everything seems to be going well, you have obviously overlooked something. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Daniel Barkalow wrote: >On Mon, 28 Jan 2008, Gene Heskett wrote: >> On Monday 28 January 2008, Daniel Barkalow wrote: >> >Building this and installing it along with the appropriate initrd (which >> >might be handled by Fedora's install scripts) >> >> Or mine, which I've been using for years. > >You're ahead of a surprising number of people, including me, if you >understand making initrds. In my script, its one line: mkinitrd -f initrd-$VER.img $VER && \ where $VER is the shell variable I edit to = the version number, located at the top of the script. Unforch, its failing: No module pata_amd found for kernel 2.6.24, aborting. This is with pata_amd turned off and its counterpart under ATA/RLL/etc turned on. So something is still dependent on it. I do have one sata drive, on an accessory card in the box, so I need the rest of the sata_sil and friends stuff. Its my virtual tapes for amanda. Also home built, the amanda security model cannot be successfully bent into the shape of an rpm. They BTW are #2 on coverity's list of most secure software. So I've rebuilt 2.6.24 as it originally was, and added the acpi timer line to the 2.6.24-rc8 stanza's kernel argument list. It will boot one or the other when I next reboot. Its been about 8 hours since the last error was logged, which is totally weirdsville to this old fart. Phase of the moon maybe? The visit to the sawbones to see about my heart? They are going to fit me with a 30 day recorder tomorrow, my skip a beat problem is getting worse. The sort of stuff that goes with the 7nth decade I guess. Officially, I'm wearing out me, too much sugar, too many times nearly electrocuted=shingles yadda yadda. :-) Oh, and don't forget Arther, he moved in uninvited about 25 years ago too. Those people that talk about the golden years? They're full of excrement... >> >will either get you back to >> >old IDE or will make your kernel panic on boot, depending on whether you >> >got it right (so make sure you can still boot the kernel you're sure of >> > or something from a boot disk). This will also cause your hard drives to >> > show up as different device nodes, so if your boot process doesn't mount >> > by disk uuid but by some other feature (and I don't know what Fedora >> > does), you'll also need to change it to something either stable across >> > access methods or which works for the one you're now using. >> >> It mounts by LABEL=. All of it. > >That'll save a huge amount of hassle. So long as you manage to get the >right drivers included and the wrong drivers not included, you should be >pretty much set. > >> Fedora is not the only people having trouble, name a distro, its probably >> someplace in that 14,800 hit google returns. > >Yeah, but they each may need different instructions, particularly if >they're not mounting by label in general, or not mounting the root >partition by label. That was the big hassle going the opposite direction. >And the procedure is 4 lines to describe to somebody who knows how to >build and install a new kernel for the distro, which is much shorter than >the explanation of how you generally build and install a kernel. A real >howto would have to explain where to get the distro's kernel sources and >default configuration, for example. > > -Daniel >*This .sig left intentionally blank* -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Never drink from your finger bowl -- it contains only water. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: On Monday 28 January 2008, Robert Hancock wrote: Gene Heskett wrote: And so far no one has tried to comment on those 2 dmesg lines I've quoted a couple of times now, here's another: [0.00] Nvidia board detected. Ignoring ACPI timer override. [0.00] If you got timer trouble try acpi_use_timer_override what the heck is that trying to tell me to do, in some sort of broken english? A lot of NVIDIA-chipset motherboards have BIOS problems where they include an incorrect ACPI interrupt override for the timer interrupt, which tends to cause the system to fail to boot due to the timer interrupt not working. The kernel normally ignores ACPI interrupt overrides on the timer interrupt for NVIDIA chipsets for this reason. Unfortunately on some such boards the override is actually correct and needed, and so this actually causes problems. Hence the acpi_use_timer_override option. In any case this is unlikely to have anything to do with your problem, since if that was messed up you likely would never have even booted. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ In this case, there seems to be a buglet. I turned on the nvidia/amd drives under the ATA section of the menu, and turned off the pata_amd under the sata menu in xconfig. But I've tried twice now and it fails to build the initrd because the pata_amd module is on the missing list. Of course its missing, I didn't have it built... Next? Check the /etc/modprobe.conf file, a lot of distributions use this to generate the initrd. If there's references to pata_amd it'll try and include it. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Robert Hancock wrote: >Gene Heskett wrote: >> And so far no one has tried to comment on those 2 dmesg lines I've quoted >> a couple of times now, here's another: >> [0.00] Nvidia board detected. Ignoring ACPI timer override. >> [0.00] If you got timer trouble try acpi_use_timer_override >> what the heck is that trying to tell me to do, in some sort of broken >> english? > >A lot of NVIDIA-chipset motherboards have BIOS problems where they >include an incorrect ACPI interrupt override for the timer interrupt, >which tends to cause the system to fail to boot due to the timer >interrupt not working. The kernel normally ignores ACPI interrupt >overrides on the timer interrupt for NVIDIA chipsets for this reason. >Unfortunately on some such boards the override is actually correct and >needed, and so this actually causes problems. Hence the >acpi_use_timer_override option. > >In any case this is unlikely to have anything to do with your problem, >since if that was messed up you likely would never have even booted. >-- >To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >the body of a message to [EMAIL PROTECTED] >More majordomo info at http://vger.kernel.org/majordomo-info.html >Please read the FAQ at http://www.tux.org/lkml/ In this case, there seems to be a buglet. I turned on the nvidia/amd drives under the ATA section of the menu, and turned off the pata_amd under the sata menu in xconfig. But I've tried twice now and it fails to build the initrd because the pata_amd module is on the missing list. Of course its missing, I didn't have it built... Next? -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Of course it's possible to love a human being if you don't know them too well. -- Charles Bukowski -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Mon, 28 Jan 2008, Gene Heskett wrote: > On Monday 28 January 2008, Daniel Barkalow wrote: > >Building this and installing it along with the appropriate initrd (which > >might be handled by Fedora's install scripts) > > Or mine, which I've been using for years. You're ahead of a surprising number of people, including me, if you understand making initrds. > >will either get you back to > >old IDE or will make your kernel panic on boot, depending on whether you > >got it right (so make sure you can still boot the kernel you're sure of or > >something from a boot disk). This will also cause your hard drives to show > >up as different device nodes, so if your boot process doesn't mount by > >disk uuid but by some other feature (and I don't know what Fedora does), > >you'll also need to change it to something either stable across access > >methods or which works for the one you're now using. > > It mounts by LABEL=. All of it. That'll save a huge amount of hassle. So long as you manage to get the right drivers included and the wrong drivers not included, you should be pretty much set. > Fedora is not the only people having trouble, name a distro, its probably > someplace in that 14,800 hit google returns. Yeah, but they each may need different instructions, particularly if they're not mounting by label in general, or not mounting the root partition by label. That was the big hassle going the opposite direction. And the procedure is 4 lines to describe to somebody who knows how to build and install a new kernel for the distro, which is much shorter than the explanation of how you generally build and install a kernel. A real howto would have to explain where to get the distro's kernel sources and default configuration, for example. -Daniel *This .sig left intentionally blank* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: And so far no one has tried to comment on those 2 dmesg lines I've quoted a couple of times now, here's another: [0.00] Nvidia board detected. Ignoring ACPI timer override. [0.00] If you got timer trouble try acpi_use_timer_override what the heck is that trying to tell me to do, in some sort of broken english? A lot of NVIDIA-chipset motherboards have BIOS problems where they include an incorrect ACPI interrupt override for the timer interrupt, which tends to cause the system to fail to boot due to the timer interrupt not working. The kernel normally ignores ACPI interrupt overrides on the timer interrupt for NVIDIA chipsets for this reason. Unfortunately on some such boards the override is actually correct and needed, and so this actually causes problems. Hence the acpi_use_timer_override option. In any case this is unlikely to have anything to do with your problem, since if that was messed up you likely would never have even booted. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Mark Lord wrote: >Gene Heskett wrote: >>.. >> And so far no one has tried to comment on those 2 dmesg lines I've quoted >> a couple of times now, here's another: >> [0.00] Nvidia board detected. Ignoring ACPI timer override. >> [0.00] If you got timer trouble try acpi_use_timer_override >> what the heck is that trying to tell me to do, in some sort of broken >> english? > >.. > >I think it says this: > > "If your system is misbehaving, then try adding the > acpi_use_timer_override keyword to your kernel command line > (/boot/grub/menu.lst) and see if it helps." > >So, you can either hardcode it in /boot/grub/menu.lst (just add it to the > end of the first line you see there that begins with the word "kernel". > >Or you can just try it temporarily at boot time (safer, but tricker), >by catching GRUB (the bootloader) before it actually loads Linux. > >Usually there's some key or something it says you have 3 seconds to hit for > a "menu", so do that, and then use the cursor keys to find the first > "kernel" line in that menu and hit "e" (edit) to go and add the > acpi_use_timer_override keyword to the end of that line (same as above). > >Hit enter when done, and then the letter b (boot) to load Linux with that > option. > >Clear as mud, right? :) Precisely Mark. Thanks, I'm building an ide-ata kernel 2.6.24 now, and I've added that to the argument line for 2.6.24-rc8. Thanks mark. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Of all men's miseries, the bitterest is this: to know so much and have control over nothing. -- Herodotus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Mark Lord wrote: >Gene Heskett wrote: >> On Monday 28 January 2008, Mark Lord wrote: >>.. >> >>> Another way is to use the "make_bad_sector" utility that >>> is included in the source tarball for hdparm-7.7, as follows: >>> >>> make_bad_sector --readback /dev/sda 474507 >> >> Apparently not in the rpm, darnit. > >.. > >That's okay. It should still be in the SRPM source file. >And it's a tiny download from sourceforge.net: > >http://sourceforge.net/search/?type_of_search=soft_of_search=soft >=hdparm > >Cheers That's ok, dd seemed to do the job also. Thanks Mark. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Those who do not understand Unix are condemned to reinvent it, poorly. -- Henry Spencer -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Daniel Barkalow wrote: >On Mon, 28 Jan 2008, Richard Heck wrote: >> Daniel Barkalow wrote: >> > Can you switch back to old IDE to get your work done (and to make sure >> > it's not a hardware issue that's developed recently)? >> >> I think it'd be really, REALLY helpful to a lot of people if you, or >> someone, could explain in moderate detail how this might be done. I tried >> doing it myself, but I'm not sufficiently expert at configuring kernels >> that I was ever able to figure out how to do it. > >As far as configuring the kernel, I can help: > >Go to Device Drivers, ATA/ATAPI/MFM/RLL support, and turn on anything that >looks relevant; go to Device Drivers, Serial ATA and Parallel ATA drivers, >and turn off anything that's PATA and looks relevant. > Done. >(Whether a device uses IDE or PATA depends on which driver that supports >the device is present and find it first, not on any sort of global >configuration, which is probably what tripped you up) > >Building this and installing it along with the appropriate initrd (which >might be handled by Fedora's install scripts) Or mine, which I've been using for years. >will either get you back to >old IDE or will make your kernel panic on boot, depending on whether you >got it right (so make sure you can still boot the kernel you're sure of or >something from a boot disk). This will also cause your hard drives to show >up as different device nodes, so if your boot process doesn't mount by >disk uuid but by some other feature (and I don't know what Fedora does), >you'll also need to change it to something either stable across access >methods or which works for the one you're now using. It mounts by LABEL=. All of it. >> Obviously, the short version is: switch back to Fedora 6. But this kind of >> problem with libata---and yes, you're almost surely right that it's not >> one problem but lots---is sufficiently widespread that a Mini HOWTO, say, >> would be really welcome and, I'm guessing, widely used. > >Fedora really ought to provide documentation, because there's some >distro-specific stuff (like how you deal with the kernel's device node for >the root partition changing), and they're using code by default that's at >least somewhat documented as experimental (although it doesn't seem to be >actually marked as experimental in all cases). Fedora is not the only people having trouble, name a distro, its probably someplace in that 14,800 hit google returns. > -Daniel >*This .sig left intentionally blank* Thanks Daniel, try #1 is building now. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) Those who do not understand Unix are condemned to reinvent it, poorly. -- Henry Spencer -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: On Monday 28 January 2008, Mark Lord wrote: .. Another way is to use the "make_bad_sector" utility that is included in the source tarball for hdparm-7.7, as follows: make_bad_sector --readback /dev/sda 474507 Apparently not in the rpm, darnit. .. That's okay. It should still be in the SRPM source file. And it's a tiny download from sourceforge.net: http://sourceforge.net/search/?type_of_search=soft_of_search=soft=hdparm Cheers -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Mark Lord wrote: Gene Heskett wrote: .. And so far no one has tried to comment on those 2 dmesg lines I've quoted a couple of times now, here's another: [0.00] Nvidia board detected. Ignoring ACPI timer override. [0.00] If you got timer trouble try acpi_use_timer_override what the heck is that trying to tell me to do, in some sort of broken english? .. I think it says this: "If your system is misbehaving, then try adding the acpi_use_timer_override keyword to your kernel command line (/boot/grub/menu.lst) and see if it helps." So, you can either hardcode it in /boot/grub/menu.lst (just add it to the end of the first line you see there that begins with the word "kernel". Or you can just try it temporarily at boot time (safer, but tricker), by catching GRUB (the bootloader) before it actually loads Linux. Usually there's some key or something it says you have 3 seconds to hit for a "menu", so do that, and then use the cursor keys to find the first "kernel" line in that menu and hit "e" (edit) to go and add the acpi_use_timer_override keyword to the end of that line (same as above). .. Minor correction (having just tried it here): once you see the GRUB (boot) menu, hit the letter e to edit the first entry, then scroll to the "kernel" line, and hit the letter e again to edit that line. It should put you at the end of the line, where you can just type a space and then acpi_use_timer_override and then hit enter to finish the (temporary) edit. Then hit b for boot. -ml -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: .. And so far no one has tried to comment on those 2 dmesg lines I've quoted a couple of times now, here's another: [0.00] Nvidia board detected. Ignoring ACPI timer override. [0.00] If you got timer trouble try acpi_use_timer_override what the heck is that trying to tell me to do, in some sort of broken english? .. I think it says this: "If your system is misbehaving, then try adding the acpi_use_timer_override keyword to your kernel command line (/boot/grub/menu.lst) and see if it helps." So, you can either hardcode it in /boot/grub/menu.lst (just add it to the end of the first line you see there that begins with the word "kernel". Or you can just try it temporarily at boot time (safer, but tricker), by catching GRUB (the bootloader) before it actually loads Linux. Usually there's some key or something it says you have 3 seconds to hit for a "menu", so do that, and then use the cursor keys to find the first "kernel" line in that menu and hit "e" (edit) to go and add the acpi_use_timer_override keyword to the end of that line (same as above). Hit enter when done, and then the letter b (boot) to load Linux with that option. Clear as mud, right? :) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Mon, 28 Jan 2008, Richard Heck wrote: > Daniel Barkalow wrote: > > Can you switch back to old IDE to get your work done (and to make sure it's > > not a hardware issue that's developed recently)? > I think it'd be really, REALLY helpful to a lot of people if you, or someone, > could explain in moderate detail how this might be done. I tried doing it > myself, but I'm not sufficiently expert at configuring kernels that I was ever > able to figure out how to do it. As far as configuring the kernel, I can help: Go to Device Drivers, ATA/ATAPI/MFM/RLL support, and turn on anything that looks relevant; go to Device Drivers, Serial ATA and Parallel ATA drivers, and turn off anything that's PATA and looks relevant. (Whether a device uses IDE or PATA depends on which driver that supports the device is present and find it first, not on any sort of global configuration, which is probably what tripped you up) Building this and installing it along with the appropriate initrd (which might be handled by Fedora's install scripts) will either get you back to old IDE or will make your kernel panic on boot, depending on whether you got it right (so make sure you can still boot the kernel you're sure of or something from a boot disk). This will also cause your hard drives to show up as different device nodes, so if your boot process doesn't mount by disk uuid but by some other feature (and I don't know what Fedora does), you'll also need to change it to something either stable across access methods or which works for the one you're now using. > Obviously, the short version is: switch back to Fedora 6. But this kind of > problem with libata---and yes, you're almost surely right that it's not one > problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be > really welcome and, I'm guessing, widely used. Fedora really ought to provide documentation, because there's some distro-specific stuff (like how you deal with the kernel's device node for the root partition changing), and they're using code by default that's at least somewhat documented as experimental (although it doesn't seem to be actually marked as experimental in all cases). -Daniel *This .sig left intentionally blank* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Andrey Borzenkov wrote: Richard Heck wrote: Daniel Barkalow wrote: Can you switch back to old IDE to get your work done (and to make sure it's not a hardware issue that's developed recently)? I think it'd be really, REALLY helpful to a lot of people if you, or someone, could explain in moderate detail how this might be done. I tried doing it myself, but I'm not sufficiently expert at configuring kernels that I was ever able to figure out how to do it. well, here on Mandriva I 1) compile both IDE and libata as modules 2) create initrd that contains either IDE or libata modules 3) use labels for file system mounts, swaps and resume device. Now 1) should be pretty straightforward (I could send you config if you like, it is stripped down to bare minimum on my system, you will have to check drivers for your hardware). 2 and 3 are obviously distribution dependent. I can explain how to do it on Mandriva that ATM has near to perfect support for addressing devices via label/UUID; also ide/scsi/ata switch is trivial using Mandriva mkinitrd. Thanks for this. Compiling the IDE stuff as a module is indeed the easy part, though I suppose I need to make sure I get the right drivers for my chipset, too. Loading e.g. the Fedora 6 LiveCD and then lsmod'ing should do it, though. Labels are used by default in Fedora now, so that's fine, too. Getting mkinitrd to work right shouldn't be too bad, either. So I'll have a go at this when I get some time and report on it. What might be REALLY helpful to people would be if we Fedora types could produce a modified kernel rpm that would handle thisthough, I should say, I've also seen a lot of complaints along these same lines on Ubuntu. Richard -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Andrey Borzenkov wrote: >On Monday 28 January 2008, Gene Heskett wrote: >> On Monday 28 January 2008, Andrey Borzenkov wrote: >> >Richard Heck wrote: >> >> Daniel Barkalow wrote: >> >>> Can you switch back to old IDE to get your work done (and to make sure >> >>> it's not a hardware issue that's developed recently)? >> >> >> >> I think it'd be really, REALLY helpful to a lot of people if you, or >> >> someone, could explain in moderate detail how this might be done. I >> >> tried doing it myself, but I'm not sufficiently expert at configuring >> >> kernels that I was ever able to figure out how to do it. >> > >> >well, here on Mandriva I >> > >> >1) compile both IDE and libata as modules >> >2) create initrd that contains either IDE or libata modules >> >3) use labels for file system mounts, swaps and resume device. >> > >> > >> >Now 1) should be pretty straightforward (I could send you config if you >> >like, it is stripped down to bare minimum on my system, you will have to >> >check drivers for your hardware). 2 and 3 are obviously distribution >> >dependent. I can explain how to do it on Mandriva that ATM has near to >> >perfect support for addressing devices via label/UUID; also ide/scsi/ata >> >switch is trivial using Mandriva mkinitrd. >> >> I already build as modules, and it would be relatively easy to make 2 boot >> stanza's that used the different initrd's if there were examples that >> could be used as 'excludes' when building the initrd's. Is such a >> creature breedable? > >I am not sure I understand a question (it is not my native language) but > here I simply do > >mkinitrd --omit-ide-modules --preload pata_ali --preload sd_mod ... > >or > >mkinitrd --omit-scsi-modules --preload alim15x3 --preload ide-disk ... > This looks doable, thanks. I was trying to be cute above when I'm rather frustrated by all this. I might have to fiddle a bit but I got the idea. OTOH, I and about 15,000 others according to google, would be everlastingly gratefull if it was just fixed. :) Thanks >If you ask how --omit part is implemented I happily send you mkinitrd > script. -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) If everybody minded their own business, the world would go around a deal faster. -- The Duchess, "Through the Looking Glass" -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Gene Heskett wrote: > On Monday 28 January 2008, Andrey Borzenkov wrote: > >Richard Heck wrote: > >> Daniel Barkalow wrote: > >>> Can you switch back to old IDE to get your work done (and to make sure > >>> it's not a hardware issue that's developed recently)? > >> > >> I think it'd be really, REALLY helpful to a lot of people if you, or > >> someone, could explain in moderate detail how this might be done. I > >> tried doing it myself, but I'm not sufficiently expert at configuring > >> kernels that I was ever able to figure out how to do it. > > > >well, here on Mandriva I > > > >1) compile both IDE and libata as modules > >2) create initrd that contains either IDE or libata modules > >3) use labels for file system mounts, swaps and resume device. > > > > > >Now 1) should be pretty straightforward (I could send you config if you > >like, it is stripped down to bare minimum on my system, you will have to > >check drivers for your hardware). 2 and 3 are obviously distribution > >dependent. I can explain how to do it on Mandriva that ATM has near to > >perfect support for addressing devices via label/UUID; also ide/scsi/ata > >switch is trivial using Mandriva mkinitrd. > > > > I already build as modules, and it would be relatively easy to make 2 boot > stanza's that used the different initrd's if there were examples that could > be used as 'excludes' when building the initrd's. Is such a creature > breedable? > I am not sure I understand a question (it is not my native language) but here I simply do mkinitrd --omit-ide-modules --preload pata_ali --preload sd_mod ... or mkinitrd --omit-scsi-modules --preload alim15x3 --preload ide-disk ... If you ask how --omit part is implemented I happily send you mkinitrd script. signature.asc Description: This is a digitally signed message part.
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Jeff Garzik wrote: >Gene Heskett wrote: >> Greeting; >> >> I had to reboot early this morning due to a freezeup, and I had a >> bunch of these in the messages log: >> == >> Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 >> SAct 0x0 SErr 0x0 action 0x2 frozen Jan 27 19:42:11 coyote kernel: >> [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma >> 4096 out Jan 27 19:42:11 coyote kernel: [42461.915974] res >> 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 27 19:42:11 >> coyote kernel: [42461.915978] ata1.00: status: { DRDY } Jan 27 19:42:11 >> coyote kernel: [42461.916005] ata1: soft resetting link Jan 27 19:42:12 >> coyote kernel: [42462.078216] ata1.00: configured for UDMA/100 Jan 27 >> 19:42:12 coyote kernel: [42462.078232] ata1: EH complete >> Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 >> 512-byte hardware sectors (200050 MB) Jan 27 19:42:12 coyote kernel: >> [42462.114230] sd 0:0:0:0: [sda] Write Protect is off Jan 27 19:42:12 >> coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read >> cache: enabled, doesn't support DPO or FUA >> === >> That one showed up about 2 hours ago, so I expect I'll be locked >> up again before I've managed a 24 hour uptime. This drive passed >> a 'smartctl -t long /dev/sda' with flying colors after the reboot >> this morning. >> >> Two instances were logged after I had rebooted to 2.6.24 from 2.6.24-rc8: >> >> Jan 24 20:46:33 coyote kernel: [0.00] Linux version 2.6.24 >> ([EMAIL PROTECTED]) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) >> #1 SMP Thu Jan 24 20:17:55 EST 2008 >> >> Jan 27 02:28:29 coyote kernel: [193207.445158] ata1.00: exception Emask >> 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 27 02:28:29 coyote kernel: >> [193207.445170] ata1.00: cmd 35/00:08:f9:24:0a/00:00:17:00:00/e0 tag 0 dma >> 4096 out Jan 27 02:28:29 coyote kernel: [193207.445172] res >> 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 27 02:28:29 >> coyote kernel: [193207.445175] ata1.00: status: { DRDY } Jan 27 02:28:29 >> coyote kernel: [193207.445202] ata1: soft resetting link Jan 27 02:28:29 >> coyote kernel: [193207.607384] ata1.00: configured for UDMA/100 Jan 27 >> 02:28:29 coyote kernel: [193207.607399] ata1: EH complete >> Jan 27 02:28:29 coyote kernel: [193207.609681] sd 0:0:0:0: [sda] 390721968 >> 512-byte hardware sectors (200050 MB) Jan 27 02:28:29 coyote kernel: >> [193207.619277] sd 0:0:0:0: [sda] Write Protect is off Jan 27 02:28:29 >> coyote kernel: [193207.649041] sd 0:0:0:0: [sda] Write cache: enabled, >> read cache: enabled, doesn't support DPO or FUA >> Jan 27 02:30:06 coyote kernel: [193304.336929] ata1.00: exception Emask >> 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 27 02:30:06 coyote kernel: >> [193304.336940] ata1.00: cmd ca/00:20:69:22:a6/00:00:00:00:00/e7 tag 0 dma >> 16384 out Jan 27 02:30:06 coyote kernel: [193304.336942] res >> 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 27 02:30:06 >> coyote kernel: [193304.336945] ata1.00: status: { DRDY } Jan 27 02:30:06 >> coyote kernel: [193304.336972] ata1: soft resetting link Jan 27 02:30:06 >> coyote kernel: [193304.499210] ata1.00: configured for UDMA/100 Jan 27 >> 02:30:06 coyote kernel: [193304.499226] ata1: EH complete >> Jan 27 02:30:06 coyote kernel: [193304.499714] sd 0:0:0:0: [sda] 390721968 >> 512-byte hardware sectors (200050 MB) Jan 27 02:30:06 coyote kernel: >> [193304.499857] sd 0:0:0:0: [sda] Write Protect is off Jan 27 02:30:06 >> coyote kernel: [193304.502315] sd 0:0:0:0: [sda] Write cache: enabled, >> read cache: enabled, doesn't support DPO or FUA >> >> None were logged during the time I was running an -rc7 or -rc8. >> >> The previous hits on this resulted in the udma speed being downgraded >> till it was actually running in pio just before the freeze that >> required the hardware reset button. > >Unfortunately there are 1001 different causes for timeouts, so we need >to drill down into the hardware, libata version, and ACPI version (most >notably). > >> I'll reboot to -rc8 right now and resume. If its the drive, I should see >> it. If not, then 2.6.24 is where I'll point the finger. Both rc8 and rc7 do it. The fedora kernels do too, but without the error messages being logged, I assume they are an attempt to trace this? >There was also an ACPI update, which always affects interrupt handling >(whose symptom can sometimes be a timeout). I'm thinking Bingo!, please pay the man. See my posts asking about a couple of lines very early in the dmesg, asking for an english explanation no one has proffered as yet. >Definitely interesting in test results from what you describe. > > Jeff -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) It's no wonder they call it
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Andrey Borzenkov wrote: >Richard Heck wrote: >> Daniel Barkalow wrote: >>> Can you switch back to old IDE to get your work done (and to make sure >>> it's not a hardware issue that's developed recently)? >> >> I think it'd be really, REALLY helpful to a lot of people if you, or >> someone, could explain in moderate detail how this might be done. I >> tried doing it myself, but I'm not sufficiently expert at configuring >> kernels that I was ever able to figure out how to do it. > >well, here on Mandriva I > >1) compile both IDE and libata as modules >2) create initrd that contains either IDE or libata modules >3) use labels for file system mounts, swaps and resume device. > > >Now 1) should be pretty straightforward (I could send you config if you >like, it is stripped down to bare minimum on my system, you will have to >check drivers for your hardware). 2 and 3 are obviously distribution >dependent. I can explain how to do it on Mandriva that ATM has near to >perfect support for addressing devices via label/UUID; also ide/scsi/ata >switch is trivial using Mandriva mkinitrd. > >-andrey > >> Obviously, the short version is: switch back to Fedora 6. But this kind >> of problem with libata---and yes, you're almost surely right that it's >> not one problem but lots---is sufficiently widespread that a Mini HOWTO, >> say, would be really welcome and, I'm guessing, widely used. >> >> Richard I already build as modules, and it would be relatively easy to make 2 boot stanza's that used the different initrd's if there were examples that could be used as 'excludes' when building the initrd's. Is such a creature breedable? -- Cheers, Gene "There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order." -Ed Howdershelt (Author) It's no wonder they call it WinNT; WNT = VMS++; -- Chris Abbey % Peace, Love and Compile the kernel... -- Justin L. Herreman -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: Greeting; I had to reboot early this morning due to a freezeup, and I had a bunch of these in the messages log: == Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY } Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100 Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA === That one showed up about 2 hours ago, so I expect I'll be locked up again before I've managed a 24 hour uptime. This drive passed a 'smartctl -t long /dev/sda' with flying colors after the reboot this morning. Two instances were logged after I had rebooted to 2.6.24 from 2.6.24-rc8: Jan 24 20:46:33 coyote kernel: [0.00] Linux version 2.6.24 ([EMAIL PROTECTED]) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) #1 SMP Thu Jan 24 20:17:55 EST 2008 Jan 27 02:28:29 coyote kernel: [193207.445158] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 27 02:28:29 coyote kernel: [193207.445170] ata1.00: cmd 35/00:08:f9:24:0a/00:00:17:00:00/e0 tag 0 dma 4096 out Jan 27 02:28:29 coyote kernel: [193207.445172] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 27 02:28:29 coyote kernel: [193207.445175] ata1.00: status: { DRDY } Jan 27 02:28:29 coyote kernel: [193207.445202] ata1: soft resetting link Jan 27 02:28:29 coyote kernel: [193207.607384] ata1.00: configured for UDMA/100 Jan 27 02:28:29 coyote kernel: [193207.607399] ata1: EH complete Jan 27 02:28:29 coyote kernel: [193207.609681] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) Jan 27 02:28:29 coyote kernel: [193207.619277] sd 0:0:0:0: [sda] Write Protect is off Jan 27 02:28:29 coyote kernel: [193207.649041] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Jan 27 02:30:06 coyote kernel: [193304.336929] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 27 02:30:06 coyote kernel: [193304.336940] ata1.00: cmd ca/00:20:69:22:a6/00:00:00:00:00/e7 tag 0 dma 16384 out Jan 27 02:30:06 coyote kernel: [193304.336942] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 27 02:30:06 coyote kernel: [193304.336945] ata1.00: status: { DRDY } Jan 27 02:30:06 coyote kernel: [193304.336972] ata1: soft resetting link Jan 27 02:30:06 coyote kernel: [193304.499210] ata1.00: configured for UDMA/100 Jan 27 02:30:06 coyote kernel: [193304.499226] ata1: EH complete Jan 27 02:30:06 coyote kernel: [193304.499714] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) Jan 27 02:30:06 coyote kernel: [193304.499857] sd 0:0:0:0: [sda] Write Protect is off Jan 27 02:30:06 coyote kernel: [193304.502315] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA None were logged during the time I was running an -rc7 or -rc8. The previous hits on this resulted in the udma speed being downgraded till it was actually running in pio just before the freeze that required the hardware reset button. Unfortunately there are 1001 different causes for timeouts, so we need to drill down into the hardware, libata version, and ACPI version (most notably). I'll reboot to -rc8 right now and resume. If its the drive, I should see it. If not, then 2.6.24 is where I'll point the finger. There was also an ACPI update, which always affects interrupt handling (whose symptom can sometimes be a timeout). Definitely interesting in test results from what you describe. Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/