Re: libata pm
Hello,

On Mon, 28.01.2008, 01:17, Tejun Heo wrote:
> Hello,
>
> [EMAIL PROTECTED] wrote:
>> With one, two and three drives on the pm I got no errors, so I tried to
>> change the power supply. I got two power supplies for the 16 hard drives
>> and the second one (with all Maxtor drives and the first pm) was failing.
>> After changing the power supply all problems were gone. Sorry for the
>> confusion.
>
> It's amazing that, in SATA, the PSU-problem-verified /
> PSU-problem-suggested ratio is significant. I don't think this ever was
> the case with PATA. The SATA link seems much more susceptible to power
> quality and allows hooking up a whole lot of drives.
>
>> The only remaining issue is that some SATA links are only working at
>> 1.5 Gbps and I can't figure out why. (ata3,4,5,6.x are the Maxtor drives
>> and ata7,8,9,10.x are the Samsung ones)
>
> Hmm... That's the transfer rate the hardware negotiated with each other.
> SControl is telling the hardware that there's no speed limit and to go as
> high as it can, but somehow 3.0Gbps negotiation failed and the link speed
> is limited to 1.5Gbps for some of the ports. Is the result always the
> same? What happens if you swap drives between ports? Also, if you have
> an extra PSU lying around, can you hook it up with some of the drives so
> that the load is more distributed?
>
> Oh.. Another note on PSUs. I don't know whether this still holds for more
> recent ones, but mid-to-high range multi-lane PSUs sometimes have less
> juice available on the 12v rail for disks than low cost single lane PSUs.
> This is because the high power 12v lanes are allocated to power video
> cards and disks can only pull power from a (usually) single weak 12v lane.
>
> Thanks.
>
> --
> tejun

Shuffling the drives did not change the link speed of the 3 ports running at 1.5 Gbps. Looks like the problem is port-related.

The first PSU (Multilane) offers (measured) 12,20 V on both lanes. This falls during read/write access on all drives attached down to 12,10 V. The second PSU (Singlelane) offers (measured) 11,75 V.
This doesn't change during read/write access on all drives attached. I tested the second PSU without anything attached and it still offered 11,75 V.

I will try to add another PSU this evening and remeasure. Currently the whole machine needs about 350W. The PSUs are Targan 500W (Multilane - 2x12V 10A) and Noname 350W (Singlelane - 1x12V 10A) so this 'should' be enough...

Last night I copied the backup back to the first raid (the Maxtor drives) without any error, but during the transfer I got this on the second (mounted but unused) raid:

ata10.00: failed to read SCR 1 (Emask=0x40)
ata10.01: failed to read SCR 1 (Emask=0x40)
ata10.02: failed to read SCR 1 (Emask=0x40)
ata10.03: failed to read SCR 1 (Emask=0x40)
ata10.04: failed to read SCR 1 (Emask=0x40)
ata10.05: failed to read SCR 1 (Emask=0x40)
ata10.00: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
ata10.01: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
ata10.02: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
ata10.02: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 1 cdb 0x0 data 0
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata10.02: status: { DRDY }
ata10.03: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
ata10.04: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
ata10.05: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
ata10.15: hard resetting link
ata10.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
ata10.00: hard resetting link
ata10.00: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata10.01: hard resetting link
ata10.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata10.02: hard resetting link
ata10.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata10.03: hard resetting link
ata10.03: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata10.04: hard resetting link
ata10.04: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata10.05: hard resetting link
ata10.05: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata10.00: configured for UDMA/100
ata10.01: configured for UDMA/100
ata10.02: configured for UDMA/100
ata10.03: configured for UDMA/100
ata10.04: configured for UDMA/100
ata10: EH complete
sd 9:0:0:0: [sdp] 976773168 512-byte hardware sectors (500108 MB)
sd 9:0:0:0: [sdp] Write Protect is off
sd 9:0:0:0: [sdp] Mode Sense: 00 3a 00 00
sd 9:0:0:0: [sdp] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 9:1:0:0: [sdq] 976773168 512-byte hardware sectors (500108 MB)
sd 9:1:0:0: [sdq] Write Protect is off
sd 9:1:0:0: [sdq] Mode Sense: 00 3a 00 00
sd 9:1:0:0: [sdq] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 9:2:0:0: [sdr] 976773168 512-byte hardware sectors (500108 MB)
sd 9:2:0:0: [sdr] Write Protect is off
sd 9:2:0:0: [sdr] Mode Sense: 00 3a 00 00
sd 9:2:0:0: [sdr] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 9:3:0:0: [sds] 976773168 512-byte hardware sectors
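The SStatus and SControl numbers in the "SATA link up" lines above can be read field by field. The following is a minimal sketch, assuming the usual SATA register layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11) and that libata prints both registers in hexadecimal. The function names are made up for illustration and are not part of any kernel or library API.

```python
# Sketch only: decode the SStatus / SControl values printed in the log.
# Assumed layout (SATA SCR registers): bits 0-3 DET, bits 4-7 SPD, bits 8-11 IPM.

SPD_NAMES = {
    0: "none / no limit",   # SStatus: nothing negotiated; SControl: no speed limit
    1: "1.5 Gbps",
    2: "3.0 Gbps",
}


def decode_scr(value_hex: str) -> dict:
    """Split a register value (hex string as printed in the log) into its fields."""
    value = int(value_hex, 16)
    return {
        "det": value & 0xF,          # device detection / PHY state
        "spd": (value >> 4) & 0xF,   # negotiated speed (SStatus) or limit (SControl)
        "ipm": (value >> 8) & 0xF,   # interface power management field
    }


def link_speed(sstatus_hex: str) -> str:
    """Return the negotiated speed encoded in an SStatus value."""
    spd = decode_scr(sstatus_hex)["spd"]
    return SPD_NAMES.get(spd, "unknown (SPD=%d)" % spd)
```

With the values from the log, link_speed("123") gives "3.0 Gbps" and link_speed("113") gives "1.5 Gbps", while decode_scr("300") has spd 0, which matches the remark that SControl imposes no speed limit. The slower port therefore negotiated down on its own rather than being capped by the driver.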
Re: libata pm
Hello, Dusty.

[EMAIL PROTECTED] wrote:
> Shuffling the drives did not change anything to the linkspeed of the 3
> ports running with 1.5 Gbps. Looks like the problem is port-related.

Hmm... Okay. Maybe the signal traces or connectors have some problems. But 3.0Gbps on a downstream port doesn't make any difference anyway, so unless it leads to errors, it should be fine.

> The first PSU (Multilane) offers (measured) 12,20 V on both lanes. This
> falls during read/write access on all drives attached down to 12,10 V.

That explains the PHY dropouts. I've even seen drives do a hard reset accompanied by an emergency head unload due to voltage dropping when high IO load hits. Of course, data in cache is lost.

> The second PSU (Singlelane) offers (measured) 11,75 V. This doesn't change
> during read/write access on all drives attached. I tested the second PSU
> without anything attached and it still offered 11,75 V.

11.75 is still in spec and it doesn't fluctuate. I like this power much better.

> I will try to add another PSU this evening and remeasure. Currently the
> whole machine needs about 350W. The PSUs are Targan 500W (Multilane -
> 2x12V 10A)

Maybe one of the lanes is only connected to the video power connectors? IIRC, that was the idea of multilane power anyway.

> and Noname 350W (Singlelane - 1x12V 10A) so this 'should' be enough...
>
> This night i copied the backup back to the first raid (the Maxtor drives)
> without any error but during the transfer i got this on the second
> (mounted but unused) raid:
>
> ata10.00: failed to read SCR 1 (Emask=0x40)
> ata10.01: failed to read SCR 1 (Emask=0x40)
> ata10.02: failed to read SCR 1 (Emask=0x40)
> ata10.03: failed to read SCR 1 (Emask=0x40)
> ata10.04: failed to read SCR 1 (Emask=0x40)
> ata10.05: failed to read SCR 1 (Emask=0x40)
> ata10.00: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
> ata10.01: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
> ata10.02: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
> ata10.02: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 1 cdb 0x0 data 0
>          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

A timeout during heavy IO can be caused by a number of things and bad power is one of them. Please lemme know your test result.

--
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
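The "still in spec" judgement on the 12 V measurements can be written down as a small check. This is a rough sketch that assumes the commonly quoted ATX tolerance of +/-5 % on the +12 V rail. That figure is an assumption brought in here, not something stated in the thread, and the helper names are made up for illustration.

```python
# Sketch only: compare the measured 12 V readings against an assumed tolerance.
NOMINAL_12V = 12.0
TOLERANCE = 0.05  # assumption: +/-5 % window commonly quoted for the ATX +12 V rail


def parse_volts(text: str) -> float:
    """Accept readings written with a decimal comma, as in the report ("11,75")."""
    return float(text.replace(",", "."))


def in_spec(volts: float) -> bool:
    """True if the reading lies inside the assumed +/-5 % window."""
    return NOMINAL_12V * (1 - TOLERANCE) <= volts <= NOMINAL_12V * (1 + TOLERANCE)


def droop(idle_volts: float, load_volts: float) -> float:
    """Voltage drop between idle and loaded measurements, rounded to mV."""
    return round(idle_volts - load_volts, 3)
```

Under that assumption both supplies are inside the window: 11,75 V passes in_spec, and so do 12,20 V and 12,10 V. What separates them is the droop: 0.1 V under load on the multilane supply against none on the single-lane one, which is the fluctuation the reply singles out.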
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Gene Heskett wrote:
> [ 0.00] If you got timer trouble try acpi_use_timer_override

This is from the dmesg of my previous post. Can anyone tell me what it actually means?

--
Cheers, Gene
There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author)
I have a simple rule in life: If I don't understand something, it must be bad. - Linus Torvalds
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Peter Zijlstra wrote:
> On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote:
>> 1. Wrong mailing list; use linux-ide (@vger) instead.
>
> What, and keep all us other interested people in the dark?

As a test, I tried rebooting to the latest fedora kernel and found it kills X, so I'm back to the second to last fedora version ATM, and the third 'smartctl -t long /dev/sda' in 24 hours is running now. The first two completed with no errors.

I've added the linux-ide list to refresh those people of the problem, the logs are being spammed by this message stanza:

Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out
Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 28 04:46:25 coyote kernel: [26550.290032] ata1.00: status: { DRDY }
Jan 28 04:46:25 coyote kernel: [26550.290060] ata1: soft resetting link
Jan 28 04:46:25 coyote kernel: [26550.452301] ata1.00: configured for UDMA/100
Jan 28 04:46:25 coyote kernel: [26550.452318] ata1: EH complete
Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
Jan 28 04:46:25 coyote kernel: [26550.456151] sd 0:0:0:0: [sda] Write Protect is off
Jan 28 04:46:25 coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

And it just did it again, using the fedora kernel, but without logging anything at all when it froze. In other words, I had to reboot between the word "list" and the word "to" above. So now I'm booted to 2.6.24-rc7.
Before it crashes again, here is the dmesg: [0.00] Linux version 2.6.24-rc7 ([EMAIL PROTECTED]) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) #1 SMP Mon Jan 14 10:00:40 EST 2008 [0.00] BIOS-provided physical RAM map: [0.00] BIOS-e820: - 0009f800 (usable) [0.00] BIOS-e820: 0009f800 - 000a (reserved) [0.00] BIOS-e820: 000f - 0010 (reserved) [0.00] BIOS-e820: 0010 - 3fff (usable) [0.00] BIOS-e820: 3fff - 3fff3000 (ACPI NVS) [0.00] BIOS-e820: 3fff3000 - 4000 (ACPI data) [0.00] BIOS-e820: fec0 - fec01000 (reserved) [0.00] BIOS-e820: fee0 - fee01000 (reserved) [0.00] BIOS-e820: - 0001 (reserved) [0.00] 127MB HIGHMEM available. [0.00] 896MB LOWMEM available. [0.00] Entering add_active_range(0, 0, 262128) 0 entries of 256 used [0.00] Zone PFN ranges: [0.00] DMA 0 - 4096 [0.00] Normal 4096 - 229376 [0.00] HighMem229376 - 262128 [0.00] Movable zone start PFN for each node [0.00] early_node_map[1] active PFN ranges [0.00] 0:0 - 262128 [0.00] On node 0 totalpages: 262128 [0.00] DMA zone: 32 pages used for memmap [0.00] DMA zone: 0 pages reserved [0.00] DMA zone: 4064 pages, LIFO batch:0 [0.00] Normal zone: 1760 pages used for memmap [0.00] Normal zone: 223520 pages, LIFO batch:31 [0.00] HighMem zone: 255 pages used for memmap [0.00] HighMem zone: 32497 pages, LIFO batch:7 [0.00] Movable zone: 0 pages used for memmap [0.00] DMI 2.2 present. [0.00] ACPI: RSDP 000F7220, 0014 (r0 Nvidia) [0.00] ACPI: RSDT 3FFF3000, 002C (r1 Nvidia AWRDACPI 42302E31 AWRD 0) [0.00] ACPI: FACP 3FFF3040, 0074 (r1 Nvidia AWRDACPI 42302E31 AWRD 0) [0.00] ACPI: DSDT 3FFF30C0, 4CC4 (r1 NVIDIA AWRDACPI 1000 MSFT 10E) [0.00] ACPI: FACS 3FFF, 0040 [0.00] ACPI: APIC 3FFF7DC0, 006E (r1 Nvidia AWRDACPI 42302E31 AWRD 0) [0.00] Nvidia board detected. Ignoring ACPI timer override. 
[0.00] If you got timer trouble try acpi_use_timer_override [0.00] ACPI: PM-Timer IO Port: 0x4008 [0.00] ACPI: Local APIC address 0xfee0 [0.00] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) [0.00] Processor #0 6:10 APIC version 16 [0.00] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1]) [0.00] ACPI: IOAPIC (id[0x02] address[0xfec0] gsi_base[0]) [0.00] IOAPIC[0]: apic_id 2, version 17, address 0xfec0, GSI 0-23 [0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) [0.00] ACPI: BIOS IRQ0 pin2 override ignored. [0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) [0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge) [0.00] ACPI: INT_SRC_OVR (bus 0 bus_irq 15
[PATCH] pata_legacy: multiple updates (rediff as asked)
- Fix probe logic to support multiple devices better
- Fold in qdi and winbond support
- Fix promise 202C30 probe
- Restructure

diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.24/drivers/ata/pata_legacy.c linux-2.6.24/drivers/ata/pata_legacy.c
--- linux.vanilla-2.6.24/drivers/ata/pata_legacy.c 2008-01-24 22:58:37.0 +
+++ linux-2.6.24/drivers/ata/pata_legacy.c 2008-01-28 15:47:45.0 +
@@ -28,7 +28,6 @@
  *
  * Unsupported but docs exist:
  * Appian/Adaptec AIC25VL01/Cirrus Logic PD7220
- * Winbond W83759A
  *
  * This driver handles legacy (that is ISA/VLB side) IDE ports found
  * on PC class systems. There are three hybrid devices that are exceptions
@@ -36,7 +35,7 @@
  * the MPIIX where the tuning is PCI side but the IDE is ISA side.
  *
  * Specific support is included for the ht6560a/ht6560b/opti82c611a/
- * opti82c465mv/promise 20230c/20630
+ * opti82c465mv/promise 20230c/20630/winbond83759A
  *
  * Use the autospeed and pio_mask options with:
  * Appian ADI/2 aka CLPD7220 or AIC25VL01.
@@ -47,9 +46,6 @@
  * For now use autospeed and pio_mask as above with the W83759A. This may
  * change.
  *
- * TODO
- * Merge existing pata_qdi driver
- *
  */
 
 #include <linux/kernel.h>
@@ -64,12 +60,13 @@
 #include <linux/platform_device.h>
 
 #define DRV_NAME "pata_legacy"
-#define DRV_VERSION "0.5.5"
+#define DRV_VERSION "0.6.5"
 
 #define NR_HOST 6
 
-static int legacy_port[NR_HOST] = { 0x1f0, 0x170, 0x1e8, 0x168, 0x1e0, 0x160 };
-static int legacy_irq[NR_HOST] = { 14, 15, 11, 10, 8, 12 };
+static int all;
+module_param(all, int, 0444);
+MODULE_PARM_DESC(all, "Grab all legacy port devices, even if PCI(0=off, 1=on)");
 
 struct legacy_data {
 	unsigned long timing;
@@ -80,21 +77,107 @@
 };
 
+enum controller {
+	BIOS = 0,
+	SNOOP = 1,
+	PDC20230 = 2,
+	HT6560A = 3,
+	HT6560B = 4,
+	OPTI611A = 5,
+	OPTI46X = 6,
+	QDI6500 = 7,
+	QDI6580 = 8,
+	QDI6580DP = 9,		/* Dual channel mode is different */
+	W83759A = 10,
+
+	UNKNOWN = -1
+};
+
+
+struct legacy_probe {
+	unsigned char *name;
+	unsigned long port;
+	unsigned int irq;
+	unsigned int slot;
+	enum controller type;
+	unsigned long private;
+};
+
+struct legacy_controller {
+	const char *name;
+	struct ata_port_operations *ops;
+	unsigned int pio_mask;
+	unsigned int flags;
+	int (*setup)(struct platform_device *, struct legacy_probe *probe,
+		struct legacy_data *data);
+};
+
+static int legacy_port[NR_HOST] = { 0x1f0, 0x170, 0x1e8, 0x168, 0x1e0, 0x160 };
+
+static struct legacy_probe probe_list[NR_HOST];
 static struct legacy_data legacy_data[NR_HOST];
 static struct ata_host *legacy_host[NR_HOST];
 static int nr_legacy_host;
 
-static int probe_all;		/* Set to check all ISA port ranges */
-static int ht6560a;		/* HT 6560A on primary 1, secondary 2, both 3 */
-static int ht6560b;		/* HT 6560A on primary 1, secondary 2, both 3 */
-static int opti82c611a;	/* Opti82c611A on primary 1, secondary 2, both 3 */
-static int opti82c46x;		/* Opti 82c465MV present (pri/sec autodetect) */
-static int autospeed;		/* Chip present which snoops speed changes */
-static int pio_mask = 0x1F;	/* PIO range for autospeed devices */
+static int probe_all;		/* Set to check all ISA port ranges */
+static int ht6560a;		/* HT 6560A on primary 1, second 2, both 3 */
+static int ht6560b;		/* HT 6560A on primary 1, second 2, both 3 */
+static int opti82c611a;	/* Opti82c611A on primary 1, sec 2, both 3 */
+static int opti82c46x;		/* Opti 82c465MV present(pri/sec autodetect) */
+static int qdi;		/* Set to probe QDI controllers */
+static int winbond;		/* Set to probe Winbond controllers,
+					give I/O port if non standard */
+static int autospeed;		/* Chip present which snoops speed changes */
+static int pio_mask = 0x1F;	/* PIO range for autospeed devices */
 static int iordy_mask = 0x;	/* Use iordy if available */
 
 /**
+ * legacy_probe_add - Add interface to probe list
+ * @port: Controller port
+ * @irq: IRQ number
+ * @type: Controller type
+ * @private: Controller specific info
+ *
+ * Add an entry into the probe list for ATA controllers. This is used
+ * to add the default ISA slots and then to build up the table
+ * further according to other ISA/VLB/Weird device scans
+ *
+ * An I/O port list is used to keep ordering stable and sane, as we
+ * don't have any good way to talk about ordering otherwise
+ */
+
+static int legacy_probe_add(unsigned long port, unsigned int irq,
+				enum
Re: [PATCH 0/32] ide-tape redux v1
On Sunday 27 January 2008, Bartlomiej Zolnierkiewicz wrote:
> BTW what happened to patch #23?

my bad, the patch got eaten by gmail's spam filter...
[PATCH] pata_sl82c105: dual channel support
Use qc_defer to serialize the two channels

Signed-off-by: Alan Cox [EMAIL PROTECTED]

diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.vanilla-2.6.24/drivers/ata/pata_sl82c105.c linux-2.6.24/drivers/ata/pata_sl82c105.c
--- linux.vanilla-2.6.24/drivers/ata/pata_sl82c105.c 2008-01-24 22:58:37.0 +
+++ linux-2.6.24/drivers/ata/pata_sl82c105.c 2008-01-28 15:44:11.0 +
@@ -26,7 +26,7 @@
 #include <linux/libata.h>
 
 #define DRV_NAME "pata_sl82c105"
-#define DRV_VERSION "0.3.2"
+#define DRV_VERSION "0.3.3"
 
 enum {
 	/*
@@ -206,6 +206,34 @@
 		sl82c105_set_piomode(ap, qc->dev);
 }
 
+/**
+ *	sl82c105_qc_defer	-	implement serialization
+ *	@qc: command
+ *
+ *	We must issue one command per host not per channel because
+ *	of the reset bug.
+ *
+ *	Q: is the scsi host lock sufficient ?
+ */
+
+static int sl82c105_qc_defer(struct ata_queued_cmd *qc)
+{
+	struct ata_host *host = qc->ap->host;
+	struct ata_port *alt = host->ports[1 ^ qc->ap->port_no];
+	int rc;
+
+	/* First apply the usual rules */
+	rc = ata_std_qc_defer(qc);
+	if (rc != 0)
+		return rc;
+
+	/* Now apply serialization rules. Only allow a command if the
+	   other channel state machine is idle */
+	if (alt && alt->qc_active)
+		return ATA_DEFER_PORT;
+	return 0;
+}
+
 static struct scsi_host_template sl82c105_sht = {
 	.module = THIS_MODULE,
 	.name = DRV_NAME,
@@ -245,6 +273,7 @@
 	.bmdma_stop = sl82c105_bmdma_stop,
 	.bmdma_status = ata_bmdma_status,
 
+	.qc_defer = sl82c105_qc_defer,
 	.qc_prep = ata_qc_prep,
 	.qc_issue = ata_qc_issue_prot,
 
@@ -312,7 +341,7 @@
 	};
 	/* for now use only the first port */
 	const struct ata_port_info *ppi[] = { &info_early,
-					       &ata_dummy_port_info };
+					       NULL };
 	u32 val;
 	int rev;
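The serialization rule in the patch above can be tried outside the kernel with a toy model. The sketch below is an illustration in Python, not the driver code: Port, std_qc_defer and the numeric values of the two ATA_DEFER_* constants are stand-ins chosen for the example, and std_qc_defer is a deliberately simplified substitute for ata_std_qc_defer.

```python
# Toy model of the two-channel serialization in sl82c105_qc_defer.
# Everything here is a hypothetical stand-in for the kernel structures.
from dataclasses import dataclass
from typing import List

ATA_DEFER_LINK = 1  # illustrative values for this sketch
ATA_DEFER_PORT = 2


@dataclass
class Port:
    port_no: int
    qc_active: int = 0  # non-zero while a command is in flight on this port


def std_qc_defer(port: Port) -> int:
    """Simplified stand-in for the usual rule: one command at a time per port."""
    return ATA_DEFER_LINK if port.qc_active else 0


def sl82c105_qc_defer(ports: List[Port], port: Port) -> int:
    """Defer a new command if this port or the sibling channel is busy."""
    rc = std_qc_defer(port)
    if rc != 0:
        return rc
    alt = ports[1 ^ port.port_no]  # XOR with 1 selects the other of two channels
    if alt.qc_active:
        return ATA_DEFER_PORT
    return 0
```

With two idle ports the function returns 0 and the command may be issued. Once ports[1].qc_active is set, a command for ports[0] gets ATA_DEFER_PORT back, which is the "one command per host" behaviour the patch comment asks for.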
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Gene Heskett wrote: On Monday 28 January 2008, Zan Lynx wrote: On Mon, 2008-01-28 at 11:50 -0500, Calvin Walton wrote: On Mon, 2008-01-28 at 11:35 -0500, Gene Heskett wrote: On Monday 28 January 2008, Mikael Pettersson wrote: Unfortunately we also see: [ 48.285456] nvidia: module license 'NVIDIA' taints kernel. [ 48.549725] ACPI: PCI Interrupt :02:00.0[A] - Link [APC4] - GSI 19 (level, high) - IRQ 20 [ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007 We have no way of debugging that module, so please try 2.6.24 without it. Sorry, I can't do this and have a working machine. The nv driver has suffered bit rot or something since the FC2 days when it COULD run a 19 crt at 1600x1200, and will not drive this 20 wide screen lcd 1680x1050 monitor at more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg compressed to 10%. The system is not usable on a day to basis without the nvidia driver. You should probably give the nouveau[1] driver a try, if only for testing purposes; if you are running an NV4x (G6x or G7x) card in particular, it works a lot better than the nv driver for 2d support. 1. http://nouveau.freedesktop.org/wiki/InstallNouveau But nouveau is much less stable than nv. For testing purposes, go with stable. I believe at this point, its moot. I captured quite a few instances of that error message while rebooting the last time, all of which occurred long before I logged in and did a startx (I boot to runlevel 3 here), so the kernel was NOT tainted at that point. That dmesg has been posted and some questions asked. As this has gone on for a while, it seems to me that with 14,800 google hits on this problem, Linus should call a halt until this is found and fixed. But I'm not Linus. I'm also locking up for 30 at a time, probably ready for reboot #7 today. I'm not sure why it won't run his screen though. I can use nv to run a 1920x1200 laptop LCD. 
It *is* dog slow (although nouveau was not any better with a NV17 / 440-Go -- render support for AA fonts seems to be missing), but it does work. I've been trying to run a long selftest on that drive, but the constant reboots are fscking that up. I have attached the last smartctl -a output, indicating that the test was aborted probably from all the resets that are being issued, the last one froze me for around 5 minutes but I haven't rebooted yet. Its attached. Can anyone see if there is actually anything wrong with the drive? If a boot will last long enough for the -t long to complete, then it passes with no errors, but this was interrupted now for the 3rd time. -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) Well begun is half done. -- Aristotle smartctl version 5.37 [i386-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen Home page is http://smartmontools.sourceforge.net/ === START OF INFORMATION SECTION === Model Family: Western Digital Caviar SE family Device Model: WDC WD2000JB-00EVA0 Serial Number:WD-WMAEH2782398 Firmware Version: 15.05R15 User Capacity:200,049,647,616 bytes Device is:In smartctl database [for details use: -P show] ATA Version is: 6 ATA Standard is: Exact ATA specification draft version not indicated Local Time is:Mon Jan 28 12:39:08 2008 EST SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x85) Offline data collection activity was aborted by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 249) Self-test routine in progress... 90% of test remaining. Total time to complete Offline data collection: (6942) seconds. Offline data collection capabilities: (0x79) SMART execute Offline immediate. 
No Auto Offline data collection support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities:(0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability:(0x01) Error logging supported. No General Purpose Logging support. Short self-test routine recommended polling time: ( 2) minutes. Extended self-test routine recommended polling time: ( 88) minutes. Conveyance self-test routine recommended polling time: ( 5) minutes. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED
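The "Self-test execution status: ( 249)" value in the smartctl output above packs two fields into one byte. A minimal sketch of the split follows, assuming the ATA layout in which the high nibble is the status code and the low nibble is the remaining work in tenths. The function name and the partial status table are illustrative only.

```python
# Sketch only: split the SMART self-test execution status byte.
# Assumed layout: bits 7-4 = status code, bits 3-0 = remaining work in units of 10 %.

STATUS_NAMES = {
    0x0: "completed without error, or no self-test run",
    0xF: "self-test routine in progress",
}


def decode_selftest_status(byte_value: int) -> tuple:
    """Return (status_code, percent_remaining) for the raw status byte."""
    status = (byte_value >> 4) & 0xF
    percent_remaining = (byte_value & 0xF) * 10
    return status, percent_remaining
```

For the value in the report, decode_selftest_status(249) yields (15, 90): status 0xF, a test still in progress, with 90% remaining. That matches the text smartctl printed next to the number.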
Re: Problem with ata layer in 2.6.24
Added Alan to CC: list.

> [ 30.703188] scsi0 : pata_amd
> [ 30.709313] scsi1 : pata_amd
> [ 30.710076] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14
> [ 30.710079] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
> [ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100
> [ 30.864756] ata1.00: 390721968 sectors, multi 16: LBA48
> [ 30.871629] ata1.00: configured for UDMA/100
..

Gene, please confirm with us that your primary/master hard drive (above) is connected with an 80-wire UDMA cable, as opposed to the older 40-wire cables.

> [ 31.195305] ata2.00: ATAPI: LITE-ON DVDRW SHM-165H6S, HS06, max UDMA/66
> [ 31.243813] ata2.01: ATA-7: MAXTOR STM3320620A, 3.AAE, max UDMA/100
> [ 31.243816] ata2.01: 625142448 sectors, multi 16: LBA48
> [ 31.243825] ata2.00: limited to UDMA/33 due to 40-wire cable
> [ 31.417074] ata2.00: configured for UDMA/33
> [ 31.451769] ata2.01: configured for UDMA/100
..

That looks like an unrelated bug to me: the driver says 40-wire cable but then goes and chooses UDMA/100 on one of the drives.

Alan?
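The inconsistency pointed out above (one device capped at UDMA/33 for a 40-wire cable while its neighbour on the same channel runs UDMA/100) can be expressed as a simple consistency check. This sketch only models the expectation stated in the message, that a 40-wire cable caps every device on the channel at UDMA/33. It is not a description of how libata actually decides, and all names are hypothetical.

```python
# Sketch only: flag devices whose configured UDMA speed exceeds what a
# 40-wire cable is expected to allow. Speeds are the MB/s figures from the log.
FORTY_WIRE_LIMIT = 33  # UDMA/33 is the usual ceiling for 40-wire cables


def expected_speed(device_max: int, cable_wires: int) -> int:
    """Speed one would expect for a device given its maximum and the cable type."""
    if cable_wires == 40:
        return min(device_max, FORTY_WIRE_LIMIT)
    return device_max


def mismatches(configured: dict, device_max: dict, cable_wires: int) -> dict:
    """Map device name -> (configured, expected) wherever the two differ."""
    result = {}
    for dev, speed in configured.items():
        want = expected_speed(device_max[dev], cable_wires)
        if speed != want:
            result[dev] = (speed, want)
    return result
```

Fed with the ata2 lines from the log, configured {"ata2.00": 33, "ata2.01": 100} and maxima {"ata2.00": 66, "ata2.01": 100} on a 40-wire cable, the check reports only ata2.01, the device the message questions.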
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: Greeting; I had to reboot early this morning due to a freezeup, and I had a bunch of these in the messages log: == Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY } Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100 Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA === That one showed up about 2 hours ago, so I expect I'll be locked up again before I've managed a 24 hour uptime. This drive passed a 'smartctl -t long /dev/sda' with flying colors after the reboot this morning. 
Two instances were logged after I had rebooted to 2.6.24 from 2.6.24-rc8: Jan 24 20:46:33 coyote kernel: [0.00] Linux version 2.6.24 ([EMAIL PROTECTED]) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) #1 SMP Thu Jan 24 20:17:55 EST 2008 Jan 27 02:28:29 coyote kernel: [193207.445158] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 27 02:28:29 coyote kernel: [193207.445170] ata1.00: cmd 35/00:08:f9:24:0a/00:00:17:00:00/e0 tag 0 dma 4096 out Jan 27 02:28:29 coyote kernel: [193207.445172] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 27 02:28:29 coyote kernel: [193207.445175] ata1.00: status: { DRDY } Jan 27 02:28:29 coyote kernel: [193207.445202] ata1: soft resetting link Jan 27 02:28:29 coyote kernel: [193207.607384] ata1.00: configured for UDMA/100 Jan 27 02:28:29 coyote kernel: [193207.607399] ata1: EH complete Jan 27 02:28:29 coyote kernel: [193207.609681] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) Jan 27 02:28:29 coyote kernel: [193207.619277] sd 0:0:0:0: [sda] Write Protect is off Jan 27 02:28:29 coyote kernel: [193207.649041] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Jan 27 02:30:06 coyote kernel: [193304.336929] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 27 02:30:06 coyote kernel: [193304.336940] ata1.00: cmd ca/00:20:69:22:a6/00:00:00:00:00/e7 tag 0 dma 16384 out Jan 27 02:30:06 coyote kernel: [193304.336942] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 27 02:30:06 coyote kernel: [193304.336945] ata1.00: status: { DRDY } Jan 27 02:30:06 coyote kernel: [193304.336972] ata1: soft resetting link Jan 27 02:30:06 coyote kernel: [193304.499210] ata1.00: configured for UDMA/100 Jan 27 02:30:06 coyote kernel: [193304.499226] ata1: EH complete Jan 27 02:30:06 coyote kernel: [193304.499714] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) Jan 27 02:30:06 coyote kernel: [193304.499857] sd 0:0:0:0: [sda] 
Write Protect is off
Jan 27 02:30:06 coyote kernel: [193304.502315] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

> None were logged during the time I was running an -rc7 or -rc8. The
> previous hits on this resulted in the udma speed being downgraded till it
> was actually running in pio just before the freeze that required the
> hardware reset button.
>
> I'll reboot to -rc8 right now and resume. If it's the drive, I should see
> it. If not, then 2.6.24 is where I'll point the finger.
..

The only libata change I can see that could possibly affect your setup is this one here, which went in sometime between -rc7 and -final:

--- linux-2.6.24-rc7/drivers/ata/libata-eh.c 2008-01-06 16:45:38.0 -0500
+++ linux-2.6.24/drivers/ata/libata-eh.c 2008-01-24 17:58:37.0 -0500
@@ -1733,11 +1733,15 @@
 		ehc->i.action &= ~ATA_EH_PERDEV_MASK;
 	}
 
-	/* consider speeding down */
+	/* propagate timeout to host link */
+	if ((all_err_mask & AC_ERR_TIMEOUT) && !ata_is_host_link(link))
+		ap->link.eh_context.i.err_mask |= AC_ERR_TIMEOUT;
+

It looks pretty innocent to me, though. If you want to try reverting just that change (comment out the two lines and rebuild), then that might provide useful information here.

If -final is still b0rked even with those two lines changed back, then I suspect you're just getting lucky when switching between the -rc7/-rc8 kernel and the -final kernel. Lucky in a bad way, that is.

The real test would be to rebuild the kernel without libata, and *with* the old IDE
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Richard Heck wrote: I've recently seen this kind of error myself, under Fedora 8, using the Fedora 2.6.23 kernels: I'd see a train of the same sort of error: Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) usually associated with the optical drive, and then it seems as if the whole SATA subsystem would lock up, and the machine then becomes useless: I get journal commit errors if I'm lucky; if I'm not, it just locks up. My system is also using the pata_amd driver. I have not seen these sorts of errors with the 2.6.24 kernels. Richard Heck Unforch, this is my only bootable drive, and its raising hell with things, about 6 hardware reset initiated reboots so far today since 6:15 am. If it persists I'll go see if Circuit City still has any pata drives left as this mobo won't boot from a sata card. Gene Heskett wrote: On Monday 28 January 2008, Peter Zijlstra wrote: On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote: 1. Wrong mailing list; use linux-ide (@vger) instead. What, and keep all us other interested people in the dark? As a test, I tried rebooting to the latest fedora kernel and found it kills X, so I'm back to the second to last fedora version ATM, and the third 'smartctl -t lng /dev/sda' in 24 hours is running now. The first two completed with no errors. 
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Mikael Pettersson wrote: Gene Heskett writes: On Monday 28 January 2008, Peter Zijlstra wrote: On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote: 1. Wrong mailing list; use linux-ide (@vger) instead. What, and keep all us other interested people in the dark? As a test, I tried rebooting to the latest fedora kernel and found it kills X, so I'm back to the second to last fedora version ATM, and the third 'smartctl -t lng /dev/sda' in 24 hours is running now. The first two completed with no errors. I've added the linux-ide list to refresh those people of the problem, the logs are being spammed by this message stanza: Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 28 04:46:25 coyote kernel: [26550.290032] ata1.00: status: { DRDY } Jan 28 04:46:25 coyote kernel: [26550.290060] ata1: soft resetting link Jan 28 04:46:25 coyote kernel: [26550.452301] ata1.00: configured for UDMA/100 Jan 28 04:46:25 coyote kernel: [26550.452318] ata1: EH complete Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) Jan 28 04:46:25 coyote kernel: [26550.456151] sd 0:0:0:0: [sda] Write Protect is off Jan 28 04:46:25 coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA It's not obvious from this incomplete dmesg log what HW or driver is behind ata1, but if the 2.6.24-rc7 kernel matches the 2.6.24 one, it should be pata_amd driving a WDC disk: [ 30.702887] pata_amd :00:09.0: version 0.3.10 [ 30.703052] PCI: Setting latency timer of device :00:09.0 to 64 [ 30.703188] scsi0 : pata_amd [ 30.709313] scsi1 : pata_amd [ 30.710076] ata1: PATA max UDMA/133 cmd 
0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14 [ 30.710079] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15 [ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100 [ 30.864756] ata1.00: 390721968 sectors, multi 16: LBA48 [ 30.871629] ata1.00: configured for UDMA/100 Unfortunately we also see: [ 48.285456] nvidia: module license 'NVIDIA' taints kernel. [ 48.549725] ACPI: PCI Interrupt :02:00.0[A] - Link [APC4] - GSI 19 (level, high) - IRQ 20 [ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007 We have no way of debugging that module, so please try 2.6.24 without it. Sorry, I can't do this and have a working machine. The nv driver has suffered bit rot or something since the FC2 days when it COULD run a 19 crt at 1600x1200, and will not drive this 20 wide screen lcd 1680x1050 monitor at more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg compressed to 10%. The system is not usable on a day to basis without the nvidia driver. Fix the nv driver so it will run this screen at its native resolution and I'll be glad to run it even if it won't run google earth, which I do use from time to time. Now, if in all the hits you can get from google on this, currently 14,800 just for 'exception Emask', apparently caused by a timeout, if 100% of the complainers are running nvidia drivers also, then I see a legit complaint. Again, fix the nv driver so it will run my screen I'll be glad to switch. I can see the reason, sure, but the machine must be capable of doing its common day to day stuff, while using that driver, like running kde for kmail, and browsers that work. If the problems persist, please try to capture a complete log from the failing kernel -- the interesting bits are everything from initial boot up to and including the first few errors. You may need to increase the kernel's log buffer size if the log gets truncated (CONFIG_LOG_BUF_SHIFT). 
If by log you mean /var/log/messages, I have several megabytes of those. If you mean a live dmesg capture taken right now, its attached. It contains several of these at the bottom. I long ago made the kernel log buffer bigger, cuz it couldn't even show the start immediately after the boot, and even the dump to syslog was truncated. There are no pata_amd changes from 2.6.24-rc7 to 2.6.24 final. That is what I was afraid of. I've done some limited grepping in that branch of the kernel tree, and cannot seem to locate where this EH handler is being invoked from. There is 2 lines of interest in the dmesg: [0.00] Nvidia board detected. Ignoring ACPI timer override. [0.00] If you got timer trouble try acpi_use_timer_override But I have NDI what it means, kernel argument/xconfig option? I've also done some googling, and it appears this problem is fairly
Re: Problem with ata layer in 2.6.24
On Mon, 2008-01-28 at 11:50 -0500, Calvin Walton wrote:
> On Mon, 2008-01-28 at 11:35 -0500, Gene Heskett wrote:
>> On Monday 28 January 2008, Mikael Pettersson wrote:
>>> Unfortunately we also see:
>>>
>>> [ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
>>> [ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4] -> GSI 19 (level, high) -> IRQ 20
>>> [ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007
>>>
>>> We have no way of debugging that module, so please try 2.6.24 without it.
>>
>> Sorry, I can't do this and have a working machine. The nv driver has suffered bit rot or something since the FC2 days when it COULD run a 19" CRT at 1600x1200, and will not drive this 20" wide-screen LCD 1680x1050 monitor at more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg compressed to 10%. The system is not usable on a day-to-day basis without the nvidia driver.
>
> You should probably give the nouveau[1] driver a try, if only for testing purposes; if you are running an NV4x (G6x or G7x) card in particular, it works a lot better than the nv driver for 2d support.
>
> 1. http://nouveau.freedesktop.org/wiki/InstallNouveau

But nouveau is much less stable than nv. For testing purposes, go with stable.

I'm not sure why it won't run his screen though. I can use nv to run a 1920x1200 laptop LCD. It *is* dog slow (although nouveau was not any better with a NV17 / 440-Go -- render support for AA fonts seems to be missing), but it does work.

--
Zan Lynx [EMAIL PROTECTED]
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote:
> On Monday 28 January 2008, Mark Lord wrote:
> ..
>> Another way is to use the make_bad_sector utility that is included in the source tarball for hdparm-7.7, as follows:
>>
>>     make_bad_sector --readback /dev/sda 474507
>
> Apparently not in the rpm, darnit.
..

That's okay. It should still be in the SRPM source file. And it's a tiny download from sourceforge.net:

http://sourceforge.net/search/?type_of_search=soft&type_of_search=soft&words=hdparm

Cheers

-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Problem with ata layer in 2.6.24
Daniel Barkalow wrote:
> Can you switch back to old IDE to get your work done (and to make sure it's not a hardware issue that's developed recently)?

I think it'd be really, REALLY helpful to a lot of people if you, or someone, could explain in moderate detail how this might be done. I tried doing it myself, but I'm not sufficiently expert at configuring kernels that I was ever able to figure out how to do it.

Obviously, the short version is: switch back to Fedora 6. But this kind of problem with libata---and yes, you're almost surely right that it's not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used.

Richard
Re: Problem with ata layer in 2.6.24
Mark Lord wrote:
> Gene Heskett wrote:
> ..
>> And so far no one has tried to comment on those 2 dmesg lines I've quoted a couple of times now, here's another:
>>
>> [0.00] Nvidia board detected. Ignoring ACPI timer override.
>> [0.00] If you got timer trouble try acpi_use_timer_override
>>
>> what the heck is that trying to tell me to do, in some sort of broken English?
> ..
>
> I think it says this: If your system is misbehaving, then try adding the acpi_use_timer_override keyword to your kernel command line (/boot/grub/menu.lst) and see if it helps.
>
> So, you can either hardcode it in /boot/grub/menu.lst (just add it to the end of the first line you see there that begins with the word "kernel").
>
> Or you can just try it temporarily at boot time (safer, but trickier), by catching GRUB (the bootloader) before it actually loads Linux. Usually there's some key or something it says you have 3 seconds to hit for a menu, so do that, and then use the cursor keys to find the first kernel line in that menu and hit "e" (edit) to go and add the acpi_use_timer_override keyword to the end of that line (same as above).
..

Minor correction (having just tried it here): once you see the GRUB (boot) menu, hit the letter "e" to edit the first entry, then scroll to the kernel line, and hit the letter "e" again to edit that line. It should put you at the end of the line, where you can just type a space and then acpi_use_timer_override and then hit enter to finish the (temporary) edit. Then hit "b" for boot.

-ml
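For the permanent variant of the edit described above (appending the keyword to the first "kernel" line of /boot/grub/menu.lst), here is a minimal sketch in Python. The function name and the sample menu.lst contents are made up for illustration, not taken from anyone's machine in this thread; it works on a string only and never touches the real file.

```python
def add_kernel_param(menu_lst: str, param: str) -> str:
    """Append `param` to the first line whose first word is 'kernel'.

    Hypothetical helper: operates on the text of a GRUB legacy menu.lst
    and returns the edited text. Running it twice adds the keyword once.
    """
    out = []
    done = False
    for line in menu_lst.splitlines():
        words = line.split()
        if not done and words and words[0] == "kernel":
            if param not in words:
                line = line.rstrip() + " " + param
            done = True  # only the first kernel line is touched
        out.append(line)
    return "\n".join(out) + "\n"


# Illustrative menu.lst contents (an assumption, not a real config).
sample = """\
default=0
timeout=3
title Fedora (2.6.24)
        root (hd0,0)
        kernel /vmlinuz-2.6.24 ro root=LABEL=/ rhgb quiet
        initrd /initrd-2.6.24.img
title Fedora (2.6.23)
        root (hd0,0)
        kernel /vmlinuz-2.6.23 ro root=LABEL=/ rhgb quiet
"""

edited = add_kernel_param(sample, "acpi_use_timer_override")
print(edited)
```

Keeping a second, untouched boot entry (as in the sample) leaves a way back if the new option makes things worse.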
Re: Problem with ata layer in 2.6.24
Alan Cox wrote:
> On Mon, Jan 28, 2008 at 01:38:40PM -0500, Mark Lord wrote:
>> [ 31.195305] ata2.00: ATAPI: LITE-ON DVDRW SHM-165H6S, HS06, max UDMA/66
>> [ 31.243813] ata2.01: ATA-7: MAXTOR STM3320620A, 3.AAE, max UDMA/100
>> [ 31.243816] ata2.01: 625142448 sectors, multi 16: LBA48
>> [ 31.243825] ata2.00: limited to UDMA/33 due to 40-wire cable
>> [ 31.417074] ata2.00: configured for UDMA/33
>> [ 31.451769] ata2.01: configured for UDMA/100
>> ..
>>
>> That looks like an unrelated bug to me: the driver says "40-wire cable" but then goes and chooses UDMA/100 on one of the drives.
>
> We currently assume that
> - If we have host side detecting 40 that we use 40
> - If we have drive side detecting 40 use 40
> - If we have drive side detecting 80 and host thinks 80 use 80
>
> The case where the drives disagree isn't currently considered.
..

Ahh. Tricky mess, that stuff.

I believe that if we have a drive that only sees 40W, then it is probably best to restrict the other drive as well. Just in case the drive that reports 40W cannot actually keep up with the 80W timings, even when they're for the other drive.

That's my 2p.

Cheers
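The difference between the per-drive rules listed above and the "restrict both drives" suggestion can be put as a small truth-table sketch. This is Python written for illustration only, not libata code; the function names are invented, and the host/drive inputs for the ata2 log are a guess at what detection reported.

```python
from typing import Sequence


def current_rule(host_sees_80: bool, drive_sees_80: bool) -> int:
    """Per-drive decision as described in the thread: 40 wins unless
    both the host and this particular drive report an 80-wire cable."""
    if not host_sees_80:
        return 40
    if not drive_sees_80:
        return 40
    return 80


def proposed_rule(host_sees_80: bool, drives_see_80: Sequence[bool]) -> int:
    """Suggested alternative: if any drive on the cable reports 40-wire,
    treat the whole channel as 40-wire."""
    if host_sees_80 and all(drives_see_80):
        return 80
    return 40


# Assumed inputs for the ata2 log above: host reports 80,
# ata2.00 reports 40, ata2.01 reports 80.
per_drive = [current_rule(True, False), current_rule(True, True)]
whole_cable = proposed_rule(True, [False, True])
print(per_drive, whole_cable)
```

Under the per-drive rule the two devices end up with different answers (which matches UDMA/33 next to UDMA/100 in the log); under the proposal both would be held to the 40-wire limit.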
Re: [PATCH 7/8] ide: add struct ide_port_info instances to legacy host drivers
Hello.

Bartlomiej Zolnierkiewicz wrote:

> * Remove 'struct pci_dev *dev' argument from ide_hwif_setup_dma().
> * Un-static ide_hwif_setup_dma() and add CONFIG_BLK_DEV_IDEDMA_PCI=n version.
> * Add 'const struct ide_port_info *d' argument to ide_device_add[_all]().
> * Factor out generic ports init from ide_pci_setup_ports() to ide_init_port(), move it to ide-probe.c and call it in ide_device_add_all() instead of ide_pci_setup_ports().
> * Move ->mate setup to ide_device_add_all() from ide_port_init().
> * Add IDE_HFLAG_NO_AUTOTUNE host flag for host drivers that don't enable ->autotune currently.
> * Setup hwif->chipset in ide_init_port() but iff pi->chipset is set (to not override setup done by ide_hwif_configure()).
> * Add ETRAX host handling to ide_device_add_all().
> * cmd640.c: set IDE_HFLAG_ABUSE_* also for CONFIG_BLK_DEV_CMD640_ENHANCED=n.
> * pmac.c: make pmac_ide_setup_dma() return an error value and move DMA masks setup to pmac_ide_setup_device().
> * Add 'struct ide_port_info' instances to legacy host drivers, pass them to ide_device_add() calls and then remove open-coded ports initialization.
>
> Signed-off-by: Bartlomiej Zolnierkiewicz [EMAIL PROTECTED]

> Index: b/drivers/ide/arm/icside.c
> ===
> --- a/drivers/ide/arm/icside.c
> +++ b/drivers/ide/arm/icside.c
> @@ -459,11 +456,19 @@ icside_register_v5(struct icside_state *
>  	idx[0] = hwif->index;
>
> -	ide_device_add(idx);
> +	ide_device_add(idx, NULL);
>
>  	return 0;
>  }
>
> +static const struct ide_port_info icside_v6_port_info __initdata = {
> +	.host_flags		= IDE_HFLAG_SERIALIZE |
> +				  IDE_HFLAG_NO_DMA | /* no SFF-style DMA */
> +				  IDE_HFLAG_NO_AUTOTUNE,
> +	.mwdma_mask		= ATA_MWDMA2,
> +	.swdma_mask		= ATA_SWDMA2,
> +};
> +

   Interesting... this driver's support for SWDMA0 is broken since the cycle should be 960 ns long, not 480, and SWDMA2 is underclocked using the same cycle as SWDMA1, 480 ns...
> Index: b/drivers/ide/cris/ide-cris.c
> ===
> --- a/drivers/ide/cris/ide-cris.c
> +++ b/drivers/ide/cris/ide-cris.c
> @@ -753,6 +753,15 @@ static void cris_set_dma_mode(ide_drive_
>  	cris_ide_set_speed(TYPE_DMA, 0, strobe, hold);
>  }
>
> +static const struct ide_port_info cris_port_info __initdata = {
> +	.chipset		= ide_etrax100,
> +	.host_flags		= IDE_HFLAG_NO_ATAPI_DMA |
> +				  IDE_HFLAG_NO_DMA, /* no SFF-style DMA */
> +	.pio_mask		= ATA_PIO4,
> +	.udma_mask		= cris_ultra_mask,

   Hm, I wonder which value it will assume, 0x07 or 0? Not sure even after looking at the source... :-)

> +	.mwdma_mask		= ATA_MWDMA2,
> +};
> +
>  static int __init init_e100_ide(void)
>  {
>  	hw_regs_t hw;

> Index: b/drivers/ide/ide-probe.c
> ===
> --- a/drivers/ide/ide-probe.c
> +++ b/drivers/ide/ide-probe.c
> @@ -1289,12 +1289,86 @@ static void hwif_register_devices(ide_hw
>  	}
>  }
>
> -int ide_device_add_all(u8 *idx)
> +static void ide_init_port(ide_hwif_t *hwif, unsigned int port,
> +			  const struct ide_port_info *d)
>  {
> -	ide_hwif_t *hwif;
> +	if (d->chipset != ide_etrax100)
> +		hwif->channel = port;

   Hm, what's so special about ide_etrax100?

> +int ide_device_add_all(u8 idx[MAX_HWIFS], const struct ide_port_info *d)

   The function prototype doesn't match the one from linux/ide.h, which has the first argument as a pointer...

> +{
> +	ide_hwif_t *hwif, *mate = NULL;
>  	int i, rc = 0;
>
>  	for (i = 0; i < MAX_HWIFS; i++) {
> +		if (d == NULL || idx[i] == 0xff) {

   Why check for (d == NULL) every time and not do it once and break from the loop or even do it before the loop?

> +			mate = NULL;
> +			continue;
> +		}
> +
> +		hwif = &ide_hwifs[idx[i]];
> +
> +		if (d->chipset != ide_etrax100 && (i & 1) && mate) {
> +			hwif->mate = mate;
> +			mate->mate = hwif;
> +		}
> +
> +		mate = (i & 1) ? NULL : hwif;
> +
> +		ide_init_port(hwif, i & 1, d);
> +	}
> +
> +	for (i = 0; i < MAX_HWIFS; i++) {
>  		if (idx[i] == 0xff)
>  			continue;

MBR, Sergei
Re: Problem with ata layer in 2.6.24
On Mon, 28 Jan 2008, Richard Heck wrote:
> Daniel Barkalow wrote:
>> Can you switch back to old IDE to get your work done (and to make sure it's not a hardware issue that's developed recently)?
>
> I think it'd be really, REALLY helpful to a lot of people if you, or someone, could explain in moderate detail how this might be done. I tried doing it myself, but I'm not sufficiently expert at configuring kernels that I was ever able to figure out how to do it.

As far as configuring the kernel, I can help: Go to "Device Drivers", "ATA/ATAPI/MFM/RLL support", and turn on anything that looks relevant; go to "Device Drivers", "Serial ATA and Parallel ATA drivers", and turn off anything that's PATA and looks relevant. (Whether a device uses IDE or PATA depends on which driver that supports the device is present and finds it first, not on any sort of global configuration, which is probably what tripped you up.)

Building this and installing it along with the appropriate initrd (which might be handled by Fedora's install scripts) will either get you back to old IDE or will make your kernel panic on boot, depending on whether you got it right (so make sure you can still boot the kernel you're sure of or something from a boot disk).

This will also cause your hard drives to show up as different device nodes, so if your boot process doesn't mount by disk uuid but by some other feature (and I don't know what Fedora does), you'll also need to change it to something either stable across access methods or which works for the one you're now using.

> Obviously, the short version is: switch back to Fedora 6. But this kind of problem with libata---and yes, you're almost surely right that it's not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used.
Fedora really ought to provide documentation, because there's some distro-specific stuff (like how you deal with the kernel's device node for the root partition changing), and they're using code by default that's at least somewhat documented as experimental (although it doesn't seem to be actually marked as experimental in all cases).

-Daniel
*This .sig left intentionally blank*
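Since which driver claims the controller depends on what is built in rather than on a global switch, one way to sanity-check a kernel configuration before booting it is to look for both candidate drivers in the .config. The following Python sketch does that on a string. The option names CONFIG_BLK_DEV_AMD74XX (old IDE) and CONFIG_PATA_AMD (libata) are, to the best of my knowledge, the relevant pair for an AMD/nVidia PATA controller in kernels of this era, but treat them as an assumption and check your own tree; the sample config and function names are invented for illustration.

```python
def enabled_options(config_text: str) -> dict:
    """Return {option: 'y' or 'm'} for every enabled CONFIG_ line.

    Lines such as '# CONFIG_FOO is not set' are ignored.
    """
    enabled = {}
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            if value in ("y", "m"):
                enabled[name] = value
    return enabled


def amd_pata_claimants(config_text: str) -> list:
    """List which subsystems have a driver that could claim the port.

    The option names are assumptions; verify them against your kernel.
    """
    opts = enabled_options(config_text)
    drivers = []
    if "CONFIG_BLK_DEV_AMD74XX" in opts:
        drivers.append("old IDE (amd74xx)")
    if "CONFIG_PATA_AMD" in opts:
        drivers.append("libata (pata_amd)")
    return drivers


# Hypothetical .config fragment for an "old IDE only" build.
sample_config = """\
CONFIG_IDE=y
CONFIG_BLK_DEV_AMD74XX=y
CONFIG_ATA=y
# CONFIG_PATA_AMD is not set
"""

print(amd_pata_claimants(sample_config))
```

If both entries show up, whichever driver probes first wins, which is the situation the message above warns about.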
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Gene Heskett wrote: While reading this msg as it came back, I locked up again and rebooted to 2.6.24, and got lucky (maybe) as the attached dmesg will show quite a few instances of this LNNNGG before the nvidia driver is loaded to taint the kernel. Have fun guys! On Monday 28 January 2008, Mikael Pettersson wrote: Gene Heskett writes: On Monday 28 January 2008, Peter Zijlstra wrote: On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote: 1. Wrong mailing list; use linux-ide (@vger) instead. What, and keep all us other interested people in the dark? As a test, I tried rebooting to the latest fedora kernel and found it kills X, so I'm back to the second to last fedora version ATM, and the third 'smartctl -t lng /dev/sda' in 24 hours is running now. The first two completed with no errors. I've added the linux-ide list to refresh those people of the problem, the logs are being spammed by this message stanza: Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 28 04:46:25 coyote kernel: [26550.290032] ata1.00: status: { DRDY } Jan 28 04:46:25 coyote kernel: [26550.290060] ata1: soft resetting link Jan 28 04:46:25 coyote kernel: [26550.452301] ata1.00: configured for UDMA/100 Jan 28 04:46:25 coyote kernel: [26550.452318] ata1: EH complete Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) Jan 28 04:46:25 coyote kernel: [26550.456151] sd 0:0:0:0: [sda] Write Protect is off Jan 28 04:46:25 coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA It's not obvious from this incomplete dmesg log what HW or driver is behind ata1, but if the 
2.6.24-rc7 kernel matches the 2.6.24 one, it should be pata_amd driving a WDC disk: [ 30.702887] pata_amd :00:09.0: version 0.3.10 [ 30.703052] PCI: Setting latency timer of device :00:09.0 to 64 [ 30.703188] scsi0 : pata_amd [ 30.709313] scsi1 : pata_amd [ 30.710076] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14 [ 30.710079] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15 [ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100 [ 30.864756] ata1.00: 390721968 sectors, multi 16: LBA48 [ 30.871629] ata1.00: configured for UDMA/100 Unfortunately we also see: [ 48.285456] nvidia: module license 'NVIDIA' taints kernel. [ 48.549725] ACPI: PCI Interrupt :02:00.0[A] - Link [APC4] - GSI 19 (level, high) - IRQ 20 [ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007 We have no way of debugging that module, so please try 2.6.24 without it. Sorry, I can't do this and have a working machine. The nv driver has suffered bit rot or something since the FC2 days when it COULD run a 19 crt at 1600x1200, and will not drive this 20 wide screen lcd 1680x1050 monitor at more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg compressed to 10%. The system is not usable on a day to basis without the nvidia driver. Fix the nv driver so it will run this screen at its native resolution and I'll be glad to run it even if it won't run google earth, which I do use from time to time. Now, if in all the hits you can get from google on this, currently 14,800 just for 'exception Emask', apparently caused by a timeout, if 100% of the complainers are running nvidia drivers also, then I see a legit complaint. Again, fix the nv driver so it will run my screen I'll be glad to switch. I can see the reason, sure, but the machine must be capable of doing its common day to day stuff, while using that driver, like running kde for kmail, and browsers that work. 
If the problems persist, please try to capture a complete log from the failing kernel -- the interesting bits are everything from initial boot up to and including the first few errors. You may need to increase the kernel's log buffer size if the log gets truncated (CONFIG_LOG_BUF_SHIFT). If by log you mean /var/log/messages, I have several megabytes of those. If you mean a live dmesg capture taken right now, its attached. It contains several of these at the bottom. I long ago made the kernel log buffer bigger, cuz it couldn't even show the start immediately after the boot, and even the dump to syslog was truncated. There are no pata_amd changes from 2.6.24-rc7 to 2.6.24 final. That is what I was afraid of. I've done some limited grepping in that branch of the kernel tree, and cannot seem to locate where this EH handler is being invoked from. There is 2
Re: Marvel 88SE6121 driver
Thanks Sergei.

I tested with 0x0407. There is still no interrupt generated from the controller. The following are the register values after a soft reset of the SATA port. Any idea why there is no interrupt from the SATA controller?

Thanks for your help,

Successfully done SATA_DoSoftReset loop = 1388

===SATA Phy Regs on Port-Mmio_Base=0xe100dd00
S-Status(0x0): 0x113
S-Error(0x1): 0x419
S-Control(0x2): 0x0
PhyMode3(0x4): 0xaaa400ac
PhyMode4(0x5): 0x48101cd
PhyMode1(0xB): 0x402aa290
PhyMode2(0xC): 0x392af
TestControl(0xD): 0x80c000
Loopback (0xF): 0x0

===SATA Port Regs on Port-Mmio_Base=0xe100dd00
CMD_BASE_L(0xe100dd00): 0x1ffe
CMD_BASE_H(0xe100dd04): 0x0
FIS_BASE_L(0xe100dd08): 0x1ffe0800
FIS_BASE_H(0xe100dd0c): 0x0
IRQ_STATUS(0xe100dd10): 0x0
IRQ_Enable(0xe100dd14): 0xc0700039
Command(0xe100dd18): 0x60001
TF data(0xe100dd20): 0x110180
Signature(0xe100dd24): 0x101
SATA_STATUS(0xe100dd28): 0x113
SATA_CNTL(0xe100dd2c): 0x0
SATA_Err(0xe100dd30): 0x0
SATA_Act(0xe100dd34): 0x0
CMD_ISSUE(0xe100dd38): 0x0

On 1/28/08, Sergei Shtylyov [EMAIL PROTECTED] wrote:
> Hello.
>
> Mike Zheng wrote:
>
>> From PCI config space, the VENDOR_ID and DEVICE_ID are correct for my driver. However the PCI_COMMAND register of 88SE6121 I got is 0x7, which is the same as default value. Based on the document, it has to be 0x0207 to enable interrupt. However I always failed to do so, the
>
>    Bit 9 (mask 0x0200) is fast back-to-back enable; what does it have to do with interrupts?
>
>> value remains 0x7 after it is re-assigned to the new value 0x0207. So
>
>    This means that fast back-to-back is not supported.
>
>> that I have no interrupt from PCIe as INTA. Do you have any idea on this issue?
>
>    There's also bit 10 (mask 0x0400) meaning INTx emulation *disable*.
>
>> Thanks for your help,
>>
>> Mike Zheng
>
> MBR, Sergei
Re: Problem with ata layer in 2.6.24
Gene Heskett writes: On Monday 28 January 2008, Peter Zijlstra wrote: On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote: 1. Wrong mailing list; use linux-ide (@vger) instead. What, and keep all us other interested people in the dark? As a test, I tried rebooting to the latest fedora kernel and found it kills X, so I'm back to the second to last fedora version ATM, and the third 'smartctl -t lng /dev/sda' in 24 hours is running now. The first two completed with no errors. I've added the linux-ide list to refresh those people of the problem, the logs are being spammed by this message stanza: Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 28 04:46:25 coyote kernel: [26550.290032] ata1.00: status: { DRDY } Jan 28 04:46:25 coyote kernel: [26550.290060] ata1: soft resetting link Jan 28 04:46:25 coyote kernel: [26550.452301] ata1.00: configured for UDMA/100 Jan 28 04:46:25 coyote kernel: [26550.452318] ata1: EH complete Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) Jan 28 04:46:25 coyote kernel: [26550.456151] sd 0:0:0:0: [sda] Write Protect is off Jan 28 04:46:25 coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA It's not obvious from this incomplete dmesg log what HW or driver is behind ata1, but if the 2.6.24-rc7 kernel matches the 2.6.24 one, it should be pata_amd driving a WDC disk: [ 30.702887] pata_amd :00:09.0: version 0.3.10 [ 30.703052] PCI: Setting latency timer of device :00:09.0 to 64 [ 30.703188] scsi0 : pata_amd [ 30.709313] scsi1 : pata_amd [ 30.710076] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14 [ 30.710079] 
ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
[ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100
[ 30.864756] ata1.00: 390721968 sectors, multi 16: LBA48
[ 30.871629] ata1.00: configured for UDMA/100

Unfortunately we also see:

[ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
[ 48.549725] ACPI: PCI Interrupt 0000:02:00.0[A] -> Link [APC4] -> GSI 19 (level, high) -> IRQ 20
[ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007

We have no way of debugging that module, so please try 2.6.24 without it.

If the problems persist, please try to capture a complete log from the failing kernel -- the interesting bits are everything from initial boot up to and including the first few errors. You may need to increase the kernel's log buffer size if the log gets truncated (CONFIG_LOG_BUF_SHIFT).

There are no pata_amd changes from 2.6.24-rc7 to 2.6.24 final.
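On the CONFIG_LOG_BUF_SHIFT hint: the option is a power-of-two exponent, so the kernel log buffer holds 2^shift bytes. A small Python sketch of the arithmetic, for picking a value large enough to keep a full boot log; the 200000-byte figure below is an arbitrary example, not a measurement from this thread.

```python
def log_buf_bytes(shift: int) -> int:
    """Size of the kernel log buffer for a given CONFIG_LOG_BUF_SHIFT."""
    return 1 << shift


def shift_for(nbytes: int) -> int:
    """Smallest shift whose buffer holds at least `nbytes` bytes."""
    shift = 0
    while log_buf_bytes(shift) < nbytes:
        shift += 1
    return shift


print(log_buf_bytes(17))   # 131072 bytes, i.e. 128 KiB
print(shift_for(200_000))  # 18, i.e. a 256 KiB buffer
```

Each step up in the shift doubles the buffer, so going one or two steps beyond what the current boot log needs leaves room for the error stanzas that follow.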
Re: Marvel 88SE6121 driver
Hello.

Mike Zheng wrote:

> From PCI config space, the VENDOR_ID and DEVICE_ID are correct for my driver. However the PCI_COMMAND register of 88SE6121 I got is 0x7, which is the same as default value. Based on the document, it has to be 0x0207 to enable interrupt. However I always failed to do so, the

   Bit 9 (mask 0x0200) is fast back-to-back enable; what does it have to do with interrupts?

> value remains 0x7 after it is re-assigned to the new value 0x0207. So

   This means that fast back-to-back is not supported.

> that I have no interrupt from PCIe as INTA. Do you have any idea on this issue?

   There's also bit 10 (mask 0x0400) meaning INTx emulation *disable*.

> Thanks for your help,
>
> Mike Zheng

MBR, Sergei
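To make the masks discussed above concrete, here is a small Python sketch that decodes a PCI_COMMAND value into the flags that matter in this exchange. The masks are the standard PCI command register bits (0x0200 is bit 9 and 0x0400 is bit 10 when counting from zero); the table is deliberately partial and the function name is invented for illustration.

```python
# Partial table of PCI command register bits, keyed by mask.
PCI_COMMAND_BITS = {
    0x0001: "I/O space enable",
    0x0002: "memory space enable",
    0x0004: "bus master enable",
    0x0200: "fast back-to-back enable",
    0x0400: "INTx disable",
}


def decode_pci_command(value: int) -> list:
    """Return the names of the known bits set in a PCI_COMMAND value."""
    return [name for mask, name in PCI_COMMAND_BITS.items() if value & mask]


for value in (0x0007, 0x0207, 0x0407):
    print(f"0x{value:04x}: {decode_pci_command(value)}")
```

Read this way, 0x0007 already has decoding and bus mastering on with INTx left enabled, 0x0207 only adds fast back-to-back, and 0x0407 would, if the write takes effect, set the INTx disable bit rather than enable interrupts.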
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote: Greeting; I had to reboot early this morning due to a freezeup, and I had a bunch of these in the messages log: == Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY } Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100 Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB) Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA === That one showed up about 2 hours ago, so I expect I'll be locked up again before I've managed a 24 hour uptime. This drive passed a 'smartctl -t long /dev/sda' with flying colors after the reboot this morning. 
Two instances were logged after I had rebooted to 2.6.24 from 2.6.24-rc8:

Jan 24 20:46:33 coyote kernel: [0.00] Linux version 2.6.24 ([EMAIL PROTECTED]) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) #1 SMP Thu Jan 24 20:17:55 EST 2008
Jan 27 02:28:29 coyote kernel: [193207.445158] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan 27 02:28:29 coyote kernel: [193207.445170] ata1.00: cmd 35/00:08:f9:24:0a/00:00:17:00:00/e0 tag 0 dma 4096 out
Jan 27 02:28:29 coyote kernel: [193207.445172]          res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 27 02:28:29 coyote kernel: [193207.445175] ata1.00: status: { DRDY }
Jan 27 02:28:29 coyote kernel: [193207.445202] ata1: soft resetting link
Jan 27 02:28:29 coyote kernel: [193207.607384] ata1.00: configured for UDMA/100
Jan 27 02:28:29 coyote kernel: [193207.607399] ata1: EH complete
Jan 27 02:28:29 coyote kernel: [193207.609681] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
Jan 27 02:28:29 coyote kernel: [193207.619277] sd 0:0:0:0: [sda] Write Protect is off
Jan 27 02:28:29 coyote kernel: [193207.649041] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jan 27 02:30:06 coyote kernel: [193304.336929] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan 27 02:30:06 coyote kernel: [193304.336940] ata1.00: cmd ca/00:20:69:22:a6/00:00:00:00:00/e7 tag 0 dma 16384 out
Jan 27 02:30:06 coyote kernel: [193304.336942]          res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 27 02:30:06 coyote kernel: [193304.336945] ata1.00: status: { DRDY }
Jan 27 02:30:06 coyote kernel: [193304.336972] ata1: soft resetting link
Jan 27 02:30:06 coyote kernel: [193304.499210] ata1.00: configured for UDMA/100
Jan 27 02:30:06 coyote kernel: [193304.499226] ata1: EH complete
Jan 27 02:30:06 coyote kernel: [193304.499714] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
Jan 27 02:30:06 coyote kernel: [193304.499857] sd 0:0:0:0: [sda] Write Protect is off
Jan 27 02:30:06 coyote kernel: [193304.502315] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

None were logged during the time I was running an -rc7 or -rc8. The previous hits on this resulted in the UDMA speed being downgraded until it was actually running in PIO just before the freeze that required the hardware reset button.

Unfortunately there are 1001 different causes for timeouts, so we need to drill down into the hardware, libata version, and ACPI version (most notably).

I'll reboot to -rc8 right now and resume. If it's the drive, I should see it. If not, then 2.6.24 is where I'll point the finger.

There was also an ACPI update, which always affects interrupt handling (whose symptom can sometimes be a timeout). Definitely interested in test results from what you describe.

Jeff

- To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
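For anyone reading these reports, the Emask values in the log are bit flags, and the text in parentheses names only the dominant one. Below is a minimal Python sketch of decoding them. It assumes the flag layout of enum ata_completion_errors in include/linux/libata.h as I remember it for kernels of this era; the helper name decode_emask and the label strings are made up for illustration, so check the table against the tree you are running.

```python
# Bit flags as recalled from enum ata_completion_errors in
# include/linux/libata.h (2.6.24 era). This table is an assumption,
# not copied from the thread; verify it against your kernel source.
AC_ERR_FLAGS = [
    (1 << 0, "device error"),
    (1 << 1, "HSM violation"),
    (1 << 2, "timeout"),
    (1 << 3, "media error"),
    (1 << 4, "ATA bus error"),
    (1 << 5, "host bus error"),
    (1 << 6, "system error"),
    (1 << 7, "invalid argument"),
    (1 << 8, "unknown error"),
]


def decode_emask(emask):
    """Return the labels of every flag set in a libata Emask value."""
    return [label for bit, label in AC_ERR_FLAGS if emask & bit]


if __name__ == "__main__":
    # 0x4 is the value on the "res" lines above; 0x9 and 0x10 show up
    # in other reports in this thread.
    for value in (0x4, 0x9, 0x10):
        print(hex(value), decode_emask(value))
```

Under that assumption, the 0x4 on the res lines is a pure timeout with no device-reported error, which fits the status { DRDY } with no ERR bit.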
[2.6 patch] ide: make wait_drive_not_busy() static again
After commit 7267c3377443322588cddaf457cf106839a60463 wait_drive_not_busy() can become static again.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

 drivers/ide/ide-taskfile.c |    2 +-
 include/linux/ide.h        |    2 --
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/ide/ide-taskfile.c b/drivers/ide/ide-taskfile.c
index 5eb6fa1..d320001 100644
--- a/drivers/ide/ide-taskfile.c
+++ b/drivers/ide/ide-taskfile.c
@@ -260,7 +260,7 @@ static ide_startstop_t task_no_data_intr(ide_drive_t *drive)
 	return ide_stopped;
 }
 
-u8 wait_drive_not_busy(ide_drive_t *drive)
+static u8 wait_drive_not_busy(ide_drive_t *drive)
 {
 	ide_hwif_t *hwif = HWIF(drive);
 	int retries;
diff --git a/include/linux/ide.h b/include/linux/ide.h
index 27cb39d..5d3d88e 100644
--- a/include/linux/ide.h
+++ b/include/linux/ide.h
@@ -986,8 +986,6 @@ ide_startstop_t do_rw_taskfile(ide_drive_t *, ide_task_t *);
 
 void task_end_request(ide_drive_t *, struct request *, u8);
 
-u8 wait_drive_not_busy(ide_drive_t *);
-
 int ide_raw_taskfile(ide_drive_t *, ide_task_t *, u8 *, u16);
 int ide_no_data_taskfile(ide_drive_t *, ide_task_t *);
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Mark Lord wrote:

[ 64.037975] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 64.038102] ata1.00: BMDMA stat 0x65
[ 64.038227] ata1.00: cmd c8/00:58:89:3d:07/00:00:00:00:00/e0 tag 0 dma 45056 in
[ 64.038229]          res 51/40:58:8b:3d:07/00:00:00:00:00/e0 Emask 0x9 (media error)
[ 64.038432] ata1.00: status: { DRDY ERR }
[ 64.038555] ata1.00: error: { UNC }
[ 64.050125] ata1.00: configured for UDMA/100
[ 64.050134] sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
[ 64.050138] sd 0:0:0:0: [sda] Sense Key : 0x3 [current] [descriptor]
[ 64.050142] Descriptor sense data with sense descriptors (in hex):
[ 64.050143]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 64.050149]         00 07 3d 8b
[ 64.050152] sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x4
[ 64.050155] end_request: I/O error, dev sda, sector 474507
..

This error looks somewhat different from the samples posted earlier. This one is quite definitively a bad sector. It should also show up in smartctl -a -data /dev/sda (near the bottom) if SMART was enabled on this drive at boot.

It does not, unforch.

You could try reading that specific sector again just to make sure. One way is to figure out how to use dd for this.

[EMAIL PROTECTED] ~]# dd if=/dev/sda bs=512 skip=474506 count=3
[binary sector contents omitted; readable strings in the output include libkdecorations.so.1.0.0, libkfontinst.so.0.0.0, libkhotkeys_shared.so.1.0.0, libkickermain.so.1.0.0, libkonq.so.4.2.0, libkonqsidebarplugin.so.1.2.0, libksgrd.so.1.2.0, libksplashthemes.so.0.0.0, libtaskbar.so.1.2.0, libtaskmanager.so.1.0.0]
3+0 records in
3+0 records out
1536 bytes (1.5 kB) copied, 6.1403e-05 s, 25.0 MB/s

Another way is to use the make_bad_sector utility that is included in the source tarball for hdparm-7.7, as follows:

make_bad_sector --readback /dev/sda 474507

Apparently not in the rpm, darnit.

(when invoked as above, it does *not* make a bad sector; no worries).

If it reports an I/O error consistently on that, then the sector is indeed faulty, and its contents have long been lost. You can repair the bad sector (but not the original contents) like this:

make_bad_sector --rewrite /dev/sda 474507

Cheers

I'm going up to Clarksburg this afternoon to see if I can find a couple of drives: one a 2.5" bigger than 40GB for my 2.5" Maxtor USB housing, and another PATA drive big enough to run this thing. Then I'll just re-install the December respin after I save as much of this as I can; there's nearly 50GB here now. Maybe it won't be so fscking picky about the next drive.

I was hoping someone could look at that last dmesg I attached, but apparently everybody is blinded by unrelated details, as that bad sector may have been transient, caused by the multiple hardware-reset-type reboots so far today :( The last 3 reboots have interrupted a 'smartctl -t long /dev/sda' in progress. :(

If I reconvert to non-libata, can I do that only for the PATA drives, of which there are 3 here including the DVD writer, and still use libata for the lone SATA drive left? And can I do that without mucking with the device map, which will make amanda/tar attempt to do a level 0 on the whole system if it's changed?

I see the drives are at 254 again. When are they going to be given a stable device address out of the LANANA experimental group so we can reboot without mucking with that and driving tar crazy?

Thanks everybody.

--
Cheers, Gene
There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author)
I just had my entire INTESTINAL TRACT coated with TEFLON!
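The dd invocation in this message reads sectors 474506 through 474508, so the suspect sector sits in the middle of the window. Here is a minimal Python sketch of the same arithmetic and of a raw sector re-read. It assumes 512-byte logical sectors, as the sd log lines report; the function names read_sectors and dd_window are hypothetical, and the demo runs against a scratch file rather than a real disk.

```python
import os
import tempfile

SECTOR_SIZE = 512  # assumed from the "512-byte hardware sectors" sd message


def read_sectors(path, first_lba, count=1, sector_size=SECTOR_SIZE):
    """Read `count` sectors starting at `first_lba`.

    An unreadable sector on a real device would surface as OSError (EIO).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, count * sector_size, first_lba * sector_size)
    finally:
        os.close(fd)


def dd_window(bad_lba, before=1, after=1):
    """Return (skip, count) for dd so the window brackets bad_lba."""
    return bad_lba - before, before + 1 + after


if __name__ == "__main__":
    # Scratch file standing in for the block device.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(i % 256 for i in range(4 * SECTOR_SIZE)))
    try:
        data = read_sectors(tmp.name, first_lba=2)
        print(len(data), "bytes read from sector 2")
        print("dd skip/count for sector 474507:", dd_window(474507))
    finally:
        os.unlink(tmp.name)
```

One caveat, offered as a guess rather than a diagnosis: a 1536-byte read that completes in about 61 microseconds may well have been served from the page cache instead of the platters, so a clean result from a buffered read does not by itself prove the sector is readable. GNU dd's iflag=direct bypasses the cache and might be worth adding when repeating the test.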
Re: Problem with ata layer in 2.6.24
[ 64.037975] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 64.038102] ata1.00: BMDMA stat 0x65
[ 64.038227] ata1.00: cmd c8/00:58:89:3d:07/00:00:00:00:00/e0 tag 0 dma 45056 in
[ 64.038229]          res 51/40:58:8b:3d:07/00:00:00:00:00/e0 Emask 0x9 (media error)
[ 64.038432] ata1.00: status: { DRDY ERR }
[ 64.038555] ata1.00: error: { UNC }
[ 64.050125] ata1.00: configured for UDMA/100
[ 64.050134] sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
[ 64.050138] sd 0:0:0:0: [sda] Sense Key : 0x3 [current] [descriptor]
[ 64.050142] Descriptor sense data with sense descriptors (in hex):
[ 64.050143]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 64.050149]         00 07 3d 8b
[ 64.050152] sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x4
[ 64.050155] end_request: I/O error, dev sda, sector 474507
..

This error looks somewhat different from the samples posted earlier. This one is quite definitively a bad sector. It should also show up in smartctl -a -data /dev/sda (near the bottom) if SMART was enabled on this drive at boot.

You could try reading that specific sector again just to make sure. One way is to figure out how to use dd for this. Another way is to use the make_bad_sector utility that is included in the source tarball for hdparm-7.7, as follows:

make_bad_sector --readback /dev/sda 474507

(when invoked as above, it does *not* make a bad sector; no worries).

If it reports an I/O error consistently on that, then the sector is indeed faulty, and its contents have long been lost. You can repair the bad sector (but not the original contents) like this:

make_bad_sector --rewrite /dev/sda 474507

Cheers
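The sector number in the end_request line can be cross-checked against the cmd/res lines. The sketch below assumes the field order that libata's error handler prints, as far as I recall it: command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device (on a res line the first two bytes are status and error instead). The function names are hypothetical, and the caller has to say whether the command uses 48-bit addressing.

```python
import re

# Assumed layout of libata's "cmd"/"res" strings: one byte, five
# colon-separated bytes, five more (the HOB registers), then device.
TF_RE = re.compile(
    r"^([0-9a-f]{2})/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})/([0-9a-f]{2})$"
)


def parse_fields(text):
    """Split a taskfile string into integers; raise ValueError if malformed."""
    m = TF_RE.match(text.strip().lower())
    if m is None:
        raise ValueError("not a libata taskfile string: %r" % text)
    first = int(m.group(1), 16)
    low = [int(x, 16) for x in m.group(2).split(":")]
    hob = [int(x, 16) for x in m.group(3).split(":")]
    device = int(m.group(4), 16)
    return first, low, hob, device


def lba_of(text, lba48=False):
    """Return the LBA encoded in the taskfile string."""
    _, (_, _, lbal, lbam, lbah), hob, device = parse_fields(text)
    lba = lbal | (lbam << 8) | (lbah << 16)
    if lba48:
        _, _, hob_lbal, hob_lbam, hob_lbah = hob
        return lba | (hob_lbal << 24) | (hob_lbam << 32) | (hob_lbah << 40)
    # LBA28: bits 27..24 live in the low nibble of the device register.
    return lba | ((device & 0x0F) << 24)


def sector_count(text):
    """Return the raw low sector-count byte (0 has a special meaning)."""
    _, (_, nsect, _, _, _), _, _ = parse_fields(text)
    return nsect


if __name__ == "__main__":
    cmd = "c8/00:58:89:3d:07/00:00:00:00:00/e0"
    res = "51/40:58:8b:3d:07/00:00:00:00:00/e0"
    print("request starts at", lba_of(cmd), "for", sector_count(cmd), "sectors")
    print("drive reports failure at", lba_of(res))
```

Read this way, the 0xc8 command starts at sector 474505 and asks for 0x58 = 88 sectors, which is the 45056 bytes shown on the cmd line, and the result registers point at sector 474507, matching the end_request message.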
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Zan Lynx wrote:
On Mon, 2008-01-28 at 11:50 -0500, Calvin Walton wrote:
On Mon, 2008-01-28 at 11:35 -0500, Gene Heskett wrote:
On Monday 28 January 2008, Mikael Pettersson wrote:

Unfortunately we also see:

[ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
[ 48.549725] ACPI: PCI Interrupt :02:00.0[A] -> Link [APC4] -> GSI 19 (level, high) -> IRQ 20
[ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007

We have no way of debugging that module, so please try 2.6.24 without it.

Sorry, I can't do this and have a working machine. The nv driver has suffered bit rot or something since the FC2 days when it COULD run a 19" CRT at 1600x1200, and will not drive this 20" widescreen LCD 1680x1050 monitor at more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg compressed to 10%. The system is not usable on a day-to-day basis without the nvidia driver.

You should probably give the nouveau[1] driver a try, if only for testing purposes; if you are running an NV4x (G6x or G7x) card in particular, it works a lot better than the nv driver for 2d support.

1. http://nouveau.freedesktop.org/wiki/InstallNouveau

But nouveau is much less stable than nv. For testing purposes, go with stable.

I believe at this point, it's moot. I captured quite a few instances of that error message while rebooting the last time, all of which occurred long before I logged in and did a startx (I boot to runlevel 3 here), so the kernel was NOT tainted at that point. That dmesg has been posted and some questions asked.

As this has gone on for a while, it seems to me that with 14,800 Google hits on this problem, Linus should call a halt until this is found and fixed. But I'm not Linus. I'm also locking up for 30 at a time, probably ready for reboot #7 today.

I'm not sure why it won't run his screen though. I can use nv to run a 1920x1200 laptop LCD. It *is* dog slow (although nouveau was not any better with a NV17 / 440-Go -- render support for AA fonts seems to be missing), but it does work.

--
Cheers, Gene
[2.6 patch] small ide-scan-pci.c cleanup
- ide_scan_pcibus() can become static
- instead of ide_scan_pci() we can use ide_scan_pcibus() directly in module_init()

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

 drivers/ide/ide-scan-pci.c |    9 ++---
 include/linux/ide.h        |    1 -
 2 files changed, 2 insertions(+), 8 deletions(-)

03f221931265eda3ae7dfad9eaa8191e171d2c0c
diff --git a/drivers/ide/ide-scan-pci.c b/drivers/ide/ide-scan-pci.c
index 7ffa332..93d2e41 100644
--- a/drivers/ide/ide-scan-pci.c
+++ b/drivers/ide/ide-scan-pci.c
@@ -81,7 +81,7 @@ static int __init ide_scan_pcidev(struct pci_dev *dev)
  *	module ordering not traditionally ordered.
  */
 
-int __init ide_scan_pcibus(void)
+static int __init ide_scan_pcibus(void)
 {
 	struct pci_dev *dev = NULL;
 	struct pci_driver *d;
@@ -113,9 +113,4 @@ int __init ide_scan_pcibus(void)
 	return 0;
 }
 
-static int __init ide_scan_pci(void)
-{
-	return ide_scan_pcibus();
-}
-
-module_init(ide_scan_pci);
+module_init(ide_scan_pcibus);
diff --git a/include/linux/ide.h b/include/linux/ide.h
index 5d3d88e..7072c53 100644
--- a/include/linux/ide.h
+++ b/include/linux/ide.h
@@ -1015,7 +1015,6 @@ void ide_init_disk(struct gendisk *, ide_drive_t *);
 #ifdef CONFIG_IDEPCI_PCIBUS_ORDER
 extern int ide_scan_direction;
-int __init ide_scan_pcibus(void);
 extern int __ide_pci_register_driver(struct pci_driver *driver, struct module *owner, const char *mod_name);
 #define ide_pci_register_driver(d) __ide_pci_register_driver(d, THIS_MODULE, KBUILD_MODNAME)
 #else
[2.6 patch] unexport ide_dma_on
ide_dma_on can be unexported.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

38b0717b827649511b15fbef6f98c891eda835ff
diff --git a/drivers/ide/ide-dma.c b/drivers/ide/ide-dma.c
index 5bf3203..15f8c6a 100644
--- a/drivers/ide/ide-dma.c
+++ b/drivers/ide/ide-dma.c
@@ -474,8 +474,6 @@ void ide_dma_on(ide_drive_t *drive)
 	drive->hwif->dma_host_set(drive, 1);
 }
 
-EXPORT_SYMBOL(ide_dma_on);
-
 #ifdef CONFIG_BLK_DEV_IDEDMA_PCI
 /**
  *	ide_dma_setup	-	begin a DMA phase
Re: Hot (un)plugging of a SATA drive with sata_nv (CK8S)?
(linux-ide cc'ed)

Ignacy Gawedzki wrote:
On Fri, Jan 25, 2008 at 09:03:02PM -0600, thus spake Robert Hancock:
Ignacy Gawedzki wrote:

Hi everyone,

I'm having trouble determining the cause of the following behavior. I'm not even sure that I'm supposed to hot plug and unplug a SATA drive from a nForce3 Ultra (apparently CK8S, on a Gigabyte K8NS Ultra 939 mobo) SATA interface, to begin with. The information is hard to find given that the sata_nv driver supports a range of different hardware.

I've recently acquired an external drive with (among others) an eSATA interface, so I also bought an eSATA-SATA bracket and intend to use that drive (Lacie d2 quadra 500G) through eSATA.

BTW, eSATA cannot technically be converted properly to SATA with a simple connector adapter. eSATA is supposed to use higher signalling voltages and so using such an adapter is not guaranteed to work.

Yeah, apparently this shortens the max cable length to 1 meter. In this case I've got a 1 meter external cable and approx. 30 cm internal (heavily shielded though) cable from the bracket to the SATA port. Anyway, the drive works perfectly if plugged at boot time.

The thing is that if I boot the machine with the drive plugged and turned on, it is properly detected and usable. If, at some point, I want to remove the drive, I unmount any partitions on it and issue the proper scsiadd -r command (usually "scsiadd -r 1 0 0 0", since this is the second SATA drive) and everything is fine (I turn the drive off and unplug it), so far.

Next, when I want to use the drive again, it's still detected alright (although it appears as sdc and not sdb anymore), but the SCSI layer issues "scsi 1:0:0:0: rejecting I/O to dead device" from time to time. Then any "scsiadd -r 1 0 0 0" command fails with "No such device or address", although it appears in the output of "scsiadd -p" or even "scsiadd -s" (always as 1 0 0 0).

If I ignore that detail and switch the drive off, then the kernel eventually notices that the drive is gone and the SCSI layer attempts to stop the device and fails ("[sdc] START_STOP FAILED"). From that moment on, any attempt to plug the drive again fails. The kernel issues "ata2: hard resetting port" and "ata2: port is slow to respond, please be patient (Status 0x80)" periodically, until I switch the drive off.

If the drive is not present at boot, then hot plugging it fails. The kernel first soft resets the port, then issues the "please be patient (Status 0x80)" message, complains that "SRST failed (errno=-16)" and goes on hard resetting the port, issuing "please be patient (Status 0x80)" and complaining that "COMRESET failed (errno=-16)", periodically, until the drive is switched off.

Full dmesg output would be useful..

I repeated the experiments and dumped as much dmesg as I could. The dmesg outputs of both experiments are attached and commented. It seems that if the drive is plugged in at boot time, it remains hot pluggable later (albeit with some strange error messages) after all (or is there another factor that I did not reproduce?).

Thank you for any help. =)

So it seems that unplug/plug works fine if the drive was plugged in at boot, but if it wasn't plugged in on boot it doesn't work when plugged in afterwards. That suggests maybe there is some initialization that the BIOS is doing when the drive is plugged in on boot which we're not doing when one's plugged in afterwards. Unfortunately the lack of public documentation on the NVIDIA SATA hardware makes it difficult to tell whether this is the case..

Any ideas guys? When the drive is plugged in, a stream of this shows up. It would seem like the controller is throwing hotplug interrupts but we never seem to get a SATA link up. This is on nForce3, btw.
ata2: exception Emask 0x10 SAct 0x0 SErr 0x5 action 0x2 frozen
ata2: hard resetting port
ata2: SATA link down (SStatus 0 SControl 300)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x1d action 0x2 frozen
ata2: hard resetting port
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: SRST failed (errno=-16)
ata2: hard resetting port
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: COMRESET failed (errno=-16)
ata2: hard resetting port
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: COMRESET failed (errno=-16)
ata2: limiting SATA link speed to 1.5 Gbps
ata2: hard resetting port
ata2: COMRESET failed (errno=-16)
ata2: reset failed, giving up
ata2: EH pending after completion, repeating EH (cnt=4)
ata2: exception Emask 0x10 SAct 0x0 SErr 0x19d action 0x2 frozen
ata2: hard resetting port
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: COMRESET failed (errno=-16)
ata2: hard resetting port
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: COMRESET failed (errno=-16)
ata2: hard resetting port
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: COMRESET failed (errno=-16)
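The "SStatus 0 SControl 300" pair in the link-down message can be unpacked into its fields. The following is a minimal Python sketch, assuming the standard SATA SCR layout (DET in bits 3..0, SPD in bits 7..4, IPM in bits 11..8) and assuming libata prints both registers in hex. Whether the nForce3 registers follow that layout exactly is not something this thread establishes, and the function name decode_scr is made up.

```python
# Field meanings per the generic SATA SStatus definition; treat them as
# an assumption for this particular controller.
DET_STATUS = {
    0x0: "no device detected, PHY communication not established",
    0x1: "device detected, PHY communication not established",
    0x3: "device detected, PHY communication established",
    0x4: "PHY offline",
}
SPEED = {0x0: "none", 0x1: "1.5 Gbps", 0x2: "3.0 Gbps"}


def decode_scr(value):
    """Split an SStatus or SControl value into its DET/SPD/IPM nibbles."""
    return {
        "det": value & 0xF,
        "spd": (value >> 4) & 0xF,
        "ipm": (value >> 8) & 0xF,
    }


if __name__ == "__main__":
    sstatus = decode_scr(0x0)     # from "SStatus 0"
    scontrol = decode_scr(0x300)  # from "SControl 300"
    print("SStatus :", sstatus, "->", DET_STATUS.get(sstatus["det"], "reserved"))
    print("SControl:", scontrol)
    # 0x113 is not from this thread; it is a typical value for a live
    # 1.5 Gbps link and is shown only for contrast.
    up = decode_scr(0x113)
    print("0x113   :", DET_STATUS[up["det"]], "at", SPEED[up["spd"]])
```

Under those assumptions, SStatus 0 means the PHY saw no device at all, and SControl 0x300 means no speed limit is requested (SPD = 0) while transitions to the Partial and Slumber power states are disabled (IPM = 3). The errno=-16 in the reset failures is EBUSY.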
Re: [2.6 patch] ata_piix.c: make piix_merge_scr() static
Adrian Bunk wrote:

piix_merge_scr() can become static.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

f272ad2ac4274a59f0b43cfd65488c51855132d4
diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c
index a65c8ae..1a5c3bf 100644
--- a/drivers/ata/ata_piix.c
+++ b/drivers/ata/ata_piix.c
@@ -1068,7 +1068,7 @@ static void piix_sidpr_write(struct ata_device *dev, unsigned int reg, u32 val)
 	iowrite32(val, hpriv->sidpr + PIIX_SIDPR_DATA);
 }
 
-u32 piix_merge_scr(u32 val0, u32 val1, const int * const *merge_tbl)
+static u32 piix_merge_scr(u32 val0, u32 val1, const int * const *merge_tbl)
 {
 	u32 val = 0;
 	int i, mi;

Ah.. right. Thanks. I somehow forgot static in front of it.

Acked-by: Tejun Heo [EMAIL PROTECTED]

--
tejun
[2.6 patch] ata_piix.c: make piix_merge_scr() static
piix_merge_scr() can become static.

Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

f272ad2ac4274a59f0b43cfd65488c51855132d4
diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c
index a65c8ae..1a5c3bf 100644
--- a/drivers/ata/ata_piix.c
+++ b/drivers/ata/ata_piix.c
@@ -1068,7 +1068,7 @@ static void piix_sidpr_write(struct ata_device *dev, unsigned int reg, u32 val)
 	iowrite32(val, hpriv->sidpr + PIIX_SIDPR_DATA);
 }
 
-u32 piix_merge_scr(u32 val0, u32 val1, const int * const *merge_tbl)
+static u32 piix_merge_scr(u32 val0, u32 val1, const int * const *merge_tbl)
 {
 	u32 val = 0;
 	int i, mi;
Marvell 88SE6121 driver
Hi All,

I am developing a device driver for the 88SE6121 on kernel 2.4 for a PowerPC 8568 board, which has a big-endian CPU.

In PCI config space, the VENDOR_ID and DEVICE_ID read back correctly for my driver. However, the PCI_COMMAND register of the 88SE6121 reads 0x7, which is the same as the default value. According to the document, it has to be 0x0207 to enable the interrupt. I have always failed to set it: the value remains 0x7 after I write the new value 0x0207. As a result I get no INTA interrupt from PCIe.

Do you have any idea on this issue?

Thanks for your help,
Mike Zheng
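It may help to see which bits actually differ between the two PCI_COMMAND values in this report. The sketch below decodes them in Python using the generic PCI command register layout (the one behind the kernel's PCI_COMMAND_* constants, as I understand it). It is a reading of the generic layout, not of the 88SE6121 datasheet, and the name decode_pci_command is hypothetical.

```python
# Generic PCI command register bits; an assumption as far as this
# specific chip is concerned.
PCI_COMMAND_BITS = [
    (0x0001, "I/O space enable"),
    (0x0002, "memory space enable"),
    (0x0004, "bus master enable"),
    (0x0008, "special cycles"),
    (0x0010, "memory write and invalidate"),
    (0x0020, "VGA palette snoop"),
    (0x0040, "parity error response"),
    (0x0080, "address/data stepping"),
    (0x0100, "SERR# enable"),
    (0x0200, "fast back-to-back enable"),
    (0x0400, "INTx disable"),
]


def decode_pci_command(value):
    """Return the labels of the bits set in a PCI_COMMAND value."""
    return [label for bit, label in PCI_COMMAND_BITS if value & bit]


if __name__ == "__main__":
    current, wanted = 0x0007, 0x0207
    print("current:", decode_pci_command(current))
    print("wanted :", decode_pci_command(wanted))
    print("differs:", decode_pci_command(current ^ wanted))
```

Under that layout the two values differ only in bit 9, fast back-to-back enable, which as far as I know PCI Express functions hardwire to zero; that would be consistent with the write not sticking. Legacy INTx is gated by bit 10 (a disable bit), which is already clear in 0x7, so the missing interrupt may have a different cause, such as interrupt routing on the board.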
Re: Problem with ata layer in 2.6.24
On Mon, 28 Jan 2008, Gene Heskett wrote:

I believe at this point, its moot. I captured quite a few instances of that error message while rebooting the last time, all of which occurred long before I logged in and did a startx (I boot to runlevel 3 here), so the kernel was NOT tainted at that point. That dmesg has been posted and some questions asked.

As this has gone on for a while, it seems to me that with 14,800 google hits on this problem, Linus should call a halt until this is found and fixed. But I'm not Linus. I'm also locking up for 30 at a time, probably ready for reboot #7 today.

Can you switch back to old IDE to get your work done (and to make sure it's not a hardware issue that's developed recently)?

I believe libata is just a whole lot pickier about behavior than the IDE subsystem was, so it's more likely to complain about stuff, both for good reasons and when it shouldn't, and there are a slew of potential "we have to accept that old PATA hardware does this" bugs that all have the same symptom of "we go into error handling when nothing is actually wrong", hence the vast quantity of hits. I think it's not so much that it's a common problem as that it's a lot of problems that aren't very distinguishable.

-Daniel
*This .sig left intentionally blank*
Re: fixed a bug of adma in rhel4u5 with HDS7250SASUN500G.
Kuan Luo wrote:
Robert wrote:

Kuan, does this patch (using the notifiers to see if the command is really done) still work if one port on the controller has ADMA disabled because it's in ATAPI mode? I seem to recall Allen Martin mentioning that notifiers wouldn't work in this case.

I just tried the 2.6.24-rc7 sata_nv driver with one hd and one cdrom on the same controller. I ran mkfs on the hd and mounted the cdrom, and no error happened.

Allen, is there anything about notifiers that we should pay attention to?

Assuming not, then this patch should be applied..
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Daniel Barkalow wrote:
On Mon, 28 Jan 2008, Richard Heck wrote:
Daniel Barkalow wrote:

Can you switch back to old IDE to get your work done (and to make sure it's not a hardware issue that's developed recently)?

I think it'd be really, REALLY helpful to a lot of people if you, or someone, could explain in moderate detail how this might be done. I tried doing it myself, but I'm not sufficiently expert at configuring kernels that I was ever able to figure out how to do it.

As far as configuring the kernel, I can help: Go to "Device Drivers", "ATA/ATAPI/MFM/RLL support", and turn on anything that looks relevant; go to "Device Drivers", "Serial ATA and Parallel ATA drivers", and turn off anything that's PATA and looks relevant.

Done.

(Whether a device uses IDE or PATA depends on which driver that supports the device is present and finds it first, not on any sort of global configuration, which is probably what tripped you up.)

Building this and installing it along with the appropriate initrd (which might be handled by Fedora's install scripts)

Or mine, which I've been using for years.

will either get you back to old IDE or will make your kernel panic on boot, depending on whether you got it right (so make sure you can still boot the kernel you're sure of or something from a boot disk). This will also cause your hard drives to show up as different device nodes, so if your boot process doesn't mount by disk uuid but by some other feature (and I don't know what Fedora does), you'll also need to change it to something either stable across access methods or which works for the one you're now using.

It mounts by LABEL=. All of it.

Obviously, the short version is: switch back to Fedora 6. But this kind of problem with libata---and yes, you're almost surely right that it's not one problem but lots---is sufficiently widespread that a Mini HOWTO, say, would be really welcome and, I'm guessing, widely used.

Fedora really ought to provide documentation, because there's some distro-specific stuff (like how you deal with the kernel's device node for the root partition changing), and they're using code by default that's at least somewhat documented as experimental (although it doesn't seem to be actually marked as experimental in all cases).

Fedora is not the only one having trouble; name a distro, it's probably someplace in those 14,800 Google hits.

-Daniel
*This .sig left intentionally blank*

Thanks Daniel, try #1 is building now.

--
Cheers, Gene
DMA mapping on SCSI device?
We've got a bit of a problem with the sata_nv driver that I'm trying to figure out a decent solution to (hence all the lists CCed). This is the situation:

The nForce4 ADMA hardware has 2 modes: legacy mode, where it acts like a normal ATA controller with 32-bit DMA limits, and ADMA mode where it can access all of 64-bit memory. Each PCI device has 2 SATA ports, and the legacy/ADMA mode can be controlled independently on both of them.

The trick is that if an ATAPI device is connected, we (as far as I'm aware) can't use ADMA mode, so we have to switch that port into legacy mode. This means it's only capable of 32-bit DMA. However the other port on the controller may be connected to a hard drive and therefore still capable of 64-bit DMA. (To make things more complicated, devices can be hotplugged and so this can change dynamically.) Since the device that libata is doing DMA mapping against is attached to the PCI device and not the port, it creates a problem here. If we change the mask on one it affects the other one as well.

The original solution used by the driver was to leave the DMA mask at 64-bit and use blk_queue_bounce_limit to try to force the block layer not to send any requests with DMA addresses over 4GB into the driver. However it seems on x86_64 this doesn't work, since it pushes high addresses through anyway and expects the IOMMU to take care of it (which it doesn't because of the 64-bit mask).

The last solution I tried was to set the DMA mask on both ports to 32-bit on slave_configure when an ATAPI device is connected. However, this runs into complications as well. This is run on initialization and when trying to set the other port into 32-bit DMA, it may not be initialized yet. Plus, it forces the port with a hard drive on it into 32-bit DMA needlessly.

The ideal solution would be to do mapping against a different struct device for each port, so that we could maintain the proper DMA mask for each of them at all times. However I'm not sure if that's possible. The thought of using the SCSI struct device for DMA mapping was brought up at one point.. any thoughts on that?
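To make the shared-mask problem concrete, here is a toy Python model. It is not kernel code: the names dma_bit_mask, fits_mask, shared_mask and per_port_masks are all hypothetical, and the only facts taken from the message are that a legacy-mode port is limited to 32-bit DMA, an ADMA-mode port can address 64 bits, and one struct device currently covers both ports.

```python
# Address width each port mode can reach, per the description above.
MODE_BITS = {"legacy": 32, "adma": 64}


def dma_bit_mask(bits):
    """Mask with the low `bits` bits set, like the kernel's DMA mask macros."""
    return (1 << bits) - 1


def fits_mask(addr, length, mask):
    """True if the whole buffer [addr, addr + length) is reachable."""
    return length > 0 and addr + length - 1 <= mask


def shared_mask(port_modes):
    """One mask for the whole PCI device: the narrowest port wins."""
    return dma_bit_mask(min(MODE_BITS[m] for m in port_modes))


def per_port_masks(port_modes):
    """The 'ideal' case: each port keeps the mask matching its own mode."""
    return [dma_bit_mask(MODE_BITS[m]) for m in port_modes]


if __name__ == "__main__":
    ports = ["adma", "legacy"]        # hard drive on port 0, ATAPI on port 1
    high_buffer = (0x1_0000_0000, 4096)  # a page just above 4 GB

    print("shared mask reaches it:", fits_mask(*high_buffer, shared_mask(ports)))
    for mode, mask in zip(ports, per_port_masks(ports)):
        print(mode, "port reaches it:", fits_mask(*high_buffer, mask))
```

With a single shared mask, the hard-drive port loses access to memory above 4 GB as soon as its neighbour drops to legacy mode, which is the needless restriction described above; a mask per port would keep the two independent.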
[PATCH 2.6.24] ata_piix: IDE mode SATA patch for Intel ICH10 DeviceID's
This patch adds the Intel ICH10 IDE mode SATA Controller DeviceID's.

Signed-off-by: Jason Gaston [EMAIL PROTECTED]

--- linux-2.6.24/drivers/ata/ata_piix.c.orig	2008-01-24 14:58:37.0 -0800
+++ linux-2.6.24/drivers/ata/ata_piix.c	2008-01-28 14:58:22.0 -0800
@@ -263,6 +263,14 @@
 	{ 0x8086, 0x292e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
 	/* SATA Controller IDE (Tolapai) */
 	{ 0x8086, 0x5028, PCI_ANY_ID, PCI_ANY_ID, 0, 0, tolapai_sata_ahci },
+	/* SATA Controller IDE (ICH10) */
+	{ 0x8086, 0x3a00, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
+	/* SATA Controller IDE (ICH10) */
+	{ 0x8086, 0x3a06, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_2port_sata },
+	/* SATA Controller IDE (ICH10) */
+	{ 0x8086, 0x3a20, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
+	/* SATA Controller IDE (ICH10) */
+	{ 0x8086, 0x3a26, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_2port_sata },
 
 	{ }	/* terminate list */
 };
Re: Problem with ata layer in 2.6.24
On Mon, 28 Jan 2008, Gene Heskett wrote:
On Monday 28 January 2008, Daniel Barkalow wrote:

Building this and installing it along with the appropriate initrd (which might be handled by Fedora's install scripts)

Or mine, which I've been using for years.

You're ahead of a surprising number of people, including me, if you understand making initrds.

will either get you back to old IDE or will make your kernel panic on boot, depending on whether you got it right (so make sure you can still boot the kernel you're sure of or something from a boot disk). This will also cause your hard drives to show up as different device nodes, so if your boot process doesn't mount by disk uuid but by some other feature (and I don't know what Fedora does), you'll also need to change it to something either stable across access methods or which works for the one you're now using.

It mounts by LABEL=. All of it.

That'll save a huge amount of hassle. So long as you manage to get the right drivers included and the wrong drivers not included, you should be pretty much set.

Fedora is not the only people having trouble, name a distro, its probably someplace in that 14,800 hit google returns.

Yeah, but they each may need different instructions, particularly if they're not mounting by label in general, or not mounting the root partition by label. That was the big hassle going the opposite direction. And the procedure is 4 lines to describe to somebody who knows how to build and install a new kernel for the distro, which is much shorter than the explanation of how you generally build and install a kernel. A real howto would have to explain where to get the distro's kernel sources and default configuration, for example.

-Daniel
*This .sig left intentionally blank*
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Daniel Barkalow wrote:
On Mon, 28 Jan 2008, Gene Heskett wrote:
On Monday 28 January 2008, Daniel Barkalow wrote:

Building this and installing it along with the appropriate initrd (which might be handled by Fedora's install scripts)

Or mine, which I've been using for years.

You're ahead of a surprising number of people, including me, if you understand making initrds.

In my script, it's one line:

mkinitrd -f initrd-$VER.img $VER \

where $VER is the shell variable I edit to = the version number, located at the top of the script.

Unforch, it's failing:

No module pata_amd found for kernel 2.6.24, aborting.

This is with pata_amd turned off and its counterpart under ATA/RLL/etc turned on. So something is still dependent on it.

I do have one SATA drive, on an accessory card in the box, so I need the rest of the sata_sil and friends stuff. It's my virtual tapes for amanda. Also home built; the amanda security model cannot be successfully bent into the shape of an rpm. They BTW are #2 on coverity's list of most secure software.

So I've rebuilt 2.6.24 as it originally was, and added the acpi timer line to the 2.6.24-rc8 stanza's kernel argument list. It will boot one or the other when I next reboot. It's been about 8 hours since the last error was logged, which is totally weirdsville to this old fart. Phase of the moon maybe? The visit to the sawbones to see about my heart? They are going to fit me with a 30 day recorder tomorrow; my skip-a-beat problem is getting worse. The sort of stuff that goes with the 7nth decade I guess. Officially, I'm wearing out me, too much sugar, too many times nearly electrocuted=shingles yadda yadda. :-) Oh, and don't forget Arther, he moved in uninvited about 25 years ago too. Those people that talk about the golden years? They're full of excrement...

will either get you back to old IDE or will make your kernel panic on boot, depending on whether you got it right (so make sure you can still boot the kernel you're sure of or something from a boot disk). This will also cause your hard drives to show up as different device nodes, so if your boot process doesn't mount by disk uuid but by some other feature (and I don't know what Fedora does), you'll also need to change it to something either stable across access methods or which works for the one you're now using.

It mounts by LABEL=. All of it.

That'll save a huge amount of hassle. So long as you manage to get the right drivers included and the wrong drivers not included, you should be pretty much set.

Fedora is not the only people having trouble, name a distro, its probably someplace in that 14,800 hit google returns.

Yeah, but they each may need different instructions, particularly if they're not mounting by label in general, or not mounting the root partition by label. That was the big hassle going the opposite direction. And the procedure is 4 lines to describe to somebody who knows how to build and install a new kernel for the distro, which is much shorter than the explanation of how you generally build and install a kernel. A real howto would have to explain where to get the distro's kernel sources and default configuration, for example.

-Daniel
*This .sig left intentionally blank*

--
Cheers, Gene
Re: Problem with ata layer in 2.6.24
On Mon, 28 Jan 2008, Gene Heskett wrote:
On Monday 28 January 2008, Daniel Barkalow wrote:
On Mon, 28 Jan 2008, Gene Heskett wrote:
On Monday 28 January 2008, Daniel Barkalow wrote:

Building this and installing it along with the appropriate initrd (which might be handled by Fedora's install scripts)

Or mine, which I've been using for years.

You're ahead of a surprising number of people, including me, if you understand making initrds.

In my script, it's one line:

mkinitrd -f initrd-$VER.img $VER \

where $VER is the shell variable I edit to = the version number, located at the top of the script. Unforch, it's failing:

No module pata_amd found for kernel 2.6.24, aborting.

This is with pata_amd turned off and its counterpart under ATA/RLL/etc turned on. So something is still dependent on it.

That looks like something in the guts of the initrd; it probably thinks you need pata_amd and it's unhappy that you don't have it.

Actually, another thing to try is making the ATA/etc one be y and pata_amd be m. Most likely, this should lead to the ATA one claiming the drive before the module is loaded (but the module would be loaded later, to avoid upsetting the initrd); you should be able to tell from dmesg (or /dev, for that matter) which one got it, and I think built-in drivers will claim everything they can before an initrd gets loaded.

I do have one sata drive, on an accessory card in the box, so I need the rest of the sata_sil and friends stuff.

Assuming it isn't picking up your hard drive, which it isn't, that shouldn't matter.

-Daniel
*This .sig left intentionally blank*
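As a rough illustration of the "you can tell from /dev which one got it" remark: the old IDE driver names whole disks hdX, while libata presents them through the SCSI layer as sdX. The sketch below only classifies node names; the function names are made up for this example, and an sdX node can just as well be a real SCSI or USB disk, so treat the result as a hint rather than proof.

```python
import re

def classify_disk_node(name):
    """Guess which stack owns a whole-disk node name such as 'hda' or 'sda'."""
    if re.fullmatch(r"hd[a-z]+", name):
        return "old IDE driver"
    if re.fullmatch(r"sd[a-z]+", name):
        # libata disks appear here, but so do real SCSI and USB disks.
        return "SCSI layer (libata or other)"
    return "unknown"

def summarize(names):
    """Keep only whole-disk nodes and map each one to its likely owner."""
    result = {}
    for name in sorted(names):
        owner = classify_disk_node(name)
        if owner != "unknown":
            result[name] = owner
    return result
```

On a live system one might call summarize(os.listdir("/dev")) after importing os; partitions such as hda1 are ignored on purpose.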
RE: fixed a bug of adma in rhel4u5 with HDS7250SASUN500G.
Robert wrote:
Kuan Luo wrote:
Robert wrote:

Kuan, does this patch (using the notifiers to see if the command is really done) still work if one port on the controller has ADMA disabled because it's in ATAPI mode? I seem to recall Allen Martin mentioning that notifiers wouldn't work in this case.

I just tried the 2.6.24-rc7 sata_nv driver with one hd and one cdrom in the same controller. I mkfs hd and mounted the cdrom and no error happened. Allen, is there anything about notifier that we should pay attention to?

Assuming not, then this patch should be applied..

I am asking someone about the issue. I will be getting a concrete response soon.

---
This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.
---
Re: Problem with ata layer in 2.6.24
Gene Heskett wrote:
..
That's ok, dd seemed to do the job also.
..

The two programs operate entirely differently from each other, so it may still be worth trying the make_bad_sector utility there.

dd goes through the regular kernel I/O calls, whereas make_bad_sector sends raw ATA commands directly (more or less) to the drive.

-ml
Re: DMA mapping on SCSI device?
On Jan 29, 2008 11:08 AM, Robert Hancock [EMAIL PROTECTED] wrote:
...
The last solution I tried was to set the DMA mask on both ports to 32-bit on slave_configure when an ATAPI device is connected. However, this runs into complications as well. This is run on initialization and when trying to set the other port into 32-bit DMA, it may not be initialized yet. Plus, it forces the port with a hard drive on it into 32-bit DMA needlessly.

Have you measured the impact of setting the PCI dma mask to 32-bit?

Last time Alex Williamson (HP) measured this on IA64, we deliberately forced pci_map_sg() to use the IOMMU even for devices that were 64-bit capable. We got 3-5% better throughput since the device had fewer entries to retrieve and the devices (at the time) weren't that good at processing SG lists.

The ideal solution would be to do mapping against a different struct device for each port, so that we could maintain the proper DMA mask for each of them at all times. However I'm not sure if that's possible. The thought of using the SCSI struct device for DMA mapping was brought up at one point.. any thoughts on that?

I'm pretty sure that's not possible (using two PCI dev structs). I'm skeptical it's worth converting DMA services to use SCSI devs since that's an extremely invasive change for a marginal benefit.

hth,
grant
Re: sata_sil24 / Alpha / 4726 PMP issue ..
No like with only a drive attached and no PMP. The driver is still unable to IDENTIFY the connected disk:

failed to IDENTIFY (I/O error, err_mask=0x20)

On Jan 28, 2008 1:51 AM, Thomas Evans [EMAIL PROTECTED] wrote:
On 1/27/08, Tejun Heo [EMAIL PROTECTED] wrote:

Hmmm... That's strange. If PERR makes a difference, it means PCI bus side is contributing to the problem but only when PMP is attached while directly attached drive works just fine?

I need to get an esata to sata cable - I returned all my duplicate equipment, so I haven't a 3124 with internal sata ports. I will try soon.
Re: DMA mapping on SCSI device?
On Mon, Jan 28, 2008 at 06:08:44PM -0600, Robert Hancock wrote:

The thought of using the SCSI struct device for DMA mapping was brought up at one point.. any thoughts on that?

I believe this will work on some architectures and not others. Anything that uses include/asm-generic/dma-mapping.h will break, for example. It would be nice for those architectures to get fixed ...

--
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step.
Re: sata_sil24 / Alpha / 4726 PMP issue ..
Thomas Evans wrote:

No like with only a drive attached and no PMP. The driver is still unable to IDENTIFY the connected disk:

failed to IDENTIFY (I/O error, err_mask=0x20)

So, the problem is not specific to PMP support. That makes more sense. Does moving the controller to a different slot make any difference?

--
tejun
Re: Problem with ata layer in 2.6.24
On Mon, 2008-01-28 at 11:35 -0500, Gene Heskett wrote:
On Monday 28 January 2008, Mikael Pettersson wrote:
Gene Heskett writes:
On Monday 28 January 2008, Peter Zijlstra wrote:
On Mon, 2008-01-28 at 09:17 +0100, Mikael Pettersson wrote:

1. Wrong mailing list; use linux-ide (@vger) instead.

What, and keep all us other interested people in the dark?

As a test, I tried rebooting to the latest fedora kernel and found it kills X, so I'm back to the second to last fedora version ATM, and the third 'smartctl -t long /dev/sda' in 24 hours is running now. The first two completed with no errors.

I've added the linux-ide list to refresh those people of the problem, the logs are being spammed by this message stanza:

Jan 28 04:46:25 coyote kernel: [26550.290016] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan 28 04:46:25 coyote kernel: [26550.290028] ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out
Jan 28 04:46:25 coyote kernel: [26550.290029] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 28 04:46:25 coyote kernel: [26550.290032] ata1.00: status: { DRDY }
Jan 28 04:46:25 coyote kernel: [26550.290060] ata1: soft resetting link
Jan 28 04:46:25 coyote kernel: [26550.452301] ata1.00: configured for UDMA/100
Jan 28 04:46:25 coyote kernel: [26550.452318] ata1: EH complete
Jan 28 04:46:25 coyote kernel: [26550.455898] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
Jan 28 04:46:25 coyote kernel: [26550.456151] sd 0:0:0:0: [sda] Write Protect is off
Jan 28 04:46:25 coyote kernel: [26550.456403] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

It's not obvious from this incomplete dmesg log what HW or driver is behind ata1, but if the 2.6.24-rc7 kernel matches the 2.6.24 one, it should be pata_amd driving a WDC disk:

[ 30.702887] pata_amd :00:09.0: version 0.3.10
[ 30.703052] PCI: Setting latency timer of device :00:09.0 to 64
[ 30.703188] scsi0 : pata_amd
[ 30.709313] scsi1 : pata_amd
[ 30.710076] ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14
[ 30.710079] ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15
[ 30.864753] ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100
[ 30.864756] ata1.00: 390721968 sectors, multi 16: LBA48
[ 30.871629] ata1.00: configured for UDMA/100

Unfortunately we also see:

[ 48.285456] nvidia: module license 'NVIDIA' taints kernel.
[ 48.549725] ACPI: PCI Interrupt :02:00.0[A] - Link [APC4] - GSI 19 (level, high) - IRQ 20
[ 48.550149] NVRM: loading NVIDIA UNIX x86 Kernel Module 169.07 Thu Dec 13 18:42:56 PST 2007

We have no way of debugging that module, so please try 2.6.24 without it.

Sorry, I can't do this and have a working machine. The nv driver has suffered bit rot or something since the FC2 days when it COULD run a 19 crt at 1600x1200, and will not drive this 20 wide screen lcd 1680x1050 monitor at more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg compressed to 10%. The system is not usable on a day to day basis without the nvidia driver. Fix the nv driver so it will run this screen at its native resolution and I'll be glad to run it even if it won't run google earth, which I do use from time to time.

Now, if in all the hits you can get from google on this, currently 14,800 just for 'exception Emask', apparently caused by a timeout, if 100% of the complainers are running nvidia drivers also, then I see a legit

I can invalidate this theory... I helped a guy on irc debug this problem, and he had ati. I tried having him stop using fglrx, and go to r300.. same problem, and same problem even with vesa.. :)

Also, I have this on my fileserver with .20, which doesn't even run X, or have module support in the kernel :)

complaint. Again, fix the nv driver so it will run my screen I'll be glad to switch.
I can see the reason, sure, but the machine must be capable of doing its common day to day stuff, while using that driver, like running kde for kmail, and browsers that work. If the problems persist, please try to capture a complete log from the failing kernel -- the interesting bits are everything from initial boot up to and including the first few errors. You may need to increase the kernel's log buffer size if the log gets truncated (CONFIG_LOG_BUF_SHIFT). If by log you mean /var/log/messages, I have several megabytes of those. If you mean a live dmesg capture taken right now, its attached. It contains several of these at the bottom. I long ago made the kernel log buffer bigger, cuz it couldn't even show the start immediately after the boot, and even the dump to syslog was truncated. There are no pata_amd changes from 2.6.24-rc7 to 2.6.24 final. That is what I was afraid of. I've done
Re: sata_sil24 / Alpha / 4726 PMP issue ..
I've tried it in various slots on both PCI hoses - no difference.

...tom

Tejun Heo wrote:
Thomas Evans wrote:

No like with only a drive attached and no PMP. The driver is still unable to IDENTIFY the connected disk:

failed to IDENTIFY (I/O error, err_mask=0x20)

So, the problem is not specific to PMP support. That makes more sense. Does moving the controller to different slot make any difference?
Re: Problem with ata layer in 2.6.24
On Monday 28 January 2008, Kasper Sandberg wrote: [...] We have no way of debugging that module, so please try 2.6.24 without it. Sorry, I can't do this and have a working machine. The nv driver has suffered bit rot or something since the FC2 days when it COULD run a 19 crt at 1600x1200, and will not drive this 20 wide screen lcd 1680x1050 monitor at more than 800x600, which is absolutely butt ugly fuzzy, looking like a jpg compressed to 10%. The system is not usable on a day to basis without the nvidia driver. Fix the nv driver so it will run this screen at its native resolution and I'll be glad to run it even if it won't run google earth, which I do use from time to time. Now, if in all the hits you can get from google on this, currently 14,800 just for 'exception Emask', apparently caused by a timeout, if 100% of the complainers are running nvidia drivers also, then I see a legit I can invalidate this theory... i helped a guy on irc debug this problem, and he had ati. I tried having him stop using fglrx, and go to r300.. same problem, and same problem even with vesa.. :) No Kasper, you are validating it, that it is not nvidia related, which is what I was also saying. also, i have this on my fileserver with .20, which doesent even run X, or module support in kernel :) That far back? Although ISTR I saw it happen once only when I was running 2.6.18-somethingorother. complaint. Again, fix the nv driver so it will run my screen I'll be glad to switch. I can see the reason, sure, but the machine must be capable of doing its common day to day stuff, while using that driver, like running kde for kmail, and browsers that work. If the problems persist, please try to capture a complete log from the failing kernel -- the interesting bits are everything from initial boot up to and including the first few errors. You may need to increase the kernel's log buffer size if the log gets truncated (CONFIG_LOG_BUF_SHIFT). 
If by log you mean /var/log/messages, I have several megabytes of those. If you mean a live dmesg capture taken right now, its attached. It contains several of these at the bottom. I long ago made the kernel log buffer bigger, cuz it couldn't even show the start immediately after the boot, and even the dump to syslog was truncated. There are no pata_amd changes from 2.6.24-rc7 to 2.6.24 final. That is what I was afraid of. I've done some limited grepping in that branch of the kernel tree, and cannot seem to locate where this EH handler is being invoked from. There is 2 lines of interest in the dmesg: [0.00] Nvidia board detected. Ignoring ACPI timer override. [0.00] If you got timer trouble try acpi_use_timer_override But I have NDI what it means, kernel argument/xconfig option? I've also done some googling, and it appears this problem is fairly widespread since the switchover to libata was encouraged. A stock fedora F8 kernel suffers the same freezes and eventually locks up, but does it without the error messages being logged, it just freezes, feeling identical to this in the minutes before the total freeze. I've tried 2 of those too, but the newest one won't even run X. -- Cheers, Gene There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. -Ed Howdershelt (Author) bureaucrat, n: A politician who has tenure. - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
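For anyone trying to read the EH stanzas quoted in this thread, here is a minimal sketch of decoding the "cmd" taskfile field. It assumes the byte order that libata's error handler prints, namely command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device; the opcode table is deliberately limited to the two opcodes seen in these logs, and the function name is invented for the example.

```python
import re

# Only the two opcodes that appear in this thread's logs.
ATA_OPCODES = {0x35: "WRITE DMA EXT", 0xCA: "WRITE DMA"}
LBA48_OPCODES = {0x35}

def decode_cmd(line, sector_size=512):
    """Decode the 'cmd' taskfile bytes of a libata EH log line.

    Assumed field order: command/feature:nsect:lbal:lbam:lbah/
    hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
    """
    m = re.search(r"cmd ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})", line)
    if m is None:
        raise ValueError("no cmd taskfile found in line")
    b = [int(x, 16) for x in re.split(r"[/:]", m.group(1))]
    cmd, nsect, lbal, lbam, lbah = b[0], b[2], b[3], b[4], b[5]
    hob_nsect, hob_lbal, hob_lbam, hob_lbah, dev = b[7], b[8], b[9], b[10], b[11]
    if cmd in LBA48_OPCODES:
        # 48-bit addressing: 16-bit count, 0 means 65536 sectors.
        count = ((hob_nsect << 8) | nsect) or 65536
        lba = ((hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
               | (lbah << 16) | (lbam << 8) | lbal)
    else:
        # 28-bit addressing: 8-bit count, 0 means 256 sectors;
        # the top four LBA bits live in the low nibble of the device byte.
        count = nsect or 256
        lba = ((dev & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
    return {
        "opcode": ATA_OPCODES.get(cmd, "0x%02x" % cmd),
        "lba": lba,
        "sectors": count,
        "bytes": count * sector_size,
    }
```

Fed the line "ata1.00: cmd 35/00:58:c9:9c:0a/00:01:00:00:00/e0 tag 0 dma 176128 out", this gives 344 sectors, which at 512 bytes each is 176128 bytes and so agrees with the DMA size the kernel printed on the same line. Both stanzas in this thread decode to writes that timed out.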
Re: [PATCH 0/32] ide-tape redux v1
Hi Bart, [...] the BKL in idetape_write_release() with finer-grained locking etc, probably also some pipeline improvements, removal of OnStream support, etc. but that'll come later. On-Stream support has been long gone but it seems that deprecation warning etc. managed to survive. w.r.t. to the pipeline-mode: it should be pipelined into /dev/null rationale: - it is _very_ complex - causes errors to be deferred till the next user-space access - direct I/O using blk_rq_map_user() will offer superior performance the only question is whether to remove it... Well, on the one hand, since the driver is only being maintained we should not remove code that works. Also, i don't know how many users ide-tape really has but, would it be worth the trouble at all? Because if nobody's using it, we could just as well pipe the whole thing into /dev/null.. On the other hand, the pipelining part _is_ kinda big and, right, it is not that straightfoward to look at it and know what it actually does - it truly is a student project :) Documentation/ide/ChangeLog.ide-tape.1995-2002 | 405 +++ drivers/ide/Kconfig|3 +- drivers/ide/ide-tape.c | 4146 +--- 3 files changed, 1991 insertions(+), 2563 deletions(-) [...] BTW what happend to patch #23? Well, it appeared in my lkml mailbox having gone over vger which means at least somebody got it :). But, yeah, that was a real nightmare yesterday sending all those patches in one go. See, i got a stupid umts modem behind a not so transparent proxy :) whose subnet is listed in almost every spam database on the planet and whenever i try to send more than one mail i hit all sorts of mail server restrictions like yahoo's maximum messages per day crap.. Gmail seems a bit smarter ?! 
and scans the mail message and then says all kinds of funny stuff :): 27 10:48:31 gollum postfix/smtp[4011]: F1710123BFD: to=linux-ide@vger.kernel.org, relay=vger.kernel.org[209.132.176.167]:25, delay=10, delays=0.19/0.29/2.7/7.2, dsn=2.7.1, status=sent (250 2.7.1 Looks like Linux source DIFF email.. BF:H 1.55041e-06; S1753942AbYA0Js4) what's next, probably something like: ...(250 3.x.x uh, ok, i'm gonna relay your mail but please have another coffee, please) hash; Anyway, resending #23 to you in a private mail. -- Regards/Gruß, Boris. - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: DMA mapping on SCSI device?
The ideal solution would be to do mapping against a different struct device for each port, so that we could maintain the proper DMA mask for each of them at all times. However I'm not sure if that's possible.

I cannot imagine why it should be that difficult. The PCI subsystem could offer a pci_clone_device() or similar function. For all complicated purposes (sysfs etc.) the original device could be used, so it would hopefully not be that difficult.

The alternative would be to add a new family of PCI mapping functions that take an explicit mask. The disadvantage would be changing all architectures, but on the other hand the interface could be phased in one by one (and nF4 primarily only works on x86 anyway).

I suspect the latter would be a little cleaner, although they don't make much difference.

-Andi
RE: fixed a bug of adma in rhel4u5 with HDS7250SASUN500G.
Robert wrote:
Kuan Luo wrote:
Robert wrote:

Kuan, does this patch (using the notifiers to see if the command is really done) still work if one port on the controller has ADMA disabled because it's in ATAPI mode? I seem to recall Allen Martin mentioning that notifiers wouldn't work in this case.

I just tried the 2.6.24-rc7 sata_nv driver with one hd and one cdrom in the same controller. I mkfs hd and mounted the cdrom and no error happened. Allen, is there anything about notifier that we should pay attention to?

Assuming not, then this patch should be applied..

The patch should be applied. We use the notifier register and there is nothing to do with our notifier register in atapi mode.

Allen wrote:

I think that's one of the cases where memory notifiers don't work (one of the drives is not in ADMA mode either because it's ATAPI or it's in legacy mode). There's no issue with the notifier registers though.
Re: Problem with ata layer in 2.6.24
On Mon, 2008-01-28 at 23:49 -0500, Gene Heskett wrote: On Monday 28 January 2008, Kasper Sandberg wrote: [...] snip I can invalidate this theory... i helped a guy on irc debug this problem, and he had ati. I tried having him stop using fglrx, and go to r300.. same problem, and same problem even with vesa.. :) No Kasper, you are validating it, that it is not nvidia related, which is what I was also saying. yeah thats what i mean - i can invalidate the theory that all the affected boxes run nvidia. also, i have this on my fileserver with .20, which doesent even run X, or module support in kernel :) That far back? Although ISTR I saw it happen once only when I was running 2.6.18-somethingorother. Yes im afraid so.. i will now provide some complete details, as i feel they are relevant. the thing is, i run 6x300gb disks, IDE, in raid5. i have both an onboard via ide controller, and then i bought a promise pdc 202 new thingie. i had problem however.. after a bit of time, i would get DMA reset error thing, and it all kindof went NUTS. it was as if all data access were skewed, and as you might imagine, this made everything fail badly. i purchased an ITE based controller for the drives on the promise, but exactly the same thing happened. the errors i got was: hdf: dma_intr: bad DMA status (dma_stat=75) hdf: dma_intr: status=0x50 { DriveReady SeekComplete } ide: failed opcode was: unknown --- i then found new hope, when i heard that libata provided much better error handling, so i upgraded to .20. this made my box usable. the error happens once or twice a day, the disk led will turn on constantly, and all IO freezes for about half a minute, where it returns PROPERLY(thank you libata!). as far as i can tell, the only side effect is that i get those messages like described here, and flooded with on google. to put some timeline perspective into this. 
i believe it was in 2005 i assembled the system, and when i realized it was faulty, on old ide driver, i stopped using it - that miht have been in beginning of 2006. then for almost a year i werent using it, hoping to somehow fix it, but in january 2007 i think it was, atleast in the very beginning of 2007, i hit upon the idea of trying libata, and ever since the system has been running 24/7 - doing these errors around 2 times a day. i have multiple times reported my problems to lkml, but nothing has happened, i also tried to aproeach jgarzik direcly, but he was not interested. i really hope this can be solved now, its a huge problem my fileserver has an asus k8v motherboard, with via chipset (k8t880 i think it is, or something like it). currently using the promise controller again(strangely enough all the timeouts seems to happen here, and when the ITE was on, there, not the onboard one), in conjunction with the onboard via. complaint. Again, fix the nv driver so it will run my screen I'll be snip - To unsubscribe from this list: send the line unsubscribe linux-ide in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Problem with ata layer in 2.6.24
On Mon, Jan 28, 2008 at 08:31:57PM -0500, Gene Heskett wrote:

In my script, its one line:

mkinitrd -f initrd-$VER.img $VER \

where $VER is the shell variable I edit to = the version number, located at the top of the script. Unforch, its failing:

No module pata_amd found for kernel 2.6.24, aborting.

mkinitrd is just a shell script. Even if its options, and there are quite a number of these, do not allow you to influence the choice of modules in the desired manner, it is pretty trivial to make yourself a custom version of it and just hardwire a fixed list of modules to use, instead of relying on general mechanisms which try hard to guess what you may need. That way your regular 'mkinitrd' will build something to boot with libata and 'mkinitrd.ide' will use IDE modules for that purpose, using the same core kernel.

If you are using distribution kernels, as opposed to your own configuration, it is quite likely that you will need to install the 'kernel-devel' package and recompile and add the required IDE modules yourself, as those may not be provided. This is done the same way as for any other external module.

Michal
Re: Problem with ata layer in 2.6.24
On Mon, 28 Jan 2008 14:13:21 -0500 Gene Heskett [EMAIL PROTECTED] wrote:

I had to reboot early this morning due to a freezeup, and I had a bunch of these in the messages log:
==
Jan 27 19:42:11 coyote kernel: [42461.915961] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan 27 19:42:11 coyote kernel: [42461.915973] ata1.00: cmd ca/00:08:b1:66:46/00:00:00:00:00/e8 tag 0 dma 4096 out
Jan 27 19:42:11 coyote kernel: [42461.915974] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 27 19:42:11 coyote kernel: [42461.915978] ata1.00: status: { DRDY }
Jan 27 19:42:11 coyote kernel: [42461.916005] ata1: soft resetting link
Jan 27 19:42:12 coyote kernel: [42462.078216] ata1.00: configured for UDMA/100
Jan 27 19:42:12 coyote kernel: [42462.078232] ata1: EH complete
Jan 27 19:42:12 coyote kernel: [42462.090700] sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
Jan 27 19:42:12 coyote kernel: [42462.114230] sd 0:0:0:0: [sda] Write Protect is off
Jan 27 19:42:12 coyote kernel: [42462.115079] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
===

I had this error too, or maybe only a similar one, and another one as well. I no longer have the error output for either of them lying around, so I'm posting both fixes that I found here on lkml:

1) disabling ncq like this:

    echo 1 > /sys/block/sda/device/queue_depth

2) this patch: libata_drain_fifo_on_stuck_drq_hsm.patch (applies to 2.6.24 too)

Signed-off-by: Mark Lord [EMAIL PROTECTED]
---
--- old/drivers/ata/libata-sff.c	2007-09-28 09:29:22.0 -0400
+++ linux/drivers/ata/libata-sff.c	2007-09-28 09:39:44.0 -0400
@@ -420,6 +420,28 @@
 	ap->ops->irq_on(ap);
 }
 
+static void ata_drain_fifo(struct ata_port *ap, struct ata_queued_cmd *qc)
+{
+	u8 stat = ata_chk_status(ap);
+	/*
+	 * Try to clear stuck DRQ if necessary,
+	 * by reading/discarding up to two sectors worth of data.
+	 */
+	if ((stat & ATA_DRQ) && (!qc || qc->dma_dir != DMA_TO_DEVICE)) {
+		unsigned int i;
+		unsigned int limit = qc ? qc->sect_size : ATA_SECT_SIZE;
+
+		printk(KERN_WARNING "Draining up to %u words from data FIFO.\n",
+				limit);
+		for (i = 0; i < limit ; ++i) {
+			ioread16(ap->ioaddr.data_addr);
+			if (!(ata_chk_status(ap) & ATA_DRQ))
+				break;
+		}
+		printk(KERN_WARNING "Drained %u/%u words.\n", i, limit);
+	}
+}
+
 /**
  *	ata_bmdma_drive_eh - Perform EH with given methods for BMDMA controller
  *	@ap: port to handle error for
@@ -476,7 +498,7 @@
 	}
 
 	ata_altstatus(ap);
-	ata_chk_status(ap);
+	ata_drain_fifo(ap, qc);
 	ap->ops->irq_clear(ap);
 
 	spin_unlock_irqrestore(ap->lock, flags);

--
Florian Attenberger [EMAIL PROTECTED]
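Fix 1) can also be scripted. The following is only a sketch: it reads and writes the same sysfs attribute as the echo command, the function names are invented for the example, and the sysfs_root parameter exists purely so the code can be tried against a scratch directory. Writing the real attribute needs root, and a depth of 1 means a single command in flight, which is what turns NCQ off on drives that support it.

```python
from pathlib import Path

def queue_depth_path(disk, sysfs_root="/sys"):
    """Path of the queue_depth attribute for a disk such as 'sda'."""
    return Path(sysfs_root) / "block" / disk / "device" / "queue_depth"

def get_queue_depth(disk, sysfs_root="/sys"):
    """Return the current queue depth as an integer."""
    return int(queue_depth_path(disk, sysfs_root).read_text().strip())

def set_queue_depth(disk, depth, sysfs_root="/sys"):
    """Write a new queue depth; 1 leaves one command in flight (no NCQ)."""
    if depth < 1:
        raise ValueError("queue depth must be at least 1")
    queue_depth_path(disk, sysfs_root).write_text("%d\n" % depth)
```

Checking get_queue_depth("sda") first shows whether there is anything to change; on a drive or controller without NCQ the value is normally 1 already.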