Re: aic7xxx DMA overflow error
> it seems like we do for some reason never actually enable swiotlb
> for 32-bit x86. Before my commit the block bounce buffering papered
> over that for networking, Please try this patch:
>
> diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
> index 661583662430..71c0b01d93b1 100644
> --- a/arch/x86/kernel/pci-swiotlb.c
> +++ b/arch/x86/kernel/pci-swiotlb.c
> @@ -42,10 +42,8 @@ IOMMU_INIT_FINISH(pci_swiotlb_detect_override,
>  int __init pci_swiotlb_detect_4gb(void)
>  {
>  	/* don't initialize swiotlb if iommu=off (no_iommu=1) */
> -#ifdef CONFIG_X86_64
>  	if (!no_iommu && max_possible_pfn > MAX_DMA32_PFN)
>  		swiotlb = 1;
> -#endif
>
>  	/*
>  	 * If SME is active then swiotlb will be set to 1 so that bounce

Christoph, your patch fixed it nicely. No more error messages when I boot
with 16GiB enabled on a 32-bit PAE-enabled system.

- Matthew Whitehead
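For readers following along, the condition the patch exposes to 32-bit builds can be modelled in user space. This is only a sketch: `swiotlb_wanted()` and `gib_to_pfn()` are hypothetical names, 4 KiB pages are assumed, and `MAX_DMA32_PFN` is redefined here to mirror what the kernel constant is understood to mean (the first page frame above 4 GiB).

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Redefined for this sketch: first page frame number above the 4 GiB line. */
#define MAX_DMA32_PFN (1ULL << (32 - PAGE_SHIFT))

/*
 * Hypothetical user-space model of the check in pci_swiotlb_detect_4gb():
 * returns true when swiotlb would be switched on.
 */
static bool swiotlb_wanted(bool no_iommu, uint64_t max_possible_pfn)
{
	return !no_iommu && max_possible_pfn > MAX_DMA32_PFN;
}

/* Hypothetical helper: number of page frames in `gib` GiB of memory. */
static uint64_t gib_to_pfn(uint64_t gib)
{
	return (gib << 30) >> PAGE_SHIFT;
}
```

With these definitions, `swiotlb_wanted(false, gib_to_pfn(16))` is true and `swiotlb_wanted(false, gib_to_pfn(2))` is false. Before the patch the whole test sat inside `#ifdef CONFIG_X86_64`, so a 32-bit PAE kernel never reached it regardless of how much memory was installed.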
Re: aic7xxx DMA overflow error
Christoph, here is all of the newly patched dmesg output. I also added
'aic7xxx.aic7xxx=verbose' for extra information.

Matthew

[0.00] Linux version 4.18.12.pentium4-xeon-christoph+ (root@pentium4) (gcc version 5.4.0 (Gentoo 5.4.0-r4 p1.8, pie-0.6.5)) #525 SMP PREEMPT Sat Oct 13 09:49:31 EDT 2018
[0.00] KERNEL supported cpus:
[0.00] Intel GenuineIntel
[0.00] x86/fpu: x87 FPU will use FXSAVE
[0.00] BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009] usable
[0.00] BIOS-e820: [mem 0x0010-0xeffd] usable
[0.00] BIOS-e820: [mem 0xeffe-0xeffefbff] ACPI data
[0.00] BIOS-e820: [mem 0xeffefc00-0xefffefff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec0] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee0] reserved
[0.00] BIOS-e820: [mem 0xfff8-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x0003] usable
[0.00] Notice: NX (Execute Disable) protection missing in CPU!
[0.00] SMBIOS 2.3 present.
[0.00] DMI: Dell Computer Corporation PowerEdge 6650 /0J3082, BIOS A17 01/21/2005
[0.00] last_pfn = 0x40 max_arch_pfn = 0x100
[0.00] x86/PAT: Configuration [0-7]: WB WT UC- UC WB WT UC- UC
[0.00] found SMP MP-table at [mem 0x000fe710-0x000fe71f] mapped at [(ptrval)]
[0.00] ACPI: Early table checksum verification disabled
[0.00] ACPI: RSDP 0x000FDC00 14 (v00 DELL )
[0.00] ACPI: RSDT 0x000FDC14 30 (v01 DELL PE6650 0001 MSFT 010A)
[0.00] ACPI: FACP 0x000FDC44 74 (v01 DELL PE6650 0001 MSFT 010A)
[0.00] ACPI: DSDT 0xFFFE 005C91 (v01 DELL PE6650 0001 MSFT 010A)
[0.00] ACPI: FACS 0xEFFE 40
[0.00] ACPI: APIC 0x000FDCB8 C0 (v01 DELL PE6650 0001 MSFT 010A)
[0.00] ACPI: SPCR 0x000FDD78 50 (v01 DELL PE6650 0001 MSFT 010A)
[0.00] 15496MB HIGHMEM available.
[0.00] 887MB LOWMEM available.
[0.00] mapped low ram: 0 - 377fe000
[0.00] low ram: 0 - 377fe000
[0.00] tsc: Fast TSC calibration using PIT
[0.00] Zone ranges:
[0.00] DMA [mem 0x1000-0x00ff]
[0.00] Normal [mem 0x0100-0x377fdfff]
[0.00] HighMem [mem 0x377fe000-0x0003]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00] node 0: [mem 0x1000-0x0009]
[0.00] node 0: [mem 0x0010-0xeffd]
[0.00] node 0: [mem 0x0001-0x0003]
[0.00] Initmem setup node 0 [mem 0x1000-0x0003]
[0.00] Using APIC driver default
[0.00] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x05] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x06] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x07] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x08] high edge lint[0x1])
[0.00] IOAPIC[0]: apic_id 8, version 17, address 0xfec0, GSI 0-15
[0.00] IOAPIC[1]: apic_id 9, version 17, address 0xfec01000, GSI 16-31
[0.00] IOAPIC[2]: apic_id 10, version 17, address 0xfec02000, GSI 32-47
[0.00] Using ACPI (MADT) for SMP configuration information
[0.00] smpboot: Allowing 8 CPUs, 0 hotplug CPUs
[0.00] [mem 0xe000-0xfebf] available for PCI devices
[0.00] clocksource: refined-jiffies: mask: 0x max_cycles: 0x, max_idle_ns: 7645519600211568 ns
[0.00] setup_percpu: NR_CPUS:8 nr_cpumask_bits:8 nr_cpu_ids:8 nr_node_ids:1
[0.00] percpu: Embedded 29 pages/cpu @(ptrval) s87308 r0 d31476 u118784
[0.00] Built 1 zonelists, mobility grouping on. Total pages: 4126863
[0.00] Kernel command line: BOOT_IMAGE=/vmlinuz-4.18.12.pentium4-xeon-christoph+ root=/dev/sda2 ro init=/usr/lib/systemd/systemd sysrq_always_active console=tty1 console=ttyS0,115200n8 aic7xxx.aic7xxx=verbose
[0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[0.00] Initializing CPU#0
[0.00] Initializing HighMem for node 0 (000377fe:0040)
[0.00] Initializing Movable for node 0 (:)
[0.00] Memory: 16374712K/16514556K available (3538K
Re: aic7xxx DMA overflow error
> That isn't a limit, just a reporting clause - the real check is this
> line a little above:
>
> 	if (unlikely(dev && !dma_capable(dev, dma_addr, size))) {
>
> which is purely based on the dma mask. So for some reason we must
> be in 32-bit only mode for the dma-mask, and not actually enabling
> swiotlb. And thinking more about it the latter is what is really odd,
> we should always enable swiotlb for systems with 4GB memory. I'll
> definitely wait for your dmesg!

Christoph, here is the dmesg output. It does not successfully boot.

[ 16.527021] aic7xxx: large SCBS not supported
[ 21.984104] scsi host0: Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
[ 21.984104]
[ 21.984104] aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
[ 21.984104]
[ 22.012511] scsi 0:0:0:0: Direct-Access SEAGATE ST373454LC D403 PQ: 0 ANSI: 3
[ 22.020995] scsi0:A:0:0:
[ 22.020997] Tagged Queuing enabled. Depth 32
[ 22.028503] scsi target0:0:0: Beginning Domain Validation
[ 22.037780] scsi target0:0:0: wide asynchronous
[ 22.045276] scsi target0:0:0: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 63)
[ 22.056417] scsi 0:0:0:0: Power-on or device reset occurred
[ 22.065532] scsi target0:0:0: Ending Domain Validation
[ 22.104034] scsi 0:0:1:0: Direct-Access FUJITSU MAW3147NC 0104 PQ: 0 ANSI: 3
[ 22.112535] scsi0:A:1:0:
[ 22.112538] Tagged Queuing enabled. Depth 32
[ 22.120056] scsi target0:0:1: Beginning Domain Validation
[ 22.129058] scsi target0:0:1: wide asynchronous
[ 22.136422] scsi target0:0:1: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 127)
[ 22.152111] scsi 0:0:1:0: Power-on or device reset occurred
[ 22.161065] scsi target0:0:1: Ending Domain Validation
[ 22.188449] scsi 0:0:2:0: Direct-Access IBM IC35L073UCDY10-0 S27N PQ: 0 ANSI: 3
[ 22.196952] scsi0:A:2:0:
[ 22.196955] Tagged Queuing enabled. Depth 32
[ 22.204460] scsi target0:0:2: Beginning Domain Validation
[ 22.222059] random: fast init done
[ 22.225725] scsi target0:0:2: wide asynchronous
[ 22.231731] scsi target0:0:2: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 8)
[ 22.240024] scsi target0:0:2: Domain Validation skipping write tests
[ 22.246600] scsi target0:0:2: Ending Domain Validation
[ 22.284567] scsi 0:0:3:0: Direct-Access IBM IC35L073UCDY10-0 S27N PQ: 0 ANSI: 3
[ 22.293069] scsi0:A:3:0:
[ 22.293072] Tagged Queuing enabled. Depth 32
[ 22.300583] scsi target0:0:3: Beginning Domain Validation
[ 22.328612] scsi target0:0:3: wide asynchronous
[ 22.334718] scsi target0:0:3: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 8)
[ 22.343006] scsi target0:0:3: Domain Validation skipping write tests
[ 22.349589] scsi target0:0:3: Ending Domain Validation
[ 22.901684] scsi 0:0:6:0: Processor PE/PV1x5 SCSI BP 1.1 PQ: 0 ANSI: 2
[ 22.910273] scsi target0:0:6: Beginning Domain Validation
[ 22.939101] scsi target0:0:6: Ending Domain Validation
[ 25.096168] sd 0:0:0:0: [sda] 143374650 512-byte logical blocks: (73.4 GB/68.4 GiB)
[ 25.096225] sd 0:0:1:0: [sdb] 287277984 512-byte logical blocks: (147 GB/137 GiB)
[ 25.107747] sd 0:0:2:0: Power-on or device reset occurred
[ 25.139537] sd 0:0:3:0: Power-on or device reset occurred
[ 25.145255] sd 0:0:0:0: [sda] Write Protect is off
[ 25.145278] sd 0:0:1:0: [sdb] Write Protect is off
[ 25.145643] scsi host1: pata_serverworks
[ 25.177180] sd 0:0:2:0: [sdc] 143374805 512-byte logical blocks: (73.4 GB/68.4 GiB)
[ 25.177240] sd 0:0:3:0: [sdd] 143374805 512-byte logical blocks: (73.4 GB/68.4 GiB)
[ 25.178429] sd 0:0:1:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 25.189353] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ 25.200252] scsi host2: pata_serverworks
[ 25.202942] sd 0:0:3:0: [sdd] Write Protect is off
[ 25.202953] sd 0:0:2:0: [sdc] Write Protect is off
[ 25.213855] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0x8b0 irq 14
[ 25.247030] sd 0:0:2:0: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 25.247037] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0x8b8 irq 15
[ 25.278937] sd 0:0:3:0: [sdd] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 25.300661] sdb: sdb1
[ 25.406317] sda: sda1 sda2
[ 25.409465] sdc: sdc1
[ 25.474427] sd 0:0:1:0: [sdb] Attached SCSI disk
[ 25.475141] sdd: sdd1
[ 25.486165] sd 0:0:2:0: [sdc] Attached SCSI disk
[ 25.529044] sd 0:0:0:0: [sda] Attached SCSI disk
[ 25.530061] sd 0:0:3:0: [sdd] Attached SCSI disk
[ 25.537753] scsi 1:0:0:0: CD-ROM HL-DT-ST RW/DVD GCC-4243N A102 PQ: 0 ANSI: 5
[ 25.963649] EXT4-fs (sda2): mounting ext3 file system using the ext4 subsystem
[ 26.206768] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
[ 26.214856] VFS: Mounted root (ext3 filesystem) readonly on device 8:2.
[ 26.274864]
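The dma_capable() test quoted at the top of this message can be illustrated with a small user-space model. This is a simplified sketch, not the kernel source: `dma_capable_model()` is a hypothetical stand-in that takes the mask directly instead of a `struct device`, `DMA_BIT_MASK()` is redefined locally in the same shape as the kernel macro, and the addresses in the usage note are made up for illustration rather than taken from the log.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy for this sketch: a mask with the low n bits set. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Hypothetical model of the check: a buffer is usable for direct DMA
 * only if its last byte is still addressable under the device mask.
 */
static bool dma_capable_model(uint64_t mask, uint64_t dma_addr, size_t size)
{
	if (size == 0)
		return true;
	return dma_addr + size - 1 <= mask;
}
```

Under a 32-bit mask, a 64 KiB buffer at the illustrative address `0x100000000` (just above 4 GiB) fails the model check, while the same buffer at `0x10000000` passes. A failure of this kind, with no swiotlb bounce buffer available to fall back on, would be consistent with the overflow message reported earlier in the thread.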
Re: aic7xxx DMA overflow error
Christoph,

I was able to bisect this to your patch "scsi: reduce use of block bounce
buffers".

I am getting the error on a 32-bit Dell PowerEdge 6650. It has the aic7xxx
integrated onto the motherboard. Again, here is the error:

aic7xxx :00:03.0: dma_direct_map_sg: overflow 0x0003ff80+65536 of device mask

I wonder if the odd 39-bit mask used in aic7xxx is part of the problem?

- Matthew
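As a rough sanity check on the 39-bit question, the reach of a DMA mask can be worked out directly. The sketch below is plain arithmetic under stated assumptions: `addressable_gib()` and `mask_covers_ram()` are hypothetical helpers, and memory is treated as one contiguous range starting at physical address 0, which ignores the PCI hole that pushes the top of RAM somewhat above the installed size.

```c
#include <stdint.h>

/*
 * Hypothetical helper: size, in GiB, of the address range reachable with
 * a DMA mask of `bits` bits. Only meaningful for 30 <= bits <= 63.
 */
static uint64_t addressable_gib(unsigned int bits)
{
	return 1ULL << (bits - 30);
}

/*
 * Returns 1 when `ram_gib` GiB of memory, assumed contiguous from
 * physical address 0, lies entirely within reach of the mask.
 */
static int mask_covers_ram(unsigned int bits, uint64_t ram_gib)
{
	return ram_gib <= addressable_gib(bits);
}
```

By this arithmetic a 39-bit mask reaches 512 GiB, far more than the 16 GiB in this machine, whereas a 32-bit mask reaches only 4 GiB. If that reasoning holds, the 39-bit mask alone would not explain the overflow, which fits the suggestion made later in the thread that the device was effectively limited to a 32-bit mask.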
aic7xxx DMA overflow error
Hannes,

I'm getting the following error in a custom configured 4.18 32-bit x86
kernel supporting PAE, with 16GiB physical memory. It loops infinitely on
the error.

aic7xxx :00:03.0: dma_direct_map_sg: overflow 0x0003ff80+65536 of device mask

I have tried enabling the following to fix this in the kernel config:

CONFIG_X86_PAE=y
CONFIG_VMSPLIT_3G=y
CONFIG_HIGHMEM=y
CONFIG_HIGHMEM64G=y
CONFIG_ZONE_DMA=y
CONFIG_BOUNCE=y

What should I look at next?

- Matthew