Re: aic7xxx DMA overflow error

2018-10-13 Thread tedheadster
> It seems that for some reason we never actually enable swiotlb
> for 32-bit x86.  Before my commit the block bounce buffering papered
> over that for networking.  Please try this patch:
>
> diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
> index 661583662430..71c0b01d93b1 100644
> --- a/arch/x86/kernel/pci-swiotlb.c
> +++ b/arch/x86/kernel/pci-swiotlb.c
> @@ -42,10 +42,8 @@ IOMMU_INIT_FINISH(pci_swiotlb_detect_override,
>  int __init pci_swiotlb_detect_4gb(void)
>  {
>  	/* don't initialize swiotlb if iommu=off (no_iommu=1) */
> -#ifdef CONFIG_X86_64
>  	if (!no_iommu && max_possible_pfn > MAX_DMA32_PFN)
>  		swiotlb = 1;
> -#endif
>
>  	/*
>  	 * If SME is active then swiotlb will be set to 1 so that bounce

Christoph,
  your patch fixed it nicely. No more error messages when I boot with
16GiB enabled on a 32-bit PAE-enabled system.

- Matthew Whitehead
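
For reference, the decision that the patch makes unconditional can be modeled in userspace. This is only a sketch, not the kernel code: the sketch_ names and the 4 KiB page size are assumptions for illustration, and the 4GiB threshold is written out by hand rather than taken from the kernel headers. With 16GiB of RAM the highest page frame number is well above the 4GiB boundary, so the check should ask for swiotlb once the CONFIG_X86_64 guard is gone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on 32-bit x86 with PAE. */
#define SKETCH_PAGE_SHIFT 12

/* First page frame number at the 4 GiB boundary (the idea behind MAX_DMA32_PFN). */
#define SKETCH_MAX_DMA32_PFN (1ULL << (32 - SKETCH_PAGE_SHIFT))

/* Hypothetical helper: convert a RAM size in bytes into a page frame count. */
static uint64_t sketch_bytes_to_pfn(uint64_t bytes)
{
	return bytes >> SKETCH_PAGE_SHIFT;
}

/*
 * Hypothetical model of the patched decision: bounce buffering is wanted
 * whenever memory can exist above 4 GiB and the IOMMU code has not been
 * turned off, regardless of whether the kernel is 32-bit or 64-bit.
 */
static bool sketch_want_swiotlb(bool no_iommu, uint64_t max_possible_pfn)
{
	return !no_iommu && max_possible_pfn > SKETCH_MAX_DMA32_PFN;
}
```

Under these assumptions, sketch_want_swiotlb(false, sketch_bytes_to_pfn(16ULL << 30)) is true, while the same call for a 2GiB machine, or with no_iommu set, is false.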


Re: aic7xxx DMA overflow error

2018-10-13 Thread tedheadster
Christoph,
  here is all of the newly patched dmesg output. I also added
'aic7xxx.aic7xxx=verbose' for extra information.

Matthew

[0.00] Linux version 4.18.12.pentium4-xeon-christoph+ (root@pentium4) (gcc version 5.4.0 (Gentoo 5.4.0-r4 p1.8, pie-0.6.5)) #525 SMP PREEMPT Sat Oct 13 09:49:31 EDT 2018
[0.00] KERNEL supported cpus:
[0.00]   Intel GenuineIntel
[0.00] x86/fpu: x87 FPU will use FXSAVE
[0.00] BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009] usable
[0.00] BIOS-e820: [mem 0x0010-0xeffd] usable
[0.00] BIOS-e820: [mem 0xeffe-0xeffefbff] ACPI data
[0.00] BIOS-e820: [mem 0xeffefc00-0xefffefff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec0] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee0] reserved
[0.00] BIOS-e820: [mem 0xfff8-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x0003] usable
[0.00] Notice: NX (Execute Disable) protection missing in CPU!
[0.00] SMBIOS 2.3 present.
[0.00] DMI: Dell Computer Corporation PowerEdge 6650 /0J3082, BIOS A17 01/21/2005
[0.00] last_pfn = 0x40 max_arch_pfn = 0x100
[0.00] x86/PAT: Configuration [0-7]: WB  WT  UC- UC  WB  WT  UC- UC
[0.00] found SMP MP-table at [mem 0x000fe710-0x000fe71f] mapped at [(ptrval)]
[0.00] ACPI: Early table checksum verification disabled
[0.00] ACPI: RSDP 0x000FDC00 14 (v00 DELL  )
[0.00] ACPI: RSDT 0x000FDC14 30 (v01 DELL   PE6650   0001 MSFT 010A)
[0.00] ACPI: FACP 0x000FDC44 74 (v01 DELL   PE6650   0001 MSFT 010A)
[0.00] ACPI: DSDT 0xFFFE 005C91 (v01 DELL   PE6650   0001 MSFT 010A)
[0.00] ACPI: FACS 0xEFFE 40
[0.00] ACPI: APIC 0x000FDCB8 C0 (v01 DELL   PE6650   0001 MSFT 010A)
[0.00] ACPI: SPCR 0x000FDD78 50 (v01 DELL   PE6650   0001 MSFT 010A)
[0.00] 15496MB HIGHMEM available.
[0.00] 887MB LOWMEM available.
[0.00]   mapped low ram: 0 - 377fe000
[0.00]   low ram: 0 - 377fe000
[0.00] tsc: Fast TSC calibration using PIT
[0.00] Zone ranges:
[0.00]   DMA  [mem 0x1000-0x00ff]
[0.00]   Normal   [mem 0x0100-0x377fdfff]
[0.00]   HighMem  [mem 0x377fe000-0x0003]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x1000-0x0009]
[0.00]   node   0: [mem 0x0010-0xeffd]
[0.00]   node   0: [mem 0x0001-0x0003]
[0.00] Initmem setup node 0 [mem 0x1000-0x0003]
[0.00] Using APIC driver default
[0.00] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x05] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x06] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x07] high edge lint[0x1])
[0.00] ACPI: LAPIC_NMI (acpi_id[0x08] high edge lint[0x1])
[0.00] IOAPIC[0]: apic_id 8, version 17, address 0xfec0, GSI 0-15
[0.00] IOAPIC[1]: apic_id 9, version 17, address 0xfec01000, GSI 16-31
[0.00] IOAPIC[2]: apic_id 10, version 17, address 0xfec02000, GSI 32-47
[0.00] Using ACPI (MADT) for SMP configuration information
[0.00] smpboot: Allowing 8 CPUs, 0 hotplug CPUs
[0.00] [mem 0xe000-0xfebf] available for PCI devices
[0.00] clocksource: refined-jiffies: mask: 0x max_cycles: 0x, max_idle_ns: 7645519600211568 ns
[0.00] setup_percpu: NR_CPUS:8 nr_cpumask_bits:8 nr_cpu_ids:8 nr_node_ids:1
[0.00] percpu: Embedded 29 pages/cpu @(ptrval) s87308 r0 d31476 u118784
[0.00] Built 1 zonelists, mobility grouping on.  Total pages: 4126863
[0.00] Kernel command line: BOOT_IMAGE=/vmlinuz-4.18.12.pentium4-xeon-christoph+ root=/dev/sda2 ro init=/usr/lib/systemd/systemd sysrq_always_active console=tty1 console=ttyS0,115200n8 aic7xxx.aic7xxx=verbose
[0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[0.00] Initializing CPU#0
[0.00] Initializing HighMem for node 0 (000377fe:0040)
[0.00] Initializing Movable for node 0 (:)
[0.00] Memory: 16374712K/16514556K available (3538K 

Re: aic7xxx DMA overflow error

2018-10-12 Thread tedheadster
> That isn't a limit, just a reporting clause - the real check is this
> line a little above:
>
> if (unlikely(dev && !dma_capable(dev, dma_addr, size))) {
>
> which is purely based on the dma mask.  So for some reason we must
> be in 32-bit only mode for the dma-mask, and not actually enabling
> swiotlb.  And thinking more about it the latter is what is really odd,
> we should always enable swiotlb for systems with 4GB memory.  I'll
> definitely wait for your dmesg!
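
The mask-based check described above can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel's dma_capable(): sketch_dma_capable is a hypothetical name, the mask is passed in directly instead of being read from a struct device, and the sample addresses in the note below are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of a mask-based capability check: a mapping
 * is usable only if its last byte still fits under the device's DMA mask.
 * Assumes size > 0 and that addr + size does not wrap around.
 */
static bool sketch_dma_capable(uint64_t dma_mask, uint64_t addr, size_t size)
{
	assert(size > 0);
	return addr + size - 1 <= dma_mask;
}
```

With a 32-bit mask (0xffffffff), a 65536-byte buffer placed anywhere above 4GiB fails this check, which is the situation a bounce buffer below 4GiB is meant to handle.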

Christoph,
  here is the dmesg output. It does not successfully boot.

[   16.527021] aic7xxx: large SCBS not supported
[   21.984104] scsi host0: Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
[   21.984104] 
[   21.984104] aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
[   21.984104]
[   22.012511] scsi 0:0:0:0: Direct-Access SEAGATE  ST373454LC D403 PQ: 0 ANSI: 3
[   22.020995] scsi0:A:0:0:
[   22.020997] Tagged Queuing enabled.  Depth 32
[   22.028503] scsi target0:0:0: Beginning Domain Validation
[   22.037780] scsi target0:0:0: wide asynchronous
[   22.045276] scsi target0:0:0: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 63)
[   22.056417] scsi 0:0:0:0: Power-on or device reset occurred
[   22.065532] scsi target0:0:0: Ending Domain Validation
[   22.104034] scsi 0:0:1:0: Direct-Access FUJITSU  MAW3147NC 0104 PQ: 0 ANSI: 3
[   22.112535] scsi0:A:1:0:
[   22.112538] Tagged Queuing enabled.  Depth 32
[   22.120056] scsi target0:0:1: Beginning Domain Validation
[   22.129058] scsi target0:0:1: wide asynchronous
[   22.136422] scsi target0:0:1: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 127)
[   22.152111] scsi 0:0:1:0: Power-on or device reset occurred
[   22.161065] scsi target0:0:1: Ending Domain Validation
[   22.188449] scsi 0:0:2:0: Direct-Access IBM IC35L073UCDY10-0 S27N PQ: 0 ANSI: 3
[   22.196952] scsi0:A:2:0:
[   22.196955] Tagged Queuing enabled.  Depth 32
[   22.204460] scsi target0:0:2: Beginning Domain Validation
[   22.222059] random: fast init done
[   22.225725] scsi target0:0:2: wide asynchronous
[   22.231731] scsi target0:0:2: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 8)
[   22.240024] scsi target0:0:2: Domain Validation skipping write tests
[   22.246600] scsi target0:0:2: Ending Domain Validation
[   22.284567] scsi 0:0:3:0: Direct-Access IBM IC35L073UCDY10-0 S27N PQ: 0 ANSI: 3
[   22.293069] scsi0:A:3:0:
[   22.293072] Tagged Queuing enabled.  Depth 32
[   22.300583] scsi target0:0:3: Beginning Domain Validation
[   22.328612] scsi target0:0:3: wide asynchronous
[   22.334718] scsi target0:0:3: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 8)
[   22.343006] scsi target0:0:3: Domain Validation skipping write tests
[   22.349589] scsi target0:0:3: Ending Domain Validation
[   22.901684] scsi 0:0:6:0: Processor PE/PV1x5 SCSI BP 1.1  PQ: 0 ANSI: 2
[   22.910273] scsi target0:0:6: Beginning Domain Validation
[   22.939101] scsi target0:0:6: Ending Domain Validation
[   25.096168] sd 0:0:0:0: [sda] 143374650 512-byte logical blocks: (73.4 GB/68.4 GiB)
[   25.096225] sd 0:0:1:0: [sdb] 287277984 512-byte logical blocks: (147 GB/137 GiB)
[   25.107747] sd 0:0:2:0: Power-on or device reset occurred
[   25.139537] sd 0:0:3:0: Power-on or device reset occurred
[   25.145255] sd 0:0:0:0: [sda] Write Protect is off
[   25.145278] sd 0:0:1:0: [sdb] Write Protect is off
[   25.145643] scsi host1: pata_serverworks
[   25.177180] sd 0:0:2:0: [sdc] 143374805 512-byte logical blocks: (73.4 GB/68.4 GiB)
[   25.177240] sd 0:0:3:0: [sdd] 143374805 512-byte logical blocks: (73.4 GB/68.4 GiB)
[   25.178429] sd 0:0:1:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   25.189353] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
[   25.200252] scsi host2: pata_serverworks
[   25.202942] sd 0:0:3:0: [sdd] Write Protect is off
[   25.202953] sd 0:0:2:0: [sdc] Write Protect is off
[   25.213855] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0x8b0 irq 14
[   25.247030] sd 0:0:2:0: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   25.247037] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0x8b8 irq 15
[   25.278937] sd 0:0:3:0: [sdd] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   25.300661]  sdb: sdb1
[   25.406317]  sda: sda1 sda2
[   25.409465]  sdc: sdc1
[   25.474427] sd 0:0:1:0: [sdb] Attached SCSI disk
[   25.475141]  sdd: sdd1
[   25.486165] sd 0:0:2:0: [sdc] Attached SCSI disk
[   25.529044] sd 0:0:0:0: [sda] Attached SCSI disk
[   25.530061] sd 0:0:3:0: [sdd] Attached SCSI disk
[   25.537753] scsi 1:0:0:0: CD-ROM HL-DT-ST RW/DVD GCC-4243N A102 PQ: 0 ANSI: 5
[   25.963649] EXT4-fs (sda2): mounting ext3 file system using the ext4 subsystem
[   26.206768] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
[   26.214856] VFS: Mounted root (ext3 filesystem) readonly on device 8:2.
[   26.274864] 

Re: aic7xxx DMA overflow error

2018-10-01 Thread tedheadster
Christoph,
  I was able to bisect this to your patch "scsi: reduce use of block
bounce buffers". I am getting the error on a 32-bit Dell PowerEdge
6650. It has the aic7xxx integrated onto the motherboard.

Again, here is the error:

aic7xxx :00:03.0: dma_direct_map_sg: overflow 0x0003ff80+65536 of device mask

I wonder if the odd 39-bit mask used in aic7xxx is part of the problem?

- Matthew
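
On the 39-bit mask question, a quick calculation may help. The sketch below is an illustration only: SKETCH_DMA_BIT_MASK follows the shape of the kernel's DMA_BIT_MASK(n) macro, and sketch_addressable_bytes is a hypothetical helper. It suggests that a 39-bit mask covers 512GiB, far more than the 16GiB in this machine, so a 39-bit mask alone would not be expected to reject these addresses, whereas a 32-bit mask stops at 4GiB.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(n): the low n bits set, 1 <= n <= 64. */
#define SKETCH_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: number of bytes addressable under an n-bit mask (n < 64). */
static uint64_t sketch_addressable_bytes(unsigned int bits)
{
	assert(bits >= 1 && bits < 64);
	return SKETCH_DMA_BIT_MASK(bits) + 1;
}
```

For example, sketch_addressable_bytes(39) equals 512ULL << 30 and sketch_addressable_bytes(32) equals 4ULL << 30.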


aic7xxx DMA overflow error

2018-09-30 Thread tedheadster
Hannes,
  I'm getting the following error in a custom configured 4.18 32-bit
x86 kernel supporting PAE, with 16GiB physical memory. It loops
infinitely on the error.

aic7xxx :00:03.0: dma_direct_map_sg: overflow 0x0003ff80+65536 of device mask

I have tried enabling the following to fix this in the kernel config:

CONFIG_X86_PAE=y
CONFIG_VMSPLIT_3G=y
CONFIG_HIGHMEM=y
CONFIG_HIGHMEM64G=y
CONFIG_ZONE_DMA=y
CONFIG_BOUNCE=y

What should I look at next?

- Matthew