Re: armv7 cache flushing: don't take shortcuts

2016-08-16 Thread Markus Hennecke

On 16.08.2016 at 03:03, Daniel Bolgheroni wrote:
> On Mon, Aug 15, 2016 at 09:56:09PM +0200, Mark Kettenis wrote:
> > The functions that clean/invalidate the caches by virtual address
> > bail out after cleaning 32k worth of data.  The 32k matches the L1
> > cache of most of the CPUs we currently run on.  But the Cortex-A7 has an
> > integrated L2 cache that is larger.  And if you only flush it
> > partially you may get into trouble.  And now that we actually use the
> > cache, that matters.  Many of the more recent ARMv7 CPUs include such an
> > L2 cache.  And some of them even have L1 caches that are larger than
> > 32k.  So drop the shortcut and simply clean/invalidate what we were
> > asked to clean/invalidate.  Most of the calls should be covering a
> > single page or less anyway.
> >
> > This fixes the core dumps and illegal instructions that I see when
> > booting from a SATA disk.
>
> Just saw this committed. It makes Cubieboard2 fully useable so far.
> Kernel rebuild with fs on ahci:


Just did a complete system build on ahci on a cubieboard2 without any 
issues:


  335m47.17s real   266m37.66s user    54m34.81s system


Thank you very much!




Re: armv7 cache flushing: don't take shortcuts

2016-08-15 Thread Daniel Bolgheroni
On Mon, Aug 15, 2016 at 09:56:09PM +0200, Mark Kettenis wrote:
> The functions that clean/invalidate the caches by virtual address
> bail out after cleaning 32k worth of data.  The 32k matches the L1
> cache of most of the CPUs we currently run on.  But the Cortex-A7 has an
> integrated L2 cache that is larger.  And if you only flush it
> partially you may get into trouble.  And now that we actually use the
> cache, that matters.  Many of the more recent ARMv7 CPUs include such an
> L2 cache.  And some of them even have L1 caches that are larger than
> 32k.  So drop the shortcut and simply clean/invalidate what we were
> asked to clean/invalidate.  Most of the calls should be covering a
> single page or less anyway.
> 
> This fixes the core dumps and illegal instructions that I see when
> booting from a SATA disk.

Just saw this committed. It makes Cubieboard2 fully useable so far.
Kernel rebuild with fs on ahci:

(...)
ld -T ldscript --warn-common -nopie -S -o bsd ${SYSTEM_HEAD} vers.o ${OBJS}
text    data    bss     dec     hex
3744040 139412  479308  4362760 429208
   25m50.10s real    17m18.26s user     1m28.06s system

Just as a comparison, it takes around 20 min on Wandboard with fs on nfs and
around 23 min on BeagleBone Black with fs also on nfs.

Thank you.

--

U-Boot SPL 2016.07 (Aug 05 2016 - 23:44:57)
DRAM: 1024 MiB
CPU: 91200Hz, AXI/AHB/APB: 3/2/2
Trying to boot from MMC1


U-Boot 2016.07 (Aug 05 2016 - 23:44:57 -0600) Allwinner Technology

CPU:   Allwinner A20 (SUN7I)
Model: Cubietech Cubieboard2
I2C:   ready
DRAM:  1 GiB
MMC:   SUNXI SD/MMC: 0
*** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
SCSI:  Target spinup took 0 ms.
AHCI 0001.0100 32 slots 1 ports 3 Gbps 0x1 impl SATA mode
flags: ncq stag pm led clo only pmp pio slum part ccc apst
Net:   eth0: ethernet@01c5
starting USB...
USB0:   USB EHCI 1.00
USB1:   USB OHCI 1.0
USB2:   USB EHCI 1.00
USB3:   USB OHCI 1.0
scanning bus 0 for devices... 1 USB Device(s) found
scanning bus 2 for devices... 1 USB Device(s) found
Hit any key to stop autoboot:  0
=>
=> setenv devnum 0
=> run scsi_boot
scanning bus for devices...
  Device 0: (0:0) Vendor: ATA Prod.: TOSHIBA MK1235GS Rev: PV01
Type: Hard Disk
Capacity: 114473.4 MB = 111.7 GB (234441648 x 512)
Found 1 device(s).

Device 0: (0:0) Vendor: ATA Prod.: TOSHIBA MK1235GS Rev: PV01
Type: Hard Disk
Capacity: 114473.4 MB = 111.7 GB (234441648 x 512)
... is now current device
Scanning scsi 0:1...
Found EFI removable media binary efi/boot/bootarm.efi
reading efi/boot/bootarm.efi
65276 bytes read in 23 ms (2.7 MiB/s)
libfdt fdt_check_header(): FDT_ERR_BADMAGIC
## Starting EFI application at 0x4200 ...
Scanning disks on scsi...
Scanning disks on usb...
Scanning disks on mmc...
MMC Device 1 not found
MMC Device 2 not found
MMC Device 3 not found
Found 6 disks
>> OpenBSD/armv7 BOOTARM 0.1
boot>
booting sd0a:/bsd: 3743840+139408+479308 [64+501824+238352]=0x4e3de0

OpenBSD/armv7 booting ...
arg0 0x4000 arg1 0x10bb arg2 0x4800
Allocating page tables
freestart = 0x407e4000, free_pages = 260124 (0x0003f81c)
IRQ stack: p0x40812000 v0xc0812000
ABT stack: p0x40813000 v0xc0813000
UND stack: p0x40814000 v0xc0814000
SVC stack: p0x40815000 v0xc0815000
Creating L1 page table at 0x407e4000
Mapping kernel
Constructing L2 page tables
undefined page pmap [ using 740612 bytes of bsd ELF symbol table ]
board type: 4283
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2016 OpenBSD. All rights reserved.  http://www.OpenBSD.org

OpenBSD 6.0-current (GENERIC) #1: Mon Aug 15 19:34:05 BRT 2016
dbolgher...@wbs.my.domain:/usr/src/sys/arch/armv7/compile/GENERIC
real mem  = 1073741824 (1024MB)
avail mem = 104448 (996MB)
mainbus0 at root: Cubietech Cubieboard2
cpu0 at mainbus0: ARM Cortex A7 rev 4 (ARMv7 core)
cpu0: DC enabled IC enabled WB disabled EABT branch prediction enabled
cpu0: 32KB(32b/l,2way) I-cache, 32KB(64b/l,4way) wr-back D-cache
cortex0 at mainbus0
sunxi0 at mainbus0
sxipio0 at sunxi0: 175 pins
sxiccmu0 at sunxi0
gpio0 at sxipio0: 18 pins
gpio1 at sxipio0: 24 pins
gpio2 at sxipio0: 25 pins
gpio3 at sxipio0: 28 pins
gpio4 at sxipio0: 12 pins
gpio5 at sxipio0: 6 pins
gpio6 at sxipio0: 12 pins
gpio7 at sxipio0: 28 pins
gpio8 at sxipio0: 22 pins
agtimer0 at mainbus0: tick rate 24000 KHz
simplebus0 at mainbus0: "soc"
ehci0 at simplebus0
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "Allwinner EHCI root hub" rev 2.00/1.00 addr 1
sxiahci0 at simplebus0: AHCI 1.1
sxiahci0: port 0: 3.0Gb/s
scsibus0 at sxiahci0: 32 targets
sd0 at scsibus0 targ 0 lun 0:  SCSI3 0/direct 
fixed naa.5391d4f841be
sd0: 114473MB, 512 bytes/sector, 234441648 sectors
ehci1 at simplebus0
usb1 at ehci1: USB revision 2.0
uhub1 at usb1 "Allwinner EHCI root hub" rev 2.00/1.00 addr 1
sxidog0 at simplebus0
sxirtc0 at simplebus0
sxiuart0 at simplebus0: console
dw

armv7 cache flushing: don't take shortcuts

2016-08-15 Thread Mark Kettenis
The functions that clean/invalidate the caches by virtual address
bail out after cleaning 32k worth of data.  The 32k matches the L1
cache of most of the CPUs we currently run on.  But the Cortex-A7 has an
integrated L2 cache that is larger.  And if you only flush it
partially you may get into trouble.  And now that we actually use the
cache, that matters.  Many of the more recent ARMv7 CPUs include such an
L2 cache.  And some of them even have L1 caches that are larger than
32k.  So drop the shortcut and simply clean/invalidate what we were
asked to clean/invalidate.  Most of the calls should be covering a
single page or less anyway.

This fixes the core dumps and illegal instructions that I see when
booting from a SATA disk.
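
To make the effect of the clamp concrete, here is a rough C model of the
by-VA range loop.  It is only a sketch and not the kernel code: the names
lines_touched and OLD_CAP are made up for illustration, and the 64-byte
line size used in the note below is simply what dmesg reports for the
Cortex-A7 D-cache.  A cap of 0 stands for the behaviour without the
clamp.

```c
#include <assert.h>
#include <stdint.h>

#define OLD_CAP 0x8000u		/* the 32k clamp in question */

/*
 * Hypothetical model: count the cache lines that a clean/invalidate
 * by virtual address would touch for the range [va, va + len).
 * cap == 0 means "no clamp".  Assumes line_size is a power of two.
 */
unsigned long
lines_touched(uint32_t va, uint32_t len, uint32_t line_size, uint32_t cap)
{
	uint32_t mask = line_size - 1;
	int64_t remaining;
	unsigned long n = 0;

	assert(line_size != 0 && (line_size & mask) == 0);

	if (cap != 0 && len >= cap)	/* cmp r1, #cap; movcs r1, #cap */
		len = cap;

	remaining = (int64_t)len - 1;	/* don't overrun */
	remaining += va & mask;		/* offset of va within its line */

	do {
		n++;			/* one line operation per pass */
		remaining -= line_size;
	} while (remaining >= 0);	/* loop while non-negative */

	return n;
}
```

With 64-byte lines, a 64k request at an aligned address touches 512 lines
under the clamp and 1024 without it, so half of the range is silently
skipped.  A single 4k page is 64 lines either way.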

ok?


Index: arch/arm/arm/cpufunc_asm_armv7.S
===================================================================
RCS file: /cvs/src/sys/arch/arm/arm/cpufunc_asm_armv7.S,v
retrieving revision 1.13
diff -u -p -r1.13 cpufunc_asm_armv7.S
--- arch/arm/arm/cpufunc_asm_armv7.S    6 Aug 2016 16:46:25 -   1.13
+++ arch/arm/arm/cpufunc_asm_armv7.S    15 Aug 2016 19:45:53 -
@@ -103,8 +103,6 @@ ENTRY(armv7_tlb_flushD)
 i_inc   .req r3
 ENTRY(armv7_icache_sync_range)
         ldr     ip, .Larmv7_icache_line_size
-        cmp     r1, #0x8000
-        movcs   r1, #0x8000     /* XXX needs to match cache size... */
         ldr     ip, [ip]
         sub     r1, r1, #1      /* Don't overrun */
         sub     r3, ip, #1
@@ -136,8 +134,6 @@ ENTRY(armv7_icache_sync_all)
 
 ENTRY(armv7_dcache_wb_range)
         ldr     ip, .Larmv7_dcache_line_size
-        cmp     r1, #0x8000
-        movcs   r1, #0x8000     /* XXX needs to match cache size... */
         ldr     ip, [ip]
         sub     r1, r1, #1      /* Don't overrun */
         sub     r3, ip, #1
@@ -155,8 +151,6 @@ ENTRY(armv7_dcache_wb_range)
 
 ENTRY(armv7_idcache_wbinv_range)
         ldr     ip, .Larmv7_idcache_line_size
-        cmp     r1, #0x8000
-        movcs   r1, #0x8000     /* XXX needs to match cache size... */
         ldr     ip, [ip]
         sub     r1, r1, #1      /* Don't overrun */
         sub     r3, ip, #1
@@ -177,8 +171,6 @@ ENTRY(armv7_idcache_wbinv_range)
 
 ENTRY(armv7_dcache_wbinv_range)
         ldr     ip, .Larmv7_dcache_line_size
-        cmp     r1, #0x8000
-        movcs   r1, #0x8000     /* XXX needs to match cache size... */
         ldr     ip, [ip]
         sub     r1, r1, #1      /* Don't overrun */
         sub     r3, ip, #1
@@ -198,8 +190,6 @@ ENTRY(armv7_dcache_wbinv_range)
 
 ENTRY(armv7_dcache_inv_range)
         ldr     ip, .Larmv7_dcache_line_size
-        cmp     r1, #0x8000
-        movcs   r1, #0x8000     /* XXX needs to match cache size... */
         ldr     ip, [ip]
         sub     r1, r1, #1      /* Don't overrun */
         sub     r3, ip, #1