Re: Bogus struct page layout on 32-bit

2021-04-10 Thread Russell King - ARM Linux admin
On Sat, Apr 10, 2021 at 03:06:52PM +0100, Matthew Wilcox wrote:
> How about moving the flags into the union?  A bit messy, but we don't
> have to play games with __packed__.

Yes, that is probably the better solution, avoiding the games to try
and get the union appropriately placed on 32-bit systems.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH v2] mm: Move mem_init_print_info() into mm_init()

2021-03-31 Thread Russell King - ARM Linux admin
On Wed, Mar 17, 2021 at 09:52:10AM +0800, Kefeng Wang wrote:
> mem_init_print_info() is called in mem_init() on each architecture,
> and pass NULL argument, so using void argument and move it into mm_init().
> 
> Acked-by: Dave Hansen 
> Signed-off-by: Kefeng Wang 

Acked-by: Russell King  # for arm bits
-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-23 Thread Russell King - ARM Linux admin
On Mon, Mar 22, 2021 at 06:10:01PM +0100, Cye Borg wrote:
> PWS 500au:
> 
> snow / # lspci -vvx -s 7.1
> 00:07.1 IDE interface: Contaq Microsystems 82c693 (prog-if 80 [ISA
> Compatibility mode-only controller, supports bus mastering])
> Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr+ Stepping- SERR- FastB2B- DisINTx-
> Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
> >TAbort- SERR-  Latency: 0
> Interrupt: pin A routed to IRQ 0
> Region 0: I/O ports at 01f0 [size=8]
> Region 1: I/O ports at 03f4
> Region 4: I/O ports at 9080 [size=16]
> Kernel driver in use: pata_cypress
> Kernel modules: pata_cypress
> 00: 80 10 93 c6 45 00 80 02 00 80 01 01 00 00 80 00
> 10: f1 01 00 00 f5 03 00 00 00 00 00 00 00 00 00 00
> 20: 81 90 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00
> 
> snow / # lspci -vvx -s 7.2
> 00:07.2 IDE interface: Contaq Microsystems 82c693 (prog-if 00 [ISA
> Compatibility mode-only controller])
> Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr+ Stepping- SERR- FastB2B- DisINTx-
> Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
> >TAbort- SERR-  Latency: 0
> Interrupt: pin B routed to IRQ 0
> Region 0: I/O ports at 0170 [size=8]
> Region 1: I/O ports at 0374
> Region 4: Memory at 0c24 (32-bit, non-prefetchable)
> [disabled] [size=64K]
> Kernel modules: pata_cypress
> 00: 80 10 93 c6 45 00 80 02 00 00 01 01 00 00 80 00
> 10: 71 01 00 00 75 03 00 00 00 00 00 00 00 00 00 00
> 20: 00 00 24 0c 00 00 00 00 00 00 00 00 00 00 00 00
> 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00

Thanks very much.

Could I also ask for the output of:

# lspci -vxxx -s 7.0

as well please - this will dump all 256 bytes for the ISA bridge, which
contains a bunch of configuration registers. Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-23 Thread Russell King - ARM Linux admin
On Mon, Mar 22, 2021 at 04:33:14PM +0100, Christoph Hellwig wrote:
> On Mon, Mar 22, 2021 at 04:18:23PM +0100, Christoph Hellwig wrote:
> > On Mon, Mar 22, 2021 at 03:15:03PM +, Russell King - ARM Linux admin 
> > wrote:
> > > It gets worse than that though - due to a change to remove
> > > pcibios_min_io from the generic code, moving it into the ARM
> > > architecture code, this has caused a regression that prevents the
> > > legacy resources being registered against the bus resource. So even
> > > if they are there, they cause probe failures. I haven't found a
> > > reasonable way to solve this yet, but until there is, there is no
> > > way that the PATA driver can be used as the "legacy mode" support
> > > is effectively done via the PCI code assigning virtual IO port
> > > resources.
> > > 
> > > I'm quite surprised that the CY82C693 even works on Alpha - I've
> > > asked for a lspci for that last week but nothing has yet been
> > > forthcoming from whoever responded to your patch for Alpha - so I
> > > can't compare what I'm seeing with what's happening with Alpha.
> > 
> > That sounds like something we could fix with a quirk for function 2
> > in the PCI resource assignment code.  Can you show what vendor and
> > device ID function 2 has so that I could try to come up with one?
> 
> Something like this:

That solves the problem for the IDE driver, which knows how to deal
with legacy mode, but not the PATA driver, which doesn't. The PATA
driver needs these resources.

As I say, having these resources presents a problem on ARM. A previous
commit (3c5d1699887b) changed the way the bus resources are setup which
results in /proc/ioports containing:

-000f : dma1
0020-003f : pic1
0060-006f : i8042
0070-0073 : rtc_cmos
  0070-0073 : rtc0
0080-008f : dma low page
00a0-00bf : pic2
00c0-00df : dma2
0213-0213 : ISAPnP
02f8-02ff : serial8250.0
  02f8-02ff : serial
03c0-03df : vga+
03f8-03ff : serial8250.0
  03f8-03ff : serial
0480-048f : dma high page
0a79-0a79 : isapnp write
1000- : PCI0 I/O
  1000-107f : :00:08.0
1000-107f : 3c59x
  1080-108f : :00:06.1
  1090-109f : :00:07.0
1090-109f : pata_it821x
  10a0-10a7 : :00:07.0
10a0-10a7 : pata_it821x
  10a8-10af : :00:07.0
10a8-10af : pata_it821x
  10b0-10b3 : :00:07.0
10b0-10b3 : pata_it821x
  10b4-10b7 : :00:07.0
10b4-10b7 : pata_it821x

The "PCI0 I/O" resource is the bus level resource, and the legacy
resources can not be claimed against that.

Without these resources, the PATA cypress driver doesn't work.

As I said previously, the reason this regression was not picked up
earlier is because I don't upgrade the kernel on this machine very
often; the machine has had uptimes into thousands of days.

I need to try reverting Rob's commit to find out if anything breaks
on this platform - it's completely wrong from a technical point of
view for any case where we have a PCI southbridge, since the
southbridge provides ISA based resources. I'm not entirely sure
what the point of it was, since we still have the PCIBIOS_MIN_IO
macro which still uses pcibios_min_io.

I'm looking at some of the other changes Rob made back at that time
which also look wrong, such as 8ef6e6201b26 which has the effect of
locating the 21285 IO resources to PCI address 0, over the top of
the ISA southbridge resources. I've no idea what Rob was thinking
when he removed the csrio allocation code in that commit, but
looking at it to day, it's soo obviously wrong even to a casual
glance.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-22 Thread Russell King - ARM Linux admin
On Mon, Mar 22, 2021 at 05:09:13PM +0100, John Paul Adrian Glaubitz wrote:
> On 3/22/21 4:15 PM, Russell King - ARM Linux admin wrote:
> > I'm quite surprised that the CY82C693 even works on Alpha - I've
> > asked for a lspci for that last week but nothing has yet been
> > forthcoming from whoever responded to your patch for Alpha - so I
> > can't compare what I'm seeing with what's happening with Alpha.
> 
> Here is lspci on my DEC Alpha XP-1000:
> 
> root@tsunami:~> lspci
> :00:07.0 ISA bridge: Contaq Microsystems 82c693
> :00:07.1 IDE interface: Contaq Microsystems 82c693
> :00:07.2 IDE interface: Contaq Microsystems 82c693
> :00:07.3 USB controller: Contaq Microsystems 82c693
> :00:0d.0 VGA compatible controller: Texas Instruments TVP4020 [Permedia 
> 2] (rev 01)
> 0001:01:03.0 Ethernet controller: Digital Equipment Corporation DECchip 
> 21142/43 (rev 41)
> 0001:01:06.0 SCSI storage controller: QLogic Corp. ISP1020 Fast-wide SCSI 
> (rev 06)
> 0001:01:08.0 PCI bridge: Digital Equipment Corporation DECchip 21152 (rev 03)
> 0001:02:09.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet 
> Controller (rev 05)
> root@tsunami:~>

This is no good. What I asked last Thursday was:

"Could you send me the output of lspci -vvx -s 7.1 and lspci -vvx -s 7.2
please?"

so I can see the resources the kernel is using and a dump of the PCI
config space to see what the hardware is using.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-22 Thread Russell King - ARM Linux admin
On Mon, Mar 22, 2021 at 04:18:23PM +0100, Christoph Hellwig wrote:
> On Mon, Mar 22, 2021 at 03:15:03PM +0000, Russell King - ARM Linux admin 
> wrote:
> > It gets worse than that though - due to a change to remove
> > pcibios_min_io from the generic code, moving it into the ARM
> > architecture code, this has caused a regression that prevents the
> > legacy resources being registered against the bus resource. So even
> > if they are there, they cause probe failures. I haven't found a
> > reasonable way to solve this yet, but until there is, there is no
> > way that the PATA driver can be used as the "legacy mode" support
> > is effectively done via the PCI code assigning virtual IO port
> > resources.
> > 
> > I'm quite surprised that the CY82C693 even works on Alpha - I've
> > asked for a lspci for that last week but nothing has yet been
> > forthcoming from whoever responded to your patch for Alpha - so I
> > can't compare what I'm seeing with what's happening with Alpha.
> 
> That sounds like something we could fix with a quirk for function 2
> in the PCI resource assignment code.  Can you show what vendor and
> device ID function 2 has so that I could try to come up with one?

I already have a quirk in arch/arm/kernel/bios32.c for this - but it
is no longer sufficient due to changes in the PCI layer, where much
of this is documented.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-22 Thread Russell King - ARM Linux admin
On Mon, Mar 22, 2021 at 03:54:03PM +0100, Christoph Hellwig wrote:
> On Fri, Mar 19, 2021 at 05:53:12PM +0000, Russell King - ARM Linux admin 
> wrote:
> > If I extend the arch/arm/kernel/bios32.c code to kill BARs 2/3 (which
> > actually are not present on the CY82C693) then the IDE driver works
> > for me, but the PATA driver does not:
> > 
> > cy82c693 :00:06.1: IDE controller (0x1080:0xc693 rev 0x00)
> > cy82c693 :00:06.1: not 100% native mode: will probe irqs later
> > legacy IDE will be removed in 2021, please switch to libata
> > Report any missing HW support to linux-...@vger.kernel.org
> > ide0: BM-DMA at 0x1080-0x1087
> > ide1: BM-DMA at 0x1088-0x108f
> > Probing IDE interface ide0...
> > hda: PIONEER DVD-RW DVR-105, ATAPI CD/DVD-ROM drive
> > hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4
> > ...
> > 
> > (unbind Cypress_IDE and try binding pata_cypress)
> > 
> > pata_cypress :00:06.1: no available native port
> 
> This comes from ata_pci_sff_init_host when it tails to initialize
> a port.  There are three cases why it can't initialize the port:
> 
>  1) because it is marked as dummy, which is the case for the second
> port of the cypress controller, but you're not using that even
> with the old ide driver, and we'd still not get that message just
> because of that second port.
>  2) when ata_resources_present returns false because the BAR has
> a zero start or length
>  3) because pcim_iomap_regions() fails.  This prints a warning to the
> log ("failed to request/iomap BARs for port %d (errno=%d)") that you
> should have seen
> 
> So the problem here has to be number two.  The legacy ide driver OTOH
> seems to lack a lot of these checks, although I'm not sure how it
> manages to actually work without those.
> 
> Can you show how the BAR assignment for the device looks using lscpi
> or a tool of your choice?

There's a big problem here. I have to explicitly zero the resources
(getting rid of the legacy ones assigned by the PCI probe code)
because they are in fact _wrong_ for the CY82C693. The PCI code
assumes that PCI function 1 (primary port) and PCI function 2
(secondary port) are two independent dual-channel IDE ports, and
as the PROG-IF of the class code indicates that all ports are in
legacy mode, the PCI code assigns the legacy ioport resources to
_both_ PCI functions. Essentially, the CY82C693 is a bit of an odd-ball
because it splits the two IDE ports across two functions rather than a
single function.

It gets worse than that though - due to a change to remove
pcibios_min_io from the generic code, moving it into the ARM
architecture code, this has caused a regression that prevents the
legacy resources being registered against the bus resource. So even
if they are there, they cause probe failures. I haven't found a
reasonable way to solve this yet, but until there is, there is no
way that the PATA driver can be used as the "legacy mode" support
is effectively done via the PCI code assigning virtual IO port
resources.

I'm quite surprised that the CY82C693 even works on Alpha - I've
asked for a lspci for that last week but nothing has yet been
forthcoming from whoever responded to your patch for Alpha - so I
can't compare what I'm seeing with what's happening with Alpha.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-19 Thread Russell King - ARM Linux admin
On Fri, Mar 19, 2021 at 05:07:53PM +, Russell King - ARM Linux admin wrote:
> On Thu, Mar 18, 2021 at 05:56:58AM +0100, Christoph Hellwig wrote:
> > footbridge_defconfig enables CONFIG_IDE but no actual host controller
> > driver, so just drop it.
> 
> I have been using the Cypress 82C693 IDE driver on Footbridge for a
> CD ROM drive, and I know it doesn't work with the PATA driver - as
> I need to disable BM DMA, otherwise the 82C693/DC21285 combination
> deadlocks the PCI bus. The PATA driver doesn't support disabling
> BM DMA without disabling it for all PATA ports, which is really
> annoying for my IT821x card in the same machine.
> 
> So, I'm rather stuck using the PATA driver for the HDDs and the
> IDE driver for the CD ROM.
> 
> That said, a commit a while back "cleaning up" the PCI layer appears
> to have totally shafted the 82C693, as the kernel tries to request
> IO resources at the legacy IDE addresses against the PCI bus resource
> which only covers 0x1000-0x. Hence, the 82C693 IDE ports are non-
> functional at the moment.
> 
> I'm debating about trying to find a fix to the PCI breakage that was
> introduced by "ARM: move PCI i/o resource setup into common code".
> 
> I hadn't noticed it because I don't use the CD ROM drive very often,
> and I don't upgrade the kernel that often either on the machine -
> but it has been running 24x7 for almost two decades.

Okay, a bit more on this...

If I extend the arch/arm/kernel/bios32.c code to kill BARs 2/3 (which
actually are not present on the CY82C693) then the IDE driver works
for me, but the PATA driver does not:

cy82c693 :00:06.1: IDE controller (0x1080:0xc693 rev 0x00)
cy82c693 :00:06.1: not 100% native mode: will probe irqs later
legacy IDE will be removed in 2021, please switch to libata
Report any missing HW support to linux-...@vger.kernel.org
ide0: BM-DMA at 0x1080-0x1087
ide1: BM-DMA at 0x1088-0x108f
Probing IDE interface ide0...
hda: PIONEER DVD-RW DVR-105, ATAPI CD/DVD-ROM drive
hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4
...

(unbind Cypress_IDE and try binding pata_cypress)

pata_cypress :00:06.1: no available native port


-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-19 Thread Russell King - ARM Linux admin
On Thu, Mar 18, 2021 at 05:56:58AM +0100, Christoph Hellwig wrote:
> footbridge_defconfig enables CONFIG_IDE but no actual host controller
> driver, so just drop it.

I have been using the Cypress 82C693 IDE driver on Footbridge for a
CD ROM drive, and I know it doesn't work with the PATA driver - as
I need to disable BM DMA, otherwise the 82C693/DC21285 combination
deadlocks the PCI bus. The PATA driver doesn't support disabling
BM DMA without disabling it for all PATA ports, which is really
annoying for my IT821x card in the same machine.

So, I'm rather stuck using the PATA driver for the HDDs and the
IDE driver for the CD ROM.

That said, a commit a while back "cleaning up" the PCI layer appears
to have totally shafted the 82C693, as the kernel tries to request
IO resources at the legacy IDE addresses against the PCI bus resource
which only covers 0x1000-0x. Hence, the 82C693 IDE ports are non-
functional at the moment.

I'm debating about trying to find a fix to the PCI breakage that was
introduced by "ARM: move PCI i/o resource setup into common code".

I hadn't noticed it because I don't use the CD ROM drive very often,
and I don't upgrade the kernel that often either on the machine -
but it has been running 24x7 for almost two decades.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH v12 01/14] ARM: mm: add missing pud_page define to 2-level page tables

2021-02-02 Thread Russell King - ARM Linux admin
On Tue, Feb 02, 2021 at 07:47:04PM +0800, Ding Tianhong wrote:
> On 2021/2/2 19:13, Russell King - ARM Linux admin wrote:
> > On Tue, Feb 02, 2021 at 09:05:02PM +1000, Nicholas Piggin wrote:
> >> diff --git a/arch/arm/include/asm/pgtable.h 
> >> b/arch/arm/include/asm/pgtable.h
> >> index c02f24400369..d63a5bb6bd0c 100644
> >> --- a/arch/arm/include/asm/pgtable.h
> >> +++ b/arch/arm/include/asm/pgtable.h
> >> @@ -166,6 +166,9 @@ extern struct page *empty_zero_page;
> >>  
> >>  extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
> >>  
> >> +#define pud_page(pud) pmd_page(__pmd(pud_val(pud)))
> >> +#define pud_write(pud)pmd_write(__pmd(pud_val(pud)))
> > 
> > As there is no PUD, does it really make sense to return a valid
> > struct page (which will be the PTE page) for pud_page(), which is
> > several tables above?
> > 
> --- a/arch/arm/include/asm/pgtable-2level.h
> +++ b/arch/arm/include/asm/pgtable-2level.h
> 
> +static inline int pud_none(pud_t pud)
> +{
> +  return 0;
> +}
> 
> I think it could be fix like this.

We already have that.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH v12 01/14] ARM: mm: add missing pud_page define to 2-level page tables

2021-02-02 Thread Russell King - ARM Linux admin
On Tue, Feb 02, 2021 at 09:05:02PM +1000, Nicholas Piggin wrote:
> diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> index c02f24400369..d63a5bb6bd0c 100644
> --- a/arch/arm/include/asm/pgtable.h
> +++ b/arch/arm/include/asm/pgtable.h
> @@ -166,6 +166,9 @@ extern struct page *empty_zero_page;
>  
>  extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
>  
> +#define pud_page(pud)pmd_page(__pmd(pud_val(pud)))
> +#define pud_write(pud)   pmd_write(__pmd(pud_val(pud)))

As there is no PUD, does it really make sense to return a valid
struct page (which will be the PTE page) for pud_page(), which is
several tables above?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-30 Thread Russell King - ARM Linux admin
On Wed, Dec 30, 2020 at 10:00:28AM +, Russell King - ARM Linux admin wrote:
> On Wed, Dec 30, 2020 at 12:33:02PM +1000, Nicholas Piggin wrote:
> > Excerpts from Russell King - ARM Linux admin's message of December 29, 2020 
> > 8:44 pm:
> > > On Tue, Dec 29, 2020 at 01:09:12PM +1000, Nicholas Piggin wrote:
> > >> I think it should certainly be documented in terms of what guarantees
> > >> it provides to application, _not_ the kinds of instructions it may or
> > >> may not induce the core to execute. And if existing API can't be
> > >> re-documented sanely, then deprecatd and new ones added that DTRT.
> > >> Possibly under a new system call, if arch's like ARM want a range
> > >> flush and we don't want to expand the multiplexing behaviour of
> > >> membarrier even more (sigh).
> > > 
> > > The 32-bit ARM sys_cacheflush() is there only to support self-modifying
> > > code, and takes whatever actions are necessary to support that.
> > > Exactly what actions it takes are cache implementation specific, and
> > > should be of no concern to the caller, but the underlying thing is...
> > > it's to support self-modifying code.
> > 
> >Caveat
> >cacheflush()  should  not  be used in programs intended to be 
> > portable.
> >On Linux, this call first appeared on the MIPS architecture, but  
> > nowa‐
> >days, Linux provides a cacheflush() system call on some other 
> > architec‐
> >tures, but with different arguments.
> > 
> > What a disaster. Another badly designed interface, although it didn't 
> > originate in Linux it sounds like we weren't to be outdone so
> > we messed it up even worse.
> > 
> > flushing caches is neither necessary nor sufficient for code modification
> > on many processors. Maybe some old MIPS specific private thing was fine,
> > but certainly before it grew to other architectures, somebody should 
> > have thought for more than 2 minutes about it. Sigh.
> 
> WARNING: You are bordering on being objectionable and offensive with
> that comment.
> 
> The ARM interface was designed by me back in the very early days of
> Linux, probably while you were still in dypers, based on what was
> known at the time.  Back in the early 2000s, ideas such as relaxed
> memory ordering were not known.  All there was was one level of
> harvard cache.

Sorry, I got that slightly wrong. Its origins on ARM were from 12 Dec
1998:

http://www.armlinux.org.uk/developer/patches/viewpatch.php?id=88/1

by Philip Blundell, and first appeared in the ARM
pre-patch-2.1.131-19981214-1.gz. It was subsequently documented in the
kernel sources by me in July 2001 in ARM patch-2.4.6-rmk2.gz. It has
a slightly different signature: the third argument on ARM is a flags
argument, whereas the MIPS code, it is some undocumented "cache"
parameter.

Whether the ARM version pre or post dates the MIPS code, I couldn't say.
Whether it was ultimately taken from the MIPS implementation, again, I
couldn't say.

However, please stop insulting your fellow developers ability to think.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-30 Thread Russell King - ARM Linux admin
On Wed, Dec 30, 2020 at 12:33:02PM +1000, Nicholas Piggin wrote:
> Excerpts from Russell King - ARM Linux admin's message of December 29, 2020 
> 8:44 pm:
> > On Tue, Dec 29, 2020 at 01:09:12PM +1000, Nicholas Piggin wrote:
> >> I think it should certainly be documented in terms of what guarantees
> >> it provides to application, _not_ the kinds of instructions it may or
> >> may not induce the core to execute. And if existing API can't be
> >> re-documented sanely, then deprecatd and new ones added that DTRT.
> >> Possibly under a new system call, if arch's like ARM want a range
> >> flush and we don't want to expand the multiplexing behaviour of
> >> membarrier even more (sigh).
> > 
> > The 32-bit ARM sys_cacheflush() is there only to support self-modifying
> > code, and takes whatever actions are necessary to support that.
> > Exactly what actions it takes are cache implementation specific, and
> > should be of no concern to the caller, but the underlying thing is...
> > it's to support self-modifying code.
> 
>Caveat
>cacheflush()  should  not  be used in programs intended to be portable.
>On Linux, this call first appeared on the MIPS architecture, but  nowa‐
>days, Linux provides a cacheflush() system call on some other architec‐
>tures, but with different arguments.
> 
> What a disaster. Another badly designed interface, although it didn't 
> originate in Linux it sounds like we weren't to be outdone so
> we messed it up even worse.
> 
> flushing caches is neither necessary nor sufficient for code modification
> on many processors. Maybe some old MIPS specific private thing was fine,
> but certainly before it grew to other architectures, somebody should 
> have thought for more than 2 minutes about it. Sigh.

WARNING: You are bordering on being objectionable and offensive with
that comment.

The ARM interface was designed by me back in the very early days of
Linux, probably while you were still in dypers, based on what was
known at the time.  Back in the early 2000s, ideas such as relaxed
memory ordering were not known.  All there was was one level of
harvard cache.

So, juts shut up with your insults.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-29 Thread Russell King - ARM Linux admin
On Tue, Dec 29, 2020 at 01:09:12PM +1000, Nicholas Piggin wrote:
> I think it should certainly be documented in terms of what guarantees
> it provides to application, _not_ the kinds of instructions it may or
> may not induce the core to execute. And if existing API can't be
> re-documented sanely, then deprecatd and new ones added that DTRT.
> Possibly under a new system call, if arch's like ARM want a range
> flush and we don't want to expand the multiplexing behaviour of
> membarrier even more (sigh).

The 32-bit ARM sys_cacheflush() is there only to support self-modifying
code, and takes whatever actions are necessary to support that.
Exactly what actions it takes are cache implementation specific, and
should be of no concern to the caller, but the underlying thing is...
it's to support self-modifying code.

Sadly, because it's existed for 20+ years, and it has historically been
sufficient for other purposes too, it has seen quite a bit of abuse
despite its design purpose not changing - it's been used by graphics
drivers for example. They quickly learnt the error of their ways with
ARMv6+, since it does not do sufficient for their purposes given the
cache architectures found there.

Let's not go around redesigning this after twenty odd years, requiring
a hell of a lot of pain to users. This interface is called by code
generated by GCC, so to change it you're looking at patching GCC as
well as the kernel, and you basically will make new programs
incompatible with older kernels - very bad news for users.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-28 Thread Russell King - ARM Linux admin
On Mon, Dec 28, 2020 at 11:44:33AM -0800, Andy Lutomirski wrote:
> On Mon, Dec 28, 2020 at 11:09 AM Russell King - ARM Linux admin
>  wrote:
> >
> > On Mon, Dec 28, 2020 at 07:29:34PM +0100, Jann Horn wrote:
> > > After chatting with rmk about this (but without claiming that any of
> > > this is his opinion), based on the manpage, I think membarrier()
> > > currently doesn't really claim to be synchronizing caches? It just
> > > serializes cores. So arguably if userspace wants to use membarrier()
> > > to synchronize code changes, userspace should first do the code
> > > change, then flush icache as appropriate for the architecture, and
> > > then do the membarrier() to ensure that the old code is unused?
> > >
> > > For 32-bit arm, rmk pointed out that that would be the cacheflush()
> > > syscall. That might cause you to end up with two IPIs instead of one
> > > in total, but we probably don't care _that_ much about extra IPIs on
> > > 32-bit arm?
> > >
> > > For arm64, I believe userspace can flush icache across the entire
> > > system with some instructions from userspace - "DC CVAU" followed by
> > > "DSB ISH", or something like that, I think? (See e.g.
> > > compat_arm_syscall(), the arm64 compat code that implements the 32-bit
> > > arm cacheflush() syscall.)
> >
> > Note that the ARM cacheflush syscall calls flush_icache_user_range()
> > over the range of addresses that userspace has passed - it's intention
> > since day one is to support cases where userspace wants to change
> > executable code.
> >
> > It will issue the appropriate write-backs to the data cache (DCCMVAU),
> > the invalidates to the instruction cache (ICIMVAU), invalidate the
> > branch target buffer (BPIALLIS or BPIALL as appropriate), and issue
> > the appropriate barriers (DSB ISHST, ISB).
> >
> > Note that neither flush_icache_user_range() nor flush_icache_range()
> > result in IPIs; cache operations are broadcast across all CPUs (which
> > is one of the minimums we require for SMP systems.)
> >
> > Now, that all said, I think the question that has to be asked is...
> >
> > What is the basic purpose of membarrier?
> >
> > Is the purpose of it to provide memory barriers, or is it to provide
> > memory coherence?
> >
> > If it's the former and not the latter, then cache flushes are out of
> > scope, and expecting memory written to be visible to the instruction
> > stream is totally out of scope of the membarrier interface, whether
> > or not the writes happen on the same or a different CPU to the one
> > executing the rewritten code.
> >
> > The documentation in the kernel does not seem to describe what it's
> > supposed to be doing - the only thing I could find is this:
> > Documentation/features/sched/membarrier-sync-core/arch-support.txt
> > which describes it as "arch supports core serializing membarrier"
> > whatever that means.
> >
> > Seems to be the standard and usual case of utterly poor to non-existent
> > documentation within the kernel tree, or even a pointer to where any
> > useful documentation can be found.
> >
> > Reading the membarrier(2) man page, I find nothing in there that talks
> > about any kind of cache coherency for self-modifying code - it only
> > seems to be about _barriers_ and nothing more, and barriers alone do
> > precisely nothing to save you from non-coherent Harvard caches.
> >
> > So, either Andy has a misunderstanding, or the man page is wrong, or
> > my rudimentary understanding of what membarrier is supposed to be
> > doing is wrong...
> 
> Look at the latest man page:
> 
> https://man7.org/linux/man-pages/man2/membarrier.2.html
> 
> for MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE.  The result may not be
> all that enlightening.

   MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE (since Linux 4.16)
  In  addition  to  providing  the  memory ordering guarantees de■
  scribed in MEMBARRIER_CMD_PRIVATE_EXPEDITED,  upon  return  from
  system call the calling thread has a guarantee that all its run■
  ning thread siblings have executed a core  serializing  instruc■
  tion.   This  guarantee is provided only for threads in the same
  process as the calling thread.

  The "expedited" commands complete faster than the  non-expedited
  ones,  they  never block, but have the downside of causing extra
  overhead.

  A process must register its intent to use the priv

Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-28 Thread Russell King - ARM Linux admin
On Mon, Dec 28, 2020 at 07:29:34PM +0100, Jann Horn wrote:
> After chatting with rmk about this (but without claiming that any of
> this is his opinion), based on the manpage, I think membarrier()
> currently doesn't really claim to be synchronizing caches? It just
> serializes cores. So arguably if userspace wants to use membarrier()
> to synchronize code changes, userspace should first do the code
> change, then flush icache as appropriate for the architecture, and
> then do the membarrier() to ensure that the old code is unused?
> 
> For 32-bit arm, rmk pointed out that that would be the cacheflush()
> syscall. That might cause you to end up with two IPIs instead of one
> in total, but we probably don't care _that_ much about extra IPIs on
> 32-bit arm?
> 
> For arm64, I believe userspace can flush icache across the entire
> system with some instructions from userspace - "DC CVAU" followed by
> "DSB ISH", or something like that, I think? (See e.g.
> compat_arm_syscall(), the arm64 compat code that implements the 32-bit
> arm cacheflush() syscall.)

Note that the ARM cacheflush syscall calls flush_icache_user_range()
over the range of addresses that userspace has passed - it's intention
since day one is to support cases where userspace wants to change
executable code.

It will issue the appropriate write-backs to the data cache (DCCMVAU),
the invalidates to the instruction cache (ICIMVAU), invalidate the
branch target buffer (BPIALLIS or BPIALL as appropriate), and issue
the appropriate barriers (DSB ISHST, ISB).

Note that neither flush_icache_user_range() nor flush_icache_range()
result in IPIs; cache operations are broadcast across all CPUs (which
is one of the minimums we require for SMP systems.)

Now, that all said, I think the question that has to be asked is...

What is the basic purpose of membarrier?

Is the purpose of it to provide memory barriers, or is it to provide
memory coherence?

If it's the former and not the latter, then cache flushes are out of
scope, and expecting memory written to be visible to the instruction
stream is totally out of scope of the membarrier interface, whether
or not the writes happen on the same or a different CPU to the one
executing the rewritten code.

The documentation in the kernel does not seem to describe what it's
supposed to be doing - the only thing I could find is this:
Documentation/features/sched/membarrier-sync-core/arch-support.txt
which describes it as "arch supports core serializing membarrier"
whatever that means.

Seems to be the standard and usual case of utterly poor to non-existent
documentation within the kernel tree, or even a pointer to where any
useful documentation can be found.

Reading the membarrier(2) man page, I find nothing in there that talks
about any kind of cache coherency for self-modifying code - it only
seems to be about _barriers_ and nothing more, and barriers alone do
precisely nothing to save you from non-coherent Harvard caches.

So, either Andy has a misunderstanding, or the man page is wrong, or
my rudimentary understanding of what membarrier is supposed to be
doing is wrong...

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-28 Thread Russell King - ARM Linux admin
On Mon, Dec 28, 2020 at 09:14:23AM -0800, Andy Lutomirski wrote:
> On Mon, Dec 28, 2020 at 2:25 AM Russell King - ARM Linux admin
>  wrote:
> >
> > On Sun, Dec 27, 2020 at 01:36:13PM -0800, Andy Lutomirski wrote:
> > > On Sun, Dec 27, 2020 at 12:18 PM Mathieu Desnoyers
> > >  wrote:
> > > >
> > > > - On Dec 27, 2020, at 1:28 PM, Andy Lutomirski l...@kernel.org 
> > > > wrote:
> > > >
> > >
> > > > >
> > > > > I admit that I'm rather surprised that the code worked at all on 
> > > > > arm64,
> > > > > and I'm suspicious that it has never been very well tested.  My 
> > > > > apologies
> > > > > for not reviewing this more carefully in the first place.
> > > >
> > > > Please refer to 
> > > > Documentation/features/sched/membarrier-sync-core/arch-support.txt
> > > >
> > > > It clearly states that only arm, arm64, powerpc and x86 support the 
> > > > membarrier
> > > > sync core feature as of now:
> > >
> > > Sigh, I missed arm (32).  Russell or ARM folks, what's the right
> > > incantation to make the CPU notice instruction changes initiated by
> > > other cores on 32-bit ARM?
> >
> > You need to call flush_icache_range(), since the changes need to be
> > flushed from the data cache to the point of unification (of the Harvard
> > I and D), and the instruction cache needs to be invalidated so it can
> > then see those updated instructions. This will also take care of the
> > necessary barriers that the CPU requires for you.
> 
> With what parameters?   From looking at the header, this is for the
> case in which the kernel writes some memory and then intends to
> execute it.  That's not what membarrier() does at all.  membarrier()
> works like this:

You didn't specify that you weren't looking at kernel memory.

If you're talking about userspace, then the interface you require
is flush_icache_user_range(), which does the same as
flush_icache_range() but takes userspace addresses. Note that this
requires that the memory is currently mapped at those userspace
addresses.

If that doesn't fit your needs, there isn't an interface to do what
you require, and it basically means creating something brand new
on every architecture.

What you are asking for is not "just a matter of a few instructions".
I have stated the required steps to achieve what you require above;
that is the minimum when you have non-snooping harvard caches, which
the majority of 32-bit ARMs have.

> User thread 1:
> 
> write to RWX memory *or* write to an RW alias of an X region.
> membarrier(...);
> somehow tell thread 2 that we're ready (with a store release, perhaps,
> or even just a relaxed store.)
> 
> User thread 2:
> 
> wait for the indication from thread 1.
> barrier();
> jump to the code.
> 
> membarrier() is, for better or for worse, not given a range of addresses.

Then, I'm sorry, it can't work on 32-bit ARM.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-28 Thread Russell King - ARM Linux admin
On Sun, Dec 27, 2020 at 01:36:13PM -0800, Andy Lutomirski wrote:
> On Sun, Dec 27, 2020 at 12:18 PM Mathieu Desnoyers
>  wrote:
> >
> > - On Dec 27, 2020, at 1:28 PM, Andy Lutomirski l...@kernel.org wrote:
> >
> 
> > >
> > > I admit that I'm rather surprised that the code worked at all on arm64,
> > > and I'm suspicious that it has never been very well tested.  My apologies
> > > for not reviewing this more carefully in the first place.
> >
> > Please refer to 
> > Documentation/features/sched/membarrier-sync-core/arch-support.txt
> >
> > It clearly states that only arm, arm64, powerpc and x86 support the 
> > membarrier
> > sync core feature as of now:
> 
> Sigh, I missed arm (32).  Russell or ARM folks, what's the right
> incantation to make the CPU notice instruction changes initiated by
> other cores on 32-bit ARM?

You need to call flush_icache_range(), since the changes need to be
flushed from the data cache to the point of unification (of the Harvard
I and D), and the instruction cache needs to be invalidated so it can
then see those updated instructions. This will also take care of the
necessary barriers that the CPU requires for you.

... as documented in Documentation/core-api/cachetlb.rst and so
should be available on every kernel supported CPU.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH v2 1/3] module: Rename module_alloc() to text_alloc() and move to kernel proper

2020-07-14 Thread Russell King - ARM Linux admin
On Tue, Jul 14, 2020 at 03:01:09PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 14, 2020 at 03:19:24PM +0300, Ard Biesheuvel wrote:
> > So perhaps the answer is to have text_alloc() not with a 'where'
> > argument but with a 'why' argument. Or more simply, just have separate
> > alloc/free APIs for each case, with generic versions that can be
> > overridden by the architecture.
> 
> Well, there only seem to be 2 cases here, either the pointer needs to
> fit in some immediate displacement, or not.
> 
> On x86 we seem have the advantage of a fairly large immediate
> displacement as compared to many other architectures (due to our
> variable sized instructions). And thus have been fairly liberal with our
> usage of it (also our indirect jmps/calls suck, double so with
> RETCH-POLINE).
> 
> Still, the indirect jump, as mentioned by Russel should work for
> arbitrarily placed code for us too.
> 
> 
> So I'm thinking that something like:
> 
> enum ptr_type {
>   immediate_displacement,
>   absolute,
> };
> 
> void *text_alloc(unsigned long size, enum ptr_type type)
> {
>   unsigned long vstart = VMALLOC_START;
>   unsigned long vend   = VMALLOC_END;
> 
>   if (type == immediate_displacement) {
>   vstart = MODULES_VADDR;
>   vend   = MODULES_END;
>   }
> 
>   return __vmalloc_node_range(size, TEXT_ALIGN, vstart, vend,
>   GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
>   NUMA_NO_NODE, _RET_IP_);
> }
> 
> void text_free(void *ptr)
> {
>   vfree(ptr);
> }

Beware, however, that on 32-bit ARM, if module PLTs are enabled,
we will try to place the module in the module region (which gives
best performance) but if that allocation fails, we will fall back
to placing it in the vmalloc region and using PLTs.

So, for a module allocation, we would need to make up to two calls
to text_alloc(), once with "immediate_displacement", and if that
fails and we have PLT support enabled, again with "absolute".

Hence, as other people have said, why module_alloc() would need to
stay separate.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 1/3] module: Rename module_alloc() to text_alloc() and move to kernel proper

2020-07-14 Thread Russell King - ARM Linux admin
On Tue, Jul 14, 2020 at 01:17:22PM +0300, Ard Biesheuvel wrote:
> On Tue, 14 Jul 2020 at 12:53, Jarkko Sakkinen
>  wrote:
> >
> > On Mon, Jul 13, 2020 at 10:49:48PM +0300, Ard Biesheuvel wrote:
> > > This patch suggests that there are other reasons why conflating
> > > allocation of module space and allocating  text pages for other uses
> > > is a bad idea, but switching all users to text_alloc() is a step in
> > > the wrong direction. It would be better to stop using module_alloc()
> > > in core code except in the module loader, and have a generic
> > > text_alloc() that can be overridden by the arch if necessary. Note
> > > that x86  and s390 are the only architectures that use module_alloc()
> > > in ftrace code.
> >
> > This series essentially does this: introduces text_alloc() and
> > text_memfree(), which have generic implementations in kernel/text.c.
> > Those can be overriddent by arch specific implementations.
> >
> > What you think should be done differently than in my patch set?
> >
> 
> On arm64, module_alloc is only used by the module loader, and so
> pulling it out and renaming it will cause unused code to be
> incorporated into the kernel when building without module support,
> which is the use case you claim to be addressing.
> 
> Module_alloc has semantics that are intimately tied to the module
> loader, but over the years, it ended up being (ab)used by other
> subsystems, which don't require those semantics but just need n pages
> of vmalloc space with executable permissions.
> 
> So the correct approach is to make text_alloc() implement just that,
> generically, and switch bpf etc to use it. Then, only on architectures
> that need it, override it with an implementation that has the required
> additional semantics.
> 
> Refactoring 10+ architectures like this without any regard for how
> text_alloc() deviates from module_alloc() just creates a lot of churn
> that others will have to clean up after you.

For 32-bit ARM, our bpf code uses "blx/bx" (or equivalent code
sequences) rather than encoding a "bl" or "b", so BPF there doesn't
care where the executable memory is mapped, and doesn't need any
PLTs.  Given that, should bpf always allocate from the vmalloc()
region to preserve the module space for modules?

I'm more concerned about ftrace though, but only because I don't
have the understanding of that subsystem to really say whether there
are any side effects from having the allocations potentially be out
of range of a "bl" or "b" instruction.

If ftrace jumps both to and from the allocated page using a "load
address to register, branch to register" approach like BPF does, then
ftrace should be safe - and again, raises the issue that maybe it
should always come from vmalloc() space.

So, I think we need to keep module_alloc() for allocating memory for
modules, and provide a new text_alloc() for these other cases.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 1/3] module: Rename module_alloc() to text_alloc() and move to kernel proper

2020-07-14 Thread Russell King - ARM Linux admin
On Tue, Jul 14, 2020 at 12:49:28PM +0300, Jarkko Sakkinen wrote:
> On Mon, Jul 13, 2020 at 07:34:10PM +0100, Russell King - ARM Linux admin 
> wrote:
> > On Mon, Jul 13, 2020 at 09:19:37PM +0300, Jarkko Sakkinen wrote:
> > > Rename module_alloc() to text_alloc() and module_memfree() to
> > > text_memfree(), and move them to kernel/text.c, which is unconditionally
> > > compiled to the kernel proper. This allows kprobes, ftrace and bpf to
> > > allocate space for executable code without requiring to compile the 
> > > modules
> > > support (CONFIG_MODULES=y) in.
> > 
> > I'm not sure this is a good idea for 32-bit ARM.  The code you are
> > moving for 32-bit ARM is quite specific to module use in that it also
> > supports allocating from the vmalloc area, where the module code
> > knows to add PLT entries.
> > 
> > If the other proposed users of this text_alloc() do not have the logic
> > to add PLT entries when branches between kernel code and this
> > allocation are not reachable by, e.g. a 26-bit signed offset for 32-bit
> > ARM code, then this code is not suitable for that use.
> 
> My intention is to use this in kprobes code in the place of
> module_alloc().  I'm not sure why moving this code out of the module
> subsystem could possibly break anything.  Unfortunately I forgot to add
> covere letter to my series. Sending v2 with that to explain my use case
> for this.

Ah, so you're merely renaming module_alloc() to text_alloc() everywhere?
It sounded from the initial patch like you were also converting other
users to use module_alloc().

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH v2 1/3] module: Rename module_alloc() to text_alloc() and move to kernel proper

2020-07-14 Thread Russell King - ARM Linux admin
On Tue, Jul 14, 2020 at 12:45:36PM +0300, Jarkko Sakkinen wrote:
> Rename module_alloc() to text_alloc() and module_memfree() to
> text_memfree(), and move them to kernel/text.c, which is unconditionally
> compiled to the kernel proper. This allows kprobes, ftrace and bpf to
> allocate space for executable code without requiring to compile the modules
> support (CONFIG_MODULES=y) in.

As I said in response to your first post of this, which seems to have
gone unacknowledged, I don't think this is a good idea.

So, NAK on the whole concept this without some discussion on the 32-bit
ARM issues.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH 1/3] module: Rename module_alloc() to text_alloc() and move to kernel proper

2020-07-13 Thread Russell King - ARM Linux admin
On Mon, Jul 13, 2020 at 09:19:37PM +0300, Jarkko Sakkinen wrote:
> Rename module_alloc() to text_alloc() and module_memfree() to
> text_memfree(), and move them to kernel/text.c, which is unconditionally
> compiled to the kernel proper. This allows kprobes, ftrace and bpf to
> allocate space for executable code without requiring to compile the modules
> support (CONFIG_MODULES=y) in.

I'm not sure this is a good idea for 32-bit ARM.  The code you are
moving for 32-bit ARM is quite specific to module use in that it also
supports allocating from the vmalloc area, where the module code
knows to add PLT entries.

If the other proposed users of this text_alloc() do not have the logic
to add PLT entries when branches between kernel code and this
allocation are not reachable by, e.g. a 26-bit signed offset for 32-bit
ARM code, then this code is not suitable for that use.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH v2 5/5] uaccess: Rename user_access_begin/end() to user_full_access_begin/end()

2020-04-21 Thread Russell King - ARM Linux admin
On Tue, Apr 21, 2020 at 03:49:19AM +0100, Al Viro wrote:
>   The only source I'd been able to find speeks of >= 60 cycles
> (and possibly much more) for non-pipelined coprocessor instructions;
> the list of such does contain loads and stores to a bunch of registers.
> However, the register in question (p15/c3) has only store mentioned there,
> so loads might be cheap; no obvious reasons for those to be slow.
> That's a question to arm folks, I'm afraid...  rmk?

I have no information on that; instruction timings are not defined
at architecture level (architecture reference manual), nor do I find
information in the CPU technical reference manual (which would be
specific to the CPU). Instruction timings tend to be implementation
dependent.

I've always consulted Will Deacon when I've needed to know whether
an instruction is expensive or not.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH] soc: fsl: dpio: avoid stack usage warning

2020-04-08 Thread Russell King - ARM Linux admin
On Wed, Apr 08, 2020 at 08:58:16PM +0200, Arnd Bergmann wrote:
> A 1024 byte variable on the stack will warn on any 32-bit architecture
> during compile-testing, and is generally a bad idea anyway:
> 
> fsl/dpio/dpio-service.c: In function 
> 'dpaa2_io_service_enqueue_multiple_desc_fq':
> fsl/dpio/dpio-service.c:495:1: error: the frame size of 1032 bytes is larger 
> than 1024 bytes [-Werror=frame-larger-than=]
> 
> There are currently no callers of this function, so I cannot tell whether
> dynamic memory allocation is allowed once callers are added. Change
> it to kcalloc for now, if anyone gets a warning about calling this in
> atomic context after they start using it, they can fix it later.
> 
> Fixes: 9d98809711ae ("soc: fsl: dpio: Adding QMAN multiple enqueue interface")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/soc/fsl/dpio/dpio-service.c | 18 +-
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/soc/fsl/dpio/dpio-service.c 
> b/drivers/soc/fsl/dpio/dpio-service.c
> index cd4f6410e8c2..ff0ef8cbdbff 100644
> --- a/drivers/soc/fsl/dpio/dpio-service.c
> +++ b/drivers/soc/fsl/dpio/dpio-service.c
> @@ -478,12 +478,17 @@ int dpaa2_io_service_enqueue_multiple_desc_fq(struct 
> dpaa2_io *d,
>   const struct dpaa2_fd *fd,
>   int nb)
>  {
> - int i;
> - struct qbman_eq_desc ed[32];
> + struct qbman_eq_desc *ed = kcalloc(sizeof(struct qbman_eq_desc), 32, 
> GFP_KERNEL);

I think you need to rearrange this to be more compliant with the coding
style.

> + int i, ret;
> +
> + if (!ed)
> + return -ENOMEM;
>  
>   d = service_select(d);
> - if (!d)
> - return -ENODEV;
> + if (!d) {
> + ret = -ENODEV;
> + goto out;
> + }
>  
>   for (i = 0; i < nb; i++) {
>   qbman_eq_desc_clear([i]);
> @@ -491,7 +496,10 @@ int dpaa2_io_service_enqueue_multiple_desc_fq(struct 
> dpaa2_io *d,
>   qbman_eq_desc_set_fq([i], fqid[i]);
>   }
>  
> - return qbman_swp_enqueue_multiple_desc(d->swp, [0], fd, nb);
> + ret = qbman_swp_enqueue_multiple_desc(d->swp, [0], fd, nb);
> +out:
> + kfree(ed);
> + return ret;
>  }
>  EXPORT_SYMBOL(dpaa2_io_service_enqueue_multiple_desc_fq);
>  
> -- 
> 2.26.0
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: decruft the vmalloc API

2020-04-08 Thread Russell King - ARM Linux admin
On Wed, Apr 08, 2020 at 01:58:58PM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> Peter noticed that with some dumb luck you can toast the kernel address
> space with exported vmalloc symbols.
> 
> I used this as an opportunity to decruft the vmalloc.c API and make it
> much more systematic.  This also removes any chance to create vmalloc
> mappings outside the designated areas or using executable permissions
> from modules.  Besides that it removes more than 300 lines of code.

I haven't read all your patches yet.

Have you tested it on 32-bit ARM, where the module area is located
_below_ PAGE_OFFSET and outside of the vmalloc area?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH RESEND 1/4] uaccess: Add user_read_access_begin/end and user_write_access_begin/end

2020-04-03 Thread Russell King - ARM Linux admin
On Fri, Apr 03, 2020 at 12:26:10PM +0100, Catalin Marinas wrote:
> On Fri, Apr 03, 2020 at 01:58:31AM +0100, Al Viro wrote:
> > On Thu, Apr 02, 2020 at 11:35:57AM -0700, Kees Cook wrote:
> > > Yup, I think it's a weakness of the ARM implementation and I'd like to
> > > not extend it further. AFAIK we should never nest, but I would not be
> > > surprised at all if we did.
> > > 
> > > If we were looking at a design goal for all architectures, I'd like
> > > to be doing what the public PaX patchset did for their memory access
> > > switching, which is to alarm if calling into "enable" found the access
> > > already enabled, etc. Such a condition would show an unexpected nesting
> > > (like we've seen with similar constructs with set_fs() not getting reset
> > > during an exception handler, etc etc).
> > 
> > FWIW, maybe I'm misreading the ARM uaccess logics, but... it smells like
> > KERNEL_DS is somewhat more dangerous there than on e.g. x86.
> > 
> > Look: with CONFIG_CPU_DOMAINS, set_fs(KERNEL_DS) tells MMU to ignore
> > per-page permission bits in DOMAIN_KERNEL (i.e. for kernel address
> > ranges), allowing them even if they would normally be denied.  We need
> > that for actual uaccess loads/stores, since those use insns that pretend
> > to be done in user mode and we want them to access the kernel pages.
> > But that affects the normal loads/stores as well; unless I'm misreading
> > that code, it will ignore (supervisor) r/o on a page.  And that's not
> > just for the code inside the uaccess blocks; *everything* done under
> > KERNEL_DS is subject to that.
> 
> That's correct. Luckily this only affects ARMv5 and earlier. From ARMv6
> onwards, CONFIG_CPU_USE_DOMAINS is no longer selected and the uaccess
> instructions are just plain ldr/str.
> 
> Russell should know the details on whether there was much choice. Since
> the kernel was living in the linear map with full rwx permissions, the
> KERNEL_DS overriding was probably not a concern and the ldrt/strt for
> uaccess deemed more secure. We also have weird permission setting
> pre-ARMv6 (or rather v6k) where RO user pages are writable from the
> kernel with standard str instructions (breaking CoW). I don't recall
> whether it was a choice made by the kernel or something the architecture
> enforced. The vectors page has to be kernel writable (and user RO) to
> store the TLS value in the absence of a TLS register but maybe we could
> do this via the linear alias together with the appropriate cache
> maintenance.
> 
> From ARMv6, the domain overriding had the side-effect of ignoring the XN
> bit and causing random instruction fetches from ioremap() areas. So we
> had to remove the domain switching. We also gained a dedicated TLS
> register.

Indeed.  On pre-ARMv6, we have the following choices for protection
attributes:

Page tables Control Reg Privileged  User
AP  S,R permission  permission
00  0,0 No access   No access
00  1,0 Read-only   No access
00  0,1 Read-only   Read-only
00  1,1 Unpredictable   Unpredictable
01  X,X Read/Write  No access
10  X,X Read/Write  Read-only
11  X,X Read/Write  Read/Write

We use S,R=1,0 under Linux because this allows us to read-protect
kernel pages without making them visible to userspace.  If we
changed to S,R=0,1, then we could have our read-only permissions
for both kernel and userspace, drop domain switching, and use the
plain LDR/STR instructions, but we then lose the ability to
write-protect module executable code and other parts of kernel
space without making them visible to userspace.

So, it essentially boils down to making a choice - which set of
security features we think are the most important.

> I think uaccess_enable() could indeed switch the kernel domain if
> KERNEL_DS is set and move this out of set_fs(). It would reduce the
> window the kernel domain permissions are overridden. Anyway,
> uaccess_enable() appeared much later on arm when Russell introduced PAN
> (SMAP) like support by switching the user domain.

Yes, that would be a possibility.  Another possibility would be to
eliminate as much usage of KERNEL_DS as possible - I've just found
one instance in sys_oabi-compat.c that can be eliminated (epoll_ctl)
but there's several there that can't with the current code structure,
and re-coding the contents of some fs/* functions to work around that
is a very bad idea.  If there's some scope for rejigging some of the
fs/* code, it may be possible to elimate some other cases in there.

I notice that the fs/* code seems like some of the last remaining
users of KERNEL_DS, although I suspect that some aren't possible to
eliminate. :(

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps 

Re: [PATCH RESEND 1/4] uaccess: Add user_read_access_begin/end and user_write_access_begin/end

2020-04-03 Thread Russell King - ARM Linux admin
On Thu, Apr 02, 2020 at 11:35:57AM -0700, Kees Cook wrote:
> On Thu, Apr 02, 2020 at 06:50:32PM +0100, Al Viro wrote:
> > On Thu, Apr 02, 2020 at 07:03:28PM +0200, Christophe Leroy wrote:
> > 
> > > user_access_begin() grants both read and write.
> > > 
> > > This patch adds user_read_access_begin() and user_write_access_begin() but
> > > it doesn't remove user_access_begin()
> > 
> > Ouch...  So the most generic name is for the rarest case?
> >  
> > > > What should we do about that?  Do we prohibit such blocks outside
> > > > of arch?
> > > > 
> > > > What should we do about arm and s390?  There we want a cookie passed
> > > > from beginning of block to its end; should that be a return value?
> > > 
> > > That was the way I implemented it in January, see
> > > https://patchwork.ozlabs.org/patch/1227926/
> > > 
> > > There was some discussion around that and most noticeable was:
> > > 
> > > H. Peter (hpa) said about it: "I have *deep* concern with carrying state 
> > > in
> > > a "key" variable: it's a direct attack vector for a crowbar attack,
> > > especially since it is by definition live inside a user access region."
> > 
> > > This patch minimises the change by just adding user_read_access_begin() 
> > > and
> > > user_write_access_begin() keeping the same parameters as the existing
> > > user_access_begin().
> > 
> > Umm...  What about the arm situation?  The same concerns would apply there,
> > wouldn't they?  Currently we have
> > static __always_inline unsigned int uaccess_save_and_enable(void)
> > {
> > #ifdef CONFIG_CPU_SW_DOMAIN_PAN
> > unsigned int old_domain = get_domain();
> > 
> > /* Set the current domain access to permit user accesses */
> > set_domain((old_domain & ~domain_mask(DOMAIN_USER)) |
> >domain_val(DOMAIN_USER, DOMAIN_CLIENT));
> > 
> > return old_domain;
> > #else
> > return 0;
> > #endif
> > }
> > and
> > static __always_inline void uaccess_restore(unsigned int flags)
> > {
> > #ifdef CONFIG_CPU_SW_DOMAIN_PAN
> > /* Restore the user access mask */
> > set_domain(flags);
> > #endif
> > }
> > 
> > How much do we need nesting on those, anyway?  rmk?

It's that way because it's easy, logical, and actually *more* efficient
to do it that way, rather than read-modify-write the domain register
each time we want to change it.

> Yup, I think it's a weakness of the ARM implementation and I'd like to
> not extend it further. AFAIK we should never nest, but I would not be
> surprised at all if we did.

There is one known nesting, which is __clear_user() when used with
the (IMHO horrid and I don't care about) UACCESS_WITH_MEMCPY feature.
That's not intentional however.

When I introduced this on ARM, the placement I adopted was to locate
it _as close as sanely possible_ to the userspace access so we
minimised the kernel accesses, so we minimise the number of accesses
that could go stray because of the domain issue - we ideally only
want the access done by the accessor itself to be affected, which
we achieve for most accesses.

Thinking laterally, maybe we should get rid of the whole KERNEL_DS
stuff entirely, and come up with an alternative way of handling the
kernel-wants-to-access-kernelspace-via-user-accessors problem.
Such as, copying some data back to userspace memory?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH RESEND 1/4] uaccess: Add user_read_access_begin/end and user_write_access_begin/end

2020-04-03 Thread Russell King - ARM Linux admin
On Fri, Apr 03, 2020 at 01:58:31AM +0100, Al Viro wrote:
> On Thu, Apr 02, 2020 at 11:35:57AM -0700, Kees Cook wrote:
> 
> > Yup, I think it's a weakness of the ARM implementation and I'd like to
> > not extend it further. AFAIK we should never nest, but I would not be
> > surprised at all if we did.
> > 
> > If we were looking at a design goal for all architectures, I'd like
> > to be doing what the public PaX patchset did for their memory access
> > switching, which is to alarm if calling into "enable" found the access
> > already enabled, etc. Such a condition would show an unexpected nesting
> > (like we've seen with similar constructs with set_fs() not getting reset
> > during an exception handler, etc etc).
> 
> FWIW, maybe I'm misreading the ARM uaccess logics, but... it smells like
> KERNEL_DS is somewhat more dangerous there than on e.g. x86.
> 
> Look: with CONFIG_CPU_DOMAINS, set_fs(KERNEL_DS) tells MMU to ignore
> per-page permission bits in DOMAIN_KERNEL (i.e. for kernel address
> ranges), allowing them even if they would normally be denied.  We need
> that for actual uaccess loads/stores, since those use insns that pretend
> to be done in user mode and we want them to access the kernel pages.
> But that affects the normal loads/stores as well; unless I'm misreading
> that code, it will ignore (supervisor) r/o on a page.  And that's not
> just for the code inside the uaccess blocks; *everything* done under
> KERNEL_DS is subject to that.
> 
> Why do we do that (modify_domain(), that is) inside set_fs() and not
> in uaccess_enable() et.al.?

First, CONFIG_CPU_DOMAINS is used on older ARMs, not ARMv7. Second,
the kernel image itself is not RO-protected on any ARM32 platform.

If we get rid of CONFIG_CPU_DOMAINS, we will use the ARMv7 method of
user access, which is to use normal load/stores for the user accessors
and every access must check against the address limit, even the
__-accessors.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH v2 00/13] mm: remove __ARCH_HAS_5LEVEL_HACK

2020-02-16 Thread Russell King - ARM Linux admin
On Sun, Feb 16, 2020 at 10:18:30AM +0200, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> Hi,
> 
> These patches convert several architectures to use page table folding and
> remove __ARCH_HAS_5LEVEL_HACK along with include/asm-generic/5level-fixup.h.
> 
> The changes are mostly about mechanical replacement of pgd accessors with p4d
> ones and the addition of higher levels to page table traversals.
> 
> All the patches were sent separately to the respective arch lists and
> maintainers hence the "v2" prefix.

You fail to explain why this change which adds 488 additional lines of
code is desirable.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH V13] mm/debug: Add tests validating architecture page table helpers

2020-02-11 Thread Russell King - ARM Linux admin
On Tue, Feb 11, 2020 at 06:33:47AM +0100, Christophe Leroy wrote:
> 
> 
> Le 11/02/2020 à 03:25, Anshuman Khandual a écrit :
> > 
> > 
> > On 02/10/2020 04:36 PM, Russell King - ARM Linux admin wrote:
> > > There are good reasons for the way ARM does stuff.  The generic crap was
> > > written without regard for the circumstances that ARM has, and thus is
> > > entirely unsuitable for 32-bit ARM.
> > 
> > Since we dont have an agreement here, lets just settle with disabling the
> > test for now on platforms where the build fails. CONFIG_EXPERT is enabling
> > this test for better adaptability and coverage, hence how about re framing
> > the config like this ? This at the least conveys the fact that EXPERT only
> > works when platform is neither IA64 or ARM.
> 
> Agreed
> 
> > 
> > config DEBUG_VM_PGTABLE
> > bool "Debug arch page table for semantics compliance"
> > depends on MMU
> > depends on ARCH_HAS_DEBUG_VM_PGTABLE || (EXPERT &&  !(IA64 || ARM))
> 
> I think it's maybe better to have a dedicated depends line:
> 
> depends on !IA64 && !ARM
> depends on ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT
> 
> The day arm and/or ia64 is ready for building the test, we can remove that
> depends.

Never going to happen as its technically infeasible, sorry.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH V13] mm/debug: Add tests validating architecture page table helpers

2020-02-10 Thread Russell King - ARM Linux admin
On Mon, Feb 10, 2020 at 11:46:23AM +0100, Christophe Leroy wrote:
> 
> 
> Le 10/02/2020 à 11:02, Russell King - ARM Linux admin a écrit :
> > On Mon, Feb 10, 2020 at 07:38:38AM +0100, Christophe Leroy wrote:
> > > 
> > > 
> > > Le 10/02/2020 à 06:35, Anshuman Khandual a écrit :
> > > > 
> > > > 
> > > > On 02/10/2020 10:22 AM, Andrew Morton wrote:
> > > > > On Thu, 6 Feb 2020 13:49:35 +0530 Anshuman Khandual 
> > > > >  wrote:
> > > > > 
> > > > > > 
> > > > > > On 02/06/2020 04:40 AM, kbuild test robot wrote:
> > > > > > > Hi Anshuman,
> > > > > > > 
> > > > > > > Thank you for the patch! Yet something to improve:
> > > > > > > 
> > > > > > > [auto build test ERROR on powerpc/next]
> > > > > > > [also build test ERROR on s390/features linus/master arc/for-next 
> > > > > > > v5.5]
> > > > > > > [cannot apply to mmotm/master tip/x86/core arm64/for-next/core 
> > > > > > > next-20200205]
> > > > > > > [if your patch is applied to the wrong git tree, please drop us a 
> > > > > > > note to help
> > > > > > > improve the system. BTW, we also suggest to use '--base' option 
> > > > > > > to specify the
> > > > > > > base tree in git format-patch, please see 
> > > > > > > https://stackoverflow.com/a/37406982]
> > > > > > > 
> > > > > > > url:
> > > > > > > https://github.com/0day-ci/linux/commits/Anshuman-Khandual/mm-debug-Add-tests-validating-architecture-page-table-helpers/20200205-215507
> > > > > > > base:   
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
> > > > > > > next
> > > > > > > config: ia64-allmodconfig (attached as .config)
> > > > > > > compiler: ia64-linux-gcc (GCC) 7.5.0
> > > > > > > reproduce:
> > > > > > >   wget 
> > > > > > > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross
> > > > > > >  -O ~/bin/make.cross
> > > > > > >   chmod +x ~/bin/make.cross
> > > > > > >   # save the attached .config to linux build tree
> > > > > > >   GCC_VERSION=7.5.0 make.cross ARCH=ia64
> > > > > > > 
> > > > > > > If you fix the issue, kindly add following tag
> > > > > > > Reported-by: kbuild test robot 
> > > > > > > 
> > > > > > > All error/warnings (new ones prefixed by >>):
> > > > > > > 
> > > > > > >  In file included from 
> > > > > > > include/asm-generic/pgtable-nopud.h:8:0,
> > > > > > >   from arch/ia64/include/asm/pgtable.h:586,
> > > > > > >   from include/linux/mm.h:99,
> > > > > > >   from include/linux/highmem.h:8,
> > > > > > >   from mm/debug_vm_pgtable.c:14:
> > > > > > >  mm/debug_vm_pgtable.c: In function 'pud_clear_tests':
> > > > > > > > > include/asm-generic/pgtable-nop4d-hack.h:47:32: error: 
> > > > > > > > > implicit declaration of function '__pgd'; did you mean 
> > > > > > > > > '__p4d'? [-Werror=implicit-function-declaration]
> > > > > > >   #define __pud(x)((pud_t) { __pgd(x) })
> > > > > > >  ^
> > > > > > > > > mm/debug_vm_pgtable.c:141:8: note: in expansion of macro 
> > > > > > > > > '__pud'
> > > > > > >pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
> > > > > > >  ^
> > > > > > > > > include/asm-generic/pgtable-nop4d-hack.h:47:22: warning: 
> > > > > > > > > missing braces around initializer [-Wmissing-braces]
> > > > > > >   #define __pud(x)((pud_t) { __pgd(x) })
> > > > > > >^
> > > > > > > > > mm/debug_vm_pgtable.c:141:8: note: in expansion of macro 
> > > > > > > > > '__pud'
>

Re: [PATCH V13] mm/debug: Add tests validating architecture page table helpers

2020-02-10 Thread Russell King - ARM Linux admin
On Mon, Feb 10, 2020 at 07:38:38AM +0100, Christophe Leroy wrote:
> 
> 
> Le 10/02/2020 à 06:35, Anshuman Khandual a écrit :
> > 
> > 
> > On 02/10/2020 10:22 AM, Andrew Morton wrote:
> > > On Thu, 6 Feb 2020 13:49:35 +0530 Anshuman Khandual 
> > >  wrote:
> > > 
> > > > 
> > > > On 02/06/2020 04:40 AM, kbuild test robot wrote:
> > > > > Hi Anshuman,
> > > > > 
> > > > > Thank you for the patch! Yet something to improve:
> > > > > 
> > > > > [auto build test ERROR on powerpc/next]
> > > > > [also build test ERROR on s390/features linus/master arc/for-next 
> > > > > v5.5]
> > > > > [cannot apply to mmotm/master tip/x86/core arm64/for-next/core 
> > > > > next-20200205]
> > > > > [if your patch is applied to the wrong git tree, please drop us a 
> > > > > note to help
> > > > > improve the system. BTW, we also suggest to use '--base' option to 
> > > > > specify the
> > > > > base tree in git format-patch, please see 
> > > > > https://stackoverflow.com/a/37406982]
> > > > > 
> > > > > url:
> > > > > https://github.com/0day-ci/linux/commits/Anshuman-Khandual/mm-debug-Add-tests-validating-architecture-page-table-helpers/20200205-215507
> > > > > base:   
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
> > > > > config: ia64-allmodconfig (attached as .config)
> > > > > compiler: ia64-linux-gcc (GCC) 7.5.0
> > > > > reproduce:
> > > > >  wget 
> > > > > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross
> > > > >  -O ~/bin/make.cross
> > > > >  chmod +x ~/bin/make.cross
> > > > >  # save the attached .config to linux build tree
> > > > >  GCC_VERSION=7.5.0 make.cross ARCH=ia64
> > > > > 
> > > > > If you fix the issue, kindly add following tag
> > > > > Reported-by: kbuild test robot 
> > > > > 
> > > > > All error/warnings (new ones prefixed by >>):
> > > > > 
> > > > > In file included from include/asm-generic/pgtable-nopud.h:8:0,
> > > > >  from arch/ia64/include/asm/pgtable.h:586,
> > > > >  from include/linux/mm.h:99,
> > > > >  from include/linux/highmem.h:8,
> > > > >  from mm/debug_vm_pgtable.c:14:
> > > > > mm/debug_vm_pgtable.c: In function 'pud_clear_tests':
> > > > > > > include/asm-generic/pgtable-nop4d-hack.h:47:32: error: implicit 
> > > > > > > declaration of function '__pgd'; did you mean '__p4d'? 
> > > > > > > [-Werror=implicit-function-declaration]
> > > > >  #define __pud(x)((pud_t) { __pgd(x) })
> > > > > ^
> > > > > > > mm/debug_vm_pgtable.c:141:8: note: in expansion of macro '__pud'
> > > > >   pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
> > > > > ^
> > > > > > > include/asm-generic/pgtable-nop4d-hack.h:47:22: warning: missing 
> > > > > > > braces around initializer [-Wmissing-braces]
> > > > >  #define __pud(x)((pud_t) { __pgd(x) })
> > > > >   ^
> > > > > > > mm/debug_vm_pgtable.c:141:8: note: in expansion of macro '__pud'
> > > > >   pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
> > > > > ^
> > > > > cc1: some warnings being treated as errors
> > > > 
> > > > This build failure is expected now given that we have allowed 
> > > > DEBUG_VM_PGTABLE
> > > > with EXPERT without platform requiring ARCH_HAS_DEBUG_VM_PGTABLE. This 
> > > > problem
> > > > i.e build failure caused without a platform __pgd(), is known to exist 
> > > > both on
> > > > ia64 and arm (32bit) platforms. Please refer 
> > > > https://lkml.org/lkml/2019/9/24/314
> > > > for details where this was discussed earlier.
> > > > 
> > > 
> > > I'd prefer not to merge a patch which is known to cause build
> > > regressions.  Is there some temporary thing we can do to prevent these
> > > errors until arch maintainers(?) get around to implementing the
> > > long-term fixes?
> > 
> > We could explicitly disable CONFIG_DEBUG_VM_PGTABLE on ia64 and arm 
> > platforms
> > which will ensure that others can still use the EXPERT path.
> > 
> > config DEBUG_VM_PGTABLE
> > bool "Debug arch page table for semantics compliance"
> > depends on MMU
> > depends on !(IA64 || ARM)
> > depends on ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT
> > default n if !ARCH_HAS_DEBUG_VM_PGTABLE
> > default y if DEBUG_VM
> > 
> 
> On both ia32 and arm, the fix is trivial.
> 
> Can we include the fix within this patch, just the same way as the
> mm_p4d_folded() fix for x86 ?

Why should arm include a macro for something that nothing (apart from
this checker) requires?  If the checker requires it but the rest of
the kernel does not, it suggests that the checker isn't actually
correct, and the results can't be relied upon.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 00/50] Add log level to show_stack()

2019-11-08 Thread Russell King - ARM Linux admin
On Fri, Nov 08, 2019 at 04:28:30PM +, Dmitry Safonov wrote:
> On 11/6/19 8:34 PM, Peter Zijlstra wrote:
> > On Wed, Nov 06, 2019 at 04:27:33PM +, Dmitry Safonov wrote:
> [..]
> >> Sorry, I should have tried to describe better.
> >>
> >> I'm trying to remove external users of console_loglevel by following
> >> reasons:
> >
> > I suppose since all my machines have 'debug ignore_loglevel
> > earlyprintk=serial,ttyS0,115200 console=ttyS0,115200' I don't have this
> > experience.
> 
> Yeah, I remember you avoid all those functionalities of printk(), fair
> enough. On the other side, regular users and I'm betting most of
> the non-tuned distributions use /proc/sys/kernel/printk by default.
> (Checking on my Arch & Fedora - loglevel 4 from the box)
> 
> >> - changing console_loglevel on SMP means that unwanted messages from
> >> other CPUs will appear (that have lower log level)
> >> - on UMP unwanted messages may appear if the code is preempted while it
> >> hasn't set the console_loglevel back to old
> >> - rising console_loglevel to print wanted message(s) may not work at all
> >> if printk() has being delayed and the console_loglevel is already set
> >> back to old value
> >
> > Sure, frobbing the global console_loglevel is bad.
> >
> >> I also have patches in wip those needs to print backtrace with specific
> >> loglevel (higher when it's critical, lower when it's notice and
> >> shouldn't go to serial console).
> >
> > (everything always should go to serial, serial is awesome :-)
> 
> Personally I agree. Unfortunately, here @Arista there are switches (I'm
> speaking about the order of thousands at least) those have baud-rate 9600.
> It's a bit expensive being elaborate with such setup.
> 
> >> Besides on local tests I see hits those have headers (messages like
> >> "Backtrace: ") without an actual backtrace and the reverse - a backtrace
> >> without a reason for it. It's quite annoying and worth addressing by
> >> syncing headers log levels to backtraces.
> >
> > I suppose I'm surprised there are backtraces that are not important.
> > Either badness happened and it needs printing, or the user asked for it
> > and it needs printing.
> >
> > Perhaps we should be removing backtraces if they're not important
> > instead of allowing to print them as lower loglevels?
> 
> Well, the use-case for lower log-level is that everything goes into logs
> (/var/log/dmesg or /var/log/messages whatever rsyslog has settting).
> 
> That has it's value:
> - after a failure (i.e. panic) messages, those were only signs that
> something goes wrong can be seen in logs which can give ideas what has
> happened.

No they don't.  When the kernel panics, userspace generally stops
running, so rsyslog won't be able to write them to /var/log/messages.

How, by "kernel panics" I mean a real kernel panic, which probably
isn't what you're talking about there.  You are probably talking
about the whole shebang of non-fatal kernel oops, kernel warnings
and the like.  If so, I'd ask you to stop confuzzilating terminology.

If you really want to capture such events, then you need to have the
kernel write the panic to (e.g.) flash - see the mtdoops driver.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 00/50] Add log level to show_stack()

2019-11-06 Thread Russell King - ARM Linux admin
On Wed, Nov 06, 2019 at 09:34:40PM +0100, Peter Zijlstra wrote:
> I suppose I'm surprised there are backtraces that are not important.
> Either badness happened and it needs printing, or the user asked for it
> and it needs printing.

Or utterly meaningless.

> Perhaps we should be removing backtraces if they're not important
> instead of allowing to print them as lower loglevels?

Definitely!  WARN_ON() is well overused - and as is typical, used
without much thought.  Bound to happen after Linus got shirty about
BUG_ON() being over used.  Everyone just grabbed the next nearest thing
to assert().

As a kind of example, I've recently come across one WARN_ON() in a
driver subsystem (that shall remain nameless at the moment) which very
likely has multiple different devices on a platform.  The WARN_ON()
triggers as a result of a problem with the hardware, but because it's a
WARN_ON(), you've no idea which device has a problem.  The backtrace is
mostly meaningless.  So you know that a problem has occurred, but the
kernel prints *useless* backtrace to let you know, and totally omits
the *useful* information.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: Onboard SD card doesn't work anymore after the 'mmc-v5.4-2' updates

2019-10-23 Thread Russell King - ARM Linux admin
On Wed, Oct 23, 2019 at 08:52:33AM -0500, Rob Herring wrote:
> > I think this should have been done the other way around and default to
> > coherent since most traditional OF platforms are coherent, and you
> > can't just require those DTs to change.
> 
> You can blame me. This was really only intended for cases where
> coherency is configurable on a per peripheral basis and can't be
> determined in other ways.
> 
> The simple solution here is simply to use the compatible string for
> the device to determine coherent or not.

It really isn't that simple.

There are two aspects to coherency, both of which must match:

1) The configuration of the device
2) The configuration of the kernel's DMA API

(1) is controlled by the driver, which can make the decision any way
it pleases.

(2) on ARM64 is controlled depending on whether or not "dma-coherent"
is specified in the device tree, since ARM64 can have a mixture of
DMA coherent and non-coherent devices.

A mismatch between (1) and (2) results in data corruption, potentially
eating your filesystem.  So, it's very important that the two match.

These didn't match for the LX2160A, but, due to the way CMA was working,
we sort of got away with it, but it was very dangerous as far as data
safety went.

Then, a change to CMA happened which moved where it was located, which
caused a regression.  Reverting the CMA changes didn't seem to be an
option, so another solution had to be found.

I started a discussion on how best to solve this:

https://archive.armlinux.org.uk/lurker/thread/20190919.041320.1e53541f.en.html

and the solution that the discussion came out with was the one that has
been merged - which we now know caused a regression on PPC.

Using compatible strings doesn't solve the issue: there is no way to
tell the DMA API from the driver that the device is coherent.  The
only way to do that is via the "dma-coherent" property in DT on ARM64.

To say that this is a mess is an under-statement, but we seem to have
ended up here because of a series of piece-meal changes that don't seem
to have been thought through enough.

So, what's the right way to solve this, and ensure that the DMA API and
device match as far as their coherency expectations go?  Revert all the
changes for sdhci-of-esdhc and CMA back to 5.0 or 5.1 state?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: Onboard SD card doesn't work anymore after the 'mmc-v5.4-2' updates

2019-10-23 Thread Russell King - ARM Linux admin
On Wed, Oct 23, 2019 at 08:52:33AM -0500, Rob Herring wrote:
> On Wed, Oct 23, 2019 at 1:41 AM Benjamin Herrenschmidt
>  wrote:
> >
> > On Wed, 2019-10-23 at 16:42 +1100, Michael Ellerman wrote:
> > >
> > > Right, it seems of_dma_is_coherent() has baked in the assumption that
> > > devices are non-coherent unless explicitly marked as coherent.
> > >
> > > Which is wrong on all or at least most existing powerpc systems
> > > according to Ben.
> >
> > This is probably broken on sparc(64) as well and whatever else uses
> > DT and is an intrinsicly coherent architecture (did we ever have
> > DT enabled x86s ? Wasn't OLPC such a beast ?).
> 
> Only if those platforms use one of the 5 drivers that call this function:
> 
> drivers/ata/ahci_qoriq.c:   qoriq_priv->is_dmacoherent =
> of_dma_is_coherent(np);
> drivers/crypto/ccree/cc_driver.c:   new_drvdata->coherent =
> of_dma_is_coherent(np);
> drivers/iommu/arm-smmu-v3.c:if (of_dma_is_coherent(dev->of_node))
> drivers/iommu/arm-smmu.c:   if (of_dma_is_coherent(dev->of_node))
> drivers/mmc/host/sdhci-of-esdhc.c:  if (of_dma_is_coherent(dev->of_node))
> 
> Curious that a PPC specific driver (ahci_qoriq) calls it...

Maybe because it is not PPC specific:

arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi:
compatible = "fsl,lx2160a-ahci";
arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi:
compatible = "fsl,lx2160a-ahci";
arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi:
compatible = "fsl,lx2160a-ahci";
arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi:
compatible = "fsl,lx2160a-ahci";

drivers/ata/ahci_qoriq.c:   { .compatible = "fsl,lx2160a-ahci", .data = 
(void *)AHCI_LX2160A},

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v2 3/3] arm: Add support for function error injection

2019-08-29 Thread Russell King - ARM Linux admin
I'm sorry, I can't apply this, it produces loads of:

include/linux/error-injection.h:7:10: fatal error: asm/error-injection.h: No 
such file or directory

Since your patch 1 has been merged by the ARM64 people, I can't take
it until next cycle.

On Mon, Aug 19, 2019 at 05:18:08PM +0800, Leo Yan wrote:
> Hi Russell,
> 
> On Tue, Aug 06, 2019 at 06:00:15PM +0800, Leo Yan wrote:
> > This patch implements arm specific functions regs_set_return_value() and
> > override_function_with_return() to support function error injection.
> > 
> > In the exception flow, it updates pt_regs::ARM_pc with pt_regs::ARM_lr
> > so can override the probed function return.
> 
> Gentle ping ...  Could you review this patch?
> 
> Thanks,
> Leo.
> 
> > Signed-off-by: Leo Yan 
> > ---
> >  arch/arm/Kconfig  |  1 +
> >  arch/arm/include/asm/ptrace.h |  5 +
> >  arch/arm/lib/Makefile |  2 ++
> >  arch/arm/lib/error-inject.c   | 19 +++
> >  4 files changed, 27 insertions(+)
> >  create mode 100644 arch/arm/lib/error-inject.c
> > 
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index 33b00579beff..2d3d44a037f6 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -77,6 +77,7 @@ config ARM
> > select HAVE_EXIT_THREAD
> > select HAVE_FAST_GUP if ARM_LPAE
> > select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
> > +   select HAVE_FUNCTION_ERROR_INJECTION if !THUMB2_KERNEL
> > select HAVE_FUNCTION_GRAPH_TRACER if !THUMB2_KERNEL && !CC_IS_CLANG
> > select HAVE_FUNCTION_TRACER if !XIP_KERNEL
> > select HAVE_GCC_PLUGINS
> > diff --git a/arch/arm/include/asm/ptrace.h b/arch/arm/include/asm/ptrace.h
> > index 91d6b7856be4..3b41f37b361a 100644
> > --- a/arch/arm/include/asm/ptrace.h
> > +++ b/arch/arm/include/asm/ptrace.h
> > @@ -89,6 +89,11 @@ static inline long regs_return_value(struct pt_regs 
> > *regs)
> > return regs->ARM_r0;
> >  }
> >  
> > +static inline void regs_set_return_value(struct pt_regs *regs, unsigned 
> > long rc)
> > +{
> > +   regs->ARM_r0 = rc;
> > +}
> > +
> >  #define instruction_pointer(regs)  (regs)->ARM_pc
> >  
> >  #ifdef CONFIG_THUMB2_KERNEL
> > diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> > index b25c54585048..8f56484a7156 100644
> > --- a/arch/arm/lib/Makefile
> > +++ b/arch/arm/lib/Makefile
> > @@ -42,3 +42,5 @@ ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> >CFLAGS_xor-neon.o+= $(NEON_FLAGS)
> >obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
> >  endif
> > +
> > +obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
> > diff --git a/arch/arm/lib/error-inject.c b/arch/arm/lib/error-inject.c
> > new file mode 100644
> > index ..2d696dc94893
> > --- /dev/null
> > +++ b/arch/arm/lib/error-inject.c
> > @@ -0,0 +1,19 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include 
> > +#include 
> > +
> > +void override_function_with_return(struct pt_regs *regs)
> > +{
> > +   /*
> > +* 'regs' represents the state on entry of a predefined function in
> > +* the kernel/module and which is captured on a kprobe.
> > +*
> > +* 'regs->ARM_lr' contains the the link register for the probed
> > +* function, when kprobe returns back from exception it will override
> > +* the end of probed function and directly return to the predefined
> > +* function's caller.
> > +*/
> > +   instruction_pointer_set(regs, regs->ARM_lr);
> > +}
> > +NOKPROBE_SYMBOL(override_function_with_return);
> > -- 
> > 2.17.1
> > 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH] dma-mapping: fix page attributes for dma_mmap_*

2019-08-06 Thread Russell King - ARM Linux admin
On Tue, Aug 06, 2019 at 05:08:54PM +0100, Will Deacon wrote:
> On Sat, Aug 03, 2019 at 08:48:12AM +0200, Christoph Hellwig wrote:
> > On Fri, Aug 02, 2019 at 11:38:03AM +0100, Will Deacon wrote:
> > > 
> > > So this boils down to a terminology mismatch. The Arm architecture 
> > > doesn't have
> > > anything called "write combine", so in Linux we instead provide what the 
> > > Arm
> > > architecture calls "Normal non-cacheable" memory for 
> > > pgprot_writecombine().
> > > Amongst other things, this memory type permits speculation, unaligned 
> > > accesses
> > > and merging of writes. I found something in the architecture spec about
> > > non-cachable memory, but it's written in Armglish[1].
> > > 
> > > pgprot_noncached(), on the other hand, provides what the architecture 
> > > calls
> > > Strongly Ordered or Device-nGnRnE memory. This is intended for mapping 
> > > MMIO
> > > (i.e. PCI config space) and therefore forbids speculation, preserves 
> > > access
> > > size, requires strict alignment and also forces write responses to come 
> > > from
> > > the endpoint.
> > > 
> > > I think the naming mismatch is historical, but on arm64 we wanted to use 
> > > the
> > > same names as arm32 so that any drivers using these things directly would 
> > > get
> > > the same behaviour.
> > 
> > That all makes sense, but it totally needs a comment.  I'll try to draft
> > one based on this.  I've also looked at the arm32 code a bit more, and
> > it seems arm always (?) supported Normal non-cacheable attribute, but
> > Linux only optionally uses it for arm v6+ because of fears of drivers
> > missing barriers.
> 
> I think it was also to do with aliasing, but I don't recall all of the
> details.

ARMv6+ is where the architecture significantly changed to introduce
the idea of [Normal, Device, Strongly Ordered] where Normal has the
cache attributes.

Before that, we had just "uncached/unbuffered, uncached/buffered,
cached/unbuffered, cached/buffered" modes.

The write buffer (enabled by buffered modes) has no architected
guarantees about how long writes will sit in it, and there is only
the "drain write buffer" instruction to push writes out.

Up to and including ARMv5, we took the easy approach of just using
the "uncached/unbuffered" mode since that is (a) the safest, and (b)
avoids write buffers that alias when there are multiple different
mappings.

We could have used a different approach, making all IO writes contain
a "drain write buffer" instruction, and map DMA memory as "buffered",
but as there were no Linux barriers defined to order memory accesses
to DMA memory (so, for example, ring buffers can be updated in the
correct order) back in those days, using the uncached/unbuffered mode
was the sanest and most reliable solution.

> 
> > The other really weird things is that in arm32
> > pgprot_dmacoherent incudes the L_PTE_XN bit, which from my understanding
> > is the no-execture bit, but pgprot_writecombine does not.  This seems to
> > not very unintentional.  So minus that the whole DMA_ATTR_WRITE_COMBІNE
> > seems to be about flagging old arm specific drivers as having the proper
> > barriers in places and otherwise is a no-op.
> 
> I think it only matters for Armv7 CPUs, but yes, we should probably be
> setting L_PTE_XN for both of these memory types.

Conventionally, pgprot_writecombine() has only been used to change
the memory type and not the permissions.  Since writecombine memory
is still capable of being executed, I don't see any reason to set XN
for it.

If the user wishes to mmap() using PROT_READ|PROT_EXEC, then is there
really a reason for writecombine to set XN overriding the user?

That said, pgprot_writecombine() is mostly used for framebuffers, which
arguably shouldn't be executable anyway - but who'd want to mmap() the
framebuffer with PROT_EXEC?

> 
> > Here is my tentative plan:
> > 
> >  - respin this patch with a small fix to handle the
> >DMA_ATTR_NON_CONSISTENT (as in ignore it unless actually supported),
> >but keep the name as-is to avoid churn.  This should allow 5.3
> >inclusion and backports
> >  - remove DMA_ATTR_WRITE_COMBINE support from mips, probably also 5.3
> >material.
> >  - move all architectures but arm over to just define
> >pgprot_dmacoherent, including a comment with the above explanation
> >for arm64.
> 
> That would be great, thanks.
> 
> >  - make DMA_ATTR_WRITE_COMBINE a no-op and schedule it for removal,
> >thus removing the last instances of arch_dma_mmap_pgprot
> 
> All sounds good to me, although I suppose 32-bit Arm platforms without
> CONFIG_ARM_DMA_MEM_BUFFERABLE may run into issues if DMA_ATTR_WRITE_COMBINE
> disappears. Only one way to find out...

Looking at the results of grep, I think only OMAP2+ and Exynos may be
affected.

However, removing writecombine support from the DMA API is going to
have a huge impact for framebuffers on earlier ARMs - that's where we
do expect framebuffers to be mapped "uncached/buffered" for 

Re: [PATCH] dma-mapping: fix page attributes for dma_mmap_*

2019-08-06 Thread Russell King - ARM Linux admin
On Tue, Aug 06, 2019 at 05:45:03PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Aug 06, 2019 at 05:08:54PM +0100, Will Deacon wrote:
> > On Sat, Aug 03, 2019 at 08:48:12AM +0200, Christoph Hellwig wrote:
> > > On Fri, Aug 02, 2019 at 11:38:03AM +0100, Will Deacon wrote:
> > > > 
> > > > So this boils down to a terminology mismatch. The Arm architecture 
> > > > doesn't have
> > > > anything called "write combine", so in Linux we instead provide what 
> > > > the Arm
> > > > architecture calls "Normal non-cacheable" memory for 
> > > > pgprot_writecombine().
> > > > Amongst other things, this memory type permits speculation, unaligned 
> > > > accesses
> > > > and merging of writes. I found something in the architecture spec about
> > > > non-cachable memory, but it's written in Armglish[1].
> > > > 
> > > > pgprot_noncached(), on the other hand, provides what the architecture 
> > > > calls
> > > > Strongly Ordered or Device-nGnRnE memory. This is intended for mapping 
> > > > MMIO
> > > > (i.e. PCI config space) and therefore forbids speculation, preserves 
> > > > access
> > > > size, requires strict alignment and also forces write responses to come 
> > > > from
> > > > the endpoint.
> > > > 
> > > > I think the naming mismatch is historical, but on arm64 we wanted to 
> > > > use the
> > > > same names as arm32 so that any drivers using these things directly 
> > > > would get
> > > > the same behaviour.
> > > 
> > > That all makes sense, but it totally needs a comment.  I'll try to draft
> > > one based on this.  I've also looked at the arm32 code a bit more, and
> > > it seems arm always (?) supported Normal non-cacheable attribute, but
> > > Linux only optionally uses it for arm v6+ because of fears of drivers
> > > missing barriers.
> > 
> > I think it was also to do with aliasing, but I don't recall all of the
> > details.
> 
> ARMv6+ is where the architecture significantly changed to introduce
> the idea of [Normal, Device, Strongly Ordered] where Normal has the
> cache attributes.
> 
> Before that, we had just "uncached/unbuffered, uncached/buffered,
> cached/unbuffered, cached/buffered" modes.
> 
> The write buffer (enabled by buffered modes) has no architected
> guarantees about how long writes will sit in it, and there is only
> the "drain write buffer" instruction to push writes out.
> 
> Up to and including ARMv5, we took the easy approach of just using
> the "uncached/unbuffered" mode since that is (a) the safest, and (b)
> avoids write buffers that alias when there are multiple different
> mappings.
> 
> We could have used a different approach, making all IO writes contain
> a "drain write buffer" instruction, and map DMA memory as "buffered",
> but as there were no Linux barriers defined to order memory accesses
> to DMA memory (so, for example, ring buffers can be updated in the
> correct order) back in those days, using the uncached/unbuffered mode
> was the sanest and most reliable solution.
> 
> > 
> > > The other really weird things is that in arm32
> > > pgprot_dmacoherent incudes the L_PTE_XN bit, which from my understanding
> > > is the no-execture bit, but pgprot_writecombine does not.  This seems to
> > > not very unintentional.  So minus that the whole DMA_ATTR_WRITE_COMBІNE
> > > seems to be about flagging old arm specific drivers as having the proper
> > > barriers in places and otherwise is a no-op.
> > 
> > I think it only matters for Armv7 CPUs, but yes, we should probably be
> > setting L_PTE_XN for both of these memory types.
> 
> Conventionally, pgprot_writecombine() has only been used to change
> the memory type and not the permissions.  Since writecombine memory
> is still capable of being executed, I don't see any reason to set XN
> for it.
> 
> If the user wishes to mmap() using PROT_READ|PROT_EXEC, then is there
> really a reason for writecombine to set XN overriding the user?
> 
> That said, pgprot_writecombine() is mostly used for framebuffers, which
> arguably shouldn't be executable anyway - but who'd want to mmap() the
> framebuffer with PROT_EXEC?
> 
> > 
> > > Here is my tentative plan:
> > > 
> > >  - respin this patch with a small fix to handle the
> > >DMA_ATTR_NON_CONSISTENT (as in ignore it unless actually supported),
> > &g

Re: [PATCH v5 3/3] locking/rwsem: Optimize down_read_trylock()

2019-03-22 Thread Russell King - ARM Linux admin
On Fri, Mar 22, 2019 at 10:30:08AM -0400, Waiman Long wrote:
> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
> it generate slightly better code.
> 
> Before this patch, down_read_trylock:
> 
>0x <+0>: callq  0x5 
>0x0005 <+5>: jmp0x18 
>0x0007 <+7>: lea0x1(%rdx),%rcx
>0x000b <+11>:mov%rdx,%rax
>0x000e <+14>:lock cmpxchg %rcx,(%rdi)
>0x0013 <+19>:cmp%rax,%rdx
>0x0016 <+22>:je 0x23 
>0x0018 <+24>:mov(%rdi),%rdx
>0x001b <+27>:test   %rdx,%rdx
>0x001e <+30>:jns0x7 
>0x0020 <+32>:xor%eax,%eax
>0x0022 <+34>:retq
>0x0023 <+35>:mov%gs:0x0,%rax
>0x002c <+44>:or $0x3,%rax
>0x0030 <+48>:mov%rax,0x20(%rdi)
>0x0034 <+52>:mov$0x1,%eax
>0x0039 <+57>:retq
> 
> After patch, down_read_trylock:
> 
>0x <+0>:   callq  0x5 
>0x0005 <+5>:   xor%eax,%eax
>0x0007 <+7>:   lea0x1(%rax),%rdx
>0x000b <+11>:  lock cmpxchg %rdx,(%rdi)
>0x0010 <+16>:  jne0x29 
>0x0012 <+18>:  mov%gs:0x0,%rax
>0x001b <+27>:  or $0x3,%rax
>0x001f <+31>:  mov%rax,0x20(%rdi)
>0x0023 <+35>:  mov$0x1,%eax
>0x0028 <+40>:  retq
>0x0029 <+41>:  test   %rax,%rax
>0x002c <+44>:  jns0x7 
>0x002e <+46>:  xor%eax,%eax
>0x0030 <+48>:  retq
> 
> By using a rwsem microbenchmark, the down_read_trylock() rate (with a
> load of 10 to lengthen the lock critical section) on a x86-64 system
> before and after the patch were:
> 
>  Before PatchAfter Patch
># of Threads rlock   rlock
> -   -
> 1   14,496  14,716
> 28,644   8,453
>   46,799   6,983
>   85,664   7,190
> 
> On a ARM64 system, the performance results were:
> 
>  Before PatchAfter Patch
># of Threads rlock   rlock
> -   -
> 1   23,676  24,488
> 27,697   9,502
> 44,945   3,440
> 82,641   1,603
> 
> For the uncontended case (1 thread), the new down_read_trylock() is a
> little bit faster. For the contended cases, the new down_read_trylock()
> perform pretty well in x86-64, but performance degrades at high
> contention level on ARM64.

So, 70% for 4 threads, 61% for 4 threads - does this trend
continue tailing off as the number of threads (and cores)
increase?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v2 07/29] ARM: add kexec_file_load system call number

2019-01-25 Thread Russell King - ARM Linux admin
On Fri, Jan 25, 2019 at 03:43:59PM +, Catalin Marinas wrote:
> On Fri, Jan 18, 2019 at 05:18:13PM +0100, Arnd Bergmann wrote:
> > diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
> > index 86de9eb34296..20ed7e026723 100644
> > --- a/arch/arm/tools/syscall.tbl
> > +++ b/arch/arm/tools/syscall.tbl
> > @@ -415,3 +415,4 @@
> >  398common  rseqsys_rseq
> >  399common  io_pgetevents   sys_io_pgetevents
> >  400common  migrate_pages   sys_migrate_pages
> > +401common  kexec_file_load sys_kexec_file_load
> 
> I presume on arm32 this would still return -ENOSYS.

Yes, I already checked.  If CONFIG_KEXEC_FILE is not set, then
kernel/kexec_file.c is not built, and we fall back to the stub
in kernel/sys_ni.c.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v2 29/29] y2038: add 64-bit time_t syscalls to all 32-bit architectures

2019-01-19 Thread Russell King - ARM Linux admin
On Fri, Jan 18, 2019 at 11:53:25AM -0800, Andy Lutomirski wrote:
> On Fri, Jan 18, 2019 at 11:33 AM Arnd Bergmann  wrote:
> >
> > On Fri, Jan 18, 2019 at 7:50 PM Andy Lutomirski  wrote:
> > > On Fri, Jan 18, 2019 at 8:25 AM Arnd Bergmann  wrote:
> > > > - Once we get to 512, we clash with the x32 numbers (unless
> > > >   we remove x32 support first), and probably have to skip
> > > >   a few more. I also considered using the 512..547 space
> > > >   for 32-bit-only calls (which never clash with x32), but
> > > >   that also seems to add a bit of complexity.
> > >
> > > I have a patch that I'll send soon to make x32 use its own table.  As
> > > far as I'm concerned, 547 is *it*.  548 is just a normal number and is
> > > not special.  But let's please not reuse 512..547 for other purposes
> > > on x86 variants -- that way lies even more confusion, IMO.
> >
> > Fair enough, the space for those numbers is cheap enough here.
> > I take it you mean we also should not reuse that number space if
> > we were to decide to remove x32 soon, but you are not worried
> > about clashing with arch/alpha when everything else uses consistent
> > numbers?
> >
> 
> I think we have two issues if we reuse those numbers for new syscalls.
> First, I'd really like to see new syscalls be numbered consistently
> everywhere, or at least on all x86 variants, and we can't on x32
> because they mean something else.  Perhaps more importantly, due to
> what is arguably a rather severe bug, issuing a native x86_64 syscall
> (x32 bit clear) with nr in the range 512..547 does *not* return
> -ENOSYS on a kernel with x32 enabled.  Instead it does something that
> is somewhat arbitrary.  With my patch applied, it will return -ENOSYS,
> but old kernels will still exist, and this will break syscall probing.
> 
> Can we perhaps just start the consistent numbers above 547 or maybe
> block out 512..547 in the new regime?

I don't think you gain much with that kind of scheme - it won't take
very long before an architecture misses having a syscall added, and
then someone else adds their own.  Been there with ARM - I was keeping
the syscall table in the same order as x86 for new syscalls, but now
that others have been adding syscalls to the table since I converted
ARM to the tabular form, that's now gone out the window.

So, I think it's completely pointless to do what you're suggesting.
We'll just end up with a big hole in the middle of the syscall table
and then revert back to random numbering of syscalls thereafter again.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 15/15] arch: add pkey and rseq syscall numbers everywhere

2019-01-15 Thread Russell King - ARM Linux admin
On Thu, Jan 10, 2019 at 05:24:35PM +0100, Arnd Bergmann wrote:
> Most architectures define system call numbers for the rseq and pkey system
> calls, even when they don't support the features, and perhaps never will.
> 
> Only a few architectures are missing these, so just define them anyway
> for consistency. If we decide to add them later to one of these, the
> system call numbers won't get out of sync then.

I was lambasted for adding the pkey syscalls for 32-bit ARM in 2016,
which will probably never support it.  Why has the attitude towards
this kind of thing now apparently become acceptable?

> Signed-off-by: Arnd Bergmann 
> ---
>  arch/alpha/include/asm/unistd.h | 4 
>  arch/alpha/kernel/syscalls/syscall.tbl  | 4 
>  arch/ia64/kernel/syscalls/syscall.tbl   | 4 
>  arch/m68k/kernel/syscalls/syscall.tbl   | 4 
>  arch/parisc/include/asm/unistd.h| 3 ---
>  arch/parisc/kernel/syscalls/syscall.tbl | 4 
>  arch/s390/include/asm/unistd.h  | 3 ---
>  arch/s390/kernel/syscalls/syscall.tbl   | 3 +++
>  arch/sh/kernel/syscalls/syscall.tbl | 4 
>  arch/sparc/include/asm/unistd.h | 5 -
>  arch/sparc/kernel/syscalls/syscall.tbl  | 4 
>  arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
>  12 files changed, 28 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/alpha/include/asm/unistd.h b/arch/alpha/include/asm/unistd.h
> index 564ba87bdc38..31ad350b58a0 100644
> --- a/arch/alpha/include/asm/unistd.h
> +++ b/arch/alpha/include/asm/unistd.h
> @@ -29,9 +29,5 @@
>  #define __IGNORE_getppid
>  #define __IGNORE_getuid
>  
> -/* Alpha doesn't have protection keys. */
> -#define __IGNORE_pkey_mprotect
> -#define __IGNORE_pkey_alloc
> -#define __IGNORE_pkey_free
>  
>  #endif /* _ALPHA_UNISTD_H */
> diff --git a/arch/alpha/kernel/syscalls/syscall.tbl 
> b/arch/alpha/kernel/syscalls/syscall.tbl
> index b0e247287908..25b4a7e76943 100644
> --- a/arch/alpha/kernel/syscalls/syscall.tbl
> +++ b/arch/alpha/kernel/syscalls/syscall.tbl
> @@ -452,3 +452,7 @@
>  521  common  pwritev2sys_pwritev2
>  522  common  statx   sys_statx
>  523  common  io_pgetevents   sys_io_pgetevents
> +524  common  pkey_alloc  sys_pkey_alloc
> +525  common  pkey_free   sys_pkey_free
> +526  common  pkey_mprotect   sys_pkey_mprotect
> +527  common  rseqsys_rseq
> diff --git a/arch/ia64/kernel/syscalls/syscall.tbl 
> b/arch/ia64/kernel/syscalls/syscall.tbl
> index 2e93dbdcdb80..84e03de00177 100644
> --- a/arch/ia64/kernel/syscalls/syscall.tbl
> +++ b/arch/ia64/kernel/syscalls/syscall.tbl
> @@ -339,3 +339,7 @@
>  327  common  io_pgetevents   sys_io_pgetevents
>  328  common  perf_event_open sys_perf_event_open
>  329  common  seccomp sys_seccomp
> +330  common  pkey_alloc  sys_pkey_alloc
> +331  common  pkey_free   sys_pkey_free
> +332  common  pkey_mprotect   sys_pkey_mprotect
> +333  common  rseqsys_rseq
> diff --git a/arch/m68k/kernel/syscalls/syscall.tbl 
> b/arch/m68k/kernel/syscalls/syscall.tbl
> index 5354ba02eed2..ae88b85d068e 100644
> --- a/arch/m68k/kernel/syscalls/syscall.tbl
> +++ b/arch/m68k/kernel/syscalls/syscall.tbl
> @@ -388,6 +388,10 @@
>  378  common  pwritev2sys_pwritev2
>  379  common  statx   sys_statx
>  380  common  seccomp sys_seccomp
> +381  common  pkey_alloc  sys_pkey_alloc
> +382  common  pkey_free   sys_pkey_free
> +383  common  pkey_mprotect   sys_pkey_mprotect
> +384  common  rseqsys_rseq
>  # room for arch specific calls
>  393  common  semget  sys_semget
>  394  common  semctl  sys_semctl
> diff --git a/arch/parisc/include/asm/unistd.h 
> b/arch/parisc/include/asm/unistd.h
> index c2c2afb28941..9ec1026af877 100644
> --- a/arch/parisc/include/asm/unistd.h
> +++ b/arch/parisc/include/asm/unistd.h
> @@ -12,9 +12,6 @@
>  
>  #define __IGNORE_select  /* newselect */
>  #define __IGNORE_fadvise64   /* fadvise64_64 */
> -#define __IGNORE_pkey_mprotect
> -#define __IGNORE_pkey_alloc
> -#define __IGNORE_pkey_free
>  
>  #ifndef ASM_LINE_SEP
>  # define ASM_LINE_SEP ;
> diff --git a/arch/parisc/kernel/syscalls/syscall.tbl 
> b/arch/parisc/kernel/syscalls/syscall.tbl
> index 9bbd2f9f56c8..e07231de3597 100644
> --- a/arch/parisc/kernel/syscalls/syscall.tbl
> +++ b/arch/parisc/kernel/syscalls/syscall.tbl
> @@ -367,3 +367,7 @@
>  348  common  pwritev2sys_pwritev2
> compat_sys_pwritev2
>  349  common  statx   sys_statx
>  350  common  io_pgetevents   sys_io_pgetevents   
> 

Re: [PATCH] mm/zsmalloc.c: Fix zsmalloc 32-bit PAE support

2018-12-10 Thread Russell King - ARM Linux
On Mon, Dec 10, 2018 at 02:35:55PM +, Robin Murphy wrote:
> On 10/12/2018 14:21, Rafael David Tinoco wrote:
> >On 32-bit systems, zsmalloc uses HIGHMEM and, when PAE is enabled, the
> >physical frame number might be so big that zsmalloc obj encoding (to
> >location) will break, causing:
> >
> >BUG: KASAN: null-ptr-deref in zs_map_object+0xa4/0x2bc
> >Read of size 4 at addr  by task mkfs.ext4/623
> >CPU: 2 PID: 623 Comm: mkfs.ext4 Not tainted 
> >4.19.0-rc8-00017-g8239bc6d3307-dirty #15
> >Hardware name: Generic DT based system
> >[] (unwind_backtrace) from [] (show_stack+0x20/0x24)
> >[] (show_stack) from [] (dump_stack+0xbc/0xe8)
> >[] (dump_stack) from [] (kasan_report+0x248/0x390)
> >[] (kasan_report) from [] (__asan_load4+0x78/0xb4)
> >[] (__asan_load4) from [] (zs_map_object+0xa4/0x2bc)
> >[] (zs_map_object) from [] 
> >(zram_bvec_rw.constprop.2+0x324/0x8e4 [zram])
> >[] (zram_bvec_rw.constprop.2 [zram]) from [] 
> >(zram_make_request+0x234/0x46c [zram])
> >[] (zram_make_request [zram]) from [] 
> >(generic_make_request+0x304/0x63c)
> >[] (generic_make_request) from [] (submit_bio+0x4c/0x1c8)
> >[] (submit_bio) from [] 
> >(submit_bh_wbc.constprop.15+0x238/0x26c)
> >[] (submit_bh_wbc.constprop.15) from [] 
> >(__block_write_full_page+0x524/0x76c)
> >[] (__block_write_full_page) from [] 
> >(block_write_full_page+0x1bc/0x1d4)
> >[] (block_write_full_page) from [] 
> >(blkdev_writepage+0x24/0x28)
> >[] (blkdev_writepage) from [] (__writepage+0x44/0x78)
> >[] (__writepage) from [] (write_cache_pages+0x3b8/0x800)
> >[] (write_cache_pages) from [] 
> >(generic_writepages+0x74/0xa0)
> >[] (generic_writepages) from [] 
> >(blkdev_writepages+0x18/0x1c)
> >[] (blkdev_writepages) from [] (do_writepages+0x68/0x134)
> >[] (do_writepages) from [] 
> >(__filemap_fdatawrite_range+0xb0/0xf4)
> >[] (__filemap_fdatawrite_range) from [] 
> >(file_write_and_wait_range+0x64/0xd0)
> >[] (file_write_and_wait_range) from [] 
> >(blkdev_fsync+0x54/0x84)
> >[] (blkdev_fsync) from [] (vfs_fsync_range+0x70/0xd4)
> >[] (vfs_fsync_range) from [] (do_fsync+0x4c/0x80)
> >[] (do_fsync) from [] (sys_fsync+0x1c/0x20)
> >[] (sys_fsync) from [] (ret_fast_syscall+0x0/0x2c)
> >
> >when trying to decode (the pfn) and map the object.
> >
> >That happens because one architecture might not re-define
> >MAX_PHYSMEM_BITS, like in this ARM 32-bit w/ LPAE enabled example. For
> >32-bit systems, if not re-defined, MAX_POSSIBLE_PHYSMEM_BITS will
> >default to BITS_PER_LONG (32) in most cases, and, with PAE enabled,
> >_PFN_BITS might be wrong: which may cause obj variable to overflow if
> >frame number is in HIGHMEM and referencing a page above the 4GB
> >watermark.
> >
> >commit 6e00ec00b1a7 ("staging: zsmalloc: calculate MAX_PHYSMEM_BITS if
> >not defined") realized MAX_PHYSMEM_BITS depended on SPARSEMEM headers
> >and "fixed" it by calculating it using BITS_PER_LONG if SPARSEMEM wasn't
> >used, like in the example given above.
> >
> >Systems with potential for PAE exist for a long time and assuming
> >BITS_PER_LONG seems inadequate. Defining MAX_PHYSMEM_BITS looks better,
> >however it is NOT a constant anymore for x86.
> >
> >SO, instead, MAX_POSSIBLE_PHYSMEM_BITS should be defined by every
> >architecture using zsmalloc, together with a sanity check for
> >MAX_POSSIBLE_PHYSMEM_BITS being too big on 32-bit systems.
> >
> >Link: https://bugs.linaro.org/show_bug.cgi?id=3765#c17
> >Signed-off-by: Rafael David Tinoco 
> >---
> >  arch/arm/include/asm/pgtable-2level-types.h |  2 ++
> >  arch/arm/include/asm/pgtable-3level-types.h |  2 ++
> >  arch/arm64/include/asm/pgtable-types.h  |  2 ++
> >  arch/ia64/include/asm/page.h|  2 ++
> >  arch/mips/include/asm/page.h|  2 ++
> >  arch/powerpc/include/asm/mmu.h  |  2 ++
> >  arch/s390/include/asm/page.h|  2 ++
> >  arch/sh/include/asm/page.h  |  2 ++
> >  arch/sparc/include/asm/page_32.h|  2 ++
> >  arch/sparc/include/asm/page_64.h|  2 ++
> >  arch/x86/include/asm/pgtable-2level_types.h |  2 ++
> >  arch/x86/include/asm/pgtable-3level_types.h |  3 +-
> >  arch/x86/include/asm/pgtable_64_types.h |  4 +--
> >  mm/zsmalloc.c   | 35 +++--
> >  14 files changed, 45 insertions(+), 19 deletions(-)
> >
> >diff --git a/arch/arm/include/asm/pgtable-2level-types.h 
> >b/arch/arm/include/asm/pgtable-2level-types.h
> >index 66cb5b0e89c5..552dba411324 100644
> >--- a/arch/arm/include/asm/pgtable-2level-types.h
> >+++ b/arch/arm/include/asm/pgtable-2level-types.h
> >@@ -64,4 +64,6 @@ typedef pteval_t pgprot_t;
> >  #endif /* STRICT_MM_TYPECHECKS */
> >+#define MAX_POSSIBLE_PHYSMEM_BITS 32
> >+
> >  #endif /* _ASM_PGTABLE_2LEVEL_TYPES_H */
> >diff --git a/arch/arm/include/asm/pgtable-3level-types.h 
> >b/arch/arm/include/asm/pgtable-3level-types.h
> >index 921aa30259c4..664c39e6517c 100644
> >--- a/arch/arm/include/asm/pgtable-3level-types.h
> >+++ 

Re: [PATCH v3 0/7] include: add setbits32/clrbits32/clrsetbits32/setbits64/clrbits64/clrsetbits64

2018-11-15 Thread Russell King - ARM Linux
On Thu, Nov 15, 2018 at 10:30:34AM +0100, LABBE Corentin wrote:
> On Wed, Oct 24, 2018 at 09:57:00AM +0100, Russell King - ARM Linux wrote:
> > On Wed, Oct 24, 2018 at 07:35:46AM +, Corentin Labbe wrote:
> > > This patchset adds a new set of functions which are open-coded in lot of
> > > place.
> > > Basicly the pattern is always the same, "read, modify a bit, write"
> > > some driver and the powerpc arch already have thoses pattern them as 
> > > functions. (like ahci_sunxi.c or dwmac-meson8b)
> > 
> > The advantage of them being open-coded is that it's _obvious_ to the
> > reviewer that there is a read-modify-write going on which, in a multi-
> > threaded environment, may need some locking (so it should trigger a
> > review of the locking around that code.)
> > 
> > With it hidden inside a helper which has no locking itself, it becomes
> > much easier to pass over in review, which means that races are much
> > more likely to go unspotted - and that is bad news.
> > 
> 
> Hello
> 
> I understand your fear, but I think the benefit overhaul thoses.
> Furthermore, drivers which I have converted does not need such locking.
> 
> If you want I can rename the header to linux/setbits-non-atomic.h for making 
> obvious the lack of locking.

It'd probably be better in the function name - it then doesn't get
"lost" that it's non-atomic when it's included via other headers.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 6/9] PCI: consolidate PCI config entry in drivers/pci

2018-11-01 Thread Russell King - ARM Linux
On Fri, Oct 19, 2018 at 09:58:46PM +0900, Masahiro Yamada wrote:
> On Fri, Oct 19, 2018 at 9:23 PM Russell King - ARM Linux
>  wrote:
> 
> > > index a68b34183107..b185794549be 100644
> > > --- a/arch/arm/mach-pxa/Kconfig
> > > +++ b/arch/arm/mach-pxa/Kconfig
> > > @@ -125,7 +125,7 @@ config MACH_ARMCORE
> > >   bool "CompuLab CM-X255/CM-X270 modules"
> > >   select ARCH_HAS_DMA_SET_COHERENT_MASK if PCI
> > >   select IWMMXT
> > > - select MIGHT_HAVE_PCI
> > > + select HAVE_PCI
> > >   select NEED_MACH_IO_H if PCI
> > >   select PXA25x
> > >   select PXA27x
> >
> > This is wrong.  "MIGHT_HAVE_PCI" is _not_ the same as "HAVE_PCI" - we
> > have a bunch of platforms that mandatorily have PCI and these select
> > PCI directly.  "MIGHT_HAVE_PCI" controls the _visibility_ of the PCI
> > menu option, but does not prevent it being selected.  Your patch will
> > cause Kconfig to complain for those which mandatorily have PCI but
> > do not set HAVE_PCI.
> 
> 
> Good catch!
> But, adding a bunch of 'select HAVE_PCI' along with 'select PCI' is ugly.
> 
> Do you have any suggestion?
> 
> How about letting CONFIG_ARM to select HAVE_PCI ?

Well, the situation we have on ARM is rather optimal - when the
rest of the configuration supports PCI, PCI is made visible.  When
there's no hardware support for PCI, PCI is hidden.  When PCI is
mandatory, PCI is selected and mostly always hidden.

It seems that with this consolidation, we lose that - we end up
with PCI being visible for every ARM config, not only those where
PCI is "impossible" but also for those where PCI is forcefully
selected.

That said, offering PCI for platforms that do not have any possibility
of PCI hardware shouldn't cause any compile issues, and would increase
build coverage - but I'd say it wouldn't _usefully_ increase build
coverage.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 0/7] serial: Finish kgdb on qcom_geni; fix many lockdep splats w/ kgdb

2018-10-30 Thread Russell King - ARM Linux
On Tue, Oct 30, 2018 at 11:56:28AM +, Daniel Thompson wrote:
> On Mon, Oct 29, 2018 at 11:07:00AM -0700, Douglas Anderson wrote:
> > Looking back, this is pretty much two series squashed that could be
> > treated indepdently.  The first is a serial series and the second is a
> > kgdb series.
> 
> Indeed.
> 
> I couldn't work out the link between the first 5 patches and the last 2
> until I read this...
> 
> Is anything in the 01->05 patch set even related to kgdb? From the stack
> traces it looks to me like the lock dep warning would trigger for any
> sysrq. I think separating into two threads for v2 would be sensible.

I'm concerned about calling smp_call_function() from IRQ context with
IRQs disabled - that will block the ability of the _calling_ CPU to
process IPIs from other CPUs in the system.  If we have other CPUs
waiting on their IPIs to complete on _this_ CPU, we could end up
deadlocking while trying to grab the CSD lock.

This is the intention of warnings in smp_call_function*() - to catch
cases where deadlocks _can_ occur, but do not reliably show up.

The exceptions to the warning (disregarding oops_in_progress) are
chosen to allow IRQs-disabled calls when we're sure that the rest of
the system isn't going to be sending the calling CPU an IPI (eg,
because the CPU isn't marked online, and we only send IPIs to online
CPUs, or if we're still early in the kernel boot and hence have no
other CPUs running.)  The exception is oops_in_progress, which can
occur at any time - even with the current CPU already holding some
CSD locks (eg, oops while trying to send an IPI.)

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 0/8] add generic builtin command line

2018-10-29 Thread Russell King - ARM Linux
On Mon, Oct 29, 2018 at 10:29:15AM +, Will Deacon wrote:
> On Wed, Oct 24, 2018 at 10:07:32AM +0100, Russell King - ARM Linux wrote:
> > On Wed, Oct 24, 2018 at 11:57:44AM +0300, Maksym Kokhan wrote:
> > > Do you mean, that you haven't seen patch for ARM, which I sent on
> > > September 27 along with cover and patch 1? It is strange, because
> > > you was the one from recipients. If so, you can see this patch here:
> > > https://lore.kernel.org/patchwork/patch/992779/
> > 
> > It seems that I have received patch 5, _but_ it's not threaded with
> > the cover message and patch 1.  With 50k messages in my inbox, and 3k
> > messages since you sent the series, it's virtually impossible to find
> > it (I only found it by looking at my mail server logs from September
> > to find the subject, and then searching my mailbox for that subject.)
> > 
> > This is unnecessarily difficult.
> 
> This comes up surprisingly often, and I think part of the issue is that
> different maintainers have different preferences. I also prefer to receive
> the entire series and cover-letter, but I've seen people object to being
> CC'd on the whole series as well (how they manage to review things in
> isolation is another question...!)

This series has the odd situation where patch 1 is threaded to the
cover letter, but nothing else is - that makes it inconsistent.

Where I've seen people disagree with threading is when sending
follow-up series - whether that should be threaded to the previous
series or not - some people want it others hate it.

However, I haven't seen any disagreement is about having the patches
threaded to the cover.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v2 0/2] arm64: Cut rebuild time when changing CONFIG_BLK_DEV_INITRD

2018-10-25 Thread Russell King - ARM Linux
On Thu, Oct 25, 2018 at 10:38:34AM +0100, Mike Rapoport wrote:
> On Wed, Oct 24, 2018 at 02:55:17PM -0500, Rob Herring wrote:
> > On Wed, Oct 24, 2018 at 2:33 PM Florian Fainelli  
> > wrote:
> > >
> > > Hi all,
> > >
> > > While investigating why ARM64 required a ton of objects to be rebuilt
> > > when toggling CONFIG_DEV_BLK_INITRD, it became clear that this was
> > > because we define __early_init_dt_declare_initrd() differently and we do
> > > that in arch/arm64/include/asm/memory.h which gets included by a fair
> > > amount of other header files, and translation units as well.
> > 
> > I scratch my head sometimes as to why some config options rebuild so
> > much stuff. One down, ? to go. :)
> > 
> > > Changing the value of CONFIG_DEV_BLK_INITRD is a common thing with build
> > > systems that generate two kernels: one with the initramfs and one
> > > without. buildroot is one of these build systems, OpenWrt is also
> > > another one that does this.
> > >
> > > This patch series proposes adding an empty initrd.h to satisfy the need
> > > for drivers/of/fdt.c to unconditionally include that file, and moves the
> > > custom __early_init_dt_declare_initrd() definition away from
> > > asm/memory.h
> > >
> > > This cuts the number of objects rebuilds from 1920 down to 26, so a
> > > factor 73 approximately.
> > >
> > > Apologies for the long CC list, please let me know how you would go
> > > about merging that and if another approach would be preferable, e.g:
> > > introducing a CONFIG_ARCH_INITRD_BELOW_START_OK Kconfig option or
> > > something like that.
> > 
> > There may be a better way as of 4.20 because bootmem is now gone and
> > only memblock is used. This should unify what each arch needs to do
> > with initrd early. We need the physical address early for memblock
> > reserving. Then later on we need the virtual address to access the
> > initrd. Perhaps we should just change initrd_start and initrd_end to
> > physical addresses (or add 2 new variables would be less invasive and
> > allow for different translation than __va()). The sanity checks and
> > memblock reserve could also perhaps be moved to a common location.
> >
> > Alternatively, given arm64 is the only oddball, I'd be fine with an
> > "if (IS_ENABLED(CONFIG_ARM64))" condition in the default
> > __early_init_dt_declare_initrd as long as we have a path to removing
> > it like the above option.
> 
> I think arm64 does not have to redefine __early_init_dt_declare_initrd().
> Something like this might be just all we need (completely untested,
> probably it won't even compile):

The alternative solution would be to replace initrd_start/initrd_end
with physical address versions of these everywhere - that's what
we're passed from DT, it's what 32-bit ARM would prefer, and seemingly
what 64-bit ARM would also like as well.

Grepping for initrd_start in arch/*/mm shows that there's lots of
architectures that have virtual/physical conversions on these, and
a number that have obviously been derived from 32-bit ARM's approach
(with maintaining a phys_initrd_start variable to simplify things).

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 0/8] add generic builtin command line

2018-10-24 Thread Russell King - ARM Linux
On Wed, Oct 24, 2018 at 11:57:44AM +0300, Maksym Kokhan wrote:
> Do you mean, that you haven't seen patch for ARM, which I sent on
> September 27 along with cover and patch 1? It is strange, because
> you was the one from recipients. If so, you can see this patch here:
> https://lore.kernel.org/patchwork/patch/992779/

It seems that I have received patch 5, _but_ it's not threaded with
the cover message and patch 1.  With 50k messages in my inbox, and 3k
messages since you sent the series, it's virtually impossible to find
it (I only found it by looking at my mail server logs from September
to find the subject, and then searching my mailbox for that subject.)

This is unnecessarily difficult.

> On Tue, Oct 23, 2018 at 5:48 PM Russell King - ARM Linux
>  wrote:
> >
> > On Tue, Oct 23, 2018 at 05:43:18PM +0300, Maksym Kokhan wrote:
> > > We still have no response to patches for x86, arm, arm64 and powerpc.
> > > Is current generic command line implementation appropriate for these
> > > architectures?
> > > Is it possible to merge these patches in the current form (for x86,
> > > arm, arm64 and powerpc)?
> >
> > You may wish to consider your recipients - I seem to only have received
> > the cover and patch 1 (which doesn't include any ARM specific bits).
> > It may be that you're not getting responses because people haven't seen
> > your patches.
> >
> > Thanks.
> >
> > --
> > RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> > FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps 
> > up
> > According to speedtest.net: 11.9Mbps down 500kbps up

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v3 0/7] include: add setbits32/clrbits32/clrsetbits32/setbits64/clrbits64/clrsetbits64

2018-10-24 Thread Russell King - ARM Linux
On Wed, Oct 24, 2018 at 07:35:46AM +, Corentin Labbe wrote:
> This patchset adds a new set of functions which are open-coded in lot of
> place.
> Basicly the pattern is always the same, "read, modify a bit, write"
> some driver and the powerpc arch already have thoses pattern them as 
> functions. (like ahci_sunxi.c or dwmac-meson8b)

The advantage of them being open-coded is that it's _obvious_ to the
reviewer that there is a read-modify-write going on which, in a multi-
threaded environment, may need some locking (so it should trigger a
review of the locking around that code.)

With it hidden inside a helper which has no locking itself, it becomes
much easier to pass over in review, which means that races are much
more likely to go unspotted - and that is bad news.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 0/8] add generic builtin command line

2018-10-23 Thread Russell King - ARM Linux
On Tue, Oct 23, 2018 at 05:43:18PM +0300, Maksym Kokhan wrote:
> We still have no response to patches for x86, arm, arm64 and powerpc.
> Is current generic command line implementation appropriate for these
> architectures?
> Is it possible to merge these patches in the current form (for x86,
> arm, arm64 and powerpc)?

You may wish to consider your recipients - I seem to only have received
the cover and patch 1 (which doesn't include any ARM specific bits).
It may be that you're not getting responses because people haven't seen
your patches.

Thanks.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 6/9] PCI: consolidate PCI config entry in drivers/pci

2018-10-19 Thread Russell King - ARM Linux
On Fri, Oct 19, 2018 at 02:09:49PM +0200, Christoph Hellwig wrote:
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index e33735ce1c14..7495d0a0aa31 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -149,9 +149,6 @@ config ARM_DMA_IOMMU_ALIGNMENT
>  
>  endif
>  
> -config MIGHT_HAVE_PCI
> - bool
> -
>  config SYS_SUPPORTS_APM_EMULATION
>   bool
>  
> @@ -320,7 +317,7 @@ config ARCH_MULTIPLATFORM
>   select COMMON_CLK
>   select GENERIC_CLOCKEVENTS
>   select GENERIC_IRQ_MULTI_HANDLER
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   select PCI_DOMAINS if PCI
>   select SPARSE_IRQ
>   select USE_OF
> @@ -436,7 +433,7 @@ config ARCH_IXP4XX
>   select DMABOUNCE if PCI
>   select GENERIC_CLOCKEVENTS
>   select GPIOLIB
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   select NEED_MACH_IO_H
>   select USB_EHCI_BIG_ENDIAN_DESC
>   select USB_EHCI_BIG_ENDIAN_MMIO
> @@ -449,7 +446,7 @@ config ARCH_DOVE
>   select GENERIC_CLOCKEVENTS
>   select GENERIC_IRQ_MULTI_HANDLER
>   select GPIOLIB
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   select MVEBU_MBUS
>   select PINCTRL
>   select PINCTRL_DOVE
> @@ -1216,14 +1213,6 @@ config ISA_DMA
>  config ISA_DMA_API
>   bool
>  
> -config PCI
> - bool "PCI support" if MIGHT_HAVE_PCI
> - help
> -   Find out whether you have a PCI motherboard. PCI is the name of a
> -   bus system, i.e. the way the CPU talks to the other stuff inside
> -   your box. Other bus systems are ISA, EISA, MicroChannel (MCA) or
> -   VESA. If you have PCI, say Y, otherwise N.
> -
>  config PCI_DOMAINS
>   bool "Support for multiple PCI domains"
>   depends on PCI
> @@ -1252,8 +1241,6 @@ config PCI_HOST_ITE8152
>   default y
>   select DMABOUNCE
>  
> -source "drivers/pci/Kconfig"
> -
>  source "drivers/pcmcia/Kconfig"
>  
>  endmenu
> diff --git a/arch/arm/mach-ks8695/Kconfig b/arch/arm/mach-ks8695/Kconfig
> index a545976bdbd6..b3185c05fffa 100644
> --- a/arch/arm/mach-ks8695/Kconfig
> +++ b/arch/arm/mach-ks8695/Kconfig
> @@ -4,7 +4,7 @@ menu "Kendin/Micrel KS8695 Implementations"
>  
>  config MACH_KS8695
>   bool "KS8695 development board"
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   help
> Say 'Y' here if you want your kernel to run on the original
> Kendin-Micrel KS8695 development board.
> @@ -52,7 +52,7 @@ config MACH_CM4002
>  
>  config MACH_CM4008
>   bool "OpenGear CM4008"
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   help
> Say 'Y' here if you want your kernel to support the OpenGear
> CM4008 Console Server. See http://www.opengear.com for more
> @@ -60,7 +60,7 @@ config MACH_CM4008
>  
>  config MACH_CM41xx
>   bool "OpenGear CM41xx"
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   help
> Say 'Y' here if you want your kernel to support the OpenGear
> CM4016 or CM4048 Console Servers. See http://www.opengear.com for
> @@ -68,7 +68,7 @@ config MACH_CM41xx
>  
>  config MACH_IM4004
>   bool "OpenGear IM4004"
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   help
> Say 'Y' here if you want your kernel to support the OpenGear
> IM4004 Secure Access Server. See http://www.opengear.com for
> @@ -76,7 +76,7 @@ config MACH_IM4004
>  
>  config MACH_IM42xx
>   bool "OpenGear IM42xx"
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   help
> Say 'Y' here if you want your kernel to support the OpenGear
> IM4216 or IM4248 Console Servers. See http://www.opengear.com for
> diff --git a/arch/arm/mach-pxa/Kconfig b/arch/arm/mach-pxa/Kconfig
> index a68b34183107..b185794549be 100644
> --- a/arch/arm/mach-pxa/Kconfig
> +++ b/arch/arm/mach-pxa/Kconfig
> @@ -125,7 +125,7 @@ config MACH_ARMCORE
>   bool "CompuLab CM-X255/CM-X270 modules"
>   select ARCH_HAS_DMA_SET_COHERENT_MASK if PCI
>   select IWMMXT
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   select NEED_MACH_IO_H if PCI
>   select PXA25x
>   select PXA27x

This is wrong.  "MIGHT_HAVE_PCI" is _not_ the same as "HAVE_PCI" - we
have a bunch of platforms that mandatorily have PCI and these select
PCI directly.  "MIGHT_HAVE_PCI" controls the _visibility_ of the PCI
menu option, but does not prevent it being selected.  Your patch will
cause Kconfig to complain for those which mandatorily have PCI but
do not set HAVE_PCI.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH V4 2/15] KVM/MMU: Add tlb flush with range helper function

2018-10-14 Thread Russell King - ARM Linux
On Sun, Oct 14, 2018 at 09:21:23PM +0800, Tianyu Lan wrote:
> Sorry to confuse your. I get from CCers from get_maintainer.pl script.

Unfortunately you seem to have made a mistake.  My email address is
'li...@armlinux.org.uk' not 'li...@armlinux.org'.  There is no
'li...@armlinux.org' in MAINTAINERS, yet your emails appear to be
copied to this address.

Consequently, if it was your intention to copy me with the entire
series, that didn't happen.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH V4 2/15] KVM/MMU: Add tlb flush with range helper function

2018-10-14 Thread Russell King - ARM Linux
On Sun, Oct 14, 2018 at 10:16:56AM +0200, Thomas Gleixner wrote:
> On Sun, 14 Oct 2018, Liran Alon wrote:
> > > On 13 Oct 2018, at 17:53, lantianyu1...@gmail.com wrote:
> > > 
> > > From: Lan Tianyu 
> > > 
> > > This patch is to add wrapper functions for tlb_remote_flush_with_range
> > > callback.
> > > 
> > > Signed-off-by: Lan Tianyu 
> > > ---
> > > Change sicne V3:
> > >   Remove code of updating "tlbs_dirty"
> > > Change since V2:
> > >   Fix comment in the kvm_flush_remote_tlbs_with_range()
> > > ---
> > > arch/x86/kvm/mmu.c | 40 
> > > 1 file changed, 40 insertions(+)
> > > 
> > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > > index c73d9f650de7..ff656d85903a 100644
> > > --- a/arch/x86/kvm/mmu.c
> > > +++ b/arch/x86/kvm/mmu.c
> > > @@ -264,6 +264,46 @@ static void mmu_spte_set(u64 *sptep, u64 spte);
> > > static union kvm_mmu_page_role
> > > kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu);
> > > 
> > > +
> > > +static inline bool kvm_available_flush_tlb_with_range(void)
> > > +{
> > > + return kvm_x86_ops->tlb_remote_flush_with_range;
> > > +}
> > 
> > Seems that kvm_available_flush_tlb_with_range() is not used in this patch…
> 
> What's wrong with that? 
> 
> It provides the implementation and later patches make use of it. It's a
> sensible way to split patches into small, self contained entities.

>From what I can see of the patches that follow _this_ patch in the
series, this function remains unused.  So, not only is it not used
in this patch, it's not used in this series.

I think the real question that needs asking is - what is the plan
for this function, and when will it be used?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH V4 2/15] KVM/MMU: Add tlb flush with range helper function

2018-10-14 Thread Russell King - ARM Linux
On Sun, Oct 14, 2018 at 10:27:34AM +0100, Russell King - ARM Linux wrote:
> On Sun, Oct 14, 2018 at 10:16:56AM +0200, Thomas Gleixner wrote:
> > On Sun, 14 Oct 2018, Liran Alon wrote:
> > > > On 13 Oct 2018, at 17:53, lantianyu1...@gmail.com wrote:
> > > > 
> > > > From: Lan Tianyu 
> > > > 
> > > > This patch is to add wrapper functions for tlb_remote_flush_with_range
> > > > callback.
> > > > 
> > > > Signed-off-by: Lan Tianyu 
> > > > ---
> > > > Change sicne V3:
> > > >   Remove code of updating "tlbs_dirty"
> > > > Change since V2:
> > > >   Fix comment in the kvm_flush_remote_tlbs_with_range()
> > > > ---
> > > > arch/x86/kvm/mmu.c | 40 
> > > > 1 file changed, 40 insertions(+)
> > > > 
> > > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > > > index c73d9f650de7..ff656d85903a 100644
> > > > --- a/arch/x86/kvm/mmu.c
> > > > +++ b/arch/x86/kvm/mmu.c
> > > > @@ -264,6 +264,46 @@ static void mmu_spte_set(u64 *sptep, u64 spte);
> > > > static union kvm_mmu_page_role
> > > > kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu);
> > > > 
> > > > +
> > > > +static inline bool kvm_available_flush_tlb_with_range(void)
> > > > +{
> > > > +   return kvm_x86_ops->tlb_remote_flush_with_range;
> > > > +}
> > > 
> > > Seems that kvm_available_flush_tlb_with_range() is not used in this patch…
> > 
> > What's wrong with that? 
> > 
> > It provides the implementation and later patches make use of it. It's a
> > sensible way to split patches into small, self contained entities.
> 
> From what I can see of the patches that follow _this_ patch in the
> series, this function remains unused.  So, not only is it not used
> in this patch, it's not used in this series.

Note - I seem to have only received patches 1 through 4, so this is
based on the patches I've received.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [RESEND PATCH 0/6] rapidio: move Kconfig menu definition to subsystem

2018-07-31 Thread Russell King - ARM Linux
For the thread associated with this patch set, a review of a previous
patch for ARM posted last Tuesday on this subject asked a series of
questions about the PCI-nature of this.  The review has not been
responded to.

If it is inappropriate to offer RapidIO for any architecture that
happens to has PCI, then it is inappropriate to offer it for any
ARM machine that happens to have PCI.

In light of the lack of explanation on this point so far, I'm naking
the ARM part of this series for now.

I also think that the HAS_RAPIDIO thing is misleading and needs
sorting out (as I've mentioned in other emails, including the one
I refer to above) before rapidio becomes available more widely.

On Tue, Jul 31, 2018 at 10:29:48AM -0400, Alexei Colin wrote:
> Resending the patchset from prior submission:
> https://lkml.org/lkml/2018/7/30/911
> 
> The only change are the Cc tags in all patches now include the mailing
> lists for all affected architectures, and patch 1/6 (which adds the menu
> item to RapdidIO subsystem Kconfig) is CCed to all maintainers who are
> getting this cover letter. The cover letter has been updated with
> explanations to points raised in the feedback.
> 
> 
> 
> The top-level Kconfig entry for RapidIO subsystem is currently
> duplicated in several architecture-specific Kconfig files. This set of
> patches does two things:
> 
> 1. Move the Kconfig menu definition into the RapidIO subsystem and
> remove the duplicate definitions from arch Kconfig files.
> 
> 2. Enable RapidIO Kconfig menu entry for arm and arm64 architectures,
> where it was not enabled before. I tested that subsystem and drivers
> build successfully for both architectures, and tested that the modules
> load on a custom arm64 Qemu model.
> 
> For all architectures, RapidIO menu should be offered when either:
> (1) The platform has a PCI bus (which host a RapidIO module on the bus).
> (2) The platform has a RapidIO IP block (connected to a system bus, e.g.
> AXI on ARM). In this case, 'select HAS_RAPIDIO' should be added to the
> 'config ARCH_*' menu entry for the SoCs that offer the IP block.
> 
> Prior to this patchset, different architectures used different criteria:
> * powerpc: (1) and (2)
> * mips: (1) and (2) after recent commit into next that added (2):
>   https://www.linux-mips.org/archives/linux-mips/2018-07/msg00596.html
>   fc5d988878942e9b42a4de5204bdd452f3f1ce47
>   491ec1553e0075f345fbe476a93775eabcbc40b6
> * x86: (1)
> * arm,arm64: none (RapidIO menus never offered)
> 
> This set of architectures are the ones that implement support for
> RapidIO as system bus. On some platforms RapidIO can be the only system
> bus available replacing PCI/PCIe.  As it is done now, RapidIO is
> configured in "Bus Options" (x86/PPC) or "Bus Support" (ARMs) sub-menu
> and from system configuration option it should be kept this way.
> Current location of RAPIDIO configuration option is familiar to users of
> PowerPC and x86 platforms, and is similarly available in some ARM
> manufacturers kernel code trees. (Alex Bounine)
> 
> HAS_RAPIDIO is not enabled unconditionally, because HAS_RAPIDIO option
> is intended for SOCs that have built in SRIO controllers, like TI
> KeyStoneII or FPGAs. Because RapidIO subsystem core is required during
> RapidIO port driver initialization, having separate option allows us to
> control available build options for RapidIO core and port driver (bool
> vs.  tristate) and disable module option if port driver is configured as
> built-in. (Alex Bounine)
> 
> Responses to feedback from prior submission (thanks for the reviews!):
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-July/593347.html
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-July/593349.html
> 
> Changelog:
>   * Moved Kconfig entry into RapidIO subsystem instead of duplicating
> 
> In the current patchset, I took the approach of adding '|| PCI' to the
> depends in the subsystem. I did try the alterantive approach mentioned
> in the reviews for v1 of this patch, where the subsystem Kconfig does
> not add a '|| PCI' and each per-architecture Kconfig has to add a
> 'select HAS_RAPIDIO if PCI' and SoCs with IP blocks have to also add
> 'select HAS_RAPIDIO'. This works too but requires each architecture's
> Kconfig to add the line for RapidIO (whereas current approach does not
> require that involvement) and also may create a false impression that
> the dependency on PCI is strict.
> 
> We appreciate the suggestion for also selecting the RapdiIO subsystem for
> compilation with COMPILE_TEST, but hope to address it in a separate
> patchset, localized to the subsystem, since it will need to change
> depends on all drivers, not just on the top level, and since this
> patch now spans multiple architectures.
> 
> Alexei Colin (6):
>   rapidio: define top Kconfig menu in driver subtree
>   x86: factor out RapidIO Kconfig menu
>   powerpc: factor out RapidIO Kconfig menu entry
>   mips: factor out RapidIO Kconfig entry
>   arm: 

Re: [RESEND PATCH 5/6] arm: enable RapidIO menu in Kconfig

2018-07-31 Thread Russell King - ARM Linux
On Tue, Jul 31, 2018 at 10:29:53AM -0400, Alexei Colin wrote:
> Platforms with a PCI bus will be offered the RapidIO menu since they may
> be want support for a RapidIO PCI device. Platforms without a PCI bus
> that might include a RapidIO IP block will need to "select HAS_RAPIDIO"
> in the platform-/machine-specific "config ARCH_*" Kconfig entry.
> 
> Tested that kernel builds for ARM with the RapidIO subsystem and switch
> drivers enabled.
> 
> Cc: Andrew Morton 
> Cc: Russell King 
> Cc: John Paul Walters 
> Cc: linux-ker...@vger.kernel.org
> Cc: x...@kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-m...@linux-mips.org
> Cc: linux-arm-ker...@lists.infradead.org
> Signed-off-by: Alexei Colin 
> ---
>  arch/arm/Kconfig | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index afe350e5e3d9..602a61324890 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1278,6 +1278,8 @@ config PCI_HOST_ITE8152
>  
>  source "drivers/pci/Kconfig"
>  
> +source "drivers/rapidio/Kconfig"
> +
>  source "drivers/pcmcia/Kconfig"

Why not place the new Kconfig file after pcmcia?  That way, it is in
a consistent position wrt architectures such as powerpc, and it is
also in alphabetical order.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up


Re: [RESEND PATCH 1/6] rapidio: define top Kconfig menu in driver subtree

2018-07-31 Thread Russell King - ARM Linux
On Tue, Jul 31, 2018 at 10:29:49AM -0400, Alexei Colin wrote:
> The top-level Kconfig entry for RapidIO subsystem is currently
> duplicated in several architecture-specific Kconfigs. This
> commit re-defines it in the driver subtree, and subsequent
> commits, one per architecture, will remove the duplicated
> definitions from respective per-architecture Kconfigs.
> 
> Cc: Andrew Morton 
> Cc: John Paul Walters 
> Cc: Catalin Marinas 
> Cc: Russell King 
> Cc: Arnd Bergmann 
> Cc: Will Deacon 
> Cc: Ralf Baechle 
> Cc: Paul Burton 
> Cc: Alexander Sverdlin 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Thomas Gleixner 
> Cc: Peter Anvin 
> Cc: x...@kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-m...@linux-mips.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Alexei Colin 
> ---
>  drivers/rapidio/Kconfig | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/rapidio/Kconfig b/drivers/rapidio/Kconfig
> index d6d2f20c4597..98e301847584 100644
> --- a/drivers/rapidio/Kconfig
> +++ b/drivers/rapidio/Kconfig
> @@ -1,6 +1,21 @@
>  #
>  # RapidIO configuration
>  #
> +
> +config HAS_RAPIDIO
> + bool
> + default n

There's no need to specify this default - the default default defaults to
'n' anyway, so "default n" just respecifies what's already the default.
(next time I'll try to add more "default"s into that! ;) )

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up


Re: [PATCH 0/6] rapidio: move Kconfig menu definition to subsystem

2018-07-30 Thread Russell King - ARM Linux
I only have this message, and patches 5 and 6.  This is meaningless
for me to review - I can't tell what you've done as far as my comments
on your previous iteration.

Please arrange to _at least_ copy all patches the appropriate mailing
lists for the set with your complete patch set if you aren't going to
copy everyone on all patches in the set.

On Mon, Jul 30, 2018 at 06:50:28PM -0400, Alexei Colin wrote:
> In the current patchset, I took the approach of adding '|| PCI' to the
> depends in the subsystem. I did try the alterantive approach mentioned
> in the reviews for v1 of this patch, where the subsystem Kconfig does
> not add a '|| PCI' and each per-architecture Kconfig has to add a
> 'select HAS_RAPIDIO if PCI' and SoCs with IP blocks have to also add
> 'select HAS_RAPIDIO'. This works too but requires each architecture's
> Kconfig to add the line for RapidIO (whereas current approach does not
> require that involvement) and also may create a false impression that
> the dependency on PCI is strict.

... which means, as it stands, I've no idea what you actually came up
with for this.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-15 Thread Russell King - ARM Linux
On Fri, Apr 13, 2018 at 12:53:49PM -0700, Linus Torvalds wrote:
> On Fri, Apr 13, 2018 at 11:45 AM, Dave Martin  wrote:
> >
> > Most uses I've seen do nothing more than use the FPE_xyz value to
> > format diagnostic messages while dying.  I struggled to find code that
> > made a meaningful functional decision based on the value, though that's
> > not proof...
> 
> Yeah. I've seen code that cares about SIGFPE deeply, but it's almost
> invariably about some emulated environment (eg Java VM, or CPU
> emulation).
> 
> And the siginfo data is basically never good enough for those
> environments anyway on its own, so they will go and look at the actual
> instruction that caused the fault and the register state instead,
> because they need *all* the information.
> 
> The cases that use si_code are the ones that just trapped signals in
> order to give a more helpful abort message.
> 
> So I could certainly imagine that si_code is actually used by somebody
> who then decides to actuall act differently on it, but aside from
> perhaps printing out a different message, it sounds far-fetched.

Okay, in that case let's just use FPE_FLTINV.  That makes the patch
easily back-portable for stable kernels.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-13 Thread Russell King - ARM Linux
On Fri, Apr 13, 2018 at 07:35:38PM +0100, Dave Martin wrote:
> If that's the case though, I don't see how a userspace testsuite is
> hitting this code path.  Maybe I've misunderstood the context of this
> thread.

It isn't hitting this exact case.

The userspace testsuite is hitting an entirely different case:

kill(getpid(), SIGFPE);

As one expects, this generates a SIGFPE to the current process, which
then inspects the siginfo structure.  Being a userspace generated
signal, si_code is set to SI_USER, which has the value 0.

With FPE_FIXME defined to zero, as Eric has done:

enum siginfo_layout siginfo_layout(int sig, int si_code)
{
enum siginfo_layout layout = SIL_KILL;
if ((si_code > SI_USER) && (si_code < SI_KERNEL)) {
...
} else {
...
#ifdef FPE_FIXME
if ((sig == SIGFPE) && (si_code == FPE_FIXME))
layout = SIL_FAULT;
#endif
}
return layout;
}

This causes siginfo_layout() to return SIL_FAULT for this userspace
generated signal, rather than the correct SIL_KILL.

This affects which fields we copy to userspace.

SI_USER is defined to pass si_pid and si_uid to the userspace process,
which on ARM are the first two consecutive 32-bit quantities in the
union, which is done when siginfo_layout() returns SIL_KILL.  However,
when SIL_FAULT is returned, we only copy si_addr in the union, which
on ARM is just one 32-bit quantity.

Consequently, userspace sees a correct value for si_pid, and si_uid
remains set to whatever was there in userspace.  In the case of the
strace program, that's zero.  This means if you run the strace
testsuite as root, the problem doesn't appear, but if you run it as
a non-root user, it will.

So, the testsuite case has little to do with the behaviour of the VFP
handling - it's to do with the behaviour of the kernel's signal handling.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-13 Thread Russell King - ARM Linux
On Fri, Apr 13, 2018 at 06:08:28PM +0100, Dave Martin wrote:
> On Fri, Apr 13, 2018 at 09:33:17AM -0700, Linus Torvalds wrote:
> > On Fri, Apr 13, 2018 at 2:42 AM, Russell King - ARM Linux
> > <li...@armlinux.org.uk> wrote:
> > >
> > > Yes, it does solve the problem at hand with strace - the exact patch I
> > > tested against 4.16 is below.
> > 
> > Ok, good.
> > 
> > > However, FPE_FLTUNK is not defined in older kernels, so while we can
> > > fix it this way for the current merge window, that doesn't help 4.16.
> > 
> > I wonder if we should even bother with FPE_FLTUNK.
> > 
> > I suspect we might as well use FPE_FLTINV, I suspect, and not have
> > this complexity at all. That case is not worth worrying about, since
> > it's a "this shouldn't happen anyway" and the *real* reason will be in
> > the kernel logs due to vfs_panic().
> > 
> > So it's not like this is something that the user should ever care
> > about the si_code about.
> 
> Ack, my intended meaning for FPE_FLTUNK is that the fp exception is
> either spurious or we can't tell easily (or possibly at all) which
> FPE_XXX should be returned.  It's up to userspace to figure it out
> if it really cares.  Previously we were accidentally returning SI_USER
> in si_code for arm64.
> 
> This case on arm looks like a more serious error for which FPE_FLTINV
> may be more appropriate anyway.

No.  The cases where we get to this point are:

1. A trap concerning a coprocessor register transfer instruction (iow, move
   between a VFP register and ARM register.)
2. A trap concerning a coprocessor register load or save instruction.

(In both of these, "concerning" means that the VFP hardware provides
such an instruction as the reason for the fault, *not* that it is the
faulting instruction.)

3. A combination of the exception bits (EX and DEX) on certain VFP
   implementations.

All of these can be summarised as "the hardware went wrong in some way"
rather than "the user program did something wrong."

FPE_FLTINV means "floating point invalid operation".  Does it really
cover the case where hardware has failed, or is it intended to cover
the case where userspace did something wrong and asked for an invalid
operation from the FP hardware?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-13 Thread Russell King - ARM Linux
On Thu, Apr 12, 2018 at 10:22:15AM -0700, Linus Torvalds wrote:
> On Thu, Apr 12, 2018 at 10:20 AM, Russell King - ARM Linux
> <li...@armlinux.org.uk> wrote:
> >
> > This file was created to contain FPE_FIXME, by the "signal/arm: Document
> > conflicts with SI_USER and SIGFPE" commit so if we're removing it, it
> > would be better to remove the whole file.
> 
> Fair enough. I'm not going to commit that anyway since I can't test
> it, but yes, if it tests ok that sounds like the right thing to do.

Yes, it does solve the problem at hand with strace - the exact patch I
tested against 4.16 is below.

Testing this exact code path (exceptions set to VFP_EXCEPTION_ERROR)
is something that can only happen if the hardware does something stupid,
and I don't have a way of making it do that, so the code path can't be
tested.

However, FPE_FLTUNK is not defined in older kernels, so while we can
fix it this way for the current merge window, that doesn't help 4.16.
How we solve that depends what happens with Eric's patch (266da65e9156
("signal: Add FPE_FLTUNK si_code for undiagnosable fp exceptions"))
that introduces FPE_FLTUNK - and there's also the problem of NSIGFPE,
which kernel/signal.c uses in the path that selects the siginfo layout.

Given that the path we're talking about is unlikely to happen (as
mentioned in my second paragraph) I still think reverting Eric's patch
is the right way forward for older kernels.

(Note, my previous comment about the si_code initialiser was incorrect.)

diff --git a/arch/arm/include/uapi/asm/siginfo.h 
b/arch/arm/include/uapi/asm/siginfo.h
deleted file mode 100644
index d0513880be21..
--- a/arch/arm/include/uapi/asm/siginfo.h
+++ /dev/null
@@ -1,13 +0,0 @@
-#ifndef __ASM_SIGINFO_H
-#define __ASM_SIGINFO_H
-
-#include 
-
-/*
- * SIGFPE si_codes
- */
-#ifdef __KERNEL__
-#define FPE_FIXME  0   /* Broken dup of SI_USER */
-#endif /* __KERNEL__ */
-
-#endif
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 4c375e11ae95..8a1a5e6048d2 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -28,6 +28,10 @@
 #include 
 #include 
 
+#ifndef FPE_FLTUNK
+#define FPE_FLTUNK 14
+#endif
+
 #include "vfpinstr.h"
 #include "vfp.h"
 
@@ -257,7 +261,7 @@ static void vfp_raise_exceptions(u32 exceptions, u32 inst, 
u32 fpscr, struct pt_
 
if (exceptions == VFP_EXCEPTION_ERROR) {
vfp_panic("unhandled bounce", inst);
-   vfp_raise_sigfpe(FPE_FIXME, regs);
+   vfp_raise_sigfpe(FPE_FLTUNK, regs);
return;
}
 


-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-12 Thread Russell King - ARM Linux
On Thu, Apr 12, 2018 at 09:50:26AM -0700, Linus Torvalds wrote:
> Does this attached patch perhaps fix the ARM case?
> 
> It just uses FPE_FLTUNK as the default si_code for SIGFPE, which seems
> sane enough. And then gets rid of FPE_FIXME, which should resolve the
> nasty case.
> 
> Hmm? Entirely untested, and I didn't really look at the test-case in
> question since I can't really run it anyway.

I'll test tomorrow and let you know.

>  arch/arm/include/uapi/asm/siginfo.h | 7 ---
>  arch/arm/vfp/vfpmodule.c| 4 ++--
>  2 files changed, 2 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm/include/uapi/asm/siginfo.h 
> b/arch/arm/include/uapi/asm/siginfo.h
> index d0513880be21..d87beeedb4c4 100644
> --- a/arch/arm/include/uapi/asm/siginfo.h
> +++ b/arch/arm/include/uapi/asm/siginfo.h
> @@ -3,11 +3,4 @@
>  
>  #include 
>  
> -/*
> - * SIGFPE si_codes
> - */
> -#ifdef __KERNEL__
> -#define FPE_FIXME0   /* Broken dup of SI_USER */
> -#endif /* __KERNEL__ */
> -
>  #endif

This file was created to contain FPE_FIXME, by the "signal/arm: Document
conflicts with SI_USER and SIGFPE" commit so if we're removing it, it
would be better to remove the whole file.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-12 Thread Russell King - ARM Linux
On Thu, Apr 12, 2018 at 03:49:28PM +0300, Dmitry V. Levin wrote:
> The "KERNEL BUG" diagnostics I was talking about was added to strace yesterday
> as a part of workaround commit, see
> https://github.com/strace/strace/commit/34c7794cc16e2511eda7b1d5767c655a83b17309
> Before that change the test just failed.

Ah, seeing the test case really helps to see exactly what and why it's
broken.  Yes, Eric's commit was definitely wrong and needs to be
reverted, because it incorrectly changes what happens when kill(1) is
used to deliver a SIGFPE signal to a process.

Eric, please sort this out - you have a much better handle on whether
there are any dependencies here that would need to be resolved from
a simple revert of the offending commits, but that revert must happen
because you've caused a user visible regression.

The original code _was_ safe even if it wasn't correct to the specs,
as we'd end up copying the si_addr field (as the si_pid copy) and a
zeroed field as the si_uid copy.  It was just that si_code was
technically wrong, and that's something that would be even more
dangerous to change now.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-12 Thread Russell King - ARM Linux
On Thu, Apr 12, 2018 at 02:03:14PM +0300, Dmitry V. Levin wrote:
> On Thu, Apr 12, 2018 at 10:58:11AM +0100, Russell King - ARM Linux wrote:
> > On Thu, Apr 12, 2018 at 04:34:35AM +0300, Dmitry V. Levin wrote:
> > > A similar commit v4.16-rc1~159^2~37
> > > ("signal/arm: Document conflicts with SI_USER and SIGFPE") must have
> > > introduced a similar ABI regression to compat arm.
> > 
> > So, could you explain how can this change cause a regression?
> > 
> > +#define FPE_FIXME  0
> > -   vfp_raise_sigfpe(0, regs);
> > +   vfp_raise_sigfpe(FPE_FIXME, regs);
> 
> No, this hunk hasn't caused the regression, but another one did:
> 
> diff --git a/arch/arm/include/uapi/asm/siginfo.h 
> b/arch/arm/include/uapi/asm/siginfo.h
> new file mode 100644
> index 000..d051388
> --- /dev/null
> +++ b/arch/arm/include/uapi/asm/siginfo.h
> @@ -0,0 +1,13 @@
> +#ifndef __ASM_SIGINFO_H
> +#define __ASM_SIGINFO_H
> +
> +#include 
> +
> +/*
> + * SIGFPE si_codes
> + */
> +#ifdef __KERNEL__
> +#define FPE_FIXME  0   /* Broken dup of SI_USER */
> +#endif /* __KERNEL__ */
> +
> +#endif
> 
> This is due to FPE_FIXME handling in kernel/signal.c

Building strace 4.22 on ARM and running the test suite reveals no
problems with the signal_receive test, tested on both 4.14 and 4.16
kernels - there's no "KERNEL BUG" reports in any of the test results.
However, stock strace 4.22 source doesn't appear to contain the
"KERNEL BUG" string anywhere, so this may be a Suse specific addition
to the test:

~/src/strace-4.22$ grep -ri 'KERNEL BUG' .
./strace.1:Arguably, every instance of such behavior is a kernel bug.)
./strace.1.in:Arguably, every instance of such behavior is a kernel bug.)
./NEWS:  * Worked around a kernel bug in tracing privileged executables.
./ChangeLog:aarch64: workaround gcc+kernel bug.
./ChangeLog:tests: workaround kernel bugs in seccomp-strict.test and 
prctl-seccomp-strict.test
./ChangeLog:instead.  We don't want the testsuite failing due to kernel 
bugs.
./ChangeLog:First guess is that it's a workaround for old kernel bugs:
./ChangeLog:This kernel bug is fixed long ago. This change removes the 
workaround.

Any ideas where the "KERNEL BUG" in Suse builds is coming from?  Any
ideas how to test it on other architectures (iow, where can we get
source that contains this test?)

Based on previous experience, unfortunately folk don't tend to report
user ABI regressions to kernel developers, so we'd probably never know
that there's a problem - I do think the safer thing would've been to
leave it well alone, and just accept that we'll end up copying more
words to userspace than is actually intended.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-12 Thread Russell King - ARM Linux
On Thu, Apr 12, 2018 at 04:34:35AM +0300, Dmitry V. Levin wrote:
> A similar commit v4.16-rc1~159^2~37
> ("signal/arm: Document conflicts with SI_USER and SIGFPE") must have
> introduced a similar ABI regression to compat arm.

So, could you explain how can this change cause a regression?

+#define FPE_FIXME  0
-   vfp_raise_sigfpe(0, regs);
+   vfp_raise_sigfpe(FPE_FIXME, regs);

I think you're talking garbage here - look at the damned change.
It subsitutes a definition for a constant, and vfp_raise_sigfpe()
ends up receiving exactly the same value bother before and after
the change.

The change is rather incomplete though because it should have
also changed:

int si_code = 0;

as well.

So, the commit log is accurate in this case: it _is_ about
documenting the conflicting cases between SI_USER and SIGFPE and
that bit of the change has no ABI effect.

What does slightly annoy me is the creation of uapi/asm/siginfo.h
to contain a definition that _isn't_ to be exposed as part of the
UAPI.  If it's not part of the UAPI, it doesn't belong in a UAPI
header, period.  In any case, I don't think that is exposed to
userspace.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: [PATCH] Extract initrd free logic from arch-specific code.

2018-03-29 Thread Russell King - ARM Linux
On Thu, Mar 29, 2018 at 05:53:14PM +0100, Marc Zyngier wrote:
> On Thu, 29 Mar 2018 16:58:27 +0100,
> Russell King - ARM Linux wrote:
> > 
> > On Thu, Mar 29, 2018 at 05:43:47PM +0200, Geert Uytterhoeven wrote:
> > > On Thu, Mar 29, 2018 at 5:27 PM, Russell King - ARM Linux
> > > <li...@armlinux.org.uk> wrote:
> > > > On Thu, Mar 29, 2018 at 09:37:52AM +1100, Oliver wrote:
> > > >> On Thu, Mar 29, 2018 at 9:14 AM, Russell King - ARM Linux
> > > >> <li...@armlinux.org.uk> wrote:
> > > >> > On Wed, Mar 28, 2018 at 02:04:22PM -0500, Rob Landley wrote:
> > > >> >> On 03/28/2018 11:48 AM, Russell King - ARM Linux wrote:
> > > >> >> > On Wed, Mar 28, 2018 at 10:58:51AM -0500, Rob Landley wrote:
> > > >> >> >> On 03/28/2018 10:26 AM, Shea Levy wrote:
> > > >> >> >>> Now only those architectures that have custom initrd free 
> > > >> >> >>> requirements
> > > >> >> >>> need to define free_initrd_mem.
> > > >> >> >> ...
> > > >> >> >>> --- a/arch/arc/mm/init.c
> > > >> >> >>> +++ b/arch/arc/mm/init.c
> > > >> >> >>> @@ -229,10 +229,3 @@ void __ref free_initmem(void)
> > > >> >> >>>  {
> > > >> >> >>>   free_initmem_default(-1);
> > > >> >> >>>  }
> > > >> >> >>> -
> > > >> >> >>> -#ifdef CONFIG_BLK_DEV_INITRD
> > > >> >> >>> -void __init free_initrd_mem(unsigned long start, unsigned long 
> > > >> >> >>> end)
> > > >> >> >>> -{
> > > >> >> >>> - free_reserved_area((void *)start, (void *)end, -1, "initrd");
> > > >> >> >>> -}
> > > >> >> >>> -#endif
> > > >> >> >>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > > >> >> >>> index 3f972e83909b..19d1c5594e2d 100644
> > > >> >> >>> --- a/arch/arm/Kconfig
> > > >> >> >>> +++ b/arch/arm/Kconfig
> > > >> >> >>> @@ -47,6 +47,7 @@ config ARM
> > > >> >> >>>   select HARDIRQS_SW_RESEND
> > > >> >> >>>   select HAVE_ARCH_AUDITSYSCALL if (AEABI && !OABI_COMPAT)
> > > >> >> >>>   select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && 
> > > >> >> >>> !CPU_32v6
> > > >> >> >>> + select HAVE_ARCH_FREE_INITRD_MEM
> > > >> >> >>>   select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && 
> > > >> >> >>> !CPU_ENDIAN_BE32 && MMU
> > > >> >> >>>   select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU
> > > >> >> >>>   select HAVE_ARCH_MMAP_RND_BITS if MMU
> > > >> >> >>
> > > >> >> >> Isn't this why weak symbols were invented?
> > > >> >> >
> > > >> >> > Weak symbols means that we end up with both the weakly-referenced 
> > > >> >> > code
> > > >> >> > and the arch code in the kernel image.  That's fine if the weak 
> > > >> >> > code
> > > >> >> > is small.
> > > >> >>
> > > >> >> The kernel's been able to build with link time garbage collection 
> > > >> >> since 2016:
> > > >> >>
> > > >> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b67067f1176d
> > > >> >>
> > > >> >> Wouldn't that remove the unused one?
> > > >> >
> > > >> > Probably, if anyone bothered to use that, which they don't.
> > > >> >
> > > >> > LD_DEAD_CODE_DATA_ELIMINATION is a symbol without a prompt, and from
> > > >> > what I can see, nothing selects it.  Therefore, the symbol is always
> > > >> > disabled, and so the feature never gets used in mainline kernels.
> > > >> >
> > > >> > Brings up the obvious question - why is it there if it's completely
> > > >> > unused?  (Maybe to cause confusion, and allowing a justificat

Re: [PATCH] Extract initrd free logic from arch-specific code.

2018-03-29 Thread Russell King - ARM Linux
On Thu, Mar 29, 2018 at 11:39:24AM -0500, Rob Landley wrote:
> 
> 
> On 03/28/2018 05:14 PM, Russell King - ARM Linux wrote:
> > On Wed, Mar 28, 2018 at 02:04:22PM -0500, Rob Landley wrote:
> >>
> >>
> >> On 03/28/2018 11:48 AM, Russell King - ARM Linux wrote:
> >>> On Wed, Mar 28, 2018 at 10:58:51AM -0500, Rob Landley wrote:
> >>>> On 03/28/2018 10:26 AM, Shea Levy wrote:
> >>>>> Now only those architectures that have custom initrd free requirements
> >>>>> need to define free_initrd_mem.
> >>>> ...
> >>>>> --- a/arch/arc/mm/init.c
> >>>>> +++ b/arch/arc/mm/init.c
> >>>>> @@ -229,10 +229,3 @@ void __ref free_initmem(void)
> >>>>>  {
> >>>>> free_initmem_default(-1);
> >>>>>  }
> >>>>> -
> >>>>> -#ifdef CONFIG_BLK_DEV_INITRD
> >>>>> -void __init free_initrd_mem(unsigned long start, unsigned long end)
> >>>>> -{
> >>>>> -   free_reserved_area((void *)start, (void *)end, -1, "initrd");
> >>>>> -}
> >>>>> -#endif
> >>>>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> >>>>> index 3f972e83909b..19d1c5594e2d 100644
> >>>>> --- a/arch/arm/Kconfig
> >>>>> +++ b/arch/arm/Kconfig
> >>>>> @@ -47,6 +47,7 @@ config ARM
> >>>>> select HARDIRQS_SW_RESEND
> >>>>> select HAVE_ARCH_AUDITSYSCALL if (AEABI && !OABI_COMPAT)
> >>>>> select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && 
> >>>>> !CPU_32v6
> >>>>> +   select HAVE_ARCH_FREE_INITRD_MEM
> >>>>> select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 
> >>>>> && MMU
> >>>>> select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU
> >>>>> select HAVE_ARCH_MMAP_RND_BITS if MMU
> >>>>
> >>>> Isn't this why weak symbols were invented?
> >>>
> >>> Weak symbols means that we end up with both the weakly-referenced code
> >>> and the arch code in the kernel image.  That's fine if the weak code
> >>> is small.
> >>
> >> The kernel's been able to build with link time garbage collection since 
> >> 2016:
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b67067f1176d
> >>
> >> Wouldn't that remove the unused one?
> > 
> > Probably, if anyone bothered to use that, which they don't.
> > 
> > LD_DEAD_CODE_DATA_ELIMINATION is a symbol without a prompt, and from
> > what I can see, nothing selects it.  Therefore, the symbol is always
> > disabled, and so the feature never gets used in mainline kernels.
> 
> It looks like there are per-architecture linker scripts that need to be 
> updated?
> So if an architecture supports it, it's always done (well, it probes for the
> toolchain supporting the flag). And if the architecture doesn't support it, 
> the
> linker script needs to be updated to mark sections with "I know nothing seems 
> to
> reference this at the ELF level but keep it anyway, we're pulling an assembly
> trick".

It looks like it needs much more than just architecture changes as the
reason it fails on ARM is because the init thread structure is missing
due to missing KEEP()s in INIT_TASK_DATA().  Probably means it doesn't
work anywhere.

8<===
From: Russell King <rmk+ker...@armlinux.org.uk>
Subject: [PATCH] Fix LD_DEAD_CODE_DATA_ELIMINATION

LD_DEAD_CODE_DATA_ELIMINATION fails to boot on ARM because the linker
eliminates the init thread data from the bottom of the init threads
stack.  This causes recursive faults that end up overwriting parts
of the kernel before they can print any message.

Signed-off-by: Russell King <rmk+ker...@armlinux.org.uk>
---
 include/asm-generic/vmlinux.lds.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index 1ab0e520d6fc..41af8a74aae4 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -279,8 +279,8 @@
VMLINUX_SYMBOL(__start_init_task) = .;  \
VMLINUX_SYMBOL(init_thread_union) = .;  \
VMLINUX_SYMBOL(init_stack) = .; \
-   *(.data..init_task) \
-   *(.data..init_thread_info)  \
+   KEEP(*(.data..init_task))   \
+   KEEP(*(.data..init_thread_info))\
. = VMLINUX_SYMBOL(__start_init_task) + THREAD_SIZE;\
VMLINUX_SYMBOL(__end_init_task) = .;
 

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: [PATCH] Extract initrd free logic from arch-specific code.

2018-03-29 Thread Russell King - ARM Linux
On Thu, Mar 29, 2018 at 05:43:47PM +0200, Geert Uytterhoeven wrote:
> On Thu, Mar 29, 2018 at 5:27 PM, Russell King - ARM Linux
> <li...@armlinux.org.uk> wrote:
> > On Thu, Mar 29, 2018 at 09:37:52AM +1100, Oliver wrote:
> >> On Thu, Mar 29, 2018 at 9:14 AM, Russell King - ARM Linux
> >> <li...@armlinux.org.uk> wrote:
> >> > On Wed, Mar 28, 2018 at 02:04:22PM -0500, Rob Landley wrote:
> >> >> On 03/28/2018 11:48 AM, Russell King - ARM Linux wrote:
> >> >> > On Wed, Mar 28, 2018 at 10:58:51AM -0500, Rob Landley wrote:
> >> >> >> On 03/28/2018 10:26 AM, Shea Levy wrote:
> >> >> >>> Now only those architectures that have custom initrd free 
> >> >> >>> requirements
> >> >> >>> need to define free_initrd_mem.
> >> >> >> ...
> >> >> >>> --- a/arch/arc/mm/init.c
> >> >> >>> +++ b/arch/arc/mm/init.c
> >> >> >>> @@ -229,10 +229,3 @@ void __ref free_initmem(void)
> >> >> >>>  {
> >> >> >>>   free_initmem_default(-1);
> >> >> >>>  }
> >> >> >>> -
> >> >> >>> -#ifdef CONFIG_BLK_DEV_INITRD
> >> >> >>> -void __init free_initrd_mem(unsigned long start, unsigned long end)
> >> >> >>> -{
> >> >> >>> - free_reserved_area((void *)start, (void *)end, -1, "initrd");
> >> >> >>> -}
> >> >> >>> -#endif
> >> >> >>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> >> >> >>> index 3f972e83909b..19d1c5594e2d 100644
> >> >> >>> --- a/arch/arm/Kconfig
> >> >> >>> +++ b/arch/arm/Kconfig
> >> >> >>> @@ -47,6 +47,7 @@ config ARM
> >> >> >>>   select HARDIRQS_SW_RESEND
> >> >> >>>   select HAVE_ARCH_AUDITSYSCALL if (AEABI && !OABI_COMPAT)
> >> >> >>>   select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && 
> >> >> >>> !CPU_32v6
> >> >> >>> + select HAVE_ARCH_FREE_INITRD_MEM
> >> >> >>>   select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && 
> >> >> >>> MMU
> >> >> >>>   select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU
> >> >> >>>   select HAVE_ARCH_MMAP_RND_BITS if MMU
> >> >> >>
> >> >> >> Isn't this why weak symbols were invented?
> >> >> >
> >> >> > Weak symbols means that we end up with both the weakly-referenced code
> >> >> > and the arch code in the kernel image.  That's fine if the weak code
> >> >> > is small.
> >> >>
> >> >> The kernel's been able to build with link time garbage collection since 
> >> >> 2016:
> >> >>
> >> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b67067f1176d
> >> >>
> >> >> Wouldn't that remove the unused one?
> >> >
> >> > Probably, if anyone bothered to use that, which they don't.
> >> >
> >> > LD_DEAD_CODE_DATA_ELIMINATION is a symbol without a prompt, and from
> >> > what I can see, nothing selects it.  Therefore, the symbol is always
> >> > disabled, and so the feature never gets used in mainline kernels.
> >> >
> >> > Brings up the obvious question - why is it there if it's completely
> >> > unused?  (Maybe to cause confusion, and allowing a justification
> >> > for __weak ?)
> >>
> >> IIRC Nick had some patches to do the arch enablement for powerpc, but
> >> I'm not sure what happened to them though. I suspect it just fell down
> >> Nick's ever growing TODO list.
> >
> > I've given it a go on ARM, marking every linker-built table with KEEP()
> > and comparing the System.map files.  The resulting kernel is around
> > 150k smaller, which seems good.
> >
> > However, it doesn't boot - and I don't know why.  Booting the kernel
> > under kvmtool in a VM using virtio-console, I can find no way to get
> > any kernel messages out of it.  Using lkvm debug, I can see that the
> > PC is stuck inside die(), and that's the only information I have.
> > It dies before bringing up the other CPUs, so it's a very early death.
> >
> > I don't think other console types are available under ARM64.
> 
> earlycon?

Through what - as I say above, I think the only thing that's present is
virtio-console, and the virtio stack only get initialised much later in
boot.

Eg, there's the memory-based virtio driver which interfaces any virtio
driver to a memory-based ring structures for communication with the host
(drivers/virtio/virtio_mmio.c) which is initialised at module_init()
time, and so isn't available for earlycon.

I don't think merely changing the module_init() calls in the appropriate
virtio bits will suffice - it's why I pointed out that it dies before
SMP initialisation, which also means that it dies before we start
running the initcalls for subsystems and drivers.

I'm not aware of there being an emulated UART in the guest's address
space, so serial based stuff doesn't work.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: [PATCH] Extract initrd free logic from arch-specific code.

2018-03-29 Thread Russell King - ARM Linux
On Thu, Mar 29, 2018 at 09:37:52AM +1100, Oliver wrote:
> On Thu, Mar 29, 2018 at 9:14 AM, Russell King - ARM Linux
> <li...@armlinux.org.uk> wrote:
> > On Wed, Mar 28, 2018 at 02:04:22PM -0500, Rob Landley wrote:
> >>
> >>
> >> On 03/28/2018 11:48 AM, Russell King - ARM Linux wrote:
> >> > On Wed, Mar 28, 2018 at 10:58:51AM -0500, Rob Landley wrote:
> >> >> On 03/28/2018 10:26 AM, Shea Levy wrote:
> >> >>> Now only those architectures that have custom initrd free requirements
> >> >>> need to define free_initrd_mem.
> >> >> ...
> >> >>> --- a/arch/arc/mm/init.c
> >> >>> +++ b/arch/arc/mm/init.c
> >> >>> @@ -229,10 +229,3 @@ void __ref free_initmem(void)
> >> >>>  {
> >> >>>   free_initmem_default(-1);
> >> >>>  }
> >> >>> -
> >> >>> -#ifdef CONFIG_BLK_DEV_INITRD
> >> >>> -void __init free_initrd_mem(unsigned long start, unsigned long end)
> >> >>> -{
> >> >>> - free_reserved_area((void *)start, (void *)end, -1, "initrd");
> >> >>> -}
> >> >>> -#endif
> >> >>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> >> >>> index 3f972e83909b..19d1c5594e2d 100644
> >> >>> --- a/arch/arm/Kconfig
> >> >>> +++ b/arch/arm/Kconfig
> >> >>> @@ -47,6 +47,7 @@ config ARM
> >> >>>   select HARDIRQS_SW_RESEND
> >> >>>   select HAVE_ARCH_AUDITSYSCALL if (AEABI && !OABI_COMPAT)
> >> >>>   select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6
> >> >>> + select HAVE_ARCH_FREE_INITRD_MEM
> >> >>>   select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU
> >> >>>   select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU
> >> >>>   select HAVE_ARCH_MMAP_RND_BITS if MMU
> >> >>
> >> >> Isn't this why weak symbols were invented?
> >> >
> >> > Weak symbols means that we end up with both the weakly-referenced code
> >> > and the arch code in the kernel image.  That's fine if the weak code
> >> > is small.
> >>
> >> The kernel's been able to build with link time garbage collection since 
> >> 2016:
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b67067f1176d
> >>
> >> Wouldn't that remove the unused one?
> >
> > Probably, if anyone bothered to use that, which they don't.
> >
> > LD_DEAD_CODE_DATA_ELIMINATION is a symbol without a prompt, and from
> > what I can see, nothing selects it.  Therefore, the symbol is always
> > disabled, and so the feature never gets used in mainline kernels.
> >
> > Brings up the obvious question - why is it there if it's completely
> > unused?  (Maybe to cause confusion, and allowing a justification
> > for __weak ?)
> 
> IIRC Nick had some patches to do the arch enablement for powerpc, but
> I'm not sure what happened to them though. I suspect it just fell down
> Nick's ever growing TODO list.

I've given it a go on ARM, marking every linker-built table with KEEP()
and comparing the System.map files.  The resulting kernel is around
150k smaller, which seems good.

However, it doesn't boot - and I don't know why.  Booting the kernel
under kvmtool in a VM using virtio-console, I can find no way to get
any kernel messages out of it.  Using lkvm debug, I can see that the
PC is stuck inside die(), and that's the only information I have.
It dies before bringing up the other CPUs, so it's a very early death.

I don't think other console types are available under ARM64.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: [PATCH] Extract initrd free logic from arch-specific code.

2018-03-28 Thread Russell King - ARM Linux
On Wed, Mar 28, 2018 at 02:04:22PM -0500, Rob Landley wrote:
> 
> 
> On 03/28/2018 11:48 AM, Russell King - ARM Linux wrote:
> > On Wed, Mar 28, 2018 at 10:58:51AM -0500, Rob Landley wrote:
> >> On 03/28/2018 10:26 AM, Shea Levy wrote:
> >>> Now only those architectures that have custom initrd free requirements
> >>> need to define free_initrd_mem.
> >> ...
> >>> --- a/arch/arc/mm/init.c
> >>> +++ b/arch/arc/mm/init.c
> >>> @@ -229,10 +229,3 @@ void __ref free_initmem(void)
> >>>  {
> >>>   free_initmem_default(-1);
> >>>  }
> >>> -
> >>> -#ifdef CONFIG_BLK_DEV_INITRD
> >>> -void __init free_initrd_mem(unsigned long start, unsigned long end)
> >>> -{
> >>> - free_reserved_area((void *)start, (void *)end, -1, "initrd");
> >>> -}
> >>> -#endif
> >>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> >>> index 3f972e83909b..19d1c5594e2d 100644
> >>> --- a/arch/arm/Kconfig
> >>> +++ b/arch/arm/Kconfig
> >>> @@ -47,6 +47,7 @@ config ARM
> >>>   select HARDIRQS_SW_RESEND
> >>>   select HAVE_ARCH_AUDITSYSCALL if (AEABI && !OABI_COMPAT)
> >>>   select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6
> >>> + select HAVE_ARCH_FREE_INITRD_MEM
> >>>   select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU
> >>>   select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU
> >>>   select HAVE_ARCH_MMAP_RND_BITS if MMU
> >>
> >> Isn't this why weak symbols were invented?
> > 
> > Weak symbols means that we end up with both the weakly-referenced code
> > and the arch code in the kernel image.  That's fine if the weak code
> > is small.
> 
> The kernel's been able to build with link time garbage collection since 2016:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b67067f1176d
> 
> Wouldn't that remove the unused one?

Probably, if anyone bothered to use that, which they don't.

LD_DEAD_CODE_DATA_ELIMINATION is a symbol without a prompt, and from
what I can see, nothing selects it.  Therefore, the symbol is always
disabled, and so the feature never gets used in mainline kernels.

Brings up the obvious question - why is it there if it's completely
unused?  (Maybe to cause confusion, and allowing a justification
for __weak ?)

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: [PATCH] Extract initrd free logic from arch-specific code.

2018-03-28 Thread Russell King - ARM Linux
On Wed, Mar 28, 2018 at 10:58:51AM -0500, Rob Landley wrote:
> On 03/28/2018 10:26 AM, Shea Levy wrote:
> > Now only those architectures that have custom initrd free requirements
> > need to define free_initrd_mem.
> ...
> > --- a/arch/arc/mm/init.c
> > +++ b/arch/arc/mm/init.c
> > @@ -229,10 +229,3 @@ void __ref free_initmem(void)
> >  {
> > free_initmem_default(-1);
> >  }
> > -
> > -#ifdef CONFIG_BLK_DEV_INITRD
> > -void __init free_initrd_mem(unsigned long start, unsigned long end)
> > -{
> > -   free_reserved_area((void *)start, (void *)end, -1, "initrd");
> > -}
> > -#endif
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index 3f972e83909b..19d1c5594e2d 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -47,6 +47,7 @@ config ARM
> > select HARDIRQS_SW_RESEND
> > select HAVE_ARCH_AUDITSYSCALL if (AEABI && !OABI_COMPAT)
> > select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6
> > +   select HAVE_ARCH_FREE_INITRD_MEM
> > select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU
> > select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU
> > select HAVE_ARCH_MMAP_RND_BITS if MMU
> 
> Isn't this why weak symbols were invented?

Weak symbols means that we end up with both the weakly-referenced code
and the arch code in the kernel image.  That's fine if the weak code
is small.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: linux-next: Tree for Nov 7

2017-11-13 Thread Russell King - ARM Linux
On Mon, Nov 13, 2017 at 04:16:41PM +0100, Michal Hocko wrote:
> On Mon 13-11-17 13:00:57, Michal Hocko wrote:
> [...]
> > Yes, I have mentioned that in the previous email but the amount of code
> > would be even larger. Basically every arch which reimplements
> > arch_get_unmapped_area would have to special case new MAP_FIXED flag to
> > do vma lookup.
> 
> It turned out that this might be much more easier than I thought after
> all. It seems we can really handle that in the common code. This would
> mean that we are exposing a new functionality to the userspace though.
> Myabe this would be useful on its own though. Just a quick draft (not
> even compile tested) whether this makes sense in general. I would be
> worried about unexpected behavior when somebody set other bit without a
> good reason and we might fail with ENOMEM for such a call now.
> 
> Elf loader would then use MAP_FIXED_SAFE rather than MAP_FIXED.
> ---
> diff --git a/arch/alpha/include/uapi/asm/mman.h 
> b/arch/alpha/include/uapi/asm/mman.h
> index 3b26cc62dadb..d021c21f9b01 100644
> --- a/arch/alpha/include/uapi/asm/mman.h
> +++ b/arch/alpha/include/uapi/asm/mman.h
> @@ -31,6 +31,9 @@
>  #define MAP_STACK0x8 /* give out an address that is best 
> suited for process/thread stacks */
>  #define MAP_HUGETLB  0x10/* create a huge page mapping */
>  
> +#define MAP_KEEP_MAPPING 0x200
> +#define MAP_FIXED_SAFE   MAP_FIXED|MAP_KEEP_MAPPING /* enforce MAP_FIXED 
> without clobbering an existing mapping */

A few things...

1. Does this need to be exposed to userland?
2. Can it end up in include/uapi/asm-generic/mman*.h ?
3. The definition of MAP_FIXED_SAFE should really have parens around it.

> @@ -1365,6 +1365,13 @@ unsigned long do_mmap(struct file *file, unsigned long 
> addr,
>   if (offset_in_page(addr))
>   return addr;
>  
> + if ((flags & MAP_FIXED_SAFE) == MAP_FIXED_SAFE) {

I'm surprised this doesn't warn - since this effectively expands to:

flags & MAP_FIXED | MAP_KEEP_MAPPING

hence why MAP_FIXED_SAFE needs parens.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: linux-next: Tree for Nov 7

2017-11-13 Thread Russell King - ARM Linux
On Mon, Nov 13, 2017 at 03:11:40PM +0100, Michal Hocko wrote:
> On Mon 13-11-17 10:20:06, Michal Hocko wrote:
> > [Cc arm and ppc maintainers]
> > 
> > Thanks a lot for testing!
> > 
> > On Sun 12-11-17 11:38:02, Joel Stanley wrote:
> > > On Fri, Nov 10, 2017 at 11:00 PM, Michal Hocko  wrote:
> > > > Hi Joel,
> > > >
> > > > On Wed 08-11-17 15:20:50, Michal Hocko wrote:
> > > > [...]
> > > >> > There are a lot of messages on the way up that look like this:
> > > >> >
> > > >> > [2.527460] Uhuuh, elf segement at 000d9000 requested but the
> > > >> > memory is mapped already
> > > >> > [2.540160] Uhuuh, elf segement at 000d9000 requested but the
> > > >> > memory is mapped already
> > > >> > [2.546153] Uhuuh, elf segement at 000d9000 requested but the
> > > >> > memory is mapped already
> > > >> >
> > > >> > And then trying to run userspace looks like this:
> > > >>
> > > >> Could you please run with debugging patch posted
> > > >> http://lkml.kernel.org/r/20171107102854.vylrtaodla63k...@dhcp22.suse.cz
> > > >
> > > > Did you have chance to test with this debugging patch, please?
> > > 
> > > Lots of this:
> > > 
> > > [1.177266] Uhuuh, elf segement at 000d9000 requested but the  memory 
> > > is mapped already, got 000dd000
> > > [1.177555] Clashing vma [dd000, de000] flags:100873 name:(null)
> > 
> > This smells like the problem I've expected that mmap with hint doesn't
> > respect the hint even though there is no clashing mapping. The above
> > basically says that we didn't map at 0xd9000 but it has placed it at
> > 0xdd000. The nearest (clashing) vma is at 0xdd000 so this is our new
> > mapping. find_vma returns the closest vma (with addr < vm_end) for the
> > given address 0xd9000 so this address cannot be mapped by any other vma.
> > 
> > Now that I am looking at arm's arch_get_unmapped_area it does perform
> > aligning for shared vmas.
> 
> Sorry for confusion here. These are not shared mappings as pointed out
> by Russell in a private email. I got confused by the above flags which I
> have misinterpreted as bit 0 set => MAP_SHARED. These are vm_flags
> obviously so the bit 0 is VM_READ. Sorry about the confusion. The real
> reason we are doing the alignment is that we do a file mapping
>   /*
>* We only need to do colour alignment if either the I or D
>* caches alias.
>*/
>   if (aliasing)
>   do_align = filp || (flags & MAP_SHARED);
> 
> I am not really familiar with this architecture to understand why do we
> need aliasing for file mappings, though.

I think it's there so that flush_dcache_page() works - possibly
get_user_pages() being used on a private mapping of page cache pages,
but that's guessing.

I'm afraid I don't remember all the details, this is code from around
15 years ago, and I'd be very nervous about changing it now without
fully understanding the issues.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: linux-next: Tree for Nov 7

2017-11-13 Thread Russell King - ARM Linux
On Mon, Nov 13, 2017 at 10:20:06AM +0100, Michal Hocko wrote:
> [Cc arm and ppc maintainers]
> 
> Thanks a lot for testing!
> 
> On Sun 12-11-17 11:38:02, Joel Stanley wrote:
> > On Fri, Nov 10, 2017 at 11:00 PM, Michal Hocko  wrote:
> > > Hi Joel,
> > >
> > > On Wed 08-11-17 15:20:50, Michal Hocko wrote:
> > > [...]
> > >> > There are a lot of messages on the way up that look like this:
> > >> >
> > >> > [2.527460] Uhuuh, elf segement at 000d9000 requested but the
> > >> > memory is mapped already
> > >> > [2.540160] Uhuuh, elf segement at 000d9000 requested but the
> > >> > memory is mapped already
> > >> > [2.546153] Uhuuh, elf segement at 000d9000 requested but the
> > >> > memory is mapped already
> > >> >
> > >> > And then trying to run userspace looks like this:
> > >>
> > >> Could you please run with debugging patch posted
> > >> http://lkml.kernel.org/r/20171107102854.vylrtaodla63k...@dhcp22.suse.cz
> > >
> > > Did you have chance to test with this debugging patch, please?
> > 
> > Lots of this:
> > 
> > [1.177266] Uhuuh, elf segement at 000d9000 requested but the  memory is 
> > mapped already, got 000dd000
> > [1.177555] Clashing vma [dd000, de000] flags:100873 name:(null)
> 
> This smells like the problem I've expected that mmap with hint doesn't
> respect the hint even though there is no clashing mapping. The above
> basically says that we didn't map at 0xd9000 but it has placed it at
> 0xdd000. The nearest (clashing) vma is at 0xdd000 so this is our new
> mapping. find_vma returns the closest vma (with addr < vm_end) for the
> given address 0xd9000 so this address cannot be mapped by any other vma.
> 
> Now that I am looking at arm's arch_get_unmapped_area it does perform
> aligning for shared vmas. We do not do that for MAP_FIXED.  Powepc,
> reported earlier [1] seems to suffer from the similar problem.
> slice_get_unmapped_area alignes to slices, whatever that means.
> 
> I can see two possible ways around that. Either we explicitly request
> non-aligned mappings via a special MAP_$FOO (e.g. MAP_FIXED_SAFE) or
> simply opt out from the MAP_FIXED protection via ifdefs. The first
> option sounds more generic to me but also more tricky to not introduce
> other user visible effects. The later is quite straightforward. What do
> you think about the following on top of the previous patch?
> 
> It is rather terse and disables the MAP_FIXED protection for arm
> comletely because I couldn't find a way to make it conditional on
> CACHEID_VIPT_ALIASING. But this can be always handled later. I find the
> protection for other archtectures useful enough to have this working for
> most architectures now and handle others specially.

Can someone provide the background information for this please?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up


Re: [PATCH v2 0/5] Use ELF_ET_DYN_BASE only for PIE

2017-06-24 Thread Russell King - ARM Linux
On Fri, Jun 23, 2017 at 01:59:55PM -0700, Kees Cook wrote:
> This is v2 (to refresh the 5 patches in -mm) for moving ELF_ET_DYN_BASE
> safely lower. Changes are clarifications in the commit logs (suggested
> by mpe), a compat think-o fix for arm64 (thanks to Ard), and to add
> Rik and mpe's Acks.
> 
> Quoting patch 1/5:
> 
> The ELF_ET_DYN_BASE position was originally intended to keep loaders
> away from ET_EXEC binaries. (For example, running "/lib/ld-linux.so.2
> /bin/cat" might cause the subsequent load of /bin/cat into where the
> loader had been loaded.) With the advent of PIE (ET_DYN binaries with
> an INTERP Program Header), ELF_ET_DYN_BASE continued to be used since
> the kernel was only looking at ET_DYN. However, since ELF_ET_DYN_BASE
> is traditionally set at the top 1/3rd of the TASK_SIZE, a substantial
> portion of the address space is unused.

With existing kernels on ARM:

0001-00017000 r-xp  08:01 270810 /bin/cat
00026000-00027000 r--p 6000 08:01 270810 /bin/cat
00027000-00028000 rw-p 7000 08:01 270810 /bin/cat
7f661000-7f679000 r-xp  08:01 281659 
/lib/arm-linux-gnueabihf/ld-2.23.so
7f688000-7f689000 r--p 00017000 08:01 281659 
/lib/arm-linux-gnueabihf/ld-2.23.so
7f689000-7f68a000 rw-p 00018000 08:01 281659 
/lib/arm-linux-gnueabihf/ld-2.23.so

If the loader is loaded at 4MB, this means the size of an ET_EXEC
program is limited to less than 4MB - and distros aren't yet
building everything as PIE on ARM.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH 25/44] arm: implement ->mapping_error

2017-06-08 Thread Russell King - ARM Linux
BOn Thu, Jun 08, 2017 at 03:25:50PM +0200, Christoph Hellwig wrote:
> +static int dmabounce_mapping_error(struct device *dev, dma_addr_t dma_addr)
> +{
> + if (dev->archdata.dmabounce)
> + return 0;

I'm not convinced that we need this check here:

dev->archdata.dmabounce = device_info;
set_dma_ops(dev, _ops);

There shouldn't be any chance of dev->archdata.dmabounce being NULL if
the dmabounce_ops has been set as the current device DMA ops.  So I
think that test can be killed.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH 2/3] arm: kprobes: remove kprobe_exceptions_notify

2017-03-06 Thread Russell King - ARM Linux
On Mon, Mar 06, 2017 at 11:37:20PM +0530, Naveen N. Rao wrote:
> On 2017/02/08 01:24AM, Naveen N Rao wrote:
> > ... as the weak variant will do.
> > 
> > Signed-off-by: Naveen N. Rao 
> > ---
> >  arch/arm/probes/kprobes/core.c | 10 --
> >  arch/arm64/kernel/probes/kprobes.c |  6 --
> >  2 files changed, 16 deletions(-)
> 
> With the generic changes in this series now in -rc1, can you please pick 
> this up?

It would've been nice to have been in the To: or Cc: on this patch,
I suspect everyone on the ARM side ignored this series (I certainly
didn't notice it, and I suspect the ARM64 folk didn't notice it for
exactly the same reason.)

In any case, this patch needs to be split - ARM and ARM64 are
maintained separately (as stated in MAINTAINERS), and patches go via
different trees.  Please resubmit with the patch split between the
architectures and proper recipients in the headers.

Thanks.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH 1/3] futex: remove duplicated code

2017-03-04 Thread Russell King - ARM Linux
On Fri, Mar 03, 2017 at 01:27:10PM +0100, Jiri Slaby wrote:
> diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h
> index 6795368ad023..cc414382dab4 100644
> --- a/arch/arm/include/asm/futex.h
> +++ b/arch/arm/include/asm/futex.h
> @@ -128,20 +128,10 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user 
> *uaddr,
>  #endif /* !SMP */
>  
>  static inline int
> -futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
> +arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
>  {
> - int op = (encoded_op >> 28) & 7;
> - int cmp = (encoded_op >> 24) & 15;
> - int oparg = (encoded_op << 8) >> 20;
> - int cmparg = (encoded_op << 20) >> 20;
>   int oldval = 0, ret, tmp;
>  
> - if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
> - oparg = 1 << oparg;
> -
> - if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
> - return -EFAULT;
> -
>  #ifndef CONFIG_SMP
>   preempt_disable();
>  #endif
> @@ -172,17 +162,9 @@ futex_atomic_op_inuser (int encoded_op, u32 __user 
> *uaddr)
>   preempt_enable();
>  #endif
>  
> - if (!ret) {
> - switch (cmp) {
> - case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
> - case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
> - case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
> - case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
> - case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
> - case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
> - default: ret = -ENOSYS;
> - }
> - }
> + if (!ret)
> + *oval = oldval;
> +
>   return ret;
>  }
>  
> diff --git a/kernel/futex.c b/kernel/futex.c
> index b687cb22301c..c5ff9850952f 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -1457,6 +1457,42 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int 
> nr_wake, u32 bitset)
>   return ret;
>  }
>  
> +static int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
> +{
> + int op = (encoded_op >> 28) & 7;
> + int cmp = (encoded_op >> 24) & 15;
> + int oparg = (encoded_op << 8) >> 20;
> + int cmparg = (encoded_op << 20) >> 20;

Hmm.  oparg and cmparg look like they're doing these shifts to get sign
extension of the 12-bit values by assuming that "int" is 32-bit -
probably worth a comment, or for safety, they should be "s32" so it's
not dependent on the bit-width of "int".

> + int oldval, ret;
> +
> + if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
> + oparg = 1 << oparg;

I guess it doesn't matter that oparg can be >= the bit size of oparg
(so large values produce an undefined result) as it's no different
from userspace trying to do the same with large shifts.

> +
> + if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
> + return -EFAULT;
> +
> + ret = arch_futex_atomic_op_inuser(op, oparg, , uaddr);
> + if (ret)
> + return ret;
> +
> + switch (cmp) {
> + case FUTEX_OP_CMP_EQ:
> + return oldval == cmparg;
> + case FUTEX_OP_CMP_NE:
> + return oldval != cmparg;
> + case FUTEX_OP_CMP_LT:
> + return oldval < cmparg;
> + case FUTEX_OP_CMP_GE:
> + return oldval >= cmparg;
> + case FUTEX_OP_CMP_LE:
> + return oldval <= cmparg;
> + case FUTEX_OP_CMP_GT:
> + return oldval > cmparg;
> + default:
> + return -ENOSYS;
> + }
> +}
> +
>  /*
>   * Wake up all waiters hashed on the physical page that is mapped
>   * to this virtual address:

As it's no worse than our existing code, for the above,

Acked-by: Russell King 

Thanks.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH v3 03/37] treewide: Consolidate set_dma_ops() implementations

2017-01-30 Thread Russell King - ARM Linux
On Fri, Jan 20, 2017 at 01:04:03PM -0800, Bart Van Assche wrote:
> Now that all set_dma_ops() implementations are identical (ignoring
> BUG_ON() statements), remove the architecture specific definitions
> and add a definition in .
> 
> Signed-off-by: Bart Van Assche 
> Cc: Benjamin Herrenschmidt 
> Cc: Chris Metcalf 
> Cc: David Woodhouse 
> Cc: linux-a...@vger.kernel.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: Paul Mackerras 
> Cc: Russell King 

Acked-by: Russell King 

Thanks.

> ---
>  arch/arm/include/asm/dma-mapping.h | 6 --
>  arch/powerpc/include/asm/dma-mapping.h | 5 -
>  arch/tile/include/asm/dma-mapping.h| 5 -
>  include/linux/dma-mapping.h| 5 +
>  4 files changed, 5 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/arm/include/asm/dma-mapping.h 
> b/arch/arm/include/asm/dma-mapping.h
> index 312f4d0564d6..c7432d647e53 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -31,12 +31,6 @@ static inline const struct dma_map_ops *get_dma_ops(struct 
> device *dev)
>   return __generic_dma_ops(dev);
>  }
>  
> -static inline void set_dma_ops(struct device *dev, const struct dma_map_ops 
> *ops)
> -{
> - BUG_ON(!dev);
> - dev->dma_ops = ops;
> -}
> -
>  #define HAVE_ARCH_DMA_SUPPORTED 1
>  extern int dma_supported(struct device *dev, u64 mask);
>  
> diff --git a/arch/powerpc/include/asm/dma-mapping.h 
> b/arch/powerpc/include/asm/dma-mapping.h
> index 59fbd4abcbf8..8275603ba4d5 100644
> --- a/arch/powerpc/include/asm/dma-mapping.h
> +++ b/arch/powerpc/include/asm/dma-mapping.h
> @@ -91,11 +91,6 @@ static inline const struct dma_map_ops *get_dma_ops(struct 
> device *dev)
>   return dev->dma_ops;
>  }
>  
> -static inline void set_dma_ops(struct device *dev, const struct dma_map_ops 
> *ops)
> -{
> - dev->dma_ops = ops;
> -}
> -
>  /*
>   * get_dma_offset()
>   *
> diff --git a/arch/tile/include/asm/dma-mapping.h 
> b/arch/tile/include/asm/dma-mapping.h
> index c0620697eaad..2562995a6ac9 100644
> --- a/arch/tile/include/asm/dma-mapping.h
> +++ b/arch/tile/include/asm/dma-mapping.h
> @@ -59,11 +59,6 @@ static inline phys_addr_t dma_to_phys(struct device *dev, 
> dma_addr_t daddr)
>  
>  static inline void dma_mark_clean(void *addr, size_t size) {}
>  
> -static inline void set_dma_ops(struct device *dev, const struct dma_map_ops 
> *ops)
> -{
> - dev->dma_ops = ops;
> -}
> -
>  static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t 
> size)
>  {
>   if (!dev->dma_mask)
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index f1da68b82c63..e97f23e8b2d9 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -164,6 +164,11 @@ int dma_mmap_from_coherent(struct device *dev, struct 
> vm_area_struct *vma,
>  
>  #ifdef CONFIG_HAS_DMA
>  #include 
> +static inline void set_dma_ops(struct device *dev,
> +const struct dma_map_ops *dma_ops)
> +{
> + dev->dma_ops = dma_ops;
> +}
>  #else
>  /*
>   * Define the dma api to allow compilation but not linking of
> -- 
> 2.11.0
> 

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH v3 1/8] arm: put types.h in uapi

2017-01-13 Thread Russell King - ARM Linux
On Fri, Jan 13, 2017 at 11:46:39AM +0100, Nicolas Dichtel wrote:
> This header file is exported, thus move it to uapi.

I'm taking this patch, but with the following commit log:

  Due to the way kbuild works, this header was unintentionally exported
  back in 2013 when it was created, despite it not being in a uapi/
  directory.  This is very non-intuitive behaviour by Kbuild.

  However, we've had this include exported to userland for almost four
  years, and searching google for "ARM types.h __UINTPTR_TYPE__" gives
  no hint that anyone has complained about it.  So, let's make it
  officially exported in this state.

If anyone has any objections, they better shout sooner rather than
later.

> 
> Signed-off-by: Nicolas Dichtel 
> ---
>  arch/arm/include/asm/types.h  | 40 
> ---
>  arch/arm/include/uapi/asm/types.h | 40 
> +++
>  2 files changed, 40 insertions(+), 40 deletions(-)
>  delete mode 100644 arch/arm/include/asm/types.h
>  create mode 100644 arch/arm/include/uapi/asm/types.h
> 
> diff --git a/arch/arm/include/asm/types.h b/arch/arm/include/asm/types.h
> deleted file mode 100644
> index a53cdb8f068c..
> --- a/arch/arm/include/asm/types.h
> +++ /dev/null
> @@ -1,40 +0,0 @@
> -#ifndef _ASM_TYPES_H
> -#define _ASM_TYPES_H
> -
> -#include 
> -
> -/*
> - * The C99 types uintXX_t that are usually defined in 'stdint.h' are not as
> - * unambiguous on ARM as you would expect. For the types below, there is a
> - * difference on ARM between GCC built for bare metal ARM, GCC built for 
> glibc
> - * and the kernel itself, which results in build errors if you try to build 
> with
> - * -ffreestanding and include 'stdint.h' (such as when you include 
> 'arm_neon.h'
> - * in order to use NEON intrinsics)
> - *
> - * As the typedefs for these types in 'stdint.h' are based on builtin defines
> - * supplied by GCC, we can tweak these to align with the kernel's idea of 
> those
> - * types, so 'linux/types.h' and 'stdint.h' can be safely included from the 
> same
> - * source file (provided that -ffreestanding is used).
> - *
> - *int32_t uint32_t   uintptr_t
> - * bare metal GCC longunsigned long  unsigned int
> - * glibc GCC  int unsigned int   unsigned int
> - * kernel int unsigned int   unsigned long
> - */
> -
> -#ifdef __INT32_TYPE__
> -#undef __INT32_TYPE__
> -#define __INT32_TYPE__   int
> -#endif
> -
> -#ifdef __UINT32_TYPE__
> -#undef __UINT32_TYPE__
> -#define __UINT32_TYPE__  unsigned int
> -#endif
> -
> -#ifdef __UINTPTR_TYPE__
> -#undef __UINTPTR_TYPE__
> -#define __UINTPTR_TYPE__ unsigned long
> -#endif
> -
> -#endif /* _ASM_TYPES_H */
> diff --git a/arch/arm/include/uapi/asm/types.h 
> b/arch/arm/include/uapi/asm/types.h
> new file mode 100644
> index ..9435a42f575e
> --- /dev/null
> +++ b/arch/arm/include/uapi/asm/types.h
> @@ -0,0 +1,40 @@
> +#ifndef _UAPI_ASM_TYPES_H
> +#define _UAPI_ASM_TYPES_H
> +
> +#include 
> +
> +/*
> + * The C99 types uintXX_t that are usually defined in 'stdint.h' are not as
> + * unambiguous on ARM as you would expect. For the types below, there is a
> + * difference on ARM between GCC built for bare metal ARM, GCC built for 
> glibc
> + * and the kernel itself, which results in build errors if you try to build 
> with
> + * -ffreestanding and include 'stdint.h' (such as when you include 
> 'arm_neon.h'
> + * in order to use NEON intrinsics)
> + *
> + * As the typedefs for these types in 'stdint.h' are based on builtin defines
> + * supplied by GCC, we can tweak these to align with the kernel's idea of 
> those
> + * types, so 'linux/types.h' and 'stdint.h' can be safely included from the 
> same
> + * source file (provided that -ffreestanding is used).
> + *
> + *int32_t uint32_t   uintptr_t
> + * bare metal GCC longunsigned long  unsigned int
> + * glibc GCC  int unsigned int   unsigned int
> + * kernel int unsigned int   unsigned long
> + */
> +
> +#ifdef __INT32_TYPE__
> +#undef __INT32_TYPE__
> +#define __INT32_TYPE__   int
> +#endif
> +
> +#ifdef __UINT32_TYPE__
> +#undef __UINT32_TYPE__
> +#define __UINT32_TYPE__  unsigned int
> +#endif
> +
> +#ifdef __UINTPTR_TYPE__
> +#undef __UINTPTR_TYPE__
> +#define __UINTPTR_TYPE__ unsigned long
> +#endif
> +
> +#endif /* _UAPI_ASM_TYPES_H */
> -- 
> 2.8.1
> 

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH v3 1/8] arm: put types.h in uapi

2017-01-13 Thread Russell King - ARM Linux
On Fri, Jan 13, 2017 at 05:01:01PM +0100, Nicolas Dichtel wrote:
> Please, do not remove the email subject when you reply. I restore it to
> ease the thread follow-up.

I mentioned it to David, and he says it's because the long list of
recipients is breaking his mailer.  I've already posed the question
about whether that's exploitable!

> Le 13/01/2017 à 16:36, David Howells a écrit :
> > Nicolas Dichtel  wrote:
> > 
> >> This header file is exported, thus move it to uapi.
> > 
> > Exported how?
> 
> It is listed in include/uapi/asm-generic/Kbuild.asm, which is included by
> arch/arm/include/uapi/asm/Kbuild.

We really should not be installing non-uapi header files to userland
under _any_ circumstance - this to me sounds like a bug in kbuild.

The assumption is that headers outside of uapi directories are not
part of the user visible API, and so can be freely modified - which
in the presence of this bug is untrue.

However, as it's happening, and this header has been there since 2013
(commit 09096f6a0ee2 - "ARM: 7822/1: add workaround for ambiguous C99
stdint.h types") it's now well and truely part of the user API whether
we intended it to be or not, so your patch looks to me like the correct
thing to do.

I think it needs further evaluation to make sure kbuild isn't going to
do something else silly, like subsitute include/asm-generic/types.h for
the now missing arch/arm/include/asm/types.h

I wonder how many more headers are unintentionally exported.

... what a mess. :(

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH v2 7/7] uapi: export all headers under uapi directories

2017-01-09 Thread Russell King - ARM Linux
On Fri, Jan 06, 2017 at 10:43:59AM +0100, Nicolas Dichtel wrote:
> diff --git a/arch/arm/include/uapi/asm/Kbuild 
> b/arch/arm/include/uapi/asm/Kbuild
> index 46a76cd6acb6..607f702c2d62 100644
> --- a/arch/arm/include/uapi/asm/Kbuild
> +++ b/arch/arm/include/uapi/asm/Kbuild
> @@ -1,23 +1,6 @@
>  # UAPI Header export list
>  include include/uapi/asm-generic/Kbuild.asm
>  
> -header-y += auxvec.h
> -header-y += byteorder.h
> -header-y += fcntl.h
> -header-y += hwcap.h
> -header-y += ioctls.h
> -header-y += kvm_para.h
> -header-y += mman.h
> -header-y += perf_regs.h
> -header-y += posix_types.h
> -header-y += ptrace.h
> -header-y += setup.h
> -header-y += sigcontext.h
> -header-y += signal.h
> -header-y += stat.h
> -header-y += statfs.h
> -header-y += swab.h
> -header-y += unistd.h
>  genhdr-y += unistd-common.h
>  genhdr-y += unistd-oabi.h
>  genhdr-y += unistd-eabi.h

Acked-by: Russell King 

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH v2 1/7] arm: put types.h in uapi

2017-01-09 Thread Russell King - ARM Linux
On Mon, Jan 09, 2017 at 12:33:02PM +0100, Arnd Bergmann wrote:
> On Friday, January 6, 2017 10:43:53 AM CET Nicolas Dichtel wrote:
> > 
> > diff --git a/arch/arm/include/asm/types.h b/arch/arm/include/asm/types.h
> > index a53cdb8f068c..c48fee3d7b3b 100644
> > --- a/arch/arm/include/asm/types.h
> > +++ b/arch/arm/include/asm/types.h
> > @@ -1,40 +1,6 @@
> >  #ifndef _ASM_TYPES_H
> >  #define _ASM_TYPES_H
> >  
> > -#include 
> ...
> > -#define __UINTPTR_TYPE__   unsigned long
> > -#endif
> > +#include 
> >  
> >  #endif /* _ASM_TYPES_H */
> > 
> 
> Moving the file is correct as far as I can tell, but the extra
> #include is not necessary here, as the kernel will automatically
> search both arch/arm/include/ and arch/arm/include/uapi/.

Indeed, I'd like to see the include/asm file gone.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH 1/1] sched: provide common cpu_relax_yield definition

2016-11-16 Thread Russell King - ARM Linux
On Wed, Nov 16, 2016 at 01:23:05PM +0100, Christian Borntraeger wrote:
> No need to duplicate the same define everywhere. Since
> the only user is stop-machine and the only provider is
> s390, we can use a default implementation of cpu_relax_yield
> in sched.h.
> 
> Suggested-by: Russell King 
> Signed-off-by: Christian Borntraeger 

Thanks for that.  (Please change my address above when adding this ack...)

Acked-by: Russell King 

> ---
>  arch/alpha/include/asm/processor.h  | 1 -
>  arch/arc/include/asm/processor.h| 3 ---
>  arch/arm/include/asm/processor.h| 2 --
>  arch/arm64/include/asm/processor.h  | 2 --
>  arch/avr32/include/asm/processor.h  | 1 -
>  arch/blackfin/include/asm/processor.h   | 1 -
>  arch/c6x/include/asm/processor.h| 1 -
>  arch/cris/include/asm/processor.h   | 1 -
>  arch/frv/include/asm/processor.h| 1 -
>  arch/h8300/include/asm/processor.h  | 1 -
>  arch/hexagon/include/asm/processor.h| 1 -
>  arch/ia64/include/asm/processor.h   | 1 -
>  arch/m32r/include/asm/processor.h   | 1 -
>  arch/m68k/include/asm/processor.h   | 1 -
>  arch/metag/include/asm/processor.h  | 1 -
>  arch/microblaze/include/asm/processor.h | 1 -
>  arch/mips/include/asm/processor.h   | 1 -
>  arch/mn10300/include/asm/processor.h| 1 -
>  arch/nios2/include/asm/processor.h  | 1 -
>  arch/openrisc/include/asm/processor.h   | 1 -
>  arch/parisc/include/asm/processor.h | 1 -
>  arch/powerpc/include/asm/processor.h| 2 --
>  arch/s390/include/asm/processor.h   | 1 +
>  arch/score/include/asm/processor.h  | 1 -
>  arch/sh/include/asm/processor.h | 1 -
>  arch/sparc/include/asm/processor_32.h   | 1 -
>  arch/sparc/include/asm/processor_64.h   | 1 -
>  arch/tile/include/asm/processor.h   | 2 --
>  arch/unicore32/include/asm/processor.h  | 1 -
>  arch/x86/include/asm/processor.h| 2 --
>  arch/x86/um/asm/processor.h | 1 -
>  arch/xtensa/include/asm/processor.h | 1 -
>  include/linux/sched.h   | 4 
>  33 files changed, 5 insertions(+), 38 deletions(-)
> 
> diff --git a/arch/alpha/include/asm/processor.h 
> b/arch/alpha/include/asm/processor.h
> index 31e8dbe..2fec2de 100644
> --- a/arch/alpha/include/asm/processor.h
> +++ b/arch/alpha/include/asm/processor.h
> @@ -58,7 +58,6 @@ unsigned long get_wchan(struct task_struct *p);
>((tsk) == current ? rdusp() : task_thread_info(tsk)->pcb.usp)
>  
>  #define cpu_relax()  barrier()
> -#define cpu_relax_yield() cpu_relax()
>  
>  #define ARCH_HAS_PREFETCH
>  #define ARCH_HAS_PREFETCHW
> diff --git a/arch/arc/include/asm/processor.h 
> b/arch/arc/include/asm/processor.h
> index d102a49..6e1242d 100644
> --- a/arch/arc/include/asm/processor.h
> +++ b/arch/arc/include/asm/processor.h
> @@ -60,15 +60,12 @@ struct task_struct;
>  #ifndef CONFIG_EZNPS_MTM_EXT
>  
>  #define cpu_relax()  barrier()
> -#define cpu_relax_yield()cpu_relax()
>  
>  #else
>  
>  #define cpu_relax() \
>   __asm__ __volatile__ (".word %0" : : "i"(CTOP_INST_SCHD_RW) : "memory")
>  
> -#define cpu_relax_yield()cpu_relax()
> -
>  #endif
>  
>  #define copy_segments(tsk, mm)  do { } while (0)
> diff --git a/arch/arm/include/asm/processor.h 
> b/arch/arm/include/asm/processor.h
> index 9e71c58b..c3d5fc1 100644
> --- a/arch/arm/include/asm/processor.h
> +++ b/arch/arm/include/asm/processor.h
> @@ -82,8 +82,6 @@ unsigned long get_wchan(struct task_struct *p);
>  #define cpu_relax()  barrier()
>  #endif
>  
> -#define cpu_relax_yield()  cpu_relax()
> -
>  #define task_pt_regs(p) \
>   ((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
>  
> diff --git a/arch/arm64/include/asm/processor.h 
> b/arch/arm64/include/asm/processor.h
> index 6132f64..747c65a 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -149,8 +149,6 @@ static inline void cpu_relax(void)
>   asm volatile("yield" ::: "memory");
>  }
>  
> -#define cpu_relax_yield() cpu_relax()
> -
>  /* Thread switching */
>  extern struct task_struct *cpu_switch_to(struct task_struct *prev,
>struct task_struct *next);
> diff --git a/arch/avr32/include/asm/processor.h 
> b/arch/avr32/include/asm/processor.h
> index ee62365..972adcc 100644
> --- a/arch/avr32/include/asm/processor.h
> +++ b/arch/avr32/include/asm/processor.h
> @@ -92,7 +92,6 @@ extern struct avr32_cpuinfo boot_cpu_data;
>  #define TASK_UNMAPPED_BASE   (PAGE_ALIGN(TASK_SIZE / 3))
>  
>  #define cpu_relax()  barrier()
> -#define cpu_relax_yield()cpu_relax()
>  #define cpu_sync_pipeline()  asm volatile("sub pc, -2" : : : "memory")
>  
>  struct cpu_context {
> diff --git a/arch/blackfin/include/asm/processor.h 
> b/arch/blackfin/include/asm/processor.h
> index 57acfb1..85d4af9 100644
> 

Re: [GIT PULL v2 1/5] processor.h: introduce cpu_relax_yield

2016-11-15 Thread Russell King - ARM Linux
On Tue, Nov 15, 2016 at 02:19:53PM +0100, Christian Borntraeger wrote:
> On 11/15/2016 01:30 PM, Russell King - ARM Linux wrote:
> > On Tue, Oct 25, 2016 at 11:03:11AM +0200, Christian Borntraeger wrote:
> >> For spinning loops people do often use barrier() or cpu_relax().
> >> For most architectures cpu_relax and barrier are the same, but on
> >> some architectures cpu_relax can add some latency.
> >> For example on power,sparc64 and arc, cpu_relax can shift the CPU
> >> towards other hardware threads in an SMT environment.
> >> On s390 cpu_relax does even more, it uses an hypercall to the
> >> hypervisor to give up the timeslice.
> >> In contrast to the SMT yielding this can result in larger latencies.
> >> In some places this latency is unwanted, so another variant
> >> "cpu_relax_lowlatency" was introduced. Before this is used in more
> >> and more places, lets revert the logic and provide a cpu_relax_yield
> >> that can be called in places where yielding is more important than
> >> latency. By default this is the same as cpu_relax on all architectures.
> > 
> > Rather than having to update all these architectures in this way, can't
> > we put in some linux/*.h header something like:
> > 
> > #ifndef cpu_relax_yield
> > #define cpu_relax_yield() cpu_relax()
> > #endif
> > 
> > so only those architectures that need to do something need to be
> > modified?
> 
> These patches are part of linux-next since a month or so, changing that 
> would invalidate all the next testing. If people want that, I can certainly
> do that, though.

It's three weeks since you posted them.  For one of those weeks (the
week you posted them) I was away, and missed them while catching up.
Sorry, but it sometimes takes a while to spot things amongst the
backlog, and normally takes some subsequent activity on the thread to
bring it back into view.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [GIT PULL v2 1/5] processor.h: introduce cpu_relax_yield

2016-11-15 Thread Russell King - ARM Linux
On Tue, Oct 25, 2016 at 11:03:11AM +0200, Christian Borntraeger wrote:
> For spinning loops people do often use barrier() or cpu_relax().
> For most architectures cpu_relax and barrier are the same, but on
> some architectures cpu_relax can add some latency.
> For example on power,sparc64 and arc, cpu_relax can shift the CPU
> towards other hardware threads in an SMT environment.
> On s390 cpu_relax does even more, it uses an hypercall to the
> hypervisor to give up the timeslice.
> In contrast to the SMT yielding this can result in larger latencies.
> In some places this latency is unwanted, so another variant
> "cpu_relax_lowlatency" was introduced. Before this is used in more
> and more places, lets revert the logic and provide a cpu_relax_yield
> that can be called in places where yielding is more important than
> latency. By default this is the same as cpu_relax on all architectures.

Rather than having to update all these architectures in this way, can't
we put in some linux/*.h header something like:

#ifndef cpu_relax_yield
#define cpu_relax_yield() cpu_relax()
#endif

so only those architectures that need to do something need to be
modified?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: Kexec regression in next-20160906

2016-09-07 Thread Russell King - ARM Linux
On Tue, Sep 06, 2016 at 08:33:20PM -0300, Thiago Jung Bauermann wrote:
> Thanks for reporting the problem and finding the commit that caused it.
> The only thing in commit 5c01cdd2d4bc which can affect kexec_load is the 
> fact that struct kexec_segment has a new member.
> 
> This is probably breaking the ABI on ARM, then. I verified that kexec_load 
> kept working on ppc64le with a kexec binary compiled with the original 
> struct kexec_segment definition, but apparently I got lucky.

That _will_ definitely break the ABI on ARM, and pretty much all
32-bit architectures.  It's a silly thing to do, and I'm really
surprised that it passed through review without being spotted.

The reason you "got lucky" with ppc64le is that there was probably
padding in the structure, and you happened to place your new member
in that padding, so the structure size didn't change.

For 32-bit architectures, there will be no padding - both the pointers
and size_t members will be 32-bit, and will be naturally aligned, and
hence there will be no padding.  Any addition to the structure will
change the size of the structure.

Any change to a UAPI header needs to be carefully considered and
questioned as it is always a potential userspace breakage - and in
the kernel, we're supposed to be doing our up-most to avoid
breaking userspace.

It's not like it was in the old days when we didn't have the UAPI
seperate - today, we can find these things by looking at the patch
diffstat and seeing whether any file in "uapi" is touched.  That
should be the trigger for a really in-depth review of the change.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH v5 2/7] perf annotate: Add cross arch annotate support

2016-08-19 Thread Russell King - ARM Linux
On Fri, Aug 19, 2016 at 04:09:51PM +0530, Ravi Bangoria wrote:
> Thanks Russell for reviewing.
> 
> On Friday 19 August 2016 01:20 PM, Russell King - ARM Linux wrote:
> > On Fri, Aug 19, 2016 at 10:59:01AM +0530, Ravi Bangoria wrote:
> >> -static struct ins instructions[] = {
> >> +static struct ins instructions_x86[] = {
> >>{ .name = "add",   .ops  = _ops, },
> >>{ .name = "addl",  .ops  = _ops, },
> >>{ .name = "addq",  .ops  = _ops, },
> >>{ .name = "addw",  .ops  = _ops, },
> >>{ .name = "and",   .ops  = _ops, },
> >> -#ifdef __arm__
> >> -  { .name = "b", .ops  = _ops, }, // might also be a call
> >> -  { .name = "bcc",   .ops  = _ops, },
> >> -  { .name = "bcs",   .ops  = _ops, },
> >> -  { .name = "beq",   .ops  = _ops, },
> >> -  { .name = "bge",   .ops  = _ops, },
> >> -  { .name = "bgt",   .ops  = _ops, },
> >> -  { .name = "bhi",   .ops  = _ops, },
> >> -  { .name = "bl",.ops  = _ops, },
> >> -  { .name = "bls",   .ops  = _ops, },
> >> -  { .name = "blt",   .ops  = _ops, },
> >> -  { .name = "blx",   .ops  = _ops, },
> >> -  { .name = "bne",   .ops  = _ops, },
> >> -#endif
> > Notice that ARM includes a lot of other instructions from this table,
> > not just those above.
> >
> >>{ .name = "bts",   .ops  = _ops, },
> >>{ .name = "call",  .ops  = _ops, },
> >>{ .name = "callq", .ops  = _ops, },
> >> @@ -456,6 +444,21 @@ static struct ins instructions[] = {
> >>{ .name = "retq",  .ops  = _ops, },
> >>  };
> >>  
> >> +static struct ins instructions_arm[] = {
> >> +  { .name = "b", .ops  = _ops, }, /* might also be a call */
> >> +  { .name = "bcc",   .ops  = _ops, },
> >> +  { .name = "bcs",   .ops  = _ops, },
> >> +  { .name = "beq",   .ops  = _ops, },
> >> +  { .name = "bge",   .ops  = _ops, },
> >> +  { .name = "bgt",   .ops  = _ops, },
> >> +  { .name = "bhi",   .ops  = _ops, },
> >> +  { .name = "bl",.ops  = _ops, },
> >> +  { .name = "bls",   .ops  = _ops, },
> >> +  { .name = "blt",   .ops  = _ops, },
> >> +  { .name = "blx",   .ops  = _ops, },
> >> +  { .name = "bne",   .ops  = _ops, },
> >> +};
> >> +
> > ...
> >> +  if (!strcmp(norm_arch, NORM_X86)) {
> >> +  instructions = instructions_x86;
> >> +  nmemb = ARRAY_SIZE(instructions_x86);
> >> +  } else if (!strcmp(norm_arch, NORM_ARM)) {
> >> +  instructions = instructions_arm;
> >> +  nmemb = ARRAY_SIZE(instructions_arm);
> > But these changes result in _only_ the ones that were in the #if __arm__
> > being matched.  This is wrong.
> >
> > If we want to go that way, we need to add _all_ arm instructions to
> > instructions_arm, not just those within the #if.
> 
> Yes, I've mentioned same in cover letter as well.
> 
> Can I add all x86 instructions for arm as well? If not, can you please
> provide a list of arm instructions that needs to be added here.

If it were me doing a change like this, I'd be trying to preserve the
current behaviour to avoid causing regressions, which would mean
ensuring that all the instructions that were visible before the change
remain visible after the change, even those which are obviously x86
specific but were still in the table anyway.  It then becomes a cleanup
matter later to remove those which aren't relevent, rather than having
to chase around wondering why the tool broke.

I'm afraid I don't have time to look at this (I'm chasing regressions
and bugs in the kernel) so I'd suggest you try to avoid causing
regressions in this tool...

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH v5 2/7] perf annotate: Add cross arch annotate support

2016-08-19 Thread Russell King - ARM Linux
On Fri, Aug 19, 2016 at 10:59:01AM +0530, Ravi Bangoria wrote:
> -static struct ins instructions[] = {
> +static struct ins instructions_x86[] = {
>   { .name = "add",   .ops  = _ops, },
>   { .name = "addl",  .ops  = _ops, },
>   { .name = "addq",  .ops  = _ops, },
>   { .name = "addw",  .ops  = _ops, },
>   { .name = "and",   .ops  = _ops, },
> -#ifdef __arm__
> - { .name = "b", .ops  = _ops, }, // might also be a call
> - { .name = "bcc",   .ops  = _ops, },
> - { .name = "bcs",   .ops  = _ops, },
> - { .name = "beq",   .ops  = _ops, },
> - { .name = "bge",   .ops  = _ops, },
> - { .name = "bgt",   .ops  = _ops, },
> - { .name = "bhi",   .ops  = _ops, },
> - { .name = "bl",.ops  = _ops, },
> - { .name = "bls",   .ops  = _ops, },
> - { .name = "blt",   .ops  = _ops, },
> - { .name = "blx",   .ops  = _ops, },
> - { .name = "bne",   .ops  = _ops, },
> -#endif

Notice that ARM includes a lot of other instructions from this table,
not just those above.

>   { .name = "bts",   .ops  = _ops, },
>   { .name = "call",  .ops  = _ops, },
>   { .name = "callq", .ops  = _ops, },
> @@ -456,6 +444,21 @@ static struct ins instructions[] = {
>   { .name = "retq",  .ops  = _ops, },
>  };
>  
> +static struct ins instructions_arm[] = {
> + { .name = "b", .ops  = _ops, }, /* might also be a call */
> + { .name = "bcc",   .ops  = _ops, },
> + { .name = "bcs",   .ops  = _ops, },
> + { .name = "beq",   .ops  = _ops, },
> + { .name = "bge",   .ops  = _ops, },
> + { .name = "bgt",   .ops  = _ops, },
> + { .name = "bhi",   .ops  = _ops, },
> + { .name = "bl",.ops  = _ops, },
> + { .name = "bls",   .ops  = _ops, },
> + { .name = "blt",   .ops  = _ops, },
> + { .name = "blx",   .ops  = _ops, },
> + { .name = "bne",   .ops  = _ops, },
> +};
> +
...
> + if (!strcmp(norm_arch, NORM_X86)) {
> + instructions = instructions_x86;
> + nmemb = ARRAY_SIZE(instructions_x86);
> + } else if (!strcmp(norm_arch, NORM_ARM)) {
> + instructions = instructions_arm;
> + nmemb = ARRAY_SIZE(instructions_arm);

But these changes result in _only_ the ones that were in the #if __arm__
being matched.  This is wrong.

If we want to go that way, we need to add _all_ arm instructions to
instructions_arm, not just those within the #if.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [RFC 3/3] kexec: extend kexec_file_load system call

2016-07-21 Thread Russell King - ARM Linux
On Wed, Jul 20, 2016 at 11:41:35AM +, David Laight wrote:
> From: Dave Young
> > I do not think it is worth to add another syscall for extra fds.
> > We have open(2) as an example for different numbers of arguments
> > already.
> 
> Probably works 'by luck' and no one has actually thought about why.
> That ioctl() works is (probably) even more lucky.
> 
> There are ABI that use different calling conventions for varags functions
> (eg always stack all the arguments). I guess linux doesn't run on any of them.
> 
> ioctl() is a particular problem because the 'arg' might be an integer or a 
> pointer.
> Fortunately all the 64bit ABI linux uses pass the arg parameter in a register
> (and don't use different registers for pointer and data arguments).
> 
> You could have two 'libc' functions that refer to the same system call entry.
> Certainly safer than a varargs function.

Don't forget that the syscall API is not a normal C function API - it's
special, because there's little point stacking arguments on the userspace
stack and then having the kernel function try and read them off the
kernelspace stack.

If an architecture does such a thing, then it needs special veneers to
handle that (reading off the userspace stack and placing them onto the
kernelspace stack, or the arch needs to define some other method of
handling the situation.)

So, really, the actual C APIs don't matter that much - what matters more
is the definition of a sane way to pass such arguments.  Given the
extensive historical nature of open() and ioctl(), it would be completely
silly not to create something which allows these calls to work.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC 0/3] extend kexec_file_load system call

2016-07-20 Thread Russell King - ARM Linux
On Wed, Jul 20, 2016 at 01:45:42PM +1000, Balbir Singh wrote:
> > IOW, if your kernel forced signature verification, you should not be
> > able to do sig_enforce=0. If you kernel did not have
> > CONFIG_MODULE_SIG_FORCE=y, then sig_enforce should be 0 by default anyway
> > and you are not making it worse using command line.
> 
> OK.. I checked and you are right, but that is an example and there are
> other things like security=, thermal.*, nosmep, nosmap that need auditing
> for safety and might hurt the system security if used. I still think
> think that assuming you can pass any command line without breaking security
> is a broken argument.

Quite, and you don't need to run code in a privileged environment to do
any of that.

It's also not trivial to protect against: new kernels gain new arguments
which older kernels may not know about.  No matter how much protection
is built into older kernels, newer kernels can become vulnerable through
the addition of further arguments.

Also, how sure are we that there are no stack overflow issues with kernel
command line parsing?  Can we be sure that there's none?  This is
something which happens early in the kernel boot, before the full memory
protections have been set up.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC 0/3] extend kexec_file_load system call

2016-07-15 Thread Russell King - ARM Linux
On Fri, Jul 15, 2016 at 09:26:10AM -0400, Vivek Goyal wrote:
> On Fri, Jul 15, 2016 at 09:31:02AM +0200, Arnd Bergmann wrote:
> > I think that helps, as it makes the problem space correspond to that
> > of modifying the command line, but I can still come up with countless
> > attacks based on modifications of the /chosen node and/or the command
> > line, in fact it's probably easier than any other node.
> 
> I don't know anything about DTB. So here comes a very basic question. Does
> DTB allow passing an executable blob to kernel or pass the location of
> some unsigned executable code at kernel level.

DT on ARM is a description of the hardware - it can be thought of as a
set of nodes with properties attached.  The properties can describe
anything (we have documentation in Documentation/devicetree/bindings
which describes what we expect the properties to contain.)

On other architectures, DT can also contain open-firmware "functions"
but I don't think there's much support in the kernel for that - maybe
the PPC folk can reply on that point.

It is possible that someone may, at some point, decide to create a
property that points to some executable blob, but I can't think of a
reason why we should ever allow such a monstrosity in mainline kernels.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC 0/3] extend kexec_file_load system call

2016-07-15 Thread Russell King - ARM Linux
On Wed, Jul 13, 2016 at 03:13:42PM +0200, Arnd Bergmann wrote:
> On Wednesday, July 13, 2016 10:41:28 AM CEST Mark Rutland wrote:
> > The big question is whether this is a realistic case on a secure boot
> > system.
> 
> What does x86 do here? I assume changes to the command line are also
> limited.

They aren't.  You can specify /anything/ even with a fully-signed kernel
and initrd, which was one of the things I pointed out in my previous
set of responses.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Russell King - ARM Linux
On Wed, Jul 13, 2016 at 09:03:38AM -0400, Vivek Goyal wrote:
> On Wed, Jul 13, 2016 at 09:26:39AM +0100, Russell King - ARM Linux wrote:
> > Indeed - maybe Eric knows better, but I can't see any situation where
> > the dtb we load via kexec should ever affect "the bootloader", unless
> > the "kernel" that's being loaded into kexec is "the bootloader".
> > 
> > Now, going back to the more fundamental issue raised in my first reply,
> > about the kernel command line.
> > 
> > On x86, I can see that it _is_ possible for userspace to specify a
> > command line, and the kernel loading the image provides the command
> > line to the to-be-kexeced kernel with very little checking.  So, if
> > your kernel is signed, what stops the "insecure userspace" loading
> > a signed kernel but giving it an insecure rootfs and/or console?
> 
> It is not kexec specific. I could do this for regular boot too, right?
> 
> Command line options are not signed. I thought idea behind secureboot
> was to execute only trusted code and command line options don't enforce
> you to execute unsigned code.
> 
> So it sounds like different class of security problems which you are
> referring to and not necessarily covered by secureboot or signed
> kernel.

Let me give you an example.

You have a secure boot setup, where the firmware/ROM validates the boot
loader.  Good, the boot loader hasn't been tampered with.

You interrupt the boot loader and are able to modify the command line
for the booted kernel.

The boot loader loads the kernel and verifies the kernel's signature.
Good, the kernel hasn't been tampered with.  The kernel starts running.

You've plugged in a USB drive to the device, and specified a partition
containing a root filesystem that you control to the kernel.  The
validated kernel finds the USB drive, and mounts it, and executes
your own binaries on the USB drive.

You run a shell on the console.  You now have control of the system,
and can mount the real rootfs, inspect it, and work out what it does,
etc.

At this point, what use was all the validation that the secure boot
has done?  Absolutely useless.

If you can change the command line arguments given to the kernel, you
have no security, no matter how much you verify signatures.  It's
the illusion of security, nothing more, nothing less.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Russell King - ARM Linux
On Wed, Jul 13, 2016 at 05:55:33PM +1000, Stewart Smith wrote:
> Russell King - ARM Linux <li...@armlinux.org.uk> writes:
> > On Wed, Jul 13, 2016 at 02:59:51PM +1000, Stewart Smith wrote:
> >> Russell King - ARM Linux <li...@armlinux.org.uk> writes:
> >> > On Tue, Jul 12, 2016 at 10:58:05PM +0200, Petr Tesarik wrote:
> >> >> I'm not an expert on DTB, so I can't provide an example of code
> >> >> execution, but you have already mentioned the /chosen/linux,stdout-path
> >> >> property. If an attacker redirects the bootloader to an insecure
> >> >> console, they may get access to the system that would otherwise be
> >> >> impossible.
> >> >
> >> > I fail to see how kexec connects with the boot loader - the DTB image
> >> > that's being talked about is one which is passed from the currently
> >> > running kernel to the to-be-kexec'd kernel.  For ARM (and I suspect
> >> > also ARM64) that's a direct call chain which doesn't involve any
> >> > boot loader or firmware, and certainly none that would involve the
> >> > passed DTB image.
> >> 
> >> For OpenPOWER machines, kexec is the bootloader. Our bootloader is a
> >> linux kernel and initramfs with a UI (petitboot) - this means we never
> >> have to write a device driver twice: write a kernel one and you're done
> >> (for booting from the device and using it in your OS).
> >
> > I think you misunderstood my point.
> >
> > On ARM, we do not go:
> >
> > kernel (kexec'd from) -> boot loader -> kernel (kexec'd to)
> >
> > but we go:
> >
> > kernel (kexec'd from) -> kernel (kexec'd to)
> >
> > There's no intermediate step involving any bootloader.
> >
> > Hence, my point is that the dtb loaded by kexec is _only_ used by the
> > kernel which is being kexec'd to, not by the bootloader, nor indeed
> > the kernel which it is loaded into.
> >
> > Moreover, if you read the bit that I quoted (which is what I was
> > replying to), you'll notice that it is talking about the DTB loaded
> > by kexec somehow causing the _bootloader_ to be redirected to an
> > alternative console.  This point is wholely false on ARM.
> 
> Ahh.. I missed the bootloader bit there.
> 
> In which case, we're the same on OpenPOWER, there is no intermediate
> bootloader - in our case we have linux (with kexec) taking on what uboot
> or grub is typically used for on other platforms.

Indeed - maybe Eric knows better, but I can't see any situation where
the dtb we load via kexec should ever affect "the bootloader", unless
the "kernel" that's being loaded into kexec is "the bootloader".

Now, going back to the more fundamental issue raised in my first reply,
about the kernel command line.

On x86, I can see that it _is_ possible for userspace to specify a
command line, and the kernel loading the image provides the command
line to the to-be-kexeced kernel with very little checking.  So, if
your kernel is signed, what stops the "insecure userspace" loading
a signed kernel but giving it an insecure rootfs and/or console?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Russell King - ARM Linux
On Wed, Jul 13, 2016 at 09:47:56AM +0200, Ard Biesheuvel wrote:
> On 13 July 2016 at 09:36, Russell King - ARM Linux
> <li...@armlinux.org.uk> wrote:
> > On Wed, Jul 13, 2016 at 02:59:51PM +1000, Stewart Smith wrote:
> >> Russell King - ARM Linux <li...@armlinux.org.uk> writes:
> >> > On Tue, Jul 12, 2016 at 10:58:05PM +0200, Petr Tesarik wrote:
> >> >> I'm not an expert on DTB, so I can't provide an example of code
> >> >> execution, but you have already mentioned the /chosen/linux,stdout-path
> >> >> property. If an attacker redirects the bootloader to an insecure
> >> >> console, they may get access to the system that would otherwise be
> >> >> impossible.
> >> >
> >> > I fail to see how kexec connects with the boot loader - the DTB image
> >> > that's being talked about is one which is passed from the currently
> >> > running kernel to the to-be-kexec'd kernel.  For ARM (and I suspect
> >> > also ARM64) that's a direct call chain which doesn't involve any
> >> > boot loader or firmware, and certainly none that would involve the
> >> > passed DTB image.
> >>
> >> For OpenPOWER machines, kexec is the bootloader. Our bootloader is a
> >> linux kernel and initramfs with a UI (petitboot) - this means we never
> >> have to write a device driver twice: write a kernel one and you're done
> >> (for booting from the device and using it in your OS).
> >
> > I think you misunderstood my point.
> >
> > On ARM, we do not go:
> >
> > kernel (kexec'd from) -> boot loader -> kernel (kexec'd to)
> >
> > but we go:
> >
> > kernel (kexec'd from) -> kernel (kexec'd to)
> >
> > There's no intermediate step involving any bootloader.
> >
> > Hence, my point is that the dtb loaded by kexec is _only_ used by the
> > kernel which is being kexec'd to, not by the bootloader, nor indeed
> > the kernel which it is loaded into.
> >
> > Moreover, if you read the bit that I quoted (which is what I was
> > replying to), you'll notice that it is talking about the DTB loaded
> > by kexec somehow causing the _bootloader_ to be redirected to an
> > alternative console.  This point is wholely false on ARM.
> >
> 
> The particular example may not apply, but the argument that the DTB
> -as a description of the hardware topology- needs to be signed if the
> kernel is also signed is valid. We do the same in the UEFI stub, i.e.,
> it normally takes a dtb= argument to allow the DTB to be overridden,
> but this feature is disabled when Secure Boot is in effect. By the
> same reasoning, if any kind of kexec kernel image validation is in
> effect, we should either validate the DTB image as well, or disallow
> external DTBs and only perform kexec with the kernel's current DTB
> (the blob it was booted with, not the unflattened data structure)

*Sigh*  yes, I know full well, which is why I said what I said in my
_first_ reply:

"However, your point is valid as an attacker can redirect the console
 and/or mounted root on the to-be-kexec'd kernel if they can modify
 the DTB - and there's a whole host of subtle ways to do that, not
 necessarily just modification of the kernel command line."

and I went on to raise a valid point about the necessity to do that
for crashdump, which has been _completely_ ignored.

So, I just stopped reading your reply after the first three lines,
because we are in fact in agreement... but thanks for trying to waste
my time.

Please, keep with the overall discussion, and stop replying to a single
email as a whole point in isolation to every other email in the thread.
And stop bikeshedding, by picking up on the easy stuff but ignoring the
more fundamental points, like the crashdump issue I mentioned in my
first reply and now this reply.

Thanks.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

  1   2   3   >