[PATCH 2/2] powerpc/configs: Add PPC4xx_OCM to ppc40x_defconfig

2018-12-31 Thread Michael Ellerman
There was recently a compilation break in this driver, but we didn't
notice because none of our defconfigs have it enabled. Fix that.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/configs/ppc40x_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/configs/ppc40x_defconfig 
b/arch/powerpc/configs/ppc40x_defconfig
index 10fb1df63b46..689d7e276769 100644
--- a/arch/powerpc/configs/ppc40x_defconfig
+++ b/arch/powerpc/configs/ppc40x_defconfig
@@ -85,3 +85,4 @@ CONFIG_CRYPTO_ECB=y
 CONFIG_CRYPTO_PCBC=y
 CONFIG_CRYPTO_MD5=y
 CONFIG_CRYPTO_DES=y
+CONFIG_PPC4xx_OCM=y
-- 
2.17.2



[PATCH 1/2] powerpc/4xx/ocm: Fix phys_addr_t printf warnings

2018-12-31 Thread Michael Ellerman
Currently the code produces several warnings, eg:

  arch/powerpc/platforms/4xx/ocm.c:240:38: error: format '%llx'
  expects argument of type 'long long unsigned int', but argument 3
  has type 'phys_addr_t {aka unsigned int}'
 seq_printf(m, "PhysAddr : 0x%llx\n", ocm->phys);
   ~~~^ ~

Fix it by using the special %pa[p] format for printing phys_addr_t.
Note we need to pass the value by reference for the special specifier
to work.
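
For illustration only (not part of the patch), the calling convention looks
like this; the variable name is made up:

	phys_addr_t base = 0x1000;

	/* %pa prints a phys_addr_t at its native width; pass it by reference */
	pr_info("OCM base: %pa\n", &base);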

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/4xx/ocm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/4xx/ocm.c b/arch/powerpc/platforms/4xx/ocm.c
index c22b099c42f1..a1aaa1569d7c 100644
--- a/arch/powerpc/platforms/4xx/ocm.c
+++ b/arch/powerpc/platforms/4xx/ocm.c
@@ -237,12 +237,12 @@ static int ocm_debugfs_show(struct seq_file *m, void *v)
continue;
 
seq_printf(m, "PPC4XX OCM   : %d\n", ocm->index);
-   seq_printf(m, "PhysAddr : 0x%llx\n", ocm->phys);
+   seq_printf(m, "PhysAddr : %pa[p]\n", &(ocm->phys));
seq_printf(m, "MemTotal : %d Bytes\n", ocm->memtotal);
seq_printf(m, "MemTotal(NC) : %d Bytes\n", ocm->nc.memtotal);
seq_printf(m, "MemTotal(C)  : %d Bytes\n\n", ocm->c.memtotal);
 
-   seq_printf(m, "NC.PhysAddr  : 0x%llx\n", ocm->nc.phys);
+   seq_printf(m, "NC.PhysAddr  : %pa[p]\n", &(ocm->nc.phys));
seq_printf(m, "NC.VirtAddr  : 0x%p\n", ocm->nc.virt);
seq_printf(m, "NC.MemTotal  : %d Bytes\n", ocm->nc.memtotal);
seq_printf(m, "NC.MemFree   : %d Bytes\n", ocm->nc.memfree);
@@ -252,7 +252,7 @@ static int ocm_debugfs_show(struct seq_file *m, void *v)
blk->size, blk->owner);
}
 
-   seq_printf(m, "\nC.PhysAddr   : 0x%llx\n", ocm->c.phys);
+   seq_printf(m, "\nC.PhysAddr   : %pa[p]\n", &(ocm->c.phys));
seq_printf(m, "C.VirtAddr   : 0x%p\n", ocm->c.virt);
seq_printf(m, "C.MemTotal   : %d Bytes\n", ocm->c.memtotal);
seq_printf(m, "C.MemFree: %d Bytes\n", ocm->c.memfree);
-- 
2.17.2



[PATCH] KVM: PPC: Book3S HV: radix: Fix uninitialized var build error

2018-12-31 Thread Michael Ellerman
Old GCCs (4.6.3 at least) aren't able to follow the logic in
__kvmhv_copy_tofrom_guest_radix() and warn that old_pid is used
uninitialized:

  arch/powerpc/kvm/book3s_64_mmu_radix.c:75:3: error: 'old_pid' may be
  used uninitialized in this function

The logic is OK: we only use old_pid if quadrant == 1, and in that
case it has definitely been initialised, eg:

if (quadrant == 1) {
old_pid = mfspr(SPRN_PID);
...
if (quadrant == 1 && pid != old_pid)
mtspr(SPRN_PID, old_pid);

Annotate it to fix the error.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c 
b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index fb88167a402a..1b821c6efdef 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -33,8 +33,8 @@ unsigned long __kvmhv_copy_tofrom_guest_radix(int lpid, int 
pid,
  gva_t eaddr, void *to, void *from,
  unsigned long n)
 {
+   int uninitialized_var(old_pid), old_lpid;
unsigned long quadrant, ret = n;
-   int old_pid, old_lpid;
bool is_load = !!to;
 
/* Can't access quadrants 1 or 2 in non-HV mode, call the HV to do it */
-- 
2.17.2



Re: [PATCH v8 24/25] powerpc: Adopt nvram module for PPC64

2018-12-31 Thread Finn Thain
On Mon, 31 Dec 2018, Arnd Bergmann wrote:

> On Sun, Dec 30, 2018 at 4:29 AM Finn Thain  wrote:
> >
> > On Sat, 29 Dec 2018, Arnd Bergmann wrote:
> >
> > > On Wed, Dec 26, 2018 at 1:43 AM Finn Thain  
> > > wrote:
> > >
> > > > +static ssize_t ppc_nvram_get_size(void)
> > > > +{
> > > > +   if (ppc_md.nvram_size)
> > > > +   return ppc_md.nvram_size();
> > > > +   return -ENODEV;
> > > > +}
> > >
> > > > +const struct nvram_ops arch_nvram_ops = {
> > > > +   .read   = ppc_nvram_read,
> > > > +   .write  = ppc_nvram_write,
> > > > +   .get_size   = ppc_nvram_get_size,
> > > > +   .sync   = ppc_nvram_sync,
> > > > +};
> > >
> > > Coming back to this after my comment on the m68k side, I notice that 
> > > there is now a double indirection through function pointers. Have 
> > > you considered completely removing the operations from ppc_md 
> > > instead by having multiple copies of nvram_ops?
> > >
> >
> > I considered a few alternatives. I figured that it was refactoring 
> > that could be deferred, as it would be confined to arch/powerpc. I was 
> > more interested in the cross-platform API.
> 
> Fair enough.
> 
> > > With the current method, it does seem odd to have a single 
> > > per-architecture instance of the exported structure containing 
> > > function pointers. This doesn't give us the flexibility of having 
> > > multiple copies in the kernel the way that ppc_md does, but it adds 
> > > overhead compared to simply exporting the functions directly.
> > >
> >
> > You're right, there is overhead here.
> >
> > With a bit of auditing, wrappers like the one you quoted (which merely 
> > checks whether or not a ppc_md method is implemented) could surely be 
> > avoided.
> >
> > The arch_nvram_ops methods are supposed to be optional (that is, they are 
> > allowed to be NULL).
> >
> > We could call exactly the same function pointers through either ppc_md 
> > or arch_nvram_ops. That would avoid the double indirection.
> 
> I think you can have a 'const' structure in the __ro_after_init section, 
> so without changing anything else, powerpc could just copy the function 
> pointers from ppc_md into the arch_nvram_ops at early init time, which 
> should ideally simplify your implementation as well.
> 

This "early init time" could be hard to pin down... It has to be after 
ppc_md methods are initialized but before the nvram_ops methods get used 
(e.g. by the framebuffer console). Seems a bit fragile (?)

Your suggestion to completely remove the ppc_md.nvram* methods might be a 
better way. It just means functions get assigned to nvram_ops pointers 
instead of ppc_md pointers.

The patch is simple enough, but it assumes that arch_nvram_ops is not 
const. The struct machdep_calls ppc_md is not const, so should we worry 
about dropping the const for the struct nvram_ops arch_nvram_ops?
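
For illustration, the shape being discussed would be roughly the following
(function names are hypothetical; the point is only that platform code fills
nvram_ops directly and the struct loses its const):

struct nvram_ops arch_nvram_ops;	/* no longer const */

static int __init example_nvram_init(void)
{
	/* assigned straight into nvram_ops, instead of via the ppc_md hooks */
	arch_nvram_ops.read_byte  = example_nvram_read_byte;
	arch_nvram_ops.write_byte = example_nvram_write_byte;
	arch_nvram_ops.sync       = example_nvram_sync;
	return 0;
}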

-- 

> Arnd
> 


Re: [PATCH v8 20/25] powerpc, fbdev: Use arch_nvram_ops methods instead of nvram_read_byte() and nvram_write_byte()

2018-12-31 Thread Finn Thain
On Mon, 31 Dec 2018, Arnd Bergmann wrote:

> On Sun, Dec 30, 2018 at 12:43 AM Finn Thain  
> wrote:
> 
> >
> > Is there some benefit, or is that just personal taste?
> >
> > Avoiding changes to call sites avoids code review, but I think 1) the
> > thinkpad_acpi changes have already been reviewed and 2) the fbdev changes
> > need review anyway.
> >
> > Your suggestion would add several new entities and one extra layer of
> > indirection.
> >
> > I think indirection harms readability because now the reader has to go
> > and look up the meaning of the new entities.
> >
> > It's not the case that we need to choose between definitions of
> > nvram_read_byte() at compile time, or stub them out:
> >
> > #ifdef CONFIG_FOO
> > static inline unsigned char nvram_read_byte(int addr)
> > {
> > 	return arch_nvram_ops.read_byte(addr);
> > }
> > #else
> > static inline unsigned char nvram_read_byte(int addr) { return 0xff; }
> > #endif
> >
> > And I don't anticipate a need for a macro here either:
> >
> > #define nvram_read_byte(a) random_nvram_read_byte_impl(a)
> >
> > I think I've used the simplest solution.
> 
> Having the indirection would help if the inline function can
> encapsulate the NULL pointer check, like
> 
> static inline unsigned char nvram_read_byte(loff_t addr)
> {
> 	char data;
>
> 	if (!IS_ENABLED(CONFIG_NVRAM))
> 		return 0xff;
>
> 	if (arch_nvram_ops.read_byte)
> 		return arch_nvram_ops.read_byte(addr);
>
> 	if (arch_nvram_ops.read) {
> 		arch_nvram_ops.read(&data, 1, &addr);
> 		return data;
> 	}
>
> 	return 0xff;
> }
> 

The semantics of .read_byte and .read are subtly different. For CONFIG_X86 
and CONFIG_ATARI, .read implies checksum validation and .read_byte does 
not.

In particular, in the thinkpad_acpi case, checksum validation isn't used, 
but in the atari_scsi case, it is.

So I like to see drivers explicitly call the method they want. I didn't 
want to obscure this distinction in a helper.
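
For the record, a rough sketch of the two call styles (signatures as in the
nvram_ops struct quoted later in this thread; buffer and offset names are
made up):

	/* thinkpad_acpi-style: raw byte access, no checksum semantics */
	if (arch_nvram_ops.read_byte)
		val = arch_nvram_ops.read_byte(offset);

	/* atari_scsi-style: range access, which implies checksum validation */
	if (arch_nvram_ops.read) {
		loff_t pos = offset;

		arch_nvram_ops.read(buf, len, &pos);
	}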

-- 


Re: [PATCH v8 18/25] powerpc: Implement nvram sync ioctl

2018-12-31 Thread Finn Thain
On Mon, 31 Dec 2018, Arnd Bergmann wrote:

> On Sun, Dec 30, 2018 at 8:25 AM Finn Thain  wrote:
> >
> > On Sat, 29 Dec 2018, Arnd Bergmann wrote:
> >
> > > > --- a/drivers/char/nvram.c
> > > > +++ b/drivers/char/nvram.c
> > > > @@ -48,6 +48,10 @@
> > > >  #include 
> > > >  #include 
> > > >
> > > > +#ifdef CONFIG_PPC
> > > > +#include 
> > > > +#include 
> > > > +#endif
> > > >
> > > >  static DEFINE_MUTEX(nvram_mutex);
> > > >  static DEFINE_SPINLOCK(nvram_state_lock);
> > > > @@ -331,6 +335,37 @@ static long nvram_misc_ioctl(struct file *file, 
> > > > unsigned int cmd,
> > > > long ret = -ENOTTY;
> > > >
> > > > switch (cmd) {
> > > > +#ifdef CONFIG_PPC
> > > > +   case OBSOLETE_PMAC_NVRAM_GET_OFFSET:
> > > > +   pr_warn("nvram: Using obsolete PMAC_NVRAM_GET_OFFSET 
> > > > ioctl\n");
> > > > +   /* fall through */
> > > > +   case IOC_NVRAM_GET_OFFSET:
> > > > +   ret = -EINVAL;
> > > > +#ifdef CONFIG_PPC_PMAC
> > >
> > > I think it would be nicer here to keep the ppc bits in arch/ppc,
> > > and instead add a .ioctl() callback to nvram_ops.
> > >
> >
> > The problem with having an nvram_ops.ioctl() method is the code in the 
> > !PPC branch. That code would get duplicated because it's needed by 
> > both X86 and M68K, to implement the checksum ioctls.
> 
> I was thinking you'd just have a common ioctl function that falls back 
> to the .ioctl callback for any unhandled commands like
> 
> switch (cmd) {
> case NVRAM_INIT:
>  ...
>  break;
> case ...:
>  break;
> default:
>  if (ops->ioctl)
>  return ops->ioctl(...);
>  return -EINVAL;
> }
> 
> Would that work?
> 

There are no ioctls common to all architectures. So your example becomes,

static long nvram_misc_ioctl(struct file *file, unsigned int cmd,
 unsigned long arg)
{
 if (ops->ioctl)
  return ops->ioctl(file, cmd, arg);
 return -ENOTTY;
}

And then my objection is the same: m68k and x86 now have to duplicate 
their common ops->ioctl() implementation.

Here's a compromise that avoids some code duplication.

 switch (cmd) {
#if defined(CONFIG_X86) || defined(CONFIG_M68K)
 case NVRAM_INIT:
  ...
  break;
 case NVRAM_SETCKS:
  ...
  break;
#endif
 default:
  if (ops->ioctl)
  return ops->ioctl(...);
  return -EINVAL;
 }

But PPC64 and PPC32 also need to share their ops->ioctl() implementation. 
It's not clear to me where that code would go.

Personally, I prefer the present patch series, or something similar, with 
its symmetry between nvram.c and nvram.h:

static long nvram_misc_ioctl(struct file *file, unsigned int cmd,
 unsigned long arg)
{
long ret = -ENOTTY;

switch (cmd) {
#if defined(CONFIG_PPC)
case OBSOLETE_PMAC_NVRAM_GET_OFFSET:
...
case IOC_NVRAM_GET_OFFSET:
...
break;
case IOC_NVRAM_SYNC:
...
break;
#elif defined(CONFIG_X86) || defined(CONFIG_M68K)
case NVRAM_INIT:
...
break;
case NVRAM_SETCKS:
...
break;
#endif
}
return ret;
}

... versus the struct definition in nvram.h,

struct nvram_ops {
ssize_t (*read)(char *, size_t, loff_t *);
ssize_t (*write)(char *, size_t, loff_t *);
unsigned char   (*read_byte)(int);
void(*write_byte)(unsigned char, int);
ssize_t (*get_size)(void);
#if defined(CONFIG_PPC)
long(*sync)(void);
int (*get_partition)(int);
#elif defined(CONFIG_X86) || defined(CONFIG_M68K)
long(*set_checksum)(void);
long(*initialize)(void);
#endif
};

Which of these alternatives do you prefer? Is there a better way?

-- 


>Arnd
> 


Re: Runtime warnings in powerpc code

2018-12-31 Thread Benjamin Herrenschmidt
On Thu, 2018-12-27 at 11:05 -0800, Guenter Roeck wrote:
> Hi,
> 
> I am getting the attached runtime warnings when enabling certain debug
> options in powerpc code. The warnings are seen with pretty much all
> platforms, and all active kernel releases.
> 
> Question: Is it worthwhile to keep building / testing powerpc builds
> with the respective debug options enabled, and report it once in a while,
> or should I just disable the options ?

I've been fixing some issues with ppc32 and some of that stuff. I sent
some experimental patches updating 4xx and Christoph sent 8xx variants;
I still need to go through the other ones.

Cheers,
Ben.


> Thanks,
> Guenter
> 
> ---
> CONFIG_DEBUG_ATOMIC_SLEEP
> 
> [ cut here ]
> do not call blocking ops when !TASK_RUNNING; state=2 set at [<(ptrval)>]
> prepare_to_wait+0x54/0xe4
> WARNING: CPU: 0 PID: 1 at kernel/sched/core.c:6099 __might_sleep+0x94/0x9c
> Modules linked in:
> CPU: 0 PID: 1 Comm: init Not tainted 4.20.0-yocto-standard+ #1
> NIP:  c00667a0 LR: c00667a0 CTR: 
> REGS: cf8df8c0 TRAP: 0700   Not tainted  (4.20.0-yocto-standard+)
> MSR:  00029032   CR: 2822  XER: 2000
> 
> GPR00: c00667a0 cf8df970 cf8e 0062 c0af15c8 0007 fa1ae97e 
> 757148e2 
> GPR08: cf8de000    1f386000   
> cfd83c8c 
> GPR16: 0004 0004 0004  060c 000a cf8dfdb8 
> cf267804 
> GPR24: cf8dfd78 cf8dfd68 cfa88a20 cec70830 0001  01d3 
> c0b444cc 
> NIP [c00667a0] __might_sleep+0x94/0x9c
> LR [c00667a0] __might_sleep+0x94/0x9c
> Call Trace:
> [cf8df970] [c00667a0] __might_sleep+0x94/0x9c (unreliable)
> [cf8df990] [c05beddc] do_ide_request+0x48/0x6bc
> [cf8dfa10] [c0492bcc] __blk_run_queue+0x80/0x10c
> [cf8dfa20] [c049a938] blk_flush_plug_list+0x23c/0x258
> [cf8dfa60] [c006b888] io_schedule_prepare+0x44/0x5c
> [cf8dfa70] [c006b8c0] io_schedule+0x20/0x48
> [cf8dfa80] [c095e1ac] bit_wait_io+0x24/0x74
> [cf8dfa90] [c095dd94] __wait_on_bit+0xac/0x104
> [cf8dfab0] [c095de74] out_of_line_wait_on_bit+0x88/0x98
> [cf8dfae0] [c0229094] bh_submit_read+0xf8/0x104
> [cf8dfaf0] [c028b9a8] ext4_get_branch+0xdc/0x168
> [cf8dfb20] [c028c7fc] ext4_ind_map_blocks+0x2b0/0xc08
> [cf8dfc30] [c029551c] ext4_map_blocks+0x2e0/0x65c
> [cf8dfc80] [c02b8c84] ext4_mpage_readpages+0x5e8/0x97c
> [cf8dfd60] [c016c3cc] read_pages+0x60/0x1a0
> [cf8dfdb0] [c016c6e8] __do_page_cache_readahead+0x1dc/0x208
> [cf8dfe10] [c0159768] filemap_fault+0x418/0x834
> [cf8dfe50] [c02a00fc] ext4_filemap_fault+0x40/0x64
> [cf8dfe60] [c0198d0c] __do_fault+0x34/0xb8
> [cf8dfe70] [c019e264] handle_mm_fault+0xc44/0xf88
> [cf8dfef0] [c001a218] __do_page_fault+0x158/0x7b4
> [cf8dff40] [c00143b4] handle_page_fault+0x14/0x40
> --- interrupt: 301 at 0xb7904a70
> LR = 0xb78ef0c8
> Instruction dump:
> 7fe3fb78 bba10014 7c0803a6 38210020 4bfffd20 808a 3c60c0b0 3941 
> 7cc53378 3863a558 99490001 4bfd03bd <0fe0> 4bc0 7c0802a6 90010004 
> irq event stamp: 126702
> hardirqs last  enabled at (126701): [] console_unlock+0x434/0x5d0
> hardirqs last disabled at (126702): [] reenable_mmu+0x30/0x88
> softirqs last  enabled at (126552): [] __do_softirq+0x42c/0x4a0
> softirqs last disabled at (126529): [] irq_exit+0x104/0x108
> ---[ end trace 4f6c84b7815474d9 ]---
> 
> ---
> #if defined(CONFIG_PROVE_LOCKING) && defined(CONFIG_DEBUG_LOCKDEP) && \
> defined(CONFIG_TRACE_IRQFLAGS)
> 
> [ cut here ]
> DEBUG_LOCKS_WARN_ON(!current->hardirqs_enabled)
> WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3762
> check_flags.part.25+0x1a0/0x1c4
> Modules linked in:
> CPU: 0 PID: 1 Comm: init Tainted: GW 4.20.0-yocto-standard+ #1
> NIP:  c00839f0 LR: c00839f0 CTR: 
> REGS: cf8dfe00 TRAP: 0700   Tainted: GW
> (4.20.0-yocto-standard+)
> MSR:  00021032   CR: 2828  XER: 2000
> 
> GPR00: c00839f0 cf8dfeb0 cf8e 002f 0001 c00938f4 c1425b76 
> 002f 
> GPR08: 1032  0001 0004 28282828   
> b7927688 
> GPR16:  bfe20a5c bfe20a58 0fe5fff8 1b38 10002178  
> b7929c20 
> GPR24: c095d81c 7c9319ee   b7929ae0 cf8e 9032 
> c0d2 
> NIP [c00839f0] check_flags.part.25+0x1a0/0x1c4
> LR [c00839f0] check_flags.part.25+0x1a0/0x1c4
> Call Trace:
> [cf8dfeb0] [c00839f0] check_flags.part.25+0x1a0/0x1c4 (unreliable)
> [cf8dfec0] [c0085f6c] lock_is_held_type+0x78/0xb4
> [cf8dfee0] [c095d35c] __schedule+0x6cc/0xb44
> [cf8dff30] [c095d81c] schedule+0x48/0xb8
> [cf8dff40] [c0014694] recheck+0x0/0x20
> --- interrupt: 501 at 0xb78f2850
> LR = 0xb78f2a24
> Instruction dump:
> 3c80c0b0 3c60c0af 3884d684 38635f94 4bfb3189 0fe0 4bfffec8 3c80c0b0 
> 3c60c0af 3884d668 38635f94 4bfb316d <0fe0> 4bfffefc 3c80c0b0 3c60c0af 
> irq event stamp: 127630
> hardirqs last  enabled at (127629): []
> _raw_spin_unlock_irq+0x3c/0x94
> hardirqs last disabled at (127630): [] reenable_mmu+0x30/0x88
> softirqs las

Re: [PATCH v8 00/25] Re-use nvram module

2018-12-31 Thread Finn Thain
On Sun, 30 Dec 2018, I wrote:

> 
> The rationale for the ops struct was that it offers introspection.
> 
> [...] those platforms which need checksum validation always set 
> byte-at-a-time methods to NULL.
> 
> [...] The NULL methods in the ops struct allow the nvram.c misc device 
> to avoid inefficient byte-at-a-time accessors where possible, just as 
> arch/powerpc/kernel/nvram_64.c presently does.
> 

Hopefully my message makes more sense with the tangential irrelevancies 
removed. I will document these considerations in nvram.h for the next 
revision.

-- 


[PATCH] ibmveth: fix DMA unmap error in ibmveth_xmit_start error path

2018-12-31 Thread Tyrel Datwyler
Commit 33a48ab105a7 ("ibmveth: Fix DMA unmap error") fixed an issue in the
normal code path of ibmveth_xmit_start() that was originally introduced by
Commit 6e8ab30ec677 ("ibmveth: Add scatter-gather support"). This original
fix missed the error path where dma_unmap_page() is wrongly called on the
header portion in descs[0], which was mapped with dma_map_single(). As a
result, a failure to DMA-map any of the frags results in a dmesg warning
when CONFIG_DMA_API_DEBUG is enabled.

[ cut here ]
DMA-API: ibmveth 3002: device driver frees DMA memory with wrong function
  [device address=0x0a43] [size=172 bytes] [mapped as page] 
[unmapped as single]
WARNING: CPU: 1 PID: 8426 at kernel/dma/debug.c:1085 check_unmap+0x4fc/0xe10
...

...
DMA-API: Mapped at:
ibmveth_start_xmit+0x30c/0xb60
dev_hard_start_xmit+0x100/0x450
sch_direct_xmit+0x224/0x490
__qdisc_run+0x20c/0x980
__dev_queue_xmit+0x1bc/0xf20

This fixes the API misuse by unmapping descs[0] with dma_unmap_single.
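
For reference, the DMA-API pairing rule being enforced is that each mapping
must be released with the matching unmap helper (a schematic sketch, not the
driver code):

	dma_addr_t addr = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
	...
	/* a dma_map_single() mapping must not be torn down with dma_unmap_page() */
	dma_unmap_single(dev, addr, len, DMA_TO_DEVICE);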

Fixes: 6e8ab30ec677 ("ibmveth: Add scatter-gather support")
Cc: sta...@vger.kernel.org
Signed-off-by: Tyrel Datwyler 
---
 drivers/net/ethernet/ibm/ibmveth.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmveth.c 
b/drivers/net/ethernet/ibm/ibmveth.c
index a468178..098d876 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1171,11 +1171,15 @@ static netdev_tx_t ibmveth_start_xmit(struct sk_buff 
*skb,
 
 map_failed_frags:
last = i+1;
-   for (i = 0; i < last; i++)
+   for (i = 1; i < last; i++)
dma_unmap_page(&adapter->vdev->dev, descs[i].fields.address,
   descs[i].fields.flags_len & IBMVETH_BUF_LEN_MASK,
   DMA_TO_DEVICE);
 
+   dma_unmap_single(&adapter->vdev->dev,
+descs[0].fields.address,
+descs[0].fields.flags_len & IBMVETH_BUF_LEN_MASK,
+DMA_TO_DEVICE);
 map_failed:
if (!firmware_has_feature(FW_FEATURE_CMO))
netdev_err(netdev, "tx: unable to map xmit buffer\n");
-- 
2.7.4



Re: [PATCH] block/swim3: Fix regression on PowerBook G3

2018-12-31 Thread Jens Axboe
On 12/30/18 10:44 PM, Finn Thain wrote:
> As of v4.20, the swim3 driver crashes when loaded on a PowerBook G3
> (Wallstreet).
> 
> MacIO PCI driver attached to Gatwick chipset
> MacIO PCI driver attached to Heathrow chipset
> swim3 0.00015000:floppy: [fd0] SWIM3 floppy controller in media bay
> 0.00013020:ch-a: ttyS0 at MMIO 0xf3013020 (irq = 16, base_baud = 230400) is a 
> Z85c30 ESCC - Serial port
> 0.00013000:ch-b: ttyS1 at MMIO 0xf3013000 (irq = 17, base_baud = 230400) is a 
> Z85c30 ESCC - Infrared port
> macio: fixed media-bay irq on gatwick
> macio: fixed left floppy irqs
> swim3 1.00015000:floppy: [fd1] Couldn't request interrupt
> Unable to handle kernel paging request for data at address 0x0024
> Faulting instruction address: 0xc02652f8
> Oops: Kernel access of bad area, sig: 11 [#1]
> BE SMP NR_CPUS=2 PowerMac
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0 #2
> NIP:  c02652f8 LR: c026915c CTR: c0276d1c
> REGS: df43ba10 TRAP: 0300   Not tainted  (4.20.0)
> MSR:  9032   CR: 28228288  XER: 0100
> DAR: 0024 DSISR: 4000
> GPR00: c026915c df43bac0 df439060 c0731524 df494700  c06e1c08 0001
> GPR08: 0001  df5ff220 1032 28228282  c0004ca4 
> GPR16:    c073144c dfffe064 c0731524 0120 c0586108
> GPR24: c073132c c073143c c073143c  c0731524 df67cd70 df494700 0001
> NIP [c02652f8] blk_mq_free_rqs+0x28/0xf8
> LR [c026915c] blk_mq_sched_tags_teardown+0x58/0x84
> Call Trace:
> [df43bac0] [c0045f50] flush_workqueue_prep_pwqs+0x178/0x1c4 (unreliable)
> [df43bae0] [c026915c] blk_mq_sched_tags_teardown+0x58/0x84
> [df43bb00] [c02697f0] blk_mq_exit_sched+0x9c/0xb8
> [df43bb20] [c0252794] elevator_exit+0x84/0xa4
> [df43bb40] [c0256538] blk_exit_queue+0x30/0x50
> [df43bb50] [c0256640] blk_cleanup_queue+0xe8/0x184
> [df43bb70] [c034732c] swim3_attach+0x330/0x5f0
> [df43bbb0] [c034fb24] macio_device_probe+0x58/0xec
> [df43bbd0] [c032ba88] really_probe+0x1e4/0x2f4
> [df43bc00] [c032bd28] driver_probe_device+0x64/0x204
> [df43bc20] [c0329ac4] bus_for_each_drv+0x60/0xac
> [df43bc50] [c032b824] __device_attach+0xe8/0x160
> [df43bc80] [c032ab38] bus_probe_device+0xa0/0xbc
> [df43bca0] [c0327338] device_add+0x3d8/0x630
> [df43bcf0] [c0350848] macio_add_one_device+0x444/0x48c
> [df43bd50] [c03509f8] macio_pci_add_devices+0x168/0x1bc
> [df43bd90] [c03500ec] macio_pci_probe+0xc0/0x10c
> [df43bda0] [c02ad884] pci_device_probe+0xd4/0x184
> [df43bdd0] [c032ba88] really_probe+0x1e4/0x2f4
> [df43be00] [c032bd28] driver_probe_device+0x64/0x204
> [df43be20] [c032bfcc] __driver_attach+0x104/0x108
> [df43be40] [c0329a00] bus_for_each_dev+0x64/0xb4
> [df43be70] [c032add8] bus_add_driver+0x154/0x238
> [df43be90] [c032ca24] driver_register+0x84/0x148
> [df43bea0] [c0004aa0] do_one_initcall+0x40/0x188
> [df43bf00] [c0690100] kernel_init_freeable+0x138/0x1d4
> [df43bf30] [c0004cbc] kernel_init+0x18/0x10c
> [df43bf40] [c00121e4] ret_from_kernel_thread+0x14/0x1c
> Instruction dump:
> 5484d97e 4bfff4f4 9421ffe0 7c0802a6 bf410008 7c9e2378 90010024 8124005c
> 2f89 419e0078 81230004 7c7c1b78 <81290024> 2f89 419e0064 8144
> ---[ end trace 12025ab921a9784c ]---
> 
> Reverting commit 8ccb8cb1892b ("swim3: convert to blk-mq") resolves the
> problem.
> 
> That commit added a struct blk_mq_tag_set to struct floppy_state and
> initialized it with a blk_mq_init_sq_queue() call. Unfortunately, there
> is a memset() in swim3_add_device() that subsequently clears the
> floppy_state struct. That means fs->tag_set->ops is a NULL pointer, and
> it gets dereferenced by blk_mq_free_rqs() which gets called in the
> request_irq() error path. Move the memset() to fix this bug.
> 
> BTW, the request_irq() failure for the left mediabay floppy (fd1) is not
> a regression. I don't know why it happens. The right media bay floppy
> (fd0) works fine however.
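
In outline, the ordering the fix restores is roughly this (a simplified
sketch based on the quoted description, not the actual diff; identifiers
are assumed from the blk-mq conversion):

	struct floppy_state *fs = &floppy_states[index];

	memset(fs, 0, sizeof(*fs));	/* clear the state first ... */

	/* ... then set up blk-mq, so fs->tag_set.ops is not wiped afterwards */
	disk->queue = blk_mq_init_sq_queue(&fs->tag_set, &swim3_mq_ops, 2,
					   BLK_MQ_F_SHOULD_MERGE);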

Applied, thanks.

-- 
Jens Axboe



Re: [PATCH] block/swim3: Fix -EBUSY error when re-opening device after unmount

2018-12-31 Thread Jens Axboe
On 12/30/18 10:44 PM, Finn Thain wrote:
> When the block device is opened with FMODE_EXCL, ref_count is set to -1.
> This value doesn't get reset when the device is closed, which means the
> device cannot be opened again. Fix this by checking for refcount <= 0
> in the release method.

Applied, thanks.


-- 
Jens Axboe



Re: [PATCH] block/swim3: Remove dead return statement

2018-12-31 Thread Jens Axboe
On 12/30/18 10:44 PM, Finn Thain wrote:
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Finn Thain 

Applied, thanks.

-- 
Jens Axboe



Re: [PATCH v8 24/25] powerpc: Adopt nvram module for PPC64

2018-12-31 Thread Arnd Bergmann
On Sun, Dec 30, 2018 at 4:29 AM Finn Thain  wrote:
>
> On Sat, 29 Dec 2018, Arnd Bergmann wrote:
>
> > On Wed, Dec 26, 2018 at 1:43 AM Finn Thain  
> > wrote:
> >
> > > +static ssize_t ppc_nvram_get_size(void)
> > > +{
> > > +   if (ppc_md.nvram_size)
> > > +   return ppc_md.nvram_size();
> > > +   return -ENODEV;
> > > +}
> >
> > > +const struct nvram_ops arch_nvram_ops = {
> > > +   .read   = ppc_nvram_read,
> > > +   .write  = ppc_nvram_write,
> > > +   .get_size   = ppc_nvram_get_size,
> > > +   .sync   = ppc_nvram_sync,
> > > +};
> >
> > Coming back to this after my comment on the m68k side, I notice that
> > there is now a double indirection through function pointers. Have you
> > considered completely removing the operations from ppc_md instead by
> > having multiple copies of nvram_ops?
> >
>
> I considered a few alternatives. I figured that it was refactoring that
> could be deferred, as it would be confined to arch/powerpc. I was more
> interested in the cross-platform API.

Fair enough.

> > With the current method, it does seem odd to have a single
> > per-architecture instance of the exported structure containing function
> > pointers. This doesn't give us the flexibility of having multiple copies
> > in the kernel the way that ppc_md does, but it adds overhead compared to
> > simply exporting the functions directly.
> >
>
> You're right, there is overhead here.
>
> With a bit of auditing, wrappers like the one you quoted (which merely
> checks whether or not a ppc_md method is implemented) could surely be
> avoided.
>
> The arch_nvram_ops methods are supposed to be optional (that is, they are
> allowed to be NULL).
>
> We could call exactly the same function pointers through either ppc_md or
> arch_nvram_ops. That would avoid the double indirection.

I think you can have a 'const' structure in the __ro_after_init section,
so without changing anything else, powerpc could just copy the
function pointers from ppc_md into the arch_nvram_ops at early
init time, which should ideally simplify your implementation as well.

Arnd


Re: [PATCH v8 20/25] powerpc, fbdev: Use arch_nvram_ops methods instead of nvram_read_byte() and nvram_write_byte()

2018-12-31 Thread Arnd Bergmann
On Sun, Dec 30, 2018 at 12:43 AM Finn Thain  wrote:

>
> Is there some benefit, or is that just personal taste?
>
> Avoiding changes to call sites avoids code review, but I think 1) the
> thinkpad_acpi changes have already been reviewed and 2) the fbdev changes
> need review anyway.
>
> Your suggestion would add several new entities and one extra layer of
> indirection.
>
> I think indirection harms readability because now the reader has to go
> and look up the meaning of the new entities.
>
> It's not the case that we need to choose between definitions of
> nvram_read_byte() at compile time, or stub them out:
>
> #ifdef CONFIG_FOO
> static inline unsigned char nvram_read_byte(int addr)
> {
> 	return arch_nvram_ops.read_byte(addr);
> }
> #else
> static inline unsigned char nvram_read_byte(int addr) { return 0xff; }
> #endif
>
> And I don't anticipate a need for a macro here either:
>
> #define nvram_read_byte(a) random_nvram_read_byte_impl(a)
>
> I think I've used the simplest solution.

Having the indirection would help if the inline function can
encapsulate the NULL pointer check, like

static inline unsigned char nvram_read_byte(loff_t addr)
{
	char data;

	if (!IS_ENABLED(CONFIG_NVRAM))
		return 0xff;

	if (arch_nvram_ops.read_byte)
		return arch_nvram_ops.read_byte(addr);

	if (arch_nvram_ops.read) {
		arch_nvram_ops.read(&data, 1, &addr);
		return data;
	}

	return 0xff;
}

(the above assumes no #ifdef in the structure definition; if you
keep the #ifdef there, they have to be added here as well).

  Arnd


Re: [PATCH v8 00/25] Re-use nvram module

2018-12-31 Thread Arnd Bergmann
On Sun, Dec 30, 2018 at 5:05 AM Finn Thain  wrote:
>
> On Sun, 30 Dec 2018, I wrote:
>
> >
> > I'm not opposed to exported functions in place of a singleton ops
> > struct. Other things being equal I'm inclined toward the ops struct,
> > perhaps because I like encapsulation or perhaps because I don't like
> > excess generality. (That design decision was made years ago and I don't
> > remember the reasoning.)
>
> The rationale for the ops struct was that it offers introspection.
>
> It turns out that PPC64 device drivers don't care about byte-at-a-time
> accessors and X86 device drivers don't care about checksum validation.
> But that only gets us so far.
>
> We still needed a way to find out whether the arch has provided
> byte-at-a-time accessors (i.e. PPC32 and M68K Mac) or byte range accessors
> (i.e. PPC64 and those platforms with checksummed NVRAM like X86 and M68K
> Atari).
>
> You can't resolve this question at build time for a multi-platform kernel
> binary, so pre-processor tricks don't help.
>
> Device drivers tend to want to access NVRAM one byte at a time. With this
> patch series, those platforms which need checksum validation always set
> byte-at-a-time methods to NULL. (Hence the atari_scsi changes in patch 3.)
>
> The char misc driver is quite different to the usual device drivers,
> because the struct file_operations methods always access a byte range.
>
> The NULL methods in the ops struct allow the nvram.c misc device to avoid
> inefficient byte-at-a-time accessors where possible, just as
> arch/powerpc/kernel/nvram_64.c presently does.


Ok, I see. That sounds absolutely reasonable, so let's stay with
the structure as you proposed.

   Arnd


Re: [PATCH v8 13/25] m68k: Dispatch nvram_ops calls to Atari or Mac functions

2018-12-31 Thread Arnd Bergmann
On Sun, Dec 30, 2018 at 11:12 PM Finn Thain  wrote:
> On Sun, 30 Dec 2018, LEROY Christophe wrote:

> But I'm over-simplifying. Arnd's alternative actually goes like this,
>
> #if defined(CONFIG_MAC) && !defined(CONFIG_ATARI)
> const struct nvram_ops arch_nvram_ops = {
> /* ... */
> }
> #elif !defined(CONFIG_MAC) && defined(CONFIG_ATARI)
> const struct nvram_ops arch_nvram_ops = {
> /* ... */
> }
> #elif defined(CONFIG_MAC) && defined(CONFIG_ATARI)
> const struct nvram_ops arch_nvram_ops = {
> /* ... */
> }
> #endif
>
> So, you're right, this isn't "duplication", it's "triplication".

Ok, I failed to realize that MAC and ATARI are not mutually exclusive.
I agree that your original version is best then.

   Arnd


Re: [PATCH v8 18/25] powerpc: Implement nvram sync ioctl

2018-12-31 Thread Arnd Bergmann
On Mon, Dec 31, 2018 at 12:13 AM Finn Thain  wrote:
> On Sun, 30 Dec 2018, Finn Thain wrote:
> > > > diff --git a/include/linux/nvram.h b/include/linux/nvram.h
> > > > index b7bfaec60a43..24a57675dba1 100644
> > > > --- a/include/linux/nvram.h
> > > > +++ b/include/linux/nvram.h
> > > > @@ -18,8 +18,12 @@ struct nvram_ops {
> > > > unsigned char   (*read_byte)(int);
> > > > void(*write_byte)(unsigned char, int);
> > > > ssize_t (*get_size)(void);
> > > > +#ifdef CONFIG_PPC
> > > > +   long(*sync)(void);
> > > > +#else
> > > > long(*set_checksum)(void);
> > > > long(*initialize)(void);
> > > > +#endif
> > > >  };
> > >
> > > Maybe just leave all entries visible here, and avoid the above #ifdef
> > > checks.
> > >
> >
> > The #ifdef isn't there just to save a few bytes, though it does do that.
> > It's really meant to cause a build failure when I mess up somewhere. But
> > I'm happy to change it if you can see a reason to do so (?)
> >
>
> I think the problem with these #ifdef conditionals is that they don't
> express the correct constraints. So, at the end of this series I'd prefer
> to see,
>
> struct nvram_ops {
>ssize_t (*read)(char *, size_t, loff_t *);
>ssize_t (*write)(char *, size_t, loff_t *);
>unsigned char   (*read_byte)(int);
>void(*write_byte)(unsigned char, int);
>ssize_t (*get_size)(void);
> #if defined(CONFIG_PPC)
>long(*sync)(void);
>int (*get_partition)(int);
> #elif defined(CONFIG_X86) || defined(CONFIG_M68K)
>long(*set_checksum)(void);
>long(*initialize)(void);
> #endif
> };
>
> Is that okay with you?

My preference would be no #ifdef here, but the compile time
error you mention is a good enough reason, so I'm fine with
either version you pick.

   Arnd


Re: [PATCH v8 18/25] powerpc: Implement nvram sync ioctl

2018-12-31 Thread Arnd Bergmann
On Sun, Dec 30, 2018 at 8:25 AM Finn Thain  wrote:
>
> On Sat, 29 Dec 2018, Arnd Bergmann wrote:
>
> > > --- a/drivers/char/nvram.c
> > > +++ b/drivers/char/nvram.c
> > > @@ -48,6 +48,10 @@
> > >  #include 
> > >  #include 
> > >
> > > +#ifdef CONFIG_PPC
> > > +#include 
> > > +#include 
> > > +#endif
> > >
> > >  static DEFINE_MUTEX(nvram_mutex);
> > >  static DEFINE_SPINLOCK(nvram_state_lock);
> > > @@ -331,6 +335,37 @@ static long nvram_misc_ioctl(struct file *file, 
> > > unsigned int cmd,
> > > long ret = -ENOTTY;
> > >
> > > switch (cmd) {
> > > +#ifdef CONFIG_PPC
> > > +   case OBSOLETE_PMAC_NVRAM_GET_OFFSET:
> > > +   pr_warn("nvram: Using obsolete PMAC_NVRAM_GET_OFFSET 
> > > ioctl\n");
> > > +   /* fall through */
> > > +   case IOC_NVRAM_GET_OFFSET:
> > > +   ret = -EINVAL;
> > > +#ifdef CONFIG_PPC_PMAC
> >
> > I think it would be nicer here to keep the ppc bits in arch/ppc,
> > and instead add a .ioctl() callback to nvram_ops.
> >
>
> The problem with having an nvram_ops.ioctl() method is the code in the
> !PPC branch. That code would get duplicated because it's needed by both
> X86 and M68K, to implement the checksum ioctls.

I was thinking you'd just have a common ioctl function that falls
back to the .ioctl callback for any unhandled commands like

switch (cmd) {
case NVRAM_INIT:
 ...
 break;
case ...:
 break;
default:
 if (ops->ioctl)
 return ops->ioctl(...);
 return -EINVAL;
}

Would that work?

   Arnd


[PATCH v4 6/6] arm, s390, unicore32: remove oneliner wrappers for memblock_alloc()

2018-12-31 Thread Mike Rapoport
arm, s390 and unicore32 use oneliner wrappers for memblock_alloc().
Replace their usage with a direct call to memblock_alloc().

Suggested-by: Christoph Hellwig 
Signed-off-by: Mike Rapoport 
---
 arch/arm/mm/mmu.c   | 11 +++
 arch/s390/numa/numa.c   | 10 +-
 arch/unicore32/mm/mmu.c | 12 
 3 files changed, 8 insertions(+), 25 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 0a04c9a5..57de0dd 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -719,14 +719,9 @@ EXPORT_SYMBOL(phys_mem_access_prot);
 
 #define vectors_base() (vectors_high() ? 0x : 0)
 
-static void __init *early_alloc_aligned(unsigned long sz, unsigned long align)
-{
-   return memblock_alloc(sz, align);
-}
-
 static void __init *early_alloc(unsigned long sz)
 {
-   return early_alloc_aligned(sz, sz);
+   return memblock_alloc(sz, sz);
 }
 
 static void *__init late_alloc(unsigned long sz)
@@ -998,7 +993,7 @@ void __init iotable_init(struct map_desc *io_desc, int nr)
if (!nr)
return;
 
-   svm = early_alloc_aligned(sizeof(*svm) * nr, __alignof__(*svm));
+   svm = memblock_alloc(sizeof(*svm) * nr, __alignof__(*svm));
 
for (md = io_desc; nr; md++, nr--) {
create_mapping(md);
@@ -1020,7 +1015,7 @@ void __init vm_reserve_area_early(unsigned long addr, 
unsigned long size,
struct vm_struct *vm;
struct static_vm *svm;
 
-   svm = early_alloc_aligned(sizeof(*svm), __alignof__(*svm));
+   svm = memblock_alloc(sizeof(*svm), __alignof__(*svm));
 
vm = &svm->vm;
vm->addr = (void *)addr;
diff --git a/arch/s390/numa/numa.c b/arch/s390/numa/numa.c
index 2281a88..2d1271e 100644
--- a/arch/s390/numa/numa.c
+++ b/arch/s390/numa/numa.c
@@ -58,14 +58,6 @@ EXPORT_SYMBOL(__node_distance);
 int numa_debug_enabled;
 
 /*
- * alloc_node_data() - Allocate node data
- */
-static __init pg_data_t *alloc_node_data(void)
-{
-   return memblock_alloc(sizeof(pg_data_t), 8);
-}
-
-/*
  * numa_setup_memory() - Assign bootmem to nodes
  *
  * The memory is first added to memblock without any respect to nodes.
@@ -101,7 +93,7 @@ static void __init numa_setup_memory(void)
 
/* Allocate and fill out node_data */
for (nid = 0; nid < MAX_NUMNODES; nid++)
-   NODE_DATA(nid) = alloc_node_data();
+   NODE_DATA(nid) = memblock_alloc(sizeof(pg_data_t), 8);
 
for_each_online_node(nid) {
unsigned long start_pfn, end_pfn;
diff --git a/arch/unicore32/mm/mmu.c b/arch/unicore32/mm/mmu.c
index 50d8c1a..a402192 100644
--- a/arch/unicore32/mm/mmu.c
+++ b/arch/unicore32/mm/mmu.c
@@ -141,16 +141,12 @@ static void __init build_mem_type_table(void)
 
 #define vectors_base() (vectors_high() ? 0x : 0)
 
-static void __init *early_alloc(unsigned long sz)
-{
-   return memblock_alloc(sz, sz);
-}
-
 static pte_t * __init early_pte_alloc(pmd_t *pmd, unsigned long addr,
unsigned long prot)
 {
if (pmd_none(*pmd)) {
-   pte_t *pte = early_alloc(PTRS_PER_PTE * sizeof(pte_t));
+   pte_t *pte = memblock_alloc(PTRS_PER_PTE * sizeof(pte_t),
+   PTRS_PER_PTE * sizeof(pte_t));
__pmd_populate(pmd, __pa(pte) | prot);
}
BUG_ON(pmd_bad(*pmd));
@@ -352,7 +348,7 @@ static void __init devicemaps_init(void)
/*
 * Allocate the vector page early.
 */
-   vectors = early_alloc(PAGE_SIZE);
+   vectors = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
 
for (addr = VMALLOC_END; addr; addr += PGDIR_SIZE)
pmd_clear(pmd_off_k(addr));
@@ -429,7 +425,7 @@ void __init paging_init(void)
top_pmd = pmd_off_k(0x);
 
/* allocate the zero page. */
-   zero_page = early_alloc(PAGE_SIZE);
+   zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
 
bootmem_init();
 
-- 
2.7.4



[PATCH v4 5/6] arch: simplify several early memory allocations

2018-12-31 Thread Mike Rapoport
There are several early memory allocations in arch/ code that use
memblock_phys_alloc() to allocate memory, convert the returned physical
address to the virtual address and then set the allocated memory to zero.

Exactly the same behaviour can be achieved simply by calling
memblock_alloc(): it allocates the memory in the same way as
memblock_phys_alloc(), then it performs the phys_to_virt() conversion and
clears the allocated memory.

Replace the longer sequence with a simpler call to memblock_alloc().
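
Schematically, the pattern being replaced and its replacement (size and
align stand for whatever the call site passes):

	/* before: three steps */
	phys = memblock_phys_alloc(size, align);
	ptr  = __va(phys);
	memset(ptr, 0, size);

	/* after: one call that returns zeroed memory at a virtual address */
	ptr = memblock_alloc(size, align);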

Signed-off-by: Mike Rapoport 
---
 arch/arm/mm/mmu.c |  4 +---
 arch/c6x/mm/dma-coherent.c|  9 ++---
 arch/nds32/mm/init.c  | 12 
 arch/powerpc/kernel/setup-common.c|  4 ++--
 arch/powerpc/mm/ppc_mmu_32.c  |  3 +--
 arch/powerpc/platforms/powernv/opal.c |  3 +--
 arch/s390/numa/numa.c |  6 +-
 arch/sparc/kernel/prom_64.c   |  7 ++-
 arch/sparc/mm/init_64.c   |  9 +++--
 arch/unicore32/mm/mmu.c   |  4 +---
 10 files changed, 18 insertions(+), 43 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index f5cc1cc..0a04c9a5 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -721,9 +721,7 @@ EXPORT_SYMBOL(phys_mem_access_prot);
 
 static void __init *early_alloc_aligned(unsigned long sz, unsigned long align)
 {
-   void *ptr = __va(memblock_phys_alloc(sz, align));
-   memset(ptr, 0, sz);
-   return ptr;
+   return memblock_alloc(sz, align);
 }
 
 static void __init *early_alloc(unsigned long sz)
diff --git a/arch/c6x/mm/dma-coherent.c b/arch/c6x/mm/dma-coherent.c
index 75b7957..0be2898 100644
--- a/arch/c6x/mm/dma-coherent.c
+++ b/arch/c6x/mm/dma-coherent.c
@@ -121,8 +121,6 @@ void arch_dma_free(struct device *dev, size_t size, void 
*vaddr,
  */
 void __init coherent_mem_init(phys_addr_t start, u32 size)
 {
-   phys_addr_t bitmap_phys;
-
if (!size)
return;
 
@@ -138,11 +136,8 @@ void __init coherent_mem_init(phys_addr_t start, u32 size)
if (dma_size & (PAGE_SIZE - 1))
++dma_pages;
 
-   bitmap_phys = memblock_phys_alloc(BITS_TO_LONGS(dma_pages) * 
sizeof(long),
- sizeof(long));
-
-   dma_bitmap = phys_to_virt(bitmap_phys);
-   memset(dma_bitmap, 0, dma_pages * PAGE_SIZE);
+   dma_bitmap = memblock_alloc(BITS_TO_LONGS(dma_pages) * sizeof(long),
+   sizeof(long));
 }
 
 static void c6x_dma_sync(struct device *dev, phys_addr_t paddr, size_t size,
diff --git a/arch/nds32/mm/init.c b/arch/nds32/mm/init.c
index 253f79f..d1e521c 100644
--- a/arch/nds32/mm/init.c
+++ b/arch/nds32/mm/init.c
@@ -78,8 +78,7 @@ static void __init map_ram(void)
}
 
/* Alloc one page for holding PTE's... */
-   pte = (pte_t *) __va(memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE));
-   memset(pte, 0, PAGE_SIZE);
+   pte = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
set_pmd(pme, __pmd(__pa(pte) + _PAGE_KERNEL_TABLE));
 
/* Fill the newly allocated page with PTE'S */
@@ -111,8 +110,7 @@ static void __init fixedrange_init(void)
pgd = swapper_pg_dir + pgd_index(vaddr);
pud = pud_offset(pgd, vaddr);
pmd = pmd_offset(pud, vaddr);
-   fixmap_pmd_p = (pmd_t *) __va(memblock_phys_alloc(PAGE_SIZE, 
PAGE_SIZE));
-   memset(fixmap_pmd_p, 0, PAGE_SIZE);
+   fixmap_pmd_p = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
set_pmd(pmd, __pmd(__pa(fixmap_pmd_p) + _PAGE_KERNEL_TABLE));
 
 #ifdef CONFIG_HIGHMEM
@@ -124,8 +122,7 @@ static void __init fixedrange_init(void)
pgd = swapper_pg_dir + pgd_index(vaddr);
pud = pud_offset(pgd, vaddr);
pmd = pmd_offset(pud, vaddr);
-   pte = (pte_t *) __va(memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE));
-   memset(pte, 0, PAGE_SIZE);
+   pte = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
set_pmd(pmd, __pmd(__pa(pte) + _PAGE_KERNEL_TABLE));
pkmap_page_table = pte;
 #endif /* CONFIG_HIGHMEM */
@@ -150,8 +147,7 @@ void __init paging_init(void)
fixedrange_init();
 
/* allocate space for empty_zero_page */
-   zero_page = __va(memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE));
-   memset(zero_page, 0, PAGE_SIZE);
+   zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
zone_sizes_init();
 
empty_zero_page = virt_to_page(zero_page);
diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index ca00fbb..82be48c 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -459,8 +459,8 @@ void __init smp_setup_cpu_maps(void)
 
DBG("smp_setup_cpu_maps()\n");
 
-   cpu_to_phys_id = __va(memblock_phys_alloc(nr_cpu_ids * sizeof(u32), 
__alignof__(u32)));
-   memset(cpu_to_phys_id, 0, nr_cpu_ids * sizeof(u32));
+   cpu_to_phys_id = memblock_alloc(nr_cpu_ids *

[PATCH v4 4/6] openrisc: simplify pte_alloc_one_kernel()

2018-12-31 Thread Mike Rapoport
The pte_alloc_one_kernel() function allocates a page using
__get_free_page(GFP_KERNEL) when mm initialization is complete and
memblock_phys_alloc() during the earlier stages. The physical address of
the page allocated with memblock_phys_alloc() is converted to a virtual
address and, in both cases, the allocated page is cleared using
clear_page().

The code is simplified by replacing __get_free_page() with
get_zeroed_page() and by replacing memblock_phys_alloc() with
memblock_alloc().

Signed-off-by: Mike Rapoport 
Acked-by: Stafford Horne 
---
 arch/openrisc/mm/ioremap.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/openrisc/mm/ioremap.c b/arch/openrisc/mm/ioremap.c
index c969752..cfef989 100644
--- a/arch/openrisc/mm/ioremap.c
+++ b/arch/openrisc/mm/ioremap.c
@@ -123,13 +123,10 @@ pte_t __ref *pte_alloc_one_kernel(struct mm_struct *mm,
 {
pte_t *pte;
 
-   if (likely(mem_init_done)) {
-   pte = (pte_t *) __get_free_page(GFP_KERNEL);
-   } else {
-   pte = (pte_t *) __va(memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE));
-   }
+   if (likely(mem_init_done))
+   pte = (pte_t *)get_zeroed_page(GFP_KERNEL);
+   else
+   pte = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
 
-   if (pte)
-   clear_page(pte);
return pte;
 }
-- 
2.7.4



[PATCH v4 3/6] sh: prefer memblock APIs returning virtual address

2018-12-31 Thread Mike Rapoport
Rather than use memblock_alloc_base(), which returns a physical address
that then has to be converted to a virtual one, use the appropriate
memblock function that returns a virtual address.

There is a small functional change in the allocation of NODE_DATA().
Instead of panicking if the local allocation fails, a non-local
allocation attempt will be made.

Signed-off-by: Mike Rapoport 
---
 arch/sh/mm/init.c | 18 +-
 arch/sh/mm/numa.c |  5 ++---
 2 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index a8e5c0e..a0fa4de 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -192,24 +192,16 @@ void __init page_table_range_init(unsigned long start, 
unsigned long end,
 void __init allocate_pgdat(unsigned int nid)
 {
unsigned long start_pfn, end_pfn;
-#ifdef CONFIG_NEED_MULTIPLE_NODES
-   unsigned long phys;
-#endif
 
get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
 
 #ifdef CONFIG_NEED_MULTIPLE_NODES
-   phys = __memblock_alloc_base(sizeof(struct pglist_data),
-   SMP_CACHE_BYTES, end_pfn << PAGE_SHIFT);
-   /* Retry with all of system memory */
-   if (!phys)
-   phys = __memblock_alloc_base(sizeof(struct pglist_data),
-   SMP_CACHE_BYTES, 
memblock_end_of_DRAM());
-   if (!phys)
+   NODE_DATA(nid) = memblock_alloc_try_nid_nopanic(
+   sizeof(struct pglist_data),
+   SMP_CACHE_BYTES, MEMBLOCK_LOW_LIMIT,
+   MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+   if (!NODE_DATA(nid))
panic("Can't allocate pgdat for node %d\n", nid);
-
-   NODE_DATA(nid) = __va(phys);
-   memset(NODE_DATA(nid), 0, sizeof(struct pglist_data));
 #endif
 
NODE_DATA(nid)->node_start_pfn = start_pfn;
diff --git a/arch/sh/mm/numa.c b/arch/sh/mm/numa.c
index 830e8b3..c4bde61 100644
--- a/arch/sh/mm/numa.c
+++ b/arch/sh/mm/numa.c
@@ -41,9 +41,8 @@ void __init setup_bootmem_node(int nid, unsigned long start, 
unsigned long end)
__add_active_range(nid, start_pfn, end_pfn);
 
/* Node-local pgdat */
-   NODE_DATA(nid) = __va(memblock_alloc_base(sizeof(struct pglist_data),
-SMP_CACHE_BYTES, end));
-   memset(NODE_DATA(nid), 0, sizeof(struct pglist_data));
+   NODE_DATA(nid) = memblock_alloc_node(sizeof(struct pglist_data),
+SMP_CACHE_BYTES, nid);
 
NODE_DATA(nid)->node_start_pfn = start_pfn;
NODE_DATA(nid)->node_spanned_pages = end_pfn - start_pfn;
-- 
2.7.4



[PATCH v4 2/6] microblaze: prefer memblock API returning virtual address

2018-12-31 Thread Mike Rapoport
Rather than use memblock_alloc_base(), which returns a physical address
that then has to be converted to a virtual one, use the appropriate
memblock function that returns a virtual address.

Signed-off-by: Mike Rapoport 
Tested-by: Michal Simek 
---
 arch/microblaze/mm/init.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
index b17fd8a..44f4b89 100644
--- a/arch/microblaze/mm/init.c
+++ b/arch/microblaze/mm/init.c
@@ -363,8 +363,9 @@ void __init *early_get_page(void)
 * Mem start + kernel_tlb -> here is limit
 * because of mem mapping from head.S
 */
-   return __va(memblock_alloc_base(PAGE_SIZE, PAGE_SIZE,
-   memory_start + kernel_tlb));
+   return memblock_alloc_try_nid_raw(PAGE_SIZE, PAGE_SIZE,
+   MEMBLOCK_LOW_LIMIT, memory_start + kernel_tlb,
+   NUMA_NO_NODE);
 }
 
 #endif /* CONFIG_MMU */
-- 
2.7.4



[PATCH v4 1/6] powerpc: prefer memblock APIs returning virtual address

2018-12-31 Thread Mike Rapoport
There are several places that allocate memory using memblock APIs that
return a physical address, convert the returned address to the virtual
address and frequently also memset(0) the allocated range.

Update these places to use memblock allocators already returning a virtual
address. Use memblock functions that clear the allocated memory instead of
calling memset(0) where appropriate.

The calls to memblock_alloc_base() that were not followed by memset(0) are
replaced with memblock_alloc_try_nid_raw(). Since the latter does not
panic() when the allocation fails, the appropriate panic() calls are added
to the call sites.

Signed-off-by: Mike Rapoport 
---
 arch/powerpc/kernel/paca.c | 16 ++--
 arch/powerpc/kernel/setup_64.c | 24 ++--
 arch/powerpc/mm/hash_utils_64.c|  6 +++---
 arch/powerpc/mm/pgtable-book3e.c   |  8 ++--
 arch/powerpc/mm/pgtable-book3s64.c |  5 +
 arch/powerpc/mm/pgtable-radix.c| 25 +++--
 arch/powerpc/platforms/pasemi/iommu.c  |  5 +++--
 arch/powerpc/platforms/pseries/setup.c | 18 ++
 arch/powerpc/sysdev/dart_iommu.c   |  7 +--
 9 files changed, 51 insertions(+), 63 deletions(-)

diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 913bfca..276d36d4 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -27,7 +27,7 @@
 static void *__init alloc_paca_data(unsigned long size, unsigned long align,
unsigned long limit, int cpu)
 {
-   unsigned long pa;
+   void *ptr;
int nid;
 
/*
@@ -42,17 +42,15 @@ static void *__init alloc_paca_data(unsigned long size, 
unsigned long align,
nid = early_cpu_to_node(cpu);
}
 
-   pa = memblock_alloc_base_nid(size, align, limit, nid, MEMBLOCK_NONE);
-   if (!pa) {
-   pa = memblock_alloc_base(size, align, limit);
-   if (!pa)
-   panic("cannot allocate paca data");
-   }
+   ptr = memblock_alloc_try_nid(size, align, MEMBLOCK_LOW_LIMIT,
+limit, nid);
+   if (!ptr)
+   panic("cannot allocate paca data");
 
if (cpu == boot_cpuid)
memblock_set_bottom_up(false);
 
-   return __va(pa);
+   return ptr;
 }
 
 #ifdef CONFIG_PPC_PSERIES
@@ -118,7 +116,6 @@ static struct slb_shadow * __init new_slb_shadow(int cpu, 
unsigned long limit)
}
 
s = alloc_paca_data(sizeof(*s), L1_CACHE_BYTES, limit, cpu);
-   memset(s, 0, sizeof(*s));
 
s->persistent = cpu_to_be32(SLB_NUM_BOLTED);
s->buffer_length = cpu_to_be32(sizeof(*s));
@@ -222,7 +219,6 @@ void __init allocate_paca(int cpu)
paca = alloc_paca_data(sizeof(struct paca_struct), L1_CACHE_BYTES,
limit, cpu);
paca_ptrs[cpu] = paca;
-   memset(paca, 0, sizeof(struct paca_struct));
 
initialise_paca(paca, cpu);
 #ifdef CONFIG_PPC_PSERIES
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 236c115..3dcd779 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -634,19 +634,17 @@ __init u64 ppc64_bolted_size(void)
 
 static void *__init alloc_stack(unsigned long limit, int cpu)
 {
-   unsigned long pa;
+   void *ptr;
 
BUILD_BUG_ON(STACK_INT_FRAME_SIZE % 16);
 
-   pa = memblock_alloc_base_nid(THREAD_SIZE, THREAD_SIZE, limit,
-   early_cpu_to_node(cpu), MEMBLOCK_NONE);
-   if (!pa) {
-   pa = memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit);
-   if (!pa)
-   panic("cannot allocate stacks");
-   }
+   ptr = memblock_alloc_try_nid(THREAD_SIZE, THREAD_SIZE,
+MEMBLOCK_LOW_LIMIT, limit,
+early_cpu_to_node(cpu));
+   if (!ptr)
+   panic("cannot allocate stacks");
 
-   return __va(pa);
+   return ptr;
 }
 
 void __init irqstack_early_init(void)
@@ -739,20 +737,17 @@ void __init emergency_stack_init(void)
struct thread_info *ti;
 
ti = alloc_stack(limit, i);
-   memset(ti, 0, THREAD_SIZE);
emerg_stack_init_thread_info(ti, i);
paca_ptrs[i]->emergency_sp = (void *)ti + THREAD_SIZE;
 
 #ifdef CONFIG_PPC_BOOK3S_64
/* emergency stack for NMI exception handling. */
ti = alloc_stack(limit, i);
-   memset(ti, 0, THREAD_SIZE);
emerg_stack_init_thread_info(ti, i);
paca_ptrs[i]->nmi_emergency_sp = (void *)ti + THREAD_SIZE;
 
/* emergency stack for machine check exception handling. */
ti = alloc_stack(limit, i);
-   memset(ti, 0, THREAD_SIZE);
emerg_stack_init_thread_info(ti, i);
   

[PATCH v4 0/6] memblock: simplify several early memory allocations

2018-12-31 Thread Mike Rapoport
Hi,

These patches simplify some of the early memory allocations by replacing
usage of older memblock APIs with newer and shinier ones.

Quite a few places in the arch/ code allocated memory using a memblock API
that returns a physical address of the allocated area, then converted this
physical address to a virtual one and then used memset(0) to clear the
allocated range.

More recent memblock APIs do all the three steps in one call and their
usage simplifies the code.

It's important to note that regardless of API used, the core allocation is
nearly identical for any set of memblock allocators: first it tries to find
free memory with all the constraints specified by the caller and then
falls back to the allocation with some or all constraints disabled.
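
Roughly, and much simplified (a sketch of that fallback, not the real
memblock internals):

	/* try with every caller constraint (size, alignment, range, node) ... */
	found = memblock_find_in_range_node(size, align, start, end, nid, flags);

	/* ... then retry with constraints relaxed, e.g. any node */
	if (!found && nid != NUMA_NO_NODE)
		found = memblock_find_in_range_node(size, align, start, end,
						    NUMA_NO_NODE, flags);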

The first three patches perform the conversion of call sites that have
exact requirements for the node and the possible memory range.

The fourth patch is a bit one-off as it simplifies openrisc's
implementation of pte_alloc_one_kernel(), and not only the memblock usage.

The fifth patch takes care of simpler cases when the allocation can be
satisfied with a simple call to memblock_alloc().

The sixth patch removes one-liner wrappers for memblock_alloc on arm and
unicore32, as suggested by Christoph.

v4:
* rebased on the current upstream
* added conversion of s390 node data allocation

v3:
* added Tested-by from Michal Simek for microblaze changes
* updated powerpc changes as per Michael Ellerman comments:
  - use allocations that clear memory in alloc_paca_data() and alloc_stack()
  - ensure the replacement is equivalent to old API

v2:
* added Ack from Stafford Horne for openrisc changes
* entirely drop early_alloc wrappers on arm and unicore32, as per Christoph
Hellwig

Mike Rapoport (6):
  powerpc: prefer memblock APIs returning virtual address
  microblaze: prefer memblock API returning virtual address
  sh: prefer memblock APIs returning virtual address
  openrisc: simplify pte_alloc_one_kernel()
  arch: simplify several early memory allocations
  arm, s390, unicore32: remove oneliner wrappers for memblock_alloc()

 arch/arm/mm/mmu.c  | 13 +++--
 arch/c6x/mm/dma-coherent.c |  9 ++---
 arch/microblaze/mm/init.c  |  5 +++--
 arch/nds32/mm/init.c   | 12 
 arch/openrisc/mm/ioremap.c | 11 ---
 arch/powerpc/kernel/paca.c | 16 ++--
 arch/powerpc/kernel/setup-common.c |  4 ++--
 arch/powerpc/kernel/setup_64.c | 24 ++--
 arch/powerpc/mm/hash_utils_64.c|  6 +++---
 arch/powerpc/mm/pgtable-book3e.c   |  8 ++--
 arch/powerpc/mm/pgtable-book3s64.c |  5 +
 arch/powerpc/mm/pgtable-radix.c| 25 +++--
 arch/powerpc/mm/ppc_mmu_32.c   |  3 +--
 arch/powerpc/platforms/pasemi/iommu.c  |  5 +++--
 arch/powerpc/platforms/powernv/opal.c  |  3 +--
 arch/powerpc/platforms/pseries/setup.c | 18 ++
 arch/powerpc/sysdev/dart_iommu.c   |  7 +--
 arch/s390/numa/numa.c  | 14 +-
 arch/sh/mm/init.c  | 18 +-
 arch/sh/mm/numa.c  |  5 ++---
 arch/sparc/kernel/prom_64.c|  7 ++-
 arch/sparc/mm/init_64.c|  9 +++--
 arch/unicore32/mm/mmu.c| 14 --
 23 files changed, 88 insertions(+), 153 deletions(-)

-- 
2.7.4