[PATCH] KVM: PPC: Replace zero-length array with flexible array member

2021-09-18 Thread Len Baker
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use "flexible array members" [1] for these cases. The
older style of one-element or zero-length arrays should no longer be
used[2].

Also, make use of the struct_size() helper in kzalloc().

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] 
https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Len Baker 
---
 arch/powerpc/include/asm/kvm_host.h | 2 +-
 arch/powerpc/kvm/book3s_64_vio.c| 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 080a7feb7731..3aed653373a5 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -190,7 +190,7 @@ struct kvmppc_spapr_tce_table {
u64 size;   /* window size in pages */
struct list_head iommu_tables;
struct mutex alloc_lock;
-   struct page *pages[0];
+   struct page *pages[];
 };

 /* XICS components, defined in book3s_xics.c */
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index 6365087f3160..d42b4b6d4a79 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -295,8 +295,7 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
return ret;

ret = -ENOMEM;
-   stt = kzalloc(sizeof(*stt) + npages * sizeof(struct page *),
- GFP_KERNEL);
+   stt = kzalloc(struct_size(stt, pages, npages), GFP_KERNEL);
if (!stt)
goto fail_acct;

--
2.25.1



Re: [PATCH v8 2/3] tty: hvc: pass DMA capable memory to put_chars()

2021-09-18 Thread Xianting Tian

thanks Greg, I will submit v9 patch for reviewing.

Before, I was waiting for a new reply:(

在 2021/9/18 下午8:40, Greg KH 写道:

On Sat, Sep 18, 2021 at 08:32:01PM +0800, Xianting Tian wrote:

hi

Will you consider to continue the disscussion of this patch? thanks

I do not see a newer version of this series.

thanks,

greg k-h


Re: [PATCH v8 2/3] tty: hvc: pass DMA capable memory to put_chars()

2021-09-18 Thread Greg KH
On Sat, Sep 18, 2021 at 08:32:01PM +0800, Xianting Tian wrote:
> hi
> 
> Will you consider to continue the disscussion of this patch? thanks

I do not see a newer version of this series.

thanks,

greg k-h


Re: [PATCH v8 2/3] tty: hvc: pass DMA capable memory to put_chars()

2021-09-18 Thread Xianting Tian

hi

Will you consider to continue the disscussion of this patch? thanks

在 2021/8/20 下午4:43, Xianting TIan 写道:


在 2021/8/20 下午2:49, Daniel Axtens 写道:

Xianting Tian  writes:


As well known, hvc backend driver(eg, virtio-console) can register its
operations to hvc framework. The operations can contain put_chars(),
get_chars() and so on.

Some hvc backend may do dma in its operations. eg, put_chars() of
virtio-console. But in the code of hvc framework, it may pass DMA
incapable memory to put_chars() under a specific configuration, which
is explained in commit c4baad5029(virtio-console: avoid DMA from 
stack):

We could also run into issues on powerpc where Andrew is working on
adding vmap-stack but the opal hvc driver assumes that it is passed a
buffer which is not in vmalloc space but in the linear mapping. So it
would be good to fix this (or more clearly document what drivers can
expect).


1, c[] is on stack,
    hvc_console_print():
char c[N_OUTBUF] __ALIGNED__;
cons_ops[index]->put_chars(vtermnos[index], c, i);
2, ch is on stack,
    static void hvc_poll_put_char(,,char ch)
    {
struct tty_struct *tty = driver->ttys[0];
struct hvc_struct *hp = tty->driver_data;
int n;

do {
    n = hp->ops->put_chars(hp->vtermno, , 1);
} while (n <= 0);
    }

Commit c4baad5029 is just the fix to avoid DMA from stack memory, which
is passed to virtio-console by hvc framework in above code. But I think
the fix is aggressive, it directly uses kmemdup() to alloc new buffer
from kmalloc area and do memcpy no matter the memory is in kmalloc area
or not. But most importantly, it should better be fixed in the hvc
framework, by changing it to never pass stack memory to the put_chars()
function in the first place. Otherwise, we still face the same issue if
a new hvc backend using dma added in the future.

In this patch, we make 'char out_buf[N_OUTBUF]' and 'chat out_ch' part
of 'struct hvc_struct', so both two buf are no longer the stack memory.
we can use it in above two cases separately.

Introduce another array(cons_outbufs[]) for buffer pointers next to
the cons_ops[] and vtermnos[] arrays. With the array, we can easily 
find

the buffer, instead of traversing hp list.

With the patch, we can remove the fix c4baad5029.

Signed-off-by: Xianting Tian 
Reviewed-by: Shile Zhang 
  struct hvc_struct {
  struct tty_port port;
  spinlock_t lock;
  int index;
  int do_wakeup;
-    char *outbuf;
-    int outbuf_size;
  int n_outbuf;
  uint32_t vtermno;
  const struct hv_ops *ops;
@@ -48,6 +56,10 @@ struct hvc_struct {
  struct work_struct tty_resize;
  struct list_head next;
  unsigned long flags;
+    char out_ch;
+    char out_buf[N_OUTBUF] __ALIGNED__;
+    int outbuf_size;
+    char outbuf[0] __ALIGNED__;

I'm trying to understand this patch but I am finding it very difficult
to understand what the difference between `out_buf` and `outbuf`
(without the underscore) is supposed to be. `out_buf` is statically
sized and the size of `outbuf` is supposed to depend on the arguments to
hvc_alloc(), but I can't quite figure out what the roles of each one are
and their names are confusingly similiar!

I looked briefly at the older revisions of the series but it didn't make
things much clearer.

Could you give them clearer names?


thanks for the comments,

It is indeed not easy to understand by the name. I will change it to a 
proper name if we have next version patch.


Jiri Slaby is worring about the performance, because we need add two 
locks to protect 'out_ch' and 'out_buf' separately, the origin 
on-stack buffer is lockless.


I don't know whether this solution can be accepted, just waiting for 
Jiri's further commtents.




Also, looking at Documentation/process/deprecated.rst, it looks like
maybe we want to use a 'flexible array member' instead:

.. note:: If you are using struct_size() on a structure containing a 
zero-length
 or a one-element array as a trailing array member, please 
refactor such

 array usage and switch to a `flexible array member
 <#zero-length-and-one-element-arrays>`_ instead.

I think we want:

thanks, we should use [], not [0].



+    char outbuf[] __ALIGNED__;

Kind regards,
Daniel


Re: [PATCH v5 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance

2021-09-18 Thread Vincent Guittot
On Fri, 17 Sept 2021 at 20:47, Peter Zijlstra  wrote:
>
> On Fri, Sep 17, 2021 at 05:25:02PM +0200, Vincent Guittot wrote:
>
> > With the removal of the condition !sds->local_stat.sum_nr_running
> > which seems useless because dst_cpu is idle and not SMT, this patch
> > looks good to me
>
> I've made it look like this. Thanks!

Reviewed-by: Vincent Guittot 

Thanks

>
> ---
> Subject: sched/fair: Consider SMT in ASYM_PACKING load balance
> From: Ricardo Neri 
> Date: Fri, 10 Sep 2021 18:18:19 -0700
>
> From: Ricardo Neri 
>
> When deciding to pull tasks in ASYM_PACKING, it is necessary not only to
> check for the idle state of the destination CPU, dst_cpu, but also of
> its SMT siblings.
>
> If dst_cpu is idle but its SMT siblings are busy, performance suffers
> if it pulls tasks from a medium priority CPU that does not have SMT
> siblings.
>
> Implement asym_smt_can_pull_tasks() to inspect the state of the SMT
> siblings of both dst_cpu and the CPUs in the candidate busiest group.
>
> Signed-off-by: Ricardo Neri 
> Signed-off-by: Peter Zijlstra (Intel) 
> Reviewed-by: Joel Fernandes (Google) 
> Reviewed-by: Len Brown 
> Link: 
> https://lkml.kernel.org/r/20210911011819.12184-7-ricardo.neri-calde...@linux.intel.com
> ---
>  kernel/sched/fair.c |   92 
> 
>  1 file changed, 92 insertions(+)
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8538,10 +8538,96 @@ group_type group_classify(unsigned int i
> return group_has_spare;
>  }
>
> +/**
> + * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull 
> tasks
> + * @dst_cpu:   Destination CPU of the load balancing
> + * @sds:   Load-balancing data with statistics of the local group
> + * @sgs:   Load-balancing statistics of the candidate busiest group
> + * @sg:The candidate busiest group
> + *
> + * Check the state of the SMT siblings of both @sds::local and @sg and decide
> + * if @dst_cpu can pull tasks.
> + *
> + * If @dst_cpu does not have SMT siblings, it can pull tasks if two or more 
> of
> + * the SMT siblings of @sg are busy. If only one CPU in @sg is busy, pull 
> tasks
> + * only if @dst_cpu has higher priority.
> + *
> + * If both @dst_cpu and @sg have SMT siblings, and @sg has exactly one more
> + * busy CPU than @sds::local, let @dst_cpu pull tasks if it has higher 
> priority.
> + * Bigger imbalances in the number of busy CPUs will be dealt with in
> + * update_sd_pick_busiest().
> + *
> + * If @sg does not have SMT siblings, only pull tasks if all of the SMT 
> siblings
> + * of @dst_cpu are idle and @sg has lower priority.
> + */
> +static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> +   struct sg_lb_stats *sgs,
> +   struct sched_group *sg)
> +{
> +#ifdef CONFIG_SCHED_SMT
> +   bool local_is_smt, sg_is_smt;
> +   int sg_busy_cpus;
> +
> +   local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
> +   sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
> +
> +   sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
> +
> +   if (!local_is_smt) {
> +   /*
> +* If we are here, @dst_cpu is idle and does not have SMT
> +* siblings. Pull tasks if candidate group has two or more
> +* busy CPUs.
> +*/
> +   if (sg_busy_cpus >= 2) /* implies sg_is_smt */
> +   return true;
> +
> +   /*
> +* @dst_cpu does not have SMT siblings. @sg may have SMT
> +* siblings and only one is busy. In such case, @dst_cpu
> +* can help if it has higher priority and is idle (i.e.,
> +* it has no running tasks).
> +*/
> +   return sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> +   }
> +
> +   /* @dst_cpu has SMT siblings. */
> +
> +   if (sg_is_smt) {
> +   int local_busy_cpus = sds->local->group_weight -
> + sds->local_stat.idle_cpus;
> +   int busy_cpus_delta = sg_busy_cpus - local_busy_cpus;
> +
> +   if (busy_cpus_delta == 1)
> +   return sched_asym_prefer(dst_cpu, 
> sg->asym_prefer_cpu);
> +
> +   return false;
> +   }
> +
> +   /*
> +* @sg does not have SMT siblings. Ensure that @sds::local does not 
> end
> +* up with more than one busy SMT sibling and only pull tasks if there
> +* are not busy CPUs (i.e., no CPU has running tasks).
> +*/
> +   if (!sds->local_stat.sum_nr_running)
> +   return sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> +
> +   return false;
> +#else
> +   /* Always return false so that callers deal with non-SMT cases. */
> +   return false;
> +#endif
> +}
> +
>  static inline bool
>  sched_asym(struct lb_env *env, struct 

[PATCH] powerpc/476: Fix sparse report

2021-09-18 Thread Christophe Leroy
arch/powerpc/platforms/44x/ppc476.c:236:17: warning: cast removes 
address space '__iomem' of expression
arch/powerpc/platforms/44x/ppc476.c:241:34: warning: incorrect type in 
argument 1 (different address spaces)
arch/powerpc/platforms/44x/ppc476.c:241:34:expected void const 
volatile [noderef] __iomem *addr
arch/powerpc/platforms/44x/ppc476.c:241:34:got unsigned char 
[usertype] *
arch/powerpc/platforms/44x/ppc476.c:243:17: warning: incorrect type in 
argument 1 (different address spaces)
arch/powerpc/platforms/44x/ppc476.c:243:17:expected void volatile 
[noderef] __iomem *addr
arch/powerpc/platforms/44x/ppc476.c:243:17:got unsigned char 
[usertype] *[assigned] fpga

Mark 'fpga' pointer as __iomem.

Reported-by: kernel test robot 
Fixes: ab9a4183fddf ("powerpc: Update currituck pci/usb fixup for new board 
revision")
Cc: Alistair Popple 
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/platforms/44x/ppc476.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/44x/ppc476.c 
b/arch/powerpc/platforms/44x/ppc476.c
index 07f7e3ce67b5..fb7db5cedd4e 100644
--- a/arch/powerpc/platforms/44x/ppc476.c
+++ b/arch/powerpc/platforms/44x/ppc476.c
@@ -219,7 +219,7 @@ static int board_rev = -1;
 static int __init ppc47x_get_board_rev(void)
 {
int reg;
-   u8 *fpga;
+   u8 __iomem *fpga;
struct device_node *np = NULL;
 
if (of_machine_is_compatible("ibm,currituck")) {
@@ -233,7 +233,7 @@ static int __init ppc47x_get_board_rev(void)
if (!np)
goto fail;
 
-   fpga = (u8 *) of_iomap(np, 0);
+   fpga = of_iomap(np, 0);
of_node_put(np);
if (!fpga)
goto fail;
-- 
2.31.1



Re: [PATCH v2] powerpc/32: Don't use a struct based type for pte_t

2021-09-18 Thread Christophe Leroy




Le 17/09/2021 à 16:32, David Laight a écrit :

From: Christophe Leroy

Sent: 17 September 2021 14:58

Long time ago we had a config item called STRICT_MM_TYPECHECKS
to build the kernel with pte_t defined as a structure in order
to perform additional build checks or build it with pte_t
defined as a simple type in order to get simpler generated code.


...

diff --git a/arch/powerpc/include/asm/pgtable-types.h 
b/arch/powerpc/include/asm/pgtable-types.h
index d11b4c61d686..c60199fc6fa6 100644
--- a/arch/powerpc/include/asm/pgtable-types.h
+++ b/arch/powerpc/include/asm/pgtable-types.h
@@ -5,14 +5,26 @@
  /* PTE level */
  #if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
  typedef struct { pte_basic_t pte, pte1, pte2, pte3; } pte_t;
-#else
+#elif defined(__CHECKER__) || !defined(CONFIG_PPC32)
  typedef struct { pte_basic_t pte; } pte_t;
+#else
+typedef pte_basic_t pte_t;
  #endif
+
+#if defined(__CHECKER__) || !defined(CONFIG_PPC32) || \
+(defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES))
  #define __pte(x)  ((pte_t) { (x) })
  static inline pte_basic_t pte_val(pte_t x)
  {
return x.pte;
  }
+#else
+#define __pte(x)   ((pte_t)(x))
+static inline pte_basic_t pte_val(pte_t x)
+{
+   return x;
+}
+#endif


Would it be better to define:
static inline pte_basic_*pte_basic(pte_t *x)
{
#if xxx
return x;
#else
return >pte;
#endif
}

Then pte_val(x) is always *pt_basic(x)
and the casts like:


-   pte_basic_t *entry = >pte;
+   pte_basic_t *entry = (pte_basic_t *)ptep;


can go away.




I don't like that idea too much, because it means going through a 
pointer of something which is not in memory at the begining. Doing that 
forces GCC to put the pte_t object on stack. And that's exactly the 
purpose of this patch: avoid having to go via memory. Allthough recent 
versions of GCC optimise it away at the end, I don't like it in principle.


And the only two places (pte_update() and set_huge_pte_at() where this 
cast is required are really two places very special which deal with real 
page tables. So IMHO it makes sense to explicitely show that what we use 
is the address of the entry in the page table.


Christophe


Re: [PATCH v2] powerpc/32: Don't use a struct based type for pte_t

2021-09-18 Thread Christophe Leroy




Le 18/09/2021 à 05:26, Michael Ellerman a écrit :

Christophe Leroy  writes:

Long time ago we had a config item called STRICT_MM_TYPECHECKS
to build the kernel with pte_t defined as a structure in order
to perform additional build checks or build it with pte_t
defined as a simple type in order to get simpler generated code.

Commit 670eea924198 ("powerpc/mm: Always use STRICT_MM_TYPECHECKS")
made the struct based definition the only one, considering that the
generated code was similar in both cases.

That's right on ppc64 because the ABI is such that the content of a
struct having a single simple type element is passed as register,
but on ppc32 such a structure is passed via the stack like any
structure.

Simple test function:

pte_t test(pte_t pte)
{
return pte;
}

Before this patch we get

c00108ec :
c00108ec:   81 24 00 00 lwz r9,0(r4)
c00108f0:   91 23 00 00 stw r9,0(r3)
c00108f4:   4e 80 00 20 blr

So, for PPC32, restore the simple type behaviour we got before
commit 670eea924198, but instead of adding a config option to
activate type check, do it when __CHECKER__ is set so that type
checking is performed by 'sparse' and provides feedback like:

arch/powerpc/mm/pgtable.c:466:16: warning: incorrect type in return 
expression (different base types)
arch/powerpc/mm/pgtable.c:466:16:expected unsigned long
arch/powerpc/mm/pgtable.c:466:16:got struct pte_t [usertype] x


OK that's a good trade off.

One question below ...


diff --git a/arch/powerpc/include/asm/pgtable-types.h 
b/arch/powerpc/include/asm/pgtable-types.h
index d11b4c61d686..c60199fc6fa6 100644
--- a/arch/powerpc/include/asm/pgtable-types.h
+++ b/arch/powerpc/include/asm/pgtable-types.h
@@ -5,14 +5,26 @@
  /* PTE level */
  #if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
  typedef struct { pte_basic_t pte, pte1, pte2, pte3; } pte_t;
-#else
+#elif defined(__CHECKER__) || !defined(CONFIG_PPC32)


It would be nicer if this logic was in Kconfig.

eg. restore config STRICT_MM_TYPECHECKS but make it always enabled for
64-bit, and depend on CHECKER for 32-bit.

The only thing is I'm not sure if we can test __CHECKER__ in Kconfig?



I think Kconfig doesn't see __CHECKER__, otherwise it would mean that a 
build may get different whether you build with C=1/2 or not.


__CHECKER__ is a define added by sparse when doing the CHECK on a per 
object basis.


What I can do is to add:

#if defined(__CHECKER__) || !defined(CONFIG_PPC32)
#define STRICT_MM_TYPECHECKS
#endif

Christophe