Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Jakob Koschel



> On 3. Mar 2022, at 05:58, David Laight  wrote:
> 
> From: Xiaomeng Tong
>> Sent: 03 March 2022 02:27
>> 
>> On Wed, 2 Mar 2022 14:04:06 +, David Laight
>>  wrote:
>>> I think that it would be better to make any alternate loop macro
>>> just set the variable to NULL on the loop exit.
>>> That is easier to code for and the compiler might be persuaded to
>>> not redo the test.
>> 
>> No, that would lead to a NULL dereference.
> 
> Why, it would make it b ethe same as the 'easy to use':
>   for (item = head; item; item = item->next) {
>   ...
>   if (...)
>   break;
>   ...
>   }
>   if (!item)
>   return;
> 
>> The problem is the mis-use of iterator outside the loop on exit, and
>> the iterator will be the HEAD's container_of pointer which pointers
>> to a type-confused struct. Sidenote: The *mis-use* here refers to
>> mistakely access to other members of the struct, instead of the
>> list_head member which acutally is the valid HEAD.
> 
> The problem is that the HEAD's container_of pointer should never
> be calculated at all.
> This is what is fundamentally broken about the current definition.
> 
>> IOW, you would dereference a (NULL + offset_of_member) address here.
> 
> Where?
> 
>> Please remind me if i missed something, thanks.
>> 
>> Can you share your "alternative definitions" details? thanks!
> 
> The loop should probably use as extra variable that points
> to the 'list node' in the next structure.
> Something like:
>   for (xxx *iter = head->next;
>   iter ==  ? ((item = NULL),0) : ((item = 
> list_item(iter),1));
>   iter = item->member->next) {
>  ...
> With a bit of casting you can use 'item' to hold 'iter'.

I think this would make sense, it would mean you only assign the containing
element on valid elements.

I was thinking something along the lines of:

#define list_for_each_entry(pos, head, member)  
\
for (struct list_head *list = head->next, typeof(pos) pos;  \
 list == head ? 0 : (( pos = list_entry(pos, list, member), 1));
\
 list = list->next)

Although the initialization block of the for loop is not valid C, I'm
not sure there is any way to declare two variables of a different type
in the initialization part of the loop.

I believe all this does is get rid of the >member == (head) check
to terminate the list.
It alone will not fix any of the other issues that using the iterator
variable after the loop currently has.


AFAIK Adrián Moreno is working on doing something along those lines
for the list iterator in openvswitch (that was similar to the kernel
one before) [1].

I *think* they don't declare 'pos' within the loop which we *do want*
to avoid any uses of it after the loop.
(If pos is not declared in the initialization block, shadowing the
*outer* pos, it would just point to the last element of the list or stay
uninitialized if the list is empty).


[1] https://www.mail-archive.com/ovs-dev@openvswitch.org/msg63497.html


> 
>> 
>>> OTOH there may be alternative definitions that can be used to get
>>> the compiler (or other compiler-like tools) to detect broken code.
>>> Even if the definition can't possibly generate a working kerrnel.
>> 
>> The "list_for_each_entry_inside(pos, type, head, member)" way makes
>> the iterator invisiable outside the loop, and would be catched by
>> compiler if use-after-loop things happened.
> 
> It is also a compete PITA for anything doing a search.
> 
>   David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 
> 1PT, UK
> Registration No: 1397386 (Wales)
> 

- Jakob

RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Xiaomeng Tong
On Thu, 3 Mar 2022 04:58:23 +, David Laight wrote:
> on 3 Mar 2022 10:27:29 +0800, Xiaomeng Tong wrote:
> > The problem is the mis-use of iterator outside the loop on exit, and
> > the iterator will be the HEAD's container_of pointer which pointers
> > to a type-confused struct. Sidenote: The *mis-use* here refers to
> > mistakely access to other members of the struct, instead of the
> > list_head member which acutally is the valid HEAD.
>
> The problem is that the HEAD's container_of pointer should never
> be calculated at all.
> This is what is fundamentally broken about the current definition.

Yes, the rule is "the HEAD's container_of pointer should never be
calculated at all outside the loop", but how do you make sure everyone
follows this rule?
Everyone makes mistakes, but we can eliminate them all from the beginning
with the help of compiler which can catch such use-after-loop things.

> > IOW, you would dereference a (NULL + offset_of_member) address here.
>
>Where?

In the case where a developer do not follows the above rule, and mistakely
access a non-list-head member of the HEAD's container_of pointer outside
the loop. For example:
struct req{
  int a;
  struct list_head h;
}
struct req *r;
list_for_each_entry(r, HEAD, h) {
  if (r->a == 0x10)
break;
}
// the developer made a mistake: he didn't take this situation into
// account where all entries in the list are *r->a != 0x10*, and now
// the r is the HEAD's container_of pointer. 
r->a = 0x20;
Thus the "r->a = 0x20" would dereference a (NULL + offset_of_member)
address here.

> > Please remind me if i missed something, thanks.
> >
> > Can you share your "alternative definitions" details? thanks!
>
> The loop should probably use as extra variable that points
> to the 'list node' in the next structure.
> Something like:
>   for (xxx *iter = head->next;
>   iter ==  ? ((item = NULL),0) : ((item = 
> list_item(iter),1));
>   iter = item->member->next) {
>  ...
> With a bit of casting you can use 'item' to hold 'iter'.

you still can not make sure everyone follows this rule: 
"do not use iterator outside the loop" without the help of compiler,
because item is declared outside the loop.

BTW, to avoid ambiguity,the "alternative definitions" here i asked is
something from you in this context:
"OTOH there may be alternative definitions that can be used to get
the compiler (or other compiler-like tools) to detect broken code.
Even if the definition can't possibly generate a working kerrnel."

> > 
> > > OTOH there may be alternative definitions that can be used to get
> > > the compiler (or other compiler-like tools) to detect broken code.
> > > Even if the definition can't possibly generate a working kerrnel.
> > 
> > The "list_for_each_entry_inside(pos, type, head, member)" way makes
> > the iterator invisiable outside the loop, and would be catched by
> > compiler if use-after-loop things happened.

> It is also a compete PITA for anything doing a search.

You mean it would be a burden on search? can you show me some examples?

Or you mean it is too long the list_for_each_entry_inside* string to live
in one single line, and should spilt into two line? If it is the case, there
are 2 way to mitigate it.
1. pass a shorter t as struct type to the macro
2. after all list_for_each_entry macros be replaced with
list_for_each_entry_inside, remove the list_for_each_entry implementations
and rename all list_for_each_entry_inside use back to the origin
list_for_each_entry in a single patch.

For me, it is acceptable and not a such big problem.

--
Xiaomeng Tong


[PATCH 3/6] KVM: PPC: Book3S HV P9: Move cede logic out of XIVE escalation rearming

2022-03-02 Thread Nicholas Piggin
Move the cede abort logic out of xive escalation rearming and into
the caller to prepare for handling a similar case with nested guest
entry.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/kvm_ppc.h |  4 ++--
 arch/powerpc/kvm/book3s_hv.c   | 10 --
 arch/powerpc/kvm/book3s_xive.c |  9 ++---
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index a14dbcd1b8ce..94fa5f246657 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -671,7 +671,7 @@ extern int kvmppc_xive_set_irq(struct kvm *kvm, int 
irq_source_id, u32 irq,
   int level, bool line_status);
 extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu);
 extern void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu);
-extern void kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu);
+extern bool kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu);
 
 static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
 {
@@ -709,7 +709,7 @@ static inline int kvmppc_xive_set_irq(struct kvm *kvm, int 
irq_source_id, u32 ir
  int level, bool line_status) { return 
-ENODEV; }
 static inline void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { }
 static inline void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu) { }
-static inline void kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu) { }
+static inline bool kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu) { 
return true; }
 
 static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
{ return 0; }
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 5df359053147..a0b674d3a2da 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4073,10 +4073,16 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, 
u64 time_limit,
!(vcpu->arch.shregs.msr & MSR_PR)) {
unsigned long req = kvmppc_get_gpr(vcpu, 3);
 
-   /* H_CEDE has to be handled now, not later */
+   /* H_CEDE has to be handled now */
if (req == H_CEDE) {
kvmppc_cede(vcpu);
-   kvmppc_xive_rearm_escalation(vcpu); /* may 
un-cede */
+   if (!kvmppc_xive_rearm_escalation(vcpu)) {
+   /*
+* Pending escalation so abort
+* the cede.
+*/
+   vcpu->arch.ceded = 0;
+   }
kvmppc_set_gpr(vcpu, 3, 0);
trap = 0;
 
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index e216c068075d..7b513e14cada 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -179,12 +179,13 @@ void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvmppc_xive_pull_vcpu);
 
-void kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu)
+bool kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu)
 {
void __iomem *esc_vaddr = (void __iomem *)vcpu->arch.xive_esc_vaddr;
+   bool ret = true;
 
if (!esc_vaddr)
-   return;
+   return ret;
 
/* we are using XIVE with single escalation */
 
@@ -197,7 +198,7 @@ void kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu)
 * we also don't want to set xive_esc_on to 1 here in
 * case we race with xive_esc_irq().
 */
-   vcpu->arch.ceded = 0;
+   ret = false;
/*
 * The escalation interrupts are special as we don't EOI them.
 * There is no need to use the load-after-store ordering offset
@@ -210,6 +211,8 @@ void kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu)
__raw_readq(esc_vaddr + XIVE_ESB_SET_PQ_00);
}
mb();
+
+   return ret;
 }
 EXPORT_SYMBOL_GPL(kvmppc_xive_rearm_escalation);
 
-- 
2.23.0



[PATCH 6/6] KVM: PPC: Book3S HV Nested: L2 LPCR should inherit L1 LPES setting

2022-03-02 Thread Nicholas Piggin
The L1 should not be able to adjust LPES mode for the L2. Setting LPES
if the L0 needs it clear would cause external interrupts to be sent to
L2 and missed by the L0.

Clearing LPES when it may be set, as typically happens with XIVE enabled
could cause a performance issue despite having no native XIVE support in
the guest, because it will cause mediated interrupts for the L2 to be
taken in HV mode, which then have to be injected.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kvm/book3s_hv.c| 4 
 arch/powerpc/kvm/book3s_hv_nested.c | 3 +--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 77315c2c3f43..acba9cb241c9 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -5323,6 +5323,10 @@ static int kvmppc_core_init_vm_hv(struct kvm *kvm)
kvm->arch.host_lpcr = lpcr = mfspr(SPRN_LPCR);
lpcr &= LPCR_PECE | LPCR_LPES;
} else {
+   /*
+* The L2 LPES mode will be set by the L0 according to whether
+* or not it needs to take external interrupts in HV mode.
+*/
lpcr = 0;
}
lpcr |= (4UL << LPCR_DPFD_SH) | LPCR_HDICE |
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c 
b/arch/powerpc/kvm/book3s_hv_nested.c
index 9d373f8963ee..58e05a9122ac 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -261,8 +261,7 @@ static void load_l2_hv_regs(struct kvm_vcpu *vcpu,
/*
 * Don't let L1 change LPCR bits for the L2 except these:
 */
-   mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD |
-   LPCR_LPES | LPCR_MER;
+   mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD | LPCR_MER;
 
/*
 * Additional filtering is required depending on hardware
-- 
2.23.0



[PATCH 5/6] KVM: PPC: Book3S HV Nested: L2 must not run with L1 xive context

2022-03-02 Thread Nicholas Piggin
The PowerNV L0 currently pushes the OS xive context when running a vCPU,
regardless of whether it is running a nested guest. The problem is that
xive OS ring interrupts will be delivered while the L2 is running.

At the moment, by default, the L2 guest runs with LPCR[LPES]=0, which
actually makes external interrupts go to the L0. That causes the L2 to
exit and the interrupt taken or injected into the L1, so in some
respects this behaves like an escalation. It's not clear if this was
deliberate or not, there's no comment about it and the L1 is actually
allowed to clear LPES in the L2, so it's confusing at best.

When the L2 is running, the L1 is essentially in a ceded state with
respect to external interrupts (it can't respond to them directly and
won't get scheduled again absent some additional event). So the natural
way to solve this is when the L0 handles a H_ENTER_NESTED hypercall to
run the L2, have it arm the escalation interrupt and don't push the L1
context while running the L2.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kvm/book3s_hv.c   | 26 --
 arch/powerpc/kvm/book3s_xive.c |  2 +-
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 0289d076c0a8..77315c2c3f43 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4063,14 +4063,10 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, 
u64 time_limit,
}
 
} else if (nested) {
-   kvmppc_xive_push_vcpu(vcpu);
-
__this_cpu_write(cpu_in_guest, kvm);
trap = kvmhv_vcpu_entry_p9(vcpu, time_limit, lpcr, tb);
__this_cpu_write(cpu_in_guest, NULL);
 
-   kvmppc_xive_pull_vcpu(vcpu);
-
} else {
kvmppc_xive_push_vcpu(vcpu);
 
@@ -4082,8 +4078,13 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, 
u64 time_limit,
!(vcpu->arch.shregs.msr & MSR_PR)) {
unsigned long req = kvmppc_get_gpr(vcpu, 3);
 
-   /* H_CEDE has to be handled now */
+   /*
+* XIVE rearm and XICS hcalls must be handled
+* before xive context is pulled (is this
+* true?)
+*/
if (req == H_CEDE) {
+   /* H_CEDE has to be handled now */
kvmppc_cede(vcpu);
if (!kvmppc_xive_rearm_escalation(vcpu)) {
/*
@@ -4095,7 +4096,20 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, 
u64 time_limit,
kvmppc_set_gpr(vcpu, 3, 0);
trap = 0;
 
-   /* XICS hcalls must be handled before xive is pulled */
+   } else if (req == H_ENTER_NESTED) {
+   /*
+* L2 should not run with the L1
+* context so rearm and pull it.
+*/
+   if (!kvmppc_xive_rearm_escalation(vcpu)) {
+   /*
+* Pending escalation so abort
+* H_ENTER_NESTED.
+*/
+   kvmppc_set_gpr(vcpu, 3, 0);
+   trap = 0;
+   }
+
} else if (hcall_is_xics(req)) {
int ret;
 
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index 7b513e14cada..e44e251509fe 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -241,7 +241,7 @@ static irqreturn_t xive_esc_irq(int irq, void *data)
 
vcpu->arch.irq_pending = 1;
smp_mb();
-   if (vcpu->arch.ceded)
+   if (vcpu->arch.ceded || vcpu->arch.nested)
kvmppc_fast_vcpu_kick(vcpu);
 
/* Since we have the no-EOI flag, the interrupt is effectively
-- 
2.23.0



[PATCH 4/6] KVM: PPC: Book3S HV P9: Split !nested case out from guest entry

2022-03-02 Thread Nicholas Piggin
The differences between nested and !nested will become larger in
later changes so split them out for readability.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kvm/book3s_hv.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index a0b674d3a2da..0289d076c0a8 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4034,6 +4034,8 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu 
*vcpu, u64 time_limit, uns
 static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit,
 unsigned long lpcr, u64 *tb)
 {
+   struct kvm *kvm = vcpu->kvm;
+   struct kvm_nested_guest *nested = vcpu->arch.nested;
u64 next_timer;
int trap;
 
@@ -4053,23 +4055,30 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, 
u64 time_limit,
trap = kvmhv_vcpu_entry_p9_nested(vcpu, time_limit, lpcr, tb);
 
/* H_CEDE has to be handled now, not later */
-   if (trap == BOOK3S_INTERRUPT_SYSCALL && !vcpu->arch.nested &&
+   if (trap == BOOK3S_INTERRUPT_SYSCALL && !nested &&
kvmppc_get_gpr(vcpu, 3) == H_CEDE) {
kvmppc_cede(vcpu);
kvmppc_set_gpr(vcpu, 3, 0);
trap = 0;
}
 
-   } else {
-   struct kvm *kvm = vcpu->kvm;
+   } else if (nested) {
+   kvmppc_xive_push_vcpu(vcpu);
 
+   __this_cpu_write(cpu_in_guest, kvm);
+   trap = kvmhv_vcpu_entry_p9(vcpu, time_limit, lpcr, tb);
+   __this_cpu_write(cpu_in_guest, NULL);
+
+   kvmppc_xive_pull_vcpu(vcpu);
+
+   } else {
kvmppc_xive_push_vcpu(vcpu);
 
__this_cpu_write(cpu_in_guest, kvm);
trap = kvmhv_vcpu_entry_p9(vcpu, time_limit, lpcr, tb);
__this_cpu_write(cpu_in_guest, NULL);
 
-   if (trap == BOOK3S_INTERRUPT_SYSCALL && !vcpu->arch.nested &&
+   if (trap == BOOK3S_INTERRUPT_SYSCALL &&
!(vcpu->arch.shregs.msr & MSR_PR)) {
unsigned long req = kvmppc_get_gpr(vcpu, 3);
 
-- 
2.23.0



[PATCH 2/6] KVM: PPC: Book3S HV P9: Inject pending xive interrupts at guest entry

2022-03-02 Thread Nicholas Piggin
If there is a pending xive interrupt, inject it at guest entry (if
MSR[EE] is enabled) rather than take another interrupt when the guest
is entered. If xive is enabled then LPCR[LPES] is set so this behaviour
should be expected.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kvm/book3s_hv.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index f8c0f1f52a1e..5df359053147 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4524,9 +4524,14 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 
time_limit,
 
if (!nested) {
kvmppc_core_prepare_to_enter(vcpu);
-   if (test_bit(BOOK3S_IRQPRIO_EXTERNAL,
->arch.pending_exceptions))
+   if (vcpu->arch.shregs.msr & MSR_EE) {
+   if (xive_interrupt_pending(vcpu))
+   kvmppc_inject_interrupt_hv(vcpu,
+   BOOK3S_INTERRUPT_EXTERNAL, 0);
+   } else if (test_bit(BOOK3S_IRQPRIO_EXTERNAL,
+>arch.pending_exceptions)) {
lpcr |= LPCR_MER;
+   }
} else if (vcpu->arch.pending_exceptions ||
   vcpu->arch.doorbell_request ||
   xive_interrupt_pending(vcpu)) {
-- 
2.23.0



[PATCH 0/6] KVM: PPC: Book3S HV interrupt fixes

2022-03-02 Thread Nicholas Piggin
This series fixes up a bunch of little interrupt issues which were found
by inspection haven't seem to have caused big problems but possibly
could or could cause the occasional latency spike from a temporarily lost
interrupt.

The big thing is the xive context change. Currently we run an L2 with
its L1's xive OS context pushed. I'm proposing that we instead treat
that as an escalation similar to cede.

Thanks,
Nick

Nicholas Piggin (6):
  KVM: PPC: Book3S HV P9: Fix "lost kick" race
  KVM: PPC: Book3S HV P9: Inject pending xive interrupts at guest entry
  KVM: PPC: Book3S HV P9: Move cede logic out of XIVE escalation
rearming
  KVM: PPC: Book3S HV P9: Split !nested case out from guest entry
  KVM: PPC: Book3S HV Nested: L2 must not run with L1 xive context
  KVM: PPC: Book3S HV Nested: L2 LPCR should inherit L1 LPES setting

 arch/powerpc/include/asm/kvm_ppc.h  |  4 +-
 arch/powerpc/kvm/book3s_hv.c| 97 -
 arch/powerpc/kvm/book3s_hv_nested.c |  3 +-
 arch/powerpc/kvm/book3s_xive.c  | 11 ++--
 4 files changed, 90 insertions(+), 25 deletions(-)

-- 
2.23.0



[PATCH 1/6] KVM: PPC: Book3S HV P9: Fix "lost kick" race

2022-03-02 Thread Nicholas Piggin
When new work is created that requires attention from the hypervisor
(e.g., to inject an interrupt into the guest), fast_vcpu_kick is used to
pull the target vcpu out of the guest if it may have been running.

Therefore the work creation side looks like this:

  vcpu->arch.doorbell_request = 1;
  kvmppc_fast_vcpu_kick_hv(vcpu) {
smp_mb();
cpu = vcpu->cpu;
if (cpu != -1)
send_ipi(cpu);
  }

And the guest entry side *should* look like this:

  vcpu->cpu = smp_processor_id();
  smp_mb();
  if (vcpu->arch.doorbell_request) {
// do something (abort entry or inject doorbell etc)
  }

But currently the store and load are flipped, so it is possible for the
entry to see no doorbell pending, and the doorbell creation misses the
store to set cpu, resulting lost work (or at least delayed until the
next guest exit).

Fix this by reordering the entry operations and adding a smp_mb
between them. The P8 path appears to have a similar race which is
commented but not addressed yet.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/kvm/book3s_hv.c | 41 +---
 1 file changed, 33 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 84c89f08ae9a..f8c0f1f52a1e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -225,6 +225,13 @@ static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
int cpu;
struct rcuwait *waitp;
 
+   /*
+* rcuwait_wake_up contains smp_mb() which orders prior stores that
+* create pending work vs below loads of cpu fields. The other side
+* is the barrier in vcpu run that orders setting the cpu fields vs
+* testing for pending work.
+*/
+
waitp = kvm_arch_vcpu_get_wait(vcpu);
if (rcuwait_wake_up(waitp))
++vcpu->stat.generic.halt_wakeup;
@@ -1089,7 +1096,7 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
break;
}
tvcpu->arch.prodded = 1;
-   smp_mb();
+   smp_mb(); /* This orders prodded store vs ceded load */
if (tvcpu->arch.ceded)
kvmppc_fast_vcpu_kick_hv(tvcpu);
break;
@@ -3771,6 +3778,14 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore 
*vc)
pvc = core_info.vc[sub];
pvc->pcpu = pcpu + thr;
for_each_runnable_thread(i, vcpu, pvc) {
+   /*
+* XXX: is kvmppc_start_thread called too late here?
+* It updates vcpu->cpu and vcpu->arch.thread_cpu
+* which are used by kvmppc_fast_vcpu_kick_hv(), but
+* kick is called after new exceptions become available
+* and exceptions are checked earlier than here, by
+* kvmppc_core_prepare_to_enter.
+*/
kvmppc_start_thread(vcpu, pvc);
kvmppc_create_dtl_entry(vcpu, pvc);
trace_kvm_guest_enter(vcpu);
@@ -4492,6 +4507,21 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 
time_limit,
if (need_resched() || !kvm->arch.mmu_ready)
goto out;
 
+   vcpu->cpu = pcpu;
+   vcpu->arch.thread_cpu = pcpu;
+   vc->pcpu = pcpu;
+   local_paca->kvm_hstate.kvm_vcpu = vcpu;
+   local_paca->kvm_hstate.ptid = 0;
+   local_paca->kvm_hstate.fake_suspend = 0;
+
+   /*
+* Orders set cpu/thread_cpu vs testing for pending interrupts and
+* doorbells below. The other side is when these fields are set vs
+* kvmppc_fast_vcpu_kick_hv reading the cpu/thread_cpu fields to
+* kick a vCPU to notice the pending interrupt.
+*/
+   smp_mb();
+
if (!nested) {
kvmppc_core_prepare_to_enter(vcpu);
if (test_bit(BOOK3S_IRQPRIO_EXTERNAL,
@@ -4511,13 +4541,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 
time_limit,
 
tb = mftb();
 
-   vcpu->cpu = pcpu;
-   vcpu->arch.thread_cpu = pcpu;
-   vc->pcpu = pcpu;
-   local_paca->kvm_hstate.kvm_vcpu = vcpu;
-   local_paca->kvm_hstate.ptid = 0;
-   local_paca->kvm_hstate.fake_suspend = 0;
-
__kvmppc_create_dtl_entry(vcpu, pcpu, tb + vc->tb_offset, 0);
 
trace_kvm_guest_enter(vcpu);
@@ -4619,6 +4642,8 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 
time_limit,
run->exit_reason = KVM_EXIT_INTR;
vcpu->arch.ret = -EINTR;
  out:
+   vcpu->cpu = -1;
+   vcpu->arch.thread_cpu = -1;
powerpc_local_irq_pmu_restore(flags);
preempt_enable();
goto done;
-- 
2.23.0



RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread David Laight
From: Xiaomeng Tong
> Sent: 03 March 2022 02:27
> 
> On Wed, 2 Mar 2022 14:04:06 +, David Laight
>  wrote:
> > I think that it would be better to make any alternate loop macro
> > just set the variable to NULL on the loop exit.
> > That is easier to code for and the compiler might be persuaded to
> > not redo the test.
> 
> No, that would lead to a NULL dereference.

Why, it would make it b ethe same as the 'easy to use':
for (item = head; item; item = item->next) {
...
if (...)
break;
...
}
if (!item)
return;
 
> The problem is the mis-use of iterator outside the loop on exit, and
> the iterator will be the HEAD's container_of pointer which pointers
> to a type-confused struct. Sidenote: The *mis-use* here refers to
> mistakely access to other members of the struct, instead of the
> list_head member which acutally is the valid HEAD.

The problem is that the HEAD's container_of pointer should never
be calculated at all.
This is what is fundamentally broken about the current definition.

> IOW, you would dereference a (NULL + offset_of_member) address here.

Where?

> Please remind me if i missed something, thanks.
>
> Can you share your "alternative definitions" details? thanks!

The loop should probably use as extra variable that points
to the 'list node' in the next structure.
Something like:
for (xxx *iter = head->next;
iter ==  ? ((item = NULL),0) : ((item = 
list_item(iter),1));
iter = item->member->next) {
   ...
With a bit of casting you can use 'item' to hold 'iter'.

> 
> > OTOH there may be alternative definitions that can be used to get
> > the compiler (or other compiler-like tools) to detect broken code.
> > Even if the definition can't possibly generate a working kerrnel.
> 
> The "list_for_each_entry_inside(pos, type, head, member)" way makes
> the iterator invisiable outside the loop, and would be catched by
> compiler if use-after-loop things happened.

It is also a compete PITA for anything doing a search.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



Re: [PATCH 02/12] powerpc: wiiu: device tree

2022-03-02 Thread Ash Logan

Hi Rob,
Thanks for the review.

On 3/3/22 00:36, Rob Herring wrote:

On Tue, Mar 1, 2022 at 10:44 PM Ash Logan  wrote:


Add a device tree source file for the Nintendo Wii U video game console.


Test this with 'make W=1 dtbs_checks'.


Does make W=1 ARCH=powerpc wiiu_defconfig dtbs_check seem reasonable? I 
ran it, and saw LINT/CHKDT/UPD/SCHEMA/COPY steps, but if I put garbage 
in the .dts it gives no warnings.




Signed-off-by: Ash Logan 
Co-developed-by: Roberto Van Eeden 
Signed-off-by: Roberto Van Eeden 
Co-developed-by: Emmanuel Gil Peyrot 
Signed-off-by: Emmanuel Gil Peyrot 
---
  arch/powerpc/boot/dts/wiiu.dts | 327 +
  1 file changed, 327 insertions(+)
  create mode 100644 arch/powerpc/boot/dts/wiiu.dts

diff --git a/arch/powerpc/boot/dts/wiiu.dts b/arch/powerpc/boot/dts/wiiu.dts
new file mode 100644
index ..aaf264963f61
--- /dev/null
+++ b/arch/powerpc/boot/dts/wiiu.dts
@@ -0,0 +1,327 @@
+// SPDX-License-Identifier: GPL-2.0


What about non-GPL environments?


The other powerpc dts files are all GPL-2.0(-or-later), is there a 
preferred license for devicetrees?



+/*
+ * Nintendo Wii U Device Tree Source
+ *
+ * Copyright (C) 2022 The linux-wiiu Team
+ */
+
+/dts-v1/;
+#include 
+#include 
+
+/ {
+   model = "nintendo,wiiu";
+   compatible = "nintendo,wiiu";
+
+   #address-cells = <1>;
+   #size-cells = <1>;
+
+   chosen {
+   bootargs = "root=/dev/sda1 rootwait";
+   };
+
+   memory {
+   device_type = "memory";
+   reg = <0x 0x0200/* MEM1 - 32MiB */
+  0x0800 0x0030/* MEM0 - 3MiB  */
+  0x1000 0x8000>;  /* MEM2 - 2GiB  */
+   };
+
+   cpus {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   /* TODO: Add SMP */
+   PowerPC,espresso@0 {
+   device_type = "cpu";
+   reg = <0>;
+   clock-frequency = <1243125000>; /* 1.243125GHz 
*/
+   bus-frequency = <248625000>;/* 248.625MHz 
core-to-bus 5x */
+   timebase-frequency = <62156250>;/* 1/4 of the 
bus clock */
+   i-cache-size = <32768>; /* 32K icache */
+   i-cache-line-size = <32>;
+   i-cache-block-size = <32>;
+   i-cache-sets = <128>;
+   d-cache-size = <32768>; /* 32K dcache */
+   d-cache-line-size = <32>;
+   d-cache-block-size = <32>;
+   d-cache-sets = <128>;
+   next-level-cache = <_0>;
+   L2_0:l2-cache {
+   compatible = "cache";
+   cache-level = <2>;
+   cache-unified;
+   cache-size = <0x8>; /* 512KB L2 */
+   cache-line-size = <64>;
+   cache-block-size = <32>;
+   cache-sets = <2048>;
+   };
+   };
+   };
+
+   latte {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "nintendo,latte";
+   ranges = <0x0c00 0x0c00 0x0040  /* 
Espresso-only registers */
+ 0x0d00 0x0d00 0x0020  /* Latte AHB 
deivces */
+ 0x0d80 0x0d80 0x0080>;/* Latte SoC 
registers */
+
+   gpu7@c20 {


gpu@...


+   compatible = "nintendo,latte-gpu7";
+   reg = <0x0c20 0x8>;
+   interrupts = <2>;
+   interrupt-parent = <_pic>;
+   };
+
+   espresso_pic: pic@c78 {
+   #interrupt-cells = <1>;
+   interrupt-controller;
+
+   compatible = "nintendo,espresso-pic";
+   reg = <0x0c78 0x18>;
+   };
+
+   latte_dsp: dsp@c005000 {
+   compatible = "nintendo,latte-dsp";
+   reg = <0x0c005000 0x200>;
+   };
+
+   ehci_0: usb@d04 {
+   compatible = "nintendo,latte-usb-ehci", "usb-ehci";
+   reg = <0x0d04 0x100>;
+   interrupts = <4>;
+   interrupt-parent = <_pic>;
+   big-endian-regs;
+   };
+
+   ohci_0_0: usb@d05 {
+   compatible = "nintendo,latte-usb-ohci";
+   reg = <0x0d05 0x100>;
+   interrupts = <5>;
+   interrupt-parent = <_pic>;
+
+   big-endian-regs;
+   };
+
+   

[Bug 215652] kernel 5.17-rc fail to load radeon DRM "modprobe: ERROR: could not insert 'radeon': Unknown symbol in module, or unknown parameter (see dmesg)"

2022-03-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=215652

Alex Deucher (alexdeuc...@gmail.com) changed:

   What|Removed |Added

 CC||alexdeuc...@gmail.com

--- Comment #3 from Alex Deucher (alexdeuc...@gmail.com) ---
If you are using an initrd, the firmware must be present on the initrd for the
driver to find it when it loads.  Please make sure the firmware is available in
your initrd.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching someone on the CC list of the bug.

RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Xiaomeng Tong
On Wed, 2 Mar 2022 14:04:06 +, David Laight
 wrote:
> I think that it would be better to make any alternate loop macro
> just set the variable to NULL on the loop exit.
> That is easier to code for and the compiler might be persuaded to
> not redo the test.

No, that would lead to a NULL dereference.

The problem is the mis-use of iterator outside the loop on exit, and
the iterator will be the HEAD's container_of pointer which pointers
to a type-confused struct. Sidenote: The *mis-use* here refers to
mistakely access to other members of the struct, instead of the
list_head member which acutally is the valid HEAD.

IOW, you would dereference a (NULL + offset_of_member) address here.

Please remind me if i missed something, thanks.

> OTOH there may be alternative definitions that can be used to get
> the compiler (or other compiler-like tools) to detect broken code.
> Even if the definition can't possibly generate a working kerrnel.

The "list_for_each_entry_inside(pos, type, head, member)" way makes
the iterator invisiable outside the loop, and would be catched by
compiler if use-after-loop things happened.

Can you share your "alternative definitions" details? thanks!

--
Xiaomeng Tong


Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-02 Thread Stefano Stabellini
On Wed, 2 Mar 2022, Christoph Hellwig wrote:
> On Tue, Mar 01, 2022 at 06:55:47PM -0800, Stefano Stabellini wrote:
> > Unrelated to this specific patch series: now that I think about it, if
> > io_tlb_default_mem.nslabs is already allocated by the time xen_mm_init
> > is called, wouldn't we potentially have an issue with the GFP flags used
> > for the earlier allocation (e.g. GFP_DMA32 not used)? Maybe something
> > for another day.
> 
> swiotlb_init allocates low memory from meblock, which is roughly
> equivalent to GFP_DMA allocations, so we'll be fine.
> 
> > > @@ -143,10 +141,15 @@ static int __init xen_mm_init(void)
> > >   if (!xen_swiotlb_detect())
> > >   return 0;
> > >  
> > > - rc = xen_swiotlb_init();
> > >   /* we can work with the default swiotlb */
> > > - if (rc < 0 && rc != -EEXIST)
> > > - return rc;
> > > + if (!io_tlb_default_mem.nslabs) {
> > > + if (!xen_initial_domain())
> > > + return -EINVAL;
> > 
> > I don't think we need this xen_initial_domain() check. It is all
> > already sorted out by the xen_swiotlb_detect() check above.
> 
> Is it?
> 
> static inline int xen_swiotlb_detect(void)
> {
>   if (!xen_domain())
>   return 0;
>   if (xen_feature(XENFEAT_direct_mapped))
>   return 1;
>   /* legacy case */
>   if (!xen_feature(XENFEAT_not_direct_mapped) && xen_initial_domain())
>   return 1;
>   return 0;
> }

It used to be that we had a

  if (!xen_initial_domain())
  return -EINVAL;

check in the initialization of swiotlb-xen on ARM. Then we replaced it
with the more sophisticated xen_swiotlb_detect().

The reason is that swiotlb-xen on ARM relies on Dom0 being 1:1 mapped
(guest physical addresses == physical addresses). Recent changes in Xen
allowed also DomUs to be 1:1 mapped. Changes still under discussion will
allow Dom0 not to be 1:1 mapped.

So, before all the Xen-side changes, knowing what was going to happen, I
introduced a clearer interface: XENFEAT_direct_mapped and
XENFEAT_not_direct_mapped tell us whether the guest (Linux) is 1:1
mapped or not. If it is 1:1 mapped then Linux can take advantage of
swiotlb-xen. Now xen_swiotlb_detect() returns true if Linux is 1:1
mapped.

Then of course there is the legacy case. That's taken care of by:

if (!xen_feature(XENFEAT_not_direct_mapped) && xen_initial_domain())
return 1;

The intention is to say that if
XENFEAT_direct_mapped/XENFEAT_not_direct_mapped are not present, then
use xen_initial_domain() like we did before.

So if xen_swiotlb_detect() returns true we know that Linux is either
dom0 (xen_initial_domain() == true) or we have very good reasons to
think we should initialize swiotlb-xen anyway
(xen_feature(XENFEAT_direct_mapped) == true).


> I think I'd keep it as-is for now, as my planned next step would be to
> fold xen-swiotlb into swiotlb entirely.

Thinking more about it we actually need to drop the xen_initial_domain()
check otherwise some cases won't be functional (Dom0 not 1:1 mapped, or
DomU 1:1 mapped).


[Bug 215652] kernel 5.17-rc fail to load radeon DRM "modprobe: ERROR: could not insert 'radeon': Unknown symbol in module, or unknown parameter (see dmesg)"

2022-03-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=215652

--- Comment #2 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 300522
  --> https://bugzilla.kernel.org/attachment.cgi?id=300522=edit
kernel .config (kernel 5.17-rc5, CONFIG_DRM_RADEON=m, Talos II)

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching someone on the CC list of the bug.

[Bug 215652] kernel 5.17-rc fail to load radeon DRM "modprobe: ERROR: could not insert 'radeon': Unknown symbol in module, or unknown parameter (see dmesg)"

2022-03-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=215652

--- Comment #1 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 300521
  --> https://bugzilla.kernel.org/attachment.cgi?id=300521=edit
kernel dmesg (kernel 5.17-rc5, CONFIG_DRM_RADEON=y, Talos II)

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching someone on the CC list of the bug.

[Bug 215652] New: kernel 5.17-rc fail to load radeon DRM "modprobe: ERROR: could not insert 'radeon': Unknown symbol in module, or unknown parameter (see dmesg)"

2022-03-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=215652

Bug ID: 215652
   Summary: kernel 5.17-rc fail to load radeon DRM "modprobe:
ERROR: could not insert 'radeon': Unknown symbol in
module, or unknown parameter (see dmesg)"
   Product: Drivers
   Version: 2.5
Kernel Version: 5.17-rc5
  Hardware: PPC-64
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Video(DRI - non Intel)
  Assignee: drivers_video-...@kernel-bugs.osdl.org
  Reporter: erhar...@mailbox.org
CC: platform_ppc...@kernel-bugs.osdl.org
Regression: No

Created attachment 300520
  --> https://bugzilla.kernel.org/attachment.cgi?id=300520=edit
kernel dmesg (kernel 5.17-rc5, CONFIG_DRM_RADEON=m, Talos II)

Kernel 5.17-rc5 has problems loading the radeon KMS module on my Talos II:

 # modprobe -v radeon
insmod /lib/modules/5.17.0-rc5-P9+/kernel/drivers/gpu/drm/drm_kms_helper.ko 
modprobe: ERROR: could not insert 'radeon': Unknown symbol in module, or
unknown parameter (see dmesg)

dmesg show no further output then.

When I build a kernel with CONFIG_DRM_RADEON=y I get this in dmesg:
[...]
ATOM BIOS: X1550
[drm] Generation 2 PCI interface, using max accessible memory
radeon :01:00.0: VRAM: 512M 0x - 0x1FFF (512M
used)
radeon :01:00.0: GTT: 512M 0x2000 - 0x3FFF
[drm] Detected VRAM RAM=512M, BAR=256M
[drm] RAM width 128bits DDR
radeon :01:00.0: dma_iommu_get_required_mask: returning bypass mask
0xfff
[drm] radeon: 512M of VRAM memory ready
[drm] radeon: 512M of GTT memory ready.
[drm] GART: num cpu pages 131072, num gpu pages 131072
[drm] radeon: 1 quad pipes, 1 z pipes initialized.
[drm] PCIE GART of 512M enabled (table at 0x0004).
radeon :01:00.0: WB enabled
radeon :01:00.0: fence driver on ring 0 use gpu addr 0x2000
radeon :01:00.0: radeon: MSI limited to 32-bit
[drm] radeon: irq initialized.
[drm] Loading R500 Microcode
radeon :01:00.0: Direct firmware load for radeon/R520_cp.bin failed with
error -2
radeon_cp: Failed to load firmware "radeon/R520_cp.bin"
[drm:.r100_cp_init] *ERROR* Failed to load firmware!
radeon :01:00.0: failed initializing CP (-2).
radeon :01:00.0: Disabling GPU acceleration

So it complains about not finding the relevant firmware. But the firmware being
located in everything should be ok. This used to work on kernels 5.16.x and
before.

 # ls -al /lib/firmware/radeon/R520_cp.bin
-rw-r--r-- 1 root root 2048  2. Mär 20:43 /lib/firmware/radeon/R520_cp.bin


Some data about the machine:
 # lspci 
:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
:01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
RV516 [Radeon X1300/X1550 Series]
:01:00.1 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] RV516
[Radeon X1300/X1550 Series] (Secondary)
0001:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0001:01:00.0 Non-Volatile memory controller: Phison Electronics Corporation
Device 5008 (rev 01)
0002:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0003:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0003:01:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI
Host Controller (rev 02)
0004:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0004:01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme
BCM5719 Gigabit Ethernet PCIe (rev 01)
0004:01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme
BCM5719 Gigabit Ethernet PCIe (rev 01)
0005:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0005:01:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev
04)
0005:02:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics
Family (rev 41)
0030:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0031:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0032:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)
0033:00:00.0 PCI bridge: IBM POWER9 Host Bridge (PHB4)

 # lspci -s :01:00.0 -vv
:01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
RV516 [Radeon X1300/X1550 Series] (prog-if 00 [VGA controller])
Subsystem: PC Partner Limited / Sapphire Technology RV516 [Radeon
X1300/X1550 Series]
Device tree node:
/sys/firmware/devicetree/base/pciex@600c3c000/pci@0/vga@0
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr+
Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-  [disabled]
Expansion ROM at 600c0 [disabled] [size=128K]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

Re: [PATCH v4 1/1] crypto: vmx - add missing dependencies

2022-03-02 Thread Herbert Xu
On Wed, Feb 23, 2022 at 04:11:15PM +0100, Petr Vorel wrote:
> vmx-crypto module depends on CRYPTO_AES, CRYPTO_CBC, CRYPTO_CTR or
> CRYPTO_XTS, thus add them.
> 
> These dependencies are likely to be enabled, but if
> CRYPTO_DEV_VMX=y && !CRYPTO_MANAGER_DISABLE_TESTS
> and either of CRYPTO_AES, CRYPTO_CBC, CRYPTO_CTR or CRYPTO_XTS is built
> as module or disabled, alg_test() from crypto/testmgr.c complains during
> boot about failing to allocate the generic fallback implementations
> (2 == ENOENT):
> 
> [0.540953] Failed to allocate xts(aes) fallback: -2
> [0.541014] alg: skcipher: failed to allocate transform for p8_aes_xts: -2
> [0.541120] alg: self-tests for p8_aes_xts (xts(aes)) failed (rc=-2)
> [0.50] Failed to allocate ctr(aes) fallback: -2
> [0.544497] alg: skcipher: failed to allocate transform for p8_aes_ctr: -2
> [0.544603] alg: self-tests for p8_aes_ctr (ctr(aes)) failed (rc=-2)
> [0.547992] Failed to allocate cbc(aes) fallback: -2
> [0.548052] alg: skcipher: failed to allocate transform for p8_aes_cbc: -2
> [0.548156] alg: self-tests for p8_aes_cbc (cbc(aes)) failed (rc=-2)
> [0.550745] Failed to allocate transformation for 'aes': -2
> [0.550801] alg: cipher: Failed to load transform for p8_aes: -2
> [0.550892] alg: self-tests for p8_aes (aes) failed (rc=-2)
> 
> Fixes: c07f5d3da643 ("crypto: vmx - Adding support for XTS")
> Fixes: d2e3ae6f3aba ("crypto: vmx - Enabling VMX module for PPC64")
> 
> Suggested-by: Nicolai Stange 
> Signed-off-by: Petr Vorel 
> ---
> Changes v3->v4:
> * Drop commit which merged CRYPTO_DEV_VMX and CRYPTO_DEV_VMX_ENCRYPT
>   (Herbert)
> 
>  drivers/crypto/vmx/Kconfig | 4 
>  1 file changed, 4 insertions(+)

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [Intel-gfx] [PATCH 6/6] treewide: remove check of list iterator against head past the loop body

2022-03-02 Thread Tvrtko Ursulin



On 28/02/2022 11:08, Jakob Koschel wrote:

When list_for_each_entry() completes the iteration over the whole list
without breaking the loop, the iterator value will be a bogus pointer
computed based on the head element.

While it is safe to use the pointer to determine if it was computed
based on the head element, either with list_entry_is_head() or
>member == head, using the iterator variable after the loop should
be avoided.

In preparation to limiting the scope of a list iterator to the list
traversal loop, use a dedicated pointer to point to the found element.

Signed-off-by: Jakob Koschel 


[snip until i915 parts]


  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 14 +++---
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 15 ---
  drivers/gpu/drm/i915/gt/intel_ring.c  | 15 ---


[snip]


diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 00327b750fbb..80c79028901a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -107,25 +107,27 @@ static void lut_close(struct i915_gem_context *ctx)
radix_tree_for_each_slot(slot, >handles_vma, , 0) {
struct i915_vma *vma = rcu_dereference_raw(*slot);
struct drm_i915_gem_object *obj = vma->obj;
-   struct i915_lut_handle *lut;
+   struct i915_lut_handle *lut = NULL;
+   struct i915_lut_handle *tmp;

if (!kref_get_unless_zero(>base.refcount))
continue;

spin_lock(>lut_lock);
-   list_for_each_entry(lut, >lut_list, obj_link) {
-   if (lut->ctx != ctx)
+   list_for_each_entry(tmp, >lut_list, obj_link) {
+   if (tmp->ctx != ctx)
continue;

-   if (lut->handle != iter.index)
+   if (tmp->handle != iter.index)
continue;

-   list_del(>obj_link);
+   list_del(>obj_link);
+   lut = tmp;
break;
}
spin_unlock(>lut_lock);

-   if (>obj_link != >lut_list) {
+   if (lut) {
i915_lut_handle_free(lut);
radix_tree_iter_delete(>handles_vma, , slot);


Looks okay although personally I would have left lut as is for a smaller 
diff and introduced a new local like 'found' or 'unlinked'.



i915_vma_close(vma);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 1736efa43339..fda9e3685ad2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2444,7 +2444,8 @@ static struct i915_request *eb_throttle(struct 
i915_execbuffer *eb, struct intel
  {
struct intel_ring *ring = ce->ring;
struct intel_timeline *tl = ce->timeline;
-   struct i915_request *rq;
+   struct i915_request *rq = NULL;
+   struct i915_request *tmp;

/*
 * Completely unscientific finger-in-the-air estimates for suitable
@@ -2460,15 +2461,17 @@ static struct i915_request *eb_throttle(struct 
i915_execbuffer *eb, struct intel
 * claiming our resources, but not so long that the ring completely
 * drains before we can submit our next request.
 */
-   list_for_each_entry(rq, >requests, link) {
-   if (rq->ring != ring)
+   list_for_each_entry(tmp, >requests, link) {
+   if (tmp->ring != ring)
continue;

-   if (__intel_ring_space(rq->postfix,
-  ring->emit, ring->size) > ring->size / 2)
+   if (__intel_ring_space(tmp->postfix,
+  ring->emit, ring->size) > ring->size / 
2) {
+   rq = tmp;
break;
+   }
}
-   if (>link == >requests)
+   if (!rq)
return NULL; /* weird, we will check again later for real */


Alternatively, instead of break could simply do "return 
i915_request_get(rq);" and replace the end of the function after the 
loop with "return NULL;". A bit smaller diff, or at least less "spread 
out" over the function, so might be easier to backport stuff touching 
this area in the future. But looks correct as is.




return i915_request_get(rq);
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c 
b/drivers/gpu/drm/i915/gt/intel_ring.c
index 2fdd52b62092..4881c4e0c407 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -191,24 +191,27 @@ wait_for_space(struct intel_ring *ring,
   struct intel_timeline *tl,
   unsigned int bytes)
  {
-   struct i915_request *target;
+   struct 

Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Kees Cook
On Wed, Mar 02, 2022 at 12:18:45PM -0800, Linus Torvalds wrote:
> On Wed, Mar 2, 2022 at 12:07 PM Kees Cook  wrote:
> >
> > I've long wanted to change kfree() to explicitly set pointers to NULL on
> > free. https://github.com/KSPP/linux/issues/87
> 
> We've had this discussion with the gcc people in the past, and gcc
> actually has some support for it, but it's sadly tied to the actual
> function name (ie gcc has some special-casing for "free()")
> 
> See
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94527
> 
> for some of that discussion.
> 
> Oh, and I see some patch actually got merged since I looked there last
> so that you can mark "deallocator" functions, but I think it's only
> for the context matching, not for actually killing accesses to the
> pointer afterwards.

Ah! I missed that getting added in GCC 11. But yes, there it is:

https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-malloc-function-attribute

Hah, now we may need to split __malloc from __alloc_size. ;)

I'd still like the NULL assignment behavior, though, since some things
can easily avoid static analysis.

-- 
Kees Cook


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Linus Torvalds
On Wed, Mar 2, 2022 at 12:07 PM Kees Cook  wrote:
>
> I've long wanted to change kfree() to explicitly set pointers to NULL on
> free. https://github.com/KSPP/linux/issues/87

We've had this discussion with the gcc people in the past, and gcc
actually has some support for it, but it's sadly tied to the actual
function name (ie gcc has some special-casing for "free()")

See

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94527

for some of that discussion.

Oh, and I see some patch actually got merged since I looked there last
so that you can mark "deallocator" functions, but I think it's only
for the context matching, not for actually killing accesses to the
pointer afterwards.

   Linus


Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Kees Cook
On Wed, Mar 02, 2022 at 10:29:31AM +0100, Rasmus Villemoes wrote:
> This won't help the current issue (because it doesn't exist and might
> never), but just in case some compiler people are listening, I'd like to
> have some sort of way to tell the compiler "treat this variable as
> uninitialized from here on". So one could do
> 
> #define kfree(p) do { __kfree(p); __magic_uninit(p); } while (0)
> 
> with __magic_uninit being a magic no-op that doesn't affect the
> semantics of the code, but could be used by the compiler's "[is/may be]
> used uninitialized" machinery to flag e.g. double frees on some odd
> error path etc. It would probably only work for local automatic
> variables, but it should be possible to just ignore the hint if p is
> some expression like foo->bar or has side effects. If we had that, the
> end-of-loop test could include that to "uninitialize" the iterator.

I've long wanted to change kfree() to explicitly set pointers to NULL on
free. https://github.com/KSPP/linux/issues/87

The thing stopping a trivial transformation of kfree() is:

kfree(get_some_pointer());

I would argue, though, that the above is poor form: the thing holding
the pointer should be the thing freeing it, so these cases should be
refactored and kfree() could do the NULLing by default.

Quoting myself in the above issue:


Without doing massive tree-wide changes, I think we need compiler
support. If we had something like __builtin_is_lvalue(), we could
distinguish function returns from lvalues. For example, right now a
common case are things like:

kfree(get_some_ptr());

But if we could at least gain coverage of the lvalue cases, and detect
them statically at compile-time, we could do:

#define __kfree_and_null(x) do { __kfree(*x); *x = NULL; } while (0)
#define kfree(x) __builtin_choose_expr(__builtin_is_lvalue(x),
__kfree_and_null(&(x)), __kfree(x))

Alternatively, we could do a tree-wide change of the former case (findable
with Coccinelle) and change them into something like kfree_no_null()
and redefine kfree() itself:

#define kfree_no_null(x) do { void *__ptr = (x); __kfree(__ptr); } while (0)
#define kfree(x) do { __kfree(x); x = NULL; } while (0)

-- 
Kees Cook


Re: [PATCH] powerpc/security: Provide stubs for when PPC_BARRIER_NOSPEC isn't enabled

2022-03-02 Thread Naveen N. Rao

Christophe Leroy wrote:



Le 10/01/2022 à 11:07, Naveen N. Rao a écrit :

kernel test robot reported the below build error with a randconfig:
   powerpc64-linux-ld: arch/powerpc/net/bpf_jit_comp64.o:(.toc+0x0):
   undefined reference to `powerpc_security_features'

This can happen if CONFIG_PPC_BARRIER_NOSPEC is not enabled. Address
this by providing stub functions for security_ftr_enabled() and related
helpers when the config option is not enabled.


Looks like this can happen only when E500 is not selected.

But what kind of CPU do we have if it's not a E500 ?

AFAICS in cputable.c, if not a PPC32 and not a BOOK3S_64 is must be a 
E500 otherwise there's just no CPU.


This was triggered for a 64-bit build and the bug report is:
http://lkml.kernel.org/r/202201082018.actzm4jh-...@intel.com

The randconfig used is:
https://download.01.org/0day-ci/archive/20220108/202201082018.actzm4jh-...@intel.com/config

It just selects the generic cpu and BOOK3E_64:

#
# Processor support
#
# CONFIG_PPC_BOOK3S_64 is not set
CONFIG_PPC_BOOK3E_64=y
CONFIG_GENERIC_CPU=y
CONFIG_PPC_BOOK3E=y
CONFIG_PPC_FPU_REGS=y
CONFIG_PPC_FPU=y
CONFIG_BOOKE=y
CONFIG_PPC_MMU_NOHASH=y
CONFIG_PPC_BOOK3E_MMU=y
CONFIG_PMU_SYSFS=y
# CONFIG_SMP is not set
CONFIG_NR_CPUS=1
CONFIG_PPC_DOORBELL=y
# end of Processor support



Should we make Kconfig stricter instead to avoid the Robot selecting a 
crazy config ?


If that config is indeed not possible, it sure will be nice to prevent 
that.



- Naveen



Re: [PATCH v2 1/7] ftrace: Expose flags used for ftrace_replace_code()

2022-03-02 Thread Naveen N. Rao

Christophe Leroy wrote:



Le 27/06/2019 à 13:23, Naveen N. Rao a écrit :

Since ftrace_replace_code() is a __weak function and can be overridden,
we need to expose the flags that can be set. So, move the flags enum to
the header file.

Reviewed-by: Steven Rostedt (VMware) 
Signed-off-by: Naveen N. Rao 


This series does apply anymore.

We have a link to it in https://github.com/linuxppc/issues/issues/386

I'll flag it "change requested"


There are a couple of changes being worked on as a prerequisite for this 
series. See:

http://lkml.kernel.org/r/cover.1645096227.git.naveen.n@linux.vnet.ibm.com


- Naveen



Re: [PATCH 1/2] powerpc/vdso64: link vdso64 with linker

2022-03-02 Thread Christophe Leroy




Le 02/09/2020 à 00:25, Nick Desaulniers a écrit :

Rather than invoke the compiler as the driver, use the linker. That way
we can check --orphan-handling=warn support correctly, as cc-ldoption
was removed in
commit 055efab3120b ("kbuild: drop support for cc-ldoption").

Painstakingly compared the output between `objdump -a` before and after
this change. Now function symbols have the correct type of FUNC rather
than NONE, and the entry is slightly different (which doesn't matter for
the vdso). Binary size is the same.

Fixes: commit f2af201002a8 ("powerpc/build: vdso linker warning for orphan 
sections")
Link: 
https://lore.kernel.org/lkml/cakwvodnn3wxydjomvnveyd_njwrku3fabwt_bs92duihhyw...@mail.gmail.com/
Signed-off-by: Nick Desaulniers 


Is this change still necessary ? If so please rebase as we have changed 
the structure of VDSO source files (Only one directory common to 32 and 64).


Christophe


---
  arch/powerpc/include/asm/vdso.h | 17 ++---
  arch/powerpc/kernel/vdso64/Makefile |  8 ++--
  arch/powerpc/kernel/vdso64/vdso64.lds.S |  1 -
  3 files changed, 8 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/vdso.h b/arch/powerpc/include/asm/vdso.h
index 2ff884853f97..11b2ecf49f79 100644
--- a/arch/powerpc/include/asm/vdso.h
+++ b/arch/powerpc/include/asm/vdso.h
@@ -24,19 +24,7 @@ int vdso_getcpu_init(void);
  
  #else /* __ASSEMBLY__ */
  
-#ifdef __VDSO64__

-#define V_FUNCTION_BEGIN(name) \
-   .globl name;\
-   name:   \
-
-#define V_FUNCTION_END(name)   \
-   .size name,.-name;
-
-#define V_LOCAL_FUNC(name) (name)
-#endif /* __VDSO64__ */
-
-#ifdef __VDSO32__
-
+#if defined(__VDSO32__) || defined (__VDSO64__)
  #define V_FUNCTION_BEGIN(name)\
.globl name;\
.type name,@function;   \
@@ -46,8 +34,7 @@ int vdso_getcpu_init(void);
.size name,.-name;
  
  #define V_LOCAL_FUNC(name) (name)

-
-#endif /* __VDSO32__ */
+#endif /* __VDSO{32|64}__ */
  
  #endif /* __ASSEMBLY__ */
  
diff --git a/arch/powerpc/kernel/vdso64/Makefile b/arch/powerpc/kernel/vdso64/Makefile

index 38c317f25141..7ea3ce537d0a 100644
--- a/arch/powerpc/kernel/vdso64/Makefile
+++ b/arch/powerpc/kernel/vdso64/Makefile
@@ -32,9 +32,13 @@ $(obj)/%.so: OBJCOPYFLAGS := -S
  $(obj)/%.so: $(obj)/%.so.dbg FORCE
$(call if_changed,objcopy)
  
+ldflags-y := -shared -soname linux-vdso64.so.1 \

+   $(call ld-option, --eh-frame-hdr) \
+   $(call ld-option, --orphan-handling=warn) -T
+
  # actual build commands
-quiet_cmd_vdso64ld = VDSO64L $@
-  cmd_vdso64ld = $(CC) $(c_flags) -o $@ -Wl,-T$(filter %.lds,$^) $(filter 
%.o,$^) $(call cc-ldoption, -Wl$(comma)--orphan-handling=warn)
+quiet_cmd_vdso64ld = LD  $@
+  cmd_vdso64ld = $(cmd_ld)
  
  # install commands for the unstripped file

  quiet_cmd_vdso_install = INSTALL $@
diff --git a/arch/powerpc/kernel/vdso64/vdso64.lds.S 
b/arch/powerpc/kernel/vdso64/vdso64.lds.S
index 4e3a8d4ee614..58c33b704b6a 100644
--- a/arch/powerpc/kernel/vdso64/vdso64.lds.S
+++ b/arch/powerpc/kernel/vdso64/vdso64.lds.S
@@ -11,7 +11,6 @@ OUTPUT_FORMAT("elf64-powerpcle", "elf64-powerpcle", 
"elf64-powerpcle")
  OUTPUT_FORMAT("elf64-powerpc", "elf64-powerpc", "elf64-powerpc")
  #endif
  OUTPUT_ARCH(powerpc:common64)
-ENTRY(_start)
  
  SECTIONS

  {


Re: [PATCH 3/3] powerpc/bpf: Reallocate BPF registers to volatile registers when possible on PPC64

2022-03-02 Thread Naveen N. Rao

Christophe Leroy wrote:



Le 27/07/2021 à 08:55, Jordan Niethe a écrit :

Implement commit 40272035e1d0 ("powerpc/bpf: Reallocate BPF registers to
volatile registers when possible on PPC32") for PPC64.

When the BPF routine doesn't call any function, the non volatile
registers can be reallocated to volatile registers in order to avoid
having to save them/restore on the stack. To keep track of which
registers can be reallocated to make sure registers are set seen when
used.

Before this patch, the test #359 ADD default X is:
0:   nop
4:   nop
8:   std r27,-40(r1)
c:   std r28,-32(r1)
   10:   xor r8,r8,r8
   14:   rotlwi  r8,r8,0
   18:   xor r28,r28,r28
   1c:   rotlwi  r28,r28,0
   20:   mr  r27,r3
   24:   li  r8,66
   28:   add r8,r8,r28
   2c:   rotlwi  r8,r8,0
   30:   ld  r27,-40(r1)
   34:   ld  r28,-32(r1)
   38:   mr  r3,r8
   3c:   blr

After this patch, the same test has become:
0:   nop
4:   nop
8:   xor r8,r8,r8
c:   rotlwi  r8,r8,0
   10:   xor r5,r5,r5
   14:   rotlwi  r5,r5,0
   18:   mr  r4,r3
   1c:   li  r8,66
   20:   add r8,r8,r5
   24:   rotlwi  r8,r8,0
   28:   mr  r3,r8
   2c:   blr

Signed-off-by: Jordan Niethe 


If this series is still applicable, it needs to be rebased of Naveen's 
series https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=286000


Thanks for bringing this up. My apologies - I missed copying you and 
Jordan on the new series.


I have included the first patch and a variant of the second patch in 
this series, in the new series I posted. For patch 3/3, it might be 
simpler to not track temp register usage on ppc64.



Thanks,
Naveen



Re: [Reoprt] Some compile warning on ppc dts

2022-03-02 Thread Christophe Leroy




Le 01/03/2021 à 03:16, chenjun (AM) a écrit :

Hi

After run the following commands
make distclean
make allmodconfig ARCH=powerpc CROSS_COMPILE=powerpc64-linux-gnu-
make oldconfig ARCH=powerpc CROSS_COMPILE=powerpc64-linux-gnu-
make -j64 ARCH=powerpc CROSS_COMPILE=powerpc64-linux-gnu-

I get some warning:
arch/powerpc/boot/dts/mpc5200b.dtsi:267.20-280.4: Warning (pci_bridge):
/pci@fd00: missing ranges for PCI bridg
e (or not a bridge)
arch/powerpc/boot/dts/o2dnt2.dtb: Warning (pci_device_bus_num): Failed
prerequisite 'pci_bridge'
arch/powerpc/boot/dts/mpc5200b.dtsi:182.18-186.5: Warning
(spi_bus_bridge): /soc5200@f000/psc@2000: node name f
or SPI buses should be 'spi'
also defined at arch/powerpc/boot/dts/o2d.dtsi:32.12-43.5
arch/powerpc/boot/dts/o2dnt2.dtb: Warning (spi_bus_reg): Failed
prerequisite 'spi_bus_bridge'
...

For the problem about "node name for SPI buses should be 'spi'":
Rename the psc@2000 to spi@2000 in arch/powerpc/boot/dts/o2d.dtsi can
fix it.
diff --git a/arch/powerpc/boot/dts/o2d.dtsi b/arch/powerpc/boot/dts/o2d.dtsi
index 6661955a2be4..cd3dc70cd72e 100644
--- a/arch/powerpc/boot/dts/o2d.dtsi
+++ b/arch/powerpc/boot/dts/o2d.dtsi
@@ -29,7 +29,7 @@ rtc@800 {
   >-->--->---status = "disabled";
   >-->---};
-
->-->---psc@2000 {>->---// PSC1
+>-->---spi@2000 {>->---// PSC1
   >-->--->---compatible =
"fsl,mpc5200b-psc-spi","fsl,mpc5200-psc-spi";
   >-->--->---#address-cells = <1>;
   >-->--->---#size-cells = <0>;
---

For the problem about "missing ranges for PCI bridge (or not a bridge)":
Ranges should be add in arch/powerpc/boot/dts/mpc5200b.dtsi.
  >---pci: pci@fd00 {
  >--->---#interrupt-cells = <1>;
  >--->---#size-cells = <2>;
  >--->---#address-cells = <3>;
  >--->---device_type = "pci";
  >--->---compatible = "fsl,mpc5200b-pci","fsl,mpc5200-pci";
  >--->---reg = <0xfd00 0x100>;
  >--->---// interrupt-map-mask = need to add
  >--->---// interrupt-map = need to add
  >--->---clock-frequency = <0>; // From boot loader
  >--->---interrupts = <2 8 0 2 9 0 2 10 0>;
  >--->---bus-range = <0 0>;
  >--->---// ranges = need to add
  >---};
I think the ranges should be add by someone who knows the mpc5200 better.




This patch has garbage instead of tabs, it doesn't apply


Re: powerpc{32,64} randconfigs

2022-03-02 Thread Christophe Leroy




Le 28/04/2021 à 01:45, Randy Dunlap a écrit :

On 4/21/21 12:15 AM, Michael Ellerman wrote:

Randy Dunlap  writes:


Reviewed-by: Randy Dunlap 
Tested-by: Randy Dunlap 


Please merge this. :)



Merged as f259fb893c69 ("powerpc/Makefile: Add ppc32/ppc64_randconfig 
targets")




Re: [PATCH] arch: powerpc: kvm: remove unnecessary casting

2022-03-02 Thread Christophe Leroy




Le 09/05/2021 à 14:00, Nour-eddine Taleb a écrit :

remove unnecessary castings, from "void *" to "struct kvmppc_xics *"

Signed-off-by: Nour-eddine Taleb <1337.nouredd...@gmail.com>


This patch doesn't apply. Tabs are broken, they've been replaced by 4 
space chars.



---
  arch/powerpc/kvm/book3s_xics.c| 2 +-
  arch/powerpc/kvm/book3s_xive.c| 2 +-
  arch/powerpc/kvm/book3s_xive_native.c | 2 +-
  3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_xics.c b/arch/powerpc/kvm/book3s_xics.c
index 303e3cb096db..9ae74fa551a6 100644
--- a/arch/powerpc/kvm/book3s_xics.c
+++ b/arch/powerpc/kvm/book3s_xics.c
@@ -1440,7 +1440,7 @@ static int kvmppc_xics_create(struct kvm_device
*dev, u32 type)

  static void kvmppc_xics_init(struct kvm_device *dev)
  {
-struct kvmppc_xics *xics = (struct kvmppc_xics *)dev->private;
+struct kvmppc_xics *xics = dev->private;

  xics_debugfs_init(xics);
  }
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index e7219b6f5f9a..05bcaf81a90a 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -2242,7 +2242,7 @@ static void xive_debugfs_init(struct kvmppc_xive *xive)

  static void kvmppc_xive_init(struct kvm_device *dev)
  {
-struct kvmppc_xive *xive = (struct kvmppc_xive *)dev->private;
+struct kvmppc_xive *xive = dev->private;

  /* Register some debug interfaces */
  xive_debugfs_init(xive);
diff --git a/arch/powerpc/kvm/book3s_xive_native.c
b/arch/powerpc/kvm/book3s_xive_native.c
index 76800c84f2a3..2703432cea78 100644
--- a/arch/powerpc/kvm/book3s_xive_native.c
+++ b/arch/powerpc/kvm/book3s_xive_native.c
@@ -1265,7 +1265,7 @@ static void xive_native_debugfs_init(struct
kvmppc_xive *xive)

  static void kvmppc_xive_native_init(struct kvm_device *dev)
  {
-struct kvmppc_xive *xive = (struct kvmppc_xive *)dev->private;
+struct kvmppc_xive *xive = dev->private;

  /* Register some debug interfaces */
  xive_native_debugfs_init(xive);


Re: [RFC 0/2] Add generic FPU api similar to x86

2022-03-02 Thread Christophe Leroy




Le 19/07/2021 à 21:52, Anson Jacob a écrit :

This is an attempt to have generic FPU enable/disable
calls similar to x86.
So that we can simplify gpu/drm/amd/display/dc/os_types.h


Seems like gpu/drm/amd/display/dc/os_types.h has been simplified through 
another way via commit 96ee63730fa3 ("drm/amd/display: Add control 
mechanism for FPU")


Are powerpc changes in patch 1 still relevant ? In that case please rebase.



Also adds FPU correctness logic seen in x86.

Anson Jacob (2):
   ppc/fpu: Add generic FPU api similar to x86
   drm/amd/display: Use PPC FPU functions

  arch/powerpc/include/asm/switch_to.h  |  29 ++---
  arch/powerpc/kernel/process.c | 130 ++
  drivers/gpu/drm/amd/display/dc/os_types.h |  28 +
  3 files changed, 139 insertions(+), 48 deletions(-)



Re: [RFC PATCH] powerpc: Implement hotplug smt control

2022-03-02 Thread Christophe Leroy


Le 17/02/2022 à 08:04, Joel Stanley a écrit :
> x86 added a control for turning SMT on and off in commit 05736e4ac13c
> ("cpu/hotplug: Provide knobs to control SMT").
> 
> Implement this for powerpc as an alternative to the currently method of
> iterating through /sys/devices/system/cpu/cpuN/online for every CPU.
> 
># ppc64_cpu --info
>Core   0:0*1*2*3*4*5*6*7*
>Core   1:8*9*   10*   11*   12*   13*   14*   15*
># grep . /sys/devices/system/cpu/smt/*
>/sys/devices/system/cpu/smt/active:1
>/sys/devices/system/cpu/smt/control:on
># echo off > /sys/devices/system/cpu/smt/control
># ppc64_cpu --info
>Core   0:0*1 2 3 4 5 6 7
>Core   1:8*9101112131415
># grep . /sys/devices/system/cpu/smt/*
>/sys/devices/system/cpu/smt/active:0
>/sys/devices/system/cpu/smt/control:off
> 
> Signed-off-by: Joel Stanley 

Build fails with corenet64_smp_defconfig:

   CC  kernel/cpu.o
kernel/cpu.c: In function 'cpuhp_smt_disable':
kernel/cpu.c:2220:23: error: implicit declaration of function 
'cpu_down_maps_locked' [-Werror=implicit-function-declaration]
  2220 | ret = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
   |   ^~~~
cc1: some warnings being treated as errors
make[1]: *** [scripts/Makefile.build:288: kernel/cpu.o] Error 1

Christophe

Re: [PATCH -next] macintosh/therm_adt746x: Replaced simple_strtol() with kstrtoint()

2022-03-02 Thread Christophe Leroy




Le 24/05/2021 à 14:08, Liu Shixin a écrit :

The simple_strtol() function is deprecated in some situation since
it does not check for the range overflow. Use kstrtoint() instead.

Signed-off-by: Liu Shixin 
---
  drivers/macintosh/therm_adt746x.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/macintosh/therm_adt746x.c 
b/drivers/macintosh/therm_adt746x.c
index 7e218437730c..0d7ef55126ce 100644
--- a/drivers/macintosh/therm_adt746x.c
+++ b/drivers/macintosh/therm_adt746x.c
@@ -352,7 +352,8 @@ static ssize_t store_##name(struct device *dev, struct 
device_attribute *attr, c
struct thermostat *th = dev_get_drvdata(dev);   \
int val;\
int i;  \
-   val = simple_strtol(buf, NULL, 10); \
+   if (unlikely(kstrtoint(buf, 10, ))  \
+   return -EINVAL; \
printk(KERN_INFO "Adjusting limits by %d degrees\n", val);\
limit_adjust = val; \
for (i=0; i < 3; i++)\
@@ -364,7 +365,8 @@ static ssize_t store_##name(struct device *dev, struct 
device_attribute *attr, c
  static ssize_t store_##name(struct device *dev, struct device_attribute 
*attr, const char *buf, size_t n) \
  { \
int val;\
-   val = simple_strtol(buf, NULL, 10); \
+   if (unlikely(kstrtoint(buf, 10, ))  \
+   return -EINVAL; \
if (val < 0 || val > 255) \
return -EINVAL; \
printk(KERN_INFO "Setting specified fan speed to %d\n", val); \


Obviously no build test has been performed:

  CC [M]  drivers/macintosh/therm_adt746x.o
drivers/macintosh/therm_adt746x.c: In function 'store_specified_fan_speed':
drivers/macintosh/therm_adt746x.c:369:17: error: expected ')' before 
'return'

  369 | return -EINVAL; \
  | ^~
drivers/macintosh/therm_adt746x.c:385:1: note: in expansion of macro 
'BUILD_STORE_FUNC_INT'

  385 | BUILD_STORE_FUNC_INT(specified_fan_speed,fan_speed)
  | ^~~~
drivers/macintosh/therm_adt746x.c:368:12: note: to match this '('
  368 | if (unlikely(kstrtoint(buf, 10, ))  \
  |^
drivers/macintosh/therm_adt746x.c:385:1: note: in expansion of macro 
'BUILD_STORE_FUNC_INT'

  385 | BUILD_STORE_FUNC_INT(specified_fan_speed,fan_speed)
  | ^~~~
drivers/macintosh/therm_adt746x.c:375:1: error: expected expression 
before '}' token

  375 | }
  | ^
drivers/macintosh/therm_adt746x.c:385:1: note: in expansion of macro 
'BUILD_STORE_FUNC_INT'

  385 | BUILD_STORE_FUNC_INT(specified_fan_speed,fan_speed)
  | ^~~~
drivers/macintosh/therm_adt746x.c:375:1: error: no return statement in 
function returning non-void [-Werror=return-type]

  375 | }
  | ^
drivers/macintosh/therm_adt746x.c:385:1: note: in expansion of macro 
'BUILD_STORE_FUNC_INT'

  385 | BUILD_STORE_FUNC_INT(specified_fan_speed,fan_speed)
  | ^~~~
drivers/macintosh/therm_adt746x.c: In function 'store_limit_adjust':
drivers/macintosh/therm_adt746x.c:356:17: error: expected ')' before 
'return'

  356 | return -EINVAL; \
  | ^~
drivers/macintosh/therm_adt746x.c:391:1: note: in expansion of macro 
'BUILD_STORE_FUNC_DEG'

  391 | BUILD_STORE_FUNC_DEG(limit_adjust,   th)
  | ^~~~
drivers/macintosh/therm_adt746x.c:355:12: note: to match this '('
  355 | if (unlikely(kstrtoint(buf, 10, ))  \
  |^
drivers/macintosh/therm_adt746x.c:391:1: note: in expansion of macro 
'BUILD_STORE_FUNC_DEG'

  391 | BUILD_STORE_FUNC_DEG(limit_adjust,   th)
  | ^~~~
drivers/macintosh/therm_adt746x.c:362:1: error: expected expression 
before '}' token

  362 | }
  | ^
drivers/macintosh/therm_adt746x.c:391:1: note: in expansion of macro 
'BUILD_STORE_FUNC_DEG'

  391 | BUILD_STORE_FUNC_DEG(limit_adjust,   th)
  | ^~~~
drivers/macintosh/therm_adt746x.c:354:13: warning: unused variable 'i' 
[-Wunused-variable]

  354 | int i;  \
  | ^
drivers/macintosh/therm_adt746x.c:391:1: note: in expansion of macro 
'BUILD_STORE_FUNC_DEG'

  391 | BUILD_STORE_FUNC_DEG(limit_adjust,   th)
  | ^~~~
drivers/macintosh/therm_adt746x.c:352:28: warning: unused variable 'th' 
[-Wunused-variable]

  352 | struct thermostat *th 

Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-02 Thread Boris Ostrovsky




On 3/1/22 9:55 PM, Stefano Stabellini wrote:

On Tue, 1 Mar 2022, Christoph Hellwig wrote:

Allow to pass a remap argument to the swiotlb initialization functions
to handle the Xen/x86 remap case.  ARM/ARM64 never did any remapping
from xen_swiotlb_fixup, so we don't even need that quirk.




Aside from that the rest looks OK. Also, you can add my:

Tested-by: Stefano Stabellini 



Not for me, I fail to boot with

[   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), 
total 0 (slots), used 0 (slots)

(this is iscsi root so I need the NIC).


I bisected it to "x86: remove the IOMMU table infrastructure" but haven't 
actually looked at the code yet.


-boris


RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread David Laight
From: Xiaomeng Tong
> Sent: 02 March 2022 09:31
> 
> On Mon, 28 Feb 2022 16:41:04 -0800, Linus Torvalds
>  wrote:
> >
> > But basically to _me_, the important part is that the end result is
> > maintainable longer-term.
> 
> I couldn't agree more. And because of that, I stick with the following
> approach because it's maintainable longer-term than "type(pos) pos" one:
>  Implements a new macro for each list_for_each_entry* with _inside suffix.
>   #define list_for_each_entry_inside(pos, type, head, member)

I think that it would be better to make any alternate loop macro
just set the variable to NULL on the loop exit.
That is easier to code for and the compiler might be persuaded to
not redo the test.

It also doesn't need an extra variable defined in the for() statement
so can be back-ported to older kernels without required declaration
in the middle of blocks.

OTOH there may be alternative definitions that can be used to get
the compiler (or other compiler-like tools) to detect broken code.
Even if the definition can't possibly generate a working kerrnel.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



Re: [PATCH 02/12] powerpc: wiiu: device tree

2022-03-02 Thread Rob Herring
On Tue, Mar 1, 2022 at 10:44 PM Ash Logan  wrote:
>
> Add a device tree source file for the Nintendo Wii U video game console.

Test this with 'make W=1 dtbs_checks'.

>
> Signed-off-by: Ash Logan 
> Co-developed-by: Roberto Van Eeden 
> Signed-off-by: Roberto Van Eeden 
> Co-developed-by: Emmanuel Gil Peyrot 
> Signed-off-by: Emmanuel Gil Peyrot 
> ---
>  arch/powerpc/boot/dts/wiiu.dts | 327 +
>  1 file changed, 327 insertions(+)
>  create mode 100644 arch/powerpc/boot/dts/wiiu.dts
>
> diff --git a/arch/powerpc/boot/dts/wiiu.dts b/arch/powerpc/boot/dts/wiiu.dts
> new file mode 100644
> index ..aaf264963f61
> --- /dev/null
> +++ b/arch/powerpc/boot/dts/wiiu.dts
> @@ -0,0 +1,327 @@
> +// SPDX-License-Identifier: GPL-2.0

What about non-GPL environments?

> +/*
> + * Nintendo Wii U Device Tree Source
> + *
> + * Copyright (C) 2022 The linux-wiiu Team
> + */
> +
> +/dts-v1/;
> +#include 
> +#include 
> +
> +/ {
> +   model = "nintendo,wiiu";
> +   compatible = "nintendo,wiiu";
> +
> +   #address-cells = <1>;
> +   #size-cells = <1>;
> +
> +   chosen {
> +   bootargs = "root=/dev/sda1 rootwait";
> +   };
> +
> +   memory {
> +   device_type = "memory";
> +   reg = <0x 0x0200/* MEM1 - 32MiB */
> +  0x0800 0x0030/* MEM0 - 3MiB  */
> +  0x1000 0x8000>;  /* MEM2 - 2GiB  */
> +   };
> +
> +   cpus {
> +   #address-cells = <1>;
> +   #size-cells = <0>;
> +
> +   /* TODO: Add SMP */
> +   PowerPC,espresso@0 {
> +   device_type = "cpu";
> +   reg = <0>;
> +   clock-frequency = <1243125000>; /* 
> 1.243125GHz */
> +   bus-frequency = <248625000>;/* 248.625MHz 
> core-to-bus 5x */
> +   timebase-frequency = <62156250>;/* 1/4 of the 
> bus clock */
> +   i-cache-size = <32768>; /* 32K icache */
> +   i-cache-line-size = <32>;
> +   i-cache-block-size = <32>;
> +   i-cache-sets = <128>;
> +   d-cache-size = <32768>; /* 32K dcache */
> +   d-cache-line-size = <32>;
> +   d-cache-block-size = <32>;
> +   d-cache-sets = <128>;
> +   next-level-cache = <_0>;
> +   L2_0:l2-cache {
> +   compatible = "cache";
> +   cache-level = <2>;
> +   cache-unified;
> +   cache-size = <0x8>; /* 512KB L2 */
> +   cache-line-size = <64>;
> +   cache-block-size = <32>;
> +   cache-sets = <2048>;
> +   };
> +   };
> +   };
> +
> +   latte {
> +   #address-cells = <1>;
> +   #size-cells = <1>;
> +   compatible = "nintendo,latte";
> +   ranges = <0x0c00 0x0c00 0x0040  /* 
> Espresso-only registers */
> + 0x0d00 0x0d00 0x0020  /* Latte AHB 
> deivces */
> + 0x0d80 0x0d80 0x0080>;/* Latte SoC 
> registers */
> +
> +   gpu7@c20 {

gpu@...

> +   compatible = "nintendo,latte-gpu7";
> +   reg = <0x0c20 0x8>;
> +   interrupts = <2>;
> +   interrupt-parent = <_pic>;
> +   };
> +
> +   espresso_pic: pic@c78 {
> +   #interrupt-cells = <1>;
> +   interrupt-controller;
> +
> +   compatible = "nintendo,espresso-pic";
> +   reg = <0x0c78 0x18>;
> +   };
> +
> +   latte_dsp: dsp@c005000 {
> +   compatible = "nintendo,latte-dsp";
> +   reg = <0x0c005000 0x200>;
> +   };
> +
> +   ehci_0: usb@d04 {
> +   compatible = "nintendo,latte-usb-ehci", "usb-ehci";
> +   reg = <0x0d04 0x100>;
> +   interrupts = <4>;
> +   interrupt-parent = <_pic>;
> +   big-endian-regs;
> +   };
> +
> +   ohci_0_0: usb@d05 {
> +   compatible = "nintendo,latte-usb-ohci";
> +   reg = <0x0d05 0x100>;
> +   interrupts = <5>;
> +   interrupt-parent = <_pic>;
> +
> +   big-endian-regs;
> +   };
> +
> +   ohci_0_1: usb@d06 {
> +   compatible = "nintendo,latte-usb-ohci";
> +  

Re: [PATCH 01/12] dt-bindings: wiiu: Document the Nintendo Wii U devicetree

2022-03-02 Thread Rob Herring
On Tue, Mar 1, 2022 at 10:44 PM Ash Logan  wrote:
>
> Adds schema for the various Wii U devicetree nodes used.

Please resend to the DT list if you want this reviewed and so that checks run.

>
> Signed-off-by: Ash Logan 
> ---
>  .../bindings/powerpc/nintendo/wiiu.yaml   | 28 +++
>  .../powerpc/nintendo/wiiu/espresso-pic.yaml   | 42 +
>  .../bindings/powerpc/nintendo/wiiu/gpu7.yaml  | 41 +
>  .../powerpc/nintendo/wiiu/latte-ahci.yaml | 43 +
>  .../powerpc/nintendo/wiiu/latte-dsp.yaml  | 35 ++
>  .../powerpc/nintendo/wiiu/latte-pic.yaml  | 46 +++
>  .../powerpc/nintendo/wiiu/latte-sdhci.yaml| 40 
>  .../bindings/powerpc/nintendo/wiiu/latte.yaml | 25 ++
>  8 files changed, 300 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/powerpc/nintendo/wiiu.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/powerpc/nintendo/wiiu/espresso-pic.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/powerpc/nintendo/wiiu/gpu7.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/powerpc/nintendo/wiiu/latte-ahci.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/powerpc/nintendo/wiiu/latte-dsp.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/powerpc/nintendo/wiiu/latte-pic.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/powerpc/nintendo/wiiu/latte-sdhci.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/powerpc/nintendo/wiiu/latte.yaml
>
> diff --git a/Documentation/devicetree/bindings/powerpc/nintendo/wiiu.yaml 
> b/Documentation/devicetree/bindings/powerpc/nintendo/wiiu.yaml
> new file mode 100644
> index ..5824b07928f5
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/powerpc/nintendo/wiiu.yaml
> @@ -0,0 +1,28 @@
> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
> +%YAML 1.2
> +---
> +
> +$id: http://devicetree.org/schemas/powerpc/nintendo/wiiu.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Nintendo Wii U bindings
> +
> +maintainers:
> +  - Ash Logan 
> +  - Emmanuel Gil Peyrot 
> +
> +description: |
> +  Nintendo Wii U video game console binding.
> +
> +properties:
> +  $nodename:
> +const: "/"
> +  compatible:
> +oneOf:
> +  - description: Nintendo Wii U video game console
> +items:
> +  - const: nintendo,wiiu
> +
> +additionalProperties: true
> +
> +...
> diff --git 
> a/Documentation/devicetree/bindings/powerpc/nintendo/wiiu/espresso-pic.yaml 
> b/Documentation/devicetree/bindings/powerpc/nintendo/wiiu/espresso-pic.yaml
> new file mode 100644
> index ..878a81595f5f
> --- /dev/null
> +++ 
> b/Documentation/devicetree/bindings/powerpc/nintendo/wiiu/espresso-pic.yaml
> @@ -0,0 +1,42 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/powerpc/nintendo/wiiu/espresso-pic.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Nintendo Wii U "Espresso" interrupt controller
> +
> +maintainers:
> +  - Ash Logan 
> +  - Emmanuel Gil Peyrot 
> +
> +description: |
> +  Interrupt controller found on the Nintendo Wii U for the "Espresso" 
> processor.
> +
> +properties:
> +  compatible:
> +oneOf:
> +  - description: Nintendo Wii U "Espresso" interrupt controller
> +items:
> +  - const: nintendo,espresso-pic
> +  '#interrupt-cells':
> +# Interrupt numbers 0-32 in one cell
> +const: 1
> +  interrupt-controller: true
> +  reg:
> +items:
> +  - description: Core registers
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +espresso_pic: pic@c78 {
> +#interrupt-cells = <1>;
> +interrupt-controller;
> +
> +compatible = "nintendo,espresso-pic";
> +reg = <0x0c78 0x18>;
> +};
> +
> +...
> diff --git 
> a/Documentation/devicetree/bindings/powerpc/nintendo/wiiu/gpu7.yaml 
> b/Documentation/devicetree/bindings/powerpc/nintendo/wiiu/gpu7.yaml
> new file mode 100644
> index ..e54d49015f36
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/powerpc/nintendo/wiiu/gpu7.yaml
> @@ -0,0 +1,41 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/powerpc/nintendo/wiiu/gpu7.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Nintendo Wii U Latte "GPU7" graphics processor
> +
> +maintainers:
> +  - Ash Logan 
> +  - Emmanuel Gil Peyrot 
> +
> +description: |
> +  GPU7 graphics processor, also known as "GX2", found in the Latte 
> multifunction chip of the
> +  Nintendo Wii U.
> +
> +properties:
> +  compatible:
> +oneOf:
> +  - description: Nintendo Wii U Latte "GPU7" graphics processor
> +items:
> +  - const: nintendo,latte-gpu7
> +  reg:
> +items:
> +  - description: 

Re: [PATCH -next] powerpc/pmac: remove not use symbol

2022-03-02 Thread Christophe Leroy




Le 09/04/2021 à 11:35, Zucheng Zheng a écrit :

sccdbg symbol is not used, so remove it


You could mention that it hasn't been used since commit 51d3082fe6e5 
("[PATCH] powerpc: Unify udbg (#2)")





Reported-by: Hulk Robot 
Signed-off-by: Zucheng Zheng 
---
  arch/powerpc/platforms/powermac/setup.c | 4 
  1 file changed, 4 deletions(-)

diff --git a/arch/powerpc/platforms/powermac/setup.c 
b/arch/powerpc/platforms/powermac/setup.c
index db5107c80485..df9ea37d5708 100644
--- a/arch/powerpc/platforms/powermac/setup.c
+++ b/arch/powerpc/platforms/powermac/setup.c
@@ -83,10 +83,6 @@ extern struct machdep_calls pmac_md;
  
  #define DEFAULT_ROOT_DEVICE Root_SDA1	/* sda1 - slightly silly choice */
  
-#ifdef CONFIG_PPC64

-int sccdbg;
-#endif
-
  sys_ctrler_t sys_ctrler = SYS_CTRLER_UNKNOWN;
  EXPORT_SYMBOL(sys_ctrler);
  


Re: [PATCH -next] powerpc/pmac: Make some symbols static

2022-03-02 Thread Christophe Leroy


Le 09/04/2021 à 11:38, Zucheng Zheng a écrit :
> ppc_override_l2cr/ppc_override_l2cr_value/has_l2cache symbol is not used
> outside of setup.c, so commit marks it static.
> 
> Reported-by: Hulk Robot 
> Signed-off-by: Zucheng Zheng 
> ---
>   arch/powerpc/platforms/powermac/setup.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powermac/setup.c 
> b/arch/powerpc/platforms/powermac/setup.c
> index 86aee3f2483f..db5107c80485 100644
> --- a/arch/powerpc/platforms/powermac/setup.c
> +++ b/arch/powerpc/platforms/powermac/setup.c
> @@ -71,9 +71,9 @@
>   
>   #undef SHOW_GATWICK_IRQS
>   
> -int ppc_override_l2cr = 0;
> -int ppc_override_l2cr_value;
> -int has_l2cache = 0;
> +static int ppc_override_l2cr;
> +static int ppc_override_l2cr_value;
> +static int has_l2cache;
>   
>   int pmac_newworld;
>   

With ppc64_defconfig,

   CC  arch/powerpc/platforms/powermac/setup.o
arch/powerpc/platforms/powermac/setup.c:75:12: error: 
'ppc_override_l2cr_value' defined but not used [-Werror=unused-variable]
75 | static int ppc_override_l2cr_value;
   |^~~
arch/powerpc/platforms/powermac/setup.c:74:12: error: 
'ppc_override_l2cr' defined but not used [-Werror=unused-variable]
74 | static int ppc_override_l2cr;
   |^
cc1: all warnings being treated as errors
make[3]: *** [scripts/Makefile.build:288: 
arch/powerpc/platforms/powermac/setup.o] Error 1


You have to move it inside the #ifdef CONFIG_PPC32 block that uses it.

Christophe

Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-02 Thread Boris Ostrovsky



On 3/2/22 8:15 AM, Boris Ostrovsky wrote:



On 3/1/22 9:55 PM, Stefano Stabellini wrote:

On Tue, 1 Mar 2022, Christoph Hellwig wrote:

Allow to pass a remap argument to the swiotlb initialization functions
to handle the Xen/x86 remap case.  ARM/ARM64 never did any remapping
from xen_swiotlb_fixup, so we don't even need that quirk.




Aside from that the rest looks OK. Also, you can add my:

Tested-by: Stefano Stabellini 



Not for me, I fail to boot with

[   52.202000] bnxt_en :31:00.0: swiotlb buffer is full (sz: 256 bytes), 
total 0 (slots), used 0 (slots)

(this is iscsi root so I need the NIC).


I bisected it to "x86: remove the IOMMU table infrastructure" but haven't 
actually looked at the code yet.



Again, this is as dom0. Baremetal is fine.


-boris



Re: [PATCH v4] powerpc/papr_scm: Implement initial support for injecting smart errors

2022-03-02 Thread Michael Ellerman
On Tue, 25 Jan 2022 01:52:04 +0530, Vaibhav Jain wrote:
> Presently PAPR doesn't support injecting smart errors on an
> NVDIMM. This makes testing the NVDIMM health reporting functionality
> difficult as simulating NVDIMM health related events need a hacked up
> qemu version.
> 
> To solve this problem this patch proposes simulating certain set of
> NVDIMM health related events in papr_scm. Specifically 'fatal' health
> state and 'dirty' shutdown state. These error can be injected via the
> user-space 'ndctl-inject-smart(1)' command. With the proposed patch and
> corresponding ndctl patches following command flow is expected:
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/papr_scm: Implement initial support for injecting smart errors
  https://git.kernel.org/powerpc/c/bbbca72352bb9484bc057c91a408332b35ee8f4c

cheers


Re: [PATCH] powerpc/boot: Add `otheros-too-big.bld` to .gitignore

2022-03-02 Thread Michael Ellerman
On Mon, 14 Feb 2022 07:55:43 +0100, Paul Menzel wrote:
> Currently, `git status` lists the file as untracked by git, so tell git
> to ignore it.
> 
> 

Applied to powerpc/next.

[1/1] powerpc/boot: Add `otheros-too-big.bld` to .gitignore
  https://git.kernel.org/powerpc/c/cb7356986db020c96f37532042fdae6706e81df7

cheers


Re: [PATCH] powerpc/64s/hash: Make hash faults work in NMI context

2022-03-02 Thread Michael Ellerman
On Fri, 4 Feb 2022 13:53:48 +1000, Nicholas Piggin wrote:
> Hash faults are not resoved in NMI context, instead causing the access
> to fail. This is done because perf interrupts can get backtraces
> including walking the user stack, and taking a hash fault on those could
> deadlock on the HPTE lock if the perf interrupt hits while the same HPTE
> lock is being held by the hash fault code. The user-access for the stack
> walking will notice the access failed and deal with that in the perf
> code.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/64s/hash: Make hash faults work in NMI context
  https://git.kernel.org/powerpc/c/8b91cee5eadd2021f55e6775f2d50bd56d00c217

cheers


Re: [PATCH] powerpc/Makefile: Don't pass -mcpu=powerpc64 when building 32-bit

2022-03-02 Thread Michael Ellerman
On Tue, 15 Feb 2022 22:28:58 +1100, Michael Ellerman wrote:
> When CONFIG_GENERIC_CPU=y (true for all our defconfigs) we pass
> -mcpu=powerpc64 to the compiler, even when we're building a 32-bit
> kernel.
> 
> This happens because we have an ifdef CONFIG_PPC_BOOK3S_64/else block in
> the Makefile that was written before 32-bit supported GENERIC_CPU. Prior
> to that the else block only applied to 64-bit Book3E.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/Makefile: Don't pass -mcpu=powerpc64 when building 32-bit
  https://git.kernel.org/powerpc/c/2863dd2db23e0407f6c50b8ba5c0e55abef894f1

cheers


Re: [PATCH 00/20] Add perf sampling tests as part of selftest

2022-03-02 Thread Michael Ellerman
On Thu, 27 Jan 2022 12:49:52 +0530, Kajol Jain wrote:
> Patch series adds support for perf sampling tests that
> enables capturing sampling data in perf mmap buffer and
> further support for reading and processing the samples.
> It also addds basic utility functions to process the
> mmap buffer inorder to read total count of samples as
> well as the contents of sample.
> 
> [...]

Patches 1-15 and 17-20 applied to powerpc/next.

[01/20] selftest/powerpc/pmu: Include mmap_buffer field as part of struct event

https://git.kernel.org/powerpc/c/f961e20f15ed35e9ca154a099897d600b78b0311
[02/20] selftest/powerpc/pmu: Add support for perf sampling tests

https://git.kernel.org/powerpc/c/c315669e2fbd71bb9387066f60f0d91b0ceb28f3
[03/20] selftest/powerpc/pmu: Add macros to parse event codes

https://git.kernel.org/powerpc/c/6523dce86222451e5ca2df8a46597a217084bfdf
[04/20] selftest/powerpc/pmu: Add utility functions to post process the mmap 
buffer

https://git.kernel.org/powerpc/c/5f6c3061af7ca3b0f9f8a20ec7a445671f7e6b5a
[05/20] selftest/powerpc/pmu: Add event_init_sampling function

https://git.kernel.org/powerpc/c/54d4ba7f22d1ed5bfbc915771cf2e3e147cf03b2
[06/20] selftest/powerpc/pmu: Add macros to extract mmcr fields

https://git.kernel.org/powerpc/c/79c4e6aba8dfc9206acc68884498080f451121f7
[07/20] selftest/powerpc/pmu: Add macro to extract mmcr0/mmcr1 fields

https://git.kernel.org/powerpc/c/2b49e641063e7569493371ef433b9c4ce8c8dd8b
[08/20] selftest/powerpc/pmu: Add macro to extract mmcr3 and mmcra fields

https://git.kernel.org/powerpc/c/13307f9584ea9408d9959302074dc4e8308b6cab
[09/20] selftest/powerpc/pmu/: Add interface test for mmcr0 exception bits

https://git.kernel.org/powerpc/c/eb7aa044df18c6f7a88bc17fc4c9f4524652a290
[10/20] selftest/powerpc/pmu/: Add interface test for mmcr0_cc56run field

https://git.kernel.org/powerpc/c/a7c0ab2e61484c0844eae2f208a06cc940338d83
[11/20] selftest/powerpc/pmu/: Add interface test for mmcr0_pmccext bit

https://git.kernel.org/powerpc/c/b24142b9d2401468bcd8df157013306d5b4f6626
[12/20] selftest/powerpc/pmu/: Add interface test for mmcr0_pmcjce field

https://git.kernel.org/powerpc/c/9ac7c6d5e4b570f416d849b263a6f67a617b4fa5
[13/20] selftest/powerpc/pmu/: Add interface test for mmcr0_fc56 field using 
pmc1

https://git.kernel.org/powerpc/c/d5172f2585cd0fc9788aa9b25d3dce6483321792
[14/20] selftest/powerpc/pmu/: Add interface test for mmcr0_pmc56 using pmc5

https://git.kernel.org/powerpc/c/6e11374b08723b2c43ae83bd5d48000d695f13a0
[15/20] selftest/powerpc/pmu/: Add interface test for mmcr1_comb field

https://git.kernel.org/powerpc/c/2becea3b6acf308642d6c0e9bd41caf7820753f5
[17/20] selftest/powerpc/pmu/: Add interface test for mmcr2_l2l3 field

https://git.kernel.org/powerpc/c/ac575b2606bf99a0d01a054196e24e1ad82c194d
[18/20] selftest/powerpc/pmu/: Add interface test for mmcr2_fcs_fch fields

https://git.kernel.org/powerpc/c/9ee241f1b1447c7e8ca90902ab1888aa9e7a3c00
[19/20] selftest/powerpc/pmu/: Add interface test for mmcr3_src fields

https://git.kernel.org/powerpc/c/02f02feb6b50c67171fd56bc3fd0fd96118c5c12
[20/20] selftest/powerpc/pmu: Add interface test for mmcra register fields

https://git.kernel.org/powerpc/c/29cf373c5766e6bd1b97056d2d678a41777669aa

cheers


Re: [RESEND PATCH v2] powerpc/fadump: register for fadump as early as possible

2022-03-02 Thread Michael Ellerman
On Tue, 1 Feb 2022 16:23:05 +0530, Hari Bathini wrote:
> Crash recovery (fadump) is setup in the userspace by some service.
> This service rebuilds initrd with dump capture capability, if it is
> not already dump capture capable before proceeding to register for
> firmware assisted dump (echo 1 > /sys/kernel/fadump/registered). But
> arming the kernel with crash recovery support does not have to wait
> for userspace configuration. So, register for fadump while setting
> it up itself. This can at worst lead to a scenario, where /proc/vmcore
> is ready afer crash but the initrd does not know how/where to offload
> it, which is always better than not having a /proc/vmcore at all due
> to incomplete configuration in the userspace at the time of crash.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/fadump: register for fadump as early as possible
  https://git.kernel.org/powerpc/c/607451ce0aa9bdff590db4d087173edba6d7a29d

cheers


Re: [PATCH] powerpc/module_64: fix array_size.cocci warning

2022-03-02 Thread Michael Ellerman
On Wed, 23 Feb 2022 15:54:23 +0800, Guo Zhengkui wrote:
> Fix following coccicheck warning:
> ./arch/powerpc/kernel/module_64.c:432:40-41: WARNING: Use ARRAY_SIZE.
> 
> ARRAY_SIZE(arr) is a macro provided by the kernel. It makes sure that arr
> is an array, so it's safer than sizeof(arr) / sizeof(arr[0]) and more
> standard.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/module_64: fix array_size.cocci warning
  https://git.kernel.org/powerpc/c/8a0edc72bec25fa62450bfef1a150483558e1289

cheers


Re: [PATCHv2 1/3] powerpc: lib: sstep: fix 'sthcx' instruction

2022-03-02 Thread Michael Ellerman
On Thu, 24 Feb 2022 17:22:13 +0100, Anders Roxell wrote:
> Looks like there been a copy paste mistake when added the instruction
> 'stbcx' twice and one was probably meant to be 'sthcx'.
> Changing to 'sthcx' from 'stbcx'.
> 
> 

Applied to powerpc/next.

[1/3] powerpc: lib: sstep: fix 'sthcx' instruction
  https://git.kernel.org/powerpc/c/a633cb1edddaa643fadc70abc88f89a408fa834a
[2/3] powerpc: fix build errors
  https://git.kernel.org/powerpc/c/8667d0d64dd1f84fd41b5897fd87fa9113ae05e3
[3/3] powerpc: lib: sstep: fix build errors
  https://git.kernel.org/powerpc/c/8219d31effa7be5dbc7ff915d7970672e028c701

cheers


Re: [PATCH] powerpc: Don't allow the use of EMIT_BUG_ENTRY with BUGFLAG_WARNING

2022-03-02 Thread Michael Ellerman
On Sun, 13 Feb 2022 10:02:41 +0100, Christophe Leroy wrote:
> Warnings in assembly must use EMIT_WARN_ENTRY in order to generate
> the necessary entry in exception table.
> 
> Check in EMIT_BUG_ENTRY that flags don't include BUGFLAG_WARNING.
> 
> This change avoids problems like the one fixed by
> commit fd1ea686 ("powerpc/64s: Use EMIT_WARN_ENTRY for SRR debug
> warnings").
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: Don't allow the use of EMIT_BUG_ENTRY with BUGFLAG_WARNING
  https://git.kernel.org/powerpc/c/38a1756861b8fc2ea9afb93e231194c642a4e261

cheers


Re: [PATCH v1] powerpc/interrupt: Remove struct interrupt_state

2022-03-02 Thread Michael Ellerman
On Fri, 25 Feb 2022 17:36:22 +0100, Christophe Leroy wrote:
> Since commit ceff77efa4f8 ("powerpc/64e/interrupt: Use new interrupt
> context tracking scheme") struct interrupt_state has been empty and
> unused.
> 
> Remove it.
> 
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/interrupt: Remove struct interrupt_state
  https://git.kernel.org/powerpc/c/973e2e6462405d85d3e8bb02d516d5fe6d1193ed

cheers


Re: [PATCH] powerpc: Remove remaining stab codes

2022-03-02 Thread Michael Ellerman
On Tue, 22 Feb 2022 15:05:30 +0100, Christophe Leroy wrote:
> Following commit 12318163737c ("powerpc/32: Remove remaining .stabs
> annotations"), stabs code are not used anymore.
> 
> Remove them.
> 
> 

Applied to powerpc/next.

[1/1] powerpc: Remove remaining stab codes
  https://git.kernel.org/powerpc/c/406a8c1d8fa59ae6a6462a6fb6ff892f6a4f7499

cheers


Re: [PATCH] powerpc/mm/numa: skip NUMA_NO_NODE onlining in parse_numa_properties()

2022-03-02 Thread Michael Ellerman
On Thu, 24 Feb 2022 15:23:12 -0300, Daniel Henrique Barboza wrote:
> Executing node_set_online() when nid = NUMA_NO_NODE results in an
> undefined behavior. node_set_online() will call node_set_state(), into
> __node_set(), into set_bit(), and since NUMA_NO_NODE is -1 we'll end up
> doing a negative shift operation inside
> arch/powerpc/include/asm/bitops.h. This potential UB was detected
> running a kernel with CONFIG_UBSAN.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/mm/numa: skip NUMA_NO_NODE onlining in parse_numa_properties()
  https://git.kernel.org/powerpc/c/749ed4a20657bcea66a6e082ca3dc0d228cbec80

cheers


Re: [PATCH v1 1/4] powerpc/ftrace: Also save r1 in ftrace_caller()

2022-03-02 Thread Michael Ellerman
On Tue, 15 Feb 2022 19:31:22 +0100, Christophe Leroy wrote:
> Also save r1 in ftrace_caller()
> 
> r1 is needed during unwinding when the function_graph tracer
> is active.
> 
> 

Applied to powerpc/next.

[1/4] powerpc/ftrace: Also save r1 in ftrace_caller()
  https://git.kernel.org/powerpc/c/34d8dac807f0ee3dc42ab45bdb284a3caf2b5ed1
[2/4] powerpc/ftrace: Add recursion protection in prepare_ftrace_return()
  https://git.kernel.org/powerpc/c/df45a55788286c541449d82ee09fef3ac5ff77a1
[3/4] powerpc/ftrace: Have arch_ftrace_get_regs() return NULL unless 
FL_SAVE_REGS is set
  https://git.kernel.org/powerpc/c/fc75f87337983229b7355d6b77f30fb6e7f359ee
[4/4] powerpc/ftrace: Style cleanup in ftrace_mprofile.S
  https://git.kernel.org/powerpc/c/76b372814b088aeb76f0f753d968c8aa6d297f2a

cheers


Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT

2022-03-02 Thread Russell King (Oracle)
On Wed, Mar 02, 2022 at 04:36:52PM +0530, Anshuman Khandual wrote:
> On 3/2/22 3:35 PM, Geert Uytterhoeven wrote:
> > I doubt the switch() variant would give better code on any platform.
> > 
> > What about using tables everywhere, using designated initializers
> > to improve readability?
> 
> Designated initializers ? Could you please be more specific. A table look
> up on arm platform would be something like this and arm_protection_map[]
> needs to be updated with user_pgprot like before.

There is *absolutely* nothing wrong with that. Updating it once during
boot is way more efficient than having to compute the value each time
vm_get_page_prot() gets called.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT

2022-03-02 Thread Geert Uytterhoeven
Hi Anshuman,

On Wed, Mar 2, 2022 at 12:07 PM Anshuman Khandual
 wrote:
> On 3/2/22 3:35 PM, Geert Uytterhoeven wrote:
> > On Wed, Mar 2, 2022 at 10:51 AM Anshuman Khandual
> >  wrote:
> >> On 3/2/22 12:35 PM, Christophe Leroy wrote:
> >>> Le 02/03/2022 à 04:22, Anshuman Khandual a écrit :
>  On 3/1/22 1:46 PM, Christophe Leroy wrote:
> > Le 01/03/2022 à 01:31, Russell King (Oracle) a écrit :
> >> On Tue, Mar 01, 2022 at 05:30:41AM +0530, Anshuman Khandual wrote:
> >>> On 2/28/22 4:27 PM, Russell King (Oracle) wrote:
>  On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
> > This defines and exports a platform specific custom 
> > vm_get_page_prot() via
> > subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and 
> > __PXXX
> > macros can be dropped which are no longer needed.
> 
>  What I would really like to know is why having to run _code_ to work 
>  out
>  what the page protections need to be is better than looking it up in 
>  a
>  table.
> 
>  Not only is this more expensive in terms of CPU cycles, it also 
>  brings
>  additional code size with it.
> 
>  I'm struggling to see what the benefit is.
> >>>
> >>> Currently vm_get_page_prot() is also being _run_ to fetch required 
> >>> page
> >>> protection values. Although that is being run in the core MM and from 
> >>> a
> >>> platform perspective __SXXX, __PXXX are just being exported for a 
> >>> table.
> >>> Looking it up in a table (and applying more constructs there after) is
> >>> not much different than a clean switch case statement in terms of CPU
> >>> usage. So this is not more expensive in terms of CPU cycles.
> >>
> >> I disagree.
> >
> > So do I.
> >
> >>
> >> However, let's base this disagreement on some evidence. Here is the
> >> present 32-bit ARM implementation:
> >>
> >> 0048 :
> >> 48:   e20fand r0, r0, #15
> >> 4c:   e3003000movwr3, #0
> >>   4c: R_ARM_MOVW_ABS_NC   .LANCHOR1
> >> 50:   e3403000movtr3, #0
> >>   50: R_ARM_MOVT_ABS  .LANCHOR1
> >> 54:   e7930100ldr r0, [r3, r0, lsl #2]
> >> 58:   e12fff1ebx  lr
> >>
> >> That is five instructions long.
> >
> > On ppc32 I get:
> >
> > 0094 :
> > 94: 3d 20 00 00 lis r9,0
> > 96: R_PPC_ADDR16_HA .data..ro_after_init
> > 98: 54 84 16 ba rlwinm  r4,r4,2,26,29
> > 9c: 39 29 00 00 addir9,r9,0
> > 9e: R_PPC_ADDR16_LO .data..ro_after_init
> > a0: 7d 29 20 2e lwzxr9,r9,r4
> > a4: 91 23 00 00 stw r9,0(r3)
> > a8: 4e 80 00 20 blr
> >
> >
> >>
> >> Please show that your new implementation is not more expensive on
> >> 32-bit ARM. Please do so by building a 32-bit kernel, and providing
> >> the disassembly.
> >
> > With your series I get:
> >
> >  :
> >  0: 3d 20 00 00 lis r9,0
> > 2: R_PPC_ADDR16_HA  .rodata
> >  4: 39 29 00 00 addir9,r9,0
> > 6: R_PPC_ADDR16_LO  .rodata
> >  8: 54 84 16 ba rlwinm  r4,r4,2,26,29
> >  c: 7d 49 20 2e lwzxr10,r9,r4
> > 10: 7d 4a 4a 14 add r10,r10,r9
> > 14: 7d 49 03 a6 mtctr   r10
> > 18: 4e 80 04 20 bctr
> > 1c: 39 20 03 15 li  r9,789
> > 20: 91 23 00 00 stw r9,0(r3)
> > 24: 4e 80 00 20 blr
> > 28: 39 20 01 15 li  r9,277
> > 2c: 91 23 00 00 stw r9,0(r3)
> > 30: 4e 80 00 20 blr
> > 34: 39 20 07 15 li  r9,1813
> > 38: 91 23 00 00 stw r9,0(r3)
> > 3c: 4e 80 00 20 blr
> > 40: 39 20 05 15 li  r9,1301
> > 44: 91 23 00 00 stw r9,0(r3)
> > 48: 4e 80 00 20 blr
> > 4c: 39 20 01 11 li  r9,273
> > 50: 4b ff ff d0 b   20 
> >
> >
> > That is definitely more expensive, it implements a table of branches.
> 
>  Okay, will split out the PPC32 implementation that retains existing
>  table look up method. Also planning to keep that inside same file
>  (arch/powerpc/mm/mmap.c), unless you have a difference preference.
> >>>
> >>> My point was not to get something specific for PPC32, but to amplify on
> >>> Russell's objection.
> >>>
> >>> As this is bad for ARM and bad for PPC32, do we have any evidence that
> 

Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT

2022-03-02 Thread Anshuman Khandual



On 3/2/22 3:35 PM, Geert Uytterhoeven wrote:
> Hi Anshuman,
> 
> On Wed, Mar 2, 2022 at 10:51 AM Anshuman Khandual
>  wrote:
>> On 3/2/22 12:35 PM, Christophe Leroy wrote:
>>> Le 02/03/2022 à 04:22, Anshuman Khandual a écrit :
 On 3/1/22 1:46 PM, Christophe Leroy wrote:
> Le 01/03/2022 à 01:31, Russell King (Oracle) a écrit :
>> On Tue, Mar 01, 2022 at 05:30:41AM +0530, Anshuman Khandual wrote:
>>> On 2/28/22 4:27 PM, Russell King (Oracle) wrote:
 On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
> This defines and exports a platform specific custom 
> vm_get_page_prot() via
> subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and 
> __PXXX
> macros can be dropped which are no longer needed.

 What I would really like to know is why having to run _code_ to work 
 out
 what the page protections need to be is better than looking it up in a
 table.

 Not only is this more expensive in terms of CPU cycles, it also brings
 additional code size with it.

 I'm struggling to see what the benefit is.
>>>
>>> Currently vm_get_page_prot() is also being _run_ to fetch required page
>>> protection values. Although that is being run in the core MM and from a
>>> platform perspective __SXXX, __PXXX are just being exported for a table.
>>> Looking it up in a table (and applying more constructs there after) is
>>> not much different than a clean switch case statement in terms of CPU
>>> usage. So this is not more expensive in terms of CPU cycles.
>>
>> I disagree.
>
> So do I.
>
>>
>> However, let's base this disagreement on some evidence. Here is the
>> present 32-bit ARM implementation:
>>
>> 0048 :
>> 48:   e20fand r0, r0, #15
>> 4c:   e3003000movwr3, #0
>>   4c: R_ARM_MOVW_ABS_NC   .LANCHOR1
>> 50:   e3403000movtr3, #0
>>   50: R_ARM_MOVT_ABS  .LANCHOR1
>> 54:   e7930100ldr r0, [r3, r0, lsl #2]
>> 58:   e12fff1ebx  lr
>>
>> That is five instructions long.
>
> On ppc32 I get:
>
> 0094 :
> 94: 3d 20 00 00 lis r9,0
> 96: R_PPC_ADDR16_HA .data..ro_after_init
> 98: 54 84 16 ba rlwinm  r4,r4,2,26,29
> 9c: 39 29 00 00 addir9,r9,0
> 9e: R_PPC_ADDR16_LO .data..ro_after_init
> a0: 7d 29 20 2e lwzxr9,r9,r4
> a4: 91 23 00 00 stw r9,0(r3)
> a8: 4e 80 00 20 blr
>
>
>>
>> Please show that your new implementation is not more expensive on
>> 32-bit ARM. Please do so by building a 32-bit kernel, and providing
>> the disassembly.
>
> With your series I get:
>
>  :
>  0: 3d 20 00 00 lis r9,0
> 2: R_PPC_ADDR16_HA  .rodata
>  4: 39 29 00 00 addir9,r9,0
> 6: R_PPC_ADDR16_LO  .rodata
>  8: 54 84 16 ba rlwinm  r4,r4,2,26,29
>  c: 7d 49 20 2e lwzxr10,r9,r4
> 10: 7d 4a 4a 14 add r10,r10,r9
> 14: 7d 49 03 a6 mtctr   r10
> 18: 4e 80 04 20 bctr
> 1c: 39 20 03 15 li  r9,789
> 20: 91 23 00 00 stw r9,0(r3)
> 24: 4e 80 00 20 blr
> 28: 39 20 01 15 li  r9,277
> 2c: 91 23 00 00 stw r9,0(r3)
> 30: 4e 80 00 20 blr
> 34: 39 20 07 15 li  r9,1813
> 38: 91 23 00 00 stw r9,0(r3)
> 3c: 4e 80 00 20 blr
> 40: 39 20 05 15 li  r9,1301
> 44: 91 23 00 00 stw r9,0(r3)
> 48: 4e 80 00 20 blr
> 4c: 39 20 01 11 li  r9,273
> 50: 4b ff ff d0 b   20 
>
>
> That is definitely more expensive, it implements a table of branches.

 Okay, will split out the PPC32 implementation that retains existing
 table look up method. Also planning to keep that inside same file
 (arch/powerpc/mm/mmap.c), unless you have a difference preference.
>>>
>>> My point was not to get something specific for PPC32, but to amplify on
>>> Russell's objection.
>>>
>>> As this is bad for ARM and bad for PPC32, do we have any evidence that
>>> your change is good for any other architecture ?
>>>
>>> I checked PPC64 and there is exactly the same drawback. With the current
>>> implementation it is a small function performing table read then a few
>>> adjustment. After your change it is a bigger function implementing a
>>> table of branches.
>>
>> I am wondering 

Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT

2022-03-02 Thread Geert Uytterhoeven
Hi Anshuman,

On Wed, Mar 2, 2022 at 10:51 AM Anshuman Khandual
 wrote:
> On 3/2/22 12:35 PM, Christophe Leroy wrote:
> > Le 02/03/2022 à 04:22, Anshuman Khandual a écrit :
> >> On 3/1/22 1:46 PM, Christophe Leroy wrote:
> >>> Le 01/03/2022 à 01:31, Russell King (Oracle) a écrit :
>  On Tue, Mar 01, 2022 at 05:30:41AM +0530, Anshuman Khandual wrote:
> > On 2/28/22 4:27 PM, Russell King (Oracle) wrote:
> >> On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
> >>> This defines and exports a platform specific custom 
> >>> vm_get_page_prot() via
> >>> subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and 
> >>> __PXXX
> >>> macros can be dropped which are no longer needed.
> >>
> >> What I would really like to know is why having to run _code_ to work 
> >> out
> >> what the page protections need to be is better than looking it up in a
> >> table.
> >>
> >> Not only is this more expensive in terms of CPU cycles, it also brings
> >> additional code size with it.
> >>
> >> I'm struggling to see what the benefit is.
> >
> > Currently vm_get_page_prot() is also being _run_ to fetch required page
> > protection values. Although that is being run in the core MM and from a
> > platform perspective __SXXX, __PXXX are just being exported for a table.
> > Looking it up in a table (and applying more constructs there after) is
> > not much different than a clean switch case statement in terms of CPU
> > usage. So this is not more expensive in terms of CPU cycles.
> 
>  I disagree.
> >>>
> >>> So do I.
> >>>
> 
>  However, let's base this disagreement on some evidence. Here is the
>  present 32-bit ARM implementation:
> 
>  0048 :
>  48:   e20fand r0, r0, #15
>  4c:   e3003000movwr3, #0
>    4c: R_ARM_MOVW_ABS_NC   .LANCHOR1
>  50:   e3403000movtr3, #0
>    50: R_ARM_MOVT_ABS  .LANCHOR1
>  54:   e7930100ldr r0, [r3, r0, lsl #2]
>  58:   e12fff1ebx  lr
> 
>  That is five instructions long.
> >>>
> >>> On ppc32 I get:
> >>>
> >>> 0094 :
> >>> 94: 3d 20 00 00 lis r9,0
> >>> 96: R_PPC_ADDR16_HA .data..ro_after_init
> >>> 98: 54 84 16 ba rlwinm  r4,r4,2,26,29
> >>> 9c: 39 29 00 00 addir9,r9,0
> >>> 9e: R_PPC_ADDR16_LO .data..ro_after_init
> >>> a0: 7d 29 20 2e lwzxr9,r9,r4
> >>> a4: 91 23 00 00 stw r9,0(r3)
> >>> a8: 4e 80 00 20 blr
> >>>
> >>>
> 
>  Please show that your new implementation is not more expensive on
>  32-bit ARM. Please do so by building a 32-bit kernel, and providing
>  the disassembly.
> >>>
> >>> With your series I get:
> >>>
> >>>  :
> >>>  0: 3d 20 00 00 lis r9,0
> >>> 2: R_PPC_ADDR16_HA  .rodata
> >>>  4: 39 29 00 00 addir9,r9,0
> >>> 6: R_PPC_ADDR16_LO  .rodata
> >>>  8: 54 84 16 ba rlwinm  r4,r4,2,26,29
> >>>  c: 7d 49 20 2e lwzxr10,r9,r4
> >>> 10: 7d 4a 4a 14 add r10,r10,r9
> >>> 14: 7d 49 03 a6 mtctr   r10
> >>> 18: 4e 80 04 20 bctr
> >>> 1c: 39 20 03 15 li  r9,789
> >>> 20: 91 23 00 00 stw r9,0(r3)
> >>> 24: 4e 80 00 20 blr
> >>> 28: 39 20 01 15 li  r9,277
> >>> 2c: 91 23 00 00 stw r9,0(r3)
> >>> 30: 4e 80 00 20 blr
> >>> 34: 39 20 07 15 li  r9,1813
> >>> 38: 91 23 00 00 stw r9,0(r3)
> >>> 3c: 4e 80 00 20 blr
> >>> 40: 39 20 05 15 li  r9,1301
> >>> 44: 91 23 00 00 stw r9,0(r3)
> >>> 48: 4e 80 00 20 blr
> >>> 4c: 39 20 01 11 li  r9,273
> >>> 50: 4b ff ff d0 b   20 
> >>>
> >>>
> >>> That is definitely more expensive, it implements a table of branches.
> >>
> >> Okay, will split out the PPC32 implementation that retains existing
> >> table look up method. Also planning to keep that inside same file
> >> (arch/powerpc/mm/mmap.c), unless you have a difference preference.
> >
> > My point was not to get something specific for PPC32, but to amplify on
> > Russell's objection.
> >
> > As this is bad for ARM and bad for PPC32, do we have any evidence that
> > your change is good for any other architecture ?
> >
> > I checked PPC64 and there is exactly the same drawback. With the current
> > implementation it is a small function performing table read then a few
> > adjustment. After your change it is a bigger function implementing a
> > table of branches.
>
> I am wondering if this would not be the case for any other switch case
> 

[RFC PATCH 6/7] serial: General support for multipoint addresses

2022-03-02 Thread Ilpo Järvinen
This patch adds generic support for serial multipoint
addressing. Two new ioctls are added. TIOCSADDR is used to
indicate the destination/receive address. TIOCGADDR returns
the current address in use. The driver should implement
set_addr and get_addr to support addressing mode.

Adjust ADDRB clearing to happen only if driver does not provide
set_addr (=the driver doesn't support address mode).

This change is necessary for supporting devices with RS485
multipoint addressing [*]. A following patch in the patch series
adds support for Synopsys Designware UART capable for 9th bit
addressing mode. In this mode, 9th bit is used to indicate an
address (byte) within the communication line. The 9th bit
addressing mode is selected using ADDRB introduced by the
previous patch.

Transmit addresses / receiver filter are specified by setting
the flags SER_ADDR_DEST and/or SER_ADDR_RECV. When the user
supplies the transmit address, in the 9bit addressing mode it is
sent out immediately with the 9th bit set to 1. After that, the
subsequent normal data bytes are sent with 9th bit as 0 and they
are intended to the device with the given address. It is up to
receiver to enforce the filter using SER_ADDR_RECV. When userspace
has supplied the receive address, the driver is expected to handle
the matching of the address and only data with that address is
forwarded to the user. Both SER_ADDR_DEST and SER_ADDR_RECV can
be given at the same time in a single call if the addresses are
the same.

The user can clear the receive filter with SER_ADDR_RECV_CLEAR.

[*] Technically, RS485 is just an electronic spec and does not
itself specify the 9th bit addressing mode but 9th bit seems
at least "semi-standard" way to do addressing with RS485.

Cc: linux-...@vger.kernel.org
Cc: Richard Henderson 
Cc: Ivan Kokshaysky 
Cc: Matt Turner 
Cc: linux-al...@vger.kernel.org
Cc: Thomas Bogendoerfer 
Cc: linux-m...@vger.kernel.org
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
Cc: linux-par...@vger.kernel.org
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Yoshinori Sato 
Cc: Rich Felker 
Cc: linux...@vger.kernel.org
Cc: "David S. Miller" 
Cc: sparcli...@vger.kernel.org
Cc: Chris Zankel 
Cc: Max Filippov 
Cc: linux-xte...@linux-xtensa.org
Cc: Arnd Bergmann 
Cc: linux-a...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Signed-off-by: Ilpo Järvinen 
---
 .../driver-api/serial/serial-rs485.rst| 23 ++-
 arch/alpha/include/uapi/asm/ioctls.h  |  3 +
 arch/mips/include/uapi/asm/ioctls.h   |  3 +
 arch/parisc/include/uapi/asm/ioctls.h |  3 +
 arch/powerpc/include/uapi/asm/ioctls.h|  3 +
 arch/sh/include/uapi/asm/ioctls.h |  3 +
 arch/sparc/include/uapi/asm/ioctls.h  |  3 +
 arch/xtensa/include/uapi/asm/ioctls.h |  3 +
 drivers/tty/serial/8250/8250_core.c   |  2 +
 drivers/tty/serial/serial_core.c  | 62 ++-
 include/linux/serial_core.h   |  6 ++
 include/uapi/asm-generic/ioctls.h |  3 +
 include/uapi/linux/serial.h   |  8 +++
 13 files changed, 123 insertions(+), 2 deletions(-)

diff --git a/Documentation/driver-api/serial/serial-rs485.rst 
b/Documentation/driver-api/serial/serial-rs485.rst
index 6bc824f948f9..2f45f007fa5b 100644
--- a/Documentation/driver-api/serial/serial-rs485.rst
+++ b/Documentation/driver-api/serial/serial-rs485.rst
@@ -95,7 +95,28 @@ RS485 Serial Communications
/* Error handling. See errno. */
}
 
-5. References
+5. Multipoint Addressing
+
+
+   The Linux kernel provides serial_addr structure to handle addressing within
+   multipoint serial communications line such as RS485. 9th bit addressiong 
mode
+   is enabled by adding ADDRB flag in termios c_cflag.
+
+   Serial core calls device specific set/get_addr in response to TIOCSADDR and
+   TIOCGADDR ioctls with a pointer to serial_addr. Destination and receive
+   address can be specified using serial_addr flags field. Receive address may
+   also be cleared using flags. Once an address is set, the communication
+   can occur only with the particular device and other peers are filtered out.
+   It is left up to the receiver side to enforce the filtering.
+
+   Address flags:
+   - SER_ADDR_RECV: Receive (filter) address.
+   - SER_ADDR_RECV_CLEAR: Clear receive filter (only for TIOCSADDR).
+   - SER_ADDR_DEST: Destination address.
+
+   Note: not all devices supporting RS485 support multipoint addressing.
+
+6. References
 =
 
  [1]   include/uapi/linux/serial.h
diff --git a/arch/alpha/include/uapi/asm/ioctls.h 
b/arch/alpha/include/uapi/asm/ioctls.h
index 971311605288..500cab3e1d6b 100644
--- a/arch/alpha/include/uapi/asm/ioctls.h
+++ b/arch/alpha/include/uapi/asm/ioctls.h
@@ -125,4 +125,7 @@
 #define TIOCMIWAIT 0x545C  /* wait for a change on serial input line(s) */
 #define TIOCGICOUNT0x545D  /* 

[RFC PATCH 5/7] serial: termbits: ADDRB to indicate 9th bit addressing mode

2022-03-02 Thread Ilpo Järvinen
Add ADDRB to termbits to indicate 9th bit addressing mode.
This change is necessary for supporting devices with RS485
multipoint addressing [*]. A later patch in the patch series
adds support for Synopsys Designware UART capable for 9th bit
addressing mode. In this mode, 9th bit is used to indicate an
address (byte) within the communication line. The 9th bit
addressing mode is selected using ADDRB introduced by an earlier
patch.

[*] Technically, RS485 is just an electronic spec and does not
itself specify the 9th bit addressing mode but 9th bit seems
at least "semi-standard" way to do addressing with RS485.

Cc: linux-...@vger.kernel.org
Cc: Richard Henderson 
Cc: Ivan Kokshaysky 
Cc: Matt Turner 
Cc: linux-al...@vger.kernel.org
Cc: Thomas Bogendoerfer 
Cc: linux-m...@vger.kernel.org
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
Cc: linux-par...@vger.kernel.org
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: "David S. Miller" 
Cc: sparcli...@vger.kernel.org
Cc: Arnd Bergmann 
Cc: linux-a...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Signed-off-by: Ilpo Järvinen 
---
 arch/alpha/include/uapi/asm/termbits.h   | 1 +
 arch/mips/include/uapi/asm/termbits.h| 1 +
 arch/parisc/include/uapi/asm/termbits.h  | 1 +
 arch/powerpc/include/uapi/asm/termbits.h | 1 +
 arch/sparc/include/uapi/asm/termbits.h   | 1 +
 drivers/tty/amiserial.c  | 6 +-
 drivers/tty/moxa.c   | 1 +
 drivers/tty/mxser.c  | 1 +
 drivers/tty/serial/serial_core.c | 2 ++
 drivers/tty/tty_ioctl.c  | 2 ++
 drivers/usb/serial/usb-serial.c  | 5 +++--
 include/uapi/asm-generic/termbits.h  | 1 +
 12 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/termbits.h 
b/arch/alpha/include/uapi/asm/termbits.h
index 4575ba34a0ea..285169c794ec 100644
--- a/arch/alpha/include/uapi/asm/termbits.h
+++ b/arch/alpha/include/uapi/asm/termbits.h
@@ -180,6 +180,7 @@ struct ktermios {
 #define HUPCL  0004
 
 #define CLOCAL 0010
+#define ADDRB  01000   /* address bit */
 #define CMSPAR   0100  /* mark or space (stick) parity */
 #define CRTSCTS  0200  /* flow control */
 
diff --git a/arch/mips/include/uapi/asm/termbits.h 
b/arch/mips/include/uapi/asm/termbits.h
index dfeffba729b7..e7ea31cfec78 100644
--- a/arch/mips/include/uapi/asm/termbits.h
+++ b/arch/mips/include/uapi/asm/termbits.h
@@ -181,6 +181,7 @@ struct ktermios {
 #define B300 0010015
 #define B350 0010016
 #define B400 0010017
+#define ADDRB002   /* address bit */
 #define CIBAUD   00200360  /* input baud rate */
 #define CMSPAR   0100  /* mark or space (stick) parity */
 #define CRTSCTS  0200  /* flow control */
diff --git a/arch/parisc/include/uapi/asm/termbits.h 
b/arch/parisc/include/uapi/asm/termbits.h
index 40e920f8d683..629be061f5d5 100644
--- a/arch/parisc/include/uapi/asm/termbits.h
+++ b/arch/parisc/include/uapi/asm/termbits.h
@@ -158,6 +158,7 @@ struct ktermios {
 #define  B300 0010015
 #define  B350 0010016
 #define  B400 0010017
+#define ADDRB002   /* address bit */
 #define CIBAUD00200360 /* input baud rate */
 #define CMSPAR0100  /* mark or space (stick) parity */
 #define CRTSCTS   0200  /* flow control */
diff --git a/arch/powerpc/include/uapi/asm/termbits.h 
b/arch/powerpc/include/uapi/asm/termbits.h
index ed18bc61f63d..1b778ac562a4 100644
--- a/arch/powerpc/include/uapi/asm/termbits.h
+++ b/arch/powerpc/include/uapi/asm/termbits.h
@@ -171,6 +171,7 @@ struct ktermios {
 #define HUPCL  0004
 
 #define CLOCAL 0010
+#define ADDRB  0020/* address bit */
 #define CMSPAR   0100  /* mark or space (stick) parity */
 #define CRTSCTS  0200  /* flow control */
 
diff --git a/arch/sparc/include/uapi/asm/termbits.h 
b/arch/sparc/include/uapi/asm/termbits.h
index ce5ad5d0f105..4ad60c4acf65 100644
--- a/arch/sparc/include/uapi/asm/termbits.h
+++ b/arch/sparc/include/uapi/asm/termbits.h
@@ -200,6 +200,7 @@ struct ktermios {
 #define B300  0x1011
 #define B350  0x1012
 #define B400  0x1013  */
+#define ADDRB0x2000  /* address bit */
 #define CIBAUD   0x100f  /* input baud rate (not used) */
 #define CMSPAR   0x4000  /* mark or space (stick) parity */
 #define CRTSCTS  0x8000  /* flow control */
diff --git a/drivers/tty/amiserial.c b/drivers/tty/amiserial.c
index 533d02b38e02..3ca97007bd6e 100644
--- a/drivers/tty/amiserial.c
+++ b/drivers/tty/amiserial.c
@@ -1175,7 +1175,11 @@ static void rs_set_termios(struct tty_struct *tty, 
struct ktermios *old_termios)
 {
struct serial_state *info = tty->driver_data;
unsigned long flags;
-   unsigned int cflag = 

Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Xiaomeng Tong
On Mon, 28 Feb 2022 16:41:04 -0800, Linus Torvalds
 wrote:
>
> But basically to _me_, the important part is that the end result is
> maintainable longer-term.

I couldn't agree more. And because of that, I stick with the following
approach because it's maintainable longer-term than "type(pos) pos" one:
 Implements a new macro for each list_for_each_entry* with _inside suffix.
  #define list_for_each_entry_inside(pos, type, head, member)

I have posted a patch series here to demonstrate this approach:
https://lore.kernel.org/lkml/20220301075839.4156-3-xiam0nd.t...@gmail.com/

Although we need replace all the use of list_for_each_entry* (15000+)
with list_for_each_entry*_inside, the work can be done gradually rather
than all at once. We can incrementally replace these callers until
all these in the kernel are completely updated with *_inside* one. At
that time, we can just remove the implements of origin macros and rename
the *_inside* macro back to the origin name just in one single patch.

And the "type(pos) pos" approach need teach developers to "not initialize
the iterator variable, otherwise the use-after-loop will not be reported by
compiler", which is unreasonable and impossible for all developers. 

And it will mess up the following code logic and no warnning reported by
compiler, even without initializing "ext" at the beginning:
void foo(struct mem_extent *arg) {
  struct mem_extent *ext;  // used both for iterator and normal ptr
  ...
  ext = arg;  // this assignment can alse be done in another bar() func
  ...
  list_for_each_entry(ext, head, member) {
if (found(ext))
   break;
  }
  ...
  // use ext after the loop
  ret = ext;
}
If the loop hit the break, the last "ret" will be the found ext iterator.
However, if the "type(pos) pos" approach applied, the last "ret" will be
"arg" which is not the intention of the developers, because the "ext" is
two different variables inside and outside the loop.

Thus, my idea is *better a finger off than always aching*, let's choose
the "list_for_each_entry_inside(pos, type, head, member)" approach.

> It turns out that just syntactically, it's really nice to give the
> type of the iterator from outside the way we do now. Yeah, it may be a
> bit odd, and maybe it's partly because I'm so used to the
> "list_for_each_list_entry()" syntax, but moving the type into the loop
> construct really made it nasty - either one very complex line, or
> having to split it over two lines which was even worse.
>
> Maybe the place I looked at just happened to have a long typename, but
> it's basically always going to be a struct, so it's never a _simple_
> type. And it just looked very odd adn unnatural to have the type as
> one of the "arguments" to that list_for_each_entry() macro.

we can pass a shorter type name to list_for_each_entry_inside, thus no
need to split it over two lines. Actually it is not a big problem.
+ #define t struct sram_bank_info
- list_for_each_entry(pos, head, member) {
+ list_for_each_entry_inside(pos, t, head, member) {

I put the type at the second argument not the first to avoid messing up
the pattern match in some coccinelle scripts.

>  (b) gives us a nice warning for any normal use-after-loop case
> (unless you explicitly initialized it like that
> sgx_mmu_notifier_release() function did for no good reason

sometimes developers can be confused by the reported warnning:
"used without having been initialized", and can not figure out immediately
that "oh, now i am using another different variable but with the same name
of the loop iterator variable", which has changed the programming habits
of developers.

>  (c) also guarantees that even if you don't get a warning,
> non-converted (or newly written) bad code won't actually _work_
>
> so you end up getting the new rules without any ambiguity or mistaken

It will lead to a wrong/NULL pointer dereference if the pointer is used
anywhere else, depend on which value is used to initialized with.

Best regard,
--
Xiaomeng Tong


Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT

2022-03-02 Thread Anshuman Khandual



On 3/2/22 12:35 PM, Christophe Leroy wrote:
> 
> 
> Le 02/03/2022 à 04:22, Anshuman Khandual a écrit :
>>
>>
>> On 3/1/22 1:46 PM, Christophe Leroy wrote:
>>>
>>>
>>> Le 01/03/2022 à 01:31, Russell King (Oracle) a écrit :
 On Tue, Mar 01, 2022 at 05:30:41AM +0530, Anshuman Khandual wrote:
> On 2/28/22 4:27 PM, Russell King (Oracle) wrote:
>> On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
>>> This defines and exports a platform specific custom vm_get_page_prot() 
>>> via
>>> subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and 
>>> __PXXX
>>> macros can be dropped which are no longer needed.
>>
>> What I would really like to know is why having to run _code_ to work out
>> what the page protections need to be is better than looking it up in a
>> table.
>>
>> Not only is this more expensive in terms of CPU cycles, it also brings
>> additional code size with it.
>>
>> I'm struggling to see what the benefit is.
>
> Currently vm_get_page_prot() is also being _run_ to fetch required page
> protection values. Although that is being run in the core MM and from a
> platform perspective __SXXX, __PXXX are just being exported for a table.
> Looking it up in a table (and applying more constructs there after) is
> not much different than a clean switch case statement in terms of CPU
> usage. So this is not more expensive in terms of CPU cycles.

 I disagree.
>>>
>>> So do I.
>>>

 However, let's base this disagreement on some evidence. Here is the
 present 32-bit ARM implementation:

 0048 :
 48:   e20fand r0, r0, #15
 4c:   e3003000movwr3, #0
   4c: R_ARM_MOVW_ABS_NC   .LANCHOR1
 50:   e3403000movtr3, #0
   50: R_ARM_MOVT_ABS  .LANCHOR1
 54:   e7930100ldr r0, [r3, r0, lsl #2]
 58:   e12fff1ebx  lr

 That is five instructions long.
>>>
>>> On ppc32 I get:
>>>
>>> 0094 :
>>> 94: 3d 20 00 00 lis r9,0
>>> 96: R_PPC_ADDR16_HA .data..ro_after_init
>>> 98: 54 84 16 ba rlwinm  r4,r4,2,26,29
>>> 9c: 39 29 00 00 addir9,r9,0
>>> 9e: R_PPC_ADDR16_LO .data..ro_after_init
>>> a0: 7d 29 20 2e lwzxr9,r9,r4
>>> a4: 91 23 00 00 stw r9,0(r3)
>>> a8: 4e 80 00 20 blr
>>>
>>>

 Please show that your new implementation is not more expensive on
 32-bit ARM. Please do so by building a 32-bit kernel, and providing
 the disassembly.
>>>
>>> With your series I get:
>>>
>>>  :
>>>  0: 3d 20 00 00 lis r9,0
>>> 2: R_PPC_ADDR16_HA  .rodata
>>>  4: 39 29 00 00 addir9,r9,0
>>> 6: R_PPC_ADDR16_LO  .rodata
>>>  8: 54 84 16 ba rlwinm  r4,r4,2,26,29
>>>  c: 7d 49 20 2e lwzxr10,r9,r4
>>> 10: 7d 4a 4a 14 add r10,r10,r9
>>> 14: 7d 49 03 a6 mtctr   r10
>>> 18: 4e 80 04 20 bctr
>>> 1c: 39 20 03 15 li  r9,789
>>> 20: 91 23 00 00 stw r9,0(r3)
>>> 24: 4e 80 00 20 blr
>>> 28: 39 20 01 15 li  r9,277
>>> 2c: 91 23 00 00 stw r9,0(r3)
>>> 30: 4e 80 00 20 blr
>>> 34: 39 20 07 15 li  r9,1813
>>> 38: 91 23 00 00 stw r9,0(r3)
>>> 3c: 4e 80 00 20 blr
>>> 40: 39 20 05 15 li  r9,1301
>>> 44: 91 23 00 00 stw r9,0(r3)
>>> 48: 4e 80 00 20 blr
>>> 4c: 39 20 01 11 li  r9,273
>>> 50: 4b ff ff d0 b   20 
>>>
>>>
>>> That is definitely more expensive, it implements a table of branches.
>>
>> Okay, will split out the PPC32 implementation that retains existing
>> table look up method. Also planning to keep that inside same file
>> (arch/powerpc/mm/mmap.c), unless you have a difference preference.
> 
> My point was not to get something specific for PPC32, but to amplify on 
> Russell's objection.
> 
> As this is bad for ARM and bad for PPC32, do we have any evidence that 
> your change is good for any other architecture ?
> 
> I checked PPC64 and there is exactly the same drawback. With the current 
> implementation it is a small function performing table read then a few 
> adjustment. After your change it is a bigger function implementing a 
> table of branches.

I am wondering if this would not be the case for any other switch case
statement on the platform ? Is there something specific/different just
on vm_get_page_prot() implementation ? Are you suggesting that switch
case statements should just be avoided instead ?

> 
> So, as requested by Russell, could you look at the disassembly for other 
> 

Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread Rasmus Villemoes
On 02/03/2022 00.55, Linus Torvalds wrote:
> On Tue, Mar 1, 2022 at 3:19 PM David Laight  wrote:
>>

> With the "don't use iterator outside the loop" approach, the exact
> same code works in both the old world order and the new world order,
> and you don't have the semantic confusion. And *if* you try to use the
> iterator outside the loop, you'll _mostly_ (*) get a compiler warning
> about it not being initialized.
> 
>  Linus
> 
> (*) Unless somebody initializes the iterator pointer pointlessly.
> Which clearly does happen. Thus the "mostly". It's not perfect, and
> that's most definitely not nice - but it should at least hopefully
> make it that much harder to mess up.

This won't help the current issue (because it doesn't exist and might
never), but just in case some compiler people are listening, I'd like to
have some sort of way to tell the compiler "treat this variable as
uninitialized from here on". So one could do

#define kfree(p) do { __kfree(p); __magic_uninit(p); } while (0)

with __magic_uninit being a magic no-op that doesn't affect the
semantics of the code, but could be used by the compiler's "[is/may be]
used uninitialized" machinery to flag e.g. double frees on some odd
error path etc. It would probably only work for local automatic
variables, but it should be possible to just ignore the hint if p is
some expression like foo->bar or has side effects. If we had that, the
end-of-loop test could include that to "uninitialize" the iterator.

Maybe sparse/smatch or some other static analyzer could implement such a
magic thing? Maybe it's better as a function attribute
[__attribute__((uninitializes(1)))] to avoid having to macrofy all
functions that release resources.

Rasmus


Re: [PATCH 07/11] x86: remove the IOMMU table infrastructure

2022-03-02 Thread Christoph Hellwig
On Wed, Mar 02, 2022 at 12:18:26PM +0300, Anatoly Pugachev wrote:
> Is it possible to keep documentation comments in source files? Or are
> they completely irrelevant now?

That ones you quoted are very much irrelevant now.  And the behaviour
of the swiotlb disabling will have to change (this patchset is a bit
of a preparation for now) as we now use per-device dma_ops and the
dma-iommu can dip into the swiotlb pool for untrusted devices.  In
practive we'll basically have to always initialize the swiotlb buffer
now.


Re: [PATCH 07/11] x86: remove the IOMMU table infrastructure

2022-03-02 Thread Anatoly Pugachev
On Sun, Feb 27, 2022 at 7:31 PM Christoph Hellwig  wrote:
>
> The IOMMU table tries to separate the different IOMMUs into different
> backends, but actually requires various cross calls.
>
> Rewrite the code to do the generic swiotlb/swiotlb-xen setup directly
> in pci-dma.c and then just call into the IOMMU drivers.
...
> --- a/arch/x86/include/asm/iommu_table.h
> +++ /dev/null
> @@ -1,102 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#ifndef _ASM_X86_IOMMU_TABLE_H
> -#define _ASM_X86_IOMMU_TABLE_H
> -
> -#include 
> -
> -/*
> - * History lesson:
> - * The execution chain of IOMMUs in 2.6.36 looks as so:
> - *
> - *[xen-swiotlb]
> - * |
> - * +[swiotlb *]--+
> - */ | \
> - *   /  |  \
> - *[GART] [Calgary]  [Intel VT-d]
> - * /
> - */
> - * [AMD-Vi]
> - *
> - * *: if SWIOTLB detected 'iommu=soft'/'swiotlb=force' it would skip
> - * over the rest of IOMMUs and unconditionally initialize the SWIOTLB.
> - * Also it would surreptitiously initialize set the swiotlb=1 if there were
> - * more than 4GB and if the user did not pass in 'iommu=off'. The swiotlb
> - * flag would be turned off by all IOMMUs except the Calgary one.
> - *
> - * The IOMMU_INIT* macros allow a similar tree (or more complex if desired)
> - * to be built by defining who we depend on.
> - *
> - * And all that needs to be done is to use one of the macros in the IOMMU
> - * and the pci-dma.c will take care of the rest.
> - */

Christoph,

Is it possible to keep documentation comments in source files? Or are
they completely irrelevant now?

Thanks.


[PATCH v4 3/3] powerpc/pseries/vas: Add VAS migration handler

2022-03-02 Thread Haren Myneni
[Update: Included the build fix reported by kernel test robot ]

Since the VAS windows belong to the VAS hardware resource, the
hypervisor expects the partition to close them on source partition
and reopen them after the partition migrated on the destination
machine.

This handler is called before pseries_suspend() to close these
windows and again invoked after migration. All active windows
for both default and QoS types will be closed and mark them
inactive and reopened after migration with this handler.
During the migration, the user space receives paste instruction
failure if it issues copy/paste on these inactive windows.

The current migration implementation does not freeze the user
space and applications can continue to open VAS windows while
migration is in progress. So when the migration_in_progress flag
is set, VAS open window API returns -EBUSY.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/mobility.c |  5 ++
 arch/powerpc/platforms/pseries/vas.c  | 98 ++-
 arch/powerpc/platforms/pseries/vas.h  | 14 
 3 files changed, 116 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c
index 85033f392c78..70004243e25e 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include "pseries.h"
+#include "vas.h"   /* vas_migration_handler() */
 #include "../../kernel/cacheinfo.h"
 
 static struct kobject *mobility_kobj;
@@ -669,12 +670,16 @@ static int pseries_migrate_partition(u64 handle)
if (ret)
return ret;
 
+   vas_migration_handler(VAS_SUSPEND);
+
ret = pseries_suspend(handle);
if (ret == 0)
post_mobility_fixup();
else
pseries_cancel_migration(handle, ret);
 
+   vas_migration_handler(VAS_RESUME);
+
return ret;
 }
 
diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index fbcf311da0ec..1f59d78c77a1 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -30,6 +30,7 @@ static struct hv_vas_cop_feat_caps hv_cop_caps;
 
 static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
 static DEFINE_MUTEX(vas_pseries_mutex);
+static bool migration_in_progress;
 
 static long hcall_return_busy_check(long rc)
 {
@@ -356,7 +357,10 @@ static struct vas_window *vas_allocate_window(int vas_id, 
u64 flags,
 * same fault IRQ is not freed by the OS before.
 */
mutex_lock(_pseries_mutex);
-   rc = allocate_setup_window(txwin, (u64 *)[0],
+   if (migration_in_progress)
+   rc = -EBUSY;
+   else
+   rc = allocate_setup_window(txwin, (u64 *)[0],
   cop_feat_caps->win_type);
mutex_unlock(_pseries_mutex);
if (rc)
@@ -869,6 +873,98 @@ static struct notifier_block pseries_vas_nb = {
.notifier_call = pseries_vas_notifier,
 };
 
+/*
+ * For LPM, all windows have to be closed on the source partition
+ * before migration and reopen them on the destination partition
+ * after migration. So closing windows during suspend and
+ * reopen them during resume.
+ */
+int vas_migration_handler(int action)
+{
+   struct vas_cop_feat_caps *caps;
+   int old_nr_creds, new_nr_creds = 0;
+   struct vas_caps *vcaps;
+   int i, rc = 0;
+
+   /*
+* NX-GZIP is not enabled. Nothing to do for migration.
+*/
+   if (!copypaste_feat)
+   return rc;
+
+   mutex_lock(_pseries_mutex);
+
+   if (action == VAS_SUSPEND)
+   migration_in_progress = true;
+   else
+   migration_in_progress = false;
+
+   for (i = 0; i < VAS_MAX_FEAT_TYPE; i++) {
+   vcaps = [i];
+   caps = >caps;
+   old_nr_creds = atomic_read(>nr_total_credits);
+
+   rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
+ vcaps->feat,
+ (u64)virt_to_phys(_cop_caps));
+   if (!rc) {
+   new_nr_creds = 
be16_to_cpu(hv_cop_caps.target_lpar_creds);
+   /*
+* Should not happen. But incase print messages, close
+* all windows in the list during suspend and reopen
+* windows based on new lpar_creds on the destination
+* system.
+*/
+   if (old_nr_creds != new_nr_creds) {
+   pr_err("Target credits mismatch with the 
hypervisor\n");
+   pr_err("state(%d): lpar creds: %d HV lpar 
creds: %d\n",
+   action, old_nr_creds, new_nr_creds);
+   pr_err("Used creds: %d, Active creds: 

Re: [next-20220225][Oops][ppc] lvm snapshot merge results kernel panics (throtl_pending_timer_fn)

2022-03-02 Thread Ming Lei
On Wed, Mar 02, 2022 at 01:31:39PM +0530, Abdul Haleem wrote:
> Greeting's
> 
> Linux next kernel 5.17.0-rc5-next-20220225 crashed on my power 10 LPAR when
> merge lvm snapshot on nvme disk

Please try next-20220301, in which the "bad" patch of 'block: cancel all
throttled bios in del_gendisk()' is dropped.


Thanks,
Ming



Re: [PATCH 11/12] swiotlb: merge swiotlb-xen initialization into swiotlb

2022-03-02 Thread Christoph Hellwig
On Tue, Mar 01, 2022 at 06:55:47PM -0800, Stefano Stabellini wrote:
> Unrelated to this specific patch series: now that I think about it, if
> io_tlb_default_mem.nslabs is already allocated by the time xen_mm_init
> is called, wouldn't we potentially have an issue with the GFP flags used
> for the earlier allocation (e.g. GFP_DMA32 not used)? Maybe something
> for another day.

swiotlb_init allocates low memory from meblock, which is roughly
equivalent to GFP_DMA allocations, so we'll be fine.

> > @@ -143,10 +141,15 @@ static int __init xen_mm_init(void)
> > if (!xen_swiotlb_detect())
> > return 0;
> >  
> > -   rc = xen_swiotlb_init();
> > /* we can work with the default swiotlb */
> > -   if (rc < 0 && rc != -EEXIST)
> > -   return rc;
> > +   if (!io_tlb_default_mem.nslabs) {
> > +   if (!xen_initial_domain())
> > +   return -EINVAL;
> 
> I don't think we need this xen_initial_domain() check. It is all
> already sorted out by the xen_swiotlb_detect() check above.

Is it?

static inline int xen_swiotlb_detect(void)
{
if (!xen_domain())
return 0;
if (xen_feature(XENFEAT_direct_mapped))
return 1;
/* legacy case */
if (!xen_feature(XENFEAT_not_direct_mapped) && xen_initial_domain())
return 1;
return 0;
}

I think I'd keep it as-is for now, as my planned next step would be to
fold xen-swiotlb into swiotlb entirely.


Re: [PATCH 07/11] x86: remove the IOMMU table infrastructure

2022-03-02 Thread Christoph Hellwig
On Tue, Mar 01, 2022 at 01:20:22PM -0500, Konrad Rzeszutek Wilk wrote:
> I think you also need to check for IBM Calgary?

The IBM Calgary IOMMU support is long gone.


[next-20220225][Oops][ppc] lvm snapshot merge results kernel panics (throtl_pending_timer_fn)

2022-03-02 Thread Abdul Haleem

Greeting's

Linux next kernel 5.17.0-rc5-next-20220225 crashed on my power 10 LPAR 
when merge lvm snapshot on nvme disk


console logs


fdisk -l /dev/nvme1n1

Disk /dev/nvme1n1: 372.6 GiB, 400088457216 bytes, 97677846 sectors
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
vgcreate avocado_vg /dev/nvme1n1
  Volume group "avocado_vg" successfully created

lvcreate --name avocado_lv --size 18432.0 avocado_vg -y
  Wiping ext2 signature on /dev/avocado_vg/avocado_lv.
  Logical volume "avocado_lv" created.

lvdisplay avocado_vg
  --- Logical volume ---
  LV Path    /dev/avocado_vg/avocado_lv
  LV Name    avocado_lv
  VG Name    avocado_vg
  LV UUID nkhkFh-Oofl-GKH1-1055-3B47-0gup-yQtI1s
  LV Write Access    read/write
  LV Creation host, time ltc-zz1b-lp4.aus.stglabs.ibm.com, 2022-03-02 
01:32:19 -0600

  LV Status  available
  # open 0
  LV Size    18.00 GiB
  Current LE 4608
  Segments   1
  Allocation inherit
  Read ahead sectors auto
  - currently set to 8192
  Block device   253:0

mkfs.ext2 /dev/avocado_vg/avocado_lv

mke2fs 1.44.6 (5-Mar-2019)
Creating filesystem with 4718592 4k blocks and 1179648 inodes
Filesystem UUID: 5ed3c335-bac2-4b64-a827-222d287af0b2
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 
2654208,

    4096000

Allocating group tables: done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done

mount /dev/avocado_vg/avocado_lv /mnt

[ 1053.517019] EXT4-fs (dm-0): mounting ext2 file system using the ext4 
subsystem
[ 1053.518270] EXT4-fs (dm-0): mounted filesystem without journal. Quota 
mode: none.


umount /dev/avocado_vg/avocado_lv

lvdisplay avocado_vg
  --- Logical volume ---
  LV Path    /dev/avocado_vg/avocado_lv
  LV Name    avocado_lv
  VG Name    avocado_vg
  LV UUID nkhkFh-Oofl-GKH1-1055-3B47-0gup-yQtI1s
  LV Write Access    read/write
  LV Creation host, time ltc-zz1b-lp4.aus.stglabs.ibm.com, 2022-03-02 
01:32:19 -0600

  LV Status  available
  # open 0
  LV Size    18.00 GiB
  Current LE 4608
  Segments   1
  Allocation inherit
  Read ahead sectors auto
  - currently set to 8192
  Block device   253:0

lvcreate --snapshot --name avocado_sn /dev/avocado_vg/avocado_lv 
--ignoreactivationskip --size 18432.0


  Logical volume "avocado_sn" created.

vgdisplay avocado_vg

  --- Volume group ---
  VG Name   avocado_vg
  System ID
  Format    lvm2
  Metadata Areas    1
  Metadata Sequence No  4
  VG Access read/write
  VG Status resizable
  MAX LV    0
  Cur LV    2
  Open LV   0
  Max PV    0
  Cur PV    1
  Act PV    1
  VG Size   <372.61 GiB
  PE Size   4.00 MiB
  Total PE  95388
  Alloc PE / Size   9216 / 36.00 GiB
  Free  PE / Size   86172 / <336.61 GiB
  VG UUID   jobi1f-kHw4-2ovw-nR3E-Eml5-tFYR-mJc3WT

lvdisplay avocado_vg
  --- Logical volume ---
  LV Path    /dev/avocado_vg/avocado_lv
  LV Name    avocado_lv
  VG Name    avocado_vg
  LV UUID nkhkFh-Oofl-GKH1-1055-3B47-0gup-yQtI1s
  LV Write Access    read/write
  LV Creation host, time ltc-zz1b-lp4.aus.stglabs.ibm.com, 2022-03-02 
01:32:19 -0600

  LV snapshot status source of
 avocado_sn [active]
  LV Status  available
  # open 0
  LV Size    18.00 GiB
  Current LE 4608
  Segments   1
  Allocation inherit
  Read ahead sectors auto
  - currently set to 256
  Block device   253:0

  --- Logical volume ---
  LV Path    /dev/avocado_vg/avocado_sn
  LV Name    avocado_sn
  VG Name    avocado_vg
  LV UUID QL14R9-aVa9-nD8z-DTyg-87qL-b2yY-O30G5P
  LV Write Access    read/write
  LV Creation host, time ltc-zz1b-lp4.aus.stglabs.ibm.com, 2022-03-02 
01:32:20 -0600

  LV snapshot status active destination for avocado_lv
  LV Status  available
  # open 0
  LV Size    18.00 GiB
  Current LE 4608
  COW-table size 18.00 GiB
  COW-table LE   4608
  Allocated to snapshot  0.00%
  Snapshot chunk size    4.00 KiB
  Segments   1
  Allocation inherit
  Read ahead sectors auto
  - currently set to 8192
  Block device   253:3

lvconvert --merge --interval 1 /dev/avocado_vg/avocado_sn
  Merging of volume avocado_vg/avocado_sn started.
  avocado_vg/avocado_lv: Merged: 100.00%
BUG: Unable to handle