Re: [PATCH v4 0/3] Optimize tlbflush path

2019-09-27 Thread Atish Patra
On Fri, 2019-08-30 at 19:50 -0700, Paul Walmsley wrote:
> Hi Atish,
> 
> On Thu, 22 Aug 2019, Atish Patra wrote:
> 
> > This series adds a few optimizations to reduce the trap cost in the
> > tlb flush path. We should make SBI calls for remote tlb flushes only
> > if absolutely required.
> 
> The patches look great.  My understanding is that these optimization
> patches may actually be a partial workaround for the TLB flushing bug
> that we've been looking at for the last month or so, which can corrupt
> memory or crash the system.
> 
> If that's the case, let's first root-cause the underlying bug.
> Otherwise we'll just be papering over the actual issue, which probably
> could still occur even with this series, correct?  Since it contains
> no explicit fixes?
> 
> 
I have verified the glibc locale install issue on both QEMU and the
Unleashed board. I don't see any issue with OpenSBI master + a Linux
v5.3 kernel.

As per our investigation, this looks like a hardware erratum on the
Unleashed board, as the memory corruption only happens for TLB range
flushes. In RISC-V, sfence.vma can only flush one page at a time. If
the range spans more than one page, OpenSBI has to issue multiple
sfence.vma instructions back to back, which appears to lead to the
memory corruption.
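
For reference, the range flush in the firmware ends up being a loop of
per-page fences, roughly like the sketch below. This is illustrative
only, not the actual OpenSBI code; the function name and the PAGE_SIZE
definition are made up here.

/* Flush a virtual address range one page at a time: sfence.vma with a
 * register operand only covers the page containing that address. */
#define PAGE_SIZE 0x1000UL /* 4K pages, for illustration */

static void flush_tlb_range(unsigned long start, unsigned long size)
{
        unsigned long addr;

        for (addr = start; addr < start + size; addr += PAGE_SIZE)
                __asm__ __volatile__ ("sfence.vma %0"
                                      : : "r" (addr) : "memory");
}

With the threshold raised to 1G (see below), this loop can emit on the
order of 256K back-to-back sfence.vma instructions, which is the
pattern we suspect the hardware mishandles.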

Currently, OpenSBI has a platform parameter, "tlb_range_flush_limit",
that allows the TLB flush threshold to be configured per platform. Any
TLB flush request with a range greater than this threshold is converted
into a full flush. The threshold is currently set to a default of 4K
for every platform [1]. The glibc locale install memory corruption only
happens if this threshold is raised to a much higher value such as 1G.
That changes nothing in the OpenSBI code path except that it issues
many sfence.vma instructions back to back instead of a single full
flush.
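
In other words, the firmware-side decision is roughly the following
(again a simplified sketch with made-up names rather than the real
sbi_tlb code; the default value matches [1]):

#define TLB_RANGE_FLUSH_LIMIT 0x1000UL /* current per-platform default: 4K */

static void sbi_flush_range(unsigned long start, unsigned long size)
{
        if (size > TLB_RANGE_FLUSH_LIMIT) {
                /* Range above the threshold: collapse into one full flush. */
                __asm__ __volatile__ ("sfence.vma" : : : "memory");
                return;
        }

        /* Otherwise flush page by page, as in the loop above. */
        flush_tlb_range(start, size);
}

Raising TLB_RANGE_FLUSH_LIMIT to 1G just means the per-page loop runs
for far larger ranges instead of collapsing into a single full flush,
and that is exactly the case that corrupts memory on the Unleashed.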

If the hardware team at SiFive can look into this as well, it would be
great.

To conclude, we think this issue needs to be investigated by the
hardware team, and the kernel patches can be merged to get the
performance benefit.

[1] 
https://github.com/riscv/opensbi/blob/master/include/sbi/sbi_platform.h#L40



> - Paul

-- 
Regards,
Atish


Re: [PATCH v4 0/3] Optimize tlbflush path

2019-08-30 Thread Paul Walmsley
Hi Atish,

On Thu, 22 Aug 2019, Atish Patra wrote:

> This series adds a few optimizations to reduce the trap cost in the
> tlb flush path. We should make SBI calls for remote tlb flushes only
> if absolutely required.

The patches look great.  My understanding is that these optimization 
patches may actually be a partial workaround for the TLB flushing bug that 
we've been looking at for the last month or so, which can corrupt memory 
or crash the system.

If that's the case, let's first root-cause the underlying bug.  Otherwise 
we'll just be papering over the actual issue, which probably could still 
occur even with this series, correct?  Since it contains no explicit 
fixes?


- Paul


[PATCH v4 0/3] Optimize tlbflush path

2019-08-22 Thread Atish Patra
This series adds a few optimizations to reduce the trap cost in the tlb
flush path. We should make SBI calls for remote tlb flushes only if
absolutely required.
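
The resulting flow on the kernel side is roughly the sketch below. It
is a simplified illustration of the idea rather than the literal patch
contents; the function name here is made up, and the legacy
sbi_remote_sfence_vma() interface is assumed:

#include <linux/cpumask.h>
#include <linux/smp.h>
#include <asm/page.h>
#include <asm/sbi.h>
#include <asm/tlbflush.h>

static void __riscv_flush_tlb_range(struct cpumask *cmask,
                                    unsigned long start, unsigned long size)
{
        struct cpumask hmask;
        unsigned int cpuid = get_cpu();

        /* No target CPUs at all: skip the SBI call entirely. */
        if (cpumask_empty(cmask))
                goto out;

        if (cpumask_any_but(cmask, cpuid) >= nr_cpu_ids) {
                /* Only the local hart is affected: flush locally, and
                 * flush just one page when the range fits in a page. */
                if (size <= PAGE_SIZE)
                        local_flush_tlb_page(start);
                else
                        local_flush_tlb_all();
        } else {
                /* Remote harts are involved: fall back to the SBI call. */
                riscv_cpuid_to_hartid_mask(cmask, &hmask);
                sbi_remote_sfence_vma(cpumask_bits(&hmask), start, size);
        }
out:
        put_cpu();
}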

This series is based on Christoph's series:

http://lists.infradead.org/pipermail/linux-riscv/2019-August/006148.html

Changes from v3->v4:
1. Simplify the local vs remote use case.
2. Reorder the patches in the series.

Changes from v2->v3:
1. Split the patches into smaller ones, one per optimization.

Atish Patra (3):
RISC-V: Do not invoke SBI call if cpumask is empty
RISC-V: Issue a local tlbflush if possible.
RISC-V: Issue a tlb page flush if possible

arch/riscv/mm/tlbflush.c | 25 +++--
1 file changed, 23 insertions(+), 2 deletions(-)

--
2.21.0