Re: [PATCH V3 06/26] csky: Cache and TLB routines
On Fri, Sep 07, 2018 at 04:13:35PM +0200, Arnd Bergmann wrote:
> On Fri, Sep 7, 2018 at 2:55 PM Guo Ren wrote:
> >
> > On Fri, Sep 07, 2018 at 10:14:38AM +0200, Arnd Bergmann wrote:
> > > On Fri, Sep 7, 2018 at 5:04 AM Guo Ren wrote:
> > > > On Thu, Sep 06, 2018 at 04:31:16PM +0200, Arnd Bergmann wrote:
> > > Similarly, an MMIO read may be used to see if a DMA has completed,
> > > and the device register tells you that the DMA has left the device,
> > > but without a barrier, the CPU may have prefetched the DMA
> > > data while waiting for the MMIO read to complete. The __io_ar()
> > > barrier() in asm-generic/io.h prevents the compiler from reordering
> > > the two reads, but if a weakly ordered read (in a coherent DMA buffer)
> > > can bypass a strongly ordered read (MMIO), then it's still
> > > broken.
> > __io_ar() barrier()? Not rmb()? I've defined rmb in asm/barrier.h, so
> > I get rmb() here, not barrier().
> >
> > Only __io_br() is barrier().
>
> Ah right, I misremembered the defaults. It's probably ok then.

Thanks for the review and comments. They made me reconsider the MMIO
issues and will help improve the csky asm/io.h in the future.

> > > > > - How does endianness work? Are there any buses that flip bytes around
> > > > >   when running big-endian, or do you always do that in software?
> > > > Currently we only support little-endian and the SoC will follow it.
> > >
> > > Ok, that makes it easier. If you think that you won't even need big-endian
> > > support in the long run, you could also remove your asm/byteorder.h
> > > header. If you're not sure, it doesn't hurt to keep it of course.
> > Em... I'm not sure, so let me keep it for a while.
>
> Ok. I think overall the trend is to be little-endian only for most
> architectures: powerpc64 moved from big-endian only to little-endian
> by default, ARM rarely uses big-endian (basically only for legacy
> applications ported from BE MIPS or ppc), and all new architectures
> we added in the last years are little-endian (OpenRISC being the
> main exception).

Good news. I really don't want to support big-endian; it would double
the CI load.

Best Regards
Guo Ren
Re: [PATCH V3 06/26] csky: Cache and TLB routines
On Fri, Sep 7, 2018 at 2:55 PM Guo Ren wrote:
>
> On Fri, Sep 07, 2018 at 10:14:38AM +0200, Arnd Bergmann wrote:
> > On Fri, Sep 7, 2018 at 5:04 AM Guo Ren wrote:
> > > On Thu, Sep 06, 2018 at 04:31:16PM +0200, Arnd Bergmann wrote:
> > Similarly, an MMIO read may be used to see if a DMA has completed,
> > and the device register tells you that the DMA has left the device,
> > but without a barrier, the CPU may have prefetched the DMA
> > data while waiting for the MMIO read to complete. The __io_ar()
> > barrier() in asm-generic/io.h prevents the compiler from reordering
> > the two reads, but if a weakly ordered read (in a coherent DMA buffer)
> > can bypass a strongly ordered read (MMIO), then it's still
> > broken.
> __io_ar() barrier()? Not rmb()? I've defined rmb in asm/barrier.h, so
> I get rmb() here, not barrier().
>
> Only __io_br() is barrier().

Ah right, I misremembered the defaults. It's probably ok then.

> > > > - How does endianness work? Are there any buses that flip bytes around
> > > >   when running big-endian, or do you always do that in software?
> > > Currently we only support little-endian and the SoC will follow it.
> >
> > Ok, that makes it easier. If you think that you won't even need big-endian
> > support in the long run, you could also remove your asm/byteorder.h
> > header. If you're not sure, it doesn't hurt to keep it of course.
> Em... I'm not sure, so let me keep it for a while.

Ok. I think overall the trend is to be little-endian only for most
architectures: powerpc64 moved from big-endian only to little-endian
by default, ARM rarely uses big-endian (basically only for legacy
applications ported from BE MIPS or ppc), and all new architectures
we added in the last years are little-endian (OpenRISC being the
main exception).

Arnd
Re: [PATCH V3 06/26] csky: Cache and TLB routines
On Fri, Sep 07, 2018 at 10:14:38AM +0200, Arnd Bergmann wrote:
> On Fri, Sep 7, 2018 at 5:04 AM Guo Ren wrote:
> >
> > On Thu, Sep 06, 2018 at 04:31:16PM +0200, Arnd Bergmann wrote:
> > > On Wed, Sep 5, 2018 at 2:08 PM Guo Ren wrote:
> > >
> > > Can you describe how C-Sky hardware implements MMIO?
> > Our MMIO is an uncachable, strongly ordered address range, so no
> > barriers are needed to access these IO addresses.
> >
> > #define ioremap_wc ioremap_nocache
> > #define ioremap_wt ioremap_nocache
> >
> > The current ioremap_wc and ioremap_wt implementations are too simple
> > and we'll improve them in the future.
> >
> > > In particular:
> > >
> > > - Is a read from uncached memory always serialized with DMA, and with
> > >   other CPUs doing MMIO access to a different address?
> > The CPU uses ld.w to get data from uncached, strongly ordered memory.
> > Other CPUs use the same MMIO vaddr to access the uncachable, strongly
> > ordered memory paddr.
>
> Ok, but what about the DMA? The most common requirement for
> serialization here is with a DMA transfer, where you first write
> into a buffer in memory, then write to an MMIO register to trigger
> a DMA load, and then the device reads the data from memory.
> Without a barrier before the MMIO, the data may still be in a
> store queue of the CPU, and the DMA gets stale data.
>
> Similarly, an MMIO read may be used to see if a DMA has completed,
> and the device register tells you that the DMA has left the device,
> but without a barrier, the CPU may have prefetched the DMA
> data while waiting for the MMIO read to complete. The __io_ar()
> barrier() in asm-generic/io.h prevents the compiler from reordering
> the two reads, but if a weakly ordered read (in a coherent DMA buffer)
> can bypass a strongly ordered read (MMIO), then it's still
> broken.

__io_ar() barrier()? Not rmb()? I've defined rmb in asm/barrier.h, so
I get rmb() here, not barrier().

Only __io_br() is barrier().

> > > - How does endianness work? Are there any buses that flip bytes around
> > >   when running big-endian, or do you always do that in software?
> > Currently we only support little-endian and the SoC will follow it.
>
> Ok, that makes it easier. If you think that you won't even need big-endian
> support in the long run, you could also remove your asm/byteorder.h
> header. If you're not sure, it doesn't hurt to keep it of course.

Em... I'm not sure, so let me keep it for a while.

Best Regards
Guo Ren
Re: [PATCH V3 06/26] csky: Cache and TLB routines
On Fri, Sep 7, 2018 at 5:04 AM Guo Ren wrote:
>
> On Thu, Sep 06, 2018 at 04:31:16PM +0200, Arnd Bergmann wrote:
> > On Wed, Sep 5, 2018 at 2:08 PM Guo Ren wrote:
> >
> > Can you describe how C-Sky hardware implements MMIO?
> Our MMIO is an uncachable, strongly ordered address range, so no
> barriers are needed to access these IO addresses.
>
> #define ioremap_wc ioremap_nocache
> #define ioremap_wt ioremap_nocache
>
> The current ioremap_wc and ioremap_wt implementations are too simple
> and we'll improve them in the future.
>
> > In particular:
> >
> > - Is a read from uncached memory always serialized with DMA, and with
> >   other CPUs doing MMIO access to a different address?
> The CPU uses ld.w to get data from uncached, strongly ordered memory.
> Other CPUs use the same MMIO vaddr to access the uncachable, strongly
> ordered memory paddr.

Ok, but what about the DMA? The most common requirement for
serialization here is with a DMA transfer, where you first write
into a buffer in memory, then write to an MMIO register to trigger
a DMA load, and then the device reads the data from memory.
Without a barrier before the MMIO, the data may still be in a
store queue of the CPU, and the DMA gets stale data.

Similarly, an MMIO read may be used to see if a DMA has completed,
and the device register tells you that the DMA has left the device,
but without a barrier, the CPU may have prefetched the DMA
data while waiting for the MMIO read to complete. The __io_ar()
barrier() in asm-generic/io.h prevents the compiler from reordering
the two reads, but if a weakly ordered read (in a coherent DMA buffer)
can bypass a strongly ordered read (MMIO), then it's still
broken.

> > - How does endianness work? Are there any buses that flip bytes around
> >   when running big-endian, or do you always do that in software?
> Currently we only support little-endian and the SoC will follow it.

Ok, that makes it easier. If you think that you won't even need big-endian
support in the long run, you could also remove your asm/byteorder.h
header. If you're not sure, it doesn't hurt to keep it of course.

Arnd
Re: [PATCH V3 06/26] csky: Cache and TLB routines
On Thu, Sep 06, 2018 at 04:31:16PM +0200, Arnd Bergmann wrote:
> On Wed, Sep 5, 2018 at 2:08 PM Guo Ren wrote:
>
> > diff --git a/arch/csky/include/asm/io.h b/arch/csky/include/asm/io.h
> > new file mode 100644
> > index 000..fcb2142
> > --- /dev/null
> > +++ b/arch/csky/include/asm/io.h
> > @@ -0,0 +1,23 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +// Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd.
> > +#ifndef __ASM_CSKY_IO_H
> > +#define __ASM_CSKY_IO_H
> > +
> > +#include
> > +#include
> > +#include
> > +
> > +extern void __iomem *ioremap(phys_addr_t offset, size_t size);
> > +
> > +extern void iounmap(void *addr);
> > +
> > +extern int remap_area_pages(unsigned long address, phys_addr_t phys_addr,
> > +			size_t size, unsigned long flags);
> > +
> > +#define ioremap_nocache(phy, sz) ioremap(phy, sz)
> > +#define ioremap_wc ioremap_nocache
> > +#define ioremap_wt ioremap_nocache
> > +
> > +#include
>
> It is very unusual for an architecture to not need special handling in
> asm/io.h, to do the proper barriers etc.
>
> Can you describe how C-Sky hardware implements MMIO?

Our MMIO is an uncachable, strongly ordered address range, so no
barriers are needed to access these IO addresses.

#define ioremap_wc ioremap_nocache
#define ioremap_wt ioremap_nocache

The current ioremap_wc and ioremap_wt implementations are too simple
and we'll improve them in the future.

> In particular:
>
> - Is a read from uncached memory always serialized with DMA, and with
>   other CPUs doing MMIO access to a different address?

The CPU uses ld.w to get data from uncached, strongly ordered memory.
Other CPUs use the same MMIO vaddr to access the uncachable, strongly
ordered memory paddr.

> - How does endianness work? Are there any buses that flip bytes around
>   when running big-endian, or do you always do that in software?

Currently we only support little-endian and the SoC will follow it.

Guo Ren
Re: [PATCH V3 06/26] csky: Cache and TLB routines
On Wed, Sep 5, 2018 at 2:08 PM Guo Ren wrote:

> diff --git a/arch/csky/include/asm/io.h b/arch/csky/include/asm/io.h
> new file mode 100644
> index 000..fcb2142
> --- /dev/null
> +++ b/arch/csky/include/asm/io.h
> @@ -0,0 +1,23 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd.
> +#ifndef __ASM_CSKY_IO_H
> +#define __ASM_CSKY_IO_H
> +
> +#include
> +#include
> +#include
> +
> +extern void __iomem *ioremap(phys_addr_t offset, size_t size);
> +
> +extern void iounmap(void *addr);
> +
> +extern int remap_area_pages(unsigned long address, phys_addr_t phys_addr,
> +			size_t size, unsigned long flags);
> +
> +#define ioremap_nocache(phy, sz) ioremap(phy, sz)
> +#define ioremap_wc ioremap_nocache
> +#define ioremap_wt ioremap_nocache
> +
> +#include

It is very unusual for an architecture to not need special handling in
asm/io.h, to do the proper barriers etc.

Can you describe how C-Sky hardware implements MMIO? In particular:

- Is a read from uncached memory always serialized with DMA, and with
  other CPUs doing MMIO access to a different address?

- How does endianness work? Are there any buses that flip bytes around
  when running big-endian, or do you always do that in software?

Arnd
[PATCH V3 06/26] csky: Cache and TLB routines
Signed-off-by: Guo Ren
---
 arch/csky/abiv1/cacheflush.c          |  50
 arch/csky/abiv1/inc/abi/cacheflush.h  |  41 +++
 arch/csky/abiv1/inc/abi/tlb.h         |  11 ++
 arch/csky/abiv2/cacheflush.c          |  54 +
 arch/csky/abiv2/inc/abi/cacheflush.h  |  38 ++
 arch/csky/abiv2/inc/abi/tlb.h         |  12 ++
 arch/csky/include/asm/barrier.h       |  45 +++
 arch/csky/include/asm/cache.h         |  28 +
 arch/csky/include/asm/cacheflush.h    |   8 ++
 arch/csky/include/asm/io.h            |  23
 arch/csky/include/asm/tlb.h           |  19 +++
 arch/csky/include/asm/tlbflush.h      |  22
 arch/csky/include/uapi/asm/cachectl.h |  13 +++
 arch/csky/mm/cachev1.c                | 126
 arch/csky/mm/cachev2.c                |  79 +
 arch/csky/mm/syscache.c               |  28 +
 arch/csky/mm/tlb.c                    | 214 ++
 17 files changed, 811 insertions(+)
 create mode 100644 arch/csky/abiv1/cacheflush.c
 create mode 100644 arch/csky/abiv1/inc/abi/cacheflush.h
 create mode 100644 arch/csky/abiv1/inc/abi/tlb.h
 create mode 100644 arch/csky/abiv2/cacheflush.c
 create mode 100644 arch/csky/abiv2/inc/abi/cacheflush.h
 create mode 100644 arch/csky/abiv2/inc/abi/tlb.h
 create mode 100644 arch/csky/include/asm/barrier.h
 create mode 100644 arch/csky/include/asm/cache.h
 create mode 100644 arch/csky/include/asm/cacheflush.h
 create mode 100644 arch/csky/include/asm/io.h
 create mode 100644 arch/csky/include/asm/tlb.h
 create mode 100644 arch/csky/include/asm/tlbflush.h
 create mode 100644 arch/csky/include/uapi/asm/cachectl.h
 create mode 100644 arch/csky/mm/cachev1.c
 create mode 100644 arch/csky/mm/cachev2.c
 create mode 100644 arch/csky/mm/syscache.c
 create mode 100644 arch/csky/mm/tlb.c

diff --git a/arch/csky/abiv1/cacheflush.c b/arch/csky/abiv1/cacheflush.c
new file mode 100644
index 000..4c6fede
--- /dev/null
+++ b/arch/csky/abiv1/cacheflush.c
@@ -0,0 +1,50 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd.
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+void flush_dcache_page(struct page *page)
+{
+	struct address_space *mapping = page_mapping(page);
+	unsigned long addr;
+
+	if (mapping && !mapping_mapped(mapping)) {
+		set_bit(PG_arch_1, &(page)->flags);
+		return;
+	}
+
+	/*
+	 * We could delay the flush for the !page_mapping case too. But that
+	 * case is for exec env/arg pages and those are %99 certainly going to
+	 * get faulted into the tlb (and thus flushed) anyways.
+	 */
+	addr = (unsigned long) page_address(page);
+	dcache_wb_range(addr, addr + PAGE_SIZE);
+}
+
+void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *pte)
+{
+	unsigned long addr;
+	struct page *page;
+	unsigned long pfn;
+
+	pfn = pte_pfn(*pte);
+	if (unlikely(!pfn_valid(pfn)))
+		return;
+
+	page = pfn_to_page(pfn);
+	addr = (unsigned long) page_address(page);
+
+	if (vma->vm_flags & VM_EXEC ||
+	    pages_do_alias(addr, address & PAGE_MASK))
+		cache_wbinv_all();
+
+	clear_bit(PG_arch_1, &(page)->flags);
+}
diff --git a/arch/csky/abiv1/inc/abi/cacheflush.h b/arch/csky/abiv1/inc/abi/cacheflush.h
new file mode 100644
index 000..ba5071e
--- /dev/null
+++ b/arch/csky/abiv1/inc/abi/cacheflush.h
@@ -0,0 +1,41 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd.
+#ifndef __ABI_CSKY_CACHEFLUSH_H
+#define __ABI_CSKY_CACHEFLUSH_H
+
+#include
+#include
+#include
+
+#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
+extern void flush_dcache_page(struct page *);
+
+#define flush_cache_mm(mm)			cache_wbinv_all()
+#define flush_cache_page(vma, page, pfn)	cache_wbinv_all()
+#define flush_cache_dup_mm(mm)			cache_wbinv_all()
+
+#define flush_cache_range(mm, start, end)	cache_wbinv_range(start, end)
+#define flush_cache_vmap(start, end)		cache_wbinv_range(start, end)
+#define flush_cache_vunmap(start, end)		cache_wbinv_range(start, end)
+
+#define flush_icache_page(vma, page)		cache_wbinv_all()
+#define flush_icache_range(start, end)		cache_wbinv_range(start, end)
+#define flush_icache_user_range(vma, pg, adr, len) cache_wbinv_range(adr, adr + len)
+
+#define copy_from_user_page(vma, page, vaddr, dst, src, len) \
+do { \
+	cache_wbinv_all(); \
+	memcpy(dst, src, len); \
+	icache_inv_all(); \
+} while (0)
+
+#define copy_to_user_page(vma, page, vaddr, dst, src, len) \
+do { \
+	cache_wbinv_all(); \
+	memcpy(dst, src, len); \
+} while (0)
+
+#define flush_dcache_mmap_lock(mapping)		do {} while (0)
+#define flush_dcache_mmap_unlock(mapping)	do {} while (0)
+
+#endif /* __ABI_CSKY_CACHEFLUSH_H */
diff --git