Re: [RFC] powerpc/mm: honor O_SYNC flag for memory map
On Fri, Nov 27, 2009 at 10:30 AM, Paul Mackerras wrote: > Li Yang writes: > >> That's my concern too. But after all mmap without O_SYNC on I/O >> devices should be deprecated. > > It should? Why? > > Shouldn't the onus rather be on those proposing an incompatible change > to the kernel ABI, such as this is, to show why the change is > absolutely essential? In addition to the performance issue I stated earlier in this thread. There was also cache paradoxes problem too. If a memory region is shared between two cores/OS'es and Linux can't map the region with the same cacheable property as the other OS had mapped, there will be a problem. The best solution for this problem is to make the cacheablity controllable. And it seems to be generic issue which does not only exist on powerpc platforms. So it might be better dealt with the already existed O_SYNC flag which was meant to deal with this kind of thing. I agree with your comment in another email that device tree could be used, but that solution is not generic enough though. - Leo ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC] powerpc/mm: honor O_SYNC flag for memory map
Li Yang writes: > That's my concern too. But after all mmap without O_SYNC on I/O > devices should be deprecated. It should? Why? Shouldn't the onus rather be on those proposing an incompatible change to the kernel ABI, such as this is, to show why the change is absolutely essential? > A warning message in deprecation period > could reduce potential problem caused by this change. Not making the change at all would reduce the potential problems even further. :) Paul. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC] powerpc/mm: honor O_SYNC flag for memory map
So what you are saying is that if the kernel has mapped a physical page as cacheable while user application is trying to map it as non-cacheable, there will be machine checks and checkstops rather than just performance drop? This is new to me. Could you elaborate a bit? If some data is in cache at a certain physical address, and you then do an uncached read from that address, on at least some CPUs both the bus interface and the cache will reply. Bang, machine check. Writes are problematic as well. Segher ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC] powerpc/mm: honor O_SYNC flag for memory map
On Wed, Nov 25, 2009 at 7:30 PM, Gabriel Paubert wrote: > On Wed, Nov 25, 2009 at 04:07:46PM +0800, Li Yang wrote: >> On Sun, Nov 22, 2009 at 4:01 AM, Segher Boessenkool >> wrote: >> >>> You need to be a bit more careful tho. You must not allow RAM managed by >> >>> the kernel to be mapped non-cachable. >> >> >> >> Even if the user explicitly sets the O_SYNC flag? IMHO, it's a bug of >> >> the application if it uses O_SYNC on main memory to be mmap'ed later. >> >> And we don't need to cover up the bug. >> > >> > Is that "embedded thinking"? Conflicts like this cause machine checks or >> > checkstops on many PowerPC implementations, we do not normally allow such >> > to be caused by userland. >> >> So what you are saying is that if the kernel has mapped a physical >> page as cacheable while user application is trying to map it as >> non-cacheable, there will be machine checks and checkstops rather than >> just performance drop? This is new to me. Could you elaborate a bit? > > That's called cache paradoxes. And yes they may be a problem. Thanks. I will prevent this from happening. > > Besides that, existing application may have used mmap without O_SYNC on > I/O devices, knowing that the kernel would map them uncached. Your > patch would break them by using cached accesses (and it can cause > really hard to debug lockups, I've seen this, probably caused by > infinite retries on the PCI bus). That's my concern too. But after all mmap without O_SYNC on I/O devices should be deprecated. A warning message in deprecation period could reduce potential problem caused by this change. - Leo ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC] powerpc/mm: honor O_SYNC flag for memory map
On Wed, Nov 25, 2009 at 04:07:46PM +0800, Li Yang wrote: > On Sun, Nov 22, 2009 at 4:01 AM, Segher Boessenkool > wrote: > >>> You need to be a bit more careful tho. You must not allow RAM managed by > >>> the kernel to be mapped non-cachable. > >> > >> Even if the user explicitly sets the O_SYNC flag? IMHO, it's a bug of > >> the application if it uses O_SYNC on main memory to be mmap'ed later. > >> And we don't need to cover up the bug. > > > > Is that "embedded thinking"? Conflicts like this cause machine checks or > > checkstops on many PowerPC implementations, we do not normally allow such > > to be caused by userland. > > So what you are saying is that if the kernel has mapped a physical > page as cacheable while user application is trying to map it as > non-cacheable, there will be machine checks and checkstops rather than > just performance drop? This is new to me. Could you elaborate a bit? That's called cache paradoxes. And yes they may be a problem. Besides that, existing application may have used mmap without O_SYNC on I/O devices, knowing that the kernel would map them uncached. Your patch would break them by using cached accesses (and it can cause really hard to debug lockups, I've seen this, probably caused by infinite retries on the PCI bus). Gabriel ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC] powerpc/mm: honor O_SYNC flag for memory map
On Sun, Nov 22, 2009 at 4:01 AM, Segher Boessenkool wrote: >>> You need to be a bit more careful tho. You must not allow RAM managed by >>> the kernel to be mapped non-cachable. >> >> Even if the user explicitly sets the O_SYNC flag? IMHO, it's a bug of >> the application if it uses O_SYNC on main memory to be mmap'ed later. >> And we don't need to cover up the bug. > > Is that "embedded thinking"? Conflicts like this cause machine checks or > checkstops on many PowerPC implementations, we do not normally allow such > to be caused by userland. So what you are saying is that if the kernel has mapped a physical page as cacheable while user application is trying to map it as non-cacheable, there will be machine checks and checkstops rather than just performance drop? This is new to me. Could you elaborate a bit? Thanks. - Leo ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC] powerpc/mm: honor O_SYNC flag for memory map
You need to be a bit more careful tho. You must not allow RAM managed by the kernel to be mapped non-cachable. Even if the user explicitly sets the O_SYNC flag? IMHO, it's a bug of the application if it uses O_SYNC on main memory to be mmap'ed later. And we don't need to cover up the bug. Is that "embedded thinking"? Conflicts like this cause machine checks or checkstops on many PowerPC implementations, we do not normally allow such to be caused by userland. Segher ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC] powerpc/mm: honor O_SYNC flag for memory map
On Fri, Nov 20, 2009 at 5:03 PM, Benjamin Herrenschmidt wrote: > On Fri, 2009-11-20 at 11:00 +0800, Li Yang-R58472 wrote: >> Because there is no way to set mapped memory as cacheable if the >> memory >> is not managed by Linux kernel. While, it's not rare in real system >> to >> allocate some dedicated memory to a certain application which is not >> managed by kernel and then mmap'ed the memory to the application. The >> memory should be cacheable but we can't map it to be cacheable due to >> this intelligent setting. And it is a big hit to the performance. >> Moreover, the standard O_SYNC flag suggest that user has the control >> over cacheablity, but actually we had not. > > You need to be a bit more careful tho. You must not allow RAM managed by > the kernel to be mapped non-cachable. Even if the user explicitly sets the O_SYNC flag? IMHO, it's a bug of the application if it uses O_SYNC on main memory to be mmap'ed later. And we don't need to cover up the bug. - Leo ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: [RFC] powerpc/mm: honor O_SYNC flag for memory map
On Fri, 2009-11-20 at 11:00 +0800, Li Yang-R58472 wrote: > Because there is no way to set mapped memory as cacheable if the > memory > is not managed by Linux kernel. While, it's not rare in real system > to > allocate some dedicated memory to a certain application which is not > managed by kernel and then mmap'ed the memory to the application. The > memory should be cacheable but we can't map it to be cacheable due to > this intelligent setting. And it is a big hit to the performance. > Moreover, the standard O_SYNC flag suggest that user has the control > over cacheablity, but actually we had not. You need to be a bit more careful tho. You must not allow RAM managed by the kernel to be mapped non-cachable. Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: [RFC] powerpc/mm: honor O_SYNC flag for memory map
>-Original Message- >From: Kumar Gala [mailto:ga...@kernel.crashing.org] > >On Nov 17, 2009, at 1:10 AM, Li Yang wrote: > >> Rather than the original intelligent way, we grant user more freedom. >> This enables user to map cacheable memory not managed by Linux. >> >> Signed-off-by: Li Yang >> --- >> The only direct users of this function is fb_mmap() and >/dev/mem mmap. >> Although I'm not sure if anything is depending on the intelligent >> setting of cacheability. > >is there some reason to change this? Because there is no way to set mapped memory as cacheable if the memory is not managed by Linux kernel. While, it's not rare in real system to allocate some dedicated memory to a certain application which is not managed by kernel and then mmap'ed the memory to the application. The memory should be cacheable but we can't map it to be cacheable due to this intelligent setting. And it is a big hit to the performance. Moreover, the standard O_SYNC flag suggest that user has the control over cacheablity, but actually we had not. - Leo > >- k > >> >> arch/powerpc/mm/mem.c |2 +- >> 1 files changed, 1 insertions(+), 1 deletions(-) >> >> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index >> 579382c..0fd267e 100644 >> --- a/arch/powerpc/mm/mem.c >> +++ b/arch/powerpc/mm/mem.c >> @@ -101,7 +101,7 @@ pgprot_t phys_mem_access_prot(struct file *file, >> unsigned long pfn, >> if (ppc_md.phys_mem_access_prot) >> return ppc_md.phys_mem_access_prot(file, pfn, >size, vma_prot); >> >> -if (!page_is_ram(pfn)) >> +if (file->f_flags & O_SYNC) >> vma_prot = pgprot_noncached(vma_prot); >> >> return vma_prot; >> -- >> 1.6.4 >> >> ___ >> Linuxppc-dev mailing list >> Linuxppc-dev@lists.ozlabs.org >> https://lists.ozlabs.org/listinfo/linuxppc-dev > > > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC] powerpc/mm: honor O_SYNC flag for memory map
On Nov 17, 2009, at 1:10 AM, Li Yang wrote: Rather than the original intelligent way, we grant user more freedom. This enables user to map cacheable memory not managed by Linux. Signed-off-by: Li Yang --- The only direct users of this function is fb_mmap() and /dev/mem mmap. Although I'm not sure if anything is depending on the intelligent setting of cacheability. is there some reason to change this? - k arch/powerpc/mm/mem.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index 579382c..0fd267e 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -101,7 +101,7 @@ pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn, if (ppc_md.phys_mem_access_prot) return ppc_md.phys_mem_access_prot(file, pfn, size, vma_prot); - if (!page_is_ram(pfn)) + if (file->f_flags & O_SYNC) vma_prot = pgprot_noncached(vma_prot); return vma_prot; -- 1.6.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev