Re: [PATCH v2 4/6] x86: Add clear_page_nocache

2012-08-13 Thread Kirill A. Shutemov
On Mon, Aug 13, 2012 at 07:04:02PM +0200, Borislav Petkov wrote:
> On Mon, Aug 13, 2012 at 02:43:34PM +0300, Kirill A. Shutemov wrote:
> > $ cat test.c
> > #include
> > #include
> >
> > #define SIZE 1024*1024*1024
> >
> > void clear_page_nocache_sse2(void *page) __attribute__((regparm(1)));
> >

Re: [PATCH v2 4/6] x86: Add clear_page_nocache

2012-08-13 Thread Borislav Petkov
On Mon, Aug 13, 2012 at 02:43:34PM +0300, Kirill A. Shutemov wrote:
> $ cat test.c
> #include
> #include
>
> #define SIZE 1024*1024*1024
>
> void clear_page_nocache_sse2(void *page) __attribute__((regparm(1)));
>
> int main(int argc, char** argv)
> {
> char *p;
> unsigned long

Re: [PATCH v2 4/6] x86: Add clear_page_nocache

2012-08-13 Thread Andi Kleen
> Moving 64 bytes per cycle is faster on Sandy Bridge, but slower on
> Westmere. Any preference? ;)

You have to be careful with these benchmarks.

- You need to make sure the data is cache cold, cache hot is misleading.
- The numbers can change if you have multiple CPUs doing this in parallel.

-A
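The cache-cold point can be sketched in user space as follows. This is only an illustration, not code from the thread: `flush_range`, `time_cold`, `clear_page_plain`, and the 64-byte `CACHE_LINE` value are hypothetical names and assumptions, and the buffer is assumed to be cache-line aligned. The idea is to evict the buffer with `clflush` before the timed region so the routine under test starts from cold data. It does not address the second point about several CPUs running in parallel.

```c
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence (SSE2) */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define CACHE_LINE 64   /* assumed cache-line size */
#define PAGE_BYTES 4096 /* assumed page size */

/* Evict every cache line of [buf, buf + len); buf is assumed line-aligned. */
static void flush_range(const void *buf, size_t len)
{
    const char *p = (const char *)buf;
    size_t off;

    for (off = 0; off < len; off += CACHE_LINE)
        _mm_clflush(p + off);
    _mm_mfence();  /* order the flushes before the timed region */
}

/* Hypothetical routine under test: a plain, cache-filling page clear. */
static void clear_page_plain(void *page)
{
    memset(page, 0, PAGE_BYTES);
}

/* Time one call of fn(buf) on cache-cold data; returns CPU seconds. */
static double time_cold(void (*fn)(void *), void *buf, size_t len)
{
    clock_t start, end;

    assert(fn != NULL && buf != NULL);
    flush_range(buf, len);  /* make the data cache cold first */
    start = clock();
    fn(buf);
    end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

`clock()` is far too coarse to resolve a single page clear, so a real measurement would loop over many distinct pages, flushing each before it is touched, and would likely use a finer-grained timer.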

Re: [PATCH v2 4/6] x86: Add clear_page_nocache

2012-08-13 Thread Jan Beulich
>>> On 13.08.12 at 13:43, "Kirill A. Shutemov" wrote:
> On Thu, Aug 09, 2012 at 04:22:04PM +0100, Jan Beulich wrote:
>> >>> On 09.08.12 at 17:03, "Kirill A. Shutemov" wrote:
>
> ...
>
>> > ---
>> > arch/x86/include/asm/page.h |2 ++
>> > arch/x86/include/asm/string_3

Re: [PATCH v2 4/6] x86: Add clear_page_nocache

2012-08-13 Thread Kirill A. Shutemov
On Thu, Aug 09, 2012 at 04:22:04PM +0100, Jan Beulich wrote:
> >>> On 09.08.12 at 17:03, "Kirill A. Shutemov" wrote:

...

> > ---
> > arch/x86/include/asm/page.h |2 ++
> > arch/x86/include/asm/string_32.h |5 +
> > arch/x86/include/asm/string_64.h |5

Re: [PATCH v2 4/6] x86: Add clear_page_nocache

2012-08-09 Thread Jan Beulich
>>> On 09.08.12 at 17:03, "Kirill A. Shutemov" wrote:
> From: Andi Kleen
>
> Add a cache avoiding version of clear_page. Straight forward integer variant
> of the existing 64bit clear_page, for both 32bit and 64bit.

While on 64-bit this is fine, I fail to see how you avoid using the SSE2 i

Re: [PATCH v2 4/6] x86: Add clear_page_nocache

2012-08-09 Thread H. Peter Anvin
On 08/09/2012 08:03 AM, Kirill A. Shutemov wrote:
From: Andi Kleen

Add a cache avoiding version of clear_page. Straight forward integer variant of the existing 64bit clear_page, for both 32bit and 64bit. Also add the necessary glue for highmem including a layer that non cache coherent architec

[PATCH v2 4/6] x86: Add clear_page_nocache

2012-08-09 Thread Kirill A. Shutemov
From: Andi Kleen

Add a cache avoiding version of clear_page. Straight forward integer variant of the existing 64bit clear_page, for both 32bit and 64bit. Also add the necessary glue for highmem including a layer that non cache coherent architectures that use the virtual address for flushing can
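
For readers unfamiliar with the idea, a cache-avoiding page clear can be sketched in user space with SSE2 non-temporal integer stores. This is not the code from the patch: the function name `clear_page_nocache_sketch`, the `SKETCH_PAGE_SIZE` constant (4096 bytes assumed), and the choice of the `_mm_stream_si32` intrinsic are assumptions made purely for illustration.

```c
#include <emmintrin.h>  /* SSE2: _mm_stream_si32; also pulls in _mm_sfence */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096  /* assumed page size in bytes */

/*
 * Hypothetical user-space analogue of a cache-avoiding page clear.
 * Each store is non-temporal, so the zeroed lines are written toward
 * memory without being allocated in the cache hierarchy.
 */
static void clear_page_nocache_sketch(void *page)
{
    int *p = (int *)page;
    size_t i;

    /* The integer store needs natural (4-byte) alignment. */
    assert(((uintptr_t)page % sizeof(int)) == 0);

    for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(int); i++)
        _mm_stream_si32(p + i, 0);

    /* Non-temporal stores are weakly ordered; fence before others read. */
    _mm_sfence();
}
```

The sketch writes one 32-bit word per store so that the same source builds for both 32-bit and 64-bit x86 targets with SSE2 enabled; a tuned version would unroll the loop and use wider stores where available.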