Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-03 Thread Hiro Yoshioka
From: Hiro Yoshioka <[EMAIL PROTECTED]> Subject: Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll() Date: Fri, 02 Sep 2005 13:37:16 +0900 (JST) Message-ID: <[EMAIL PROTECTED]> > From: Andrew Morton <[EMAIL PROTECTED]> > > Hiro Yoshioka <[EMAIL PROTECTED]

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-03 Thread Hiro Yoshioka
From: Hiro Yoshioka [EMAIL PROTECTED] Subject: Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll() Date: Fri, 02 Sep 2005 13:37:16 +0900 (JST) Message-ID: [EMAIL PROTECTED] From: Andrew Morton [EMAIL PROTECTED] Hiro Yoshioka [EMAIL PROTECTED] wrote: --- linux-2.6.12.4.orig/arch

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Hiro Yoshioka
From: Andrew Morton <[EMAIL PROTECTED]> > Hiro Yoshioka <[EMAIL PROTECTED]> wrote: > > > > --- linux-2.6.12.4.orig/arch/i386/lib/usercopy.c 2005-08-05 > > 16:04:37.0 +0900 > > +++ linux-2.6.12.4.nt/arch/i386/lib/usercopy.c 2005-09-01 > > 17:09:41.0 +0900 > > Really.

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Andrew Morton
Hiro Yoshioka <[EMAIL PROTECTED]> wrote: > > --- linux-2.6.12.4.orig/arch/i386/lib/usercopy.c 2005-08-05 > 16:04:37.0 +0900 > +++ linux-2.6.12.4.nt/arch/i386/lib/usercopy.c 2005-09-01 > 17:09:41.0 +0900 Really. Please redo and retest the patch against a current

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Hiro Yoshioka
Andrew, From: Andrew Morton <[EMAIL PROTECTED]> > Andi Kleen <[EMAIL PROTECTED]> wrote: > > > > On Friday 02 September 2005 04:08, Andrew Morton wrote: > > > > > I suppose I'll queue it up in -mm for a while, although I'm a bit dubious > > > about the whole idea... We'll gain some and we'll

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Andrew Morton
Andi Kleen <[EMAIL PROTECTED]> wrote: > > On Friday 02 September 2005 04:08, Andrew Morton wrote: > > > I suppose I'll queue it up in -mm for a while, although I'm a bit dubious > > about the whole idea... We'll gain some and we'll lose some - how do we > > know it's a net gain? > > I suspect

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Andi Kleen
On Friday 02 September 2005 04:08, Andrew Morton wrote: > I suppose I'll queue it up in -mm for a while, although I'm a bit dubious > about the whole idea... We'll gain some and we'll lose some - how do we > know it's a net gain? I suspect it'll gain more than it loses. The only case where it

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Andrew Morton
Hiro Yoshioka <[EMAIL PROTECTED]> wrote: > > From: Andi Kleen <[EMAIL PROTECTED]> > > On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote: > > > > > The following is the almost final version of the > > > cache pollution aware __copy_from_user_ll() patch. > > > > Looks good to me. > > > >

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Andi Kleen
On Friday 02 September 2005 03:43, Hiro Yoshioka wrote: > From: Andi Kleen <[EMAIL PROTECTED]> > > > On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote: > > > The following is the almost final version of the > > > cache pollution aware __copy_from_user_ll() patch. > > > > Looks good to me. >

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Hiro Yoshioka
From: Andi Kleen <[EMAIL PROTECTED]> > On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote: > > > The following is the almost final version of the > > cache pollution aware __copy_from_user_ll() patch. > > Looks good to me. > > Once the filemap.c hunk is in I'll probably do something >

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Andi Kleen
On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote: > The following is the almost final version of the > cache pollution aware __copy_from_user_ll() patch. Looks good to me. Once the filemap.c hunk is in I'll probably do something similar for x86-64. -Andi

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Hiro Yoshioka
Hi, > From: Andi Kleen <[EMAIL PROTECTED]> > > > Hi, > > > > > > The following patch does not use MMX registers so that we don't have > > > to worry about saving/restoring the FPU/MMX state. > > > > > > What do you think? > > > > Performance will probably be bad on K7 Athlons - those have a

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Hiro Yoshioka
Hi, From: Andi Kleen [EMAIL PROTECTED] Hi, The following patch does not use MMX registers so that we don't have to worry about saving/restoring the FPU/MMX state. What do you think? Performance will probably be bad on K7 Athlons - those have a microcoded movnti which is

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Andi Kleen
On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote: The following is the almost final version of the cache pollution aware __copy_from_user_ll() patch. Looks good to me. Once the filemap.c hunk is in I'll probably do something similar for x86-64. -Andi

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Hiro Yoshioka
From: Andi Kleen [EMAIL PROTECTED] On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote: The following is the almost final version of the cache pollution aware __copy_from_user_ll() patch. Looks good to me. Once the filemap.c hunk is in I'll probably do something similar for

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Andi Kleen
On Friday 02 September 2005 03:43, Hiro Yoshioka wrote: From: Andi Kleen [EMAIL PROTECTED] On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote: The following is the almost final version of the cache pollution aware __copy_from_user_ll() patch. Looks good to me. Once the

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Andrew Morton
Hiro Yoshioka [EMAIL PROTECTED] wrote: From: Andi Kleen [EMAIL PROTECTED] On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote: The following is the almost final version of the cache pollution aware __copy_from_user_ll() patch. Looks good to me. Once the filemap.c hunk

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Andi Kleen
On Friday 02 September 2005 04:08, Andrew Morton wrote: I suppose I'll queue it up in -mm for a while, although I'm a bit dubious about the whole idea... We'll gain some and we'll lose some - how do we know it's a net gain? I suspect it'll gain more than it loses. The only case where it

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Andrew Morton
Andi Kleen [EMAIL PROTECTED] wrote: On Friday 02 September 2005 04:08, Andrew Morton wrote: I suppose I'll queue it up in -mm for a while, although I'm a bit dubious about the whole idea... We'll gain some and we'll lose some - how do we know it's a net gain? I suspect it'll gain

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Hiro Yoshioka
Andrew, From: Andrew Morton [EMAIL PROTECTED] Andi Kleen [EMAIL PROTECTED] wrote: On Friday 02 September 2005 04:08, Andrew Morton wrote: I suppose I'll queue it up in -mm for a while, although I'm a bit dubious about the whole idea... We'll gain some and we'll lose some - how do

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Andrew Morton
Hiro Yoshioka [EMAIL PROTECTED] wrote: --- linux-2.6.12.4.orig/arch/i386/lib/usercopy.c 2005-08-05 16:04:37.0 +0900 +++ linux-2.6.12.4.nt/arch/i386/lib/usercopy.c 2005-09-01 17:09:41.0 +0900 Really. Please redo and retest the patch against a current kernel.

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-09-01 Thread Hiro Yoshioka
From: Andrew Morton [EMAIL PROTECTED] Hiro Yoshioka [EMAIL PROTECTED] wrote: --- linux-2.6.12.4.orig/arch/i386/lib/usercopy.c 2005-08-05 16:04:37.0 +0900 +++ linux-2.6.12.4.nt/arch/i386/lib/usercopy.c 2005-09-01 17:09:41.0 +0900 Really. Please redo and

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-24 Thread Hiro Yoshioka
From: Andi Kleen <[EMAIL PROTECTED]> > > Hi, > > > > The following patch does not use MMX registers so that we don't have > > to worry about saving/restoring the FPU/MMX state. > > > > What do you think? > > Performance will probably be bad on K7 Athlons - those have a microcoded > movnti which is

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-24 Thread Hiro Yoshioka
From: Hirokazu Takahashi <[EMAIL PROTECTED]> > > The following patch does not use MMX registers so that we don't have > > to worry about saving/restoring the FPU/MMX state. > > > > What do you think? > > I think __copy_user_zeroing_intel_nocache() should be followed by sfence > or mfence

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-24 Thread Hirokazu Takahashi
Hi, > The following patch does not use MMX registers so that we don't have > to worry about saving/restoring the FPU/MMX state. > > What do you think? I think __copy_user_zeroing_intel_nocache() should be followed by an sfence or mfence instruction to flush the data.
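
To illustrate the point about following non-temporal stores with a fence, here is a minimal user-space sketch. It is not the kernel routine from the patch: copy_words_nocache() is a hypothetical helper, and it assumes an x86 compiler with SSE2, where the _mm_stream_si32() intrinsic typically compiles to movnti and _mm_sfence() to sfence. On other targets it falls back to memcpy().

```c
#include <stddef.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si32 (movnti), _mm_sfence (sfence) */
#endif

/* Hypothetical user-space helper; NOT the kernel function from the patch. */
static void copy_words_nocache(int *dst, const int *src, size_t nwords)
{
#if defined(__SSE2__)
    for (size_t i = 0; i < nwords; i++)
        _mm_stream_si32(&dst[i], src[i]);   /* non-temporal (cache-bypassing) store */
    /* Non-temporal stores are weakly ordered; the fence orders them
     * before any stores that follow. */
    _mm_sfence();
#else
    memcpy(dst, src, nwords * sizeof(int)); /* portable fallback, ordinary stores */
#endif
}
```

Whether sfence alone is sufficient or mfence is needed depends on what the caller does next; the sketch only shows where the fence would sit relative to the store loop.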

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-24 Thread Andi Kleen
Hiro Yoshioka <[EMAIL PROTECTED]> writes: > Hi, > > The following patch does not use MMX registers so that we don't have > to worry about saving/restoring the FPU/MMX state. > > What do you think? Performance will probably be bad on K7 Athlons - those have a microcoded movnti which is quite slow.

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-24 Thread Arjan van de Ven
On Wed, 2005-08-24 at 23:11 +0900, Hiro Yoshioka wrote: > Hi, > > The following patch does not use MMX registers so that we don't have > to worry about saving/restoring the FPU/MMX state. > > What do you think? excellent!

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-24 Thread Hiro Yoshioka
Hi, The following patch does not use MMX registers so that we don't have to worry about saving/restoring the FPU/MMX state. What do you think? Some performance data: total GLOBAL_POWER_EVENTS (CPU cycle samples): 2.6.12.4.orig 1921587, 2.6.12.4.nt 1688900; 1688900/1921587 = 87.89%
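
The ratio quoted in this message can be checked with a few lines of C. This is only a sketch of the arithmetic; percent_of_baseline() is a hypothetical helper name, and the two sample counts are the ones given in the message.

```c
/* Sample count of the patched kernel as a percentage of the baseline.
 * Hypothetical helper for illustration only. */
static double percent_of_baseline(double patched, double baseline)
{
    return 100.0 * patched / baseline;
}
```

Calling percent_of_baseline(1688900.0, 1921587.0) gives roughly 87.89, i.e. the patched kernel took about 12% fewer CPU cycle samples than the original in this run.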

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-24 Thread Arjan van de Ven
On Wed, 2005-08-24 at 23:11 +0900, Hiro Yoshioka wrote: Hi, The following patch does not use MMX registers so that we don't have to worry about saving/restoring the FPU/MMX state. What do you think? excellent!

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-24 Thread Andi Kleen
Hiro Yoshioka [EMAIL PROTECTED] writes: Hi, The following patch does not use MMX registers so that we don't have to worry about saving/restoring the FPU/MMX state. What do you think? Performance will probably be bad on K7 Athlons - those have a microcoded movnti which is quite slow. Also

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-24 Thread Hirokazu Takahashi
Hi, The following patch does not use MMX registers so that we don't have to worry about saving/restoring the FPU/MMX state. What do you think? I think __copy_user_zeroing_intel_nocache() should be followed by an sfence or mfence instruction to flush the data.

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-24 Thread Hiro Yoshioka
From: Hirokazu Takahashi [EMAIL PROTECTED] The following patch does not use MMX registers so that we don't have to worry about saving/restoring the FPU/MMX state. What do you think? I think __copy_user_zeroing_intel_nocache() should be followed by an sfence or mfence instruction to flush

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-24 Thread Hiro Yoshioka
From: Andi Kleen [EMAIL PROTECTED] Hi, The following patch does not use MMX registers so that we don't have to worry about saving/restoring the FPU/MMX state. What do you think? Performance will probably be bad on K7 Athlons - those have a microcoded movnti which is quite slow.

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-22 Thread Hiro Yoshioka
Hi, It seems this mail did not go out, so I am resending it. > On 8/18/05, Hiro Yoshioka <[EMAIL PROTECTED]> wrote: > > 1) using stack to save/restore MMX registers > > It seems to me that it has some regression. > I'd like to roll it back and use kernel_fpu_begin() and kernel_fpu_end(). The

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-22 Thread Hiro Yoshioka
> On 8/18/05, Hiro Yoshioka <[EMAIL PROTECTED]> wrote: > > 1) using stack to save/restore MMX registers > > It seems to me that it has some regression. > I'd like to roll it back and use kernel_fpu_begin() and kernel_fpu_end(). The following is a current version of cache aware copy_from_user_ll.

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-22 Thread Andi Kleen
> 2) low latency version of cache aware copy Having a low latency version that is only active with CONFIG_PREEMPT is bad - non preempt kernels need good latency too. -Andi

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-22 Thread Andi Kleen
2) low latency version of cache aware copy Having a low latency version that is only active with CONFIG_PREEMPT is bad - non preempt kernels need good latency too. -Andi

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-22 Thread Hiro Yoshioka
On 8/18/05, Hiro Yoshioka [EMAIL PROTECTED] wrote: 1) using stack to save/restore MMX registers It seems to me that it has some regression. I'd like to roll it back and use kernel_fpu_begin() and kernel_fpu_end(). The following is a current version of cache aware copy_from_user_ll. 1)

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-22 Thread Hiro Yoshioka
Hi, It seems this mail did not go out, so I am resending it. On 8/18/05, Hiro Yoshioka [EMAIL PROTECTED] wrote: 1) using stack to save/restore MMX registers It seems to me that it has some regression. I'd like to roll it back and use kernel_fpu_begin() and kernel_fpu_end(). The

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-18 Thread Hiro Yoshioka
Hi, On 8/18/05, Hiro Yoshioka <[EMAIL PROTECTED]> wrote: > 1) using stack to save/restore MMX registers It seems to me that it has some regression. I'd like to roll it back and use kernel_fpu_begin() and kernel_fpu_end(). Regards, Hiro -- Hiro Yoshioka mailto:hyoshiok at miraclelinux.com

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-18 Thread Lee Revell
On Thu, 2005-08-18 at 00:27 +0900, Akira Tsukamoto wrote: > My computer with Athlon K7 was faster with manually prefetching, > but I did not know it is already becoming obsolete. > Don't listen to people who tell you $FOO hardware is obsolete, they have a very narrow view. "Obsolete" is

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-18 Thread Hiro Yoshioka
> So I make two APIs. > __copy_user_zeroing_nocache() > __copy_user_zeroing_inatomic_nocache() > > The former is a low latency version and the other is a throughput version. 1) using stack to save/restore MMX registers 2) low latency version of cache aware copy 3) __copy_user*_nocache APIs so

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-18 Thread Hiro Yoshioka
On 16 Aug 2005 15:15:35 +0200, Andi Kleen <[EMAIL PROTECTED]> wrote: > However it disables preemption, which especially for bigger > copies will probably make the low latency people unhappy. In the copy loop, +#ifdef CONFIG_PREEMPT + if ( (i%64)==0 ) { +
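
The idea in this message, a copy loop that opens a preemption window every 64 iterations, can be sketched in plain user-space C. This is not the patch's code: chunked_copy(), yield_fn, BLOCK_BYTES and YIELD_INTERVAL are all hypothetical names, the 64-byte block size is an assumption, and the hook merely stands in for "restore the saved registers, allow preemption, save them again". Unlike the quoted (i%64)==0 test, the sketch skips iteration 0 so that it does not yield before copying anything.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_BYTES    64   /* assumed number of bytes copied per iteration */
#define YIELD_INTERVAL 64   /* iterations between preemption points */

/* Hypothetical hook: stands in for the restore/preempt/save window. */
typedef void (*yield_fn)(void *ctx);

/* Copies len bytes in fixed-size blocks and returns how many
 * preemption points were reached along the way. */
static size_t chunked_copy(void *dst, const void *src, size_t len,
                           yield_fn yield, void *ctx)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t yields = 0;

    for (size_t i = 0; len > 0; i++) {
        size_t n = len < BLOCK_BYTES ? len : BLOCK_BYTES;

        /* Every YIELD_INTERVAL iterations, give the scheduler a chance. */
        if (i > 0 && (i % YIELD_INTERVAL) == 0) {
            if (yield)
                yield(ctx);
            yields++;
        }
        memcpy(d, s, n);
        d += n;
        s += n;
        len -= n;
    }
    return yields;
}
```

With these assumed constants a preemption point occurs once per 4 KiB copied, so the interval bounds how long a large copy can run with preemption disabled.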

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-18 Thread Hiro Yoshioka
Chuck, On 8/18/05, Chuck Ebbert <[EMAIL PROTECTED]> wrote: > On Wed, 17 Aug 2005 at 13:50:22 +0900 (JST), Hiro Yoshioka wrote: > > > 3) page faults/exceptions/... > > 3-1 TS flag is set by the CPU (Am I right?) > > TS will _not_ be set if a trap/fault or interrupt occurs. The only > way

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-18 Thread Hiro Yoshioka
Chuck, On 8/18/05, Chuck Ebbert [EMAIL PROTECTED] wrote: On Wed, 17 Aug 2005 at 13:50:22 +0900 (JST), Hiro Yoshioka wrote: 3) page faults/exceptions/... 3-1 TS flag is set by the CPU (Am I right?) TS will _not_ be set if a trap/fault or interrupt occurs. The only way that could

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-18 Thread Hiro Yoshioka
On 16 Aug 2005 15:15:35 +0200, Andi Kleen [EMAIL PROTECTED] wrote: However it disables preemption, which especially for bigger copies will probably make the low latency people unhappy. In the copy loop, +#ifdef CONFIG_PREEMPT + if ( (i%64)==0 ) { + MMX_RESTORE; +

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-18 Thread Hiro Yoshioka
So I make two APIs. __copy_user_zeroing_nocache() __copy_user_zeroing_inatomic_nocache() The former is a low latency version and the other is a throughput version. 1) using stack to save/restore MMX registers 2) low latency version of cache aware copy 3) __copy_user*_nocache APIs so if you

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-18 Thread Lee Revell
On Thu, 2005-08-18 at 00:27 +0900, Akira Tsukamoto wrote: My computer with Athlon K7 was faster with manually prefetching, but I did not know it is already becoming obsolete. Don't listen to people who tell you $FOO hardware is obsolete, they have a very narrow view. Obsolete is meaningless

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-18 Thread Hiro Yoshioka
Hi, On 8/18/05, Hiro Yoshioka [EMAIL PROTECTED] wrote: 1) using stack to save/restore MMX registers It seems to me that it has some regression. I'd like to roll it back and use kernel_fpu_begin() and kernel_fpu_end(). Regards, Hiro -- Hiro Yoshioka mailto:hyoshiok at miraclelinux.com

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-17 Thread Akira Tsukamoto
On Wed, 17 Aug 2005 23:30:13 +0900 Akira Tsukamoto <[EMAIL PROTECTED]> mentioned: > > I'm trying to understand this mechanism but I don't > > understand very well. > > My explanation was a bit ambiguous, see the code below. > Where is the fp register saved? It saves the fp register *inside*

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-17 Thread Chuck Ebbert
On Wed, 17 Aug 2005 at 13:50:22 +0900 (JST), Hiro Yoshioka wrote: > 3) page faults/exceptions/... > 3-1 TS flag is set by the CPU (Am I right?) TS will _not_ be set if a trap/fault or interrupt occurs. The only way that could happen automatically would be to use a separate hardware task with

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-17 Thread Akira Tsukamoto
I am resubmitting this because it seems to have been lost when I posted it the day before yesterday. Arjan van de Ven mentioned: > The only comment/question I have is about the use of prefetchnta; that > might have cache-evicting properties as well (eg evict the cache of

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-17 Thread Akira Tsukamoto
On Wed, 17 Aug 2005 14:10:34 +0900 Hiro Yoshioka <[EMAIL PROTECTED]> mentioned: > On 8/17/05, Akira Tsukamoto <[EMAIL PROTECTED]> wrote: > > Anyway, going back to copy_user topic, > > big remaining issues are > > 1)store/restore floating point register (80/64bytes) twice every time by > >

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-17 Thread Akira Tsukamoto
On Wed, 17 Aug 2005 14:10:34 +0900 Hiro Yoshioka [EMAIL PROTECTED] mentioned: On 8/17/05, Akira Tsukamoto [EMAIL PROTECTED] wrote: Anyway, going back to copy_user topic, big remaining issues are 1)store/restore floating point register (80/64bytes) twice every time by surrounding

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-17 Thread Akira Tsukamoto
I am resubmitting this because it seems to have been lost when I posted it the day before yesterday. Arjan van de Ven mentioned: The only comment/question I have is about the use of prefetchnta; that might have cache-evicting properties as well (eg evict the cache of

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-17 Thread Chuck Ebbert
On Wed, 17 Aug 2005 at 13:50:22 +0900 (JST), Hiro Yoshioka wrote: 3) page faults/exceptions/... 3-1 TS flag is set by the CPU (Am I right?) TS will _not_ be set if a trap/fault or interrupt occurs. The only way that could happen automatically would be to use a separate hardware task with

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-17 Thread Akira Tsukamoto
On Wed, 17 Aug 2005 23:30:13 +0900 Akira Tsukamoto [EMAIL PROTECTED] mentioned: I'm trying to understand this mechanism but I don't understand very well. My explanation was a bit ambiguous, see the code below. Where is the fp register saved? It saves the fp register *inside* task_struct, More

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Hiro Yoshioka
Akira, Thanks for your suggestions. On 8/17/05, Akira Tsukamoto <[EMAIL PROTECTED]> wrote: > Anyway, going back to copy_user topic, > big remaining issues are > 1)store/restore floating point register (80/64bytes) twice every time by > surrounding with kernel_fpu_begin()/kernel_fpu_end()

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Hiro Yoshioka
From: Hiro Yoshioka <[EMAIL PROTECTED]> Subject: Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll() Date: Wed, 17 Aug 2005 08:21:53 +0900 (JST) Message-ID: <[EMAIL PROTECTED]> > Chuck, > > From: Chuck Ebbert <[EMAIL PROTECTED]> > > On Tue, 16 Aug 2

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Hiro Yoshioka
Chuck, From: Chuck Ebbert <[EMAIL PROTECTED]> > On Tue, 16 Aug 2005 at 19:16:17 +0900 (JST), Hiro Yoshioka wrote: > > oh, really? Does the linux kernel take care of > > SSE save/restore on a task switch? > > Check out XMMS_SAVE and XMMS_RESTORE in include/asm-i386/xor.h Thanks for your

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Chuck Ebbert
On Tue, 16 Aug 2005 at 19:16:17 +0900 (JST), Hiro Yoshioka wrote: > oh, really? Does the linux kernel take care of > SSE save/restore on a task switch? Check out XMMS_SAVE and XMMS_RESTORE in include/asm-i386/xor.h __ Chuck

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Andi Kleen
Arjan van de Ven <[EMAIL PROTECTED]> writes: > > not on kernel entry afaik. > However just save the register on the stack and put it back at the > end... You need to do more than that, like disabling lazy FPU mode. That is what kernel_fpu_begin/end takes care of. However it disables

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Hirokazu Takahashi
Hi, > > > > My code does nothing do it. > > > > > > > > I need a volunteer to implement it. > > > > > > it's actually not too hard; all you need is to use SSE and not MMX; and > > > then just store sse register you're overwriting on the stack or so... > > > > oh, really? Does the linux kernel

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Arjan van de Ven
On Tue, 2005-08-16 at 19:16 +0900, Hiro Yoshioka wrote: > From: Arjan van de Ven <[EMAIL PROTECTED]> > > > My code does nothing do it. > > > > > > I need a volunteer to implement it. > > > > it's actually not too hard; all you need is to use SSE and not MMX; and > > then just store sse register

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Hirokazu Takahashi
Hi > > > My code does nothing do it. > > > > > > I need a volunteer to implement it. > > > > it's actually not too hard; all you need is to use SSE and not MMX; and > > then just store sse register you're overwriting on the stack or so... > > oh, really? Does the linux kernel take care of >

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Hiro Yoshioka
From: Arjan van de Ven <[EMAIL PROTECTED]> > > My code does nothing do it. > > > > I need a volunteer to implement it. > > it's actually not too hard; all you need is to use SSE and not MMX; and > then just store sse register you're overwriting on the stack or so... oh, really? Does the linux

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Hiro Yoshioka
From: Arjan van de Ven [EMAIL PROTECTED] My code does nothing do it. I need a volunteer to implement it. it's actually not too hard; all you need is to use SSE and not MMX; and then just store sse register you're overwriting on the stack or so... oh, really? Does the linux kernel take

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Hirokazu Takahashi
Hi My code does nothing do it. I need a volunteer to implement it. it's actually not too hard; all you need is to use SSE and not MMX; and then just store sse register you're overwriting on the stack or so... oh, really? Does the linux kernel take care of SSE save/restore on

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Arjan van de Ven
On Tue, 2005-08-16 at 19:16 +0900, Hiro Yoshioka wrote: From: Arjan van de Ven [EMAIL PROTECTED] My code does nothing do it. I need a volunteer to implement it. it's actually not too hard; all you need is to use SSE and not MMX; and then just store sse register you're overwriting

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Hirokazu Takahashi
Hi, My code does nothing do it. I need a volunteer to implement it. it's actually not too hard; all you need is to use SSE and not MMX; and then just store sse register you're overwriting on the stack or so... oh, really? Does the linux kernel take care of SSE

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Andi Kleen
Arjan van de Ven [EMAIL PROTECTED] writes: not on kernel entry afaik. However just save the register on the stack and put it back at the end... You need to do more than that, like disabling lazy FPU mode. That is what kernel_fpu_begin/end takes care of. However it disables preemption,

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Chuck Ebbert
On Tue, 16 Aug 2005 at 19:16:17 +0900 (JST), Hiro Yoshioka wrote: oh, really? Does the linux kernel take care of SSE save/restore on a task switch? Check out XMMS_SAVE and XMMS_RESTORE in include/asm-i386/xor.h __ Chuck

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Hiro Yoshioka
Chuck, From: Chuck Ebbert [EMAIL PROTECTED] On Tue, 16 Aug 2005 at 19:16:17 +0900 (JST), Hiro Yoshioka wrote: oh, really? Does the linux kernel take care of SSE save/restore on a task switch? Check out XMMS_SAVE and XMMS_RESTORE in include/asm-i386/xor.h Thanks for your suggestion. But

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Hiro Yoshioka
From: Hiro Yoshioka [EMAIL PROTECTED] Subject: Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll() Date: Wed, 17 Aug 2005 08:21:53 +0900 (JST) Message-ID: [EMAIL PROTECTED] Chuck, From: Chuck Ebbert [EMAIL PROTECTED] On Tue, 16 Aug 2005 at 19:16:17 +0900 (JST), Hiro Yoshioka wrote

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-16 Thread Hiro Yoshioka
Akira, Thanks for your suggestions. On 8/17/05, Akira Tsukamoto [EMAIL PROTECTED] wrote: Anyway, going back to copy_user topic, big remaining issues are 1)store/restore floating point register (80/64bytes) twice every time by surrounding with kernel_fpu_begin()/kernel_fpu_end() is big

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Arjan van de Ven
On Tue, 2005-08-16 at 12:30 +0900, Hiro Yoshioka wrote: > The following example shows the L3 cache miss is reduced from 37410 to 107. most impressive; it seems the approach to do this selectively is paying off very well! The only comment/question I have is about the use of prefetchnta; that

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Arjan van de Ven
On Tue, 2005-08-16 at 13:17 +0900, Hirokazu Takahashi wrote: > Hi, > > BTW, what are you going to do with the page-faults which may happen > during __copy_user_zeroing_nocache()? The current process may be blocked > in the handler for a while and get FPU registers polluted. > kernel_fpu_begin()

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Arjan van de Ven
On Tue, 2005-08-16 at 13:54 +0900, Hiro Yoshioka wrote: > Takahashi san, > > I appreciate your comments. > > > Hi, > > > > BTW, what are you going to do with the page-faults which may happen > > during __copy_user_zeroing_nocache()? The current process may be blocked > > in the handler for a

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Hiro Yoshioka
Takahashi san, I appreciate your comments. > Hi, > > BTW, what are you going to do with the page-faults which may happen > during __copy_user_zeroing_nocache()? The current process may be blocked > in the handler for a while and get FPU registers polluted. > kernel_fpu_begin() won't help the

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Hirokazu Takahashi
Hi, BTW, what are you going to do with the page-faults which may happen during __copy_user_zeroing_nocache()? The current process may be blocked in the handler for a while and get FPU registers polluted. kernel_fpu_begin() won't help the case. This is another issue, though. > > Thanks. > > > >

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Hiro Yoshioka
From: Hiro Yoshioka <[EMAIL PROTECTED]> Date: Tue, 16 Aug 2005 08:33:59 +0900 > Thanks. > > filemap_copy_from_user() calls __copy_from_user_inatomic() calls > __copy_from_user_ll(). > > I'll look at the code. The following is a quick hack of cache aware implementation of __copy_from_user_ll()

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Hiro Yoshioka
On 8/15/05, Arjan van de Ven <[EMAIL PROTECTED]> wrote: > > copy_from_user_nocache() is fine. > > > > But I don't know where I can use it. (I'm not so > > familiar with the linux kernel file system yet.) > > I suspect the few cases where it will make the most difference will be > in the VFS for

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Andi Kleen
On Mon, Aug 15, 2005 at 05:09:12PM +0200, Arjan van de Ven wrote: > On Mon, 2005-08-15 at 17:02 +0200, Andi Kleen wrote: > > Arjan van de Ven <[EMAIL PROTECTED]> writes: > > > > > On Mon, 2005-08-15 at 08:15 -0400, [EMAIL PROTECTED] wrote: > > > > Actually, is there any place *other* than write()

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Arjan van de Ven
On Mon, 2005-08-15 at 17:02 +0200, Andi Kleen wrote: > Arjan van de Ven <[EMAIL PROTECTED]> writes: > > > On Mon, 2005-08-15 at 08:15 -0400, [EMAIL PROTECTED] wrote: > > > Actually, is there any place *other* than write() to the page cache that > > > warrants a non-temporal store? Network

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Andi Kleen
Arjan van de Ven <[EMAIL PROTECTED]> writes: > On Mon, 2005-08-15 at 08:15 -0400, [EMAIL PROTECTED] wrote: > > Actually, is there any place *other* than write() to the page cache that > > warrants a non-temporal store? Network sockets with scatter/gather and > > hardware checksum, maybe? > >

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Ian Kumlien
On Mon, 2005-08-15 at 09:21 +0200, Arjan van de Ven wrote: > On Sun, 2005-08-14 at 23:24 +0200, Ian Kumlien wrote: > > Hi, all > > > > I might be misunderstanding things but... > > > > First of all, machines with long pipelines will suffer from cache misses > > (p4 in this case). > > > >

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Arjan van de Ven
On Mon, 2005-08-15 at 08:15 -0400, [EMAIL PROTECTED] wrote: > Actually, is there any place *other* than write() to the page cache that > warrants a non-temporal store? Network sockets with scatter/gather and > hardware checksum, maybe? afaik those use zero copy already, eg straight pagecache

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread linux
Actually, is there any place *other* than write() to the page cache that warrants a non-temporal store? Network sockets with scatter/gather and hardware checksum, maybe? This is pretty much synonymous with what is allowed to go into high memory, no? While we're on the subject, for the

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Arjan van de Ven
On Mon, 2005-08-15 at 17:44 +0900, Hiro Yoshioka wrote: > Hi, > > I appreciate your suggestion. > > On 8/15/05, Arjan van de Ven <[EMAIL PROTECTED]> wrote: > > > > > Anyway we could not find the cache aware version of __copy_from_user_ll > > > has a big regression yet. > > > > > > that is

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Hiro Yoshioka
Hi, I appreciate your suggestion. On 8/15/05, Arjan van de Ven <[EMAIL PROTECTED]> wrote: > > > Anyway we could not find the cache aware version of __copy_from_user_ll > > has a big regression yet. > > > that is because you spread the cache misses out from one place to all > over the place,

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Arjan van de Ven
On Sun, 2005-08-14 at 23:24 +0200, Ian Kumlien wrote: > Hi, all > > I might be misunderstanding things but... > > First of all, machines with long pipelines will suffer from cache misses > (p4 in this case). > > Depending on the size copied, (I don't know how large they are so..) > can't one

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Arjan van de Ven
> Anyway we could not find the cache aware version of __copy_from_user_ll > has a big regression yet. that is because you spread the cache misses out from one place to all over the place, so that no one single point sticks out anymore. Do you agree that your copy is less optimal for the case

Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()

2005-08-15 Thread Hiro Yoshioka
Hi, From: Arjan van de Ven <[EMAIL PROTECTED]> Subject: Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll() Date: Sun, 14 Aug 2005 12:35:43 +0200 Message-ID: <[EMAIL PROTECTED]> > On Sun, 2005-08-14 at 19:22 +0900, Hiro Yoshioka wrote: > > Thanks for your comments
