From: Hiro Yoshioka <[EMAIL PROTECTED]>
Subject: Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()
Date: Fri, 02 Sep 2005 13:37:16 +0900 (JST)
Message-ID: <[EMAIL PROTECTED]>
> From: Andrew Morton <[EMAIL PROTECTED]>
> > Hiro Yoshioka <[EMAIL PROTECTED]
From: Hiro Yoshioka [EMAIL PROTECTED]
Subject: Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()
Date: Fri, 02 Sep 2005 13:37:16 +0900 (JST)
Message-ID: [EMAIL PROTECTED]
From: Andrew Morton [EMAIL PROTECTED]
Hiro Yoshioka [EMAIL PROTECTED] wrote:
--- linux-2.6.12.4.orig/arch
From: Andrew Morton <[EMAIL PROTECTED]>
> Hiro Yoshioka <[EMAIL PROTECTED]> wrote:
> >
> > --- linux-2.6.12.4.orig/arch/i386/lib/usercopy.c 2005-08-05
> > 16:04:37.0 +0900
> > +++ linux-2.6.12.4.nt/arch/i386/lib/usercopy.c 2005-09-01
> > 17:09:41.0 +0900
>
> Really.
Hiro Yoshioka <[EMAIL PROTECTED]> wrote:
>
> --- linux-2.6.12.4.orig/arch/i386/lib/usercopy.c 2005-08-05
> 16:04:37.0 +0900
> +++ linux-2.6.12.4.nt/arch/i386/lib/usercopy.c 2005-09-01
> 17:09:41.0 +0900
Really. Please redo and retest the patch against a current
Andrew,
From: Andrew Morton <[EMAIL PROTECTED]>
> Andi Kleen <[EMAIL PROTECTED]> wrote:
> >
> > On Friday 02 September 2005 04:08, Andrew Morton wrote:
> >
> > > I suppose I'll queue it up in -mm for a while, although I'm a bit dubious
> > > about the whole idea... We'll gain some and we'll
Andi Kleen <[EMAIL PROTECTED]> wrote:
>
> On Friday 02 September 2005 04:08, Andrew Morton wrote:
>
> > I suppose I'll queue it up in -mm for a while, although I'm a bit dubious
> > about the whole idea... We'll gain some and we'll lose some - how do we
> > know it's a net gain?
>
> I suspect
On Friday 02 September 2005 04:08, Andrew Morton wrote:
> I suppose I'll queue it up in -mm for a while, although I'm a bit dubious
> about the whole idea... We'll gain some and we'll lose some - how do we
> know it's a net gain?
I suspect it'll gain more than it loses. The only case where it
Hiro Yoshioka <[EMAIL PROTECTED]> wrote:
>
> From: Andi Kleen <[EMAIL PROTECTED]>
> > On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote:
> >
> > > The following is the almost final version of the
> > > cache pollution aware __copy_from_user_ll() patch.
> >
> > Looks good to me.
> >
> >
On Friday 02 September 2005 03:43, Hiro Yoshioka wrote:
> From: Andi Kleen <[EMAIL PROTECTED]>
>
> > On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote:
> > > The following is the almost final version of the
> > > cache pollution aware __copy_from_user_ll() patch.
> >
> > Looks good to me.
>
From: Andi Kleen <[EMAIL PROTECTED]>
> On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote:
>
> > The following is the almost final version of the
> > cache pollution aware __copy_from_user_ll() patch.
>
> Looks good to me.
>
> Once the filemap.c hunk is in I'll probably do something
>
On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote:
> The following is the almost final version of the
> cache pollution aware __copy_from_user_ll() patch.
Looks good to me.
Once the filemap.c hunk is in I'll probably do something
similar for x86-64.
-Andi
-
To unsubscribe from this
Hi,
> From: Andi Kleen <[EMAIL PROTECTED]>
> > > Hi,
> > >
> > > The following patch does not use MMX registers so that we don't have
> > > to worry about save/restore the FPU/MMX states.
> > >
> > > What do you think?
> >
> > Performance will probably be bad on K7 Athlons - those have a
Hi,
From: Andi Kleen [EMAIL PROTECTED]
Hi,
The following patch does not use MMX registers so that we don't have
to worry about save/restore the FPU/MMX states.
What do you think?
Performance will probably be bad on K7 Athlons - those have a microcoded
movnti which is
On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote:
The following is the almost final version of the
cache pollution aware __copy_from_user_ll() patch.
Looks good to me.
Once the filemap.c hunk is in I'll probably do something
similar for x86-64.
-Andi
From: Andi Kleen [EMAIL PROTECTED]
On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote:
The following is the almost final version of the
cache pollution aware __copy_from_user_ll() patch.
Looks good to me.
Once the filemap.c hunk is in I'll probably do something
similar for
On Friday 02 September 2005 03:43, Hiro Yoshioka wrote:
From: Andi Kleen [EMAIL PROTECTED]
On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote:
The following is the almost final version of the
cache pollution aware __copy_from_user_ll() patch.
Looks good to me.
Once the
Hiro Yoshioka [EMAIL PROTECTED] wrote:
From: Andi Kleen [EMAIL PROTECTED]
On Thursday 01 September 2005 11:07, Hiro Yoshioka wrote:
The following is the almost final version of the
cache pollution aware __copy_from_user_ll() patch.
Looks good to me.
Once the filemap.c hunk
On Friday 02 September 2005 04:08, Andrew Morton wrote:
I suppose I'll queue it up in -mm for a while, although I'm a bit dubious
about the whole idea... We'll gain some and we'll lose some - how do we
know it's a net gain?
I suspect it'll gain more than it loses. The only case where it
Andi Kleen [EMAIL PROTECTED] wrote:
On Friday 02 September 2005 04:08, Andrew Morton wrote:
I suppose I'll queue it up in -mm for a while, although I'm a bit dubious
about the whole idea... We'll gain some and we'll lose some - how do we
know it's a net gain?
I suspect it'll gain
Andrew,
From: Andrew Morton [EMAIL PROTECTED]
Andi Kleen [EMAIL PROTECTED] wrote:
On Friday 02 September 2005 04:08, Andrew Morton wrote:
I suppose I'll queue it up in -mm for a while, although I'm a bit dubious
about the whole idea... We'll gain some and we'll lose some - how do
Hiro Yoshioka [EMAIL PROTECTED] wrote:
--- linux-2.6.12.4.orig/arch/i386/lib/usercopy.c 2005-08-05
16:04:37.0 +0900
+++ linux-2.6.12.4.nt/arch/i386/lib/usercopy.c 2005-09-01
17:09:41.0 +0900
Really. Please redo and retest the patch against a current kernel.
From: Andrew Morton [EMAIL PROTECTED]
Hiro Yoshioka [EMAIL PROTECTED] wrote:
--- linux-2.6.12.4.orig/arch/i386/lib/usercopy.c 2005-08-05
16:04:37.0 +0900
+++ linux-2.6.12.4.nt/arch/i386/lib/usercopy.c 2005-09-01
17:09:41.0 +0900
Really. Please redo and
From: Andi Kleen <[EMAIL PROTECTED]>
> > Hi,
> >
> > The following patch does not use MMX registers so that we don't have
> > to worry about save/restore the FPU/MMX states.
> >
> > What do you think?
>
> Performance will probably be bad on K7 Athlons - those have a microcoded
> movnti which is
From: Hirokazu Takahashi <[EMAIL PROTECTED]>
> > The following patch does not use MMX registers so that we don't have
> > to worry about save/restore the FPU/MMX states.
> >
> > What do you think?
>
> I think __copy_user_zeroing_intel_nocache() should be followed by sfence
> or mfence
Hi,
> The following patch does not use MMX registers so that we don't have
> to worry about save/restore the FPU/MMX states.
>
> What do you think?
I think __copy_user_zeroing_intel_nocache() should be followed by sfence
or mfence instruction to flush the data.
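The fence suggestion can be illustrated with a small user-space sketch. This is not the kernel's __copy_user_zeroing_intel_nocache(); the names copy_nocache_sketch() and copy_nocache_selfcheck() are hypothetical, and the code assumes an x86 compiler that provides the SSE2 intrinsics in <emmintrin.h> (it falls back to memcpy() elsewhere). Non-temporal stores such as movnti are weakly ordered, so the sketch issues sfence once the streaming loop is done, before anything else relies on the copied data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si32 (movnti), _mm_sfence (sfence) */
#endif

/*
 * Hypothetical user-space stand-in for a cache-bypassing copy.
 * Whole 4-byte words go through movnti when SSE2 is available and the
 * destination is 4-byte aligned; everything else uses memcpy().
 */
void copy_nocache_sketch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
    if (((uintptr_t)d & 3) == 0) {
        while (n >= 4) {
            int word;
            memcpy(&word, s, sizeof(word));    /* alignment-safe load */
            _mm_stream_si32((int *)d, word);   /* non-temporal store */
            d += 4;
            s += 4;
            n -= 4;
        }
        /* Order the weakly ordered streaming stores before later stores. */
        _mm_sfence();
    }
#endif
    memcpy(d, s, n);                           /* tail or full fallback */
}

/* Returns 1 if a copy through the sketch preserves every byte. */
int copy_nocache_selfcheck(void)
{
    uint32_t backing[8] = {0};   /* provides a 4-byte aligned destination */
    unsigned char src[19];
    size_t i;

    for (i = 0; i < sizeof(src); i++)
        src[i] = (unsigned char)(i * 7 + 1);
    copy_nocache_sketch(backing, src, sizeof(src));
    return memcmp(backing, src, sizeof(src)) == 0;
}
```

The fence is placed once after the loop rather than after every store, since only the final visibility of the whole buffer matters here.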
Hiro Yoshioka <[EMAIL PROTECTED]> writes:
> Hi,
>
> The following patch does not use MMX registers so that we don't have
> to worry about save/restore the FPU/MMX states.
>
> What do you think?
Performance will probably be bad on K7 Athlons - those have a microcoded
movnti which is quite slow.
On Wed, 2005-08-24 at 23:11 +0900, Hiro Yoshioka wrote:
> Hi,
>
> The following patch does not use MMX registers so that we don't have
> to worry about save/restore the FPU/MMX states.
>
> What do you think?
excellent!
Hi,
The following patch does not use MMX registers so that we don't have
to worry about save/restore the FPU/MMX states.
What do you think?
Some performance data are
Total of GLOBAL_POWER_EVENTS (CPU cycle samples)
2.6.12.4.orig 1921587
2.6.12.4.nt 1688900
1688900/1921587=87.89%
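For readers who want to redo the arithmetic, a minimal sketch follows. The helper names percent_of() and percent_saved() are made up for illustration; the only inputs taken from the message are the two sample counts.

```c
#include <assert.h>

/* Percentage that `after` represents of `before`. */
double percent_of(unsigned long after, unsigned long before)
{
    assert(before != 0);
    return 100.0 * (double)after / (double)before;
}

/* Relative saving in percent, i.e. 100 minus percent_of(). */
double percent_saved(unsigned long after, unsigned long before)
{
    return 100.0 - percent_of(after, before);
}
```

Calling percent_of(1688900, 1921587) reproduces the ratio quoted above, and percent_saved() on the same inputs gives the share of cycle samples that the .nt kernel no longer spends, a little over 12%.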
On Wed, 2005-08-24 at 23:11 +0900, Hiro Yoshioka wrote:
Hi,
The following patch does not use MMX registers so that we don't have
to worry about save/restore the FPU/MMX states.
What do you think?
excellent!
Hiro Yoshioka [EMAIL PROTECTED] writes:
Hi,
The following patch does not use MMX registers so that we don't have
to worry about save/restore the FPU/MMX states.
What do you think?
Performance will probably be bad on K7 Athlons - those have a microcoded
movnti which is quite slow.
Also
Hi,
The following patch does not use MMX registers so that we don't have
to worry about save/restore the FPU/MMX states.
What do you think?
I think __copy_user_zeroing_intel_nocache() should be followed by sfence
or mfence instruction to flush the data.
From: Hirokazu Takahashi [EMAIL PROTECTED]
The following patch does not use MMX registers so that we don't have
to worry about save/restore the FPU/MMX states.
What do you think?
I think __copy_user_zeroing_intel_nocache() should be followed by sfence
or mfence instruction to flush
From: Andi Kleen [EMAIL PROTECTED]
Hi,
The following patch does not use MMX registers so that we don't have
to worry about save/restore the FPU/MMX states.
What do you think?
Performance will probably be bad on K7 Athlons - those have a microcoded
movnti which is quite slow.
Hi,
It seems to me this mail does not go out.
So resending it.
> On 8/18/05, Hiro Yoshioka <[EMAIL PROTECTED]> wrote:
> > 1) using stack to save/restore MMX registers
>
> It seems to me that it has some regression.
> I'd like to rollback it and use kernel_fpu_begin() and kernel_fpu_end().
The
> On 8/18/05, Hiro Yoshioka <[EMAIL PROTECTED]> wrote:
> > 1) using stack to save/restore MMX registers
>
> It seems to me that it has some regression.
> I'd like to rollback it and use kernel_fpu_begin() and kernel_fpu_end().
The following is a current version of cache aware copy_from_user_ll.
> 2) low latency version of cache aware copy
Having a low latency version that is only active with CONFIG_PREEMPT
is bad - non preempt kernels need good latency too.
-Andi
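One way to read this objection is that the copy should be split into bounded chunks unconditionally, not only under CONFIG_PREEMPT. The following is a user-space sketch of that idea, not code from the patch: copy_in_chunks(), yield_fn and count_yield() are hypothetical names, and the callback merely stands in for whatever rescheduling point a kernel implementation would offer between chunks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook invoked between chunks (stand-in for a resched point). */
typedef void (*yield_fn)(void *ctx);

/* Example hook: counts how often it was called. */
void count_yield(void *ctx)
{
    ++*(int *)ctx;
}

/*
 * Copy n bytes in pieces of at most `chunk` bytes and call `yield`
 * between pieces, regardless of any build-time configuration.
 * Returns the number of times the hook was invoked.
 */
size_t copy_in_chunks(void *dst, const void *src, size_t n,
                      size_t chunk, yield_fn yield, void *ctx)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t yields = 0;

    assert(chunk > 0);
    while (n > 0) {
        size_t step = n < chunk ? n : chunk;

        memcpy(d, s, step);
        d += step;
        s += step;
        n -= step;
        if (n > 0 && yield != NULL) {   /* no hook after the last piece */
            yield(ctx);
            yields++;
        }
    }
    return yields;
}
```

The chunk size bounds how long the copy runs without offering a break, which is the latency property being asked for; the right value would have to be measured.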
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
2) low latency version of cache aware copy
Having a low latency version that is only active with CONFIG_PREEMPT
is bad - non preempt kernels need good latency too.
-Andi
On 8/18/05, Hiro Yoshioka [EMAIL PROTECTED] wrote:
1) using stack to save/restore MMX registers
It seems to me that it has some regression.
I'd like to rollback it and use kernel_fpu_begin() and kernel_fpu_end().
The following is a current version of cache aware copy_from_user_ll.
1)
Hi,
It seems to me this mail does not go out.
So resending it.
On 8/18/05, Hiro Yoshioka [EMAIL PROTECTED] wrote:
1) using stack to save/restore MMX registers
It seems to me that it has some regression.
I'd like to rollback it and use kernel_fpu_begin() and kernel_fpu_end().
The
Hi,
On 8/18/05, Hiro Yoshioka <[EMAIL PROTECTED]> wrote:
> 1) using stack to save/restore MMX registers
It seems to me that it has some regression.
I'd like to rollback it and use kernel_fpu_begin() and kernel_fpu_end().
Regards,
Hiro
--
Hiro Yoshioka
mailto:hyoshiok at miraclelinux.com
On Thu, 2005-08-18 at 00:27 +0900, Akira Tsukamoto wrote:
> My computer with Athlon K7 was faster with manually prefetching,
> but I did not know it is already becoming obsolete.
>
Don't listen to people who tell you $FOO hardware is obsolete, they have
a very narrow view. "Obsolete" is
> So I make two APIs.
> __copy_user_zeroing_nocache()
> __copy_user_zeroing_inatomic_nocache()
>
> The former is a low latency version and the other is a throughput version.
1) using stack to save/restore MMX registers
2) low latency version of cache aware copy
3) __copy_user*_nocache APIs so
On 16 Aug 2005 15:15:35 +0200, Andi Kleen <[EMAIL PROTECTED]> wrote:
> However it disables preemption, which especially for bigger
> copies will probably make the low latency people unhappy.
In the copy loop,
+#ifdef CONFIG_PREEMPT
+ if ( (i%64)==0 ) {
+
Chuck,
On 8/18/05, Chuck Ebbert <[EMAIL PROTECTED]> wrote:
> On Wed, 17 Aug 2005 at 13:50:22 +0900 (JST), Hiro Yoshioka wrote:
>
> > 3) page faults/exceptions/...
> > 3-1 TS flag is set by the CPU (Am I right?)
>
> TS will _not_ be set if a trap/fault or interrupt occurs. The only
> way
Chuck,
On 8/18/05, Chuck Ebbert [EMAIL PROTECTED] wrote:
On Wed, 17 Aug 2005 at 13:50:22 +0900 (JST), Hiro Yoshioka wrote:
3) page faults/exceptions/...
3-1 TS flag is set by the CPU (Am I right?)
TS will _not_ be set if a trap/fault or interrupt occurs. The only
way that could
On 16 Aug 2005 15:15:35 +0200, Andi Kleen [EMAIL PROTECTED] wrote:
However it disables preemption, which especially for bigger
copies will probably make the low latency people unhappy.
In the copy loop,
+#ifdef CONFIG_PREEMPT
+ if ( (i%64)==0 ) {
+ MMX_RESTORE;
+
So I make two APIs.
__copy_user_zeroing_nocache()
__copy_user_zeroing_inatomic_nocache()
The former is a low latency version and the other is a throughput version.
1) using stack to save/restore MMX registers
2) low latency version of cache aware copy
3) __copy_user*_nocache APIs so if you
On Thu, 2005-08-18 at 00:27 +0900, Akira Tsukamoto wrote:
My computer with Athlon K7 was faster with manually prefetching,
but I did not know it is already becoming obsolete.
Don't listen to people who tell you $FOO hardware is obsolete, they have
a very narrow view. Obsolete is meaningless
Hi,
On 8/18/05, Hiro Yoshioka [EMAIL PROTECTED] wrote:
1) using stack to save/restore MMX registers
It seems to me that it has some regression.
I'd like to rollback it and use kernel_fpu_begin() and kernel_fpu_end().
Regards,
Hiro
--
Hiro Yoshioka
mailto:hyoshiok at miraclelinux.com
On Wed, 17 Aug 2005 23:30:13 +0900
Akira Tsukamoto <[EMAIL PROTECTED]> mentioned:
> > I'm trying to understand this mechanism but I don't
> > understand very well.
>
> My explanation was a bit ambiguous, see the code below.
> Where is the fp register saved? It saves the fp register *inside*
On Wed, 17 Aug 2005 at 13:50:22 +0900 (JST), Hiro Yoshioka wrote:
> 3) page faults/exceptions/...
> 3-1 TS flag is set by the CPU (Am I right?)
TS will _not_ be set if a trap/fault or interrupt occurs. The only
way that could happen automatically would be to use a separate hardware
task with
I am resubmitting this because it seems to have been lost when I posted
it the day before yesterday.
Arjan van de Ven mentioned:
> The only comment/question I have is about the use of prefetchnta; that
> might have cache-evicting properties as well (eg evict the cache of
On Wed, 17 Aug 2005 14:10:34 +0900
Hiro Yoshioka <[EMAIL PROTECTED]> mentioned:
> On 8/17/05, Akira Tsukamoto <[EMAIL PROTECTED]> wrote:
> > Anyway, going back to copy_user topic,
> > big remaining issues are
> > 1)store/restore floating point register (80/64bytes) twice every time by
> >
On Wed, 17 Aug 2005 14:10:34 +0900
Hiro Yoshioka [EMAIL PROTECTED] mentioned:
On 8/17/05, Akira Tsukamoto [EMAIL PROTECTED] wrote:
Anyway, going back to copy_user topic,
big remaining issues are
1)store/restore floating point register (80/64bytes) twice every time by
surrounding
I am resubmitting this because it seems to have been lost when I posted
it the day before yesterday.
Arjan van de Ven mentioned:
The only comment/question I have is about the use of prefetchnta; that
might have cache-evicting properties as well (eg evict the cache of
On Wed, 17 Aug 2005 at 13:50:22 +0900 (JST), Hiro Yoshioka wrote:
3) page faults/exceptions/...
3-1 TS flag is set by the CPU (Am I right?)
TS will _not_ be set if a trap/fault or interrupt occurs. The only
way that could happen automatically would be to use a separate hardware
task with
On Wed, 17 Aug 2005 23:30:13 +0900
Akira Tsukamoto [EMAIL PROTECTED] mentioned:
I'm trying to understand this mechanism but I don't
understand very well.
My explanation was a bit ambiguous, see the code below.
Where is the fp register saved? It saves the fp register *inside* task_struct,
Akira,
Thanks for your suggestions.
On 8/17/05, Akira Tsukamoto <[EMAIL PROTECTED]> wrote:
> Anyway, going back to copy_user topic,
> big remaining issues are
> 1)store/restore floating point register (80/64bytes) twice every time by
> surrounding with kernel_fpu_begin()/kernel_fpu_end()
From: Hiro Yoshioka <[EMAIL PROTECTED]>
Subject: Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()
Date: Wed, 17 Aug 2005 08:21:53 +0900 (JST)
Message-ID: <[EMAIL PROTECTED]>
> Chuck,
>
> From: Chuck Ebbert <[EMAIL PROTECTED]>
> > On Tue, 16 Aug 2
Chuck,
From: Chuck Ebbert <[EMAIL PROTECTED]>
> On Tue, 16 Aug 2005 at 19:16:17 +0900 (JST), Hiro Yoshioka wrote:
> > oh, really? Does the linux kernel take care of
> > SSE save/restore on a task switch?
>
> Check out XMMS_SAVE and XMMS_RESTORE in include/asm-i386/xor.h
Thanks for your
On Tue, 16 Aug 2005 at 19:16:17 +0900 (JST), Hiro Yoshioka wrote:
> oh, really? Does the linux kernel take care of
> SSE save/restore on a task switch?
Check out XMMS_SAVE and XMMS_RESTORE in include/asm-i386/xor.h
__
Chuck
Arjan van de Ven <[EMAIL PROTECTED]> writes:
>
> not on kernel entry afaik.
> However just save the register on the stack and put it back at the
> end...
You need to do more than that, like disabling lazy FPU mode.
That is what kernel_fpu_begin/end takes care of.
However it disables
Hi,
> > > > My code does nothing do it.
> > > >
> > > > I need a volunteer to implement it.
> > >
> > > it's actually not too hard; all you need is to use SSE and not MMX; and
> > > then just store sse register you're overwriting on the stack or so...
> >
> > oh, really? Does the linux kernel
On Tue, 2005-08-16 at 19:16 +0900, Hiro Yoshioka wrote:
> From: Arjan van de Ven <[EMAIL PROTECTED]>
> > > My code does nothing do it.
> > >
> > > I need a volunteer to implement it.
> >
> > it's actually not too hard; all you need is to use SSE and not MMX; and
> > then just store sse register
Hi
> > > My code does nothing do it.
> > >
> > > I need a volunteer to implement it.
> >
> > it's actually not too hard; all you need is to use SSE and not MMX; and
> > then just store sse register you're overwriting on the stack or so...
>
> oh, really? Does the linux kernel take care of
>
From: Arjan van de Ven <[EMAIL PROTECTED]>
> > My code does nothing do it.
> >
> > I need a volunteer to implement it.
>
> it's actually not too hard; all you need is to use SSE and not MMX; and
> then just store sse register you're overwriting on the stack or so...
oh, really? Does the linux
From: Arjan van de Ven [EMAIL PROTECTED]
My code does nothing do it.
I need a volunteer to implement it.
it's actually not too hard; all you need is to use SSE and not MMX; and
then just store sse register you're overwriting on the stack or so...
oh, really? Does the linux kernel take
Hi
My code does nothing do it.
I need a volunteer to implement it.
it's actually not too hard; all you need is to use SSE and not MMX; and
then just store sse register you're overwriting on the stack or so...
oh, really? Does the linux kernel take care of
SSE save/restore on
On Tue, 2005-08-16 at 19:16 +0900, Hiro Yoshioka wrote:
From: Arjan van de Ven [EMAIL PROTECTED]
My code does nothing do it.
I need a volunteer to implement it.
it's actually not too hard; all you need is to use SSE and not MMX; and
then just store sse register you're overwriting
Hi,
My code does nothing do it.
I need a volunteer to implement it.
it's actually not too hard; all you need is to use SSE and not MMX; and
then just store sse register you're overwriting on the stack or so...
oh, really? Does the linux kernel take care of
SSE
Arjan van de Ven [EMAIL PROTECTED] writes:
not on kernel entry afaik.
However just save the register on the stack and put it back at the
end...
You need to do more than that, like disabling lazy FPU mode.
That is what kernel_fpu_begin/end takes care of.
However it disables preemption,
On Tue, 16 Aug 2005 at 19:16:17 +0900 (JST), Hiro Yoshioka wrote:
oh, really? Does the linux kernel take care of
SSE save/restore on a task switch?
Check out XMMS_SAVE and XMMS_RESTORE in include/asm-i386/xor.h
__
Chuck
Chuck,
From: Chuck Ebbert [EMAIL PROTECTED]
On Tue, 16 Aug 2005 at 19:16:17 +0900 (JST), Hiro Yoshioka wrote:
oh, really? Does the linux kernel take care of
SSE save/restore on a task switch?
Check out XMMS_SAVE and XMMS_RESTORE in include/asm-i386/xor.h
Thanks for your suggestion. But
From: Hiro Yoshioka [EMAIL PROTECTED]
Subject: Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()
Date: Wed, 17 Aug 2005 08:21:53 +0900 (JST)
Message-ID: [EMAIL PROTECTED]
Chuck,
From: Chuck Ebbert [EMAIL PROTECTED]
On Tue, 16 Aug 2005 at 19:16:17 +0900 (JST), Hiro Yoshioka wrote
Akira,
Thanks for your suggestions.
On 8/17/05, Akira Tsukamoto [EMAIL PROTECTED] wrote:
Anyway, going back to copy_user topic,
big remaining issues are
1)store/restore floating point register (80/64bytes) twice every time by
surrounding with kernel_fpu_begin()/kernel_fpu_end() is big
On Tue, 2005-08-16 at 12:30 +0900, Hiro Yoshioka wrote:
> The following example shows the L3 cache miss is reduced from 37410 to 107.
most impressive; it seems the approach to do this selectively is paying
off very well!
The only comment/question I have is about the use of prefetchnta; that
On Tue, 2005-08-16 at 13:17 +0900, Hirokazu Takahashi wrote:
> Hi,
>
> BTW, what are you going to do with the page-faults which may happen
> during __copy_user_zeroing_nocache()? The current process may be blocked
> in the handler for a while and get FPU registers polluted.
> kernel_fpu_begin()
On Tue, 2005-08-16 at 13:54 +0900, Hiro Yoshioka wrote:
> Takahashi san,
>
> I appreciate your comments.
>
> > Hi,
> >
> > BTW, what are you going to do with the page-faults which may happen
> > during __copy_user_zeroing_nocache()? The current process may be blocked
> > in the handler for a
Takahashi san,
I appreciate your comments.
> Hi,
>
> BTW, what are you going to do with the page-faults which may happen
> during __copy_user_zeroing_nocache()? The current process may be blocked
> in the handler for a while and get FPU registers polluted.
> kernel_fpu_begin() won't help the
Hi,
BTW, what are you going to do with the page-faults which may happen
during __copy_user_zeroing_nocache()? The current process may be blocked
in the handler for a while and get FPU registers polluted.
kernel_fpu_begin() won't help the case. This is another issue, though.
> > Thanks.
> >
> >
From: Hiro Yoshioka <[EMAIL PROTECTED]>
Date: Tue, 16 Aug 2005 08:33:59 +0900
> Thanks.
>
> filemap_copy_from_user() calls __copy_from_user_inatomic() calls
> __copy_from_user_ll().
>
> I'll look at the code.
The following is a quick hack of cache aware implementation
of __copy_from_user_ll()
On 8/15/05, Arjan van de Ven <[EMAIL PROTECTED]> wrote:
> > copy_from_user_nocache() is fine.
> >
> > But I don't know where I can use it. (I'm not so
> > familiar with the linux kernel file system yet.)
>
> I suspect the few cases where it will make the most difference will be
> in the VFS for
On Mon, Aug 15, 2005 at 05:09:12PM +0200, Arjan van de Ven wrote:
> On Mon, 2005-08-15 at 17:02 +0200, Andi Kleen wrote:
> > Arjan van de Ven <[EMAIL PROTECTED]> writes:
> >
> > > On Mon, 2005-08-15 at 08:15 -0400, [EMAIL PROTECTED] wrote:
> > > > Actually, is there any place *other* than write()
On Mon, 2005-08-15 at 17:02 +0200, Andi Kleen wrote:
> Arjan van de Ven <[EMAIL PROTECTED]> writes:
>
> > On Mon, 2005-08-15 at 08:15 -0400, [EMAIL PROTECTED] wrote:
> > > Actually, is there any place *other* than write() to the page cache that
> > > warrants a non-temporal store? Network
Arjan van de Ven <[EMAIL PROTECTED]> writes:
> On Mon, 2005-08-15 at 08:15 -0400, [EMAIL PROTECTED] wrote:
> > Actually, is there any place *other* than write() to the page cache that
> > warrants a non-temporal store? Network sockets with scatter/gather and
> > hardware checksum, maybe?
>
>
On Mon, 2005-08-15 at 09:21 +0200, Arjan van de Ven wrote:
> On Sun, 2005-08-14 at 23:24 +0200, Ian Kumlien wrote:
> > Hi, all
> >
> > I might be misunderstanding things but...
> >
> > First of all, machines with long pipelines will suffer from cache misses
> > (p4 in this case).
> >
> >
On Mon, 2005-08-15 at 08:15 -0400, [EMAIL PROTECTED] wrote:
> Actually, is there any place *other* than write() to the page cache that
> warrants a non-temporal store? Network sockets with scatter/gather and
> hardware checksum, maybe?
afaik those use zero copy already, eg straight pagecache
Actually, is there any place *other* than write() to the page cache that
warrants a non-temporal store? Network sockets with scatter/gather and
hardware checksum, maybe?
This is pretty much synonymous with what is allowed to go into high
memory, no?
While we're on the subject, for the
On Mon, 2005-08-15 at 17:44 +0900, Hiro Yoshioka wrote:
> Hi,
>
> I appreciate your suggestion.
>
> On 8/15/05, Arjan van de Ven <[EMAIL PROTECTED]> wrote:
> >
> > > Anyway we could not find the cache aware version of __copy_from_user_ll
> > > has a big regression yet.
> >
> >
> > that is
Hi,
I appreciate your suggestion.
On 8/15/05, Arjan van de Ven <[EMAIL PROTECTED]> wrote:
>
> > Anyway we could not find the cache aware version of __copy_from_user_ll
> > has a big regression yet.
>
>
> that is because you spread the cache misses out from one place to all
> over the place,
On Sun, 2005-08-14 at 23:24 +0200, Ian Kumlien wrote:
> Hi, all
>
> I might be misunderstanding things but...
>
> First of all, machines with long pipelines will suffer from cache misses
> (p4 in this case).
>
> Depending on the size copied, (i don't know how large they are so..)
> can't one
> Anyway we could not find the cache aware version of __copy_from_user_ll
> has a big regression yet.
that is because you spread the cache misses out from one place to all
over the place, so that no one single point sticks out anymore.
Do you agree that your copy is less optimal for the case
Hi,
From: Arjan van de Ven <[EMAIL PROTECTED]>
Subject: Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()
Date: Sun, 14 Aug 2005 12:35:43 +0200
Message-ID: <[EMAIL PROTECTED]>
> On Sun, 2005-08-14 at 19:22 +0900, Hiro Yoshioka wrote:
> > Thanks for your comments
Hi,
From: Arjan van de Ven [EMAIL PROTECTED]
Subject: Re: [RFC] [PATCH] cache pollution aware __copy_from_user_ll()
Date: Sun, 14 Aug 2005 12:35:43 +0200
Message-ID: [EMAIL PROTECTED]
On Sun, 2005-08-14 at 19:22 +0900, Hiro Yoshioka wrote:
Thanks for your comments.
On 8/14/05, Arjan van
Anyway we could not find the cache aware version of __copy_from_user_ll
has a big regression yet.
that is because you spread the cache misses out from one place to all
over the place, so that no one single point sticks out anymore.
Do you agree that your copy is less optimal for the case
On Sun, 2005-08-14 at 23:24 +0200, Ian Kumlien wrote:
Hi, all
I might be misunderstanding things but...
First of all, machines with long pipelines will suffer from cache misses
(p4 in this case).
Depending on the size copied, (i don't know how large they are so..)
can't one run out of
Hi,
I appreciate your suggestion.
On 8/15/05, Arjan van de Ven [EMAIL PROTECTED] wrote:
Anyway we could not find the cache aware version of __copy_from_user_ll
has a big regression yet.
that is because you spread the cache misses out from one place to all
over the place, so that no
1 - 100 of 126 matches