On Mon, 2014-03-24 at 06:17 -0700, Eric Dumazet wrote:
> On Mon, 2014-03-24 at 10:30 +, David Laight wrote:
> > ip_fast_csum() either needs an explicit "m" constraint for the actual
> > buffer (and target) bytes, or the stronger "memory" constraint.
> > The 'volatile' is then not needed.
I
On Mon, 2014-03-24 at 10:30 +, David Laight wrote:
> From: Eric Dumazet
> > On Fri, 2014-03-21 at 14:52 -0400, David Miller wrote:
> > > From: Eric Dumazet
> > > Date: Fri, 21 Mar 2014 05:50:50 -0700
> > >
> > > > It looks like a barrier() would be more appropriate.
> > >
> > > barrier() ==
From: Eric Dumazet
> On Fri, 2014-03-21 at 14:52 -0400, David Miller wrote:
> > From: Eric Dumazet
> > Date: Fri, 21 Mar 2014 05:50:50 -0700
> >
> > > It looks like a barrier() would be more appropriate.
> >
> > barrier() == __asm__ __volatile__(:::"memory")
>
> Indeed, but now you mention it,
From: Eric Dumazet
On Fri, 2014-03-21 at 14:52 -0400, David Miller wrote:
From: Eric Dumazet eric.duma...@gmail.com
Date: Fri, 21 Mar 2014 05:50:50 -0700
It looks like a barrier() would be more appropriate.
barrier() == __asm__ __volatile__(:::memory)
Indeed, but now you mention
On Mon, 2014-03-24 at 10:30 +, David Laight wrote:
From: Eric Dumazet
On Fri, 2014-03-21 at 14:52 -0400, David Miller wrote:
From: Eric Dumazet eric.duma...@gmail.com
Date: Fri, 21 Mar 2014 05:50:50 -0700
It looks like a barrier() would be more appropriate.
barrier() ==
On Mon, 2014-03-24 at 06:17 -0700, Eric Dumazet wrote:
On Mon, 2014-03-24 at 10:30 +, David Laight wrote:
ip_fast_csum() either needs an explicit m constraint for the actual
buffer (and target) bytes, or the stronger memory constraint.
The 'volatile' is then not needed.
I am testing
On Fri, 2014-03-21 at 14:52 -0400, David Miller wrote:
> From: Eric Dumazet
> Date: Fri, 21 Mar 2014 05:50:50 -0700
>
> > It looks like a barrier() would be more appropriate.
>
> barrier() == __asm__ __volatile__(:::"memory")
Indeed, but now you mention it, ip_fast_csum() do not uses volatile
On Fri, 2014-03-21 at 14:52 -0400, David Miller wrote:
From: Eric Dumazet eric.duma...@gmail.com
Date: Fri, 21 Mar 2014 05:50:50 -0700
It looks like a barrier() would be more appropriate.
barrier() == __asm__ __volatile__(:::memory)
Indeed, but now you mention it, ip_fast_csum() do not
From: Eric Dumazet
Date: Fri, 21 Mar 2014 05:50:50 -0700
> It looks like a barrier() would be more appropriate.
barrier() == __asm__ __volatile__(:::"memory")
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More
From: Eric Dumazet
> On Thu, 2014-03-20 at 18:56 -0700, Andi Kleen wrote:
> > Eric Dumazet writes:
> > >
> > > I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
> > > is insane...
> >
> >
> > Couldn't it just be the cache miss?
>
> Or the fact that we mix 16 bit stores and
On Fri, 2014-03-21 at 06:47 -0700, Eric Dumazet wrote:
> Another idea would be to move the ip_fast_csum() call at the end of
> inet_gro_complete()
>
> I'll try this :
>
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index 8c54870db792..0ca8f350a532 100644
> --- a/net/ipv4/af_inet.c
>
On Fri, 2014-03-21 at 06:32 -0700, Eric Dumazet wrote:
> On Fri, 2014-03-21 at 05:50 -0700, Eric Dumazet wrote:
>
> > Or the fact that we mix 16 bit stores and 32bit loads ?
> >
> > iph->tot_len = newlen;
> > iph->check = 0;
> > iph->check = ip_fast_csum(iph, 5);
>
> Yep definitely. Using 16
On Fri, 2014-03-21 at 05:50 -0700, Eric Dumazet wrote:
> Or the fact that we mix 16 bit stores and 32bit loads ?
>
> iph->tot_len = newlen;
> iph->check = 0;
> iph->check = ip_fast_csum(iph, 5);
Yep definitely. Using 16 bit loads in ip_fast_csum() totally removes the
stall. I no longer see
On Fri, Mar 21, 2014 at 05:50:50AM -0700, Eric Dumazet wrote:
> On Thu, 2014-03-20 at 18:56 -0700, Andi Kleen wrote:
> > Eric Dumazet writes:
> > >
> > > I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
> > > is insane...
> >
> >
> > Couldn't it just be the cache miss?
>
On Thu, 2014-03-20 at 18:56 -0700, Andi Kleen wrote:
> Eric Dumazet writes:
> >
> > I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
> > is insane...
>
>
> Couldn't it just be the cache miss?
Or the fact that we mix 16 bit stores and 32bit loads ?
iph->tot_len = newlen;
On Thu, 2014-03-20 at 18:56 -0700, Andi Kleen wrote:
Eric Dumazet eric.duma...@gmail.com writes:
I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
is insane...
Couldn't it just be the cache miss?
Or the fact that we mix 16 bit stores and 32bit loads ?
On Fri, Mar 21, 2014 at 05:50:50AM -0700, Eric Dumazet wrote:
On Thu, 2014-03-20 at 18:56 -0700, Andi Kleen wrote:
Eric Dumazet eric.duma...@gmail.com writes:
I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
is insane...
Couldn't it just be the cache
On Fri, 2014-03-21 at 05:50 -0700, Eric Dumazet wrote:
Or the fact that we mix 16 bit stores and 32bit loads ?
iph-tot_len = newlen;
iph-check = 0;
iph-check = ip_fast_csum(iph, 5);
Yep definitely. Using 16 bit loads in ip_fast_csum() totally removes the
stall. I no longer see
On Fri, 2014-03-21 at 06:32 -0700, Eric Dumazet wrote:
On Fri, 2014-03-21 at 05:50 -0700, Eric Dumazet wrote:
Or the fact that we mix 16 bit stores and 32bit loads ?
iph-tot_len = newlen;
iph-check = 0;
iph-check = ip_fast_csum(iph, 5);
Yep definitely. Using 16 bit loads in
On Fri, 2014-03-21 at 06:47 -0700, Eric Dumazet wrote:
Another idea would be to move the ip_fast_csum() call at the end of
inet_gro_complete()
I'll try this :
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 8c54870db792..0ca8f350a532 100644
--- a/net/ipv4/af_inet.c
+++
From: Eric Dumazet
On Thu, 2014-03-20 at 18:56 -0700, Andi Kleen wrote:
Eric Dumazet eric.duma...@gmail.com writes:
I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
is insane...
Couldn't it just be the cache miss?
Or the fact that we mix 16 bit stores
From: Eric Dumazet eric.duma...@gmail.com
Date: Fri, 21 Mar 2014 05:50:50 -0700
It looks like a barrier() would be more appropriate.
barrier() == __asm__ __volatile__(:::memory)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to
22 matches
Mail list logo