Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-27 Thread Chris Leech
Netperf2 TOT now accesses the buffer that was just recv()'d rather than the one that is about to be recv()'d. We've posted netperf2 results with I/OAT enabled/disabled and the data access option on/off at http://kernel.org/pub/linux/kernel/people/grover/ioat/netperf-icb-1.5-postscaling-both.pdf
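The post-recv() data access Chris describes can be sketched roughly as follows. This is a minimal stand-in, not netperf2's actual code; the function name and the 64-byte cache-line size are assumptions:

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed line size; real code would query the CPU */

/* Hypothetical sketch of touching the buffer that was just recv()'d:
 * read one byte per cache line so every line is actually pulled into
 * this CPU's cache, the way an application consuming the data would. */
uint32_t touch_recv_buffer(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i += CACHE_LINE)
        sum += buf[i];              /* one load per line is enough */
    if (len)
        sum += buf[len - 1];        /* make sure the tail line is touched */
    return sum;
}
```

Touching one byte per line is the usual trick for forcing a cache fill without the cost of reading every byte.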

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-27 Thread Rick Jones
Chris Leech wrote: Netperf2 TOT now accesses the buffer that was just recv()'d rather than the one that is about to be recv()'d. We've posted netperf2 results with I/OAT enabled/disabled and the data access option on/off at

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-21 Thread Ingo Oeser
David S. Miller wrote: The first thing an application is going to do is touch that data. So I think it's very important to prewarm the caches and the only straightforward way I know of to always warm up the correct cpu's caches is copy_to_user(). Hmm, what if the application is something like a
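A user-space analogue of the cache-warming argument above (illustrative only, not kernel code; the function name is hypothetical):

```c
#include <string.h>
#include <stddef.h>
#include <stdint.h>

/* A CPU copy such as copy_to_user() streams every destination line
 * through the copying CPU, so the application's first reads of `dst`
 * hit cache. A descriptor-driven DMA-engine copy moves the same bytes
 * but leaves those lines cold. */
size_t cpu_copy_and_warm(uint8_t *dst, const uint8_t *src, size_t len)
{
    memcpy(dst, src, len);  /* dst lines now sit in this CPU's cache */
    return len;             /* the consumer's follow-on reads are warm */
}
```

The point of contention in the thread is exactly this side effect: the copy itself costs CPU, but it doubles as a prefetch for whatever the application does next.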

[PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-20 Thread Andrew Grover
Hi I'm reposting these, originally posted by Chris Leech a few weeks ago. However, there is an extra part since I broke up one patch that was too big for netdev last time into two (patches 2 and 3). Of course we're always looking for more style improvement comments, but more importantly we're

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-20 Thread Olof Johansson
On Thu, Apr 20, 2006 at 01:49:16PM -0700, Andrew Grover wrote: Hi I'm reposting these, originally posted by Chris Leech a few weeks ago. However, there is an extra part since I broke up one patch that was too big for netdev last time into two (patches 2 and 3). Of course we're always

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-20 Thread Olof Johansson
On Thu, Apr 20, 2006 at 03:14:15PM -0700, Andrew Grover wrote: Hah, I was just writing an email covering those. I'll incorporate that into this response. On 4/20/06, Olof Johansson [EMAIL PROTECTED] wrote: I guess the overall question is, how much of this needs to be addressed in the

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-20 Thread David S. Miller
From: Andrew Grover [EMAIL PROTECTED] Date: Thu, 20 Apr 2006 15:14:15 -0700 First obviously it's a technology for RX CPU improvement so there's no benefit on TX workloads. Second it depends on there being buffers to copy the data into *before* the data arrives. This happens to be the case for

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-20 Thread David S. Miller
From: Olof Johansson [EMAIL PROTECTED] Date: Thu, 20 Apr 2006 18:33:43 -0500 On Thu, Apr 20, 2006 at 03:14:15PM -0700, Andrew Grover wrote: In addition, there may be workloads (file serving? backup?) where we could do an skb-to-page-in-page-cache copy and avoid cache pollution? Yes, NFS is

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-20 Thread Rick Jones
Unfortunately, many benchmarks just do raw bandwidth tests sending to a receiver that just doesn't even look at the data. They just return from recvmsg() and loop back into it. This is not what applications using networking actually do, so it's important to make sure we look intelligently at
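The contrast Rick describes could be sketched as two receive loops over a stand-in recv(). Everything here is hypothetical scaffolding; a real benchmark would read from a socket:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for recv(): fills buf with a constant. Purely illustrative. */
static size_t fake_recv(uint8_t *buf, size_t len)
{
    memset(buf, 3, len);
    return len;
}

/* Naive benchmark loop: returns from recv() and loops straight back in,
 * never reading the data, so a copy that bypasses the cache looks free. */
size_t naive_receiver(int iters)
{
    uint8_t buf[128];
    size_t total = 0;
    while (iters-- > 0)
        total += fake_recv(buf, sizeof(buf));
    return total;
}

/* Application-like loop: consumes every byte, so cache-cold data shows
 * up as a real cost in the measurement. */
uint64_t realistic_receiver(int iters)
{
    uint8_t buf[128];
    uint64_t checksum = 0;
    while (iters-- > 0) {
        size_t n = fake_recv(buf, sizeof(buf));
        for (size_t i = 0; i < n; i++)
            checksum += buf[i];
    }
    return checksum;
}
```

Only the second loop penalizes a copy engine that leaves the data out of cache, which is why the thread treats the data-access option as essential for honest I/OAT numbers.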

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-20 Thread Rick Jones
David S. Miller wrote: From: Andrew Grover [EMAIL PROTECTED] Date: Thu, 20 Apr 2006 15:14:15 -0700 First obviously it's a technology for RX CPU improvement so there's no benefit on TX workloads. Second it depends on there being buffers to copy the data into *before* the data arrives. This

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-20 Thread David S. Miller
From: Rick Jones [EMAIL PROTECTED] Date: Thu, 20 Apr 2006 18:00:37 -0700 Actually, that brings up a question - presently, and for reasons that are lost to me in the mists of time - netperf will access the buffer before it calls recv(). I'm wondering if that should be changed to an access

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-20 Thread Herbert Xu
David S. Miller [EMAIL PROTECTED] wrote: For I/O AT you'd really want to get the DMA engine going as soon as you had those packets, but I do not see a clean and reliable way to determine the target pages before the app gets back to recvmsg(). The vmsplice() system call proposed by Linus

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-20 Thread Olof Johansson
On Thu, Apr 20, 2006 at 05:27:42PM -0700, David S. Miller wrote: From: Olof Johansson [EMAIL PROTECTED] Date: Thu, 20 Apr 2006 16:33:05 -0500 From the wiki: 3. Data copied by I/OAT is not cached This is an I/OAT device limitation and not a global statement of the DMA

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-20 Thread Olof Johansson
On Thu, Apr 20, 2006 at 05:44:38PM -0700, David S. Miller wrote: From: Olof Johansson [EMAIL PROTECTED] Date: Thu, 20 Apr 2006 18:33:43 -0500 On Thu, Apr 20, 2006 at 03:14:15PM -0700, Andrew Grover wrote: In addition, there may be workloads (file serving? backup?) where we could do

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-20 Thread David S. Miller
From: Olof Johansson [EMAIL PROTECTED] Date: Thu, 20 Apr 2006 22:04:26 -0500 On Thu, Apr 20, 2006 at 05:27:42PM -0700, David S. Miller wrote: Besides the control overhead of the DMA engines, the biggest thing lost in my opinion is the perfect cache warming that a cpu based copy does from

Re: [PATCH 0/10] [IOAT] I/OAT patches repost

2006-04-20 Thread Olof Johansson
On Thu, Apr 20, 2006 at 08:42:00PM -0700, David S. Miller wrote: This is basically why none of the performance gains add up to me. I am thus very concerned that the current non-cache-warming implementation may fall flat performance wise. Ok, I buy your arguments. It does seem unlikely that a