Re: [RFC PATCH v2 0/6] Reduce cache miss for snmp_fold_field

2016-09-06 Thread hejianet

Hi Marcelo,

Thanks for the suggestion, I will consider that.

B.R.
Jia


On 9/6/16 8:44 PM, Marcelo Ricardo Leitner wrote:
> On Tue, Sep 06, 2016 at 10:30:03AM +0800, Jia He wrote:
> ...
>> v2:
>> - 1/6 fix bug in udplite statistics.
>> - 1/6 snmp_seq_show is split into 2 parts
>>
>> Jia He (6):
>>   proc: Reduce cache miss in {snmp,netstat}_seq_show
>>   proc: Reduce cache miss in snmp6_seq_show
>>   proc: Reduce cache miss in sctp_snmp_seq_show
>>   proc: Reduce cache miss in xfrm_statistics_seq_show
>>   ipv6: Remove useless parameter in __snmp6_fill_statsdev
>>   net: Suppress the "Comparison to NULL could be written" warning
>
> Hi Jia,
>
> Did you try to come up with a generic interface for this, like
> snmp_fold_fields64() (note the fieldS) or snmp_fold_field64_batch() ?
>
> Sounds like we have the same code in several places and seems they all
> operate very similarly. They have a percpu table, an identified max, a
> destination buffer..
>
> If this is possible, this would reduce the possibility of hiccups in a
> particular code.
>
>   Marcelo






Re: [RFC PATCH v2 0/6] Reduce cache miss for snmp_fold_field

2016-09-06 Thread Marcelo Ricardo Leitner
On Tue, Sep 06, 2016 at 10:30:03AM +0800, Jia He wrote:
...
> v2:
> - 1/6 fix bug in udplite statistics. 
> - 1/6 snmp_seq_show is split into 2 parts
> 
> Jia He (6):
>   proc: Reduce cache miss in {snmp,netstat}_seq_show
>   proc: Reduce cache miss in snmp6_seq_show
>   proc: Reduce cache miss in sctp_snmp_seq_show
>   proc: Reduce cache miss in xfrm_statistics_seq_show
>   ipv6: Remove useless parameter in __snmp6_fill_statsdev
>   net: Suppress the "Comparison to NULL could be written" warning

Hi Jia,

Did you try to come up with a generic interface for this, like
snmp_fold_fields64() (note the fieldS) or snmp_fold_field64_batch() ?

Sounds like we have the same code in several places and seems they all
operate very similarly. They have a percpu table, an identified max, a
destination buffer.. 

If this is possible, this would reduce the possibility of hiccups in a
particular code.

  Marcelo



[RFC PATCH v2 0/6] Reduce cache miss for snmp_fold_field

2016-09-05 Thread Jia He
On a PowerPC server with a large number of CPUs (160), besides commit
a3a773726c9f ("net: Optimize snmp stat aggregation by walking all
the percpu data at once"), I found several other snmp_fold_field
call sites that cause a high cache miss rate.

My simple test case, which reads the procfs entries in a loop:
/***/
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#define LINELEN  2560
int main(int argc, char **argv)
{
    int i;
    int fd = -1 ;
    int rdsize = 0;
    char buf[LINELEN+1];

    buf[LINELEN] = 0;
    memset(buf,0,LINELEN);

    if(1 >= argc) {
        printf("file name empty\n");
        return -1;
    }

    fd = open(argv[1], O_RDWR, 0644);
    if(0 > fd){
        printf("open error\n");
        return -2;
    }

    for(i=0;i<0x;i++) {
        while(0 < (rdsize = read(fd,buf,LINELEN))){
            //nothing here
        }

        lseek(fd, 0, SEEK_SET);
    }

    close(fd);
    return 0;
}
/**/

compile and run:
gcc test.c -o test

perf stat -d -e cache-misses ./test /proc/net/snmp
perf stat -d -e cache-misses ./test /proc/net/snmp6
perf stat -d -e cache-misses ./test /proc/net/netstat
perf stat -d -e cache-misses ./test /proc/net/sctp/snmp
perf stat -d -e cache-misses ./test /proc/net/xfrm_stat

Before the patch set:

 Performance counter stats for 'system wide':

         355911097      cache-misses                                                 [40.08%]
        2356829300      L1-dcache-loads                                              [60.04%]
         355642645      L1-dcache-load-misses     #   15.09% of all L1-dcache hits   [60.02%]
         346544541      LLC-loads                                                    [59.97%]
            389763      LLC-load-misses           #    0.11% of all LL-cache hits    [40.02%]

       6.245162638 seconds time elapsed

After the patch set:

 Performance counter stats for 'system wide':

         194992476      cache-misses                                                 [40.03%]
        6718051877      L1-dcache-loads                                              [60.07%]
         194871921      L1-dcache-load-misses     #    2.90% of all L1-dcache hits   [60.11%]
         187632232      LLC-loads                                                    [60.04%]
            464466      LLC-load-misses           #    0.25% of all LL-cache hits    [39.89%]

       6.868422769 seconds time elapsed

With the patch set, the L1-dcache load miss rate drops from 15.09% to 2.90%.

v2:
- 1/6 fix bug in udplite statistics. 
- 1/6 snmp_seq_show is split into 2 parts

Jia He (6):
  proc: Reduce cache miss in {snmp,netstat}_seq_show
  proc: Reduce cache miss in snmp6_seq_show
  proc: Reduce cache miss in sctp_snmp_seq_show
  proc: Reduce cache miss in xfrm_statistics_seq_show
  ipv6: Remove useless parameter in __snmp6_fill_statsdev
  net: Suppress the "Comparison to NULL could be written" warning

 net/ipv4/proc.c  | 144 ++-
 net/ipv6/addrconf.c  |  12 ++---
 net/ipv6/proc.c  |  47 +
 net/sctp/proc.c  |  15 --
 net/xfrm/xfrm_proc.c |  15 --
 5 files changed, 162 insertions(+), 71 deletions(-)

-- 
1.8.3.1