Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-10-09 Thread Aaron Lu
On Mon, Oct 03, 2016 at 10:32:04AM +0800, Xin Long wrote:
> On Fri, Sep 30, 2016 at 3:05 PM, Aaron Lu  wrote:
> > On 08/23/2016 05:44 AM, Marcelo Ricardo Leitner wrote:
> >> On 19-08-2016 04:24, Aaron Lu wrote:
> >>> On Fri, Aug 19, 2016 at 04:19:39AM -0300, Marcelo Ricardo Leitner wrote:
> >>>> Hi,
> >>>>
> >>>> On 19-08-2016 02:29, Aaron Lu wrote:
> >>>> ...
> >>>>> It doesn't look insane and sctp_wait_for_sndbuf may actually have
> >>>>> something to do with a larger sctp_chunk I suppose?
> >>>>>
> >>>>> The same perf record doesn't capture any sample for the good commit,
> >>>>> which suggests the netperf process doesn't sleep in
> >>>>> sctp_wait_for_sndbuf.
> >>>>
> >>>> Ahhh yes! It does, and then it would mean your txbuf is too small for the
> >>>> chunk sizes you're using (sctp tests option -m).
> >>>>
> >>>> What's your netperf cmdline again please?
> >>>
> >>> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
> >>>
> >>> Is the 10K used here a problem? If so, can you suggest a proper value
> >>> for our netperf performance test? Thanks.
> >>
> >> We're still working on this. Xin could reproduce it on an i3 too, but
> >> I'm afraid this commit just unmasked an issue in there. You're
> >> overloading the CPU by too much when spawning 8 parallel netperf's on a
> >> 4-core system, seems that commit a6c2f79287 was that last rock that made
> >> it slip into a precipice. sctp's cwnd and rwnd management are not as
> >> good as tcp's and now it seems you're triggering a corner case.
> >>
> >> I hope to have more soon.
> >
> > I wonder if there is any update on this issue?
> >
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> 
> be4947b sctp: change to check peer prsctp_capable when using prsctp polices
> 0605483 sctp: remove prsctp_param from sctp_chunk
> 73dca12 sctp: move sent_count to the memory hole in sctp_chunk
> 
> These three commits avoid this issue by restoring the sctp_chunk size.

Thanks for the update, I just confirmed the throughput is back on my
desktop.



Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-10-02 Thread Xin Long
On Fri, Sep 30, 2016 at 3:05 PM, Aaron Lu  wrote:
> On 08/23/2016 05:44 AM, Marcelo Ricardo Leitner wrote:
>> On 19-08-2016 04:24, Aaron Lu wrote:
>>> On Fri, Aug 19, 2016 at 04:19:39AM -0300, Marcelo Ricardo Leitner wrote:
>>>> Hi,
>>>>
>>>> On 19-08-2016 02:29, Aaron Lu wrote:
>>>> ...
>>>>> It doesn't look insane and sctp_wait_for_sndbuf may actually have
>>>>> something to do with a larger sctp_chunk I suppose?
>>>>>
>>>>> The same perf record doesn't capture any sample for the good commit,
>>>>> which suggests the netperf process doesn't sleep in sctp_wait_for_sndbuf.
>>>>
>>>> Ahhh yes! It does, and then it would mean your txbuf is too small for the
>>>> chunk sizes you're using (sctp tests option -m).
>>>>
>>>> What's your netperf cmdline again please?
>>>
>>> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
>>>
>>> Is the 10K used here a problem? If so, can you suggest a proper value
>>> for our netperf performance test? Thanks.
>>
>> We're still working on this. Xin could reproduce it on an i3 too, but
>> I'm afraid this commit just unmasked an issue in there. You're
>> overloading the CPU by too much when spawning 8 parallel netperf's on a
>> 4-core system, seems that commit a6c2f79287 was that last rock that made
>> it slip into a precipice. sctp's cwnd and rwnd management are not as
>> good as tcp's and now it seems you're triggering a corner case.
>>
>> I hope to have more soon.
>
> I wonder if there is any update on this issue?
>
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

be4947b sctp: change to check peer prsctp_capable when using prsctp polices
0605483 sctp: remove prsctp_param from sctp_chunk
73dca12 sctp: move sent_count to the memory hole in sctp_chunk

These three commits avoid this issue by restoring the sctp_chunk size.


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-09-30 Thread Aaron Lu
On 08/23/2016 05:44 AM, Marcelo Ricardo Leitner wrote:
> On 19-08-2016 04:24, Aaron Lu wrote:
>> On Fri, Aug 19, 2016 at 04:19:39AM -0300, Marcelo Ricardo Leitner wrote:
>>> Hi,
>>>
>>> On 19-08-2016 02:29, Aaron Lu wrote:
>>> ...
>>>> It doesn't look insane and sctp_wait_for_sndbuf may actually have
>>>> something to do with a larger sctp_chunk I suppose?
>>>>
>>>> The same perf record doesn't capture any sample for the good commit,
>>>> which suggests the netperf process doesn't sleep in sctp_wait_for_sndbuf.
>>>
>>> Ahhh yes! It does, and then it would mean your txbuf is too small for the
>>> chunk sizes you're using (sctp tests option -m).
>>>
>>> What's your netperf cmdline again please?
>>
>> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
>>
>> Is the 10K used here a problem? If so, can you suggest a proper value
>> for our netperf performance test? Thanks.
> 
> We're still working on this. Xin could reproduce it on an i3 too, but 
> I'm afraid this commit just unmasked an issue in there. You're 
> overloading the CPU by too much when spawning 8 parallel netperf's on a 
> 4-core system, seems that commit a6c2f79287 was that last rock that made 
> it slip into a precipice. sctp's cwnd and rwnd management are not as 
> good as tcp's and now it seems you're triggering a corner case.
> 
> I hope to have more soon.

I wonder if there is any update on this issue?

Thanks,
Aaron


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-23 Thread Aaron Lu
On 08/23/2016 05:44 AM, Marcelo Ricardo Leitner wrote:
> On 19-08-2016 04:24, Aaron Lu wrote:
>> On Fri, Aug 19, 2016 at 04:19:39AM -0300, Marcelo Ricardo Leitner wrote:
>>> Hi,
>>>
>>> On 19-08-2016 02:29, Aaron Lu wrote:
>>> ...
>>>> It doesn't look insane and sctp_wait_for_sndbuf may actually have
>>>> something to do with a larger sctp_chunk I suppose?
>>>>
>>>> The same perf record doesn't capture any sample for the good commit,
>>>> which suggests the netperf process doesn't sleep in sctp_wait_for_sndbuf.
>>>
>>> Ahhh yes! It does, and then it would mean your txbuf is too small for the
>>> chunk sizes you're using (sctp tests option -m).
>>>
>>> What's your netperf cmdline again please?
>>
>> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
>>
>> Is the 10K used here a problem? If so, can you suggest a proper value
>> for our netperf performance test? Thanks.
> 
> We're still working on this. Xin could reproduce it on an i3 too, but 
> I'm afraid this commit just unmasked an issue in there. You're 
> overloading the CPU by too much when spawning 8 parallel netperf's on a 
> 4-core system, seems that commit a6c2f79287 was that last rock that made 
> it slip into a precipice. sctp's cwnd and rwnd management are not as 
> good as tcp's and now it seems you're triggering a corner case.

OK, I see.

> 
> I hope to have more soon.

Looking forward to testing your patches.
Thanks for the update.

Regards,
Aaron


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-22 Thread Marcelo Ricardo Leitner

On 19-08-2016 04:24, Aaron Lu wrote:
> On Fri, Aug 19, 2016 at 04:19:39AM -0300, Marcelo Ricardo Leitner wrote:
>> Hi,
>>
>> On 19-08-2016 02:29, Aaron Lu wrote:
>> ...
>>> It doesn't look insane and sctp_wait_for_sndbuf may actually have
>>> something to do with a larger sctp_chunk I suppose?
>>>
>>> The same perf record doesn't capture any sample for the good commit,
>>> which suggests the netperf process doesn't sleep in sctp_wait_for_sndbuf.
>>
>> Ahhh yes! It does, and then it would mean your txbuf is too small for the
>> chunk sizes you're using (sctp tests option -m).
>>
>> What's your netperf cmdline again please?
>
> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
>
> Is the 10K used here a problem? If so, can you suggest a proper value
> for our netperf performance test? Thanks.


We're still working on this. Xin could reproduce it on an i3 too, but 
I'm afraid this commit just unmasked an issue in there. You're 
overloading the CPU by too much when spawning 8 parallel netperf's on a 
4-core system, seems that commit a6c2f79287 was that last rock that made 
it slip into a precipice. sctp's cwnd and rwnd management are not as 
good as tcp's and now it seems you're triggering a corner case.


I hope to have more soon.

Regards,
Marcelo


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-19 Thread Aaron Lu
On Fri, Aug 19, 2016 at 04:19:39AM -0300, Marcelo Ricardo Leitner wrote:
> Hi,
> 
> On 19-08-2016 02:29, Aaron Lu wrote:
> ...
> > It doesn't look insane and sctp_wait_for_sndbuf may actually have
> > something to do with a larger sctp_chunk I suppose?
> > 
> > The same perf record doesn't capture any sample for the good commit,
> > which suggests the netperf process doesn't sleep in sctp_wait_for_sndbuf.
> 
> Ahhh yes! It does, and then it would mean your txbuf is too small for the
> chunk sizes you're using (sctp tests option -m).
> 
> What's your netperf cmdline again please?

netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1

Is the 10K used here a problem? If so, can you suggest a proper value
for our netperf performance test? Thanks.

Regards,
Aaron


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-19 Thread Marcelo Ricardo Leitner

Hi,

On 19-08-2016 02:29, Aaron Lu wrote:
...
> It doesn't look insane and sctp_wait_for_sndbuf may actually have
> something to do with a larger sctp_chunk I suppose?
>
> The same perf record doesn't capture any sample for the good commit,
> which suggests the netperf process doesn't sleep in sctp_wait_for_sndbuf.

Ahhh yes! It does, and then it would mean your txbuf is too small for
the chunk sizes you're using (sctp tests option -m).

What's your netperf cmdline again please?

Regards,
Marcelo


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-18 Thread Aaron Lu
On Thu, Aug 18, 2016 at 08:45:42PM +0800, Xin Long wrote:
> >> Hi, Aaron
> >>
> >> 1)
> >> I talked with Marcelo about this one.
> >> He said it might be related to the cacheline: the new field destroyed
> >> the prior cacheline. So on top of commit 826d253d57b1, pls only add
> >> +   unsigned long prsctp_param;
> >>
> >> to the end of struct sctp_chunk, then try.
> >
> > This doesn't work.
> >
> 
> If it's because the cache lines changed, I'm not sure about this one, either.
> Maybe 2) is a good way to fix it.

A comparison of the good commit 826d253d57b1 and the bad a6c2f792873a:

tests: 8
testcase/path_params/tbox_group/run: 
netperf/ipv4-300s-200%-cs-localhost-10K-SCTP_STREAM_MANY-performance/lkp-ivb-d02

826d253d57b11f69 a6c2f792873aff332a4689717c
---------------- --------------------------
         %stddev    change          %stddev
             \         |                \
      3923            -37%       2461        netperf.Throughput_Mbps
         9            -78%          2        vmstat.procs.r
    112616             19%     133981        vmstat.system.cs
      4053              7%       4350        vmstat.system.in
      8598 ±  4%      957%      90912        softirqs.SCHED
  16466114            -37%   10305467        softirqs.NET_RX
    605899            -46%     329262        softirqs.TIMER
     72067 ± 10%      -63%      26356 ±  3%  softirqs.RCU
      4785 ±  7%       -9%       4352        slabinfo.anon_vma_chain.num_objs
       642 ±  7%       14%        731 ±  6%  slabinfo.kmalloc-512.active_objs
      4993             15%       5735        slabinfo.kmalloc-64.active_objs
      4993             15%       5735        slabinfo.kmalloc-64.num_objs
      2529 ±  4%      -15%       2150        proc-vmstat.nr_alloc_batch
 4.733e+08            -37%  2.999e+08        proc-vmstat.pgalloc_normal
 8.476e+08            -37%   5.36e+08        proc-vmstat.pgfree
 3.742e+08            -37%  2.361e+08        proc-vmstat.pgalloc_dma32
  1.48e+08            -37%   93033641        proc-vmstat.numa_hit
  1.48e+08            -37%   93033640        proc-vmstat.numa_local
      0.05 ± 17%    52102%      24.80        turbostat.CPU%c1
      0.64           3065%      20.10 ±  3%  turbostat.CPU%c6
      0.12 ± 39%     1900%       2.35 ±  3%  turbostat.Pkg%pc2
      0.46 ± 10%     1686%       8.22 ±  6%  turbostat.Pkg%pc6
     37.54            -14%      32.11        turbostat.PkgWatt
     20.20            -25%      15.22        turbostat.CorWatt
     99.31            -45%      54.97        turbostat.%Busy
      3269            -45%       1803        turbostat.Avg_MHz
     76510 ± 46%    3e+05%  1.954e+08        cpuidle.C1-IVB.time
     19769 ± 17%     5534%    1113742 ±  5%  cpuidle.C1E-IVB.time
       151 ± 11%     4175%       6454 ±  7%  cpuidle.C1E-IVB.usage
       114 ± 14%     6216%       7232 ±  5%  cpuidle.C3-IVB.usage
     33074 ± 14%     5159%    1739419 ±  3%  cpuidle.C3-IVB.time
      8874           4203%     381901        cpuidle.C6-IVB.usage
   8006184           4072%   3.34e+08        cpuidle.C6-IVB.time
     12019 ± 35%      303%      48398        perf-stat.cpu-migrations
  34232822             19%   40780053        perf-stat.context-switches
    339045              5%     354573        perf-stat.minor-faults
    339041              5%     354568        perf-stat.page-faults
 2.776e+11            -28%  2.003e+11        perf-stat.branch-instructions
 1.505e+12            -29%  1.065e+12        perf-stat.instructions
 6.421e+11            -30%  4.473e+11        perf-stat.dTLB-loads
  5.32e+11            -34%  3.536e+11        perf-stat.dTLB-stores
 1.173e+11            -38%  7.271e+10        perf-stat.cache-references
 3.735e+08 ±  5%      -48%  1.959e+08 ±  4%  perf-stat.iTLB-load-misses
 3.864e+09            -51%    1.9e+09        perf-stat.branch-misses
 4.069e+09 ± 20%      -56%  1.798e+09 ± 40%  perf-stat.dTLB-load-misses
 5.285e+08 ± 22%      -70%  1.585e+08 ± 16%  perf-stat.dTLB-store-misses
 7.126e+09 ± 16%      -97%   2.27e+08 ±  4%  perf-stat.cache-misses

The obvious changes are:
1. the bad commit has far fewer runnable processes (vmstat.procs.r);
2. context switches are much higher in the bad commit (vmstat.system.cs).

All of this suggests the netperf processes go to sleep for some reason in the
bad commit.

I used "perf record -p one_netperf_pid -e probe:pick_next_task_idle" as
suggested by Tim to see where it went to sleep:

Samples: 78  of event 'probe:pick_next_task_idle', Event count(approx.): 78
  Children  Self  Trace output
  ■-  100.00%   100.00%  (810fc750)
  ▒ __sendmsg_nocancel
  ▒ entry_SYSCALL_64_fastpath
  ▒ sys_sendmsg
  ▒ __sys_sendmsg
  ▒ ___sys_sendmsg
  ▒ inet_sendmsg
  ▒ sctp_sendmsg
  ▒ sctp_wait_for_sndbuf
  ▒ schedule_timeout
  ▒ schedule
  ▒ pick_next_task_idle

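It doesn't look insane and sctp_wait_for_sndbuf may actually have
something to do with a larger sctp_chunk I suppose?

The same perf record doesn't capture any sample for the good commit,
which suggests the netperf process doesn't sleep in sctp_wait_for_sndbuf.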

Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-18 Thread Xin Long
>> Hi, Aaron
>>
>> 1)
>> I talked with Marcelo about this one.
>> He said it might be related to the cacheline: the new field destroyed
>> the prior cacheline. So on top of commit 826d253d57b1, pls only add
>> +   unsigned long prsctp_param;
>>
>> to the end of struct sctp_chunk, then try.
>
> This doesn't work.
>

If it's because the cache lines changed, I'm not sure about this one, either.
Maybe 2) is a good way to fix it.

Thanks Aaron.

>> 2)
>> if 1) still doesn't work, I may think about dropping prsctp_param in
>> sctp_chunk and reusing msg->expires_at. As for sent_count, I will
>> put it into the mem hole of sctp_chunk.
>>
>> So pls also try the attached patch, on top of commit a6c2f792873a
>
> Good news, this brings the performance back on my Sandybridge desktop :)
> I have queued jobs to the Ivybridge test box but I guess the result is
> the same, but will let you know if it isn't.
>
> It looks like the size of the structure plays a role here, but not clear
> to me what happened underneath. Do you know why?
>
> Thanks,
> Aaron
>
>>
>> Thanks.
>
>> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
>> index 6bcda71..008cb76 100644
>> --- a/include/net/sctp/structs.h
>> +++ b/include/net/sctp/structs.h
>> @@ -524,7 +524,11 @@ struct sctp_datamsg {
>>   struct list_head chunks;
>>   /* Reference counting. */
>>   atomic_t refcnt;
>> - /* When is this message no longer interesting to the peer? */
>> + /* Re-use this field to record param for prsctp policies,
>> +  * for TTL policy, it is the time_to_drop of this chunk,
>> +  * for RTX policy, it is the max_sent_count of this chunk,
>> +  * for PRIO policy, it is the priority of this chunk.
>> +  */
>>   unsigned long expires_at;
>>   /* Did the messenge fail to send? */
>>   int send_error;
>> @@ -553,6 +557,9 @@ struct sctp_chunk {
>>
>>   atomic_t refcnt;
>>
>> + /* How many times this chunk have been sent, for prsctp RTX policy */
>> + int sent_count;
>> +
>>   /* This is our link to the per-transport transmitted list.  */
>>   struct list_head transmitted_list;
>>
>> @@ -602,16 +609,6 @@ struct sctp_chunk {
>>   /* This needs to be recoverable for SCTP_SEND_FAILED events. */
>>   struct sctp_sndrcvinfo sinfo;
>>
>> - /* We use this field to record param for prsctp policies,
>> -  * for TTL policy, it is the time_to_drop of this chunk,
>> -  * for RTX policy, it is the max_sent_count of this chunk,
>> -  * for PRIO policy, it is the priority of this chunk.
>> -  */
>> - unsigned long prsctp_param;
>> -
>> - /* How many times this chunk have been sent, for prsctp RTX policy */
>> - int sent_count;
>> -
>>   /* Which association does this belong to?  */
>>   struct sctp_association *asoc;
>>
>> diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
>> index 2698d12..0c53d64 100644
>> --- a/net/sctp/chunk.c
>> +++ b/net/sctp/chunk.c
>> @@ -349,7 +349,7 @@ int sctp_chunk_abandoned(struct sctp_chunk *chunk)
>>   }
>>
>>   if (SCTP_PR_TTL_ENABLED(chunk->sinfo.sinfo_flags) &&
>> - time_after(jiffies, chunk->prsctp_param)) {
>> + time_after(jiffies, chunk->msg->expires_at)) {
>>   if (chunk->sent_count)
>>   chunk->asoc->abandoned_sent[SCTP_PR_INDEX(TTL)]++;
>>   else
>> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
>> index 2c431ee..c7110a9 100644
>> --- a/net/sctp/sm_make_chunk.c
>> +++ b/net/sctp/sm_make_chunk.c
>> @@ -718,7 +718,7 @@ static void sctp_set_prsctp_policy(struct sctp_chunk *chunk,
>>   return;
>>
>>   if (SCTP_PR_TTL_ENABLED(sinfo->sinfo_flags))
>> - chunk->prsctp_param =
>> + chunk->msg->expires_at =
>>   jiffies + msecs_to_jiffies(sinfo->sinfo_timetolive);
>>  }
>>
>


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-17 Thread Aaron Lu
On Thu, Aug 18, 2016 at 02:06:50AM +0800, Xin Long wrote:
> >
> > It doesn't seem memory is an issue.
> >
> > The whole dump is about the same.
> > The MemFree and MemAvailable don't change much.
> >
> Hi, Aaron
> 
> 1)
> I talked with Marcelo about this one.
> He said it might be related to the cacheline: the new field destroyed
> the prior cacheline. So on top of commit 826d253d57b1, pls only add
> +   unsigned long prsctp_param;
> 
> to the end of struct sctp_chunk, then try.

This doesn't work.
 
> 2)
> if 1) still doesn't work, I may think about dropping prsctp_param in
> sctp_chunk and reusing msg->expires_at. As for sent_count, I will
> put it into the mem hole of sctp_chunk.
> 
> So pls also try the attached patch, on top of commit a6c2f792873a

Good news, this brings the performance back on my Sandybridge desktop :)
I have queued jobs to the Ivybridge test box but I guess the result is
the same, but will let you know if it isn't.

It looks like the size of the structure plays a role here, but not clear
to me what happened underneath. Do you know why?

Thanks,
Aaron

> 
> Thanks.

> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index 6bcda71..008cb76 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -524,7 +524,11 @@ struct sctp_datamsg {
>   struct list_head chunks;
>   /* Reference counting. */
>   atomic_t refcnt;
> - /* When is this message no longer interesting to the peer? */
> + /* Re-use this field to record param for prsctp policies,
> +  * for TTL policy, it is the time_to_drop of this chunk,
> +  * for RTX policy, it is the max_sent_count of this chunk,
> +  * for PRIO policy, it is the priority of this chunk.
> +  */
>   unsigned long expires_at;
>   /* Did the messenge fail to send? */
>   int send_error;
> @@ -553,6 +557,9 @@ struct sctp_chunk {
>  
>   atomic_t refcnt;
>  
> + /* How many times this chunk have been sent, for prsctp RTX policy */
> + int sent_count;
> +
>   /* This is our link to the per-transport transmitted list.  */
>   struct list_head transmitted_list;
>  
> @@ -602,16 +609,6 @@ struct sctp_chunk {
>   /* This needs to be recoverable for SCTP_SEND_FAILED events. */
>   struct sctp_sndrcvinfo sinfo;
>  
> - /* We use this field to record param for prsctp policies,
> -  * for TTL policy, it is the time_to_drop of this chunk,
> -  * for RTX policy, it is the max_sent_count of this chunk,
> -  * for PRIO policy, it is the priority of this chunk.
> -  */
> - unsigned long prsctp_param;
> -
> - /* How many times this chunk have been sent, for prsctp RTX policy */
> - int sent_count;
> -
>   /* Which association does this belong to?  */
>   struct sctp_association *asoc;
>  
> diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
> index 2698d12..0c53d64 100644
> --- a/net/sctp/chunk.c
> +++ b/net/sctp/chunk.c
> @@ -349,7 +349,7 @@ int sctp_chunk_abandoned(struct sctp_chunk *chunk)
>   }
>  
>   if (SCTP_PR_TTL_ENABLED(chunk->sinfo.sinfo_flags) &&
> - time_after(jiffies, chunk->prsctp_param)) {
> + time_after(jiffies, chunk->msg->expires_at)) {
>   if (chunk->sent_count)
>   chunk->asoc->abandoned_sent[SCTP_PR_INDEX(TTL)]++;
>   else
> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
> index 2c431ee..c7110a9 100644
> --- a/net/sctp/sm_make_chunk.c
> +++ b/net/sctp/sm_make_chunk.c
> @@ -718,7 +718,7 @@ static void sctp_set_prsctp_policy(struct sctp_chunk *chunk,
>   return;
>  
>   if (SCTP_PR_TTL_ENABLED(sinfo->sinfo_flags))
> - chunk->prsctp_param =
> + chunk->msg->expires_at =
>   jiffies + msecs_to_jiffies(sinfo->sinfo_timetolive);
>  }
>  



Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-17 Thread Xin Long
>
> It doesn't seem memory is an issue.
>
> The whole dump is about the same.
> The MemFree and MemAvailable don't change much.
>
Hi, Aaron

1)
I talked with Marcelo about this one.
He said it might be related to the cacheline: the new field destroyed
the prior cacheline. So on top of commit 826d253d57b1, pls only add
+   unsigned long prsctp_param;

to the end of struct sctp_chunk, then try.


2)
if 1) still doesn't work, I may think about dropping prsctp_param in
sctp_chunk and reusing msg->expires_at. As for sent_count, I will
put it into the mem hole of sctp_chunk.

So pls also try the attached patch, on top of commit a6c2f792873a

Thanks.
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 6bcda71..008cb76 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -524,7 +524,11 @@ struct sctp_datamsg {
 	struct list_head chunks;
 	/* Reference counting. */
 	atomic_t refcnt;
-	/* When is this message no longer interesting to the peer? */
+	/* Re-use this field to record param for prsctp policies,
+	 * for TTL policy, it is the time_to_drop of this chunk,
+	 * for RTX policy, it is the max_sent_count of this chunk,
+	 * for PRIO policy, it is the priority of this chunk.
+	 */
 	unsigned long expires_at;
 	/* Did the messenge fail to send? */
 	int send_error;
@@ -553,6 +557,9 @@ struct sctp_chunk {
 
 	atomic_t refcnt;
 
+	/* How many times this chunk have been sent, for prsctp RTX policy */
+	int sent_count;
+
 	/* This is our link to the per-transport transmitted list.  */
 	struct list_head transmitted_list;
 
@@ -602,16 +609,6 @@ struct sctp_chunk {
 	/* This needs to be recoverable for SCTP_SEND_FAILED events. */
 	struct sctp_sndrcvinfo sinfo;
 
-	/* We use this field to record param for prsctp policies,
-	 * for TTL policy, it is the time_to_drop of this chunk,
-	 * for RTX policy, it is the max_sent_count of this chunk,
-	 * for PRIO policy, it is the priority of this chunk.
-	 */
-	unsigned long prsctp_param;
-
-	/* How many times this chunk have been sent, for prsctp RTX policy */
-	int sent_count;
-
 	/* Which association does this belong to?  */
 	struct sctp_association *asoc;
 
diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
index 2698d12..0c53d64 100644
--- a/net/sctp/chunk.c
+++ b/net/sctp/chunk.c
@@ -349,7 +349,7 @@ int sctp_chunk_abandoned(struct sctp_chunk *chunk)
 	}
 
 	if (SCTP_PR_TTL_ENABLED(chunk->sinfo.sinfo_flags) &&
-	time_after(jiffies, chunk->prsctp_param)) {
+	time_after(jiffies, chunk->msg->expires_at)) {
 		if (chunk->sent_count)
 			chunk->asoc->abandoned_sent[SCTP_PR_INDEX(TTL)]++;
 		else
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 2c431ee..c7110a9 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -718,7 +718,7 @@ static void sctp_set_prsctp_policy(struct sctp_chunk *chunk,
 		return;
 
 	if (SCTP_PR_TTL_ENABLED(sinfo->sinfo_flags))
-		chunk->prsctp_param =
+		chunk->msg->expires_at =
 			jiffies + msecs_to_jiffies(sinfo->sinfo_timetolive);
 }
 


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-17 Thread Aaron Lu
On 08/17/2016 04:58 PM, Xin Long wrote:
>>
>> It doesn't change on my desktop Sandybridge.
>>
>> $ cat 4.7.0-rc6-01199-g116558d316e8/0/netperf.json
>> {
>>   "netperf.Throughput_Mbps": [
>>     748.205624998
>>   ]
>> }
>>
>> Where commit 116558d316e8 is based on top of the last test commit
>> 98dd2532b14e with the sent_count removed.
> Nice job.
> I guess it may be because of your system's memory limitation.
> 
> The sctp_chunk size is bigger than before, and netperf produces a lot
> of sctp_chunks in the send queue.
> 
> Can you check the memory of your system while the test is
> running, to see if memory is the bottleneck of this test?

We have a monitor to dump /proc/meminfo every second during the run.

On my desktop, the result is -

At start:
time: 1471413103.386122645
MemTotal:       14193468 kB
MemFree:        13849136 kB
MemAvailable:   13789204 kB

In the middle of the run:
time: 1471413254.363430637
MemTotal:       14193468 kB
MemFree:        13811732 kB
MemAvailable:   13756376 kB

When the test is about to finish:
time: 1471413391.294215121
MemTotal:       14193468 kB
MemFree:        13286080 kB
MemAvailable:   13749416 kB

It doesn't seem memory is an issue.

The whole dump is about the same.
The MemFree and MemAvailable don't change much.

Thanks,
Aaron


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-17 Thread Xin Long
>
> It doesn't change on my desktop Sandybridge.
>
> $ cat 4.7.0-rc6-01199-g116558d316e8/0/netperf.json
> {
>   "netperf.Throughput_Mbps": [
>     748.205624998
>   ]
> }
>
> Where commit 116558d316e8 is based on top of the last test commit
> 98dd2532b14e with the sent_count removed.
Nice job.
I guess it may be because of your system's memory limitation.

The sctp_chunk size is bigger than before, and netperf produces a lot
of sctp_chunks in the send queue.

Can you check the memory of your system while the test is
running, to see if memory is the bottleneck of this test?

>
> Thanks,
> Aaron


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-17 Thread Aaron Lu
On Wed, Aug 17, 2016 at 04:02:45PM +0800, Xin Long wrote:
> >> You mean only these two lines:
> >>> +   unsigned long prsctp_param;
> >>> +   int sent_count;
> >>
> >> caused the performance issue?
> >
> > Right.
> OK, can you remove this line from your patch
> +   int sent_count;
> 
> then test again, thanks.

It doesn't change on my desktop Sandybridge.

$ cat 4.7.0-rc6-01199-g116558d316e8/0/netperf.json
{
  "netperf.Throughput_Mbps": [
   748.205624998
  ]
}

Where commit 116558d316e8 is based on top of the last test commit
98dd2532b14e with the sent_count removed.

Thanks,
Aaron


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-17 Thread Xin Long
>> You mean only these two lines:
>>> +   unsigned long prsctp_param;
>>> +   int sent_count;
>>
>> caused the performance issue?
>
> Right.
OK, can you remove this line from your patch
+   int sent_count;

then test again, thanks.


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-17 Thread Aaron Lu
On Wed, Aug 17, 2016 at 03:42:34PM +0800, Aaron Lu wrote:
> On 08/17/2016 03:35 PM, Xin Long wrote:
> >>  include/net/sctp/structs.h | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> >> index d8e464aacb20..932f2780d3a4 100644
> >> --- a/include/net/sctp/structs.h
> >> +++ b/include/net/sctp/structs.h
> >> @@ -602,6 +602,9 @@ struct sctp_chunk {
> >> /* This needs to be recoverable for SCTP_SEND_FAILED events. */
> >> struct sctp_sndrcvinfo sinfo;
> >>
> >> +   unsigned long prsctp_param;
> >> +   int sent_count;
> >> +
> >> /* Which association does this belong to?  */
> >> struct sctp_association *asoc;
> >>
> >> --
> >> 2.5.5
> >>
> >> Then the performance dropped to the same as the bisected commit
> >> a6c2f792873a:
> >> $ cat 4.7.0-rc6-01198-g98dd2532b14e/0/netperf.json
> >> {
> >>   "netperf.Throughput_Mbps": [
> >>     754.494375
> >>   ]
> >> }
> >>
> >> I think this agrees with the perf data in that the newly added function
> >> doesn't show up in the perf-profile but still, the performance drops.
> >> So the only possible reason is the newly added fields to the sctp_chunk
> >> structure.
> >>
> >> Is this expected?
> > Interesting, you didn't include the function
> > modifications, right?
> 
> Yes.
> 
> > You mean only these two lines:
> >> +   unsigned long prsctp_param;
> >> +   int sent_count;
> > 
> > caused the performance issue?
>  
> Right.

Note the test is done on my own Sandybridge desktop, I'll queue a job to
run on the Ivybridge test box now.


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-17 Thread Aaron Lu
On 08/17/2016 03:35 PM, Xin Long wrote:
>>  include/net/sctp/structs.h | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
>> index d8e464aacb20..932f2780d3a4 100644
>> --- a/include/net/sctp/structs.h
>> +++ b/include/net/sctp/structs.h
>> @@ -602,6 +602,9 @@ struct sctp_chunk {
>> /* This needs to be recoverable for SCTP_SEND_FAILED events. */
>> struct sctp_sndrcvinfo sinfo;
>>
>> +   unsigned long prsctp_param;
>> +   int sent_count;
>> +
>> /* Which association does this belong to?  */
>> struct sctp_association *asoc;
>>
>> --
>> 2.5.5
>>
>> Then the performance dropped to the same as the bisected commit
>> a6c2f792873a:
>> $ cat 4.7.0-rc6-01198-g98dd2532b14e/0/netperf.json
>> {
>>   "netperf.Throughput_Mbps": [
>>     754.494375
>>   ]
>> }
>>
>> I think this agrees with the perf data in that the newly added function
>> doesn't show up in the perf-profile but still, the performance drops.
>> So the only possible reason is the newly added fields to the sctp_chunk
>> structure.
>>
>> Is this expected?
> Interesting, you didn't include the function
> modifications, right?

Yes.

> You mean only these two lines:
>> +   unsigned long prsctp_param;
>> +   int sent_count;
> 
> caused the performance issue?
 
Right.


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-17 Thread Xin Long
>  include/net/sctp/structs.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index d8e464aacb20..932f2780d3a4 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -602,6 +602,9 @@ struct sctp_chunk {
> /* This needs to be recoverable for SCTP_SEND_FAILED events. */
> struct sctp_sndrcvinfo sinfo;
>
> +   unsigned long prsctp_param;
> +   int sent_count;
> +
> /* Which association does this belong to?  */
> struct sctp_association *asoc;
>
> --
> 2.5.5
>
> Then the performance dropped to the same as the bisected commit
> a6c2f792873a:
> $ cat 4.7.0-rc6-01198-g98dd2532b14e/0/netperf.json
> {
>   "netperf.Throughput_Mbps": [
>     754.494375
>   ]
> }
>
> I think this agrees with the perf data in that the newly added function
> doesn't show up in the perf-profile but still, the performance drops.
> So the only possible reason is the newly added fields to the sctp_chunk
> structure.
>
> Is this expected?
Interesting, you didn't include the function
modifications, right?
You mean only these two lines:
> +   unsigned long prsctp_param;
> +   int sent_count;

caused the performance issue?


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Aaron Lu
On Wed, Aug 17, 2016 at 02:37:19PM +0800, Aaron Lu wrote:
> On Wed, Aug 17, 2016 at 02:14:05PM +0800, Aaron Lu wrote:
> > On Wed, Aug 17, 2016 at 01:41:04PM +0800, Xin Long wrote:
> > > > The perf-profile data for the two commits are attached(for the case of
> > > > prsctp_enable=1, the perf-profile data doesn't get collected for the 0
> > > > case for some reason, I'm checking the problem now).
> > > >
> > > > The CPU gets much more idle time in the bisected commit a6c2f79287:
> > > >
> > > > 68.89% 0.70%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
> > > > 49.32% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
> > > > 49.17% 0.12%  [kernel.kallsyms]   [k] __sys_sendmsg
> > > > 48.58% 0.22%  [kernel.kallsyms]   [k] ___sys_sendmsg
> > > > 46.69% 0.06%  [kernel.kallsyms]   [k] sock_sendmsg
> > > > 46.31% 0.16%  [kernel.kallsyms]   [k] inet_sendmsg
> > > > 45.90% 0.98%  [kernel.kallsyms]   [k] sctp_sendmsg
> > > > 29.66% 0.45%  [kernel.kallsyms]   [k] sctp_do_sm
> > > > 29.54% 0.23%  [kernel.kallsyms]   [k] cpu_startup_entry
> > > > 28.81% 0.68%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
> > > > 26.20% 0.00%  [kernel.kallsyms]   [k] start_secondary
> > > > 23.04% 0.09%  [kernel.kallsyms]   [k] sctp_inq_push
> > > > 23.03% 0.08%  [kernel.kallsyms]   [k] call_cpuidle
> > > > 22.94% 0.00%  [kernel.kallsyms]   [k] cpuidle_enter
> > > > 22.60% 0.18%  [kernel.kallsyms]   [k] cpuidle_enter_state
> > > > 21.99% 21.99%  [kernel.kallsyms]   [k] intel_idle
> > > > ... ...
> > > >
> > > > While its immediate parent commit 826d253d57 is mostly busy working:
> > > >
> > > > 98.53% 0.83%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
> > > > 78.13% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
> > > > 78.03% 0.16%  [kernel.kallsyms]   [k] __sys_sendmsg
> > > > 77.08% 0.28%  [kernel.kallsyms]   [k] ___sys_sendmsg
> > > > 74.44% 0.08%  [kernel.kallsyms]   [k] sock_sendmsg
> > > > 73.82% 0.13%  [kernel.kallsyms]   [k] inet_sendmsg
> > > > 73.34% 1.44%  [kernel.kallsyms]   [k] sctp_sendmsg
> > > > 47.52% 0.75%  [kernel.kallsyms]   [k] sctp_do_sm
> > > > 46.19% 0.90%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
> > > > 37.17% 1.43%  [kernel.kallsyms]   [k] sctp_outq_flush
> > > > 36.93% 0.08%  [kernel.kallsyms]   [k] sctp_outq_uncork
> > > > 34.24% 0.15%  [kernel.kallsyms]   [k] sctp_inq_push
> > > > ... ...
> > > > No idle related function above 1%.
> > > >
> > > > Will the bisected commit make the idle possible?
> > > No, not at all. :)
> > > 
> > > pls help to debug as I said in the last reply.
> > 
> > OK, will see how to do that.
> > 
> > In the meantime, I just tried to reproduce on my own desktop:
> > Sandybridge i7-2600 CPU @ 3.40GHz and it reproduced:
> > $ cat 4.7.0-rc6-01198-ga6c2f792873a/0/netperf.json
> > {
> >   "netperf.Throughput_Mbps": [
> >752.94502
> >   ]
> > }
> > $ cat 4.7.0-rc6-01197-g826d253d57b1/0/netperf.json
> > {
> >   "netperf.Throughput_Mbps": [
> >1068.555624997
> >   ]
> > }
> 
> On top of
> commit 826d253d57b1 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt")
> I applied the below commit:
> 
> From 98dd2532b14e29dcc2ab40a7348755531afa79e4 Mon Sep 17 00:00:00 2001
> From: Aaron Lu 
> Date: Wed, 17 Aug 2016 14:20:00 +0800
> Subject: [PATCH] sctp: test
> 
> ---
>  include/net/sctp/structs.h | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index d8e464aacb20..932f2780d3a4 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -602,6 +602,9 @@ struct sctp_chunk {
>   /* This needs to be recoverable for SCTP_SEND_FAILED events. */
>   struct sctp_sndrcvinfo sinfo;
>  
> + unsigned long prsctp_param;
> + int sent_count;
> +
>   /* Which association does this belong to?  */
>   struct sctp_association *asoc;
>  
> -- 
> 2.5.5
> 
> Then the performance dropped to the same as the bisected commit
> a6c2f792873a:
> $ cat 4.7.0-rc6-01198-g98dd2532b14e/0/netperf.json
> {
>   "netperf.Throughput_Mbps": [
>     754.494375
>   ]
> }
> 
> I think this agrees with the perf data in that the newly added function

Actually, I mean the modified functions like sctp_chunk_abandoned and
__sctp_packet_append_chunk, etc.

> doesn't show up in the perf-profile but still, the performance drops.
> So the only possible reason is the newly added fields to the sctp_chunk
> structure.
> 
> Is this expected?
> 
> Thanks,
> Aaron


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Aaron Lu
On Wed, Aug 17, 2016 at 02:14:05PM +0800, Aaron Lu wrote:
> On Wed, Aug 17, 2016 at 01:41:04PM +0800, Xin Long wrote:
> > > The perf-profile data for the two commits are attached(for the case of
> > > prsctp_enable=1, the perf-profile data doesn't get collected for the 0
> > > case for some reason, I'm checking the problem now).
> > >
> > > The CPU gets much more idle time in the bisected commit a6c2f79287:
> > >
> > > 68.89% 0.70%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
> > > 49.32% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
> > > 49.17% 0.12%  [kernel.kallsyms]   [k] __sys_sendmsg
> > > 48.58% 0.22%  [kernel.kallsyms]   [k] ___sys_sendmsg
> > > 46.69% 0.06%  [kernel.kallsyms]   [k] sock_sendmsg
> > > 46.31% 0.16%  [kernel.kallsyms]   [k] inet_sendmsg
> > > 45.90% 0.98%  [kernel.kallsyms]   [k] sctp_sendmsg
> > > 29.66% 0.45%  [kernel.kallsyms]   [k] sctp_do_sm
> > > 29.54% 0.23%  [kernel.kallsyms]   [k] cpu_startup_entry
> > > 28.81% 0.68%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
> > > 26.20% 0.00%  [kernel.kallsyms]   [k] start_secondary
> > > 23.04% 0.09%  [kernel.kallsyms]   [k] sctp_inq_push
> > > 23.03% 0.08%  [kernel.kallsyms]   [k] call_cpuidle
> > > 22.94% 0.00%  [kernel.kallsyms]   [k] cpuidle_enter
> > > 22.60% 0.18%  [kernel.kallsyms]   [k] cpuidle_enter_state
> > > 21.99% 21.99%  [kernel.kallsyms]   [k] intel_idle
> > > ... ...
> > >
> > > While its immediate parent commit 826d253d57 is mostly busy working:
> > >
> > > 98.53% 0.83%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
> > > 78.13% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
> > > 78.03% 0.16%  [kernel.kallsyms]   [k] __sys_sendmsg
> > > 77.08% 0.28%  [kernel.kallsyms]   [k] ___sys_sendmsg
> > > 74.44% 0.08%  [kernel.kallsyms]   [k] sock_sendmsg
> > > 73.82% 0.13%  [kernel.kallsyms]   [k] inet_sendmsg
> > > 73.34% 1.44%  [kernel.kallsyms]   [k] sctp_sendmsg
> > > 47.52% 0.75%  [kernel.kallsyms]   [k] sctp_do_sm
> > > 46.19% 0.90%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
> > > 37.17% 1.43%  [kernel.kallsyms]   [k] sctp_outq_flush
> > > 36.93% 0.08%  [kernel.kallsyms]   [k] sctp_outq_uncork
> > > 34.24% 0.15%  [kernel.kallsyms]   [k] sctp_inq_push
> > > ... ...
> > > No idle related function above 1%.
> > >
> > > Will the bisected commit make the idle possible?
> > No, not at all. :)
> > 
> > pls help to debug as I said in the last reply.
> 
> OK, will see how to do that.
> 
> In the meantime, I just tried to reproduce on my own desktop:
> Sandybridge i7-2600 CPU @ 3.40GHz and it reproduced:
> $ cat 4.7.0-rc6-01198-ga6c2f792873a/0/netperf.json
> {
>   "netperf.Throughput_Mbps": [
>752.94502
>   ]
> }
> $ cat 4.7.0-rc6-01197-g826d253d57b1/0/netperf.json
> {
>   "netperf.Throughput_Mbps": [
>1068.555624997
>   ]
> }

On top of
commit 826d253d57b1 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt")
I applied the below commit:

From 98dd2532b14e29dcc2ab40a7348755531afa79e4 Mon Sep 17 00:00:00 2001
From: Aaron Lu 
Date: Wed, 17 Aug 2016 14:20:00 +0800
Subject: [PATCH] sctp: test

---
 include/net/sctp/structs.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index d8e464aacb20..932f2780d3a4 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -602,6 +602,9 @@ struct sctp_chunk {
    /* This needs to be recoverable for SCTP_SEND_FAILED events. */
    struct sctp_sndrcvinfo sinfo;
 
+   unsigned long prsctp_param;
+   int sent_count;
+
    /* Which association does this belong to?  */
    struct sctp_association *asoc;
 
-- 
2.5.5

Then the performance dropped to the same as the bisected commit
a6c2f792873a:
$ cat 4.7.0-rc6-01198-g98dd2532b14e/0/netperf.json
{
  "netperf.Throughput_Mbps": [
   754.494375
  ]
}

I think this agrees with the perf data in that the newly added function
doesn't show up in the perf-profile but still, the performance drops.
So the only possible reason is the newly added fields to the sctp_chunk
structure.

Is this expected?
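
As a rough illustration of what those two fields cost in size, here is a minimal Python sketch using ctypes. It is not kernel code: ChunkBefore, ChunkAfter and SINFO_BYTES are hypothetical stand-ins that model only the neighbourhood of the patch, and it assumes x86-64, where unsigned long and pointers are 8 bytes and int is 4.

```python
import ctypes

# Hypothetical stand-ins for struct sctp_chunk: only the fields around the
# patch are modelled. SINFO_BYTES is a placeholder, not the real size of
# struct sctp_sndrcvinfo.
SINFO_BYTES = 32


class ChunkBefore(ctypes.Structure):
    _fields_ = [
        ("sinfo", ctypes.c_uint8 * SINFO_BYTES),
        ("asoc", ctypes.c_uint64),  # stands in for an 8-byte pointer
    ]


class ChunkAfter(ctypes.Structure):
    _fields_ = [
        ("sinfo", ctypes.c_uint8 * SINFO_BYTES),
        ("prsctp_param", ctypes.c_uint64),  # unsigned long on x86-64
        ("sent_count", ctypes.c_int32),     # int
        ("asoc", ctypes.c_uint64),          # must start on an 8-byte boundary
    ]


# Total growth of the structure caused by the two new members.
growth = ctypes.sizeof(ChunkAfter) - ctypes.sizeof(ChunkBefore)

# Bytes left unused between sent_count and the next 8-byte aligned member.
padding = ChunkAfter.asoc.offset - (
    ChunkAfter.sent_count.offset + ctypes.sizeof(ctypes.c_int32)
)

print(f"struct grows by {growth} bytes, {padding} of them padding")
```

Under these assumptions the 12 bytes of new data occupy 16 bytes, because the int is followed by alignment padding. Whether that growth alone explains the throughput drop is exactly the open question here; the sketch only shows the size effect.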

Thanks,
Aaron


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Aaron Lu
On Wed, Aug 17, 2016 at 01:41:04PM +0800, Xin Long wrote:
> > The perf-profile data for the two commits are attached(for the case of
> > prsctp_enable=1, the perf-profile data doesn't get collected for the 0
> > case for some reason, I'm checking the problem now).
> >
> > The CPU gets much more idle time in the bisected commit a6c2f79287:
> >
> > 68.89% 0.70%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
> > 49.32% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
> > 49.17% 0.12%  [kernel.kallsyms]   [k] __sys_sendmsg
> > 48.58% 0.22%  [kernel.kallsyms]   [k] ___sys_sendmsg
> > 46.69% 0.06%  [kernel.kallsyms]   [k] sock_sendmsg
> > 46.31% 0.16%  [kernel.kallsyms]   [k] inet_sendmsg
> > 45.90% 0.98%  [kernel.kallsyms]   [k] sctp_sendmsg
> > 29.66% 0.45%  [kernel.kallsyms]   [k] sctp_do_sm
> > 29.54% 0.23%  [kernel.kallsyms]   [k] cpu_startup_entry
> > 28.81% 0.68%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
> > 26.20% 0.00%  [kernel.kallsyms]   [k] start_secondary
> > 23.04% 0.09%  [kernel.kallsyms]   [k] sctp_inq_push
> > 23.03% 0.08%  [kernel.kallsyms]   [k] call_cpuidle
> > 22.94% 0.00%  [kernel.kallsyms]   [k] cpuidle_enter
> > 22.60% 0.18%  [kernel.kallsyms]   [k] cpuidle_enter_state
> > 21.99% 21.99%  [kernel.kallsyms]   [k] intel_idle
> > ... ...
> >
> > While its immediate parent commit 826d253d57 is mostly busy working:
> >
> > 98.53% 0.83%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
> > 78.13% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
> > 78.03% 0.16%  [kernel.kallsyms]   [k] __sys_sendmsg
> > 77.08% 0.28%  [kernel.kallsyms]   [k] ___sys_sendmsg
> > 74.44% 0.08%  [kernel.kallsyms]   [k] sock_sendmsg
> > 73.82% 0.13%  [kernel.kallsyms]   [k] inet_sendmsg
> > 73.34% 1.44%  [kernel.kallsyms]   [k] sctp_sendmsg
> > 47.52% 0.75%  [kernel.kallsyms]   [k] sctp_do_sm
> > 46.19% 0.90%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
> > 37.17% 1.43%  [kernel.kallsyms]   [k] sctp_outq_flush
> > 36.93% 0.08%  [kernel.kallsyms]   [k] sctp_outq_uncork
> > 34.24% 0.15%  [kernel.kallsyms]   [k] sctp_inq_push
> > ... ...
> > No idle related function above 1%.
> >
> > Will the bisected commit make the idle possible?
> No, not at all. :)
> 
> pls help to debug as I said in the last reply.

OK, will see how to do that.

In the meantime, I just tried to reproduce on my own desktop:
Sandybridge i7-2600 CPU @ 3.40GHz and it reproduced:
$ cat 4.7.0-rc6-01198-ga6c2f792873a/0/netperf.json
{
  "netperf.Throughput_Mbps": [
   752.94502
  ]
}
$ cat 4.7.0-rc6-01197-g826d253d57b1/0/netperf.json
{
  "netperf.Throughput_Mbps": [
   1068.555624997
  ]
}


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Xin Long
> The perf-profile data for the two commits are attached(for the case of
> prsctp_enable=1, the perf-profile data doesn't get collected for the 0
> case for some reason, I'm checking the problem now).
>
> The CPU gets much more idle time in the bisected commit a6c2f79287:
>
> 68.89% 0.70%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
> 49.32% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
> 49.17% 0.12%  [kernel.kallsyms]   [k] __sys_sendmsg
> 48.58% 0.22%  [kernel.kallsyms]   [k] ___sys_sendmsg
> 46.69% 0.06%  [kernel.kallsyms]   [k] sock_sendmsg
> 46.31% 0.16%  [kernel.kallsyms]   [k] inet_sendmsg
> 45.90% 0.98%  [kernel.kallsyms]   [k] sctp_sendmsg
> 29.66% 0.45%  [kernel.kallsyms]   [k] sctp_do_sm
> 29.54% 0.23%  [kernel.kallsyms]   [k] cpu_startup_entry
> 28.81% 0.68%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
> 26.20% 0.00%  [kernel.kallsyms]   [k] start_secondary
> 23.04% 0.09%  [kernel.kallsyms]   [k] sctp_inq_push
> 23.03% 0.08%  [kernel.kallsyms]   [k] call_cpuidle
> 22.94% 0.00%  [kernel.kallsyms]   [k] cpuidle_enter
> 22.60% 0.18%  [kernel.kallsyms]   [k] cpuidle_enter_state
> 21.99% 21.99%  [kernel.kallsyms]   [k] intel_idle
> ... ...
>
> While its immediate parent commit 826d253d57 is mostly busy working:
>
> 98.53% 0.83%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
> 78.13% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
> 78.03% 0.16%  [kernel.kallsyms]   [k] __sys_sendmsg
> 77.08% 0.28%  [kernel.kallsyms]   [k] ___sys_sendmsg
> 74.44% 0.08%  [kernel.kallsyms]   [k] sock_sendmsg
> 73.82% 0.13%  [kernel.kallsyms]   [k] inet_sendmsg
> 73.34% 1.44%  [kernel.kallsyms]   [k] sctp_sendmsg
> 47.52% 0.75%  [kernel.kallsyms]   [k] sctp_do_sm
> 46.19% 0.90%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
> 37.17% 1.43%  [kernel.kallsyms]   [k] sctp_outq_flush
> 36.93% 0.08%  [kernel.kallsyms]   [k] sctp_outq_uncork
> 34.24% 0.15%  [kernel.kallsyms]   [k] sctp_inq_push
> ... ...
> No idle related function above 1%.
>
> Will the bisected commit make the idle possible?
No, not at all. :)

pls help to debug as I said in the last reply.


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Aaron Lu
On 08/17/2016 01:04 PM, Aaron Lu wrote:
> On 08/16/2016 05:56 PM, Xin Long wrote:
>>>>>
>>>>> I'm testing on Linus' master, can we all use that please?
>>>>>
>>>>
>>>> [git] git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>>>
>>>> [machine]
>>>> Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
>>>> mem 62G (66000220K)
>>>>
>>>> [system]
>>>> # cat /etc/redhat-release
>>>> Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)
>>>>
>>>> [commit 3684b03]
>>>> [root@hp-dl380pg8-11 lxin]# uname -r
>>>> 4.8.0-rc2.3684b03
>>>> [root@hp-dl380pg8-11 lxin]# cat test.sh
>>>> killall -0 netserver || netserver -4 &
>>>> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
>>>
>>> I just realized the test we are doing is not exactly the same.
>>> As the original report says:
>>> ip: ipv4
>>> runtime: 300s
>>> nr_threads: 200%
>>> cluster: cs-localhost
>>> send_size: 10K
>>> test: SCTP_STREAM_MANY
>>> cpufreq_governor: performance
>>>
>>> Note the nr_threads: 200%, which means to start 2 times of CPU number
>>> processes of netperf.
>>>
>>> In our IVB i3(2 cores, 2 threads per core) case, 8 netperf processes
>>> are started concurrently:
>> OK, understand.
>>
>>>
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>>
>>> The throughput is the average of those runs.
>>>
>>> And I think we should be doing test on:
>>> commit a6c2f79287 ("sctp: implement prsctp TTL policy") (the bisected one)
>>> and
>>> commit 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt") (its 
>>> immediate parent)
>>> instead of Linus' master HEAD to avoid other factors.
>>>
>> OK, I will do tests as your suggestion now,  but need to rebuild again :D
>>
>> can you disable pr_enable with "sysctl -w net.sctp.prsctp_enable=0",
>> then try again?
> 
> For commit a6c2f79287 ("sctp: implement prsctp TTL policy"), no matter
> the value of net.sctp.prsctp_enable, the throughput is almost the same:

The perf-profile data for the two commits are attached (for the
prsctp_enable=1 case; the perf-profile data doesn't get collected for
the 0 case for some reason, I'm checking the problem now).

The CPU gets much more idle time in the bisected commit a6c2f79287:

68.89% 0.70%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
49.32% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
49.17% 0.12%  [kernel.kallsyms]   [k] __sys_sendmsg
48.58% 0.22%  [kernel.kallsyms]   [k] ___sys_sendmsg
46.69% 0.06%  [kernel.kallsyms]   [k] sock_sendmsg
46.31% 0.16%  [kernel.kallsyms]   [k] inet_sendmsg
45.90% 0.98%  [kernel.kallsyms]   [k] sctp_sendmsg
29.66% 0.45%  [kernel.kallsyms]   [k] sctp_do_sm
29.54% 0.23%  [kernel.kallsyms]   [k] cpu_startup_entry
28.81% 0.68%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
26.20% 0.00%  [kernel.kallsyms]   [k] start_secondary
23.04% 0.09%  [kernel.kallsyms]   [k] sctp_inq_push
23.03% 0.08%  [kernel.kallsyms]   [k] call_cpuidle
22.94% 0.00%  [kernel.kallsyms]   [k] cpuidle_enter
22.60% 0.18%  [kernel.kallsyms]   [k] cpuidle_enter_state
21.99% 21.99%  [kernel.kallsyms]   [k] intel_idle
... ...

While its immediate parent commit 826d253d57 is mostly busy working:

98.53% 0.83%  [kernel.kallsyms]   [k] entry_SYSCALL_64_fastpath
78.13% 0.12%  [kernel.kallsyms]   [k] sys_sendmsg
78.03% 0.16%  [kernel.kallsyms]   [k] __sys_sendmsg
77.08% 0.28%  [kernel.kallsyms]   [k] ___sys_sendmsg
74.44% 0.08%  [kernel.kallsyms]   [k] sock_sendmsg
73.82% 0.13%  [kernel.kallsyms]   [k] inet_sendmsg
73.34% 1.44%  [kernel.kallsyms]   [k] sctp_sendmsg
47.52% 0.75%  [kernel.kallsyms]   [k] sctp_do_sm
46.19% 0.90%  [kernel.kallsyms]   [k] sctp_cmd_interpreter.isra.24
37.17% 1.43%  [kernel.kallsyms]   [k] sctp_outq_flush
36.93% 0.08%  [kernel.kallsyms]   [k] sctp_outq_uncork
34.24% 0.15%  [kernel.kallsyms]   [k] sctp_inq_push
... ...
No idle-related function is above 1%.

Could the bisected commit be causing this idle time?
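
To make the idle-time comparison between the two profiles concrete, here is a minimal Python sketch that parses rows in the format shown above and sums the self time of idle-related symbols. It assumes the first percentage is the cumulative (children) cost and the second is the self cost; the IDLE_SYMBOLS set and the helper names are my own choices, and only a few rows from each profile are pasted in as sample input.

```python
import re

# Matches "children% self%  object  [k] symbol". The \s* between the two
# percentages tolerates rows where the columns run together.
ROW = re.compile(r"^\s*([\d.]+)%\s*([\d.]+)%\s+\S+\s+\[k\]\s+(\S+)")

# Assumption: these symbols are treated as "idle" for this comparison.
IDLE_SYMBOLS = {"intel_idle", "cpuidle_enter_state", "cpuidle_enter", "call_cpuidle"}


def parse_profile(text):
    """Return {symbol: (children_pct, self_pct)} for every row that parses."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows[m.group(3)] = (float(m.group(1)), float(m.group(2)))
    return rows


def idle_self_pct(rows):
    """Sum the self time of the idle-related symbols listed above."""
    return sum(self_pct for sym, (_, self_pct) in rows.items() if sym in IDLE_SYMBOLS)


# A few rows from the bisected commit's profile.
bad = parse_profile("""
45.90% 0.98%  [kernel.kallsyms]   [k] sctp_sendmsg
23.03% 0.08%  [kernel.kallsyms]   [k] call_cpuidle
22.94% 0.00%  [kernel.kallsyms]   [k] cpuidle_enter
22.60% 0.18%  [kernel.kallsyms]   [k] cpuidle_enter_state
21.99%21.99%  [kernel.kallsyms]   [k] intel_idle
""")

# A few rows from the parent commit's profile (no idle symbols listed).
good = parse_profile("""
73.34% 1.44%  [kernel.kallsyms]   [k] sctp_sendmsg
37.17% 1.43%  [kernel.kallsyms]   [k] sctp_outq_flush
""")

print(round(idle_self_pct(bad), 2), round(idle_self_pct(good), 2))
```

Summing self time rather than the cumulative column avoids counting intel_idle once for each of its callers.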

Thanks,
Aaron


perf-profile-a6c2f79287.gz
Description: application/gzip


perf-profile-826d253d57.gz
Description: appli

Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Xin Long
>
> For commit a6c2f79287 ("sctp: implement prsctp TTL policy"), no matter
> the value of net.sctp.prsctp_enable, the throughput is almost the same:
>
> net.sctp.prsctp_enable = 0
> {
>   "netperf.Throughput_Mbps": [
> 2353.311249997
>   ]
> }
>
> net.sctp.prsctp_enable = 1
> {
>   "netperf.Throughput_Mbps": [
> 2371.586250003
>   ]
> }
>
> For its immediate parent:
> commit 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt")
> No matter the value of net.sctp.prsctp_enable, the throughput is again
> almost the same:
>
> net.sctp.prsctp_enable = 0
> {
>   "netperf.Throughput_Mbps": [
> 3838.83004
>   ]
> }
>
> net.sctp.prsctp_enable = 1
> {
>   "netperf.Throughput_Mbps": [
> 3751.46005
>   ]
> }
>
> Does this result give any hint?
OK, if you disable prsctp_enable, commit a6c2f79287 really only adds
two if () checks, which definitely can't affect performance.

If it's really an issue, please help by reverting the code from commit
a6c2f79287 little by little, rebuilding the kernel and retesting. You
will find exactly which line caused the performance issue. That seems
to be the only way to locate the issue, and it's only reproducible in
your env.

Thanks.


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Aaron Lu
On 08/16/2016 05:56 PM, Xin Long wrote:

>>>> I'm testing on Linus' master, can we all use that please?

>>>
>>> [git] git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>>
>>> [machine]
>>> Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
>>> mem 62G (66000220K)
>>>
>>> [system]
>>> # cat /etc/redhat-release
>>> Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)
>>>
>>> [commit 3684b03]
>>> [root@hp-dl380pg8-11 lxin]# uname -r
>>> 4.8.0-rc2.3684b03
>>> [root@hp-dl380pg8-11 lxin]# cat test.sh
>>> killall -0 netserver || netserver -4 &
>>> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
>>
>> I just realized the test we are doing is not exactly the same.
>> As the original report says:
>> ip: ipv4
>> runtime: 300s
>> nr_threads: 200%
>> cluster: cs-localhost
>> send_size: 10K
>> test: SCTP_STREAM_MANY
>> cpufreq_governor: performance
>>
>> Note the nr_threads: 200%, which means to start 2 times of CPU number
>> processes of netperf.
>>
>> In our IVB i3(2 cores, 2 threads per core) case, 8 netperf processes
>> are started concurrently:
> OK, understand.
> 
>>
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>
>> The throughput is the average of those runs.
>>
>> And I think we should be doing test on:
>> commit a6c2f79287 ("sctp: implement prsctp TTL policy") (the bisected one)
>> and
>> commit 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt") (its 
>> immediate parent)
>> instead of Linus' master HEAD to avoid other factors.
>>
> OK, I will do tests as your suggestion now,  but need to rebuild again :D
> 
> can you disable pr_enable with "sysctl -w net.sctp.prsctp_enable=0",
> then try again?

For commit a6c2f79287 ("sctp: implement prsctp TTL policy"), no matter
the value of net.sctp.prsctp_enable, the throughput is almost the same:

net.sctp.prsctp_enable = 0
{
  "netperf.Throughput_Mbps": [
2353.311249997
  ]
}

net.sctp.prsctp_enable = 1
{
  "netperf.Throughput_Mbps": [
2371.586250003
  ]
}

For its immediate parent:
commit 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt")
No matter the value of net.sctp.prsctp_enable, the throughput is again
almost the same:

net.sctp.prsctp_enable = 0
{
  "netperf.Throughput_Mbps": [
3838.83004
  ]
}

net.sctp.prsctp_enable = 1
{
  "netperf.Throughput_Mbps": [
3751.46005
  ]
}

Does this result give any hint?
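
For reference, the relative drop between the two commits can be worked out from the four figures above. This is only a small Python sketch; the percent_change helper and the results layout are my own naming, and the numbers are copied from the JSON output in this mail.

```python
def percent_change(parent, commit):
    """Relative change of `commit` versus `parent`, in percent."""
    return (commit - parent) / parent * 100.0


# netperf.Throughput_Mbps from the runs above, keyed by net.sctp.prsctp_enable.
results = {
    0: {"parent": 3838.83004, "commit": 2353.311249997},
    1: {"parent": 3751.46005, "commit": 2371.586250003},
}

for prsctp_enable, r in results.items():
    change = percent_change(r["parent"], r["commit"])
    print(f"prsctp_enable={prsctp_enable}: {change:.1f}%")
```

Both settings come out in the same range as the -37.2% in the subject line, which fits the observation that the sysctl value makes almost no difference.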

Thanks,
Aaron


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Xin Long
>
> And I think we should be doing test on:
> commit a6c2f79287 ("sctp: implement prsctp TTL policy") (the bisected one)
> and
> commit 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt") (its 
> immediate parent)
> instead of Linus' master HEAD to avoid other factors.
>
The test results show they are almost the same:

826d253d57
==========
[root@localhost lxin]# sysctl -w net.sctp.prsctp_enable=1
net.sctp.prsctp_enable = 1
15484.93
15557.69
15395.61

[root@localhost lxin]# sysctl -w net.sctp.prsctp_enable=0
net.sctp.prsctp_enable = 0
15369.83
14419.81
15202.59


a6c2f79287
==========
[root@localhost lxin]# sysctl -w net.sctp.prsctp_enable=1
net.sctp.prsctp_enable = 1
15198.00
15567.87
16092.55

[root@localhost lxin]# sysctl -w net.sctp.prsctp_enable=0
net.sctp.prsctp_enable = 0
15624.70
15021.85
15390.62
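
A quick way to see that these runs are "almost the same" is to compare each configuration's mean with its run-to-run spread. The following Python sketch does that with the twelve figures listed above; the runs and summary names are only illustrative.

```python
from statistics import mean

# Three netperf throughput figures per (commit, net.sctp.prsctp_enable) pair,
# copied from the listings above.
runs = {
    ("826d253d57", 1): [15484.93, 15557.69, 15395.61],
    ("826d253d57", 0): [15369.83, 14419.81, 15202.59],
    ("a6c2f79287", 1): [15198.00, 15567.87, 16092.55],
    ("a6c2f79287", 0): [15624.70, 15021.85, 15390.62],
}

# For each configuration: (mean throughput, max - min across the three runs).
summary = {key: (mean(values), max(values) - min(values)) for key, values in runs.items()}

for (commit, prsctp_enable), (avg, spread) in summary.items():
    print(f"{commit} prsctp_enable={prsctp_enable}: mean {avg:.2f}, spread {spread:.2f}")
```

With only three runs per configuration this is a sanity check rather than a statistical test, but the differences between the commit means are smaller than the spread within a single configuration.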

You can also review commit a6c2f79287 if you have time:

When no policy is in use, it only adds some 'if()' checks in the
sending path. No policy was used in our test; I even added logging
in the kernel to check whether some unexpected policy was enabled.

But still no.

If you can reproduce this issue reliably, I suggest reverting parts of
that patch (it's really a small patch), rebuilding the kernel, and
trying again.
That way you can locate exactly which line triggered this issue.

Thanks.


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Xin Long
>>>
>>> I'm testing on Linus' master, can we all use that please?
>>>
>>
>> [git] git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>
>> [machine]
>> Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
>> mem 62G (66000220K)
>>
>> [system]
>> # cat /etc/redhat-release
>> Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)
>>
>> [commit 3684b03]
>> [root@hp-dl380pg8-11 lxin]# uname -r
>> 4.8.0-rc2.3684b03
>> [root@hp-dl380pg8-11 lxin]# cat test.sh
>> killall -0 netserver || netserver -4 &
>> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
>
> I just realized the test we are doing is not exactly the same.
> As the original report says:
> ip: ipv4
> runtime: 300s
> nr_threads: 200%
> cluster: cs-localhost
> send_size: 10K
> test: SCTP_STREAM_MANY
> cpufreq_governor: performance
>
> Note the nr_threads: 200%, which means to start 2 times of CPU number
> processes of netperf.
>
> In our IVB i3(2 cores, 2 threads per core) case, 8 netperf processes
> are started concurrently:
OK, understand.

>
> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>
> The throughput is the average of those runs.
>
> And I think we should be doing test on:
> commit a6c2f79287 ("sctp: implement prsctp TTL policy") (the bisected one)
> and
> commit 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt") (its 
> immediate parent)
> instead of Linus' master HEAD to avoid other factors.
>
OK, I will run the tests as you suggest now, but I need to rebuild again :D

Can you disable prsctp with "sysctl -w net.sctp.prsctp_enable=0",
then try again?


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Aaron Lu
On 08/16/2016 04:02 PM, Xin Long wrote:
>>
>> I'm testing on Linus' master, can we all use that please?
>>
> 
> [git] git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> 
> [machine]
> Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
> mem 62G (66000220K)
> 
> [system]
> # cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)
> 
> [commit 3684b03]
> [root@hp-dl380pg8-11 lxin]# uname -r
> 4.8.0-rc2.3684b03
> [root@hp-dl380pg8-11 lxin]# cat test.sh
> killall -0 netserver || netserver -4 &
> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1

I just realized the test we are doing is not exactly the same.
As the original report says:
ip: ipv4
runtime: 300s
nr_threads: 200%
cluster: cs-localhost
send_size: 10K
test: SCTP_STREAM_MANY
cpufreq_governor: performance

Note the nr_threads: 200%, which means starting twice as many netperf
processes as there are CPUs.

In our IVB i3 (2 cores, 2 threads per core) case, 8 netperf processes
are started concurrently:

2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &

The throughput is the average of those runs.
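
To spell out the arithmetic behind nr_threads: 200%, here is a minimal Python sketch. It is not how lkp-tests itself is implemented; the function names are hypothetical and the integer rounding is an assumption. It only shows how 4 logical CPUs and 200% give the 8 command lines above, and how the per-process results are averaged.

```python
import os

# The netperf command line from this report, started once per process.
NETPERF_CMD = "netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &"


def netperf_process_count(nr_threads_percent, cpus=None):
    """Number of netperf processes for a given nr_threads percentage.

    `cpus` defaults to the number of logical CPUs on this machine.
    Assumption: fractional results are rounded down.
    """
    if cpus is None:
        cpus = os.cpu_count() or 1
    return cpus * nr_threads_percent // 100


def netperf_commands(count):
    """One identical background command line per process."""
    return [NETPERF_CMD] * count


def average_throughput(per_process_mbps):
    """The reported figure is the mean over all concurrent processes."""
    return sum(per_process_mbps) / len(per_process_mbps)


# IVB i3: 2 cores x 2 threads per core = 4 logical CPUs.
count = netperf_process_count(200, cpus=4)
for cmd in netperf_commands(count):
    print(cmd)
```

A single netperf on a large Xeon therefore exercises a very different load pattern from 8 concurrent processes on 4 logical CPUs, which is the mismatch between the two setups.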

And I think we should be testing on:
commit a6c2f79287 ("sctp: implement prsctp TTL policy") (the bisected one)
and
commit 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt") (its 
immediate parent)
instead of Linus' master HEAD to avoid other factors.

Thanks,
Aaron


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Aaron Lu
On 08/16/2016 04:02 PM, Xin Long wrote:
>>
>> I'm testing on Linus' master, can we all use that please?
>>
> 
> [git] git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> 
> [machine]
> Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
> mem 62G (66000220K)
> 
> [system]
> # cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)
> 
> [commit 3684b03]
> [root@hp-dl380pg8-11 lxin]# uname -r
> 4.8.0-rc2.3684b03
> [root@hp-dl380pg8-11 lxin]# cat test.sh
> killall -0 netserver || netserver -4 &
> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
> [root@hp-dl380pg8-11 lxin]# sh test.sh
> SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> 127.0.0.1 () port 0 AF_INET
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> 
> 212992 212992  10240    300.00   16914.99    3.28     3.28     0.636   0.636
> 
> [commit f959fb4]
> [root@localhost lxin]# uname -r
> 4.7.0-rc6.f959fb4
> [root@localhost lxin]# cat test.sh
> killall -0 netserver || netserver -4 &
> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
> [root@localhost lxin]# sh test.sh
> SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> 127.0.0.1 () port 0 AF_INET
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> 
> 212992 212992  10240    300.00   12975.32    3.35     3.35     0.847   0.846
> 
> 
> Still, in my env, the latest kernel is better than old one.
> Sorry, I'm not sure why it's so different in your env.

Could the test have anything to do with the hardware? i.e. yours is Xeon
E5-2690 while mine is IVB i3?

> 
> Could you do 'netperf' test manually, instead of lkp-tests, then check again.

Manual testing under LKP is not easy, as those test machines all run
things automatically. But if you think that is necessary, I can do that.

> Pls show you system's distros as well, like rhel, ubuntu or arch ?

We do not use any of these distros.
The rootfs is derived from debian:
https://github.com/fengguang/reproduce-kernel-bug/blob/master/debian/debian-x86_64-2015-02-07.cgz

Thanks,
Aaron


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-16 Thread Xin Long
>
> I'm testing on Linus' master, can we all use that please?
>

[git] git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

[machine]
Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
mem 62G (66000220K)

[system]
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)

[commit 3684b03]
[root@hp-dl380pg8-11 lxin]# uname -r
4.8.0-rc2.3684b03
[root@hp-dl380pg8-11 lxin]# cat test.sh
killall -0 netserver || netserver -4 &
netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
[root@hp-dl380pg8-11 lxin]# sh test.sh
SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
127.0.0.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

212992 212992  10240    300.00   16914.99    3.28     3.28     0.636   0.636

[commit f959fb4]
[root@localhost lxin]# uname -r
4.7.0-rc6.f959fb4
[root@localhost lxin]# cat test.sh
killall -0 netserver || netserver -4 &
netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
[root@localhost lxin]# sh test.sh
SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
127.0.0.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

212992 212992  10240    300.00   12975.32    3.35     3.35     0.847   0.846


Still, in my environment the latest kernel performs better than the old one.
Sorry, I'm not sure why it is so different in yours.

Could you run the netperf test manually, instead of through lkp-tests, and
check again? Please also tell me which distro your system runs, e.g. RHEL,
Ubuntu or Arch.


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-15 Thread Aaron Lu

Any update on this, Long?

Regards,
Aaron

On 08/08/2016 10:10 AM, Aaron Lu wrote:
> On Fri, Aug 05, 2016 at 07:53:38PM +0800, Xin Long wrote:
>>>> It doesn't make much sense to me. the codes I added cannot be
>>>> triggered without enable any pr policies. and I also did the tests in
>>>
>>> It seems these pr policies has to be turned on by user space, i.e.
>>> netperf in this case?
>>>
>>> I checked netperf's source code, it doesn't seem set any option
>>> related to SCTP PR POLICY but I'm new to network code so I could be
>>> wrong or missing something.
>>>
>>>> my local environment,  the result looks normal to me compare to
>>>> prior version.
>>>
>>> Can you share your number?
>>> We run netperf like this:
>>> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
>>> The full log of the run is attached for your reference.
>>
>> Now I also changed to linux-net.git
>>
>> commit 96b585267f552d4b6a28ea8bd75e5ed03deb6e71
>> [root@hp-dl388g8-08 ~]# uname -r
>> 4.7.0.new
>> [root@hp-dl388g8-08 ~]# netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 --
>> -m 10K -H 127.0.0.1
>> SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
>> 127.0.0.1 () port 0 AF_INET
>> Recv   Send    Send                          Utilization       Service Demand
>> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
>> Size   Size    Size     Time     Throughput  local    remote   local   remote
>> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>>
>> 212992 212992  10240    300.00   11814.56    4.65     4.65     0.775   0.774
>>
>>
>> commit f959fb442c35f4b61fea341401b8463dd0a1b959 (just before the buggie 
>> patch)
> 
> I'm testing on Linus' master, can we all use that please?
> 
>> [root@localhost ~]# netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m
>> 10K -H 127.0.0.1
>> SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
>> 127.0.0.1 () port 0 AF_INET
>> Recv   Send    Send                          Utilization       Service Demand
>> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
>> Size   Size    Size     Time     Throughput  local    remote   local   remote
>> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>>
>> 212992 212992  10240    300.00   9454.90     5.22     5.22     1.086   1.085
>>
>>
>> I did tests on physical machine.
>> did you do it on guest ?
> 
> The test is done on a ivy-bridge desktop with 8G memory:
> # cpudesc : Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz
> # total memory : 8058152 kB
> 
>>
>>>

>>>> Recently the sctp performance is not stable,  as during these patches,
>>>> netperf cannot get the result, but return ENOTCONN. which may
>>>> also affect the testing. anyway we've fixed the -ENOTCONN issue
>>>> already in the latest version.
>>>
>>> I tested commit 96b585267f55, which is Linus' git tree HEAD on 08/03, I
>>> guess the fix you mentioned should already be in there? But
>>> unfortunately, the throughput of netperf is still at low number(we did
>>> the test 5 times):
>>> $ cat */netperf.json
>>> {
>>>   "netperf.Throughput_Mbps": [
>>> 2470.69748
>>>   ]
>>> }{
>>>   "netperf.Throughput_Mbps": [
>>> 2486.7675
>>>   ]
>>> }{
>>>   "netperf.Throughput_Mbps": [
>>> 2478.945
>>>   ]
>>> }{
>>>   "netperf.Throughput_Mbps": [
>>> 2429.465
>>>   ]
>>> }{
>>>   "netperf.Throughput_Mbps": [
>>> 2476.91504
>>>   ]
>>>
>>> Considering what you have said that the patch shouldn't make a
>>> difference, the performance drop is really confusing. Any idea what
>>> could be the cause? Thanks.
>> Now I saw your tests result against the new kernel
>>
>> Could you do the same test on the kernel before the problematic commit ?
> 
> Yes, the throughput of its parent commit is higer enough to trigger the
> automatic bisect and then we send out the report.
> 
> Throughput of its parent commit 826d253d57b1("sctp: add SCTP_PR_ASSOC_STATUS
> on sctp sockopt"):
> Average:
> "netperf.Throughput_Mbps": 3923.84375,
> 
> $ cat */netperf.json
> {
>   "netperf.Throughput_Mbps": [
> 3869.25375
>   ]
> }{
>   "netperf.Throughput_Mbps": [
> 3952.58875
>   ]
> }{
>   "netperf.Throughput_Mbps": [
> 3936.89625
>   ]
> }{
>   "netperf.Throughput_Mbps": [
> 3936.63625
>   ]
> }
> 
> Feel free to let me know if you need any more information or you want me
> to do more tests on other commits/machines, thanks.
> 
> Regards,
> Aaron
> 



Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-07 Thread Aaron Lu
On Fri, Aug 05, 2016 at 07:53:38PM +0800, Xin Long wrote:
> >> It doesn't make much sense to me. the codes I added cannot be
> >> triggered without enable any pr policies. and I also did the tests in
> >
> > It seems these pr policies has to be turned on by user space, i.e.
> > netperf in this case?
> >
> > I checked netperf's source code, it doesn't seem set any option
> > related to SCTP PR POLICY but I'm new to network code so I could be
> > wrong or missing something.
> >
> >> my local environment,  the result looks normal to me compare to
> >> prior version.
> >
> > Can you share your number?
> > We run netperf like this:
> > netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
> > The full log of the run is attached for your reference.
> 
> Now I also changed to linux-net.git
> 
> commit 96b585267f552d4b6a28ea8bd75e5ed03deb6e71
> [root@hp-dl388g8-08 ~]# uname -r
> 4.7.0.new
> [root@hp-dl388g8-08 ~]# netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 --
> -m 10K -H 127.0.0.1
> SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> 127.0.0.1 () port 0 AF_INET
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>
> 212992 212992  10240    300.00   11814.56    4.65     4.65     0.775   0.774
> 
> 
> commit f959fb442c35f4b61fea341401b8463dd0a1b959 (just before the buggie patch)

I'm testing on Linus' master, can we all use that please?

> [root@localhost ~]# netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m
> 10K -H 127.0.0.1
> SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> 127.0.0.1 () port 0 AF_INET
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>
> 212992 212992  10240    300.00   9454.90     5.22     5.22     1.086   1.085
> 
> 
> I did tests on physical machine.
> did you do it on guest ?

The test was done on an Ivy Bridge desktop with 8G memory:
# cpudesc : Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz
# total memory : 8058152 kB

> 
> >
> >>
> >> Recently the sctp performance is not stable,  as during these patches,
> >> netperf cannot get the result, but return ENOTCONN. which may
> >> also affect the testing. anyway we've fixed the -ENOTCONN issue
> >> already in the latest version.
> >
> > I tested commit 96b585267f55, which is Linus' git tree HEAD on 08/03, I
> > guess the fix you mentioned should already be in there? But
> > unfortunately, the throughput of netperf is still at low number(we did
> > the test 5 times):
> > $ cat */netperf.json
> > {
> >   "netperf.Throughput_Mbps": [
> > 2470.69748
> >   ]
> > }{
> >   "netperf.Throughput_Mbps": [
> > 2486.7675
> >   ]
> > }{
> >   "netperf.Throughput_Mbps": [
> > 2478.945
> >   ]
> > }{
> >   "netperf.Throughput_Mbps": [
> > 2429.465
> >   ]
> > }{
> >   "netperf.Throughput_Mbps": [
> > 2476.91504
> >   ]
> >
> > Considering what you have said that the patch shouldn't make a
> > difference, the performance drop is really confusing. Any idea what
> > could be the cause? Thanks.
> Now I saw your tests result against the new kernel
> 
> Could you do the same test on the kernel before the problematic commit ?

Yes, the throughput of its parent commit is enough higher to trigger the
automatic bisect, after which we sent out the report.

Throughput of its parent commit 826d253d57b1("sctp: add SCTP_PR_ASSOC_STATUS
on sctp sockopt"):
Average:
"netperf.Throughput_Mbps": 3923.84375,

$ cat */netperf.json
{
  "netperf.Throughput_Mbps": [
3869.25375
  ]
}{
  "netperf.Throughput_Mbps": [
3952.58875
  ]
}{
  "netperf.Throughput_Mbps": [
3936.89625
  ]
}{
  "netperf.Throughput_Mbps": [
3936.63625
  ]
}
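
As a sanity check on the figures posted in this thread, the per-run values can be averaged and compared directly. The sketch below is not part of lkp-tests; the variable and function names are made up for illustration, and the numbers are copied from the two `cat */netperf.json` listings quoted in this message.

```python
from statistics import mean

# Per-run netperf.Throughput_Mbps values as posted in this thread.
parent_runs = [3869.25375, 3952.58875, 3936.89625, 3936.63625]       # commit 826d253d57b1
head_runs = [2470.69748, 2486.7675, 2478.945, 2429.465, 2476.91504]  # commit 96b585267f55


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


parent_avg = mean(parent_runs)
head_avg = mean(head_runs)
change = percent_change(parent_avg, head_avg)

print(f"parent {parent_avg:.2f} Mbps, head {head_avg:.2f} Mbps, change {change:.1f}%")
```

The drop computed this way comes from these two sets of runs only, so it need not match the robot's -37.2% to the last digit, but it lands close to it.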

Feel free to let me know if you need any more information or you want me
to do more tests on other commits/machines, thanks.

Regards,
Aaron


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-05 Thread Xin Long
>> It doesn't make much sense to me. the codes I added cannot be
>> triggered without enable any pr policies. and I also did the tests in
>
> It seems these pr policies has to be turned on by user space, i.e.
> netperf in this case?
>
> I checked netperf's source code, it doesn't seem set any option
> related to SCTP PR POLICY but I'm new to network code so I could be
> wrong or missing something.
>
>> my local environment,  the result looks normal to me compare to
>> prior version.
>
> Can you share your number?
> We run netperf like this:
> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
> The full log of the run is attached for your reference.

Now I also changed to linux-net.git

commit 96b585267f552d4b6a28ea8bd75e5ed03deb6e71
[root@hp-dl388g8-08 ~]# uname -r
4.7.0.new
[root@hp-dl388g8-08 ~]# netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
127.0.0.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

212992 212992  10240    300.00   11814.56    4.65     4.65     0.775   0.774


commit f959fb442c35f4b61fea341401b8463dd0a1b959 (just before the buggy patch)
[root@localhost ~]# netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
127.0.0.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

212992 212992  10240    300.00   9454.90     5.22     5.22     1.086   1.085


I ran the tests on a physical machine.
Did you run yours in a guest?

>
>>
>> Recently the sctp performance is not stable,  as during these patches,
>> netperf cannot get the result, but return ENOTCONN. which may
>> also affect the testing. anyway we've fixed the -ENOTCONN issue
>> already in the latest version.
>
> I tested commit 96b585267f55, which is Linus' git tree HEAD on 08/03, I
> guess the fix you mentioned should already be in there? But
> unfortunately, the throughput of netperf is still at low number(we did
> the test 5 times):
> $ cat */netperf.json
> {
>   "netperf.Throughput_Mbps": [
> 2470.69748
>   ]
> }{
>   "netperf.Throughput_Mbps": [
> 2486.7675
>   ]
> }{
>   "netperf.Throughput_Mbps": [
> 2478.945
>   ]
> }{
>   "netperf.Throughput_Mbps": [
> 2429.465
>   ]
> }{
>   "netperf.Throughput_Mbps": [
> 2476.91504
>   ]
>
> Considering what you have said that the patch shouldn't make a
> difference, the performance drop is really confusing. Any idea what
> could be the cause? Thanks.
Now I see your test results against the new kernel.

Could you run the same test on the kernel just before the problematic commit?


Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

2016-08-04 Thread Aaron Lu
On Thu, Jul 28, 2016 at 03:01:36PM +0800, Xin Long wrote:
> On Wed, Jul 27, 2016 at 9:54 AM, kernel test robot
>  wrote:
> >
> > FYI, we noticed a -37.2% regression of netperf.Throughput_Mbps due to 
> > commit:
> >
> > commit a6c2f792873aff332a4689717c3cd6104f46684c ("sctp: implement prsctp 
> > TTL policy")
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> >
> > in testcase: netperf
> > on test machine: 4 threads Ivy Bridge with 8G memory
> > with following parameters:
> >
> > ip: ipv4
> > runtime: 300s
> > nr_threads: 200%
> > cluster: cs-localhost
> > send_size: 10K
> > test: SCTP_STREAM_MANY
> > cpufreq_governor: performance
> >
> >
> >
> > Disclaimer:
> > Results have been estimated based on internal Intel analysis and are 
> > provided
> > for informational purposes only. Any difference in system hardware or 
> > software
> > design or configuration may affect actual performance.
> >
> It doesn't make much sense to me. the codes I added cannot be
> triggered without enable any pr policies. and I also did the tests in

It seems these pr policies have to be turned on by user space, i.e.
netperf in this case?

I checked netperf's source code, and it doesn't seem to set any option
related to SCTP PR policy, but I'm new to network code so I could be
wrong or missing something.

> my local environment,  the result looks normal to me compare to
> prior version.

Can you share your number?
We run netperf like this:
netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
The full log of the run is attached for your reference.
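
For anyone reproducing this by hand: the attached log shows eight netperf instances started at the same timestamp, which is consistent with reading `nr_threads: 200%` as twice the CPU thread count on the 4-thread test machine. That reading is an assumption here, not something taken from lkp-tests itself. A minimal sketch of launching the same load outside LKP (the helper names are hypothetical, and netserver must already be running):

```python
import subprocess

# Same command line as used in the LKP job.
NETPERF_ARGS = ["netperf", "-4", "-t", "SCTP_STREAM_MANY", "-c", "-C",
                "-l", "300", "--", "-m", "10K", "-H", "127.0.0.1"]


def instance_count(nr_threads_percent, cpu_threads):
    """Number of parallel instances, assuming the percentage is relative to CPU threads."""
    return cpu_threads * nr_threads_percent // 100


def run_parallel(n, args=NETPERF_ARGS):
    """Start n netperf processes at once and return each one's stdout."""
    procs = [subprocess.Popen(args, stdout=subprocess.PIPE, text=True)
             for _ in range(n)]
    # netperf prints only a short report, so reading the pipes one by one is fine.
    return [p.communicate()[0] for p in procs]


n_instances = instance_count(200, 4)  # 200% of 4 CPU threads
```

Calling `run_parallel(n_instances)` then gives one report per instance, which makes it easy to compare a single-stream run with the loaded case.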

> 
> Recently the sctp performance is not stable,  as during these patches,
> netperf cannot get the result, but return ENOTCONN. which may
> also affect the testing. anyway we've fixed the -ENOTCONN issue
> already in the latest version.

I tested commit 96b585267f55, which was the HEAD of Linus' git tree on 08/03,
so I guess the fix you mentioned should already be in there? Unfortunately,
netperf throughput is still low (we ran the test 5 times):
$ cat */netperf.json
{
  "netperf.Throughput_Mbps": [
2470.69748
  ]
}{
  "netperf.Throughput_Mbps": [
2486.7675
  ]
}{
  "netperf.Throughput_Mbps": [
2478.945
  ]
}{
  "netperf.Throughput_Mbps": [
2429.465
  ]
}{
  "netperf.Throughput_Mbps": [
2476.91504
  ]
}

Given what you said, that the patch shouldn't make a difference, the
performance drop is really confusing. Any idea what could be the cause?
Thanks.
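
A note for anyone post-processing these numbers: `cat */netperf.json` prints several JSON objects back to back, so the combined output is not one valid JSON document. A minimal sketch of splitting such a stream and averaging the runs (the function name is hypothetical and this is not how lkp-tests does it; the sample is the five values above with every object closed):

```python
import json


def parse_concatenated(text):
    """Yield each JSON object from a string of objects written back to back."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


values = [2470.69748, 2486.7675, 2478.945, 2429.465, 2476.91504]
# Rebuild the "}{"-joined stream that cat produces from the per-run files.
sample = "".join(
    '{\n  "netperf.Throughput_Mbps": [\n%s\n  ]\n}' % v for v in values
)

runs = [obj["netperf.Throughput_Mbps"][0] for obj in parse_concatenated(sample)]
average = sum(runs) / len(runs)
print(f"{len(runs)} runs, average {average:.2f} Mbps")
```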

Regards,
Aaron
2016-08-04 16:12:43 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-08-04 16:12:43 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-08-04 16:12:43 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-08-04 16:12:43 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-08-04 16:12:43 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-08-04 16:12:43 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-08-04 16:12:43 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
2016-08-04 16:12:43 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 127.0.0.1 () port 0 AF_INET : demo
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

212992 212992  10240    300.00   2373.19     51.14    51.14    7.061   7.061
SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 127.0.0.1 () port 0 AF_INET : demo
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

212992 212992  10240    300.00   2374.59     51.14    51.11    7.057   7.053
SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 127.0.0.1 () port 0 AF_INET : demo
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

212992 212992  10240    300.00   2633.32     51.14    51.11    6.364   6.360
SCTP 1-TO-MANY STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 127.0.0.1 
() port 0 AF_INET : demo
Recv   SendSend  Utilization   Service Demand
Socket Socket  Message  Elapsed  Se