RE: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode

2018-01-08 Thread Chris Mi
> -Original Message-
> From: n...@orbyte.nwl.cc [mailto:n...@orbyte.nwl.cc] On Behalf Of Phil
> Sutter
> Sent: Monday, January 8, 2018 9:32 PM
> To: Chris Mi <chr...@mellanox.com>
> Cc: dsah...@gmail.com; marcelo.leit...@gmail.com;
> netdev@vger.kernel.org; gerlitz...@gmail.com;
> step...@networkplumber.org
> Subject: Re: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode
> 
> Hi Chris,
> 
> On Mon, Jan 08, 2018 at 02:03:53AM +, Chris Mi wrote:
> > > On Thu, Jan 04, 2018 at 04:34:51PM +0900, Chris Mi wrote:
> > > > The insertion rate is improved more than 10%.
> > >
> > > Did you measure the effect of increasing batch sizes?
> > Yes. Even if we enlarge the batch size beyond 10, there is no big
> > improvement. I think that's because the current kernel doesn't process
> > the requests in parallel. If the kernel processed the requests in
> > parallel, I believe specifying a bigger batch size would get a better
> > result.
> 
> But throughput doesn't regress at some point, right? I think that's the
> critical aspect when considering an "unlimited" batch size.
> 
> On Mon, Jan 08, 2018 at 08:00:00AM +, Chris Mi wrote:
> > After testing, I find that the message passed to the kernel should not
> > be too big. If it is bigger than about 64K, sendmsg returns -1 and
> > errno is 90 (EMSGSIZE). That is about 400 commands. So how about
> > setting the batch size to 128, which is big enough?
> 
> If that's the easiest way, why not. At first, I thought one could maybe send
> the collected messages in chunks of suitable size, but that's probably not
> worth the effort.
I did a test. If we read a million commands into memory and send them in
chunks of 128, we get a big regression: it takes about 21 seconds.




RE: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode

2018-01-08 Thread Chris Mi
> -Original Message-
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Monday, January 8, 2018 11:40 PM
> To: Chris Mi <chr...@mellanox.com>
> Cc: David Ahern <dsah...@gmail.com>; Phil Sutter <p...@nwl.cc>;
> marcelo.leit...@gmail.com; netdev@vger.kernel.org; gerlitz...@gmail.com
> Subject: Re: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode
> 
> On Mon, 8 Jan 2018 08:00:00 +
> Chris Mi <chr...@mellanox.com> wrote:
> 
> > > >> I wonder whether specifying the batch size is necessary at all.
> > > >> Couldn't batch mode just collect messages until either EOF or an
> > > >> incompatible command is encountered which then triggers a commit
> > > >> to kernel? This might simplify code quite a bit.
> > > > That's a good suggestion.
> > >
> > > Thanks for your time on this, Chris.
> > After testing, I find that the message passed to the kernel should not
> > be too big. If it is bigger than about 64K, sendmsg returns -1 and
> > errno is 90 (EMSGSIZE). That is about 400 commands. So how about
> > setting the batch size to 128, which is big enough?
> 
> 
> Use sendmmsg?
Maybe we can try that, but there is also a limit on it.


RE: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode

2018-01-08 Thread Chris Mi
> -Original Message-
> From: n...@orbyte.nwl.cc [mailto:n...@orbyte.nwl.cc] On Behalf Of Phil
> Sutter
> Sent: Monday, January 8, 2018 9:32 PM
> To: Chris Mi <chr...@mellanox.com>
> Cc: dsah...@gmail.com; marcelo.leit...@gmail.com;
> netdev@vger.kernel.org; gerlitz...@gmail.com;
> step...@networkplumber.org
> Subject: Re: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode
> 
> Hi Chris,
> 
> On Mon, Jan 08, 2018 at 02:03:53AM +, Chris Mi wrote:
> > > On Thu, Jan 04, 2018 at 04:34:51PM +0900, Chris Mi wrote:
> > > > The insertion rate is improved more than 10%.
> > >
> > > Did you measure the effect of increasing batch sizes?
> > Yes. Even if we enlarge the batch size beyond 10, there is no big
> > improvement. I think that's because the current kernel doesn't process
> > the requests in parallel. If the kernel processed the requests in
> > parallel, I believe specifying a bigger batch size would get a better
> > result.
> 
> But throughput doesn't regress at some point, right? I think that's the
> critical aspect when considering an "unlimited" batch size.
Yes.
> 
> On Mon, Jan 08, 2018 at 08:00:00AM +, Chris Mi wrote:
> > After testing, I find that the message passed to the kernel should not
> > be too big. If it is bigger than about 64K, sendmsg returns -1 and
> > errno is 90 (EMSGSIZE). That is about 400 commands. So how about
> > setting the batch size to 128, which is big enough?
> 
> If that's the easiest way, why not. At first, I thought one could maybe send
> the collected messages in chunks of suitable size, but that's probably not
> worth the effort.
OK.

-Chris


Re: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode

2018-01-08 Thread Stephen Hemminger
On Mon, 8 Jan 2018 08:00:00 +
Chris Mi  wrote:

> > >> I wonder whether specifying the batch size is necessary at all.
> > >> Couldn't batch mode just collect messages until either EOF or an
> > >> incompatible command is encountered which then triggers a commit to
> > >> kernel? This might simplify code quite a bit.  
> > > That's a good suggestion.  
> > 
> > Thanks for your time on this, Chris.  
> After testing, I find that the message passed to the kernel should not be
> too big. If it is bigger than about 64K, sendmsg returns -1 and errno is
> 90 (EMSGSIZE). That is about 400 commands. So how about setting the batch
> size to 128, which is big enough?


Use sendmmsg?


Re: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode

2018-01-08 Thread Phil Sutter
Hi Chris,

On Mon, Jan 08, 2018 at 02:03:53AM +, Chris Mi wrote:
> > On Thu, Jan 04, 2018 at 04:34:51PM +0900, Chris Mi wrote:
> > > The insertion rate is improved more than 10%.
> > 
> > Did you measure the effect of increasing batch sizes?
> Yes. Even if we enlarge the batch size beyond 10, there is no big
> improvement. I think that's because the current kernel doesn't process
> the requests in parallel. If the kernel processed the requests in
> parallel, I believe specifying a bigger batch size would get a better
> result.

But throughput doesn't regress at some point, right? I think that's the
critical aspect when considering an "unlimited" batch size.

On Mon, Jan 08, 2018 at 08:00:00AM +, Chris Mi wrote:
> After testing, I find that the message passed to the kernel should not be
> too big. If it is bigger than about 64K, sendmsg returns -1 and errno is
> 90 (EMSGSIZE). That is about 400 commands. So how about setting the batch
> size to 128, which is big enough?

If that's the easiest way, why not. At first, I thought one could maybe
send the collected messages in chunks of suitable size, but that's
probably not worth the effort.

Cheers, Phil


RE: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode

2018-01-08 Thread Chris Mi
> >> I wonder whether specifying the batch size is necessary at all.
> >> Couldn't batch mode just collect messages until either EOF or an
> >> incompatible command is encountered which then triggers a commit to
> >> kernel? This might simplify code quite a bit.
> > That's a good suggestion.
> 
> Thanks for your time on this, Chris.
After testing, I find that the message passed to the kernel should not be
too big. If it is bigger than about 64K, sendmsg returns -1 and errno is
90 (EMSGSIZE). That is about 400 commands. So how about setting the batch
size to 128, which is big enough?


Re: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode

2018-01-07 Thread David Ahern
On 1/7/18 7:03 PM, Chris Mi wrote:

>> Did you measure the effect of increasing batch sizes?
> Yes. Even if we enlarge the batch size beyond 10, there is no big
> improvement.

That will change over time so the tc command should allow auto-batching
to work up to the message size limit.


> I think that's because the current kernel doesn't process the requests
> in parallel. If the kernel processed the requests in parallel, I believe
> specifying a bigger batch size would get a better result.
>>
>> I wonder whether specifying the batch size is necessary at all. Couldn't 
>> batch
>> mode just collect messages until either EOF or an incompatible command is
>> encountered which then triggers a commit to kernel? This might simplify
>> code quite a bit.
> That's a good suggestion.

Thanks for your time on this, Chris.


RE: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode

2018-01-07 Thread Chris Mi
> -Original Message-
> From: n...@orbyte.nwl.cc [mailto:n...@orbyte.nwl.cc] On Behalf Of Phil
> Sutter
> Sent: Saturday, January 6, 2018 1:25 AM
> To: Chris Mi <chr...@mellanox.com>
> Cc: netdev@vger.kernel.org; gerlitz...@gmail.com;
> step...@networkplumber.org; dsah...@gmail.com;
> marcelo.leit...@gmail.com
> Subject: Re: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode
> 
> Hi Chris,
> 
> On Thu, Jan 04, 2018 at 04:34:51PM +0900, Chris Mi wrote:
> > Currently in tc batch mode, only one command is read from the batch
> > file and sent to kernel to process. With this patchset, we can
> > accumulate several commands before sending to kernel. The batch size
> > is specified using option -bs or -batchsize.
> >
> > To accumulate the commands in tc, the client should allocate an array
> > of struct iovec. If the batch size is bigger than 1, the client calls
> > rtnl_talk_msg to send the message that includes the iov array only
> > after it has accumulated enough commands. The one exception is when
> > there are no more commands in the batch file.
> >
> > But please note that the kernel still processes the requests one by
> > one; processing them in parallel in the kernel is a separate effort.
> > What this patchset saves is user/kernel mode context switches, so it
> > works on top of the current kernel.
> >
> > Using the following script in kernel, we can generate 1,000,000 rules.
> > tools/testing/selftests/tc-testing/tdc_batch.py
> >
> > Without this patchset, 'tc -b $file' execution time is:
> >
> > real0m15.555s
> > user0m7.211s
> > sys 0m8.284s
> >
> > With this patchset, 'tc -b $file -bs 10' execution time is:
> >
> > real0m13.043s
> > user0m6.479s
> > sys 0m6.504s
> >
> > The insertion rate is improved more than 10%.
> 
> Did you measure the effect of increasing batch sizes?
Yes. Even if we enlarge the batch size beyond 10, there is no big
improvement. I think that's because the current kernel doesn't process the
requests in parallel. If the kernel processed the requests in parallel, I
believe specifying a bigger batch size would get a better result.
> 
> I wonder whether specifying the batch size is necessary at all. Couldn't batch
> mode just collect messages until either EOF or an incompatible command is
> encountered which then triggers a commit to kernel? This might simplify
> code quite a bit.
That's a good suggestion.

-Chris
> 
> Cheers, Phil


Re: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode

2018-01-05 Thread Marcelo Ricardo Leitner
On Fri, Jan 05, 2018 at 10:27:52AM -0700, David Ahern wrote:
> On 1/5/18 10:25 AM, Phil Sutter wrote:
> > I wonder whether specifying the batch size is necessary at all. Couldn't
> > batch mode just collect messages until either EOF or an incompatible
> > command is encountered which then triggers a commit to kernel? This
> > might simplify code quite a bit.
> > 
> 
> agreed.

Agreed too, especially now that it supports flushing because of other
rules.

  Marcelo


Re: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode

2018-01-05 Thread David Ahern
On 1/5/18 10:25 AM, Phil Sutter wrote:
> I wonder whether specifying the batch size is necessary at all. Couldn't
> batch mode just collect messages until either EOF or an incompatible
> command is encountered which then triggers a commit to kernel? This
> might simplify code quite a bit.
> 

agreed.


Re: [patch iproute2 v6 0/3] tc: Add -bs option to batch mode

2018-01-05 Thread Phil Sutter
Hi Chris,

On Thu, Jan 04, 2018 at 04:34:51PM +0900, Chris Mi wrote:
> Currently in tc batch mode, only one command is read from the batch
> file and sent to kernel to process. With this patchset, we can accumulate
> several commands before sending to kernel. The batch size is specified
> using option -bs or -batchsize.
> 
> To accumulate the commands in tc, the client should allocate an array
> of struct iovec. If the batch size is bigger than 1, the client calls
> rtnl_talk_msg to send the message that includes the iov array only
> after it has accumulated enough commands. The one exception is when
> there are no more commands in the batch file.
> 
> But please note that the kernel still processes the requests one by one;
> processing them in parallel in the kernel is a separate effort. What this
> patchset saves is user/kernel mode context switches, so it works on top
> of the current kernel.
> 
> Using the following script in kernel, we can generate 1,000,000 rules.
>   tools/testing/selftests/tc-testing/tdc_batch.py
> 
> Without this patchset, 'tc -b $file' execution time is:
> 
> real0m15.555s
> user0m7.211s
> sys 0m8.284s
> 
> With this patchset, 'tc -b $file -bs 10' execution time is:
> 
> real0m13.043s
> user0m6.479s
> sys 0m6.504s
> 
> The insertion rate is improved more than 10%.

Did you measure the effect of increasing batch sizes?

I wonder whether specifying the batch size is necessary at all. Couldn't
batch mode just collect messages until either EOF or an incompatible
command is encountered which then triggers a commit to kernel? This
might simplify code quite a bit.

Cheers, Phil