Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-08 Thread Wido den Hollander
Hi,

To conclude: after a lot of debugging work we were able to reduce the 
deployment time of our VRs from ~2 hours to ~5 MINUTES.

Two PRs are open for this against the 4.9 branch:

- https://github.com/apache/cloudstack/pull/2077
- https://github.com/apache/cloudstack/pull/2089

The problem was that in Basic Networking each VR would get ALL DHCP information 
instead of just the information for that POD (PR 2077).
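
The idea behind that fix, as a minimal sketch (the helper below and its names 
are illustrative, not the actual management-server code):

'''
# Minimal, illustrative sketch of the PR 2077 idea; not the actual
# CloudStack management-server code.
def dhcp_entries_for_router(all_entries, router_pod_id, basiczone_updates="pod"):
    """Select which DHCP entries to push to a VR in Basic Networking.

    all_entries: iterable of dicts with at least a 'pod_id' key.
    basiczone_updates: mirrors the 'network.dns.basiczone.updates' setting;
    'all' pushes the whole zone, 'pod' only the VR's own pod.
    """
    if basiczone_updates == "all":
        return list(all_entries)
    return [e for e in all_entries if e["pod_id"] == router_pod_id]
'''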

The other issue was that dnsmasq and Apache would be restarted for every entry, 
together with some other work. By delaying this until the end of the router 
provisioning we save a lot of time (PR 2089).
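
The deferral boils down to marking services dirty while entries are processed 
and restarting each one only once at the end; a minimal sketch (names are 
illustrative, not the actual PR code):

'''
# Minimal, illustrative sketch of the restart-deferral idea; not the
# actual VR config script code.
import os

class ProvisionSession(object):
    def __init__(self):
        self.dirty = set()  # services that need a restart at the end

    def process_entry(self, entry):
        # ... write the dnsmasq host entry, Apache metadata, etc. ...
        self.dirty.add("dnsmasq")
        self.dirty.add("apache2")

    def finish(self):
        # One restart per service for the whole batch, instead of one
        # restart per entry.
        for service in self.dirty:
            os.system("service %s restart" % service)
'''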

Both PRs are running in production on our cloud in Basic Networking with a few 
thousand Instances behind it. No problems found so far.

Wido

> On 2 May 2017 at 19:57, Wido den Hollander wrote:
> 
> 
> Hi,
> 
> Last night I upgraded a CloudStack 4.5.2 setup to 4.9.2.0. All went well, but 
> the VR provisioning is terribly slow which causes all kinds of problems.
> 
> The vr_cfg.sh and update_config.py scripts start to run. Restart dnsmasq, add 
> metadata, etc.
> 
> But for just 1800 hosts this can take up to 2 hours and that causes timeouts 
> in the management server and other problems.
> 
> 2 hours is just very, very slow. So I am starting to wonder if something is 
> wrong here.
> 
> Did anybody else see this?
> 
> Running Basic Networking with CloudStack 4.9.2.0
> 
> Wido


Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-07 Thread Daan Hoogland
As far as I can tell there was a missing license header and no license issue at 
all. On the other hand, both didn't pass Jenkins or Travis, and I did not 
analyse the code fully enough, so they could do with an intensive review, both 
having failed their checks at merge.

https://github.com/apache/cloudstack/pull/2083
https://github.com/apache/cloudstack/pull/2084

On 04/05/17 13:33, "Wido den Hollander" <w...@widodh.nl> wrote:

Here we go: https://github.com/apache/cloudstack/pull/2077

That works really well for us. 70% fewer DHCP entries in our VR in Basic 
Networking since only the entries for that POD are sent to the VR.

Wido

> On 4 May 2017 at 12:12, Wido den Hollander <w...@widodh.nl> wrote:
> > On 4 May 2017 at 11:11, Wei ZHOU <ustcweiz...@gmail.com> wrote:
> > > 2017-05-04 10:48 GMT+02:00 Wido den Hollander <w...@widodh.nl>:
> > > > On 3 May 2017 at 14:49, Daan Hoogland <daan.hoogl...@gmail.com> wrote:
> > > > Happy to pick this up, Remi. I'm travelling now but will look at both on
> > > > Friday.
> > > > On 3 May 2017 2:25 pm, "Remi Bergsma" <rberg...@schubergphilis.com> wrote:
> > > > > On 03/05/2017, 13:44, "Rohit Yadav" <rohit.ya...@shapeblue.com> wrote:
> > > > > From: Remi Bergsma <rberg...@schubergphilis.com>
> > > > > Sent: 03 May 2017 16:58:18
> > > > > To: dev@cloudstack.apache.org
> > > > > Subject: Re: Very slow Virtual Router provisioning with 4.9.2.0
> > > > >
> > > > > Hi,
> > > > >
> > > > > The patches I talked about:
> > > > >
> > > > > 1) Iptables speed improvement
> > > > > https://github.com/apache/cloudstack/pull/1482
> > > > > Was reverted due to a licensing issue.
> > > > >
> > > > > 2) Passwd speed improvement
> > > > > https://github.com/MissionCriticalCloudOldRepos/cosmic-core/pull/138
> > > > >
> > > > > By now, these are rather old patches so they need some work before
> > > > > they apply to CloudStack again.
> > > > >
 





Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-04 Thread Wido den Hollander
Here we go: https://github.com/apache/cloudstack/pull/2077

That works really well for us. 70% fewer DHCP entries in our VR in Basic 
Networking since only the entries for that POD are sent to the VR.

Wido

> On 4 May 2017 at 12:12, Wido den Hollander <w...@widodh.nl> wrote:
> 
> 
> Hi,
> 
> Yes, we are working on a few low-hanging-fruit fixes, like checking whether the 
> last restart of dnsmasq was < 10 sec ago. If so, skip the restart.
> 
> Will report back once we have anything.
> 
> Wido
> 
> > On 4 May 2017 at 11:11, Wei ZHOU <ustcweiz...@gmail.com> wrote:
> > 
> > 
> > Hi Wido,
> > 
> > A simple improvement is: do not wait while restarting the dnsmasq service in the VR.
> > 
> > 
> > '''
> > diff --git a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py
> > index 95d2eff..999be8f 100755
> > --- a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py
> > +++ b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py
> > @@ -59,7 +59,7 @@ class CsDhcp(CsDataBag):
> > 
> >          # We restart DNSMASQ every time the configure.py is called in order to avoid lease problems.
> >          if not self.cl.is_redundant() or self.cl.is_master():
> > -            CsHelper.service("dnsmasq", "restart")
> > +            CsHelper.execute3("service dnsmasq restart")
> > 
> >      def configure_server(self):
> >          # self.conf.addeq("dhcp-hostsfile=%s" % DHCP_HOSTS)
> > diff --git a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py
> > index a8ccea2..b06bde3 100755
> > --- a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py
> > +++ b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py
> > @@ -191,6 +191,11 @@ def execute2(command):
> >      p.wait()
> >      return p
> > 
> > +def execute3(command):
> > +    """ Execute command """
> > +    logging.debug("Executing: %s" % command)
> > +    p = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
> > +    return p
> > 
> >  def service(name, op):
> >      execute("service %s %s" % (name, op))
> > '''
> > 
> > -Wei
> > 
> > 
> > 2017-05-04 10:48 GMT+02:00 Wido den Hollander <w...@widodh.nl>:
> > 
> > > Thanks Daan, Remi.
> > >
> > > I found an additional bug where it seems that 'network.dns.basiczone.updates'
> > > isn't read when sending DHCP settings in Basic Networking.
> > >
> > > This means that the VR gets all DHCP settings for the whole zone instead of
> > > just for that POD.
> > >
> > > In this case some VRs we have get ~2k DHCP entries sent to them, which
> > > causes a large slowdown.
> > >
> > > Wido
> > >
> > > > On 3 May 2017 at 14:49, Daan Hoogland <daan.hoogl...@gmail.com> wrote:
> > > >
> > > >
> > > > Happy to pick this up, Remi. I'm travelling now but will look at both on
> > > > Friday.
> > > >
> > > > Bilingual autocorrect in use. Read at your own risk.
> > > >
> > > > On 3 May 2017 2:25 pm, "Remi Bergsma" <rberg...@schubergphilis.com>
> > > wrote:
> > > >
> > > > > Always happy to share, but I won’t have time to work on porting this 
> > > > > to
> > > > > CloudStack any time soon.
> > > > >
> > > > > Regards, Remi
> > > > >
> > > > >
> > > > > On 03/05/2017, 13:44, "Rohit Yadav" <rohit.ya...@shapeblue.com> wrote:
> > > > >
> > > > > Hi Remi, thanks for sharing. We would love to have those changes
> > > (for
> > > > > 4.9+), looking forward to your pull requests.
> > > > >
> > > > >
> > > > > Regards.
> > > > >
> > > > > 
> > > > > From: Remi Bergsma <rberg...@schubergphilis.com>
> > > > > Sent: 03 May 2017 16:58:18
> > > > > To: dev@cloudstack.apache.org
> > > > > Subject: Re: Very slow Virtual Router provisioning with 4.9.2.0
> > > > >
> > > > > Hi,
> > > > >
> > > > > The patc

Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-04 Thread Wido den Hollander
Hi,

Yes, we are working on a few low-hanging-fruit fixes, like checking whether the 
last restart of dnsmasq was < 10 sec ago. If so, skip the restart.
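
A minimal sketch of such a throttle, using a stamp file (the path is 
illustrative, not what we will actually ship):

'''
# Illustrative sketch of the restart throttle; the stamp-file path is
# an assumption, not the actual implementation.
import os
import time

STAMP = "/var/cache/cloud/dnsmasq.last_restart"

def restart_dnsmasq_throttled(min_interval=10):
    try:
        if time.time() - os.path.getmtime(STAMP) < min_interval:
            return  # restarted less than min_interval seconds ago: skip
    except OSError:
        pass  # no stamp file yet: fall through and restart
    os.system("service dnsmasq restart")
    open(STAMP, "w").close()  # touch the stamp file
'''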

Will report back once we have anything.

Wido

> On 4 May 2017 at 11:11, Wei ZHOU <ustcweiz...@gmail.com> wrote:
> 
> 
> Hi Wido,
> 
> A simple improvement is: do not wait while restarting the dnsmasq service in the VR.
> 
> 
> '''
> diff --git a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py
> index 95d2eff..999be8f 100755
> --- a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py
> +++ b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py
> @@ -59,7 +59,7 @@ class CsDhcp(CsDataBag):
> 
>          # We restart DNSMASQ every time the configure.py is called in order to avoid lease problems.
>          if not self.cl.is_redundant() or self.cl.is_master():
> -            CsHelper.service("dnsmasq", "restart")
> +            CsHelper.execute3("service dnsmasq restart")
> 
>      def configure_server(self):
>          # self.conf.addeq("dhcp-hostsfile=%s" % DHCP_HOSTS)
> diff --git a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py
> index a8ccea2..b06bde3 100755
> --- a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py
> +++ b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py
> @@ -191,6 +191,11 @@ def execute2(command):
>      p.wait()
>      return p
> 
> +def execute3(command):
> +    """ Execute command """
> +    logging.debug("Executing: %s" % command)
> +    p = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
> +    return p
> 
>  def service(name, op):
>      execute("service %s %s" % (name, op))
> '''
> 
> -Wei
> 
> 
> 2017-05-04 10:48 GMT+02:00 Wido den Hollander <w...@widodh.nl>:
> 
> > Thanks Daan, Remi.
> >
> > I found an additional bug where it seems that 'network.dns.basiczone.updates'
> > isn't read when sending DHCP settings in Basic Networking.
> >
> > This means that the VR gets all DHCP settings for the whole zone instead of
> > just for that POD.
> >
> > In this case some VRs we have get ~2k DHCP entries sent to them, which
> > causes a large slowdown.
> >
> > Wido
> >
> > > On 3 May 2017 at 14:49, Daan Hoogland <daan.hoogl...@gmail.com> wrote:
> > >
> > >
> > > Happy to pick this up, Remi. I'm travelling now but will look at both on
> > > Friday.
> > >
> > > Bilingual autocorrect in use. Read at your own risk.
> > >
> > > On 3 May 2017 2:25 pm, "Remi Bergsma" <rberg...@schubergphilis.com>
> > wrote:
> > >
> > > > Always happy to share, but I won’t have time to work on porting this to
> > > > CloudStack any time soon.
> > > >
> > > > Regards, Remi
> > > >
> > > >
> > > > On 03/05/2017, 13:44, "Rohit Yadav" <rohit.ya...@shapeblue.com> wrote:
> > > >
> > > > Hi Remi, thanks for sharing. We would love to have those changes
> > (for
> > > > 4.9+), looking forward to your pull requests.
> > > >
> > > >
> > > > Regards.
> > > >
> > > > 
> > > > From: Remi Bergsma <rberg...@schubergphilis.com>
> > > > Sent: 03 May 2017 16:58:18
> > > > To: dev@cloudstack.apache.org
> > > > Subject: Re: Very slow Virtual Router provisioning with 4.9.2.0
> > > >
> > > > Hi,
> > > >
> > > > The patches I talked about:
> > > >
> > > > 1) Iptables speed improvement
> > > > https://github.com/apache/cloudstack/pull/1482
> > > > Was reverted due to a licensing issue.
> > > >
> > > > 2) Passwd speed improvement
> > > > https://github.com/MissionCriticalCloudOldRepos/
> > cosmic-core/pull/138
> > > >
> > > > By now, these are rather old patches so they need some work before
> > > > they apply to CloudStack again.
> > > >
> > > > Regards, Remi
> > > >
> > > >
> > > >
> > > > On 03/05/2017, 12:49, "Jeff Hair" <j...@greenqloud.com> wrote:
> > > >
> > > > Hi Remi,
> > > >

Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-04 Thread Wei ZHOU
Hi Wido,

A simple improvement is: do not wait while restarting the dnsmasq service in the VR.


'''
diff --git a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py
index 95d2eff..999be8f 100755
--- a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py
+++ b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py
@@ -59,7 +59,7 @@ class CsDhcp(CsDataBag):
 
         # We restart DNSMASQ every time the configure.py is called in order to avoid lease problems.
         if not self.cl.is_redundant() or self.cl.is_master():
-            CsHelper.service("dnsmasq", "restart")
+            CsHelper.execute3("service dnsmasq restart")
 
     def configure_server(self):
         # self.conf.addeq("dhcp-hostsfile=%s" % DHCP_HOSTS)
diff --git a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py
index a8ccea2..b06bde3 100755
--- a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py
+++ b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py
@@ -191,6 +191,11 @@ def execute2(command):
     p.wait()
     return p
 
+def execute3(command):
+    """ Execute command """
+    logging.debug("Executing: %s" % command)
+    p = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
+    return p
 
 def service(name, op):
     execute("service %s %s" % (name, op))
'''
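
The behavioural difference the patch introduces, as a small sketch (not part 
of the patch itself):

'''
import subprocess

# Blocking (roughly what the existing helpers do): configure.py waits for
# the restart to finish before handling the next entry.
subprocess.call("service dnsmasq restart", shell=True)

# Fire-and-forget (what execute3 does): Popen returns as soon as the child
# is spawned, so provisioning continues while dnsmasq restarts.
subprocess.Popen("service dnsmasq restart", shell=True)
'''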

-Wei


2017-05-04 10:48 GMT+02:00 Wido den Hollander <w...@widodh.nl>:

> Thanks Daan, Remi.
>
> I found an additional bug where it seems that 'network.dns.basiczone.updates'
> isn't read when sending DHCP settings in Basic Networking.
>
> This means that the VR gets all DHCP settings for the whole zone instead of
> just for that POD.
>
> In this case some VRs we have get ~2k DHCP entries sent to them, which
> causes a large slowdown.
>
> Wido
>
> > On 3 May 2017 at 14:49, Daan Hoogland <daan.hoogl...@gmail.com> wrote:
> >
> >
> > Happy to pick this up, Remi. I'm travelling now but will look at both on
> > Friday.
> >
> > Bilingual autocorrect in use. Read at your own risk.
> >
> > On 3 May 2017 2:25 pm, "Remi Bergsma" <rberg...@schubergphilis.com>
> wrote:
> >
> > > Always happy to share, but I won’t have time to work on porting this to
> > > CloudStack any time soon.
> > >
> > > Regards, Remi
> > >
> > >
> > > On 03/05/2017, 13:44, "Rohit Yadav" <rohit.ya...@shapeblue.com> wrote:
> > >
> > > Hi Remi, thanks for sharing. We would love to have those changes
> (for
> > > 4.9+), looking forward to your pull requests.
> > >
> > >
> > > Regards.
> > >
> > > 
> > > From: Remi Bergsma <rberg...@schubergphilis.com>
> > > Sent: 03 May 2017 16:58:18
> > > To: dev@cloudstack.apache.org
> > > Subject: Re: Very slow Virtual Router provisioning with 4.9.2.0
> > >
> > > Hi,
> > >
> > > The patches I talked about:
> > >
> > > 1) Iptables speed improvement
> > > https://github.com/apache/cloudstack/pull/1482
> > > Was reverted due to a licensing issue.
> > >
> > > 2) Passwd speed improvement
> > > https://github.com/MissionCriticalCloudOldRepos/
> cosmic-core/pull/138
> > >
> > > By now, these are rather old patches so they need some work before
> > > they apply to CloudStack again.
> > >
> > > Regards, Remi
> > >
> > >
> > >
> > > On 03/05/2017, 12:49, "Jeff Hair" <j...@greenqloud.com> wrote:
> > >
> > > Hi Remi,
> > >
> > > Do you have a link to the PR that was reverted? And also
> possibly
> > > the code
> > > that makes the password updating more efficient?
> > >
> > > Jeff
> > >
> > > On Wed, May 3, 2017 at 10:36 AM, Remi Bergsma <
> > > rberg...@schubergphilis.com>
> > > wrote:
> > >
> > > > Hi Wido,
> > > >
> > > > When we had similar issues last year, we found that for
> example
> > > comparing
> > > > the iptables rules one-by-one is 1000x slower than simply
> > > loading them all
> > > > at once. Boris rewrote this part in our Cosmic fork, may be
> > > worth looking
> >

Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-04 Thread Wido den Hollander
Thanks Daan, Remi.

I found an additional bug where it seems that 'network.dns.basiczone.updates' 
isn't read when sending DHCP settings in Basic Networking.

This means that the VR gets all DHCP settings for the whole zone instead of just 
for that POD.

In this case some VRs we have get ~2k DHCP entries sent to them, which causes a 
large slowdown.

Wido

> On 3 May 2017 at 14:49, Daan Hoogland <daan.hoogl...@gmail.com> wrote:
> 
> 
> Happy to pick this up, Remi. I'm travelling now but will look at both on
> Friday.
> 
> Bilingual autocorrect in use. Read at your own risk.
> 
> On 3 May 2017 2:25 pm, "Remi Bergsma" <rberg...@schubergphilis.com> wrote:
> 
> > Always happy to share, but I won’t have time to work on porting this to
> > CloudStack any time soon.
> >
> > Regards, Remi
> >
> >
> > On 03/05/2017, 13:44, "Rohit Yadav" <rohit.ya...@shapeblue.com> wrote:
> >
> > Hi Remi, thanks for sharing. We would love to have those changes (for
> > 4.9+), looking forward to your pull requests.
> >
> >
> > Regards.
> >
> > ____
> >     From: Remi Bergsma <rberg...@schubergphilis.com>
> > Sent: 03 May 2017 16:58:18
> > To: dev@cloudstack.apache.org
> > Subject: Re: Very slow Virtual Router provisioning with 4.9.2.0
> >
> > Hi,
> >
> > The patches I talked about:
> >
> > 1) Iptables speed improvement
> > https://github.com/apache/cloudstack/pull/1482
> > Was reverted due to a licensing issue.
> >
> > 2) Passwd speed improvement
> > https://github.com/MissionCriticalCloudOldRepos/cosmic-core/pull/138
> >
> > By now, these are rather old patches so they need some work before
> > they apply to CloudStack again.
> >
> > Regards, Remi
> >
> >
> >
> > On 03/05/2017, 12:49, "Jeff Hair" <j...@greenqloud.com> wrote:
> >
> > Hi Remi,
> >
> > Do you have a link to the PR that was reverted? And also possibly
> > the code
> > that makes the password updating more efficient?
> >
> > Jeff
> >
> > On Wed, May 3, 2017 at 10:36 AM, Remi Bergsma <
> > rberg...@schubergphilis.com>
> > wrote:
> >
> > > Hi Wido,
> > >
> > > When we had similar issues last year, we found that for example
> > comparing
> > > the iptables rules one-by-one is 1000x slower than simply
> > loading them all
> > > at once. Boris rewrote this part in our Cosmic fork, may be
> > worth looking
> > > into this again. The PR to CloudStack was merged, but reverted
> > later, can't
> > > remember why. We run it in production ever since. Also feeding
> > passwords to
> > > the passwd server is very inefficient (it operates like a
> > snowball and gets
> > > slower once you have more VMs). That we also fixed in Cosmic,
> > not sure if
> > > that patch made it upstream. Wrote it about a year ago already.
> > >
> > > We tested applying 10K iptables rules in just a couple of
> > seconds. 1000
> > > VMs takes a few minutes to deploy.
> >     >
> > > Generally speaking I'd suggest looking at the logs to find what
> > takes long
> > > or is executed a lot of times. Iptables and passwd are two to
> > look at.
> > >
> > > If you want I can lookup the patches. Not handy on my phone now
> > ;-)
> > >
> > > Regards, Remi
> > > 
> > > From: Wido den Hollander <w...@widodh.nl>
> > > Sent: Tuesday, May 2, 2017 7:57:08 PM
> > > To: dev@cloudstack.apache.org
> > > Subject: Very slow Virtual Router provisioning with 4.9.2.0
> > >
> > > Hi,
> > >
> > > Last night I upgraded a CloudStack 4.5.2 setup to 4.9.2.0. All
> > went well,
> > > but the VR provisioning is terribly slow which causes all kinds
> > of problems.
> > >
> > > The vr_cfg.sh and update_config.py scripts start to run. Restart
> > dnsmasq,
> > > add metadata, etc.
> > >
> > > But for just 1800 hosts this can take up to 2 hours and that
> > causes
> > > timeouts in the management server and other problems.
> > >
> > > 2 hours is just very, very slow. So I am starting to wonder if
> > something
> > > is wrong here.
> > >
> > > Did anybody else see this?
> > >
> > > Running Basic Networking with CloudStack 4.9.2.0
> > >
> > > Wido
> > >


Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-03 Thread Rene Moser
Thanks Remi for the hint and Daan for picking it up! That is why I like
open source software development and this project ;)

On 05/03/2017 02:49 PM, Daan Hoogland wrote:
> Happy to pick this up, Remi. I'm travelling now but will look at both on
> Friday.




Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-03 Thread Daan Hoogland
Happy to pick this up, Remi. I'm travelling now but will look at both on
Friday.

Bilingual autocorrect in use. Read at your own risk.

On 3 May 2017 2:25 pm, "Remi Bergsma" <rberg...@schubergphilis.com> wrote:

> Always happy to share, but I won’t have time to work on porting this to
> CloudStack any time soon.
>
> Regards, Remi
>
>
> On 03/05/2017, 13:44, "Rohit Yadav" <rohit.ya...@shapeblue.com> wrote:
>
> Hi Remi, thanks for sharing. We would love to have those changes (for
> 4.9+), looking forward to your pull requests.
>
>
> Regards.
>
> 
> From: Remi Bergsma <rberg...@schubergphilis.com>
>     Sent: 03 May 2017 16:58:18
> To: dev@cloudstack.apache.org
> Subject: Re: Very slow Virtual Router provisioning with 4.9.2.0
>
> Hi,
>
> The patches I talked about:
>
> 1) Iptables speed improvement
> https://github.com/apache/cloudstack/pull/1482
> Was reverted due to a licensing issue.
>
> 2) Passwd speed improvement
> https://github.com/MissionCriticalCloudOldRepos/cosmic-core/pull/138
>
> By now, these are rather old patches so they need some work before
> they apply to CloudStack again.
>
> Regards, Remi
>
>
>
> On 03/05/2017, 12:49, "Jeff Hair" <j...@greenqloud.com> wrote:
>
> Hi Remi,
>
> Do you have a link to the PR that was reverted? And also possibly
> the code
> that makes the password updating more efficient?
>
> Jeff
>
> On Wed, May 3, 2017 at 10:36 AM, Remi Bergsma <
> rberg...@schubergphilis.com>
> wrote:
>
> > Hi Wido,
> >
> > When we had similar issues last year, we found that for example
> comparing
> > the iptables rules one-by-one is 1000x slower than simply
> loading them all
> > at once. Boris rewrote this part in our Cosmic fork, may be
> worth looking
> > into this again. The PR to CloudStack was merged, but reverted
> later, can't
> > remember why. We run it in production ever since. Also feeding
> passwords to
> > the passwd server is very inefficient (it operates like a
> snowball and gets
> > slower once you have more VMs). That we also fixed in Cosmic,
> not sure if
> > that patch made it upstream. Wrote it about a year ago already.
> >
> > We tested applying 10K iptables rules in just a couple of
> seconds. 1000
> > VMs takes a few minutes to deploy.
> >
> > Generally speaking I'd suggest looking at the logs to find what
> takes long
> > or is executed a lot of times. Iptables and passwd are two to
> look at.
>     >
> > If you want I can lookup the patches. Not handy on my phone now
> ;-)
> >
> > Regards, Remi
> > 
> > From: Wido den Hollander <w...@widodh.nl>
> > Sent: Tuesday, May 2, 2017 7:57:08 PM
> > To: dev@cloudstack.apache.org
> > Subject: Very slow Virtual Router provisioning with 4.9.2.0
> >
> > Hi,
> >
> > Last night I upgraded a CloudStack 4.5.2 setup to 4.9.2.0. All
> went well,
> > but the VR provisioning is terribly slow which causes all kinds
> of problems.
> >
> > The vr_cfg.sh and update_config.py scripts start to run. Restart
> dnsmasq,
> > add metadata, etc.
> >
> > But for just 1800 hosts this can take up to 2 hours and that
> causes
> > timeouts in the management server and other problems.
> >
> > 2 hours is just very, very slow. So I am starting to wonder if
> something
> > is wrong here.
> >
> > Did anybody else see this?
> >
> > Running Basic Networking with CloudStack 4.9.2.0
> >
> > Wido
> >


Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-03 Thread Remi Bergsma
Always happy to share, but I won’t have time to work on porting this to 
CloudStack any time soon.

Regards, Remi


On 03/05/2017, 13:44, "Rohit Yadav" <rohit.ya...@shapeblue.com> wrote:

Hi Remi, thanks for sharing. We would love to have those changes (for 
4.9+), looking forward to your pull requests.


Regards.


From: Remi Bergsma <rberg...@schubergphilis.com>
Sent: 03 May 2017 16:58:18
To: dev@cloudstack.apache.org
Subject: Re: Very slow Virtual Router provisioning with 4.9.2.0

Hi,

The patches I talked about:

1) Iptables speed improvement
https://github.com/apache/cloudstack/pull/1482
Was reverted due to a licensing issue.

2) Passwd speed improvement
https://github.com/MissionCriticalCloudOldRepos/cosmic-core/pull/138

By now, these are rather old patches so they need some work before they 
apply to CloudStack again.

Regards, Remi



On 03/05/2017, 12:49, "Jeff Hair" <j...@greenqloud.com> wrote:

Hi Remi,

Do you have a link to the PR that was reverted? And also possibly the 
code
that makes the password updating more efficient?

Jeff

On Wed, May 3, 2017 at 10:36 AM, Remi Bergsma 
<rberg...@schubergphilis.com>
wrote:

> Hi Wido,
>
> When we had similar issues last year, we found that for example 
comparing
> the iptables rules one-by-one is 1000x slower than simply loading 
them all
> at once. Boris rewrote this part in our Cosmic fork, may be worth 
looking
> into this again. The PR to CloudStack was merged, but reverted later, 
can't
> remember why. We run it in production ever since. Also feeding 
passwords to
> the passwd server is very inefficient (it operates like a snowball 
and gets
> slower once you have more VMs). That we also fixed in Cosmic, not 
sure if
> that patch made it upstream. Wrote it about a year ago already.
>
> We tested applying 10K iptables rules in just a couple of seconds. 
1000
> VMs takes a few minutes to deploy.
>
> Generally speaking I'd suggest looking at the logs to find what takes 
long
> or is executed a lot of times. Iptables and passwd are two to look at.
>
> If you want I can lookup the patches. Not handy on my phone now ;-)
>
> Regards, Remi
> 
> From: Wido den Hollander <w...@widodh.nl>
    > Sent: Tuesday, May 2, 2017 7:57:08 PM
> To: dev@cloudstack.apache.org
> Subject: Very slow Virtual Router provisioning with 4.9.2.0
>
> Hi,
>
> Last night I upgraded a CloudStack 4.5.2 setup to 4.9.2.0. All went 
well,
> but the VR provisioning is terribly slow which causes all kinds of 
problems.
>
> The vr_cfg.sh and update_config.py scripts start to run. Restart 
dnsmasq,
> add metadata, etc.
>
> But for just 1800 hosts this can take up to 2 hours and that causes
> timeouts in the management server and other problems.
>
> 2 hours is just very, very slow. So I am starting to wonder if 
something
> is wrong here.
>
> Did anybody else see this?
>
> Running Basic Networking with CloudStack 4.9.2.0
>
> Wido
>









Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-03 Thread Rohit Yadav
Hi Remi, thanks for sharing. We would love to have those changes (for 4.9+), 
looking forward to your pull requests.


Regards.


From: Remi Bergsma <rberg...@schubergphilis.com>
Sent: 03 May 2017 16:58:18
To: dev@cloudstack.apache.org
Subject: Re: Very slow Virtual Router provisioning with 4.9.2.0

Hi,

The patches I talked about:

1) Iptables speed improvement
https://github.com/apache/cloudstack/pull/1482
Was reverted due to a licensing issue.

2) Passwd speed improvement
https://github.com/MissionCriticalCloudOldRepos/cosmic-core/pull/138

By now, these are rather old patches so they need some work before they apply 
to CloudStack again.

Regards, Remi



On 03/05/2017, 12:49, "Jeff Hair" <j...@greenqloud.com> wrote:

Hi Remi,

Do you have a link to the PR that was reverted? And also possibly the code
that makes the password updating more efficient?

Jeff

On Wed, May 3, 2017 at 10:36 AM, Remi Bergsma <rberg...@schubergphilis.com>
wrote:

> Hi Wido,
>
> When we had similar issues last year, we found that for example comparing
> the iptables rules one-by-one is 1000x slower than simply loading them all
> at once. Boris rewrote this part in our Cosmic fork, may be worth looking
> into this again. The PR to CloudStack was merged, but reverted later, 
can't
> remember why. We run it in production ever since. Also feeding passwords 
to
> the passwd server is very inefficient (it operates like a snowball and 
gets
> slower once you have more VMs). That we also fixed in Cosmic, not sure if
> that patch made it upstream. Wrote it about a year ago already.
>
> We tested applying 10K iptables rules in just a couple of seconds. 1000
> VMs takes a few minutes to deploy.
>
> Generally speaking I'd suggest looking at the logs to find what takes long
> or is executed a lot of times. Iptables and passwd are two to look at.
>
> If you want I can lookup the patches. Not handy on my phone now ;-)
>
> Regards, Remi
> 
> From: Wido den Hollander <w...@widodh.nl>
> Sent: Tuesday, May 2, 2017 7:57:08 PM
> To: dev@cloudstack.apache.org
> Subject: Very slow Virtual Router provisioning with 4.9.2.0
>
> Hi,
>
> Last night I upgraded a CloudStack 4.5.2 setup to 4.9.2.0. All went well,
> but the VR provisioning is terribly slow which causes all kinds of 
problems.
>
> The vr_cfg.sh and update_config.py scripts start to run. Restart dnsmasq,
> add metadata, etc.
>
> But for just 1800 hosts this can take up to 2 hours and that causes
> timeouts in the management server and other problems.
>
> 2 hours is just very, very slow. So I am starting to wonder if something
> is wrong here.
>
> Did anybody else see this?
>
> Running Basic Networking with CloudStack 4.9.2.0
>
> Wido
>







Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-03 Thread Remi Bergsma
Hi,

The patches I talked about:

1) Iptables speed improvement
https://github.com/apache/cloudstack/pull/1482 
Was reverted due to a licensing issue. 

2) Passwd speed improvement
https://github.com/MissionCriticalCloudOldRepos/cosmic-core/pull/138 

By now, these are rather old patches so they need some work before they apply 
to CloudStack again.

Regards, Remi



On 03/05/2017, 12:49, "Jeff Hair" <j...@greenqloud.com> wrote:

Hi Remi,

Do you have a link to the PR that was reverted? And also possibly the code
that makes the password updating more efficient?

Jeff

On Wed, May 3, 2017 at 10:36 AM, Remi Bergsma <rberg...@schubergphilis.com>
wrote:

> Hi Wido,
>
> When we had similar issues last year, we found that for example comparing
> the iptables rules one-by-one is 1000x slower than simply loading them all
> at once. Boris rewrote this part in our Cosmic fork, may be worth looking
> into this again. The PR to CloudStack was merged, but reverted later, 
can't
> remember why. We run it in production ever since. Also feeding passwords 
to
> the passwd server is very inefficient (it operates like a snowball and 
gets
> slower once you have more VMs). That we also fixed in Cosmic, not sure if
> that patch made it upstream. Wrote it about a year ago already.
>
> We tested applying 10K iptables rules in just a couple of seconds. 1000
> VMs takes a few minutes to deploy.
>
> Generally speaking I'd suggest looking at the logs to find what takes long
> or is executed a lot of times. Iptables and passwd are two to look at.
>
> If you want I can lookup the patches. Not handy on my phone now ;-)
>
> Regards, Remi
> 
> From: Wido den Hollander <w...@widodh.nl>
    > Sent: Tuesday, May 2, 2017 7:57:08 PM
> To: dev@cloudstack.apache.org
> Subject: Very slow Virtual Router provisioning with 4.9.2.0
>
> Hi,
>
> Last night I upgraded a CloudStack 4.5.2 setup to 4.9.2.0. All went well,
> but the VR provisioning is terribly slow which causes all kinds of 
problems.
>
> The vr_cfg.sh and update_config.py scripts start to run. Restart dnsmasq,
> add metadata, etc.
>
> But for just 1800 hosts this can take up to 2 hours and that causes
> timeouts in the management server and other problems.
>
> 2 hours is just very, very slow. So I am starting to wonder if something
> is wrong here.
>
> Did anybody else see this?
>
> Running Basic Networking with CloudStack 4.9.2.0
>
> Wido
>





Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-03 Thread Remi Bergsma
Hi Wido,

When we had similar issues last year, we found that, for example, comparing the 
iptables rules one-by-one is 1000x slower than simply loading them all at once. 
Boris rewrote this part in our Cosmic fork; it may be worth looking into this 
again. The PR to CloudStack was merged, but reverted later, I can't remember why. 
We have run it in production ever since. Also, feeding passwords to the passwd 
server is very inefficient (it operates like a snowball and gets slower once you 
have more VMs). That we also fixed in Cosmic, not sure if that patch made it 
upstream. I wrote it about a year ago already.
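
The snowball can be pictured like this (a sketch of the described behaviour, 
not the actual passwd-server code): if every new VM causes the full password 
list to be re-sent, provisioning n VMs costs 1 + 2 + ... + n sends, i.e. 
quadratic work overall.

'''
# Illustrative only: total sends when the whole list is re-fed per new VM.
def sends_needed(num_vms):
    return sum(range(1, num_vms + 1))  # 1 + 2 + ... + n

print(sends_needed(100))   # 5050 sends for 100 VMs
print(sends_needed(1000))  # 500500 sends for 1000 VMs
'''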

We tested applying 10K iptables rules in just a couple of seconds. 1000 VMs 
takes a few minutes to deploy.
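
For reference, the batch loading described above amounts to feeding the whole 
ruleset to iptables-restore in one call instead of invoking iptables once per 
rule; a minimal sketch (not the actual Cosmic/PR 1482 code):

'''
# Minimal sketch of bulk rule loading; not the actual Cosmic/PR 1482 code.
import subprocess

def apply_filter_rules(rules):
    """rules: list of rule specs such as '-A INPUT -p tcp --dport 22 -j ACCEPT'."""
    payload = "*filter\n" + "\n".join(rules) + "\nCOMMIT\n"
    # iptables-restore parses and installs the whole table in one call,
    # which is what makes this so much faster than one iptables invocation
    # per rule.
    p = subprocess.Popen(["iptables-restore"], stdin=subprocess.PIPE)
    p.communicate(payload.encode())
    return p.returncode
'''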

Generally speaking I'd suggest looking at the logs to find what takes long or 
is executed a lot of times. Iptables and passwd are two to look at.

If you want I can lookup the patches. Not handy on my phone now ;-)

Regards, Remi

From: Wido den Hollander <w...@widodh.nl>
Sent: Tuesday, May 2, 2017 7:57:08 PM
To: dev@cloudstack.apache.org
Subject: Very slow Virtual Router provisioning with 4.9.2.0

Hi,

Last night I upgraded a CloudStack 4.5.2 setup to 4.9.2.0. All went well, but 
the VR provisioning is terribly slow which causes all kinds of problems.

The vr_cfg.sh and update_config.py scripts start to run. Restart dnsmasq, add 
metadata, etc.

But for just 1800 hosts this can take up to 2 hours and that causes timeouts in 
the management server and other problems.

2 hours is just very, very slow. So I am starting to wonder if something is 
wrong here.

Did anybody else see this?

Running Basic Networking with CloudStack 4.9.2.0

Wido


Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-03 Thread Jayapal Uradi
Another cause of slowness can be the VR configuration (the persistent VR 
configuration design): when the config for a single component is applied, the 
whole VR configuration is re-applied. Due to this, the VR boot-up time increases.

Thanks,
Jayapal


> On May 3, 2017, at 1:55 PM, Marc-Aurèle Brothier  wrote:
> 
> Hi Wido,
> 
> Well, for us it's not a version problem, it's simply a design problem. This
> VR is very problematic during any upgrade of CloudStack (which I perform
> almost every week on our platform); the same goes for the secondary storage VMs,
> which scan all templates. We've planned on our roadmap to get rid of the
> system VMs. The VR is really a SPoF.
> 
> On Tue, May 2, 2017 at 7:57 PM, Wido den Hollander  wrote:
> 
>> Hi,
>> 
>> Last night I upgraded a CloudStack 4.5.2 setup to 4.9.2.0. All went well,
>> but the VR provisioning is terribly slow which causes all kinds of problems.
>> 
>> The vr_cfg.sh and update_config.py scripts start to run. Restart dnsmasq,
>> add metadata, etc.
>> 
>> But for just 1800 hosts this can take up to 2 hours and that causes
>> timeouts in the management server and other problems.
>> 
>> 2 hours is just very, very slow. So I am starting to wonder if something
>> is wrong here.
>> 
>> Did anybody else see this?
>> 
>> Running Basic Networking with CloudStack 4.9.2.0
>> 
>> Wido
>> 






Re: Very slow Virtual Router provisioning with 4.9.2.0

2017-05-03 Thread Marc-Aurèle Brothier
Hi Wido,

Well, for us it's not a version problem, it's simply a design problem. This
VR is very problematic during any upgrade of CloudStack (which I perform
almost every week on our platform); the same goes for the secondary storage VMs,
which scan all templates. We've planned on our roadmap to get rid of the
system VMs. The VR is really a SPoF.

On Tue, May 2, 2017 at 7:57 PM, Wido den Hollander  wrote:

> Hi,
>
> Last night I upgraded a CloudStack 4.5.2 setup to 4.9.2.0. All went well,
> but the VR provisioning is terribly slow which causes all kinds of problems.
>
> The vr_cfg.sh and update_config.py scripts start to run. Restart dnsmasq,
> add metadata, etc.
>
> But for just 1800 hosts this can take up to 2 hours and that causes
> timeouts in the management server and other problems.
>
> 2 hours is just very, very slow. So I am starting to wonder if something
> is wrong here.
>
> Did anybody else see this?
>
> Running Basic Networking with CloudStack 4.9.2.0
>
> Wido
>


Very slow Virtual Router provisioning with 4.9.2.0

2017-05-02 Thread Wido den Hollander
Hi,

Last night I upgraded a CloudStack 4.5.2 setup to 4.9.2.0. All went well, but 
the VR provisioning is terribly slow which causes all kinds of problems.

The vr_cfg.sh and update_config.py scripts start to run. Restart dnsmasq, add 
metadata, etc.

But for just 1800 hosts this can take up to 2 hours and that causes timeouts in 
the management server and other problems.

2 hours is just very, very slow. So I am starting to wonder if something is 
wrong here.

Did anybody else see this?

Running Basic Networking with CloudStack 4.9.2.0

Wido