[GitHub] cloudstack issue #872: Strongswan vpn feature

2016-10-14 Thread swill
Github user swill commented on the issue:

https://github.com/apache/cloudstack/pull/872
  
Hey @jayapalu, thanks for the follow up.  Here are a couple things to note.
- In order to get Remote Access VPN to work you need to update the L2TP 
conf file to include `type=transport`.
- In order to get 3des working for S2S VPN, you will need to install the 
`libstrongswan-extra-plugins` package as well.
- Running `ipsec restart` seems to get rid of the discrepancy between the 
running config and the config files, but I think one of the main issues is a 
missing `ipsec rereadsecrets` when the S2S config changes.

Here is some basic stuff you can do to reproduce the problems.  You can use 
two VPCs as a test environment and create a S2S VPN connection between them to 
do tests.
- Remove all the S2S VPN connections and gateways.
- Manually remove the `/etc/ipsec.d/ipsec.vpn-vv.xx.yy.zz.conf` and 
`/etc/ipsec.d/ipsec.vpn-vv.xx.yy.zz.secrets` files from the VRs.
- Create a S2S VPN configuration with `dpd=true` and `pfs=modp1024` and set 
a PSK of something like `1234567890`.  This configuration should work.  If it 
doesn't, do an `ipsec restart` and it will probably start working.  Even if it 
does not work, we can continue the test sequence.
- Remove the entire configuration through ACS (connections and gateways).
-- Note the files `/etc/ipsec.d/ipsec.vpn-vv.xx.yy.zz.conf` and 
`/etc/ipsec.d/ipsec.vpn-vv.xx.yy.zz.secrets` are not removed from the VRs.
-- Note the `conf` file includes dpd and psk details previously configured.
- Create a new S2S VPN configuration with `dpd=false` and without PFS.  For 
now, don't change the PSK from what it was before.
-- Note that the `conf` file on the VRs still includes the `dpd` 
configuration.  Also note that `pfs=no` now, but the `esp` config still 
includes the `modp1024` to specify `pfs=yes`.
-- If the connection was working before, this configuration, which is very 
much broken, will still work because what ipsec has in memory does not reflect 
what is in the config files.
- Remove the entire configuration through ACS again.
- Recreate it again, but this time change the PSK to something different 
like `0987654321`.
-- Note that now the connection breaks and you get an authentication error.
-- At this point to get it working again you will have to run `ipsec 
restart` because the old PSK is still in the ipsec memory.
-- You may have to manually clean up your config files at this point 
because they may be polluted by bad configuration since they are never deleted 
and configuration options are never deleted in a config, only added or edited.
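
To check a VR for the leftover options described in the steps above, something like the following could be used.  It is only a sketch: the function names are made up for illustration, it is not part of the VR code, and it assumes the conn file uses plain `key=value` lines with `dpd*` option names and a `modpNNNN` group inside `esp`.

```python
import re


def parse_conn_options(conf_text):
    """Return the key=value options found in an ipsec conn file as a dict."""
    options = {}
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blanks, comments and section headers such as "conn vpn-...".
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def find_stale_options(conf_text, dpd_enabled, pfs_enabled):
    """List option names that contradict the requested DPD/PFS settings."""
    options = parse_conn_options(conf_text)
    stale = []
    if not dpd_enabled:
        # Any dpd* option left in the file means DPD was never turned off.
        stale += [key for key in sorted(options) if key.startswith("dpd")]
    if not pfs_enabled and re.search(r"modp\d+", options.get("esp", "")):
        # A DH group in esp still asks for PFS even though PFS was disabled.
        stale.append("esp")
    return stale
```

Running `find_stale_options` on the contents of the `conf` file after the `dpd=false` step should return an empty list if the file had been regenerated cleanly.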

That should get you going.  If you have questions, let me know.  I will 
isolate the problem more on Monday.

I think the majority of these problems will go away if the config files get 
deleted when the configuration is deleted through ACS.  I think the logic will 
then flow the way it is expected.  Right now, things like `ipsec reload` are 
never called because they are showing as not changed, even though the config 
has actually changed.  I think that is the first step and then we go from 
there.  I also think we will need to run `ipsec rereadsecrets` after updating 
the s2s config in order to check if the PSK has changed and load it into the 
running config if it did change.
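
Roughly, the flow I have in mind looks like the sketch below.  The helper names are hypothetical and this is not the current VR code; it only assumes the `ipsec.vpn-<peer>.conf` / `.secrets` naming shown above and the standard `ipsec rereadsecrets` and `ipsec reload` commands.  The second function only returns the commands so the decision can be checked without touching a running `ipsec`.

```python
import os


def remove_vpn_files(conf_dir, peer_ip):
    """Delete the conf and secrets files for one S2S peer.

    Returns the list of paths that were actually removed.
    """
    removed = []
    for suffix in ("conf", "secrets"):
        path = os.path.join(conf_dir, "ipsec.vpn-%s.%s" % (peer_ip, suffix))
        if os.path.exists(path):
            os.remove(path)
            removed.append(path)
    return removed


def commands_after_update(conf_changed, secrets_changed):
    """Return the ipsec commands to run after the S2S files are rewritten."""
    commands = []
    if secrets_changed:
        # Load a new PSK into the running daemon.
        commands.append(["ipsec", "rereadsecrets"])
    if conf_changed:
        # Pick up added, changed or removed connection options.
        commands.append(["ipsec", "reload"])
    return commands
```

On a VR the returned commands could then be passed one by one to `subprocess.check_call`.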



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] cloudstack issue #872: Strongswan vpn feature

2016-10-14 Thread jayapalu
Github user jayapalu commented on the issue:

https://github.com/apache/cloudstack/pull/872
  
@swill Let me also try the issue you have mentioned in my setup on Monday.




[GitHub] cloudstack issue #872: Strongswan vpn feature

2016-10-14 Thread swill
Github user swill commented on the issue:

https://github.com/apache/cloudstack/pull/872
  
The more I dig into this the deeper the rabbit hole goes.  Here are a few 
things I have found which I need to address.
- When a VPN connection, gateway, etc. is deleted, the configuration is not 
actually cleaned up.
- When a new configuration is defined, it only has the ability to add to or 
modify the current configuration, it does not have the ability to remove config 
items.  Combined with the above point, this means that if you ever turn on 
`dpd` for example, it is not possible to ever turn it off.
- The configuration files on the VR do not reflect the running config in 
`ipsec`.  You can have identical configurations and it will work sometimes and 
it won't work other times.  I have been able to reset the config to make the 
running config match the defined config by doing an `ipsec restart`, but I have 
to close the gap as to why it is not consistent and where the divergence 
happens.  I believe it is due to the PSK not actually getting updated with an 
`ipsec rereadsecrets`, but because of other issues, I can't even get code 
blocks to execute when they should on changes.
- There appears to be a problem with the `if secret.is_changed() or 
file.is_changed()` logic which is causing logic not to run when it should.  I 
am still working out why this is the case.
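
The add-or-modify problem from the second point can be illustrated with plain dictionaries.  This is a toy sketch with made-up function names, not the VR code: it assumes a connection's options can be treated as a simple key/value mapping.

```python
def merge_update(existing, requested):
    """Add-or-modify only: keys missing from `requested` are never removed."""
    merged = dict(existing)
    merged.update(requested)
    return merged


def regenerate(requested):
    """Rebuild the options from scratch so dropped keys really disappear."""
    return dict(requested)


existing = {"pfs": "yes", "dpddelay": "30"}
requested = {"pfs": "no"}  # DPD turned off, so no dpd* keys are requested

print(merge_update(existing, requested))  # dpddelay survives the update
print(regenerate(requested))              # dpddelay is gone
```

With the merge approach, `dpddelay` stays in the result even though it was not requested, which matches the "once `dpd` is on it can never be turned off" behaviour.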

All to say, I still have a lot to work through before this is ready for 
primetime.  I think I have the Remote Access VPN functionality working as 
expected and relatively stable now, but I am still working through a lot of 
issues with the S2S VPN feature(s).  I have given a code drop of the Remote 
Access VPN functionality to one of our operations teams to continue testing 
that feature as I work through the S2S issues.  Hopefully I will have better 
news next week.

Have a nice weekend everyone...




Re: ACS 4.9 + VMware: Unable to remove one of the NICs of a multi-nic VM

2016-10-14 Thread Prashanth Manthena
Hi Paul,

Thank you for trying it out.

I am only hitting this issue for guest VMs (i.e. not with VPC VRs) created
in ACS 4.9 (i.e. not in ACS 4.7) with VMware setups.

Moreover, I get the same error when I try to remove the NIC (i.e.
network adapter) directly from VMware's vCenter.

VMware documents a possible workaround for this issue, but it does not work
in this scenario from either CloudStack or VMware:
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US=displayKC=2081503

Most likely, this issue has something to do with how we deploy (multi-nic)
guest VMs in ACS 4.9 with VMware setups.

On Fri, Oct 14, 2016 at 1:00 PM, Paul Angus 
wrote:

> Hi Prashanth,
>
> I've just tried that. I get the same error -
> The guest operating system did not respond to a hot-remove request for
> device ethernet1 in a timely manner.
>
> Kind regards,
>
> Paul Angus
>
> paul.an...@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HS UK
> @shapeblue
>
>
>
>

[GitHub] cloudstack issue #1692: Fix Smoke Test Failures

2016-10-14 Thread jburwell
Github user jburwell commented on the issue:

https://github.com/apache/cloudstack/pull/1692
  
@karuturi can you provide a test LGTM from CloudMonger?




RE: ACS 4.9 + VMware: Unable to remove one of the NICs of a multi-nic VM

2016-10-14 Thread Paul Angus
Hi Prashanth,

I've just tried that. I get the same error -
The guest operating system did not respond to a hot-remove request for device 
ethernet1 in a timely manner.

Kind regards,

Paul Angus

paul.an...@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HS UK
@shapeblue
  
 


-Original Message-
From: Prashanth Manthena [mailto:prashanth.manth...@nuagenetworks.net] 
Sent: 14 October 2016 09:21
To: dev@cloudstack.apache.org
Subject: Re: ACS 4.9 + VMware: Unable to remove one of the NICs of a multi-nic 
VM

Hi All,

Does this issue ring a bell, and is anyone else hitting this issue?

Let me know if it is a known issue.

Thanking you in advance!

On Thu, Oct 13, 2016 at 6:25 PM, Prashanth Manthena < 
prashanth.manth...@nuagenetworks.net> wrote:

> Hi,
>
> I am hitting the following issue on an ACS 4.9 + VMware setup (steps 
> to
> reproduce):
>
> 1) Deploy a multi-nic VM (or) add a nic to a single-nic VM
>
> 2) Remove the non-default nic from the multi-nic VM, which fails with 
> the following error/exception in the management server log:
>
> 2016-10-05 06:13:28,251 DEBUG [c.c.a.ApiServlet] 
> (catalina-exec-14:ctx-f8dc6bd0 ctx-ee610e01) (logid:58e9cf98) 
> ===END===  10.31.52.95 -- GET  
> command=queryAsyncJobResult=9ad66ce9-6e1b-4c25-bd2e-763f4586dd86
> =json&_=1475673245452
> 2016-10-05 06:13:29,787 ERROR [c.c.h.v.r.VmwareResource] 
> (DirectAgent-302:ctx-78a58d67 10.31.56.178, job-171/job-172, cmd: 
> UnPlugNicCommand) (logid:9ad66ce9) Unexpected exception:
> java.lang.RuntimeException: The guest operating system did not respond to a 
> hot-remove request for device ethernet1 in a timely manner.
> at 
> com.cloud.hypervisor.vmware.util.VmwareClient.waitForTask(VmwareClient.java:354)
> at 
> com.cloud.hypervisor.vmware.mo.VirtualMachineMO.configureVm(VirtualMachineMO.java:949)
> at 
> com.cloud.hypervisor.vmware.resource.VmwareResource.execute(VmwareResource.java:1103)
> at 
> com.cloud.hypervisor.vmware.resource.VmwareResource.executeRequest(VmwareResource.java:469)
> at 
> com.cloud.agent.manager.DirectAgentAttache$Task.runInContext(DirectAgentAttache.java:315)
> at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 2016-10-05 06:13:29,788 DEBUG [c.c.a.m.DirectAgentAttache] 
> (DirectAgent-302:ctx-78a58d67) (logid:9ad66ce9) Seq 4-1440588930805137508: 
> Response Received:
> 2016-10-05 06:13:29,788 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
> (DirectAgent-302:ctx-78a58d67) (logid:9ad66ce9) Seq 
> 4-1440588930805137508: MgmtId 275619427423488: Resp: Routing to peer
> 2016-10-05 06:13:29,789 DEBUG [c.c.a.m.AgentAttache] 
> (DirectAgent-302:ctx-78a58d67) (logid:9ad66ce9) Seq 
> 4-1440588930805137508: No more commands found
> 2016-10-05 06:13:31,120 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] 
> (API-Job-Executor-8:ctx-a6e36538 job-171 ctx-446c510f) (logid:9ad66ce9) 
> Returning cached instance of singleton bean 'messageBus'
> 2016-10-05 06:13:31,127 ERROR [c.c.a.ApiAsyncJobDispatcher] 
> (API-Job-Executor-8:ctx-a6e36538 job-171) (logid:9ad66ce9) Unexpected 
> exception while executing 
> org.apache.cloudstack.api.command.admin.vm.RemoveNicFromVMCmdByAdmin
> com.cloud.utils.exception.CloudRuntimeException: Unable to remove 
> Ntwk[205|Guest|16] from VM[User|i-2-3-VM]
> at 
> com.cloud.vm.UserVmManagerImpl.removeNicFromVirtualMachine(UserVmManagerImpl.java:1291)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> 

Re: ACS 4.9 + VMware: Unable to remove one of the NICs of a multi-nic VM

2016-10-14 Thread Prashanth Manthena
Hi All,

Does this issue ring a bell, and is anyone else hitting this issue?

Let me know if it is a known issue.

Thanking you in advance!

On Thu, Oct 13, 2016 at 6:25 PM, Prashanth Manthena <
prashanth.manth...@nuagenetworks.net> wrote:

> Hi,
>
> I am hitting the following issue on an ACS 4.9 + VMware setup (steps to
> reproduce):
>
> 1) Deploy a multi-nic VM (or) add a nic to a single-nic VM
>
> 2) Remove the non-default nic from the multi-nic VM, which fails with the
> following error/exception in the management server log:
>
> 2016-10-05 06:13:28,251 DEBUG [c.c.a.ApiServlet] 
> (catalina-exec-14:ctx-f8dc6bd0 ctx-ee610e01) (logid:58e9cf98) ===END===  
> 10.31.52.95 -- GET  
> command=queryAsyncJobResult=9ad66ce9-6e1b-4c25-bd2e-763f4586dd86=json&_=1475673245452
> 2016-10-05 06:13:29,787 ERROR [c.c.h.v.r.VmwareResource] 
> (DirectAgent-302:ctx-78a58d67 10.31.56.178, job-171/job-172, cmd: 
> UnPlugNicCommand) (logid:9ad66ce9) Unexpected exception:
> java.lang.RuntimeException: The guest operating system did not respond to a 
> hot-remove request for device ethernet1 in a timely manner.
> at 
> com.cloud.hypervisor.vmware.util.VmwareClient.waitForTask(VmwareClient.java:354)
> at 
> com.cloud.hypervisor.vmware.mo.VirtualMachineMO.configureVm(VirtualMachineMO.java:949)
> at 
> com.cloud.hypervisor.vmware.resource.VmwareResource.execute(VmwareResource.java:1103)
> at 
> com.cloud.hypervisor.vmware.resource.VmwareResource.executeRequest(VmwareResource.java:469)
> at 
> com.cloud.agent.manager.DirectAgentAttache$Task.runInContext(DirectAgentAttache.java:315)
> at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
> at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at 
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 2016-10-05 06:13:29,788 DEBUG [c.c.a.m.DirectAgentAttache] 
> (DirectAgent-302:ctx-78a58d67) (logid:9ad66ce9) Seq 4-1440588930805137508: 
> Response Received:
> 2016-10-05 06:13:29,788 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] 
> (DirectAgent-302:ctx-78a58d67) (logid:9ad66ce9) Seq 4-1440588930805137508: 
> MgmtId 275619427423488: Resp: Routing to peer
> 2016-10-05 06:13:29,789 DEBUG [c.c.a.m.AgentAttache] 
> (DirectAgent-302:ctx-78a58d67) (logid:9ad66ce9) Seq 4-1440588930805137508: No 
> more commands found
> 2016-10-05 06:13:31,120 DEBUG [o.s.b.f.s.DefaultListableBeanFactory] 
> (API-Job-Executor-8:ctx-a6e36538 job-171 ctx-446c510f) (logid:9ad66ce9) 
> Returning cached instance of singleton bean 'messageBus'
> 2016-10-05 06:13:31,127 ERROR [c.c.a.ApiAsyncJobDispatcher] 
> (API-Job-Executor-8:ctx-a6e36538 job-171) (logid:9ad66ce9) Unexpected 
> exception while executing 
> org.apache.cloudstack.api.command.admin.vm.RemoveNicFromVMCmdByAdmin
> com.cloud.utils.exception.CloudRuntimeException: Unable to remove 
> Ntwk[205|Guest|16] from VM[User|i-2-3-VM]
> at 
> com.cloud.vm.UserVmManagerImpl.removeNicFromVirtualMachine(UserVmManagerImpl.java:1291)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
> at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
> at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
> at 
> org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:106)
> at 
> org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>   

Re: 4.8, 4.9, and master Testing Status

2016-10-14 Thread John Burwell
All,

We have made great strides stabilizing the 4.8 [1] and 4.9 [2] smoke tests.  
While we are not super green, the remaining failures/issues listed below are 
isolated to the VPC VR and secondary storage.  

* CLOUDSTACK-9541: redundant VPC VR: issues when master and backup 
switch happens on failover [3]
* CLOUDSTACK-9540: createPrivateGateway create private network does not 
create proper VLAN network on XenServer [4]
* CLOUDSTACK-9528: SSVM Downloads (built-in) template multiple times

Therefore, I would like to merge these two PRs so that we can begin the process 
of rebasing and retesting the PRs slotted for 4.8 and 4.9 that are not affected 
by these issues (i.e. PRs unrelated to secondary storage or the VR).  Our hope 
is that we can correct these issues quickly, and by the time we have worked 
through the backlog of pending PRs, these issues will be addressed and we can 
move those impacted forward.

Unfortunately, the master PR [5] has 6 failures and 4 errors on XenServer [6] 
that we are currently analyzing.  We hope to have these resolved shortly in 
order to begin progressing PRs targeting master.

I would like to get 1692 [1] and 1703 [2] merged in the next 24 hours.  We need 
to complete the following actions in order to accomplish this goal:

* Obtain at least one code review LGTM on PR #1692 [1]
* Obtain at least one code review LGTM on PR #1703 [2]
* Obtain at least one test review LGTM on PR #1703 [2]

Once these PRs are merged, I will update the PRs slotted for 4.8 and 4.9 to ping 
authors for a rebase.  Following each rebase, we will trigger blueorangutan to 
retest each one.

Thanks again for your patience and assistance,
-John

[1]: https://github.com/apache/cloudstack/pull/1692
[2]: https://github.com/apache/cloudstack/pull/1703
[3]: https://issues.apache.org/jira/browse/CLOUDSTACK-9541
[4]: https://issues.apache.org/jira/browse/CLOUDSTACK-9540
[5]: https://github.com/apache/cloudstack/pull/1708
[6]: https://github.com/apache/cloudstack/pull/1708#issuecomment-253698099

> On Oct 7, 2016, at 10:12 AM, Will Stevens  wrote:
> 
> Great work everyone.  Don't worry about the sporadic updates, that is just
> the nature of the beast when working through stuff like this.  Well done so
> far...
> 
> *Will STEVENS*
> Lead Developer
> 
> *CloudOps* *| *Cloud Solutions Experts
> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
> w cloudops.com *|* tw @CloudOps_
> 
> On Fri, Oct 7, 2016 at 9:53 AM, John Burwell 
> wrote:
> 
>> All,
>> 
>> Thank you Ilya and Haijao for your words of encouragement.  In addition to
>> the efforts of Paul, Rohit, Murali, Abhi, and Bobby, Sergey Levitskiy has
>> been providing great help testing VMware.
>> 
>> I apologize for my sporadic status updates.  We have made significant
>> progress in getting smoke tests to pass on KVM, XenServer, and VMware.
>> Currently, we have the following number of failures and errors:
>> 
>>* KVM: 0
>>* VMware: 4
>>* XenServer: 8
>> 
>> The outstanding failures and errors seem to be caused by the following
>> issues:
>> 
>>1. On VMware and XenServer, guest VMs in VPCs start but don’t
>> acquire IP addresses, causing tests relying on SSH connectivity to
>> fail.  The issue does not occur on KVM, occurs intermittently on VMware,
>> and consistently on XenServer.  This issue affects the test_vpc_redundant,
>> test_privategw_acl, and test_vpc_vpn test suites.   We believe that this
>> issue may be caused by either the guest VMs startup/DHCP wait period
>> winning the race with the VPC VR configuration or there is a problem on the
>> VPC VR assigning IP addresses.  We are currently investigating and expect
>> to identify the root cause shortly.
>>2. SSVM downloads are being restarted due to ping timeouts on
>> XenServer and VMware.  We are seeing messages such as the
>> following in the Management Server logs:
>> 
>>com.cloud.utils.exception.CloudRuntimeException: Failed
>> to send command, due to 
>> Agent:5,com.cloud.exception.OperationTimedoutException:
>> Commands
>>9042102151853113352 to Host 5 timed out after 2400
>> 
>>  Our initial investigation discovered different timezones being
>> used by the system VM templates and Management Server.  This discrepancy We
>> have modified Trillian to ensure consistent configuration of time zones
>> across a cluster, and are preparing another run for XenServer and VMware.
>> KVM is not affected by this time zone issue because KVM hosts use the same
>> CentOS template as CentOS based Management Servers -- creating time zone
>> consistency by side effect.
>> 
>> Reports of each test run are available on PR #1692 [1].  We have kicked a
>> new round of tests on KVM, VMware, and XenServer with the time zone fix and
>> additional instrumentation to run down the VPC VR race condition.
>> 
>> Instead of directly forward 

[GitHub] cloudstack issue #1615: CLOUDSTACK-9438: Fix for CLOUDSTACK-9252 - Make NFS ...

2016-10-14 Thread jburwell
Github user jburwell commented on the issue:

https://github.com/apache/cloudstack/pull/1615
  
@koushik-das please see [this 
thread](http://markmail.org/thread/xp7ckhxhip2rbnr7) regarding the testing 
freeze discussion.  Also, per our [community release 
schedule](https://cwiki.apache.org/confluence/display/CLOUDSTACK/%5BPROPOSAL%5D+2016-2017+Release+Cycle+and+Calendar),
 the 4.8, 4.9, and master branches are frozen for testing pre-RC.

