Re: Welcoming Mike as the new Apache CloudStack VP

2018-03-26 Thread David Mabry
Congrats Mike!

On 3/26/18, 12:04 PM, "Andrija Panic"  wrote:

Congratulations Mike !




On 26 March 2018 at 18:46, Ron Wheeler 
wrote:

>
> Congratulations Mike!
>
> Ron
>
> On 26/03/2018 10:11 AM, Wido den Hollander wrote:
>
>> Hi all,
>>
>> It's been a great pleasure working with the CloudStack project as the
>> ACS VP over the past year.
>>
>> A big thank you from my side for everybody involved with the project in
>> the last year.
>>
>> Hereby I would like to announce that Mike Tutkowski has been elected to
>> replace me as the Apache Cloudstack VP in our annual VP rotation.
>>
> >> Mike has a long history with the project and I am happy to welcome him
>> as the new VP for CloudStack.
>>
>> Welcome Mike!
>>
>> Thanks,
>>
>> Wido
>>
>>
> --
> Ron Wheeler
> President
> Artifact Software Inc
> email: rwhee...@artifact-software.com
> skype: ronaldmwheeler
> phone: 866-970-2435, ext 102
>
>


-- 

Andrija Panić




Re: [VOTE] Move to Github issues

2018-03-26 Thread David Mabry
+1

On 3/26/18, 8:05 AM, "Will Stevens"  wrote:

+1

On Mon, Mar 26, 2018, 5:51 AM Nicolas Vazquez, <
nicolas.vazq...@shapeblue.com> wrote:

> +1
>
> 
> From: Dag Sonstebo 
> Sent: Monday, March 26, 2018 5:06:29 AM
> To: us...@cloudstack.apache.org; dev@cloudstack.apache.org
> Subject: Re: [VOTE] Move to Github issues
>
> +1
>
> Regards,
> Dag Sonstebo
> Cloud Architect
> ShapeBlue
>
> On 26/03/2018, 07:33, "Rohit Yadav"  wrote:
>
> All,
>
> Based on the discussion last week [1], I would like to start a vote to
> put
> the proposal into effect:
>
> - Enable Github issues and wiki features in CloudStack repositories.
> - Both users and developers can use Github issues for tracking issues.
> - Developers can use #id references while fixing an existing/open issue in
> a PR [2]. PRs can be sent without requiring an issue to be opened first.
> - Use Github milestones to track both issues and pull requests towards a
> CloudStack release, and generate release notes.
> - Relax the requirement for JIRA IDs; JIRA will still be used for historical
> reference and security issues. Use of JIRA will be discouraged.
> - The current requirement of two(+) non-author LGTMs will continue for PR
> acceptance. The two(+) PR non-authors can advise resolution to any issue
> that we've not already discussed/agreed upon.
>
> For sanity in tallying the vote, can PMC members please be sure to indicate
> "(binding)" with their vote?
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Vote will be open for 120 hours. If the vote passes, the following actions
> will be taken:
> - Get Github features enabled from ASF INFRA
> - Update CONTRIBUTING.md and other relevant cwiki pages.
> - Update project website
>
> [1] https://markmail.org/message/llodbwsmzgx5hod6
> [2]
> https://blog.github.com/2013-05-14-closing-issues-via-pull-requests/
>
> Regards,
> Rohit Yadav
>
>
>
> dag.sonst...@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>
>
>
>
> nicolas.vazq...@shapeblue.com
> www.shapeblue.com
> ,
> @shapeblue
>
>
>
>




Re: CS 4.8 KVM VMs will not live migrate

2018-02-01 Thread David Mabry
Andrija,

You were right!  The isolation_uri and the broadcast_uri were both blank for 
the problem VMs.  Once I corrected the issue, I was able to migrate them inside 
of CS without issue.  Thanks for helping me get to the root cause of this 
issue.  

Thanks,
David Mabry

On 2/1/18, 3:27 PM, "David Mabry" <dma...@ena.com.INVALID> wrote:

Andrija,

Thanks for the tip.  I'll check that out and let you know what I find.

Thanks,
David Mabry
On 2/1/18, 2:04 PM, "Andrija Panic" <andrija.pa...@gmail.com> wrote:

The customer with serial number here :)

So, another issue which I noticed: when you have KVM host disconnections
(agent disconnect), then in some cases in the cloud.nics table the
broadcast_uri, isolation_uri, state or a similar field will be NULL
instead of having the correct values for a specific NIC of the affected VM.

In this case the VM will not live migrate via ACS (but you can of course
manually migrate it)... the fix is to update the nics table with proper
values (copy values from other NICs in the same network).

Check if this might be the case...
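
For reference, a minimal sketch of how such rows could be inspected and repaired, assuming the usual cloud database layout (a nics table with broadcast_uri / isolation_uri / state columns); the instance id, NIC id and URI values are placeholders to be taken from a healthy NIC in the same network, and the database should be backed up before any update:

    # list the NICs of the affected VM and look for NULL broadcast/isolation URIs
    mysql -u cloud -p cloud -e "SELECT id, instance_id, network_id, broadcast_uri, isolation_uri, state FROM nics WHERE instance_id = 1392 AND removed IS NULL;"

    # copy the values seen on a healthy NIC in the same network into the broken row
    # (4567 and vxlan://1234 are placeholders; back up the database first)
    mysql -u cloud -p cloud -e "UPDATE nics SET broadcast_uri = 'vxlan://1234', isolation_uri = 'vxlan://1234' WHERE id = 4567;"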

Cheers

On 31 January 2018 at 15:49, Tutkowski, Mike <mike.tutkow...@netapp.com>
wrote:

> Glad to hear you fixed the issue! :)
>
> > > On Jan 31, 2018, at 7:16 AM, David Mabry <dma...@ena.com.INVALID> wrote:
> >
> > Mike and Wei,
> >
> > Good news!  I was able to manually live migrate these VMs following the
> > steps outlined below:
> >
> > 1.) virsh dumpxml 38 --migratable > 38.xml
> > 2.) Change the vnc information in 38.xml to match destination host IP
> > and available VNC port
> > 3.) virsh migrate --verbose --live 38 --xml 38.xml qemu+tcp://
> > destination.host.net/system
> >
> > To my surprise, Cloudstack was able to discover and properly handle the
> > fact that this VM was live migrated to a new host without issue.  Very cool.
> >
> > Wei, I suspect you are correct when you said this was an issue with the
> > cloudstack agent code.  After digging a little deeper, the agent is never
> > attempting to talk to libvirt at all after prepping the dxml to send to the
> > destination host.  I'm going to attempt to reproduce this in my lab and
> > attach a remote debugger and see if I can get to the bottom of it.
> >
> > Thanks again for the help guys!  I really appreciate it.
> >
> > Thanks,
> > David Mabry
> >
> > On 1/30/18, 9:55 AM, "David Mabry" <dma...@ena.com.INVALID> wrote:
> >
> > Ah, understood.  I'll take a closer look at the logs and make sure
> > that I didn't accidentally miss those lines when I pulled together the logs
> > for this email chain.
> >
> >Thanks,
> >David Mabry
> >On 1/30/18, 8:34 AM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:
> >
> >Hi David,
> >
> > I encountered the UnsupportedAnswer once before, when I made some
> > changes in the kvm plugin.
> >
> >Normally there should be some network configurations in the
> agent.log but I
> >do not see it.
> >
> >-Wei
> >
> >
> > 2018-01-30 15:00 GMT+01:00 David Mabry <dma...@ena.com.invalid>:
> >
> >> Hi Wei,
> >>
> >> I detached the iso and received the same error.  Just out of curiosity,
> >> what leads you to believe it is something in the vxlan code?  I guess at
> >> this point, attaching a remote debugger to the agent in question might be
> >> the best way to get to the bottom of what is going on.
> >>
> >> Thanks in advance for the help.  I really, really appreciate it.
> >>
> >> Thanks,
> >> David Mabry
> >>
> >> On 1/30/18, 3:30 AM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:
> >>
> >> The answer should be caused by an exception in the cloudstack agent.
> >> I tried to migrate a vm in our testing env, it is working.

Re: CS 4.8 KVM VMs will not live migrate

2018-02-01 Thread David Mabry
Andrija,

Thanks for the tip.  I'll check that out and let you know what I find.

Thanks,
David Mabry
On 2/1/18, 2:04 PM, "Andrija Panic" <andrija.pa...@gmail.com> wrote:

The customer with serial number here :)

So, another issue which I noticed, when you have KVM host disconnections
(agent disconnect), then in some cases in the cloud.NICs table, there will
be missing broadcast_uri, isolation_uri and state or a similar field that is
NULL instead of having correct values for specific NIC of the affected VM.

In this case the VM will not live migrate via ACS (but you can of course
manually migrate it)...the fix is to fix the NICs table with proper values
(copy values from other NICs in the same network).

Check if this might be the case...

Cheers

On 31 January 2018 at 15:49, Tutkowski, Mike <mike.tutkow...@netapp.com>
wrote:

> Glad to hear you fixed the issue! :)
>
> > On Jan 31, 2018, at 7:16 AM, David Mabry <dma...@ena.com.INVALID> wrote:
> >
> > Mike and Wei,
> >
> > Good news!  I was able to manually live migrate these VMs following the
> steps outlined below:
> >
> > 1.) virsh dumpxml 38 --migratable > 38.xml
> > 2.) Change the vnc information in 38.xml to match destination host IP
> and available VNC port
> > 3.) virsh migrate --verbose --live 38 --xml 38.xml qemu+tcp://
> destination.host.net/system
> >
> > To my surprise, Cloudstack was able to discover and properly handle the
> fact that this VM was live migrated to a new host without issue.  Very 
cool.
> >
> > Wei, I suspect you are correct when you said this was an issue with the
> cloudstack agent code.  After digging a little deeper, the agent is never
> attempting to talk to libvirt at all after prepping the dxml to send to 
the
> destination host.  I'm going to attempt to reproduce this in my lab and
> attach a remote debugger and see if I can get to the bottom of it.
> >
    > > Thanks again for the help guys!  I really appreciate it.
> >
> > Thanks,
> > David Mabry
> >
> > On 1/30/18, 9:55 AM, "David Mabry" <dma...@ena.com.INVALID> wrote:
> >
> >Ah, understood.  I'll take a closer look at the logs and make sure
> that I didn't accidentally miss those lines when I pulled together the 
logs
> for this email chain.
> >
> >Thanks,
> >David Mabry
> >On 1/30/18, 8:34 AM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:
> >
> >Hi David,
> >
> >I encountered the UnsupportAnswer once before, when I made some
    > changes in
> >the kvm plugin.
> >
> >Normally there should be some network configurations in the
> agent.log but I
> >do not see it.
> >
> >-Wei
> >
> >
> >2018-01-30 15:00 GMT+01:00 David Mabry <dma...@ena.com.invalid>:
> >
> >> Hi Wei,
> >>
> >> I detached the iso and received the same error.  Just out of curiosity,
> >> what leads you to believe it is something in the vxlan code?  I guess 
at
> >> this point, attaching a remote debugger to the agent in question might
> be
> >> the best way to get to the bottom of what is going on.
> >>
> >> Thanks in advance for the help.  I really, really appreciate it.
> >>
> >> Thanks,
> >> David Mabry
> >>
> >> On 1/30/18, 3:30 AM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:
> >>
> >>The answer should be caused by an exception in the cloudstack agent.
> >>I tried to migrate a vm in our testing env, it is working.
> >>
> >>there are some different between our env and yours.
> >>(1) vlan VS vxlan
> >>(2) no ISO VS attached ISO
> >>(3) both of us use ceph and centos7.
> >>
> >>I suspect it is caused by codes on vxlan.
> >>However, could you detach the ISO and try again ?
> >>
> >>-Wei
> >>
> >>
> >>
> >>2018-01-29 19:48 GMT+01:00 David Mabry <dma...@ena.com.invalid>:
> >>
> >>> Good day Cloudstack Devs,
> >>>
> >>> I've run across a real head scratcher.  I have two VMs, (initially 3
> >> VMs,
  

Re: CS 4.8 KVM VMs will not live migrate

2018-01-31 Thread David Mabry
Mike and Wei,

Good news!  I was able to manually live migrate these VMs following the steps 
outlined below:

1.) virsh dumpxml 38 --migratable > 38.xml
2.) Change the vnc information in 38.xml to match destination host IP and 
available VNC port
3.) virsh migrate --verbose --live 38 --xml 38.xml 
qemu+tcp://destination.host.net/system
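
For step 2, a minimal sketch of what that edit can look like, assuming the dumped XML contains the usual <graphics type='vnc' ... listen='...' port='...'/> element (the IP addresses and ports below are placeholders for the destination host's address and a free VNC port there):

    grep -n "graphics type='vnc'" 38.xml
    sed -i -e "s/listen='10.220.1.11'/listen='10.220.1.12'/g" \
           -e "s/port='5902'/port='5903'/" 38.xml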

To my surprise, Cloudstack was able to discover and properly handle the fact 
that this VM was live migrated to a new host without issue.  Very cool.

Wei, I suspect you are correct when you said this was an issue with the 
cloudstack agent code.  After digging a little deeper, the agent is never 
attempting to talk to libvirt at all after prepping the dxml to send to the 
destination host.  I'm going to attempt to reproduce this in my lab and attach 
a remote debugger and see if I can get to the bottom of it.

Thanks again for the help guys!  I really appreciate it.

Thanks,
David Mabry

On 1/30/18, 9:55 AM, "David Mabry" <dma...@ena.com.INVALID> wrote:

Ah, understood.  I'll take a closer look at the logs and make sure that I 
didn't accidentally miss those lines when I pulled together the logs for this 
email chain.
    
    Thanks,
David Mabry
On 1/30/18, 8:34 AM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:

Hi David,

I encountered the UnsupportAnswer once before, when I made some changes 
in
the kvm plugin.

Normally there should be some network configurations in the agent.log 
but I
do not see it.

-Wei


    2018-01-30 15:00 GMT+01:00 David Mabry <dma...@ena.com.invalid>:

> Hi Wei,
>
> I detached the iso and received the same error.  Just out of 
curiosity,
> what leads you to believe it is something in the vxlan code?  I guess 
at
> this point, attaching a remote debugger to the agent in question 
might be
> the best way to get to the bottom of what is going on.
>
> Thanks in advance for the help.  I really, really appreciate it.
>
> Thanks,
> David Mabry
>
> On 1/30/18, 3:30 AM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:
>
> The answer should be caused by an exception in the cloudstack 
agent.
> I tried to migrate a vm in our testing env, it is working.
>
> there are some different between our env and yours.
> (1) vlan VS vxlan
> (2) no ISO VS attached ISO
> (3) both of us use ceph and centos7.
>
> I suspect it is caused by codes on vxlan.
> However, could you detach the ISO and try again ?
>
> -Wei
>
>
>
> 2018-01-29 19:48 GMT+01:00 David Mabry <dma...@ena.com.invalid>:
>
> > Good day Cloudstack Devs,
> >
> > I've run across a real head scratcher.  I have two VMs, 
(initially 3
> VMs,
> > but more on that later) on a single host, that I cannot live 
migrate
> to any
> > other host in the same cluster.  We discovered this after 
attempting
> to
> > roll out patches going from CentOS 7.2 to CentOS 7.4.  
Initially, we
> > thought it had something to do with the new version of libvirtd 
or
> qemu-kvm
> > on the other hosts in the cluster preventing these VMs from
> migrating, but
> > we are able to live migrate other VMs to and from this host 
without
> issue.
> > We can even create new VMs on this specific host and live 
migrate
> them
> > after creation with no issue.  We've put the migration source 
agent,
> > migration destination agent and the management server in debug 
and
> don't
> > seem to get anything useful other than "Unsupported command".
> Luckily, we
> > did have one VM that was shutdown and restarted, this is the 
3rd VM
> > mentioned above.  Since that VM has been restarted, it has no 
issues
> live
> > migrating to any other host in the cluster.
> >
> > I'm at a loss as to what to try next and I'm hoping that 
someone out
> there
> > might have had a similar issue and could shed some light on 
what to
> do.
> > Obviously, I can contact the customer and have them shutdown 
their
> VMs, but
> > that will potentially just delay this problem to be solved 
another
>

Re: CS 4.8 KVM VMs will not live migrate

2018-01-30 Thread David Mabry
Ah, understood.  I'll take a closer look at the logs and make sure that I 
didn't accidentally miss those lines when I pulled together the logs for this 
email chain.

Thanks,
David Mabry
On 1/30/18, 8:34 AM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:

Hi David,

I encountered the UnsupportedAnswer once before, when I made some changes in
the kvm plugin.

Normally there should be some network configurations in the agent.log but I
do not see it.

-Wei


2018-01-30 15:00 GMT+01:00 David Mabry <dma...@ena.com.invalid>:

> Hi Wei,
>
> I detached the iso and received the same error.  Just out of curiosity,
> what leads you to believe it is something in the vxlan code?  I guess at
> this point, attaching a remote debugger to the agent in question might be
> the best way to get to the bottom of what is going on.
>
> Thanks in advance for the help.  I really, really appreciate it.
>
> Thanks,
> David Mabry
>
> On 1/30/18, 3:30 AM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:
>
> The answer should be caused by an exception in the cloudstack agent.
> I tried to migrate a vm in our testing env, it is working.
>
> there are some different between our env and yours.
> (1) vlan VS vxlan
> (2) no ISO VS attached ISO
> (3) both of us use ceph and centos7.
>
> I suspect it is caused by codes on vxlan.
> However, could you detach the ISO and try again ?
>
> -Wei
>
>
>
> 2018-01-29 19:48 GMT+01:00 David Mabry <dma...@ena.com.invalid>:
>
> > Good day Cloudstack Devs,
> >
> > I've run across a real head scratcher.  I have two VMs, (initially 3
> VMs,
> > but more on that later) on a single host, that I cannot live migrate
> to any
> > other host in the same cluster.  We discovered this after attempting
> to
> > roll out patches going from CentOS 7.2 to CentOS 7.4.  Initially, we
> > thought it had something to do with the new version of libvirtd or
> qemu-kvm
> > on the other hosts in the cluster preventing these VMs from
> migrating, but
> > we are able to live migrate other VMs to and from this host without
> issue.
> > We can even create new VMs on this specific host and live migrate
> them
> > after creation with no issue.  We've put the migration source agent,
> > migration destination agent and the management server in debug and
> don't
> > seem to get anything useful other than "Unsupported command".
> Luckily, we
> > did have one VM that was shutdown and restarted, this is the 3rd VM
> > mentioned above.  Since that VM has been restarted, it has no issues
> live
> > migrating to any other host in the cluster.
> >
> > I'm at a loss as to what to try next and I'm hoping that someone out
> there
> > might have had a similar issue and could shed some light on what to
> do.
> > Obviously, I can contact the customer and have them shutdown their
> VMs, but
> > that will potentially just delay this problem to be solved another
> day.
> > Even if shutting down the VMs is ultimately the solution, I'd still
> like to
> > understand what happened to cause this issue in the first place with
> the
> > hopes of preventing it in the future.
> >
> > Here's some information about my setup:
> > Cloudstack 4.8 Advanced Networking
> > CentOS 7.2 and 7.4 Hosts
> > Ceph RBD Primary Storage
> > NFS Secondary Storage
> > Instance in Question for Debug: i-532-1392-NSVLTN
> >
> > I have attached relevant debug logs to this email if anyone wishes
> to take
> > a look.  I think the most interesting error message that I have
> received is
> > the following:
> >
> > 468390:2018-01-27 08:59:35,172 DEBUG [c.c.a.t.Request]
> > (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802
> ctx-8e7f45ad)
> > (logid:f0888362) Seq 22-942378222027276319: Received:  { Ans: ,
> MgmtId:
> > 14038012703634, via: 22(csh02c01z01.nsvltn.ena.net), Ver: v1,
> Flags: 110,
> > { UnsupportedAnswer } }
> > 468391:2018-01-27 08:59:35,172 WARN  [c.c.a.m.AgentManagerImpl]
> > (Work-

Re: CS 4.8 KVM VMs will not live migrate

2018-01-30 Thread David Mabry
Hi Wei,

I detached the iso and received the same error.  Just out of curiosity, what 
leads you to believe it is something in the vxlan code?  I guess at this point, 
attaching a remote debugger to the agent in question might be the best way to 
get to the bottom of what is going on.

Thanks in advance for the help.  I really, really appreciate it.

Thanks,
David Mabry

On 1/30/18, 3:30 AM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:

The answer should be caused by an exception in the cloudstack agent.
I tried to migrate a vm in our testing env, and it is working.

There are some differences between our env and yours:
(1) vlan VS vxlan
(2) no ISO VS attached ISO
(3) both of us use ceph and centos7.

I suspect it is caused by the vxlan code.
However, could you detach the ISO and try again?

-Wei



2018-01-29 19:48 GMT+01:00 David Mabry <dma...@ena.com.invalid>:

> Good day Cloudstack Devs,
>
> I've run across a real head scratcher.  I have two VMs, (initially 3 VMs,
> but more on that later) on a single host, that I cannot live migrate to 
any
> other host in the same cluster.  We discovered this after attempting to
> roll out patches going from CentOS 7.2 to CentOS 7.4.  Initially, we
> thought it had something to do with the new version of libvirtd or 
qemu-kvm
> on the other hosts in the cluster preventing these VMs from migrating, but
> we are able to live migrate other VMs to and from this host without issue.
> We can even create new VMs on this specific host and live migrate them
> after creation with no issue.  We've put the migration source agent,
> migration destination agent and the management server in debug and don't
> seem to get anything useful other than "Unsupported command".  Luckily, we
> did have one VM that was shutdown and restarted, this is the 3rd VM
> mentioned above.  Since that VM has been restarted, it has no issues live
> migrating to any other host in the cluster.
>
> I'm at a loss as to what to try next and I'm hoping that someone out there
> might have had a similar issue and could shed some light on what to do.
> Obviously, I can contact the customer and have them shutdown their VMs, 
but
> that will potentially just delay this problem to be solved another day.
> Even if shutting down the VMs is ultimately the solution, I'd still like 
to
> understand what happened to cause this issue in the first place with the
> hopes of preventing it in the future.
>
> Here's some information about my setup:
> Cloudstack 4.8 Advanced Networking
> CentOS 7.2 and 7.4 Hosts
> Ceph RBD Primary Storage
> NFS Secondary Storage
> Instance in Question for Debug: i-532-1392-NSVLTN
>
> I have attached relevant debug logs to this email if anyone wishes to take
> a look.  I think the most interesting error message that I have received 
is
> the following:
>
> 468390:2018-01-27 08:59:35,172 DEBUG [c.c.a.t.Request]
> (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad)
> (logid:f0888362) Seq 22-942378222027276319: Received:  { Ans: , MgmtId:
> 14038012703634, via: 22(csh02c01z01.nsvltn.ena.net), Ver: v1, Flags: 110,
> { UnsupportedAnswer } }
> 468391:2018-01-27 08:59:35,172 WARN  [c.c.a.m.AgentManagerImpl]
> (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad)
> (logid:f0888362) Unsupported Command: Unsupported command issued:
> com.cloud.agent.api.PrepareForMigrationCommand.  Are you sure you got the
> right type of server?
> 468392:2018-01-27 08:59:35,179 ERROR [c.c.v.VmWorkJobHandlerProxy]
> (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad)
> (logid:f0888362) Invocation exception, caused by: 
com.cloud.exception.AgentUnavailableException:
> Resource [Host:22] is unreachable: Host 22: Unable to prepare for 
migration
> due to Unsupported command issued: 
com.cloud.agent.api.PrepareForMigrationCommand.
> Are you sure you got the right type of server?
> 468393:2018-01-27 08:59:35,179 INFO  [c.c.v.VmWorkJobHandlerProxy]
> (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad)
> (logid:f0888362) Rethrow exception 
com.cloud.exception.AgentUnavailableException:
> Resource [Host:22] is unreachable: Host 22: Unable to prepare for 
migration
> due to Unsupported command issued: 
com.cloud.agent.api.PrepareForMigrationCommand.
> Are you sure you got the right type of server?
>
> I've tracked this "Unsupported command" down in the CS 4.8 code to
> cloudstack/api/src/com/cloud/agent/api/Answer.java which is the gen

Re: CS 4.8 KVM VMs will not live migrate

2018-01-29 Thread David Mabry
Mike,

Thanks for the reply.  As requested:

Will not Migrate

[libvirt domain XML for i-532-1392-NSVLTN (uuid f7dbf00b-2e15-4991-a407-cf27a3d65d1e, description "Other PV Virtio-SCSI (64-bit)", 4194304 KiB memory, 2 vCPUs, CPU model Haswell-noTSX, emulator /usr/libexec/qemu-kvm); the XML markup was stripped by the mail archive and only the element values survive.]

Will migrate

[libvirt domain XML for i-532-1298-NSVLTN (uuid d6ec74b8-4f6a-405c-834e-ece42151b802, description "Windows PV", 4194304 KiB memory, 1 vCPU, CPU model Haswell, emulator /usr/libexec/qemu-kvm); markup likewise stripped.]

David Mabry
Manager of Systems Engineering
On 1/29/18, 5:30 PM, "Tutkowski, Mike" <mike.tutkow...@netapp.com> wrote:

Hi David,

So, I don’t know if what I am going to say here will at all be of use to 
you, but maybe. :)

I had a customer one time mention to me that he had trouble with live VM 
migration on KVM with a VM that was created on an older version of CloudStack. 
Live VM migration worked fine for these VMs on the older version of CloudStack 
(I think it was version 4.5) and stopped working when he upgraded to 4.8. New 
VMs (VMs created on the newer version of CloudStack) worked fine for this 
feature on 4.8, but old VMs had to be stopped and re-started for live VM 
migration to work. I believe the older version of CloudStack was not placing 
the serial number of the VM in the VM’s XML descriptor file, but newer versions 
of CloudStack were expecting this field.

Can you dump the XML of one or both of your VMs that don’t live migrate and 
see if they have the serial number field in their XML? Then, I’d recommend 
dumping the XML of the VM that works and seeing if it does, in fact, have the 
serial number field in its XML.
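
A quick way to make that check, assuming the domain names as shown by virsh list (grep -i serial will also match disk <serial> elements, so look specifically for the smbios/sysinfo entry):

    virsh dumpxml i-532-1392-NSVLTN | grep -i serial    # VM that will not live migrate
    virsh dumpxml i-532-1298-NSVLTN | grep -i serial    # VM that migrates fine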

I hope this is of some help.

Talk to you later,
Mike

On 1/29/18, 11:48 AM, "David Mabry" <dma...@ena.com.INVALID> wrote:

Good day Cloudstack Devs,

I've run across a real head scratcher.  I have two VMs, (initially 3 
VMs, but more on that later) on a single host, that I cannot live migrate to 
any other host in the same cluster.  We discovered this after attempting to 
roll out patches going from CentOS 7.2 to CentOS 7.4.  Initially, we thought it 
had something to do with the new version of libvirtd or qemu-kvm on the other 
hosts in the cluster preventing these VMs from migrating, but we are able to 
live migrate other VMs to and from this host without issue.  We can even create 
new VMs on this specific host and live migrate them after creation with no 
issue.  We've put the migration source agent, migration destination agent and 
the management server in debug and don't seem to get anything useful other than 
"Unsupported command".  Luckily, we did have one VM that was shutdown and 
restarted, this is the 3rd VM mentioned above.  Since that VM has been 
restarted, it has no issues live migrating to any other host in the cluster.

I'm at a loss as to wh

CS 4.8 KVM VMs will not live migrate

2018-01-29 Thread David Mabry
Good day Cloudstack Devs,

I've run across a real head scratcher.  I have two VMs, (initially 3 VMs, but 
more on that later) on a single host, that I cannot live migrate to any other 
host in the same cluster.  We discovered this after attempting to roll out 
patches going from CentOS 7.2 to CentOS 7.4.  Initially, we thought it had 
something to do with the new version of libvirtd or qemu-kvm on the other hosts 
in the cluster preventing these VMs from migrating, but we are able to live 
migrate other VMs to and from this host without issue.  We can even create new 
VMs on this specific host and live migrate them after creation with no issue.  
We've put the migration source agent, migration destination agent and the 
management server in debug and don't seem to get anything useful other than 
"Unsupported command".  Luckily, we did have one VM that was shutdown and 
restarted, this is the 3rd VM mentioned above.  Since that VM has been 
restarted, it has no issues live migrating to any other host in the cluster.

I'm at a loss as to what to try next and I'm hoping that someone out there 
might have had a similar issue and could shed some light on what to do.  
Obviously, I can contact the customer and have them shutdown their VMs, but 
that will potentially just delay this problem to be solved another day.  Even 
if shutting down the VMs is ultimately the solution, I'd still like to 
understand what happened to cause this issue in the first place with the hopes 
of preventing it in the future.

Here's some information about my setup:
Cloudstack 4.8 Advanced Networking
CentOS 7.2 and 7.4 Hosts
Ceph RBD Primary Storage
NFS Secondary Storage
Instance in Question for Debug: i-532-1392-NSVLTN

I have attached relevant debug logs to this email if anyone wishes to take a 
look.  I think the most interesting error message that I have received is the 
following:

468390:2018-01-27 08:59:35,172 DEBUG [c.c.a.t.Request] 
(Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad) 
(logid:f0888362) Seq 22-942378222027276319: Received:  { Ans: , MgmtId: 
14038012703634, via: 22(csh02c01z01.nsvltn.ena.net), Ver: v1, Flags: 110, { 
UnsupportedAnswer } }
468391:2018-01-27 08:59:35,172 WARN  [c.c.a.m.AgentManagerImpl] 
(Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad) 
(logid:f0888362) Unsupported Command: Unsupported command issued: 
com.cloud.agent.api.PrepareForMigrationCommand.  Are you sure you got the right 
type of server?
468392:2018-01-27 08:59:35,179 ERROR [c.c.v.VmWorkJobHandlerProxy] 
(Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad) 
(logid:f0888362) Invocation exception, caused by: 
com.cloud.exception.AgentUnavailableException: Resource [Host:22] is 
unreachable: Host 22: Unable to prepare for migration due to Unsupported 
command issued: com.cloud.agent.api.PrepareForMigrationCommand.  Are you sure 
you got the right type of server?
468393:2018-01-27 08:59:35,179 INFO  [c.c.v.VmWorkJobHandlerProxy] 
(Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad) 
(logid:f0888362) Rethrow exception 
com.cloud.exception.AgentUnavailableException: Resource [Host:22] is 
unreachable: Host 22: Unable to prepare for migration due to Unsupported 
command issued: com.cloud.agent.api.PrepareForMigrationCommand.  Are you sure 
you got the right type of server?

I've tracked this "Unsupported command" down in the CS 4.8 code to 
cloudstack/api/src/com/cloud/agent/api/Answer.java which is the generic answer 
class.  I believe where the error is really being spawned from is 
cloudstack/engine/orchestration/src/com/cloud/vm/VirtualMachineManagerImpl.java.
  Specifically:
Answer pfma = null;
try {
    pfma = _agentMgr.send(dstHostId, pfmc);
    if (pfma == null || !pfma.getResult()) {
        final String details = pfma != null ? pfma.getDetails() : "null answer returned";
        final String msg = "Unable to prepare for migration due to " + details;
        pfma = null;
        throw new AgentUnavailableException(msg, dstHostId);
    }

The pfma returned must be in error or is never returned and therefore still 
null.  That answer appears that it should be coming from the destination agent, 
but for the life of me I can't figure out what the root cause of this error is 
beyond, "Unsupported command issued".  What command is unsupported?  My guess 
is that it could be something wrong with the dxml that is generated and passed 
to the destination host, but I have as yet been unable to catch that dxml in 
debug.
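
A rough sketch of one way to watch for that on the destination agent, assuming the usual agent paths (/etc/cloudstack/agent and /var/log/cloudstack/agent on a CentOS 7 KVM host; review log4j-cloud.xml rather than blindly replacing levels):

    # raise the agent log level to DEBUG, then watch for the migration commands
    sed -i 's/INFO/DEBUG/g' /etc/cloudstack/agent/log4j-cloud.xml
    systemctl restart cloudstack-agent
    tail -f /var/log/cloudstack/agent/agent.log | grep -iE 'PrepareForMigrationCommand|MigrateCommand|Unsupported'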

Any help or guidance is greatly appreciated.

Thanks,
David Mabry

Management Server Debug
=
468377:2018-01-27 08:59:35,101 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(AsyncJobMgr-Heartbeat-1:ctx-ce324459) (logid:943aafdc) Execute sync-queue 
item: SyncQueueItemVO {id:69592, queueId: 65982, contentT

Re: [ANNOUNCE] Syed Mushtaq Ahmed has joined the PMC

2017-10-11 Thread David Mabry
Congrats Syed!

On 10/11/17, 7:20 AM, "Ian Rae"  wrote:

Hear hear!

On Wed, Oct 11, 2017 at 6:55 AM, Nux!  wrote:

> Congrats :)
>
> --
> Sent from the Delta quadrant using Borg technology!
>
> Nux!
> www.nux.ro
>
> - Original Message -
> > From: "Sigert GOEMINNE" 
> > To: "dev" 
> > Cc: "users" 
> > Sent: Wednesday, 11 October, 2017 10:20:50
> > Subject: Re: [ANNOUNCE] Syed Mushtaq Ahmed has joined the PMC
>
> > Congratulations Syed!
> >
> > *Sigert Goeminne*
> > Software Development Engineer
> >
> > *nuage*networks.net 
> > Copernicuslaan 50
> > 2018 Antwerp
> > Belgium
> >
> >
> >
> >
> > On Wed, Oct 11, 2017 at 10:41 AM, Nitin Kumar Maharana <
> > nitinkumar.mahar...@accelerite.com> wrote:
> >
> >> Congratulations Syed!!!
> >> On 09-Oct-2017, at 4:56 PM, Paul Angus <paul.an...@shapeblue.com> wrote:
> >>
> >> Fellow CloudStackers,
> >>
> >> It gives me great pleasure to say that Syed has been invited to join the
> >> PMC and has gracefully accepted.
> >> Please join me in congratulating Syed!
> >>
> >>
> >> Kind regards,
> >>
> >> Paul Angus
> >>
> >>
> >> paul.an...@shapeblue.com
> >> www.shapeblue.com
> >> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> >> @shapeblue
> >>
> >>
> >>
> >>
> >> DISCLAIMER
> >> ==
> >> This e-mail may contain privileged and confidential information which 
is
> >> the property of Accelerite, a Persistent Systems business. It is
> intended
> >> only for the use of the individual or entity to which it is addressed.
> If
> >> you are not the intended recipient, you are not authorized to read,
> retain,
> >> copy, print, distribute or use this message. If you have received this
> >> communication in error, please notify the sender and delete all copies
> of
> >> this message. Accelerite, a Persistent Systems business does not accept
> any
> >> liability for virus infected mails.
>



-- 
Ian Rae
CEO | PDG
c: 514.944.4008

CloudOps | Cloud Infrastructure and Networking Solutions
www.cloudops.com | 420 rue Guy | Montreal | Canada | H3J 1S6




Re: [README][Quarterly Call] - CloudStack Development, Blockers and Community Efforts

2017-09-05 Thread David Mabry
Hi Ilya,

We are also interested in contributing support for the RBD/Ceph Storage backend 
for the new KVM HA feature that was rolled in 4.10.  We haven’t started work on 
that effort at this time, so I don’t have a PR to reference.

Thanks,
David Mabry

On 9/5/17, 2:13 PM, "ilya" <ilya.mailing.li...@gmail.com> wrote:

Hi ENA team

Please send updates today before 5pm PST if you'd like to have more
items discussed.

On 8/18/17 5:12 AM, Simon Weller wrote:
> Ilya,
> 
> 
> I'll be attending with a few other folks from ENA.
> 
> 
> Here's one for the Dev efforts -
> 
> 
> 
>  Ability to Specify Mac Address when plugging a network
>   We're working on cloud migration strategies and part of that is making
> the move as seamless as possible.
> 
>   The ability to specify a mac address when shifting a VM workload from
> another environment makes the transition a lot easier.
>   https://issues.apache.org/jira/browse/CLOUDSTACK-9949
> 
>   https://github.com/apache/cloudstack/pull/2143
>   Nathan Johnson
>   PR has been submitted as of 7/13 and is awaiting review from the
> community (Targeting 4.11)
> 
> 
> We'll discuss our roadmap internally for the next half and get back to 
you with additions before the call.
> 
> 
> - Si
> 
> 
> From: ilya <ilya.mailing.li...@gmail.com>
> Sent: Thursday, August 17, 2017 7:29 PM
> To: dev@cloudstack.apache.org
> Subject: Re: [README][Quarterly Call] - CloudStack Development, Blockers 
and Community Efforts
> 
> Hi All,
> 
> I'd like to pick this thread back up and see if you are joining. As a
> reminder, proposed date is September 6th 2017, time 9AM PST.
> 
> If you are, please kindly respond. If you have things to discuss -
> please use the outline below:
> 
>   1) Development efforts - 60 minutes
> Upcoming Features you are working on developing (to avoid
> collision and maintain the roadmap).
>   Depending on number of topics we need to discuss - time for
> each topic will be set accordingly.
>   If you would like to participate - please respond to this
> thread and adhere to sample format below:
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 2) Release Blockers - 20 minutes
>   If you would like to participate - please respond to this
> thread and adhere to sample format below:
> 
> 
> 
> 
> 
> 
> 3) Community Efforts - 10+ minutes
> 
> 
> 
> 
> 
> 
> 
> Thanks
> ilya
> 
> On 8/1/17 10:55 AM, ilya wrote:
>> Hi Team
>>
>> Proposed new date for first quarterly call
>>
>> September 6th 2017, time 9AM PST.
>>
>> This is a month out and hopefully can work with most folks. If it does
>> not work with your timing - please consider finding delegates and/or
>> representatives.
>>
>> Regards
>> ilya
>>
>> On 7/20/17 6:11 AM, Wido den Hollander wrote:
>>>
>>>> Op 20 juli 2017 om 14:58 schreef Giles Sirett 
<giles.sir...@shapeblue.com>:
>>>>
>>>>
>>>> Hi Ilya
>>>> Sorry, I should have highlighted that User Group meeting clash before
>>>>
>>>> Under normal circumstances, I would say: it's futile trying to coordinate
>>>> calendars with such a broad audience - there will always be some people not
>>>> available, just set a regular date, keep it rolling (build it and they will come)
>>>>
>>>> However, for the first call, there will be at least Wido, Mike, Paul, 
Daan, me and probably a lot more PMC members not available because of the user 
group meeting
>>>>
>>>
>>> +1 I will be present!
>>>
>>> Wido
>>>
>>>> To keep it simple, I'd therefore say, go with the fol

Re: Miami CCC '17 Roundtable/Hackathon Summary

2017-05-24 Thread David Mabry
Marty, thanks for keeping usability and adoption at the forefront of this 
conversation.  I believe it is something that can be easily lost as we get deep 
in the technical weeds and it is good to be reminded about what is important 
beyond how much code it will take to create feature X.  I also believe that 
together we can come up with a solution that meets both technical and ease of 
implementation requirements.  I think we can make a significant design change 
to allow for a new VR, hopefully not one maintained by us, and include new 
feature like NFV without forcing undue complexity on users who don’t want or 
need it.  In the end, for me, it really comes down to orchestration.  How much 
can we do for a user “out of the box”?  I think it is important that we have a 
“default” option for the VR/Networking that is easy to implement and fits most 
SMB use cases.  With that said, I don’t think we can risk alienating the larger 
companies that use ACS in a more complex environment.  I think for ACS to stay 
relevant and compete against the likes of OpenStack we will need features like 
NFV and a VR that is consistent and stable.  Oh and we also need IPv6 (That was 
for you Wido ;) ).

I agree with Paul that we might want to create a dedicated channel or have 
weekly meetings to begin really pushing this major feature forward with the 
community, sooner rather than later.  It is easy for us to lose momentum on 
monumental tasks such as this.

In short, one of the great features of ACS is that it provides *choice*.  What 
is important, to Marty’s point, is that we don’t lose sight of usability and 
ease of implementation when providing that choice for the wide variety of users 
that we have.

Thanks,
Dave Mabry
Education Networks of America

On 5/23/17, 10:29 PM, "Marty Godsey"  wrote:

Thank you Simon for the re-cap of the hackathon. I was able to catch the 
last couple of hours of it but saw the notes on the boards..

I am going to give my thoughts on this coming from a slightly different 
angle. As many of you know, I am not a coder. I am an Systems/Network Engineer. 
I know many times design decisions are made based upon the amount of time it 
will require to write a particular piece of code, update, fix bugs, etc. But 
the one thing we can't forget is that many ACS users may not have the ability 
to add their own plugins, write code to interact with a router, etc. I know I 
can't myself, going back to the I'm not a coder, but thankfully I know people 
that can and can get it done if need be but the point is many people cannot. As 
we decide how we are going to re-write the networking portions of ACS we have 
to step back and take a look at what was one of the most talked about topics at 
this year's CCC. I am not talking about the networking, IPV6 support or any 
other cool idea we had. The constant conversation in the hallways and at the 
many "Zest" outings was ADOPTION and MARKET AWARENESS.

Adoption.. How do we get the word out and get it adopted by more people? 
It’s a tough question but something that also has to influence how we build 
ACS. Let take a moment and compare ACS to its closest competitor Openstack. We 
all know that Openstack has the market share, it has the money behind it. But 
what is the constant complaint we hear from people who use? ""Yea, it works but 
man,, it was a bi%#h to get going""  Openstack has gotten its adoption cause it 
had big names and a lot of money behind it. Openstacks complexity has also 
caused it to not be adopted in many cases. Your typical IT shop in a small to 
medium sized business does not have the expertise to implement something like 
this. And when I say SMB I am saying organizations from 10-500 people.

So back to my adoption question. As mentioned before one of the reasons 
many people come to ACS is the fact that it has it all. Networking, hyper-visor 
management, user management, storage management, its multi-tenant. What will 
drive ACS adoption will be improving what ACS already does, not making it more 
like OpenStack. Now do I think that having a module service or plugin service 
to provide a framework to allow for external resources to be used by ACS is a 
good thing? Yes I do. But I also do not want to, and hope we don’t, move away 
from what made ACS what it is today. A software that allows companies to easily 
spin up new public or private clouds. Adoption-Centric Usability. 

If I rambled a little here I apologize, its 11:30pm and sometimes I get 
ahead of myself (especially when I write something like this at this hour) when 
writing about something I am passionate about and I am passionate about getting 
more exposure and adoption of ACS.

Thank you for listening guys.. Sorry for the ramble.

Regards,
Marty Godsey
Principal Engineer
nSource Solutions, LLC

-Original Message-
From: Rafael Weingärtner 

Re: [DISCUSS] Config Drive: Using the OpenStack format?

2017-05-19 Thread David Mabry
+1.  I like it!

On 5/19/17, 11:58 AM, "Wei ZHOU"  wrote:

gd idea

2017-05-19 15:33 GMT+02:00 Marc-Aurèle Brothier :

> Hi Widoo,
>
> That sounds like a pretty good idea in my opinion. +1 for adding it
>
> Marco
>
>
> > On 19 May 2017, at 15:15, Wido den Hollander  wrote:
> >
> > Hi,
> >
> > Yesterday at ApacheCon Kris from Nuage networks gave a great
> presentation about alternatives for userdata from the VR: Config Drive
> >
> > In short, a CD-ROM/ISO attached to the Instance containing the
> meta/userdata instead of having the VR serve it.
> >
> > The outstanding PR [0] uses it's own format on the ISO while cloud-init
> already has support for config drive [1].
> >
> > This format uses 'openstack' in the name, but it seems to be in
> cloud-init natively and well supported.
> >
> > I started the discussion yesterday during the talk and thought to take
> it to the list.
> >
> > My opinion is that we should use the OpenStack format for the config
> drive:
> >
> > - It's already in cloud-init
> > - Easier to templates to be used on CloudStack
> > - Easier adoption
> >
> > We can always write a file like "GENERATED_BY_APACHE_CLOUDSTACK" or
> something on the ISO.
> >
> > We can also symlink the 'openstack' directory to a directory called
> 'cloudstack' on the ISO.
> >
> > Does anybody else have an opinion on this one?
> >
> > Wido
> >
> > [0]: https://github.com/apache/cloudstack/pull/2097
> > [1]: http://cloudinit.readthedocs.io/en/latest/topics/
> datasources/configdrive.html#version-2
>
>
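
For reference, a rough sketch of an ISO built in the cloud-init ConfigDrive v2 layout discussed above (the openstack/latest paths and the config-2 volume label are what cloud-init's ConfigDrive datasource looks for; the marker file is the optional one suggested by Wido):

    mkdir -p configdrive/openstack/latest
    cp meta_data.json user_data configdrive/openstack/latest/
    touch configdrive/GENERATED_BY_APACHE_CLOUDSTACK
    genisoimage -o configdrive.iso -V config-2 -J -R configdrive/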




Re: PRs for 4.10

2017-02-22 Thread David Mabry
I would like to see the following PRs merged:

https://github.com/apache/cloudstack/pull/1915
https://github.com/apache/cloudstack/pull/1954 <- This one could be backported 
to 4.8/4.9

Collectively, these solve an issue that causes the VPC router to fail due to a 
full /var/log FS.  We’ve tested these extensively in the lab and confirmed that 
it does solve the problem.

Thanks,
David Mabry

On 2/22/17, 2:00 PM, "williamstev...@gmail.com on behalf of Will Stevens" 
<williamstev...@gmail.com on behalf of wstev...@cloudops.com> wrote:

I would like to get this fix in:
https://github.com/apache/cloudstack/pull/1907

This fix is really important because it causes routing issues when the IP,
which was not cleaned up, is later used on a different VR.

*Will STEVENS*
Lead Developer

<https://goo.gl/NYZ8KK>

On Wed, Feb 22, 2017 at 5:29 AM, Rohit Yadav <rohit.ya...@shapeblue.com>
wrote:

> Rajani and all,
>
>
> Please consider following PRs for review and merge:
>
>
> https://github.com/apache/cloudstack/pull/1829 (XenServer HVM VM attached
> disk limit bugfix, enough LGTMs and test results)
>
> https://github.com/apache/cloudstack/pull/1941 (same as above with
> additional fixes, lacks review lgtms)
> https://github.com/apache/cloudstack/pull/1896 (translation updates)
>
> https://github.com/apache/cloudstack/pull/1770 (template size issue
> bugfix)
>
> https://github.com/apache/cloudstack/pull/1950 (Ubuntu 16.04 packaging
> tomcat6/7 support)
>
> https://github.com/apache/cloudstack/pull/1951 (xenserver7 capability fix
> and upgrade path from 4.9.2->4.9.3)
>
> https://github.com/apache/cloudstack/pull/1944 (metrics view and infra
> tab/UI performance improvement)
> https://github.com/apache/cloudstack/pull/1903 (cannot add users to vpc
> vpn)
>
> https://github.com/apache/cloudstack/pull/1945
> https://github.com/apache/cloudstack/pull/1946
> https://github.com/apache/cloudstack/pull/1947
> https://github.com/apache/cloudstack/pull/1856
> https://github.com/apache/cloudstack/pull/1212
>
>
>
> Regards.
>
> 
> From: Rajani Karuturi <raj...@apache.org>
> Sent: 22 February 2017 14:41:42
> To: dev@cloudstack.apache.org
> Subject: Re: PRs for 4.10
>
> 6 days to go for the first RC.
> Please get the required lgtms and tests ready for the PRs you like to see
> in 4.10.
> I will try to merge all the PRs that meet the criteria.
>
> Thanks,
> ~Rajani
>
> On 19 Feb 2017 3:11 p.m., "Rajani Karuturi" <raj...@apache.org> wrote:
>
> > noted.
> >
> >
> > ~ Rajani
> >
> > http://cloudplatform.accelerite.com/
> >
> >
> > On February 17, 2017 at 6:03 PM, Frank Maximus (
> > frank.maxi...@nuagenetworks.net) wrote:
> >
> > I have a couple of bugfixes on previous version outstanding,
> > which I would like to have merged to 4.10,
> > both still requiring review:
> > on 4.8: PR#1912 <https://github.com/apache/cloudstack/pull/1912>: which
> > fixes password service running on internal lb vms, making it impossible to
> > do loadbalancing on port 8080
> > on 4.9: PR#1925 <https://github.com/apache/cloudstack/pull/1925>: Minor
> > plugin fix
> >
> > Kind Regards,
> > Frank
> >
> > On Tue, Feb 14, 2017 at 10:56 PM Syed Ahmed <sah...@cloudops.com> wrote:
> >
> > I'd like to include https://github.com/apache/cloudstack/pull/1928 to
> 4.10
> > as well. This is a simple fix that adds hypervisor capabilities for
> > XenServer 7
> >
> > Thanks,
> > -Syed
> >
> > On Tue, Feb 14, 2017 at 12:06 AM, Will Stevens <wstev...@cloudops.com>
> > wrote:
> >
> > Not sure, I will see if I can find some time tomorrow to look at this.
> > Thanks...
> >
> > *Will STEVENS*
> > Lead Developer
> >
> > <https://goo.gl/NYZ8KK>
> >
> > On Mon, Feb 13, 2017 at 11:58 PM, Rajani Karuturi <raj...@apache.org>
> > wrote:
> >
> > Thanks Will. I will take a look at this today and merge.
> >
> &g

Re: [DISCUSS][FS] Host HA for CloudStack

2017-02-17 Thread David Mabry

On 2/16/17, 5:18 AM, "Rohit Yadav"  wrote:

All,


I would like to start discussion on a new feature - Host HA for CloudStack.

CloudStack lacks a way to reliably fence a host, the idea of the host-ha 
feature is to provide a general purpose HA framework and HA provider 
implementation specific for hypervisor that can use additional mechanism such 
as OOBM (ipmi based power management) to reliably investigate, recover and 
fence a host. This feature can handle scenarios associated with server crash 
issues and reliable fencing of hosts and HA of VM. The first version will have 
HA provider implementation for KVM (and for simulator to test the framework 
implementation, and write marvin tests that can validate the feature on Travis 
and others).


Please have a look at the FS here:

https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA


Looking forward to your comments and questions.


Regards.

rohit.ya...@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 

 Rohit,

First, thanks for all the work you have put into this.  This is something that 
CS has sorely needed for a long time.

A couple of items:

1.) You state the following:
“Before invoking the HA provider’s fence operation, the HA resource management 
will place the resource in maintenance mode. The intention is to require an 
administrator to manually verify that a resource is ready to return service by 
requiring an administrator to take it out of maintenance mode.”
I agree that putting a host in maintenance mode to require manual intervention 
in order to bring it back online is ideal and honestly how I would probably 
prefer to do it.  However, I also like to give the end user/operator choice.  
Perhaps we could add an option to bring the Host out of Maintenance mode 
automatically if it passes all checks and comes back into an ELIGIBLE state.  
This way, if the operator chooses, the host could come back into full operation 
and start recovering VMs if needed.  This could also be handy if your 
environment isn’t quite n+1 when it comes to host capacity and you need to have 
the host back up and running as soon as possible to minimize the outage 
duration.  Again, I know it isn’t ideal, but I don’t see the harm in giving the 
operator the choice.

2.) You state the following:
“For the initial release, only KVM with NFS storage will be supported. However, 
the storage check component will be implemented in a modular fashion allowing 
for checks using other storage platforms(e.g. Ceph) in the future. HA provider 
plugins can be implemented for other hypervisors.”
We are using KVM with a Ceph backend and would be very interested in helping 
make it a part of the initial push for this feature.  I have a Dev environment 
backed by Ceph that we could use for testing and would be willing to help with 
the development of the Ceph activity checks.

I’m looking forward to getting this feature added to CS.  Again, great job 
putting this together and starting the conversation.

Thanks,
Mabry



Re: Modify the system vm build scripts

2017-02-17 Thread David Mabry
Awesome.  Thanks for the quick answer.  That totally makes sense.  Package 
changes (installation, etc…) are done in the “appliance” section of code.  Any 
“config” changes required beyond package installation are done in the “patches” 
section of code.

Thanks,
David Mabry

On 2/17/17, 10:52 AM, "williamstev...@gmail.com on behalf of Will Stevens" 
<williamstev...@gmail.com on behalf of wstev...@cloudops.com> wrote:

So the System VM is "built" from two sources.

1)

https://github.com/apache/cloudstack/tree/master/tools/appliance/definitions/systemvmtemplate
This defines what is actually built and is distributed as the SystemVM
Template.  You MUST use it if you change the packages included in the
SystemVM template or change the core components in any way.  Changing
anything here REQUIRES a new SystemVM template to be distributed for that
change to be used.

2) https://github.com/apache/cloudstack/tree/master/systemvm/patches/debian
This defines the systemvm.iso which is loaded into the System VM template
after the system vm is deployed.  This basically defines configuration
which can be changed without requiring a new System VM template.  This
section does not handle installation of packages and such, instead it
handles System VM configuration and functionality.  So if settings files
need to be changed (for say something like VPN) or if the way we handle IP
address changes, etc...  That is all handled from here.  The systemvm.iso
is generated by the management server (i think) and is pushed to the system
vm after the system vm boots and the configuration which cloudstack manages
is handled through this code.
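
To make that concrete, a rough sketch of where the two live and how (1) is typically rebuilt (paths from the links above; the build.sh invocation is an assumption based on the 4.9-era veewee tooling, so check the README in tools/appliance before relying on it):

    # (1) template definition: base OS, partitioning, installed packages -- changing this means shipping a new template
    cd cloudstack/tools/appliance
    ./build.sh systemvmtemplate        # assumed entry point; builds the appliance image

    # (2) configuration payload: packaged into systemvm.iso and applied after the system VM boots
    ls cloudstack/systemvm/patches/debian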

Is that clear?  Let me know if you have more questions.  I have had to do a
bunch of stuff in this recently for some of the networking issues we have
had as well as the StrongSwan VPN implementation (which changed both
places).

Using StrongSwan as an example:
I had to modify (1) in order to remove the OpenSwan package installation
and add the StrongSwan package installation.

I had to modify (2) in order to change the configuration of the VPN in
order to handle things the way that StrongSwan needed things done.  So the
changes to things like the `ipsec` command are handled in (2) because those
are configuration changes and not package changes.

Is that clearer?

*Will STEVENS*
Lead Developer

<https://goo.gl/NYZ8KK>

On Fri, Feb 17, 2017 at 11:18 AM, David Mabry <dma...@ena.com> wrote:

> Hello everyone,
>
> I’m looking at making some changes to the system vm, but I have found that
> there looks like there are 2 different places in the code that “build” the
> systemvm.  There is there is https://github.com/apache/cloudstack/tree/
> 13bfdd71e6f52d2f613a802b3d16c9b40af7/systemvm/patches/debian, which
> looks like it might be the “old” way and there is
> https://github.com/apache/cloudstack/tree/87ef8137534fa798101f65c6691fcf
> 71513ac978/tools/appliance/definitions/systemvmtemplate, which looks like
> it might be the “new” way.  If I wanted to make changes to how the 
systemvm
> is built which place should I modify?  I assume that I should modify the
> build scripts in the “new” location, but I thought I would ask here first
> just to be sure.
>
> --Mabry
>




Modify the system vm build scripts

2017-02-17 Thread David Mabry
Hello everyone,

I’m looking at making some changes to the system vm, but I have found that 
it looks like there are 2 different places in the code that “build” the 
systemvm.  There is 
https://github.com/apache/cloudstack/tree/13bfdd71e6f52d2f613a802b3d16c9b40af7/systemvm/patches/debian,
 which looks like it might be the “old” way, and there is 
https://github.com/apache/cloudstack/tree/87ef8137534fa798101f65c6691fcf71513ac978/tools/appliance/definitions/systemvmtemplate,
 which looks like it might be the “new” way.  If I wanted to make changes to 
how the systemvm is built, which place should I modify?  I assume that I should 
modify the build scripts in the “new” location, but I thought I would ask here 
first just to be sure.

--Mabry


Re: Jenkins broken?

2016-04-25 Thread David Mabry
Sure, I'll give it a shot and let you know the results.

Thanks,
David Mabry






On 4/25/16, 8:54 AM, "Will Stevens" <williamstev...@gmail.com> wrote:

>Jenkins has been acting up a bit recently. Try doing a force push of your
>PR to kick off the run again to see if it still fails.
>On Apr 25, 2016 9:14 AM, "David Mabry" <dma...@ena.com> wrote:
>
>> Hello everyone,
>>
>> Can someone check on Jenkins?  It looks like it not able to check out 4.7
>> branch and it’s failing on my pull request.  See the logs below:
>>
>>
>> FATAL: Could not checkout 4.7 with start point origin/4.7
>> hudson.plugins.git.GitException<
>> http://stacktrace.jenkins-ci.org/search?query=hudson.plugins.git.GitException>:
>> Could not checkout 4.7 with start point origin/4.7
>> at
>> org.jenkinsci.plugins.gitclient.CliGitAPIImpl$9.execute(CliGitAPIImpl.java:1962)<
>> http://stacktrace.jenkins-ci.org/search/?query=org.jenkinsci.plugins.gitclient.CliGitAPIImpl$9.execute=method
>> >
>> at
>> org.jenkinsci.plugins.gitclient.AbstractGitAPIImpl.checkoutBranch(AbstractGitAPIImpl.java:82)<
>> http://stacktrace.jenkins-ci.org/search/?query=org.jenkinsci.plugins.gitclient.AbstractGitAPIImpl.checkoutBranch=method
>> >
>> at
>> org.jenkinsci.plugins.gitclient.CliGitAPIImpl.checkoutBranch(CliGitAPIImpl.java:62)<
>> http://stacktrace.jenkins-ci.org/search/?query=org.jenkinsci.plugins.gitclient.CliGitAPIImpl.checkoutBranch=method
>> >
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at
>> hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:608)
>> at
>> hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:583)
>> at
>> hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:542)
>> at hudson.remoting.UserRequest.perform(UserRequest.java:120)
>> at hudson.remoting.UserRequest.perform(UserRequest.java:48)
>> at hudson.remoting.Request$2.run(Request.java:326)
>> at
>> hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:745)
>> at ..remote call to ubuntu-us1(Native Method)
>> at
>> hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1416)
>> at hudson.remoting.UserResponse.retrieve(UserRequest.java:220)
>> at hudson.remoting.Channel.call(Channel.java:781)
>> at
>> hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:250)
>> at com.sun.proxy.$Proxy115.checkoutBranch(Unknown Source)
>> at
>> org.jenkinsci.plugins.gitclient.RemoteGitImpl.checkoutBranch(RemoteGitImpl.java:327)
>> at
>> com.cloudbees.jenkins.plugins.git.vmerge.BuildChooserImpl.getCandidateRevisions(BuildChooserImpl.java:78)
>> at
>> hudson.plugins.git.GitSCM.determineRevisionToBuild(GitSCM.java:951)
>> at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1054)
>> at hudson.scm.SCM.checkout(SCM.java:485)
>> at hudson.model.AbstractProject.checkout(AbstractProject.java:1276)
>> at
>> hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:607)
>> at
>> jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
>> at
>> hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
>> at hudson.model.Run.execute(Run.java:1738)
>> at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
>> at
>> hudson.model.ResourceController.execute(ResourceController.java:98)
>> at hudson.model.Executor.run(Executor.java:410)
>> Caused by: hudson.plugins.git.GitException: Command "git checkout -b 4.7
>> origin/4.7" returned status code 1:
>> stdout:
>> engine/storage/image/src/org/apache/cloudstack/storage/im

Jenkins broken?

2016-04-25 Thread David Mabry
Hello everyone,

Can someone check on Jenkins?  It looks like it’s not able to check out the 4.7 
branch and it’s failing on my pull request.  See the logs below:


FATAL: Could not checkout 4.7 with start point origin/4.7
hudson.plugins.git.GitException:
 Could not checkout 4.7 with start point origin/4.7
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl$9.execute(CliGitAPIImpl.java:1962)
at 
org.jenkinsci.plugins.gitclient.AbstractGitAPIImpl.checkoutBranch(AbstractGitAPIImpl.java:82)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.checkoutBranch(CliGitAPIImpl.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:608)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:583)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:542)
at hudson.remoting.UserRequest.perform(UserRequest.java:120)
at hudson.remoting.UserRequest.perform(UserRequest.java:48)
at hudson.remoting.Request$2.run(Request.java:326)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
at ..remote call to ubuntu-us1(Native Method)
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1416)
at hudson.remoting.UserResponse.retrieve(UserRequest.java:220)
at hudson.remoting.Channel.call(Channel.java:781)
at 
hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:250)
at com.sun.proxy.$Proxy115.checkoutBranch(Unknown Source)
at 
org.jenkinsci.plugins.gitclient.RemoteGitImpl.checkoutBranch(RemoteGitImpl.java:327)
at 
com.cloudbees.jenkins.plugins.git.vmerge.BuildChooserImpl.getCandidateRevisions(BuildChooserImpl.java:78)
at hudson.plugins.git.GitSCM.determineRevisionToBuild(GitSCM.java:951)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1054)
at hudson.scm.SCM.checkout(SCM.java:485)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1276)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:607)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
at hudson.model.Run.execute(Run.java:1738)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:410)
Caused by: hudson.plugins.git.GitException: Command "git checkout -b 4.7 
origin/4.7" returned status code 1:
stdout: 
engine/storage/image/src/org/apache/cloudstack/storage/image/TemplateServiceImpl.java:
 needs merge
services/secondary-storage/server/src/org/apache/cloudstack/storage/resource/NfsSecondaryStorageResource.java:
 needs merge

stderr: error: you need to resolve your current index first

at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1693)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.access$500(CliGitAPIImpl.java:62)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl$9.execute(CliGitAPIImpl.java:1956)
at 
org.jenkinsci.plugins.gitclient.AbstractGitAPIImpl.checkoutBranch(AbstractGitAPIImpl.java:82)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.checkoutBranch(CliGitAPIImpl.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:608)
 

CLOUDSTACK-9349: Enable root disk detach for KVM with new Marvin tests PR#1500

2016-04-18 Thread David Mabry
Hello all,

I submitted PR 1500 this morning that addresses JIRA issue CLOUDSTACK-9349 
around KVM root volume detach/attach.  This was really a very minor code change 
to Java, but I also submitted a marvin integration test that I would love to 
get some feedback on.  This was my first pass at writing a marvin test and 
I did my best to follow the same style I saw in test_volumes.py, but again I 
would love to have someone more familiar with marvin take a look and make sure 
everything looks right to them.
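
To give a sense of the flow the new test exercises, here is a rough sketch of a
test method in the style of test_volumes.py.  This is illustrative only and not
the exact code in the PR; it assumes the usual marvin fixtures (self.apiclient,
self.virtual_machine) and the base.py change that lets attach_volume take a
deviceid:

    # Illustrative sketch only -- not the exact test added in PR 1500.
    # Assumes the usual marvin fixtures (self.apiclient, self.virtual_machine)
    # and the base.py change that lets attach_volume pass a deviceid.
    from marvin.lib.common import list_volumes

    def test_detach_reattach_root_volume(self):
        # Find the ROOT volume of the test VM
        root_volume = list_volumes(
            self.apiclient,
            virtualmachineid=self.virtual_machine.id,
            type='ROOT',
            listall=True
        )[0]

        # Stop the VM before detaching its root disk
        self.virtual_machine.stop(self.apiclient)

        # Detach the ROOT volume (now permitted for KVM and the Simulator)
        self.virtual_machine.detach_volume(self.apiclient, root_volume)

        # Reattach it as the root disk by passing deviceid=0, which maps to
        # the deviceid parameter of the attachVolume API call
        self.virtual_machine.attach_volume(self.apiclient, root_volume,
                                           deviceid=0)

        self.virtual_machine.start(self.apiclient)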

Please see the original PR submission below for more details.

Thanks in advance for the feedback.

Thanks,
David Mabry





On 4/18/16, 8:20 AM, "dmabry" <g...@git.apache.org> wrote:

>GitHub user dmabry opened a pull request:
>
>https://github.com/apache/cloudstack/pull/1500
>
>CLOUDSTACK-9349
>
>This PR addresses detaching/attaching ROOT disks from VMs on KVM 
> (CLOUDSTACK-9349).  In short, this allows the KVM hypervisor (and the 
> Simulator, which I added as a valid hypervisor for ease of development and 
> testing with marvin) to detach a root volume and then reattach it using the 
> deviceid=0 flag on the attachVolume API.  I have also written a marvin 
> integration test that verifies this feature works for both KVM and the 
> Simulator.
>
>Below is the marvin results file from a full run of test_volumes.py.  All 
> tests pass, including the new root detach/attach, on our KVM lab running with 
> the patches in this PR.
>
>
> [test_volumes_KIR4G3.zip](https://github.com/apache/cloudstack/files/223799/test_volumes_KIR4G3.zip)
>
>
>You can merge this pull request into a Git repository by running:
>
>$ git pull https://github.com/myENA/cloudstack KVM_root_detach
>
>Alternatively you can review and apply these changes as the patch at:
>
>https://github.com/apache/cloudstack/pull/1500.patch
>
>To close this pull request, make a commit to your master/trunk branch
>with (at least) the following in the commit message:
>
>This closes #1500
>
>
>commit 48ce76344040de2ab8014f76292abe0421d42f85
>Author: Simon Weller <siwelle...@gmail.com>
>Date:   2016-03-24T19:55:34Z
>
>Merge pull request #4 from apache/4.7
>
>4.7 PR
>
>commit d0a02640dfd4878da81a2e59588c4b5ff2a06401
>Author: Simon Weller <swel...@ena.com>
>Date:   2016-04-14T13:28:37Z
>
>Let hypervisor type KVM detach root volumes
>
>commit 7807955433cea390bb7358e3bb90dbc9cc06bbea
>Author: David Mabry <dma...@ena.com>
>Date:   2016-04-15T12:30:07Z
>
>updated test_volumes.py to include a test for detaching and reattaching a 
> root volume from a vm.  I also had to update base.py to allow the deviceid 
> parameter to be passed to attach_volume as needed.
>
>commit d7d55630daff4a5e17c9a374dc2e9bc478dff808
>Author: David Mabry <dma...@ena.com>
>Date:   2016-04-18T02:41:29Z
>
>Added Simulator as valid hypervisor for root detach
>
>
>
>
>---
>If your project is set up for it, you can reply to this email and have your
>reply appear on GitHub as well. If your project does not have this feature
>enabled and wishes so, or if the feature is enabled but not working, please
>contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
>with INFRA.
>---