users mailing list not working?

2023-04-17 Thread Andrei Mikhailovsky
Hello the dev team, 

Just checking: is the users mailing list working? There has been no activity 
since the 14th of April, and the message I sent a few hours ago hasn't shown 
up yet. 

Cheers 

Andrei 


Re: ACS 4.16 documentation issue/error

2021-11-26 Thread Andrei Mikhailovsky
Hi Rohit,

This is the link where I found the guides:

http://qa.cloudstack.cloud/docs/WIP-PROOFING/pr/215/adminguide/templates/_cloud_init.html#linux-with-cloud-init

I got the above link from Google when I was searching for cloud-init and 
CloudStack templates, or something like that.

Sorry, I am not a contributor/developer. Perhaps one of the devs working on 
the templates/documentation could double-check the guides and make the 
necessary corrections. I've tested it only on Ubuntu Server 20.04 LTS, where I 
found the problem.

Andrei

- Original Message -
> From: "Rohit Yadav" 
> To: "dev" 
> Sent: Friday, 26 November, 2021 09:33:46
> Subject: Re: ACS 4.16 documentation issue/error

> Hi Andrei,
> 
> Thanks for sharing. Can you share the page/link where you found the error? The
> cloud-init docs I found were at:
> https://docs.cloudstack.apache.org/en/latest/adminguide/virtual_machines/user-data.html#using-cloud-init
> 
> You may also raise a pull request and propose changes at
> https://github.com/apache/cloudstack-documentation
> 
> 
> Regards.
> 
> 
> From: Andrei Mikhailovsky 
> Sent: Thursday, November 25, 2021 16:25
> To: dev 
> Subject: ACS 4.16 documentation issue/error
> 
> Hello everyone.
> 
> I've been following the Cloud-init integration guides in the ACS 4.16
> documentation and noticed a problem with Ubuntu Server 20.04 LTS. In
> particular, the section "Specify the managed user" shows:
> 
> system_info:
>   default_user:
>     name: cloud-user
>     lock_passwd: false            # disable user password login - true/false
>     sudo: [ \"ALL=(ALL) ALL\" ]   # User permissions
> disable_root: 0   # root remote login is 0 - enabled, 1 - disabled
> ssh_pwauth: 1     # password login is 0 - disabled, 1 - enabled
> 
> Adding this produces an error message when trying to use sudo. The error
> message is:
> 
> $ sudo su
> >>> /etc/sudoers.d/90-cloud-init-users: syntax error near line 4 <<<
> sudo: parse error in /etc/sudoers.d/90-cloud-init-users near line 4
> sudo: no valid sudoers sources found, quitting
> sudo: unable to initialize policy plugin
> 
> Removing the "\" escapes in
> sudo : [ \"ALL=(ALL) ALL\" ]
> seems to fix the problem for me.
> 
> Could you please test this part again to make sure users end up with working
> templates after following the instructions?
> 
> Thanks
> 
> Andrei


ACS 4.16 documentation issue/error

2021-11-25 Thread Andrei Mikhailovsky
Hello everyone. 

I've been following the Cloud-init integration guides in the ACS 4.16 
documentation and noticed a problem with Ubuntu Server 20.04 LTS. In 
particular, the section "Specify the managed user" shows: 

system_info:
  default_user:
    name: cloud-user
    lock_passwd: false            # disable user password login - true/false
    sudo: [ \"ALL=(ALL) ALL\" ]   # User permissions
disable_root: 0   # root remote login is 0 - enabled, 1 - disabled
ssh_pwauth: 1     # password login is 0 - disabled, 1 - enabled

Adding this produces an error message when trying to use sudo. The error 
message is: 

$ sudo su 
>>> /etc/sudoers.d/90-cloud-init-users: syntax error near line 4 <<< 
sudo: parse error in /etc/sudoers.d/90-cloud-init-users near line 4 
sudo: no valid sudoers sources found, quitting 
sudo: unable to initialize policy plugin 
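
By the way, the generated fragment can be syntax-checked directly, which makes 
testing documentation changes quicker; a quick sketch: 

$ visudo -cf /etc/sudoers.d/90-cloud-init-users 

It either reports "parsed OK" or points at the offending line.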

Removing the "\" escapes in 
sudo : [ \"ALL=(ALL) ALL\" ] 
seems to fix the problem for me. 
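
For reference, the stanza with the escapes removed (a sketch of what worked 
for me on Ubuntu Server 20.04) looks like this: 

system_info: 
  default_user: 
    name: cloud-user 
    lock_passwd: false 
    sudo: ["ALL=(ALL) ALL"]   # no backslash escapes 
disable_root: 0 
ssh_pwauth: 1 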

Could you please test this part again to make sure users end up with working 
templates after following the instructions? 

Thanks 

Andrei 


Re: Problems after upgrade from 4.13.1 to 4.15.0

2021-03-29 Thread Andrei Mikhailovsky
Hi Wei,

All packages I used were stock from the official Ubuntu repo or the CloudStack 
one. I have tried restarting the management server, its services and mysql.

I have been doing some more testing and troubleshooting over the weekend. I 
managed to resolve the problem by reinstalling the cloudstack-* packages on the 
management server and restarting the server. The SQL-related error is now gone.


I did have tons of issues with the system VMs and virtual routers. They were 
simply not starting properly. Updating the router template, or simply deleting 
the old router/systemvm and creating a new one, did not work either. I could 
see the vm being created on the host, but it wasn't responsive to virsh 
console commands, nor was it responding to pings on either of its IP 
addresses. CloudStack was showing the routers / system VMs in the Starting 
state for a very long time. There were no errors in the management log. A very 
strange thing indeed. I've tried clearing the entries in the sync_queue, 
async_job and vm_work_job tables and restarting the management server, but 
that didn't help.
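
For anyone attempting the same cleanup, the statements I ran were along these 
lines (a sketch only - stop the management server and back up the database 
first; job_status 0 marks in-progress jobs): 

-- run against the cloud database with the management server stopped
DELETE FROM vm_work_job;
DELETE FROM sync_queue;
DELETE FROM async_job WHERE job_status = 0;  -- in-progress jobs only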

After a bunch of experimentation, I found a workaround which seems to have 
worked: I had to update cloudstack-agent on all host servers in the cluster 
before the management server would properly create the systemvm and virtual 
router vms. It took me a bit of time to get to that. I think it would greatly 
help other people if something like this were mentioned in the upgrade guide.
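
On each Ubuntu host the upgrade itself was the usual package dance (a sketch, 
assuming the 4.15 apt repository is already configured on the host): 

# upgrade and restart the agent on every KVM host in the cluster
sudo apt-get update
sudo apt-get install --only-upgrade cloudstack-agent
sudo systemctl restart cloudstack-agent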

Anyway, I will keep you updated on any other issues I discover after upgrading 
to 4.15, if any.

Thanks for your help

Andrei

- Original Message -
> From: "Wei ZHOU" 
> To: "dev" 
> Sent: Monday, 29 March, 2021 10:16:04
> Subject: Re: Problems after upgrade from 4.13.1 to 4.15.0

> Hi Andrei,
> 
> Have you retried after restarting the management server and/or mysql?
> What's your mysql server version?
> 
> -Wei
> 
> 
> On Sun, 28 Mar 2021 at 21:58, Andrei Mikhailovsky 
> wrote:
> 
>> Hello everyone,
>>
>> I've updated my CloudStack management server and an agent from 4.13.1 to
>> 4.15.0. I am running Ubuntu 18.04 server. Following the instructions in the
>> documentation on the upgrade steps, the management server and the agent
>> started ok. I've logged in to the new GUI and at first things seemed ok.
>> However, I've noticed that I can't perform any vm / systemvm related
>> operations. Things like start/stop/migrate/shutdown vms produce a 503
>> error. Also, I wasn't able to add a host running the 4.15.0 agent. Inspecting
>> the management server logs I get the following exception, which happens
>> with pretty much any vm related action.
>>
>>
>> --
>>
>>
>> 2021-03-28 02:28:46,811 DEBUG [c.c.v.UserVmManagerImpl] (API-Job-Executor-6:ctx-cfe07062 job-81025 ctx-2387e198) (logid:c5396488) Found no ongoing snapshots on volumes associated with the vm with id 695
>> 2021-03-28 02:28:46,813 DEBUG [c.c.v.UserVmManagerImpl] (API-Job-Executor-6:ctx-cfe07062 job-81025 ctx-2387e198) (logid:c5396488) Collect vm disk statistics from host before stopping VM
>> 2021-03-28 02:28:46,879 DEBUG [c.c.a.t.Request] (AgentManager-Handler-14:null) (logid Seq 121-4330211041716731948: Processing: { Ans: , MgmtId: 115129173025114, via: 121, Ver: v1, Flags: 10, [{"com.cloud.agent.api.GetVmDiskStatsAnswer":{"hostName":"ais-cloudhost13","vmDiskStatsMap":{"i-2-695-VM":[]},"result":"true","details":"","wait":"0"}}] }
>> 2021-03-28 02:28:46,879 DEBUG [c.c.a.t.Request] (API-Job-Executor-6:ctx-cfe07062 job-81025 ctx-2387e198) (logid:c5396488) Seq 121-4330211041716731948: Received: { Ans: , MgmtId: 115129173025114, via: 121(ais-cloudhost13), Ver: v1, Flags: 10, { GetVmDiskStatsAnswer } }
>> 2021-03-28 02:28:46,879 DEBUG [c.c.a.m.AgentManagerImpl] (API-Job-Executor-6:ctx-cfe07062 job-81025 ctx-2387e198) (logid:c5396488) Details from executing class com.cloud.agent.api.GetVmDiskStatsCommand:
>> 2021-03-28 02:28:46,880 DEBUG [c.c.v.UserVmManagerImpl] (API-Job-Executor-6:ctx-cfe07062 job-81025 ctx-2387e198) (logid:c5396488) Collect vm network statistics from host before stopping Vm
>> 2021-03-28 02:28:46,897 DEBUG [c.c.a.t.Request] (AgentManager-Handler-1:null) (logid Seq 121-4330211041716731949: Processing: { Ans: , MgmtId: 115129173025114, via: 121, Ver: v1, Flags: 10, [{"com.cloud.agent.api.GetVmNetworkStatsAnswer":{"hostName":"ais-cloudhost13","vmNet

Problems after upgrade from 4.13.1 to 4.15.0

2021-03-28 Thread Andrei Mikhailovsky
Hello everyone, 

I've updated my CloudStack management server and an agent from 4.13.1 to 
4.15.0. I am running Ubuntu 18.04 server. Following the instructions in the 
documentation on the upgrade steps, the management server and the agent started 
ok. I've logged in to the new GUI and at first things seemed ok. However, I've 
noticed that I can't perform any vm / systemvm related operations. Things like 
start/stop/migrate/shutdown vms produce a 503 error. Also, I wasn't able to add 
a host running the 4.15.0 agent. Inspecting the management server logs I get 
the following exception, which happens with pretty much any vm related action. 


-- 


2021-03-28 02:28:46,811 DEBUG [c.c.v.UserVmManagerImpl] (API-Job-Executor-6:ctx-cfe07062 job-81025 ctx-2387e198) (logid:c5396488) Found no ongoing snapshots on volumes associated with the vm with id 695
2021-03-28 02:28:46,813 DEBUG [c.c.v.UserVmManagerImpl] (API-Job-Executor-6:ctx-cfe07062 job-81025 ctx-2387e198) (logid:c5396488) Collect vm disk statistics from host before stopping VM
2021-03-28 02:28:46,879 DEBUG [c.c.a.t.Request] (AgentManager-Handler-14:null) (logid Seq 121-4330211041716731948: Processing: { Ans: , MgmtId: 115129173025114, via: 121, Ver: v1, Flags: 10, [{"com.cloud.agent.api.GetVmDiskStatsAnswer":{"hostName":"ais-cloudhost13","vmDiskStatsMap":{"i-2-695-VM":[]},"result":"true","details":"","wait":"0"}}] }
2021-03-28 02:28:46,879 DEBUG [c.c.a.t.Request] (API-Job-Executor-6:ctx-cfe07062 job-81025 ctx-2387e198) (logid:c5396488) Seq 121-4330211041716731948: Received: { Ans: , MgmtId: 115129173025114, via: 121(ais-cloudhost13), Ver: v1, Flags: 10, { GetVmDiskStatsAnswer } }
2021-03-28 02:28:46,879 DEBUG [c.c.a.m.AgentManagerImpl] (API-Job-Executor-6:ctx-cfe07062 job-81025 ctx-2387e198) (logid:c5396488) Details from executing class com.cloud.agent.api.GetVmDiskStatsCommand:
2021-03-28 02:28:46,880 DEBUG [c.c.v.UserVmManagerImpl] (API-Job-Executor-6:ctx-cfe07062 job-81025 ctx-2387e198) (logid:c5396488) Collect vm network statistics from host before stopping Vm
2021-03-28 02:28:46,897 DEBUG [c.c.a.t.Request] (AgentManager-Handler-1:null) (logid Seq 121-4330211041716731949: Processing: { Ans: , MgmtId: 115129173025114, via: 121, Ver: v1, Flags: 10, [{"com.cloud.agent.api.GetVmNetworkStatsAnswer":{"hostName":"ais-cloudhost13","vmNetworkStatsMap":{"i-2-695-VM":[{"vmName":"i-2-695-VM","macAddress":"02:00:20:a5:00:01","bytesSent":"(335.09 MB) 351364549","bytesReceived":"(294.63 MB) 308940852"},{"vmName":"i-2-695-VM","macAddress":"06:c7:fe:00:01:0b","bytesSent":"(74.57 KB) 76358","bytesReceived":"(585.85 MB) 614310467"}]},"result":"true","details":"","wait":"0"}}] }
2021-03-28 02:28:46,897 DEBUG [c.c.a.t.Request] (API-Job-Executor-6:ctx-cfe07062 job-81025 ctx-2387e198) (logid:c5396488) Seq 121-4330211041716731949: Received: { Ans: , MgmtId: 115129173025114, via: 121(ais-cloudhost13), Ver: v1, Flags: 10, { GetVmNetworkStatsAnswer } }
2021-03-28 02:28:46,897 DEBUG [c.c.a.m.AgentManagerImpl] (API-Job-Executor-6:ctx-cfe07062 job-81025 ctx-2387e198) (logid:c5396488) Details from executing class com.cloud.agent.api.GetVmNetworkStatsCommand:
2021-03-28 02:28:46,909 WARN [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-6:ctx-cfe07062 job-81025 ctx-2387e198) (logid:c5396488) Unable to schedule async job for command com.cloud.vm.VmWorkMigrate, unexpected exception.
com.cloud.utils.exception.CloudRuntimeException: Unable to lock vm_instance695. Waited 0
at com.cloud.utils.db.Merovingian2.doAcquire(Merovingian2.java:197)
at com.cloud.utils.db.Merovingian2.acquire(Merovingian2.java:137)
at com.cloud.utils.db.TransactionLegacy.lock(TransactionLegacy.java:384)
at com.cloud.utils.db.GenericDaoBase.lockInLockTable(GenericDaoBase.java:1075)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:95)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at

Re: [RESULT][VOTE] Primate as modern UI for CloudStack

2019-10-23 Thread Andrei Mikhailovsky
Rohit, when do you plan to add this interface to the release of ACS?

thanks
Andrei

- Original Message -
> From: "Rohit Yadav" 
> To: "users" 
> Cc: "dev" , priv...@cloudstack.apache.org
> Sent: Tuesday, 22 October, 2019 09:24:07
> Subject: Re: [RESULT][VOTE] Primate as modern UI for CloudStack

> All,
> 
> The repository is live - https://github.com/apache/cloudstack-primate and can
> accept pull requests now.
> 
> Updates and pending items in this regard:
> 
>  *   Get the Github repo's issues, wiki, projects etc. enabled. I've pinged
>  ASF INFRA in that regard - https://issues.apache.org/jira/browse/INFRA-19274
>  *   I've added a contributing document:
>  https://github.com/apache/cloudstack-primate/blob/master/CONTRIBUTING.md
>  (kindly review)
>  *   Basic (work in progress) documentation section added:
>  https://github.com/apache/cloudstack-primate#documentation
>  *
> Once repository functions are fully enabled, I'll share a proper project
> progress/status update and details of the bi-weekly meeting on the Primate SIG
> thread.
> 
> Thanks.
> 
> 
> Regards,
> 
> Rohit Yadav
> 
> Software Architect, ShapeBlue
> 
> https://www.shapeblue.com
> 
> 
> From: Andrija Panic 
> Sent: Monday, October 21, 2019 14:37
> To: users 
> Cc: dev@cloudstack.apache.org ;
> priv...@cloudstack.apache.org 
> Subject: Re: [RESULT][VOTE] Primate as modern UI for CloudStack
> 
> (that seems like more +1s than for last few ACS releases altogether :) )
> 
> Great work Rohit - thx!
> 
> 
> 
> 
> On Mon, 21 Oct 2019 at 08:57, Rohit Yadav  wrote:
> 
>> All,
>>
>> After 2 weeks, the vote for accepting Primate as a CloudStack project [1]
>> *passes* with
>> 10 PMC + 11 non-PMC votes.
>>
>> +1 (PMC / binding)
>> 10 person (Mike, Simon, Andrija, Sven, Wido, Will, Syed, Gabriel, Giles,
>> Bruno)
>>
>> +1 (committer, non-binding/users)
>> 11 person (Nicolas, Nitin, Lucian, Ezequiel, Haijiao, Alex, Alessandro,
>> Marco, Anurag, Leonardo, KB Shiv)
>>
>> 0
>> none
>>
>> -1
>> none
>>
>>
>> I'll now request ASF INFRA [2] to help enable issues, pull requests, wiki,
>> projects for the new repository:
>>
>> https://github.com/apache/cloudstack-primate
>>
>>
>> The code will be donated and pushed in the next 24-48 hours from the old
>> repository [3] to the new repository under the Apache CloudStack project.
>>
>>
>> Thanks to everyone participating.
>>
>> [1] https://markmail.org/message/tblrbrtew6cvrusr
>>
>> [2] https://issues.apache.org/jira/browse/INFRA-19274
>>
>> [3] https://github.com/shapeblue/primate
>>
>> Regards,
>>
>> Rohit Yadav
>>
>> Software Architect, ShapeBlue
>>
>> https://www.shapeblue.com
>>
>> 
>>
>> rohit.ya...@shapeblue.com
>> www.shapeblue.com
>> Amadeus House, Floral Street, London  WC2E 9DP, UK
>> @shapeblue
>>
>>
>>
>> From: Rohit Yadav
>> Sent: Monday, October 7, 2019 17:01
>> To: dev@cloudstack.apache.org ;
>> us...@cloudstack.apache.org ;
>> priv...@cloudstack.apache.org 
>> Subject: [VOTE] Primate as modern UI for CloudStack
>>
>> All,
>>
>> The feedback and response have been positive on the proposal to use Primate
>> as the modern UI for CloudStack [1] [2]. Thank you all.
>>
>> I'm starting this vote (to):
>>
>>   *   Accept Primate codebase [3] as a project under Apache CloudStack
>> project
>>   *   Create and host a new repository (cloudstack-primate) and follow
>> Github based development workflow (issues, pull requests etc) as we do with
>> CloudStack
>>   *   Given this is a new project, to encourage cadence until its feature
>> completeness the merge criteria are proposed as:
>>  *   Manual testing against each PR and/or with screenshots from the
>> author or testing contributor, integration with Travis is possible once we
>> get JS/UI tests
>>  *   At least 1 LGTM from any of the active contributors, we'll move
>> this to 2 LGTMs when the codebase reaches feature parity wrt the
>> existing/old CloudStack UI
>>  *   Squash and merge PRs
>>   *   Accept the proposed timeline [1][2] (subject to achievement of goals
>> wrt Primate technical release and GA)
>>  *   the first technical preview targeted for the winter 2019 LTS
>> release (~Q1 2020), with that release serving a deprecation notice wrt the
>> older UI
>>  *   define a release approach before the winter LTS
>>  *   stop taking feature FRs for the old/existing UI after the winter 2019
>> LTS release; work on an upgrade path/documentation from the old UI to Primate
>>  *   the first Primate GA targeted for the summer 2020 LTS (~H2 2020),
>> but still ship the old UI with a final deprecation notice
>>  *   the old UI removed from the codebase in the winter 2020 LTS release
>>
>> The vote will be up for the next two weeks to give enough time for PMC and
>> the community to gather consensus and still have room for questions,
>> feedback 

Re: [ANNOUNCE] Apache CloudStack 4.13.0.0 GA

2019-09-24 Thread Andrei Mikhailovsky
Great work guys and girls!!!

- Original Message -
> From: "Paul Angus" 
> To: annou...@cloudstack.apache.org, "Apache CloudStack Marketing" 
> , "dev"
> , "users" , 
> users...@cloudstack.apache.org
> Sent: Tuesday, 24 September, 2019 11:06:28
> Subject: [ANNOUNCE] Apache CloudStack 4.13.0.0 GA

> *The Apache Software Foundation Announces Apache® CloudStack® v4.13*
> 
> 
> Apache CloudStack v4.13 features nearly 200 new features, enhancements and
> fixes since 4.12, such as enhanced hypervisor support, performance
> increases and more user-configurable controls. Highlights include:
> 
> 
> 
>   - Supporting configuration of virtualised appliances
>   - VMware 6.7 support
>   - Increased granularity & control of instance deployment
>   - Improvements in system VM performance
>   - Allow live migration of DPDK enabled instances
>   - More flexible UI branding
>   - Allowing users to create layer 2 network offerings
> 
> 
> The full list of new features can be found in the project release notes at
> http://docs.cloudstack.apache.org/en/4.13.0.0/releasenotes/changes.html
> 
> 
> 
> Apache CloudStack powers numerous elastic Cloud computing services,
> including solutions that have ranked as Gartner Magic Quadrant leaders.
> Highlighted in the Forrester Q4 2017 Enterprise Open Source Cloud Adoption
> report, Apache CloudStack "sits beneath hundreds of service provider
> clouds", including Fortune 5 multinational corporations. A list of known
> Apache CloudStack users is available at
> http://cloudstack.apache.org/users.html


Re: 4.13 rbd snapshot delete failed

2019-09-09 Thread Andrei Mikhailovsky
A quick bit of feedback from my side. I've never had snapshot deletion work 
properly with Ceph. Every week or so I have to manually delete all Ceph 
snapshots. However, the NFS secondary storage snapshots are deleted just fine. 
I've been using CloudStack for 5+ years and it has always been the case. I am 
currently running 4.11.2 with Ceph 13.2.6-1xenial.
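
The weekly manual cleanup is essentially the following (a sketch - the pool 
and image names are placeholders, not my actual ones): 

# list the snapshots on a volume's RBD image, then remove one
rbd snap ls cloudstack/<volume-uuid>
rbd snap rm cloudstack/<volume-uuid>@<snapshot-name>
# or purge every snapshot on the image in one go
rbd snap purge cloudstack/<volume-uuid>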

Andrei

- Original Message -
> From: "Andrija Panic" 
> To: "Gabriel Beims Bräscher" 
> Cc: "users" , "dev" 
> Sent: Sunday, 8 September, 2019 19:17:59
> Subject: Re: 4.13 rbd snapshot delete failed

> Thx Gabriel for the extensive feedback.
> Actually my ex-company added the code to really delete an RBD snap back in
> 2016 or so; it was part of 4.9 if I'm not mistaken. So I expect the code is
> there, but probably some exception is happening, or a regression...
> 
> Cheers
> 
> On Sun, Sep 8, 2019, 09:31 Gabriel Beims Bräscher 
> wrote:
> 
>> Thanks for the feedback, Andrija. It looks like delete was not totally
>> supported then (am I missing something?). I will take a look into this and
>> open a PR adding proper support for rbd snapshot deletion if necessary.
>>
>> Regarding the rollback, I have tested it several times and it worked;
>> however, I see a weak point on the Ceph rollback implementation.
>>
>> It looks like Li Jerry was able to execute the rollback without any
>> problem. Li, could you please post here the log output: "Attempting to
>> rollback RBD snapshot [name:%s], [pool:%s], [volumeid:%s],
>> [snapshotid:%s]"? Andrija will not be able to see that log as the exception
>> happens prior to it; the only way for you to check those values is via remote
>> debugging. If you are able to post those values, it would also help with
>> sorting out what is wrong.
>>
>> I am checking the code base, running a few tests, and evaluating the log
>> that you (Andrija) sent. What I can say for now is that it looks like the
>> parameter "snapshotRelPath = snapshot.getPath()" [1] is a critical piece of
>> code that can definitely break the rollback execution flow. My tests had
>> pointed to one pattern, but now I see other possibilities. I will probably
>> add a few parameters to the rollback/revert command instead of using the
>> path, or review the path life-cycle and the different execution flows in
>> order to keep it safer to use.
>> [1]
>> https://github.com/apache/cloudstack/blob/50fc045f366bd9769eba85c4bc3ecdc0b7035c11/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper
>>
>> A few details on the test environments and Ceph/RBD version:
>> CloudStack, KVM, and Ceph nodes are running with Ubuntu 18.04
>> Ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic
>> (stable)
>> RADOS Block Devices has snapshot rollback support since Ceph v10.0.2 [
>> https://github.com/ceph/ceph/pull/6878]
>> Rados-java [https://github.com/ceph/rados-java] supports snapshot
>> rollback since 0.5.0; rados-java 0.5.0 is the version used by CloudStack
>> 4.13.0.0
>>
>> I will be updating here soon.
>>
>> Em dom, 8 de set de 2019 às 12:28, Wido den Hollander 
>> escreveu:
>>
>>>
>>>
>>> On 9/8/19 5:26 AM, Andrija Panic wrote:
>>> > Many releases ago, deleting a Ceph volume snap was also only deleting it in
>>> > the DB, so RBD performance became terrible with many tens of (i.e. hourly)
>>> > snapshots. I'll try to verify this on 4.13 myself, but Wido and the guys
>>> > will know better...
>>>
>>> I pinged Gabriel and he's looking into it. He'll get back to it.
>>>
>>> Wido
>>>
>>> > On Sat, Sep 7, 2019, 08:34 li jerry  wrote:
>>> >
>>> >> I found it had nothing to do with  storage.cleanup.delay and
>>> >> storage.cleanup.interval.
>>> >>
>>> >>
>>> >>
>>> >> The reason is that when DeleteSnapshotCmd is executed, because the RBD
>>> >> snapshot does not have a copy on secondary storage, it only changes the
>>> >> database information and does not go to the primary storage to delete the
>>> >> snapshot.
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> Log===
>>> >>
>>> >>
>>> >>
>>> >> 2019-09-07 23:27:00,118 DEBUG [c.c.a.ApiServlet]
>>> >> (qtp504527234-17:ctx-2e407b61) (logid:445cbea8) ===START===
>>> 192.168.254.3
>>> >> -- GET
>>> >>
>>> >> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
>>> >>
>>> >> 2019-09-07 23:27:00,139 DEBUG [c.c.a.ApiServer]
>>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) CIDRs from
>>> >> which account 'Acct[2f96c108-9408-11e9-a820-0200582b001a-admin]' is
>>> allowed
>>> >> to perform API calls: 0.0.0.0/0,::/0
>>> >>
>>> >> 2019-09-07 23:27:00,204 DEBUG [c.c.a.ApiServer]
>>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) Retrieved
>>> >> cmdEventType from job info: SNAPSHOT.DELETE
>>> >>
>>> >> 2019-09-07 23:27:00,217 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:c34a368a) Add
>>> job-1378
>>> >> into job monitoring
>>> >>
>>> >> 2019-09-07 23:27:00,219 DEBUG 

Re: Concurrent Volume Snapshots

2019-06-14 Thread Andrei Mikhailovsky
Thanks for the update, Rohit and Suresh.

Is this feature happening in the 4.13 release?

Andrei

- Original Message -
> From: "Suresh Kumar Anaparti" 
> To: "dev" 
> Sent: Thursday, 13 June, 2019 19:10:11
> Subject: Re: Concurrent Volume Snapshots

> Currently, in CloudStack, only one job per VM can be active (in execution)
> at any given point in time. Here, concurrent volume snapshots of the same
> VM are all considered work jobs for that VM and are queued. Once an active
> volume snapshot job is done, the next one is picked up for execution.
> 
> The PR https://github.com/apache/cloudstack/pull/1897 supports multiple
> snapshots of the same VM for XenServer. KVM not tested. I'll rebase the
> code with the latest master.
> 
> Regards,
> Suresh
> 
> 
> On Thu, Jun 13, 2019 at 7:41 PM Rohit Yadav 
> wrote:
> 
>> I checked the outstanding PRs list; it looks like this feature is not
>> supported currently:
>>
>> https://github.com/apache/cloudstack/pull/1897
>>
>>
>> Regards,
>>
>> Rohit Yadav
>>
>> Software Architect, ShapeBlue
>>
>> https://www.shapeblue.com
>>
>> 
>> From: Rohit Yadav 
>> Sent: Thursday, June 13, 2019 7:37:09 PM
>> To: dev
>> Subject: Re: Concurrent Volume Snapshots
>>
>> Hi Andrei,
>>
>>
>> Try playing with concurrent.snapshots.threshold.perhost. (empty is treated
>> as 1).
>>
>>
>> Regards,
>>
>> Rohit Yadav
>>
>> Software Architect, ShapeBlue
>>
>> https://www.shapeblue.com
>>
>> 
>> From: Andrei Mikhailovsky 
>> Sent: Thursday, June 13, 2019 6:54:07 PM
>> To: dev
>> Subject: Re: Concurrent Volume Snapshots
>>
>> Hi Rohit,
>>
>> I have updated some of those options to increase the timeout to 2 days
>> rather than the default of a few hours.
>>
>> However, these options relate to the timeout of the process.
>>
>> I was wondering if there is an option to allow simultaneous snapshotting
>> of volumes on a single VM. I would like all volumes of the vm to be copied
>> over to the secondary storage at the same time, rather than one after
>> another.
>>
>> Cheers
>>
>>
>> rohit.ya...@shapeblue.com
>> www.shapeblue.com
>> Amadeus House, Floral Street, London  WC2E 9DP, UK
>> @shapeblue
>>
>>
>>
>> ----- Original Message -
>> > From: "Rohit Yadav" 
>> > To: "dev" 
>> > Sent: Thursday, 13 June, 2019 14:02:21
>> > Subject: Re: Concurrent Volume Snapshots
>>
>> > You can try to experiment with the following global settings:
>> >
>> >
>> > wait
>> >
>> > backup.snapshot.wait
>> > copy.volume.wait
>> > vm.job.lock.timeout
>> >
>> >
>> > Regards,
>> >
>> > Rohit Yadav
>> >
>> > Software Architect, ShapeBlue
>> >
>> > https://www.shapeblue.com
>> >
>> > 
>> > From: Andrei Mikhailovsky 
>> > Sent: Thursday, June 13, 2019 6:27:23 PM
>> > To: dev
>> > Subject: Concurrent Volume Snapshots
>> >
>> > Hello everyone
>> >
>> > I am having issues running snapshots on large volumes. The hypervisor is KVM
>> > and the storage backend is Ceph (rbd). ACS version is 4.11.2. Here is my issue:
>> >
>> > I've got several vms with 3-6 volumes of 2TB each. I have a recurring schedule
>> > set up to take a snapshot of each volume once a month. It takes a long time for
>> > a volume to be snapshotted (on the order of 20 hours). As a result, when the
>> > schedule kicks in, it only manages to snapshot the first volume, and the
>> > snapshots of the other volumes fail due to the async job timeout. From what I
>> > have discovered, ACS only does a single volume snapshot at a time. I can't seem
>> > to find the settings to enable concurrent snapshotting. So, it can't snapshot
>> > all of the vm volumes at the same time. This is very problematic for many
>> > reasons, but the main reason is that upon recovery of multiple volumes, the
>> > data on those will not be consistent.
>> >
>> > Is there a way around it? Perhaps there is an option in the settings that I
>> > can't find that disables this odd behaviour of the volume snapshots?
>> >
>> > Cheers
>> >
>> > Andrei
>> >
>> > rohit.ya...@shapeblue.com
>> > www.shapeblue.com<http://www.shapeblue.com>
>> > Amadeus House, Floral Street, London  WC2E 9DP, UK
>> > @shapeblue


Re: Concurrent Volume Snapshots

2019-06-13 Thread Andrei Mikhailovsky
Hi Rohit,

I have updated some of those options to increase the timeout to 2 days rather 
than the default of a few hours.
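
In case it helps anyone else, these can be changed through the GUI or via 
CloudMonkey; a sketch with the two-day value in seconds (the wait settings may 
need a management server restart to take effect): 

update configuration name=backup.snapshot.wait value=172800
update configuration name=copy.volume.wait value=172800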

However, these options relate to the timeout of the process.

I was wondering if there is an option to allow simultaneous snapshotting of 
volumes on a single VM. I would like all volumes of the vm to be copied over to 
the secondary storage at the same time, rather than one after another.

Cheers

- Original Message -
> From: "Rohit Yadav" 
> To: "dev" 
> Sent: Thursday, 13 June, 2019 14:02:21
> Subject: Re: Concurrent Volume Snapshots

> You can try to experiment with the following global settings:
> 
> 
> wait
> 
> backup.snapshot.wait
> copy.volume.wait
> vm.job.lock.timeout
> 
> 
> Regards,
> 
> Rohit Yadav
> 
> Software Architect, ShapeBlue
> 
> https://www.shapeblue.com
> 
> 
> From: Andrei Mikhailovsky 
> Sent: Thursday, June 13, 2019 6:27:23 PM
> To: dev
> Subject: Concurrent Volume Snapshots
> 
> Hello everyone
> 
> I am having issues running snapshots on large volumes. The hypervisor is KVM
> and the storage backend is Ceph (rbd). ACS version is 4.11.2. Here is my issue:
> 
> I've got several vms with 3-6 volumes of 2TB each. I have a recurring schedule
> set up to take a snapshot of each volume once a month. It takes a long time for
> a volume to be snapshotted (on the order of 20 hours). As a result, when the
> schedule kicks in, it only manages to snapshot the first volume, and the
> snapshots of the other volumes fail due to the async job timeout. From what I
> have discovered, ACS only does a single volume snapshot at a time. I can't seem
> to find the settings to enable concurrent snapshotting. So, it can't snapshot
> all of the vm volumes at the same time. This is very problematic for many
> reasons, but the main reason is that upon recovery of multiple volumes, the
> data on those will not be consistent.
> 
> Is there a way around it? Perhaps there is an option in the settings that I
> can't find that disables this odd behaviour of the volume snapshots?
> 
> Cheers
> 
> Andrei
> 
> rohit.ya...@shapeblue.com
> www.shapeblue.com
> Amadeus House, Floral Street, London  WC2E 9DP, UK
> @shapeblue


Re: Concurrent Volume Snapshots

2019-06-13 Thread Andrei Mikhailovsky
Thanks Rohit, I will try to investigate those options.

Andrei



- Original Message -
> From: "Rohit Yadav" 
> To: "dev" 
> Sent: Thursday, 13 June, 2019 14:02:21
> Subject: Re: Concurrent Volume Snapshots

> You can try to experiment with the following global settings:
> 
> 
> wait
> 
> backup.snapshot.wait
> copy.volume.wait
> vm.job.lock.timeout
> 
> 
> Regards,
> 
> Rohit Yadav
> 
> Software Architect, ShapeBlue
> 
> https://www.shapeblue.com
> 
> 
> From: Andrei Mikhailovsky 
> Sent: Thursday, June 13, 2019 6:27:23 PM
> To: dev
> Subject: Concurrent Volume Snapshots
> 
> Hello everyone
> 
> I am having issues running snapshots on large volumes. The hypervisor is KVM
> and the storage backend is Ceph (rbd). ACS version is 4.11.2. Here is my issue:
> 
> I've got several vms with 3-6 volumes of 2TB each. I have a recurring schedule
> set up to take a snapshot of each volume once a month. It takes a long time for
> a volume to be snapshotted (on the order of 20 hours). As a result, when the
> schedule kicks in, it only manages to snapshot the first volume, and the
> snapshots of the other volumes fail due to the async job timeout. From what I
> have discovered, ACS only does a single volume snapshot at a time. I can't seem
> to find the settings to enable concurrent snapshotting. So, it can't snapshot
> all of the vm volumes at the same time. This is very problematic for many
> reasons, but the main reason is that upon recovery of multiple volumes, the
> data on those will not be consistent.
> 
> Is there a way around it? Perhaps there is an option in the settings that I
> can't find that disables this odd behaviour of the volume snapshots?
> 
> Cheers
> 
> Andrei
> 
> rohit.ya...@shapeblue.com
> www.shapeblue.com
> Amadeus House, Floral Street, London  WC2E 9DP, UK
> @shapeblue


Concurrent Volume Snapshots

2019-06-13 Thread Andrei Mikhailovsky
Hello everyone 

I am having issues running snapshots on large volumes. The hypervisor is KVM and 
the storage backend is Ceph (rbd). ACS version is 4.11.2. Here is my issue: 

I've got several vms with 3-6 volumes of 2TB each. I have a recurring schedule 
set up to take a snapshot of each volume once a month. It takes a long time for 
a volume to be snapshotted (on the order of 20 hours). As a result, when the 
schedule kicks in, it only manages to snapshot the first volume, and the 
snapshots of the other volumes fail due to the async job timeout. From what I 
have discovered, ACS only does a single volume snapshot at a time. I can't seem 
to find the settings to enable concurrent snapshotting. So, it can't snapshot 
all of the vm volumes at the same time. This is very problematic for many 
reasons, but the main reason is that upon recovery of multiple volumes, the 
data on those will not be consistent. 

Is there a way around it? Perhaps there is an option in the settings that I 
can't find that disables this odd behaviour of the volume snapshots? 

Cheers 

Andrei 


Re: Help! Jobs stuck in pending state

2019-01-23 Thread Andrei Mikhailovsky
Hi

I've had this issue a few times in 2018 and managed to get it fixed pretty 
easily, although I had initially spent a number of hours trying to figure out 
WTF was going on. This issue looks like one of those artefacts that crept in 
with one of the versions released in 2018 and hasn't been addressed by the dev 
team.

The way I fixed it was similar to what has been recommended earlier. However, 
the difference was that I'm sure I looked at more tables than just the two 
suggested. Basically, I stopped the management server, created an sql backup, 
connected to the sql db and listed all tables, then grepped for words like 
job/schedule/queue/sync. After that I went through all those tables and pretty 
much removed all the past / active / awaiting-execution jobs. I started by 
looking at the jobs related to the vm that I'd tried to start but wasn't able 
to. This worked once, but the second time I had to remove a lot more jobs, 
including ones relating to other vms. After that I started the management 
server and all went well from there.
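
The table discovery step is a one-liner (a sketch, assuming the usual cloud 
database and db user): 

mysql -u cloud -p cloud -e 'SHOW TABLES;' | grep -Ei 'job|schedule|queue|sync'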

What I have also noticed is that my snapshot jobs (I use KVM and Ceph) seem to 
block other jobs on the hypervisor hosts which are running those snapshots. So, 
if I try to perform various vm related jobs on a host server which is currently 
running a snapshot process, that job will not be executed until the snapshot 
process is done. I've tested this countless times and it's still the case. 
Again, this issue appeared in one of the 2018 releases, as I never saw it 
between 2012 and 2017.

Both issues are annoying as hell!

Cheers

- Original Message -
> From: "Alireza Eskandari" 
> To: "dev" 
> Sent: Wednesday, 23 January, 2019 12:40:48
> Subject: Re: Help! Jobs stuck in pending state

> I'm following this issue in github:
> https://github.com/apache/cloudstack/issues/3104
> Please leave your comments
> Thanks
> 
> On Wed, Jan 23, 2019 at 12:39 PM Wei ZHOU  wrote:
> 
>> Hi Alireza,
>>
>> could you try again after restarting mgt server ?
>>
>> -Wei
>>
>> Alireza Eskandari  于2019年1月23日周三 上午6:22写道:
>>
>> > First I deleted two jobs which existed in the vm_work_job table and their
>> > related entries in the sync_queue table, but it didn't help.
>> > Then I deleted all the entries in the sync_queue table and again no success.
>> > Any idea?
>> >
>> > On Wed, Jan 23, 2019 at 1:50 AM Wei ZHOU  wrote:
>> >
>> > > If you know the instance id and mysql password, it should work after
>> > > removing some records in mysql.
>> > >
>> > > ```
>> > > set @id=X;
>> > >
>> > > delete from vm_work_job where vm_instance_id=@id;
>> > > delete from sync_queue where sync_objid=@id;
>> > > ```
>> > >
>> > > Alireza Eskandari  于2019年1月22日周二 下午10:59写道:
>> > >
>> > > > Hi guys
>> > > > I have opened a bug in jira about my problem in CS:
>> > > > https://issues.apache.org/jira/browse/CLOUDSTACK-10401
>> > > > CloudStack doesn't process jobs! My cloud in totally unusable.
>> > > > Thanks in advance for you help.
>> > > >
>> > >
>> >


broken workflow in autogenerating / modifying the agent.properties file

2018-10-22 Thread Andrei Mikhailovsky
Hello, 

Recently I've had an issue with one of the host servers. I've emailed the users 
list about this with the subject "ACS 4.11.1.0 - agent.properties file became 
empty on a KVM host". 

In summary, the logic behind the autogeneration / modification of the 
agent.properties file is somewhat flawed. In my case, the host ended up with a 
0-byte agent.properties file after the management server was restarted. This 
was likely due to the root partition being full at the time the automatic 
modification took place. As a result, the host server became disconnected and 
was unable to reconnect. All vms were stuck on that host server without the 
ability to migrate them. 

Perhaps there should be more sanity checks to make sure the destination file 
can actually be written before the old one is replaced. 
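
A common pattern that would avoid this failure mode is writing the new content 
to a temporary file and renaming it over the target only once the write has 
succeeded. A minimal Java sketch (my illustration of the idea, not 
CloudStack's actual code): 

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class SafePropertiesWriter {
    // Write to a temp file in the same directory, then atomically rename it
    // over the target. If the disk is full, the write fails and the existing
    // agent.properties is left untouched instead of being truncated to 0 bytes.
    public static void writeAtomically(Path target, String content) throws IOException {
        Path tmp = Files.createTempFile(target.getParent(), "agent", ".tmp");
        try {
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, target,
                    StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // no-op when the rename succeeded
        }
    }
}

With that approach, a full root partition would fail loudly instead of 
silently emptying the file. 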

Andrei 


Re: Broken volume migration logic?

2018-10-11 Thread Andrei Mikhailovsky
Thanks for your input and the explanations, gents.

This is not really a big issue for me, as we have a small-scale environment 
that doesn't require volume disk migration. And frankly speaking, disk 
migration using the manual method works far quicker than the gui way, where the 
disk is probably first exported to the nfs secondary storage and reimported 
back.

But it is nice to see that work is being done to improve the migration logic in 
the upcoming releases.

Cheers

- Original Message -
> From: "Andrija Panic" 
> To: "dev" 
> Sent: Thursday, 11 October, 2018 13:54:53
> Subject: Re: Broken volume migration logic?

> Hi Rafael, Andrei,
> 
> that sounds wonderful!
> 
> @Andrei, we had exactly the same situation, but we made internal code
> changes in ACS 4.5 / 4.8 (never committed back to the community,
> unfortunately...), so after a migration is done and we want to change the
> offering, the list of offerings is NOT matched against the TAG of the volume
> only (so no error like you still get) - the list of offerings is built from
> the CURRENT POOL of the volume: we match the tags of any existing offerings
> against the tags of the CURRENT POOL where the volume exists, so only
> matching offerings (targeting the new pool...) are shown.
> 
> (We had CEPH/NFS as the source with a "deprecated" tag and all ceph/nfs
> offerings deleted/inactive, and the destination pool was SolidFire with a new
> storage tag and a set of Compute/Disk offerings with the tag "solidfire".)
> 
> In our case this means the volume was on CEPH and had a CEPH offering; after
> we migrate the volume to SolidFire, only offerings with a tag that matches
> the tags of the current pool (SolidFire) are shown... hope I was clear
> with this long explanation :)
> 
> For volumes specifically, storage tags are (to my knowledge) only evaluated
> when you deploy a VM (root volume) or create a data volume - you can see this
> in the logs when ACS searches for a pool having this and that tag...
> 
> Once a resource (volume) is DEPLOYED (exists), it works as it is (as Rafael
> explained), and offerings are ignored for that matter - BUT interestingly
> enough - some properties (i.e. min/max iops aka storage QoS, or KVM io
> throttling aka hypervisor QoS) are inherited and copied over from the offering
> to the actual volume's table/row in the DB (for that specific volume...) when
> the volume is being created, etc. - while some properties like "cache_mode"
> (write-back or not) are still read/applied on the fly from the actual
> offering... so it's mix and match :)
> 
> I might be able to provide the code that did this new way of matching tags, in
> case it would be interesting (but no human power to commit anything/PR; I
> can just share it with Rafael or someone who is willing to push it upstream).
> Rafael?
> 
> 
> Cheers
> 
> 
> 
> 
> 
> On Thu, 11 Oct 2018 at 14:16, Rafael Weingärtner <
> rafaelweingart...@gmail.com> wrote:
> 
>> What you described seems to be the new feature introduced with
>> https://issues.apache.org/jira/browse/CLOUDSTACK-10323 and
>> https://issues.apache.org/jira/browse/CLOUDSTACK-10240. However, this
>> feature should have been introduced only in master (4.12), and I was not able
>> to find those commits in 4.11.1.0. Maybe ACS was already allowing
>> movement between shared storages with different tags? Anyway, the
>> block of code used to do this process has been totally re-written (now
>> everything is unit-tested). It is only in 4.12 though… It will also allow
>> placement to be overridden (ignoring storage tags and storage types), and it
>> will allow replacing the disk offering while migrating the disk to a
>> new/different storage system.
>>
>> To answer your questions:
>>
>> > My question is how did the vm start? Did cloudstack ignore the storage
>> > tags or is there another reason?
>> >
>> Once the volume is already placed somewhere, CloudStack skips any extra
>> checking (of whether it can use the volume as is). Therefore, it simply
>> moves on with the normal VM start.
>>
>>
>> On Thu, Oct 11, 2018 at 8:46 AM Andrei Mikhailovsky
>>  wrote:
>>
>> > Hello,
>> >
>> > I have recently tried to migrate a volume from one rbd storage pool to
>> > another. Have noticed a possible issue with the migration logic, which I
>> > was hoping to discuss with you.
>> >
>> > My setup: ACS 4.11.1.0
>> > Ceph + rbd for two primary storage pools (hdd and ssd pools)
>> > Storage tags are used together with the Disk Offerings (rbd tag is used
>> > for hdd backend volumes and rbd-ssd tag is used for the ssd backend
>> > volumes)
&

Broken volume migration logic?

2018-10-11 Thread Andrei Mikhailovsky
Hello, 

I have recently tried to migrate a volume from one rbd storage pool to another, 
and noticed a possible issue with the migration logic, which I was hoping to 
discuss with you. 

My setup: ACS 4.11.1.0 
Ceph + rbd for two primary storage pools (hdd and ssd pools) 
Storage tags are used together with the Disk Offerings (rbd tag is used for hdd 
backend volumes and rbd-ssd tag is used for the ssd backend volumes) 

What I tried to do: Move a single volume from hdd pool over to the ssd pool. 
Migration went well according to the cloudstack job result. I ended up with a 
volume on the ssd storage pool. 

After the migration was done, I had a look at the disk service offering of the 
migrated volume, and the service offering was still the hdd one despite the 
volume now being stored on the ssd pool. I tried to change the disk offering to 
the ssd one and got an error saying that the storage tags must be the same. 
Obviously, in my case, the storage tags of the hdd and ssd pool offerings are 
different. I have checked the database and indeed, the db still has the hdd 
disk offering id. 

I then tried to start the vm and, to my surprise, the vm started. From my 
previous experience and my understanding of how tags work with storage, the vm 
should not have started. The disk offering tag of the migrated volume points to 
the hdd storage where this volume no longer exists, so starting the vm should 
have errored out with something like Insufficient resources. 

So, I have a bit of an inconsistency going on with that volume. According to 
the cloudstack gui, the volume is stored on the ssd pool, but it has a disk 
offering from the hdd pool and there is no way to change that from the gui 
itself. 
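
If anyone needs to reconcile this by hand, the stale value lives in the 
volumes table; a sketch with placeholder ids (back up the cloud database 
first - this is just where I found the value, not an officially supported 
fix): 

-- point the migrated volume at the matching ssd disk offering
UPDATE cloud.volumes SET disk_offering_id = <ssd_offering_id> WHERE id = <volume_id>;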


My question is: how did the vm start? Did CloudStack ignore the storage tags, 
or is there another reason? 

Thanks 


Re: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT

2018-07-11 Thread Andrei Mikhailovsky
Hi Andrija,


From what I recall this was not an issue for us on 4.9.x. The problem started 
after we upgraded. We do have a few networks that do require static NAT, so it 
is not really an option for us.

It's a shame that such an artefact wasn't identified during the automated / 
manual testing prior to the release, and that the fix hasn't been included in 
the latest point release despite it having fixes for over 100 issues, some of 
which are far less serious. Not too sure what to think of it, to be honest. 
Seems like one step forward, two steps backwards with the new releases (

Andrei


- Original Message -
> From: "Andrija Panic" 
> To: "dev" 
> Sent: Monday, 9 July, 2018 22:39:06
> Subject: Re: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT

> Andrei, if I'm not mistaken I believe I saw the same behaviour even on 4.8 -
> in our case, what I vaguely remember is that we configured Port Forwarding
> instead of Static NAT - it did solve our use case (for some customer), but
> maybe it's not acceptable for you...
> 
> Cheers
> 
> On Mon, 9 Jul 2018 at 18:27, Andrei Mikhailovsky 
> wrote:
> 
>> Hi Rohit,
>>
>> I would like to send you a quick update on this issue. I have recently
>> upgraded to 4.11.1.0 with the new system vm templates. The issue that I've
>> described is still present in the latest release. Hasn't it been included
>> in the latest 4.11 maintenance release? I thought it would be, as it breaks
>> a major function of the VPC.
>>
>> Cheers.
>>
>> Andrei
>>
>> - Original Message -
>> > From: "Andrei Mikhailovsky" 
>> > To: "dev" 
>> > Sent: Friday, 20 April, 2018 11:52:30
>> > Subject: Re: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT
>>
>> > Thanks
>> >
>> >
>> >
>> > - Original Message -
>> >> From: "Rohit Yadav" 
>> >> To: "dev" , "dev" > >
>> >> Sent: Friday, 20 April, 2018 10:35:55
>> >> Subject: Re: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT
>> >
>> >> Hi Andrei,
>> >>
>> >> I've fixed this recently, please see
>> >> https://github.com/apache/cloudstack/pull/2579
>> >>
>> >> As a workaround you can add routing rules manually. On the PR, there is
>> a link
>> >> to a comment that explains the issue and suggests manual workaround.
>> Let me
>> >> know if that works for you.
>> >>
>> >> Regards.
>> >>
>> >>
>> >> From: Andrei Mikhailovsky
>> >> Sent: Friday, 20 April, 2:21 PM
>> >> Subject: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT
>> >> To: dev
>> >>
>> >>
>> >> Hello, I have been posting to the users thread about this issue. Here is a
>> >> quick summary in case the people contributing to the source NAT code on
>> >> the VPC side would like to fix this issue.
>> >>
>> >> Problem summary: no connectivity between virtual machines behind two
>> >> Static NAT networks.
>> >>
>> >> Problem case: when one virtual machine sends a packet to the external
>> >> address of another virtual machine, where both are handled by the same
>> >> router and both are behind Static NAT, the traffic does not work.
>> >>
>> >> 10.1.10.100    10.1.10.1:eth2    eth3:10.1.20.1    10.1.20.100
>> >> virt1          router                              virt2
>> >>                178.248.108.77:eth1:178.248.108.113
>> >>
>> >> A single packet is sent from virt1 to virt2.
>> >>
>> >> stage1: it arrives at the router on eth2 and enters "nat_PREROUTING"
>> >> (IN=eth2 OUT= SRC=10.1.10.100 DST=178.248.108.113), goes through the
>> >> "10 1K DNAT all -- * * 0.0.0.0/0 178.248.108.113 to:10.1.20.100" rule and
>> >> has the DST DNATed to the internal IP of virt2.
>> >>
>> >> stage2: it enters the FORWARDING chain and is DROPPED by the default
>> >> policy: DROPPED: IN=eth2 OUT=eth1 SRC=10.1.10.100 DST=10.1.20.100. The
>> >> reason is that the OUT interface is not correctly changed from eth1 to
>> >> eth3 during nat_PREROUTING, so the packet is not intercepted by the
>> >> FORWARD rule "24 14K ACL_INBOUND_eth3 all -- * eth3 0.0.0.0/0
>> >> 10.1.20.0/24" and thus not accepted.
>> >>
>> >> stage3: with a manually inserted rule to accept this packet for
>> >> FORWARDING, the packet enters the "nat_POSTROUTING" chain IN= OUT=eth1 S

Re: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT

2018-07-09 Thread Andrei Mikhailovsky
Hi Rohit,

I would like to send you a quick update on this issue. I have recently upgraded 
to 4.11.1.0 with the new system vm templates. The issue that I've described is 
still present in the latest release. Hasn't it been included in the latest 4.11 
maintenance release? I thought it would be, as it breaks a major function of 
the VPC.

Cheers.

Andrei

- Original Message -
> From: "Andrei Mikhailovsky" 
> To: "dev" 
> Sent: Friday, 20 April, 2018 11:52:30
> Subject: Re: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT

> Thanks
> 
> 
> 
> - Original Message -
>> From: "Rohit Yadav" 
>> To: "dev" , "dev" 
>> Sent: Friday, 20 April, 2018 10:35:55
>> Subject: Re: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT
> 
>> Hi Andrei,
>> 
>> I've fixed this recently, please see
>> https://github.com/apache/cloudstack/pull/2579
>> 
>> As a workaround you can add routing rules manually. On the PR, there is a 
>> link
>> to a comment that explains the issue and suggests manual workaround. Let me
>> know if that works for you.
>> 
>> Regards.
>> 
>> 
>> From: Andrei Mikhailovsky
>> Sent: Friday, 20 April, 2:21 PM
>> Subject: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT
>> To: dev
>> 
>> 
>> Hello, I have been posting to the users thread about this issue. Here is a
>> quick summary in case the people contributing to the source NAT code on the
>> VPC side would like to fix this issue.
>>
>> Problem summary: no connectivity between virtual machines behind two Static
>> NAT networks.
>>
>> Problem case: when one virtual machine sends a packet to the external
>> address of another virtual machine, where both are handled by the same
>> router and both are behind Static NAT, the traffic does not work.
>>
>> 10.1.10.100    10.1.10.1:eth2    eth3:10.1.20.1    10.1.20.100
>> virt1          router                              virt2
>>                178.248.108.77:eth1:178.248.108.113
>>
>> A single packet is sent from virt1 to virt2.
>>
>> stage1: it arrives at the router on eth2 and enters "nat_PREROUTING"
>> (IN=eth2 OUT= SRC=10.1.10.100 DST=178.248.108.113), goes through the
>> "10 1K DNAT all -- * * 0.0.0.0/0 178.248.108.113 to:10.1.20.100" rule and
>> has the DST DNATed to the internal IP of virt2.
>>
>> stage2: it enters the FORWARDING chain and is DROPPED by the default policy:
>> DROPPED: IN=eth2 OUT=eth1 SRC=10.1.10.100 DST=10.1.20.100. The reason is
>> that the OUT interface is not correctly changed from eth1 to eth3 during
>> nat_PREROUTING, so the packet is not intercepted by the FORWARD rule
>> "24 14K ACL_INBOUND_eth3 all -- * eth3 0.0.0.0/0 10.1.20.0/24" and thus not
>> accepted.
>>
>> stage3: with a manually inserted rule to accept this packet for FORWARDING,
>> the packet enters the "nat_POSTROUTING" chain (IN= OUT=eth1 SRC=10.1.10.100
>> DST=10.1.20.100), has the SRC changed to the external IP by "16 1320 SNAT
>> all -- * eth1 10.1.10.100 0.0.0.0/0 to:178.248.108.77", and is sent to the
>> external network on eth1:
>> 13:37:44.834341 IP 178.248.108.77 > 10.1.20.100: ICMP echo request, id 2644, seq 2, length 64
>>
>> For some reason, during the nat_PREROUTING stage the DST IP is changed, but
>> the OUT interface still reflects the interface associated with the old DST
>> IP.
>>
>> Here is the routing table:
>> # ip route list
>> default via 178.248.108.1 dev eth1
>> 10.1.10.0/24 dev eth2 proto kernel scope link src 10.1.10.1
>> 10.1.20.0/24 dev eth3 proto kernel scope link src 10.1.20.1
>> 169.254.0.0/16 dev eth0 proto kernel scope link src 169.254.0.5
>> 178.248.108.0/25 dev eth1 proto kernel scope link src 178.248.108.101
>>
>> # ip rule list
>> 0: from all lookup local
>> 32761: from all fwmark 0x3 lookup Table_eth3
>> 32762: from all fwmark 0x2 lookup Table_eth2
>> 32763: from all fwmark 0x1 lookup Table_eth1
>> 32764: from 10.1.0.0/16 lookup static_route_back
>> 32765: from 10.1.0.0/16 lookup static_route
>> 32766: from all lookup main
>> 32767: from all lookup default
>>
>> Further into the investigation, the problem was pinned down to those rules.
>> All the traffic from the internal IP on the static NATed connection was
>> forced to go to the outside interface (eth1), by setting the mark 0x1 and
>> then using the matching ip rule to direct it.
>>
>> # iptables -t mangle -L PREROUTING -vn
>> Chain PREROUTING (policy ACCEPT 97 packets, 11395 bytes)
>> pkts bytes target prot opt in out source destination
>> 49 3644 CONNMARK all -- * * 10.1.10.100 0.0.

Re: [RESULT][VOTE] Apache CloudStack 4.11.1.0

2018-06-27 Thread Andrei Mikhailovsky
Congratulations everyone on making this happen! Well done guys!

Andrei

- Original Message -
> From: "Paul Angus" 
> To: "dev" , "users" 
> Sent: Tuesday, 26 June, 2018 17:09:52
> Subject: [RESULT][VOTE] Apache CloudStack 4.11.1.0

> Hi All,
> 
> After 72 hours, the vote for CloudStack 4.11.1.0 *passes* with
> 3 PMC + 2 non-PMC votes.
> 
> +1 (PMC / binding)
> 
> Rohit Yadav
> 
> Paul Angus
> 
> Mike Tutkowski
> 
> +1 (non binding)
> 
> Nicolas Vazquez
> 
> Boris Stoyanov
> 
> 0
> Rene Moser
> 
> -1
> none
> 
> Thanks to everyone participating.
> 
> I will now prepare the release announcement to go out after 24 hours to give 
> the
> mirrors time to catch up.
> 
> 
> Kind regards,
> 
> Paul Angus
> 
> 
> 
> paul.an...@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HS, UK
> @shapeblue


4.11.0 - can't create guest vms with RBD storage!

2018-04-30 Thread Andrei Mikhailovsky
Hello gents, 

I have just realised that after upgrading to 4.11.0 we are no longer able to 
create new VMs. This has only just been noticed, as we had previously used 
ready-made templates, which work just fine. 

Setup: ACS 4.11.0 (upgraded from 4.9.3), KVM + Ceph, Ubuntu 16.04 on all 
servers 

When trying to create a new vm from an ISO image I get the following error: 


com.cloud.exception.StorageUnavailableException: Resource [StoragePool:2] is unreachable: Unable to create Vol[3937|vm=2217|ROOT]:com.cloud.utils.exception.CloudRuntimeException: org.libvirt.LibvirtException: this function is not supported by the connection driver: only RAW volumes are supported by this storage pool
at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.recreateVolume(VolumeOrchestrator.java:1336)
at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.prepare(VolumeOrchestrator.java:1413)
at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1110)
at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4927)
at sun.reflect.GeneratedMethodAccessor498.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
at com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:5090)
at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102)
at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:581)
at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:529)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


My guess is that ACS tries to create a QCOW2 image whereas it should be 
RAW on Ceph/RBD. 
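
To confirm this on a KVM host, one could inspect how libvirt has defined the 
pool (a hedged sketch; the pool UUID is illustrative): 

# list the storage pools libvirt knows about 
virsh pool-list --all 
# dump a pool's definition; an RBD pool shows <pool type='rbd'>, 
# and libvirt only supports RAW volumes on such pools 
virsh pool-dumpxml <pool-uuid> | head -n 3 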

I am really struggling to understand how this bug in a function of MAJOR 
importance could have been missed during the tests run by the developers and 
community before making a final release. Anyway, I hope the fix will make it 
into the 4.11.1 release, otherwise it's really messed up! 

Cheers 

Andrei 


Re: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT

2018-04-20 Thread Andrei Mikhailovsky
Thanks



- Original Message -
> From: "Rohit Yadav" <rohit.ya...@shapeblue.com>
> To: "dev" <dev@cloudstack.apache.org>, "dev" <dev@cloudstack.apache.org>
> Sent: Friday, 20 April, 2018 10:35:55
> Subject: Re: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT

> Hi Andrei,
> 
> I've fixed this recently, please see
> https://github.com/apache/cloudstack/pull/2579
> 
> As a workaround you can add routing rules manually. On the PR, there is a link
> to a comment that explains the issue and suggests manual workaround. Let me
> know if that works for you.
> 
> Regards.
> 
> 
> From: Andrei Mikhailovsky
> Sent: Friday, 20 April, 2:21 PM
> Subject: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT
> To: dev
> 
> 
> Hello, I have been posting to the users thread about this issue. here is a 
> quick
> summary in case if people contributing to the source nat code on the VPC side
> would like to fix this issue. Problem summary: no connectivity between virtual
> machines behind two Static NAT networks. Problem case: When one virtual 
> machine
> sends a packet to the external address of the another virtual machine that are
> handled by the same router and both are behind the Static NAT the traffic does
> not work. 10.1.10.100 10.1.10.1:eth2 eth3:10.1.20.1 10.1.20.100 virt1 router
> virt2 178.248.108.77:eth1:178.248.108.113 a single packet is send from virt1 
> to
> virt2. stage1: it arrives to the router on eth2 and enters "nat_PREROUTING"
> IN=eth2 OUT= SRC=10.1.10.100 DST=178.248.108.113) goes through the "10 1K DNAT
> all -- * * 0.0.0.0/0 178.248.108.113 to:10.1.20.100 " rule and has the DST
> DNATED to the internal IP of the virt2 stage2: Enters the FORWARDING chain and
> is being DROPPED by the default policy. DROPPED:IN=eth2 OUT=eth1
> SRC=10.1.10.100 DST=10.1.20.100 The reason being is that the OUT interface is
> not correctly changed from eth1 to eth3 during the nat_PREROUTING so the 
> packet
> is not intercepted by the FORWARD rule and thus not accepted. "24 14K
> ACL_INBOUND_eth3 all -- * eth3 0.0.0.0/0 10.1.20.0/24" stage3: manually
> inserted rule to accept this packet for FORWARDING. the packet enters the
> "nat_POSTROUTING" chain IN= OUT=eth1 SRC=10.1.10.100 DST=10.1.20.100 and has
> the SRC changed to the external IP 16 1320 SNAT all -- * eth1 10.1.10.100
> 0.0.0.0/0 to:178.248.108.77 and is sent to the external network on eth1.
> 13:37:44.834341 IP 178.248.108.77 > 10.1.20.100: ICMP echo request, id 2644,
> seq 2, length 64 For some reason, during the nat_PREROUTING stage the DST_IP 
> is
> changed, but the OUT interface still reflects the interface associated with 
> the
> old DST_IP. Here is the routing table # ip route list default via 
> 178.248.108.1
> dev eth1 10.1.10.0/24 dev eth2 proto kernel scope link src 10.1.10.1
> 10.1.20.0/24 dev eth3 proto kernel scope link src 10.1.20.1 169.254.0.0/16 dev
> eth0 proto kernel scope link src 169.254.0.5 178.248.108.0/25 dev eth1 proto
> kernel scope link src 178.248.108.101 # ip rule list 0: from all lookup local
> 32761: from all fwmark 0x3 lookup Table_eth3 32762: from all fwmark 0x2 lookup
> Table_eth2 32763: from all fwmark 0x1 lookup Table_eth1 32764: from 
> 10.1.0.0/16
> lookup static_route_back 32765: from 10.1.0.0/16 lookup static_route 32766:
> from all lookup main 32767: from all lookup default Further into the
> investigation, the problem was pinned down to those rules. All the traffic 
> from
> internal IP on the static NATed connection were forced to go to the outside
> interface (eth1), by setting the mark 0x1 and then using the matching # ip 
> rule
> to direct it. #iptables -t mangle -L PREROUTING -vn Chain PREROUTING (policy
> ACCEPT 97 packets, 11395 bytes) pkts bytes target prot opt in out source
> destination 49 3644 CONNMARK all -- * * 10.1.10.100 0.0.0.0/0 state NEW
> CONNMARK save 37 2720 MARK all -- * * 10.1.20.100 0.0.0.0/0 state NEW MARK set
> 0x1 37 2720 CONNMARK all -- * * 10.1.20.100 0.0.0.0/0 state NEW CONNMARK save
> 114 8472 MARK all -- * * 10.1.10.100 0.0.0.0/0 state NEW MARK set 0x1 114 8472
> CONNMARK all -- * * 10.1.10.100 0.0.0.0/0 state NEW CONNMARK save # ip rule 0:
> from all lookup local 32761: from all fwmark 0x3 lookup Table_eth3 32762: from
> all fwmark 0x2 lookup Table_eth2 32763: from all fwmark 0x1 lookup Table_eth1
> 32764: from 10.1.0.0/16 lookup static_route_back 32765: from 10.1.0.0/16 
> lookup
> static_route 32766: from all lookup main 32767: from all lookup default The
> acceptable solution is to delete those rules all together.? The problem with
> such approach is that the inter VPC traffic will use the internal IP 
> addresses,
> so the packe

Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT

2018-04-20 Thread Andrei Mikhailovsky
Hello, 

I have been posting to the users thread about this issue. Here is a quick 
summary in case people contributing to the source NAT code on the VPC side 
would like to fix this issue. 


Problem summary: no connectivity between virtual machines behind two Static NAT 
networks. 

Problem case: when one virtual machine sends a packet to the external address 
of another virtual machine, where both are handled by the same router and both 
are behind Static NAT, the traffic does not work. 



10.1.10.100 10.1.10.1:eth2 eth3:10.1.20.1 10.1.20.100 
virt1 <---> router <---> virt2 
178.248.108.77:eth1:178.248.108.113 


A single packet is sent from virt1 to virt2. 


stage1: it arrives at the router on eth2 and enters "nat_PREROUTING" 
IN=eth2 OUT= SRC=10.1.10.100 DST=178.248.108.113 

It goes through the "10 1K DNAT all -- * * 0.0.0.0/0 178.248.108.113 
to:10.1.20.100" rule and has the DST DNATed to the internal IP of virt2. 


stage2: Enters the FORWARDING chain and is being DROPPED by the default policy. 
DROPPED:IN=eth2 OUT=eth1 SRC=10.1.10.100 DST=10.1.20.100 

The reason is that the OUT interface is not correctly changed from eth1 
to eth3 during nat_PREROUTING, so the packet is not intercepted by the 
FORWARD rule and thus not accepted: 
"24 14K ACL_INBOUND_eth3 all -- * eth3 0.0.0.0/0 10.1.20.0/24" 


stage3: a manually inserted rule accepts this packet for FORWARDING. 
The packet then enters the "nat_POSTROUTING" chain 
IN= OUT=eth1 SRC=10.1.10.100 DST=10.1.20.100 

and has the SRC changed to the external IP 
16 1320 SNAT all -- * eth1 10.1.10.100 0.0.0.0/0 to:178.248.108.77 

and is sent to the external network on eth1. 
13:37:44.834341 IP 178.248.108.77 > 10.1.20.100: ICMP echo request, id 2644, 
seq 2, length 64 


For some reason, during the nat_PREROUTING stage the DST_IP is changed, but the 
OUT interface still reflects the interface associated with the old DST_IP. 

Here is the routing table 
# ip route list 
default via 178.248.108.1 dev eth1 
10.1.10.0/24 dev eth2 proto kernel scope link src 10.1.10.1 
10.1.20.0/24 dev eth3 proto kernel scope link src 10.1.20.1 
169.254.0.0/16 dev eth0 proto kernel scope link src 169.254.0.5 
178.248.108.0/25 dev eth1 proto kernel scope link src 178.248.108.101 

# ip rule list 
0: from all lookup local 
32761: from all fwmark 0x3 lookup Table_eth3 
32762: from all fwmark 0x2 lookup Table_eth2 
32763: from all fwmark 0x1 lookup Table_eth1 
32764: from 10.1.0.0/16 lookup static_route_back 
32765: from 10.1.0.0/16 lookup static_route 
32766: from all lookup main 
32767: from all lookup default 
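
To see where the marked traffic is actually routed, the per-mark tables from 
the rules above can be inspected (a sketch; the table names are as listed 
above): 

# traffic marked 0x1 is looked up in Table_eth1, i.e. forced out of eth1 
ip route show table Table_eth1 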


Further into the investigation, the problem was pinned down to those rules. 
All the traffic from the internal IPs on the static NATed connections was 
forced to go to the outside interface (eth1), by setting the mark 0x1 and then 
using the matching ip rule to direct it. 

#iptables -t mangle -L PREROUTING -vn 
Chain PREROUTING (policy ACCEPT 97 packets, 11395 bytes) 
pkts bytes target prot opt in out source destination 
49 3644 CONNMARK all -- * * 10.1.10.100 0.0.0.0/0 state NEW CONNMARK save 
37 2720 MARK all -- * * 10.1.20.100 0.0.0.0/0 state NEW MARK set 0x1 
37 2720 CONNMARK all -- * * 10.1.20.100 0.0.0.0/0 state NEW CONNMARK save 
114 8472 MARK all -- * * 10.1.10.100 0.0.0.0/0 state NEW MARK set 0x1 
114 8472 CONNMARK all -- * * 10.1.10.100 0.0.0.0/0 state NEW CONNMARK save 


# ip rule 
0: from all lookup local 
32761: from all fwmark 0x3 lookup Table_eth3 
32762: from all fwmark 0x2 lookup Table_eth2 
32763: from all fwmark 0x1 lookup Table_eth1 
32764: from 10.1.0.0/16 lookup static_route_back 
32765: from 10.1.0.0/16 lookup static_route 
32766: from all lookup main 
32767: from all lookup default 


An acceptable solution might be to delete those rules altogether. 

The problem with such an approach is that the traffic between the VPC tiers 
will use the internal IP addresses, so the packets going from 178.248.108.77 
to 178.248.108.113 would be seen as communication between 10.1.10.100 and 
10.1.20.100. 

Thus we need to apply two further rules 
# iptables -t nat -I POSTROUTING -o eth3 -s 10.1.10.0/24 -d 10.1.20.0/24 -j 
SNAT --to-source 178.248.108.77 
# iptables -t nat -I POSTROUTING -o eth2 -s 10.1.20.0/24 -d 10.1.10.0/24 -j 
SNAT --to-source 178.248.108.113 

in order to make sure that the packets leaving the router have the correct 
source IP. 

This way it is possible to have static NAT on all of the IPs within the VPC and 
ensure successful communication between them. 


So, for a quick and dirty fix, we ran this command on the VR: 

for i in $(iptables -t mangle -L PREROUTING -vn | awk '/0x1/ && !/eth1/ {print $8}'); \
do iptables -t mangle -D PREROUTING -s $i -m state --state NEW -j MARK --set-mark "0x1"; done 



I believe the issue was introduced around the early 4.9.x releases. 


Thanks 

Andrei 





- Original Message - 
> From: "Andrei Mikhailovsky" <an

CLOUDSTACK-8663 and CLOUDSTACK-4858

2017-09-19 Thread Andrei Mikhailovsky
Hello guys, 

I have a question on CLOUDSTACK-4858 and CLOUDSTACK-8663 issues that were fixed 
in the recent 4.9.3.0 release. 

First of all, big up for addressing issue 4858 after about 3+ years of it being 
'cooked' in the oven. This issue alone will save so much time and network 
traffic for many of us, I am sure. This leads me to a question about pruning old 
snapshots on the Ceph storage. 

I am currently running 4.9.2.0 and for ages I've been having a problem with 
CloudStack leaving disk snapshots on the primary storage after they have been 
copied to the secondary storage. When I discovered this issue, I had over 4000 
snapshots on Ceph. So now I am running a small script that clears the clutter 
left by CloudStack's snapshotting process (see the sketch below). So, if I were 
to use primary storage exclusively for keeping snapshots, would my old 
snapshots be removed according to the snapshot schedule? Or has this function 
been missed out? 

Thanks 

Andrei 


Re: Changing default NFS mount options

2017-09-04 Thread Andrei Mikhailovsky

Hi Rafael,

I had a chat with our storage guy and it turns out that a server-side 
configuration change was what was needed. It is all working nicely at around 
450MB/s for sequential writes.

Cheers

Andrei

- Original Message -
> From: "Rafael Weingärtner" <raf...@autonomiccs.com.br>
> To: "dev" <dev@cloudstack.apache.org>
> Sent: Friday, 1 September, 2017 14:02:20
> Subject: Re: Changing default NFS mount options

> Do you know which parameter you are setting up manually that is causing
> this change?
> 
> 
> On 9/1/2017 7:15 AM, Andrei Mikhailovsky wrote:
>> Hello guys,
>>
>> could you please let me know how could I change the default mount options 
>> with
>> the NFS primary storage? I have noticed that if I am mounting nfs storage
>> manually on the host server, the write speeds are significantly faster 
>> compared
>> with the mount point that is automatically created by ACS. Like the 
>> difference
>> is around 50MB/s for the automatic mount and about 450MB/s for the manual
>> mount.
>>
>> Can I manually adjust the mount options, even if it means changing the 
>> sources
>> and recompiling?
>>
>> Thanks
>>
>> Andrei
>>
> 
> --
> --
> Rafael Weingärtner


Changing default NFS mount options

2017-09-01 Thread Andrei Mikhailovsky
Hello guys, 

could you please let me know how I could change the default mount options for 
the NFS primary storage? I have noticed that if I mount the NFS storage 
manually on the host server, the write speeds are significantly faster compared 
with the mount point that is automatically created by ACS: the difference 
is around 50MB/s for the automatic mount versus about 450MB/s for the manual 
mount. 

Can I manually adjust the mount options, even if it means changing the sources 
and recompiling? 
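
For comparison, the manual test amounted to something like this (a sketch; the 
export path and option values are illustrative, not a recommendation): 

# mount the same export by hand with explicit options 
mount -t nfs -o rw,hard,tcp,rsize=1048576,wsize=1048576 nfs-server:/export/primary /mnt/nfstest 
# quick sequential write test, bypassing the page cache 
dd if=/dev/zero of=/mnt/nfstest/ddtest bs=1M count=1024 oflag=direct 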

Thanks 

Andrei 


Re: [RESULT][VOTE] Apache CloudStack 4.10.0.0

2017-07-06 Thread Andrei Mikhailovsky

Congratulations to everyone! Job well done!

Andrei

- Original Message -
> From: "Haijiao" <18602198...@163.com>
> To: "dev" 
> Sent: Thursday, 6 July, 2017 13:58:48
> Subject: Re:Re: [RESULT][VOTE] Apache CloudStack 4.10.0.0

> Finally 4.10 arrives.  True achievement of whole community !
> 
> 
> Thanks Rajani !
> 
> 
> 
> 
> 
> 
> On 2017-07-06 at 19:54, "Wido den Hollander" wrote:
> 
> 
>> On 6 July 2017 at 12:09, Wei ZHOU wrote:
>>
>>
>> nice!!
> 
> Indeed! Let's go for 4.11 :)
> 
> Wido
> 
>>
>> 2017-07-06 11:56 GMT+02:00 Rajani Karuturi :
>>
>> > Hi all,
>> >
>> > After 72 hours, the vote for CloudStack 4.10.0.0 [1] *passes* with
>> > 4 PMC + 2 non-PMC votes.
>> >
>> > +1 (PMC / binding)
>> > * Mike Tutkowski
>> > * Wido den Hollander
>> > * Daan Hoogland
>> > * Milamber
>> >
>> > +1 (non binding)
>> > * Kris Sterckx
>> > * Boris Stoyanov
>> >
>> > 0
>> > none
>> >
>> > -1
>> > none
>> >
>> > Thanks to everyone participating.
>> >
>> > I will now prepare the release announcement to go out after 24 hours to
>> > give the mirrors time to catch up.
>> >
>> > [1] http://markmail.org/thread/dafndhtflon4pshf
>> >
>> > ~Rajani
>> > http://cloudplatform.accelerite.com/


Re: error adding VPN user in VPC network

2016-11-22 Thread Andrei Mikhailovsky
Dag from the users mailing list has pointed to this: 
https://issues.apache.org/jira/browse/CLOUDSTACK-9356


- Original Message -
> From: "Will Stevens" <williamstev...@gmail.com>
> To: "dev" <dev@cloudstack.apache.org>
> Sent: Tuesday, 22 November, 2016 17:37:56
> Subject: Re: error adding VPN user in VPC network

> Hmm. That is strange. I have not seen that behavior before.
> 
> On Nov 22, 2016 11:45 AM, "Andrei Mikhailovsky" <and...@arhont.com.invalid>
> wrote:
> 
>> Hi Will,
>>
>> forgot to mention that my VPN services are working well for all existing
>> accounts on a none VPC networks. I am running version 4.9.0 and have no
>> issues apart from creating new vpn users to a VPC enabled network.
>>
>> Actually, I've just checked that I can successfully add a new user to a
>> non-VPC network. No issues there as far as I can see.
>>
>> Andrei
>>
>> - Original Message -
>> > From: "Will Stevens" <wstev...@cloudops.com>
>> > To: "dev" <dev@cloudstack.apache.org>
>> > Sent: Tuesday, 22 November, 2016 13:47:36
>> > Subject: Re: error adding VPN user in VPC network
>>
>> > I am not sure how you are able to add the VPN user to a Static NAT IP.
>> You
>> > should be adding it to the Source NAT IP.  Was that just a typo or are
>> you
>> > targeting the wrong IP address using the API or something like that?
>> >
>> > There are known issues with the current VPN implementation (openswan).
>> > Basically, if you try to scp files over it or tail a log, it will drop
>> your
>> > connection.  You may want to try the code from my PR
>> > https://github.com/apache/cloudstack/pull/1741 if you have problems with
>> > the current implementation.  That PR should make it into the next
>> release...
>> >
>> > *Will STEVENS*
>> > Lead Developer
>> >
>> > <https://goo.gl/NYZ8KK>
>> >
>> > On Tue, Nov 22, 2016 at 8:35 AM, Andrei Mikhailovsky <
>> > and...@arhont.com.invalid> wrote:
>> >
>> >> Hello
>> >>
>> >> Duplicating this from the users list.
>> >>
>> >> I am running ACS 4.9.0.
>> >>
>> >> I am having an issue with adding a VPN user to the VPC network. I've
>> >> enabled the VPN service on the static IP. The service was enabled and I
>> >> have the PSK shown to me. However, when I am adding a new user it fails
>> >> with the following error:
>> >>
>> >> 2016-11-22 12:05:26,189 DEBUG [c.c.n.v.RemoteAccessVpnManagerImpl]
>> >> (API-Job-Executor-82:ctx-d62e35c3 job-31537 ctx-8ac8a450)
>> >> (logid:f76b2eae) VPN User VpnUser[40-andrei-45] is set on
>> >> com.cloud.network.dao.RemoteAccessVpnVO$$EnhancerByCGLIB$$cc1dfb8d@
>> >> 4465732c
>> >> 2016-11-22 12:05:26,189 WARN [c.c.n.v.RemoteAccessVpnManagerImpl]
>> >> (API-Job-Executor-82:ctx-d62e35c3 job-31537 ctx-8ac8a450)
>> >> (logid:f76b2eae) Unable to apply vpn users
>> >> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>> >> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>> >> at java.util.ArrayList.get(ArrayList.java:411)
>> >> at com.cloud.network.vpn.RemoteAccessVpnManagerImpl.applyVpnUsers(
>> >> RemoteAccessVpnManagerImpl.java:532)
>> >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >> at sun.reflect.NativeMethodAccessorImpl.invoke(
>> >> NativeMethodAccessorImpl.java:57)
>> >> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
>> >> DelegatingMethodAccessorImpl.java:43)
>> >> at java.lang.reflect.Method.invoke(Method.java:606)
>> >> at org.springframework.aop.support.AopUtils.
>> invokeJoinpointUsingReflection
>> >> (AopUtils.java:317)
>> >> at org.springframework.aop.framework.ReflectiveMethodInvocation.
>> >> invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>> >> at org.springframework.aop.framework.ReflectiveMethodInvocation.
>> proceed(
>> >> ReflectiveMethodInvocation.java:150)
>> >> at org.springframework.aop.interceptor.ExposeInvocationInterceptor.
>> invoke(
>> >> ExposeInvocationInterceptor.java:91)
>> >> at org.springframework.aop.framework.ReflectiveMethodInvocation.
>> proceed(
>> >> ReflectiveMethodInvocation.java:172)
>> >> at org.springframework.aop.framework.JdkDynamicAopProxy.

Re: error adding VPN user in VPC network

2016-11-22 Thread Andrei Mikhailovsky
Hi Will,

forgot to mention that my VPN services are working well for all existing 
accounts on non-VPC networks. I am running version 4.9.0 and have no issues 
apart from creating new VPN users on a VPC-enabled network.

Actually, I've just checked that I can successfully add a new user to a non-VPC 
network. No issues there as far as I can see.

Andrei

- Original Message -
> From: "Will Stevens" <wstev...@cloudops.com>
> To: "dev" <dev@cloudstack.apache.org>
> Sent: Tuesday, 22 November, 2016 13:47:36
> Subject: Re: error adding VPN user in VPC network

> I am not sure how you are able to add the VPN user to a Static NAT IP.  You
> should be adding it to the Source NAT IP.  Was that just a typo or are you
> targeting the wrong IP address using the API or something like that?
> 
> There are known issues with the current VPN implementation (openswan).
> Basically, if you try to scp files over it or tail a log, it will drop your
> connection.  You may want to try the code from my PR
> https://github.com/apache/cloudstack/pull/1741 if you have problems with
> the current implementation.  That PR should make it into the next release...
> 
> *Will STEVENS*
> Lead Developer
> 
> <https://goo.gl/NYZ8KK>
> 
> On Tue, Nov 22, 2016 at 8:35 AM, Andrei Mikhailovsky <
> and...@arhont.com.invalid> wrote:
> 
>> Hello
>>
>> Duplicating this from the users list.
>>
>> I am running ACS 4.9.0.
>>
>> I am having an issue with adding a VPN user to the VPC network. I've
>> enabled the VPN service on the static IP. The service was enabled and I
>> have the PSK shown to me. However, when I am adding a new user it fails
>> with the following error:
>>
>> 2016-11-22 12:05:26,189 DEBUG [c.c.n.v.RemoteAccessVpnManagerImpl]
>> (API-Job-Executor-82:ctx-d62e35c3 job-31537 ctx-8ac8a450)
>> (logid:f76b2eae) VPN User VpnUser[40-andrei-45] is set on
>> com.cloud.network.dao.RemoteAccessVpnVO$$EnhancerByCGLIB$$cc1dfb8d@
>> 4465732c
>> 2016-11-22 12:05:26,189 WARN [c.c.n.v.RemoteAccessVpnManagerImpl]
>> (API-Job-Executor-82:ctx-d62e35c3 job-31537 ctx-8ac8a450)
>> (logid:f76b2eae) Unable to apply vpn users
>> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>> at java.util.ArrayList.get(ArrayList.java:411)
>> at com.cloud.network.vpn.RemoteAccessVpnManagerImpl.applyVpnUsers(
>> RemoteAccessVpnManagerImpl.java:532)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(
>> NativeMethodAccessorImpl.java:57)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
>> DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection
>> (AopUtils.java:317)
>> at org.springframework.aop.framework.ReflectiveMethodInvocation.
>> invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(
>> ReflectiveMethodInvocation.java:150)
>> at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(
>> ExposeInvocationInterceptor.java:91)
>> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(
>> ReflectiveMethodInvocation.java:172)
>> at org.springframework.aop.framework.JdkDynamicAopProxy.
>> invoke(JdkDynamicAopProxy.java:204)
>> at com.sun.proxy.$Proxy237.applyVpnUsers(Unknown Source)
>> at org.apache.cloudstack.api.command.user.vpn.AddVpnUserCmd.execute(
>> AddVpnUserCmd.java:122)
>> at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:150)
>> at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:
>> 108)
>> at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.
>> runInContext(AsyncJobManagerImpl.java:554)
>> at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(
>> ManagedContextRunnable.java:49)
>> at org.apache.cloudstack.managed.context.impl.
>> DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.
>> callWithContext(DefaultManagedContext.java:103)
>> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.
>> runWithContext(DefaultManagedContext.java:53)
>> at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(
>> ManagedContextRunnable.java:46)
>> at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(
>> AsyncJobManagerImpl.java:502)
>> at java.util.concurrent

Re: error adding VPN user in VPC network

2016-11-22 Thread Andrei Mikhailovsky
Hi Wei

Is this something that will be merged in the next release?

Thanks

- Original Message -
> From: "Wei ZHOU" <ustcweiz...@gmail.com>
> To: "dev" <dev@cloudstack.apache.org>
> Sent: Tuesday, 22 November, 2016 15:06:18
> Subject: Re: error adding VPN user in VPC network

> Hi Andrei,
> 
> I fixed it with the following change
> '''
> diff --git
> a/server/src/com/cloud/network/vpn/RemoteAccessVpnManagerImpl.java
> b/server/src/com/cloud/network/vpn/RemoteAccessVpnManagerImpl.java
> index b473f05..2a84714 100644
> --- a/server/src/com/cloud/network/vpn/RemoteAccessVpnManagerImpl.java
> +++ b/server/src/com/cloud/network/vpn/RemoteAccessVpnManagerImpl.java
> @@ -521,21 +521,26 @@ public class RemoteAccessVpnManagerImpl extends
> ManagerBase implements RemoteAcc
> 
> boolean success = true;
> 
> -boolean[] finals = new boolean[users.size()];
> +Boolean[] finals = new Boolean[users.size()];
> for (RemoteAccessVPNServiceProvider element :
> _vpnServiceProviders) {
> s_logger.debug("Applying vpn access to " + element.getName());
> for (RemoteAccessVpnVO vpn : vpns) {
> try {
> String[] results = element.applyVpnUsers(vpn, users);
> if (results != null) {
> +int indexUser = -1;
> for (int i = 0; i < results.length; i++) {
> -s_logger.debug("VPN User " + users.get(i) +
> (results[i] == null ? " is set on " : (" couldn't be set due to " +
> results[i]) + " on ") + vpn);
> +indexUser ++;
> +if (indexUser == users.size()) {
> +indexUser = 0; // results on multiple VPC
> routers are combined in commit 13eb789, reset user index if one VR is done.
> +}
> +s_logger.debug("VPN User " +
> users.get(indexUser) + (results[i] == null ? " is set on " : (" couldn't be
> set due to " + results[i]) + " on ") + vpn.getUuid());
> if (results[i] == null) {
> -if (!finals[i]) {
> -finals[i] = true;
> +if (finals[indexUser] == null) {
> +finals[indexUser] = true;
> }
> } else {
> -    finals[i] = false;
> +finals[indexUser] = false;
> success = false;
> }
> }
> '''
> 
> 2016-11-22 14:35 GMT+01:00 Andrei Mikhailovsky <and...@arhont.com.invalid>:
> 
>> Hello
>>
>> Duplicating this from the users list.
>>
>> I am running ACS 4.9.0.
>>
>> I am having an issue with adding a VPN user to the VPC network. I've
>> enabled the VPN service on the static IP. The service was enabled and I
>> have the PSK shown to me. However, when I am adding a new user it fails
>> with the following error:
>>
>> 2016-11-22 12:05:26,189 DEBUG [c.c.n.v.RemoteAccessVpnManagerImpl]
>> (API-Job-Executor-82:ctx-d62e35c3 job-31537 ctx-8ac8a450)
>> (logid:f76b2eae) VPN User VpnUser[40-andrei-45] is set on
>> com.cloud.network.dao.RemoteAccessVpnVO$$EnhancerByCGLIB$$cc1dfb8d@
>> 4465732c
>> 2016-11-22 12:05:26,189 WARN [c.c.n.v.RemoteAccessVpnManagerImpl]
>> (API-Job-Executor-82:ctx-d62e35c3 job-31537 ctx-8ac8a450)
>> (logid:f76b2eae) Unable to apply vpn users
>> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>> at java.util.ArrayList.get(ArrayList.java:411)
>> at com.cloud.network.vpn.RemoteAccessVpnManagerImpl.applyVpnUsers(
>> RemoteAccessVpnManagerImpl.java:532)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(
>> NativeMethodAccessorImpl.java:57)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
>> DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection
>> (AopUtils.java:317)
>> at org.springframework.aop.framework.ReflectiveMethodInvocation.
>> invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(
>> Reflectiv

Re: error adding VPN user in VPC network

2016-11-22 Thread Andrei Mikhailovsky
Hi Will, yeah, it's a typo. I meant to say the Source NAT IP.

Any idea when the next release is out?

Thanks

- Original Message -
> From: "Will Stevens" <wstev...@cloudops.com>
> To: "dev" <dev@cloudstack.apache.org>
> Sent: Tuesday, 22 November, 2016 13:47:36
> Subject: Re: error adding VPN user in VPC network

> I am not sure how you are able to add the VPN user to a Static NAT IP.  You
> should be adding it to the Source NAT IP.  Was that just a typo or are you
> targeting the wrong IP address using the API or something like that?
> 
> There are known issues with the current VPN implementation (openswan).
> Basically, if you try to scp files over it or tail a log, it will drop your
> connection.  You may want to try the code from my PR
> https://github.com/apache/cloudstack/pull/1741 if you have problems with
> the current implementation.  That PR should make it into the next release...
> 
> *Will STEVENS*
> Lead Developer
> 
> <https://goo.gl/NYZ8KK>
> 
> On Tue, Nov 22, 2016 at 8:35 AM, Andrei Mikhailovsky <
> and...@arhont.com.invalid> wrote:
> 
>> Hello
>>
>> Duplicating this from the users list.
>>
>> I am running ACS 4.9.0.
>>
>> I am having an issue with adding a VPN user to the VPC network. I've
>> enabled the VPN service on the static IP. The service was enabled and I
>> have the PSK shown to me. However, when I am adding a new user it fails
>> with the following error:
>>
>> 2016-11-22 12:05:26,189 DEBUG [c.c.n.v.RemoteAccessVpnManagerImpl]
>> (API-Job-Executor-82:ctx-d62e35c3 job-31537 ctx-8ac8a450)
>> (logid:f76b2eae) VPN User VpnUser[40-andrei-45] is set on
>> com.cloud.network.dao.RemoteAccessVpnVO$$EnhancerByCGLIB$$cc1dfb8d@
>> 4465732c
>> 2016-11-22 12:05:26,189 WARN [c.c.n.v.RemoteAccessVpnManagerImpl]
>> (API-Job-Executor-82:ctx-d62e35c3 job-31537 ctx-8ac8a450)
>> (logid:f76b2eae) Unable to apply vpn users
>> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
>> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
>> at java.util.ArrayList.get(ArrayList.java:411)
>> at com.cloud.network.vpn.RemoteAccessVpnManagerImpl.applyVpnUsers(
>> RemoteAccessVpnManagerImpl.java:532)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(
>> NativeMethodAccessorImpl.java:57)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
>> DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection
>> (AopUtils.java:317)
>> at org.springframework.aop.framework.ReflectiveMethodInvocation.
>> invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(
>> ReflectiveMethodInvocation.java:150)
>> at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(
>> ExposeInvocationInterceptor.java:91)
>> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(
>> ReflectiveMethodInvocation.java:172)
>> at org.springframework.aop.framework.JdkDynamicAopProxy.
>> invoke(JdkDynamicAopProxy.java:204)
>> at com.sun.proxy.$Proxy237.applyVpnUsers(Unknown Source)
>> at org.apache.cloudstack.api.command.user.vpn.AddVpnUserCmd.execute(
>> AddVpnUserCmd.java:122)
>> at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:150)
>> at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:
>> 108)
>> at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.
>> runInContext(AsyncJobManagerImpl.java:554)
>> at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(
>> ManagedContextRunnable.java:49)
>> at org.apache.cloudstack.managed.context.impl.
>> DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.
>> callWithContext(DefaultManagedContext.java:103)
>> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.
>> runWithContext(DefaultManagedContext.java:53)
>> at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(
>> ManagedContextRunnable.java:46)
>> at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(
>> AsyncJobManagerImpl.java:502)
>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(
>> ThreadPoolExecutor.java:1145)
>> at java.util.concurr

error adding VPN user in VPC network

2016-11-22 Thread Andrei Mikhailovsky
Hello 

Duplicating this from the users list. 

I am running ACS 4.9.0. 

I am having an issue with adding a VPN user to the VPC network. I've enabled 
the VPN service on the static IP. The service was enabled and I have the PSK 
shown to me. However, when I am adding a new user it fails with the following 
error: 

2016-11-22 12:05:26,189 DEBUG [c.c.n.v.RemoteAccessVpnManagerImpl] 
(API-Job-Executor-82:ctx-d62e35c3 job-31537 ctx-8ac8a450) (logid:f76b2eae) VPN 
User VpnUser[40-andrei-45] is set on 
com.cloud.network.dao.RemoteAccessVpnVO$$EnhancerByCGLIB$$cc1dfb8d@4465732c 
2016-11-22 12:05:26,189 WARN [c.c.n.v.RemoteAccessVpnManagerImpl] 
(API-Job-Executor-82:ctx-d62e35c3 job-31537 ctx-8ac8a450) (logid:f76b2eae) 
Unable to apply vpn users 
java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 
at java.util.ArrayList.rangeCheck(ArrayList.java:635) 
at java.util.ArrayList.get(ArrayList.java:411) 
at 
com.cloud.network.vpn.RemoteAccessVpnManagerImpl.applyVpnUsers(RemoteAccessVpnManagerImpl.java:532)
 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 
at java.lang.reflect.Method.invoke(Method.java:606) 
at 
org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
 
at 
org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
 
at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
 
at 
org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
 
at 
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
 
at 
org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
 
at com.sun.proxy.$Proxy237.applyVpnUsers(Unknown Source) 
at 
org.apache.cloudstack.api.command.user.vpn.AddVpnUserCmd.execute(AddVpnUserCmd.java:122)
 
at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:150) 
at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108) 
at 
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554)
 
at 
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
 
at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
 
at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
 
at 
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
 
at 
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
 
at 
org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502)
 
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
at java.lang.Thread.run(Thread.java:745) 
2016-11-22 12:05:26,190 DEBUG [c.c.n.v.RemoteAccessVpnManagerImpl] 
(API-Job-Executor-82:ctx-d62e35c3 job-31537 ctx-8ac8a450) (logid:f76b2eae) 
Applying vpn access to VirtualRouter 
2016-11-22 12:05:26,192 WARN [c.c.n.v.RemoteAccessVpnManagerImpl] 
(API-Job-Executor-82:ctx-d62e35c3 job-31537 ctx-8ac8a450) (logid:f76b2eae) 
Failed to apply vpn for user andrei, accountId=45 
2016-11-22 12:05:26,193 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-82:ctx-d62e35c3 job-31537) (logid:f76b2eae) Complete async 
job-31537, jobStatus: FAILED, resultCode: 530, result: 
org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed
 to add vpn user"} 

Please advise how to get this problem fixed so that we have a working VPN service. 

Thanks 

Andrei 


Re: change of fqdn in rbd storage pool

2016-07-22 Thread Andrei Mikhailovsky
Wido, thanks for your help. I will try that.

Andrei

- Original Message -
> From: "Wido den Hollander" <w...@widodh.nl>
> To: "dev" <dev@cloudstack.apache.org>
> Sent: Friday, 22 July, 2016 11:55:40
> Subject: Re: change of fqdn in rbd storage pool

>> On 22 July 2016 at 12:37, Andrei Mikhailovsky wrote:
>> <and...@arhont.com.INVALID>:
>> 
>> 
>> Hi Wido,
>> 
>> Many thanks, this is how it is currently setup via dns with round robin. that
>> dns name will change and thus I need to tell ACS about that. I wasn't sure 
>> how
>> it works and if it is as simple as you've suggested. I thought that it could
>> break stuff with libvirt.
>> 
> 
> Oh, yes, you will have to restart the agent and libvirt on the nodes so they
> load the new pool definition, otherwise the agents will still use the old
> hostname.
> 
> Wido
> 
>> Cheers
>> 
>> Andrei
>> 
>> - Original Message -
>> > From: "Wido den Hollander" <w...@widodh.nl>
>> > To: "dev" <dev@cloudstack.apache.org>
>> > Sent: Friday, 22 July, 2016 10:33:02
>> > Subject: Re: change of fqdn in rbd storage pool
>> 
>> >> On 22 July 2016 at 11:21, Andrei Mikhailovsky wrote:
>> >> <and...@arhont.com.INVALID>:
>> >> 
>> >> 
>> >> Hi
>> >> 
>> >> We are making some changes to our infrastructure and as a result, the 
>> >> fqdn name
>> >> of our RBD storage pool is changing. From what I can see, the GUI does not
>> >> allow the change of the IP Address field of the Primary Storage pool.
>> >> 
>> >> What manual steps are required for changing the fqdn of the Primary 
>> >> Storage
>> >> pool?
>> >> 
>> > 
>> > Edit the host in the storage_pool MySQL table, that should be sufficient.
>> > 
>> > With Ceph I would always recommend to have a hostname which is a Round 
>> > Robin
>> > Record pointing to all monitors.
>> > 
>> > Wido
>> > 
>> >> Many thanks
>> >> 
> > > > Andrei
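
For reference, the change Wido describes amounts to something like the 
following on the management server (a hedged sketch; the column name 
host_address and the pool id are from memory / illustrative, so verify them 
against your cloud database schema first): 

mysql -u cloud -p cloud -e \
  "UPDATE storage_pool SET host_address='new-ceph-mon.example.com' WHERE id=<pool_id>;" 

After that, restart libvirt and the cloudstack-agent on the hosts so they pick 
up the new pool definition, as noted above. 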


Re: change of fqdn in rbd storage pool

2016-07-22 Thread Andrei Mikhailovsky
Hi Wido,

Many thanks, this is how it is currently set up via DNS with round robin. That 
DNS name will change and thus I need to tell ACS about it. I wasn't sure how 
it works and whether it is as simple as you've suggested. I thought that it 
could break things with libvirt.

Cheers

Andrei

- Original Message -
> From: "Wido den Hollander" <w...@widodh.nl>
> To: "dev" <dev@cloudstack.apache.org>
> Sent: Friday, 22 July, 2016 10:33:02
> Subject: Re: change of fqdn in rbd storage pool

>> On 22 July 2016 at 11:21, Andrei Mikhailovsky wrote:
>> <and...@arhont.com.INVALID>:
>> 
>> 
>> Hi
>> 
>> We are making some changes to our infrastructure and as a result, the fqdn 
>> name
>> of our RBD storage pool is changing. From what I can see, the GUI does not
>> allow the change of the IP Address field of the Primary Storage pool.
>> 
>> What manual steps are required for changing the fqdn of the Primary Storage
>> pool?
>> 
> 
> Edit the host in the storage_pool MySQL table, that should be sufficient.
> 
> With Ceph I would always recommend to have a hostname which is a Round Robin
> Record pointing to all monitors.
> 
> Wido
> 
>> Many thanks
>> 
> > Andrei


change of fqdn in rbd storage pool

2016-07-22 Thread Andrei Mikhailovsky
Hi 

We are making some changes to our infrastructure and as a result the FQDN 
of our RBD storage pool is changing. From what I can see, the GUI does not 
allow changing the IP Address field of the Primary Storage pool. 

What manual steps are required for changing the fqdn of the Primary Storage 
pool? 

Many thanks 

Andrei 


Re: 4.9.0 RC2 Status

2016-07-22 Thread Andrei Mikhailovsky
Hi

I've been randomly seeing this issue for over a year now. At least I think it 
might be related.

I am currently on 4.7.1.1, but a few previous releases had this issue too on 
some of the networks. I've got half a dozen networks or so which are 
broken and do not allow outgoing traffic despite having the egress rule that 
allows all traffic out with CIDR 0.0.0.0/0. These networks are always broken. 
Restarting the network with and without the Clean Up option doesn't help, nor 
does removing and re-adding the egress rule.

In order to fix the outgoing traffic I have to log in to the VR in question and 
manually run:

iptables -A FW_OUTBOUND -j ACCEPT

Only after this command does the egress traffic start to flow. This procedure 
has to be repeated EVERY time the router is restarted or recreated, for EVERY 
network which is broken. The rest of the networks are not affected by this 
issue.
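
A slightly safer variant of that manual fix, so that repeated runs don't stack 
duplicate rules (a sketch using iptables' check option): 

# append the ACCEPT rule only if it is not already present 
iptables -C FW_OUTBOUND -j ACCEPT 2>/dev/null || iptables -A FW_OUTBOUND -j ACCEPT 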

I definitely didn't have this issue on the early 4.x releases; it probably 
appeared around version 4.4 or 4.5.

Andrei

- Original Message -
> From: "Rohit Yadav" 
> To: "Simon Weller" , "dev" 
> Sent: Thursday, 21 July, 2016 21:13:52
> Subject: Re: 4.9.0 RC2 Status

> Hi Will,
> 
> 
> The issue is that after upgrading the VR from a pre-4.6 environment, the
> outbound traffic for guest VMs stop working (where their egress rule was allow
> all for 0.0.0.0/0). Along with this, I found that removing allow all 0.0.0.0/0
> egress rule does not remove the rule from VR's filter table. This could be
> minor security issue for guest VMs.
> 
> 
> I think it's a blocker, please help review and test it:
> 
> https://github.com/apache/cloudstack/pull/1614
> 
> 
> Regards.
> 
> 
> From: williamstev...@gmail.com  on behalf of Will
> Stevens 
> Sent: 21 July 2016 21:43:42
> To: Simon Weller
> Cc: dev@cloudstack.apache.org
> Subject: Re: 4.9.0 RC2 Status
> 
> I am waiting on pdube's PR to fix some issues with VPCs (not introduced in
> 4.9, but should be fixed in 4.9).
> 
> I am also testing #1613 because I had added #1594 and had to revert it
> because I was running into an error consistently ever since.  Hopefully
> #1613 will run cleanly and I can merge it as well for 4.9.
> 
> Sorry for the delay.  Since this release is so huge, it makes sense to fix
> as many issues as possible before it ships (especially if we will LTS this
> release).
> 
> *Will STEVENS*
> Lead Developer
> 
> *CloudOps* *| *Cloud Solutions Experts
> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
> w cloudops.com *|* tw @CloudOps_
> 
> 
> rohit.ya...@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>  
> 
> 
> On Thu, Jul 21, 2016 at 12:04 PM, Simon Weller  wrote:
> 
>> John,
>>
>>
>> I think we're pending a PR from pdube related to broken VPCs. It sounds
>> very much like what we found in our QA environment a few weeks ago.
>>
>> - Si
>>
>> --
>> *From:* John Burwell 
>> *Sent:* Thursday, July 21, 2016 10:55 AM
>> *To:* dev@cloudstack.apache.org
>> *Cc:* Will Stevens
>> *Subject:* 4.9.0 RC2 Status
>>
>> Will,
>>
>> I am inquiring as to the status of 4.9.0 RC2.  Are there issues we can
>> help resolve in order to get it out?  If not, do you have an ETA on when it
>> will be cut?
>>
>> Thanks,
>> -John
>> john.burw...@shapeblue.com
>> www.shapeblue.com
>> 53 Chandos Place, Covent Garden, London VA WC2N 4HSUK
>> @shapeblue
>>
>>
>>


Re: Pesky volume snapshot schedule

2016-02-05 Thread Andrei Mikhailovsky
Hi Anshul,

Many thanks for the suggestion. I will try that right away.

Cheers

Andrei



- Original Message -
> From: "Anshul Gangwar" <anshul.gang...@citrix.com>
> To: "dev" <dev@cloudstack.apache.org>
> Sent: Friday, 5 February, 2016 09:40:49
> Subject: Re: Pesky volume snapshot schedule

> Try setting "volume.snapshot.job.cancel.threshold" value greater than
> "backup.snapshot.wait" value. This will make sure that backup snapshot is
> cancelled before job cancellation. This in turn will take care of cleanup of
> old snapshots.
> 
> Regards,
> Anshul
> 
> On 05-Feb-2016, at 2:53 PM, Andrei Mikhailovsky
> <and...@arhont.com<mailto:and...@arhont.com>> wrote:
> 
> Hello,
> 
> I was hoping someone could help me investigate and fix the issue that I am
> having with one of the volume snapshots schedules. I am running ACS version
> 4.6.2 on Ubuntu 14.04 with latest updates, etc.
> 
> I've got a 200 gig vm root disk image which I am snapshotting on a daily 
> basis.
> I've created the schedule to take a snapshot at 9:30pm every day and keeping 8
> snapshots. The schedule is executing and the image is being snapshotted and
> placed on the nfs secondary storage. However, ACS does not remove the old
> snapshots at all and I am ending up with dozens and dozens of snapshots of 
> this
> particular volume. I have to manually remove them to keep the secondary 
> storage
> from filling up. The problem is with this particular volume. I've got several
> other daily snapshots schedules which are working perfectly well and their
> clean up works well as well.
> 
> I've compared db values for this particular volume with other volumes and I
> can't find any difference apart from the disk size. The db values for the
> schedule are also correct and look similar to a properly working schedule
> (apart from the start time). I've also tried to remove the schedule and create
> a new one at no avail.
> 
> Could someone help me fix the issue?
> 
> Many thanks
> 
> Andrei
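
For anyone applying Anshul's suggestion from the API rather than the GUI, the 
global setting can also be changed with CloudMonkey (a sketch; the value is 
illustrative and simply needs to exceed your backup.snapshot.wait, and a 
management-server restart may be required for it to take effect): 

cloudmonkey update configuration name=volume.snapshot.job.cancel.threshold value=28800 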


Pesky volume snapshot schedule

2016-02-05 Thread Andrei Mikhailovsky
Hello, 

I was hoping someone could help me investigate and fix the issue that I am 
having with one of the volume snapshot schedules. I am running ACS version 
4.6.2 on Ubuntu 14.04 with the latest updates, etc. 

I've got a 200 gig vm root disk image which I am snapshotting on a daily basis. 
I've created the schedule to take a snapshot at 9:30pm every day and keep 8 
snapshots. The schedule is executing and the image is being snapshotted and 
placed on the NFS secondary storage. However, ACS does not remove the old 
snapshots at all, and I am ending up with dozens and dozens of snapshots of this 
particular volume. I have to remove them manually to keep the secondary storage 
from filling up. The problem is only with this particular volume: I've got 
several other daily snapshot schedules which are working perfectly well, and 
their clean-up works well too. 

I've compared db values for this particular volume with other volumes and I 
can't find any difference apart from the disk size. The db values for the 
schedule are also correct and look similar to a properly working schedule 
(apart from the start time). I've also tried to remove the schedule and create 
a new one, to no avail. 

Could someone help me fix the issue? 

Many thanks 

Andrei 


Re: Pesky volume snapshot schedule

2016-02-05 Thread Andrei Mikhailovsky
Hi Anshul,

I had a look in the ACS GUI and can't find the 
'volume.snapshot.job.cancel.threshold' option. Was it introduced in ACS 
4.7 or 4.8? I am running 4.6.2 at the moment.

Thanks

-- 
Andrei Mikhailovsky
Director
Arhont Information Security

Web: http://www.arhont.com
http://www.wi-foo.com
Tel: +44 (0)870 4431337
Fax: +44 (0)208 429 3111
PGP: Key ID - 0x2B3438DE
PGP: Server - keyserver.pgp.com


- Original Message -
> From: "Anshul Gangwar" <anshul.gang...@citrix.com>
> To: "dev" <dev@cloudstack.apache.org>
> Sent: Friday, 5 February, 2016 09:40:49
> Subject: Re: Pesky volume snapshot schedule

> Try setting "volume.snapshot.job.cancel.threshold" value greater than
> "backup.snapshot.wait" value. This will make sure that backup snapshot is
> cancelled before job cancellation. This in turn will take care of cleanup of
> old snapshots.
> 
> Regards,
> Anshul
> 
> On 05-Feb-2016, at 2:53 PM, Andrei Mikhailovsky
> <and...@arhont.com<mailto:and...@arhont.com>> wrote:
> 
> Hello,
> 
> I was hoping someone could help me investigate and fix the issue that I am
> having with one of the volume snapshots schedules. I am running ACS version
> 4.6.2 on Ubuntu 14.04 with latest updates, etc.
> 
> I've got a 200 gig vm root disk image which I am snapshotting on a daily 
> basis.
> I've created the schedule to take a snapshot at 9:30pm every day and keeping 8
> snapshots. The schedule is executing and the image is being snapshotted and
> placed on the nfs secondary storage. However, ACS does not remove the old
> snapshots at all and I am ending up with dozens and dozens of snapshots of 
> this
> particular volume. I have to manually remove them to keep the secondary 
> storage
> from filling up. The problem is with this particular volume. I've got several
> other daily snapshots schedules which are working perfectly well and their
> clean up works well as well.
> 
> I've compared db values for this particular volume with other volumes and I
> can't find any difference apart from the disk size. The db values for the
> schedule are also correct and look similar to a properly working schedule
> (apart from the start time). I've also tried to remove the schedule and create
> a new one at no avail.
> 
> Could someone help me fix the issue?
> 
> Many thanks
> 
> Andrei


Re: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout

2016-02-01 Thread Andrei Mikhailovsky
Hi Remi,

Is this patch merged into 4.7.1 or 4.8.0, which were recently released? I am 
planning to do the upgrade and wanted to double-check.

Thanks

Andrei
- Original Message -
> From: "Remi Bergsma" <rberg...@schubergphilis.com>
> To: "dev" <dev@cloudstack.apache.org>
> Sent: Tuesday, 5 January, 2016 11:20:31
> Subject: Re: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout

> Hi Andrei,
> 
> You indeed need to build CloudStack for this to work.
> 
> You can create packages with ./packaging/package.sh script in the source tree.
> The PR is against 4.7 and when you create RPMs those will be 4.7.1-SHAPSHOT. I
> do run this in production and it resolved the issue. Let me know if it works
> for you too.
> 
> Regards,
> Remi
> 
> 
> 
> 
> On 05/01/16 10:07, "Andrei Mikhailovsky" <and...@arhont.com> wrote:
> 
>>Hi Remi,
>>
>>I've not tried the patch. I've missed it. Do I need to rebuild the ACS to 
>>apply
>>the patch or would making changes to the two files suffice?
>>
>>Thanks
>>
>>Andrei
>>- Original Message -
>>> From: "Remi Bergsma" <rberg...@schubergphilis.com>
>>> To: "dev" <dev@cloudstack.apache.org>
>>> Sent: Tuesday, 5 January, 2016 05:49:05
>>> Subject: Re: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout
>>
>>> Hi Andrei,
>>> 
>>> Did you try it in combination with the patch I created (PR1291)? You need 
>>> both
>>> changes.
>>> 
>>> Regards, Remi
>>> 
>>> Sent from my iPhone
>>> 
>>>> On 04 Jan 2016, at 22:17, Andrei Mikhailovsky <and...@arhont.com> wrote:
>>>> 
>>>> Hi Remi,
>>>> 
>>>> Thanks for your reply. However, your suggestion of increasing the
>>>> router.aggregation.command.each.timeout didn't help. I've tried setting the
>>>> value to 120 at no avail. Still fails with the same error.
>>>> 
>>>> Andrei
>>>> 
>>>> - Original Message -
>>>>> From: "Remi Bergsma" <rberg...@schubergphilis.com>
>>>>> To: "dev" <dev@cloudstack.apache.org>
>>>>> Sent: Monday, 4 January, 2016 10:44:43
>>>>> Subject: Re: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout
>>>> 
>>>>> Hi Andrei,
>>>>> 
>>>>> Missed that mail, sorry. I created a PR that allows for longer timeouts 
>>>>> [1].
>>>>> 
>>>>> Also, you can bump the router.aggregation.command.each.timeout global 
>>>>> setting to
>>>>> say 15-30 so it will allow to boot.
>>>>> 
>>>>> Next, we need to find why it takes so long in the first place. In our
>>>>> environment it at least starts now.
>>>>> 
>>>>> Regards,
>>>>> Remi
>>>>> 
>>>>> [1] https://github.com/apache/cloudstack/pull/1291
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 04/01/16 11:41, "Andrei Mikhailovsky" <and...@arhont.com> wrote:
>>>>>> 
>>>>>> Hello guys,
>>>>>> 
>>>>>> Tried the user's mailing list without any luck. Perhaps the dev guys 
>>>>>> know if
>>>>>> this issue is being looked at for the next release?
>>>>>> 
>>>>>> I've just upgraded to 4.6.2 and have similar issues with three virtual 
>>>>>> routers
>>>>>> out of 22 in total. They are all failing exactly the same way as 
>>>>>> described
>>>>>> here.
>>>>>> 
>>>>>> Has anyone found a permanent workaround for this issue?
>>>>>> 
>>>>>> Thanks
>>>>>> 
>>>>>> Andrei
>>>>>> 
>>>>>> - Original Message -
>>>>>>> From: "Stephan Seitz" <s.se...@secretresearchfacility.com>
>>>>>>> To: "users" <us...@cloudstack.apache.org>
>>>>>>> Sent: Monday, 30 November, 2015 19:53:57
>>>>>>> Subject: Re: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout
>>>>>> 
>>>>>>> Does anybody else experience problems due to (very) slow deployment of
>>>>>>> VRs?

Re: [RESULT][VOTE] Apache CloudStack 4.8.0

2016-02-01 Thread Andrei Mikhailovsky
Hi 

Could you please point me to the release notes for both 4.8.0 and 4.7.1? 
I couldn't find anything on cloudstack.org.

Many thanks

Andrei

- Original Message -
> From: "Remi Bergsma" 
> To: "dev" 
> Sent: Tuesday, 26 January, 2016 07:29:33
> Subject: [RESULT][VOTE] Apache CloudStack 4.8.0

> Hi all,
> 
> After 72+ hours, the vote for CloudStack 4.8.0 [1] *passes* with 5 PMC + 1
> non-PMC votes.
> 
> +1 (PMC / binding)
> * Daan
> * Milamber
> * Remi
> * Boris
> * Nux
> 
> +1 (non binding)
> * Glenn
> 
> 0
> Suresh Sadhu
> 
> -1
> none
> 
> Thanks to everyone participating.
> 
> I will now prepare the release announcement to go out after 24 hours to give 
> the
> mirrors time to catch up.
> 
> [1] http://cloudstack.markmail.org/message/crhpnaz7kjexa3pa


Re: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout

2016-01-05 Thread Andrei Mikhailovsky
Hi Remi,

I've not tried the patch; I missed it. Do I need to rebuild ACS to apply 
the patch, or would making the changes to the two files suffice?

Thanks

Andrei
- Original Message -
> From: "Remi Bergsma" <rberg...@schubergphilis.com>
> To: "dev" <dev@cloudstack.apache.org>
> Sent: Tuesday, 5 January, 2016 05:49:05
> Subject: Re: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout

> Hi Andrei,
> 
> Did you try it in combination with the patch I created (PR1291)? You need both
> changes.
> 
> Regards, Remi
> 
> Sent from my iPhone
> 
>> On 04 Jan 2016, at 22:17, Andrei Mikhailovsky <and...@arhont.com> wrote:
>> 
>> Hi Remi,
>> 
>> Thanks for your reply. However, your suggestion of increasing the
>> router.aggregation.command.each.timeout didn't help. I've tried setting the
>> value to 120 at no avail. Still fails with the same error.
>> 
>> Andrei
>> 
>> - Original Message -
>>> From: "Remi Bergsma" <rberg...@schubergphilis.com>
>>> To: "dev" <dev@cloudstack.apache.org>
>>> Sent: Monday, 4 January, 2016 10:44:43
>>> Subject: Re: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout
>> 
>>> Hi Andrei,
>>> 
>>> Missed that mail, sorry. I created a PR that allows for longer timeouts [1].
>>> 
>>> Also, you can bump the router.aggregation.command.each.timeout global 
>>> setting to
>>> say 15-30 so it will allow to boot.
>>> 
>>> Next, we need to find why it takes so long in the first place. In our
>>> environment it at least starts now.
>>> 
>>> Regards,
>>> Remi
>>> 
>>> [1] https://github.com/apache/cloudstack/pull/1291
>>> 
>>> 
>>> 
>>> 
>>> 
>>>> On 04/01/16 11:41, "Andrei Mikhailovsky" <and...@arhont.com> wrote:
>>>> 
>>>> Hello guys,
>>>> 
>>>> Tried the user's mailing list without any luck. Perhaps the dev guys know 
>>>> if
>>>> this issue is being looked at for the next release?
>>>> 
>>>> I've just upgraded to 4.6.2 and have similar issues with three virtual 
>>>> routers
>>>> out of 22 in total. They are all failing exactly the same way as described
>>>> here.
>>>> 
>>>> Has anyone found a permanent workaround for this issue?
>>>> 
>>>> Thanks
>>>> 
>>>> Andrei
>>>> 
>>>> - Original Message -
>>>>> From: "Stephan Seitz" <s.se...@secretresearchfacility.com>
>>>>> To: "users" <us...@cloudstack.apache.org>
>>>>> Sent: Monday, 30 November, 2015 19:53:57
>>>>> Subject: Re: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout
>>>> 
>>>>> Does anybody else experience problems due to (very) slow deployment of
>>>>> VRs?
>>>>> 
>>>>> 
>>>>> Am Dienstag, den 24.11.2015, 16:31 +0100 schrieb Stephan Seitz:
>>>>>> Update / FYI:
>>>>>> After faking the particular VRu in sql, I tried to restart that
>>>>>> network,
>>>>>> and it always fails. To me it looks like the update_config.py - which
>>>>>> takes almost all cpu ressources - runs way longer any watchdog will
>>>>>> accept.
>>>>>> 
>>>>>> I'm able to mitigate that by very nasty workarounds:
>>>>>> a) start the router
>>>>>> b) wait until its provisioned
>>>>>> c) restart cloudstack-management
>>>>>> d)  update vm_instance
>>>>>> set state='Running',
>>>>>> power_state='PowerOn' where name = 'r-XXX-VM';
>>>>>> e) once: update domain_router
>>>>>> set template_version="Cloudstack Release 4.6.0 Wed Nov 4 08:22:47 UTC
>>>>>> 2015",
>>>>>> scripts_version="546c9e7ac38e0aa16ecc498899dac8e2"
>>>>>> where id=XXX;
>>>>>> f) wait until update_config.py finishes (for me thats about 15
>>>>>> minutes)
>>>>>> 
>>>>>> Since I expect the need for VR restarts in the future, this behaviour
>>>>>> is
>>>>>> somehow unsatisfying. It needs a lot of error-prone intervention.
>>>>>> 
>>>>>> I'm quite unsure if it's introduced with the update or the particular
>>>>>> VR just has simply not been restarted after getting configured with
>>>>>> lots of ips and rules.
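
To answer the rebuild question above: it depends on which files PR1291 touches - Java sources require rebuilding the packages, while script-only changes can be applied in place. A minimal sketch, assuming a source checkout (paths are placeholders; GitHub serves a .diff for any pull request):

    cd ~/src/cloudstack                  # hypothetical checkout of your release branch
    wget https://github.com/apache/cloudstack/pull/1291.diff
    git apply --stat 1291.diff           # list the files the PR touches
    git apply 1291.diff                  # or: patch -p1 < 1291.diff
    # if any .java files changed, rebuild and redeploy the packages;
    # if only scripts changed, copying the patched scripts over can suffice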

Re: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout

2016-01-04 Thread Andrei Mikhailovsky
Hi Remi,

Thanks for your reply. However, your suggestion of increasing the 
router.aggregation.command.each.timeout didn't help. I've tried setting the 
value to 120 to no avail. Still fails with the same error.

Andrei

- Original Message -
> From: "Remi Bergsma" <rberg...@schubergphilis.com>
> To: "dev" <dev@cloudstack.apache.org>
> Sent: Monday, 4 January, 2016 10:44:43
> Subject: Re: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout

> Hi Andrei,
> 
> Missed that mail, sorry. I created a PR that allows for longer timeouts [1].
> 
> Also, you can bump the router.aggregation.command.each.timeout global setting 
> to
> say 15-30 so it will allow to boot.
> 
> Next, we need to find why it takes so long in the first place. In our
> environment it at least starts now.
> 
> Regards,
> Remi
> 
> [1] https://github.com/apache/cloudstack/pull/1291
> 
> 
> 
> 
> 
> On 04/01/16 11:41, "Andrei Mikhailovsky" <and...@arhont.com> wrote:
> 
>>Hello guys,
>>
>>Tried the user's mailing list without any luck. Perhaps the dev guys know if
>>this issue is being looked at for the next release?
>>
>>I've just upgraded to 4.6.2 and have similar issues with three virtual routers
>>out of 22 in total. They are all failing exactly the same way as described
>>here.
>>
>>Has anyone found a permanent workaround for this issue?
>>
>>Thanks
>>
>>Andrei
>>
>>- Original Message -
>>> From: "Stephan Seitz" <s.se...@secretresearchfacility.com>
>>> To: "users" <us...@cloudstack.apache.org>
>>> Sent: Monday, 30 November, 2015 19:53:57
>>> Subject: Re: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout
>>
>>> Does anybody else experience problems due to (very) slow deployment of
>>> VRs?
>>> 
>>> 
>>> Am Dienstag, den 24.11.2015, 16:31 +0100 schrieb Stephan Seitz:
>>>> Update / FYI:
>>>> After faking the particular VRu in sql, I tried to restart that
>>>> network,
>>>> and it always fails. To me it looks like the update_config.py - which
>>>> takes almost all cpu resources - runs way longer than any watchdog will
>>>> accept.
>>>> 
>>>> I'm able to mitigate that by very nasty workarounds:
>>>> a) start the router
>>>> b) wait until its provisioned
>>>> c) restart cloudstack-management
>>>> d)  update vm_instance
>>>> set state='Running',
>>>> power_state='PowerOn' where name = 'r-XXX-VM';
>>>> e) once: update domain_router
>>>> set template_version="Cloudstack Release 4.6.0 Wed Nov 4 08:22:47 UTC
>>>> 2015",
>>>> scripts_version="546c9e7ac38e0aa16ecc498899dac8e2"
>>>> where id=XXX;
>>>> f) wait until update_config.py finishes (for me thats about 15
>>>> minutes)
>>>> 
>>>> Since I expect the need for VR restarts in the future, this behaviour
>>>> is
>>>> somehow unsatisfying. It needs a lot of error-prone intervention.
>>>> 
>>>> I'm quite unsure if it's introduced with the update or the particular
>>>> VR
>>>> just has simply not been restarted after getting configured with lots
>>>> of
>>>> ips and rules.
>>>> 
>>>> 
>>>> Am Dienstag, den 24.11.2015, 12:29 +0100 schrieb Stephan Seitz:
>>>> > Hi List!
>>>> > 
>>>> > After upgrading from 4.5.2 to 4.6.0 I faced a problem with one
>>>> > virtualrouter. This particular VR has about 10 IPs w/ LB and FW
>>>> > rules
>>>> > defined. During the upgrade process, and after about 4-5 minutes a
>>>> > watchdog kicks in and kills the respective VR due to no response.
>>>> > 
>>>> > So far I didn't find any timeout value in the global settings.
>>>> > Temporarily setting network.router.EnableServiceMonitoring to false
>>>> > doesn't change the behaviour.
>>>> > 
>>>> > Any help, how to mitigate that nasty timeout would be really
>>>> > appreciated :)
>>>> > 
>>>> > cheers,
>>>> > 
>>>> > Stephan
>>>> > 
>>>> > From within the VR, the logs show
>>>> > 
>>>> > 2015-11-24 11:24:33,807  CsFile.py search:123 Searching for
>>>> > dhcp-range=interface:eth0,set:interface and replacing with
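
As Remi notes above, router.aggregation.command.each.timeout can be bumped through the API as well as the UI; a minimal sketch using CloudMonkey (endpoint and value are examples - pick a timeout that fits your largest router config):

    cloudmonkey set url http://mgmt-server:8080/client/api   # hypothetical endpoint
    cloudmonkey update configuration \
        name=router.aggregation.command.each.timeout value=30
    # restart cloudstack-management afterwards if the setting does not
    # take effect dynamically on your version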

Fwd: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout

2016-01-04 Thread Andrei Mikhailovsky
Hello guys,

Tried the user's mailing list without any luck. Perhaps the dev guys know if 
this issue is being looked at for the next release?

I've just upgraded to 4.6.2 and have similar issues with three virtual routers 
out of 22 in total. They are all failing exactly the same way as described here.

Has anyone found a permanent workaround for this issue?

Thanks

Andrei

- Original Message -
> From: "Stephan Seitz" 
> To: "users" 
> Sent: Monday, 30 November, 2015 19:53:57
> Subject: Re: upgrading 4.5.2 -> 4.6.0 virtualrouter upgrade timeout

> Does anybody else experience problems due to (very) slow deployment of
> VRs?
> 
> 
> Am Dienstag, den 24.11.2015, 16:31 +0100 schrieb Stephan Seitz:
>> Update / FYI:
>> After faking the particular VRu in sql, I tried to restart that
>> network,
>> and it always fails. To me it looks like the update_config.py - which
>> takes almost all cpu resources - runs way longer than any watchdog will
>> accept.
>> 
>> I'm able to mitigate that by very nasty workarounds:
>> a) start the router
>> b) wait until its provisioned
>> c) restart cloudstack-management
>> d)  update vm_instance
>> set state='Running',
>> power_state='PowerOn' where name = 'r-XXX-VM';
>> e) once: update domain_router
>> set template_version="Cloudstack Release 4.6.0 Wed Nov 4 08:22:47 UTC
>> 2015",
>> scripts_version="546c9e7ac38e0aa16ecc498899dac8e2"
>> where id=XXX;
>> f) wait until update_config.py finishes (for me thats about 15
>> minutes)
>> 
>> Since I expect the need for VR restarts in the future, this behaviour
>> is
>> somehow unsatisfying. It needs a lot of error-prone intervention.
>> 
>> I'm quite unsure if it's introduced with the update or the particular
>> VR
>> just has simply not been restarted after getting configured with lots
>> of
>> ips and rules.
>> 
>> 
>> Am Dienstag, den 24.11.2015, 12:29 +0100 schrieb Stephan Seitz:
>> > Hi List!
>> > 
>> > After upgrading from 4.5.2 to 4.6.0 I faced a problem with one
>> > virtualrouter. This particular VR has about 10 IPs w/ LB and FW
>> > rules
>> > defined. During the upgrade process, and after about 4-5 minutes a
>> > watchdog kicks in and kills the respective VR due to no response.
>> > 
>> > So far I didn't find any timeout value in the global settings.
>> > Temporarily setting network.router.EnableServiceMonitoring to false
>> > doesn't change the behaviour.
>> > 
>> > Any help, how to mitigate that nasty timeout would be really
>> > appreciated :)
>> > 
>> > cheers,
>> > 
>> > Stephan
>> > 
>> > From within the VR, the logs show
>> > 
>> > 2015-11-24 11:24:33,807  CsFile.py search:123 Searching for
>> > dhcp-range=interface:eth0,set:interface and replacing with
>> > dhcp-range=interface:eth0,set:interface-eth0,10.10.22.1,static
>> > 2015-11-24 11:24:33,808  merge.py load:56 Creating data bag type
>> > guestnetwork
>> > 2015-11-24 11:24:33,808  CsFile.py search:123 Searching for
>> > dhcp-option=tag:interface-eth0,15 and replacing with
>> > dhcp-option=tag:interface-eth0,15,heinlein.cloudservice
>> > 2015-11-24 11:24:33,808  CsFile.py search:123 Searching for
>> > dhcp-option=tag:interface-eth0,6 and replacing with
>> > dhcp-option=tag:interface
>> > -eth0,6,10.10.22.1,195.10.208.2,91.198.250.2
>> > 2015-11-24 11:24:33,809  CsFile.py search:123 Searching for
>> > dhcp-option=tag:interface-eth0,3, and replacing with
>> > dhcp-option=tag:interface-eth0,3,10.10.22.1
>> > 2015-11-24 11:24:33,809  CsFile.py search:123 Searching for
>> > dhcp-option=tag:interface-eth0,1, and replacing with
>> > dhcp-option=tag:interface-eth0,1,255.255.255.0
>> > 2015-11-24 11:24:33,810  CsHelper.py execute:160 Executing: service
>> > dnsmasq restart
>> > 
>> > ==> /var/log/messages <==
>> > Nov 24 11:24:34 r-504-VM shutdown[6752]: shutting down for system
>> > halt
>> > 
>> > Broadcast message from root@r-504-VM (Tue Nov 24 11:24:34 2015):
>> > 
>> > The system is going down for system halt NOW!
>> > Nov 24 11:24:35 r-504-VM KVP: KVP starting; pid is:6844
>> > 
>> > ==> /var/log/cloud.log <==
>> > /opt/cloud/bin/vr_cfg.sh: line 60:  6603
>> > Killed  /opt/cloud/bin/update_config.py
>> > vm_dhcp_entry.json
>> > 
>> > ==> /var/log/messages <==
>> > Nov 24 11:24:35 r-504-VM cloud: VR config: executing
>> > failed: /opt/cloud/bin/update_config.py vm_dhcp_entry.json
>> > 
>> > ==> /var/log/cloud.log <==
>> > Tue Nov 24 11:24:35 UTC 2015 : VR config: executing
>> > failed: /opt/cloud/bin/update_config.py vm_dhcp_entry.json
>> > Connection to 169.254.2.192 closed by remote host.
>> > Connection to 169.254.2.192 closed.
>> > 
>> > 
>> > the management-server.log shows
>> > 
>> > 2015-11-24 12:24:43,015 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> > (Work-Job-Executor-1:ctx-ad9e4658 job-5163/job-5164) Done executing
>> > com.cloud.vm.VmWorkStart for job-5164
>> > 2015-11-24 12:24:43,017 INFO  [o.a.c.f.j.i.AsyncJobMonitor]
>> > (Work-Job-Executor-1:ctx-ad9e4658 job-5163/job-5164) 
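
Stephan's a)-f) recovery steps above boil down to two SQL statements; a consolidated sketch (the router name/id, credentials and version strings are the examples from this thread - substitute the values matching your own 4.6 systemvm template):

    # run on the management server after steps a)-c) above
    mysql -u cloud -p cloud <<'SQL'
    UPDATE vm_instance
       SET state = 'Running', power_state = 'PowerOn'
     WHERE name = 'r-XXX-VM';
    UPDATE domain_router
       SET template_version = 'Cloudstack Release 4.6.0 Wed Nov 4 08:22:47 UTC 2015',
           scripts_version  = '546c9e7ac38e0aa16ecc498899dac8e2'
     WHERE id = XXX;
    SQL
    # then wait for update_config.py on the router to finish (step f)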

Re: [4.6] Can't create template or volume from snapshot

2015-10-25 Thread Andrei Mikhailovsky
I am actually having issues with the 4.5.2 branch when creating templates from
a snapshot on Ceph rbd primary and nfs secondary storage.

Some people reported this working, but it just doesn't work for me. While
creating a template from a volume or snapshot I hit a timeout error after
about 5-10 mins, which is oddly short.

Andrei 
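
The operation-specific wait values that govern such timeouts live in the cloud.configuration table; a quick way to inspect them before tuning (a sketch - the exact setting names vary by CloudStack version):

    mysql -u cloud -p cloud -e \
      "SELECT name, value FROM configuration
        WHERE name LIKE '%wait%' OR name LIKE '%timeout%' ORDER BY name;"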
- Original Message -

From: "Mike Tutkowski"  
To: dev@cloudstack.apache.org 
Sent: Friday, 23 October, 2015 9:04:35 PM 
Subject: Re: [4.6] Can't create template or volume from snapshot 

4.5 should be OK. I tested this kind of stuff back then and didn't notice 
anything. 

Let me make sure I'm running with the most recent system VM template for 
4.6. 

Prior to using your PR, I was not able to deploy a VM to local storage on 
VMware. 

With your PR, I was able to perform such a deployment, but I still saw 
exceptions in the console. 

Both with and without your PR, I was not able to deploy a VM using managed 
storage with VMware. The only way I can get this to work right now is if I 
set a breakpoint in VMwareGuru and make sure the CopyCommand is sent to the 
VMware server resource inside of the management server (as opposed to that 
command going to the SSVM). 

On Friday, October 23, 2015, Wei ZHOU  wrote: 

> Hi Mike, 
> 
> Does it work without this commit? I want to know if it is caused by this 
> commit. 
> Moreover, does it work on cloudstack 4.5 ? 
> 
> 
> 2015-10-23 21:31 GMT+02:00 Mike Tutkowski  >: 
> 
> > I just tried it, though, with managed storage and it doesn't work. Same 
> > error of sending the CopyCommand to the wrong server. 
> > 
> > On Fri, Oct 23, 2015 at 1:25 PM, Mike Tutkowski < 
> > mike.tutkow...@solidfire.com > wrote: 
> > 
> > > Hi Wei, 
> > > 
> > > So, I am able to spin up a VM using local storage now on VMware with 
> your 
> > > PR; however, I still see the following exceptions thrown when I look at 
> > the 
> > > CS MS console: 
> > > 
> > > INFO [c.c.v.VirtualMachineManagerImpl] 
> (Work-Job-Executor-6:ctx-6046512a 
> > > job-263/job-264 ctx-d61972a5) Unable to contact resource. 
> > > com.cloud.exception.StorageUnavailableException: Resource 
> > [StoragePool:22] 
> > > is unreachable: Unable to create Vol[43|vm=31|ROOT]:Unsupported command 
> > > issued: org.apache.cloudstack.storage.command.CopyCommand. Are you 
> sure 
> > > you got the right type of server? 
> > > at 
> > > 
> > 
> org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.recreateVolume(VolumeOrchestrator.java:1278)
>  
> > > at 
> > > 
> > 
> org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.prepare(VolumeOrchestrator.java:1336)
>  
> > > at 
> > > 
> > 
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1000)
>  
> > > at 
> > > 
> > 
> com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4576)
>  
> > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> > > at 
> > > 
> > 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
> > > at 
> > > 
> > 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> > > at java.lang.reflect.Method.invoke(Method.java:606) 
> > > at 
> > > 
> > 
> com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107)
>  
> > > at 
> > > 
> > 
> com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4732)
>  
> > > at 
> > > com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102) 
> > > at 
> > > 
> > 
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:537)
>  
> > > at 
> > > 
> > 
> org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>  
> > > at 
> > > 
> > 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>  
> > > at 
> > > 
> > 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>  
> > > at 
> > > 
> > 
> org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>  
> > > at 
> > > 
> > 
> org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>  
> > > at 
> > > 
> > 
> org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:494)
>  
> > > at 
> > > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> > > at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> > > at 
> > > 
> > 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  
> > > at 
> > > 
> > 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  
> > > at java.lang.Thread.run(Thread.java:744) 
> > 

slow nfs = reboot all hosts (((

2015-10-09 Thread Andrei Mikhailovsky
Hello 

My issue is that whenever my nfs server becomes slow to respond, ACS just
bloody reboots ALL host servers, not just the ones running vms with volumes
attached to the slow nfs server. Recently, I've decided to remove some of the
old snapshots to free up some disk space. I've deleted about a dozen snapshots
and I was monitoring the nfs server for progress. At no point did the nfs
server lose connectivity; it just became a bit slow and under load. By slow I
mean I was still able to list files on the nfs mount point and the ssh session
was still working okay. It was just taking a few more seconds to respond when
it came to nfs file listings, creation, deletion, etc. However, the ACS agent
has just rebooted every single host server, killing all running guests and
system vms. In my case, I only have two guests with volumes on the nfs server.
The rest of the vms are running off rbd storage. Yet, all host servers were
rebooted, even those which were not running guests with nfs volumes.

Ever since I've started using ACS, it has always been pretty dumb at correctly
determining whether the nfs storage is still alive. I would say it has done
the maniac reboot-everything type of behaviour at least 5 times in the past 3
years. So, in the previous versions of ACS I've just modified kvmheartbeat.sh
and hashed out the line with "reboot", as these reboots were just pissing
everyone off.

After upgrading to ACS 4.5.x that script has no reboot command and I was 
wondering if it is still possible to instruct the kvmheartbeat script not to 
reboot the host servers? 

Thanks for your advice. 

Andrei 
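
As the replies below note, in 4.5.x the hard reset moved from a plain reboot to a sysrq trigger inside the heartbeat script. A sketch of the line in question (the path may differ by distro/version; disabling it also disables the fencing side of HA, so use with care):

    # /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh
    # the fencing branch forces an immediate reset with:
    #     echo b > /proc/sysrq-trigger
    # commenting that line out stops the agent from hard-resetting the host:
    #     # echo b > /proc/sysrq-trigger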


Re: slow nfs = reboot all hosts (((

2015-10-09 Thread Andrei Mikhailovsky
I think there should be as much REISUB as possible when trying to reboot a
broken server. Doing only the last B bit is a bit dangerous imho.

Andrei 
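
In concrete terms, the S-U-B tail of REISUB syncs and remounts read-only before the final reset; a sketch of that gentler fencing sequence (still a hard stop for any running guests):

    echo s > /proc/sysrq-trigger   # emergency sync of all filesystems
    echo u > /proc/sysrq-trigger   # remount all filesystems read-only
    sleep 5                        # give the sync a moment to complete
    echo b > /proc/sysrq-trigger   # immediate reboot, no clean shutdown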
- Original Message -

From: . "Nux!" <n...@li.nux.ro> 
To: dev@cloudstack.apache.org 
Sent: Friday, 9 October, 2015 6:53:43 PM 
Subject: Re: slow nfs = reboot all hosts ((( 

Andrei, 

Yes, that command will just reboot without flushing anything to disk, like 
cutting power. 
It is done this way because many servers are slow to respond to normal reboot
commands under load, if they respond at all; this could lead to corrupted data
and so on.
The sysrq switch is a much better choice from this pov. 

We really need to look at a proper way of doing HA with KVM. 

-- 
Sent from the Delta quadrant using Borg technology! 

Nux! 
www.nux.ro 

- Original Message - 
> From: "Andrei Mikhailovsky" <and...@arhont.com> 
> To: dev@cloudstack.apache.org 
> Sent: Friday, 9 October, 2015 16:47:46 
> Subject: Re: slow nfs = reboot all hosts ((( 

> Thanks guys, I am not sure how I've missed that. Probably the coffee didn't
> kick in yet )))
> 
> Anyway, am I right in saying that the host server reboot is now forced
> without stopping the services, unmounting filesystems with potentially open
> and unsynced data, etc?
> 
> Isn't this rather bad and dangerous to perform simply because one of
> possibly many nfs servers is slow or unresponsive? Not only that, the
> heartbeat also reboots servers that are not running vms with nfs volumes.
> In my case it just rebooted every single host server.
> 
> Very worrying indeed. 
> 
> Andrei 
> 
> 
> - Original Message - 
> 
> From: "Nux!" <n...@li.nux.ro> 
> To: dev@cloudstack.apache.org 
> Sent: Friday, 9 October, 2015 12:58:19 PM 
> Subject: Re: slow nfs = reboot all hosts ((( 
> 
> Hello, 
> 
> Instead of commenting 'echo b > /proc/sysrq-trigger' and also disabling your 
> HA 
> at the same time, perhaps there's a way to tweak the timeouts to be more 
> generous with lazy NFS servers. 
> 
> Can you go through the logs and see what is happening before the reboot? I am 
> not sure exactly which timeout the script cares about, worth investigating. 
> 
> Lucian 
> 
> -- 
> Sent from the Delta quadrant using Borg technology! 
> 
> Nux! 
> www.nux.ro 
> 
> - Original Message - 
>> From: "Andrija Panic" <andrija.pa...@gmail.com> 
>> To: dev@cloudstack.apache.org 
>> Sent: Friday, 9 October, 2015 10:25:05 
>> Subject: Re: slow nfs = reboot all hosts ((( 
> 
>> I managed this problem the following way: 
>> http://admintweets.com/cloudstack-disable-agent-rebooting-kvm-host/ 
>> 
>> Cheers 
>> On Oct 9, 2015 10:21 AM, "Andrei Mikhailovsky" <and...@arhont.com> wrote: 
>> 
>>> Hello 
>>> 
>>> My issue is whenever my nfs server becomes slow to respond, ACS just 
>>> bloody reboots ALL host servers, not just the ones running vms with 
>>> volumes attached to the slow nfs server. Recently, i've decided to remove 
>>> some of the old snapshots to free up some disk space. I've deleted about a 
>>> dozen snapshots and I was monitoring the nfs server for progress. At no 
>>> point did the nfs server lose connectivity, it just became a bit slow 
>>> and under load. By slow I mean i was still able to list files on the nfs 
>>> mount point and the ssh session was still working okay. It was just taking 
>>> a few more seconds to respond when it comes to nfs file listings, creation, 
>>> deletion, etc. However, the ACS agent has just rebooted every single host 
>>> server, killing all running guests and system vms. In my case, I only have 
>>> two guests with volumes on the nfs server. The rest of the vms are running 
>>> off rbd storage. Yet, all host servers were rebooted, even those which were 
>>> not running guests with nfs volumes. 
>>> 
>>> Ever since i've started using ACS, it was always pretty dumb in correctly 
>>> determining if the nfs storage is still alive. I would say it has done the 
>>> maniac reboot everything type of behaviour at least 5 times in the past 3 
>>> years. So, in the previous versions of ACS i've just modified the 
>>> kvmheartbeat.sh and hashed out the line with "reboot" as these reboots were 
>>> just pissing everyone off. 
>>> 
>>> After upgrading to ACS 4.5.x that script has no reboot command and I was 
>>> wondering if it is still possible to instruct the kvmheartbeat script not 
>>> to reboot the host servers? 
>>> 
>>> Thanks for your advice. 
>>> 
> >> Andrei 



Re: slow nfs = reboot all hosts (((

2015-10-09 Thread Andrei Mikhailovsky
Thanks guys, I am not sure how I've missed that. Probably the coffee didn't
kick in yet )))

Anyway, am I right in saying that the host server reboot is now forced without
stopping the services, unmounting filesystems with potentially open and
unsynced data, etc?

Isn't this rather bad and dangerous to perform simply because one of possibly
many nfs servers is slow or unresponsive? Not only that, the heartbeat also
reboots servers that are not running vms with nfs volumes. In my case it just
rebooted every single host server.

Very worrying indeed. 

Andrei 


- Original Message -

From: "Nux!" <n...@li.nux.ro> 
To: dev@cloudstack.apache.org 
Sent: Friday, 9 October, 2015 12:58:19 PM 
Subject: Re: slow nfs = reboot all hosts ((( 

Hello, 

Instead of commenting 'echo b > /proc/sysrq-trigger' and also disabling your HA 
at the same time, perhaps there's a way to tweak the timeouts to be more 
generous with lazy NFS servers. 

Can you go through the logs and see what is happening before the reboot? I am 
not sure exactly which timeout the script cares about, worth investigating. 

Lucian 

-- 
Sent from the Delta quadrant using Borg technology! 

Nux! 
www.nux.ro 

- Original Message - 
> From: "Andrija Panic" <andrija.pa...@gmail.com> 
> To: dev@cloudstack.apache.org 
> Sent: Friday, 9 October, 2015 10:25:05 
> Subject: Re: slow nfs = reboot all hosts ((( 

> I managed this problem the following way: 
> http://admintweets.com/cloudstack-disable-agent-rebooting-kvm-host/ 
> 
> Cheers 
> On Oct 9, 2015 10:21 AM, "Andrei Mikhailovsky" <and...@arhont.com> wrote: 
> 
>> Hello 
>> 
>> My issue is whenever my nfs server becomes slow to respond, ACS just 
>> bloody reboots ALL host servers, not just the ones running vms with 
>> volumes attached to the slow nfs server. Recently, i've decided to remove 
>> some of the old snapshots to free up some disk space. I've deleted about a 
>> dozen snapshots and I was monitoring the nfs server for progress. At no 
>> point did the nfs server lose connectivity, it just became a bit slow 
>> and under load. By slow I mean i was still able to list files on the nfs 
>> mount point and the ssh session was still working okay. It was just taking 
>> a few more seconds to respond when it comes to nfs file listings, creation, 
>> deletion, etc. However, the ACS agent has just rebooted every single host 
>> server, killing all running guests and system vms. In my case, I only have 
>> two guests with volumes on the nfs server. The rest of the vms are running 
>> off rbd storage. Yet, all host servers were rebooted, even those which were 
>> not running guests with nfs volumes. 
>> 
>> Ever since i've started using ACS, it was always pretty dumb in correctly 
>> determining if the nfs storage is still alive. I would say it has done the 
>> maniac reboot everything type of behaviour at least 5 times in the past 3 
>> years. So, in the previous versions of ACS i've just modified the 
>> kvmheartbeat.sh and hashed out the line with "reboot" as these reboots were 
>> just pissing everyone off. 
>> 
>> After upgrading to ACS 4.5.x that script has no reboot command and I was 
>> wondering if it is still possible to instruct the kvmheartbeat script not 
>> to reboot the host servers? 
>> 
>> Thanks for your advice. 
>> 
>> Andrei 



Re: [Feature] Cloudstack KVM with RBD

2015-05-28 Thread Andrei Mikhailovsky
+1 for this 
- Original Message -

From: Logan Barfield lbarfi...@tqhosting.com 
To: dev@cloudstack.apache.org 
Sent: Thursday, 28 May, 2015 3:48:09 PM 
Subject: Re: [Feature] Cloudstack KVM with RBD 

Hi Star, 

I'll +1 this. I would like to see support for RBD snapshots as well, 
and maybe have a method to backup the snapshots to secondary 
storage. Right now for large volumes it can take an hour or more to 
finish the snapshot. 

I have already discussed this with Wido, and was able to determine 
that even without using native RBD snapshots we could improve the copy 
time by saving the snaps as thin volumes instead of full raw files. 
Right now the snapshot code when using RBD specifically converts the 
volumes to a full raw file, whereas saving as a qcow2 image would use 
less space. When restoring a snapshot the code currently specifically 
indicates the source image as being a raw file, but if we change the 
code to not indicate the source image type, qemu-img should 
automatically detect it. We just need to see if that's the case with 
all of the supported versions of libvirt/qemu before submitting a pull 
request. 

Thank You, 

Logan Barfield 
Tranquil Hosting 


On Wed, May 27, 2015 at 9:18 PM, Star Guo st...@ceph.me wrote: 
 Hi everyone, 
 
 Since I have tested cloudstack 4.4.2 + kvm + rbd, deploying an instance is 
 fast, apart from the first deployment, because the template is copied from 
 secondary storage (NFS) to primary storage (RBD). That is no problem. 
 However, when I do some volume operations, such as create snapshot, create 
 template, template deploy etc., they also take some time to finish because 
 data is copied between primary storage and secondary storage. 
 So I think that if we support the same rbd as secondary storage, and use 
 the ceph COW feature, it may reduce the time to just some seconds. (OpenStack 
 can make glance and cinder use the same rbd) 
 
 Best Regards, 
 Star Guo 
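
The raw-versus-qcow2 difference Logan describes is easy to demonstrate with qemu-img; a sketch, assuming a qemu build with rbd support (pool and image names are placeholders):

    qemu-img convert -O raw   rbd:cloudstack/volume-1 snap.raw    # fully allocated copy
    qemu-img convert -O qcow2 rbd:cloudstack/volume-1 snap.qcow2  # thin copy
    qemu-img info snap.qcow2   # the format is auto-detected on restore, so the
                               # source type need not be hardcoded in the code path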
 



Re: ACS 4.5.1 KVM live migration problem

2015-05-19 Thread Andrei Mikhailovsky


Hi guys, 

Coming back to the problem with live migration. I've done some more testing and 
I think there is an issue (probably introduced since 4.4.x). 

I have manually set the vlan://number for the broadcast and isolation _uri 
values in the data base. This has indeed solved the migration problem. I am 
able to migrate vm after making the change. 

However, a bigger problem has surfaced. After stopping the vm, I am no longer 
able to start it, even though I've not had any issues stopping/starting the vm 
prior to making the db change. I've also noticed that after the vm is stopped, the 
value of both broadcast and isolation URIs is reset back to NULL. Not sure if 
this is the expected behaviour or not. 

Could someone help me with getting to the bottom of this issue? 

Thanks 

Andrei 
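
For reference, the manual change described above amounts to an UPDATE on the nics table; a sketch using the IDs from this thread (back up the database first, and take the vlan from the network the NIC actually belongs to):

    mysql -u cloud -p cloud -e "
      UPDATE nics
         SET isolation_uri = 'vlan://1151', broadcast_uri = 'vlan://1151'
       WHERE instance_id = 664 AND removed IS NULL;"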

- Original Message -

From: Andrija Panic andrija.pa...@gmail.com 
To: dev@cloudstack.apache.org 
Cc: us...@cloudstack.apache.org 
Sent: Friday, 15 May, 2015 2:01:30 PM 
Subject: Re: ACS 4.5.1 KVM live migration problem 

Ok, but since they are guest NICs, it confuses me - this is an advanced zone 
with vlan, right? Then to my understanding all NICs (of a user VM) need to 
have some isolation method... 

Anyway - I'm running an advanced zone + vlans, and all VMs (VMs behind VPC 
and VMs on the internet/public network - but still that's Guest network) - 
still all of them have some vlan://x value. 

For VR, SSVM, CPVM - there are NICs on the ACS public network that don't use 
vlan - they have vlan://untagged, and NULL is only used for the LinkLocal 
(169.x) NICs, and for the mgmt/sec-storage NICs of SSVM/CPVM in my case. 



On 15 May 2015 at 13:47, Andrei Mikhailovsky and...@arhont.com wrote: 

 Andrija, 
 
 I've run the command and it showed me a bunch of running vms with NULLs. I 
 would roughly say about 20% of my total running vms do have NULL under the 
 isolation and broadcast URIs. 
 
 All of these vms are working perfectly well (in terms of network 
 connectivity) and there is nothing special about them. They all have at 
 least one guest NIC. 
 
 Andrei 
 - Original Message - 
 
 From: Andrija Panic andrija.pa...@gmail.com 
 To: dev@cloudstack.apache.org 
 Cc: us...@cloudstack.apache.org 
 Sent: Friday, 15 May, 2015 12:34:24 PM 
 Subject: Re: ACS 4.5.1 KVM live migration problem 
 
 Andrei, 
 
 select instance_id,isolation_uri,broadcast_uri from nics where instance_id 
 in (select id from vm_instance where state='Running' and name not like 
 'r-%' and name not like 'v-%' and name not like 's-%') order by 
 instance_id; 
 
 This gives me every NIC that does not belong to a router or SSVM/CPVM. I 
 always have vlan values - since these are all Guest NICs - they must have a 
 vlan ID... 
 NULL values are only present when a VM is deleted/stopped in my case... 
 
 Can you check your VM 664 - what is so specific about it? 
 All NICs (in my understanding, if this is an advanced zone) must have some 
 vlan, and cannot be NULL or untagged? 
 
 On 15 May 2015 at 12:58, Andrei Mikhailovsky and...@arhont.com wrote: 
 
  
  
  Hi Andrija, Marcus, 
  
  Thanks for your comments and suggestions. I've checked the cloud.nics 
 table 
  
  mysql> select instance_id,isolation_uri,broadcast_uri from nics where 
  instance_id=564 or instance_id=664 or instance_id=; 
  +-------------+---------------+---------------+ 
  | instance_id | isolation_uri | broadcast_uri | 
  +-------------+---------------+---------------+ 
  |         564 | vlan://96     | vlan://96     | 
  |         664 | NULL          | NULL          | 
  |             | vlan://1127   | vlan://1127   | 
  +-------------+---------------+---------------+ 
  
  
  From my tests, instance_ids 564 and  are migrating correctly, but 
  instance 664 is not, and is showing the npe similar to the one i've given. 
  
  
  Is this what is causing the migration issues? If so, should i change all 
  isolation_uri and broadcast_uri to the corresponding network vlan ids? 
  
  Thanks 
  
  Andrei 
  
  - Original Message - 
  
  From: Andrija Panic andrija.pa...@gmail.com 
  To: dev@cloudstack.apache.org 
  Sent: Thursday, 14 May, 2015 4:00:07 PM 
  Subject: Re: Fwd: ACS 4.5.1 KVM live migration problem 
  
  That would probably be a bug that I had...but we updated main VLAN table 
  with change URI or something... Marcus saved me that time :) 
  Andrei, please provide more info and the info Marcus said, I will try to 
  compare my values with yours if of any help. 
  
  On 14 May 2015 at 16:56, Marcus shadow...@gmail.com wrote: 
  
   So, I vaguely remember an issue introduced a little over a year ago 
 where 
   the broadcast domain value of the nic was changed from a URI to just a 
  vlan 
   ID, which worked for vlans but broke vxlan and some other things. If I 
   remember correctly, there would be a small set of installs during this 
   period that wouldn't have created their nics with the correct broadcast 
   domain value. I don't remember which versions were doing this but I do 
  know 
   there's a JIRA ticket and a paper trail on how people were fixing it.

Re: ACS 4.5.1 KVM live migration problem

2015-05-19 Thread Andrei Mikhailovsky
Okay, I think i got to the bottom of the problem. 

For some reason, after performing a manual db change of the 
isolation/broadcast_uri values in db, the virtual router, responsible for the 
network that was attached to this particular vm, has stopped responding and 
wasn't giving out dhcp leases. After performing the network restart with the 
clean up option, I am able to start vms one again. 

After starting the vms, their isolation/broadcast_uri values are populated 
properly. 

Andrei 



- Original Message -

From: Andrei Mikhailovsky and...@arhont.com 
To: dev@cloudstack.apache.org 
Cc: us...@cloudstack.apache.org 
Sent: Tuesday, 19 May, 2015 9:36:05 AM 
Subject: Re: ACS 4.5.1 KVM live migration problem 



Hi guys, 

Coming back to the problem with live migration. I've done some more testing and 
I think there is an issue (probably introduced since 4.4.x). 

I have manually set the vlan://number for the broadcast and isolation _uri 
values in the data base. This has indeed solved the migration problem. I am 
able to migrate vm after making the change. 

However, a bigger problem has surfaced. After stopping the vm, I am no longer 
able to start it, even though I've not had any issues stopping/starting the vm 
prior to making the db change. I've also noticed that after the vm is stopped, the 
value of both broadcast and isolation URIs is reset back to NULL. Not sure if 
this is the expected behaviour or not. 

Could someone help me with getting to the bottom of this issue? 

Thanks 

Andrei 

- Original Message - 

From: Andrija Panic andrija.pa...@gmail.com 
To: dev@cloudstack.apache.org 
Cc: us...@cloudstack.apache.org 
Sent: Friday, 15 May, 2015 2:01:30 PM 
Subject: Re: ACS 4.5.1 KVM live migration problem 

Ok, but since they are guest NICs, it confuses me - this is an advanced zone 
with vlan, right? Then to my understanding all NICs (of a user VM) need to 
have some isolation method... 

Anyway - I'm running an advanced zone + vlans, and all VMs (VMs behind VPC 
and VMs on the internet/public network - but still that's Guest network) - 
still all of them have some vlan://x value. 

For VR, SSVM, CPVM - there are NICs on the ACS public network that don't use 
vlan - they have vlan://untagged, and NULL is only used for the LinkLocal 
(169.x) NICs, and for the mgmt/sec-storage NICs of SSVM/CPVM in my case. 



On 15 May 2015 at 13:47, Andrei Mikhailovsky and...@arhont.com wrote: 

 Andrija, 
 
 I've run the command and it showed me a bunch of running vms with NULLs. I 
 would roughly say about 20% of my total running vms do have NULL under the 
 isolation and broadcast URIs. 
 
 All of these vms are working perfectly well (in terms of network 
 connectivity) and there is nothing special about them. They all have at 
 least one guest NIC. 
 
 Andrei 
 - Original Message - 
 
 From: Andrija Panic andrija.pa...@gmail.com 
 To: dev@cloudstack.apache.org 
 Cc: us...@cloudstack.apache.org 
 Sent: Friday, 15 May, 2015 12:34:24 PM 
 Subject: Re: ACS 4.5.1 KVM live migration problem 
 
 Andrei, 
 
 select instance_id,isolation_uri,broadcast_uri from nics where instance_id 
 in (select id from vm_instance where state='Running' and name not like 
 'r-%' and name not like 'v-%' and name not like 's-%') order by 
 instance_id; 
 
 This gives me every NIC that does not belong to a router or SSVM/CPVM. I 
 always have vlan values - since these are all Guest NICs - they must have a 
 vlan ID... 
 NULL values are only present when a VM is deleted/stopped in my case... 
 
 Can you check your VM 664 - what is so specific about it? 
 All NICs (in my understanding, if this is an advanced zone) must have some 
 vlan, and cannot be NULL or untagged? 
 
 On 15 May 2015 at 12:58, Andrei Mikhailovsky and...@arhont.com wrote: 
 
  
  
  Hi Andrija, Marcus, 
  
  Thanks for your comments and suggestions. I've checked the cloud.nics 
 table 
  
  mysql> select instance_id,isolation_uri,broadcast_uri from nics where 
  instance_id=564 or instance_id=664 or instance_id=; 
  +-------------+---------------+---------------+ 
  | instance_id | isolation_uri | broadcast_uri | 
  +-------------+---------------+---------------+ 
  |         564 | vlan://96     | vlan://96     | 
  |         664 | NULL          | NULL          | 
  |             | vlan://1127   | vlan://1127   | 
  +-------------+---------------+---------------+ 
  
  
  From my tests, instance_ids 564 and  are migrating correctly, but 
  instance 664 is not, and is showing the npe similar to the one i've given. 
  
  
  Is this what is causing the migration issues? If so, should i change all 
  isolation_uri and broadcast_uri to the corresponding network vlan ids? 
  
  Thanks 
  
  Andrei 
  
  - Original Message - 
  
  From: Andrija Panic andrija.pa...@gmail.com 
  To: dev@cloudstack.apache.org 
  Sent: Thursday, 14 May, 2015 4:00:07 PM 
  Subject: Re: Fwd: ACS 4.5.1 KVM live migration problem 
  
  That would probably be a bug that I had...but we updated main VLAN table 
 with change URI or something... Marcus saved me that time :)

Re: ACS 4.5.1 KVM live migration problem

2015-05-15 Thread Andrei Mikhailovsky


Hi Andrija, Marcus, 

Thanks for your comments and suggestions. I've checked the cloud.nics table 

mysql> select instance_id,isolation_uri,broadcast_uri from nics where 
instance_id=564 or instance_id=664 or instance_id=; 
+-------------+---------------+---------------+ 
| instance_id | isolation_uri | broadcast_uri | 
+-------------+---------------+---------------+ 
|         564 | vlan://96     | vlan://96     | 
|         664 | NULL          | NULL          | 
|             | vlan://1127   | vlan://1127   | 
+-------------+---------------+---------------+ 


From my tests, instance_ids 564 and  are migrating correctly, but instance 
664 is not ans showing the npe similar to the one i've given. 


Is this what is causing the migration issues? If so, should I change all 
isolation_uri and broadcast_uri to the corresponding network vlan ids? 

Thanks 

Andrei 

- Original Message -

From: Andrija Panic andrija.pa...@gmail.com 
To: dev@cloudstack.apache.org 
Sent: Thursday, 14 May, 2015 4:00:07 PM 
Subject: Re: Fwd: ACS 4.5.1 KVM live migration problem 

That would probably be a bug that I had...but we updated main VLAN table 
with change URI or something... Marcus saved me that time :) 
Andrei, please provide more info and the info Marcus said, I will try to 
compare my values with yours if of any help. 

On 14 May 2015 at 16:56, Marcus shadow...@gmail.com wrote: 

 So, I vaguely remember an issue introduced a little over a year ago where 
 the broadcast domain value of the nic was changed from a URI to just a vlan 
 ID, which worked for vlans but broke vxlan and some other things. If I 
 remember correctly, there would be a small set of installs during this 
 period that wouldn't have created their nics with the correct broadcast 
 domain value. I don't remember which versions were doing this but I do know 
 there's a JIRA ticket and a paper trail on how people were fixing it. The 
 code that broke the URI was backed out. VMs created with the bad code would 
 not be compatible with the new or the old versions of code. 
 
 I was under the impression at the time that there was some SQL provided to 
 update the values during an upgrade, perhaps that never made it in, or 
 somehow got skipped during your upgrade process. At any rate, since there 
 is a null pointer on broadcast domain type, you may check your 
 nics/networks the MySQL db and verify that the broadcast/isolation types 
 are URI format and not just a number. Or try to find the bug I'm referring 
 to from around April last year. 
 On May 14, 2015 5:04 AM, Andrei Mikhailovsky and...@arhont.com wrote: 
 
  Hi guys, 
  
   Forwarding the message to the dev list as I've not had much reply on the 
   users list. 
   
   In summary: after upgrading from ACS 4.4.2 to 4.5.1 I started having 
   migration issues with a lot of vms. Some vms are successfully migrating 
   and others are not. 
  
  The logs are shown below 
  
  could someone help me to get to the bottom of this problem? 
  
  Thanks 
  
  Andrei 
  
  
  
  - Forwarded Message - 
  From: Andrei Mikhailovsky and...@arhont.com 
  To: us...@cloudstack.apache.org 
  Sent: Wednesday, 13 May, 2015 10:44:29 AM 
  Subject: Re: ACS 4.5.1 KVM live migration problem 
  
  Hi Rohit, 
  
  forgot to answer you on the cloud.vlan table. 
  
  That particular vm has a network with vlan id 1151 as shown when i look 
 at 
  the network details in the acs gui. However, this vlan is not shown in 
 the 
  cloud.vlan table. From what I can see the cloud.vlan table shows only the 
  public and management network vlan interfaces and does not show the guest 
  network vlans. 
  
  In terms of the public network vlan which is used for routing traffic to 
  the internet from this particular vm, it is: 
  
  
  mysql> select * from vlan where id=12\G 
  *************************** 1. row *************************** 
                   id: 12 
                 uuid: d13ea4b3-2087-4376-9d0a-f54efe2a55af 
              vlan_id: vlan://2030 
         vlan_gateway: 178.XXX.XXX.1 
         vlan_netmask: 255.255.255.128 
          description: 178.XXX.XXX.2-178.XXX.XXX.119 
            vlan_type: VirtualNetwork 
       data_center_id: 1 
           network_id: 200 
  physical_network_id: 200 
          ip6_gateway: NULL 
             ip6_cidr: NULL 
            ip6_range: NULL 
              removed: NULL 
              created: NULL 
  1 row in set (0.00 sec)

Re: ACS 4.5.1 KVM live migration problem

2015-05-15 Thread Andrei Mikhailovsky
Andrija, 

I've run the command and it showed me a bunch of running vms with NULLs. I 
would roughly say about 20% of my total running vms do have NULL under the 
isolation and broadcast URIs. 

All of these vms are working perfectly well (in terms of network connectivity) 
and there is nothing special about them. They all have at least one guest NIC. 

Andrei 
- Original Message -

From: Andrija Panic andrija.pa...@gmail.com 
To: dev@cloudstack.apache.org 
Cc: us...@cloudstack.apache.org 
Sent: Friday, 15 May, 2015 12:34:24 PM 
Subject: Re: ACS 4.5.1 KVM live migration problem 

Andrei, 

select instance_id,isolation_uri,broadcast_uri from nics where instance_id 
in (select id from vm_instance where state='Running' and name not like 
'r-%' and name not like 'v-%' and name not like 's-%') order by instance_id; 

This gives me every NIC that does not belong to a router or SSVM/CPVM. I 
always have vlan values - since these are all Guest NICs - they must have a 
vlan ID... 
NULL values are only present when a VM is deleted/stopped in my case... 

Can you check your VM 664 - what is so specific about it? 
All NICs (in my understanding, if this is an advanced zone) must have some 
vlan, and cannot be NULL or untagged? 

On 15 May 2015 at 12:58, Andrei Mikhailovsky and...@arhont.com wrote: 

 
 
 Hi Andrija, Marcus, 
 
 Thanks for your comments and suggestions. I've checked the cloud.nics table 
 
 mysql> select instance_id,isolation_uri,broadcast_uri from nics where 
 instance_id=564 or instance_id=664 or instance_id=; 
 +-------------+---------------+---------------+ 
 | instance_id | isolation_uri | broadcast_uri | 
 +-------------+---------------+---------------+ 
 |         564 | vlan://96     | vlan://96     | 
 |         664 | NULL          | NULL          | 
 |             | vlan://1127   | vlan://1127   | 
 +-------------+---------------+---------------+ 
 
 
 From my tests, instance_ids 564 and  are migrating correctly, but 
 instance 664 is not, and is showing the npe similar to the one i've given. 
 
 
 Is this what is causing the migration issues? If so, should i change all 
 isolation_uri and broadcast_uri to the corresponding network vlan ids? 
 
 Thanks 
 
 Andrei 
 
 - Original Message - 
 
 From: Andrija Panic andrija.pa...@gmail.com 
 To: dev@cloudstack.apache.org 
 Sent: Thursday, 14 May, 2015 4:00:07 PM 
 Subject: Re: Fwd: ACS 4.5.1 KVM live migration problem 
 
 That would probably be a bug that I had...but we updated main VLAN table 
 with change URI or something... Marcus saved me that time :) 
 Andrei, please provide more info and the info Marcus said, I will try to 
 compare my values with yours if of any help. 
 
 On 14 May 2015 at 16:56, Marcus shadow...@gmail.com wrote: 
 
  So, I vaguely remember an issue introduced a little over a year ago where 
  the broadcast domain value of the nic was changed from a URI to just a 
 vlan 
  ID, which worked for vlans but broke vxlan and some other things. If I 
  remember correctly, there would be a small set of installs during this 
  period that wouldn't have created their nics with the correct broadcast 
  domain value. I don't remember which versions were doing this but I do 
 know 
  there's a JIRA ticket and a paper trail on how people were fixing it. The 
  code that broke the URI was backed out. VMs created with the bad code 
 would 
  not be compatible with the new or the old versions of code. 
  
  I was under the impression at the time that there was some SQL provided 
 to 
  update the values during an upgrade, perhaps that never made it in, or 
  somehow got skipped during your upgrade process. At any rate, since there 
  is a null pointer on broadcast domain type, you may check your 
  nics/networks the MySQL db and verify that the broadcast/isolation types 
  are URI format and not just a number. Or try to find the bug I'm 
 referring 
  to from around April last year. 
  On May 14, 2015 5:04 AM, Andrei Mikhailovsky and...@arhont.com 
 wrote: 
  
   Hi guys, 
   
   Forwarding the message to the dev list as I've not had much reply on the 
   users list. 
   
   In summary: after upgrading from ACS 4.4.2 to 4.5.1 I started having 
   migration issues with a lot of vms. Some vms are successfully migrating 
   and others are not. 
   
   The logs are shown below 
   
   could someone help me to get to the bottom of this problem? 
   
   Thanks 
   
   Andrei 
   
   
   
   - Forwarded Message - 
   From: Andrei Mikhailovsky and...@arhont.com 
   To: us...@cloudstack.apache.org 
   Sent: Wednesday, 13 May, 2015 10:44:29 AM 
   Subject: Re: ACS 4.5.1 KVM live migration problem 
   
   Hi Rohit, 
   
   forgot to answer you on the cloud.vlan table. 
   
   That particular vm has a network with vlan id 1151 as shown when i look 
  at 
   the network details in the acs gui. However, this vlan is not shown in 
  the 
   cloud.vlan table. From what I can see the cloud.vlan table shows only the 
   public and management network vlan interfaces and does not show the guest 
   network vlans.

Fwd: ACS 4.5.1 KVM live migration problem

2015-05-14 Thread Andrei Mikhailovsky
Hi guys,

Forwarding the message to the dev list as I've not had much reply on the users
list.

In summary: after upgrading from ACS 4.4.2 to 4.5.1 I started having migration
issues with a lot of vms. Some vms are successfully migrating and others are
not.

The logs are shown below  

Could someone help me get to the bottom of this problem?

Thanks

Andrei



- Forwarded Message -
From: Andrei Mikhailovsky and...@arhont.com
To: us...@cloudstack.apache.org
Sent: Wednesday, 13 May, 2015 10:44:29 AM
Subject: Re: ACS 4.5.1 KVM live migration problem

Hi Rohit, 

forgot to answer you on the cloud.vlan table. 

That particular vm has a network with vlan id 1151 as shown when I look at the 
network details in the acs gui. However, this vlan is not shown in the 
cloud.vlan table. From what I can see the cloud.vlan table shows only the 
public and management network vlan interfaces and does not show the guest 
network vlans. 

In terms of the public network vlan which is used for routing traffic to the 
internet from this particular vm, it is: 


mysql> select * from vlan where id=12\G 
*************************** 1. row *************************** 
                 id: 12 
               uuid: d13ea4b3-2087-4376-9d0a-f54efe2a55af 
            vlan_id: vlan://2030 
       vlan_gateway: 178.XXX.XXX.1 
       vlan_netmask: 255.255.255.128 
        description: 178.XXX.XXX.2-178.XXX.XXX.119 
          vlan_type: VirtualNetwork 
     data_center_id: 1 
         network_id: 200 
physical_network_id: 200 
        ip6_gateway: NULL 
           ip6_cidr: NULL 
          ip6_range: NULL 
            removed: NULL 
            created: NULL 
1 row in set (0.00 sec) 


Hope that helps 

Andrei 
- Original Message -

From: Rohit Yadav rohit.ya...@shapeblue.com 
To: us...@cloudstack.apache.org 
Sent: Wednesday, 13 May, 2015 8:55:55 AM 
Subject: Re: ACS 4.5.1 KVM live migration problem 

Hi Andrei, 

This looks like an issue similar to 
https://issues.apache.org/jira/browse/CLOUDSTACK-6893 
Can you share the row from your cloud.vlan table and the value of "select 
cache_mode from volume_view where vm_id=<put the vm id here>\G" for the VM 
causing the NPE? 

 On 12-May-2015, at 10:51 pm, Andrei Mikhailovsky and...@arhont.com wrote: 
 
 
 
 It seems that the problem is worse than I've initially thought. In fact, I 
 can't migrate most of my vms apart from a handful, and I can't determine a 
 correlation between the migratable vms and the ones that produce the exception. 
 
 Thanks for any help. 
 
 Andrei 
 
 - Original Message - 
 
 From: Andrei Mikhailovsky and...@arhont.com 
 To: us...@cloudstack.apache.org 
 Sent: Tuesday, 12 May, 2015 8:53:16 PM 
 Subject: ACS 4.5.1 KVM live migration problem 
 
 Hi, 
 
 I am having an issue migrating some of vms after recently upgrading to ACS 
 4.5.1. I am running Ubuntu 14.04 on both host and management servers. Here is 
 the output from the log file on a client agent : 
 
 
 2015-05-12 20:42:34,154 DEBUG [kvm.resource.LibvirtComputingResource] 
 (agentRequest-Handler-1:null) Preparing host for migrating 
 com.cloud.agent.api.to.VirtualMachineTO@21a038ac 
 2015-05-12 20:42:34,157 DEBUG [kvm.resource.LibvirtConnection] 
 (agentRequest-Handler-1:null) can't find connection: KVM, for vm: 
 i-9-1162-VM, continue 
 2015-05-12 20:42:34,159 DEBUG [kvm.resource.LibvirtConnection] 
 (agentRequest-Handler-1:null) can't find connection: LXC, for vm: 
 i-9-1162-VM, continue 
 2015-05-12 20:42:34,159 DEBUG [kvm.resource.LibvirtConnection] 
 (agentRequest-Handler-1:null) can't find which hypervisor the vm used , then 
 use the default hypervisor 
 2015-05-12 20:42:34,160 DEBUG [kvm.resource.BridgeVifDriver] 
 (agentRequest-Handler-1:null) nic=[Nic:Guest-178.248.108.205-vlan://2014] 
 2015-05-12 20:42:34,160 DEBUG [kvm.resource.BridgeVifDriver] 
 (agentRequest-Handler-1:null) creating a vNet dev and bridge for guest 
 traffic per traffic label cloudstackbr0 
 2015-05-12 20:42:34,160 DEBUG [kvm.resource.BridgeVifDriver] 
 (agentRequest-Handler-1:null) Executing: 
 /usr/share/cloudstack-common/scripts/vm/network/vnet/modifyvlan.sh -v 2014 -p 
 bond0 -b brbond0-2014 -o add 
 2015-05-12 20:42:34,211 DEBUG [kvm.resource.BridgeVifDriver] 
 (agentRequest-Handler-1:null) Execution is successful. 
 2015-05-12 20:42:34,211 DEBUG [kvm.resource.BridgeVifDriver] 
 (agentRequest-Handler-1:null) nic=[Nic:Guest-10.1.1.66-null

Re: Unable to mount Secondary Storage on SSVM

2015-05-08 Thread Andrei Mikhailovsky

Srini,

you need to make sure the nfs server is configured properly to allow access 
from the SSVM; as you can see, the access is denied. Please check that the nfs 
server supports the same protocol version as the ssvm is requesting. Most 
likely, it's nfs v3.

Check logs on the nfs server to verify why it is denying access.

Once this is fixed, you should be good to go.

Andrei
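
A minimal sketch of those checks, using the share from the error below (the export network and options are examples - adjust to your environment):

    # on the NFS server: export the share to the SSVM's network, e.g. in /etc/exports:
    #   /vS02304090GCSP_NAS07  172.30.36.0/24(rw,async,no_root_squash,no_subtree_check)
    exportfs -ra

    # from the SSVM: test the mount explicitly with NFSv3
    mkdir -p /mnt/test
    mount -t nfs -o vers=3 172.30.36.51:/vS02304090GCSP_NAS07 /mnt/test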

- Original Message -
From: srinivas niddapu sr...@axiomio.com
To: Rohit Yadav rohit.ya...@shapeblue.com, dev@cloudstack.apache.org
Cc: us...@cloudstack.apache.org
Sent: Friday, 8 May, 2015 2:03:10 PM
Subject: RE: Unable to mount Secondary Storage on SSVM

Appreciate the info, Rohit.

As we verified our NFS storage, there are no permission restrictions (* FULL 
ACCESS).
Validated the same NFS share on the Cloud Stack Hypervisors, already mounted 
and data visible.

We tried to mount the NFS volume in the SSVM manually but it's throwing an 
error. Unable to mount.
mount.nfs: access denied by server while mounting

While restoring snapshot in the CloudStack UI getting below error.

Status
Failed to create templatecom.cloud.utils.exception.CloudRuntimeException: 
GetRootDir for nfs://172.30.36.51/vS02304090GCSP_NAS07 failed due to 
com.cloud.utils.exception.CloudRuntimeException: Unable to mount 
172.30.36.51:/vS02304090GCSP_NAS07 at 
/mnt/SecStorage/1c7f122c-e72e-3daa-a54a-3693b89d4015 due to mount.nfs: access 
denied by server while mounting 172.30.36.51:/vS02304090GCSP_NAS07 at 
org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.getRootDir(NfsSecondaryStorageResource.java:1956)
 at 
org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.copySnapshotToTemplateFromNfsToNfsXenserver(NfsSecondaryStorageResource.java:377)
 at 
org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.copySnapshotToTemplateFromNfsToNfs(NfsSecondaryStorageResource.java:444)
 at 
org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.createTemplateFromSnapshot(NfsSecondaryStorageResource.java:553)
 at 
org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.execute(NfsSecondaryStorageResource.java:632)
 at 
org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.executeRequest(NfsSecondaryStorageResource.java:236)
 at 
com.cloud.storage.resource.PremiumSecondaryStorageResource.defaultAction(PremiumSecondaryStorageResource.java:63)
 at 
com.cloud.storage.resource.PremiumSecondaryStorageResource.executeRequest(PremiumSecondaryStorageResource.java:59)
 at com.cloud.agent.Agent.processRequest(Agent.java:498) at 
com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:806) at 
com.cloud.utils.nio.Task.run(Task.java:83) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
at java.lang.Thread.run(Thread.java:679)

Any suggestions.

Thanks,
Srini.

-Original Message-
From: Rohit Yadav [mailto:rohit.ya...@shapeblue.com] 
Sent: Friday, May 08, 2015 5:57 PM
To: dev@cloudstack.apache.org
Cc: us...@cloudstack.apache.org; srinivas niddapu
Subject: Re: Unable to mount Secondary Storage on SSVM


 On 08-May-2015, at 2:05 pm, anil lakineni anilkumar459.lakin...@gmail.com 
 wrote:

 and this Secondary Storage path is mounting and working with all other 
 servers except with the SSVM. Getting error "mount.nfs: access denied by 
 server while mounting xx.xx.xx.xx:/"

Check your nfs exports file and do a chmod 777 on the mount points, such as 
/export/secondary or /export/primary.

Regards,
Rohit Yadav
Software Architect, ShapeBlue
M. +91 88 262 30892 | rohit.ya...@shapeblue.com
Blog: bhaisaab.org | Twitter: @_bhaisaab




Next ACS release?

2015-04-21 Thread Andrei Mikhailovsky
Hello guys, 

Looking at the dev and user lists it is becoming less certain if version 4.5.x 
is ever coming out. It seems like a few months have passed since the not so 
fortunate release of 4.5.0 and I can't find a release schedule for the 4.5.1, 
which seems to have stopped at the rc2 stage and hasn't progressed further to a 
release stage. 

Are we likely to see any progress with the 4.5.x branch or is the community 
switching towards the 4.6.x branch without releasing the 4.5.x? 

I am a bit unclear as there are no release dates, schedules or deadlines that 
the community should work with. Possibly as a result of this, the ACS releases 
are not being released on time or fast enough. 

Does it make sense to introduce release schedules for ACS that the dev 
community should stick to? Similar to what is being done in many other 
projects, like Ubuntu, etc. Or would this break the ACS project releases even 
more? 

Andrei 


Re: Next ACS release?

2015-04-21 Thread Andrei Mikhailovsky
Ilya, Mark, thanks for your feedback, 

I also see the need to restructure the release schedule for ACS as the current 
release cycles are not really working. There is no _reliable_ release cycle of 
the product and as we have recently seen with the 4.5 branch, the release did 
not happen for months and it is still not clear when this will take place. In 
my (I must admit somewhat limited) experience if there are no deadlines, 
developers are not keen on releases and the releases are likely to be delayed. 
This is what we've seen with the past ACS releases: they are overdue by many 
months. 

The community might get a much better response if there is a much shorter 
release cycle, even if it means pushing out fewer features with each release. At 
least some features will get completed, tested and implemented in a set time 
frame. I would rather see a release cycle of every 3-4 months with 5 new 
features than a release with 15 new features which may or may not get released 
every 9 - 12 months. 

By any means, please comment if someone disagrees or thinks there is a better 
alternative. 

Andrei 
- Original Message -

 From: ilya ilya.mailing.li...@gmail.com
 To: dev@cloudstack.apache.org
 Sent: Tuesday, 21 April, 2015 7:30:34 PM
 Subject: Re: Next ACS release?

 Andrei,

 To best of my knowledge, both 4.4.x and 4.5.x are being worked on
 actively. As a community, we need to get better on QA of each release
 -
 this is something we are planning to cover this year with distributed
 QA
 model, this was not widely discussed yet but something we need to
 tackle.

 4.5 rc2 got stalled and we need to restart. We had a 4 month release
 cycle, but we can't really stick to it hard - as it's community driven.
 We may have to revise it down to 6 months or so.

 Regards
 ilya

 On 4/21/15 1:26 AM, Andrei Mikhailovsky wrote:
  Hello guys,
 
  Looking at the dev and user lists it is becoming less certain if
  version 4.5.x is ever coming out. It seems like a few months have
  passed since the not so fortunate release of 4.5.0 and I can't
  find a release schedule for the 4.5.1, which seems to have stopped
   at rc2 stage and hasn't progressed further to a release stage.
 
  Are we likely to see any progress with the 4.5.x branch or is the
  community switching towards the 4.6.x branch without releasing the
  4.5.x?
 
   I am a bit unclear as there are no release dates, schedules or
   deadlines that the community should work with. Possibly as a result of
  this, the ACS releases are not being released on time or fast
  enough.
 
  Does it make sense to introduce release schedules for ACS that the
  dev community should stick to? Similar to what is being done in
  many other projects, like Ubuntu, etc. Or would this break the ACS
  project releases even more?
 
  Andrei
 


Re: Cloudstack and KVM clusters,

2015-04-01 Thread Andrei Mikhailovsky
I would highly recommend looking at Ceph storage instead of stacking so many 
tiers of complexity. Ceph integrates well with KVM and CloudStack and has 
proven to work very well over the years. 

Andrei 
- Original Message -

 From: chiu ching cheng ccchiou...@gmail.com
 To: us...@cloudstack.apache.org
 Cc: dev@cloudstack.apache.org
 Sent: Wednesday, 1 April, 2015 2:31:42 AM
 Subject: Re: Cloudstack and KVM clusters,

 If I want to build a KVM native cluster with GFS2 + DLM, and use iSCSI +
 DRBD for storage,
 
 then add the KVM cluster to CloudStack, and add the SharedMountPoint to
 CloudStack as primary storage - does it work?

 On Wed, Apr 1, 2015 at 6:12 AM, Marcus shadow...@gmail.com wrote:

  Don't forget SharedMountPoint. This (in theory, haven't tried it
  recently) allows you to use any clustered filesystem that has a
  consistent mountpoint across all KVM hosts in a CS cluster, e.g.
  mount
  an OCFS2 to /vmstore1 then register /vmstore1 as a
  SharedMountPoint.
 
  The Ceph support is in the form of RBD, by the way. You could use
  CephFS if you wished via SharedMountPoint.
 
  On Tue, Mar 31, 2015 at 2:09 PM, Simon Weller swel...@ena.com
  wrote:
   The hosts need to be part of the same Cloudstack cluster, and
   depending
  on the underlying storage technology, you may need a clustered file
  system
  as well.
  
   A Cloudstack cluster is basically a group of physical hosts.
  
   For example:
  
   You build a new Zone in Cloudstack. Under the zone you have a
   pod.
  Within the pod, you build a new cluster (just a group of hosts).
  Then you
  assigned 4 servers (hosts) into that cluster. You will be able to
  live
  migrate between the 4 hosts assuming the original mentioned
  criteria are
  met.
  
   - Si
  
   
   From: Rafael Weingartner rafaelweingart...@gmail.com
   Sent: Tuesday, March 31, 2015 4:02 PM
   To: dev@cloudstack.apache.org
   Cc: us...@cloudstack.apache.org
   Subject: Re: Cloudstack and KVM clusters,
  
   Thanks Simon,
  
  
   I think I got it.
  
   So, the hosts do not need to be in a cluster to perform the live
  migration.
  
   On Tue, Mar 31, 2015 at 5:59 PM, Simon Weller swel...@ena.com
   wrote:
  
   Rafael,
  
   KVM live migration really relies on whether the underlying
   shared
  storage
   (and file system) supports the ability to provide data
   consistency
  during a
   migration. You never ever want a situation where 2 hosts are
   able to
  mount
   and write to the same volume concurrently.
  
   You can live migrate in KVM today using the following underlying
   file
   systems/methods:
  
   1. NFS
   2. CEPH
   3. Clustered Logical Volume Management (CLVM) on top of SAN
   exposed
   storage via iSCSI,FC or FCOE.
  
   It's also possible to build your own storage driver and set a
   LUN to
  read
   only on a particular host using your SANs API.
  
   Solidfire, Nexenta and Cloudbyte have also added storage drivers
   more
   recently that may provide support for live migration, but as I'm
   not
   personally familiar with these storage platforms, I'll leave it
   up to
   others to comment if they wish.
  
   - Si
  
  
  
  
  
   
   From: Rafael Weingartner rafaelweingart...@gmail.com
   Sent: Tuesday, March 31, 2015 3:36 PM
   To: us...@cloudstack.apache.org; dev@cloudstack.apache.org
   Subject: Cloudstack and KVM clusters,
  
   Hi folks,
  
   I was looking a matrix of Cloudstack compatibility matrix at
   http://pt.slideshare.net/TimMackey/hypervisor-31754727,
  
   Slide 25 seemed to show that we cannot have clusters of KVM in
   CS? Is
  that
   true? Is it possible to live migrate VMs between KVM hosts that
   are not
   clustered in CS?
  
  
   --
   Rafael Weingärtner
  
  
  
  
   --
   Rafael Weingärtner
 


Re: [VOTE] Apache CloudStack 4.5.1-rc1

2015-03-24 Thread Andrei Mikhailovsky
Hello, 

Does anyone have an idea when 4.5.1 is going to be out, including the 
packaged versions? It has been a while since the unfortunate, but well-spotted, 
release attempt of 4.5.0, and I was hoping to see 4.5.1 with fixes 
shortly. 

Cheers 

Andrei 
- Original Message -

 From: Rohit Yadav rohit.ya...@shapeblue.com
 To: dev@cloudstack.apache.org
 Cc: us...@cloudstack.apache.org
 Sent: Friday, 20 March, 2015 7:29:18 AM
 Subject: Re: [VOTE] Apache CloudStack 4.5.1-rc1

 (+ users)

 Hi all,

 I've built signed centos63/centos7/debian repository out of this RC
 for
 your convenience: http://packages.shapeblue.com/cloudstack/testing/

 4.5 systemvm templates:
 http://packages.shapeblue.com/systemvmtemplate/4.5/

 (If you're on 4.5.0. There is no need to upgrade your SystemVM
 templates).

 Happy testing!

 On Friday 20 March 2015 02:06 AM, David Nalley wrote:
  I did indeed - it's 4.5.1 RC - my apologies for not checking the
  template closely enough to eliminate all of the fill in the blank
  spots.
 
  On Thu, Mar 19, 2015 at 11:01 AM, Geoff Higginbottom
  geoff.higginbot...@shapeblue.com wrote:
  Hi David,
 
  You appear to have left x.x.x.x in the e-mail, I think you meant
  to put
  4.5.1 here instead.
 
  Regards
 
  Geoff Higginbottom
  CTO / Cloud Architect
 
  D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +44 7968 161581
 
  geoff.higginbot...@shapeblue.com | www.shapeblue.com |
  Twitter: @cloudstackguru
 
  ShapeBlue Ltd, 53 Chandos Place, Covent Garden, London, WC2N 4HS
 
 
 
 
  On 19/03/2015 15:40, David Nalley da...@gnsa.us wrote:
 
  Hi All,
 
  I've created a X.X.X release, with the following artifacts up for
  a vote:
 
  Git Branch and Commit SH:
  https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;a=shortlog;h=refs
  /heads/4.5-RC20150319T1429
  Commit: 3c06466e208769f32c03767abc6bd2680fd8
 
  Source release (checksums and signatures are available at the
  same
  location):
  https://dist.apache.org/repos/dist/dev/cloudstack/4.5.1-rc1/
 
  PGP release keys (signed using 0x6fe50f1c):
  https://dist.apache.org/repos/dist/release/cloudstack/KEYS
 
  Vote will be open for 72 hours.
 
  For sanity in tallying the vote, can PMC members please be sure
  to
  indicate (binding) with their vote?
 
  [ ] +1 approve
  [ ] +0 no opinion
  [ ] -1 disapprove (and reason why)
 

 --
 Regards,
 Rohit Yadav
 Software Architect, ShapeBlue
 M. +91 8826230892 | rohit.ya...@shapeblue.com
 Blog: bhaisaab.org | Twitter: @_bhaisaab
 PS. If you see any footer below, I did not add it :)

Re: BUG: anybody addressing this one ?

2015-03-18 Thread Andrei Mikhailovsky
Happened to me on several occasions as well. However, I am doubly careful now 
and I double-check the volume names before deleting to make sure I pick the 
right one. 

This little bug is annoying to say the least ))) 

Andrei 

- Original Message -

 From: Andrija Panic andrija.pa...@gmail.com
 To: dev@cloudstack.apache.org
 Sent: Wednesday, 18 March, 2015 11:58:53 AM
 Subject: BUG: anybody addressing this one ?

 Hi,

 https://issues.apache.org/jira/browse/CLOUDSTACK-7926

 Currently I have hit a bug, when I click on some instance, then on
 View
 Volumes, and then I get listed volumes that belong to some other VM -
 it
 already happened to me that I deleted the volumes - because of an ACS
 bug in the GUI!

 So, I suggest considering implementing volume purging the same way
 it is implemented for VMs - so the VM is not really deleted - and
 the purge thread in ACS will actually delete it when it runs...

 --

 Andrija Panić


Re: BUG: anybody addressing this one ?

2015-03-18 Thread Andrei Mikhailovsky
Just done. 

As I've commented in the bug report, similar behaviour happens when you go to 
Infrastructure > Hosts > host > View Instances. Sometimes you get a list of all 
instances, even the stopped ones, instead of only those that run on the 
chosen host. 

Andrei 

- Original Message -

 From: Andrija Panic andrija.pa...@gmail.com
 To: dev@cloudstack.apache.org
 Sent: Wednesday, 18 March, 2015 1:58:35 PM
 Subject: Re: BUG: anybody addressing this one ?

 Andrei, please comment the JIRA with same text if you have time :)
 otherwise...no use here I guess

 Thanks

 On 18 March 2015 at 14:26, Andrei Mikhailovsky and...@arhont.com
 wrote:

  Happened to me on several occasions as well. However, I am double
  careful
  now and I double check the volume names before deleting to make
  sure I do
  the right one.
 
  This little bug is annoying to say the least )))
 
  Andrei
 
  - Original Message -
 
   From: Andrija Panic andrija.pa...@gmail.com
   To: dev@cloudstack.apache.org
   Sent: Wednesday, 18 March, 2015 11:58:53 AM
   Subject: BUG: anybody addressing this one ?
 
   Hi,
 
   https://issues.apache.org/jira/browse/CLOUDSTACK-7926
 
   Currently I have hit a bug, when I click on some instance, then
   on
   View
   Volumes, and then I get listed volumes that belong to some other
   VM -
   it
    already happened to me that I deleted the volumes - because of
   ACS
   bug in
   GUI !
 
   So, I suggest to consider maybe to implement purging volumes the
   same
   way
   it is implemented with VM-s - so the VM is not really deleted -
   and
   the
    purge thread in ACS will actually delete it when it runs...
 
   --
 
   Andrija Panić
 

 --

 Andrija Panić


Re: Disk throttling on KVM - shared storage - works or not?

2015-03-09 Thread Andrei Mikhailovsky
Hi Guys, 

What I have noticed (with version 4.2.1, I think) is that when you attach a new 
disk with a throttle offering to a running VM, the throttling settings are not 
passed to KVM until the VM is restarted from the ACS side. Simply doing the 
reboot within the OS itself doesn't help. However, once the restart is done, 
the throttle settings are passed on and it works okay. 

Andrei 
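
(A quick way to check whether the settings actually reached libvirt - a
sketch; the instance name is illustrative:)

virsh dumpxml i-2-10-VM | grep -A 4 iotune
# once throttling is applied you should see an <iotune> block under <disk>,
# with elements such as <read_iops_sec> and <write_iops_sec>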

- Original Message -

 From: Andrija Panic andrija.pa...@gmail.com
 To: dev@cloudstack.apache.org
 Sent: Monday, 9 March, 2015 3:30:01 PM
 Subject: Re: Disk throttling on KVM - shared storage - works or not?

 Thx guys !
 On Mar 9, 2015 4:10 PM, Wei ZHOU ustcweiz...@gmail.com wrote:

  Hi Andria,
 
  I am sure it works with QEMU 1.2.0+ , of course including Ubuntu
  14.04.
 
  Wei
 
 
  2015-03-09 15:53 GMT+01:00 Andrija Panic andrija.pa...@gmail.com:
 
   Hi Wei,
  
   thanks for info - it is CentOS 6.x and libvirt 1.2.3 (manually
   compiled
   with RBD support) and also custom version of QEMU (original stock
   qemu
  from
   CentOS 6.4, patched by Inktank for RBD support)
  
   So this is the reason I guess ?
  
   Do you know if it works with Ubuntu 14.04 - I plan using this for
   new ACS
   installation?
  
  
   Thanks
  
   On 9 March 2015 at 15:48, Wei ZHOU ustcweiz...@gmail.com wrote:
  
What's the QEMU and libvirt version?
QEMU on CentOS 6.X does not support it.
   
2015-03-09 15:01 GMT+01:00 Andrija Panic
andrija.pa...@gmail.com:
   
 Because it doesn't work on my ACS 4.3 installation with Ceph :)
 
 I checked virsh dumpxml and only throttling on the network level is
 present - unless it wasn't applied because of my QEMU version
 (Inktank's patched version of stock 0.12.1.2, just RBD support added).
 qemu-kvm version problematic perhaps?

 Was it supposed to work with 4.3 also, Wido ?

 Thanks

 On 9 March 2015 at 14:11, Wido den Hollander w...@widodh.nl
 wrote:

 
 
 
  On 03/09/2015 01:57 PM, Andrija Panic wrote:
   Hi,
  
    I'm wondering if disk throttling on ACS 4.4 or later works on shared
    storage (limiting the number of IOPS and bandwidth in the compute
    offering)
  
 
  Why shouldn't it work? The throttling is done inside Qemu
  and works
  just fine with NFS or Ceph.
 
  Wido
 
    I know this does NOT work (shared storage) on ACS 4.3; I'm
    wondering if it is there in 4.4/4.5?
  
   We are using KVM.
  
   Thanks,
  
 



 --

 Andrija Panić

   
  
  
  
   --
  
   Andrija Panić
  
 


Re: [RESULT] [VOTE] Apache CloudStack 4.5.0 RC4

2015-03-05 Thread Andrei Mikhailovsky
Congrats!!! 

- Original Message -

 From: David Nalley da...@gnsa.us
 To: dev@cloudstack.apache.org
 Sent: Thursday, 5 March, 2015 9:12:50 PM
 Subject: [RESULT] [VOTE] Apache CloudStack 4.5.0 RC4

 Hi all,

 After more than 72 hours, the vote for CloudStack 4.5.0 *passes* with
 4 PMC + 2 non-PMC votes.

 +1 (PMC / binding)
 Marcus
 Pierre-Luc
 Mike
 Chip

 +1 (non binding)
 Rohit
 Nux

 0
 none

 -1
 none

 Thanks to everyone participating.

 I will now prepare the release announcement to go out, likely for
 Tuesday.

 --David

 On Mon, Mar 2, 2015 at 11:49 AM, David Nalley da...@gnsa.us wrote:
  Hi All,
 
  I've created yet another 4.5.0 release candidate, with the
  following
  artifacts up for a vote:
 
  Git Branch and Commit SH:
  https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;a=shortlog;h=refs/heads/4.5-RC20150302T1625
  Commit: c066f0455fa126a2b41cccefa25b56421a445d99
 
  Source release (checksums and signatures are available at the same
  location):
  https://dist.apache.org/repos/dist/dev/cloudstack/4.5.0-rc4/
 
  PGP release keys (signed using 0x6fe50f1c):
  https://dist.apache.org/repos/dist/release/cloudstack/KEYS
 
  Vote will be open for at least 72 hours.
 
  For sanity in tallying the vote, can PMC members please be sure to
  indicate (binding) with their vote?
 
  [ ] +1 approve
  [ ] +0 no opinion
  [ ] -1 disapprove (and reason why)


Re: Libvirt RBD caching

2015-02-19 Thread Andrei Mikhailovsky
Ilya, I have followed the instructions on the Ceph website and it worked 
perfectly well. The only addition is to enable RBD caching in ceph.conf. 

Andrei 
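
(For reference, the ceph.conf addition is small - a sketch, using the
settings recommended in the Ceph docs:)

[client]
    rbd cache = true
    rbd cache writethrough until flush = true

The second option keeps caching in writethrough mode until the guest sends
its first flush, which is meant to make the writeback cache safe for guests
that do not send flushes.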

- Original Message -

 From: ilya musayev ilya.mailing.li...@gmail.com
 To: dev@cloudstack.apache.org
 Sent: Thursday, 19 February, 2015 10:21:22 AM
 Subject: Re: Libvirt  RBD caching

 Logan

 Side note: it would help greatly if you could post your notes/guide on
 setting up Ceph as primary storage with CloudStack.
 There aren't any docs out there.

 Thanks
 ilya
 On 2/18/15 12:00 PM, Logan Barfield wrote:
  Our current deployment is KVM with Ceph RBD primary storage. We
  have
  rbd_cache enabled, and use cache=none in Qemu by default.
 
  I've been running some tests to try to figure out why our write
  speeds
  with FreeBSD are significantly lower than Linux. I was testing both
  RBD and local SSD storage, with various cache configurations. Out
  of
  all of them the only one that performed close to our standard Linux
  images was local SSD, Qemu cache=writeback, FreeBSD gpt journal
  enabled.
 
  I've been reading on various lists the reasons and risks for
  cache=none vs cache=writeback:
  - cache=none: Safer for live migration
  - cache=writeback: Ceph RBD docs claim that this is required for
  data
  integrity when using rbd_cache
 
  From what I can tell performance is generally the same with both,
  except in the case of FreeBSD.
 
  What is the current line of thinking on this? Should be using
  'none'
  or 'writeback' with RBD by default? Is 'writeback' considered safe
  for live migration?


Are we far from ACS 4.5.0?

2015-02-17 Thread Andrei Mikhailovsky
Guys, I've seen a lot of activity a few weeks ago with RC1, 2, 3, etc. However, 
it seems to have slowed down somewhat. 

When is 4.5.0 being released? Has it been put on hold for some reason? 

Thanks 

Andrei 



Re: Cloudmonkey question

2015-02-16 Thread Andrei Mikhailovsky
After digging a bit further and getting hints from Rohit, the correct syntax to 
achieve this without using any additional scripting would be: 

list volumes tags[0].key=remote_backup tags[0].value=yes 

This will only list the volumes with the tag remote_backup=yes 
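
For example, to trim the output down to a few fields, the same query can be
combined with cloudmonkey's filter parameter (the field list here is just an
illustration):

list volumes tags[0].key=remote_backup tags[0].value=yes filter=id,name,state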

Thanks for your help and ideas 

Andrei 

- Original Message -

 From: Mike Tutkowski mike.tutkow...@solidfire.com
 To: dev@cloudstack.apache.org
 Sent: Monday, 16 February, 2015 5:34:26 AM
 Subject: Re: Cloudmonkey question

 Yeah, that's true - it does look like that should work.

 On Sunday, February 15, 2015, Ian Duffy i...@ianduffy.ie wrote:

  Assuming I'm reading and understanding the API docs correctly at
 
  http://cloudstack.apache.org/docs/api/apidocs-4.1/root_admin/listVolumes.html
 
  list volumes tags=remote_backup
 
  should list everything with the tag remote_backup
 
  On 15 February 2015 at 23:50, Mike Tutkowski
  mike.tutkow...@solidfire.com javascript:; wrote:
   I believe you'd need to write a script to invoke CloudMonkey to
   execute
  the
   list command and then parse the results of the command (keeping
   only what
   you're interested in).
  
   On Sunday, February 15, 2015, Andrei Mikhailovsky
   and...@arhont.com
  javascript:; wrote:
  
   Hello guys,
  
   I have a silly question; can't really find an answer by
   googling. How
  do I
   use tags when I want to query something. For instance, if I want
   to
  query
   volumes using list volumes command. If i would like to get
   only the
   results containing a certain tag, like a tag with key
   remote_backup and
   value of yes; how would the list volumes command should look
   like?
  
   Thanks
  
   Andrei
  
  
  
   --
   *Mike Tutkowski*
   *Senior CloudStack Developer, SolidFire Inc.*
   e: mike.tutkow...@solidfire.com javascript:;
   o: 303.746.7302
   Advancing the way the world uses the cloud
   http://solidfire.com/solution/overview/?video=play*™*
 

 --
 *Mike Tutkowski*
 *Senior CloudStack Developer, SolidFire Inc.*
 e: mike.tutkow...@solidfire.com
 o: 303.746.7302
 Advancing the way the world uses the cloud
 http://solidfire.com/solution/overview/?video=play*™*


Re: Disable HA temporary ?

2015-02-16 Thread Andrei Mikhailovsky
I had similar issues at least two or three times. The host agent would 
disconnect from the management server. The agent would not reconnect to the 
management server without manual intervention; however, it would happily 
continue running the VMs. The management server would initiate HA and fire 
up VMs which were already running on the disconnected host. I ended up with a 
handful of VMs and virtual routers running on two hypervisors, thus 
corrupting the disks and having all sorts of issues ((( . 

I think there has to be a better way of dealing with this case, at least on an 
image level. Perhaps a host should keep some sort of lock file, or a file for 
every image where it records a time stamp. Something like: 

f5ffa8b0-d852-41c8-a386-6efb8241e2e7 and 
f5ffa8b0-d852-41c8-a386-6efb8241e2e7-timestamp 

Thus, the f5ffa8b0-d852-41c8-a386-6efb8241e2e7 is the name of the disk image 
and f5ffa8b0-d852-41c8-a386-6efb8241e2e7-timestamp is the image's time stamp. 

The hypervisor should record the time stamp in this file while the VM is 
running, say every 5-10 seconds. If the timestamp is old, we can assume 
that the volume is no longer used by the hypervisor. 

When a VM is started, the timestamp file should be checked; if the timestamp 
is recent, the VM should not start, otherwise the VM should start and the 
timestamp file should be regularly updated. 

I am sure there are better ways of doing this, but at least this method should 
not allow two VMs running on different hosts to use the same volume and corrupt 
the data. 
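
(To illustrate the idea, a minimal sketch in Python - the file naming, the
10-second heartbeat and the 30-second staleness threshold are all
illustrative, not an actual CloudStack implementation:)

import os
import time

HEARTBEAT_INTERVAL = 10  # the hypervisor would call heartbeat() this often
STALE_AFTER = 30         # timestamps older than this mean the volume is free

def heartbeat(image_path):
    # called every HEARTBEAT_INTERVAL seconds while the VM is running
    with open(image_path + '-timestamp', 'w') as f:
        f.write(str(time.time()))

def safe_to_start(image_path):
    # called before starting a VM that uses this volume
    ts_file = image_path + '-timestamp'
    if not os.path.exists(ts_file):
        return True
    with open(ts_file) as f:
        last = float(f.read().strip())
    return (time.time() - last) > STALE_AFTER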

In Ceph, as far as I remember, a new feature is being developed to provide a 
locking mechanism for an RBD image. Not sure if this will do the job. 

Andrei 

- Original Message -

 From: Wido den Hollander w...@widodh.nl
 To: dev@cloudstack.apache.org
 Sent: Monday, 16 February, 2015 11:32:13 AM
 Subject: Re: Disable HA temporary ?

 On 16-02-15 11:00, Andrija Panic wrote:
  Hi team,
 
 I just had funny behaviour a few days ago - one of my hosts was under
 heavy load (some disk/network load) and it got disconnected from the
 MGMT server.
 
 Then the MGMT server started doing the HA thing, but without being able
 to make sure that the VMs on the disconnected host were really shut down
 (and they were NOT).
 
 So MGMT started some VMs again on other hosts, resulting in having 2
 copies of the same VM using shared storage - so corruption happened on
 the disk.
 
  Is there a way to temporary disable HA feature on global level, or
  anything
  similar ?

 Not that I'm aware of, but this is something I also ran in to a
 couple
 of times.

 It would indeed be nice if there could be a way to stop the HA
 process
 completely as an Admin.

 Wido

  Thanks
 


Re: Your thoughts on using Primary Storage for keeping snapshots

2015-02-16 Thread Andrei Mikhailovsky
 Wido den Hollander w...@widodh.nl wrote:
 
 
  On 16-02-15 15:38, Logan Barfield wrote:
  I like this idea a lot for Ceph RBD. I do think there
  should
still be
  support for copying snapshots to secondary storage as
  needed
   (for
  transfers between zones, etc.). I really think that
  this
  could
   be
  part of a larger move to clarify the naming
  conventions used
  for
disk
  operations. Currently Volume Snapshots should
  probably
   really be
  called Backups. So having snapshot functionality,
  and a
convert
  snapshot to backup/template would be a good move.
 
 
  I fully agree that this would be a very great addition.
 
  I won't be able to work on this any time soon though.
 
  Wido
 
  Thank You,
 
  Logan Barfield
  Tranquil Hosting
 
 
  On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic 
 andrija.pa...@gmail.com wrote:
  BIG +1
 
  My team should submit some patch to ACS for better
  KVM
   snapshots,
 including
  whole VM snapshot etc...but it's too early to give
  details...
  best
 
  On 16 February 2015 at 13:01, Andrei Mikhailovsky 
and...@arhont.com
 wrote:
 
  Hello guys,
 
  I was hoping to have some feedback from the
  community on the
subject
 of
  having an ability to keep snapshots on the primary
  storage
   where
it
 is
  supported by the storage backend.
 
  The idea behind this functionality is to improve how
  snapshots
are
  currently handled on KVM hypervisors with Ceph
  primary
   storage.
At
 the
  moment, the snapshots are taken on the primary
  storage and
   being
 copied to
  the secondary storage. This method is very slow and
   inefficient
even
 on
  small infrastructure. Even on medium deployments
  using
   snapshots
in
 KVM
  becomes nearly impossible. If you have tens or
  hundreds
concurrent
  snapshots taking place you will have a bunch of
  timeouts and
errors,
 your
  network becomes clogged, etc. In addition, using
  these
   snapshots
for
  creating new volumes or reverting back vms also slow
  and
 inefficient. As
  above, when you have tens or hundreds concurrent
  operations
  it
will
 not
  succeed and you will have a majority of tasks with
  errors or
 timeouts.
 
  At the moment, taking a single snapshot of
  relatively small
volumes
 (200GB
  or 500GB for instance) takes tens if not hundreds of
  minutes.
Taking
 a
  snapshot of the same volume on ceph primary storage
  takes a
   few
 seconds at
  most! Similarly, converting a snapshot to a volume
  takes
  tens
   if
not
  hundreds of minutes when secondary storage is
  involved;
   compared
with
  seconds if done directly on the primary storage.
 
  I suggest that the CloudStack should have the
  ability to
  keep
volume
  snapshots on the primary storage where this is
  supported by
   the
 storage.
  Perhaps having a per primary storage setting that
  enables
  this
  functionality. This will be beneficial for Ceph
  primary
   storage
on
 KVM
  hypervisors and perhaps on XenServer when Ceph will
  be
   supported
in
 a near
  future.
 
  This will greatly speed up the process of using
  snapshots on
   KVM
and
 users
  will actually start using snapshotting rather than
  giving up
   with
  frustration.
 
  I have opened the ticket CLOUDSTACK-8256, so please
  cast
  your
vote
 if you
  are in agreement.
 
  Thanks for your input
 
  Andrei
 
 
 
 
 
 
 
  --
 
  Andrija Panić





Re: Your thoughts on using Primary Storage for keeping snapshots

2015-02-16 Thread Andrei Mikhailovsky
+1 for renaming the Snapshot to something more logical. 

However, for many people Backup implies functionality at a more 
granular level (like the ability to restore individual files, etc.). Not sure 
if Backup is the right term for the current volume Snapshots. 

I agree, there should be the ability to copy snapshots to the secondary 
storage, perhaps even both if one requires it. If someone wants to have a 
backup copy of the snapshot on the secondary storage, they might choose to 
have this option. 

Andrei 
- Original Message -

 From: Logan Barfield lbarfi...@tqhosting.com
 To: dev@cloudstack.apache.org
 Sent: Monday, 16 February, 2015 2:38:00 PM
 Subject: Re: Your thoughts on using Primary Storage for keeping
 snapshots

 I like this idea a lot for Ceph RBD. I do think there should still be
 support for copying snapshots to secondary storage as needed (for
 transfers between zones, etc.). I really think that this could be
 part of a larger move to clarify the naming conventions used for disk
 operations. Currently Volume Snapshots should probably really be
 called Backups. So having snapshot functionality, and a convert
 snapshot to backup/template would be a good move.

 Thank You,

 Logan Barfield
 Tranquil Hosting

 On Mon, Feb 16, 2015 at 9:16 AM, Andrija Panic
 andrija.pa...@gmail.com wrote:
  BIG +1
 
  My team should submit some patch to ACS for better KVM snapshots,
  including
  whole VM snapshot etc...but it's too early to give details...
  best
 
  On 16 February 2015 at 13:01, Andrei Mikhailovsky
  and...@arhont.com wrote:
 
  Hello guys,
 
  I was hoping to have some feedback from the community on the
  subject of
  having an ability to keep snapshots on the primary storage where
  it is
  supported by the storage backend.
 
  The idea behind this functionality is to improve how snapshots are
  currently handled on KVM hypervisors with Ceph primary storage. At
  the
  moment, the snapshots are taken on the primary storage and being
  copied to
  the secondary storage. This method is very slow and inefficient
  even on
  small infrastructure. Even on medium deployments using snapshots
  in KVM
  becomes nearly impossible. If you have tens or hundreds concurrent
  snapshots taking place you will have a bunch of timeouts and
  errors, your
  network becomes clogged, etc. In addition, using these snapshots
  for
  creating new volumes or reverting back vms also slow and
  inefficient. As
  above, when you have tens or hundreds concurrent operations it
  will not
  succeed and you will have a majority of tasks with errors or
  timeouts.
 
  At the moment, taking a single snapshot of relatively small
  volumes (200GB
  or 500GB for instance) takes tens if not hundreds of minutes.
  Taking a
  snapshot of the same volume on ceph primary storage takes a few
  seconds at
  most! Similarly, converting a snapshot to a volume takes tens if
  not
  hundreds of minutes when secondary storage is involved; compared
  with
  seconds if done directly on the primary storage.
 
  I suggest that the CloudStack should have the ability to keep
  volume
  snapshots on the primary storage where this is supported by the
  storage.
  Perhaps having a per primary storage setting that enables this
  functionality. This will be beneficial for Ceph primary storage on
  KVM
  hypervisors and perhaps on XenServer when Ceph will be supported
  in a near
  future.
 
  This will greatly speed up the process of using snapshots on KVM
  and users
  will actually start using snapshotting rather than giving up with
  frustration.
 
  I have opened the ticket CLOUDSTACK-8256, so please cast your vote
  if you
  are in agreement.
 
  Thanks for your input
 
  Andrei
 
 
 
 
 
 
 
  --
 
  Andrija Panić


Your thoughts on using Primary Storage for keeping snapshots

2015-02-16 Thread Andrei Mikhailovsky
Hello guys, 

I was hoping to get some feedback from the community on the subject of having 
the ability to keep snapshots on the primary storage where this is supported by 
the storage backend. 

The idea behind this functionality is to improve how snapshots are currently 
handled on KVM hypervisors with Ceph primary storage. At the moment, snapshots 
are taken on the primary storage and then copied to the secondary storage. 
This method is very slow and inefficient even on a small infrastructure. 
Even on medium deployments, using snapshots in KVM becomes nearly impossible. 
If you have tens or hundreds of concurrent snapshots taking place, you get a 
bunch of timeouts and errors, your network becomes clogged, etc. In addition, 
using these snapshots for creating new volumes or reverting VMs is also slow 
and inefficient. As above, when you have tens or hundreds of concurrent 
operations, they will not succeed and a majority of tasks will end with errors 
or timeouts. 

At the moment, taking a single snapshot of a relatively small volume (200GB or 
500GB, for instance) takes tens if not hundreds of minutes. Taking a snapshot 
of the same volume on Ceph primary storage takes a few seconds at most! 
Similarly, converting a snapshot to a volume takes tens if not hundreds of 
minutes when the secondary storage is involved, compared with seconds if done 
directly on the primary storage. 
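
(For comparison, the snapshot primitives on the Ceph side are near-instant;
a sketch with illustrative pool and volume names:)

rbd snap create cloudstack/volume-1234@snap1     # copy-on-write, seconds
rbd snap protect cloudstack/volume-1234@snap1
rbd clone cloudstack/volume-1234@snap1 cloudstack/volume-5678   # new volume
rbd snap rollback cloudstack/volume-1234@snap1   # revert the volume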

I suggest that CloudStack should have the ability to keep volume snapshots 
on the primary storage where this is supported by the storage, perhaps via a 
per-primary-storage setting that enables this functionality. This will be 
beneficial for Ceph primary storage on KVM hypervisors, and perhaps on 
XenServer once Ceph is supported there in the near future. 

This will greatly speed up the process of using snapshots on KVM, and users 
will actually start using snapshotting rather than giving up in frustration. 

I have opened the ticket CLOUDSTACK-8256, so please cast your vote if you are 
in agreement. 

Thanks for your input 

Andrei 






Cloudmonkey question

2015-02-15 Thread Andrei Mikhailovsky
Hello guys, 

I have a silly question; I can't really find an answer by googling. How do I 
use tags when I want to query something? For instance, suppose I want to query 
volumes using the list volumes command. If I would like to get only the 
results containing a certain tag, like a tag with key remote_backup and value 
of yes, what should the list volumes command look like? 

Thanks 

Andrei 


CloudStack and virtio-scsi support in KVM

2015-02-10 Thread Andrei Mikhailovsky
Hello guys, 

I was wondering what the current state of virtio-scsi support for KVM 
hypervisors is? I couldn't find much by googling, apart from the fact that it 
has already been built into OpenStack. 

The reason for using virtio-scsi instead of virtio-blk would be increasing the 
number of devices you can attach to a VM, but also - and that's what I am 
interested in - the ability to use discard and reclaim unused blocks from the 
backend storage, like Ceph RBD. There is also talk of a performance advantage 
as well. 

Thanks 

Andrei 
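
(For illustration, the kind of libvirt domain XML the agent would need to
emit for this - a sketch, not actual agent output; the pool, volume and
target names are illustrative, and the monitor/auth details are omitted:)

<controller type='scsi' model='virtio-scsi'/>
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
  <source protocol='rbd' name='cloudstack/volume-1234'/>
  <target dev='sda' bus='scsi'/>
</disk>

With bus='scsi' on a virtio-scsi controller, discard='unmap' lets the guest's
TRIM/discard requests reach the RBD backend so unused blocks can be reclaimed.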


Re: CloudStack and virtio-scsi support in KVM

2015-02-10 Thread Andrei Mikhailovsky
CLOUDSTACK-8239 has been created 

Thanks 

Andrei 

- Original Message -

 From: Wido den Hollander w...@widodh.nl
 To: dev@cloudstack.apache.org
 Sent: Tuesday, 10 February, 2015 8:04:16 PM
 Subject: Re: CloudStack and virtio-scsi support in KVM


 On 02/10/2015 08:15 PM, Andrei Mikhailovsky wrote:
  Hello guys,
 
  I was wondering what is the current support for implementing
  virtio-scsi support for KVM hypervisors? I couldn't find much by
  googling apart from it has been built into OpenStack already.
 
  The reason for using virtio-scsi instead of virtio-blk would be
  increasing the number of devices you can attach to a vm, but also,
  and that's what I am interested in is the ability to use discard
  and reclaim unused blocks from the backend storage like ceph rbd.
  There are also talks about having a greater performance advantage
  as well.
 

 Not done yet, but something which crossed my mind. Shouldn't be a
 real
 problem to do however, it's just a small XML change in the Agent.

 Would you mind opening a ticket in Jira?

 Wido

  Thanks
 
  Andrei
 


Re: CloudStack and virtio-scsi support in KVM

2015-02-10 Thread Andrei Mikhailovsky
Wido, sure will do! 

Andrei 

- Original Message -

 From: Wido den Hollander w...@widodh.nl
 To: dev@cloudstack.apache.org
 Sent: Tuesday, 10 February, 2015 8:04:16 PM
 Subject: Re: CloudStack and virtio-scsi support in KVM


 On 02/10/2015 08:15 PM, Andrei Mikhailovsky wrote:
  Hello guys,
 
  I was wondering what is the current support for implementing
  virtio-scsi support for KVM hypervisors? I couldn't find much by
  googling apart from it has been built into OpenStack already.
 
  The reason for using virtio-scsi instead of virtio-blk would be
  increasing the number of devices you can attach to a vm, but also,
  and that's what I am interested in is the ability to use discard
  and reclaim unused blocks from the backend storage like ceph rbd.
  There are also talks about having a greater performance advantage
  as well.
 

 Not done yet, but something which crossed my mind. Shouldn't be a
 real
 problem to do however, it's just a small XML change in the Agent.

 Would you mind opening a ticket in Jira?

 Wido

  Thanks
 
  Andrei
 


Re: [VOTE] Apache CloudStack 4.5.0 RC1

2015-02-05 Thread Andrei Mikhailovsky
Mike, I think dropping Ubuntu 12.04 will not be taken kindly by the community. 
I am not speaking for myself as I've upgraded to 14.04 already, but there are 
still tons of 12.04 installs. 

Andrei 
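
(For anyone hitting the kvmclock issue referenced in the thread below, the
workaround mentioned there is a one-line setting on each KVM host - a sketch,
assuming the stock package paths:)

# /etc/cloudstack/agent/agent.properties
kvmclock.disable=true

# then restart the agent
service cloudstack-agent restart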

- Original Message -

 From: Mike Tutkowski mike.tutkow...@solidfire.com
 To: Rohit Yadav rohit.ya...@shapeblue.com
 Cc: Wilder Rodrigues wrodrig...@schubergphilis.com,
 dev@cloudstack.apache.org, int-toolkit
 int-tool...@schubergphilis.com, int-cloud
 int-cl...@schubergphilis.com
 Sent: Thursday, 5 February, 2015 8:01:38 PM
 Subject: Re: [VOTE] Apache CloudStack 4.5.0 RC1

 Hi everyone,

 So, what do we want to do here? It seems that KVM doesn't work on CS
 4.5
 (due to this kvmclock issue).

 How do we go about collecting enough data points in order to
 determine if
 we can drop Ubuntu 12.04 from the supported list of environments with
 regards to KVM on CS 4.5?

 Thanks,
 Mike

 On Wed, Feb 4, 2015 at 10:58 AM, Mike Tutkowski 
 mike.tutkow...@solidfire.com wrote:

  FYI: From what I observed, I could not get a user VM to start on
  Ubuntu
  12.04.1 using CloudStack 4.6. I think Marcus believes the same
  issue would
  be present using CloudStack 4.5 with Ubuntu 12.04.1.
 
  On Wed, Feb 4, 2015 at 6:29 AM, Rohit Yadav
  rohit.ya...@shapeblue.com
  wrote:
 
  Hi Wilder,
 
  The issue I shared was related to KVM (based on Ubuntu 14.04), I
  think
  as user VMs Ubuntu 12.04 should run fine. Though you can help test
  4.4.2
  to 4.5.0 (use latest 4.5 branch) as well.
 
  I've spent last two weeks only on testing various components and
  upgrade
  scenarios but mostly with KVM, I think now the upgrade process is
  pretty
  smooth with fewer rough edges.
 
  On Wednesday 04 February 2015 05:58 PM, Wilder Rodrigues wrote:
 
  Hi there,
 
  We are currently using 4.4.2 in our production environment and we
  have
  Ubuntu 12.04-5 VMs. That would be crucial to have support to
  Ubuntu
  12.04.
 
  Is there any bug already created that we can have a look and help
  to
  fix? Not being able to upgrade from 4.4.2 to 4.5 wouldn’t be
  cool.
 
  Cheers,
  Wilder
 
 
  On 02 Feb 2015, at 08:53, Mike Tutkowski
  mike.tutkow...@solidfire.com
  mailto:mike.tutkow...@solidfire.com wrote:
 
  Also, just as an FYI, after I upgraded Ubuntu from 12.04 to
  14.04, I
  was able to create a VM using CS 4.6 (as we expected would be
  the case).
 
  As Marcus mentioned, we should try to determine if Ubuntu 12.04
  should
  be a supported platform for CS 4.5.
 
  On Sun, Feb 1, 2015 at 7:12 PM, Mike Tutkowski
  mike.tutkow...@solidfire.com
  mailto:mike.tutkow...@solidfire.com
  wrote:
 
  Thanks, Rohit
 
  As Marcus later commented, it's a compatibility issue with
  Ubuntu
  12.04 (which is what I'm running).
 
  On Sun, Feb 1, 2015 at 6:30 PM, Rohit Yadav
  rohit.ya...@shapeblue.com mailto:rohit.ya...@shapeblue.com
  wrote:
 
  Hi Mike,
 
  I’ve tested 4.5 branch with KVM/Ubuntu 14.04 and local storage
  seems to work for me. I’ve not tested it thoroughly things
  like migration, attaching/detach localstorage disks etc.
 
   On 31-Jan-2015, at 1:10 pm, Mike Tutkowski
  mike.tutkow...@solidfire.com
  mailto:mike.tutkow...@solidfire.com wrote:
  
   Hi everyone,
  
   Any news on this?
  
   I am still having trouble creating a VM on local storage on
  KVM (with or without kvmclock.disable=true in agent.properties).
  
   I'm on Ubuntu 12.04.1.
  
   Thanks!
   Mike
  
   On Sat, Jan 24, 2015 at 2:19 AM, Wilder Rodrigues
  wrodrig...@schubergphilis.com
  mailto:wrodrig...@schubergphilis.com wrote:
   Okay, thanks for the clarification.
  
   I will test it over the weekend.
  
   Cheers,
   Wilder
  
   On 22 Jan 2015, at 15:20, Rohit Yadav
  rohit.ya...@shapeblue.com mailto:rohit.ya...@shapeblue.com
  wrote:
  
Hi Wilder,
   
If you’re testing please use latest 4.5 branch which
  should become the next RC. At the moment the latest is SHA
  d08369ad06b6d5ef801f79493c2aa4bdaeab1b83. Thanks.
   
On 22-Jan-2015, at 6:29 pm, Wilder Rodrigues
  wrodrig...@schubergphilis.com
  mailto:wrodrig...@schubergphilis.com wrote:
   
Hi Rohit,
   
Tests were based on the commit id gave put in the RC1:
  8db3cbd4ff62b17a8b496026b68cf60ee0c76740
   
Please let me know if there is a new commit ID to be used
  and I will test it.
   
Apologies for the misunderstanding, I have been a bit
  away from the list. :)
   
Cheers,
Wilder
   
On 22 Jan 2015, at 12:29, Rohit Yadav
  rohit.ya...@shapeblue.com mailto:rohit.ya...@shapeblue.com
  wrote:
   
Hi Wilder,
   
Thanks for sharing, but I’m confused if these tests were
  against latest 4.5 branch or the RC1? Looking forward to your
  tests (on latest 4.5). Thanks.
   
Regards.
   
On 22-Jan-2015, at 4:24 pm, Wilder Rodrigues
  wrodrig...@schubergphilis.com
  mailto:wrodrig...@schubergphilis.com wrote:
   
Hi all,
   
Sorry for the delay on the tests and also coming to it
  only after the RC has been cancelled, but I was way too busy
  with 

Re: [VOTE] Apache CloudStack 4.5.0 RC1

2015-02-05 Thread Andrei Mikhailovsky
I personally would compare Ubuntu 12.04 to Windows Server 2008 R2 in terms of 
its popularity relative to the next stable release. You do see Windows 2012 
servers about, but 2008 R2 is still very popular. The same goes for Ubuntu 
14.04 and 12.04. 

Andrei 
- Original Message -

 From: Mike Tutkowski mike.tutkow...@solidfire.com
 To: dev@cloudstack.apache.org
 Sent: Thursday, 5 February, 2015 10:29:19 PM
 Subject: Re: [VOTE] Apache CloudStack 4.5.0 RC1

 Thanks for the feedback, Andrei.

 I personally don't have a good feel for how common or not 12.04 still
 is,
 so it's great to have your comments.

 On Thu, Feb 5, 2015 at 3:07 PM, Andrei Mikhailovsky
 and...@arhont.com
 wrote:

  Mike, I think dropping Ubuntu 12.04 will not be taken kindly by the
  community. I am not speaking for myself as I've upgraded to 14.04
  already,
  but there are still tons of 12.04 installs.
 
  Andrei
 
  - Original Message -
 
   From: Mike Tutkowski mike.tutkow...@solidfire.com
   To: Rohit Yadav rohit.ya...@shapeblue.com
   Cc: Wilder Rodrigues wrodrig...@schubergphilis.com,
   dev@cloudstack.apache.org, int-toolkit
   int-tool...@schubergphilis.com, int-cloud
   int-cl...@schubergphilis.com
   Sent: Thursday, 5 February, 2015 8:01:38 PM
   Subject: Re: [VOTE] Apache CloudStack 4.5.0 RC1
 
   Hi everyone,
 
   So, what do we want to do here? It seems that KVM doesn't work on
   CS
   4.5
   (due to this kvmclock issue).
 
   How do we go about collecting enough data points in order to
   determine if
   we can drop Ubuntu 12.04 from the supported list of environments
   with
   regards to KVM on CS 4.5?
 
   Thanks,
   Mike
 
   On Wed, Feb 4, 2015 at 10:58 AM, Mike Tutkowski 
   mike.tutkow...@solidfire.com wrote:
 
FYI: From what I observed, I could not get a user VM to start
on
Ubuntu
12.04.1 using CloudStack 4.6. I think Marcus believes the same
issue would
be present using CloudStack 4.5 with Ubuntu 12.04.1.
   
On Wed, Feb 4, 2015 at 6:29 AM, Rohit Yadav
rohit.ya...@shapeblue.com
wrote:
   
Hi Wilder,
   
The issue I shared was related to KVM (based on Ubuntu 14.04),
I
think
as user VMs Ubuntu 12.04 should run fine. Though you can help
test
4.4.2
to 4.5.0 (use latest 4.5 branch) as well.
   
I've spent last two weeks only on testing various components
and
upgrade
scenarios but mostly with KVM, I think now the upgrade process
is
pretty
smooth with fewer rough edges.
   
On Wednesday 04 February 2015 05:58 PM, Wilder Rodrigues
wrote:
   
Hi there,
   
We are currently using 4.4.2 in our production environment
and we
have
Ubuntu 12.04-5 VMs. That would be crucial to have support to
Ubuntu
12.04.
   
Is there any bug already created that we can have a look and
help
to
fix? Not being able to upgrade from 4.4.2 to 4.5 wouldn’t be
cool.
   
Cheers,
Wilder
   
   
On 02 Feb 2015, at 08:53, Mike Tutkowski
mike.tutkow...@solidfire.com
mailto:mike.tutkow...@solidfire.com wrote:
   
Also, just as an FYI, after I upgraded Ubuntu from 12.04 to
14.04, I
was able to create a VM using CS 4.6 (as we expected would
be
the case).
   
As Marcus mentioned, we should try to determine if Ubuntu
12.04
should
be a supported platform for CS 4.5.
   
On Sun, Feb 1, 2015 at 7:12 PM, Mike Tutkowski
mike.tutkow...@solidfire.com
mailto:mike.tutkow...@solidfire.com
wrote:
   
Thanks, Rohit
   
As Marcus later commented, it's a compatibility issue with
Ubuntu
12.04 (which is what I'm running).
   
On Sun, Feb 1, 2015 at 6:30 PM, Rohit Yadav
rohit.ya...@shapeblue.com
mailto:rohit.ya...@shapeblue.com
wrote:
   
Hi Mike,
   
I’ve tested 4.5 branch with KVM/Ubuntu 14.04 and local
storage
seems to work for me. I’ve not tested it thoroughly things
like migration, attaching/detach localstorage disks etc.
   
 On 31-Jan-2015, at 1:10 pm, Mike Tutkowski
mike.tutkow...@solidfire.com
mailto:mike.tutkow...@solidfire.com wrote:

 Hi everyone,

 Any news on this?

 I am still having trouble creating a VM on local storage
 on
KVM (with or without kvmclock.disable=true in
agent.properties).

 I'm on Ubuntu 12.04.1.

 Thanks!
 Mike

 On Sat, Jan 24, 2015 at 2:19 AM, Wilder Rodrigues
wrodrig...@schubergphilis.com
mailto:wrodrig...@schubergphilis.com wrote:
 Okay, thanks for the clarification.

 I will test it over the weekend.

 Cheers,
 Wilder

 On 22 Jan 2015, at 15:20, Rohit Yadav
rohit.ya...@shapeblue.com
mailto:rohit.ya...@shapeblue.com
wrote:

  Hi Wilder,
 
  If you’re testing please use latest 4.5 branch which
should become the next RC. At the moment the latest is SHA
d08369ad06b6d5ef801f79493c2aa4bdaeab1b83. Thanks

Re: Downgrading recommendations from 4.4.2 to 4.3.2

2015-02-04 Thread Andrei Mikhailovsky
Daan, do you know if there is a changelog for the current 4.4.3 branch? Perhaps 
you are right and some of the issues were fixed. 

Andrei 

- Original Message -

 From: Daan Hoogland daan.hoogl...@gmail.com
 To: dev dev@cloudstack.apache.org
 Sent: Wednesday, 4 February, 2015 4:01:55 PM
 Subject: Re: Downgrading recommendations from 4.4.2 to 4.3.2

 Andrei,

 The 4.4.2 version sysvms report version 4.4.1, unless you bake your
 own. I made some custom templates for Xen and VMware for internal use.
 The 4.4.1 version should be fine though.
 
 As for fix forward, going to 4.5 is an option, but I meant keeping your
 own branch and cherry-picking fixes that you need until a new stable
 release is out.
 
 4.4.3 was never voted in, but you could try that one

 On Wed, Feb 4, 2015 at 4:47 PM, Andrei Mikhailovsky
 and...@arhont.com wrote:
  Actually, all my systemvms and VRs are running 4.4.1 even though I
  am sure I've used the latest templates. Could someone running
  version 4.4.2 please verify the version of their systemvms?
 
  Kind regards
 
  - Original Message -
 
  From: Pierre-Luc Dion pd...@cloudops.com
  To: dev@cloudstack.apache.org
  Sent: Wednesday, 4 February, 2015 2:09:30 PM
  Subject: Re: Downgrading recommendations from 4.4.2 to 4.3.2
 
  Andrei,
 
  I wouldn't recommend downgrading; I did that in the past when we
  upgraded to 4.2.1 and I ended up with even more issues. For the
  maintenance mode issue, can you verify that the system VMs currently
  running (including VRs) run as 4.4.2? I've seen behavior like this
  when upgrading to 4.4.x where CloudStack remains in Maintenance mode
  until all VRs and system VMs run the proper ACS version. This is in
  /etc/cloudstack of VRs, CPVM and SSVM. If not at 4.4.2, try to
  install a new systemvm template and apply it to complete the upgrade
  of the system VMs.
 
  Hope this helps and you won't have to deal with a rollback; if you
  do, please share your experience
 
  On Wed, Feb 4, 2015 at 8:51 AM, Andrei Mikhailovsky
  and...@arhont.com
  wrote:
 
   Daan,
  
   Do you mean that I should wait for the 4.5 branch to be out and
   try
   an
   upgrade hoping that the issues are fixed?
  
   Why do you think downgrading the database is a tricky business?
   Let's say
   I do the following:
  
    1. Back up the existing database
    2. Remove all virtual routers
    3. Remove the existing DB (make a backup just in case)
    4. Restore the DB from 4.3.2
    5. Downgrade packages
    6. Restart networking (hopefully it will recreate all virtual
    routers)
  
   Or do you think there might be some issues in this process?
  
   Thanks
  
   - Original Message -
  
From: Daan Hoogland daan.hoogl...@gmail.com
To: dev dev@cloudstack.apache.org
Sent: Wednesday, 4 February, 2015 12:34:08 PM
Subject: Re: Downgrading recommendations from 4.4.2 to 4.3.2
  
Andrei,
  
downgrading the database is your big problem. You don't want
to
restore at this point. I would go for fix forward but that's
having
the ability to maintain my own.branch.
  
On Wed, Feb 4, 2015 at 1:21 PM, Andrei Mikhailovsky
and...@arhont.com wrote:


 Hello guys,

 I was hoping to get some suggestions on how to perform a
 downgrade
 from a live 4.4.2 environment back to 4.3.2. After
 performing
 an
 upgrade 4 days ago, I have discovered a bunch of problems
 with
 4.4.2 release, which makes it unusable for me. Problems like
 CLOUDSTACK-8201, CLOUDSTACK-8210 and inability to create new
 instances (yet to open a support ticket) are kind of big
 issues
 for me!

 Prior to performing an upgrade I've backed up cloud and
 cloud_usage
 databases of version 4.3.2. I have previously downgraded ACS
 several times without any issues. However, i've done the
 downgrade
 a few hours after performing an upgrade. This time, it's
 been
 about 4 days since i've done the upgrade. Since then, I've
 restarted many networks with clean up enabled, which
 resulted
 in
 creation of new virtual routers. I've also created a few
 test
 vms,
 templates and snapshots, but I am not actually too concerned
 about
 those.

 What is the best route for me to do a downgrade to 4.3.2? I
 am
 planning to do it this weekend, however, I can do it sooner.
 Just
 need to make sure my ACS will not be even more broken.

 Many thanks

 Andrei

  
--
Daan
  

 --
 Daan
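
(For the database part of a rollback like the one discussed above, a minimal
sketch - the dump file names are illustrative and assume the backups were
taken on 4.3.2 before the upgrade:)

# taken while still on 4.3.2, before upgrading
mysqldump -u root -p cloud > cloud-4.3.2.sql
mysqldump -u root -p cloud_usage > cloud_usage-4.3.2.sql

# rollback: stop the management server first, then restore
mysql -u root -p -e 'DROP DATABASE cloud; CREATE DATABASE cloud;'
mysql -u root -p cloud < cloud-4.3.2.sql
mysql -u root -p -e 'DROP DATABASE cloud_usage; CREATE DATABASE cloud_usage;'
mysql -u root -p cloud_usage < cloud_usage-4.3.2.sql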


Re: Google Summer of Code 2015 is coming

2015-02-04 Thread Andrei Mikhailovsky
+1 for support of Ceph in XenServer 

Andrei 
- Original Message -

 From: Erik Weber terbol...@gmail.com
 To: dev dev@cloudstack.apache.org
 Sent: Wednesday, 4 February, 2015 7:35:39 AM
 Subject: Re: Google Summer of Code 2015 is coming

 Pure Xen support would be nice :-)

 (and Ceph in XenServer, but that's not really a CloudStack issue)

 --
 Erik

 On Tue, Feb 3, 2015 at 9:42 AM, Sebastien Goasguen run...@gmail.com
 wrote:

  GSoC 2015 is back.
  Time to enter your project proposals in jira if you want to mentor.
 
  Begin forwarded message:
 
   From: Ulrich Stärk u...@apache.org
   Subject: Google Summer of Code 2015 is coming
   Date: February 2, 2015 5:44:52 PM EST
   To: ment...@community.apache.org
   Reply-To: ment...@community.apache.org
   Reply-To: ment...@community.apache.org
  
   Hello PMCs (incubator Mentors, please forward this email to your
  podlings),
  
   Google Summer of Code [1] is a program sponsored by Google
   allowing
  students to spend their summer
   working on open source software. Students will receive stipends
   for
  developing open source software
   full-time for three months. Projects will provide mentoring and
   project
  ideas, and in return have
   the chance to get new code developed and - most importantly - to
  identify and bring in new committers.
  
   The ASF will apply as a participating organization meaning
   individual
  projects don't have to apply
   separately.
  
   If you want to participate with your project we ask you to do the
  following things by no later than
   2015-02-13 19:00 UTC (applications from organizations close a
   week later)
  
   1. understand what it means to be a mentor [2].
  
   2. record your project ideas.
  
   Just create issues in JIRA, label them with gsoc2015, and they
   will show
  up at [3]. Please be as
   specific as possible when describing your idea. Include the
   programming
  language, the tools and
   skills required, but try not to scare potential students away.
   They are
  supposed to learn what's
   required before the program starts.
  
   Use labels, e.g. for the programming language (java, c, c++,
   erlang,
  python, brainfuck, ...) or
   technology area (cloud, xml, web, foo, bar, ...) and record them
   at [5].
  
   Please use the COMDEV JIRA project for recording your ideas if
   your
  project doesn't use JIRA (e.g.
   httpd, ooo). Contact d...@community.apache.org if you need
   assistance.
  
   [4] contains some additional information (will be updated for
   2015
  shortly).
  
   3. subscribe to ment...@community.apache.org; restricted to
   potential
  mentors, meant to be used as a
   private list - general discussions on the public
  d...@community.apache.org list as much as possible
   please). Use a recognized address when subscribing (@apache.org
   or one
  of your alias addresses on
   record).
  
   Note that the ASF isn't accepted as a participating organization
   yet,
  nevertheless you *have to*
   start recording your ideas now or we might not get accepted.
  
   Over the years we were able to complete hundreds of projects
  successfully. Some of our prior
   students are active contributors now! Let's make this year a
   success
  again!
  
   Cheers,
  
   Uli
  
   P.S.: Except for the private parts (label spreadsheet mostly),
   this
  email is free to be shared
   publicly if you want to.
  
   [1] http://www.google-melange.com/gsoc/homepage/google/gsoc2015
   [2] http://community.apache.org/guide-to-being-a-mentor.html
   [3] http://s.apache.org/gsoc2015ideas
   [4] http://community.apache.org/gsoc.html
   [5] http://s.apache.org/gsoclabels
  
 
 


Re: Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

2015-02-04 Thread Andrei Mikhailovsky
That would be very useful to find out! 

I am running Ubuntu 14.04 on the management server and hosts and I am running 
Ubuntu 14.10 on the client that accesses the GUI. 

Thanks 

Andrei 

- Original Message -

 From: Mike Tutkowski mike.tutkow...@solidfire.com
 To: dev@cloudstack.apache.org
 Sent: Wednesday, 4 February, 2015 6:00:43 PM
 Subject: Re: Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

 Thanks, Andrei

 I wonder, has anyone else out there had this issue when upgrading
 from
 4.3.2 to 4.4.2? It would be nice to know if this is a shared issue or
 perhaps something in your environment.

 What OS is your management server running on and what OS are you
 running
 Firefox and Chrome from? Thanks!

 On Tue, Feb 3, 2015 at 12:41 PM, Andrei Mikhailovsky
 and...@arhont.com
 wrote:

  Mike, i've tried logging off/on, tried restarting both the
  management
  server and the client and also tried changing the language. Nothing
  works
  regarding minor issue #2. The tabs are still showing as
  label.anything
  and also some of the buttons.
 
  Andrei
  - Original Message -
 
   From: Mike Tutkowski mike.tutkow...@solidfire.com
   To: dev@cloudstack.apache.org
   Sent: Tuesday, 3 February, 2015 6:39:04 PM
   Subject: Re: Major breakage in GUI after upgrade from 4.3.2 to
   4.4.2
 
   With regards to your #2 Minor issue:
 
   I've noticed this before, as well. Usually logging out and
   logging
   back in
   solves the problem. I think one of our GUI people had commented
   at
   some
   point about a fix for this (but that might have been in 4.5).
 
   On Mon, Feb 2, 2015 at 12:23 PM, Andrei Mikhailovsky
   and...@arhont.com
   wrote:
 
Hi guys,
   
Sorry for duplicating the message from the user list. I've not
got
anywhere there.
   
I've recently upgraded my ASC from version 4.3.2 to version
4.4.2.
The
upgrade process went well without any setbacks or issues. I've
not
seen any
errors in the log files. All looks good apart from the GUI
issues.
I've
tried to clear browser caches and pressed force refresh as
well.
This
happens in Firefox as well as Chrome.
   
The following major issue that i've identified so far:
   
1. I can no longer create new instances. Regardless of if I am
doing it
from the ISO or existing Templates. After following the Add
Instance wizard
and clicking on the Launch button nothing happens. The wizard
window
becomes shaded and the spinning circle appears. I've left it
for
hours
without any change. When the Launch button is pressed, the
management
server does not receive an API call to create an instance.
There
are
actually nothing in the logs after the button is pressed.
However,
I can
successfully create new instances by using the CloudMonkey
client.
2. There is no Delete button for Templates and ISOs. The Edit
and
Download
buttons are there, but not the Delete button.
   
The following minor issues that i've seen so far:
   
1. The elements in the Dashboard screen are not fitting their
corresponding boxes. They stick out and not aligning properly
2. Some Tabs are not labeled properly and instead show
something
like:
label.zones or label.add.isolated.network and a few more that
i've
noticed,
but can't recall exactly what they were. But it seems that
these
labels are
all over the place (probably about 20% of all Tabs and buttons
in
the GUI)
   
   
Has anyone else seen these types of issues with the 4.4.x
branch?
Any
thoughts on what is causing the issues and how to resolve them?
   
Thanks
   
Andrei
   
 
   --
   *Mike Tutkowski*
   *Senior CloudStack Developer, SolidFire Inc.*
   e: mike.tutkow...@solidfire.com
   o: 303.746.7302
   Advancing the way the world uses the cloud
   http://solidfire.com/solution/overview/?video=play*™*
 

 --
 *Mike Tutkowski*
 *Senior CloudStack Developer, SolidFire Inc.*
 e: mike.tutkow...@solidfire.com
 o: 303.746.7302
 Advancing the way the world uses the cloud
 http://solidfire.com/solution/overview/?video=play*™*


Downgrading recommendations from 4.4.2 to 4.3.2

2015-02-04 Thread Andrei Mikhailovsky


Hello guys, 

I was hoping to get some suggestions on how to perform a downgrade from a live 
4.4.2 environment back to 4.3.2. After performing an upgrade 4 days ago, I have 
discovered a bunch of problems with the 4.4.2 release, which make it unusable for 
me. Problems like CLOUDSTACK-8201, CLOUDSTACK-8210 and the inability to create new 
instances (yet to open a support ticket) are kind of big issues for me! 

Prior to performing the upgrade I backed up the cloud and cloud_usage databases 
of version 4.3.2. I have previously downgraded ACS several times without any 
issues. However, those downgrades were done within a few hours of the upgrade. 
This time, it's been about 4 days since I did the upgrade. Since then, I've 
restarted many networks with clean up enabled, which resulted in the creation of 
new virtual routers. I've also created a few test vms, templates and snapshots, 
but I am not actually too concerned about those. 

What is the best route for me to do a downgrade to 4.3.2? I am planning to do 
it this weekend; however, I can do it sooner. I just need to make sure my ACS 
will not end up even more broken. 

Many thanks 

Andrei 



Re: Downgrading recommendations from 4.4.2 to 4.3.2

2015-02-04 Thread Andrei Mikhailovsky
Daan, 

Do you mean that I should wait for the 4.5 branch to be out and try an upgrade 
hoping that the issues are fixed? 

Why do you think downgrading the database is a tricky business? Let's say I do 
the following: 

1. Backup the existing database 
2. Remove all virtual routers 
3. Remove the existing db (make a backup just in case) 
4. Restore the db from 4.3.2 (a quick sanity check for this step is sketched below) 
5. Downgrade the packages 
6. Restart networking (hopefully it will recreate all virtual routers) 

Or do you think there might be some issues in this process? 
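
For step 4, this is the kind of sanity check I have in mind after the restore, 
assuming the stock schema-version table (version, in the cloud database) that 
CloudStack's upgrade code maintains; the column names and credentials below are 
my assumptions/placeholders, not verified against 4.3.2: 

    // Node.js sketch: confirm the newest schema row reads 4.3.2 after
    // restoring the dump and before downgrading the packages.
    var mysql = require('mysql');          // npm install mysql

    var conn = mysql.createConnection({
        host: 'localhost',                 // management server DB host
        user: 'cloud',
        password: 'CHANGE-ME',             // placeholder
        database: 'cloud'
    });

    conn.query('SELECT version, updated FROM version ORDER BY id DESC LIMIT 1',
        function (err, rows) {
            if (err) throw err;
            console.log('schema version:', rows[0].version,
                        'updated:', rows[0].updated);
            conn.end();
        });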

Thanks 

- Original Message -

 From: Daan Hoogland daan.hoogl...@gmail.com
 To: dev dev@cloudstack.apache.org
 Sent: Wednesday, 4 February, 2015 12:34:08 PM
 Subject: Re: Downgrading recommendations from 4.4.2 to 4.3.2

 Andrei,

 downgrading the database is your big problem. You don't want to
 restore at this point. I would go for fix forward, but that's because I
 have the ability to maintain my own branch.

 On Wed, Feb 4, 2015 at 1:21 PM, Andrei Mikhailovsky
 and...@arhont.com wrote:
 
 
  Hello guys,
 
  I was hoping to get some suggestions on how to perform a downgrade
  from a live 4.4.2 environment back to 4.3.2. After performing an
  upgrade 4 days ago, I have discovered a bunch of problems with
  4.4.2 release, which makes it unusable for me. Problems like
  CLOUDSTACK-8201, CLOUDSTACK-8210 and inability to create new
  instances (yet to open a support ticket) are kind of big issues
  for me!
 
  Prior to performing an upgrade I've backed up cloud and cloud_usage
  databases of version 4.3.2. I have previously downgraded ACS
  several times without any issues. However, i've done the downgrade
  a few hours after performing an upgrade. This time, it's been
  about 4 days since i've done the upgrade. Since then, I've
  restarted many networks with clean up enabled, which resulted in
  creation of new virtual routers. I've also created a few test vms,
  templates and snapshots, but I am not actually too concerned about
  those.
 
  What is the best route for me to do a downgrade to 4.3.2? I am
  planning to do it this weekend, however, I can do it sooner. Just
  need to make sure my ACS will not be even more broken.
 
  Many thanks
 
  Andrei
 

 --
 Daan


Re: Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

2015-02-03 Thread Andrei Mikhailovsky
Mike, 

I've tried using both the template and the ISO, and I get the same error. 

The templates are based on the following (copy paste from GUI): 

The compute offering: 
Name 2vCPU_2GB 
ID 8951c6d0-d18f-40e9-b34d-457a12b3da9c 
Description 2vCPU_2GB NO High Availability 
Storage Type shared 
# of CPU Cores 2 
CPU (in MHz) 2.00 GHz 
Memory (in MB) 2.00 GB 
Network Rate (Mb/s) 
Custom IOPS 
Min IOPS N/A 
Max IOPS N/A 
Hypervisor Snapshot Reserve N/A 
Disk Read Rate (BPS) 
Disk Write Rate (BPS) 
Disk Read Rate (IOPS) 
Disk Write Rate (IOPS) 
Offer HA No 
CPU Cap No 
Volatile No 
Deployment Planner 
Planner Mode 
GPU 
vGPU type 
Storage Tags rbd 
Host Tags 
Domain 
Created 17 Dec 2013 17:28:25 

The disk offering: 
Name 10GB Disk Standard - RBD 
ID cb6719d6-166f-4a9c-8ece-5c1b528c4982 
Description 10GB Disk Standard Tier - RBD 
Custom Disk Size No 
Disk Size (in GB) 10 
Custom IOPS 
Min IOPS N/A 
Max IOPS N/A 
Hypervisor Snapshot Reserve N/A 
Disk Write Rate (BPS) 
Disk Write Rate (BPS) 
Disk Write Rate (IOPS) 
Disk Write Rate (IOPS) 
label.cache.mode none 
Storage Tags rbd 
Domain 
Storage Type shared 

I've tried using the offerings which existed before the 4.4.2 upgrade, as well 
as newly created compute/disk offerings. I still get the same issue. 

I've tried using Chrome and Firefox installed from Ubuntu 14.10 repos with the 
latest updates. The debugging console was activated in Firefox. If you want, I 
can do the same from Chrome as well, but from what I can see, both browsers 
behave the same way. 

I am unable to try IE as I do not have access to a windows box. 

Please let me know if you need anything else. 

Andrei 

- Original Message -

 From: Mike Tutkowski mike.tutkow...@solidfire.com
 To: dev@cloudstack.apache.org
 Cc: Brian Federle brian.fede...@citrix.com
 Sent: Tuesday, 3 February, 2015 2:48:08 PM
 Subject: Re: Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

 Also, thanks for the info that a new compute offering doesn't help.

 Are you able to run the same tests from different browsers? If so,
 can you
 tell me those results?

 Thanks!

 On Tue, Feb 3, 2015 at 7:45 AM, Mike Tutkowski
 mike.tutkow...@solidfire.com
  wrote:

  Hi Andrei,
 
  A couple more questions for you:
 
  Are you spinning up a VM based on a template or an ISO in this
  case?
 
  If a template, can you specify the characteristics of your compute
  offering?
 
  If an ISO, can you specify the characteristics of your compute and
  disk
  offerings?
 
  Thanks!
  Mike
 
  On Tue, Feb 3, 2015 at 2:25 AM, Andrei Mikhailovsky
  and...@arhont.com
  wrote:
 
  Mike, thanks for looking into this. I've ran a few tests and I can
  confirm that creating a new disk and compute offering does NOT
  solve the
  problem. I still have the same error on the same line. The disk
  and compute
  offering were created by specifying only the required options
  marked with
  the red *.
 
  Andrei
 
  - Original Message -
 
   From: Mike Tutkowski mike.tutkow...@solidfire.com
   To: dev@cloudstack.apache.org, Brian Federle
   brian.fede...@citrix.com
   Sent: Tuesday, 3 February, 2015 4:45:12 AM
   Subject: Re: Major breakage in GUI after upgrade from 4.3.2 to
   4.4.2
 
   So, Andrei, the problem is this if statement should return
   false
   if you
   are not allowing IOPS to be set in your compute offering:
 
   if
  
  (args.$wizard.find('input[name=disk-min-iops]').parent().parent().css('display')
   != 'none') {
 
   But it returns true and then the next if statement doesn't
   find
   the
   disk-min-iops control.
 
   I just ran some tests on this a moment ago and it all worked
   fine, so
   I'll
   be curious to see if this is only a problem for you when you use
   a
   compute
   offering that existed before you completed the upgrade.
 
   Perhaps we'll be able to have a GUI person examine this code
   with the
   upgrade scenario in mind and comment, as well. This pattern is
   the
   same as
   that of the optional CPU, MHz, and memory pattern, so it's a bit
   strange to
   me that the CPU/MHz/memory line doesn't fail first (unless your
   compute
   offering does accept input for CPU/MHz/memory).
 
   Thanks!
 
   On Mon, Feb 2, 2015 at 9:16 PM, Mike Tutkowski
   mike.tutkow...@solidfire.com
wrote:
 
Hey Andrei,
   
Does this only happen when you try to spin up a VM using a
compute
offering that existed BEFORE the upgrade?
   
Looking at it another way, if you create a new compute
offering
once
you're already upgraded, are you able to spin up a VM with
that
compute
offering?
   
Thanks!
Mike
   
On Mon, Feb 2, 2015 at 9:07 PM, Mike Tutkowski 
mike.tutkow...@solidfire.com wrote:
   
These two top-level if statements follow the same pattern:
   
if
   
  (args.$wizard.find('input[name=compute-cpu-cores]').parent().parent().css('display')
!= 'none') {
if
 (args.$wizard.find('input[name=compute-cpu-cores]').val().length > 0

Re: Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

2015-02-03 Thread Andrei Mikhailovsky
Mike, 

I have preset values for the number of CPUs and MHz. In this case, it's 2 CPUs 
of 2 GHz each. I've also tried different offerings with different numbers of 
CPUs, but the result is no different. 

Cheers 

- Original Message -

 From: Mike Tutkowski mike.tutkow...@solidfire.com
 To: dev@cloudstack.apache.org
 Cc: Brian Federle brian.fede...@citrix.com
 Sent: Tuesday, 3 February, 2015 3:40:35 PM
 Subject: Re: Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

 Thanks, Andrei

 Another question: Does your compute offering specify the CPU, MHz,
 and
 memory or does it allow the end user to specify those values?

 On Tue, Feb 3, 2015 at 8:33 AM, Andrei Mikhailovsky
 and...@arhont.com
 wrote:

  Mike,
 
  I've tried using both, the template and the ISO and get the same
  error.
 
  The templates are based on the following (copy paste from GUI):
 
  The compute offering:
  Name 2vCPU_2GB
  ID 8951c6d0-d18f-40e9-b34d-457a12b3da9c
  Description 2vCPU_2GB NO High Availability
  Storage Type shared
  # of CPU Cores 2
  CPU (in MHz) 2.00 GHz
  Memory (in MB) 2.00 GB
  Network Rate (Mb/s)
  Custom IOPS
  Min IOPS N/A
  Max IOPS N/A
  Hypervisor Snapshot Reserve N/A
  Disk Read Rate (BPS)
  Disk Write Rate (BPS)
  Disk Read Rate (IOPS)
  Disk Write Rate (IOPS)
  Offer HA No
  CPU Cap No
  Volatile No
  Deployment Planner
  Planner Mode
  GPU
  vGPU type
  Storage Tags rbd
  Host Tags
  Domain
  Created 17 Dec 2013 17:28:25
 
  The disk offering:
  Name 10GB Disk Standard - RBD
  ID cb6719d6-166f-4a9c-8ece-5c1b528c4982
  Description 10GB Disk Standard Tier - RBD
  Custom Disk Size No
  Disk Size (in GB) 10
  Custom IOPS
  Min IOPS N/A
  Max IOPS N/A
  Hypervisor Snapshot Reserve N/A
  Disk Write Rate (BPS)
  Disk Write Rate (BPS)
  Disk Write Rate (IOPS)
  Disk Write Rate (IOPS)
  label.cache.mode none
  Storage Tags rbd
  Domain
  Storage Type shared
 
  I've tried using the offerings which existed pre 4.4.2 upgrade as
  well as
  on a newly created compute/disk offerings. Still get the same
  issue.
 
  I've tried using Chrome and Firefox installed from Ubuntu 14.10
  repos with
  the latest updates. The debugging console was activated in Firefox.
  If you
  want, I can do the same from Chrome as well, but from what I can
  see, both
  browsers behave the same way.
 
  I am unable to try IE as I do not have access to a windows box.
 
  Please let me know if you need anything else.
 
  Andrei
 
  - Original Message -
 
   From: Mike Tutkowski mike.tutkow...@solidfire.com
   To: dev@cloudstack.apache.org
   Cc: Brian Federle brian.fede...@citrix.com
   Sent: Tuesday, 3 February, 2015 2:48:08 PM
   Subject: Re: Major breakage in GUI after upgrade from 4.3.2 to
   4.4.2
 
   Also, thanks for the info that a new compute offering doesn't
   help.
 
   Are you able to run the same tests from different browsers? If
   so,
   can you
   tell me those results?
 
   Thanks!
 
   On Tue, Feb 3, 2015 at 7:45 AM, Mike Tutkowski
   mike.tutkow...@solidfire.com
wrote:
 
Hi Andrei,
   
A couple more questions for you:
   
Are you spinning up a VM based on a template or an ISO in this
case?
   
If a template, can you specify the characteristics of your
compute
offering?
   
If an ISO, can you specify the characteristics of your compute
and
disk
offerings?
   
Thanks!
Mike
   
On Tue, Feb 3, 2015 at 2:25 AM, Andrei Mikhailovsky
and...@arhont.com
wrote:
   
Mike, thanks for looking into this. I've ran a few tests and I
can
confirm that creating a new disk and compute offering does NOT
solve the
problem. I still have the same error on the same line. The
disk
and compute
offering were created by specifying only the required options
marked with
the red *.
   
Andrei
   
- Original Message -
   
 From: Mike Tutkowski mike.tutkow...@solidfire.com
 To: dev@cloudstack.apache.org, Brian Federle
 brian.fede...@citrix.com
 Sent: Tuesday, 3 February, 2015 4:45:12 AM
 Subject: Re: Major breakage in GUI after upgrade from 4.3.2
 to
 4.4.2
   
 So, Andrei, the problem is this if statement should return
 false
 if you
 are not allowing IOPS to be set in your compute offering:
   
 if

   
  (args.$wizard.find('input[name=disk-min-iops]').parent().parent().css('display')
 != 'none') {
   
 But it returns true and then the next if statement
 doesn't
 find
 the
 disk-min-iops control.
   
 I just ran some tests on this a moment ago and it all worked
 fine, so
 I'll
 be curious to see if this is only a problem for you when you
 use
 a
 compute
 offering that existed before you completed the upgrade.
   
 Perhaps we'll be able to have a GUI person examine this code
 with the
 upgrade scenario in mind and comment, as well. This pattern
 is
 the
 same

Re: Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

2015-02-03 Thread Andrei Mikhailovsky
Mike, I've tried logging off/on, tried restarting both the management server 
and the client, and also tried changing the language. Nothing works regarding 
minor issue #2. The tabs are still showing as label.something, and so are some 
of the buttons. 

Andrei 
- Original Message -

 From: Mike Tutkowski mike.tutkow...@solidfire.com
 To: dev@cloudstack.apache.org
 Sent: Tuesday, 3 February, 2015 6:39:04 PM
 Subject: Re: Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

 With regards to your #2 Minor issue:

 I've noticed this before, as well. Usually logging out and logging
 back in
 solves the problem. I think one of our GUI people had commented at
 some
 point about a fix for this (but that might have been in 4.5).

 On Mon, Feb 2, 2015 at 12:23 PM, Andrei Mikhailovsky
 and...@arhont.com
 wrote:

  Hi guys,
 
  Sorry for duplicating the message from the user list. I've not got
  anywhere there.
 
  I've recently upgraded my ACS from version 4.3.2 to version 4.4.2.
  The
  upgrade process went well without any setbacks or issues. I've not
  seen any
  errors in the log files. All looks good apart from the GUI issues.
  I've
  tried to clear browser caches and pressed force refresh as well.
  This
  happens in Firefox as well as Chrome.
 
  The following major issue that i've identified so far:
 
  1. I can no longer create new instances. Regardless of if I am
  doing it
  from the ISO or existing Templates. After following the Add
  Instance wizard
  and clicking on the Launch button nothing happens. The wizard
  window
  becomes shaded and the spinning circle appears. I've left it for
  hours
  without any change. When the Launch button is pressed, the
  management
  server does not receive an API call to create an instance. There
  are
  actually nothing in the logs after the button is pressed. However,
  I can
  successfully create new instances by using the CloudMonkey client.
  2. There is no Delete button for Templates and ISOs. The Edit and
  Download
  buttons are there, but not the Delete button.
 
  The following minor issues that i've seen so far:
 
  1. The elements in the Dashboard screen are not fitting their
  corresponding boxes. They stick out and not aligning properly
  2. Some Tabs are not labeled properly and instead show something
  like:
  label.zones or label.add.isolated.network and a few more that i've
  noticed,
  but can't recall exactly what they were. But it seems that these
  labels are
  all over the place (probably about 20% of all Tabs and buttons in
  the GUI)
 
 
  Has anyone else seen these types of issues with the 4.4.x branch?
  Any
  thoughts on what is causing the issues and how to resolve them?
 
  Thanks
 
  Andrei
 

 --
 *Mike Tutkowski*
 *Senior CloudStack Developer, SolidFire Inc.*
 e: mike.tutkow...@solidfire.com
 o: 303.746.7302
 Advancing the way the world uses the cloud
 http://solidfire.com/solution/overview/?video=play*™*


Re: Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

2015-02-03 Thread Andrei Mikhailovsky
. It
  works fine in all of the tests I've run where I've set the
  compute offering
  up to both ask and not ask for these IOPS fields.
 
  Perhaps one of our GUI gurus can comment (I've included Brian
  Federle).
 
  On Mon, Feb 2, 2015 at 5:57 PM, Andrei Mikhailovsky
  and...@arhont.com
  wrote:
 
  Mike,
 
  I am not really sure how to do that.
 
  Here is what I've done so far, perhaps you could help me with
  some
  instructions.
 
  I've opened debugging console in Firefox and checked the Console
  tab.
  After i've followed the add instance wizard while watching the
  messages in
  the Console. No errors until I've clicked the launch button.
  After that
  I've got the following message:
 
  TypeError: args.$wizard.find(...).val(...) is undefined
  instanceWizard.js:649
 
  Looking at the line 649 in the instanceWizard.js:
 
  if (args.$wizard.find('input[name=disk-min-iops]').val().length
   > 0) {
 
  So, it seem to be looking for the disk-min-iops value which is
  not
  defined during the wizard creation. I do not recall ever being
  required to
  specify these values in the past. Thus, not sure why it needs
  these values
  all of a sudden after performing an upgrade from acs 4.3.2?
 
  Any idea anyone?
 
  Cheers
 
  - Original Message -
 
   From: Mike Tutkowski mike.tutkow...@solidfire.com
   To: dev@cloudstack.apache.org
   Sent: Monday, 2 February, 2015 9:25:31 PM
   Subject: Re: Major breakage in GUI after upgrade from 4.3.2 to
   4.4.2
 
   Hey Andrei,
 
   Are you familiar with debugging in your web browser?
 
   One thing you could try is to set a breakpoint in
   instanceWizard.js
   where
   deployVirtualMachine is invoked and see what happens.
 
   Talk to you later,
   Mike
 
   On Mon, Feb 2, 2015 at 2:16 PM, Andrei Mikhailovsky
   and...@arhont.com
   wrote:
 
Mike, you are absolutely right, thanks! The delete function
has
been
hidden under the Zones tab (in my version of GUI it is shown
as
label.zones). So, this one is sorted out.
   
Now, I wonder how to fix the major issue #1 - unable to
create new
vm
instances? Anyone any thoughts?
   
Thanks
   
Andrei
   
- Original Message -
   
 From: Mike Tutkowski mike.tutkow...@solidfire.com
 To: dev@cloudstack.apache.org
 Sent: Monday, 2 February, 2015 7:38:39 PM
 Subject: Re: Major breakage in GUI after upgrade from
 4.3.2 to
 4.4.2
   
 I wonder for your Major issue #2 if you have drilled down
 into
 the
 applicable zone from which you want to delete the
 template?
   
 I had trouble finding this at one point, as well.
   
 I don't have easy access to a 4.4 GUI at the time being,
 but in
 4.6
 you
 need to go to Templates, click on the template in the
 table,
 select
 the
 Zone tab, click on the applicable zone in the table, then
 you see
 a
 delete
 button.
   
 On Mon, Feb 2, 2015 at 12:23 PM, Andrei Mikhailovsky
 and...@arhont.com
 wrote:
   
  Hi guys,
 
  Sorry for duplicating the message from the user list.
  I've not
  got
  anywhere there.
 
  I've recently upgraded my ASC from version 4.3.2 to
  version
  4.4.2.
  The
  upgrade process went well without any setbacks or
  issues. I've
  not
  seen any
  errors in the log files. All looks good apart from the
  GUI
  issues.
  I've
  tried to clear browser caches and pressed force refresh
  as
  well.
  This
  happens in Firefox as well as Chrome.
 
  The following major issue that i've identified so far:
 
  1. I can no longer create new instances. Regardless of
  if I am
  doing it
  from the ISO or existing Templates. After following the
  Add
  Instance wizard
  and clicking on the Launch button nothing happens. The
  wizard
  window
  becomes shaded and the spinning circle appears. I've
  left it
  for
  hours
  without any change. When the Launch button is pressed,
  the
  management
  server does not receive an API call to create an
  instance.
  There
  are
  actually nothing in the logs after the button is
  pressed.
  However,
  I can
  successfully create new instances by using the
  CloudMonkey
   client.
  2. There is no Delete button for Templates and ISOs. The
  Edit
  and
  Download
  buttons are there, but not the Delete button.
 
  The following minor issues that i've seen so far:
 
  1. The elements in the Dashboard screen are not fitting
  their
  corresponding boxes. They stick out and not aligning
  properly
  2. Some Tabs are not labeled properly and instead show
  something
  like:
  label.zones or label.add.isolated.network and a few more
  that
  i've
  noticed,
  but can't recall exactly what

Re: Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

2015-02-02 Thread Andrei Mikhailovsky
Mike, you are absolutely right, thanks! The delete function has been hidden 
under the Zones tab (in my version of GUI it is shown as label.zones). So, 
this one is sorted out. 

Now, I wonder how to fix the major issue #1 - unable to create new vm 
instances? Anyone any thoughts? 

Thanks 

Andrei 

- Original Message -

 From: Mike Tutkowski mike.tutkow...@solidfire.com
 To: dev@cloudstack.apache.org
 Sent: Monday, 2 February, 2015 7:38:39 PM
 Subject: Re: Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

 I wonder for your Major issue #2 if you have drilled down into the
 applicable zone from which you want to delete the template?

 I had trouble finding this at one point, as well.

 I don't have easy access to a 4.4 GUI at the time being, but in 4.6
 you
 need to go to Templates, click on the template in the table, select
 the
 Zone tab, click on the applicable zone in the table, then you see a
 delete
 button.

 On Mon, Feb 2, 2015 at 12:23 PM, Andrei Mikhailovsky
 and...@arhont.com
 wrote:

  Hi guys,
 
  Sorry for duplicating the message from the user list. I've not got
  anywhere there.
 
  I've recently upgraded my ACS from version 4.3.2 to version 4.4.2.
  The
  upgrade process went well without any setbacks or issues. I've not
  seen any
  errors in the log files. All looks good apart from the GUI issues.
  I've
  tried to clear browser caches and pressed force refresh as well.
  This
  happens in Firefox as well as Chrome.
 
  The following major issue that i've identified so far:
 
  1. I can no longer create new instances. Regardless of if I am
  doing it
  from the ISO or existing Templates. After following the Add
  Instance wizard
  and clicking on the Launch button nothing happens. The wizard
  window
  becomes shaded and the spinning circle appears. I've left it for
  hours
  without any change. When the Launch button is pressed, the
  management
  server does not receive an API call to create an instance. There
  are
  actually nothing in the logs after the button is pressed. However,
  I can
  successfully create new instances by using the CloudMonkey client.
  2. There is no Delete button for Templates and ISOs. The Edit and
  Download
  buttons are there, but not the Delete button.
 
  The following minor issues that i've seen so far:
 
  1. The elements in the Dashboard screen are not fitting their
  corresponding boxes. They stick out and not aligning properly
  2. Some Tabs are not labeled properly and instead show something
  like:
  label.zones or label.add.isolated.network and a few more that i've
  noticed,
  but can't recall exactly what they were. But it seems that these
  labels are
  all over the place (probably about 20% of all Tabs and buttons in
  the GUI)
 
 
  Has anyone else seen these types of issues with the 4.4.x branch?
  Any
  thoughts on what is causing the issues and how to resolve them?
 
  Thanks
 
  Andrei
 

 --
 *Mike Tutkowski*
 *Senior CloudStack Developer, SolidFire Inc.*
 e: mike.tutkow...@solidfire.com
 o: 303.746.7302
 Advancing the way the world uses the cloud
 http://solidfire.com/solution/overview/?video=play*™*


Re: Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

2015-02-02 Thread Andrei Mikhailovsky
Mike, 

I am not really sure how to do that. 

Here is what I've done so far, perhaps you could help me with some 
instructions. 

I've opened the debugging console in Firefox and checked the Console tab. I 
then followed the Add Instance wizard while watching the messages in the 
Console. There were no errors until I clicked the Launch button. After that I 
got the following message: 

TypeError: args.$wizard.find(...).val(...) is undefined instanceWizard.js:649 

Looking at the line 649 in the instanceWizard.js: 

if (args.$wizard.find('input[name=disk-min-iops]').val().length > 0) { 

So, it seems to be looking for the disk-min-iops value, which is not defined 
during the wizard creation. I do not recall ever being required to specify 
these values in the past, so I am not sure why it needs them all of a 
sudden after performing an upgrade from ACS 4.3.2. 
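
A minimal defensive rewrite of that check, assuming the control can simply be 
absent from the wizard's DOM (a sketch only, not necessarily the fix that 
shipped upstream): 

    // jQuery's .val() returns undefined on an empty selection, so calling
    // .length on it throws exactly the TypeError reported above. Caching
    // the selection and testing it first avoids the crash when the
    // min-IOPS input was never rendered for the chosen offering.
    var $minIops = args.$wizard.find('input[name=disk-min-iops]');
    if ($minIops.length > 0 && $minIops.is(':visible') &&
            $minIops.val().length > 0) {
        // ... read the value only when the field actually exists ...
    }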

Any idea anyone? 

Cheers 

- Original Message -

 From: Mike Tutkowski mike.tutkow...@solidfire.com
 To: dev@cloudstack.apache.org
 Sent: Monday, 2 February, 2015 9:25:31 PM
 Subject: Re: Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

 Hey Andrei,

 Are you familiar with debugging in your web browser?

 One thing you could try is to set a breakpoint in instanceWizard.js
 where
 deployVirtualMachine is invoked and see what happens.

 Talk to you later,
 Mike

 On Mon, Feb 2, 2015 at 2:16 PM, Andrei Mikhailovsky
 and...@arhont.com
 wrote:

  Mike, you are absolutely right, thanks! The delete function has
  been
  hidden under the Zones tab (in my version of GUI it is shown as
  label.zones). So, this one is sorted out.
 
  Now, I wonder how to fix the major issue #1 - unable to create new
  vm
  instances? Anyone any thoughts?
 
  Thanks
 
  Andrei
 
  - Original Message -
 
   From: Mike Tutkowski mike.tutkow...@solidfire.com
   To: dev@cloudstack.apache.org
   Sent: Monday, 2 February, 2015 7:38:39 PM
   Subject: Re: Major breakage in GUI after upgrade from 4.3.2 to
   4.4.2
 
   I wonder for your Major issue #2 if you have drilled down into
   the
   applicable zone from which you want to delete the template?
 
   I had trouble finding this at one point, as well.
 
   I don't have easy access to a 4.4 GUI at the time being, but in
   4.6
   you
   need to go to Templates, click on the template in the table,
   select
   the
   Zone tab, click on the applicable zone in the table, then you see
   a
   delete
   button.
 
   On Mon, Feb 2, 2015 at 12:23 PM, Andrei Mikhailovsky
   and...@arhont.com
   wrote:
 
Hi guys,
   
Sorry for duplicating the message from the user list. I've not
got
anywhere there.
   
I've recently upgraded my ASC from version 4.3.2 to version
4.4.2.
The
upgrade process went well without any setbacks or issues. I've
not
seen any
errors in the log files. All looks good apart from the GUI
issues.
I've
tried to clear browser caches and pressed force refresh as
well.
This
happens in Firefox as well as Chrome.
   
The following major issue that i've identified so far:
   
1. I can no longer create new instances. Regardless of if I am
doing it
from the ISO or existing Templates. After following the Add
Instance wizard
and clicking on the Launch button nothing happens. The wizard
window
becomes shaded and the spinning circle appears. I've left it
for
hours
without any change. When the Launch button is pressed, the
management
server does not receive an API call to create an instance.
There
are
actually nothing in the logs after the button is pressed.
However,
I can
successfully create new instances by using the CloudMonkey
client.
2. There is no Delete button for Templates and ISOs. The Edit
and
Download
buttons are there, but not the Delete button.
   
The following minor issues that i've seen so far:
   
1. The elements in the Dashboard screen are not fitting their
corresponding boxes. They stick out and not aligning properly
2. Some Tabs are not labeled properly and instead show
something
like:
label.zones or label.add.isolated.network and a few more that
i've
noticed,
but can't recall exactly what they were. But it seems that
these
labels are
all over the place (probably about 20% of all Tabs and buttons
in
the GUI)
   
   
Has anyone else seen these types of issues with the 4.4.x
branch?
Any
thoughts on what is causing the issues and how to resolve them?
   
Thanks
   
Andrei
   
 
   --
   *Mike Tutkowski*
   *Senior CloudStack Developer, SolidFire Inc.*
   e: mike.tutkow...@solidfire.com
   o: 303.746.7302
   Advancing the way the world uses the cloud
   http://solidfire.com/solution/overview/?video=play*™*
 

 --
 *Mike Tutkowski*
 *Senior CloudStack Developer, SolidFire Inc.*
 e: mike.tutkow...@solidfire.com
 o: 303.746.7302
 Advancing the way the world uses

Major breakage in GUI after upgrade from 4.3.2 to 4.4.2

2015-02-02 Thread Andrei Mikhailovsky
Hi guys, 

Sorry for duplicating the message from the user list. I've not got anywhere 
there.

I've recently upgraded my ACS from version 4.3.2 to version 4.4.2. The upgrade 
process went well without any setbacks or issues. I've not seen any errors in 
the log files. All looks good apart from the GUI issues. I've tried to clear 
browser caches and pressed force refresh as well. This happens in Firefox as 
well as Chrome. 

The following major issue that i've identified so far: 

1. I can no longer create new instances, regardless of whether I am doing it from 
an ISO or an existing Template. After following the Add Instance wizard and 
clicking on the Launch button, nothing happens. The wizard window becomes shaded 
and the spinning circle appears. I've left it for hours without any change. 
When the Launch button is pressed, the management server does not receive an 
API call to create an instance. There is actually nothing in the logs after 
the button is pressed. However, I can successfully create new instances by 
using the CloudMonkey client (a sketch of the equivalent direct API call 
follows below).
2. There is no Delete button for Templates and ISOs. The Edit and Download 
buttons are there, but not the Delete button.

The following minor issues that i've seen so far: 

1. The elements in the Dashboard screen do not fit their corresponding 
boxes. They stick out and do not align properly. 
2. Some Tabs are not labeled properly and instead show something like 
label.zones or label.add.isolated.network, and a few more that I've noticed but 
can't recall exactly. It seems that these raw labels are all over the place 
(probably about 20% of all Tabs and buttons in the GUI).


Has anyone else seen these types of issues with the 4.4.x branch? Any thoughts 
on what is causing the issues and how to resolve them? 

Thanks 

Andrei 
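
For reference, the CloudMonkey workaround in major issue 1 boils down to 
calling deployVirtualMachine directly against the API. A minimal Node.js sketch 
of the same call; the endpoint, keys and UUIDs are placeholders, not values 
from this thread: 

    var crypto = require('crypto');
    var https = require('https');

    var endpoint  = 'https://mgmt.example.com/client/api';
    var apiKey    = 'YOUR-API-KEY';
    var secretKey = 'YOUR-SECRET-KEY';

    var params = {
        command: 'deployVirtualMachine',
        serviceofferingid: 'SERVICE-OFFERING-UUID',
        templateid: 'TEMPLATE-UUID',
        zoneid: 'ZONE-UUID',
        response: 'json',
        apiKey: apiKey
    };

    // CloudStack signing: sort the pairs, lowercase the whole query string,
    // HMAC-SHA1 it with the secret key and base64-encode the digest.
    var query = Object.keys(params).sort().map(function (k) {
        return k + '=' + encodeURIComponent(params[k]);
    }).join('&');
    var signature = crypto.createHmac('sha1', secretKey)
                          .update(query.toLowerCase())
                          .digest('base64');

    https.get(endpoint + '?' + query + '&signature=' +
              encodeURIComponent(signature), function (res) {
        res.setEncoding('utf8');
        res.on('data', function (chunk) { process.stdout.write(chunk); });
    });

CloudMonkey does the same signing under the hood, so if this call succeeds 
while the wizard hangs, the fault is firmly in the GUI layer. 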


Re: [DISCUSS] we need a better SSVM solution

2015-01-29 Thread Andrei Mikhailovsky
I am also +1 on this. 

For large deployments, the ability to automatically upgrade zone- or 
region-level system vms is a must-have feature. I think that ACS should not only 
automatically upgrade the templates, but also have the option to automatically 
upgrade the running system vms. 

It would also be awesome if ACS could fire up a temporary/redundant virtual 
router before upgrading the live one, similar to what the redundant virtual 
routers do; this would minimise the downtime. Once the live router is upgraded 
and switched to the master/primary role, the temporary one could be 
automatically deleted. 

Andrei 

- Original Message -

 From: Daan Hoogland daan.hoogl...@gmail.com
 To: dev dev@cloudstack.apache.org
 Sent: Thursday, 29 January, 2015 10:52:53 AM
 Subject: Re: [DISCUSS] we need a better SSVM solution

 I don't like the puppet/chef idea but at Schuberg Philis we use
 ansible which negates most of my opposition :p

 I would rather have an 'upload of sysvmtemplate'; the system vm template
 has some requirements, so I think we would either require it to be
 built (on the ms?) or have it checked during upload. At least the MS
 should allow for automatic update. Remi and I got some inspiration last
 night from our update of about 200 routers and some ssvm's and cpvm's. To
 cut it short: I'm with scenario 1.

 On Wed, Jan 28, 2015 at 10:09 PM, Andrija Panic
 andrija.pa...@gmail.com wrote:
  +1 !
  On Jan 28, 2015 10:01 PM, Erik Weber terbol...@gmail.com wrote:
 
  On Wed, Jan 28, 2015 at 9:44 PM, John Kinsella j...@stratosec.co
  wrote:
 
   Every time there’s an issue (security or otherwise) with the
   system VM
   ISOs, it’s a relative pain to fix. They’re sort of a closed
   system,
  people
   know little (relative to other ACS parts, IMHO) about their
   innards, and
   updating them is more difficult than it should be.
  
   I’d love to see a Better Way. I think these things could be
   dynamically
   built, with the option to have them connect to a configuration
   management
   (CM) system such as Puppet, Chef, Salt-Stack or whatever else
   floats
   people’s boat.
  
  
  Totally agree, but we should consider the fact that users might
  not use our
  builds and make it equally easy to update with a custom one.
 
  One possible use case:
   * User installs new ACS system.
   * User logs into mgmt server, goes to Templates area, clicks
   button to
   fetch default SSVM image. UI allows providing alternative URL,
   other
   options as needed.
   * (time passes)
   * Security issue is announced. User goes back into Templates
   area,
  selects
   SSVM template, clicks “Download updated template” and it does.
   Under
   infrastructure/system VMs and infrastrucutre/virtual routers,
   there’s
   buttons to update one or more running instances to use the new
   template
  
  
  If the user is using one of the published templates, why not just
  download
  the new one and send a notification that a new template is ready
  and that
  systemvms should be scheduled for a restart?
 
 
   Another possible use case:
   * User installs new ACS system
   * User uploads SSVM template that has CM agent configured to
   talk to
  their
   CM server (I’ve been wanting to lab this for a while now)
   * As ACS creates system VMs, they phone home to CM server, it
   provides
   them with instructions to install various packages and config as
   needed
  to
   be domr/console proxy/whatever. We provide basic “recipes” for
   CM systems
   for people to use and grow from.
   * Security issue is announced. User updates recipe in CM system,
   a few
   minutes later the SSVMs are up-to-date.
  
   Modification on that use case: We ship the SSVM with
   puppet/chef/blah
   installed, part of the SSVM “patch” process configures
   appropriate CM
   system.
  
   What might make the second use case easier would be to have some
   hooks in
   ACS that when a system is created/destroyed/modified, it informs
   3rd
  party
   via API.
  
   (Obviously API calls for all of the above to allow process
   without
   touching the UI)
  
   Thoughts?
  
  
  I've wondered for quite some time why we haven't had a simple
  checkbox in
  the template register view that says 'Use as System VM' or
  similar.
 
  Anyway, huge +1
 
  --
  Erik
 

 --
 Daan


XenServer 6.5 RC and ACS

2014-12-29 Thread Andrei Mikhailovsky
Hello guys, 

Is there a way to connect XenServer 6.5 RC to ACS? I am using 4.2.1 at the 
moment and I get an error when I try to add the host. Any unofficial tips on 
how this could be done on 4.2.1? Or do I need to wait for acs 4.5 to be out? 

Cheers 

Andrei 


Re: XenServer 6.5 RC and ACS

2014-12-29 Thread Andrei Mikhailovsky
Sorry, I do not have a lab to test. 

Could someone verify if 4.4 supports XenServer 6.5 please? 

Thanks 

Andrei 

- Original Message -

 From: Pierre-Luc Dion pd...@cloudops.com
 To: dev@cloudstack.apache.org
 Sent: Monday, 29 December, 2014 5:44:56 PM
 Subject: Re: XenServer 6.5 RC and ACS

 Hi Andrei,

 I haven't tested it but it might work with ACS 4.4. Do you have a lab to
 validate this? There is a reference to XS6.5 in the code, which is why I'm
 expecting it to work with 4.4.

 Cheers,

 PL

 On Mon, Dec 29, 2014 at 12:40 PM, Andrei Mikhailovsky
 and...@arhont.com
 wrote:

  Hello guys,
 
  Is there a way to connect XenServer 6.5 RC to ACS? I am using 4.2.1
  at the
  moment and I get an error when I try to add the host. Any
  unofficial tips
  on how this could be done on 4.2.1? Or do I need to wait for acs
  4.5 to be
  out?
 
  Cheers
 
  Andrei
 


Ubuntu 14.04 support on host servers

2014-12-10 Thread Andrei Mikhailovsky
Hello guys, 

Does anyone know if Ubuntu Server 14.04 on host servers is currently supported 
in the ACS 4.4 branch? I've seen some issues with getting the usage stats on 
ACS 4.2.1, so I am wondering if the latest ACS is ready for the latest Ubuntu 
Server LTS as a host OS. 

Cheers 

Andrei 


Is there any progress with CLOUDSTACK-4858 bug?

2014-12-04 Thread Andrei Mikhailovsky

Hello guys, 

I was wondering if there are any plans to fix CLOUDSTACK-4858, which was 
reported back in October 2013 and, despite having a Major priority, hasn't 
even been assigned to anyone as far as I can see. 

Thanks 

Andrei 



CloudMonkey 5.3 not starting on Ubuntu 14.10

2014-12-01 Thread Andrei Mikhailovsky
Hi guys, 

I am having issues starting cloudmonkey on ubuntu 14.10. I get the following 
error: 

$ cloudmonkey 
Import error in cloudmonkey.cloudmonkey : No module named packages 

I have tried to google for the error, but can't find the solution. 

I've installed it via pip install cloudmonkey. 

Could someone point me in the right direction, please? 

Thanks 

Andrei 



Re: CloudMonkey 5.3 not starting on Ubuntu 14.10

2014-12-01 Thread Andrei Mikhailovsky
Erik, 

this is a clean install, not an upgrade. I don't remember having issues on 
Ubuntu 14.04. I've also tried the easy_install method shown in the docs and 
that produced the same error. 

I am not really sure what you mean by the virtualenv. 

Andrei 
- Original Message -

 From: sebgoa run...@gmail.com
 To: dev@cloudstack.apache.org
 Sent: Monday, 1 December, 2014 10:50:05 AM
 Subject: Re: CloudMonkey 5.3 not starting on Ubuntu 14.10

 On Dec 1, 2014, at 11:44 AM, Erik Weber terbol...@gmail.com wrote:

  On Mon, Dec 1, 2014 at 11:32 AM, Andrei Mikhailovsky
  and...@arhont.com
  wrote:
 
  Hi guys,
 
  I am having issues starting cloudmonkey on ubuntu 14.10. I get the
  following error:
 
  $ cloudmonkey
  Import error in cloudmonkey.cloudmonkey : No module named packages
 
 
 
  I have the same on CentOS 6.6, cloudmonkey installed with pip
 

 Have you tried with virtualenv ?
 Are you upgrading ?

 
  --
  Erik


Re: CloudMonkey 5.3 not starting on Ubuntu 14.10

2014-12-01 Thread Andrei Mikhailovsky
Rohit, 

I've tried your suggestion and that didn't work for me. Had the same error 
message. 

However, I can confirm that running Erik's suggestion fixed my problem. 

Andrei 

-- 
Andrei Mikhailovsky 
Director 
Arhont Information Security 

Web: http://www.arhont.com 
http://www.wi-foo.com 
Tel: +44 (0)870 4431337 
Fax: +44 (0)208 429 3111 
PGP: Key ID - 0x2B3438DE 
PGP: Server - keyserver.pgp.com 

DISCLAIMER 

The information contained in this email is intended only for the use of the 
person(s) to whom it is addressed and may be confidential or contain legally 
privileged information. If you are not the intended recipient you are hereby 
notified that any perusal, use, distribution, copying or disclosure is strictly 
prohibited. If you have received this email in error please immediately advise 
us by return email at and...@arhont.com and delete and purge the email and any 
attachments without making a copy. 

- Original Message -

 From: Rohit Yadav rohit.ya...@shapeblue.com
 To: dev@cloudstack.apache.org
 Sent: Monday, 1 December, 2014 10:55:39 AM
 Subject: Re: CloudMonkey 5.3 not starting on Ubuntu 14.10

 Andrei,

 I’ve replied to your query in another thread, that tries to publish
 this issue and its fix for everyone on users@ and dev@. Thanks.

 Quick solution - remove python-pip and install pip using pip project
 page.

  On 01-Dec-2014, at 4:22 pm, Andrei Mikhailovsky and...@arhont.com
  wrote:
 
  Erik,
 
  this is a clean install, not an upgrade. I don't remember having
  issues on Ubuntu 14.04. I've also tried the easy_install method
  shown in the docs and that produced the same error.
 
  not really sure what you mean by the virtualenv.
 
  Andrei
  - Original Message -
 
  From: sebgoa run...@gmail.com
  To: dev@cloudstack.apache.org
  Sent: Monday, 1 December, 2014 10:50:05 AM
  Subject: Re: CloudMonkey 5.3 not starting on Ubuntu 14.10
 
  On Dec 1, 2014, at 11:44 AM, Erik Weber terbol...@gmail.com
  wrote:
 
  On Mon, Dec 1, 2014 at 11:32 AM, Andrei Mikhailovsky
  and...@arhont.com
  wrote:
 
  Hi guys,
 
  I am having issues starting cloudmonkey on ubuntu 14.10. I get
  the
  following error:
 
  $ cloudmonkey
  Import error in cloudmonkey.cloudmonkey : No module named
  packages
 
 
 
  I have the same on CentOS 6.6, cloudmonkey installed with pip
 
 
  Have you tried with virtualenv ?
  Are you upgrading ?
 
 
  --
  Erik

 Regards,
 Rohit Yadav
 Software Architect, ShapeBlue
 M. +91 88 262 30892 | rohit.ya...@shapeblue.com
 Blog: bhaisaab.org | Twitter: @_bhaisaab

 Find out more about ShapeBlue and our range of CloudStack related
 services

 IaaS Cloud Design 
 Buildhttp://shapeblue.com/iaas-cloud-design-and-build//
 CSForge – rapid IaaS deployment
 frameworkhttp://shapeblue.com/csforge/
 CloudStack Consultinghttp://shapeblue.com/cloudstack-consultancy/
 CloudStack Software
 Engineeringhttp://shapeblue.com/cloudstack-software-engineering/
 CloudStack Infrastructure
 Supporthttp://shapeblue.com/cloudstack-infrastructure-support/
 CloudStack Bootcamp Training
 Courseshttp://shapeblue.com/cloudstack-training/

 This email and any attachments to it may be confidential and are
 intended solely for the use of the individual to whom it is
 addressed. Any views or opinions expressed are solely those of the
 author and do not necessarily represent those of Shape Blue Ltd or
 related companies. If you are not the intended recipient of this
 email, you must neither take any action based upon its contents, nor
 copy or show it to anyone. Please contact the sender if you believe
 you have received this email in error. Shape Blue Ltd is a company
 incorporated in England  Wales. ShapeBlue Services India LLP is a
 company incorporated in India and is operated under license from
 Shape Blue Ltd. Shape Blue Brasil Consultoria Ltda is a company
 incorporated in Brasil and is operated under license from Shape Blue
 Ltd. ShapeBlue SA Pty Ltd is a company registered by The Republic of
 South Africa and is traded under license from Shape Blue Ltd.
 ShapeBlue is a registered trademark.


Re: [DISCUSS] LTS Releases

2014-11-27 Thread Andrei Mikhailovsky
ChunFeng, 

I think as long as there is a change to the current approach, it will improve 
the stability of the product. At the moment, it is clearly not working very well 
for the end users; otherwise, we would not be discussing this topic. 

As to your previous concerns, I agree: having 5 years of support is not an 
option for CloudStack, especially taking into consideration the dynamic 
development and the current maturity of the product. It has to be more 
frequent. Perhaps the LTS-equivalent version should be released with every 
two/three releases of the non-LTS release. Two/three release cycles should be 
enough time to community-test the new features and correct the bugs for the 
stable release. 

IMHO the naming concept is less important as long as the documentation and 
release notes make a distinction. Having fancy letters at the end of the 
product name is a marketing/PR/sales job :). Some companies use LTS, others GA, 
others simply use odd/even version numbering to distinguish between the 
production and testing releases. 

One issue that I foresee with the LTS / non-LTS release cycles is that the 
non-LTS releases might have a smaller userbase, as a lot of users want to have 
a production-ready system and would perhaps be less likely to install and test 
the non-LTS release. 

Andrei 
- Original Message -

 From: ChunFeng chunf...@domolo.com
 To: dev dev@cloudstack.apache.org
 Sent: Thursday, 27 November, 2014 10:36:46 AM
 Subject: Re: [DISCUSS] LTS Releases

 hi,

 LTS means Long Term Support; for Ubuntu it means 5 years of support for
 both desktop and server distributions. If we adopt an LTS release for
 CloudStack, how many years should we support it? 5 years? Of course we
 cannot accept that! Even 3 years may be too long for an IaaS management
 product. 2 years, or 1? That could hardly be called an LTS version.

 Second, the time span for Ubuntu to release the next LTS version is
 every 2 years. Then, considering the first question, what interval in
 years should we take?

 Third, even if the above two issues are false positives, how should we
 name the LTS releases? Names such as CloudStack-LTS-4.x-201401,
 CloudStack-LTS-4.X-201402, etc. may make it more confusing for
 end-users to make the right choice.

 IMO, software that can automatically upgrade to a newer version with
 just one or a few clicks is the kind suited to an LTS release
 mechanism; otherwise the cost for the end-user of upgrading from an
 older release to a newer one is the same as moving to the next
 release, i.e. a reinstall.

 As Hugo, sebgoa and Andrija said: "we need a WAY to focus here on
 FIXING, POLISHING", then LTS becomes less important, and "I'm not
 in favor of supporting LTS releases as a community."

 --

 Regards,

 ChunFeng

 -- Original --
 From: sebgoarun...@gmail.com;
 Date: Thu, Nov 27, 2014 05:14 PM
 To: devdev@cloudstack.apache.org;

 Subject: Re: [DISCUSS] LTS Releases

 On Nov 27, 2014, at 9:01 AM, Andrija Panic andrija.pa...@gmail.com
 wrote:

  my 2 cents again:
 
  Whether we have this LTS release or not - is not just about having
  release
  - we need a WAY to focus here on FIXING, POLISHING product and more
  important to stimulate/make developers interested in doing so.
  If this was company owned product, it would be very easy to set
  goals, and
  then speak to devs, fix this, fix that.
 
  Since this is of course a community-based product - we need some way of
  focusing on fixing the stuff, and really stop adding features
  (maybe not
  completely quit adding features, but...)
 
  One important note, and possible scenario - just by having LTS
  release, but
  still having the majority of developers working on the non-LTS release
  (adding new
  features) is a big probability, and then we are back to zero with
  our
  progress, so I guess this is also an option/problem that we need to
  consider.
 
  I have a very nice experience with CloudStack so far (in general,
  except
  being frustrated by some childish problems) - if this was all
  polished, and
  documentation complete - I'm 100% sure there will be no better
  cloud
  project on the market any time soon, and I really mean it !
 
  It is my wish (and I hope of others) to see CloudStack migrate from
  #CloudstackWorks to #CloudStackRocks for the next CCC and I think
  this is
  VERY much possible.
 

 Thanks for this Andrija, it made my morning :)

 I am of the opinion that if/when we improve our committing habits, we
 will have high confidence that every bug fixed in a release branch
 will also be fixed in the next release.

 Little process changing that we are making, like using github PR,
 merging back to master etc, will help us get into somewhat of a
 rolling release.

 If we take great care with our upgrade paths and avoid regressions
 then LTS becomes less important. We have had some challenges with
 4.4 and the fact that 4.3 is solid makes it natural to want 

Re: [DISCUSS] LTS Releases

2014-11-25 Thread Andrei Mikhailovsky
- Original Message -

 Hi,

 During CCCEU14 conference and over emails, I spoke with many
 CloudStack users and I think most of us would like to have and use
 LTS releases. I propose that;

 - We encourage a habit to backport a bugfix to all qualifying
 branches whether or not those branches are LTS
 - We contribute (unit, integration) tests on LTS branches as well on
 other qualifying branches
 - We put correct affect version and fix version on JIRA so issues
 that should be backported to a branch are identified
 - We adapt the LTS release model from Fedora/Ubuntu projects. Please
 share ideas, comments?
 - We officially recognize a LTS release branch, say 4.3 now and
 everyone helps to maintain it, backport bugfixes etc.
 - Until a next latest stable release is published that we all
 mutually agree, we keep working on the LTS branch. After say we have
 a stable 4.5.0 or 4.5.1 release, we can agree to recognize 4.5 as
 our next LTS branch and work on it.

 Having a robust product release means we all (developers, users,
 sysadmins, ops etc.) can save time consumed on firefighting a
 CloudStack cloud. Having a LTS branch and releases will get us there
 because on a LTS release/support branch we don’t do feature work at
 all and we only invest time to do bugfixing etc.

+1 with everything. It is essential for end users to have bug-fix releases 
instead of waiting for the next release to come. I've noticed that with the 
CloudStack project the majority of recent releases have been delayed from 
their initial estimated dates. This creates a lot of false expectations and 
false hopes for the end users who are waiting for the bug fixes. I guess a lot 
of production users would rather see a bug being fixed than get a bunch of new 
features, which are likely to be broken or unpolished in the first release. 
Also, new releases are likely to introduce additional issues upon upgrading, 
forcing people to downgrade back to their old releases with old unfixed bugs. 
An LTS release would solve a lot of issues and frustrations and should 
actually be beneficial to the project and community. 

In my opinion the Ubuntu team has captured the release cycles perfectly well. 
Perhaps ACS should have a stable release every 2 years and a testing release 
every 2 or 4 quarters. This way, the users will be happy to have a solid, 
backported platform that they can run in production, and the developers will 
be happy working on a new feature set. 

 ShapeBlue is already serving its customers with a product patching
 service, and using our own package hosting
 (http://shapeblue.com/packages) we publish patches to the “main”
 repository for everyone. We also publish details of each patch on
 our GitHub wiki; see this example:
 https://github.com/shapeblue/cloudstack/wiki/Release-Notes:-ACS-4.4.1-ShapeBlue-Patch01
 We’ve recently started publishing patches and release notes publicly
 (rather than just over email), so you’ll see more of these in the
 future. When we make patches we push the changes to the upstream
 branches as well; in fact, we fix upstream first.

Kudos to the ShapeBlue team!!! Many thanks for your contributions and your 
help in promoting this project. I love you guys!!! 

 In our experience the 4.3.x releases are the most stable, so we’re
 backporting bugfixes from 4.4/4.5/master. I’m personally going
 through a list of JIRA issues which have affects-version 4.3.0
 and/or 4.3.1 but where the bugfix either does not exist yet or
 exists only in a non-4.3 branch.
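
 A rough JQL filter along these lines should surface those candidates
 (the project key and version numbers are illustrative):

 project = CLOUDSTACK AND affectedVersion in ("4.3.0", "4.3.1")
 AND (fixVersion is EMPTY OR fixVersion not in ("4.3.1", "4.3.2"))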

 Regards,
 Rohit Yadav
 Software Architect, ShapeBlue
 M. +91 88 262 30892 | rohit.ya...@shapeblue.com
 Blog: bhaisaab.org | Twitter: @_bhaisaab

 Find out more about ShapeBlue and our range of CloudStack related
 services

 IaaS Cloud Design & Build: http://shapeblue.com/iaas-cloud-design-and-build/
 CSForge – rapid IaaS deployment framework: http://shapeblue.com/csforge/
 CloudStack Consulting: http://shapeblue.com/cloudstack-consultancy/
 CloudStack Software Engineering: http://shapeblue.com/cloudstack-software-engineering/
 CloudStack Infrastructure Support: http://shapeblue.com/cloudstack-infrastructure-support/
 CloudStack Bootcamp Training Courses: http://shapeblue.com/cloudstack-training/


Re: Template or root Volume VHD not able to run on standalone xenserver

2014-08-11 Thread Andrei Mikhailovsky
Not sure if this is related, but the only time I've had issues with starting 
VMs with XenServer + CloudStack was when vhd-util was not copied to the 
/opt/xensource/bin folder. Make sure the file is there.
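
A quick way to verify, assuming you can SSH to the XenServer host (the image 
path below is a placeholder):

$ ls -l /opt/xensource/bin/vhd-util       # is the binary there and executable?
$ vhd-util check -n /path/to/image.vhd    # sanity-check a suspect VHD
$ vhd-util query -n /path/to/image.vhd -v # print the image's virtual size

If the binary is missing, copy the vhd-util your management server uses into 
/opt/xensource/bin on the host and chmod +x it.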

Andrei



- Original Message -
From: Tejas Gadaria refond.g...@gmail.com
To: us...@cloudstack.apache.org, dev@cloudstack.apache.org
Sent: Monday, 11 August, 2014 1:27:43 PM
Subject: Template or root Volume VHD not able to run on standalone xenserver

Hi,

I am using ACS 4.2 with XenServer 6.2 SP1.
Also I have .vhd files of a template & root volume & want to run them on a
standalone XenServer, but while trying to import the vhd from the import
wizard in XenCenter it says 'VHD is not suitable for xenserver'.

Is it related to the vhd-util that we import on the CS management server?
How can I resolve this issue?

Regards,
Tejas


Re: [ACS44] release 4.4.1

2014-08-07 Thread Andrei Mikhailovsky
Daan, apart from the template issues, which I believe were already solved, are 
there any other issues with KVM that might prevent one from upgrading?

I was planning to upgrade this weekend, but am now considering postponing 
the upgrade.

Thanks



-- 
Andrei Mikhailovsky
Director
Arhont Information Security

Web: http://www.arhont.com
http://www.wi-foo.com
Tel: +44 (0)870 4431337
Fax: +44 (0)208 429 3111
PGP: Key ID - 0x2B3438DE
PGP: Server - keyserver.pgp.com



- Original Message -
From: Daan Hoogland daan.hoogl...@gmail.com
To: dev dev@cloudstack.apache.org
Cc: us...@cloudstack.apache.org
Sent: Thursday, 7 August, 2014 1:58:45 PM
Subject: Re: [ACS44] release 4.4.1

It depends on what you are using, Xerex.

Not for KVM, it seems, and there are some problems with the Xen systemvm.

On Thu, Aug 7, 2014 at 2:55 PM, Xerex Bueno xbu...@lpsintegration.com wrote:
 Is 4.4.0 more reliable/stable than 4.3?




 On 8/6/14, 3:54 AM, Daan Hoogland daan.hoogl...@gmail.com wrote:

People,

Since there are some problems in 4.4.0 I am planning a bugfix release.

I created a wiki page for it. This only contains dates so far. I am
offline starting the 16th, so I want to have it out by the 14th,
optimist that I am. For this to happen a successful RC must be
available by the 10th. This may be a tight schedule, but I hope
everybody is putting their full attention into getting a viable 4.4 out
there.

Please add any important info to
https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+4.4.1+BugFix+Release

Thanks
--
Daan


 





-- 
Daan

