Re: PKIX path building failed

2017-12-18 Thread Makrand
Hi,

Assuming your SSVM is up and running, have you tried to register any
other template? Did it work?

--
Makrand


On Tue, Dec 19, 2017 at 11:38 AM, Jagdish Patil 
wrote:

> Thanks for responding Makrand,
>
> This error is occurring as soon as system VM gets downloaded and started
> and tries to download the CentOS5.5-KVM(No GUI) template. And I have also
> tried giving other HTTP URLs for registering but still, the PKIX path
> building error pops up.
>
> I have attached a part of the log, have a look.
>
>
>
> On Mon, Dec 18, 2017 at 5:29 PM Makrand  wrote:
>
>> Hello Jagdish,
>>
>> What is source URL of template you're trying to register?
>>
>> And when exactly this error is appearing? Immediately after you try to
>> register a template?
>>
>>
>> --
>> Makrand
>>
>>
>> On Fri, Dec 15, 2017 at 1:01 PM, Jagdish Patil > >
>> wrote:
>>
>> > Hey Guys,
>> >
>> > I am facing the following issue with this configuration:
>> >
>> > *Configuration:*
>> > CloudStack Version: 4.9
>> > OS: CentOS 6.8(X86_64)
>> > Hypervisor: KVM
>> > CIDR:24
>> >
>> > *Issue:*
>> >
>> > *Failed to register template: 4fe0b968-e02a-11e7-939c-f8a9632f48e1 with
>> > error: sun.security.validator.ValidatorException: PKIX path building
>> > failed: sun.security.provider.certpath.SunCertPathBuilderException:
>> unable
>> > to find valid certification path to requested target*
>> >
>> > There are solutions given by multiple peoples on the internet but none
>> of
>> > them are helping me. Please help.
>> >
>> > Thank You,
>> > Jagdish Patil,
>> > (B.Tech-Cloud Based Application: IBM)
>> > M:8735828606
>> > E:jagdishpatil...@gmail.com
>> >
>>
>


Re: PKIX path building failed

2017-12-18 Thread Jagdish Patil
Thanks for responding Makrand,

This error occurs as soon as the system VM gets downloaded and started
and tries to download the CentOS 5.5-KVM (No GUI) template. I have also
tried giving other HTTP URLs for registering, but the PKIX path
building error still pops up.

I have attached a part of the log, have a look.
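For what it's worth, the PKIX error means the JVM doing the download does not
trust the certificate chain presented by the HTTPS URL. A minimal, standalone
sketch (the URL is a placeholder) that reproduces the same trust check outside
CloudStack:

import java.net.URL;
import javax.net.ssl.HttpsURLConnection;

public class TlsTrustCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder URL -- replace with the HTTPS template URL you are registering.
        String templateUrl = args.length > 0 ? args[0] : "https://example.com/template.qcow2";
        HttpsURLConnection conn = (HttpsURLConnection) new URL(templateUrl).openConnection();
        conn.setRequestMethod("HEAD");
        // connect() fails with an SSLHandshakeException (carrying the same
        // "PKIX path building failed" message) if the JVM truststore does not
        // contain a CA that signed the server's certificate.
        conn.connect();
        System.out.println("Certificate trusted, HTTP status: " + conn.getResponseCode());
        conn.disconnect();
    }
}

If this fails with the same PKIX message, the usual options are to import the
site's CA certificate into the JVM truststore used for the download (keytool
-importcert) or to register the template over plain HTTP.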



On Mon, Dec 18, 2017 at 5:29 PM Makrand  wrote:

> Hello Jagdish,
>
> What is source URL of template you're trying to register?
>
> And when exactly this error is appearing? Immediately after you try to
> register a template?
>
>
> --
> Makrand
>
>
> On Fri, Dec 15, 2017 at 1:01 PM, Jagdish Patil 
> wrote:
>
> > Hey Guys,
> >
> > I am facing the following issue with this configuration:
> >
> > *Configuration:*
> > CloudStack Version: 4.9
> > OS: CentOS 6.8(X86_64)
> > Hypervisor: KVM
> > CIDR:24
> >
> > *Issue:*
> >
> > *Failed to register template: 4fe0b968-e02a-11e7-939c-f8a9632f48e1 with
> > error: sun.security.validator.ValidatorException: PKIX path building
> > failed: sun.security.provider.certpath.SunCertPathBuilderException:
> unable
> > to find valid certification path to requested target*
> >
> > There are solutions given by multiple peoples on the internet but none of
> > them are helping me. Please help.
> >
> > Thank You,
> > Jagdish Patil,
> > (B.Tech-Cloud Based Application: IBM)
> > M:8735828606
> > E:jagdishpatil...@gmail.com
> >
>
2017-12-13 23:54:08,357 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-1:ctx-50b2a87e job-12/job-16 ctx-ca176482) (logid:1af01289) 
Publish async job-16 complete on message bus
2017-12-13 23:54:08,357 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-1:ctx-50b2a87e job-12/job-16 ctx-ca176482) (logid:1af01289) 
Wake up jobs related to job-16
2017-12-13 23:54:08,357 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-1:ctx-50b2a87e job-12/job-16 ctx-ca176482) (logid:1af01289) 
Update db status for job-16
2017-12-13 23:54:08,359 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-1:ctx-50b2a87e job-12/job-16 ctx-ca176482) (logid:1af01289) 
Wake up jobs joined with job-16 and disjoin all subjobs created from job-16
2017-12-13 23:54:08,579 DEBUG [c.c.v.VmWorkJobDispatcher] 
(Work-Job-Executor-1:ctx-50b2a87e job-12/job-16) (logid:1af01289) Done with run 
of VM work job: com.cloud.vm.VmWorkStart for VM 2, job origin: 12
2017-12-13 23:54:08,579 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(Work-Job-Executor-1:ctx-50b2a87e job-12/job-16) (logid:1af01289) Done 
executing com.cloud.vm.VmWorkStart for job-16
2017-12-13 23:54:08,581 INFO  [o.a.c.f.j.i.AsyncJobMonitor] 
(Work-Job-Executor-1:ctx-50b2a87e job-12/job-16) (logid:1af01289) Remove job-16 
from job monitoring
2017-12-13 23:54:08,754 DEBUG [c.c.a.SecondaryStorageVmAlertAdapter] 
(secstorage-1:ctx-085700c7) (logid:208be3b0) received secondary storage vm alert
2017-12-13 23:54:08,755 DEBUG [c.c.a.SecondaryStorageVmAlertAdapter] 
(secstorage-1:ctx-085700c7) (logid:208be3b0) Secondary Storage Vm is up, zone: 
Zone1, secStorageVm: s-2-VM, public IP: 172.16.0.113, private IP: 172.16.0.110
2017-12-13 23:54:08,756 WARN  [o.a.c.alerts] (secstorage-1:ctx-085700c7) 
(logid:208be3b0)  alertType:: 19 // dataCenterId:: 1 // podId:: 1 // 
clusterId:: null // message:: Secondary Storage Vm up in zone: Zone1, 
secStorageVm: s-2-VM, public IP: 172.16.0.113, private IP: 172.16.0.110
2017-12-13 23:54:08,993 INFO  [o.a.c.s.SecondaryStorageManagerImpl] 
(secstorage-1:ctx-085700c7) (logid:208be3b0) Secondary storage vm s-2-VM is 
started
2017-12-13 23:54:08,993 INFO  [o.a.c.s.PremiumSecondaryStorageManagerImpl] 
(secstorage-1:ctx-085700c7) (logid:208be3b0) Primary secondary storage is not 
even started, wait until next turn
2017-12-13 23:54:09,016 DEBUG [o.a.c.s.SecondaryStorageManagerImpl] 
(secstorage-1:ctx-89e6698a) (logid:ef8a2b1d) Zone 1 is ready to launch 
secondary storage VM
2017-12-13 23:54:09,858 INFO  [o.a.c.f.j.i.AsyncJobManagerImpl] 
(AsyncJobMgr-Heartbeat-1:ctx-754725c3) (logid:17ae7b53) Begin cleanup expired 
async-jobs
2017-12-13 23:54:09,866 INFO  [o.a.c.f.j.i.AsyncJobManagerImpl] 
(AsyncJobMgr-Heartbeat-1:ctx-754725c3) (logid:17ae7b53) End cleanup expired 
async-jobs
2017-12-13 23:54:15,371 DEBUG [c.c.a.t.Request] (AgentManager-Handler-2:null) 
(logid:) Seq -1-0: Scheduling the first command  { Cmd , MgmtId: -1, via: -1, 
Ver: v1, Flags: 101, 

RE: Not able to create the Secondary Storage and Console proxy Vms

2017-12-18 Thread Dickson Lam (dilam)
Hi Glenn:

Have you had a chance to look at the log file that I uploaded to pastebin?

Regards
Dickson

-Original Message-
From: Dickson Lam (dilam) 
Sent: Friday, December 01, 2017 11:37 AM
To: users@cloudstack.apache.org
Subject: RE: Not able to create the Secondary Storage and Console proxy Vms

Hi Glenn:

The log file is too big. I could only paste the first 2300 lines of the log to 
pastebin. Hopefully, it will have enough information. If not, please let 
me know. The following is the link:
https://pastebin.com/twK6wnjr

The title is Create Secondary Storage Problem.

Thanks
Dickson

-Original Message-
From: Glenn Wagner [mailto:glenn.wag...@shapeblue.com] 
Sent: Friday, December 01, 2017 9:04 AM
To: users@cloudstack.apache.org
Subject: RE: Not able to create the Secondary Storage and Console proxy Vms

Hi,

You can upload your logs to https://pastebin.com/ and then send us the link 
that gets generated once uploaded.

Regards
Glenn




glenn.wag...@shapeblue.com
www.shapeblue.com
Winter Suite, 1st Floor, The Avenues, Drama Street, Somerset West, Cape Town 
7129, South Africa @shapeblue
  
 


-Original Message-
From: Dickson Lam (dilam) [mailto:di...@cisco.com]
Sent: Friday, 01 December 2017 5:00 PM
To: users@cloudstack.apache.org
Subject: RE: Not able to create the Secondary Storage and Console proxy Vms

Hi Glenn:

Thanks for your reply, but sorry, I am new; can you tell me how to upload the 
logs to pastebin? I looked at the 
http://mail-archives.apache.org/mod_mbox/cloudstack-users/ site and did not see 
anything that would allow me to upload a file.
Yes, I have run the following on the Management server to prepare for the 
system vm template:

/usr/share/cloudstack-common/scripts/storage/secondary/cloud-install-sys-tmplt \
  -m /export/secondary \
  -u http://cloudstack.apt-get.eu/systemvm/4.6/systemvm64template-4.6.0-vmware.ova \
  -h vmware \
  -F

Yes, the reserved system IP range is on the 10.89.98.x subnet and the Management 
server is on 10.89.118.x. These two subnets are routable and communicate with 
each other through layer 3.

Regards
Dickson

-Original Message-
From: Glenn Wagner [mailto:glenn.wag...@shapeblue.com]
Sent: Friday, December 01, 2017 6:14 AM
To: users@cloudstack.apache.org
Subject: RE: Not able to create the Secondary Storage and Console proxy Vms

Hi,

Could you upload your management server logs to pastebin so we can have a look? 
Did you seed the system VM template before you started the cloudstack-management 
service?

To answer your question: the system VMs will use the reserved system IP addresses, 
the management server will need to communicate with the system VMs over SSH, and 
the CloudStack agent on the system VMs will communicate with the management 
server on port 8250.
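As a side note, a minimal sketch of a plain TCP reachability check (host and
port are placeholders; for example, run it from the system VM towards the
management server IP on port 8250):

import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    public static void main(String[] args) throws Exception {
        // Placeholders -- e.g. the management server IP and the agent port 8250.
        String host = args.length > 0 ? args[0] : "10.89.118.109";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8250;
        try (Socket socket = new Socket()) {
            // 5 second connect timeout; throws if the port is unreachable or filtered.
            socket.connect(new InetSocketAddress(host, port), 5000);
            System.out.println("Reachable: " + host + ":" + port);
        }
    }
}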

Regards
Glenn



glenn.wag...@shapeblue.com
www.shapeblue.com
Winter Suite, 1st Floor, The Avenues, Drama Street, Somerset West, Cape Town 
7129, South Africa @shapeblue
  
 


-Original Message-
From: Dickson Lam (dilam) [mailto:di...@cisco.com]
Sent: Thursday, 30 November 2017 6:23 PM
To: users@cloudstack.apache.org
Subject: Not able to create the Secondary Storage and Console proxy Vms

Hi all:

I am new here and need some help setting up CloudStack 4.9 to manage VMware 5.5 
ESXi hosts on a vCenter for a demo. I got the management server up and running, 
but the Secondary Storage VM and Console Proxy VM fail to be created. The 
following are the errors:

Secondary Storage Vm creation failure. zone: Zone-Orion, error details: null 
Console proxy creation failure. zone: Zone-Orion, error details: null

I have installed the CloudStack 4.9 Management Server on a CentOS 7 VM. The setup 
includes one ESXi host with a VM datastore. The NFS-mounted secondary storage has 
200G of disk space on the Management Server.
The ESXi host is on VLAN 98 with IP 10.89.98.144 and is located in the OrionTest 
datacenter. The Management Server VM is on VLAN 118 with IP address 
10.89.118.109, which is in another datacenter.

I have run the following on the Management server to prepare for the system vm 
template:

/usr/share/cloudstack-common/scripts/storage/secondary/cloud-install-sys-tmplt \
  -m /export/secondary \
  -u http://cloudstack.apt-get.eu/systemvm/4.6/systemvm64template-4.6.0-vmware.ova \
  -h vmware \
  -F

During the configuration, I set the Public traffic range from 10.89.98.210 to 
10.89.98.220, the Reserved System IP range from 10.89.98.244 to 10.89.98.254, and 
the Storage traffic range from 10.89.98.147 to 10.89.98.154.

Please point me in a direction: what is the problem and how do I resolve it? Also, 
do the Management Server and the System VMs need to be on the same subnet?

Regards
Dickson



Re: [UPDATE] Debian 9 "stretch" systemvmtemplate for master

2017-12-18 Thread Rohit Yadav
Hi Wido,

Thanks. I've verified that virtio-scsi seems to work for me. The Qemu guest agent 
also works, and I was able to write PoC code to get rid of patchviasocket.py as 
well. Can you help review and test the PR?

Regards.

From: Wido den Hollander 
Sent: Saturday, December 9, 2017 12:32:24 AM
To: d...@cloudstack.apache.org; Rohit Yadav
Cc: users@cloudstack.apache.org
Subject: Re: [UPDATE] Debian 9 "stretch" systemvmtemplate for master

Awesome work!

More replies below

On 12/08/2017 03:58 PM, Rohit Yadav wrote:
> All,
>
>
> Our effort to move to Debian9 systemvmtemplate seems to be soon coming to 
> conclusion, the following high-level goals have been achieved so far:
>
>
> - Several infra improvements such as faster patching (no reboots on 
> patching), smaller setup/patch scripts and even smaller cloud-early-config, 
> old file cleanups and directory/filesystem refactorings
>
> - Tested and boots/runs on KVM, VMware, XenServer and HyperV (thanks to Paul 
> for hyperv)
>
> - Boots, patches, runs systemvm/VR in about 10s (tested with KVM/XenServer 
> and NFS+SSDs) with faster console-proxy (cloud) service launch
>
> - Disk size reduced to 2GB from the previous 3+GB with still bigger /var/log 
> partition
>
> - Migration to systemd based cloud-early-config, cloud services etc (thanks 
> Wido!)
>
> - Strongswan provided vpn/ipsec improvements (ports based on work from 
> Will/Syed)
>
> - Several fixes to redundant virtual routers and scripts for VPC (ports from 
> Remi json/gzip PR and additional fixes/improvements to execute update_config 
> faster)
>
> - Packages installation improvements (thanks to Rene for review)
>
> - Several integration test fixes -- all smoke tests passing on KVM and most 
> on XenServer, work on fixing VMware test failures is on-going
>
> - Several UI/UX improvements and systemvm python codebase linting/unit tests 
> added to Travis
>
>
> Here's the pull request:
>
> https://github.com/apache/cloudstack/pull/2211
>
>
> I've temporarily hosted the templates here:
>
> http://hydra.yadav.xyz/debian9/
>
>
> Outstanding tasks/issues:
>
> - Should we skip rVR related tests for VMware noting a reference to a jira 
> ticket to renable them once the feature is support for VMware?
>
> - Fix intermittent failures for XenServer and test failures on VMware
>
> - Misc issues and items (full checklist available on the PR)
>
> - Review and additional test effort from community
>
>
> After your due review, if we're able to show that the test results are on par 
> with the previous 4.9.2.0/4.9.3.0 smoke test results (i.e. most are passing) 
> on XenServer, KVM, and VMware I would proceed with merging the PR by end of 
> this month. Thoughts, comments?
>

We might want to look/verify that this works:

- Running with VirtIO-SCSI under KVM (allows disk trimming)
- Make sure the Qemu Guest Agent works

If those two things work we can keep the footprint of the SSVM rather small.

Wido

>
> Regards.
>
> rohit.ya...@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>
>
>
>

rohit.ya...@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London WC2N 4HS, UK
@shapeblue
  
 



Re: [UPDATE] Debian 9 "stretch" systemvmtemplate for master

2017-12-18 Thread Rohit Yadav
All,

Thanks for your feedback.

We're getting close to completion now. All smoke tests are now passing on KVM, 
XenServer and VMware. There are, however, a few intermittent failures on VMware 
being looked into. The rVR smoke test failures on VMware have been fixed as 
well.

The systemvmtemplate build has now been migrated to Packer, making it easier for 
anyone to build systemvm templates. Overall, VRs are now 2x to 3x faster and 
lighter (disk size reduced by 1.2GB), require no reboot after patching, the 
systemvm Python code has been improved, and the strongswan-provided VPN/IPsec is 
more robust, along with rVR functionality on KVM and XenServer and good support 
for VMware (which still needs further improvements). Overall, PR 2211 also aims 
to stabilize the master branch. The outstanding task is to improve some tests to 
avoid environment-introduced failures and to update the SQL/DB upgrade path, 
which is ongoing.

Given the current state, and with the smoke tests passing, I would like to request 
your comments and reviews on pull request 2211: 
https://github.com/apache/cloudstack/pull/2211


Regards.

From: Wido den Hollander 
Sent: Saturday, December 9, 2017 12:32:24 AM
To: d...@cloudstack.apache.org; Rohit Yadav
Cc: users@cloudstack.apache.org
Subject: Re: [UPDATE] Debian 9 "stretch" systemvmtemplate for master

Awesome work!

More replies below

On 12/08/2017 03:58 PM, Rohit Yadav wrote:
> All,
>
>
> Our effort to move to Debian9 systemvmtemplate seems to be soon coming to 
> conclusion, the following high-level goals have been achieved so far:
>
>
> - Several infra improvements such as faster patching (no reboots on 
> patching), smaller setup/patch scripts and even smaller cloud-early-config, 
> old file cleanups and directory/filesystem refactorings
>
> - Tested and boots/runs on KVM, VMware, XenServer and HyperV (thanks to Paul 
> for hyperv)
>
> - Boots, patches, runs systemvm/VR in about 10s (tested with KVM/XenServer 
> and NFS+SSDs) with faster console-proxy (cloud) service launch
>
> - Disk size reduced to 2GB from the previous 3+GB with still bigger /var/log 
> partition
>
> - Migration to systemd based cloud-early-config, cloud services etc (thanks 
> Wido!)
>
> - Strongswan provided vpn/ipsec improvements (ports based on work from 
> Will/Syed)
>
> - Several fixes to redundant virtual routers and scripts for VPC (ports from 
> Remi json/gzip PR and additional fixes/improvements to execute update_config 
> faster)
>
> - Packages installation improvements (thanks to Rene for review)
>
> - Several integration test fixes -- all smoke tests passing on KVM and most 
> on XenServer, work on fixing VMware test failures is on-going
>
> - Several UI/UX improvements and systemvm python codebase linting/unit tests 
> added to Travis
>
>
> Here's the pull request:
>
> https://github.com/apache/cloudstack/pull/2211
>
>
> I've temporarily hosted the templates here:
>
> http://hydra.yadav.xyz/debian9/
>
>
> Outstanding tasks/issues:
>
> - Should we skip rVR related tests for VMware noting a reference to a jira 
> ticket to renable them once the feature is support for VMware?
>
> - Fix intermittent failures for XenServer and test failures on VMware
>
> - Misc issues and items (full checklist available on the PR)
>
> - Review and additional test effort from community
>
>
> After your due review, if we're able to show that the test results are on par 
> with the previous 4.9.2.0/4.9.3.0 smoke test results (i.e. most are passing) 
> on XenServer, KVM, and VMware I would proceed with merging the PR by end of 
> this month. Thoughts, comments?
>

We might want to look/verify that this works:

- Running with VirtIO-SCSI under KVM (allows disk trimming)
- Make sure the Qemu Guest Agent works

If those two things work we can keep the footprint of the SSVM rather small.

Wido

>
> Regards.
>
> rohit.ya...@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London  WC2N 4HSUK
> @shapeblue
>
>
>
>

rohit.ya...@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London WC2N 4HS, UK
@shapeblue
  
 



Re: [Discuss] Management cluster / Zookeeper holding locks

2017-12-18 Thread Rafael Weingärtner
So, we would need to change every piece of code that opens and uses
connections and transactions to switch to the ZK model? I mean, to direct the
flow to ZK.

On Mon, Dec 18, 2017 at 8:55 AM, Marc-Aurèle Brothier 
wrote:

> I understand your point, but there isn't any "transaction" in ZK. The
> transaction and commit stuff are really for DB and not part of ZK. All
> entries (if you start writing data in some nodes) are versioned. For
> example you could enforce that to overwrite a node value you must submit
> the node data having the same last version id to ensure you were
> overwriting from the latest value/state of that node. Bear in mind that you
> should not put too much data into your ZK, it's not a database replacement,
> neither a nosql db.
>
> The ZK client (CuratorFramework object) is started on the server startup,
> and you only need to pass it along your calls so that the connection is
> reused, or retried, depending on the state. Nothing manual has to be done,
> it's all in this curator library.
>
> On Mon, Dec 18, 2017 at 11:44 AM, Rafael Weingärtner <
> rafaelweingart...@gmail.com> wrote:
>
> > I did not check the link before. Sorry about that.
> >
> > Reading some of the pages there, I see curator more like a client library
> > such as MySQL JDBC client.
> >
> > When I mentioned framework, I was looking for something like Spring-data.
> > So, we could simply rely on the framework to manage connections and
> > transactions. For instance, we could define a pattern that would open
> > connection with a read-only transaction. And then, we could annotate
> > methods that would write in the database something with
> > @Transactional(readonly = false). If we are going to a change like this
> we
> > need to remove manually open connections and transactions. Also, we have
> to
> > remove the transaction management code from our code base.
> >
> > I would like to see something like this [1] in our future. No manually
> > written transaction code, and no transaction management in our code base.
> > Just simple annotation usage or transaction pattern in Spring XML files.
> >
> > [1]
> > https://github.com/rafaelweingartner/daily-tasks/
> > blob/master/src/main/java/br/com/supero/desafio/services/
> TaskService.java
> >
> > On Mon, Dec 18, 2017 at 8:32 AM, Marc-Aurèle Brothier  >
> > wrote:
> >
> > > @rafael, yes there is a framework (curator), it's the link I posted in
> my
> > > first message: https://curator.apache.org/curator-recipes/shared-lock.
> > html
> > > This framework helps handling all the complexity of ZK.
> > >
> > > The ZK client stays connected all the time (as the DB connection pool),
> > and
> > > only one connection (ZKClient) is needed to communicate with the ZK
> > server.
> > > The framework handles reconnection as well.
> > >
> > > Have a look at ehc curator website to understand its goal:
> > > https://curator.apache.org/
> > >
> > > On Mon, Dec 18, 2017 at 11:01 AM, Rafael Weingärtner <
> > > rafaelweingart...@gmail.com> wrote:
> > >
> > > > Do we have framework to do this kind of looking in ZK?
> > > > I mean, you said " create a new InterProcessSemaphoreMutex which
> > handles
> > > > the locking mechanism.". This feels that we would have to continue
> > > opening
> > > > and closing this transaction manually, which is what causes a lot of
> > our
> > > > headaches with transactions (it is not MySQL locks fault entirely,
> but
> > > our
> > > > code structure).
> > > >
> > > > On Mon, Dec 18, 2017 at 7:47 AM, Marc-Aurèle Brothier <
> > ma...@exoscale.ch
> > > >
> > > > wrote:
> > > >
> > > > > We added ZK lock for fix this issue but we will remove all current
> > > locks
> > > > in
> > > > > ZK in favor of ZK one. The ZK lock is already encapsulated in a
> > project
> > > > > with an interface, but more work should be done to have a proper
> > > > interface
> > > > > for locks which could be implemented with the "tool" you want,
> > either a
> > > > DB
> > > > > lock for simplicity, or ZK for more advanced scenarios.
> > > > >
> > > > > @Daan you will need to add the ZK libraries in CS and have a
> running
> > ZK
> > > > > server somewhere. The configuration value is read from the
> > > > > server.properties. If the line is empty, the ZK client is not
> created
> > > and
> > > > > any lock request will immediately return (not holding any lock).
> > > > >
> > > > > @Rafael: ZK is pretty easy to setup and have running, as long as
> you
> > > > don't
> > > > > put too much data in it. Regarding our scenario here, with only
> > locks,
> > > > it's
> > > > > easy. ZK would be only the gatekeeper to locks in the code,
> ensuring
> > > that
> > > > > multi JVM can request a true lock.
> > > > > For the code point of view, you're opening a connection to a ZK
> node
> > > (any
> > > > > of a cluster) and you create a new InterProcessSemaphoreMutex which
> > > > handles
> > > > > the locking mechanism.
> > > > >
> > > > > On Mon, Dec 18, 2017 at 10:24 

Re: Need help for first time user

2017-12-18 Thread soundar rajan
Please find my comments inline

On Mon, Dec 18, 2017 at 2:31 PM, Vivek Kumar 
wrote:

> Hello Sounder,
>
> 1- What exactly is your requirement ? Are you going to deploy public or
> private cloud.
>

   Private cloud

> 2- what hypervisor do want to use, ( as per ur trail mail you are trying
> to configure with Xenserver, you can use either XenServer 6.5 SP1 or
> XenServer 7.0 ).
>
   I would like to manage all my hypervisors in CloudStack (Windows
Hyper-V and XenServer).


> 3- If you are planning to build Private cloud i.e VPC ( Advance Zone in
> Cloud stack ) , you need to take care of few networking things as well,
> CloudStack management VLAN should be able to reach your  POD/XENHOST VLAN
> and secondary storage VLAN, and POD VLAN should be able to reach CM VLAN
> and secondary storage as well.
>
All my VLANs are accessible to each other. The problem I faced is that
CloudStack management is not able to access the XenServer, while the same is
possible from my terminal.
 I did not choose the advanced zone while configuring.

> 4- What primary storage are you using, if primary storage is on other VLAN
> it should be reachable from your hypervisor,
>
My NFS storage is accessible from all hypervisors.

>
>
> Vivek Kumar
> Virtualization and Cloud Consultant
>
>
> > On 18-Dec-2017, at 12:57 PM, soundar rajan 
> wrote:
> >
> > Tried with xenserver 5.6 sp2 no luck.
> >
> > On Sat, Dec 16, 2017 at 5:19 PM, soundar rajan 
> > wrote:
> >
> >> Everything went well. But management server is not able to communicate
> >> with xenserver using xapi link local ip's
> >>
> >> 2017-12-16 06:27:28,432 DEBUG [c.c.h.x.r.CitrixResourceBase]
> >> (DirectAgent-48:ctx-cbc6b9fe) (logid:a3b30d4b) Trying to connect to
> >> 169.254.0.225 attempt 47 of 100
> >> 2017-12-16 06:27:29,020 WARN  [o.a.c.f.j.i.AsyncJobMonitor]
> >> (Timer-1:ctx-316e4e5c) (logid:6d852e5f) Task (job-12) has been pending
> for
> >> 624 seconds
> >> 2017-12-16 06:27:29,020 WARN  [o.a.c.f.j.i.AsyncJobMonitor]
> >> (Timer-1:ctx-316e4e5c) (logid:6d852e5f) Task (job-13) has been pending
> for
> >> 622 seconds
> >> 2017-12-16 06:27:30,074 DEBUG [c.c.h.x.r.CitrixResourceBase]
> >> (DirectAgent-49:ctx-3ee4becb) (logid:39a5433a) Trying to connect to
> >> 169.254.0.211 attempt 47 of 100
> >> 2017-12-16 06:27:31,215 DEBUG [c.c.a.m.DirectAgentAttache]
> >> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) Ping from
> >> 1(xenserver-cloudstack)
> >> 2017-12-16 06:27:31,216 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> >> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) Process host VM
> >> state report from ping process. host: 1
> >> 2017-12-16 06:27:31,221 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> >> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) Process VM state
> >> report. host: 1, number of records in report: 2
> >> 2017-12-16 06:27:31,221 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> >> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) VM state report.
> >> host: 1, vm id: 1, power state: PowerOn
> >> 2017-12-16 06:27:31,226 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> >> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) VM state report
> is
> >> updated. host: 1, vm id: 1, power state: PowerOn
> >> 2017-12-16 06:27:31,228 INFO  [c.c.v.VirtualMachineManagerImpl]
> >> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) There is pending
> >> job or HA tasks working on the VM. vm id: 1, postpone power-change
> report
> >> by resetting power-change counters
> >> 2017-12-16 06:27:31,233 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> >> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) VM state report.
> >> host: 1, vm id: 2, power state: PowerOn
> >> 2017-12-16 06:27:31,242 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> >> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) VM state report
> is
> >> updated. host: 1, vm id: 2, power state: PowerOn
> >> 2017-12-16 06:27:31,245 INFO  [c.c.v.VirtualMachineManagerImpl]
> >> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) There is pending
> >> job or HA tasks working on the VM. vm id: 2, postpone power-change
> report
> >> by resetting power-change counters
> >> 2017-12-16 06:27:31,258 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> >> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) Done with process
> >> of VM state report. host: 1
> >> 2017-12-16 06:27:34,314 INFO  [o.a.c.f.j.i.AsyncJobManagerImpl]
> >> (AsyncJobMgr-Heartbeat-1:ctx-a8717eba) (logid:56de398e) Begin cleanup
> >> expired async-jobs
> >> 2017-12-16 06:27:34,321 INFO  [o.a.c.f.j.i.AsyncJobManagerImpl]
> >> (AsyncJobMgr-Heartbeat-1:ctx-a8717eba) (logid:56de398e) End cleanup
> >> expired async-jobs
> >>
> >>
> >> Please help..
> >>
> >> Cloudstack version 4.9
> >>
> >> Xenserver version 7 (This is first host without any bonding and 1 nic)
> >>
> >>
> >> Thanks
> >> Shyam
> >>
> >> On Sat, Dec 16, 2017 at 2:36 PM, soundar rajan 

Re: PKIX path building failed

2017-12-18 Thread Makrand
Hello Jagdish,

What is the source URL of the template you're trying to register?

And when exactly does this error appear? Immediately after you try to
register a template?


--
Makrand


On Fri, Dec 15, 2017 at 1:01 PM, Jagdish Patil 
wrote:

> Hey Guys,
>
> I am facing the following issue with this configuration:
>
> *Configuration:*
> CloudStack Version: 4.9
> OS: CentOS 6.8(X86_64)
> Hypervisor: KVM
> CIDR:24
>
> *Issue:*
>
> *Failed to register template: 4fe0b968-e02a-11e7-939c-f8a9632f48e1 with
> error: sun.security.validator.ValidatorException: PKIX path building
> failed: sun.security.provider.certpath.SunCertPathBuilderException: unable
> to find valid certification path to requested target*
>
> There are solutions given by multiple peoples on the internet but none of
> them are helping me. Please help.
>
> Thank You,
> Jagdish Patil,
> (B.Tech-Cloud Based Application: IBM)
> M:8735828606
> E:jagdishpatil...@gmail.com
>


Re: [Discuss] Management cluster / Zookeeper holding locks

2017-12-18 Thread Marc-Aurèle Brothier
I understand your point, but there isn't any "transaction" in ZK. Transactions
and commits are really a DB concept and are not part of ZK. All entries (if you
start writing data in some nodes) are versioned. For example, you could enforce
that, to overwrite a node value, you must submit the node data carrying its last
version ID, to ensure you are overwriting the latest value/state of that node.
Bear in mind that you should not put too much data into ZK; it's not a database
replacement, nor a NoSQL DB.

The ZK client (CuratorFramework object) is started at server startup, and you
only need to pass it along in your calls so that the connection is reused, or
retried, depending on its state. Nothing manual has to be done; it's all in the
Curator library.
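To illustrate the versioned-overwrite point with Curator, a minimal sketch (the
connection string, path and payloads are placeholders; it assumes a reachable ZK
ensemble and the curator-framework dependency on the classpath):

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.data.Stat;

public class VersionedWrite {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk1:2181,zk2:2181,zk3:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        String path = "/cloudstack/demo/node";          // placeholder path
        if (client.checkExists().forPath(path) == null) {
            client.create().creatingParentsIfNeeded().forPath(path, "v0".getBytes());
        }

        // Read the current value and remember the version it was read at.
        Stat stat = new Stat();
        byte[] current = client.getData().storingStatIn(stat).forPath(path);

        try {
            // Conditional overwrite: only succeeds if nobody changed the node
            // since our read, i.e. the version we submit is still the latest.
            client.setData().withVersion(stat.getVersion()).forPath(path, "v1".getBytes());
        } catch (KeeperException.BadVersionException e) {
            // Someone else wrote a newer version first -- re-read and retry if needed.
        }
        client.close();
    }
}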

On Mon, Dec 18, 2017 at 11:44 AM, Rafael Weingärtner <
rafaelweingart...@gmail.com> wrote:

> I did not check the link before. Sorry about that.
>
> Reading some of the pages there, I see curator more like a client library
> such as MySQL JDBC client.
>
> When I mentioned framework, I was looking for something like Spring-data.
> So, we could simply rely on the framework to manage connections and
> transactions. For instance, we could define a pattern that would open
> connection with a read-only transaction. And then, we could annotate
> methods that would write in the database something with
> @Transactional(readonly = false). If we are going to a change like this we
> need to remove manually open connections and transactions. Also, we have to
> remove the transaction management code from our code base.
>
> I would like to see something like this [1] in our future. No manually
> written transaction code, and no transaction management in our code base.
> Just simple annotation usage or transaction pattern in Spring XML files.
>
> [1]
> https://github.com/rafaelweingartner/daily-tasks/
> blob/master/src/main/java/br/com/supero/desafio/services/TaskService.java
>
> On Mon, Dec 18, 2017 at 8:32 AM, Marc-Aurèle Brothier 
> wrote:
>
> > @rafael, yes there is a framework (curator), it's the link I posted in my
> > first message: https://curator.apache.org/curator-recipes/shared-lock.
> html
> > This framework helps handling all the complexity of ZK.
> >
> > The ZK client stays connected all the time (as the DB connection pool),
> and
> > only one connection (ZKClient) is needed to communicate with the ZK
> server.
> > The framework handles reconnection as well.
> >
> > Have a look at ehc curator website to understand its goal:
> > https://curator.apache.org/
> >
> > On Mon, Dec 18, 2017 at 11:01 AM, Rafael Weingärtner <
> > rafaelweingart...@gmail.com> wrote:
> >
> > > Do we have framework to do this kind of looking in ZK?
> > > I mean, you said " create a new InterProcessSemaphoreMutex which
> handles
> > > the locking mechanism.". This feels that we would have to continue
> > opening
> > > and closing this transaction manually, which is what causes a lot of
> our
> > > headaches with transactions (it is not MySQL locks fault entirely, but
> > our
> > > code structure).
> > >
> > > On Mon, Dec 18, 2017 at 7:47 AM, Marc-Aurèle Brothier <
> ma...@exoscale.ch
> > >
> > > wrote:
> > >
> > > > We added ZK lock for fix this issue but we will remove all current
> > locks
> > > in
> > > > ZK in favor of ZK one. The ZK lock is already encapsulated in a
> project
> > > > with an interface, but more work should be done to have a proper
> > > interface
> > > > for locks which could be implemented with the "tool" you want,
> either a
> > > DB
> > > > lock for simplicity, or ZK for more advanced scenarios.
> > > >
> > > > @Daan you will need to add the ZK libraries in CS and have a running
> ZK
> > > > server somewhere. The configuration value is read from the
> > > > server.properties. If the line is empty, the ZK client is not created
> > and
> > > > any lock request will immediately return (not holding any lock).
> > > >
> > > > @Rafael: ZK is pretty easy to setup and have running, as long as you
> > > don't
> > > > put too much data in it. Regarding our scenario here, with only
> locks,
> > > it's
> > > > easy. ZK would be only the gatekeeper to locks in the code, ensuring
> > that
> > > > multi JVM can request a true lock.
> > > > For the code point of view, you're opening a connection to a ZK node
> > (any
> > > > of a cluster) and you create a new InterProcessSemaphoreMutex which
> > > handles
> > > > the locking mechanism.
> > > >
> > > > On Mon, Dec 18, 2017 at 10:24 AM, Ivan Kudryavtsev <
> > > > kudryavtsev...@bw-sw.com
> > > > > wrote:
> > > >
> > > > > Rafael,
> > > > >
> > > > > - It's easy to configure and run ZK either in single node or
> cluster
> > > > > - zookeeper should replace mysql locking mechanism used inside ACS
> > code
> > > > > (places where ACS locks tables or rows).
> > > > >
> > > > > I don't think from the other size, that moving from MySQL locks to
> ZK
> > > > locks
> > > > > is easy and light and (even 

Re: [Discuss] Management cluster / Zookeeper holding locks

2017-12-18 Thread Rafael Weingärtner
I did not check the link before. Sorry about that.

Reading some of the pages there, I see Curator more as a client library,
such as the MySQL JDBC client.

When I mentioned a framework, I was looking for something like Spring Data.
So, we could simply rely on the framework to manage connections and
transactions. For instance, we could define a pattern that opens a
connection with a read-only transaction. And then, we could annotate
methods that write to the database with
@Transactional(readonly = false). If we are going to make a change like this, we
need to remove the manually opened connections and transactions. Also, we have to
remove the transaction management code from our code base.

I would like to see something like this [1] in our future. No manually
written transaction code, and no transaction management in our code base.
Just simple annotation usage or a transaction pattern in Spring XML files.

[1]
https://github.com/rafaelweingartner/daily-tasks/blob/master/src/main/java/br/com/supero/desafio/services/TaskService.java
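For illustration only, a minimal sketch of that declarative style (Spring spells
the attribute readOnly; the service and DAO names below are made up, not existing
CloudStack classes, and assume a Spring context with a configured transaction
manager):

import java.util.List;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class HostCapacityService {          // illustrative name only

    // Illustrative DAO contract; in a real setup this would be a Spring Data repository.
    public interface HostCapacityDao {
        List<Long> findHostIdsByClusterId(long clusterId);
        void increaseUsed(long hostId, long cpu, long ramBytes);
    }

    private final HostCapacityDao capacityDao;

    public HostCapacityService(HostCapacityDao capacityDao) {   // injected by Spring
        this.capacityDao = capacityDao;
    }

    // Read-only transaction opened and closed by Spring; no manual commit/rollback code.
    @Transactional(readOnly = true)
    public List<Long> listHosts(long clusterId) {
        return capacityDao.findHostIdsByClusterId(clusterId);
    }

    // Read-write transaction; rolled back automatically on an unchecked exception.
    @Transactional(readOnly = false)
    public void reserveCapacity(long hostId, long cpu, long ramBytes) {
        capacityDao.increaseUsed(hostId, cpu, ramBytes);
    }
}

With this in place, the framework's transaction interceptor opens, commits or
rolls back the transaction around each call, so no hand-written open/commit/
rollback code is needed in the method bodies.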

On Mon, Dec 18, 2017 at 8:32 AM, Marc-Aurèle Brothier 
wrote:

> @rafael, yes there is a framework (curator), it's the link I posted in my
> first message: https://curator.apache.org/curator-recipes/shared-lock.html
> This framework helps handling all the complexity of ZK.
>
> The ZK client stays connected all the time (as the DB connection pool), and
> only one connection (ZKClient) is needed to communicate with the ZK server.
> The framework handles reconnection as well.
>
> Have a look at ehc curator website to understand its goal:
> https://curator.apache.org/
>
> On Mon, Dec 18, 2017 at 11:01 AM, Rafael Weingärtner <
> rafaelweingart...@gmail.com> wrote:
>
> > Do we have framework to do this kind of looking in ZK?
> > I mean, you said " create a new InterProcessSemaphoreMutex which handles
> > the locking mechanism.". This feels that we would have to continue
> opening
> > and closing this transaction manually, which is what causes a lot of our
> > headaches with transactions (it is not MySQL locks fault entirely, but
> our
> > code structure).
> >
> > On Mon, Dec 18, 2017 at 7:47 AM, Marc-Aurèle Brothier  >
> > wrote:
> >
> > > We added ZK lock for fix this issue but we will remove all current
> locks
> > in
> > > ZK in favor of ZK one. The ZK lock is already encapsulated in a project
> > > with an interface, but more work should be done to have a proper
> > interface
> > > for locks which could be implemented with the "tool" you want, either a
> > DB
> > > lock for simplicity, or ZK for more advanced scenarios.
> > >
> > > @Daan you will need to add the ZK libraries in CS and have a running ZK
> > > server somewhere. The configuration value is read from the
> > > server.properties. If the line is empty, the ZK client is not created
> and
> > > any lock request will immediately return (not holding any lock).
> > >
> > > @Rafael: ZK is pretty easy to setup and have running, as long as you
> > don't
> > > put too much data in it. Regarding our scenario here, with only locks,
> > it's
> > > easy. ZK would be only the gatekeeper to locks in the code, ensuring
> that
> > > multi JVM can request a true lock.
> > > For the code point of view, you're opening a connection to a ZK node
> (any
> > > of a cluster) and you create a new InterProcessSemaphoreMutex which
> > handles
> > > the locking mechanism.
> > >
> > > On Mon, Dec 18, 2017 at 10:24 AM, Ivan Kudryavtsev <
> > > kudryavtsev...@bw-sw.com
> > > > wrote:
> > >
> > > > Rafael,
> > > >
> > > > - It's easy to configure and run ZK either in single node or cluster
> > > > - zookeeper should replace mysql locking mechanism used inside ACS
> code
> > > > (places where ACS locks tables or rows).
> > > >
> > > > I don't think from the other size, that moving from MySQL locks to ZK
> > > locks
> > > > is easy and light and (even implemetable) way.
> > > >
> > > > 2017-12-18 16:20 GMT+07:00 Rafael Weingärtner <
> > > rafaelweingart...@gmail.com
> > > > >:
> > > >
> > > > > How hard is it to configure Zookeeper and get everything up and
> > > running?
> > > > > BTW: what zookeeper would be managing? CloudStack management
> servers
> > or
> > > > > MySQL nodes?
> > > > >
> > > > > On Mon, Dec 18, 2017 at 7:13 AM, Ivan Kudryavtsev <
> > > > > kudryavtsev...@bw-sw.com>
> > > > > wrote:
> > > > >
> > > > > > Hello, Marc-Aurele, I strongly believe that all mysql locks
> should
> > be
> > > > > > removed in favour of truly DLM solution like Zookeeper. The
> > > performance
> > > > > of
> > > > > > 3node ZK ensemble should be enough to hold up to 1000-2000 locks
> > per
> > > > > second
> > > > > > and it helps to move to truly clustered MySQL like galera without
> > > > single
> > > > > > master server.
> > > > > >
> > > > > > 2017-12-18 15:33 GMT+07:00 Marc-Aurèle Brothier <
> ma...@exoscale.ch
> > >:
> > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I was wondering how many of you 

Re: [Discuss] Management cluster / Zookeeper holding locks

2017-12-18 Thread Marc-Aurèle Brothier
@rafael, yes there is a framework (Curator); it's the link I posted in my
first message: https://curator.apache.org/curator-recipes/shared-lock.html
This framework helps handle all the complexity of ZK.

The ZK client stays connected all the time (like the DB connection pool), and
only one connection (ZKClient) is needed to communicate with the ZK server.
The framework handles reconnection as well.

Have a look at the Curator website to understand its goal:
https://curator.apache.org/
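As a rough sketch of that lifecycle (the holder class and connection string are
illustrative, not existing CloudStack code): one client is created at startup,
shared everywhere, and closed at shutdown, while Curator's retry policy handles
reconnection.

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public final class ZkClientHolder {          // illustrative holder only

    private static CuratorFramework client;

    // Called once at management server startup; the same client is then passed around/reused.
    public static synchronized void init(String connectString) {
        if (client == null && connectString != null && !connectString.isEmpty()) {
            client = CuratorFrameworkFactory.newClient(
                    connectString,                         // e.g. "zk1:2181,zk2:2181,zk3:2181"
                    new ExponentialBackoffRetry(1000, 5)); // Curator retries/reconnects for us
            client.start();
        }
    }

    // Null when no ZK is configured, so callers can skip distributed locking entirely.
    public static CuratorFramework get() {
        return client;
    }

    public static synchronized void shutdown() {
        if (client != null) {
            client.close();
            client = null;
        }
    }
}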

On Mon, Dec 18, 2017 at 11:01 AM, Rafael Weingärtner <
rafaelweingart...@gmail.com> wrote:

> Do we have framework to do this kind of looking in ZK?
> I mean, you said " create a new InterProcessSemaphoreMutex which handles
> the locking mechanism.". This feels that we would have to continue opening
> and closing this transaction manually, which is what causes a lot of our
> headaches with transactions (it is not MySQL locks fault entirely, but our
> code structure).
>
> On Mon, Dec 18, 2017 at 7:47 AM, Marc-Aurèle Brothier 
> wrote:
>
> > We added ZK lock for fix this issue but we will remove all current locks
> in
> > ZK in favor of ZK one. The ZK lock is already encapsulated in a project
> > with an interface, but more work should be done to have a proper
> interface
> > for locks which could be implemented with the "tool" you want, either a
> DB
> > lock for simplicity, or ZK for more advanced scenarios.
> >
> > @Daan you will need to add the ZK libraries in CS and have a running ZK
> > server somewhere. The configuration value is read from the
> > server.properties. If the line is empty, the ZK client is not created and
> > any lock request will immediately return (not holding any lock).
> >
> > @Rafael: ZK is pretty easy to setup and have running, as long as you
> don't
> > put too much data in it. Regarding our scenario here, with only locks,
> it's
> > easy. ZK would be only the gatekeeper to locks in the code, ensuring that
> > multi JVM can request a true lock.
> > For the code point of view, you're opening a connection to a ZK node (any
> > of a cluster) and you create a new InterProcessSemaphoreMutex which
> handles
> > the locking mechanism.
> >
> > On Mon, Dec 18, 2017 at 10:24 AM, Ivan Kudryavtsev <
> > kudryavtsev...@bw-sw.com
> > > wrote:
> >
> > > Rafael,
> > >
> > > - It's easy to configure and run ZK either in single node or cluster
> > > - zookeeper should replace mysql locking mechanism used inside ACS code
> > > (places where ACS locks tables or rows).
> > >
> > > I don't think from the other size, that moving from MySQL locks to ZK
> > locks
> > > is easy and light and (even implemetable) way.
> > >
> > > 2017-12-18 16:20 GMT+07:00 Rafael Weingärtner <
> > rafaelweingart...@gmail.com
> > > >:
> > >
> > > > How hard is it to configure Zookeeper and get everything up and
> > running?
> > > > BTW: what zookeeper would be managing? CloudStack management servers
> or
> > > > MySQL nodes?
> > > >
> > > > On Mon, Dec 18, 2017 at 7:13 AM, Ivan Kudryavtsev <
> > > > kudryavtsev...@bw-sw.com>
> > > > wrote:
> > > >
> > > > > Hello, Marc-Aurele, I strongly believe that all mysql locks should
> be
> > > > > removed in favour of truly DLM solution like Zookeeper. The
> > performance
> > > > of
> > > > > 3node ZK ensemble should be enough to hold up to 1000-2000 locks
> per
> > > > second
> > > > > and it helps to move to truly clustered MySQL like galera without
> > > single
> > > > > master server.
> > > > >
> > > > > 2017-12-18 15:33 GMT+07:00 Marc-Aurèle Brothier  >:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > I was wondering how many of you are running CloudStack with a
> > cluster
> > > > of
> > > > > > management servers. I would think most of you, but it would be
> nice
> > > to
> > > > > hear
> > > > > > everyone voices. And do you get hosts going over their capacity
> > > limits?
> > > > > >
> > > > > > We discovered that during the VM allocation, if you get a lot of
> > > > parallel
> > > > > > requests to create new VMs, most notably with large profiles, the
> > > > > capacity
> > > > > > increase is done too far after the host capacity checks and
> results
> > > in
> > > > > > hosts going over their capacity limits. To detail the steps: the
> > > > > deployment
> > > > > > planner checks for cluster/host capacity and pick up one
> deployment
> > > > plan
> > > > > > (zone, cluster, host). The plan is stored in the database under a
> > > > VMwork
> > > > > > job and another thread picks that entry and starts the
> deployment,
> > > > > > increasing the host capacity and sending the commands. Here
> > there's a
> > > > > time
> > > > > > gap between the host being picked up and the capacity increase
> for
> > > that
> > > > > > host of a couple of seconds, which is well enough to go over the
> > > > capacity
> > > > > > on one or more hosts. A few VMwork job can be added in the DB
> queue
> > > > > > targeting the same host before one gets picked up.

Re: [Discuss] Management cluster / Zookeeper holding locks

2017-12-18 Thread Rafael Weingärtner
Do we have a framework to do this kind of locking in ZK?
I mean, you said "create a new InterProcessSemaphoreMutex which handles
the locking mechanism". This feels like we would have to continue opening
and closing this transaction manually, which is what causes a lot of our
headaches with transactions (it is not entirely MySQL locks' fault, but our
code structure).

On Mon, Dec 18, 2017 at 7:47 AM, Marc-Aurèle Brothier 
wrote:

> We added ZK lock for fix this issue but we will remove all current locks in
> ZK in favor of ZK one. The ZK lock is already encapsulated in a project
> with an interface, but more work should be done to have a proper interface
> for locks which could be implemented with the "tool" you want, either a DB
> lock for simplicity, or ZK for more advanced scenarios.
>
> @Daan you will need to add the ZK libraries in CS and have a running ZK
> server somewhere. The configuration value is read from the
> server.properties. If the line is empty, the ZK client is not created and
> any lock request will immediately return (not holding any lock).
>
> @Rafael: ZK is pretty easy to setup and have running, as long as you don't
> put too much data in it. Regarding our scenario here, with only locks, it's
> easy. ZK would be only the gatekeeper to locks in the code, ensuring that
> multi JVM can request a true lock.
> For the code point of view, you're opening a connection to a ZK node (any
> of a cluster) and you create a new InterProcessSemaphoreMutex which handles
> the locking mechanism.
>
> On Mon, Dec 18, 2017 at 10:24 AM, Ivan Kudryavtsev <
> kudryavtsev...@bw-sw.com
> > wrote:
>
> > Rafael,
> >
> > - It's easy to configure and run ZK either in single node or cluster
> > - zookeeper should replace mysql locking mechanism used inside ACS code
> > (places where ACS locks tables or rows).
> >
> > I don't think from the other size, that moving from MySQL locks to ZK
> locks
> > is easy and light and (even implemetable) way.
> >
> > 2017-12-18 16:20 GMT+07:00 Rafael Weingärtner <
> rafaelweingart...@gmail.com
> > >:
> >
> > > How hard is it to configure Zookeeper and get everything up and
> running?
> > > BTW: what zookeeper would be managing? CloudStack management servers or
> > > MySQL nodes?
> > >
> > > On Mon, Dec 18, 2017 at 7:13 AM, Ivan Kudryavtsev <
> > > kudryavtsev...@bw-sw.com>
> > > wrote:
> > >
> > > > Hello, Marc-Aurele, I strongly believe that all mysql locks should be
> > > > removed in favour of truly DLM solution like Zookeeper. The
> performance
> > > of
> > > > 3node ZK ensemble should be enough to hold up to 1000-2000 locks per
> > > second
> > > > and it helps to move to truly clustered MySQL like galera without
> > single
> > > > master server.
> > > >
> > > > 2017-12-18 15:33 GMT+07:00 Marc-Aurèle Brothier :
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > I was wondering how many of you are running CloudStack with a
> cluster
> > > of
> > > > > management servers. I would think most of you, but it would be nice
> > to
> > > > hear
> > > > > everyone voices. And do you get hosts going over their capacity
> > limits?
> > > > >
> > > > > We discovered that during the VM allocation, if you get a lot of
> > > parallel
> > > > > requests to create new VMs, most notably with large profiles, the
> > > > capacity
> > > > > increase is done too far after the host capacity checks and results
> > in
> > > > > hosts going over their capacity limits. To detail the steps: the
> > > > deployment
> > > > > planner checks for cluster/host capacity and pick up one deployment
> > > plan
> > > > > (zone, cluster, host). The plan is stored in the database under a
> > > VMwork
> > > > > job and another thread picks that entry and starts the deployment,
> > > > > increasing the host capacity and sending the commands. Here
> there's a
> > > > time
> > > > > gap between the host being picked up and the capacity increase for
> > that
> > > > > host of a couple of seconds, which is well enough to go over the
> > > capacity
> > > > > on one or more hosts. A few VMwork job can be added in the DB queue
> > > > > targeting the same host before one gets picked up.
> > > > >
> > > > > To fix this issue, we're using Zookeeper to act as the multi JVM
> lock
> > > > > manager thanks to their curator library (
> > > > > https://curator.apache.org/curator-recipes/shared-lock.html). We
> > also
> > > > > changed the time when the capacity is increased, which occurs now
> > > pretty
> > > > > much after the deployment plan is found and inside the zookeeper
> > lock.
> > > > This
> > > > > ensure we don't go over the capacity of any host, and it has been
> > > proven
> > > > > efficient since a month in our management server cluster.
> > > > >
> > > > > This adds another potential requirement which should be discuss
> > before
> > > > > proposing a PR. Today the code works seamlessly without ZK too, to
> > > ensure
> > > > > it's not a hard requirement, for example in a 

Re: [Discuss] Management cluster / Zookeeper holding locks

2017-12-18 Thread Marc-Aurèle Brothier
We added a ZK lock to fix this issue, but we will remove all current locks in
favor of the ZK one. The ZK lock is already encapsulated in a project
with an interface, but more work should be done to have a proper interface
for locks which could be implemented with the "tool" you want, either a DB
lock for simplicity, or ZK for more advanced scenarios.

@Daan you will need to add the ZK libraries in CS and have a running ZK
server somewhere. The configuration value is read from
server.properties. If the line is empty, the ZK client is not created and
any lock request will immediately return (not holding any lock).

@Rafael: ZK is pretty easy to set up and keep running, as long as you don't
put too much data in it. Regarding our scenario here, with only locks, it's
easy. ZK would only be the gatekeeper to locks in the code, ensuring that
multiple JVMs can request a true lock.
From the code point of view, you open a connection to a ZK node (any node
of a cluster) and you create a new InterProcessSemaphoreMutex which handles
the locking mechanism.
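As a minimal sketch of that last point (the lock path, timeout and the guarded
section are placeholders, not the actual implementation; it assumes a started
CuratorFramework client, or null when no ZK is configured):

import java.util.concurrent.TimeUnit;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreMutex;

public class HostCapacityLock {              // illustrative, not existing CloudStack code

    private final CuratorFramework client;   // shared client created at startup, or null

    public HostCapacityLock(CuratorFramework client) {
        this.client = client;
    }

    // Serializes the capacity check + increase for one host across all management JVMs.
    public boolean reserve(long hostId, Runnable checkAndIncreaseCapacity) throws Exception {
        if (client == null) {
            // No ZK configured (e.g. a lab): behave as before, without a distributed lock.
            checkAndIncreaseCapacity.run();
            return true;
        }
        InterProcessSemaphoreMutex lock =
                new InterProcessSemaphoreMutex(client, "/cloudstack/locks/host/" + hostId);
        if (!lock.acquire(30, TimeUnit.SECONDS)) {
            return false;                    // could not get the lock in time
        }
        try {
            checkAndIncreaseCapacity.run();  // capacity check and increase inside the lock
            return true;
        } finally {
            lock.release();
        }
    }
}

Using one lock path per host keeps contention local to the host being allocated,
so parallel deployments on different hosts are not serialized against each other.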

On Mon, Dec 18, 2017 at 10:24 AM, Ivan Kudryavtsev  wrote:

> Rafael,
>
> - It's easy to configure and run ZK either in single node or cluster
> - zookeeper should replace mysql locking mechanism used inside ACS code
> (places where ACS locks tables or rows).
>
> I don't think from the other size, that moving from MySQL locks to ZK locks
> is easy and light and (even implemetable) way.
>
> 2017-12-18 16:20 GMT+07:00 Rafael Weingärtner  >:
>
> > How hard is it to configure Zookeeper and get everything up and running?
> > BTW: what zookeeper would be managing? CloudStack management servers or
> > MySQL nodes?
> >
> > On Mon, Dec 18, 2017 at 7:13 AM, Ivan Kudryavtsev <
> > kudryavtsev...@bw-sw.com>
> > wrote:
> >
> > > Hello, Marc-Aurele, I strongly believe that all mysql locks should be
> > > removed in favour of truly DLM solution like Zookeeper. The performance
> > of
> > > 3node ZK ensemble should be enough to hold up to 1000-2000 locks per
> > second
> > > and it helps to move to truly clustered MySQL like galera without
> single
> > > master server.
> > >
> > > 2017-12-18 15:33 GMT+07:00 Marc-Aurèle Brothier :
> > >
> > > > Hi everyone,
> > > >
> > > > I was wondering how many of you are running CloudStack with a cluster
> > of
> > > > management servers. I would think most of you, but it would be nice
> to
> > > hear
> > > > everyone voices. And do you get hosts going over their capacity
> limits?
> > > >
> > > > We discovered that during the VM allocation, if you get a lot of
> > parallel
> > > > requests to create new VMs, most notably with large profiles, the
> > > capacity
> > > > increase is done too far after the host capacity checks and results
> in
> > > > hosts going over their capacity limits. To detail the steps: the
> > > deployment
> > > > planner checks for cluster/host capacity and pick up one deployment
> > plan
> > > > (zone, cluster, host). The plan is stored in the database under a
> > VMwork
> > > > job and another thread picks that entry and starts the deployment,
> > > > increasing the host capacity and sending the commands. Here there's a
> > > time
> > > > gap between the host being picked up and the capacity increase for
> that
> > > > host of a couple of seconds, which is well enough to go over the
> > capacity
> > > > on one or more hosts. A few VMwork job can be added in the DB queue
> > > > targeting the same host before one gets picked up.
> > > >
> > > > To fix this issue, we're using Zookeeper to act as the multi JVM lock
> > > > manager thanks to their curator library (
> > > > https://curator.apache.org/curator-recipes/shared-lock.html). We
> also
> > > > changed the time when the capacity is increased, which occurs now
> > pretty
> > > > much after the deployment plan is found and inside the zookeeper
> lock.
> > > This
> > > > ensure we don't go over the capacity of any host, and it has been
> > proven
> > > > efficient since a month in our management server cluster.
> > > >
> > > > This adds another potential requirement which should be discuss
> before
> > > > proposing a PR. Today the code works seamlessly without ZK too, to
> > ensure
> > > > it's not a hard requirement, for example in a lab.
> > > >
> > > > Comments?
> > > >
> > > > Kind regards,
> > > > Marc-Aurèle
> > > >
> > >
> > >
> > >
> > > --
> > > With best regards, Ivan Kudryavtsev
> > > Bitworks Software, Ltd.
> > > Cell: +7-923-414-1515
> > > WWW: http://bitworks.software/ 
> > >
> >
> >
> >
> > --
> > Rafael Weingärtner
> >
>
>
>
> --
> With best regards, Ivan Kudryavtsev
> Bitworks Software, Ltd.
> Cell: +7-923-414-1515
> WWW: http://bitworks.software/ 
>


Re: [Discuss] Management cluster / Zookeeper holding locks

2017-12-18 Thread Rafael Weingärtner
So, how does that work?
I mean, instead of opening a transaction with the database and executing
locks, what do we need to do in the code?

On Mon, Dec 18, 2017 at 7:24 AM, Ivan Kudryavtsev 
wrote:

> Rafael,
>
> - It's easy to configure and run ZK either in single node or cluster
> - zookeeper should replace mysql locking mechanism used inside ACS code
> (places where ACS locks tables or rows).
>
> I don't think from the other size, that moving from MySQL locks to ZK locks
> is easy and light and (even implemetable) way.
>
> 2017-12-18 16:20 GMT+07:00 Rafael Weingärtner  >:
>
> > How hard is it to configure Zookeeper and get everything up and running?
> > BTW: what zookeeper would be managing? CloudStack management servers or
> > MySQL nodes?
> >
> > On Mon, Dec 18, 2017 at 7:13 AM, Ivan Kudryavtsev <
> > kudryavtsev...@bw-sw.com>
> > wrote:
> >
> > > Hello, Marc-Aurele, I strongly believe that all mysql locks should be
> > > removed in favour of truly DLM solution like Zookeeper. The performance
> > of
> > > 3node ZK ensemble should be enough to hold up to 1000-2000 locks per
> > second
> > > and it helps to move to truly clustered MySQL like galera without
> single
> > > master server.
> > >
> > > 2017-12-18 15:33 GMT+07:00 Marc-Aurèle Brothier :
> > >
> > > > Hi everyone,
> > > >
> > > > I was wondering how many of you are running CloudStack with a cluster
> > of
> > > > management servers. I would think most of you, but it would be nice
> to
> > > hear
> > > > everyone voices. And do you get hosts going over their capacity
> limits?
> > > >
> > > > We discovered that during the VM allocation, if you get a lot of
> > parallel
> > > > requests to create new VMs, most notably with large profiles, the
> > > capacity
> > > > increase is done too far after the host capacity checks and results
> in
> > > > hosts going over their capacity limits. To detail the steps: the
> > > deployment
> > > > planner checks for cluster/host capacity and pick up one deployment
> > plan
> > > > (zone, cluster, host). The plan is stored in the database under a
> > VMwork
> > > > job and another thread picks that entry and starts the deployment,
> > > > increasing the host capacity and sending the commands. Here there's a
> > > time
> > > > gap between the host being picked up and the capacity increase for
> that
> > > > host of a couple of seconds, which is well enough to go over the
> > capacity
> > > > on one or more hosts. A few VMwork job can be added in the DB queue
> > > > targeting the same host before one gets picked up.
> > > >
> > > > To fix this issue, we're using Zookeeper to act as the multi JVM lock
> > > > manager thanks to their curator library (
> > > > https://curator.apache.org/curator-recipes/shared-lock.html). We
> also
> > > > changed the time when the capacity is increased, which occurs now
> > pretty
> > > > much after the deployment plan is found and inside the zookeeper
> lock.
> > > This
> > > > ensure we don't go over the capacity of any host, and it has been
> > proven
> > > > efficient since a month in our management server cluster.
> > > >
> > > > This adds another potential requirement which should be discuss
> before
> > > > proposing a PR. Today the code works seamlessly without ZK too, to
> > ensure
> > > > it's not a hard requirement, for example in a lab.
> > > >
> > > > Comments?
> > > >
> > > > Kind regards,
> > > > Marc-Aurèle
> > > >
> > >
> > >
> > >
> > > --
> > > With best regards, Ivan Kudryavtsev
> > > Bitworks Software, Ltd.
> > > Cell: +7-923-414-1515
> > > WWW: http://bitworks.software/ 
> > >
> >
> >
> >
> > --
> > Rafael Weingärtner
> >
>
>
>
> --
> With best regards, Ivan Kudryavtsev
> Bitworks Software, Ltd.
> Cell: +7-923-414-1515
> WWW: http://bitworks.software/ 
>



-- 
Rafael Weingärtner


Re: [Discuss] Management cluster / Zookeeper holding locks

2017-12-18 Thread Ivan Kudryavtsev
Rafael,

- It's easy to configure and run ZK either as a single node or as a cluster.
- Zookeeper should replace the MySQL locking mechanism used inside the ACS code
(the places where ACS locks tables or rows).

On the other hand, I don't think that moving from MySQL locks to ZK locks
is an easy and light (or even implementable) undertaking.
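
Just to illustrate the shape such a replacement could take, here is a rough
sketch — not actual ACS code. The connect string and the lock path are made-up
values for the example; only the Curator classes and calls (CuratorFrameworkFactory,
InterProcessSemaphoreMutex acquire/release) are the real API from the
curator-recipes library mentioned in this thread.

    import java.util.concurrent.TimeUnit;

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreMutex;
    import org.apache.curator.retry.ExponentialBackoffRetry;

    public class ZkLockSketch {
        public static void main(String[] args) throws Exception {
            // Works the same against a single node ("zk1:2181") or a 3-node ensemble.
            CuratorFramework zk = CuratorFrameworkFactory.newClient(
                    "zk1:2181,zk2:2181,zk3:2181", new ExponentialBackoffRetry(1000, 3));
            zk.start();

            // One znode per resource replaces one MySQL row/table lock.
            InterProcessSemaphoreMutex lock =
                    new InterProcessSemaphoreMutex(zk, "/cloudstack/locks/host/42");
            if (lock.acquire(30, TimeUnit.SECONDS)) {
                try {
                    // critical section: the work previously guarded by the DB lock
                } finally {
                    lock.release();
                }
            }
            zk.close();
        }
    }

Whether every existing DB-lock call site maps that cleanly onto a znode path
is exactly my doubt above.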

2017-12-18 16:20 GMT+07:00 Rafael Weingärtner :

> How hard is it to configure Zookeeper and get everything up and running?
> BTW: what zookeeper would be managing? CloudStack management servers or
> MySQL nodes?
>
> On Mon, Dec 18, 2017 at 7:13 AM, Ivan Kudryavtsev <
> kudryavtsev...@bw-sw.com>
> wrote:
>
> > Hello, Marc-Aurele, I strongly believe that all mysql locks should be
> > removed in favour of truly DLM solution like Zookeeper. The performance
> of
> > 3node ZK ensemble should be enough to hold up to 1000-2000 locks per
> second
> > and it helps to move to truly clustered MySQL like galera without single
> > master server.
> >
> > 2017-12-18 15:33 GMT+07:00 Marc-Aurèle Brothier :
> >
> > > Hi everyone,
> > >
> > > I was wondering how many of you are running CloudStack with a cluster
> of
> > > management servers. I would think most of you, but it would be nice to
> > hear
> > > everyone voices. And do you get hosts going over their capacity limits?
> > >
> > > We discovered that during the VM allocation, if you get a lot of
> parallel
> > > requests to create new VMs, most notably with large profiles, the
> > capacity
> > > increase is done too far after the host capacity checks and results in
> > > hosts going over their capacity limits. To detail the steps: the
> > deployment
> > > planner checks for cluster/host capacity and pick up one deployment
> plan
> > > (zone, cluster, host). The plan is stored in the database under a
> VMwork
> > > job and another thread picks that entry and starts the deployment,
> > > increasing the host capacity and sending the commands. Here there's a
> > time
> > > gap between the host being picked up and the capacity increase for that
> > > host of a couple of seconds, which is well enough to go over the
> capacity
> > > on one or more hosts. A few VMwork job can be added in the DB queue
> > > targeting the same host before one gets picked up.
> > >
> > > To fix this issue, we're using Zookeeper to act as the multi JVM lock
> > > manager thanks to their curator library (
> > > https://curator.apache.org/curator-recipes/shared-lock.html). We also
> > > changed the time when the capacity is increased, which occurs now
> pretty
> > > much after the deployment plan is found and inside the zookeeper lock.
> > This
> > > ensure we don't go over the capacity of any host, and it has been
> proven
> > > efficient since a month in our management server cluster.
> > >
> > > This adds another potential requirement which should be discuss before
> > > proposing a PR. Today the code works seamlessly without ZK too, to
> ensure
> > > it's not a hard requirement, for example in a lab.
> > >
> > > Comments?
> > >
> > > Kind regards,
> > > Marc-Aurèle
> > >
> >
> >
> >
> > --
> > With best regards, Ivan Kudryavtsev
> > Bitworks Software, Ltd.
> > Cell: +7-923-414-1515
> > WWW: http://bitworks.software/ 
> >
>
>
>
> --
> Rafael Weingärtner
>



-- 
With best regards, Ivan Kudryavtsev
Bitworks Software, Ltd.
Cell: +7-923-414-1515
WWW: http://bitworks.software/ 


Re: Things to consider in replacing secondary storage

2017-12-18 Thread christian.kirmse
Hi Thura,

AFAIK, yes.
But please keep in mind that I’m not an ACS guru, so there might be a way which
I’m not aware of.
For us it was a quite painless migration, which was the most important thing.

Regards
Christian
> On 6. Dec 2017, at 06:07, Thura, Minn Minn  wrote:
> 
> Hi Christian,
> 
> Appreciate your sharing.
> So the conclusion is there is no way to achieve the below and we have to 
> resort to "one stroke replace".  
> -
> First, to make the system create new template and snapshot only on new 
> secstorage
>   while keeping old secstorage for current snapshot and template.
> -
> 
> Thanks,
> Thura (Fujitsu FIP)
> 
> 
> -Original Message-
> From: christian.kir...@zv.fraunhofer.de 
> [mailto:christian.kir...@zv.fraunhofer.de] 
> Sent: Tuesday, December 05, 2017 7:32 PM
> To: users@cloudstack.apache.org
> Subject: Re: Things to consider in replacing secondary storage
> 
> Hi Thura,
> 
> at the beginning of the year, we also faces this challenge.
> Eventually we used the “one stroke” method which Vivek has described.
> Since we have about 80TB of sec storage it was not option to copy the data 
> while the management servers are down.
> Thankfully our storage system has a built-in feature to synchronise data 
> between volumes, so we synched the data over a couple of days and made a 
> clean cut during our maintenance where we only had to sync the remaining few 
> gigabytes which were new on the old storage.
> If your secondary storage is based on linux you probably can achieve this via 
> rsync or similar tools.
> 
> Regards
> Christian
> 
> On 5. Dec 2017, at 10:40, Thura, Minn Minn 
> > wrote:
> 
> Hi Vivek,
> 
> Appreciate sharing your steps.
> I understand that ur procedure is replacing secstorage at one stroke.
> The copy process will take a very long time(about 1 month) in our environment 
> for tens of TB.
> 
> What we would like to achieve is something like below:
> First, to make the system create new template and snapshot only on new 
> secstorage
>   while keeping old secstorage for current snapshot and template.
> Second, we will gradually copy current snapshot and template to new 
> secstorage and change appropriate db info.
> 
> The problem is I could not find any way to achieve First step.
> (Will filling some dummy data to hog all available space in current 
> secstorage do the trick? Or any other smart way?) Maybe I could use ur 
> procedure to achieve our Second step.
> 
> Thanks,
> Thura (Fujitsu FIP)
> 
> 
> -Original Message-
> From: Vivek Kumar [mailto:vivek.ku...@indiqus.com]
> Sent: Tuesday, December 05, 2017 5:44 PM
> To: users@cloudstack.apache.org
> Subject: Re: Things to consider in replacing secondary storage
> 
> Hello Thura,
> 
> I have done this earlier on as well. Steps which I have followed-
> 
> 1- Create a new secondary storage-
> 2- Mount this secondary storage at any host. ( mount -t nfs 
> x.x.x.x/x:/ / ), make sure that your 
> older secondary storage is also mounted.
> 3- Stop all running Management Servers and  Wait 30 minutes. This allows any 
> writes to secondary storage to complete.
> 4- Copy all data from older secondary storage to newly secondary storage by 
> using cp -Rv ( i.e cp -rp /secondary1/* /secondary2 ).
> 5- Check the integrity of data, make sure that all data has been copied, 
> double check the size , permission, all folders  on the new secondary storage.
> 6- Once you are done with above steps. Now take the DB backup, Please don’t 
> forget to take DB Backup, also new secondary storage should be reachable from 
> host and management server.
> 
> Suppose I have a two NFS server where I have created 2 directories( 
> /secondary1 on first NFS server  and /secondary2 on secondary server)  and 
> shared via NFS.
> 
> First NFS server Path - 192.168.0.100/secondary1 Second NFS Server Path- 
> 192.168.0.200/secondary2
> 
> Please make sure that all data have been copied successfully from  secondary1 
> to  secondary2. ( Double Check the data )
> 
> View of image_store which was looking as below.
> 
> Please make sure that all data have been copied successfully from  secondary1 
> to  secondary2. ( Double Check the data )
> 
> mysql> select * from image_store\G
> *** 1. row ***
>id: 1
>  name: Secondary1
> image_provider_name: NFS
>  protocol: nfs
>   url: nfs://192.168.0.100/secondary1
>data_center_id: 1
> scope: ZONE
>  role: Image
>  uuid: 4a18559e-e6e8-4329-aac6-9ea6c74ec5e6
>parent: 5554e6ec-0dea-3881-9943-d06a6ebe17e8
>   created: 2016-10-04 15:13:40
>   removed: NULL
>total_size: NULL
>used_bytes: NULL
> 1 row in set (0.00 sec)
> 
> Now i want to replace the IP and mount point of the 

Re: [Discuss] Management cluster / Zookeeper holding locks

2017-12-18 Thread Rafael Weingärtner
How hard is it to configure Zookeeper and get everything up and running?
BTW: what would Zookeeper be managing? The CloudStack management servers or
the MySQL nodes?

On Mon, Dec 18, 2017 at 7:13 AM, Ivan Kudryavtsev 
wrote:

> Hello, Marc-Aurele, I strongly believe that all mysql locks should be
> removed in favour of truly DLM solution like Zookeeper. The performance of
> 3node ZK ensemble should be enough to hold up to 1000-2000 locks per second
> and it helps to move to truly clustered MySQL like galera without single
> master server.
>
> 2017-12-18 15:33 GMT+07:00 Marc-Aurèle Brothier :
>
> > Hi everyone,
> >
> > I was wondering how many of you are running CloudStack with a cluster of
> > management servers. I would think most of you, but it would be nice to
> hear
> > everyone voices. And do you get hosts going over their capacity limits?
> >
> > We discovered that during the VM allocation, if you get a lot of parallel
> > requests to create new VMs, most notably with large profiles, the
> capacity
> > increase is done too far after the host capacity checks and results in
> > hosts going over their capacity limits. To detail the steps: the
> deployment
> > planner checks for cluster/host capacity and pick up one deployment plan
> > (zone, cluster, host). The plan is stored in the database under a VMwork
> > job and another thread picks that entry and starts the deployment,
> > increasing the host capacity and sending the commands. Here there's a
> time
> > gap between the host being picked up and the capacity increase for that
> > host of a couple of seconds, which is well enough to go over the capacity
> > on one or more hosts. A few VMwork job can be added in the DB queue
> > targeting the same host before one gets picked up.
> >
> > To fix this issue, we're using Zookeeper to act as the multi JVM lock
> > manager thanks to their curator library (
> > https://curator.apache.org/curator-recipes/shared-lock.html). We also
> > changed the time when the capacity is increased, which occurs now pretty
> > much after the deployment plan is found and inside the zookeeper lock.
> This
> > ensure we don't go over the capacity of any host, and it has been proven
> > efficient since a month in our management server cluster.
> >
> > This adds another potential requirement which should be discuss before
> > proposing a PR. Today the code works seamlessly without ZK too, to ensure
> > it's not a hard requirement, for example in a lab.
> >
> > Comments?
> >
> > Kind regards,
> > Marc-Aurèle
> >
>
>
>
> --
> With best regards, Ivan Kudryavtsev
> Bitworks Software, Ltd.
> Cell: +7-923-414-1515
> WWW: http://bitworks.software/ 
>



-- 
Rafael Weingärtner


Re: [Discuss] Management cluster / Zookeeper holding locks

2017-12-18 Thread Ivan Kudryavtsev
Hello, Marc-Aurele, I strongly believe that all MySQL locks should be
removed in favour of a true DLM solution like Zookeeper. The performance of
a 3-node ZK ensemble should be enough to hold up to 1000-2000 locks per second,
and it would help to move to a truly clustered MySQL setup like Galera without
a single master server.

2017-12-18 15:33 GMT+07:00 Marc-Aurèle Brothier :

> Hi everyone,
>
> I was wondering how many of you are running CloudStack with a cluster of
> management servers. I would think most of you, but it would be nice to hear
> everyone voices. And do you get hosts going over their capacity limits?
>
> We discovered that during the VM allocation, if you get a lot of parallel
> requests to create new VMs, most notably with large profiles, the capacity
> increase is done too far after the host capacity checks and results in
> hosts going over their capacity limits. To detail the steps: the deployment
> planner checks for cluster/host capacity and pick up one deployment plan
> (zone, cluster, host). The plan is stored in the database under a VMwork
> job and another thread picks that entry and starts the deployment,
> increasing the host capacity and sending the commands. Here there's a time
> gap between the host being picked up and the capacity increase for that
> host of a couple of seconds, which is well enough to go over the capacity
> on one or more hosts. A few VMwork job can be added in the DB queue
> targeting the same host before one gets picked up.
>
> To fix this issue, we're using Zookeeper to act as the multi JVM lock
> manager thanks to their curator library (
> https://curator.apache.org/curator-recipes/shared-lock.html). We also
> changed the time when the capacity is increased, which occurs now pretty
> much after the deployment plan is found and inside the zookeeper lock. This
> ensure we don't go over the capacity of any host, and it has been proven
> efficient since a month in our management server cluster.
>
> This adds another potential requirement which should be discuss before
> proposing a PR. Today the code works seamlessly without ZK too, to ensure
> it's not a hard requirement, for example in a lab.
>
> Comments?
>
> Kind regards,
> Marc-Aurèle
>



-- 
With best regards, Ivan Kudryavtsev
Bitworks Software, Ltd.
Cell: +7-923-414-1515
WWW: http://bitworks.software/ 


Re: [Discuss] Management cluster / Zookeeper holding locks

2017-12-18 Thread Daan Hoogland
Are you proposing to add Zookeeper as an optional requirement, Marc-Aurèle?
Or just Curator? And what is the decision mechanism for including it or not?

On Mon, Dec 18, 2017 at 9:33 AM, Marc-Aurèle Brothier 
wrote:

> Hi everyone,
>
> I was wondering how many of you are running CloudStack with a cluster of
> management servers. I would think most of you, but it would be nice to hear
> everyone voices. And do you get hosts going over their capacity limits?
>
> We discovered that during the VM allocation, if you get a lot of parallel
> requests to create new VMs, most notably with large profiles, the capacity
> increase is done too far after the host capacity checks and results in
> hosts going over their capacity limits. To detail the steps: the deployment
> planner checks for cluster/host capacity and pick up one deployment plan
> (zone, cluster, host). The plan is stored in the database under a VMwork
> job and another thread picks that entry and starts the deployment,
> increasing the host capacity and sending the commands. Here there's a time
> gap between the host being picked up and the capacity increase for that
> host of a couple of seconds, which is well enough to go over the capacity
> on one or more hosts. A few VMwork job can be added in the DB queue
> targeting the same host before one gets picked up.
>
> To fix this issue, we're using Zookeeper to act as the multi JVM lock
> manager thanks to their curator library (
> https://curator.apache.org/curator-recipes/shared-lock.html). We also
> changed the time when the capacity is increased, which occurs now pretty
> much after the deployment plan is found and inside the zookeeper lock. This
> ensure we don't go over the capacity of any host, and it has been proven
> efficient since a month in our management server cluster.
>
> This adds another potential requirement which should be discuss before
> proposing a PR. Today the code works seamlessly without ZK too, to ensure
> it's not a hard requirement, for example in a lab.
>
> Comments?
>
> Kind regards,
> Marc-Aurèle
>



-- 
Daan


Re: Need help for first time user

2017-12-18 Thread Vivek Kumar
Hello Sounder,

1- What exactly is your requirement? Are you going to deploy a public or a
private cloud?
2- Which hypervisor do you want to use? (As per your earlier mail you are trying
to configure XenServer; you can use either XenServer 6.5 SP1 or XenServer 7.0.)
3- If you are planning to build a private cloud, i.e. VPC (Advanced Zone in
CloudStack), you need to take care of a few networking things as well: the
CloudStack management VLAN should be able to reach your POD/XENHOST VLAN and
the secondary storage VLAN, and the POD VLAN should be able to reach the CM
VLAN and secondary storage as well.
4- What primary storage are you using? If the primary storage is on another
VLAN, it should be reachable from your hypervisor.


Vivek Kumar
Virtualization and Cloud Consultant


> On 18-Dec-2017, at 12:57 PM, soundar rajan  wrote:
> 
> Tried with xenserver 5.6 sp2 no luck.
> 
> On Sat, Dec 16, 2017 at 5:19 PM, soundar rajan 
> wrote:
> 
>> Everything went well. But management server is not able to communicate
>> with xenserver using xapi link local ip's
>> 
>> 2017-12-16 06:27:28,432 DEBUG [c.c.h.x.r.CitrixResourceBase]
>> (DirectAgent-48:ctx-cbc6b9fe) (logid:a3b30d4b) Trying to connect to
>> 169.254.0.225 attempt 47 of 100
>> 2017-12-16 06:27:29,020 WARN  [o.a.c.f.j.i.AsyncJobMonitor]
>> (Timer-1:ctx-316e4e5c) (logid:6d852e5f) Task (job-12) has been pending for
>> 624 seconds
>> 2017-12-16 06:27:29,020 WARN  [o.a.c.f.j.i.AsyncJobMonitor]
>> (Timer-1:ctx-316e4e5c) (logid:6d852e5f) Task (job-13) has been pending for
>> 622 seconds
>> 2017-12-16 06:27:30,074 DEBUG [c.c.h.x.r.CitrixResourceBase]
>> (DirectAgent-49:ctx-3ee4becb) (logid:39a5433a) Trying to connect to
>> 169.254.0.211 attempt 47 of 100
>> 2017-12-16 06:27:31,215 DEBUG [c.c.a.m.DirectAgentAttache]
>> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) Ping from
>> 1(xenserver-cloudstack)
>> 2017-12-16 06:27:31,216 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
>> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) Process host VM
>> state report from ping process. host: 1
>> 2017-12-16 06:27:31,221 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
>> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) Process VM state
>> report. host: 1, number of records in report: 2
>> 2017-12-16 06:27:31,221 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
>> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) VM state report.
>> host: 1, vm id: 1, power state: PowerOn
>> 2017-12-16 06:27:31,226 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
>> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) VM state report is
>> updated. host: 1, vm id: 1, power state: PowerOn
>> 2017-12-16 06:27:31,228 INFO  [c.c.v.VirtualMachineManagerImpl]
>> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) There is pending
>> job or HA tasks working on the VM. vm id: 1, postpone power-change report
>> by resetting power-change counters
>> 2017-12-16 06:27:31,233 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
>> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) VM state report.
>> host: 1, vm id: 2, power state: PowerOn
>> 2017-12-16 06:27:31,242 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
>> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) VM state report is
>> updated. host: 1, vm id: 2, power state: PowerOn
>> 2017-12-16 06:27:31,245 INFO  [c.c.v.VirtualMachineManagerImpl]
>> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) There is pending
>> job or HA tasks working on the VM. vm id: 2, postpone power-change report
>> by resetting power-change counters
>> 2017-12-16 06:27:31,258 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
>> (DirectAgentCronJob-30:ctx-9034c513) (logid:ea579fd0) Done with process
>> of VM state report. host: 1
>> 2017-12-16 06:27:34,314 INFO  [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (AsyncJobMgr-Heartbeat-1:ctx-a8717eba) (logid:56de398e) Begin cleanup
>> expired async-jobs
>> 2017-12-16 06:27:34,321 INFO  [o.a.c.f.j.i.AsyncJobManagerImpl]
>> (AsyncJobMgr-Heartbeat-1:ctx-a8717eba) (logid:56de398e) End cleanup
>> expired async-jobs
>> 
>> 
>> Please help..
>> 
>> Cloudstack version 4.9
>> 
>> Xenserver version 7 (This is first host without any bonding and 1 nic)
>> 
>> 
>> Thanks
>> Shyam
>> 
>> On Sat, Dec 16, 2017 at 2:36 PM, soundar rajan 
>> wrote:
>> 
>>> error in the cloudstack console
>>> 
>>> Console proxy creation failure. zone: NVINLLPerror details: null
>>> 
>>> 
>>> Secondary Storage Vm creation failure. zone: NVINL...error details: null
>>> 
>>> On Sat, Dec 16, 2017 at 10:14 AM, soundar rajan 
>>> wrote:
>>> 
 Hi Prakash,
 
 Please find the error message
 
 2017-12-15 23:27:14,616 WARN  [o.a.c.f.j.i.AsyncJobMonitor]
 (Timer-1:ctx-88e59de6) (logid:825f4d93) Task (job-12) has been pending for
 298 seconds
 2017-12-15 23:27:14,616 WARN  [o.a.c.f.j.i.AsyncJobMonitor]
 (Timer-1:ctx-88e59de6) (logid:825f4d93) Task (job-13) has 

[Discuss] Management cluster / Zookeeper holding locks

2017-12-18 Thread Marc-Aurèle Brothier
Hi everyone,

I was wondering how many of you are running CloudStack with a cluster of
management servers. I would think most of you, but it would be nice to hear
everyone's voice. And do you get hosts going over their capacity limits?

We discovered that during VM allocation, if you get a lot of parallel
requests to create new VMs, most notably with large profiles, the capacity
increase is done too far after the host capacity checks and results in
hosts going over their capacity limits. To detail the steps: the deployment
planner checks for cluster/host capacity and picks one deployment plan
(zone, cluster, host). The plan is stored in the database under a VMwork
job and another thread picks that entry and starts the deployment,
increasing the host capacity and sending the commands. Here there is a time
gap of a couple of seconds between the host being picked and the capacity
increase for that host, which is more than enough to go over the capacity
on one or more hosts. A few VMwork jobs can be added to the DB queue
targeting the same host before one gets picked up.

To fix this issue, we're using Zookeeper to act as the multi-JVM lock
manager, thanks to the Curator library (
https://curator.apache.org/curator-recipes/shared-lock.html). We also
changed the point at which the capacity is increased: it now happens right
after the deployment plan is found, inside the Zookeeper lock. This
ensures we don't go over the capacity of any host, and it has proven
effective for a month in our management server cluster.
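
To make the new ordering concrete, here is a rough sketch — not our actual
patch. The planner and capacity interfaces below are simplified stand-ins I
made up for the example; only the Curator lock calls are the real API.

    import java.util.concurrent.TimeUnit;

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreMutex;

    public class LockedAllocationSketch {

        private final CuratorFramework zk;        // started once per management server
        private final Planner planner;            // simplified stand-in for the deployment planner
        private final CapacityService capacity;   // simplified stand-in for the capacity manager

        public LockedAllocationSketch(CuratorFramework zk, Planner planner, CapacityService capacity) {
            this.zk = zk;
            this.planner = planner;
            this.capacity = capacity;
        }

        public Destination plan(long vmId) throws Exception {
            Destination dest = planner.findDeploymentPlan(vmId);   // zone, cluster, host
            InterProcessSemaphoreMutex lock =
                    new InterProcessSemaphoreMutex(zk, "/cloudstack/locks/host/" + dest.hostId);
            if (!lock.acquire(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("Timed out locking host " + dest.hostId);
            }
            try {
                // Re-check and reserve inside the lock, before the VMwork job is queued,
                // so another management server cannot over-commit the same host in between.
                if (!capacity.hasRoomFor(dest.hostId, vmId)) {
                    throw new IllegalStateException("No capacity left on host " + dest.hostId);
                }
                capacity.increaseUsedCapacity(dest.hostId, vmId);
                return dest;
            } finally {
                lock.release();
            }
        }

        // Simplified types, only to keep the sketch self-contained.
        public interface Planner { Destination findDeploymentPlan(long vmId); }
        public interface CapacityService {
            boolean hasRoomFor(long hostId, long vmId);
            void increaseUsedCapacity(long hostId, long vmId);
        }
        public static class Destination {
            public final long hostId;
            public Destination(long hostId) { this.hostId = hostId; }
        }
    }

The lock is per host, so parallel deployments targeting different hosts are
not serialized against each other.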

This adds another potential requirement, which should be discussed before
proposing a PR. Today the code also works seamlessly without ZK, to ensure
it's not a hard requirement, for example in a lab; one way to keep it
optional is sketched below.
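
Again only a sketch under the assumption that we hide the lock behind a small
interface of our own — the names here are hypothetical, not existing CloudStack
classes. When no ZK connect string is configured we fall back to a plain
in-JVM lock, which changes nothing for a single-management-server lab setup.

    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.ReentrantLock;

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreMutex;

    // Hypothetical abstraction so ZK stays optional.
    interface ClusterLock {
        boolean tryLock(long timeout, TimeUnit unit) throws Exception;
        void unlock() throws Exception;
    }

    // Used when a ZK ensemble is configured: the lock is visible to every management server.
    class ZkClusterLock implements ClusterLock {
        private final InterProcessSemaphoreMutex mutex;
        ZkClusterLock(CuratorFramework zk, String path) {
            this.mutex = new InterProcessSemaphoreMutex(zk, path);
        }
        public boolean tryLock(long timeout, TimeUnit unit) throws Exception {
            return mutex.acquire(timeout, unit);
        }
        public void unlock() throws Exception {
            mutex.release();
        }
    }

    // Fallback when ZK is absent: only protects within one JVM.
    class LocalClusterLock implements ClusterLock {
        private final ReentrantLock lock = new ReentrantLock();
        public boolean tryLock(long timeout, TimeUnit unit) throws InterruptedException {
            return lock.tryLock(timeout, unit);
        }
        public void unlock() {
            lock.unlock();
        }
    }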

Comments?

Kind regards,
Marc-Aurèle