Re: Experience with clustered/shared filesystems based on SAN storage on KVM?

2021-10-27 Thread Pratik Chandrakar
Since NFS alone doesn't offer HA, what do you recommend for HA NFS?
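For context, a common answer is an active/passive pair (often DRBD-backed) exporting the same data, with failover managed by Pacemaker/Corosync. A rough, untested sketch of the Pacemaker side using the standard resource agents - the device path, export directory and IP below are placeholders:

```shell
# Illustrative only: active/passive HA NFS with Pacemaker.
# Filesystem, nfsserver and IPaddr2 are standard OCF resource agents;
# /dev/drbd0, /export and 10.0.0.50 are placeholder values.
pcs resource create nfs_fs ocf:heartbeat:Filesystem \
    device=/dev/drbd0 directory=/export fstype=ext4 --group nfs_group
pcs resource create nfs_daemon ocf:heartbeat:nfsserver \
    nfs_shared_infodir=/export/nfsinfo --group nfs_group
pcs resource create nfs_vip ocf:heartbeat:IPaddr2 \
    ip=10.0.0.50 cidr_netmask=24 --group nfs_group
```

CloudStack then mounts the floating IP, so a failover looks like a brief NFS stall rather than an outage.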

On Thu, Oct 28, 2021 at 7:37 AM Hean Seng  wrote:

> I had similar considerations when I started exploring CloudStack, but in
> reality a clustered filesystem is not easy to maintain. You seem to have a
> choice of OCFS or GFS2; GFS2 is hard to maintain on Red Hat, and OCFS is
> nowadays only maintained in Oracle Linux. I believe you do not want to
> choose a solution that is very proprietary. Thus plain SAN or iSCSI is not
> really a direct solution here, unless you want to encapsulate it in NFS
> and present that to CloudStack storage.
>
> It works well on both Ceph and NFS, but performance-wise NFS is better,
> and all the documentation and features you see in CloudStack work
> perfectly on NFS.
>
> If you choose Ceph, you may have to accept some performance
> degradation.
>
>
>
> On Thu, Oct 28, 2021 at 12:44 AM Leandro Mendes 
> wrote:
>
> > I've been using Ceph in prod for volumes for some time. Note that
> > although I had several CloudStack installations, this one runs on top of
> > Cinder, but it basically translates to libvirt and RADOS.
> >
> > It is totally stable, and performance, IMHO, is enough for virtualized
> > services.
> >
> > I/O might suffer some penalty due to the data replication inside Ceph.
> > For Elasticsearch, for instance, the degradation would be a bit worse,
> > as there is replication on the application side as well, but IMHO,
> > unless you need extremely low latency, it would be OK.
> >
> >
> > Best,
> >
> > Leandro.
> >
> > On Thu, Oct 21, 2021, 11:20 AM Brussk, Michael <
> michael.bru...@nttdata.com
> > >
> > wrote:
> >
> > > Hello community,
> > >
> > > today I need your experience and knowhow about clustered/shared
> > > filesystems based on SAN storage to be used with KVM.
> > > We need to evaluate a clustered/shared filesystem based on SAN
> > > storage (no NFS or iSCSI), but do not have any knowhow or experience
> > > with this.
> > > Thus I would like to ask if there are any production environments out
> > > there based on SAN storage on KVM.
> > > If so, which clustered/shared filesystem are you using, and how is
> > > your experience with it (stability, reliability, maintainability,
> > > performance, usability, ...)?
> > > Furthermore, if you have already had to choose between SAN storage
> > > and Ceph in the past, I would also like to hear your considerations
> > > and results :)
> > >
> > > Regards,
> > > Michael
> > >
> >
>
>
> --
> Regards,
> Hean Seng
>


-- 
*Regards,*
*Pratik Chandrakar*


Re: Experience with clustered/shared filesystems based on SAN storage on KVM?

2021-10-27 Thread Hean Seng
I had similar considerations when I started exploring CloudStack, but in
reality a clustered filesystem is not easy to maintain. You seem to have a
choice of OCFS or GFS2; GFS2 is hard to maintain on Red Hat, and OCFS is
nowadays only maintained in Oracle Linux. I believe you do not want to
choose a solution that is very proprietary. Thus plain SAN or iSCSI is not
really a direct solution here, unless you want to encapsulate it in NFS
and present that to CloudStack storage.

It works well on both Ceph and NFS, but performance-wise NFS is better,
and all the documentation and features you see in CloudStack work
perfectly on NFS.

If you choose Ceph, you may have to accept some performance
degradation.
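To quantify that trade-off for a specific workload, one approach is to run the same fio profile from a KVM host against a file on each candidate backend (NFS mount vs. a mounted Ceph RBD image). The path below is a placeholder; the flags are standard fio options:

```shell
# Mixed 70/30 random read/write test, 4k blocks, direct I/O, 60s run.
# Repeat with --filename pointed at the other storage backend and
# compare the reported IOPS and latency percentiles.
fio --name=randrw --filename=/mnt/nfs-primary/fio.test --size=1G \
    --rw=randrw --rwmixread=70 --bs=4k --ioengine=libaio --iodepth=32 \
    --direct=1 --runtime=60 --time_based --group_reporting
```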



On Thu, Oct 28, 2021 at 12:44 AM Leandro Mendes 
wrote:

> I've been using Ceph in prod for volumes for some time. Note that
> although I had several CloudStack installations, this one runs on top of
> Cinder, but it basically translates to libvirt and RADOS.
>
> It is totally stable, and performance, IMHO, is enough for virtualized
> services.
>
> I/O might suffer some penalty due to the data replication inside Ceph.
> For Elasticsearch, for instance, the degradation would be a bit worse,
> as there is replication on the application side as well, but IMHO,
> unless you need extremely low latency, it would be OK.
>
>
> Best,
>
> Leandro.
>
> On Thu, Oct 21, 2021, 11:20 AM Brussk, Michael  >
> wrote:
>
> > Hello community,
> >
> > today I need your experience and knowhow about clustered/shared
> > filesystems based on SAN storage to be used with KVM.
> > We need to evaluate a clustered/shared filesystem based on SAN
> > storage (no NFS or iSCSI), but do not have any knowhow or experience
> > with this.
> > Thus I would like to ask if there are any production environments out
> > there based on SAN storage on KVM.
> > If so, which clustered/shared filesystem are you using, and how is your
> > experience with it (stability, reliability, maintainability,
> > performance, usability, ...)?
> > Furthermore, if you have already had to choose between SAN storage
> > and Ceph in the past, I would also like to hear your considerations
> > and results :)
> >
> > Regards,
> > Michael
> >
>


-- 
Regards,
Hean Seng


Re: ACS 4.15 - Disaster recovery after secondary storage issue

2021-10-27 Thread benoit lair
I tried removing the tags from my SR.
I restarted ACS.

Here is the log generated for the system VMs after the reboot:

https://pastebin.com/xJNfA23u

The parts of the log that look odd to me:

2021-10-28 00:31:04,462 DEBUG [c.c.h.x.r.XenServerStorageProcessor]
(DirectAgent-14:ctx-3eaf758f) (logid:cc3c4e1e) Catch Exception
com.xensource.xenapi.Types$UuidInvalid :VDI getByUuid for uuid:
159e620a-575d-43a8-9a57-f3c7f57a1c8a failed due to The uuid you supplied
was invalid.
2021-10-28 00:31:04,462 WARN  [c.c.h.x.r.XenServerStorageProcessor]
(DirectAgent-14:ctx-3eaf758f) (logid:cc3c4e1e) Unable to create volume;
Pool=volumeTO[uuid=e4347562-9454-453d-be04-29dc746aaf33|path=null|datastore=PrimaryDataStoreTO[uuid=fbbf2bf0-ccc8-4df3-9794-c914f418a9d9|name=null|id=2|pooltype=PreSetup]];
Disk:
com.cloud.utils.exception.CloudRuntimeException: Catch Exception
com.xensource.xenapi.Types$UuidInvalid :VDI getByUuid for uuid:
159e620a-575d-43a8-9a57-f3c7f57a1c8a failed due to The uuid you supplied
was invalid.
at
com.cloud.hypervisor.xenserver.resource.XenServerStorageProcessor.getVDIbyUuid(XenServerStorageProcessor.java:655)
at
com.cloud.hypervisor.xenserver.resource.XenServerStorageProcessor.cloneVolumeFromBaseTemplate(XenServerStorageProcessor.java:843)
at
com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.execute(StorageSubsystemCommandHandlerBase.java:99)
at
com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.handleStorageCommands(StorageSubsystemCommandHandlerBase.java:59)
at
com.cloud.hypervisor.xenserver.resource.wrapper.xenbase.CitrixStorageSubSystemCommandWrapper.execute(CitrixStorageSubSystemCommandWrapper.java:36)
at
com.cloud.hypervisor.xenserver.resource.wrapper.xenbase.CitrixStorageSubSystemCommandWrapper.execute(CitrixStorageSubSystemCommandWrapper.java:30)
at
com.cloud.hypervisor.xenserver.resource.wrapper.xenbase.CitrixRequestWrapper.execute(CitrixRequestWrapper.java:122)
at
com.cloud.hypervisor.xenserver.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:1763)
at
com.cloud.agent.manager.DirectAgentAttache$Task.runInContext(DirectAgentAttache.java:315)
at
org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
at
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
at
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
at
org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
at
org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
at
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at
java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: The uuid you supplied was invalid.
at com.xensource.xenapi.Types.checkResponse(Types.java:1491)
at com.xensource.xenapi.Connection.dispatch(Connection.java:395)
at
com.cloud.hypervisor.xenserver.resource.XenServerConnectionPool$XenServerConnection.dispatch(XenServerConnectionPool.java:457)
... 21 more

Is it normal to see this: Unable to create volume;
Pool=volumeTO[uuid=e4347562-9454-453d-be04-29dc746aaf33|path=null|datastore=PrimaryDataStoreTO[uuid=fbbf2bf0-ccc8-4df3-9794-c914f418a9d9|name=null|id=2|pooltype=PreSetup]]
with null values?
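One way to confirm whether XAPI has really lost track of that VDI is to query it directly on the XCP-ng pool master. A sketch using the standard xe CLI - the VDI UUID is taken from the log above, the SR UUID is a placeholder:

```shell
# Check whether the VDI the log complains about exists in XAPI's view.
xe vdi-list uuid=159e620a-575d-43a8-9a57-f3c7f57a1c8a \
    params=uuid,name-label,sr-uuid,managed
# If nothing is returned, rescan the SR so XAPI picks up on-disk VHDs.
xe sr-scan uuid=<sr-uuid>
```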

Best, Benoit

On Thu, 28 Oct 2021 at 00:46, benoit lair  wrote:

> Hello Andrija,
>
> Good catch :)
>
> 2021-10-27 17:59:22,100 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
> (Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
> FirstFitRoutingAllocator) (logid:ce3ac740) Host name: xcp-cluster1-01,
> hostId: 1 is in avoid set, skipping this and trying other available hosts
> 2021-10-27 17:59:22,109 DEBUG [c.c.c.CapacityManagerImpl]
> (Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
> FirstFitRoutingAllocator) (logid:ce3ac740) Host: 3 has cpu capability
> (cpu:48, speed:2593) to support requested CPU: 1 and requested speed: 500
> 2021-10-27 17:59:22,109 DEBUG [c.c.c.CapacityManagerImpl]
> (Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
> FirstFitRoutingAllocator) (logid:ce3ac740) Checking if host: 3 has enough
> capacity for requested CPU: 500 and requested RAM: (512.00 MB) 536870912 ,
> cpuOverprovisioningFactor: 1.0
> 2021-10-27 17:59:22,112 DEBUG [c.c.c.CapacityManagerImpl]
> 

Re: ACS 4.15 - Disaster recovery after secondary storage issue

2021-10-27 Thread benoit lair
Hello Andrija,

Good catch :)

2021-10-27 17:59:22,100 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
FirstFitRoutingAllocator) (logid:ce3ac740) Host name: xcp-cluster1-01,
hostId: 1 is in avoid set, skipping this and trying other available hosts
2021-10-27 17:59:22,109 DEBUG [c.c.c.CapacityManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
FirstFitRoutingAllocator) (logid:ce3ac740) Host: 3 has cpu capability
(cpu:48, speed:2593) to support requested CPU: 1 and requested speed: 500
2021-10-27 17:59:22,109 DEBUG [c.c.c.CapacityManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
FirstFitRoutingAllocator) (logid:ce3ac740) Checking if host: 3 has enough
capacity for requested CPU: 500 and requested RAM: (512.00 MB) 536870912 ,
cpuOverprovisioningFactor: 1.0
2021-10-27 17:59:22,112 DEBUG [c.c.c.CapacityManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
FirstFitRoutingAllocator) (logid:ce3ac740) Hosts's actual total CPU: 124464
and CPU after applying overprovisioning: 124464
2021-10-27 17:59:22,112 DEBUG [c.c.c.CapacityManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
FirstFitRoutingAllocator) (logid:ce3ac740) Free CPU: 74700 , Requested CPU:
500
2021-10-27 17:59:22,112 DEBUG [c.c.c.CapacityManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
FirstFitRoutingAllocator) (logid:ce3ac740) Free RAM: (391.87 GB)
420762157056 , Requested RAM: (512.00 MB) 536870912
2021-10-27 17:59:22,112 DEBUG [c.c.c.CapacityManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
FirstFitRoutingAllocator) (logid:ce3ac740) Host has enough CPU and RAM
available
2021-10-27 17:59:22,112 DEBUG [c.c.c.CapacityManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
FirstFitRoutingAllocator) (logid:ce3ac740) STATS: Can alloc CPU from host:
3, used: 49764, reserved: 0, actual total: 124464, total with
overprovisioning: 124464; requested cpu:500,alloc_from_last_host?:false
,considerReservedCapacity?: true
2021-10-27 17:59:22,112 DEBUG [c.c.c.CapacityManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
FirstFitRoutingAllocator) (logid:ce3ac740) STATS: Can alloc MEM from host:
3, used: (33.50 GB) 35970351104, reserved: (0 bytes) 0, total: (425.37 GB)
456732508160; requested mem: (512.00 MB) 536870912, alloc_from_last_host?:
false , considerReservedCapacity?: true
2021-10-27 17:59:22,112 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
FirstFitRoutingAllocator) (logid:ce3ac740) Found a suitable host, adding to
list: 3
2021-10-27 17:59:22,112 DEBUG [c.c.a.m.a.i.FirstFitAllocator]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8
FirstFitRoutingAllocator) (logid:ce3ac740) Host Allocator returning 1
suitable hosts
2021-10-27 17:59:22,115 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) Checking suitable pools for volume (Id, Type): (211,ROOT)
2021-10-27 17:59:22,115 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) We need to allocate new storagepool for this volume
2021-10-27 17:59:22,116 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) Calling StoragePoolAllocators to find suitable pools
2021-10-27 17:59:22,121 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) System VMs will use shared storage for zone id=1
2021-10-27 17:59:22,121 DEBUG [o.a.c.s.a.LocalStoragePoolAllocator]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) LocalStoragePoolAllocator trying to find storage pool to
fit the vm
2021-10-27 17:59:22,121 DEBUG [o.a.c.s.a.ClusterScopeStoragePoolAllocator]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) ClusterScopeStoragePoolAllocator looking for storage pool
2021-10-27 17:59:22,121 DEBUG [o.a.c.s.a.ClusterScopeStoragePoolAllocator]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) Looking for pools in dc: 1  pod:1  cluster:1. Disabled
pools will be ignored.
2021-10-27 17:59:22,122 DEBUG [o.a.c.s.a.ClusterScopeStoragePoolAllocator]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) Found pools matching tags: [Pool[1|PreSetup],
Pool[2|PreSetup]]
2021-10-27 17:59:22,124 DEBUG [o.a.c.s.a.AbstractStoragePoolAllocator]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) Checking if storage pool is suitable, name: null ,poolId: 1
2021-10-27 17:59:22,124 DEBUG [o.a.c.s.a.AbstractStoragePoolAllocator]
(Work-Job-Executor-93:ctx-30ef4f6b 

Re: Experience with clustered/shared filesystems based on SAN storage on KVM?

2021-10-27 Thread Leandro Mendes
I've been using Ceph in prod for volumes for some time. Note that although
I had several CloudStack installations, this one runs on top of Cinder,
but it basically translates to libvirt and RADOS.

It is totally stable, and performance, IMHO, is enough for virtualized
services.

I/O might suffer some penalty due to the data replication inside Ceph.
For Elasticsearch, for instance, the degradation would be a bit worse, as
there is replication on the application side as well, but IMHO, unless you
need extremely low latency, it would be OK.
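As an illustration of that libvirt/RADOS path: the RBD image backing a volume can be inspected directly from a KVM host with standard Ceph tooling. A sketch - the pool name "cloudstack" and the image name are placeholders:

```shell
# List RBD images in the pool that backs the primary storage.
rbd ls cloudstack
# Show size, features and striping of a specific volume image.
rbd info cloudstack/<volume-uuid>
# qemu can read the image through the same rbd: URI libvirt uses.
qemu-img info rbd:cloudstack/<volume-uuid>:conf=/etc/ceph/ceph.conf
```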


Best,

Leandro.

On Thu, Oct 21, 2021, 11:20 AM Brussk, Michael 
wrote:

> Hello community,
>
> today I need your experience and knowhow about clustered/shared
> filesystems based on SAN storage to be used with KVM.
> We need to evaluate a clustered/shared filesystem based on SAN
> storage (no NFS or iSCSI), but do not have any knowhow or experience with
> this.
> Thus I would like to ask if there are any production environments out
> there based on SAN storage on KVM.
> If so, which clustered/shared filesystem are you using, and how is your
> experience with it (stability, reliability, maintainability, performance,
> usability, ...)?
> Furthermore, if you have already had to choose between SAN storage and
> Ceph in the past, I would also like to hear your considerations
> and results :)
>
> Regards,
> Michael
>


Re: Experience with clustered/shared filesystems based on SAN storage on KVM?

2021-10-27 Thread Andrija Panic
a.v.o.i.d = due to clustered file system stability...

CEPH = an awful lot of knowledge required to have this in production - and
definitely a much better/more stable choice than clustered file systems.

Best,

On Thu, 21 Oct 2021 at 11:20, Brussk, Michael 
wrote:

> Hello community,
>
> today I need your experience and knowhow about clustered/shared
> filesystems based on SAN storage to be used with KVM.
> We need to evaluate a clustered/shared filesystem based on SAN
> storage (no NFS or iSCSI), but do not have any knowhow or experience with
> this.
> Thus I would like to ask if there are any production environments out
> there based on SAN storage on KVM.
> If so, which clustered/shared filesystem are you using, and how is your
> experience with it (stability, reliability, maintainability, performance,
> usability, ...)?
> Furthermore, if you have already had to choose between SAN storage and
> Ceph in the past, I would also like to hear your considerations
> and results :)
>
> Regards,
> Michael
>


-- 

Andrija Panić


Re: ACS 4.15 - Disaster recovery after secondary storage issue

2021-10-27 Thread Andrija Panic
 No suitable storagePools found under this Cluster: 1

Can you check the mgmt log lines BEFORE this line above - there should be a
clear indication of WHY no suitable storage pools are found (this is the
Primary Storage pool).

Best,

On Wed, 27 Oct 2021 at 18:04, benoit lair  wrote:

> Hello guys,
>
> I have an important issue with secondary storage.
>
> I have two NFS secondary storage servers and an ACS management server.
> I lost the system VM template (id 1) on both NFS secondary storage servers.
> The SSVM and CPVM are destroyed.
> The template routing-1 has been deleted on all SRs of the hypervisors (xcp-ng).
>
> I am trying to recover the ACS system template workflow
>
> I have tried to reinstall the system vm template from ACS Mgmt server with
> :
>
>
> /usr/share/cloudstack-common/scripts/storage/secondary/cloud-install-sys-tmplt
> -m /mnt/secondary -u
>
> https://download.cloudstack.org/systemvm/4.15/systemvmtemplate-4.15.1-xen.vhd.bz2
> -h
> 
> xenserver -s  -F
>
> It has recreated the directory tmpl/1/1 on NFS1, uploaded the VHD file,
> and created the template.properties file.
>
> I did the same on NFS2.
> In the ACS GUI, it says the SystemVM Template (XenServer) template is ready.
> On NFS the VHD is present.
> But even after restarting the ACS mgmt server, it fails to start the
> system VMs, with the following error in the mgmt log file:
>
> 2021-10-27 17:59:22,128 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
> (logid:ce3ac740) No suitable storagePools found under this Cluster: 1
> 2021-10-27 17:59:22,129 DEBUG [c.c.a.t.Request]
> (Work-Job-Executor-94:ctx-58cb275b job-2553/job-2649 ctx-fa7b1ea6)
> (logid:02bb9549) Seq 1-873782770202889: Executing:  { Cmd , MgmtId:
> 161064792470736, via: 1(xcp-cluster1-01), Ver: v1, Flags: 100111,
>
> [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.
> cloudstack.storage.to
> .TemplateObjectTO":{"path":"159e620a-575d-43a8-9a57-f3c7f57a1c8a","origUrl":"
>
> https://download.cloudstack.org/systemvm/4.15/systemvmtemplate-4.15.1-xen.vhd.bz2
> ","uuid":"a9151f22-f4bb-4f7a-983e-c8abd01f745b","id":"1","format":"VHD","accountId":"1","checksum":"{MD5}86373992740b1eca8aff8b08ebf3aea5","hvm":"false","displayText":"SystemVM
> Template
>
> (XenServer)","imageDataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"fbbf2bf0-ccc8-4df3-9794-c914f418a9d9","id":"2","poolType":"PreSetup","host":"localhost","path":"/fbbf2bf0-ccc8-4df3-9794-c914f418a9d9","port":"0","url":"PreSetup://localhost/fbbf2bf0-ccc8-4df3-9794-c914f418a9d9/?ROLE=Primary=fbbf2bf0-ccc8-4df3-9794-c914f418a9d9","isManaged":"false"}},"name":"routing-1","size":"(2.44
> GB)
>
> 262144","hypervisorType":"XenServer","bootable":"false","uniqueName":"routing-1","directDownload":"false","deployAsIs":"false"}},"destTO":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"edb85ea0-d786-44f3-901b-e530bb2e6030","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"fbbf2bf0-ccc8-4df3-9794-c914f418a9d9","id":"2","poolType":"PreSetup","host":"localhost","path":"/fbbf2bf0-ccc8-4df3-9794-c914f418a9d9","port":"0","url":"PreSetup://localhost/fbbf2bf0-ccc8-4df3-9794-c914f418a9d9/?ROLE=Primary=fbbf2bf0-ccc8-4df3-9794-c914f418a9d9","isManaged":"false"}},"name":"ROOT-207","size":"(2.45
> GB)
>
> 2626564608","volumeId":"212","vmName":"v-207-VM","accountId":"1","format":"VHD","provisioningType":"THIN","id":"212","deviceId":"0","hypervisorType":"XenServer","directDownload":"false","deployAsIs":"false"}},"executeInSequence":"true","options":{},"options2":{},"wait":"0","bypassHostMaintenance":"false"}}]
> }
> 2021-10-27 17:59:22,129 DEBUG [c.c.a.m.DirectAgentAttache]
> (DirectAgent-221:ctx-737e97d0) (logid:7a1a71eb) Seq 1-873782770202889:
> Executing request
> 2021-10-27 17:59:22,132 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
> (Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
> (logid:ce3ac740) Could not find suitable Deployment Destination for this VM
> under any clusters, returning.
> 2021-10-27 17:59:22,133 DEBUG [c.c.d.FirstFitPlanner]
> (Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
> (logid:ce3ac740) Searching all possible resources under this Zone: 1
> 2021-10-27 17:59:22,134 DEBUG [c.c.d.FirstFitPlanner]
> (Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
> (logid:ce3ac740) Listing clusters in order of aggregate capacity, that have
> (at least one host with) enough CPU and RAM capacity under this Zone: 1
> 2021-10-27 17:59:22,137 DEBUG [c.c.d.FirstFitPlanner]
> (Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
> (logid:ce3ac740) Removing from the clusterId list these clusters from avoid
> set: [1]
> 2021-10-27 17:59:22,138 DEBUG [c.c.h.x.r.XenServerStorageProcessor]
> (DirectAgent-221:ctx-737e97d0) (logid:02bb9549) Catch Exception
> 

ACS 4.15 - Disaster recovery after secondary storage issue

2021-10-27 Thread benoit lair
Hello guys,

I have an important issue with secondary storage.

I have two NFS secondary storage servers and an ACS management server.
I lost the system VM template (id 1) on both NFS secondary storage servers.
The SSVM and CPVM are destroyed.
The template routing-1 has been deleted on all SRs of the hypervisors (xcp-ng).

I am trying to recover the ACS system template workflow

I have tried to reinstall the system vm template from ACS Mgmt server with
:

/usr/share/cloudstack-common/scripts/storage/secondary/cloud-install-sys-tmplt
-m /mnt/secondary -u
https://download.cloudstack.org/systemvm/4.15/systemvmtemplate-4.15.1-xen.vhd.bz2
-h xenserver -s  -F

It has recreated the directory tmpl/1/1 on NFS1, uploaded the VHD file,
and created the template.properties file.

I did the same on NFS2.
In the ACS GUI, it says the SystemVM Template (XenServer) template is ready.
On NFS the VHD is present.
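A couple of sanity checks that might help at this point - paths assume the secondary storage is mounted at /mnt/secondary, and the checksum should match the one published alongside the systemvmtemplate download:

```shell
# Inspect the metadata the seeding script wrote for template id 1.
cat /mnt/secondary/template/tmpl/1/1/template.properties
# Verify the seeded VHD is intact (compare against the published MD5).
md5sum /mnt/secondary/template/tmpl/1/1/*.vhd
```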
But even after restarting the ACS mgmt server, it fails to start the
system VMs, with the following error in the mgmt log file:

2021-10-27 17:59:22,128 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) No suitable storagePools found under this Cluster: 1
2021-10-27 17:59:22,129 DEBUG [c.c.a.t.Request]
(Work-Job-Executor-94:ctx-58cb275b job-2553/job-2649 ctx-fa7b1ea6)
(logid:02bb9549) Seq 1-873782770202889: Executing:  { Cmd , MgmtId:
161064792470736, via: 1(xcp-cluster1-01), Ver: v1, Flags: 100111,
[{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"159e620a-575d-43a8-9a57-f3c7f57a1c8a","origUrl":"
https://download.cloudstack.org/systemvm/4.15/systemvmtemplate-4.15.1-xen.vhd.bz2","uuid":"a9151f22-f4bb-4f7a-983e-c8abd01f745b","id":"1","format":"VHD","accountId":"1","checksum":"{MD5}86373992740b1eca8aff8b08ebf3aea5","hvm":"false","displayText":"SystemVM
Template
(XenServer)","imageDataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"fbbf2bf0-ccc8-4df3-9794-c914f418a9d9","id":"2","poolType":"PreSetup","host":"localhost","path":"/fbbf2bf0-ccc8-4df3-9794-c914f418a9d9","port":"0","url":"PreSetup://localhost/fbbf2bf0-ccc8-4df3-9794-c914f418a9d9/?ROLE=Primary=fbbf2bf0-ccc8-4df3-9794-c914f418a9d9","isManaged":"false"}},"name":"routing-1","size":"(2.44
GB)
262144","hypervisorType":"XenServer","bootable":"false","uniqueName":"routing-1","directDownload":"false","deployAsIs":"false"}},"destTO":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"edb85ea0-d786-44f3-901b-e530bb2e6030","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"fbbf2bf0-ccc8-4df3-9794-c914f418a9d9","id":"2","poolType":"PreSetup","host":"localhost","path":"/fbbf2bf0-ccc8-4df3-9794-c914f418a9d9","port":"0","url":"PreSetup://localhost/fbbf2bf0-ccc8-4df3-9794-c914f418a9d9/?ROLE=Primary=fbbf2bf0-ccc8-4df3-9794-c914f418a9d9","isManaged":"false"}},"name":"ROOT-207","size":"(2.45
GB)
2626564608","volumeId":"212","vmName":"v-207-VM","accountId":"1","format":"VHD","provisioningType":"THIN","id":"212","deviceId":"0","hypervisorType":"XenServer","directDownload":"false","deployAsIs":"false"}},"executeInSequence":"true","options":{},"options2":{},"wait":"0","bypassHostMaintenance":"false"}}]
}
2021-10-27 17:59:22,129 DEBUG [c.c.a.m.DirectAgentAttache]
(DirectAgent-221:ctx-737e97d0) (logid:7a1a71eb) Seq 1-873782770202889:
Executing request
2021-10-27 17:59:22,132 DEBUG [c.c.d.DeploymentPlanningManagerImpl]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) Could not find suitable Deployment Destination for this VM
under any clusters, returning.
2021-10-27 17:59:22,133 DEBUG [c.c.d.FirstFitPlanner]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) Searching all possible resources under this Zone: 1
2021-10-27 17:59:22,134 DEBUG [c.c.d.FirstFitPlanner]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) Listing clusters in order of aggregate capacity, that have
(at least one host with) enough CPU and RAM capacity under this Zone: 1
2021-10-27 17:59:22,137 DEBUG [c.c.d.FirstFitPlanner]
(Work-Job-Executor-93:ctx-30ef4f6b job-2552/job-2648 ctx-d1d9ade8)
(logid:ce3ac740) Removing from the clusterId list these clusters from avoid
set: [1]
2021-10-27 17:59:22,138 DEBUG [c.c.h.x.r.XenServerStorageProcessor]
(DirectAgent-221:ctx-737e97d0) (logid:02bb9549) Catch Exception
com.xensource.xenapi.Types$UuidInvalid :VDI getByUuid for uuid:
159e620a-575d-43a8-9a57-f3c7f57a1c8a failed due to The uuid you supplied
was invalid.
2021-10-27 17:59:22,138 WARN  [c.c.h.x.r.XenServerStorageProcessor]
(DirectAgent-221:ctx-737e97d0) (logid:02bb9549) Unable to create volume;
Pool=volumeTO[uuid=edb85ea0-d786-44f3-901b-e530bb2e6030|path=null|datastore=PrimaryDataStoreTO[uuid=fbbf2bf0-ccc8-4df3-9794-c914f418a9d9|name=null|id=2|pooltype=PreSetup]];
Disk:

Re: ACS 4.15.1 Migration between NFS Secondary storage servers interrupted

2021-10-27 Thread benoit lair
I found out why the other templates weren't migrated to the 2nd secondary
storage server: the mgmt server was blocked on template id 1 (the system VM
template).

In fact, I had initialised the system VM template from the ACS mgmt server
onto the 2nd secondary storage server with cloud-install-sys-tmplt,
so the migration was blocked behind this operation.
Finally, I tried to help the ACS server by moving the template id 1 folder
from the 1st secondary storage to the 2nd.

I finally got the other templates downloaded onto my 2nd secondary storage
server.

But after restarting the ACS mgmt server, the template id 1 folder has been
deleted on both secondary storage servers.
I still have entries in the DB for template id 1 on store_id 1 and store_id
2, but the folder does not exist anymore! :/
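To see what state the management server thinks those entries are in, a read-only query along these lines might help - a sketch against the standard "cloud" database; the column names come from the usual CloudStack schema and are worth double-checking on 4.15:

```shell
# Read-only inspection of the system VM template and its store references.
mysql -u cloud -p cloud -e "
  SELECT id, name, state, removed FROM vm_template WHERE id = 1;
  SELECT store_id, install_path, download_state, state
  FROM template_store_ref WHERE template_id = 1;"
```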

How can I get back my system VM template? I can't start any system VM
routers.

Thanks for your help

Regards, Benoit

On Tue, 26 Oct 2021 at 16:56, Pearl d'Silva  wrote:

> One way to identify them is to check the vm_template table for templates
> that are marked as public but do not have a URL (i.e., null) - such
> templates should be migrated but may be skipped, since in 4.15 public
> templates aren't migrated as they get downloaded on all stores in a zone.
> However, such templates - i.e., templates created from volumes/snapshots
> and then marked as public - do not get synced. This was addressed in
> https://github.com/apache/cloudstack/pull/5404
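For reference, the check Pearl describes maps onto a query roughly like this - a sketch assuming the standard "cloud" database, worth verifying against your schema version:

```shell
# List public templates that have no source URL: the candidates that the
# 4.15 store migration may have skipped.
mysql -u cloud -p cloud -e "
  SELECT id, name, created FROM vm_template
  WHERE public = 1 AND url IS NULL AND removed IS NULL;"
```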
>
>
> Thanks,
> 
> From: benoit lair 
> Sent: Tuesday, October 26, 2021 8:01 PM
> To: users@cloudstack.apache.org 
> Subject: Re: ACS 4.15.1 Migration between NFS Secondary storage servers
> interrupted
>
> Hi Pearl,
>
> I am checking the logs of the mgmt server.
> Regarding the possibility that the template came from a volume, is there
> a way to check this in the database?
>
> Regards, Benoit
>
> On Tue, 26 Oct 2021 at 14:53, Pearl d'Silva  wrote:
>
> > Hi Benoit,
> >
> > Can you please check the logs to see if the specific data objects were
> > skipped from being migrated because they couldn't be accommodated on
> > the destination store. Also, were these templates that were left behind
> > created from volumes/snapshots? In that case, in 4.15, it is a known
> > issue that those files are skipped; this has been addressed in 4.16.
> >
> > Thanks,
> > Pearl
> > 
> > From: benoit lair 
> > Sent: Tuesday, October 26, 2021 5:35 PM
> > To: users@cloudstack.apache.org ;
> > d...@cloudstack.apache.org 
> > Subject: Re: ACS 4.15.1 Migration between NFS Secondary storage servers
> > interrupted
> >
> > Hello Guys,
> >
> > I still have the problem on ACS 4.15.
> > I am trying to migrate my first NFS secondary storage server to another
> > NFS server.
> > ACS reports the migration event IMAGE.STORE.MIGRATE.DATA as:
> > "Successfully completed migrating Image store data. Migrating
> > files/data objects from: NFS Secondary storage 001 to: [NFS Secondary
> > storage 002]"
> >
> > However, there are still templates hosted on the first NFS server.
> >
> > Any ideas why the migration does not work as expected?
> >
> > Regards, Benoit
> > On Wed, 20 Oct 2021 at 15:24, benoit lair  wrote:
> >
> > > Hello,
> > >
> > > I am trying to migrate my first NFS secondary storage to a second NFS
> > > one.
> > > I asked for a migration with a "complete" migration policy.
> > > The job runs but finishes before migrating all the data.
> > >
> > > I had to relaunch the migration, which then continued.
> > >
> > > Any ideas?
> > >
> > > Regards, Benoit
> > >
> >
> >
> >
> >
>
>
>
>


Re: [VOTE] Apache CloudStack 4.16.0.0 (RC2)

2021-10-27 Thread Rohit Yadav
+1 (binding)

Tested a fresh install on EL7 + KVM + x86 using mbx and was able to validate 
basic VM, volume, template, and network lifecycle operations in an advanced 
zone. Smoketests are pending on https://github.com/apache/cloudstack/pull/5201 
due to an env issue, but they will most likely pass, given that RC1 and the 
testing before RC2 was cut were against the same base git SHA.

Tested upgrading from ACS 4.15.2 to 4.16.0 RC2 arm64 
pkgs on a small aarch64/arm64 RPi setup; 
validated basic UI, VM, volume, CKS, template, and network operations. Post 
upgrade, checked that customisations in config.json weren't lost.


Regards.


From: Nicolas Vazquez 
Sent: Monday, October 25, 2021 19:25
To: d...@cloudstack.apache.org ; users 

Subject: [VOTE] Apache CloudStack 4.16.0.0 (RC2)

Hi All,

I have created a 4.16.0.0 release (RC2), with the following artifacts up for 
testing and a vote:

Git Branch and Commit SHA:
https://github.com/apache/cloudstack/tree/4.16.0.0-RC20211025T0851
Commit: 1e070be4c9a87650f48707a44efff2796dfa802a

Source release (checksums and signatures are available at the same location):
https://dist.apache.org/repos/dist/dev/cloudstack/4.16.0.0/

PGP release keys (signed using 656E1BCC8CB54F84):
https://dist.apache.org/repos/dist/release/cloudstack/KEYS

The vote will be open until 28th October 2021, 16.00 CET (72h).

For sanity in tallying the vote, can PMC members please be sure to indicate 
"(binding)" with their vote?

[ ] +1  approve
[ ] +0  no opinion
[ ] -1  disapprove (and reason why)

For users' convenience, the packages from this release candidate (RC2) and
4.16.0 systemvmtemplates are available here:
https://download.cloudstack.org/testing/41600-RC2/
https://download.cloudstack.org/systemvm/4.16/

Regards,
Nicolas Vazquez








Re: Unable to add host

2021-10-27 Thread vas...@gmx.de
Which kind of user account are you using to add the host?

I had similar problems adding a host with a "sudo"-enabled account, as the
commands run while generating the keystore file weren't working correctly.

Am Mi., 27. Okt. 2021 um 12:53 Uhr schrieb Nazmul Parvej <
nazmul.par...@bol-online.com>:

> Hi There,
>
> Please see my WARNING log below; I can't understand why the host is not
> added to my Advanced zone. All servers are Ubuntu 20.04 LTS.
>
> Management Server:
> In Web Portal Global Settings
>
> ca.plugin.root.auth.strictness is set to false
>
>
> For SSH, the following config is set:
>
> PubkeyAcceptedKeyTypes=+ssh-dss
> HostKeyAlgorithms=+ssh-dss
> KexAlgorithms=+diffie-hellman-group1-sha1
>
>
> In Host Server:
> Host as KVM Server Installation
> ==
>
> apt-get install qemu-kvm cloudstack-agent
> sed -i -e 's/\#vnc_listen.*$/vnc_listen = "0.0.0.0"/g'
> /etc/libvirt/qemu.conf
> sed -i -e 's/.*libvirtd_opts.*/env libvirtd_opts="-l"/'
> /etc/default/libvirtd
> echo 'listen_tls=0' >> /etc/libvirt/libvirtd.conf
> echo 'listen_tcp=1' >> /etc/libvirt/libvirtd.conf
> echo 'tcp_port = "16509"' >> /etc/libvirt/libvirtd.conf
> echo 'mdns_adv = 0' >> /etc/libvirt/libvirtd.conf
> echo 'auth_tcp = "none"' >> /etc/libvirt/libvirtd.conf
>
> systemctl restart libvirtd
>
> apt-get install uuid
> UUID=$(uuid)
> echo host_uuid = \"$UUID\" >> /etc/libvirt/libvirtd.conf
> systemctl restart libvirtd
>
> vi /etc/ssh/sshd_config
> PubkeyAcceptedKeyTypes=+ssh-dss
> HostKeyAlgorithms=+ssh-dss
> KexAlgorithms=+diffie-hellman-group1-sha1
>
> systemctl restart ssh
> systemctl restart sshd
>
>
>
>
> Please see my log error from ACS mgmt server
>
>
>
> 2021-10-27 15:59:16,063 WARN  [c.c.u.n.Link]
> (AgentManager-SSLHandshakeHandler-1:null) (logid:) This SSL engine was
> forced to close inbound due to end of stream.
> 2021-10-27 15:59:16,959 WARN  [c.c.a.d.ParamGenericValidationWorker]
> (qtp1074389766-290:ctx-827d2286 ctx-57d77cbc) (logid:8809eb47) Received
> unknown parameters for command addHost. Unknown parameters : clustertype
> 2021-10-27 15:59:30,870 WARN  [c.c.h.k.d.LibvirtServerDiscoverer]
> (qtp1074389766-290:ctx-827d2286 ctx-57d77cbc) (logid:8809eb47)  can't setup
> agent, due to com.cloud.utils.exception.CloudRuntimeException: Failed to
> setup keystore on the KVM host: 10.10.9.51 - Failed to setup keystore on
> the KVM host: 10.10.9.51
> 2021-10-27 15:59:30,871 WARN  [c.c.r.ResourceManagerImpl]
> (qtp1074389766-290:ctx-827d2286 ctx-57d77cbc) (logid:8809eb47) Unable to
> find the server resources at http://10.10.9.51
> 2021-10-27 15:59:30,872 WARN  [o.a.c.a.c.a.h.AddHostCmd]
> (qtp1074389766-290:ctx-827d2286 ctx-57d77cbc) (logid:8809eb47) Exception:
> 2021-10-27 16:00:09,415 WARN  [c.c.a.d.ParamGenericValidationWorker]
> (qtp1074389766-291:ctx-32254b38 ctx-81598c92) (logid:276372ee) Received
> unknown parameters for command addHost. Unknown parameters : clustertype
> 2021-10-27 16:00:23,404 WARN  [c.c.h.k.d.LibvirtServerDiscoverer]
> (qtp1074389766-291:ctx-32254b38 ctx-81598c92) (logid:276372ee)  can't setup
> agent, due to com.cloud.utils.exception.CloudRuntimeException: Failed to
> setup keystore on the KVM host: 10.10.9.51 - Failed to setup keystore on
> the KVM host: 10.10.9.51
> 2021-10-27 16:00:23,405 WARN  [c.c.r.ResourceManagerImpl]
> (qtp1074389766-291:ctx-32254b38 ctx-81598c92) (logid:276372ee) Unable to
> find the server resources at http://10.10.9.51
> 2021-10-27 16:00:23,408 WARN  [o.a.c.a.c.a.h.AddHostCmd]
> (qtp1074389766-291:ctx-32254b38 ctx-81598c92) (logid:276372ee) Exception:
> 2021-10-27 16:00:27,313 WARN  [c.c.u.n.Link]
> (AgentManager-SSLHandshakeHandler-1:null) (logid:) This SSL engine was
> forced to close inbound due to end of stream.
>
>
>
> Yours sincerely,
>
> Nazmul Parvej
>


Unable to add host

2021-10-27 Thread Nazmul Parvej
Hi There,

Please see my WARNING log below; I can't understand why the host is not
added to my Advanced zone. All servers are Ubuntu 20.04 LTS.

Management Server:
In Web Portal Global Settings

ca.plugin.root.auth.strictness is set to false


For SSH, the following config is set:

PubkeyAcceptedKeyTypes=+ssh-dss
HostKeyAlgorithms=+ssh-dss
KexAlgorithms=+diffie-hellman-group1-sha1


In Host Server:
Host as KVM Server Installation
==

apt-get install qemu-kvm cloudstack-agent
sed -i -e 's/\#vnc_listen.*$/vnc_listen = "0.0.0.0"/g'
/etc/libvirt/qemu.conf
sed -i -e 's/.*libvirtd_opts.*/env libvirtd_opts="-l"/'
/etc/default/libvirtd
echo 'listen_tls=0' >> /etc/libvirt/libvirtd.conf
echo 'listen_tcp=1' >> /etc/libvirt/libvirtd.conf
echo 'tcp_port = "16509"' >> /etc/libvirt/libvirtd.conf
echo 'mdns_adv = 0' >> /etc/libvirt/libvirtd.conf
echo 'auth_tcp = "none"' >> /etc/libvirt/libvirtd.conf

systemctl restart libvirtd

apt-get install uuid
UUID=$(uuid)
echo host_uuid = \"$UUID\" >> /etc/libvirt/libvirtd.conf
systemctl restart libvirtd

vi /etc/ssh/sshd_config
PubkeyAcceptedKeyTypes=+ssh-dss
HostKeyAlgorithms=+ssh-dss
KexAlgorithms=+diffie-hellman-group1-sha1

systemctl restart ssh
systemctl restart sshd
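As a quick sanity check that the libvirt changes above actually landed, the
appended settings can be verified before retrying addHost. A minimal sketch
(the helper name is made up; the file path and key/value pairs mirror the
commands in this email, omitting mdns_adv and host_uuid since they don't
affect connectivity):

```python
import re

# Settings the commands above append to /etc/libvirt/libvirtd.conf.
REQUIRED = {
    "listen_tls": "0",
    "listen_tcp": "1",
    "tcp_port": '"16509"',
    "auth_tcp": '"none"',
}

def check_libvirtd_conf(text):
    """Return names of required settings that are missing or have the wrong value."""
    bad = []
    for key, want in REQUIRED.items():
        # Match a "key = value" line anywhere in the file.
        m = re.search(rf"^\s*{key}\s*=\s*(.+?)\s*$", text, re.MULTILINE)
        if m is None or m.group(1) != want:
            bad.append(key)
    return bad

if __name__ == "__main__":
    with open("/etc/libvirt/libvirtd.conf") as f:
        print(check_libvirtd_conf(f.read()) or "all settings present")
```

An empty result means libvirtd should be listening on TCP 16509 without auth
after the restart, which is what the management server expects here.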




Please see my log error from ACS mgmt server



2021-10-27 15:59:16,063 WARN  [c.c.u.n.Link]
(AgentManager-SSLHandshakeHandler-1:null) (logid:) This SSL engine was
forced to close inbound due to end of stream.
2021-10-27 15:59:16,959 WARN  [c.c.a.d.ParamGenericValidationWorker]
(qtp1074389766-290:ctx-827d2286 ctx-57d77cbc) (logid:8809eb47) Received
unknown parameters for command addHost. Unknown parameters : clustertype
2021-10-27 15:59:30,870 WARN  [c.c.h.k.d.LibvirtServerDiscoverer]
(qtp1074389766-290:ctx-827d2286 ctx-57d77cbc) (logid:8809eb47)  can't setup
agent, due to com.cloud.utils.exception.CloudRuntimeException: Failed to
setup keystore on the KVM host: 10.10.9.51 - Failed to setup keystore on
the KVM host: 10.10.9.51
2021-10-27 15:59:30,871 WARN  [c.c.r.ResourceManagerImpl]
(qtp1074389766-290:ctx-827d2286 ctx-57d77cbc) (logid:8809eb47) Unable to
find the server resources at http://10.10.9.51
2021-10-27 15:59:30,872 WARN  [o.a.c.a.c.a.h.AddHostCmd]
(qtp1074389766-290:ctx-827d2286 ctx-57d77cbc) (logid:8809eb47) Exception:
2021-10-27 16:00:09,415 WARN  [c.c.a.d.ParamGenericValidationWorker]
(qtp1074389766-291:ctx-32254b38 ctx-81598c92) (logid:276372ee) Received
unknown parameters for command addHost. Unknown parameters : clustertype
2021-10-27 16:00:23,404 WARN  [c.c.h.k.d.LibvirtServerDiscoverer]
(qtp1074389766-291:ctx-32254b38 ctx-81598c92) (logid:276372ee)  can't setup
agent, due to com.cloud.utils.exception.CloudRuntimeException: Failed to
setup keystore on the KVM host: 10.10.9.51 - Failed to setup keystore on
the KVM host: 10.10.9.51
2021-10-27 16:00:23,405 WARN  [c.c.r.ResourceManagerImpl]
(qtp1074389766-291:ctx-32254b38 ctx-81598c92) (logid:276372ee) Unable to
find the server resources at http://10.10.9.51
2021-10-27 16:00:23,408 WARN  [o.a.c.a.c.a.h.AddHostCmd]
(qtp1074389766-291:ctx-32254b38 ctx-81598c92) (logid:276372ee) Exception:
2021-10-27 16:00:27,313 WARN  [c.c.u.n.Link]
(AgentManager-SSLHandshakeHandler-1:null) (logid:) This SSL engine was
forced to close inbound due to end of stream.



Yours sincerely,

Nazmul Parvej


Re: Virtual router checks failing

2021-10-27 Thread Paul
Hi Wei,

I am using version 4.15 at the moment.

Do you recommend that I upgrade, then?

On 2021/10/27 08:16:04, Wei ZHOU  wrote: 
> Hi,
> 
> Which cloudstack version do you use ?
> 
> A similar issue has been fixed in 4.16.0.0-rc2. see
> https://github.com/apache/cloudstack/pull/5554
> 
> -Wei
> 
> On Wed, 27 Oct 2021 at 10:09, Paul  wrote:
> 
> > Hi, it's a mystery to me why dns_check.py for my deployment is failing
> > every 10 minutes. I searched Google but cannot find anything relevant to
> > my scenario. Please advise.
> >
> > Log snippet:
> >
> > 2021-10-27 15:58:51,569 DEBUG [c.c.a.m.AgentManagerImpl]
> > (AgentManager-Handler-12:null) (logid:) SeqA 6-1926377: Sending Seq
> > 6-1926377:  { Ans: , MgmtId: 1743772850, via: 6, Ver: v1, Flags: 100010,
> > [{"com.cloud.agent.api.AgentControlAnswer":{"result":"true","wait":"0"}}] }
> > 2021-10-27 15:58:55,789 DEBUG [o.a.c.s.SecondaryStorageManagerImpl]
> > (secstorage-1:ctx-19c33859) (logid:f3458df2) Zone 1 is ready to launch
> > secondary storage VM
> > 2021-10-27 15:58:55,929 DEBUG [c.c.c.ConsoleProxyManagerImpl]
> > (consoleproxy-1:ctx-0e2ba419) (logid:fa5ecc2a) Zone 1 is ready to launch
> > console proxy
> > 2021-10-27 15:58:58,894 DEBUG [c.c.a.m.DirectAgentAttache]
> > (DirectAgentCronJob-319:ctx-1b85b80c) (logid:eb936b4c) Ping from 3(LAB)
> > 2021-10-27 15:58:58,894 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> > (DirectAgentCronJob-319:ctx-1b85b80c) (logid:eb936b4c) Process host VM
> > state report from ping process. host: 3
> > 2021-10-27 15:58:59,030 DEBUG [c.c.a.m.DirectAgentAttache]
> > (DirectAgent-392:ctx-44d6234c) (logid:17ec443d) Seq 3-6314046677573480412:
> > Response Received:
> > 2021-10-27 15:58:59,044 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> > (DirectAgentCronJob-319:ctx-1b85b80c) (logid:eb936b4c) Process VM state
> > report. host: 3, number of records in report: 24
> > 2021-10-27 15:58:59,044 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> > (DirectAgentCronJob-319:ctx-1b85b80c) (logid:eb936b4c) VM state report.
> > host: 3, vm id: 98, power state: PowerOn
> > 2021-10-27 15:58:59,045 DEBUG [c.c.a.t.Request]
> > (DirectAgent-392:ctx-44d6234c) (logid:17ec443d) Seq 3-6314046677573480412:
> > Processing:  { Ans: , MgmtId: 1743772850, via: 3(LAB), Ver: v1, Flags: 10,
> > [{"com.cloud.agent.api.routing.GetRouterMonitorResultsAnswer":{"failingChecks":["dns_check.py"],"monitoringResults":"{"basic":{"lastRun":
> > {"duration": "11.6007211208", "start": "2021-10-27 07:57:01.679916", "end":
> > "2021-10-27 07:57:13.280637"}, "disk_space_check.py": {"lastRunDuration":
> > "32.3760509491", "lastUpdate": "1635321429961", "message": "Sufficient free
> > space is 1032 MB", "success": "true"}, "gateways_check.py":
> > {"lastRunDuration": "8066.79701805", "lastUpdate": "1635321421894",
> > "message": "All 1 gateways are reachable via ping", "success": "true"},
> > "ssh.service": {"lastRunDuration": "30.956029892", "lastUpdate":
> > "1635321421681", "message": "service is running", "success": "true"},
> > "dhcp.service": {"lastRunDuration": "9.65595245361", "lastUpdate":
> > "1635321421681", "message": "servi
> >  ce is running", "success": "true"}, "memory_usage_check.py":
> > {"lastRunDuration": "54.6729564667", "lastUpdate": "1635321421773",
> > "message": "Memory Usage within limits with current at 55.1192%",
> > "success": "true"}, "cpu_usage_check.py": {"lastRunDuration":
> > "3215.58308601", "lastUpdate": "1635321430064", "message": "CPU Usage
> > within limits with current at 1.0%", "success": "true"},
> > "router_version_check.py": {"lastRunDuration": "33.4870815277",
> > "lastUpdate": "1635321421740", "message": "Template and scripts version
> > match successful", "success": "true"}, "webserver.service":
> > {"lastRunDuration": "20.1549530029", "lastUpdate": "1635321421681",
> > "message": "service is running", "success": "true"}},"advanced":{"lastRun":
> > {"duration": "0.335355997086", "start": "2021-10-27 07:50:01.912077",
> > "end": "2021-10-27 07:50:02.247433"}, "haproxy_check.py":
> > {"lastRunDuration": "29.464006424", "lastUpdate": "1635321002030",
> > "message": "No data provided to check, skipping", "success": "true"},
> > "dhcp_ch
> >  eck.py": {"lastRunDuration": "32.4671268463", "lastUpdate":
> > "1635321001997", "message": "All 40 VMs are present in dhcphosts.txt",
> > "success": "true"}, "dns_check.py": {"lastRunDuration": "35.9070301056",
> > "lastUpdate": "1635321002164", "message": "Missing entries for VMs in
> > /etc/hosts -\n103.239.221.225 VM-59c7fd13-32f7-4ace-958f-18c84f2b50f5,
> > 103.239.221.225 VM-59c7fd13-32f7-4ace-958f-18c84f2b50f5, 103.239.221.230
> > VM-94b471ec-65ef-4f57-8774-c523a689d421, 103.239.221.230
> > VM-94b471ec-65ef-4f57-8774-c523a689d421", "success": "false"},
> > "iptables_check.py": {"lastRunDuration": "41.7759418488", "lastUpdate":
> > "1635321002122", "message": "No portforwarding rules provided to check,
> > skipping", "success":
> > 

Re: Virtual router checks failing

2021-10-27 Thread Wei ZHOU
Hi,

Which cloudstack version do you use ?

A similar issue has been fixed in 4.16.0.0-rc2. see
https://github.com/apache/cloudstack/pull/5554

-Wei

On Wed, 27 Oct 2021 at 10:09, Paul  wrote:

> Hi, it's a mystery to me why dns_check.py for my deployment is failing every
> 10 minutes. I searched Google but cannot find anything relevant to my
> scenario. Please advise.
>
> Log snippet:
>
> 2021-10-27 15:58:51,569 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentManager-Handler-12:null) (logid:) SeqA 6-1926377: Sending Seq
> 6-1926377:  { Ans: , MgmtId: 1743772850, via: 6, Ver: v1, Flags: 100010,
> [{"com.cloud.agent.api.AgentControlAnswer":{"result":"true","wait":"0"}}] }
> 2021-10-27 15:58:55,789 DEBUG [o.a.c.s.SecondaryStorageManagerImpl]
> (secstorage-1:ctx-19c33859) (logid:f3458df2) Zone 1 is ready to launch
> secondary storage VM
> 2021-10-27 15:58:55,929 DEBUG [c.c.c.ConsoleProxyManagerImpl]
> (consoleproxy-1:ctx-0e2ba419) (logid:fa5ecc2a) Zone 1 is ready to launch
> console proxy
> 2021-10-27 15:58:58,894 DEBUG [c.c.a.m.DirectAgentAttache]
> (DirectAgentCronJob-319:ctx-1b85b80c) (logid:eb936b4c) Ping from 3(LAB)
> 2021-10-27 15:58:58,894 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> (DirectAgentCronJob-319:ctx-1b85b80c) (logid:eb936b4c) Process host VM
> state report from ping process. host: 3
> 2021-10-27 15:58:59,030 DEBUG [c.c.a.m.DirectAgentAttache]
> (DirectAgent-392:ctx-44d6234c) (logid:17ec443d) Seq 3-6314046677573480412:
> Response Received:
> 2021-10-27 15:58:59,044 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> (DirectAgentCronJob-319:ctx-1b85b80c) (logid:eb936b4c) Process VM state
> report. host: 3, number of records in report: 24
> 2021-10-27 15:58:59,044 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl]
> (DirectAgentCronJob-319:ctx-1b85b80c) (logid:eb936b4c) VM state report.
> host: 3, vm id: 98, power state: PowerOn
> 2021-10-27 15:58:59,045 DEBUG [c.c.a.t.Request]
> (DirectAgent-392:ctx-44d6234c) (logid:17ec443d) Seq 3-6314046677573480412:
> Processing:  { Ans: , MgmtId: 1743772850, via: 3(LAB), Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.routing.GetRouterMonitorResultsAnswer":{"failingChecks":["dns_check.py"],"monitoringResults":"{"basic":{"lastRun":
> {"duration": "11.6007211208", "start": "2021-10-27 07:57:01.679916", "end":
> "2021-10-27 07:57:13.280637"}, "disk_space_check.py": {"lastRunDuration":
> "32.3760509491", "lastUpdate": "1635321429961", "message": "Sufficient free
> space is 1032 MB", "success": "true"}, "gateways_check.py":
> {"lastRunDuration": "8066.79701805", "lastUpdate": "1635321421894",
> "message": "All 1 gateways are reachable via ping", "success": "true"},
> "ssh.service": {"lastRunDuration": "30.956029892", "lastUpdate":
> "1635321421681", "message": "service is running", "success": "true"},
> "dhcp.service": {"lastRunDuration": "9.65595245361", "lastUpdate":
> "1635321421681", "message": "servi
>  ce is running", "success": "true"}, "memory_usage_check.py":
> {"lastRunDuration": "54.6729564667", "lastUpdate": "1635321421773",
> "message": "Memory Usage within limits with current at 55.1192%",
> "success": "true"}, "cpu_usage_check.py": {"lastRunDuration":
> "3215.58308601", "lastUpdate": "1635321430064", "message": "CPU Usage
> within limits with current at 1.0%", "success": "true"},
> "router_version_check.py": {"lastRunDuration": "33.4870815277",
> "lastUpdate": "1635321421740", "message": "Template and scripts version
> match successful", "success": "true"}, "webserver.service":
> {"lastRunDuration": "20.1549530029", "lastUpdate": "1635321421681",
> "message": "service is running", "success": "true"}},"advanced":{"lastRun":
> {"duration": "0.335355997086", "start": "2021-10-27 07:50:01.912077",
> "end": "2021-10-27 07:50:02.247433"}, "haproxy_check.py":
> {"lastRunDuration": "29.464006424", "lastUpdate": "1635321002030",
> "message": "No data provided to check, skipping", "success": "true"},
> "dhcp_ch
>  eck.py": {"lastRunDuration": "32.4671268463", "lastUpdate":
> "1635321001997", "message": "All 40 VMs are present in dhcphosts.txt",
> "success": "true"}, "dns_check.py": {"lastRunDuration": "35.9070301056",
> "lastUpdate": "1635321002164", "message": "Missing entries for VMs in
> /etc/hosts -\n103.239.221.225 VM-59c7fd13-32f7-4ace-958f-18c84f2b50f5,
> 103.239.221.225 VM-59c7fd13-32f7-4ace-958f-18c84f2b50f5, 103.239.221.230
> VM-94b471ec-65ef-4f57-8774-c523a689d421, 103.239.221.230
> VM-94b471ec-65ef-4f57-8774-c523a689d421", "success": "false"},
> "iptables_check.py": {"lastRunDuration": "41.7759418488", "lastUpdate":
> "1635321002122", "message": "No portforwarding rules provided to check,
> skipping", "success":
> "true"}}}","result":"true","details":"{"basic":{"lastRun": {"duration":
> "11.6007211208", "start": "2021-10-27 07:57:01.679916", "end": "2021-10-27
> 07:57:13.280637"}, "disk_space_check.py": {"lastRunDuration":
> "32.3760509491", "lastUpdate": "1635321429961", "message": "Sufficient free
> space is
>   1032 MB", 

Virtual router checks failing

2021-10-27 Thread Paul
Hi, it's a mystery to me why dns_check.py for my deployment is failing every
10 minutes. I searched Google but cannot find anything relevant to my
scenario. Please advise.
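For context on what the failure below means: a router health check of the
dns_check.py kind compares the "ip hostname" pairs the router is supposed to
serve against /etc/hosts and fails on any missing pair. An illustrative
sketch (not the actual CloudStack script; the function name and inputs are
invented):

```python
def find_missing_hosts(expected_entries, etc_hosts_text):
    """Return (ip, name) pairs from expected_entries that are absent from /etc/hosts.

    expected_entries: iterable of (ip, hostname) pairs the router should resolve.
    etc_hosts_text:   contents of the router's /etc/hosts file.
    """
    present = set()
    for line in etc_hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # strip comments, tokenize
        if len(fields) >= 2:
            ip, *names = fields
            for name in names:           # a line may map one IP to many names
                present.add((ip, name))
    return [pair for pair in expected_entries if pair not in present]
```

A non-empty result is what produces a "Missing entries for VMs in /etc/hosts"
message like the one in the log, which usually points at stale or incomplete
dnsmasq-generated /etc/hosts entries on the virtual router.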

Log snippet:

2021-10-27 15:58:51,569 DEBUG [c.c.a.m.AgentManagerImpl] 
(AgentManager-Handler-12:null) (logid:) SeqA 6-1926377: Sending Seq 6-1926377:  
{ Ans: , MgmtId: 1743772850, via: 6, Ver: v1, Flags: 100010, 
[{"com.cloud.agent.api.AgentControlAnswer":{"result":"true","wait":"0"}}] }
2021-10-27 15:58:55,789 DEBUG [o.a.c.s.SecondaryStorageManagerImpl] 
(secstorage-1:ctx-19c33859) (logid:f3458df2) Zone 1 is ready to launch 
secondary storage VM
2021-10-27 15:58:55,929 DEBUG [c.c.c.ConsoleProxyManagerImpl] 
(consoleproxy-1:ctx-0e2ba419) (logid:fa5ecc2a) Zone 1 is ready to launch 
console proxy
2021-10-27 15:58:58,894 DEBUG [c.c.a.m.DirectAgentAttache] 
(DirectAgentCronJob-319:ctx-1b85b80c) (logid:eb936b4c) Ping from 3(LAB)
2021-10-27 15:58:58,894 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] 
(DirectAgentCronJob-319:ctx-1b85b80c) (logid:eb936b4c) Process host VM state 
report from ping process. host: 3
2021-10-27 15:58:59,030 DEBUG [c.c.a.m.DirectAgentAttache] 
(DirectAgent-392:ctx-44d6234c) (logid:17ec443d) Seq 3-6314046677573480412: 
Response Received: 
2021-10-27 15:58:59,044 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] 
(DirectAgentCronJob-319:ctx-1b85b80c) (logid:eb936b4c) Process VM state report. 
host: 3, number of records in report: 24
2021-10-27 15:58:59,044 DEBUG [c.c.v.VirtualMachinePowerStateSyncImpl] 
(DirectAgentCronJob-319:ctx-1b85b80c) (logid:eb936b4c) VM state report. host: 
3, vm id: 98, power state: PowerOn
2021-10-27 15:58:59,045 DEBUG [c.c.a.t.Request] (DirectAgent-392:ctx-44d6234c) 
(logid:17ec443d) Seq 3-6314046677573480412: Processing:  { Ans: , MgmtId: 
1743772850, via: 3(LAB), Ver: v1, Flags: 10, 
[{"com.cloud.agent.api.routing.GetRouterMonitorResultsAnswer":{"failingChecks":["dns_check.py"],"monitoringResults":"{"basic":{"lastRun":
 {"duration": "11.6007211208", "start": "2021-10-27 07:57:01.679916", "end": 
"2021-10-27 07:57:13.280637"}, "disk_space_check.py": {"lastRunDuration": 
"32.3760509491", "lastUpdate": "1635321429961", "message": "Sufficient free 
space is 1032 MB", "success": "true"}, "gateways_check.py": {"lastRunDuration": 
"8066.79701805", "lastUpdate": "1635321421894", "message": "All 1 gateways are 
reachable via ping", "success": "true"}, "ssh.service": {"lastRunDuration": 
"30.956029892", "lastUpdate": "1635321421681", "message": "service is running", 
"success": "true"}, "dhcp.service": {"lastRunDuration": "9.65595245361", 
"lastUpdate": "1635321421681", "message": "servi
 ce is running", "success": "true"}, "memory_usage_check.py": 
{"lastRunDuration": "54.6729564667", "lastUpdate": "1635321421773", "message": 
"Memory Usage within limits with current at 55.1192%", "success": "true"}, 
"cpu_usage_check.py": {"lastRunDuration": "3215.58308601", "lastUpdate": 
"1635321430064", "message": "CPU Usage within limits with current at 1.0%", 
"success": "true"}, "router_version_check.py": {"lastRunDuration": 
"33.4870815277", "lastUpdate": "1635321421740", "message": "Template and 
scripts version match successful", "success": "true"}, "webserver.service": 
{"lastRunDuration": "20.1549530029", "lastUpdate": "1635321421681", "message": 
"service is running", "success": "true"}},"advanced":{"lastRun": {"duration": 
"0.335355997086", "start": "2021-10-27 07:50:01.912077", "end": "2021-10-27 
07:50:02.247433"}, "haproxy_check.py": {"lastRunDuration": "29.464006424", 
"lastUpdate": "1635321002030", "message": "No data provided to check, 
skipping", "success": "true"}, "dhcp_ch
 eck.py": {"lastRunDuration": "32.4671268463", "lastUpdate": "1635321001997", 
"message": "All 40 VMs are present in dhcphosts.txt", "success": "true"}, 
"dns_check.py": {"lastRunDuration": "35.9070301056", "lastUpdate": 
"1635321002164", "message": "Missing entries for VMs in /etc/hosts 
-\n103.239.221.225 VM-59c7fd13-32f7-4ace-958f-18c84f2b50f5, 103.239.221.225 
VM-59c7fd13-32f7-4ace-958f-18c84f2b50f5, 103.239.221.230 
VM-94b471ec-65ef-4f57-8774-c523a689d421, 103.239.221.230 
VM-94b471ec-65ef-4f57-8774-c523a689d421", "success": "false"}, 
"iptables_check.py": {"lastRunDuration": "41.7759418488", "lastUpdate": 
"1635321002122", "message": "No portforwarding rules provided to check, 
skipping", "success": 
"true"}}}","result":"true","details":"{"basic":{"lastRun": {"duration": 
"11.6007211208", "start": "2021-10-27 07:57:01.679916", "end": "2021-10-27 
07:57:13.280637"}, "disk_space_check.py": {"lastRunDuration": "32.3760509491", 
"lastUpdate": "1635321429961", "message": "Sufficient free space is
  1032 MB", "success": "true"}, "gateways_check.py": {"lastRunDuration": 
"8066.79701805", "lastUpdate": "1635321421894", "message": "All 1 gateways are 
reachable via ping", "success": "true"}, "ssh.service": {"lastRunDuration": 
"30.956029892", "lastUpdate": "1635321421681", "message": "service is