Re: AutoScaling group not triggering

Ricardo Pertuz Thu, 01 Jun 2023 11:25:45 -0700

Thanks Wei,

I enabled DEBUG in log4j and it is not returning any errors or warnings. I'm
using the utility "stress --cpu 4" to simulate a busy CPU, but nothing happens
even with a 1% threshold.
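
For reference, Wei notes further down the thread that the CPU usage is calculated from the CPU time of the VM. A minimal sketch of that kind of calculation follows, assuming two samples of cumulative CPU time in nanoseconds and a known vCPU count. The function name, the parameters, and the normalization by vCPU count are illustrative assumptions, not CloudStack's actual code.

```python
def cpu_utilization_percent(cpu_time_ns_prev, cpu_time_ns_now,
                            elapsed_seconds, vcpus):
    """Percent of the available vCPU time used between two samples.

    Hypothetical helper: cpu_time_ns_* are cumulative CPU times in
    nanoseconds, elapsed_seconds is the wall-clock time between samples.
    """
    if elapsed_seconds <= 0 or vcpus <= 0:
        raise ValueError("elapsed_seconds and vcpus must be positive")
    used_ns = cpu_time_ns_now - cpu_time_ns_prev
    # Total CPU time the VM could have used: wall time times number of vCPUs.
    available_ns = elapsed_seconds * 1_000_000_000 * vcpus
    return 100.0 * used_ns / available_ns


# Made-up numbers: a 4-vCPU VM that consumed 12 s of CPU time in a 60 s window.
print(cpu_utilization_percent(0, 12_000_000_000, 60, 4))  # 5.0
```

Under these assumptions, a load that keeps one vCPU busy on a 4-vCPU VM shows up as roughly 25%, which is one way the reported value can differ from what is seen inside the guest.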





2023-06-01 13:19:22,690 DEBUG [cloud.agent.Agent] (agentRequest-Handler-4:null) (logid:7b03dbe0) Processing command:com.cloud.agent.api.GetHostStatsCommand
2023-06-01 13:19:30,516 DEBUG [cloud.agent.Agent] (agentRequest-Handler-2:null) (logid:aeb5068e) Processing command:com.cloud.agent.api.GetVmStatsCommand
2023-06-01 13:19:30,516 DEBUG [kvm.resource.LibvirtConnection] (agentRequest-Handler-2:null) (logid:aeb5068e) Looking for libvirtd connection at: qemu:///system
2023-06-01 13:19:30,541 DEBUG [kvm.resource.LibvirtVMDef] (agentRequest-Handler-2:null) (logid:aeb5068e) Using informed label [hdc] for volume [null].
2023-06-01 13:19:30,543 DEBUG [kvm.resource.LibvirtVMDef] (agentRequest-Handler-2:null) (logid:aeb5068e) Using informed label [hdc] for volume [null]
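
Checking agent.log by eye gets tedious, so here is a minimal sketch of how the search for WARN/ERROR entries tied to a GetVmStatsCommand could be automated. It assumes one log entry per line in the layout shown in this thread; the regular expression and the function name `problems_for_command` are my own and not part of CloudStack.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the entries quoted in this thread:
# "<date> <time>,<ms> <LEVEL> [<logger>] (<thread>) (logid:<id>) <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]*)\)\s+\(logid:(?P<logid>[0-9a-f]+)\)\s+(?P<msg>.*)$"
)


def problems_for_command(log_text, command="GetVmStatsCommand"):
    """Return {logid: [(level, message), ...]} with the WARN/ERROR entries
    that share a logid with a 'Processing command' entry for `command`."""
    by_logid = defaultdict(list)
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            by_logid[m["logid"]].append((m["level"], m["msg"]))
    result = {}
    for logid, entries in by_logid.items():
        handles_command = any(
            "Processing command" in msg and command in msg for _, msg in entries
        )
        bad = [(lvl, msg) for lvl, msg in entries if lvl in ("WARN", "ERROR")]
        if handles_command and bad:
            result[logid] = bad
    return result


# Entries copied from this thread: Wei's request logs a WARN, mine does not.
SAMPLE = """\
2023-06-01 08:42:55,925 DEBUG [cloud.agent.Agent] (agentRequest-Handler-4:null) (logid:5cfb0714) Processing command: com.cloud.agent.api.GetVmStatsCommand
2023-06-01 08:42:55,928 WARN [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:null) (logid:5cfb0714) Couldn't retrieve free memory, returning -1.
2023-06-01 13:19:30,516 DEBUG [cloud.agent.Agent] (agentRequest-Handler-2:null) (logid:aeb5068e) Processing command:com.cloud.agent.api.GetVmStatsCommand
2023-06-01 13:19:30,541 DEBUG [kvm.resource.LibvirtVMDef] (agentRequest-Handler-2:null) (logid:aeb5068e) Using informed label [hdc] for volume [null].
"""

print(problems_for_command(SAMPLE))
```

Grouping by logid keeps unrelated warnings from other requests out of the result.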


BR

Ricardo Pertuz





June 1, 2023, 3:47, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:


> 
> Hi Ricardo,
> 
> ACS gets the VM statistics (including CPU, memory, network and disk
> statistics) by sending GetVmStatsCommand to the KVM host and receiving
> GetVmStatsAnswer in response.
> Can you check agent.log for errors?
> 
> For example, I cannot get memory statistics because of the error below:
> ```
> 2023-06-01 08:42:55,925 DEBUG [cloud.agent.Agent] (agentRequest-Handler-4:null) (logid:5cfb0714) Processing command: com.cloud.agent.api.GetVmStatsCommand
> 2023-06-01 08:42:55,925 DEBUG [kvm.resource.LibvirtConnection] (agentRequest-Handler-4:null) (logid:5cfb0714) Looking for libvirtd connection at: qemu:///system
> 2023-06-01 08:42:55,928 WARN [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:null) (logid:5cfb0714) Couldn't retrieve free memory, returning -1.
> ```
> 
> But I do get the CPU load (which is very low):
> 
> ```
> mysql> select * from autoscale_vmgroup_statistics;
> +-------+------------+-----------+------------+-------------+------------------+-----------------------+------------------+---------------------+----------+
> | id    | vmgroup_id | policy_id | counter_id | resource_id | resource_type    | raw_value             | value_type       | created             | state    |
> +-------+------------+-----------+------------+-------------+------------------+-----------------------+------------------+---------------------+----------+
> ...
> | 34142 |          5 |        13 |        101 |        9020 | UserVm           |  0.003534817956875221 | INSTANT_VM       | 2023-06-01 08:39:02 | ACTIVE   |
> | 34143 |          5 |        15 |        101 |        9020 | UserVm           |  0.003534817956875221 | INSTANT_VM       | 2023-06-01 08:39:02 | ACTIVE   |
> | 34144 |          5 |        13 |        101 |        9021 | UserVm           | 0.0035341933203746245 | INSTANT_VM       | 2023-06-01 08:39:02 | ACTIVE   |
> | 34145 |          5 |        15 |        101 |        9021 | UserVm           | 0.0035341933203746245 | INSTANT_VM       | 2023-06-01 08:39:02 | ACTIVE   |
> ```
> 
> -Wei
> 
> On Tue, 30 May 2023 at 23:03, Ricardo Pertuz
> <ricardo.per...@kuasar.co.invalid> wrote:
> 
> > 
> > Here's what I see
> > 
> >  2023-05-30 15:08:35,483 INFO [resource.virtualnetwork.VirtualRoutingResource] (agentRequest-Handler-4:null) (logid:bff92bbe) Fetching health check result for 169.254.82.180 and executing fresh checks: false
> >  2023-05-30 15:08:35,884 INFO [resource.virtualnetwork.VirtualRoutingResource] (agentRequest-Handler-2:null) (logid:bff92bbe) Fetching health check result for 169.254.116.47 and executing fresh checks: false
> >  2023-05-30 15:08:36,333 INFO [resource.virtualnetwork.VirtualRoutingResource] (agentRequest-Handler-3:null) (logid:bff92bbe) Fetching health check result for 169.254.166.143 and executing fresh checks: false
> >  2023-05-30 15:08:36,739 INFO [resource.virtualnetwork.VirtualRoutingResource] (agentRequest-Handler-1:null) (logid:bff92bbe) Fetching health check result for 169.254.47.117 and executing fresh checks: false
> > 
> >  Ricardo Pertuz
> > 
> >  May 30, 2023 at 3:25 PM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:
> > 
> >  Hi Ricardo,
> > 
> >  It looks like the CPU usage (raw_value) is 0. Can you check the agent.log?
> > 
> >  INACTIVE means the AS VM group was being changed at that time,
> >  for example by create/enable/disable/scaleup/scaledown.
> > 
> >  -Wei
> > 
> >  On Tuesday, 30 May 2023, Ricardo Pertuz <ricardo.per...@kuasar.co.invalid>
> >  wrote:
> > 
> >  >
> >  > Hi Wei,
> >  >
> >  > Thanks for replying. My threshold is 5%, just to check, and the ACS
> >  > metrics show 28% usage.
> >  >
> >  > There seem to be no errors in the logs; however, I see this message:
> >  >
> >  > success: Creating file in VR, with ip: 169.254.89.121, file: monitor_service.json.ec3acdd8-b1c1-4603-9fde-79eece662390","null - success: Invalid unit name "cloud-password-server@172.28.0.1,172.28.0.83" escaped as "cloud-password-server@172.28.0.1\x2c172.28.0.83" (maybe you should use systemd-escape?)
> >  >
> >  > 2023-05-30 15:05:13,988 DEBUG [c.c.s.StatsCollector] (StatsCollector-6:ctx-361217f1) (logid:f59c817a) AutoScaling Monitor is running...
> >  > 2023-05-30 15:05:13,989 DEBUG [c.c.s.StatsCollector] (StatsCollector-6:ctx-361217f1) (logid:f59c817a) Skipping AutoScaling Monitor
> >  > 2023-05-30 15:05:14,225 DEBUG [c.c.n.a.AutoScaleManagerImpl] (VmGroup-Monitor-4-1:ctx-94401ba2) (logid:1b0d873d) Start monitoring on AutoScale VmGroup AutoScaleVmGroupVO[id=4|name=scaler01|loadBalancerId=93|profileId=5]
> >  > 2023-05-30 15:05:14,232 DEBUG [c.c.n.a.AutoScaleManagerImpl] (VmGroup-Monitor-4-1:ctx-94401ba2) (logid:1b0d873d) [AutoScale] Collecting performance data ...
> >  > 2023-05-30 15:05:14,239 DEBUG [c.c.n.a.AutoScaleManagerImpl] (VmGroup-Monitor-4-1:ctx-94401ba2) (logid:1b0d873d) [AutoScale] Collecting performance data from hosts ...
> >  >
> >  > 2023-05-30 15:04:47,539 DEBUG [c.c.s.StatsCollector] (Cluster-Worker-4706:ctx-4eac4d6f) (logid:e5e83be1) StatusUpdate from 262699919842878, json: {"managementServerHostId":202,"managementServerHostUuid":"016a5d17-44ec-429b-acd9-36ee81fbd295","collectionTime":"May 30, 2023, 3:04:47 PM","sessions":0,"cpuUtilization":0.0,"totalJvmMemoryBytes":455081984,"freeJvmMemoryBytes":108107048,"maxJvmMemoryBytes":1908932607,"processJvmMemoryBytes":0,"jvmUptime":594979551,"jvmStartTime":1684882107946,"availableProcessors":16,"loadAverage":6.48,"totalInit":1062535168,"totalUsed":573048008,"totalCommitted":691445760,"pid"
> >  > Regarding the database, this is what I see. I'm not sure why the state
> >  > is INACTIVE.
> >  >
> >  > MariaDB [cloud]> select * from autoscale_vmgroup_statistics limit 5;
> >  > +-----+------------+-----------+------------+-------------+------------------+-----------+------------------+---------------------+----------+
> >  > | id  | vmgroup_id | policy_id | counter_id | resource_id | resource_type    | raw_value | value_type       | created             | state    |
> >  > +-----+------------+-----------+------------+-------------+------------------+-----------+------------------+---------------------+----------+
> >  > | 294 |          2 |         0 |          0 |           2 | AutoScaleVmGroup |        -1 | INSTANT_VM_GROUP | 2023-05-30 13:48:25 | INACTIVE |
> >  > | 295 |          2 |         0 |          0 |           2 | AutoScaleVmGroup |        -1 | INSTANT_VM_GROUP | 2023-05-30 13:48:31 | INACTIVE |
> >  > | 296 |          2 |         0 |          0 |           2 | AutoScaleVmGroup |        -1 | INSTANT_VM_GROUP | 2023-05-30 13:48:37 | INACTIVE |
> >  > | 297 |          2 |         3 |        106 |        9842 | UserVm           |         0 | INSTANT_VM       | 2023-05-30 13:48:44 | ACTIVE   |
> >  > | 298 |          2 |         4 |        106 |        9842 | UserVm           |         0 | INSTANT_VM       | 2023-05-30 13:48:44 | ACTIVE   |
> >  > +-----+------------+-----------+------------+-------------+------------------+-----------+------------------+---------------------+----------+
> >  >
> >  > Regards,
> >  >
> >  > Ricardo Pertuz
> >  >
> >  > May 30, 2023 at 2:39 PM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:
> >  >
> >  > Hi Ricardo,
> >  >
> >  > We (including dev and QA) have done intensive testing with different
> >  > hypervisors and scenarios. You may have hit a bug, but a
> >  > misconfiguration is more likely.
> >  >
> >  > You can check with the following steps:
> >  > (1) check the database table "autoscale_vmgroup_statistics" to see if the
> >  > metrics have been collected with the correct values and frequency.
> >  > (2) check management-server.log to see if CloudStack checks the metrics
> >  > periodically.
> >  >
> >  > I suggest you test with a small threshold. The CPU usage is collected
> >  > from the KVM hypervisor and calculated from the CPU time of the VM, so it
> >  > might differ considerably from what you expect.
> >  >
> >  > -Wei
> >  >
> >  > On Tuesday, 30 May 2023, Ricardo Pertuz <ricardo.per...@kuasar.co.invalid>
> >  > wrote:
> >  >
> >  > >
> >  > > Hi,
> >  > >
> >  > > In our environment with ACS 4.18 and the KVM hypervisor, we have
> >  > > configured an autoscale VM group with a CPU average counter. However, it
> >  > > does not trigger the scale-up even though the threshold has been exceeded
> >  > > for longer than the stipulated duration. What should we check? Are we
> >  > > missing something?
> >  > >
> >  > > Min Instances 1 (always remains at 1 instance)
> >  > > Max Instances 3
> >  > >
> >  > > Ricardo Pertuz
> >  > >
> >  >
> >
>
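
The first check in Wei's list (whether autoscale_vmgroup_statistics holds sensible values collected at a regular frequency) can be sketched offline once the rows are exported from the database. This is a minimal illustration, assuming each row is reduced to (resource_id, raw_value, created); the function name `summarize` and the row layout are hypothetical, and the sample rows are the five shown earlier in the thread.

```python
from datetime import datetime
from statistics import mean


def summarize(rows):
    """rows: iterable of (resource_id, raw_value, created), where created is
    a 'YYYY-MM-DD HH:MM:SS' string. Returns per-resource sample count, mean
    raw_value and the largest gap in seconds between consecutive samples."""
    grouped = {}
    for resource_id, raw_value, created in rows:
        ts = datetime.strptime(created, "%Y-%m-%d %H:%M:%S")
        grouped.setdefault(resource_id, []).append((ts, float(raw_value)))
    summary = {}
    for resource_id, samples in grouped.items():
        samples.sort()
        times = [t for t, _ in samples]
        gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
        summary[resource_id] = {
            "samples": len(samples),
            "mean_raw_value": mean(v for _, v in samples),
            "max_gap_seconds": max(gaps) if gaps else None,
        }
    return summary


# Rows 294-298 from the MariaDB output quoted above.
ROWS = [
    (2, -1, "2023-05-30 13:48:25"),
    (2, -1, "2023-05-30 13:48:31"),
    (2, -1, "2023-05-30 13:48:37"),
    (9842, 0, "2023-05-30 13:48:44"),
    (9842, 0, "2023-05-30 13:48:44"),
]

for resource_id, stats in summarize(ROWS).items():
    print(resource_id, stats)
```

A mean raw_value that stays at 0 for a VM under load, or large gaps between samples, would point at the collection side rather than at the scaling policy.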
