abh1sar commented on issue #12424:
URL: https://github.com/apache/cloudstack/issues/12424#issuecomment-3817173448

   I was able to reproduce the issue (With some code orchestration to speed up 
the process)
   
   Steps:
   1. Set `usage.stats.job.aggregation.range` to **5 minutes**
   <img width="1032" height="85" alt="Image" 
src="https://github.com/user-attachments/assets/52abdfbc-88bf-4d65-9d81-ed03632eb0f1";
 />
   
   2. Put a sleep in `UsageManagerImpl.parse(UsageJobVO job, long 
startDateMillis, long endDateMillis)` for **5 minutes** (we need a long running 
usage parsing job) 
   3. stop the cloudstack-usage service.
   4. start it after **~12 minutes**
   5. The heartbeat code sees that it has been long since last job was 
scheduled, so it schedules it.
   6. Usage job scheduled sleeps for **5 minutes**
   7. Heartbeat code runs every minute and logs the Usage job starting every 
one minute (It doesn't actually start new jobs, just updates the last running 
job with the new heartbeat)
   ```
   2026-01-29 11:31:58,846 INFO  [cloud.usage.UsageServer] (main:[]) (logid:) 
UsageServer ready...                                                            
                                                                                
                                                                                
         
   2026-01-29 11:32:13,827 DEBUG [usage.UsageManagerImpl.Heartbeat] 
(Usage-HB-1:[]) (logid:) it's been 733825 ms since last usage job and 63166176 
ms until next job, scheduling an immediate job to catch up (aggregation 
duration is 5 minutes)                                                          
                            
   2026-01-29 11:32:13,828 DEBUG 
[cloud.usage.UsageManagerImpl_EnhancerByCloudStack_7373c433] (Usage-HB-1:[]) 
(logid:) Scheduling Usage job...                                                
                                                                                
                                                         
   2026-01-29 11:32:13,829 INFO  
[cloud.usage.UsageManagerImpl_EnhancerByCloudStack_7373c433] (Usage-Job-1:[]) 
(logid:) starting usage job...                                                  
                                                                                
                                                        
   2026-01-29 11:32:13,881 INFO  
[cloud.usage.UsageManagerImpl_EnhancerByCloudStack_7373c433] (Usage-Job-1:[]) 
(logid:) Parsing usage records between [2026-01-29T11:20:00+0000] and 
[2026-01-29T11:31:59+0000].                                                     
                                                                  
   2026-01-29 11:33:13,808 DEBUG [usage.UsageManagerImpl.Heartbeat] 
(Usage-HB-1:[]) (logid:) it's been 793808 ms since last usage job and 63106193 
ms until next job, scheduling an immediate job to catch up (aggregation 
duration is 5 minutes)                                                          
                            
   2026-01-29 11:33:13,808 DEBUG 
[cloud.usage.UsageManagerImpl_EnhancerByCloudStack_7373c433] (Usage-HB-1:[]) 
(logid:) Scheduling Usage job...                                                
                                                                                
                                                         
   2026-01-29 11:34:13,806 DEBUG [usage.UsageManagerImpl.Heartbeat] 
(Usage-HB-1:[]) (logid:) it's been 853806 ms since last usage job and 63046195 
ms until next job, scheduling an immediate job to catch up (aggregation 
duration is 5 minutes)                                                          
                            
   2026-01-29 11:34:13,806 DEBUG 
[cloud.usage.UsageManagerImpl_EnhancerByCloudStack_7373c433] (Usage-HB-1:[]) 
(logid:) Scheduling Usage job...                                                
                                                                                
                                                         
   2026-01-29 11:35:13,816 DEBUG [usage.UsageManagerImpl.Heartbeat] 
(Usage-HB-1:[]) (logid:) it's been 913816 ms since last usage job and 62986185 
ms until next job, scheduling an immediate job to catch up (aggregation 
duration is 5 minutes)                                                          
                            
   2026-01-29 11:35:13,816 DEBUG 
[cloud.usage.UsageManagerImpl_EnhancerByCloudStack_7373c433] (Usage-HB-1:[]) 
(logid:) Scheduling Usage job...                                                
                                                                                
                                                         
   2026-01-29 11:36:13,804 DEBUG [usage.UsageManagerImpl.Heartbeat] 
(Usage-HB-1:[]) (logid:) it's been 973804 ms since last usage job and 62926197 
ms until next job, scheduling an immediate job to catch up (aggregation 
duration is 5 minutes)                                                          
                            
   2026-01-29 11:36:13,805 DEBUG 
[cloud.usage.UsageManagerImpl_EnhancerByCloudStack_7373c433] (Usage-HB-1:[]) 
(logid:) Scheduling Usage job...                                                
                                                                                
                                                         
   2026-01-29 11:37:13,804 DEBUG [usage.UsageManagerImpl.Heartbeat] 
(Usage-HB-1:[]) (logid:) it's been 1033804 ms since last usage job and 62866197 
ms until next job, scheduling an immediate job to catch up (aggregation 
duration is 5 minutes)                                                          
                           
   2026-01-29 11:37:13,804 DEBUG 
[cloud.usage.UsageManagerImpl_EnhancerByCloudStack_7373c433] (Usage-HB-1:[]) 
(logid:) Scheduling Usage job...      
   ```
   
   ```
   mysql> mysql> select * from usage_job where end_millis=0;
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   | id   | host                                                 | pid    | 
job_type | scheduled | start_millis | end_millis | exec_time | start_date | 
end_date | success | heartbeat           |
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   | 1811 | ref-trl-10702-k-Mol8-abhisar-sinha-mgmt2/10.0.35.225 | 393299 |     
   0 |         0 |            0 |          0 |         0 | NULL       | NULL    
 |    NULL | 2026-01-29 11:34:13 |
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   1 row in set (0.00 sec)
   
   mysql> select * from usage_job where end_millis=0;
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   | id   | host                                                 | pid    | 
job_type | scheduled | start_millis | end_millis | exec_time | start_date | 
end_date | success | heartbeat           |
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   | 1811 | ref-trl-10702-k-Mol8-abhisar-sinha-mgmt2/10.0.35.225 | 393299 |     
   0 |         0 |            0 |          0 |         0 | NULL       | NULL    
 |    NULL | 2026-01-29 11:34:13 |
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   1 row in set (0.00 sec)
   
   mysql> 
   mysql> select * from usage_job where end_millis=0;
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   | id   | host                                                 | pid    | 
job_type | scheduled | start_millis | end_millis | exec_time | start_date | 
end_date | success | heartbeat           |
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   | 1811 | ref-trl-10702-k-Mol8-abhisar-sinha-mgmt2/10.0.35.225 | 393299 |     
   0 |         0 |            0 |          0 |         0 | NULL       | NULL    
 |    NULL | 2026-01-29 11:35:13 |
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   1 row in set (0.00 sec)
   
   mysql> select * from usage_job where end_millis=0;
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   | id   | host                                                 | pid    | 
job_type | scheduled | start_millis | end_millis | exec_time | start_date | 
end_date | success | heartbeat           |
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   | 1811 | ref-trl-10702-k-Mol8-abhisar-sinha-mgmt2/10.0.35.225 | 393299 |     
   0 |         0 |            0 |          0 |         0 | NULL       | NULL    
 |    NULL | 2026-01-29 11:36:13 |
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   1 row in set (0.00 sec)
   
   mysql> select * from usage_job where end_millis=0;
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   | id   | host                                                 | pid    | 
job_type | scheduled | start_millis | end_millis | exec_time | start_date | 
end_date | success | heartbeat           |
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   | 1812 | ref-trl-10702-k-Mol8-abhisar-sinha-mgmt2/10.0.35.225 | 393299 |     
   0 |         0 |            0 |          0 |         0 | NULL       | NULL    
 |    NULL | 2026-01-29 11:37:16 |
   
+------+------------------------------------------------------+--------+----------+-----------+--------------+------------+-----------+------------+----------+---------+---------------------+
   1 row in set (0.00 sec)
   
   ```
   
   Please note that the new jobs are not actually being created. The last 
running job is updated.
   
   As a fix, Heartbeat code should check if there is a job already running and 
not try to schedule it again.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to