mwa-sudo opened a new issue, #13399:
URL: https://github.com/apache/cloudstack/issues/13399

   ### problem
   
   We are experiencing what appears to be the same issue described in #13112. 
In our environment, however, it affects `usage_type = 1` (default VM usage) 
instead of the usage type reported in the original issue.
   The Usage Server repeatedly reprocesses historical usage records from a 
fixed timestamp and continuously inserts duplicate records into the 
`cloud_usage` table.
   
   ### **observed behaviour**
   
   The Usage Server repeatedly processes the same historical records and 
generates duplicate rows in `cloud_usage`.
   
   * start date repeatedly reprocessed: `2026-05-29T22:00:00+0200`
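
For cross-referencing with the Usage Server log lines, which are in UTC, the start date above can be converted with the Python standard library. This is only a small illustrative sketch; the variable names are ours.

```python
from datetime import datetime, timezone

# Start date as reported above (local offset +02:00).
start_local = datetime.strptime("2026-05-29T22:00:00+0200", "%Y-%m-%dT%H:%M:%S%z")

# Same instant in UTC, the time zone used in the Usage Server log lines.
start_utc = start_local.astimezone(timezone.utc)

print(start_utc.isoformat())  # 2026-05-29T20:00:00+00:00
```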
   
   ### **relation to #13112**
   
   This issue appears very similar to #13112.
   
   Differences:
   
   * #13112 reported the problem for `usage_type = 13`
   * We observe the same behaviour for `usage_type = 1`
   * The issue was reported as fixed in a later CloudStack release by the 
original poster. However, we are running `4.22.1.0`, which is newer than the 
reported version.
   
   Because the symptoms are effectively identical, this suggests that the 
issue persists in newer versions.
   
   ### **evidence**
   
   Relevant Usage Server logs:
   
   During the period where duplicate usage records are generated, the Usage 
Server repeatedly processes historical time ranges and emits SQL integrity 
constraint violations.
   A log snippet is attached. VM and account names have been redacted from the 
logs for data protection purposes. 
   
   
[_usage.2026-05-28.log](https://github.com/user-attachments/files/28835661/_usage.2026-05-28.log)
   
   A representative example is shown below:
   `2026-05-29 23:15:00,005 INFO Parsing usage records between [2026-05-29T20:00:00+0000] and [2026-05-29T20:59:59+0000]`\
   `ERROR: Duplicate entry '570-2026-05-29 20:50:40' for key 'id' java.sql.SQLIntegrityConstraintViolationException`\
   `WARN: Failed to create usage event id: 1703940 type: VOLUME.CREATE due to Entity already exists`
   
   The exception indicates that the Usage Server is attempting to insert 
records that already exist. This appears to coincide with the repeated 
processing of historical usage intervals and the continuous growth of duplicate 
usage records described above.
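
To get a quick overview of how often these violations occur in a log file, something like the following could be used. It is a minimal sketch: the regular expression is based only on the message format in the excerpt above, and the sample lines are that excerpt with the e-mail line wrapping undone.

```python
import re
from collections import Counter

# Duplicate-key message as it appears in the excerpt above.
DUPLICATE_RE = re.compile(r"Duplicate entry '(?P<entry>[^']+)' for key '(?P<key>[^']+)'")

def count_duplicate_entries(lines):
    """Count how often each (key, entry) pair is reported as a duplicate."""
    counts = Counter()
    for line in lines:
        match = DUPLICATE_RE.search(line)
        if match:
            counts[(match["key"], match["entry"])] += 1
    return counts

sample = [
    "2026-05-29 23:15:00,005 INFO Parsing usage records between "
    "[2026-05-29T20:00:00+0000] and [2026-05-29T20:59:59+0000]",
    "ERROR: Duplicate entry '570-2026-05-29 20:50:40' for key 'id' "
    "java.sql.SQLIntegrityConstraintViolationException",
    "WARN: Failed to create usage event id: 1703940 type: VOLUME.CREATE "
    "due to Entity already exists",
]
print(count_duplicate_entries(sample))
```

For a real log file, pass the open file object instead of `sample`, since iterating over it yields one line at a time.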
   
   To quantify the issue, we wrote a Python script that hashes usage records 
and counts identical entries for a single virtual machine grouped by hour.
   The expected result is exactly one record per hourly interval (`count = 1`).
   This expectation is met until `2026-05-29 22:00`, after which duplicate 
records begin appearing. The duplication follows a distinct pattern:
   
   * `2026-05-29 22:00` → 204 identical records
   * `2026-05-29 23:00` → 203 identical records
   * `2026-05-30 00:00` → 202 identical records
   * `2026-05-30 01:00` → 201 identical records
   
   The pattern continues, with each subsequent hourly bucket containing one 
fewer duplicate than the previous one.
   This suggests that historical usage intervals are being repeatedly 
regenerated or reprocessed by the Usage Server: if every hourly run re-inserts 
all intervals from the fixed start date onwards, an interval that is one hour 
older has been through one more run and therefore has one more copy. The total 
number of records keeps growing over time.
   Example output for a single VM:
   
   `{`\
   `  "2026-05-29": {`\
   `    "21": 1,`\
   `    "22": 204,`\
   `    "23": 203`\
   `  },`\
   `  "2026-05-30": {`\
   `    "00": 202,`\
   `    "01": 201,`\
   `    "02": 200,`\
   `    "03": 199`\
   `  }`\
   `}`
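
A minimal sketch of this kind of check is shown below. It is not the exact script we ran: the record fields (`vm`, `start`, `end`, `raw_usage`) and the sample values are placeholders chosen for illustration, and the records are assumed to be already filtered to a single VM.

```python
import hashlib
import json
from collections import defaultdict

def count_identical_records(records):
    """Return {day: {hour: count}}, where count is the largest group of
    byte-identical records whose start timestamp falls in that hour."""
    buckets = defaultdict(lambda: defaultdict(int))
    for rec in records:
        # Hash the whole record so only fully identical rows are grouped.
        digest = hashlib.sha256(json.dumps(rec, sort_keys=True).encode()).hexdigest()
        day, time = rec["start"].split(" ")
        buckets[(day, time[:2])][digest] += 1

    result = defaultdict(dict)
    for (day, hour), hashes in sorted(buckets.items()):
        result[day][hour] = max(hashes.values())
    return dict(result)

# Made-up sample data: one clean hour, one hour with three identical rows.
record_21 = {"vm": "vm-a", "start": "2026-05-29 21:00:00",
             "end": "2026-05-29 21:59:59", "raw_usage": 1.0}
record_22 = {"vm": "vm-a", "start": "2026-05-29 22:00:00",
             "end": "2026-05-29 22:59:59", "raw_usage": 1.0}
records = [record_21] + [record_22] * 3

print(count_identical_records(records))  # {'2026-05-29': {'21': 1, '22': 3}}
```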
   
   ### versions
   
   * Hypervisor: `KVM`
   * Database: `10.11.14-MariaDB`
   * Apache CloudStack `4.22.1.0`
   * Server: `-0ubuntu0.24.04.1-log Ubuntu 24.04`
   
   ### The steps to reproduce the bug
   
   The exact trigger is currently unknown. In our environment, the issue can be 
observed with the following setup:
   
   1. Enable the Usage Server in Apache CloudStack.
   2. Configure:
      - `usage.stats.job.aggregation.range = 60`
      - `usage.stats.job.exec.time = 00:15`
   3. Allow the Usage Server to run normally for an extended period.
   4. Observe the `cloud_usage` table and Usage Server logs.
   5. Duplicate usage records begin appearing and continue to accumulate over 
time.
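
For step 4, duplicates can also be spotted with a grouped query on `cloud_usage`. The sketch below runs against an in-memory SQLite stand-in so that it is self-contained. The column names (`usage_id`, `usage_type`, `start_date`, `end_date`) are our assumption of the relevant subset of the real table, and the rows are made up. Against a real deployment the same query would be run on the usage database instead.

```python
import sqlite3

# Assumed subset of cloud_usage columns; the real table has many more.
DUPLICATE_QUERY = """
SELECT usage_id, usage_type, start_date, end_date, COUNT(*) AS copies
FROM cloud_usage
WHERE usage_type = 1
GROUP BY usage_id, usage_type, start_date, end_date
HAVING COUNT(*) > 1
ORDER BY start_date
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cloud_usage (id INTEGER PRIMARY KEY, usage_id INTEGER, "
    "usage_type INTEGER, start_date TEXT, end_date TEXT)"
)
# Made-up rows: one clean interval and one interval inserted three times.
rows = [
    (42, 1, "2026-05-29 19:00:00", "2026-05-29 19:59:59"),
    (42, 1, "2026-05-29 20:00:00", "2026-05-29 20:59:59"),
    (42, 1, "2026-05-29 20:00:00", "2026-05-29 20:59:59"),
    (42, 1, "2026-05-29 20:00:00", "2026-05-29 20:59:59"),
]
conn.executemany(
    "INSERT INTO cloud_usage (usage_id, usage_type, start_date, end_date) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

duplicates = conn.execute(DUPLICATE_QUERY).fetchall()
print(duplicates)  # [(42, 1, '2026-05-29 20:00:00', '2026-05-29 20:59:59', 3)]
```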
   
   ### What to do about it?
   
   One observation that might be relevant is a usage record with a start 
timestamp of 2026-05-29 20:00:14 (UTC). This is the only record that differs. 
All other duplicates are exactly the same (within VM, day and hour), and their 
start timestamps all fall on second "00". Without knowing the internals, it 
looks as if this one existing usage record for second "14" is now blocking the 
system. The server did create an entry for second "14". At the same time, it 
tried to create an entry for second "00", which failed with the constraint 
violation because it had the same ID as the entry for second "14". This is 
observable in the logs.
   
   From the following hour onwards, the system does create entries for all 
subsequent usage records, including the one for the timestamp in question. 
However, possibly because of this one "failed" entry, the system does not seem 
to recognize the subsequently created entries as valid/processed, and it 
recreates all of them every hour.
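
The hypothesis can be checked against the observed counts with a toy model. This is not the Usage Server's actual logic, only a sketch under the assumption that every hourly run re-inserts one row for each interval since a fixed start. The function name is ours, and the run count of 204 is chosen to match the numbers reported above.

```python
from collections import Counter

def simulate_reprocessing(runs):
    """Copies per hourly interval if every run re-inserts all intervals
    since a fixed start (interval 0 is the oldest)."""
    copies = Counter()
    for run in range(1, runs + 1):
        # Run number `run` sees intervals 0 .. run-1 and inserts one row for each.
        for interval in range(run):
            copies[interval] += 1
    return [copies[i] for i in range(runs)]

print(simulate_reprocessing(204)[:4])  # [204, 203, 202, 201]
```

The model reproduces the descending pattern: the oldest interval has been through every run, and each later interval has been through one run fewer.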


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
