mwa-sudo opened a new issue, #13399: URL: https://github.com/apache/cloudstack/issues/13399
### problem We are experiencing what appears to be the same issue described in #13112, however in our environment it affects `usage_type = 1` (default VM usage) instead of the usage type reported in the original issue. The Usage Server repeatedly reprocesses historical usage records from a fixed timestamp and continuously inserts duplicate records into the `cloud_usage` table. ### **observed behaviour** The Usage Server repeatedly processes the same historical records and generates duplicate rows in `cloud_usage`. * start date repeatedly reprocessed: `2026-05-29T22:00:00+0200` ### **relation to #13112** This issue appears very similar to #13112. Differences: * #13112 reported the problem for usage_type = `13` * We observe the same behaviour for usage_type = `1` * The issue was reported as fixed in a later CloudStack release by the original Poster, however we are running `4.22.1.0` which is newer than the reported version. Because the symptoms are effectively identical, it indicates that the issue still persists in newer versions. ### **evidence** Relevant Usage Server logs: During the period where duplicate usage records are generated, the Usage Server repeatedly processes historical time ranges and emits SQL integrity constraint violations. A log snippet is attached. VM and account names have been redacted from the logs for data protection purposes. [_usage.2026-05-28.log](https://github.com/user-attachments/files/28835661/_usage.2026-05-28.log) A representative example is shown below: `2026-05-29 23:15:00,005 INFO Parsing usage records between [2026-05-29T20:00:00+0000] and [2026-05-29T20:59:59+0000] ERROR: Duplicate entry '570-2026-05-29 20:50:40' for key 'id' java.sql.SQLIntegrityConstraintViolationException WARN: Failed to create usage event id: 1703940 type: VOLUME.CREATE due to Entity already exists` The exception indicates that the Usage Server is attempting to insert records that already exist. This appears to coincide with the repeated processing of historical usage intervals and the continuous growth of duplicate usage records described above. To quantify the issue, we wrote a Python script that hashes usage records and counts identical entries for a single virtual machine grouped by hour. The expected result is exactly one record per hourly interval (`count = 1`). This expectation is met until `2026-05-29 22:00`, after which duplicate records begin appearing. The duplication follows a distinct pattern: * `2026-05-29 22:00` → 204 identical records * `2026-05-29 23:00` → 203 identical records * `2026-05-30 00:00` → 202 identical records * `2026-05-30 01:00` → 201 identical records The pattern continues with each subsequent hourly bucket containing one fewer duplicate than the previous bucket. This suggests that historical usage intervals are being repeatedly regenerated or reprocessed by the Usage Server. While the total number of records continues to increase over time, the number of duplicates associated with each successive hourly interval decreases by one. Example output for a single VM: `{`\ ` "2026-05-29": {`\ ` "21": 1,`\ ` "22": 204,`\ ` "23": 203`\ ` },`\ ` "2026-05-30": {`\ ` "00": 202,`\ ` "01": 201,`\ ` "02": 200,`\ ` "03": 199`\ ` }`\ `}` ### versions * Hypervisor: `KVM` * Database: `10.11.14-MariaDB` * Apache CloudStack `4.22.1.0` * Server: `-0ubuntu0.24.04.1-log Ubuntu 24.04` ### The steps to reproduce the bug The exact trigger is currently unknown. In our environment, the issue can be observed with the following setup: 1. Enable the Usage Server in Apache CloudStack. 2. Configure: - `usage.stats.job.aggregation.range = 60` - `usage.stats.job.exec.time = 00:15` 3. Allow the Usage Server to run normally for an extended period. 4. Observe the `cloud_usage` table and Usage Server logs. 5. Duplicate usage records begin appearing and continue to accumulate over time. ### What to do about it? One observation that might be relevant, is a start-timestamp on 2026-05-29 at 20:00:14 (UTC). This is the only differing usage record. All other duplicates are exactly the same (within VM, day and hour) and their start-timestamps are all for second "00". Without knowing the actual contexts, it does look as if this one existing usage record for second "14" kind of congests the system now. Meaning the server did create an entry for second "14". At the same time it tried to create an entry for second "00". This failed due to the constraint-violation by having the same ID as the entry for second "14". This is observable in the logs. Starting the following hour the system is indeed creating entries for all subsequent usage records including the one for the timestamp in question. But somehow maybe because of this one "failed" entry, the system does not recognize all subsequently created entries as valid/processed and recreates all of them again every hour. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
