Hi everyone,

I'm seeing something strange with resource consumption on Toolforge. The
usage reported by the quota command is well above what my only running job
requests, and this prevents other jobs from starting. Where does the extra
consumption come from, and how can I clean it up?

More details below.

Any help appreciated,
NicoV

My tool account on Toolforge is wpcleaner.

Only one job is currently running:

tools.wpcleaner@tools-bastion-15:~$ toolforge jobs list
+------------------------+--------------------+------------------------------------------+
|       Job name:        |     Job type:      |                 Status:                  |
+------------------------+--------------------+------------------------------------------+
|  wpcleaner-cs-weekly   | schedule: @weekly  | Last schedule time: 2025-12-03T05:17:00Z |
|   wpcleaner-en-list    | schedule: @weekly  | Unable to start, out of quota for memory |
|    wpcleaner-fr-dab    | schedule: @weekly  | Unable to start, out of quota for memory |
|   wpcleaner-fr-daily   |  schedule: @daily  | Last schedule time: 2025-12-07T03:58:00Z |
|   wpcleaner-fr-list    | schedule: @weekly  |           Running for 2d4h10m            |
|  wpcleaner-fr-weekly   | schedule: @weekly  | Last schedule time: 2025-12-04T05:03:00Z |
|  wpcleaner-meta-list   | schedule: @monthly | Last schedule time: 2025-12-04T04:25:00Z |
| wpcleaner-meta-monthly | schedule: @monthly | Last schedule time: 2025-11-18T06:05:00Z |
+------------------------+--------------------+------------------------------------------+

The currently running job requests 3Gi of memory and 1 CPU:

tools.wpcleaner@tools-bastion-15:~$ toolforge jobs show wpcleaner-fr-list
+---------------+-----------------------------------------------------------------+
| Job name:     | wpcleaner-fr-list                                               |
+---------------+-----------------------------------------------------------------+
| Command:      | /data/project/wpcleaner/tools/scripts/fr_ListCheckWiki_List.sh  |
+---------------+-----------------------------------------------------------------+
| Job type:     | schedule: @weekly                                               |
+---------------+-----------------------------------------------------------------+
| Image:        | jdk17                                                           |
+---------------+-----------------------------------------------------------------+
| Port:         | none                                                            |
+---------------+-----------------------------------------------------------------+
| File log:     | no                                                              |
+---------------+-----------------------------------------------------------------+
| Output log:   |                                                                 |
+---------------+-----------------------------------------------------------------+
| Error log:    |                                                                 |
+---------------+-----------------------------------------------------------------+
| Emails:       | onfailure                                                       |
+---------------+-----------------------------------------------------------------+
| Resources:    | mem: 3.0Gi, cpu: 1.0                                            |
+---------------+-----------------------------------------------------------------+
| Replicas:     |                                                                 |
+---------------+-----------------------------------------------------------------+
| Mounts:       | none                                                            |
+---------------+-----------------------------------------------------------------+
| Retry:        | no                                                              |
+---------------+-----------------------------------------------------------------+
| Timeout:      | no                                                              |
+---------------+-----------------------------------------------------------------+
| Health check: | none                                                            |
+---------------+-----------------------------------------------------------------+
| Status:       | Running for 2d4h11m                                             |
+---------------+-----------------------------------------------------------------+
| Hints:        | Last run at 2025-12-03T20:36:14Z. Pod in 'Running' phase. State |
|               | 'running'. Started at '2025-12-03T20:36:15Z'.                   |
+---------------+-----------------------------------------------------------------+

But the quota command says I'm using 2.5 CPU and 6.5Gi of memory, which is a
lot more than what the only running job requests. It also reports 3 running
pods and 4 running one-off and cron jobs.

tools.wpcleaner@tools-bastion-15:~$ toolforge jobs quota
Running jobs                                  Used    Limit
--------------------------------------------  ------  -------
Total running jobs at once (Kubernetes pods)  3       16
Running one-off and cron jobs                 4       15
CPU                                           2.5     16.0
Memory                                        6.5Gi   8.0Gi

Per-job limits    Used    Limit
----------------  ------  -------
CPU                       3.0
Memory                    6.0Gi

Job definitions                             Used    Limit
----------------------------------------  ------  -------
Cron jobs                                      8       50
Continuous jobs (including web services)       1       16
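
To put numbers on the gap, here is a rough sketch in Python that uses only
the figures from the outputs above. The helper name `gib` and the
dictionaries are made up for illustration and are not part of any Toolforge
tooling. It only handles the "Gi" suffix seen in this output.

```python
# Figures copied from `toolforge jobs show` and `toolforge jobs quota` above.
# All names here are hypothetical and exist only for this illustration.

def gib(quantity: str) -> float:
    """Convert a quantity such as '6.5Gi' to GiB (only the Gi suffix is handled)."""
    if not quantity.endswith("Gi"):
        raise ValueError(f"unsupported unit in {quantity!r}")
    return float(quantity[:-2])

running_job = {"mem": gib("3.0Gi"), "cpu": 1.0}   # wpcleaner-fr-list
quota_used = {"mem": gib("6.5Gi"), "cpu": 2.5}    # "Used" column
quota_limit = {"mem": gib("8.0Gi"), "cpu": 16.0}  # "Limit" column

# Usage reported by the quota that the single running job does not explain.
unexplained = {k: quota_used[k] - running_job[k] for k in running_job}
# What is left before hitting the limits.
free = {k: quota_limit[k] - quota_used[k] for k in quota_used}

print(f"Unexplained usage: {unexplained['mem']:.1f}Gi memory, {unexplained['cpu']:.1f} CPU")
print(f"Free quota:        {free['mem']:.1f}Gi memory, {free['cpu']:.1f} CPU")
```

So 3.5Gi of memory and 1.5 CPU are counted against the quota without a
matching running job in the list, and only 1.5Gi of memory is left. I assume
that is why the blocked jobs cannot start, although their definitions are
not shown here.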