** Description changed:

- After upgrading to Caracal, we noticed the duration of GET calls to
- nova-api is increasing over time, and same for the memory usage of nova-
- api. We first noticed that in telegraf metrics, to validate that, I
- created a brand new cluster of VMs without telegraf, with only one
- headnode running nova-api, and have multiple nodes sending GET request
- to that and monitor the duration.
+ [ Impact ]
  
- Script to send requests:
- # --- Get a fresh token (requires openrc sourced first) ---
- get_token() {
-   openstack token issue -f value -c id
- }
- OS_TOKEN=$(get_token)
- echo "Using token: $OS_TOKEN"
+ * A resource leak bug in python-attrs 21.2.0 (<https://github.com/python-attrs/attrs/issues/826>)
+   caused performance issues when an application created many classes with the same name.
  
- # --- Send requests at 10 per second ---
- COUNT=0
- while true; do
-   COUNT=$((COUNT+1))
-   STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
-     -H "X-Auth-Token: $OS_TOKEN" \
-     -H "Accept: application/json" \
-     "$NOVA_URL/servers/detail")
+ * This caused a performance regression in nova through its dependency on python-jsonschema
+   * The latency of API calls to nova increases over time
  
-   echo "$(date +'%F %T') [$COUNT] HTTP $STATUS"
+ [ Test Plan ]
  
-   if [ "$STATUS" = "401" ]; then
-     echo "[$(date)] Got 401 → refreshing token..."
-     OS_TOKEN=$(get_token)
-     continue   # retry next loop with fresh token
-   fi
+  To reproduce the issue and confirm the fix:
  
-   sleep 0.1   # 0.1 sec → 10 per second
- done
+ * Create a minimal OpenStack deployment, using e.g. devstack
+ (<https://docs.openstack.org/devstack/latest/>).
  
- script to monitor the duration (avg per 5 minutes)
- grep 'servers/detail' /var/log/nova/nova-api.log | awk '
-     # Example line:
-   # 2025-08-21 17:27:08.859 ... "GET /v2.1/os-quota-sets/..." ... time: 0.6598654
-   match($0, /^([0-9-]+) ([0-9]{2}):([0-9]{2}):([0-9]{2})(\.[0-9]+)?.* time: ([0-9.]+)/, m) {
-       ymd = m[1]; hh = m[2]; mm = m[3]; dur = m[6]
-       bmin = int(mm/5)*5                           # floor minute to 5-min bucket
-       key = sprintf("%s %s:%02d", ymd, hh, bmin)   # e.g., 2025-08-21 17:25
-       sum[key] += dur; cnt[key]++
-   }
-   END {
-       for (k in sum) printf "%s,%.3f\n", k, sum[k]/cnt[k] | "sort"
-   }'
+ * Run a script which makes many calls to the nova API
  
- I use systemctl status to track the memory usage, it increased about
- 500MB during a weekend (I'm testing on a small cluster). The duration of
- the GET request also showed obvious increment, and seems no restriction
- limit.
+  ---Script to call API---
+  #!/bin/bash
+  # --- Get a fresh token (requires openrc sourced first) ---
+  get_token() {
+      openstack token issue -f value -c id
+  }
  
- Wondering if it is a memory leak thing, but want to get confirmation
- from team. Thanks.
+  OS_TOKEN=$(get_token)
+  echo "Using token: $OS_TOKEN"
+ 
+  # --- Send requests in a tight loop (enable the sleep below for ~10 per second) ---
+  COUNT=0
+  while true; do
+      COUNT=$((COUNT + 1))
+      STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
+          -H "X-Auth-Token: $OS_TOKEN" \
+          -H "Accept: application/json" \
+          "$NOVA_URL/servers/detail")
+ 
+      echo "$(date +'%F %T') [$COUNT] HTTP $STATUS"
+ 
+      if [ "$STATUS" = "401" ]; then
+          echo "[$(date)] Got 401 → refreshing token..."
+          OS_TOKEN=$(get_token)
+          continue # retry next loop with fresh token
+      fi
+      # sleep 0.1 # 0.1 sec → 10 per second
+  done
+  ---/Script to call API---
+ 
+ * Simultaneously monitor the response time of API calls using the
+ openstack CLI
+ 
+  ---Script to monitor response time---
+  #!/bin/bash
+ 
+  # --- Again openrc must be sourced first ---
+  os_quota() {
+      openstack quota list --compute --timing --format=json
+  }
+ 
+  headers() {
+      os_quota | grep -o -P '/(\w|\d){32}(/d)?' | tr '\n' ','
+  }
+ 
+  time_api() {
+      os_quota |
+          awk 'BEGIN { FS=","; ORS=","} /quota/ {print $2} END { printf "\n" }'
+  }
+ 
+  OUTPUT="api_calls_$(date +%Y-%m-%dT%H_%M_%S)"
+ 
+  echo -n "# time," >>"$OUTPUT"
+  headers >>"$OUTPUT"
+  echo >>"$OUTPUT"
+ 
+  i=0
+  while true; do
+      echo -n "$(date +'%F %T')," >>"$OUTPUT"
+      time_api >>"$OUTPUT"
+      echo -n "."
+      i=$((i + 1)) # count progress dots so the line below can wrap
+      if ((i > 80)); then
+          i=0
+          echo
+      fi
+      sleep 5
+  done
+ 
+  ---/Script to monitor response time---
+ 
+ * Observe that the response time increases over time with python-attrs 21.2.0
+   * I observed this over a period of about 20 hours.
+ 
+ * Log in to the server hosting the nova API and upgrade python-attrs to
+ 21.4.0
+ 
+ * Repeat the above steps and observe that the response time remains
+ stable over time.
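
One way to judge "increases" versus "remains stable" from the monitoring output is to compare the mean of the first few samples with the mean of the last few. The following is only a sketch under assumptions: `summarize_drift` is a hypothetical helper, and it assumes that header lines start with `#`, that field 1 is the timestamp, and that field 2 holds a duration in seconds. Adjust the field number to whatever the monitoring script actually records.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original test plan).
# Usage: summarize_drift FILE [WINDOW]
# Assumes a comma-separated file where '#' lines are headers, field 1 is a
# timestamp and field 2 is a duration in seconds.
summarize_drift() {
    local file="$1" window="${2:-10}"
    awk -F',' -v w="$window" '
        /^#/ || NF < 2 || $2 == "" { next }   # skip headers and incomplete rows
        { v[++n] = $2 + 0 }                   # collect durations in file order
        END {
            if (n == 0) { print "no data"; exit 1 }
            if (w > n) w = n                  # clamp the window to the data size
            for (i = 1; i <= w; i++)         first += v[i]
            for (i = n - w + 1; i <= n; i++) last += v[i]
            printf "first=%.3f last=%.3f ratio=%.2f\n",
                first / w, last / w, (first > 0 ? last / first : 0)
        }' "$file"
}
```

For example, `summarize_drift api_calls_<timestamp> 20` prints the mean of the first and last 20 samples; a ratio well above 1 would suggest growing latency, while a ratio near 1 would suggest a stable response time.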
+ 
+ [ Where problems could occur ]
+ 
+ * Other packages depend on python-attrs as a library and so could be
+ affected.
+ 
+ * From 21.2.0 to 21.4.0 there was one backward-incompatible change which could potentially cause issues:
+   * <https://github.com/python-attrs/attrs/blob/21.4.0/CHANGELOG.rst>
+     """
+     When using @define, converters are now run by default when setting an attribute on an instance -- additionally to validators. I.e. the new default is on_setattr=[attrs.setters.convert, attrs.setters.validate].
+ 
+     This is unfortunately a breaking change, but it was an oversight, impossible to raise a DeprecationWarning about, and it's better to fix it now while the APIs are very fresh with few users.
+     """
+   * Since no packages pin python-attrs to a version that excludes this one, it is unlikely to break other packages.
+   * Some users could be using the python3-attr package in their own code, but it is much more likely that
+     user code installs attrs via pip, either directly or within a virtualenv.
+ 
+ * The version in noble is 23.2.0, which already includes all these
+ changes, and there don't seem to be any associated issues.
+ 
+ [ Other Info ]
+ 
+ * This SRU proposal bumps to 21.4.0 instead of 21.3.0 because 21.4.0 is a bug fix
+   release for a regression in 21.3.0.
+ 
+ * This is the first Ubuntu-specific modification to this package, but it
+ is a fairly minor one and doesn't need to be carried forward to any
+ other series.
+ 
+ * A possible lighter-weight alternative is to simply cherry-pick the bugfix commit
+   <https://github.com/python-attrs/attrs/commit/38580632ceac1cd6e477db71e1d190a4130beed4>
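
For the upgrade step in the test plan, it may help to confirm which attrs version the system Python on the nova API host actually sees. This is a minimal sketch, not part of the SRU itself: `version_ge` and `check_attrs` are hypothetical helpers, and it assumes GNU `sort -V`, a `python3` with `importlib.metadata` (Python 3.8 or later), and that the distribution metadata is registered under the name `attrs`.

```shell
#!/bin/bash
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
# Relies on GNU sort's version ordering (-V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: report the attrs version visible to the system Python.
# Assumes python3 is installed and the attrs metadata can be found.
check_attrs() {
    local installed
    installed=$(python3 -c 'from importlib.metadata import version; print(version("attrs"))') || return 2
    if version_ge "$installed" "21.4.0"; then
        echo "attrs $installed: at or above 21.4.0, the version proposed here"
    else
        echo "attrs $installed: older than 21.4.0"
    fi
}
```

Running `check_attrs` before and after the upgrade shows whether the package change took effect. Already-running nova-api processes will most likely need a restart before they load the upgraded library.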

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2121607

Title:
  Nova-api showing latency after upgrading to Caracal
