venkateshwaracholan opened a new issue, #4601:
URL: https://github.com/apache/polaris/issues/4601

   ### Describe the bug
   
   ### Summary
   
   Helm CI applies the fixture databases and then runs `ct install` immediately,
   whereas the local integration workflow and the developer documentation first
   wait for the PostgreSQL and MongoDB pods to become Ready.
   
   This makes CI inconsistent with the other workflows and opens a potential
   readiness race: `ct install` can start before the databases accept
   connections.
   
   ### Evidence
   
   * CI (`.github/workflows/ci.yml`) applies fixtures and proceeds directly to 
`ct install` without `kubectl wait`.
   * Local workflows (`make helm-integration-test` via `helm-fixtures` in 
`Makefile`) wait for PostgreSQL and MongoDB readiness before running chart 
tests.
   * Developer documentation uses the same waits 
(`site/content/in-dev/unreleased/helm-chart/dev.md`).
   * `ct install` runs the CI scenarios in lexicographic order:
     `nosql-persistence-values.yaml` (MongoDB) is 4th and
     `persistence-values.yaml` (PostgreSQL) is 5th. The earlier scenarios use
     the default in-memory persistence.
   * In a local kind cluster, immediately after `kubectl apply`:
   
     * MongoDB pod was `0/1` Ready
     * Service endpoints were empty
     * Connections to `mongodb:27017` returned `connection refused`
   * After `kubectl wait`, the pod became Ready and endpoints were populated.
   
   This demonstrates a real readiness window between fixture creation and 
database availability.
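
   The observations above could be captured with a small helper along these
   lines. This is a sketch only: `fixture_snapshot` and the `KUBECTL` override
   are hypothetical names, the namespace and label selector are borrowed from
   the wait commands proposed in this issue, and it assumes each fixture's
   Service is named like the pod label value (as `mongodb:27017` suggests).

   ```bash
   # Sketch: show pod readiness and Service endpoints for one fixture database.
   # Assumptions: namespace and label follow the wait commands in this issue;
   # the Service is assumed to be named like the label value (e.g. "mongodb").
   KUBECTL="${KUBECTL:-kubectl}"          # hypothetical override for the binary
   NAMESPACE="${NAMESPACE:-polaris-ns}"

   fixture_snapshot() {
     local name="$1"
     echo "== ${name}: pods =="
     "$KUBECTL" get pods --namespace "$NAMESPACE" \
       --selector="app.kubernetes.io/name=${name}"
     echo "== ${name}: endpoints =="
     "$KUBECTL" get endpoints "$name" --namespace "$NAMESPACE"
   }
   ```

   Running `fixture_snapshot mongodb` right after `kubectl apply` and again
   after `kubectl wait` should show the pod moving from `0/1` to Ready and the
   endpoints list filling in.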
   
   ### Potential improvement
   
   Align CI with Makefile/dev workflows by waiting for fixture databases before 
running `ct install`:
   
   ```bash
   kubectl wait --namespace polaris-ns --for=condition=ready pod \
     --selector=app.kubernetes.io/name=postgres --timeout=120s
   
   kubectl wait --namespace polaris-ns --for=condition=ready pod \
     --selector=app.kubernetes.io/name=mongodb --timeout=120s
   ```
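
   In a CI step, the two waits could be wrapped so that a timeout also prints
   diagnostics before the job fails. The following is a minimal sketch, not the
   project's actual workflow: `wait_for_fixture`, `wait_for_fixtures`, and the
   `KUBECTL`, `NAMESPACE`, and `TIMEOUT` variables are hypothetical names; the
   namespace, selectors, and timeout are copied from the commands above.

   ```bash
   # Sketch: wait for each fixture database; on timeout, dump pod diagnostics.
   # All function and variable names here are illustrative assumptions.
   KUBECTL="${KUBECTL:-kubectl}"
   NAMESPACE="${NAMESPACE:-polaris-ns}"
   TIMEOUT="${TIMEOUT:-120s}"

   wait_for_fixture() {
     local name="$1"
     local selector="app.kubernetes.io/name=${name}"
     if "$KUBECTL" wait --namespace "$NAMESPACE" --for=condition=ready pod \
         --selector="$selector" --timeout="$TIMEOUT"; then
       echo "fixture ${name}: ready"
       return 0
     fi
     # The wait failed or timed out: print what we know, then report failure.
     echo "fixture ${name}: not ready after ${TIMEOUT}" >&2
     "$KUBECTL" get pods --namespace "$NAMESPACE" \
       --selector="$selector" -o wide >&2 || true
     "$KUBECTL" describe pods --namespace "$NAMESPACE" \
       --selector="$selector" >&2 || true
     return 1
   }

   # Wait for several fixtures in order, stopping at the first failure.
   wait_for_fixtures() {
     local name
     for name in "$@"; do
       wait_for_fixture "$name" || return 1
     done
   }
   ```

   A CI step would then call `wait_for_fixtures postgres mongodb` between
   applying the fixtures and running `ct install`.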
   
   ### Notes
   
   A recent Helm CI failure on #4580 showed MongoDB connection errors consistent
   with this timing window. However, I have not confirmed that the failure was
   caused by this readiness gap.
   
   ### Related
   
   * #4580
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
