synergiator opened a new issue, #288:
URL: https://github.com/apache/cloudstack-terraform-provider/issues/288

   Hello everybody,
   
   while experimenting with agentic coding in infrastructure context, the agent 
raised concern about a possible root cause which has been addressed in some 
0.6.0 fixes but as it seems is itself still there.
   
   Please have a look - does this make sense for those with deeper 
understanding of this provider? (Opus 4.6R is known to have quite comprehensive 
context understanding).
   
   ------
   
   # Firewall-family Read functions destroy state via `SetId("")` — root cause 
behind #115, #194, #241, #278
   
   ## Summary
   
   The Read functions in `cloudstack_firewall`, `cloudstack_egress_firewall`, 
and
   `cloudstack_port_forward` contain a structural defect: when the CloudStack 
API
   listing returns no matching rules, the Read function calls `d.SetId("")` —
   destroying the resource from Terraform state. This is the **root cause** 
behind
   four separate issues (#115, #194, #241, #278), each of which was diagnosed 
and
   patched as a project-context problem rather than the underlying logic error.
   
   ## The defect
   
   All three resources share this pattern in their Read function:
   
   ```go
   // resource_cloudstack_firewall.go:388-392
   if rules.Len() > 0 {
       d.Set("rule", rules)
   } else if !managed {
       d.SetId("")        // ← destroys the resource from state
   }
   ```
   
   The logic is: "if I found zero matching rules, the resource no longer 
exists."
   This conflates two distinct situations:
   
   | Situation | Correct action |
   |-----------|---------------|
   | The IP address / network was deleted | `d.SetId("")` — resource is gone |
   | The rules exist but the API listing didn't return them | Preserve state — 
resource is fine |
   
   The second case happens whenever `listFirewallRules` (or equivalent) can't 
see
   the rules. Every reported instance of this bug was caused by an API 
visibility
   problem, not an actually-deleted resource.
   
   ## How the secondary issues trace back to this root cause
   
   ### #115 / #194 — Firewall rules in projects (closed, PR #198)
   
   **Trigger**: Rules created in a project context. The Read function called
   `listFirewallRules` without passing `projectid`, so the API returned zero
   rules (they're scoped to the project).
   
   **Fix applied** (commit `edb0901`): Added a `project` schema field and
   `setProjectid()` call to the Read function so the listing includes the 
project
   context.
   
   **Why it's a symptom fix**: PR #198 fixed *one specific reason* the API 
returns
   zero rules. The `SetId("")` fallback remained untouched. Any other reason the
   listing returns empty (see below) still triggers the same destruction.
   
   ### #241 — Same error persists in v0.6.0-rc2 (open)
   
   **Trigger**: Egress firewall without an *explicit* `project` attribute. Even
   after PR #198 added the `project` field, it's `Optional` — if the user 
doesn't
   set it, `setProjectid()` is a no-op, and the listing still returns zero rules
   for project-scoped networks.
   
   **Why it proves the root cause is unfixed**: The PR #198 fix requires the 
user
   to explicitly specify `project` on every firewall resource. When they don't, 
the
   same empty-listing → `SetId("")` path fires. The root cause — treating "empty
   listing" as "resource deleted" — was never addressed.
   
   ### #278 — Port forward "always returns error" (open, PR #280 in progress)
   
   **Trigger**: Port forward rules in a project. Same mechanism: Read calls
   `listPortForwardingRules` without project context, gets zero results, calls
   `SetId("")`.
   
   **Fix in progress** (PR #280): Implements automatic project inheritance 
(inherit
   from IP address) and a fallback lookup with `projectid=-1`. PR #280 also
   refactors the `SetId("")` logic in port_forward, but the same pattern remains
   in `cloudstack_firewall` and `cloudstack_egress_firewall`.
   
   **Pattern**: Each new trigger produces a new issue. Each fix addresses that
   specific trigger. The `SetId("")` bomb stays armed.
   
   ## The chain of events on every occurrence
   
   ```
   Create           succeeds → rule exists in CloudStack
     ↓
   Read             calls listFirewallRules(ipaddressid=X)
     ↓
   API returns 0    (missing project, source NAT filter, eventual consistency, 
…)
     ↓
   rules.Len() == 0
     ↓
   d.SetId("")      resource destroyed from state
     ↓
   Terraform        "Root object was present, but now absent"
     ↓
   Next apply       tries to re-create → "conflicts with existing rule"
   ```
   
   ## Locations
   
   | Resource | File | Line | Pattern |
   |----------|------|------|---------|
   | `cloudstack_firewall` | `resource_cloudstack_firewall.go` | 388-392 | 
`rules.Len() == 0 && !managed → SetId("")` |
   | `cloudstack_egress_firewall` | `resource_cloudstack_egress_firewall.go` | 
425-430 | Same |
   | `cloudstack_port_forward` | `resource_cloudstack_port_forward.go` | 
355-359 | `forwards.Len() == 0 && !managed → SetId("")` |
   
   Note: `cloudstack_port_forward` already validates the IP address exists at 
the
   top of its Read (lines 237-250). It confirms the IP is present, then proceeds
   to destroy the resource at line 358 because the forwards listing is empty.
   This is self-contradictory.
   
   ## Proposed fix
   
   Replace the "empty set → destroy" logic with a parent-object existence check:
   
   **For `cloudstack_firewall`** — before calling `SetId("")`, verify the IP
   address still exists:
   
   ```go
   } else if !managed {
       _, count, err := cs.Address.GetPublicIpAddressByID(d.Id(),
           cloudstack.WithProject(d.Get("project").(string)))
       if err != nil && count == 0 {
           log.Printf("[DEBUG] IP %s no longer exists, removing from state", 
d.Id())
           d.SetId("")
       } else {
           log.Printf("[WARN] No firewall rules found for IP %s but IP exists", 
d.Id())
           d.Set("rule", rules) // preserve empty set in state
       }
   }
   ```
   
   **For `cloudstack_egress_firewall`** — verify the network still exists.
   
   **For `cloudstack_port_forward`** — remove the `SetId("")` at line 358
   entirely; the IP existence check at lines 237-250 already handles the
   "parent deleted" case.
   
   This eliminates the class of bug rather than patching individual triggers.
   Future API visibility issues (new project scoping, RBAC changes, API
   versioning) would be handled correctly without new code.
   
   ## Proof
   
   The structural defect is demonstrable via unit test without a CloudStack
   instance. See `cloudstack/gotcha_issues_test.go`:
   
   - `TestFirewallRead_clearsIdWhenRulesEmpty` — proves the logic path exists
   - `TestFirewallRead_ruleMatchRequiresUuidInMap` — proves unmatched UUIDs 
drop rules
   - `TestEgressFirewallRead_sameSetIdBugPattern` — proves egress has same 
defect
   
   ## Environment
   
   - Provider: `main` branch (includes v0.6.0 release code)
   - Terraform: 1.11+
   - CloudStack: 4.19+
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to