vassilvk opened a new issue, #828:
URL: https://github.com/apache/solr-operator/issues/828

   ## Problem
   
   When a `SolrCloud` CR is deleted and its `providedConfigMap` has already 
been removed from the cluster (e.g., by Helm during an upgrade or uninstall), 
the reconciler enters an infinite requeue loop and the SolrCloud CR can never 
be deleted.
   
   The root cause is in 
[`controllers/solrcloud_controller.go`](https://github.com/apache/solr-operator/blob/e81458cf9a33f346c69613a7fcb464161f0d525c/controllers/solrcloud_controller.go#L212-L219):
   
   ```go
   if instance.Spec.CustomSolrKubeOptions.ConfigMapOptions != nil && instance.Spec.CustomSolrKubeOptions.ConfigMapOptions.ProvidedConfigMap != "" {
       providedConfigMapName := instance.Spec.CustomSolrKubeOptions.ConfigMapOptions.ProvidedConfigMap
       foundConfigMap := &corev1.ConfigMap{}
       nn := types.NamespacedName{Name: providedConfigMapName, Namespace: instance.Namespace}
       err = r.Get(ctx, nn, foundConfigMap)
       if err != nil {
           return requeueOrNot, err // if they passed a providedConfigMap name, then it must exist
       }
   ```
   
   The comment says *"it must exist"*, but this assumption doesn't hold during 
deletion. The `Reconcile()` function has no early exit when `DeletionTimestamp` 
is set - it runs through ZooKeeper reconciliation, Service creation, node 
services, headless service, and then hits the ConfigMap lookup **before** it 
can ever reach the storage finalizer logic at line ~448. If the ConfigMap is 
gone, the error causes an immediate return and requeue, so the finalizer is 
never removed and the CR is stuck.
   
   ## How to reproduce
   
   1. Create a `SolrCloud` CR with 
`spec.customSolrKubeOptions.configMapOptions.providedConfigMap` pointing to a 
ConfigMap
   2. Delete the ConfigMap
   3. Delete the `SolrCloud` CR
   4. Observe: the CR gets a `deletionTimestamp` but is never finalized; the 
operator logs show a recurring `NotFound` error for the ConfigMap on every 
reconciliation cycle
   
   This also occurs in Helm-managed deployments: when a Helm upgrade removes 
SolrCloud workloads from values, Helm deletes the ConfigMap (no longer 
rendered) while the SolrCloud CR still exists, and the operator cannot complete 
the CR's deletion.
   
   ## Affected versions
   
   Confirmed in the controller code as of v0.9.1 (latest stable). The code path 
is unchanged from v0.7.0 through v0.9.1.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

