com98 commented on issue #13330:
URL: https://github.com/apache/druid/issues/13330#issuecomment-2804834309

I'm having the exact same issue with the latest version of Druid (v32.0.1). I have a Kafka job running, and after the job completes successfully the MiddleManager disappears.

As [Sh1ftry](https://github.com/Sh1ftry) pointed out previously, the Kubernetes extension seems to remove the Kubernetes labels from the MiddleManager after the peon completes.

Before:

```
Labels:       app=druid
              apps.kubernetes.io/pod-index=0
              component=middleManager
              controller-revision-hash=druid-cluster-middlemanagers-7fcff47567
              druidDiscoveryAnnouncement-cluster-identifier=cluster
              druidDiscoveryAnnouncement-id-hash=837274055
              druidDiscoveryAnnouncement-middleManager=true
              druid_cr=cluster
              nodeSpecUniqueStr=druid-cluster-middlemanagers
              statefulset.kubernetes.io/pod-name=druid-cluster-middlemanagers-0
Annotations:  druidNodeInfo-middleManager: {"druidNode":{"service":"druid/middleManager","host":"10.0.164.21","bindOnHost":false,"plaintextPort":8088,"port":-1,"tlsPort":-1,"enableP...
```

After the Kafka job completes successfully (the `druidDiscoveryAnnouncement-cluster-identifier` and `druidDiscoveryAnnouncement-id-hash` labels are gone):

```
Labels:       app=druid
              apps.kubernetes.io/pod-index=0
              component=middleManager
              controller-revision-hash=druid-cluster-middlemanagers-7fcff47567
              druidDiscoveryAnnouncement-middleManager=true
              druid_cr=cluster
              nodeSpecUniqueStr=druid-cluster-middlemanagers
              statefulset.kubernetes.io/pod-name=druid-cluster-middlemanagers-0
Annotations:  druidNodeInfo-middleManager: {"druidNode":{"service":"druid/middleManager","host":"10.0.154.9","bindOnHost":false,"plaintextPort":8088,"port":-1,"tlsPort":-1,"enablePl...
```

As a result, the rest of the cluster can no longer find the MiddleManager:

```
2025-04-15T11:56:43,927 ERROR [org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatchermiddleManager] org.apache.druid.discovery.BaseNodeRoleWatcher - Noticed disappearance of unknown druid node [http://10.0.154.9:8088] of role [middleManager].
2025-04-15T11:57:22,264 INFO [org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatchermiddleManager] org.apache.druid.discovery.BaseNodeRoleWatcher - Node [http://10.0.164.21:8088] of role [middleManager] detected.
```

The labels seem to be removed by the `K8sDruidNodeAnnouncer` class when the peon unannounces itself:

```
2025-04-15T11:52:25,333 INFO [task-runner-0-priority-0] org.apache.druid.k8s.discovery.K8sDruidNodeAnnouncer - Unannouncing DiscoveryDruidNode[DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/middleManager', host='10.0.154.9', bindOnHost=false, port=-1, plaintextPort=8100, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeRole='PEON', services={dataNodeService=DataNodeService{tier='_default_tier', maxSize=0, serverType=indexer-executor, priority=0}, lookupNodeService=LookupNodeService{lookupTier='__default'}}', startTime=2025-04-15T11:51:24.122Z}]
```

Is a fix planned for this, or is there any configuration that can mitigate the issue? As pointed out previously, this still makes the whole Kubernetes extension effectively unusable.

Thanks a lot! :)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
