[ 
https://issues.apache.org/jira/browse/KAFKA-20295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18084111#comment-18084111
 ] 

Roland Sommer edited comment on KAFKA-20295 at 5/28/26 1:24 PM:
----------------------------------------------------------------

I just checked whether this has been fixed in 4.3.0, but Kafka still insists 
that the non-existent controller(s) are part of the cluster:
{code:java}
~$ /opt/kafka/bin/kafka-features.sh --bootstrap-server localhost:9092 upgrade --release-version 4.3
Could not upgrade eligible.leader.replicas.version to 1. The update failed for all features since the following feature had an error: Invalid update version 30 for feature metadata.version. Controller 351 only supports versions 7-27
Could not upgrade group.version to 1. The update failed for all features since the following feature had an error: Invalid update version 30 for feature metadata.version. Controller 351 only supports versions 7-27
Could not upgrade kraft.version to 1. The update failed for all features since the following feature had an error: Invalid update version 30 for feature metadata.version. Controller 351 only supports versions 7-27
Could not upgrade metadata.version to 30. The update failed for all features since the following feature had an error: Invalid update version 30 for feature metadata.version. Controller 351 only supports versions 7-27
Could not upgrade share.version to 1. The update failed for all features since the following feature had an error: Invalid update version 30 for feature metadata.version. Controller 351 only supports versions 7-27
Could not upgrade streams.version to 1. The update failed for all features since the following feature had an error: Invalid update version 30 for feature metadata.version. Controller 351 only supports versions 7-27
Could not upgrade transaction.version to 2. The update failed for all features since the following feature had an error: Invalid update version 30 for feature metadata.version. Controller 351 only supports versions 7-27
7 out of 7 operation(s) failed. {code}
The mentioned controller 351 is not part of the cluster:
{code:java}
~$ /opt/kafka/bin/kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --replication
NodeId    DirectoryId               LogEndOffset    Lag    LastFetchTimestamp    LastCaughtUpTimestamp    Status
158       2gsvOvnT7urpZcA_-LUy5w    210340245       0      1779974359863         1779974359863            Leader
611       27Ii-xdAZ7ReQBLsvvJb0A    210340245       0      1779974359368         1779974359368            Follower
206       Q7X9o3XbKxk_3tz4T8torg    210340245       0      1779974359369         1779974359369            Follower
226       7n6aedUEuytkqhBnbe7ESw    210340245       0      1779974359368         1779974359368            Observer
181       tZ17VQ8cYpf7R-LyAQWf2w    210340245       0      1779974359368         1779974359368            Observer
299       P4qXt3K0G5Qg_7w_UdvaNA    210340245       0      1779974359368         1779974359368            Observer
290       bA0pqZFsUa45lRTB6bS4bg    210340245       0      1779974359368         1779974359368            Observer
293       Av_12222lURKVYVt-aNKOQ    210340245       0      1779974359368         1779974359368            Observer
485       glENIgkIng1MYDF8HxxoDQ    210340245       0      1779974359368         1779974359368            Observer {code}
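
For anyone wanting to cross-check the same mismatch on their own cluster, here is a minimal Python sketch. It only compares text already shown in this report: one error line from kafka-features.sh, the NodeId and Status columns of the describe output above (one row per line), and the {{ls image/cluster/controllers/}} listing from the issue description. The function names are hypothetical helpers, not part of any Kafka tool, and the script only reports stale IDs; it does not remove anything.

```python
import re

# One of the seven error lines printed by kafka-features.sh in this report.
ERROR_LINE = (
    "Could not upgrade metadata.version to 30. The update failed for all "
    "features since the following feature had an error: Invalid update "
    "version 30 for feature metadata.version. Controller 351 only supports "
    "versions 7-27"
)

# NodeId and Status columns copied from the describe --replication output.
# Assumption: one row per line (unwrapped), NodeId in the first column.
QUORUM_ROWS = """\
158 Leader
611 Follower
206 Follower
226 Observer
181 Observer
299 Observer
290 Observer
293 Observer
485 Observer
"""

# Listing from `ls image/cluster/controllers/` in the issue description.
IMAGE_CONTROLLERS = "158 206 351 584 611 686"


def blocking_controller(error_line):
    """Return (controller_id, min_version, max_version) or None if no match."""
    m = re.search(
        r"Controller (\d+) only supports versions (\d+)-(\d+)", error_line
    )
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())


def quorum_node_ids(rows):
    """Collect the numeric first column of every non-empty row."""
    ids = set()
    for line in rows.splitlines():
        fields = line.split()
        if fields and fields[0].isdigit():
            ids.add(int(fields[0]))
    return ids


def stale_controllers(image_listing, rows):
    """IDs registered in the metadata image but absent from the quorum view."""
    registered = {int(tok) for tok in image_listing.split()}
    return sorted(registered - quorum_node_ids(rows))


if __name__ == "__main__":
    print(blocking_controller(ERROR_LINE))
    print(stale_controllers(IMAGE_CONTROLLERS, QUORUM_ROWS))
```

With the values from this report, the stale set is 351, 584 and 686, and 351 is the controller named in the error.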



> Removed controllers still in metadata, blocking finalizing upgrade to 4.2.0
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-20295
>                 URL: https://issues.apache.org/jira/browse/KAFKA-20295
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>         Environment: Kafka 4.2.0 (Scala 2.13) running on Debian Trixie 13.3
>            Reporter: Roland Sommer
>            Priority: Major
>
> While upgrading our Kafka clusters to new operating systems, I switched to 
> dynamic voter configuration and removed controller instances with the 
> {{remove-controller}} subcommand of 
> {{/opt/kafka/bin/kafka-metadata-quorum.sh}}. Inspecting the cluster with 
> {{describe}} shows only the nodes that are actually running.
> Now, during the upgrade to 4.2.0, the final metadata upgrade step fails 
> with
> {code:java}
> Could not upgrade eligible.leader.replicas.version to 1. The update failed 
> for all features since the following feature had an error: Invalid update 
> version 29 for feature metadata.version. Controller 351 only supports 
> versions 7-27{code}
> with 351 being the ID of an already removed controller. Inspecting a 
> snapshot with {{/opt/kafka/bin/kafka-metadata-shell.sh}} shows that the IDs 
> of the already removed controllers (351, 584, 686) are still listed 
> alongside the three current ones:
> {code:java}
> >> ls image/cluster/controllers/
> 158 206 351 584 611 686 {code}
> while other tools only show the expected nodes:
> {code:java}
> ~$ /opt/kafka/bin/kafka-metadata-quorum.sh --bootstrap-controller localhost:9093 describe --replication --human-readable
> NodeId  DirectoryId             LogEndOffset  Lag  LastFetchTimestamp  LastCaughtUpTimestamp  Status
> 158     2gsvOvnT7urpZcA_-LUy5w  196823524     0    7 ms ago            8 ms ago               Leader
> 611     27Ii-xdAZ7ReQBLsvvJb0A  196823524     0    348 ms ago          348 ms ago             Follower
> 206     Q7X9o3XbKxk_3tz4T8torg  196823524     0    348 ms ago          348 ms ago             Follower
> 226     7n6aedUEuytkqhBnbe7ESw  196823524     0    348 ms ago          348 ms ago             Observer
> 181     tZ17VQ8cYpf7R-LyAQWf2w  196823524     0    349 ms ago          349 ms ago             Observer
> 299     P4qXt3K0G5Qg_7w_UdvaNA  196823524     0    348 ms ago          348 ms ago             Observer
> 290     bA0pqZFsUa45lRTB6bS4bg  196823524     0    348 ms ago          348 ms ago             Observer
> 293     Av_12222lURKVYVt-aNKOQ  196823524     0    348 ms ago          348 ms ago             Observer
> 485     glENIgkIng1MYDF8HxxoDQ  196823524     0    349 ms ago          350 ms ago             Observer {code}
> Grepping through the output of 
> {{bin/kafka-dump-log.sh --cluster-metadata-decoder}} shows only the three 
> expected {{REGISTER_CONTROLLER_RECORD}} entries.
> Is there a clear path for removing those stale controller entries?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
