Thank you, Greg, for your input. In addition to my previous proposal, we
could add the items below if they make sense.

1. ACL cleanup CLI: For all non-existent topics, find the ACLs that
   reference them, show a preview in dry-run mode, and then execute:

   ./acl-cleanup.sh --mode dryrun/execute

   We need to be careful with wildcard ACLs here. I think this would
   improve cluster performance by reducing the znode/KRaft metadata size.
   There could be thousands of unused ACLs that users are not aware of.
   While these delete operations are dangerous, a dry-run mode that shows
   a before-and-after preview would be nice.

2. Stale topic detection CLI: Stale topics could be determined from the
   timestamps of the .log files of each partition on all nodes, if they
   are at or beyond the given period (e.g. 90 days):

   ./detect-stale-topics.sh --period-days 90

   This would need an improvement to AdminClient describeLogDirs
   (kafka-log-dirs.sh) to fetch the lowest/highest timestamp of the log
   files. Again, there could be hundreds of unused topics that
   organizations could then easily find and delete.

3. When creating a new write and read ACL (with a consumer group) for a
   topic that doesn't exist, respond with a warning that the resource
   doesn't exist.

4. Improve kafka-consumer-groups.sh for the case of a non-existent topic.

5. Export/import of ACLs:

   ./acl-export.sh --file acl-date.json
   ./acl-import.sh --file acl-date.json

   With this, users can take regular backups of ACLs.

6. Export/import of topic metadata, similar to the above. I remember that
   companies have come up with their own strategies for taking regular
   backups.

Please let me know your thoughts.

Thanks,
Murali

On Thu, Sep 19, 2024 at 9:05 PM Greg Harris <[email protected]> wrote:

> Hey Murali,
>
> Thanks for raising this.
>
> I think it would make sense to take another look at the dependencies
> between the different CRUD objects that Kafka exposes, and add some sanity
> checks where they make sense and can be implemented safely.
>
> I was just speaking to a user that had some confusion for consumer groups
> referencing topics that had been deleted/truncated, and observing negative
> lag and non-delivery of the "rewritten" offsets. I wonder if we could
> improve the referential integrity there too.
>
>
> Thanks,
> Greg
>
> On Thu, Sep 19, 2024, 10:27 AM Muralidhar Basani
> <[email protected]> wrote:
>
> > Hello everyone,
> >
> > I’d like to propose an improvement to the Kafka topic deletion process,
> > specifically regarding ACLs.
> >
> > Currently, when we delete a topic using the kafka-topics.sh cli, the topic
> > is removed without any consideration for the ACLs that were applied to it.
> > This could lead to stale ACLs remaining in the system, which might clutter
> > the environment.
> >
> > Proposal:
> > Prevent Topic Deletion if ACLs Exist: If there are read or write ACLs
> > associated with the topic, the deletion should be prohibited by default.
> > This helps prevent accidental deletion of topics that are still in use.
> >
> > Force Delete Option: By introducing a --force option (e.g., -f), we allow
> > users to bypass the ACL check and delete the topic, even if ACLs are
> > present.
> >
> > Optional ACL Cleanup: Users could be provided with the option to delete
> > associated ACLs during the topic deletion process. However, this might not
> > apply to ACLs based on patterns, which would need to remain intact.
> >
> > Scenarios:
> > Scenario 1: No ACLs exist
> >
> > bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic DummyTopic
> > Topic deleted.
> >
> > Scenario 2: ACLs exist
> >
> > bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic DummyTopic
> > The following ACLs exist on the topic:
> > ResourcePattern(resourceType=TOPIC, name=DummyTopic, patternType=LITERAL,
> > ... operation=READ)
> >
> > Topic cannot be deleted.
> >
> > Scenario 3: ACLs exist, force delete
> >
> > bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic DummyTopic -f
> > The following ACLs exist on the topic:
> > ResourcePattern(resourceType=TOPIC, name=DummyTopic, patternType=LITERAL,
> > ... operation=READ/WRITE)
> >
> > Topic is force deleted.
> >
> > Even though these acl resources are provisioned independent of a topic
> > existence, I see nice benefits :
> >
> > Accidental Deletion Prevention: This warning mechanism can help prevent
> > users from unintentionally deleting topics that are still in use.
> > Metadata Cleanup: It ensures a cleaner environment by avoiding stale ACLs
> > that may linger after topic deletion.
> >
> > I believe this is technically feasible, but I would appreciate hearing your
> > thoughts on it. and happy to open a KIP for it.
> >
> > Thanks,
> > Murali
> >
>
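P.S. To make item 1 (ACL cleanup) a bit more concrete, here is a rough Python sketch of only the selection and dry-run logic. It does not call any Kafka API: AclBinding, find_stale_acls, cleanup and delete_fn are hypothetical names used for illustration, and the ACL list and the set of existing topics are assumed to have been fetched already. It also assumes that a wildcard ACL is stored as a LITERAL pattern named "*". Wildcard and PREFIXED ACLs are never selected, since they may still match other or future topics.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set


@dataclass(frozen=True)
class AclBinding:
    """Hypothetical, simplified stand-in for an ACL entry (not a Kafka class)."""
    resource_type: str  # e.g. "TOPIC" or "GROUP"
    name: str           # resource name; "*" is assumed to mean the wildcard
    pattern_type: str   # "LITERAL" or "PREFIXED"
    principal: str
    operation: str


def find_stale_acls(acls: Iterable[AclBinding],
                    existing_topics: Set[str]) -> List[AclBinding]:
    """Select LITERAL topic ACLs whose topic does not exist."""
    stale = []
    for acl in acls:
        if acl.resource_type != "TOPIC":
            continue  # only topic ACLs are in scope
        if acl.pattern_type != "LITERAL" or acl.name == "*":
            continue  # wildcard/prefixed ACLs are left untouched
        if acl.name not in existing_topics:
            stale.append(acl)
    return stale


def cleanup(acls: Iterable[AclBinding],
            existing_topics: Set[str],
            mode: str = "dryrun",
            delete_fn: Optional[Callable[[AclBinding], None]] = None) -> List[AclBinding]:
    """Return the ACLs that would be (dryrun) or were (execute) deleted."""
    if mode not in ("dryrun", "execute"):
        raise ValueError(f"unknown mode: {mode}")
    stale = find_stale_acls(acls, existing_topics)
    if mode == "execute":
        if delete_fn is None:
            raise ValueError("execute mode needs a delete_fn")
        for acl in stale:
            delete_fn(acl)  # hypothetical callback that performs the real delete
    return stale
```

In dryrun mode nothing is deleted and the returned list is the preview; in execute mode the same list is passed, entry by entry, to whatever delete callback the tool supplies.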

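For item 2 (stale topic detection), the decision rule could look roughly like the Python sketch below. It assumes the newest .log file timestamp per partition has already been collected from all nodes, which is exactly the data the proposed AdminClient improvement would have to expose; detect_stale_topics and its input mapping are hypothetical names, not an existing Kafka API. A topic is reported only if even its most recently written partition is at least period_days old.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional, Tuple


def detect_stale_topics(
    latest_write_by_partition: Dict[Tuple[str, int], datetime],
    period_days: int,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return topics whose newest log timestamp is at least period_days old.

    latest_write_by_partition maps (topic, partition) to the newest .log file
    timestamp seen for that partition on any node (hypothetical input).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=period_days)

    # Keep the most recent timestamp seen for each topic.
    newest_by_topic: Dict[str, datetime] = {}
    for (topic, _partition), ts in latest_write_by_partition.items():
        current = newest_by_topic.get(topic)
        if current is None or ts > current:
            newest_by_topic[topic] = ts

    # A topic is stale only if its newest write is at or before the cutoff.
    return sorted(t for t, ts in newest_by_topic.items() if ts <= cutoff)
```

Taking the newest timestamp per topic, rather than per partition, avoids flagging a topic that still receives writes on only some of its partitions.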