[ https://issues.apache.org/jira/browse/KAFKA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tejas Patil updated KAFKA-559: ------------------------------ Attachment: KAFKA-559.v2.patch _1. Passing a groupId for cleanup will make the cleanup job tedious since we tend to have hundreds of console-consumer group ids in ZK that are stale. Running the tool for a particular topic or all topics probably makes more sense._ I had received different requirement spec which was based on "group" and not "topic". In the new patch, I have added support for topic based deletion too. _2. I would suggest accepting a date param "mm-dd-yyyy hh:mm:ss,SSS" as a String instead of accepting a timestamp value, and deleting the group only if it has had no updates to its offsets since that date, as described above._ I had a discussion with Joel about this one and he had suggested me to use the EPOCH time thing instead of "mm-dd-yyyy". I am open for modification but if there is a consensus about it. _3. It's dangerous to delete the entire group if the date/"since" is not provided. It's very easy for user to specify only two arguments (topic and zkconnect) and not specify the date. Let's also make sure that the user always specifies a date._ Ok. Change implemented _4. "dry-run" does not need to accept any value. You can simply use parser.accepts("dry-run", "....") and then use if (options.has(dryRunOpt)) ....._ Agreed. Change implemented _5. We can inline exitIfNoPathExists, the implementation is small and clear enough._ While adding support for topic based deletion, that method had no use so got rid of it. _6. We should have an info statement when the group ids are deleted in the non dry-run mode._ Good catch :) Change implemented _7. info("Removal has successfully completed.") can probably be refactored to something more specific to this tool._ Below is what I changed it to: logger.info("Kafka obsolete Zk entires cleanup tool shutdown successfully.") _8. Instead of writing a different info statement for dry-run mode, I think you should be able to set logIdent of Logging to "[dry-run]" or "" depending on which mode the tool is working in._ Nice one :) Change implemented Minor stuff: _1. I think we tend to use camelCase in variable names instead of underscores._ _2. Whitespaces can be made more consistent._ I just read http://kafka.apache.org/coding-guide.html. Will take care of that from now on. One related question: is there a code formatter for Kafka ? Creating a fat wiki page with a bunch of project specific formatting is helpful but people won't have that in mind everytime they code. It gets worse when people work on multiple projects which follow different conventions. Here is what some other projects came up with to account for that: - https://issues.apache.org/jira/browse/HBASE-3678 - http://svn.apache.org/viewvc/nutch/branches/2.x/eclipse-codeformat.xml?view=markup - https://github.com/cloudera/blog-eclipse/blob/master/hadoop-format.xml Attached KAFKA-559.v2.patch : Implemented the changes suggested by [~swapnilghike] and did some sundry refactoring > Garbage collect old consumer metadata entries > --------------------------------------------- > > Key: KAFKA-559 > URL: https://issues.apache.org/jira/browse/KAFKA-559 > Project: Kafka > Issue Type: New Feature > Reporter: Jay Kreps > Assignee: Tejas Patil > Labels: project > Attachments: KAFKA-559.v1.patch, KAFKA-559.v2.patch > > > Many use cases involve tranient consumers. These consumers create entries > under their consumer group in zk and maintain offsets there as well. There is > currently no way to delete these entries. It would be good to have a tool > that did something like > bin/delete-obsolete-consumer-groups.sh [--topic t1] --since [date] > --zookeeper [zk_connect] > This would scan through consumer group entries and delete any that had no > offset update since the given date. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira