+1 for such a tool. It would be of great help in a lot of use cases.

On Thu, Feb 23, 2017 at 11:44 PM, Matthias J. Sax <matth...@confluent.io>
wrote:

> \cc from dev
>
>
> -------- Forwarded Message --------
> Subject: Re: KIP-122: Add a tool to Reset Consumer Group Offsets
> Date: Thu, 23 Feb 2017 10:13:39 -0800
> From: Matthias J. Sax <matth...@confluent.io>
> Organization: Confluent Inc
> To: d...@kafka.apache.org
>
> So you suggest to merge "scope options" --topics, --topic, and
> --partitions into a single option? Sound good to me.
>
> I like the compact way to express it, ie, topicname:list-of-partitions
> with "all partitions" if not partitions are specified. It's quite
> intuitive to use.
>
> Just wondering, if we could get rid of the repeated --topic option; it's
> somewhat verbose. Have no good idea though who to improve it.
>
> If you concatenate multiple topic, we need one more character that is
> not allowed in topic names to separate the topics:
>
> > invalidChars = {'/', '\\', ',', '\u0000', ':', '"', '\'', ';', '*',
> '?', ' ', '\t', '\r', '\n', '='};
>
> maybe
>
> --topics t1=1,2,3:t2:t3=3
>
> use '=' to specify partitions (instead of ':' as you proposed) and ':'
> to separate topics? All other characters seem to be worse to use to me.
> But maybe you have a better idea.
>
>
>
> -Matthias
>
>
> On 2/23/17 3:15 AM, Jorge Esteban Quilcate Otoya wrote:
> > @Matthias about the point 9:
> >
> > What about keeping only the --topic option, and support this format:
> >
> > `--topic t1:0,1,2 --topic t2 --topic t3:2`
> >
> > In this case topics t1, t2, and t3 will be selected: topic t1 with
> > partitions 0,1 and 2; topic t2 with all its partitions; and topic t3,
> with
> > only partition 2.
> >
> > Jorge.
> >
> > El mar., 21 feb. 2017 a las 11:11, Jorge Esteban Quilcate Otoya (<
> > quilcate.jo...@gmail.com>) escribió:
> >
> >> Thanks for the feedback Matthias.
> >>
> >> * 1. You're right. I'll reorder the scenarios.
> >>
> >> * 2. Agree. I'll update the KIP.
> >>
> >> * 3. I like it, updating to `reset-offsets`
> >>
> >> * 4. Agree, removing the `reset-` part
> >>
> >> * 5. Yes, 1.e option without --execute or --export will print out
> current
> >> offset, and the new offset, that will be the same. The use-case of this
> >> option is to use it in combination with --export mostly and have a
> current
> >> 'checkpoint' to reset later. I will add to the KIP how the output should
> >> looks like.
> >>
> >> * 6. Considering 4., I will update it to `--to-offset`
> >>
> >> * 7. I like the idea to unify these options (plus, minus).
> >> `shift-offsets-by` is a good option, but I will like some more feedback
> >> here about the name. I will update the KIP in the meantime.
> >>
> >> * 8. Yes, discussed in 9.
> >>
> >> * 9. Agree. I'll love some feedback here. `topic` is already used by
> >> `delete`, and we can add `--all-topics` to consider all
> topics/partitions
> >> assigned to a group. How could we define specific topics/partitions?
> >>
> >> * 10. Haven't thought about it, but make sense.
> >> <topic>,<partition>,<offset> would be enough.
> >>
> >> * 11. Agree. Solved with 10.
> >>
> >> Also, I have a couple of changes to mention:
> >>
> >> 1. I have add a reference to the branch where I'm working on this KIP.
> >>
> >> 2. About the period scenario `--to-period`. I will change it to
> >> `--to-duration` given that duration (
> >> https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html)
> >> follows this format: 'PnDTnHnMnS' and does not consider daylight saving
> >> efects.
> >>
> >>
> >>
> >> El mar., 21 feb. 2017 a las 2:47, Matthias J. Sax (<
> matth...@confluent.io>)
> >> escribió:
> >>
> >> Hi,
> >>
> >> thanks for updating the KIP. Couple of follow up comments:
> >>
> >> * Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
> >> time" option -- IMHO it belongs to "reset by position"?
> >>
> >>
> >> * Nit: Description of "Reset to Earliest"
> >>
> >>> using Kafka Consumer's `auto.offset.reset` to `earliest`
> >>
> >> I think this is strictly speaking not correct (as auto.offset.reset only
> >> triggered if no valid offset is found, but this tool explicitly modified
> >> committed offset), and should be phrased as
> >>
> >>> using Kafka Consumer's #seekToBeginning()
> >>
> >> -> similar issue for description of "Reset to Latest"
> >>
> >>
> >> * Main option: rename to --reset-offsets (plural instead of singular)
> >>
> >>
> >> * Scenario Options: I would remove "reset" from all options, because the
> >> main argument "--reset-offset" says already what to do:
> >>
> >>> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX
> >>
> >> better (IMHO):
> >>
> >>> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX
> >>
> >>
> >>
> >> * Option 1.e ("print and export current offset") is not intuitive to use
> >> IMHO. The main option is "--reset-offset" but nothing happens if no
> >> scenario is specified. It is also not specified, what the output should
> >> look like?
> >>
> >> Furthermore, --describe should actually show currently committed offset
> >> for a group. So it seems to be redundant to have the same option in
> >> --reset-offsets
> >>
> >>
> >> * Option 2.a: I would rename to "--reset-to-offset" (or considering the
> >> comment above to "--to-offset")
> >>
> >>
> >> * Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
> >> and accept positive/negative values
> >>
> >>
> >> * About Scope "all": maybe it's better to have an option "--all-topics"
> >> (or similar). IMHO explicit arguments are preferable over implicit
> >> setting to guard again accidental miss use of the tool.
> >>
> >>
> >> * Scope: I also think, that "--topic" (singular) and "--topics" (plural)
> >> are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
> >> can have two options that are easier to distinguish.
> >>
> >>
> >> * I still think that JSON is not the best format (it's too verbose/hard
> >> to write for humans from scratch). A simple CSV format with implicit
> >> schema (topic,partition,offset) would be sufficient.
> >>
> >>
> >> * Why does the JSON contain "group_id" field -- there is parameter
> >> "--group" to specify the group ID. Would one overwrite the other (what
> >> order) or would there be an error if "--group" is used in combination
> >> with "--reset-from-file"?
> >>
> >>
> >>
> >> -Matthias
> >>
> >>
> >>
> >>
> >> On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> >>> Hi,
> >>>
> >>> according to the feedback, I've updated the KIP:
> >>>
> >>> - We have added and ordered the scenarios, scopes and executions of the
> >>> Reset Offset tool.
> >>> - Consider it as an extension to the current `ConsumerGroupCommand`
> tool
> >>> - Execution will be possible without generating JSON files.
> >>>
> >>>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> >>>
> >>> Looking forward to your feedback!
> >>>
> >>> Jorge.
> >>>
> >>> El mié., 8 feb. 2017 a las 23:23, Jorge Esteban Quilcate Otoya (<
> >>> quilcate.jo...@gmail.com>) escribió:
> >>>
> >>>> Great. I think I got the idea. What about this options:
> >>>>
> >>>> Scenarios:
> >>>>
> >>>> 1. Current status
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
> >>>>
> >>>> 2. To Datetime
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> --reset-to-datetime
> >>>> 2017-01-01T00:00:00.000´
> >>>>
> >>>> 3. To Period
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period
> >> P2D´
> >>>>
> >>>> 4. To Earliest
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> >> --reset-to-earliest´
> >>>>
> >>>> 5. To Latest
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1
> --reset-to-latest´
> >>>>
> >>>> 6. Minus 'n' offsets
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
> >>>>
> >>>> 7. Plus 'n' offsets
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
> >>>>
> >>>> 8. To specific offset
> >>>>
> >>>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
> >>>>
> >>>> Scopes:
> >>>>
> >>>> a. All topics used by Consumer Group
> >>>>
> >>>> Don't specify --topics
> >>>>
> >>>> b. Specific List of Topics
> >>>>
> >>>> Add list of values in --topics t1,t2,tn
> >>>>
> >>>> c. One Topic, all Partitions
> >>>>
> >>>> Add one topic and no partitions values: --topic t1
> >>>>
> >>>> d. One Topic, List of Partitions
> >>>>
> >>>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
> >>>>
> >>>> About Reset Plan (JSON file):
> >>>>
> >>>> I think is still valid to have the option to persist reset
> configuration
> >>>> as a file, but I agree to give the option to run the tool without
> going
> >>>> down to the JSON file.
> >>>>
> >>>> Execution options:
> >>>>
> >>>> 1. Without execution argument (No args):
> >>>>
> >>>> Print out results (reset plan)
> >>>>
> >>>> 2. With --execute argument:
> >>>>
> >>>> Run reset process
> >>>>
> >>>> 3. With --output argument:
> >>>>
> >>>> Save result in a JSON format.
> >>>>
> >>>> 4. Only with --execute option and --reset-file (path to JSON)
> >>>>
> >>>> Reset based on file
> >>>>
> >>>> 4. Only with --verify option and --reset-file (path to JSON)
> >>>>
> >>>> Verify file values with current offsets
> >>>>
> >>>> I think we can remove --generate-and-execute because is a bit clumsy.
> >>>>
> >>>> With this options we will be able to execute with manual JSON
> >>>> configuration.
> >>>>
> >>>>
> >>>> El mié., 8 feb. 2017 a las 22:43, Ben Stopford (<b...@confluent.io>)
> >>>> escribió:
> >>>>
> >>>> Yes - using a tool like this to skip a set of consumer groups over a
> >>>> corrupt/bad message is definitely appealing.
> >>>>
> >>>> B
> >>>>
> >>>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <g...@confluent.io>
> wrote:
> >>>>
> >>>>> I like the --reset-to-earliest and --reset-to-latest. In general,
> >>>>> since the JSON route is the most challenging for users, we want to
> >>>>> provide a lot of ways to do useful things without going there.
> >>>>>
> >>>>> Two things that can help:
> >>>>>
> >>>>> 1. A lot of times, users want to skip few messages that cause issues
> >>>>> and continue. maybe just specifying the topic, partition and delta
> >>>>> will be better than having to find the offset and write a JSON and
> >>>>> validate the JSON etc.
> >>>>>
> >>>>> 2. Thinking if there are other common use-cases that we can make easy
> >>>>> rather than just one generic but not very usable method.
> >>>>>
> >>>>> Gwen
> >>>>>
> >>>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
> >>>>> <quilcate.jo...@gmail.com> wrote:
> >>>>>> Thanks for the feedback!
> >>>>>>
> >>>>>> @Onur, @Gwen:
> >>>>>>
> >>>>>> Agree. Actually at the first draft I considered to have it inside
> >>>>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a
> standalone
> >>>>> tool
> >>>>>> to describe it clearly and focus it on reset functionality.
> >>>>>>
> >>>>>> But now that you mentioned, it does make sense to have it in
> >>>>>> ´kafka-consumer-groups.sh´. How would be a consistent way to
> introduce
> >>>>> it?
> >>>>>>
> >>>>>> Maybe something like this:
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
> >>>> --topics
> >>>>> t1
> >>>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
> >>>>>> plan.json´
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
> >>>>>> plan.json´
> >>>>>>
> >>>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute
> >> --group
> >>>>> cg1
> >>>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
> >>>>>>
> >>>>>> @Gwen:
> >>>>>>
> >>>>>>> It looks exactly like the replica assignment tool
> >>>>>>
> >>>>>> It was influenced by ;-) I use the generate-verify-execute process
> >> here
> >>>>> to
> >>>>>> make sure user will be aware of the result of this operation. At the
> >>>>>> beginning we considered only add a couple of options to Consumer
> Group
> >>>>>> Command:
> >>>>>>
> >>>>>> --rewind-to-timestamp and --rewind-to-period
> >>>>>>
> >>>>>> @Onur:
> >>>>>>
> >>>>>>> You can actually get away with overriding while members of the
> group
> >>>>> are live
> >>>>>> with method 2 by using group information from DescribeGroupsRequest.
> >>>>>>
> >>>>>> This means that we need to have Consumer Group stopped before
> >> executing
> >>>>> and
> >>>>>> start a new consumer internally to do this? Therefore, we won't be
> >> able
> >>>>> to
> >>>>>> consider executing reset when ConsumerGroup is active? (trying to
> >>>> relate
> >>>>> it
> >>>>>> with @Dong 5th question)
> >>>>>>
> >>>>>> @Dong:
> >>>>>>
> >>>>>>> Should we allow user to use wildcard to reset offset of all groups
> >>>> for a
> >>>>>> given topic as well?
> >>>>>>
> >>>>>> I haven't thought about this scenario. Could be interesting.
> Following
> >>>>> the
> >>>>>> recommendation to add it into Consumer Group Command, in this case
> >>>> Group
> >>>>>> argument will be optional if there are only 1 topic. I think for
> >>>> multiple
> >>>>>> topic won't be that useful.
> >>>>>>
> >>>>>>> Should we allow user to specify timestamp per topic partition in
> the
> >>>>> json
> >>>>>> file as well?
> >>>>>>
> >>>>>> Don't think this could be a valid from the tool, but if Reset Plan
> is
> >>>>>> generated, and user want to set the offset for a specific partition
> to
> >>>>>> other offset (eventually based on another timestamp), and execute
> it,
> >>>> it
> >>>>>> will be up to her/him.
> >>>>>>
> >>>>>>> Should the script take some credential file to make sure that this
> >>>>>> operation is authenticated given the potential impact of this
> >>>> operation?
> >>>>>>
> >>>>>> Haven't tried to secure brokers yet, but the tool should support
> >>>>>> authorization if it's enabled in the broker.
> >>>>>>
> >>>>>>> Should we provide constant to reset committed offset to
> >>>> earliest/latest
> >>>>>> offset of a partition, e.g. -1 indicates earliest offset and -2
> >>>> indicates
> >>>>>> latest offset.
> >>>>>>
> >>>>>> I will go for something like ´--reset-to-earliest´ and
> >>>>> ´--reset-to-latest´
> >>>>>>
> >>>>>>> Should we allow dynamic change of the comitted offset when consumer
> >>>> are
> >>>>>> running, such that consumer will seek to the newly committed offset
> >> and
> >>>>>> start consuming from there?
> >>>>>>
> >>>>>> Not sure about this. I will recommend to keep it simple and ask user
> >> to
> >>>>>> stop consumers first. But I would considered it if the trade-offs
> are
> >>>>>> clear.
> >>>>>>
> >>>>>> @Matthias
> >>>>>>
> >>>>>> Added :). And thanks a lot for your help to define this KIP!
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> El mié., 8 feb. 2017 a las 7:47, Gwen Shapira (<g...@confluent.io>)
> >>>>>> escribió:
> >>>>>>
> >>>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
> >>>>>>> arguments and a JSON parser to the existing tool, right?
> >>>>>>>
> >>>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
> >>>>>>> <onurkaraman.apa...@gmail.com> wrote:
> >>>>>>>> I think it makes sense to just add the feature to
> >>>>>>> kafka-consumer-groups.sh
> >>>>>>>>
> >>>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <g...@confluent.io>
> >>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
> >>>>>>>>>
> >>>>>>>>> I hate the interface, though. It looks exactly like the replica
> >>>>>>>>> assignment tool. A tool everyone loves so much that there are
> >>>>> multiple
> >>>>>>>>> projects, open and closed, that try to fix it.
> >>>>>>>>>
> >>>>>>>>> Can we swap it with something that looks a bit more like the
> >>>> consumer
> >>>>>>>>> group tool? or the kafka streams reset tool? Consistency is
> helpful
> >>>>> in
> >>>>>>>>> such cases. I spent some time learning existing tools and
> learning
> >>>>> yet
> >>>>>>>>> another one is a deterrent.
> >>>>>>>>>
> >>>>>>>>> Gwen
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
> >>>>>>>>> <quilcate.jo...@gmail.com> wrote:
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
> >>>> Group
> >>>>>>>>> Offsets.
> >>>>>>>>>>
> >>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
> >>>>>>>>>>
> >>>>>>>>>> Please, take a look at the proposal and share your feedback.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Jorge.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Gwen Shapira
> >>>>>>>>> Product Manager | Confluent
> >>>>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
> >> <(650)%20450-2760>
> >>>> <(650)%20450-2760> | @gwenshap
> >>>>>>>>> Follow us: Twitter | blog
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Gwen Shapira
> >>>>>>> Product Manager | Confluent
> >>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
> >> <(650)%20450-2760>
> >>>> <(650)%20450-2760> | @gwenshap
> >>>>>>> Follow us: Twitter | blog
> >>>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Gwen Shapira
> >>>>> Product Manager | Confluent
> >>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760> <(650)%20450-2760>
> >> | @gwenshap
> >>>>> Follow us: Twitter | blog
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>
>
>

Reply via email to