Hi,

thanks for updating the KIP. Couple of follow up comments:

* Nit: Why is "Reset to Earliest" and "Reset to Latest" a "reset by
time" option -- IMHO it belongs to "reset by position"?


* Nit: Description of "Reset to Earliest"

> using Kafka Consumer's `auto.offset.reset` to `earliest`

I think this is strictly speaking not correct (as auto.offset.reset only
triggered if no valid offset is found, but this tool explicitly modified
committed offset), and should be phrased as

> using Kafka Consumer's #seekToBeginning()

-> similar issue for description of "Reset to Latest"


* Main option: rename to --reset-offsets (plural instead of singular)


* Scenario Options: I would remove "reset" from all options, because the
main argument "--reset-offset" says already what to do:

> bin/kafka-consumer-groups.sh --reset-offset --reset-to-datetime XXX

better (IMHO):

> bin/kafka-consumer-groups.sh --reset-offsets --to-datetime XXX



* Option 1.e ("print and export current offset") is not intuitive to use
IMHO. The main option is "--reset-offset" but nothing happens if no
scenario is specified. It is also not specified, what the output should
look like?

Furthermore, --describe should actually show currently committed offset
for a group. So it seems to be redundant to have the same option in
--reset-offsets


* Option 2.a: I would rename to "--reset-to-offset" (or considering the
comment above to "--to-offset")


* Option 2.b and 2.c: I would unify to "--shift-offsets-by" (or similar)
and accept positive/negative values


* About Scope "all": maybe it's better to have an option "--all-topics"
(or similar). IMHO explicit arguments are preferable over implicit
setting to guard again accidental miss use of the tool.


* Scope: I also think, that "--topic" (singular) and "--topics" (plural)
are too similar and easy to use in a wrong way (ie, mix up) -- maybe we
can have two options that are easier to distinguish.


* I still think that JSON is not the best format (it's too verbose/hard
to write for humans from scratch). A simple CSV format with implicit
schema (topic,partition,offset) would be sufficient.


* Why does the JSON contain "group_id" field -- there is parameter
"--group" to specify the group ID. Would one overwrite the other (what
order) or would there be an error if "--group" is used in combination
with "--reset-from-file"?



-Matthias




On 2/17/17 6:43 AM, Jorge Esteban Quilcate Otoya wrote:
> Hi,
> 
> according to the feedback, I've updated the KIP:
> 
> - We have added and ordered the scenarios, scopes and executions of the
> Reset Offset tool.
> - Consider it as an extension to the current `ConsumerGroupCommand` tool
> - Execution will be possible without generating JSON files.
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> 
> Looking forward to your feedback!
> 
> Jorge.
> 
> El mié., 8 feb. 2017 a las 23:23, Jorge Esteban Quilcate Otoya (<
> quilcate.jo...@gmail.com>) escribió:
> 
>> Great. I think I got the idea. What about this options:
>>
>> Scenarios:
>>
>> 1. Current status
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1´
>>
>> 2. To Datetime
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-datetime
>> 2017-01-01T00:00:00.000´
>>
>> 3. To Period
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-period P2D´
>>
>> 4. To Earliest
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-earliest´
>>
>> 5. To Latest
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to-latest´
>>
>> 6. Minus 'n' offsets
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-minus n´
>>
>> 7. Plus 'n' offsets
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-plus n´
>>
>> 8. To specific offset
>>
>> ´kafka-consumer-groups.sh --reset-offset --group cg1 --reset-to x´
>>
>> Scopes:
>>
>> a. All topics used by Consumer Group
>>
>> Don't specify --topics
>>
>> b. Specific List of Topics
>>
>> Add list of values in --topics t1,t2,tn
>>
>> c. One Topic, all Partitions
>>
>> Add one topic and no partitions values: --topic t1
>>
>> d. One Topic, List of Partitions
>>
>> Add one topic and partitions values: --topic t1 --partitions 0,1,2
>>
>> About Reset Plan (JSON file):
>>
>> I think is still valid to have the option to persist reset configuration
>> as a file, but I agree to give the option to run the tool without going
>> down to the JSON file.
>>
>> Execution options:
>>
>> 1. Without execution argument (No args):
>>
>> Print out results (reset plan)
>>
>> 2. With --execute argument:
>>
>> Run reset process
>>
>> 3. With --output argument:
>>
>> Save result in a JSON format.
>>
>> 4. Only with --execute option and --reset-file (path to JSON)
>>
>> Reset based on file
>>
>> 4. Only with --verify option and --reset-file (path to JSON)
>>
>> Verify file values with current offsets
>>
>> I think we can remove --generate-and-execute because is a bit clumsy.
>>
>> With this options we will be able to execute with manual JSON
>> configuration.
>>
>>
>> El mié., 8 feb. 2017 a las 22:43, Ben Stopford (<b...@confluent.io>)
>> escribió:
>>
>> Yes - using a tool like this to skip a set of consumer groups over a
>> corrupt/bad message is definitely appealing.
>>
>> B
>>
>> On Wed, Feb 8, 2017 at 9:37 PM Gwen Shapira <g...@confluent.io> wrote:
>>
>>> I like the --reset-to-earliest and --reset-to-latest. In general,
>>> since the JSON route is the most challenging for users, we want to
>>> provide a lot of ways to do useful things without going there.
>>>
>>> Two things that can help:
>>>
>>> 1. A lot of times, users want to skip few messages that cause issues
>>> and continue. maybe just specifying the topic, partition and delta
>>> will be better than having to find the offset and write a JSON and
>>> validate the JSON etc.
>>>
>>> 2. Thinking if there are other common use-cases that we can make easy
>>> rather than just one generic but not very usable method.
>>>
>>> Gwen
>>>
>>> On Wed, Feb 8, 2017 at 3:25 AM, Jorge Esteban Quilcate Otoya
>>> <quilcate.jo...@gmail.com> wrote:
>>>> Thanks for the feedback!
>>>>
>>>> @Onur, @Gwen:
>>>>
>>>> Agree. Actually at the first draft I considered to have it inside
>>>> ´kafka-consumer-groups.sh´, but I decide to propose it as a standalone
>>> tool
>>>> to describe it clearly and focus it on reset functionality.
>>>>
>>>> But now that you mentioned, it does make sense to have it in
>>>> ´kafka-consumer-groups.sh´. How would be a consistent way to introduce
>>> it?
>>>>
>>>> Maybe something like this:
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --generate --group cg1
>> --topics
>>> t1
>>>> --reset-from 2017-01-01T00:00:00.000 --output plan.json´
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --verify --reset-json-file
>>>> plan.json´
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --execute --reset-json-file
>>>> plan.json´
>>>>
>>>> ´kafka-consumer-groups.sh --reset-offset --generate-and-execute --group
>>> cg1
>>>> --topics t1 --reset-from 2017-01-01T00:00:00.000´
>>>>
>>>> @Gwen:
>>>>
>>>>> It looks exactly like the replica assignment tool
>>>>
>>>> It was influenced by ;-) I use the generate-verify-execute process here
>>> to
>>>> make sure user will be aware of the result of this operation. At the
>>>> beginning we considered only add a couple of options to Consumer Group
>>>> Command:
>>>>
>>>> --rewind-to-timestamp and --rewind-to-period
>>>>
>>>> @Onur:
>>>>
>>>>> You can actually get away with overriding while members of the group
>>> are live
>>>> with method 2 by using group information from DescribeGroupsRequest.
>>>>
>>>> This means that we need to have Consumer Group stopped before executing
>>> and
>>>> start a new consumer internally to do this? Therefore, we won't be able
>>> to
>>>> consider executing reset when ConsumerGroup is active? (trying to
>> relate
>>> it
>>>> with @Dong 5th question)
>>>>
>>>> @Dong:
>>>>
>>>>> Should we allow user to use wildcard to reset offset of all groups
>> for a
>>>> given topic as well?
>>>>
>>>> I haven't thought about this scenario. Could be interesting. Following
>>> the
>>>> recommendation to add it into Consumer Group Command, in this case
>> Group
>>>> argument will be optional if there are only 1 topic. I think for
>> multiple
>>>> topic won't be that useful.
>>>>
>>>>> Should we allow user to specify timestamp per topic partition in the
>>> json
>>>> file as well?
>>>>
>>>> Don't think this could be a valid from the tool, but if Reset Plan is
>>>> generated, and user want to set the offset for a specific partition to
>>>> other offset (eventually based on another timestamp), and execute it,
>> it
>>>> will be up to her/him.
>>>>
>>>>> Should the script take some credential file to make sure that this
>>>> operation is authenticated given the potential impact of this
>> operation?
>>>>
>>>> Haven't tried to secure brokers yet, but the tool should support
>>>> authorization if it's enabled in the broker.
>>>>
>>>>> Should we provide constant to reset committed offset to
>> earliest/latest
>>>> offset of a partition, e.g. -1 indicates earliest offset and -2
>> indicates
>>>> latest offset.
>>>>
>>>> I will go for something like ´--reset-to-earliest´ and
>>> ´--reset-to-latest´
>>>>
>>>>> Should we allow dynamic change of the comitted offset when consumer
>> are
>>>> running, such that consumer will seek to the newly committed offset and
>>>> start consuming from there?
>>>>
>>>> Not sure about this. I will recommend to keep it simple and ask user to
>>>> stop consumers first. But I would considered it if the trade-offs are
>>>> clear.
>>>>
>>>> @Matthias
>>>>
>>>> Added :). And thanks a lot for your help to define this KIP!
>>>>
>>>>
>>>>
>>>> El mié., 8 feb. 2017 a las 7:47, Gwen Shapira (<g...@confluent.io>)
>>>> escribió:
>>>>
>>>>> As long as the CLI is a bit consistent? Like, not just adding 3
>>>>> arguments and a JSON parser to the existing tool, right?
>>>>>
>>>>> On Tue, Feb 7, 2017 at 10:29 PM, Onur Karaman
>>>>> <onurkaraman.apa...@gmail.com> wrote:
>>>>>> I think it makes sense to just add the feature to
>>>>> kafka-consumer-groups.sh
>>>>>>
>>>>>> On Tue, Feb 7, 2017 at 10:24 PM, Gwen Shapira <g...@confluent.io>
>>> wrote:
>>>>>>
>>>>>>> Thanks for the KIP. I'm super happy about adding the capability.
>>>>>>>
>>>>>>> I hate the interface, though. It looks exactly like the replica
>>>>>>> assignment tool. A tool everyone loves so much that there are
>>> multiple
>>>>>>> projects, open and closed, that try to fix it.
>>>>>>>
>>>>>>> Can we swap it with something that looks a bit more like the
>> consumer
>>>>>>> group tool? or the kafka streams reset tool? Consistency is helpful
>>> in
>>>>>>> such cases. I spent some time learning existing tools and learning
>>> yet
>>>>>>> another one is a deterrent.
>>>>>>>
>>>>>>> Gwen
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Feb 7, 2017 at 6:43 PM, Jorge Esteban Quilcate Otoya
>>>>>>> <quilcate.jo...@gmail.com> wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I would like to propose a KIP to Add a tool to Reset Consumer
>> Group
>>>>>>> Offsets.
>>>>>>>>
>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>>>>> 122%3A+Add+a+tool+to+Reset+Consumer+Group+Offsets
>>>>>>>>
>>>>>>>> Please, take a look at the proposal and share your feedback.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jorge.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Gwen Shapira
>>>>>>> Product Manager | Confluent
>>>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
>> <(650)%20450-2760> | @gwenshap
>>>>>>> Follow us: Twitter | blog
>>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Gwen Shapira
>>>>> Product Manager | Confluent
>>>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760>
>> <(650)%20450-2760> | @gwenshap
>>>>> Follow us: Twitter | blog
>>>>>
>>>
>>>
>>>
>>> --
>>> Gwen Shapira
>>> Product Manager | Confluent
>>> 650.450.2760 <(650)%20450-2760> <(650)%20450-2760> | @gwenshap
>>> Follow us: Twitter | blog
>>>
>>
>>
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to