-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

The tool resets all offsets of user input and intermediate topics as
well as for all internal intermediate and topics.

Furthermore, internal topics (intermediate and store changelogs) get
deleted. But NOT local stores -- for local stores you need to leverage
KafkaStreams#cleanUp()

See also:
http://docs.confluent.io/current/streams/developer-guide.html#applicatio
n-reset-tool


- -Matthias

On 11/10/16 3:48 AM, Sachin Mittal wrote:
> Hi, The reset tool looks like a great feature.
> 
> So following this link 
> https://www.confluent.io/blog/data-reprocessing-with-kafka-streams-res
etting-a-streams-application/
>
>  What I understand is that this tool resets the offsets for
> internal and intermediate topics and also deletes all the internal
> local storage and topics. Please confirm this.
> 
> I was actually doing all this manually.
> 
> Thanks Sachin
> 
> 
> On Thu, Nov 10, 2016 at 12:05 AM, Matthias J. Sax
> <[email protected]> wrote:
> 
> Hey,
> 
> changelog topics are compacted topics and no retention time is
> applied (one exception are window-changelog topics though, which
> have both -- compaction and retention policy enabled)
> 
> If an input message is purged via retention time (and this is you 
> latest committed offset), and you start you Stream application, it 
> will resume according to "auto.offset.reset" policy what you can 
> specify in StreamsConfig. So Streams will just run fine, but the
> data is of course lost.
> 
> For repartitioning topics that same argument applies.
> 
>>>> I am asking this because I am planning to keep the retention
>>>> time for internal changelog topics also small so no message
>>>> gets big enough to start getting exceptions.
> 
> I don't understand this part though...
> 
> To set an arbitrary start offset there is no API or tooling
> available at the moment. However, we plan to add some of this in
> future releases.
> 
> As for now, you could set start offsets "manually" by writing a
> small consumer application, that does not process data, but only
> seek() to (and commit()) the start offsets you want to use. This is
> a similar idea as the Streams application reset tool is built on.
> See this blog post for details:
> 
> https://www.confluent.io/blog/data-reprocessing-with-kafka-streams-res
et
>
> 
ting-a-streams-application/
> 
> However, you should be careful to not mess with internally kept
> state (ie, make sure it is still semantically meaningful and
> compatible if you start to modify offsets...)
> 
> 
> Hope this helps.
> 
> -Matthias
> 
> On 11/9/16 7:29 AM, Sachin Mittal wrote:
>>>> Hi, What happens when the message itself is purged by kafka
>>>> via retention time setting or something else, which was later
>>>> than the last offset stored by the stream consumer.
>>>> 
>>>> I am asking this because I am planning to keep the retention
>>>> time for internal changelog topics also small so no message
>>>> gets big enough to start getting exceptions.
>>>> 
>>>> So if messages from last offset are deleted then will there
>>>> be any issues?
>>>> 
>>>> Also is there anyway to control or set the offset manually
>>>> when we re start the streaming application so certain old
>>>> messages are not consumed at all as logic wise they are not
>>>> useful to streaming application any more. Like say past users
>>>> sessions created while streaming application was stopped.
>>>> 
>>>> 
>>>> Thanks Sachin
>>>> 
>>>> 
>>>> On Wed, Nov 9, 2016 at 7:46 PM, Eno Thereska 
>>>> <[email protected]> wrote:
>>>> 
>>>>> Hi Sachin,
>>>>> 
>>>>> Kafka Streams is built on top of standard Kafka consumers.
>>>>> For for every topic it consumes from (whether changelog
>>>>> topic or source topic, it doesn't matter), the consumer
>>>>> stores the offset it last consumed from. Upon restart, by
>>>>> default it start consuming from where it left off from each
>>>>> of the topics. So you can think of it this way: a restart
>>>>> should be no different than if you had left the application
>>>>> running (i.e., no restart).
>>>>> 
>>>>> Thanks Eno
>>>>> 
>>>>> 
>>>>>> On 9 Nov 2016, at 13:59, Sachin Mittal
>>>>>> <[email protected]> wrote:
>>>>>> 
>>>>>> Hi, I had some basic questions on sequence of tasks for 
>>>>>> streaming application restart in case of failure or
>>>>>> otherwise.
>>>>>> 
>>>>>> Say my stream is structured this way
>>>>>> 
>>>>>> source-topic branched into 2 kstreams source-topic-1 
>>>>>> source-topic-2 each mapped to 2 new kstreams (new
>>>>>> key,value pairs) backed by 2 kafka topics
>>>>>> source-topic-1-new source-topic-2-new each aggregated to
>>>>>> new ktable backed by internal changelog topics
>>>>>> source-topic-1-new-table (scource-topic-1-new-changelog)
>>>>>> source-topic-2-new-table (scource-topic-2-new-changelog)
>>>>>> table1 left join table2 -> to final stream Results of
>>>>>> final stream are then persisted into another data
>>>>>> storage
>>>>>> 
>>>>>> So if you see I have following physical topics or state
>>>>>> stores source-topic source-topic-1-new
>>>>>> source-topic-2-new scource-topic-1-new-changelog
>>>>>> scource-topic-2-new-changelog
>>>>>> 
>>>>>> Now at a give point if the streaming application is
>>>>>> stopped there is some data in all these topics. Barring
>>>>>> the source-topic all other topic has data inserted by
>>>>>> the
>>>>> streaming
>>>>>> application.
>>>>>> 
>>>>>> Also I suppose streaming application stores the offset
>>>>>> for each of the topic as where it was last.
>>>>>> 
>>>>>> So when I restart the application how does the
>>>>>> processing starts again? Will it pick the data from last
>>>>>> left changelog topics and process them first and then
>>>>>> process the source topic data from the offset last left?
>>>>>> 
>>>>>> Or it will start from source topic. I really don't want
>>>>>> it to maintain offset to changelog tables because any old
>>>>>> key's value can be modified as part of aggregation
>>>>>> again.
>>>>>> 
>>>>>> Bit confused here, any light would help a lot.
>>>>>> 
>>>>>> Thanks Sachin
>>>>> 
>>>>> 
>>>> 
>> 
> 
-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - https://gpgtools.org

iQIcBAEBCgAGBQJYJMkaAAoJECnhiMLycopPeOkP/1MtVk2ZSYYkZO8Ru3IWJ5cw
5b0s5IpyLAS10JsoYLsITpNuWffJJx6L0E25mf7WZ/nn918BuqTLl5jNivw9C9ru
BVouCPFITVs8BRIAVkh9Vzux9/FCHS82UEDseYuWKrjyiJ0v6zVpXnrj65R//VpQ
b4tja3ubbXRaIeoreK9cxQsO3dQdU7Yzz0pFWs1/EyaACghHGeTmwrhQqq02mzm8
HcwbjKJHqEnWRpSXQ6AJ7ak13lgAUrefUCgOI2DK2wBN4267lwuKO9QAh0oJv6EZ
YKjdNXqSRTNd72RfLEZauCNb0dkhRK9s3GN7BWAw0Ce2rjTsXvFP+jMfCjiNj5xp
cd1SM4TG8QlFp6agCHr7W1E1Pcbq/OKnZ2vpQYPNn+qcAk9k5HiZ4pp7h7FUBuLZ
yXmD8gKNwg8S1LfP6dtayo6yxuuwL/CdquzsE3Hi1q8H0C/ZaWz8eiMjd1gM/9Da
D9VZAWSWy7lwQfeyg1vxC7Q/glqF6qOY5kXd40usOt+LNw9CTNXyrAn+n8lVqCKR
SlrRHdnJM5BUu03KmN613dsPJb9XaEcJKJ0tYGENdvZgjWt4jmIYDPBKTiqHKp8c
kJjUjRAHx37El/wLDDsZ2jKMGdG1QwKZ+EzppFMEpVdcf679okqyRNY4sbRA/E8S
mpT1ankVjOZaaPfPcEaw
=9wTX
-----END PGP SIGNATURE-----

Reply via email to