Re: [VOTE] KIP-622 Add currentSystemTimeMs and currentStreamTimeMs to ProcessorContext
Hello all, Based on the feedback of the pr <https://github.com/apache/kafka/pull/9744> https://github.com/apache/kafka/pull/9744, there are following changes done to the kip <https://cwiki.apache.org/confluence/display/KAFKA/KIP-622%3A+Add+currentSystemTimeMs+and+currentStreamTimeMs+to+ProcessorContext> . *ProcessorContext#currentSystemTimeMs()* It is expected that this method will return the internally cached system timestamp from the Kafka Stream runtime. Thus, it may return a different value compared to System.currentTimeMillis(). The cached system time represents the time when we start processing / punctuating, and it would not change throughout the process / punctuate. So this method will return current system time (also called wall-clock time) known from kafka streams runtime. New methods to MockProcessorContext for testing purposes: *MockProcessorContext#setRecordTimestamp*: set record timestamp *MockProcessorContext#setCurrentSystemTimeMs:* set system timestamp *MockProcessorContext#setCurrentStreamTimeMs*: set stream time Deprecate method: MockProcessorContext#setTimestamp as it's name is misleading and we are adding a new method MockProcessorContext#setRecordTimestamp which does the same work. Please let me know if you have any thoughts or concerns with this change. Thanks, Roohit On Fri, Dec 4, 2020 at 7:31 PM Rohit Deshpande wrote: > Hello all, > I am closing the vote for this KIP: > https://cwiki.apache.org/confluence/display/KAFKA/KIP-622%3A+Add+currentSystemTimeMs+and+currentStreamTimeMs+to+ProcessorContext > > Summary of the KIP: > Planning to add two new methods to ProcessorContext: > 1. long currentSystemTimeMs() to fetch wall-clock time > 2. long currentStreamTimeMs() to fetch maximum timestamp of any record yet > processed by the task > > Thanks, > Rohit > > > On 2020/12/01 16:09:54, Bill Bejeck wrote: > > Sorry for jumping into this so late, > > > > Thanks for the KIP, I'm a +1 (binding) > > > > -Bill > > > > On Sun, Jul 26, 2020 at 11:06 AM John Roesler wrote: > > > > > Thanks William, > > > > > > I’m +1 (binding) > > > > > > Thanks, > > > John > > > > > > On Fri, Jul 24, 2020, at 20:22, Sophie Blee-Goldman wrote: > > > > Thanks all, +1 (non-binding) > > > > > > > > Cheers, > > > > Sophie > > > > > > > > On Wed, Jul 8, 2020 at 4:02 AM Bruno Cadonna > wrote: > > > > > > > > > Thanks Will and Piotr, > > > > > > > > > > +1 (non-binding) > > > > > > > > > > Best, > > > > > Bruno > > > > > > > > > > On Wed, Jul 8, 2020 at 8:12 AM Matthias J. Sax > > > wrote: > > > > > > > > > > > > Thanks for the KIP. > > > > > > > > > > > > +1 (binding) > > > > > > > > > > > > > > > > > > -Matthias > > > > > > > > > > > > On 7/7/20 11:48 AM, William Bottrell wrote: > > > > > > > Hi everyone, > > > > > > > > > > > > > > I'd like to start a vote for adding two new time API's to > > > > > ProcessorContext. > > > > > > > > > > > > > > Add currentSystemTimeMs and currentStreamTimeMs to > ProcessorContext > > > > > > > < > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-622%3A+Add+currentSystemTimeMs+and+currentStreamTimeMs+to+ProcessorContext > > > > > > > > > > > > > > > > > > > > Thanks everyone for the initial feedback and thanks for your > time. > > > > > > > > > > > > > > > > > > > > > > > > > > > >
[jira] [Resolved] (KAFKA-10871) StreamTask shouldn't take WallClockTime as input parameter in process method
[ https://issues.apache.org/jira/browse/KAFKA-10871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Deshpande resolved KAFKA-10871. - Resolution: Not A Bug > StreamTask shouldn't take WallClockTime as input parameter in process method > > > Key: KAFKA-10871 > URL: https://issues.apache.org/jira/browse/KAFKA-10871 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 2.7.0 > Reporter: Rohit Deshpande >Assignee: Rohit Deshpande >Priority: Major > Labels: newbie > > While working on https://issues.apache.org/jira/browse/KAFKA-10062 I realized > process method in StreamTask is taking > wallClockTime as input parameter which is redundant as StreamTask already > contains > time(https://github.com/apache/kafka/blob/2.5.0/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamTask.java#L75) > field which represents wallClockTime. > In process method > (https://github.com/apache/kafka/blob/2.7/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamTask.java#L664), > wallClockTime can be passed from StreamTask's time field itself. > This method was changed as part of pr: > https://github.com/apache/kafka/pull/7997. > As part of https://issues.apache.org/jira/browse/KAFKA-10062, I believe > wallClockTime need not be stored in > ProcessorContext(https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/AbstractProcessorContext.java#L48) > but should be fetched from StreamTask's time field. Reference pr: > https://github.com/apache/kafka/pull/9744 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-10871) StreamTask shouldn't take WallClockTime as input parameter in process method
Rohit Deshpande created KAFKA-10871: --- Summary: StreamTask shouldn't take WallClockTime as input parameter in process method Key: KAFKA-10871 URL: https://issues.apache.org/jira/browse/KAFKA-10871 Project: Kafka Issue Type: Bug Components: streams Affects Versions: 2.7.0 Reporter: Rohit Deshpande Assignee: Rohit Deshpande While working on https://issues.apache.org/jira/browse/KAFKA-10062 I realized process method in StreamTask is taking wallClockTime as input parameter which is redundant as StreamTask already contains time(https://github.com/apache/kafka/blob/2.5.0/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamTask.java#L75) field which represents wallClockTime. In process method (https://github.com/apache/kafka/blob/2.7/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamTask.java#L664), wallClockTime can be passed from StreamTask's time field itself. This method was changed as part of pr: https://github.com/apache/kafka/pull/7997. As part of https://issues.apache.org/jira/browse/KAFKA-10062, I believe wallClockTime need not be stored in ProcessorContext(https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/AbstractProcessorContext.java#L48) but should be fetched from StreamTask's time field. Reference pr: https://github.com/apache/kafka/pull/9744 -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [VOTE] KIP-622 Add currentSystemTimeMs and currentStreamTimeMs to ProcessorContext
Hello all, I am closing the vote for this KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-622%3A+Add+currentSystemTimeMs+and+currentStreamTimeMs+to+ProcessorContext Summary of the KIP: Planning to add two new methods to ProcessorContext: 1. long currentSystemTimeMs() to fetch wall-clock time 2. long currentStreamTimeMs() to fetch maximum timestamp of any record yet processed by the task Thanks, Rohit On 2020/12/01 16:09:54, Bill Bejeck wrote: > Sorry for jumping into this so late, > > Thanks for the KIP, I'm a +1 (binding) > > -Bill > > On Sun, Jul 26, 2020 at 11:06 AM John Roesler wrote: > > > Thanks William, > > > > I’m +1 (binding) > > > > Thanks, > > John > > > > On Fri, Jul 24, 2020, at 20:22, Sophie Blee-Goldman wrote: > > > Thanks all, +1 (non-binding) > > > > > > Cheers, > > > Sophie > > > > > > On Wed, Jul 8, 2020 at 4:02 AM Bruno Cadonna wrote: > > > > > > > Thanks Will and Piotr, > > > > > > > > +1 (non-binding) > > > > > > > > Best, > > > > Bruno > > > > > > > > On Wed, Jul 8, 2020 at 8:12 AM Matthias J. Sax > > wrote: > > > > > > > > > > Thanks for the KIP. > > > > > > > > > > +1 (binding) > > > > > > > > > > > > > > > -Matthias > > > > > > > > > > On 7/7/20 11:48 AM, William Bottrell wrote: > > > > > > Hi everyone, > > > > > > > > > > > > I'd like to start a vote for adding two new time API's to > > > > ProcessorContext. > > > > > > > > > > > > Add currentSystemTimeMs and currentStreamTimeMs to ProcessorContext > > > > > > < > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-622%3A+Add+currentSystemTimeMs+and+currentStreamTimeMs+to+ProcessorContext > > > > > > > > > > > > > > > > > Thanks everyone for the initial feedback and thanks for your time. > > > > > > > > > > > > > > > > > > > > >
Re: [VOTE] KIP-680: TopologyTestDriver should not require a Properties argument
Hello all, I am closing the vote as there are 3 binding votes. Summary of the change: Proposing to add two new constructors to TopologyTestDriver class. 1. One with only topology as parameter 2. Second one with topology and wall clock time as parameter Additional condition is we want to set randomized application id in stream config to avoid conflicts with tests running in parallel. Wiki for this change: link <https://cwiki.apache.org/confluence/display/KAFKA/KIP-680%3A+TopologyTestDriver+should+not+require+a+Properties+argument> Pull request: link <https://github.com/apache/kafka/pull/9660> Thanks, Rohit On Mon, Nov 23, 2020 at 9:58 AM Rohit Deshpande wrote: > Thanks John and Matthias. > Waiting for 1 more binding vote. > Thanks, > Rohit > > On Sat, Nov 21, 2020 at 11:01 AM Matthias J. Sax wrote: > >> +1 (binding) >> >> On 11/20/20 7:43 PM, John Roesler wrote: >> > Thanks again for the KIP, Rohit. >> > >> > I’m +1 (binding) >> > >> > Sorry, I missed your vote thread. >> > >> > -John >> > >> > On Fri, Nov 20, 2020, at 21:35, Rohit Deshpande wrote: >> >> Thanks Guozhang. >> >> Waiting for binding votes. >> >> Thanks, >> >> Rohit >> >> >> >> On Tue, Nov 17, 2020 at 10:13 AM Guozhang Wang >> wrote: >> >> >> >>> +1, thanks Rohit. >> >>> >> >>> >> >>> Guozhang >> >>> >> >>> On Sun, Nov 15, 2020 at 11:53 AM Rohit Deshpande < >> rohitdesh...@gmail.com> >> >>> wrote: >> >>> >> >>>> Hello all, >> >>>> I would like to start voting on KIP-680: TopologyTestDriver should >> not >> >>>> require a Properties argument. >> >>>> >> >>>> >> >>> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-680%3A+TopologyTestDriver+should+not+require+a+Properties+argument >> >>>> >> >>>> Discuss thread: >> >>>> >> >>>> >> >>> >> https://lists.apache.org/thread.html/r5d3d0afc6feb5e18ade47aefbd88534f1b19b2f550a14d33cbc7a0dd%40%3Cdev.kafka.apache.org%3E >> >>>> >> >>>> Jira for the KIP: >> >>>> https://issues.apache.org/jira/browse/KAFKA-10629 >> >>>> >> >>>> If we end up making changes, they will look like this: >> >>>> https://github.com/apache/kafka/compare/trunk...rohitrmd:KAFKA-10629 >> >>>> >> >>>> Thanks, >> >>>> Rohit >> >>>> >> >>> >> >>> >> >>> -- >> >>> -- Guozhang >> >>> >> >> >> >
Re: [DISCUSS] KIP-622 Add currentSystemTimeMs and currentStreamTimeMs to ProcessorContext
Hi, I would like to revive this KIP. 1. As per proposed solution, we want to add following method in ProcessorContext class /** * Returns current cached wall-clock system timestamp in milliseconds. * * @return the current cached wall-clock system timestamp in milliseconds */ long currentSystemTimeMs(); but InternalProcessorContext class already contains same method: https://github.com/guozhangwang/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalProcessorContext.java#L54 Will it make more sense to get rid of this method from InternalProcessorContext and add it to ProcessorContext? 2. I am thinking of adding one test in TopologyDriverTest where using currentSystemTimeMs(), Processor will determine what to do with incoming record by comparing its timestamp with wall clock time. Similarly we can have another test where we fetch streamTime and can take an action on incoming record. Thanks, Rohit On 2020/08/14 05:07:04, "John Roesler" wrote: > Thanks for the reply, Matthias,> > > I see what you mean. I suppose I was thinking that we would pass in the > cached system time, which is also what we’re proposing to add to the > ProcessorContext.> > > If you think there’s something about the timestamp extractor in particular > that would make people want more precision, then something like Time would do > the trick. Since it’s not a public API, maybe just ‘Supplier’ would be > appropriate.> > > But I also don’t want to bikeshed it. My only concern was that it’s awkward > to ask people to actually change their application code for testing. But > maybe in this case, an option is better than no option, and if people don’t > like it, we can always deprecate the mock extractor and evolve the interface > later. > > > So, I’m +1 either way.> > > Thanks,> > John> > > On Mon, Aug 3, 2020, at 16:28, Matthias J. Sax wrote:> > > Interesting proposal.> > > > > > However, it raises the question how the runtime would pass in the> > > `systemTime` parameter? To be accurate, we would need to call> > > `Time.milliseconds()` each time before we call the timestamp extractor.> > > This sound expensive and maybe the extractor does not even use this value.> > > > > > Or we only call `Time.milliseconds()` periodically (as we also do in our> > > runtime code) to make it cheap, however, we loose precision? Not sure if> > > we can make this trade-off for the user?> > > > > > Handing in the `Time` object itself might be another idea, however it> > > seems "dangerous" though, as it does not seem to be actually public API?> > > > > > Last, do we really think we need this feature? We never had a feature> > > request for it and I am not aware of any issue with the current> > > TimestampExtractor interface.> > > > > > It's always easier to add it later if there is real demand instead of> > > pro-actively changing it (and maybe the need to deprecate and remove> > > later) with no clear benefit? Adding the `MockTimestampsExtractor` as> > > part of the test-utils package seems less "dangerous" and should do the> > > job, allowing us to collect feedback. If it's not good enough, we can> > > still change the TimestampExtractor interface as a follow up?> > > > > > > > > -Matthias> > > > > > On 7/28/20 10:03 AM, John Roesler wrote:> > > > Thanks Matthias,> > > > > > > > This is a really good point. It might be a bummer> > > > to have to actually change the topology between> > > > testing and production. Do you think we can rather> > > > evolve the TimestampExtractor interface to let> > > > Streams pass in the current system time, along with> > > > the current record and the current partition time?> > > > > > > > For example, we could add a new method:> > > > long extract(> > > > ConsumerRecord record, > > > > long partitionTime,> > > > long systemTime> > > > );> > > > > > > > Then, Streams could pass in the current system > > > > time and TopologyTestDriver could pass the mocked> > > > time. Additionally, users who implement> > > > TimestampExtractor on their own would be able to> > > > deterministically unit-test their own implementation.> > > > > > > > It's always a challenge to add to an interface without> > > > breaking compatibility. In this case, it seems like> > > > we could provide a default implementation that just> > > > ignores the systemTime argument and calls> > > > extract(record, partitionTime) and also deprecate> > > > the existing method. Then custom implementations> > > > would get a deprecation warning telling them to> > > > implement the other method, and when we remove> > > > the deprecated extract(record, partitionTime), we can> > > > also drop the default implementation from the new> > > > method.> > > > > > > > Specifically, what do you think about:> > > > => > > > public interface TimestampExtractor {> > > > /*...> > > > * @deprecated
Re: [VOTE] KIP-680: TopologyTestDriver should not require a Properties argument
Thanks John and Matthias. Waiting for 1 more binding vote. Thanks, Rohit On Sat, Nov 21, 2020 at 11:01 AM Matthias J. Sax wrote: > +1 (binding) > > On 11/20/20 7:43 PM, John Roesler wrote: > > Thanks again for the KIP, Rohit. > > > > I’m +1 (binding) > > > > Sorry, I missed your vote thread. > > > > -John > > > > On Fri, Nov 20, 2020, at 21:35, Rohit Deshpande wrote: > >> Thanks Guozhang. > >> Waiting for binding votes. > >> Thanks, > >> Rohit > >> > >> On Tue, Nov 17, 2020 at 10:13 AM Guozhang Wang > wrote: > >> > >>> +1, thanks Rohit. > >>> > >>> > >>> Guozhang > >>> > >>> On Sun, Nov 15, 2020 at 11:53 AM Rohit Deshpande < > rohitdesh...@gmail.com> > >>> wrote: > >>> > >>>> Hello all, > >>>> I would like to start voting on KIP-680: TopologyTestDriver should not > >>>> require a Properties argument. > >>>> > >>>> > >>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-680%3A+TopologyTestDriver+should+not+require+a+Properties+argument > >>>> > >>>> Discuss thread: > >>>> > >>>> > >>> > https://lists.apache.org/thread.html/r5d3d0afc6feb5e18ade47aefbd88534f1b19b2f550a14d33cbc7a0dd%40%3Cdev.kafka.apache.org%3E > >>>> > >>>> Jira for the KIP: > >>>> https://issues.apache.org/jira/browse/KAFKA-10629 > >>>> > >>>> If we end up making changes, they will look like this: > >>>> https://github.com/apache/kafka/compare/trunk...rohitrmd:KAFKA-10629 > >>>> > >>>> Thanks, > >>>> Rohit > >>>> > >>> > >>> > >>> -- > >>> -- Guozhang > >>> > >> >
Re: [VOTE] KIP-680: TopologyTestDriver should not require a Properties argument
Thanks Guozhang. Waiting for binding votes. Thanks, Rohit On Tue, Nov 17, 2020 at 10:13 AM Guozhang Wang wrote: > +1, thanks Rohit. > > > Guozhang > > On Sun, Nov 15, 2020 at 11:53 AM Rohit Deshpande > wrote: > > > Hello all, > > I would like to start voting on KIP-680: TopologyTestDriver should not > > require a Properties argument. > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-680%3A+TopologyTestDriver+should+not+require+a+Properties+argument > > > > Discuss thread: > > > > > https://lists.apache.org/thread.html/r5d3d0afc6feb5e18ade47aefbd88534f1b19b2f550a14d33cbc7a0dd%40%3Cdev.kafka.apache.org%3E > > > > Jira for the KIP: > > https://issues.apache.org/jira/browse/KAFKA-10629 > > > > If we end up making changes, they will look like this: > > https://github.com/apache/kafka/compare/trunk...rohitrmd:KAFKA-10629 > > > > Thanks, > > Rohit > > > > > -- > -- Guozhang >
[VOTE] KIP-680: TopologyTestDriver should not require a Properties argument
Hello all, I would like to start voting on KIP-680: TopologyTestDriver should not require a Properties argument. https://cwiki.apache.org/confluence/display/KAFKA/KIP-680%3A+TopologyTestDriver+should+not+require+a+Properties+argument Discuss thread: https://lists.apache.org/thread.html/r5d3d0afc6feb5e18ade47aefbd88534f1b19b2f550a14d33cbc7a0dd%40%3Cdev.kafka.apache.org%3E Jira for the KIP: https://issues.apache.org/jira/browse/KAFKA-10629 If we end up making changes, they will look like this: https://github.com/apache/kafka/compare/trunk...rohitrmd:KAFKA-10629 Thanks, Rohit
Re: [DISCUSS] KIP-680: TopologyTestDriver should not require a Properties argument
Hi John and Matthias, I have updated the wiki with your suggestions. https://cwiki.apache.org/confluence/display/KAFKA/KIP-680%3A+TopologyTestDriver+should+not+require+a+Properties+argument Thanks, Rohit On Fri, Nov 6, 2020 at 3:23 PM Rohit Deshpande wrote: > Thanks John, I will go ahead and update the KIP with a randomized > application id requirement. > > On Fri, Nov 6, 2020 at 3:12 PM John Roesler wrote: > >> Hi Rohit, >> >> Ah, indeed, that was my mistake. I made a bad assumption about the code. >> >> Since we are already cleaning up, then I’d suggest only that we might >> generate a randomized application id so that concurrent tests won’t >> interfere with each other. But this is sounding like a minor implementation >> note, not a concern for the KIP. >> >> The proposal looks good to me. >> >> Thanks again, >> John >> >> On Fri, Nov 6, 2020, at 16:54, Rohit Deshpande wrote: >> > Hi John, >> > Thank you for your review and the feedback. >> > >> > In existing method TTD.close(), stateDirectory.clean() >> > < >> https://github.com/apache/kafka/blob/trunk/streams/test-utils/src/main/java/org/apache/kafka/streams/TopologyTestDriver.java#L1193 >> > >> > method is getting called which is cleaning up task and global >> > directories >> > < >> https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/StateDirectory.java#L285-L291 >> >. >> > If RocksDB directories are not getting cleaned up in that close method, >> > would like to hear about how we can clean them up in that method. >> > Currently default value of state_directory is set to /tmp/kafka-streams >> > < >> https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java#L605 >> > >> > so >> > I am not setting it's value explicitly in proposed no argument >> > constructor. >> > Does the directory have to be unique in each test? If yes, then I agree >> > that we can tackle RocksDb directories cleanup and creating unique >> > directory tasks in separate KIP. >> > >> > Thanks, >> > Rohit >> > >> > >> > On Fri, Nov 6, 2020 at 7:12 AM John Roesler >> wrote: >> > >> > > Hello Rohit, >> > > >> > > Thanks for picking this up! I think your KIP looks good. >> > > >> > > While I was doing some cleanup of our tests before, one thing I >> > > encountered is that, while most tests don’t semantically need to >> specify >> > > any configs, many tests actually do set the state directory config. >> They >> > > set it specifically so that they can delete it at the end of the test. >> > > Otherwise, the tests would leave RocksDB directories behind. >> > > >> > > I’m wondering if we should address this issue as part of your KIP. >> What >> > > I’m thinking is this: if no state directory is specified, then we >> create a >> > > new, unique temp directory and register it for cleanup when the JVM >> exits. >> > > Additionally, we would set a flag and clean up the state dir when >> > > TTD.close() is called. >> > > >> > > That way, TTD tests would be by default independent and tidy. >> > > >> > > Admittedly, this is outside the current scope of your KIP, so please >> feel >> > > free to reject this idea, in which case I can file a separate ticket >> for >> > > it. >> > > >> > > Thanks! >> > > -John >> > > >> > > On Tue, Nov 3, 2020, at 18:59, Rohit Deshpande wrote: >> > > > Hi Matthias, >> > > > Thank you for the review and the suggestion. >> > > > Considering at most 3 parameters to the constructor of >> > > > TopologyTestDriver >> > > > and topology being required parameter, we can definitely add a new >> > > > constructor `TopologyTestDriver(Topology, Instant)` . >> > > > Right now, I see one test where we can use this constructor: >> > > > >> > > >> https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/kstream/internals/KStreamTransformTest.java#L80-L83 >> > > > Also we can get rid of this method in TestDriver trait: >> > > > >> > > >> https://github.com/apache/kafka/blob/trunk/streams/streams-scala/src/test/sca
Re: [DISCUSS] KIP-680: TopologyTestDriver should not require a Properties argument
Thanks John, I will go ahead and update the KIP with a randomized application id requirement. On Fri, Nov 6, 2020 at 3:12 PM John Roesler wrote: > Hi Rohit, > > Ah, indeed, that was my mistake. I made a bad assumption about the code. > > Since we are already cleaning up, then I’d suggest only that we might > generate a randomized application id so that concurrent tests won’t > interfere with each other. But this is sounding like a minor implementation > note, not a concern for the KIP. > > The proposal looks good to me. > > Thanks again, > John > > On Fri, Nov 6, 2020, at 16:54, Rohit Deshpande wrote: > > Hi John, > > Thank you for your review and the feedback. > > > > In existing method TTD.close(), stateDirectory.clean() > > < > https://github.com/apache/kafka/blob/trunk/streams/test-utils/src/main/java/org/apache/kafka/streams/TopologyTestDriver.java#L1193 > > > > method is getting called which is cleaning up task and global > > directories > > < > https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/StateDirectory.java#L285-L291 > >. > > If RocksDB directories are not getting cleaned up in that close method, > > would like to hear about how we can clean them up in that method. > > Currently default value of state_directory is set to /tmp/kafka-streams > > < > https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java#L605 > > > > so > > I am not setting it's value explicitly in proposed no argument > > constructor. > > Does the directory have to be unique in each test? If yes, then I agree > > that we can tackle RocksDb directories cleanup and creating unique > > directory tasks in separate KIP. > > > > Thanks, > > Rohit > > > > > > On Fri, Nov 6, 2020 at 7:12 AM John Roesler wrote: > > > > > Hello Rohit, > > > > > > Thanks for picking this up! I think your KIP looks good. > > > > > > While I was doing some cleanup of our tests before, one thing I > > > encountered is that, while most tests don’t semantically need to > specify > > > any configs, many tests actually do set the state directory config. > They > > > set it specifically so that they can delete it at the end of the test. > > > Otherwise, the tests would leave RocksDB directories behind. > > > > > > I’m wondering if we should address this issue as part of your KIP. What > > > I’m thinking is this: if no state directory is specified, then we > create a > > > new, unique temp directory and register it for cleanup when the JVM > exits. > > > Additionally, we would set a flag and clean up the state dir when > > > TTD.close() is called. > > > > > > That way, TTD tests would be by default independent and tidy. > > > > > > Admittedly, this is outside the current scope of your KIP, so please > feel > > > free to reject this idea, in which case I can file a separate ticket > for > > > it. > > > > > > Thanks! > > > -John > > > > > > On Tue, Nov 3, 2020, at 18:59, Rohit Deshpande wrote: > > > > Hi Matthias, > > > > Thank you for the review and the suggestion. > > > > Considering at most 3 parameters to the constructor of > > > > TopologyTestDriver > > > > and topology being required parameter, we can definitely add a new > > > > constructor `TopologyTestDriver(Topology, Instant)` . > > > > Right now, I see one test where we can use this constructor: > > > > > > > > https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/kstream/internals/KStreamTransformTest.java#L80-L83 > > > > Also we can get rid of this method in TestDriver trait: > > > > > > > > https://github.com/apache/kafka/blob/trunk/streams/streams-scala/src/test/scala/org/apache/kafka/streams/scala/utils/TestDriver.scala#L32-L35 > > > > which is used in multiple test classes and seems redundant. I agree > with > > > > your suggestion. > > > > Thanks, > > > > Rohit > > > > > > > > On Tue, Nov 3, 2020 at 3:19 PM Matthias J. Sax > wrote: > > > > > > > > > Thanks for the KIP Rohit. > > > > > > > > > > Wondering, if we should also add `TopologyTestDriver(Topology, > > > > > Instant)`? Not totally sure, as having too many overload could > also be > > > > > annoying. > > > > > > > > > > > > > > > -Matthias > > > > > > > > > > On 11/3/20 10:02 AM, Rohit Deshpande wrote: > > > > > > Hello all, > > > > > > I have created KIP-680: TopologyTestDriver should not require a > > > > > Properties > > > > > > argument. > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-680%3A+TopologyTestDriver+should+not+require+a+Properties+argument > > > > > > > > > > > > Jira for the KIP: > > > > > > https://issues.apache.org/jira/browse/KAFKA-10629 > > > > > > > > > > > > If we end up making changes, they will look like this: > > > > > > > https://github.com/apache/kafka/compare/trunk...rohitrmd:KAFKA-10629 > > > > > > > > > > > > Please have a look and let me know what you think. > > > > > > > > > > > > Thanks, > > > > > > Rohit > > > > > > > > > > > > > > > > > > > > >
Re: [DISCUSS] KIP-680: TopologyTestDriver should not require a Properties argument
Hi John, Thank you for your review and the feedback. In existing method TTD.close(), stateDirectory.clean() <https://github.com/apache/kafka/blob/trunk/streams/test-utils/src/main/java/org/apache/kafka/streams/TopologyTestDriver.java#L1193> method is getting called which is cleaning up task and global directories <https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/StateDirectory.java#L285-L291>. If RocksDB directories are not getting cleaned up in that close method, would like to hear about how we can clean them up in that method. Currently default value of state_directory is set to /tmp/kafka-streams <https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java#L605> so I am not setting it's value explicitly in proposed no argument constructor. Does the directory have to be unique in each test? If yes, then I agree that we can tackle RocksDb directories cleanup and creating unique directory tasks in separate KIP. Thanks, Rohit On Fri, Nov 6, 2020 at 7:12 AM John Roesler wrote: > Hello Rohit, > > Thanks for picking this up! I think your KIP looks good. > > While I was doing some cleanup of our tests before, one thing I > encountered is that, while most tests don’t semantically need to specify > any configs, many tests actually do set the state directory config. They > set it specifically so that they can delete it at the end of the test. > Otherwise, the tests would leave RocksDB directories behind. > > I’m wondering if we should address this issue as part of your KIP. What > I’m thinking is this: if no state directory is specified, then we create a > new, unique temp directory and register it for cleanup when the JVM exits. > Additionally, we would set a flag and clean up the state dir when > TTD.close() is called. > > That way, TTD tests would be by default independent and tidy. > > Admittedly, this is outside the current scope of your KIP, so please feel > free to reject this idea, in which case I can file a separate ticket for > it. > > Thanks! > -John > > On Tue, Nov 3, 2020, at 18:59, Rohit Deshpande wrote: > > Hi Matthias, > > Thank you for the review and the suggestion. > > Considering at most 3 parameters to the constructor of > > TopologyTestDriver > > and topology being required parameter, we can definitely add a new > > constructor `TopologyTestDriver(Topology, Instant)` . > > Right now, I see one test where we can use this constructor: > > > https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/kstream/internals/KStreamTransformTest.java#L80-L83 > > Also we can get rid of this method in TestDriver trait: > > > https://github.com/apache/kafka/blob/trunk/streams/streams-scala/src/test/scala/org/apache/kafka/streams/scala/utils/TestDriver.scala#L32-L35 > > which is used in multiple test classes and seems redundant. I agree with > > your suggestion. > > Thanks, > > Rohit > > > > On Tue, Nov 3, 2020 at 3:19 PM Matthias J. Sax wrote: > > > > > Thanks for the KIP Rohit. > > > > > > Wondering, if we should also add `TopologyTestDriver(Topology, > > > Instant)`? Not totally sure, as having too many overload could also be > > > annoying. > > > > > > > > > -Matthias > > > > > > On 11/3/20 10:02 AM, Rohit Deshpande wrote: > > > > Hello all, > > > > I have created KIP-680: TopologyTestDriver should not require a > > > Properties > > > > argument. > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-680%3A+TopologyTestDriver+should+not+require+a+Properties+argument > > > > > > > > Jira for the KIP: > > > > https://issues.apache.org/jira/browse/KAFKA-10629 > > > > > > > > If we end up making changes, they will look like this: > > > > https://github.com/apache/kafka/compare/trunk...rohitrmd:KAFKA-10629 > > > > > > > > Please have a look and let me know what you think. > > > > > > > > Thanks, > > > > Rohit > > > > > > > > > >
Re: [DISCUSS] KIP-680: TopologyTestDriver should not require a Properties argument
Hi Matthias, Thank you for the review and the suggestion. Considering at most 3 parameters to the constructor of TopologyTestDriver and topology being required parameter, we can definitely add a new constructor `TopologyTestDriver(Topology, Instant)` . Right now, I see one test where we can use this constructor: https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/kstream/internals/KStreamTransformTest.java#L80-L83 Also we can get rid of this method in TestDriver trait: https://github.com/apache/kafka/blob/trunk/streams/streams-scala/src/test/scala/org/apache/kafka/streams/scala/utils/TestDriver.scala#L32-L35 which is used in multiple test classes and seems redundant. I agree with your suggestion. Thanks, Rohit On Tue, Nov 3, 2020 at 3:19 PM Matthias J. Sax wrote: > Thanks for the KIP Rohit. > > Wondering, if we should also add `TopologyTestDriver(Topology, > Instant)`? Not totally sure, as having too many overload could also be > annoying. > > > -Matthias > > On 11/3/20 10:02 AM, Rohit Deshpande wrote: > > Hello all, > > I have created KIP-680: TopologyTestDriver should not require a > Properties > > argument. > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-680%3A+TopologyTestDriver+should+not+require+a+Properties+argument > > > > Jira for the KIP: > > https://issues.apache.org/jira/browse/KAFKA-10629 > > > > If we end up making changes, they will look like this: > > https://github.com/apache/kafka/compare/trunk...rohitrmd:KAFKA-10629 > > > > Please have a look and let me know what you think. > > > > Thanks, > > Rohit > > >
[DISCUSS] KIP-680: TopologyTestDriver should not require a Properties argument
Hello all, I have created KIP-680: TopologyTestDriver should not require a Properties argument. https://cwiki.apache.org/confluence/display/KAFKA/KIP-680%3A+TopologyTestDriver+should+not+require+a+Properties+argument Jira for the KIP: https://issues.apache.org/jira/browse/KAFKA-10629 If we end up making changes, they will look like this: https://github.com/apache/kafka/compare/trunk...rohitrmd:KAFKA-10629 Please have a look and let me know what you think. Thanks, Rohit
Re: Permission for contribution
Hi Boyang, My wiki id is rohitdeshaws Thanks, Rohit On Thu, Oct 29, 2020 at 1:59 PM Rohit Deshpande wrote: > I don't know how to create wiki id. I am following this documentation: > https://kafka.apache.org/contributing > > On Thu, Oct 29, 2020 at 1:40 PM Boyang Chen > wrote: > >> Added you to the JIRA. What's your id for wiki? >> >> On Thu, Oct 29, 2020 at 1:23 PM rohit deshpande < >> rohitmdeshpa...@gmail.com> >> wrote: >> >> > I forgot to provide details. >> > >> > My email id: rohitmdeshpa...@gmail.com >> > This request is regarding code contribution. >> > Thanks, >> > Rohit >> > >> > On Thu, Oct 29, 2020 at 11:58 AM rohit deshpande < >> > rohitmdeshpa...@gmail.com> >> > wrote: >> > >> > > Hi, >> > > I would like to get permission for the contribution. >> > > My jira id: rohitdeshaws >> > > >> > > Thanks, >> > > Rohit >> > > >> > >> >
Re: Permission for contribution
I don't know how to create wiki id. I am following this documentation: https://kafka.apache.org/contributing On Thu, Oct 29, 2020 at 1:40 PM Boyang Chen wrote: > Added you to the JIRA. What's your id for wiki? > > On Thu, Oct 29, 2020 at 1:23 PM rohit deshpande > > wrote: > > > I forgot to provide details. > > > > My email id: rohitmdeshpa...@gmail.com > > This request is regarding code contribution. > > Thanks, > > Rohit > > > > On Thu, Oct 29, 2020 at 11:58 AM rohit deshpande < > > rohitmdeshpa...@gmail.com> > > wrote: > > > > > Hi, > > > I would like to get permission for the contribution. > > > My jira id: rohitdeshaws > > > > > > Thanks, > > > Rohit > > > > > >
Re: Permission for contribution
I forgot to provide details. My email id: rohitmdeshpa...@gmail.com This request is regarding code contribution. Thanks, Rohit On Thu, Oct 29, 2020 at 11:58 AM rohit deshpande wrote: > Hi, > I would like to get permission for the contribution. > My jira id: rohitdeshaws > > Thanks, > Rohit >
Permission for contribution
Hi, I would like to get permission for the contribution. My jira id: rohitdeshaws Thanks, Rohit