Hey John,

Re: Mock Processor Context:

That's a good point, I'm convinced that we should keep them as two classes.


Re: test-utils module:

I think I agree with your proposed changes, in fact in order to not scatter
the test classes in two places maybe it's better to move all of them to the
new module. One caveat is that it will make streams' project hierarchy
inconsistent with other projects where the unit test classes are maintained
inside the main artifact package, but I think it is a good cost to pay,
plus once we start publishing test-util artifacts for other projects like
client and connect, we may face the same issue and need to do this
refactoring as well.



Guozhang




On Fri, Mar 9, 2018 at 9:54 AM, John Roesler <j...@confluent.io> wrote:

> Hi Guozhang and Bill,
>
> I'll summarize what I'm currently thinking in light of all the discussion:
>
> Mock Processor Context:
> ===================
>
> Here's how I see the use cases for the two mocks differing:
>
> 1. o.a.k.test.MPC: Crafted for testing Streams use cases. Implements
> AbstractProcessorContext, actually forward to child processor nodes, allow
> restoring a state store. Most importantly, the freedom to do stuff
> convenient for our tests without impacting anyone.
>
> 2. (test-utils) MPC: Crafted for testing community Processors (and
> friends). Very flat and simple implementation (so people can read it in one
> sitting); i.e., doesn't drag in other data models like RecordContext. Test
> one processor in isolation, so generally don't bother with complex logic
> like scheduling punctuators, forwarding results, or restoring state stores.
> Most importantly, an API that can be stable.
>
> So, I really am leaning toward keeping both implementations. I like Bill's
> suggestion of renaming the unit testing class to
> InternalMockProcessorContext, since having classes with the same name in
> different packages is confusing. I look forward to the day when Java 9
> takes off and we can actually hide internal classes from the public
> interface.
>
> test-utils module:
> =============
>
> This is actually out of scope for this KIP if we keep both MPC
> implementations, but it has been a major feature of this discussion, so we
> may as well see it though.
>
> I've waffled a bit on this point, but right now I would propose we
> restructure the streams directory thusly:
>
> streams/ (artifact name := "streams", the actual streams code lives here)
> - test-utils/ (this is the current test-utils artifact, depends on
> "streams")
> - tests/ (new module, depends on "streams" and "test-utils", *NO published
> artifact*)
>
> This gets us out of the circular dependency without having to engage in any
> Gradle shenanigans while preserving "test-utils" as a separate artifact.
> This is good because: 1) the test-utils don't need to be in production
> code, so it's nice to have a separate artifact, 2) test-utils is already
> public in 1.1, and it's a bummer to introduce users' code when we can so
> easily avoid it.
>
> Note, though, that if we agree to keep both MPC implementations, then this
> really is just important for rewriting our tests to use TopologyTestDriver,
> and in fact only the tests that need it should move to "streams/tests/".
>
> What say you?
>
> -John
>
> On Fri, Mar 9, 2018 at 9:01 AM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > Hmm.. it seems to be a general issue then, since we were planning to also
> > replace the KStreamTestDriver and ProcessorTopologyTestDriver with the
> new
> > TopologyTestDriver soon, so if the argument that testing dependency could
> > still cause circular dependencies holds it means we cannot do that as
> well.
> >
> > My understanding on gradle dependencies has been that test dependencies
> are
> > not required to compile when compiling the project, but only required
> when
> > testing the project; and the way we script gradle follows the way that
> for
> > any test tasks of the project we require compiling it first so this is
> > fine. John / Bill, could you elaborate a bit more on the maintenance
> > complexity concerns?
> >
> >
> > Guozhang
> >
> > On Fri, Mar 9, 2018 at 7:40 AM, Bill Bejeck <bbej...@gmail.com> wrote:
> >
> > > John,
> > >
> > > Sorry for the delayed response.  Thanks for the KIP, I'm +1 on it, and
> I
> > > don't have any further comments on the KIP itself aside from the
> comments
> > > that others have raised.
> > >
> > > Regarding the existing MockProcessorContext and its removal in favor of
> > the
> > > one added from this KIP, I'm actually in favor of keeping both.
> > >
> > > IMHO it's reasonable to have both because the testing requirements are
> > > different.  Most users are trying to verify their logic works as
> expected
> > > within a Kafka Streams application and aren't concerned (or shouldn't
> be
> > at
> > > least, again IMHO) with testing Kafka Streams itself, that is the
> > > responsibility of the Kafka Streams developers and contributors.
> > >
> > > However, for the developers and contributors of Kafka Streams, the need
> > to
> > > test the internals of how Streams works is the primary concern and
> could
> > at
> > > times require different logic or available methods from a given mock
> > > object.
> > >
> > > I have a couple of thoughts on mitigation of having two
> > > MockProcessorContext objects
> > >
> > >    1. Leave the current MockProcessorContext in the o.a.k.test package
> > but
> > >    rename it to InternalMockProcessorContext and add some documentation
> > as
> > > to
> > >    why it's there.
> > >    2. Create a new package under o.a.k.test, called internals and move
> > the
> > >    existing MockProcessorContext there, but that would require a change
> > to
> > > the
> > >    visibility of the MockProcessorContext#allStateStores() method to
> > > public.
> > >
> > > Just wanted to throw in my 2 cents.
> > >
> > > Thanks,
> > > Bill
> > >
> > > On Thu, Mar 8, 2018 at 11:51 PM, John Roesler <j...@confluent.io>
> wrote:
> > >
> > > > I think what you're suggesting is to:
> > > > 1. compile the main streams code, but not the tests
> > > > 2. compile test-utils (and compile and run the test-utils tests)
> > > > 3. compile and run the streams tests
> > > >
> > > > This works in theory, since the test-utils depends on the main
> streams
> > > > code, but not the streams tests. and the streams tests depend on
> > > test-utils
> > > > while the main streams code does not.
> > > >
> > > > But after poking around a bit and reading up on it, I think this is
> not
> > > > possible, or at least not mainstream.
> > > >
> > > > The issue is that dependencies are formed between projects, in this
> > case
> > > > streams and streams:test-utils. The upstream project must be built
> > before
> > > > the dependant one, regardless of whether the dependency is for
> > compiling
> > > > the main code or the test code. This means we do have a circular
> > > dependency
> > > > on our hands if we want the tests in streams to use the test-utils,
> > since
> > > > they'd both have to be built before the other.
> > > >
> > > > Gradle seems to be quite scriptable, so there may be some way to
> > achieve
> > > > this, but increasing the complexity of the build also introduces a
> > > project
> > > > maintenance concern.
> > > >
> > > > The MockProcessorContext itself is pretty simple, so I'm tempted to
> > argue
> > > > that we should just have one for internal unit tests and another for
> > > > test-utils, however this situation also afflicts KAFKA-6474
> > > > <https://issues.apache.org/jira/browse/KAFKA-6474>, and the
> > > > TopologyTestDriver is not so trivial.
> > > >
> > > > I think the best thing at this point is to go ahead and fold the
> > > test-utils
> > > > into the streams project. We can put it into a separate "testutils"
> > > package
> > > > to make it easy to identify which code is for test support and which
> > code
> > > > is Kafka Streams. The biggest bummer about this suggestion is that it
> > we
> > > > *just* introduced the test-utils artifact, so folks would to add that
> > > > artifact in 1.1 to write their tests and then have to drop it again
> in
> > > 1.2.
> > > >
> > > > The other major solution is to create a new gradle project for the
> > > streams
> > > > unit tests, which depends on streams and test-utils and move all the
> > > > streams unit tests there. I'm pretty sure we can configure gradle
> just
> > to
> > > > include this project for running tests and not actually package any
> > > > artifacts. This structure basically expresses your observation that
> the
> > > > test code is essentially a separate module from the main streams
> code.
> > > >
> > > > Of course, I'm open to alternatives, especially if someone with more
> > > > experience in Gradle is aware of a solution.
> > > >
> > > > Thanks,
> > > > -John
> > > >
> > > >
> > > > On Thu, Mar 8, 2018 at 3:39 PM, Matthias J. Sax <
> matth...@confluent.io
> > >
> > > > wrote:
> > > >
> > > > > Isn't MockProcessorContext in o.a.k.test part of the unit-test
> > package
> > > > > but not the main package?
> > > > >
> > > > > This should resolve the dependency issue.
> > > > >
> > > > > -Matthias
> > > > >
> > > > > On 3/8/18 3:32 PM, John Roesler wrote:
> > > > > > Actually, replacing the MockProcessorContext in o.a.k.test could
> > be a
> > > > bit
> > > > > > tricky, since it would make the "streams" module depend on
> > > > > > "streams:test-utils", but "streams:test-utils" already depends on
> > > > > "streams".
> > > > > >
> > > > > > At first glance, it seems like the options are:
> > > > > > 1. leave the two separate implementations in place. This
> shouldn't
> > be
> > > > > > underestimated, especially since our internal tests may need
> > > different
> > > > > > things from a mocked P.C. than our API users.
> > > > > > 2. move the public testing artifacts into the regular streams
> > module
> > > > > > 3. move the unit tests for Streams into a third module that
> depends
> > > on
> > > > > both
> > > > > > streams and test-utils. Yuck!
> > > > > >
> > > > > > Thanks,
> > > > > > -John
> > > > > >
> > > > > > On Thu, Mar 8, 2018 at 3:16 PM, John Roesler <j...@confluent.io>
> > > > wrote:
> > > > > >
> > > > > >> Thanks for the review, Guozhang,
> > > > > >>
> > > > > >> In response:
> > > > > >> 1. I missed that! I'll look into it and update the KIP.
> > > > > >>
> > > > > >> 2. I was planning to use the real implementation, since folks
> > might
> > > > > >> register some metrics in the processors and want to verify the
> > > values
> > > > > that
> > > > > >> get recorded. If the concern is about initializing all the stuff
> > > > that's
> > > > > in
> > > > > >> the Metrics object, I can instantiate it lazily or even make it
> > > > > optional by
> > > > > >> taking a nullable constructor parameter.
> > > > > >>
> > > > > >> 3. Agreed. I think that's the real sharp edge here. I actually
> > think
> > > > it
> > > > > >> would be neat to auto-trigger those scheduled punctuators, but
> it
> > > > seems
> > > > > >> like that moves this component out of "mock" territory and into
> > > > "driver"
> > > > > >> territory. Since we already have the TopologyTestDriver, I'd
> > prefer
> > > to
> > > > > >> focus on keeping the mock lean. I agree it should be in the
> > javadoc
> > > as
> > > > > well
> > > > > >> as the web documentation.
> > > > > >>
> > > > > >> Thanks,
> > > > > >> -John
> > > > > >>
> > > > > >> On Thu, Mar 8, 2018 at 1:46 PM, Guozhang Wang <
> wangg...@gmail.com
> > >
> > > > > wrote:
> > > > > >>
> > > > > >>> Hello John,
> > > > > >>>
> > > > > >>> Thanks for the KIP. I made a pass over the wiki page and here
> are
> > > > some
> > > > > >>> comments:
> > > > > >>>
> > > > > >>> 1. Meta-comment: there is an internal class
> MockProcessorContext
> > > > under
> > > > > the
> > > > > >>> o.a.k.test package, which should be replaced as part of this
> KIP.
> > > > > >>>
> > > > > >>> 2. In @Override StreamsMetrics metrics(), will you return a
> fully
> > > > > created
> > > > > >>> StreamsMetricsImpl object or are you planning to use the
> > > > > >>> MockStreamsMetrics? Note that for the latter case you probably
> > need
> > > > to
> > > > > >>> look
> > > > > >>> into https://issues.apache.org/jira/browse/KAFKA-5676 as well.
> > > > > >>>
> > > > > >>> 3. Not related to the KIP changes themselves: about
> > > > > >>> "context.scheduledPunctuators": we need to well document that
> in
> > > the
> > > > > >>> MockProcessorContext the scheduled punctuator will never by
> > > > > >>> auto-triggered,
> > > > > >>> and hence it is only for testing people's code that some
> > > punctuators
> > > > > are
> > > > > >>> indeed registered, and if people want full auto punctuation
> > testing
> > > > > they
> > > > > >>> have to go with TopologyTestDriver.
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>> Guozhang
> > > > > >>>
> > > > > >>>
> > > > > >>> On Wed, Mar 7, 2018 at 8:04 PM, John Roesler <
> j...@confluent.io>
> > > > > wrote:
> > > > > >>>
> > > > > >>>> On Wed, Mar 7, 2018 at 8:03 PM, John Roesler <
> j...@confluent.io
> > >
> > > > > wrote:
> > > > > >>>>
> > > > > >>>>> Thanks Ted,
> > > > > >>>>>
> > > > > >>>>> Sure thing; I updated the example code in the KIP with a
> little
> > > > > >>> snippet.
> > > > > >>>>>
> > > > > >>>>> -John
> > > > > >>>>>
> > > > > >>>>> On Wed, Mar 7, 2018 at 7:18 PM, Ted Yu <yuzhih...@gmail.com>
> > > > wrote:
> > > > > >>>>>
> > > > > >>>>>> Looks good.
> > > > > >>>>>>
> > > > > >>>>>> See if you can add punctuator into the sample code.
> > > > > >>>>>>
> > > > > >>>>>> On Wed, Mar 7, 2018 at 7:10 PM, John Roesler <
> > j...@confluent.io
> > > >
> > > > > >>> wrote:
> > > > > >>>>>>
> > > > > >>>>>>> Dear Kafka community,
> > > > > >>>>>>>
> > > > > >>>>>>> I am proposing KIP-267 to augment the public Streams test
> > utils
> > > > > >>> API.
> > > > > >>>>>>> The goal is to simplify testing of Kafka Streams
> > applications.
> > > > > >>>>>>>
> > > > > >>>>>>> Please find details in the
> > > > > >>>>>>> wiki:https://cwiki.apache.org/
> confluence/display/KAFKA/KIP-
> > > > > >>>>>>> 267%3A+Add+Processor+Unit+Test+Support+to+Kafka+Streams+
> > > > Test+Utils
> > > > > >>>>>>>
> > > > > >>>>>>> An initial WIP PR can be found here:https://github.com/
> > > > > >>>>>>> apache/kafka/pull/4662
> > > > > >>>>>>>
> > > > > >>>>>>> I also included the user-list (please hit "reply-all" to
> > > include
> > > > > >>> both
> > > > > >>>>>>> lists in this KIP discussion).
> > > > > >>>>>>>
> > > > > >>>>>>> Thanks,
> > > > > >>>>>>>
> > > > > >>>>>>> -John
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>> --
> > > > > >>> -- Guozhang
> > > > > >>>
> > > > > >>
> > > > > >>
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang

Reply via email to