On Mon, Nov 7, 2016 at 10:30 AM Raghav Kumar Gautam <rag...@apache.org>
wrote:

> Hi Ewen,
>
> Thanks for the feedback. Answers are inlined.
>
> On Sun, Nov 6, 2016 at 8:46 PM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
>
> > Yeah, I'm all for getting these to run more frequently and on lighter
> > weight infrastructure. (By the way, I also saw the use of docker; I'd
> > really like to get a "native" docker cluster type into ducktape at some
> > point so all you have to do is bake the image and then spawn containers
> on
> > demand.)
> >
> I completely agree: supporting docker integration in ducktape would be the
> ideal solution to this problem.
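For what it's worth, here is a very rough sketch of the "bake the image, then
spawn containers on demand" idea using the docker Python SDK. The image and
network names are made up, and a real integration would expose these nodes
through ducktape's cluster abstraction rather than a standalone script:

    # Rough sketch only: spin up a fixed-size cluster of containers from a
    # pre-baked image with the docker Python SDK. The image and network names
    # below are made up for illustration.
    import docker

    NUM_NODES = 12
    IMAGE = "kafka-systemtest-node:latest"   # hypothetical pre-baked image

    client = docker.from_env()
    client.networks.create("ducktape-net", driver="bridge")

    nodes = []
    for i in range(NUM_NODES):
        # Each container stands in for one cluster node; ducktape would
        # ssh/exec into it just as it does with a VM or bare-metal node today.
        nodes.append(client.containers.run(
            IMAGE,
            detach=True,
            name="ducktape-node-%02d" % i,
            hostname="ducktape-node-%02d" % i,
            network="ducktape-net",
        ))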
>
>
> >
> > A few things. First, it'd be nice to know if we can chain these with
> normal
> > PR builds or something like that. Even starting the system tests when we
> > don't know the unit tests will pass seems like it'd be wasteful.
> >
> One problem with chaining is that the turnaround time will suffer: it would
> take 1.5 hrs to run the unit tests and then another 1.5 hrs to run the
> ducktape tests. Also, don't devs run the relevant unit tests before they
> submit a patch?
>

Yeah, I get that. Turnaround time will obviously suffer from serializing
anything. The biggest problem today is that the Jenkins builds are not as
highly parallelized as the way most users run the tests locally, and the
large number of integration tests baked into the unit test suite means they
take quite a long time. While the local runtime has been creeping up quite a
bit recently, it's still under 15 min on a relatively recent MBP. Ideally we
could just get the Jenkins builds to finish faster...


> >
> > Second, agreed on getting things stable before turning this on across the
> > board.
>
> I have done some work toward stabilizing the tests, but I need help from the
> Kafka community to take this further. It would be great if someone could
> guide me on how to do this. Should we start with a subset of tests that are
> stable and enable the others as we make progress? Who are the people I can
> work with on this problem?
>

It'll probably be a variety of people because it depends on the components
that are unstable. For example, just among committers, different folks know
different areas of the code (and especially system tests) to different
degrees. I can probably help across the board in terms of ducktape/system
test stuff, but for any individual test you'll probably just want to git
blame to figure out who might be best to ask for help.

I can take a pass at this patch and see how much makes sense to commit
immediately. If we don't immediately start getting feedback on failing tests
on every PR, and can instead make progress by triggering the runs manually on
only some PRs or something like that, then that seems like it could be
reasonable.

My biggest concern, just taking a quick pass at the changes, is that we're
doing a lot of renaming of tests simply to split them up, rather than to
group them logically. If we need to split the suite, it seems much better to
add a small amount of tooling to ducktape to execute subsets of tests (e.g.
split the suite across N subsets). That requires more coordination between
ducktape development and landing this change, but it feels like a much
cleaner solution, and one that could eventually take advantage of additional
information (e.g. if it knows the average runtime of each test from previous
runs, it can divide the tests based on that instead of only considering the
number of tests).
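To make that concrete, here's a rough sketch of what such a splitter could
look like; this is illustrative only, not existing ducktape code, and the
weighting by historical runtime is optional:

    # Illustrative sketch only: divide the test suite into N roughly balanced
    # subsets, optionally weighted by average runtime from previous runs, and
    # falling back to a plain count-based split when no history is available.
    import heapq

    def split_tests(tests, num_subsets, avg_runtimes=None):
        """tests: list of test ids; avg_runtimes: optional {test_id: seconds}."""
        weight = lambda t: (avg_runtimes or {}).get(t, 1.0)
        # Greedy longest-processing-time assignment: heaviest tests first,
        # each one placed into the currently lightest subset.
        buckets = [(0.0, i, []) for i in range(num_subsets)]
        heapq.heapify(buckets)
        for test in sorted(tests, key=weight, reverse=True):
            load, idx, members = heapq.heappop(buckets)
            members.append(test)
            heapq.heappush(buckets, (load + weight(test), idx, members))
        return [members for _, _, members in sorted(buckets, key=lambda b: b[1])]

Each CI job would then run only split_tests(all_tests, N)[job_index], so no
test files would need to be renamed to force a particular grouping.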


> > Confluent runs these tests nightly on full VMs in AWS, and historically,
> > besides buggy logic in the tests, underprovisioned resources have tended
> > to be the biggest source of flakiness in tests.
> >
>  Good to know that I am not the only one worrying about this problem :-)
>
> > Finally, should we be checking w/ infra and/or Travis folks before enabling
> > something this expensive? Are the Storm integration tests of comparable
> > cost? There are some in-flight patches for parallelizing test runs of
> > ducktape tests (which also results in better utilization). But even with
> > those changes, the full test run is still quite a few VM-hours per PR and
> > we only expect it to increase.
> >
> We can ask the infra people about this, but I don't think it will be a
> problem. For example, Flink
> <https://travis-ci.org/apache/flink/builds/173852382> is using 11 hrs of
> computation time for each run; for Kafka we are going to start with 6 hrs.
> Also, with the docker setup we can bring up the whole 12-node cluster on a
> laptop and run the ducktape tests against it, so test development cycles
> will become faster.
>

Sure, it's just that over time this tends to lead to the current state of
our Jenkins builds, where it can take many hours before you get any feedback
because things are so backed up.

-Ewen


>
> With Regards,
> Raghav.
>
>
>
> >
> > -Ewen
> >
> > On Thu, Nov 3, 2016 at 11:26 AM, Becket Qin <becket....@gmail.com>
> wrote:
> >
> > > Thanks for the explanation, Raghav.
> > >
> > > If the workload is not a concern then it is probably fine to run tests
> > for
> > > each PR update, although it may not be necessary :)
> > >
> > > On Thu, Nov 3, 2016 at 10:40 AM, Raghav Kumar Gautam <
> rag...@apache.org>
> > > wrote:
> > >
> > > > Hi Becket,
> > > >
> > > > The tests would be run each time a PR is created/updated; this will
> > > > look similar to https://github.com/apache/storm/pulls. The ducktape
> > > > tests take about 7-8 hours to run on my laptop. For travis-ci we can
> > > > split them into groups and run them in parallel. This was done in the
> > > > POC run, which took 1.5 hrs with 10 splits and 5 jobs running in
> > > > parallel.
> > > > https://travis-ci.org/raghavgautam/kafka/builds/171502069
> > > > For Apache projects the limit is 30 parallel jobs, shared across all
> > > > projects, so I expect it to take less time, but it also depends on the
> > > > workload at the time.
> > > > https://blogs.apache.org/infra/entry/apache_gains_additional_travis_ci
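Just to make the arithmetic behind those numbers explicit (rounding the 7-8
hour serial runtime to 7.5 hours):

    # Back-of-the-envelope check of the POC timing (all figures approximate).
    serial_hours = 7.5      # ~7-8 hours to run the ducktape suite serially
    splits = 10             # suite divided into 10 Travis jobs
    parallel_jobs = 5       # at most 5 jobs running at once
    per_split = serial_hours / splits      # ~0.75 h per split
    waves = -(-splits // parallel_jobs)    # ceil(10 / 5) = 2 waves of jobs
    print(waves * per_split)               # ~1.5 h wall clock, as observed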
> > > >
> > > > Thanks,
> > > > Raghav.
> > > >
> > > > On Thu, Nov 3, 2016 at 9:41 AM, Becket Qin <becket....@gmail.com>
> > wrote:
> > > >
> > > > > Thanks Raghav,
> > > > >
> > > > > +1 for the idea in general.
> > > > >
> > > > > One thing I am wondering is when the tests would be run. Would they
> > > > > be run when we merge a PR, or every time a PR is created/updated?
> > > > > I am not sure how long the tests in other projects take. For Kafka it
> > > > > may take a few hours to run all the ducktape tests; will that be an
> > > > > issue if we run the tests for each update of the PR?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Thu, Nov 3, 2016 at 8:16 AM, Harsha Chintalapani <
> ka...@harsha.io
> > >
> > > > > wrote:
> > > > >
> > > > > > Thanks, Raghav. I am +1 for having this in Kafka. It will help
> > > > > > identify any potential issues, especially with big patches. Given
> > > > > > that we have some tests failing due to timing issues, can we disable
> > > > > > the failing tests for now so that we don't get any false negatives?
> > > > > >
> > > > > > Thanks,
> > > > > > Harsha
> > > > > >
> > > > > > On Tue, Nov 1, 2016 at 11:47 AM Raghav Kumar Gautam <
> > > rag...@apache.org
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > I want to start a discussion about running ducktape tests for
> > > > > > > each pull request. I have been working on KAFKA-4345
> > > > > > > <https://issues.apache.org/jira/browse/KAFKA-4345> to enable this
> > > > > > > using docker on travis-ci.
> > > > > > > Pull request: https://github.com/apache/kafka/pull/2064
> > > > > > > Working POC: https://travis-ci.org/raghavgautam/kafka/builds/171502069
> > > > > > >
> > > > > > > In the POC I am able to run 124/149 tests, out of which 88 pass.
> > > > > > > The failures are mostly timing issues. We can run the same scripts
> > > > > > > on a laptop, where I am able to run 138/149 tests successfully.
> > > > > > >
> > > > > > > For this to work we need to enable travis-ci for Kafka. I can
> > > > > > > open an infra ticket to request travis-ci for this. Travis-ci is
> > > > > > > already running tests for many Apache projects like Storm, Hive,
> > > > > > > Flume, Thrift, etc.; see: https://travis-ci.org/apache/.
> > > > > > >
> > > > > > > Does this sound interesting? Please comment.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Raghav.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > Thanks,
> > Ewen
> >
>
