So we will just have:

ZK container
Kafka Container
HDFS Container

and not deploy any Metron components to them in the Docker setup, with the
tests themselves deploying what they need and cleaning up afterwards?
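I'm picturing something along these lines, with only the base services and
nothing Metron-specific baked into the images (the images and version tags
below are just examples of what's publicly available, not proposals):

version: '2'
services:
  zookeeper:
    image: zookeeper:3.4                    # official image; tag illustrative
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka               # example third-party image
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    ports:
      - "9092:9092"
  hdfs:
    image: sequenceiq/hadoop-docker:2.7.1   # example third-party image
    ports:
      - "50070:50070"                       # NameNode web UI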


On November 29, 2017 at 11:53:46, Ryan Merriman (merrim...@gmail.com) wrote:

“I would feel better using docker if each docker container only had the
base services, and did not require a separate but parallel deployment path
to ambari”

This is exactly how it works. There is a container for each base service, just
like we now have an in-memory component for each base service. There is
also no deployment path to Ambari. Ambari is not involved at all.

From a client perspective (our e2e/integration tests in this case) there
really is not much of a difference. At the end of the day services are up
and running and available on various ports.

Also there is going to be maintenance required no matter what approach we
decide on. If we add another ES template that needs to be loaded by the
MPack, our e2e/integration test infrastructure will also have to load that
template. I have had to do this with our current integration tests.
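For example, loading an ES template could itself be a one-shot container in
the same compose file. A rough sketch (the template name and paths are
hypothetical, and appropriate/curl is just a placeholder for any image with
curl as its entrypoint):

  load-es-template:
    image: appropriate/curl
    depends_on:
      - elasticsearch               # assumes an ES service in the same file
    volumes:
      - ./templates:/templates      # hypothetical host directory
    command: ["-XPUT",
              "http://elasticsearch:9200/_template/example_index",
              "-d", "@/templates/example_index.template"]

In practice we'd also need some wait/retry logic, since depends_on only
guarantees the container has started, not that ES is ready to accept the
template.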

> On Nov 29, 2017, at 9:38 AM, Otto Fowler <ottobackwa...@gmail.com> wrote:
>
> So the issue with metron-docker is that it is all custom setup for
> metron components, and understanding how to maintain it when you make
> changes to the system is difficult for the developers.
> This is a particular issue to me, because I would have to re-write a
> big chunk of it to accommodate 777.
>
> I would feel better using docker if each docker container only had the
> base services, and did not require a separate but parallel deployment
> path to ambari. That is to say, if the docker components were
> functionally equivalent to, and limited to, the in-memory components'
> functionality and usage. I apologize if that is in fact what you are
> getting at.
>
> Then we could move the integration and e2e tests to them.
>
>
>
>> On November 29, 2017 at 10:00:20, Ryan Merriman (merrim...@gmail.com)
>> wrote:
>>
>> Thanks for the feedback so far everyone. All good points.
>>
>> Otto, if we did decide to go down the Docker route, we could use
>> /master/metron-contrib/metron-docker as a starting point. The reason I
>> initially created that module was to support Management UI testing,
>> because full dev was unusable for that purpose at that time. This is
>> the same use case. A lot of the work has already been done, but we
>> would need to review it and bring it up to date with the current state
>> of master. Once we get it to a point where we can manually spin up the
>> Docker environment and get the e2e tests to pass, we would then need to
>> add it to our Travis workflow.
>>
>> Mike, yes, this is one of the options I listed at the start of the
>> discuss thread, although I'm not sure I agree with the Docker
>> disadvantages you list. We could use a similar approach for HDFS in
>> Docker by setting it to the local FS and creating a shared volume that
>> all the containers have access to (see the sketch below). I've also
>> found that Docker Compose makes the networking part much easier. What
>> other advantages would in-memory components in separate processes offer
>> us that you can think of? Are there other disadvantages to using Docker?
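>>
>> As a sketch of the shared volume idea (service and image names are
>> placeholders only):
>>
>>   version: '2'
>>   services:
>>     indexing:
>>       image: some/indexing-image        # hypothetical
>>       volumes:
>>         - shared-data:/data             # writes its "HDFS" output here
>>     e2e-tests:
>>       image: some/test-harness          # hypothetical
>>       volumes:
>>         - shared-data:/data             # reads the same files back
>>   volumes:
>>     shared-data:                        # named volume shared by all containers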
>>
>> Justin, I think that's a really good point and I would be on board with
>> it. I see this use case (e2e testing infrastructure) as a good way to
>> evaluate our options without making major changes across our codebase. I
>> would agree that standardizing on an approach would be ideal and is
>> something we should work towards. The debugging request is also
>> something that would be extremely helpful. The only issue I see is
>> debugging a Storm topology; that would still need to be run locally
>> using LocalCluster, because remote debugging does not work well in Storm
>> (per previous comments from Storm committers). At one point I was able
>> to get this to work with Docker containers, but we would definitely need
>> to revisit it and create tooling around it.
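>>
>> For the non-Storm services, the container side of that tooling could be
>> as simple as passing the standard JDWP flags and publishing the debug
>> port. A rough sketch (service and image names are hypothetical):
>>
>>   rest:
>>     image: some/rest-image              # hypothetical image
>>     environment:
>>       JAVA_TOOL_OPTIONS: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"
>>     ports:
>>       - "5005:5005"                     # attach the IDE's remote debugger here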
>>
>> So in summary, I think we agree on these points so far:
>>
>> - no one seems to be in favor of mocking our backend so I'll take that
>> option off the table
>> - everyone seems to be in favor of moving to a strategy where we spin up
>> backend services at the beginning of all tests and spin down at the end,
>> rather than spinning up/down for each class or suite of tests
>> - the ability to debug our code locally is important and something to
>> keep in mind as we evaluate our options
>>
>> I think the next step is to decide whether we pursue in-memory/separate
>> process vs Docker. Having used both, there are a couple disadvantages I
>> see with the in-memory approach:
>>
>> - The in-memory components are different from real installations and
>> come with their own issues. There have been cases where an in-memory
>> component had a bug (looking at you, Kafka) that a normal installation
>> wouldn't have, and working around it took real effort.
>> - Spinning up the in-memory components in separate processes and
>> managing their life cycles is not a trivial task. In Otto's words, I
>> believe this will inevitably become a "large chunk of custom development
>> that has to be maintained". Docker Compose exposes a declarative
>> interface that is much simpler in my opinion (check out
>> https://github.com/apache/metron/blob/master/metron-contrib/metron-docker/compose/docker-compose.yml
>> as an example). I also think our testing infrastructure will be more
>> accessible to outside contributors because Docker is a common skill in
>> the industry. Otherwise a contributor would have to come up to speed
>> with our custom in-memory process module before being able to make any
>> meaningful contributions.
>>
>> I can live with the first one, but the second one is a big issue IMO.
>> Even if we do decide to use the in-memory components, I think we need to
>> delegate the process management stuff to another framework not
>> maintained by us.
>>
>> How do others feel? What other considerations are there?
>>
>> On Wed, Nov 29, 2017 at 6:59 AM, Justin Leet <justinjl...@gmail.com>
>> wrote:
>>
>> > As an additional consideration, it would be really nice for our
>> > current set of integration tests to be able to run on this
>> > infrastructure as well, or at least to be convertible in a known
>> > manner. Eventually, we could probably split the integration tests out
>> > from the unit tests entirely. It would likely improve build times if
>> > we were reusing the components between test classes (keep in mind that
>> > right now we only reuse them between test cases in a given class).
>> >
>> > In my mind, ideally we have a single infrastructure for integration
>> > and e2e tests. I'd like to be able to run them from IntelliJ and
>> > debug them directly (or at least be able to do remote debugging of
>> > them easily and in a well documented manner). Obviously, that's
>> > easier said than done, but what I'd like to avoid is us having
>> > essentially two different ways to do the same thing (spin up some of
>> > our dependency components and run code against them). I'm worried
>> > that's quick dev vs full dev all over again, but without us being
>> > able to easily kill one because half of the tests depend on one and
>> > half on the other.
>> >
>> > On Wed, Nov 29, 2017 at 1:22 AM, Michael Miklavcic <
>> > michael.miklav...@gmail.com> wrote:
>> >
>> > > What about just spinning up each of the components in its own
>> > > process? It's even lighter weight, doesn't have the complications
>> > > for HDFS (you can easily use the local FS, for example), and doesn't
>> > > have any of the issues around ports and port mapping that come with
>> > > containers.
>> > >
>> > > On Tue, Nov 28, 2017 at 3:48 PM, Otto Fowler <ottobackwa...@gmail.com>
>> > > wrote:
>> > >
>> > > > As long as there is not a large chunk of custom deployment that
>> > > > has to be maintained, Docker sounds ideal.
>> > > > I would like to understand what it would take to create the Docker
>> > > > e2e env.
>> > > >
>> > > >
>> > > >
>> > > > On November 28, 2017 at 17:27:13, Ryan Merriman (merrim...@gmail.com)
>> > > > wrote:
>> > > >
>> > > > Currently the e2e tests for our Alerts UI depend on full dev being
>> > > > up and running. This is not a good long-term solution because it
>> > > > forces a contributor/reviewer to run the tests manually with full
>> > > > dev running. It would be better if the backend services could be
>> > > > made available to the e2e tests while running in Travis. This
>> > > > would allow us to add the e2e tests to our automated build process.
>> > > >
>> > > > What is the right approach? Here are some options I can think of:
>> > > >
>> > > > - Use the in-memory components we use for the backend integration tests
>> > > > - Use a Docker approach
>> > > > - Use mock components designed for the e2e tests
>> > > >
>> > > > Mocking the backend would be my least favorite option because it
>> > > > would introduce a complex module of code that we have to maintain.
>> > > >
>> > > > The in-memory approach has some shortcomings, but we may be able
>> > > > to solve some of those by moving components to their own processes
>> > > > and spinning them up/down at the beginning/end of tests. Plus we
>> > > > are already using them.
>> > > >
>> > > > My preference would be Docker because it most closely mimics a
>> > > > real installation and gives you isolation, networking, and
>> > > > dependency management features OOTB. In many cases Dockerfiles are
>> > > > maintained and published by a third party and require no work
>> > > > other than some setup, like loading data or templates/schemas.
>> > > > Elasticsearch is a good example.
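>> > > >
>> > > > A compose entry along these lines (the version tag is just
>> > > > illustrative) would give us a real Elasticsearch with no custom
>> > > > build step:
>> > > >
>> > > >   elasticsearch:
>> > > >     image: elasticsearch:2.4      # official image on Docker Hub
>> > > >     ports:
>> > > >       - "9200:9200"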
>> > > >
>> > > > I believe we could make any of these approaches work in Travis.
>> > > > What does everyone think?
>> > > >
>> > > > Ryan
>> > > >
>> > >
>> >
