The most actively maintained way to run a self-contained Impala cluster is the development environment:
https://cwiki.apache.org/confluence/display/IMPALA/Bootstrapping+an+Impala+Development+Environment+From+Scratch

It runs the Impala processes plus all of the services they depend on (HDFS, HMS, Kudu, etc.) on a single node.
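For integration tests that run against such a single-node cluster, it can help to wait until impalad accepts connections before starting the test suite. The following is only a minimal sketch using the Python standard library. The helper name `wait_for_port` is hypothetical, and the port 21050 (impalad's default HiveServer2 port) is an assumption that should be checked against your own configuration.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=60.0, interval_s=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect only shows that something is listening;
            # it does not prove that the service can already run queries.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)


if __name__ == "__main__":
    # Assumption: impalad listens for HiveServer2 clients on port 21050.
    ready = wait_for_port("localhost", 21050, timeout_s=2.0, interval_s=0.5)
    print("impalad reachable" if ready else "impalad not reachable")
```

A test harness could call `wait_for_port` once in its setup step and skip or fail the suite when it returns False.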
We test that automatically on Ubuntu 16.04 and all Impala developers actively use it. The environment includes scripts to start all the required services and Impala. The main catch is that it requires building Impala. The script automates the build, but it takes a while the first time around because there are a lot of dependencies to download and a lot of C++ to compile.

This page has instructions for creating that dev environment *inside* a Docker container, in case your host OS is not Ubuntu 16.04 or you want it contained:
https://cwiki.apache.org/confluence/display/IMPALA/Impala+Development+Environment+inside+Docker

There's an alternative set of Docker containers that puts each service in its own container, but that probably doesn't help you much, since it requires an existing HDFS, HMS, etc. plus configuring the containers yourself:
https://cwiki.apache.org/confluence/display/IMPALA/Build+and+Test+for+Daemon+Docker+Containers

I built some Docker containers a few months back from a random commit on master and pushed them to Docker Hub in the following repos: timgarmstrong/asf-master-{statestored,catalogd,impalad_coordinator,impalad_executor,impalad_coord_exec}. Those are very much use at your own risk, but if you want to inspect the containers, they could get you started.

On Fri, Oct 25, 2019 at 5:59 AM Antoni Ivanov <aiva...@vmware.com> wrote:

> Hi,
>
> We’d like to test with Impala locally, so that we can build integration
> tests against a running (if mock) version of Impala.
> What options do we have?
>
> The way we've found so far is:
> * So far the best option we've seen is Docker containers with Impala. We
>   have found a few:
>   * https://github.com/tomwhite/docker-impala
>   * https://hub.docker.com/r/parrotstream/impala
>   But they are not necessarily kept up to date, and they are also a bit
>   heavier in terms of resources than necessary.
>
> Is there a way to start an embedded Impala, or something like
> https://github.com/sakserv/hadoop-mini-clusters,
> that can be used for testing purposes?
>
> Thanks,
> Antoni
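For the Docker Hub repos named in the reply above, the brace shorthand expands to five image names. A rough sketch of that expansion and the matching `docker pull` commands follows. The function names are hypothetical, no image tag is specified because the message does not give one (so `docker pull` would fall back to its default tag, which may or may not exist), and the images themselves may no longer be available.

```python
NAMESPACE = "timgarmstrong"
PREFIX = "asf-master-"
# Daemon names exactly as listed in the brace expression in the reply.
DAEMONS = [
    "statestored",
    "catalogd",
    "impalad_coordinator",
    "impalad_executor",
    "impalad_coord_exec",
]


def image_names():
    """Expand the shorthand into full Docker Hub repository names."""
    return [f"{NAMESPACE}/{PREFIX}{daemon}" for daemon in DAEMONS]


def pull_commands():
    """Build one `docker pull` argument list per image (nothing is executed)."""
    return [["docker", "pull", name] for name in image_names()]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run by hand.
    for cmd in pull_commands():
        print(" ".join(cmd))
```

The commands are only printed, not run, so the list can be reviewed first. These images are use at your own risk, as noted in the reply.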