[ 
https://issues.apache.org/jira/browse/IMPALA-7987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746475#comment-16746475
 ] 

ASF subversion and git services commented on IMPALA-7987:
---------------------------------------------------------

Commit ff628d2b136e9b5ca72a7179294dea06f4cdf0d8 in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=ff628d2 ]

IMPALA-7986,IMPALA-7987: run daemons in docker containers

This refactors start-impala-cluster.py to allow multiple implementations
of the minicluster operations like start and stop. There are now
two classes implementing the same set of operations:
MiniClusterOperations and DockerMiniClusterOperations. The Docker
version starts and stops the containers added in IMPALA-7948.
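
As a rough illustration of that structure (not the actual code in
start-impala-cluster.py), two classes exposing the same operations might
look like the sketch below. The method names, the use of the daemon name
as both container and image name, and the exact command lines are all
assumptions made for this example. The sketch only builds command lines
and does not run anything.

```python
class MiniClusterOperations(object):
    """Hypothetical sketch: daemons run as ordinary host processes."""

    def start_command(self, daemon, args):
        # Assumption: the daemon binary is on PATH and is run directly.
        return [daemon] + list(args)

    def stop_command(self, daemon):
        # Assumption: stop the daemon by exact process name.
        return ["pkill", "-x", daemon]


class DockerMiniClusterOperations(object):
    """Hypothetical sketch: each daemon runs in its own container."""

    def __init__(self, network_name):
        self.network_name = network_name

    def start_command(self, daemon, args):
        # Assumption: the image and the container share the daemon's name.
        # Extra args are passed after the image name to the container.
        return ["docker", "run", "-d", "--name", daemon,
                "--network", self.network_name, daemon] + list(args)

    def stop_command(self, daemon):
        return ["docker", "stop", daemon]


def make_operations(docker_network=None):
    """Pick an implementation; callers use the same methods either way."""
    if docker_network:
        return DockerMiniClusterOperations(docker_network)
    return MiniClusterOperations()
```

Because both classes answer to the same method names, the calling code
does not need to know which kind of cluster it is driving.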

With some configuration (see instructions below), the containers can
connect back to services (HDFS, HMS, Kudu, Sentry, etc) running on the
host. Config generation was modified so that services optionally
communicate via the docker bridge network rather than loopback
(the host's loopback interface is not accessible to the containers).
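
The effect of that config change can be pictured with a small sketch. It
assumes the address comes from the INTERNAL_LISTEN_HOST environment
variable used in the instructions further down, with a fallback to
"localhost" when it is unset. The function names and the fallback are
assumptions for illustration and are not taken from the real config
generator.

```python
import os


def internal_listen_host(env=None):
    """Return the host that services should advertise to each other."""
    env = os.environ if env is None else env
    # Assumption: an unset or empty variable means plain loopback.
    return env.get("INTERNAL_LISTEN_HOST") or "localhost"


def default_fs(env=None, port=20500):
    """Build the HDFS URI from the chosen host (port as in the example)."""
    return "hdfs://{0}:{1}".format(internal_listen_host(env), port)
```

With INTERNAL_LISTEN_HOST set to the bridge gateway, every generated URI
points at an address the containers can reach.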

Notes:
* I improved the container build to regenerate containers when cluster
  configs are regenerated (previously the containers could have stale
  configs).
* I switched from CMD to ENTRYPOINT so that arguments passed to "docker
  run" are appended to the default args rather than clobbering them.
* Python 2.6 is not supported for this code path. This only affects
  CentOS 6, which has limited support for docker anyway.
* I deferred implementing wait_for_cluster(), since the existing
  code requires surgery to abstract out assumptions about locating
  processes and web UI ports - see IMPALA-7988.

How to use:
===========
Create a docker network to use for internal cluster communication,
e.g.:
  docker network create -d bridge --gateway=172.17.0.1 \
      --subnet=172.17.0.1/16 impala-cluster

Add the gateway address of the docker network you created to
impala-config-local.sh, e.g.:

  export INTERNAL_LISTEN_HOST=172.17.0.1
  export DEFAULT_FS=hdfs://${INTERNAL_LISTEN_HOST}:20500
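
Before regenerating configs it can help to sanity-check that the address
you exported really lies inside the network's subnet. The helper below is
a hypothetical sketch using Python 3's standard ipaddress module; it is
not part of the Impala scripts.

```python
import ipaddress


def gateway_in_subnet(gateway, subnet):
    """Return True if the gateway address falls inside the subnet."""
    # strict=False accepts a CIDR with host bits set, such as
    # 172.17.0.1/16, and treats it as the enclosing network.
    network = ipaddress.ip_network(subnet, strict=False)
    return ipaddress.ip_address(gateway) in network
```

For the example values above, gateway_in_subnet("172.17.0.1",
"172.17.0.1/16") is true, while an address on another network such as
10.0.0.1 is rejected.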

Regenerate configs and docker images:

  . bin/impala-config.sh
  ./bin/create-test-configuration.sh
  ninja -j $IMPALA_BUILD_THREADS docker_images

Restart the minicluster and Impala services to pick up the config:

  ./testdata/bin/run-all.sh
  start-impala-cluster.py --docker_network impala-cluster

You can connect with impala-shell and run some queries. You will
likely run into issues, particularly if running against an existing
data load, since "localhost" or "127.0.0.1" get baked into HMS
table definitions.
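
One way to spot affected tables is to check their locations for loopback
hosts. The sketch below assumes the locations have already been collected
as URI strings (how to fetch them from HMS is out of scope here); the
function name and the sample paths are hypothetical.

```python
from urllib.parse import urlparse

# Hosts that a container cannot use to reach services on the host machine.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def uses_loopback(location):
    """Return True if a table location URI points at a loopback host."""
    return urlparse(location).hostname in LOOPBACK_HOSTS


# Hypothetical sample locations, for illustration only.
locations = [
    "hdfs://localhost:20500/warehouse/t1",
    "hdfs://172.17.0.1:20500/warehouse/t2",
]
stale = [loc for loc in locations if uses_loopback(loc)]
```

Any location that ends up in stale would need to be reloaded or rewritten
before containerised daemons can read it.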

Testing:
Ran exhaustive tests (not using Docker) to make sure I didn't break
anything.

Change-Id: I5975cced33fa93df43101dd47d19b8af12e93d11
Reviewed-on: http://gerrit.cloudera.org:8080/12095
Reviewed-by: Tim Armstrong <[email protected]>
Reviewed-by: Joe McDonnell <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Get start-impala-cluster.py to start up a usable minicluster
> ------------------------------------------------------------
>
>                 Key: IMPALA-7987
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7987
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Infrastructure
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>
> The goal here is to start up an impala minicluster running inside docker 
> containers (process per container) and be able to run queries against the 
> minicluster running on the host (i.e. not in docker).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

