[
https://issues.apache.org/jira/browse/HBASE-12721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15313319#comment-15313319
]
Dima Spivak commented on HBASE-12721:
-------------------------------------
Yay! Glad to hear it works outside of my network :). As for the great points
raised:
bq. How does one build and register different versions of HBase for launching
by build_cluster? Possible to add to a local library?
You can do this for the 0.98 branch, for example, by running:
{noformat}
clusterdock_run ./bin/build_cluster apache_hbase --hbase-version=0.98 \
    --hbase-git-commit=0.98
{noformat}
In this case, the build process will check out source code, build a binary
tarball using Maven, and extract it into a proper Docker image that can then be
picked up by {{start_cluster}}. FYI, the {{hbase-version}} argument is just a
label for our internal use, whereas {{hbase-git-commit}} is what's actually
used when checking out code; this lets you do one-off builds of a particular
commit and name them whatever you want.
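For example, to do a one-off build of a specific commit under a label of your
choosing (the hash and label here are just placeholders):
{noformat}
clusterdock_run ./bin/build_cluster apache_hbase --hbase-version=my-fix \
    --hbase-git-commit=abc1234
{noformat}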
bq. Can we set up a library of HBase versions to test? 0.98, 1.1, 1.2, 1.3, and
master.
I hope so! We've had this running internally at Cloudera for a while: once a
night, we build these images, push them to our local repository, and then,
assuming that succeeds, trigger a Jenkins job that runs some tests from
{{hbase-it}}. Once we have the Docker registry part ironed out, I can
coordinate with a committer on setting up the Jenkins infrastructure needed to
do the same.
bq. Build_cluster should allow me to set Xms and Xmx, at least for
regionservers. If I start up a high memory instance I might want 8 GB heaps,
etc.
So the difference between {{build_cluster}} and {{start_cluster}} is that
{{build_cluster}} creates the Docker image(s) needed for {{start_cluster}} to
work properly. Once those images exist, you can pass the location of an
.ini-like file as an argument to {{start_cluster}}, and it will set
configurations for you before starting services (and also keep them
synchronized across every node of your Docker-based cluster). Here's what the
default configuration looks like:
{noformat}
[hadoop/slaves]
+++ '\n'.join(["{{0}}.{network}".format(node) for node in {secondary_nodes}])
[hadoop/core-site.xml]
fs.default.name = hdfs://{primary_node[0]}.{network}:8020
[hadoop/mapred-site.xml]
mapreduce.framework.name = yarn
[hadoop/yarn-site.xml]
yarn.resourcemanager.hostname = {primary_node[0]}.{network}
yarn.nodemanager.aux-services = mapreduce_shuffle
yarn.nodemanager.aux-services.mapreduce_shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
yarn.nodemanager.vmem-check-enabled = false
[hbase/regionservers]
+++ '\n'.join(["{{0}}.{network}".format(node) for node in {secondary_nodes}])
[hbase/backup-masters]
{secondary_nodes[0]}.{network}
[hbase/hbase-site.xml]
hbase.cluster.distributed = true
hbase.rootdir = hdfs://{primary_node[0]}.{network}/hbase
hbase.zookeeper.quorum = {primary_node[0]}.{network}
hbase.zookeeper.property.dataDir = /usr/local/zookeeper
hbase.it.clustermanager.hadoop.hdfs.user = root
hbase.it.clustermanager.zookeeper.user = root
hbase.it.clustermanager.hbase.user = root
{noformat}
Note that this file uses the group (the name inside [ ]) to decide which file
to modify, and it knows to render {{key = value}} pairs as XML properties for
.xml files while handling non-XML files differently. As such, we could just
pass JVM arguments in here.
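For instance, to give region servers 8 GB heaps, something like the following
added to a copy of the file above should do the trick. This is just a sketch:
I'm assuming a {{+++}} line appends the evaluated expression to the named file,
as the {{regionservers}} group suggests, and that {{hbase-env.sh}} can be
targeted like any other file under the conf directory:
{noformat}
[hbase/hbase-env.sh]
+++ 'export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xms8g -Xmx8g"'
{noformat}
You'd then pass the location of your modified file to {{start_cluster}} when
bringing up the cluster.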
bq. How would we use G1 instead of CMS? Bonus points for extra GC flag support
for Shenandoah.
Same as above, I think? If it can be controlled through a file under
{{/hbase/conf}}, this framework already supports it.
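E.g., a sketch under the same {{hbase-env.sh}} assumption as above:
{noformat}
[hbase/hbase-env.sh]
+++ 'export HBASE_OPTS="$HBASE_OPTS -XX:+UseG1GC"'
{noformat}
Shenandoah flags would go in the same way, provided the JDK baked into the
image actually supports them.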
bq. How would we use a different version of the JVM? (including a custom
compiled version)
{{./bin/build_cluster apache_hbase}} supports a flag that specifies a Java
tarball to use. I imagine I'd need to modify the code a little bit to handle
non-Oracle releases, though.
bq. Let's add a script/wrapper that runs all of the IT tests. Extra credit if
one can optionally specify monkey type and policy on the command line.
+1!
> Create Docker container cluster infrastructure to enable better testing
> -----------------------------------------------------------------------
>
> Key: HBASE-12721
> URL: https://issues.apache.org/jira/browse/HBASE-12721
> Project: HBase
> Issue Type: New Feature
> Reporter: Dima Spivak
> Assignee: Dima Spivak
>
> Some simple work on using HBase with Docker was committed into /dev-support
> as "hbase_docker"; all this did was stand up a standalone cluster from source
> and start a shell. Now seems like a good time to extend this to be useful for
> applications that could actually benefit the community, especially around
> testing. Some ideas:
> - Integration testing would be much more accessible if people could stand up
> distributed HBase clusters on a single host machine in a couple minutes and
> run our awesome hbase-it suite against it.
> - Binary compatibility testing of an HBase client is easiest when standing up
> an HBase cluster can be done once and then different client source/binary
> permutations run against it.
> - Upgrade testing, and especially rolling upgrade testing, doesn't have any
> upstream automation on build.apache.org, in part because it's a pain to set
> up x-node clusters on Apache infrastructure.
> This proposal, whether it stays under /dev-support or moves out into its own
> top-level module ("hbase-docker" would conveniently fit the existing schema
> :-)), strives to create a simple framework for deploying "distributed,"
> multi-container Apache HBase clusters.