[
https://issues.apache.org/jira/browse/HBASE-12721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15313319#comment-15313319
]
Dima Spivak commented on HBASE-12721:
-------------------------------------
Yay! Glad to hear it works outside of my network :). As for the great points
raised:
bq. How does one build and register different versions of HBase for launching
by build_cluster? Possible to add to a local library?
You can do this for the 0.98 branch, for example, by running:
{noformat}
clusterdock_run ./bin/build_cluster apache_hbase --hbase-version=0.98 \
    --hbase-git-commit=0.98
{noformat}
In this case, the build process will check out source code, build a binary
tarball using Maven, and extract it into a proper Docker image that can then be
picked up by {{start_cluster}}. FYI, the {{hbase-version}} argument is just a
label for our internal use, whereas {{hbase-git-commit}} is what's actually
used when checking out code; this lets you do one-off builds of a particular
commit and name them whatever you want.
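For example, to do a one-off build of a specific commit under a label of your
choosing (the hash and label here are just placeholders):
{noformat}
clusterdock_run ./bin/build_cluster apache_hbase --hbase-version=my-fix \
    --hbase-git-commit=abc1234
{noformat}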
bq. Can we set up a library of HBase versions to test? 0.98, 1.1, 1.2, 1.3, and
master.
I hope so! We've had this running internally at Cloudera for a while: once a
night, we build these images, push them to our local repository, and then,
assuming that succeeds, trigger a Jenkins job that runs some tests from
{{hbase-it}}. Once we have the Docker registry part ironed out, I can
coordinate with a committer on setting up the Jenkins infrastructure needed to
do the same.
bq. Build_cluster should allow me to set Xms and Xmx, at least for
regionservers. If I start up a high memory instance I might want 8 GB heaps,
etc.
So the difference between {{build_cluster}} and {{start_cluster}} is that
{{build_cluster}} creates the Docker image(s) needed for {{start_cluster}} to
work properly. Once those images exist, you can pass the location of an
.ini-like file as an argument to {{start_cluster}}, and it will set
configurations for you before starting services (and also keep them
synchronized across every node of your Docker-based cluster). Here's what the
default configuration looks like:
{noformat}
[hadoop/slaves]
+++ '\n'.join(["{{0}}.{network}".format(node) for node in {secondary_nodes}])
[hadoop/core-site.xml]
fs.default.name = hdfs://{primary_node[0]}.{network}:8020
[hadoop/mapred-site.xml]
mapreduce.framework.name = yarn
[hadoop/yarn-site.xml]
yarn.resourcemanager.hostname = {primary_node[0]}.{network}
yarn.nodemanager.aux-services = mapreduce_shuffle
yarn.nodemanager.aux-services.mapreduce_shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
yarn.nodemanager.vmem-check-enabled = false
[hbase/regionservers]
+++ '\n'.join(["{{0}}.{network}".format(node) for node in {secondary_nodes}])
[hbase/backup-masters]
{secondary_nodes[0]}.{network}
[hbase/hbase-site.xml]
hbase.cluster.distributed = true
hbase.rootdir = hdfs://{primary_node[0]}.{network}/hbase
hbase.zookeeper.quorum = {primary_node[0]}.{network}
hbase.zookeeper.property.dataDir = /usr/local/zookeeper
hbase.it.clustermanager.hadoop.hdfs.user = root
hbase.it.clustermanager.zookeeper.user = root
hbase.it.clustermanager.hbase.user = root
{noformat}
Note that this file uses the group (the name inside [ ]) to decide which file
to modify, and it knows to render {{key = value}} pairs as XML properties for
.xml files while handling non-XML files differently. As such, we could just
pass JVM arguments in here.
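For instance, to give region servers 8 GB heaps, something like the following
added to a copy of the file above should do the trick. This is just a sketch:
I'm assuming a {{+++}} line appends the evaluated expression to the named file,
as the {{regionservers}} group suggests, and that {{hbase-env.sh}} can be
targeted like any other file under the conf directory:
{noformat}
[hbase/hbase-env.sh]
+++ 'export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xms8g -Xmx8g"'
{noformat}
You'd then pass the location of your modified file to {{start_cluster}} when
bringing up the cluster.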
bq. How would we use G1 instead of CMS? Bonus points for extra GC flag support
for Shenandoah.
Same as above, I think? If it can be controlled through a file under
{{/hbase/conf}}, this framework already supports it.
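E.g., a sketch under the same {{hbase-env.sh}} assumption as above:
{noformat}
[hbase/hbase-env.sh]
+++ 'export HBASE_OPTS="$HBASE_OPTS -XX:+UseG1GC"'
{noformat}
Shenandoah flags would go in the same way, provided the JDK baked into the
image actually supports them.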
bq. How would we use a different version of the JVM? (including a custom
compiled version)
{{./bin/build_cluster apache_hbase}} supports a flag that specifies a Java
tarball to use. I imagine I'd need to modify the code a little bit to handle
non-Oracle releases, though.
bq. Let's add a script/wrapper that runs all of the IT tests. Extra credit if
one can optionally specify monkey type and policy on the command line.
+1!
> Create Docker container cluster infrastructure to enable better testing
> -----------------------------------------------------------------------
>
> Key: HBASE-12721
> URL: https://issues.apache.org/jira/browse/HBASE-12721
> Project: HBase
> Issue Type: New Feature
> Reporter: Dima Spivak
> Assignee: Dima Spivak
>
> Some simple work on using HBase with Docker was committed into /dev-support
> as "hbase_docker"; all this did was stand up a standalone cluster from source
> and start a shell. Now seems like a good time to extend this to be useful for
> applications that could actually benefit the community, especially around
> testing. Some ideas:
> - Integration testing would be much more accessible if people could stand up
> distributed HBase clusters on a single host machine in a couple minutes and
> run our awesome hbase-it suite against it.
> - Binary compatibility testing of an HBase client is easiest when standing up
> an HBase cluster can be done once and then different client source/binary
> permutations run against it.
> - Upgrade testing, and especially rolling upgrade testing, doesn't have any
> upstream automation on build.apache.org, in part because it's a pain to set
> up x-node clusters on Apache infrastructure.
> This proposal, whether it stays under /dev-support or moves out into its own
> top-level module ("hbase-docker" would conveniently fit the existing schema
> :-)), strives to create a simple framework for deploying "distributed,"
> multi-container Apache HBase clusters.