Hey Pierre,

Sorry, I just don't think it's worth the time trying to debug this framework when a more robust one exists. Perhaps try reaching out to "kiwenlau"?
-Dima

On Mon, Sep 5, 2016 at 9:49 PM, Pierre Caserta <[email protected]> wrote:

> Thanks Dima,
> Now even if I use a network called hadoopnet.com I still have the same problem.
> Here are my regionservers that get detected:
>
> Region Servers
> Base Stats <http://192.168.99.100:33224/master-status#tab_baseStats>
> Memory <http://192.168.99.100:33224/master-status#tab_memoryStats>
> Requests <http://192.168.99.100:33224/master-status#tab_requestStats>
> Storefiles <http://192.168.99.100:33224/master-status#tab_storeStats>
> Compactions <http://192.168.99.100:33224/master-status#tab_compactStas>
>
> ServerName  Start time  Version  Requests Per Second  Num. Regions
> hadoop-slave1.hadoopnet.com,16020,1473137128613 <http://hadoop-slave1.hadoopnet.com:16030/rs-status>  Tue Sep 06 04:45:28 UTC 2016  1.2.2  0  0
> hadoop-slave1.hadoopnet.com.hadoopnet.com,16020,1473137128613 <http://hadoop-slave1.hadoopnet.com.hadoopnet.com:60010/rs-status>  Tue Sep 06 04:45:28 UTC 2016  Unknown  0  0
> hadoop-slave2.hadoopnet.com,16020,1473137127975 <http://hadoop-slave2.hadoopnet.com:16030/rs-status>  Tue Sep 06 04:45:27 UTC 2016  1.2.2  0  0
> hadoop-slave2.hadoopnet.com.hadoopnet.com,16020,1473137127975 <http://hadoop-slave2.hadoopnet.com.hadoopnet.com:60010/rs-status>  Tue Sep 06 04:45:27 UTC 2016  Unknown  0  0
> Total:4  2 nodes with inconsistent version  0  0
>
> instead of just hadoop-slave1.hadoopnet.com,16020,1473137128613 <http://hadoop-slave1.hadoopnet.com:16030/rs-status> and hadoop-slave2.hadoopnet.com,16020,1473137127975 <http://hadoop-slave2.hadoopnet.com:16030/rs-status>
>
> This is the script I used to start the hadoop cluster
>
> ---
> #!/bin/bash
>
> # the default node number is 3
> N=${1:-3}
>
>
> NETWORK=hadoopnet.com
> docker rm -f zk.$NETWORK &> /dev/null
> echo "start zk container..."
> docker run -p 2181:2181 --name zk.$NETWORK --hostname zk.$NETWORK --net=$NETWORK -itd -v conf:/opt/zookeeper/conf -v data:/tmp/zookeeper jplock/zookeeper
>
> # start hadoop master container
> docker rm -f hadoop-master.$NETWORK &> /dev/null
> echo "start hadoop-master container..."
> docker run -itd \
>     --net=$NETWORK \
>     -P \
>     --name hadoop-master.$NETWORK \
>     --hostname hadoop-master.$NETWORK \
>     --add-host zk.$NETWORK:$(docker inspect -f "{{with index .NetworkSettings.Networks \"${NETWORK}\"}}{{.IPAddress}}{{end}}" zk.$NETWORK) \
>     casertap/hhb
>
>
> # start hadoop slave container
> i=1
> while [ $i -lt $N ]
> do
>     docker rm -f hadoop-slave$i.$NETWORK &> /dev/null
>     echo "start hadoop-slave$i container..."
>     docker run -itd \
>         --net=$NETWORK \
>         --name hadoop-slave$i.$NETWORK \
>         --hostname hadoop-slave$i.$NETWORK \
>         --publish-all=false \
>         --add-host hadoop-master.$NETWORK:$(docker inspect -f "{{with index .NetworkSettings.Networks \"${NETWORK}\"}}{{.IPAddress}}{{end}}" hadoop-master.$NETWORK) \
>         --add-host zk.$NETWORK:$(docker inspect -f "{{with index .NetworkSettings.Networks \"${NETWORK}\"}}{{.IPAddress}}{{end}}" zk.$NETWORK) \
>         casertap/hhb
>     i=$(( $i + 1 ))
> done
>
> # get into hadoop master container
> docker exec -it hadoop-master.$NETWORK bash
> ---
>
> Thanks,
> pierre
>
>
> On 6 Sep 2016, at 08:47, Dima Spivak <[email protected]> wrote:
> >
> > Sounds good, Pierre. FWIW, if you want a preview, here's how to get a 5-node HBase cluster running based on the master branch of HBase in about a minute:
> >
> > 1. Source the clusterdock.sh script that defines the clusterdock_ helper functions:
> >    source /dev/stdin <<< "$(curl -sL http://tiny.cloudera.com/clusterdock.sh)"
> > 2. Start up a cluster:
> >    CLUSTERDOCK_TOPOLOGY_IMAGE=hbasejenkinsuser-docker-hbase.bintray.io/dev/clusterdock:apache_hbase_topology clusterdock_run ./bin/start_cluster -r hbasejenkinsuser-docker-hbase.bintray.io --namespace dev apache_hbase --hbase-version=master --hadoop-version=2.7.1 --secondary-nodes='node-{2..5}'
> >
> > And that's it. Feel free to put a -h for help information (put it right after the ./bin/start_cluster for details about the function or after the apache_hbase for details about the Apache HBase topology).
> >
> > -Dima
> >
> > On Mon, Sep 5, 2016 at 3:44 PM, Pierre Caserta <[email protected]> wrote:
> >
> >> Thanks for your answer.
> >> I will check the ticket https://issues.apache.org/jira/browse/HBASE-15961 regularly and try clusterdock as soon as the documentation comes out.
> >> I will try using hostnames with a domain, like master.hadoopnet.com, and a network named hadoopnet.com, to see if this resolves the problem.
> >> Currently my hostnames are hadoop-master, hadoop-slave1 and hadoop-slave2, maybe that is the problem.
> >>
> >>> On 5 Sep 2016, at 23:31, Dima Spivak <[email protected]> wrote:
> >>>
> >>> clusterdock uses --net=host for running the framework out of a container, but each Hadoop/HBase cluster itself runs with its own bridge network. Just suggesting clusterdock since it's what we now use for testing HBase releases and it looks a bit more sophisticated than this other project (e.g. no need to rebuild images for different cluster sizes).
> >>>
> >>> The error you're seeing is caused by not using the FQDN of the containers when referring to them; Docker networks use the network name as the domain.
> >>>
> >>> On Monday, September 5, 2016, Pierre Caserta <[email protected]> wrote:
> >>>
> >>>> That is a good script, thanks, but I would like to understand exactly what the problem is with my config without adding another level of abstraction and just running the clusterdock command.
> >>>> In your script I can see that you are using --net=host. I think this is the main difference compared to what I am doing, which is creating a bridge network for the hadoop cluster.
> >>>> I have only 3 machines: hadoop-master, hadoop-slave1, hadoop-slave2.
> >>>>
> >>>> Why do those strange hadoop-slave2.hadoopnet aliases appear in the web UI?
> >>>> It looks like the network name is used as part of the hostname.
> >>>> Any idea what is happening in my case?
> >>>>
> >>>> Pierre
> >>>>
> >>>>> On 5 Sep 2016, at 16:48, Dima Spivak <[email protected]> wrote:
> >>>>>
> >>>>> You should try the Apache HBase topology for clusterdock that was committed a few months back. See HBASE-12721 for details.
> >>>>>
> >>>>> On Sunday, September 4, 2016, Pierre Caserta <[email protected]> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>> I am building a fully distributed hbase cluster with unmanaged zookeeper.
> >>>>>> I pretty much used this example and installed hbase on top of it:
> >>>>>> https://github.com/kiwenlau/hadoop-cluster-docker
> >>>>>>
> >>>>>> Hadoop and hdfs work fine but I get this exception with hbase:
> >>>>>>
> >>>>>> 2016-09-05 06:27:12,268 INFO [hadoop-master:16000.activeMasterManager] zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at address=hadoop-slave2,16020,1473052276351, exception=org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,,1 is not online on hadoop-slave2.hadoopnet,16020,1473056813966
> >>>>>>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2910)
> >>>>>>
> >>>>>> This is blocking because any command I enter on the hbase shell will return the following error:
> >>>>>>
> >>>>>> ERROR: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
> >>>>>>
> >>>>>> The containers are run using --net=hadoopnet, which is a network created as such:
> >>>>>>
> >>>>>> docker network create --driver=bridge hadoopnet
> >>>>>>
> >>>>>> The hbase webui is showing this:
> >>>>>>
> >>>>>> Region Servers
> >>>>>> ServerName                                   Start time                    Version  Requests Per Second  Num. Regions
> >>>>>> hadoop-slave1,16020,1473056814064            Mon Sep 05 06:26:54 UTC 2016  1.2.2    0                    0
> >>>>>> hadoop-slave1.hadoopnet,16020,1473056814064  Mon Sep 05 06:26:54 UTC 2016  Unknown  0                    0
> >>>>>> hadoop-slave2,16020,1473056813966            Mon Sep 05 06:26:53 UTC 2016  1.2.2    0                    0
> >>>>>> hadoop-slave2.hadoopnet,16020,1473056813966  Mon Sep 05 06:26:53 UTC 2016  Unknown  0                    0
> >>>>>> Total:4                                      2 nodes with inconsistent version      0                    0
> >>>>>>
> >>>>>> I should have only 2 regionservers but 2 strange entries, hadoop-slave1.hadoopnet and hadoop-slave2.hadoopnet, are added to the list.
> >>>>>> When I look at zk using:
> >>>>>>
> >>>>>> /usr/local/hbase/bin/hbase zkcli -server zk:2181 ls /hbase/rs
> >>>>>>
> >>>>>> I only see my 2 regionservers: hadoop-slave1,16020,1473056814064 and hadoop-slave2,16020,1473056813966
> >>>>>>
> >>>>>> Looking at the zookeeper.MetaTableLocator: Failed verification error, I see that hadoop-slave2,16020,1473052276351 and hadoop-slave2.hadoopnet,16020,1473056813966 get mixed up.
> >>>>>>
> >>>>>> Here is my config on all servers:
> >>>>>>
> >>>>>> <?xml version="1.0" encoding="UTF-8"?>
> >>>>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> >>>>>>
> >>>>>> <configuration>
> >>>>>>   <property>
> >>>>>>     <name>hbase.rootdir</name>
> >>>>>>     <value>hdfs://hadoop-master:9000/hbase</value>
> >>>>>>     <description>The directory shared by region servers. Should be fully-qualified to include the filesystem to use. E.g: hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR</description>
> >>>>>>   </property>
> >>>>>>   <property>
> >>>>>>     <name>hbase.master</name>
> >>>>>>     <value>hdfs://hadoop-master:60000</value>
> >>>>>>     <description>The host and port that the HBase master runs at.</description>
> >>>>>>   </property>
> >>>>>>   <property>
> >>>>>>     <name>hbase.cluster.distributed</name>
> >>>>>>     <value>true</value>
> >>>>>>     <description>The mode the cluster will be in. Possible values are
> >>>>>>       false: standalone and pseudo-distributed setups with managed Zookeeper
> >>>>>>       true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)</description>
> >>>>>>   </property>
> >>>>>>   <property>
> >>>>>>     <name>hbase.master.info.port</name>
> >>>>>>     <value>60010</value>
> >>>>>>     <description>The UI interface of HBase master runs.</description>
> >>>>>>   </property>
> >>>>>>   <property>
> >>>>>>     <name>hbase.zookeeper.quorum</name>
> >>>>>>     <value>zk</value>
> >>>>>>     <description>string m_e_m_b_e_r_s is replaced by list of hosts separated by comma. Its generated by configure-slaves.sh on master node</description>
> >>>>>>   </property>
> >>>>>>   <property>
> >>>>>>     <name>hbase.zookeeper.property.maxClientCnxns</name>
> >>>>>>     <value>300</value>
> >>>>>>   </property>
> >>>>>>   <property>
> >>>>>>     <name>hbase.zookeeper.property.datadir</name>
> >>>>>>     <value>/tmp/zookeeper</value>
> >>>>>>     <description>location of storage of zookeeper data</description>
> >>>>>>   </property>
> >>>>>>   <property>
> >>>>>>     <name>hbase.zookeeper.property.clientPort</name>
> >>>>>>     <value>2181</value>
> >>>>>>   </property>
> >>>>>>
> >>>>>> </configuration>
> >>>>>>
> >>>>>> I created a stack overflow question as well: http://stackoverflow.com/questions/39325041/hbase-on-docker-notservingregionexception-because-of-hostname-alisas
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Pierre
> >>>>>
> >>>>> --
> >>>>> -Dima
> >>>
> >>> --
> >>> -Dima
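
A minimal sketch of the naming rule discussed in this thread ("Docker networks use the network name as the domain"). It assumes, based only on the web UI listings quoted above and not on anything confirmed in the thread, that a reverse lookup on a user-defined network returns `<container name>.<network name>`. The function names `reverse_name` and `check_names` are hypothetical helpers; the sketch does no DNS lookups and only compares strings.

```shell
#!/bin/bash
# Hypothetical helpers; not part of Docker, HBase, or the scripts quoted above.

# Assumption: on a user-defined network, a reverse lookup of a container's
# address yields "<container name>.<network name>".
reverse_name() {
    printf '%s.%s\n' "$1" "$2"
}

# Report whether the --hostname value matches the assumed reverse-lookup name.
# Arguments: container name (--name), hostname (--hostname), network (--net).
check_names() {
    local name=$1 hostname=$2 network=$3
    local reverse
    reverse=$(reverse_name "$name" "$network")
    if [ "$hostname" = "$reverse" ]; then
        echo "consistent: $hostname"
    else
        echo "mismatch: hostname=$hostname reverse=$reverse"
    fi
}

# Naming used in the quoted script: --name and --hostname both carry the domain.
check_names hadoop-slave1.hadoopnet.com hadoopnet.com.placeholder hadoopnet.com > /dev/null
check_names hadoop-slave1.hadoopnet.com hadoop-slave1.hadoopnet.com hadoopnet.com
# Alternative naming: short --name, fully qualified --hostname.
check_names hadoop-slave1 hadoop-slave1.hadoopnet.com hadoopnet.com
```

Under that assumption, the first visible check reports a mismatch with the doubled name hadoop-slave1.hadoopnet.com.hadoopnet.com, which is the extra entry in the region server list, and the second reports a consistent name. Whether a short `--name` with a fully qualified `--hostname` actually removes the duplicate region servers was not tested in this thread.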
