like

19:09:23 208.94.1.52 jack@zero:~ $ host 38.99.76.204
204.76.99.38.in-addr.arpa domain name pointer img646.imageshack.us.
19:10:26 208.94.1.52 jack@zero:~ $
This is the name I wanted it to use. It appears that with the current setup, we can't change hostnames.

-Jack

On Tue, May 24, 2011 at 7:03 PM, Jack Levin <[email protected]> wrote:
> "HBase uses the local hostname to self-report its IP address."
>
> using 'hostname' as the authoritative name for the regionserver is what
> caused all of the confusion. hostname is usually not governed by name
> resolution (/etc/hosts, dns); some users may call their servers something
> other than what's in dns, so hbase will break for them if they do. A better
> idea would be to check eth0 for the IP, get the reverse dns name for it,
> and use that.
>
> just my small two cents.
>
> -Jack
>
> On Tue, May 24, 2011 at 6:02 PM, Jean-Daniel Cryans <[email protected]>
> wrote:
>> Zookeeper doesn't query addresses, it's all done in HBase which in
>> turn stores it in ZK.
>>
>> Also http://hbase.apache.org/book.html#dns
>>
>> J-D
>>
>> On Tue, May 24, 2011 at 4:37 PM, Jack Levin <[email protected]> wrote:
>>> figured it out... the /etc/hosts file maps ip to name, and the name
>>> used by zookeeper was *.prod.imageshack.com, while hostname was
>>> imgXX.imageshack.us... used by Regionserver/Master - Ideally, all
>>> three components should source hostnames from the same place, whether
>>> it's hostname or /etc/hosts (or dns), etc... it's gotta be consistent,
>>> otherwise aliases end up screwing things up and people will end up
>>> guessing why things don't work.
>>>
>>> -Jack
>>>
>>> On Tue, May 24, 2011 at 4:04 PM, Jack Levin <[email protected]> wrote:
>>>> img645.prod.imageshack.us and img645.imageshack.us both point to
>>>> the same IP.
>>>>
>>>> -Jack
>>>>
>>>> On Tue, May 24, 2011 at 3:50 PM, Jack Levin <[email protected]> wrote:
>>>>> looks like our balancer is on:
>>>>>
>>>>> hbase(main):001:0> balance_switch true
>>>>> true
>>>>> 0 row(s) in 0.3700 seconds
>>>>>
>>>>> I simply kill the PID for the RS, and it stays on the list with
>>>>> regions assigned, and the master does not know about it.
>>>>>
>>>>> So it still does not work.
>>>>>
>>>>> -Jack
>>>>>
>>>>> On Tue, May 24, 2011 at 3:43 PM, Dave Latham <[email protected]> wrote:
>>>>>> Are you using the graceful_stop script?
>>>>>>
>>>>>> In 0.90.3 the bin/graceful_stop.sh script was updated to disable the
>>>>>> master's balancer. However, it doesn't seem that anything re-enables
>>>>>> it, so if you're using it you need to re-enable it on your own. See
>>>>>> the book for more details:
>>>>>> http://hbase.apache.org/book.html#decommission
>>>>>>
>>>>>> Dave
>>>>>>
>>>>>> On Tue, May 24, 2011 at 3:33 PM, Jack Levin <[email protected]> wrote:
>>>>>>
>>>>>>> just put the new hbase version on our test cluster and have been
>>>>>>> testing it... so far, if I shut down an RS, the master does not
>>>>>>> reassign its regions, and we remain inconsistent forever; likewise,
>>>>>>> when a new RS is up, it does not get regions assigned to it. this is
>>>>>>> the master log:
>>>>>>>
>>>>>>>
>>>>>>> 2011-05-24 15:30:57,724 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x1302094818900a4-0x1302094818900a4 Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/rs/img645.prod.imageshack.com,60020,1306276075768
>>>>>>> 2011-05-24 15:30:57,724 INFO org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [img645.prod.imageshack.com,60020,1306276075768]
>>>>>>> 2011-05-24 15:30:57,724 INFO org.apache.hadoop.hbase.zookeeper.RegionServerTracker: No HServerInfo found for img645.prod.imageshack.com,60020,1306276075768
>>>>>>> 2011-05-24 15:30:57,726 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x1302094818900a4-0x1302094818900a4 Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/rs
>>>>>>> 2011-05-24 15:31:03,330 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:60000-0x1302094818900a4-0x1302094818900a4 Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/rs
>>>>>>> 2011-05-24 15:31:03,338 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: master:60000-0x1302094818900a4-0x1302094818900a4 Retrieved 32 byte(s) of data from znode /hbase/rs/img645.prod.imageshack.com,60020,1306276262774 and set watcher; img645.prod.imageshack.com:60020
>>>>>>> 2011-05-24 15:31:03,350 INFO org.apache.hadoop.hbase.master.ServerManager: Server start rejected; we already have img645.imageshack.us:60020 registered; existingServer=serverName=img645.imageshack.us,60020,1306276075768, load=(requests=0, regions=0, usedHeap=40, maxHeap=3995), newServer=serverName=img645.imageshack.us,60020,1306276262774, load=(requests=0, regions=0, usedHeap=23, maxHeap=3995)
>>>>>>> 2011-05-24 15:31:03,350 INFO org.apache.hadoop.hbase.master.ServerManager: Triggering server recovery; existingServer img645.imageshack.us,60020,1306276075768 looks stale
>>>>>>> 2011-05-24 15:31:03,353 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=img645.imageshack.us,60020,1306276075768 to dead servers, submitted shutdown handler to be executed, root=false, meta=false
>>>>>>> 2011-05-24 15:31:03,353 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for img645.imageshack.us,60020,1306276075768
>>>>>>> 2011-05-24 15:31:04,348 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Reassigning 0 region(s) that img645.imageshack.us,60020,1306276075768 was carrying (skipping 0 regions(s) that are already in transition)
>>>>>>> 2011-05-24 15:31:04,348 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of img645.imageshack.us,60020,1306276075768
>>>>>>> 2011-05-24 15:31:06,333 DEBUG org.apache.hadoop.hbase.master.ServerManager: Server img645.imageshack.us,60020,1306276262774 came back up, removed it from the dead servers list
>>>>>>> 2011-05-24 15:31:06,333 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=img645.imageshack.us,60020,1306276262774, regionCount=0, userLoad=false
>>>>>>> 2011-05-24 15:31:49,890 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection opening connection to ZooKeeper with ensemble (img648:2181)
>>>>>>> 2011-05-24 15:31:49,890 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=img648:2181 sessionTimeout=180000 watcher=hconnection
>>>>>>> 2011-05-24 15:31:49,891 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server img648/38.99.76.205:2181
>>>>>>> 2011-05-24 15:31:49,892 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to img648/38.99.76.205:2181, initiating session
>>>>>>> 2011-05-24 15:31:49,893 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server img648/38.99.76.205:2181, sessionid = 0x13024216e690004, negotiated timeout = 180000
>>>>>>> 2011-05-24 15:31:49,894 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: hconnection Received ZooKeeper Event, type=None, state=SyncConnected, path=null
>>>>>>> 2011-05-24 15:31:49,895 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: hconnection-0x13024216e690004 connected
>>>>>>> 2011-05-24 15:31:49,896 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x13024216e690004 Set watcher on existing znode /hbase/master
>>>>>>> 2011-05-24 15:31:49,896 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x13024216e690004 Retrieved 32 byte(s) of data from znode /hbase/master and set watcher; img648.prod.imageshack.com:60000
>>>>>>> 2011-05-24 15:31:49,897 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x13024216e690004 Set watcher on existing znode /hbase/root-region-server
>>>>>>> 2011-05-24 15:31:49,897 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x13024216e690004 Retrieved 26 byte(s) of data from znode /hbase/root-region-server and set watcher; img731.imageshack.us:60020
>>>>>>> 2011-05-24 15:31:49,900 DEBUG org.apache.hadoop.hbase.client.MetaScanner: Scanning .META. starting at row= for max=2147483647 rows
>>>>>>> 2011-05-24 15:31:49,900 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@26f50154; hsa=img731.imageshack.us:60020
>>>>>>> 2011-05-24 15:31:49,913 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is img654.imageshack.us:60020
>>>>>>> 2011-05-24 15:31:50,061 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x13024216e690004
>>>>>>> 2011-05-24 15:31:50,063 INFO org.apache.zookeeper.ZooKeeper: Session: 0x13024216e690004 closed
>>>>>>> 2011-05-24 15:31:50,063 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
>>>>>>>
>>>>>>> Please help :)
>>>>>>>
>>>>>>> -Jack
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
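
The root cause in this thread was that `hostname` and /etc/hosts gave different names for the same machine. As a rough illustration only (this is not how HBase or ZooKeeper resolve names), the Python sketch below compares the names a host reports for itself. The function names `normalize`, `find_mismatches` and `collect_local_names`, and the source labels in the dictionaries, are hypothetical and chosen for this example. It also simplifies the "check eth0" idea: it resolves the local hostname to an IP and reverse-resolves that IP, rather than reading an interface directly.

```python
import socket
from typing import Dict, List, Tuple


def normalize(name: str) -> str:
    """Lower-case a DNS name and drop the trailing dot that `host` prints."""
    return name.strip().rstrip(".").lower()


def find_mismatches(names: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (source, name) pairs that differ from the first source's name."""
    items = [(source, normalize(name)) for source, name in names.items()]
    if not items:
        return []
    reference = items[0][1]
    return [(source, name) for source, name in items[1:] if name != reference]


def collect_local_names() -> Dict[str, str]:
    """Gather the names this machine reports for itself (does name lookups)."""
    hostname = socket.gethostname()
    names = {"hostname": hostname}
    try:
        ip = socket.gethostbyname(hostname)             # forward lookup
        names["reverse"] = socket.gethostbyaddr(ip)[0]  # reverse lookup of that IP
    except OSError as exc:  # covers socket.gaierror and socket.herror
        names["reverse"] = "<lookup failed: {}>".format(exc)
    return names


# Static example using the two names that appear in the master log above.
example = {
    "hostname": "img645.imageshack.us",
    "/etc/hosts": "img645.prod.imageshack.com",
}
print(find_mismatches(example))
```

On a cluster node, calling `find_mismatches(collect_local_names())` would be the live version of the same check. A non-empty result suggests the kind of alias mismatch described in the thread, where one name ends up in the znode path and another in the master's server list.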
