I will dig in Monday, James.  On a cluster restart, deleting state up in zk
is fine; the restart will run w/o previous state.  Deleting state from zk on
a running cluster is bad. It will more than likely mess it up, as regions in
transition kept up in zk are erased.
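
For the trace quoted below, a minimal sketch (plain Python, not part of
HBase) of pulling the address the master tried to reach out of the log, so it
can be checked against the addresses the machine has now. The function name,
the regex and the 10.0.0.5 "office" address are assumptions made up for
illustration; only the remote=... fragment comes from the trace itself.

```python
import re

# Matches the "remote=host/ip:port" fragment seen in the timeout message
# of the quoted trace.
REMOTE_RE = re.compile(
    r"remote=(?P<host>[^/\s]*)/(?P<ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+)"
)

def stale_remotes(log_text, current_ips):
    """Return (ip, port) pairs found in log_text whose ip is not in current_ips."""
    found = []
    for m in REMOTE_RE.finditer(log_text):
        pair = (m.group("ip"), int(m.group("port")))
        if pair[0] not in current_ips and pair not in found:
            found.append(pair)
    return found

# Fragment copied from the trace; 10.0.0.5 is a hypothetical current address.
sample = ("java.nio.channels.SocketChannel[connection-pending "
          "remote=192.168.1.102/192.168.1.102:60020]")
print(stale_remotes(sample, {"10.0.0.5"}))
```

If the list comes back non-empty, the master is still being pointed at an
address the machine no longer has.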

Stack



On Jan 14, 2011, at 10:52, James Kennedy <james.kenn...@troove.net> wrote:

> Negative. I deleted the zookeeper dir and HMaster still managed to pull the 
> wrong IP address from somewhere.
> 
> I don't have a lot of time to really investigate this myself but I'll try to 
> reproduce it with a basic test and log a case for it.
> 
> By the way,  can someone clarify the side-effects of deleting the zookeeper 
> dir like that? I assume it has no ill effect on the data itself especially 
> when the cluster is down. But what is the worst that can happen if you delete 
> the dir while the cluster is running?
> 
> Thanks
> 
> James
> 
> On 2011-01-14, at 9:54 AM, Stack wrote:
> 
>> It does seem like a regression. If you kill the zk data dir and restart the 
>> cluster, does it work? (root location is up in zk)
>> 
>> 
>> Stack
>> 
>> 
>> 
>> On Jan 13, 2011, at 11:37, James Kennedy <james.kenn...@troove.net> wrote:
>> 
>>> I'm currently validating the new 0.90.0 RC3 with the hbase-trx layer and 
>>> our own application.
>>> 
>>> All seems well so far except for the fact that I now find that HBase 
>>> doesn't adapt if I try to run the same data on different machines.
>>> 
>>> e.g.
>>> 1) I work from home and generated our seeded test data.
>>> 2) Run the test suite and all tests pass
>>> 3) I go to the office and re-run the tests.
>>> 
>>> Result: HMaster fails because the -ROOT- data has the wrong IP address for 
>>> locating .META. At least that is my understanding from the stacktrace 
>>> below.  Note that the 192.168.1.102 IP address in that trace is the IP from 
>>> my home network and is incorrect.
>>> 
>>> This wasn't an issue with previous versions of HBase as far as I've 
>>> noticed.  And this seems to be a big data portability fail.
>>> Surely the HMaster should be able to absorb stale metadata and wait for new 
>>> region-servers to check in.
>>> Instead it just keels over and dies.
>>> But before logging a case I wanted to know if there was something I'm 
>>> obviously missing or doing wrong.
>>> 
>>> The seeded test data is on HDFS.
>>> 
>>> Thoughts?
>>> 
>>> 
>>> [13/01/11 10:58:42] 5939   [           main] INFO  ion.service.HBaseRegionService  - troove> Starting region server thread.
>>> [13/01/11 11:00:15] 98699  [        HMaster] FATAL he.hadoop.hbase.master.HMaster  - Unhandled exception. Starting shutdown.
>>> java.net.SocketTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=192.168.1.102/192.168.1.102:60020]
>>>  at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
>>>  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
>>>  at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:311)
>>>  at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:865)
>>>  at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:732)
>>>  at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:258)
>>>  at $Proxy15.getProtocolVersion(Unknown Source)
>>>  at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
>>>  at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
>>>  at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
>>>  at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
>>>  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:954)
>>>  at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:384)
>>>  at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:283)
>>>  at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:478)
>>>  at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:435)
>>>  at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:382)
>>>  at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:277)
>>>  at java.lang.Thread.run(Thread.java:680)
>>> 
>>> 
>>> James Kennedy
>>> Troove Inc.
>>> 
>>> 
> 
