Write it up James. Others will probably trip on it too.

Good stuff,
St.Ack

On Fri, Jan 21, 2011 at 4:44 PM, James Kennedy <[email protected]> wrote:
> Aha that stupid dot!
>
> My /etc/hosts file looks pretty standard:
>
> 127.0.0.1 localhost
>
> ::1 ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> ff00::0 ip6-mcastprefix
> ff02::1 ip6-allnodes
> ff02::2 ip6-allrouters
> ff02::3 ip6-allhosts
>
> However look what I found in the data-seed-specific hbase-site.xml
>
> <property>
>   <name>hbase.master.dns.interface</name>
>   <value>lo</value>
> </property>
> <property>
>   <name>hbase.regionserver.dns.interface</name>
>   <value>lo</value>
> </property>
>
> Not sure why we had that in there originally but taking it out fixes the
> problem. Both sides now resolve hregioninfo to "localhost" instead of
> "localhost.". I have no idea how specifying the lo interface adds a period
> to the localhost name but that sounds like a bug to me. Shall I report it or
> is this a known issue?
>
> Thanks for your help,
>
> James Kennedy
> Project Manager
> Troove Inc.
>
> On 2011-01-21, at 1:34 PM, Jean-Daniel Cryans wrote:
>
>> There's some sort of mismatch:
>>
>> RegionServer ephemeral node deleted, processing expiration
>> [localhost.,60020,1295592845214]
>>
>> and
>>
>> Waiting on regionserver(s) to go down localhost,60020,1295592845214
>>
>>
>> Do you see the dot after "localhost" in the first line? I wonder how
>> it got different in the znode and in ServerManager.onlineServers... In
>> any case, I'm pretty sure you can get it working by playing with your
>> /etc/hosts
>>
>> J-D
>>
>> On Thu, Jan 20, 2011 at 11:28 PM, James Kennedy
>> <[email protected]> wrote:
>>> I've come across a strange bug that I'm having trouble debugging.
>>>
>>> Basically I have a seed application that is executed via maven and runs a
>>> single JVM ApplicationStarter that starts up hdfs, regionserver, hmaster
>>> threads. It does some seeding then shuts those down in reverse order.
>>>
>>> So this isn't a typical way of running hbase to be sure. However it has
>>> always worked until I upgraded to HBase 0.90.0.
>>>
>>> I didn't notice it when I was originally testing 0.90.0 because it only
>>> seems to be happening on our EC2.small build server node when I run this
>>> particular seeder.
>>>
>>> Running the same thing locally on my mac works.
>>>
>>> Attached is the error output starting from when the HRegionServer.stop() is
>>> called to when HMaster.shutdown() is called and it starts looping forever in
>>> letRegionServersShutdown().
>>>
>>> It looks like RegionServerTracker is getting to "RegionServer ephemeral node
>>> deleted, processing expiration" but then because it can't get the
>>> HServerInfo it doesn't follow-through with actually expiring it.
>>>
>>> Does anyone have any ideas as to why this might be happening?
>>>
>>>
>>> Thanks,
>>> James Kennedy
>>> Project Manager
>>> Troove Inc.
>>>
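
For anyone who trips on this later, here is a minimal sketch (in Python, not HBase's Java) of why the stray dot in the thread above matters. It assumes the master looks servers up by an exact "host,port,startcode" string, as the two log lines suggest. The dict and helper names are hypothetical and are not HBase's actual data structures.

```python
def server_name(host, port, start_code):
    """Build a 'host,port,startcode' key like the ones in the log lines."""
    return f"{host},{port},{start_code}"


def strip_trailing_dot(host):
    """Drop a single trailing dot from a hostname, if present."""
    return host[:-1] if host.endswith(".") else host


# Hypothetical master-side registry, keyed by the name WITHOUT the dot.
online_servers = {server_name("localhost", 60020, 1295592845214): "server info"}

# Name as it appeared in the znode, WITH the trailing dot.
znode_name = server_name("localhost.", 60020, 1295592845214)

# Exact-match lookup misses, so the expiration has nothing to act on.
print(online_servers.get(znode_name))

# Normalizing the host part before the lookup makes both sides agree.
host, port, start_code = znode_name.split(",")
normalized = server_name(strip_trailing_dot(host), port, start_code)
print(online_servers.get(normalized))
```

The first lookup finds nothing and the second finds the entry, which matches the symptom of the master waiting on "localhost,60020,..." while the expiration arrives for "localhost.,60020,...".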
