Re: HBase is not ready for Primetime

Robert Gonzalez Tue, 12 Apr 2011 10:53:15 -0700

Ok, deleted logs that master was complaining about, restarted master
only.  Seemed to be stable after a bunch of the messages like the one
below, then restarted regionservers, sans the one that gave me trouble
this morning.  Now seems to be up and running again.  I don't trust
it, seen this kind of "ok, I'm about to make your life suck" behavior
before.  We will see......
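
The restart order described above (master first, then the regionservers,
leaving out the one that misbehaved) can be sketched as a small script.
This is a dry-run sketch under stated assumptions, not a tested procedure:
the names staged_restart, run, DRY_RUN and the host names are hypothetical,
and only the hbase-daemon.sh script and the install path appear in this
thread. Deleting the logs is deliberately left out, since which files were
removed is not stated.

```shell
#!/bin/sh
# Staged restart sketch. Dry-run by default: commands are printed, not executed.
# The HBASE_HOME default is the install path seen in this thread; the host
# names passed to staged_restart are placeholders.
HBASE_HOME="${HBASE_HOME:-/usr/lib/hbase-0.90.0}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: staged_restart SKIP_HOST HOST...
# Starts the master locally, then a regionserver on every HOST except SKIP_HOST.
staged_restart() {
    skip="$1"
    shift
    run "$HBASE_HOME/bin/hbase-daemon.sh" start master
    # In a real run, watch the master log here until it settles
    # before bringing up any regionserver.
    for host in "$@"; do
        if [ "$host" = "$skip" ]; then
            continue
        fi
        run ssh "$host" "$HBASE_HOME/bin/hbase-daemon.sh start regionserver"
    done
}
```

Calling `staged_restart rs3 rs1 rs2 rs3` in dry-run mode only prints the
commands it would issue for the master, rs1 and rs2, which makes the order
easy to review before setting DRY_RUN=0.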




On Tue, Apr 12, 2011 at 12:40 PM, Robert Gonzalez <[email protected]> wrote:
> A bunch of this in the master log:
>
> 2011-04-12 12:38:23,771 WARN org.apache.hadoop.hbase.master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in
> keyvalues={urlhashcopy,E3208173766FDD7C01FE9633E281ED0A,1296085183252.7501ae2b7e933057ea12610c4ec6d001./info:server/1296142856167/Put/vlen=41,
> urlhashcopy,E3208173766FDD7C01FE9633E281ED0A,1296085183252.7501ae2b7e933057ea12610c4ec6d001./info:serverstartcode/1296142856167/Put/vlen=8}
> 2011-04-12 12:38:23,772 WARN org.apache.hadoop.hbase.master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in
> keyvalues={urlhashcopy,E3FBE7AD03D5618BD6AE9E28D4C68FA3,1296085263052.da74b3e6d534a1d2a2f6d75e5bd7686d./info:server/1296142855579/Put/vlen=41,
> urlhashcopy,E3FBE7AD03D5618BD6AE9E28D4C68FA3,1296085263052.da74b3e6d534a1d2a2f6d75e5bd7686d./info:serverstartcode/1296142855579/Put/vlen=8}
> 2011-04-12 12:38:23,773 WARN org.apache.hadoop.hbase.master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in
> keyvalues={urlhashcopy,E5032151B4A9A65D45E961C29ECF3323,1296085338403.1b878d372ca96a8bdd830b7620d31464./info:server/1296142855706/Put/vlen=41,
> urlhashcopy,E5032151B4A9A65D45E961C29ECF3323,1296085338403.1b878d372ca96a8bdd830b7620d31464./info:serverstartcode/1296142855706/Put/vlen=8}
> 2011-04-12 12:38:23,774 WARN org.apache.hadoop.hbase.master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in
> keyvalues={urlhashcopy,E5F2ADD100BDD9791417FEC48997213F,1296085429914.6aeb9c1db827acc7a7969d3b2c8470a7./info:server/1296142855577/Put/vlen=41,
> urlhashcopy,E5F2ADD100BDD9791417FEC48997213F,1296085429914.6aeb9c1db827acc7a7969d3b2c8470a7./info:serverstartcode/1296142855577/Put/vlen=8}
>
>
>
> On Tue, Apr 12, 2011 at 12:38 PM, Gary Helmling <[email protected]> wrote:
>> Robert,
>>
>> You can stop the daemons individually on each node:
>>
>> bin/hbase-daemon.sh stop master
>> bin/hbase-daemon.sh stop regionserver
>>
>> Use this to stop the processes that can be shut down cleanly.  Then let's
>> look at which processes are still hanging and what the logs of the hanging
>> processes are showing.
>>
>> Thanks,
>> Gary
>>
>>
>> On Tue, Apr 12, 2011 at 10:34 AM, Robert Gonzalez <[email protected]> wrote:
>>
>>> You mean like this:
>>>
>>> hbase@c1-m02:/usr/lib/hbase-0.90.0/bin$ ./stop-hbase.sh
>>> stopping hbase............................................................
>>>
>>> Still going.....   :(
>>>
>>> On Tue, Apr 12, 2011 at 12:01 PM, Jinsong Hu <[email protected]> wrote:
>>> > You probably should stop all master/regionservers, then start one master,
>>> > tail -f the log to confirm all the hlogs are handled,
>>> >
>>> > then start the first regionserver, and then other regionservers.
>>> >
>>> > I have encountered this issue before.
>>> > hbase is not as good as you want, but not as bad as you say either. The
>>> > truth is in between.
>>> >
>>> > Jimmy
>>> >
>>> > --------------------------------------------------
>>> > From: "Robert Gonzalez" <[email protected]>
>>> > Sent: Tuesday, April 12, 2011 9:49 AM
>>> > To: <[email protected]>
>>> > Subject: HBase is not ready for Primetime
>>> >
>>> >> We've been using HBase for about a year, consistently running into
>>> >> problems where we lost data.  After reading forums and some back and
>>> >> forth with other Hbase users, we changed our data methodology to save
>>> >> less data per row.  This last time, we upgraded to 0.90 at the
>>> >> recommendation of the hbase community, cleared off all our data, and
>>> >> started over.  Seemed to be running ok for a couple of months, until
>>> >> this morning.  One of the regionservers stopped responding to data
>>> >> requests and we tried to restart it to no avail.  Then we shut down our
>>> >> processes so that nothing was using HBase and we shut down HBase and
>>> >> brought it back up.  We waited a little bit, until hbase status
>>> >> indicated that all the servers were back up.  We turned on our
>>> >> processes and lo and behold, HBase is broken, getting
>>> >> org.apache.hadoop.hbase.NotServingRegionException:
>>> >> org.apache.hadoop.hbase.NotServingRegionException: Region is not
>>> >> online: -ROOT-,,0
>>> >>    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2319)
>>> >>    at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1607)
>>> >>    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>>> >>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >>    at java.lang.reflect.Method.invoke(Method.java:597)
>>> >>    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
>>> >>    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1036)
>>> >>
>>> >> And now we can't even shut it down.
>>> >>
>>> >> Seems that Hbase is just too flaky to depend on for a serious system,
>>> >> we've not had this type of problem to this degree with conventional DB
>>> >> systems. Now that we are not saving that much data (we are using large
>>> >> hdfs files for that) in Hbase, we are probably going to move back to a
>>> >> conventional SQL system for our control data.  We just can't afford to
>>> >> be constantly fighting problems like this.
>>> >>
>>> >>
>>> >> --
>>> >>
>>> >> Robert Gonzalez
>>> >>
>>> >> Maxpoint Interactive
>>> >>
>>> >
>>>
>>>
>>>
>>> --
>>>
>>>
>>>     Robert Gonzalez / Senior Software Architect
>>>
>>>     7600 Burnet Road, Suite 500
>>>     Austin, TX 78757
>>>     T 512 981 9561    F 919 882 8529
>>>     [email protected]
>>>
>>
>
>
>
>



