Re: Data disappears and re-appears again after HBase cluster restart

Stack Tue, 27 Jul 2010 10:41:51 -0700

On Tue, Jul 27, 2010 at 10:26 AM, Vladimir Rodionov
<vrodio...@carrieriq.com> wrote:
> Yes, we set timestamps on all Puts. The vast majority of timestamps are in
> the past (several minutes before now()) and only a small fraction is in the
> future (and this future will never come - it's pretty close to Long.MAX_VALUE)


When you run a scan, do you set the starttime to include these Puts
that are in the future?
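
To make the question concrete, here is a minimal sketch of how a time-range filter on a scan treats past and far-future timestamps. It is a toy model in Python, not the HBase client API: `Cell`, `scan`, and the sample rows are hypothetical names for illustration, and the half-open range [min, max) with a default upper bound of Long.MAX_VALUE is an assumption about how the range behaves. In the Java client the corresponding setting should be `Scan.setTimeRange`.

```python
from typing import NamedTuple

LONG_MAX = 2**63 - 1  # value of Java's Long.MAX_VALUE


class Cell(NamedTuple):
    row: str
    timestamp: int  # milliseconds since the epoch
    value: str


def scan(cells, min_ts=0, max_ts=LONG_MAX):
    """Toy scan: keep cells whose timestamp is in the half-open range [min_ts, max_ts)."""
    return [c for c in cells if min_ts <= c.timestamp < max_ts]


now = 1_280_252_511_000  # illustrative "current time" in ms
cells = [
    Cell("r1", now - 5 * 60_000, "a few minutes in the past"),
    Cell("r2", LONG_MAX - 1000, "close to Long.MAX_VALUE"),
]

# Default range: both the past and the far-future cell are returned.
all_rows = [c.row for c in scan(cells)]

# Range capped at "now": the far-future cell is filtered out.
up_to_now = [c.row for c in scan(cells, max_ts=now + 1)]

print(all_rows)
print(up_to_now)
```

In this model a scan that caps its upper bound at the current time never sees the far-future Puts, while a scan with the default range does.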

> But we now have clocks synced on all servers, so I do not think this can
> explain the issue. Besides this, we do not set timestamps when we do inserts
> into one particular table, and the data in this table disappears as well
> (and reappears after restart)
>

This I cannot explain.  I don't see this phenomenon at all.  Restart
should have no effect on the data being carried by the cluster.  Can
you dig around some more and get us some more data points?

St.Ack

> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodio...@carrieriq.com
>
> ________________________________________
> From: saint....@gmail.com [saint....@gmail.com] On Behalf Of Stack 
> [st...@duboce.net]
> Sent: Monday, July 26, 2010 11:05 PM
> To: dev@hbase.apache.org
> Subject: Re: Data disappears and re-appears again after HBase cluster restart
>
> Vladimir:
>
> Are you setting times on cells you add to HBase?  If so, could these
> be in the future as far as the RegionServer is concerned?  For
> example, perhaps you are setting the version/timestamp on a client
> whose clock is different from that over on the RegionServer, then when
> we scan, we miss these future values?
>
> Do you have to restart the cluster?  What happens if you just wait?
> Does the data come back then?
>
> St.Ack
>
>
> On Mon, Jul 26, 2010 at 6:14 PM, Vladimir Rodionov
> <vrodio...@carrieriq.com> wrote:
>> We are running ntpd on all servers and clocks are in sync now but it has not 
>> fixed the problem.
>> I run the flow, then check
>>
>> hbase shell
>> > count 'tableX'
>> 0 rows
>>
>> After an HBase restart I am able to get the 'right' number of rows in a
>> table.
>>
>> For some tables I get a wrong number of rows that is always less than the
>> actual number of rows; for others I get 0 rows.
>> The problem always goes away after an HBase restart. All tables are small
>> in size and all are newly created during our flow execution.
>>
>> I have checked the Master and RegionServer log files many times, but apart
>> from:
>>
>> NotServingRegionException -META- (or -ROOT-) I can see nothing suspicious.
>>
>> In the RegionServer log files I see a lot of messages like this one:
>> 2010-07-26 17:05:43,751 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
>> Finished memstore flush of ~114.4k for region 
>> 10__HB_NOINC_ORCL_JDBC_0726_MEJOMEJO-ERROR_COUNTS-1280187791424-0,,1280187802112
>>  in 985ms, sequence id=309833, compaction requested=false
>>
>> This is during the cluster's shutdown operation.
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: vrodio...@carrieriq.com
>>
>> ________________________________________
>> From: jdcry...@gmail.com [jdcry...@gmail.com] On Behalf Of Jean-Daniel 
>> Cryans [jdcry...@apache.org]
>> Sent: Thursday, July 22, 2010 5:43 PM
>> To: dev@hbase.apache.org
>> Subject: Re: Data disappears and re-appears again after HBase cluster restart
>>
>> Data doesn't disappear, it's probably just hidden behind a delete or
>> something like that (the user mailing list contains reports of events
>> like that that were fixed by running NTP on all machines, as required
>> by the Getting Started guide
>> http://hbase.apache.org/docs/r0.20.5/api/overview-summary.html#requirements).
>>
>> This article gives good info about timestamps in HBase:
>> http://outerthought.org/blog/417-ot.html
>>
>> J-D
>>
>> On Thu, Jul 22, 2010 at 5:29 PM, Vladimir Rodionov
>> <vrodio...@carrieriq.com> wrote:
>>> Yes, I just checked all 3 servers and their clocks are not synchronized (up 
>>> to 2 min diff)
>>> Can you please elaborate a little bit more:  how can this result in data 
>>> disappearance?
>>>
>>> Best regards,
>>> Vladimir Rodionov
>>> Principal Platform Engineer
>>> Carrier IQ, www.carrieriq.com
>>> e-mail: vrodio...@carrieriq.com
>>>
>>> ________________________________________
>>> From: jdcry...@gmail.com [jdcry...@gmail.com] On Behalf Of Jean-Daniel 
>>> Cryans [jdcry...@apache.org]
>>> Sent: Thursday, July 22, 2010 4:38 PM
>>> To: dev@hbase.apache.org
>>> Subject: Re: Data disappears and re-appears again after HBase cluster 
>>> restart
>>>
>>> I would guess clock skew. Do all the machines have approx the same time?
>>> A few seconds is acceptable, but not more.
>>>
>>> J-D
>>>
>>> On Thu, Jul 22, 2010 at 4:34 PM, Vladimir Rodionov
>>> <vrodio...@carrieriq.com> wrote:
>>>> Has anybody encountered this particular bug before?
>>>> We have been seeing it intermittently in our small QA cluster.
>>>>
>>>> We run a flow which is basically a custom ETL process over data stored in
>>>> HDFS. Yes, it is a bunch of M/R jobs.
>>>> One of the jobs stores data into HBase (0.20.3); the next one loads data
>>>> from HBase (using scan), performs additional transformations,
>>>> and finally stores the data into an RDBMS.
>>>>
>>>> Flow works fine (most of the time). It means that new HBase tables are 
>>>> created, data is loaded and can be read after that during the next M/R job
>>>>
>>>> After the flow finishes, data in the tables (but not the tables
>>>> themselves) sometimes mysteriously disappears. This is not deterministic,
>>>> and to get the data back we need to RESTART the HBase cluster.
>>>> So an HBase restart fixes the problem.
>>>>
>>>> The cluster is small (3 servers). RAM is limited - 8GB. There are only 2
>>>> CPU cores per server, but the input data size is small as well and the
>>>> average size of the disappearing tables is several thousand rows, so
>>>> they are small. Hadoop is from CDH2. I cannot give you any additional
>>>> helpful information at this time (no log files), but maybe somebody has
>>>> encountered this
>>>> before and has an idea how to fix it.
>>>>
>>>>
>>>> Best regards,
>>>> Vladimir Rodionov
>>>> Principal Platform Engineer
>>>> Carrier IQ, www.carrieriq.com
>>>> e-mail: vrodio...@carrieriq.com
>>>>
>>>
>>
>
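
J-D's point quoted above, that data may be hidden behind a delete rather than gone, can be sketched with a small toy model. This is not HBase code and does not claim to be the cause of the problem reported in this thread: `visible` and the sample values are hypothetical, and the model assumes only that a delete marker masks every version whose timestamp is at or below the marker's own timestamp. The 2-minute skew is the figure reported earlier in the thread.

```python
SKEW_MS = 120_000  # "up to 2 min diff" between server clocks, as reported


def visible(puts, delete_ts=None):
    """Toy read path: a delete marker hides every put with timestamp <= delete_ts.

    puts is a list of (timestamp_ms, value) pairs for one cell.
    """
    if delete_ts is None:
        return sorted(puts)
    return sorted(p for p in puts if p[0] > delete_ts)


t = 1_000_000  # arbitrary reference time in ms

# A put written 10 s after the delete in real time, stamped by an accurate clock.
put = (t + 10_000, "new row")

# Delete stamped by a clock running 2 minutes fast: it lands "after" the put.
hidden = visible([put], delete_ts=t + SKEW_MS)

# Same sequence with synchronized clocks: the delete is stamped t, before the put.
synced = visible([put], delete_ts=t)

print(hidden)
print(synced)
```

With the skewed clock the later write is masked even though it arrived after the delete; with synchronized clocks it is returned as expected.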
