I logged https://issues.apache.org/jira/browse/HBASE-2882
On Tue, Jul 27, 2010 at 10:41 AM, Stack <st...@duboce.net> wrote:

> On Tue, Jul 27, 2010 at 10:26 AM, Vladimir Rodionov
> <vrodio...@carrieriq.com> wrote:
> > Yes, we set timestamps on all Puts. The vast majority of timestamps
> > are in the past (several minutes from now()) and only small fraction
> > is in the future (and this future will never come - It's pretty close
> > to Long.MAX_VALUE)
>
> When you run a scan, do you set the starttime to include these Puts
> that are in the future?
>
> > But we have now clocks synced on all servers so I do not think this
> > can explain the issue. Besides this, we do not set timestamps when we
> > do inserts into one particular table and this table disappears as well
> > (and reappears after restart)
> >
>
> This I cannot explain.  I don't see this phenomenon at all.  Restart
> should have no effect on the data being carried by the cluster.  Can
> you dig around some more and get us some more data points?
>
> St.Ack
>
> > Best regards,
> > Vladimir Rodionov
> > Principal Platform Engineer
> > Carrier IQ, www.carrieriq.com
> > e-mail: vrodio...@carrieriq.com
> >
> > ________________________________________
> > From: saint....@gmail.com [saint....@gmail.com] On Behalf Of Stack
> > [st...@duboce.net]
> > Sent: Monday, July 26, 2010 11:05 PM
> > To: dev@hbase.apache.org
> > Subject: Re: Data disappears and re-appears again after HBase cluster
> > restart
> >
> > Vladimir:
> >
> > Are you setting times on cells you add to HBase?  If so, could these
> > be in the future as far as the regionserver is concerned.  For
> > example, perhaps you are setting the version/timestamp on a client
> > whose clock is different from that over on the RegionServer, then when
> > we scan, we miss these future values?
> >
> > Do you have to restart the cluster?  What happens if you just wait?
> > Does the data come back then?
> >
> > St.Ack
> >
> >
> > On Mon, Jul 26, 2010 at 6:14 PM, Vladimir Rodionov
> > <vrodio...@carrieriq.com> wrote:
> >> We are running ntpd on all servers and clocks are in sync now but it
> >> has not fixed the problem.
> >> I run the flow, then check
> >>
> >> hbase shell
> >> > count 'tableX'
> >> 0 rows
> >>
> >> after HBase restart I am able to get the 'right' number of rows in a
> >> table
> >>
> >> For some tables I get wrong number of rows that is always less than
> >> the actual number of rows, for others I get - 0 rows.
> >> It always goes away after HBase restart. All tables are small in size
> >> and all are newly created during our flow execution.
> >>
> >> I have checked many times Master and Region server's log files but
> >> apart from:
> >>
> >> RegionNotServingException -META- (or -ROOT-) I can see nothing
> >> suspicious.
> >>
> >> In Region servers log files I see a lot of messages like this one:
> >> 2010-07-26 17:05:43,751 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~114.4k for region 10__HB_NOINC_ORCL_JDBC_0726_MEJOMEJO-ERROR_COUNTS-1280187791424-0,,1280187802112 in 985ms, sequence id=309833, compaction requested=false
> >>
> >> This is during the cluster's shutdown operation.
> >>
> >> Best regards,
> >> Vladimir Rodionov
> >> Principal Platform Engineer
> >> Carrier IQ, www.carrieriq.com
> >> e-mail: vrodio...@carrieriq.com
> >>
> >> ________________________________________
> >> From: jdcry...@gmail.com [jdcry...@gmail.com] On Behalf Of
> >> Jean-Daniel Cryans [jdcry...@apache.org]
> >> Sent: Thursday, July 22, 2010 5:43 PM
> >> To: dev@hbase.apache.org
> >> Subject: Re: Data disappears and re-appears again after HBase cluster
> >> restart
> >>
> >> Data doesn't disappear, it's probably just hidden behind a delete or
> >> something like that (the user mailing list contains reports of events
> >> like that that were fixed by running NTP on all machines, as required
> >> by the Getting Started guide
> >> http://hbase.apache.org/docs/r0.20.5/api/overview-summary.html#requirements).
> >>
> >> This article gives good info about timestamps in HBase
> >> http://outerthought.org/blog/417-ot.html
> >>
> >> J-D
> >>
> >> On Thu, Jul 22, 2010 at 5:29 PM, Vladimir Rodionov
> >> <vrodio...@carrieriq.com> wrote:
> >>> Yes, I just checked all 3 servers and their clocks are not
> >>> synchronized (up to 2 min diff)
> >>> Can you please elaborate a little bit more: how can this result in
> >>> data disappearance?
> >>>
> >>> Best regards,
> >>> Vladimir Rodionov
> >>> Principal Platform Engineer
> >>> Carrier IQ, www.carrieriq.com
> >>> e-mail: vrodio...@carrieriq.com
> >>>
> >>> ________________________________________
> >>> From: jdcry...@gmail.com [jdcry...@gmail.com] On Behalf Of
> >>> Jean-Daniel Cryans [jdcry...@apache.org]
> >>> Sent: Thursday, July 22, 2010 4:38 PM
> >>> To: dev@hbase.apache.org
> >>> Subject: Re: Data disappears and re-appears again after HBase cluster
> >>> restart
> >>>
> >>> I would guess clock skew, all the machines have approx the same time?
> >>> A few seconds is acceptable, but not more.
> >>>
> >>> J-D
> >>>
> >>> On Thu, Jul 22, 2010 at 4:34 PM, Vladimir Rodionov
> >>> <vrodio...@carrieriq.com> wrote:
> >>>> Has anybody encountered this particular bug before?
> >>>> We have been having this intermittently in our QA small cluster.
> >>>>
> >>>> We run a flow which is basically custom ETL process over data
> >>>> stored in hdfs. Yes it is a bunch of M/R jobs.
> >>>> One of the jobs stores data into HBase (0.20.3), the next one loads
> >>>> data from HBase (using scan) performs additional transformations
> >>>> and stores data finally into RDBMS.
> >>>>
> >>>> Flow works fine (most of the time). It means that new HBase tables
> >>>> are created, data is loaded and can be read after that during the
> >>>> next M/R job
> >>>>
> >>>> After flow finishes, data from tables (but not the tables
> >>>> themselves) sometimes mysteriously disappears. This is not
> >>>> deterministic and to get data back we need to RESTART HBase cluster.
> >>>> So HBase restart fixes the problem.
> >>>>
> >>>> Cluster is small (3 servers). RAM is limited - 8GB. Only 2 CPU
> >>>> cores per server but input data size is small as well and the
> >>>> average size of disappearing tables is several 1000s rows -
> >>>> they are small. Hadoop is from CDH2. I can not get you any
> >>>> additional helpful information at the time (no log files), but
> >>>> maybe somebody has encountered this before and has idea how to
> >>>> fix it.
> >>>>
> >>>>
> >>>> Best regards,
> >>>> Vladimir Rodionov
> >>>> Principal Platform Engineer
> >>>> Carrier IQ, www.carrieriq.com
> >>>> e-mail: vrodio...@carrieriq.com
> >>>>
> >>>
> >>
> >
>
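
For anyone reading this thread in the archive, here is a rough sketch of the two timestamp effects raised above: a put "hidden behind a delete" when the delete marker carries a later timestamp (for example from a skewed clock), and a far-future put falling outside a scan's time range. This is a toy model in plain Java, not HBase client code. The class TimestampMaskingSketch, its Cell type, the put/delete/read helpers and the timestamp values are all made up for illustration, and the half-open range [minStamp, maxStamp) is an assumption about how a scan time range behaves. It does not explain why a restart would bring the rows back.

```java
import java.util.Arrays;
import java.util.List;

public class TimestampMaskingSketch {

    /** Toy cell for a single row/column: a put with a value, or a delete marker. */
    static final class Cell {
        final long timestamp;
        final boolean isDelete;
        final String value;

        Cell(long timestamp, boolean isDelete, String value) {
            this.timestamp = timestamp;
            this.isDelete = isDelete;
            this.value = value;
        }
    }

    static Cell put(long ts, String value) { return new Cell(ts, false, value); }

    static Cell delete(long ts) { return new Cell(ts, true, null); }

    /**
     * Returns the newest visible value with minStamp <= timestamp < maxStamp,
     * or null if nothing is visible. Any put whose timestamp is at or below
     * the newest delete marker's timestamp is treated as masked.
     */
    static String read(List<Cell> cells, long minStamp, long maxStamp) {
        long newestDelete = Long.MIN_VALUE;
        for (Cell c : cells) {
            if (c.isDelete) {
                newestDelete = Math.max(newestDelete, c.timestamp);
            }
        }
        Cell best = null;
        for (Cell c : cells) {
            if (c.isDelete || c.timestamp <= newestDelete) continue;          // masked by a delete
            if (c.timestamp < minStamp || c.timestamp >= maxStamp) continue;  // outside the time range
            if (best == null || c.timestamp > best.timestamp) best = c;
        }
        return best == null ? null : best.value;
    }

    public static void main(String[] args) {
        long now = 1_280_000_000_000L; // arbitrary wall-clock value in milliseconds

        // A delete stamped by a clock running 2 minutes ahead, then (in real time)
        // a put stamped by a correct clock: the put sorts below the delete marker.
        List<Cell> skewed = Arrays.asList(delete(now + 120_000L), put(now + 1_000L, "v1"));
        System.out.println(read(skewed, 0L, Long.MAX_VALUE)); // null

        // A put stamped close to Long.MAX_VALUE is only seen if the range reaches that far.
        List<Cell> future = Arrays.asList(put(Long.MAX_VALUE - 10, "far-future"));
        System.out.println(read(future, 0L, now));            // null
        System.out.println(read(future, 0L, Long.MAX_VALUE)); // far-future
    }
}
```

In this model the data is never lost; it is only invisible to a read until the masking delete or the restrictive time range goes away.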