I also tried hbase-0.92 with hadoop-1.0.0 (same configuration as before) and it works fine (meaning no data loss).
With hbase-0.90.3/hadoop-0.20-append, I checked my append configuration and ran the unit tests successfully. Maybe the master starts hlog processing, then blocks on something (only .META. and -ROOT- are online), and upon restart it does not resume properly?

On Wed, Feb 22, 2012 at 2:29 AM, lars hofhansl <[email protected]> wrote:

> I tried on trunk, and this scenario seems to work fine.
>
> In fact I first forgot to enable appends (I just switched my local end to
> hadoop-1.0.0), and without that I did in fact lose the edits.
> With appends enabled this works as designed.
>
> Might still be a timing issue or only occur in 0.90.x.
> Maybe somebody else could mine the attached logs for clues?
>
> -- Lars
>
> ------------------------------
> *From:* lars hofhansl <[email protected]>
> *To:* Manuel de Ferran <[email protected]>; "[email protected]"
> <[email protected]>
> *Sent:* Tuesday, February 21, 2012 4:38 PM
> *Subject:* Re: Flushing to HDFS sooner
>
> You still should not lose data this way.
> Looks like something that is easily reproducible. I'll try with the latest
> trunk.
>
> -- Lars
>
> ________________________________
> From: Manuel de Ferran <[email protected]>
> To: [email protected]
> Cc: lars hofhansl <[email protected]>
> Sent: Tuesday, February 21, 2012 3:51 AM
> Subject: Re: Flushing to HDFS sooner
>
> On Mon, Feb 20, 2012 at 9:43 PM, Stack <[email protected]> wrote:
> > On Mon, Feb 20, 2012 at 11:58 AM, lars hofhansl <[email protected]> wrote:
> >> Are there any messages about log replay when you restart the region server?
> >
> > What Lars says; what does it say in the logs on the master on restart?
> > St.Ack
>
> I just noticed that if I restart the master before starting the
> regionserver, I still have my rows and I have "recovered.edits" entries in
> the master log:
>
> 2012-02-21 10:09:55,541 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Closed path hdfs://lxc167.nightly-dev.com:9100/hbase/.META./1028785192/recovered.edits/0000000000000029079 (wrote 4 edits in 22ms)
>
> During my previous tests, I was doing the following:
> - kill datanode
> - start datanode
> - start regionserver
> - restart master (because .META. and -ROOT- were the only regions online):
>   I did not realize that I could lose any data doing that.
>
> Here are the master logs for both cases:
> http://pastebin.com/KD03P0pD : master restart before regionserver
> http://pastebin.com/FvYaBMdm : master restart after regionserver
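
In case it helps anyone comparing the two master logs, here is a rough sketch for pulling out the recovered.edits files that the log splitter reports as written. It assumes every relevant entry looks like the single HLogSplitter line quoted above (on one unwrapped line, as in the raw log file). The function name `recovered_edits` is just a name picked for this example, not anything from HBase.

```python
import re

# Assumed format, taken from the one HLogSplitter line quoted in this thread:
#   "... HLogSplitter: Closed path <path> (wrote N edits in Tms)"
CLOSED_PATH = re.compile(
    r"HLogSplitter: Closed path\s+(?P<path>\S+)\s+"
    r"\(wrote\s+(?P<edits>\d+) edits in (?P<ms>\d+)ms\)"
)


def recovered_edits(log_text):
    """Return a list of (path, edit_count) for each 'Closed path' entry found."""
    return [
        (m.group("path"), int(m.group("edits")))
        for m in CLOSED_PATH.finditer(log_text)
    ]
```

Running it over each of the two master logs and comparing the lists should show whether the "master restart after regionserver" case wrote any recovered.edits at all. An empty list there would be consistent with the log split not being resumed, though it would not prove it.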
