I would let the cleaner chore handle the cleanup for you. You don't know the state of all entries in that folder. For that reason, I'd avoid making any direct changes to the content of HBase's working directory, especially while HBase is running...
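
To make the reasoning concrete, here is a rough Python sketch of the decision the cleaner chore makes, as described further down the thread: an archived WAL is only eligible for deletion once it is older than the TTL and no existing replication peer still references it. This is a toy model, not HBase code. The names `ArchivedWal`, `queued_for_peers` and `deletable` are made up for illustration; only the `hbase.master.logcleaner.ttl` property and its 600000 ms default come from the thread.

```python
from dataclasses import dataclass, field

# Default of hbase.master.logcleaner.ttl as quoted in the thread (10 minutes).
DEFAULT_TTL_MS = 600_000


@dataclass
class ArchivedWal:
    """Hypothetical stand-in for one file in the oldWALs folder."""
    name: str
    age_ms: int  # time since the file was archived
    # Ids of replication peers that still have this file queued (assumption).
    queued_for_peers: set = field(default_factory=set)


def ttl_allows_delete(wal, ttl_ms=DEFAULT_TTL_MS):
    # The TTL check keeps every file for at least ttl_ms.
    return wal.age_ms > ttl_ms


def replication_allows_delete(wal, existing_peers):
    # A disabled peer still exists, so it still holds its files.
    # Only removing the peer takes it out of existing_peers.
    return not (wal.queued_for_peers & existing_peers)


def deletable(wals, existing_peers, ttl_ms=DEFAULT_TTL_MS):
    """Names of the files that every check agrees can be deleted."""
    return [
        w.name
        for w in wals
        if ttl_allows_delete(w, ttl_ms)
        and replication_allows_delete(w, existing_peers)
    ]


wals = [
    ArchivedWal("wal-a", age_ms=5_000),                 # too young
    ArchivedWal("wal-b", age_ms=900_000,
                queued_for_peers={"indexer"}),          # held by a peer
    ArchivedWal("wal-c", age_ms=900_000),               # nothing holds it
]

print(deletable(wals, existing_peers={"indexer"}))  # peer disabled but present
print(deletable(wals, existing_peers=set()))        # after the peer is removed
```

In this model a disabled peer behaves exactly like an enabled one, which is why the folder keeps growing until the peer is removed. Manually deleting files skips both checks, which is the risk I'm pointing at above.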

On Fri, Feb 27, 2015 at 1:29 PM, Liam Slusser <[email protected]> wrote:

> Once I disable/remove the replication, can I just blow away the oldWALs
> folder safely?
>
> On Fri, Feb 27, 2015 at 3:10 AM, Madeleine Piffaretti <[email protected]> wrote:
>
> > Thanks a lot!
> >
> > Indeed, we had replication enabled in the past because we used the
> > hbase-indexer from NgData (used to replicate data from HBase to Solr).
> > Replication had been disabled for a long time, but the hbase-indexer
> > peer was still activated and so, as you mentioned, the data was kept to
> > guarantee that nothing is lost between disable and enable.
> >
> > I have removed the peer and emptied the oldWALs folder.
> >
> > 2015-02-27 1:42 GMT+01:00 Liam Slusser <[email protected]>:
> >
> > > Huge thanks, Enis, that was the information I was looking for.
> > >
> > > Cheers!
> > > liam
> > >
> > > On Thu, Feb 26, 2015 at 3:48 PM, Enis Söztutar <[email protected]> wrote:
> > >
> > > > @Madeleine,
> > > >
> > > > The folder gets cleaned regularly by a chore in master. When a WAL
> > > > file is not needed any more for recovery purposes (when HBase can
> > > > guarantee that it has flushed all the data in the WAL file), it is
> > > > moved to the oldWALs folder for archival. The log stays there until
> > > > all other references to the WAL file are finished. There are
> > > > currently two services which may keep the files in the archive dir.
> > > > The first is a TTL process, which ensures that the WAL files are
> > > > kept for at least 10 min. This is mainly for debugging. You can
> > > > reduce this time by setting the hbase.master.logcleaner.ttl
> > > > configuration property in master. It is 600000 by default. The
> > > > other one is replication. If you have replication set up, the
> > > > replication processes will hang on to the WAL files until they are
> > > > replicated. Even if you disabled the replication, the files are
> > > > still referenced.
> > > >
> > > > You can look at the logs from master from classes (LogCleaner,
> > > > TimeToLiveLogCleaner, ReplicationLogCleaner) to see whether the
> > > > master is actually running this chore and whether it is getting any
> > > > exceptions.
> > > >
> > > > @Liam,
> > > > Disabled replication will still hold on to the WAL files because it
> > > > has a guarantee to not lose data between disable and enable. You can
> > > > remove_peer, which frees up the WAL files to be eligible for
> > > > deletion. When you re-add the replication peer, the replication will
> > > > start from the current status, whereas if you re-enable a peer, it
> > > > will continue from where it left off.
> > > >
> > > > On Thu, Feb 26, 2015 at 12:56 AM, Madeleine Piffaretti <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > The replication is not turned on in HBase...
> > > > > Should this folder be cleaned regularly? Because I have data from
> > > > > December 2014...
> > > > >
> > > > > 2015-02-26 1:40 GMT+01:00 Liam Slusser <[email protected]>:
> > > > >
> > > > > > I'm having this same problem. I had replication enabled but it
> > > > > > has since been disabled. However, oldWALs still grows. There
> > > > > > are so many files in there that running
> > > > > > "hadoop fs -ls /hbase/oldWALs" runs out of memory.
> > > > > >
> > > > > > On Wed, Feb 25, 2015 at 9:27 AM, Nishanth S <[email protected]> wrote:
> > > > > >
> > > > > > > Do you have replication turned on in hbase and if so is your
> > > > > > > slave consuming the replicated data?
> > > > > > >
> > > > > > > -Nishanth
> > > > > > >
> > > > > > > On Wed, Feb 25, 2015 at 10:19 AM, Madeleine Piffaretti <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > We are running out of space in our small hadoop cluster so I
> > > > > > > > was checking disk usage on HDFS and I saw that most of the
> > > > > > > > space was occupied by the */hbase/oldWALs* folder.
> > > > > > > >
> > > > > > > > I have checked in the "HBase Definitive Book" and other
> > > > > > > > books and web sites, and I have also searched for my issue
> > > > > > > > on Google, but I didn't find a proper response...
> > > > > > > >
> > > > > > > > So I would like to know what this folder does, what it is
> > > > > > > > used for, and also how I can free space from this folder
> > > > > > > > without breaking everything...
> > > > > > > >
> > > > > > > > If it's related to a specific version... our cluster is
> > > > > > > > under 5.3.0-1.cdh5.3.0.p0.30 from cloudera (hbase 0.98.6).
> > > > > > > >
> > > > > > > > Thx for your help!
