Sean -

Very likely.  Funny that Cloudera Manager 5.2.1 was able to upgrade to
CDH 5.3 with parcels, and I just noticed it is now saying CDH 5.3.2 is
available for download.  So while it might not be a "supported
configuration", they sure don't stop you from doing it, nor does it say
anything about it not being supported.

Maybe somebody else running Cloudera Manager 5.3+ can chime in and say
whether they see the same behavior.  I'm heading out for a much-needed
vacation for the next week, but once I get back I'll upgrade our
Cloudera Manager to 5.3 and see if that bug is fixed.

thanks!
liam



On Wed, Mar 11, 2015 at 2:05 PM, Sean Busbey <[email protected]> wrote:

> (Apologies for hitting send too soon, and for more vendor-specific info.)
>
> Liam, FYI CM 5.2.z running CDH5.3.z isn't a supported configuration[1] and
> might be the source of your problem.
>
> [1]:
>
> http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/pcm_cdh_cm.html
>
> On Wed, Mar 11, 2015 at 4:02 PM, Sean Busbey <[email protected]> wrote:
>
> > Thanks for the follow up Liam! I'll add this as a bug against CDH.
> >
> > On Wed, Mar 11, 2015 at 3:58 PM, Liam Slusser <[email protected]> wrote:
> >
> >> I just wanted to update this thread, as after more investigation I've
> >> figured out why my oldWALs folder wasn't being cleaned up.  I had a
> >> look at the code of ReplicationLogCleaner and it makes this call:
> >>
> >>     if (!config.getBoolean(HConstants.REPLICATION_ENABLE_KEY,
> >>         HConstants.REPLICATION_ENABLE_DEFAULT)) {
> >>       LOG.warn("Not configured - allowing all wals to be deleted");
> >>       return;
> >>     }
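> >>
> >> A minimal standalone sketch of that check (an illustration only; it
> >> assumes the HBase jars and the cluster's hbase-site.xml are on the
> >> classpath, and the class name is made up):
> >>
> >>     import org.apache.hadoop.conf.Configuration;
> >>     import org.apache.hadoop.hbase.HBaseConfiguration;
> >>     import org.apache.hadoop.hbase.HConstants;
> >>
> >>     public class ReplicationFlagCheck {
> >>       public static void main(String[] args) {
> >>         // create() loads hbase-site.xml found on the classpath
> >>         Configuration config = HBaseConfiguration.create();
> >>         // the same lookup ReplicationLogCleaner performs
> >>         boolean enabled = config.getBoolean(
> >>             HConstants.REPLICATION_ENABLE_KEY,
> >>             HConstants.REPLICATION_ENABLE_DEFAULT);
> >>         System.out.println("hbase.replication resolves to: " + enabled);
> >>       }
> >>     }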
> >>
> >> I searched through my logs and was never able to find that line of
> >> text.  So I wrote a quick program to run that piece of code, and sure
> >> enough it came back true.  getBoolean returns the value if it has been
> >> defined, and if not, returns the default.  And after reading
> >> HBASE-3489, replication is enabled by default these days, which I also
> >> verified by looking at HConstants.REPLICATION_ENABLE_DEFAULT.  I run
> >> Cloudera CDH 5.3, and even with HBase replication set to false in the
> >> user interface, it wasn't putting "hbase.replication" = false in the
> >> configuration file.  I manually set hbase.replication to false in the
> >> advanced configuration for hbase-site.xml and restarted HBase, and
> >> sure enough it deleted all the logs!
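> >>
> >> For anyone else hitting this, the property added through the
> >> hbase-site.xml advanced configuration snippet looks like this (a
> >> standard hbase-site.xml property block):
> >>
> >>     <property>
> >>       <name>hbase.replication</name>
> >>       <value>false</value>
> >>     </property>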
> >>
> >> So this is probably a bug in CDH, at least in the version that I ran.
> >> I'm running Cloudera Manager 5.2.1 with CDH5.3.0-1.cdh5.3.0.p0.30.
> >>
> >> thanks,
> >> liam
> >>
> >>
> >> On Wed, Mar 4, 2015 at 5:18 PM, Liam Slusser <[email protected]> wrote:
> >>
> >> > So after removing all the replication peers, HBase still doesn't
> >> > want to clean up the oldWALs folder.  In the master logs I don't see
> >> > any errors from ReplicationLogCleaner or LogCleaner.  I have my
> >> > logging set to INFO, so I'd think I would see something.
> >> >
> >> > Is there any way to run the ReplicationLogCleaner manually and see
> >> > the output?  Can I write something that calls the right API
> >> > functions?
> >> >
> >> > thanks,
> >> > liam
> >> >
> >> >
> >> > On Fri, Feb 27, 2015 at 1:50 PM, Nick Dimiduk <[email protected]> wrote:
> >> >
> >> >> I would let the cleaner chore handle the cleanup for you. You don't
> >> >> know the state of all entries in that folder. To that end, I'd
> >> >> avoid making any direct changes to the content of HBase's working
> >> >> directory, especially while HBase is running...
> >> >>
> >> >> On Fri, Feb 27, 2015 at 1:29 PM, Liam Slusser <[email protected]> wrote:
> >> >>
> >> >> > Once I disable/remove the replication, can I just blow away the
> >> >> > oldWALs folder safely?
> >> >> >
> >> >> > On Fri, Feb 27, 2015 at 3:10 AM, Madeleine Piffaretti <[email protected]> wrote:
> >> >> >
> >> >> > > Thanks a lot!
> >> >> > >
> >> >> > > Indeed, we had replication enabled in the past because we used
> >> >> > > the hbase-indexer from NGDATA (used to replicate data from
> >> >> > > HBase to Solr).  Replication had been disabled for a long time,
> >> >> > > but the hbase-indexer peer was still active, so, as you
> >> >> > > mentioned, the data was kept to guarantee no data loss between
> >> >> > > disable and enable.
> >> >> > >
> >> >> > > I have removed the peer and emptied the oldWALs folder.
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > 2015-02-27 1:42 GMT+01:00 Liam Slusser <[email protected]>:
> >> >> > >
> >> >> > > > Huge thanks, Enis, that was the information I was looking for.
> >> >> > > >
> >> >> > > > Cheers!
> >> >> > > > liam
> >> >> > > >
> >> >> > > >
> >> >> > > > On Thu, Feb 26, 2015 at 3:48 PM, Enis Söztutar <[email protected]> wrote:
> >> >> > > >
> >> >> > > > > @Madeleine,
> >> >> > > > >
> >> >> > > > > The folder gets cleaned regularly by a chore in the master.
> >> >> > > > > When a WAL file is not needed any more for recovery purposes
> >> >> > > > > (when HBase can guarantee it has flushed all the data in the
> >> >> > > > > WAL file), it is moved to the oldWALs folder for archival.
> >> >> > > > > The log stays there until all other references to the WAL
> >> >> > > > > file are finished.  There are currently two services which
> >> >> > > > > may keep the files in the archive dir.  The first is a TTL
> >> >> > > > > process, which ensures that the WAL files are kept for at
> >> >> > > > > least 10 min.  This is mainly for debugging.  You can reduce
> >> >> > > > > this time by setting the hbase.master.logcleaner.ttl
> >> >> > > > > configuration property in the master.  It is 600000 (ms) by
> >> >> > > > > default.  The other one is replication.  If you have
> >> >> > > > > replication set up, the replication processes will hang on
> >> >> > > > > to the WAL files until they are replicated.  Even if you
> >> >> > > > > disabled the replication, the files are still referenced.
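> >> >> > > > >
> >> >> > > > > For example, a sketch of lowering the TTL to one minute in
> >> >> > > > > hbase-site.xml (the value is in milliseconds; 60000 here is
> >> >> > > > > just an illustration):
> >> >> > > > >
> >> >> > > > >     <property>
> >> >> > > > >       <name>hbase.master.logcleaner.ttl</name>
> >> >> > > > >       <value>60000</value>
> >> >> > > > >     </property>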
> >> >> > > > >
> >> >> > > > > You can look at the master logs from the classes
> >> >> > > > > (LogCleaner, TimeToLiveLogCleaner, ReplicationLogCleaner) to
> >> >> > > > > see whether the master is actually running this chore and
> >> >> > > > > whether it is getting any exceptions.
> >> >> > > > >
> >> >> > > > > @Liam,
> >> >> > > > > Disabled replication will still hold on to the WAL files,
> >> >> > > > > because it guarantees not to lose data between disable and
> >> >> > > > > enable.  You can remove_peer, which frees up the WAL files
> >> >> > > > > to be eligible for deletion.  When you re-add the
> >> >> > > > > replication peer again, replication will start from the
> >> >> > > > > current state, whereas if you re-enable a peer, it will
> >> >> > > > > continue from where it left off.
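> >> >> > > > >
> >> >> > > > > A sketch of doing that from the HBase shell (the peer id
> >> >> > > > > '1' is just a placeholder; list_peers shows your actual
> >> >> > > > > ids):
> >> >> > > > >
> >> >> > > > >     hbase> list_peers
> >> >> > > > >     hbase> remove_peer '1'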
> >> >> > > > >
> >> >> > > > >
> >> >> > > > >
> >> >> > > > > On Thu, Feb 26, 2015 at 12:56 AM, Madeleine Piffaretti <[email protected]> wrote:
> >> >> > > > >
> >> >> > > > > > Hi,
> >> >> > > > > >
> >> >> > > > > > Replication is not turned on in HBase...
> >> >> > > > > > Should this folder be cleaned regularly?  Because I have
> >> >> > > > > > data from December 2014...
> >> >> > > > > >
> >> >> > > > > >
> >> >> > > > > > 2015-02-26 1:40 GMT+01:00 Liam Slusser <[email protected]>:
> >> >> > > > > >
> >> >> > > > > > > I'm having this same problem.  I had replication
> >> >> > > > > > > enabled, but it has since been disabled.  However,
> >> >> > > > > > > oldWALs still grows.  There are so many files in there
> >> >> > > > > > > that running "hadoop fs -ls /hbase/oldWALs" runs out of
> >> >> > > > > > > memory.
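> >> >> > > > > > >
> >> >> > > > > > > As a workaround sketch for the listing OOM (assuming it
> >> >> > > > > > > is the client JVM that runs out of heap), the fs
> >> >> > > > > > > shell's heap can be raised:
> >> >> > > > > > >
> >> >> > > > > > >     # give the hadoop fs client JVM more heap
> >> >> > > > > > >     export HADOOP_CLIENT_OPTS="-Xmx4g"
> >> >> > > > > > >     hadoop fs -ls /hbase/oldWALs | head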
> >> >> > > > > > >
> >> >> > > > > > > On Wed, Feb 25, 2015 at 9:27 AM, Nishanth S <[email protected]> wrote:
> >> >> > > > > > >
> >> >> > > > > > > > Do you have replication turned on in HBase, and if
> >> >> > > > > > > > so, is your slave consuming the replicated data?
> >> >> > > > > > > >
> >> >> > > > > > > > -Nishanth
> >> >> > > > > > > >
> >> >> > > > > > > > On Wed, Feb 25, 2015 at 10:19 AM, Madeleine Piffaretti <[email protected]> wrote:
> >> >> > > > > > > >
> >> >> > > > > > > > > Hi all,
> >> >> > > > > > > > >
> >> >> > > > > > > > > We are running out of space in our small Hadoop
> >> >> > > > > > > > > cluster, so I was checking disk usage on HDFS and I
> >> >> > > > > > > > > saw that most of the space was occupied by the
> >> >> > > > > > > > > /hbase/oldWALs folder.
> >> >> > > > > > > > >
> >> >> > > > > > > > > I have checked in the "HBase Definitive Book" and
> >> >> > > > > > > > > other books and web sites, and I have also searched
> >> >> > > > > > > > > for my issue on Google, but I didn't find a proper
> >> >> > > > > > > > > answer...
> >> >> > > > > > > > >
> >> >> > > > > > > > > So I would like to know what this folder is, what
> >> >> > > > > > > > > it is used for, and also how I can free space from
> >> >> > > > > > > > > it without breaking everything...
> >> >> > > > > > > > >
> >> >> > > > > > > > >
> >> >> > > > > > > > > If it's related to a specific version... our
> >> >> > > > > > > > > cluster is running 5.3.0-1.cdh5.3.0.p0.30 from
> >> >> > > > > > > > > Cloudera (HBase 0.98.6).
> >> >> > > > > > > > >
> >> >> > > > > > > > > Thx for your help!
> >> >> > > > > > > > >
> >> >> > > > > > > >
> >> >> > > > > > >
> >> >> > > > > >
> >> >> > > > >
> >> >> > > >
> >> >> > >
> >> >> >
> >> >>
> >> >
> >> >
> >>
> >
> >
> >
> > --
> > Sean
> >
>
>
>
> --
> Sean
>
