How about an option that tells the cleaner to archive them, with compression? WAL files waste a lot of space on repeated information. There are reasons not to enable WAL compression for live files, but I see little reason not to rewrite an archived WAL file in a standard archival compression format like bzip2 if it is being retained only for possible debugging purposes. (Or maybe for a homegrown incremental backup solution built on snapshots and log replay. Or...)
So, a switch that tells the cleaner to archive rather than delete, and maybe another toggle that starts a background task to find archived WALs that are uncompressed and compress them, only removing them once the compressed version is in place. Compress, optionally, in a temporary location with final atomic rename like compaction.

> On Mar 16, 2019, at 7:01 AM, Sean Busbey <[email protected]> wrote:
>
> Hi folks!
>
> Sometimes while working to diagnose an HBase failure in production settings
> I need to ensure WALs stick around so that I can examine or possibly replay
> them. For difficult problems on clusters with plenty of HDFS space relative
> to the HBase write workload sometimes that might mean for days or a week.
>
> The way I've always done this is by setting up placeholder replication
> information for a peer that's disabled. It nicely makes the cleaner chore
> pass over things, doesn't require a restart of anything, and has a
> relatively straight forward way to go back to normal.
>
> Lately I've been thinking that I do this often enough that a command for it
> would be better (kind of like how we can turn the balancer on and off).
>
> How do other folks handle this operational need? Am I just missing an
> easier way?
>
> If a new command is needed, what do folks think the minimally useful
> version is? Keep all WALs until told otherwise? Limit to most recent/oldest
> X bytes? Limit to files that include edits to certain
> namespace/table/region?
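
To make the compress-then-rename step in my reply concrete, here is a rough sketch of what one pass of such a background task might do for a single archived file. It is only an illustration under loud assumptions: it uses the local filesystem via java.nio rather than HDFS, gzip from the JDK rather than bzip2 (which is not in the standard library), and the class and method names (ArchivedWalCompressor, compress) are made up for this example and are not HBase APIs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of HBase: compresses one archived file
// and removes the original only after the compressed copy is in place.
public class ArchivedWalCompressor {

    /**
     * Compresses {@code source} into {@code tmpDir}, atomically renames the
     * result next to the original, then deletes the original.
     * Assumes tmpDir is on the same filesystem as source, because
     * ATOMIC_MOVE fails across filesystems.
     */
    public static Path compress(Path source, Path tmpDir) throws IOException {
        String name = source.getFileName().toString();
        Path tmpPath = tmpDir.resolve(name + ".gz.tmp");
        Path finalPath = source.resolveSibling(name + ".gz");

        // Step 1: write the compressed copy in the temporary location.
        // gzip stands in here for whatever archival format is chosen.
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = new GZIPOutputStream(Files.newOutputStream(tmpPath))) {
            in.transferTo(out);
        }

        // Step 2: atomic rename, so readers never see a partial file.
        Files.move(tmpPath, finalPath, StandardCopyOption.ATOMIC_MOVE);

        // Step 3: only now remove the uncompressed original.
        Files.delete(source);
        return finalPath;
    }
}
```

If the process dies before step 2, the only leftover is a stray file in the temporary directory and the original is untouched, which is the same property the rename at the end of compaction gives us.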
