That's quite a good argument :) -- there's a difference between occasional verification for building confidence and full data verification for every single backup (which is how HBASE-19106 read to me). I still think the latter (i.e., 19106 as literally asked) would be unwieldy; however, the ability to do it ad hoc as you describe has benefits.

Also makes me wonder how reusable VerifyReplication is at its core (I mean, it's more or less the same comparison under the hood, right?).
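
To make that concrete, here's a minimal, single-threaded sketch of the core comparison. VerifyReplication runs the equivalent inside a MapReduce job; the table names below are placeholders, and I'm glossing over rows that only exist on one side:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch only: compare a restored copy against the original, row by row.
// VerifyReplication does essentially this comparison, just parallelized as
// a MapReduce job. "orig" and "restored" are placeholder table names.
public class CompareTables {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table source = conn.getTable(TableName.valueOf("orig"));
         Table target = conn.getTable(TableName.valueOf("restored"));
         ResultScanner sourceScanner = source.getScanner(new Scan());
         ResultScanner targetScanner = target.getScanner(new Scan())) {
      long badRows = 0;
      for (Result expected : sourceScanner) {
        Result actual = targetScanner.next();
        if (actual == null) {
          System.out.println("Missing row: "
              + Bytes.toStringBinary(expected.getRow()));
          badRows++;
          continue;
        }
        try {
          // Same check VerifyReplication relies on: throws if the rows differ.
          Result.compareResults(expected, actual);
        } catch (Exception e) {
          System.out.println("Mismatch at row "
              + Bytes.toStringBinary(expected.getRow()) + ": " + e.getMessage());
          badRows++;
        }
      }
      System.out.println("Rows with differences: " + badRows);
    }
  }
}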

Let's continue to hash out what we think the scope of a data verification "feature" should be and then get that put up on 19106. This is good.

On 11/1/17 11:32 PM, Andrew Purtell wrote:
Potential adopters will absolutely want to construct for themselves a 
verifiable live exercise. Tooling that lets you do that against a snapshot 
would be the way to go, I think. Once you do that exercise, probably a few 
times, you can trust the backup solution enough for restore into production, 
where verification may or may not be possible.

A user who claims they'd rather not verify their backup solution works on 
account of performance concerns shouldn't be taken seriously. (Not that you 
would (smile))


On Nov 1, 2017, at 7:55 PM, Josh Elser <[email protected]> wrote:



On 11/1/17 8:22 PM, Sean Busbey wrote:
On Wed, Nov 1, 2017 at 7:08 PM, Vladimir Rodionov
<[email protected]> wrote:
There is no way to validate the correctness of a backup in the general case.

You can restore a backup into a temp table, but then what? Read rows one by one
from the temp table and look them up in the primary table? That won't work,
because rows can be deleted or modified since the last backup was taken.

This is why we have snapshots, no?

True, we could try to take a snapshot exactly when the backup was taken 
(likely, still difficult to coordinate on an active system), but in what 
reality would we actually want to do this? Most users I see are so concerned 
about the cost of running compactions (which actually make performance 
better!) that they wouldn't give up a non-negligible portion of their computing 
power and available space to re-instantiate their data (at least once) just to 
make sure a copy worked correctly.

We have WALs, HFiles, and some metadata we'd export in a backup, right? Why not 
intrinsically perform some validation that things like headers, trailers, etc. 
still exist on the files we exported (e.g., open the file, read the header, seek 
to the end, verify the trailer)? I feel like that's a much more tenable solution, 
one that doesn't carry a ridiculous burden like restoring tables of modest size 
and above.

This smells like it's really asking to verify a distcp rather than verifying backups. 
There is certainly something we can do to give a reasonable level of confidence 
that doesn't involve reconstituting the whole thing.
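
For example, a rough sketch (nothing more): compare the lengths and, where both filesystems can produce them, the checksums of the source and copied files, which is roughly the kind of check distcp itself does:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: distcp-style verification of a single copied file -- compare
// lengths, then checksums when both filesystems can produce one (some, like
// the local filesystem, return null). Paths are passed in as arguments.
public class VerifyCopiedFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path src = new Path(args[0]);
    Path dst = new Path(args[1]);
    FileSystem srcFs = src.getFileSystem(conf);
    FileSystem dstFs = dst.getFileSystem(conf);

    long srcLen = srcFs.getFileStatus(src).getLen();
    long dstLen = dstFs.getFileStatus(dst).getLen();
    if (srcLen != dstLen) {
      System.out.println("Length mismatch: " + srcLen + " vs " + dstLen);
      return;
    }

    FileChecksum srcSum = srcFs.getFileChecksum(src);
    FileChecksum dstSum = dstFs.getFileChecksum(dst);
    if (srcSum != null && dstSum != null && !srcSum.equals(dstSum)) {
      System.out.println("Checksum mismatch for " + dst);
    } else {
      System.out.println(dst + " looks like a faithful copy of " + src
          + ((srcSum == null || dstSum == null) ? " (length check only)" : ""));
    }
  }
}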
