On 11/1/17 8:22 PM, Sean Busbey wrote:
On Wed, Nov 1, 2017 at 7:08 PM, Vladimir Rodionov
<[email protected]> wrote:
There is no way to validate the correctness of a backup in the general case.
You can restore the backup into a temp table, but then what? Read rows
one-by-one from the temp table and look them up in the primary table?
That won't work, because rows can have been deleted or modified since the
last backup was taken.
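(For concreteness, the naive check being ruled out here would look something
like the sketch below, written against the standard HBase client API; the
table names are hypothetical. The comments mark exactly where it breaks down.)

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;

    public class NaiveBackupCheck {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             // "backup_tmp" is the table the backup was restored into,
             // "prod" is the live table; both names are made up.
             Table restored = conn.getTable(TableName.valueOf("backup_tmp"));
             Table live = conn.getTable(TableName.valueOf("prod"));
             ResultScanner scanner = restored.getScanner(new Scan())) {
          for (Result backupRow : scanner) {
            Result liveRow = live.get(new Get(backupRow.getRow()));
            // The flaw: a missing or mismatched row here does NOT mean
            // the backup is bad -- the row may simply have been deleted
            // or rewritten in the live table after the backup ran.
            if (liveRow.isEmpty()) {
              System.out.println("row absent from live table (corruption, "
                  + "or just a legitimate delete? can't tell)");
            }
          }
        }
      }
    }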
This is why we have snapshots, no?
True, we could try to take a snapshot at exactly the moment the backup was
taken (likely still difficult to coordinate on an active system), but in
what reality would we actually want to do this? Most users I see are so
concerned about the cost of running compactions (which actually make
performance better!) that they wouldn't give up a non-negligible portion
of their computing power and available space to re-instantiate their
data (at least once) just to make sure a copy worked correctly.
We have WALs, HFiles, and some metadata we'd export in a backup, right?
Why not intrinsically perform some validation that things like headers,
trailers, etc. still exist on the files we exported (e.g., open the file,
read the header, seek to the end, verify the trailer) -- something like
the sketch below. I feel like that's a much more tenable solution, one
that doesn't impose a ridiculous burden like restoring tables of modest
size and above.
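Here's a minimal sketch of such a check, assuming the HBase 1.x HFile
reader API (the class and method names I wrap it in are just illustrative).
The nice part is that opening the reader already forces the fixed file
trailer to be read and version-checked, so a truncated or garbled export
fails fast without scanning any data:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.io.hfile.CacheConfig;
    import org.apache.hadoop.hbase.io.hfile.HFile;

    public class ExportedHFileCheck {
      // Cheap structural validation of one exported HFile. createReader()
      // seeks to the end of the file, reads the fixed file trailer, and
      // checks its magic and version before returning, so a truncated or
      // corrupt file throws here rather than at restore time.
      public static boolean isStructurallySound(Configuration conf, Path hfile) {
        try {
          FileSystem fs = hfile.getFileSystem(conf);
          try (HFile.Reader reader =
              HFile.createReader(fs, hfile, new CacheConfig(conf), conf)) {
            reader.loadFileInfo(); // also pull the file-info block
            System.out.println(hfile + ": trailer OK, "
                + reader.getTrailer().getEntryCount() + " entries");
            return true;
          }
        } catch (IOException e) {
          System.err.println(hfile + ": failed structural check: " + e);
          return false;
        }
      }
    }

(HFilePrettyPrinter -- the 'hbase hfile' tool -- does roughly this when it
prints file metadata, so the cluster already ships most of this logic.)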
This smells like it's really about verifying a distcp rather than verifying
backups. There is certainly something we can do to give a reasonable
level of confidence that doesn't involve reconstituting the whole thing.
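For example, distcp-style verification boils down to comparing lengths and
file checksums between the source and the copy. A minimal sketch using the
stock Hadoop FileSystem API (the class name is mine, and this assumes both
sides expose comparable checksums, e.g. HDFS-to-HDFS with matching block
sizes):

    import java.io.IOException;
    import org.apache.hadoop.fs.FileChecksum;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BackupCopyVerifier {
      // Length check first (cheap), then the composite file checksum,
      // which is the same signal distcp's CRC verification uses.
      public static boolean sameFile(FileSystem srcFs, Path src,
                                     FileSystem dstFs, Path dst)
          throws IOException {
        if (srcFs.getFileStatus(src).getLen()
            != dstFs.getFileStatus(dst).getLen()) {
          return false;
        }
        FileChecksum srcSum = srcFs.getFileChecksum(src);
        FileChecksum dstSum = dstFs.getFileChecksum(dst);
        if (srcSum == null || dstSum == null) {
          // Filesystem doesn't expose a checksum; a length match is
          // the strongest claim we can make here.
          return true;
        }
        return srcSum.equals(dstSum);
      }
    }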