[ 
https://issues.apache.org/jira/browse/HBASE-19106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235440#comment-16235440
 ] 

Amit Kabra edited comment on HBASE-19106 at 11/2/17 6:08 PM:
-------------------------------------------------------------

Thanks for adding the release version.

But it's a very important feature, IMO.

If we miss any WAL for the backup due to a code bug, etc., or if some part of 
the backup (HFiles, manifests, etc.) gets corrupted afterwards due to cluster 
or code issues, then we should be aware of it.

It would be great to know that the data we just backed up is restorable and 
that the backup was taken correctly. Backups are important and are needed only 
in critical scenarios, hence their validation is important, IMO.


was (Author: amitkabraiiit):
Thanks for adding the release version.

But it's a very important feature, IMO.

Due to the dynamic nature of HBase, where compactions, splits, merges, 
flushes, etc. keep happening all the time, scenarios can arise in production 
where a backup may miss rows, cells, etc. 
Or some part of it (HFiles, manifests, etc.) may get corrupted or deleted due 
to cluster issues. 
It would be great to know that the data we just backed up is restorable and 
that the backup was taken correctly. Backups are important and are needed only 
in critical scenarios, hence their validation is important, IMO.

> Backup self validation for its correctness.
> -------------------------------------------
>
>                 Key: HBASE-19106
>                 URL: https://issues.apache.org/jira/browse/HBASE-19106
>             Project: HBase
>          Issue Type: Improvement
>          Components: backup&restore
>            Reporter: Amit Kabra
>            Priority: Major
>             Fix For: 2.1.0
>
>
> Backups are critical, and if they don't work when we need them at restore 
> time, then they are not useful. We should run a sanity test for each backup 
> job to confirm that it is restorable and hence can be trusted.
> A self-validation feature could be added to backups for this purpose: 
> whenever a backup is run, once it finishes it will trigger a validation job 
> that does a sample restoration of the backed-up data and makes sure 
> that it matches the actual data.
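
The sample-and-compare step proposed in the description could look roughly like the sketch below. It is only an illustration under stated assumptions, not HBase code: `validate_backup_sample`, `source_rows` and `restored_rows` are hypothetical names, and plain dicts stand in for the live table and for a table restored from the backup.

```python
import random


def validate_backup_sample(source_rows, restored_rows, sample_size=100, seed=0):
    """Compare a random sample of source rows against the restored copy.

    Both arguments are plain dicts mapping row key -> cell value. They are
    hypothetical stand-ins for a source table and a table restored from a
    backup; no HBase API is used here.

    Returns the sorted list of sampled row keys that are missing from, or
    differ in, the restored copy. An empty list means the sample matched.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    keys = sorted(source_rows)
    sample = rng.sample(keys, min(sample_size, len(keys)))

    mismatched = []
    for key in sample:
        # A row fails validation if it is absent or its value differs.
        if key not in restored_rows or restored_rows[key] != source_rows[key]:
            mismatched.append(key)
    return sorted(mismatched)


if __name__ == "__main__":
    source = {"row1": "a", "row2": "b", "row3": "c"}
    restored = {"row1": "a", "row2": "CORRUPT"}  # row3 missing, row2 damaged
    print(validate_backup_sample(source, restored, sample_size=10))
```

A real validation job would also have to compare against the state of the table as of the backup, since writes that land after the backup would otherwise show up as false mismatches.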



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
