We would like to improve backup creation and restore. The proposed improvements are listed below, and I am seeking feedback from the community:
1) Allow saving backups to different locations/systems: currently, backups are saved to a directory on each member. Users can move those backups elsewhere manually or through scripting, but it would be advantageous to allow direct backups to cloud storage providers (Amazon, Google, Azure, etc.) and possibly other systems. To make this possible, it is proposed to refactor backups into a service-style architecture with backup location plugins that specify the target location. This would allow additional backup strategies to be created as demand is determined, and would let users create their own plugins for their own special use cases.
2) Changing the backup restore procedure: backups create a restore script per member, which must be run on each member the backup is being restored to. The script created is based on the OS of the machine the backup is created on (it mainly moves files to the correct directories). A more flexible system would instead create a metadata file (XML, YAML, etc.) that contains information on the files in the backup. This would allow the logic for moving files and other activities in the backup restore process to be maintained in our codebase in an operating-system-agnostic way. Because the existing script is not dependent on Geode code, old backups would not be affected by this change, though the process for restoring new backups would be (likely using gfsh instead of sh or bat scripts).
3) Improved incremental backups: incremental backup allows for significant space savings and is much quicker to run. However, it suffers from the problem that you can only restore to the latest time the incremental backup was run, as we overwrite user files, cache XML and properties, among other files in the backup directory. By saving this information to timestamped directories, restoring to a specific point in time would be as simple as choosing the newest point in the backup to include in the restore.
Using timestamped directories for normal backups as well would prevent successive backups from overwriting each other.
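
To make the discussion more concrete, here is a rough Java sketch of how a backup location plugin and timestamped backup directories might fit together. None of this is existing Geode API: `BackupDestination`, `FileSystemDestination`, `BackupIds`, and the `yyyy-MM-dd-HH-mm-ss` directory naming are all hypothetical names and choices made up for illustration. A cloud provider plugin would presumably implement the same interface with its own upload logic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical plugin contract: decides where a member's backup files end up. */
interface BackupDestination {
    /**
     * Stores one file of a backup.
     *
     * @param backupId     timestamped identifier of the backup (assumed naming scheme)
     * @param relativePath location of the file inside the backup, e.g. config/cache.xml
     * @param sourceFile   file on the member to copy from
     */
    void store(String backupId, Path relativePath, Path sourceFile) throws IOException;
}

/** Hypothetical file-system plugin: each backup gets its own timestamped directory. */
final class FileSystemDestination implements BackupDestination {
    private final Path root;

    FileSystemDestination(Path root) {
        this.root = root;
    }

    @Override
    public void store(String backupId, Path relativePath, Path sourceFile) throws IOException {
        Path target = root.resolve(backupId).resolve(relativePath);
        Files.createDirectories(target.getParent());
        // Only replaces a file inside the same backupId; other backups are untouched.
        Files.copy(sourceFile, target, StandardCopyOption.REPLACE_EXISTING);
    }
}

/** Hypothetical helper for timestamped backup identifiers. */
final class BackupIds {
    // Zero-padded fields, so lexicographic order matches chronological order.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd-HH-mm-ss").withZone(ZoneOffset.UTC);

    private BackupIds() {}

    static String forInstant(Instant when) {
        return FORMAT.format(when);
    }

    /** Returns the backups needed to restore up to restorePoint, oldest first. */
    static List<String> upTo(List<String> allIds, String restorePoint) {
        return allIds.stream()
            .filter(id -> id.compareTo(restorePoint) <= 0)
            .sorted()
            .collect(Collectors.toList());
    }
}
```

With something like this, restoring to a given point would mean picking the newest timestamp to include, calling `BackupIds.upTo(...)` to get the baseline plus the incrementals up to that point, and applying them in order. Whether the chain is derived from directory names or recorded in the metadata file from item 2 is an open design question.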