There is a desire to improve backup creation and restore. The proposed
improvements are listed below, and I am seeking feedback from the community:

1) Allow saving of backups to different locations/systems: currently,
backups are saved to a directory on each member. Users can move those
backups elsewhere manually or with scripts, but it would be advantageous
to allow direct backups to cloud storage providers (Amazon, Google, Azure,
etc.) and possibly other systems. To make this possible, it is proposed to
refactor backups into a service-style architecture with backup location
plugins that can be used to specify the target location.
This would allow creation of additional backup strategies as demand is
determined and allow users to create their own plugins for their own
special use cases.
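
To make the plugin idea in item 1 more concrete, here is a minimal Java
sketch. The names BackupDestination and FileSystemBackupDestination are
hypothetical and do not exist in the codebase today; they only illustrate
one possible shape for the interface. A cloud provider plugin would
implement the same interface on top of that provider's SDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical plugin interface: decides where a member's backup files go. */
interface BackupDestination {
  /** Stores one backup file under a path relative to the backup root. */
  void write(String relativePath, InputStream data) throws IOException;
}

/** Hypothetical plugin that mirrors today's behavior: a local directory. */
class FileSystemBackupDestination implements BackupDestination {
  private final Path baseDir;

  FileSystemBackupDestination(Path baseDir) {
    this.baseDir = baseDir.toAbsolutePath().normalize();
  }

  @Override
  public void write(String relativePath, InputStream data) throws IOException {
    Path target = baseDir.resolve(relativePath).normalize();
    // Reject paths such as "../x" that would land outside the backup root.
    if (!target.startsWith(baseDir) || target.equals(baseDir)) {
      throw new IOException("Invalid backup path: " + relativePath);
    }
    Files.createDirectories(target.getParent());
    // Fails if the file already exists, so an earlier backup is not overwritten.
    Files.copy(data, target);
  }
}
```

The backup service would only depend on the interface, so adding a new
target location would not require changes to the backup logic itself.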

2) Changing the backup restore procedure: backups currently create one
restore script per member, and each script must be run on the member being
restored. The script is generated for the OS of the machine the backup is
created on (it mainly moves files to the correct directories). A more
flexible system would instead create a metadata file (XML, YAML, etc.)
which contains
information on the files in the backup. This would allow the logic for
moving files and other activities in the backup restore process to be
maintained in our codebase in an operating-system-agnostic way. Because the
existing script does not depend on Geode code, old backups would not be
affected by this change, though the process for restoring new backups would
change (likely to gfsh instead of sh or bat scripts).
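
As a rough sketch of what metadata-driven restore in item 2 could look
like, the Java below copies files according to a list of entries instead
of a generated script. BackupEntry and ManifestRestorer are hypothetical
names, and parsing the metadata file into entries is left out because the
file format is still undecided.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

/** Hypothetical metadata entry: one backed-up file and where it belongs. */
final class BackupEntry {
  final String pathInBackup;  // relative to the backup directory
  final String restoreTarget; // location the file should be restored to

  BackupEntry(String pathInBackup, String restoreTarget) {
    this.pathInBackup = pathInBackup;
    this.restoreTarget = restoreTarget;
  }
}

/** Hypothetical restore logic that replaces the per-OS sh/bat script. */
final class ManifestRestorer {
  static void restore(Path backupDir, List<BackupEntry> entries) throws IOException {
    for (BackupEntry entry : entries) {
      Path source = backupDir.resolve(entry.pathInBackup);
      Path target = Paths.get(entry.restoreTarget);
      if (target.getParent() != null) {
        Files.createDirectories(target.getParent());
      }
      // Without REPLACE_EXISTING this throws if the target already exists,
      // so a restore never silently overwrites live files.
      Files.copy(source, target);
    }
  }
}
```

Because java.nio.file handles path separators and copying on every
platform, the same restore code would work regardless of the OS the
backup was taken on.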

3) Improved incremental backups: incremental backup allows for significant
space savings and is much quicker to run. However, a backup can only be
restored to the time of the most recent incremental backup, because each
run overwrites user files, cache XML, and properties, among other files in
the backup directory. By saving this information to timestamped
directories, restoring to a specific point in time would be as simple as
choosing the newest point in the backup to include in the restore. Using
timestamped directories for normal backups as well would prevent successive
backups from overwriting each other.
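
To illustrate how a restore point could be chosen in item 3, here is a
small Java sketch. It assumes, purely for illustration, that each backup
run writes to a directory named with the pattern uuuu-MM-dd-HH-mm-ss and
that the oldest directory holds the baseline backup. RestorePointSelector
is a hypothetical name.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical helper that picks which timestamped directories to restore. */
final class RestorePointSelector {
  // Assumed directory naming scheme; the real format is not yet decided.
  static final DateTimeFormatter DIR_FORMAT =
      DateTimeFormatter.ofPattern("uuuu-MM-dd-HH-mm-ss");

  /**
   * Returns the directories taken at or before the restore point,
   * oldest first, so they can be applied in order.
   */
  static List<String> directoriesToRestore(List<String> dirNames,
                                           LocalDateTime restorePoint) {
    TreeMap<LocalDateTime, String> byTime = new TreeMap<>();
    for (String name : dirNames) {
      byTime.put(LocalDateTime.parse(name, DIR_FORMAT), name);
    }
    // headMap(..., true) keeps every entry up to and including the restore point.
    return new ArrayList<>(byTime.headMap(restorePoint, true).values());
  }
}
```

For example, with backups taken on three consecutive days, asking for a
restore point at noon on the second day would select the first two
directories and skip the third.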
