Hello everyone,

We’ve been discussing an idea internally at Cloudera about implementing continuous backups using the replication workflow. The concept involves writing database edits to external backup storage as soon as they’re written to the database, minimizing the amount of recent data that could be lost when the system fails. This approach would allow recovery to any point in time after accidental deletions, erroneous writes, or data corruption.
Additionally, it could serve as a cost-effective disaster recovery solution. Recovery would take longer than with a fully operational DR cluster, but it would avoid most of the cost of running and maintaining a dedicated DR environment.

The idea is still in its early stages, and we’re working through the finer details. However, we’ve created a document [1] outlining the concept and how it would differ from the current incremental backups. We’d greatly appreciate your feedback in the document, whether it’s about the viability of the idea, areas for improvement, or suggestions to simplify the approach.

[1] https://docs.google.com/document/d/1csQBMyM1mwpe4QpWkCbyqvsC9F5nUBr4ierOo8IuGpE/edit

Regards,
Ankit Singhal