[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459784#comment-17459784
 ] 

Pablo Francisco Pérez Hidalgo commented on ZOOKEEPER-4425:
----------------------------------------------------------

Huge thanks for taking a look into this, [~symat], and for the insightful 
response!

In the absolutely disastrous scenario I used as an example, caused by 
mishandled backup and clean-up processes, I assumed that no data files are 
present at all (neither transaction logs nor snapshots), so the only data 
available is what certain instances hold in memory. In this situation, being 
able to restore anything, even a partially updated tree, is a huge win.

Doesn't a fuzzy snapshot alone allow a node to start up with potential partial 
data loss?

Even if it doesn't, this "escape pod" command would help with situations where 
the snapshots are lost completely (especially when the snapshot and translog 
directories are different and potentially stored on different storage devices).
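
For illustration, here is a minimal sketch of how a client could send such a 
command. It relies only on the way existing four letter word commands work 
(the client writes the four characters over a plain TCP connection to the 
client port, and the server replies and closes the connection). The helper 
name send_four_letter_word is hypothetical, and "snap" is the command proposed 
in this issue, so it is only an assumption that a given server accepts it.

```python
import socket


def send_four_letter_word(host: str, port: int, command: str,
                          timeout: float = 5.0) -> str:
    """Send a four letter word command and return the raw text reply.

    Hypothetical helper for illustration; not part of any ZooKeeper client.
    """
    if len(command) != 4:
        raise ValueError("a four letter word command must be exactly 4 characters")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The command is sent as-is, with no terminator.
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Against a surviving observer, the call would look something like 
send_four_letter_word("observer-host", 2181, "snap"), assuming the patch is 
applied and the command is enabled on that server. The host name and port 
here are placeholders.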

> 4lw Command: On demand snapshot
> -------------------------------
>
>                 Key: ZOOKEEPER-4425
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4425
>             Project: ZooKeeper
>          Issue Type: New Feature
>          Components: server
>            Reporter: Pablo Francisco Pérez Hidalgo
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> While working on disaster recovery scenarios at work, we found that being 
> able to tell a ZooKeeper instance to take a snapshot, thus dumping its view 
> of the internal database into files, could be a last-resort escape hatch 
> against potential data loss.
> As an example, imagine that all the voting members of the ensemble are wiped 
> out due to a wrong deployment configuration change. A single surviving 
> observer could hold in its memory the last copy of the most recently updated 
> ensemble data. Sending it a _*snap*_ *four letter words command* that forces 
> it to save a snapshot of that information to disk could be a very convenient 
> way of recovering the database.
>  
> This issue aims to discuss the addition of this feature and to serve as the 
> gate for an already available patch providing it.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
