Sorry to have misspelled your name, Henning.

On Mon, Sep 3, 2018, 1:26 PM Ryanne Dolan <ryannedo...@gmail.com> wrote:
> Hanning,
>
> In mission-critical (and indeed GDPR-related) applications, I've ETL'd
> Kafka to a secondary store, e.g. HDFS, and built tooling around recovering
> state back into Kafka. I've had situations where data is accidentally or
> incorrectly ingested into Kafka, causing downstream systems to process bad
> data. This, in my experience, is astronomically more likely than the other
> DR scenarios you describe. But my approach is the same:
>
> - don't treat Kafka as a source-of-truth. It is hard to fix data in an
> append-only log, so we can't trust it to always be correct.
>
> - ETL Kafka topics to a read-only, append-only, indexable log, e.g. in
> HDFS, and then build tooling to reingest data from HDFS back into Kafka.
> That way, in the event of disaster, data can be recovered from cold
> storage. Don't rely on unlimited retention in Kafka.
>
> - build everything around log compaction, keys, and idempotency. People
> think compaction is just to save space, but it is also the only way to
> layer new records over existing records in an otherwise append-only
> environment. I've built pipelines that let me surgically remove or fix
> records at rest and then submit them back to Kafka. Since these records
> have the same key, processors will treat them as replacements for earlier
> records. Additionally, processors should honor timestamps and/or sequence
> IDs rather than the actual order of records in a partition. That way, old
> data can be ingested from HDFS -> Kafka idempotently.
>
> Imagine that one record out of millions is bad and you don't notice it
> for months. You can grab this record from HDFS, modify it, and then submit
> it back to Kafka. Even though it is in the stream months later than real
> time, processors will treat it as a replacement for the old bad record,
> and the entire system will end up exactly as if the record had never been
> bad. If you can achieve these semantics consistently, DR is
> straightforward.
>
> - don't worry too much about offsets with respect to consumer progress.
> Timestamps are usually more important. If the above is in place, it
> doesn't matter if you skip some records during failover. Just reprocess a
> few hours from cold storage and it's like the failure never happened.
>
> Ryanne
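The fix-and-resubmit approach described above can be sketched with the plain
Kafka producer API in Java. The topic name, key, and corrected payload below
are placeholders; the essential point is that the corrected record is
re-published under its original key and original event timestamp, so a
compacted topic and timestamp-aware consumers treat it as superseding the bad
record rather than as a new event:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class ResubmitCorrectedRecord {
        public static void main(String[] args) throws Exception {
            // Corrected record, pulled from cold storage (e.g. HDFS) and fixed
            // offline. Key, timestamp, and payload are placeholders.
            String originalKey = "customer-42";
            long originalEventTimestamp = 1530000000000L; // event time of the bad record
            String correctedValue = "{\"customerId\": 42, \"consent\": true}";

            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");
            props.put("acks", "all");                // wait for all in-sync replicas
            props.put("enable.idempotence", "true"); // no duplicates on producer retry
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                // Same key -> same partition, and on a compacted topic the old value
                // is eventually discarded. Re-using the original event timestamp lets
                // consumers that order by event time treat this as a replacement.
                producer.send(new ProducerRecord<>(
                        "customer-events",   // placeholder topic
                        null,                // let the default partitioner hash the key
                        originalEventTimestamp,
                        originalKey,
                        correctedValue)).get();
            }
        }
    }

Removing a record outright (the GDPR erasure case) is the same call with a
null value, i.e. a tombstone that compaction eventually removes along with the
earlier values for that key.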
> On Mon, Sep 3, 2018, 9:49 AM Henning Røigaard-Petersen <h...@edlund.dk>
> wrote:
>
>> I am looking for advice on how to handle disasters not covered by the
>> official methods of replication, whether intra-cluster replication (via
>> replication factor and producer acks) or multi-cluster replication (using
>> Confluent Replicator).
>>
>> We are looking into using Kafka not only as a broker for decoupling, but
>> also as an event store.
>>
>> For us, data is mission-critical and has unlimited retention with the
>> option to compact (required by GDPR).
>>
>> We are especially interested in two types of disasters:
>>
>> 1. Unintentional or malicious use of the Kafka API to create or compact
>> messages, as well as deletion of topics or logs.
>>
>> 2. Loss of tail data from partitions if all ISRs fail in the primary
>> cluster. A special case of this is the loss of the tail of the commit
>> topic, which results in loss of consumer progress.
>>
>> Ad (1)
>>
>> Replication does not guard against disaster (1), as the resulting
>> messages and deletions spread to the replicas.
>>
>> A naive solution is to simply secure the cluster and ensure that there
>> are no errors in the code (…), but anyone with an ounce of experience
>> with software knows that stuff goes wrong.
>>
>> Ad (2)
>>
>> The potential of disaster (2) is much worse. In case of total data
>> center loss, the secondary cluster will lose the tail of every partition
>> not fully replicated. Besides the data loss itself, there are now
>> inconsistencies between related topics and partitions, which breaks the
>> state of the system.
>>
>> Granted, the likelihood of total data center loss is not great, but
>> there is a reason we have multi-center setups.
>>
>> The special case of lost consumer offsets results in double processing
>> of events once we resume processing from an older offset. While
>> idempotency is a solution, it might not always be possible or desirable.
>>
>> Once any of these types of disaster has occurred, there is no means to
>> restore lost data, and even worse, we cannot restore the cluster to a
>> point in time where it is consistent. We could probably look our
>> customers in the eyes and tell them that they lost a day's worth of
>> progress, but we cannot inform them of a state of perpetual
>> inconsistency.
>>
>> The only solution we can think of right now is to shut the primary
>> cluster down (ensuring that no new events are produced) and then copy all
>> files to some secure location, i.e. effectively creating a backup,
>> allowing restoration to a consistent point in time. Though the tail
>> (backup-wise) will be lost in case of disaster, we are ensured a
>> consistent state to restore to.
>>
>> As a useful side effect, such backup files can also be used to create
>> environments for testing or other destructive purposes.
>>
>> Does anyone have a better idea?
>>
>> Venlig hilsen / Best regards
>>
>> Henning Røigaard-Petersen
>> Principal Developer
>> MSc in Mathematics
>>
>> h...@edlund.dk
>> Direct phone +45 36150272
>>
>> Edlund A/S
>> La Cours Vej 7
>> DK-2000 Frederiksberg
>> Tel +45 36150630
>> www.edlund.dk
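Ryanne's point that timestamps matter more than committed offsets during
failover, and Henning's concern about lost consumer offsets, can likewise be
sketched with the standard Java consumer API. The group and topic names are
placeholders; the consumer ignores whatever offsets survived the failover and
instead rewinds each assigned partition to a chosen point in time with
offsetsForTimes(), accepting a few hours of reprocessing that keyed,
idempotent, timestamp-aware processing renders harmless:

    import java.time.Duration;
    import java.util.Collection;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
    import org.apache.kafka.common.TopicPartition;

    public class ReplayFromTimestamp {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");
            props.put("group.id", "recovery-reprocessor"); // placeholder group
            props.put("enable.auto.commit", "false");
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            // Replay the last three hours instead of trusting committed offsets.
            long replayFrom = System.currentTimeMillis() - Duration.ofHours(3).toMillis();

            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(Collections.singletonList("customer-events"), // placeholder topic
                    new ConsumerRebalanceListener() {
                        @Override
                        public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                        }

                        @Override
                        public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                            // Find, per partition, the earliest offset whose timestamp
                            // is at or after the replay point, and seek there.
                            Map<TopicPartition, Long> query = new HashMap<>();
                            partitions.forEach(tp -> query.put(tp, replayFrom));
                            Map<TopicPartition, OffsetAndTimestamp> found =
                                    consumer.offsetsForTimes(query);
                            found.forEach((tp, ts) -> {
                                if (ts != null) {
                                    consumer.seek(tp, ts.offset());
                                }
                            });
                        }
                    });

            while (true) {
                consumer.poll(Duration.ofSeconds(1)).forEach(record ->
                        // Placeholder processing: keyed, idempotent handling makes the
                        // reprocessed window harmless.
                        System.out.printf("%s %d %s%n",
                                record.key(), record.timestamp(), record.value()));
            }
        }
    }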