Hi I'd like to understand how HBase deals with the situation where the only available DataNodes for a given offline Region contain stale data. Will HBase allow the Region to be brought online again, effectively making the inconsistency permanent, or will it refuse to do so?
My question is motivated from seeing how Kafka and Elasticsearch handle this scenario. They both allow the inconsistency to become permanent, Kafka via unclean leader election, and Elasticsearch via the allocate_stale_primary command. To better understand my question, please consider the following example: - HDFS is configured with `dfs.replication=2` and `dfs.namenode.replication.min=1` - DataNodes DN1 and DN2 contain the blocks for Region R1 - DN2 goes offline - R1 receives a writes which succeeds as it can be written successfully to DN1 - DN1 goes offline before the NameNode can replicate the under-replicated block containing the write to another DataNode - At this point the R1 is offline - DN2 comes back online, but it does not contain the missed write There are now two options: - R1 is brought back online, violating consistency - R1 remains offline, indefinitely, until DN1 is brought back online How does HBase deal with this situation? Many thanks Paul
