GitHub user benjumanji edited a discussion: region aware ensemble placement (e=3,w=3,a=2) can't create new ledgers.
I have the following config (shortened for brevity) on Pulsar 4.0.1:

```
bookkeeperClientRegionawarePolicyEnabled=true
reppRegionsToWrite=euw1-az3;euw1-az1;euw1-az2
reppMinimumRegionsForDurability=2
```

I have at least three bookies. If I try the aforementioned policy (e=3, w=3, a=2), then the exception here is thrown:

https://github.com/apache/bookkeeper/blob/0748423e3228f7cf61d2e1f2ab11e354ed84c0df/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/RegionAwareEnsemblePlacementPolicy.java#L317

<img width="1210" alt="Screenshot 2025-01-30 at 21 01 17" src="https://github.com/user-attachments/assets/001a603c-32ce-4d1f-aba9-fea20dd17032" />

This makes little sense to me: `2 <= 3 - 3/2` evaluates to true (with integer division, `3/2` is `1`), but I am failing to see _why_ this is a bad configuration.

```
// We must survive the failure of numRegions - effectiveMinRegionsForDurability. When these
// regions have failed we would spread the replicas over the remaining
// effectiveMinRegionsForDurability regions; we have to make sure that the ack quorum is large
// enough such that there is a configuration for spreading the replicas across
// effectiveMinRegionsForDurability - 1 regions
```

OK, so I have 3 regions, and I want 2 for durability. I can therefore tolerate only 1 region failing. If that region fails, I have two regions left, and I require two acks. I have two bookies, they can both ack, so what's the problem? Why is 4/4/3 good and 3/3/2 bad? If the argument is that the initial placement might be 2 in one region and 1 in another, why doesn't this apply to 4/4/3 (3 in one region and 1 in another)?

If we plug 3/3/2 into the comment, then we need to survive 3 - 2 = 1 region failure, and we need to make sure the acks cover 2 - 1 = 1 region? Why does 3 acks + 4 writers fulfil this, and 2 acks + 3 writers not?

I guess what's eating me is that I don't want the extra tail latency, or to pay for the extra disks. I just want 3 replicas, and to survive a region outage.
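For reference, here is the arithmetic as I read it, as a minimal Java sketch. It assumes the check at the linked line has the form `ackQuorum <= writeQuorum - writeQuorum / minRegionsForDurability` with integer division, which is what the `2 <= 3 - 3/2` expression above suggests. The class and method names are mine for illustration and are not part of BookKeeper.

```java
// Sketch only: mirrors the inequality quoted in this post, not the BookKeeper source itself.
// RegionQuorumCheck and rejects(...) are hypothetical names.
final class RegionQuorumCheck {

    // Returns true when the assumed degraded-mode check would reject the configuration.
    // Java int division truncates, so 3 / 2 == 1.
    static boolean rejects(int writeQuorum, int ackQuorum, int minRegionsForDurability) {
        return ackQuorum <= writeQuorum - (writeQuorum / minRegionsForDurability);
    }

    public static void main(String[] args) {
        // The two configurations compared above, with reppMinimumRegionsForDurability=2.
        System.out.println("w=3 a=2 minRegions=2 rejected: " + rejects(3, 2, 2));
        System.out.println("w=4 a=3 minRegions=2 rejected: " + rejects(4, 3, 2));

        // Sweep the minimum-regions setting for w=3, a=2.
        for (int minRegions = 1; minRegions <= 3; minRegions++) {
            System.out.println("w=3 a=2 minRegions=" + minRegions
                    + " rejected: " + rejects(3, 2, minRegions));
        }
    }
}
```

Under that assumption, w=3/a=2 is rejected for a minimum of 2 or 3 regions, while w=4/a=3 passes with a minimum of 2.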
There doesn't seem to be a configuration possible for this.

OK, let's take the following (from the [docs](https://pulsar.apache.org/docs/4.0.x/administration-isolation-bookie/#region-aware-placement-policy)):

> For example, the BookKeeper cluster has 4 regions, and each region has
> several racks with their bookie instances, as shown in the following diagram.
> If a topic is configured with EnsembleSize=3, WriteQuorum=3, and AckQuorum=2,
> the BookKeeper client chooses three different regions, such as Region A,
> Region C and Region D. For each region, it chooses one bookie on a single
> rack, such as Bookie5 on Rack2, Bookie17 on Rack6, and Bookie21 on Rack8.

The only value of minimum regions for durability under which the expression evaluates to false for 3/3/2 is 1, which is a data-loss-ready config. So either the docs are recommending a guaranteed failure, or they describe a configuration that is impossible according to the repp validation code.

GitHub link: https://github.com/apache/pulsar/discussions/23913
