GitHub user lhotari added a comment to the discussion: FIFO guarantees with Key_Shared subscriptions when scaling partitions and clearing backlog
> After increasing the partition count
> * Will messages for existing deviceIds continue to be routed to their
>   original partitions?
> * Or can existing keys be rehashed to newly added partitions?

By default, messages are routed based on the hash of the key. After increasing the partition count, new messages for some of the existing keys will go to the added partitions.

> When consumers start after downtime:
> * How does Pulsar merge backlog + live traffic while preserving strict FIFO
>   per key?

There's an explanation of Pulsar subscription types in https://pulsar.apache.org/docs/4.1.x/concepts-messaging/#subscription-types. A consumer has a subscription to a topic in Pulsar. The exclusive, failover and key_shared subscription types consume messages in FIFO order.

> * Is backlog always drained in partition order before newer messages for the
>   same key are delivered?

Yes. However, if new partitions have been added, there's no special support for keeping the order across the old and the added partitions. It's similar to Kafka in this matter. There have been some thoughts around adding a new feature to Pulsar 5.0 to address this shortcoming.

> With Key_Shared subscriptions:
> * Is per-key ordering guaranteed even if consumers join after backlog has
>   accumulated?

Yes. The guarantee is explained in https://pulsar.apache.org/docs/4.1.x/concepts-messaging/#preserving-order-of-message-delivery-by-key.

> * Are there any edge cases where unacked messages for a key can block or
>   delay delivery of other keys?

Yes. When new consumers join, the key space is repartitioned across consumers to balance the load. Outstanding unacknowledged messages will delay delivery of the reassigned keys until the outstanding messages have been acknowledged. This has been improved since Pulsar 4.0.0; the docs at https://pulsar.apache.org/docs/4.1.x/concepts-messaging/#preserving-order-of-message-delivery-by-key explain more. In Pulsar 4.0.0+ this is handled at a hash level and it doesn't block all message delivery when rehashing happens.
Delivery of other message keys will continue as long as the number of temporarily skipped messages stays under the limits configured with `keySharedLookAheadMsgInReplayThresholdPerConsumer`/`keySharedLookAheadMsgInReplayThresholdPerSubscription` (the defaults are `2000`/`20000`). These limits exist to keep the state of the subscription from growing without bound.

GitHub link: https://github.com/apache/pulsar/discussions/25131#discussioncomment-15452089
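
PS. To make the first answer (key-hash routing after a partition count increase) a bit more concrete, here's a minimal Python sketch of why some existing keys end up on different partitions. It is an illustration, not the Pulsar client code: the function names (`java_string_hash`, `partition_for`, `moved_keys`) are made up for this example, and the hash is a stand-in modeled on Java's `String.hashCode`. The hashing scheme actually used by a producer is configurable and may differ. The only point is that `hash(key) % partition_count` changes for many keys when the count changes.

```python
def java_string_hash(key: str) -> int:
    """Stand-in hash modeled on Java's String.hashCode, made non-negative.

    Assumption for illustration only; matches Java for ASCII/BMP keys.
    """
    h = 0
    for ch in key:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # keep 32 bits like a Java int
    return h & 0x7FFFFFFF  # drop the sign bit


def partition_for(key: str, num_partitions: int) -> int:
    """Pick a partition index as hash(key) modulo the partition count."""
    return java_string_hash(key) % num_partitions


def moved_keys(keys, old_count: int, new_count: int) -> dict:
    """Return {key: (old_partition, new_partition)} for keys whose partition changes."""
    moved = {}
    for key in keys:
        old = partition_for(key, old_count)
        new = partition_for(key, new_count)
        if old != new:
            moved[key] = (old, new)
    return moved


if __name__ == "__main__":
    # Hypothetical device ids; 4 -> 6 partitions is an arbitrary example.
    device_ids = [f"device-{i}" for i in range(10)]
    for key, (old, new) in moved_keys(device_ids, 4, 6).items():
        print(f"{key}: partition {old} -> {new}")
```

Any key listed by `moved_keys` can have older messages still sitting in the backlog of its old partition while new messages arrive on the new one, which is the case where per-key FIFO isn't preserved across the change.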
