poorbarcode commented on code in PR #21488:
URL: https://github.com/apache/pulsar/pull/21488#discussion_r1383743754


##########
pip/pip-314.md:
##########
@@ -0,0 +1,57 @@
+# PIP-314: Add metrics pulsar_subscription_redelivery_messages
+
+# Background knowledge
+
+## Delivery of messages in normal
+
+To simplify the description of the mechanism, let's take the policy [Auto 
Split Hash 
Range](https://pulsar.apache.org/docs/3.0.x/concepts-messaging/#auto-split-hash-range)
  as an example:
+
+| `0 ~ 16,384`      | `16,385 ~ 32,768` | `32,769 ~ 65,536`              |
+|-------------------|-------------------|--------------------------------|
+| ------- C1 ------ | ------- C2 ------ | ------------- C3 ------------- |
+
+- If the entry key is between `-1(non-include) ~ 16,384(include)`, it is 
delivered to C1
+- If the entry key is between `16,384(non-include) ~ 32,768(include)`, it is 
delivered to C2
+- If the entry key is between `32,768(non-include) ~ 65,536(include)`, it is 
delivered to C3
+
+# Motivation
+
+For the example above, if `C1` is stuck or consumed slowly, the Broker will 
push the entries that should be delivered to `C1` into a memory collection 
`redelivery_messages` and read next entries continue, then the collection 
`redelivery_messages` becomes larger and larger and take up a lot of memory. 
When sending messages, it will also determine the key of the entries in the 
collection `redelivery_messages`, affecting performance.

Review Comment:
   > What do you mean by "determine" the key ? Also, why doing that would ruin 
performance?
   
   After reading a new batch of messages, the Broker must hold back every 
message whose key already has entries stuck in `redelivery_messages`; 
otherwise a newer message could overtake an older one with the same key and 
break the consumption order. For example:
   - Client-side: `C1` is stuck; `1000` messages sit in the client's memory, 
and their keys are `[k1, k2, k3]`.
   - Broker-side: the Broker reads a batch whose keys are `[k1, k10]`, filters 
out the messages with key `k1`, and sends only the remaining messages to the 
client.
   
   The Broker tracks these keys in a `Map`. The more keys the stuck consumers 
occupy, the larger and less efficient the map becomes.
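
   A minimal sketch of the filtering step described above. The class and 
method names (`StuckKeyFilter`, `markStuck`, `release`, `filterDispatchable`) 
are hypothetical and are not the Broker's real classes; counting pending 
messages per key is also an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not Pulsar broker code.
class StuckKeyFilter {
    // Assumed bookkeeping: number of pending messages per stuck key.
    private final Map<String, Integer> stuckKeys = new HashMap<>();

    // Record that one more message with this key is waiting for redelivery.
    void markStuck(String key) {
        stuckKeys.merge(key, 1, Integer::sum);
    }

    // One pending message for the key was handled; forget the key at zero.
    void release(String key) {
        stuckKeys.computeIfPresent(key, (k, n) -> n > 1 ? n - 1 : null);
    }

    // Return the keys of a freshly read batch that may be dispatched now,
    // i.e. those that do not collide with a stuck key.
    List<String> filterDispatchable(List<String> batchKeys) {
        List<String> dispatchable = new ArrayList<>();
        for (String key : batchKeys) {
            if (!stuckKeys.containsKey(key)) {
                dispatchable.add(key);
            }
        }
        return dispatchable;
    }

    // Size of the map; it grows with the number of distinct stuck keys.
    int trackedKeyCount() {
        return stuckKeys.size();
    }

    public static void main(String[] args) {
        StuckKeyFilter filter = new StuckKeyFilter();
        for (String key : List.of("k1", "k2", "k3")) {
            filter.markStuck(key);
        }
        // The batch [k1, k10] from the example: k1 is held back.
        System.out.println(filter.filterDispatchable(List.of("k1", "k10")));
    }
}
```

   With the keys from the example, only `k10` passes the filter. Every batch 
pays one lookup per message, and the map holds one entry per distinct stuck 
key, which is the cost the proposed metric is meant to make visible.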



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
