KingCide opened a new issue, #9632:
URL: https://github.com/apache/rocketmq/issues/9632

   ### Before Creating the Enhancement Request
   
   - [x] I have confirmed that this should be classified as an enhancement 
rather than a bug/feature.
   
   
   ### Summary
   
   `Pop long-polling is not awakened for V1 retry messages, causing significant 
consumption delay`
   
   ### Motivation
   
   #### Description
   
   In Pop consumption mode, when a message is not acknowledged (ACKed) within 
its invisible time, it is moved to a retry topic for later consumption. The 
system is expected to wake up any waiting long-polling requests for the 
original topic so the message can be re-consumed promptly.
   
   Currently, this wake-up mechanism works correctly for V2 retry topics 
(`%RETRY%group+topic`) because the original topic and group can be reliably 
parsed.
   
   However, for V1 retry topics (`%RETRY%group_topic`), the broker fails to 
parse the original topic name from the retry topic. As a result, the 
`notifyMessageArrivingWithRetryTopic` method cannot identify and awaken the 
correct long-polling request. This forces the consumer's long-polling request 
to wait until it times out (controlled by `BROKER_SUSPEND_MAX_TIME_MILLIS`, 
typically 15 seconds), introducing a significant delay in message retries.
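
To illustrate why the V1 name is hard to parse back, here is a minimal sketch. It assumes underscores may appear in both group and topic names; the class and helper names are illustrative and are not the broker's code. Two different `(group, topic)` pairs produce the same V1 retry topic, while the V2 separator keeps them distinct:

```java
final class V1RetryTopicAmbiguity {
    static final String RETRY_PREFIX = "%RETRY%";

    // V1 format as described in this issue: %RETRY%group_topic
    static String buildV1(String topic, String group) {
        return RETRY_PREFIX + group + "_" + topic;
    }

    // V2 format as described in this issue: %RETRY%group+topic
    static String buildV2(String topic, String group) {
        return RETRY_PREFIX + group + "+" + topic;
    }

    public static void main(String[] args) {
        // Different pairs, identical V1 names: the split point is not recoverable.
        System.out.println(buildV1("b_c", "a"));   // %RETRY%a_b_c
        System.out.println(buildV1("c", "a_b"));   // %RETRY%a_b_c

        // With the V2 separator the two pairs stay distinguishable.
        System.out.println(buildV2("b_c", "a"));   // %RETRY%a+b_c
        System.out.println(buildV2("c", "a_b"));   // %RETRY%a_b+c
    }
}
```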
   
   #### Steps to Reproduce
   
   1. **Configure and start the broker** with the following settings:
      - `enableRetryTopicV2 = false` (to use the V1 retry topic format)
      - `popConsumerKVServiceEnable = true` (or `popConsumerFSServiceInit = true`)
   2. **Producer:** Send a batch of messages (e.g., 32 messages) to a normal 
topic, let's call it `TopicA`.
   3. **Consumer:** Use a Pop-based consumer (e.g., `PushConsumer`) to 
subscribe to `TopicA`.
   4. **Simulate Failure:** In the message listener, do not return 
`CONSUME_SUCCESS` for the received messages. For example, return 
`RECONSUME_LATER` or simply don't ACK them, causing them to expire and be sent 
to the retry topic (`%RETRY%YourConsumerGroup_TopicA`).
   5. **Observe:** Do not send any new messages to `TopicA`. Monitor the 
consumer logs.
   
   #### Expected Behavior
   
   When the invisible time for a message expires and it's moved to the V1 retry 
topic, the long-polling request waiting for messages on `TopicA` should be 
awakened immediately. The consumer should receive the retry message promptly 
(e.g., within 1-2 seconds after the invisible time + retry delay).
   
   #### Actual Behavior
   
   The long-polling request is **not** awakened by the arrival of the retry 
message. It remains suspended until the long-poll timeout is reached (approx. 
15 seconds). Only after the timeout does the client re-initiate the poll 
request and finally fetch the message from the retry queue.
   
   **Log Evidence:**
   
   - Message first nack'd time: `2025-08-22 16:05:21,414`
   - Message revived and written to retry topic (from `mqadmin topicStatus`): 
`2025-08-22 16:05:33,497`
   - Consumer receives the retry message: `2025-08-22 16:05:36,385`
   - **Observed Delay:** ~15 seconds from the initial NACK, matching the 
long-poll timeout.
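
For reference, the intervals between the three timestamps above can be computed directly. This is only a small sketch; the class and method names are illustrative:

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class RetryDelay {
    // Matches the log timestamp layout used above, e.g. 2025-08-22 16:05:21,414
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Milliseconds elapsed between two log timestamps.
    static long millisBetween(String from, String to) {
        LocalDateTime start = LocalDateTime.parse(from, FMT);
        LocalDateTime end = LocalDateTime.parse(to, FMT);
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        String nack = "2025-08-22 16:05:21,414";
        String revived = "2025-08-22 16:05:33,497";
        String received = "2025-08-22 16:05:36,385";

        System.out.println(millisBetween(nack, revived));     // 12083
        System.out.println(millisBetween(revived, received)); // 2888
        System.out.println(millisBetween(nack, received));    // 14971
    }
}
```

The NACK-to-delivery interval comes to 14,971 ms, which is the "~15 seconds" figure quoted above.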
   
   This is further confirmed by a second experiment: if new _normal_ messages 
are continuously sent to `TopicA`, the retry messages are consumed much faster. 
This proves that the arrival of new normal messages is waking up the long-poll, 
which then happens to pick up the waiting retry messages. The retry message 
itself is not triggering the wake-up.
   
   ### Describe the Solution You'd Like
   
   The issue lies in 
`PopLongPollingService.notifyMessageArrivingWithRetryTopic`. For V1 retry 
topics, it cannot resolve the original topic.
   
   We can enhance this method by implementing a reverse-lookup mechanism using 
the `topicCidMap`, which stores active `(topic, consumer_group)` mappings.
   
   **Logic:**
   
   1. When a message arrives in a V1 retry topic, iterate through the entries 
in `PopLongPollingService.topicCidMap`.
   2. For each `(topic, cid)` pair, reconstruct the potential V1 retry topic 
name using `KeyBuilder.buildPopRetryTopicV1(topic, cid)`.
   3. Compare this reconstructed name with the incoming retry topic name.
   4. If a **unique** match is found, we have successfully identified the 
original topic. Use this original topic to notify the long-polling service.
   5. If multiple matches are found, or no match is found, fall back to the 
current behavior (using the retry topic name) to avoid incorrect notifications.
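
The steps above could be sketched roughly as follows. This is a standalone illustration, not the actual broker code. The `Map<String, Set<String>>` shape for `topicCidMap` is an assumption, `resolveOriginalTopic` is a hypothetical helper name, and the local `buildPopRetryTopicV1` is a stand-in that mirrors the `%RETRY%group_topic` format described in this issue rather than calling the real `KeyBuilder`:

```java
import java.util.Map;
import java.util.Set;

final class V1RetryTopicResolver {
    static final String RETRY_PREFIX = "%RETRY%";

    // Stand-in for KeyBuilder.buildPopRetryTopicV1(topic, cid); format taken from the issue text.
    static String buildPopRetryTopicV1(String topic, String cid) {
        return RETRY_PREFIX + cid + "_" + topic;
    }

    /**
     * Returns the original topic when exactly one (topic, cid) pair rebuilds to retryTopic.
     * With zero or multiple matches, returns retryTopic unchanged (the fallback in step 5).
     */
    static String resolveOriginalTopic(Map<String, Set<String>> topicCidMap, String retryTopic) {
        String match = null;
        for (Map.Entry<String, Set<String>> entry : topicCidMap.entrySet()) {
            String topic = entry.getKey();
            for (String cid : entry.getValue()) {
                if (retryTopic.equals(buildPopRetryTopicV1(topic, cid))) {
                    if (match != null) {
                        return retryTopic; // second match: ambiguous, keep current behavior
                    }
                    match = topic;
                }
            }
        }
        return match != null ? match : retryTopic;
    }
}
```

With a map containing only `TopicA -> {GroupX}`, the name `%RETRY%GroupX_TopicA` resolves to `TopicA`. If the map also holds pairs such as `("b_c", "a")` and `("c", "a_b")`, the name `%RETRY%a_b_c` matches both and is returned unchanged. Note that the scan is linear in the number of active `(topic, cid)` pairs per retry-message arrival, so a real implementation might want to cache resolved names.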
   
   ### Describe Alternatives You've Considered
   
   Using popKV may be an alternative.
   
   ### Additional Context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to