wangshuai67 opened a new issue, #10052:
URL: https://github.com/apache/rocketmq/issues/10052

   ### Before Creating the Bug Report
   
   - [x] I found a bug, not just asking a question, which should be created in 
[GitHub Discussions](https://github.com/apache/rocketmq/discussions).
   
   - [x] I have searched the [GitHub 
Issues](https://github.com/apache/rocketmq/issues) and [GitHub 
Discussions](https://github.com/apache/rocketmq/discussions)  of this 
repository and believe that this is not a duplicate.
   
   - [x] I have confirmed that this bug belongs to the current repository, not 
other repositories of RocketMQ.
   
   
   ### Runtime platform environment
   
   UOS, 32 GB RAM, 16 CPU cores
   
   ### RocketMQ version
   
   5.3.2
   
   ### JDK Version
   
   OpenJDK 1.8.0_342
   
   ### Describe the Bug
   
   A RocketMQ 5.3.1 master (SYNC_MASTER) running in a 32G container with a 26G JVM heap throws a Java heap space OOM in PopBufferMergeService. The broker uses the official default runbroker.sh; the only modification is -Xms26g -Xmx26g.
   
   The business workload consists of large messages (attachments and video files) plus batched GPS data packets. There is no backlog in the business, retry, or revive logs, and the master and slave configs are fully consistent (24h fileReservedTime). The slave node works fine, but the master's GC logs show continuous Full GCs that reclaim no memory (old gen 100% full). The OOM no longer occurred after G1GC tuning (G1HeapRegionSize=32m etc.).
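The "Full GC with 0 memory reclaimed" symptom can be checked mechanically. The following is a minimal sketch, assuming JDK 8 style GC log lines of the form `[Full GC (...)  26309M->26309M(26624M), ... secs]`. The function name `zero_reclaim_full_gcs`, the sample lines, and their timings are hypothetical and are not taken from the actual broker logs.

```shell
#!/usr/bin/env bash
# Count Full GC events whose heap usage before and after is identical,
# i.e. the collection reclaimed nothing.
# Assumption: each Full GC line contains a "<before>M-><after>M" pair.
zero_reclaim_full_gcs() {
  awk '/Full GC/ {
         if (match($0, /[0-9]+[KMG]->[0-9]+[KMG]/)) {
           split(substr($0, RSTART, RLENGTH), a, "->")
           if (a[1] == a[2]) n++
         }
       }
       END { print n + 0 }' "$1"
}

# Hypothetical sample log: two stuck Full GCs and one healthy one.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
[Full GC (Allocation Failure)  26309M->26309M(26624M), 41.2 secs]
[Full GC (Allocation Failure)  26309M->26309M(26624M), 40.8 secs]
[Full GC (Allocation Failure)  20000M->9000M(26624M), 12.5 secs]
EOF

count="$(zero_reclaim_full_gcs "$sample")"
echo "Full GCs with zero reclaim: ${count}"
rm -f "$sample"
```

Pointing the function at the real GC log instead of the sample file would give a quick count of how often the master was stuck in this state.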
   
   <img width="1473" height="607" alt="Image" src="https://github.com/user-attachments/assets/271f6d73-6adc-4bb4-b587-4545cc6a4bee" />
   
   <img width="1042" height="866" alt="Image" src="https://github.com/user-attachments/assets/b26357d0-3b2e-4c09-bf05-2a6e0281f199" />
   
   ### Steps to Reproduce
   
   1. Deploy RocketMQ 5.3.1 master-slave cluster in 32G container.
   2. Use official default runbroker.sh, only set -Xms26g -Xmx26g for master.
   3. Set fileReservedTime=24 on both master and slave (consistent).
   4. Start NameServer, master, slave with default config.
   5. Run moderate normal production/consumption traffic (1-2KB messages, no 
failures).
   6. After several hours, the master throws a Java heap space OOM in PopBufferMergeService.
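The configuration in steps 1 to 4 can be sketched as follows. This is only an illustration under stated assumptions: the file names, the temporary directory, and the `brokerId` values are hypothetical, and only `SYNC_MASTER` and `fileReservedTime=24` come from this report. The final check mirrors the claim that the setting is consistent across master and slave.

```shell
#!/usr/bin/env bash
# Write minimal master/slave broker config fragments (hypothetical paths).
workdir="$(mktemp -d)"

cat > "${workdir}/broker-master.conf" <<'EOF'
brokerRole=SYNC_MASTER
brokerId=0
fileReservedTime=24
EOF

cat > "${workdir}/broker-slave.conf" <<'EOF'
brokerRole=SLAVE
brokerId=1
fileReservedTime=24
EOF

# Confirm that fileReservedTime matches on both nodes.
master_val="$(grep '^fileReservedTime=' "${workdir}/broker-master.conf")"
slave_val="$(grep '^fileReservedTime=' "${workdir}/broker-slave.conf")"

if [ "$master_val" = "$slave_val" ]; then
  echo "fileReservedTime consistent: ${master_val}"
else
  echo "fileReservedTime mismatch: ${master_val} vs ${slave_val}"
fi

rm -rf "$workdir"
```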
   
   ### What Did You Expect to See?
   
   Master node runs stably with 26G heap, no OOM, normal GC, even under 
moderate traffic.
   
   ### What Did You See Instead?
   
   Master node (32G container, 26G heap) throws:
   java.lang.OutOfMemoryError: Java heap space
   at org.apache.rocketmq.broker.pop.PopBufferMergeService.merge(...)
   
   GC logs show continuous Full GCs that reclaim no memory (old gen full: 26309M->26309M).
   There is no backlog in the business, retry, or revive logs, and the slave node works fine.
   The OOM no longer occurred after tuning G1GC (G1HeapRegionSize=32m, InitiatingHeapOccupancyPercent=40).
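For reference, the heap and G1 settings named above could be expressed as a JVM option string like the one below. This is a sketch, not the exact edit made to runbroker.sh: the `JAVA_OPT` variable is used here only as an assumed accumulator, and the report's "etc." means other tuned flags may exist that are not shown.

```shell
#!/usr/bin/env bash
# Heap size as described in the report.
JAVA_OPT="${JAVA_OPT:-} -Xms26g -Xmx26g"
# G1 settings the reporter says resolved the OOM.
JAVA_OPT="${JAVA_OPT} -XX:+UseG1GC"
JAVA_OPT="${JAVA_OPT} -XX:G1HeapRegionSize=32m"
JAVA_OPT="${JAVA_OPT} -XX:InitiatingHeapOccupancyPercent=40"

echo "${JAVA_OPT}"
```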
   
   ### Additional Context
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]