sijie commented on issue #5513: Publish rate limit on broker to avoid OOM
URL: https://github.com/apache/pulsar/issues/5513#issuecomment-556266528

@rdhabalia We are able to reproduce the OOM problem with the master code base (including your code change, with publish-rate limiting not enabled).

Steps to reproduce:

1) Set up a BookKeeper cluster with 10 nodes. Each node has one SSD disk used for both the journal and ledger directories.
2) Set up a broker cluster with 10 nodes.
3) Create a topic with 20 partitions.
4) Launch a Flink job with a data-generator source (without throttling) and a Flink Pulsar sink (https://github.com/streamnative/pulsar-flink), with parallelism == 20. The producer in the Pulsar sink uses the default producer settings.
5) Run the Flink job for a while. The OOM shows up very quickly.

The broker heap dump shows a large number of pending entries, because the Flink job produces more than the BookKeeper cluster can accept. All of those entries accumulate at the broker and crash it with an OOM.

The expectation is that the broker should degrade when it reaches its capacity limit and apply back pressure to the clients. The broker shouldn't crash due to OOM. If we don't provide the capability Jia proposed in #5710, Pulsar can't be used for high-volume ingestion workloads.
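To illustrate the kind of back pressure described above, here is a minimal sketch in Java. It is not Pulsar's implementation and not the design proposed in #5710; the class `PendingPublishLimiter`, its methods, and the byte threshold are all hypothetical names chosen for illustration. The idea is only that the broker counts the bytes of entries it has accepted but that storage has not yet acknowledged, and stops reading from producers once that count reaches a limit.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch: tracks bytes accepted from producers but not yet
 * acknowledged by storage, and tells the caller when to pause reading.
 */
final class PendingPublishLimiter {
    private final long maxPendingBytes;
    private final AtomicLong pendingBytes = new AtomicLong();

    PendingPublishLimiter(long maxPendingBytes) {
        if (maxPendingBytes <= 0) {
            throw new IllegalArgumentException("maxPendingBytes must be positive");
        }
        this.maxPendingBytes = maxPendingBytes;
    }

    /**
     * Record an entry received from a producer.
     * Returns true if the caller may keep reading, false if it should pause.
     */
    boolean onEntryReceived(long entryBytes) {
        return pendingBytes.addAndGet(entryBytes) < maxPendingBytes;
    }

    /**
     * Record that storage acknowledged an entry.
     * Returns true if the pending total is back under the limit.
     */
    boolean onEntryPersisted(long entryBytes) {
        return pendingBytes.addAndGet(-entryBytes) < maxPendingBytes;
    }

    /** Bytes currently accepted but not yet acknowledged by storage. */
    long pendingBytes() {
        return pendingBytes.get();
    }
}
```

In a sketch like this, a `false` return from `onEntryReceived` would be the signal to stop reading from the producer connections, so that pressure propagates back to the clients through the transport instead of accumulating on the broker heap. How the pause is actually wired into the broker is left out here.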
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
