LiPengze97 opened a new issue #10406: URL: https://github.com/apache/pulsar/issues/10406
**Describe the bug**

I set up five single-broker Pulsar clusters: one each in Clemson, Wisconsin, and Massachusetts, and two in Utah (Utah1, Utah2). The publisher is in Utah1, and there is one subscriber in each of the other clusters. For my experiment, I prefer **non-persistent** topics.

I want to use Pulsar's geo-replication, and I found that when the sending throughput is high (for example 1k messages/sec, 10 KB message size, 10k messages in total), only the subscriber in Utah2, which has the highest bandwidth from Utah1, receives all the published messages (and not always; sometimes it also receives only part of them). The subscribers in the other clusters receive only part of the messages.

I used the Java client at first. When I hit the problem, I thought it was my code's fault, so I switched to the perf tool, and the problem is still there. (See the picture below: none of the received record counts is 10000, but the producer says `08:25:20.996 [Thread-1] INFO org.apache.pulsar.testclient.PerformanceProducer - Aggregated throughput stats --- 10000 records sent --- 639.113 msg/s --- 49.931 Mbit/s`.)

I set the parameters below in broker.conf to 10000 (greater than or equal to the total number of messages that I want to publish):

```
maxConcurrentNonPersistentMessagePerConnection
replicationProducerQueueSize
maxPendingPublishdRequestsPerConnection
```

I also set `receiverQueueSize(10000)` in my Consumer client code.

I think this is not a bug but a feature, because the sending bandwidth is much higher than the bandwidth available on the WAN. In [Issue 451](https://github.com/apache/pulsar/issues/451), there seems to be some discussion about the sending rate of the publisher, but I don't know what sending rate is reasonable in a WAN environment. Could you offer me some suggestions on what the sending bandwidth should be?
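As a rough back-of-the-envelope check (a sketch, not a measurement), the payload-only rate each link could carry can be estimated from the bandwidth figures listed in the network information of this report. The assumptions here are that 1 Mbyte means 10^6 bytes and that protocol overhead, latency, and broker-side queue limits are ignored, so the results are only upper bounds. The function names are made up for illustration.

```python
# Link bandwidths measured from Utah1, in Mbyte/s (values from this report).
BANDWIDTH_MBYTE_PER_S = {
    "Utah2": 1174.74125,
    "Wisconsin": 3.2375,
    "Clemson": 52.99125,
    "MIT": 16.5625,
}

MESSAGE_SIZE_BYTES = 10240    # pulsar-perf -s 10240
TARGET_RATE_MSG_S = 1000      # pulsar-perf -r 1000
BYTES_PER_MBYTE = 1_000_000   # assumption: decimal megabytes


def max_rate(bandwidth_mbyte_s, message_size=MESSAGE_SIZE_BYTES):
    """Upper bound on messages/sec a link can carry, counting payload bytes only."""
    return bandwidth_mbyte_s * BYTES_PER_MBYTE / message_size


def saturated_links(target_rate=TARGET_RATE_MSG_S):
    """Names of links whose payload-only upper bound is below the target rate."""
    return sorted(
        name
        for name, bandwidth in BANDWIDTH_MBYTE_PER_S.items()
        if max_rate(bandwidth) < target_rate
    )


if __name__ == "__main__":
    for name, bandwidth in BANDWIDTH_MBYTE_PER_S.items():
        print(f"{name}: at most {max_rate(bandwidth):.1f} msg/s")
    print("Below target rate:", saturated_links())
```

Under these assumptions only the Wisconsin link falls below the requested 1000 msg/s on raw bandwidth alone, so bandwidth by itself may not explain the losses seen in the other clusters.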
The network information is:

Bandwidth from Utah1 (Mbyte/s):

|       | Utah2      | Wisconsin | Clemson  | MIT     |
| ----- | ---------- | --------- | -------- | ------- |
| Utah1 | 1174.74125 | 3.2375    | 52.99125 | 16.5625 |

Latency from Utah1 (ms):

|       | Utah2 | Wisconsin | Clemson | MIT    |
| ----- | ----- | --------- | ------- | ------ |
| Utah1 | 0.061 | 35.612    | 50.918  | 48.083 |

**To Reproduce**

Steps to reproduce the behavior:

1. Go to the root path of the Pulsar installation.
2. Run `bin/pulsar-perf produce non-persistent://my-tenant/my-namespace/wa -bm 0 -s 10240 -r 1000 -m 10000` at the publisher.
3. Run `bin/pulsar-perf consume non-persistent://my-tenant/my-namespace/wa` at the subscriber.
4. Check the output of the consumer; the "bug" can be seen there.

**Expected behavior**

All the non-persistent messages are received correctly.

**Screenshots**

**Desktop (please complete the following information):**

- OS: Ubuntu 18.04

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
