LiPengze97 opened a new issue #10406:
URL: https://github.com/apache/pulsar/issues/10406


   **Describe the bug**
   I set up five single-broker Pulsar clusters at five sites: Clemson, 
Wisconsin, Massachusetts, and two in Utah (Utah1, Utah2). The publisher is in 
Utah1, and there is one subscriber in each of the other clusters.
   For the purposes of my experiment, I prefer **non-persistent** topics.
   I want to use Pulsar's geo-replication, and I found that when the 
sending throughput is high (for example 1,000 messages/sec, 10 KB per message, 
10,000 messages in total), only the subscriber in Utah2, which has the highest 
bandwidth from Utah1, receives all of the published messages (and not always; 
sometimes it also receives only part of them). The subscribers in the other 
clusters receive only part of the messages.
   I used the Java client at first. When I hit the problem, I thought my own 
code was at fault, so I switched to the perf tool, and the problem is still 
there (see the screenshot below: none of the received record counts reaches 
10000, but the producer reports 
   `08:25:20.996 [Thread-1] INFO  
org.apache.pulsar.testclient.PerformanceProducer - Aggregated throughput stats 
--- 10000 records sent --- 639.113 msg/s --- 49.931 Mbit/s`).
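
As a rough cross-check of the producer log line above, the offered load can be recomputed from the reported message rate and the 10240-byte payload used in the reproduction command. This is only a sketch: the helper name `offered_load` is made up for illustration, and treating a megabit as 2^20 bits is an assumption that happens to reproduce the logged figure.

```python
# Figures taken from the pulsar-perf producer log line and the -s 10240 flag.
MSG_PER_SEC = 639.113
MSG_SIZE_BYTES = 10240


def offered_load(msg_per_sec, msg_size_bytes):
    """Return the payload rate in several units (hypothetical helper)."""
    bytes_per_sec = msg_per_sec * msg_size_bytes
    return {
        "bytes_per_sec": bytes_per_sec,
        # Decimal megabytes, assuming 1 MByte = 10^6 bytes.
        "mbyte_per_sec": bytes_per_sec / 1e6,
        # Assumption: 1 Mbit = 2^20 bits, which matches the logged value.
        "mbit_per_sec": bytes_per_sec * 8 / (1024 * 1024),
    }


if __name__ == "__main__":
    for unit, value in offered_load(MSG_PER_SEC, MSG_SIZE_BYTES).items():
        print(f"{unit}: {value:.3f}")
```

Under these assumptions the producer actually pushed roughly 6.5 MByte/s of payload, which is below the 1000 msg/s requested with `-r 1000`.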
   
   I set the parameters below in broker.conf to 10000 (greater than or equal 
to the total number of messages I want to publish):
   ```
   maxConcurrentNonPersistentMessagePerConnection
   
   replicationProducerQueueSize
   
   maxPendingPublishdRequestsPerConnection
   ```
   I also set `receiverQueueSize(10000)` in my consumer client code.
   
   I suspect this is not a bug but expected behavior, because the sending 
rate is much higher than the available WAN bandwidth. [Issue 
451](https://github.com/apache/pulsar/issues/451) has some discussion of the 
publisher's sending rate, but I don't know what sending rate is reasonable in 
a WAN environment. Could you offer some suggestions on what the sending rate 
should be?
   
   
   The network information between Utah1 and the other clusters is:
   
   Bandwidth (MByte/s):
   
   |       | Utah2      | Wisconsin | Clemson  | MIT     |
   | ----- | ---------- | --------- | -------- | ------- |
   | Utah1 | 1174.74125 | 3.2375    | 52.99125 | 16.5625 |
   
   Latency (ms):
   
   |       | Utah2 | Wisconsin | Clemson | MIT    |
   | ----- | ----- | --------- | ------- | ------ |
   | Utah1 | 0.061 | 35.612    | 50.918  | 48.083 |
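
To relate the bandwidth figures above to the question of a reasonable sending rate, the sketch below estimates the highest message rate each link could carry for 10240-byte messages. It assumes 1 MByte = 10^6 bytes, ignores protocol overhead, and assumes the link is dedicated to this one topic; the function name `max_msg_rate` is illustrative only.

```python
MSG_SIZE_BYTES = 10240      # -s 10240 in the reproduction command
TARGET_RATE = 1000          # -r 1000 in the reproduction command

# Measured bandwidth from Utah1 to each cluster, in MByte/s (from the table).
BANDWIDTH_MBYTE_PER_SEC = {
    "Utah2": 1174.74125,
    "Wisconsin": 3.2375,
    "Clemson": 52.99125,
    "MIT": 16.5625,
}


def max_msg_rate(bandwidth_mbyte_per_sec, msg_size_bytes=MSG_SIZE_BYTES):
    """Upper bound on messages/sec, assuming 1 MByte = 10^6 bytes."""
    return bandwidth_mbyte_per_sec * 1e6 / msg_size_bytes


if __name__ == "__main__":
    for cluster, bw in BANDWIDTH_MBYTE_PER_SEC.items():
        rate = max_msg_rate(bw)
        verdict = "above" if rate >= TARGET_RATE else "below"
        print(f"{cluster}: ~{rate:.0f} msg/s ({verdict} the {TARGET_RATE} msg/s target)")
```

By this estimate only the Wisconsin link is slower than the target rate, so raw bandwidth alone may not explain the losses seen at Clemson and MIT. Other factors, such as the throughput a single TCP connection can sustain at these latencies, could also play a role.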
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Go to the root directory of the Pulsar installation.
   2. Run `bin/pulsar-perf produce non-persistent://my-tenant/my-namespace/wa -bm 0 -s 10240 -r 1000 -m 10000` on the publisher.
   3. Run `bin/pulsar-perf consume non-persistent://my-tenant/my-namespace/wa` on each subscriber.
   4. Check the consumer output; the "bug" is visible there.
   
   **Expected behavior**
   All the non-persistent messages are received correctly.
   
   **Screenshots**
   
![image](https://user-images.githubusercontent.com/18279506/116259418-5d80d280-a7a8-11eb-8e18-2547f4cdc5d3.png)
   
   **Desktop:**
    - OS: Ubuntu 18.04
   
   



