candlerb opened a new issue #5546: C++ API: insufficient digits in reader 
temporary subscriptions
URL: https://github.com/apache/pulsar/issues/5546
 
 
   **Describe the bug**
   In the C++/Python API, only [6 hex digits of ID](https://github.com/apache/pulsar/blob/master/pulsar-client-cpp/lib/ClientImpl.cc#L57)
 are used for the temporary subscription name for a reader.  This means that 
with about 4,000 readers there is a ~40% chance of at least two readers picking 
the same subscription name.
   
   However, in the Java client API, [10 hex digits of ID](https://github.com/apache/pulsar/blob/master/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ReaderImpl.java#L41)
 are used.  This would require about 1,000,000 readers before there was a 
similar chance of a clash. With 64K readers the chance is reduced to about 0.2%.
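
For anyone who wants to check these figures without `bc`, here is a minimal Python sketch of the birthday-problem approximation `1 - exp(-n^2 / (2N))`. The function name `collision_probability` is made up for illustration, and the sketch assumes the hex digits are uniformly random, which I have not verified against either client.

```python
import math


def collision_probability(readers: int, hex_digits: int) -> float:
    """Approximate chance that at least two readers get the same name.

    Assumes each name is drawn uniformly from 16**hex_digits values
    (an assumption, not checked against the client code).
    """
    names = 16 ** hex_digits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for small x.
    return -math.expm1(-(readers ** 2) / (2 * names))


# Same three cases as in the issue: (readers, hex digits in the name).
for readers, hex_digits in [(0x1000, 6), (0x100000, 10), (0x10000, 10)]:
    p = collision_probability(readers, hex_digits)
    print(f"{readers:>9,} readers, {hex_digits:>2} hex digits: {p:.4f}")
```

The three cases are 4,096 readers with 6 digits, 1,048,576 readers with 10 digits, and 65,536 readers with 10 digits.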
   
   **To Reproduce**
   Write a Python or C program which opens a `reader` on a topic and sleeps. 
Check the assigned subscription name using pulsar-admin:
   
   ```
   $ apache-pulsar-2.4.1/bin/pulsar-admin topics subscriptions pulsar-log
   reader-73bebe
   ```
   
   **Expected behavior**
   Sufficient randomness should be included in the subscription name, and the 
Java and C++ clients should be consistent with each other.
   
   **Screenshots**
   N/A
   
   **Desktop (please complete the following information):**
   N/A
   
   **Additional context**
   https://en.wikipedia.org/wiki/Birthday_problem
   
   ```
   $ bc -l
   scale=10
   ibase=16
   1 - e(-(1000^2)/(2*1000000))
   => .3934693403
   
   1 - e(-(100000^2)/(2*10000000000))
   => .3934693403
   
   1 - e(-(10000^2)/(2*10000000000))
   => .0019512189
   
   10000
   => 65536
   ```
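
As a rough empirical cross-check of the approximation, the following Python sketch simulates readers picking names at random and counts how often a clash occurs. It models the name as `reader-` plus uniformly random hex digits; that model, and the helper names `has_clash` and `estimate_clash_rate`, are assumptions for illustration rather than the clients' actual generation code.

```python
import random


def has_clash(rng: random.Random, readers: int, hex_digits: int) -> bool:
    """Return True if any two simulated readers draw the same name."""
    seen = set()
    for _ in range(readers):
        # Hypothetical name format: "reader-" plus zero-padded hex digits.
        name = f"reader-{rng.randrange(16 ** hex_digits):0{hex_digits}x}"
        if name in seen:
            return True
        seen.add(name)
    return False


def estimate_clash_rate(readers: int, hex_digits: int,
                        trials: int = 200, seed: int = 0) -> float:
    """Fraction of trials in which at least one clash occurred."""
    rng = random.Random(seed)
    clashes = sum(has_clash(rng, readers, hex_digits) for _ in range(trials))
    return clashes / trials


print("6 hex digits, 4096 readers: ", estimate_clash_rate(4096, 6))
print("10 hex digits, 4096 readers:", estimate_clash_rate(4096, 10))
```

With only 200 trials the estimate is noisy, but for 6 hex digits it should land in the neighbourhood of the ~40% figure above, while for 10 hex digits clashes should be very rare.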
   
