Hi Ryan,

Rajan has just created a PR for the split-bundle problem:
https://github.com/apache/incubator-pulsar/pull/822

About the 2 commands:

bin/pulsar-admin broker-stats destinations -i
and
bin/pulsar-admin namespaces destinations test/us-west/ns-bundle

They can return different output because:
 * broker-stats destinations targets one particular broker. For
efficiency reasons, the stats are refreshed in the background every
minute, so newly created topics won't show up until the next refresh
 * namespaces destinations returns the list of all topics in that
namespace that are being served by any broker

Apart from the error during the split, I hope there's no other "meltdown"
:)
At this point, if you expect the load on the namespace to be significant, I
would suggest starting with a correspondingly large number of bundles, so
that no splitting is needed. A good rule of thumb is to use 2-3x the
number of brokers (or the subset of brokers expected to serve the
traffic).
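
As a rough illustration of that rule of thumb, here is a small Python
sketch. It is not Pulsar code. The boundary calculation assumes the broker
cuts the 32-bit hash range into equal slices when a namespace is created
with N bundles, which matches ranges such as 0x50000000_0x60000000 seen in
this thread for a 16-bundle namespace; the function names are hypothetical.

```python
def suggested_bundle_count(num_brokers, factor=3):
    """Rule of thumb from this thread: 2-3x the number of brokers."""
    if num_brokers < 1 or factor < 1:
        raise ValueError("num_brokers and factor must be positive")
    return num_brokers * factor


def even_bundle_ranges(num_bundles):
    """Bundle ranges for an evenly pre-split namespace.

    Assumption: the full range 0x00000000..0xffffffff is divided into
    num_bundles equal slices and the last range ends at 0xffffffff.
    """
    if num_bundles < 1:
        raise ValueError("num_bundles must be positive")
    segment = (1 << 32) // num_bundles
    bounds = [i * segment for i in range(num_bundles)] + [(1 << 32) - 1]
    return [
        "0x%08x_0x%08x" % (lower, upper)
        for lower, upper in zip(bounds, bounds[1:])
    ]
```

With 2 brokers this suggests 4 to 6 bundles. The count is then given when
the namespace is created; check `pulsar-admin namespaces create --help`
for the exact option in your version.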

Matteo






On Wed, Oct 11, 2017 at 3:44 PM Ryan Stout <[email protected]> wrote:

> I'm using Pulsar 1.19.0 (for reference).
>
> Running
>
> bin/pulsar-admin broker-stats destinations -i
>
> gives empty JSON, but I'm not running any producers currently (as they are
> unable to produce due to the aforementioned exceptions).
>
> Running
>
> bin/pulsar-admin namespaces destinations test/us-west/ns-bundle
>
> outputs the following for both brokers:
>
> persistent://test/us-west/ns-bundle/p8-topic-partition-1
> persistent://test/us-west/ns-bundle/p8-topic-partition-2
> persistent://test/us-west/ns-bundle/p8-topic-partition-3
> persistent://test/us-west/ns-bundle/p8-topic-partition-5
> persistent://test/us-west/ns-bundle/p8-topic-partition-6
> persistent://test/us-west/ns-bundle/p8-topic-partition-7
>
> I've tried tearing down the cluster and wiping everything, but I'm still
> seeing errors when the brokers try to split/unload.
>
> It's concerning to me that a seemingly basic use case (heavy load on a
> single topic) is causing Pulsar to melt-down. I'm validating Pulsar for
> production use, and in order to gain confidence in this system I need to be
> convinced that Pulsar can gracefully split load among available brokers
> without manual intervention (e.g. manual unloading).
>
> Thus I think the course of action I would prefer is to understand what is
> causing this unloading/splitting to consistently fail.
>
> The error seems to originate from the validateBundle function found here,
> line 110:
> https://github.com/apache/incubator-pulsar/blob/380e47e694fa28dc67947a81beb1d72e5039a84b/pulsar-broker/src/main/java/org/apache/pulsar/common/naming/NamespaceBundles.java
>
> It appears that an assertion is failing. It seems strange that an
> assertion would be failing for what seems to me to be a normal use case.
> Either I haven't set-up my environment/conf correctly, or perhaps there is
> a bug in Pulsar.
>
> Do you know anyone who is familiar with how namespace bundles work, or the
> "NamespaceBundles" class? Someone who might know conditions which might
> cause this assertion to fail? I'll deploy the latest version of Pulsar in
> the meantime; maybe this issue has been addressed already.
>
>
> On Wed, Oct 11, 2017 at 3:07 PM, Rajan Dhabalia <[email protected]>
> wrote:
>
>> >>>  2017-10-11 18:19:07,976 - ERROR
>> [pulsar-web-56-3:PulsarWebResource@373] - [null] *Failed to validate
>> namespace bundle test/us-west/ns-bundle/0x50000000_0x58000000*
>> >>> java.lang.IllegalArgumentException: *Invalid upper boundary for
>> bundl*
>>
>> Have you tried splitting the bundle manually? If you started with 16
>> bundles, are you generating enough load to trigger
>> bundle splitting?
>> Can you check how many bundles the broker is serving using the following
>> command:
>> pulsar-admin broker-stats destinations -i
>>
>> >> Caused by: java.util.concurrent.CompletionException:
>> org.apache.pulsar.client.api.PulsarClientException$LookupException:
>> java.lang.IllegalStateException: Namespace bundle
>> test/us-west/ns-bundle/0x50000000_0x60000000 is being unloaded
>>
>> It seems namespace bundle unloading gets stuck here.
>> For which topic do you see this error? If you are using version 1.20, then
>> you can look up the bundle name for that topic and then check whether the
>> broker is serving that bundle using the above command.
>> *pulsar-admin persistent bundle-range *
>> *persistent://test-property/cl1/ns1/tp1*
>> From the given log I am not sure why bundle unloading gets stuck, but you
>> can try a broker restart, which will make sure your bundle gets unloaded
>> properly and is owned by another broker, so you should not see the above
>> error.
>>
>>
>> Thanks,
>> Rajan
>>
>>
>>
>>
>>
>>
>> On Wed, Oct 11, 2017 at 1:55 PM, Ryan Stout <[email protected]> wrote:
>>
>>> Additional log-line:
>>>
>>> 2017-10-11 18:19:07,975 - INFO  [pulsar-web-56-3:Namespaces@789] -
>>> [null] Split namespace bundle test/us-west/ns-bundle/0x50000000_0x58000000
>>> 2017-10-11 18:19:07,976 - INFO  [pulsar-web-56-3:PulsarWebResource@222]
>>> - Successfully validated clusters on property [test]
>>> 2017-10-11 18:19:07,976 - ERROR [pulsar-web-56-3:PulsarWebResource@373]
>>> - [null] *Failed to validate namespace bundle
>>> test/us-west/ns-bundle/0x50000000_0x58000000*
>>> java.lang.IllegalArgumentException: *Invalid upper boundary for bundle*
>>> at
>>> com.google.common.base.Preconditions.checkArgument(Preconditions.java:93)
>>> at
>>> org.apache.pulsar.common.naming.NamespaceBundles.validateBundle(NamespaceBundles.java:110)
>>> at
>>> org.apache.pulsar.broker.web.PulsarWebResource.validateNamespaceBundleRange(PulsarWebResource.java:370)
>>> at
>>> org.apache.pulsar.broker.web.PulsarWebResource.validateNamespaceBundleOwnership(PulsarWebResource.java:381)
>>> at
>>> org.apache.pulsar.broker.admin.Namespaces.splitNamespaceBundle(Namespaces.java:801)
>>>
>>> On Wed, Oct 11, 2017 at 11:27 AM, Ryan Stout <[email protected]> wrote:
>>>
>>>> Thanks for the suggestions Matteo and Rajan. I've created a bundled
>>>> namespace (16 bundles) and a partitioned topic (8 partitions). However, I'm
>>>> still running into issues running perf tests. Client-side, I'm
>>>> continuously seeing the following exception:
>>>>
>>>> Caused by: java.util.concurrent.CompletionException:
>>>> org.apache.pulsar.client.api.PulsarClientException$LookupException:
>>>> java.lang.IllegalStateException: Namespace bundle
>>>> test/us-west/ns-bundle/0x50000000_0x60000000 is being unloaded
>>>>
>>>> Server-side, I see the following error:
>>>>
>>>> 2017-10-11 18:19:07,978 - INFO  [pulsar-web-56-3:Slf4jRequestLog@60] -
>>>> 172.31.10.179 - - [11/Oct/2017:18:19:07 +0000] "PUT
>>>> //ip-172-31-10-179.us-west-2.compute.internal:8080/admin/namespaces/test/us-west/ns-bundle/0x50000000_0x58000000/split
>>>> HTTP/1.1" 500 5278 "-" "Jersey/2.23.2 (HttpUrlConnection 1.8.0_141)" 4
>>>> 2017-10-11 18:19:07,979 - ERROR
>>>> [pulsar-load-manager-11-1:SimpleLoadManagerImpl@1455] - *Failed to
>>>> split namespace bundle test/us-west/ns-bundle/0x50000000_0x58000000*
>>>> org.apache.pulsar.client.admin.PulsarAdminException$ServerSideErrorException:
>>>> Some error occourred on the server
>>>> [trace redacted]
>>>> Caused by: *javax.ws.rs.InternalServerErrorException: HTTP 500 Internal
>>>> Server Error*
>>>> [trace redacted]
>>>>
>>>>
>>>> I can provide the stack traces if needed. I'm not seeing any WARN logs
>>>> in the bookies.
>>>>
>>>> On Tue, Oct 10, 2017 at 5:45 PM, Rajan Dhabalia <[email protected]>
>>>> wrote:
>>>>
>>>>> >> I have no idea what "0x00000000_0xffffffff" is or why it's being
>>>>> used in place of the topic name I've given.
>>>>>
>>>>> 0x00000000_0xffffffff defines the bundle range.
>>>>> A namespace can be divided into multiple logical parts by defining
>>>>> bundle ranges. Initially, by default, every namespace has 1 bundle with
>>>>> the range "0x00000000_0xffffffff".
>>>>> If you split it into 2 bundles, the bundle ranges will be
>>>>> "0x00000000_0x7FFFFFFF" and "0x7FFFFFFF_0xFFFFFFFF", and based on the
>>>>> topic name's hash, each topic will fall under the appropriate bundle.
>>>>> The broker that owns a bundle owns all topics that fall under that
>>>>> namespace bundle.
>>>>>
>>>>> To split a bundle, you first have to create the namespace, which creates
>>>>> a namespace-metadata placeholder in ZooKeeper. So we can't split a
>>>>> namespace bundle if the namespace has not been created.
>>>>>
>>>>> >> I'll try out the ModularLoadManager.
>>>>> Sure, ModularLoadManager has visibility into a larger set of broker
>>>>> load metrics and distributes load efficiently. However,
>>>>> ModularLoadManager doesn't support auto-split functionality right now;
>>>>> a PR <https://github.com/apache/incubator-pulsar/pull/385> is open.
>>>>> ModularLoadManager's auto-split functionality will probably be
>>>>> available in the next release.
>>>>>
>>>>> Thanks,
>>>>> Rajan
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Oct 10, 2017 at 5:13 PM, Ryan Stout <[email protected]> wrote:
>>>>>
>>>>>> I should've looked before, as I do see exceptions in the logs due to
>>>>>> bundle splits. It's complaining about a missing namespace, however I'm
>>>>>> able to successfully publish to the topic
>>>>>> "persistent://test/us-west/ns1/p4-topic". I have no idea what
>>>>>> "0x00000000_0xffffffff" is or why it's being used in place of the topic
>>>>>> name I've given.
>>>>>>
>>>>>> I'll try out the ModularLoadManager.
>>>>>>
>>>>>> Logs:
>>>>>>
>>>>>> 2017-10-11 00:06:04,412 - INFO
>>>>>> [pulsar-load-manager-11-1:SimpleLoadManagerImpl@1398] - Running
>>>>>> namespace bundle split with thresholds: topics 1000, sessions 1000,
>>>>>> msgRate 1000, bandwidth 104857600, maxBundles 128
>>>>>> 2017-10-11 00:06:04,413 - INFO
>>>>>> [pulsar-load-manager-11-1:SimpleLoadManagerImpl@1435] - Will split
>>>>>> hot namespace bundle test/us-west/ns1/0x00000000_0xffffffff, topics 4,
>>>>>> producers+consumers 8, msgRate in+out 1999.1277760920889, bandwidth
>>>>>> in+out 2121007.929782623
>>>>>> 2017-10-11 00:06:04,414 - INFO
>>>>>> [pulsar-simple-load-manager-55-1:SimpleLoadManagerImpl@698] -
>>>>>> doLoadRanking - load balancing strategy: weightedRandomSelection
>>>>>> 2017-10-11 00:06:04,416 - INFO  [pulsar-web-56-14:Namespaces@789] -
>>>>>> [null] Split namespace bundle test/us-west/ns1/0x00000000_0xffffffff
>>>>>> 2017-10-11 00:06:04,418 - INFO  [pulsar-web-56-14:Slf4jRequestLog@60]
>>>>>> - 172.31.10.179 - - [11/Oct/2017:00:06:04 +0000] "PUT
>>>>>> //ip-172-31-10-179.us-west-2.compute.internal:8080/admin/namespaces/test/us-west/ns1/0x00000000_0xffffffff/split
>>>>>> HTTP/1.1" 404 37 "-" "Jersey/2.23.2 (HttpUrlConnection 1.8.0_141)" 3
>>>>>> 2017-10-11 00:06:04,419 - *ERROR*
>>>>>> [pulsar-load-manager-11-1:SimpleLoadManagerImpl@1455] - *Failed to
>>>>>> split namespace bundle test/us-west/ns1/0x00000000_0xffffffff*
>>>>>> org.apache.pulsar.client.admin.*PulsarAdminException$NotFoundException:
>>>>>> Namespace does not exist*
>>>>>> at
>>>>>> org.apache.pulsar.client.admin.internal.BaseResource.getApiException(BaseResource.java:173)
>>>>>> at
>>>>>> org.apache.pulsar.client.admin.internal.NamespacesImpl.splitNamespaceBundle(NamespacesImpl.java:352)
>>>>>> at
>>>>>> org.apache.pulsar.broker.loadbalance.impl.SimpleLoadManagerImpl.doNamespaceBundleSplit(SimpleLoadManagerImpl.java:1450)
>>>>>> at
>>>>>> org.apache.pulsar.broker.loadbalance.impl.SimpleLoadManagerImpl.writeLoadReportOnZookeeper(SimpleLoadManagerImpl.java:1271)
>>>>>> at
>>>>>> org.apache.pulsar.broker.loadbalance.LoadReportUpdaterTask.run(LoadReportUpdaterTask.java:41)
>>>>>> at
>>>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>>>>>> at
>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>>>>>> at
>>>>>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>>>>>> at
>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>> at
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>> at
>>>>>> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
>>>>>> at java.lang.Thread.run(Thread.java:748)
>>>>>> Caused by: javax.ws.rs.NotFoundException: HTTP 404 Not Found
>>>>>> at org.glassfish.jersey.client.JerseyInvocation.convertToException(JerseyInvocation.java:1020)
>>>>>> at org.glassfish.jersey.client.JerseyInvocation.translate(JerseyInvocation.java:819)
>>>>>> at org.glassfish.jersey.client.JerseyInvocation.access$700(JerseyInvocation.java:92)
>>>>>> at org.glassfish.jersey.client.JerseyInvocation$2.call(JerseyInvocation.java:701)
>>>>>> at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
>>>>>> at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
>>>>>> at org.glassfish.jersey.internal.Errors.process(Errors.java:228)
>>>>>> at
>>>>>> org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:444)
>>>>>> at org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:697)
>>>>>> at org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:448)
>>>>>> at org.glassfish.jersey.client.JerseyInvocation$Builder.put(JerseyInvocation.java:332)
>>>>>> at
>>>>>> org.apache.pulsar.client.admin.internal.NamespacesImpl.splitNamespaceBundle(NamespacesImpl.java:350)
>>>>>> ... 11 more
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 10, 2017 at 4:44 PM, Rajan Dhabalia <[email protected]> wrote:
>>>>>>
>>>>>>> ||COUNT          |TOPIC          |BUNDLE         |PRODUCER       |CONSUMER       |BUNDLE +       |BUNDLE -       ||
>>>>>>> ||               |4              |1              |8              |0              |0              |0              ||
>>>>>>>
>>>>>>> ||ip-[redacted].us-west-2.compute.internal:8080  |1            |1500.41      |639.99       |3414.49      |15.97        ||
>>>>>>>
>>>>>>>
>>>>>>> Based on the stats, it seems a broker is serving 4 topics under the
>>>>>>> same bundle. So yes, we need to split the bundle so that topics can be
>>>>>>> distributed evenly across multiple bundles and those bundles can be
>>>>>>> owned by different brokers. Here are a few pointers for
>>>>>>> troubleshooting bundle splitting:
>>>>>>>
>>>>>>> *1. Is there any way to verify in the log whether the bundle is split
>>>>>>> automatically by the load balancer?*
>>>>>>> In the broker log, under class *SimpleLoadManagerImpl*, do you see
>>>>>>> any log with the text
>>>>>>>
>>>>>>> *"split hot namespace bundle"?*
>>>>>>> *2. Is there any way to split the bundle manually and unload
>>>>>>> namespace bundles?*
>>>>>>>   A. We can split a bundle manually using the pulsar-admin tool
>>>>>>> <https://pulsar.incubator.apache.org/docs/latest/admin-api/namespaces/#splitbundle>
>>>>>>>
>>>>>>> pulsar-admin namespaces split-bundle --bundle 0x00000000_0xffffffff test-property/cl1/ns1
>>>>>>>
>>>>>>>  B. Unload namespace bundle
>>>>>>>
>>>>>>> pulsar-admin namespaces unload --bundle 0x00000000_0xffffffff test-property/pstg-gq1/ns1
>>>>>>>
>>>>>>>
>>>>>>> *3. How to get list of bundles which my broker is serving?*
>>>>>>>
>>>>>>> pulsar-admin broker-stats destinations -i
>>>>>>> {
>>>>>>>     "sample/standalone/ns1": {
>>>>>>>         "0x00000000_0xffffffff": {
>>>>>>>             "persistent": {
>>>>>>>                 "persistent://sample/standalone/ns1/t1": {
>>>>>>>                     "publishers": [],
>>>>>>>                     "replication": {},
>>>>>>>                     "subscriptions": {},
>>>>>>>                     "producerCount": 0,
>>>>>>>                     "averageMsgSize": 0.0,
>>>>>>>                     "msgRateIn": 0.0,
>>>>>>>                     "msgRateOut": 0.0,
>>>>>>>                     "msgThroughputIn": 0.0,
>>>>>>>                     "msgThroughputOut": 0.0,
>>>>>>>                     "storageSize": 0,
>>>>>>>                     "pendingAddEntriesCount": 0
>>>>>>>                 }
>>>>>>>             }
>>>>>>>         }
>>>>>>>     }
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> *This command lists the namespace bundles and topics the broker is
>>>>>>> serving, along with their stats.*
>>>>>>>
>>>>>>>
>>>>>>> *4. A few releases back, an advanced load balancer was introduced
>>>>>>> in Pulsar that does a better job of distributing load.
>>>>>>> How can we enable the new advanced load balancer?*
>>>>>>> Modular-load-manager
>>>>>>> <https://pulsar.incubator.apache.org/docs/latest/admin/ModularLoadManager/>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Rajan
>>>>>>>
>>>>>>> On Tue, Oct 10, 2017 at 4:04 PM, Ryan Stout <[email protected]> wrote:
>>>>>>>
>>>>>>>> I've created a topic with 4 partitions, and monitor-brokers reports
>>>>>>>> 4 topics:
>>>>>>>>
>>>>>>>>
>>>>>>>> ===================================================================================================================
>>>>>>>> ||COUNT          |TOPIC          |BUNDLE         |PRODUCER       |CONSUMER       |BUNDLE +       |BUNDLE -       ||
>>>>>>>> ||               |4              |1              |8              |0              |0              |0              ||
>>>>>>>> ||RAW SYSTEM     |CPU %          |MEMORY %       |DIRECT %       |BW IN %        |BW OUT %       |MAX %          ||
>>>>>>>> ||               |2.95           |18.36          |1.56           |0.16           |0.29           |18.36          ||
>>>>>>>> ||ALLOC SYSTEM   |CPU %          |MEMORY %       |DIRECT %       |BW IN %        |BW OUT %       |MAX %          ||
>>>>>>>> ||               |42.68          |3.88           |               |3.57           |2.90           |42.68          ||
>>>>>>>> ||RAW MSG        |MSG/S IN       |MSG/S OUT      |TOTAL          |KB/S IN        |KB/S OUT       |TOTAL          ||
>>>>>>>> ||               |1500.41        |0.00           |1500.41        |16.14          |29.18          |45.32          ||
>>>>>>>> ||ALLOC MSG      |MSG/S IN       |MSG/S OUT      |TOTAL          |KB/S IN        |KB/S OUT       |TOTAL          ||
>>>>>>>> ||               |3295.35        |118.70         |3414.05        |357.11         |289.76         |646.86         ||
>>>>>>>>
>>>>>>>> ===================================================================================================================
>>>>>>>>
>>>>>>>> I also see a throughput of over 1k on one of the brokers:
>>>>>>>>
>>>>>>>> 2017-10-10 21:16:25,548 - INFO  - [main:BrokerMonitor@203] -
>>>>>>>> Overall Broker Data:
>>>>>>>>
>>>>>>>> ***************************************************************************************************************************************
>>>>>>>> ||BROKER                                         |BUNDLE       |MSG/S        |LONG/S       |KB/S         |MAX %        ||
>>>>>>>> ||ip-[redacted].us-west-2.compute.internal:8080  |0            |0.00         |0.00         |0.00         |5.81         ||
>>>>>>>> ||ip-[redacted].us-west-2.compute.internal:8080  |1            |1500.41      |639.99       |3414.49      |15.97        ||
>>>>>>>> ||TOTAL                                          |1            |1500.41      |3414.49      |639.99       |15.97        ||
>>>>>>>>
>>>>>>>> ***************************************************************************************************************************************
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Oct 10, 2017 at 3:48 PM, Rajan Dhabalia <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Ryan,
>>>>>>>>>
>>>>>>>>> >> I've set "loadBalancerAutoBundleSplitEnabled" to "true" and
>>>>>>>>> "loadBalancerNamespaceBundleMaxMsgRate" to 1000. I then ran 2
>>>>>>>>> producers at 1k msg/s for ~5mins, but I didn't see a bundle split
>>>>>>>>>
>>>>>>>>> The load balancer will split a bundle only if it has more than 1
>>>>>>>>> topic in it (because a bundle is a logical part of a namespace
>>>>>>>>> that contains topics; if the namespace has only 1 topic, there is
>>>>>>>>> no need to split the bundle).
>>>>>>>>> The load balancer splits a bundle if it reaches one of the
>>>>>>>>> thresholds configured in the broker config
>>>>>>>>> <https://git.corp.yahoo.com/cloud-messaging/pulsar/blob/yahoo/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfiguration.java#L260-L266>:
>>>>>>>>>
>>>>>>>>> 1. *loadBalancerNamespaceBundleMaxTopics*:
>>>>>>>>> maximum topics in a bundle
>>>>>>>>> 2. *loadBalancerNamespaceBundleMaxSessions*:
>>>>>>>>> maximum sessions (producers + consumers) in a bundle
>>>>>>>>> 3. *loadBalancerNamespaceBundleMaxMsgRate*:
>>>>>>>>> maximum msgRate (in + out) in a bundle
>>>>>>>>> 4. *loadBalancerNamespaceBundleMaxBandwidthMbytes*:
>>>>>>>>> maximum bandwidth (in + out) in a bundle
>>>>>>>>>
>>>>>>>>> >> I found "bin/pulsar-perf monitor-brokers"
>>>>>>>>> Using this utility can you confirm bundle usage and can you
>>>>>>>>> confirm if it meets that threshold to split the bundle?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Rajan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Oct 10, 2017 at 3:33 PM, Ryan Stout <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hey Pulsar devs,
>>>>>>>>>>
>>>>>>>>>> I've deployed a small Pulsar cluster (in AWS) with 2 brokers and
>>>>>>>>>> 3 bookies. I've started doing perf testing using bin/pulsar-perf to
>>>>>>>>>> determine the limitations of Pulsar. I'm at the point where I can't
>>>>>>>>>> produce more than ~25k msg/s on a topic (regardless of number of
>>>>>>>>>> partitions, clients, or bookies). Upon trying to understand the
>>>>>>>>>> bottleneck, I found "bin/pulsar-perf monitor-brokers" and it showed
>>>>>>>>>> that only one of the two brokers is receiving traffic. I've set-up
>>>>>>>>>> the service-discovery service that came with Pulsar, which my
>>>>>>>>>> producers are hitting, so I expected the requests to be distributed
>>>>>>>>>> fairly across the brokers, but this is not the case.
>>>>>>>>>>
>>>>>>>>>> In conf/broker.conf, there's a load balancing section that seems
>>>>>>>>>> to hint at the ability for brokers to shed traffic to other brokers.
>>>>>>>>>> I've tried tuning the values in this section, but haven't been able
>>>>>>>>>> to get the brokers to share the load. For example, I've set
>>>>>>>>>> "loadBalancerAutoBundleSplitEnabled" to "true" and
>>>>>>>>>> "loadBalancerNamespaceBundleMaxMsgRate" to 1000. I then ran 2
>>>>>>>>>> producers at 1k msg/s for ~5mins, but I didn't see a bundle split (I
>>>>>>>>>> also reduced some of the intervals e.g.
>>>>>>>>>> "loadBalancerSheddingIntervalMinutes" to 1 minute).
>>>>>>>>>>
>>>>>>>>>> Is there a way to configure my Pulsar cluster to balance between
>>>>>>>>>> my 2 brokers? Is there perhaps another, better way I might increase
>>>>>>>>>> throughput?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> --
Matteo Merli
<[email protected]>
