2018-01-29 01:13:29 UTC - Daniel Ferreira Jorge: Hi, I am still really confused 
about message retention and expiry in Pulsar. This particular page in 
the documentation is really confusing. For instance, the first paragraph 
states: "by default brokers immediately deletes all messages that have been 
acknowledged by a consumer". Does this mean that the acked message is 
physically deleted from the bookie ledger? Isn't the ack of a message tied to a 
particular SUBSCRIPTION? If I have 10 subscriptions consuming from a topic, will 
the first subscription that acks the message cause the message to be DELETED 
and never consumed by the other 9 subscriptions? Or does the "delete" in the 
docs really mean "marked as consumed for a particular subscription"? In that 
same page in the "Retention policies" section it also states "By default, when 
a Pulsar message arrives at a broker it will be stored until it has been 
acknowledged by a consumer, at which point it will be deleted". What is really 
confusing to me is: since a topic can have MANY subscriptions, an 
acknowledgement is not a GLOBAL event for a message, but a per SUBSCRIPTION 
event. Am I completely misguided here? 
----
2018-01-29 01:26:02 UTC - jia zhai: Yes, ack is tied to a subscription. Your 
second interpretation is the right one: `the "delete" in the docs really mean 
"marked as consumed for a particular subscription"`. 
The message becomes deletable once all the subscriptions have consumed it.
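
A toy model of that rule, as a rough sketch rather than Pulsar's actual implementation: each subscription keeps its own set of acked message IDs, and a message only becomes deletable once every subscription has acked it. The `Topic` class and its method names are hypothetical and exist only for this illustration.

```python
class Topic:
    """Toy model of per-subscription acks (hypothetical, not Pulsar code)."""

    def __init__(self, subscriptions):
        # One independent set of acked message IDs per subscription.
        self.acked = {name: set() for name in subscriptions}
        self.messages = []

    def publish(self, msg_id):
        self.messages.append(msg_id)

    def ack(self, subscription, msg_id):
        # An ack only marks the message as consumed for this subscription.
        self.acked[subscription].add(msg_id)

    def deletable(self):
        # A message can be removed only when every subscription has acked it.
        return [m for m in self.messages
                if all(m in acks for acks in self.acked.values())]


topic = Topic(["sub-a", "sub-b"])
topic.publish(1)
topic.ack("sub-a", 1)
print(topic.deletable())  # sub-b has not acked message 1 yet
topic.ack("sub-b", 1)
print(topic.deletable())  # every subscription has now acked message 1
```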
----
2018-01-29 01:29:48 UTC - Daniel Ferreira Jorge: Ok, so if I have only one 
subscription and I ack a message, it will be deleted from the bookie unless I 
configure the retention policy for the namespace to which that topic belongs?
----
2018-01-29 01:30:44 UTC - jia zhai: :+1:
----
2018-01-29 01:33:32 UTC - Daniel Ferreira Jorge: Thank you! I have another 
question unrelated to this one. What are bundles? 
----
2018-01-29 01:44:57 UTC - Daniel Ferreira Jorge: Why would I increase the 
number of bundles in a namespace?
----
2018-01-29 01:49:08 UTC - jia zhai: Oh, sorry, I haven't worked much with 
bundles. They seem to be mostly related to load balancing between brokers.
----
2018-01-29 01:54:07 UTC - Daniel Ferreira Jorge: Are there any docs about 
bundles and their purpose? I couldn't find anything related to that...
----
2018-01-29 02:05:55 UTC - jia zhai: @Matteo Merli @Sijie Guo for more info 
regarding bundles
----
2018-01-29 05:13:36 UTC - Matteo Merli: @Daniel Ferreira Jorge Ok, this is 
really becoming a FAQ and there’s not much documentation around bundles. 
The intention was that in most cases, one should not have to worry about them 
(or even know what they are and what they’re for). 

I’ll try to summarize here and we’ll add a better-written page to the docs. 

In Pulsar, “namespaces” are the administrative unit: you can configure most 
options on a namespace and they will be applied to the topics contained in the 
namespace. This gives the convenience of applying settings and operations to a 
group of topics rather than having to do it once per topic. 

In general, the pattern is to use a namespace for each user application. So a 
single user/tenant can create multiple namespaces to manage its own 
applications.

When it comes to topics, we need a way to assign topics to brokers, control the 
load and move them if a broker becomes overloaded. Rather than doing these 
operations for each individual topic (ownership, load-monitoring, assigning), 
we do them in _bundles_, or “groups of topics”.

In practical terms, the number of bundles determines “across how many brokers 
can I spread the topics for a given namespace”.

From the point of view of the client API or implementation, there’s no concept 
of bundles; clients look up the topics they want to publish/consume 
individually.

On the broker side, the namespace is broken down into multiple _bundles_, and 
each bundle can be assigned to a different broker. Effectively, bundles are the 
“unit of assignment” for topics to brokers, and this is what the load-manager 
uses to track the traffic and decide where to place “bundles” and whether to 
offload them to other brokers.

A bundle is represented by a hash-range. The 32-bit hash space is initially 
divided equally into the requested number of bundles. Topics are matched to a 
bundle by hashing on the topic name.
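
As a rough sketch of the hash-range idea (not the broker's actual code): the 32-bit space is cut into equal contiguous ranges, and a topic lands in the bundle whose range contains its hash. The function names are hypothetical, and CRC32 is used here only as a stand-in hash function; which hash the broker really uses is an assumption not stated above.

```python
import zlib

HASH_SPACE = 2 ** 32  # size of the 32-bit hash space


def bundle_boundaries(num_bundles):
    # Split [0, 2^32) into num_bundles contiguous, roughly equal ranges.
    return [i * HASH_SPACE // num_bundles for i in range(num_bundles + 1)]


def bundle_for_topic(topic, boundaries):
    # Assumption: CRC32 stands in for the real hash function.
    h = zlib.crc32(topic.encode("utf-8"))
    for i in range(len(boundaries) - 1):
        if boundaries[i] <= h < boundaries[i + 1]:
            return i
    raise ValueError("hash outside the 32-bit space")


boundaries = bundle_boundaries(4)
print([hex(b) for b in boundaries])
# The topic name below is a made-up example.
print(bundle_for_topic("persistent://my-prop/my-cluster/my-ns/my-topic", boundaries))
```

The same topic name always hashes to the same bundle, so whichever broker owns that bundle serves the topic.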

Default number of bundles is configured in `broker.conf`: 
`defaultNumberOfNamespaceBundles=4`

When the traffic increases on a given bundle, it will be split in 2 and 
reassigned to a different broker.
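
A minimal sketch of what splitting a bundle's hash range in 2 could look like, assuming the split happens at the midpoint of the range (the midpoint choice and the `split_bundle` helper are assumptions for illustration; reassignment to another broker is not modelled):

```python
def split_bundle(boundaries, index):
    # Split the bundle covering [boundaries[index], boundaries[index + 1])
    # into two halves by inserting the midpoint as a new boundary.
    lo, hi = boundaries[index], boundaries[index + 1]
    mid = lo + (hi - lo) // 2
    return boundaries[:index + 1] + [mid] + boundaries[index + 1:]


# Four equal bundles over the 32-bit hash space.
boundaries = [0x00000000, 0x40000000, 0x80000000, 0xC0000000, 0x100000000]
after_split = split_bundle(boundaries, 1)  # split the second bundle
print([hex(b) for b in after_split])
```

After the split, the namespace has five bundles, and each half can be placed on a different broker independently.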

Enable auto-split: `loadBalancerAutoBundleSplitEnabled=true`
Trigger unload and reassignment after splitting: 
`loadBalancerAutoUnloadSplitBundlesEnabled=true`

If high traffic is expected on a particular namespace, it’s good practice to 
specify a higher number of bundles when creating the namespace: 

`bin/pulsar-admin namespaces create $NS --bundles 64`

This will avoid the initial auto-adjustment phase.

All the thresholds for the auto-splitting can be configured in `broker.conf`, 
e.g. number of topics/partitions, messages in/out, bytes in/out, etc.
----
2018-01-29 05:15:00 UTC - jia zhai: :+1:
----
2018-01-29 07:26:24 UTC - Jaebin Yoon: @Sijie Guo Here is what I did before I 
started seeing those errors. 
1) brought up new bookies (10)
2) terminated all old bookies  (10) (while no auto-recovery was running)
3) deleted the old partitioned topic
4) created a new partitioned topic (with the same name)
5) started traffic on the new partitioned topic
----
2018-01-29 07:27:51 UTC - Jaebin Yoon: I thought there would be auto-recovery 
running by default in the bookie cluster but realized it required running a 
separate auto-recovery service (or embedded option, which by default was off).
----
2018-01-29 07:34:00 UTC - Sijie Guo: @Jaebin Yoon: > realized it required 
running a separate auto-recovery service (or embedded option, which by default 
was off)

ah, right. we can turn auto-recovery on by default /cc @Matteo Merli 

@Jaebin Yoon we will try to repeat your sequence to see if we can reproduce 
this behavior. /cc @Matteo Merli
----
2018-01-29 08:01:45 UTC - Jaebin Yoon: I'm trying to increase the number of 
partitions on the existing partitioned topic with pulsar-admin CLI while 
messages are being produced and consumed on that topic. (from 10 partitions to 
50  partitions since the traffic was not distributed well over brokers) but the 
command gets stuck and it seems nothing is happening. Here is the command I 
used :

```
pulsar-admin persistent update-partitioned-topic -p 50 persistent://$PROPERTY/${CLUSTER}/${NS}/${TOPIC}
```
----
2018-01-29 09:17:06 UTC - Ivan Kelly: @Jaebin Yoon what does jstack say the jvm 
is doing?
----
2018-01-29 09:27:00 UTC - Jaebin Yoon: Some brokers' CPUs were hot because of 
the traffic, and there were repeated GC pauses of up to 1.5s, also due to the 
traffic. The API call itself doesn't seem to change CPU usage or GC pauses.
----
2018-01-29 09:32:15 UTC - Ivan Kelly: what version is the cluster running?
----
2018-01-29 14:37:48 UTC - Jesse Thompson: Test-driving Pulsar via the 
standalone Docker instructions. I’m trying to create a new namespace, but am 
stuck in trying to first create a property. If I try `pulsar-admin create 
property property-name` I get an error stating that parameter _create_ was 
passed but no _main_ parameter was defined. I’m not sure what that means. 
Additionally, the docs state that the options are `--admin-roles` and 
`--allowed-clusters`. Are either of those required parameters? What are the 
possible _admin roles_? What would one specify the cluster as being if using a 
standalone cluster? Loopback?
----
2018-01-29 15:52:32 UTC - Ivan Kelly: it should be `pulsar-admin properties 
create property-name`
----
2018-01-29 15:55:27 UTC - Jesse Thompson: `pulsar-admin --admin-url 
http://localhost:8080 properties create property-name`

Tells me that I must specify `--admin-roles` and `--allowed-clusters`
What kinds of roles are available? For the clusters, do I just point it at 
loopback?
----
2018-01-29 16:16:22 UTC - Ivan Kelly: for the clusters, you can get a list with 
"pulsar-admin clusters list"
----
2018-01-29 16:16:36 UTC - Ivan Kelly: you can just make something up for 
admin-roles
----
2018-01-29 16:16:40 UTC - Ivan Kelly: like test-admin-role
----
