2018-01-29 01:13:29 UTC - Daniel Ferreira Jorge: Hi, I am still really confused
about message retention and expiry in Pulsar. This particular page in
the documentation is really confusing. For instance, the first paragraph
states: "by default brokers immediately deletes all messages that have been
acknowledged by a consumer". Does this mean that the acked message is
physically deleted from the bookie ledger? Isn't the ack of a message tied to a
particular SUBSCRIPTION? If I have 10 subscriptions consuming from a topic, will
the first subscription that acks the message cause the message to be DELETED
and not consumed by the other 9 subscriptions? Or does the "delete" in the
docs really mean "marked as consumed for a particular subscription"? In that
same page in the "Retention policies" section it also states "By default, when
a Pulsar message arrives at a broker it will be stored until it has been
acknowledged by a consumer, at which point it will be deleted". What is really
confusing to me is: since a topic can have MANY subscriptions, an
acknowledgement is not a GLOBAL event for a message, but a per SUBSCRIPTION
event. Am I completely misguided here?
----
2018-01-29 01:26:02 UTC - jia zhai: Yes, the ack is tied to a subscription. Your
second interpretation is the right one: `the "delete" in the docs really mean
"marked as consumed for a particular subscription"`.
The message becomes deletable once all the subscriptions have consumed and acked it.
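A toy sketch of that rule in Python, only to illustrate the idea. This is not Pulsar's actual implementation, and the `Topic` class and its methods are hypothetical names made up for the example:

```python
class Topic:
    """Toy model: one message log plus one set of acked ids per subscription."""

    def __init__(self, subscriptions):
        self.messages = []  # message ids, in publish order
        self.acked = {s: set() for s in subscriptions}

    def publish(self, msg_id):
        self.messages.append(msg_id)

    def ack(self, subscription, msg_id):
        # An ack only marks the message as consumed for this one subscription.
        self.acked[subscription].add(msg_id)

    def deletable(self):
        # A message can be deleted only once every subscription has acked it.
        return [m for m in self.messages
                if all(m in acks for acks in self.acked.values())]


topic = Topic(["sub-a", "sub-b"])
topic.publish(1)
topic.ack("sub-a", 1)
after_one_ack = topic.deletable()   # sub-b has not acked yet
topic.ack("sub-b", 1)
after_all_acks = topic.deletable()  # now every subscription has acked
```

Here the ack from `sub-a` alone leaves the message in place, and it only becomes deletable after `sub-b` acks it too.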
----
2018-01-29 01:29:48 UTC - Daniel Ferreira Jorge: Ok so if I have only one
subscription, if I ack a message it will be deleted from the bookie unless I
configure the retention policy for the namespace to which that topic belongs?
----
2018-01-29 01:30:44 UTC - jia zhai: :+1:
----
2018-01-29 01:33:32 UTC - Daniel Ferreira Jorge: Thank you! I have another
question unrelated to this one. What are bundles?
----
2018-01-29 01:44:57 UTC - Daniel Ferreira Jorge: Why would I increase the
number of bundles in a namespace?
----
2018-01-29 01:49:08 UTC - jia zhai: Oh, sorry, I haven't worked much with
bundles. They seem to be related to load balancing between brokers.
----
2018-01-29 01:54:07 UTC - Daniel Ferreira Jorge: Are there any docs about
bundles and their purpose? I couldn't find anything related to that...
----
2018-01-29 02:05:55 UTC - jia zhai: @Matteo Merli @Sijie Guo for more info
regarding bundles
----
2018-01-29 05:13:36 UTC - Matteo Merli: @Daniel Ferreira Jorge Ok, this is
really becoming a FAQ and there’s not much documentation around bundles.
The intention was that in most cases one should not have to worry about them
(or even know what they are and what they’re for).
I’ll try to summarize here and we’ll add a better-written page to the docs.
In Pulsar, “namespaces” are the administrative unit: you can configure most
options on a namespace and they will be applied to the topics contained in the
namespace. It gives the convenience of applying settings and operations to a
group of topics rather than having to do it once per topic.
In general, the pattern is to use a namespace for each user application. So a
single user/tenant can create multiple namespaces to manage its own applications.
When it comes to topics, we need a way to assign topics to brokers, control the
load and move them if a broker becomes overloaded. Rather than doing these
operations for each single topic (ownership, load-monitoring, assigning), we do
them in _bundles_, or “groups of topics”.
In practical terms, the number of bundles determines “across how many brokers
can I spread the topics for a given namespace”.
From the client API or implementation, there’s no concept of bundles; clients
will look up the topics they want to publish/consume individually.
On the broker side, the namespace is broken down into multiple _bundles_, and
each bundle can be assigned to a different broker. Effectively, bundles are the
“unit of assignment” of topics to brokers, and this is what the load-manager
uses to track the traffic and decide where to place “bundles” and whether to
offload them to other brokers.
A bundle is represented by a hash-range. The 32-bit hash space is initially
divided equally among the requested bundles. Topics are matched to a bundle by
hashing the topic name.
Default number of bundles is configured in `broker.conf`:
`defaultNumberOfNamespaceBundles=4`
When the traffic increases on a given bundle, it will be split in 2 and
reassigned to a different broker.
Enable auto-split: `loadBalancerAutoBundleSplitEnabled=true`
Trigger unload and reassignment after splitting:
`loadBalancerAutoUnloadSplitBundlesEnabled=true`
If high traffic is expected on a particular namespace, it’s good practice to
specify a higher number of bundles when creating the namespace:
`bin/pulsar-admin namespaces create $NS --bundles 64`
This will avoid the initial auto-adjustment phase.
All the thresholds for the auto-splitting can be configured in `broker.conf`,
e.g. number of topics/partitions, messages in/out, bytes in/out, etc.
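To make the hash-range idea concrete, here is a rough Python sketch, not the broker's code. It assumes CRC32 as a stand-in hash function and half-open ranges, and `make_bundles`, `bundle_for` and `split` are hypothetical helper names; the real hash function and range boundaries may differ.

```python
import zlib

HASH_SPACE = 2 ** 32  # the 32-bit hash space


def make_bundles(n):
    """Divide [0, 2**32) equally into n half-open ranges (lower, upper)."""
    bounds = [i * HASH_SPACE // n for i in range(n + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def bundle_for(topic, bundles):
    """Return the bundle whose range contains the hash of the topic name."""
    # Assumption: CRC32 is used here only as a convenient 32-bit hash.
    h = zlib.crc32(topic.encode("utf-8"))
    for lower, upper in bundles:
        if lower <= h < upper:
            return (lower, upper)
    raise ValueError("hash outside the 32-bit space")


def split(bundle):
    """Split one bundle's range into two halves, as an auto-split would."""
    lower, upper = bundle
    mid = (lower + upper) // 2
    return [(lower, mid), (mid, upper)]


bundles = make_bundles(4)  # same count as defaultNumberOfNamespaceBundles=4
owner = bundle_for("persistent://prop/cluster/ns/my-topic", bundles)
halves = split(owner)      # each half can then be assigned to a broker
```

Every topic name falls into exactly one range, so assigning a range to a broker assigns all the topics that hash into it.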
----
2018-01-29 05:15:00 UTC - jia zhai: :+1:
----
2018-01-29 07:26:24 UTC - Jaebin Yoon: @Sijie Guo Here is what I did before I
started seeing those errors.
1) brought up new bookies (10)
2) terminated all old bookies (10) (while no auto-recovery was running)
3) deleted the old partitioned topic
4) created a new partitioned topic (with the same name)
5) started traffic on the new partitioned topic
----
2018-01-29 07:27:51 UTC - Jaebin Yoon: I thought there would be auto-recovery
running by default in the bookie cluster but realized it required running a
separate auto-recovery service (or embedded option, which by default was off).
----
2018-01-29 07:34:00 UTC - Sijie Guo: @Jaebin Yoon: > realized it required
running a separate auto-recovery service (or embedded option, which by default
was off)
ah, right. we can turn auto-recovery on by default /cc @Matteo Merli
@Jaebin Yoon we will try to repeat your sequence to see if we can reproduce
this behavior. /cc @Matteo Merli
----
2018-01-29 08:01:45 UTC - Jaebin Yoon: I'm trying to increase the number of
partitions on the existing partitioned topic with pulsar-admin CLI while
messages are being produced and consumed on that topic. (from 10 partitions to
50 partitions since the traffic was not distributed well over brokers) but the
command gets stuck and it seems nothing is happening. Here is the command I
used :
```pulsar-admin persistent update-partitioned-topic -p 50
persistent://$PROPERTY/${CLUSTER}/${NS}/${TOPIC}```
----
2018-01-29 09:17:06 UTC - Ivan Kelly: @Jaebin Yoon what does jstack say the jvm
is doing?
----
2018-01-29 09:27:00 UTC - Jaebin Yoon: Some brokers' CPUs were hot because of
the traffic, and there were repeated GC pauses of up to 1.5s, also because of
the traffic. The API call doesn't seem to change CPU usage or GC pauses.
----
2018-01-29 09:32:15 UTC - Ivan Kelly: what version is the cluster running?
----
2018-01-29 14:37:48 UTC - Jesse Thompson: Test-driving Pulsar via the
standalone Docker instructions. I’m trying to create a new namespace, but am
stuck in trying to first create a property. If I try `pulsar-admin create
property property-name` I get an error stating that parameter _create_ was
passed but no _main_ parameter was defined. Not sure what that means.
Additionally, the documents state that the options are `--admin-roles` and
`--allowed-clusters`, are either of those required parameters? What are the
possible _admin roles_? What would one specify the cluster as being if using a
standalone cluster? Loopback?
----
2018-01-29 15:52:32 UTC - Ivan Kelly: it should be `pulsar-admin properties
create property-name`
----
2018-01-29 15:55:27 UTC - Jesse Thompson: `pulsar-admin --admin-url
http://localhost:8080 properties create property-name`
Tells me that I must specify `--admin-roles` and `--allowed-clusters`
What kinds of roles are available? For the clusters, do I just point it at
loopback?
----
2018-01-29 16:16:22 UTC - Ivan Kelly: for the clusters, you can get a list with
"pulsar-admin clusters list"
----
2018-01-29 16:16:36 UTC - Ivan Kelly: you can just make something up for
admin-roles
----
2018-01-29 16:16:40 UTC - Ivan Kelly: like test-admin-role
----