2018-04-23 10:23:35 UTC - Doodle: @Sijie Guo In your FAQ you mention `you can
try to configure the bookies with multiple journals (e.g. 4)` - I can see a
code change for bookkeeper to allow you to set a number of journal directories,
but can't find out how you can actually use that configuration? Any tips?
----
2018-04-23 11:28:03 UTC - Byron: @Byron has joined the channel
----
2018-04-23 11:34:24 UTC - Byron: Hi folks, I noticed that the Reader interface
is being introduced in 2.0.0. Two questions: using the current consumer API now,
how would one read earlier messages on a topic? Second, are there 2.0.0
development builds for the client libraries? And a semi-related question.. are
there plans for a native Go client (or is anyone working on one?)
+1 : Doodle
----
2018-04-23 12:54:14 UTC - Doodle: Interested in a golang client as well, I
think someone higher up in this thread mentioned that one was planned?
----
2018-04-23 13:00:54 UTC - Byron: Yea I saw the reference to the Kafka Go client
----
2018-04-23 16:04:40 UTC - Matteo Merli: > @Sijie Guo In your FAQ you mention
`you can try to configure the bookies with multiple journals (e.g. 4)` - I can
see a code change for bookkeeper to allow you to set a number of journal
directories, but can’t find out how you can actually use that configuration?
Any tips?
@Doodle Using multiple journals is only available with BookKeeper-4.7 (the one
that will be included with Pulsar-2.0, though you can run Pulsar-1.22 with
Bookies from 4.7 as well). The advantage of using multiple journals is to have
multiple threads and hence higher throughput per bookie. Even though there’s a
single disk, having multiple threads uses the IO bandwidth of SSDs/NVMe disks
more effectively.
----
2018-04-23 16:11:08 UTC - Jon Bock: There is a Go client being developed by
some users at a company using Pulsar, they are working to complete it + get
their company’s approval to open source it.
----
2018-04-23 16:12:25 UTC - Matteo Merli: > Hi folks, I noticed that the
Reader interface is being introduced in 2.0.0. Two questions: using the current
consumer API now, how would one read earlier messages on a topic? Second, are
there 2.0.0 development builds for the client libraries? And a semi-related
question.. are there plans for a native Go client (or is anyone working on one?)
@Byron
1. The `Reader` interface was introduced back in Pulsar 1.18. Using the consumer
API, you have a few options to read earlier messages in a topic:
- Using `consumer.seek()` on a created consumer
- Using `pulsar-admin persistent reset-cursor $MY_TOPIC --subscription
$MY_SUBSCRIPTION --time 6h`
- Specifying where to initialize the subscription. This will be available
in 2.0:
<http://pulsar.apache.org/api/client/org/apache/pulsar/client/api/ConsumerBuilder.html#subscriptionInitialPosition-org.apache.pulsar.client.api.SubscriptionInitialPosition->
2. Yes, there are snapshot JARs being published on the Apache snapshots Maven
repo
3. There are some people working on a native Go client and there are plans to
have a Go library wrapping C++ as well.
+1 : Byron
----
2018-04-23 16:15:07 UTC - Byron: @Matteo Merli Thanks. When I installed the
1.22.0 Python client, there was no `create_reader` method as noted in the
docs. That is why I assumed it was a new feature
----
2018-04-23 16:15:43 UTC - Matteo Merli:
<http://pulsar.apache.org/api/python/#pulsar.Client.create_reader> :wink:
----
2018-04-23 16:16:41 UTC - Byron: I meant using the library itself. When I
installed the client and called `client.create_reader` in my script, it threw
an exception that the method didn’t exist
----
2018-04-23 16:18:18 UTC - Byron: Sorry I have the 1.19.0 client installed
----
2018-04-23 16:18:35 UTC - Byron: I only see `close`, `create_producer` and
`subscribe` on the client class
----
2018-04-23 16:18:59 UTC - Matteo Merli: Ok, the Reader was in the Java API only
until 1.20, when it was added to C++/Python
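
A small guard can make this kind of version mismatch easier to spot. The sketch below assumes the Python client's `create_reader(topic, start_message_id)` call from the API docs linked above; `open_reader`, `OldClient` and `NewClient` are hypothetical names used only for illustration, not part of the library.

```python
def open_reader(client, topic, start_message_id):
    """Create a reader, or fail with a clear message on clients older than 1.20."""
    create_reader = getattr(client, "create_reader", None)
    if create_reader is None:
        raise RuntimeError(
            "this pulsar-client build has no create_reader; "
            "the Reader API reached C++/Python in 1.20"
        )
    return create_reader(topic, start_message_id)


# Hypothetical stand-ins for the real pulsar.Client, so the sketch runs
# without a broker or the pulsar-client package installed.
class OldClient:
    """Mimics a 1.19 client: only close, create_producer and subscribe."""

    def close(self):
        pass

    def create_producer(self, topic):
        pass

    def subscribe(self, topic, subscription_name):
        pass


class NewClient(OldClient):
    """Mimics a 1.20+ client that also exposes create_reader."""

    def create_reader(self, topic, start_message_id):
        return ("reader", topic, start_message_id)
```

With the real client this would be called roughly as `open_reader(pulsar.Client("pulsar://localhost:6650"), "my-topic", pulsar.MessageId.earliest)`, where the service URL and topic name are placeholders.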
----
2018-04-23 16:21:30 UTC - Byron: Is it possible 1.19.0 is only compatible with
Python 3? While up to 1.22 is Python 2? I see version 1.22
(<https://pypi.org/project/pulsar-client/#history>), but using pip does not
allow me to download anything above 1.19
----
2018-04-23 16:22:18 UTC - Matteo Merli: Are you on MacOS 10.11 by any chance?
----
2018-04-23 16:22:27 UTC - Byron: yes
----
2018-04-23 16:23:03 UTC - Byron: ah i see the distributions are 10.12 and above
----
2018-04-23 16:23:30 UTC - Matteo Merli: ok, the issue is that the binaries
uploaded on PyPI after 1.19 are only for 10.12 and 10.13
----
2018-04-23 16:23:58 UTC - Byron: Great thanks
----
2018-04-23 16:24:32 UTC - Byron: Hm, ok. So presumably I can still build from
source
----
2018-04-23 16:25:44 UTC - Matteo Merli: Yes, the instructions are at
<https://github.com/apache/incubator-pulsar/tree/master/pulsar-client-cpp#compile-on-mac-os-x>
----
2018-04-23 16:26:21 UTC - Matteo Merli: Though we’ll try to get the 10.11 build
for the next releases
----
2018-04-23 16:26:47 UTC - Byron: Thanks. Yea unfortunately my employer is a tad
slow to allow certain OS upgrades
----
2018-04-23 16:27:42 UTC - Byron: Thanks for the help
----
2018-04-23 16:28:07 UTC - Matteo Merli: No problem, let me know if you have
issues in building from source
----
2018-04-23 16:28:53 UTC - Byron: I did try before (against master) and ran into
a boost-python issue.. it could not be found on my system. I did follow
the instructions to install it from homebrew
----
2018-04-23 16:29:28 UTC - Byron: @Byron uploaded a file:
<https://apache-pulsar.slack.com/files/UACD54WB1/FABHYKGGJ/-.txt|Untitled>
----
2018-04-23 16:32:17 UTC - Matteo Merli: Uhm, ok, I think it might be an issue
with the versions of boost that are being picked up by brew. Let me try
something
----
2018-04-23 17:37:12 UTC - Matteo Merli: @Byron Just uploaded Py-2.7 binaries
for mac-10.11 for Pulsar-1.22.0
----
2018-04-23 17:37:45 UTC - Byron: Great, any chance you could build Py-3.x?
----
2018-04-23 17:38:16 UTC - Byron: Did you happen to determine what the boost
problem could be?
----
2018-04-23 17:40:51 UTC - Matteo Merli: The problem was that you need to use
[email protected] and [email protected] (to match) and run `brew link --force`
to have them linked in `/usr/local/include`
----
2018-04-23 17:40:58 UTC - Matteo Merli: Let me see Py3
----
2018-04-23 18:21:43 UTC - Byron: It appears boost-python3 only goes back to
1.66, latest is 1.67
----
2018-04-23 18:21:51 UTC - Byron: (which is the latest boost version also)
----
2018-04-23 18:23:25 UTC - Matteo Merli: Yes, that’s giving some trouble
:slightly_smiling_face:
----
2018-04-23 18:24:09 UTC - Byron: No worries, I will use Py2 for now. I
appreciate the help
+1 : Matteo Merli
----
2018-04-23 18:24:29 UTC - Byron: If/when I deploy anything it would be in linux
containers anyway
----
2018-04-23 18:25:35 UTC - Matteo Merli: There, the py2/py3 binaries are
available, since it is easier to automate the build in a container with the
precise dependencies and compile statically
----
2018-04-23 18:25:53 UTC - Byron: yes, makes sense
----
2018-04-23 19:41:09 UTC - Byron: Based on the docs, a partitioned topic seems
to be only relevant to producers. The number of partitions (or even that it is
partitioned in the first place) is transparent to consumers? The only side
effect the consumer would observe is the message ordering?
----
2018-04-23 19:43:38 UTC - Byron: Can a topic be repartitioned? If so, the key
hash will change causing messages with an existing key being routed to a
potentially different partition than before. So with the same reasoning as
above.. since partitions are transparent to consumers the message ordering (wrt
the message key) would remain regardless that they are being published to a
different partition.
----
2018-04-23 22:10:39 UTC - Doodle: @Matteo Merli - ah cheers - I guess that the
documentation around this configuration for 4.7 isn't quite there? It seems to
suggest a single path for the journal unlike the ledgers
----
2018-04-23 22:12:45 UTC - Matteo Merli: Take a look at `journalDirectories` in
<http://bookkeeper.apache.org/docs/4.7.0/reference/config/>
----
2018-04-23 22:14:45 UTC - Doodle: Ah egg on my face there - I have the docs
open on all sorts of versions in different tabs - completely looking at the
wrong one! Cheers @Matteo Merli!
----
2018-04-23 22:18:29 UTC - Matteo Merli: :grinning:
----
2018-04-24 00:21:32 UTC - Matteo Merli: > Based on the docs, a partitioned
topic seems to be only relevant to producers. The number of partitions (or even
that it is partitioned in the first place) is transparent to consumers? The
only side effect the consumer would observe is the message ordering?
@Byron It is also relevant to consumers, if you care about ordering. The same
subscription modes apply in case of partitioned topics. Eg:
* Shared -> All consumers consume from all partitions — no ordering
guarantee
* Failover -> There will be one active consumer per partition —
Partitions will be spread evenly across available consumers — Order is
guaranteed
* Exclusive -> Only 1 consumer allowed across all partitions — In general
it’s not very useful with partitions
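
The difference between Shared and Failover can be sketched with a toy dispatcher. This is a simplified model, not broker code: messages are `(partition, sequence)` pairs, and the round-robin and partition-to-consumer assignment rules are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import cycle


def shared_dispatch(messages, consumers):
    """Shared mode: hand messages to consumers round-robin, ignoring partitions."""
    received = defaultdict(list)
    for consumer, msg in zip(cycle(consumers), messages):
        received[consumer].append(msg)
    return dict(received)


def failover_dispatch(messages, consumers, num_partitions):
    """Failover mode: each partition has exactly one active consumer."""
    # Spread partitions evenly over consumers (this assignment rule is assumed).
    owner = {p: consumers[p % len(consumers)] for p in range(num_partitions)}
    received = defaultdict(list)
    for partition, seq in messages:
        received[owner[partition]].append((partition, seq))
    return dict(received)


# Three partitions, two messages each, interleaved as they might arrive.
messages = [(p, s) for s in range(2) for p in range(3)]
shared = shared_dispatch(messages, ["c1", "c2"])
failover = failover_dispatch(messages, ["c1", "c2"], num_partitions=3)
```

In this model the two messages of partition 0 end up on different consumers under Shared, while under Failover every partition is read in order by a single consumer.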
----
2018-04-24 00:22:35 UTC - Matteo Merli: > Can a topic be repartitioned? If
so, the key hash will change causing messages with an existing key being routed
to a potentially different partition than before. So with the same reasoning as
above.. since partitions are transparent to consumers the message ordering (wrt
the message key) would remain regardless that they are being published to a
different partition.
Yes, the number of partitions in a topic can be increased, though, as of now,
when that happens, order will not be guaranteed
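
Why ordering breaks can be seen from how a key maps to a partition. The sketch below assumes a hash-modulo routing rule and uses a Java-style string hash as a stand-in; the hashing scheme the client actually uses may differ, and `partition_for` is a hypothetical helper.

```python
def java_style_hash(key):
    """Java-like String.hashCode, kept to unsigned 32 bits (a stand-in hash)."""
    h = 0
    for ch in key:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def partition_for(key, num_partitions):
    """Route a key to a partition by hash modulo the partition count."""
    return java_style_hash(key) % num_partitions


keys = ["a", "b", "c", "d", "e", "f"]
before = {k: partition_for(k, 4) for k in keys}  # topic with 4 partitions
after = {k: partition_for(k, 6) for k in keys}   # same topic grown to 6
moved = [k for k in keys if before[k] != after[k]]
# moved == ['d', 'e', 'f']
```

Messages for a moved key that were published before the change sit in the old partition and later ones in the new partition, so a consumer has no single ordered stream for that key.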
----
2018-04-24 00:23:33 UTC - Byron: Ok. Is there any guarantee that a consumer
will receive messages for the same key?
----
2018-04-24 00:23:50 UTC - Byron: In the shared case
----
2018-04-24 00:24:15 UTC - Matteo Merli: no, in the shared subscription it’s a
round-robin delivery
----
2018-04-24 00:24:34 UTC - Byron: Ok understood thanks
----
2018-04-24 00:25:17 UTC - Matteo Merli: you need failover to maintain ordering,
and start with a number of partitions that allows for some extra capacity
----
2018-04-24 00:26:08 UTC - Byron: Yep that makes sense
----
2018-04-24 00:26:21 UTC - Byron: Just figuring out which modes fit different use
cases
----
2018-04-24 00:26:29 UTC - Byron: This is helpful
----
2018-04-24 00:27:30 UTC - Byron: Also one other note (which may be in the next
release), but the Python consumer class doesn’t have a `seek` method defined
(as you noted above)
----
2018-04-24 00:29:29 UTC - Matteo Merli: Correct, there are a few new features
that were added to Java but not yet to C++/Python. There’s a project to track
them. <https://github.com/apache/incubator-pulsar/projects/7> Most likely we’ll
tackle them post 2.0, since we already have a big list of items waiting to be
released.
----
2018-04-24 00:29:57 UTC - Byron: Got it, thanks
----
2018-04-24 01:30:03 UTC - Byron: Has anyone seen the following error?
I am setting up the components on kubernetes..
----
2018-04-24 01:30:08 UTC - Byron: @Byron uploaded a file:
<https://apache-pulsar.slack.com/files/UACD54WB1/FAC0YHS9K/Untitled.txt|Untitled>
----
2018-04-24 01:30:33 UTC - Byron: the bookie connects to ZK, but then this error
follows
----
2018-04-24 05:03:30 UTC - jia zhai: This is because of a cookie mismatch.
Clearing the local bookie data and the bookie's metadata in ZK should solve this.
----
2018-04-24 07:38:10 UTC - Sijie Guo: @Byron which deployment method are you
using in k8s? daemon set or stateful set?
----
2018-04-24 07:38:36 UTC - Sijie Guo: I suspect the disks on your bookies are
erased when the bookie pods restart..
----