Slack digest for #general - 2020-06-24

Apache Pulsar Slack Wed, 24 Jun 2020 02:11:31 -0700

2020-06-23 11:28:24 UTC - Oleg Brovko: @Oleg Brovko has joined the channel
----
2020-06-23 11:38:08 UTC - Pierre-Yves Lebecq: Hey :wave: I’m using the C++ 
client in a build script, using the following link to download the .deb 
package: 
<https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=pulsar/pulsar-2.5.2/DEB/apache-pulsar-client.deb>
This is the only link I found in the documentation when I wrote it.
It seems that since the 2.6 release, this link now returns a 404 error. Does 
anyone know any permanent link for previous versions so I can keep my script 
running using the 2.5.2 version?
----
2020-06-23 12:28:35 UTC - testinglab89: 
----
2020-06-23 13:06:09 UTC - Yifan: Hi all, I have been having a problem with 
pulsar-client since 2.5.2 on OSX Catalina. The problem is:
```E   ImportError: dlopen(*/.tox/py37/lib/python3.7/site-packages/_pulsar.cpython-37m-darwin.so, 2): Symbol not found: __Py_tracemalloc_config
E     Referenced from: */.tox/py37/lib/python3.7/site-packages/_pulsar.cpython-37m-darwin.so
E     Expected in: flat namespace
E    in */.tox/py37/lib/python3.7/site-packages/_pulsar.cpython-37m-darwin.so```
Does anyone else see this problem?
----
2020-06-23 13:11:15 UTC - Yifan: I was able to fix it by locking pulsar-client 
to 2.5.1; it stopped working when Homebrew upgraded libpulsar to 0.6.0.
----
2020-06-23 13:39:24 UTC - Gilles Barbier: Hi, have you successfully used the 
latest standalone 2.6.0 (-all) Docker image? I can’t get it to run.
----
2020-06-23 16:05:55 UTC - Matteo Merli: @Pierre-Yves Lebecq All releases are 
available at <http://archive.apache.org/dist/pulsar/>


We link the latest release to the mirrors, based on ASF policies.
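
For a build script that needs a pinned version, the download URL can be derived from the archive root above. This is only a minimal sketch: it assumes the archive keeps the same `pulsar-<version>/DEB/apache-pulsar-client.deb` layout as the mirror link in the original question, and `archive_deb_url` is a hypothetical helper name, not part of any Pulsar tooling.

```python
# Archive root as given in the message above.
ARCHIVE_ROOT = "http://archive.apache.org/dist/pulsar/"


def archive_deb_url(version: str, filename: str = "apache-pulsar-client.deb") -> str:
    """Build an archive URL for a pinned release.

    Assumption: the archive uses the same pulsar-<version>/DEB/<file>
    layout as the mirror link quoted in the original question.
    """
    return f"{ARCHIVE_ROOT}pulsar-{version}/DEB/{filename}"


print(archive_deb_url("2.5.2"))
```

It is worth checking the directory listing under the archive root once by hand before relying on the generated path.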
----
2020-06-23 16:18:57 UTC - rani: *[Pulsar 2.5.1] Bookie Decommissioning 
(`autoRecoveryDaemonEnabled` = `true`) @Sijie Guo* 
I am running a bookie cluster on AWS that is scaled by an ASG. I operate with a 
3-node bookkeeper cluster at all times. If I need to release a new AMI, my 
current strategy is to scale the bookkeeper nodes 3 -> 6 and then scale them 
back down again 6 -> 3 (in order to remove the old bookkeeper nodes).

The bookkeeper decommissioning steps I’m taking on the nodes that need to be 
scaled down are:
1. Shut down the bookkeeper service
2. `/opt/pulsar/bin/bookkeeper shell decommissionbookie` 
The `decommissionbookie` command hangs most of the time or takes a long time to 
complete, even though I have produced almost no custom events into my 
Pulsar cluster. Any clues as to what’s happening, or whether there’s any way I 
can optimise my workflow?
----
2020-06-23 16:23:40 UTC - Sijie Guo: What was the issue you encountered?
----
2020-06-23 16:24:21 UTC - Adriaan de Haan: Hi, I have just installed the 2.6.0 
based kubernetes helm chart
----
2020-06-23 16:24:44 UTC - Adriaan de Haan: and everything seems to have started 
up ok, except for the "recovery" POD
----
2020-06-23 16:25:46 UTC - Adriaan de Haan: Checking the logs, it is stuck in 
the pulsar-bookkeeper-verify-clusterid container with the following error:
----
2020-06-23 16:25:49 UTC - Adriaan de Haan: ```JMX enabled by default
Error: Could not find or load main class "
JMX enabled by default
Error: Could not find or load main class "
JMX enabled by default
Error: Could not find or load main class "
JMX enabled by default
Error: Could not find or load main class "```

----
2020-06-23 16:26:22 UTC - Adriaan de Haan: any ideas? anybody else with a 
similar issue?
----
2020-06-23 16:27:09 UTC - Sijie Guo: 1. Any logs from decommissionbookie? I 
would recommend you first try running it manually to see if there is any issue 
with this approach.
2. Any reason why you go 3 -> 6? Why can’t you do 3 -> 4, since you need 
to take down bookies one by one anyway?
----
2020-06-23 16:27:56 UTC - Sijie Guo: Are you using the latest master from 
<https://github.com/apache/pulsar-helm-chart> ?
----
2020-06-23 16:28:09 UTC - Sijie Guo: I think I fixed the issue along with 2.6.0 
release.
----
2020-06-23 16:28:28 UTC - Sijie Guo: Because there is a change in 2.6.0 to 
how we apply environment variables.
----
2020-06-23 16:29:20 UTC - Adriaan de Haan: I see there is already an issue
----
2020-06-23 16:29:31 UTC - Adriaan de Haan: yes
----
2020-06-23 16:29:43 UTC - Adriaan de Haan: I just did a git clone from that repo
----
2020-06-23 16:30:13 UTC - Adriaan de Haan: 
<https://github.com/apache/pulsar/issues/7243>
----
2020-06-23 16:30:37 UTC - Adriaan de Haan: Somebody else confirmed the same 
problem 2 days ago
----
2020-06-23 16:30:50 UTC - Sijie Guo: Yes.
----
2020-06-23 16:31:06 UTC - Sijie Guo: I think I missed one change in the bookie 
recovery pod
----
2020-06-23 16:31:12 UTC - Adriaan de Haan: probably :slightly_smiling_face:
----
2020-06-23 16:31:30 UTC - Adriaan de Haan: quick fix? :slightly_smiling_face:
----
2020-06-23 16:31:40 UTC - rani: 1. I do not receive any error logs, only the 
following, for example:
```16:29:03.009 [main] INFO  org.apache.bookkeeper.client.BookKeeperAdmin - Count of Ledgers which need to be rereplicated: 8```
2. Going from 3 -> 6 simplifies the process of AMI rotation, as it only 
involves 2 steps. If I were to do 3 -> 4 -> 3 -> 4 -> 3 -> 4, 
it’ll add a few more steps, considering we’ll need to repeat the same for all 
Pulsar components (proxy, bookkeeper, zookeeper, etc). However, if you’re 
suggesting that this approach is cleaner, then we can definitely script this 
AMI rotation process to simplify things.
----
2020-06-23 16:33:50 UTC - Sijie Guo: Sending out a PR now
----
2020-06-23 16:35:04 UTC - Sijie Guo: 
<https://github.com/apache/pulsar-helm-chart/pull/24>
----
2020-06-23 16:36:40 UTC - Adriaan de Haan: great, I can fix that! a pity I 
can't approve
----
2020-06-23 16:38:39 UTC - Adriaan de Haan: Quick off-topic question
----
2020-06-23 16:39:16 UTC - Adriaan de Haan: The java memory settings, are they 
pretty battle-tested? Or could some of them require some tweaking?
----
2020-06-23 16:40:30 UTC - Adriaan de Haan: I noticed my previous cluster had 
some PODs where the garbage collection was going crazy (2.5.2 release). I also 
noticed however that the G1 garbage collector wasn't active, so perhaps those 
settings were not applied before?
----
2020-06-23 16:42:08 UTC - Pierre-Yves Lebecq: @Matteo Merli Thank you very much!
----
2020-06-23 18:19:25 UTC - sundar: Hello all, I am trying to write tests for the 
pulsar-functions localrun module, more specifically LocalRunner.java. There 
is an import error on org.apache.pulsar.functions.proto.Function, 
which says "cannot resolve symbol Function". This error is present in a few more 
modules under pulsar-functions. This is preventing me from using the debugger. 
Can anyone help me out with this?
----
2020-06-23 18:33:34 UTC - Sijie Guo: The default values are used for minimal 
setup.
----
2020-06-23 18:33:55 UTC - Sijie Guo: GC settings are pretty good. I don’t think 
you need to tune them.
----
2020-06-23 18:34:07 UTC - Sijie Guo: But you need to adjust the memory settings 
based on your requirements.
----
2020-06-23 18:44:16 UTC - Adriaan de Haan: I am a bit perplexed about storage 
usage of pulsar topics
----
2020-06-23 18:45:42 UTC - Adriaan de Haan: When is storage actually released? I 
did some testing and there's a lot of storage being used by my topics, but all 
subscriptions have ack'ed all messages and I don't change any persistence 
settings of the topics.
----
2020-06-23 18:45:46 UTC - Adriaan de Haan: 
----
2020-06-23 18:46:26 UTC - Adriaan de Haan: as you can see, the backlog is zero, 
but the storage size keeps going up
----
2020-06-23 18:50:53 UTC - Chris Bartholomew: This blog post might help: 
<https://kesque.com/understanding-pulsar-message-ttl-backlog-and-retention/>
----
2020-06-23 19:07:20 UTC - Sijie Guo: The storage size is the amount of data 
that is still “live” (not deleted). Pulsar’s partition is a segment-based 
implementation. Data is deleted segment by segment. So even if the backlog is 
zero, you will still see a non-zero storage size, because one or more 
segments have not been deleted. They are not deleted either because 
of the retention policy or because it is the last segment in the partition and is 
not sealed yet.

You can watch this video. I walk through the lifecycle of a Pulsar message: 
<https://www.youtube.com/watch?v=R197TYYFaiI>

<https://www.slideshare.net/streamnative/tgipulsar-ep-006-lifecycle-of-a-pulsar-message>
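
To make the segment-by-segment behaviour concrete, here is a toy model of the explanation above. It is an illustrative sketch, not Pulsar code: `Segment` and `storage_after_trim` are hypothetical names, the sizes are invented, and retention policies are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    first_entry: int  # id of the first entry in this segment
    entries: int      # number of entries written so far
    size_bytes: int   # bytes this segment occupies
    sealed: bool      # False for the segment still being written to


def storage_after_trim(segments, acked_up_to):
    """Return the bytes still stored once deletable segments are dropped.

    In this toy model a segment is dropped only when it is sealed and
    every entry in it has been acknowledged. Retention is ignored.
    """
    live = 0
    for seg in segments:
        last_entry = seg.first_entry + seg.entries - 1
        fully_acked = last_entry <= acked_up_to
        if not (seg.sealed and fully_acked):
            live += seg.size_bytes
    return live


segments = [
    Segment(first_entry=0, entries=50_000, size_bytes=400, sealed=True),
    Segment(first_entry=50_000, entries=50_000, size_bytes=400, sealed=True),
    Segment(first_entry=100_000, entries=10_000, size_bytes=80, sealed=False),
]

# Every entry (0..109_999) is acknowledged, so the backlog is zero,
# yet the open segment still counts toward the storage size.
print(storage_after_trim(segments, acked_up_to=109_999))
```

With all entries acknowledged, only the unsealed last segment remains, which is why a zero backlog can coexist with a non-zero storage size.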
----
2020-06-23 19:43:01 UTC - rani: Just tried doing it manually. It took 
~12 minutes to decommission 2/3 bookies. Is it expected to take this long?
```19:27:05.318 [main] INFO  org.apache.bookkeeper.client.BookKeeperAdmin - Count of Ledgers which need to be rereplicated: 1
19:27:15.320 [main] INFO  org.apache.bookkeeper.client.BookKeeperAdmin - Count of Ledgers which need to be rereplicated: 1
.
.
.
19:39:55.663 [main] INFO  org.apache.bookkeeper.tools.cli.commands.bookies.DecommissionCommand - Cookie of the decommissioned bookie: 1.2.3.4:3181 is deleted successfully```

----
2020-06-23 20:13:15 UTC - rani: Repeated the experiment, this time with 
`pulsar-perf` producing data into a topic (~0.7 GB so far). It’s now been 
~25 minutes and the decommissioning command is still running on 3 bookkeepers 
simultaneously!
----
2020-06-23 20:15:30 UTC - rani: any hints @Sijie Guo? Could there be a 
parameter that I need to re-configure?
----
2020-06-23 20:39:50 UTC - Sijie Guo: Decommissioning a bookie requires copying 
the data from that bookie to others. How long it takes depends on 1) how much 
data there is; 2) the tuning of the re-replication settings. If you have one ledger, it 
will read the entries of that ledger in sequence to replicate them. It might be 
limited by the re-replication batch size, because you don’t want re-replicating 
the entries to take a huge amount of your bandwidth.

The question I have here is why do you need to decommission? You are just updating 
the AMI. Can’t you update the AMI on the existing bookies?
----
2020-06-23 21:19:06 UTC - rwaweber: Without hijacking Adriaan’s thread:

On the topic of “un-sealed segments” is there a way to identify if segments are 
still open and which ones they are?

Also, to that same idea — does a segment stay open until it is filled? And is 
that size dictated by the `logSizeLimit` on bookkeeper?
----
2020-06-23 21:35:54 UTC - Sijie Guo: `pulsar-admin topics stats-internal`
----
2020-06-23 21:36:27 UTC - Sijie Guo: `topics stats` and `topics stats-internal` 
are the two commands you can rely on for your daily operations on Pulsar
----
2020-06-23 21:37:00 UTC - Sijie Guo: logSizeLimit is a bookkeeper-side 
setting. It controls the size of the files on the bookie side.
----
2020-06-23 21:37:48 UTC - Sijie Guo: The size of the segment is controlled on 
the broker side. You can check the settings `*LedgerRollover*`. Those are used for 
controlling when to roll over to a new ledger.
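
A rough sketch of how such a rollover decision can be modelled. This is an assumption-laden illustration, not broker code: the rule (roll over once the minimum time has passed and either the entry limit or the maximum time is reached) and the 240-minute maximum reflect my reading of `broker.conf`, while the 50k-entry and 10-minute values are the defaults mentioned later in this thread. `should_roll_over` is a hypothetical name, and other triggers such as ledger size are ignored.

```python
def should_roll_over(entries: int, age_minutes: float,
                     max_entries: int = 50_000,
                     min_minutes: float = 10,
                     max_minutes: float = 240) -> bool:
    """Decide whether the current ledger should be closed and a new one opened.

    Assumed rule: never roll over before min_minutes; after that, roll over
    when the entry limit or the maximum ledger age is reached.
    """
    if age_minutes < min_minutes:
        return False
    return entries >= max_entries or age_minutes >= max_minutes


# A ledger that is already over the entry limit still waits for the minimum time.
print(should_roll_over(entries=60_000, age_minutes=5))
print(should_roll_over(entries=60_000, age_minutes=10))
```

Under this model a burst of messages written within a few minutes can sit in one large open ledger, which would be one reason storage is not released as early as the entry limit alone suggests.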
----
2020-06-23 22:21:26 UTC - Adriaan de Haan: Great, thanks for asking these 
questions because those would have been my own follow-up doubts 
:slightly_smiling_face:
----
2020-06-23 22:43:42 UTC - Adriaan de Haan: Now I have another issue... during 
my attempts to see if I can trigger cleanup of the storage usage, I tried 
"compaction" of the one topic. The compaction finished quite quickly, but 
didn't have any impact on the storage space used. But it had an unfortunate 
side-effect... I now have a __compaction subscription on this topic without a 
consumer... no idea why it's hanging around.
----
2020-06-24 00:17:03 UTC - Adriaan de Haan: I have checked the video and the 
settings you mentioned, but the amount of space being used still doesn't make 
sense
----
2020-06-24 00:17:06 UTC - Adriaan de Haan: 
----
2020-06-24 00:19:13 UTC - Adriaan de Haan: It is using 2.5 GB already and not 
releasing it, and I have all the default settings, which should roll over after 50k 
entries. This was 6 million messages, so it should have rolled over many 
times already...
----
2020-06-24 00:20:17 UTC - Adriaan de Haan: and the 10-minute minimum also passed a 
long time ago
----
2020-06-24 01:22:39 UTC - Sijie Guo: You can run `bin/pulsar-admin topics 
stats-internal` to get the internal stats of a given topic.
----
2020-06-24 01:22:54 UTC - Sijie Guo: That will tell you how the storage size 
was used.
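
A small sketch of how the JSON printed by that command could be summarised to see where the bytes sit. The field names (`ledgers`, `size`, `currentLedgerSize`, `totalSize`) are assumptions based on my recollection of the stats-internal output and should be checked against your version; the sample values are invented, and `summarize_internal_stats` is a hypothetical helper.

```python
import json


def summarize_internal_stats(stats: dict) -> dict:
    """Split a topic's stored bytes into listed ledgers and the open ledger.

    Assumption: `stats` is the parsed JSON from `topics stats-internal`,
    with a `ledgers` list plus `currentLedgerSize` and `totalSize` fields.
    """
    ledgers = stats.get("ledgers", [])
    return {
        "ledger_count": len(ledgers),
        "listed_ledger_bytes": sum(item.get("size", 0) for item in ledgers),
        "open_ledger_bytes": stats.get("currentLedgerSize", 0),
        "total_bytes": stats.get("totalSize", 0),
    }


# Invented sample shaped like the assumed output; not real cluster data.
sample = json.loads("""
{
  "totalSize": 2024,
  "currentLedgerEntries": 1200,
  "currentLedgerSize": 24,
  "ledgers": [
    {"ledgerId": 1, "entries": 50000, "size": 1000},
    {"ledgerId": 2, "entries": 50000, "size": 1000},
    {"ledgerId": 3, "entries": 0, "size": 0}
  ]
}
""")

print(summarize_internal_stats(sample))
```

A long `ledgers` list with a zero backlog would point at retention or slow trimming, while a large open ledger would point at rollover not having happened yet.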
----
2020-06-24 05:45:37 UTC - Karthik Ramasamy: Slides of my keynote 'Why Splunk 
Chose Pulsar' at Pulsar Summit - 
<https://www.slideshare.net/KarthikRamasamy3/pulsar-summitkeynotefinal>
+1 : Sijie Guo, Julius S, Ali Ahmed, Toktok Rambo
100 : Sijie Guo, Julius S
----
2020-06-24 06:22:39 UTC - Dan Melman: @Dan Melman has joined the channel
----
2020-06-24 08:21:17 UTC - Gilles Barbier: The Docker image fails right away 
with: `Error: Could not find or load main class "`
----
2020-06-24 08:21:42 UTC - Gilles Barbier: it does not go further
----
2020-06-24 08:36:42 UTC - Sijie Guo: The docker compose file has an issue.
----
2020-06-24 08:36:49 UTC - Sijie Guo: There is an issue tracking that.
----
2020-06-24 08:39:18 UTC - Gilles Barbier: Thx - 
<https://github.com/apache/pulsar/issues/7315> - I'm going to try that
----
2020-06-24 08:42:20 UTC - Gilles Barbier: It worked by replacing PULSAR_MEM by 
BOOKIE_MEM thx
----
