Slack digest for #general - 2020-07-14

Apache Pulsar Slack Tue, 14 Jul 2020 02:11:37 -0700

2020-07-13 11:19:45 UTC - Rahul Vashishth: @Ali Ahmed
>  Since a topic can have multiple backlogs, Pulsar applies the limit to the 
largest subscription backlog for the topic (that is from the slowest consumer).
As you mentioned, a topic can have multiple backlogs (one per subscription). Does 
each backlog keep a copy of the messages, or does it only maintain a cursor for 
each subscription in the message backlog?
I am trying to understand how a message backlog is different from a topic backlog.
----
2020-07-13 11:31:06 UTC - Ali Ahmed: it maintains cursors
+1 : Rahul Vashishth
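
To illustrate the answer above, here is a minimal toy model (not Pulsar code) of one shared message log with a cursor per subscription. The class and method names are hypothetical, and acknowledgements are simplified to be cumulative. The point it shows: messages are stored once, each subscription's backlog is just the distance from its cursor to the end of the log, and the "largest backlog" belongs to the slowest subscription.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Toy model: one shared message log, one cursor per subscription."""
    log: list = field(default_factory=list)
    cursors: dict = field(default_factory=dict)  # subscription -> next index to read

    def publish(self, msg):
        # The message is appended once, no matter how many subscriptions exist.
        self.log.append(msg)

    def subscribe(self, name):
        # A new subscription starts at the current end of the log.
        self.cursors.setdefault(name, len(self.log))

    def ack(self, name, count):
        # Simplification: acks are cumulative and just advance the cursor.
        self.cursors[name] = min(self.cursors[name] + count, len(self.log))

    def backlog(self, name):
        return len(self.log) - self.cursors[name]

    def largest_backlog(self):
        return max((self.backlog(s) for s in self.cursors), default=0)


topic = Topic()
topic.subscribe("fast")
topic.subscribe("slow")
for i in range(5):
    topic.publish(f"m{i}")
topic.ack("fast", 4)
topic.ack("slow", 1)
print(topic.backlog("fast"), topic.backlog("slow"), topic.largest_backlog())
```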
----
2020-07-13 11:46:31 UTC - Cristian COLA: @Cristian COLA has joined the channel
----
2020-07-13 13:08:27 UTC - wuYin: I sent a PR 
<https://github.com/apache/pulsar-helm-chart/pull/38> to implement this.
Thanks for reviewing.
----
2020-07-13 14:21:49 UTC - Ebere Abanonu: @Sijie Guo @Penghui Li @Matteo Merli 
please, I need to understand this: if I create a consumer before producing 
messages, I get all messages from entryId 0. But if I create a consumer with 
start position earliest after the producer is created, I get only the first 
message, with entryId 0 (if ten messages already exist, I get the first one but 
not the rest, and if new messages are published, messages from the 11th onward 
are sent to the consumer by the broker). Why is this? The same thing happens with 
unacked message redelivery: if I tell the broker to redeliver messages from 
entryId 0 to entryId 10, the broker sends only the message at entryId 0.
----
2020-07-13 15:00:57 UTC - Meyappan Ramasamy: Hi team, I am trying to connect 
the Pulsar Java client to Pulsar running in a Docker container, using the URL 
<pulsar://localhost:6650>, but I am getting the exception below. Please 
let me know how to troubleshoot this issue.
```Connection handshake failed: 
org.apache.pulsar.client.api.PulsarClientException: Connection already closed```


----
2020-07-13 15:27:14 UTC - Rahul Vashishth: I am seeing different monitoring 
data from different sources for the same namespace/topics

I have installed the helm chart and am testing topics on the Pulsar cluster. 
But when I look at topic stats in pulsar-manager, the Grafana dashboards, and the 
admin topic stats API, all three report different topic counts.

Has anyone faced the same issue?
----
2020-07-13 15:31:20 UTC - Rahul Vashishth: I am confused as to which data to 
trust the most.
----
2020-07-13 15:46:19 UTC - Asaf Mesika: 
<https://twitter.com/benstopford/status/1282683695653105666?s=21>
----
2020-07-13 15:46:56 UTC - Asaf Mesika: I’ll reply, but if a committer can chime 
in it will be better 
----
2020-07-13 15:49:41 UTC - Viktor: Hello. I observe a large throughput drop (3x) 
with `journalSyncData=false`, vs the default of `journalSyncData=true` on 
BookKeeper. Is this expected? This is counterintuitive given what is written in 
the BookKeeper docs: `Beware - when disabling data sync in the bookie journal 
might improve the bookie write performance,`
----
2020-07-13 16:16:32 UTC - Addison Higham: @VanderChen `PULSAR_MEM` is the 
setting for broker memory, did you try adjusting `BOOKIE_MEM`?
----
2020-07-13 16:23:37 UTC - Addison Higham: I think what is being discussed by 
"streaming pull" is that you give the broker a message that indicates the 
number of messages you will accept (permits). If it has a backlog, it will 
immediately respond with as many messages as you ask for, but if there are no 
messages currently, it will send you any messages as soon as it gets them (as 
long as it still fits within the allowed permits)

As far as the Pulsar client itself, it is true that  it has an internal buffer 
and that buffer is filled by a background thread, but it is recommended that 
you use the async API for high performance, where it does minimal blocking
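
The permit mechanism described above can be sketched as a small simulation. This is a toy model, not the Pulsar wire protocol; the `Broker` class and its `flow`, `publish`, and `_dispatch` methods are hypothetical names chosen for illustration. It shows the two behaviors mentioned: an existing backlog is drained immediately up to the granted permits, and later messages are pushed as they arrive while permits remain.

```python
from collections import deque


class Broker:
    """Toy dispatcher: pushes messages only while the consumer has permits."""

    def __init__(self):
        self.backlog = deque()
        self.permits = 0
        self.delivered = []  # stands in for messages pushed to the consumer

    def flow(self, permits):
        # The consumer tells the broker how many more messages it will accept.
        self.permits += permits
        self._dispatch()

    def publish(self, msg):
        self.backlog.append(msg)
        self._dispatch()

    def _dispatch(self):
        # Deliver as long as there is both a message and a permit available.
        while self.permits > 0 and self.backlog:
            self.delivered.append(self.backlog.popleft())
            self.permits -= 1


broker = Broker()
broker.publish("a")
broker.publish("b")
broker.flow(3)       # backlog of two is sent at once; one permit remains
broker.publish("c")  # sent as soon as it arrives, using the last permit
broker.publish("d")  # no permits left, so it waits in the backlog
print(broker.delivered, list(broker.backlog))
```

Granting one more permit with `broker.flow(1)` would release the waiting message.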
----
2020-07-13 16:27:22 UTC - Asaf Mesika: So is it similar to the Kafka client, which 
sends a fetch request bounded by a configurable upper limit, and if the broker 
doesn’t have anything it doesn’t answer until it has messages, and then it starts 
streaming the response? In Kafka you can’t do it async, it’s only blocking as far as I know 
----
2020-07-13 16:27:51 UTC - Addison Higham: @Zhenhao Li When you first add a 
bookkeeper node, it registers its name in zookeeper along with a generated ID 
(called the cookie). This cookie gets stored in your data directories. If you 
lose your data directories but register back with zookeeper with the same 
name, this is an error state.

Is it possible you started your bookie node and then cleared out the directory 
mentioned in the error log?

To clear this issue you can use this CLI command:
<https://bookkeeper.apache.org/docs/4.5.1/reference/cli/#bookkeeper-shell-bookieformat>
----
2020-07-13 16:31:37 UTC - Addison Higham: :thinking_face: interesting, do you 
have some more details on your test setup?
----
2020-07-13 16:47:55 UTC - Viktor: I am running open messaging benchmarks. I 
upgraded the setup to 2.6 and just ran with that one flag changed. 
Interestingly, I did notice on bookkeeper graphs that it was syncing a lot less 
with `journalSyncData=false`
----
2020-07-13 16:56:46 UTC - Addison Higham: Hrm... This conversation might be 
most effective as an issue on the bookkeeper project. If you have a minute to 
open an issue there, that would really help.
----
2020-07-13 17:26:19 UTC - Sijie Guo: Because by default the subscription 
initial position is latest. You can change your consumer to use 
`SubscriptionInitialPosition(SubscriptionInitialPosition.earliest)`
----
2020-07-13 17:26:46 UTC - Sijie Guo: Did you expose 6650 outside of the docker 
container?
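
One quick way to check the question above from the host is a plain TCP connect to the published port. This is only a sketch: the helper name `can_connect` is hypothetical, and a successful connect only shows that something is listening on that port (for example because the container was started with `-p 6650:6650`). It does not prove the Pulsar handshake will succeed.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # 6650 is the broker port used in the URL from the question.
    print(can_connect("localhost", 6650))
```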
----
2020-07-13 17:33:07 UTC - Ebere Abanonu: Already done that but only get the 
first message in the ledger entry and not more until a fresh message is produced
----
2020-07-13 17:34:33 UTC - Sijie Guo: I added a few notes
white_check_mark : Asaf Mesika
+1 : Julius S
muscle : Dan Melman
----
2020-07-13 17:36:15 UTC - Sijie Guo: @victor what disks are you using?
----
2020-07-13 17:38:02 UTC - Sijie Guo: Hmm. That sounds like a bug. Can you 
create an issue with your code sample?
----
2020-07-13 17:41:08 UTC - Ebere Abanonu: Bug at the client or broker level? 
Running Broker in standalone mode
----
2020-07-13 17:41:35 UTC - Ebere Abanonu: Testing our own client 
implementation
----
2020-07-13 17:43:21 UTC - Ebere Abanonu: Same with unacked redelivery. It 
worked once until I was forced to refresh docker image because broker was 
failing to start
----
2020-07-13 17:45:04 UTC - Zhenhao Li: @Addison Higham thank you! I didn't touch 
the directory at all. I deployed via some scripts and I can confirm it only 
creates the directory the first time.
I have two questions.
1. is it possible to let the user set the "cookie" instead of using a generated 
one?
2. since cookie is stored in ZK, why does Pulsar bookie need to store it 
locally? 
----
2020-07-13 18:46:04 UTC - Addison Higham: @Zhenhao Li

If you want to share your startup scripts, that might be helpful. I can't say 
how you got into that state, but hopefully the `bookieformat` helps fix it. Did 
you see this guide: 
<https://bookkeeper.apache.org/docs/4.10.0/deployment/manual/>? It is a bit out 
of date as the better command to run is `initnewcluster`


1. yes you can, but I am not sure of all the implications, suggest you look at 
the bookkeeper CLI `bookkeeper shell cookie_create`
2. This is just part of the mechanism to ensure that bookkeeper is in a valid 
state on boot and also to ensure the bookie disks are as expected.
----
2020-07-13 19:46:39 UTC - Zhenhao Li: we use Nix and NixOps to deploy Pulsar to 
NixOS machines. I'm sure it is not the usual way in the Pulsar community
----
2020-07-13 19:50:13 UTC - Zhenhao Li: I forgot to say I was using an existing 
zookeeper cluster
----
2020-07-13 19:51:17 UTC - Zhenhao Li: I just tried to deploy with the bundled 
zookeeper in Pulsar. now it runs on 2 nodes but fails on the node where the 
single zookeeper node is running
----
2020-07-13 19:52:13 UTC - Zhenhao Li: the error is different now:


----
2020-07-13 19:52:13 UTC - Zhenhao Li: ```Jul 13 21:48:32 server1 systemd[1]: 
Started Pulsar's Bookkeeper Daemon.
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: 21:48:39.082 [main] ERROR 
org.apache.bookkeeper.server.Main - Failed to build bookie server
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: 
org.apache.bookkeeper.bookie.BookieException$InvalidCookieException: Cookie [4
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: bookieHost: 
"192.168.1.201:3181"
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: journalDir: 
"/var/lib/pulsar-bookie/journal"
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: ledgerDirs: 
"1\t/var/lib/pulsar-bookie/ledger"
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: instanceId: 
"1d0829c4-8d69-4457-a684-412767fc4b00"
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: ] is not matching with [4
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: bookieHost: 
"192.168.1.201:3181; 192.168.1.202:3181; 192.168.1.203:3181:3181"
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: journalDir: 
"/var/lib/pulsar-bookie/journal"
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: ledgerDirs: 
"1\t/var/lib/pulsar-bookie/ledger"
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: instanceId: 
"f1c20d1f-7c71-4c03-8ef1-dab70aecbf17"
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: ]
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]:         at 
org.apache.bookkeeper.bookie.Cookie.verifyInternal(Cookie.java:136) 
~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]:         at 
org.apache.bookkeeper.bookie.Cookie.verify(Cookie.java:147) 
~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]:         at 
org.apache.bookkeeper.bookie.Bookie.verifyAndGetMissingDirs(Bookie.java:369) 
~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]:         at 
org.apache.bookkeeper.bookie.Bookie.checkEnvironmentWithStorageExpansion(Bookie.java:432)
 ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]:         at 
org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:250) 
~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]:         at 
org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:688) 
~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]:         at 
org.apache.bookkeeper.proto.BookieServer.newBookie(BookieServer.java:136) 
~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]:         at 
org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:105) 
~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]:         at 
org.apache.bookkeeper.server.service.BookieService.<init>(BookieService.java:41)
 ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]:         at 
org.apache.bookkeeper.server.Main.buildBookieServer(Main.java:301) 
~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]:         at 
org.apache.bookkeeper.server.Main.doMain(Main.java:221) 
[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]:         at 
org.apache.bookkeeper.server.Main.main(Main.java:203) 
[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
Jul 13 21:48:39 server1 pulsar-bookie-start[17802]:         at 
org.apache.bookkeeper.proto.BookieServer.main(BookieServer.java:313) 
[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
Jul 13 21:48:39 server1 systemd[1]: pulsar-bookie.service: Main process exited, 
code=exited, status=2/INVALIDARGUMENT
Jul 13 21:48:39 server1 systemd[1]: pulsar-bookie.service: Failed with result 
'exit-code'.
Jul 13 21:48:39 server1 systemd[1]: pulsar-bookie.service: Consumed 11.457s CPU 
time, received 7.4K IP traffic, sent 5.8K IP traffic.```
----
2020-07-13 19:53:20 UTC - Zhenhao Li: the cause seems to be that `bookieHost` 
is inconsistent between zookeeper and local bookie
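
To make the mismatch easier to spot, the two cookie bodies from the exception above can be compared field by field. This is a minimal sketch that only parses the `key: "value"` lines as they appear in the log; `parse_cookie` and `diff_cookies` are hypothetical helpers, not BookKeeper APIs, and the log does not say which of the two cookies is the local one, so they are simply called `first` and `second` here.

```python
def parse_cookie(text):
    """Parse 'key: "value"' lines into a dict, ignoring anything else."""
    fields = {}
    for line in text.strip().splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return fields


def diff_cookies(a, b):
    """Return {field: (value_in_a, value_in_b)} for fields that differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Values copied from the InvalidCookieException in the log above.
first = parse_cookie(r'''
bookieHost: "192.168.1.201:3181"
journalDir: "/var/lib/pulsar-bookie/journal"
ledgerDirs: "1\t/var/lib/pulsar-bookie/ledger"
instanceId: "1d0829c4-8d69-4457-a684-412767fc4b00"
''')
second = parse_cookie(r'''
bookieHost: "192.168.1.201:3181; 192.168.1.202:3181; 192.168.1.203:3181:3181"
journalDir: "/var/lib/pulsar-bookie/journal"
ledgerDirs: "1\t/var/lib/pulsar-bookie/ledger"
instanceId: "f1c20d1f-7c71-4c03-8ef1-dab70aecbf17"
''')

for name, (left, right) in diff_cookies(first, second).items():
    print(f"{name}: {left!r} != {right!r}")
```

Run against these values, the differing fields are `bookieHost` and `instanceId`; the journal and ledger directories match.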
----
2020-07-13 19:55:34 UTC - Zhenhao Li: @Addison Higham I am not doing the manual 
fix yet because I want to make sure our deployment files work correctly. We 
don't want it to happen again when adding new nodes.
----
2020-07-13 19:56:40 UTC - Addison Higham: you may try looking at raw records in 
zookeeper
----
2020-07-13 19:57:06 UTC - Addison Higham: or I assume that is what you did 
already? but yes, there are options to configure how bookie nodes get their 
hostname
----
2020-07-13 19:59:59 UTC - Alan Broddle: We actually thought we had this 
working, and are finding that the TLS security is not actually working.  Short 
version… No we don’t think we have this figured out.
We started looking at it again yesterday and are not seeing where the issue is. 
 When we run a tcpdump, we can see the data between a Broker and BookKeeper.
We think it is something with the cert or “PULSAR_EXTRA_OPTS”
We are NOT seeing a list of the supported extra opts to verify we have the 
correct information

PULSAR_EXTRA_OPTS=” -Dpulsar.allocator.exit_on_oom=true 
-Dio.netty.recycler.maxCapacity.default=1000 
-Dio.netty.recycler.linkCapacity=1024 
-Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty 
-Dzookeeper.client.secure=true -Dzookeeper.ssl.hostnameVerification=false 
-Dzookeeper.ssl.keyStore.location=/usr/pulsar/certs/zookeeper.eaipulsarcloudnaengcluster1.pem
 -Dzookeeper.ssl.trustStore.location=/usr/pulsar/certs/ca.cert.pem”
----
2020-07-13 20:01:22 UTC - Alan Broddle: Update:  Internal Bookie communication 
between bookie servers seems to be working and is encrypted.
Bookie to Broker is not!
----
2020-07-13 20:15:17 UTC - Zhenhao Li: thanks for your help! I figured out the 
cause. I made a mistake on the first run by using the same 
advertisedAddress for each bookie name, and this mistake turned out to be very 
sticky in the sense that re-running with correct advertisedAddress values won't fix 
it.
----
2020-07-13 20:17:42 UTC - Zhenhao Li: I need to put the following on my TODO 
list. 1. add an optional cleanup phase to my deployment script. 2. see if 
there is a better approach to fix it inside the Pulsar project
----
2020-07-13 20:21:20 UTC - Viktor: ok. I will open an issue on bookkeeper. I am 
using SSDs. default in OMB
----
2020-07-13 20:33:05 UTC - Zhenhao Li: Hi, I have some issues with Pulsar 
brokers in my deployment. 2 nodes are running and 2 nodes are failing with the 
following error

----
2020-07-13 20:33:05 UTC - Zhenhao Li: ```Jul 13 22:12:36 server2 systemd[1]: 
Started Pulsar's Broker Daemon.
Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] 
info AspectJ Weaver Version 1.9.2 built on Wednesday Oct 24, 2018 at 15:43:33 
GMT
Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] 
info register classloader sun.misc.Launcher$AppClassLoader@18b4aac2
Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] 
info using configuration 
file:/nix/store/zj60lld9z5yp0s5qas46sffc48wm2c2i-pulsar-2.6.0/lib/org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar!/META-INF/aop.xml
Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] 
info using configuration 
file:/nix/store/zj60lld9z5yp0s5qas46sffc48wm2c2i-pulsar-2.6.0/lib/org.apache.pulsar-pulsar-zookeeper-2.6.0.jar!/META-INF/aop.xml
Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] 
info register aspect org.apache.pulsar.broker.zookeeper.aspectj.ClientCnxnAspect
Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] 
info register aspect org.apache.pulsar.zookeeper.FinalRequestProcessorAspect
Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] 
info register aspect org.apache.pulsar.zookeeper.ZooKeeperServerAspect
Jul 13 22:13:09 server2 pulsar-broker-start[26492]: 22:13:09.274 [main] ERROR 
org.apache.pulsar.broker.PulsarService - Failed to establish session with local 
ZK
Jul 13 22:13:09 server2 pulsar-broker-start[26492]: java.io.IOException: Failed 
to establish session with local ZK
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
org.apache.pulsar.zookeeper.LocalZooKeeperConnectionService.start(LocalZooKeeperConnectionService.java:74)
 ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar:2.6.0]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
org.apache.pulsar.broker.PulsarService.start(PulsarService.java:438) 
[org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
org.apache.pulsar.PulsarBrokerStarter$BrokerStarter.start(PulsarBrokerStarter.java:280)
 [org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
org.apache.pulsar.PulsarBrokerStarter.main(PulsarBrokerStarter.java:349) 
[org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]: Caused by: 
java.util.concurrent.TimeoutException
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) 
~[?:1.8.0_242]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) 
~[?:1.8.0_242]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
org.apache.pulsar.zookeeper.LocalZooKeeperConnectionService.start(LocalZooKeeperConnectionService.java:68)
 ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar:2.6.0]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         ... 3 more
Jul 13 22:13:09 server2 pulsar-broker-start[26492]: 22:13:09.286 [main] ERROR 
org.apache.pulsar.PulsarBrokerStarter - Failed to start pulsar service.
Jul 13 22:13:09 server2 pulsar-broker-start[26492]: 
org.apache.pulsar.broker.PulsarServerException: java.io.IOException: Failed to 
establish session with local ZK
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
org.apache.pulsar.broker.PulsarService.start(PulsarService.java:587) 
~[org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
org.apache.pulsar.PulsarBrokerStarter$BrokerStarter.start(PulsarBrokerStarter.java:280)
 ~[org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
org.apache.pulsar.PulsarBrokerStarter.main(PulsarBrokerStarter.java:349) 
[org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]: Caused by: 
java.io.IOException: Failed to establish session with local ZK
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
org.apache.pulsar.zookeeper.LocalZooKeeperConnectionService.start(LocalZooKeeperConnectionService.java:74)
 ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar:2.6.0]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
org.apache.pulsar.broker.PulsarService.start(PulsarService.java:438) 
~[org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         ... 2 more
Jul 13 22:13:09 server2 pulsar-broker-start[26492]: Caused by: 
java.util.concurrent.TimeoutException
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) 
~[?:1.8.0_242]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) 
~[?:1.8.0_242]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
org.apache.pulsar.zookeeper.LocalZooKeeperConnectionService.start(LocalZooKeeperConnectionService.java:68)
 ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar:2.6.0]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         at 
org.apache.pulsar.broker.PulsarService.start(PulsarService.java:438) 
~[org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0]
Jul 13 22:13:09 server2 pulsar-broker-start[26492]:         ... 2 more
Jul 13 22:13:09 server2 systemd[1]: pulsar-broker.service: Main process exited, 
code=exited, status=1/FAILURE
Jul 13 22:13:09 server2 systemd[1]: pulsar-broker.service: Failed with result 
'exit-code'.
Jul 13 22:13:09 server2 systemd[1]: pulsar-broker.service: Consumed 4.488s CPU 
time, received 0B IP traffic, sent 480B IP traffic.```
----
2020-07-13 20:33:05 UTC - Zhenhao Li: the whole deployment uses a single 
zookeeper node.
----
2020-07-13 20:34:55 UTC - Zhenhao Li: logs on the working nodes:

```Jul 13 22:23:38 server1 systemd[1]: Started Pulsar's Broker Daemon.
Jul 13 22:23:38 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] 
info AspectJ Weaver Version 1.9.2 built on Wednesday Oct 24, 2018 at 15:43:33 
GMT
Jul 13 22:23:38 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] 
info register classloader sun.misc.Launcher$AppClassLoader@18b4aac2
Jul 13 22:23:38 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] 
info using configuration 
file:/nix/store/zj60lld9z5yp0s5qas46sffc48wm2c2i-pulsar-2.6.0/lib/org.apache.pulsar-pulsar-zookeeper-2.6.0.jar!/META-INF/aop.xml
Jul 13 22:23:38 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] 
info using configuration 
file:/nix/store/zj60lld9z5yp0s5qas46sffc48wm2c2i-pulsar-2.6.0/lib/org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar!/META-INF/aop.xml
Jul 13 22:23:38 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] 
info register aspect org.apache.pulsar.zookeeper.FinalRequestProcessorAspect
Jul 13 22:23:39 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] 
info register aspect org.apache.pulsar.zookeeper.ZooKeeperServerAspect
Jul 13 22:23:39 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] 
info register aspect org.apache.pulsar.broker.zookeeper.aspectj.ClientCnxnAspect
Jul 13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info 
AspectJ Weaver Version 1.9.2 built on Wednesday Oct 24, 2018 at 15:43:33 GMT
Jul 13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info 
register classloader sun.reflect.misc.MethodUtil@39cf86be
Jul 13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info 
using configuration 
file:/nix/store/zj60lld9z5yp0s5qas46sffc48wm2c2i-pulsar-2.6.0/lib/org.apache.pulsar-pulsar-zookeeper-2.6.0.jar!/META-INF/aop.xml
Jul 13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info 
using configuration 
file:/nix/store/zj60lld9z5yp0s5qas46sffc48wm2c2i-pulsar-2.6.0/lib/org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar!/META-INF/aop.xml
Jul 13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info 
register aspect org.apache.pulsar.zookeeper.FinalRequestProcessorAspect
Jul 13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info 
register aspect org.apache.pulsar.zookeeper.ZooKeeperServerAspect
Jul 13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info 
register aspect org.apache.pulsar.broker.zookeeper.aspectj.ClientCnxnAspect```
----
2020-07-13 21:01:58 UTC - Matt Mitchell: @Matteo Merli I’ll see if I can share 
something later this week
----
2020-07-13 21:04:05 UTC - Matteo Merli: :+1:
----
2020-07-13 21:09:32 UTC - Matt Mitchell: I’m experiencing an issue related to 
the java PulsarAdmin client, where sometimes just calling 
`client.topics().getList(tenantAndNamespace)` fails with a timeout error. I 
originally was using the client to get the list of topics, which worked the 
first time, and then I attempted to delete a subscription, which timed out too. 
After I re-ran the code, the call to `topics().getList(…)` then times out. Has 
anyone experienced this before?
----
2020-07-13 21:10:49 UTC - Addison Higham: What version of Pulsar are you 
running?
----
2020-07-13 21:11:02 UTC - Matt Mitchell: This is 2.5.2
----
2020-07-13 21:11:30 UTC - Addison Higham: Also, can you try doing a 
`pulsar-admin namespaces unload <tenant>/<namespace>` and see if 
that fixes it?
----
2020-07-13 21:11:51 UTC - Matt Mitchell: sure, will do
----
2020-07-13 21:23:34 UTC - Matt Mitchell: I got this:
```root@pulsar:/pulsar# ./bin/pulsar-admin namespaces unload fusion/_system
null

Reason: HTTP 500 Internal Server Error```
----
2020-07-13 21:24:18 UTC - Matt Mitchell: I’m running Pulsar in docker fwiw
----
2020-07-13 21:26:33 UTC - Matt Mitchell: and from the Pulsar logs:

----
2020-07-13 21:26:33 UTC - Matt Mitchell: 
```fusion_pulsar.1.5untcal3sdns@docker-desktop    | 21:22:51.988 
[AsyncHttpClient-timer-87-1] WARN  
org.apache.pulsar.client.admin.internal.BaseResource - 
[<http://pulsar:8080/admin/v2/namespaces/fusion/_system/0xc0000000_0xffffffff/unload>]
 Failed to perform http put request: java.util.concurrent.TimeoutException: 
Read timeout to pulsar/10.0.4.18:8080 after 30000 ms
fusion_pulsar.1.5untcal3sdns@docker-desktop    | 21:22:51.991 
[AsyncHttpClient-timer-87-1] ERROR 
org.apache.pulsar.broker.admin.impl.NamespacesBase - [null] Failed to unload 
namespace fusion/_system
fusion_pulsar.1.5untcal3sdns@docker-desktop    | 
java.util.concurrent.CompletionException: 
org.apache.pulsar.client.admin.PulsarAdminException: 
java.util.concurrent.TimeoutException: Read timeout to pulsar/10.0.4.18:8080 
after 30000 ms
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
 ~[?:1.8.0_232]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
 ~[?:1.8.0_232]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
java.util.concurrent.CompletableFuture.biRelay(CompletableFuture.java:1300) 
~[?:1.8.0_232]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
java.util.concurrent.CompletableFuture$BiRelay.tryFire(CompletableFuture.java:1284)
 ~[?:1.8.0_232]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
java.util.concurrent.CompletableFuture$CoCompletion.tryFire(CompletableFuture.java:1034)
 ~[?:1.8.0_232]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) 
~[?:1.8.0_232]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
 ~[?:1.8.0_232]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.apache.pulsar.client.admin.internal.BaseResource$1.failed(BaseResource.java:130)
 ~[org.apache.pulsar-pulsar-client-admin-original-2.4.2.jar:2.4.2]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.glassfish.jersey.client.JerseyInvocation$4.failed(JerseyInvocation.java:1030)
 ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.glassfish.jersey.client.ClientRuntime.processFailure(ClientRuntime.java:231)
 ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.glassfish.jersey.client.ClientRuntime.access$100(ClientRuntime.java:85) 
~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.glassfish.jersey.client.ClientRuntime$2.lambda$failure$1(ClientRuntime.java:183)
 ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.glassfish.jersey.internal.Errors$1.call(Errors.java:272) 
[org.glassfish.jersey.core-jersey-common-2.27.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.glassfish.jersey.internal.Errors$1.call(Errors.java:268) 
[org.glassfish.jersey.core-jersey-common-2.27.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.glassfish.jersey.internal.Errors.process(Errors.java:316) 
[org.glassfish.jersey.core-jersey-common-2.27.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.glassfish.jersey.internal.Errors.process(Errors.java:298) 
[org.glassfish.jersey.core-jersey-common-2.27.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.glassfish.jersey.internal.Errors.process(Errors.java:268) 
[org.glassfish.jersey.core-jersey-common-2.27.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:312)
 [org.glassfish.jersey.core-jersey-common-2.27.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.glassfish.jersey.client.ClientRuntime$2.failure(ClientRuntime.java:183) 
[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$3.onThrowable(AsyncHttpConnector.java:250)
 [org.apache.pulsar-pulsar-client-admin-original-2.4.2.jar:2.4.2]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.asynchttpclient.netty.NettyResponseFuture.abort(NettyResponseFuture.java:277)
 [org.asynchttpclient-async-http-client-2.7.0.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.asynchttpclient.netty.request.NettyRequestSender.abort(NettyRequestSender.java:473)
 [org.asynchttpclient-async-http-client-2.7.0.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.asynchttpclient.netty.timeout.TimeoutTimerTask.expire(TimeoutTimerTask.java:43)
 [org.asynchttpclient-async-http-client-2.7.0.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.asynchttpclient.netty.timeout.ReadTimeoutTimerTask.run(ReadTimeoutTimerTask.java:56)
 [org.asynchttpclient-async-http-client-2.7.0.jar:?]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:682)
 [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:757)
 [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:485) 
[io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
fusion_pulsar.1.5untcal3sdns@docker-desktop    | Caused by: 
org.apache.pulsar.client.admin.PulsarAdminException: 
java.util.concurrent.TimeoutException: Read timeout to pulsar/10.0.4.18:8080 
after 30000 ms
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        at 
org.apache.pulsar.client.admin.internal.BaseResource.getApiException(BaseResource.java:228)
 ~[org.apache.pulsar-pulsar-client-admin-original-2.4.2.jar:2.4.2]
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        ... 22 more
fusion_pulsar.1.5untcal3sdns@docker-desktop    | Caused by: 
java.util.concurrent.TimeoutException: Read timeout to pulsar/10.0.4.18:8080 
after 30000 ms
fusion_pulsar.1.5untcal3sdns@docker-desktop    |        ... 7 more```
----
2020-07-13 21:28:07 UTC - Matt Mitchell: But it looks like it was at least 
unloading, because my client (which was connected when I executed `unload`) 
logged this:
```Caused by: org.apache.pulsar.client.api.PulsarClientException: 
java.util.concurrent.CompletionException: 
org.apache.pulsar.client.api.PulsarClientException$LookupException: 
java.lang.IllegalStateException: Namespace bundle 
fusion/_system/0xc0000000_0xffffffff is being unloaded```
----
2020-07-13 21:34:10 UTC - Addison Higham: Is this Pulsar running as standalone 
or as a full cluster? If you only have a single broker, I would be surprised by 
the second part...

How many topics do you have in this namespace? Namespaces get split into 
"bundles" of topics, and those bundles are what gets serviced by a broker. 
Because of that, certain calls sometimes need to talk to multiple brokers. The 
`listTopics` call is one of those: if one of your brokers is down or having 
issues, it can cause problems with `listTopics`. Same with that offload call 
you did
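
To make the bundle idea concrete, here is a rough Python sketch (not Pulsar's actual code) of how a 32-bit hash range can be cut into equal bundles and a topic mapped to one of them. It assumes four bundles and uses `zlib.crc32` purely as a stand-in hash; the function names and the example topic are made up for illustration. With four bundles, the last range comes out as `0xc0000000_0xffffffff`, which matches the bundle named in the error above.

```python
import zlib

FULL_RANGE_END = 0xFFFFFFFF  # upper end of the 32-bit hash space


def bundle_boundaries(num_bundles):
    """Split the 32-bit hash space into num_bundles equal ranges."""
    step = (FULL_RANGE_END + 1) // num_bundles
    bounds = [i * step for i in range(num_bundles)]
    bounds.append(FULL_RANGE_END)
    return bounds


def bundle_names(bounds):
    """Format consecutive boundaries as '0x<lower>_0x<upper>' names."""
    return ["0x%08x_0x%08x" % (lo, hi) for lo, hi in zip(bounds, bounds[1:])]


def bundle_for_topic(topic, bounds):
    """Return the name of the bundle whose range contains the topic's hash.

    zlib.crc32 is only a placeholder here; the hash a real broker uses
    may differ, so the specific assignment is illustrative.
    """
    h = zlib.crc32(topic.encode("utf-8")) & 0xFFFFFFFF
    names = bundle_names(bounds)
    for i, (lo, hi) in enumerate(zip(bounds, bounds[1:])):
        is_last = i == len(names) - 1
        # Ranges are half-open, except the last one, which includes its end.
        if lo <= h < hi or (is_last and h == hi):
            return names[i]
    raise ValueError("hash outside the 32-bit range")
```

Each bundle is owned by one broker at a time, so a call that has to cover every topic in the namespace ends up touching every bundle's owner.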
----
2020-07-13 21:37:53 UTC - Matt Mitchell: ok, good to know
----
2020-07-13 21:37:55 UTC - Matt Mitchell: This is running Pulsar in standalone 
mode
----
2020-07-13 21:38:38 UTC - Matt Mitchell: This is the compose file I’m using:
```services:
  pulsar:
    image: apachepulsar/pulsar:2.4.2
    hostname: pulsar
    #    volumes:
    #      - ${PWD}/data:/pulsar/data
    environment:
      PULSAR_MEM: " -Xms512m -Xmx512m -XX:MaxDirectMemorySize=1g"
    command: >
      /bin/bash -c
      "bin/apply-config-from-env.py conf/standalone.conf
      && bin/pulsar standalone"
    ports:
      - "6650:6650"
      - "8080:8080"
    #    restart: always
    networks:
      - default```
----
2020-07-13 21:38:56 UTC - Matt Mitchell: uh oh, that’s 2.4.2 :neutral_face:
----
2020-07-13 21:41:56 UTC - Matt Mitchell: lemme dbl check that and make sure i’m 
starting up 2.5.2
----
2020-07-13 22:20:37 UTC - Addison Higham: does a restart fix it?
----
2020-07-14 00:15:58 UTC - Rounak Jaggi: @Sijie Guo I need a little help with 
configuring bookie TLS. I followed the bookie TLS documentation, created the 
truststore and keystore, and configured those 7 parameters as per the 
documentation, but when I run the openssl command to test TLS on the bookie 
port I still don't get any certs. Am I missing anything?
----
2020-07-14 00:16:36 UTC - Hiroyuki Yamada: @Penghui Li Can you answer my 
question when you get a chance ?
I want to dig into it deeper as well.
<https://github.com/apache/pulsar/issues/7455#issuecomment-654763271>
----
2020-07-14 03:59:14 UTC - Penghui Li: ok
----
2020-07-14 05:02:06 UTC - Rahul Vashishth: >  set a retention policy based 
on size or time that will retain messages regardless of subscription
Retention policies apply to acked messages, while backlogQuota and TTL work on 
unacked messages. I guess we need to set TTL and backlogQuota to retain 
messages on a topic without subscriptions.
----
2020-07-14 05:13:56 UTC - Zhenhao Li: on the nodes where brokers are failing, 
the bookies are running fine, so it can't be a zookeeper connection issue
----
2020-07-14 05:40:05 UTC - Zhenhao Li: figured out why. I forgot to put 2181 in 
the open port list for the machines running the bundled zookeeper in my script. 
On the working nodes, broker can still reach zookeeper from 127.0.0.0; on the 
failing ones, bookie worked because they were talking to my own zookeeper 
cluster which still contains previous wrong configuration.
now all brokers are running, but all bookies are failing.
at least I know what to do.
I will study how Pulsar uses zookeeper and add a cleanup module to my 
deployment script
----
2020-07-14 06:57:20 UTC - zsh0139: @zsh0139 has joined the channel
----
2020-07-14 07:54:31 UTC - Hiroyuki Yamada: Hi, I have a question about Bookie 
(auto) recovery.
When a bookie node fails, does auto recovery try to recover all the ledger 
data that the failed node had?
----
2020-07-14 07:56:56 UTC - Meyappan Ramasamy: this is my pulsar docker 
configuration
```pulsar:
      image: apachepulsar/pulsar:2.5.0
      ports:
        - '8080:8080'
        - '6650:6650'
      expose:
        - 8080
        - 6650
      environment:
         - PULSAR_MEM=" -Xms512m -Xmx512m -XX:MaxDirectMemorySize=1g"
      command:  >
         /bin/bash -c
         "bin/apply-config-from-env.py conf/standalone.conf
         && bin/pulsar standalone"```
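
One possible first troubleshooting step for a setup like this is to confirm that the two published ports (6650 for the binary protocol, 8080 for HTTP) accept TCP connections from the host at all. The Python sketch below does only that; the helper names are made up, and a successful connect shows the port mapping works, not that the broker inside the container is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

For the compose file above, `check_ports("localhost", [6650, 8080])` would be the call to try from the machine running the Java client.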
----
2020-07-14 08:13:39 UTC - Meyappan Ramasamy: followed the example from here : 
<https://github.com/apache/pulsar/blob/master/docker-compose/standalone-dashboard/docker-compose.yml>
----
