2020-07-13 11:19:45 UTC - Rahul Vashishth: @Ali Ahmed > Since a topic can have multiple backlogs, Pulsar applies the limit to the largest subscription backlog for the topic (that is from the slowest consumer). As you mentioned, a topic can have multiple backlogs (one per subscription). Does each backlog keep its own copy of the messages, or does the broker only maintain a cursor for each subscription in the message backlog? I am trying to understand how a message backlog differs from a topic backlog. ---- 2020-07-13 11:31:06 UTC - Ali Ahmed: It maintains cursors. +1 : Rahul Vashishth ---- 2020-07-13 13:08:27 UTC - wuYin: I sent a PR <https://github.com/apache/pulsar-helm-chart/pull/38> to implement this. Thanks for reviewing. ---- 2020-07-13 14:21:49 UTC - Ebere Abanonu: @Sijie Guo @Penghui Li @Matteo Merli please, I need to understand this: if I create a consumer before producing messages, I get all messages starting from entryId 0. But if I create a consumer with start position earliest after the producer is created, I get only the first message, with entryId 0. (If ten messages already exist, I get the first one but not the rest; when new messages are published, the broker sends the consumer messages from the 11th onward.) Why is this? The same thing happens with unacked message redelivery: if I tell the broker to redeliver messages from entryId 0 to entryId 10, the broker sends only the message at entryId 0. ---- 2020-07-13 15:00:57 UTC - Meyappan Ramasamy: hi team, I am trying to connect the Pulsar Java client to Pulsar running in a Docker container using the URL <pulsar://localhost:6650>, but I am getting the exception below. Please let me know how to troubleshoot this issue: ```Connection handshake failed: org.apache.pulsar.client.api.PulsarClientException: Connection already closed```
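---- Note: on the backlog question above, the answer "it maintains cursors" can be pictured with a toy model. This is a minimal sketch, not Pulsar's implementation: the class `Topic` and its methods are hypothetical names, and a real cursor also tracks individually acknowledged messages, which is left out here. The point it shows is that messages are stored once and each subscription's backlog is just the distance between its cursor and the end of the log.

```python
class Topic:
    """Toy model: one shared message log, one cursor per subscription."""

    def __init__(self):
        self.log = []        # messages are stored once, not per subscription
        self.cursors = {}    # subscription name -> index of next unacked message

    def subscribe(self, name):
        # In this sketch a new subscription starts at the current end of the log.
        self.cursors[name] = len(self.log)

    def publish(self, message):
        self.log.append(message)

    def ack(self, name, count=1):
        # Acknowledging only moves the cursor; the log itself is untouched.
        self.cursors[name] = min(self.cursors[name] + count, len(self.log))

    def backlog(self, name):
        return len(self.log) - self.cursors[name]

    def largest_backlog(self):
        # A quota applied to "the largest subscription backlog" looks at
        # the slowest subscription.
        return max((self.backlog(n) for n in self.cursors), default=0)


topic = Topic()
topic.subscribe("fast")
topic.subscribe("slow")
for i in range(10):
    topic.publish(f"msg-{i}")
topic.ack("fast", 8)
topic.ack("slow", 2)
```

Here `topic.backlog("fast")` is 2 and `topic.backlog("slow")` is 8, while `len(topic.log)` stays at 10: two subscriptions, one copy of the data.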
---- 2020-07-13 15:27:14 UTC - Rahul Vashishth: I am seeing different monitoring data from different sources for the same namespace/topics. I have installed the helm chart and am testing topics on the Pulsar cluster. But when I look at topic stats in pulsar-manager, the Grafana dashboards, and the admin topic stats API, all three report different topic counts. Has anyone faced the same issue? ---- 2020-07-13 15:31:20 UTC - Rahul Vashishth: I am confused as to which data to trust the most. ---- 2020-07-13 15:46:19 UTC - Asaf Mesika: <https://twitter.com/benstopford/status/1282683695653105666?s=21> ---- 2020-07-13 15:46:56 UTC - Asaf Mesika: I’ll reply, but if a committer can chip in it will be better ---- 2020-07-13 15:49:41 UTC - Viktor: Hello. I observe a large throughput drop (3x) with `journalSyncData=false` vs. the default of `journalSyncData=true` on BookKeeper. Is this expected? It is counterintuitive given what the BookKeeper docs say: `Beware - when disabling data sync in the bookie journal might improve the bookie write performance,` ---- 2020-07-13 16:16:32 UTC - Addison Higham: @VanderChen `PULSAR_MEM` is the setting for broker memory, did you try adjusting `BOOKIE_MEM`? ---- 2020-07-13 16:23:37 UTC - Addison Higham: I think what is being discussed by "streaming pull" is that you give the broker a message that indicates the number of messages you will accept (permits). 
If it has a backlog, it will immediately respond with up to as many messages as you asked for, but if there are no messages currently, it will send you any messages as soon as it gets them (as long as they still fit within the allowed permits). As for the Pulsar client itself, it is true that it has an internal buffer that is filled by a background thread, but it is recommended that you use the async API for high performance, where it does minimal blocking ---- 2020-07-13 16:27:22 UTC - Asaf Mesika: So is it similar to the Kafka client, which sends a fetch request bounded by a configurable upper limit, and if the broker has nothing it doesn’t answer until it has messages, and then it starts streaming the response? In Kafka you can’t do async; it’s blocking only, as far as I know ---- 2020-07-13 16:27:51 UTC - Addison Higham: @Zhenhao Li When you first add a bookkeeper node, it registers its name in zookeeper along with a generated ID (called the cookie). This cookie also gets stored in your data directories. If you lose your data directories but register back with zookeeper under the same name, that is an error state. Is it possible you started your bookie node and then cleared out the directory mentioned in the error log? To clear this issue you can use this CLI command: <https://bookkeeper.apache.org/docs/4.5.1/reference/cli/#bookkeeper-shell-bookieformat> ---- 2020-07-13 16:31:37 UTC - Addison Higham: :thinking_face: interesting, do you have some more details on your test setup? ---- 2020-07-13 16:47:55 UTC - Viktor: I am running the OpenMessaging benchmarks. I upgraded the setup to 2.6 and just ran with that one flag changed. Interestingly, I did notice on the BookKeeper graphs that it was syncing a lot less with `journalSyncData=false` ---- 2020-07-13 16:56:46 UTC - Addison Higham: Hrm... This conversation might be most effective as an issue on the bookkeeper project. If you have a minute to open an issue there, that would really help. 
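---- Note: the permit-based flow described in the "streaming pull" discussion above can be sketched as a small simulation. This is a toy model under stated assumptions, not the broker's dispatcher: `Broker`, `flow`, and `publish` are hypothetical names, and there is a single consumer whose receive queue is represented by the `delivered` list.

```python
from collections import deque


class Broker:
    """Toy dispatcher: sends messages only while the consumer has permits."""

    def __init__(self):
        self.backlog = deque()
        self.permits = 0
        self.delivered = []   # stands in for the consumer's receive queue

    def flow(self, permits):
        # The consumer grants permits; any backlog is drained right away.
        self.permits += permits
        self._dispatch()

    def publish(self, message):
        # New messages are pushed immediately if permits remain.
        self.backlog.append(message)
        self._dispatch()

    def _dispatch(self):
        while self.permits > 0 and self.backlog:
            self.delivered.append(self.backlog.popleft())
            self.permits -= 1


broker = Broker()
for i in range(3):
    broker.publish(i)      # no permits yet, so these wait in the backlog
broker.flow(5)             # the backlog of 3 is delivered at once, 2 permits left
broker.publish(3)
broker.publish(4)          # uses the last permit
broker.publish(5)          # no permits left, so this stays in the backlog
```

After this run, `broker.delivered` holds messages 0 through 4 and message 5 waits until the consumer grants more permits, which is the difference from a plain request/response fetch.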
---- 2020-07-13 17:26:19 UTC - Sijie Guo: Because by default the subscription initial position is latest. You can change your consumer to use `subscriptionInitialPosition(SubscriptionInitialPosition.Earliest)` ---- 2020-07-13 17:26:46 UTC - Sijie Guo: Did you expose port 6650 outside of the docker container? ---- 2020-07-13 17:33:07 UTC - Ebere Abanonu: Already done that, but I only get the first message in the ledger and no more until a fresh message is produced ---- 2020-07-13 17:34:33 UTC - Sijie Guo: I added a few notes white_check_mark : Asaf Mesika +1 : Julius S muscle : Dan Melman ---- 2020-07-13 17:36:15 UTC - Sijie Guo: @Viktor what disks are you using? ---- 2020-07-13 17:38:02 UTC - Sijie Guo: Hmm. That sounds like a bug. Can you create an issue with your code sample? ---- 2020-07-13 17:41:08 UTC - Ebere Abanonu: Bug at the client or broker level? I am running the broker in standalone mode ---- 2020-07-13 17:41:35 UTC - Ebere Abanonu: I am testing our own client implementation ---- 2020-07-13 17:43:21 UTC - Ebere Abanonu: Same with unacked redelivery. It worked once, until I was forced to refresh the docker image because the broker was failing to start ---- 2020-07-13 17:45:04 UTC - Zhenhao Li: @Addison Higham thank you! I didn't touch the directory at all. I deployed via some scripts and I can confirm they only create the directory the first time. I have two questions. 1. is it possible to let the user set the "cookie" instead of using a generated one? 2. since the cookie is stored in ZK, why does the Pulsar bookie need to store it locally? ---- 2020-07-13 18:46:04 UTC - Addison Higham: @Zhenhao Li If you want to share your startup scripts, that might be helpful. I can't say how you got into that state, but hopefully the `bookieformat` helps fix it. Did you see this guide? <https://bookkeeper.apache.org/docs/4.10.0/deployment/manual/>? It is a bit out of date as the better command to run is `initnewcluster` 1. 
yes you can, but I am not sure of all the implications, suggest you look at the bookkeeper CLI `bookkeeper shell cookie_create` 2. This is just part of the mechanism to ensure that bookkeeper is in a valid state on boot and also to ensure the bookie disks are as expected. ---- 2020-07-13 19:46:39 UTC - Zhenhao Li: we use Nix and NixOps to deploy Pulsar to NixOS machines. I'm sure it is not the usual way in the Pulsar community ---- 2020-07-13 19:50:13 UTC - Zhenhao Li: I forgot to say I was using an existing zookeeper cluster ---- 2020-07-13 19:51:17 UTC - Zhenhao Li: I just tried to deploy with the bundled zookeeper in Pulsar. now it runs on 2 nodes but fails on the node where the single zookeeper node is running ---- 2020-07-13 19:52:13 UTC - Zhenhao Li: the error is different now: ---- 2020-07-13 19:52:13 UTC - Zhenhao Li: ```Jul 13 21:48:32 server1 systemd[1]: Started Pulsar's Bookkeeper Daemon. Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: 21:48:39.082 [main] ERROR org.apache.bookkeeper.server.Main - Failed to build bookie server Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: org.apache.bookkeeper.bookie.BookieException$InvalidCookieException: Cookie [4 Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: bookieHost: "192.168.1.201:3181" Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: journalDir: "/var/lib/pulsar-bookie/journal" Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: ledgerDirs: "1\t/var/lib/pulsar-bookie/ledger" Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: instanceId: "1d0829c4-8d69-4457-a684-412767fc4b00" Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: ] is not matching with [4 Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: bookieHost: "192.168.1.201:3181; 192.168.1.202:3181; 192.168.1.203:3181:3181" Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: journalDir: "/var/lib/pulsar-bookie/journal" Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: ledgerDirs: "1\t/var/lib/pulsar-bookie/ledger" Jul 13 21:48:39 server1 
pulsar-bookie-start[17802]: instanceId: "f1c20d1f-7c71-4c03-8ef1-dab70aecbf17" Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: ] Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: at org.apache.bookkeeper.bookie.Cookie.verifyInternal(Cookie.java:136) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0] Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: at org.apache.bookkeeper.bookie.Cookie.verify(Cookie.java:147) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0] Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: at org.apache.bookkeeper.bookie.Bookie.verifyAndGetMissingDirs(Bookie.java:369) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0] Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: at org.apache.bookkeeper.bookie.Bookie.checkEnvironmentWithStorageExpansion(Bookie.java:432) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0] Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: at org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:250) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0] Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:688) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0] Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: at org.apache.bookkeeper.proto.BookieServer.newBookie(BookieServer.java:136) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0] Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:105) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0] Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: at org.apache.bookkeeper.server.service.BookieService.<init>(BookieService.java:41) ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0] Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: at org.apache.bookkeeper.server.Main.buildBookieServer(Main.java:301) 
~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0] Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: at org.apache.bookkeeper.server.Main.doMain(Main.java:221) [org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0] Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: at org.apache.bookkeeper.server.Main.main(Main.java:203) [org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0] Jul 13 21:48:39 server1 pulsar-bookie-start[17802]: at org.apache.bookkeeper.proto.BookieServer.main(BookieServer.java:313) [org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0] Jul 13 21:48:39 server1 systemd[1]: pulsar-bookie.service: Main process exited, code=exited, status=2/INVALIDARGUMENT Jul 13 21:48:39 server1 systemd[1]: pulsar-bookie.service: Failed with result 'exit-code'. Jul 13 21:48:39 server1 systemd[1]: pulsar-bookie.service: Consumed 11.457s CPU time, received 7.4K IP traffic, sent 5.8K IP traffic.``` ---- 2020-07-13 19:53:20 UTC - Zhenhao Li: the cause seems to be that `bookieHost` is inconsistent between zookeeper and local bookie ---- 2020-07-13 19:55:34 UTC - Zhenhao Li: @Addison Higham I am not doing the manual fix yet because I want to make sure our deployment file work correctly. we don't want it happen again when adding new nodes ---- 2020-07-13 19:56:40 UTC - Addison Higham: you may try looking at raw records in zookeeper ---- 2020-07-13 19:57:06 UTC - Addison Higham: or I assume that is what you did already? but yes, there are options to configure how bookie nodes get their hostname ---- 2020-07-13 19:59:59 UTC - Alan Broddle: We actually thought we had this working, and are finding that the TLS security is not actually working. Short version… No we don’t think we have this figured out. We started looking at it again yesterday and are not seeing where the issue is. When we run a tcpdump, we can see the data between a Broker and BookKeeper. 
We think it is something with the cert or `PULSAR_EXTRA_OPTS`. We are NOT seeing a list of the supported extra opts to verify we have the correct information. PULSAR_EXTRA_OPTS=" -Dpulsar.allocator.exit_on_oom=true -Dio.netty.recycler.maxCapacity.default=1000 -Dio.netty.recycler.linkCapacity=1024 -Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty -Dzookeeper.client.secure=true -Dzookeeper.ssl.hostnameVerification=false -Dzookeeper.ssl.keyStore.location=/usr/pulsar/certs/zookeeper.eaipulsarcloudnaengcluster1.pem -Dzookeeper.ssl.trustStore.location=/usr/pulsar/certs/ca.cert.pem" ---- 2020-07-13 20:01:22 UTC - Alan Broddle: Update: Internal bookie communication between bookie servers seems to be working and is encrypted. Bookie to broker is not! ---- 2020-07-13 20:15:17 UTC - Zhenhao Li: thanks for your help! I figured out the cause. I made a mistake on the first run by using the same advertisedAddress for each bookie, and this mistake turned out to be very sticky, in the sense that re-running with correct advertisedAddress values won't fix it. ---- 2020-07-13 20:17:42 UTC - Zhenhao Li: I need to put the following on my TODO list. 1. add an optional cleanup phase to my deployment script. 2. see if there is a better approach to fix it inside the Pulsar project ---- 2020-07-13 20:21:20 UTC - Viktor: ok. I will open an issue on bookkeeper. I am using SSDs, the default in OMB ---- 2020-07-13 20:33:05 UTC - Zhenhao Li: Hi, I have some issues with Pulsar brokers in my deployment. 2 nodes are running and 2 nodes are failing with the following error ---- 2020-07-13 20:33:05 UTC - Zhenhao Li: ```Jul 13 22:12:36 server2 systemd[1]: Started Pulsar's Broker Daemon. 
Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] info AspectJ Weaver Version 1.9.2 built on Wednesday Oct 24, 2018 at 15:43:33 GMT Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] info register classloader sun.misc.Launcher$AppClassLoader@18b4aac2 Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] info using configuration file:/nix/store/zj60lld9z5yp0s5qas46sffc48wm2c2i-pulsar-2.6.0/lib/org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar!/META-INF/aop.xml Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] info using configuration file:/nix/store/zj60lld9z5yp0s5qas46sffc48wm2c2i-pulsar-2.6.0/lib/org.apache.pulsar-pulsar-zookeeper-2.6.0.jar!/META-INF/aop.xml Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] info register aspect org.apache.pulsar.broker.zookeeper.aspectj.ClientCnxnAspect Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] info register aspect org.apache.pulsar.zookeeper.FinalRequestProcessorAspect Jul 13 22:12:37 server2 pulsar-broker-start[26492]: [AppClassLoader@18b4aac2] info register aspect org.apache.pulsar.zookeeper.ZooKeeperServerAspect Jul 13 22:13:09 server2 pulsar-broker-start[26492]: 22:13:09.274 [main] ERROR org.apache.pulsar.broker.PulsarService - Failed to establish session with local ZK Jul 13 22:13:09 server2 pulsar-broker-start[26492]: java.io.IOException: Failed to establish session with local ZK Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at org.apache.pulsar.zookeeper.LocalZooKeeperConnectionService.start(LocalZooKeeperConnectionService.java:74) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar:2.6.0] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at org.apache.pulsar.broker.PulsarService.start(PulsarService.java:438) [org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at 
org.apache.pulsar.PulsarBrokerStarter$BrokerStarter.start(PulsarBrokerStarter.java:280) [org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at org.apache.pulsar.PulsarBrokerStarter.main(PulsarBrokerStarter.java:349) [org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: Caused by: java.util.concurrent.TimeoutException Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) ~[?:1.8.0_242] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) ~[?:1.8.0_242] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at org.apache.pulsar.zookeeper.LocalZooKeeperConnectionService.start(LocalZooKeeperConnectionService.java:68) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar:2.6.0] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: ... 3 more Jul 13 22:13:09 server2 pulsar-broker-start[26492]: 22:13:09.286 [main] ERROR org.apache.pulsar.PulsarBrokerStarter - Failed to start pulsar service. 
Jul 13 22:13:09 server2 pulsar-broker-start[26492]: org.apache.pulsar.broker.PulsarServerException: java.io.IOException: Failed to establish session with local ZK Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at org.apache.pulsar.broker.PulsarService.start(PulsarService.java:587) ~[org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at org.apache.pulsar.PulsarBrokerStarter$BrokerStarter.start(PulsarBrokerStarter.java:280) ~[org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at org.apache.pulsar.PulsarBrokerStarter.main(PulsarBrokerStarter.java:349) [org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: Caused by: java.io.IOException: Failed to establish session with local ZK Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at org.apache.pulsar.zookeeper.LocalZooKeeperConnectionService.start(LocalZooKeeperConnectionService.java:74) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar:2.6.0] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at org.apache.pulsar.broker.PulsarService.start(PulsarService.java:438) ~[org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: ... 
2 more Jul 13 22:13:09 server2 pulsar-broker-start[26492]: Caused by: java.util.concurrent.TimeoutException Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) ~[?:1.8.0_242] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) ~[?:1.8.0_242] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at org.apache.pulsar.zookeeper.LocalZooKeeperConnectionService.start(LocalZooKeeperConnectionService.java:68) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar:2.6.0] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: at org.apache.pulsar.broker.PulsarService.start(PulsarService.java:438) ~[org.apache.pulsar-pulsar-broker-2.6.0.jar:2.6.0] Jul 13 22:13:09 server2 pulsar-broker-start[26492]: ... 2 more Jul 13 22:13:09 server2 systemd[1]: pulsar-broker.service: Main process exited, code=exited, status=1/FAILURE Jul 13 22:13:09 server2 systemd[1]: pulsar-broker.service: Failed with result 'exit-code'. Jul 13 22:13:09 server2 systemd[1]: pulsar-broker.service: Consumed 4.488s CPU time, received 0B IP traffic, sent 480B IP traffic.``` ---- 2020-07-13 20:33:05 UTC - Zhenhao Li: the whole deployment uses a single zookeeper node. ---- 2020-07-13 20:34:55 UTC - Zhenhao Li: logs on the working nodes: ```Jul 13 22:23:38 server1 systemd[1]: Started Pulsar's Broker Daemon. 
Jul 13 22:23:38 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] info AspectJ Weaver Version 1.9.2 built on Wednesday Oct 24, 2018 at 15:43:33 GMT Jul 13 22:23:38 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] info register classloader sun.misc.Launcher$AppClassLoader@18b4aac2 Jul 13 22:23:38 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] info using configuration file:/nix/store/zj60lld9z5yp0s5qas46sffc48wm2c2i-pulsar-2.6.0/lib/org.apache.pulsar-pulsar-zookeeper-2.6.0.jar!/META-INF/aop.xml Jul 13 22:23:38 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] info using configuration file:/nix/store/zj60lld9z5yp0s5qas46sffc48wm2c2i-pulsar-2.6.0/lib/org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar!/META-INF/aop.xml Jul 13 22:23:38 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] info register aspect org.apache.pulsar.zookeeper.FinalRequestProcessorAspect Jul 13 22:23:39 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] info register aspect org.apache.pulsar.zookeeper.ZooKeeperServerAspect Jul 13 22:23:39 server1 pulsar-broker-start[18818]: [AppClassLoader@18b4aac2] info register aspect org.apache.pulsar.broker.zookeeper.aspectj.ClientCnxnAspect Jul 13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info AspectJ Weaver Version 1.9.2 built on Wednesday Oct 24, 2018 at 15:43:33 GMT Jul 13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info register classloader sun.reflect.misc.MethodUtil@39cf86be Jul 13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info using configuration file:/nix/store/zj60lld9z5yp0s5qas46sffc48wm2c2i-pulsar-2.6.0/lib/org.apache.pulsar-pulsar-zookeeper-2.6.0.jar!/META-INF/aop.xml Jul 13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info using configuration file:/nix/store/zj60lld9z5yp0s5qas46sffc48wm2c2i-pulsar-2.6.0/lib/org.apache.pulsar-pulsar-zookeeper-utils-2.6.0.jar!/META-INF/aop.xml Jul 
13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info register aspect org.apache.pulsar.zookeeper.FinalRequestProcessorAspect Jul 13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info register aspect org.apache.pulsar.zookeeper.ZooKeeperServerAspect Jul 13 22:24:12 server1 pulsar-broker-start[18818]: [MethodUtil@39cf86be] info register aspect org.apache.pulsar.broker.zookeeper.aspectj.ClientCnxnAspect``` ---- 2020-07-13 21:01:58 UTC - Matt Mitchell: @Matteo Merli I’ll see if I can share something later this week ---- 2020-07-13 21:04:05 UTC - Matteo Merli: :+1: ---- 2020-07-13 21:09:32 UTC - Matt Mitchell: I’m experiencing an issue related to the Java PulsarAdmin client, where sometimes just calling `client.topics().getList(tenantAndNamespace)` fails with a timeout error. I was originally using the client to get the list of topics, which worked the first time, and then I attempted to delete a subscription, which timed out. After I re-ran the code, the call to `topics().getList(…)` timed out as well. Has anyone experienced this before? ---- 2020-07-13 21:10:49 UTC - Addison Higham: What version of Pulsar are you running? ---- 2020-07-13 21:11:02 UTC - Matt Mitchell: This is 2.5.2 ---- 2020-07-13 21:11:30 UTC - Addison Higham: Also, can you try doing a `pulsar-admin namespaces unload <tenant>/<namespace>` and see if that fixes it? 
---- 2020-07-13 21:11:51 UTC - Matt Mitchell: sure, will do ---- 2020-07-13 21:23:34 UTC - Matt Mitchell: I got this: ```root@pulsar:/pulsar# ./bin/pulsar-admin namespaces unload fusion/_system null Reason: HTTP 500 Internal Server Error``` ---- 2020-07-13 21:24:18 UTC - Matt Mitchell: I’m running Pulsar in docker fwiw ---- 2020-07-13 21:26:33 UTC - Matt Mitchell: and from the Pulsar logs: ---- 2020-07-13 21:26:33 UTC - Matt Mitchell: ```fusion_pulsar.1.5untcal3sdns@docker-desktop | 21:22:51.988 [AsyncHttpClient-timer-87-1] WARN org.apache.pulsar.client.admin.internal.BaseResource - [<http://pulsar:8080/admin/v2/namespaces/fusion/_system/0xc0000000_0xffffffff/unload>] Failed to perform http put request: java.util.concurrent.TimeoutException: Read timeout to pulsar/10.0.4.18:8080 after 30000 ms fusion_pulsar.1.5untcal3sdns@docker-desktop | 21:22:51.991 [AsyncHttpClient-timer-87-1] ERROR org.apache.pulsar.broker.admin.impl.NamespacesBase - [null] Failed to unload namespace fusion/_system fusion_pulsar.1.5untcal3sdns@docker-desktop | java.util.concurrent.CompletionException: org.apache.pulsar.client.admin.PulsarAdminException: java.util.concurrent.TimeoutException: Read timeout to pulsar/10.0.4.18:8080 after 30000 ms fusion_pulsar.1.5untcal3sdns@docker-desktop | at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_232] fusion_pulsar.1.5untcal3sdns@docker-desktop | at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_232] fusion_pulsar.1.5untcal3sdns@docker-desktop | at java.util.concurrent.CompletableFuture.biRelay(CompletableFuture.java:1300) ~[?:1.8.0_232] fusion_pulsar.1.5untcal3sdns@docker-desktop | at java.util.concurrent.CompletableFuture$BiRelay.tryFire(CompletableFuture.java:1284) ~[?:1.8.0_232] fusion_pulsar.1.5untcal3sdns@docker-desktop | at java.util.concurrent.CompletableFuture$CoCompletion.tryFire(CompletableFuture.java:1034) ~[?:1.8.0_232] 
fusion_pulsar.1.5untcal3sdns@docker-desktop | at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_232] fusion_pulsar.1.5untcal3sdns@docker-desktop | at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) ~[?:1.8.0_232] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.apache.pulsar.client.admin.internal.BaseResource$1.failed(BaseResource.java:130) ~[org.apache.pulsar-pulsar-client-admin-original-2.4.2.jar:2.4.2] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.glassfish.jersey.client.JerseyInvocation$4.failed(JerseyInvocation.java:1030) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.glassfish.jersey.client.ClientRuntime.processFailure(ClientRuntime.java:231) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.glassfish.jersey.client.ClientRuntime.access$100(ClientRuntime.java:85) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.glassfish.jersey.client.ClientRuntime$2.lambda$failure$1(ClientRuntime.java:183) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.glassfish.jersey.internal.Errors$1.call(Errors.java:272) [org.glassfish.jersey.core-jersey-common-2.27.jar:?] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.glassfish.jersey.internal.Errors$1.call(Errors.java:268) [org.glassfish.jersey.core-jersey-common-2.27.jar:?] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.glassfish.jersey.internal.Errors.process(Errors.java:316) [org.glassfish.jersey.core-jersey-common-2.27.jar:?] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.glassfish.jersey.internal.Errors.process(Errors.java:298) [org.glassfish.jersey.core-jersey-common-2.27.jar:?] 
fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.glassfish.jersey.internal.Errors.process(Errors.java:268) [org.glassfish.jersey.core-jersey-common-2.27.jar:?] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:312) [org.glassfish.jersey.core-jersey-common-2.27.jar:?] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.glassfish.jersey.client.ClientRuntime$2.failure(ClientRuntime.java:183) [org.glassfish.jersey.core-jersey-client-2.27.jar:?] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$3.onThrowable(AsyncHttpConnector.java:250) [org.apache.pulsar-pulsar-client-admin-original-2.4.2.jar:2.4.2] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.asynchttpclient.netty.NettyResponseFuture.abort(NettyResponseFuture.java:277) [org.asynchttpclient-async-http-client-2.7.0.jar:?] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.asynchttpclient.netty.request.NettyRequestSender.abort(NettyRequestSender.java:473) [org.asynchttpclient-async-http-client-2.7.0.jar:?] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.asynchttpclient.netty.timeout.TimeoutTimerTask.expire(TimeoutTimerTask.java:43) [org.asynchttpclient-async-http-client-2.7.0.jar:?] fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.asynchttpclient.netty.timeout.ReadTimeoutTimerTask.run(ReadTimeoutTimerTask.java:56) [org.asynchttpclient-async-http-client-2.7.0.jar:?] 
fusion_pulsar.1.5untcal3sdns@docker-desktop | at io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:682) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final] fusion_pulsar.1.5untcal3sdns@docker-desktop | at io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:757) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final] fusion_pulsar.1.5untcal3sdns@docker-desktop | at io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:485) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final] fusion_pulsar.1.5untcal3sdns@docker-desktop | at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final] fusion_pulsar.1.5untcal3sdns@docker-desktop | at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232] fusion_pulsar.1.5untcal3sdns@docker-desktop | Caused by: org.apache.pulsar.client.admin.PulsarAdminException: java.util.concurrent.TimeoutException: Read timeout to pulsar/10.0.4.18:8080 after 30000 ms fusion_pulsar.1.5untcal3sdns@docker-desktop | at org.apache.pulsar.client.admin.internal.BaseResource.getApiException(BaseResource.java:228) ~[org.apache.pulsar-pulsar-client-admin-original-2.4.2.jar:2.4.2] fusion_pulsar.1.5untcal3sdns@docker-desktop | ... 22 more fusion_pulsar.1.5untcal3sdns@docker-desktop | Caused by: java.util.concurrent.TimeoutException: Read timeout to pulsar/10.0.4.18:8080 after 30000 ms fusion_pulsar.1.5untcal3sdns@docker-desktop | ... 
7 more``` ---- 2020-07-13 21:28:07 UTC - Matt Mitchell: But it looks like it was at least unloading, because my client (which was connected when I executed `unload`) logged this: ```Caused by: org.apache.pulsar.client.api.PulsarClientException: java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException$LookupException: java.lang.IllegalStateException: Namespace bundle fusion/_system/0xc0000000_0xffffffff is being unloaded``` ---- 2020-07-13 21:34:10 UTC - Addison Higham: Is this Pulsar running as standalone, or as a full cluster? If you only have a single broker, I would be surprised by the second part... how many topics do you have in this namespace? Namespaces get split into "bundles" of topics, and those bundles are what gets serviced by a broker. Because of that, certain calls sometimes need to talk to multiple brokers. The `listTopics` call is one of those: if one of your brokers is down or having issues, it can cause problems with `listTopics`. Same with that unload call you did ---- 2020-07-13 21:37:53 UTC - Matt Mitchell: ok, good to know ---- 2020-07-13 21:37:55 UTC - Matt Mitchell: This is running Pulsar in standalone mode ---- 2020-07-13 21:38:38 UTC - Matt Mitchell: This is the compose file I’m using:
```
services:
  pulsar:
    image: apachepulsar/pulsar:2.4.2
    hostname: pulsar
    # volumes:
    #   - ${PWD}/data:/pulsar/data
    environment:
      PULSAR_MEM: " -Xms512m -Xmx512m -XX:MaxDirectMemorySize=1g"
    command: >
      /bin/bash -c "bin/apply-config-from-env.py conf/standalone.conf && bin/pulsar standalone"
    ports:
      - "6650:6650"
      - "8080:8080"
    # restart: always
    networks:
      - default
```
---- 2020-07-13 21:38:56 UTC - Matt Mitchell: uh oh, that’s 2.4.2 :neutral_face: ---- 2020-07-13 21:41:56 UTC - Matt Mitchell: lemme dbl check that and make sure I’m starting up 2.5.2 ---- 2020-07-13 22:20:37 UTC - Addison Higham: does a restart fix it? ---- 2020-07-14 00:15:58 UTC - Rounak Jaggi: @Sijie Guo I need a little help with configuring bookie TLS. 
I followed the bookie TLS documentation, created the truststore and keystore, and configured those 7 parameters as per the documentation, but when I run an openssl command to test TLS on the bookie port I still do not get any certs. Am I missing anything? ---- 2020-07-14 00:16:36 UTC - Hiroyuki Yamada: @Penghui Li Can you answer my question when you get a chance? I want to dig deeper into it as well. <https://github.com/apache/pulsar/issues/7455#issuecomment-654763271> ---- 2020-07-14 03:59:14 UTC - Penghui Li: ok ---- 2020-07-14 05:02:06 UTC - Rahul Vashishth: > set a retention policy based on size or time that will retain messages regardless of subscription Retention policies apply to acked messages, while backlogQuota and TTL work on unacked messages. I guess we need to set TTL and backlogQuota to retain messages on a topic without subscriptions. ---- 2020-07-14 05:13:56 UTC - Zhenhao Li: on the nodes where brokers are failing, the bookies are running fine, so it can't be a zookeeper connection issue ---- 2020-07-14 05:40:05 UTC - Zhenhao Li: figured out why. I forgot to put 2181 in the open port list for the machines running the bundled zookeeper in my script. On the working nodes, the broker can still reach zookeeper via 127.0.0.1; on the failing ones, the bookies worked because they were talking to my own zookeeper cluster, which still contains the previous wrong configuration. Now all brokers are running, but all bookies are failing. At least I know what to do. I will study how Pulsar uses zookeeper and add a cleanup module to my deployment script ---- 2020-07-14 06:57:20 UTC - zsh0139: @zsh0139 has joined the channel ---- 2020-07-14 07:54:31 UTC - Hiroyuki Yamada: Hi, I have a question about bookie (auto) recovery. When a bookie node fails, does auto recovery try to recover all the ledger data that the failed node had? 
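---- Note: the `InvalidCookieException` earlier in this log prints two cookies that "are not matching", and the stale-configuration problem Zhenhao describes comes down to which fields differ. A minimal sketch of that comparison follows. It is not BookKeeper's `Cookie.verify` logic; the `Cookie` dataclass and `mismatched_fields` helper are hypothetical, and the field values are copied from the two cookies printed in the error above.

```python
from dataclasses import dataclass

FIELDS = ("bookie_host", "journal_dir", "ledger_dirs", "instance_id")


@dataclass(frozen=True)
class Cookie:
    bookie_host: str
    journal_dir: str
    ledger_dirs: str
    instance_id: str


def mismatched_fields(first, second):
    """Return the names of the fields that differ between two cookies."""
    return [name for name in FIELDS if getattr(first, name) != getattr(second, name)]


# Values as printed in the exception message (first cookie, then second).
first = Cookie(
    bookie_host="192.168.1.201:3181",
    journal_dir="/var/lib/pulsar-bookie/journal",
    ledger_dirs=r"1\t/var/lib/pulsar-bookie/ledger",
    instance_id="1d0829c4-8d69-4457-a684-412767fc4b00",
)
second = Cookie(
    bookie_host="192.168.1.201:3181; 192.168.1.202:3181; 192.168.1.203:3181:3181",
    journal_dir="/var/lib/pulsar-bookie/journal",
    ledger_dirs=r"1\t/var/lib/pulsar-bookie/ledger",
    instance_id="f1c20d1f-7c71-4c03-8ef1-dab70aecbf17",
)

differences = mismatched_fields(first, second)
```

For these two cookies the differing fields are `bookie_host` and `instance_id`: one cookie carries a host list instead of a single address, and the instance IDs come from different cluster initializations, which fits a stale record surviving a redeploy.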
---- 2020-07-14 07:56:56 UTC - Meyappan Ramasamy: this is my pulsar docker configuration pulsar: image: apachepulsar/pulsar:2.5.0 ports: - '8080:8080' - '6650:6650' expose: - 8080 - 6650 environment: - PULSAR_MEM=" -Xms512m -Xmx512m -XX:MaxDirectMemorySize=1g" command: > /bin/bash -c "bin/apply-config-from-env.py conf/standalone.conf && bin/pulsar standalone" ---- 2020-07-14 08:13:39 UTC - Meyappan Ramasamy: followed the example from here : <https://github.com/apache/pulsar/blob/master/docker-compose/standalone-dashboard/docker-compose.yml> ----
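Note: two of the problems in this log (the question about exposing port 6650 from the container, and the forgotten port 2181) are about whether a TCP port is reachable at all. A quick check along these lines can rule that out before looking at Pulsar itself. This is a generic sketch using only the Python standard library; `port_is_open` is a hypothetical helper, not part of any Pulsar tooling.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

For the compose file above, one would call `port_is_open("localhost", 6650)` and `port_is_open("localhost", 8080)` from the host. A `True` result only shows that something accepted the connection; it does not prove the Pulsar handshake will succeed, since a port-forwarding proxy can accept a connection and then close it.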
