2019-02-28 09:20:59 UTC - Enrico Olivelli: Hi guys, I am looking into <http://pulsar.apache.org/docs/en/deploy-bare-metal-multi-cluster/#service-discovery> but I can't understand how service discovery works for pulsar:// URIs. I see that for the HTTP URI I have to set up some kind of reverse proxy, HTTP redirector or DNS round robin, but what about the binary protocol? My case is that I have 3 Pulsar brokers. If you prefer I will send an email to the users ML
----
2019-02-28 09:21:46 UTC - Enrico Olivelli: Do I need to start the discovery service?
----
2019-02-28 09:22:12 UTC - Enrico Olivelli: Here it seems that it is not needed <http://pulsar.apache.org/docs/en/deploy-bare-metal/#service-discovery-setup>
----
2019-02-28 09:23:22 UTC - Enrico Olivelli: I am on a single datacenter
----
2019-02-28 09:24:32 UTC - Enrico Olivelli: I can set up a "redirect" for HTTP, but I cannot ask my customers to create special DNS entries, because in my product it is important that each machine/service can be identified by its own DNS name
----
2019-02-28 09:25:56 UTC - Enrico Olivelli: In my system I already have a ZK-based service discovery, and it knows that I have Pulsar brokers and, for each broker, the binary endpoint URI and the HTTP endpoint URI
----
2019-02-28 09:26:27 UTC - Ali Ahmed: @Enrico Olivelli I would just use the pulsar proxy service
----
2019-02-28 09:26:52 UTC - Enrico Olivelli: mmm what about implementing a custom ServiceUrlProvider?
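A custom provider of the kind proposed here would mostly be selection logic over the broker list that an existing discovery service reports. The following is a minimal Python sketch of that round-robin logic only, not Pulsar API code: the class name `RoundRobinBrokerSelector`, its methods, and the broker hostnames are hypothetical, and it assumes the discovery service delivers the full list of live `pulsar://` URLs on every change.

```python
import threading


class RoundRobinBrokerSelector:
    """Hands out broker URLs in round-robin order from the latest discovery snapshot."""

    def __init__(self, broker_urls):
        self._lock = threading.Lock()
        self._urls = []
        self._next = 0
        self.update(broker_urls)

    def update(self, broker_urls):
        """Replace the broker list, e.g. when discovery reports a change."""
        urls = sorted(set(broker_urls))  # de-duplicate and keep a stable order
        if not urls:
            raise ValueError("at least one broker URL is required")
        with self._lock:
            self._urls = urls
            self._next = 0

    def next_url(self):
        """Return the next broker URL, wrapping around at the end of the list."""
        with self._lock:
            url = self._urls[self._next % len(self._urls)]
            self._next += 1
            return url


if __name__ == "__main__":
    # Hypothetical hostnames; 6650 is the default Pulsar binary-protocol port.
    selector = RoundRobinBrokerSelector([
        "pulsar://broker1.example.com:6650",
        "pulsar://broker2.example.com:6650",
        "pulsar://broker3.example.com:6650",
    ])
    for _ in range(4):
        print(selector.next_url())
```

In the Java client, the value returned by `next_url()` corresponds to what a `ServiceUrlProvider` implementation would return from `getServiceUrl()`.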
----
2019-02-28 09:27:49 UTC - Enrico Olivelli: it will return the URI of one of the active brokers, with round robin
----
2019-02-28 09:28:36 UTC - Ali Ahmed: sure, but you will need a service to round-robin among alive brokers
----
2019-02-28 09:28:57 UTC - Ali Ahmed: that will be a fair amount of eng effort
----
2019-02-28 09:29:09 UTC - Enrico Olivelli: I already have a custom service discovery mechanism
----
2019-02-28 09:29:46 UTC - Ali Ahmed: it can work then
----
2019-02-28 09:31:10 UTC - Enrico Olivelli: Thank you @Ali Ahmed, I will give it a try
----
2019-02-28 10:16:24 UTC - bhagesharora: How can we produce and consume messages in parallel using commands in cluster mode?
----
2019-02-28 10:21:24 UTC - Slackbot: This message was deleted.
----
2019-02-28 10:30:04 UTC - Yuvaraj Loganathan: You can open two terminal windows, one to produce and one to consume.
----
2019-02-28 10:32:52 UTC - bhagesharora: Yeah, I tried that. Using the above command the message was successfully produced, but I want to do it in a dynamic way: I produce a message on one side and, on the other side in the consumer window, I consume the message dynamically
----
2019-02-28 10:34:45 UTC - Yuvaraj Loganathan: You can use any of the client libraries for Java, Python or Go.
----
2019-02-28 10:34:46 UTC - bhagesharora: Like we can do in Kafka
----
2019-02-28 10:36:19 UTC - Yuvaraj Loganathan: `pulsar-client produce -n 500 sentences --messages "hello pulsar"`
----
2019-02-28 10:36:26 UTC - Yuvaraj Loganathan: Will produce 500 messages
----
2019-02-28 10:36:38 UTC - Yuvaraj Loganathan: For more you can refer here <https://pulsar.apache.org/docs/latest/reference/CliTools/#pulsar-client>
----
2019-02-28 10:46:38 UTC - bhagesharora: @Yuvaraj Loganathan Using this command: `pulsar-client produce -n 500 test-topic --messages "hello pulsar"` I got the notification "500 messages successfully produced" at the end, but for the consumer, when I execute `pulsar-client consume test-topic --num-messages 0` I get: The following option is required: -s, --subscription-name
----
2019-02-28 11:22:39 UTC - Yuvaraj Loganathan: You need to provide a subscription name. Try this: `pulsar-client consume test-topic --num-messages 0 -s "test-sub"`
----
2019-02-28 11:40:05 UTC - bhagesharora: yeah, got it thanks :+1:
----
2019-02-28 11:52:07 UTC - Yuvaraj Loganathan: Is it possible to increase the number of TCP connections per broker for a single Python client? Right now we see only one TCP connection established per broker.
----
2019-02-28 11:54:09 UTC - Yuvaraj Loganathan: Does increasing io_threads in the client increase the number of TCP connections per broker?
----
2019-02-28 12:50:48 UTC - Marc Le Labourier: Ok, we got Grafana, Prometheus and the Pulsar cluster working together. @Matteo Merli are you interested in us making a pull request (or an issue with the steps we followed) with new information for the docs and the deployment on AWS? If we have the time, we could make one.
+1 : Sijie Guo
----
2019-02-28 14:33:42 UTC - Sijie Guo: It would be great if you guys can contribute :slightly_smiling_face:
----
2019-02-28 15:07:18 UTC - Matteo Merli: @Marc Le Labourier of course contributions are very welcome!
----
2019-02-28 15:10:49 UTC - Alexandre DUVAL: /!\ Hi there, the 2.2.1 documentation at <https://pulsar.apache.org/docs/en/2.2.1/standalone/> shows content for version 2.3.0. Also, where can I access the javadoc for version 2.2.1? Only 2.3.0 is available.
----
2019-02-28 15:17:27 UTC - Alexandre DUVAL: Or maybe the javadoc displayed on <https://pulsar.apache.org/api/client/> is not for v2.3.0; in that case, can you display the documentation version? :smile:
----
2019-02-28 15:57:58 UTC - m.makaveeva: @m.makaveeva has joined the channel
----
2019-02-28 16:12:10 UTC - Sijie Guo: it seems to be a bug in generating the links :disappointed:
----
2019-02-28 16:12:15 UTC - Sijie Guo: can you file a GitHub issue for it?
----
2019-02-28 16:52:33 UTC - Matteo Merli: @Enrico Olivelli The service discovery in Pulsar is typically implemented by exposing a single “serviceURL” or “hostname” to the clients.
There are multiple ways to achieve that:
* Any IP load balancer (VIP, ELB, ..)
* DNS CNAME mapping to list of IPs
* HTTP reverse proxy for `http://` serviceURL
* Dumb TCP proxy for `pulsar://` serviceURL

Additionally, since version 2.3, it’s possible (in Java client only, for now) to specify a list of broker hostnames instead of a single hostname. See <https://github.com/apache/pulsar/pull/3249>
----
2019-02-28 16:53:46 UTC - Matteo Merli: In any case, the list of running brokers is available in ZK (at `/loadbalance/brokers`) if you want to implement any custom discovery mechanism.
+1 : Enrico Olivelli
----
2019-02-28 17:01:04 UTC - Matteo Merli: @Yuvaraj Loganathan The number of connections per host in the connection pool is only configurable in the Java client. The reason for that was mostly to increase throughput over a large number of topics in the geo-replication context, where the RTT can be in the 100s of millis. I don’t think that in most cases having more TCP connections per broker will lead to any measurable throughput improvement otherwise. Also, it would only be beneficial when there are multiple topics, since a single producer/consumer is pegged to a single connection anyway. (In any case, it’s something very easy to add in the C++ client, e.g. keep a list of N pooled connections instead of just one.)

Finally, the “io_thread” setting controls the number of background threads that manage these connections to brokers. We’re using Boost Asio for networking, which basically means that there will be one epoll event loop thread for each “io_thread”. Default is 1. It could make sense to increase these threads if the CPU% looks high on that thread, otherwise just leave it at 1.
----
2019-02-28 18:00:48 UTC - eolivelli: @Matteo Merli thank you. Having multiple brokers is very much like Kafka (and I am actually migrating from Kafka). Today I am experimenting with a custom ServiceUrlProvider as it looks promising. I have my own discovery service and I can link with it.
Is there any way to force the client to disconnect and search again for a broker? (Like when I get a notification from the discovery service that the current server is going to shut down or is no longer available)
----
2019-02-28 18:02:06 UTC - Matteo Merli: Yes, you can call `PulsarClient.updateServiceUrl()` to trigger that <https://pulsar.apache.org/api/client/org/apache/pulsar/client/api/PulsarClient.html#updateServiceUrl-java.lang.String->
----
2019-02-28 18:02:48 UTC - eolivelli: Good. That's what I was supposing, but a double check is a good confirmation. Thanks
----
2019-02-28 18:04:46 UTC - eolivelli: Btw my first end-to-end benchmarks in my application result in 5x with the Pulsar default configuration (and my default bookie configuration, the bookies are shared with other subsystems of my application) against Kafka! :heart: looks promising!
man-surfing : Matteo Merli
heart : Ezequiel Lovelle, Sijie Guo
100 : Sijie Guo
----
2019-02-28 18:19:52 UTC - Grant Wu: Anyone have thoughts on this?
----
2019-02-28 18:20:12 UTC - Grant Wu: I don’t know what Debezium is and we don’t use it
----
2019-02-28 18:21:58 UTC - David Kjerrumgaard: Short answer is that the above is a benign error.
----
2019-02-28 18:22:28 UTC - David Kjerrumgaard: When we start the Pulsar platform, we also scan a pre-defined directory for our "built-in" connectors.
----
2019-02-28 18:22:53 UTC - Ryan Samo: So I tried a few more tactics for getting the bookie racks to take hold, like deleting /bookies out of ZooKeeper using the zk shell and then re-adding the racks. It all looks fine in ZooKeeper but the broker logs always show

/default-rack/bookie1:3181
/default-rack/bookie2:3181
/default-rack/bookie3:3181
/default-rack/bookie4:3181
/default-rack/bookie5:3181
/default-rack/bookie6:3181

no matter what I try. Since I am stuck with only 1 default-rack, what is the potential for data loss on the bookies if I have...
6 bookie nodes
Ensemble 3
Write quorum 2
Read quorum 2

I will log a bug per @David Kjerrumgaard's request shortly. I just need to consider how many nodes I can lose before data loss.
+1 : David Kjerrumgaard
----
2019-02-28 18:23:44 UTC - David Kjerrumgaard: The above error indicates that one of the connector NAR files inside that directory is corrupt in some way. We are aware of the issue and are correcting it in the next release. Fortunately for you, this will have no impact, since you don't utilize that specific connector
----
2019-02-28 18:48:35 UTC - vinay Parekar: Hi guys, is there any nifi-pulsar processor for NiFi 1.7.0 and Pulsar 2.3.0?
----
2019-02-28 18:50:17 UTC - Matteo Merli: Then you can remove it from the connectors folder. Actually that shouldn’t have been added there to begin with.. (and it’s already fixed in master and will be backported to 2.3.1)

In any case, you can also ignore that exception, since it’s just trying to load that connector and fails at it, so it just moves on without it.
----
2019-02-28 18:50:28 UTC - Grant Wu: Okay, I’ll just wait until 2.3.1 comes out
----
2019-02-28 18:50:53 UTC - Matteo Merli: Are you using this with the pulsar-all Docker image?
----
2019-02-28 18:50:57 UTC - Grant Wu: Yes
----
2019-02-28 18:51:39 UTC - Matteo Merli: Are you using other connectors or tiered storage?
----
2019-02-28 18:51:47 UTC - Grant Wu: Not as far as I’m aware
----
2019-02-28 18:52:00 UTC - Grant Wu: Do you have any thoughts on this + above error message? @Matteo Merli I just tried our deploy process again and was unable to reproduce. Any ideas on how to debug this issue in the future if it does come up?
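On the quorum question raised above (6 bookies, ensemble 3, write quorum 2), the arithmetic can be checked with a short sketch. It assumes BookKeeper's round-robin striping, where entry `e` is written to the ensemble members at positions `(e + i) % ensemble_size` for `i` in `0..write_quorum-1`, and it assumes the failed bookies are lost permanently and at the same time, before auto-recovery can re-replicate their entries. The function names are illustrative only.

```python
from itertools import combinations


def write_set(entry_id, ensemble_size, write_quorum):
    """Positions (within the ensemble) of the bookies that store this entry."""
    return {(entry_id + i) % ensemble_size for i in range(write_quorum)}


def failures_that_lose_entries(ensemble_size, write_quorum, failed_count):
    """List the sets of failed ensemble members that destroy every copy of some entry."""
    # The striping pattern repeats every ensemble_size entries, so checking
    # one full cycle of entries covers all distinct write sets.
    write_sets = [write_set(e, ensemble_size, write_quorum)
                  for e in range(ensemble_size)]
    losing = []
    for failed in combinations(range(ensemble_size), failed_count):
        if any(ws <= set(failed) for ws in write_sets):
            losing.append(failed)
    return losing


if __name__ == "__main__":
    for failed_count in (1, 2):
        losing = failures_that_lose_entries(3, 2, failed_count)
        print(failed_count, "failed bookie(s): losing combinations =", losing)
```

Under these assumptions a single bookie failure never removes both copies of an entry, while any two simultaneous failures within the same ledger ensemble do. With 6 bookies, each ledger picks its own ensemble of 3, so the risk applies to ledgers whose ensemble contains both failed bookies.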
----
2019-02-28 18:52:00 UTC - Matteo Merli: Then you can just use the `pulsar` image instead
----
2019-02-28 18:52:09 UTC - Matteo Merli: apachepulsar/pulsar that is
----
2019-02-28 18:52:31 UTC - Matteo Merli: that doesn’t bundle all the connectors and tiered-storage providers
----
2019-02-28 18:54:05 UTC - Matteo Merli: I don’t have it off the top of my head :confused:
----
2019-02-28 19:13:18 UTC - Matteo Merli: Can you open an issue and put all the info there so that we don’t forget about it?
----
2019-02-28 19:17:30 UTC - Grant Wu: ok
----
2019-02-28 19:21:44 UTC - Grant Wu: Made <https://github.com/apache/pulsar/issues/3715>
----
2019-02-28 19:57:30 UTC - David Kjerrumgaard: The NiFi processor I wrote will work with any version of Pulsar.
----
2019-02-28 19:57:41 UTC - David Kjerrumgaard: Which version of NiFi are you using?
----
2019-02-28 20:06:26 UTC - David Kjerrumgaard: @vinay Parekar I can build you a NAR file for it if you like
----
2019-02-28 21:27:02 UTC - Matteo Merli: Thanks
----
2019-02-28 22:37:39 UTC - vinay Parekar: I am currently using 1.7.1
----
2019-02-28 22:38:05 UTC - vinay Parekar: plus I am using Pulsar version 2.3.0
----
2019-02-28 22:38:48 UTC - vinay Parekar: I did find these NAR files, and I tried using them
----
2019-02-28 22:39:37 UTC - vinay Parekar: but I am getting errors.
----
2019-03-01 03:07:13 UTC - Yuvaraj Loganathan: Thanks a lot @Matteo Merli for the detailed explanation. It really helps.
----
2019-03-01 07:43:45 UTC - Alexandre DUVAL: Sure.
----
2019-03-01 08:36:52 UTC - bhagesharora: Hi guys, I am triggering a word-count function in cluster mode using the command: `/pulsar# bin/pulsar-admin functions trigger --tenant public --namespace default --name word-count --triggerValue "hello world hello world"` But I am getting an HTTP 408 Request Timeout error. What could be the reason? Meanwhile pulsar-standalone is already running!
----
2019-03-01 08:39:59 UTC - Sijie Guo: @bhagesharora I think it might be related to state storage.
The state storage is enabled by default in standalone. However, the state storage requires some additional setup in cluster mode. I am supposed to add the instructions to the documentation; I haven't had time to add those yet.
----
