2018-11-01 09:16:26 UTC - Elena Poughia: Hi all, my name is Elena and I’m the Managing Director at Data Natives. I wanted to let you know about our Data Natives conference that will take place on November 22-23 in Berlin, Germany. It focuses on key areas of Artificial Intelligence, Machine Learning, Internet of Things, FinTech, HealthTech and more. Our event aims to bridge the gap between technical innovation and business interests. We’d like to offer your community a 50% discount on tickets using this promo code: DN18LOVESCOMMUNITY. You can also enter the raffle for a free ticket here: <https://datanatives.typeform.com/to/aKSIlv> Please find more info about the conference here: <http://datanatives.io/> Hope to see you there!
---- 2018-11-01 13:22:40 UTC - Ganga Lakshmanasamy: @Ali Ahmed so is there a way to filter data before it is sent to the consumer? We might have the same data coming from 2 different producers, so we might have to filter it out based on some rules. Is that possible in topics?
---- 2018-11-01 13:56:09 UTC - Ganga Lakshmanasamy: @Ravi Can you check the thread I posted about Pulsar topics? Let's go over the security queries we have on topics there.
---- 2018-11-01 13:56:29 UTC - Ravi: ok ganga will check it
---- 2018-11-01 15:59:25 UTC - George Wilk: RE: Slow consumers - could a long time elapsed between receiving a message and acknowledging it to the broker have an impact on the backlog? Specifically, could the broker time out waiting for the ACK and disconnect the consumer? We have a consumer using the Pulsar Java library, which subscribes to about 10 topics on a single client (passing in a list of 10 topics), and sometimes it fails to consume messages, which are then found to be unACKed in the backlog via the topic stats API. We're trying to determine if we should decouple the consumer::receive loop from the processing of messages via an intermediate queue to accelerate the consuming loop. Thanks in advance for any tips.
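The hand-off George describes, a receive loop that only enqueues while separate workers process and acknowledge, can be sketched with a bounded `BlockingQueue`. This is a minimal illustration in plain `java.util.concurrent`, not Pulsar client code: `HandOffPipeline`, the `String` payloads, and the `acked` list (a stand-in for acknowledging a message) are all hypothetical names introduced here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: a receive loop hands messages to worker threads through a bounded queue.
class HandOffPipeline {
    // Sentinel compared by identity, so a real payload equal to "stop" is never mistaken for it.
    private static final String STOP = new String("stop");

    // Feeds `incoming` through the queue and returns the messages that were processed ("acked").
    static List<String> run(List<String> incoming, int workers, int capacity,
                            Consumer<String> processor) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        List<String> acked = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    for (String msg = queue.take(); msg != STOP; msg = queue.take()) {
                        try {
                            processor.accept(msg);
                            acked.add(msg); // stand-in for acknowledging the message
                        } catch (RuntimeException e) {
                            // A failed message is simply left unacked in this sketch.
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            // The "receive loop": put() blocks while the queue is full,
            // so slow workers slow the loop down instead of growing memory.
            for (String msg : incoming) {
                queue.put(msg);
            }
            for (int i = 0; i < workers; i++) {
                queue.put(STOP); // one sentinel per worker
            }
            pool.shutdown();
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
        }
        return acked;
    }

    public static void main(String[] args) {
        List<String> acked = run(Arrays.asList("m1", "m2", "m3", "m4"), 2, 2,
                msg -> System.out.println("processed " + msg));
        System.out.println("acked " + acked.size() + " messages");
    }
}
```

The queue is bounded on purpose: an unbounded queue would let the loop run ahead of the workers without limit, while messages that are received but not yet processed still count as unacknowledged.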
---- 2018-11-01 16:38:52 UTC - Irfan: @Irfan has joined the channel
---- 2018-11-01 16:47:35 UTC - Ali Ahmed: @Ganga Lakshmanasamy Rules are just compute; whatever logic is needed can be encapsulated in Pulsar Functions
---- 2018-11-01 16:51:27 UTC - Matteo Merli: @George Wilk The broker does not have a timeout on acknowledgments by a consumer.
Consumers can set an `ackTimeout` (in the consumer builder) to have messages redelivered if not acked within a certain time. Calling `consumer.receive()` is already decoupled from the actual message receiving by means of the receiver queue (default size: 1000 messages). This queue is used to prefetch messages from the broker and ensure `receive()` can always find a message ready to be processed (if available).
---- 2018-11-01 16:54:54 UTC - George Wilk: Thanks for the quick response! Is there any downside to having a single consumer subscribe to multiple topics?
---- 2018-11-01 16:55:47 UTC - Matteo Merli: Not really, internally it’s still implemented by having individual consumers
---- 2018-11-01 16:57:05 UTC - George Wilk: Sweet! Thanks again.
---- 2018-11-01 16:59:41 UTC - Matteo Merli: BTW: I forgot to mention on the first reply. There is a forced disconnection method (both from broker and clients) when the other end of the TCP connection is unreachable (network partition, machine crashed, etc.). This is completely independent of the consumer acking the messages, or the delivery flow control. It is implemented through health probes that are sent periodically over the TCP connection.
---- 2018-11-01 17:00:17 UTC - Emma Pollum: Hi again! I've been doing some performance testing with pulsar-perf, but I am now getting a message saying "Cannot create producer on topic with backlog quota exceeded". The replication backlog seems to not be emptying itself. Where should I be looking in the configuration to get it to empty the backlog?
---- 2018-11-01 17:00:41 UTC - Ryan Samo: Hey guys, just checking back to see if there are any tips, examples, etc. around the whole rack and data center aware configs, not using geo-replication? Thanks!
---- 2018-11-01 17:05:24 UTC - Matteo Merli: @Ryan Samo there is a tool to “assign” bookies to “racks” where each rack is represented as a label. I think we're still missing a comprehensive documentation page.. ehem.
In any case, it’s very simple to use: `bin/pulsar-admin bookies set-bookie-rack --bookie my-bookie:3181 --rack us-west-a1`
---- 2018-11-01 17:06:06 UTC - Matteo Merli: `bin/pulsar-admin bookies racks-placement` will print the entire configured bookie-to-rack mapping
---- 2018-11-01 17:06:42 UTC - Irfan: Hi, I'm trying to enable TLS for Pulsar following the instructions found here: <https://pulsar.apache.org/docs/en/security-tls-authentication/> However, I'm running into the following exception when I run any kind of client app such as pulsar-admin: `class org.apache.pulsar.client.api.PulsarClientException$InvalidConfigurationException: Private key must be accompanied by certificate chain`. Any ideas on where this issue could stem from?
---- 2018-11-01 17:06:44 UTC - Matteo Merli: Brokers will (try to) choose bookies from different racks when creating ledgers
---- 2018-11-01 17:07:14 UTC - Ryan Samo: Awesome! That’s exactly what I’m after, I really appreciate it!
---- 2018-11-01 17:09:00 UTC - Irfan: Also, the instructions say to add the following line to proxy.conf: `brokerClientAuthenticationParameters=tlsCertFile:/path/to/proxy.cert.pem,tlsKeyFile:/path/to/proxy.key-pk8.pem` but I couldn't find any Pulsar documentation on how to create the proxy cert and key files. Could not having the proxy properly set up be related to my problem?
---- 2018-11-01 17:12:11 UTC - Matteo Merli: @Emma Pollum The default backlog quota is set to 10GB (which is a very low number). The backlog quota gets filled when you have consumers not consuming from a subscription.
There are different remediations:
* Increase the backlog quota: `bin/pulsar-admin namespaces set-backlog-quota $MY_NAMESPACE --limit 1T --policy producer_request_hold`
* Unsubscribe a dangling subscription:
  - Check the topic stats: `bin/pulsar-admin topics stats $MY_TOPIC`
---- 2018-11-01 17:12:59 UTC - Matteo Merli: then drop the offending subscription: `bin/pulsar-admin topics unsubscribe $MY_TOPIC -s $MY_SUBSCRIPTION`
---- 2018-11-01 17:13:19 UTC - Emma Pollum: Thank you so much! I suppose I must have made a consumer with no subscription with pulsar-perf consume. Is there something specific I have to do with the command to prevent this from happening again?
---- 2018-11-01 17:15:04 UTC - Matteo Merli: A subscription is meant to retain data when a consumer is not connected. If you're not concerned about that, you could use `pulsar-perf read` instead of `consume`
---- 2018-11-01 17:15:18 UTC - George Wilk: "msgBacklog" in topic stats vs. "unackedMessages". We have TTL set to 2 days on our namespace. In the scope of a single subscription, topic stats show: "msgBacklog":9472, "blockedSubscriptionOnUnackedMsgs":false, "unackedMessages":735,... Since the TTL is configured to 2 days, my interpretation of these stats is as follows: there are currently 9472 messages in the backlog of this subscription, and this number includes all unacked messages as well as acked messages which have not aged out of TTL. The unacked messages number shows messages which have not yet been consumed and ACKed. Is this a correct assumption? Also, if a message was consumed but never ACKed, will it be re-delivered at some point, or stay in the backlog forever (until TTL)?
---- 2018-11-01 17:15:31 UTC - Matteo Merli: That will use a `Reader` instead of a `Consumer`, and that doesn’t involve a subscription
---- 2018-11-01 17:20:55 UTC - George Wilk: Thank you, good to know!
---- 2018-11-01 17:23:54 UTC - Matteo Merli:
`msgBacklog` -> Messages that were not acknowledged in any form
`unackedMessages` -> Messages pushed to a consumer but not yet acked
When a message is expired through TTL, you can think of it as an auto-ack on the part of the broker.
---- 2018-11-01 17:24:29 UTC - Emma Pollum: So this isn't related to the fact that I have a large replication backlog? Where can I edit the replication settings to get it to empty?
---- 2018-11-01 17:24:59 UTC - Matteo Merli: You mean geo-replication?
---- 2018-11-01 17:25:13 UTC - Matteo Merli: That would count toward the backlog quota as well
---- 2018-11-01 17:25:34 UTC - Emma Pollum: under topic stats $mytopic I have "replication" : { "p-pulsar" : { "msgRateIn" : 0.0, "msgThroughputIn" : 0.0, "msgRateOut" : 0.0, "msgThroughputOut" : 0.0, "msgRateExpired" : 0.0, "replicationBacklog" : 725890, "connected" : false, "replicationDelayInSeconds" : 0 }
---- 2018-11-01 17:25:54 UTC - Emma Pollum: It says it's not connected, which I assume is why it's not emptying.
---- 2018-11-01 17:26:02 UTC - George Wilk: Thanks. Just to clarify... wouldn't `unackedMessages` also fall into the category of messages that were not acknowledged in any form by definition, and as such be included in the msgBacklog total?
---- 2018-11-01 17:26:14 UTC - Matteo Merli: Correct, is `p-pulsar` a different cluster?
---- 2018-11-01 17:26:23 UTC - Emma Pollum: no, i only have one cluster right now
---- 2018-11-01 17:26:24 UTC - Matteo Merli: and it’s reachable from the current broker?
---- 2018-11-01 17:26:41 UTC - Matteo Merli: ok, then it’s a config issue :slightly_smiling_face:
---- 2018-11-01 17:28:11 UTC - Emma Pollum: Where in the configs can I edit this?
---- 2018-11-01 17:28:14 UTC - Matteo Merli: Yes, they are already included in the total. It’s just to show how many messages are outstanding to any particular consumer
---- 2018-11-01 17:28:43 UTC - George Wilk: Thanks for that clarification!
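The relationship Matteo describes between the two counters can be captured in a toy model. This is a sketch for intuition only, not the broker's implementation: `SubscriptionBacklogModel` and its methods are hypothetical names, and dropping an expired message even when it was already pushed to a consumer is an assumption of the model (the thread only says that expiry behaves like an auto-ack).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of one subscription's counters; hypothetical, not broker code.
class SubscriptionBacklogModel {
    private static final class Entry {
        final long publishMillis;
        boolean dispatched; // true once the message was pushed to a consumer

        Entry(long publishMillis) {
            this.publishMillis = publishMillis;
        }
    }

    // Every message the subscription has not acknowledged yet, keyed by a local id.
    private final Map<Long, Entry> notAcked = new LinkedHashMap<>();
    private long nextId = 0;

    long publish(long nowMillis) {
        long id = nextId++;
        notAcked.put(id, new Entry(nowMillis));
        return id;
    }

    // The message is pushed to a consumer but stays in the backlog until acked.
    void dispatch(long id) {
        Entry e = notAcked.get(id);
        if (e != null) {
            e.dispatched = true;
        }
    }

    void ack(long id) {
        notAcked.remove(id);
    }

    // TTL expiry treated as an auto-ack (model assumption: applies to dispatched messages too).
    void expire(long nowMillis, long ttlMillis) {
        notAcked.values().removeIf(e -> nowMillis - e.publishMillis >= ttlMillis);
    }

    // All messages not acknowledged in any form.
    int msgBacklog() {
        return notAcked.size();
    }

    // The subset of the backlog that was pushed to a consumer but not yet acked.
    int unackedMessages() {
        int count = 0;
        for (Entry e : notAcked.values()) {
            if (e.dispatched) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        SubscriptionBacklogModel sub = new SubscriptionBacklogModel();
        long[] ids = new long[5];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = sub.publish(0L);
        }
        sub.dispatch(ids[0]);
        sub.dispatch(ids[1]);
        sub.ack(ids[0]);
        System.out.println("msgBacklog=" + sub.msgBacklog()
                + " unackedMessages=" + sub.unackedMessages());
        sub.expire(2_000L, 1_000L);
        System.out.println("after TTL: msgBacklog=" + sub.msgBacklog());
    }
}
```

In this model `unackedMessages` can never exceed `msgBacklog`, which matches the clarification that unacked messages are already included in the total.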
+1 : Matteo Merli
---- 2018-11-01 17:28:43 UTC - Matteo Merli: What is the cluster name you have set in `broker.conf`?
---- 2018-11-01 17:29:20 UTC - Matteo Merli: What cluster name did you create when doing `initialize-cluster-metadata`? `pulsar-admin clusters list`
---- 2018-11-01 17:29:46 UTC - Matteo Merli: If you don’t have geo-replication, you should have a single cluster name in both cases
---- 2018-11-01 17:30:10 UTC - Emma Pollum: Ah, it has an old cluster listed still
---- 2018-11-01 17:30:24 UTC - Emma Pollum: Thought I had used the delete command already for that.
---- 2018-11-01 17:31:00 UTC - Matteo Merli: Then you need to check the namespace policies: `pulsar-admin namespaces policies $NAMESPACE`
---- 2018-11-01 17:31:51 UTC - Matteo Merli: that includes the list of clusters in which this namespace is supposed to be present/replicated. With 1 cluster, you would only have 1 entry there
---- 2018-11-01 17:32:14 UTC - Emma Pollum: I see the old cluster there too. Is this the last place I need to fix it?
---- 2018-11-01 17:33:50 UTC - Matteo Merli: It should be
---- 2018-11-01 17:34:19 UTC - Matteo Merli: `pulsar-admin namespaces set-clusters $NAMESPACE -c $CLUSTER`
---- 2018-11-01 17:35:58 UTC - Emma Pollum: thank you!
---- 2018-11-01 17:36:04 UTC - Emma Pollum: I'll give this a go.
---- 2018-11-01 17:40:44 UTC - Emma Pollum: Unfortunately it seems it believes s-pulsar (which doesn't exist) is its local cluster. What is the command to change that?
---- 2018-11-01 17:42:53 UTC - Matteo Merli: It should be in `broker.conf`
---- 2018-11-01 17:44:52 UTC - Emma Pollum: :+1:
---- 2018-11-01 17:53:21 UTC - Karthik Palanivelu: Team, can you please help me understand how back pressure is handled in Pulsar? Please feel free to point me to the docs, if any. Thanks in advance.
---- 2018-11-01 17:54:18 UTC - Jesse Zhang (Bose): @Jesse Zhang (Bose) has joined the channel
---- 2018-11-01 18:17:13 UTC - Ryan Samo: I see, cool. So this call sets and saves the metadata about the bookie configs into ZooKeeper, correct? That way the brokers can pick it up too when attempting to choose the bookies?
---- 2018-11-01 18:24:41 UTC - Matteo Merli: That is correct
---- 2018-11-01 18:25:12 UTC - Ryan Samo: :+1:
---- 2018-11-01 18:52:05 UTC - Ali Ahmed: Hi irfan, here is a sample of the additional configs you need for the proxy.
```
"tlsEnabledInProxy", "true"
"tlsCertificateFilePath", "/pulsar/ssl/broker.cert.pem"
"tlsKeyFilePath", "/pulsar/ssl/broker.key-pk8.pem"
"tlsTrustCertsFilePath", "/pulsar/ssl/ca.cert.pem"
```
---- 2018-11-01 21:24:32 UTC - Ravi: @Ganga Lakshmanasamy please check whether this is relevant for us
---- 2018-11-02 01:45:43 UTC - wuzang: @wuzang has joined the channel
---- 2018-11-02 02:14:58 UTC - jia zhai: Hi @hj, please take a look at how the command is serialised around these lines. It may help you understand the decoding code: <https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/api/Commands.java#L865>
---- 2018-11-02 03:13:28 UTC - Ganga Lakshmanasamy: ok @Ali Ahmed Also, we need information on data security. Is there any default encryption logic available to secure data in flight and at rest? @Ravi
---- 2018-11-02 03:15:08 UTC - Ali Ahmed: pulsar deals with bytes, you can have connection security but data encryption is something you would do outside of the context of messaging. Pulsar should not be aware of it
---- 2018-11-02 08:38:23 UTC - hj: @jia zhai Oh I see :thinking_face:. Thanks :+1: for the help
---- 2018-11-02 08:38:47 UTC - jia zhai: always welcome.
----
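One way to follow Ali's suggestion of encrypting outside the messaging layer is to seal the payload bytes before handing them to a producer and open them after receiving. The sketch below uses AES-GCM from the JDK's `javax.crypto`. `PayloadCrypto` and its methods are hypothetical helpers introduced for illustration, the key is generated in place only to keep the example self-contained, and key distribution is out of scope.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper: encrypts payload bytes independently of any messaging client.
class PayloadCrypto {
    private static final int IV_BYTES = 12;  // commonly used nonce size for GCM
    private static final int TAG_BITS = 128; // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns IV followed by ciphertext and tag; these bytes would be the message payload.
    static byte[] encrypt(SecretKey key, byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // fresh nonce per message
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] sealed = cipher.doFinal(plaintext);
            return ByteBuffer.allocate(iv.length + sealed.length).put(iv).put(sealed).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverses encrypt(); fails if the payload was tampered with or the key is wrong.
    static byte[] decrypt(SecretKey key, byte[] payload) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, payload, 0, IV_BYTES));
            return cipher.doFinal(payload, IV_BYTES, payload.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] payload = encrypt(key, "sensor-reading".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(key, payload), StandardCharsets.UTF_8));
    }
}
```

With this approach the messaging system only ever sees opaque bytes, so the data is protected both on the wire and wherever it is stored, independently of TLS on the connection.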
