2018-12-11 10:14:41 UTC - Maarten Tielemans: Morning all, currently hitting a DNS resolve issue when I attempt to connect to either of my two brokers. Keep in mind, zookeeper and the bookie are running on the same nodes.
I did perform the write tests on the bookies successfully.
```
10:11:12.321 [main] INFO org.apache.pulsar.testclient.PerformanceProducer - Adding 1 publishers on topic <persistent://public/default/persistent-60>
10:11:12.465 [pulsar-client-io-2-2] INFO org.apache.pulsar.client.impl.ConnectionPool - [[id: 0x61c9eaa6, L:/127.0.0.1:51854 - R:localhost/127.0.0.1:6650]] Connected to server
10:11:12.877 [pulsar-client-io-2-2] INFO org.apache.pulsar.client.impl.ConnectionPool - [[id: 0xd9fb817a, L:/127.0.0.1:51856 - R:localhost/127.0.0.1:6650]] Connected to server
10:11:13.367 [pulsar-client-io-2-2] WARN org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to ip-10-0-3-183.kanto.indigo:6650 : io.netty.resolver.dns.DnsNameResolverContext$SearchDomainUnknownHostException: Search domain query failed. Original hostname: 'ip-10-0-3-183.kanto.indigo' failed to resolve 'ip-10-0-3-183.kanto.indigo.kanto.indigo' after 6 queries
```
----
2018-12-11 12:51:40 UTC - Maarten Tielemans: Going through my config trying to resolve the above. When setting the cluster metadata, `--web-service-url The web service URL for the cluster, plus a port. This URL should be a standard DNS name.` What is the recommended way to set this up? I tried with a TCP and an HTTP load balancer in front of the two pulsar nodes
----
2018-12-11 14:26:21 UTC - Sijie Guo: yes +1
----
2018-12-11 15:32:07 UTC - Julien Plissonneau Duquène: Thanks for the information. Not sure I will be able to attend that one, but I'll try for sure.
----
2018-12-11 15:40:07 UTC - Mike Card: Has anyone else had a problem with message truncation under very heavy load? We have been running a test with Pulsar 2.2.0 in which a web app runs on 3 EC2 instances (1 in each of us-east-1a, 1b, and 1c). The app has a REST API, and each call to it results in a message being written into a partitioned Pulsar topic.
We have a load test that results in calls to that REST API at a rate of about 15 kHz in aggregate across all 3 availability zones. We are seeing messages truncated at 64 bytes in the partitioned topic using both synchronous and asynchronous producer send calls; under light load this does not happen. Just wondered if any of you had seen anything similar and identified a cause or work-around.
----
2018-12-11 15:48:37 UTC - David Kjerrumgaard: @Maarten Tielemans The quickest way to resolve your issue is to edit the /etc/hosts file on the host that is trying to resolve the host name "ip-10-0-3-183.kanto.indigo" to include that host/IP mapping. If you are using a load balancer in front of the broker, make sure that your forwarding rules use the IP address and not the hostname; otherwise you will have to manually add CNAME records to the DNS server you are using.
----
2018-12-11 15:51:21 UTC - David Kjerrumgaard: @Mike Card How are you determining that the message size is 64 bytes? Is that the size of the messages returned from the consumer?
----
2018-12-11 15:54:56 UTC - Mike Card: Yes. My downstream consumer tasks log a buffer underflow exception for a 64-byte message (should be larger), and if I restart the web app it will die again when it restarts and begins reading messages out of the “input” topic. I do not observe this behavior under light load. The code in question runs fine with Kafka; putting Pulsar in place of Kafka in this application seemed easy, but I’m wondering if there is something I have misconfigured, or what else could cause this.
----
2018-12-11 16:14:25 UTC - Grant Wu: Are message IDs incrementing on a per topic basis?
----
2018-12-11 16:16:20 UTC - Mike Card: wouldn't message IDs increment on a per-message basis?
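----
Note on the DNS failure reported at the top of this log: before editing /etc/hosts or adding DNS records, it can help to confirm which names the client host can actually resolve. Below is a minimal Java sketch. It assumes the JVM's built-in resolver (`InetAddress`) is a reasonable stand-in; the Pulsar client in the log uses Netty's own DNS resolver, which may treat search domains differently. The class name `ResolveCheck` is made up for illustration; the hostname is the one from the error message.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Optional;

class ResolveCheck {
    /** Returns the resolved IP address, or empty if the JVM resolver cannot resolve the name. */
    static Optional<String> resolve(String host) {
        try {
            return Optional.of(InetAddress.getByName(host).getHostAddress());
        } catch (UnknownHostException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Default hostnames: localhost plus the broker name from the log above.
        // Pass other names as command-line arguments to check them instead.
        String[] hosts = args.length > 0
                ? args
                : new String[] {"localhost", "ip-10-0-3-183.kanto.indigo"};
        for (String host : hosts) {
            System.out.println(host + " -> " + resolve(host).orElse("UNRESOLVED"));
        }
    }
}
```

Run it on the machine where the client fails. An `UNRESOLVED` line for the broker's advertised hostname would point at a missing hosts entry or DNS record.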
----
2018-12-11 16:17:11 UTC - Mike Card: (oh if it matters producers are using round-robin to determine partition to write messages to next)
----
2018-12-11 16:18:33 UTC - Grant Wu: I meant more like
----
2018-12-11 16:18:58 UTC - Grant Wu: if the first message in a topic has message ID n, does the second message have message ID n+1
----
2018-12-11 16:19:15 UTC - Grant Wu: Sorry, this isn’t related to your previous issue :sweat_smile:
----
2018-12-11 16:19:24 UTC - Grant Wu: For what it’s worth, I doubt this is the case - I just wanted to check
----
2018-12-11 16:20:37 UTC - David Kjerrumgaard: @Mike Card It would be useful if you could share the code/configuration you are using for testing, as Pulsar handles 100s of millions of messages in Production at Yahoo, so it does perform well under heavy load scenarios.
----
2018-12-11 16:22:52 UTC - David Kjerrumgaard: @Mike Card So the data flow is load generator --- REST ----> web app ----> pulsar client ---> Partitioned topic?
----
2018-12-11 16:23:09 UTC - David Kjerrumgaard: Is it possible that the truncation occurs in the web application?
----
2018-12-11 16:23:09 UTC - Mike Card: yes
----
2018-12-11 16:24:07 UTC - David Kjerrumgaard: Does the web app use a Java client or a web socket to communicate to the Pulsar Broker?
----
2018-12-11 16:26:41 UTC - Mike Card: It uses the Pulsar Java client
----
2018-12-11 16:36:49 UTC - David Kjerrumgaard: Just as a sanity check, can you add a line to the web app that checks the size of the message before it publishes it? That will help us isolate the issue.
----
2018-12-11 16:37:10 UTC - Mike Card: Yes I can +1 : David Kjerrumgaard
----
2018-12-11 16:37:40 UTC - David Kjerrumgaard: Let me know what you find
----
2018-12-11 16:38:12 UTC - Mike Card: OK
----
2018-12-11 17:49:04 UTC - Shalin: Can you set the Python version for Pulsar functions to run in, or point to the python file to use?
----
2018-12-11 18:58:08 UTC - Ryan Samo: Hey guys, is there a way to easily reset a Pulsar cluster? Like wipe the metadata back to when the cluster was first created?
----
2018-12-11 19:54:27 UTC - David Kjerrumgaard: @Ryan Samo You may be able to re-initialize the cluster metadata, but be cautious. <http://pulsar.apache.org/docs/latest/admin-api/clusters/#Initializeclustermetadata-v8mupg>
----
2018-12-11 19:55:34 UTC - David Kjerrumgaard: It basically wipes out the entries in ZK, but leaves the data on the bookies
----
2018-12-11 19:58:44 UTC - Ryan Samo: Ok thanks, one more question. Where does Pulsar keep track of the cert to role mapping? Like if I have a cert named client1 and grant consume to client1, is that in a bookie or zookeeper?
----
2018-12-11 19:59:52 UTC - David Kjerrumgaard: @Ryan Samo Look at this on how to clear out the data on the bookies as well (if you are interested) <https://bookkeeper.apache.org/docs/latest/admin/bookies/#formatting>
----
2018-12-11 20:00:19 UTC - Sanjeev Kulkarni: @Shalin that is not yet possible. Python functions are run by invoking the python command that's present on the system. Both python2 and python3 are supported.
----
2018-12-11 20:02:09 UTC - Shalin: Gotcha. Thanks. :thumbsup:
----
2018-12-11 20:08:49 UTC - Ryan Samo: Thanks @David Kjerrumgaard !
----
2018-12-11 20:21:14 UTC - David Kjerrumgaard: @Ryan Samo Are you asking where the role mapping is physically stored?
----
2018-12-11 20:22:21 UTC - Tobias Gustafsson: Is there any way to get the last message id of a topic? I want to be able to, later, use this to position a Reader instance to get messages from that ID and forward. I know that I can instantiate the Reader with `pulsar.LatestMessage` but that does not help since I want to know the actual message ID.
----
2018-12-11 20:23:26 UTC - Ryan Samo: @David Kjerrumgaard yeah I was just wondering how the roles are stored in case we do a reset
----
2018-12-11 20:25:03 UTC - David Kjerrumgaard: @Ryan Samo I am not sure, but I believe they are part of the cluster metadata kept on ZK. So you will have to re-execute the pulsar-admin commands to create those mappings...AFAIK.
----
2018-12-11 20:25:25 UTC - Ryan Samo: Cool thanks
----
2018-12-11 20:26:00 UTC - David Kjerrumgaard: for the proxy user ONLY, they are kept in the broker.conf, <http://pulsar.apache.org/docs/en/security-authorization.html#proxy-roles> But I don't think that is what you are looking for
----
2018-12-11 20:28:36 UTC - Ryan Samo: Nah I was looking to see how Pulsar stores references to the roles, on bookie or zookeeper, etc
----
2018-12-11 20:31:31 UTC - Tobias Gustafsson: One more question: Is the REST API response content documented anywhere? <https://pulsar.apache.org/docs/latest/reference/RestApi/> only seems to contain the URL and response codes. Perhaps I'm missing something?
----
2018-12-11 20:31:35 UTC - David Kjerrumgaard: @Tobias Why not use that reader and then call the readNext() method and then call getMessageId() on the returned message? It is definitely a hack, but it SHOULD work
----
2018-12-11 20:33:43 UTC - David Kjerrumgaard: @Tobias We have that for the admin REST API, if that helps.... <https://pulsar.apache.org/en/admin-rest-api/>
----
2018-12-11 20:33:50 UTC - Tobias Gustafsson: @David Kjerrumgaard What should I use as initial message ID for the reader then?
If I use `LatestMessage`, readNext() may not return
----
2018-12-11 20:36:10 UTC - Tobias Gustafsson: anything as long as there are no more messages produced on that topic
----
2018-12-11 20:36:12 UTC - David Kjerrumgaard: readNext(int timeout, TimeUnit unit) would at least prevent a hanging process
----
2018-12-11 20:36:23 UTC - David Kjerrumgaard: but I see what you are saying
----
2018-12-11 20:38:35 UTC - Tobias Gustafsson: As a backup option, is there any way to get the message id of a message I've produced? That way I could at least keep track of the last message ID successfully committed even though I would have to do it outside of Pulsar.
----
2018-12-11 20:40:13 UTC - Tobias Gustafsson: @David Kjerrumgaard Thanks for the docs link, that was what I was searching for!
----
2018-12-11 20:40:17 UTC - David Kjerrumgaard: @Tobias Yes, every call to send(T message) returns a MessageId
----
2018-12-11 20:40:24 UTC - David Kjerrumgaard: <http://pulsar.apache.org/api/client/org/apache/pulsar/client/api/Producer.html#send-T->
----
2018-12-11 20:42:03 UTC - David Kjerrumgaard: If you are sending messages asynchronously, the completable future returns the messageId <http://pulsar.apache.org/api/client/org/apache/pulsar/client/api/Producer.html#sendAsync-org.apache.pulsar.client.api.Message->
----
2018-12-11 20:42:15 UTC - Tobias Gustafsson: I'm using the Go client, it seems to be missing that functionality from what I can tell.
----
2018-12-11 20:42:40 UTC - David Kjerrumgaard: And you will ONLY get a messageId IF the message has been committed to disk
----
2018-12-11 20:42:47 UTC - David Kjerrumgaard: Ah
----
2018-12-11 20:43:59 UTC - David Kjerrumgaard: @Tobias Can you please file a feature request for these methods to be added?
----
2018-12-11 20:43:59 UTC - David Kjerrumgaard: <https://github.com/apache/pulsar/issues>
----
2018-12-11 20:44:31 UTC - David Kjerrumgaard: That way you can track the progress, and will be notified when they are ready
----
2018-12-11 20:45:25 UTC - Tobias Gustafsson: OK, thanks!
----
2018-12-11 21:00:41 UTC - Cristian: Is there a roadmap somewhere?
----
2018-12-11 22:14:30 UTC - Dave Southwell: Can anyone give me some guidance on how to use Basic Auth with Pulsar? I saw in some Slack history and on GitHub that there is support for it, but there isn't any documentation that I've found.
----
2018-12-11 23:14:27 UTC - Mike Card: Hey @David Kjerrumgaard I ran this test and the producer says the original object size is 64 bytes but the consumer won't deserialize it, still gets a buffer underflow. This is going to take more investigation. I am beginning to believe something about the Pulsar byte array serializer is different from the Kafka ByteBuffer serializer
----
2018-12-11 23:15:56 UTC - Mike Card: Fundamentally it seems to me they should be the same, but something in the serde process is making the consumer unhappy under heavy load conditions. I am starting to think this is related to using multiple tasks somehow; that is going to be my next test
----
2018-12-11 23:17:03 UTC - David Kjerrumgaard: which SerDe are you using?
----
2018-12-11 23:20:20 UTC - Mike Card: We are using our own serde to turn an object into a byte array, the call to publish looks like this: `_log.info("DefaultDatabus.onUpdateIntent: publishing message " + ref.toString() + ", size in bytes == " + UpdateRefSerializer.toByteBuffer(ref).array().length);` `eventProducer.sendAsync(UpdateRefSerializer.toByteBuffer(ref).array()).thenAccept(msgId -> {});`
----
2018-12-11 23:21:38 UTC - Mike Card:
----
2018-12-11 23:21:51 UTC - Mike Card: This is the custom serializer
----
2018-12-11 23:26:51 UTC - Mike Card: It works fine with an identical app setup using Kafka in lieu of Pulsar, I think something changes with the serialization though somehow.
----
2018-12-11 23:27:26 UTC - Mike Card: I was looking at the statics in that serde but I don't *think* they are the culprit since they don't cause problems with Kafka.
----
2018-12-11 23:27:51 UTC - Mike Card: (I did not write that serde so)
----
2018-12-11 23:36:05 UTC - David Kjerrumgaard: I will get with the engineering team to see if they have any ideas.
----
2018-12-12 00:22:39 UTC - Ali Ahmed: @Dave Southwell are you looking for basic auth with the Pulsar client or HTTP access?
----
2018-12-12 00:23:54 UTC - Dave Southwell: Hi @Ali Ahmed I'm looking for basic auth over http. <https://github.com/apache/pulsar/pull/1087>
----
2018-12-12 00:26:47 UTC - Matteo Merli: @Dave Southwell I’d say that at this point the best option is to look at the unit test contained in that PR for how to use the basic auth
----
2018-12-12 00:30:33 UTC - Matteo Merli: Essentially, you’d have to pass an `.htpasswd` file (created with regular HTTP server tools) to the broker. On the client, when you pass the authentication, you pass `org.apache.pulsar.client.impl.auth.AuthenticationBasic` and the auth param string will be something like: `userId:MY_USER,password:MY_PASSWORD`
----
2018-12-12 00:31:51 UTC - Dave Southwell: Ok.
I'd gleaned that .htpasswd was what was used to pass in the username and hashed password for valid users. I assume the usernames should then match roles in Pulsar as well.
----
2018-12-12 00:32:23 UTC - Matteo Merli: Yes, that will be the same authorization part that is common to all the authentication plugins +1 : Dave Southwell
----
2018-12-12 00:33:29 UTC - Matteo Merli: In any case, for next release 2.3 we’ve added a better way to perform simple authentication based on JWT tokens: <http://pulsar.apache.org/docs/en/next/security-token-client/>
----
2018-12-12 00:34:28 UTC - Dave Southwell: Looks good!
----
2018-12-12 04:58:20 UTC - VendyLuo: @VendyLuo has joined the channel
----
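Note on the serializer thread above (the `toByteBuffer(ref).array()` publish call): one thing that may be worth ruling out when handing a `ByteBuffer`'s contents to an API that takes `byte[]` is that `ByteBuffer.array()` returns the whole backing array and ignores the buffer's position and limit. The sketch below is not a diagnosis of the reported truncation. It only illustrates that standard `java.nio` behaviour, using a hypothetical helper `remainingBytes` and a made-up 32-byte scratch buffer, since the actual `UpdateRefSerializer` source is not shown in the log.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ByteBufferPayload {
    /** Copies exactly the readable bytes (position..limit) without moving the caller's position. */
    static byte[] remainingBytes(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        // duplicate() shares the content but has its own position and limit,
        // so reading from it leaves the original buffer untouched.
        buf.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        // Assumption for illustration: the serializer fills only part of a larger buffer.
        ByteBuffer buf = ByteBuffer.allocate(32);
        buf.put("hello".getBytes(StandardCharsets.UTF_8));
        buf.flip(); // position = 0, limit = 5

        System.out.println("array().length  = " + buf.array().length);          // 32
        System.out.println("remaining bytes = " + remainingBytes(buf).length);  // 5
    }
}
```

If the two numbers differ for a real serialized object, the payload handed to `sendAsync` is not the same byte sequence the buffer's position and limit describe, and a consumer reading it back could see unexpected lengths.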
