2019-04-11 10:13:14 UTC - Michael Bongartz: Hi there, does anyone know if there is any issue with pulsar-admin handling `307 Temporary Redirect` when authentication is enabled? I have a multi-broker setup with Pulsar proxy and authentication enabled, and I can only perform operations on topics if I am lucky enough that my request is directed by Pulsar proxy to the broker owning the topic. When it hits another broker, it seems `pulsar-admin` receives an HTTP 307 from the broker, then tries to connect to the broker owning the topic but without authenticating. ---- 2019-04-11 10:40:36 UTC - Sijie Guo: What is the authentication method you are using? TLS or JWT? ---- 2019-04-11 12:49:16 UTC - Steve Kim: I am willing to contribute, but it will take me a while because I am new to this code and don't have much time. ---- 2019-04-11 12:49:26 UTC - Steve Kim: Do I need to sign a contributor agreement? ---- 2019-04-11 14:01:19 UTC - jia zhai: no need to do that ---- 2019-04-11 14:02:57 UTC - jia zhai: <https://pulsar.apache.org/en/contributing/> ---- 2019-04-11 14:03:16 UTC - jia zhai: FYI. Here is a guide :slightly_smiling_face: ---- 2019-04-11 15:58:33 UTC - chris: @Michael Bongartz if you are using JWT for authentication there is an issue with Java stripping the headers in the HTTP URL connection on a redirect. This issue is addressed here <https://github.com/apache/pulsar/pull/3869>. The fix is in master and will be out in 2.3.1. ---- 2019-04-11 16:23:57 UTC - Grant Wu: Are Pulsar brokers supposed to crash loop while Zookeeper is undergoing network partitions/having leader elections? Not saying this is unreasonable behavior, just wanted to double check this is expected ---- 2019-04-11 16:26:07 UTC - Matteo Merli: It depends on how long the network partition/election lasts. ---- 2019-04-11 16:27:47 UTC - Matteo Merli: Brokers have a ZK session and hold “locks” on resources (like ownership of a group of topics) within that session.
A session is valid until it cannot be refreshed within the session timeout. By default the session time ---- 2019-04-11 16:29:11 UTC - Matteo Merli: By default, we use 30 sec for session timeout. If a broker is not able to talk with a functioning ZK ensemble for that amount of time, it will not be able to ensure it’s still the owner of the resources. ---- 2019-04-11 16:30:15 UTC - Matteo Merli: ..Hence we bounce the broker. We plan to improve that behavior by making sure we can keep writing to BookKeeper while ZK is down (avoiding all attempts at metadata changes). ---- 2019-04-11 16:30:52 UTC - Matteo Merli: In the meantime, you can increase/decrease the ZK session timeout depending on your needs ---- 2019-04-11 16:32:59 UTC - Matteo Merli: a long session timeout will make a broker “survive” a long partition, at the expense of a longer time for this session to expire when a broker crashes badly. e.g.: if a broker hard-crashes, all the topics will still be seen as “owned” by that broker until the session expires and it’s cleaned up by ZK. In the meantime, clients keep trying to reconnect ---- 2019-04-11 16:34:45 UTC - Grant Wu: I see ---- 2019-04-11 16:37:18 UTC - George Wilk: Message retention policy question: if message retention is set to keep all messages indefinitely, does it apply to all messages ever published in the scope of a namespace? If so, would it apply had there never been any open subscriptions? Actual use case scenario: ServiceA (publisher) is deployed before any client services (subscribers) come online, but when they do we need to make sure they can get all the backlogged (pardon the misnomer) messages ever published by ServiceA. ---- 2019-04-11 16:38:34 UTC - Yuvaraj Loganathan: Yes. Messages will be retained even if there are no subscriptions ---- 2019-04-11 16:39:35 UTC - Fredrick P Eisele: In <https://pulsar.apache.org/docs/en/deploy-bare-metal/#initializing-cluster-metadata> it says "It only needs to be written once". 
Is there a problem with running it more than once? How can it be retracted? ---- 2019-04-11 16:42:24 UTC - George Wilk: Thank you for this quick reply! Quick follow-up: would the same be true in a scenario where some subscriptions exist when a new subscription is added? Existing subscriptions have already consumed and ACKED all messages - would the new subscription be able to consume all messages from the beginning? ---- 2019-04-11 16:47:59 UTC - Grant Wu: You would need to reset its cursor to the beginning ---- 2019-04-11 16:48:55 UTC - Matteo Merli: Or subscribing with initialPosition: <https://pulsar.apache.org/api/client/org/apache/pulsar/client/api/ConsumerBuilder.html#subscriptionInitialPosition-org.apache.pulsar.client.api.SubscriptionInitialPosition-> ---- 2019-04-11 16:49:19 UTC - Matteo Merli: (or creating a Reader and starting on `MessageId.earliest`) ---- 2019-04-11 16:49:26 UTC - Grant Wu: Ah, yes, I see that’s landed in the Java client ---- 2019-04-11 16:49:36 UTC - Grant Wu: Will we be getting that in the other clients? :stuck_out_tongue: ---- 2019-04-11 16:50:04 UTC - George Wilk: ty! ---- 2019-04-11 17:02:25 UTC - Chris Bartholomew: There is an issue in 2.3.0 on redirects when using JWT. See <https://github.com/apache/pulsar/pull/3869> ---- 2019-04-11 17:14:50 UTC - David Kjerrumgaard: @Fredrick P Eisele The command is used to initialize all the bookie metadata within ZooKeeper. This metadata is crucial to keeping track of where the Pulsar topic data is stored on BookKeeper. If you re-run the command, all of that information will be overwritten, resulting in full data loss from the Pulsar perspective. ---- 2019-04-11 17:17:51 UTC - Matteo Merli: @Grant Wu It’s already there in C++ and Go.. ---- 2019-04-11 17:18:16 UTC - Matteo Merli: In Python too.. though the pdoc publish is broken.. 
and the docs on the webpage are not updated :confused: ---- 2019-04-11 17:18:35 UTC - Matteo Merli: Anyway: `subscribe(....., initial_position=InitialPosition.Earliest)` ---- 2019-04-11 17:21:42 UTC - Fredrick P Eisele: @David Kjerrumgaard If you overwrite metadata with exactly the same values, will that look like data loss to Pulsar? ---- 2019-04-11 17:29:16 UTC - David Kjerrumgaard: @Fredrick P Eisele No, but I don't think it is possible to have a copy of the same values. The data that is stored includes each of the ledger ids and the bookies that they were placed on for EACH pulsar TOPIC. ---- 2019-04-11 17:30:27 UTC - Grant Wu: Ah, I see ---- 2019-04-11 17:30:29 UTC - Grant Wu: That’s good to hear ---- 2019-04-11 17:30:36 UTC - Grant Wu: Do you know about the Websocket API :stuck_out_tongue: ---- 2019-04-11 20:52:01 UTC - Devin G. Bost: We have a Pulsar Kafka source that we are creating successfully, but it's not starting. (Checking the sink's status shows that it's not running.) In the logs, we're seeing, "Was passed main parameter but no main parameter was defined." Any ideas? ---- 2019-04-11 20:58:48 UTC - David Kjerrumgaard: What was the command you used to start it? The error seems to indicate that you are providing it some additional information that it is not expecting. ---- 2019-04-11 21:10:06 UTC - Devin G. Bost: It should start automatically once it is created. ---- 2019-04-11 21:17:41 UTC - Thor Sigurjonsson: I have a pressing question on roles, permissions and functions in version 2.3.0. (we're doing a deploy) ---- 2019-04-11 21:18:25 UTC - Thor Sigurjonsson: We've turned on token auth. We're figuring out the hard way what roles the functions-worker (as a java thread executor) has to have. (we were masking that issue by having an anonymous role). ---- 2019-04-11 21:18:36 UTC - Thor Sigurjonsson: We got it to deploy but no data is flowing. ---- 2019-04-11 21:18:46 UTC - Thor Sigurjonsson: Does it need produce,consume permissions? 
---- 2019-04-11 21:18:48 UTC - Devin G. Bost: (It's related to the error I reported. We think it's related to removal of "anonymous" permissions.) ---- 2019-04-11 21:19:25 UTC - Thor Sigurjonsson: (we initially could not deploy without a super-user role token given to the worker) ---- 2019-04-11 21:21:12 UTC - Emma Pollum: Is there an option to turn off function metrics? ---- 2019-04-11 21:27:28 UTC - David Kjerrumgaard: @Devin G. Bost Is the issue with one of the built-in Pulsar connectors or one you have developed yourself? ---- 2019-04-11 21:27:45 UTC - Devin G. Bost: It's using the built-in Kafka connector for Pulsar. ---- 2019-04-11 21:28:40 UTC - David Kjerrumgaard: Sink or source? ---- 2019-04-11 21:33:24 UTC - David Kjerrumgaard: What configs did you pass to the `pulsar-admin source create ...` command? @Devin G. Bost ---- 2019-04-11 21:37:35 UTC - Thor Sigurjonsson: we have both sinks and sources, but data flow would not begin until a source kicked in.. ---- 2019-04-11 21:37:50 UTC - Thor Sigurjonsson: we also see 0 instances of source ---- 2019-04-11 21:51:09 UTC - Ali Ahmed: @Devin G. Bost @Thor Sigurjonsson can you post the full command you use to start the source instance ---- 2019-04-11 21:55:34 UTC - Devin G. 
Bost:
```
bin/pulsar-admin source create \
  --source-type 'kafka' \
  --destinationTopicName persistent://osp/obfuscated/log-topic \
  --sourceConfigFile /data/provisioning/obfuscated-kafka-log-topic-source.conf \
  --namespace obfuscated \
  --name kafka-log-topic-source \
  --tenant osp
```
Then, the contents of the .conf file are:
```
configs:
  bootstrapServers: obfuscated
  consumerConfigProperties:
    auto.offset.reset: latest
    sasl.jaas.config: com.sun.security.auth.module.Krb5LoginModule required doNotPrompt=true useTicketCache=false serviceName="kafka" principal="[email protected]" useKeyTab=true keyTab="/pulsar/conf/auth/pulsar_runtime_dev.keytab" client=true;
    sasl.kerberos.service.name: kafka
    security.protocol: SASL_PLAINTEXT
  groupId: log-group
  topic: log-topic
```
---- 2019-04-11 21:55:57 UTC - Thor Sigurjonsson: @Ali Ahmed we've had that work before ---- 2019-04-11 21:56:23 UTC - Devin G. Bost: The only difference is related to the roles and the involvement of the `anonymous` role. ---- 2019-04-11 21:57:24 UTC - Thor Sigurjonsson: (we think) ---- 2019-04-11 21:58:14 UTC - Ali Ahmed: you seem to be providing the topic twice in different formats ---- 2019-04-11 21:59:11 UTC - Devin G. Bost: That part has worked before. ---- 2019-04-11 21:59:58 UTC - Ali Ahmed: is this the Kafka topic? ```topic: log-topic``` ---- 2019-04-11 22:00:43 UTC - Jerry Peng: @Devin G. Bost @Thor Sigurjonsson so the only thing you guys changed is “anonymousUserRole=anonymous”? ---- 2019-04-11 22:01:00 UTC - Jerry Peng: or can you describe the changes you’ve made ---- 2019-04-11 22:02:03 UTC - Thor Sigurjonsson: I just started the broker with the superuser role being the anonymous role and the Kafka source is flowing data... (from a status call I can see that). Before, it was zero instances in the status response. 
---- 2019-04-11 22:05:29 UTC - Jerry Peng: ok then there is probably an authorization issue somewhere, where the anonymous role didn’t have the correct permissions set to produce or consume in a namespace ---- 2019-04-11 22:05:42 UTC - Jerry Peng: I would also check the function logs to see if there are any errors ---- 2019-04-11 22:06:05 UTC - Jerry Peng: You should see connections fail ---- 2019-04-11 22:07:30 UTC - Thor Sigurjonsson: we were getting empty logs at one point ---- 2019-04-11 22:07:34 UTC - Thor Sigurjonsson: and zero instances ---- 2019-04-11 22:07:38 UTC - Thor Sigurjonsson: but let me look now ---- 2019-04-11 22:07:55 UTC - Thor Sigurjonsson: (we have anonymous working, but want to get to a more "secure" setup :wink: ) ---- 2019-04-11 22:08:35 UTC - Matteo Merli: @Thor Sigurjonsson a secure setup will involve containers and K8S ---- 2019-04-11 22:08:40 UTC - Thor Sigurjonsson: (log file is 0 bytes now) ---- 2019-04-11 22:08:50 UTC - Devin G. Bost: Anonymous superusers just seem less than desirable... ---- 2019-04-11 22:09:17 UTC - Thor Sigurjonsson: @Devin G. Bost "he used sarcasm" :slightly_smiling_face: ---- 2019-04-11 22:09:57 UTC - Thor Sigurjonsson: @Matteo Merli what can we do with thread execution stuff today in terms of auth? ---- 2019-04-11 22:10:33 UTC - Matteo Merli: problem is mainly that if you have “untrusted” code running as thread/process it will have access to the worker credentials ---- 2019-04-11 22:12:12 UTC - Thor Sigurjonsson: We're ok waiting for a release with more auth support as part of function support, we're just trying to get to a place first where we have producer/consumer clients auth'd and functions/sources/sinks that work also (superuser is ok there in the interim). We have control of the code now but later will need to treat functions as more "untrusted". 
---- 2019-04-11 22:13:33 UTC - Jerry Peng: Currently, functions running in thread or process runtime mode can only assume the same role as the worker/broker ---- 2019-04-11 22:14:07 UTC - Thor Sigurjonsson: That's fine for our current use case. ---- 2019-04-11 22:14:21 UTC - Jerry Peng: which is what is specified for clientAuthenticationParameters and clientAuthenticationPlugin ---- 2019-04-11 22:14:22 UTC - Jerry Peng: ok ---- 2019-04-11 22:15:36 UTC - Thor Sigurjonsson: Yes, we set that up as the super user (was working with anonymous role as superuser before). Then deploys worked once we had a different super user role in there but no data was flowing. ---- 2019-04-11 22:15:50 UTC - Thor Sigurjonsson: We were guessing that a produce,consume was missing. ---- 2019-04-11 22:16:07 UTC - Thor Sigurjonsson: But that needs to be applied to different constructs we're creating (tenants, namespaces, etc). ---- 2019-04-11 22:16:34 UTC - Thor Sigurjonsson: For now we just need a little more clarity on that so we can turn off the anonymous hack. ---- 2019-04-11 22:17:57 UTC - Jerry Peng: @Thor Sigurjonsson are you using tokens or TLS for auth? ---- 2019-04-11 22:18:09 UTC - Thor Sigurjonsson: @Jerry Peng we're using token auth ---- 2019-04-11 22:18:11 UTC - Jerry Peng: and what role will the token or certificate resolve to? ---- 2019-04-11 22:18:38 UTC - Jerry Peng: that role needs to be a super user role ---- 2019-04-11 22:18:58 UTC - Thor Sigurjonsson: Yes, we got that working (for deploying). ---- 2019-04-11 22:20:01 UTC - Thor Sigurjonsson: we named it 'superuser' and added it to `super-user` config in the broker and put the token for that in the `function-worker.yml`. Deploy worked. Source would not instantiate or flow. ---- 2019-04-11 22:20:51 UTC - Thor Sigurjonsson: And before that, the broker also didn't start without that role for the function worker. ---- 2019-04-11 22:21:53 UTC - Jerry Peng: you are using version 2.3.0? ---- 2019-04-11 22:22:27 UTC - Devin G. Bost: Yes. 
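On the question above of which role the token resolves to, one way to check by hand is to decode the token's payload. This is a minimal sketch, assuming the token is a standard three-part JWT and that the broker takes the role from the `sub` claim (an assumption about this deployment, not something confirmed in the thread). It does not verify the signature, so it is only a debugging aid, and the function name `jwt_role` is hypothetical.

```python
import base64
import json


def jwt_role(token, claim="sub"):
    """Return a claim from a JWT payload WITHOUT verifying the signature.

    Assumption: the role Pulsar uses is stored in the `sub` claim.
    """
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload_b64 = parts[1]
    # base64url payloads usually omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload[claim]
```

The returned value can then be compared with the role configured as a super user on the broker.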
---- 2019-04-11 22:25:35 UTC - Jerry Peng: can you check the assignments for the sources: curl <BROKER_HOSTNAME>:8080/admin/v2/worker/assignments ---- 2019-04-11 22:25:45 UTC - Devin G. Bost: Sure thing. ---- 2019-04-11 22:27:13 UTC - Thor Sigurjonsson: we're seeing those ---- 2019-04-11 22:28:01 UTC - Devin G. Bost: What should we be looking for? ---- 2019-04-11 22:28:12 UTC - Jerry Peng: can you go to a machine that is running an instance of the source and check the command line arguments that are passed in ---- 2019-04-11 22:29:16 UTC - Jerry Peng: see if the parameters client_auth_plugin and client_auth_params are properly configured +1 : Thor Sigurjonsson ---- 2019-04-11 22:29:56 UTC - Thor Sigurjonsson: checking ---- 2019-04-11 22:31:19 UTC - Thor Sigurjonsson: how should we check it? -- we used to be able to do it with a process executor with listing processes ---- 2019-04-11 22:31:54 UTC - Jerry Peng: something like: ps aux | grep pulsar ---- 2019-04-11 22:31:55 UTC - Thor Sigurjonsson: this is running as a thread (we're doing that as we needed to plumb the kerberos params) and we're not on a build with the PR that would fix that. ---- 2019-04-11 22:32:03 UTC - Jerry Peng: oh gotcha ---- 2019-04-11 22:33:56 UTC - Jerry Peng: give me one sec to check something ---- 2019-04-11 22:36:16 UTC - Jerry Peng: ok in thread runtime those args won’t be printed out anywhere. Can you check in the broker logs to see if there are any exceptions? Like connection exceptions or auth exceptions? 
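The `curl` check of the assignments endpoint above could also be scripted. A minimal sketch using only the Python standard library; it assumes token auth over plain HTTP with an `Authorization: Bearer` header, and the function name, host, and token are placeholders rather than values from this thread.

```python
import urllib.request


def build_assignments_request(broker_host, token, port=8080):
    """Build (but do not send) a GET request for the worker assignments.

    The path is the one quoted in the thread; host, port, and token are
    supplied by the caller.
    """
    url = f"http://{broker_host}:{port}/admin/v2/worker/assignments"
    # Assumption: the broker accepts the JWT as a Bearer token header.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
```

Passing the returned request to `urllib.request.urlopen` sends it; an authentication or authorization failure would surface as an `urllib.error.HTTPError`.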
---- 2019-04-11 22:37:36 UTC - Jerry Peng: You can also check what configs the function worker is starting with by searching for the line: ``` Worker Configs: ``` ---- 2019-04-11 22:37:55 UTC - Jerry Peng: see if clientAuthenticationParameters and clientAuthenticationPlugin configs are set properly ---- 2019-04-11 22:44:50 UTC - Thor Sigurjonsson: those are set in the config, I've also seen them in log files (/var/log/messages). I see logs from when I had an old token, when I changed to the new one, and when I didn't have one. Right now we have the super user one in play. ---- 2019-04-11 22:45:30 UTC - Thor Sigurjonsson: org.apache.pulsar.client.impl.auth.AuthenticationToken is also there.. ---- 2019-04-11 22:46:12 UTC - Thor Sigurjonsson: I guess my biggest question is if a super-user role has to also have specific consume/produce permissions. ---- 2019-04-11 22:47:05 UTC - Jerry Peng: it should not ---- 2019-04-11 22:47:52 UTC - Jerry Peng: a role set in “superUserRoles” should have permissions to do everything ----
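The check suggested above (confirming that `clientAuthenticationPlugin` and `clientAuthenticationParameters` are set) could be sketched roughly as below. It only scans flat `key: value` lines in worker config text and is not a YAML parser; the function name is hypothetical.

```python
REQUIRED_KEYS = ("clientAuthenticationPlugin", "clientAuthenticationParameters")


def missing_auth_settings(config_text):
    """Return the required auth keys that are absent or empty in the text."""
    found = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines that are not key/value pairs.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        found[key.strip()] = value.strip()
    return [key for key in REQUIRED_KEYS if not found.get(key)]
```

An empty list means both settings are present with non-empty values; it says nothing about whether the token itself is valid.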
