2020-04-13 10:13:04 UTC - JG: @JG has joined the channel ---- 2020-04-13 10:13:22 UTC - JG: Hello! ---- 2020-04-13 10:14:18 UTC - JG: I have a technical/architecture question regarding Pulsar: is it possible to use it as an event store in an event-sourcing system? If yes, how can I "browse" events and how can I replay them? ---- 2020-04-13 10:24:25 UTC - Hiroyuki Yamada: I downloaded BookKeeper, ran `bookkeeper shell metaformat`, then removed all data under the `data` directory, and after that all the nodes could be started.
So there seems to be something wrong in the current doc. ---- 2020-04-13 10:54:18 UTC - Subash Kunjupillai: Okay, thanks @Sijie Guo and @tuteng. I will use these steps and come back if any clarification is required. ---- 2020-04-13 12:43:26 UTC - Gordan Grasarevic: @Gordan Grasarevic has joined the channel ---- 2020-04-13 13:11:59 UTC - Chris: I'm experiencing this bug <https://github.com/apache/pulsar/issues/6501> when trying to use Pulsar SQL, but in non-standalone mode. It looks like this issue is still open and hasn't been addressed. I was wondering if anyone has any ideas. ---- 2020-04-13 13:16:26 UTC - Aviram Webman: @Aviram Webman has joined the channel ---- 2020-04-13 13:30:06 UTC - xiaopeng: @xiaopeng has joined the channel ---- 2020-04-13 13:36:24 UTC - Aviram Webman: Hi, with a consumer using subscriptionType `key_shared` and a producer using `enableBatching(true)`, messages are routed only to the first consumer. It works only with `enableBatching(false)`. Is it a bug? I need batching for performance. ---- 2020-04-13 14:39:09 UTC - Guilherme Perinazzo: The Node.js client is not using sendAsync. It uses the Node.js worker pool to send synchronously. When the worker pool gets clogged, you get the issues that you're seeing with HTTP calls not being handled. I made a PR a while ago with a listener implementation to solve this issue on the consumer side, but something similar needs to be done for the producer side. ---- 2020-04-13 16:23:17 UTC - Sijie Guo: This is fixed and will be released in 2.5.1 +1 : Chris ---- 2020-04-13 16:54:03 UTC - Pradeep Mishra: Yeah, I tried with the blockIfQueueFull option set to false, without success. It's kind of a deal breaker; it should be fixed. ---- 2020-04-13 16:55:07 UTC - Guilherme Perinazzo: I don't think the current implementation even uses that option. ---- 2020-04-13 16:58:53 UTC - Sijie Guo: Did you produce messages with keys? ---- 2020-04-13 17:02:20 UTC - Sijie Guo: Hmm, I see. If it is using sync send, we need to improve it. 
@Nozomi Kurihara any idea why synchronous send was used? ---- 2020-04-13 17:05:09 UTC - Guilherme Perinazzo: @Sijie Guo it was due to <https://github.com/apache/pulsar-client-node/issues/14>. When they started the library, the Node addon API for C++ didn't have a good story for sending messages from outside the Node-controlled threads into it. That has since changed (with the ThreadsafeFunction API I used to create the listener), but the rest of the codebase was created before it was available. ---- 2020-04-13 17:09:00 UTC - Sijie Guo: I see, thanks. ---- 2020-04-13 17:59:26 UTC - Curtis Cook: @Curtis Cook has joined the channel ---- 2020-04-13 18:28:35 UTC - Aviram Webman: Sure.
```
private void produce() throws PulsarClientException {
    long counter = 0;
    Producer<byte[]> producer = createPulsarProducer();
    while (true) {
        // Cycle through numOfKeys distinct keys.
        String key = "" + counter % numOfKeys;
        String msg = "msg" + key;
        producer.newMessage()
                .key(key)
                .value(msg.getBytes())
                .sendAsync()
                .thenRun(() -> producedMsgPerSecond.increment())
                .exceptionally((ex) -> {
                    // Log the failure and keep producing.
                    System.out.println(ex);
                    return null;
                });
        counter++;
    }
}
```
---- 2020-04-13 19:19:21 UTC - Sijie Guo: What did you observe on the consumer side? ---- 2020-04-13 19:22:43 UTC - Sijie Guo: @Tolulope Awode : We have just updated the Helm chart and the documentation. <https://pulsar.apache.org/docs/en/kubernetes-helm/> Can you try the latest master? The raw YAML files are pretty outdated. I have started a pull request to remove them, because they are misleading. 
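Note on the key_shared batching question above: a batch is stored and dispatched by the broker as a single entry, so a batch that mixes keys cannot be split across consumers. The sketch below is a plain-Java illustration of that difference, assuming a default batcher that fills batches in arrival order and a key-based batcher that keeps one batch per key. `BatchingSketch` and its methods are hypothetical names, not part of the Pulsar client API, and batch size limits for the key-based case are ignored for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; these names are not part of the Pulsar client API. */
final class BatchingSketch {

    /** Arrival-order batching: fill each batch with consecutive messages, ignoring keys. */
    static List<List<String>> batchInArrivalOrder(List<String> keys, int maxBatchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += maxBatchSize) {
            int end = Math.min(i + maxBatchSize, keys.size());
            batches.add(new ArrayList<>(keys.subList(i, end)));
        }
        return batches;
    }

    /** Key-based batching: one batch per key, so every batch holds a single key. */
    static Map<String, List<String>> batchByKey(List<String> keys) {
        Map<String, List<String>> batches = new LinkedHashMap<>();
        for (String key : keys) {
            batches.computeIfAbsent(key, k -> new ArrayList<>()).add(key);
        }
        return batches;
    }

    /** True when every batch contains exactly one distinct key. */
    static boolean everyBatchHasSingleKey(Iterable<List<String>> batches) {
        for (List<String> batch : batches) {
            if (batch.stream().distinct().count() != 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Same key pattern as the producer loop above, with 3 keys.
        List<String> keys = new ArrayList<>();
        for (int counter = 0; counter < 12; counter++) {
            keys.add("" + counter % 3);
        }
        // Arrival-order batches of 4 mix keys: prints false.
        System.out.println(everyBatchHasSingleKey(batchInArrivalOrder(keys, 4)));
        // One batch per key: prints true.
        System.out.println(everyBatchHasSingleKey(batchByKey(keys).values()));
    }
}
```

Only single-key batches can be routed by key without splitting a batch, which is why batching by key matters for key_shared subscriptions.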
---- 2020-04-13 19:39:05 UTC - Ryan Slominski: When packing a custom connector into an Apache NiFi NAR file, is there anything special about the Maven plugin for this purpose, or can I use a Gradle build and the Gradle NAR plugin ```de.fanero.gradle.plugin.nar```? ---- 2020-04-13 19:40:49 UTC - Ryan Slominski: I ask because when I try to run my custom connector I see: ```java.nio.file.NoSuchFileException: /tmp/pulsar-nar/my-custom-connector.nar-unpacked/META-INF/services/pulsar-io.yaml``` ---- 2020-04-13 19:44:32 UTC - Ryan Slominski: This exception is actually caused by:
```
Exception in thread "main" java.lang.IllegalArgumentException: Topic name cannot be null
    at org.apache.pulsar.functions.utils.SourceConfigUtils.validate(SourceConfigUtils.java:223)
    at org.apache.pulsar.functions.LocalRunner.start(LocalRunner.java:266)
```
I'm setting the destination topic name in the org.apache.pulsar.functions.api.Record, so I'm not sure which topic name this exception is about. ---- 2020-04-13 19:46:15 UTC - Ryan Slominski: In case it matters, I'm starting the connector with: ```./bin/pulsar-admin sources localrun --archive ./connectors/my-custom-connector.nar --tenant public --namespace default --name my-custom-connector --source-config-file ./conf/my-custom-connector.yml --parallelism 1``` ---- 2020-04-13 20:06:04 UTC - Aviram Webman: Only one consumer gets messages. ---- 2020-04-13 20:07:37 UTC - Cornel K.: I am interested in the same topic! ---- 2020-04-13 20:17:41 UTC - Ryan Slominski: Never mind, figured it out. Looks like you must create a pulsar-io.yaml file. The docs don't seem to mention this, though. 
I also ran into a difference with the Gradle NiFi plugin, as discussed here: <https://github.com/sponiro/gradle-nar-plugin/issues/5> ---- 2020-04-13 20:20:10 UTC - Aviram Webman: My program records which keys are received on each consumer.
*Batch=true* producer: 178713 messages per second
consumer 0 set=[]
consumer 1 set=[]
consumer 2 set=[]
consumer 3 set=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
consumer 4 set=[]
*batch=false* producer: 28091 messages per second
consumer 0 set=[0, 2, 4, 9, 11, 12, 14, 15, 19]
consumer 1 set=[6, 13, 16]
consumer 2 set=[5, 18]
consumer 3 set=[1, 3, 7]
consumer 4 set=[8, 10, 17]
---- 2020-04-13 20:40:33 UTC - rwaweber: Hey all! Question on tiered storage and being able to set something along the lines of user quotas. Would it be possible to set permissions so that only certain users are able to read from offsets that are beyond a certain age? Behavior like “User Bob is the only user that is capable of reading from offsets that are more than 2 weeks old” ---- 2020-04-13 21:01:05 UTC - Sijie Guo: Yes, it requires a pulsar-io.yaml file. Are you interested in contributing a fix to improve the documentation? ---- 2020-04-13 21:23:27 UTC - Sijie Guo: Currently it doesn’t support this kind of fine-grained access control. But it is an interesting feature. Can you create a GitHub issue, so we can track the feature along with the discussion there? ---- 2020-04-13 21:25:50 UTC - Sijie Guo: @Aviram Webman I see. For key_shared, you need to use the KEY_BASED batcher: <https://github.com/apache/pulsar/blob/master/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/BatcherBuilder.java#L49> <https://github.com/apache/pulsar/blob/master/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/ProducerBuilder.java#L404> ---- 2020-04-13 21:42:07 UTC - JG: Yes, because most systems use Kafka as the broker and something like Mongo or a SQL database as the event storage. ---- 2020-04-13 21:44:50 UTC - JG: Hi all! 
I am using this Docker Compose file: ---- 2020-04-13 21:44:58 UTC - JG:
```
version: "3.7"
services:
  pulsar:
    image: apachepulsar/pulsar:2.5.0
    command: bin/pulsar standalone
    hostname: pulsar
    ports:
      - "8080:8080"
      - "6650:6650"
    restart: unless-stopped
    volumes:
      - "./data/:/pulsar/data"
  dashboard:
    image: apachepulsar/pulsar-manager:v0.1.0
    ports:
      - "9527:9527"
    depends_on:
      - pulsar
    links:
      - pulsar
    volumes:
      - "./data/:/data"
    environment:
      REDIRECT_HOST: "http://127.0.0.1"
      REDIRECT_PORT: "9527"
      DRIVER_CLASS_NAME: "org.postgresql.Driver"
      URL: "jdbc:postgresql://127.0.0.1:5432/pulsar_manager"
      USERNAME: "pulsar"
      PASSWORD: "pulsar"
      LOG_LEVEL: "DEBUG"
```
---- 2020-04-13 21:45:45 UTC - JG: But I always get error messages in pulsar-manager: ---- 2020-04-13 21:45:52 UTC - JG: ---- 2020-04-13 21:46:08 UTC - JG: Nothing is shown in the logs. ---- 2020-04-13 21:46:32 UTC - JG: ---- 2020-04-13 21:46:46 UTC - JG: Really annoying, as I wanted to monitor my topics. ---- 2020-04-13 22:01:30 UTC - Tanner Nilsson: Here's what I have that is working for me:
```
version: "3.7"
services:
  pulsar:
    image: apachepulsar/pulsar:2.5.0
    command: bin/pulsar standalone -a pulsar
    ports:
      - "8080:8080"
      - "6650:6650"
    restart: unless-stopped
    volumes:
      - "../data/:/pulsar/data"
  dashboard:
    image: apachepulsar/pulsar-manager:v0.1.0
    ports:
      - "9527:9527"
    depends_on:
      - pulsar
    environment:
      REDIRECT_HOST: "http://dashboard"
      REDIRECT_PORT: "9527"
      DRIVER_CLASS_NAME: "org.postgresql.Driver"
      URL: "jdbc:postgresql://127.0.0.1:5432/pulsar_manager"
      USERNAME: "pulsar"
      PASSWORD: "pulsar"
      LOG_LEVEL: "DEBUG"
    volumes:
      - "../data/:/data"
```
---- 2020-04-13 22:02:56 UTC - Tanner Nilsson: Then I still access it at <http://localhost:9527>, and when adding my environment the URL is <http://pulsar:8080> ---- 2020-04-13 22:31:54 UTC - JG: There is an HTTP 500: ---- 2020-04-13 22:31:59 UTC - JG: ---- 2020-04-13 22:33:17 UTC - JG: Seems to be a problem with PostgreSQL? 
: ---- 2020-04-13 22:33:22 UTC - JG: ---- 2020-04-13 22:35:24 UTC - Chris: Working on writing a source/sink for Google Pub/Sub, but its protobuf libraries clash with the ones in Pulsar. Does anyone understand how the classloading works for sources/sinks? I found this line <https://github.com/apache/pulsar/blob/52ae1823dbcbef95637580c5f3568843232cd379/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java#L454> which I'm sure exists for a reason, but wouldn't it be reasonable for sources/sinks to have their classpaths take higher priority over the instances'? That (should) avoid versioning clashes, because then the user can provide whichever one they need. I'm sure I can do some classloading wizardry and force it to load my version, but I'd rather try the "right" way first. Also, if there's already a Pub/Sub source/sink out there that I missed, I'd love to use it instead. ---- 2020-04-13 22:41:38 UTC - Chris Miller: I created an issue about this here: <https://github.com/apache/pulsar/issues/6484> ---- 2020-04-13 22:44:50 UTC - Chris: Ah, great. No workarounds yet, I assume? ---- 2020-04-13 22:46:03 UTC - Chris: I’ll try to come up with one tomorrow. Will post in that issue if I figure out something reasonable. +1 : Chris Miller ---- 2020-04-13 22:47:13 UTC - Chris Miller: Rolling back to 2.4.2 is one possible option. I haven't tried to figure out a workaround for the classloading with 2.5.0; it seemed like it was going to be a painful tangent. ---- 2020-04-13 23:25:35 UTC - JG: Do you use Docker Desktop for Windows or Linux? ---- 2020-04-14 01:56:04 UTC - tuteng: Can you check pulsar_manager.log in the Docker container? ---- 2020-04-14 02:31:20 UTC - Nozomi Kurihara: Using the C++ async functions seems better, but we didn't find a good way to do it at first. @Guilherme Perinazzo Do you think the ThreadsafeFunction API solves the issue? 
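Note on the source/sink classloading question above: the "classloading wizardry" mentioned there usually means a child-first (parent-last) class loader, which looks in its own jars before delegating to the parent. The sketch below is a generic JVM technique and not how Pulsar's function runtime loads connectors; `ChildFirstClassLoader` is a hypothetical name, and the jar URLs passed to it would be whatever the connector bundles.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical child-first loader: tries its own URLs before asking the parent. */
final class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Reuse a class this loader has already defined.
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                if (name.startsWith("java.")) {
                    // Core classes always come from the parent chain.
                    loaded = super.loadClass(name, false);
                } else {
                    try {
                        // Child first: look in this loader's own URLs.
                        loaded = findClass(name);
                    } catch (ClassNotFoundException notInChild) {
                        // Fall back to normal parent-first delegation.
                        loaded = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }
}
```

With such a loader, a connector's bundled protobuf version would win over the one on the parent classpath, at the cost of possible `ClassCastException`s when the same class crosses the loader boundary, so shared API types are normally still left to the parent.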
---- 2020-04-14 03:58:35 UTC - Vincent: @Vincent has joined the channel ---- 2020-04-14 04:24:46 UTC - Tymm: @Tymm has joined the channel ---- 2020-04-14 06:11:09 UTC - Pradeep Mishra: @Sijie Guo not related to this, but out of curiosity: the Pulsar WebSocket APIs look stable enough, are we thinking of making a Node.js lib based on them? ---- 2020-04-14 06:12:54 UTC - Sijie Guo: @Pradeep Mishra if it is WebSocket, I don’t think you need a library. You can just use any existing WebSocket library to call the endpoint, no? Just like issuing an HTTP call? ---- 2020-04-14 06:16:23 UTC - Pradeep Mishra: @Sijie Guo yup, you are right. I did the same: instead of using the C++-based lib, I made a small lib based on WebSocket. But having some kind of official lib helps people adopt it very fast; it will help adoption of Pulsar in the Node.js community. ---- 2020-04-14 06:21:53 UTC - Sijie Guo: Agreed. I think the initial thought was to develop a C++-wrapped Node client, since C++ has more features supported than WebSocket and the performance is better. I think we need to look into how to improve the current Node implementation. ---- 2020-04-14 06:31:20 UTC - Sijie Guo: @Chris Miller - I just commented in that issue. We are going to triage this and take a look at it in the coming week. @jia zhai +1 : Chris Miller ---- 2020-04-14 06:33:57 UTC - jia zhai: Got it. ---- 2020-04-14 07:03:25 UTC - Rattanjot Singh: Running `mvn package -Pdocker`, I get the following error:
```
+ tar xfz /pulsar/distribution/server/target/apache-pulsar-2.6.0-SNAPSHOT-src.tar.gz
tar (child): /pulsar/distribution/server/target/apache-pulsar-2.6.0-SNAPSHOT-src.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
[ERROR] Command execution failed.
org.apache.commons.exec.ExecuteException: Process exited with an error: 2 (Exit value: 2)
```
---- 2020-04-14 07:12:05 UTC - hugues DESLANDES: done ---- 2020-04-14 07:12:21 UTC - hugues DESLANDES: <https://github.com/apache/pulsar/issues/6734> ---- 2020-04-14 07:29:18 UTC - Rattanjot Singh: @Addison Higham I get this issue when I run this:
```
tar xfz /pulsar/distribution/server/target/apache-pulsar-2.6.0-SNAPSHOT-src.tar.gz
tar (child): /pulsar/distribution/server/target/apache-pulsar-2.6.0-SNAPSHOT-src.tar.gz: Cannot open: No such file or directory
```
---- 2020-04-14 07:39:46 UTC - JG: I could fix it, but I had to use another Postgres database. ---- 2020-04-14 07:40:00 UTC - JG: The embedded one does not have any tables... ---- 2020-04-14 07:48:06 UTC - Hiroyuki Yamada: Sorry for the basic questions about BookKeeper replication, but let me ask here. I’m testing it with a 3-node bookie cluster.
1. I’m testing with E=3, Qw=3, Qa=2 (this is similar to C* replication with replication factor=3 and quorum), and even though Qa=2, one bookie node going down causes a failure (system unavailable) with `Not enough non-faulty bookies available`. Probably this is as expected, but I think it is pretty misleading because <https://bookkeeper.apache.org/docs/4.10.0/development/protocol/|the doc> says `The system can tolerate *Qa* – 1 failures without data loss.` Can anyone explain it in a better way? With the default setting (E=2, Qw=2, Qa=2) it continues without failure, so I’m pretty confused about why we need 3 variables.
2. The doc doesn’t really explain read quorum, but doesn’t a bookie do quorum reads? For example, if E=3, Qa=2, then a read should go to at least 2 nodes to get sequentially consistent data, but does BookKeeper work like that?
---- 2020-04-14 08:30:33 UTC - wuYin: @wuYin has joined the channel ----
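Note on the BookKeeper quorum question above: one way to read the three settings is that E is the number of bookies a ledger is spread over, Qw is the number of copies written for each entry, and Qa is the number of acknowledgements needed before a write is confirmed. The sketch below is plain arithmetic under the assumption that creating a ledger, or replacing a failed bookie in its ensemble, needs E live bookies, which would explain why E=3 on a 3-node cluster stops accepting writes with one node down while E=2 keeps going. `QuorumSketch` and its methods are hypothetical names, not BookKeeper API.

```java
/** Hypothetical illustration of BookKeeper quorum arithmetic; not BookKeeper API. */
final class QuorumSketch {

    /** A configuration is valid when 1 <= Qa <= Qw <= E. */
    static boolean isValid(int ensembleSize, int writeQuorum, int ackQuorum) {
        return ackQuorum >= 1 && ackQuorum <= writeQuorum && writeQuorum <= ensembleSize;
    }

    /** Assumption: forming (or repairing) an ensemble needs E live bookies. */
    static boolean canFormEnsemble(int liveBookies, int ensembleSize) {
        return liveBookies >= ensembleSize;
    }

    /** From the protocol doc quoted above: Qa - 1 failures are tolerated without data loss. */
    static int failuresWithoutDataLoss(int ackQuorum) {
        return ackQuorum - 1;
    }

    public static void main(String[] args) {
        int liveBookies = 3 - 1; // 3-node cluster with one bookie down

        // E=3, Qw=3, Qa=2: no ensemble of 3 can be formed from 2 live bookies.
        System.out.println(canFormEnsemble(liveBookies, 3)); // prints false

        // E=2, Qw=2, Qa=2: the 2 remaining bookies are enough.
        System.out.println(canFormEnsemble(liveBookies, 2)); // prints true

        // Durability is a separate question from write availability.
        System.out.println(failuresWithoutDataLoss(2)); // prints 1
    }
}
```

Under this reading the `Qa – 1` statement is about durability of already acknowledged entries, while write availability depends on whether E bookies can still be found.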
