2019-12-11 09:11:57 UTC - juraj: here u can see how i'm cleaning the EKS env, and that the ZK init task was run only a single time: ---- 2019-12-11 09:12:31 UTC - juraj: so if u can think of any other reason than those two, that might be the way forward ---- 2019-12-11 09:14:12 UTC - Sijie Guo: Can you try with 2.4.1 image? ---- 2019-12-11 09:14:48 UTC - juraj: ok ---- 2019-12-11 09:32:29 UTC - juraj: sijie, just before i do that, i want to check something -- what's the desired order of the cluster components init?
1. zookeeper nodes up and joined in cluster 2. zookeeper metadata inserted 3. all the other components can proceed with init is that roughly correct? ---- 2019-12-11 09:33:09 UTC - Sijie Guo: correct ---- 2019-12-11 09:34:16 UTC - juraj: ok, then i know where to look, bc i have the rest of the components trying to init before zk metadata has finished, e.g. (after putting 2 min delay before the zk meta init task): ---- 2019-12-11 09:34:50 UTC - juraj: so i'll focus on that now (the only other change is that i upgraded from helm3 beta to helm3 stable, but even that is likely not the culprit) ---- 2019-12-11 09:57:29 UTC - juraj: `if bin/pulsar zookeeper-shell -server pulsar-dev-zookeeper ls /admin/clusters/pulsar-dev; then echo "yes"; else echo "no"; fi` returns: ---- 2019-12-11 09:58:41 UTC - juraj: (the `until` loop used in `wait-zookeeper-ready` works the same - it incorrectly succeeds when the zookeeper shell says the cluster hasn't been inited) ---- 2019-12-11 11:01:43 UTC - juraj: fyi the helm3 upgrade forced me to add serviceName: to my StatefulSet defs, possible that's the breaking change, i'm looking into it ---- 2019-12-11 11:32:10 UTC - juraj: submitted the issue <https://github.com/helm/helm/issues/7207> ---- 2019-12-11 11:56:08 UTC - juraj: there's still a difference between 2.4.1 and 2.4.2 on the same helm: 2.4.2 won't start. so i'm looking to fix the ZK wait script. on a working 2.4.1, `bin/pulsar zookeeper-shell -server pulsar-local-zookeeper ls /admin/clusters` returns: `[global, pulsar-local]` and `bin/pulsar zookeeper-shell -server pulsar-local-zookeeper ls /admin/clusters/pulsar-local` returns: `[failureDomain]` so i'll use the `"failureDomain"` string as a positive sign that ZK is inited ---- 2019-12-11 17:20:21 UTC - juraj: using this zookeeper wait script fixed it: `CMD="bin/pulsar zookeeper-shell -server {{ template "pulsar.fullname" .
}}-{{ .Values.zookeeper.component }} ls /admin/clusters"; until $CMD && [ $(echo $($CMD) | tail -n 1 | grep -c {{ template "pulsar.fullname" . }}) -eq 1 ]; do echo "waiting"; sleep 3; done;` ---- 2019-12-11 17:22:37 UTC - juraj: the first $CMD invocation right after the `until` is inexplicably needed, otherwise the `until` finishes when the `$(echo $CMD..)` cannot connect bc the pods weren't inited yet, weird but works ---- 2019-12-11 17:23:58 UTC - juraj: this whole thing is shabby and every component should be directly checking/waiting on ZK itself from the java code imo ---- 2019-12-11 18:42:47 UTC - Sijie Guo: cool. can you create github issues or pull requests about the enhancements? ---- 2019-12-11 18:44:10 UTC - juraj: had to do more mods bc it worked on local but not EKS. testing now. will create issue if i really crack it. ---- 2019-12-11 19:19:11 UTC - Fredrick P Eisele: From the documentation <https://pulsar.apache.org/docs/en/io-develop/#testing> it appears that `testcontainers` is used for testing connectors. I have looked at the integration tests, and they are complicated. I have an HTTPS service which is interrogated by a custom source connector. The custom source connector then updates pulsar. It seems to me that I need two testcontainers, one for the service (<https://www.testcontainers.org/modules/mockserver/>) and one for the pulsar instance (<https://www.testcontainers.org/modules/pulsar/>). The documentation is brief. 1. How do I attach my custom source connector to the pulsar container? Use localrun? <https://pulsar.apache.org/docs/en/functions-debug/#debug-with-localrun-mode> ----
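For the ZooKeeper wait script juraj describes above, here is a minimal sketch of a stricter check. It is not the chart's actual script. It assumes, as the thread reports, that `ls /admin/clusters` prints a bracketed list such as `[global, pulsar-local]` as its last line once metadata is initialized, and that the exit status alone cannot be trusted. The function names `zk_lists_cluster` and `wait_for_cluster` are hypothetical. Matching the cluster name as a whole list entry avoids a false positive from any other output line that happens to contain the name (for example a line mentioning the server hostname).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the LAST line of the captured output
# is a bracketed list that contains the cluster name as an exact entry.
zk_lists_cluster() {
  output=$1
  cluster=$2
  last_line=$(printf '%s\n' "$output" | tail -n 1)
  case "$last_line" in
    "["*"]") ;;        # looks like "[a, b, c]"
    *) return 1 ;;     # anything else (errors, log lines) is "not ready"
  esac
  inner=${last_line#\[}   # drop leading "["
  inner=${inner%\]}       # drop trailing "]"
  # One entry per line, trim surrounding spaces, then require an exact match.
  printf '%s\n' "$inner" | tr ',' '\n' | sed 's/^ *//; s/ *$//' \
    | grep -qxF -- "$cluster"
}

# Hypothetical wait loop: wait_for_cluster CLUSTER INTERVAL COMMAND [ARGS...]
# Runs the command once per iteration and checks both its exit status
# and its output before declaring ZooKeeper initialized.
wait_for_cluster() {
  cluster=$1
  interval=$2
  shift 2
  until out=$("$@" 2>&1) && zk_lists_cluster "$out" "$cluster"; do
    echo "waiting"
    sleep "$interval"
  done
}
```

With the names from the thread, the call would look like `wait_for_cluster pulsar-local 3 bin/pulsar zookeeper-shell -server pulsar-local-zookeeper ls /admin/clusters`. Running the command once per iteration and reusing its captured output also avoids the double `$CMD` invocation in the original loop.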