2018-06-12 14:04:45 UTC - Karthik Palanivelu: Hi All, this is regarding 
ZooKeeper in a K8s cluster. I am using the script generate-zookeeper-config.sh 
without DOMAIN, based on @Sijie Guo's comment on the issue I raised a 
couple of days back. But now I am getting the exception below for these 
zookeeper.conf entries. Please help me here.
```
server.1=zookeeper-0:2888:3888
server.2=zookeeper-1:2888:3888
server.3=zookeeper-2:2888:3888
``` 
```
09:46:34.875 [WorkerSender[myid=3]] WARN  org.apache.zookeeper.server.quorum.QuorumCnxManager - Cannot open channel to 1 at election address zookeeper-0:3888
java.net.UnknownHostException: zookeeper-0
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184) ~[?:1.8.0_161]
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_161]
        at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_161]
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:562) [org.apache.pulsar-pulsar-zookeeper-2.0.0-rc1-incubating.jar:2.0.0-rc1-incubating]
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:538) [org.apache.pulsar-pulsar-zookeeper-2.0.0-rc1-incubating.jar:2.0.0-rc1-incubating]
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:452) [org.apache.pulsar-pulsar-zookeeper-2.0.0-rc1-incubating.jar:2.0.0-rc1-incubating]
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:433) [org.apache.pulsar-pulsar-zookeeper-2.0.0-rc1-incubating.jar:2.0.0-rc1-incubating]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
09:46:34.876 [WorkerSender[myid=3]] WARN  org.apache.zookeeper.server.quorum.QuorumPeer - Failed to resolve address: zookeeper-0
java.net.UnknownHostException: zookeeper-0
        at java.net.InetAddress.getAllByName0(InetAddress.java:1280) ~[?:1.8.0_161]
        at java.net.InetAddress.getAllByName(InetAddress.java:1192) ~[?:1.8.0_161]
        at java.net.InetAddress.getAllByName(InetAddress.java:1126) ~[?:1.8.0_161]
        at java.net.InetAddress.getByName(InetAddress.java:1076) ~[?:1.8.0_161]
        at org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer.recreateSocketAddresses(QuorumPeer.java:166) [org.apache.pulsar-pulsar-zookeeper-2.0.0-rc1-incubating.jar:2.0.0-rc1-incubating]
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:595) [org.apache.pulsar-pulsar-zookeeper-2.0.0-rc1-incubating.jar:2.0.0-rc1-incubating]
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:538) [org.apache.pulsar-pulsar-zookeeper-2.0.0-rc1-incubating.jar:2.0.0-rc1-incubating]
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:452) [org.apache.pulsar-pulsar-zookeeper-2.0.0-rc1-incubating.jar:2.0.0-rc1-incubating]
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:433) [org.apache.pulsar-pulsar-zookeeper-2.0.0-rc1-incubating.jar:2.0.0-rc1-incubating]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
```
----
2018-06-12 14:06:53 UTC - Ivan Kelly: don't know specifically for k8s, but it 
can't resolve the zookeeper-0 to an IP
----
2018-06-12 14:20:58 UTC - David Asher: Is it possible to publish messages > 
5MB on a pulsar topic?
----
2018-06-12 14:22:01 UTC - Karthik Palanivelu: Yes @Ivan Kelly but not sure how 
to get the hostname of it. I am new to k8s
----
2018-06-12 14:22:23 UTC - Ivan Kelly: @David Asher i think the 5MB limit is 
hard coded
----
2018-06-12 14:22:59 UTC - Ivan Kelly: @Karthikeyan Palanivelu how are you 
starting things in k8s?
----
2018-06-12 14:24:49 UTC - David Asher: @Ivan Kelly thx... so the only way 
around is storing it somewhere else and reference the external id?
----
2018-06-12 14:25:41 UTC - Ivan Kelly: Ya, that's one option. there was a bit of 
a discussion about unlimited message size a while back, but I'm not sure what 
happened with it
----
2018-06-12 14:26:35 UTC - Ivan Kelly: effectively, it would be transparently 
chunking the message into multiple messages, and sending a "commit" message at 
the end.
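
A minimal sketch of that idea in Python, not Pulsar's implementation: the payload is split into chunks no larger than an assumed limit, followed by a final "commit" marker that tells the reader how many chunks to expect. The function names, the tuple layout, and the 5 MB constant are illustrative assumptions.

```python
from typing import Iterator, List, Tuple

# Assumed limit for illustration; the chat only says the limit is about 5 MB.
MAX_CHUNK_BYTES = 5 * 1024 * 1024

# Each message is (kind, sequence_number, body); this layout is hypothetical.
Message = Tuple[str, int, bytes]


def split_into_messages(payload: bytes,
                        max_chunk: int = MAX_CHUNK_BYTES) -> Iterator[Message]:
    """Yield "chunk" messages followed by exactly one "commit" message."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    count = 0
    for offset in range(0, len(payload), max_chunk):
        yield ("chunk", count, payload[offset:offset + max_chunk])
        count += 1
    # The commit carries the chunk count so the reader can detect gaps.
    yield ("commit", count, b"")


def reassemble(messages: List[Message]) -> bytes:
    """Rebuild the payload; fail if the commit or any chunk is missing."""
    chunks = {}
    expected = None
    for kind, seq, body in messages:
        if kind == "chunk":
            chunks[seq] = body
        elif kind == "commit":
            expected = seq
    if expected is None:
        raise ValueError("no commit message seen; payload is incomplete")
    if sorted(chunks) != list(range(expected)):
        raise ValueError("missing chunks")
    return b"".join(chunks[i] for i in range(expected))
```

The reader only treats a payload as complete once the commit arrives, so a producer that dies halfway through never exposes a partial message.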
----
2018-06-12 14:32:51 UTC - Karthik Palanivelu: @Ivan Kelly We have a K8s cluster 
which I used to deploy the Zookeeper containers based on my image with similar 
yaml that is in repo along with the script. I use `./pulsar zookeeper` cmd 
within my image.
----
2018-06-12 14:36:26 UTC - Ivan Kelly: how do you set up the zookeeper.conf?
----
2018-06-12 14:40:29 UTC - Karthik Palanivelu: Using the below script:
----
2018-06-12 14:40:36 UTC - Karthik Palanivelu: @Karthik Palanivelu uploaded a 
file: <https://apache-pulsar.slack.com/files/U7VRE0Q1G/FB63M16P4/-.sh|Untitled>
----
2018-06-12 14:40:46 UTC - Karthik Palanivelu: Same from the repo; only change 
is I removed domain which is not resolving
----
2018-06-12 14:41:21 UTC - Ivan Kelly: @Karthikeyan Palanivelu are you just 
running "./pulsar zookeeper" on container start? nothing else?
----
2018-06-12 14:41:30 UTC - Karthik Palanivelu: Yes
----
2018-06-12 14:42:34 UTC - Karthik Palanivelu: I believe I am making a mistake in 
the config below:
```
server.1=zookeeper-0:2888:3888
server.2=zookeeper-1:2888:3888
server.3=zookeeper-2:2888:3888
```
----
2018-06-12 14:43:24 UTC - Ivan Kelly: is your statefulset called zookeeper?
----
2018-06-12 14:43:46 UTC - Ivan Kelly: will you post your k8s deployment yaml?
----
2018-06-12 14:48:49 UTC - Karthik Palanivelu: @Karthik Palanivelu uploaded a 
file: <https://apache-pulsar.slack.com/files/U7VRE0Q1G/FB5CPJTJL/-.php|Untitled>
----
2018-06-12 14:53:08 UTC - Ivan Kelly: strange. It should be named zookeeper-0, 
etc in the dns
----
2018-06-12 14:53:19 UTC - Ivan Kelly: could you post the contents of /etc/hosts 
from one of the pods?
----
2018-06-12 14:56:17 UTC - Karthik Palanivelu: Are you referring to container or 
POD? If POD, Can you please let me know how to get it?
----
2018-06-12 14:56:31 UTC - Ivan Kelly: container
----
2018-06-12 14:58:07 UTC - Karthik Palanivelu: @Karthik Palanivelu uploaded a 
file: <https://apache-pulsar.slack.com/files/U7VRE0Q1G/FB5CXM15E/-.txt|Untitled>
----
2018-06-12 15:00:43 UTC - Ivan Kelly: can you ping zookeeper-0 from the same 
container?
----
2018-06-12 15:02:10 UTC - Karthik Palanivelu: Within my container I do not have 
ping, on host it is UnknownHost
----
2018-06-12 15:02:51 UTC - Ivan Kelly: nslookup? nc?
----
2018-06-12 15:03:38 UTC - Karthik Palanivelu: @Karthik Palanivelu uploaded a 
file: <https://apache-pulsar.slack.com/files/U7VRE0Q1G/FB647HWUA/-.m|Untitled>
----
2018-06-12 15:04:35 UTC - Ivan Kelly: cat /etc/resolv.conf & 
/etc/nsswitch.conf?
----
2018-06-12 15:08:56 UTC - Ivan Kelly: does nslookup zookeeper-0.zookeeper work?
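
If the container has Python but no ping or nslookup (an assumption; this image may not), a rough equivalent of that check could look like the sketch below. `check_hosts` is a hypothetical helper, and the `resolve` parameter exists only so the lookup can be swapped out.

```python
import socket
from typing import Callable, Dict, Iterable


def check_hosts(hosts: Iterable[str],
                resolve: Callable[[str], str] = socket.gethostbyname
                ) -> Dict[str, str]:
    """Map each hostname to its IP, or to an error string if lookup fails."""
    results = {}
    for host in hosts:
        try:
            results[host] = resolve(host)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            results[host] = "unresolved: {}".format(exc)
    return results
```

Running `check_hosts(["zookeeper-0", "zookeeper-0.zookeeper"])` from inside a pod would show which of the two name forms the pod's resolver accepts.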
----
2018-06-12 15:09:44 UTC - Karthik Palanivelu: yes
----
2018-06-12 15:10:12 UTC - Ivan Kelly: ok, strange, k8s docs say just the 
statefulset-<number> should work
----
2018-06-12 15:10:45 UTC - Ivan Kelly: but anyhow, you need to change the 
ZOOKEEPER_SERVERS env variable in the deployment spec
----
2018-06-12 15:11:03 UTC - Ivan Kelly: to be 
zookeeper-0.zookeeper,zookeeper-1.zookeeper,zookeeper-2.zookeeper
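
A rough Python sketch of how a comma-separated ZOOKEEPER_SERVERS value could map onto `server.N` lines. This is not the repo's generate-zookeeper-config.sh; `server_lines` is a hypothetical name, and the ports 2888 and 3888 are taken from the config posted earlier.

```python
from typing import List


def server_lines(zookeeper_servers: str,
                 peer_port: int = 2888,
                 election_port: int = 3888) -> List[str]:
    """Turn "host-a,host-b" into ["server.1=host-a:2888:3888", ...]."""
    # Ignore empty entries and stray whitespace around the commas.
    hosts = [h.strip() for h in zookeeper_servers.split(",") if h.strip()]
    return [
        "server.{}={}:{}:{}".format(index, host, peer_port, election_port)
        for index, host in enumerate(hosts, start=1)
    ]


print("\n".join(server_lines(
    "zookeeper-0.zookeeper,zookeeper-1.zookeeper,zookeeper-2.zookeeper")))
```

With that value, the first line printed is `server.1=zookeeper-0.zookeeper:2888:3888`, so each entry uses the name form that nslookup resolved.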
----
2018-06-12 15:13:10 UTC - Karthik Palanivelu: I tried that Option as well and 
it did not work. Let me try one more time.
----
2018-06-12 15:13:58 UTC - Ivan Kelly: how does the zookeeper.conf look after 
you do that?
----
2018-06-12 15:22:59 UTC - Karthik Palanivelu: I think it worked, Thank You 
@Ivan Kelly. BTW can you please let me know what the host name for the 
Broker would be, to use within Initialize Cluster Data?
----
2018-06-12 15:24:32 UTC - Ivan Kelly: broker-0.broker I would guess, but can't 
be sure, it's not a stateful set, so rules may be different
----
2018-06-12 15:34:34 UTC - Karthik Palanivelu: Bookie also got started...Working 
on Broker. BTW why do we need to have bookie as Daemon Set, Initialization and 
autoRecovery; It was not working for me and I just used Deployment option
----
2018-06-12 16:50:53 UTC - Matteo Merli: @Karthikeyan Palanivelu the broker URL 
will be the DNS name associated with the brokers service. either `broker` or 
similar
----
2018-06-12 16:59:06 UTC - Karthik Palanivelu: Thanks @Matteo Merli Let me try 
that
----
2018-06-12 16:59:26 UTC - Guillaume LECROC: @David Asher 
<https://github.com/apache/incubator-pulsar/issues/523>
I think the max message size is configurable in bookkeeper and/or pulsar
----
2018-06-12 17:54:36 UTC - William Fry: What would it take to use Spark’s 
structured streaming with Pulsar via PySpark?
----
2018-06-12 17:58:06 UTC - Sijie Guo: @William Fry I think we need a Python-based 
Pulsar input source implementing the Spark data frames or streaming datasets 
interface.
----
2018-06-12 18:07:58 UTC - William Fry: Gotcha, how would I push for that to 
happen? A ticket on Github?
----
2018-06-12 18:09:13 UTC - William Fry: Would it be very difficult to implement?
----
2018-06-12 18:13:15 UTC - Matteo Merli: The Pulsar py API should be 
straightforward to use. If you're familiar with PySpark, it should be easy to 
create an adaptor
----
2018-06-12 18:19:07 UTC - Alex Bradbury: @Alex Bradbury has joined the channel
----
2018-06-12 18:26:36 UTC - Alex Bradbury: Hi @durga, how did you get on with 
your prototype? Were you able to send large binary messages? I'm considering 
something similar. Thanks!
----
2018-06-12 19:28:59 UTC - Sijie Guo: @William Fry I think it should be fairly 
simple. I’ve checked pyspark and there are already multiple input sources 
there, for example kafka.py 
<https://github.com/apache/spark/blob/master/python/pyspark/streaming/kafka.py>

I think one interesting question is where to host the pulsar python source 
code: in pulsar, in spark, or in some 3rd party repo. My feeling is it might 
be better to contribute the python one back to spark, since it might be easier 
to manage pyspark dependencies. Although I am not a python expert, @Matteo 
Merli or @Sanjeev Kulkarni might have a better thought on this
----
2018-06-12 19:33:19 UTC - Sanjeev Kulkarni: Where are spark connectors usually 
based?
----
2018-06-12 20:07:55 UTC - Ali Ahmed: @William Fry @Matteo Merli @Sanjeev 
Kulkarni I don’t think it’s simple. My understanding is that pyspark connectors 
are wrappers over java code, so you need to write both at the same time
----
2018-06-12 20:09:20 UTC - William Fry: Interesting, I believe there’s already a 
Spark connector for Pulsar written in Java
----
2018-06-12 20:09:29 UTC - William Fry: just nothing for PySpark
----
2018-06-12 20:10:25 UTC - Ali Ahmed: basically the java code is called via py4j
----
2018-06-12 20:12:01 UTC - Matteo Merli: I don't think it has anything to do 
with java, spark itself already is bridging python from Java. There's no need 
to go back to Java, if you can use a Py library like pulsar client
----
2018-06-12 20:15:08 UTC - Ali Ahmed: I remember you had to implicitly call 
java methods from python like so
<https://github.com/radanalyticsio/streaming-amqp/blob/master/python/amqp.py#L30>
----
2018-06-13 06:40:59 UTC - Idan: @Idan uploaded a file: 
<https://apache-pulsar.slack.com/files/UALJD8929/FB61UDBPB/-.java|Untitled>
----
2018-06-13 06:41:23 UTC - Idan: also would be great to know how to stop the 
standalone gracefully. Seems like when I Ctrl+C the process (mac) it always has 
issues coming back again
----
2018-06-13 06:43:17 UTC - Sijie Guo: @Idan which version of this? and can you 
describe your command sequence?
----
2018-06-13 06:44:13 UTC - Idan: apache-pulsar-2.0.0-rc1-incubating
----
2018-06-13 06:44:23 UTC - Idan: doing via bin/pulsar standalone
----
2018-06-13 06:44:34 UTC - Idan: then it comes up with the end of the log I just 
sent ya
----
2018-06-13 06:44:59 UTC - Idan: to shut down, usually I am just doing Ctrl+C: ^Z
[1]+  Stopped                 ./pulsar standalone
ip-10-8-0-10:bin idanfridman$
----
2018-06-13 06:45:47 UTC - Sijie Guo: let me try to reproduce
----
2018-06-13 06:47:06 UTC - Idan: pretty naive sequence
----
2018-06-13 06:48:43 UTC - Sijie Guo: @Idan I tried downloading the tarball and 
running the sequence. I didn’t see the problem though.

is it a newly downloaded tarball or have you run with older versions before?
----
2018-06-13 06:49:00 UTC - Idan: actually it’s the first one I used
----
2018-06-13 06:49:23 UTC - Idan: perhaps via logs we can nail what’s wrong?
----
2018-06-13 06:49:38 UTC - Idan: something with: 09:48:42.396 [main] INFO  
org.apache.bookkeeper.proto.BookieNettyServer - Shutting down BookieNettyServer
----
2018-06-13 06:50:10 UTC - Idan: I shut down everything. perhaps you can 
catch an unreleased used port here:
----
2018-06-13 06:50:19 UTC - Idan: @Idan uploaded a file: 
<https://apache-pulsar.slack.com/files/UALJD8929/FB6HZFYLA/-.java|Untitled>
----
2018-06-13 06:53:46 UTC - Sijie Guo: @Idan: the ports look good. the logging 
basically says it can’t find /ledgers/cookies when starting up. it is weird 
that this would happen on a standalone instance, unless the data directory is 
on a tmpfs directory.

can you do following:

1) in the pulsar directory: move the data aside as a backup: `mv data data_back`
2) run `bin/pulsar standalone` again to see if standalone can come up
----
2018-06-13 06:55:54 UTC - Idan: OK, I've left my computer for an hour. I'll do 
that and show results
----
2018-06-13 06:59:10 UTC - Sijie Guo: sure. ping me when you have results
----
