Slack digest for #general - 2020-02-04

Apache Pulsar Slack Tue, 04 Feb 2020 01:12:01 -0800

2020-02-03 09:53:17 UTC - Manju Priya A R: @Sijie Guo: Is there any plan to add 
it? If so when will it be added?
----
2020-02-03 11:31:49 UTC - Alexandre DUVAL: I'm writing a logstash-output-pulsar 
plugin, but I've hit the error `error while create opSendMsg by batch message 
container`. Since the Pulsar producer does the batching itself, I'm not sure I 
understand the error. Is it an issue I should file on Pulsar's GitHub?
----
2020-02-03 11:32:10 UTC - Alexandre DUVAL: 
(<https://github.com/CleverCloud/logstash-output-pulsar>)
----
2020-02-03 11:48:59 UTC - rmb: Hi all, I have a message-ordering question.  
Suppose I have two producers writing to the same topic, and I'm using message 
keys (so <https://pulsar.apache.org/docs/en/concepts-messaging/> says that the 
broker will put messages in per-key-partition order).  How does the broker order 
messages coming from the different producers?  Is it just the order it receives 
them in?
----
2020-02-03 13:34:08 UTC - Antti Kaikkonen: Is my assumption correct that pulsar 
could require significantly more network bandwidth compared to kafka because 
kafka stores data in the brokers and pulsar doesn't?
----
2020-02-03 13:44:20 UTC - Antti Kaikkonen: > Is it just the order it 
receives them in?
Yes I think so. If you want to guarantee that messages with the same key retain 
ordering then you should make sure that the producers don't produce messages 
with same keys. I'm not 100% sure but this is how I think it works.
----
2020-02-03 13:45:53 UTC - Antti Kaikkonen: For example if you use customer-id 
as the key then the two producers should only handle different customers.
+1 : Konstantinos Papalias
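
A minimal sketch of the suggestion above, in plain Python with no Pulsar client
involved. It assumes each producer is identified by an index and that keys are
routed with a stable hash, so every key is owned by exactly one producer and
messages for that key reach the broker in the order that producer sent them.
The names `owner_for_key` and `assign` are hypothetical helpers, not part of
any Pulsar API.

```python
import zlib


def owner_for_key(key: str, num_producers: int) -> int:
    """Map a message key to exactly one producer index (hypothetical helper).

    zlib.crc32 is used because it is stable across processes, unlike the
    built-in hash() for strings.
    """
    return zlib.crc32(key.encode("utf-8")) % num_producers


def assign(events, num_producers):
    """Group (key, payload) events by owning producer, preserving input order."""
    queues = [[] for _ in range(num_producers)]
    for key, payload in events:
        queues[owner_for_key(key, num_producers)].append((key, payload))
    return queues


# Example: customer-id as the key, two producers.
events = [
    ("customer-1", "a"),
    ("customer-2", "b"),
    ("customer-1", "c"),
    ("customer-3", "d"),
]
queues = assign(events, 2)
```

Each entry in `queues` is the list of messages one producer would send. Because
a given customer only ever appears in one list, the arrival order at the broker
for that customer is decided by a single producer.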
----
2020-02-03 14:21:07 UTC - Alex Yaroslavsky: I managed to set up pulsar 2.5.0 
(pulsar:latest) on Kubernetes (Amazon EKS) using modified yamls from the source 
code as they just don't work as is 
(<https://github.com/trexinc/pulsar/commit/4e3d5a3f6d8a7acd9354a9b1bf0a036c61bbc1f4>)
I got to a point when all elements are running without errors. But now I am 
having issues with publishing and consuming.

If I use the simplest example python code to produce to a topic I always get
a timeout message. In broker logs the only WARN prints are:
13:20:19.158 [bookkeeper-ml-workers-OrderedExecutor-3-0] WARN
org.apache.bookkeeper.client.BookieWatcherImpl - New ensemble:
[bookkeeper-0.bookkeeper.default.svc.cluster.local:3181,
bookkeeper-1.bookkeeper.default.svc.cluster.local:3181] is not adhering to
Placement Policy. quarantinedBookies: []
13:20:19.170 [bookkeeper-ml-workers-OrderedExecutor-3-0] WARN
org.apache.pulsar.broker.service.AbstractTopic -
[<persistent://public/default/test1>] Error getting policies
java.util.concurrent.CompletableFuture cannot be cast to
org.apache.pulsar.common.policies.data.Policies and publish throttling will be
disabled
13:21:18.979 [pulsar-io-22-1] WARN
org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0x049d1ece,
L:/10.0.3.64:6650 - R:/10.0.3.12:35454]] Forcing connection to close after
keep-alive timeout

If I use pulsar-perf produce <persistent://public/default/test1> --rate 100 it
seems to produce messages without errors.
But then if I try to consume with python, I get a timeout, and if I try to
consume with pulsar-perf it doesn't receive any messages (but no errors).

Any ideas on how to progress?

P.S. pulsar-perf is run from within the k8s, while python is run on a machine
connecting to the broker-proxy over internet.
----
2020-02-03 14:34:14 UTC - Alex Yaroslavsky: Yeah, I figured that
----
2020-02-03 14:34:30 UTC - Roman Popenov: Do you see any errors inside the proxy
logs?
----
2020-02-03 14:36:56 UTC - Roman Popenov: If you see errors inside the proxies,
this could be related to
<https://github.com/apache/pulsar/issues/5994>
----
2020-02-03 14:51:33 UTC - Alex Yaroslavsky: No errors or exceptions in the
proxies, only
13:50:52.183 [pulsar-proxy-io-2-1] WARN
org.apache.pulsar.common.protocol.PulsarHandler - [[id: 0xa2e31840,
L:/10.0.2.19:6650 - R:/10.0.2.105:27500]] Forcing connection to close after
keep-alive timeout

And the failure is only with producing, connection itself is established
without issue as can be seen from this client log:
2020-02-03 09:49:31.491 INFO ConnectionPool:85 | Created connection for
<pulsar://xxx:6650>
2020-02-03 09:49:31.817 INFO ClientConnection:330 | [10.43.9.73:32862 ->
xxx:6650] Connected to broker
2020-02-03 09:49:32.182 INFO HandlerBase:53 |
[<persistent://public/default/test1>, ] Getting connection from pool
2020-02-03 09:49:32.361 INFO ConnectionPool:85 | Created connection for
<pulsar://10.0.3.64:6650>
2020-02-03 09:49:32.535 INFO ClientConnection:332 | [10.43.9.73:60532 ->
yyy:6650] Connected to broker through proxy. Logical broker:
<pulsar://10.0.3.64:6650>
2020-02-03 09:49:32.902 INFO ProducerImpl:151 |
[<persistent://public/default/test1>, ] Created producer on broker
[10.43.9.73:60532 -> yyy:6650]
----
2020-02-03 14:56:08 UTC - Roman Popenov: And what is your producer code?
----
2020-02-03 15:51:34 UTC - Alex Yaroslavsky: Troubleshooted some more and it
seems less like a pulsar issue now, but I'm still a bit puzzled.
All works fine when I connect from a machine within the same VPC.
But it doesn't work over the internet, quite strange as it is not a security
group issue as the connection is established successfully. Strange...
----
2020-02-03 15:52:55 UTC - Alex Yaroslavsky: Maybe a firewall is messing with
the stream, I'll try to check that later...
Thanks for your help though!
----
2020-02-03 15:53:06 UTC - Roman Popenov: Strange, I would assume it is a
security group issue where you need to allow traffic on the 6650 port
----
2020-02-03 15:54:57 UTC - Roman Popenov: Did you enable an inbound rule to
allow traffic to port 6650?
----
2020-02-03 16:38:27 UTC - Alexandre DUVAL: @Sijie Guo wdyt? ^
----
2020-02-03 17:17:38 UTC - Konstantinos Papalias: @roman is there a need for the
modified .yaml ? we are launching into a similar adventure now and would be
keen to follow the official path. if they are needed maybe @Alex Yaroslavsky
can contribute them to the project :slightly_smiling_face:
----
2020-02-03 17:35:45 UTC - Roman Popenov: I know that `BOOKIE_MEM` does need to
be modified for some configs to take effect for the bookkeeper jvm options. I
will have a PR for that some time later today.
----
2020-02-03 17:41:07 UTC - Sijie Guo: interesting. Can you create a github issue
for this problem?
----
2020-02-03 17:42:36 UTC - Sijie Guo: @rmb the broker doesn’t guarantee any
ordering across producers. The order is just the order in which it receives the
messages.
----
2020-02-03 17:44:13 UTC - Sijie Guo: No. If you have same replication settings,
both systems use same network bandwidth. See a blog post I wrote before:
<https://streaml.io/blog/apache-pulsar-architecture-designing-for-streaming-performance-and-scalability>
----
2020-02-03 17:45:15 UTC - Alexandre DUVAL:
<https://github.com/apache/pulsar/issues/6195>
----
2020-02-03 17:47:43 UTC - Sijie Guo: thanks
----
2020-02-03 18:00:55 UTC - natalie: @natalie has joined the channel
tada : Roman Popenov
----
2020-02-03 18:03:49 UTC - rmb: Thanks very much!
----
2020-02-03 18:07:22 UTC - Roman Popenov: Yeah, it looks like there was a lot of
just copy-pasta from the helm charts deployments files, so the changes are
necessary
----
2020-02-03 18:39:05 UTC - Marc d'Entremont: @Marc d'Entremont has joined the
channel
----
2020-02-03 18:56:20 UTC - Santiago Del Campo: Hello! we're currently running
apache pulsar over a K8s cluster. Everything up until now runs fine, except
when we want to design a high availability strategy:

Basically, we need the cluster to survive one of its three nodes going down.
We understand that we need a minimum of 3 Zks to keep quorum for leader
election, but we're still seeing errors about ledgers not being replicated and
brokers throwing 500s when one of the nodes goes down (just to be clear, after
one of the three Zks goes down, the remaining ones are able to find a new
leader and still have quorum).

I know my explanation is not exhaustive enough, but that's because we're kinda
lost at this point, don't know where to start.

So the question is the following --> what's the configuration needed (at
broker & bookkeeper level) to have a cluster able to resist 1 node failing
of 3 and still maintain stability in the process? (does not matter if there's a
bit of downtime).
----
2020-02-03 19:02:18 UTC - Addison Higham: Hrm... running into an interesting
issue:
We are running 3 brokers and 2 proxy nodes. We have a busy namespace that
occasionally has unloads and re-balances. When we get a re-balance, we see the
following happen:
1. The broker triggers unload, and we see a log message that the producer will
be disconnected
2. The client closes the connection.
3. It attempts to reconnect but gets an error that the producer already exists
on the broker
4. It gets into this loop and never recovers
After a lot of digging into the code, not quite sure what is happening. It
would appear that the client is hanging up and everything is happening as it
should but it just doesn't get pointed at the new broker. Is it possible that
the proxy is interfering here? I am not totally sure how the proxy works, but I
imagine it needs to know about topic placement and perhaps it isn't picking up
the new metadata
----
2020-02-03 19:06:39 UTC - Santiago Del Campo: Some time ago we had this
problem... basically we reduced proxies to only one and magically connection
errors stopped. We thought the same, at some point there seems to be some
interference between proxies

At the same time, if brokers are restarted, we saw a tendency of that same
proxy to also throw errors, we need to restart it and kinda "solve" the issue.
----
2020-02-03 19:11:01 UTC - Addison Higham: oh interesting... we actually aren't
connecting through the proxy we just realized
----
2020-02-03 19:11:56 UTC - Addison Higham: oh NM on ^^ we aren't connecting
through the proxy in this case because it is a pulsar IO source, which we have
connect straight to brokers
----
2020-02-03 19:17:52 UTC - Brian Cruz: @Brian Cruz has joined the channel
----
2020-02-03 19:31:27 UTC - Sijie Guo: what is your ensemble size
`managedLedgerDefaultEnsembleSize` at broker.conf? You need ensemble_size + 1
bookies to tolerate one bookie going down.
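
The rule of thumb above can be written down as a small check. This is only a
sketch in plain Python, not Pulsar or BookKeeper code: `min_bookies` and
`valid_quorums` are hypothetical helpers. The first encodes "ensemble_size + 1
bookies to tolerate one bookie going down", generalized to N failures as an
assumption. The second encodes the usual BookKeeper constraint that ensemble
size >= write quorum >= ack quorum, which corresponds to the
`managedLedgerDefaultEnsembleSize`, `managedLedgerDefaultWriteQuorum` and
`managedLedgerDefaultAckQuorum` settings in broker.conf.

```python
def min_bookies(ensemble_size: int, tolerated_failures: int = 1) -> int:
    """Bookies needed so a full ensemble can still be formed after failures.

    Hypothetical helper: with exactly ensemble_size bookies, losing one leaves
    too few to form a new ensemble, hence the "+ tolerated_failures".
    """
    if ensemble_size < 1 or tolerated_failures < 0:
        raise ValueError("ensemble_size must be >= 1 and failures >= 0")
    return ensemble_size + tolerated_failures


def valid_quorums(ensemble_size: int, write_quorum: int, ack_quorum: int) -> bool:
    """Check ensemble size >= write quorum >= ack quorum >= 1."""
    return ensemble_size >= write_quorum >= ack_quorum >= 1


# Example: an ensemble size of 2 needs 3 bookies to ride out one failure.
needed = min_bookies(ensemble_size=2, tolerated_failures=1)
```

With an ensemble size of 1 the same rule asks for 2 bookies, but a write quorum
of 1 still means each entry lives on a single bookie, so raising the quorum
settings is worth considering as well.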
----
2020-02-03 19:32:15 UTC - Santiago Del Campo: ahhh that must be it... i just
checked in config maps and saw "1" as the value.
----
2020-02-03 19:37:58 UTC - Antti Kaikkonen: Thank you for the answer. I was
thinking about a stream processor running in the same host where the consumed
partition is stored, but now that I read more about it I think that it is not
recommended to run kafka streams on the broker hosts.
----
2020-02-03 19:50:31 UTC - Joshua Dunham: Hey @Sijie Guo, just catching up on
this. Is separate BK an4pHunE8%
----
2020-02-03 19:50:45 UTC - Joshua Dunham: sorry... mashed the keys
----
2020-02-03 19:50:57 UTC - Joshua Dunham: Is separate BK and Pulsar possible?
----
2020-02-03 19:51:16 UTC - Joshua Dunham: OR how do you deploy using containers
having one process per container.
----
2020-02-03 20:13:42 UTC - Addison Higham: Okay, we think we know what is
happening here, should be fixed by
<https://github.com/apache/pulsar/pull/5571>, but yeah, when that offloading is
happening it looks like we get that bundle offloaded back to the same broker!
+1 : Ryan
----
2020-02-03 20:35:18 UTC - Sam Leung: What is the recommendation for using
Key_Shared subscriptions in a production setting? Documentation says it’s still
a beta feature.
----
2020-02-03 20:54:39 UTC - Sijie Guo: Ah it was marked when it was introduced in
2.4.0. there are multiple companies using this feature in production.

@Penghui Li we might need to remove “beta” from the documentation.
+1 : Ryan
----
2020-02-03 20:55:54 UTC - Sijie Guo: ok
----
2020-02-03 21:27:02 UTC - Alex Yaroslavsky: It's something with the office
firewall, the producer/consumer work fine from home.
+1 : Roman Popenov
----
2020-02-03 21:27:17 UTC - Alex Yaroslavsky: @Konstantinos Papalias, already
submitted a PR
thanks : Roman Popenov
----
2020-02-03 22:46:42 UTC - Andrew Tan: Thank you!
----
2020-02-03 23:39:26 UTC - Penghui Li: yes, will remove it
+1 : Ryan
----
2020-02-03 23:46:26 UTC - Kelvin Sajere: @Kelvin Sajere has joined the channel
----
2020-02-03 23:54:04 UTC - Kelvin Sajere: Hi every one. I noticed python client
doesn't support windows. What options are available for using python client on
windows?
Thanks
----
2020-02-04 02:34:50 UTC - Ryan: So this fix should be in 2.5.0?
----
2020-02-04 03:40:33 UTC - Naveen Aechan: @Naveen Aechan has joined the channel
----
