2020-07-06 11:58:05 UTC - yuanbo: Any idea on how to improve KOP performance in
a large-scale cluster?
----
2020-07-06 11:59:42 UTC - David Lanouette: Could you be a bit more specific?
What size cluster are you talking about? What performance issues are you
seeing? etc.
----
2020-07-06 12:00:15 UTC - David Lanouette: Also, you might get a better
response over in <#C0106RFKAHM|kop>
----
2020-07-06 12:07:20 UTC - yuanbo: Thanks for your response, I will add more
information later.
----
2020-07-06 12:24:25 UTC - yuanbo: Seems my colleague has provided details in
<#C0106RFKAHM|kop>. There is a lot of traffic in our cluster, and KOP doesn't
fit our use case very well. We will try to figure out more ideas about how to
improve it and hopefully feed them back to the community.
----
2020-07-06 12:27:02 UTC - David Lanouette: Who is your colleague? I don't see
anything over there that looks similar to your comment/question.
----
2020-07-06 12:30:40 UTC - yuanbo:
<https://apache-pulsar.slack.com/archives/C0106RFKAHM/p1585829956017300>
----
2020-07-06 12:32:08 UTC - yuanbo: about 3 months ago
----
2020-07-06 12:32:32 UTC - David Lanouette: OK. I didn't look back that far.
----
2020-07-06 12:38:18 UTC - David Lanouette: I also see that you didn't really
get an answer to your problem.
----
2020-07-06 12:40:39 UTC - David Lanouette: Do you know if your colleague tried
increasing the number of threads, as @Sijie Guo suggested?
----
2020-07-06 12:47:45 UTC - yuanbo: Yeah, it's a bit complicated. When Kafka
messages are written into Pulsar by KOP, the broker needs to unpack those
messages and put them in order. It turns out it's not a CPU-intensive case; it
spends a lot of time on the ordering.
----
2020-07-06 12:52:39 UTC - David Lanouette: Have you looked to see if there's an
open issue in the [kop](<https://github.com/streamnative/kop>) project? That
may be the best place for it to get addressed.
----
2020-07-06 12:53:35 UTC - David Lanouette: Note: if you do create a new issue,
it will be *very* helpful to have precise steps to reproduce the issue.
----
2020-07-06 12:54:31 UTC - yuanbo: Sure, I will try it, thanks a lot
----
2020-07-06 12:56:39 UTC - David Lanouette: I took a quick look, and I don't see
an existing issue concerning performance.
----
2020-07-06 12:56:50 UTC - David Lanouette: You'll need to create a new issue.
----
2020-07-06 12:57:10 UTC - David Lanouette: Good luck. Hopefully the team can
look at it and find a simple fix for you.
----
2020-07-06 13:00:34 UTC - yuanbo: They have opened a discussion issue there
:sweat_smile:.
<https://github.com/streamnative/kop/issues/134>
----
2020-07-06 13:07:05 UTC - David Lanouette: Interesting. Looks like it has been
fixed. Have you tried to re-test after that fix was put in?
----
2020-07-06 13:14:08 UTC - yuanbo: It wasn't fixed, just closed.
----
2020-07-06 13:15:14 UTC - David Lanouette: It looks like [PR
#150](<https://github.com/streamnative/kop/pull/150>) was added to address it.
----
2020-07-06 13:15:47 UTC - David Lanouette: But, maybe that didn't fix it.
----
2020-07-06 13:15:51 UTC - yuanbo: Oh, sorry my mistake.
----
2020-07-06 13:16:23 UTC - David Lanouette: If that didn't fix the issue, I'd
suggest requesting that it be re-opened.
----
2020-07-06 13:16:47 UTC - David Lanouette: but, hopefully that fixed the issue
for you.
----
2020-07-06 13:20:04 UTC - yuanbo: This PR comes from another colleague of mine.
I'm new to this project. Sadly, this PR doesn't bring much performance
improvement.
----
2020-07-06 13:50:46 UTC - Vaibhav Aiyar: @Vaibhav Aiyar has joined the channel
----
2020-07-06 14:07:41 UTC - Raphael Enns: Hi, how many active web socket
connections can we have to the web socket API? Is there any limit?
----
2020-07-06 14:38:23 UTC - Aaron: Does the acknowledgement at batch index level
work with cumulative acknowledgements from the consumer?
----
2020-07-06 14:56:30 UTC - Kirill Kosenko: Hello
I'm playing with Pulsar and Flink.
In the video "Query Pulsar Streams using Apache Flink - Sijie Guo" @Sijie Guo
told that it would be possible to have end-to-end "exactly-once" between
multiple Flink jobs when Pulsar Transactions API is supported.
So my understanding is that it will be possible to consume events using
FlinkPulsarSource, process them with multiple jobs, merge the results, and
produce the result to FlinkPulsarSink exactly once?
Please confirm
Hope it was clear
Thank you
----
2020-07-06 15:02:04 UTC - vali: ty!
----
2020-07-06 16:23:01 UTC - Viktor: @Viktor has joined the channel
----
2020-07-06 17:08:56 UTC - Matteo Merli: Yes, we plan to address this by hiding
the partitions concept in the future
----
2020-07-06 17:10:35 UTC - Matteo Merli: There's no predefined hard-limit (just
check file-descriptor limit for the process)
----
2020-07-06 17:11:30 UTC - Raphael Enns: Thanks
----
2020-07-06 17:23:30 UTC - Viktor: Hello. I am trying to benchmark Pulsar using
the open messaging benchmark framework, since it was mentioned in a few blogs.
I am unable to push pulsar beyond 100-150MB/sec
```17:16:49.708 [main] INFO - Pub rate 101306.6 msg/s / 98.9 Mb/s | Cons rate
96507.6 msg/s / 94.2 Mb/s | Backlog: 48.6 K | Pub Latency (ms) avg: 573.0 -
50%: 30.9 - 99%: 10043.6 - 99.9%: 10083.8 - Max: 10106.4
17:16:59.818 [main] INFO - Pub rate 101131.5 msg/s / 98.8 Mb/s | Cons rate
87866.5 msg/s / 85.8 Mb/s | Backlog: 182.6 K | Pub Latency (ms) avg: 562.7 -
50%: 31.2 - 99%: 10039.9 - 99.9%: 10070.1 - Max: 10145.3
17:17:09.924 [main] INFO - Pub rate 152701.6 msg/s / 149.1 Mb/s | Cons rate
84343.7 msg/s / 82.4 Mb/s | Backlog: 873.7 K | Pub Latency (ms) avg: 423.6 -
50%: 28.9 - 99%: 10013.2 - 99.9%: 10279.0 - Max: 10323.5
17:17:20.023 [main] INFO - Pub rate 55590.7 msg/s / 54.3 Mb/s | Cons rate
110058.3 msg/s / 107.5 Mb/s | Backlog: 323.5 K | Pub Latency (ms) avg: 984.2 -
50%: 35.2 - 99%: 10032.8 - 99.9%: 10060.3 - Max: 10143.7
17:17:20.171 [main] INFO - ----- Aggregated Pub Latency (ms) avg: 485.8 - 50%:
31.5 - 95%: 902.8 - 99%: 10033.4 - 99.9%: 10109.2 - 99.99%: 10280.8 - Max:
10323.5```
It seems like there's some queuing happening on the producer side. Servers are
very lightly loaded on disk, CPU, and network, so I am sure this is client side.
See the very high tail latency.
I already tried setting `maxPendingMessagesAcrossPartitions()` and
`maxPendingMessages()` to 100K, and gave it more producers. It's always capped
there. Any suggestions?
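For reference, the producer-side knobs mentioned above look roughly like this in the Java client; the service URL, topic name, and values below are illustrative placeholders, not the actual benchmark settings:

```java
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;

public class BenchProducerConfig {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")   // placeholder service URL
                .ioThreads(8)                            // more IO threads can help at high publish rates
                .build();

        Producer<byte[]> producer = client.newProducer()
                .topic("persistent://public/default/bench")  // placeholder topic
                .maxPendingMessages(100_000)
                .maxPendingMessagesAcrossPartitions(100_000)
                .blockIfQueueFull(true)   // apply back-pressure instead of failing when the queue fills
                .create();

        producer.close();
        client.close();
    }
}
```

With `blockIfQueueFull(true)`, a producer that outruns the broker blocks rather than erroring, which makes queuing visible as publish latency, consistent with the tail latencies in the log above.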
----
2020-07-06 17:34:01 UTC - Kiran Chitturi: I enabled auth in Pulsar and I'm
seeing this stacktrace when I hit the admin API with a token
```17:33:15.781 [pulsar-io-22-3] WARN
org.apache.pulsar.broker.service.ServerCnx - [/10.8.85.9:52796] Got exception
TooLongFrameException : Adjusted frame length exceeds 5253120: 1195725860 -
discarded
io.netty.handler.codec.TooLongFrameException: Adjusted frame length exceeds
5253120: 1195725860 - discarded
at
io.netty.handler.codec.LengthFieldBasedFrameDecoder.fail(LengthFieldBasedFrameDecoder.java:503)
~[io.netty-netty-codec-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.handler.codec.LengthFieldBasedFrameDecoder.failIfNecessary(LengthFieldBasedFrameDecoder.java:489)
~[io.netty-netty-codec-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.handler.codec.LengthFieldBasedFrameDecoder.exceededFrameLength(LengthFieldBasedFrameDecoder.java:376)
~[io.netty-netty-codec-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.handler.codec.LengthFieldBasedFrameDecoder.decode(LengthFieldBasedFrameDecoder.java:419)
~[io.netty-netty-codec-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.handler.codec.LengthFieldBasedFrameDecoder.decode(LengthFieldBasedFrameDecoder.java:332)
~[io.netty-netty-codec-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:498)
~[io.netty-netty-codec-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:437)
~[io.netty-netty-codec-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
~[io.netty-netty-codec-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
[io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
[io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
[io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
[io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
[io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
[io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]
at
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
[io.netty-netty-transport-4.1.48.Final.jar:4.1.48.Final]```
----
2020-07-06 17:42:30 UTC - Kiran Chitturi: sasl/ssl is not enabled
----
2020-07-06 17:52:42 UTC - Sijie Guo: Correct. Currently the source connector is
exactly-once. However, the sink connector is not. It will become exactly-once
after Pulsar transactions are supported.
----
2020-07-06 17:53:13 UTC - Sijie Guo: Yes. It should work by design.
----
2020-07-06 17:55:02 UTC - Sijie Guo: @yuanbo - I think @jia zhai has been
working with @aloyszhang on this. He might have more context to bring here.
----
2020-07-06 17:59:14 UTC - David Lanouette: Have you tried running `pulsar-perf
produce`? Is it giving you similar results?
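A rough invocation for comparison, assuming a standard Pulsar install; the topic, rate, and size below are placeholders that should be matched to the OMB workload:

```shell
# Drive ~150k msg/s of 1 KB messages at a test topic and watch the reported rates/latencies
bin/pulsar-perf produce \
  --service-url pulsar://localhost:6650 \
  --rate 150000 \
  --size 1024 \
  --num-producers 4 \
  persistent://public/default/perf-test
```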
----
2020-07-06 18:00:36 UTC - David Lanouette: Also, are you sure the network
between the broker and the tool can handle more than 100Mb/s?
----
2020-07-06 18:04:06 UTC - Viktor: Yes, I'm sure about the network; it has
enough room.
I did not run `pulsar-perf`, but any idea why that would behave differently? Or
is that the supported benchmark tool? OMB is nice in that it lets us compare
different things, so I'd like to stick with it if possible.
----
2020-07-06 18:05:04 UTC - David Lanouette: That is definitely a supported
benchmark tool. It comes "out of the box". But it only tests Pulsar, not
other types of brokers.
----
2020-07-06 18:05:14 UTC - David Lanouette: But, it would give you another data
point.
----
2020-07-06 18:06:03 UTC - David Lanouette: If it's fast, but the OMB tool is
slow, then you might have one issue. If pulsar-perf is also slow, then it
might be your setup.
----
2020-07-06 18:06:33 UTC - David Lanouette: You've also hit the limit of my
knowledge of configuring and testing pulsar :slightly_smiling_face:
----
2020-07-06 18:17:34 UTC - Joshua Decosta: Did you ever manage to get this
working? I’ve been trying to verify my broker tlsKeystore setup via OpenSSL and
I’m having a similar issue
----
2020-07-06 18:23:23 UTC - Viktor: Thanks. let me play with pulsar-perf and get
back
----
2020-07-06 18:23:54 UTC - David Lanouette: :thumbsup:
----
2020-07-06 19:13:27 UTC - Devin G. Bost: It looks like the debian package for
Pulsar 2.5.0 is no longer available on the Apache mirror. Does anyone know
where I can find it?
----
2020-07-06 19:14:02 UTC - Devin G. Bost: I’m working on a Dockerfile for
building Pulsar’s Go client with 2.5.2.
----
2020-07-06 19:14:05 UTC - Addison Higham: archives
----
2020-07-06 19:14:26 UTC - Devin G. Bost: Thanks. I’ll check there.
----
2020-07-06 19:14:31 UTC - Addison Higham:
<https://archive.apache.org/dist/pulsar/pulsar-2.5.0/apache-pulsar-2.5.0-bin.tar.gz>
<- direct link
----
2020-07-06 19:14:45 UTC - Devin G. Bost: That works! Thanks!
----
2020-07-06 19:15:03 UTC - Addison Higham: only the most recent minor and patch
versions are on the mirrors; anything older goes to the archives
+1 : Devin G. Bost
----
2020-07-06 19:27:03 UTC - Devin G. Bost: I'm having trouble getting the Go
client to work in my Dockerfile.
It blows up at this line:
```Step 17/24 : RUN go get -u
<http://github.com/apache/pulsar/pulsar-client-go/pulsar|github.com/apache/pulsar/pulsar-client-go/pulsar>
---> Running in 81c32d040c9d
#
<http://github.com/apache/pulsar/pulsar-client-go/pulsar|github.com/apache/pulsar/pulsar-client-go/pulsar>
/usr/lib/gcc/x86_64-linux-gnu/5/../../../../lib/libpulsar.so: undefined
reference to `std::thread::_State::~_State()@GLIBCXX_3.4.22'
/usr/lib/gcc/x86_64-linux-gnu/5/../../../../lib/libpulsar.so: undefined
reference to `typeinfo for std::thread::_State@GLIBCXX_3.4.22'
/usr/lib/gcc/x86_64-linux-gnu/5/../../../../lib/libpulsar.so: undefined
reference to
`std::thread::_M_start_thread(std::unique_ptr<std::thread::_State,
std::default_delete<std::thread::_State> >, void (*)())@GLIBCXX_3.4.22'
. . .
[many more lines]```
Does anyone recognize this exception?
----
2020-07-06 19:34:02 UTC - Devin G. Bost: It looks like there’s a version
mismatch. Maybe I need to build from source.
----
2020-07-06 21:01:20 UTC - Devin G. Bost: How do I determine what versions are
in use?
----
2020-07-06 21:07:34 UTC - Devin G. Bost: e.g.
----
2020-07-06 21:09:12 UTC - Addison Higham: That is the old C++-based golang
client; the newer client, <https://github.com/apache/pulsar-client-go>, is pure
golang, no C++ bindings involved
----
2020-07-06 21:14:34 UTC - Jon Featherstone: :wave: In the docs, it says
> When subscribing to multiple topics, the Pulsar client will automatically
make a call to the Pulsar API to discover the topics that match the regex
pattern/list and then subscribe to all of them. If any of the topics don't
currently exist, the consumer will auto-subscribe to them once the topics are
created.
How quickly does the consumer see the new topics? Is this dependent on each
client implementation?
----
2020-07-06 21:16:34 UTC - Addison Higham: By default, 1 minute (see
<https://pulsar.apache.org/docs/en/client-libraries-java/>
`patternAutoDiscoveryPeriod`). It is client-dependent, but most follow the Java
client as a guide
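In the Java client that looks roughly like this; the topic pattern and subscription name below are placeholders, and note that the period is specified in minutes in this client version:

```java
import java.util.regex.Pattern;

import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.PulsarClient;

public class PatternConsumerExample {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")   // placeholder service URL
                .build();

        Consumer<byte[]> consumer = client.newConsumer()
                // subscribe to every topic matching the regex (placeholder pattern)
                .topicsPattern(Pattern.compile("persistent://public/default/events-.*"))
                .patternAutoDiscoveryPeriod(1)   // re-scan for new matching topics every minute (the default)
                .subscriptionName("my-sub")
                .subscribe();

        consumer.close();
        client.close();
    }
}
```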
----
2020-07-06 21:17:06 UTC - Addison Higham: golang property:
<https://github.com/apache/pulsar-client-go/blob/master/pulsar/consumer.go#L87>
----
2020-07-06 21:17:53 UTC - Addison Higham: and the default is set here:
<https://github.com/apache/pulsar-client-go/blob/6edc8f4ef954477eb36eed92e88464f56b4c48ee/pulsar/consumer_regex.go#L37>
----
2020-07-06 21:18:18 UTC - Devin G. Bost: Thanks. When I try using that one, I
get a different error at the go get line.
```In file included from
go/src/github.com/apache/pulsar/pulsar-client-go/pulsar/c_client.go:24:0:
./c_go_pulsar.h:22:29: fatal error: pulsar/c/client.h: No such file or directory
compilation terminated.```
I put my Dockerfile in the latest comment here:
<https://github.com/apache/pulsar/issues/7463>
----
2020-07-06 21:22:50 UTC - Jon Featherstone: Awesome. Thanks addison!
----
2020-07-06 21:23:24 UTC - Addison Higham: Oh, so you were previously using the
old client? Then you will need to change your code (the API is very similar,
though...). In your follow-on comment, though, you aren't referencing
the right golang client. The new client should be `go get
<http://github.com/apache/pulsar-client-go/pulsar|github.com/apache/pulsar-client-go/pulsar>`
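With the new module path, a minimal compile check might look like this (the service URL is a placeholder):

```go
package main

import (
	"log"

	"github.com/apache/pulsar-client-go/pulsar"
)

func main() {
	// Note the import path: apache/pulsar-client-go, not apache/pulsar/pulsar-client-go
	client, err := pulsar.NewClient(pulsar.ClientOptions{
		URL: "pulsar://localhost:6650", // placeholder service URL
	})
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()
}
```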
----
2020-07-06 21:23:32 UTC - Addison Higham: :bow:
----
2020-07-06 21:24:50 UTC - Devin G. Bost: Ah, thanks! That’s a very subtle path
difference
----
2020-07-06 21:31:36 UTC - Devin G. Bost: That worked! You’re a life saver.
----
2020-07-06 21:38:09 UTC - Joshua Decosta: If the command “pulsar-admin brokers
list use” returns “localhost:8080” would that be a sign that tls transport
encryption is misconfigured? I’m trying to verify that I’ve configured it
correctly
----
2020-07-06 21:53:55 UTC - Addison Higham: That wouldn't indicate a problem; by
default, a cluster registers itself in the list of clusters, but that address
is only used in the case of geo-replication.
+1 : Joshua Decosta
----
2020-07-06 23:48:40 UTC - Viktor: @David Lanouette qq: I don't see a way to
control the number of partitions for each topic using `pulsar-perf`. Am I
missing something?
----
2020-07-07 02:46:46 UTC - Hiroyuki Yamada: @Penghui Li Sorry, I reported wrong
results. Both auto-split and consistent hashing can produce inconsistently
ordered messages.
I updated the issue. <https://github.com/apache/pulsar/issues/7455>
----
2020-07-07 02:55:08 UTC - Penghui Li: Are there any messages replayed in your
testcase? The current implementation allows replayed messages to be dispatched
to a new consumer.
----
2020-07-07 02:59:16 UTC - Hiroyuki Yamada: @Penghui Li Honestly, I'm not sure
when the replay happens. Will adding consumers make it happen?
As you can see, you can reproduce it with a super simple consumer.
I start consumers after the producer, so the number of consumers changes
during consumption, which is very normal in Pulsar, I assume.
<https://github.com/feeblefakie/misc/blob/master/pulsar/src/main/java/MyConsumer.java>
----
2020-07-07 03:02:53 UTC - Penghui Li: Ok
----
2020-07-07 06:38:27 UTC - Hiroyuki Yamada: I started looking at the code to see
if I can do anything for that, and checking the behavior by adding debug prints
in the code.
Is there any better way than doing the below every time the code is updated?
``` mvn install -DskipTests```
It takes too long, so I'm wondering what others are doing.
(Sorry, I'm new to Maven)
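One thing that may help, assuming the change is confined to a single module (`pulsar-broker` here is just an example module name), is rebuilding only that module with Maven's reactor options:

```shell
# Rebuild just one module instead of the whole tree
mvn install -pl pulsar-broker -DskipTests

# If the change also touches modules that pulsar-broker depends on, add -am ("also make")
mvn install -pl pulsar-broker -am -DskipTests

# Skipping style/static-analysis plugins can shave more time off iterative builds
mvn install -pl pulsar-broker -DskipTests -Dcheckstyle.skip -Dspotbugs.skip=true
```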
----
2020-07-07 07:46:19 UTC - Kirill Kosenko: Thank you @Sijie Guo
One more question
There're two connectors:
1. <https://github.com/streamnative/pulsar-flink>
2.
<https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector>
Which one should we use?
Is there any difference in functionality?
As I understand it, we should use the first one (the StreamNative connector)
until the Flink community merges these pull requests:
<https://github.com/apache/flink/pull/10875>
<https://github.com/apache/flink/pull/10455>
Please advise
Thank you
----
2020-07-07 08:48:09 UTC - Rahul Vashishth: It seems the proxy is required for
broker lookup. As mentioned in the
<https://www.youtube.com/watch?v=wGgEx1M17O0&list=PLqRma1oIkcWhWAhKgImEeRiQi5vMlqTc-&index=13|TGI
Pulsar 002: Proxy and Kubernetes Deployment>, the client still does the broker
lookup via the proxy.
@Sijie Guo Is it important to publish both the Pulsar binary port 6650 and the
HTTP port 80 on the same DNS name? As we only mention the broker URL in the
Pulsar client, does the client then do a lookup on the HTTP URL on the same DNS
name?
Can we create two different DNS names for the binary port and the admin API
HTTP port?
----