2019-09-19 13:51:19 UTC - Nick Marchessault: I recently added ackTimeOut 
configuration to my consumer and am now seeing this Warn message repeatedly in 
the logs:

WARN 1 --- [ulsar-timer-4-1] o.a.p.client.impl.UnAckedMessageTracker  : 
[ConsumerBase{subscription='fortis', consumerName='b5180', 
topic='TopicsConsumerFakeTopicNamed84a6'}] 2 messages have timed-out

The timeout is more than enough to process the message and ACK before the time 
elapses. If 2 messages have timed out, they should get redelivered and then the 
warn message should disappear, correct?
----
2019-09-19 14:44:33 UTC - David Kjerrumgaard: @Nick Marchessault Yes, the 
Pulsar client will track the unacknowledged messages, and the broker will 
automatically redeliver those messages that weren't acknowledged. However, if 
the consumer is unable to process these messages even after they were 
re-delivered, then the warning will remain. Typically this is caused by an 
uncaught exception being thrown by the consumer when it tries to process the 
message.
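
A minimal sketch of why the warning can keep repeating, using a toy tracker with a logical clock. The class `AckTimeoutModel` and its methods are hypothetical names made up for illustration; this is not the Pulsar client's `UnAckedMessageTracker`, only a model of the behavior described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of an ack-timeout tracker (hypothetical, not Pulsar code). */
final class AckTimeoutModel {
    private final long timeoutMillis;
    // message id -> deadline on a logical clock, in milliseconds
    private final Map<String, Long> unacked = new LinkedHashMap<>();

    AckTimeoutModel(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Called when a message is handed to the application (delivery or redelivery). */
    void delivered(String messageId, long now) {
        unacked.put(messageId, now + timeoutMillis);
    }

    /** Called when the application acknowledges a message. */
    void acked(String messageId) {
        unacked.remove(messageId);
    }

    /** Returns the ids whose deadline has passed and stops tracking them until redelivery. */
    List<String> collectTimedOut(long now) {
        List<String> timedOut = new ArrayList<>();
        for (Map.Entry<String, Long> e : unacked.entrySet()) {
            if (e.getValue() <= now) {
                timedOut.add(e.getKey());
            }
        }
        timedOut.forEach(unacked::remove);
        return timedOut;
    }

    public static void main(String[] args) {
        AckTimeoutModel tracker = new AckTimeoutModel(1_000);
        long now = 0;
        tracker.delivered("m1", now);
        tracker.delivered("m2", now);
        tracker.acked("m1"); // the handler for m1 succeeded
        // The handler for m2 is assumed to throw before acking, on every delivery.
        for (int round = 1; round <= 3; round++) {
            now += 1_000;
            List<String> timedOut = tracker.collectTimedOut(now);
            System.out.println(timedOut.size() + " messages have timed-out: " + timedOut);
            for (String id : timedOut) {
                tracker.delivered(id, now); // redelivered, and the handler fails again
            }
        }
    }
}
```

In this model the message that is never acknowledged shows up in every timeout round, which matches the repeated warning. Wrapping the real message handler in a try/catch that always acknowledges or explicitly rejects the message would be one way to break the loop.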
----
2019-09-19 17:23:55 UTC - Sujith Ramanathan: @Sujith Ramanathan has joined the 
channel
----
2019-09-19 17:29:23 UTC - Sujith Ramanathan: Hi,
I ran ./bin/pulsar standalone
and I am getting the exception below.

22:31:31.000 [pulsar-client-io-76-1] WARN  
org.apache.pulsar.client.impl.ConnectionPool - Failed to open connection to 
38f9d3e4dd35:6650 : 
io.netty.resolver.dns.DnsResolveContext$SearchDomainUnknownHostException: 
Search domain query failed. Original hostname: '38f9d3e4dd35' failed to resolve 
'38f9d3e4dd35.mycompany.com' after 2 queries
----
2019-09-19 17:39:32 UTC - David Kjerrumgaard: @Sujith Ramanathan This looks 
like a DNS issue. Is your hostname really '38f9d3e4dd35.mycompany.com'? If so, 
you might need to add it to your /etc/hosts file
----
2019-09-19 17:58:25 UTC - Sujith Ramanathan: Hey David,
I just downloaded Pulsar and am trying to run it locally using pulsar 
standalone. That is when I get this exception. Do I need DNS config to run 
Pulsar locally?
Thanks for your quick response.
----
2019-09-19 18:03:50 UTC - Sujith Ramanathan: It works fine if I run it as 
./bin/pulsar standalone --advertised-address 127.0.0.1
----
2019-09-19 18:09:53 UTC - David Kjerrumgaard: In standalone it uses the local 
host name as the advertised address, but it looks like you are missing an entry 
in your /etc/hosts file that resolves your hostname, 
`38f9d3e4dd35.mycompany.com` to the 
loopback IP address of 127.0.0.1
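
A quick way to check whether a host name resolves on the machine is sketched below. `HostnameCheck` and `resolveOrNull` are hypothetical names; the check uses the JVM's own resolver, which may not behave identically to the Netty resolver that produced the log line above, so treat it as a first diagnostic only.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Small diagnostic: does a host name resolve on this machine? (hypothetical helper) */
final class HostnameCheck {
    /** Returns the resolved address for a host name, or null if it cannot be resolved. */
    static InetAddress resolveOrNull(String host) {
        try {
            return InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String host;
        try {
            // getLocalHost() itself fails when the machine's own host name does not resolve.
            host = args.length > 0 ? args[0] : InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            System.out.println("Local host name does not resolve: " + e.getMessage());
            return;
        }
        InetAddress address = resolveOrNull(host);
        if (address == null) {
            System.out.println(host + " does not resolve; add it to /etc/hosts"
                    + " or start with --advertised-address");
        } else {
            System.out.println(host + " resolves to " + address.getHostAddress());
        }
    }
}
```

Run without arguments it checks the local host name; pass a name such as the one from the log to check that instead.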
----
2019-09-19 18:10:04 UTC - David Kjerrumgaard: Glad you found a work-around for 
this.
----
2019-09-19 18:47:16 UTC - Sujith Ramanathan: Yep :slightly_smiling_face:
----
2019-09-19 19:27:24 UTC - Luke Lu: What’s the expected behavior wrt geo 
replication and retention policies? Are retention policies replicated as well? 
If one of the clusters died and rejoined the namespace with empty storage, 
would they catch up with all the retained data from peers as well?
----
2019-09-19 19:35:10 UTC - Junli Antolovich: @Junli Antolovich has joined the 
channel
----
2019-09-19 19:48:16 UTC - Junli Antolovich: Hey everyone, I am trying to find 
a Windows Service Bus replacement, as many of our product lines use Windows 
Service Bus, which went out of mainstream support in January last year. I came 
across Pulsar and like everything I see, except that there is no .NET client. 
I'm not sure whether this is already in the pipeline or whether we will have to 
build it ourselves in order to use it. As you may see, we are kinda in a time 
crunch.
----
2019-09-19 19:48:27 UTC - David Kjerrumgaard: @Luke Lu Within a multi-region 
Pulsar cluster there are multiple ZK instances. Each regional cluster has a 
"local" ZK cluster that stores metadata about the topics and ledgers used to 
store the data. There is also a "global" ZK instance that is used to store 
metadata around geo-replication, and namespace policies such as the retention 
policy. Having said that, this means that when the cluster re-joins the Pulsar 
instance (term for multi-cluster pulsar) it would be able to get those policies 
from the global ZK. However, there is no automatic replication of old messages 
between the two clusters that would enable the re-joining cluster to "catch up".
2019-09-19 19:49:18 UTC - Matteo Merli: There are a couple of .Net client libs 
being worked on
+1 : David Kjerrumgaard
----
2019-09-19 19:49:38 UTC - David Kjerrumgaard: @Junli Antolovich There actually 
is a .NET client that someone developed, but it hasn't been added to the open 
source codebase as of yet. Reach out to those teams for the exact status of 
those projects.
+1 : dba
----
2019-09-19 19:50:16 UTC - Matteo Merli: 
<https://github.com/danske-commodities/dotpulsar>
+1 : dba
----
2019-09-19 19:50:22 UTC - Matteo Merli: 
<https://github.com/fsharplang-ru/pulsar-client-dotnet>
----
2019-09-19 19:53:40 UTC - Junli Antolovich: Thanks, I did take a look at those 
two, and they have just started, with very limited functionality. I guess I can 
use the C# one for a POC. It is a .NET Core client, so we will have to convert 
it to .NET Framework in order to use it, and add functionality as needed.
----
2019-09-19 20:03:12 UTC - Junli Antolovich: Also, I read that Pulsar is very 
high performing. Do you have performance metrics handy that I can present to 
our architecture council for evaluation?
----
2019-09-19 20:04:41 UTC - Matteo Merli: These slides contain some benchmark 
results: 
<https://www.slideshare.net/merlimat/high-performance-messaging-with-apache-pulsar>
----
2019-09-19 20:05:36 UTC - Junli Antolovich: TYVM for the help 
:slightly_smiling_face:
----
2019-09-19 20:24:53 UTC - Junli Antolovich: @Matteo Merli Thanks so much for 
the slides; I can borrow many of them directly for the in-depth evaluation. 
Could you elaborate a bit on the last 2 points on the optimizations slide (29)?
----
2019-09-19 20:30:16 UTC - Matteo Merli: > Serialize operations to thread to 
avoid mutex

Instead of having multiple threads contending on a mutex, it’s better to 
delegate operations to a single thread (e.g. hashing different topics’ ops to a 
pool of threads), so that we don’t need to acquire the mutex, or in any case it 
won’t be contended
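
The idea can be sketched with a small pool of single-threaded executors, where each topic name is hashed to one executor that owns that topic's state. `TopicOpSerializer` and its methods are hypothetical names for illustration, not Pulsar broker code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Serializes per-topic operations onto one owning thread instead of locking (illustrative). */
final class TopicOpSerializer {
    private final List<ExecutorService> workers = new ArrayList<>();
    // One plain, non-thread-safe map per worker; only that worker's thread touches it.
    private final List<Map<String, Long>> countsPerWorker = new ArrayList<>();

    TopicOpSerializer(int threads) {
        for (int i = 0; i < threads; i++) {
            workers.add(Executors.newSingleThreadExecutor());
            countsPerWorker.add(new HashMap<>());
        }
    }

    /** The same topic always hashes to the same worker. */
    private int slot(String topic) {
        return Math.floorMod(topic.hashCode(), workers.size());
    }

    /** Queues an increment on the thread that owns the topic; no mutex is taken here. */
    void recordPublish(String topic) {
        int i = slot(topic);
        workers.get(i).execute(() -> countsPerWorker.get(i).merge(topic, 1L, Long::sum));
    }

    /** Reads the counter on the owning thread, after all previously queued updates. */
    long count(String topic) throws Exception {
        int i = slot(topic);
        return workers.get(i).submit(() -> countsPerWorker.get(i).getOrDefault(topic, 0L)).get();
    }

    void shutdown() throws InterruptedException {
        for (ExecutorService worker : workers) {
            worker.shutdown();
            worker.awaitTermination(10, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        TopicOpSerializer serializer = new TopicOpSerializer(4);
        for (int n = 0; n < 1_000; n++) {
            serializer.recordPublish("topic-" + (n % 10));
        }
        System.out.println("topic-0 count = " + serializer.count("topic-0"));
        serializer.shutdown();
    }
}
```

The executor's work queue still synchronizes internally, so this moves the coordination point rather than removing it entirely; the per-topic state itself is only ever touched by one thread.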
----
2019-09-19 20:31:19 UTC - Matteo Merli: > Pulsar brokers acts as a “proxy” — 
Payloads are forwarded with zero-copies from producers to storage and consumers

Brokers are passing the payloads from producers to the storage nodes. We avoid 
all possible memory copies in the process.
+1 : Junli Antolovich
----
2019-09-19 20:41:18 UTC - Luke Lu: This means that there is no DR/backup 
solution for data in retention?
----
2019-09-19 20:55:58 UTC - David Kjerrumgaard: Data that is being retained has 
already been processed and acknowledged by all the consumers in the topic. 
Periodically this data gets deleted based on the retention policies that you 
set. If you want to keep this data beyond the retention period, you would 
configure tiered-storage to ensure that the data is moved to longer-term 
storage such as S3.
----
2019-09-19 20:56:50 UTC - David Kjerrumgaard: if you backed-up and restored the 
retained messages, the data/events would be processed twice, which is probably 
NOT what you want.
----
2019-09-19 21:00:42 UTC - Poule: Just filed a github issue for this
----
2019-09-19 21:21:26 UTC - Tarek Shaar: I am setting the total number of 
listener threads to 20 (PulsarClient client = 
PulsarClient.builder().listenerThreads(20)). Yet when I receive the messages, I 
only see the same thread processing them. The log line is always 
2019-09-19 17:18:39 [pulsar-external-listener-3-1] INFO  
PulsarLatenyMessageListiner:23 - Received message key:xxxxx. The thread is 
always pulsar-external-listener-3-1
----
2019-09-19 21:26:53 UTC - David Kjerrumgaard: @Tarek Shaar The listener threads 
are pooled, so when the thread is finished processing the message it is 
returned to the pool and can be selected again for the next message. It is a 
common misconception that threads in this pool will execute in parallel, 
allowing for concurrent processing of the messages. This is not the case, and 
you should use multiple consumers to achieve that behavior.
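
One way to picture this is a toy model in which each consumer's callbacks are pinned to a single thread of the pool, chosen round-robin when the consumer is created. That pinning is an assumption made for illustration and is not confirmed in this thread; `ListenerPoolModel` is a hypothetical class, not Pulsar client code.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of a listener-thread pool (an assumption for illustration, not Pulsar code). */
final class ListenerPoolModel {
    private final ExecutorService[] threads;
    private final AtomicInteger next = new AtomicInteger();

    ListenerPoolModel(int listenerThreads) {
        threads = new ExecutorService[listenerThreads];
        for (int i = 0; i < listenerThreads; i++) {
            threads[i] = Executors.newSingleThreadExecutor();
        }
    }

    /** Each new consumer is pinned to one thread, chosen round-robin. */
    ExecutorService pinConsumer() {
        return threads[Math.floorMod(next.getAndIncrement(), threads.length)];
    }

    /** Dispatches the given number of callbacks per consumer; returns the thread names used. */
    Set<String> run(int consumers, int messagesPerConsumer) throws InterruptedException {
        Set<String> seen = ConcurrentHashMap.newKeySet();
        for (int c = 0; c < consumers; c++) {
            ExecutorService pinned = pinConsumer();
            for (int m = 0; m < messagesPerConsumer; m++) {
                pinned.execute(() -> seen.add(Thread.currentThread().getName()));
            }
        }
        // Drain and stop the pool; a model instance is meant to be used for one run.
        for (ExecutorService t : threads) {
            t.shutdown();
            t.awaitTermination(10, TimeUnit.SECONDS);
        }
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("1 consumer, 20 listener threads: "
                + new ListenerPoolModel(20).run(1, 1_000).size() + " thread(s) used");
        System.out.println("4 consumers, 20 listener threads: "
                + new ListenerPoolModel(20).run(4, 1_000).size() + " thread(s) used");
    }
}
```

Under this model a single consumer only ever sees one listener thread no matter how large the pool is, while several consumers spread across several threads, which is consistent with the advice to use multiple consumers.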
----
2019-09-19 21:48:42 UTC - Luke Lu: 1. How do you restore a cluster with 
offloaded data on tiered storage (S3)? I was under the impression that this is 
not supported yet: <https://github.com/apache/pulsar/issues/4942>
2. The retained messages are “acked” messages that won’t be seen again by 
consumers unless they explicitly seek back.

I was hoping that geo-replication could restore retained data as already 
“acked” data in recovered clusters.
----
2019-09-19 21:55:58 UTC - David Kjerrumgaard: 
1. Loading data from S3 would require some manual work at the moment, but it is 
possible to do. Currently, you could access the data in S3 via the SQL 
interface.
2. Yes, readers can always attempt to read data beyond the retention window, 
but there are no guarantees that the data will still be available, which is the 
case here.
3. Backing-up Pulsar would require backing-up the ZK metadata which keeps track 
of where the data is stored, etc.
----
2019-09-19 21:59:46 UTC - Luke Lu: I was just hoping the acked data _inside_ 
the retention window can be restored by geo-replication.
----
2019-09-19 22:02:04 UTC - David Kjerrumgaard: In an active-active scenario, if 
one of the clusters fails, producers and consumers (including readers)  should 
be re-directed to the remaining active cluster and continue processing incoming 
messages. If you are using asynchronous geo-replication, then there is a small 
amount of messages that may not get replicated across. Now, if the failed 
cluster later re-joins the instance, it will start receiving replication 
traffic again and will quickly get filled with the most recent topic data 
(which is what most consumers want).  Now if you had a use case that required a 
reader, and it was going to scan back to a time during the outage, you would 
want to ensure that the reader only interacted with the cluster that was 
"alive" during that period, as it would be the only place that had the data you 
are looking for.
----
2019-09-19 22:10:18 UTC - Luke Lu: Understood. This can be made to work with 
some effort (an extra service to track “complete” clusters, i.e. those with 
retention data, which is critical to reconstruct streaming windows), but as you 
can see, it’s not friendly from a consumer's PoV. A consumer typically talks to 
a Pulsar cluster in its own region with a broker URL as a fixed config.
----
2019-09-19 22:51:14 UTC - David Kjerrumgaard: That makes sense. The reader 
pattern is a bit of a gap in this scenario that can be better addressed. But 
bear in mind that data retention is always a moving target and varies by time. 
The higher the topic throughput, the more frequently the messages fall out of 
the retention window. Trying to keep them around indefinitely is a lot of work, 
particularly in a "catch-up" scenario, when they will ultimately not be kept 
around for a long time as they will get expired out as new data flows in.
----
2019-09-20 00:17:44 UTC - Tarek Shaar: @David Kjerrumgaard So my consumer, 
which processed 100K messages, used the same thread from the pool of 20 threads?
----
2019-09-20 00:50:46 UTC - Sijie Guo: @Poule Currently the namespace-level 
compatibility check policy is only applied for producers and consumers. The 
admin REST API uses the FULL compatibility check. There is a change in 2.5.0 
to make the admin REST API respect the namespace setting.
hugging_face : Poule
----
2019-09-20 00:51:13 UTC - Sijie Guo: 
<https://github.com/apache/pulsar/issues/4821>
----
2019-09-20 00:54:03 UTC - David Kjerrumgaard: How many message listeners did 
you attach to the consumer? 
<http://pulsar.apache.org/api/client/org/apache/pulsar/client/api/ConsumerBuilder.html#messageListener-org.apache.pulsar.client.api.MessageListener->
----
2019-09-20 01:54:07 UTC - Poule: if I `pulsar-admin topics delete 
--deleteSchema --force persistent://x/y/z` is it normal that I still get a 
schema with `pulsar-admin schemas get --version 0 persistent://x/y/z` ?
----
2019-09-20 05:25:40 UTC - Luke Lu: The retention windows could be size-based 
(typically MBs or GBs) or time-based (typically hours or days). We just want 
the current windows to be maintained in all replicated clusters, i.e., we’re 
not interested in restoring the old messages that are outside the current 
retention windows. Ideally, we want to use Pulsar (or equivalent) as the only 
stream storage, without having to actively save checkpoints into separate 
storage for window reconstructions.
----
2019-09-20 05:40:47 UTC - Vinoth: @Vinoth has joined the channel
----
2019-09-20 05:50:38 UTC - Vinoth: Hello everyone, I have to test the 
performance of the Pulsar broker. We are using the pulsar-perf tool for 
publishing and receiving the messages. I need to find the average throughput 
with 1 broker, 2 brokers and 3 brokers. When I tested all the broker scenarios, 
there was no big difference in throughput. Kindly advise: how should I test it?
----
2019-09-20 08:04:56 UTC - Vladimir Shchur: @Junli Antolovich This one 
<https://www.nuget.org/packages/Pulsar.Client> supports .NET Standard 2.0, so it 
can be used from old dotnet as well (with just some binding redirects). And of 
course it supports both F# and C#.
----
