2020-01-25 10:36:48 UTC - Rahul: @Rahul has joined the channel
----
2020-01-25 16:36:31 UTC - Gaetan SNL: Hello, I'm considering Apache Pulsar for 
scheduling messages at specific times using the "delayed message" functionality. 
I can't find any documentation about its limitations. Is it OK, for example, to 
have 1 million messages waiting to be delivered? Or to schedule a message 2 
years out? If I understand how it's implemented, it should be fine, no? Thank 
you
----
2020-01-25 18:06:13 UTC - Nouvelle: I'm trying to determine which function 
runtime option is active on my cluster, but the `get-runtime-config` option 
of the `brokers` command is not available despite the documentation: 
<https://pulsar.apache.org/docs/en/2.4.1/pulsar-admin/#get-runtime-config>
Is there another way to determine which function runtime option is active on my 
cluster (besides inspecting the conf file; I'm looking for a way to verify it)?
----
2020-01-25 18:31:54 UTC - Paul Danckaert: @Paul Danckaert has joined the channel
----
2020-01-25 19:58:26 UTC - Roman Popenov: Will do!
----
2020-01-25 20:51:28 UTC - Md. Farhan Memon: @Md. Farhan Memon has joined the 
channel
----
2020-01-25 21:35:31 UTC - Roman Popenov: 
<https://github.com/apache/pulsar/issues/6141>
----
2020-01-25 21:35:35 UTC - Roman Popenov: Done
----
2020-01-25 23:45:56 UTC - Sijie Guo: The current implementation is not well 
suited to scheduling very many delayed messages or very long delay durations.

There is a proposal to implement a timing-wheel-based approach. Once it is 
done, we should be able to support the use cases you mentioned.
----
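As context for why the current implementation struggles with huge backlogs or very long delays: broadly, the broker tracks pending delayed messages in an in-memory priority queue ordered by delivery time, so every pending message occupies memory until it is due. A toy sketch of that approach (illustrative only; the names are not Pulsar's API):

```python
import heapq


class DelayedDeliveryTracker:
    """Toy model of a priority-queue delayed-message tracker."""

    def __init__(self):
        self._queue = []  # (deliver_at, message_id) pairs, min-heap on time

    def add(self, deliver_at, message_id):
        # Every delayed message costs a queue entry until it is due, which
        # is why huge backlogs or multi-year delays are expensive.
        heapq.heappush(self._queue, (deliver_at, message_id))

    def pop_due(self, now):
        """Return the ids of all messages whose delivery time has passed."""
        due = []
        while self._queue and self._queue[0][0] <= now:
            due.append(heapq.heappop(self._queue)[1])
        return due
```

A timing wheel replaces the single heap with coarse time buckets, so the tracking cost no longer grows with every individual pending message.
----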
2020-01-25 23:47:08 UTC - Sijie Guo: what do you mean “not available”?
----
2020-01-25 23:47:21 UTC - Sijie Guo: thanks
+1 : Roman Popenov
----
2020-01-25 23:57:47 UTC - Eugen: When consuming historical data (using 
`SubscriptionInitialPosition.Earliest`) from a partitioned topic, I assume ( 
<https://pulsar.apache.org/docs/en/concepts-messaging/#ordering-guarantee> ) 
that the order of items with different keys is not strictly guaranteed. Will 
items however be consumed "roughly" in order, e.g. using the `publish time` to 
merge the streams at the consumer, or is the relative timing of items from 
different partitions a completely nondeterministic outcome that depends on 
things like network and I/O performance of the involved bookies (or even S3, 
HDFS, etc in case of tiered storage)?
----
2020-01-26 00:11:09 UTC - Sijie Guo: I don’t think you can rely on publish 
time. Publish time is currently assigned on the client side, not the broker 
side.
----
2020-01-26 04:12:29 UTC - Eugen: In my case, relying on the publish time would 
in fact be preferable to broker-generated timestamps, because I have only a 
single producer (with a single clock), whereas there would be multiple brokers 
(one per partition, with different clocks). But I'm not after strict 
ordering here - I just want to be able to consume historical data, without 
items from different partitions drifting further and further apart over time. 
In other words, I want to be able to consume data "roughly" in order. I'd be 
happy if items from different partitions are never out-of-sync for more than 1 
second - but as the publish time is more fine-grained (at least milliseconds) 
it would theoretically be possible to keep them much more in sync than that. As 
I mentioned earlier, this is for consumption of historical data only. For 
real-time data consumption, I would _not_ expect any pacing / throttling of 
partitions so they are "roughly" in sync.
----
2020-01-26 04:35:29 UTC - Addison Higham: @Eugen this article touches on what 
you can expect out of Pulsar (both in terms of tailing and historical reads) 
and also gives some ways you can maybe get more control 
<https://jack-vanlightly.com/blog/2019/9/4/a-look-at-multi-topic-subscriptions-with-apache-pulsar>
----
2020-01-26 04:43:47 UTC - Eugen: @Addison Higham Although the title reads 
"multi-topic subscriptions", grepping for "partition" in the article, it seems 
Jack is addressing my question as well, and Pulsar handles this much better 
than Kafka. Will read - thanks a lot!
----
2020-01-26 04:51:03 UTC - Addison Higham: :thumbsup: yeah, in pulsar, 
multi-topic and partitioned subscriptions are the same thing (since partitions 
in pulsar are just multiple topics)
----
2020-01-26 06:36:54 UTC - Eugen: Great article, saved me hours and days of my 
time! His "A Best-Effort Strategy Based on Publisher Timestamps" section at the 
end is what I was considering as well. But I don't think I can get this to work 
in Pulsar with partitions (in contrast to multiple topics), as I cannot 
subscribe to partitions individually and peek and pace. But I'd like to avoid 
the use of multiple topics in this case where data is coming from a single 
producer anyways and is one thing, logically. And as partitions are implemented 
as topics in Pulsar, there may even be some (undocumented?) way to subscribe to 
those internal topics... But this is getting a bit more involved than I'd like 
it to be. (I think stream processing engines allow for simple stream merging 
based on time stamps along those lines)
----
2020-01-26 06:58:13 UTC - Addison Higham: @Eugen you can subscribe to 
individual partitions, internally, a partitioned topic is just represented by a 
topic with a numbered suffix, so if you have a topic `public/default/my-topic` 
that has 5 partitions, you can individually subscribe to each topic by doing 
`public/default/my-topic-0`  `public/default/my-topic-1` , etc
+1 : Eugen
----
2020-01-26 06:58:31 UTC - Addison Higham: err, might be `partition-<n>`
----
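For reference, the internal naming convention appends `-partition-<n>` to the base topic name. A tiny helper that builds those names (hypothetical helper, not part of the Pulsar client):

```python
def partition_topics(topic, num_partitions):
    """Build the per-partition topic names for a partitioned topic.

    Pulsar exposes each partition of `topic` as a regular topic named
    `<topic>-partition-<n>`, which can be subscribed to individually.
    """
    return [f"{topic}-partition-{i}" for i in range(num_partitions)]
```

Each resulting name, e.g. `persistent://public/default/my-topic-partition-0`, can be used like an ordinary topic in a consumer or reader.
----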
2020-01-26 07:53:18 UTC - Eugen: Good to know! So it seems technically 
possible to merge partitions on the consuming end using e.g. the publish time. 
If I end up going this route, I may in fact add this as a consumer/reader 
option to Pulsar...
----
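The peek-and-pace merge discussed above can be sketched independently of the Pulsar client: treat each partition as an ordered stream of (publish_time, payload) pairs and always consume from the partition whose next pending message is oldest. All names below are illustrative assumptions; with real consumers each stream would be a per-partition reader and the timestamp would come from the message's publish time:

```python
import heapq


def merge_by_publish_time(partition_streams):
    """Best-effort merge of per-partition message streams by publish time.

    Each stream must be an iterable of (publish_time, payload) pairs that is
    already ordered within its partition (Pulsar guarantees per-partition
    order). Always yielding the smallest pending publish time keeps the
    partitions roughly in sync instead of drifting apart.
    """
    iters = [iter(s) for s in partition_streams]
    heap = []
    for idx, it in enumerate(iters):
        first = next(it, None)
        if first is not None:
            # idx breaks timestamp ties, so payloads are never compared
            heap.append((first[0], idx, first[1]))
    heapq.heapify(heap)
    while heap:
        publish_time, idx, payload = heapq.heappop(heap)
        yield publish_time, payload
        following = next(iters[idx], None)
        if following is not None:
            heapq.heappush(heap, (following[0], idx, following[1]))
```

This only works while every partition has a message available to peek at; a production version would also need a policy for a partition that goes quiet (block, or time out and proceed), which is the hard part of any such merge.
----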