2019-12-12 09:12:14 UTC - Fernando: I’m creating a python reader with
```reader = client.create_reader(
'persistent://tenant/namespace/topic',
start_message_id=pulsar.MessageId.earliest)```
and then read messages with
```msg = reader.read_next(1000)```
I’ve noticed that messages are deleted after a while. Is this normal? What do I
need to configure for this to not happen?
----
2019-12-12 09:32:24 UTC - Fernando: Ok retention has to be set for this to work
<https://github.com/apache/pulsar/issues/355#issuecomment-298784473>
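For reference, a minimal sketch of setting retention on a namespace through the admin REST API. The endpoint path and JSON field names reflect my understanding of the v2 admin API and should be checked against the docs for your Pulsar version; the admin URL, tenant, namespace, and retention values are hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request


def build_retention_request(admin_url, tenant, namespace,
                            time_minutes, size_mb):
    """Build (but do not send) a request that sets namespace retention."""
    # Assumed endpoint: POST /admin/v2/namespaces/{tenant}/{namespace}/retention
    url = (f"{admin_url.rstrip('/')}/admin/v2/namespaces/"
           f"{tenant}/{namespace}/retention")
    body = json.dumps({
        "retentionTimeInMinutes": time_minutes,
        "retentionSizeInMB": size_mb,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical values: a local broker, keep messages for 3 days or 10 GB.
req = build_retention_request("http://localhost:8080", "tenant", "namespace",
                              time_minutes=3 * 24 * 60, size_mb=10 * 1024)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req)  # uncomment to send against a live broker
```

The same policy can usually be applied from the command line with `pulsar-admin namespaces set-retention`.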
----
2019-12-12 09:35:59 UTC - Ryan: We share the same concerns. If you create a
GitHub issue or post on StackOverflow, can you please post a link to the URL in
this thread too! Thanks!
----
2019-12-12 11:58:16 UTC - Ryan Samo: We use Grafana to visualize our Pulsar
metrics. Is there a way to look at the backlog metric and deduce the difference
between message TTL and retention? I know that when TTL expires it sends the
message into the retention you have set, just wondering if you can actually see
this via the metrics?
----
2019-12-12 12:01:41 UTC - Brian Doran: Thanks @Sijie Guo We've tried turning
fsync off before to no avail but maybe we did it incorrectly .. what is the
procedure to do this correctly?
----
2019-12-12 12:27:43 UTC - robertsiri: Hi, how can I use my own data type (e.g.,
"User") with Spark Streaming instead of "byte[]", as shown in the example,
e.g. JavaReceiverInputDStream<byte[]> lineDStream =
jsc.receiverStream(pulsarReceiver);?
----
2019-12-12 15:39:06 UTC - Brendan Price: sure thing!
----
2019-12-12 15:52:15 UTC - Brendan Price: @Sanjeev Kulkarni as requested:
<https://github.com/apache/pulsar/issues/5846>
----
2019-12-12 16:45:49 UTC - David Kjerrumgaard: @Ryan Samo Great question. AFAIK,
there isn't a way to differentiate between the 2.
+1 : Ryan Samo
----
2019-12-12 17:04:55 UTC - Sanjeev Kulkarni: thanks!
----
2019-12-12 17:36:02 UTC - Sandeep Kotagiri: Within a Kubernetes Pulsar
deployment, I plan to use a Pulsar Proxy in front of my brokers. In this case,
what value should I configure in the ZooKeeper metadata, as explained in
<https://pulsar.apache.org/docs/en/2.4.2/deploy-bare-metal/#initializing-cluster-metadata>?
Should the hostname part in web-service-url and broker-service-url point to
the proxy hostname or the broker hostname?
----
2019-12-12 17:48:55 UTC - Logan B: As pulsar tracks the consumer cursors in a
shared subscription - what happens during broker failure? Where would the new
broker begin offering messages?
I couldn't find details of this in any of the docs.
----
2019-12-12 18:04:28 UTC - David Kjerrumgaard: @Logan B The consumer cursors are
stored in the bookkeeper layer. Therefore, if the broker serving the messages
from a topic fails, the newly assigned broker can access the consumer cursors
from there and resume exactly where the previous broker left off.
----
2019-12-12 18:07:58 UTC - David Kjerrumgaard: @Sandeep Kotagiri I think you
want to follow the steps outlined here in the deploying to K8s docs.
<https://pulsar.apache.org/docs/en/2.4.2/deploy-kubernetes/#initialize-cluster-metadata>
----
2019-12-12 18:19:03 UTC - Nick Ruhl: @Ryan Yes no problem. I plan on creating
this within a few days
+1 : Sijie Guo
----
2019-12-12 18:22:51 UTC - Sandeep Kotagiri: @David Kjerrumgaard Thank you. I am
using the HELM chart. In the documentation, per the steps outlined, we seed
metadata with broker's web/pulsar service urls. And after that we also deploy
Proxy. So I will go with using broker's service urls for these values. (I am
going to use Proxy as the front end. I was not sure if this had to be the proxy
or the broker. And hence my question)
----
2019-12-12 18:24:45 UTC - David Kjerrumgaard: @Sandeep Kotagiri The proxy uses
the information stored in ZK to determine where to route the requests.
Therefore, those values MUST be the broker URLs. The flow is client --->
proxy ----> ZK (lookup broker addr) ---> broker (forwarded by the proxy)
----
2019-12-12 18:24:46 UTC - David Kjerrumgaard: HTH
----
2019-12-12 18:25:16 UTC - Sandeep Kotagiri: @David Kjerrumgaard thank you
----
2019-12-12 19:29:38 UTC - Ryan Samo: Thanks @David Kjerrumgaard . Do you think
this is a worthwhile enhancement to the Prometheus metrics? I think it would be
great to visualize this behavior
----
2019-12-12 19:30:30 UTC - David Kjerrumgaard: It would be for sure. I am just
not sure that we even track that information.
----
2019-12-12 19:40:34 UTC - Ryan Samo: Ok thanks!
----
2019-12-12 19:43:56 UTC - Logan B: Yes, but what positions are stored?
For example, if I have a shared subscription with two consumers, and 10
messages in the topic (id 1-10).
Consumer A gets message 1 and does not ack
Consumer B gets messages 2, 3, 4 and acks messages 2 & 3 but NOT 4
Broker dies, and consumers fail and go do something else.
New broker comes online, message ack timeouts expire, and a new consumer C
starts reading from the subscription - where is the cursor and what messages
will be sent to consumer C?
Does C see messages 1, 2, 3, 4, 5 ...? This implies redelivery of 2 & 3.
Does C see messages 1, 4, 5, ...? How would it know 2 & 3 were processed?
Does C see messages 5, 6, 7, 8? How & when would 1&4 get processed?
----
2019-12-12 20:04:26 UTC - David Kjerrumgaard: The new consumer would get
messages 1 and 4 since they weren't acked, along with 5 and upwards
----
2019-12-12 20:05:32 UTC - David Kjerrumgaard: the only way the broker "knows" a
message was "processed" is via an ack from the consumer. No ack means not
processed and the message is redelivered automatically
----
2019-12-12 22:12:05 UTC - Joe Francis: @Logan B In general, consumers always
begin with the latest available unacked message. Case 1 applies (or should):
everything will be redelivered. But recently
managedLedgerMaxUnackedRangesToPersist was added to Pulsar, which tries to
persist "ack holes", and this is enabled by default, so the ack holes[(1)] will
be persisted and you will see, as David said, 1 and 4. To me, this covers
for poor application and use case design. Those who require random deletes
should use a database, not a queue. And if the application can't handle
idempotency there are bigger issues. All our applications (in the hundreds) run
with managedLedgerMaxUnackedRangesToPersist set to zero.
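The two outcomes described in this thread can be illustrated with a toy model. This is only a sketch of the idea, not Pulsar's cursor implementation: it assumes the subscription state is a "mark-delete" position (the end of the contiguous acked prefix) plus, optionally, the individually acked messages beyond it (the "ack holes"). The function name and the `persist_ack_holes` flag are hypothetical.

```python
def redelivered(message_ids, acked, persist_ack_holes):
    """Toy model: ids a new consumer receives after the broker restarts."""
    ordered = sorted(message_ids)

    # Mark-delete position: last id of the contiguous acked prefix, if any.
    mark_delete = None
    for mid in ordered:
        if mid in acked:
            mark_delete = mid
        else:
            break

    # Everything after the mark-delete position is a redelivery candidate.
    pending = [m for m in ordered if mark_delete is None or m > mark_delete]

    # If individual acks past the mark-delete position were persisted,
    # those messages are skipped; otherwise they are delivered again.
    if persist_ack_holes:
        pending = [m for m in pending if m not in acked]
    return pending


# Scenario from the thread: messages 1-10, only 2 and 3 were acked.
ids = range(1, 11)
acked = {2, 3}
with_holes = redelivered(ids, acked, persist_ack_holes=True)
without_holes = redelivered(ids, acked, persist_ack_holes=False)
print(with_holes)     # [1, 4, 5, 6, 7, 8, 9, 10]
print(without_holes)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

With ack holes persisted, consumer C sees 1, 4, 5 and upwards; with only the mark-delete position kept, everything from 1 is redelivered, so the application has to tolerate duplicates.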
----
2019-12-13 00:31:31 UTC - Logan B: @Joe Francis , perfect, that was exactly
what I was wondering and the docs for this help clarify. Thank you!
----
2019-12-13 05:27:44 UTC - Sandeep Kotagiri: Hello, for 2.4.1 Pulsar, I am
observing a strange issue. I have turned on TLS and Authentication.
Authentication provider is
org.apache.pulsar.broker.authentication.AuthenticationProviderTls. With this
setting, when using the pulsar-admin CLI tool to create tenants, I am getting a 500
Internal Error. I see a NullPointerException in the broker logs. I do not have any
problem with other functions of pulsar-admin tool like retrieving clusters,
creating namespaces etc. However, any operations on Tenants are failing. Is
this a known issue?
----