2020-06-30 11:47:21 UTC - rani: *[Pulsar 2.6.0][Presto]* Architecture: Presto communicating with the brokers through the broker load balancer.

Running the following Presto query results in the errors below. _Note_: running a similar query on another namespace with less data succeeds.

*Query*
```show tables in "myTenant/myNamespace";```
*Presto CLI Error*
```Query 20200630_114317_00004_t2nsz failed: Failed to get tables/topics in 
myTenant/myNamespace: HTTP 500 Internal Server Error
java.lang.RuntimeException: Failed to get tables/topics in 
myTenant/myNamespace: HTTP 500 Internal Server Error
        at 
org.apache.pulsar.sql.presto.PulsarMetadata.listTables(PulsarMetadata.java:191)
        at 
com.facebook.presto.metadata.MetadataManager.listTables(MetadataManager.java:432)```
*Broker Error*
```11:43:47.340 [pulsar-web-42-7] ERROR 
org.apache.pulsar.broker.web.PulsarWebResource - [pulsar-role-broker] Failed to 
check whether namespace bundle is owned 
myTenant/myNamespace/0x80000000_0xc0000000
java.util.concurrent.TimeoutException: null
        at 
java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886) 
~[?:?]
        at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021) ~[?:?]
        at 
org.apache.pulsar.broker.namespace.NamespaceService.getWebServiceUrl(NamespaceService.java:231)```
Are there any parameters in need of tuning here?
----
2020-06-30 11:47:35 UTC - rani: @Sijie Guo, your expertise would be much 
appreciated here
----
2020-06-30 13:35:09 UTC - Meyappan Ramasamy: hi team, I used Apache Pulsar as a message broker, which has similar features to Kafka. I wanted to know whether dynamic topic subscription, i.e. using a regex pattern to subscribe to future topics, is supported in the latest version of Pulsar, 2.6.0.
----
2020-06-30 15:05:41 UTC - Matteo Merli: While the Python client supports schema, Pulsar Functions in Python do not (yet).
----
2020-06-30 15:09:04 UTC - rwaweber: Thanks David! That definitely helps; I appreciate the suggestion.

I suppose my earlier question ended up morphing into another one: it appears that the `pulsar_storage_size` metric is reporting double the occupied storage reported by the `pulsar-admin topics stats-internal <topic>` command.
----
2020-06-30 16:59:12 UTC - Joshua Eric: Ok thank you for your help!
----
2020-06-30 17:06:57 UTC - Sijie Guo: @rani - Have you verified your Pulsar cluster first, before using Presto?
----
2020-06-30 17:07:43 UTC - rani: verified in what sense @Sijie Guo? `{{PULSAR_ENDPOINT}}/admin/v2/brokers/health` returns `200`, and I have a bunch of producers and consumers writing to and reading from the cluster
----
2020-06-30 17:07:52 UTC - Sijie Guo: Regex subscription has been supported since an earlier version (2.3.0 or so).
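For example, a minimal sketch with the Java client (the service URL, namespace and topic pattern below are just placeholders):
```
import java.util.regex.Pattern;

import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.PulsarClient;

public class RegexSubscribe {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // placeholder broker URL
                .build();

        // topicsPattern() subscribes to every matching topic in the namespace;
        // topics created later that match the pattern are picked up automatically.
        Consumer<byte[]> consumer = client.newConsumer()
                .topicsPattern(Pattern.compile("persistent://myTenant/myNamespace/events-.*"))
                .subscriptionName("regex-sub")
                .subscribe();

        System.out.println(new String(consumer.receive().getData()));

        consumer.close();
        client.close();
    }
}
```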
----
2020-06-30 17:09:24 UTC - Sijie Guo: On the Presto server, have you tried running `bin/pulsar-admin topics list myTenant/myNamespace`?
----
2020-06-30 17:10:31 UTC - Sijie Guo: The Presto connector basically calls this REST API to get the list of topics.
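Roughly the same thing from the Java admin client, if that is easier to test with (the admin URL and namespace are placeholders, and the connector's exact call path may differ slightly):
```
import java.util.List;

import org.apache.pulsar.client.admin.PulsarAdmin;

public class ListTopics {
    public static void main(String[] args) throws Exception {
        // Wraps the REST call GET /admin/v2/persistent/{tenant}/{namespace}
        try (PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080") // placeholder admin endpoint
                .build()) {
            List<String> topics = admin.topics().getList("myTenant/myNamespace");
            topics.forEach(System.out::println);
        }
    }
}
```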
----
2020-06-30 17:13:34 UTC - rani: hmm, nope, I haven’t tried that! A few minutes before you messaged, everything started working as expected, but I’m not sure exactly what I did to make it work.

I initially tried increasing `zooKeeperSessionTimeoutMillis`, assuming that was causing the timeout, but that didn’t help and I reverted the change.

The only other change I made, which I think is irrelevant, is that I had a “zombie” Pulsar function running which I destroyed, and now everything works as expected on the Presto end.

(By “zombie” function I mean a function that was created before I re-created my Pulsar cluster.)
----
2020-06-30 17:14:31 UTC - rani: For good measure, I’m going to re-create the cluster from scratch now and monitor to see whether I can reproduce this issue.
----
2020-06-30 17:33:53 UTC - Abhishek Varshney: @Matteo Merli
> ordering will be (briefly) broken.
I assume the key-shared consumers would still be able to consume messages in 
order without breaking any ordering when partitions are increased. // @Sijie Guo
----
2020-06-30 17:35:53 UTC - Matteo Merli: Ordering is only guaranteed within a partition.

When you increase the number of partitions, the hashing logic will shift the partition assignments, so a key `K1` that was going to `partition-1` might now go to `partition-12`.
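A toy illustration of the shift, using a plain modulo hash (the client's actual hashing scheme differs, but the effect is the same):
```
public class PartitionShift {
    // Simplified stand-in for the client's key -> partition routing.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        String key = "K1";
        // Same key, different partition counts:
        System.out.println("4 partitions  -> partition-" + partitionFor(key, 4));
        System.out.println("16 partitions -> partition-" + partitionFor(key, 16));
        // Once the partition count changes, the key can land on a different
        // partition, so per-key ordering is briefly broken across old and new.
    }
}
```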
----
2020-06-30 17:44:03 UTC - rani: I’ve re-created the cluster without issues! I am able to query my topics via Presto. Everything works well.
----
2020-06-30 17:44:39 UTC - rani: I’m just still puzzled as to how destroying an old running function could have resolved this issue.
----
2020-06-30 18:16:28 UTC - Rahul Vashishth: Was it supposed to be published with this <https://github.com/apache/pulsar-helm-chart/pull/21> merge?
----
2020-06-30 19:21:26 UTC - Raphael Enns: Hey, for this issue, it appears to have been a disconnect from the PulsarClient. Currently we have a single PulsarClient instance that we reuse, but I've noticed in some testing that it can go down, and that causes problems.
----
2020-06-30 19:22:11 UTC - Raphael Enns: Of the two cases I noticed, the first is a couple of out-of-memory errors, which seem to stem from a disconnect.
----
2020-06-30 19:23:01 UTC - Raphael Enns: The other case is that we suddenly stop receiving any messages, which looks like the PulsarClient's listener thread got blocked or something.
----
2020-06-30 19:24:19 UTC - Raphael Enns: Do you have any recommendations for 
handling the PulsarClient object? Is it supposed to reconnect as necessary on a 
disconnect? Should we have some thread monitoring it to make sure it is 
working? Should we be creating new PulsarClient instances more often?
----
2020-06-30 19:25:16 UTC - Raphael Enns: I'm just looking for recommendations 
for how to run this stably. We have a long-lived service that sits on top of 
Pulsar and right now we are seeing some stability issues.
----
2020-06-30 20:48:02 UTC - Matteo Merli: Yes, if there's a disconnection, the client will internally attempt to reconnect, with exponential backoff.

While it's disconnected, the producer will buffer messages until the queue gets full. By default, the queue size is 1K messages. To reduce the amount of memory used, just reduce the `producerQueueSize` to a smaller number (e.g. 100 or 10).
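For reference, in the Java client the equivalent knobs sit on the producer builder (a sketch; service URL and topic are placeholders):
```
import java.util.concurrent.TimeUnit;

import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;

public class SmallQueueProducer {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // placeholder
                .build();

        Producer<byte[]> producer = client.newProducer()
                .topic("persistent://myTenant/myNamespace/my-topic") // placeholder
                .maxPendingMessages(100)      // cap the in-memory buffer during reconnects
                .blockIfQueueFull(true)       // apply back-pressure instead of throwing when full
                .sendTimeout(30, TimeUnit.SECONDS)
                .create();

        producer.send("hello".getBytes());

        producer.close();
        client.close();
    }
}
```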
----
2020-07-01 04:37:00 UTC - sjmittal: thanks
----
2020-07-01 05:40:32 UTC - Luke Stephenson: I've got a question about the ledger disk usage on the Grafana dashboards. After a topic has been offloaded to S3, when should that be reflected as a lower "Ledgers Disk Usage" in the bookie metrics? I'm not seeing any change to this metric more than 12 hours after the offloading finished (I was expecting it to go down from 50% to about 1%). I'm confident the offloading worked, as the S3 bucket has a lot of data in it, and a full consumer read of the topic puts almost no load on the bookies. (On Pulsar 2.5.1 I saw this metric drop after offloading, but I've since upgraded to 2.6.0 and it is staying high.)
----
2020-07-01 06:27:12 UTC - Sijie Guo: A couple of reasons:

• there is a `managedLedgerOffloadDeletionLagMs` setting that delays the deletion of ledgers after they are offloaded to tiered storage.
• bookies clean up their disks lazily, via garbage collection and entry log compaction (an internal compaction mechanism to reclaim disk space).
To check what the issue is, you can run:

```bin/pulsar-admin topics stats-internal <topic>```
to get the internal stats of the topic and see which ledgers have been offloaded.

For those offloaded ledgers, you can use

```bin/bookkeeper shell ledgermetadata -ledgerid <ledger-id>```
to check whether the ledger still exists.
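If you'd rather script the first check, a sketch with the Java admin client (admin URL and topic are placeholders; the `offloaded` flag on each ledger entry is the thing to look at):
```
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.common.policies.data.PersistentTopicInternalStats;

public class CheckOffloadedLedgers {
    public static void main(String[] args) throws Exception {
        try (PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080") // placeholder admin endpoint
                .build()) {
            PersistentTopicInternalStats stats =
                    admin.topics().getInternalStats("persistent://myTenant/myNamespace/my-topic");
            stats.ledgers.forEach(l ->
                    System.out.println("ledger " + l.ledgerId
                            + " size=" + l.size
                            + " offloaded=" + l.offloaded));
        }
    }
}
```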
----
2020-07-01 07:07:06 UTC - Luke Stephenson: Thank you for that explanation
----
