2020-05-07 09:38:42 UTC - xue: Pulsar SQL error, Pulsar version 2.5.1
----
2020-05-07 10:16:52 UTC - Alexandre DUVAL: What did you put in the ZooKeeper
servers setting of the catalog configuration?
----
2020-05-07 10:17:34 UTC - Alexandre DUVAL:
<https://github.com/apache/pulsar/issues/6852>
----
2020-05-07 10:28:47 UTC - Andreas Cz: @Andreas Cz has joined the channel
----
2020-05-07 11:13:13 UTC - Gary Fredericks: okay, I think I asked the question
poorly, I'll try again:
Assuming a very long retention policy, is there any benefit to having a backlog
quota? or what is the risk in not having a backlog quota?
----
2020-05-07 11:16:45 UTC - dionjansen: @dionjansen has joined the channel
----
2020-05-07 11:23:28 UTC - dionjansen: Hi, I’m having an issue with one of my
bookies. I have a cluster deployed on Kubernetes and one of the bookies is
reaching over 90% disk utilisation. I set up managed offloading to S3, but the
disk pressure doesn’t seem to be reducing (I see the ledgers being set to
`offloaded: true`). Do I need to take any additional action to clear up the
current ledgers on disk? Thanks
----
2020-05-07 11:30:42 UTC - alex kurtser: Hello everyone. We have encountered a
new issue: the function definitions disappeared from Pulsar.
We would like to figure out what happened.
Recently we set the retention policy and message TTL settings to short
periods (24 hours for retention and 1 hour for message TTL).
I suspect that we also applied these settings to the public/function namespace
and that this caused the functions to be deleted once the retention time passed.
Does anybody know whether the functions are actually stored in the
public/function namespace and whether they could be affected by retention
policies on this namespace?
----
2020-05-07 12:12:55 UTC - Frank Kelly: Is there any documentation on how to
downscale on AWS - I presume there's some "data balancing" that has to go on
but just wondering how does Pulsar do that data "balancing" and how AWS can
figure out how/when to kill an EC2 instance that is no longer needed?
----
2020-05-07 12:23:53 UTC - Jeroen Coenders: @Jeroen Coenders has joined the
channel
----
2020-05-07 13:04:07 UTC - Ming: Gary, these concepts are confusing. They are
supposed to be independent, but they are intertwined. Assuming a very long
retention policy, you still need a large enough backlog quota, because the
backlog quota prevents the disk from growing indefinitely. Once the backlog
quota is used up, the producer can no longer write to the topic, or the
producer gets disconnected (remember there are 3 backlog policies that decide
what happens when the size limit is reached). The catch is that the backlog
size should not be more than the disk size. If you set the retention to be as
large as the disk size, the backlog size should not be more than that.
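The three reactions mentioned above can be illustrated with a toy model. This is only a sketch: the policy names are the ones Pulsar uses for backlog quotas, but `ToyTopic`, its fields, and the per-publish check are hypothetical simplifications, not how the broker actually enforces quotas.

```python
from dataclasses import dataclass

# Policy names as used by Pulsar backlog quotas; the model below is hypothetical.
POLICIES = ("producer_request_hold", "producer_exception", "consumer_backlog_eviction")


@dataclass
class ToyTopic:
    quota_bytes: int
    policy: str
    backlog_bytes: int = 0  # unacknowledged data still held for subscriptions

    def publish(self, size: int) -> str:
        """Return what happens to a publish of `size` bytes in this toy model."""
        if self.policy not in POLICIES:
            raise ValueError(f"unknown policy: {self.policy}")
        if self.backlog_bytes + size <= self.quota_bytes:
            self.backlog_bytes += size
            return "accepted"
        if self.policy == "producer_request_hold":
            return "held"      # the request waits; nothing is stored
        if self.policy == "producer_exception":
            return "rejected"  # the producer receives an error
        # consumer_backlog_eviction: oldest backlog is dropped to make room
        self.backlog_bytes = min(self.quota_bytes, self.backlog_bytes + size)
        return "accepted_after_eviction"
```

With a quota of 100 bytes, a second 60-byte publish is rejected under `producer_exception`, while under `consumer_backlog_eviction` it is accepted and the backlog is capped at the quota.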
----
2020-05-07 13:24:27 UTC - Damien Roualen: I couldn’t find a similar case in the
channel:
When I try to get tables from Pulsar using Presto (as in the
documentation) I get an HTTP 500 error.
```presto:default> show catalogs;
Catalog
---------
pulsar
system
(2 rows)
presto:default> show schemas in pulsar;
Schema
--------------------
information_schema
public/default
pulsar/system
(3 rows)
presto:default> show tables in pulsar."public/default";
Query 20200507_125433_00004_h8fj8 failed: Failed to get tables/topics in
public/default: HTTP 500 Internal Server Error```
----
2020-05-07 13:25:12 UTC - Damien Roualen: Is there a way to understand that
HTTP 500 error from my Pulsar Cluster?
----
2020-05-07 13:25:55 UTC - Damien Roualen: I am using a JWT token for the
connection, which allows me to run the first two commands.
----
2020-05-07 13:27:01 UTC - Damien Roualen: Documentation link:
<https://pulsar.apache.org/docs/en/sql-getting-started/>
----
2020-05-07 13:29:28 UTC - Alexandre DUVAL: Did you check brokers logs?
----
2020-05-07 13:30:05 UTC - Alexandre DUVAL: Or presto runner logs? Or try
`pulsar sql --debug`
----
2020-05-07 13:31:00 UTC - Damien Roualen: We run Pulsar with a service using
systemctl.
I checked with: `sudo systemctl status pulsar.broker.service`
----
2020-05-07 13:31:52 UTC - Alexandre DUVAL: so journalctl -efau
pulsar.broker.service ?
----
2020-05-07 13:31:55 UTC - Damien Roualen: I am using Presto cli.
----
2020-05-07 13:32:11 UTC - Alexandre DUVAL: pulsar sql is a presto cli :wink:
----
2020-05-07 13:33:26 UTC - Damien Roualen: Ok, :sweat_smile:. I will run the
presto cli with a similar debug parameter
----
2020-05-07 13:34:01 UTC - Alexandre DUVAL: The log shown by the Presto CLI
should be more detailed on the Presto runner
----
2020-05-07 13:35:09 UTC - Damien Roualen: I have the exception trace now, I am
looking at it if that can help.
----
2020-05-07 13:35:52 UTC - Damien Roualen: The CLI doesn’t give meaningful
information about the issue.
----
2020-05-07 13:36:20 UTC - Damien Roualen: ```Query 20200507_133355_00009_h8fj8 failed: Failed to get tables/topics in pulsar/system: HTTP 500 Internal Server Error
java.lang.RuntimeException: Failed to get tables/topics in pulsar/system: HTTP 500 Internal Server Error
    at org.apache.pulsar.sql.presto.PulsarMetadata.listTables(PulsarMetadata.java:191)
    at com.facebook.presto.spi.connector.ConnectorMetadata.listTables(ConnectorMetadata.java:224)
    at com.facebook.presto.metadata.MetadataManager.listTables(MetadataManager.java:548)
    at com.facebook.presto.connector.informationSchema.InformationSchemaMetadata.lambda$calculatePrefixesWithTableName$7(InformationSchemaMetadata.java:304)
    at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:271)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
    at com.facebook.presto.connector.informationSchema.InformationSchemaMetadata.calculatePrefixesWithTableName(InformationSchemaMetadata.java:308)
    at com.facebook.presto.connector.informationSchema.InformationSchemaMetadata.getTableLayouts(InformationSchemaMetadata.java:242)```
----
2020-05-07 13:36:31 UTC - Alexandre DUVAL: with --debug the cli shows all the
stacktrace iirc
----
2020-05-07 13:36:41 UTC - Damien Roualen: Yes, I use that parameter.
----
2020-05-07 13:36:42 UTC - Alexandre DUVAL: ok so it should be broker side,
check its logs
----
2020-05-07 13:37:09 UTC - Damien Roualen: Since I am using three brokers,
should I use journalctl on each instance?
----
2020-05-07 13:39:38 UTC - Alexandre DUVAL: yes, or you can do a lookup to check
which broker currently serves the topic used by your Presto request
----
2020-05-07 13:41:20 UTC - Damien Roualen: How can I do the lookup?
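One way to do the lookup, sketched under assumptions: the CLI form is `pulsar-admin topics lookup persistent://tenant/namespace/topic`, and the Python helper below builds what should be the equivalent broker REST URL. The `/lookup/v2/topic/...` path should be checked against your broker version, the host `broker:8080` is a placeholder, and `lookup_url` is a hypothetical helper, not part of any Pulsar library.

```python
from urllib.parse import quote


def lookup_url(admin_base: str, topic: str) -> str:
    """Build the REST URL used to ask which broker serves `topic`."""
    domain, sep, rest = topic.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError("expected persistent://tenant/namespace/topic")
    parts = rest.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected tenant/namespace/topic")
    # Percent-encode each path segment separately
    tenant, namespace, name = (quote(p, safe="") for p in parts)
    base = admin_base.rstrip("/")
    return f"{base}/lookup/v2/topic/{domain}/{tenant}/{namespace}/{name}"


if __name__ == "__main__":
    # Placeholder host and topic
    print(lookup_url("http://broker:8080", "persistent://public/default/my-topic"))
```

The URL can then be fetched with any HTTP client, sending the same JWT as a bearer token if authentication is enabled.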
----
2020-05-07 13:42:03 UTC - Damien Roualen: Otherwise I could update the Presto
server config to use only 1 broker and relaunch the Presto server.
----
2020-05-07 13:42:39 UTC - Damien Roualen: The logs don’t show anything from the
3 brokers. Maybe there is a way to increase the level of verbosity.
----
2020-05-07 13:44:33 UTC - Alexandre DUVAL: maybe
```org.apache.pulsar.sql.presto.PulsarMetadata.listTables(PulsarMetadata.java:191)```
----
2020-05-07 13:44:43 UTC - Alexandre DUVAL: look for metadata on zookeepers?
----
2020-05-07 13:44:54 UTC - Alexandre DUVAL: you can check the code that ran
----
2020-05-07 13:58:36 UTC - Damien Roualen: Ok; I am going to check the metadata.
----
2020-05-07 14:04:03 UTC - Gary Fredericks: thanks for the explanation; I will
play around with this and see what I can figure out
----
2020-05-07 14:04:36 UTC - Gary Fredericks: (I also posted a different version
of this question on the github issue:
<https://github.com/apache/pulsar/issues/6847#issuecomment-625189109>)
----
2020-05-07 14:09:06 UTC - Damien Roualen: By running: `sudo journalctl -efau
zookeeper.service`
----
2020-05-07 14:10:14 UTC - Damien Roualen: I see an error earlier but it
doesn't seem to be related:
```May 07 12:28:46 ip-10-242-177-121 pulsar[4411]: WARNING: An illegal
reflective access operation has occurred
May 07 12:28:46 ip-10-242-177-121 pulsar[4411]: WARNING: Illegal reflective
access using Lookup on org.aspectj.weaver.loadtime.ClassLoaderWeavingAdaptor
(file:/opt/pulsar/lib/org.aspectj-aspectjweaver-1.9.2.jar) to class
java.lang.ClassLoader
May 07 12:28:46 ip-10-242-177-121 pulsar[4411]: WARNING: Please consider
reporting this to the maintainers of
org.aspectj.weaver.loadtime.ClassLoaderWeavingAdaptor
May 07 12:28:46 ip-10-242-177-121 pulsar[4411]: WARNING: Use
--illegal-access=warn to enable warnings of further illegal reflective access
operations
May 07 12:28:46 ip-10-242-177-121 pulsar[4411]: WARNING: All illegal access
operations will be denied in a future release```
----
2020-05-07 14:32:04 UTC - rani: *Pulsar Component Version*: `2.5.1`
*Architecture*: Function workers running on k8s separately from remaining
components that are managed on EC2 via ASGs. Function worker uses
`pulsar-proxy` for communication with broker.
*Issue:* The function worker works as expected; however, it occasionally throws
this error:
```11:30:42.196 [cluster-service-coordinator-timer] ERROR org.apache.pulsar.functions.worker.MembershipManager - Failed to get status of coordinate topic persistent://pulsar/functions/coordinate
org.apache.pulsar.client.admin.PulsarAdminException$ServerSideErrorException: HTTP 502 Bad Gateway
    at org.apache.pulsar.client.admin.internal.BaseResource.getApiException(BaseResource.java:204) ~[org.apache.pulsar-pulsar-client-admin-original-2.5.1.jar:2.5.1]
    at org.apache.pulsar.client.admin.internal.TopicsImpl$9.failed(TopicsImpl.java:469) ~[org.apache.pulsar-pulsar-client-admin-original-2.5.1.jar:2.5.1]
    at org.glassfish.jersey.client.JerseyInvocation$4.failed(JerseyInvocation.java:1030) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
    at org.glassfish.jersey.client.JerseyInvocation$4.completed(JerseyInvocation.java:1017) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
    at org.glassfish.jersey.client.ClientRuntime.processResponse(ClientRuntime.java:227) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
    at org.glassfish.jersey.client.ClientRuntime.access$200(ClientRuntime.java:85) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
    at org.glassfish.jersey.client.ClientRuntime$2.lambda$response$0(ClientRuntime.java:178) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]```
Any input would be appreciated
----
2020-05-07 15:00:44 UTC - David Kjerrumgaard: Thanks for the detailed
explanation and for filing the issue.
----
2020-05-07 15:05:42 UTC - David Kjerrumgaard: FWIW, I usually have my SerDe
classes defined in a different project/JAR file and never had this issue.
----
2020-05-07 15:41:20 UTC - Sijie Guo: New case study blog post from Zhaopin
about why and how they use Pulsar SQL for search log analysis. If you are
interested in using Pulsar (Presto) SQL, please check it out.
<https://streamnative.io/blog/tech/2020-05-07-zhaopin-tech-blog/>
+1 : Konstantinos Papalias, Penghui Li, Simba Khadder, Luke Lu, David
Kjerrumgaard
----
2020-05-07 15:56:20 UTC - Addison Higham: are you referring to autoscaling?
bookkeepers don't lend themselves well to autoscaling, but brokers should be
reasonable to do either just based on CPU or you could use a custom metric.
Brokers don't have any non-replaceable state so you can just add or remove any
as you need (but they do have caches, so you don't want to go crazy with it).
You can read more about load balancing basics here:
<https://pulsar.apache.org/docs/en/administration-load-balance/>. What that
means on AWS though is that assuming a pretty even load across a number of
topics, ideally no one broker is better suited than the others, so the choice
*should* be arbitrary of which broker you terminate. Assuming you let a broker
properly terminate, it will signal that all of its bundles (the unit of load
balancing) should be moved to other brokers. Even in the event of a
non-graceful termination, the other brokers will see that there is a bundle
that isn't scheduled and will pick it up, it may just take a bit longer
+1 : Frank Kelly
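The hand-off described above can be pictured with a toy model. This is a sketch only: `remove_broker`, the broker names, and the "fewest bundles wins" rule are hypothetical and much simpler than Pulsar's real load manager, which weighs actual resource usage.

```python
from typing import Dict, List


def remove_broker(assignments: Dict[str, List[str]], leaving: str) -> Dict[str, List[str]]:
    """Return new assignments with the leaving broker's bundles moved to survivors.

    Toy rule: each orphaned bundle goes to the surviving broker that currently
    owns the fewest bundles (ties broken by broker name).
    """
    if leaving not in assignments:
        raise KeyError(leaving)
    survivors = {b: list(bundles) for b, bundles in assignments.items() if b != leaving}
    if not survivors:
        raise ValueError("cannot remove the last broker")
    for bundle in assignments[leaving]:
        target = min(survivors, key=lambda b: (len(survivors[b]), b))
        survivors[target].append(bundle)
    return survivors


if __name__ == "__main__":
    before = {
        "broker-a": ["0x00_0x40", "0x40_0x80"],
        "broker-b": ["0x80_0xc0"],
        "broker-c": ["0xc0_0xff"],
    }
    print(remove_broker(before, "broker-a"))
```

In this model it does not matter which broker is terminated when load is even, which matches the point that the choice should be arbitrary.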
----
2020-05-07 17:14:09 UTC - Damien Roualen: I got the error from the broker log:
```14:52:39.152 [AsyncHttpClient-44-4] ERROR org.apache.pulsar.broker.admin.v2.NonPersistentTopics - [pulsar-role-admin] Failed to get list of topics under namespace public/default
java.util.concurrent.ExecutionException: org.apache.pulsar.client.admin.PulsarAdminException$NotAuthorizedException: Don't have permission to administrate resources on this tenant
...
Caused by: javax.ws.rs.NotAuthorizedException: HTTP 401 Unauthorized
    at org.glassfish.jersey.client.JerseyInvocation.convertToException(JerseyInvocation.java:1080) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]
    at org.glassfish.jersey.client.JerseyInvocation.access$700(JerseyInvocation.java:99) ~[org.glassfish.jersey.core-jersey-client-2.27.jar:?]```
I can also reproduce it with the admin client:
`sudo ./bin/pulsar-admin topics list "pulsar/default"`
or `sudo ./bin/pulsar-admin topics list "public/default"`
Both return: HTTP 500 Internal Server Error
I will continue to investigate and try to better understand.
----
2020-05-07 17:16:25 UTC - Alexandre DUVAL: did you configure your client.conf?
----
2020-05-07 17:16:30 UTC - Alexandre DUVAL: used by pulsar-admin
----
2020-05-07 17:16:49 UTC - Damien Roualen: Yes, with JWT and pulsar-admin.
----
2020-05-07 17:17:34 UTC - Damien Roualen: I was reading here:
<https://stackoverflow.com/questions/60385447/apache-pulsar-authorization-failed-on-topic-dont-have-permission-to-adminis>
“This error is occurring because you’re not using the correct token to connect,
or the role associated with your token lacks sufficient permission. You will
need to ensure that you are using the correct token (and that it has the right
permissions) to enable you to connect.”
I am going to check that.
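To see which role a token carries, its payload can be decoded locally. The sketch below assumes a standard three-part JWT and a default token-auth setup in which the role is read from the `sub` claim; `jwt_claims` is a hypothetical helper, and it does not verify the signature, so it is only useful for inspection.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    parts = token.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def role_of(token: str, claim: str = "sub") -> str:
    """Return the claim assumed to hold the role (default: `sub`)."""
    return jwt_claims(token)[claim]
```

The value returned by `role_of` is what would need to be listed among the superusers or the tenant's admin roles, or granted permissions on the namespace.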
----
2020-05-07 17:17:37 UTC - Alexandre DUVAL: and the jwt is admin of the cluster
which is hosting public/default?
----
2020-05-07 17:17:53 UTC - Alexandre DUVAL: (the cluster as pulsar resource)
----
2020-05-07 17:20:00 UTC - Damien Roualen: I am going to verify.
----
2020-05-07 17:35:54 UTC - Damien Roualen: jfyi It could take a bit of time; I
am reading the documentation and learning how the project has been
configured.
----
2020-05-07 17:52:03 UTC - Andrei Canta: @Andrei Canta has joined the channel
----
2020-05-07 18:07:36 UTC - Alex Yaroslavsky: Hi, I have several brokers and
several separate function workers running in k8s, all set up to use TLS and
certificates. I'm trying to make pulsar-admin work seamlessly with a single
admin-url. I have seen a suggestion here to put a proxy in front, which should
make this possible. So I have set up a proxy, gave it the admin role client
certificate, and pointed admin-url to this proxy. But all function-related
commands still return:
Function worker service is not done initializing. Please try again in a little
while.
Reason: HTTP 503 Service Unavailable
----
2020-05-07 18:11:03 UTC - Chris Bartholomew: Did you configure
`functionWorkerWebServiceURL` and/or `functionWorkerWebServiceURLTLS` in the
`proxy.conf` so it can find the standalone function workers?
----
2020-05-07 18:13:58 UTC - Alex Yaroslavsky: ye
----
2020-05-07 18:14:04 UTC - Alex Yaroslavsky: yes
root@pulsar-hidden-proxy-6dc5954d9b-bwdvt:/pulsar# cat conf/proxy.conf | grep http
# http://www.apache.org/licenses/LICENSE-2.0
brokerWebServiceURL=http://pulsar-broker:8080
brokerWebServiceURLTLS=https://pulsar-broker:8443
functionWorkerWebServiceURL=http://pulsar-function:6750
functionWorkerWebServiceURLTLS=https://pulsar-function:6751
----
2020-05-07 18:14:43 UTC - Alex Yaroslavsky: and my admin-url is
<https://pulsar-proxy:8443>
----
2020-05-07 18:19:30 UTC - Sijie Guo: @Alex Yaroslavsky which version of Pulsar
are you running? Is it 2.5.1?
----
2020-05-07 18:20:07 UTC - Alex Yaroslavsky: It's 2.5.0; I plan to upgrade to
2.5.1 in a week
----
2020-05-07 18:23:10 UTC - Sijie Guo: Prior to 2.5.1, the proxy only forwarded
the v1/v2 functions routes to the function worker. @Addison Higham fixed the
issue and the fix is available in 2.5.1, so you might need to upgrade to 2.5.1
to make this approach work - <https://github.com/apache/pulsar/pull/6486>
----
2020-05-07 18:24:33 UTC - Alex Yaroslavsky: Cool, thanks! I'll try it out once
I upgrade.
----
2020-05-07 18:34:01 UTC - Anshul Dhyani: Guys, I am trying out Pulsar locally,
especially Pulsar Functions, but sadly I can't run it locally as it looks like
there is no pulsar-admin for Windows.
If it exists, it would be really helpful if someone could point me to a place
to download pulsar-admin for Windows.
----
2020-05-07 18:42:03 UTC - Alex Yaroslavsky: You can write java code that uses
PulsarAdmin, or run pulsar-admin in a docker/WSL.
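A third option, not mentioned above and shown here only as a sketch, is to call the admin REST API directly from any language that runs on Windows. The example uses Python's standard library; the `/admin/v3/functions/{tenant}/{namespace}` path and the `http://localhost:8080` address are assumptions to check against your Pulsar version and setup, and both helper names are hypothetical.

```python
import json
import urllib.request
from typing import List, Optional


def functions_url(admin_base: str, tenant: str, namespace: str) -> str:
    """Build the REST URL that lists the functions in a namespace."""
    return f"{admin_base.rstrip('/')}/admin/v3/functions/{tenant}/{namespace}"


def list_functions(admin_base: str, tenant: str, namespace: str,
                   token: Optional[str] = None) -> List[str]:
    """Fetch function names; requires a reachable admin endpoint."""
    request = urllib.request.Request(functions_url(admin_base, tenant, namespace))
    if token:
        # Only needed when token authentication is enabled
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder address for a local standalone instance
    print(list_functions("http://localhost:8080", "public", "default"))
```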
----
2020-05-07 21:21:39 UTC - David Kjerrumgaard: Can somebody tell me how to get
state storage working in standalone mode? I have tried adding
`extraServerComponents=org.apache.bookkeeper.stream.server.StreamStorageLifecycleComponent`
to the standalone.conf file (no luck) and starting pulsar with
`/pulsar/bin/pulsar standalone --stream-storage-port 4181` neither of which
works. Any pointers on what I am doing wrong?
----
2020-05-07 21:24:06 UTC - Zoltan Lajos Kis: @Zoltan Lajos Kis has joined the
channel
----
2020-05-08 00:19:45 UTC - Liam Clarke: Brilliant thanks :slightly_smiling_face:
----
2020-05-08 03:10:00 UTC - Jared: @Jared has joined the channel
----
2020-05-08 03:30:57 UTC - Jared: Hello, I'm getting started with Pulsar +
Debezium (with Postgres) and have successfully followed the example shown here:
<http://pulsar.apache.org/docs/en/io-cdc-debezium/> - but I have a problem,
when it comes to testing messages using 'docker exec -it pulsar-postgresql
/bin/bash' I get various control characters in the log - I've pasted a
screenshot. Does anyone know why those appear and how to properly work
with/around them?
----
2020-05-08 03:35:34 UTC - hahaxiaowen: @hahaxiaowen has joined the channel
----
2020-05-08 03:43:20 UTC - baiyuqing: @baiyuqing has joined the channel
----
2020-05-08 04:17:08 UTC - Subba Gaddamadugu: @Subba Gaddamadugu has joined the
channel
----
2020-05-08 06:48:26 UTC - dionjansen: It seems the bookie in question is now
completely full:
```06:43:46.006 [LedgerDirsMonitorThread] WARN
org.apache.bookkeeper.bookie.LedgerDirsMonitor - LedgerDirsMonitor check
process: All ledger directories are non writable
06:43:46.007 [LedgerDirsMonitorThread] ERROR
org.apache.bookkeeper.util.DiskChecker - Space left on device
data/bookkeeper/ledgers/current : 0, Used space fraction: 1.0 > threshold
0.95.```
I see that the other bookies have considerably reduced their data on disk (I
presume because the `offloadLedgerDeletionLagMs` period elapsed) but this one
did not perform the deletions. Do I need manual intervention to resolve this?
Any help would be greatly appreciated!
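For reference, the condition printed in the log (used space fraction above a 0.95 threshold) can be reproduced for any directory with a small check. This is a sketch of the comparison only, with hypothetical function names; it is not BookKeeper's `DiskChecker` and does not free any space.

```python
import shutil

DEFAULT_THRESHOLD = 0.95  # the threshold value printed in the log above


def used_fraction(total_bytes: int, free_bytes: int) -> float:
    """Fraction of the volume that is in use, between 0.0 and 1.0."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return (total_bytes - free_bytes) / total_bytes


def is_writable(total_bytes: int, free_bytes: int,
                threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Toy rule: writable while usage is at or below the threshold."""
    return used_fraction(total_bytes, free_bytes) <= threshold


def check_path(path: str, threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Apply the same rule to the volume that holds `path`."""
    usage = shutil.disk_usage(path)
    return is_writable(usage.total, usage.free, threshold)
```

Running `check_path` against the ledger directory on each bookie shows which ones are past the threshold.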
----
2020-05-08 07:28:01 UTC - evir35: @evir35 has joined the channel
----