Slack digest for #general - 2020-01-14

Apache Pulsar Slack Tue, 14 Jan 2020 01:11:20 -0800

2020-01-13 09:48:42 UTC - Lukas Chripko: Hi there, when creating a new 
environment in the pulsar-manager (public-address) running on GKE and using 
pulsar-proxy (Internal LoadBalancer) to access the broker, what should I use as 
the service url? :confused: neither the internal IP of the pulsar-proxy nor 
"<http://pulsar-proxy:6650>" works and there is not much info, just a 
message that there was an error. Any idea how to debug this?
----
2020-01-13 10:02:06 UTC - Daniel Livshin: @Daniel Livshin has joined the channel
----
2020-01-13 10:11:20 UTC - Roman Popenov: I usually run these two commands to 
get the service URLs:
```pulsar-admin clusters list
pulsar-admin clusters get ${cluster_name}```


----
2020-01-13 10:12:20 UTC - Roman Popenov: And the output is usually something 
like
```{
  "serviceUrl" : "http://${web_service_url_or_ip}:8080",
  "brokerServiceUrl" : "pulsar://${broker_service_url_ip}:6650"
}```
----
2020-01-13 10:12:58 UTC - Roman Popenov: Have you tried using port `8080`?
----
2020-01-13 10:22:47 UTC - Sergii Zhevzhyk: Hello everyone! I tried to use a 
custom Schema in a source connector for Pulsar which publishes AVRO messages.
The connector doesn't work because it cannot load the custom Schema. My 
assumption is that it uses a wrong class loader, but it should be confirmed.
I found a similar issue related to SerDe 
<https://github.com/apache/pulsar/issues/5350> that was resolved. I think that 
if a custom class loader is used to get schema 
<https://github.com/apache/pulsar/commit/10a450121c020217eab85a13fb92268a56b5a778#diff-ba8b1c4fcbe5f5e2c266d9bb30a82aaaR328>
 then it should be fine.
I've created a sample project with code and readme on how to reproduce the 
issue <https://github.com/vzhikserg/pulsar-connector-custom-schema>
Any recommendations (or workarounds) on how to solve this issue would be 
greatly appreciated. Thank you in advance!
----
2020-01-13 11:18:14 UTC - Sijie Guo: pulsar-manager should run in the same 
network as your brokers.
----
2020-01-13 11:19:14 UTC - Sijie Guo: because some broker-related operations 
are sent to individual brokers directly.
----
2020-01-13 11:29:10 UTC - Lukas Chripko: @Roman Popenov Thanks Roman getting 
address by `pulsar-admin clusters get ${cluster_name}` worked! 
:slightly_smiling_face: of course you need full cluster dns on GKE:
```pulsar-admin clusters get pulsar
{
  "serviceUrl" : "http://pulsar-broker.pulsar.svc.cluster.local:8080/",
  "brokerServiceUrl" : "pulsar://pulsar-broker.pulsar.svc.cluster.local:6650/"
}```

+1 : Roman Popenov
----
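A minimal sketch, in Python, of pulling the two URLs out of the `pulsar-admin 
clusters get` output discussed above. It assumes the command prints a single 
JSON object with the `serviceUrl` and `brokerServiceUrl` keys, as in this 
thread; the function name `service_urls` is hypothetical and not part of any 
Pulsar tooling.

```python
import json


def service_urls(clusters_get_output: str) -> dict:
    """Return the web and broker URLs from `pulsar-admin clusters get` output.

    Assumes the output is a single JSON object, as in the thread above.
    """
    data = json.loads(clusters_get_output)
    return {
        "web": data.get("serviceUrl"),           # HTTP endpoint (port 8080 here)
        "broker": data.get("brokerServiceUrl"),  # binary protocol endpoint (port 6650 here)
    }


# Sample output matching the GKE cluster named "pulsar" in the thread.
SAMPLE = """
{
  "serviceUrl" : "http://pulsar-broker.pulsar.svc.cluster.local:8080/",
  "brokerServiceUrl" : "pulsar://pulsar-broker.pulsar.svc.cluster.local:6650/"
}
"""

urls = service_urls(SAMPLE)
print(urls["web"])
print(urls["broker"])
```
----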
2020-01-13 11:52:01 UTC - Sijie Guo: Ah I see. I never thought that people would 
implement their own `Schema` when we worked on issue #5350.

1. to fix the issue, I think you have already found the line to fix.
----
2020-01-13 11:52:36 UTC - Sijie Guo: 2. to get around, you don’t need to 
specify a schema class, you can just specify `--schema-type AVRO`.
----
2020-01-13 11:59:40 UTC - Sergii Zhevzhyk: It wasn't my initial intention to 
implement my own schema, but it is a workaround I've tried to fix the previous 
issue <https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1578605986056500>
----
2020-01-13 12:00:27 UTC - Sergii Zhevzhyk: Shall I create a pull request to fix 
the issue with the class loader?
----
2020-01-13 12:05:08 UTC - Sijie Guo: oh i see. yes. please create a pull 
request to fix the issue.
----
2020-01-13 12:08:28 UTC - Sergii Zhevzhyk: Ok, thank you. I will do so
100 : Sijie Guo
----
2020-01-13 13:50:25 UTC - rmb: Hi all, I'm bumping some questions about the 
node client library from a few days ago: 
<https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1578536787012100>
----
2020-01-13 14:10:12 UTC - Guilherme Perinazzo: > if sending a message fails, 
what are the possible error messages send() could throw?
it'll be "Failed to send message: " followed by one of these: 
<https://github.com/apache/pulsar/blob/master/pulsar-client-cpp/lib/Result.cc>
> the nodejs documentation only lists methods send(), flush(), and close() 
for the producer.
The node client is still a bit barebones, but it's mostly a wrapper on top of 
the C api so it's really easy to modify. If you need a specific function 
exposed, the best way is probably to create a pull request
----
2020-01-13 14:20:11 UTC - Roman Popenov: Looking at 
<https://github.com/apache/pulsar/blob/master/pulsar-functions/run-counter-function.sh>,
 it doesn’t look like there is a `CounterFunction` in the package 
`org.apache.pulsar.functions.api.examples`
----
2020-01-13 14:32:26 UTC - rmb: Thanks!  Is there any kind of automatic retry 
behavior if send() fails?  Or is that left to the client to handle?
----
2020-01-13 14:35:45 UTC - Lukasz Olczak: @Lukasz Olczak has joined the channel
----
2020-01-13 15:48:02 UTC - Roman Popenov: Does anyone have clear instructions on 
how to run Pulsar Functions as pods in kubernetes?
----
2020-01-13 16:13:48 UTC - Debi Hammel: @Debi Hammel has joined the channel
----
2020-01-13 16:28:50 UTC - Debi Hammel: New to Pulsar. I am connecting via a 
library that's a wrapper around the .net client (and also allows connectivity 
to MQTT, etc.). I am using the same library in 3 different projects (a .net 
Winforms test harness, a windows service and a rest api) but can only 
successfully connect via the test harness. I am running my pulsar instance 
standalone with all defaults. The only error I see when connecting from the 
other two projects is "Pulsar handshake not completed within timeout, 
connection closing". Any ideas?
----
2020-01-13 16:49:46 UTC - Guilherme Perinazzo: I'm not sure but I think the C++ 
driver will keep trying until it succeeds. Be careful if you're going to use 
the node client though as send will lock one of the worker threads until it 
finishes sending
----
2020-01-13 16:53:20 UTC - Guilherme Perinazzo: (one of the worker threads from 
node)
----
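If retries do end up being left to the application, one option is a small 
wrapper with exponential backoff. This is a sketch under assumptions, not a 
description of what the node or C++ client does: `send_with_retry` and its 
parameters are hypothetical names, `send` stands in for whatever producer call 
you use, and a failed send is assumed to raise an exception. Keep in mind that 
resending can publish the same payload more than once if an earlier attempt 
actually reached the broker.

```python
import time


def send_with_retry(send, payload, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call send(payload), retrying with exponential backoff on any exception.

    `send` is a placeholder for a producer's send call; it is assumed to
    raise on failure. Re-raises the last error once `attempts` is exhausted.
    """
    for attempt in range(attempts):
        try:
            return send(payload)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...


# Usage with a stand-in send function that fails twice, then succeeds.
calls = []


def flaky_send(payload):
    calls.append(payload)
    if len(calls) < 3:
        # Illustrative error text only; real messages depend on the client.
        raise RuntimeError("Failed to send message: UnknownError")
    return "ok"


result = send_with_retry(flaky_send, b"hello", sleep=lambda seconds: None)
print(result, len(calls))
```
----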
2020-01-13 17:00:29 UTC - Addison Higham: @roman yeah... the docs are lacking, 
we have it working. Let me grab our config
bananadance : Roman Popenov
----
2020-01-13 17:04:33 UTC - Roman Popenov: My hero!
----
2020-01-13 17:15:39 UTC - Addison Higham: I assume you are running your broker 
on k8s? If so, then the only config you need in `broker.conf` is 
`functionsWorkerEnabled=true`, the rest of the config is all in the 
`functions_worker.yml`
----
2020-01-13 17:16:36 UTC - Addison Higham: our copy of that looks like this:
```assignmentWriteMaxRetries: 60
clusterCoordinationTopicName: coordinate
connectorsDirectory: ./connectors
downloadDirectory: /tmp/pulsar_functions
failureCheckFreqMs: 30000
functionAssignmentTopicName: assignments
functionMetadataTopicName: metadata
initialBrokerReconnectMaxRetries: 60
instanceLivenessCheckFreqMs: 30000
numFunctionPackageReplicas: 1
numHttpServerThreads: 8
secretsProviderConfiguratorClassName: org.apache.pulsar.functions.secretsproviderconfigurator.KubernetesSecretsProviderConfigurator
kubernetesContainerFactory:
  jobNamespace: pulsar
  pulsarDockerImageName: instructure/pulsar-all:2.4.1-inst4
  pulsarServiceUrl: pulsar+ssl://pulsar-beta-broker.pulsar:6651/
  pulsarAdminUrl: https://pulsar-beta-broker.pulsar:8443/
  submittingInsidePod: true
  percentMemoryPadding: 10
pulsarFunctionsCluster: pulsar-beta-iad
pulsarFunctionsNamespace: public/functions-iad
rescheduleTimeoutMs: 60000
schedulerClassName: org.apache.pulsar.functions.worker.scheduler.RoundRobinScheduler
tlsCertRefreshCheckDurationSec: 300
useTls: true # this is the important one
tokenPublicKey: file:///etc/pulsar/jwt/public.key
topicCompactionFrequencySec: 1800```
----
2020-01-13 17:17:16 UTC - Addison Higham: will highlight a few important config 
values that took a while to figure out
----
2020-01-13 17:19:14 UTC - Addison Higham: 
`secretsProviderConfiguratorClassName` that one is important as it allows you 
to do `secrets` in your yaml for functions/io. Basically, it allows you to 
reference a k8s secret and inject it as an env var, like so:
```secrets:
  # this isn't the real password! this is a reference to a k8s secret that
  # stores the real password
  MY_PASSWORD:
    path: "my-password" # the name of the k8s secret
    key: "password" # the key in that secret```
----
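Going by the description above, a secret declared this way should be readable 
inside the function as an ordinary environment variable. A minimal Python 
sketch, assuming the variable is named after the key in the `secrets` block 
(`MY_PASSWORD` here); `read_secret` is a hypothetical helper, not a Pulsar API.

```python
import os


def read_secret(name, environ=os.environ):
    """Return the value of an injected secret, failing loudly if it is absent."""
    value = environ.get(name)
    if value is None:
        raise RuntimeError(f"secret {name!r} was not injected into the environment")
    return value


# Usage with a stand-in environment; inside a real function pod you would
# call read_secret("MY_PASSWORD") and let it read os.environ.
fake_env = {"MY_PASSWORD": "not-the-real-password"}
password = read_secret("MY_PASSWORD", fake_env)
```
----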
2020-01-13 17:21:41 UTC - Addison Higham: the `kubernetesContainerFactory` 
block is pretty straightforward, we just override the namespace where 
functions get run, as well as we use our own pulsar fork (all our stuff is 
upstreamed, just waiting for 2.5 to release). Technically, I am not sure you 
need to override the URLs if you aren't using TLS, but if you are using TLS you 
will want to make sure you specify the TLS endpoints.
----
2020-01-13 17:23:43 UTC - Addison Higham: the `pulsarFunctionsCluster` and 
`pulsarFunctionsNamespace` are critical to overwrite if you have 
geo-replication. Each cluster will need its own namespace. Otherwise, each 
regional cluster will complain that it doesn't have permission to use the 
namespace, but if you add it so that the namespace is replicated, then each 
function worker in each region will try and pick up work from other regions, 
which is no good :stuck_out_tongue:
----
2020-01-13 17:25:23 UTC - Addison Higham: if you are using TLS and want the 
functions worker to connect over TLS, you MUST set `useTls`. It seems like a 
bug in the code, as that property is deprecated, but it works for now. Finally, 
the `tokenPublicKey` is needed if you are using token auth as the functions 
worker needs to be able to validate JWTs
----
2020-01-13 17:28:52 UTC - rmb: Thanks.  If the broker has deduplication enabled 
but the producer has its timeout left at 30 seconds, do you know what would go 
wrong? (I'm trying to understand why the docs recommend turning off timeouts to 
use deduplication)
----
2020-01-13 18:43:11 UTC - Mathieu Druart: @Addison Higham are you using state 
API in your Functions ?
----
2020-01-13 18:44:03 UTC - Addison Higham: nope, we tried in 2.4.x and were met 
with defeat, I think it is maturing a bit more with 2.5, will try it again once 
we get there
----
2020-01-13 18:44:10 UTC - Mathieu Druart: we can't figure out how to make 
persistence work in Functions
----
2020-01-13 18:44:19 UTC - Mathieu Druart: ok thanks !
----
2020-01-13 18:44:54 UTC - Addison Higham: yeah, 2.5 takes a new version of 
bookkeeper which has some improvements, but I think most of it was issues on 
the Pulsar side. It isn't really well documented yet
----
2020-01-13 18:45:57 UTC - Mathieu Druart: We will try again with 2.5.0 too
----
2020-01-13 19:49:23 UTC - Joshua Dunham: Hey Everyone,
----
2020-01-13 19:49:46 UTC - Joshua Dunham: I'm trying to start up 
bookkeeper/pulsar and getting an InvalidCookie error in BK.
----
2020-01-13 19:50:24 UTC - Joshua Dunham: Either not all local directories have 
cookies or directories being added newly are not empty. Directories missing 
cookie file are: [/<PATH>/bk-ledger/current]
----
2020-01-13 19:50:52 UTC - Joshua Dunham: Has anyone seen this? I'm assuming BK 
is responsible for seeding this, so I can ask in the BK slack if that's the 
right place.
----
2020-01-13 20:54:25 UTC - Vladimir Shchur: Which .net client are you using?
----
2020-01-13 21:08:24 UTC - Debi Hammel: DotPulsar 0.7.1
----
2020-01-13 21:09:59 UTC - Vladimir Shchur: I see, feel free to ping me if you 
use Pulsar.Client :)
----
2020-01-13 21:12:15 UTC - Joshua Dunham: Figured it out. My metadataServiceURI 
was not the same as the BK_zkLedgerURI. D'oh!
100 : Sijie Guo
----
2020-01-13 21:25:22 UTC - Carol Willing: @Carol Willing has joined the channel
----
2020-01-13 23:55:12 UTC - Jordan Widdison: So I have a general question about 
Pulsar. In the documentation, it mentions the idea of a 
`negative_acknowledgement` 
(<https://pulsar.apache.org/docs/en/concepts-messaging/#negative-acknowledgement>).
 However, there is nothing about sending nacks in the protobuf spec. I've been 
looking through some of the client libraries and it seems like some of them use 
a polling strategy with `CommandRedeliverUnacknowledgedMessages`. But has 
anyone figured out a way to just tell pulsar "I don't want to process this 
message right now, give it back to me later" like you would with a nack in 
RabbitMQ for example?
----
2020-01-14 00:43:52 UTC - Addison Higham: @Jordan Widdison we have run into 
that, you are correct, `nack` is completely a client side concept and just uses 
the redeliver unacknowledged messages functionality. We also have use cases 
where we would like to send a `nack` + some span of time before redelivery 
happens, and AFAICT, the best you can do right now is to figure out how to use 
delayed message delivery: 
<https://github.com/apache/pulsar/wiki/PIP-26:-Delayed-Message-Delivery>
----
2020-01-14 00:46:54 UTC - Addison Higham: for example, you could have your 
consumer produce the message back onto the topic with a delayed delivery (and 
then acknowledge the original message!). That works, but can produce duplicates 
and you have to push it into the client. The other option might be to just let 
it get delivered a few times and configure a dead letter queue where those 
messages might be processed with a different mechanism. I would love to see the 
ability to have `nack` take advantage of delayed message delivery tracking to 
give you some mechanism to back-off messages that can't yet be processed
----
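The republish-then-acknowledge workaround described above could look roughly 
like this in Python. It is a sketch under assumptions: `retry_later` and 
`backoff_delay` are hypothetical helpers, and `consumer` and `producer` are 
expected to behave like the Python client's objects, with 
`producer.send(..., deliver_after=timedelta)` and `consumer.acknowledge(msg)`; 
check that your client and broker versions support delayed delivery before 
relying on it. The attempt count travels in a made-up `retry-count` message 
property.

```python
from datetime import timedelta


def backoff_delay(retry_count, base_seconds=5, max_seconds=300):
    """Exponential backoff: 5s, 10s, 20s, ... capped at max_seconds."""
    return timedelta(seconds=min(base_seconds * (2 ** retry_count), max_seconds))


def retry_later(consumer, producer, msg):
    """Republish msg with a delivery delay, then acknowledge the original."""
    props = dict(msg.properties())
    retries = int(props.get("retry-count", "0"))
    props["retry-count"] = str(retries + 1)  # hypothetical property name
    delay = backoff_delay(retries)
    # Send first, acknowledge second: a failure in between yields a duplicate
    # rather than a lost message.
    producer.send(msg.data(), properties=props, deliver_after=delay)
    consumer.acknowledge(msg)
    return delay
```

In a consume loop you would call `retry_later(consumer, producer, msg)` 
wherever you would otherwise nack. As noted in the thread, this can still 
produce duplicates, and the retry logic lives entirely in the client.
----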
2020-01-14 06:45:40 UTC - dba: Hi Debi. I replied to your e-mail. Let me know 
(here or by e-mail) how I can help :-)
----
