2020-06-06 09:24:11 UTC - Liam Clarke: More fun, I'm debugging, and noticed
this in the logs:
```09:21:46.710 [main] INFO
org.apache.bookkeeper.mledger.offload.jcloud.impl.BlobStoreManagedLedgerOffloader
- Constructor offload driver: aws-s3, host: null, container: test, region:
ap-southeast-2 ```
So the Jcloud offloader got the region okay - but the OffloadPolicies in
BrokerService#getManagedLedgerConfig is still missing the necessary values:
```OffloadPolicies{managedLedgerOffloadDriver=aws-s3,
managedLedgerOffloadMaxThreads=2, managedLedgerOffloadPrefetchRounds=1,
managedLedgerOffloadThresholdInBytes=-1,
managedLedgerOffloadDeletionLagInMillis=60000,
s3ManagedLedgerOffloadRegion=null, s3ManagedLedgerOffloadBucket=null,
s3ManagedLedgerOffloadServiceEndpoint=null,
s3ManagedLedgerOffloadMaxBlockSizeInBytes=67108864,
s3ManagedLedgerOffloadReadBufferSizeInBytes=1048576,
s3ManagedLedgerOffloadRole=null,
s3ManagedLedgerOffloadRoleSessionName=pulsar-s3-offload,
gcsManagedLedgerOffloadRegion=null, gcsManagedLedgerOffloadBucket=null,
gcsManagedLedgerOffloadMaxBlockSizeInBytes=67108864,
gcsManagedLedgerOffloadReadBufferSizeInBytes=1048576,
gcsManagedLedgerOffloadServiceAccountKeyFile=null, fileSystemProfilePath=null,
fileSystemURI=null}```
----
2020-06-06 09:30:55 UTC - Ebere Abanonu: Hi, I have been able to look into
this. PatternMultiTopicConsumer supports auto-discovery of new topics. You can
configure that with ConsumerBuilder.
----
2020-06-06 09:44:14 UTC - Liam Clarke: Okay, so using `pulsar-admin namespaces
set-offload-policies --driver aws-s3 --region ap-southeast-2 --bucket test ...
test-tenant/test-namespace` to set an explicit offload policy on the namespace
worked, so I guess my question is - is this because I was using
`standalone.conf` vs `broker.conf`? Or will I have to set a per-namespace
offload policy for a production cluster also?
----
2020-06-06 10:16:58 UTC - Adriaan de Haan: Hi, I am trying to get the jdbc io
connector working, but I keep getting the following:
```07:11:25.185 [main] INFO
org.apache.pulsar.functions.utils.io.ConnectorUtils - Searching for connectors
in /home/adriaan/apache-pulsar-2.5.2/connectors
07:11:26.013 [main] INFO org.apache.pulsar.functions.utils.io.ConnectorUtils -
Found connector ConnectorDefinition(name=jdbc, description=Jdbc sink,
sourceClass=null, sinkClass=org.apache.pulsar.io.jdbc.JdbcAutoSchemaSink) from
/home/adriaan/apache-pulsar-2.5.2/connectors/pulsar-io-jdbc-2.5.2.nar
Exception in thread "main" java.lang.NullPointerException
at
org.apache.pulsar.functions.LocalRunner.startThreadedMode(LocalRunner.java:421)
at org.apache.pulsar.functions.LocalRunner.start(LocalRunner.java:319)
at org.apache.pulsar.functions.LocalRunner.main(LocalRunner.java:152)```
NullPointerException is not very helpful in trying to debug the issue... any
advice on how I can determine what is wrong?
----
2020-06-06 10:25:01 UTC - Liam Clarke: Hi Adriaan, line 421 is
`instanceConfig.setMaxPendingAsyncRequests(functionConfig.getMaxPendingAsyncRequests());`.
maxPendingAsyncRequests in InstanceConfig is an `int` while in FunctionConfig
it's an `Integer` - if it was set to `null` in the function config, it will
throw an NPE on unboxing to an `int`.
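
A minimal sketch of the unboxing failure described above. `BoxedConfig` and `PrimitiveConfig` are hypothetical stand-ins, not the real Pulsar `FunctionConfig` and `InstanceConfig` classes; the only point is that passing a null `Integer` to a setter that takes an `int` throws a NullPointerException at the call site.

```java
public class UnboxingNpeDemo {
    // Hypothetical stand-in for a config class with a boxed field.
    static class BoxedConfig {
        Integer maxPendingAsyncRequests; // stays null if never set
    }

    // Hypothetical stand-in for a config class with a primitive field.
    static class PrimitiveConfig {
        int maxPendingAsyncRequests;

        void setMaxPendingAsyncRequests(int value) {
            this.maxPendingAsyncRequests = value;
        }
    }

    // Returns true if copying the boxed value into the primitive setter throws an NPE.
    static boolean copyThrowsNpe(BoxedConfig source) {
        PrimitiveConfig target = new PrimitiveConfig();
        try {
            // The Integer is auto-unboxed to int here; a null value fails.
            target.setMaxPendingAsyncRequests(source.maxPendingAsyncRequests);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        BoxedConfig unset = new BoxedConfig();
        BoxedConfig explicit = new BoxedConfig();
        explicit.maxPendingAsyncRequests = 1000;

        System.out.println("unset value throws NPE: " + copyThrowsNpe(unset));
        System.out.println("value 1000 throws NPE: " + copyThrowsNpe(explicit));
    }
}
```

In this sketch the stack trace would point at the line calling the setter, which is the same shape as an NPE caused by the source object itself being null, so the line number alone does not tell the two cases apart.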
----
2020-06-06 10:27:33 UTC - Liam Clarke: In both *Config classes it defaults to
1000. Are you setting it explicitly to null?
----
2020-06-06 10:28:52 UTC - Adriaan de Haan: I don't set it at all
----
2020-06-06 10:42:27 UTC - Liam Clarke: Try setting it to 1000, can't hurt and
might resolve the issue
----
2020-06-06 12:39:22 UTC - Adriaan de Haan: so the null pointer exception at
that line would imply that functionConfig is null
----
2020-06-06 12:42:14 UTC - Aaron Batilo: @Aaron Batilo has joined the channel
----
2020-06-06 12:46:28 UTC - Aaron Batilo: :wave: Hi everyone. I'm Aaron. I came
across Pulsar a few weeks ago and have been trying to push it on my
organization because I think it solves a lot of our use cases.
+1 : Enrico Olivelli, Karthik Ramasamy
----
2020-06-06 12:46:46 UTC - Adriaan de Haan: Since this is a Sink it has a
SinkConfig and not a FunctionConfig I believe... so it seems that might be why
it's failing
----
2020-06-06 12:56:14 UTC - Adriaan de Haan: Hi, can anybody please confirm that
sinks still work in v2.5.x?
----
2020-06-06 12:57:37 UTC - Adriaan de Haan: It seems that this commit:
<https://github.com/apache/pulsar/commit/55d5430701d41d92ce290d838e332eb9d9154b9e>
might have introduced a bug that will result in a null pointer exception -
since functionConfig is null for a sink, but it is using functionConfig without
checking for null
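
A rough sketch of the kind of null guard that would avoid this, assuming the value should fall back to the default of 1000 mentioned earlier in the thread. `FunctionSettings` and `resolveMaxPending` are hypothetical names for illustration; this is not the actual LocalRunner code or the fix that was applied upstream.

```java
public class NullGuardDemo {
    // Default taken from the thread above (both config classes default to 1000).
    static final int DEFAULT_MAX_PENDING = 1000;

    // Hypothetical stand-in for a function config; absent (null) when running a sink.
    static class FunctionSettings {
        Integer maxPendingAsyncRequests;

        FunctionSettings(Integer maxPendingAsyncRequests) {
            this.maxPendingAsyncRequests = maxPendingAsyncRequests;
        }
    }

    // Guards against both a missing config object and an unset boxed value.
    static int resolveMaxPending(FunctionSettings settings) {
        if (settings == null || settings.maxPendingAsyncRequests == null) {
            return DEFAULT_MAX_PENDING;
        }
        return settings.maxPendingAsyncRequests;
    }

    public static void main(String[] args) {
        System.out.println(resolveMaxPending(null));                       // sink case: no function config
        System.out.println(resolveMaxPending(new FunctionSettings(null))); // config present, value unset
        System.out.println(resolveMaxPending(new FunctionSettings(50)));   // explicit value
    }
}
```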
----
2020-06-06 13:01:17 UTC - alex kurtser: Hi @Sijie Guo
We set it up as a separate StatefulSet (separate from the brokers) with "bin/pulsar
proxy" as the entrypoint command for the container.
We also provide a function_worker.yaml config file with parameters like this:
processContainerFactory:
  extraFunctionDependenciesDir: null
  javaInstanceJarLocation: null
  logDirectory: null
  pythonInstanceLocation: null
----
2020-06-06 13:03:36 UTC - alex kurtser: Of course, we have other parameters like
pulsar endpoints and so on. Important to note that the functions are actually
working well. The only issue is with metrics. As I mentioned earlier, each
function instance inside the container exposes its metrics on a random port.
So we cannot know which port it will expose, and can't define it in the
annotations or in the prometheus config file.
----
2020-06-06 14:45:12 UTC - YounggyuChun: @YounggyuChun has joined the channel
----
2020-06-06 15:55:12 UTC - Amit Pal: @Amit Pal has joined the channel
----
2020-06-06 16:40:47 UTC - Asaf Mesika: @Asaf Mesika has joined the channel
----
2020-06-06 16:48:02 UTC - Asaf Mesika: I’ve got a couple of questions on that:
1. I searched a lot in the documentation and on the internet to answer this
exact question. Is it documented somewhere and I missed it?
2. The default behaviour means I can acknowledge to the broker,
the broker acks back, and I can still lose that information (meaning the
messages in that 1 sec will be redelivered)? From your information, is that
different from Kafka's design (out of curiosity, comparing the two)?
----
2020-06-06 17:11:28 UTC - Asaf Mesika: I’m reading a lot about Apache Pulsar to
understand how it works and understand its failure modes. One failure I couldn’t
understand yet: if I experience a complete data loss (all machines terminated,
or some corruption ruined the data dir of all ZK nodes), other than backing up ZK
disks and recovering by restoring, is there any other way to recover? Or, without ZK
data, is pulsar+bookkeeper essentially useless?
----
2020-06-06 17:19:07 UTC - Matteo Merli: Yes. ZK stores the metadata, so the
pointers to the data. If that is missing, the data is not accessible.
Though....
ZK availability is determined by the number of nodes. E.g., in a normal production
environment one would run 5 ZK nodes.
On a bare-metal deployment, that would mean that 5 disks would have to
physically break down in a very short amount of time to lose this data.
It would be **very** unlikely to happen. Sure, there's still a chance, but in
any storage system the durability guarantee cannot ever be 100%, just
approximate to that through more redundancy.
On a cloud deployment, the local VM disks are ephemeral, so it's not a good
idea to use them for ZK. Rather, you would use EBS volumes (or similar). At
that point, the data on each EBS volume is already replicated 2-way and it can
be remounted on a different VM.
Finally, it's certainly possible to take offline backups of ZK snapshot and
txn-log. You can restore ZK nodes through that.
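
As a small illustration of how node count relates to availability, here is the usual majority-quorum arithmetic, assuming a plain ZooKeeper ensemble that needs a strict majority of nodes to stay available. `QuorumMath` is a made-up helper for this example. Note this covers availability only; permanently losing the metadata would still require every copy to be destroyed, as described above.

```java
public class QuorumMath {
    // Smallest number of nodes that forms a strict majority of the ensemble.
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // Number of nodes that can be down while a majority is still reachable.
    static int tolerableFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 5, 7}) {
            System.out.println(n + " nodes: quorum=" + quorumSize(n)
                    + ", tolerable failures=" + tolerableFailures(n));
        }
    }
}
```

With 5 nodes, the ensemble in this model keeps serving requests with up to 2 nodes down.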
+1 : Asaf Mesika
----
2020-06-06 22:48:17 UTC - Nicolas Ha: the json seems fixed, but I still can’t
get to the page
<http://pulsar.apache.org/functions-rest-api/?version=2.5.1>
----