2020-03-11 09:20:45 UTC - Steven Le Roux: No, since Pulsar needs working 
software :=) Seriously, if you really don't want to go with ZK, you could try 
zetcd, which is a ZK API in front of etcd. IMO you shouldn't do this. ZooKeeper 
is a great piece of software that survived Jepsen better than etcd. And working 
on etcd for K8s really makes me want to replace etcd everywhere it's possible.
----
2020-03-11 09:22:02 UTC - Ali Ahmed: @xue sort of. There is work underway to 
make the metadata store pluggable; it’s not complete yet. Once done, etcd or 
any backing store can be used.
----
2020-03-11 09:22:12 UTC - Ali Ahmed: currently zk is required
----
2020-03-11 09:22:16 UTC - Florentin Dubois: Hi Xue, why do you want to use 
etcd? Having operated both in production, ZooKeeper is way better in all 
categories. Besides, it generates no operational tasks compared to etcd.
----
2020-03-11 09:25:14 UTC - xue: I see that bookkeeper supports etcd after 
version 4.9.0, so I want to confirm whether pulsar cluster supports etcd
----
2020-03-11 09:32:43 UTC - Florentin Dubois: You could use an etcd cluster for 
one bookkeeper cluster, but pulsar needs a dedicated zookeeper cluster for 
sharing namespaces, configurations and so on across pulsar clusters
----
2020-03-11 09:34:16 UTC - xue: I see, thank you!
----
2020-03-11 10:07:35 UTC - Abhilash Mandaliya: hi
does a pulsar connector handle crashes? I mean, let’s say I have my own sink 
connector and an error occurred that crashed my java client. 
Will pulsar know about this and resend all the messages starting from that 
message, or will it only give me messages from the time I start my sink again?
----
2020-03-11 11:54:12 UTC - Ildefonso Junquero: @Ildefonso Junquero has joined 
the channel
----
2020-03-11 12:10:21 UTC - Ian: Thank you @Sijie Guo and @Joe Francis
----
2020-03-11 12:13:52 UTC - Ildefonso Junquero: Hello, I'm starting to play with 
pulsar functions, and I haven't found any way to specify the subscription 
initial position. I have investigated the source code in 
FunctionConfigUtils.convert, and I see there is no way to retrieve that 
configuration information and set up the final PulsarConsumerConfig. I have 
taken a look at LocalRunner.start, which uses 
FunctionConfigUtils.convert(functionConfig, classLoader). This method should 
configure the subscriptionPosition in sourceSpec, which is used later in 
JavaInstanceRunnable.setupInput to set up pulsarSourceConfig.

Now, my question is: is this a missing feature, or is there any suggestion to 
stop a function from consuming messages from Latest and force it to consume 
from Earliest?
eyes : Konstantinos Papalias
----
2020-03-11 12:17:48 UTC - Ildefonso Junquero: Another topic I'd like to raise 
is that Pulsar SQL (presto) does not support topic names with uppercase 
characters. For instance, if you create a topic named MyTopic, you can 
subscribe and presto shows there is a mytopic table, but I haven't found a way 
to query that topic because it always returns a table not found error. If I 
create the topic mytopic (all lowercase) it works.

I have tried different options in the SQL syntax, with no success. Any 
suggestion?
----
2020-03-11 14:11:54 UTC - Michael Kaufman: @Michael Kaufman has joined the 
channel
----
2020-03-11 15:05:53 UTC - David Kjerrumgaard: I would recommend using unit 
tests for most of the testing, with local development. This allows me to set 
breakpoints inside the debugger. Once I move to localrun mode, I use LOG 
statements to trace the flow of messages that are problematic.
----
2020-03-11 15:29:19 UTC - Ming: I can only speak from MySQL experience. MySQL 
can support mixed case. This is the document on how MySQL supports mixed case: 
<https://dev.mysql.com/doc/refman/5.6/en/identifier-case-sensitivity.html> If 
you read carefully, MySQL case sensitivity support depends on the underlying 
OS. If you would like to make your SQL portable, I would strongly advise 
against camel case; that is why SQL databases commonly use underscores. In 
Pulsar SQL, I switch everything to lowercase.
+1 : Ildefonso Junquero
----
2020-03-11 16:27:45 UTC - Pierre-Yves Lebecq: When localrun mode works 
correctly, which is not my case when trying to use state. :sweat_smile: Unit 
testing is a good suggestion though. Thank you for your help.
----
2020-03-11 16:29:29 UTC - David Kjerrumgaard: I haven't tried debugging with 
localrun mode, so I wouldn't be of much help in fixing that for you. However, I 
am confident that it can be done, and that it is more of a documentation issue 
than a technical issue. :smiley:
----
2020-03-11 16:34:38 UTC - Pierre-Yves Lebecq: For sure. I’m not familiar with 
the Java world; it’s the first time I've tried to run something in Java. I know 
I’m missing some knowledge about running java code, packaging it in a jar file, 
etc. I find it quite difficult to get into. There are a lot of things to learn. 
It’s not Pulsar’s fault, but for sure the docs are not beginner friendly! 
Anyway, I really appreciate that you took some time to help me. Cheers!
+1 : David Kjerrumgaard
----
2020-03-11 17:00:11 UTC - John G: @John G has joined the channel
----
2020-03-11 17:08:57 UTC - Ildefonso Junquero: Understood and agreed. Anyway, in 
my mind I created a topic, not a table, and I never read anything saying that 
the topic shouldn't use capital letters in the name due to a "conflict" with 
Pulsar SQL (Presto). But in the end, I reached the conclusion that I should 
avoid capital letters in topics. I think this could be explained in the pulsar 
doc to warn future users.
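
A naming rule like this could be enforced before topics are created. A minimal sketch, assuming only what was reported in this thread (an all-lowercase topic name works with Pulsar SQL); `local_topic_name` and `is_sql_safe_topic` are hypothetical helpers, not part of Pulsar:

```python
def local_topic_name(topic: str) -> str:
    """Return the last path segment of a topic name such as
    persistent://public/default/MyTopic."""
    return topic.rsplit("/", 1)[-1]


def is_sql_safe_topic(topic: str) -> bool:
    """Hypothetical check: True when the local topic name contains no
    uppercase characters."""
    name = local_topic_name(topic)
    return name == name.lower()
```

The check only looks at the topic's own name; whether tenant or namespace names are affected in the same way was not discussed here.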
----
2020-03-11 17:35:34 UTC - Antti Kaikkonen: I had this same issue some weeks ago 
and found this solution:
1. `./bin/pulsar-admin topics create-subscription --messageId earliest 
--subscription testsub persistent://public/default/topicname`
2. Create your function using `--subs-name testsub`
+1 : Ildefonso Junquero
----
2020-03-11 17:36:06 UTC - Antti Kaikkonen: --messageId can also be 'latest' or 
(ledgerId:entryId)
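
The three accepted forms could be checked in a script before calling the CLI. A minimal sketch; `parse_message_id` is a hypothetical helper, not a Pulsar API, and it only mirrors the forms listed in this thread:

```python
from typing import Tuple, Union


def parse_message_id(value: str) -> Union[str, Tuple[int, int]]:
    """Accept 'earliest', 'latest', or 'ledgerId:entryId' (two integers).

    Raises ValueError for anything else.
    """
    if value in ("earliest", "latest"):
        return value
    ledger, sep, entry = value.partition(":")
    if not sep:
        raise ValueError(f"unsupported messageId: {value!r}")
    # int() raises ValueError itself if either part is not a number.
    return int(ledger), int(entry)
```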
----
2020-03-11 17:40:41 UTC - Antti Kaikkonen: > Now, my question is: is this a 
missing feature or is there any suggestion to avoid a function to consume 
messages since Earliest, and force it to consume since Latest?
Isn't latest the default?
----
2020-03-11 18:01:05 UTC - Ildefonso Junquero: Yes, my mistake. It should say 
avoid ... Latest and force it to consume Earliest. Thank you. Original message 
corrected.
----
2020-03-11 18:12:43 UTC - Ildefonso Junquero: I have tested your workaround and 
it works! Thank you. :star-struck:
----
2020-03-11 18:33:20 UTC - Antti Kaikkonen: No problem. --subs-name may not be 
needed if you create the subscription with the name that the function uses by 
default, but I'm not sure what that is.
----
2020-03-11 19:17:50 UTC - Alexander Ursu: Been running mysql jdbc sink 
connectors on some topics, but one seems to be showing this error in the logs, 
not sure what it means.
```19:16:25.352 [pool-5-thread-1] ERROR org.apache.pulsar.io.jdbc.JdbcAbstractSink - Got exception
java.lang.NullPointerException: null
        at org.apache.pulsar.client.impl.schema.generic.GenericJsonRecord.getField(GenericJsonRecord.java:49) ~[pulsar-client-original-2.5.0.jar:2.5.0]
        at org.apache.pulsar.io.jdbc.JdbcAutoSchemaSink.bindValue(JdbcAutoSchemaSink.java:63) ~[pulsar-io-jdbc-2.5.0.nar-unpacked/:?]
        at org.apache.pulsar.io.jdbc.JdbcAbstractSink.flush(JdbcAbstractSink.java:200) ~[pulsar-io-jdbc-2.5.0.nar-unpacked/:?]
        at org.apache.pulsar.io.jdbc.JdbcAbstractSink.lambda$open$0(JdbcAbstractSink.java:108) ~[pulsar-io-jdbc-2.5.0.nar-unpacked/:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_232]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_232]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_232]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_232]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_232]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_232]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
19:16:25.352 [pool-5-thread-1] ERROR org.apache.pulsar.io.jdbc.JdbcAbstractSink - Update count 0  not match total number of records 7```
----
2020-03-11 19:18:39 UTC - Alexander Ursu: If helpful, these are the stats of 
the sink
```{
  "numInstances" : 1,
  "numRunning" : 1,
  "instances" : [ {
    "instanceId" : 0,
    "status" : {
      "running" : true,
      "error" : "",
      "numRestarts" : 0,
      "numReadFromPulsar" : 28267,
      "numSystemExceptions" : 0,
      "latestSystemExceptions" : [ ],
      "numSinkExceptions" : 0,
      "latestSinkExceptions" : [ ],
      "numWrittenToSink" : 28267,
      "lastReceivedTime" : 1583954291838,
      "workerId" : "c-pulsar-cluster-1-fw-475eab276fee-8080"
    }
  } ]
}```
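
Stats in this shape can be read programmatically to compare what each instance read with what it wrote. A minimal sketch; the field names are taken from the output above, while `sink_backlog` is a hypothetical helper and not part of any Pulsar client:

```python
import json
from typing import Dict


def sink_backlog(stats_json: str) -> Dict[int, int]:
    """Map instanceId -> messages read from Pulsar but not written to the sink."""
    stats = json.loads(stats_json)
    return {
        inst["instanceId"]: inst["status"]["numReadFromPulsar"]
        - inst["status"]["numWrittenToSink"]
        for inst in stats["instances"]
    }
```

For the stats shown above this gives a difference of zero for instance 0, so the counters alone do not point at the failing records; the log is the better source here.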
----
2020-03-11 19:27:33 UTC - Antti Kaikkonen: I'm trying to deploy a single node 
bare metal cluster and I'm getting
```Exception in thread "Thread-3" java.lang.IllegalStateException: State is not enabled.
        at com.google.common.base.Preconditions.checkState(Preconditions.java:507)
        at org.apache.pulsar.functions.instance.ContextImpl.ensureStateEnabled(ContextImpl.java:262)
        at org.apache.pulsar.functions.instance.ContextImpl.putStateAsync(ContextImpl.java:299)
...```
I also have 
`extraServerComponents=org.apache.bookkeeper.stream.server.StreamStorageLifecycleComponent`
in bookkeeper.conf
----
2020-03-11 19:43:27 UTC - Tim Corbett: @Tim Corbett has joined the channel
----
2020-03-11 20:50:07 UTC - David Kjerrumgaard: @Alexander Ursu It appears that 
there is a mismatch between the DB schema and the incoming JSON based data.
----
2020-03-11 21:03:55 UTC - Kirill Merkushev: Hello, does anybody know how to 
clean up stale subscriptions from the pulsar stats? I’ve already unsubscribed 
them and cleared the backlog, but they're still there when calling 
`./pulsar-admin topics stats` or the `/metrics` endpoint. How do I get rid of 
them?
----
2020-03-11 21:07:10 UTC - Kirill Merkushev: Also maybe someone knows if there 
is a way to configure pulsar ns/tenants/topics via code with terraform/yml 
style config, so it could be like a control plane for Envoy and could be stored 
in github, versioned and automated on the template generation layer, but handle 
the difference automatically?
----
2020-03-11 21:10:21 UTC - Chris Bartholomew: Did you delete them with the admin 
CLI?
```bin/pulsar-admin topics unsubscribe
The following option is required: -s, --subscription 

Delete a durable subscriber from a topic. 
                The subscription cannot be deleted if there are any active consumers attached to it 

Usage: unsubscribe [options] persistent://tenant/namespace/topic
  Options:
  * -s, --subscription
       Subscription to be deleted```

----
2020-03-11 21:10:39 UTC - Kirill Merkushev: yep
----
2020-03-11 21:11:41 UTC - Chris Bartholomew: And do you have consumers 
connected to the topic that might be recreating them?
----
2020-03-11 21:12:52 UTC - Kirill Merkushev: no, thats for sure
----
2020-03-11 21:13:45 UTC - Kirill Merkushev: actually those subs were created by 
a script with uuid as a name of a consumer, which then disconnected at some 
point
----
2020-03-11 21:14:15 UTC - Kirill Merkushev: so I cleaned the traces with cli, 
but can’t get rid of the stats
----
2020-03-11 21:15:26 UTC - Chris Bartholomew: Hmm...that's strange. That works 
to get subscriptions out of the stats output for me. I use the REST API, but 
that is what the CLI is doing behind the scenes.
----
2020-03-11 21:15:47 UTC - Kirill Merkushev: btw topic is partitioned and these 
subscriptions are invisible on the topic name without partition
----
2020-03-11 21:18:23 UTC - Kirill Merkushev: okay, found the issue, I should 
unsubscribe it from each partition individually, thanks, good to have a place 
to ask questions :smile:
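
Each partition of a partitioned topic is addressed as its own topic with a `-partition-N` suffix, so the per-partition cleanup can be scripted. A minimal sketch that only builds the command strings; `partition_topics` and `unsubscribe_commands` are hypothetical helpers, and the topic, subscription name, and partition count are placeholders to fill in:

```python
from typing import List


def partition_topics(topic: str, num_partitions: int) -> List[str]:
    """Return the topic name of each individual partition."""
    return [f"{topic}-partition-{i}" for i in range(num_partitions)]


def unsubscribe_commands(topic: str, num_partitions: int, subscription: str) -> List[str]:
    """Build one `pulsar-admin topics unsubscribe` command per partition."""
    return [
        f"bin/pulsar-admin topics unsubscribe -s {subscription} {t}"
        for t in partition_topics(topic, num_partitions)
    ]
```

Printing the result of `unsubscribe_commands("persistent://tenant/namespace/topic", 4, "my-sub")` gives four commands, one for each of `-partition-0` through `-partition-3`, which can be reviewed before running them.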
----
2020-03-11 21:29:52 UTC - Alexander Ursu: The only issue I can think of is 
maybe null values being sent for some keys in the json data, but every column 
in the mysql schema for the table is nullable by default, so this should be 
fine, right? It doesn't seem to mention what specific column or key is causing 
the issue, so it's hard to say.
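
Since the stack trace does not name the offending field, one way to narrow it down is to compare a sample message against the table's column list. A minimal sketch; `suspect_fields` is a hypothetical helper, the column names are placeholders, and it does not reproduce the connector's internal logic:

```python
import json
from typing import Dict, List


def suspect_fields(message_json: str, columns: List[str]) -> Dict[str, List[str]]:
    """Report columns that are absent from the message or explicitly null."""
    record = json.loads(message_json)
    missing = [c for c in columns if c not in record]
    null = [c for c in columns if c in record and record[c] is None]
    return {"missing": missing, "null": null}
```

Running it over a few messages from the affected topic would show whether absent keys or explicit nulls line up with the failing batches.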
----
2020-03-11 21:41:47 UTC - Greg Methvin: Can pulsar do deduplication based on a 
message key? I essentially want topic compaction but where it discards 
subsequent messages if the key is the same as an existing message. Perhaps what 
I’m describing can be done in some other way though.
----
2020-03-11 21:49:17 UTC - David Kjerrumgaard: Do you have the types defined as 
"optional" in the JSON schema?  (I am assuming it is JSON and not Avro 
converted to JSON)
----
2020-03-11 22:14:36 UTC - Kirill Merkushev: hash map? :slightly_smiling_face:
----
2020-03-11 22:23:58 UTC - Greg Methvin: yes, you could use some kind of 
distributed hash map, or redis
----
2020-03-11 22:25:25 UTC - Greg Methvin: but it’d be convenient to have it done 
by pulsar so we don’t have to coordinate state in two different places
----
2020-03-11 22:36:26 UTC - Kirill Merkushev: bookkeeper directly then? Pretty 
sure it can serve this purpose, since in pulsar functions there is a context 
which handles key-value case
----
2020-03-11 23:07:07 UTC - Sijie Guo: it seems that load balancing was triggered 
and namespace bundles are offloaded.
----
2020-03-11 23:07:24 UTC - Sijie Guo: Can you check your cpu usage?
----
2020-03-11 23:08:39 UTC - Andy Papia: Yeah I think it was an overload. I've 
moved on.
----
2020-03-11 23:09:30 UTC - Sijie Guo: 
<https://github.com/streamnative/terraform-provider-pulsar>
----
2020-03-11 23:10:04 UTC - Sijie Guo: We have developed a terraform provider for 
provisioning tenants/namespaces/topics. Not sure if that is something you are 
looking for.
----
2020-03-11 23:13:34 UTC - Sijie Guo: This sounds like a simple feature to add 
to topic compaction. The current implementation of topic compaction overwrites 
keys, and what you want is to drop a message if its key already exists.

It should be simple to introduce a flag to control duplicated key behavior in 
topic compaction.

• overwrite (current behavior)
• drop (new behavior)
Then users can configure the compaction behavior through namespace 
settings.

maybe raise a github issue?
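
The difference between the two behaviors can be illustrated on an in-memory list of keyed messages. This is a sketch of the semantics only, not of Pulsar's compaction implementation; `compact` and its `on_duplicate` parameter are hypothetical names:

```python
from typing import Dict, Iterable, Tuple


def compact(
    messages: Iterable[Tuple[str, str]], on_duplicate: str = "overwrite"
) -> Dict[str, str]:
    """Keep one value per key.

    overwrite: the latest value for a key wins.
    drop: the first value for a key wins and later ones are discarded.
    """
    if on_duplicate not in ("overwrite", "drop"):
        raise ValueError(f"unknown mode: {on_duplicate!r}")
    result: Dict[str, str] = {}
    for key, value in messages:
        if on_duplicate == "drop" and key in result:
            continue  # key already seen: discard the later message
        result[key] = value
    return result
```

With messages `[("a", "1"), ("b", "2"), ("a", "3")]`, overwrite keeps `"3"` for key `a`, while drop keeps `"1"`.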
----
2020-03-11 23:13:52 UTC - Sijie Guo: okay
----
2020-03-11 23:25:57 UTC - Antti Kaikkonen: Is ECC memory recommended for pulsar 
deployments?
----
2020-03-12 01:33:07 UTC - Andy Papia: I've mostly run stateless apps in K8s up 
to now.  How should I think about Pulsar on K8s in AWS?  If I'm trying to cost 
optimize, I assume using persistent volume claims I'll need to run the bookie 
and zk pods 24x7 in order to keep their volumes available.  Since the brokers 
are stateless I assume they can be dynamically scaled with the autoscaler based 
on some metric.  So will I have a static ZK and bookie cluster that I can scale 
out when I need more throughput?  The volumes themselves can be resized if I 
need more storage.  Is it possible to use a distributed filesystem like EFS 
with Bookkeeper?
----
2020-03-12 04:27:45 UTC - Jeon.DeukJin: Hello, this page doesn’t display.
<https://pulsar.apache.org/docs/en/reference-connector-admin/#sinks>
----
2020-03-12 04:28:19 UTC - Jeon.DeukJin: empty page.
----
2020-03-12 04:28:31 UTC - Jeon.DeukJin: also, 
<https://pulsar.apache.org/docs/en/reference-connector-admin/#sources>
----
2020-03-12 04:29:09 UTC - Jeon.DeukJin: and, then, Korean page ~!!
<https://pulsar.apache.org/docs/ko-KR/standalone>
----
2020-03-12 04:29:31 UTC - Jeon.DeukJin: Not Found error.
----
2020-03-12 04:29:42 UTC - Jeon.DeukJin: Please fix it.
----
2020-03-12 04:57:06 UTC - Greg Methvin: sounds good. I reported an issue here: 
<https://github.com/apache/pulsar/issues/6526>
----
2020-03-12 05:35:56 UTC - tuteng: This doc moved to 
<https://pulsar.apache.org/docs/en/io-use/#sink-2>
----
2020-03-12 06:53:58 UTC - Devin G. Bost: @Jeon.DeukJin I've already reported 
this in a Github issue, so we're aware of it. Thanks for letting us know it's 
still an issue.
----
2020-03-12 06:54:44 UTC - Devin G. Bost: @tuteng I have an open pulsar issue 
for this. None of the foreign language pages are working.
FYI @jia zhai
----
2020-03-12 06:55:49 UTC - Devin G. Bost: 
<https://github.com/apache/pulsar/issues/6470>
----
2020-03-12 07:22:53 UTC - Aravindhan: @Aravindhan has joined the channel
----
2020-03-12 09:10:04 UTC - Aravindhan: Hi all, I am using a pulsar io source 
connector to pull messages from Kafka into a Pulsar topic. I need to do some 
data transformation before taking it into the application for processing. One 
way is to write a pulsar function for the transformation and get the required 
messages in the function's destination topic.

Is it possible to override the pulsar io source connector so that it does the 
transformation as well, which would eliminate the intermediate topic (the 
input to the pulsar function)?
----
