2019-10-17 09:19:35 UTC - Retardust: what parameter should I use to run a
function in Thread or Process mode?
I thought the default is Thread mode, but I see resource constraints in the
stats output and OOMs, so it looks like the functions run in Process mode.
also, how can I set the function receiver queue?
----
2019-10-17 09:20:14 UTC - Retardust: yes, until they are packed in a fat jar
with the function
----
2019-10-17 10:17:17 UTC - Dmytro Nasyrov: @Dmytro Nasyrov has joined the channel
----
2019-10-17 12:20:02 UTC - Yong Zhang: You can set it in the
`functions_worker.yml`; there is a Function Runtime Management section for
setting which mode the function will run in.
you can set the function's receiver queue with this:
```
-o, --output
The output topic of a Pulsar Function (If none is specified, no
output is written)
```
+1 : Retardust
----
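(A sketch of the section Yong Zhang refers to, as it appears in 2.4-era
`functions_worker.yml`; the exact keys vary between releases, so check the file
shipped with your version. Uncommenting `threadContainerFactory` selects Thread
mode and `processContainerFactory` selects Process mode.)
```yaml
################################
# Function Runtime Management
################################

# Thread mode: run each function in a thread inside the worker JVM
threadContainerFactory:
  threadGroupName: "function-thread-group"

# Process mode: run each function as a separate process
#processContainerFactory:
#  logDirectory:
#  extraFunctionDependenciesDir:
```
----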
2019-10-17 12:24:43 UTC - Naby: Hi @Matteo Merli. Just a follow-up on my
question. Is this a client-side limitation or part of Pulsar's architectural
design that a producer is tied to a topic? I have a message queue that has
messages for several topics, and creating a new producer for each message and
then tearing down the producer after sending does not seem efficient. Do you
have any suggestions? Thanks.
----
2019-10-17 12:31:19 UTC - Retardust: sorry, I mean the size of the incoming buffer
----
2019-10-17 13:11:12 UTC - Retardust: are there any examples of a pulsar log
topic -> ELK logstash integration?
----
2019-10-17 13:25:03 UTC - Constantinos Papadopoulos: @Constantinos Papadopoulos
has joined the channel
----
2019-10-17 13:38:27 UTC - Junli Antolovich: Thanks @Vladimir Shchur, I am still
getting the same old error `Error open RocksDB database`. I will try spawning a
Linux VM to see if it can be installed. If it still fails on the VM, I will
just give up on Pulsar.
----
2019-10-17 13:41:51 UTC - Junli Antolovich: Given the difficulties even to
install it on Windows, it would be rather tough to use Pulsar in a
Microsoft/Windows shop.
----
2019-10-17 13:42:23 UTC - Vladimir Shchur: You can also try finding a Docker
expert :slightly_smiling_face: It's probably something wrong with your
installation, since I know at least 3 environments where this combination worked
----
2019-10-17 13:44:04 UTC - Junli Antolovich: We don't have a Docker expert
around here, and per @Sijie Guo, "Currently I don't think Pulsar support
running on Windows very well."
----
2019-10-17 13:45:02 UTC - Vladimir Shchur: Yes, it doesn't run on Windows, only
on Linux. That's why Docker exists: inside it you run Linux images, including
Pulsar
----
2019-10-17 13:46:53 UTC - Junli Antolovich: Here is his reasoning on why it
failed on Docker for Windows: "Because there are a few dependencies are using
native code. The bindings we are shipping probably doesn't contain the native
libraries built for windows."
----
2019-10-17 13:47:32 UTC - Junli Antolovich: A Linux VM will make it very hard
to integrate with Windows apps.
----
2019-10-17 13:48:06 UTC - Vladimir Shchur: No, he was speaking about Pulsar on
Windows without Docker; take a look at the issue once again
----
2019-10-17 13:50:09 UTC - Vladimir Shchur: Anyway, if you are not ready to run
Pulsar on pure Linux (or something like Kubernetes) in production, I'd
recommend not running it at all
----
2019-10-17 13:50:24 UTC - Vladimir Shchur: I'm using it on Windows for
development purposes only
----
2019-10-17 13:51:40 UTC - Chris Bartholomew: Junli, I tried to get a Windows
setup working to debug this yesterday, but my machine didn't have enough
memory. I will try to wrangle a bigger machine today. Are you sure you are
using Linux containers and not Windows containers in Docker? If you use Linux
containers, I would expect it to work.
----
2019-10-17 13:53:55 UTC - Junli Antolovich: I am not using a Linux VM yet,
going to try :slightly_smiling_face:
----
2019-10-17 15:38:31 UTC - Retardust: ts in pulsar_function_user_exception is a
timestamp, am I right? If so, it's very bad for Prometheus because it will
create one time series per exception:)
----
2019-10-17 15:53:31 UTC - Retardust: Wrong documentation in the query params?
Is the REST documentation generated or written manually?
<https://pulsar.apache.org/admin-rest-api/?version=2.4.1#operation/createSubscription>
----
2019-10-17 16:39:09 UTC - Retardust: and the connector pages are dead:)
<https://pulsar.apache.org/docs/en/io-connectors/io-elasticsearch.md#sink>
----
2019-10-17 16:58:03 UTC - David Kjerrumgaard: @Retardust you don't need to
manage the receiver queue size for a function. The function is triggered on a
per event basis.
----
2019-10-17 16:58:32 UTC - David Kjerrumgaard: @Retardust The REST docs are
auto-generated.
----
2019-10-17 17:03:52 UTC - Retardust: thanks
----
2019-10-17 17:16:15 UTC - Jon Bock: Are you looking for this page?
<https://pulsar.apache.org/docs/en/io-elasticsearch/>
----
2019-10-17 17:35:57 UTC - Luke Lu: We’re seeing quite a bit of warnings like
this in most broker logs: `[bookkeeper-ml-workers-OrderedExecutor-6-0] WARN
org.apache.bookkeeper.client.BookieWatcherImpl - New ensemble: [...] is not
adhering to Placement Policy. quarantinedBookies: []`. Any explanations?
Harmless?
----
2019-10-17 18:23:17 UTC - Retardust: Thnx. The link from the connectors page is
probably broken; it points to a markdown file
----
2019-10-17 18:25:01 UTC - Addison Higham: :thinking_face: I just realized there
isn't a source or a sink for Pulsar itself... my use case:
replicating some topics from my prod env to my test/beta env. I was thinking I
could have both prod and beta clusters share the same global-zk and use the
built-in replication features... but that feels like the wrong call, since I
want to fully test upgrades/etc in my beta env
----
2019-10-17 18:26:17 UTC - Addison Higham: I see the `PulsarSink` in the code,
but I don't think you can use it as-is, since it takes its arguments via the
constructor instead of initializing everything it needs in the open method
----
2019-10-17 18:27:23 UTC - Jerry Peng: There are internal implementations for
PulsarSink/Source. What is the reason you can’t just use the built-in
replication for this?
----
2019-10-17 18:27:53 UTC - Jon Bock: Yes, we’re debugging, we think someone
checked in a change to a build step for the website that caused the links to be
broken. Still digging in to find what commit to the repo broke it.
----
2019-10-17 18:30:43 UTC - Jon Bock: The links to the connector pages are
pointing one level lower than where the pages actually reside for some reason.
----
2019-10-17 18:36:45 UTC - Poule: @Luke Lu I see those too
----
2019-10-17 18:40:42 UTC - Jerry Peng: that statistic is rate limited to 5 per
min
----
2019-10-17 18:45:18 UTC - Retardust: Same
----
2019-10-17 18:47:19 UTC - Retardust: Anyway, that info is about logs, not
metrics. 5 per min is 300 series per hour. And I see another tag that seems
to be high-dimensional in that metric. So that could be 300 series * error
types per function in the worst case
----
2019-10-17 18:47:44 UTC - Retardust:
<https://prometheus.io/docs/practices/naming/> look at the caution section
----
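(The cardinality problem Retardust points at can be shown with a small,
self-contained sketch in plain Python, no Prometheus client involved: every
distinct combination of label values is its own time series, so a label whose
value is a timestamp creates a new series on every occurrence. The metric and
label names below are illustrative, not Pulsar's actual ones.)
```python
# Each unique combination of metric name and label values identifies one
# time series in Prometheus; a label whose value is a timestamp therefore
# creates a brand-new series every time it changes.

def series_count(samples):
    """Count the distinct time series in a list of (metric, labels) samples."""
    return len({(metric, tuple(sorted(labels.items())))
                for metric, labels in samples})

# A bounded label ("error_type") vs. an unbounded one ("ts"):
bounded = [
    ("user_exception_total", {"error_type": "IOError"}),
    ("user_exception_total", {"error_type": "IOError"}),
    ("user_exception_total", {"error_type": "ValueError"}),
]
unbounded = [("user_exception_total", {"ts": str(1571330000 + i)})
             for i in range(300)]

print(series_count(bounded))    # 2 series, one per error type
print(series_count(unbounded))  # 300 series, one per timestamp
```
A bounded label like `error_type` keeps the series count small; a timestamp
label grows it without bound, which is exactly the caution in the Prometheus
naming guide linked above.
----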
2019-10-17 18:51:01 UTC - Matteo Merli: Good point. You could do that with an
“identity” function. You just configure the in/out topics, but the function
itself just returns what it gets on the input.
----
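(A sketch of the identity-function approach Matteo describes, via the admin
CLI. The class name `org.apache.pulsar.functions.api.utils.IdentityFunction`
and all tenant/topic names here are assumptions to verify against your Pulsar
version, and you may need to point `--jar` at the functions API jar.)
```
# Pass-through function: writes every input message unchanged to the output topic
pulsar-admin functions create \
  --tenant public \
  --namespace default \
  --name topic-mirror \
  --inputs persistent://public/default/src-topic \
  --output persistent://public/default/dst-topic \
  --classname org.apache.pulsar.functions.api.utils.IdentityFunction
```
----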
2019-10-17 18:51:46 UTC - Addison Higham: I think you could with some tweaks,
but looking at the code right now, I don't think it would work, as the
constructor is where it gets the client from:
```
public PulsarSink(PulsarClient client, PulsarSinkConfig pulsarSinkConfig,
                  Map<String, String> properties, ComponentStatsManager stats,
                  ClassLoader functionClassLoader) {
    this.client = client;
    this.pulsarSinkConfig = pulsarSinkConfig;
    this.topicSchema = new TopicSchema(client);
    this.properties = properties;
    this.stats = stats;
    this.functionClassLoader = functionClassLoader;
}
```
----
2019-10-17 18:52:27 UTC - Cory Davenport: Hi,
Has anyone run into a case in Node where Pulsar prevents node-mailer from
working?
----
2019-10-17 18:53:07 UTC - Addison Higham: but I don't think I could plumb
through the correct credentials as it currently stands with an identity
function? the client it builds assumes writing to the same cluster
----
2019-10-17 18:54:32 UTC - Addison Higham: since the client comes from the
constructor, but by default when a sink/source class is created it calls a
zero-arg constructor, so I think it would fail
----
2019-10-17 18:55:27 UTC - Jerry Peng: those are for internal use, since every
function/source/sink is actually composed of a source->function->sink
----
2019-10-17 18:56:02 UTC - Jerry Peng: so when you submit a source, internally
it's run as source->identity_function->pulsar_sink
----
2019-10-17 18:56:22 UTC - Jerry Peng: but connectors are not recommended to be
used for replication
----
2019-10-17 21:02:02 UTC - Rong: Hi there, not sure if this is the right
channel? - we've enabled pulsar auth on the broker and proxy; however, the
pulsar-admin command `grant-permission` returns an 'Authorization is not
enabled' 501 error. In our configMap, we have `authorizationEnabled: true` for
both broker and admin. No other info/stack trace in the logs. What could be the
issue?
----
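(A common cause of this 501: `grant-permission` is evaluated by the broker, so
`authorizationEnabled` has to reach the broker's `broker.conf`, not only the
proxy/admin config. A sketch of the broker-side keys to verify; the
`superUserRoles` value is a placeholder, and how the configMap maps onto
`broker.conf` depends on your k8s chart.)
```
# broker.conf
authenticationEnabled=true
authorizationEnabled=true
superUserRoles=admin
```
----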
2019-10-17 21:02:21 UTC - Rong: This is a deployment of pulsar in k8s cluster
----
2019-10-17 21:16:53 UTC - Addison Higham: oh sorry, I misunderstood your
original question, see my reasoning above:
This is to replicate some data from my prod envs to my beta envs; it all ties
back to a feature where we give our customers a preview of their environment in
our next release.
So while I could do native replication, that requires me to use the same global
ZK, which feels a bit yucky because then I can't treat those envs as
independently as I want. It is also only a small portion of the data
----
2019-10-17 21:17:33 UTC - Addison Higham: so it's kind of a weird edge case for
one odd way we use some data, but I just noticed it
----
2019-10-17 21:31:02 UTC - Oleg Kozlov: Hello, general messaging question: it
seems that if I produce a message to a topic w/o existing subscriptions, the
message does not get persisted or retained. Is there any configuration we can
set so that topics without subscriptions retain messages until a subscription
is created later?
----
2019-10-17 22:32:14 UTC - Poule: @Oleg Kozlov
<https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1571231986295900>
----
2019-10-17 22:34:05 UTC - Poule: We will see this question countless times, so
I think Pulsar should retain even with no subscription
heavy_plus_sign : Oleg Kozlov
----
2019-10-17 22:35:06 UTC - Matteo Merli: yes, there’s a setting for broker-wide
default:
```
# Default message retention time
defaultRetentionTimeInMinutes=0
# Default retention size
defaultRetentionSizeInMB=0
```
----
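(Alongside the broker-wide defaults Matteo quotes, retention can also be set
per namespace via the admin CLI; `public/default` and the limits below are
placeholders.)
```
# Retain acknowledged messages for 24 hours or up to 1 GB per topic
pulsar-admin namespaces set-retention public/default \
  --time 24h \
  --size 1G
```
----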
2019-10-17 22:36:04 UTC - Matteo Merli: I guess the question is more if the
default should be changed
----
2019-10-17 22:38:18 UTC - Poule: I vote for 365 * 1440 minutes
----
2019-10-17 22:39:29 UTC - Matteo Merli: that sounds like a lot of disk space to me
----
2019-10-17 22:39:32 UTC - Matteo Merli: :slightly_smiling_face:
----
2019-10-17 22:40:43 UTC - Poule: any setting that does not trigger a "where is
the message I published?" will do
----
2019-10-17 22:42:22 UTC - Matteo Merli: that won’t be enough though: you also
need to set that you want a new subscription to be created on the “earliest”
message id as well
----
2019-10-17 23:06:25 UTC - Jerry Peng: I see - feel free to write a Pulsar Sink
or modify the existing internal version
----
2019-10-17 23:06:45 UTC - Jerry Peng: though the internal version might be more
complicated than necessary
----
2019-10-17 23:25:27 UTC - Jacob O'Farrell: Hi all - on the hunt for any
existing benchmarks showing Pulsar messaging throughput etc (even better if it
is compared against Kafka, Kinesis etc). If you know of any, please send them
my way!
----
2019-10-17 23:28:44 UTC - Matteo Merli:
<https://www.slideshare.net/merlimat/high-performance-messaging-with-apache-pulsar#31>
----
2019-10-18 00:16:29 UTC - tuteng:
<https://github.com/streamnative/awesome-pulsar#logging>
+1 : Retardust
----
2019-10-18 01:04:44 UTC - Jacob O'Farrell: Thanks! Great presentation
----
2019-10-18 01:27:34 UTC - Oleg Kozlov: thank you, this helps. it seems that if
we don't want to retain ack'd messages - to avoid duplicate deliveries - we
have to make sure we create subscriptions for all topics before we start
sending data
----
2019-10-18 01:29:29 UTC - Oleg Kozlov: i think it would be very helpful to be
able to set retention, but with an asterisk: the messages are retained until
EITHER the retention window expires, OR the message is ack'd
----
2019-10-18 01:29:51 UTC - Oleg Kozlov: probably a separate config flag
----
2019-10-18 01:32:04 UTC - Oleg Kozlov: i think a common use case is to retain
messages until they are ack'd, regardless of whether there is an already
existing subscription or not... otherwise i don't want to waste disk space on
ack'd messages, but i also don't want to have to remember to create a
subscription for new topics to make sure msgs are not lost
----
2019-10-18 01:33:55 UTC - Oleg Kozlov: it's obviously not a blocker, just
complicates our workflow
----
2019-10-18 01:47:19 UTC - Chris Bartholomew: @Junli Antolovich Take a look at
the comment I put in the issue you opened for this. The revised command works
for me reliably in PowerShell or CMD.
----
2019-10-18 03:12:43 UTC - Ron Wheeler: @Ron Wheeler has joined the channel
----
2019-10-18 03:45:13 UTC - Matteo Merli: that is already possible. It's the TTL:
messages are expired if a consumer doesn't ack within a certain timeout:
<https://pulsar.apache.org/docs/en/cookbooks-retention-expiry/#time-to-live-ttl>
----
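(The TTL Matteo mentions is configured per namespace; `public/default` and the
120-second value below are placeholders.)
```
# Expire messages that are not acknowledged within 120 seconds
pulsar-admin namespaces set-message-ttl public/default \
  --messageTTL 120
```
----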
2019-10-18 03:45:35 UTC - Matteo Merli: The default is TTL disabled (to avoid
unexpected data loss)
----
2019-10-18 06:41:35 UTC - Retardust: Do you have any plans to migrate
pulsar-manager to another storage? I mean, we already have ZooKeeper and
BookKeeper and Pulsar itself:) and now we need Postgres just for the dashboard.
Kafka Manager stores its data in ZooKeeper as far as I remember:)
----
2019-10-18 07:20:11 UTC - Sijie Guo: @Retardust you can use any database that
supports JDBC. By default we add Postgres and HerdDB. HerdDB is a database
library built on BookKeeper, so you can use it for pulsar-manager and don't
need an extra database. @Enrico Olivelli wrote a blog post about it. We will
publish it next week.
+1 : Retardust
----
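(A sketch of what pointing pulsar-manager at HerdDB might look like.
pulsar-manager is a Spring Boot application, so the datasource goes in
`application.properties`; the driver class, URL, and credentials below are
assumptions to check against Enrico's upcoming post and the HerdDB docs.)
```
# application.properties (hypothetical values)
spring.datasource.driver-class-name=herddb.jdbc.Driver
spring.datasource.url=jdbc:herddb:local
spring.datasource.username=sa
spring.datasource.password=hdb
```
----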