2019-08-15 11:53:14 UTC - Jan-Pieter George: Great news!
Would be happy to give this a test-drive if that'd be helpful.
+1 : dba
----
2019-08-15 12:59:36 UTC - MichaelM: @MichaelM has joined the channel
----
2019-08-15 13:03:20 UTC - Aaron: Not in the server logs, no. I don't understand 
how there could be missing key-value pairs if the JSON is generated from the 
same POJO every time
----
2019-08-15 13:11:22 UTC - Ming Fang: Are there any advantages to running 
multiple Pulsar components on the same host, besides making it easier to start?
For example, if Broker and Bookie are on the same host then do they know and 
take advantage of that?
I do see a potential problem with running Functions Worker with Broker since a 
bad function can take resources away from the Broker.
----
2019-08-15 13:25:33 UTC - Alexandre DUVAL: @Sijie Guo poke
----
2019-08-15 15:36:03 UTC - Addison Higham: @Ming Fang no real advantage. In the 
case of publishes, you are communicating with multiple bookies and waiting for 
them all to confirm, so there really isn't any advantage to be gained there. In 
the case of tailing reads (i.e. a subscription that is caught up to the "edge" 
of the topic), in most cases you won't hit a bookie at all since the broker 
buffers the message. In the case of catch-up reads, where the broker no longer 
has the message and needs to fetch it from a bookie, you could *maybe* squeeze 
out a small latency advantage, but it's likely better to have them on separate 
hosts and get better isolation
----
2019-08-15 15:38:52 UTC - Ming Fang: @Addison Higham Thanks for the detailed 
response. It was very helpful.
----
2019-08-15 15:39:36 UTC - Addison Higham: np! this blog post is super helpful 
in getting a more detailed idea of how Pulsar works: 
<https://jack-vanlightly.com/blog/2018/10/2/understanding-how-apache-pulsar-works>;
 also, the Streamlio blog posts contain a lot of great technical details
100 : Ming Fang
----
2019-08-15 15:45:00 UTC - Raman Gupta: Getting tired of endless weird errors 
from Kafka. Researching alternatives, and Pulsar looks promising. Obviously 
this is a biased community, and this is a very generic/open-ended question, but 
can anyone speak to experiences migrating from Kafka+Avro+Confluent Schema 
Registry+streams to Pulsar?
----
2019-08-15 15:49:01 UTC - Ming Fang: @Addison Higham Thanks for the link. It’s 
an excellent article indeed.  Btw I’m building my Pulsar stack using Terraform 
and Kubernetes here 
<https://github.com/mingfang/terraform-provider-k8s/tree/master/modules/pulsar>.
  It does require my own plugin and fork of Terraform but can be a good 
reference for anyone.
----
2019-08-15 15:56:51 UTC - Addison Higham: interesting, where is your fork of 
terraform? curious what all you had to change there. Yeah, so we are starting 
in on a Terraform provider for Pulsar resources themselves, like managing 
namespaces/topics/etc. We're just waiting to get the last approval, and then we 
will be doing that development in the open and will post here
----
2019-08-15 15:58:03 UTC - Ming Fang: It’s a simple but critical change here 
<https://github.com/mingfang/terraform/commit/a451ae6ab50108d350ac7a17e3f499c58c5615d2>
----
2019-08-15 15:59:29 UTC - Ming Fang: Basically Terraform never passes the 
“actual” config to the plugin, but rather an interpreted version. And in the 
case of Kubernetes, its interpretation can be wrong, so my plugin needed the 
original config to be passed in.
----
2019-08-15 16:02:02 UTC - Ming Fang: The use case is when you remove something 
from the config. TF does not have a concept of removing things, so it will 
merge the previous state (the stuff you want to remove) with your config (from 
which you tried to remove something). The plugin ends up not able to tell that 
something was removed.
----
2019-08-15 16:03:19 UTC - Addison Higham: :thinking_face: seems like they will 
accept it upstream? Your k8s provider looks really nice! I can probably make a 
custom provider for k8s fly, but a fork of TF will be tougher for me to get 
people to use :stuck_out_tongue:
----
2019-08-15 16:05:38 UTC - Ming Fang: I submitted PR 
<https://github.com/hashicorp/terraform/pull/21218> but it’s still open.
----
2019-08-15 16:34:28 UTC - Raman Gupta: This is what I have documented so far in 
terms of advantages/disadvantages for us: 
<https://docs.google.com/document/d/11lw2cFABwZvqHi-l20Zm2fe1BsQ2F6D5MzxFwbBuN5Y/edit?usp=sharing>,
 comments/corrections are welcome!
----
2019-08-15 17:31:48 UTC - Ali Ahmed: @Ming Fang there are advantages to running 
all components in one instance even for production. Think edge or IoT 
deployments: you can have lots of small instances serving and processing data 
at the edge and replicating data to a datacenter asynchronously.
----
2019-08-15 17:33:46 UTC - Ali Ahmed: @Raman Gupta some general comments on 
your doc:
Pulsar supports Protobuf for schemas as well.
Tiered storage is also available for Hadoop and any object store that is 
supported by jclouds or has an S3 API, for example MinIO.
----
2019-08-15 17:34:43 UTC - Ali Ahmed: Pulsar Functions have a state store built 
in; it's fully replicated on BookKeeper
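For reference, a minimal sketch of that state API in a Java function (the class 
name and counter keys are illustrative, and it assumes state storage is enabled 
on the functions worker):
```
// Sketch: a Java Pulsar Function using the built-in, BookKeeper-backed state store.
// Class name and counter keys are illustrative only.
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

public class WordCountFunction implements Function<String, Void> {
    @Override
    public Void process(String input, Context context) {
        for (String word : input.split("\\s+")) {
            // Counters are persisted to BookKeeper, so every instance of the
            // function sees the same state.
            context.incrCounter(word, 1);
        }
        return null;
    }
}
```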
----
2019-08-15 17:35:38 UTC - Ali Ahmed: depends on the JSON serializer.
----
2019-08-15 17:36:06 UTC - Aaron: I'm using JSONSchema.of
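For context, a minimal sketch of that usage pattern (the POJO, topic, and 
service URL are made-up examples, not from this thread):
```
// Sketch: producing with Pulsar's POJO-based JSON schema.
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.impl.schema.JSONSchema;

public class JsonSchemaExample {
    public static class Event {
        public String id;
        public Long value; // whether a null here is written as "value": null or omitted
                           // depends on how the underlying JSON mapper is configured
    }

    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        Producer<Event> producer = client.newProducer(JSONSchema.of(Event.class))
                .topic("persistent://public/default/events")
                .create();

        Event e = new Event();
        e.id = "abc";
        producer.send(e);

        producer.close();
        client.close();
    }
}
```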
----
2019-08-15 17:36:28 UTC - Ali Ahmed: you can embed Pulsar Functions in another 
JVM; it just needs to be documented better
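A rough sketch of one way to do that with the LocalRunner shipped with Pulsar 
(the class and topic names are hypothetical, and the available builder options 
depend on the Pulsar version):
```
// Rough sketch: running a function inside your own JVM with Pulsar's LocalRunner
// (pulsar-functions-local-runner dependency). Names here are hypothetical.
import java.util.Collections;

import org.apache.pulsar.common.functions.FunctionConfig;
import org.apache.pulsar.functions.LocalRunner;

public class EmbeddedFunctionRunner {
    public static void main(String[] args) throws Exception {
        FunctionConfig config = new FunctionConfig();
        config.setName("word-count");
        config.setClassName("com.example.WordCountFunction");
        config.setRuntime(FunctionConfig.Runtime.JAVA);
        config.setInputs(Collections.singleton("persistent://public/default/sentences"));

        LocalRunner runner = LocalRunner.builder()
                .brokerServiceUrl("pulsar://localhost:6650")
                .functionConfig(config)
                .build();
        runner.start(false); // false = don't block the calling thread
    }
}
```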
----
2019-08-15 17:37:45 UTC - Guillaume Braibant: A thing I haven't seen in your 
advantages list is that Pulsar does not need a third party (like Kafka + 
RabbitMQ) to act as a message queue, thanks to shared subscriptions.

It is the main reason why we chose Pulsar over Kafka for a PoC where requests 
were sent to a distribution layer (a topic in Pulsar) to be processed by one 
node among an indefinite number of nodes.

In your disadvantages list, you mention Kafka Streams and its queryable state 
stores. If I remember correctly, those state stores are local to each Kafka 
Streams instance. I know you can store some state in your bookies with the 
Pulsar Functions SDK, but I don't know whether that state is available to all 
your function instances or not.
----
2019-08-15 17:39:01 UTC - Ali Ahmed: it's available to all function instances, 
since it's written to BookKeeper.
slightly_smiling_face : Guillaume Braibant
----
2019-08-15 17:40:44 UTC - Guillaume Braibant: And another advantage is that 
writing a Pulsar Function requires less boilerplate code than a Kafka Streams 
application and provides metrics and logging out of the box
----
2019-08-15 17:42:16 UTC - Ali Ahmed: Functions were written so there's no 
learning curve for Java developers.
----
2019-08-15 17:44:19 UTC - Raman Gupta: Great feedback guys, thanks. Good to 
know I can embed functions in existing JVMs, which makes managing them a bit 
easier. Love that function state is replicated and available to all instances. 
For really super-simple functions, deploying them into the brokers is a nice 
option to have. It looks like Jclouds supports Azure Blob for tiered storage, 
so that's great too.
----
2019-08-15 17:44:47 UTC - Raman Gupta: Great point about supporting the message 
queue use case.
----
2019-08-15 17:46:06 UTC - Raman Gupta: I thought I saw something in the Pulsar 
docs about supporting the message request/reply case, but now can't seem to 
find it. Or was that NATS I'm thinking about?
----
2019-08-15 17:47:59 UTC - Ali Ahmed: @Raman Gupta there are no current plans to 
support a request/response model; it can be done, but it hasn't been requested
----
2019-08-15 17:48:56 UTC - Raman Gupta: @Ali Ahmed No worries, we could 
implement it easily ourselves, if we needed it. We don't use it now with Kafka.
----
2019-08-15 17:50:28 UTC - Ali Ahmed: there is middleware that emulates 
request/response on top of Kafka; I don't know how well any of it works.
----
2019-08-15 17:51:16 UTC - Raman Gupta: Is the max message size PIP in 2.4? It 
seems to be but the PIP still shows as PENDING. We do currently send some 
messages up to 5 MB.
----
2019-08-15 17:57:28 UTC - Ali Ahmed: 5 MB is the largest message size (it 
depends on configs), but sending messages over 1 MB is not really recommended 
for pub/sub systems.
----
2019-08-15 18:01:47 UTC - Raman Gupta: Yeah, it's likely we'll move this large 
data into a blob storage system instead, and just link to it. Though it would 
be super-nice if we could avoid that work via the PIP for chunking.
----
2019-08-15 18:06:16 UTC - Aaron: @Ali Ahmed Is there another serializer I 
should use?
----
2019-08-15 19:43:04 UTC - Ali Ahmed: potentially even use BK as an underlying 
blob storage in the future
----
2019-08-15 19:47:17 UTC - Raman Gupta: Interesting idea. It doesn't seem like 
the API is ideal for that though.
----
2019-08-15 19:49:31 UTC - Ali Ahmed: it has been tried and used in production 
<https://github.com/diennea/blobit>
+1 : Raman Gupta
----
2019-08-15 20:28:09 UTC - Tarek Shaar: My Java consumer seems to not get 
messages beyond a certain number. I have a producer sending batches of messages 
(roughly 200 messages every minute), and the consumer subscribes using a regexp 
pattern (<persistent://tenant/namespace/.*>). The subscriber keeps receiving 
messages, then it stops at message number 500. I have tried to stop and start 
many times and the same thing happens every time. Is there a limit or a 
setting that I need to change?
----
2019-08-15 20:35:22 UTC - Chris Bartholomew: Is the subscriber acknowledging 
the messages as it receives them? What subscription type are you using?
----
2019-08-15 20:40:05 UTC - Chris Bartholomew: There are several broker settings 
around the maximum number of messages that can be unacknowledged: 
maxUnackedMessagesPerConsumer, maxUnackedMessagesPerSubscription, 
maxUnackedMessagesPerBroker. Perhaps you are hitting one of these limits.
----
2019-08-15 21:04:32 UTC - Tarek Shaar: I am acking as soon as I get the 
message. I am using an Exclusive sub. When sending to individual subscriptions 
(for example from producer 1 on topic1 to consumer 1 on topic1), all the 
messages are delivered, even if I create 4000 producers and 4000 consumers. The 
case that chokes is when I create 4000 producers and send all of the messages 
to one sub that's subscribed to a topic pattern matching all of the produced 
topics.
----
2019-08-15 21:21:17 UTC - Chris Bartholomew: In case it is not acking fast 
enough, you can try disabling all those unackedMessage settings by setting them 
to 0 in broker.conf. If this "fixes" the problem, you know that your case is 
hitting the maxUnacked logic in the broker. If you are hitting a limit on the 
consumer side, you can try adjusting receiverQueueSize or 
maxTotalReceiverQueueSizeAcrossPartitions when creating the consumer.
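For illustration, a minimal sketch of setting those consumer-side options when 
building the consumer (the topic pattern, subscription name, and values are 
made up):
```
// Sketch: adjusting the consumer-side queue settings mentioned above.
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.SubscriptionType;

public class TunedConsumerExample {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        Consumer<byte[]> consumer = client.newConsumer()
                .topicsPattern("persistent://tenant/namespace/.*")
                .subscriptionName("my-sub")
                .subscriptionType(SubscriptionType.Exclusive)
                .receiverQueueSize(2000)                          // per-consumer prefetch queue
                .maxTotalReceiverQueueSizeAcrossPartitions(50000) // cap across partitions
                .subscribe();

        // ... receive and acknowledge as usual, then clean up ...
        consumer.close();
        client.close();
    }
}
```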
----
2019-08-15 22:38:00 UTC - Tarek Shaar: Thanks Chris will take a look
----
2019-08-16 01:44:35 UTC - Poule: when I create a function using a wheel file, 
how can I tell it to install the dependencies? All I have now is 
`install_requires=[blblaballab]` set in my setup.py
----
2019-08-16 01:54:22 UTC - Poule: ..there is 
`install_usercode_dependencies=None,`; how can I set it to True when creating 
the function? I tried putting it in the yaml file with no luck
----
2019-08-16 02:10:47 UTC - Ali Ahmed: @Poule 
<https://github.com/apache/pulsar/blob/master/site2/docs/functions-quickstart.md#package-python-dependencies>
----
2019-08-16 02:13:54 UTC - Poule: @Ali Ahmed wheels not yet supported?
----
2019-08-16 02:14:18 UTC - Poule: they are in 
`<https://github.com/apache/pulsar/blob/master/pulsar-functions/instance/src/main/python/python_instance_main.py>`
----
2019-08-16 02:16:38 UTC - Poule: line 100
----
2019-08-16 02:17:53 UTC - Ali Ahmed: I haven’t tried that option
----
2019-08-16 04:36:21 UTC - Raman Gupta: I'm trying to understand the point of 
`redeliverUnacknowledgedMessages`. Wouldn't Pulsar automatically redeliver 
these? Why and when should this be called?
----
2019-08-16 04:54:51 UTC - Vinay Aggarwal: Thanks a lot, it worked 
:slightly_smiling_face:
----
2019-08-16 05:40:38 UTC - jinfeng105: @jinfeng105 has joined the channel
----
2019-08-16 06:23:13 UTC - Poule: when trying to delete a subscription I get 
`Failed: Subscription has active connected consumers`
----
2019-08-16 06:23:24 UTC - Poule: how can I view/delete those consumers?
----
2019-08-16 06:23:36 UTC - Poule: so I can delete the subscription
----
2019-08-16 07:52:42 UTC - Sijie Guo: you can run `pulsar-admin topics stats` to 
get the stats of a topic, which include the connected consumers.
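The same information is also reachable from the Java admin client, roughly like 
this (the service URL and topic name are placeholders, and the stats field 
names can differ slightly between Pulsar versions):
```
// Sketch: listing a topic's connected consumers via the Java admin client.
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.common.policies.data.TopicStats;

public class ListConsumersExample {
    public static void main(String[] args) throws Exception {
        PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080")
                .build();

        TopicStats stats = admin.topics().getStats("persistent://public/default/my-topic");
        // Each subscription entry lists its currently connected consumers.
        stats.subscriptions.forEach((subName, sub) ->
                sub.consumers.forEach(c ->
                        System.out.println(subName + " -> " + c.consumerName)));

        admin.close();
    }
}
```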
----
2019-08-16 08:06:05 UTC - tasguocheng: @tasguocheng has joined the channel
----
2019-08-16 08:17:51 UTC - Alexandre DUVAL: :confused:
----
2019-08-16 08:32:40 UTC - Federico Ponzi: @msk docker just makes stuff easier 
to run (usually). If you know a bit of docker, you can use the Dockerfile [0] 
as guidance for running the dashboard outside docker (e.g. see which packages 
to install and how to run the app)
[0]: <https://github.com/apache/pulsar/blob/master/dashboard/Dockerfile>
----
2019-08-16 08:59:26 UTC - Poule: ok I thought functions were bound to 
tenant+namespace
----
2019-08-16 08:59:49 UTC - Poule: looks like an old deleted func. was still a 
connected consumer
----
2019-08-16 09:00:21 UTC - Poule: even after the tenant were deleted
----
