2020-02-04 09:47:13 UTC - Konstantinos Papalias: awesome @Alex Yaroslavsky @roman ---- 2020-02-04 09:53:20 UTC - Konstantinos Papalias: Hello, not sure if there is a Pulsar Functions channel yet, but couldn't find one.
So here it goes: coming from different streaming frameworks / libraries (Kafka Streams, Flink, etc.), we are used to *chaining* functions, transformations, and aggregations one after the other, so you can break down your logic into small, reusable, single-responsibility functions that can be easily tested in isolation. What is the paradigm we should be using with Pulsar Functions? Even though it's dead easy to use, how do you go about composing different transformations together? Do you deploy separate functions? Do you put all your logic into one monolithic function? Thanks for the help and the direction ---- 2020-02-04 09:59:43 UTC - Ali Ahmed: @Konstantinos Papalias there is some preliminary discussion of function chaining but nothing concrete. Most jobs are single-stage transformations; if you have a few steps, just have a few functions with topics connecting them. This automatically gives a clean way to deal with back pressure, and topics are very lightweight constructs in Pulsar. There are quite a few large deployments doing complex business logic this way. If it can’t be composed from a few simple isolated functions, a more heavy-duty stream processing platform may be a better fit; the Pulsar Functions implementation will prioritize simplicity over features. ---- 2020-02-04 10:10:41 UTC - Konstantinos Papalias: Thanks for the direction @Ali Ahmed. Personally I find it too heavyweight to have to pipe 2 functions via a separate topic and to prepare and deploy 2 functions separately. I'm still talking about simple functions and transformations. Abstract example: filter elements based on a predicate and chain this with a transformation to uppercase values. That's where I believe Pulsar Functions can be really useful: for simple validations and transformations without the need for a separate cluster and framework. 
Maybe I need to explore and better understand the options for deploying functions in a lightweight way, but I still believe that having to perform IO to write to a temp topic and IO to read from that temp topic, instead of chaining the transformations, is overkill for some use cases like the above! The alternative is to bundle everything in one Function, but compromise on readability and composability unless we use yet another framework inside the Function body. ---- 2020-02-04 12:37:45 UTC - Sergii Zhevzhyk: Pulling the image is the easiest part ---- 2020-02-04 14:52:04 UTC - Ryan: What is the state of large message storage in Pulsar? I believe there was work to implement transparent chunking of large files; has that work been completed, or is it even needed anymore? ---- 2020-02-04 14:53:06 UTC - Roman Popenov: It wasn’t reviewed and there were some merge issues conflicting with PIP-36 +1 : Ryan ---- 2020-02-04 14:53:20 UTC - Roman Popenov: Don’t think it’s coming before 2.6.0 ---- 2020-02-04 15:02:40 UTC - Sergii Zhevzhyk: I have the same problem. @Sijie Guo have you heard about this issue before? Do you know of any workaround? ---- 2020-02-04 15:03:32 UTC - Eric Simon: I just avoided Avro for the time being and used JSON. ---- 2020-02-04 17:08:21 UTC - Bobby: I'm not sure how familiar people here are with the openmessaging benchmark tool, but I'm getting 404s when running a workload on my Pulsar instance. Wondering if there's something I need to change to get that to run correctly? The only thing I changed is the IPs in the driver to point to a broker. ---- 2020-02-04 17:53:51 UTC - Sijie Guo: It might be good to create an issue in openmessaging-benchmark with your steps and errors. You can paste the link here. A bunch of the committers are actually helping maintain the benchmark code. ---- 2020-02-04 17:54:50 UTC - Bobby: do you mean like a bug report on their github? 
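The "bundle everything in one Function" alternative raised above does not have to be monolithic. What follows is only a minimal sketch of the abstract example (filter on a predicate, then uppercase), not an official Pulsar pattern. It assumes the native Python function shape, a module-level `process(input)`, and assumes that returning `None` results in no output message being published. The helpers `keep_if`, `chain`, `non_empty`, and `pipeline` are hypothetical names introduced for illustration; they are not part of any Pulsar API.

```python
from typing import Callable, Optional

# A stage takes a value and returns a transformed value, or None to drop it.
Stage = Callable[[str], Optional[str]]


def keep_if(predicate: Callable[[str], bool]) -> Stage:
    """Turn a predicate into a stage that drops values failing it."""
    return lambda value: value if predicate(value) else None


def chain(*stages: Stage) -> Stage:
    """Compose stages left to right; a None result skips the remaining stages."""
    def run(value: str) -> Optional[str]:
        current: Optional[str] = value
        for stage in stages:
            if current is None:
                return None
            current = stage(current)
        return current
    return run


# Hypothetical stages for the abstract example: filter, then uppercase.
non_empty = keep_if(lambda s: bool(s.strip()))
to_upper = str.upper

pipeline = chain(non_empty, to_upper)


def process(input):
    # Entry point in the shape of a native Python Pulsar Function
    # (assumption: a None return value means nothing is published).
    return pipeline(input)
```

Each stage stays a plain function that can be unit-tested in isolation, and only the composed `process` is deployed, so no intermediate topic is involved. The trade-off is that the stages share one deployment and cannot be scaled or restarted independently, which is what the topic-per-stage approach provides.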
---- 2020-02-04 17:56:33 UTC - Sijie Guo: @Konstantinos Papalias So far Pulsar Functions has focused on providing a framework for people to write and run functions for processing events. There were some thoughts around providing the ability to orchestrate functions into a pipeline. That can be useful for addressing the concerns you mentioned here. ---- 2020-02-04 18:23:37 UTC - Sijie Guo: yes ---- 2020-02-04 21:13:04 UTC - Bobby: <https://github.com/openmessaging/openmessaging-benchmark/issues/166> ---- 2020-02-04 21:18:46 UTC - Bobby: Do y'all have any other benchmarking tools that you use besides openmessaging? ---- 2020-02-04 21:48:52 UTC - Nouvelle: @Addison Higham @Sijie Guo I'm receiving "State is not enabled" errors when trying to use the State API for Pulsar Functions in v2.4.1; does the following post about not making much progress also apply to v2.4.1? <https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1579311050035900?thread_ts=1578930482.115800&cid=C5Z4T36F7> ---- 2020-02-04 23:04:09 UTC - Addison Higham: hrm, in our process of upgrading to 2.5.0 on a k8s cluster, we are experiencing an issue where it appears that Pulsar is (likely) caching the IP of a bookie, so when the bookies get restarted, they get a new IP and we end up needing to restart the brokers to get them to pick up the new IP ---- 2020-02-04 23:07:45 UTC - Addison Higham: it manifests in two ways: 1. it tries to make writes to an open ledger and fails for one bookie 2. the bookies get blacklisted and it can't open new ledgers. It doesn't appear to try to re-resolve the IPs or re-check the bookie (at least on a few-minute timeframe), and we restart the broker just to clear the bookies out of the excluded bookie blacklist ---- 2020-02-04 23:24:41 UTC - mussa: @mussa has joined the channel ---- 2020-02-05 00:33:57 UTC - Ryan: Okay, thank you for the heads-up. So the code is just awaiting approval for 2.6? 
---- 2020-02-05 03:00:06 UTC - Yang Yang: @Yang Yang has joined the channel ---- 2020-02-05 04:55:57 UTC - Youngkyun Kim: @Youngkyun Kim has joined the channel ---- 2020-02-05 06:32:13 UTC - sambhav gupta: @sambhav gupta has joined the channel ----
