Hi Anush,

you’re right, there is a lot to cover, also in terms of flexibility, since JVM-based pipeline elements support various static properties, message transport protocols and so on.
However, I’d suggest we start rather minimal and evolve the Python wrapper gradually. In the first phase, that means implementing a minimal subset of static properties, a single fixed message transport protocol (Kafka) and so on. As you already saw, we have an MVP that still relies on a Java part as an interface to the pipeline management in the core; see the figure in Confluence [1] and an example here [2]. Thus, as a first step we would need to further reduce this dependency, ultimately covering everything in Python to get a first standalone MVP in Python.

I’d suggest the following to reduce the complexity:

Common setting:
- We start with a minimal set of static properties and one supported transport protocol (Kafka)
- We only focus on data processors, no sinks

Looking at the anatomy of a pipeline element wrapper runtime: the runtime sits behind a RESTful API used to communicate with the core, and it potentially "bundles" multiple data processors; for instance, see the filter processors in Java [3]. A processor provides and registers its "model" (DataProcessorDescription) at the backend, which contains requirements on the data stream, output strategies etc. (see e.g. the description in the declareModel method of the demo Python greeter controller [4]). Here we already provide an SDK to ease the definition for developers, but in general the result is a DataProcessorDescription [5]. The backend uses it to provide the information needed by the UI for rendering and so on.

Thus, to start, I’d suggest working on porting this model to Python classes.

What we need:
- Model for DataProcessorDescription, parent classes and relevant sub-classes

Goal: providing a valid DataProcessorDescription graph

I created a Jira sub-task for it: https://issues.apache.org/jira/browse/STREAMPIPES-180

As said, we should start minimal.
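To make the goal a bit more concrete, here is a rough sketch of what such Python model classes could look like. It is only an illustration under assumptions: the class layout, the field names (app_id, static_properties, output_strategies, transport_protocol), the "keep"/"append" strategy labels and the to_dict helper are placeholders chosen for this example and have not been checked against the Java model [5], which remains the reference.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class NamedStreamPipesEntity:
    """Assumed base class: identity and labels shared by all elements."""
    app_id: str
    name: str
    description: str = ""


@dataclass
class StaticProperty:
    """Assumed user-configurable parameter that the UI would render."""
    internal_name: str
    label: str
    description: str = ""


@dataclass
class OutputStrategy:
    """Assumed marker for how the output schema is derived."""
    kind: str = "keep"


@dataclass
class DataProcessorDescription(NamedStreamPipesEntity):
    """Assumed minimal description: static properties, output strategies, Kafka only."""
    static_properties: List[StaticProperty] = field(default_factory=list)
    output_strategies: List[OutputStrategy] = field(default_factory=list)
    transport_protocol: str = "kafka"  # fixed to Kafka in the first phase

    def to_dict(self) -> dict:
        # Recursively converts nested dataclasses into plain dicts and lists.
        return asdict(self)


# Hypothetical greeter processor, loosely modelled on the demo in [4].
greeter = DataProcessorDescription(
    app_id="org.example.processor.greeter",
    name="Greeter",
    description="Appends a greeting to each event",
    static_properties=[StaticProperty("greeting", "Greeting text")],
    output_strategies=[OutputStrategy("append")],
)
print(greeter.to_dict())
```

Plain dataclasses keep the model free of runtime concerns, so the same objects could later be serialized into whatever format the backend expects once that part is settled.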
Next, we would work on the DataProcessorInvocation model, which we receive from the pipeline management on service invocation. Later, we would add more support and work on the runtime itself, which takes care of the actual pipeline element (data processor) management at runtime based on this invocation request.

If you have any questions, feel free to reach out to us.

Patrick

[1] https://cwiki.apache.org/confluence/display/STREAMPIPES/SIP-02+Python+wrapper
[2] https://github.com/apache/incubator-streampipes-examples/tree/dev/streampipes-pipeline-elements-examples-processors-jvm/src/main/java/org/apache/streampipes/pe/examples/jvm/python
[3] https://github.com/apache/incubator-streampipes-extensions/tree/dev/streampipes-processors-filters-jvm/src/main/java/org/apache/streampipes/processors/filters/jvm
[4] https://github.com/apache/incubator-streampipes-examples/blob/dev/streampipes-pipeline-elements-examples-processors-jvm/src/main/java/org/apache/streampipes/pe/examples/jvm/python/GreeterPythonController.java
[5] https://github.com/apache/incubator-streampipes/blob/dev/streampipes-model/src/main/java/org/apache/streampipes/model/graph/DataProcessorDescription.java

> On 15.03.2021 at 12:56, 19bda004 Anush Krishna v <[email protected]> wrote:
>
> Thanks, I'll drop into the channel and say hi.
> I started looking over the existing wrappers and the one in Python; I guess
> there are a lot of things to cover. It would be great if you could assign me
> beginner-level issues on the wrapper task so that I can use the time to
> understand the project better and get some work done.
>
> On Mon, 15 Mar 2021 at 15:18, Patrick Wiener <[email protected]> wrote:
> Hi Anush,
>
> great to hear from you :)
>
> For communication via Slack:
> You can find us on Slack in the official ASF Slack -> #streampipes [1]
>
> For issues/tasks:
> Are you looking for a general-purpose issue to get more familiar with
> StreamPipes or specific ones related to the Python wrapper task for GSoC?
>
> Best
> Patrick
>
> [1] https://the-asf.slack.com/
>
>> On 12.03.2021 at 21:41, 19bda004 Anush Krishna v <[email protected]> wrote:
>>
>> This is great, thank you so much!
>> I think I have picked up the skills to get me started on this. Are there any
>> easy or starter-level issues I can start working on?
>> I feel it will help me write up my proposal and come up with some good
>> ideas.
>> Is there a Slack or a Discord channel I can join to interact and learn more
>> about the project from the team?
>>
>> On Mon, 22 Feb 2021 at 13:51, Patrick Wiener <[email protected]> wrote:
>> Hi Anush,
>>
>> Welcome to the open source community and thanks for your interest
>> in contributing to StreamPipes. As a start, you can familiarize yourself
>> with StreamPipes by going to the website [1] as well as the documentation [2].
>>
>> You can also find some YouTube material that is worth checking out to
>> understand some of the concepts (description model, invocation graph with
>> static properties) behind StreamPipes, e.g., the anatomy of a StreamPipes
>> data processor explained at ApacheCon@Home 2020 [3]. Here, you can also
>> check out the example dummy greeter processor using our SDK in Java [4].
>> Under the hood, the wrapper runtime exposes a RESTful API to interact with
>> the central pipeline management, handles the connection to the transport
>> layer (mostly Kafka), etc.
>>
>> For the GSoC project you're interested in, it’s relevant to understand the
>> concepts and architecture of the anatomy of a data processor, incl. the
>> StreamPipes processor wrapper runtime. You can go through the implementation
>> of the Java wrapper [5]. You can find a sketch of the current WIP on the
>> Python implementation on our wiki [6] as well as the current core
>> implementation of the Python runtime wrapper [7].
>>
>> The idea is to introduce a new StreamPipes Python runtime wrapper, such that
>> processor developers can write their data processors in Python.
>>
>> Do not hesitate to ask if you have any questions.
>>
>> [1] https://streampipes.apache.org/
>> [2] https://streampipes.apache.org/docs/index.html
>> [3] https://streampipes.apache.org/media.html
>> [4] https://github.com/wipatrick/apachecon-demo-processor
>> [5] https://github.com/apache/incubator-streampipes/tree/dev/streampipes-wrapper-standalone
>> [6] https://cwiki.apache.org/confluence/display/STREAMPIPES/SIP-02+Python+wrapper
>> [7] https://github.com/apache/incubator-streampipes/tree/dev/streampipes-wrapper-python
>>
>> Regards,
>> Patrick
>>
>>> On 20.02.2021 at 14:58, 19BDA004 Anush Krishna <[email protected]> wrote:
>>>
>>> Hey Patrick,
>>> I understand you are busy, thank you for your time; just give me 5 minutes.
>>> I've got excellent Python skills (I might regret saying excellent later).
>>> I am learning more about the stream processing paradigm. I pick up new
>>> topics quickly, so I think I can learn it in a couple of days.
>>> I am a Python developer; Java is not my cup of coffee, but I understand
>>> enough to read basic code.
>>> I think I can work on it. Are there any first issues I can start working on
>>> and maybe work on even more during GSoC?
>>> Thanks for taking the time to read this,
>>> hoping for a reply
