I added a new SIP to Confluence [1]. Feel free to contribute to it and add your thoughts. I mainly think it's good to document the design choices we make.
Patrick

[1] https://cwiki.apache.org/confluence/display/STREAMPIPES/SIP-02+Python+wrapper

> On 19.07.2020 at 17:50, Patrick Wiener <[email protected]> wrote:
>
> I pushed the current work to the core repo under streampipes-wrapper-python
> [1]. Additionally, I created an issue in Jira to track the tasks for adding
> the wrapper [2].
>
> As said, this still relies heavily on the corresponding counterpart in Java,
> where the processor is described, registered etc.
> And it currently only works with Kafka as the go-to transport protocol.
>
> The main concern I have right now is how to really integrate it, as the goal
> should not be to reimplement everything in Python.
>
> Does anybody have a "smart" idea how we could tackle this problem?
>
> Patrick
>
> [1] https://github.com/apache/incubator-streampipes/tree/dev/streampipes-wrapper-python
> [2] https://issues.apache.org/jira/browse/STREAMPIPES-174
>
>> On 18.07.2020 at 15:41, Patrick Wiener <[email protected]> wrote:
>>
>> To give you a sneak peek - it currently looks like this on the Python side.
>> However, note that the main magic (registration, model declaration etc.)
>> still happens on the Java side.
>>
>> def main():
>>     processors = {
>>         'org.streampipes.pe.processors.python.simple': SimpleProcessor,
>>         'org.streampipes.pe.processors.python.filter': ThresholdFilter,
>>     }
>>
>>     Declarer.add(processors=processors)
>>     StandaloneSubmitter.init()
>>
>>
>> if __name__ == '__main__':
>>     main()
>>
>> An example threshold filter:
>>
>> import operator
>>
>>
>> class ThresholdFilter(EventProcessor):
>>
>>     threshold = None
>>     filter_property = None
>>     operator = None
>>
>>     def on_invocation(self):
>>         # Read the user-configured parameters once per invocation.
>>         self.threshold = self.static_properties.get('threshold')
>>         self.filter_property = self.static_properties.get('filter_property')
>>         self.operator = self.static_properties.get('operation')
>>
>>     def on_event(self, event):
>>         # Forward the event only if the comparison holds.
>>         if self.eval_operator(op=self.operator,
>>                               value=event[self.filter_property],
>>                               threshold=self.threshold):
>>             return event
>>
>>     def on_detach(self):
>>         pass
>>
>>     @staticmethod
>>     def eval_operator(op=None, value=None, threshold=None):
>>         # Map codes to functions so that only the selected one is called.
>>         switcher = {
>>             'LE': operator.le,
>>             'LT': operator.lt,
>>             'EQ': operator.eq,
>>             'GT': operator.gt,
>>             'GE': operator.ge,
>>             'IE': operator.ne,
>>         }
>>         if op not in switcher:
>>             # A returned error string would be truthy and pass every event.
>>             raise ValueError('Invalid operator: {}'.format(op))
>>         return switcher[op](value, threshold)
>>
>>
>>
>> Patrick
>>
>>
>>> On 16.07.2020 at 23:29, Dominik Riemer <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> I'm fully +1 for a complete, plain Python wrapper that integrates both
>>> runtime and controller interfaces! Also, given our microservice
>>> architecture with standalone pipeline elements that communicate over
>>> JSON/JSON-LD, I don't think we need any code-level integration between
>>> Python and Java.
>>>
>>> Concerning the code structure, I'd suggest creating a
>>> streampipes-wrapper-python module in the core project, adding the Python
>>> code there, and creating an example in the streampipes-examples project
>>> that uses the current Java ExternalEventProcessor and explains how to use
>>> the Python wrapper. By adding the Python code to the core
>>> project, all wrappers would be located in the same repository, while the
>>> extensions project solely provides specific pipeline elements and adapters.
>>> In the meantime, we could add the missing features to the Python wrapper. I
>>> agree that it is some work, but it should mainly consist of parsing the
>>> graphs (we could use JSON instead of JSON-LD here to simplify parsing),
>>> extracting parameters and adding some Flask endpoints.
>>>
>>> As I'm not that familiar with Python, are there any Python experts on the
>>> list who want to help build the wrapper? I'd expect that finishing the
>>> wrapper could probably be done within a few days if there is a Python
>>> expert and someone who is familiar with the StreamPipes model - I'd be
>>> happy to support the model side 😉
>>>
>>> Dominik
>>>
>>>
>>> -----Original Message-----
>>> From: Philipp Zehnder <[email protected]>
>>> Sent: Thursday, July 16, 2020 11:09 PM
>>> To: [email protected]
>>> Subject: Re: Adding StreamPipes Python wrapper
>>>
>>> Hi guys,
>>>
>>> I am also in favor of integrating the current prototype of the Python
>>> wrapper for further development.
>>> I would also like to discuss what a proper integration might look like.
>>> The cleanest way would indeed be to implement all the StreamPipes
>>> interfaces and models in Python, but I fear this is a lot of work and will
>>> take quite some time.
>>> Is there a better way, or does anyone have experience integrating Python
>>> code into Java?
>>>
>>> As for the first integration, I would suggest creating a module in the
>>> extensions project and putting all the code there.
>>> We currently use the interfaces of the Java wrapper, right? So we do not
>>> have any Python-specific endpoints.
>>> I think this would ease the usage for people in the community and let them
>>> already try an early version of the wrapper.
>>> Alternatively, we can put it into the core in streampipes-wrapper-python as
>>> you suggested, but then a user has to check out the backend and the
>>> extensions project to develop a new processor.
>>> What's your opinion on that?
>>>
>>> Philipp
>>>
>>>
>>>
>>>> On 16. Jul 2020, at 20:41, Patrick Wiener <[email protected]> wrote:
>>>>
>>>> Hi Grainier,
>>>>
>>>> Definitely, it should make it super simple to integrate various well
>>>> known Python libs. The only real limitation is that they'll also have
>>>> to work in an event-driven fashion.
>>>>
>>>> I guess the cleanest way would be to port the Java wrapper to Python
>>>> to finally have something like "pip install streampipes-python". Right
>>>> now in the prototype we have a special ExternalEventProcessor [1] that
>>>> only hooks into onInvocation() and onDetach() and forwards the request
>>>> to a Flask endpoint in Python.
>>>>
>>>> Do you have experience with running Python + Java projects "together"?
>>>> I saw Flink is using py4j [2].
>>>>
>>>> What do you think about porting it all to Python?
>>>>
>>>> Patrick
>>>>
>>>> [1] https://github.com/apache/incubator-streampipes/blob/dev/streampipes-wrapper/src/main/java/org/apache/streampipes/wrapper/runtime/ExternalEventProcessor.java
>>>> [2] https://www.py4j.org/
>>>>
>>>>
>>>>> On 16.07.2020 at 14:42, Grainier Perera <[email protected]> wrote:
>>>>>
>>>>> Hi Patrick,
>>>>>
>>>>> This will be very useful. We can use this to expose the capabilities
>>>>> of popular libraries such as scikit-learn, SciPy, etc. By the way,
>>>>> how does this work? Will it use a Java bridge, Jython or something
>>>>> similar?
>>>>>
>>>>> Grainier Perera.
>>>>>
>>>>>
>>>>> On Thu, 16 Jul 2020 at 13:42, Patrick Wiener <[email protected]> wrote:
>>>>>
>>>>>> Hi guys,
>>>>>>
>>>>>> this mail is to inform you about and discuss the addition of a new
>>>>>> wrapper for StreamPipes: the StreamPipes Python wrapper.
>>>>>>
>>>>>> Current wrappers such as standalone (JVM) or distributed (Flink)
>>>>>> already allow us to develop new processors in the given runtime
>>>>>> environment. I propose to add the Python wrapper to this family.
>>>>>>
>>>>>> Why a Python wrapper?
>>>>>>
>>>>>> * Python is a widely used language, especially in the domain of data
>>>>>>   science
>>>>>> * Python is more concise and thus easier to read
>>>>>> * We provide more options for standalone algorithms: it allows
>>>>>>   newcomers unfamiliar with Java to implement their algorithms faster
>>>>>>
>>>>>> Current implementation:
>>>>>>
>>>>>> Currently it only works when implementing declareModel() as part
>>>>>> of the controller in Java and sending the invocation request to
>>>>>> Python on the receiver side. Thus, it is necessary to run both Java
>>>>>> + Python in one container. While it works, this should of course
>>>>>> not be the standard way to do it.
>>>>>>
>>>>>> As said, I already started a very basic implementation of it
>>>>>> that I would add to the core project under
>>>>>> streampipes-wrapper-python, or do you have any other thoughts?
>>>>>>
>>>>>> I am happy to discuss this topic with you and hope that some of you
>>>>>> are eager to help work on the Python wrapper.
>>>>>>
>>>>>> What are your thoughts?
>>>>>>
>>>>>> Patrick
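
For anyone who wants to try the processor lifecycle from this thread (on_invocation, on_event, on_detach) without Kafka or the Java controller, here is a minimal in-process sketch. It is not the code in streampipes-wrapper-python: the EventProcessor base class, the OPERATORS table and the run() helper are hypothetical stand-ins written for illustration, and only the property names ('threshold', 'filter_property', 'operation') and the operator codes are taken from the threshold filter example quoted above.

```python
import operator


class EventProcessor:
    """Hypothetical stand-in for the wrapper's base class (assumption)."""

    def __init__(self, static_properties):
        self.static_properties = static_properties
        # In this sketch the invocation hook runs right at construction time.
        self.on_invocation()

    def on_invocation(self):
        raise NotImplementedError

    def on_event(self, event):
        raise NotImplementedError

    def on_detach(self):
        raise NotImplementedError


class ThresholdFilter(EventProcessor):
    # Operator codes mapped to functions, so only the selected one is called.
    OPERATORS = {
        'LE': operator.le,
        'LT': operator.lt,
        'EQ': operator.eq,
        'GT': operator.gt,
        'GE': operator.ge,
        'IE': operator.ne,
    }

    def on_invocation(self):
        self.threshold = self.static_properties.get('threshold')
        self.filter_property = self.static_properties.get('filter_property')
        # An unknown operator code fails here with a KeyError.
        self.compare = self.OPERATORS[self.static_properties.get('operation')]

    def on_event(self, event):
        # Forward the event if the comparison holds, otherwise drop it.
        if self.compare(event[self.filter_property], self.threshold):
            return event
        return None

    def on_detach(self):
        pass


def run(processor, events):
    """Feed events through the processor in-process and collect the output."""
    results = (processor.on_event(event) for event in events)
    forwarded = [event for event in results if event is not None]
    processor.on_detach()
    return forwarded


if __name__ == '__main__':
    params = {'threshold': 10, 'filter_property': 'temperature',
              'operation': 'GT'}
    sample = [{'temperature': 5}, {'temperature': 12}, {'temperature': 10}]
    print(run(ThresholdFilter(params), sample))
```

Resolving the operator once in on_invocation means a misconfigured pipeline fails at invocation time rather than on the first event. Whether the real wrapper should behave that way is a design choice that could be noted in the SIP.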
