Madhu,

Since process() is a void method (that is what the "None required for void return" TypeError is telling you), you won't be able to return your dictionary from it. Another approach would be to create the dictionary in the main script and pass it into the callback constructor. Then process() can update it, and after session.read() returns you can use the populated dictionary to set attributes and such.
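A rough sketch of that approach follows. It assumes the ExecuteScript Jython engine, where `session` and `REL_SUCCESS` are supplied by the processor. The names `parse_pairs`, `run` and `attrs` are made up for illustration, and the try/except import fallbacks are there only so the parsing logic can be tried outside NiFi.

```python
# Sketch only. Inside ExecuteScript (Jython) the Java imports succeed;
# elsewhere the fallbacks let the pure-Python parsing be exercised.
try:
    from urllib import unquote            # Jython / Python 2
except ImportError:
    from urllib.parse import unquote      # Python 3, outside NiFi

try:
    from org.apache.commons.io import IOUtils
    from java.nio.charset import StandardCharsets
    from org.apache.nifi.processor.io import InputStreamCallback
except ImportError:
    IOUtils = StandardCharsets = None
    InputStreamCallback = object          # stand-in base class outside NiFi


def parse_pairs(text, attrs):
    """Add the key/value pairs of a URL-encoded string to attrs (hypothetical helper)."""
    for pair in text.strip().split("&"):
        if not pair:
            continue
        # Split before unquoting, so an encoded '&' or '=' inside a value
        # is not mistaken for a separator.
        key, _, value = pair.partition("=")
        attrs[unquote(key)] = unquote(value)


class PyReadStreamCallback(InputStreamCallback):
    def __init__(self, attrs):
        self.attrs = attrs                # dictionary owned by the main script

    def process(self, inputStream):
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        parse_pairs(text, self.attrs)     # update in place, return nothing


def run(session, rel_success):
    """Main-script logic, wrapped in a function (hypothetical name)."""
    flowFile = session.get()
    if flowFile is not None:
        attrs = {}
        # The result of session.read() is not assigned back to flowFile.
        session.read(flowFile, PyReadStreamCallback(attrs))
        for name, value in attrs.items():
            flowFile = session.putAttribute(flowFile, name, value)
        session.transfer(flowFile, rel_success)
```

In the processor, the last line of the script would then be `run(session, REL_SUCCESS)`. Splitting on "&" and "=" before unquoting is a deliberate difference from unquoting the whole string first, since values such as the encoded URLs in the sample content could otherwise contain separator characters once decoded.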
Regards,
Matt

> On Mar 30, 2016, at 5:00 PM, Madhukar Thota <[email protected]> wrote:
>
> Matt,
>
> I tried the following code but I am getting the following error. Can you help
> me see where I am going wrong?
>
> Error:
> 16:56:10 EDT
> ERROR 6f15a6f2-7744-404c-9961-f545d3f29042
> ExecuteScript[id=6f15a6f2-7744-404c-9961-f545d3f29042] Failed to process
> session due to org.apache.nifi.processor.exception.ProcessException:
> javax.script.ScriptException: TypeError: None required for void return in
> <script> at line number 38:
> org.apache.nifi.processor.exception.ProcessException:
> javax.script.ScriptException: TypeError: None required for void return in
> <script> at line number 38
>
> Code:
>
> import urllib
> import urlparse
> import java.io
> from org.apache.commons.io import IOUtils
> from java.nio.charset import StandardCharsets
> from org.apache.nifi.processor.io import InputStreamCallback
>
>
>
> class PyReadStreamCallback(InputStreamCallback):
>     def __init__(self):
>         self.d = {}
>
>     def process(self, inputStream):
>         text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
>         split = (urllib.unquote(text)).split("&")
>         self.d = dict(s.split('=') for s in split)
>         return self.d
>
>
> flowFile = session.get()
> if (flowFile != None):
>     flowFile = session.read(flowFile, PyReadStreamCallback())
>     flowFile = session.putAttribute(flowFile, PyReadStreamCallback().process())
>     session.transfer(flowFile, REL_SUCCESS)
>
>> On Thu, Mar 24, 2016 at 8:59 AM, Matt Burgess <[email protected]> wrote:
>> Madhu,
>>
>> The example from my blog post shows how to overwrite flow content, by first
>> reading in content from an input stream, then processing it and writing back
>> out to an output stream. If for your example you just need to read from the
>> incoming flow file and add some attributes, you can use the session.read()
>> method instead of session.write().
>> In Jython the callback might look
>> something like this:
>>
>> class PyReadStreamCallback(InputStreamCallback):
>>     def __init__(self):
>>         pass
>>     def process(self, inputStream):
>>         text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
>>         # Do your parsing here
>>
>> Note the stream callback methods do not have a reference to the
>> ProcessSession, so you may want to create a dictionary for the attributes to
>> be added, and pass that into the PyReadStreamCallback constructor. Then
>> process() would add the attribute name/value pairs to the dictionary, and
>> after you call session.read() in the main script, you can add all the
>> attributes from the dictionary to the flow file.
>>
>> The rest of the script will likely be similar to the blog post's script.
>> Note there is no "outputStream" passed in (as PyReadStreamCallback is a
>> subclass of InputStreamCallback, not StreamCallback), so there is no
>> "outputStream.write()" call in the process() method or anywhere else in the
>> script.
>>
>> You may find another blog post helpful:
>> http://funnifi.blogspot.com/2016/02/executescript-explained-split-fields.html
>> Although it uses Groovy as the language, it also explains some of the NiFi
>> Java API, at least the part that deals with reading/writing flow files,
>> immutable flow file references, etc.
>>
>> Let me know if this works for you and/or if you have other questions or
>> issues.
>>
>> Cheers,
>> Matt
>>
>>> On Thu, Mar 24, 2016 at 8:42 AM, Madhukar Thota <[email protected]>
>>> wrote:
>>> Hi Matt,
>>>
>>> Do you have an example of how to use ExecuteScript on flow content?
>>>
>>> I have the following URL-encoded string as flow content, which I would like
>>> to parse with Python to get flow attributes based on the key/value pairs.
>>>
>>> rt.start=navigation&rt.tstart=1458797018682&rt.bstart=1458797019033&rt.end=1458797019075&t_resp=21&t_page=372&t_done=393&t_other=t_domloaded%7C364&r=http%3A%2F%2Flocalhost%3A63342%2FBeacon%2Ftest.html&r2=&u=http%3A%2F%2Flocalhost%3A63342%2FBeacon%2Ftest.html&v=0.9&vis.st=visible
>>>
>>> -Madhu
>>>
>>>> On Thu, Mar 24, 2016 at 12:34 AM, Madhukar Thota
>>>> <[email protected]> wrote:
>>>> Hi Matt,
>>>>
>>>> Thank you for the input. I updated my config as you suggested and it
>>>> worked like a charm. Also, a big thank you for the nice article. I used
>>>> your article as a reference when I started exploring ExecuteScript.
>>>>
>>>>
>>>> Thanks
>>>> Madhu
>>>>
>>>>
>>>>
>>>>> On Thu, Mar 24, 2016 at 12:18 AM, Matt Burgess <[email protected]>
>>>>> wrote:
>>>>> Madhukar,
>>>>>
>>>>> Glad to hear you found a solution, I was just replying when your email
>>>>> came in.
>>>>>
>>>>> Although in ExecuteScript you have chosen "python" as the script engine,
>>>>> it is actually Jython that is being used to interpret the scripts, not
>>>>> your installed version of Python. The first line (shebang) is ignored as
>>>>> it is a comment in Python/Jython.
>>>>>
>>>>> Modules installed with pip are not automatically available to the Jython
>>>>> engine, but if the modules are pure Python code (rather than native C /
>>>>> CPython), like user_agents is, you can import them in one of two
>>>>> equivalent ways:
>>>>>
>>>>> 1) The way you have done, using sys.path.append. I should mention that
>>>>> "import sys" is done for you so you can safely leave that out if you wish.
>>>>> 2) Add the path to the packages
>>>>> ('/usr/local/lib/python2.7/site-packages') to the Module Path property of
>>>>> the ExecuteScript processor. In this case the processor effectively does
>>>>> Option #1 for you.
>>>>>
>>>>> I was able to get your script to work but had to force the result of
>>>>> parse (a UserAgent object) into a string, so I wrapped it in str:
>>>>>
>>>>> str(parse(flowFile.getAttribute('http.headers.User-Agent')).browser)
>>>>>
>>>>> You're definitely on the right track :) For another Jython example with
>>>>> ExecuteScript, check out this post on my blog:
>>>>> http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited_14.html
>>>>>
>>>>> I am new to Python as well, but am happy to help if I can with any issues
>>>>> you run into, as it will help me learn more as well :)
>>>>>
>>>>> Regards,
>>>>> Matt
>>>>>
>>>>>
>>>>>> On Thu, Mar 24, 2016 at 12:10 AM, Madhukar Thota
>>>>>> <[email protected]> wrote:
>>>>>> I was able to solve the Python module issue by adding the following
>>>>>> lines:
>>>>>>
>>>>>> import sys
>>>>>> sys.path.append('/usr/local/lib/python2.7/site-packages') # Path where my modules are installed.
>>>>>>
>>>>>> Now the issue I have is how to parse the incoming attributes correctly
>>>>>> using this library and get the new fields. I am kind of new to
>>>>>> Python and this is also my first attempt at using Python with NiFi.
>>>>>>
>>>>>> Any help is appreciated.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Wed, Mar 23, 2016 at 11:31 PM, Madhukar Thota
>>>>>>> <[email protected]> wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> I am trying to use the following script to parse http.headers.useragent
>>>>>>> with the Python user_agents module using the ExecuteScript processor.
>>>>>>>
>>>>>>> Script:
>>>>>>>
>>>>>>> #!/usr/bin/env python2.7
>>>>>>> from user_agents import parse
>>>>>>>
>>>>>>> flowFile = session.get()
>>>>>>> if (flowFile != None):
>>>>>>>     flowFile = session.putAttribute(flowFile, "browser",
>>>>>>>         parse(flowFile.getAttribute('http.headers.User-Agent')).browser)
>>>>>>>     session.transfer(flowFile, REL_SUCCESS)
>>>>>>>
>>>>>>>
>>>>>>> But the ExecuteScript processor complains about a missing Python module,
>>>>>>> even though the modules are already installed using pip and tested
>>>>>>> outside NiFi. How can I add or reference these modules in NiFi?
>>>>>>>
>>>>>>> Error:
>>>>>>>
>>>>>>> 23:28:03 EDT ERROR af354413-9866-4557-808a-7f3a84353597
>>>>>>> ExecuteScript[id=af354413-9866-4557-808a-7f3a84353597] Failed to
>>>>>>> process session due to
>>>>>>> org.apache.nifi.processor.exception.ProcessException:
>>>>>>> javax.script.ScriptException: ImportError: No module named user_agents
>>>>>>> in <script> at line number 2:
>>>>>>> org.apache.nifi.processor.exception.ProcessException:
>>>>>>> javax.script.ScriptException: ImportError: No module named user_agents
>>>>>>> in <script> at line number 2
