Hi Matt,

My Python/Jython skills are poor. Can you provide me an example, please?
-Madhu

On Wed, Mar 30, 2016 at 5:53 PM, Matt Burgess <[email protected]> wrote:

> Madhu,
>
> Since you won't be able to return your dictionary, another approach would
> be to create the dictionary from the main script and pass it into the
> callback constructor. Then process() can update it, and you can use the
> populated dictionary after process() returns to set attributes and such.
>
> Regards,
> Matt
>
>
> On Mar 30, 2016, at 5:00 PM, Madhukar Thota <[email protected]>
> wrote:
>
> Matt,
>
> I tried the following code but I am getting the following error. Can you
> help me see where I am going wrong?
>
> Error:
> 16:56:10 EDT
> ERROR
> 6f15a6f2-7744-404c-9961-f545d3f29042
>
> ExecuteScript[id=6f15a6f2-7744-404c-9961-f545d3f29042] Failed to process
> session due to org.apache.nifi.processor.exception.ProcessException:
> javax.script.ScriptException: TypeError: None required for void return in
> <script> at line number 38:
> org.apache.nifi.processor.exception.ProcessException:
> javax.script.ScriptException: TypeError: None required for void return in
> <script> at line number 38
>
>
> Code:
>
> import urllib
> import urlparse
> import java.io
> from org.apache.commons.io import IOUtils
> from java.nio.charset import StandardCharsets
> from org.apache.nifi.processor.io import InputStreamCallback
>
>
>
> class PyReadStreamCallback(InputStreamCallback):
>     def __init__(self):
>         self.d = {}
>
>     def process(self, inputStream):
>         text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
>         split = (urllib.unquote(text)).split("&")
>         self.d = dict(s.split('=') for s in split)
>         return self.d
>
>
> flowFile = session.get()
> if (flowFile != None):
>     flowFile = session.read(flowFile, PyReadStreamCallback())
>     flowFile = session.putAttribute(flowFile,
>         PyReadStreamCallback().process())
>     session.transfer(flowFile, REL_SUCCESS)
>
>
> On Thu, Mar 24, 2016 at 8:59 AM, Matt Burgess <[email protected]> wrote:
>
>> Madhu,
>>
>> The example from my blog post shows how to
>> overwrite flow content, by
>> first reading in content from an input stream, then processing it and
>> writing back out to an output stream. If for your example you just need to
>> read from the incoming flow file and add some attributes, you can use the
>> session.read() method instead of session.write(). In Jython the callback
>> might look something like this:
>>
>> class PyReadStreamCallback(InputStreamCallback):
>>     def __init__(self):
>>         pass
>>     def process(self, inputStream):
>>         text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
>>         # Do your parsing here
>>
>> Note the stream callback methods do not have a reference to the
>> ProcessSession, so you may want to create a dictionary for the attributes
>> to be added, and pass that into the PyReadStreamCallback constructor. Then
>> process() would add the attribute name/value pairs to the dictionary, and
>> after you call session.read() in the main script, you can add all the
>> attributes from the dictionary to the flow file.
>>
>> The rest of the script will likely be similar to the blog post's script;
>> note that there is no "outputStream" passed in (as PyReadStreamCallback is a
>> subclass of InputStreamCallback, not StreamCallback), so there is no
>> "outputStream.write()" call in the process() method or anywhere else in the
>> script.
>>
>> You may find another blog post helpful:
>> http://funnifi.blogspot.com/2016/02/executescript-explained-split-fields.html
>> Although it uses Groovy as the language, it also explains some of the NiFi
>> Java API, at least the part that deals with reading/writing flow files,
>> immutable flow file references, etc.
>>
>> Let me know if this works for you and/or if you have other questions or
>> issues.
>>
>> Cheers,
>> Matt
>>
>> On Thu, Mar 24, 2016 at 8:42 AM, Madhukar Thota <[email protected]> wrote:
>>
>>> Hi Matt,
>>>
>>> Do you have an example on how to use ExecuteScript on flowContent?
>>>
>>> I have the following URL-encoded string as flow content, which I would
>>> like to parse with Python to get flow attributes based on key/value pairs.
>>>
>>>
>>> rt.start=navigation&rt.tstart=1458797018682&rt.bstart=1458797019033&rt.end=1458797019075&t_resp=21&t_page=372&t_done=393&t_other=t_domloaded%7C364&r=http%3A%2F%2Flocalhost%3A63342%2FBeacon%2Ftest.html&r2=&u=http%3A%2F%2Flocalhost%3A63342%2FBeacon%2Ftest.html&v=0.9&
>>> vis.st=visible
>>>
>>> -Madhu
>>>
>>> On Thu, Mar 24, 2016 at 12:34 AM, Madhukar Thota <[email protected]> wrote:
>>>
>>>> Hi Matt,
>>>>
>>>> Thank you for the input. I updated my config as you suggested and it
>>>> worked like a charm. Also, a big thank you for the nice article; I used
>>>> your article as a reference when I started exploring ExecuteScript.
>>>>
>>>>
>>>> Thanks
>>>> Madhu
>>>>
>>>>
>>>>
>>>> On Thu, Mar 24, 2016 at 12:18 AM, Matt Burgess <[email protected]>
>>>> wrote:
>>>>
>>>>> Madhukar,
>>>>>
>>>>> Glad to hear you found a solution, I was just replying when your email
>>>>> came in.
>>>>>
>>>>> Although in ExecuteScript you have chosen "python" as the script
>>>>> engine, it is actually Jython that is being used to interpret the scripts,
>>>>> not your installed version of Python. The first line (shebang) is ignored
>>>>> as it is a comment in Python/Jython.
>>>>>
>>>>> Modules installed with pip are not automatically available to the
>>>>> Jython engine, but if the modules are pure Python code (rather than native
>>>>> C / CPython), like user_agents is, you can import them in one of two
>>>>> equivalent ways:
>>>>>
>>>>> 1) The way you have done, using sys.path.append. I should mention
>>>>> that "import sys" is done for you so you can safely leave that out if you
>>>>> wish.
>>>>> 2) Add the path to the packages ('/usr/local/lib/python2.7/site-packages')
>>>>> to the Module Path property of the ExecuteScript processor. In this case
>>>>> the processor effectively does Option #1 for you.
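Option 1 above can be sketched roughly as follows. This is only an illustration, not anything NiFi-specific: the helper name add_module_path is made up here, the path is the one quoted in the list above, and it only helps for pure-Python packages, as noted.

```python
import sys

# Path taken from the thread; adjust to wherever pip installed your packages.
SITE_PACKAGES = '/usr/local/lib/python2.7/site-packages'


def add_module_path(path, search_path=sys.path):
    """Append path to the module search path unless it is already present.

    add_module_path is a hypothetical helper, not part of NiFi or Jython.
    """
    if path not in search_path:
        search_path.append(path)
    return search_path


add_module_path(SITE_PACKAGES)
# After this, pure-Python packages under SITE_PACKAGES can be imported, e.g.
# from user_agents import parse
```

The guard against duplicates is just a nicety, so that a script run many times in the same engine does not keep growing the search path.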
>>>>>
>>>>> I was able to get your script to work but had to force the result of
>>>>> parse (a UserAgent object) into a string, so I wrapped it in str:
>>>>>
>>>>> str(parse(flowFile.getAttribute('http.headers.User-Agent')).browser)
>>>>>
>>>>> You're definitely on the right track :) For another Jython example
>>>>> with ExecuteScript, check out this post on my blog:
>>>>> http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited_14.html
>>>>>
>>>>> I am new to Python as well, but am happy to help if I can with any
>>>>> issues you run into, as it will help me learn more as well :)
>>>>>
>>>>> Regards,
>>>>> Matt
>>>>>
>>>>>
>>>>> On Thu, Mar 24, 2016 at 12:10 AM, Madhukar Thota <[email protected]> wrote:
>>>>>
>>>>>> I was able to solve the Python module issue by adding the following
>>>>>> lines:
>>>>>>
>>>>>> import sys
>>>>>> sys.path.append('/usr/local/lib/python2.7/site-packages')  # Path where my modules are installed.
>>>>>>
>>>>>> Now the issue I have is how to parse the incoming attributes
>>>>>> using this library correctly and get the new fields. I am kind of new to
>>>>>> Python, and this is also my first attempt at using Python with NiFi.
>>>>>>
>>>>>> Any help is appreciated.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 23, 2016 at 11:31 PM, Madhukar Thota <[email protected]> wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> I am trying to use the following script to parse
>>>>>>> http.headers.useragent with the Python user_agents module using the
>>>>>>> ExecuteScript processor.
>>>>>>>
>>>>>>> Script:
>>>>>>>
>>>>>>> #!/usr/bin/env python2.7
>>>>>>> from user_agents import parse
>>>>>>>
>>>>>>> flowFile = session.get()
>>>>>>> if (flowFile != None):
>>>>>>>     flowFile = session.putAttribute(flowFile, "browser",
>>>>>>>         parse(flowFile.getAttribute('http.headers.User-Agent')).browser)
>>>>>>>     session.transfer(flowFile, REL_SUCCESS)
>>>>>>>
>>>>>>>
>>>>>>> But the ExecuteScript processor complains about a missing Python module,
>>>>>>> although the modules are already installed using pip and tested outside
>>>>>>> NiFi. How can I add or reference these modules in NiFi?
>>>>>>>
>>>>>>> Error:
>>>>>>>
>>>>>>> 23:28:03 EDT
>>>>>>> ERROR
>>>>>>> af354413-9866-4557-808a-7f3a84353597
>>>>>>> ExecuteScript[id=af354413-9866-4557-808a-7f3a84353597] Failed to
>>>>>>> process session due to
>>>>>>> org.apache.nifi.processor.exception.ProcessException:
>>>>>>> javax.script.ScriptException: ImportError: No module named user_agents
>>>>>>> in <script> at line number 2:
>>>>>>> org.apache.nifi.processor.exception.ProcessException:
>>>>>>> javax.script.ScriptException: ImportError: No module named user_agents
>>>>>>> in <script> at line number 2
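
For the request at the top of the thread, here is a rough sketch of the "create the dictionary in the main script and pass it into the callback constructor" approach. To keep it runnable outside NiFi, process() here takes the already-decoded text; inside ExecuteScript the class would instead extend InputStreamCallback and read the text with IOUtils.toString(inputStream, StandardCharsets.UTF_8), as in the snippets quoted above. The class name AttributeCollector and the variable names are made up for illustration. Note that process() returns nothing, which is what the "None required for void return" error is asking for.

```python
try:
    from urllib import unquote          # Python 2 / Jython
except ImportError:
    from urllib.parse import unquote    # Python 3


class AttributeCollector(object):
    """Stand-in for the InputStreamCallback subclass (hypothetical name)."""

    def __init__(self, attrs):
        # Keep a reference to the caller's dictionary instead of creating one.
        self.attrs = attrs

    def process(self, text):
        # Split on '&' before unquoting so an encoded '&' or '=' inside a
        # value does not break the pair apart.
        for pair in text.split('&'):
            pair = pair.strip()
            if not pair:
                continue  # skip the empty piece left by a trailing '&'
            key, _, value = pair.partition('=')
            self.attrs[unquote(key)] = unquote(value)
        # No return statement: the method implicitly returns None.


attrs = {}
AttributeCollector(attrs).process(
    'rt.start=navigation&t_other=t_domloaded%7C364&r2=&v=0.9&')
# attrs is now populated and can be used by the main script.
```

In the ExecuteScript main script, the populated dictionary would then be applied after session.read() returns, for example by looping over attrs.items() and calling flowFile = session.putAttribute(flowFile, key, value) for each pair before session.transfer().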
