Hi Matt,

My Python/Jython skills are poor. Could you please provide an example?

-Madhu

On Wed, Mar 30, 2016 at 5:53 PM, Matt Burgess <[email protected]> wrote:

> Madhu,
>
> Since you won't be able to return your dictionary, another approach would
> be to create the dictionary from the main script and pass it into the
> callback constructor. Then process() can update it, and you can use the
> populated dictionary after process() returns to set attributes and such.
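A minimal sketch of the approach described above, not a tested NiFi script. The main script creates the dictionary, passes it to the callback constructor, and uses it once session.read() has returned. It assumes ExecuteScript binds `session` and `REL_SUCCESS` as in the script quoted below. The names `attrs` and `read_text`, and the `except` branches that swap in stand-ins when the Java classes or `session` are missing, are illustrative additions so the sketch can also be tried outside NiFi. Note that process() returns nothing: the Java method is void, which is what the "None required for void return" error is about.

```python
try:
    from urllib import unquote          # Jython / Python 2
except ImportError:
    from urllib.parse import unquote    # Python 3, only for trying this locally

try:
    from org.apache.commons.io import IOUtils
    from java.nio.charset import StandardCharsets
    from org.apache.nifi.processor.io import InputStreamCallback
except ImportError:
    # Not running under NiFi/Jython: illustrative stand-ins, not part of NiFi.
    IOUtils = None
    InputStreamCallback = object


def read_text(input_stream):
    """Return the whole stream as text (hypothetical helper)."""
    if IOUtils is not None:
        return IOUtils.toString(input_stream, StandardCharsets.UTF_8)
    return input_stream.read()  # local stand-in: a file-like object yielding text


class PyReadStreamCallback(InputStreamCallback):
    def __init__(self, attrs):
        # Keep a reference to the caller's dictionary; process() fills it in place.
        self.attrs = attrs

    def process(self, inputStream):
        for pair in read_text(inputStream).split('&'):
            if '=' in pair:
                key, value = pair.split('=', 1)
                self.attrs[unquote(key)] = unquote(value)
        # Deliberately no return statement: process() is void on the Java side.


try:
    flowFile = session.get()   # 'session' is bound by ExecuteScript
except NameError:
    flowFile = None            # outside NiFi there is nothing to process

if flowFile is not None:
    attrs = {}
    # read() with an InputStreamCallback returns nothing, so flowFile is not reassigned.
    session.read(flowFile, PyReadStreamCallback(attrs))
    for name, value in attrs.items():
        flowFile = session.putAttribute(flowFile, name, value)
    session.transfer(flowFile, REL_SUCCESS)
```

The callback only ever touches the dictionary it was given, so the main script keeps sole responsibility for the session and the flow file reference.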
>
> Regards,
> Matt
>
>
> On Mar 30, 2016, at 5:00 PM, Madhukar Thota <[email protected]>
> wrote:
>
> Matt,
>
> I tried the following code, but I am getting the error below. Can you
> help me see what I am doing wrong?
>
> Error:
>  16:56:10 EDT
> ERROR
> 6f15a6f2-7744-404c-9961-f545d3f29042
>
> ExecuteScript[id=6f15a6f2-7744-404c-9961-f545d3f29042] Failed to process 
> session due to org.apache.nifi.processor.exception.ProcessException: 
> javax.script.ScriptException: TypeError: None required for void return in 
> <script> at line number 38: 
> org.apache.nifi.processor.exception.ProcessException: 
> javax.script.ScriptException: TypeError: None required for void return in 
> <script> at line number 38
>
>
> Code:
>
> import urllib
> import urlparse
> import java.io
> from org.apache.commons.io import IOUtils
> from java.nio.charset import StandardCharsets
> from org.apache.nifi.processor.io import InputStreamCallback
>
>
>
> class PyReadStreamCallback(InputStreamCallback):
>     def __init__(self):
>         self.d = {}
>
>     def process(self, inputStream):
>         text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
>         split = (urllib.unquote(text)).split("&")
>         self.d = dict(s.split('=') for s in split)
>         return self.d
>
>
> flowFile = session.get()
> if (flowFile != None):
>     flowFile = session.read(flowFile, PyReadStreamCallback())
>     flowFile = session.putAttribute(flowFile,
>         PyReadStreamCallback().process())
>     session.transfer(flowFile, REL_SUCCESS)
>
>
> On Thu, Mar 24, 2016 at 8:59 AM, Matt Burgess <[email protected]> wrote:
>
>> Madhu,
>>
>> The example from my blog post shows how to overwrite flow content, by
>> first reading in content from an input stream, then processing it and
>> writing back out to an output stream.  If for your example you just need to
>> read from the incoming flow file and add some attributes, you can use the
>> session.read() method instead of session.write(). In Jython the callback
>> might look something like this:
>>
>> class PyReadStreamCallback(InputStreamCallback):
>>     def __init__(self):
>>         pass
>>
>>     def process(self, inputStream):
>>         text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
>>         # Do your parsing here
>>
>> Note the stream callback methods do not have a reference to the
>> ProcessSession, so you may want to create a dictionary for the attributes
>> to be added, and pass that into the PyReadStreamCallback constructor. Then
>> process() would add the attribute name/value pairs to the dictionary, and
>> after you call session.read() in the main script, you can add all the
>> attributes from the dictionary to the flow file.
>>
>> The rest of the script will likely be similar to the blog post's script.
>> Note that there is no "outputStream" passed in (as PyReadStreamCallback is a
>> subclass of InputStreamCallback, not StreamCallback), so there is no
>> "outputStream.write()" call in the process() method or anywhere else in the
>> script.
>>
>> You may find another blog post helpful:
>> http://funnifi.blogspot.com/2016/02/executescript-explained-split-fields.html
>>  Although it uses Groovy as the language, it also explains some of the NiFi
>> Java API, at least the part that deals with reading/writing flow files,
>> immutable flow file references, etc.
>>
>> Let me know if this works for you and/or if you have other questions or
>> issues.
>>
>> Cheers,
>> Matt
>>
>> On Thu, Mar 24, 2016 at 8:42 AM, Madhukar Thota <[email protected]
>> > wrote:
>>
>>> Hi Matt,
>>>
>>> Do you have an example of how to use ExecuteScript on flow file content?
>>>
>>> I have the following URL-encoded string as flow content, and I would like
>>> to parse it with Python to create flow file attributes from the key/value
>>> pairs.
>>>
>>>
>>> rt.start=navigation&rt.tstart=1458797018682&rt.bstart=1458797019033&rt.end=1458797019075&t_resp=21&t_page=372&t_done=393&t_other=t_domloaded%7C364&r=http%3A%2F%2Flocalhost%3A63342%2FBeacon%2Ftest.html&r2=&u=http%3A%2F%2Flocalhost%3A63342%2FBeacon%2Ftest.html&v=0.9&vis.st=visible
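The parsing step for a string like the one above can be sketched in plain Python. This is only a sketch under a few assumptions: `parse_beacon` and `sample` are hypothetical names, the sample is a shortened form of the string above with a stray trailing '&' added to show that edge case, and the import fallback exists only so the snippet runs on either Jython/Python 2 or Python 3. Each pair is split on the first '=' before decoding, so an encoded '&' or '=' inside a value cannot be mistaken for a separator.

```python
try:
    from urllib import unquote          # Jython / Python 2
except ImportError:
    from urllib.parse import unquote    # Python 3


def parse_beacon(text):
    """Turn 'k1=v1&k2=v2' into a dict, decoding %XX escapes in keys and values."""
    attrs = {}
    for pair in text.strip().split('&'):
        if not pair:
            continue  # skip empty pieces, e.g. after a trailing '&'
        # partition() splits on the first '=' only and tolerates a missing value.
        key, _, value = pair.partition('=')
        attrs[unquote(key)] = unquote(value)
    return attrs


sample = ("rt.start=navigation&t_other=t_domloaded%7C364"
          "&r=http%3A%2F%2Flocalhost%3A63342%2FBeacon%2Ftest.html&r2=&v=0.9&")
attrs = parse_beacon(sample)
```

With this sample, `attrs['t_other']` is `'t_domloaded|364'` and `attrs['r2']` is an empty string. The standard library's `parse_qsl` (in `urlparse` on Python 2, `urllib.parse` on Python 3) does similar work, though it also turns '+' into a space.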
>>>
>>> -Madhu
>>>
>>> On Thu, Mar 24, 2016 at 12:34 AM, Madhukar Thota <
>>> [email protected]> wrote:
>>>
>>>> Hi Matt,
>>>>
>>>> Thank you for the input. I updated my config as you suggested and it
>>>> worked like a charm. A big thank you for the nice article as well; I used
>>>> it as a reference when I started exploring ExecuteScript.
>>>>
>>>>
>>>> Thanks
>>>> Madhu
>>>>
>>>>
>>>>
>>>> On Thu, Mar 24, 2016 at 12:18 AM, Matt Burgess <[email protected]>
>>>> wrote:
>>>>
>>>>> Madhukar,
>>>>>
>>>>> Glad to hear you found a solution, I was just replying when your email
>>>>> came in.
>>>>>
>>>>> Although in ExecuteScript you have chosen "python" as the script
>>>>> engine, it is actually Jython that is being used to interpret the scripts,
>>>>> not your installed version of Python.  The first line (shebang) is ignored
>>>>> as it is a comment in Python/Jython.
>>>>>
>>>>> Modules installed with pip are not automatically available to the
>>>>> Jython engine, but if the modules are pure Python code (rather than native
>>>>> C / CPython), like user_agents is, you can import them one of two
>>>>> equivalent ways:
>>>>>
>>>>> 1) The way you have done, using sys.path.append.  I should mention
>>>>> that "import sys" is done for you so you can safely leave that out if you
>>>>> wish.
>>>>> 2) Add the path to the packages ('/usr/local/lib/python2.7/site-packages')
>>>>> to the Module Path property of the ExecuteScript processor. In this case
>>>>> the processor effectively does Option #1 for you.
>>>>>
>>>>> I was able to get your script to work but had to force the result of
>>>>> parse (a UserAgent object) into a string, so I wrapped it in str:
>>>>>
>>>>> str(parse(flowFile.getAttribute('http.headers.User-Agent')).browser)
>>>>>
>>>>> You're definitely on the right track :)  For another Jython example
>>>>> with ExecuteScript, check out this post on my blog:
>>>>> http://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited_14.html
>>>>>
>>>>> I am new to Python as well, but am happy to help if I can with any
>>>>> issues you run into, as it will help me learn more as well :)
>>>>>
>>>>> Regards,
>>>>> Matt
>>>>>
>>>>>
>>>>> On Thu, Mar 24, 2016 at 12:10 AM, Madhukar Thota <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> I was able to solve the Python module issue by adding the following
>>>>>> lines:
>>>>>>
>>>>>> import sys
>>>>>> # Path where my modules are installed.
>>>>>> sys.path.append('/usr/local/lib/python2.7/site-packages')
>>>>>>
>>>>>> Now the issue I have is how to parse the incoming attributes correctly
>>>>>> with this library and get the new fields. I am fairly new to Python, and
>>>>>> this is my first attempt at using Python with NiFi.
>>>>>>
>>>>>> Any help is appreciated.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 23, 2016 at 11:31 PM, Madhukar Thota <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> I am trying to use the following script to parse the
>>>>>>> http.headers.User-Agent attribute with the Python user_agents module
>>>>>>> in the ExecuteScript processor.
>>>>>>>
>>>>>>> Script:
>>>>>>>
>>>>>>> #!/usr/bin/env python2.7
>>>>>>> from user_agents import parse
>>>>>>>
>>>>>>> flowFile = session.get()
>>>>>>> if (flowFile != None):
>>>>>>>   flowFile = session.putAttribute(flowFile, "browser",
>>>>>>>       parse(flowFile.getAttribute('http.headers.User-Agent')).browser)
>>>>>>>   session.transfer(flowFile, REL_SUCCESS)
>>>>>>>
>>>>>>>
>>>>>>> But ExecuteScript complains about a missing Python module, even though
>>>>>>> the module is installed with pip and works outside NiFi. How can I add
>>>>>>> or reference these modules in NiFi?
>>>>>>>
>>>>>>> Error:
>>>>>>>
>>>>>>> 23:28:03 EDT
>>>>>>> ERROR
>>>>>>> af354413-9866-4557-808a-7f3a84353597
>>>>>>> ExecuteScript[id=af354413-9866-4557-808a-7f3a84353597] Failed to
>>>>>>> process session due to
>>>>>>> org.apache.nifi.processor.exception.ProcessException:
>>>>>>> javax.script.ScriptException: ImportError: No module named user_agents
>>>>>>> in <script> at line number 2:
>>>>>>> org.apache.nifi.processor.exception.ProcessException:
>>>>>>> javax.script.ScriptException: ImportError: No module named user_agents
>>>>>>> in <script> at line number 2
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
