Thanks a lot, Matt! I'll try implementing it with the ExecuteStreamCommand 
processor.

Best regards,
Elemir

On 25/3/19, 2:48 pm, "Matt Burgess" <[email protected]> wrote:

    As NiFi is a pure Java/JVM application, we use Jython rather than
    Python for ExecuteScript. This means that you can't import modules
    that rely on native code (e.g., CPython C extensions) into your Jython
    scripts in ExecuteScript, which is what I believe is happening here.
    If you need native CPython modules (and if you're operating only on
    flowfile content and not attributes), consider using
    ExecuteStreamCommand with a real Python
    interpreter and script. I'm looking at Py4J to try and bridge the gap,
    but in the meantime you have to choose between "pure" Python (Jython)
    for ExecuteScript and full Python with
    ExecuteStreamCommand/ExecuteProcess.
    
    Regards,
    Matt
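
For illustration, here is a minimal sketch of the kind of standalone script that ExecuteStreamCommand could run under a real CPython interpreter. It is not taken from the thread: the script name (list_s3_keys.py), the helper list_keys, and the bucket argument are hypothetical, and it assumes boto3 is installed for that interpreter. ExecuteStreamCommand passes the incoming flowfile content on stdin and turns whatever the command writes to stdout into the content of the outgoing flowfile, so the script simply prints one S3 key per line.

```python
import sys


def list_keys(client, bucket, prefix=""):
    """Return every object key in the bucket under prefix, following pagination."""
    keys = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty listing carry no "Contents" entry.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


def main(bucket):
    # Imported here so that it is resolved by CPython, where the
    # multiprocessing module that failed under Jython is available.
    import boto3

    # Region copied from the original snippet; adjust as needed.
    client = boto3.client("s3", region_name="ap-southeast-2")
    for key in list_keys(client, bucket):
        sys.stdout.write(key + "\n")


if __name__ == "__main__":
    # Hypothetical invocation: python list_s3_keys.py <bucket>
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        sys.stderr.write("usage: list_s3_keys.py <bucket>\n")
```

In the processor, Command Path would point at the Python interpreter and Command Arguments would carry the script path and bucket name. If one flowfile per key is wanted, as in the original ExecuteScript loop, the output could probably be split downstream (for example with SplitText), though that is a suggestion rather than something confirmed in this thread.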
    
    On Sun, Mar 24, 2019 at 11:40 PM Elemir Stevko
    <[email protected]> wrote:
    >
    > Hello,
    >
    > I am trying to implement a Python-based ExecuteScript processor in NiFi
    > 1.9.0 that will get a list of files from S3. I am getting an exception
    > when I am trying to create an s3 client in boto3:
    >
    > boto_client = boto3.client('s3', region_name='us-east-1')
    >
    > Exception: No module named multiprocessing in <script> at line number 16
    >
    > 2019-03-25 03:08:33,591 ERROR [Timer-Driven Process Thread-4] o.a.nifi.processors.script.ExecuteScript ExecuteScript[id=b258d892-0169-1000-f2ec-1e98e077f15b] Failed to process session due to org.apache.nifi.processor.exception.ProcessException: javax.script.ScriptException: ImportError: No module named multiprocessing in <script> at line number 16
    >
    > I can, however, create a boto3 client for Athena, for example, and that
    > passes without error:
    >
    > boto_client = boto3.client('athena', region_name='us-east-1')
    >
    > I have observed the same behaviour with InvokeScriptedProcessor.
    >
    > I am passing '/usr/lib/python2.7/site-packages/' in the Module Directory
    > property.
    >
    > Here is the code snippet for the ExecuteScript processor that should
    > reproduce this issue:
    >
    > import boto3
    > import java.io
    > from org.apache.commons.io import IOUtils
    > from java.nio.charset import StandardCharsets
    > from org.apache.nifi.processor.io import StreamCallback
    >
    > class PyStreamCallback(StreamCallback):
    >   def __init__(self, text):
    >     self.text = text
    >   def process(self, inputStream, outputStream):
    >     outputStream.write(bytearray(self.text.encode('utf-8')))
    >
    > def getFileList():
    >     return ['file1', 'file2']
    >
    > boto_client = boto3.client('s3', region_name='ap-southeast-2')
    >
    > for file in getFileList():
    >   flowfile = session.create()
    >   if flowfile:
    >     flowfile = session.write(flowfile, PyStreamCallback(file))
    >     session.transfer(flowfile, REL_SUCCESS)
    >
    > Is there any workaround for this issue?
    >
    > Best regards,
    > Elemir
    
