Hello,
I am trying to implement a Python-based ExecuteScript processor in NiFi 1.9.0
that will get a list of files from S3. I get an exception when I try to
create an S3 client with boto3:
boto_client = boto3.client('s3', region_name='us-east-1')
Exception: No module named multiprocessing in <script> at line number 16
2019-03-25 03:08:33,591 ERROR [Timer-Driven Process Thread-4]
o.a.nifi.processors.script.ExecuteScript
ExecuteScript[id=b258d892-0169-1000-f2ec-1e98e077f15b] Failed to process
session due to org.apache.nifi.processor.exception.ProcessException:
javax.script.ScriptException: ImportError: No module named multiprocessing in
<script> at line number 16
I can, however, create a boto3 client for Athena, for example, and that works
without error:
boto_client = boto3.client('athena', region_name='us-east-1')
I have observed the same behaviour with InvokeScriptedProcessor.
I am passing '/usr/lib/python2.7/site-packages/' in the Module Directory
property.
Here is a code snippet for the ExecuteScript processor that should reproduce
this issue:
import boto3
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback

class PyStreamCallback(StreamCallback):
    def __init__(self, text):
        self.text = text
    def process(self, inputStream, outputStream):
        outputStream.write(bytearray(self.text.encode('utf-8')))

def getFileList():
    return ['file1', 'file2']

boto_client = boto3.client('s3', region_name='ap-southeast-2')

for file in getFileList():
    flowfile = session.create()
    if flowfile:
        flowfile = session.write(flowfile, PyStreamCallback(file))
        session.transfer(flowfile, REL_SUCCESS)
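
In case it helps narrow this down, here is a minimal sketch of a check that
could be run in the same script engine to see which standard-library modules
it can import. The helper name can_import and the list of module names are
only illustrative, and I am assuming importlib is available in the engine's
Python 2.7 standard library:

```python
import importlib


def can_import(module_name):
    """Return True if module_name can be imported by this interpreter."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


# Modules to probe; 'multiprocessing' is the one named in the error,
# the other two are arbitrary examples for comparison.
for name in ['multiprocessing', 'threading', 'json']:
    print('%s importable: %s' % (name, can_import(name)))
```

If 'multiprocessing' is reported as not importable here, the failure would
seem to come from the script engine's interpreter itself rather than from the
Module Directory setting.
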
Is there any workaround for this issue?
Best regards,
Elemir