Ramanujam,

Rather than writing the result XML to disk and then reading it back, it
might be preferable to pass your XML in the flowfile to the success
relationship from the Python script.  Sadly, I'm no expert on
ExecuteScript, but I made the following franken-sample to demonstrate
reading, transforming, and writing flowfile text (not XML, but hopefully
not too dissimilar):

import sys
import traceback
from java.nio.charset import StandardCharsets
from org.apache.commons.io import IOUtils
from org.apache.nifi.processor.io import StreamCallback
from org.python.core.util import StringUtil

class TransformCallback(StreamCallback):
    def __init__(self):
        pass

    def process(self, inputStream, outputStream):
        try:
            # Read the incoming flowfile content as UTF-8 text
            input_text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
            # Transform the text (replace with your XML logic)
            output_text = input_text + " - Transformed"
            # Write the result back as the new flowfile content
            outputStream.write(StringUtil.toBytes(output_text))
        except:
            traceback.print_exc(file=sys.stdout)
            raise

flowFile = session.get()
if flowFile is not None:
    flowFile = session.write(flowFile, TransformCallback())
    session.transfer(flowFile, REL_SUCCESS)
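For XML specifically, the transform step inside process() could parse and rewrite the content instead of appending text.  Here is a minimal sketch of just the transform logic, using the standard xml.etree module -- the element names and the "status" attribute are made up for illustration, so adapt them to your schema:

```python
import xml.etree.ElementTree as ET

def transform_xml(input_text):
    # Parse the incoming XML, tag each (hypothetical) <record> element
    # with a status attribute, and return the serialized result.
    root = ET.fromstring(input_text)
    for record in root.findall("record"):
        record.set("status", "processed")
    return ET.tostring(root, encoding="unicode")

print(transform_xml("<records><record/></records>"))
```

A helper like this could be called from the body of process() in place of the string concatenation above.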

Does that help?  Are you able to share some Python code where you are
having trouble?  I found the ExecuteScript unit test samples
<https://github.com/apache/nifi/tree/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/jython>
useful.

Thanks,

James

On Mon, Aug 29, 2016 at 4:15 AM, Nathamuni, Ramanujam <[email protected]>
wrote:

> I have a similar question: I have an ExecuteScript processor that runs
> Python code and produces an output file (/tmp/test.xml), but I am not sure
> how to pass that file to the next processor without adding a GetFile
> processor to pick up the file produced by the script.  I am very new to
> NiFi.
>
>
>
> Following is need:
>
>
>
> 1. Read a CSV file from HDFS.
>
> 2. Execute a Python script that reads the CSV file and produces an XML
> output file, for example /tmp/test.xml.
>
> 3. Process the /tmp/test.xml file using the SplitXML processor.
>
> 4. Put the results into HDFS.
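The CSV-to-XML conversion in step 2 could also happen directly on the flowfile content inside ExecuteScript, avoiding the /tmp/test.xml round trip.  A minimal, purely illustrative sketch of that conversion logic (the tag names are taken from the CSV header and the <rows>/<row> wrapper elements are made up):

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_xml(csv_text):
    # Build one <row> element per CSV record; child tag names come
    # from the header line.  Adapt the element names to the real schema.
    root = ET.Element("rows")
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = ET.SubElement(root, "row")
        for key, value in record.items():
            ET.SubElement(row, key).text = value
    return ET.tostring(root, encoding="unicode")

print(csv_to_xml("id,name\n1,alpha\n2,beta"))
```

The resulting XML string could then be written to the flowfile and routed straight into SplitXML.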
>
>
>
>
>
> Thanks,
>
> Ram
>
> *From:* James Wing [mailto:[email protected]]
> *Sent:* Monday, August 29, 2016 12:47 AM
> *To:* [email protected]
> *Subject:* Re: ExecuteScript Processor - Control Flow
>
>
>
> Koustav,
>
> How are you running the Sqoop job?  Can you share some code?  Python is
> sequential by default, but your Sqoop job might run asynchronously.  I
> believe the answer depends on your code (or library) not only starting the
> Sqoop job, but polling for its status until it is complete.
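To illustrate the sequential-by-default point: if the script launches the job with Python's standard subprocess module, subprocess.call does not return until the child process exits, so the following step waits.  A small sketch using a stand-in child process -- the Sqoop command in the comment is hypothetical:

```python
import subprocess
import sys
import time

start = time.time()
# subprocess.call blocks until the child process exits; the next
# statement runs only after the (stand-in) job finishes.  A real
# invocation might look like (with a hypothetical job name):
#   subprocess.call(["sqoop", "job", "--exec", "my_import_job"])
rc = subprocess.call([sys.executable, "-c", "import time; time.sleep(1)"])
print("exit code:", rc)
```

If the library kicking off the job returns immediately instead, the script would need to poll for completion itself.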
>
> Thanks,
>
> James
>
>
>
> On Sun, Aug 28, 2016 at 8:24 PM, koustav choudhuri <[email protected]>
> wrote:
>
> Hi All
>
>
>
> I have a Python script running on a NiFi server, which in turn calls a
> Sqoop job on a different server.  The next step in the script is to use the
> flow file from the previous processor to continue to the next processor.
>
>
>
> So the python script is like :
>
>
>
> 1. Call the Sqoop job on server 2.
>
> 2. Get the flow file from the session and continue.
>
>
>
>
>
> Question:
>
> Will step 2 wait until step 1 completes?
>
> Or, as soon as the Sqoop job is initiated in step 1, does step 2 execute
> regardless of whether step 1 has completed?
>
>
>
> Could be a dumb question, but still asking.
>
>
>
>
>
>
>
>
>
