Re: ExecuteScript once at workflow inception

James McMahon Wed, 29 Mar 2017 03:27:17 -0700

Matt, I am adapting a model I found in a reply at Hortonworks for using
Python from InvokeScriptedProcessor:






from org.apache.nifi.processor import Processor, Relationship
from org.python.core import PySet

class PythonProcessor(Processor):

    def __init__(self):
        # Relationships this scripted processor can route FlowFiles to
        self.REL_SUCCESS = Relationship.Builder().name("success").description("FlowFiles that were successfully processed").build()
        self.REL_FAILURE = Relationship.Builder().name("failure").description("FlowFiles that failed to be processed").build()
        self.REL_UNMATCH = Relationship.Builder().name("unmatch").description("FlowFiles that did not match rules").build()
        self.log = None

    def initialize(self, context):
        # Called by InvokeScriptedProcessor; keep the NiFi logger for later use
        self.log = context.getLogger()

    def getRelationships(self):
        return PySet([self.REL_SUCCESS, self.REL_FAILURE, self.REL_UNMATCH])

    def validate(self, context):
        return None

    def getPropertyDescriptor(self, name):
        return None

    def getPropertyDescriptors(self):
        return None

    def onPropertyModified(self, descriptor, oldValue, newValue):
        pass

    def getIdentifier(self):
        return None

processor = PythonProcessor()





This template is my starting point, and I am attempting to bring the Python
code from my ExecuteScript into this model. In the initialize() method I
intend to establish my logger and my handler, which I understand should be
set up once and only once. Something like this:



import logging

LOG_FILENAME = '/home/nifi/latest/logs/LogFile1.log'
FORMAT = "%(asctime)-15s %(message)s"
logging.basicConfig(filename=LOG_FILENAME, format=FORMAT, level=logging.INFO)
a = logging.getLogger("a")
formatter = logging.Formatter('%(asctime)-15s %(message)s')
handler = logging.FileHandler(LOG_FILENAME)
handler.setFormatter(formatter)
a.addHandler(handler)
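
Since initialize() is called again whenever the processor is stopped and restarted, a guard against attaching the same handler twice may be useful. The following is only a minimal sketch using the standard logging module; the helper name get_file_logger is hypothetical, and it has not been tried under Jython inside NiFi. It leaves out basicConfig(), because configuring the root logger with the same filename and also adding a FileHandler to logger "a" would write each message to the file twice.

```python
import logging
import os


def get_file_logger(name, filename, fmt="%(asctime)-15s %(message)s"):
    """Return the named logger with at most one FileHandler for `filename`.

    Hypothetical helper: safe to call repeatedly, e.g. from initialize().
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    target = os.path.abspath(filename)

    # FileHandler stores the absolute path in baseFilename, so an existing
    # handler for the same file can be detected and reused.
    for existing in logger.handlers:
        if isinstance(existing, logging.FileHandler) and existing.baseFilename == target:
            return logger

    handler = logging.FileHandler(target)
    handler.setFormatter(logging.Formatter(fmt))
    logger.addHandler(handler)
    return logger
```

Calling get_file_logger("a", LOG_FILENAME) a second time returns the same logger without adding another handler, because logging.getLogger() always hands back the same object for a given name.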

You mention that:

"One caveat is that the Processor interface does not provide a "stop" or
"shutdown" method, so you will need to make sure that any created objects
(connections, clients, e.g.) will be cleaned up gracefully when the
Processor object is garbage-collected. This is not always easy to do, and
the alternative is to write a full custom processor."


I assume by this you mean that when InvokeScriptedProcessor gets stopped,
I must have a method that executes - only on exit - that does a close() on
my file handle and that does a shutdown() on my logger. How do I incorporate
a method that accomplishes these things only on exit?
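
For reference, the cleanup step on its own might look like the sketch below. It assumes only the standard logging module, and the helper name close_file_logger is hypothetical. It does not answer where to call it from, since the Processor interface offers no stop hook; that part remains the open question.

```python
import logging


def close_file_logger(logger):
    """Flush, close and detach every handler attached to `logger`.

    Hypothetical helper: the caller decides when "exit" actually happens.
    """
    # Iterate over a copy, because removeHandler() mutates logger.handlers.
    for handler in list(logger.handlers):
        try:
            handler.flush()
            handler.close()
        finally:
            logger.removeHandler(handler)
```

The standard library also provides logging.shutdown(), a module-level function that flushes and closes all handlers in the process rather than those of a single logger.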

Thank you.




Thank you again for your help. -Jim






On Tue, Mar 28, 2017 at 11:00 AM, Matt Burgess <[email protected]> wrote:

> Jim,
>
> You can use InvokeScriptedProcessor [1] rather than ExecuteScript for
> this. ExecuteScript basically lets you provide an onTrigger() body,
> which is called when the ExecuteScript processor "has work to do".
> None of the other lifecycle methods are available.  For
> InvokeScriptedProcessor, you actually script up a subclass of
> Processor [2], and it will have its initialize() method called by
> InvokeScriptedProcessor when it is scheduled to run (once per
> "start"). If you stop and start InvokeScriptedProcessor, or change a
> property, the scripted initialize() method will be called again.
>
> One caveat is that the Processor interface does not provide a "stop"
> or "shutdown" method, so you will need to make sure that any created
> objects (connections, clients, e.g.) will be cleaned up gracefully
> when the Processor object is garbage-collected. This is not always
> easy to do, and the alternative is to write a full custom processor.
> There is an open Jira [3] to invoke annotated lifecycle methods such
> as @OnStopped on the scripted Processor instance.
>
> I have a simple example (albeit in Groovy) [4], but the same approach
> you're likely using for Jython should apply there too. Please let me
> know if you have any questions or issues in setting that up.
>
> Regards,
> Matt
>
> [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.script.InvokeScriptedProcessor/index.html
> [2] https://github.com/apache/nifi/blob/master/nifi-api/src/main/java/org/apache/nifi/processor/Processor.java
> [3] https://issues.apache.org/jira/browse/NIFI-2215
> [4] http://funnifi.blogspot.com/2016/02/writing-reusable-scripted-processors-in.html
>
> On Tue, Mar 28, 2017 at 10:48 AM, James McMahon <[email protected]> wrote:
> > Hello. I am interested in calling a python script from ExecuteScript that
> > sets up Python loggers and establishes file handles to those loggers for
> > use by other python scripts called later in the workflow by other
> > ExecuteScript processors. Is there a means to execute a script at workflow
> > inception - once only, not once per flowfile? I have found some retry count
> > examples in the open source literature, but those seem to enforce counts at
> > the flowfile level. In other words the counter restriction sets to 0 for
> > each flowfile. Thank you for any insights. -Jim
>
