Hi. I wanted to close the loop in case this helps someone with a similar need in the future. This is how I got an InvokeScriptedProcessor to log to a specific log file of my choosing from a Jython script (Jython being an implementation of Python that runs on the Java VM, which I don't yet fully understand). I'm running NiFi 0.7.x and Python 2.6.6.
Please do take my stuff with a grain of salt. Beyond dogged persistence I don't have much experience going for me. I built this Frankenstein monster with:

o initial guidance from Matt B. and Joe W. of this NiFi users group,
o a NiFi example in Git ( https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-scripting-bundle/nifi-scripting-processors/src/test/resources/jython/test_update_attribute.py ),
o this stackoverflow discussion ( http://stackoverflow.com/questions/6729268/python-logging-messages-appearing-twice ) and
o a PySet problem solution from Michael K. available at the Hortonworks site ( https://community.hortonworks.com/questions/75420/invokescriptedprocessor-in-python.html ).

A big thank you for your assistance. -Jim

import sys
import traceback
import logging
from org.apache.nifi.processor import Processor
from org.apache.nifi.processor import Relationship
from org.apache.nifi.components import PropertyDescriptor
from org.apache.nifi.processor.util import StandardValidators
from org.python.core import PySet

class UpdateAttributes(Processor) :

    def __init__(self) :
        self.__rel_success = Relationship.Builder().name("success").description("Success").build()

    def initialize(self, context) :
        try :
            # create a logger associated with this InvokeScriptedProcessor…
            self.logger = logging.getLogger('nifi_ISP_1')
            self.logger.setLevel(logging.DEBUG)
            # DON'T create your logging file handler here because it must be established, used,
            # and discarded with each processing cycle. Else we get all sorts of wacky multiples
            # of outputs in our log file.
        except :
            pass

    def getRelationships(self) :
        return PySet([self.__rel_success])

    def validate(self, context) :
        return None

    def getPropertyDescriptor(self) :
        return None

    def getPropertyDescriptors(self) :
        emptyList=[]
        return emptyList

    def onPropertyModified(self, descriptor, newValue, oldValue) :
        pass

    def onTrigger(self, context, sessionFactory) :
        session = sessionFactory.createSession()
        try :
            # ensure we have some work to do (TBD: try grabbing many files at once rather than
            # one at a time to optimize throughput)
            flowfile = session.get()
            if flowfile is None :
                return

            # Establish a file handler that logs even debug messages…
            fh = logging.FileHandler('/home/nifi/latest/logs/ISP.log')
            fh.setLevel(logging.DEBUG)

            # Create a formatter and add it to the handler…
            formatter = logging.Formatter('%(asctime)-15s %(message)s')
            fh.setFormatter(formatter)
            self.logger.addHandler(fh)

            self.logger.info('About to process file %s',flowfile.getAttribute("filename"))

            # Extract an attribute of interest…
            # fromPropertyValue = context.getProperty("for-attributes").getValue()
            fromAttributeValue = flowfile.getAttribute("filename")

            # Set the attribute to a new value…
            # flowfile = session.putAttribute(flowfile, "from-property", fromPropertyValue)
            flowfile = session.putAttribute(flowfile, "filename", "Larry_Curley_Moe.txt")

            self.logger.info('File renamed to %s',flowfile.getAttribute("filename"))

            # Remove the handler…
            fh.close()
            while self.logger.handlers :
                self.logger.handlers.pop()
            del fh

            # transfer…
            session.transfer(flowfile, self.__rel_success)
            session.commit()
        except :
            session.rollback()
            # re-raise the original exception after rolling back
            raise

processor = UpdateAttributes()

On Tue, Apr 4, 2017 at 4:23 PM, James Wing <[email protected]> wrote:
> James,
>
> I apologize, I did not absorb the fact that you are using the Python
> logging package rather than the built-in Java logging. My advice below is
> not helpful.
>
> Are you aware that you can use the Java logging from within scripts?
> I believe you can configure it to provide the various files you need.
>
> On Tue, Apr 4, 2017 at 11:35 AM, James McMahon <[email protected]>
> wrote:
>
>> How many distinct log files will that logback approach permit me, James? I
>> have six different workflow paths for which I want to log to separate and
>> distinct log files.
>>
>> From this initialization, which of my python logging commands would stay
>> in initialize() and which would move to the logback.xml?
>>
>>     self.logger = logging.getLogger('nifi_ISP_1')
>>     self.logger.setLevel(logging.DEBUG)
>>     # establish a file handler…
>>     fh = logging.FileHandler('/home/nifi/latest/logs/TestLog.log')
>>     fh.setLevel(logging.DEBUG)
>>     # create a formatter and associate it with our handler…
>>     formatter = logging.Formatter('%(asctime)-15s %(message)s')
>>     fh.setFormatter(formatter)
>>     self.logger.addHandler(fh)
>>     self.logger.info('Stooges 2020: Larry, Curley, Moe for President')
>>
>> So the problem here is that when I Stop the processor, the file handlers
>> are not eliminated? The initialize() runs at Start only, but if it has been
>> stopped and started one or more times prior it inherits all that previous
>> baggage. Is that right?
>>
>> Thanks very much.
>>
>> Jim
>>
>> On Tue, Apr 4, 2017 at 2:18 PM, James Wing <[email protected]> wrote:
>>
>>> James,
>>>
>>> I suspect your call to self.logger.addHandler(fh) is cumulatively adding
>>> to your log results as initialize() is called again. Can you define the
>>> log file and formatting in your NiFi's conf/logback.xml (no restart
>>> required)? Then you can safely call getLogger() and access the shared
>>> configuration.
>>>
>>> Thanks,
>>>
>>> James
>>>
>>> On Tue, Apr 4, 2017 at 10:02 AM, James McMahon <[email protected]>
>>> wrote:
>>>
>>>> Good afternoon. I have been working to configure logging to a specific
>>>> log file that associates exclusively with one NiFi processor instance.
>>>> I understood from previous posts that InvokeScriptedProcessor processor lends
>>>> itself to this requirement. ISP allows for code that runs once - and only
>>>> once - when the processor is started.
>>>>
>>>> In this code snippet that follows I show how I establish my logging in
>>>> my ISP initialize() method. The problem I am having is that with each
>>>> stop/restart of the processor during test and development it appears that I
>>>> am initializing new instances of my logger following each restart of the
>>>> processor. After my first run my log has
>>>>
>>>> Stooges 2020…
>>>>
>>>> After my second run,
>>>>
>>>> Stooges 2020…
>>>> Stooges 2020…
>>>>
>>>> After my third,
>>>>
>>>> Stooges 2020…
>>>> Stooges 2020…
>>>> Stooges 2020…
>>>>
>>>> And so on. After three runs of my ISP processor I have six log entries
>>>> rather than the three I expect.
>>>>
>>>> My code below is lacking something. How do I correct this error so that
>>>> I do not output N lines to the log file on run N?
>>>>
>>>> class UpdateAttributes(Processor) :
>>>>
>>>>     def __init__(self) :
>>>>         self.__rel_success = Relationship.Builder().name("success").description("Success").build()
>>>>
>>>>     def initialize(self, context) :
>>>>         try :
>>>>             # create a logger for this processor…
>>>>             self.logger = logging.getLogger('nifi_ISP_1')
>>>>             self.logger.setLevel(logging.DEBUG)
>>>>             # establish a file handler…
>>>>             fh = logging.FileHandler('/home/nifi/latest/logs/TestLog.log')
>>>>             fh.setLevel(logging.DEBUG)
>>>>             # create a formatter and associate it with our handler…
>>>>             formatter = logging.Formatter('%(asctime)-15s %(message)s')
>>>>             fh.setFormatter(formatter)
>>>>             self.logger.addHandler(fh)
>>>>             self.logger.info('Stooges 2020: Larry, Curley, Moe for President')
>>>>         except :
>>>>             pass
>>>>
>>>>     def getRelationships(self) :
>>>>     .
>>>>     .
>>>>     .
>>>>
>>>> processor = UpdateAttributes()
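For anyone who would rather not build and tear down the file handler on every onTrigger() call, here is a possible alternative, offered as a sketch that has only been reasoned through in plain Python and not verified inside NiFi. It rests on the behaviour James Wing described above: logging.getLogger(name) hands back the same logger object each time, so handlers added earlier are still attached. The helper name get_file_logger, the logger name 'demo_isp_logger', and the temporary log path are all made up for this illustration.

```python
import logging
import os
import tempfile

def get_file_logger(name, path):
    """Hypothetical helper: return the named logger, attaching a
    FileHandler for `path` only if one is not already attached."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    target = os.path.abspath(path)
    # getLogger(name) returns the same object on every call, so any
    # handler added by an earlier call is still in logger.handlers.
    for handler in logger.handlers:
        if isinstance(handler, logging.FileHandler) and handler.baseFilename == target:
            return logger
    fh = logging.FileHandler(target)
    fh.setLevel(logging.DEBUG)
    fh.setFormatter(logging.Formatter('%(asctime)-15s %(message)s'))
    logger.addHandler(fh)
    return logger

# Simulate three stop/start cycles that each run the setup code again.
log_path = os.path.join(tempfile.mkdtemp(), 'demo.log')
for _ in range(3):
    logger = get_file_logger('demo_isp_logger', log_path)
    logger.info('Stooges 2020')

print(len(logger.handlers))  # 1
with open(log_path) as f:
    print(len(f.read().splitlines()))  # 3
```

With the guard in place the three runs leave three lines in the file rather than six. Whether the Python logging state really survives a processor stop/start the same way under NiFi's Jython engine is an assumption here, based only on the duplicate output reported in this thread.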
