I am about to submit a PR implementing run-once scheduling. There is no outstanding JIRA ticket for it, so what NIFI-XXXX number or other label should I put in the title of the PR?
Thanks,

Naz Irizarry
MITRE Corp.
617-893-0074

> On Jan 12, 2017, at 3:55 PM, Irizarry Jr., Nazario <[email protected]> wrote:
>
> I think it is a matter of the model in one's head. If one thinks of a
> continuous activation paradigm, the green arrow versus red square indicate
> what you point out. On the other hand, in an ad-hoc run-once paradigm the
> green arrow is a nice succinct indicator of what has not run yet. In an
> analytics environment, processing can take minutes to hours for some
> processors. As processing goes on, the processors with the remaining green
> arrows indicate what is left to complete in the “visual script.”
>
> Consider the following example. Say there are five processors. The
> first processor, say A, makes a query and gets data. Depending on what I
> know about today’s input to A, the output should be directed to B1, B2, B3, or
> B4. The B's are actually variations on a particular analytic algorithm, and
> most of the time only one of them needs to be used. On one day (based on
> external knowledge) I click on A and B1 and then the Start arrow. On another
> day I modify the query, click on A and B2, and then click on the Start arrow,
> etc. Clearly I could have four flows and I could start/stop entire flows.
> But as the number of processing stages increases and the number of
> processing alternatives increases at each stage, the combinatorial growth
> makes distinct flows painful to manage. Sometimes it is easier to have one
> all-encompassing flow and then allow the analyst to shift-click the portions
> they want to invoke for the next “run.”
>
> Naz Irizarry
> MITRE Corp.
> 617-893-0074
>
>> On Jan 12, 2017, at 2:14 PM, Joe Witt <[email protected]> wrote:
>>
>> Naz
>>
>> The green arrow vs red square says "scheduled to execute" vs "not
>> scheduled to execute".
>> For most processors, such as those which take
>> input flow files from a connection, even if they're scheduled to run
>> they're not going to be executed unless there is work to do (data
>> sitting in the queue) and space available (on all destination
>> relationships). Because of this I'm suggesting to consider just
>> leaving them all scheduled to execute even though they won't actually
>> be doing anything most of the time. The stats on each component tell
>> you how many times it was actually invoked and how much data it
>> processed, etc. So you'll see that they're not doing anything most
>> of the time.
>>
>> You mentioned not wanting to have to do anything manual, yet run-once
>> would be a manual construct, right?
>>
>> I don't mean to suggest I'm closed off to the idea of a run-once
>> concept; I just really want to understand your use case better.
>>
>> Thanks
>> Joe
>>
>> On Thu, Jan 12, 2017 at 2:11 PM, Irizarry Jr., Nazario <[email protected]>
>> wrote:
>>> Correction, that was the processor scheduler’s stopProcessor() method that
>>> needs to be invoked so the UI shows that the processor is stopped.
>>>
>>> Naz Irizarry
>>> MITRE Corp.
>>> 617-893-0074
>>>
>>>> On Jan 12, 2017, at 2:08 PM, Irizarry Jr., Nazario <[email protected]> wrote:
>>>>
>>>> Yes, we found that to keep the UI in sync (make sure it looks stopped
>>>> after it runs once) the flow controller's stopProcessor() method has to be
>>>> called.
>>>>
>>>> Naz Irizarry
>>>> MITRE Corp.
>>>> 617-893-0074
>>>>
>>>> On Jan 12, 2017, at 1:41 PM, Brandon DeVries
>>>> <[email protected]> wrote:
>>>>
>>>> I think answering Joe's question is step one.
>>>> However, I was curious and
>>>> tried something:
>>>>
>>>> public void onTrigger(...){
>>>>     // Do nothing once the scheduled flag has been cleared.
>>>>     if (!isScheduled()) {
>>>>         return;
>>>>     }
>>>>     FlowFile flowFile = session.get();
>>>>     if (flowFile == null) {
>>>>         return;
>>>>     }
>>>>     session.transfer(flowFile, REL_SUCCESS);
>>>>     // Clear the flag so no further FlowFiles pass until a stop / start.
>>>>     updateScheduledFalse();
>>>> }
>>>>
>>>> This processes one FlowFile per "scheduling". I.e., one FlowFile goes
>>>> through, and you need to stop / start to get another. I'm not 100% sure that
>>>> nothing else would ever call the "built in" updateScheduled* methods, but
>>>> worst case the processor could maintain its own flag. Also, for what it's
>>>> worth, calling updateScheduledFalse() doesn't "stop" the processor on the
>>>> graph... as Oleg mentions, this (or something like this) could potentially
>>>> be visually confusing.
>>>>
>>>> I'm not sure how this fits in a production system, but this +
>>>> GenerateFlowFile and some backpressure seems possibly useful for
>>>> debugging. I know I've faked this behavior with a GenerateFlowFile w/ run
>>>> schedule "1 day" or something before... then again, maybe it would be best
>>>> to not create something that could be confusing / misused in a production
>>>> system.
>>>>
>>>> Brandon
>>>>
>>>> On Thu, Jan 12, 2017 at 1:02 PM Joe Witt
>>>> <[email protected]> wrote:
>>>>
>>>> Naz,
>>>>
>>>> Why not just leave all the processes running? If the data only
>>>> arrives periodically that is ok, right?
>>>>
>>>> Thanks
>>>> Joe
>>>>
>>>> On Thu, Jan 12, 2017 at 10:54 AM, Irizarry Jr., Nazario
>>>> <[email protected]>
>>>> wrote:
>>>> On a project that I am on we have been looking at using NiFi for
>>>> orchestrations that are invoked infrequently. For example, once a month a
>>>> new data input product becomes available and then one wants to run it
>>>> through a set of processing steps that can be nicely implemented using NiFi
>>>> processors.
>>>> However, using the interval or cron scheduling for this
>>>> purpose begins to get cumbersome after a while with the need to start and
>>>> manually stop these occasional flows.
>>>>
>>>> It would be fairly easy to add an additional scheduling option - “Run
>>>> Once” for this use case. The behavior would be that when a processor is
>>>> set to run once it automatically stops after it has successfully processed
>>>> one input.
>>>>
>>>> What do people think? We are willing to implement this small
>>>> enhancement.
>>>>
>>>> Cheers,
>>>>
>>>> Naz Irizarry
>>>> MITRE Corp.
>>>> 617-893-0074
>>>>
>>>
>>
>
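
For readers following the thread, the run-once behavior discussed above (handle one input per Start, then do nothing until the next Start) can be sketched without any NiFi dependency. This is only an illustration under stated assumptions, not NiFi's API or the implementation in the PR: the class RunOnceWorker and its methods enqueue, start, stop, trigger, isArmed, and successItems are hypothetical names, the queue stands in for an incoming connection, and the flag is the processor-maintained flag Brandon mentions as a fallback. It assumes a single scheduler thread calls trigger().

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a processor; not part of NiFi. */
final class RunOnceWorker {
    private final Queue<String> input = new ArrayDeque<>();   // stands in for the incoming connection
    private final List<String> success = new ArrayList<>();   // stands in for the success relationship
    // The worker's own flag: set by start(), cleared after one item is handled.
    private final AtomicBoolean armed = new AtomicBoolean(false);

    void enqueue(String item) { input.add(item); }

    /** Corresponds to the user pressing Start. */
    void start() { armed.set(true); }

    /** Corresponds to the user pressing Stop. */
    void stop() { armed.set(false); }

    boolean isArmed() { return armed.get(); }

    List<String> successItems() { return success; }

    /** Called repeatedly by a scheduler; does work at most once per start(). */
    void trigger() {
        if (!armed.get()) {
            return;                 // already ran once, or never started
        }
        String item = input.poll();
        if (item == null) {
            return;                 // nothing queued yet: stay armed and wait
        }
        success.add(item);          // "transfer" the item downstream
        armed.set(false);           // disarm until the next start()
    }

    public static void main(String[] args) {
        RunOnceWorker worker = new RunOnceWorker();
        worker.enqueue("a");
        worker.enqueue("b");
        worker.start();
        worker.trigger();           // handles "a"
        worker.trigger();           // no-op: the worker disarmed itself
        System.out.println(worker.successItems());
        worker.start();
        worker.trigger();           // handles "b"
        System.out.println(worker.successItems());
    }
}
```

Note that the sketch stays armed when the queue is empty, so "run once" here means "process one input", matching the proposal that the processor stops after it has successfully processed one input. It does not address the UI concern raised in the thread: clearing an internal flag would not by itself make the processor look stopped on the graph.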
