Matt,
Thanks for the suggestion. I did end up creating an ExecuteScript
processor that expands the range of numbers into the content,
and it is working for this scenario.
Thanks again!
- Scott
Matt Foley <mailto:[email protected]>
Monday, February 27, 2017 5:21 PM
If I understand correctly, your goal is that, for each input row
that specifies a range, A to A+N, you generate a sequence of N
(or perhaps N+1) flowfiles, right? And the only difference between the
flowfiles is that you’ve replaced the range specification with a single
number from that range?
I would suggest that at the level of the row input, you use
ExecuteScript to expand each input row into N rows, with the
substituted number values, then run that through SplitText, to get one
row per flowfile. This should be way more efficient, as well as much
safer than a cyclic graph.
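A minimal sketch of that expansion step in plain Python, in case it
helps. It is not the script used in this thread. It assumes each row
contains exactly one "LOW-HIGH" token (for example "widget,100-103")
and that both bounds are inclusive; the names expand_row and
expand_content are made up for illustration. Inside ExecuteScript the
same logic would be wrapped in the usual session read/write calls,
which are left out here.

```python
import re

# Assumed row format: any text containing one "LOW-HIGH" token,
# e.g. "widget,100-103". This format is hypothetical.
RANGE_TOKEN = re.compile(r"(\d+)-(\d+)")


def expand_row(row):
    """Return one row per number in the row's range, bounds inclusive."""
    match = RANGE_TOKEN.search(row)
    if match is None:
        return [row]  # no range found: pass the row through unchanged
    low, high = int(match.group(1)), int(match.group(2))
    # If low > high the range is empty and the row yields nothing.
    return [
        row[:match.start()] + str(n) + row[match.end():]
        for n in range(low, high + 1)
    ]


def expand_content(text):
    """Expand every non-blank row of the content, one output row per number."""
    expanded = []
    for row in text.splitlines():
        if row.strip():
            expanded.extend(expand_row(row))
    return "\n".join(expanded) + "\n" if expanded else ""
```

The expanded content can then go through SplitText with a line split
count of 1 to get one row per flowfile, so no loop is needed in the
graph.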
Cheers,
--Matt
*From: *Scott Wagner <[email protected]>
*Reply-To: *"[email protected]" <[email protected]>
*Date: *Monday, February 27, 2017 at 2:34 PM
*To: *"[email protected]" <[email protected]>
*Subject: *How to gracefully handle a circular graph?
Hello all,
I have created a graph where I am downloading a number of rows
from an SQL database, and each row defines a range of numbers
(100-200, 700-1500, etc.). What I am then doing on the NiFi side is
generating an individual FlowFile for each number in that range. The
way that I was accomplishing this was by setting a "current" attribute
to the lower boundary and another attribute to the upper boundary, and
then creating two queues off the "success" output of a Processor (the
ReplaceText processor in the bottom right of the image). One queue goes
on to process that number's record (going off the bottom right in the
picture). The other goes to a processor that increments the "current"
number and then forwards it to the processor that checks that
"current" is less than or equal to "upper boundary".
This works great, until the queues end up filling up. Once this
happens, I have a gridlock situation where none of the processors in
this triangle are running any longer, because they all have a full
output queue. I have tried searching the Internet and put a little
thought into how I could make it so that my "Check if done" processor
would prefer entries that are coming in from the circular portion of
the graph, but so far haven't been able to come up with anything.
What I have tried is routing both of the input queues to "Check if
done" through a funnel and setting an Oldest FlowFile prioritizer, but
the entire triangle of queues still eventually fills up.
Does anyone have a suggestion as to how I could gracefully handle
a situation like this? I appreciate any advice.
Thanks!
- Scott Wagner