Kyle Sykes created NIFI-5911:
--------------------------------

             Summary: Create Integration Test Framework for Flow Developers
                 Key: NIFI-5911
                 URL: https://issues.apache.org/jira/browse/NIFI-5911
             Project: Apache NiFi
          Issue Type: Wish
          Components: Tools and Build
            Reporter: Kyle Sykes
         Attachments: nifi_integrated_testing_sample.png

(I was encouraged to post this from the NiFi Slack.)

As a flow developer, I would like to be able to test my flows using an 
integrated testing approach, rather than unit testing each individual 
processor.  I can currently guarantee that a single processor will work as 
intended, but making a change early in a flow can lead to unanticipated 
consequences later on as the FlowFile gets passed off to other flows for 
processing.

The basic functionality I'm looking for is generating a series of test 
FlowFiles to insert into my pipeline at some step, and being able to insert 
test checks (akin to assert statements) throughout my flow that are only 
executed against the test FlowFiles.  Currently, I can implement this using 
the following workflow (see the attached file for something similar I made a 
while back as a proof of concept of the idea):
 # Generate a FlowFile with a boolean attribute `testFlowFile`.
 # Connect the `GenerateFlowFile` processor to the flow I want to run some 
test cases on.
 # Insert `RouteOnAttribute` processors that route all test FlowFiles to a 
group of processors that asserts things about the attributes or content, 
adding a `testFailed` attribute if any of the test cases fail (see the sketch 
after this list).
 # If `testFailed` is true, then route the FlowFile to a "Failed" location for 
further examination by me
 # If `testFailed` is false, then route the FlowFile to an Output Port which 
inserts it back into the flow to continue down the line.
 # Insert `RouteOnAttribute` processors to prevent the test FlowFiles from 
being inserted into databases, or to stop the testing for that particular 
flow.
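
For reference, here is a minimal sketch of the assertion step above, assuming 
an `ExecuteScript` processor running Jython and the `testFlowFile`/`testFailed` 
attribute names from the workflow; the `record.count` attribute being checked 
is purely a hypothetical example:

```python
# ExecuteScript (Jython) body: check attributes on test FlowFiles and
# mark failures by setting a testFailed attribute.
flowFile = session.get()
if flowFile is not None:
    failed = 'false'
    # Only apply the checks to FlowFiles generated for testing.
    if flowFile.getAttribute('testFlowFile') == 'true':
        # Example assertion: an upstream processor was expected to set
        # record.count (hypothetical attribute) to a positive value.
        count = flowFile.getAttribute('record.count')
        if count is None or int(count) <= 0:
            failed = 'true'
    flowFile = session.putAttribute(flowFile, 'testFailed', failed)
    session.transfer(flowFile, REL_SUCCESS)
```

A downstream `RouteOnAttribute` processor can then route on 
`${testFailed:equals('true')}` to the "Failed" location, and 
`${testFlowFile:equals('true')}` can be used to keep test FlowFiles away from 
processors that write to external systems.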

The above causes a large number of extra processors to be created and generally 
makes the flows more difficult to understand.  My work has me handing off flows 
to a client, and I like to keep things as understandable as possible, but this 
approach can get a little messy.

I envision several ways this could work as an integrated part of NiFi, so I'll 
give some thoughts and leave it to the contributors to determine what's 
actually feasible/useful.
 # A `GenerateTestFile` processor that generates a FlowFile that is internally 
marked as a test FlowFile (probably by an attribute), allowing someone to set 
both the attributes and the content.  The content could be either text (in the 
case of JSON) or an external file to be loaded (in the case of binary formats 
such as images or video).
 ## This could also extend to a processor that loads many test cases at once to 
test specific scenarios.  Giving a useful name to the test case would also be 
nice for debugging.
 ## Allow for the possibility that a FlowFile is anticipated to fail the test 
case (xfail-type situations).  I don't have an immediate use case, but it's 
worth considering whether it's something that needs to be implemented.
 # Processors specifically designed to create a wide variety of test cases and 
to store information about why a test failed on the FlowFile (a stack trace of 
errors if applicable, or just a pointer to which test case failed).  These 
could be something like `TestAttribute`, `TestContent`, or `TestExecuteScript`.  
Some thought could be given to how to best set up test cases in general.  
Perhaps the workaround I use above isn't the best approach, but I'm sure people 
more familiar with testing in general might have better approaches.
 # The ability to mark processors as "Non Testable", specifically things like 
database connections, additions to queues, or output ports to other flows you 
don't necessarily want to pass a test FlowFile to (maybe set some/all as Non 
Testable by default?).  Similar to how you can right-click a processor and 
begin tracking it with the Registry, I was envisioning something along the 
lines of an "Allow Processing of Test FlowFiles" option.
 # Perhaps have some way to automate runs of a test suite and monitor the 
results visually, with an overview of which test cases failed (by name).
 ## Allow for the situation where a developer wants a CI workflow to develop 
and promote their flows: support creating an isolated testing environment 
(using Docker to load NiFi), loading the flow to be tested, running the test 
suite, and reporting back the results (see the sketch after this list).
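
To make the CI idea concrete, here is a rough sketch of what an external test 
driver could look like against the NiFi REST API, assuming a NiFi instance 
already running in Docker, a process group containing the flow under test, and 
a connection feeding the "Failed" location from the workaround above.  The IDs 
are placeholders and the endpoint paths/field names should be checked against 
the REST API docs, so treat this as an assumption-laden outline rather than a 
working harness:

```python
import time
import requests

NIFI_API = "http://localhost:8080/nifi-api"  # NiFi started via Docker
TEST_PG_ID = "<process-group-id>"            # process group under test (placeholder)
FAILED_CONNECTION_ID = "<connection-id>"     # connection into the "Failed" location (placeholder)

def start_process_group(pg_id):
    """Schedule all components in the process group to RUNNING."""
    resp = requests.put(
        "{}/flow/process-groups/{}".format(NIFI_API, pg_id),
        json={"id": pg_id, "state": "RUNNING"},
    )
    resp.raise_for_status()

def failed_test_count(connection_id):
    """Return how many FlowFiles are queued on the 'Failed' connection."""
    resp = requests.get("{}/connections/{}".format(NIFI_API, connection_id))
    resp.raise_for_status()
    return resp.json()["status"]["aggregateSnapshot"]["flowFilesQueued"]

if __name__ == "__main__":
    start_process_group(TEST_PG_ID)
    time.sleep(60)  # crude wait for test FlowFiles to work through the flow
    failures = failed_test_count(FAILED_CONNECTION_ID)
    print("failed test FlowFiles: {}".format(failures))
    raise SystemExit(1 if failures > 0 else 0)
```

A CI job could run this after standing up the Docker container and loading the 
flow (e.g. from the NiFi Registry), and fail the build when any test FlowFiles 
end up in the "Failed" queue.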

 


