Mark Payne created NIFI-9691:
--------------------------------

             Summary: Create processors for joining record-oriented data with 
enrichment data
                 Key: NIFI-9691
                 URL: https://issues.apache.org/jira/browse/NIFI-9691
             Project: Apache NiFi
          Issue Type: New Feature
          Components: Extensions
            Reporter: Mark Payne
            Assignee: Mark Payne


A powerful capability of NiFi is its ability to perform enrichment on data as 
it flows through the system. We have processors such as LookupRecord and 
GeoEnrichRecord. However, there are cases where these do not provide the 
capabilities needed to perform the enrichment.

A particularly powerful use case is when we have data that we want to enrich by 
calling out to some web service. In this case, we don't want to send our 
payload to the web service. Instead, we want to transform our payload into a 
request that is reasonable to send to a web service. Then, we want to take the 
result from that web service call and use it to enrich our original payload. 
NiFi does not currently offer a convenient mechanism for doing this.

We should add two additional processors: ForkEnrichment and JoinEnrichment.

ForkEnrichment would be used to create a clone of the incoming FlowFile, assigning 
relevant attributes to the original and the clone and then sending each to a 
different relationship (original and enrichment).
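
As a rough illustration of the fork step, here is a minimal sketch in Python (not NiFi code). The FlowFile class, the attribute names "enrichment.group.id" and "enrichment.role", and the function name are all hypothetical; the issue does not say which attributes would be assigned.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Hypothetical stand-in for a NiFi FlowFile: content plus string attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def fork_enrichment(flowfile: FlowFile) -> dict:
    """Clone the FlowFile and tag both copies so they can be paired up later.

    Returns a mapping from relationship name to the FlowFile routed there.
    """
    # Assumption: a shared random id is enough to correlate the two copies.
    group_id = str(uuid.uuid4())
    original = FlowFile(
        flowfile.content,
        {**flowfile.attributes,
         "enrichment.group.id": group_id,
         "enrichment.role": "ORIGINAL"},
    )
    enrichment = FlowFile(
        flowfile.content,
        {**flowfile.attributes,
         "enrichment.group.id": group_id,
         "enrichment.role": "ENRICHMENT"},
    )
    return {"original": original, "enrichment": enrichment}


routed = fork_enrichment(FlowFile(b'[{"id": 1}]', {"filename": "in.json"}))
```

Both copies carry the same content and the same group id, so a downstream join could match them no matter how the "enrichment" copy is transformed in between.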

Data sent to the "enrichment" connection can then be transformed into whatever 
is necessary to send as a request to the web service. The result would then be 
fed to the JoinEnrichment processor.
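
To make the transformation step concrete, here is a small Python sketch of reducing the cloned records to a request body. The field name "ip" and the request shape {"queries": [...]} are assumptions made up for illustration; a real web service would dictate its own format.

```python
import json


def to_lookup_request(records, key_field="ip"):
    """Build a request body holding only the lookup key of each record.

    Record order is preserved, so the i-th result returned by the service
    can later be matched with the i-th original record.
    """
    # Assumption: the service accepts a JSON object with a "queries" array.
    return json.dumps({"queries": [record[key_field] for record in records]})


records = [
    {"ip": "10.0.0.1", "user": "alice", "bytes": 512},
    {"ip": "10.0.0.2", "user": "bob", "bytes": 2048},
]
request_body = to_lookup_request(records)
```

Only the lookup keys leave the flow; the rest of the payload stays in the "original" FlowFile.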

JoinEnrichment should then take input from the "original" connection and the 
"enrichment" connection and join the records together.

I can foresee three ways to join together the Records:
 * Correlating the records by their index in the FlowFile with a wrapper (i.e., 
there's a "wrapper" element that encapsulates the first record from the 
"original" FlowFile and the first record from the "enrichment" FlowFile).
 * Correlating the records by their index in the FlowFile and inserting the 
Enrichment record into the original payload.
 * Using SQL with a JOIN clause to join the records based on some field within 
the data.
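
The three strategies can be sketched in Python as follows. This is only an illustration under stated assumptions, not the proposed implementation: records are modelled as plain dicts, the function names, the "enrichment" field name and the sample fields (id, name, city) are hypothetical, and an in-memory SQLite database stands in for whatever SQL engine the processor would use.

```python
import sqlite3


def join_wrapper(originals, enrichments):
    """Strategy 1: pair records by index under a wrapper element."""
    # zip() stops at the shorter input; a real processor would have to
    # decide how to handle FlowFiles with different record counts.
    return [{"original": o, "enrichment": e}
            for o, e in zip(originals, enrichments)]


def join_insert(originals, enrichments, field_name="enrichment"):
    """Strategy 2: pair records by index and insert the enrichment record
    into a copy of the original record under field_name."""
    return [{**o, field_name: e} for o, e in zip(originals, enrichments)]


def join_sql(originals, enrichments):
    """Strategy 3: join on a field in the data (here 'id') using SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE original (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE enrichment (id INTEGER, city TEXT)")
    conn.executemany("INSERT INTO original VALUES (:id, :name)", originals)
    conn.executemany("INSERT INTO enrichment VALUES (:id, :city)", enrichments)
    rows = conn.execute(
        "SELECT o.id, o.name, e.city "
        "FROM original o LEFT OUTER JOIN enrichment e ON o.id = e.id "
        "ORDER BY o.id"
    ).fetchall()
    conn.close()
    return [dict(zip(("id", "name", "city"), row)) for row in rows]


originals = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
enrichments = [{"id": 1, "city": "Paris"}, {"id": 2, "city": "Oslo"}]

wrapped = join_wrapper(originals, enrichments)
inserted = join_insert(originals, enrichments)
# The SQL join does not depend on record order, so reversing is harmless.
joined = join_sql(originals, list(reversed(enrichments)))
```

The index-based strategies assume the web service returns results in the same order as the requests; the SQL strategy trades that assumption for the requirement that both sides share a join field.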

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
