Mark Payne created NIFI-9691:
--------------------------------
Summary: Create processors for joining record-oriented data with
enrichment data
Key: NIFI-9691
URL: https://issues.apache.org/jira/browse/NIFI-9691
Project: Apache NiFi
Issue Type: New Feature
Components: Extensions
Reporter: Mark Payne
Assignee: Mark Payne
A powerful capability of NiFi is its ability to perform enrichment on data as
it flows through the system. We have processors such as LookupRecord and
GeoEnrichRecord. However, there are cases where these don't really provide the
necessary capabilities for performing enrichment.
A particularly powerful use case is when we have data that we want to enrich by
calling out to some web service. In this case, we don't want to send our
payload to the web service. Instead, we want to transform our payload into a
request that is reasonable to send to a web service. Then, we want to take the
result from that web service call and use it to enrich our original payload.
NiFi does not currently offer a convenient mechanism for doing this.
We should add two additional processors: ForkEnrichment and JoinEnrichment.
ForkEnrichment would be used to create a clone of the incoming FlowFile,
assigning relevant attributes to the original and the clone and then sending
each to a different relationship ("original" and "enrichment").
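To illustrate the fork step, here is a minimal Python sketch. It is not NiFi
code: the FlowFile class is a hypothetical stand-in, and the attribute names
("enrichment.group.id", "enrichment.role") are invented for the example. The
only idea it tries to show is that both copies carry a shared identifier so
that they can be paired up again later.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Hypothetical stand-in for a NiFi FlowFile: content plus attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def fork_enrichment(flowfile):
    """Return {relationship name: FlowFile} for the original and its clone."""
    # Shared id so that a later join step can pair the two FlowFiles.
    group_id = str(uuid.uuid4())

    # Copy the attributes so the incoming FlowFile is left untouched.
    original = FlowFile(flowfile.content, dict(flowfile.attributes))
    clone = FlowFile(flowfile.content, dict(flowfile.attributes))

    # Attribute names are assumptions made for this sketch only.
    original.attributes.update(
        {"enrichment.group.id": group_id, "enrichment.role": "ORIGINAL"})
    clone.attributes.update(
        {"enrichment.group.id": group_id, "enrichment.role": "ENRICHMENT"})

    return {"original": original, "enrichment": clone}
```

For example, fork_enrichment(FlowFile(b'[{"id": 1}]')) yields two FlowFiles
with identical content and the same group id but different roles.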
Data sent to the "enrichment" connection can then be transformed into whatever
is necessary to send as a request to the web service. The result would then be
fed to the JoinEnrichment processor.
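As a rough Python sketch of the enrichment branch: the full records are
reduced to a small request, and the service's response becomes the enrichment
records. Everything here is hypothetical (the "ip" field, the request shape,
and fake_geo_service, which merely stands in for a real web service). The
sketch assumes the service returns one entry per requested item, in order.

```python
def build_request(records):
    """Keep only the field the (hypothetical) service needs, in record order."""
    return {"ips": [record["ip"] for record in records]}


def fake_geo_service(request):
    """Stand-in for a web service: one response entry per requested IP."""
    known = {"10.0.0.1": "US", "10.0.0.2": "DE"}
    return [{"ip": ip, "country": known.get(ip)} for ip in request["ips"]]


original_records = [
    {"user": "alice", "ip": "10.0.0.1", "payload": "large body 1"},
    {"user": "bob", "ip": "10.0.0.2", "payload": "large body 2"},
]

# The large payload never leaves the flow; only the IPs are sent.
enrichment_records = fake_geo_service(build_request(original_records))
```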
JoinEnrichment should then take input from the "original" connection and the
"enrichment" connection and join the records together.
I can foresee three ways to join the Records:
* Correlating the records by their index in the FlowFile with a wrapper (i.e.,
there's a "wrapper" element that encapsulates the first record from the
"original" FlowFile and the first record from the "enrichment" FlowFile, and
likewise for each subsequent pair).
* Correlating the records by their index in the FlowFile and inserting the
enrichment record into the original payload.
* Using SQL with a JOIN clause to join the records based on some field within
the data.
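The three strategies can be sketched in Python as follows. This is only an
illustration of the join semantics, not a proposed implementation: the
function names, the sample fields ("id", "name", "country"), the "enrichment"
field name used for insertion, and the use of an in-memory SQLite database for
the SQL case are all assumptions made for the example.

```python
import sqlite3

original = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
enrichment = [{"id": 1, "country": "US"}, {"id": 2, "country": "DE"}]


def join_wrapper(original, enrichment):
    """Pair records by index inside a wrapper element."""
    # zip() stops at the shorter input, so unmatched trailing records drop.
    return [{"original": o, "enrichment": e}
            for o, e in zip(original, enrichment)]


def join_insert(original, enrichment, field_name="enrichment"):
    """Pair records by index and nest the enrichment record in the original."""
    return [{**o, field_name: e} for o, e in zip(original, enrichment)]


def join_sql(original, enrichment):
    """Join on a shared field with SQL; record order no longer matters."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE original (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE enrichment (id INTEGER, country TEXT)")
    conn.executemany("INSERT INTO original VALUES (:id, :name)", original)
    conn.executemany(
        "INSERT INTO enrichment VALUES (:id, :country)", enrichment)
    rows = conn.execute(
        "SELECT o.id, o.name, e.country FROM original o "
        "JOIN enrichment e ON o.id = e.id ORDER BY o.id"
    ).fetchall()
    conn.close()
    return [dict(zip(("id", "name", "country"), row)) for row in rows]
```

The index-based variants rely on the enrichment records arriving in the same
order as the originals, while the SQL variant only needs a common key.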
--
This message was sent by Atlassian Jira
(v8.20.1#820001)