Re: Apache Nifi - Splitting input and distributing processing to multiple nodes in a Nifi cluster

Joe Witt Fri, 15 Jul 2016 06:45:55 -0700

Mans,

The general pattern that works well for something like this is:
 - Capture
 - Split
 - Site-to-Site transfer back to the same cluster, which distributes the
   partitioned/split data to all nodes
 - Do the work on the smaller chunks


We often do exactly this sort of thing for larger-scale geo enrichment,
for example:
- Receive a large batch of events on a given system (in a line-oriented
  event model)
- Run SplitText to break out each event
- Use site-to-site to distribute them to the entire cluster
- On each node, receive the split events and run geo enrichment
- Then send to Kafka as-is, or aggregate and send to HDFS
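
To make the shape of that flow concrete, here is a rough sketch in plain
Python. It is not NiFi code and uses no NiFi API: split_events stands in
for SplitText, distribute stands in for the site-to-site hop, and enrich
is a placeholder for the geo enrichment step. All three names, the node
names, and the round-robin assignment are assumptions made for
illustration; how site-to-site actually balances data across nodes may
differ.

```python
from itertools import cycle


def split_events(batch: str) -> list[str]:
    """Break a line-oriented batch into one event per line, skipping blanks."""
    return [line for line in batch.splitlines() if line.strip()]


def distribute(events: list[str], node_names: list[str]) -> dict[str, list[str]]:
    """Assign events to nodes round-robin (an assumed stand-in for site-to-site)."""
    assignments: dict[str, list[str]] = {name: [] for name in node_names}
    for name, event in zip(cycle(node_names), events):
        assignments[name].append(event)
    return assignments


def enrich(event: str) -> str:
    """Hypothetical per-event work; a real flow would do geo enrichment here."""
    return event + ",enriched"


# One large batch arrives on a single node.
batch = "e1\ne2\n\ne3\ne4\ne5\n"
nodes = ["node-1", "node-2", "node-3"]  # hypothetical cluster members

# Split, spread across the cluster, then do the work on each node's share.
assignments = distribute(split_events(batch), nodes)
results = {
    name: [enrich(event) for event in events]
    for name, events in assignments.items()
}
```

The point of the sketch is only the ordering: the split happens where the
batch was received, and the expensive work runs after the events have been
spread out, so each node handles a fraction of the batch.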

Does that make sense/help for your scenario?

Thanks
Joe


On Fri, Jul 15, 2016 at 9:09 AM, M Singh <[email protected]> wrote:
> Hey Folks:
>
> I am looking for information on how to split/partition input in a generic
> way (say rows in a relational database, or lines in a file) and then process
> each split on a different node in parallel in a Nifi cluster.  I believe
> there is a webinar from the Nifi team on this but am not able to find it
> now.
>
> If someone has the documentation on this or a link to the webinar, please
> let me know.
>
> Thanks
>
> Mans
