Re: Apache Nifi - Splitting input and distributing processing to multiple nodes in a Nifi cluster

M Singh Fri, 15 Jul 2016 11:06:10 -0700

Thanks Bryan.  I will check it. 

    On Friday, July 15, 2016 9:49 AM, Bryan Bende <[email protected]> wrote:


 Hi Mans,
Not sure if this is what you are referring to, but there is a diagram in this 
article that shows how this would work for fetching from HDFS in parallel:
https://community.hortonworks.com/articles/16120/how-do-i-distribute-data-across-a-nifi-cluster.html

It shows the logical flow rather than how to configure it step by step 
in NiFi.
-Bryan
On Fri, Jul 15, 2016 at 12:42 PM, M Singh <[email protected]> wrote:

Hi Joe:
Thanks for the info.  
I believe one of the Nifi team members had a webinar/presentation on it or 
something very similar.  If you have a reference for that, please let me know.
Thanks again for your help. 

    On Friday, July 15, 2016 6:37 AM, Joe Witt <[email protected]> wrote:
 

 Mans,

The general pattern for something like this that works well is:
 - Capture
 - Split
 - Site-to-Site transfer back to the same cluster, which distributes the
partitioned/split data to all nodes
 - Do work on smaller chunks

For example, we often do exactly this sort of thing for larger-scale geo
enrichment:
- Receive large batch of events on a given system (in a line oriented
event model)
- Run SplitText to break out each event
- Use site-to-site to distribute them to the entire cluster
- On each node, receive the split events, then run geo enrichment
- Then send to Kafka as-is, or aggregate and send to HDFS
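
As a rough illustration of the steps above, here is a minimal Python sketch
that simulates the pattern outside of NiFi. It is not NiFi configuration and
does not use any NiFi API. The function names, the node names, and the
round-robin assignment are all assumptions made for the example; in a real
flow, site-to-site decides how the splits are spread across the nodes, and
enrich() is only a placeholder for the geo enrichment step.

```python
from itertools import cycle


def split_events(batch):
    """Break a line-oriented batch into one event per line (the SplitText step)."""
    return [line for line in batch.splitlines() if line.strip()]


def distribute(events, nodes):
    """Assign events to nodes round-robin.

    Assumption: round-robin is only a stand-in for site-to-site
    distribution, which NiFi handles itself.
    """
    assignment = {node: [] for node in nodes}
    for node, event in zip(cycle(nodes), events):
        assignment[node].append(event)
    return assignment


def enrich(event):
    """Hypothetical per-event work, standing in for geo enrichment."""
    return event + ",enriched"


# Hypothetical batch of five events and a three-node cluster.
batch = "e1\ne2\ne3\ne4\ne5\n"
nodes = ["node-1", "node-2", "node-3"]

per_node = distribute(split_events(batch), nodes)

# Each node works only on the smaller chunk it received.
results = {node: [enrich(e) for e in chunk] for node, chunk in per_node.items()}
```

The point of the sketch is the ordering: split first, distribute second, and
only then do the expensive work, so that each node sees a small share of the
original batch.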

Does that make sense/help for your scenario?

Thanks
Joe


On Fri, Jul 15, 2016 at 9:09 AM, M Singh <[email protected]> wrote:
> Hey Folks:
>
> I am looking for information on how to split/partition input in a generic
> way (say rows in a relational database, or lines in a file) and then process
> each split on a different node in parallel in a Nifi cluster.  I believe
> there is a webinar from the Nifi team on this but am not able to find it
> now.
>
> If someone has documentation on this or a link to the webinar, please let me
> know.
>
> Thanks
>
> Mans
