Re: Is NiFi processing streamingly?

Joe Percivall Mon, 25 Apr 2016 08:45:38 -0700

Hello Yang,

To understand how data flows through NiFi to the processors, you first need to 
understand FlowFiles. A FlowFile is the unit of data that processors operate 
on: a pointer to content plus a collection of attributes. So, to answer your 
question, each processor acts on the entire FlowFile produced by the previous 
processor.
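
As a rough illustration of the description above, here is a minimal Python 
sketch of a FlowFile as "a pointer to content and a collection of attributes", 
with a processor receiving the whole FlowFile. This is not NiFi's actual API: 
ContentRepository, FlowFile, content_claim and upper_case_processor are names 
made up for this example, and writing the result as new content is an 
assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict


class ContentRepository:
    """Hypothetical stand-in for wherever the content bytes actually live."""

    def __init__(self) -> None:
        self._store: Dict[int, bytes] = {}
        self._next_id = 0

    def write(self, data: bytes) -> int:
        # Store the bytes and hand back a pointer (claim id) to them.
        claim_id = self._next_id
        self._store[claim_id] = data
        self._next_id += 1
        return claim_id

    def read(self, claim_id: int) -> bytes:
        return self._store[claim_id]


@dataclass(frozen=True)
class FlowFile:
    # Pointer to the content, not the content itself.
    content_claim: int
    # Key/value metadata that travels with the FlowFile.
    attributes: Dict[str, str] = field(default_factory=dict)


def upper_case_processor(flow_file: FlowFile, repo: ContentRepository) -> FlowFile:
    """Hypothetical processor: it is handed one whole FlowFile at a time."""
    data = repo.read(flow_file.content_claim)
    new_claim = repo.write(data.upper())
    new_attrs = dict(flow_file.attributes, processed_by="upper_case_processor")
    return FlowFile(new_claim, new_attrs)


repo = ContentRepository()
incoming = FlowFile(repo.write(b"hello nifi"), {"filename": "greeting.txt"})
outgoing = upper_case_processor(incoming, repo)
```

Here the next processor in the chain would receive `outgoing` as a whole, 
including the attributes carried over from `incoming`.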


For clustering, the flow is replicated to each node of the cluster. This means 
each node in the cluster has a copy of the flow, which it uses to process all 
data sent to it (except for processors marked as "primary node" only, but 
that's a bit more advanced). In other words, clustering works at the dataflow 
level rather than the processor level.
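
To make the replication idea concrete, here is a toy Python sketch in which 
every node holds its own copy of the same flow and runs it only on the data 
that node receives. The names Node, build_flow and receive are hypothetical, 
the flow is reduced to a list of string functions, and "primary node" only 
processors are left out entirely.

```python
from typing import Callable, List

# A "flow" is modelled here as an ordered list of processing steps.
Flow = List[Callable[[str], str]]


def build_flow() -> Flow:
    # Hypothetical two-step flow: trim whitespace, then upper-case.
    return [str.strip, str.upper]


class Node:
    """Hypothetical cluster node holding its own copy of the flow."""

    def __init__(self, name: str, flow: Flow) -> None:
        self.name = name
        self.flow = list(flow)  # each node gets its own copy
        self.processed: List[str] = []

    def receive(self, record: str) -> None:
        # The node runs the full flow on whatever data is sent to it.
        for step in self.flow:
            record = step(record)
        self.processed.append(record)


flow = build_flow()
nodes = [Node("node-1", flow), Node("node-2", flow)]

# Each record is handled end to end by the node that received it.
nodes[0].receive("  alpha ")
nodes[1].receive(" beta  ")
```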

Also, for a better-worded, more in-depth look at NiFi, I would suggest checking 
out the PR for the "NiFi In Depth" doc [1]. It should help answer many of the 
questions you may have about the internals of NiFi. Any comments on it are 
much appreciated.
 
[1] https://github.com/apache/nifi/pull/339#discussion_r60103526

Joe

- - - - - - Joseph Percivall
linkedin.com/in/Percivall
e: [email protected]




On Monday, April 25, 2016 11:21 AM, Yuanzhe Yang (杨远哲) <[email protected]> 
wrote:
Hi,

I have read some documentation about NiFi, but I still don't have a clear 
picture of how data flows inside NiFi. Is it processed in a streaming fashion? 
Or does a processor get the entire intermediate result produced by the previous 
processor? Moreover, what is the granularity of clustering? Is it at the 
dataflow level or the processor level?

Thank you very much for the clarification; your work is very much 
appreciated.

Regards,
Yang
