Re: Feeding & Consuming data to & from Nifi
Hey Krish,

Welcome to NiFi! Storm is how I got my start with data flow as well, and NiFi has been awesome. NiFi's "processors" are roughly equivalent to Storm's "spouts" and "bolts".

NiFi can certainly handle the logging/auditing use case, among others. With NiFi's data provenance you can see exactly how each message was handled, inspect the content and attributes at each stage, and replay any message at any stage if you feel the need.

For ingesting into NiFi there are a couple of options, depending on your system. If the files already exist on the same system, there is a simple "GetFile" processor. For other set-ups you can check out the "Get" processors in the documentation [1]. Forwarding to another service again depends on your system. If you set up two NiFi instances, you can check out "Remote Process Group". If you're just "Put"ing it through HTTP, there is a processor for that as well. There are other ways to get the data out, depending on your set-up.

So the configuration all depends on what you want to do and how your system is set up. If you give a bit more information on the specific services you're using or on your use case, we can give more direction.

[1] https://nifi.apache.org/docs.html

Again, welcome to NiFi,
Joe
- - - - - -
Joseph Percivall
linkedin.com/in/Percivall
e: joeperciv...@yahoo.com

On Sunday, September 20, 2015 1:15 AM, Krish wrote:

Hi,

I am a n00b at NiFi although I have worked with Storm; currently that is how we handle data flow logic.

I am evaluating NiFi for a logging/auditing use case (might move to other uses if this works) in a ~50-machine cluster, and am wondering whether it would be a good fit to ingest messages from various sources and produce the graph of how each message was handled at each stage, by each of the services.

Anyway, I was wondering how we feed data into a NiFi cluster to get the first stage started. Also, how does the data exit the system if I want it to be forwarded to another service after NiFi is done with its processing [similar to spouts and bolts in Apache Storm]?

Thanks.
--
κρισhναν
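For the "feeding data in" half of the question, one option not spelled out in the thread is to push events to NiFi over HTTP. The sketch below is only an illustration under assumptions: it supposes a ListenHTTP processor has been added to the flow, and the host name, port 8081 and base path "contentListener" are placeholders to be replaced with whatever your processor is configured with. The helper names are hypothetical as well.

```python
import json
import urllib.request

# Assumptions (not from the thread): a ListenHTTP processor is running on
# host "nifi.example.com", port 8081, with base path "contentListener".
NIFI_URL = "http://nifi.example.com:8081/contentListener"


def build_ingest_request(event: dict, url: str = NIFI_URL) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying one JSON event."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    req = build_ingest_request({"service": "web", "message": "request received"})
    # Only show what would be sent; call send(req) once an endpoint exists.
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the example checkable without a running NiFi instance. Each POST body would arrive in the flow as one FlowFile, from which point the usual processors and provenance tracking apply.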
Re: Feeding & Consuming data to & from Nifi
Krish,

Yes. If the same 'concept of an object' goes through NiFi at two different times, there will be two provenance trails. You can, however, have NiFi index additional attributes by adjusting the provenance indexing properties in nifi.properties. That lets you index whichever attribute holds your correlation identifier. As a result, given one correlation identifier, you can find all the paths that the object took through the system, even across multiple trips.

There is a supported concept called 'associated object identifier', but I am not sure we have the user experience in place for it yet.

We really should document this entire concept, because this is a great question and a common one as folks start to realize the power of provenance. The JIRA for that can be found here [1].

Thanks
Joe

[1] https://issues.apache.org/jira/browse/NIFI-979

On Sun, Sep 20, 2015 at 2:02 PM, Krish wrote:
> Thanks for the pointer, Joe! I will be going through them.
>
> My exact problem is I am trying to think if I can trace a message through
> the system; something like a message entering NiFi, leaving it, and then
> reentering it, & NiFi recognising that this is the same flow that is still
> going through.
> Consider the scenario:
>
> HTTP request coming in through a web server.
> This goes to backend for processing & a copy (or some attribute to ID this
> request) goes to NiFi.
> The backend processes this request & then sends a copy (or the attribute
> identifying the message & its response) to NiFi.
> Web server responds to the request.
> In NiFi, I should be able to trace the request from web-server to backend to
> response.
>
> Is this possible? Can NiFi link the different messages it receives at
> different times based on some unique ID?
>
> --
> κρισhναν
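To make the indexing idea concrete, here is a toy model in Python. It is not NiFi's API or its provenance repository: the `ProvenanceEvent` class, the attribute name `correlation.id` and both helper functions are invented for illustration. In a real installation the indexing is switched on in nifi.properties (to my knowledge via `nifi.provenance.repository.indexed.attributes`, but check the Admin Guide for your version) and the lookup is done through the provenance search UI.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ProvenanceEvent:
    """Toy stand-in for one provenance event (hypothetical, not NiFi's class)."""
    flowfile_uuid: str
    event_type: str
    component: str
    attributes: dict = field(default_factory=dict)


def index_by_attribute(events, attribute):
    """Map each value of `attribute` to the events that carry it."""
    index = defaultdict(list)
    for event in events:
        value = event.attributes.get(attribute)
        if value is not None:
            index[value].append(event)
    return index


def trails_for(index, correlation_id):
    """Group the events for one correlation id into per-FlowFile trails."""
    trails = defaultdict(list)
    for event in index.get(correlation_id, []):
        trails[event.flowfile_uuid].append(event)
    return dict(trails)


# Two separate trips through the flow that share one correlation id.
events = [
    ProvenanceEvent("ff-1", "RECEIVE", "ListenHTTP", {"correlation.id": "req-42"}),
    ProvenanceEvent("ff-1", "SEND", "PostHTTP", {"correlation.id": "req-42"}),
    ProvenanceEvent("ff-2", "RECEIVE", "ListenHTTP", {"correlation.id": "req-42"}),
    ProvenanceEvent("ff-3", "RECEIVE", "ListenHTTP", {"correlation.id": "req-99"}),
]

index = index_by_attribute(events, "correlation.id")
trails = trails_for(index, "req-42")
```

Here `trails` holds two separate trails, one per FlowFile, which mirrors the point above: each trip keeps its own provenance trail, and the indexed attribute is what lets you pull them up together.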
Re: Feeding & Consuming data to & from Nifi
Thanks for the pointer, Joe! I will be going through them.

My exact problem is that I am trying to work out whether I can trace a message through the system: a message entering NiFi, leaving it, and then re-entering it, with NiFi recognising that this is the same flow still going through.

Consider the scenario:

1. An HTTP request comes in through a web server.
2. It goes to the backend for processing, and a copy (or some attribute to ID this request) goes to NiFi.
3. The backend processes the request and then sends a copy (or the attribute identifying the message and its response) to NiFi.
4. The web server responds to the request.
5. In NiFi, I should be able to trace the request from web server to backend to response.

Is this possible? Can NiFi link the different messages it receives at different times based on some unique ID?

--
κρισhναν
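The scenario above only works if both copies sent to NiFi carry the same identifier, so the ID has to be created once, where the request first arrives, and passed along. A minimal sketch of that sender side follows. Everything in it is an assumption for illustration: the header name `X-Correlation-Id`, the function names, and the upper-casing that stands in for real backend processing.

```python
import uuid

# Hypothetical header name; any name works as long as every sender agrees
# on it and NiFi is set up to keep it as a FlowFile attribute.
CORRELATION_HEADER = "X-Correlation-Id"


def new_correlation_id() -> str:
    """Create one identifier per incoming request."""
    return str(uuid.uuid4())


def audit_copy(stage: str, correlation_id: str, payload: str) -> dict:
    """Describe the copy that a service would send to NiFi for one stage."""
    return {
        "headers": {CORRELATION_HEADER: correlation_id},
        "body": {"stage": stage, "payload": payload},
    }


def handle_request(payload: str) -> list:
    """Walk through steps 1-4 and collect the copies destined for NiFi."""
    correlation_id = new_correlation_id()                             # step 1
    copies = [audit_copy("web-server", correlation_id, payload)]      # step 2
    response = payload.upper()  # stand-in for real backend processing
    copies.append(audit_copy("backend", correlation_id, response))    # step 3
    return copies                                                     # step 4 happens elsewhere


copies = handle_request("hello")
```

Both entries in `copies` share one header value, which is the attribute NiFi would need to index for step 5 to be possible.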