Re: Feeding & Consuming data to & from Nifi

2015-09-20 Thread Joe Percivall
Hey Krish,

Welcome to NiFi! Storm is how I got my start with data flow as well and NiFi 
has been awesome. NiFi's "processors" are equivalent to Storm's "spouts" and 
"bolts".

NiFi can certainly handle the logging/auditing use case among others. With 
NiFi's data provenance you can see exactly how each message was handled, the 
content and attributes at each stage, as well as replay any message at any 
stage if you feel the need.

For ingesting into NiFi there are a couple of options depending on your system. 
If the files already exist on the same system, there is a simple "GetFile" 
processor. For other set-ups you can check out the "Get" processors in the 
documentation [1]. Forwarding to another service again depends on your 
system. If you set up two NiFi instances you can check out "Remote Process 
Group". If you're just "Put"ing it through HTTP there is a processor for that 
as well. Again, there are other ways to exfiltrate the data depending on your 
set-up.
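
As a rough illustration of the HTTP route, here is a minimal Python sketch of
how a service might hand a message to NiFi. It assumes a ListenHTTP processor
is running in the flow. The host name, port, and X-Correlation-Id header are
hypothetical, and "contentListener" is, to my knowledge, only that processor's
default base path. The request is built but not sent, so the sketch runs
without a NiFi instance.

```python
import urllib.request

# Hypothetical endpoint: host and port are placeholders; "contentListener" is
# assumed to be the base path configured on a ListenHTTP processor.
NIFI_URL = "http://nifi.example.com:8081/contentListener"


def build_ingest_request(payload: bytes, correlation_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST carrying one message to NiFi."""
    return urllib.request.Request(
        NIFI_URL,
        data=payload,
        headers={
            "Content-Type": "application/octet-stream",
            # Hypothetical header; NiFi only keeps it as a FlowFile attribute
            # if the listening processor is configured to do so.
            "X-Correlation-Id": correlation_id,
        },
        method="POST",
    )


request = build_ingest_request(b"GET /index.html 200", "req-42")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request, timeout=5)
```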

So for the configuration it all depends on what you want to do and how your 
system is set up. If you give a bit more information on the specific services 
you're using or your use case, we can give even more direction.

[1] https://nifi.apache.org/docs.html

Again welcome to NiFi,
Joe
- - - - - -
Joseph Percivall
linkedin.com/in/Percivall
e: joeperciv...@yahoo.com

On Sunday, September 20, 2015 1:15 AM, Krish wrote:

Hi,

I am a n00b at NiFi although I have worked with Storm; currently that is how we 
handle data flow logic.

I am evaluating NiFi for a logging/auditing use case (might move to other 
uses if this works) in a ~50 machine cluster, and am wondering whether it would 
be a good fit to ingest messages from various sources and spit out the graph of 
how a message was handled at each stage, by each of the services.

Anyway, I was wondering how we feed data into a NiFi cluster to get the 
first stage started. Also, how does the data exit the system, if I want it to 
be forwarded to another service after NiFi is done with its processing [similar 
to spouts and bolts in Apache Storm]?

Thanks.
--
κρισhναν

  

Re: Feeding & Consuming data to & from Nifi

2015-09-20 Thread Joe Witt
Krish,

Yes.  If the same 'concept of an object' goes through NiFi at two
different times then there will be two provenance trails.  You can,
though, have NiFi index additional attributes by adjusting the
appropriate properties for provenance indexing in nifi.properties.
With this you can index whichever attribute holds your correlation
identifier.  As a result, given one correlation identifier, you can
find all paths that the object took through the system, even across
multiple trips.
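
To make the idea concrete, here is a small Python sketch of what searching on
an indexed correlation attribute buys you. It does not call NiFi: the event
records and the attribute name "correlation.id" are made up for illustration.
In nifi.properties, the setting to look at is, I believe,
nifi.provenance.repository.indexed.attributes, but check that against your
version.

```python
from collections import defaultdict

# Hypothetical provenance-like events. The field names and the
# "correlation.id" attribute are illustrative, not NiFi's actual schema.
events = [
    {"time": 5, "component": "receive backend copy", "attributes": {"correlation.id": "req-42"}},
    {"time": 1, "component": "receive web-server copy", "attributes": {"correlation.id": "req-42"}},
    {"time": 2, "component": "receive web-server copy", "attributes": {"correlation.id": "req-7"}},
    {"time": 3, "component": "archive", "attributes": {"correlation.id": "req-42"}},
    {"time": 4, "component": "heartbeat", "attributes": {}},
]


def trails_by_correlation(events, attribute="correlation.id"):
    """Group events by correlation attribute, ordered by event time."""
    trails = defaultdict(list)
    for event in sorted(events, key=lambda e: e["time"]):
        key = event["attributes"].get(attribute)
        if key is not None:  # events without the attribute cannot be linked
            trails[key].append(event["component"])
    return dict(trails)


trails = trails_by_correlation(events)
for correlation_id, path in trails.items():
    print(correlation_id, "->", " | ".join(path))
```

Each separate trip through NiFi still has its own trail; the shared attribute
is only what lets a single search pull them together.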

There is a supported concept called 'associated object identifier' but
I am not sure we have the user experience in place for that yet.
We really should document this entire concept because this is a great
and common question as folks start to realize the power of
provenance.  The JIRA for that can be found here [1].

Thanks
Joe

[1] https://issues.apache.org/jira/browse/NIFI-979

On Sun, Sep 20, 2015 at 2:02 PM, Krish  wrote:
> Thanks for the pointer, Joe! I will be going through them.
>
> My exact problem is working out whether I can trace a message through
> the system; something like a message entering NiFi, leaving it, and then
> reentering it, with NiFi recognising that this is the same flow that is
> still going through.
> Consider the scenario:
>
> HTTP request coming in through a web server.
> This goes to backend for processing & a copy (or some attribute to ID this
> request) goes to NiFi.
> The backend processes this request & then sends a copy (or the attribute
> identifying the message & its response) to NiFi.
> Web server responds to the request.
> In NiFi, I should be able to trace the request from web-server to backend to
> response.
>
> Is this possible? Can NiFi link the different messages it receives at
> different times based on some unique ID?
>
>
>
>
> --
> κρισhναν
>


Re: Feeding & Consuming data to & from Nifi

2015-09-20 Thread Krish
Thanks for the pointer, Joe! I will be going through them.

My exact problem is working out whether I can trace a message through
the system; something like a message entering NiFi, leaving it, and then
reentering it, with NiFi recognising that this is the same flow that is
still going through.
Consider the scenario:

   1. HTTP request coming in through a web server.
   2. This goes to backend for processing & a copy (or some attribute to ID
   this request) goes to NiFi.
   3. The backend processes this request & then sends a copy (or the
   attribute identifying the message & its response) to NiFi.
   4. Web server responds to the request.
   5. In NiFi, I should be able to trace the request from web-server to
   backend to response.

Is this possible? Can NiFi link the different messages it receives at
different times based on some unique ID?




--
κρισhναν
