Re: Use Case Validation

2015-12-23 Thread Bryan Bende
Hi Dan,

This is definitely a use case that NiFi can handle.

A possible architecture for your scenario would look something like this:
- Run a NiFi instance on each machine where you need to collect logs. These
would be stand-alone instances, not clustered.
- Each instance would pick up your log files using ListFile/FetchFile or
TailFile, and send them to a central NiFi using Site-to-Site [1].
- The central NiFi would receive the data from all the machines and make the
routing decisions as to which Azure Event Hub to send to.
- Depending on your data volume, the central NiFi could be a cluster of a
few nodes or, for lower volume, a stand-alone instance.
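
To make the routing step on the central NiFi concrete, here is a minimal
Python sketch of the decision it would be making. This is not NiFi code; in
NiFi the same logic would be built from processors in the flow. The field
name "app", the hub names, and the "invalid" bucket are all hypothetical
placeholders, since the thread does not say which JSON field drives routing.

```python
import json
from collections import defaultdict

# Hypothetical mapping from a routing field value to an Event Hub name.
HUB_BY_APP = {"billing": "hub-billing", "search": "hub-search"}
DEFAULT_HUB = "hub-unmatched"


def route_lines(lines, field="app"):
    """Group JSON log lines by destination hub, based on one field."""
    routed = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            routed["invalid"].append(line)  # keep unparseable lines aside
            continue
        # Only JSON objects with a string routing value can match a hub.
        key = record.get(field) if isinstance(record, dict) else None
        hub = HUB_BY_APP.get(key, DEFAULT_HUB) if isinstance(key, str) else DEFAULT_HUB
        routed[hub].append(record)
    return dict(routed)


if __name__ == "__main__":
    sample = ['{"app": "billing", "msg": "paid"}', '{"app": "other"}']
    for hub, records in route_lines(sample).items():
        print(hub, len(records))
```

Each line of the log file is treated as one message, matching the
one-JSON-object-per-line layout described in the original question.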

Let us know if you have any questions.

-Bryan

[1] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#site-to-site


On Wed, Dec 23, 2015 at 10:52 AM, Dan wrote:

> I've recently found NiFi and have been playing around with it locally for
> a day or so to assess whether it would be a good fit for the following use
> case:
>
> 1. I'm tasked with gathering log files from 100s of machines, which may be
> Linux or Windows, from a predetermined directory structure local to each
> machine (e.g. /log/appname/ or c:\log\appname)
> 2. File names include date (e.g. appname_20151223.log)
> 3. The log file is structured as JSON - each line of the file is a JSON
> object
> 4. The JSON object in each file includes data that determines where to
> route the message
> 5. Each message should be routed to one of several Azure Event Hubs based
> on #4
>
> Would I set up a single NiFi cluster to do this, or would I set up what
> would essentially be 100 NiFi clusters if I have 100 machines and want to
> gather logs from each machine's local /log/appname directory?
>
> Thanks - this looks like a very well thought out project!
>
> Best
> Dan
>


Re: Use Case Validation

2015-12-23 Thread Bryan Bende
Dan,

A stand-alone instance is the default behavior. If you extract a NiFi
distribution and run "bin/nifi.sh start" without changing any of the
clustering-related properties, you get a single instance running on
port 8080.
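
As a quick way to check this, here is a small Python sketch that reads the
text of a nifi.properties file and reports whether the instance would run
stand-alone. The two cluster flags are the ones named in the question below;
nifi.web.http.port and the sample file contents are assumptions made for
illustration, not a copy of the file shipped with any particular release.

```python
def parse_properties(text):
    """Parse simple key=value lines, ignoring blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' at all
            props[key.strip()] = value.strip()
    return props


def is_standalone(props):
    """True when neither cluster flag is switched on."""
    flags = ("nifi.cluster.is.node", "nifi.cluster.is.manager")
    return not any(props.get(flag, "false").lower() == "true" for flag in flags)


# Illustrative sample only; a real nifi.properties has many more entries.
SAMPLE = """\
# clustering properties left untouched
nifi.web.http.port=8080
nifi.cluster.is.node=false
nifi.cluster.is.manager=false
"""

if __name__ == "__main__":
    props = parse_properties(SAMPLE)
    print("stand-alone:", is_standalone(props))
    print("http port:", props.get("nifi.web.http.port"))
```

To run it against a real install, read conf/nifi.properties from the
extracted distribution and pass its contents to parse_properties.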

My thought behind sending them via Site-to-Site is to have a central
instance/cluster where you can monitor and change the routing part of the
flow. The flow running on the machines where the logs are would likely be a
very simple one that grabs some data and sends it on, so there wouldn't be
as much to see or change there.

-Bryan

On Wed, Dec 23, 2015 at 11:32 AM, Dan <dcies...@hotmail.com> wrote:

> Thanks; is the idea of sending the log file data via Site-to-Site to
> reduce load caused by making the routing decision on the machine containing
> the logs?
>
> Total newb question: How does one create a stand-alone instance? I wound
> up running 2 processes (node and server) as I started poking around. On the
> "server" process, I set nifi.cluster.is.manager=true along with
> n.c.m.address and n.c.m.protocol.port, while on the "node" process, I set
> nifi.cluster.is.node=true along with n.c.node.* and pointed the
> n.c.n.unicast.* stuff over to the manager values. Is there a simpler way?
> Can I do this with a single process running?
>
> Thanks
> Dan
>