Hi Dan,

This is definitely a use case that NiFi can handle.

A possible architecture for your scenario would be something like the following...

- Run NiFi instances on the machines where you need to collect logs, these
  would not be clustered, just stand-alone instances.
- These would pick up your log files using List/FetchFile, or TailFile, and
  send them to a central NiFi using Site-to-Site [1]
- The central NiFi would be receiving the data from all the machines and
  making the routing decisions as to which Azure hub to send to.
- Depending on your data volume, the central NiFi could be a cluster of a few
  nodes, or for a lower volume it could be a stand-alone instance.

Let us know if you have any questions.

-Bryan

[1] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#site-to-site

On Wed, Dec 23, 2015 at 10:52 AM, Dan <[email protected]> wrote:

> I've recently found NiFi and have been playing around with it locally for
> a day or so to assess whether it would be a good fit for the following use
> case:
>
> 1. I'm tasked with gathering log files from 100s of machines from a
> predetermined directory structure local to the machine (e.g. /log/appname/
> or c:\log\appname) which may be Linux or Windows
> 2. File names include date (e.g. appname_20151223.log)
> 3. The log file is structured as JSON - each line of the file is a JSON
> object
> 4. The JSON object in each file includes data that determines where to
> route the message
> 5. Each message should be routed to one of several Azure Event Hubs based
> on #4
>
> Would I set up a single NiFi cluster to do this, or would I set up what
> would essentially be 100 NiFi clusters if I have 100 machines from which I
> want to gather logs from their local /log/appname directory?
>
> Thanks - this looks like a very well thought out project!
>
> Best
> Dan
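
For illustration only, the routing decision the central NiFi would make in the reply above can be sketched in plain Python. This is not NiFi configuration and not a NiFi API; in NiFi the same logic would be built from processors in the flow. The field name `appname`, the hub names, and the `route_lines` helper are all hypothetical, chosen only to show the idea of reading one JSON object per line and picking a destination from a value inside it.

```python
import json
from collections import defaultdict

# Hypothetical mapping from a value in each JSON record to an Event Hub name.
# Neither the keys nor the hub names come from the original thread.
HUB_BY_VALUE = {"billing": "hub-billing", "auth": "hub-auth"}
DEFAULT_HUB = "hub-unmatched"


def route_lines(lines, field="appname"):
    """Group JSON-per-line log records by the hub they should be sent to.

    `field` is an assumed name for the attribute that decides the route.
    Lines that are not valid JSON objects are collected under "invalid".
    """
    routed = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            routed["invalid"].append(line)
            continue
        if not isinstance(record, dict):
            routed["invalid"].append(line)
            continue
        value = record.get(field)
        # Only string values are looked up; anything else goes to the default.
        hub = HUB_BY_VALUE.get(value, DEFAULT_HUB) if isinstance(value, str) else DEFAULT_HUB
        routed[hub].append(record)
    return dict(routed)
```

A caller would pass the lines of one log file (for example, an open file object) and then hand each group to whatever actually delivers to the matching hub.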
