Hi Laurens,

I've never done this, but here are some ideas you could experiment with.

Assuming the logs come from something like an application running on an EC2 instance, there are several ways you could expose them to NiFi without going through CloudWatch Logs, and a number of articles and blog posts [1] describe how. For instance, your app's logging framework may support an appender that can send directly to NiFi, or you could log locally and run a MiNiFi agent with a simple flow that tails the log file and sends the contents to NiFi using the site-to-site protocol. This would have the advantage of attaching provenance metadata to your logs right at the source, in case that is valuable for your use case.

I'm assuming you want to use CloudWatch for some other reason/integration as part of your overall architecture, or that the source is not something you could run a MiNiFi agent on (e.g., another AWS service).

There is a backlogged NiFi JIRA for reading from a Kinesis Stream [2], but until that feature is implemented, you would need something between the Kinesis Stream/Firehose carrying the log data and NiFi. Some ideas:

- Kinesis > S3 > NiFi (as you suggested) could work.
- Kinesis > Redshift > NiFi, using one of NiFi's SQL processors (e.g., QueryDatabaseTable) to get data out of Redshift.
- Kinesis > AWS Lambda > NiFi, using one of NiFi's listener processors (ListenUDP or ListenHTTP) on the NiFi side to create an endpoint an AWS Lambda function could post to.
- Kinesis > [any streaming/messaging framework that NiFi can integrate with, such as Kafka or JMS] > NiFi

The last one is kind of a shot in the dark, but given the number of interoperable libraries and protocols out there, it might be possible if you can find a good third-party tool for getting a data stream out of Kinesis and into a system that natively speaks a messaging protocol NiFi supports.

I'd be interested to hear if you come up with an approach that works well for your use case.

Hope this helps!
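
P.S. To make the Kinesis > AWS Lambda > NiFi idea a bit more concrete, here is a rough sketch in Python of what the Lambda side might look like. It is only an illustration under several assumptions of mine: that a CloudWatch Logs subscription filter is delivering to the Kinesis stream (in which case each record is gzip-compressed JSON with a "logEvents" list), that the function is triggered by that stream, and that NiFi runs a ListenHTTP processor with its default "contentListener" base path. The NIFI_URL variable, the host name and port, and the function names are all made up for the example.

```python
import base64
import gzip
import json
import os
import urllib.request

# Hypothetical endpoint: a NiFi ListenHTTP processor on an assumed host/port,
# using ListenHTTP's default base path "contentListener".
NIFI_URL = os.environ.get("NIFI_URL", "http://nifi.example.com:8081/contentListener")


def extract_log_lines(event):
    """Return the log messages carried by a Kinesis-triggered Lambda event."""
    lines = []
    for record in event.get("Records", []):
        # Lambda hands over Kinesis record data base64-encoded; CloudWatch Logs
        # subscription payloads are additionally gzip-compressed JSON.
        raw = base64.b64decode(record["kinesis"]["data"])
        payload = json.loads(gzip.decompress(raw))
        if payload.get("messageType") != "DATA_MESSAGE":
            continue  # skip control messages sent by CloudWatch Logs
        for log_event in payload.get("logEvents", []):
            lines.append(log_event["message"])
    return lines


def post_to_nifi(lines, url=NIFI_URL):
    """POST the log lines to NiFi as one newline-delimited text body."""
    body = "\n".join(lines).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "text/plain"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def handler(event, context):
    """Lambda entry point: forward any log lines in this batch to NiFi."""
    lines = extract_log_lines(event)
    if lines:
        post_to_nifi(lines)
    return {"forwarded": len(lines)}
```

Each batch would arrive in NiFi as a single flowfile with one log message per line, which you could then split or route as needed. Retries, authentication, and TLS are left out and would need thought before using anything like this for real.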

Kevin

[1] https://bryanbende.com/development/2015/05/17/collecting-logs-with-apache-nifi
[2] https://issues.apache.org/jira/browse/NIFI-2892

On 3/23/18, 14:15, "Laurens Vets" <[email protected]> wrote:

    Hi list,

    Has anyone tried to set up NiFi to get real-time CloudWatch logs somehow? I can export CloudWatch logs to S3, but it might take up to 12 hours for them to become available. I suspect the only other option is to go through AWS Kinesis Firehose to stream to S3 and have NiFi pick up the logs there?

    Any ideas/comments/suggestions are highly appreciated :)
