Hi Laurens,
I've never done this but here are some ideas you could experiment with.
Assuming the logs are coming from something like an application running on an
EC2 instance, there are a number of ways you could probably expose them to NiFi
without going through CloudWatch logs. There are a number of articles and blog
posts [1] that describe how to do this. For instance, your app's logging
framework might support an appender that can send directly to NiFi, or you
could log locally and run a local MiNiFi agent with a simple flow that tails
the log file and sends the contents to NiFi using the site-to-site protocol.
This would have
the advantage of attaching provenance metadata to your logs right at the
source, in case that is valuable for your use case.
I'm assuming you want to use CloudWatch for some other reason/integration as
part of your overall architecture, or that the source is not something you
could run a MiNiFi agent on (e.g., another AWS service). There is a backlogged
NiFi JIRA for reading from a Kinesis Stream [2], but in the absence of that feature
being implemented, you would have to have something between a Kinesis
Stream/Firehose carrying the log data and NiFi. Some ideas include:
- Kinesis > S3 > NiFi (as you suggested) could work
- Kinesis > Redshift > NiFi using one of NiFi's SQL processors (e.g.,
QueryDatabaseTable) to get data out of Redshift.
- Kinesis > AWS Lambda > NiFi using one of NiFi's listener processors
(ListenUDP or ListenHTTP) on the NiFi side to create an endpoint that an AWS Lambda
function could post to.
- Kinesis > [any streaming/messaging framework that NiFi can integrate with,
such as Kafka or JMS] > NiFi
The last one is kind of a shot in the dark, but given the number of libraries
and protocols out there that are interoperable, it might be possible to
leverage one if you can find a good third-party tool for getting a data stream
out of Kinesis into a system that natively interoperates with another messaging
protocol.
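To make the Kinesis > AWS Lambda > NiFi idea a bit more concrete, here is a
rough sketch of what the Lambda side might look like in Python. It assumes the
stream is fed by a CloudWatch Logs subscription, which delivers each record as
base64-encoded, gzip-compressed JSON containing a "logEvents" list. The NiFi
URL, the port, and the names extract_messages and handler are placeholders I
made up for illustration, and "contentListener" is only ListenHTTP's default
base path, so adjust all of them to your setup.

```python
import base64
import gzip
import json
import os
import urllib.request

# Hypothetical endpoint of a NiFi ListenHTTP processor; host and port are placeholders.
NIFI_URL = os.environ.get("NIFI_URL", "http://nifi.example.com:8081/contentListener")


def extract_messages(event):
    """Return the log lines carried by a Kinesis-triggered Lambda event.

    Assumes each record holds a CloudWatch Logs subscription payload.
    """
    messages = []
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        # CloudWatch Logs gzips its payload; check the gzip magic bytes first.
        if raw[:2] == b"\x1f\x8b":
            raw = gzip.decompress(raw)
        payload = json.loads(raw)
        if payload.get("messageType") != "DATA_MESSAGE":
            continue  # skip control messages and unrecognized payloads
        messages.extend(e["message"] for e in payload.get("logEvents", []))
    return messages


def handler(event, context):
    """Lambda entry point: forward the batch to NiFi as one newline-delimited POST."""
    messages = extract_messages(event)
    if messages:
        body = "\n".join(messages).encode("utf-8")
        request = urllib.request.Request(
            NIFI_URL,
            data=body,
            method="POST",
            headers={"Content-Type": "text/plain"},
        )
        # urlopen raises on HTTP errors, so a failed POST fails the invocation.
        with urllib.request.urlopen(request, timeout=10) as response:
            response.read()
    return {"forwarded": len(messages)}
```

Each invocation produces one flow file per batch on the NiFi side, which you
could then break into individual lines with something like SplitText. Because
a failed POST raises, Lambda would retry the batch, so the downstream flow
should tolerate duplicate lines.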
I'd be interested to hear if you come up with an approach that works well for
your use case. Hope this helps!
Kevin
[1] https://bryanbende.com/development/2015/05/17/collecting-logs-with-apache-nifi
[2] https://issues.apache.org/jira/browse/NIFI-2892
On 3/23/18, 14:15, "Laurens Vets" wrote:
Hi list,
Has anyone tried to set up NiFi to get real-time CloudWatch logs somehow?
I can export CloudWatch logs to S3, but it might take up to 12 hours for
them to become available. I suspect the only other option is to go
through AWS Kinesis Firehose to stream to S3 and have NiFi pick up the
logs there?
Any ideas/comments/suggestions are highly appreciated :)