I think that's a good problem to solve, Jon. Handling different types of data hitting the same Kafka topic is a very common problem. We should make this easy to handle. And as Simon mentioned, it solves the problem of ingesting low-volume data streams where the cost of a dedicated topology is overkill.
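To make the idea concrete, here is a minimal sketch of per-device routing for a mixed stream, along the lines of the "preselect a list of parser per device" suggestion quoted below. It is an illustration only, not Metron code: the sensor names, regular expressions, host names, and the `route` function are all hypothetical, and a real splitter would read from and write to Kafka topics rather than return a string.

```python
import re
from typing import Dict, List, Pattern

# Hypothetical sensor types and the patterns that recognize them.
SENSOR_PATTERNS: Dict[str, Pattern[str]] = {
    "sudo": re.compile(r"\bsudo(\[\d+\])?:"),
    "apache": re.compile(r'"(GET|POST|PUT|DELETE|HEAD) \S+ HTTP/\d\.\d"'),
    "mysql": re.compile(r"\bmysqld(\[\d+\])?:"),
}

# Hypothetical per-device preselection: which sensor types each host may send.
DEVICE_SENSORS: Dict[str, List[str]] = {
    "web01": ["apache", "sudo"],
    "db01": ["mysql", "sudo"],
}

# Destination for messages that no candidate pattern recognizes.
FALLBACK = "unrouted"


def route(host: str, message: str) -> str:
    """Return the sensor type (i.e. the target topic) for one message.

    Only the sensor types preselected for the host are tried; hosts
    without an entry fall back to trying every known sensor type.
    """
    candidates = DEVICE_SENSORS.get(host, list(SENSOR_PATTERNS))
    for sensor in candidates:
        if SENSOR_PATTERNS[sensor].search(message):
            return sensor
    return FALLBACK
```

For example, `route("db01", "db01 sudo: alice : TTY=pts/0 ; COMMAND=/bin/ls")` picks the `sudo` sensor, while an Apache-style access line arriving from `db01` lands in the fallback because `apache` is not in that host's preselected list.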
Syslog is a good example use case. Another might be extracting data out of Splunk. I worked at an organization that was using Splunk as the centralized log store to meet regulatory requirements. Of course, Splunk is expensive, so overlaying additional functionality on the existing installation was cost-prohibitive. The only efficient way we could get data out of Splunk was one big pipe containing heterogeneous data. Perhaps there are other ways around it now. I am no Splunk expert, but this seems like a common problem.

On Thu, Oct 6, 2016 at 11:51 AM, [email protected] <[email protected]> wrote:

> A storm splitter gateway topology was another path that I considered,
> especially because it would allow configs like what Yohann mentioned
> earlier with:
>
> > So, it would be really useful that Metron could handle a syslog flow
> > and automatically apply the right parser for each log. In order to
> > help Metron, a config could be provide by the "Security Platform
> > Engineer" to preselect a list of parser per device (as you know what
> > type of logs a device should send). This feature exists in
> > commercial SIEM.
>
> It's just not as easy to get going as an upstream splitter and/or parser in
> my scenario.
>
> Perhaps that should be an enhancement JIRA though? I really think we need
> to lower the barrier to getting logs to Metron in the first place, even
> going as far as having a syslog listener (I looked at embedding rsyslog and
> syslog-ng and they both unfortunately are GPL licensed, so that's out...).
>
> Jon
>
> On Thu, Oct 6, 2016 at 9:58 AM Otto Fowler <[email protected]>
> wrote:
>
> Each of these split things would need to end up in their own topology,
> since they would each have different STELLAR and Enrichment configurations.
>
> It would be simpler I think to split them than to have a topology chain
> that ‘switches’ over a type of field and muddy stellar configs etc.
>
> If that is true, then the question is to split as part of the external
> delivery ( not metron’s problem ) in NiFi or XXXX, or to have a ‘gateway -
> splitter’ topology with only split rules to feed the other typed
> topologies.
>
> Or I’m totally wrong and you can forgive me ;)
>
> O
>
>
> On October 6, 2016 at 08:32:51, [email protected] ([email protected]) wrote:
>
> If we don't do it by device I would be concerned that some more
> appliance-based systems wouldn't allow the flexibility to split things up
> to different destinations, nor would they allow external additions (NiFi,
> etc.). This where I am right now, where I can send from certain appliances
> into my syslog infrastructure, then either force my syslog architecture to
> selectively send onto Metron, or parse and then send into a generic JSON
> parser (I will probably go the latter route). In order to standardize and
> simplify, I would suggest continuing down the device-based route.
>
> Generally, I expect the community to grow and for parsers to just exist,
> and some users to only do minor updates to them or throw together grok
> parsers using GROK_PREDICT() where necessary. In fact I would hope that is
> the case, as it would indicate a broader user base.
>
> Jon
>
> On Thu, Oct 6, 2016 at 8:02 AM Simon Elliston Ball <
> [email protected]> wrote:
>
> > > On 6 Oct 2016, at 12:22, Yohann Lepage <[email protected]> wrote:
> > >
> > > 2016-10-06 12:21 GMT+02:00 [email protected] <[email protected]>:
> > >> I would think that instead we work to make each parser able to handle all
> > >> the known outputs (and document explicitly what outputs per parser are
> > >> supported) from a product and go back to vendor_product, with versions of
> > >> the product supported/tested and version of the parser being stored in code
> > >> and documentation only.
> > > +1
> > >
> >
> > +1 - this is similar to the evolving schema problem, and probably belongs
> > in code.
> >
> > >> I'm currently working on mechanisms to get logs into Metron most
> > >> efficiently because all of my syslog comes in one big pipe.
> > > I have a similar use case. Most of the time, admins are ok to forward
> > > logs from rsyslog/syslog-ng to the SIEM as they don't want to install
> > > an agent ( *.* @@siem.intra:514;).
> > >
> > > The result is that you receive a mix of log
> > > (sudo/apache/mysql/audit/etc) from the same device and the SIEM have
> > > to deals with it.
> > >
> > > So, it would be really useful that Metron could handle a syslog flow
> > > and automatically apply the right parser for each log. In order to
> > > help Metron, a config could be provide by the "Security Platform
> > > Engineer" to preselect a list of parser per device (as you know what
> > > type of logs a device should send). This feature exists in
> > > commercial SIEM.
> > >
> >
> > +1 for this too. One question though, do you think it’s viable to do this
> > by device. I would expect multiple types of syslog coming from the same
> > physical device, especially when dealing with things like server logs.
> >
> > This could be handled with minimal parse and routing in NiFi potentially,
> > but that may make setup more complex than the sort of mapping you’re
> > talking about here. Thoughts?
> >
> > Simon
> > --
> > Jon
> > --
> > Jon
> --
Nick Allen <[email protected]>
