[
https://issues.apache.org/jira/browse/METRON-701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878751#comment-15878751
]
ASF GitHub Bot commented on METRON-701:
---------------------------------------
Github user cestella commented on the issue:
https://github.com/apache/incubator-metron/pull/449
So, I think this is an interesting approach. My issues with it are:
* You seem to be sending every profile into Kafka, not just the configured
ones.
* You seem to be assuming that only one value is sent into the telemetry,
and that it is the same value you store in HBase.
I'd recommend, rather, that you make the `result` field more complex by
making it a map where the key is the destination (e.g. "hbase" or "kafka").
This lets you define the structure of the result separately for each storage
medium. You may, for instance, want to *STORE* a stats object in HBase, but
only send the mean and standard deviation along to Kafka. Also, I'd recommend
allowing `result` to be either a string (which would presume only HBase is
written to) or a Map, which would explicitly specify the structure for just
the destinations you want to write to.
Here's a worked example config for maximum clarity (!):
```
{
  "profiles": [
    {
      "profile": "test",
      "foreach": "'global'",
      "onlyif": "source.type == 'squid'",
      "init": { "stats": "STATS_INIT()" },
      "update": { "stats": "STATS_ADD(stats, LENGTH(url))" },
      "result": {
        "hbase": "stats",
        "kafka": "{ 'mean' : STATS_MEAN(stats), 'stddev' : STATS_SD(stats) }"
      }
    }
  ]
}
```
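To make the "string or Map" idea concrete, here is a rough sketch of how a `result` value could be normalized into a per-destination map before it is evaluated. This is not Metron's implementation; the function name `normalize_result` and the `KNOWN_DESTINATIONS` set are made up for illustration, and only "hbase" and "kafka" are assumed as valid keys because those are the two mentioned above.

```python
from typing import Dict, Union

# Hypothetical set of destinations; only "hbase" and "kafka" are named above.
KNOWN_DESTINATIONS = {"hbase", "kafka"}


def normalize_result(result: Union[str, Dict[str, str]]) -> Dict[str, str]:
    """Return a destination -> expression map for a profile's `result` field."""
    if isinstance(result, str):
        # Shorthand form: a bare expression is written to HBase only.
        return {"hbase": result}
    if isinstance(result, dict):
        # Explicit form: reject destinations this sketch does not know about.
        unknown = set(result) - KNOWN_DESTINATIONS
        if unknown:
            raise ValueError(f"unknown result destination(s): {sorted(unknown)}")
        return dict(result)
    raise TypeError("result must be a string or a map")
```

With something like this, an existing profile that sets `"result": "stats"` would keep writing only to HBase, while the map form in the config above would opt in to Kafka with its own expression.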
> Triage Metrics Produced by the Profiler
> ---------------------------------------
>
> Key: METRON-701
> URL: https://issues.apache.org/jira/browse/METRON-701
> Project: Metron
> Issue Type: Improvement
> Reporter: Nick Allen
> Assignee: Nick Allen
>
> h3. Problem
> The motivating example is that I would like to create an alert if the number
> of inbound flows to any host over a 15 minute interval is abnormal.
> The value being interrogated here, the number of inbound flows, is not a
> static value contained within any single telemetry message. This value is
> calculated across multiple messages by the Profiler. The current Threat
> Triage process cannot be used to interrogate values calculated by the
> Profiler.
> h3. Proposed Solution
> I am proposing that we treat the Profiler as a source of telemetry. The
> measurements captured by the Profiler would be enqueued into a Kafka topic.
> We would then treat those Profiler messages like any other telemetry. We
> would parse, enrich, triage, and index those messages.
> This would have the following advantages.
> 1. We would be able to reuse the same threat triage mechanism for values
> calculated by the Profiler.
> 2. We would be able to generate profiles from the profiled data - aka
> meta-profiles anyone?
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)