[
https://issues.apache.org/jira/browse/METRON-701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15860186#comment-15860186
]
ASF GitHub Bot commented on METRON-701:
---------------------------------------
Github user nickwallen commented on a diff in the pull request:
https://github.com/apache/incubator-metron/pull/449#discussion_r100408667
--- Diff:
metron-analytics/metron-profiler/src/main/java/org/apache/metron/profiler/bolt/KafkaDestinationHandler.java
---
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.metron.profiler.bolt;
+
+import org.apache.metron.common.utils.JSONUtils;
+import org.apache.metron.profiler.ProfileMeasurement;
+import org.apache.storm.task.OutputCollector;
+import org.apache.storm.topology.OutputFieldsDeclarer;
+import org.apache.storm.tuple.Fields;
+import org.apache.storm.tuple.Values;
+import org.json.simple.JSONObject;
+
+import java.io.Serializable;
+
+/**
+ * Handles emitting a ProfileMeasurement to the stream which writes
+ * profile measurements to Kafka.
+ */
+public class KafkaDestinationHandler implements DestinationHandler, Serializable {
+
+ /**
+ * The stream identifier used for this destination.
+ */
+ private String streamId = "kafka";
+
+ @Override
+ public void declareOutputFields(OutputFieldsDeclarer declarer) {
+ // the kafka writer expects a field named 'message'
+ declarer.declareStream(getStreamId(), new Fields("message"));
+ }
+
+ @Override
+ public void emit(ProfileMeasurement measurement, OutputCollector collector) {
+
+ try {
+ JSONObject message = new JSONObject();
+ message.put("profile", measurement.getDefinition().getProfile());
+ message.put("entity", measurement.getEntity());
+ message.put("period", measurement.getPeriod().getPeriod());
+ message.put("periodStartTime", measurement.getPeriod().getStartTimeMillis());
+
+ // TODO How to serialize an object (like a StatisticsProvider) in a form that can be used on the other side? (Threat Triage)
+ // TODO How to embed binary in JSON?
+ message.put("value", measurement.getValue());
+
--- End diff --
Everything works fine if your profile produces a value that is easy to
serialize, like a number. If all you want to do is triage profiles that
produce numbers, then this code works. But how should this work if your
profile produces a complex object, as is done with the STATS package?
The Profiler has to serialize everything that the profiles produce.
Serializing objects into HBase works fine. But how should we serialize a
complex object when sending it to Kafka?
A few options follow. What else might work?
* Sticking with the current architecture, I would have to embed binary data
in the JSON message that is produced, which I don't think is possible, or at
least is not ideal.
* I could write only the binary data to a Kafka topic, but then how do we
pass the meta-information like the profile and entity name? Do I create a
wrapper object that contains the name, profile and value, then serialize all
of that and write it to a topic? That would be drastically different from
what we do today.
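For the first option, one commonly used way to carry binary data inside a
JSON string field is to Base64-encode it. The following is only a minimal
sketch of that idea, not what the pull request does. It assumes plain Java
serialization as a stand-in for whatever serializer the Profiler actually
uses, and the class name Base64ValueSketch and its encode/decode helpers are
hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper, not part of Metron: turns a serializable value into
// text that can be stored in a JSON string field, and back again.
class Base64ValueSketch {

  // Serialize the value with Java serialization (an assumption; any
  // byte-oriented serializer could be substituted), then Base64-encode
  // the bytes so they are safe to place in a JSON string.
  static String encode(Serializable value) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    return Base64.getEncoder().encodeToString(bytes.toByteArray());
  }

  // Reverse of encode: Base64-decode the text and deserialize the object.
  static Object decode(String encoded) throws IOException, ClassNotFoundException {
    byte[] raw = Base64.getDecoder().decode(encoded);
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
      return in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for a complex profile value; a real value would be
    // something like the object produced by the STATS functions.
    ArrayList<Double> value = new ArrayList<>(Arrays.asList(1.0, 2.5, 4.0));
    String encoded = encode(value);
    Object decoded = decode(encoded);
    System.out.println(value.equals(decoded)); // prints "true"
  }
}
```

With something like this, the "value" field of the message could hold the
encoded string while the profile, entity and period fields stay readable.
The cost is that the consumer on the other side has to know how to decode
the field, and Base64 makes the payload roughly a third larger.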
> Triage Metrics Produced by the Profiler
> ---------------------------------------
>
> Key: METRON-701
> URL: https://issues.apache.org/jira/browse/METRON-701
> Project: Metron
> Issue Type: Improvement
> Reporter: Nick Allen
> Assignee: Nick Allen
>
> h3. Problem
> The motivating example is that I would like to create an alert if the number
> of inbound flows to any host over a 15 minute interval is abnormal.
> The value being interrogated here, the number of inbound flows, is not a
> static value contained within any single telemetry message. This value is
> calculated across multiple messages by the Profiler. The current Threat
> Triage process cannot be used to interrogate values calculated by the
> Profiler.
> h3. Proposed Solution
> I am proposing that we treat the Profiler as a source of telemetry. The
> measurements captured by the Profiler would be enqueued into a Kafka topic.
> We would then treat those Profiler messages like any other telemetry. We
> would parse, enrich, triage, and index those messages.
> This would have the following advantages.
> 1. We would be able to reuse the same threat triage mechanism for values
> calculated by the Profiler.
> 2. We would be able to generate profiles from the profiled data - aka
> meta-profiles anyone?