Piotr Wawrzyniek created SAMOA-65:
-------------------------------------
Summary: Apache Kafka integration components for SAMOA
Key: SAMOA-65
URL: https://issues.apache.org/jira/browse/SAMOA-65
Project: SAMOA
Issue Type: New Feature
Components: SAMOA-API, SAMOA-Instances
Reporter: Piotr Wawrzyniek
Apache SAMOA currently includes no integration components for Apache Kafka.
In particular, it cannot read data arriving from Kafka or write data with
prediction results back to Kafka.
The key assumptions for the development of Kafka-related components are as
follows:
1) develop support for an input data stream arriving at Apache SAMOA via
Apache Kafka
2) develop support for an output data stream produced by Apache SAMOA,
including the results of stream mining, and forwarded to Apache Kafka so that
other modules consuming the stream can use it.
The goal of this issue is therefore to create the following components:
1) KafkaEntranceProcessor in samoa-api. This entrance processor will
accept an incoming Kafka stream. It will require an implementation of the
KafkaDeserializer interface to be supplied. The role of the deserializer is to
translate incoming Apache Kafka messages into implementations of SAMOA's
Instance interface.
2) KafkaDestinationProcessor in samoa-api. Like the
KafkaEntranceProcessor, this processor will require an implementation of the
KafkaSerializer interface to be supplied. The role of the serializer is to
create a Kafka message from the underlying Instance class.
3) KafkaStream, as an extension of the existing streams (e.g.
InstanceStream), would take a role similar to that of other streams and will
provide control over Instance flows in the entire topology.
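To make the serializer and deserializer roles above concrete, here is a minimal
sketch of how the two interfaces might look. The method names and generic
signatures are assumptions, since this issue does not define them.
SimpleInstance is a hypothetical stand-in for SAMOA's Instance interface, and
CsvInstanceCodec is a toy implementation that uses only the JDK (no Kafka
client involved).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for SAMOA's Instance interface, so the sketch compiles on its own.
interface SimpleInstance {
    double[] values();
}

// Assumed shape of the proposed KafkaDeserializer: raw Kafka message bytes -> instance.
interface KafkaDeserializer<T> {
    T deserialize(byte[] message);
}

// Assumed shape of the proposed KafkaSerializer: instance -> raw Kafka message bytes.
interface KafkaSerializer<T> {
    byte[] serialize(T instance);
}

// Toy codec: attribute values as comma-separated text, encoded as UTF-8.
final class CsvInstanceCodec
        implements KafkaSerializer<SimpleInstance>, KafkaDeserializer<SimpleInstance> {

    @Override
    public byte[] serialize(SimpleInstance instance) {
        StringBuilder sb = new StringBuilder();
        double[] v = instance.values();
        for (int i = 0; i < v.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(v[i]);
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public SimpleInstance deserialize(byte[] message) {
        String text = new String(message, StandardCharsets.UTF_8);
        if (text.isEmpty()) {
            return () -> new double[0];
        }
        String[] parts = text.split(",");
        double[] values = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Double.parseDouble(parts[i].trim());
        }
        return () -> values;
    }
}

final class KafkaCodecSketch {
    public static void main(String[] args) {
        CsvInstanceCodec codec = new CsvInstanceCodec();
        // Round trip: instance -> message bytes -> instance.
        byte[] message = codec.serialize(() -> new double[] {1.5, 2.0, 0.0});
        SimpleInstance decoded = codec.deserialize(message);
        System.out.println(Arrays.toString(decoded.values()));
    }
}
```

Under these assumptions, KafkaEntranceProcessor would call deserialize on each
consumed message and KafkaDestinationProcessor would call serialize before
producing one, so the message format stays independent of both processors.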
Moreover, the following assumptions are considered:
1) Components would be implemented using the most up-to-date version of
Apache Kafka, i.e. 0.10
2) Samples of the aforementioned Serializer and Deserializer would be
delivered, supporting both Avro and JSON serialization of Instance objects.
3) Sample test classes providing a reference use of the Kafka source and
destination would be included in the project as well.
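As an illustration of what the JSON sample might cover, the sketch below
round-trips a flat instance through a JSON message. It is only a sketch under
stated assumptions: Row is a hypothetical stand-in for a SAMOA Instance, the
field names "attributes" and "classValue" are invented for this example, and
the JSON is written and parsed by hand to keep the block dependency-free. A
real sample would more likely use a JSON library and handle NaN and infinite
values, which this code does not.

```java
import java.nio.charset.StandardCharsets;

final class JsonInstanceSketch {

    // Hypothetical flat instance: numeric attributes plus a numeric class value.
    static final class Row {
        final double[] attributes;
        final double classValue;

        Row(double[] attributes, double classValue) {
            this.attributes = attributes;
            this.classValue = classValue;
        }
    }

    // Writes {"attributes":[...],"classValue":x} as UTF-8 bytes.
    static byte[] toJson(Row row) {
        StringBuilder sb = new StringBuilder("{\"attributes\":[");
        for (int i = 0; i < row.attributes.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(row.attributes[i]);
        }
        sb.append("],\"classValue\":").append(row.classValue).append('}');
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Parses only the fixed shape produced by toJson; not a general JSON parser.
    static Row fromJson(byte[] message) {
        String text = new String(message, StandardCharsets.UTF_8);
        int open = text.indexOf('[');
        int close = text.indexOf(']');
        int end = text.lastIndexOf('}');
        int colon = text.indexOf(':', close);
        if (open < 0 || close < open || colon < 0 || end < colon) {
            throw new IllegalArgumentException("Unexpected message: " + text);
        }
        String body = text.substring(open + 1, close).trim();
        double[] attributes = new double[0];
        if (!body.isEmpty()) {
            String[] parts = body.split(",");
            attributes = new double[parts.length];
            for (int i = 0; i < parts.length; i++) {
                attributes[i] = Double.parseDouble(parts[i].trim());
            }
        }
        double classValue = Double.parseDouble(text.substring(colon + 1, end).trim());
        return new Row(attributes, classValue);
    }

    public static void main(String[] args) {
        byte[] message = toJson(new Row(new double[] {5.1, 3.5}, 1.0));
        System.out.println(new String(message, StandardCharsets.UTF_8));
        Row decoded = fromJson(message);
        System.out.println(decoded.attributes.length + " attributes, class " + decoded.classValue);
    }
}
```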
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)