[ https://issues.apache.org/jira/browse/SAMOA-65?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nicolas Kourtellis updated SAMOA-65:
------------------------------------
Status: Done (was: In Progress)
Resolution: Done
> Apache Kafka integration components for SAMOA
> ---------------------------------------------
>
> Key: SAMOA-65
> URL: https://issues.apache.org/jira/browse/SAMOA-65
> Project: SAMOA
> Issue Type: New Feature
> Components: SAMOA-API, SAMOA-Instances
> Reporter: Piotr Wawrzyniak
> Assignee: Nicolas Kourtellis
> Labels: kafka, sink, source, streaming
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> Apache SAMOA currently includes no integration components for Apache Kafka.
> In particular, it cannot read data coming from Kafka or write data with
> prediction results back to Kafka.
> The key assumptions for the development of Kafka-related components are as
> follows:
> 1) develop support for an input data stream arriving at Apache SAMOA via
> Apache Kafka
> 2) develop support for an output data stream produced by Apache SAMOA,
> including the results of stream mining, and forwarded to Apache Kafka so that
> other modules consuming the stream can use it.
> The goal of this issue is therefore to create the following components:
> 1) KafkaEntranceProcessor in samoa-api. This entrance processor will be
> able to accept an incoming Kafka stream. It will require an implementation
> of the KafkaDeserializer interface to be delivered. The role of the
> Deserializer is to translate incoming Apache Kafka messages into
> implementations of SAMOA's Instance interface.
> 2) KafkaDestinationProcessor in samoa-api. Similarly to the
> KafkaEntranceProcessor, this processor will require an implementation of the
> KafkaSerializer interface to be delivered. The role of the Serializer is to
> create a Kafka message from the underlying Instance class.
> 3) KafkaStream, as an extension of the existing streams (e.g.
> InstanceStream), will take a role similar to that of other streams and will
> provide control over the flow of Instances through the entire topology.
> Moreover, the following assumptions are considered:
> 1) Components will be implemented using the most up-to-date version
> of Apache Kafka, i.e. 0.10
> 2) Samples of the aforementioned Serializer and Deserializer will be
> delivered, supporting both AVRO and JSON serialization of Instance objects.
> 3) Sample test classes providing a reference use of the Kafka source and
> destination will be included in the project as well.
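
The serializer and deserializer contracts described for components 1) and 2)
could look roughly like the sketch below. This is only an illustration under
stated assumptions: the interface names KafkaDeserializer and KafkaSerializer
come from the description, but their method signatures are guesses, and
SimpleInstance, CsvInstanceCodec and CodecDemo are hypothetical stand-ins
(SimpleInstance replaces SAMOA's Instance interface, and a toy comma-separated
format replaces the AVRO and JSON samples). No Kafka or SAMOA classes are used,
so the snippet compiles on its own.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical contract: turns the raw bytes of a Kafka message into an object.
// The method signature is an assumption, not the SAMOA API.
interface KafkaDeserializer<T> {
    T deserialize(byte[] message);
}

// Hypothetical contract: turns an object into the raw bytes of a Kafka message.
interface KafkaSerializer<T> {
    byte[] serialize(T value);
}

// Hypothetical stand-in for a SAMOA Instance: a dense vector of attribute values.
final class SimpleInstance {
    final double[] values;

    SimpleInstance(double[] values) {
        this.values = values.clone();
    }
}

// Toy codec using comma-separated UTF-8 text in place of AVRO or JSON.
final class CsvInstanceCodec
        implements KafkaSerializer<SimpleInstance>, KafkaDeserializer<SimpleInstance> {

    @Override
    public byte[] serialize(SimpleInstance instance) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < instance.values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(instance.values[i]);
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public SimpleInstance deserialize(byte[] message) {
        String text = new String(message, StandardCharsets.UTF_8);
        if (text.isEmpty()) {
            return new SimpleInstance(new double[0]);
        }
        String[] parts = text.split(",");
        double[] values = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Double.parseDouble(parts[i]);
        }
        return new SimpleInstance(values);
    }
}

// Round trip: the bytes an output processor would send are decoded again,
// as an entrance processor would do on the consuming side.
class CodecDemo {
    public static void main(String[] args) {
        CsvInstanceCodec codec = new CsvInstanceCodec();
        byte[] payload = codec.serialize(new SimpleInstance(new double[] {1.5, 2.0, -3.25}));
        SimpleInstance decoded = codec.deserialize(payload);
        System.out.println(Arrays.toString(decoded.values));
    }
}
```

Keeping the two directions behind separate interfaces would let the entrance
and destination processors be configured independently, for example reading
JSON from one topic and writing AVRO to another.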
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)