[
https://issues.apache.org/jira/browse/SAMOA-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16050565#comment-16050565
]
ASF GitHub Bot commented on SAMOA-65:
-------------------------------------
Github user nicolas-kourtellis commented on the issue:
https://github.com/apache/incubator-samoa/pull/64
Thank you @pwawrzyniak for the contributions!
Regarding the current code:
- Can you change the copyright to the current year? (I wonder if we should
keep the year at all, since it needs updating every time we have a new
release.)
- I added some minor comments on leftovers from the AVRO combined
integration. If you can remove them it would be cleaner.
- I noticed that there is redundancy / repetition between the three PRs
(#59, #64, #65). Is there a way to remove the overlap between them? Otherwise I
think there will be conflicts when trying to merge them. @gdfm what do you
think?
- After checking the code, I realized that this is specific to Kafka.
A quick question: Can this JSON serializer/deserializer be
extended/abstracted to be used by other interfaces besides Kafka (e.g., even
storing/retrieving JSON files from disk)? Do you think it is feasible, or would
it need a lot of work?
- Can we do something similar for Avro?
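
To make the abstraction question concrete, here is a minimal sketch of a serializer/deserializer that does not depend on the transport. All names in it (`InstanceCodec`, `JsonArrayCodec`, `FileTransport`, `CodecDemo`) are hypothetical and are not taken from the PR or from SAMOA. A plain `double[]` stands in for an Instance, and the JSON handling is hand-rolled only to keep the example free of dependencies.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical transport-agnostic codec; not an existing SAMOA or Kafka type.
interface InstanceCodec<T> {
    byte[] serialize(T value);
    T deserialize(byte[] bytes);
}

// Toy implementation: encodes a feature vector as a flat JSON array of numbers.
// A real implementation would delegate to a JSON library and handle
// non-finite values, which plain JSON cannot represent.
class JsonArrayCodec implements InstanceCodec<double[]> {
    @Override
    public byte[] serialize(double[] values) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]);
        }
        return sb.append(']').toString().getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public double[] deserialize(byte[] bytes) {
        String text = new String(bytes, StandardCharsets.UTF_8).trim();
        if (text.length() < 2 || !text.startsWith("[") || !text.endsWith("]")) {
            throw new IllegalArgumentException("Expected a JSON array: " + text);
        }
        String body = text.substring(1, text.length() - 1).trim();
        if (body.isEmpty()) {
            return new double[0];
        }
        return Arrays.stream(body.split(","))
                .map(String::trim)
                .mapToDouble(Double::parseDouble)
                .toArray();
    }
}

// File-based transport reusing the same codec that a Kafka source or sink
// could be handed; only the byte[] boundary is shared.
class FileTransport {
    static <T> void write(Path path, InstanceCodec<T> codec, T value) throws IOException {
        Files.write(path, codec.serialize(value));
    }

    static <T> T read(Path path, InstanceCodec<T> codec) throws IOException {
        return codec.deserialize(Files.readAllBytes(path));
    }
}

class CodecDemo {
    public static void main(String[] args) throws IOException {
        InstanceCodec<double[]> codec = new JsonArrayCodec();
        Path tmp = Files.createTempFile("instance", ".json");
        try {
            FileTransport.write(tmp, codec, new double[] {1.0, 2.5, -3.0});
            System.out.println(Arrays.toString(FileTransport.read(tmp, codec)));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Under this assumption the Kafka-specific classes would only move bytes, and an Avro variant would be another implementation of the same interface.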
> Apache Kafka integration components for SAMOA
> ---------------------------------------------
>
> Key: SAMOA-65
> URL: https://issues.apache.org/jira/browse/SAMOA-65
> Project: SAMOA
> Issue Type: New Feature
> Components: SAMOA-API, SAMOA-Instances
> Reporter: Piotr Wawrzyniak
> Labels: kafka, sink, source, streaming
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> As of now Apache SAMOA includes no integration components for Apache Kafka,
> meaning in particular no possibility to read data coming from Kafka and write
> data with prediction results back to Kafka.
> The key assumptions for the development of Kafka-related components are as
> follows:
> 1) develop support for an input data stream arriving at Apache SAMOA via
> Apache Kafka
> 2) develop support for the output data stream produced by Apache SAMOA,
> which includes the results of stream mining and is forwarded to Apache Kafka
> so that other modules can consume it.
> The goal of this issue is therefore to create the following components:
> 1) KafkaEntranceProcessor in samoa-api. This entrance processor will be
> able to accept an incoming Kafka stream. It will require an implementation
> of the KafkaDeserializer interface. The role of the Deserializer would be
> to translate incoming Apache Kafka messages into implementations of SAMOA's
> Instance interface.
> 2) KafkaDestinationProcessor in samoa-api. Similarly to the
> KafkaEntranceProcessor, this processor would require an implementation of
> the KafkaSerializer interface. The role of the Serializer would be to
> create a Kafka message from the underlying Instance class.
> 3) KafkaStream, as an extension to existing streams (e.g.
> InstanceStream), would take a similar role to other streams, and will
> provide control over Instance flows in the entire topology.
> Moreover, the following assumptions are considered:
> 1) Components would be implemented using the most up-to-date version
> of Apache Kafka, i.e. 0.10
> 2) Samples of the aforementioned Serializer and Deserializer would be
> delivered, supporting both AVRO and JSON serialization of Instance objects.
> 3) Sample test classes providing a reference use of the Kafka source and
> destination would be included in the project as well.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)