Hi Jayadeep, I think it's pretty cool! If we get both Avro and Kafka support right, we can connect to almost anything.
The document looks very comprehensive; you seem to have given it a lot of thought. I am not extremely familiar with Avro myself (I've only used it a couple of times), but I'll try to provide some suggestions.

- The general idea of where and how to store data and metadata seems right.
- In general, all attributes in a sparse instance are optional, and all attributes in a dense instance are required. Maybe we want to be more granular than this in the future, but it seems that Avro supports a superset of these settings. We may want to have some default "prototypes" in order to make mapping the current dense/sparse instances easy.
- Right now we are not making use of Date-type attributes in SAMOA (there is no such thing in samoa-instances), so if it makes things easier we could skip supporting them. Ideally we could have algorithms that respect event time as provided by timestamps in the instances (as opposed to processing the event whenever it arrives), but we are not there yet :)

All the rest seems pretty straightforward.

Moving to the more software-engineering-oriented aspects, where would we have dependencies for Avro? And how should they be deployed? Would they simply go inside the deployable uber-jar of SAMOA?

Thanks,

-- Gianmarco

On 19 October 2015 at 11:24, Jayadeep J <[email protected]> wrote:

> Hi Gianmarco / All,
>
> I am working on an integration of SAMOA with Apache Avro. Basically I want
> to use data stored in Avro files as input to SAMOA.
>
> As I understand, current SAMOA readers only support the ARFF format. Do you
> think such a feature would be useful to SAMOA in general? Avro allows two
> encodings for the data: Binary & JSON. Hence Avro support may also allow
> users with JSON data to use SAMOA.
>
> Based on the input given by @gdfm to @ctippur, I have prepared an Input
> Format document in Google Docs.
>
> https://docs.google.com/document/d/1EiyuXOZFKk7MTs-gWaEJq5PVHYyiphhateTaDJMKuR8/edit?usp=sharing
>
> Would it be possible for you to have a look and provide your valuable
> suggestions?
>
> Thanks
> Jay
> https://github.com/jayadeepj
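
To make the dense/sparse point from the reply concrete, here is a rough sketch of what the default "prototypes" might look like as Avro schemas. It only builds the schema JSON with the Python standard library, so no Avro dependency is needed to try it. The function name, record names, and attribute names are all made up for illustration, and restricting attributes to doubles is a simplifying assumption, not something taken from the proposal document. The one Avro fact relied on is that an optional field is written as a union with "null" plus a null default.

```python
import json


def instance_schema(record_name, attribute_names, sparse):
    """Build an Avro record schema (as a dict) for numeric attributes.

    Hypothetical helper for illustration only:
      dense  -> every attribute is a required double
      sparse -> every attribute is a ["null", "double"] union defaulting to null
    """
    fields = []
    for name in attribute_names:
        if sparse:
            # Optional field: union with "null" listed first, default null.
            fields.append({"name": name, "type": ["null", "double"], "default": None})
        else:
            # Required field: plain double, no default.
            fields.append({"name": name, "type": "double"})
    return {"type": "record", "name": record_name, "fields": fields}


# Example attribute names are placeholders, not from any SAMOA dataset.
attributes = ["attr_a", "attr_b"]
dense = instance_schema("DenseInstance", attributes, sparse=False)
sparse = instance_schema("SparseInstance", attributes, sparse=True)

# The resulting dicts serialize directly to Avro schema JSON.
print(json.dumps(dense, indent=2))
print(json.dumps(sparse, indent=2))
```

A more granular mapping (some attributes required, some optional) would just be a per-attribute flag instead of the single `sparse` switch.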
