Gianmarco

Thanks for the response. Could you please specify the format the JSON would
need to conform to, and explain why it has to be in that specific format?
I would like to contribute to the Kafka enhancement. I will look into the
code you pointed out.

Shekar
On Jul 11, 2015 1:36 AM, "Gianmarco De Francisci Morales" <[email protected]>
wrote:

> Hi Shekar,
>
> At the moment we do not support JSON data.
> The current readers support the ARFF format, which is CSV data with a header.
> http://www.cs.waikato.ac.nz/ml/weka/arff.html
> Adding support for JSON is doable, but it should conform to a very specific
> format.
>
> About Kafka, we support it as a transport via Samza, but we don't have a
> reader for it right now.
> Adding it would be very valuable. If you wanted to work on it I'd be happy
> to help.
> Have a look at org.apache.samoa.streams.fs.HDFSFileStreamSource,
> and org.apache.samoa.streams.ArffFileStream for some examples.
>
> Cheers,
>
>
> --
> Gianmarco
>
> On 10 July 2015 at 01:18, Shekar Tippur <[email protected]> wrote:
>
> > Hello,
> >
> > I am trying to use Samoa/Samza combination to apply ML for a dataset I
> > have in JSON format.
> >
> > This is the document I am following:
> >
> > https://samoa.incubator.apache.org/documentation/Executing-SAMOA-with-Apache-Samza.html
> >
> > Couple of questions:
> > 1. How do I point the input event to a Stream/Topic in Kafka? The data is
> > in JSON.
> > 2. If I want to use historical data that is stored in a file, how do I
> > point the job to read from a file and serialise as json?
> >
> > bin/samoa samza target/SAMOA-Samza-0.3.0-SNAPSHOT.jar
> > "PrequentialEvaluation -l classifiers.ensemble.Bagging -s (??)"
> >
> > - Shekar
> >
>
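
Since the current readers only accept ARFF, one possible stopgap for the historical-data case is to convert the JSON file to ARFF before running the job. The following is a minimal sketch, not part of SAMOA: the function name json_lines_to_arff and the relation name are hypothetical, and it assumes one flat JSON object per line, with every record sharing the same keys. Numbers are mapped to numeric attributes and everything else to nominal attributes; values containing commas or spaces would need quoting, which this sketch does not handle.

```python
import json


def json_lines_to_arff(lines, relation="dataset"):
    """Convert an iterable of JSON object strings into ARFF text.

    Assumption: records are flat and all share the keys of the first record.
    """
    records = [json.loads(line) for line in lines if line.strip()]
    if not records:
        raise ValueError("no records to convert")
    names = list(records[0].keys())

    out = ["@relation " + relation, ""]
    for name in names:
        values = [rec[name] for rec in records]
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        if is_numeric:
            out.append("@attribute " + name + " numeric")
        else:
            # Nominal attribute: list the distinct values seen in the data.
            labels = sorted({str(v) for v in values})
            out.append("@attribute " + name + " {" + ",".join(labels) + "}")

    out += ["", "@data"]
    for rec in records:
        out.append(",".join(str(rec[name]) for name in names))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = [
        '{"temp": 21.5, "label": "yes"}',
        '{"temp": 18, "label": "no"}',
    ]
    print(json_lines_to_arff(sample, relation="weather"))
```

Whether the resulting file is accepted as-is depends on how the ARFF reader expects the class attribute to be declared, so treat the output as a starting point to check against the ARFF description linked above.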
