Hi Giacomo,

I don't have any specific advice on what you are trying to do, but I think the end goal is to have everything leverage the common data models in Spot. So my recommendation would be to figure out a way to convert your data to the common data model. That said, I don't think the Spot ML code actually leverages the common data model yet, so that's more of a future solution.
If anyone knows better, feel free to correct me.

Michael

On Tue, Mar 7, 2017 at 10:57 AM, Giacomo Bernardi <[email protected]> wrote:

> Hi,
> let me ask a suggestion on how to proceed:
>
> I already have flow data stored HDFS in Parquet files from an existing
> netflow receiver system, but with different columns/schema than Spot. I'd
> like to patch spot-ml and spot-oa to have them run directly on that data
> without having to store everything twice.
>
> I'm still figuring out the parsing code, any hints on how I should do this?
> Or, even better, how to do it in a sane/modular way that can be useful for
> everyone?
>
> Thanks a lot!
> Giacomo
>

--
Michael Ridley <[email protected]>
office: (650) 352-1337
mobile: (571) 438-2420
Senior Solutions Architect
Cloudera, Inc.
