Re: [SPOT][ML] Schema check for custom data sets

Jonathan Natkins Thu, 22 Jun 2017 10:32:38 -0700

Personally, I'd love for there to be more information about the expected
schema for the ML jobs, as well as information about where the data can be
picked up from. The documentation seems to be mostly written with a
specific example in mind, so it is not very helpful when trying to
integrate new data sources. A data dictionary would make it easier to
map fields from other data formats (other logs, etc.) to fields that
spot-ml can process.
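
To make the idea concrete, here is a rough sketch of how a data dictionary
and an up-front schema check could fit together. Everything in it is an
assumption for illustration: the field names, the types, and the helper
names (EXPECTED_SCHEMA, FIELD_MAP, map_fields, validate_record) are
hypothetical and are not taken from spot-ml or its documentation.

```python
# Hypothetical data dictionary: field name -> expected Python type.
# These names and types are illustrative only, NOT spot-ml's real schema.
EXPECTED_SCHEMA = {
    "treceived": str,
    "sip": str,
    "dip": str,
    "sport": int,
    "dport": int,
    "ibyt": int,
}

# Hypothetical mapping from a custom log format to the expected field names.
FIELD_MAP = {
    "timestamp": "treceived",
    "src_ip": "sip",
    "dst_ip": "dip",
    "src_port": "sport",
    "dst_port": "dport",
    "bytes_in": "ibyt",
}


def map_fields(record, field_map):
    """Rename the keys of a record; keys without a mapping are kept as-is."""
    return {field_map.get(key, key): value for key, value in record.items()}


def validate_record(record, schema):
    """Return a list of problems; an empty list means the record conforms."""
    errors = []
    for name, expected_type in schema.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            actual = type(record[name]).__name__
            errors.append(
                f"field {name}: expected {expected_type.__name__}, got {actual}"
            )
    return errors


if __name__ == "__main__":
    raw = {
        "timestamp": "2017-06-22 10:32:38",
        "src_ip": "10.0.0.1",
        "dst_ip": "10.0.0.2",
        "src_port": 1234,
        "dst_port": "80",  # wrong type on purpose; "bytes_in" is also absent
    }
    for problem in validate_record(map_fields(raw, FIELD_MAP), EXPECTED_SCHEMA):
        print(problem)
```

In this sketch, fields that are not in the dictionary are simply ignored, so
a check like this would reject only missing or mistyped fields rather than
locking out data sets that carry extra columns.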


Whatever happened to the open data model that was being discussed for Spot?

Thanks!
Natty

On Thu, Jun 22, 2017 at 10:10 AM Barona, Ricardo <[email protected]>
wrote:

> Hi everyone.
>
> I’m happy to see more and more people playing with Spot, and particularly
> with spot-ml.
>
> Something that I’ve noticed thanks to these two Jira issues (
> https://issues.apache.org/jira/browse/SPOT-149 and
> https://issues.apache.org/jira/browse/SPOT-174) is that sometimes users
> are going to want to try spot-ml without ingesting data using spot-ingest.
> I think that’s cool, but it seems that it can lead to schema
> inconsistencies.
>
> I’d like to know what you think would be the best approach to deal
> with this. I’m thinking that we can add schema validation to spot-ml before
> anything else happens, but I don’t know if that’s going to lock things down
> too much.
>
> Please share your thoughts.
>
> Thanks,
> Ricardo Barona
>
-- 
Jonathan "Natty" Natkins
StreamSets | Field Engineering Director
mobile: 609.577.1600 | linkedin <http://www.linkedin.com/in/nattyice>
