[ https://issues.apache.org/jira/browse/MAHOUT-8?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karl Wettin updated MAHOUT-8:
-----------------------------
Attachment: pseudo_jsr.txt
My question is, did anyone else take a closer look at the JSRs? I would very
much like to hear what you people think of this data model. I'm quite attracted
to it.
It says nothing about how data is stored; it is about roles and abstract access
to physical instance data. And it separates the logical model (the data set
definition used by ML algorithms) from the physical model (the data set
definition describing the source data), allowing one to virtually transform the
data set by mapping logical data to the physical data in any way without
messing things up.
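To make the logical/physical split a bit more concrete, here is a minimal
sketch in Java. It is not taken from the JSRs or from pseudo_jsr.txt: every
class, field and method name below (PhysicalAttribute, LogicalAttribute,
LogicalDataSet, AttributeRole, DataModelSketch) is made up for illustration,
and it assumes a physical record is simply an array of strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Role an attribute plays for an algorithm (hypothetical, not from the JSR). */
enum AttributeRole { INPUT, TARGET }

/** Physical side: one field of the source data, identified by its position. */
final class PhysicalAttribute {
    final String name;
    final int index;

    PhysicalAttribute(String name, int index) {
        this.name = name;
        this.index = index;
    }
}

/** Logical side: an attribute as the algorithm sees it, mapped to a physical field. */
final class LogicalAttribute {
    final String name;
    final AttributeRole role;
    final PhysicalAttribute source;
    final Function<String, Object> transform;

    LogicalAttribute(String name, AttributeRole role, PhysicalAttribute source,
                     Function<String, Object> transform) {
        this.name = name;
        this.role = role;
        this.source = source;
        this.transform = transform;
    }

    /** Reads the mapped physical field and converts it to the logical value. */
    Object read(String[] physicalRecord) {
        return transform.apply(physicalRecord[source.index]);
    }
}

/** The data set definition handed to an algorithm: an ordered list of logical attributes. */
final class LogicalDataSet {
    private final List<LogicalAttribute> attributes = new ArrayList<>();

    LogicalDataSet add(LogicalAttribute attribute) {
        attributes.add(attribute);
        return this;
    }

    /** Builds the logical view of one physical record; the record itself is not modified. */
    Map<String, Object> view(String[] physicalRecord) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (LogicalAttribute a : attributes) {
            row.put(a.name, a.read(physicalRecord));
        }
        return row;
    }
}

final class DataModelSketch {
    /** Physical layout assumed here: id, sepal_length, species. */
    static LogicalDataSet example() {
        PhysicalAttribute sepalLength = new PhysicalAttribute("sepal_length", 1);
        PhysicalAttribute species = new PhysicalAttribute("species", 2);
        return new LogicalDataSet()
            .add(new LogicalAttribute("length", AttributeRole.INPUT, sepalLength,
                                      s -> Double.valueOf(s)))
            .add(new LogicalAttribute("label", AttributeRole.TARGET, species, s -> s));
    }

    public static void main(String[] args) {
        String[] record = {"42", "5.1", "setosa"};
        System.out.println(example().view(record)); // prints {length=5.1, label=setosa}
    }
}
```

The id column is part of the physical record but never shows up in the logical
view, and renaming or retyping a logical attribute only touches the mapping,
not the source data.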
I now have this half-baked pseudo implementation of this. It uses abstract
classes rather than interfaces, and some of the interfaces have been merged into
a single class. It would however not be a big deal to have it implement the
interfaces if one wishes. I feel some of the stuff in there is a bit overkill at
this point, but I tried to follow the specs as well as I could (I replaced a
few ad hoc enum classes with enums, etc).
There is no documentation, tests or anything concrete, just a bunch of classes
I'm now popping in the JIRA to show what it could look like.
Actually, there is also an early attempt at an abstract seekable physical data
record reader, and an ARFF writer. They are sort of my dry-coded thoughts. You
can ignore them.
> Data definition model
> ---------------------
>
> Key: MAHOUT-8
> URL: https://issues.apache.org/jira/browse/MAHOUT-8
> Project: Mahout
> Issue Type: New Feature
> Reporter: Karl Wettin
> Attachments: pseudo_jsr.txt
>
>
> How do we define classes, attributes and instance data?
> This has nothing to do with physical data records, this is about data types,
> roles, etc.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.