On 26 Jan 2008, at 04:23, Grant Ingersoll wrote:
let's try to start filling in the TODO list and get some basic build infrastructure in place.
We should talk about a unified data access API. It doesn't need to be fancy or fast from the start; a seekable record reader might be enough for now. It should have plenty of abstract layers, so that people can add support methods and use any data source with optional levels of access optimization: an ARFF file, an inverted index, or whatever fits best with the algorithm you are about to pass the data to.
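To make the seekable record reader idea a bit more concrete, here is a rough sketch of what I have in mind. Every name in it (SeekableRecordReader, InMemoryRecordReader, RecordReaderDemo) is made up for illustration and nothing like it exists in the project yet. An ARFF-backed or index-backed reader would implement the same interface with its own access optimizations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical API sketch: none of these types exist in the project.

/** Minimal seekable record reader: sequential reads plus random access by record index. */
interface SeekableRecordReader<R> {
    /** Total number of records in the underlying source. */
    long size();

    /** Positions the reader so that the next call to next() returns record `index`. */
    void seek(long index);

    boolean hasNext();

    R next();
}

/** Simplest possible backing store: all records held in memory. */
final class InMemoryRecordReader<R> implements SeekableRecordReader<R> {
    private final List<R> records;
    private int position = 0;

    InMemoryRecordReader(List<R> records) {
        this.records = new ArrayList<>(records); // defensive copy
    }

    @Override
    public long size() {
        return records.size();
    }

    @Override
    public void seek(long index) {
        // Seeking to size() is allowed and leaves the reader at end of data.
        if (index < 0 || index > records.size()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        position = (int) index;
    }

    @Override
    public boolean hasNext() {
        return position < records.size();
    }

    @Override
    public R next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return records.get(position++);
    }
}

final class RecordReaderDemo {
    public static void main(String[] args) {
        SeekableRecordReader<double[]> reader = new InMemoryRecordReader<>(
                Arrays.asList(new double[] {1.0, 2.0}, new double[] {3.0, 4.0}));
        reader.seek(1); // jump straight to the second record
        System.out.println(Arrays.toString(reader.next()));
    }
}
```

An algorithm would only depend on the interface, so the data source can be swapped without touching the algorithm code.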
Does anyone feel particularly strongly about initial algorithms to tackle? I'm thinking k-Means, naive Bayes or neural nets, but am obviously open to other suggestions.
I'm planning a soft start by implementing preprocessing filters (discretization, resampling, etc.). Then I'll probably look at feature selection, hierarchical clustering or reinforcement learning.
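As an example of the kind of preprocessing filter I mean, here is a minimal sketch of equal-width discretization. It is just one of several possible discretization schemes, and the class name EqualWidthDiscretizer and its methods are hypothetical, not an agreed design.

```java
// Hypothetical sketch of a discretization filter: equal-width binning of one numeric attribute.
final class EqualWidthDiscretizer {
    private final double min;
    private final double width;
    private final int bins;

    /** Learns the value range from the given sample and splits it into `bins` equal-width intervals. */
    EqualWidthDiscretizer(double[] values, int bins) {
        if (values.length == 0 || bins < 1) {
            throw new IllegalArgumentException("need at least one value and one bin");
        }
        double lo = values[0];
        double hi = values[0];
        for (double v : values) {
            lo = Math.min(lo, v);
            hi = Math.max(hi, v);
        }
        this.min = lo;
        this.bins = bins;
        this.width = (hi - lo) / bins;
    }

    /** Maps a value to a bin index in [0, bins - 1]; values outside the learned range are clamped. */
    int binOf(double value) {
        if (width == 0.0) {
            return 0; // all sample values were identical
        }
        int bin = (int) Math.floor((value - min) / width);
        return Math.max(0, Math.min(bins - 1, bin));
    }
}

final class DiscretizerDemo {
    public static void main(String[] args) {
        EqualWidthDiscretizer d = new EqualWidthDiscretizer(new double[] {0.0, 10.0}, 5);
        System.out.println(d.binOf(5.0));
    }
}
```

A filter like this could sit between a record reader and an algorithm that only handles nominal attributes.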
karl
