I agree that models should be highly generic.  I just don't think that we
should legislate the content of either their internal model or their
serialized representation.

The contract is pretty clear, however.  There are just a few methods, and it
isn't hard for all models to support them, especially with an abstract class
providing default implementations.  I would suggest a few oddities in the API
that can help a lot with performance.  For instance, for two-class models, it
should be possible to call the model and get back a double value which is the
score for the first class.  For k-class models, it should be possible to pass
in a vector that receives the scores for the various classes, and both 1 of k
and 1 of k-1 encoding should be supported.  All of these variants are useful
to avoid having to allocate a bunch of extra vectors at classification time.
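
To make that concrete, here is a rough sketch of what such a contract could
look like in Java.  Every name in it (ScoringModel, scoreScalar,
scoreReduced, scoreFull, BinaryLogisticModel) is made up for illustration
and is not an existing API.  It also assumes that scores are probabilities
summing to 1, so that the k-th score can be recovered from the other k-1.

```java
/** Hypothetical contract; all names here are illustrative assumptions. */
abstract class ScoringModel {
    /** Number of classes k. */
    abstract int numCategories();

    /**
     * 1 of k-1 encoding: writes the scores for classes 0..k-2 into the
     * first k-1 slots of the caller-supplied array.
     */
    abstract void scoreReduced(double[] features, double[] out);

    /**
     * Two-class shortcut: returns the score of the first class.
     * This default allocates a one-element array; concrete models are
     * expected to override it with an allocation-free version.
     */
    double scoreScalar(double[] features) {
        if (numCategories() != 2) {
            throw new IllegalStateException("scoreScalar is for two-class models only");
        }
        double[] tmp = new double[1];
        scoreReduced(features, tmp);
        return tmp[0];
    }

    /**
     * 1 of k encoding: fills all k slots of the caller-supplied array.
     * Assumes scores are probabilities that sum to 1, so the last one is
     * whatever remains after the first k-1.
     */
    void scoreFull(double[] features, double[] out) {
        int k = numCategories();
        scoreReduced(features, out);
        double sum = 0.0;
        for (int i = 0; i < k - 1; i++) {
            sum += out[i];
        }
        out[k - 1] = 1.0 - sum;
    }
}

/** Toy two-class model (logistic score of a dot product), for illustration only. */
final class BinaryLogisticModel extends ScoringModel {
    private final double[] weights;

    BinaryLogisticModel(double[] weights) {
        this.weights = weights.clone();
    }

    @Override
    int numCategories() {
        return 2;
    }

    /** Allocation-free override of the two-class shortcut. */
    @Override
    double scoreScalar(double[] features) {
        double dot = 0.0;
        for (int i = 0; i < weights.length; i++) {
            dot += weights[i] * features[i];
        }
        return 1.0 / (1.0 + Math.exp(-dot));
    }

    @Override
    void scoreReduced(double[] features, double[] out) {
        out[0] = scoreScalar(features);
    }
}
```

The caller owns the output array and can reuse it across calls, so the
classification loop itself does not have to create any vectors.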



On Tue, Jun 22, 2010 at 10:07 AM, Robin Anil <[email protected]> wrote:

> The reason I said models be generic is because they can then be read across
> classifiers. Like if the classifier does nearest centroid matching like in
> NB or using output of K-Means it can be used. Or if a margin trained using
> Pegasos can be used by any SVM classifier. That's all.
>
