[ https://issues.apache.org/jira/browse/MAHOUT-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898429#action_12898429 ]
Drew Farris commented on MAHOUT-479:
------------------------------------

Thanks for getting the ball rolling, Isabel.

More discussion from the mailing list can be found here: http://markmail.org/thread/tarn6f4ump5zn67n

One thought I remember reading somewhere: instead of using a common datatype as input for classifiers and clusterers, provide something like an InputSource interface, which defines how things are read and can be implemented for any number of physical input types (e.g. text files, serialized vectors in sequence files, various types of distributed data stores). We would provide a number of input sources with the appropriate methods and allow users of Mahout to implement their own.

Some other discussion I'm particularly interested in is how we might unify models across the classification and clustering code.

> Streamline classification/ clustering data structures
> -----------------------------------------------------
>
> Key: MAHOUT-479
> URL: https://issues.apache.org/jira/browse/MAHOUT-479
> Project: Mahout
> Issue Type: Improvement
> Components: Classification, Clustering
> Affects Versions: 0.1, 0.2, 0.3, 0.4
> Reporter: Isabel Drost
>
> Opening this JIRA issue to collect ideas on how to streamline our
> classification and clustering algorithms to make integration for users easier
> as per mailing list thread http://markmail.org/message/pnzvrqpv5226twfs
> {quote}
> Jake and Robin and I were talking the other evening and a common lament was
> that our classification (and clustering) stuff was all over the map in terms
> of data structures. Driving that to rest and getting those comments even
> vaguely as plug and play as our much more advanced recommendation components
> would be very, very helpful.
> {quote}
> This issue probably also relates to MAHOUT-287 (intention there is to make
> naive bayes run on vectors as input).
>
> Ted, Jake, Robin: Would be great if someone of you could add a comment on
> some of the issues you discussed "the other evening" and (if applicable) any
> minor or major changes you think could help solve this issue.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
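
To make the InputSource idea from the comment above a bit more concrete, here is a minimal Java sketch. Only the name InputSource comes from the comment. Its shape (an Iterable of double[] vectors), the CsvTextInputSource implementation, and the MeanCalculator consumer are all hypothetical and are not existing Mahout classes. Plain double[] stands in for Mahout's vector types to keep the example self-contained, and all rows are assumed to have the same length.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical interface (not part of Mahout): defines how feature vectors
// are read, independent of the physical input type behind it.
interface InputSource extends Iterable<double[]> {
}

// Hypothetical implementation for one physical input type: comma-separated
// text, one vector per line. Other implementations could wrap sequence files
// or a distributed data store.
final class CsvTextInputSource implements InputSource {
    private final String text;

    CsvTextInputSource(String text) {
        this.text = text;
    }

    @Override
    public Iterator<double[]> iterator() {
        List<double[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = line.split(",");
                double[] row = new double[fields.length];
                for (int i = 0; i < fields.length; i++) {
                    row[i] = Double.parseDouble(fields[i].trim());
                }
                rows.add(row);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows.iterator();
    }
}

// Stand-in for a classifier or clusterer: it depends only on InputSource,
// never on the underlying format. Assumes all vectors have equal length.
final class MeanCalculator {
    static double[] mean(InputSource source) {
        double[] sum = null;
        int count = 0;
        for (double[] vector : source) {
            if (sum == null) {
                sum = new double[vector.length];
            }
            for (int i = 0; i < vector.length; i++) {
                sum[i] += vector[i];
            }
            count++;
        }
        if (sum == null) {
            return new double[0]; // empty source
        }
        for (int i = 0; i < sum.length; i++) {
            sum[i] /= count;
        }
        return sum;
    }
}
```

With this shape, a call such as MeanCalculator.mean(new CsvTextInputSource("1.0, 2.0\n3.0, 4.0\n")) never touches the text format directly, so supporting a new input type would mean adding another InputSource implementation and leaving the algorithm code alone.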