[ https://issues.apache.org/jira/browse/MAHOUT-552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935923#action_12935923 ]
Ted Dunning commented on MAHOUT-552:
------------------------------------

{quote}
May I propose instead that the Name feature be promoted to the Vector interface?
{quote}

We started there and decided it was a bad thing (tm).

The rationale was that we wanted to allow existing non-Mahout vector implementations to implement a simpler interface that was purely numerically oriented. It was also desirable to have very simple semantics for matrix multiplication and pairwise operations while still having really high performance. This is hard to do coherently with labels other than a dense collection of integers. There was also some controversy over whether string labels were sufficient.

Ultimately, the solution was a named vector and a named matrix that wrap an ordinary vector or matrix. This avoids forcing a tax on all implementations, but gives the flexibility to use named objects.

For reference, I was originally one of the ones pushing for labels everywhere. My current position is the opposite.

> AbstractCluster eliminates NamedVectors by replacing them with RandomAccessSparseVector always
> ----------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-552
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-552
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.5
>            Reporter: Pere Ferrera Bertran
>            Assignee: Jeff Eastman
>             Fix For: 0.5
>
>         Attachments: MAHOUT-552.patch
>
>
> When clustering using NamedVectors as input - after running seq2sparse with patch https://issues.apache.org/jira/browse/MAHOUT-401 - names are lost because AbstractCluster replaces vectors coming in the constructor with RandomAccessSparseVector.
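
The wrapper design described in the comment above, and the way a copy can drop the name, might be sketched roughly as follows. This is only an illustration under stated assumptions: the types NumericVector, PlainVector, LabeledVector and the helper Clusters.copyKeepingName are hypothetical stand-ins invented for this example. They are not Mahout's actual classes, and this is not the attached patch.

```java
// Hypothetical, simplified types for illustration only (not Mahout's API).

/** Purely numeric interface: no labels, so any implementation can satisfy it cheaply. */
interface NumericVector {
    int size();
    double get(int index);
    double dot(NumericVector other);
}

/** Plain dense implementation that knows nothing about names. */
final class PlainVector implements NumericVector {
    private final double[] values;

    PlainVector(double... values) {
        this.values = values.clone();
    }

    @Override public int size() { return values.length; }

    @Override public double get(int index) { return values[index]; }

    @Override public double dot(NumericVector other) {
        if (other.size() != size()) {
            throw new IllegalArgumentException("size mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i] * other.get(i);
        }
        return sum;
    }
}

/** Wrapper that carries a name and delegates all numeric work to the wrapped vector. */
final class LabeledVector implements NumericVector {
    private final NumericVector delegate;
    private final String name;

    LabeledVector(NumericVector delegate, String name) {
        this.delegate = delegate;
        this.name = name;
    }

    String getName() { return name; }

    NumericVector getDelegate() { return delegate; }

    @Override public int size() { return delegate.size(); }

    @Override public double get(int index) { return delegate.get(index); }

    @Override public double dot(NumericVector other) { return delegate.dot(other); }
}

final class Clusters {
    private Clusters() { }

    /**
     * Copies the numeric content into a fresh PlainVector. If the input was
     * labeled, the copy is wrapped again so the name survives; copying into a
     * plain vector without this step is how a name would be lost.
     */
    static NumericVector copyKeepingName(NumericVector v) {
        double[] values = new double[v.size()];
        for (int i = 0; i < values.length; i++) {
            values[i] = v.get(i);
        }
        NumericVector copy = new PlainVector(values);
        if (v instanceof LabeledVector) {
            return new LabeledVector(copy, ((LabeledVector) v).getName());
        }
        return copy;
    }
}
```

With this shape, numeric code only ever sees NumericVector, so unnamed implementations pay nothing for labels, while code that cares about names checks for the wrapper explicitly.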