[ 
https://issues.apache.org/jira/browse/MAHOUT-552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935923#action_12935923
 ] 

Ted Dunning commented on MAHOUT-552:
------------------------------------

{quote}
May I propose instead that the Name feature be promoted to the Vector interface?
{quote}

We started there and decided it was a bad thing (tm).

The rationale was that we wanted to allow existing non-Mahout vector 
implementations to implement a simpler interface that was purely numerically 
oriented.  It was also desirable to have very simple semantics for matrix 
multiplication and pairwise operations while still having really high 
performance.  This is hard to do coherently with labels other than a dense 
collection of integers.

There was also some controversy over whether string labels were sufficient.

Ultimately, the solution was a named vector and a named matrix, each of which 
wraps an ordinary vector or matrix.  This avoids forcing a tax on all 
implementations, but gives the flexibility to use named objects.
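
As a rough illustration of the wrapper idea, here is a minimal Java sketch.  
The names NumericVector, DenseNumericVector, NamedNumericVector and 
WrapperSketch are hypothetical and are not Mahout's actual classes or 
signatures.  The sketch only shows how a name can sit outside a purely numeric 
interface, and why code that copies just the numbers ends up dropping the name.

```java
// Hypothetical, purely numeric interface (an assumption, not Mahout's Vector).
interface NumericVector {
    int size();
    double get(int index);
    double dot(NumericVector other);
}

// Plain implementation that knows nothing about names or labels.
final class DenseNumericVector implements NumericVector {
    private final double[] values;

    DenseNumericVector(double... values) {
        this.values = values.clone();
    }

    @Override
    public int size() {
        return values.length;
    }

    @Override
    public double get(int index) {
        return values[index];
    }

    @Override
    public double dot(NumericVector other) {
        if (other.size() != size()) {
            throw new IllegalArgumentException("size mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i] * other.get(i);
        }
        return sum;
    }
}

// Wrapper that carries a name and delegates all numeric work unchanged.
final class NamedNumericVector implements NumericVector {
    private final String name;
    private final NumericVector delegate;

    NamedNumericVector(String name, NumericVector delegate) {
        this.name = name;
        this.delegate = delegate;
    }

    String getName() {
        return name;
    }

    @Override
    public int size() {
        return delegate.size();
    }

    @Override
    public double get(int index) {
        return delegate.get(index);
    }

    @Override
    public double dot(NumericVector other) {
        return delegate.dot(other);
    }
}

final class WrapperSketch {
    // Copies only the numeric content, so the result is a plain vector
    // and any name carried by the input is not preserved.
    static NumericVector copyValues(NumericVector v) {
        double[] values = new double[v.size()];
        for (int i = 0; i < values.length; i++) {
            values[i] = v.get(i);
        }
        return new DenseNumericVector(values);
    }

    public static void main(String[] args) {
        NumericVector plain = new DenseNumericVector(1.0, 2.0, 3.0);
        NamedNumericVector named = new NamedNumericVector("doc-1", plain);
        System.out.println(named.getName() + " dot plain = " + named.dot(plain));
        System.out.println("copy keeps name? "
                + (copyValues(named) instanceof NamedNumericVector));
    }
}
```

In this sketch, implementations that only care about numbers never have to 
know that names exist, while callers that need a name hold on to the wrapper.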

For reference, I was one of the ones originally pushing for labels all up and 
down.  My current position is the opposite.


> AbstractCluster eliminates NamedVectors by replacing them with 
> RandomAccessSparseVector always
> ----------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-552
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-552
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.5
>            Reporter: Pere Ferrera Bertran
>            Assignee: Jeff Eastman
>             Fix For: 0.5
>
>         Attachments: MAHOUT-552.patch
>
>
> When clustering using NamedVectors as input - after running seq2sparse with 
> patch https://issues.apache.org/jira/browse/MAHOUT-401 - names are lost 
> because AbstractCluster replaces vectors coming in the constructor with 
> RandomAccessSparseVector.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
