I mean walking through the algorithms and tracking which vector name becomes which matrix row or column label.
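For the simple clustering case in the quoted thread below, the bookkeeping can be pictured with a small sketch. This is not Mahout code: `LabeledClustering`, `Labeled`, and `assign` are hypothetical names made up for illustration, and the centroid values used in the usage note are invented. It only mirrors the idea behind NamedVector, where the name rides along with the values while the distance computation looks at the values alone.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in (not a Mahout class): a label that travels with the values. */
public class LabeledClustering {

    /** A vector of values plus the id of the record it came from. */
    public static final class Labeled {
        final String name;
        final double[] values;

        public Labeled(String name, double[] values) {
            this.name = name;
            this.values = values;
        }
    }

    /** Squared Euclidean distance; only the numeric values take part, never the name. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Assigns each labeled point to its nearest centroid; returns name -> centroid index. */
    public static Map<String, Integer> assign(List<Labeled> points, double[][] centroids) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Labeled p : points) {
            int best = 0;
            for (int c = 1; c < centroids.length; c++) {
                if (squaredDistance(p.values, centroids[c])
                        < squaredDistance(p.values, centroids[best])) {
                    best = c;
                }
            }
            // The id is read back from the wrapper, so input order never matters.
            result.put(p.name, best);
        }
        return result;
    }
}
```

Calling `LabeledClustering.assign(points, centroids)` with points named, say, "iris-1" and "iris-51" gives back a map keyed by those ids. Once the operation is a factorization rather than a nearest-centroid assignment, a single name per vector is no longer enough, which is the part I find hard.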

On Mon, Jul 11, 2011 at 8:58 PM, Lance Norskog <[email protected]> wrote:
> I'm finding it hard to maintain these labels across vector and matrix
> factorizations & direct operations.
>
> On Mon, Jul 11, 2011 at 1:10 AM, Gabor Makrai <[email protected]> wrote:
>> Thank you very much! NamedVector should solve my problem!
>> Anyway, I'm always amazed by the response speed on the Hadoop lists!
>>
>> Thank you,
>> Gabor
>>
>> On Mon, Jul 11, 2011 at 3:51 AM, Lance Norskog <[email protected]> wrote:
>>
>>> The NamedVector class adds a string to any vector, forwarding all
>>> methods to the wrapped vector. You can cluster these, and then pull
>>> the strings. The clustering algorithm operates on the wrapped vector.
>>>
>>> Lance
>>>
>>> On Sun, Jul 10, 2011 at 4:18 PM, Gabor Makrai <[email protected]> wrote:
>>> > Hi,
>>> >
>>> > I'm a little bit confused about Mahout's clustering algorithms. I would
>>> > like to cluster data that has an id column. How can I do that?
>>> > For example, I would like to run K-Means clustering on the Iris data set
>>> > (http://archive.ics.uci.edu/ml/datasets/Iris), where I've got four
>>> > numerical columns. I generated an id column to identify the records, and
>>> > when the clustering is done, I would like to see the results.
>>> > When I examined the code, I realized that I can create DenseVector
>>> > instances (with the four numerical columns, without the id) and write
>>> > those in VectorWritable format. These were my input data. After I managed
>>> > to run K-Means, I got IntWritable/WeightedVectorWritable key/value pairs,
>>> > where the keys tell me the clusterID. Is it possible to handle the ID
>>> > attribute somehow? Maybe the order of the output data is the same as the
>>> > input data? Can anyone confirm this?
>>> >
>>> > Thank you very much,
>>> > Gabor Makrai
>>>
>>> --
>>> Lance Norskog
>>> [email protected]
>
> --
> Lance Norskog
> [email protected]

--
Lance Norskog
[email protected]
