Transform your vector into a NamedVector.
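
In Mahout, a NamedVector wraps an existing Vector together with a name (for example `new NamedVector(vector, "some-id")`), so the name travels with the point through clustering. To show why that helps, here is a minimal sketch in plain Java with no Mahout classes: once every point carries a name, the (cluster id, point) records can be grouped into a map from cluster id to point names. `NamedPoint`, `groupByCluster`, and the sample data are hypothetical stand-ins made up for illustration, not Mahout API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class NamedPointsByCluster {

    /** Hypothetical stand-in for a named vector: a name plus its coordinates. */
    static final class NamedPoint {
        final String name;
        final double[] values;

        NamedPoint(String name, double[] values) {
            this.name = name;
            this.values = values;
        }
    }

    /**
     * Groups point names by cluster id.
     * clusterIds.get(i) is assumed to be the cluster assigned to points.get(i).
     */
    static Map<Integer, List<String>> groupByCluster(List<Integer> clusterIds,
                                                     List<NamedPoint> points) {
        if (clusterIds.size() != points.size()) {
            throw new IllegalArgumentException("one cluster id is required per point");
        }
        // TreeMap keeps the cluster ids sorted, as readPoints in ClusterDumper does.
        Map<Integer, List<String>> result = new TreeMap<>();
        for (int i = 0; i < points.size(); i++) {
            result.computeIfAbsent(clusterIds.get(i), id -> new ArrayList<>())
                  .add(points.get(i).name);
        }
        return result;
    }

    public static void main(String[] args) {
        List<NamedPoint> points = Arrays.asList(
                new NamedPoint("customer-a", new double[] {1.0, 1.0}),
                new NamedPoint("customer-b", new double[] {9.0, 8.0}),
                new NamedPoint("customer-c", new double[] {1.5, 0.5}));
        // Made-up assignments, standing in for the output of a clustering run.
        List<Integer> clusterIds = Arrays.asList(0, 1, 0);

        System.out.println(groupByCluster(clusterIds, points));
    }
}
```

With real Mahout output, the same grouping would be done over the clustered points read back from the sequence files, taking the name from each vector when it is a NamedVector.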

On 04-11-2011 08:02, WangRamon wrote:
> OK, me again. I checked the KMeansDriver code that writes out the point
> information; this is the code:
>
>     Map<Text, Text> props = new HashMap<Text, Text>();
>     props.put(new Text("distance"), new Text(String.valueOf(nearestDistance)));
>     context.write(new IntWritable(nearestCluster.getId()),
>         new WeightedPropertyVectorWritable(1, vector, props));
>
> It's good that the point (the vector) and the distance are written out, but
> in real business use we usually need something like a name to identify the
> point (name <--> vector/point), and this information is not written out. It
> would be much better if we could add it.
>
> Cheers,
> Ramon
>  > Subject: Re: How to find which point belongs which cluster after running 
> KMeansClusterer
>> From: [email protected]
>> Date: Thu, 3 Nov 2011 08:28:19 -0400
>> To: [email protected]
>>
>> There is code for this, it's in two places (on trunk, at least):
>>
>> 1. ClusterDumper:
>> public static Map<Integer, List<WeightedVectorWritable>> readPoints(
>>         Path pointsPathDir, Configuration conf) {
>>     Map<Integer, List<WeightedVectorWritable>> result =
>>         new TreeMap<Integer, List<WeightedVectorWritable>>();
>>     for (Pair<IntWritable, WeightedVectorWritable> record :
>>             new SequenceFileDirIterable<IntWritable, WeightedVectorWritable>(
>>                     pointsPathDir, PathType.LIST, PathFilters.logsCRCFilter(), conf)) {
>>       // key is the cluster id as an int, value is the weighted vector
>>       // assigned to that cluster; collect the vectors per cluster id
>>       //String clusterId = value.toString();
>>       int keyValue = record.getFirst().get();
>>       List<WeightedVectorWritable> pointList = result.get(keyValue);
>>       if (pointList == null) {
>>         pointList = Lists.newArrayList();
>>         result.put(keyValue, pointList);
>>       }
>>       pointList.add(record.getSecond());
>>     }
>>     return result;
>>   }
>>
>> 2. ClusterDumperWriter:
>> // look up the points by cluster id
>> List<WeightedVectorWritable> points = clusterIdToPoints.get(value.getId());
>>     if (points != null) {
>>       writer.write("\tWeight : [props - optional]:  Point:\n\t");
>>       for (Iterator<WeightedVectorWritable> iterator = points.iterator(); iterator.hasNext(); ) {
>>         WeightedVectorWritable point = iterator.next();
>>         writer.write(String.valueOf(point.getWeight()));
>>
>> On Nov 3, 2011, at 5:48 AM, WangRamon wrote:
>>
>>> Yes, Paritosh, it's a bit misleading for new users. I will start looking at
>>> KMeansDriver; thanks for your quick reply.
>>>> Date: Thu, 3 Nov 2011 15:02:28 +0530
>>>> From: [email protected]
>>>> To: [email protected]
>>>> Subject: Re: How to find which point belongs which cluster after running 
>>>> KMeansClusterer
>>>>
>>>> I also thought in the beginning that using KMeansClusterer and
>>>> ClusterDumper will help in getting all vectors belonging to a cluster,
>>>> but it did not help me a lot.
>>>>
>>>> I used KMeansDriver which I think is easy enough to use.
>>>>
>>>> After execution the records are written in the form
>>>> <cluster id><vector>
>>>>
>>>> "context.write(new Text(cluster.getIdentifier()), cluster);"
>>>>
>>>> So, what helped me was to process this into a map with the cluster id as
>>>> the key and a list of vectors as the value. I read the clustered points
>>>> and put them all into the map in that form. In the end, the list stored
>>>> against each cluster id was what I needed.
>>>>
>>>> Hope this helps.
>>>>
>>>> Regards,
>>>> Paritosh
>>>>
>>>> On 03-11-2011 14:23, WangRamon wrote:
>>>>>
>>>>>
>>>>> Hi all, I'm using KMeansClusterer. I will use KMeansDriver on a Hadoop
>>>>> environment later, but I think it is easier to understand things by
>>>>> starting with KMeansClusterer. The question is: I cannot find a way to
>>>>> determine which cluster a point belongs to after running
>>>>> KMeansClusterer. I expected the Cluster interface to have some API that
>>>>> returns all points/vectors belonging to a cluster, but... did I miss
>>>>> something? Thanks a lot.
>>>>> Cheers, Ramon
>> --------------------------------------------
>> Grant Ingersoll
>> http://www.lucidimagination.com
>>
>>
>>
