The key in the CSV is the clusterId (and not the named vector).

Here's the complete code snippet, which should make this clearer.

{Code}

    Cluster cluster = clusterWritable.getValue();
    line.append(cluster.getId());
    List<WeightedPropertyVectorWritable> points =
        getClusterIdToPoints().get(cluster.getId());
    if (points != null) {
      for (WeightedPropertyVectorWritable point : points) {
        Vector theVec = point.getVector();
        line.append(',');
        if (theVec instanceof NamedVector) {
          line.append(((NamedVector)theVec).getName());
        } else {
          String vecStr = theVec.asFormatString();
          //do some basic manipulations for display
          vecStr = VEC_PATTERN.matcher(vecStr).replaceAll("_");
          line.append(vecStr);
        }
      }
      getWriter().append(line).append("\n");
    }


{Code}

For each clusterId it prints the names of the Named Vectors in the cluster or 
the vector itself (if not a named vector).
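
To make the resulting line format concrete, here is a minimal standalone sketch of the same logic. It does not use Mahout: the Point class, the csvLine helper, and the regex assigned to VEC_PATTERN are stand-ins assumed for illustration, so the actual pattern in CSVClusterWriter may differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class CsvLineSketch {

  // Hypothetical stand-in for a clustered point: name is null when the
  // underlying vector is not a NamedVector.
  static final class Point {
    final String name;
    final String formatted; // what asFormatString() might return
    Point(String name, String formatted) {
      this.name = name;
      this.formatted = formatted;
    }
  }

  // Assumed pattern for illustration only; not copied from Mahout.
  static final Pattern VEC_PATTERN = Pattern.compile("[{}:,]");

  // Builds one CSV line: the cluster id first, then one field per point.
  static String csvLine(int clusterId, List<Point> points) {
    StringBuilder line = new StringBuilder();
    line.append(clusterId);
    for (Point point : points) {
      line.append(',');
      if (point.name != null) {
        // Named vector: only the name is written, not the values.
        line.append(point.name);
      } else {
        // Unnamed vector: write the formatted vector with separators replaced.
        line.append(VEC_PATTERN.matcher(point.formatted).replaceAll("_"));
      }
    }
    return line.toString();
  }

  public static void main(String[] args) {
    List<Point> points = Arrays.asList(
        new Point("doc1", "{0:1.0}"),
        new Point("doc2", "{1:3.0}"),
        new Point(null, "{0:1.0,1:2.0}"));
    System.out.println(csvLine(7, points)); // prints 7,doc1,doc2,_0_1.0_1_2.0_
  }
}
```

So if every point is a named vector, each line is just the clusterId followed by the names, which matches what you are seeing.
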
Hope that clarifies.







On Friday, February 21, 2014 2:13 AM, Bikash Gupta <bikash.gupt...@gmail.com> 
wrote:
 
Suneel,

I was going through the code of CSVClusterWriter and found that if the
vector is an instance of NamedVector, it writes only the key.

        if (theVec instanceof NamedVector) {
          line.append(((NamedVector)theVec).getName());
        } else {
          String vecStr = theVec.asFormatString();
          //do some basic manipulations for display
          vecStr = VEC_PATTERN.matcher(vecStr).replaceAll("_");
          line.append(vecStr);
        }

Hence I am getting only the key as the output of cluster dumper. Could you
explain the design assumption behind this?

On Wed, Feb 19, 2014 at 10:36 PM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:
> I am running cluster dumper.
>
> After extracting output from Cluster dump I am transposing the row to
> column, hence I have directly called this class from my java code.
>
> Code:
>
> ClusterDumper.main(new String[] {
>                 buildOption(DefaultOptionCreator.INPUT_OPTION),seqFileDir,
>                 buildOption(DefaultOptionCreator.OUTPUT_OPTION),outputFile,
>                 buildOption(ClusterDumper.OUTPUT_FORMAT_OPT),format,
>                 buildOption(ClusterDumper.POINTS_DIR_OPTION),pointsDir
>                 });
>
> I have attached the output too. Please note that the key of the sequence
> file is Text.class and it is separated using the "`" character. I have also
> attached the cluster metadata.
>
>
>
>
> On Wed, Feb 19, 2014 at 9:21 PM, Suneel Marthi <suneel_mar...@yahoo.com> 
> wrote:
>> Are you running clusterdump or seqdumper?
>>
>> Could you paste the commands that you ran and their respective outputs?
>>
>>
>>
>>
>>
>>
>>
>> On Wednesday, February 19, 2014 6:16 AM, Bikash Gupta 
>> <bikash.gupt...@gmail.com> wrote:
>>
>> Hi,
>>
>> After running the cluster dumper on Kmeans output I am getting only
>> Key of Sequence File.
>>
>> The options provided to cluster dumper are:
>>
>> -i <<cluster-*-final of Kmeans>> -o <<Output File>> -p
>> <<clusteredPoint>> -of CSV
>>
>> Is there something that I am missing?
>>
>> PN: I am using sequential mode.
>>
>> --
>> Regards
>> Bikash Gupta
>
>
>
> --
> Thanks & Regards
> Bikash Kumar Gupta



-- 
Thanks & Regards
Bikash Kumar Gupta
