Hello,

I'm using Mahout 0.8. My imports:

import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.mahout.clustering.Cluster;
import org.apache.mahout.clustering.canopy.CanopyDriver;
import org.apache.mahout.clustering.classify.WeightedVectorWritable;
import org.apache.mahout.clustering.kmeans.KMeansDriver;
import org.apache.mahout.common.Pair;
import org.apache.mahout.common.distance.DistanceMeasure;
import org.apache.mahout.common.distance.ManhattanDistanceMeasure;
import org.apache.mahout.common.distance.TanimotoDistanceMeasure;
import org.apache.mahout.common.iterator.sequencefile.PathFilters;
import org.apache.mahout.common.iterator.sequencefile.PathType;
import org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterable;
import org.apache.mahout.math.NamedVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;
import org.apache.mahout.utils.vectors.VectorHelper;
import org.apache.mahout.utils.vectors.lucene.Driver;




On 23/05/14 04:36, Aleksander Sadecki wrote:
Hi,

Thank you.

Which version of Apache Mahout are you using? Could you paste your
imports here? Thanks.


----- Original message -----
From: "Angel Luis Scull" <[email protected]>
To: [email protected]
Sent: Thursday, 22 May 2014 19:40:24
Subject: Re: How to list all vectors from a cluster

Hi,

This works for me:

...
Path path = new Path(workPath + kmeansClustersPath + "/clusteredPoints/part-m-0");
for (Pair<IntWritable, WeightedVectorWritable> record :
        new SequenceFileDirIterable<IntWritable, WeightedVectorWritable>(
                path, PathType.GLOB, PathFilters.logsCRCFilter(), conf)) {
    // Each record is (cluster ID, weighted vector).
    // The cast assumes the clustered vectors are NamedVectors.
    NamedVector vec = (NamedVector) record.getSecond().getVector();
    System.out.println(record.getFirst().get() + "  " + vec.getName());
}
...
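
To get the vectors of each cluster listed together, rather than one line per record, the (cluster ID, vector name) pairs printed by the loop above can be collected into a map first. Here is a minimal sketch in plain Java with no Mahout dependency. The class name ClusterGrouping, the method groupByCluster and the sample names are hypothetical; in the real loop the values would come from record.getFirst().get() and vec.getName().

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ClusterGrouping {

    // Groups (clusterId, vectorName) pairs by cluster ID.
    // Names keep their input order within a cluster; clusters are sorted by ID.
    static Map<Integer, List<String>> groupByCluster(List<Map.Entry<Integer, String>> assignments) {
        Map<Integer, List<String>> byCluster = new TreeMap<>();
        for (Map.Entry<Integer, String> a : assignments) {
            byCluster.computeIfAbsent(a.getKey(), k -> new ArrayList<>()).add(a.getValue());
        }
        return byCluster;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the records read from clusteredPoints.
        List<Map.Entry<Integer, String>> sample = new ArrayList<>();
        sample.add(new SimpleEntry<>(0, "doc-a"));
        sample.add(new SimpleEntry<>(1, "doc-b"));
        sample.add(new SimpleEntry<>(0, "doc-c"));

        for (Map.Entry<Integer, List<String>> e : groupByCluster(sample).entrySet()) {
            System.out.println("cluster " + e.getKey() + ": " + e.getValue());
        }
    }
}
```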


On 22/05/14 11:22, Aleksander Sadecki wrote:
Hi,

Thank you very much!

I am trying to implement a Java function with this class.

I wrote this piece of code:

    ClusterDumper dumper = new ClusterDumper(new Path(partMDir), new Path(seqDir));

    Map<Integer, List<WeightedPropertyVectorWritable>> dumped = dumper.getClusterIdToPoints();

    for (Integer numberOfList : dumped.keySet()) {
        List<WeightedPropertyVectorWritable> listWithVectors = dumped.get(numberOfList);

        for (WeightedPropertyVectorWritable vec : listWithVectors) {
            System.out.println(vec.getVector().toString());
        }
    }

When I run it, I get an exception.

The constructor takes two parameters:

ClusterDumper(seqFileDir, pointsDir), and I do not know which files I should
pass here...
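
As far as I can tell from how the clusterdump tool is normally invoked, both arguments are directories rather than individual files: seqFileDir is the directory holding the final clusters (clusters-*-final) and pointsDir is the clusteredPoints directory. Here is a minimal sketch in plain Java that picks those two directories out of a k-means output directory. The class and method names are hypothetical, and the layout is assumed to match the file list below.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ClusterOutputLayout {

    // Returns the first "clusters-*-final" directory under outputDir, or null if none exists.
    static Path findFinalClustersDir(Path outputDir) throws IOException {
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(outputDir, "clusters-*-final")) {
            for (Path dir : dirs) {
                if (Files.isDirectory(dir)) {
                    return dir;
                }
            }
        }
        return null;
    }

    // Returns outputDir/clusteredPoints if it exists as a directory, or null otherwise.
    static Path findPointsDir(Path outputDir) {
        Path points = outputDir.resolve("clusteredPoints");
        return Files.isDirectory(points) ? points : null;
    }

    public static void main(String[] args) throws IOException {
        // "output" is only a placeholder default; pass the real output directory as an argument.
        Path outputDir = Paths.get(args.length > 0 ? args[0] : "output");
        if (!Files.isDirectory(outputDir)) {
            System.out.println(outputDir + " is not a directory");
            return;
        }
        System.out.println("seqFileDir candidate: " + findFinalClustersDir(outputDir));
        System.out.println("pointsDir candidate:  " + findPointsDir(outputDir));
    }
}
```

The two results could then be wrapped in Hadoop Path objects (new Path(dir.toString())) and passed to the constructor in that order. This is my reading of the parameter names, not something confirmed in this thread.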

I have nine files:

                String s1 = root + "synthetic_control.data";
                String s2 = root + "synthetic_control.seq";
                String s3 = root + ".synthetic_control.seq.crc";
                String s4 = outputDir + "\\clusteredPoints\\part-m-0";
                String s5 = outputDir + "\\clusteredPoints\\.part-m-0.crc";
                String s6 = outputDir + "\\clusters-0-final\\_policy";
                String s7 = outputDir + "\\clusters-0-final\\part-r-00000";
                String s8 = outputDir + "\\clusters-0-final\\._policy.crc";
                String s9 = outputDir + "\\clusters-0-final\\.part-r-00000.crc";

                Path p1 = new Path(s1);
                Path p2 = new Path(s2);
                Path p3 = new Path(s3);
                Path p4 = new Path(s4);
                Path p5 = new Path(s5);
                Path p6 = new Path(s6);
                Path p7 = new Path(s7);
                Path p8 = new Path(s8);
                Path p9 = new Path(s9);

I tried to find which two I should use, but nothing works.

Some of them give me:

synthetic_control.data not a SequenceFile

Others give me:

org.apache.hadoop.io.LongWritable cannot be cast to
org.apache.hadoop.io.IntWritable

Sometimes there is no exception, but the output is empty.
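
Both messages seem to be about what is actually stored in the file being read. The first says the file does not start with a SequenceFile header. The second suggests that the key class recorded in the file differs from the one the reading code expects. One way to check each candidate file is to look at its header. Here is a minimal sketch in plain Java, assuming the usual header layout (the bytes 'SEQ', one version byte, then the key and value class names as length-prefixed strings) and assuming each class name is shorter than 128 bytes so that its length fits in a single byte. The class and method names are hypothetical.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class SeqFileHeader {

    // Returns {keyClassName, valueClassName}, or null if the stream does not start with "SEQ".
    static String[] readKeyValueClasses(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] magic = new byte[4]; // 'S', 'E', 'Q', then one version byte
        try {
            in.readFully(magic);
        } catch (EOFException e) {
            return null; // too short to contain a header
        }
        if (magic[0] != 'S' || magic[1] != 'E' || magic[2] != 'Q') {
            return null;
        }
        String keyClass = readShortString(in);
        String valueClass = readShortString(in);
        return new String[] { keyClass, valueClass };
    }

    // Reads a string whose byte length is stored in one leading byte.
    // Assumption of this sketch: the length is below 128, so no multi-byte prefix is needed.
    private static String readShortString(DataInputStream in) throws IOException {
        int length = in.readByte();
        if (length < 0) {
            throw new IOException("multi-byte length prefix is not handled in this sketch");
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Pass the candidate files as arguments, e.g. the part-m-0 and part-r-00000 files.
        for (String file : args) {
            try (InputStream in = Files.newInputStream(Paths.get(file))) {
                String[] classes = readKeyValueClasses(in);
                System.out.println(file + ": " + (classes == null
                        ? "not a SequenceFile"
                        : "key=" + classes[0] + ", value=" + classes[1]));
            }
        }
    }
}
```

A file that reports an IntWritable key would be a candidate for code that iterates over Pair<IntWritable, ...> records. One that reports LongWritable would likely trigger the cast error.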

Could you help me?

Thank you in advance
