Hello,
I'm using Mahout 0.8; these are my imports:
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.mahout.clustering.Cluster;
import org.apache.mahout.clustering.canopy.CanopyDriver;
import org.apache.mahout.clustering.classify.WeightedVectorWritable;
import org.apache.mahout.clustering.kmeans.KMeansDriver;
import org.apache.mahout.common.Pair;
import org.apache.mahout.common.distance.DistanceMeasure;
import org.apache.mahout.common.distance.ManhattanDistanceMeasure;
import org.apache.mahout.common.distance.TanimotoDistanceMeasure;
import org.apache.mahout.common.iterator.sequencefile.PathFilters;
import org.apache.mahout.common.iterator.sequencefile.PathType;
import org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterable;
import org.apache.mahout.math.NamedVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;
import org.apache.mahout.utils.vectors.VectorHelper;
import org.apache.mahout.utils.vectors.lucene.Driver;
On 23/05/14 04:36, Aleksander Sadecki wrote:
Hi,
Thank you.
Which version of Apache Mahout are you using? Could you paste your
imports here? Thanks
----- Original Message -----
From: "Angel Luis Scull" <[email protected]>
To: [email protected]
Sent: Thursday, 22 May 2014 19:40:24
Subject: Re: How to list all vectors from a cluster
Hi,
this works for me:
...
// Each record in clusteredPoints pairs a cluster id with one of its vectors.
Path path = new Path(workPath + kmeansClustersPath + "/clusteredPoints/part-m-0");
for (Pair<IntWritable, WeightedVectorWritable> record :
     new SequenceFileDirIterable<IntWritable, WeightedVectorWritable>(
         path, PathType.GLOB, PathFilters.logsCRCFilter(), conf)) {
  NamedVector vec = (NamedVector) record.getSecond().getVector();
  System.out.println(record.getFirst().get() + " " + vec.getName());
}
...
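If you need the vectors grouped per cluster id rather than just printed, the same iteration can collect them into a map. A minimal sketch along the same lines (same clusteredPoints layout and key/value classes as above; the class and method names here are only illustrative):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.mahout.clustering.classify.WeightedVectorWritable;
import org.apache.mahout.common.Pair;
import org.apache.mahout.common.iterator.sequencefile.PathFilters;
import org.apache.mahout.common.iterator.sequencefile.PathType;
import org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterable;
import org.apache.mahout.math.Vector;

public class PointsByCluster {

  // Collects every vector from the clusteredPoints output, keyed by cluster id.
  public static Map<Integer, List<Vector>> read(Path clusteredPoints, Configuration conf) {
    Map<Integer, List<Vector>> byCluster = new HashMap<Integer, List<Vector>>();
    for (Pair<IntWritable, WeightedVectorWritable> record :
         new SequenceFileDirIterable<IntWritable, WeightedVectorWritable>(
             clusteredPoints, PathType.GLOB, PathFilters.logsCRCFilter(), conf)) {
      Integer clusterId = record.getFirst().get();   // key = cluster id
      List<Vector> points = byCluster.get(clusterId);
      if (points == null) {
        points = new ArrayList<Vector>();
        byCluster.put(clusterId, points);
      }
      points.add(record.getSecond().getVector());    // value wraps the vector
    }
    return byCluster;
  }
}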
On 22/05/14 11:22, Aleksander Sadecki wrote:
Hi,
Thank you very much!
I am trying to implement a Java function with this class.
I wrote this piece of code:
ClusterDumper dumper = new ClusterDumper(new Path(partMDir), new Path(seqDir));
Map<Integer, List<WeightedPropertyVectorWritable>> dumped = dumper.getClusterIdToPoints();
for (Integer numberOfList : dumped.keySet()) {
  List<WeightedPropertyVectorWritable> listWithVectors = dumped.get(numberOfList);
  for (WeightedPropertyVectorWritable vec : listWithVectors) {
    System.out.println(vec.getVector().toString());
  }
}
When I run it, I get an exception. The constructor takes 2 parameters,
ClusterDumper(seqFileDir, pointsDir), and I do not know which files I should
pass here...
I have 9 files:
String s1 = root + "synthetic_control.data";
String s2 = root + "synthetic_control.seq";
String s3 = root + ".synthetic_control.seq.crc";
String s4 = outputDir + "\\clusteredPoints\\part-m-0";
String s5 = outputDir + "\\clusteredPoints\\.part-m-0.crc";
String s6 = outputDir + "\\clusters-0-final\\_policy";
String s7 = outputDir + "\\clusters-0-final\\part-r-00000";
String s8 = outputDir + "\\clusters-0-final\\._policy.crc";
String s9 = outputDir + "\\clusters-0-final\\.part-r-00000.crc";
Path p1 = new Path(s1);
Path p2 = new Path(s2);
Path p3 = new Path(s3);
Path p4 = new Path(s4);
Path p5 = new Path(s5);
Path p6 = new Path(s6);
Path p7 = new Path(s7);
Path p8 = new Path(s8);
Path p9 = new Path(s9);
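(One way to check which of these files are real sequence files, and which key and value classes they hold, is to open each with Hadoop's SequenceFile.Reader and print the classes. A plain-Hadoop sketch, not tested against this exact data; the cast error quoted below is what you see when the key class is not the one you iterate with:)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;

public class SeqFilePeek {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Fails with "... not a SequenceFile" if the file lacks the SEQ magic header.
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, new Path(args[0]), conf);
    try {
      System.out.println("key class:   " + reader.getKeyClassName());
      System.out.println("value class: " + reader.getValueClassName());
    } finally {
      reader.close();
    }
  }
}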
I tried to find out which 2 I should use, but nothing works.
Some of them give me:
synthetic_control.data not a SequenceFile
another one:
org.apache.hadoop.io.LongWritable cannot be cast to
org.apache.hadoop.io.IntWritable
or sometimes there is no exception but the output is empty.
Could you help me?
Thank you in advance
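For what it's worth: in ClusterDumper(seqFileDir, pointsDir), seqFileDir is normally the directory with the cluster part files and pointsDir the clusteredPoints directory, so from the listing above the pairing would presumably be clusters-0-final and clusteredPoints. A sketch under that assumption, not tested against this data (outputDir as in the listing above; WeightedPropertyVectorWritable assumed to live in the same classify package as WeightedVectorWritable):

import java.util.List;
import java.util.Map;

import org.apache.hadoop.fs.Path;
import org.apache.mahout.clustering.classify.WeightedPropertyVectorWritable;
import org.apache.mahout.utils.clustering.ClusterDumper;

public class DumpClusters {
  public static void main(String[] args) throws Exception {
    String outputDir = args[0];  // same outputDir as in the listing above
    // seqFileDir = the clusters directory, pointsDir = the clustered points.
    ClusterDumper dumper = new ClusterDumper(
        new Path(outputDir, "clusters-0-final"),
        new Path(outputDir, "clusteredPoints"));
    Map<Integer, List<WeightedPropertyVectorWritable>> byCluster =
        dumper.getClusterIdToPoints();
    for (Map.Entry<Integer, List<WeightedPropertyVectorWritable>> e : byCluster.entrySet()) {
      System.out.println("cluster " + e.getKey() + ": " + e.getValue().size() + " points");
    }
  }
}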