Run this code after the k-means clustering is done.
I have arranged the code so that you only need to call the process method,
passing it the path of the clusteredPoints directory inside the clustering
output path, the Hadoop FileSystem, and the Configuration.
Look for this comment inside the while loop:
//use clusterId and vector here to write to a local file.
At that line you have the clusterId and the vector for one point. Use
them to write to your file.
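import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.mahout.common.iterator.sequencefile.PathFilters;

public void process(Path clusteredPoints, FileSystem fileSystem,
    Configuration conf)
    throws IOException, InstantiationException, IllegalAccessException {
  FileStatus[] partFiles =
      getAllClusteredPointPartFiles(clusteredPoints, fileSystem);
  for (FileStatus partFile : partFiles) {
    SequenceFile.Reader clusteredPointsReader =
        new SequenceFile.Reader(fileSystem, partFile.getPath(), conf);
    try {
      // Create empty key and value objects for the reader to fill.
      WritableComparable clusterIdAsKey = (WritableComparable)
          clusteredPointsReader.getKeyClass().newInstance();
      Writable vector = (Writable)
          clusteredPointsReader.getValueClass().newInstance();
      while (clusteredPointsReader.next(clusterIdAsKey, vector)) {
        //use clusterId and vector here to write to a local file.
      }
    } finally {
      // Close the reader even if reading a part file fails.
      clusteredPointsReader.close();
    }
  }
}

// Returns the part-* files directly under the clusteredPoints directory.
private FileStatus[] getAllClusteredPointPartFiles(Path clusteredPoints,
    FileSystem fileSystem) throws IOException {
  return fileSystem.listStatus(clusteredPoints, PathFilters.partFilter());
}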
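
The write step inside the while loop could look roughly like the sketch below. ClusterPointFileWriter is a hypothetical helper of my own naming, not part of Mahout or Hadoop, and the tab-separated "clusterId, vector" line format is just an assumption; pick whatever format suits you.

```java
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of Mahout or Hadoop): writes one
// "clusterId<TAB>vector" line per clustered point to a local file.
class ClusterPointFileWriter implements Closeable {
    private final BufferedWriter out;

    ClusterPointFileWriter(Path localFile) throws IOException {
        // Creates the file, or truncates it if it already exists.
        this.out = Files.newBufferedWriter(localFile, StandardCharsets.UTF_8);
    }

    // Appends one line holding the text form of the key and the value.
    void write(String clusterId, String vector) throws IOException {
        out.write(clusterId);
        out.write('\t');
        out.write(vector);
        out.newLine();
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}
```

Inside the loop you would then call something like writer.write(clusterIdAsKey.toString(), vector.toString()), and close the writer once all part files are read. Keep the helper in its own source file, because java.nio.file.Path and Hadoop's Path share the same simple name.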
Paritosh
On 25-11-2011 12:27, Rachana wrote:
Hi Ranjan,
Thank you for your response, but as I am a newbie I am a bit confused.
Where should I include this code?
Or should I run this as a separate program?
Rachana.