Hi, I have a problem with a custom OutputFormat. It splits my seeds into
three kinds and writes each kind to its own output path. It works when
running on a single node or from the Eclipse plugin, but when I run it as
a jar on 5 nodes the output files are smaller than they should be. Here is
my code; any help will be appreciated.

public static class SepOutputFormat implements OutputFormat {

    public void checkOutputSpecs(FileSystem arg0, JobConf arg1)
            throws IOException {
    }

    public RecordWriter getRecordWriter(final FileSystem fs,
            final JobConf job, final String name, Progressable progress)
            throws IOException {
        return new RecordWriter() {
            String date = DateUtil.getString("yyyy-MM-dd-HH", new Date());
            Path newPath = new Path("data/seednew/" + date);
            Path lostPath = new Path("data/seedlost/" + date);
            Path basePath = new Path("data/seedbase/" + date);

            SequenceFile.Writer newwriter = new SequenceFile.Writer(fs,
                    job, newPath, Text.class, Seed.class);
            SequenceFile.Writer lostwriter = new SequenceFile.Writer(fs,
                    job, lostPath, Text.class, Seed.class);
            SequenceFile.Writer basewriter = new SequenceFile.Writer(fs,
                    job, basePath, Text.class, Seed.class);

            public void close(Reporter reporter) throws IOException {
                newwriter.close();
                basewriter.close();
                lostwriter.close();
            }

            public void write(Object key, Object value) throws IOException {
                String word = key.toString();
                if (word.startsWith("new_")) {
                    String skey = word.replace("new_", "");
                    newwriter.append(new Text(skey), (Seed) value);
                } else if (word.startsWith("lost_")) {
                    String skey = word.replace("lost_", "");
                    lostwriter.append(new Text(skey), (Seed) value);
                } else {
                    basewriter.append(new Text(word), (Seed) value);
                }
            }
        };
    }
}
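
A possible explanation, offered as a sketch and not a confirmed diagnosis:
the three paths above depend only on the date, so every reduce task opens
the same three files. With a single reducer (local mode) that is harmless,
but on a cluster with several reducers each task may recreate the same
file, and only one task's records would survive, which would match the
smaller-than-expected sizes. The `name` argument passed to
getRecordWriter is unique per task output (for example part-00000), so
one option is to treat the dated path as a directory and put `name` under
it. The helper class below is hypothetical and uses plain strings instead
of Hadoop types so it can run on its own; `SeedRouting`, `taskPath` and
`route` are illustrative names, not part of Hadoop. It also strips only
the leading prefix, whereas `word.replace("new_", "")` removes every
occurrence inside the key.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: shows per-task path naming and
// prefix-based routing with plain strings.
class SeedRouting {

    // Builds a file path that is unique per task by appending the task's
    // output name (the `name` argument of getRecordWriter) to the dated
    // directory. Assumes `name` looks like "part-00000".
    static String taskPath(String kind, String date, String name) {
        return "data/" + kind + "/" + date + "/" + name;
    }

    // Maps a key to (output kind, key without its routing prefix).
    // Only the leading prefix is removed; later occurrences are kept.
    static Map.Entry<String, String> route(String word) {
        if (word.startsWith("new_")) {
            return new SimpleEntry<>("seednew", word.substring("new_".length()));
        }
        if (word.startsWith("lost_")) {
            return new SimpleEntry<>("seedlost", word.substring("lost_".length()));
        }
        return new SimpleEntry<>("seedbase", word);
    }

    public static void main(String[] args) {
        String date = "2009-01-01-00"; // example value only
        // Two reduce tasks now get two distinct files instead of one shared path.
        for (String name : new String[] {"part-00000", "part-00001"}) {
            System.out.println(taskPath("seednew", date, name));
        }
        Map.Entry<String, String> r = route("new_abc_new_x");
        System.out.println(r.getKey() + " " + r.getValue());
    }
}
```

In the OutputFormat itself this would correspond to something like
new Path("data/seednew/" + date, name) for each writer. Writing straight
to the final location also bypasses the per-task work directory that
FileOutputFormat normally provides, so speculative or retried tasks could
still collide; that part is untested here.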