Hi,
I wrote a simple program to gather some statistics about bigrams in some
data.
I print the statistics to a custom file, which I open like this:
Path file = new Path(context.getConfiguration().get("mapred.output.dir") + "/bigram.txt");
FSDataOutputStream out = file.getFileSystem(context.getConfiguration()).create(file);
My code then has the following lines:
Text.writeString(out, "total number of unique bigrams: " + uniqBigramCount + "\n");
Text.writeString(out, "total number of bigrams: " + totalBigramCount + "\n");
Text.writeString(out, "number of bigrams that appear only once: " + onceBigramCount + "\n");
I get the following output:
'total number of unique bigrams: 424462
!total number of bigrams: 1578220
0number of bigrams that appear only once: 296139
Apart from the unwanted characters at the beginning of each line, there are
some non-printing characters too. What could be the reason for this?
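One observation that may help narrow this down: the stray leading characters (', ! and 0) have character codes 39, 33 and 48, which look like they could be the byte lengths of the three strings. Below is a minimal sketch to check that, using only the standard library and the counts from the output above. The class name PrefixCheck and the helper lengthAsChar are hypothetical names made up for this check; they are not part of Hadoop.

```java
import java.nio.charset.StandardCharsets;

public class PrefixCheck {
    // Returns the character whose code equals the UTF-8 byte length of s.
    // Hypothetical helper, only for comparing against the stray characters.
    static char lengthAsChar(String s) {
        return (char) s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // The three strings as they would be built with the counts from my output.
        String[] lines = {
            "total number of unique bigrams: 424462\n",
            "total number of bigrams: 1578220\n",
            "number of bigrams that appear only once: 296139\n"
        };
        for (String line : lines) {
            int len = line.getBytes(StandardCharsets.UTF_8).length;
            // Print the byte length next to the character with that code.
            System.out.println(len + " -> " + lengthAsChar(line));
        }
    }
}
```

If the printed characters match the ones at the start of each output line, it would suggest that Text.writeString writes a length prefix before the string bytes, in which case writing the raw bytes directly (for example with out.write(...)) might avoid it. I have not confirmed this.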
Thanks.