SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf,
        new Path(inputDir, "documents.seq"), Text.class, Text.class);

for (int i = 0; i < s.length; i++) {
    writer.append(new Text(s[i][0]), new Text(s[i][1]));
}
writer.close();

Here the key, Text(s[i][0]), is a string value holding the ID of a news
article, and the value, Text(s[i][1]), is the news article text. I have
clustered some 100+ news articles this way, and I get the output in
clusteredPoints/part-m-00000. My question: is it possible to extract
the article ID (i.e. the Text(s[i][0]) key I appended) and the
corresponding cluster ID from the part-m-00000 file?

Does anyone know?

-- 
Thank you!
Sarath Ramachandran
[email protected]
+919995024287
