Does LDAPrintTopics print the *document*-topic probabilities, or just
the *term*-topic probabilities?  I thought only the latter, because I was
too lazy (sorry!) to update it to print the former as well when I added
docTopics to the LDA output.

On Thu, Jul 7, 2011 at 8:24 PM, Jeff Eastman <[email protected]> wrote:

> I think you want LDAPrintTopics?
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Dhruv Kumar
> Sent: Thursday, July 07, 2011 11:29 AM
> To: [email protected]
> Subject: Re: how to transfer the sequence file into readable format
>
> Sequence files store key/value pairs in a binary, optionally compressed
> format. To read a sequence file and display the keys and values in a
> human-readable format, you can use SequenceFile.Reader:
>
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.Reader.html
>
> I don't know the outputs of LDA, but in general you can do the following,
> assuming key is IntWritable and value is DoubleWritable.
>
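> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.io.DoubleWritable;
> import org.apache.hadoop.io.IntWritable;
> import org.apache.hadoop.io.SequenceFile;
>
> Configuration conf = new Configuration();
> FileSystem fs = FileSystem.get(conf);
> SequenceFile.Reader reader = new SequenceFile.Reader(fs,
>     new Path("/path/to/output/of/LDA"), conf);
> IntWritable key = new IntWritable();
> DoubleWritable value = new DoubleWritable();
>
> // next() fills key and value in place and returns false at end of file.
> while (reader.next(key, value)) {
>   System.out.println(key + "\t" + value);
> }
> reader.close();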
>
>
> There may be a convenient command line utility for LDA also which someone
> else can point out. However, you can always write your own simple class as
> shown above for reading any Sequence File.
>
> On Thu, Jul 7, 2011 at 1:53 PM, wine lover <[email protected]> wrote:
>
> > Dear All,
> >
> > After running LDA analysis, I got the docTopic file, which is a regular
> > sequence file. How can I convert it into a readable format? I searched for
> > vectordumper, or vectordump, but did not get any useful results, such as
> > how to use it on the command line. Thanks.
> >
>
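For anyone who wants to see what "binary key/value pairs" versus "readable format" means without a Hadoop install, here is a minimal toy sketch using only java.io. It is NOT the SequenceFile format (no header, sync markers, or compression). It only mirrors the read loop quoted above, under the same assumption of an int key and a double value. The class and method names (BinaryPairDump, encode, dump) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: raw (int, double) records, not a real SequenceFile.
public class BinaryPairDump {

    // Write each (key, value) pair back to back: 4 bytes for the int, 8 for the double.
    static byte[] encode(int[] keys, double[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int i = 0; i < keys.length; i++) {
                out.writeInt(keys[i]);
                out.writeDouble(values[i]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Read records until the input is exhausted and render each as "key<TAB>value".
    static List<String> dump(byte[] data) {
        List<String> lines = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                int key = in.readInt();
                double value = in.readDouble();
                lines.add(key + "\t" + value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: three keys with made-up values.
        byte[] data = encode(new int[] {0, 1, 2}, new double[] {0.25, 0.5, 0.25});
        for (String line : dump(data)) {
            System.out.println(line);
        }
    }
}
```

The loop in dump() has the same shape as the reader.next(key, value) loop: pull one record at a time, then turn it into text. With a real sequence file the key and value classes are recorded in the file itself, so the types have to match whatever the job actually wrote.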
