Re: [jira] [Commented] (MAHOUT-458) The LDA output does not include the topic-probability distribution per document (p(z|d)). It outputs only the topics and corresponding words.

Jake Mannix Wed, 27 Apr 2011 15:47:32 -0700

Weird. OK, cloned, and now we can track MAHOUT-682 instead of this
one.

  -jake


On Wed, Apr 27, 2011 at 3:19 PM, Dmitriy Lyubimov <[email protected]> wrote:

> I think the state automaton definition for an issue is an admin-level
> function in JIRA. Apache JIRA admins may have defined it the way that
> "closed" is where it all hits the floor with no recourse. I am
> guessing you still can clone it as another issue.
>
> On Wed, Apr 27, 2011 at 3:05 PM, Sean Owen (JIRA) <[email protected]> wrote:
> >
> >    [https://issues.apache.org/jira/browse/MAHOUT-458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026027#comment-13026027]
> >
> > Sean Owen commented on MAHOUT-458:
> > ----------------------------------
> >
> > Hm, I can't reopen either. Surely JIRA allows it? Or is it telling us to
> > file a new ticket? Do what you gotta do to make sure it's on the radar and
> > gets done to your satisfaction.
> >
> >> The LDA output does not include the topic-probability distribution per
> >> document (p(z|d)). It outputs only the topics and corresponding words.
> >> ----------------------------------------------------------------------
> >>
> >>                 Key: MAHOUT-458
> >>                 URL: https://issues.apache.org/jira/browse/MAHOUT-458
> >>             Project: Mahout
> >>          Issue Type: Improvement
> >>          Components: Clustering
> >>    Affects Versions: 0.4
> >>            Reporter: Himanshu Gahlot
> >>            Assignee: Jake Mannix
> >>             Fix For: 0.6
> >>
> >>         Attachments: MAHOUT-458.patch, MAHOUT-458.patch
> >>
> >>
> >> The current implementation of LDA outputs only topics and their words.
> >> Many applications need the p(z|d) values of a document so that this vector
> >> can serve as a reduced representation of the document (dimensionality
> >> reduction). We need to introduce a new key that keeps track of the gamma
> >> values for each document (as obtained from the document.infer() method)
> >> and writes them to the output stream; finally, PrintLDATopics should
> >> output these values per document id. Outputting the probabilities of
> >> words in a topic would also provide more meaningful output.
> >
> > --
> > This message is automatically generated by JIRA.
> > For more information on JIRA, see: http://www.atlassian.com/software/jira
> >
>
