The highest-ranking topic for each doc is just
np.argmax(nmf.transform(tfidf), axis=1).

This is because nmf.transform(tfidf)
<http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html#sklearn.decomposition.NMF.transform>
returns a matrix of shape (n_samples, n_components), i.e. one row per
sample and one column per topic, scoring each topic per sample. An argmax
over axis 1 gives the index of the highest-scoring topic for each sample.
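
To list the documents under each topic, something along these lines should
work. This is only a rough sketch: the toy corpus, the choice of two topics,
and the names docs, doc_topic, top_topic and docs_by_topic are made up for
illustration, and the NMF settings (init="nndsvd", max_iter=500) are just
values I assume are reasonable for a tiny example.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus standing in for the tweets.
docs = [
    "the cat sat on the mat",
    "cats and dogs are pets",
    "dogs chase cats",
    "stock markets fell today",
    "investors sold stock as markets dropped",
    "the market rallied and stock prices rose",
]
n_topics = 2  # assumed number of topics for this sketch

# Build the tf-idf matrix and fit NMF on it.
tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)
nmf = NMF(n_components=n_topics, init="nndsvd", random_state=0, max_iter=500)
nmf.fit(tfidf)

# One row per document, one column per topic.
doc_topic = nmf.transform(tfidf)

# Index of the highest-scoring topic for each document.
top_topic = np.argmax(doc_topic, axis=1)

# Group the documents by their highest-scoring topic.
docs_by_topic = {k: [] for k in range(n_topics)}
for doc, k in zip(docs, top_topic):
    docs_by_topic[int(k)].append(doc)

for k, members in docs_by_topic.items():
    print("Topic %d:" % k)
    for doc in members:
        print("   ", doc)
```

Note that a document whose scores are all zero still gets assigned topic 0
by argmax, so you may want to check doc_topic.max(axis=1) before trusting
the assignment.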

On 29 April 2015 at 11:44, C K Kashyap <ckkash...@gmail.com> wrote:

> Thanks Joel and Andreas,
>
> Joel, I think "highest ranking topic for each doc" is exactly what I am
> looking for. Could you elaborate on the code please?
>
> What would be dataset.target_names and dataset.target in my case -
> http://lpaste.net/131649
>
> Regards,
> Kashyap
>
> On Wed, Apr 29, 2015 at 3:08 AM, Joel Nothman <joel.noth...@gmail.com>
> wrote:
>
>> This shows the newsgroup name and highest scoring topic for each doc.
>>
>> zip(np.take(dataset.target_names, dataset.target),
>> np.argmax(nmf.transform(tfidf), axis=1))
>>
>> I think something based on this should be added to the example.
>>
>> On 29 April 2015 at 07:01, Andreas Mueller <t3k...@gmail.com> wrote:
>>
>>>  Clusters are one per data point, while topics are not. So the model is
>>> slightly different.
>>> You can get the list of topics for each sample using
>>> NMF().fit_transform(X).
>>>
>>>
>>> On 04/28/2015 01:13 PM, C K Kashyap wrote:
>>>
>>> Hi everyone,
>>> I am new to scikit. I only feel sad for not knowing it earlier - it's
>>> awesome.
>>>
>>>  I am trying to do the following. Extract topics from a bunch of
>>> tweets. I tried NMF (from the sample here -
>>> http://scikit-learn.org/stable/auto_examples/applications/topics_extraction_with_nmf.html)
>>> but I was not able to figure out how to list documents corresponding to the
>>> extracted topics. Could someone please point me to an example that lists
>>> the documents under each topic?
>>>
>>> When I got stuck with NMF, I thought of using k-means (mini-batch). I am
>>> just wondering though if clustering is a reasonable approach for "topics".
>>>
>>>  I'd really appreciate any advice here.
>>>
>>>  Thanks,
>>> Kashyap
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> One dashboard for servers and applications across Physical-Virtual-Cloud
>>> Widest out-of-the-box monitoring support with 50+ applications
>>> Performance metrics, stats and reports that give you Actionable Insights
>>> Deep dive visibility with transaction tracing using APM Insight.
>>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>>>
>>>
>>>
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> Scikit-learn-general@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>
>>
>>
>
>
