Thank you so much Joel,
I understood. Just one more thing please.
How can I list a document under its highest-ranking topic only if that
topic's score crosses a threshold?
regards,
Kashyap
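
A minimal sketch of one way to do this, building on the np.argmax approach
in Joel's reply quoted below. The matrix W is a made-up stand-in for
nmf.transform(tfidf), and the helper names topics_above_threshold and
docs_by_topic, the -1 marker for "no topic", and the 0.5 cutoff are all
assumptions chosen for illustration, not part of scikit-learn.

```python
import numpy as np


def topics_above_threshold(doc_topic, threshold):
    """Return each document's top topic index, or -1 if its best score
    is below the threshold."""
    doc_topic = np.asarray(doc_topic, dtype=float)
    top_topic = np.argmax(doc_topic, axis=1)  # highest-scoring topic per row
    top_score = doc_topic.max(axis=1)         # the score of that topic
    return np.where(top_score >= threshold, top_topic, -1)


def docs_by_topic(assignments):
    """Group document indices under their assigned topic, skipping -1."""
    groups = {}
    for doc_idx, topic in enumerate(assignments):
        if topic != -1:
            groups.setdefault(int(topic), []).append(doc_idx)
    return groups


# Hypothetical stand-in for nmf.transform(tfidf): 4 documents x 3 topics.
W = np.array([
    [0.90, 0.05, 0.00],
    [0.10, 0.12, 0.08],  # no topic reaches the cutoff, so this doc is dropped
    [0.00, 0.20, 0.70],
    [0.60, 0.55, 0.10],
])

assigned = topics_above_threshold(W, threshold=0.5)
print(assigned)
print(docs_by_topic(assigned))
```

NMF scores are not probabilities, so a sensible threshold depends on the
data; normalizing each row to sum to 1 before thresholding may make the
cutoff easier to choose.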
On Wed, Apr 29, 2015 at 9:45 AM, Joel Nothman <joel.noth...@gmail.com>
wrote:
> Highest ranking topic for each doc is just np.argmax(nmf.transform(tfidf),
> axis=1).
>
> This is because nmf.transform
> <http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html#sklearn.decomposition.NMF.transform>(tfidf)
> returns a matrix of shape (num samples, num components / topics) scoring
> each topic per sample. An argmax over axis 1 indicates the highest scoring
> topic per sample.
>
> On 29 April 2015 at 11:44, C K Kashyap <ckkash...@gmail.com> wrote:
>
>> Thanks Joel and Andreas,
>>
>> Joel, I think "highest ranking topic for each doc" is exactly what I am
>> looking for. Could you elaborate on the code please?
>>
>> What would be dataset.target_names and dataset.target in my case -
>> http://lpaste.net/131649
>>
>> Regards,
>> Kashyap
>>
>> On Wed, Apr 29, 2015 at 3:08 AM, Joel Nothman <joel.noth...@gmail.com>
>> wrote:
>>
>>> This shows the newsgroup name and highest scoring topic for each doc.
>>>
>>> zip(np.take(dataset.target_names, dataset.target),
>>> np.argmax(nmf.transform(tfidf), axis=1))
>>>
>>> I think something based on this should be added to the example.
>>>
>>> On 29 April 2015 at 07:01, Andreas Mueller <t3k...@gmail.com> wrote:
>>>
>>>> Clusters are one per data point, while topics are not. So the model is
>>>> slightly different.
>>>> You can get the list of topics for each sample using
>>>> NMF().fit_transform(X).
>>>>
>>>>
>>>> On 04/28/2015 01:13 PM, C K Kashyap wrote:
>>>>
>>>> Hi everyone,
>>>> I am new to scikit. I only feel sad for not knowing it earlier - it's
>>>> awesome.
>>>>
>>>> I am trying to do the following. Extract topics from a bunch of
>>>> tweets. I tried NMF (from the sample here -
>>>> http://scikit-learn.org/stable/auto_examples/applications/topics_extraction_with_nmf.html)
>>>> but I was not able to figure out how to list documents corresponding to the
>>>> extracted topics. Could someone please point me to an example that lists
>>>> the documents under each topic?
>>>>
>>>> When I got stuck with NMF, I thought of using k-means (mini-batch). I
>>>> am just wondering though if clustering is a reasonable approach for
>>>> "topics".
>>>>
>>>> I'd really appreciate any advice here.
>>>>
>>>> Thanks,
>>>> Kashyap
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> One dashboard for servers and applications across Physical-Virtual-Cloud
>>>> Widest out-of-the-box monitoring support with 50+ applications
>>>> Performance metrics, stats and reports that give you Actionable Insights
>>>> Deep dive visibility with transaction tracing using APM Insight.
>>>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Scikit-learn-general mailing list
>>>> Scikit-learn-general@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
>