Thanks Joel,

What about estimating the number of topics? Is there a recommended way to
do it?

Regards,
Kashyap

On Wed, Apr 29, 2015 at 12:25 PM, Joel Nothman <joel.noth...@gmail.com>
wrote:

> Yes, that is okay. NMF is not a probabilistic method, so the scores are
> not bounded by 1.
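If scores on a comparable scale are wanted, one option (not suggested in the thread, so treat it as an assumption) is to rescale each row to sum to 1. The sketch below uses a made-up matrix named doc_topic in place of the real nmf.transform(tfidf) output. The result is a heuristic share of each topic per document, not a probability.

```python
import numpy as np

# Hypothetical stand-in for nmf.transform(tfidf): 3 documents x 3 topics.
# NMF scores are non-negative but can exceed 1.0.
doc_topic = np.array([
    [0.9, 0.3, 0.0],
    [0.2, 0.0, 1.8],
    [0.0, 0.0, 0.0],  # a document that matches no topic
])

# Sum of scores per document, kept as a column for broadcasting.
row_sums = doc_topic.sum(axis=1, keepdims=True)

# Divide each row by its sum; rows that sum to zero stay all-zero.
proportions = np.divide(
    doc_topic,
    row_sums,
    out=np.zeros_like(doc_topic),
    where=row_sums > 0,
)
```

Rescaling rows does not change which topic wins the argmax, but it does change what a fixed threshold means, so pick the threshold after deciding on the scale.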
>
> On 29 April 2015 at 14:56, C K Kashyap <ckkash...@gmail.com> wrote:
>
>> Works like a charm. I just noticed, though, that the max value is sometimes
>> more than 1.0. Is that okay?
>>
>> Regards,
>> Kashyap
>>
>> On Wed, Apr 29, 2015 at 10:12 AM, Joel Nothman <joel.noth...@gmail.com>
>> wrote:
>>
>>> mask with np.max(..., axis=1) > threshold
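A minimal sketch of that mask, using a made-up matrix named doc_topic in place of the real nmf.transform(tfidf) output. The threshold value and the use of -1 for "no topic" are arbitrary choices for illustration.

```python
import numpy as np

# Hypothetical stand-in for nmf.transform(tfidf): 4 documents x 3 topics.
doc_topic = np.array([
    [0.90, 0.10, 0.00],
    [0.20, 0.00, 1.30],
    [0.05, 0.10, 0.08],  # no topic scores highly for this document
    [0.10, 0.10, 0.40],
])
threshold = 0.3  # arbitrary cut-off; tune it on real data

# Highest scoring topic per document, and whether that score is high enough.
top_topic = np.argmax(doc_topic, axis=1)
keep = np.max(doc_topic, axis=1) > threshold

# Keep the top topic where the mask holds; mark the rest with -1.
assigned = np.where(keep, top_topic, -1)
```

Here the third document ends up with -1 because its best score, 0.10, is below the threshold.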
>>>
>>> On 29 April 2015 at 14:35, C K Kashyap <ckkash...@gmail.com> wrote:
>>>
>>>> Thank you so much Joel,
>>>>
>>>> I understood. Just one more thing please.
>>>>
>>>> How can I include a document against its highest ranking topic only if
>>>> its score crosses a threshold?
>>>>
>>>> regards,
>>>> Kashyap
>>>>
>>>> On Wed, Apr 29, 2015 at 9:45 AM, Joel Nothman <joel.noth...@gmail.com>
>>>> wrote:
>>>>
>>>>> Highest ranking topic for each doc is just
>>>>> np.argmax(nmf.transform(tfidf), axis=1).
>>>>>
>>>>> This is because nmf.transform(tfidf)
>>>>> <http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html#sklearn.decomposition.NMF.transform>
>>>>> returns a matrix of shape (num samples, num components / topics) scoring
>>>>> each topic per sample. An argmax over axis 1 indicates the highest scoring
>>>>> topic per sample.
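A minimal sketch of the argmax step, which also groups documents under their top topic as asked at the start of the thread. The matrix doc_topic is made up and stands in for the real nmf.transform(tfidf) output; docs_by_topic is a name chosen for illustration.

```python
import numpy as np

# Hypothetical stand-in for nmf.transform(tfidf):
# 4 documents (rows) scored against 3 topics (columns).
doc_topic = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.0, 1.3],
    [0.0, 0.6, 0.5],
    [0.1, 0.1, 0.4],
])

# Column index of the highest score in each row, i.e. the top topic per document.
top_topic = np.argmax(doc_topic, axis=1)

# Map each topic index to the list of document indices assigned to it.
docs_by_topic = {
    topic: np.flatnonzero(top_topic == topic).tolist()
    for topic in range(doc_topic.shape[1])
}
```

The document indices in docs_by_topic can then be used to look up the original texts in whatever list was passed to the vectorizer.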
>>>>>
>>>>> On 29 April 2015 at 11:44, C K Kashyap <ckkash...@gmail.com> wrote:
>>>>>
>>>>>> Thanks Joel and Andreas,
>>>>>>
>>>>>> Joel, I think "highest ranking topic for each doc" is exactly what I
>>>>>> am looking for. Could you elaborate on the code please?
>>>>>>
>>>>>> What would be dataset.target_names and dataset.target in my case -
>>>>>> http://lpaste.net/131649
>>>>>>
>>>>>> Regards,
>>>>>> Kashyap
>>>>>>
>>>>>> On Wed, Apr 29, 2015 at 3:08 AM, Joel Nothman <joel.noth...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> This shows the newsgroup name and highest scoring topic for each doc.
>>>>>>>
>>>>>>> zip(np.take(dataset.target_names, dataset.target),
>>>>>>> np.argmax(nmf.transform(tfidf), axis=1))
>>>>>>>
>>>>>>> I think something based on this should be added to the example.
>>>>>>>
>>>>>>> On 29 April 2015 at 07:01, Andreas Mueller <t3k...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Clustering assigns each data point to exactly one cluster, while a
>>>>>>>> document can score on several topics, so the model is slightly different.
>>>>>>>> You can get the topic scores for each sample using
>>>>>>>> NMF().fit_transform(X).
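A small end-to-end sketch of that call, assuming scikit-learn is installed. The four toy documents and the parameter choices (two topics, "nndsvd" initialisation, fixed random_state) are made up for illustration and are not taken from the linked example.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus standing in for the tweets.
docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stocks fell as markets closed",
    "investors sold stocks and bonds",
]

# Turn the texts into a sparse TF-IDF matrix of shape (n_documents, n_terms).
tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)

# Factorise into 2 topics; fit_transform returns the per-document topic scores.
nmf = NMF(n_components=2, init="nndsvd", random_state=0, max_iter=500)
doc_topic = nmf.fit_transform(tfidf)

# doc_topic has one row per document and one column per topic.
# Every entry is non-negative, and a document may score on several topics.
```

Each row of doc_topic can hold several non-zero scores, which is the difference from a clustering that gives each document a single label.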
>>>>>>>>
>>>>>>>>
>>>>>>>> On 04/28/2015 01:13 PM, C K Kashyap wrote:
>>>>>>>>
>>>>>>>> Hi everyone,
>>>>>>>> I am new to scikit. I only feel sad for not knowing it earlier -
>>>>>>>> it's awesome.
>>>>>>>>
>>>>>>>> I am trying to do the following. Extract topics from a bunch of
>>>>>>>> tweets. I tried NMF (from the sample here -
>>>>>>>> http://scikit-learn.org/stable/auto_examples/applications/topics_extraction_with_nmf.html)
>>>>>>>> but I was not able to figure out how to list documents corresponding
>>>>>>>> to the extracted topics. Could someone please point me to an example that
>>>>>>>> lists the documents under each topic?
>>>>>>>>
>>>>>>>> When I got stuck with NMF, I thought of using k-means (mini-batch).
>>>>>>>> I am just wondering, though, whether clustering is a reasonable approach
>>>>>>>> for "topics".
>>>>>>>>
>>>>>>>>  I'd really appreciate any advice here.
>>>>>>>>
>>>>>>>>  Thanks,
>>>>>>>> Kashyap
>>>>>>>>
>>>>>>>>
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> One dashboard for servers and applications across
>>>>>>>> Physical-Virtual-Cloud
>>>>>>>> Widest out-of-the-box monitoring support with 50+ applications
>>>>>>>> Performance metrics, stats and reports that give you Actionable
>>>>>>>> Insights
>>>>>>>> Deep dive visibility with transaction tracing using APM Insight.
>>>>>>>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>>>>>>>> _______________________________________________
>>>>>>>> Scikit-learn-general mailing list
>>>>>>>> Scikit-learn-general@lists.sourceforge.net
>>>>>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general