[scikit-learn] Agglomerative clustering

2017-06-29 Thread Ariani A
I have some data and also the pairwise distance matrix of these data
points. I want to cluster them using agglomerative clustering. I read that
in sklearn we can pass 'precomputed' as the affinity, and I expect the input
is then the distance matrix. But I could not find any example that uses a
precomputed affinity with a custom distance matrix.
Any help will be highly appreciated.
Best,
-Noushin
___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
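
For readers looking for the missing example, here is a minimal sketch of
passing a precomputed distance matrix to AgglomerativeClustering. The toy
data and variable names are made up for illustration. The parameter was
called `affinity` when this thread was written and is called `metric` in
newer scikit-learn releases, so the sketch checks which name the installed
version accepts.

```python
import inspect

import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Toy data (hypothetical): six points on a line in two well-separated groups.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])

# Square, symmetric matrix of pairwise distances with zeros on the diagonal.
D = np.abs(x[:, None] - x[None, :])

# Older releases name the parameter `affinity`, newer ones `metric`.
params = inspect.signature(AgglomerativeClustering).parameters
key = "metric" if "metric" in params else "affinity"

# Ward linkage needs raw Euclidean features, so use average linkage here.
model = AgglomerativeClustering(
    n_clusters=2, linkage="average", **{key: "precomputed"}
)

# With a precomputed metric, fit on the distance matrix itself.
labels = model.fit_predict(D)
print(labels)
```

Note that this still requires n_clusters up front; complete or single
linkage can be substituted for average, but not ward.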


[scikit-learn] Agglomerative Clustering without knowing number of clusters

2017-06-30 Thread Ariani A
I want to perform agglomerative clustering, but I do not know the number of
clusters beforehand. I do want every cluster to have at least 40 data
points. How can I do this with sklearn's agglomerative clustering?
Should I use a dendrogram and cut it somehow? I have no idea how to relate
the dendrogram to this or how to cut it. Any help will be appreciated!
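
On the question of how a dendrogram relates to a set of clusters: a small
sketch, using scipy rather than sklearn, of what "cutting" means. The data,
the average linkage method, and the three cut heights are assumptions
chosen only for illustration. Each cut height yields a flat clustering, and
a higher cut gives fewer, larger clusters.

```python
import numpy as np
from scipy.cluster import hierarchy

rng = np.random.default_rng(0)

# Hypothetical toy data: two blobs of 50 points each in 2-D.
points = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.3, size=(50, 2)),
])

# Each row of Z records one merge: the two clusters joined, the height
# (distance) at which they were joined, and the size of the new cluster.
Z = hierarchy.linkage(points, method="average")

# Cutting at a height keeps only the merges that happened below it.
n_clusters_at = {}
for height in (0.5, 3.0, 10.0):
    labels = hierarchy.fcluster(Z, t=height, criterion="distance")
    sizes = np.bincount(labels)[1:]  # fcluster labels start at 1
    n_clusters_at[height] = len(sizes)
    print(f"cut at {height}: {len(sizes)} clusters, smallest has {sizes.min()}")
```

The cluster sizes at each cut are what a minimum-size requirement would be
checked against.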


Re: [scikit-learn] Agglomerative Clustering without knowing number of clusters

2017-07-06 Thread Ariani A
Dear Shane,
Thanks for your time. But I have to implement it with agglomerative
clustering and cut the tree so that each cluster has at least 40 data
points. I am not sure how to cut it. I was guessing it can be done by
cutting the dendrogram. Is that correct? If so, I do not know how to apply
it. Could you give me a pointer?
Best,
Ariani

On Thu, Jul 6, 2017 at 12:32 PM, Shane Grigsby 
wrote:

> This sounds like it may be a problem more amenable to either DBSCAN or
> OPTICS. Both algorithms don't require a priori knowledge of the number of
> clusters, and both let you specify a minimum point membership threshold for
> cluster membership. The OPTICS algorithm will also produce a dendrogram
> that you can cut for sub clusters if need be.
>
> DBSCAN is part of the stable release and has been for some time; OPTICS is
> pending as a pull request, but it's stable and you can try it if you like:
>
> https://github.com/scikit-learn/scikit-learn/pull/1984
>
> Cheers,
> Shane
> --
> *PhD candidate & Research Assistant*
> *Cooperative Institute for Research in Environmental Sciences (CIRES)*
> *University of Colorado at Boulder*


[scikit-learn] Help with NLP

2017-07-07 Thread Ariani A
Dear all,
I urgently need help with NLP. Do you happen to know anyone who knows nltk
or other NLP modules? Has any of you read this paper?
"Template-Based Information Extraction without the Templates."
I am looking forward to hearing from you soon!
Best,
-Ariani


Re: [scikit-learn] Help with NLP

2017-07-07 Thread Ariani A
Yes, it is.
Regards

On Fri, Jul 7, 2017 at 12:23 PM, Carlton Banks wrote:

> NLP as in natural language processing?


Re: [scikit-learn] Help with NLP

2017-07-07 Thread Ariani A
Dear Jacob,
I know, but I am just asking for help!

@Carlton, I want to do text processing. Can I email you directly so that
the others are not bothered?
Best,
-Ariani

On Fri, Jul 7, 2017 at 12:52 PM, Jacob Schreiber 
wrote:

> The scikit-learn mailing list is probably not the best place to be asking
> for help with another module.


[scikit-learn] Agglomerative clustering problem

2017-07-11 Thread Ariani A
Hi all,
I want to perform agglomerative clustering, but I do not know the number of
clusters beforehand. I do want every cluster to have at least 40 data
points. How can I do this with sklearn's agglomerative clustering?
Should I use a dendrogram and cut it somehow? I have no idea how to relate
the dendrogram to this or how to cut it. Any help will be appreciated!
I have to use agglomerative clustering!
Thanks,
-Ariani


Re: [scikit-learn] Agglomerative clustering problem

2017-07-11 Thread Ariani A
Dear Uri,
Thanks. I only have a pairwise distance matrix, and I want to cluster it
(agglomeratively) so that each cluster has at least 40 data points.
Does this approach work for that?
Thanks,
-Ariani

On Tue, Jul 11, 2017 at 1:54 PM, Uri Goren  wrote:

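> Take a look at scipy's fcluster function.
> If M is a matrix of all of your feature vectors, this code snippet should
> work.
>
> You need to figure out what metric, linkage method (algo) and threshold
> work for you.
>
> from sklearn.metrics import pairwise_distances
> from scipy.cluster import hierarchy
> from scipy.spatial.distance import squareform
> # Square matrix of pairwise distances between the rows of M
> X = pairwise_distances(M, metric=metric)
> # linkage expects a condensed distance vector, not the square matrix
> Z = hierarchy.linkage(squareform(X, checks=False), method=algo)
> # Cut the tree at the distance threshold to get flat cluster labels
> C = hierarchy.fcluster(Z, threshold, criterion="distance")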
>
> Best,
> Uri Goren
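
Starting from only a square distance matrix, one possible way to approach
the "at least 40 points per cluster" requirement with fcluster is sketched
below. This is a heuristic under stated assumptions, not an established
recipe: the helper name cut_with_min_size, the average linkage method, and
the toy data are all hypothetical. It tries the finest cut first and
coarsens it until every cluster is large enough.

```python
import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import pdist, squareform


def cut_with_min_size(D, min_size, method="average"):
    """Return flat labels with as many clusters as the tree allows while
    every cluster keeps at least `min_size` members (heuristic sketch)."""
    n = D.shape[0]
    # linkage wants the condensed (upper-triangle) form of the matrix.
    Z = hierarchy.linkage(squareform(D, checks=False), method=method)
    # Start from the largest cluster count that could satisfy the
    # constraint and move to coarser cuts until all clusters are big enough.
    for k in range(max(n // min_size, 1), 0, -1):
        labels = hierarchy.fcluster(Z, t=k, criterion="maxclust")
        if np.bincount(labels)[1:].min() >= min_size:
            return labels
    # Fewer than `min_size` points in total: fall back to one cluster.
    return np.ones(n, dtype=int)


rng = np.random.default_rng(0)
# Hypothetical data: two blobs of 60 points and a small blob of 10 points.
points = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.15, size=(60, 2)),
    rng.normal(loc=(5.0, 0.0), scale=0.15, size=(60, 2)),
    rng.normal(loc=(6.5, 0.0), scale=0.15, size=(10, 2)),
])
D = squareform(pdist(points))  # stands in for the user's own matrix

labels = cut_with_min_size(D, min_size=40)
sizes = np.bincount(labels)[1:]
print(sizes)
```

In this toy case the small blob is too small to stand alone, so the cut
that separates it is rejected and it ends up merged with its nearest
neighbour. Only cuts expressible as a fixed number of clusters are
considered here; other strategies are possible.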
> --
>
>
> *Uri Goren,Software innovator*
>
> *Phone: +972-507-649-650*
>
> *EMail: u...@goren4u.com *
> *Linkedin: il.linkedin.com/in/ugoren/ <http://il.linkedin.com/in/ugoren/>*
>


Re: [scikit-learn] Agglomerative Clustering without knowing number of clusters

2017-07-13 Thread Ariani A
Dear Shane,
Thanks for your answer.
Does DBSCAN work with a distance matrix? I have a distance matrix
(a symmetric matrix which contains pairwise distances). Can you help me? I
did not find the DBSCAN code in that link.
Best,
-Ariani



Re: [scikit-learn] Agglomerative Clustering without knowing number of clusters

2017-07-13 Thread Ariani A
Dear Shane,
Thanks for your prompt answer.
Do you mean that for DBSCAN there is no need to pass other parameters? Do I
just call the function, or do I have to modify the code?
P.S. I was not able to find the DBSCAN code on github.
Looking forward to hearing from you.
Best,
-Noushin

On Thu, Jul 13, 2017 at 5:38 PM, Shane Grigsby 
wrote:

> Hi Ariani,
> Yes, you can use a distance matrix-- I think that what you want is
> metric='precomputed', and then X would be your N by N distance matrix.
> Hope that helps,
> ~Shane
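
A minimal sketch of the suggestion above, using sklearn's DBSCAN with
metric='precomputed'. The toy distances and the eps and min_samples values
are assumptions for illustration and would need tuning on real data. Note
that min_samples is a density threshold for core points, so it does not by
itself guarantee a minimum size for every final cluster, and some points
may be left unassigned as noise.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy data (hypothetical): two dense groups on a line plus one far outlier.
x = np.array([0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3, 20.0])

# N by N matrix of pairwise distances, passed in place of the features.
D = np.abs(x[:, None] - x[None, :])

# eps is the neighbourhood radius in the same units as D; min_samples is
# the number of neighbours (the point itself included) needed for a core
# point.
db = DBSCAN(eps=0.5, min_samples=3, metric="precomputed")
labels = db.fit_predict(D)

print(labels)  # points labelled -1 are treated as noise
```

No change to the library code is needed; DBSCAN is imported from
sklearn.cluster and called with the distance matrix as X.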


Re: [scikit-learn] Agglomerative Clustering without knowing number of clusters

2017-07-13 Thread Ariani A
Dear Shane,
Sorry for bothering you!
Are the "precomputed" metric and the "distance matrix" you mentioned about
DBSCAN?
Thanks,
Best.



[scikit-learn] No module named crluster.hierarchical

2017-08-13 Thread Ariani A
Dear all,

I am writing this import:

from sklearn.crluster.hierarchical import (_hc_cut, _TREE_BUILDERS,
  linkage_tree)
But it gives this error:
ImportError: No module named crluster.hierarchical

Any clue?
Best regards,
-Noushin


Re: [scikit-learn] No module named crluster.hierarchical

2017-08-13 Thread Ariani A
Thank you so much!

On Sun, Aug 13, 2017 at 12:20 PM, Vlad Niculae  wrote:

> Looks like you're misspelling the word "cluster".
>
> Yours,
> Vlad