Re: [R] PAM Clustering

2017-08-18 Thread Sema Atasever
Thank you very much,

I am very grateful for your kindness in answering my questions.

Regards.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] PAM Clustering

2017-08-17 Thread Germano Rossi
Sorry, I never use pam. In the help, you can see that pam requires a
data frame OR a dissimilarity matrix. If diss = FALSE, then the "euclidean"
metric is used. So I take it that a dissimilarity matrix is generated
automatically.

The problem may be in your data. Indeed,

pam(ruspini, 4)$diss

returns a dissimilarity matrix, while

pam(MYdata, 10)$diss

returns NULL.
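
A possible explanation for the NULL, offered as a sketch rather than a diagnosis: according to ?pam, the keep.diss argument defaults to TRUE only when the input has fewer than 100 observations. A NULL $diss component on a larger data set therefore does not by itself point to a problem with the data. The matrix `big` below is made-up data used only to illustrate this.

```r
library(cluster)
data(ruspini)                        # 75 observations, shipped with 'cluster'

fit_small <- pam(ruspini, 4)
is.null(fit_small$diss)              # FALSE: dissimilarities are kept (n < 100)

set.seed(1)
big <- matrix(rnorm(150 * 3), ncol = 3)   # hypothetical data with 150 rows
fit_big <- pam(big, 4)
is.null(fit_big$diss)                # TRUE: not stored by default for n >= 100

fit_kept <- pam(big, 4, keep.diss = TRUE)
is.null(fit_kept$diss)               # FALSE: stored because it was requested
```

The clustering itself is computed in all three cases; only the stored copy of the dissimilarities differs.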



-- 
==
Germano Rossi, Dipartimento di Psicologia, Universita' degli Studi di
Milano Bicocca
Piazza dell'Ateneo Nuovo, 1- 20126 Milano - Italy



Re: [R] PAM Clustering

2017-08-17 Thread Sema Atasever
Dear Germano,

Thank you for your fast reply.

In the code you posted, MYdata is the actual data set.

Don't we need to convert MYdata to a dissimilarity matrix first, using a
line such as pam(as.dist(MYdata), k = 10, diss = TRUE)?

Regards.
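
A minimal sketch related to this question, using the ruspini data that ships with the cluster package as a stand-in for MYdata (an assumption, since the real file is not available here). as.dist() only re-labels an existing square matrix of distances; it does not compute distances from a table of features, whereas dist() does. Passing the raw data to pam(), or passing dist() of the data with diss = TRUE, should lead to the same partition:

```r
library(cluster)
data(ruspini)   # stand-in for MYdata (assumption)

# Raw features: pam() computes Euclidean distances internally.
fit_raw <- pam(ruspini, 4)

# Explicit distances: dist() computes them, diss = TRUE tells pam() so.
fit_dist <- pam(dist(ruspini), 4, diss = TRUE)

# Compare the two cluster assignments observation by observation.
same_partition <- all(fit_raw$clustering == fit_dist$clustering)
same_partition
```

So an explicit conversion is only needed when a metric other than the ones pam() offers is wanted, and in that case dist() or daisy() would be the functions to look at rather than as.dist().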



Re: [R] PAM Clustering

2017-08-17 Thread Germano Rossi
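Try this:

# read.csv2() expects ';' between fields; dec = "." sets the decimal mark
MYdata <- read.csv2("data.txt", dec = ".")
library(cluster)
cluster.pam <- pam(MYdata, 10)
table(cluster.pam$clustering)
# paste0() avoids the space that paste() would insert ("clusters .txt")
filenameclu <- paste0("clusters", ".txt")
write.table(cluster.pam$clustering, file = filenameclu, sep = ",")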


2017-08-17 10:28 GMT+02:00 Sema Atasever :

> Dear Sir / Madam,
>
> I have a data set in which each row represents an amino acid and each
> column corresponds to a feature (539 features in total).
> I want to run PAM clustering on this data set.
>
> When I run the R script, I get this error:
>
> Error in pam(d, 10) : x is not a numeric dataframe or matrix.
> Execution halted
>
> How can I fix this error? Is there a problem with my data set?
>
> Thanks in advance.
>
> PAM clustering code:
>
> MYdata <- read.csv2("data.txt", dec = ".")
> attach(MYdata)
> d=as.matrix(MYdata)
> library(cluster)
> cluster.pam = pam(d,10)
> table(cluster.pam$clustering)
>
> filenameclu = paste("clusters", ".txt")
> write.table(cluster.pam$clustering, file=filenameclu,sep=",")
>





Re: [R] PAM Clustering

2017-07-10 Thread Ulrik Stervbo
Hi Sema,

read.csv2() uses ',' as the decimal separator by default. Since your file
uses '.', every column is read as character, which in turn makes pam()
complain that what you pass to the function isn't numeric.

Use read.csv2("data.csv", dec = ".") and it should work.

You can also use is.numeric(d) or mode(d) to check the type of the matrix
before you pass it to pam().

See ?read.table for more options.

There is a base function called 'data', so naming a variable data is a poor
choice.
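
A small sketch of the check described above. The temporary file is a made-up stand-in for data.csv (the real file is not reproduced here) and assumes ';' between fields and '.' as the decimal mark:

```r
# Hypothetical stand-in for data.csv: ';' between fields, '.' as decimal mark.
tmp <- tempfile(fileext = ".csv")
writeLines(c("f1;f2", "0.5;1.25", "2.0;3.75", "4.5;0.1"), tmp)

wrong <- read.csv2(tmp)              # default dec = "," does not match the file
right <- read.csv2(tmp, dec = ".")   # decimal mark matches the file

sapply(wrong, is.numeric)            # FALSE for every column
sapply(right, is.numeric)            # TRUE for every column

d_wrong <- as.matrix(wrong)          # character matrix: pam() would reject it
d_right <- as.matrix(right)          # numeric matrix: acceptable input for pam()
is.numeric(d_wrong)
is.numeric(d_right)

unlink(tmp)                          # remove the temporary file
```

Running sapply(MYdata, is.numeric) on the real data before calling as.matrix() would show which columns, if any, were not read as numbers.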

HTH
Ulrik


[R] PAM Clustering

2017-07-10 Thread Sema Atasever
Dear Sir / Madam,

I have an R script file that contains PAM clustering code.

When I run the R script, I get this error:

Error in pam(d, 10) : x is not a numeric dataframe or matrix.
Execution halted

How can I fix this error?

Thanks in advance.

data.csv
<https://drive.google.com/file/d/0B4rY6f4kvHeCcVpLRTQ5VDhDNUk/view?usp=drive_web>

pam.R:
data <- read.csv2("data.csv")
attach(data)
d=as.matrix(data)
library(cluster)
cluster.pam = pam(d,10)
table(cluster.pam$clustering)

filenameclu = paste("clusters", ".txt")
write.table(cluster.pam$clustering, file=filenameclu,sep=",")


[R] PAM Clustering Ignores Cluster Number Parameter

2011-05-17 Thread Dario Strbenac
I am using PAM with k = 10 clusters, but I get only one cluster ID for all my
observations. I couldn't find any discussion of this in the help file or on the
mailing lists.

Is there a reasonable explanation for this result?

> cIDs <- pam(all, 10, cluster.only = TRUE, do.swap = FALSE)
> table(cIDs)
cIDs
    0 
16671

The matrix of observations can be found at : 
http://129.94.136.7/file_dump/dario/all.obj

I'm using R version 2.13.0 (2011-04-13) on Platform: x86_64-unknown-linux-gnu 
(64-bit) and have cluster_1.13.3.
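
For comparison, a small sketch of what the same kind of call is normally expected to return. It uses the ruspini data from the cluster package rather than the matrix above (which is not reproduced here), so it only illustrates the regular behaviour and does not explain the result reported:

```r
library(cluster)
data(ruspini)   # small example data set, used instead of the poster's matrix

# Same arguments as in the report, but with k = 4 on 75 observations.
ids <- pam(ruspini, 4, cluster.only = TRUE, do.swap = FALSE)

table(ids)              # one count per cluster ID
sort(unique(ids))       # cluster IDs run from 1 to k
```

With cluster.only = TRUE the return value is just the vector of cluster IDs, numbered from 1 to k, so a single ID of 0 does not look like a regular result.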

--
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia



Re: [R] pam() clustering for large data sets

2011-05-17 Thread Christian Hennig

Dear Lilia,

I'm not sure whether this is particularly helpful in your situation, but 
sometimes it is possible to emulate the same (or approximately the same) 
distance measure as Euclidean distance between points that are 
somehow rescaled and retransformed. In this case, you can rescale and 
retransform your original data from which you computed the distances, and 
use clara, which then implicitly computes Euclidean distances.
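
As one illustration of this idea, a sketch under stated assumptions: the distance of interest is taken to be a weighted Euclidean distance, and the data `x` and the weights `w` are made up for the example. Multiplying each column by its weight makes the ordinary Euclidean distance on the rescaled data equal to the weighted distance on the original data, so clara() can be run on the rescaled data directly.

```r
library(cluster)
set.seed(42)

# Hypothetical data: two features on very different scales.
x <- cbind(a = rnorm(200, sd = 1), b = rnorm(200, sd = 50))
w <- 1 / apply(x, 2, sd)              # hypothetical per-feature weights

# Rescale each column by its weight.
x_scaled <- sweep(x, 2, w, `*`)

# Weighted Euclidean distance computed directly on the original data.
d_weighted <- sqrt(w[[1]]^2 * outer(x[, 1], x[, 1], "-")^2 +
                   w[[2]]^2 * outer(x[, 2], x[, 2], "-")^2)

# It agrees with the plain Euclidean distance on the rescaled data.
agree <- all.equal(as.vector(as.dist(d_weighted)), as.vector(dist(x_scaled)))
agree

# clara() never needs the full distance matrix.
fit <- clara(x_scaled, k = 3)
table(fit$clustering)
```

Whether such a transformation exists depends on the distance measure; many measures cannot be written as a Euclidean distance on transformed data.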


Of course whether this works depends on the nature of your data and the 
distance measure that you want to use.


Another possibility is to draw a random subset of, say, 3,000 
observations, run pam on it, and assign the remaining ones to their 
closest medoid "manually". Actually this is about what clara does anyway.
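
The subset approach can be sketched as follows. The data, the subset size of 300, and k = 2 are made up for the example, and Euclidean distance is assumed; with precomputed distances, only the distances from each observation to the k medoids would be needed for the assignment step.

```r
library(cluster)
set.seed(1)

# Hypothetical data: 2,000 points in two well-separated groups.
x <- rbind(matrix(rnorm(2000, mean = 0), ncol = 2),
           matrix(rnorm(2000, mean = 10), ncol = 2))
n <- nrow(x)
k <- 2

# 1. Run pam() on a random subset only.
sub <- sample(n, 300)
fit <- pam(x[sub, ], k)
medoids <- fit$medoids            # k x p matrix of medoid coordinates

# 2. Squared Euclidean distance from every observation to each medoid.
d2 <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, medoids[j, ])^2))

# 3. Assign each observation to its closest medoid.
assignment <- max.col(-d2, ties.method = "first")
table(assignment)

# clara() automates the same idea over several random subsets.
fit_clara <- clara(x, k, samples = 5, sampsize = 300)
table(fit_clara$clustering)
```

The memory needed is driven by the subset size rather than by the full number of observations, which is the point of the suggestion.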


Best regards,
Christian

*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche



[R] pam() clustering for large data sets

2011-05-16 Thread Lilia Nedialkova
Hello everyone,

I need to do k-medoids clustering for data which consists of 50,000
observations.  I have computed distances between the observations
separately and tried to use those with pam().

I got the "cannot allocate vector of length" error and I realize this
job is too memory intensive.  I am at a bit of a loss on what to do at
this point.
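
A back-of-the-envelope estimate of the memory involved, assuming the distances are held as 8-byte doubles in a "dist"-style lower triangle (any additional working copies would come on top of this):

```r
n <- 50000                      # number of observations
n_pairs <- n * (n - 1) / 2      # pairwise distances in the lower triangle
bytes <- 8 * n_pairs            # 8 bytes per double
gib <- bytes / 2^30             # convert to GiB

n_pairs                         # 1249975000
round(gib, 1)                   # 9.3
```

So a single copy of the distances already takes roughly 9.3 GiB, which makes the allocation error unsurprising on a typical machine.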

I can't use clara(), because I want to use the already computed distances.

What is it that people do to perform clustering for such large data sets?

I would greatly appreciate any form of suggestions that people may have.

Thank you very much in advance.
