Re: [R] Clustering Functions used by Reverse-Dependencies

2024-02-29 Thread Leo Mada via R-help
f code. On the other hand, the help page for codetools::checkUsage is quite cryptic. But it's good to know at least where to look. Sincerely, Leonard From: Ivan Krylov Sent: Wednesday, February 28, 2024 10:36 AM To: Leo Mada via R-help Cc: Leo Mada Subject

Re: [R] Clustering Functions used by Reverse-Dependencies

2024-02-28 Thread Ivan Krylov via R-help
On Sat, 24 Feb 2024 03:08:26 + Leo Mada via R-help wrote: > Are there any tools to extract the function names called by > reverse-dependencies? For well-behaved packages that declare their dependencies correctly, parsing the NAMESPACE for importFrom() and import() calls should give you the
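
A minimal sketch of the NAMESPACE-parsing idea from the reply above. It assumes the reverse dependency is available as an installed or unpacked package directory; the package name `demoPkg` and its NAMESPACE content are hypothetical and are written to a temporary directory only so the example is self-contained. `parseNamespaceFile()` is the base R parser; everything else here is illustration.

```r
# Create a throwaway "package" directory holding a NAMESPACE file.
# demoPkg and its directives are hypothetical stand-ins for a real reverse dependency.
pkg_lib <- tempfile("lib")
dir.create(file.path(pkg_lib, "demoPkg"), recursive = TRUE)
writeLines(
  c("import(methods)", "importFrom(stats, kmeans, hclust)"),
  file.path(pkg_lib, "demoPkg", "NAMESPACE")
)

# Parse the directives without loading the package.
ns <- parseNamespaceFile("demoPkg", pkg_lib)

# Each element of ns$imports is either a package name (from import())
# or a list of package name and imported function names (from importFrom()).
imports <- do.call(rbind, lapply(ns$imports, function(imp) {
  if (is.list(imp)) {
    data.frame(package = imp[[1]], fun = unname(imp[[2]]))
  } else {
    data.frame(package = imp, fun = NA_character_)  # whole-package import
  }
}))
print(imports)
```

Whole-package `import()` directives give no function names, so those rows carry `NA`; finding the functions actually called would need code analysis, for example with codetools as mentioned in the follow-up.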

[R] Clustering Functions used by Reverse-Dependencies

2024-02-23 Thread Leo Mada via R-help
Dear R Users, Are there any tools to extract the function names called by reverse-dependencies? I would like to group these functions using clustering methods based on the co-occurrence in the reverse-dependencies. Utility: It may be possible to split complex packages into modules with fewer

Re: [R] Clustering of datasets

2022-09-05 Thread Rui Barradas
Hello, I am not at all sure that the following answers the question. The code below tries to find the optimal number of clusters. One of the changes I have made to your call to kmeans is to subset DMs not dropping the dim attribute. library(cluster) max_clust <- 10 wss <- numeric(max_clust)
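
A minimal sketch of the within-cluster sum of squares loop that the snippet above starts, on simulated data, since the original `DMs` object is not shown. The data, the column choice and `nstart = 10` are assumptions; `drop = FALSE` is the point about keeping the dim attribute when a single column is selected.

```r
set.seed(1)
# Simulated stand-in for the poster's data: two well separated groups.
DMs <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2))

max_clust <- 10
wss <- numeric(max_clust)
for (k in seq_len(max_clust)) {
  # drop = FALSE keeps a one-column matrix instead of a bare vector
  fit <- kmeans(DMs[, 1, drop = FALSE], centers = k, nstart = 10)
  wss[k] <- fit$tot.withinss
}
print(round(wss, 2))
```

Plotting `wss` against `k` and looking for the bend in the curve is the usual way to read the result.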

Re: [R] Clustering of datasets

2022-09-05 Thread Jim Lemon
Hi Subhamitra, I think the fact that you are passing a vector of values rather than a matrix is part of the problem. As you have only one value for each country, The points plotted will be the index on the x-axis and the value for each country on the y-axis. Passing a value for ylim= means that

[R] clustering levels using Tukey HSD in a one way anova

2017-12-31 Thread Ashim Kapoor
Dear all, I am doing a one way between subjects anova in an unbalanced data set. Suppose we have "a" levels of the one factor. I want to merge the not so significantly different levels into the same cluster. Can I do a Tukey Kramer HSD and then use the following algorithm: For i in 2 : "a"

Re: [R] Clustering methods for data that has bimodal distribution

2016-12-05 Thread Ranjan Maitra
Hello Adrian, It all depends on what the structure of the dataset is. For instance, you said that all your values are between -1 and 1. Do the data rows sum-squared up to 1? How about the means? Are they zero? I guess all this has to depend on the application and how the data were processed

[R] Clustering methods for data that has bimodal distribution

2016-12-04 Thread Adrian Johnson
Dear group, pardon me for a naive question. I have data matrix (11K rows , 4K columns). The data range is between -1 to 1. Not strictly integers, but real numbers with at least place values in millionths. The data distribution is peculiar (if I do plot(density(myMatrix)), I get nice bimodal

[R] Clustering of clients (retail) - Free data sets?

2015-09-17 Thread Omar André Gonzáles Díaz
Hi all, I'm learning about how to do clusters of clients. I've found this nice presentation on the subject, but the data is not available to use. I've contacted the author, hope he'll answer soon. https://ds4ci.files.wordpress.com/2013/09/user08_jimp_custseg_revnov08.pdf Someone knows

[R] clustering with hclust

2014-07-25 Thread Marianna Bolognesi
Hi everybody, I have a problem with a cluster analysis. I am trying to use hclust, method=ward. The Ward method works with SQUARED Euclidean distances. Hclust demands a dissimilarity structure as produced by dist. Yet, dist does not seem to produce a table of squared euclidean distances,

Re: [R] clustering with hclust

2014-07-25 Thread Christian Hennig
Dear Marianna, the function agnes in library cluster can compute Ward's method from a raw data matrix (at least this is what the help page suggests). Also, you may not be using the most recent version of hclust. The most recent version has a note in its help page that states: Two different
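
A minimal sketch of the distinction that the help-page note refers to, assuming R 3.1.0 or later, where `hclust()` offers both `"ward.D"` and `"ward.D2"`. The `USArrests` data is only a convenient built-in example. `"ward.D2"` squares the dissimilarities internally, so feeding squared distances to `"ward.D"` should give the same tree with squared heights.

```r
d <- dist(scale(USArrests))               # plain Euclidean distances

hc_d2 <- hclust(d,   method = "ward.D2")  # Ward's criterion on unsquared input
hc_sq <- hclust(d^2, method = "ward.D")   # older variant, fed squared distances

# Merge heights agree after taking the square root of the second tree's heights.
print(all.equal(sqrt(hc_sq$height), hc_d2$height))

# The resulting partitions are the same as well.
print(table(ward.D2 = cutree(hc_d2, 4), ward.D_squared = cutree(hc_sq, 4)))
```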

[R] Clustering of data set documentation files in package description

2013-09-20 Thread Thiem Alrik
Dear R help list, I was just wondering whether there is a way to cluster the documentation files of data sets in the package documentation index file, so that common prefixes such as dat... are not necessary. Best wishes, Alrik Dr.

[R] Clustering with uneven variables

2013-05-09 Thread Elizabeth McKenzie
Hello, I am new to R (and a novice at statistics). I have a list of objects, with (ideally) 10 different attributes measured per object. However, in reality, I was not able to obtain all 10 attributes for every object, so there is some data missing (unequal number of measured attributes

[R] Clustering newbie question

2012-12-18 Thread Anton Ashanin
Hello, Please advise on encoding data for the following clustering problem. I have a dataset with car usage info. Dataset has the following fields: 1. Car model (Toyota Celica, BMW, Nissan X-Trail, Mazda Cosmo, etc.) 2. Year built 3. Country where the car runs 4. Distance run by car before

[R] clustering of binary data

2012-12-06 Thread marco milella
Good morning, I am analyzing a dataset composed by 364 subjects and 13 binary variables (0,1 = absence,presence). I am testing possible association (co-presence) of my variables. To do this, I was trying with cluster analysis. My main interest is to check for the significance of the obtained

Re: [R] clustering of binary data

2012-12-06 Thread David L Carlson
Texas A&M University College Station, TX 77843-4352 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of marco milella Sent: Thursday, December 06, 2012 12:08 PM To: r-help@r-project.org Subject: [R] clustering of binary data

[R] Clustering groups according to multiple variables

2012-10-31 Thread Matthew Ouellette
Dear R help, I am trying to cluster my data according to group in a data frame such as the following: df=data.frame(group=rep(c(a,b,c,d),10),(replicate(100,rnorm(40 I'm not sure how to tell hclust() that I want to cluster according to the group variable. For example:

[R] clustering spline-based models

2012-10-01 Thread Wyatt McMahon
Hello playeRs! I'm working on a project for a client. She's modeling hormone levels periodically, and trying to develop a model and fit her data to that model, and subsequently she's trying to cluster individuals based on how well each fits the model. I've been looking at grofit for this,

Re: [R] Clustering analysis with ordination plots

2012-05-02 Thread Gavin Simpson
Please read the posting guide for future questions. I presume you mean using the vegan package? If so, then see this blog post of mine which shows how to do something similar: http://wp.me/pZRQ9-73 If you post more details and an example I will help further if the blog post is not sufficient

Re: [R] Clustering analysis with ordination plots

2012-05-01 Thread Uwe Ligges
On 30.04.2012 18:44, borinot wrote: Hello to all, I'm new to R so I have a lot of problems with it, but I'll only ask the main one. I have clustered an environmental matrix We do not know what that is. Where is the example data? See the posting guide. with 2 different methods,

[R] Clustering analysis with ordination plots

2012-04-30 Thread borinot
Hello to all, I'm new to R so I have a lot of problems with it, but I'll only ask the main one. I have clustered an environmental matrix with 2 different methods, and I'd like to plot them in a PCA and a db-RDA. I mean, I want to see these clusters in the plots like points of different colours,

[R] clustering and the region of integration

2012-02-10 Thread Barbara Uszczynska
Dear R users, I'm having trouble with calculating pvalues for my 2d dataset. First I performed clustering and I would like to get some info about the strength of cluster membership for each point. I've calculated (thanks to nice people help) the multivariate normal densities (mnd) using dmvnorm

[R] Clustering and visualising a wordcloud

2012-01-23 Thread Sachinthaka Abeywardana
Is there a package (and for that matter a function) that I can use to create clustered wordclouds. The current wordcloud package simply has more frequent words as larger words, whereas what I want is the cluster centre to be the more frequent words but, the closer a word is to another the higher

[R] Clustering Large Applications..sort of

2011-08-10 Thread Ken Hutchison
Hello all, I am using the clustering functions in R in order to work with large masses of binary time series data, however the clustering functions do not seem able to fit this size of practical problem. Library 'hclust' is good (though it may be sub par for this size of problem, thus doubly

Re: [R] Clustering Large Applications..sort of

2011-08-10 Thread Thomas Lumley
Try the flow cytometry clustering functions in Bioconductor. -thomas On Thu, Aug 11, 2011 at 7:07 AM, Ken Hutchison vicvoncas...@gmail.com wrote: Hello all,   I am using the clustering functions in R in order to work with large masses of binary time series data, however the clustering

Re: [R] Clustering Large Applications..sort of

2011-08-10 Thread Peter Langfelder
On Wed, Aug 10, 2011 at 12:07 PM, Ken Hutchison vicvoncas...@gmail.com wrote: Hello all,   I am using the clustering functions in R in order to work with large masses of binary time series data, however the clustering functions do not seem able to fit this size of practical problem. Library

Re: [R] Clustering Large Applications..sort of

2011-08-10 Thread Christian Hennig
There is a number of methods in the literature to decide the number of clusters for k-means. Probably the most popular one is the Calinski and Harabasz index, implemented as calinhara in package fpc. A distance based version (and several other indexes to do this) is in function cluster.stats

Re: [R] Clustering Large Applications..sort of

2011-08-10 Thread Christian Hennig
PS to my previous posting: Also have a look at kmeansruns in fpc. This runs kmeans for several numbers of clusters and decides the number of clusters by either CalinskiHarabasz or Average Silhouette Width. Christian On Wed, 10 Aug 2011, Ken Hutchison wrote: Hello all, I am using the

Re: [R] clustering based on most significant pvalues does not separate the groups!

2011-07-06 Thread S Ellison
-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of pguilha Sent: 04 July 2011 19:22 To: r-help@r-project.org Subject: [R] clustering based on most significant pvalues does not separate the groups! Hi all, I have some microarray data on 40 samples that fall into two groups. I

Re: [R] clustering based on most significant pvalues does not separate the groups!

2011-07-06 Thread pguilha
Yes absolutely, your explanation makes sense. Thanks very much. rgds Paul -- View this message in context: http://r.789695.n4.nabble.com/clustering-based-on-most-significant-pvalues-does-not-separate-the-groups-tp3644249p3649233.html Sent from the R help mailing list archive at Nabble.com.

[R] clustering based on most significant pvalues does not separate the groups!

2011-07-04 Thread pguilha
Hi all, I have some microarray data on 40 samples that fall into two groups. I have a value for 480k probes for each of those samples. I performed a t test (rowttests) on each row(giving the indices of the columns for each group) then used p.adjust() to adjust the pvalues for the number of tests

[R] Clustering help in Heat Maps

2011-04-12 Thread khush ........
Dear Experts, I am using the below script to generate the heat map of gene expression data. I am using Hierarchical Clustering (hclust) for clustering. Now I want to compare different clustering parameters such as *K-means* clustering, Model Based Clustering, I have two queries: 1. How to

[R] Clustering problem

2011-03-21 Thread Abhishek Pratap
Hi Guys I want to apply a clustering algo to my dataset in order to find the regions points(X,Y) which have similar values(percent_GC and mean_phred_quality). Details below. I have sampled 1% of points from my main data set of 85 million points. The result is still somewhat large 800K points

[R] clustering problem

2011-03-02 Thread Maxim
Hi, I have a gene expression experiment with 20 samples and 25000 genes each. I'd like to perform clustering on these. It turned out to become much faster when I transform the underlying matrix with t(matrix). Unfortunately then I'm not anymore able to use cutree to access individual clusters. In

Re: [R] clustering problem

2011-03-02 Thread rex.dwyer
Don't you expect it to be a lot faster if you cluster 20 items instead of 25000? -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Maxim Sent: Wednesday, March 02, 2011 4:08 PM To: r-help@r-project.org Subject: [R] clustering problem

Re: [R] clustering problem

2011-03-02 Thread Maxim
-project.org Subject: [R] clustering problem Hi, I have a gene expression experiment with 20 samples and 25000 genes each. I'd like to perform clustering on these. It turned out to become much faster when I transform the underlying matrix with t(matrix). Unfortunately then I'm not anymore able

Re: [R] clustering fuzzy

2011-02-05 Thread pete
After ordering the table of membership degrees, I must get the difference between the first and second columns, between the first and second largest membership degree of object i. This for K=2,K=3,to K.max=6. This difference is multiplied by the Crisp silhouette index vector (si). Too it

[R] clustering fuzzy

2011-02-05 Thread pete
After ordering the table of membership degrees, I must get the difference between the first and second columns, between the first and second largest membership degree of object i. This for K=2,K=3,to K.max=6. This difference is multiplied by the Crisp silhouette index vector (si). Too it

[R] clustering with finite mixture model

2011-02-02 Thread karuna m
Dear R-help, I am doing clustering via finite mixture model. Please suggest some packages in R to find clusters via finite mixture model with continuous variables. And also I wish to verify the distributional properties of the mixture distributions by fitting the model with lognormal, gamma,

Re: [R] clustering with finite mixture model

2011-02-02 Thread Matt Shotwell
There are quite a few packages that work with finite mixtures, as evidenced by the descriptions here: http://cran.r-project.org/web/packages/index.html These might be useful: http://cran.r-project.org/web/packages/flexmix/index.html http://cran.r-project.org/web/packages/mclust/index.html

Re: [R] clustering fuzzy

2011-02-02 Thread pete
After ordering the table of membership degrees, I must get the difference between the first and second columns, between the first and second largest membership degree of object i. This for K=2,K=3,to K.max=6. This difference is multiplied by the Crisp silhouette index vector (si). Too it

Re: [R] clustering fuzzy

2011-01-22 Thread pete
I must get an index (fuzzy silhouette), a weighted average. An average of the crisp silhouette s for every row (i), where the weight of each term is determined by the difference between the membership degrees of the corresponding object to its first and second best matching fuzzy clusters. I need the

[R] clustering fuzzy

2011-01-21 Thread pete
Hello, I'm pete, how can I order the rows of a matrix from max to min value? I have a matrix of membership degrees, with 82 (i) rows and K columns; K are clusters. I need the first and second largest elements of the i-th row. For example 1 0.66 0.04 0.01 0.30 2 0.02 0.89 0.09 0.00 3 0.06 0.92 0.01 0.01

Re: [R] clustering fuzzy

2011-01-21 Thread jim holtman
use 'apply': head(x.m) V2 V3 V4 V5 [1,] 0.66 0.04 0.01 0.30 [2,] 0.02 0.89 0.09 0.00 [3,] 0.06 0.92 0.01 0.01 [4,] 0.07 0.71 0.21 0.01 [5,] 0.10 0.85 0.04 0.01 [6,] 0.91 0.04 0.02 0.02 x.m.sort <- apply(x.m, 1, sort, decreasing = TRUE) head(t(x.m.sort)) [,1] [,2] [,3] [,4]
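
A minimal sketch of the `apply()` approach from the reply above, using the first three rows quoted in the question. The object names and the final difference step are illustrative assumptions based on what the poster says is needed in the later messages.

```r
# First three rows of the membership matrix quoted in the question.
x.m <- matrix(c(0.66, 0.04, 0.01, 0.30,
                0.02, 0.89, 0.09, 0.00,
                0.06, 0.92, 0.01, 0.01),
              nrow = 3, byrow = TRUE)

# apply() over rows returns the sorted rows as columns, so transpose back.
x.m.sort <- t(apply(x.m, 1, sort, decreasing = TRUE))

# Difference between the largest and second largest membership per row.
top2_gap <- x.m.sort[, 1] - x.m.sort[, 2]
print(x.m.sort)
print(top2_gap)
```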

Re: [R] clustering fuzzy

2011-01-21 Thread pete
thank you ,you have been very kind -- View this message in context: http://r.789695.n4.nabble.com/clustering-fuzzy-tp3229853p3230228.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list

Re: [R] clustering association rules

2010-11-11 Thread Michael Hahsler
Jüri, How did you create the output? An example to cluster transactions with arules can be found in: Michael Hahsler and Kurt Hornik. Building on the arules infrastructure for analyzing transaction data with R. In R. Decker and H.-J. Lenz, editors, /Advances in Data Analysis, Proceedings of

[R] clustering association rules

2010-11-10 Thread Kuusik , Jüri
Hello. I have a general question regarding to clustering of association rules. According to http://cran.r-project.org/web/packages/arules/vignettes/arules.pdf 4.7 Distance based clustering transactions and associations there is possibility for creating clusters of association rules. I do not

Re: [R] Clustering

2010-10-30 Thread dpender
David Winsemius wrote: On Oct 29, 2010, at 12:08 PM, David Winsemius wrote: On Oct 29, 2010, at 11:37 AM, dpender wrote: Apologies for being vague, The structure of the output is as follows: Still no code? I am using the Clusters function from the evd package $ cluster1 :

Re: [R] Clustering

2010-10-30 Thread David Winsemius
On Oct 30, 2010, at 7:49 AM, dpender wrote: David Winsemius wrote: On Oct 29, 2010, at 12:08 PM, David Winsemius wrote: On Oct 29, 2010, at 11:37 AM, dpender wrote: Apologies for being vague, The structure of the output is as follows: Still no code? I am using the Clusters

Re: [R] Clustering

2010-10-29 Thread dpender
That's helpful but the reason I'm using clusters in evd is that I need to specify a time condition to ensure independence. I therefore have an output in the form Cluster[[i]][j-k] where i is the cluster number and j-k is the range of values above the threshold taking account of the time

Re: [R] Clustering

2010-10-29 Thread David Winsemius
On Oct 29, 2010, at 5:14 AM, dpender wrote: That's helpful but the reason I'm using clusters in evd is that I need to specify a time condition to ensure independence. I believe this is the first we heard about any particular function or package. I therefore have an output We

Re: [R] Clustering

2010-10-29 Thread dpender
Apologies for being vague, The structure of the output is as follows: $ cluster1 : Named num [1:131] 3.05 2.71 3.26 2.91 2.88 3.11 3.21 -1 2.97 3.39 ... ..- attr(*, names)= chr [1:131] 6667 6668 6669 6670 ... With 613 clusters. What I require is abstracting the first and last value of

Re: [R] Clustering

2010-10-29 Thread David Winsemius
On Oct 29, 2010, at 11:37 AM, dpender wrote: Apologies for being vague, The structure of the output is as follows: Still no code? $ cluster1 : Named num [1:131] 3.05 2.71 3.26 2.91 2.88 3.11 3.21 -1 2.97 3.39 ... ..- attr(*, names)= chr [1:131] 6667 6668 6669 6670 ... With 613

Re: [R] Clustering

2010-10-29 Thread David Winsemius
On Oct 29, 2010, at 12:08 PM, David Winsemius wrote: On Oct 29, 2010, at 11:37 AM, dpender wrote: Apologies for being vague, The structure of the output is as follows: Still no code? $ cluster1 : Named num [1:131] 3.05 2.71 3.26 2.91 2.88 3.11 3.21 -1 2.97 3.39 ... ..- attr(*,

[R] Clustering

2010-10-28 Thread dpender
I am looking to use R in order to determine the number of extreme events for a high frequency (20 minutes) dataset of wave heights that spans 25 years (657,432) data points. I require the number, spacing and duration of the extreme events as an output. I have briefly used the clusters function

Re: [R] Clustering

2010-10-28 Thread Albyn Jones
I have worked with seismic data measured at 100hz, and had no trouble locating events in long records (several times the size of your dataset). 20 minutes is high frequency? what kind of waves are these? what is the wavelength? some details would help. albyn On Thu, Oct 28, 2010 at 05:00:10AM

Re: [R] Clustering

2010-10-28 Thread David Winsemius
On Oct 28, 2010, at 8:00 AM, dpender wrote: I am looking to use R in order to determine the number of extreme events for a high frequency (20 minutes) dataset of wave heights that spans 25 years (657,432) data points. I require the number, spacing and duration of the extreme events as

[R] clustering on scaled dataset or not?

2010-10-28 Thread array chip
Hi, just a general question: when we do hierarchical clustering, should we compute the dissimilarity matrix based on scaled dataset or non-scaled dataset? daisy() in cluster package allow standardizing the variables before calculating dissimilarity matrix; but dist() doesn't have that option at

Re: [R] clustering on scaled dataset or not?

2010-10-28 Thread Claudia Beleites
John, Hi, just a general question: when we do hierarchical clustering, should we compute the dissimilarity matrix based on scaled dataset or non-scaled dataset? daisy() in cluster package allow standardizing the variables before calculating dissimilarity matrix; I'd say that should
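
A minimal sketch of one way to standardize before calling `dist()`, which has no scaling option of its own. Using `scale()` (centre to mean 0, divide by standard deviation) is an assumption about what "scaled" should mean here; `USArrests`, average linkage and three clusters are arbitrary choices for illustration.

```r
x <- USArrests

d_raw    <- dist(x)         # dominated by the variable with the largest spread
d_scaled <- dist(scale(x))  # every variable has standard deviation 1

hc_raw    <- hclust(d_raw,    method = "average")
hc_scaled <- hclust(d_scaled, method = "average")

# Cross-tabulate the two 3-cluster solutions to see how much scaling matters.
print(table(raw = cutree(hc_raw, 3), scaled = cutree(hc_scaled, 3)))
```

Whether to scale is a subject-matter decision: it is appropriate when the variables are on different units and none should dominate by magnitude alone.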

[R] Clustering with ordinal data

2010-10-19 Thread Steve_Friedman
Hello I've been asked to help evaluate a vegetation data set, specifically to examine it for community similarity. The initial problem I see is that the data is ordinal. At best this only captures a relative ranking of abundance and ordinal ranks are assigned after data collection. I've

Re: [R] Clustering with ordinal data

2010-10-19 Thread Phil Spector
Steve - Take a look at daisy() in the cluster package. - Phil Spector Statistical Computing Facility Department of Statistics UC

Re: [R] Clustering with ordinal data

2010-10-19 Thread Steve_Friedman
Re: [R] Clustering with ordinal data

Re: [R] Clustering with ordinal data

2010-10-19 Thread Michael Bedward
Hello Steve, I've been asked to help evaluate a vegetation data set, specifically to examine it for community similarity. The initial problem I see is that the data is ordinal.   At best this only captures a relative ranking of abundance and ordinal ranks are assigned after data collection.

[R] clustering with cosine correlation

2010-10-11 Thread l.mohammadikhankahdani
Dear All Do you know how to make a heatmap and use cosine correlation for clustering? This is what my colleague can do in gene-math and I want to do in R but I don't know how to. Thanks a lot Leila __ R-help@r-project.org mailing list

[R] Clustering groups

2010-07-21 Thread syrvn
Hi, is there a way in R to identify those cluster methods / distance measures which best reflect predefined cluster groups. Given 10 observations O1...O10. Optimally, these 10 observations cluster as follows: cluster1: O1, O2, O3, O4 cluster2: O5, O6 cluster3: O7, O8, O9, O10. What I want is a

[R] Clustering

2010-06-23 Thread Ralph Modjesch
Hi, I use the following clustering methods and get the corresponding dendrograms for single, complete, average, ward and kmeans clustering. This gives the dendrograms, but doesn't show the calculation-way. My question: is there a possibility to show this calculation steps (cluster steps) in

Re: [R] Clustering

2010-06-23 Thread Tal Galili
Hi Ralph, In case of hclust, the dendrogram does show the steps (they are the heights presented in the graph). You can present them also in a matrix using cutree, for example: dat <- (USArrests) n <- (dim(dat)[1]) hc <- hclust(dist(USArrests)) cutree(hc, k=1:n) You might then visualize the
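
A minimal sketch expanding the `cutree()` example from the reply above. Looking at `hc$merge` and `hc$height` as well is an addition here, offered as one way to see the individual agglomeration steps.

```r
dat <- USArrests
n   <- nrow(dat)
hc  <- hclust(dist(dat))

# One column per number of clusters k = 1..n, one row per observation.
memb <- cutree(hc, k = 1:n)
print(memb[1:5, 1:6])

# The merge table and heights list the steps in the order they happened:
# negative entries are single observations, positive entries are earlier merges.
print(head(cbind(hc$merge, height = hc$height)))
```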

Re: [R] Clustering algorithms don't find obvious clusters

2010-06-14 Thread Henrik Aldberg
Thank you Etienne, this seems to work like a charm. Also thanks to the rest of you for your help. Henrik On 11 June 2010 13:51, Cuvelier Etienne ecuscim...@gmail.com wrote: On 11/06/2010 12:45, Henrik Aldberg wrote: I have a directed graph which is represented as a matrix on the form

Re: [R] Clustering algorithms don't find obvious clusters

2010-06-13 Thread Joris Meys
Henrik, the methods you use are NOT applicable to directed graphs, in the contrary even. They will split up what you want to put together. In your data, an author never cites himself. Hence, A and B are far more different than B and D according to the techniques you use. Please check out

Re: [R] Clustering algorithms don't find obvious clusters

2010-06-12 Thread Henrik Aldberg
Dave, I used daisy with the default settings (daisy(M) where M is the matrix). Henrik On 11 June 2010 21:57, Dave Roberts dvr...@ecology.msu.montana.edu wrote: Henrik, The clustering algorithms you refer to (and almost all others) expect the matrix to be symmetric. They do not seek a

Re: [R] Clustering algorithms don't find obvious clusters

2010-06-12 Thread Dave Roberts
Henrik, Given your initial matrix, that should tell you which authors are similar/dissimilar to which other authors in terms of which authors they cite. In this case authors 1 and 3 are most similar because they both cite authors 2 and 4. Authors 2 and 3 are most different because they

[R] Clustering algorithms don't find obvious clusters

2010-06-11 Thread Henrik Aldberg
I have a directed graph which is represented as a matrix on the form 0 4 0 1 6 0 0 0 0 1 0 5 0 0 4 0 Each row corresponds to an author (A, B, C, D) and the values say how many times this author has cited the other authors. Hence the first row says that author A has cited author B four

Re: [R] Clustering algorithms don't find obvious clusters

2010-06-11 Thread Cuvelier Etienne
On 11/06/2010 12:45, Henrik Aldberg wrote: I have a directed graph which is represented as a matrix on the form 0 4 0 1 6 0 0 0 0 1 0 5 0 0 4 0 Each row corresponds to an author (A, B, C, D) and the values say how many times this author has cited the other authors. Hence the first

Re: [R] Clustering algorithms don't find obvious clusters

2010-06-11 Thread Dave Roberts
Henrik, The clustering algorithms you refer to (and almost all others) expect the matrix to be symmetric. They do not seek a graph-theoretic solution, but rather proximity in geometric or topological space. How did you convert your matrix to a dissimilarity? Dave Roberts Henrik

Re: [R] clustering in R

2010-05-28 Thread Tal Galili
Hi Ayesha, hclust is a way to go (much better then trying to invent the wheel here). Please add what you used to create: distA And create a sample data set to show us what you did, using dput Best, Tal Contact Details:---

Re: [R] clustering in R

2010-05-28 Thread Joris Meys
As Tal said. Next to that, I read that column1 (and column2?) are supposed to be seen as factors, not as numerical variables. Did you take that into account somehow? It's easy to reproduce the error code : n - NULL if(n2)print(This is OK) Error in if (n 2) print(This is OK) : argument is of

Re: [R] clustering in R

2010-05-28 Thread Ayesha Khan
Thanks Tal & Joris! I created my distance matrix distA by using the dist() function in R manipulating my output in order to get a matrix. distA =as.matrix(dist(t(x2))) # x2 being my original dataset as according to the documentation on dist() For the default method, a dist object, or a matrix (of

Re: [R] clustering in R

2010-05-28 Thread Tal Galili
Hi Ayesha, I wish to help you, but without a simple self contained example that shows your issue, I will not be able to help. Try using the ?dput command to create some simple data, and let us see what you are doing. Best, Tal Contact

Re: [R] clustering in R

2010-05-28 Thread Joris Meys
errr, forget about the output of dput(q), but keep it in mind for next time. f = dist(t(q)) hclust(f,method="single") it's as simple as that. Cheers Joris On Fri, May 28, 2010 at 10:39 PM, Ayesha Khan ayesha.diamond...@gmail.com wrote: v <- dput(x,"sampledata.txt") dim(v) q <- v[1:10,1:10] f

Re: [R] clustering in R

2010-05-28 Thread Ayesha Khan
Yes Joris. I did try that and it does produce the results. I am now wondering why I wanted a matrix like structure in the first place. However, I do want 'f' to contain values less than 2 only. But when I try to get rid of values greater than 2 by doing N <- (f[f2], the structure of f is disrupted and hclust

Re: [R] clustering in R

2010-05-28 Thread Ayesha Khan
v <- dput(x,"sampledata.txt") dim(v) q <- v[1:10,1:10] f =as.matrix(dist(t(q))) distB=NULL for(k in 1:(nrow(f)-1)) for( m in (k+1):ncol(f)) { if(f[k,m] < 2) distB=rbind(distB,c(k,m,f[k,m])) } #now distB looks like this distB [,1] [,2] [,3] [1,] 1 2 1.6275568 [2,] 1 3

Re: [R] clustering in R

2010-05-28 Thread Ayesha Khan
I assume my matrix should look something like this?.. round(distance, 4) P00A P00B M02A M02B P04A P04B M06A M06B P08A P08B M10A P00B 0.9678 M02A 1.0054 1.0349 M02B 1.0258 1.0052 1.2106 P04A 1.0247 0.9928 1.0145 0.9260 P04B 0.9898 0.9769 0.9875 0.9855 0.6075 M06A 1.0159

Re: [R] clustering in R

2010-05-28 Thread Joris Meys
I can't run your code. Please, just give me whatever comes on your screen when you run: dput(q) On Fri, May 28, 2010 at 10:57 PM, Ayesha Khan ayesha.diamond...@gmail.comwrote: I assume my matrix should look something like this?.. round(distance, 4) P00A P00B M02A M02B P04A

Re: [R] clustering in R

2010-05-28 Thread Joris Meys
Ah OK, I didn't get your question then. a dist-object is actually a vector of numbers with a couple of attributes. You can't just cut out values like that. The hclust function needs a perfect distance matrix to use the calculations. shortcut is easy : just do f - f/2*max(f), and all values are
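
A minimal sketch of the point made above that a dist object is a numeric vector plus attributes, so subsetting it by value no longer yields something `hclust()` can use. The data, the threshold of 2 and the use of `cutree(h = 2)` as an alternative are assumptions for illustration; they are not the rescaling shortcut suggested in the reply.

```r
# Ten observations from a built-in data set stand in for the poster's data.
f <- dist(scale(USArrests)[1:10, ])

# Subsetting by value drops the class and the "Size" attribute.
kept <- f[f < 2]
print(class(f))      # "dist"
print(class(kept))   # plain "numeric", no longer usable by hclust()

# One alternative: cluster on the complete distances, then cut the tree at
# height 2 so that groups are only joined below that distance.
hc     <- hclust(f, method = "single")
groups <- cutree(hc, h = 2)
print(groups)
```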

[R] clustering in R

2010-05-27 Thread Ayesha Khan
i have a matrix with the following dimensions 136 3 and it looks something like [,1] [,2] [,3] [1,] 402 675 1.802758 [2,] 402 696 1.938902 [3,] 402 699 1.994253 [4,] 402 945 1.898619 [5,] 424 470 1.812857 [6,] 424 905 1.816345 [7,] 470 905 1.871252

[R] Clustering with clara

2010-01-14 Thread pacomet
Hello everyone I am trying to use CLARA method for finding clusters in my spatial surface temperature data and noticed one problem. My data are in the form lat,lon,temperature. I extract lat,lon and cluster number for each point in the dataset. When I plotted a map of cluster numbers I found

Re: [R] Clustering with clara

2010-01-14 Thread Christian Hennig
Dear Paco, as far as I know, there is no such problem with clara, but I may be wrong. However, in order to help you (though I'm not sure whether I'll be able to do that), we'd need to understand precisely what you were doing in R and how your data looks like (code and data; you can show us a

Re: [R] Clustering for Ordinal data

2009-10-15 Thread Dylan Beaudette
On Wednesday 14 October 2009, Paul Evans wrote: Hi, I just wanted to check whether there is a clustering package available for ordinal data. My data looks something like: #1 #2 #3 #4. A B C D... D B C A... D C A A... where each column represents a sample, and each row some ordinal

[R] Clustering for Ordinal data

2009-10-14 Thread Paul Evans
Hi, I just wanted to check whether there is a clustering package available for ordinal data. My data looks something like: #1 #2 #3 #4. A B C D... D B C A... D C A A... where each column represents a sample, and each row some ordinal values. I would like to cluster such that similar samples

[R] Clustering with R - efficient processing of large sparse data sets (text data)

2009-09-27 Thread dataguru
I checked the R procedure HCLUST (hierarchical clustering) but it looks like it requires a full triangular n x n similarity matrix as input, where n = number of observations. The number of variables is 200. My data set has n = 50,000 observations (keywords), and I use ad-hoc similarity measures,

[R] Clustering within part of a cluster result

2009-07-09 Thread Albert Vernon Smith
How can I cluster and order within part of a previous clustering result? For example, I am clustering and ordering results as follows: rows <- 30 cols <- 3 x <- matrix(sample(-1:1,rows*cols,replace=T), nrow=rows, ncol=cols,dimnames=list(c(paste(R,1:rows,sep=)),c(paste(C,1:cols,sep= x

Re: [R] clustering, don't understand this error

2009-04-16 Thread Christian Hennig
Hi there, I'm travelling right now so I can't really check this but it seems that the problem is that cluster.stats needs a partition as input. hclust doesn't give you a partition but you can generate one from it using cutree. BTW, rather use - than =. Best wishes, Christian On Wed, 15
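
A minimal sketch of the fix suggested above: turn the `hclust` tree into a partition with `cutree()` before passing it to `cluster.stats()`. The `bupa` data from the question is not available here, so `iris` and `k = 3` are stand-in assumptions, and the fpc call only runs if that package is installed.

```r
d  <- dist(iris[, 1:4])
hc <- hclust(d)

# cluster.stats() wants one cluster label per observation, not the tree itself.
part <- cutree(hc, k = 3)
print(table(part))

# Dunn index via fpc, guarded because fpc is not part of base R.
if (requireNamespace("fpc", quietly = TRUE)) {
  print(fpc::cluster.stats(d, part)$dunn)
}
```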

[R] clustering, don't understand this error

2009-04-15 Thread Ana M Aparicio Carrasco
Hello, I am using the dunn metric, but something is wrong and I don't understand what this error means. Please can you help me with this? The instructions are: #Indice de Dunn disbupa=dist(bupa[,1:6]) a=hclust(disbupa) cluster.stats(disbupa,a,bupa[,7])$dunn And the error is:

Re: [R] Clustering with Mahalanobis Distance

2008-12-10 Thread Wayne F
I don't have any experience with your particular problem, but the thing I notice about mahalanobis is that by default you specify a covariance matrix, and it uses solve to calculate its inverse. If you could supply the inverse covariance matrix (and specify inverted=TRUE to mahalanobis), that
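
A minimal sketch of the `inverted = TRUE` suggestion above: invert the covariance matrix once and reuse it, instead of letting `mahalanobis()` call `solve()` each time. The simulated data and its dimensions are assumptions; the poster's 6525 x 17 matrix is not available.

```r
set.seed(42)
x <- matrix(rnorm(200 * 5), ncol = 5)  # simulated stand-in data

center <- colMeans(x)
S_inv  <- solve(cov(x))                # invert the covariance once

# Reuse the precomputed inverse for every call.
d2 <- mahalanobis(x, center, S_inv, inverted = TRUE)

# Same result as the default call, which inverts cov(x) internally.
d2_ref <- mahalanobis(x, center, cov(x))
print(all.equal(d2, d2_ref))
```

The saving matters when `mahalanobis()` is called repeatedly, for example once per observation as the centre when building a full pairwise distance matrix.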

[R] Clustering with Mahalanobis Distance

2008-12-08 Thread Richardson, Patrick
Dear R ExpeRts, I'm having memory difficulties using mahalanobis distance to cluster in R. I was wondering if anyone has done it with a matrix of 6525x17 (or something similar to that size). I have a matrix of 6525 genes and 17 samples. I have my R memory increased to the max and

[R] Clustering and functions

2008-11-08 Thread Bryan Richardson
I am new to R and have written a function that clusters on subsets of a big data set with 60,000 points. I am not sure why, but I keep getting a run-time error. Any suggestions would be greatly appreciated. Here is the code: library(cba) d <- read.csv("data.csv", header=TRUE)

Re: [R] Clustering and functions

2008-11-08 Thread Sarah Goslee
It would help a lot if you told us what the error message was, and provided some data to work with. As it is, we can't even run the function to find out what goes wrong. And also, OS, version of R - all that stuff that the posting guide requests. Sarah On Sat, Nov 8, 2008 at 10:31 AM, Bryan

[R] Clustering In R. (rookie)

2008-11-04 Thread paul murima
Hi all. I have a large microarray data set that I would like to analyze using hierarchical clustering. The problem is when I use the command below, hc <- hclust(dist(array), "ave") I get this feedback... Error in as.vector(x, mode) : cannot coerce type 'closure' to vector of type 'any' Can some
