Re: [R] confusion matrix - better code?

2007-09-07 Thread Wolfgang Huber
Dear Monica,

try this:

> cm = table(tr, pr)
> cm
   pr
tr  1 2 3
  1 2 1 0
  2 2 1 0
  3 0 0 3
  4 0 1 0

> rowSums(cm)
> colSums(cm)
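
A minimal sketch extending the table() idea, assuming the goal is the full square matrix with totals as in the quoted code below. Using factors with a shared set of levels keeps the class that is never predicted (4) as an all-zero column; the names lev and cm_sums are only illustrative.

```r
# Data from the quoted message
tr <- c(1, 2, 2, 3, 3, 3, 2, 4, 1, 1)
pr <- c(1, 2, 1, 3, 3, 3, 1, 2, 1, 2)

# Shared levels so that rows and columns cover the same classes
lev <- sort(union(tr, pr))
cm <- table(tr = factor(tr, levels = lev),
            pr = factor(pr, levels = lev))

# addmargins() appends a "Sum" row and a "Sum" column
cm_sums <- addmargins(cm)
print(cm_sums)
```

The last row and column of cm_sums should then play the role of the 5th row / col in the original loop-based code.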

   Best wishes
Wolfgang Huber

Monica Pisica wrote:
 Hi,
  
 I've written some code to obtain a confusion matrix when the true 
 classification and the predicted classification are known. Suppose the true 
 classification is called 'tr' and the predicted classification is 'pr'. I have 4 
 classes in tr, but only 3 classes out of 4 are predicted in 'pr'. Following 
 is my code, but it looks quite 'clunky' to me. I wonder if you have any 
 suggestions to improve it.
  
 Thanks,
  
 Monica
  
 -
  
 tr <- c(1,2,2,3,3,3,2,4,1,1)
 pr <- c(1,2,1,3,3,3,1,2,1,2)
 dat <- data.frame(tr, pr)
 class <- c(1:length(tr))
 m <- max(c(length(unique(tr)), length(unique(pr))))
 for (i in 1:length(class)) {
   class[i] <- sub(' ', '', paste(dat[i,1], dat[i,2])) }
 dat <- data.frame(dat, class)
 mat <- matrix(0, nrow=m, ncol=m)
 for (i in 1:m) {
   for (j in 1:m) {
     mat[i,j] <- sub(' ', '', paste(i, j))
 }}
 cat <- matrix(0, nrow=(m+1), ncol=(m+1))
 for (i in 1:m) {
   for (j in 1:m) {
     cat[i,j] <- nrow(dat[dat$class==mat[i,j],])
 }}
 for (i in 1:m) {
   cat[(m+1),i] <- sum(cat[1:m,i])
   cat[i,(m+1)] <- sum(cat[i,1:m])
   cat[(m+1),(m+1)] <- sum(cat[1:m,(m+1)])
 }
 cat
      [,1] [,2] [,3] [,4] [,5]
 [1,]    2    1    0    0    3
 [2,]    2    1    0    0    3
 [3,]    0    0    3    0    3
 [4,]    0    1    0    0    1
 [5,]    4    3    3    0   10
  
 The 5th row / col represents the sum on each row / col respectively.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Colours in R, was: deldir package - voronoi

2007-08-12 Thread Wolfgang Huber
Dear Rolf and Binabina

perhaps this is of use to some:

Colour for Presentation Graphics. Ross Ihaka.
www.stat.auckland.ac.nz/~ihaka/colour/color.pdf

Choosing Color Palettes for Statistical Graphics
Achim Zeileis and Kurt Hornik.
eeyore.ucdavis.edu/stat250/epub-wu-01_abd.pdf
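
For the gradient fill asked about in the quoted message below, a rough sketch in base R might look as follows. It assumes one numeric value per cell (the vector vals is made up) and maps each value to one of 200 grey shades with colorRampPalette() and cut(); how the resulting colours are passed to the plotting function is not shown and would need checking against the package documentation.

```r
# Hypothetical per-cell values to be shown as a gradient
vals <- c(0.1, 0.5, 0.9, 0.3)

# 200 shades from white to black; other end colours work the same way
pal <- colorRampPalette(c("white", "black"))(200)

# Bin each value into one of 200 equal-width intervals, then look up its shade
idx  <- cut(vals, breaks = 200, labels = FALSE)
cols <- pal[idx]
print(cols)
```

A vector like cols, with one colour per cell, is the kind of input a per-cell colour argument such as the polycol mentioned below would presumably expect.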

  Best wishes
Wolfgang

--
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber


Rolf Turner wrote:
 On 12/08/2007, at 1:22 PM, zubin wrote:
 
 Hello!

 I am using the deldir package to visualize my data, pretty neat.
 However, I need to fill the colors using polycol =, fill with colors
 like a heatmap - more of a gradient fill.   The only colors I get are
 very blocky - how do I assign the correct colors for a gradient, even a
 grayscale?  I tried the chart of R colors, using 200 numbers for
 grayscale but am not getting them.  The polycol = colors the cells in the
 tessellation with the value of a specific vector for color.
 
   <snip>
 
 I'm sorry, but I can't help here.  I've been struggling with colors, in a
 different context, recently myself, and I'm unclear as to how they work.
 There are a bunch of functions --- palette(), colorRamp(),
 colorRampPalette() --- that probably relate to what you want to do, but
 I'm not sure just *how* they relate.
 
 With a bit of luck, someone cleverer than I will come to your rescue.
 
   cheers,
 
   Rolf Turner
 


Re: [R] sorting with criteria that are out of order

2007-03-19 Thread Wolfgang Huber
Dear Niels,

you can convert the columns (which are apparently character vectors) 
into ordered factors and then sort / order on these.

> a
 [1] "b" "a" "a" "a" "c" "a" "a" "b" "b" "c" "b" "a" "c" "c"
[15] "c" "c" "a" "a" "a" "c"

> b = ordered(a, levels=c("b", "a", "c"))

> b
 [1] b a a a c a a b b c b a c c c c a a a c
Levels: b < a < c

> sort(b)
 [1] b b b b a a a a a a a a a c c c c c c c
Levels: b < a < c
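
Applied to several columns at once, the same idea could be sketched as below. The data frame d is a made-up subset of the quoted table, and the column names V1 to V3 are assumptions; order() sorts factors by their level order, so the custom level vectors define the sort criteria.

```r
# Small made-up subset of the quoted table; column names are hypothetical
d <- data.frame(
  V1 = c("RAR", "CM", "CM", "RAR", "CM", "CM"),
  V2 = c("BARBY", "BARBY", "BONBON", "BONBON", "BONBON", "BONBON"),
  V3 = c("STANDARD", "INCREASED", "STANDARD", "REDUCED", "REDUCED", "INCREASED"),
  stringsAsFactors = FALSE
)

# Each factor's levels are listed in the desired sort order
o <- order(
  factor(d$V1, levels = c("CM", "RAR")),
  factor(d$V2, levels = c("BONBON", "BARBY")),
  factor(d$V3, levels = c("REDUCED", "STANDARD", "INCREASED"))
)

sorted <- d[o, ]
print(sorted)
```

Ties in the first column are broken by the second, and then by the third, which is how order() treats multiple keys.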


Best wishes
   Wolfgang

--
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber

Niels Steen Krogh wrote:
 I try to sort this dataframe: 
       [,1] [,2]   [,3]      [,4] [,5] [,6] [,7] [,8]
  [1,] CM   BARBY  INCREASED    0    2    0    1    1
  [2,] CM   BARBY  REDUCED      0    1    2    2    5
  [3,] CM   BARBY  STANDARD    93   51   56   41   77
  [4,] CM   BONBON INCREASED   43   30   39   32   58
  [5,] CM   BONBON REDUCED      4    3    6    4   10
  [6,] CM   BONBON STANDARD   200  141  127   73  134
  [7,] RAR  BARBY  INCREASED    4    1    3    1    5
  [8,] RAR  BARBY  REDUCED      5    7    8    9   16
  [9,] RAR  BARBY  STANDARD   571  286  314  270  467
 [10,] RAR  BONBON INCREASED   49   92  108  154  240
 [11,] RAR  BONBON REDUCED     11    9    5    6   18
 [12,] RAR  BONBON STANDARD   978  627  571  324  541
 
 
 I want the sorting criteria: 
 Column 1: CM before RAR
 Column 2: BONBON before BARBY
 Column 3: REDUCED before STANDARD before INCREASED
 
 Have played with the order function but without being able to sort out how
 to sort using information in column three. 
 
 /Niels 
 
 Niels Steen Krogh
 Konsulent
 ZiteLab ApS 
 
 Mail: -- [EMAIL PROTECTED]
 Telefon: --- +45 38 88 86 13
 Mobil: - +45 22 67 37 38
 Adresse: --- ZiteLab ApS 
  Solsortvej 44
  2000 F.
 Web: --- www.zitelab.dk
 
 ZiteLab
 -Let's Empower Your Data with Webservices




Re: [R] count the # of appearances...

2007-03-01 Thread Wolfgang Huber
Dear Bunny,

? table

might be what you wish.
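
A small sketch of how table() could answer the second part of the question (elements that occur exactly x times); the vector v and the value of x are made up for illustration.

```r
# Hypothetical input vector
v <- c("a", "b", "a", "c", "b", "a", "d")

# table() counts how often each distinct element occurs
counts <- table(v)
print(counts)

# Elements that occur exactly x times
x <- 2
exactly_x <- names(counts)[counts == x]
print(exactly_x)
```

Note that names() returns character strings, so for a numeric vector the result would need converting back with as.numeric().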

  Best wishes
  Wolfgang Huber
  EBI

bunny , lautloscrew.com wrote:
 Hi there,
 
 is there a possibility to count the number of appearances of an  
 element in a vector ?
 i mean of any given element.. deliver all elements which are exactly  
 xtimes in this vector ?
 
 thx in advance !!



Re: [R] Advice on visual graph packages

2007-02-13 Thread Wolfgang Huber
Hi Jarrett,

would the coercion methods for the graph class, provided by the 
package of the same name at Bioconductor, be useful for doing what you 
want? This is the same class that Rgraphviz also works on. Try

library(graph)
example("graphNEL-class")
as(gR, "matrix")

class ? graph
class ? graphNEL
? toGXL

There is a rich set of methods for setting and accessing node and edge 
attributes, and it is straightforward in R to convert into any other 
representation you like. See the vignette "Attributes for Graph 
Objects". I am looking at version = 1.13.6 of the package as I write this.

  Best wishes

-- 
--
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber

 Hey, all.  I'm looking for packages that are good at two things
 
 1) Drawing directed graphs (i.e nodes and edges), both with single  
 and double headed arrows, as well as allowing for differences in line  
 width and solid versus dashed.  Note: I've tried Rgraphviz here, but  
 have run into some problems (which seem fixable and I may go with it  
 in the end), and it doesn't satisfy need # 2 (which would be ideal if  
 there is a package that does both).
 
 2) Allowing a user to create a directed graph, and have some text  
 object created that can easily be reprocessed into a  
 matrix representation, or other representation of my choosing.   I've  
 tried dynamicGraph, but it seems buggy: it continually either  
 crashes or behaves very erratically (nodes disappearing when I modify  
 edges), and it is not clear from the UI how one outputs a new graph, nor  
 how one even accesses many graph attributes.  This may be my own  
 ignorance on the latter.
 
 Do you have any suggestions?
 
 Thanks!
 
 -Jarrett
 




Re: [R] Near function?

2007-02-11 Thread Wolfgang Huber
Dear Bart,

hclust might be useful for this as well:

> dat = c(1,20,2,21)

> hc = hclust(dist(dat))

> thresh = 2
> ct = cutree(hc, h=thresh)

> clusteredNumbers = split(dat, ct)
> firstOne = dat[!duplicated(ct)]

> clusteredNumbers
$`1`
[1] 1 2

$`2`
[1] 20 21


> firstOne
[1]  1 20
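
If a plain loop is preferred over clustering, one possible order-preserving alternative is sketched below. It is not the hclust approach above: it keeps an element only if it is not within th of any element already kept, which is one reading of "near" in the quoted question. The function name drop_near is made up.

```r
# Keep x[i] only if it differs by at least th from every element kept so far
drop_near <- function(x, th) {
  keep <- logical(length(x))          # all FALSE to start with
  for (i in seq_along(x)) {
    # x[keep] holds the elements accepted so far (empty on the first pass)
    keep[i] <- !any(abs(x[i] - x[keep]) < th)
  }
  x[keep]
}

print(drop_near(c(1, 20, 2, 21), 2))
```

Because nothing is deleted from x inside the loop, the indices never shift, which avoids the skipping behaviour described in the quoted message.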


  Best wishes
   Wolfgang


 
 I have an integer vector which is extracted from a dataframe, which is sorted by 
 another column of the dataframe.
 Now I would like to remove some elements of the vector which are near to 
 others by their value. For example: c(1,20,2,21) should become c(1,20).
 
 I tried to write a function, but for some reason, something doesn't work
 
 x <- 1:20
 near <- function(x,th) {
 nr <- NROW(x)
 for (i in 1:(nr-1)){
 for (j in (i+1):nr){
 if (j > nr) break
 t=0
 if (abs(x[i] - x[j]) < th) t = 1
 if (t== 1) x <- x[-j]
 if (t== 1) nr <- nr-1
 if (t== 1) j <- (j-1)
 cat ("i",i, "j",j,"\n")
 }} 
 x
 }
 near(x,10)
 
 
 This gives you 1  3  7 13 17 while I was expecting 1, 20 as the outcome.
 If you look at the intermediate results of the cat instruction, you see that, 
 after it removes a number, it skips the next one.
 
 Sorting the integer vector is not an option, the order is important.
 I used the integers 1:20 as an example; x <- sample((1:20),20) is 
 maybe a bit more representative of our data, but would not make the 
 output of the function reproducible.
 
 Maybe there is already an R function which does such a thing, or what is wrong 
 with my coding?
 
 
 thanks a lot for your time
 
 
 Bart
   [[alternative HTML version deleted]]
 


-- 
--
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber



Re: [R] Error in heatmap()

2007-01-20 Thread Wolfgang Huber
Dear Yuhong,

heatmap deals gracefully with sparse occurrences of NA in the matrix, but 
will fail if whole rows or columns are NA. Try preprocessing your xx as 
follows:

xx = xx[rowSums(!is.na(xx))!=0, colSums(!is.na(xx))!=0]
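
To see what this filter does, here is a toy example with a made-up 3 x 3 matrix in which the second row and the second column are entirely NA. The names keep_rows, keep_cols and xx_clean are illustrative, and drop = FALSE is added only so the result stays a matrix even if a single row or column survives.

```r
# Toy matrix: row 2 and column 2 contain only NA
xx <- matrix(c( 1, NA,  3,
               NA, NA, NA,
                4, NA,  6), nrow = 3, byrow = TRUE)

# A row/column is kept if it has at least one non-NA entry
keep_rows <- rowSums(!is.na(xx)) != 0
keep_cols <- colSums(!is.na(xx)) != 0

xx_clean <- xx[keep_rows, keep_cols, drop = FALSE]
print(xx_clean)
```

Rows or columns that still contain some NA values are kept by this filter; only the completely empty ones are dropped.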

  Best wishes
   Wolfgang

--
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber

 Hi,
 
 I run into the following error when using heatmap() for data matrix xx.
 Any help is appreciated. xx contains many NAs.
 
 hv <- heatmap(data.matrix(xx))
 
 Error in hclustfun(distfun(if (symm) x else t(x))) : 
 NA/NaN/Inf in foreign function call (arg 11)
 
 Thanks a lot.
 
 Yuhong




[R] Advanced course R programming and Bioconductor in Cambridge UK 30.3.+1.4.

2007-01-15 Thread Wolfgang Huber
Dear R users and developers,

Seth Falcon and Martin Morgan are teaching

Advanced R Programming and Bioconductor 30 March - 1 April 2007 at the
Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

This two-day course focuses on programming skills required to develop
software for statistical analysis of high-throughput genomic data using
R and Bioconductor.  Lectures and practical sessions introduce the
diversity of packages already available for analysis of data, and
present the tools and techniques participants need for effectively
implementing their own analyses.

Closing date for applications: 7th February 2007

Further information and details of how to apply can be found on our
website: www.wellcome.ac.uk/advancedcourses and
http://www.wellcome.ac.uk/doc_WTX035299.html



Best wishes
--
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber



[R] Re: [Rd] (no subject)

2004-10-31 Thread Wolfgang Huber
Dear Fang Lai,

you have sent your mail to both r-devel and r-help. Please do not do this,
but choose one. Cross-posting just creates unnecessary and unpleasant
junk mail for many people.

Furthermore, neither the r-devel nor the r-help mailing list is intended
as a replacement for taking a basic statistics course or reading the
software manuals - rather, as a supplement and last resort. The answers are
provided by unpaid voluntary contributors, who appreciate it if you
yourself also make at least a minimal effort before turning to the mailing
list.

Best wishes
 Wolfgang

-
Wolfgang Huber
European Bioinformatics Institute
European Molecular Biology Laboratory
Cambridge CB10 1SD
England
Phone: +44 1223 494642
Http:  www.dkfz.de/abt0840/whuber
-

Fang Lai wrote:
 Dear all,
  I have several questions regarding fisher.test() in
 R, and I'd highly appreciate any help with it.
  I have a group of observations, each having people's
 income, and an indicator of whether selected in or out
 a program. I want to test the difference between
 income of people who are in and out.
  Because the distribution is far from normal, I decided
 to use Fisher's exact test, using either the mean or
 the rank as the statistic.
  Question 0 is: Can I do this test using fisher.test()
 in R?
 If so,
  My first question is: Does fisher.test() offer an
 option to choose the statistic? Actually it is not
 clear to me from the help which statistic it uses.
 Does it just compare the mean of people in and out of
 the program?
  My second question is: when the group is large, I
 always receive a warning message such as "Fisher exact
 result might not be right" when I set hybrid=T.
 When I set hybrid=F, it does return a result of
 p-value without warning message. I wonder if this
 p-value is reliable or not. And, how does it get the
 approximation of p-value when hybrid=F? Ideally, it
 should randomly draw, say 1000 times, from the full
 set of permutations of the assignment, and get an
 approximate p-value--is this the way it works in
 fisher.test( ) in R? If not, does it use another test,
 or some other measure of approximation?
  My last question is: when the group is small enough,
 will it calculate the exact probabilities even if
 hybrid=F?
 Many thanks,

 Fang

 =
 Lai, Fang

 PhD candidate
 University of California, Berkeley
 Department of Agricultural and Resource Economics
 314 Giannini Hall, Berkeley, CA 94720-3310
 tel: (510) 643 - 5421(O)
  (510) 847 - 9811(Cell)
 fax: (510) 643 - 8911
 email: [EMAIL PROTECTED]
 http://www.are.berkeley.edu/jobmarket/fang.html

 __
 [EMAIL PROTECTED] mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel

