Re: [R] 'matplot' for matrix with NAs: broken lines

2010-05-06 Thread Tao Shi

I just found out that my "does this by default" statement (by which I meant 
the ability to automatically connect two points with an NA between them in a 
time series) is wrong!  Actually, none of the plotting functions, i.e. plot, 
matplot and xyplot, plot NAs or draw a line across them.  The solution I came 
up with is to convert the data to a long table, remove the NAs, and then use 
xyplot.  See the example below:

set.seed(1234)
a=b=matrix(rnorm(9), 3,3)
b[2,2]=NA
matplot(a, type="b")
matplot(b, type="b")  ## I want the two "2"s connected!
matplot(b, type="l")  ## Now my data for the second column are missing from the graph


## my solution
library(lattice)
tmp1 <- data.frame(g=rep(1:3,each=3), x=rep(1:3,3),  y=c(b))
xyplot(y~x, groups=g, data=tmp1, type="b", pch=c(1,2,3))  ## there is still no line connecting the two "2"s.

tmp2 <- tmp1[!is.na(tmp1$y),]
xyplot(y~x, groups=g, data=tmp2, type="b")  
## This is what I want, because it's easier for me to keep track of both the trend 
## and the missing values.  The original post was really asking whether a simple 
## change of some parameters in matplot can do this.  Now, I guess not.
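
For anyone who would rather stay in base graphics, here is a minimal sketch of the same idea (it is not from the thread): draw unbroken lines through interpolated values, then overlay symbols only where data were actually observed, so the gaps stay visible. The helper fill_na() is a made-up name for illustration, and a column with fewer than two observed values is left untouched, so it shows up as points only.

```r
set.seed(1234)
b <- matrix(rnorm(9), 3, 3)
b[2, 2] <- NA

# fill_na() is a hypothetical helper (not part of base R or zoo):
# linearly interpolate interior NAs; a column with fewer than two
# observed values cannot be interpolated and is returned unchanged.
fill_na <- function(v) {
  ok <- !is.na(v)
  if (sum(ok) < 2) return(v)
  approx(which(ok), v[ok], xout = seq_along(v))$y
}

b_filled <- apply(b, 2, fill_na)

x <- seq_len(nrow(b))
matplot(x, b_filled, type = "l", lty = 1, col = 1:3)  # unbroken lines
matpoints(x, b, pch = 1:3, col = 1:3)                 # symbols only at observed values
```

The missing observation is then the one place where a line passes through with no symbol on it.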

...Tao



 From: maech...@stat.math.ethz.ch
 Date: Thu, 6 May 2010 18:34:22 +0200
 To: shi...@hotmail.com
 CC: ggrothendi...@gmail.com; r-help@r-project.org
 Subject: Re: [R] 'matplot' for matrix with NAs: broken lines

 TS == Tao Shi 
 on Wed, 5 May 2010 20:11:26 + writes:

 TS Thanks, Gabor!  So, there is no way I can change some graphic parameters 
 in 'matplot' to get this?


 TS I forgot to mention that I purposely use type="b", so I know where the 
 missing data are.  With imputed data, either using "b" or "l", there is no 
 way to keep track of NAs.  Plus, in my real data sometimes there is only one 
 non-missing value in a particular column and na.approx can't work (well I 
 could selectively impute the NAs ... )

 TS So far, my best solution to this is to use xyplot.  It does this by 
 default, but of course I need some data manipulation first.

 "does this by default" meaning what?
 I don't think it does impute missing, does it?

 Can you elaborate, using your example (below)?

 I found Gabor's answer appropriate,
 I really cannot see why matplot() should behave differently here...

 

 Martin Maechler




 TS 
 From: ggrothendi...@gmail.com
 Date: Wed, 5 May 2010 15:45:44 -0400
 Subject: Re: [R] 'matplot' for matrix with NAs: broken lines
 To: shi...@hotmail.com
 CC: r-help@r-project.org

 Try this:

 library(zoo)
 matplot(na.approx(b), type = "l")

 On Wed, May 5, 2010 at 2:30 PM, Tao Shi wrote:

 Hi list,

 I know that points involving NAs are not plotted in 'matplot', but when I 
 plot them as lines, I still want the lines to connect all the points (i.e. 
 not broken where there are NAs). Please see the example below. How can I 
 achieve this in 'matplot'? If I can't, any good alternatives so I don't 
 have to use 'plot' + 'lines' and loop through all the columns.

 Many thanks!

 ...Tao

 set.seed(1234)
 a=b=matrix(rnorm(9), 3,3)
 b[2,2]=NA
 matplot(a, type="b")
 matplot(b, type="b")  ## I want the two "2"s connected!
 matplot(b, type="l") ## Now my data for the second column are missing from the graph


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] 'matplot' for matrix with NAs: broken lines

2010-05-06 Thread Tao Shi

Thanks for the suggestion!


 Date: Thu, 6 May 2010 13:40:04 -0700
 Subject: Re: [R] 'matplot' for matrix with NAs: broken lines
 From: djmu...@gmail.com
 To: shi...@hotmail.com
 CC: maech...@stat.math.ethz.ch; r-help@r-project.org

 Hi:

 If you intend to use your preferred solution, then I would suggest that you 
 increase the size of the plotted points relative to the thickness of the 
 adjoining lines; in your last line of code, something like

 xyplot(y~x, groups=g, data=tmp2, type="b", cex = 2, pch = 16)

 This way, it will be easier to spot where data values are missing.

 HTH,
 Dennis


[R] 'matplot' for matrix with NAs: broken lines

2010-05-05 Thread Tao Shi

Hi list,

I know that points involving NAs are not plotted in 'matplot', but when I plot 
them as lines, I still want the lines to connect all the points (i.e. not 
broken where there are NAs).  Please see the example below.  How can I achieve 
this in 'matplot'?  If I can't, any good alternatives so I don't have to use 
'plot' + 'lines' and loop through all the columns.

Many thanks!

...Tao

 set.seed(1234)
 a=b=matrix(rnorm(9), 3,3)
 b[2,2]=NA
 matplot(a, type="b")
 matplot(b, type="b")  ## I want the two "2"s connected!
 matplot(b, type="l")  ## Now my data for the second column are missing from the graph
  
  


Re: [R] Read data from .csv file as a matrix

2010-05-05 Thread Tao Shi

Vincent,

The root of this problem seems to be that you don't fully understand the 
differences between matrix and data.frame.  Read up on them and you'll know how 
to solve this problem.

For now:

as.matrix(temp[,-1])

or 
temp = read.csv("Weather.csv", sep=",", row.names=1)
temp1 <- as.matrix(temp) 

should work.
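
To make the difference concrete, here is a small self-contained sketch. The CSV text below is a stand-in rebuilt from the table quoted in the question (the real Weather.csv is not available, so its exact header line is an assumption). It shows why as.matrix alone "did not help" (the label column turns the whole matrix into character) and how moving the labels into the row names gives a numeric matrix.

```r
# Stand-in for Weather.csv, reconstructed from the printed table (assumed layout)
csv_text <- "X,X1.Jan,X2.Jan,X3.Jan,X4.Jan
Min,2,3,4,1
Max,6,10,8,6
Forecast Min,3,1,1,3
Forecast Max,8,7,4,9"

# Reading the labels as an ordinary column: as.matrix() gives a character matrix
raw <- read.csv(text = csv_text)
is.character(as.matrix(raw))

# Using the first column as row names leaves only numeric columns
temp <- read.csv(text = csv_text, row.names = 1)
m <- as.matrix(temp)
is.numeric(m)

# Mean of the "Max" row over 1 Jan and 2 Jan: (6 + 10) / 2
mean(m["Max", c("X1.Jan", "X2.Jan")])
```

Indexing by row and column names avoids the off-by-one that appears once the label column is gone.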


...Tao






 I have a csv file that contains weather observation (rows) by days (in
 columns). 
 
 I open using: 
 
 temp = read.csv("Weather.csv", sep=",") 
 
 and read: 
 
              X X1.Jan X2.Jan X3.Jan X4.Jan 
 1          Min      2      3      4      1 
 2          Max      6     10      8      6 
 3 Forecast Min      3      1      1      3 
 4 Forecast Max      8      7      4      9 
 
 If I type 
 
 mean(temp[2,2:3]) 
 
 I get 
 
 X1.Jan X2.Jan 
      6     10 
 
 The same command on 
 
 y = matrix(1:21, ncol=7) 
 
 mean(y[2,2:3]) 
 [1] 6.5 
 
 Works because the data is in a matrix. I believe R stores the data from the
 csv file as a data.frame with these annoying headers. So how do I convert
 the data from my csv file into a matrix? 
 
 I tried as.matrix but it did not help. 
 

  


[R] What is the best way to have R output tables in an MS Word format?

2010-05-05 Thread Tao Shi

Hi Max,


It looks like most of the answers were directed at the statisticians you work 
with (i.e. R -> Word).  For yourself, if you are just concerned with converting 
the PDF reports from your statisticians to Word, here is another link with a 
more comprehensive review, in addition to the two online apps Prof. Harrell 
mentioned on his webpage.
http://www.freewaregenius.com/2010/03/06/how-to-convert-pdf-to-word-doc-for-free-a-comparative-test/

Also, Adobe Acrobat 9.0 can do PDF -> Word, but I haven't tried it personally.


...Tao

On Fri, Apr 30, 2010 at 5:13 PM, Max Gunther max.gunt...@vanderbilt.edu wrote:

 Dear R list,

 Our statisticians usually give us results back in a PDF format. I would
 like to be able to copy and paste tables from R output directly into a
 Microsoft Word table since this will save us tons of time, be more accurate
 to minimize human copying errors and help us update data in our papers more
 easily.

 Do people have suggestions for the best way to do this?

 I am a novice to R but I do work with a couple of
 very knowledgeable statisticians who do most of the heavy statistical
 lifting for our research group.

 Many thanks,
 Max


 Max Gunther, PhD

 Vanderbilt University - Radiology
 Institute of Imaging Sciences - VUIIS
 Center for Health Services Research
 Nashville, TN www.ICUdelirium.org


  


Re: [R] 'matplot' for matrix with NAs: broken lines

2010-05-05 Thread Tao Shi

Thanks, Gabor!  So, there is no way I can change some graphic parameters in 
'matplot' to get this?


I forgot to mention that I purposely use type="b", so I know where the missing 
data are.  With imputed data, either using "b" or "l", there is no way to keep 
track of NAs.  Plus, in my real data sometimes there is only one non-missing 
value in a particular column and na.approx can't work (well I could selectively 
impute the NAs ... )

So far, my best solution to this is to use xyplot.  It does this by default, 
but of course I need some data manipulation first.


...Tao




 From: ggrothendi...@gmail.com
 Date: Wed, 5 May 2010 15:45:44 -0400
 Subject: Re: [R] 'matplot' for matrix with NAs: broken lines
 To: shi...@hotmail.com
 CC: r-help@r-project.org

 Try this:

 library(zoo)
 matplot(na.approx(b), type = "l")



Re: [R] Can't load doSMP from REvolutionR in regular R2.11.0

2010-04-29 Thread Tao Shi

Hi David,

Thank you for the reply!  Do you know where I can find the source code for 
these packages?  I can give it a try.

...Tao



 From: da...@revolution-computing.com
 Date: Thu, 29 Apr 2010 08:59:08 -0700
 Subject: Re: [R] Can't load doSMP from REvolutionR in regular R2.11.0
 To: shi...@hotmail.com
 CC: r-help@r-project.org

 We haven't tested doSMP with the mingw compiler (hence why we haven't
 yet submitted it to CRAN). We compiled it under R 2.10 using the same
 Intel compilers we use for REvolution R. It is open source (GPL) so
 you're welcome to try compiling it under mingw yourself, but we can't
 offer support for that configuration.

 # David Smith

 On Wed, Apr 28, 2010 at 5:10 PM, Tao Shi  wrote:
 I was testing out the doSMP package from REvolutionR in my regular R2.11.0 
 installation and I got the following error message.  Well, one obvious thing 
 is that R2.11.0 was built using i386-pc-mingw32 which is different from 
 what revoIPC used.  I could just use REvolutionR, but all my R peripherals 
 were set up to work with the regular R2.11.0.  So, I really want to make this 
 work.  Any ideas?

 --
 David M Smith
 VP of Marketing, REvolution Computing  http://blog.revolution-computing.com
 Tel: +1 (650) 330-0553 x205 (Palo Alto, CA, USA)

 Download REvolution R free:
 www.revolution-computing.com/downloads/revolution-r.php
  


Re: [R] Can't load doSMP from REvolutionR in regular R2.11.0

2010-04-29 Thread Tao Shi

Thanks, David!  I forgot to check that email...

I have built the revoIPC using R2.11.0 and it seemed there were no error 
messages.  I can load doSMP now, but haven't tested it yet.

Tal,

I'll send you the file offline, so you can also test it.

best!

...Tao




 From: da...@revolution-computing.com
 Date: Thu, 29 Apr 2010 13:15:43 -0700
 Subject: Re: [R] Can't load doSMP from REvolutionR in regular R2.11.0
 To: shi...@hotmail.com
 CC: r-help@r-project.org; tal.gal...@gmail.com

 Did you download the source bundle when you downloaded REvolution R?
 You'll find them there. There's a link in the same email that gives
 instructions for downloading the binaries.

 # David Smith




 --
 David M Smith 
 VP of Marketing, REvolution Computing http://blog.revolution-computing.com
 Tel: +1 (650) 330-0553 x205 (Palo Alto, CA, USA)

 Download REvolution R free:
 www.revolution-computing.com/downloads/revolution-r.php
  


[R] apply vs. foreach vs. foreach with doSMP (multi cores)

2010-04-29 Thread Tao Shi


Hi David and list,

I'm a little puzzled by the results below.  Since apply is basically a for 
loop, I was expecting foreach to take about the same amount of time as apply, 
and foreach with two registered cores to run much faster.  However, the 
results show that apply is the fastest.  

Also, could you please explain the error message (i.e. "Error in 
ipcTaskSetEnvironment(taskq, envir) : ...") in the second run?

The results were recorded on REvolution R 3.2 and I observed the same on 
regular R2.11.0.
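
One possible explanation, offered as an assumption rather than something confirmed in this thread: each foreach iteration here does very little work (one row divided by its mean), so the per-task bookkeeping and the rbind combine may dominate the timing. The sketch below uses only base R to show the usual remedy of making tasks larger: a fully vectorised version, and a chunked version whose few large pieces could in principle be handed to a parallel backend. The matrix size and the number of chunks are illustrative choices, not the values from my run.

```r
set.seed(1)
m <- matrix(rnorm(1e5), 1e4, 10)   # illustrative size only

# Row-by-row, as in the original timing
by_apply <- t(apply(m, 1, function(x) x / mean(x)))

# Vectorised: recycling down the columns divides row i by rowMeans(m)[i]
by_vector <- m / rowMeans(m)

# Chunked: a few large tasks instead of one task per row
n_chunks <- 4
chunk_id <- cut(seq_len(nrow(m)), n_chunks, labels = FALSE)
pieces <- lapply(split(seq_len(nrow(m)), chunk_id), function(idx) {
  block <- m[idx, , drop = FALSE]
  block / rowMeans(block)
})
by_chunk <- do.call(rbind, unname(pieces))

# All three approaches give the same numbers
all.equal(by_apply, by_vector)
all.equal(by_apply, by_chunk, check.attributes = FALSE)
```

With lapply replaced by a parallel map over the chunks, each worker would receive a whole block rather than a single row.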

Many thanks!

...Tao



> library(doSMP)
Loading required package: foreach
Loading required package: iterators
Loading required package: codetools
foreach: simple, scalable parallel programming from REvolution Computing
Use REvolution R for scalability, fault tolerance and more.
http://www.revolution-computing.com
Loading required package: revoIPC
> 
> m <- matrix(rnorm(10), 1, 10)
> 
> system.time(tmp <- t(apply(m, 1, function(x) x/mean(x))))
   user  system elapsed 
   0.21    0.00    0.21 
> system.time(tmp1 <- foreach(i=1:nrow(m), .combine=rbind) %dopar% (m[i,] / mean(m[i,])))
   user  system elapsed 
   5.50    0.00    5.53 
Warning message:
executing %dopar% sequentially: no parallel backend registered 
> 
> w <- startWorkers(2)
Warning messages:
1: In startWorkers(2) : there is an existing doSMP session using doSMP1
2: In startWorkers(2) : there is an existing doSMP session using doSMP2
> registerDoSMP(w)
> system.time(tmp1 <- foreach(i=1:nrow(m), .combine=rbind) %dopar% (m[i,] / mean(m[i,])))
   user  system elapsed 
   6.02    0.03    7.84 
> stopWorkers(w)
> 
> ## second run
> system.time(tmp <- t(apply(m, 1, function(x) x/mean(x))))
   user  system elapsed 
   0.22    0.02    0.23 
> system.time(tmp1 <- foreach(i=1:nrow(m), .combine=rbind) %dopar% (m[i,] / mean(m[i,])))
Error in ipcTaskSetEnvironment(taskq, envir) : 
  The task queue has been freed.
Timing stopped at: 0.03 0.02 0.04 
> 
> 
> w <- startWorkers(2)
Warning messages:
1: In startWorkers(2) : there is an existing doSMP session using doSMP1
2: In startWorkers(2) : there is an existing doSMP session using doSMP2
> registerDoSMP(w)
> system.time(tmp1 <- foreach(i=1:nrow(m), .combine=rbind) %dopar% (m[i,] / mean(m[i,])))
   user  system elapsed 
   6.11    0.01    7.62 
> stopWorkers(w)
> dim(m)
[1] 1    10
> sessionInfo()
R version 2.10.1 (2009-12-14) 
i386-pc-intel32 

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United 
States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C   LC_TIME=English_United 
States.1252    

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] doSMP_1.0-1 revoIPC_1.0-2   foreach_1.3.0   codetools_0.2-2 
iterators_1.0.3 Revobase_3.2.0 
 

  


[R] Can't load doSMP from REvolutionR in regular R2.11.0

2010-04-28 Thread Tao Shi

Hi list,

I was testing out the doSMP package from REvolutionR in my regular R2.11.0 
installation and I got the following error message.  Well, one obvious thing is 
that R2.11.0 was built using i386-pc-mingw32, which is different from what 
revoIPC used.  I could just use REvolutionR, but all my R peripherals were set 
up to work with the regular R2.11.0.  So, I really want to make this work.  
Any ideas?

Many thanks in advance!

...Tao



> library(doSMP)
Loading required package: foreach
Loading required package: iterators
Loading required package: codetools
foreach: simple, scalable parallel programming from REvolution Computing
Use REvolution R for scalability, fault tolerance and more.
http://www.revolution-computing.com
Loading required package: revoIPC
Error: package 'revoIPC' was built for i386-pc-intel32
> sessionInfo()
R version 2.11.0 (2010-04-22) 
i386-pc-mingw32 

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United 
States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C   LC_TIME=English_United 
States.1252    

attached base packages:
[1] grDevices datasets  splines   graphics  stats tcltk utils 
methods   base 

other attached packages:
[1] foreach_1.3.0   codetools_0.2-2 iterators_1.0.3 svSocket_0.9-48 
TinnR_1.0.3 R2HTML_2.0.0    Hmisc_3.7-0 survival_2.35-8

loaded via a namespace (and not attached):
[1] cluster_1.12.3 grid_2.11.0    lattice_0.18-5 svMisc_0.9-57  tools_2.11.0  

  


[R] a warning message from heatmap.2

2010-04-05 Thread Tao Shi

Hi List,

I want to show the heatmap of a correlation matrix using heatmap.2, but I 
always get this warning message (see below) and the column dendrogram is not 
shown.  It's not really a big deal, but I'm curious how to suppress it and still 
have R show what I want (i.e. a symmetrical heatmap with dendrograms for 
both rows and columns).  I'm using gplots 2.7.4 and R2.10.1 on Win XP.

Thanks!


> heatmap.2(cor(matrix(rnorm(100),10,10)), symm=T, symbreaks=T, trace="none", 
    density.info="none")
Warning message:
In heatmap.2(cor(matrix(rnorm(100), 10, 10)), symm = T, symbreaks = T,  :
  Discrepancy: Colv is FALSE, while dendrogram is `row'. Omitting column 
dendogram.



  


[R] lineplot.CI in sciplot: option ci.fun can't be changed?

2010-04-02 Thread Tao Shi

hi List and Manuel,

I have encountered the following problem with the function lineplot.CI.  I'm 
running R 2.10.1, sciplot 1.0-7 on Win XP.  It seems like it's a scoping issue, 
but I couldn't figure it out.

Thanks!

...Tao



> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth)    ## fine
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun=median)  ## fine
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun=mean)  ## fine
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun= 
    function(x) c(fun(x)-2*se(x), fun(x)+2*se(x)))  ## failed!
Error in FUN(X[[1L]], ...) : could not find function "fun"

> debug(lineplot.CI)
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun= 
    function(x) c(fun(x)-2*se(x), fun(x)+2*se(x)))

Browse[2]> 
debug: mn.data <- tapply(response, groups, fun)
Browse[2]> 
debug: CI.data <- tapply(response, groups, ci.fun)
Browse[2]> fun
function (x) 
mean(x, na.rm = TRUE)
<environment: 0x07178640>
Browse[2]> ci.fun
function(x) c(fun(x)-2*se(x), fun(x)+2*se(x))
Browse[2]> debug(ci.fun)
Browse[2]> fun
function (x) 
mean(x, na.rm = TRUE)
<environment: 0x07178640>
Browse[2]> 
debugging in: FUN(X[[1L]], ...)
debug: c(fun(x) - 2 * se(x), fun(x) + 2 * se(x))
Browse[3]> 
Error in FUN(X[[1L]], ...) : could not find function "fun"
> undebug(lineplot.CI)
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun= 
    function(x) c(fun(x)-se(x), fun(x)+se(x))) 
Error in FUN(X[[1L]], ...) : could not find function "fun"
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun = 
    function(x) mean(x, na.rm=TRUE),ci.fun= function(x) c(fun(x)-se(x), 
    fun(x)+se(x))) 
Error in FUN(X[[1L]], ...) : could not find function "fun"
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun = 
    function(x) median(x, na.rm=TRUE),ci.fun= function(x) c(fun(x)-se(x), 
    fun(x)+se(x))) 
Error in FUN(X[[1L]], ...) : could not find function "fun"





  


Re: [R] lineplot.CI in sciplot: option ci.fun can't be changed?

2010-04-02 Thread Tao Shi

Thanks, Manuel!


 Subject: Re: lineplot.CI in sciplot: option ci.fun can't be changed?
 From: manuel.a.mora...@williams.edu
 To: shi...@hotmail.com
 CC: r-help@r-project.org; mmora...@williams.edu
 Date: Fri, 2 Apr 2010 14:22:33 -0400

 For now, just change fun(x) to median(x) (or whatever) in your ci.fun()
 below.

 E.g.
 lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun=
 function(x) c(mean(x)-2*se(x), mean(x)+2*se(x)))

 Otherwise, maybe the list members could help with a solution. An example
 that illustrates the problem:

 ex.fn <- function(x,
                   fun = mean,
                   fun2 = function(x) fun(x)+sd(x)) {

   list(fun=fun(x), fun2=fun2(x))
 }

 data <- rnorm(10)

 ex.fn(data) #works
 ex.fn(data, fun=median) #works
 ex.fn(data, fun2=function(x) fun(x)+3) #error with fun(x) not found
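
The behaviour above follows from R's lexical scoping: a default argument is evaluated inside the function's own frame, so the default fun2 can see fun, whereas a function written by the caller looks up fun where it was defined (here the global environment), and no fun exists there. The sketch below reproduces the failure and shows two ways around it. ex.fn2 is a hypothetical redesign for illustration only; it is not how sciplot is written.

```r
ex.fn <- function(x, fun = mean, fun2 = function(x) fun(x) + sd(x)) {
  list(fun = fun(x), fun2 = fun2(x))
}

set.seed(1)
data <- rnorm(10)

# The caller-supplied closure cannot see the argument `fun`
res <- try(ex.fn(data, fun2 = function(x) fun(x) + 3), silent = TRUE)
inherits(res, "try-error")

# Workaround 1: name the summary function explicitly inside fun2
ok1 <- ex.fn(data, fun = median, fun2 = function(x) median(x) + 3)

# Workaround 2 (hypothetical redesign): hand `fun` to fun2 as an argument
ex.fn2 <- function(x, fun = mean, fun2 = function(x, fun) fun(x) + sd(x)) {
  list(fun = fun(x), fun2 = fun2(x, fun))
}
ok2 <- ex.fn2(data, fun = median, fun2 = function(x, fun) fun(x) + 3)
```

Both workarounds give the same fun2 value, median(data) + 3, without relying on where the closure was created.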



 --
 http://mutualism.williams.edu

  


Re: [R] wrap long lines in table using latex in Hmisc

2010-03-01 Thread Tao Shi

Thank you guys for the wonderful suggestions!

Charlie,

You obviously foresaw my problem!   It took me a while to figure out that the 
\raggedright and other justification commands should be applied to each cell.  
It didn't work for me when applied to the headings.  See this nice document 
here:  

http://nepsweb.co.uk/docs/tableTricks.pdf


Thanks again.

...Tao




 Subject: Re: [R] wrap long lines in table using latex in Hmisc
 From: marc_schwa...@me.com
 Date: Fri, 26 Feb 2010 18:03:33 -0600
 CC: r-help@r-project.org; shi...@hotmail.com
 To: ch...@sharpsteen.net

 On Feb 26, 2010, at 5:18 PM, Sharpie wrote:



 Ista Zahn wrote:

 Hi Tao,
 Just set the appropriate *.just argument, e.g.:

 Dat <- data.frame(x1 = rep("this value consists of a long string of text", 5),
                   x2 = rep("this value consists of an even longer string of text", 5))

 library(Hmisc)
 latex(Dat, col.just = rep("p{1in}", 2))

 You can also set justification for column headings, column group
 headings etc. See ?latex for details.

 Best,
 Ista



 As Ista said, you can use the p{}, m{} and b{} LaTeX column specifications
 to create a table column that enforces a line wrap on its contents. See:

 http://en.wikibooks.org/wiki/LaTeX/Tables#The_tabular_environment

 for full details.

 However, one problem with using say, p{2in}, is that the text is set *fully
 justified*. This means that the inter-word spacing in each line is expanded
 so that the line fully occupies the allotted 2 inches of space. For some
 tables the results are a typographical travesty.

 The solution is to prepend a >{justificationCommand} to your column
 specification, such as:

 >{\centering}p{2in}

 The justification commands you can use are :

 \centering - Centers wrapped text
 \raggedright - *left* aligns wrapped text
 \raggedleft - *right* aligns wrapped text

 Remember to double the backslash if you are passing this command as an
 argument in R.

 This trick will cause a LaTeX compilation error if used to specify the
 right-most column in a table, unless the Hmisc latex() command produces
 tables that use \tabularnewline to invoke table row breaks instead of
 \\.
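
As a sketch of the backslash doubling mentioned above (the 2in width and the two-column layout are assumptions for illustration), the column specifications can be built as ordinary R strings and inspected with cat() before being handed to latex():

```r
# Each R string holds ONE backslash; "\\" is only how R spells it in source
just <- c("\\centering", "\\raggedright")
col.spec <- paste0(">{", just, "}p{2in}")

# cat() shows the text that LaTeX will actually receive
cat(col.spec, sep = "\n")

# These could then be passed on, e.g. latex(Dat, col.just = col.spec)
```

The >{...} prefix comes from the LaTeX array package, so \usepackage{array} is needed in the preamble of the document that includes the table.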

 Hope this helps.

 -Charlie


 One other option that you can use is to create a \newcommand that wraps text 
 in a tabular, which you can then actually use within an existing table cell. 
 This enables multiple lines of text within the cell, with line breaks that 
 you specify. So you in effect end up with nested tables. Of course, the 
 entire row height is adjusted accordingly, but this way, you don't need to 
 specify a fixed column width.


 For example, put the following in your .tex file (or .Rnw file) after the 
 \begin{document} directive:

 \newcommand{\multiliner}[1]{\begin{tabular}[...@{}r@{}}#1\end{tabular}}
 \newcommand{\multilinel}[1]{\begin{tabular}[...@{}l@{}}#1\end{tabular}}
 \newcommand{\multilinec}[1]{\begin{tabular}[...@{}c@{}}#1\end{tabular}}


 Each of the above provides for Right, Left and Centered justification, 
 respectively, within the table cell.

 Then, you can create a cell entry that results in the following TeX markup:

 \multilineC{Line 1 \\ Line 2 \\ ...}


 If you are cat()ing the output from R, you need to double the backslashes, so 
 that you begin with something like:

 \\multilineC{Line 1 \\\\ Line 2 \\\\ ...}


 I typically do this with headers for tables that would otherwise be too wide 
 for the column.


 So you would start with a long line of text:

 LongLine <- "This is a really long line that needs to wrap in a table row"


 Break it into chunks around 15 characters in length using strwrap():

 strwrap(LongLine, 15)
 [1] "This is a"      "really long"    "line that"      "needs to wrap"
 [5] "in a table row"


 Use paste() to begin to create the proper LaTeX markup for \multilineC:

 TMP1 <- paste(strwrap(LongLine, 15), collapse = "\\\\")

 TMP1
 [1] "This is a\\\\really long\\\\line that\\\\needs to wrap\\\\in a table row"


 Now create the full line:

 TMP2 <- paste("\\multiLine{", TMP1, "}")

 TMP2
 [1] "\\multiLine{ This is a\\\\really long\\\\line that\\\\needs to wrap\\\\in a table row }"


 When you cat() the output, you get:

 cat(TMP2)
 \multiLine{ This is a\\really long\\line that\\needs to wrap\\in a table row }


 TMP2 can now be used in place of the original long line of text and when 
 processed by 'latex', will be rendered properly.


 Of course, rather than using strwrap(), you can hard code the line breaks 
 into your character vector as you may otherwise require.
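
The strwrap()/paste() steps above could be folded into one small function. This is only a sketch: multiline_cell() is a hypothetical helper, and the command name it emits has to match whatever \newcommand was actually defined in the .tex file.

```r
# Hypothetical helper: wrap 'text' at 'width' and emit a \multilineC{} cell
multiline_cell <- function(text, width = 15, cmd = "multilineC") {
  pieces <- strwrap(text, width)
  # " \\\\ " in R source is space, two backslashes, space in the string
  paste0("\\", cmd, "{", paste(pieces, collapse = " \\\\ "), "}")
}

cell <- multiline_cell(
  "This is a really long line that needs to wrap in a table row")
cat(cell, "\n")
```

Applying it over a vector of headers, for example with sapply(), would give one cell string per column.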


 HTH,

 Marc Schwartz

  


[R] wrap long lines in table using latex in Hmisc

2010-02-26 Thread Tao Shi

Hi list,

Is there a way to control long-line wrapping in a table using latex function 
in Hmisc or any other functions?  It seems I can't find any examples.

Thank you very much!


...Tao
  


Re: [R] text editors

2010-02-26 Thread Tao Shi

If you do everything in Windows, Tinn-R is one of the best and also the one I 
use.  I also tried WinEdt. It's very good, but it is not free.  If you want a 
cross-platform editor, Emacs+ESS is the one.  Like others said, the learning 
curve is steep, but worth it.

...Tao

===
Dear all,

Do you use a text editor ? What would you recommend for Windows users ? What
about Tinn-R ?

Thank you very much,
Dwayne

  


Re: [R] wrap long lines in table using latex in Hmisc

2010-02-26 Thread Tao Shi

Hi Ista,

Thanks!  I missed that.

...Tao


 From: istaz...@gmail.com
 Date: Fri, 26 Feb 2010 17:48:32 -0500
 Subject: Re: [R] wrap long lines in table using latex in Hmisc
 To: shi...@hotmail.com
 CC: r-help@r-project.org

 Hi Tao,
 Just set the appropriate *.just argument, e.g.:

 Dat <- data.frame(x1 = rep("this value consists of a long string of text", 5),
                   x2 = rep("this value consists of an even longer string of text", 5))

 library(Hmisc)
 latex(Dat, col.just = rep("p{1in}", 2))

 You can also set justification for column headings, column group
 headings etc. See ?latex for details.

 Best,
 Ista


 On Fri, Feb 26, 2010 at 3:37 PM, Tao Shi  wrote:

 Hi list,

 Is there a way to control long-line wrapping in a table using latex 
 function in Hmisc or any other functions?  It seems I can't find any 
 examples.



 ...Tao





 --
 Ista Zahn
 Graduate student
 University of Rochester
 Department of Clinical and Social Psychology
 http://yourpsyche.org
  


[R] a minor bug in venn from gplots?

2009-10-29 Thread Tao Shi


Hi list,

I found this one when I was trying to output the Venn diagram to a .pdf file.  
When there are 4 sets of groups to draw, the .pdf file automatically has 3 
pages and the figure only appears on the 3rd page in the .pdf file with the 
first 2 pages being blank.  Try the following: (I'm using R-2.9.0 on WinXP, 
gplots 2.7.1).  gplots 2.7.3 has the same problem.

###==
library(gplots)

pdf("test.pdf")
A <- 1:20
B <- 1:20
C <- 2:20
D <- 3:21
input <- list(A, B, C, D)
venn(input)
dev.off()
###==

By looking at the code of drawVennDiagram, I think the problem comes from the 
fact that there is one grid.newpage() call and two plot() calls when 
numCircles==4 (see below).  I wonder whether the grid.newpage() call and the 
second plot() call are necessary.

Thanks a lot.

...Tao


##=

    else if (4 <= numCircles && numCircles <= 5 && !simplify) {
        grid.newpage()                                      ## <===
        relocate_elp <- function(e, alpha, x, y) {
            phi = (alpha/180) * pi
            xr = e[, 1] * cos(phi) + e[, 2] * sin(phi)
            yr = -e[, 1] * sin(phi) + e[, 2] * cos(phi)
            xr = x + xr
            yr = y + yr
            return(cbind(xr, yr))
        }
        lab <- function(identifier, data, showLabel = showSetLogicLabel) {
            r <- data[identifier, 1]
            if (showLabel) {
                return(paste(identifier, r, sep = "\n"))
            }
            else {
                return(r)
            }
        }
        plot(c(0, 400), c(0, 400), type = "n", axes = F, ylab = "",
            xlab = "")
        if (4 == numCircles) {
            elps = cbind(162 * cos(seq(0, 2 * pi, len = 1000)),
                108 * sin(seq(0, 2 * pi, len = 1000)))
            plot(c(0, 400), c(0, 400), type = "n", axes = F,   ## <===
                ylab = "", xlab = "")
            polygon(relocate_elp(elps, 45, 130, 170))
            polygon(relocate_elp(elps, 45, 200, 200))

...
###===

  


[R] SVM cross validation in e1071

2009-07-07 Thread Tao Shi


Hi list,

Could someone help me understand why the leave-one-out cross validation results 
I got from svm using the internal option "cross" differ from those I got 
manually?  It seems that when "cross" is used to do the cross validation, the 
results are always better.  Please see the code below.  I also include lda as 
a comparison.

I'm using WinXP, R-2.9.0, and e1071_1.5-19.

Many thanks!

...Tao



##==
## manual
##
##
library(MASS)    # lda
library(e1071)   # svm

set.seed(1234)
dat <- data.frame(rbind(matrix(rnorm(1000), ncol = 10),
                        matrix(rnorm(1000, mean = 0.6), ncol = 10)))
cl <- as.factor(rep(1:2, each = 100))
y.lda <- rep(NA, nrow(dat))
y.svm <- rep(NA, nrow(dat))
for (i in 1:nrow(dat)) {
    testset <- dat[i, ]
    trainset <- dat[-i, ]
    model.lda <- lda(cl[-i] ~ ., data = trainset)
    model.svm <- svm(cl[-i] ~ ., data = trainset)
    y.lda[i] <- as.character(predict(model.lda, testset)$class)
    y.svm[i] <- as.character(predict(model.svm, testset))
}

table(y.lda, cl)
     cl
y.lda  1  2
    1 84 10
    2 16 90
table(y.svm, cl)
     cl
y.svm  1  2
    1 83  8
    2 17 92
##==
## using internal CV options
##
z2 <- lda(cl ~ ., data = dat, CV = T)
table(z2$class, cl)
   cl
     1  2
  1 84 10
  2 16 90
z <- svm(cl ~ ., data = dat, cross = 200)
table(z$fitted, cl)
   cl
     1  2
  1 93  4
  2  7 96



Re: [R] SVM cross validation in e1071

2009-07-07 Thread Tao Shi

Hi Steve,

Thanks for the pointer!  After checking the source code myself, I can confirm 
that this is indeed the case.  And of course, the training error is always 
better than the CV error.  This can also be checked by varying the number of 
folds in "cross": the "fitted" result does not change. 

I wonder if the default value for "fitted" should be set to FALSE to avoid 
the confusion: I thought the "fitted" field in the returned object held the 
CV results.  
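
A small sketch of that check, run only if e1071 is installed. The data are simulated as in the original post; the component names fitted and tot.accuracy are the ones documented for svm objects, and everything else (object names, the choice of 10 folds) is an assumption for illustration.

```r
if (requireNamespace("e1071", quietly = TRUE)) {
  set.seed(1234)
  dat <- data.frame(rbind(matrix(rnorm(1000), ncol = 10),
                          matrix(rnorm(1000, mean = 0.6), ncol = 10)))
  dat$cl <- factor(rep(1:2, each = 100))

  m0  <- e1071::svm(cl ~ ., data = dat)              # no cross validation
  m10 <- e1071::svm(cl ~ ., data = dat, cross = 10)  # 10-fold cross validation

  # The training-set predictions do not depend on 'cross'
  same.fitted <- identical(m0$fitted, m10$fitted)

  # The cross-validation estimate is stored separately (in percent)
  train.acc <- 100 * mean(m10$fitted == dat$cl)
  cv.acc    <- m10$tot.accuracy
  print(c(training = train.acc, cross.validated = cv.acc))
}
```

So for a fair comparison with a manual leave-one-out loop, the number to read off is tot.accuracy (or the per-fold accuracies), not a table built from fitted.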

...Tao



 CC: r-help@r-project.org
 From: mailinglist.honey...@gmail.com
 To: shi...@hotmail.com
 Subject: Re: [R] SVM cross validation in e1071
 Date: Tue, 7 Jul 2009 21:43:49 -0400
 
 Hi Tao,
 
 On Jul 7, 2009, at 8:33 PM, Tao Shi wrote:
 
 Hi list,

 Could someone help me to explain why the leave-one-out cross  
 validation results I got from svm using the internal option cross  
 are different from those I got manually?  It seems using cross to  
 do cross validation, the results are always better.  Please see the  
 code below.  I also include lda as a comparison.
 
 Looking at the C code in Rsvm.c, it looks like the model that is  
 returned is one that is trained on *all* of the data that is  
 originally passed in.
 
 After the model is built, and the value for cross is > 1, the  
 `do_cross_validation` function is called, in which your data is then  
 split into folds for cross validation. This is only done to report  
 accuracy or MSE (depending on classification vs. regression). The  
 models from this CV do not affect the model that is returned back to R.
 
 So ... that's why. If you train your svm without holding out any data  
 (and do no cross validation), you should essentially get back the same  
 model that you're getting back now when you set cross > 1.
 
 Does that make sense?
 
 -steve
 
 --
 Steve Lianoglou
 Graduate Student: Physiology, Biophysics and Systems Biology
 Weill Medical College of Cornell University
 
 Contact Info: http://cbio.mskcc.org/~lianos
 
 
 


Re: [R] silhouette: clustering labels have to be consecutive integers starting

2009-05-13 Thread Tao Shi

Thanks, Martin!

...Tao

 From: maech...@stat.math.ethz.ch
 Date: Wed, 13 May 2009 14:13:48 +0200
 To: shi...@hotmail.com
 CC: rip...@stats.ox.ac.uk; bcarv...@jhsph.edu; r-help@r-project.org; 
 maech...@stat.math.ethz.ch
 Subject: Re: [R] silhouette: clustering labels have to be consecutive 
 integers starting
 
  TS == Tao Shi shi...@hotmail.com
  on Wed, 10 Oct 2007 06:15:53 + writes:
 
 TS Thank you very much, Benilton and Prof. Ripley, for the
 TS speedy replies!
 
 TS Looking forward to the fix!
 TS Tao
 
 I have finally re-stumbled onto this e-mail thread,
 and have indeed fixed the problem.
 
 Version 1.12.0 of 'cluster' should become visible within a few days,
 and will allow calling
 
 silhouette(g, dis)
 
 on a grouping vector of k different integer values which need
 *not* necessarily be in 1:k.
 
 Martin Maechler,
 ETH Zurich
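
For versions of 'cluster' without this fix, the relabelling workaround quoted further down in the thread can be checked on its own in base R. A minimal sketch, with labels simulated as in the thread's example:

```r
set.seed(1)
g <- sample(2:4, 100, replace = TRUE)   # labels that do not start at 1

# Map the k distinct labels onto 1..k, preserving their order
g.consecutive <- as.integer(factor(g))

# Cross-tabulate to confirm the mapping is one-to-one (2->1, 3->2, 4->3)
mapping <- table(original = g, relabelled = g.consecutive)
print(mapping)
```

The relabelled vector describes the same partition, so silhouette widths computed from it refer to the same clusters.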
 
 
  From: Prof Brian Ripley rip...@stats.ox.ac.uk
  To: Benilton Carvalho bcarv...@jhsph.edu
  CC: Tao Shi shi...@hotmail.com, maech...@stat.math.ethz.ch,
  r-help@r-project.org
  Subject: Re: [R] silhouette: clustering labels have to be consecutive 
  intergers starting from 1?
  Date: Wed, 10 Oct 2007 05:33:03 +0100 (BST)
  
  It is a C-level problem in package cluster: valgrind gives
  
  ==11377== Invalid write of size 8
  ==11377==at 0xA4015D3: sildist (sildist.c:35)
  ==11377==by 0x4706D8: do_dotCode (dotcode.c:1750)
  
  This is a matter for the package maintainer (Cc:ed here), not R-help.
  
  On Tue, 9 Oct 2007, Benilton Carvalho wrote:
  
  that happened to me with R-2.4.0 (alpha) and was fixed on R-2.4.0
  (final)...
  
  http://tolstoy.newcastle.edu.au/R/e2/help/06/11/5061.html
  
  then i stopped using... now, the problem seems to be back. The same
  examples still apply.
  
  This fails:
  
  require(cluster)
  set.seed(1)
  x <- rnorm(100)
  g <- sample(2:4, 100, rep=T)
  for (i in 1:100){
     print(i)
     tmp <- silhouette(g, dist(x))
  }
  
  and this works:
  
  require(cluster)
  set.seed(1)
  x <- rnorm(100)
  g <- sample(2:4, 100, rep=T)
  for (i in 1:100){
     print(i)
     tmp <- silhouette(as.integer(factor(g)), dist(x))
  }
  
  and here's the sessionInfo():
  
   sessionInfo()
  R version 2.6.0 (2007-10-03)
  x86_64-unknown-linux-gnu
  
  locale:
  
 LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.U
  
 TF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-
  
 8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_ID
  ENTIFICATION=C
  
  attached base packages:
  [1] stats graphics  grDevices utils datasets  methods   base
  
  other attached packages:
  [1] cluster_1.11.9
  
  
  (Red Hat EL 2.6.9-42 smp - AMD opteron 848)
  
  b
  
  On Oct 9, 2007, at 8:35 PM, Tao Shi wrote:
  
  Hi list,
  
  When I was using 'silhouette' from the 'cluster' package to
  calculate clustering performances, R crashed.  I traced the problem
  to the fact that my clustering labels only have 2's and 3's.  when
  I replaced them with 1's and 2's, the problem was solved.  Is the
  function purposely written in this way so when I have clustering
  labels, 2 and 3, for example, the function somehow takes the
  'missing' cluster 2 into account when it calculates silhouette
  widths?
  
  Thanks,
  
  Tao
  
  ##
  ## sorry about the long attachment
  
  R.Version()
  $platform
  [1] i386-pc-mingw32
  
  $arch
  [1] i386
  
  $os
  [1] mingw32
  
  $system
  [1] i386, mingw32
  
  $status
  [1] ""
  
  $major
  [1] 2
  
  $minor
  [1] 5.1
  
  $year
  [1] 2007
  
  $month
  [1] 06
  
  $day
  [1] 27
  
  $`svn rev`
  [1] 42083
  
  $language
  [1] R
  
  $version.string
  [1] R version 2.5.1 (2007-06-27)
  
  library(cluster)
  cl1   ## clustering labels
  [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2
  [30] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
  [59] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
  [88] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
  [117] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
  [146] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
  [175] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
  [204] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
  x1  ## 1-d input vector
  [1] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
  [6] 1.5707963 1.5707963 1.5707963

[R] why two diff. se in nlsList?

2009-01-26 Thread Tao Shi

Hi list,

In the object returned by summary.nlsList, what's the difference between 
"coefficients" and "parameters"?  They have the same Estimate, different se 
(and therefore different t values), but the same p values.

R.2.8.0 on winxp with nlme_3.1-89

Thanks,

...Tao

+

 library(nlme)
 fm1 <- nlsList(uptake ~ SSasympOff(conc, Asym, lrc, c0),
   data = CO2, start = c(Asym = 30, lrc = -4.5, c0 = 52))

 summary(fm1)$para[,,1]
    Estimate Std. Error  t value     Pr(>|t|)
Qn1 38.13977  0.9911148 38.48169 1.991990e-06
Qn2 42.87169  1.0932089 39.21638 2.583953e-06
Qn3 44.22800  1.0241029 43.18706 1.809264e-07
Qc1 36.42874  1.1941594 30.50576 1.140085e-05
Qc3 40.68373  1.2480923 32.59673 1.424635e-04
Qc2 39.81950  1.0167249 39.16447 2.692304e-06
Mn3 28.48286  1.0624246 26.80930 1.066434e-06
Mn2 32.12827  1.0174826 31.57624 3.488786e-06
Mn1 34.08482  1.3400596 25.43530 4.199333e-06
Mc2 13.55519  1.0506404 12.90184 4.385886e-06
Mc3 18.53506  0.8363371 22.16219 1.461563e-06
Mc1 21.78723  1.4113318 15.43735 5.756870e-06

 summary(fm1)$coef[,,1]
    Estimate Std. Error  t value     Pr(>|t|)
Qn1 38.13977  0.9163882 41.61967 1.991990e-06
Qn2 42.87169  1.0994599 38.99341 2.583953e-06
Qn3 44.22800  0.5829894 75.86415 1.809264e-07
Qc1 36.42874  1.3556273 26.87224 1.140085e-05
Qc3 40.68373  2.8632576 14.20890 1.424635e-04
Qc2 39.81950  1.0317496 38.59415 2.692304e-06
Mn3 28.48286  0.5852408 48.66861 1.066434e-06
Mn2 32.12827  0.8883225 36.16735 3.488786e-06
Mn1 34.08482  0.9872439 34.52522 4.199333e-06
Mc2 13.55519  0.3969189 34.15104 4.385886e-06
Mc3 18.53506  0.4121147 44.97549 1.461563e-06
Mc1 21.78723  0.6830001 31.89930 5.756870e-06




[R] latex in Hmisc: cell formating

2009-01-23 Thread Tao Shi

Hi Dieter,

Thank you for pointing out the website.  From the website it seems the bug was 
fixed in early 2008 (see 
http://biostat.mc.vanderbilt.edu/trac/Hmisc/changeset/582 ).

So I upgraded my Hmisc package to 3.4-4, which was published on 11/3/2008, 
hoping it would work.  However, the problem persists.


...Tao



Tao Shi  hotmail.com writes:

 ## I'm using R 2.8.0 on WinXP, Hmisc_3.4-3

 table1 <- matrix(10, 180, 7)
 cell.format <- matrix("", ncol=7, nrow=180)
 cell.format[c(seq(3,180,6),seq(4,180,6)),] <- "color{red}"
 cell.format[c(seq(5,180,6),seq(6,180,6)),] <- "color{green}"

 latex(table1, where='htbp', long=TRUE,  lines.page=1000,  size="scriptsize",
       cgroup=c("group1","group2"), n.cgroup=c(6,1),
       rgroup=c("n=1","n=5","n=10","n=20","n=50"), n.rgroup=rep(36,5),
       cellTexCmds=cell.format, numeric.dollar = FALSE)
 Error in cat(rcellTexCmds[i, colNum],  , cx[i, colNum], file = file,  :
   subscript out of bounds


 ## if I remove the column name grouping, it works fine!
 ##
 latex(table1, where='htbp', long=TRUE,  lines.page=1000,  size="scriptsize",
       #cgroup=c("group1","group2"), n.cgroup=c(6,1),
       rgroup=c("n=1","n=5","n=10","n=20","n=50"), n.rgroup=rep(36,5),
       cellTexCmds=cell.format, numeric.dollar = FALSE)

The example you posted is good, but it is more helpful to post the code,
not the pasted result, so that trying your example does not require
manual editing.

I had reported a similar case a year ago; see below. Maybe you should
post it at:

http://biostat.mc.vanderbilt.edu/trac/Hmisc/

#
library(Hmisc)
sessionInfo()

x <- matrix(1:12, nrow=2, dimnames=list(c('a','p'),
  letters[1:6]))
cellTex = matrix(rep("", NROW(x) * NCOL(x)),  nrow=NROW(x))

cellTex[1,1] <- "cellcolor[gray]{0.9}"

# works ok
p = latex(x, file="a.tex",
  cellTexCmds = cellTex) # ok

# Works ok
p = latex(x, file="a.tex",
  cgroup = c("a","b","c"), n.cgroup=c(2,2,2)
 )

# Fails with an error message "subscript out of bounds"
p = latex(x, file="a.tex",
  cellTexCmds = cellTex,
  cgroup = c("a","b","c"), n.cgroup=c(2,2,2)
 )



[R] latex in Hmisc: cell formating

2009-01-22 Thread Tao Shi

Hi list,

Could you explain the error I see here?  Thanks!

## I'm using R 2.8.0 on WinXP, Hmisc_3.4-3

 table1 <- matrix(10, 180, 7)
 cell.format <- matrix("", ncol=7, nrow=180)
 cell.format[c(seq(3,180,6),seq(4,180,6)),] <- "color{red}"
 cell.format[c(seq(5,180,6),seq(6,180,6)),] <- "color{green}"
 
 latex(table1, where='htbp', long=TRUE,  lines.page=1000,  size="scriptsize",
       cgroup=c("group1","group2"), n.cgroup=c(6,1),
       rgroup=c("n=1","n=5","n=10","n=20","n=50"), n.rgroup=rep(36,5),
       cellTexCmds=cell.format, numeric.dollar = FALSE)
Error in cat(rcellTexCmds[i, colNum],  , cx[i, colNum], file = file,  : 
  subscript out of bounds
 
 
## if I remove the column name grouping, it works fine!
##
 latex(table1, where='htbp', long=TRUE,  lines.page=1000,  size="scriptsize",
       #cgroup=c("group1","group2"), n.cgroup=c(6,1),
       rgroup=c("n=1","n=5","n=10","n=20","n=50"), n.rgroup=rep(36,5),
       cellTexCmds=cell.format, numeric.dollar = FALSE)

Thanks!

...Tao



[R] follow up on [Rd] NAMESPACE methods guidance, please ( http://tolstoy.newcastle.edu.au/R/e4/devel/08/06/1901.html )

2008-10-07 Thread Tao Shi

This is a follow-up on the discussion originally posted on the R-devel list ( 
http://tolstoy.newcastle.edu.au/R/e4/devel/08/06/1901.html ), as I have 
encountered the exact same issue mentioned in Martin's email.   Here is a 
simplified version of my problem:


##=
## I created a package, say, tmpA, with a NAMESPACE, listing org.Hs.eg.db under
## both Depends: and Imports:.  Then, in a new R session:

 library(tmpA)
 foo

function ()
{

require(org.Hs.eg.db)
get("A GO TERM", org.Hs.egGO2ALLEGS)
}


 foo()

Error in as.environment(pos) : invalid object for 'as.environment' 
##=


I fixed the problem by changing get explicitly to AnnotationDbi::get.  I'm 
just wondering what was the final decision on the problem and if there are more 
elegant ways of handling this.

Thanks,

...Tao





[R] a question regarding package building

2008-06-26 Thread Tao Shi

Hi List,

In Windows, if I do "R CMD build mypkg", then I'll get 'mypkg_1.0.tar.gz'.  
Is there any option in R CMD build that lets me change the version, i.e. gives 
me 'mypkg_2.0.tar.gz'?   It seems the --version option doesn't do anything for me.

Is it OK if I just change the version number in the file name manually?

Thanks,


...Tao



Re: [R] a question regarding package building

2008-06-26 Thread Tao Shi

Got it!  Thank you very much!

...Tao




 Date: Thu, 26 Jun 2008 20:41:29 +0200
 From: [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 CC: r-help@r-project.org
 Subject: Re: [R] a question regarding package building
 
 
 
 Tao Shi wrote:
 Hi List,
 
 In Windows, if I do "R CMD build mypkg", then I'll get 'mypkg_1.0.tar.gz'.  
 Is there any option in R CMD build that lets me change the version, i.e. gives 
 me 'mypkg_2.0.tar.gz'?   It seems the --version option doesn't do anything for me.
 
 Is it OK if I just change the version number in the file name manually?
 
 No, you need to change it in the DESCRIPTION file.
 
 Uwe Ligges
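
The DESCRIPTION file is in Debian control format, so the Version field can also be changed from R with read.dcf() and write.dcf() rather than by hand. A minimal sketch on a throwaway file (the package name and the other fields are made up for illustration):

```r
# Work on a temporary DESCRIPTION so nothing real is overwritten
desc.file <- file.path(tempdir(), "DESCRIPTION")
writeLines(c("Package: mypkg", "Version: 1.0", "Title: Example"), desc.file)

desc <- read.dcf(desc.file)      # character matrix, one column per field
desc[1, "Version"] <- "2.0"      # bump the version field
write.dcf(desc, file = desc.file)

new.version <- read.dcf(desc.file, fields = "Version")[1, 1]
print(new.version)
```

Running R CMD build on a package whose DESCRIPTION was edited this way should then produce a tarball named after the new version.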
 
 
 Thanks,
 
 
 ...Tao
 



[R] nlme model specification (revisit)

2008-01-11 Thread Tao Shi

Hi List,

While using the 'nlme' function, I have encountered a problem similar to the 
one Dr. Stevens and Dr. Graves observed (please see the posts: 
https://stat.ethz.ch/pipermail/r-help/2006-May/105832.html ).  I have tried Dr. 
Stevens's original example, and the problem is still there:


 library(nlme)

 mod.lis <- nlsList(circumference ~ SSlogis(age, Asymp, xmid, scal),
                    data = Orange)

 ## this form of 'fixed' runs without an error
 mod <- nlme(circumference ~ SSlogis(age, Asymp, xmid, scal),
             data = Orange,
             fixed = Asymp + xmid + scal ~ 1,
             start = fixef(mod.lis))

 ## the list form of 'fixed' gives the error below
 mod <- nlme(circumference ~ SSlogis(age, Asymp, xmid, scal),
             data = Orange,
             fixed = list(Asymp ~ 1, xmid ~ 1, scal ~ 1),
             start = fixef(mod.lis))
Error in parse(file, n, text, prompt, srcfile, encoding) : 
syntax error, unexpected END_OF_INPUT in ~ 
 sessionInfo()
R version 2.5.1 (2007-06-27) 
i386-pc-mingw32 

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United 
States.1252;LC_MONETARY=English_United 
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   
base 

other attached packages:
  lattice  nlme 
0.15-11  3.1-86 


Just wondering if you guys have any answers for this since I didn't see any for 
the original posts.

Thanks,

...Tao


[R] updating a helper function in a R package

2007-12-06 Thread Tao Shi

Hi list,

Sorry for the vague title, but here is the scenario.  

I’m writing an R package, let’s say, ‘pkg1’, which contains 3 functions: f1, 
f2, f3.  f2 and f3 are helper functions for f1, i.e. f1 calls f2 which in turn 
calls f3.

f1 <- function(...) {
    ## ...
    f2()
    ## ...
}

f2 <- function(...) {
    ## ...
    f3(...)
    ## ...
}

f3 <- function(...) {
    ## ...
}

Then I wrote a new version of f3 and wanted to test it.  With the old version 
of 'pkg1' already loaded into my R session, I tried simply copying and pasting 
the new 'f3' into the R console, hoping f1 would pick it up.  It obviously 
didn't work.  I know it's because the new f3 and the old f3 are in different 
environments, so when f1 is called, only the old f3 is used.  Then I tried to 
change the environment of the new f3 to that of the old f3 by calling:

environment(f3) <- environment(pkg1:::f3)

but that didn't work either.

So,  
1)  Could somebody help me put all this into perspective?
2)  Is there an easier way to update f3 without rebuilding the package?  (By 
that I mean writing the new version of f3 in such a way that I only need to 
copy and paste it into the R console and I'm good to go.  I know it's kind of 
stupid, but I'm curious to know.) 

I'm using R-2.5.1, on WinXP.
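
The behaviour described above can be reproduced without a package by letting a plain environment play the role of the namespace. This is only a sketch: 'ns' is a stand-in, and a real namespace additionally has locked bindings.

```r
# Stand-in for the namespace of 'pkg1'
ns <- new.env()
local({
  f3 <- function() "old f3"
  f2 <- function() f3()
  f1 <- function() f2()
}, envir = ns)

# New version pasted at the console
f3 <- function() "new f3"
before <- ns$f1()               # f2 still finds the f3 bound inside ns

environment(f3) <- ns           # changes where the NEW f3 looks things up,
after.env <- ns$f1()            # not which f3 the old f2 finds

assign("f3", f3, envir = ns)    # replace the binding that f2 actually uses
after.assign <- ns$f1()

print(c(before, after.env, after.assign))
```

For an installed package the analogous call would be assignInNamespace("f3", f3, ns = "pkg1") from utils, which is meant for interactive debugging of this kind rather than for production code.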

Many thanks,

…Tao




[R] RMySQL installation problem

2007-11-15 Thread Tao Shi

Hi List,

I'm running R 2.5.1 on WinXP.  I downloaded RMySQL_0.6-0.zip from 
http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.6/ and the 
installation seemed fine.  However, when I tried to load the package, the 
following error occurred:


 utils:::menuInstallLocal()
package 'RMySQL' successfully unpacked and MD5 sums checked
updating HTML package descriptions
 library(RMySQL)
Loading required package: DBI
Error in dyn.load(x, as.logical(local), as.logical(now)) : 
unable to load shared library 
'C:/PROGRA~1/R/R-25~1.1/library/RMySQL/libs/RMySQL.dll':
  LoadLibrary failure:  The specified module could not be found.


Error: package/namespace load failed for 'RMySQL'

## However, I can see the .dll file is there from Windows Explorer!!


There was also a pop-up window that said:

R Console: Rgui.exe-Unable To Locate Component
This application has failed to start because LIBMYSQL.dll was not found.  
Re-installing the application may fix the problem.

I tried the re-installation.  It didn't work. The DBI package I have is 
version 0.2-4, just in case.
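
The pop-up names LIBMYSQL.dll rather than RMySQL.dll, which suggests that the package DLL is found but depends on a MySQL client library that Windows cannot locate. One way to check, sketched under the assumption that the DLL is expected in one of the PATH directories (find_on_path() is a hypothetical helper, not part of RMySQL):

```r
# Hypothetical helper: list the PATH directories that contain a given file
find_on_path <- function(file) {
  dirs <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]
  dirs[file.exists(file.path(dirs, file))]
}

hits <- find_on_path("libmysql.dll")
if (length(hits) == 0) {
  message("libmysql.dll is not in any PATH directory")
} else {
  print(hits)
}
```

If nothing is found, adding the MySQL installation's directory that holds libmysql.dll to PATH and restarting R would be the first thing to try.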

thanks,


...Tao





Re: [R] silhouette: clustering labels have to be consecutive intergers starting

2007-10-10 Thread Tao Shi
Thank you very much,  Benilton and Prof. Ripley, for the speedy replies!  
Looking forward to the fix!

Tao


From: Prof Brian Ripley [EMAIL PROTECTED]
To: Benilton Carvalho [EMAIL PROTECTED]
CC: Tao Shi [EMAIL PROTECTED], [EMAIL PROTECTED],
r-help@r-project.org
Subject: Re: [R] silhouette: clustering labels have to be consecutive 
intergers starting from 1?
Date: Wed, 10 Oct 2007 05:33:03 +0100 (BST)

It is a C-level problem in package cluster: valgrind gives

==11377== Invalid write of size 8
==11377==    at 0xA4015D3: sildist (sildist.c:35)
==11377==    by 0x4706D8: do_dotCode (dotcode.c:1750)

This is a matter for the package maintainer (Cc:ed here), not R-help.

On Tue, 9 Oct 2007, Benilton Carvalho wrote:

That happened to me with R-2.4.0 (alpha) and was fixed in R-2.4.0
(final)...

http://tolstoy.newcastle.edu.au/R/e2/help/06/11/5061.html

Then I stopped using it... Now the problem seems to be back. The same
examples still apply.

This fails:

require(cluster)
set.seed(1)
x <- rnorm(100)
g <- sample(2:4, 100, replace = TRUE)  # labels 2..4, so no cluster is labelled 1
for (i in 1:100){
   print(i)
   tmp <- silhouette(g, dist(x))
}

and this works:

require(cluster)
set.seed(1)
x <- rnorm(100)
g <- sample(2:4, 100, replace = TRUE)
for (i in 1:100){
   print(i)
   # recode the labels as consecutive integers starting from 1
   tmp <- silhouette(as.integer(factor(g)), dist(x))
}
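
The second version presumably works because as.integer(factor(g)) recodes whatever labels are present as consecutive integers starting from 1.  A small base-R sketch of that recoding alone (relabel is a hypothetical name used only for illustration, and the cluster package is not needed to run it):

```r
# Hypothetical helper: recode arbitrary cluster labels as 1, 2, ..., k,
# following the sorted order of the distinct labels.
relabel <- function(labels) as.integer(factor(labels))

g <- c(3, 3, 2, 4, 2)
relabel(g)             # 2 2 1 3 1
table(g, relabel(g))   # each old label maps to exactly one new label
```

The grouping is unchanged by the recoding; only the label values differ.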

and here's the sessionInfo():

> sessionInfo()
R version 2.6.0 (2007-10-03)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.U
TF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-
8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_ID
ENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] cluster_1.11.9


(Red Hat EL 2.6.9-42 smp - AMD opteron 848)

b

On Oct 9, 2007, at 8:35 PM, Tao Shi wrote:

Hi list,

When I was using 'silhouette' from the 'cluster' package to
calculate clustering performance, R crashed.  I traced the problem
to the fact that my clustering labels contain only 2's and 3's.  When
I replaced them with 1's and 2's, the problem was solved.  Is the
function purposely written this way, so that when I have clustering
labels 2 and 3, for example, it somehow takes the
'missing' cluster 1 into account when it calculates silhouette
widths?

Thanks,

Tao

##
## sorry about the long attachment

R.Version()
$platform
[1] i386-pc-mingw32

$arch
[1] i386

$os
[1] mingw32

$system
[1] i386, mingw32

$status
[1] ""

$major
[1] 2

$minor
[1] 5.1

$year
[1] 2007

$month
[1] 06

$day
[1] 27

$`svn rev`
[1] 42083

$language
[1] R

$version.string
[1] R version 2.5.1 (2007-06-27)

library(cluster)
cl1   ## clustering labels
  [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2
[30] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[59] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[88] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[117] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[146] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[175] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[204] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
x1  ## 1-d input vector
  [1] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
  [6] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
[11] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
[16] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963
[21] 1.0163758 0.7657763 0.7370084 0.6999689 0.7366476
[26] 0.7883921 0.6925395 0.7729240 0.7202391 0.7910149
[31] 0.7397698 0.7958092 0.6978596 0.7350255 0.7294362
[36] 0.6125713 0.7174000 0.7413046 0.7044205 0.7568104
[41] 0.7048469 0.7334515 0.7143170 0.7002311 0.7540981
[46] 0.7627527 0.7712762 0.8193611 0.7801148 0.9061762
[51] 0.8248195 0.7932630 0.7248037 0.7423547 0.6419314
[56] 0.6001092 0.7572272 0.7631742 0.7085384 0.8710853
[61] 0.6589563 0.7464943 0.7487340 0.7751280 0.7946542
[66] 0.7666081 0.8508109 0.8314308 0.7442471 0.8006093
[71] 0.7949156 0.7852447 0.7630048 0.7104764 0.6768218
[76] 0.6806351 0.7255355 0.7431389 0.7523627 0.7670515
[81] 0.8118214 0.7215615 0.8186164 0.6941610 0.8285453
[86] 0.8395170 0.8088044 0.8182706 0.7550723 0.7948639
[91] 0.7204830 0.7109068 0.7756949 0.6837856 0.7055604
[96] 0.612 0.7201964 0.6849890 0.7779753 0.7845284
[101] 0.9370788 0.8242935 0.6908860 0.6446151 0.7660386
[106] 0.8141526 0.8111984 0.8624186 0.7865335 0.8213035
[111] 0.8059171 0.6735751 0.7815353 0.6972508 0.6699396
[116] 0.6293971 0.7475913 0.7700821 0.8258339 0.8096144
[121] 0.7058171 0.7516635 0.7323909 0.7229136 0.8344846
[126] 0.7205433 0.8287774 0.8322097 0.7767547 0.7402277
[131] 0.7939879 0.7797308 0.7112453 0.7091554 0.6417382
[136] 0.6369171 0.7059020 0.7496380 0.7298359 0.8202566
[141] 0.7331830 0.7344492 0.8316894 0.7323979 0.7977615
[146] 0.7841205 0.7587060 0.8056685 0.7895643 0.8140731
[151] 0.7890221 0.8016008 0.7381577 0.6936453 0.7133525
[156