Re: [R] poisson fit for histogram
Hi, see: http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf
Regards, Vito

Thomas Isenbarger (isen at plantpath.wisc.edu) wrote:
> I haven't been an R lister for a bit, but I hope to enlist someone's help here. I think this is a simple question, so I hope the answer is not much trouble. Can you please respond directly to this email address, in addition to the list (if responding to the list is warranted)?
> I have a histogram and I want to see whether the data fit a Poisson distribution. How do I do this? It would be preferable to do it without having to install any (or many) packages. I use R version 1.12 (1622) on OS X.
> Thank you very much,
> Tom Isenbarger
> [EMAIL PROTECTED]
> 608.265.0850

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
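Since the poster asked for a base-R approach, a minimal sketch of one way to check a Poisson fit is below. The `counts` vector is a made-up stand-in for the poster's data (one count per observation), not anything from the thread.

```r
# Hypothetical stand-in for the poster's data: one count per observation.
counts <- c(2, 1, 3, 0, 2, 4, 1, 2, 3, 1, 0, 2)

lambda.hat <- mean(counts)  # maximum-likelihood estimate of the Poisson mean

# Observed vs. expected frequencies for each count value
obs  <- as.vector(table(factor(counts, levels = 0:max(counts))))
expd <- length(counts) * dpois(0:max(counts), lambda.hat)

# Chi-squared goodness-of-fit statistic (pool bins first if expected
# counts are small; df = number of bins - 1 - 1 estimated parameter)
chisq <- sum((obs - expd)^2 / expd)
```

Comparing `chisq` to the appropriate chi-squared quantile, or simply overplotting `expd` on the histogram, gives a rough visual check without installing any packages.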
Re: [R] Clustered standard errors in a panel
Have you considered lmer in library(lme4)? If you are interested in this, you may want to check the article by Doug Bates in the latest R News (www.r-project.org -> Documentation: Newsletter).
spencer graves

Thomas Davidoff wrote:
> I want to do the following: glm(y ~ x1 + x2 + ...) within a panel. Hence y, x1, and x2 all vary at the individual level. However, there is likely correlation of these variables within an individual, so the standard errors need adjustment. I do not want to estimate fixed effects, but I do want to cluster the standard errors at the individual level. Is there an automated way to do this? Nothing in the cluster documentation makes it clear that there is.
> (An alternative is to do this by hand. In that case, I would need to be able to calculate weighted sums of x1, x2, ... at the individual level. I can do this one variable at a time [with lapply, split, and unsplit], but I would love to be able to do so over the whole matrix of x's. Of course, doing it by hand is less easy than an automated solution, if one exists.)
> Thomas Davidoff
> Assistant Professor, Haas School of Business, UC Berkeley
> http://faculty.haas.berkeley.edu/davidoff

-- 
Spencer Graves, PhD, Senior Development Engineer
PDF Solutions, Inc., San Jose, CA
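Since the poster mentions doing it by hand: below is a minimal base-R sketch of cluster-robust (sandwich) standard errors for a linear model. The data are simulated, `id` is a hypothetical cluster identifier, and no small-sample degrees-of-freedom correction is applied.

```r
set.seed(1)
id <- rep(1:20, each = 5)          # 20 individuals, 5 observations each
x1 <- rnorm(100); x2 <- rnorm(100)
cl.eff <- rnorm(20)                # induces within-individual correlation
y <- 1 + 2 * x1 - x2 + cl.eff[id] + rnorm(100)

fit <- lm(y ~ x1 + x2)
X <- model.matrix(fit)
u <- residuals(fit)

Xu    <- rowsum(X * u, id)         # sum of x_i * u_i within each cluster
bread <- solve(crossprod(X))       # (X'X)^{-1}
meat  <- crossprod(Xu)             # sum over clusters of score outer products
V     <- bread %*% meat %*% bread  # clustered covariance matrix
sqrt(diag(V))                      # clustered standard errors
```

This is the "weighted sums at the individual level over the whole matrix of x's" step the poster describes: `rowsum()` does it in one call instead of looping over variables with split/unsplit.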
Re: [R] family
You want to aim to write a family for R, not find the equivalents of S constructs -- they are different, so exact equivalents do not exist. In particular, an R family has several components which an S family does not. There are lots of example families for you to follow (e.g. see ?family and the negative binomial families in the MASS package). Others have found reading the sources sufficient to write new families.

On Thu, 21 Jul 2005, Dr L. Y Hin wrote:
> I am in the process of migrating an S programme library to R, and it involves the family entity. I have checked ?family, but it does not give much detail about its components. I would be very grateful if anyone could point me towards sources/ways to read up on this area, with an aim to find the equivalents of the following in S:
>   family()$inverse
>   family()$deriv
>   family()$variance
>   family()$deviance

-- 
Brian D. Ripley, Professor of Applied Statistics, University of Oxford
http://www.stats.ox.ac.uk/~ripley/
[R] again, a question between R and C++
Dear R Users,
I want to make a call from R into C++. My inputs are List1, List2, List3, and IntegerID. The number of elements of the lists, and their types, depend on IntegerID. Typical elements of a given list can be vectors, doubles, and even other lists. I also want to return a list (whose nature will depend, possibly, on IntegerID).
What I want to do is pass these 4 inputs to C++ and then use a factory pattern (keyed on IntegerID) that performs different calculations on the lists depending on the IntegerID (of course, I could also do this with a simple switch statement). I have been reading the documentation, especially the parts regarding .Call and .External, and it seems that my algorithm could be implemented, but the examples I have seen so far are such that what occupies the place of my lists are just vectors (as in the convolve4 example).
Is there an example where I could see how, instead of a vector, a set of lists (with an unknown number of arguments, as well as unknown types) is used as input? I guess the ideal would be that, in the equivalent of the convolve4 function, my args would be variant-type lists, and then, after the factory pattern is called and the correct class is registered (via IntegerID), this variant type is decomposed into the individual types that compose the list (i.e., vectors, doubles, ...). Of course, in the factory there should be as many decomposing algorithms as IntegerIDs, each creating a particular decomposition. Also, how should returning a list (whose nature will depend, possibly, on IntegerID) be handled?
Thank you in advance
Jordi
Re: [R] cut in R
Steve Su <[EMAIL PROTECTED]> writes:
> Dear All,
> I wonder whether it is still valid to use the following R code for cut. All I have done is change:
>   if (is.na(breaks) | breaks < 2)
> to:
>   if (is.na(breaks) | breaks < 1)
> so that it covers an interval of 1? It seems okay for my purposes, but I am not sure why R specifically does not allow breaks < 2 to happen.
> Steve.

What do you need it for? It gives you a factor with only one group, so I suppose the idea is that this is more likely to be due to a programming error.

(However, I spot a bit of a bug in that we don't set include.lowest=TRUE when using breaks as a number:

x <- round(rnorm(20), 2)
x
 [1]  0.66 -2.22 -0.70 -1.68  0.38 -0.23 -0.43 -0.72  0.30 -0.22 -1.36  0.60
[13]  0.44 -0.40 -0.61  1.08 -0.41 -0.02 -1.41 -0.49
cut(x, breaks = 3)
 [1] (-0.0189,1.08]  (-2.22,-1.12]   (-1.12,-0.0189] (-2.22,-1.12]
 [5] (-0.0189,1.08]  (-1.12,-0.0189] (-1.12,-0.0189] (-1.12,-0.0189]
 [9] (-0.0189,1.08]  (-1.12,-0.0189] (-2.22,-1.12]   (-0.0189,1.08]
[13] (-0.0189,1.08]  (-1.12,-0.0189] (-1.12,-0.0189] (-0.0189,1.08]
[17] (-1.12,-0.0189] (-1.12,-0.0189] (-2.22,-1.12]   (-1.12,-0.0189]
Levels: (-2.22,-1.12] (-1.12,-0.0189] (-0.0189,1.08]

Notice how -2.22 appears to be inside the interval (-2.22,-1.12].)

-- 
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
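The include.lowest behaviour Peter describes is easy to see with explicit breaks; this is just an illustrative sketch, not code from the thread. When breaks is a single number, cut() pads the range slightly, so the minimum lands inside the lowest interval even though the label looks half-open.

```r
x <- c(1, 2, 3, 4, 5)

# With explicit breaks and the default include.lowest = FALSE,
# the minimum falls outside every (a, b] interval and becomes NA:
table(cut(x, breaks = c(1, 3, 5)), useNA = "ifany")

# With include.lowest = TRUE the lowest interval is closed on the left,
# so the minimum is kept:
table(cut(x, breaks = c(1, 3, 5), include.lowest = TRUE))
```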
[R] dpill in KernSmooth package
Hi, just a quick question: does dpill compute the bandwidth or the half-bandwidth? The help says bandwidth, but in the literature there is often confusion between the bandwidth and the half-bandwidth.
thanks,
Giacomo
[R] Question about 'text' (add lm summary to a plot)
I would like to annotate my plot with a little box containing the slope, intercept and R^2 of an lm on the data. I would like it to look like...

+-----------------------------+
| Slope     :   3.45 +-  0.34 |
| Intercept : -10.43 +-  1.42 |
| R^2       :   0.78          |
+-----------------------------+

However I can't make anything this neat, and I can't find out how to combine this with symbols for R^2 / +- (plus minus). Below is my best attempt (which is frankly quite poor). Can anyone improve on it? Specifically: aligned text and numbers, aligned decimal places, a symbol for R^2 in the text (expression(R^2) seems to fail with 'paste'), and +-.

Cheers, Dan.

dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS)
abline(coef(dat.lm), lty = 2, lwd = 1.5)
dat.lm.sum <- summary(dat.lm)
dat.lm.sum
attributes(dat.lm.sum)
my.text.1 <- paste("Slope : ", round(dat.lm.sum$coefficients[2], 2), "+/-", round(dat.lm.sum$coefficients[4], 2))
my.text.2 <- paste("Intercept : ", round(dat.lm.sum$coefficients[1], 2), "+/-", round(dat.lm.sum$coefficients[3], 2))
my.text.3 <- paste("R^2 : ", round(dat.lm.sum$r.squared, 2))
my.text.1
my.text.2
my.text.3
## Add legend
text(x = 3, y = 300, paste(my.text.1, my.text.2, my.text.3, sep = "\n"), adj = c(0, 0), cex = 1)
[R] bubble.plot() - standardize size of unit circle
Hello,
I wrote a wrapper for symbols() that produces a bivariate bubble plot, for use when plot(x,y) hides multiple occurrences of the same x,y combination (e.g. if x,y are integers). Circle area ~ counts per bin, and circle size is controlled by 'scale'. Question: how can I automatically make the smallest circle the same size as a standard plot character, rather than having to approximate it using 'scale'?

# Function:
bubble.plot <- function(x, y, scale = 0.1, xlab = substitute(x), ylab = substitute(y), ...) {
    z <- table(x, y)
    xx <- rep(as.numeric(rownames(z)), ncol(z))
    yy <- sort(rep(as.numeric(colnames(z)), nrow(z)))
    id <- which(z != 0)
    symbols(xx[id], yy[id], inches = F, circles = sqrt(z[id]) * scale,
            xlab = xlab, ylab = ylab, ...)
}
# Example:
x <- rpois(100, 3)
y <- x + rpois(100, 2)
bubble.plot(x, y)
Re: [R] Is it possible to create highly customized report in *.xls format by using R/S+?
Thank you all for the replies. It is very eye-opening for me. I probably need something like RDCOMClient. I've tried it last night. Very nice package!!!

On 7/20/05, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> Here is an example where R is the client and Excel is the server, so that R is issuing commands to Excel. This example uses the RDCOMClient package from www.omegahat.org:
>
> library(RDCOMClient)
> xl <- COMCreate("Excel.Application")   # starts up Excel
> xl[["Visible"]] <- TRUE                # Excel becomes visible
> wkbk <- xl$Workbooks()$Add()           # new workbook
> # set some cells
> sh <- xl$ActiveSheet()
> x12 <- sh$Cells(1, 2)
> x12[["Value"]] <- 123
> x22 <- sh$Cells(2, 2)
> x22[["Value"]] <- 100
> x31 <- sh$Cells(3, 1)
> x31[["Value"]] <- "Total"
> B3R <- sh$Range("B3")
> B3R[["Formula"]] <- "=Sum(R1C2:R2C2)"
> B3R[["NumberFormat"]] <- "_($* #,##0.00_)"
> B3RF <- B3R$Font()
> B3RF[["Bold"]] <- TRUE
> # save and exit
> wkbk$SaveAs("\\test.xls")
> xl$Quit()
>
> Code using the rcom package (the second link is the mailing list): http://sunsite.univie.ac.at/rcom/download/ http://mailman.csd.univie.ac.at/pipermail/rcom-l/ would be nearly identical once the upcoming version of rcom comes out. rcom and omegahat both provide the possibility of having Excel as the client and R as the server; however, in that setup every user would have to have R running, whereas in the above setup only you do.
>
> On 7/20/05, Wensui Liu <[EMAIL PROTECTED]> wrote:
>> I appreciate your reply and understand your point completely. But at times we can't change the rules; the only choice is to follow them. Most deliverables in my work are in Excel format.
>> On 7/20/05, Greg Snow <[EMAIL PROTECTED]> wrote:
>>> See: http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html and http://www.stat.uiowa.edu/~jcryer/JSMTalk2001.pdf
>>> Greg Snow, Ph.D., Statistical Data Center, LDS Hospital, Intermountain Health Care
>>>> Wensui Liu <[EMAIL PROTECTED]> 07/19/05 03:22PM:
>>>> I remember that in one slide of Prof. Ripley's presentation overheads, he said the most popular data analysis software is Excel. So is there any resource or tutorial on this topic? Thank you so much!

-- 
WenSui Liu, MS MA
Senior Decision Support Analyst
Division of Health Policy and Clinical Effectiveness
Cincinnati Children's Hospital Medical Center
Re: [R] again, a question between R and C++
Jordi,
The place to ask this question is probably the r-devel list; it's a little too heavy for r-help. This is fairly easy to do using the .Call interface. Have a look at lapply2 in the Writing R Extensions manual, http://cran.r-project.org/doc/manuals/R-exts.html#Evaluating-R-expressions-from-C or just follow this short example. Write a function in C++ as follows:

SEXP myFunc(SEXP list1, SEXP list2, SEXP list3, SEXP list4, SEXP intID_SEXP)
{
    // obtain the list length as follows:
    int list1_len = length(list1);

    // to access your integer (I assume it's a scalar, not a vector)
    // you need to grab the first element of this integer vector
    int intID = INTEGER(intID_SEXP)[0];

    // you will want to add some checks to make sure the arguments
    // are of the right type
    ...
    SEXP ans = (whatever);
    return ans;
}

You can call it in R as follows:

.Call("myFunc", list1, list2, list3, list4, intID)
Re: [R] The steps of building library in R 2.1.1
Ivy_Li wrote:
> Dear All,
> With the warm support of every R expert, I have built my R library successfully. Especially thanks: Duncan Murdoch, Gabor Grothendieck, Henrik Bengtsson, Uwe Ligges.

You are welcome. The following is intended for the records in the archive, in order to protect readers.

> Without your help, I would work less efficiently. I noticed that some other friends were puzzled by the method of building a library. Now I have organized a document about it, hoping it can help more friends.
> 1. Read the webpage http://www.stats.ox.ac.uk/pub/Rtools

Do you mean http://www.murdoch-sutherland.com/Rtools/ ?

> 2. Download rw2011.exe; install the newest version of R.
> 3. Download tools.zip; unpack it into c:\cygwin

Not required to call it cygwin - also a bit misleading...

> 4. Download ActivePerl-5.6.1.633-MSWin32-x86.msi; install Active Perl in c:\Perl

Why in C:\Perl ?

> 5. Download MinGW-3.1.0-1.exe; install the mingw32 port of gcc in c:\mingwin

Why in c:\mingwin ?

> 6. Then go to Control Panel -> System -> Advanced -> Environment Variables -> Path -> Variable Value; add c:\cygwin;c:\mingwin\bin

The PATH variable already contains a couple of paths; add the two given above in front of all the others, separated by ";".

> Why do we add them at the beginning of the path? Because we want the folder that contains the tools to be at the beginning, so that you eliminate the possibility of finding a different program of the same name first in a folder that comes before the one where the tools are stored.

OK, this (1-6) is all described in the R Administration and Installation manual, hence I do not see why we have to repeat it here.

> 7. I use the package.skeleton() function to make a draft package. It will automate some of the setup for a new source package. It creates directories, saves functions and data to appropriate places, and creates skeleton help files and 'README' files describing further steps in packaging.
> I type in R:
>   f <- function(x, y) x + y
>   g <- function(x, y) x - y
>   d <- data.frame(a = 1, b = 2)
>   e <- rnorm(1000)
>   package.skeleton(list = c("f", "g", "d", "e"), name = "example")
> Then modify the 'DESCRIPTION':
>   Package: example
>   Version: 1.0-1
>   Date: 2005-07-09
>   Title: My first function
>   Author: Ivy [EMAIL PROTECTED]
>   Maintainer: Ivy [EMAIL PROTECTED]
>   Description: simple sum and subtract
>   License: GPL version 2 or later
>   Depends: R (>= 1.9), stats, graphics, utils
> You can refer to the web page http://cran.r-project.org/src/contrib/Descriptions/ for a larger source of examples. And you can read the 'Creating R packages' part of 'Writing R Extensions'; it introduces some useful things for your reference.

This is described in Writing R Extensions and is not related to the setup of your system in 1-6.

> 8. Download hhc.exe, the Microsoft help compiler, from somewhere, and save it somewhere in your path. I downloaded 'htmlhelp.exe', ran the setup, and saved hhc.exe into 'C:\cygwin\bin', because this path has been written into my PATH variable value. However, if you decide not to use the Help Compiler (hhc), then you need to modify the MkRules file in RHOME/src/gnuwin32 to tell it not to try to build that kind of help file.

This is described in the R Administration and Installation manual, and I do not see why we should put the help compiler with the other tools.

> 9. In the DOS environment, go to D:\ and type the following code:

There is no DOS environment in Windows NT based operating systems.

>   cd \Program Files\R\rw2010
>   bin\R CMD INSTALL "/Program Files/R/rw2011/example"

I do not see why anybody would want to contaminate the binary installation of R with development source packages. I'd rather use a separate directory. I think reading the two manuals mentioned should be sufficient. You have not added relevant information. By adding irrelevant information and omitting some relevant information, I guess we got something that is misleading if the reader does NOT read the manuals as well.
Best,
Uwe Ligges

> Firstly, because I installed the new version of R in D:\Program Files\, I should first go into the D drive. Secondly, because I used the package.skeleton() function to build the 'example' package in the path D:\Program Files\R\rw2011\, I must tell R the path where the 'example' package is saved. That is why I wrote the code like that. If your path is different from mine, you should modify part of this code.
> 10. Finally, the package is successfully built:
>   -- Making package example
>     adding build stamp to DESCRIPTION
>     installing R files
>     installing data files
>     installing man source files
>     installing indices
>     not zipping data
>     installing help
>     Building/Updating help pages for package 'example'
[R] Problem with read.table()
Dear all,
I have encountered a strange problem with read.table(). When I try to read a tab-delimited file, I get an error message about line 260 not having 14 elements (see below). Using count.fields() suggests that a number of lines have length not equal to 14, but not line 260. Looking at the actual file, however, I cannot see anything wrong with any of these lines. They all seem to have length 14, there are no double tabs etc., and the file reads correctly in other programs. Does anyone have any suggestions as to what this might stem from? I have placed a copy of the file at http://dss.ucsd.edu/~kgledits/archigos_v.1.9.asc
regards, Kristian Skrede Gleditsch

archigos1.9 <- read.table("c:/work/work12/archigos/archigos_v.1.9.asc",
+     sep = "\t", header = TRUE, as.is = TRUE, row.names = NULL)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
        line 260 did not have 14 elements
a <- count.fields("c:/work/work12/archigos/archigos_v.1.9.asc", sep = "\t")
a <- data.frame(c(1:length(a)), a)
a[a[, 2] != 14, ]
     c.1.length.a..  a
150             150 10
313             313 10
424             424 10
1189           1189  5
1510           1510 10
1514           1514 10
1590           1590  5
1600           1600 10
1612           1612 10
1618           1618 10
1619           1619 10
1709           1709 10
1722           1722 10
1981           1981 10
1985           1985 10
2112           2112 10
2178           2178 10
2208           2208 10
2224           2224 10
2530           2530  5
2536           2536  5
2573           2573  5
2928           2928  5

-- 
Kristian Skrede Gleditsch
Department of Political Science, UCSD (on leave, University of Essex, 2005-6)
http://weber.ucsd.edu/~kgledits/
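A common cause of exactly this symptom is an unmatched quote character (e.g. an apostrophe in a name): scan() then swallows several lines into one field, so the line number in the error no longer matches what count.fields() reports with default quoting. The small self-contained sketch below (not the poster's file) shows the effect and the usual fix, quote = "".

```r
# Build a tiny tab-delimited file with a stray apostrophe.
tf <- tempfile()
writeLines(c("a\tb\tc",
             "1\tO'Brien\t2",   # apostrophe opens a quoted string...
             "3\tx\t4",
             "5\ty'\t6"), tf)   # ...which only closes two lines later

# With default quoting, the records between the apostrophes get merged:
count.fields(tf, sep = "\t")

# Disabling quoting makes every line a clean 3-field record:
count.fields(tf, sep = "\t", quote = "")
dat <- read.table(tf, sep = "\t", header = TRUE, quote = "")
nrow(dat)   # 3
```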
[R] R graphics
Hi
I am trying to set up 16 graphs on one graphics page in R. I have used the mfrow=c(4,4) command. However, I get a lot of white space between the graphs. Does anyone know how I can reduce this?
Thanks
Sam
Re: [R] Is it possible to create highly customized report in *.xls format by using R/S+?
So your conclusion is that the only choice is to make mistakes and get in trouble. (That's what Excel excels at.) Two options I haven't seen mentioned are:
1. Create your deliverables in HTML format and change the extension from .htm to .xls; Excel will import them automatically. The way the file looks in Excel is determined by CSS settings (I've seen this happen) and, I presume, HTML tags.
2. For the real spreadsheet thing, switch to OpenOffice.org. Their format is XML compressed with ZIP, which you can easily work with since the format specifications are not proprietary. See http://xml.openoffice.org/ for details.

-----Original Message-----
From: Wensui Liu
Sent: Wednesday, July 20, 2005 10:56 AM
To: Greg Snow
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Is it possible to create highly customized report in *.xls format by using R/S+?

> I appreciate your reply and understand your point completely. But at times we can't change the rules; the only choice is to follow them. Most deliverables in my work are in Excel format.
> On 7/20/05, Greg Snow <[EMAIL PROTECTED]> wrote:
>> See: http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html and http://www.stat.uiowa.edu/~jcryer/JSMTalk2001.pdf
>> Greg Snow, Ph.D., Statistical Data Center, LDS Hospital, Intermountain Health Care
>>> Wensui Liu <[EMAIL PROTECTED]> 07/19/05 03:22PM:
>>> I remember that in one slide of Prof. Ripley's presentation overheads, he said the most popular data analysis software is Excel. So is there any resource or tutorial on this topic? Thank you so much!
Re: [R] Question about 'text' (add lm summary to a plot)
Dear Dan
I can only help you with your third problem, expression and paste. You can use:

plot(1:5, 1:5, type = "n")
text(2, 4, expression(paste("Slope : ", 3.45 %+-% 0.34, sep = "")), pos = 4)
text(2, 3.8, expression(paste("Intercept : ", -10.43 %+-% 1.42)), pos = 4)
text(2, 3.6, expression(paste(R^2, ": ", 0.78, sep = "")), pos = 4)

I do not have an elegant solution for the alignment.
Regards,
Christoph Buser

-- 
Christoph Buser, Seminar fuer Statistik, ETH Zurich
http://stat.ethz.ch/~buser/
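For the alignment part, one option (a sketch, not from the thread) is to format the numbers with sprintf() fixed-width specifiers and draw the labels in a monospace font, so columns and decimal points line up; the trade-off is giving up the plotmath R^2 and %+-% symbols. The numbers below are the ones from Dan's example.

```r
plot(1:5, 1:5, type = "n")
# "%-9s" left-pads the label to 9 chars; "%7.2f" right-aligns the numbers.
lab <- sprintf("%-9s: %7.2f +- %5.2f",
               c("Slope", "Intercept"), c(3.45, -10.43), c(0.34, 1.42))
lab <- c(lab, sprintf("%-9s: %7.2f", "R^2", 0.78))
text(2, c(4, 3.8, 3.6), lab, pos = 4, family = "mono")
```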
[R] RandomForest question
Hello,
I'm trying to find out the optimal number of variables tried at each split (the mtry parameter) for a randomForest classification. The classification is binary, and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when choosing mtry=80. How is it possible that more variables can be used than there are columns in the data frame?
thanks for your help + kind regards,
Arne
[R] heatmap color distribution
Hi all,
I've got a set of gene expression data, and I'm plotting several heatmaps for subsets of the whole set. I'd like the heatmaps to have the same color distribution, so that comparisons may be made (roughly) across heatmaps; this would require that the color distribution and distance functions be based on the entire dataset, rather than on the individual subsets. Does anyone know how to do this?
Thanks in advance,
Jake
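One approach to the color part (a sketch with simulated data, not from the thread): compute a single set of breaks from the full matrix and pass it, together with a matching palette, to each subset's heatmap() call; heatmap() forwards 'col' and 'breaks' on to image(). Setting scale = "none" keeps all subsets on the shared scale.

```r
set.seed(42)
full <- matrix(rnorm(400), 40, 10)     # stand-in for the full expression matrix

# One shared set of breaks covering the whole dataset's range
brks <- seq(min(full), max(full), length.out = 33)
cols <- heat.colors(length(brks) - 1)  # one color per bin, reused everywhere

heatmap(full[1:20, ],  col = cols, breaks = brks, scale = "none")
heatmap(full[21:40, ], col = cols, breaks = brks, scale = "none")
```

Making the *clustering* comparable is a separate step: compute the dendrograms on the full matrix and pass the relevant pieces via Rowv/Colv, rather than letting each call recluster its subset.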
Re: [R] Chemoinformatic people
Just with R, or via another tool integrating R, such as Pipeline Pilot?
best,
-tony

On 7/20/05, Frédéric Ooms <[EMAIL PROTECTED]> wrote:
> Dear colleague,
> Just an e-mail to ask whether there are people working in the field of chemoinformatics who are using R in their work. If so, I was wondering whether we could exchange tips and tricks about the use of R in this area?
> Best regards
> Fred Ooms

-- 
best, -tony
A.J. Rossini
"Commit early, commit often, and commit in a repository from which we can easily roll back your mistakes" (AJR, 4Jan05).
Re: [R] R graphics
Sam Baxter wrote:
> Hi
> I am trying to set up 16 graphs on one graphics page in R. I have used the mfrow=c(4,4) command. However, I get a lot of white space between the graphs. Does anyone know how I can reduce this?
> Thanks
> Sam

?layout as an alternative to par(mfrow) might be helpful.
Anyway, the margins are too large: see ?par and reduce the value of mar, for instance.
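Following Uwe's ?par hint, a small sketch of shrinking the per-panel margins for a 4x4 grid; 'mar' is given in lines as c(bottom, left, top, right), and 'oma' adds a shared outer margin so axis labels still fit.

```r
op <- par(mfrow = c(4, 4),
          mar = c(2, 2, 1, 0.5),   # tight per-panel margins
          oma = c(2, 2, 0, 0))     # shared outer margin for labels
for (i in 1:16)
    plot(rnorm(10), main = paste("panel", i), cex.main = 0.8)
par(op)   # restore the previous settings
```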
Re: [R] The steps of building library in R 2.1.1
I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). 2. there is too much material to absorb just to create a package. The manuals are insufficient. A step-by-step simplification is very much needed. Its no coincidence that there are a number of such descriptions on the net (google for 'making creating R package') since I would guess that just about everyone has significant problems in creating their first package on Windows. On 7/21/05, Uwe Ligges [EMAIL PROTECTED] wrote: Ivy_Li wrote: Dear All, With the warm support of every R expert, I have built my R library successfully. Especially thanks: Duncan Murdoch Gabor Grothendieck Henrik Bengtsson Uwe Ligges You are welcome. The following is intended for the records in the archive in order to protect readers. Without your help, I will lower efficiency. I noticed that some other friends were puzzled by the method of building library. Now, I organize a document about it. Hoping it can help more friends. 1. Read the webpage http://www.stats.ox.ac.uk/pub/Rtools Do you mean http://www.murdoch-sutherland.com/Rtools/ ? 2. Download the rw2011.exe; Install the newest version of R 3. Download the tools.zip; Unpack it into c:\cygwin Not required to call it cygwin - also a bit misleading... 4. Download the ActivePerl-5.6.1.633-MSWin32-x86.msi; Install Active Perl in c:\Perl Why in C:\Perl ? 5. Download the MinGW-3.1.0-1.exe; Install the mingw32 port of gcc in c:\mingwin Why in c:\mingwin ? 6. 
Then go to Control Panel - System - Advanced - Environment Variables - Path - Variable Value; add c:\cygwin;c:\mingwin\bin The PATH variable already contains a couple of paths; add the two given above in front of all the others, separated by ;. Why do we add them at the beginning of the path? Because we want the folder that contains the tools to come first, which eliminates the possibility of finding a different program of the same name in a folder that comes before the one where the tools are stored. OK, this (1-6) is all described in the R Installation and Administration manual, hence I do not see why we have to repeat it here. 7. I use the package.skeleton() function to make a draft package. It automates some of the setup for a new source package: it creates directories, saves functions and data to appropriate places, and creates skeleton help files and 'README' files describing further steps in packaging. I type in R:
f <- function(x, y) x + y
g <- function(x, y) x - y
d <- data.frame(a = 1, b = 2)
e <- rnorm(1000)
package.skeleton(list = c("f", "g", "d", "e"), name = "example")
Then modify the 'DESCRIPTION': Package: example Version: 1.0-1 Date: 2005-07-09 Title: My first function Author: Ivy [EMAIL PROTECTED] Maintainer: Ivy [EMAIL PROTECTED] Description: simple sum and subtract License: GPL version 2 or later Depends: R (>= 1.9), stats, graphics, utils You can refer to the web page: http://cran.r-project.org/src/contrib/Descriptions/ It is a large source of examples. And you can read the 'Creating R packages' part of 'Writing R Extensions'. It introduces some useful things for your reference. This is described in Writing R Extensions and is not related to the setup of your system in 1-6. 8. Download hhc.exe, the Microsoft HTML Help compiler, from somewhere, and save it somewhere in your path. I downloaded 'htmlhelp.exe', ran its setup, and saved hhc.exe into 'C:\cygwin\bin' because this path has been written into my PATH variable value.
However, if you decide not to use the Help Compiler (hhc), then you need to modify the MkRules file in RHOME/src/gnuwin32 to tell it not to try to build that kind of help file. This is described in the R Installation and Administration manual, and I do not see why we should put the HTML Help compiler with the other tools. 9. In the DOS environment, go into D:\ and type the following code: There is no DOS environment in Windows NT based operating systems. cd \Program Files\R\rw2010 bin\R CMD INSTALL /Program Files/R/rw2011/example I do not see why anybody would like to contaminate the binary installation of R with some development source packages; I'd rather use a separate directory. I think reading the two mentioned manuals should be sufficient. You have not added relevant information. By adding irrelevant information and
Re: [R] R graphics
Sam Baxter wrote: Hi I am trying to set up 16 graphs on one graphics page in R. I have used the mfrow=c(4,4) command. However I get a lot of white space between each graph. Does anyone know how I can reduce this? Thanks Sam Two options: 1. play around with the `mar' parameter in ?par. 2. (Preferred) Use the lattice package. See, for example:
library(lattice)
trellis.device(theme = col.whitebg())
z <- expand.grid(x = 1:10, y = 1:10, g = LETTERS[1:16])
xyplot(y ~ x | g, z)
HTH, --sundar
Re: [R] Chemoinformatic people
I am looking for both. Fred -Original Message- From: A.J. Rossini [mailto:[EMAIL PROTECTED] Sent: Thursday, July 21, 2005 3:36 PM To: Frédéric Ooms Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Chemoinformatic people Just with R, or via another tool integrating R, such as Pipeline Pilot? best, -tony On 7/20/05, Frédéric Ooms [EMAIL PROTECTED] wrote: Dear colleague, Just an e-mail to ask whether there are people working in the field of chemoinformatics who are using R in their work. If yes, I was wondering if we couldn't exchange tips and tricks about the use of R in this area? Best regards Fred Ooms -- best, -tony Commit early, commit often, and commit in a repository from which we can easily roll back your mistakes (AJR, 4Jan05). A.J. Rossini [EMAIL PROTECTED]
Re: [R] bubble.plot() - standardize size of unit circle
Thanks - 'sizeplot' didn't come up in any of my searches. Dan --- Jim Lemon [EMAIL PROTECTED] wrote: Dan Bebber wrote: Hello, I wrote a wrapper for symbols() that produces a bivariate bubble plot, for use when plot(x,y) hides multiple occurrences of the same x,y combination (e.g. if x,y are integers). Circle area ~ counts per bin, and circle size is controlled by 'scale'. Question: how can I automatically make the smallest circle the same size as a standard plot character, rather than having to approximate it using 'scale'? Ben Bolker's sizeplot in the plotrix package does this using the standard plotting symbol 1. Jim
Re: [R] Problem with read.table()
On Thu, 21 Jul 2005, Kristian Skrede Gleditsch wrote: Dear all, I have encountered a strange problem with read.table(). Most `strange problems' are user error, so please try not to blame your tools. When I try to read a tab delimited file I get an error message for line 260 not being equal to 14 (see below). Yes, but not line 260 in that file, but line 260 as read by scan(). Think about quotes ... it works for me with quote="", and the quote around line 150 is causing you to get some very large fields with embedded newlines and tabs. BTW, there is an 'R Data Import/Export' manual which goes step-by-step through the assumptions you make when using read.table with various options. Do read it now. Using count.fields() suggests that a number of lines have length not equal to 14, but not 260. Looking at the actual file, however, I cannot see anything wrong with any lines. They all seem to have length 14, there are no double tabs etc., and the file reads correctly in other programs. Does anyone have any suggestions as to what this might stem from? I have placed a copy of the file at http://dss.ucsd.edu/~kgledits/archigos_v.1.9.asc regards, Kristian Skrede Gleditsch
archigos1.9 <- read.table("c:/work/work12/archigos/archigos_v.1.9.asc", sep="\t", header=TRUE, as.is=TRUE, row.names=NULL)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 260 did not have 14 elements
a <- count.fields("c:/work/work12/archigos/archigos_v.1.9.asc", sep="\t")
a <- data.frame(c(1:length(a)), a)
a[a[,2] != 14,]
c.1.length.a..
a 150 150 10 313 313 10 424 424 10 1189 1189 5 1510 1510 10 1514 1514 10 1590 1590 5 1600 1600 10 1612 1612 10 1618 1618 10 1619 1619 10 1709 1709 10 1722 1722 10 1981 1981 10 1985 1985 10 2112 2112 10 2178 2178 10 2208 2208 10 2224 2224 10 2530 2530 5 2536 2536 5 2573 2573 5 2928 2928 5 -- Kristian Skrede Gleditsch Department of Political Science, UCSD (On leave, University of Essex, 2005-6) Tel: +44 1206 872499, Fax: +44 1206 873234 Email: [EMAIL PROTECTED] or [EMAIL PROTECTED] http://weber.ucsd.edu/~kgledits/ -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK Tel: +44 1865 272861 (self), +44 1865 272866 (PA) Fax: +44 1865 272595
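Brian's quote= point can be demonstrated on a tiny made-up fragment (current R; the data below are illustrative, not from the Archigos file):

```r
# By default both " and ' act as quote characters in read.table(), so a
# lone apostrophe starts a "string" that swallows the following tabs and
# newlines, merging fields across records. quote = "" switches quote
# processing off entirely.
txt <- "id\tname\tval\n1\tO'Brien\t3\n2\tSmith\t6\n"
d <- read.table(textConnection(txt), sep = "\t", header = TRUE,
                quote = "", stringsAsFactors = FALSE)
d$name   # the apostrophe survives and all three columns stay intact
```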
Re: [R] Clustered standard errors in a panel
No, he wants to fit a glm() and get the right standard errors. For linear models the best way to do this is to model some random effects, but doing so in glm changes the meanings of the parameters. To estimate the same parameters you want to use the sandwich standard errors variously attributed to Huber and White. The Design package has robcov() to do this, and there is also code at http://faculty.washington.edu/tlumley/data2001/sandwich.R -thomas On Wed, 20 Jul 2005, Spencer Graves wrote: Have you considered lmer in library(lme4)? If you are interested in this, you may want to check the article by Doug Bates in the latest R news, www.r-project.org - Documentation: Newsletter. spencer graves Thomas Davidoff wrote: I want to do the following: glm(y ~ x1 + x2 +...) within a panel. Hence y, x1, and x2 all vary at the individual level. However, there is likely correlation of these variables within an individual, so standard errors need adjustment. I do not want to estimate fixed effects, but do want to cluster standard errors at the individual level. Is there an automated way to do this? Nothing in the cluster documentation makes it clear that there is. (An alternative is to do this by hand. In that case, I would need to be able to calculate weighted sums of x1 and x2... at the individual level. I can do this at the variable level [with lapply,split and unsplit], but would love to be able to do so over the matrix of x's. Of course, doing by hand is less easy than an automated solution if it exists.) Thomas Davidoff Assistant Professor Haas School of Business UC Berkeley Berkeley, CA 94720 phone: (510) 643-1425 fax:(510) 643-7357 [EMAIL PROTECTED] http://faculty.haas.berkeley.edu/davidoff [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Spencer Graves, PhD Senior Development Engineer PDF Solutions, Inc. 
333 West San Carlos Street, Suite 700, San Jose, CA 95110, USA [EMAIL PROTECTED] www.pdf.com http://www.pdf.com Tel: 408-938-4420 Fax: 408-280-7915 Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
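Thomas's sandwich estimator is simple enough to write out in base R. The sketch below (simulated data, hypothetical variable names) computes cluster-robust Huber/White standard errors for a canonical-link glm by hand, rather than via Design's robcov() or the linked code:

```r
# Cluster-robust ("sandwich") standard errors for a canonical-link glm:
# bread = inverse Fisher information (vcov of the fit), meat = outer
# product of the per-cluster summed score contributions.
set.seed(1)
id <- rep(1:50, each = 4)                 # 50 individuals, 4 obs each
x1 <- rnorm(200); x2 <- rnorm(200)
y  <- rbinom(200, 1, plogis(0.5 * x1 - x2))
fit <- glm(y ~ x1 + x2, family = binomial)

X  <- model.matrix(fit)
u  <- (y - fitted(fit)) * X               # score contributions (canonical link)
ug <- rowsum(u, id)                       # sum the scores within each cluster
B  <- vcov(fit)                           # the "bread"
V  <- B %*% crossprod(ug) %*% B           # sandwich variance matrix
cluster.se <- sqrt(diag(V))
cbind(naive = sqrt(diag(vcov(fit))), clustered = cluster.se)
```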
[R] customized link function in R
Hello! I am trying to run my S+ code in R (version 2.1.0). I've created a customized link function, namely my.binomial, where a parameter theta has to be given. I'm taking theta to be .05. Unfortunately, R is giving an error (I had used MASS in adjusting the S+ code to R). I would very much appreciate it if you could help me find the correction needed in the code. The S+ code I'm trying to adjust to R is:
g <- glm(y ~ diffwhale + tdelta:diffwhale + tdelta2:diffwhale, data = ind, family = my.binomial(theta = .05))
Thanks, Isin
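The thread never shows my.binomial() itself, but in current R a custom link can be supplied to binomial() as a "link-glm" object. The sketch below is a template, not Isin's function: every name is hypothetical, and for illustration each piece is just the ordinary logit, where a real theta-dependent link would substitute its own formulas:

```r
# A "link-glm" object must supply the link function, its inverse, the
# derivative d(mu)/d(eta), and a validity check for eta. Here each piece
# is the standard logit, written out explicitly as a template.
my.link <- structure(list(
  linkfun  = function(mu)  qlogis(mu),    # eta = g(mu)
  linkinv  = function(eta) plogis(eta),   # mu  = g^{-1}(eta)
  mu.eta   = function(eta) dlogis(eta),   # d(mu)/d(eta)
  valideta = function(eta) TRUE,
  name     = "my.link"
), class = "link-glm")

set.seed(2)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(x))
fit <- glm(y ~ x, family = binomial(link = my.link))
coef(fit)
```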
Re: [R] Chemoinformatic people
I am just curious why I always want to have a position like that but never find one. Am I lazy, or just unlucky in the job hunt? weiwei On 7/21/05, Frédéric Ooms [EMAIL PROTECTED] wrote: I am looking for both. Fred -Original Message- From: A.J. Rossini [mailto:[EMAIL PROTECTED] Sent: Thursday, July 21, 2005 3:36 PM To: Frédéric Ooms Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Chemoinformatic people Just with R, or via another tool integrating R, such as Pipeline Pilot? best, -tony On 7/20/05, Frédéric Ooms [EMAIL PROTECTED] wrote: Dear colleague, Just an e-mail to ask whether there are people working in the field of chemoinformatics who are using R in their work. If yes, I was wondering if we couldn't exchange tips and tricks about the use of R in this area? Best regards Fred Ooms -- best, -tony Commit early, commit often, and commit in a repository from which we can easily roll back your mistakes (AJR, 4Jan05). A.J. Rossini [EMAIL PROTECTED] -- Weiwei Shi, Ph.D Did you always know? No, I did not. But I believed... ---Matrix III
Re: [R] heatmap color distribution
You can use the breaks argument in image to do this. (You don't specify a function you're using, but other heatmap functions probably have a similar parameter.) Look across all your data, figure out the ranges you want to have different colors, and specify the appropriate break points in each call to image. Then you're using the same color set in each one. You run the risk, of course, that some of your images will have a very narrow color range, which might obscure interesting features. But nothing stops you from making more than one plot. Hope this helps. Regards, Matt Wiener -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jacob Michaelson Sent: Thursday, July 21, 2005 9:26 AM To: r-help@stat.math.ethz.ch Subject: [R] heatmap color distribution Hi all, I've got a set of gene expression data, and I'm plotting several heatmaps for subsets of the whole set. I'd like the heatmaps to have the same color distribution, so that comparisons may be made (roughly) across heatmaps; this would require that the color distribution and distance functions be based on the entire dataset, rather than on individual subsets. Does anyone know how to do this? Thanks in advance, Jake
[R] Rprof fails in combination with RMySQL
Dear R community, I tried to optimize my R code by using Rprof. In my R code I'm using MySQL database connections intensively. After a bunch of queries R fails with the following error message: Error in .Call("RS_MySQL_newConnection", drvId, con.params, groups, PACKAGE = .MySQLPkgName) : RS-DBI driver: (could not connect [EMAIL PROTECTED] on dbname myDB Without the R profiler this code has run very stably for weeks. Do you have any ideas or suggestions? I tried the following R versions: (1) platform i386-pc-solaris2.8, version 1.9.1 (2004-06-21); (2) platform sparc-sun-solaris2.8, version 2.1.1 (2005-06-20); (3) platform sparc-sun-solaris2.8, version 1.9.1 (2004-06-21). Thank you in advance and kind regards, Lutz Thieme AMD Saxony/ Product Engineering AMD Saxony Limited Liability Company Co. KG phone: + 49-351-277-4269 M/S E22-PE, Wilschdorfer Landstr. 101 fax: + 49-351-277-9-4269 D-01109 Dresden, Germany
Re: [R] Problem with read.table()
I don't really understand it, but the problem seems to come down to the presence of apostrophes (single right quotes, ') in the text strings. The first of these occurs in line 149 (not counting the header line). If one tries to scan just that line, one gets a vector of length 10: fields 10 to 14 are read as a single field. Upon deleting the apostrophe, I got a vector of length 14 (OMMM!) The help on scan() talks about a quote argument and indicates that if sep is not the newline character, then quote defaults to the pair of characters ' and " (both single and double quotes). It remarks that you can include quotes inside strings by doubling them. I did a global substitution, changing ' to '' throughout, and the read.table() worked (i.e. didn't complain and yielded up a data frame of dimension 2935 x 14). But no apostrophes appeared in the fields in the resulting data frame. The help seems to indicate that you can get around the problem by specifying quote = some character which doesn't appear in the file. (This also saves having to do a global edit.) I tried quote="#" and it seemed to work in this instance. And the apostrophes ***did*** appear in the strings in the data frame. I don't grok why the complaint shows up at line 260 rather than immediately at line 149, but it's a start.
cheers, Rolf Turner [EMAIL PROTECTED]
Re: [R] Chemoinformatic people
I don't. There is an address and an email at Novartis in the ASA directory: ID 068970, Name Anthony J. Rossini, Company Novartis Pharma AG, Address Biostatistics WSJ-27.1.012, City/State/Zip CH-4002 Basel, Country Switzerland, Phone (206) 543-2005, Email [EMAIL PROTECTED] luke On Thu, 21 Jul 2005, A.J. Rossini wrote: Just with R, or via another tool integrating R, such as Pipeline Pilot? best, -tony On 7/20/05, Frédéric Ooms [EMAIL PROTECTED] wrote: Dear colleague, Just an e-mail to ask whether there are people working in the field of chemoinformatics who are using R in their work. If yes, I was wondering if we couldn't exchange tips and tricks about the use of R in this area? Best regards Fred Ooms -- Luke Tierney, Chair, Statistics and Actuarial Science, Ralph E. Wareham Professor of Mathematical Sciences, University of Iowa, Department of Statistics and Actuarial Science, 241 Schaeffer Hall, Iowa City, IA 52242. Phone: 319-335-3386, Fax: 319-335-3017, email: [EMAIL PROTECTED], WWW: http://www.stat.uiowa.edu
Re: [R] Chemoinformatic people
I know of a good number of companies who use R via Pipeline Pilot (and have looked into it a bit recently), but not R by itself. One of the big "I wish" items that I've got is seamless handling of large data. Some of the RDBMSs will do it, but not quite seamlessly. S-PLUS 7.0 does it for a limited class, but in a painful (very non-seamless) manner. This would be required to use R in this context, at least for what I've seen. best, -tony On 7/21/05, Frédéric Ooms [EMAIL PROTECTED] wrote: I am looking for both. Fred -Original Message- From: A.J. Rossini [mailto:[EMAIL PROTECTED] Sent: Thursday, July 21, 2005 3:36 PM To: Frédéric Ooms Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Chemoinformatic people Just with R, or via another tool integrating R, such as Pipeline Pilot? best, -tony On 7/20/05, Frédéric Ooms [EMAIL PROTECTED] wrote: Dear colleague, Just an e-mail to ask whether there are people working in the field of chemoinformatics who are using R in their work. If yes, I was wondering if we couldn't exchange tips and tricks about the use of R in this area? Best regards Fred Ooms -- best, -tony Commit early, commit often, and commit in a repository from which we can easily roll back your mistakes (AJR, 4Jan05). A.J. Rossini [EMAIL PROTECTED]
Re: [R] RandomForest question
[EMAIL PROTECTED] wrote: Hello, I'm trying to find out the optimal number of splits (mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when choosing mtry=80. How is it possible that more variables can be used than there are columns in the data frame? If some of the variables are factors, dummy variables are generated and you get a larger number of variables later in the process. Uwe Ligges thanks for your help + kind regards, Arne
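Uwe's point can be illustrated in base R with model.matrix() (toy data, not Arne's):

```r
# A factor with k levels expands to k-1 dummy columns (plus intercept),
# so the effective number of predictors can exceed ncol(data).
d <- data.frame(f1 = factor(rep(letters[1:4], 5)),   # 4 levels -> 3 dummies
                f2 = factor(rep(LETTERS[1:2], 10)),  # 2 levels -> 1 dummy
                x  = rnorm(20))                      # numeric -> 1 column
ncol(d)                      # 3 data-frame columns
ncol(model.matrix(~ ., d))   # 6 model-matrix columns (incl. intercept)
```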
Re: [R] heatmap color distribution
Thanks for the reply. As I understand it, breaks only controls the binning. The problem I'm having is that each subset heatmap has slightly different min and max log2 intensities. I'd like the colors to be based on the overall (complete set) max and min, not the subsets' max and min -- I could be wrong, but I don't think breaks will help me there. And you're right - this might obscure some of the trends/features, but we'll also plot the default heatmaps. Also (I should have specified) I'm using heatmap.2. Thanks, Jake On Jul 21, 2005, at 8:09 AM, Wiener, Matthew wrote: You can use the breaks argument in image to do this. (You don't specify a function you're using, but other heatmap functions probably have a similar parameter.) Look across all your data, figure out the ranges you want to have different colors, and specify the appropriate break points in each call to image. Then you're using the same color set in each one. You run the risk, of course, that some of your images will have a very narrow color range, which might obscure interesting features. But nothing stops you from making more than one plot. Hope this helps. Regards, Matt Wiener -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jacob Michaelson Sent: Thursday, July 21, 2005 9:26 AM To: r-help@stat.math.ethz.ch Subject: [R] heatmap color distribution Hi all, I've got a set of gene expression data, and I'm plotting several heatmaps for subsets of the whole set. I'd like the heatmaps to have the same color distribution, so that comparisons may be made (roughly) across heatmaps; this would require that the color distribution and distance functions be based on the entire dataset, rather than on individual subsets. Does anyone know how to do this? Thanks in advance, Jake
[R] debian vcd package
[Apologies if you have already read this message sent from another email address] Hi R-Help, I have been using R on Linux (Debian) for the past month. The usual way I install packages is through apt. Recently, a new package, vcd, became available on CRAN. I tried installing it today and found that Debian does not seem to provide a package for it. I also found that many other packages were unavailable. Does anyone have any recommended sites where a full list is available? If none exist, what would be the best way to move ahead in installing, say, the vcd package? I am still a novice in using Debian, so please forgive me if some of my questions seem trivial to experienced users. Peter Peter Ho, PhD. Escola Superior de Tecnologia e Gestao. Instituto Politecnico de Viana do Castelo. Avenida do Atlantico - Apartado 574. 4901-908 Viana do Castelo. Portugal. Tel: +351-258-819700 Ext. 1252 Email: [EMAIL PROTECTED]
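A hedged sketch of the usual workaround (the repository URL and paths below are illustrative): install from CRAN source inside R, keeping such packages in a private library so they don't collide with apt-managed ones.

```r
# Not run here: fetches and compiles the package from CRAN (on Debian
# you typically need the r-base-dev package for compilation).
# install.packages("vcd", repos = "http://cran.r-project.org")

# A private library directory keeps source-installed packages separate
# from those managed by apt ('lib' below is an illustrative path):
lib <- file.path(tempdir(), "Rlibs")
dir.create(lib, showWarnings = FALSE)
.libPaths(c(lib, .libPaths()))   # R now searches 'lib' first
.libPaths()
```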
Re: [R] heatmap color distribution
Breaks affects the binning into colors. Try this. Assume that temp is one of your data sets. Its values are restricted to 0.25 - 0.75, and we'll assume that the full data set goes from 0 to 1.
temp <- matrix(runif(60, 0.25, 0.75), nc = 6)
breaks <- seq(from = 0, to = 1, length = 11)
image(temp, col = heat.colors(10))                  # full range of color
image(temp, col = heat.colors(10), breaks = breaks) # muted colors
The second image is told about all the colors, and about the full range of data through breaks, and only uses the colors in the middle. Is that what you mean? HTH, Matt -Original Message- From: Jake Michaelson [mailto:[EMAIL PROTECTED] Sent: Thursday, July 21, 2005 10:45 AM To: Wiener, Matthew Cc: R-help@stat.math.ethz.ch Subject: Re: [R] heatmap color distribution Thanks for the reply. As I understand it, breaks only controls the binning. The problem I'm having is that each subset heatmap has slightly different min and max log2 intensities. I'd like the colors to be based on the overall (complete set) max and min, not the subsets' max and min -- I could be wrong, but I don't think breaks will help me there. And you're right - this might obscure some of the trends/features, but we'll also plot the default heatmaps. Also (I should have specified) I'm using heatmap.2. Thanks, Jake On Jul 21, 2005, at 8:09 AM, Wiener, Matthew wrote: You can use the breaks argument in image to do this. (You don't specify a function you're using, but other heatmap functions probably have a similar parameter.) Look across all your data, figure out the ranges you want to have different colors, and specify the appropriate break points in each call to image. Then you're using the same color set in each one. You run the risk, of course, that some of your images will have a very narrow color range, which might obscure interesting features. But nothing stops you from making more than one plot. Hope this helps.
Regards, Matt Wiener -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jacob Michaelson Sent: Thursday, July 21, 2005 9:26 AM To: r-help@stat.math.ethz.ch Subject: [R] heatmap color distribution Hi all, I've got a set of gene expression data, and I'm plotting several heatmaps for subsets of the whole set. I'd like the heatmaps to have the same color distribution, so that comparisons may be made (roughly) across heatmaps; this would require that the color distribution and distance functions be based on the entire dataset, rather than on individual subsets. Does anyone know how to do this? Thanks in advance, Jake
Re: [R] Rprof fails in combination with RMySQL
I think you're barking up the wrong tree. Optimize the MySQL code separately from optimizing the R code. A very nice reference about the former is http://highperformancemysql.com/. Also, if possible, do everything in MySQL. hth, b. -Original Message- From: Thieme, Lutz [mailto:[EMAIL PROTECTED] Sent: Thursday, July 21, 2005 10:11 AM To: Rhelp (E-mail) Subject: [R] Rprof fails in combination with RMySQL
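For reference, the basic Rprof workflow in base R (shown on a dummy workload, since Lutz's MySQL code isn't available) looks like this:

```r
# Profile a block of code: Rprof(file) starts sampling, Rprof(NULL)
# stops it, and summaryRprof() tabulates where the time went.
tmp <- tempfile()
Rprof(tmp)
for (i in 1:30) d <- svd(matrix(rnorm(400 * 400), 400))  # dummy workload
Rprof(NULL)
head(summaryRprof(tmp)$by.self)
```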
Re: [R] The steps of building library in R 2.1.1
An article like that would be really great. On 7/21/05, Uwe Ligges [EMAIL PROTECTED] wrote: Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). 2. there is too much material to absorb just to create a package. The manuals are insufficient. A step-by-step simplification is very much needed. Its no coincidence that there are a number of such descriptions on the net (google for 'making creating R package') since I would guess that just about everyone has significant problems in creating their first package on Windows. OK, if people really think this is required, I will sit down on a clean Windows XP machine, do the setup, and write it down for the next R Help Desk in R News -- something like Creating my first R package under Windows? If anybody else is willing to contribute and can write something up in a manner that is *not* confusing or misleading (none of the other material spread over the web satisfies this requirement, AFAICS), she/he is invited to contribute, of course. BTW, everybody else is invited to submit proposals for R Help Desk!!! Uwe Ligges On 7/21/05, Uwe Ligges [EMAIL PROTECTED] wrote: Ivy_Li wrote: Dear All, With the warm support of every R expert, I have built my R library successfully. Especially thanks: Duncan Murdoch Gabor Grothendieck Henrik Bengtsson Uwe Ligges You are welcome. The following is intended for the records in the archive in order to protect readers. Without your help, I will lower efficiency. I noticed that some other friends were puzzled by the method of building library. Now, I organize a document about it. Hoping it can help more friends. 1. 
Read the webpage http://www.stats.ox.ac.uk/pub/Rtools Do you mean http://www.murdoch-sutherland.com/Rtools/ ? 2. Download the rw2011.exe; Install the newest version of R 3. Download the tools.zip; Unpack it into c:\cygwin Not required to call it cygwin - also a bit misleading... 4. Download the ActivePerl-5.6.1.633-MSWin32-x86.msi; Install Active Perl in c:\Perl Why in C:\Perl ? 5. Download the MinGW-3.1.0-1.exe; Install the mingw32 port of gcc in c:\mingwin Why in c:\mingwin ? 6. Then go to Control Panel -> System -> Advanced -> Environment Variables -> Path -> Variable Value; add c:\cygwin;c:\mingwin\bin The PATH variable already contains a couple of paths, add the two given above in front of all others, separated by ;. Why do we add them at the beginning of the path? Because we want the folder that contains the tools to be at the beginning, so that you eliminate the possibility of finding a different program of the same name first in a folder that comes prior to the one where the tools are stored. OK, this (1-6) is all described in the R Administration and Installation manual, hence I do not see why we have to repeat it here. 7. I use the package.skeleton() function to make a draft package. It will automate some of the setup for a new source package. It creates directories, saves functions and data to appropriate places, and creates skeleton help files and 'README' files describing further steps in packaging. I type in R:

f <- function(x, y) x + y
g <- function(x, y) x - y
d <- data.frame(a = 1, b = 2)
e <- rnorm(1000)
package.skeleton(list = c("f", "g", "d", "e"), name = "example")

Then modify the 'DESCRIPTION': Package: example Version: 1.0-1 Date: 2005-07-09 Title: My first function Author: Ivy [EMAIL PROTECTED] Maintainer: Ivy [EMAIL PROTECTED] Description: simple sum and subtract License: GPL version 2 or later Depends: R (>= 1.9), stats, graphics, utils You can refer to the web page: http://cran.r-project.org/src/contrib/Descriptions/ It is a large source of examples. 
And you can read the part on 'Creating R packages' in 'Writing R Extensions'. It introduces some useful things for your reference. This is described in 'Writing R Extensions' and is not related to the setup of your system in 1-6. 8. Download hhc.exe, the Microsoft HTML Help compiler, from somewhere, and save it somewhere in your path. I downloaded an 'htmlhelp.exe', ran the setup, and saved hhc.exe into 'C:\cygwin\bin' because this path has already been written into my PATH variable value. However, if you decide not to use the Help Compiler (hhc), then you need to modify the MkRules file in RHOME/src/gnuwin32 to tell it not to try to build that kind of help file.
[R] principal component analysis in affy
Hi, I have been using the prcomp function to perform PCA on my example microarray data (stored in metric text files), which looks like this:

     1a      1b      1c  1d  1e  1f  ...  4r  4s  4t
g1   1.2705  1.2766  ...                      2.0298
g2   0.1631  0.7067
g3   0.2212  1.0439
...
g99  1.3657  ...                              2.3736

i.e. a matrix of 63 columns and 99 rows, where the columns represent chips and the rows represent genes. Now, the biplot function biplot(prcomp(pcadata, scale = TRUE), cex = c(0.75, 0.75)) gives me a plot with one vector per gene. However, I actually need to get one vector per chip instead of one vector per gene. I have been told that there is a function in the affy package that does what I am looking for, i.e. gives one vector per chip. Can someone please tell me what the function is called, and how I can get hold of the code (since I believe affy only works on CEL files)? I have downloaded the affy R code from Terry Speed's website already, but I don't know where (if at all) the code to perform PCA is. Thank you everyone! Sincerely, Mugdha Wagle Hartwell Center for Bioinformatics and Biotechnology, St. Jude Children's Research Hospital, Memphis TN 38105 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
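Independently of affy, base R can already give one point per chip: prcomp() treats rows as observations, so transposing the genes-by-chips matrix makes the chips the observations. A sketch with simulated data standing in for the poster's file:

```r
set.seed(1)
## Stand-in for the 99-genes x 63-chips matrix described above
pcadata <- matrix(rnorm(99 * 63), nrow = 99, ncol = 63,
                  dimnames = list(paste("g", 1:99, sep = ""),
                                  paste("chip", 1:63, sep = "")))
## Transpose so that each chip (column) becomes one observation (row)
pc <- prcomp(t(pcadata), scale. = TRUE)
plot(pc$x[, 1], pc$x[, 2], xlab = "PC1", ylab = "PC2")  # one point per chip
```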
Re: [R] RandomForest question
Hi, I found the following lines from Leo's randomForest documentation, and I am not sure if it can be applied here, but just tried to help: mtry0 = the number of variables to split on at each node. Default is the square root of mdim. ATTENTION! DO NOT USE THE DEFAULT VALUES OF MTRY0 IF YOU WANT TO OPTIMIZE THE PERFORMANCE OF RANDOM FORESTS. TRY DIFFERENT VALUES - GROW 20-30 TREES, AND SELECT THE VALUE OF MTRY THAT GIVES THE SMALLEST OOB ERROR RATE. mdim is the number of predictors. HTH, weiwei On 7/21/05, Liaw, Andy [EMAIL PROTECTED] wrote: From: [EMAIL PROTECTED] Hello, I'm trying to find out the optimal number of splits (mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when choosing mtry=80. How is it possible that more variables can be used than there are columns in the data frame? It's not. The code for randomForest.default() has: ## Make sure mtry is in reasonable range. mtry <- max(1, min(p, round(mtry))) so it silently sets mtry to the number of predictors if it's too large. As an example: library(randomForest) randomForest 4.5-12 Type rfNews() to see new features/changes/bug fixes. iris.rf = randomForest(Species ~ ., iris, mtry=10) iris.rf$mtry [1] 4 I should probably add a warning in such cases... Andy thanks for your help + kind regards, Arne [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Weiwei Shi, Ph.D Did you always know? No, I did not. But I believed... 
---Matrix III __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
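Breiman's advice quoted above can be followed directly with the R package: grow a modest number of trees for each candidate mtry and compare OOB error rates. A sketch on the bundled iris data (the poster's 575-case data frame would take its place):

```r
library(randomForest)
set.seed(42)
## OOB error for each candidate mtry, using small forests as Breiman suggests
oob.err <- sapply(1:4, function(m) {
  rf <- randomForest(Species ~ ., data = iris, mtry = m, ntree = 30)
  rf$err.rate[rf$ntree, "OOB"]   # OOB error after the last tree
})
oob.err
which.min(oob.err)               # candidate mtry with the smallest OOB error
```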
Re: [R] The steps of building library in R 2.1.1
On 7/21/2005 9:43 AM, Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). I agree with some of this, but I don't see much interest in fixing it. For example, getting rid of Perl would be a lot of work. When the Perl scripts were written, R was not capable of doing what they do. I think it is capable now, but there's still a huge amount of translation work to do. Who will do that? Who will test that they did it right? At the end, will it actually have been worth all the trouble? Installing Perl is not all that hard. 2. there is too much material to absorb just to create a package. The manuals are insufficient. The first sentence here is basically a repetition of the process is too complex. I think the second sentence is incorrect. Could you please point out what necessary steps are missing? A step-by-step simplification is very much needed. Exactly this has been in the Installation and Administration manual since I put it there in February for the 2.1.0 release. It's at the beginning of the appendix on the Windows toolset, with multiple references pointing people there. It's followed by detailed descriptions of each of the steps. If you think it could be further improved, please submit improvements. Its no coincidence that there are a number of such descriptions on the net (google for 'making creating R package') since I would guess that just about everyone has significant problems in creating their first package on Windows. As far as I can tell, those all predate the release of 2.1.0. I think your complaints are out of date. 
Duncan Murdoch On 7/21/05, Uwe Ligges [EMAIL PROTECTED] wrote: Ivy_Li wrote: Dear All, With the warm support of every R expert, I have built my R library successfully. Especially thanks: Duncan Murdoch Gabor Grothendieck Henrik Bengtsson Uwe Ligges You are welcome. The following is intended for the records in the archive in order to protect readers. Without your help, I will lower efficiency. I noticed that some other friends were puzzled by the method of building library. Now, I organize a document about it. Hoping it can help more friends. 1. Read the webpage http://www.stats.ox.ac.uk/pub/Rtools Do you mean http://www.murdoch-sutherland.com/Rtools/ ? 2. Download the rw2011.exe; Install the newest version of R 3. Download the tools.zip; Unpack it into c:\cygwin Not required to call it cygwin - also a bit misleading... 4. Download the ActivePerl-5.6.1.633-MSWin32-x86.msi; Install Active Perl in c:\Perl Why in C:\Perl ? 5. Download the MinGW-3.1.0-1.exe; Install the mingw32 port of gcc in c:\mingwin Why in c:\mingwin ? 6. Then go to Control Panel - System - Advanced - Environment Variables - Path - Variable Balue; add c:\cygwin;c:\mingwin\bin The PATH variable already contains a couple of paths, add the two given above in front of all others, separated by ;. Why we add them in the beginning of the path? Because we want the folder that contains the tools to be at the beginning so that you eliminate the possibility of finding a different program of the same name first in a folder that comes prior to the one where the tools are stored. OK, this (1-6) is all described in the R Administration and Installation manual, hence I do not see why we have to repeat it here. 7. I use the package.skeleton() function to make a draft package. It will automate some of the setup for a new source package. It creates directories, saves functions anddata to appropriate places, and creates skeleton help files and 'README' files describing further steps in packaging. 
I type in R: f <- function(x, y) x + y g <- function(x, y) x - y d <- data.frame(a = 1, b = 2) e <- rnorm(1000) package.skeleton(list = c("f", "g", "d", "e"), name = "example") Then modify the 'DESCRIPTION': Package: example Version: 1.0-1 Date: 2005-07-09 Title: My first function Author: Ivy [EMAIL PROTECTED] Maintainer: Ivy [EMAIL PROTECTED] Description: simple sum and subtract License: GPL version 2 or later Depends: R (>= 1.9), stats, graphics, utils You can refer to the web page: http://cran.r-project.org/src/contrib/Descriptions/ It is a large source of examples. And you can read the part on 'Creating R packages' in 'Writing R Extensions'. It introduces some useful things for your reference. This is described in
Re: [R] The steps of building library in R 2.1.1
On 7/21/2005 10:29 AM, Uwe Ligges wrote: Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). 2. there is too much material to absorb just to create a package. The manuals are insufficient. A step-by-step simplification is very much needed. Its no coincidence that there are a number of such descriptions on the net (google for 'making creating R package') since I would guess that just about everyone has significant problems in creating their first package on Windows. OK, if people really think this is required, I will sit down on a clean Windows XP machine, do the setup, and write it down for the next R Help Desk in R News -- something like Creating my first R package under Windows? If anybody else is willing to contribute and can write something up in a manner that is *not* confusing or misleading (none of the other material spread over the web satisfies this requirement, AFAICS), she/he is invited to contribute, of course. BTW, everybody else is invited to submit proposals for R Help Desk!!! That sounds great. Could you also take notes as you go about specific problems in the writeup in the R-admin manual, so it can be improved for the next release? Another thing you could do which would be valuable: get a student or someone else who is reasonably computer literate, but unfamiliar with R details, to do this while you sit watching and recording their mistakes. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] heatmap color distribution
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Jacob Michaelson Sent: 21 July 2005 12:26 To: r-help@stat.math.ethz.ch Subject: [R] heatmap color distribution Hi all, I've got a set of gene expression data, and I'm plotting several heatmaps for subsets of the whole set. I'd like the heatmaps to have the same color distribution, so that comparisons may be made (roughly) across heatmaps; this would require that the color distribution and distance functions be based on the entire dataset, rather than on individual subsets. Does anyone know how to do this? Thanks in advance, For each heatmap, in image() set the zlim argument to c(zmin,zmax) where zmin and zmax are the minimum and maximum observed across the entire data set. Also, for each heatmap set col=heat.colors(n) to the same n for all heatmaps. I do that with image.kriging in geoR. Hope it works for you. Ruben __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
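In code, Ruben's suggestion amounts to computing the z-range once over the full matrix and passing it, together with a fixed palette, to every image() call. A minimal sketch with made-up data in place of the expression matrix:

```r
full <- matrix(rnorm(400), 20, 20)       # stand-in for the whole dataset
zr   <- range(full)                      # shared color limits for all heatmaps
pal  <- heat.colors(64)                  # same palette length for all heatmaps
sub1 <- full[1:10, ]
sub2 <- full[11:20, ]
image(t(sub1), zlim = zr, col = pal)     # both subsets now use the
image(t(sub2), zlim = zr, col = pal)     # same value-to-color mapping
```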
[R] output of variance estimate of random effect from a gamma frailty model using Coxph in R
Hi, I have a question about the output for the variance of the random effect from a gamma frailty model using coxph in R. Is it the variance of the frailties themselves or the variance of the log frailties? Thanks. Guanghui __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
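For concreteness, a minimal gamma-frailty fit with the survival package looks like the sketch below; the "Variance of random effect" line in the printed output is the quantity the question is about (the example uses the bundled lung data, not the poster's):

```r
library(survival)
## Cox model with a frailty term per institution (gamma is the default distribution)
fit <- coxph(Surv(time, status) ~ age + frailty(inst), data = lung)
print(fit)   # output includes a "Variance of random effect" line
```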
[R] Concatenate 2 functions
hi all I need to concatenate 2 functions into one like temp <- 1:1000 for(i=0; i<1000; i++) { func <- func function(beta) dweibull(temp[i],beta,eta) } Any idea on this? thks guillaume. // Webmail Oreka : http://www.oreka.com __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] R:plot and dots
On Thu, 2005-07-21 at 16:18 +0200, Clark Allan wrote: hi all a very simple question. i have plot(x,y) but i would like to add in on the plot the observation number associated with each point. how can this be done? / allan If you mean the unique observation number associated with each x,y pair, you can use: text(x, y, labels = ObsNumberVector, pos = 3) after the plot(x, y) call: df <- data.frame(x = rnorm(10), y = rnorm(10), ID = 1:10) with(df, plot(x, y)) with(df, text(x, y, labels = ID, pos = 3)) See ?text for more information. Note that I used pos = 3 which places the label above the data point. There are other positioning parameters available, which are noted in the help file. Note also that you might have to adjust the plot axis limits depending upon where you place the text and your extreme points. If you mean the frequency of each x,y pair (if there is more than one observation per x,y pair), you might want to review Deepayan's recent post here: https://stat.ethz.ch/pipermail/r-help/2005-July/074042.html HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] reorder bug in heatmap.2?
I want to plot a heatmap without reordering the columns. This works fine in heatmap: heatmap(meanX[selected,], col=cm.colors(256), Colv=NA) But in heatmap.2 I get: heatmap.2(meanX[selected,], col=cm.colors(256), Colv=NA) Error in if (!is.logical(Colv) || Colv) ddc <- reorder(ddc, Colv) : missing value where TRUE/FALSE needed (Note that instructions for the use of Colv and Rowv are identical in both heatmap and heatmap.2 documentation) Is there another way to not reorder columns in heatmap.2? Thanks in advance, Jake __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
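One workaround worth trying (an assumption, untested against the poster's gplots version): the error comes from NA failing the is.logical(Colv) test, so pass the logical FALSE instead and suppress the column dendrogram explicitly:

```r
library(gplots)
m <- matrix(rnorm(100), 10, 10)       # stand-in for meanX[selected, ]
heatmap.2(m, col = cm.colors(256),
          Colv = FALSE,               # logical, so is.logical(Colv) is TRUE
          dendrogram = "row")         # keep only the row dendrogram
```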
Re: [R] debian vcd package
On Thu, Jul 21, 2005 at 03:55:29PM +0100, Peter Ho wrote: [Apologies if you have already read this message sent from another email address] Hi R-Help, I have been using R in Linux (Debian) for the past month. The usual way I install packages is through apt. Recently, a new packages vcd became available on CRAN. I tried installing it today and found that Debian does not seem to support this package. I also found that many other packages were unavailable. Does anyone have any recommended sites where a full list is available? If none exist, what would be the best way to move ahead in installing say the vcd package. I am still a novice in using Debian and so please forgive me if some of my questions may seem trivial for experienced users. Unfortunately, the term package means different things in the context of R and of Debian. A Debian package is what you install using tools like apt etc. The traditional way of installing an R package on Linux is to * have R installed from source, or install the r-base-dev Debian package * download the package archive (e.g. http://www.stats.bris.ac.uk/R/src/contrib/vcd_0.9-0.tar.gz ) * run the R CMD INSTALL command on it, e.g. R CMD INSTALL vcd_0.9-0.tar.gz This requires having a number of development Debian packages installed, such as gcc, g77 etc (installing r-base-dev will automatically resolve such dependencies). Best regards, Jan -- +- Jan T. Kim ---+ |*NEW*email: [EMAIL PROTECTED] | |*NEW*WWW: http://www.cmp.uea.ac.uk/people/jtk | *-= hierarchical systems are for files, not for humans =-* __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
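As an alternative to running R CMD INSTALL by hand, the same download-build-install cycle can be driven from inside an R session (this still requires r-base-dev and the compilers mentioned above):

```r
## Fetch the vcd source package from CRAN and build/install it locally
install.packages("vcd", repos = "http://cran.r-project.org")
```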
Re: [R] The steps of building library in R 2.1.1
Duncan Murdoch wrote: On 7/21/2005 10:29 AM, Uwe Ligges wrote: Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). 2. there is too much material to absorb just to create a package. The manuals are insufficient. A step-by-step simplification is very much needed. Its no coincidence that there are a number of such descriptions on the net (google for 'making creating R package') since I would guess that just about everyone has significant problems in creating their first package on Windows. OK, if people really think this is required, I will sit down on a clean Windows XP machine, do the setup, and write it down for the next R Help Desk in R News -- something like Creating my first R package under Windows? If anybody else is willing to contribute and can write something up in a manner that is *not* confusing or misleading (none of the other material spread over the web satisfies this requirement, AFAICS), she/he is invited to contribute, of course. BTW, everybody else is invited to submit proposals for R Help Desk!!! That sounds great. Could you also take notes as you go about specific problems in the writeup in the R-admin manual, so it can be improved for the next release? Of course. Another thing you could do which would be valuable: get a student or someone else who is reasonably computer literate, but unfamiliar with R details, to do this while you sit watching and recording their mistakes. Good idea. Uwe Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Concatenate 2 functions
[EMAIL PROTECTED] wrote: hi all I need to concatenate 2 functions into one like temp <- 1:1000 for(i=0; i<1000; i++) { func <- func function(beta) dweibull(temp[i],beta,eta) } Please read An Introduction to R. Please read the posting guide. What do you expect to be in func? This is completely unclear to me. Uwe Ligges Any idea on this? thks guillaume. // Webmail Oreka : http://www.oreka.com __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
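One guess at the intent (an assumption, since the original code is ambiguous): build a single function of beta that combines the Weibull density over all values in temp, e.g. a log-likelihood, rather than concatenating 1000 closures. dweibull is vectorised over its first argument, so no loop is needed:

```r
temp <- 1:1000
eta  <- 2    # assumed value for the scale parameter
## One function of beta that sums the Weibull log-densities over all of temp
func <- function(beta) sum(dweibull(temp, shape = beta, scale = eta, log = TRUE))
func(1.5)    # evaluate at a trial value of beta
```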
Re: [R] R:plot and dots
Clark Allan wrote: hi all a very simple question. i have plot(x,y) but i would like to add in on the plot the observation number associated with each point. how can this be done? See ?text Uwe Ligges / allan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] About object of class mle returned by user defined functions
Hi, There is something I don't get with objects of class mle returned by a function I wrote. More precisely, it's about the behaviour of the methods confint and profile applied to these objects. I've written a short function (see below) whose arguments are: 1) A univariate sample (arising from a gamma, log-normal or whatever). 2) A character string standing for one of the R densities, e.g., "gamma", "lnorm", etc. That's the density the user wants to fit to the data. 3) A named list with initial values for the density parameters; that will be passed to optim via mle. 4) The method to be used by optim via mle. That can be changed by the code if parameter boundaries are also supplied. 5) The lowest allowed values for the parameters. 6) The largest allowed values. The big thing this short function does is writing on the fly the corresponding log-likelihood function before calling mle. The object of class mle returned by the call to mle is itself returned by the function. Here is the code:

newFit <- function(isi,                ## The data set
    isi.density = "gamma",             ## The name of the density used as model
    initial.para = list(shape = (mean(isi)/sd(isi))^2,
                        scale = sd(isi)^2 / mean(isi)),  ## Initial parameters passed to optim
    optim.method = "BFGS",             ## optim method
    optim.lower = numeric(length(initial.para)) + 0.1,
    optim.upper = numeric(length(initial.para)) + Inf,
    ...) {
  require(stats4)
  ## Create a string with the log likelihood definition
  minusLogLikelihood.txt <- paste("function(",
      paste(names(initial.para), collapse = ", "),
      ") {",
      "isi <- eval(", deparse(substitute(isi)), ", envir = .GlobalEnv);",
      "-sum(", paste("d", isi.density, sep = ""), "(isi, ",
      paste(names(initial.para), collapse = ", "),
      ", log = TRUE) ) }")
  ## Define logLikelihood function
  minusLogLikelihood <- eval(parse(text = minusLogLikelihood.txt))
  environment(minusLogLikelihood) <- .GlobalEnv
  if (all(is.infinite(c(optim.lower, optim.upper)))) {
    getFit <- mle(minusLogLikelihood,
                  start = initial.para,
                  method = optim.method,
                  ...)
  } else {
    getFit <- mle(minusLogLikelihood,
                  start = initial.para,
                  method = "L-BFGS-B",
                  lower = optim.lower,
                  upper = optim.upper,
                  ...)
  } ## End of conditional on all(is.infinite(c(optim.lower, optim.upper)))
  getFit
}

It seems to work fine on examples like:

isi1 <- rgamma(100, shape = 2, scale = 1)
fit1 <- newFit(isi1) ## fitting here with the correct density (initial parameters are obtained by the method of moments)
coef(fit1)
    shape     scale
1.8210477 0.9514774
vcov(fit1)
           shape      scale
shape 0.05650600 0.02952371
scale 0.02952371 0.02039714
logLik(fit1)
'log Lik.' -155.9232 (df=2)

If we compare with a direct call to mle:

llgamma <- function(sh, sc) -sum(dgamma(isi1, shape = sh, scale = sc, log = TRUE))
fitA <- mle(llgamma,
            start = list(sh = (mean(isi1)/sd(isi1))^2,
                         sc = sd(isi1)^2 / mean(isi1)),
            lower = c(0.0001, 0.0001),
            method = "L-BFGS-B")
coef(fitA)
      sh       sc
1.821042 1.051001
vcov(fitA)
           sh          sc
sh  0.05650526 -0.03261146
sc -0.03261146  0.02488714
logLik(fitA)
'log Lik.' -155.9232 (df=2)

I get almost the same estimated parameter values and the same log-likelihood, but not the same vcov matrix. A call to profile or confint on fit1 does not work, e.g.:

confint(fit1)
Profiling...
Error in approx(sp$y, sp$x, xout = cutoff) :
  need at least two non-NA values to interpolate
In addition: Warning message:
collapsing to unique 'x' values in: approx(sp$y, sp$x, xout = cutoff)

Although calling the log-likelihood function defined in fit1 (fit1@minuslogl) with argument values different from the MLE does return something sensible:

fit1@minuslogl(coef(fit1)[1], coef(fit1)[2])
[1] 155.9232
fit1@minuslogl(coef(fit1)[1] + 0.01, coef(fit1)[2] + 0.01)
[1] 155.9263

There is obviously something I'm missing here, since I thought for a while that the problem was with the environment attached to the function minusLogLikelihood when calling eval; but the lines above make me think that is not the case... Any help and/or ideas warmly welcomed. Thanks, Christophe. 
-- A Master Carpenter has many tools and is expert with most of them. If you only know how to use a hammer, every problem starts to look like a nail. Stay away from that trap. Richard B Johnson. -- Christophe Pouzat
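For what it's worth, a quick cross-check of the gamma fit in newFit is MASS::fitdistr, which wraps the optimisation and returns standard errors directly. A sketch (note fitdistr parameterises the gamma as shape/rate, so rate = 1/scale):

```r
library(MASS)
set.seed(1)
isi1 <- rgamma(100, shape = 2, scale = 1)  # simulated data, as in the post
fd <- fitdistr(isi1, "gamma")              # ML fit of a gamma density
fd$estimate                                # shape and rate (rate = 1/scale)
fd$sd                                      # standard errors, for comparison with vcov(fit1)
logLik(fd)                                 # for comparison with logLik(fit1)
```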
[R] R:plot and dots
hi all a very simple question. i have plot(x,y) but i would like to add in on the plot the observation number associated with each point. how can this be done? / allan__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] The steps of building library in R 2.1.1
On 7/21/05, Duncan Murdoch [EMAIL PROTECTED] wrote: On 7/21/2005 9:43 AM, Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). I agree with some of this, but I don't see much interest in fixing it. For example, getting rid of Perl would be a lot of work. When the Perl scripts were written, R was not capable of doing what they do. I think it is capable now, but there's still a huge amount of translation work to do. Who will do that? Who will test that they did it right? At the end, will it actually have been worth all the trouble? Installing Perl is not all that hard. Each step may not be hard, but the totality of them all means it's pretty complex for most people. I don't know who will do it, or whether anyone even will, but a first step is identifying that it needs to be done. Since the key to expanding R is to expand its library, to me, making it simple to create and install packages ought to be a very high priority for the core group, regardless of the difficulty in achieving this. If no one is interested in doing it, then it will remain a limitation of R that commercial or other free systems can use to gain advantage over R. 2. there is too much material to absorb just to create a package. The manuals are insufficient. The first sentence here is basically a repetition of "the process is too complex". I think the second sentence is incorrect. Could you please point out what necessary steps are missing? It's not that anything is missing that I am aware of. It's that there is so much detail one is overwhelmed. 
It's not completely the fault of the description since, as point #1 mentions, the process itself is a key part of the problem. A step-by-step simplification is very much needed. Exactly this has been in the Installation and Administration manual since I put it there in February for the 2.1.0 release. It's at the beginning of the appendix on the Windows toolset, with multiple references pointing people there. It's followed by detailed descriptions of each of the steps. If you think it could be further improved, please submit improvements. That is easy to say but, in fact, if anyone does this they are not met with a receptive atmosphere. The excellent post describing the process that started this out (even if there are some small errors) is just one example. It's no coincidence that there are a number of such descriptions on the net (google for 'making creating R package') since I would guess that just about everyone has significant problems in creating their first package on Windows. As far as I can tell, those all predate the release of 2.1.0. I think your complaints are out of date. I am sure the situation is getting better, but I did look at the manuals again before posting, and do think that a step-by-step article such as that in Ivy Li's post, the various documents on the net findable by Google as I mentioned, and the proposed article by Uwe are really needed in addition to the manuals. The manuals can then be used to get additional detail. Duncan Murdoch On 7/21/05, Uwe Ligges [EMAIL PROTECTED] wrote: Ivy_Li wrote: Dear All, With the warm support of every R expert, I have built my R library successfully. Especially thanks: Duncan Murdoch Gabor Grothendieck Henrik Bengtsson Uwe Ligges You are welcome. The following is intended for the records in the archive in order to protect readers. Without your help, I would have been much less efficient. I noticed that some other friends were puzzled by the method of building a library. Now, I have organized a document about it. 
Hoping it can help more friends. 1. Read the webpage http://www.stats.ox.ac.uk/pub/Rtools Do you mean http://www.murdoch-sutherland.com/Rtools/ ? 2. Download the rw2011.exe; Install the newest version of R 3. Download the tools.zip; Unpack it into c:\cygwin Not required to call it cygwin - also a bit misleading... 4. Download the ActivePerl-5.6.1.633-MSWin32-x86.msi; Install Active Perl in c:\Perl Why in C:\Perl ? 5. Download the MinGW-3.1.0-1.exe; Install the mingw32 port of gcc in c:\mingwin Why in c:\mingwin ? 6. Then go to Control Panel - System - Advanced - Environment Variables - Path - Variable Value; add c:\cygwin;c:\mingwin\bin The PATH variable already contains a couple of paths; add the two given above in front of all others, separated by ;. Why do we add them at the beginning of the path?
[R] opening RDB files
Hi all, I've recently upgraded to R version 2.1.1, and when trying to inspect the contents of many packages in the library (for instance library\MASS\R) I've realized that WordPad or Notepad won't open them, since they have *.RDB and *.RDX extensions which these editors cannot recognize. However, libraries in previous versions of R did not have these extensions, and I could inspect the contents of each package without any trouble. I've been searching for this thread but did not find it. Thank you! Emili __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] opening RDB files
Emili Tortosa-Ausina wrote: Hi all, I've recently upgraded to R version 2.1.1, and when trying to inspect the contents of many packages in the library (for instance library\MASS\R) I've realized that WordPad or Notepad won't open them, since they have *.RDB and *.RDX extensions which these editors cannot recognize. However, libraries in previous versions of R did not have these extensions, and I could inspect the contents of each package without any trouble. I've been searching for this thread but did not find it. Well, these are the lazy loading databases which were introduced in R 1.9.0, AFAIR. There is a corresponding article in R News. Just download the source package in order to look at the code. Uwe Ligges Thank you! Emili __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] normal reference intervals
I am interested in calculating age-specific normal reference intervals, using non-parametric methods - or ideally something called the LMS method (which, as I understand it, uses cubic splines fitted to the data). Are there any packages in R that you think might help me? Any other advice gratefully received. Many thanks. Best wishes, David. - Dr. David Crabb School of Biomedical and Natural Sciences, Nottingham Trent University, Clifton Campus, Nottingham. NG11 8NS Tel: 0115 848 3275 Fax: 0115 848 6690 This email is intended solely for the addressee. It may contain private and confidential information. If you are not the intended addressee, please take no action based on it nor show a copy to anyone. In this case, please reply to this email to highlight the error. Opinions and information in this email that do not relate to the official business of Nottingham Trent University shall be understood as neither given nor endorsed by the University. Nottingham Trent University has taken steps to ensure that this email and any attachments are virus-free, but we do advise that the recipient should check that the email and its attachments are actually virus free. This is in keeping with good computing practice. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
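Pending a dedicated LMS implementation, a crude non-parametric starting point is the empirical 2.5th and 97.5th percentiles within age bands. A minimal sketch on simulated data (all variable names and the simulated relationship are illustrative, not from any package):

```r
set.seed(1)
age   <- runif(500, 20, 80)                      # simulated ages
value <- rnorm(500, mean = 5 + 0.05 * age)       # analyte rising with age
band  <- cut(age, breaks = seq(20, 80, by = 10)) # 10-year age bands
## empirical 95% reference limits per band
ref <- t(sapply(split(value, band), quantile, probs = c(0.025, 0.975)))
ref
```

With enough subjects per band this gives rough age-specific limits; the LMS method smooths the age dependence instead of discretising it.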
Re: [R] opening RDB files
Emili Tortosa-Ausina wrote: Hi all, I've recently upgraded to R version 2.1.1, and when trying to inspect the contents of many packages in the library (for instance library\MASS\R) I've realized that WordPad or Notepad won't open them, since they have *.RDB and *.RDX extensions which these editors cannot recognize. However, libraries in previous versions of R did not have these extensions, and I could inspect the contents of each package without any trouble. The *.rdb etc. are the new compact package file formats introduced around R v2.0.0; these are binary files that won't make much sense to look at. It was only in versions before this that you could inspect the R code by looking at the file pkg/R/pkg.R in a text file viewer/editor. To look at the code now, you have to either download the source of the package you're interested in (look for the *.tar.gz files), or you can always do it from within R, e.g. print(read.table). If the function you want to look at gives UseMethod and so on, you're looking at a generic function, e.g. print(print): function (x, ...) UseMethod("print") environment: namespace:base then you want to track down the method for your specific object. To find all implementations of print, use methods(), e.g. methods(print): [1] print.acf* print.anova [3] print.aov* print.aovlist* [5] print.ar* print.Arima* [7] print.arima0* print.AsIs [9] print.Bibtex* print.by snip/snip [123] print.vignette* print.xgettext* [125] print.xngettext* print.xtabs* Then do, say, print(print.by) and you'll see the code. All methods with an asterisk are namespace-protected methods. To get these you have to use getAnywhere(), e.g. print(getAnywhere(print.acf)). Why the new file format? It is used for packages that utilize lazy loading, which more and more packages now use (packages without lazy loading can still be inspected the old way). Thanks to lazy loading, packages now load more or less instantaneously. 
They are also more memory efficient, because not all code is loaded at once. Cheers Henrik Bengtsson I've been searching for this thread but did not find it. Thank you! Emili __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
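Henrik's recipe condenses to three calls; a quick sketch of the same steps, using print methods as in his example:

```r
print(read.table)            # ordinary exported function: just print it
methods(print)               # list all print methods; '*' = namespace-protected
getAnywhere("print.acf")     # retrieve a namespace-protected method's code
```

The same pattern works for any generic: methods() to find the candidates, then print() or getAnywhere() to see the source.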
Re: [R] colnames
Hi Adai, Your diagnosis is absolutely right; class(r1) returned data.frame and your suggested solution worked perfectly. Your assumption is also right; both x and y are positive. If I want to compare the performance of my old function with yours, are there some functions in R I could use to get the elapsed time etc.? Many thanks indeed. Regards, Gilbert -Original Message- From: Adaikalavan Ramasamy [mailto:[EMAIL PROTECTED] Sent: 19 July 2005 23:38 To: Gilbert Wu Cc: r-help@stat.math.ethz.ch Subject: RE: [R] colnames What does class(r1) give you? If it is data.frame, then try exp( diff( log( as.matrix( df ) ) ) ) BTW, I made the assumption that both x and y are positive values only. Regards, Adai On Tue, 2005-07-19 at 16:30 +0100, Gilbert Wu wrote: Hi Adai, When I tried the optimized routine, I got the following error message: r1 899188 902232 901714 28176U 15322M 20050713 7.595 10.97 17.96999 5.1925 11.44 20050714 7.605 10.94 18.00999 5.2500 11.50 20050715 7.480 10.99 17.64999 5.2500 11.33 20050718 7.415 11.05 17.64000 5.2250 11.27 exp(diff(log(r1))) - 1 Error in r[i1] - r[-length(r):-(length(r) - lag + 1)] : non-numeric argument to binary operator Any idea? Many thanks. Gilbert -Original Message- From: Adaikalavan Ramasamy [mailto:[EMAIL PROTECTED] Sent: 19 July 2005 12:20 To: Gilbert Wu Cc: r-help@stat.math.ethz.ch Subject: RE: [R] colnames First, your problem could be boiled down to the following example. See how the colnames of the two outputs vary. df <- cbind.data.frame( "100" = 1:2, "200" = 3:4 ) df/df X100 X200 1 1 1 2 1 1 m <- as.matrix( df ) # coerce to matrix class m/m 100 200 1 1 1 2 1 1 It appears that whenever R has to create a new dataframe automatically, it tries to get nice colnames. See help(data.frame). I am not exactly sure why this behaviour is different when creating a matrix. But I do not think this is a major problem for most people. If you coerce your input to matrix, the problem goes away. 
Next, note the following points: a) mat[ 1:3, 1:ncol(mat) ] is equivalent to simply mat[ 1:3, ]. b) mat[ 2:nrow(mat), ] is equivalent to simply mat[ -1, ]. See help(subset) for more information. Using the points above, we can simplify your function as p.RIs2Returns <- function (mat){ mat <- as.matrix(mat) x <- mat[ -nrow(mat), ] y <- mat[ -1, ] return( y/x - 1 ) } If your data contains only numerical data, it is probably a good idea to work with matrices, as matrix operations are faster. Finally, we can shorten your function. You can use the diff function (which works column-wise if the input is a matrix) if you know that y/x = exp(log(y/x)) = exp( log(y) - log(x) ), which could be coded in R as exp( diff( log(r1) ) ), and then subtract 1 from the above to get your returns. Regards, Adai On Tue, 2005-07-19 at 09:17 +0100, Gilbert Wu wrote: Hi Adai, Many thanks for the examples. I work for a financial institution. We are exploring R as a tool to implement our portfolio optimization strategies. Hence, R is still a new language to us. The script I wrote tried to make a returns matrix from the daily return indices extracted from a SQL database. Please find below the output that produces the 'X' prefix in the colnames. The reason to preserve the column names is that they are stock identifiers which are to be used by other sub-systems rather than R. I would welcome any suggestion to improve the script. 
Regards, Gilbert p.RIs2Returns <- + function (RIm) + { + x <- RIm[1:(nrow(RIm)-1), 1:ncol(RIm)] + y <- RIm[2:nrow(RIm), 1:ncol(RIm)] + RReturns <- (y/x - 1) + RReturns + } channel <- odbcConnect("ourSQLDB") result <- sqlQuery(channel, paste("select * from equityRIs;")) odbcClose(channel) result stockid sdate dbPrice 1 899188 20050713 7.59500 2 899188 20050714 7.60500 3 899188 20050715 7.48000 4 899188 20050718 7.41500 5 902232 20050713 10.97000 6 902232 20050714 10.94000 7 902232 20050715 10.99000 8 902232 20050718 11.05000 9 901714 20050713 17.96999 10 901714 20050714 18.00999 11 901714 20050715 17.64999 12 901714 20050718 17.64000 13 28176U 20050713 5.19250 14 28176U 20050714 5.25000 15 28176U 20050715 5.25000 16 28176U 20050718 5.22500 17 15322M 20050713 11.44000 18 15322M 20050714 11.5 19 15322M 20050715 11.33000 20 15322M 20050718 11.27000 r1 <- reshape(result, timevar="stockid", idvar="sdate", direction="wide") r1 sdate dbPrice.899188 dbPrice.902232 dbPrice.901714 dbPrice.28176U dbPrice.15322M 1 20050713 7.595 10.97 17.96999 5.1925 11.44 2 20050714 7.605 10.94 18.00999 5.2500 11.50 3 20050715 7.480
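Adai's log trick relies on y/x = exp(log(y) - log(x)) for positive prices; on a small matrix the direct ratio and the diff-of-logs route agree (a sketch with made-up prices):

```r
m <- matrix(c(7.595, 7.605, 7.480,
              10.97, 10.94, 10.99), ncol = 2)   # two price series
direct  <- m[-1, ] / m[-nrow(m), ] - 1          # y/x - 1, row by row
vialogs <- exp(diff(log(m))) - 1                # diff() works column-wise
all.equal(direct, vialogs)                      # TRUE
```

This also illustrates why Gilbert's call failed on r1: diff() needs a numeric matrix, so the data frame must first go through as.matrix().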
Re: [R] RandomForest question
See the tuneRF() function in the package for an implementation of the strategy recommended by Breiman and Cutler. BTW, randomForest is only for the R package. See Breiman's web page for notice on trademarks. Andy From: Weiwei Shi Hi, I found the following lines from Leo's randomForest, and I am not sure if it can be applied here, but just tried to help: mtry0 = the number of variables to split on at each node. Default is the square root of mdim. ATTENTION! DO NOT USE THE DEFAULT VALUES OF MTRY0 IF YOU WANT TO OPTIMIZE THE PERFORMANCE OF RANDOM FORESTS. TRY DIFFERENT VALUES - GROW 20-30 TREES, AND SELECT THE VALUE OF MTRY THAT GIVES THE SMALLEST OOB ERROR RATE. mdim is the number of predictors. HTH, weiwei On 7/21/05, Liaw, Andy [EMAIL PROTECTED] wrote: From: [EMAIL PROTECTED] Hello, I'm trying to find out the optimal number of splits (mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when choosing mtry=80. How is it possible that more variables can be used than there are columns in the data frame? It's not. The code for randomForest.default() has: ## Make sure mtry is in reasonable range. mtry <- max(1, min(p, round(mtry))) so it silently sets mtry to the number of predictors if it's too large. As an example: library(randomForest) randomForest 4.5-12 Type rfNews() to see new features/changes/bug fixes. iris.rf = randomForest(Species ~ ., iris, mtry=10) iris.rf$mtry [1] 4 I should probably add a warning in such cases... Andy thanks for your help + kind regards, Arne [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html -- Weiwei Shi, Ph.D Did you always know? No, I did not. But I believed... ---Matrix III __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] RandomForest question
From: [EMAIL PROTECTED] Hello, I'm trying to find out the optimal number of splits (mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when choosing mtry=80. How is it possible that more variables can be used than there are columns in the data frame? It's not. The code for randomForest.default() has: ## Make sure mtry is in reasonable range. mtry <- max(1, min(p, round(mtry))) so it silently sets mtry to the number of predictors if it's too large. As an example: library(randomForest) randomForest 4.5-12 Type rfNews() to see new features/changes/bug fixes. iris.rf = randomForest(Species ~ ., iris, mtry=10) iris.rf$mtry [1] 4 I should probably add a warning in such cases... Andy thanks for your help + kind regards, Arne [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
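The clamping Andy quotes is easy to reproduce in isolation; here p plays the role of the number of predictors (a sketch of the logic only, not the package's actual code path):

```r
# replicate the range check from randomForest.default()
clamp_mtry <- function(mtry, p) max(1, min(p, round(mtry)))
clamp_mtry(80, 32)   # 32: silently reduced to the number of predictors
clamp_mtry(0, 32)    # 1:  never allowed below one variable per split
```

So Arne's mtry=80 run was in fact an mtry=32 run, i.e. bagging over all predictors.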
Re: [R] R graphics
Sam Baxter [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] I am trying to set up 16 graphs on one graphics page in R. I have used the mfrow=c(4,4) command. However I get a lot of white space between each graph. Does anyone know how I can reduce this? The default par()$mar is c(5,4,4,2) + 0.1 and can be reduced. For example: par(mfrow=c(4,4), mar=c(3,3,0,0)) for (i in 1:16) { plot(0:10) } efg -- Earl F. Glynn Bioinformatics Department Stowers Institute for Medical Research __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
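If the shrunken margins still leave the axis annotation too far out, par("mgp") pulls the axis titles and tick labels closer in as well; a small sketch building on efg's example (margin and mgp values are just one reasonable choice):

```r
par(mfrow = c(4, 4),                # 4 x 4 grid of plots
    mar   = c(2.5, 2.5, 0.5, 0.5), # trim each plot's margins
    mgp   = c(1.5, 0.5, 0))        # move titles and labels toward the axis
for (i in 1:16) plot(0:10, xlab = "x", ylab = "y")
```

par("oma") can additionally reserve a single outer margin for a shared title instead of per-panel space.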
Re: [R] colnames
See help(system.time). On Thu, 2005-07-21 at 17:56 +0100, Gilbert Wu wrote: Hi Adai, Your diagnosis is absolutely right; class(r1) returned data.frame and your suggested solution worked perfectly. Your assumption is also right; both x and y are positive. If I want to compare the performance of my old function with yours, are there some functions in R I could use to get the elapsed time etc.? Many thanks indeed. Regards, Gilbert -Original Message- From: Adaikalavan Ramasamy [mailto:[EMAIL PROTECTED] Sent: 19 July 2005 23:38 To: Gilbert Wu Cc: r-help@stat.math.ethz.ch Subject: RE: [R] colnames What does class(r1) give you? If it is data.frame, then try exp( diff( log( as.matrix( df ) ) ) ) BTW, I made the assumption that both x and y are positive values only. Regards, Adai On Tue, 2005-07-19 at 16:30 +0100, Gilbert Wu wrote: Hi Adai, When I tried the optimized routine, I got the following error message: r1 899188 902232 901714 28176U 15322M 20050713 7.595 10.97 17.96999 5.1925 11.44 20050714 7.605 10.94 18.00999 5.2500 11.50 20050715 7.480 10.99 17.64999 5.2500 11.33 20050718 7.415 11.05 17.64000 5.2250 11.27 exp(diff(log(r1))) - 1 Error in r[i1] - r[-length(r):-(length(r) - lag + 1)] : non-numeric argument to binary operator Any idea? Many thanks. Gilbert -Original Message- From: Adaikalavan Ramasamy [mailto:[EMAIL PROTECTED] Sent: 19 July 2005 12:20 To: Gilbert Wu Cc: r-help@stat.math.ethz.ch Subject: RE: [R] colnames First, your problem could be boiled down to the following example. See how the colnames of the two outputs vary. df <- cbind.data.frame( "100" = 1:2, "200" = 3:4 ) df/df X100 X200 1 1 1 2 1 1 m <- as.matrix( df ) # coerce to matrix class m/m 100 200 1 1 1 2 1 1 It appears that whenever R has to create a new dataframe automatically, it tries to get nice colnames. See help(data.frame). I am not exactly sure why this behaviour is different when creating a matrix. But I do not think this is a major problem for most people. 
If you coerce your input to matrix, the problem goes away. Next, note the following points: a) mat[ 1:3, 1:ncol(mat) ] is equivalent to simply mat[ 1:3, ]. b) mat[ 2:nrow(mat), ] is equivalent to simply mat[ -1, ]. See help(subset) for more information. Using the points above, we can simplify your function as p.RIs2Returns <- function (mat){ mat <- as.matrix(mat) x <- mat[ -nrow(mat), ] y <- mat[ -1, ] return( y/x - 1 ) } If your data contains only numerical data, it is probably a good idea to work with matrices, as matrix operations are faster. Finally, we can shorten your function. You can use the diff function (which works column-wise if the input is a matrix) if you know that y/x = exp(log(y/x)) = exp( log(y) - log(x) ), which could be coded in R as exp( diff( log(r1) ) ), and then subtract 1 from the above to get your returns. Regards, Adai On Tue, 2005-07-19 at 09:17 +0100, Gilbert Wu wrote: Hi Adai, Many thanks for the examples. I work for a financial institution. We are exploring R as a tool to implement our portfolio optimization strategies. Hence, R is still a new language to us. The script I wrote tried to make a returns matrix from the daily return indices extracted from a SQL database. Please find below the output that produces the 'X' prefix in the colnames. The reason to preserve the column names is that they are stock identifiers which are to be used by other sub-systems rather than R. I would welcome any suggestion to improve the script. 
Regards, Gilbert p.RIs2Returns <- + function (RIm) + { + x <- RIm[1:(nrow(RIm)-1), 1:ncol(RIm)] + y <- RIm[2:nrow(RIm), 1:ncol(RIm)] + RReturns <- (y/x - 1) + RReturns + } channel <- odbcConnect("ourSQLDB") result <- sqlQuery(channel, paste("select * from equityRIs;")) odbcClose(channel) result stockid sdate dbPrice 1 899188 20050713 7.59500 2 899188 20050714 7.60500 3 899188 20050715 7.48000 4 899188 20050718 7.41500 5 902232 20050713 10.97000 6 902232 20050714 10.94000 7 902232 20050715 10.99000 8 902232 20050718 11.05000 9 901714 20050713 17.96999 10 901714 20050714 18.00999 11 901714 20050715 17.64999 12 901714 20050718 17.64000 13 28176U 20050713 5.19250 14 28176U 20050714 5.25000 15 28176U 20050715 5.25000 16 28176U 20050718 5.22500 17 15322M 20050713 11.44000 18 15322M 20050714 11.5 19 15322M 20050715 11.33000 20 15322M 20050718 11.27000 r1 <- reshape(result, timevar="stockid", idvar="sdate", direction="wide") r1 sdate dbPrice.899188 dbPrice.902232
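For Gilbert's timing question, system.time() wraps any expression and reports user, system and elapsed seconds; a sketch comparing the two returns computations on fake positive data:

```r
x <- matrix(runif(1e5, min = 1, max = 2), ncol = 10)   # fake price matrix
t.old <- system.time(r.old <- x[-1, ] / x[-nrow(x), ] - 1)
t.new <- system.time(r.new <- exp(diff(log(x))) - 1)
t.old["elapsed"]; t.new["elapsed"]   # compare wall-clock time
all.equal(r.old, r.new)              # same answer either way
```

For very fast expressions, wrap them in a loop (or use replicate()) so the measured times are large enough to compare meaningfully.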
Re: [R] Problem with read.table()
Thanks to all who responded to my earlier message. The problem lies in the fact that apostrophes (i.e., ') in some of the text fields are read as quotes. The file can be read without problems by setting the quote= argument in read.table(). Incidentally, read.delim() also works, even without setting quote= explicitly. best regards, Kristian Skrede Gleditsch Department of Political Science, UCSD (On leave, University of Essex, 2005-6) Tel: +44 1206 872499, Fax: +44 1206 873234 Email: [EMAIL PROTECTED] or [EMAIL PROTECTED] http://weber.ucsd.edu/~kgledits/ Kristian Skrede Gleditsch wrote: Dear all, I have encountered a strange problem with read.table(). When I try to read a tab-delimited file, I get an error message for line 260 not having length equal to 14 (see below). Using count.fields() suggests that a number of lines have length not equal to 14, but not line 260. Looking at the actual file, however, I cannot see anything wrong with any lines. They all seem to have length 14, there are no double tabs etc., and the file reads correctly in other programs. Does anyone have any suggestions as to what this might stem from? I have placed a copy of the file at http://dss.ucsd.edu/~kgledits/archigos_v.1.9.asc regards, Kristian Skrede Gleditsch archigos1.9 <- read.table("c:/work/work12/archigos/archigos_v.1.9.asc", + sep="\t", header=T, as.is=T, row.names=NULL) Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 260 did not have 14 elements a <- count.fields("c:/work/work12/archigos/archigos_v.1.9.asc", sep="\t") a <- data.frame(c(1:length(a)), a) a[a[,2]!=14,] c.1.length.a.. a 150 150 10 313 313 10 424 424 10 1189 1189 5 1510 1510 10 1514 1514 10 1590 1590 5 1600 1600 10 1612 1612 10 1618 1618 10 1619 1619 10 1709 1709 10 1722 1722 10 1981 1981 10 1985 1985 10 2112 2112 10 2178 2178 10 2208 2208 10 2224 2224 10 2530 2530 5 2536 2536 5 2573 2573 5 2928 2928 5 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html
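The apostrophe problem is reproducible in a few lines: read.table()'s default quote includes the single quote, so a field like O'Brien opens a quoted region that swallows the tab separators after it, while read.delim()'s default quote is the double quote only, which is why it worked untouched. Disabling quoting entirely is the usual fix (a sketch on a throwaway file):

```r
tf <- tempfile()
writeLines(c("id\tname\tscore",
             "1\tO'Brien\t10",
             "2\tSmith\t12"), tf)
d <- read.table(tf, sep = "\t", header = TRUE, quote = "")  # quoting off
nrow(d)   # 2 rows, as expected
```

With the default quote setting the same call would complain about the wrong number of elements, just like Kristian's line-260 error.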
Re: [R] help with a xyplot legend
On 7/20/05, Ronaldo Reis-Jr. [EMAIL PROTECTED] wrote: Hi, I try to put a legend in an xyplot graphic. xyplot(y~x|g, ylim=c(0,80), xlim=c(10,40), as.table=T, layout=c(2,3), ylab="Número de machos capturados", xlab=expression(paste("Temperatura (", degree, "C)")), key=list(corner=c(0,0), x=0, y=0, text=list(legenda), lines=list(col=cor, lwd=espessura, lty=linha), columns=7, between=0.5, between.columns=0.5, cex=0.8)) The problem is that the legend is very close to the xlab. I tried changing corner=c(0,0), x=0, y=0 to corner=c(0,0), x=0, y=1, but that way the legend doesn't appear. How do I make the bottom area of the plot bigger, to put the legend a bit more separated from the xlabel? Where exactly do you want the legend? If it's outside and below the plot, you should try space="bottom" instead of x, y, corner, etc. Otherwise, with space="inside", no attempt will be made to save space for the legend. Deepayan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
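Deepayan's space="bottom" suggestion in a self-contained form (data, group names and colours invented for illustration):

```r
library(lattice)
d <- data.frame(x = rep(1:10, 2), y = c(1:10, 10:1),
                g = rep(c("up", "down"), each = 10))
xyplot(y ~ x, groups = g, data = d, type = "l",
       key = list(space = "bottom",              # reserve room below the plot
                  text  = list(c("up", "down")),
                  lines = list(col = c("black", "grey40"), lwd = 2),
                  columns = 2))
```

With space= set, lattice allocates a strip of the device for the key, so it can no longer collide with the xlab.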
Re: [R] The steps of building library in R 2.1.1
On 7/21/2005 12:03 PM, Gabor Grothendieck wrote: 2. there is too much material to absorb just to create a package. The manuals are insufficient. The first sentence here is basically a repetition of "the process is too complex". I think the second sentence is incorrect. Could you please point out what necessary steps are missing? It's not that anything is missing that I am aware of. It's that there is so much detail one is overwhelmed. It's not completely the fault of the description, since, as point #1 mentions, the process itself is a key part of the problem. And what solution do you propose to this problem? Saying "this ought to be a high priority for the core group" is not a solution. Tell us where the resources will come from to do this. A step-by-step simplification is very much needed. Exactly this has been in the Installation and Administration manual since I put it there in February for the 2.1.0 release. It's at the beginning of the appendix on the Windows toolset, with multiple references pointing people there. It's followed by detailed descriptions of each of the steps. If you think it could be further improved, please submit improvements. That is easy to say but, in fact, if anyone does this they are not met with a receptive atmosphere. The excellent post describing the process that started this out (even if there are some small errors) is just one example. I don't think that post was written with the intention of putting it into the manual. It would still take a fair bit of work to do that: 1. deciding where it fits and what to replace, 2. correcting the errors, 3. writing it in texinfo format. I'd be happy to talk with someone who volunteers to do that. (I'd suggest the volunteer should do number 1 first, so as not to waste a lot of time on versions that don't fit.) Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] problem running R with perl Statistics::R
[R] Using Perl Statistics::R module, not running the
Re: [R] Chemoinformatic people
Sure, but Luke, I DO NOT currently use R at work... (now, that's not to say I won't be using it in a few months, but currently...). best, -tony On 7/21/05, Luke Tierney [EMAIL PROTECTED] wrote: I don't. There is an address and email at Novartis in the ASA directory ID 068970 Name Anthony J. Rossini Company Novartis Pharma AG Address Biostatistics WSJ-27.1.012 City State Zip CH-4002 Basel Country Switzerland Phone (206) 543-2005 Email [EMAIL PROTECTED] luke On Thu, 21 Jul 2005, A.J. Rossini wrote: Just with R, or via another tool integrating R, such as Pipeline Pilot? best, -tony On 7/20/05, Frédéric Ooms [EMAIL PROTECTED] wrote: Dear colleague, Just an e-mail to ask whether there are people working in the field of chemoinformatics who are using R in their work. If yes, I was wondering if we could exchange tips and tricks about the use of R in this area? Best regards Fred Ooms [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Luke Tierney Chair, Statistics and Actuarial Science Ralph E. Wareham Professor of Mathematical Sciences University of Iowa Phone: 319-335-3386 Department of Statistics and Actuarial Science Fax: 319-335-3017 241 Schaeffer Hall email: [EMAIL PROTECTED] Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu -- best, -tony Commit early, commit often, and commit in a repository from which we can easily roll-back your mistakes (AJR, 4Jan05). A.J. Rossini [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] The steps of building library in R 2.1.1
Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with the package installation procedure [it should be as easy as downloading a package], remove the necessity to set or modify any environment variables, including the path variables). 2. there is too much material to absorb just to create a package. The manuals are insufficient. A step-by-step simplification is very much needed. It's no coincidence that there are a number of such descriptions on the net (google for 'making creating R package'), since I would guess that just about everyone has significant problems in creating their first package on Windows. OK, if people really think this is required, I will sit down at a clean Windows XP machine, do the setup, and write it down for the next R Help Desk in R News -- something like "Creating my first R package under Windows"? If anybody else is willing to contribute and can write something up in a manner that is *not* confusing or misleading (none of the other material spread over the web satisfies this requirement, AFAICS), she/he is invited to contribute, of course. BTW, everybody else is invited to submit proposals for R Help Desk!!! Uwe Ligges On 7/21/05, Uwe Ligges [EMAIL PROTECTED] wrote: Ivy_Li wrote: Dear All, With the warm support of every R expert, I have built my R library successfully. Especially thanks: Duncan Murdoch Gabor Grothendieck Henrik Bengtsson Uwe Ligges You are welcome. The following is intended for the records in the archive in order to protect readers. Without your help, I would have been much less efficient. I noticed that some other friends were puzzled by the method of building a library. Now, I have organized a document about it. Hoping it can help more friends. 1. Read the webpage http://www.stats.ox.ac.uk/pub/Rtools Do you mean http://www.murdoch-sutherland.com/Rtools/ ? 2. 
Download the rw2011.exe; Install the newest version of R 3. Download the tools.zip; Unpack it into c:\cygwin Not required to call it cygwin - also a bit misleading... 4. Download the ActivePerl-5.6.1.633-MSWin32-x86.msi; Install Active Perl in c:\Perl Why in C:\Perl ? 5. Download the MinGW-3.1.0-1.exe; Install the mingw32 port of gcc in c:\mingwin Why in c:\mingwin ? 6. Then go to Control Panel - System - Advanced - Environment Variables - Path - Variable Value; add c:\cygwin;c:\mingwin\bin The PATH variable already contains a couple of paths; add the two given above in front of all others, separated by ;. Why do we add them at the beginning of the path? Because we want the folder that contains the tools to be at the beginning, so that you eliminate the possibility of finding a different program of the same name first in a folder that comes prior to the one where the tools are stored. OK, this (1-6) is all described in the R Administration and Installation manual, hence I do not see why we have to repeat it here. 7. I use the package.skeleton() function to make a draft package. It will automate some of the setup for a new source package. It creates directories, saves functions and data to appropriate places, and creates skeleton help files and 'README' files describing further steps in packaging. I type in R: f <- function(x,y) x+y g <- function(x,y) x-y d <- data.frame(a=1, b=2) e <- rnorm(1000) package.skeleton(list=c("f","g","d","e"), name="example") Then modify the 'DESCRIPTION': Package: example Version: 1.0-1 Date: 2005-07-09 Title: My first function Author: Ivy [EMAIL PROTECTED] Maintainer: Ivy [EMAIL PROTECTED] Description: simple sum and subtract License: GPL version 2 or later Depends: R (>= 1.9), stats, graphics, utils You can refer to the web page: http://cran.r-project.org/src/contrib/Descriptions/ There you will find a large source of examples. And you can read the 'Creating R Packages' part of 'Writing R Extensions'. It introduces some useful things for your reference. 
This is described in Writing R Extensions and is not related to the setup of your system in 1-6. 8. Download hhc.exe, the Microsoft help compiler, from somewhere. And save it somewhere in your path. I downloaded an 'htmlhelp.exe' and installed it. I saved the hhc.exe into 'C:\cygwin\bin' because this path has been written in my PATH variable value. However, if you decide not to use the Help Compiler (hhc), then you need to modify the MkRules file in RHOME/src/gnuwin32 to tell it not to try to build that kind of help file This is described in the R Administration and Installation manual, and I do not see why we should put the html help compiler with the other tools. 9. In the DOS environment. Into the D:\ Type the following code: There is no DOS
Re: [R] Question about 'text' (add lm summary to a plot)
On Thu, 21 Jul 2005, Christoph Buser wrote: Dear Dan I can only help you with your third problem, expression and paste. You can use: plot(1:5, 1:5, type = "n") text(2, 4, expression(paste("Slope : ", 3.45 %+-% 0.34, sep = "")), pos = 4) text(2, 3.8, expression(paste("Intercept : ", -10.43 %+-% 1.42)), pos = 4) text(2, 3.6, expression(paste(R^2, ": ", 0.78, sep = "")), pos = 4) Cheers for this. I was trying to get it to work, but the problem is that I need to replace the values above with variables, from the following code... dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) dat.lm.sum <- summary(dat.lm) my.slope.1 <- round(dat.lm.sum$coefficients[2],2) my.slope.2 <- round(dat.lm.sum$coefficients[4],2) my.inter.1 <- round(dat.lm.sum$coefficients[1],2) my.inter.2 <- round(dat.lm.sum$coefficients[3],2) my.Rsqua.1 <- round(dat.lm.sum$r.squared,2) Anything I try results in either the words 'paste("Slope:", my.slope.1, %+-% my.slope.2, sep="")' being written to the plot, or just 'my.slope.1+-my.slope2' (where the +- is correctly written). I want to script it up and write all three lines to the plot with 'sep="\n"', rather than deciding three different heights. I do not have an elegant solution for the alignment. Thanks very much for what you gave; it's a good start for me to figure out how I am supposed to be telling R what to do! Any way to just get fixed-width fonts with text? (for the alignment problem) Cheers, Dan. Regards, Christoph Buser -- Christoph Buser [EMAIL PROTECTED] Seminar fuer Statistik, LEO C13 ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND phone: x-41-44-632-4673 fax: 632-1228 http://stat.ethz.ch/~buser/ -- Dan Bolser writes: I would like to annotate my plot with a little box containing the slope, intercept and R^2 of a lm on the data. I would like it to look like... ++ | Slope : 3.45 +- 0.34 | | Intercept : -10.43 +- 1.42 | | R^2 : 0.78 | ++ However I can't make anything this neat, and I can't find out how to combine this with symbols for R^2 / +- (plus minus). 
Below is my best attempt (which is frankly quite poor). Can anyone improve on it? Specifically: aligned text and numbers, aligned decimal places, a symbol for R^2 in the text (expression(R^2) seems to fail with 'paste'), and +-. Cheers, Dan. dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) abline(coef(dat.lm), lty = 2, lwd = 1.5) dat.lm.sum <- summary(dat.lm) dat.lm.sum attributes(dat.lm.sum) my.text.1 <- paste("Slope : ", round(dat.lm.sum$coefficients[2], 2), "+/-", round(dat.lm.sum$coefficients[4], 2)) my.text.2 <- paste("Intercept : ", round(dat.lm.sum$coefficients[1], 2), "+/-", round(dat.lm.sum$coefficients[3], 2)) my.text.3 <- paste("R^2 : ", round(dat.lm.sum$r.squared, 2)) my.text.1 my.text.2 my.text.3 ## Add legend text(x = 3, y = 300, paste(my.text.1, my.text.2, my.text.3, sep = "\n"), adj = c(0, 0), cex = 1) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] The steps of building library in R 2.1.1
Just a few thoughts... Good documentation helps everybody - the beginners and the experts (fewer beginner questions if there is thorough and accessible documentation). I fully appreciate that this is a volunteer effort - I'm just trying to pin down some places where we have documentation issues. Docs can be in a number of different forms - reference, examples, carefully and thoroughly explained. I personally find it difficult to understand reference-type material until I have seen a worked example, and some of the reference material is a little light on the examples for me, and others like me who thrive on examples. The Linux-HOWTO collection is a good example of step-by-step documentation... If you're an expert, then you can read it fast and skip... otherwise you read every line. Since R comes as both a computer package and a statistics thingy... the questions on this list come in three forms - those that are very 'package', e.g. "how do I reduce the space between two graphs", to the statistics questions, "how reliable is the coefficient of determination in the presence of outliers" (which shouldn't really be asked here), and then the "how do I do 'statistic X' in R" questions - "how do I calculate a confidence interval around the coefficient of determination in R?". The standard documentation got me so far in learning about R - I got a copy of MASS, S Programming, Introductory Statistics with R, and Michael Crawley's new book Statistics: An Introduction using R - along with every online book I could find on CRAN and elsewhere. Unfortunately for me, my experience leaves me in between the beginner books and the more advanced texts like MASS, David, Schervish etc... 
The learning curve is steep - but then, like many people, I'd like to be able to do sophisticated modelling with deep understanding and no effort :-) I feel there is a bit of a hole in the middle of the documentation which could be attacked from both sides - the introduction element is starting to be covered - it's the next step up from that. And yes, before you ask, I would like to help - but my statistics knowledge is very poor! Should this conversation go to r-devel? Thanks for listening, Sean On 21/07/05, Uwe Ligges [EMAIL PROTECTED] wrote: Duncan Murdoch wrote: On 7/21/2005 10:29 AM, Uwe Ligges wrote: Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with the package installation procedure [it should be as easy as downloading a package], remove the necessity to set or modify any environment variables, including the path variables). 2. there is too much material to absorb just to create a package. The manuals are insufficient. A step-by-step simplification is very much needed. It's no coincidence that there are a number of such descriptions on the net (google for 'making creating R package'), since I would guess that just about everyone has significant problems in creating their first package on Windows. OK, if people really think this is required, I will sit down at a clean Windows XP machine, do the setup, and write it down for the next R Help Desk in R News -- something like "Creating my first R package under Windows"? If anybody else is willing to contribute and can write something up in a manner that is *not* confusing or misleading (none of the other material spread over the web satisfies this requirement, AFAICS), she/he is invited to contribute, of course. BTW, everybody else is invited to submit proposals for R Help Desk!!! That sounds great. 
Could you also take notes as you go about specific problems in the writeup in the R-admin manual, so it can be improved for the next release? Of course. Another thing you could do which would be valuable: get a student or someone else who is reasonably computer literate, but unfamiliar with R details, to do this while you sit watching and recording their mistakes. Good idea. Uwe Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] unable to call R t-test from Java
In my last post, I left off version info: Java: jdk1.5.0_03 R: 2.1.1 SJava: 0.68 OS: SunOS 5.8 Trying either the eval or the call method to execute a t-test results in a core dump. e.g. eval method: /* produces a core */ System.err.println("eval a t.test"); Object value = e.eval("t.test(c(1,2,3), c(4,5,6))"); if (value != null) interp.show(value); e.g. call method: Object[] funArgs = new Object[2]; double[] d0 = { 1.1, 2.2, 3.3 }; double[] d1 = { 9.9, 8.8, 7.7 }; funArgs[0] = d0; funArgs[1] = d1; System.err.println("\r\n Calling t.test and passing a java array"); Object value = e.call("t.test", funArgs); if (value != null) { interp.show(value); System.err.println("\r\n"); } Thanks, Laura -Original Message- From: O'Brien, Laura Sent: Wednesday, July 20, 2005 10:42 AM To: 'r-help@stat.math.ethz.ch' Subject: unable to call R t-test from Java Hello, My colleague and I would like to write Java code that invokes R to do a simple t-test. I've included my sample Java code below. I tried various alternatives and am unable to pass a vector to the t.test method. In my investigation, I tried to call other R methods that take vectors and also ran into various degrees of failure. Any insight you can provide or other web references you can point me to would be appreciated. Thank you, Laura O'Brien Application Architect --- code --- package org.omegahat.R.Java.Examples; import org.omegahat.R.Java.ROmegahatInterpreter; import org.omegahat.R.Java.REvaluator; public class JavaRCall2 { /** * want to see if I can eval a t.test command like what I would run at the * R command line */ static public void runTTestByEval_cores(REvaluator e, ROmegahatInterpreter interp) { /* produces a core */ System.err.println("eval a t.test"); Object value = e.eval("t.test(c(1,2,3), c(4,5,6))"); if (value != null) interp.show(value); } /** * want to see if I can eval anything that takes a vector, e.g. 
mean, * like what I would run at the R command line */ static public void runMeanByEval_works(REvaluator e, ROmegahatInterpreter interp) { System.err.println("\r\n evaluating string mean command"); Object value = e.eval("mean(c(1,2,3))"); if (value != null) { interp.show(value); System.err.println("\r\n"); } } /** * if I pass mean an org.omegahat.Environment.DataStructures.numeric, what do I get? NaN */ static public void runMeanByNumericList_nan(REvaluator e, ROmegahatInterpreter interp) { Object[] funArgs = new Object[1]; // "given argument is not numeric or logical" org.omegahat.Environment.DataStructures.numeric rList1 = new org.omegahat.Environment.DataStructures.numeric(3); double[] dList = new double[3]; dList[0] = (double) 1.1; dList[1] = (double) 2.2; dList[2] = (double) 3.3; rList1.setData(dList, true); System.err.println(rList1.toString()); funArgs[0] = rList1; System.err.println("\r\n Calling mean and passing an omegahat vector"); Object value = e.call("mean", funArgs); if (value != null) { interp.show(value); System.err.println("\r\n"); } } /** * let's run some tests on the vector passed in and see what R thinks I'm handing it * * it returns * is.numeric: false * mode: list * length: 2 */ public static void runTestsOnOmegahatNumeric(REvaluator e, ROmegahatInterpreter interp) { Object[] funArgs = new Object[1]; // "given argument is not numeric or logical" org.omegahat.Environment.DataStructures.numeric rList1 = new org.omegahat.Environment.DataStructures.numeric(3); double[] dList = new double[3]; dList[0] = (double) 1.1; dList[1] = (double) 2.2; dList[2] = (double) 3.3; rList1.setData(dList, true); System.err.println(rList1.toString()); funArgs[0] = rList1; System.err.println("\r\n Calling is.numeric and passing an omegahat vector"); Object value = e.call("is.numeric", funArgs); if (value != null) { interp.show(value); System.err.println("\r\n"); } // mode is list System.err.println("\r\n Calling mode and passing an omegahat vector"); value = e.call("mode", funArgs); if (value != 
null) { interp.show(value); System.err.println("\r\n"); }
Re: [R] output of variance estimate of random effect from a gamma frailty model using Coxph in R
On Thu, 21 Jul 2005 [EMAIL PROTECTED] wrote: Hi, I have a question about the output for the variance of the random effect from a gamma frailty model using coxph in R. Is it the variance of the frailties themselves or the variance of the log frailties? Thanks. For a Gamma frailty model it is the variance of the Gamma distribution, so the variance of the frailties. For Gaussian frailty it will be the log frailties, though. -thomas __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
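Where this distinction shows up in practice can be sketched with the kidney example data that ships with the survival package (the covariates used here are just those of that dataset, not from the original question):

```r
library(survival)

# Gamma frailty: the reported variance of the random effect is the
# variance of the frailties themselves
fit.gamma <- coxph(Surv(time, status) ~ age + sex +
                     frailty(id, distribution = "gamma"),
                   data = kidney)
print(fit.gamma)

# Gaussian frailty: here the reported variance refers instead to the
# log frailties (the random effects on the log-hazard scale)
fit.gauss <- coxph(Surv(time, status) ~ age + sex +
                     frailty(id, distribution = "gaussian"),
                   data = kidney)
print(fit.gauss)
```

Both fits print a "Variance of random effect" line; the point above is that the same label means different things for the two distributions.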
Re: [R] debian vcd package
I just installed it on a Debian 2.6.8.1 box with the following R version: platform i386-pc-linux-gnu, arch i386, os linux-gnu, system i386/linux-gnu, major 2, minor 0.1, year 2004, month 11, day 15, language R (i.e. R 2.0.1). By the way, are you using apt-get install vcd? If so, why? Just use install.packages("vcd") from within R. HTH, Jean On Thu, 21 Jul 2005, Peter Ho wrote: [Apologies if you have already read this message sent from another email address] Hi R-Help, I have been using R on Linux (Debian) for the past month. The usual way I install packages is through apt. Recently, a new package, vcd, became available on CRAN. I tried installing it today and found that Debian does not seem to support this package. I also found that many other packages were unavailable. Does anyone have any recommended sites where a full list is available? If none exist, what would be the best way to move ahead in installing, say, the vcd package? I am still a novice in using Debian, so please forgive me if some of my questions seem trivial to experienced users. Peter Peter Ho, PhD. Escola Superior de Tecnologia e Gestao. Instituto Politecnico de Viana do Castelo. Avenida do Atlantico - Apartado 574. 4901-908 Viana do Castelo. Portugal. Tel: +351-258-819700 Ext. 1252 Email: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] tav files
Dear colleagues: I am using SpotFinder 2.2.3 and the .tav files generated there have 20 columns. When I exclude the last 3 columns, the .tav file cannot be recognized by the aroma package on the R platform. What do I have to do to generate .tav files with only 17 columns? Thank you in advance. Carlos Dept. of Immunology University of São Paulo -- Open WebMail Project (http://openwebmail.org) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
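If trimming the columns is acceptable, one hedged workaround is to do it in R before handing the file to aroma. The data frame below is a simulated stand-in for a 20-column tab-delimited .tav file; in practice you would replace it with read.delim("yourfile.tav"):

```r
# Hypothetical stand-in for a 20-column .tav file (tab-delimited text)
tav <- as.data.frame(matrix(rnorm(5 * 20), nrow = 5))
names(tav) <- paste("col", 1:20, sep = "")

# Keep only the first 17 columns and write them back out tab-delimited
tav17 <- tav[, 1:17]
write.table(tav17, file = tempfile(fileext = ".tav"),
            sep = "\t", quote = FALSE, row.names = FALSE)
ncol(tav17)
```

Whether aroma then accepts the trimmed file depends on which 17 columns it expects, which the .tav format documentation should specify.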
[R] Invitation to New York, Spain, and Italy; c/ba
Dear potential Speaker: On behalf of the organizing committee, I would like to extend a cordial invitation for you to submit a paper to the IPSI Transactions journal, or to attend one of the upcoming IPSI BgD multidisciplinary, interdisciplinary, and transdisciplinary conferences. The first one will take place in New York City, NY, USA: IPS-USA-2006 NEW YORK Hotel Beacon (arrival: 5 January 06 / departure: 8 January 06) New Deadlines: 1 August 05 (abstract) 1 October 05 (full paper) The second one will take place in Marbella, Spain: IPSI-2006 SPAIN Hotel Puente Romano (arrival: 10 February 06 / departure: 13 February 06) Deadlines: 1 September 05 (abstract) 1 November 05 (full paper) The third one will take place in Amalfi, Italy: IPSI-2006 ITALY Hotel Santa Caterina (arrival: 23 March 06 / departure: 26 March 06) Deadlines: 1 October 05 (abstract) 1 December 05 (full paper) All IPSI BgD conferences are non-profit. They bring together the elite of the world science; so far, we have had seven Nobel Laureates speaking at the opening ceremonies. The conferences always take place in some of the most attractive places of the world. All those who come to IPSI conferences once, always love to come back (because of the unique professional quality and the extremely creative atmosphere); lists of past participants are on the web, as well as details of future conferences. These conferences are in line with the newest recommendations of the US National Science Foundation and of the EU research sponsoring agencies, to stress multidisciplinary, interdisciplinary, and transdisciplinary research (M+I+T++ research). The speakers and activities at the conferences truly support this type of scientific interaction. 
Among the main topics of these conferencs are: E-education and E-business with Special Emphasis on Semantic Web and Web Datamining Other topics of interest include, but are not limited to: * Internet * Computer Science and Engineering * Mobile Communications/Computing for Science and Business * Management and Business Administration * Education * e-Medicine * e-Oriented Bio Engineering/Science and Molecular Engineering/Science * Environmental Protection * e-Economy * e-Law * Technology Based Art and Art to Inspire Technology Developments * Internet Psychology If you would like more information on either conference, please reply to this e-mail message. If you plan to submit an abstract and paper, please let us know immediately for planning purposes. Remember that you can submit your paper also to the IPSI Transactions journal. Sincerely Yours, Prof. V. Milutinovic, Chairman, IPSI BgD Conferences * * * CONTROLLING OUR E-MAILS TO YOU * * * If you would like to continue to be informed about future IPSI BgD conferences, please reply to this e-mail message with a subject line of SUBSCRIBE. If you would like to be removed from our mailing list, please reply to this e-mail message with a subject line of REMOVE. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Question about 'text' (add lm summary to a plot)
Use bquote instead of expression, e.g.: trees.lm <- lm(Volume ~ Girth, trees) trees.sm <- summary(trees.lm) trees.co <- round(trees.sm$coefficients, 2) trees.rsq <- round(trees.sm$r.squared, 2) plot(Volume ~ Girth, trees) text(10, 60, bquote("Intercept : " * .(trees.co[1,1]) %+-% .(trees.co[1,2])), pos = 4) text(10, 57, bquote("Slope : " * .(trees.co[2,1]) %+-% .(trees.co[2,2])), pos = 4) text(10, 54, bquote(R^2 * " : " * .(trees.rsq)), pos = 4) On 7/21/05, Dan Bolser [EMAIL PROTECTED] wrote: On Thu, 21 Jul 2005, Christoph Buser wrote: Dear Dan, I can only help you with your third problem, expression and paste. You can use: plot(1:5, 1:5, type = "n") text(2, 4, expression(paste("Slope : ", 3.45 %+-% 0.34, sep = "")), pos = 4) text(2, 3.8, expression(paste("Intercept : ", -10.43 %+-% 1.42)), pos = 4) text(2, 3.6, expression(paste(R^2, " : ", 0.78, sep = "")), pos = 4) Cheers for this. I was trying to get it to work, but the problem is that I need to replace the values above with variables, from the following code... dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) dat.lm.sum <- summary(dat.lm) my.slope.1 <- round(dat.lm.sum$coefficients[2], 2) my.slope.2 <- round(dat.lm.sum$coefficients[4], 2) my.inter.1 <- round(dat.lm.sum$coefficients[1], 2) my.inter.2 <- round(dat.lm.sum$coefficients[3], 2) my.Rsqua.1 <- round(dat.lm.sum$r.squared, 2) Anything I try results in either the literal text 'paste("Slope : ", my.slope.1, %+-% my.slope.2, sep = "")' being written to the plot, or just 'my.slope.1 +- my.slope.2' (where the +- is correctly rendered). I want to script it up and write all three lines to the plot with 'sep = "\n"', rather than deciding three different heights. I do not have an elegant solution for the alignment. Thanks very much for what you gave, it's a good start for me to figure out how I am supposed to be telling R what to do! Any way to just get fixed-width fonts with text? (for the alignment problem) Cheers, Dan. Regards, Christoph Buser -- Christoph Buser [EMAIL PROTECTED] Seminar fuer Statistik, LEO C13 ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND phone: x-41-44-632-4673 fax: 632-1228 http://stat.ethz.ch/~buser/ -- Dan Bolser writes: I would like to annotate my plot with a little box containing the slope, intercept and R^2 of an lm on the data. I would like it to look like... ++ | Slope : 3.45 +- 0.34 | | Intercept : -10.43 +- 1.42 | | R^2 : 0.78 | ++ However I can't make anything this neat, and I can't find out how to combine this with symbols for R^2 / +- (plus minus). Below is my best attempt (which is frankly quite poor). Can anyone improve on it? Specifically: aligned text and numbers, aligned decimal places, a symbol for R^2 in the text (expression(R^2) seems to fail with 'paste'), and +-. Cheers, Dan. dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) abline(coef(dat.lm), lty = 2, lwd = 1.5) dat.lm.sum <- summary(dat.lm) dat.lm.sum attributes(dat.lm.sum) my.text.1 <- paste("Slope : ", round(dat.lm.sum$coefficients[2], 2), "+/-", round(dat.lm.sum$coefficients[4], 2)) my.text.2 <- paste("Intercept : ", round(dat.lm.sum$coefficients[1], 2), "+/-", round(dat.lm.sum$coefficients[3], 2)) my.text.3 <- paste("R^2 : ", round(dat.lm.sum$r.squared, 2)) my.text.1 my.text.2 my.text.3 ## Add legend text(x = 3, y = 300, paste(my.text.1, my.text.2, my.text.3, sep = "\n"), adj = c(0, 0), cex = 1) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Question about 'text' (add lm summary to a plot)
[Note: the initial posts have been re-arranged to attempt to maintain the flow from top to bottom] Dan Bolser writes: I would like to annotate my plot with a little box containing the slope, intercept and R^2 of an lm on the data. I would like it to look like... ++ | Slope : 3.45 +- 0.34 | | Intercept : -10.43 +- 1.42 | | R^2 : 0.78 | ++ However I can't make anything this neat, and I can't find out how to combine this with symbols for R^2 / +- (plus minus). Below is my best attempt (which is frankly quite poor). Can anyone improve on it? Specifically: aligned text and numbers, aligned decimal places, a symbol for R^2 in the text (expression(R^2) seems to fail with 'paste'), and +-. Cheers, Dan. dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) abline(coef(dat.lm), lty = 2, lwd = 1.5) dat.lm.sum <- summary(dat.lm) dat.lm.sum attributes(dat.lm.sum) my.text.1 <- paste("Slope : ", round(dat.lm.sum$coefficients[2], 2), "+/-", round(dat.lm.sum$coefficients[4], 2)) my.text.2 <- paste("Intercept : ", round(dat.lm.sum$coefficients[1], 2), "+/-", round(dat.lm.sum$coefficients[3], 2)) my.text.3 <- paste("R^2 : ", round(dat.lm.sum$r.squared, 2)) my.text.1 my.text.2 my.text.3 ## Add legend text(x = 3, y = 300, paste(my.text.1, my.text.2, my.text.3, sep = "\n"), adj = c(0, 0), cex = 1) On Thu, 21 Jul 2005, Christoph Buser wrote: Dear Dan, I can only help you with your third problem, expression and paste. You can use: plot(1:5, 1:5, type = "n") text(2, 4, expression(paste("Slope : ", 3.45 %+-% 0.34, sep = "")), pos = 4) text(2, 3.8, expression(paste("Intercept : ", -10.43 %+-% 1.42)), pos = 4) text(2, 3.6, expression(paste(R^2, " : ", 0.78, sep = "")), pos = 4) I do not have an elegant solution for the alignment. On Thu, 2005-07-21 at 19:55 +0100, Dan Bolser wrote: Cheers for this. I was trying to get it to work, but the problem is that I need to replace the values above with variables, from the following code... 
dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) dat.lm.sum <- summary(dat.lm) my.slope.1 <- round(dat.lm.sum$coefficients[2], 2) my.slope.2 <- round(dat.lm.sum$coefficients[4], 2) my.inter.1 <- round(dat.lm.sum$coefficients[1], 2) my.inter.2 <- round(dat.lm.sum$coefficients[3], 2) my.Rsqua.1 <- round(dat.lm.sum$r.squared, 2) Anything I try results in either the literal text 'paste("Slope : ", my.slope.1, %+-% my.slope.2, sep = "")' being written to the plot, or just 'my.slope.1 +- my.slope.2' (where the +- is correctly rendered). I want to script it up and write all three lines to the plot with 'sep = "\n"', rather than deciding three different heights. Thanks very much for what you gave, it's a good start for me to figure out how I am supposed to be telling R what to do! Any way to just get fixed-width fonts with text? (for the alignment problem) Dan, Here is one approach. It may not be the best, but it gets the job done. You can certainly take this and encapsulate it in a function to automate the text/box placement and to pass values as arguments. A couple of quick concepts: 1. As far as I know, plotmath cannot do multiple lines, so each line in your box needs to be done separately. 2. The horizontal alignment is a bit problematic when using expression() or bquote(), since I don't believe that multiple spaces are honored as such after parsing. Thus I break up each component (label, ":", and values) into separate text() calls. The labels are left-justified. 3. The alignment for the numeric values is done with right justification. So, as long as you use a consistent number of decimals in the value outputs (2 here), you should be OK. This means you might need to use formatC() or sprintf() to control the numeric output values on either side of the +/- sign. 4. In the variable replacement, note the use of substitute() and the list of x and y arguments as replacement values in the expressions. 
# Set your values my.slope.1 <- 3.45 my.slope.2 <- 0.34 my.inter.1 <- -10.43 my.inter.2 <- 1.42 my.Rsqua <- 0.78 # Create the initial plot as per Christoph's post plot(1:5, 1:5, type = "n") ## Do the Slope text(1, 4.5, "Slope", pos = 4) text(2, 4.5, ":") text(3, 4.5, substitute(x %+-% y, list(x = my.slope.1, y = my.slope.2)), pos = 2) ## Do the Intercept text(1, 4.25, "Intercept", pos = 4) text(2, 4.25, ":") text(3, 4.25, substitute(x %+-% y, list(x = my.inter.1, y = my.inter.2)), pos = 2) ## Do the R^2 text(1, 4, expression(R^2), pos = 4) text(2, 4, ":") text(3, 4, my.Rsqua, pos = 2)
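The three text() calls per row can be wrapped in a small helper so every line of the box is laid out the same way. This is only a sketch following the label / colon / right-justified value approach described above; the function name add.stat.line is mine, not from the thread:

```r
# Helper: one row of the stats box, using the layout described above
add.stat.line <- function(y, label, est, err) {
  text(1, y, label, pos = 4)    # left-justified label
  text(2, y, ":")               # centred colon
  text(3, y, substitute(x %+-% s, list(x = est, s = err)),
       pos = 2)                 # right-justified value
}

plot(1:5, 1:5, type = "n")
add.stat.line(4.50, "Slope", 3.45, 0.34)
add.stat.line(4.25, "Intercept", -10.43, 1.42)
# R^2 has no +- part, so it is placed directly
text(1, 4.00, expression(R^2), pos = 4)
text(2, 4.00, ":")
text(3, 4.00, 0.78, pos = 2)
```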
Re: [R] RandomForest question
Uwe Ligges [EMAIL PROTECTED] wrote: Hello, I'm trying to find out the optimal number of splits (the mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when choosing mtry=80. How is it possible that more variables can be used than there are columns in the data frame? If some of the variables are factors, dummy variables are generated and you get a larger number of variables in the later process. No, unless the OP is using the formula interface with a version of the package from two years or so ago. We got the first formula interface by copying and modifying the one for svm() in e1071, and forgot the fact that SVM needs that for dealing with factors, but trees do not (especially not given how the underlying RF code handles them). This was corrected long ago. Cheers, Andy Uwe Ligges thanks for your help + kind regards, Arne [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
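For choosing mtry empirically, the randomForest package's own tuneRF() helper does a doubling search over candidate values. A minimal sketch on simulated data of the same shape as the poster's (the data here are purely illustrative, so the errors mean nothing):

```r
library(randomForest)

set.seed(42)
# Simulated stand-in: binary response, 32 explanatory variables, 575 cases
x <- data.frame(matrix(rnorm(575 * 32), nrow = 575))
y <- factor(rbinom(575, 1, 0.5))

# Search over mtry, doubling (stepFactor = 2) while the OOB error improves
tuned <- tuneRF(x, y, mtryStart = 6, stepFactor = 2,
                improve = 0.01, trace = FALSE, plot = FALSE)
tuned   # matrix of mtry values and their OOB error estimates
```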
[R] A Question About Inverse Gamma
Hi R users, I am having a little problem finding the solution to this problem in R: 1. I need to generate a normal distribution of sample size 30, mean = 50, sd = 5. 2. From the statistics obtained in step 1, I need to generate the Inverse Gamma distribution. Your views and help will be appreciated. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] A Question About Inverse Gamma
[EMAIL PROTECTED] wrote: Hi R users, I am having a little problem finding the solution to this problem in R: 1. I need to generate a normal distribution of sample size 30, mean = 50, sd = 5. 2. From the statistics obtained in step 1, I need to generate the Inverse Gamma distribution. Your views and help will be appreciated. I found rinvgamma in the MCMCpack package. Perhaps that's what you need. Did you read the posting guide? A help.search("normal") would have helped you with item 1. --sundar __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
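A minimal sketch of the two steps, assuming the MCMCpack package for rinvgamma(). Note that how the normal-sample statistics should map onto the inverse gamma's shape and scale depends on the intended model, so the mapping below is only a placeholder:

```r
library(MCMCpack)   # provides rinvgamma(n, shape, scale)

set.seed(1)
# Step 1: a normal sample of size 30 with mean 50 and sd 5
x <- rnorm(30, mean = 50, sd = 5)
m <- mean(x)
v <- var(x)

# Step 2: draw from an inverse gamma; using the sample statistics as
# shape/scale here is purely illustrative, not a recommended mapping
ig <- rinvgamma(1000, shape = m, scale = v)
summary(ig)
```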
Re: [R] R:plot and dots
Here is an example: x <- rnorm(20) y <- rnorm(20) plot(x, y, type = "n") text(x, y, labels = as.character(1:20)) Also look into help(identify) if you want to point out specific points. Regards, Adai On Thu, 2005-07-21 at 16:18 +0200, Clark Allan wrote: hi all, a very simple question. i have plot(x,y) but i would like to add on the plot the observation number associated with each point. how can this be done? / allan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] The steps of building library in R 2.1.1
On 22 Jul 2005 00:01:18 +0200, Peter Dalgaard [EMAIL PROTECTED] wrote: Duncan Murdoch [EMAIL PROTECTED] writes: On 7/21/2005 9:43 AM, Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). I agree with some of this, but I don't see much interest in fixing it. For example, getting rid of Perl would be a lot of work. When the Perl scripts were written, R was not capable of doing what they do. I think it is capable now, but there's still a huge amount of translation work to do. Who will do that? Who will test that they did it right? At the end, will it actually have been worth all the trouble? Installing Perl is not all that hard. Another point of view is that the issue is that we cannot ship a full set of build tools with R on Windows, the main obstacle being that Active Perl has redistribution restrictions. Although the ultimate answer is to get rid of perl entirely, in the same vein as your discussion, perhaps R could simply provide standalone executables for each perl program currently used by R (using perlcc to produce them). I believe there is a free alternative to Microsoft's Help Compiler too but I just googled for it and was unable to locate it. By placing all these items (and the UNIXish tools) in the \R\rw...\bin directory and using the registry as in Rcmd.bat and Rgui.bat found in batchfiles: http://cran.r-project.org/contrib/extra/batchfiles/ modifying the path and environment variables by the user might be eliminated. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] principal component analysis in affy
1) Please learn to wrap your emails to 72 characters per line. See http://expita.com/nomime.html 2) You might have better luck with the BioConductor folks. Their mailing list is https://stat.ethz.ch/mailman/listinfo/bioconductor 3) The affy package has many functions, including some algorithms for preprocessing CEL files. 4) I do not understand why you need the affy package if you do not have CEL files? I presume that you only have a subset of the final data. 5) I do not understand your question, but you might want to simply transpose the matrix before doing a prcomp() on it. See help(t). Regards, Adai On Thu, 2005-07-21 at 10:09 -0500, Wagle, Mugdha wrote: Hi, I have been using the prcomp function to perform PCA on my example microarray data (stored in metric text files), which looks like this: columns 1a 1b 1c 1d 1e 1f ... 4r 4s 4t, rows g1 1.2705 1.2766 ... 2.0298, g2 0.1631 0.7067 ..., g3 0.2212 1.0439 ..., down to g99 1.3657 ... 2.3736 -- i.e. a matrix of 63 columns and 99 rows, where the columns represent chips and the rows represent genes. Now, the biplot function biplot(prcomp(pcadata, scale = TRUE), cex = c(0.75, 0.75)) gives me a plot with one vector per gene. However, I actually need to get one vector per chip instead of one vector per gene. I have been told that there is a function in the affy package that does what I am looking for, i.e. gives one vector per chip. Can someone please tell me what the function is called, and how I can get hold of the code (since I believe affy only works on CEL files)? I have downloaded the affy R code from Terry Speed's website already, but I don't know where (if at all) the code to perform PCA is. Thank you everyone! Sincerely, Mugdha Wagle Hartwell Center for Bioinformatics and Biotechnology, St. Jude Children's Research Hospital, Memphis TN 38105 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html
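Point 5 of Adai's reply can be illustrated with base R alone, no affy needed: transpose the matrix so the chips become the observations, and the biplot then shows one vector per chip. The 99 x 63 matrix below is simulated as a stand-in for the poster's data:

```r
set.seed(7)
# Stand-in for the 99-gene by 63-chip expression matrix
pcadata <- matrix(rnorm(99 * 63), nrow = 99, ncol = 63,
                  dimnames = list(paste("g", 1:99, sep = ""),
                                  paste("chip", 1:63, sep = "")))

# Transpose so chips are rows (observations): the biplot then shows
# one point/vector per chip instead of per gene
pca <- prcomp(t(pcadata), scale. = TRUE)
biplot(pca, cex = c(0.75, 0.75))
```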
[R] vectorising ifelse()
Hi All, is there any chance of vectorising the two ifelse() statements in the following code: for(i in gp){ new[i,1] <- ifelse(srow[i] > 0, new[srow[i], zippo[i]], sample(1:100, 1, prob = Y1, replace = TRUE)) new[i,2] <- ifelse(drow[i] > 0, new[drow[i], zappo[i]], sample(1:100, 1, prob = Y1, replace = TRUE)) } where I am forced to check whether the values of drow and srow are > 0 for each line... In practical terms, I am attributing haplotypes to a pedigree, so I have to give the haplotypes to the parents before I give them to the offspring. The vectors *zippo* and *zappo* are the chances of getting one or the other hap from the sire and dam respectively. *gp* is the vector of non-ancestral animals. *new* is a two-column matrix where the haps are stored. Cheers, Federico -- Federico C. F. Calboli Department of Epidemiology and Public Health Imperial College, St Mary's Campus Norfolk Place, London W2 1PG Tel +44 (0)20 7594 1602 Fax (+44) 020 7594 3193 f.calboli [.a.t] imperial.ac.uk f.calboli [.a.t] gmail.com __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
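Within one generation the per-row check can be vectorised away with logical and matrix indexing; because offspring must be filled after their parents, this sketch only vectorises the loop body, and the helper name fill.haps is mine (the other object names are as in the post):

```r
# Sketch: vectorised haplotype assignment for the animals in gp,
# assuming the parents of those animals already carry haplotypes
fill.haps <- function(new, gp, srow, drow, zippo, zappo, Y1) {
  has.sire <- srow[gp] > 0
  has.dam  <- drow[gp] > 0

  # inherited haplotypes via (row, column) matrix indexing
  new[gp[has.sire], 1] <- new[cbind(srow[gp][has.sire], zippo[gp][has.sire])]
  new[gp[has.dam],  2] <- new[cbind(drow[gp][has.dam],  zappo[gp][has.dam])]

  # founders draw from the population haplotype frequencies Y1
  new[gp[!has.sire], 1] <- sample(1:100, sum(!has.sire), prob = Y1, replace = TRUE)
  new[gp[!has.dam],  2] <- sample(1:100, sum(!has.dam),  prob = Y1, replace = TRUE)
  new
}

# Tiny demo: rows 1-2 are parents, row 3 their offspring
new <- matrix(0L, 3, 2)
new[1, ] <- c(10, 20); new[2, ] <- c(30, 40)
srow <- c(0, 0, 1); drow <- c(0, 0, 2)
zippo <- c(1, 1, 2); zappo <- c(1, 1, 1)
Y1 <- rep(1/100, 100)
new <- fill.haps(new, gp = 3, srow, drow, zippo, zappo, Y1)
new[3, ]   # inherits new[1, 2] from the sire and new[2, 1] from the dam
```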
[R] Re Randomization test for interaction effect
Dear Pedro,

How to test for an interaction (or, even, how to pose the question of an interaction) in randomization-based inference is not at all obvious. In the permutation-test context, reliance has been placed on the exchangeability of (estimated) residuals under an additive, homoscedastic model. Where estimated, the residuals are not exactly exchangeable. A reference you might find useful is Pesarin, F. (2001) Multivariate Permutation Tests. Wiley: Chichester, UK. His method of synchronized permutations may be applied to test for interactions under some limited circumstances.

Pedro de Barros writes, in part:

Dear All, I am trying to build a randomization test for interaction. The problem is as follows: I have a set of stations where the occurrence and biomass of each species being investigated was recorded. [snip] I would really appreciate any pointer to a solution of this problem. I believe it is not complicated (and probably quite obvious) but the solution keeps out of reach, even though I have been searching for over a week. Thanks, Pedro

** Cliff Lunneborg, Professor Emeritus, Statistics & Psychology, University of Washington, Seattle [EMAIL PROTECTED]

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
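One common recipe along the lines Cliff mentions (permuting the estimated residuals of the additive model, in the spirit of ter Braak; this is my illustration, not Pesarin's synchronized-permutations method) can be sketched with simulated two-factor data:

```r
set.seed(7)
A <- gl(2, 20)                  # factor with 2 levels, 40 obs
B <- gl(2, 10, length = 40)
y <- rnorm(40) + 0.5 * (A == "2") * (B == "2")   # built-in interaction

# Observed F statistic for the interaction term
obs.F <- anova(lm(y ~ A * B))["A:B", "F value"]

# Residuals from the additive model are treated as approximately
# exchangeable under H0: no interaction (only approximately, since
# they are estimated).
fit0 <- lm(y ~ A + B)
perm.F <- replicate(999, {
  y.star <- fitted(fit0) + sample(resid(fit0))
  anova(lm(y.star ~ A * B))["A:B", "F value"]
})
p.value <- mean(c(obs.F, perm.F) >= obs.F)
```

The p-value is the proportion of permuted F statistics (including the observed one) at least as large as the observed F.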
[R] help: pls package
Hello, I have a data set with 15 variables (the first one is the response) and 1200 observations. Now I use the pls package to do the plsr with cross-validation as below:

  trainSet <- as.data.frame(scale(trainSet, center = TRUE, scale = TRUE))
  trainSet.plsr <- mvr(formula, ncomp = 14, data = trainSet,
                       method = "kernelpls", CV = TRUE, validation = "LOO",
                       model = TRUE, x = TRUE, y = TRUE)

After that I wish to obtain the value of se, the estimated standard errors of the cross-validation estimates, which is mentioned in the documentation for MSEP but not implemented yet, so I wrote the program below to calculate it myself. The results seem wrong, and I wonder which step is at fault:

  y <- trainSet.plsr$y
  p <- as.data.frame(trainSet.plsr$validation$pred)
  msep_element <- matrix(NA, nrow = nrow(p), ncol = length(p))
  i <- 1
  while (i <= length(p)) {
    msep_element[, i] <- (p[[i]] - y)^2
    i <- i + 1
  }
  msep <- colMeans(msep_element)
  msep_sd <- apply(msep_element, 2, sd)

Then I compare msep with trainSet.plsr$validation$MSEP, and they are the same, but the values of msep_sd seem much larger than I expected. Is msep_sd the same as se? If not, how do I calculate the se of the cross-validation?

Thank you, Shengzhe

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
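A plausible explanation for the surprise (my reading, not from the pls documentation): msep_sd above is the standard deviation of the per-observation squared errors, whereas the standard error of the MSEP estimate, being a mean over n observations, would divide that by sqrt(n). A self-contained sketch, with simulated squared prediction errors standing in for (pred - y)^2:

```r
set.seed(1)
n     <- 1200   # observations, as in the post
ncomp <- 14     # numbers of components, as in the post
sq.err <- matrix(rchisq(n * ncomp, df = 1),  # stand-in for (pred - y)^2
                 nrow = n, ncol = ncomp)

msep    <- colMeans(sq.err)                  # one MSEP per ncomp
msep.sd <- apply(sq.err, 2, sd)              # what the poster computed
msep.se <- msep.sd / sqrt(n)                 # standard error of each MSEP
```

With n = 1200, msep.se is about 35 times smaller than msep.sd, consistent with msep_sd looking "much larger than expected".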
Re: [R] :)
hello

- Original Message -
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, July 21, 2005 3:04 PM
Subject: :)

why don't you call me??

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] find confounder in covariates
Hi, I was wondering if there is a way, or a function in R, to find confounders. For instance:

  a  <- sample(1:3, size = 10, replace = TRUE)
  X1 <- factor(c('A', 'B', 'C')[a])
  X2 <- factor(c('Aa', 'Bb', 'Cc')[a])
  Xmat <- data.frame(X1, X2, rnorm(10), rnorm(10))
  dimnames(Xmat)[[2]] <- c('z1', 'z2', 'z3', 'y')

Now, z2 is just an alias of z1. There can also be collinearity, where one variable is a linear combination of the others. If you run lm on it:

  f <- lm(y ~ ., data = Xmat)
  summary(f)

  Call:
  lm(formula = y ~ ., data = Xmat)

  Residuals:
      Min      1Q  Median      3Q     Max
  -1.2853 -0.3708 -0.1224  0.4617  1.2821

  Coefficients: (2 not defined because of singularities)
              Estimate Std. Error t value Pr(>|t|)
  (Intercept)  0.82141    0.44583   1.842   0.1150
  z1B         -1.34167    0.65176  -2.059   0.0852 .
  z1C          0.80891    1.07639   0.751   0.4808
  z2Bb              NA         NA      NA       NA
  z2Cc              NA         NA      NA       NA
  z3           0.04231    0.23397   0.181   0.8625
  ---
  Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

  Residual standard error: 0.971 on 6 degrees of freedom
  Multiple R-Squared: 0.5086, Adjusted R-squared: 0.2629
  F-statistic: 2.07 on 3 and 6 DF, p-value: 0.2057

In this case, I can look at the data and figure out which variable is confounded with which. But if we have many categorical covariates (not necessarily with the same number of levels), it is almost impossible to check by hand. Any help would be greatly appreciated. Thanks, Young.

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
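One base-R way to automate this (not mentioned in the post, but a standard tool) is alias(), which reports which coefficients are linear combinations of others in a fitted lm:

```r
set.seed(1)
a  <- sample(1:3, size = 10, replace = TRUE)
X1 <- factor(c("A", "B", "C")[a])
X2 <- factor(c("Aa", "Bb", "Cc")[a])   # deliberately an alias of X1
Xmat <- data.frame(z1 = X1, z2 = X2, z3 = rnorm(10), y = rnorm(10))

f  <- lm(y ~ ., data = Xmat)
al <- alias(f)$Complete   # one row per aliased coefficient
rownames(al)              # here: the z2 dummies, aliased with z1
```

The row names of the Complete component identify the aliased coefficients, and each row expresses that coefficient in terms of the non-aliased ones, which scales to many categorical covariates without inspecting the data by eye.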
Re: [R] A Question About Inverse Gamma
On Thu, 21 Jul 2005, Sundar Dorai-Raj wrote:

[EMAIL PROTECTED] wrote:

Hi R users, I am having a little problem finding the solution to this problem in R:
1. I need to generate a normal distribution of sample size 30, mean = 50, sd = 5.
2. From the statistics obtained in step 1, I need to generate the Inverse Gamma distribution.
Your views and help will be appreciated.

I found rinvgamma in the MCMCpack package. Perhaps that's what you need.

I think there is a problem with rinvgamma:

  rinvgamma
  function (n, shape, scale = 1)
  {
      return(1/rgamma(n, shape, scale))
  }
  <environment: namespace:MCMCpack>

I know it is not necessarily authoritative, but look at Wikipedia: http://en.wikipedia.org/wiki/Inverse-gamma_distribution

It seems the one line of the function should be:

  return(1/rgamma(n, shape, 1/scale))

Or you could of course throw caution to the winds and write your own rinvgamma using rgamma :-)

David Scott

_
David Scott
Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland, NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email: [EMAIL PROTECTED]
Graduate Officer, Department of Statistics

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
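Following David's suggestion to write your own, here is a minimal sketch. It uses the convention that if X ~ Gamma(shape, rate = b), then 1/X ~ Inverse-Gamma(shape, scale = b); note that rgamma's third positional argument is the rate, so naming the argument explicitly avoids exactly the parameterization ambiguity discussed above:

```r
rinvgamma2 <- function(n, shape, scale = 1) {
  # If X ~ Gamma(shape, rate = scale), then 1/X ~ Inv-Gamma(shape, scale)
  1 / rgamma(n, shape = shape, rate = scale)
}

set.seed(123)
x <- rinvgamma2(1e5, shape = 3, scale = 2)
mean(x)   # theoretical mean is scale / (shape - 1) = 1, for shape > 1
```

Comparing the sample mean against the known theoretical mean is a quick sanity check on whichever parameterization one settles on.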