Re: [R] Integration + Normal Distribution + Directory Browsing Processing Questions

2007-01-22 Thread Matthias Kohl
Hi Nils,

I would say, pnorm is faster and has a higher precision.

Best,
Matthias

- original message 

Subject: Re: [R] Integration + Normal Distribution + Directory Browsing 
Processing Questions
Sent: Mon, 22 Jan 2007
From: Nils Hoeller[EMAIL PROTECTED]

 Thank you,
 
 both work fine. Why is pnorm to prefer?
 
 Nils
 
 
 Matthias Kohl schrieb:
  Hi,
 
  why don't you use pnorm?
  E.g.,
 
  pnorm(1, mean = 0.1, sd = 1.2) - pnorm(0, mean = 0.1, sd = 1.2)
 
  Matthias
 
  - original message 
 
  Subject: Re: [R] Integration + Normal Distribution + Directory Browsing
 Processing Questions
  Sent: Sun, 21 Jan 2007
  From: Dimitris Rizopoulos[EMAIL PROTECTED]
 

  you can use the `...' argument of integrate, e.g.,
 
  integrate(dnorm, 0, 1)
  integrate(dnorm, 0, 1, mean = 0.1)
  integrate(dnorm, 0, 1, mean = 0.1, sd = 1.2)
 
  look at ?integrate for more info.
 
  I hope it helps.
 
  Best,
  Dimitris
 
  
  Dimitris Rizopoulos
  Ph.D. Student
  Biostatistical Centre
  School of Public Health
  Catholic University of Leuven
 
  Address: Kapucijnenvoer 35, Leuven, Belgium
  Tel: +32/(0)16/336899
  Fax: +32/(0)16/337015
  Web: http://med.kuleuven.be/biostat/
http://www.student.kuleuven.be/~m0390867/dimitris.htm
 
 
  Quoting Nils Hoeller [EMAIL PROTECTED]:
 
  
  Hi everyone,
 
  I am new to R, but it's really great and helped me a lot!
 
  But now I have 2 questions. It would be great, if someone can help me:
 
  1. I want to integrate a normal distribution, given a median and sd.
  The integrate function works great BUT the first argument has to be a
  function
 
  so I do integrate(dnorm,0,1) and it works with standard m. and sd.
 
  But I have the m and sd given.
 
  So for fixed m and sd I work around with a new function mynorm
 
  mynorm - function(n) {
  ret - dnorm(n,0.6,0.15)
  ret
  }
 
  for example.
 
  BUT what can I do for dynamic m and sd?
  I want something like integrate(dnorm(,0.6,0.15),0,1), with the first
  dnorm parameter open for the
  integration but fixed m and sd.
 
  I hope you can help me.
 
  2. I am working with textfiles with rows of measure data.
  read.table(file) works fine.
 
  Now I want R to read.table all files within a given directory and
  process them one by the other.
 
  for(all files in dir xy) {
  x - read.table(nextfile)
  process x
  }
 
  Is that possible with R? I hope so. Can anyone give me a link to

  examples.
  
  Thanks for your help
 
  Nils
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide

  http://www.R-project.org/posting-guide.html
  
  and provide commented, minimal, self-contained, reproducible code.
 
 

 
  Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
  
 
  --- original message end 
 
 
  --
  Dr. rer. nat. Matthias Kohl
  [EMAIL PROTECTED]
  www.stamats.de 
 
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

--- original message end 


--
Dr. rer. nat. Matthias Kohl
[EMAIL PROTECTED]
www.stamats.de

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] command-line arguments in R

2007-01-22 Thread talepanda
see:
?commandArgs

or more detail for R startup mechanisms:
?Startup


On 1/22/07, Deepak Chandra [EMAIL PROTECTED] wrote:
 Hi All,

 A  simple and naive question from a newbie.  How can one access command-line
 arguments in R i.e. equivalent of argv in C?
 Have spent a lot of time on finding it.  Also, it will be useful if someone
 can point me out to a resource which I could have used to answer this
 question or similar ones.


 Thanks,
 Deepak

   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Integration + Normal Distribution + Directory Browsing Processing Questions

2007-01-22 Thread Dimitris Rizopoulos
the reason is that is more natural to use pnorm(), which also should a 
more efficient approximation of the Normal integral than intgrate(), 
you may even use

diff(pnorm(0:1, mean = 0.5, sd = 1.2))

however, the point I meant to make was to use the '...' argument that 
can found in many R functions.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: Nils Hoeller [EMAIL PROTECTED]
To: Matthias Kohl [EMAIL PROTECTED]
Cc: Dimitris Rizopoulos [EMAIL PROTECTED]; 
r-help@stat.math.ethz.ch
Sent: Monday, January 22, 2007 8:49 AM
Subject: Re: [R] Integration + Normal Distribution + Directory 
Browsing Processing Questions


 Thank you,

 both work fine. Why is pnorm to prefer?

 Nils


 Matthias Kohl schrieb:
 Hi,

 why don't you use pnorm?
 E.g.,

 pnorm(1, mean = 0.1, sd = 1.2) - pnorm(0, mean = 0.1, sd = 1.2)

 Matthias

 - original message 

 Subject: Re: [R] Integration + Normal Distribution + Directory 
 Browsing Processing Questions
 Sent: Sun, 21 Jan 2007
 From: Dimitris Rizopoulos[EMAIL PROTECTED]


 you can use the `...' argument of integrate, e.g.,

 integrate(dnorm, 0, 1)
 integrate(dnorm, 0, 1, mean = 0.1)
 integrate(dnorm, 0, 1, mean = 0.1, sd = 1.2)

 look at ?integrate for more info.

 I hope it helps.

 Best,
 Dimitris

 
 Dimitris Rizopoulos
 Ph.D. Student
 Biostatistical Centre
 School of Public Health
 Catholic University of Leuven

 Address: Kapucijnenvoer 35, Leuven, Belgium
 Tel: +32/(0)16/336899
 Fax: +32/(0)16/337015
 Web: http://med.kuleuven.be/biostat/
   http://www.student.kuleuven.be/~m0390867/dimitris.htm


 Quoting Nils Hoeller [EMAIL PROTECTED]:


 Hi everyone,

 I am new to R, but it's really great and helped me a lot!

 But now I have 2 questions. It would be great, if someone can 
 help me:

 1. I want to integrate a normal distribution, given a median and 
 sd.
 The integrate function works great BUT the first argument has to 
 be a
 function

 so I do integrate(dnorm,0,1) and it works with standard m. and 
 sd.

 But I have the m and sd given.

 So for fixed m and sd I work around with a new function mynorm

 mynorm - function(n) {
 ret - dnorm(n,0.6,0.15)
 ret
 }

 for example.

 BUT what can I do for dynamic m and sd?
 I want something like integrate(dnorm(,0.6,0.15),0,1), with the 
 first
 dnorm parameter open for the
 integration but fixed m and sd.

 I hope you can help me.

 2. I am working with textfiles with rows of measure data.
 read.table(file) works fine.

 Now I want R to read.table all files within a given directory and
 process them one by the other.

 for(all files in dir xy) {
 x - read.table(nextfile)
 process x
 }

 Is that possible with R? I hope so. Can anyone give me a link to

 examples.

 Thanks for your help

 Nils

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide

 http://www.R-project.org/posting-guide.html

 and provide commented, minimal, self-contained, reproducible 
 code.




 Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --- original message end 


 --
 Dr. rer. nat. Matthias Kohl
 [EMAIL PROTECTED]
 www.stamats.de



 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Combination diffrent variables ,how to calculates

2007-01-22 Thread anil kumar rohilla
  
Hi,
 List , i have 6 predictor variables and i want to make possible combinations 
of these 6 predictors ,all the data is in matrix form ,
if i am having 6 predictors than possible combination of sets are 64 2 power 6, 
or 63 ,whatever it may be i want to store the result in another variable to 
each combination and that i want to put in some model ,

i want to put every combination in some model ,please help me how to find 
suitable combinations of these variables 

i was not able to do this .
Any package is there in R which can help me in this regards
any help .
thanks in Advance


ANIL KUMAR( METEOROLOGIST)
LRF SECTION 
NATIONAL CLIMATE CENTER 
ADGM(RESEARCH)
INDIA METEOROLOGICAL DEPARTMENT
SHIVIJI NAGAR
PUNE-411005 INDIA
MOBILE +919422023277
[EMAIL PROTECTED]

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to disable existing menus in tcltk?

2007-01-22 Thread Jarno Tuimala
Hi!

I've constructed a small menu-driven interface to a couple of R functions 
using the possibilities offered by the tcltk package. When user runs some 
specific analyses, I would then like to disable some of the menus (or menu 
choises) that are not applicable after the performed analysis. I tried to 
modify the state of an existing menu, but it seems that neither 
tkconfigure nor tkentryconfigure contains the state as one of its options.

Here's a snip of the code. How could I disable, for example, the Edit 
data menu choise after already creating the menu (I want it to be active 
initially)?

gui-tktoplevel()
topMenu-tkmenu(gui)
tkconfigure(gui,menu=topMenu)
editMenu-tkmenu(topMenu, tearoff=FALSE)
tkadd(editMenu, command, label=Edit data, command=function() editData())
tkadd(editMenu, command, label=Preferences, command=function() editPref())
tkadd(topMenu, cascade, label=Edit, menu=editMenu)

Thanks,
Jarno Tuimala

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] aov y lme

2007-01-22 Thread Christoph Buser
Dear Tomas

You can produce the results in Montgomery Montgomery with
lme. Please remark that you should indicate the nesting with the
levels in your nested factor. Therefore I recreated your data,
but used 1,...,12 for the levels of batch instead of 1,...,4.

purity-c(1,-2,-2,1,-1,-3,0,4, 0,-4, 1,0, 1,0,-1,0,-2,4,0,3,
  -3,2,-2,2,2,-2,1,3,4,0,-1,2,0,2,2,1)
suppli-factor(c(rep(1,12),rep(2,12),rep(3,12)))
batch-factor(c(rep(1:4,3), rep(5:8,3), rep(9:12,3)))
material-data.frame(purity,suppli,batch)

As you remarked you can use aov

summary(material.aov-aov(purity~suppli+suppli:batch,data=material))
 Df Sum Sq Mean Sq F value  Pr(F)  
suppli2 15.056   7.528  2.8526 0.07736 .
suppli:batch  9 69.917   7.769  2.9439 0.01667 *
Residuals24 63.333   2.639  
---
Signif. codes:  0 $,1rx(B***$,1ry(B 0.001 $,1rx(B**$,1ry(B 0.01 
$,1rx(B*$,1ry(B 0.05 $,1rx(B.$,1ry(B 0.1 $,1rx(B $,1ry(B 1 

Remark that the test of suppli is not the same as in
Montgomery. Here it is wrongly tested against the Residuals and
you should perform the calculate the test with: 

(Fsuppi - summary(material.aov)[[1]][1,Mean Sq]/
  summary(material.aov)[[1]][2,Mean Sq])
pf(Fsuppi, df1 = 2, df2 = 9)

To use lme the correct level numbering is now important to
indicate the nesting. You should specify your random component
as 

random=~1|batch

If you use random=~1|suppli/batch instead, random components
for batch AND suppli are included in the model and you specify a 
model that incorporates suppli as random and fixed. Therefore
the correct syntax is

library(nlme)
material.lme-lme(purity~suppli,random=~1|batch,data=material)
## You obtain the F-test for suppli using anova
anova(material.lme)
summary(material.lme)
## Remark that in the summary output, the random effects are
## standard deviations and not variance components and you 
## should square them to compare them with Montgomery
## 1.307622^2 = 1.71 1.624466^2 = 2.64

## Or you can use 
VarCorr(material.lme)

I hope this helps you.

Best regards,

Christoph Buser

--

Credit and Surety PML study: visit our web page www.cs-pml.org

--
Christoph Buser [EMAIL PROTECTED]
Seminar fuer Statistik, LEO C13
ETH Zurich  8092 Zurich  SWITZERLAND
phone: x-41-44-632-4673 fax: 632-1228
http://stat.ethz.ch/~buser/
--

Tomas Goicoa writes:
  
  
  
  Dear R user,
  
  I am trying to reproduce the results in Montgomery D.C (2001, chap 13, 
  example 13-1).
  
  Briefly, there are three suppliers, four batches nested within suppliers 
  and three determinations of purity (response variable) on each batch. It is 
  a two stage nested design, where suppliers are fixed and batches are random.
  
  y_ijk=mu+tau_i+beta_j(nested in tau_i)+epsilon_ijk
  
  Here are the data,
  
  purity-c(1,-2,-2,1,
-1,-3, 0,4,
 0,-4, 1, 0,
 1,0,-1,0,
 -2,4,0,3,
 -3,2,-2,2,
 2,-2,1,3,
 4,0,-1,2,
 0,2,2,1)
  
  suppli-factor(c(rep(1,12),rep(2,12),rep(3,12)))
  batch-factor(rep(c(1,2,3,4),9))
  
  material-data.frame(purity,suppli,batch)
  
  If I use the function aov, I get
  
  material.aov-aov(purity~suppli+suppli:batch,data=material)
  summary(material.aov)
Df Sum Sq Mean Sq F value  Pr(F)
  suppli2 15.056   7.528  2.8526 0.07736 .
  suppli:batch  9 69.917   7.769  2.9439 0.01667 *
  Residuals24 63.333   2.639
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  and I can estimate the variance component for the batches as
  
  (7.769- 2.639)/3=1.71
  
  which is the way it is done in Montgomery, D.
  
  I want to use the function lme because I would like to make a diagnosis of 
  the model, and I think it is more appropriate.
  
  Looking at Pinheiro and Bates, I have tried the following,
  
  library(nlme)
  material.lme-lme(purity~suppli,random=~1|suppli/batch,data=material)
  VarCorr(material.lme)
  
   Variance StdDev
  suppli =pdLogChol(1)
  (Intercept) 1.563785 1.250514
  batch = pdLogChol(1)
  (Intercept) 1.709877 1.307622
  Residual2.638889 1.624466
  
  material.lme
  
  Linear mixed-effects model fit by REML
 Data: material
 Log-restricted-likelihood: -71.42198
 Fixed: purity ~ suppli
  (Intercept) suppli2 suppli3
-0.417   0.750   1.583
  
  Random effects:
Formula: ~1 | suppli
   (Intercept)
  StdDev:1.250514
  
Formula: ~1 | batch %in% suppli
   (Intercept) Residual
  StdDev:1.307622 1.624466
  
  Number of Observations: 36
  Number of Groups:
  suppli batch %in% suppli
   312
  
   From VarCorr I obtain the variance component 1.71, but I am not sure if 
  

Re: [R] Scatterplot help

2007-01-22 Thread David Barron
The following three lines will do what you want.  You will probably
want to change some of the default behaviour; just look at the
relevant help pages.

plot(x,y)
text(x,y,ID)
grid(2)

On 21/01/07, gnv shqp [EMAIL PROTECTED] wrote:
 Hi my friends,

 I'm trying to make a scatterplot like this.

 1) I have a 3-variable dataset.  They are ID, x, and y.

 2) x is for the X-axis, y for the Y-axis, and ID is used to label all
 the cases in the scatterplot.

 3) After creating the scatterplot, I need to add both a X-axis  reference
 line and a Y-axis reference line.  The X-axis reference line is a vertical
 line starting from the center of the X-axis.  The Y-axis reference line is a
 horizontal line starting from the center of the Y-axis.  In other words, by
 creating the two reference lines, the scatterplot is divided into 4
 quadrants.

 Please help me figure out how to do that in R.

 Many thanks in advance!

 Feng

 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to disable existing menus in tcltk?

2007-01-22 Thread talepanda
play below after your code and look at tk window:

tkentryconfigure(editMenu,0,state=disable)
tkentryconfigure(editMenu,0,state=active)

tkentryconfigure(topMenu,1,state=disable)
tkentryconfigure(topMenu,1,state=active)

HTH

On 1/22/07, Jarno Tuimala [EMAIL PROTECTED] wrote:
 Hi!

 I've constructed a small menu-driven interface to a couple of R functions
 using the possibilities offered by the tcltk package. When user runs some
 specific analyses, I would then like to disable some of the menus (or menu
 choises) that are not applicable after the performed analysis. I tried to
 modify the state of an existing menu, but it seems that neither
 tkconfigure nor tkentryconfigure contains the state as one of its options.

 Here's a snip of the code. How could I disable, for example, the Edit
 data menu choise after already creating the menu (I want it to be active
 initially)?

 gui-tktoplevel()
 topMenu-tkmenu(gui)
 tkconfigure(gui,menu=topMenu)
 editMenu-tkmenu(topMenu, tearoff=FALSE)
 tkadd(editMenu, command, label=Edit data, command=function() editData())
 tkadd(editMenu, command, label=Preferences, command=function()
 editPref())
 tkadd(topMenu, cascade, label=Edit, menu=editMenu)

 Thanks,
 Jarno Tuimala

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Unusual behaviour get.hist.quote

2007-01-22 Thread Jerry Pressnell
I have been running this script regularly for some time. This morning the
following error message appeared.

Any suggestions to correct this change would be appreciated.

 

EWL-get.hist.quote(EWL,start=(today - Sys.Date())-350,quote=Cl)

trying URL 'http://chart.yahoo.com/table.csv?s=EWLa=1bc 06d=0e!f
07g=dq=qy=0z=EWLx=.csv'

Content type 'text/csv' length unknown

opened URL

downloaded 11Kb

 

Error in if (dat[n] != start) cat(format(dat[n], time series starts
%Y-%m-%d\n)) : 

missing value where TRUE/FALSE needed

 

Regrds

 

Jerry Pressnell

 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ECB/Sidebar/R (Emacs) was: Re: kate editor for R

2007-01-22 Thread Ramon Diaz-Uriarte
On 1/22/07, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:

 On 22 January 2007 at 00:05, Ramon Diaz-Uriarte wrote:
 | On 1/20/07, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:
 |  Just confirms my suspicion that even after all these years, I barely
 |  scratched the surface of ess.  That '2+ years' old feature wouldn't 
 happen to
 |  be documented somewhere, would it?
 |
 | Dirk, I must be missing something. All I do is: M-x ecb-activate
 | Everything works. I do nothing special with ess. For that matter, I do
 | nothing special when editing LaTeX or Python, and ecb (et al) do work
 | as intended.

 I had looked at ECB for C++ programming. It simply hadn't occurred to me that
 it would plug into ESS.

I wasn't aware of it either until I attended Tony Rossini's tutorial
at useR! 2006.


 Score another one for Emacs as an operating system.

Oh yes, and almost coffee maker and pizza deliverer :-).


R.


 Dirk

 --
 Hell, there are no rules here - we're trying to accomplish something.
   -- Thomas A. Edison



-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unusual behaviour get.hist.quote

2007-01-22 Thread Barry Rowlingson
Jerry Pressnell wrote:
 I have been running this script regularly for some time. This morning the
 following error message appeared.

 EWL-get.hist.quote(EWL,start=(today - Sys.Date())-350,quote=Cl)

 Error in if (dat[n] != start) cat(format(dat[n], time series starts
 %Y-%m-%d\n)) : 
 
 missing value where TRUE/FALSE needed
 

Looks like this has happened before and spontaneously fixed itself then:

http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg77866.html

http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg77869.html

  Now working, must have been a Yahoo! issue.

Barry

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Combination of variables

2007-01-22 Thread anil kumar rohilla
  
Hi,
 List , i have 6 predictor variables and i want to make possible combinations 
of these 6 predictors ,all the data is in matrix form ,
if i am having 6 predictors than possible combination of sets are 64 2 power 6, 
or 63 ,whatever it may be i want to store the result in another variable to 
each combination and that i want to put in some model ,

i want to put every combination in some model ,please help me how to find 
suitable combinations of these variables 

i was not able to do this .
Any package is there in R which can help me in this regards
any help .
thanks in Advance


ANIL KUMAR( METEOROLOGIST)
LRF SECTION 
NATIONAL CLIMATE CENTER 
ADGM(RESEARCH)
INDIA METEOROLOGICAL DEPARTMENT
SHIVIJI NAGAR
PUNE-411005 INDIA
MOBILE +919422023277
[EMAIL PROTECTED]

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] D'Agostino test

2007-01-22 Thread Matthieu Mourroux
Hello,

I'd like to know if the D'Agostino test of normality is reliable,
because somme of our results are not really coherent.
This test seems to be very sensitive. Even compared to a normal
distribution generated by R, the results are not very clear.

thanks for any help
Matthieu.


This e-mail has been scanned for all viruses by Star. The\ s...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fwd: Re: aov y lme

2007-01-22 Thread Tomas Goicoa

Dear Prof. Ripley and Christoph,

thank you very much for your comments. You have helped me a lot.

Thanks,

Tomas Goicoa










Dear Prof. Ripley

Thank you for your email. Yes, this is of course the correct
syntax to save us the extra calculation. And I forgot the
lower.tail = FALSE for pf() in my example to obtain the
p-value.

Thank you for the corrections and

Best regards,

Christoph Buser


Prof Brian Ripley writes:
   On Mon, 22 Jan 2007, Christoph Buser wrote:
  
Dear Tomas
   
You can produce the results in Montgomery Montgomery with
lme. Please remark that you should indicate the nesting with the
levels in your nested factor. Therefore I recreated your data,
but used 1,...,12 for the levels of batch instead of 1,...,4.
   
purity-c(1,-2,-2,1,-1,-3,0,4, 0,-4, 1,0, 1,0,-1,0,-2,4,0,3,
 -3,2,-2,2,2,-2,1,3,4,0,-1,2,0,2,2,1)
suppli-factor(c(rep(1,12),rep(2,12),rep(3,12)))
batch-factor(c(rep(1:4,3), rep(5:8,3), rep(9:12,3)))
material-data.frame(purity,suppli,batch)
   
As you remarked you can use aov
   
summary(material.aov-aov(purity~suppli+suppli:batch,data=material))
Df Sum Sq Mean Sq F value  Pr(F)
suppli2 15.056   7.528  2.8526 0.07736 .
suppli:batch  9 69.917   7.769  2.9439 0.01667 *
Residuals24 63.333   2.639
---
Signif. codes:  0 $,1rx(B***$,1ry(B 0.001 $,1rx(B**$,1ry(B 0.01 
 $,1rx(B*$,1ry(B 0.05 $,1rx(B.$,1ry(B 0.1 $,1rx(B $,1ry(B 1
   
Remark that the test of suppli is not the same as in
Montgomery. Here it is wrongly tested against the Residuals and
  
   I don't think so: aov() is doing the correct thing for the model
   specified. The aov() model you want is probably
  
   aov(purity ~ suppli + Error(suppli:batch), data=material)
  
   and this gives
  
summary(.Last.value)
  
   Error: suppli:batch
  Df Sum Sq Mean Sq F value Pr(F)
   suppli 2 15.056   7.528   0.969 0.4158
   Residuals  9 69.917   7.769
  
   Error: Within
  Df Sum Sq Mean Sq F value Pr(F)
   Residuals 24 63.333   2.639
  
  
you should perform the calculate the test with:
  
(Fsuppi - summary(material.aov)[[1]][1,Mean Sq]/
 summary(material.aov)[[1]][2,Mean Sq])
pf(Fsuppi, df1 = 2, df2 = 9)
  
   You want the other tail.
  
   --
   Brian D. Ripley,  [EMAIL PROTECTED]
   Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
   University of Oxford, Tel:  +44 1865 272861 (self)
   1 South Parks Road, +44 1865 272866 (PA)
   Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combination of variables

2007-01-22 Thread David Barron
You don't say what model you want to do.  It isn't necessary to store
each combination of predictors in a separate matrix unless you really
need to do this for some other reason, in which case I imagine you
could adopt this idea. I dare say there are better ways, but this
should work (assuming you want a linear model):

x-matrix(runif(30*6),ncol=6)
y - rnorm(30)
cmbs - list()
for (i in 1:6) cmbs - c(cmbs,combn(6,i,simplify=FALSE))
for (i in 1:length(cmbs))  print(summary(lm(y ~ x[,cmbs[[i]]])))


On 22 Jan 2007 08:51:22 -, anil kumar rohilla
[EMAIL PROTECTED] wrote:

 Hi,
  List , i have 6 predictor variables and i want to make possible combinations 
 of these 6 predictors ,all the data is in matrix form ,
 if i am having 6 predictors than possible combination of sets are 64 2 power 
 6, or 63 ,whatever it may be i want to store the result in another variable 
 to each combination and that i want to put in some model ,

 i want to put every combination in some model ,please help me how to find 
 suitable combinations of these variables 

 i was not able to do this .
 Any package is there in R which can help me in this regards
 any help .
 thanks in Advance


 ANIL KUMAR( METEOROLOGIST)
 LRF SECTION
 NATIONAL CLIMATE CENTER
 ADGM(RESEARCH)
 INDIA METEOROLOGICAL DEPARTMENT
 SHIVIJI NAGAR
 PUNE-411005 INDIA
 MOBILE +919422023277
 [EMAIL PROTECTED]

 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combination of variables

2007-01-22 Thread hadley wickham
You might find the following code useful.  It's part of a package I'm
developing for interactive model exploration.

Hadley

# Generate all models
# Fit all combinations of x variables ($2^p$)
#
# This technique generalises \code{\link{fitbest}}.  While it is much
# slower it will work for any type of model.
#
# @arguments vector y values
# @arguments matrix of x values
# @arguments method used to fit the model, eg
\code{\link{lm}},\code{\link[MASS]{rlm}}
# @keyword regression
fitall - function(y, x, method=lm, ...) {
data - cbind(y=y, x)

combs - do.call(expand.grid, rep(list(c(FALSE, TRUE)), ncol(x)))[-1, ]

vars - apply(combs, 1, function(i) names(x)[i])
form - paste(y ~ , lapply(vars, paste, collapse= + ), sep = )
form - lapply(form, as.formula)

models - lapply(form, function(f) eval(substitute(method(f,
data=data, ...), list(f=f, data=data, method=method
names(models) - 1:length(models)
class(models) - c(ensemble, class(models))
models
}

# Generate best linear models
# Use the leaps package to generate the best subsets.
#
# @arguments model formula
# @arguments data frame
# @arguments number of subsets of each size to record
# @arguments other arguments passed to \code{\link[leaps]{regsubsets}}
# @keyword regression
fitbest - function(formula, data, nbest=10, ...) {
b - regsubsets(formula, data=data, nbest=nbest, ...)
mat - summary(b, matrix.logical = TRUE)$which

intercept - c(, -1)[as.numeric(mat[,1])]
vars - apply(mat[,-1], 1, function(x) colnames(mat[, -1])[x])
form - paste(formula[[2]],  ~ , lapply(vars, paste, collapse= +
), sep = )
form - lapply(form, as.formula)

models - lapply(form, function(f) eval(substitute(lm(f, data=data),
list(f=f, data=data
names(models) - 1:length(models)
class(models) - c(ensemble, class(models))
models
}

On 22 Jan 2007 08:51:22 -, anil kumar rohilla
[EMAIL PROTECTED] wrote:

 Hi,
  List , i have 6 predictor variables and i want to make possible combinations 
 of these 6 predictors ,all the data is in matrix form ,
 if i am having 6 predictors than possible combination of sets are 64 2 power 
 6, or 63 ,whatever it may be i want to store the result in another variable 
 to each combination and that i want to put in some model ,

 i want to put every combination in some model ,please help me how to find 
 suitable combinations of these variables 

 i was not able to do this .
 Any package is there in R which can help me in this regards
 any help .
 thanks in Advance


 ANIL KUMAR( METEOROLOGIST)
 LRF SECTION
 NATIONAL CLIMATE CENTER
 ADGM(RESEARCH)
 INDIA METEOROLOGICAL DEPARTMENT
 SHIVIJI NAGAR
 PUNE-411005 INDIA
 MOBILE +919422023277
 [EMAIL PROTECTED]

 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Time-varying correlation calculation

2007-01-22 Thread Amir Safari
 
Dear R useres,
   
  I'm interested in getting a series of time-varying correlation, simply 
between two random variables.
   
  Could you please introduce a package to do this task?
   
  Thank you so much for any help.
  Amir 
   

 
-
Don't pick lemons.

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] efficient code. how to reduce running time?

2007-01-22 Thread Charilaos Skiadas
On Jan 21, 2007, at 8:11 PM, John Fox wrote:

 Dear Haris,

 Using lapply() et al. may produce cleaner code, but it won't  
 necessarily
 speed up a computation. For example:

 X - data.frame(matrix(rnorm(1000*1000), 1000, 1000))
 y - rnorm(1000)

 mods - as.list(1:1000)
 system.time(for (i in 1:1000) mods[[i]] - lm(y ~ X[,i]))
 [1] 40.53  0.05 40.61NANA

 system.time(mods - lapply(as.list(X), function(x) lm(y ~ x)))
 [1] 53.29  0.37 53.94NANA

Interesting, in my system the results are quite different:

  system.time(for (i in 1:1000) mods[[i]] - lm(y ~ X[,i]))
[1] 192.035  12.601 797.094   0.000   0.000
  system.time(mods - lapply(as.list(X), function(x) lm(y ~ x)))
[1]  59.913   9.918 289.030   0.000   0.000

Regular MacOSX install with ~760MB memory.

 In cases such as this, I don't even find the code using *apply()  
 easier to
 read.

 Regards,
  John

Haris

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Compare effects between lm-models

2007-01-22 Thread Rense Nieuwenhuis
Dear helpeRs,

I'm estimating a series of linear models (using lm) in which in every  
new model variables are added. I want to test to what degree the new  
variables can explain the effects of the variables already present in  
the models. In order to do that, I simply observe wether these  
effects decrease in strength and / or lose their significance.

My question is: does any of you know a package / function in R that  
can test whether these changes in effects between models are  
significant? I figure these effects follow a T-distribution and I  
know the std. devs., so it must be easy to do manually. But I would  
like not to invent the wheel, when the function is already present.

Below is an example of what I mean. In model2, the variable z is  
added, which is hypothesized to partly explain the effect of x.  
Indeed, the effect of x decreases in model2, compared to model1. What  
I want to find out, is if this decrease is statistically significant.

Many thanks,

Rense


x - c(1,1,1,1,1,2,2,2,2,2,3,4,4,4,5)
z - c(2,2,2,2,2,2,2,2,3,3,3,3,4,4,5)
y - c(1,2,2,2,3,3,3,3,4,4,4,5,5,5,5)

model1 - lm(y~x)
model2 - lm(y~x+z)



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] naiveBayes question

2007-01-22 Thread David Meyer
Aimin:

The problem is that the columns you choose for training (only 4 
variables) do not match the ones used for prediction (all except y).

David



I try to use naiveBayes

   p.nb.90-naiveBayes(y~aa_three+bas+bcu+aa_ss,data=training)
   
pr.nb.90-table(predict(p.nb.90,training[,-13],type=class),training[,13])

bur I get this error
Error in object$tables[[v]] : subscript out of bounds
  
head is data set
   head(training)
  pr aa_three aa_one aa_ss aa_posaas bas   ams bmsacu
bcu omega   y index
1 1acx  ALA  A C  1 127.71   0 69.99   0
-0.2498560   0  79.91470 outward  TRUE
2 1acx  PRO  P C  2  68.55   0 55.44   0
-0.0949008   0  76.60380 outward  TRUE
3 1acx  ALA  A E  3  52.72   0 47.82   0
-0.0396550   0  52.19970 outward  TRUE
4 1acx  PHE  F E  4  22.62   0 31.21   0  0.1270330   0
169.52500  inward  TRUE
5 1acx  SER  S E  5  71.32   0 52.84   0
-0.1312380   0   7.47528 outward  TRUE
6 1acx  VAL  V E  6  12.92   0 22.40   0  0.1728390   0
149.09400  inward  TRUE

anyone know why?

Aimin

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Compare effects between lm-models

2007-01-22 Thread Kuhn, Max
You can use the anova function a la:

anova(model1, model2)
   Analysis of Variance Table
   
   Model 1: y ~ x
   Model 2: y ~ x + z
 Res.DfRSS Df Sum of Sq  F Pr(F)
   1 13 4.4947   
   2 12 4.4228  10.0720 0.1952 0.6665

I would suggest getting a copy of MASS and/or reading 

   http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf


Max

   

--
LEGAL NOTICE\ Unless expressly stated otherwise, this messag...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to get correct integration in C for step function?

2007-01-22 Thread Thomas Lumley
On Sun, 21 Jan 2007, Lynette wrote:

 Dear all,

 I am using Rdqags in C to realize the integration. It seems for the
 continous C function I can get correct results. However, for step functions,
 the results are not correct. For example, the following one, when integrated
 from 0 to 1 gives 1 instead of the correct 1.5


Using integrate() in R for an R-defined step function gives the right 
answer (eg on the example in ?ecdf).

This suggests a problem in your C code, since integrate() just calls 
dqags.

-thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] predict.survreg() with frailty term and newdata

2007-01-22 Thread Terry Therneau
  It can't be done with the current code.
  
  In a nutshell, you are trying to use a feature that I never got around to
coding.  It's been on my to do list, but may never make it to the top.

Terry

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding the effect of Box-Cox transformation using vis.boxcoxu

2007-01-22 Thread Michael Kubovy
?box.cox

?boxcox

On Jan 22, 2007, at 2:25 AM, Arun Kumar Saha wrote:

 I have a dataset 'data' and I want to see the effect of Box-Cox
 transformation on it Interactively for different lambda values. I  
 already
 got a look on function vis.boxcoxu in package TeachingDemos. But I
 didn't find any option to put user's own dataset. Can anyone tell  
 me how to
 put my own dataset here i.e. data?

_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O.Box 400400Charlottesville, VA 22904-4400
Parcels:Room 102Gilmer Hall
 McCormick RoadCharlottesville, VA 22903
Office:B011+1-434-982-4729
Lab:B019+1-434-982-4751
Fax:+1-434-982-4766
WWW:http://www.people.virginia.edu/~mk9y/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] efficient code. how to reduce running time?

2007-01-22 Thread Prof Brian Ripley
On Mon, 22 Jan 2007, Charilaos Skiadas wrote:

 On Jan 21, 2007, at 8:11 PM, John Fox wrote:

 Dear Haris,

 Using lapply() et al. may produce cleaner code, but it won't
 necessarily
 speed up a computation. For example:

 X - data.frame(matrix(rnorm(1000*1000), 1000, 1000))
 y - rnorm(1000)

 mods - as.list(1:1000)
 system.time(for (i in 1:1000) mods[[i]] - lm(y ~ X[,i]))
 [1] 40.53  0.05 40.61NANA

 system.time(mods - lapply(as.list(X), function(x) lm(y ~ x)))
 [1] 53.29  0.37 53.94NANA

 Interesting, in my system the results are quite different:

  system.time(for (i in 1:1000) mods[[i]] - lm(y ~ X[,i]))
 [1] 192.035  12.601 797.094   0.000   0.000
  system.time(mods - lapply(as.list(X), function(x) lm(y ~ x)))
 [1]  59.913   9.918 289.030   0.000   0.000

 Regular MacOSX install with ~760MB memory.

But MacOS X is infamous for having rather specific speed problems with its 
malloc, and so gives different timing results from all other platforms.
We are promised a solution in MacOS 10.5.

Both of your machines seem very slow compared to mine:

 system.time(for (i in 1:1000) mods[[i]] - lm(y ~ X[,i]))
user  system elapsed
  11.011   0.250  11.311
 system.time(mods - lapply(as.list(X), function(x) lm(y ~ x)))
user  system elapsed
  13.463   0.260  13.812

and that on a 64-bit platform (AMD64 Linux FC5).

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] efficient code. how to reduce running time?

2007-01-22 Thread John Fox
Dear Haris,

My timings were on a 3 GHz Pentium 4 system with 1 GB of memory running Win
XP SP2 and R 2.4.1.

I'm no expert on these matters, and I wouldn't have been surprised by
qualitatively different results on different systems, but this difference is
larger than I would have expected. One thing that seems particularly
striking in your results is the large difference between elapsed time and
user CPU time, making me wonder what else was going on when you ran these
examples.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: Charilaos Skiadas [mailto:[EMAIL PROTECTED] 
 Sent: Monday, January 22, 2007 10:00 AM
 To: John Fox
 Cc: r-help@stat.math.ethz.ch; 'miraceti'
 Subject: Re: [R] efficient code. how to reduce running time?
 
 On Jan 21, 2007, at 8:11 PM, John Fox wrote:
 
  Dear Haris,
 
  Using lapply() et al. may produce cleaner code, but it won't 
  necessarily speed up a computation. For example:
 
  X - data.frame(matrix(rnorm(1000*1000), 1000, 1000)) y - 
  rnorm(1000)
 
  mods - as.list(1:1000)
  system.time(for (i in 1:1000) mods[[i]] - lm(y ~ X[,i]))
  [1] 40.53  0.05 40.61NANA
 
  system.time(mods - lapply(as.list(X), function(x) lm(y ~ x)))
  [1] 53.29  0.37 53.94NANA
 
 Interesting, in my system the results are quite different:
 
   system.time(for (i in 1:1000) mods[[i]] - lm(y ~ X[,i]))
 [1] 192.035  12.601 797.094   0.000   0.000
   system.time(mods - lapply(as.list(X), function(x) lm(y ~ x)))
 [1]  59.913   9.918 289.030   0.000   0.000
 
 Regular MacOSX install with ~760MB memory.
 
  In cases such as this, I don't even find the code using *apply() 
  easier to read.
 
  Regards,
   John
 
 Haris
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Latin hyper cube sampling from expand.grid()

2007-01-22 Thread Prasanna
Dear R experts

I am looking for a package which gives me latin hyper cube samples
from the grid of values produced from the command expand.grid. Any
pointers to this issue might be very useful. Basically, I am doing the
following:

 a-(1:10)
 b-(20:30)
 dataGrid-expand.grid(a,b)

Now, is there a way to use this dataGrid in the package lhs to get
latin hyper cube samples.

Thanking you
Prasanna

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Compare effects between lm-models

2007-01-22 Thread Rense Nieuwenhuis
Thank you Alain and Max for your swift responses.

  It might be that I'm misunderstanding your responses, but aren't  
you testing if there is a difference between the two full models?
What I want to know, os whether the effect of a specific predictor  
(x) differs between model1 and model2. I'm not interested (presently)  
if the fit of model 2 is better than that of model 1 (for instance).

thanks again,

Rense




On Jan 22, 2007, at 16:26 , Kuhn, Max wrote:

 You can use the anova function a la:

 anova(model1, model2)
Analysis of Variance Table

Model 1: y ~ x
Model 2: y ~ x + z
  Res.DfRSS Df Sum of Sq  F Pr(F)
1 13 4.4947
2 12 4.4228  10.0720 0.1952 0.6665

 I would suggest getting a copy of MASS and/or reading

http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf


 Max



 --
 LEGAL NOTICE
 Unless expressly stated otherwise, this message is confidential and  
 may be privileged.  It is intended for the addressee(s) only.   
 Access to this E-mail by anyone else is unauthorized.  If you are  
 not an addressee, any disclosure or copying of the contents of this  
 E-mail or any action taken (or not taken) in reliance on it is  
 unauthorized and may be unlawful.  If you are not an addressee,  
 please inform the sender immediately.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] efficient code. how to reduce running time?

2007-01-22 Thread John Fox
Dear Brian,


 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Prof 
 Brian Ripley
 Sent: Monday, January 22, 2007 11:06 AM
 To: Charilaos Skiadas
 Cc: John Fox; r-help@stat.math.ethz.ch
 Subject: Re: [R] efficient code. how to reduce running time?
 
 On Mon, 22 Jan 2007, Charilaos Skiadas wrote:
 
  On Jan 21, 2007, at 8:11 PM, John Fox wrote:
 
  Dear Haris,
 
  Using lapply() et al. may produce cleaner code, but it won't 
  necessarily speed up a computation. For example:
 
  X - data.frame(matrix(rnorm(1000*1000), 1000, 1000)) y - 
  rnorm(1000)
 
  mods - as.list(1:1000)
  system.time(for (i in 1:1000) mods[[i]] - lm(y ~ X[,i]))
  [1] 40.53  0.05 40.61NANA
 
  system.time(mods - lapply(as.list(X), function(x) lm(y ~ x)))
  [1] 53.29  0.37 53.94NANA
 
  Interesting, in my system the results are quite different:
 
   system.time(for (i in 1:1000) mods[[i]] - lm(y ~ X[,i]))
  [1] 192.035  12.601 797.094   0.000   0.000
   system.time(mods - lapply(as.list(X), function(x) lm(y ~ x)))
  [1]  59.913   9.918 289.030   0.000   0.000
 
  Regular MacOSX install with ~760MB memory.
 
 But MacOS X is infamous for having rather specific speed 
 problems with its malloc, and so gives different timing 
 results from all other platforms.
 We are promised a solution in MacOS 10.5.
 

Thanks for the clarification.

 Both of your machines seem very slow compared to mine:
 
  system.time(for (i in 1:1000) mods[[i]] - lm(y ~ X[,i]))
 user  system elapsed
   11.011   0.250  11.311
  system.time(mods - lapply(as.list(X), function(x) lm(y ~ x)))
 user  system elapsed
   13.463   0.260  13.812
 
 and that on a 64-bit platform (AMD64 Linux FC5).
 

As you can see from the specs (in a previous message), my system is quite
old, which probably accounts for at least part of the difference. The ratios
of the user times for my and your system aren't too different though:

 53.29/40.53  # mine
[1] 1.314829

 13.463/11.011  # yours
[1] 1.222686

Regards,
 John

 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with C extension

2007-01-22 Thread Markus Schmidberger
Hello,
thanks for help and code.
We did a lot of work to speedup our function in R. We have a nested 
loop, vectorizing is the fastest way. But there we got a very big matrix 
and problems with memory. So we want to stay by loops an speedup with C.

My code is similar to this. (my_c is code from Brian D. Ripley)

SEXP test(SEXP a, SEXP b, SEXP in)
{
SEXP ans, new;
int n=INTEGER(in)[0],i,j;
PROTECT(ans = allocVector(REALSXP, 1));
REAL(ans)[0]=REAL(a)[0];
/*for(j = 0; i  m; j++)*/
for(i = 0; i  n; i++)
{
   /* b= ... ^i *j*/
PROTECT(new = allocVector(REALSXP, i+2));
new = my_c(ans,b);
PROTECT(ans = allocVector(REALSXP, i+2));
ans = new;
UNPROTECT(2);
}
UNPROTECT(1);
return ans;
}

We get an error by in=1300

  .Call(test,1,3,as.integer(1300));
Fehler: type mismatch
  .Call(test,1,3,as.integer(1300));
Speicherzugriffsfehler

Is there a possibility to free allocated memory? free(...) does not work.
Is there a possibility to reallocate a vector?

Thanks a lot
Markus

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to get correct integration in C for step function?

2007-01-22 Thread Lynette
Well,

I have no idea either. I can get correct answers for continous functions but 
incorrect for step functions.

Sign, I have been trying to realize the integration in C for long time.

Thank you for your answering.

Best,
Lynette

- Original Message - 
From: Thomas Lumley [EMAIL PROTECTED]
To: Lynette [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch; AJ Rossini [EMAIL PROTECTED]; 
[EMAIL PROTECTED]
Sent: Monday, January 22, 2007 10:48 AM
Subject: Re: [R] How to get correct integration in C for step function?


 On Sun, 21 Jan 2007, Lynette wrote:

 Dear all,

 I am using Rdqags in C to realize the integration. It seems for the
 continous C function I can get correct results. However, for step 
 functions,
 the results are not correct. For example, the following one, when 
 integrated
 from 0 to 1 gives 1 instead of the correct 1.5


 Using integrate() in R for an R-defined step function gives the right 
 answer (eg on the example in ?ecdf).

 This suggests a problem in your C code, since integrate() just calls 
 dqags.

  -thomas


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lnme convergence

2007-01-22 Thread Thomas BRUNEL
Dear R-user,

I am trying to use the R nlme function to fit a non linear mixed
effects model. The method has some problem to reach the convergence.

I am trying to understand causes of the problem by following step by 
step evolution of the iterative algorithm (verbose=TRUE command).

However, I don't understand the meaning of some information given while 
the algorithm is running, especialy the last part referring to convergence :

Convergence:
  fixed   reStruct
0.01168598 0.82634050

Futhermore, is the tolerance criteria (nlmeControl) related to any of 
the these two values?

Thank you for any help

Thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] efficient code. how to reduce running time?

2007-01-22 Thread Charilaos Skiadas
On Jan 22, 2007, at 10:39 AM, John Fox wrote:

  One thing that seems particularly
 striking in your results is the large difference between elapsed  
 time and
 user CPU time, making me wonder what else was going on when you ran  
 these
 examples.

Yes, indeed there were a lot of other things going on, this is the  
only machine I have and I use it continuously. I'll try to run  
another test tonight when the machine is not in use.
It did seem a very striking difference though.

But am I wrong in thinking that these measurements should be  
independent of what other applications are running at the same time,  
and should measure exactly the time in terms of CPU cycles needed to  
finish this task, regardless of how often the process got to use the  
CPU? I guess I was working under that assumption, which indeed makes  
the above comparison a very unfair one, because there was a lot more  
going on during the first system.time call.

Still, the difference is quite large, which of course could simply  
have to do with the internals of the two commands, coupled with Prof.  
Ripley's comments about malloc in Mac OS X.

 Regards,
  John

Haris

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with C extension

2007-01-22 Thread Brian Ripley
On Mon, 22 Jan 2007, Markus Schmidberger wrote:

 Hello,
 thanks for help and code.
 We did a lot of work to speedup our function in R. We have a nested loop, 
 vectorizing is the fastest way. But there we got a very big matrix and 
 problems with memory. So we want to stay by loops an speedup with C.

 My code is similar to this. (my_c is code from Brian D. Ripley)

And not in many ways similar to mine, hence your error message.
You really do have to handle all the types, as I did most.


 SEXP test(SEXP a, SEXP b, SEXP in)
 {
   SEXP ans, new;
   int n=INTEGER(in)[0],i,j;
   PROTECT(ans = allocVector(REALSXP, 1));
   REAL(ans)[0]=REAL(a)[0];
 /*for(j = 0; i  m; j++)*/
   for(i = 0; i  n; i++)
   {
  /* b= ... ^i *j*/
   PROTECT(new = allocVector(REALSXP, i+2));
   new = my_c(ans,b);
   PROTECT(ans = allocVector(REALSXP, i+2));
   ans = new;
   UNPROTECT(2);
   }
   UNPROTECT(1);
   return ans;
 }

 We get an error by in=1300

 .Call(test,1,3,as.integer(1300));
 Fehler: type mismatch
 .Call(test,1,3,as.integer(1300));
 Speicherzugriffsfehler

 Is there a possibility to free allocated memory? free(...) does not work.

No, but garbage collection does.

 Is there a possibility to reallocate a vector?

Yes, sort of.  See lengthgets().

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [ESS] How to get correct integration in C for step function?

2007-01-22 Thread Thomas Lumley
On Mon, 22 Jan 2007, Lynette wrote:

 Well,

 I have no idea either. I can get correct answers for continous functions but
 incorrect for step functions.


I have just tried using Rdqags from C for the function x0 and it worked 
fine (once I had declared all the arguments correctly). The code is below.

-thomas





#include Rinternals.h
#include R_ext/Applic.h

double stepfn(double x){
return (x0);
}


SEXP call_stepfn(SEXP x){
SEXP answer;
int i,n;

n=length(x);
PROTECT(answer=allocVector(REALSXP,n));
for(i=0;in;i++){
REAL(answer)[i]=stepfn(REAL(x)[i]);
}
UNPROTECT(1);
return answer;
}


void vec_stepfn(double *x, int n, void *ex){
int i;
for (i=0;in;i++) x[i]=stepfn(x[i]);
return;
}

void Cvec_stepfn(double *x, double *n){

  vec_stepfn(x, *n, (void *) NULL);
  return;
}

SEXP int_stepfn(SEXP lower, SEXP upper){

SEXP answer; 
double result, abserr;
int last, neval, ier;
int lenw;
int *iwork;
double *work;
int limit=100;
double reltol=0.1;
double abstol=0.1;

  lenw = 4 * limit;
  iwork =   (int *) R_alloc(limit, sizeof(int));
  work = (double *) R_alloc(lenw,  sizeof(double));

  Rdqags(vec_stepfn, (void *) NULL, REAL(lower), REAL(upper),
abstol,  reltol,
   result,  abserr,  neval,  ier,
limit,  lenw, last,
iwork, work);

printf(%f %f %d\n, result,abserr, ier);

PROTECT(answer=allocVector(REALSXP,1));
REAL(answer)[0]=result;
UNPROTECT(1);

return answer;
}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Example function for bigglm (biglm) data input from file

2007-01-22 Thread Yeh, Richard C
This is to submit a commented example function for use in the data
argument to the bigglm(biglm) function, when you want to read the data
from a file (instead of a URL), or rescale or modify the data before
fitting the model.  In the hope that this may be of help to someone out
there.


make.data - function (filename, chunksize, ...) {
  conn-NULL;
  function (reset=FALSE) { 
if (reset) {
  if (!is.null(conn)) {
close(conn);
  };
  # This is for a file.
  # For other methods, see: help(connections)
  # and replace the following definition of conn
  # (and possibly the read.table call).
  conn - file (description=filename, open=r);
} else {
  # It's best that the file you use has no header 
  # line, because when you use the connection to 
  # read each excerpt, any header won't get re-read.
  # If you choose to skip the first line, then the 
  # first line of each excerpt will be skipped.
  rval - read.table (conn, nrows=chunksize, 
skip=0, header=FALSE,...);
  if (nrow(rval)==0) {
# Then we have reached the end of the input.
# Clean up:
close(conn);
conn-NULL;
rval-NULL;
  } else {
# We did not reach the end of the input,
# so this function will return data.
# Here, you can define any derived fields
# or put instructions to rescale input data
# that you want done after the data are read
# but before they are used for fitting.
# For example:
rval$rescaled_column - rval$original_column / 100.0;
# If you don't want to do anything like this,
# then delete this else clause, and make
# the end of the function resemble the URL 
# example in bigglm.
  };
return(rval);
}
  }
};

a - make.data ( filename = myfile, chunksize = 100, 
  # In our definition of make.data, any remaining 
  # arguments get passed to the read.table function by 
  # the ... argument.
  # Define column types:
  colClasses = list (character, character, 
integer, numeric, numeric),
  # Define the column names in the call:
  # (recall that we cannot rely on the file header)
  col.names = c(fromState, toState,
first, original_column, second)
);

library(biglm);

bigglm (formula = toState ~ 1 + first + rescaled_column,
  data = a, family = binomial(link='logit'), 
  weights = ~second);

summary(.Last.value)



NOTICE TO RECIPIENTS: Any information contained in or attach...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to get correct integration in C for step function?

2007-01-22 Thread Lynette
Dear all, especially to Thomas,

I have figured out the problem. For the step function, something wrong with 
my C codes. I should use the expression ((x=0.25)(x=0.75)) ? 2:1 instead 
of ((x=1/4)(x=3/4)) ? 2:1 ). Have no idea why 0.25 makes difference from 
1/4 in C. But now I can go ahead with the correct integration in C. Thank 
you all. And hope this helps to others.

Best wishes,
Lynette

- Original Message - 
From: Lynette [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Cc: AJ Rossini [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Sunday, January 21, 2007 6:24 PM
Subject: [R] How to get correct integration in C for step function?


 Dear all,

 I am using Rdqags in C to realize the integration. It seems for the
 continous C function I can get correct results. However, for step 
 functions,
 the results are not correct. For example, the following one, when 
 integrated
 from 0 to 1 gives 1 instead of the correct 1.5

 void func( double *x, int n, void *ex )
 {
 int i;

 for(i=0;in;i++) { x[i]=( ((x=1/4)(x=3/4)) ? 2:1 ) ; }
return;
 }

 while the following one when integrated from 0 to 1 gives the correct
 0.7853983

 void func( double *x, int n, void *ex )
 {
 int i;

 for(i=0;in;i++) { x[i]= pow(1-x[i]*x[i],.5); }
return;
 }

 Please advise the problems. Thanks a lot.

 Best,
 Lynette

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to get correct integration in C for step function?

2007-01-22 Thread Thomas Lumley
On Mon, 22 Jan 2007, Lynette wrote:

 Dear all, especially to Thomas,

 I have figured out the problem. For the step function, something wrong with 
 my C codes. I should use the expression ((x=0.25)(x=0.75)) ? 2:1 instead 
 of ((x=1/4)(x=3/4)) ? 2:1 ). Have no idea why 0.25 makes difference from 
 1/4 in C. But now I can go ahead with the correct integration in C. Thank you 
 all. And hope this helps to others.


1 and 4 are ints in C, so 1/4 is 0 (integer division).

-thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Integration + Normal Distribution + Directory Browsing Processing Questions

2007-01-22 Thread Erich Neuwirth
Nils Hoeller wrote:
 
 for example.
 
 BUT what can I do for dynamic m and sd?
 I want something like integrate(dnorm(,0.6,0.15),0,1), with the first 
 dnorm parameter open for the
 integration but fixed m and sd.

integrate(function(x)dnorm(x,0.1,1.2), 0, 1)
is a way of fixing additional parameters.



-- 
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R interpretation

2007-01-22 Thread dan kumpik
Hi,

I am new to R (and not really a stats expert) and am having trouble 
interpreting its output. I am running a human learning experiment, with 
6 scores per subject in both the pretest and the posttest. I believe I 
have fitted the correct model for my data- a mixed-effects design, with 
subject as a random factor and session (pre vs post) nested within group 
(trained vs control).

I am confused about the output. The summary command gives me this table:


   D.lme- lme(score~GROUP/session, random=~1|subject, data=ILD4L )
   summary(D.lme)


Linear mixed-effects model fit by REML
   Data: ILD4L
Subset: EXP == F
  AIC   BIC   logLik
-63.69801 -45.09881 37.84900

Random effects:
   Formula: ~1 | subject
  (Intercept)  Residual
StdDev:   0.1032511 0.1727145

Fixed effects: score ~ GROUP/session
   Value  Std.Error  DF   t-value p-value
(Intercept) 0.10252778 0.05104328 152  2.008644  0.0463
GROUPT  0.09545347 0.06752391  12  1.413625  0.1829
GROUPC:sessionpost -0.00441389 0.04070919 152 -0.108425  0.9138
GROUPT:sessionpost -0.23586042 0.03525520 152 -6.690090  0.
   Correlation:
 (Intr) GROUPT GROUPC
GROUPT -0.756
GROUPC:sessionpost -0.399  0.301
GROUPT:sessionpost  0.000 -0.261  0.000

Standardized Within-Group Residuals:
  Min  Q1 Med  Q3 Max
-2.66977386 -0.52935645 -0.08616759  0.57215015  3.26532101

Number of Observations: 168
Number of Groups: 14


I believe the fixed-effects section of this output to be telling me that
my model intercept (which I assume to be the control group pretest?) is
significantly different from 0, and that GROUPT (i.e. the trained group)
does not differ significantly from the intercept- therefore no pretest
difference between groups?
The next line is, I believe showing that the GROUPC x sessionpost
interaction (i.e., control posttest scores?) is not significantly
different from the intercept (i.e. control pretest scores). Finally, I
am interpreting the final line as indicating that the GROUPT x
sessionpost interaction (ie, trained posttest scores?) is significantly
different from the trained pretest scores (GROUPT). A treatment contrast 
that I would like to apply would be for Control-post vs Trained-post, to 
see if the groups differ after training, but I'm not sure how to do 
this- and I feel I am probably overcomplicating the matter.

also,
I am confused about how to report this output in my publication. For 
instance, what should I be reporting for df? Those found on the output 
of the anova table?

Would it be possible to look through this for me and indicate how to
interpret the R output, and also how I should be reporting this? 
Apologies for asking such basic questions, but I would like to start 
using R more regularly and to make sure I am doing so correctly.

Many thanks,

Dan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Repeated measures

2007-01-22 Thread Richard Plant
In the two solutions for the repeated measures problem given in the
original reply below, the F and p values given by aov() with the error
strata defined by Error() are different from those given by lme().
However, when one does the problem by hand using the standard split
plot model, the results agree with those of nlme(). The difference
between the two aov() solutions is in the partitioning of sums of
squares. Is there a ready explanation for this discrepancy?

Thanks,
Richard Plant

 tolerance - tolerance -
+
read.table(http://www.ats.ucla.edu/stat/Splus/examples/alda/tolerance1.
txt,
+ sep=,, header=TRUE)
 tolerance.long - reshape(tolerance,
+   varying = list(c(tol11,tol12,tol13,
+tol14, tol15)),
+   v.names = c(tol), timevar = time,
+   times = 11:15, direction = long)
 tolerance.aov2 - aov(tol ~ factor(male) + factor(male):factor(id) +
factor(time) + factor(time):male, data = tolerance.long)
 tolerance.sum - summary(tolerance.aov2)
 tolerance.sum
Df Sum Sq Mean Sq F valuePr(F)
factor(male) 1 0.3599  0.3599  2.6077  0.111967
factor(time) 4 2.8326  0.7081  5.1309  0.001358 ** 
factor(male):factor(id) 14 8.2990  0.5928  4.2951 4.295e-05 ***
factor(time):male4 0.1869  0.0467  0.3386  0.850786
Residuals   56 7.7289  0.1380  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
 tolerance.list - tolerance.sum[[1]]
 tolerance.mat - as.matrix(tolerance.list[3])
 tolerance.F.male - tolerance.mat[1,1]/tolerance.mat[3,1]
 tolerance.F.male
[1] 0.607137
 tolerance.df - as.matrix(tolerance.list[1])
 tolerance.p.male - 1 -
pf(tolerance.F.male,tolerance.df[1,1],tolerance.df[3,1])
 tolerance.p.male
[1] 0.4488394
 
 Message: 68
 Date: Wed, 17 Jan 2007 05:45:01 -0500
 From: Chuck Cleland [EMAIL PROTECTED]
 Subject: Re: [R] Repeated measures
 To: Tom Backer Johnsen [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Message-ID: [EMAIL PROTECTED]
 Content-Type: text/plain; charset=ISO-8859-1
 
 Tom Backer Johnsen wrote:
  I am having a hard time understanding how to perform a repeated
  measures type of ANOVA with R.  When reading the document found
here:
 
  http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_repms.html
 
  I find that there is a reference to a function make.rm () that is
  supposed to rearrange a one row per person type of frame to a one
  row per observation type of frame.  But that function does not seem
  to be there.  Nor does the help.search suggest anything.  Is that
  function buried in some package?
 
   I'm not able to find that function.  Perhaps that document is out of
 date.
 
  Is there  some simple documentation that might be useful somewhere?
  Starting with a really simple problem (one group, two observations)?
 
   Here is an example showing the use of reshape() and analysis via
aov()
 and lme() in the nlme package.
 
 tolerance -

read.table(http://www.ats.ucla.edu/stat/Splus/examples/alda/tolerance1.
tx
 t,
 sep=,, header=TRUE)
 
 tolerance.long - reshape(tolerance,
   varying = list(c(tol11,tol12,tol13,
tol14, tol15)),
   v.names = c(tol), timevar = time,
   times = 11:15, direction = long)
 
 tolerance.aov - aov(tol ~ as.factor(time) * male + Error(id),
  data = tolerance.long)
 
 summary(tolerance.aov)
 
 Error: id
  Df   Sum Sq  Mean Sq
 male  1 0.085168 0.085168
 
 Error: Within
  Df  Sum Sq Mean Sq F value  Pr(F)
 as.factor(time)   4  2.8326  0.7081  3.0538 0.02236 *
 male  1  0.3024  0.3024  1.3039 0.25745
 as.factor(time):male  4  0.1869  0.0467  0.2015 0.93670
 Residuals69 16.0002  0.2319
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 
 library(nlme)
 
 tolerance.lme - lme(tol ~ as.factor(time) * male, random = ~ 1 | id,
  data = tolerance.long)
 
 anova(tolerance.lme)
  numDF denDF  F-value p-value
 (Intercept)  156 353.9049  .0001
 as.factor(time)  456   5.1309  0.0014
 male 114   0.6071  0.4488
 as.factor(time):male 456   0.3386  0.8508
 
   RSiteSearch(repeated measures) points to other examples,
functions,
 and documentation.
 
  Tom
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 --
 Chuck Cleland, Ph.D.
 NDRI, Inc.
 71 West 23rd Street, 8th floor
 New York, NY 10010
 tel: (212) 845-4495 (Tu, Th)
 tel: (732) 512-0171 (M, W, F)
 fax: (917) 438-0894
 
 
 
 

[R] Query about using optimizers in R without causing program to crash

2007-01-22 Thread lalitha viswanath
Hi
I am a newbie to R and am using  the lm function to
fit my data.
This optimization is to be performed for around 45000
files not all of which lend themselves to
optimization. Some of these will and do crash.
 
However, How do I ensure that the program simply goes
to the next file in line without exiting the code with
the error
Error in lm.fit(x, y, offset = offset, singular.ok =
singular.ok, ...) : 
NA/NaN/Inf in foreign function call (arg 4)
everytime it encounters troublesome data?

I would greatly appreciate your input as it would
avoid me having to manually type
for fileId in (c(4351:46000)) { ... }
for fileId in (c(5761:46000)) { ... }, etc...

Thanks
Lalitha


 

Now that's room service!  Choose from over 150,000 hotels

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Recursive-SVM (R-SVM)

2007-01-22 Thread Nair, Murlidharan T
I am trying to implement a simple r-svm example using the iris data (only two 
of the classes are taken and data is within the code). I am running into some 
errors. I am not an expert on svm's. If any one has used it, I would appreciate 
their help. I am appending the code below. 
Thanks../Murli
 
###
### R-code for R-SVM

### use leave-one-out / Nfold or bootstrape to permute data for external CV

### build SVM model and use mean-balanced weight to sort genes on training set

### and recursive elimination of least important genes

### author: Dr. Xin Lu, Research Scientist

### Biostatistics Department, Harvard School of Public Health

library(e1071)

## read in SVM formated data in filename

## the format following the defination of SVMTorch

## the first line contains 2 integer: nSample nFeature+1 

## followed by a matrix, each row for one sample, with the last column being 
+/1 1 for class label

ReadSVMdata - function(filename)

{

dd - read.table( filename, header=F, skip=1)

x - as.matrix( dd[, 1:(ncol(dd)-1)] ) 

y - factor( dd[, ncol(dd)] )

ret - list(x=x, y=y)

}

## create a decreasing ladder for recursive feature elimination

CreatLadder - function( Ntotal, pRatio=0.75, Nmin=5 )

{

x - vector()

x[1] - Ntotal

for( i in 1:100 )

{

pp - round(x[i] * pRatio)

if( pp == x[i] )

{

pp - pp-1

} 

if( pp = Nmin )

{

x[i+1] - pp

} else

{

break

}

}

x

}

## R-SVM core code

## input:

## x: row matrix of data

## y: class label: 1 / -1 for 2 classes

## CVtype: 

## integer: N fold CV

## LOO: leave-one-out CV

## bootstrape: bootstrape CV

## CVnum: number of CVs

## LOO: defined as sample size

## Nfold and bootstrape: user defined, default as sample size

## output: a named list

## Error: a vector of CV error on each level

## SelFreq: a matrix for the frequency of each gene being selected in each level

## with each column corresponds to a level of selection

## and each row for a gene

## The top important gene in each level are those high-freqent ones

RSVM - function(x, y, ladder, CVtype, CVnum=0 )

{

## check if y is binary response

Ytype - names(table(y))

if( length(Ytype) != 2) 

{

print(ERROR!! RSVM can only deal with 2-class problem)

return(0)

}

## class mean

m1 - apply(x[ which(y==Ytype[1]), ], 2, mean)

m2 - apply(x[ which(y==Ytype[2]), ], 2, mean)

md - m1-m2

yy - vector( length=length(y))

yy[which(y==Ytype[1])] - 1

yy[which(y==Ytype[2])] - -1 

y - yy

## check ladder

if( min(diff(ladder)) = 0 )

{

print(ERROR!! ladder must be monotonously decreasing)

return(0);

}

if( ladder[1] != ncol(x) )

{

ladder - c(ncol(x), ladder)

}

nSample - nrow(x)

nGene - ncol(x)

SampInd - seq(1, nSample)

if( CVtype == LOO )

{

CVnum - nSample

} else

{

if( CVnum == 0 )

{

CVnum - nSample

}

}

## vector for test error and number of tests

ErrVec - vector( length=length(ladder))

names(ErrVec) - paste(Lev_, ladder, sep=)

nTests - 0

SelFreq - matrix( 0, nrow=nGene, ncol=length(ladder))

colnames(SelFreq) - paste(Lev_, ladder, sep=)

## for each CV 

for( i in 1:CVnum )

{

## split data

if( CVtype == LOO )

{

TestInd - i

TrainInd - SampInd[ -TestInd]

} else

{

if( CVtype == bootstrape )

{

TrainInd - sample(SampInd, nSample, replace=T )

TestInd - SampInd[ which(!(SampInd %in% TrainInd ))]

} else

{

## Nfold

TrainInd - sample(SampInd, nSample*(CVtype-1)/CVtype )

TestInd - SampInd[ which(!(SampInd %in% TrainInd ))]

}

}

nTests - nTests + length(TestInd)

## in each level, train a SVM model and record test error

xTrain - x[TrainInd, ]

yTrain - y[TrainInd]

xTest - x[TestInd,]

yTest - y[TestInd]

## index of the genes used in the 

SelInd - seq(1, nGene)

for( gLevel in 1:length(ladder) )

{

## record the genes selected in this ladder

SelFreq[SelInd, gLevel] - SelFreq[SelInd, gLevel] +1

## train SVM model and test error

svmres - svm(xTrain[, SelInd], yTrain, scale=F, type=C-classification, 
kernel=linear )

if( CVtype == LOO )

{

svmpred - predict(svmres, matrix(xTest[SelInd], nrow=1) )

} else

{

svmpred - predict(svmres, xTest[, SelInd] )

}

ErrVec[gLevel] - ErrVec[gLevel] + sum(svmpred != yTest )

## weight vector

W - t(svmres$coefs*yTrain[svmres$index]) %*% svmres$SV * md[SelInd]

rkW - rank(W)

if( gLevel  length(ladder) )

{

SelInd - SelInd[which(rkW  (ladder[gLevel] - ladder[gLevel+1]))]

}

}

}

ret - list(ladder=ladder, Error=ErrVec/nTests, SelFreq=SelFreq)

}

SummaryRSVM - function( RSVMres )

{

ERInd - max( which(RSVMres$Error == min(RSVMres$Error)) )

MinLevel - RSVMres$ladder[ERInd]

FreqVec - RSVMres$SelFreq[, ERInd]

SelInd - which( rank(FreqVec) = (ladder[1]-MinLevel) )

# print(MinCV error of, min(RSVMres$Error), at, MinLevel, genes )

ret - list( MinER=min(RSVMres$Error), MinLevel=MinLevel, SelInd=SelInd)

}

###

#my code starts below

#data-ReadSVMdata(iris_r-svm.txt)

#The data read from the file is given below. 


Re: [R] Query about using try block

2007-01-22 Thread lalitha viswanath
Hi
Thanks for your response.
However I seem to be doing something wrong regarding
the try block resulting in yet another error described
below.

I have a function that takes in a file name and
does the fit for the data in that file.
Hence based on your input, I tried

try ( (fit = lm(y~x, data = data_fitting)), silent =
T);


I left the subsequent lines of my code unchanged.
coeffs = as.list(coef(fit);
lambda = exp(coeffs$x)

After the change using try, when I tried to resume
processing under R as follows
source(fitting.R)
for filename in list { process(filename); }
It says Cannot find object fit ...(in the line
trying to get the coefficients...)

Am I closing the try block in the wrong place?
This function does some post processing on the
coefficients returned by coef(fit), puts them in a
list and sends it to another function.
(i.e. around 6 lines of code after the call to fit).
Thanks
Lalitha
--- Andreas Hary [EMAIL PROTECTED] wrote:

 Look at ?try
 
 Your code will probably need to be something like
 the following:
 
 fit - list()
 for(fileId in 1:n){
try(fit[i] - lm(formula,data=???,...), silent=F)
#or silent=T if you would like to be made aware
 of problems
 }
 
 Best wishes,
 
 Andreas
 
 
 
 
 lalitha viswanath wrote:
  Hi
  I am a newbie to R and am using  the lm function
 to
  fit my data.
  This optimization is to be performed for around
 45000
  files not all of which lend themselves to
  optimization. Some of these will and do crash.
   
  However, How do I ensure that the program simply
 goes
  to the next file in line without exiting the code
 with
  the error
  Error in lm.fit(x, y, offset = offset,
 singular.ok =
  singular.ok, ...) : 
  NA/NaN/Inf in foreign function call (arg
 4)
  everytime it encounters troublesome data?
  
  I would greatly appreciate your input as it would
  avoid me having to manually type
  for fileId in (c(4351:46000)) { ... }
  for fileId in (c(5761:46000)) { ... }, etc...
  
  Thanks
  Lalitha
  
  
   
 


  Now that's room service!  Choose from over 150,000
 hotels
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
 reproducible code.
  
 
 -- 
 =
 Andreas Hary
 Flat 5, 70 Finsbury Park Road
 Lond, N4 2JX, UK
 
 Email:[EMAIL PROTECTED]
 Mobile: 07906 860 987
 



 

Want to start your own business?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query about using optimizers in R without causing program to crash

2007-01-22 Thread Andrew Robinson
Hi Lalitha,

Use 

try()

or 

tryCatch()

Cheers

Andrew

On Mon, Jan 22, 2007 at 12:43:28PM -0800, lalitha viswanath wrote:
 Hi
 I am a newbie to R and am using  the lm function to
 fit my data.
 This optimization is to be performed for around 45000
 files not all of which lend themselves to
 optimization. Some of these will and do crash.
  
 However, How do I ensure that the program simply goes
 to the next file in line without exiting the code with
 the error
 Error in lm.fit(x, y, offset = offset, singular.ok =
 singular.ok, ...) : 
 NA/NaN/Inf in foreign function call (arg 4)
 everytime it encounters troublesome data?
 
 I would greatly appreciate your input as it would
 avoid me having to manually type
 for fileId in (c(4351:46000)) { ... }
 for fileId in (c(5761:46000)) { ... }, etc...
 
 Thanks
 Lalitha
 
 
  
 
 Now that's room service!  Choose from over 150,000 hotels
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sequential processing

2007-01-22 Thread bogdan romocea
One option for processing very large files with R is split:
  ## split a large file into pieces
  #--parameters: the folder, file and number of parts
  FLD=/home/user/data
  F=very_large_file.dat
  parts=50
  #---split
  cd $FLD
  fn=`echo $F | awk -F\. '{print $1}'`  #file name without extension
  ln=`wc -l $F | awk '{print $1}'`  #number of lines in the file
  forsplit=`expr $ln / $parts + 1`  #number of lines in each part
  echo == $F will be split in $parts parts of $forsplit lines each.
  split -l $forsplit $F $fn
You could also load the entire file into a DBMS then pull parts of it
into R, or read specific lines through a pipe e.g.
readLines(pipe(sed, grep, python... command)).

Don't try to replicate the SAS processing into R. The exact
translations of the SAS DATA STEP usage of _N_, first., last., retain
etc into R would be: inefficient, ugly, retrogressive, wrong, rigid,
complicated, silly and so on. For a start, read up on indexing - this
seemingly simple and innocuous R feature is in fact far more powerful
than the entire DATA STEP with its whole bag of tricks. Then search
the list for similar questions, for example
http://thread.gmane.org/gmane.comp.lang.r.general/44332/focus=44343


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Gerard Smits
 Sent: Sunday, January 21, 2007 2:22 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] sequential processing

 Like many others, I am new to R but old to SAS.

 Am I correct in understanding that R processes a data frame in a
 sequential ly?  This would imply that large input files could be
 read, without the need to load the entire file into memory.
 Related to the manner of reading a frame, I have been looking for the
 equivalent of SAS _n_ (I realize that I can use a variant of which to
 identify an index value) as well as  useful SAS features such as
 first., last., retain, etc.  Any help with this conversion
 appreciated.

 Thanks,

 Gerard Smits

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lattice: put key where unused panel would have been

2007-01-22 Thread Benjamin Tyner
Hi,

Say I have

z-data.frame(y=runif(190),
   x=runif(190),
   f=gl(5,38),
   g=gl(19,10))

plot-xyplot(y~x|g,
 data=z,
 layout=c(5,4),
 groups=f,
 auto.key=TRUE)

How might one place the key in the empty space where the twentieth panel 
would have been?

Thanks,
Ben

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Question about rpart and regression trees

2007-01-22 Thread Paul Smith
Dear All

I would like to use rpart to obtain a regression tree for a dataset
like the following:

Y   X1  X2  X3  X4
5.500033B   A   3   2
0.35625148  D   B   6   5
0.8062546   E   C   4   3
5.100014C   A   3   2
5.7000422   A   A   3   2
0.76875436  C   A   6   5
1.0312537   D   A   4   1

Y is the objective variable. X1, X2, X3 and X4 can take, respectively,
the following values:

X1: A,B,C,D,E
X2: A,B,C,D,E
X3: 3,4,5,6
X4. 1,2,3,4,5

Should I convert X3 and X4 to factor before running rpart?

Thanks in advance,

Paul

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sequential processing

2007-01-22 Thread Gerard Smits
So, I take it, given that the use of a pipe is suggested for 
sequential reading, that the standard approach to processing a data 
frame is to load the entire file?  Please correct if wrong.

BTW, I am not interested in finding direct translations of SAS data 
step statements to R, but instead in finding an approach by which I 
can address the type of problems I consistent have to deal with 
(grouped processing with retention of baseline records, etc.).  I'll 
read more on the indexing as a means of dealing with relative position issues

Thanks,

Gerard



You could also load the entire file into a DBMS then pull parts of it
into R, or read specific lines through a pipe e.g.
readLines(pipe(sed, grep, python... command)).

Don't try to replicate the SAS processing into R. The exact
translations of the SAS DATA STEP usage of _N_, first., last., retain
etc into R would be: inefficient, ugly, retrogressive, wrong, rigid,
complicated, silly and so on. For a start, read up on indexing - this
seemingly simple and innocuous R feature is in fact far more powerful
than the entire DATA STEP with its whole bag of tricks. Then search
the list for similar questions, for example
http://thread.gmane.org/gmane.comp.lang.r.general/44332/focus=44343


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Gerard Smits
Sent: Sunday, January 21, 2007 2:22 PM
To: r-help@stat.math.ethz.ch
Subject: [R] sequential processing

Like many others, I am new to R but old to SAS.

Am I correct in understanding that R processes a data frame in a
sequential ly?  This would imply that large input files could be
read, without the need to load the entire file into memory.
Related to the manner of reading a frame, I have been looking for the
equivalent of SAS _n_ (I realize that I can use a variant of which to
identify an index value) as well as  useful SAS features such as
first., last., retain, etc.  Any help with this conversion
appreciated.

Thanks,

Gerard Smits

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lattice: put key where unused panel would have been

2007-01-22 Thread Sundar Dorai-Raj


Benjamin Tyner said the following on 1/22/2007 3:18 PM:
 Hi,
 
 Say I have
 
 z-data.frame(y=runif(190),
x=runif(190),
f=gl(5,38),
g=gl(19,10))
 
 plot-xyplot(y~x|g,
  data=z,
  layout=c(5,4),
  groups=f,
  auto.key=TRUE)
 
 How might one place the key in the empty space where the twentieth panel 
 would have been?
 
 Thanks,
 Ben
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

You can try supplying x and y coordinates to auto.key

xyplot(y ~ x | g,
data = z,
layout = c(5, 4),
groups = f,
auto.key = list(x = .85, y = .9))

HTH,

--sundar

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] logistic regression model + Cross-Validation

2007-01-22 Thread Weiwei Shi
why not use lda{MASS} and it has cv=T option; it does loo, though.
Or use randomForest.

if you have to use lrm, then the following code might help:

n.fold - 5 # 5-fold cv
n.sample - 50 # assumed 50 samples
s - sample(1:n.fold, size=n.sample, replace=T)
for (i in 1:n.fold){
  # create your training data and validation data for each fold
  trn - YOURWHOLEDATAFRAME[s!=i,]
  val - YOURWHOLEDATAFRAME[s==i,]
  # now do your own modeling using lrm
  # todo
}

HTH,

weiwei

On 1/21/07, nitin jindal [EMAIL PROTECTED] wrote:
 If validate.lrm does not has this option, do any other function has it.
 I will certainly look into your advice on cross validation. Thnx.

 nitin

 On 1/21/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
 
  nitin jindal wrote:
   Hi,
  
   I am trying to cross-validate a logistic regression model.
   I am using logistic regression model (lrm) of package Design.
  
   f - lrm( cy ~ x1 + x2, x=TRUE, y=TRUE)
   val - validate.lrm(f, method=cross, B=5)
 
  val - validate(f, ...)# .lrm not needed
 
  
   My class cy has values 0 and 1.
  
   val variable will give me indicators like slope and AUC. But, I also
  need
   the vector of predicted values of class variable cy for each record
  while
   cross-validation, so that I can manually look at the results. So, is
  there
   any way to get those probabilities assigned to each class.
  
   regards,
   Nitin
 
  No, validate.lrm does not have that option.  Manually looking at the
  results will not be easy when you do enough cross-validations.  A single
  5-fold cross-validation does not provide accurate estimates.  Either use
  the bootstrap or repeat k-fold cross-validation between 20 and 50 times.
k is often 10 but the optimum value may not be 10.  Code for averaging
  repeated cross-validations is in
  http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RmS/logistic.val.pdf
  along with simulations of bootstrap vs. a few cross-validation methods
  for binary logistic models.
 
  Frank
  --
  Frank E Harrell Jr   Professor and Chair   School of Medicine
Department of Biostatistics   Vanderbilt University
 

 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know?
No, I did not. But I believed...
---Matrix III

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Loess with more than 4 predictors / offsets

2007-01-22 Thread Louisell, Paul
Hello,

Does anyone know of an R version of loess that allows more than 4
predictors and/or allows the specification of offsets? For that matter,
does anyone know of _any_ version of loess that does either of the
things I mention?

Thanks,

Paul Louisell
650-833-6254
[EMAIL PROTECTED]
Research Associate (Statistician)
Modeling  Data Analytics
ARPC



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Questions about xtable and print.xtable

2007-01-22 Thread Charilaos Skiadas
I have been using the wonderful xtable package lately, in combination  
with Sweave, and I have a couple of general questions along with a  
more particular one.

I'll start with the particular question. I basically have a 1x3 array  
with column names but no row names. I want to create a latex table  
with column setting set to |rrr|. I want the column names to  
appear, but the row names not to appear. The code I am trying is this:

library(xtable)
x - matrix(c(1:3), c(1,3), dimnames=list(NULL,c(1:3)))
tab - xtable(x, align=||)
print.xtable(tab, include.rownames=FALSE)
print.xtable(tab)

The problem here is that the xtable call requires an align value that  
has one extra row setting, I suppose to account for a possible row  
name. However, the first print.xtable call seems to ignore the align  
argument set in the xtable call, when include.rownames is included.  
Any workarounds will be most welcomed.

More generally, I have the following questions:
1) Why are the include.rownames and include.colnames parameters not  
appearing in the xtable call, but only in the print.xtable call  
instead? Why do I need to specify n+1 arguments for things like align  
and digits, when I don't want the row names to be printed? In  
general, why are the align and digits calls not setable in  
print.xtable, but only in xtable?
2) I like to enclose my tabular environments in a center environment,  
instead of a table environment. Unless I've missed it, I don't see  
how I can do that from within the xtable package. Is this really not  
possible, and if so why not? The latex.environments setting seems to  
only be allowed when floating=TRUE, which is exactly what I want to  
avoid. Any particular reason it is not allowed when floating=FALSE as  
well?

That's it really, thanks in advance for any responses.

Haris

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query about using try block

2007-01-22 Thread talepanda
Usually (that is, not limited in R language), when error occurs in
try, stacks are rollbacked, so the variables defined in try no longer
exists after calling try.

One non-elegant solution is:

fit-NULL
try ( (fit = lm(y~x, data = data_fitting)), silent =T)
if(!is.null(fit)){
coeffs = as.list(coef(fit))
## other subsequent processes
}

On 1/23/07, lalitha viswanath [EMAIL PROTECTED] wrote:
 Hi
 Thanks for your response.
 However I seem to be doing something wrong regarding
 the try block resulting in yet another error described
 below.

 I have a function that takes in a file name and
 does the fit for the data in that file.
 Hence based on your input, I tried

 try ( (fit = lm(y~x, data = data_fitting)), silent =
 T);


 I left the subsequent lines of my code unchanged.
 coeffs = as.list(coef(fit);
 lambda = exp(coeffs$x)

 After the change using try, when I tried to resume
 processing under R as follows
 source(fitting.R)
 for filename in list { process(filename); }
 It says Cannot find object fit ...(in the line
 trying to get the coefficients...)

 Am I closing the try block in the wrong place?
 This function does some post processing on the
 coefficients returned by coef(fit), puts them in a
 list and sends it to another function.
 (i.e. around 6 lines of code after the call to fit).
 Thanks
 Lalitha
 --- Andreas Hary [EMAIL PROTECTED] wrote:

  Look at ?try
 
  Your code will probably need to be something like
  the following:
 
  fit - list()
  for(fileId in 1:n){
 try(fit[i] - lm(formula,data=???,...), silent=F)
 #or silent=T if you would like to be made aware
  of problems
  }
 
  Best wishes,
 
  Andreas
 
 
 
 
  lalitha viswanath wrote:
   Hi
   I am a newbie to R and am using  the lm function
  to
   fit my data.
   This optimization is to be performed for around
  45000
   files not all of which lend themselves to
   optimization. Some of these will and do crash.
  
   However, How do I ensure that the program simply
  goes
   to the next file in line without exiting the code
  with
   the error
   Error in lm.fit(x, y, offset = offset,
  singular.ok =
   singular.ok, ...) :
   NA/NaN/Inf in foreign function call (arg
  4)
   everytime it encounters troublesome data?
  
   I would greatly appreciate your input as it would
   avoid me having to manually type
   for fileId in (c(4351:46000)) { ... }
   for fileId in (c(5761:46000)) { ... }, etc...
  
   Thanks
   Lalitha
  
  
  
  
 
 
   Now that's room service!  Choose from over 150,000
  hotels
  
   __
   R-help@stat.math.ethz.ch mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained,
  reproducible code.
  
 
  --
  =
  Andreas Hary
  Flat 5, 70 Finsbury Park Road
  Lond, N4 2JX, UK
 
  Email:  [EMAIL PROTECTED]
  Mobile: 07906 860 987
 




 
 Want to start your own business?

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] factor()

2007-01-22 Thread talepanda
 TeamInfo
TEAMNAME LEVEL WORKTIME BONUS
1  batch   sunan B  135 9,818
2  batch  Chenqi E  121 6,050
3  batch jiangxu F   97 4,189
4 online  zhouxi F   63 2,720
5 online  chenhe H   36 1,064

## try:

 factor(TeamInfo$TEAM)
[1] batch  batch  batch  online online
Levels: batch online

## or

 attach(TeamInfo)
 factor(TEAM)
[1] batch  batch  batch  online online
Levels: batch online


On 1/23/07, qing [EMAIL PROTECTED] wrote:














 Dear All,

 I am running Windows XP, R 2.4.1, and doing an example about factor().

  read.csv(c:\\TeamInfo.csv)-TeamInfo;
  TeamInfo;
  TEAMNAME LEVEL WORKTIME BONUS
 1   batch   sunan B  135 9,818
 2   batch  Chenqi E  121 6,050
 3   batch jiangxu F   97 4,189
 4  online  zhouxi F   63 2,720
 5  online  chenhe H   36 1,064
 6  online NA
 7  online NA
 8  online NA
 9  online NA
 10 client NA
 11 client NA
 12 client NA
 13 client NA
 14 client NA
  factor(TEAM)-Teamactor;
 Error in typeof(x) : object TEAM not found

 Any help or suggestions that you can provide will be greatly appreciated.

 Qing

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question about rpart and regression trees

2007-01-22 Thread Prof Brian Ripley
On Mon, 22 Jan 2007, Paul Smith wrote:

 Dear All

 I would like to use rpart to obtain a regression tree for a dataset
 like the following:

 Y X1  X2  X3  X4
 5.500033  B   A   3   2
 0.35625148D   B   6   5
 0.8062546 E   C   4   3
 5.100014  C   A   3   2
 5.7000422 A   A   3   2
 0.76875436C   A   6   5
 1.0312537 D   A   4   1

 Y is the objective variable. X1, X2, X3 and X4 can take, respectively,
 the following values:

 X1: A,B,C,D,E
 X2: A,B,C,D,E
 X3: 3,4,5,6
 X4. 1,2,3,4,5

 Should I convert X3 and X4 to factor before running rpart?

If they really are factors, yes.
If they are ordered factors, no.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Loess with more than 4 predictors / offsets

2007-01-22 Thread Prof Brian Ripley
On Mon, 22 Jan 2007, Louisell, Paul wrote:

 Hello,

 Does anyone know of an R version of loess that allows more than 4
 predictors and/or allows the specification of offsets? For that matter,
 does anyone know of _any_ version of loess that does either of the
 things I mention?

Why would you want offsets in a regression?: just subtract them from the 
lhs.  (R's lm has gained offsets by analogy with glm, but the S original 
did not have them).  If you would be more comfortable working with them, 
it would be very easy to create a modified version that supports them.

Also, have you heard of the 'curse of dimensionality'?  Localization even 
to 4 dimensions is no longer really an appropriate term, and Euclidean 
distance will be the main determinant of 'local' and is quite arbitrary.


 Thanks,

 Paul Louisell
 650-833-6254
 [EMAIL PROTECTED]
 Research Associate (Statistician)
 Modeling  Data Analytics
 ARPC



   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] D'Agostino test

2007-01-22 Thread Ludwig Baringhaus
Am Montag, 22. Januar 2007 12:33 schrieb Matthieu Mourroux:
 Hello,

 I'd like to know if the D'Agostino test of normality is reliable,
The test is not consistent. The test statistic   
can be used for testing the hypothesis of uniformity.

See the paper  

Baringhaus, L.; Henze, N.
A test for uniformity with unknown limits based on D'Agostino's $D$.
Statist. Probab. Lett. 9 (1990), no. 4, 299--304.

for details.

L. Baringhaus
Leibniz Universitaet Hannover
Institut fuer Mathematische Stochastik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] error in arules package

2007-01-22 Thread Frédéric Chiroleu
Hi,

we noticed there was a error in the arules package.

After reading the source code, we saw that the Dice similarity index was 
miscalculated in dissimilarity function : an copy-paste from Jaccard 
Index was not corrected (2* a_b_c, ie 2*(a+b+c) in the code instead of 
2*a +b + c !!!).

After our mail to R-help (21/11/2006), we thought the authors could do 
something but I just try the function and the error is still there.

I hope the authors will read my mail.

Sincerely yours,

Fred.

-- 
Dr. Frédéric Chiroleu
Biométricien
UMR 53 PVBMT (Peuplements Végétaux et Bio-agresseurs en Milieu Tropical)
CIRAD-AMIS
Pôle de Protection des Plantes (3P)
Laboratoire d'Écologie Terrestre et de Lutte Intégrée (LETLI)
7, chemin de l'IRAT
Ligne Paradis
97410 Saint-Pierre
Île de la Réunion - France
Tél. : 02 62 49 92 30
Standard : 02 62 49 92 00
Fax  : 02 62 49 92 93
courriel : [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.