Re: [R] Matrix of dummy variables from a factor

2005-12-06 Thread John Fox
Dear Charles,

Try model.matrix(~f)[,-1].
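For example, a quick sketch with a made-up factor f (the data here are illustrative, not from the original question):

```r
# model.matrix(~ f) builds the full design matrix (intercept plus
# treatment-contrast dummies); dropping column 1 leaves the 0/1 dummies.
f <- factor(c("a", "b", "b", "c"))
D <- model.matrix(~ f)[, -1]
D
#   fb fc
# 1  0  0
# 2  1  0
# 3  1  0
# 4  0  1
```

Each remaining column indicates membership in one non-baseline level of f.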

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Charles H. Franklin
 Sent: Tuesday, December 06, 2005 3:03 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Matrix of dummy variables from a factor
 
 What is a simple way to convert a factor into a matrix of 
 dummy variables?
 
 fm <- lm(y ~ f)
 
 where f is a factor takes care of this in the estimation. I'd 
 like to save the result of expanding f into a matrix for later use.
 
 Thanks.
 
 Charles
 
 -- 
 
 
  Charles H. Franklin
  Professor, Political Science
  University of Wisconsin, Madison
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  608-263-2022 (voice)
  608-265-2663 (fax)
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html



Re: [R] is it possible to use R to edit an EXCEL spreadsheet so I can create a searchable EXCEL database of R packages?

2005-12-05 Thread John Fox
Dear Bob,

In my copy of the R FAQ (from the Rgui for Windows), there is no space
between entries, which, when copied to a text file, look like:

AMORE
A MORE flexible neural network package, providing the TAO robust neural
network algorithm.
AlgDesign
Algorithmic experimental designs. Calculates exact and approximate
theory experimental designs for D, A, and I criteria.

etc.

For this, the following works:

packages <- readLines("c:/temp/Packages.txt")
packages <- gsub("^ *", "", gsub(" *$", "", packages))  # get rid of leading and trailing blanks
packages <- matrix(packages, ncol=2, byrow=TRUE)
write.table(packages, "c:/temp/Packages2.txt", sep=",",
    row.names=FALSE, col.names=FALSE)

Producing a comma-separated text file that can be imported into Excel, and
that looks like:

AMORE,A MORE flexible neural network package, providing the TAO robust
neural network algorithm.
AlgDesign,Algorithmic experimental designs. Calculates exact and
approximate theory experimental designs for D, A, and I criteria.

etc.

If you started with blank lines between packages, then (making sure that
there is a blank line after the last package in the input file) you could
adapt this as:

packages <- readLines("c:/temp/Packages.txt")
packages <- gsub("^ *", "", gsub(" *$", "", packages))
packages <- matrix(packages, ncol=3, byrow=TRUE)
write.table(packages[,1:2], "c:/temp/Packages2.txt", sep=",",
    row.names=FALSE, col.names=FALSE)
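A self-contained illustration of the two-column reshape (a sketch using tempfile() and two made-up entries in place of the paths above):

```r
# Write a sample alternating name/description file, then fold it
# into a two-column matrix and export it as comma-separated text.
infile <- tempfile(fileext = ".txt")
writeLines(c("AMORE",
             "A MORE flexible neural network package.",
             "AlgDesign",
             "Algorithmic experimental designs."), infile)

packages <- readLines(infile)
packages <- gsub("^ *", "", gsub(" *$", "", packages))  # trim blanks
packages <- matrix(packages, ncol = 2, byrow = TRUE)

outfile <- tempfile(fileext = ".csv")
write.table(packages, outfile, sep = ",",
            row.names = FALSE, col.names = FALSE)
```

Because the lines alternate name/description, filling a two-column matrix by rows pairs each name with its description.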

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Bob Green
 Sent: Monday, December 05, 2005 6:58 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] is it possible to use R to edit an EXCEL 
 spreadsheet so I can create a searchable EXCEL database of R packages?
 
 I have copied the R FAQ pertaining to available packages into 
 an EXCEL file. They take the following form -
 
 Design
 Regression modeling, testing, estimation, validation, 
 graphics, prediction, and typesetting by storing enhanced 
 model design attributes in the fit. 
 Design is a etc.
 
 Devore5
 Data sets and sample analyses from Probability and 
 Statistics for Engineering and the Sciences (5th ed) by Jay 
 L. Devore, 2000, Duxbury.
 
 
 
 Presently the above format can be represented as:
 
 package name
 package description
 space
 
 I want to change the format so that file name and description are in 
 adjacent columns and the space is removed
 
 package namepackage description
 package namepackage description
 
 
 Is this possible in R, so I can create a searchable EXCEL
 database of R packages? I imagine this would be easily carried
 out using PERL, though I was interested to know R's capacity
 to work on textual data.
 
 
 Any assistance regarding this is appreciated,
 
 
 Bob Green




[R] FW: Error in structural equation model - The model has negative degrees of freedom

2005-12-04 Thread John Fox
Dear R-help list members,

I forgot to copy my reply to the r-help list. Here's most of it.

John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 

-Original Message-
From: John Fox [mailto:[EMAIL PROTECTED] 
Sent: Sunday, December 04, 2005 5:01 PM
To: 'Sunil W'
Subject: RE: [R] Error in structural equation model - The model
has negative degrees of freedom

Dear Sunil,

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Sunil W
 Sent: Sunday, December 04, 2005 11:03 AM
 To: R-help@stat.math.ethz.ch
 Subject: [R] Error in structural equation model - The model 
 has negative degrees of freedom
 
 Hi John
 Thanks a lot for the reply.
 
 Could you suggest how I can correct this problem? I tried using a 
 correlation matrix instead of raw moments, but still got the same 
 error. I also fixed parameters v1,v2,v3,a1 at 1; then it gave me the 
 error that the system is exactly singular.
 

The model is underidentified regardless of the input matrix. My point was
that if you're using raw moments (as opposed to covariances or correlations)
you should have a constant term in each equation.

 To answer the points that you raised:
 
 1. x1-x6 are not causes; they are just indicators. Does that change 
 my model?
 

Yes. The arrows should go the other way.

 2. Is the size of the dataset going to be a real limitation? 
 How can I take care of that?
 

. . .

 ---
 Dear Sunil,
 
 There are 7*8/2 = 28 raw moments among the 7 observed variables. Of 
 these,
 6*7/2 = 21 are used for the moments among the 6 fixed-exogenous 
 variables, leaving 28 - 21 = 7 df. Your model has 11 free parameters,
 so df for the model = 7 - 11 = -4.
 
 Some additional comments:
 
 If you're using raw moments, why isn't there a constant variable in 
 the model?
 
 Do you really intend x1 -- x6 to be causes, rather than indicators, of 
 m1 and m2?
 
 Why are there no normalizing constraints on the latent variables?
 
 Do you really want to fit a model like this to so small a data set?
 
 I hope this helps,
 John
 
 John Fox
 Department of Sociology
 McMaster University
 Hamilton, Ontario
 Canada L8S 4M4
 905-525-9140x23604
 http://socserv.mcmaster.ca/jfox
 
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Sunil W
 Sent: Wednesday, November 30, 2005 10:08 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Error in structural equation model - The model has 
 negative degrees of freedom
 
 Hi
 
 I am running a structural equation model with R using the
 sem command; I am getting the following error:
 
 Error in sem.default : The model has negative degrees of
 freedom = -4
 
 
 My model is as follows:
 
 s_model = specify.model()
 x1 -> m1, b1, NA
 x2 -> m1, b2, NA
 x3 -> m2, b3, NA
 x4 -> m2, b4, NA
 x5 -> m2, b5, NA
 x6 -> m2, b6, NA
 m1 -> y, a1, NA
 m2 -> y, a2, NA
 m1 <-> m1, v1, NA
 m2 <-> m2, v2, NA
 y <-> y, v3, NA
 
 
 x1-x6 are observed independent variables, m1 and m2 are the latent 
 variables and y is the observed dependent variable. I use the 
 raw.moments command for calculating the covariance matrix,
 based on data with 147 observations.
 
 The command that I use is as follows:
 
 s = sem(s_model,S=R,obs.variables=colnames(R),
 fixed.x=c('x1','x2','x3','x4','x5','x6'), raw=TRUE)
 
 
 I would appreciate any help on this; I am new to structural equation 
 models and realize that I may be making a silly error.
 




Re: [R] Error in structural equation model - The model has negative degrees of freedom

2005-12-01 Thread John Fox
Dear Sunil,

There are 7*8/2 = 28 raw moments among the 7 observed variables. Of these,
6*7/2 = 21 are used for the moments among the 6 fixed-exogenous variables,
leaving 28 - 21 = 7 df. Your model has 11 free parameters, so df for the
model = 7 - 11 = -4.
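The bookkeeping above can be checked directly (a sketch of the arithmetic, not part of the original exchange):

```r
# Degrees-of-freedom accounting for the sem model in the question.
p <- 7                              # observed variables
moments <- p * (p + 1) / 2          # 28 raw moments
fixed <- 6                          # fixed-exogenous x1-x6
used <- fixed * (fixed + 1) / 2     # 21 moments among the x's
free_params <- 11                   # b1-b6, a1-a2, v1-v3
df <- (moments - used) - free_params
df  # -4, matching the sem() error message
```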

Some additional comments:

If you're using raw moments, why isn't there a constant variable in the
model?

Do you really intend x1 -- x6 to be causes, rather than indicators, of m1
and m2?

Why are there no normalizing constraints on the latent variables?

Do you really want to fit a model like this to so small a data set?

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Sunil W
 Sent: Wednesday, November 30, 2005 10:08 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Error in structural equation model - The model 
 has negative degrees of freedom
 
 Hi
 
 I am running a structural equation model with R using the sem 
 command; I am getting the following error:
 
 Error in sem.default : The model has negative degrees of 
 freedom = -4
 
 
 My model is as follows:
 
 s_model = specify.model()
 x1 -> m1, b1, NA
 x2 -> m1, b2, NA
 x3 -> m2, b3, NA
 x4 -> m2, b4, NA
 x5 -> m2, b5, NA
 x6 -> m2, b6, NA
 m1 -> y, a1, NA
 m2 -> y, a2, NA
 m1 <-> m1, v1, NA
 m2 <-> m2, v2, NA
 y <-> y, v3, NA
 
 
 x1-x6 are observed independent variables, m1 and m2 are the 
 latent variables and y is the observed dependent variable. I 
 use the raw.moments command for calculating the covariance 
 matrix, based on data with 147 observations.
 
 The command that I use is as follows:
 
 s = sem(s_model,S=R,obs.variables=colnames(R),
 fixed.x=c('x1','x2','x3','x4','x5','x6'), raw=TRUE)
 
 
 I would appreciate any help on this; I am new to structural 
 equation models 
 and realize that I may be making a silly error.
 
 Thanks
 Sunil
 



Re: [R] Tiger Mac stalls running Rcmdr program

2005-11-25 Thread John Fox
Dear Maha,

I'm not a Mac user, so can't offer much help, since you appear to have
followed the Mac installation notes. (Rick: I'm copying this reply to you in
case you didn't see Maha's message on r-help.)

I'm writing primarily to make sure that you know that the Rcmdr doesn't
support either structural-equation modeling or meta-analysis.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Maha Golestaneh
 Sent: Friday, November 25, 2005 4:25 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Tiger Mac stalls running Rcmdr program
 
 I am a Macintosh computer (MAC OS X Version 10.4.3) user.  I 
 would like to run R for structural equation modeling and 
 meta-analysis but am having difficulty using the Rcmdr interface.
 
 According to the R commander installation notes for Tiger 
 Macs - I need to
 1) Install X11.app from Apple Install disks - which I have done
 2) Install R.app - which I have done
 3) Install binary package rgl from CRAN - which I have done
 4) Install binary Rcmdr from CRAN - which I have done
 
 I then need to start R and X11 and type library(Rcmdr) in 
 the R console and press Return... 
 
 My computer now stalls (a colored CD spins), meaning 
 possibly it can't run the Rcmdr (R Commander) program.  
 Please help me install this successfully. Thank you.
 
 Maha
 
 Maha Golestaneh
 PhD Candidate, Graduate School of Business The Woolsack, 
 Pavilion 2, Room 2.20 University of Cape Town, Woolsack Drive 
 Rondebosch, Cape Town South Africa [EMAIL PROTECTED] 
 [EMAIL PROTECTED] H 011-27-21-685-4050 x222 M 011-27-72-713-0649
 
 
 




Re: [R] type III sums of squares in R

2005-11-24 Thread John Fox
Dear Stefanie and Mike,

To elaborate slightly on Mike's points, the Anova() function in car
calculates Type-III (and Type-II) tests differently from SAS. (The
difference originates in the fact that SAS uses a deficient-rank
parametrization of the model while R uses a full-rank parametrization; it
would be possible to mimic SAS's behaviour more closely, but I think that
there are problems with it.) As a consequence, you have to use a
contrast-generating function, such as contr.helmert or contr.sum (but not
contr.SAS), that provides contrasts that are orthogonal in the row-basis of
the model matrix. 
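The distinction can be seen from the contrast matrices themselves (a sketch): contr.sum and contr.helmert produce columns that sum to zero, and hence are orthogonal to the constant, while contr.treatment and contr.SAS do not.

```r
# Zero column sums mean the contrast columns are orthogonal to the
# intercept in the row-basis of the model matrix.
colSums(contr.sum(4))        # 0 0 0
colSums(contr.helmert(4))    # 0 0 0
colSums(contr.treatment(4))  # 1 1 1 (not orthogonal to the constant)
colSums(contr.SAS(4))        # 1 1 1
```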

I should probably elaborate the warning about Type-III tests in the help
page for Anova(), but perhaps it would help to know that the issue is
discussed at greater length in the book with which the car package is
associated.

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Mike Sears
 Sent: Thursday, November 24, 2005 1:54 PM
 To: r-help@stat.math.ethz.ch
 Subject: Re: [R] type III sums of squares in R
 
 If you use the Anova function in the car package and specify 
 contr.sum or contr.SAS for the contrasts for your categorical 
 factors, you will get the same results as outputted by SAS. 
 I've tried this with a variety of data sets and it works.
 
 On Thu November 24 2005 08:27, Stefanie von Felten, IPWIfU wrote:
  Hi everyone,
 
  Can someone explain to me how to calculate SAS type III sums 
 of squares 
  in R? Not that I would like to use them, I know they are 
 problematic. 
  I would like to know how to calculate them in order to demonstrate 
  that strange things happen when you use them (for a course for 
  example). I know you can use drop1(lm(), test="F") but for an 
  lm(y~A+B+A:B), type III SSQs are only calculated for the A:B 
  interaction. If I specify lm(y~A+B), it calculates type II 
 SSQ for the 
  main effects (as type III only differs from type II if interactions 
  are included). Thus, this approach doesn't help.
 
  Another approach is the Anova(, type="III") function within the 
  library(car). But somehow it produces strange results. 
 Somebody told 
  me I have to change the contrast settings using 
  options(contrasts=c("contr.helmert", "contr.poly")). But I had the 
  impression that my results are still impossible.
  Are the calculations dependent on the version of R I use? I am 
  currently using R 2.1.1.
 
  The only thing that seems to work is a trick: Specify a separate 
  column AB that codes a new variable for the interaction of A:B. Now 
  you can fit A,B, and AB (as 3 main effects) in 3 different 
 sequential 
  models with each one of them in the end once. For the term 
 in the end 
  you then get type III SSQ which seem to be correct.
 
  Cheers
  Steffi
 
 --
 Michael W. Sears, Ph.D.
 Postdoctoral Fellow
 Department of Biology/MS 314
 University of Nevada, Reno
 Reno, NV 89557
 
 Assistant Professor
 Department of Zoology
 Southern Illinois University
 Carbondale, IL 62901
 
 phone: 775.784.8008
 cell:  775.232.3520
 web:http://www.unr.edu/homepage/msears
 http://www.science.siu.edu/zoology/sears
 
 Natural selection is a mechanism for generating
  an exceedingly high degree of improbability.
 -Sir Ronald Fisher (1890-1962)
 



Re: [R] residuals in logistic regression model

2005-11-24 Thread John Fox
Dear Urania,

The residuals method for glm objects can compute several kinds of residuals;
the default is deviance residuals. See ?residuals.glm for details and
references.
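For instance (a sketch with simulated data, not from the original question), the squared deviance residuals sum to the residual deviance:

```r
# Fit a logit model to simulated data; the default residuals() for a
# glm object are deviance residuals, whose squares sum to deviance(mod).
set.seed(1)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(0.5 + x))
mod <- glm(y ~ x, family = binomial)

r.dev <- residuals(mod)  # same as residuals(mod, type = "deviance")
all.equal(sum(r.dev^2), deviance(mod))  # TRUE
```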

I hope this helps.
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Urania Sun
 Sent: Thursday, November 24, 2005 1:36 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] residuals in logistic regression model
 
 In the logistic regression model, there is no residual
 
 log(pi/(1-pi)) = beta_0 + beta_1*X_1 + ...
 
 But glm model will return
 
 residuals
 
 What is that?
 
 How to understand this? Can we put some residual in the 
 logistic regression model by replacing pi with pi' (the estimated pi)?
 
  log(pi'/(1-pi')) = beta_0 + beta_1*X_1 + ... + ei
 
 Thanks!
 
 



Re: [R] residuals in logistic regression model

2005-11-24 Thread John Fox
Dear Urania,

 -Original Message-
 From: Urania Sun [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, November 24, 2005 8:52 PM
 To: John Fox
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] residuals in logistic regression model
 
 Thanks, Professor. 
  
 But is it ok to write residuals in the right hand side of the 
 logistic regression formula? Some people said I cannot since 
 the generalized linear model is to use a function to link the 
 expectation to a linear model. So there should not be 
 residuals in the right hand side. 
  
 My question is that If residuals do exist (as in the glm 
 model output), why not put them in the formula (for example, 
 if I write the left-hand side as the estimated odds-ratio)?
 

There are several kinds of residuals for generalized linear models, as I
mentioned (see ?residuals.glm). The residuals in the glm output are deviance
residuals, which are the casewise (signed) components of the residual
deviance; differences between y and fitted-y are called response residuals
(and aren't generally as useful). The left-hand side of a logit model
transformed with the logit-link is the log-odds, not the odds or odds-ratio.
The form of the model to which the response residuals applies has the
proportion, not the logit, on the left-hand side.
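A sketch of the point with simulated data (the data and model here are illustrative): response residuals live on the probability scale, while the linear predictor is the fitted log-odds, with no additive error term of its own.

```r
set.seed(2)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(x))
mod <- glm(y ~ x, family = binomial)

# Response residuals are simply y minus the fitted probabilities.
r.resp <- residuals(mod, type = "response")
all.equal(unname(r.resp), y - unname(fitted(mod)))  # TRUE

# The linear predictor is exactly the logit of the fitted probabilities:
# the model is deterministic on the logit scale.
eta <- predict(mod, type = "link")
all.equal(unname(eta), qlogis(unname(fitted(mod))))  # TRUE
```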

These matters are discussed in the references given in ?residuals.glm, and
in many other places, such as Sec. 6.6 of my R and S-PLUS Companion to
Applied Regression.

 Many thanks!
  
 Happy Thanksgiving!

Unfortunately we celebrate Thanksgiving in Canada in October, probably
because the weather here in late November leaves little to be thankful for.

Regards,
 John

  
 On 11/24/05, John Fox [EMAIL PROTECTED] wrote: 
 
   Dear Urania,
   
   The residuals method for glm objects can compute 
 several kinds of residuals;
   the default is deviance residuals. See ?residuals.glm 
 for details and 
   references.
   
   I hope this helps.
   John
   
   
   John Fox
   Department of Sociology
   McMaster University
   Hamilton, Ontario
   Canada L8S 4M4
   905-525-9140x23604
   http://socserv.mcmaster.ca/jfox 
   
   
-Original Message-
From: [EMAIL PROTECTED]
[mailto: [EMAIL PROTECTED] On Behalf 
 Of Urania Sun
Sent: Thursday, November 24, 2005 1:36 PM
To: r-help@stat.math.ethz.ch 
Subject: [R] residuals in logistic regression model
   
In the logistic regression model, there is no residual
   
 log(pi/(1-pi)) = beta_0 + beta_1*X_1 + ...
   
But glm model will return 
   
residuals
   
What is that?
   
How to understand this? Can we put some residual in the
logistic regression model by replacing pi with pi' 
 (the estimated pi)?
   
  log(pi'/(1-pi')) = beta_0 + beta_1*X_1 + ... + ei 
   
Thanks!
   
   
   
   
 
 




Re: [R] _simple_ introduction to bayesian methods?

2005-11-22 Thread John Fox
Dear Joshua,

You might take a look at Lancaster, An Introduction to Modern Bayesian
Econometrics.

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Joshua 
 N Pritikin
 Sent: Tuesday, November 22, 2005 12:13 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] _simple_ introduction to bayesian methods?
 
 I made it through the first chapter (Background) of 
 Bayesian Data Analysis by Gelman, et al.  Conceptually, the 
 Bayesian approach appeals to me and my curiosity is piqued.  
 However, the next chapter was much too terse. The math is 
 daunting.  Where can I find a gentle introduction?  Or which 
 math book(s) do I need to read first?
 
 On page 27, there is mention of introductory books including 
 Bayesian Methods by Jeff Gill (2002).  Just for fun, I took a 
 look at this book to see whether I could get through it.  
 Chapter 1 was inviting, but again, the math got too 
 sophisticated starting from chapter 2.
 
 --
 Make April 15 just another day, visit http://fairtax.org




Re: [R] graphing help

2005-11-22 Thread John Fox
Dear Anthony,

You can use xlim in plot() to reverse the x-axis. For example,

x <- 1:10
y <- x
plot(x, y, xlim=rev(range(x)))

(I'm pretty sure, by the way, that this question has been asked before.)
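rev(range(x)) simply hands plot() a decreasing xlim. As an alternative to the multiply-by-minus-one trick in the question, axis() can relabel the ticks with their positive values (a sketch; the pdf device is only there to keep the example headless-safe):

```r
x <- 1:10
y <- x

rev(range(x))  # c(10, 1): a decreasing xlim reverses the axis

# Plot -x but suppress the default axis, then draw tick labels
# with the signs flipped back to positive.
pdf(tempfile(fileext = ".pdf"))
plot(-x, y, xaxt = "n", xlab = "x")
axis(1, at = pretty(-x), labels = -pretty(-x))
dev.off()
```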

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Anthony Pitts
 Sent: Tuesday, November 22, 2005 3:04 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] graphing help
 
 Hello all,
 
 Does anyone know how to make R graph the x-axis in descending 
 order? I am all set in creating a vertical bar chart using 
 either the plot or the xychart (in lattice), but I need the 
 x-axis in descending order. I have tried resorting the data 
 but apparently R automatically graphs the x-axis in ascending order.
 I have multiplied the data used for the x-axis by -1 and this 
 works but now the tick-mark labels are negative. So does 
 anyone know how to 1) make R graph the x-axis in descending 
 order or 2) change the tick labels so I can get rid of the 
 negative signs?
 
 Thanks in advance for your help
 
 Tony




Re: [R] Autoloading R Commander

2005-11-19 Thread John Fox
Dear Stephen,

I believe that this question has been asked before, though possibly
privately rather than on the r-help list. A solution (kindly provided, as I
recall, by Brian Ripley) is to put the following in an appropriate start-up
file. For example, if you *always* want to start the Rcmdr when R starts,
this could go in Rprofile.site in R's etc subdirectory. For more detail, see
?Startup, as others have suggested.

local({
  old <- getOption("defaultPackages")
  options(defaultPackages = c(old, "Rcmdr"))
})

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Stephen P. Molnar, Ph.D.
 Sent: Saturday, November 19, 2005 10:35 AM
 To: R
 Subject: [R] Autoloading R Commander
 
 How do I go about autoloading R Commander when I start R?
 
 Thanks in advance.
 -- 
 Stephen P. Molnar, Ph.D.  
 Life is a fuzzy set
 Foundation for Chemistry  
 Stochastic and multivariant
 http://www.geocities.com/FoundationForChemistry




Re: [R] Autoloading R Commander

2005-11-19 Thread John Fox
Dear Stephen,

As a brief addendum, this information (and other information) is in the
Rcmdr installation notes, at
http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html.

Sorry I forgot that when I posted my original answer.

John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of John Fox
 Sent: Saturday, November 19, 2005 11:27 AM
 To: [EMAIL PROTECTED]
 Cc: 'R'
 Subject: Re: [R] Autoloading R Commander
 
 Dear Stephen,
 
 I believe that this question has been asked before, though 
 possibly privately rather than on the r-help list. A solution 
 (kindly provided, as I recall, by Brian Ripley) is to put the 
 following in an appropriate start-up file. For example, if 
 you *always* want to start the Rcmdr when R starts, this 
 could go in Rprofile.site in R's etc subdirectory. For more 
 detail, see ?Startup, as others have suggested.
 
 local({
   old <- getOption("defaultPackages")
   options(defaultPackages = c(old, "Rcmdr"))
 })
 
 I hope this helps,
  John
 
 
 John Fox
 Department of Sociology
 McMaster University
 Hamilton, Ontario
 Canada L8S 4M4
 905-525-9140x23604
 http://socserv.mcmaster.ca/jfox
  
 
  -Original Message-
  From: [EMAIL PROTECTED] 
  [mailto:[EMAIL PROTECTED] On Behalf Of Stephen P. 
  Molnar, Ph.D.
  Sent: Saturday, November 19, 2005 10:35 AM
  To: R
  Subject: [R] Autoloading R Commander
  
  How do I go about autoloading R Commander when I start R?
  
  Thanks in advance.
  -- 
  Stephen P. Molnar, Ph.D.
  Life is a fuzzy set
  Foundation for Chemistry
  Stochastic and multivariant
  http://www.geocities.com/FoundationForChemistry
 
 



Re: [R] loess: choose span to minimize AIC?

2005-11-17 Thread John Fox
Dear Mike,

You could try

bestLoess <- function(model, criterion=c("aicc", "aicc1", "gcv"),
    spans=c(.05, .95)){
    criterion <- match.arg(criterion)
    f <- function(span) {
        mod <- update(model, span=span)
        loess.aic(mod)[[criterion]]
    }
    result <- optimize(f, spans)
    list(span=result$minimum, criterion=result$objective)
}

A little experimentation suggests that aicc1 doesn't behave
reasonably.
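Putting the pieces together, here is a self-contained sketch: a trimmed-down loess.aic() based on the formulas in the original message below, with a narrower span range of 0.2 to 0.95 (an assumption on my part, to keep every trial fit well defined on the 50-point cars data):

```r
loess.aic <- function(x) {
  # AICc and GCV computed from the components of a loess fit
  if (!inherits(x, "loess")) stop("argument must be a loess object")
  n <- x$n
  traceL <- x$trace.hat
  sigma2 <- sum(x$residuals^2) / (n - 1)
  list(span = x$pars$span,
       aicc = log(sigma2) + 1 + 2 * (2 * (traceL + 1)) / (n - traceL - 2),
       gcv  = n * sigma2 / (n - traceL)^2)
}

bestLoess <- function(model, criterion = c("aicc", "gcv"),
                      spans = c(0.2, 0.95)) {
  criterion <- match.arg(criterion)
  # One-dimensional search for the span minimizing the chosen criterion.
  f <- function(span) loess.aic(update(model, span = span))[[criterion]]
  result <- optimize(f, spans)
  list(span = result$minimum, criterion = result$objective)
}

cars.lo <- loess(dist ~ speed, cars)
best <- bestLoess(cars.lo, "gcv")
```

The refitted model at the selected span is then just update(cars.lo, span = best$span).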

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Michael Friendly
 Sent: Thursday, November 17, 2005 9:58 AM
 To: R-help@stat.math.ethz.ch
 Subject: [R] loess: choose span to minimize AIC?
 
 Is there an R implementation of a scheme for automatic 
 smoothing parameter selection with loess, e.g., by minimizing 
 one of the AIC/GCV statistics discussed by Hurvich, Simonoff 
 & Tsai (1998)?
 
 Below is a function that calculates the relevant values of AICC,
 AICC1, and GCV --- I think, because I had to guess from the names 
 of the components returned in a loess object.
 
 I guess I could use optimize(), or do a simple line search on 
 span=, but I'm not sure how to use loess.aic to write a 
 function that would act as a wrapper for loess() and return 
 the mimimizing loess fit for a specified criterion.
 
 loess.aic <- function (x) {
   # extract values from loess object
   if (!(inherits(x, "loess"))) stop("Error: argument must be a loess object")
   span <- x$pars$span
   n <- x$n
   traceL <- x$trace.hat
   sigma2 <- sum(x$residuals^2) / (n-1)
   delta1 <- x$one.delta
   delta2 <- x$two.delta
   enp <- x$enp
 
   aicc <- log(sigma2) + 1 + 2*(2*(traceL+1)) / (n-traceL-2)
   aicc1 <- n*log(sigma2) + n*((delta1/(delta2*(n+enp)))/(delta1^2/delta2) - 2)
   gcv  <- n*sigma2 / (n-traceL)^2
 
   result <- list(span=span, aicc=aicc, aicc1=aicc1, gcv=gcv)
   return(result)
 }
 
 
  cars.lo <- loess(dist ~ speed, cars)
 
  (values <- loess.aic(cars.lo))
 $span
 [1] 0.75
 
 $aicc
 [1] 6.93678
 
 $aicc1
 [1] 167.7267
 
 $gcv
 [1] 5.275487
 
  
 
 
 -- 
 Michael Friendly Email: friendly AT yorku DOT ca
 Professor, Psychology Dept.
 York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
 4700 Keele Streethttp://www.math.yorku.ca/SCS/friendly.html
 Toronto, ONT  M3J 1P3 CANADA
 



Re: [R] Interpretation of output from glm

2005-11-10 Thread John Fox
Dear Pedro,

The basic point, which relates to the principle of marginality in
formulating linear models, applies whether the predictors are factors,
covariates, or both. I think that this is a common topic in books on linear
models; I certainly discuss it in my Applied Regression, Linear Models, and
Related Methods.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Pedro de Barros
 Sent: Wednesday, November 09, 2005 10:45 AM
 To: r-help@stat.math.ethz.ch
 Subject: Re: [R] Interpretation of output from glm
 Importance: High
 
 Dear John,
 
 Thanks for the quick reply. I did indeed have these ideas, 
 but only vaguely, and all I could find about this 
 mentioned categorical predictors. Can you suggest a good book 
 where I could try to learn more about this?
 
 Thanks again,
 
 Pedro
 At 01:49 09/11/2005, you wrote:
 Dear Pedro,
 
 
   -Original Message-
   From: [EMAIL PROTECTED] 
   [mailto:[EMAIL PROTECTED] On Behalf Of Pedro de 
   Barros
   Sent: Tuesday, November 08, 2005 9:47 AM
   To: r-help@stat.math.ethz.ch
   Subject: [R] Interpretation of output from glm
   Importance: High
  
   I am fitting a logistic model to binary data. The 
 response variable 
   is a factor (0 or 1) and all predictors are continuous variables. 
   The main predictor is LT (I expect a logistic relation between LT 
   and the probability of being
   mature) and the other are variables I expect to modify 
 this relation.
  
  I want to test whether all predictors contribute significantly to the
  fit or not. I fit the full model and get these results:
  
 summary(HMMaturation.glmfit.Full)
  
   Call:
   glm(formula = Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom,
family = binomial(link = logit), data = HMIndSamples)
  
   Deviance Residuals:
Min   1Q   Median   3Q  Max
   -3.0983  -0.7620   0.2540   0.7202   2.0292
  
   Coefficients:
   Estimate Std. Error z value Pr(>|z|)
   (Intercept) -8.789e-01  3.694e-01  -2.379  0.01735 *
   LT   5.372e-02  1.798e-02   2.987  0.00281 **
   CondF   -6.763e-02  9.296e-03  -7.275 3.46e-13 ***
   Biom-1.375e-02  2.005e-03  -6.856 7.07e-12 ***
   LT:CondF 2.434e-03  3.813e-04   6.383 1.74e-10 ***
   LT:Biom  7.833e-04  9.614e-05   8.148 3.71e-16 ***
   ---
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
   (Dispersion parameter for binomial family taken to be 1)
  
Null deviance: 10272.4  on 8224  degrees of freedom Residual 
   deviance:  7185.8  on 8219  degrees of freedom
   AIC: 7197.8
  
   Number of Fisher Scoring iterations: 8
  
   However, when I run anova on the fit, I get   
   anova(HMMaturation.glmfit.Full, test='Chisq') Analysis of 
 Deviance 
   Table
  
   Model: binomial, link: logit
  
   Response: Mature
  
   Terms added sequentially (first to last)
  
  
    Df Deviance Resid. Df Resid. Dev P(>|Chi|)
   NULL822410272.4
   LT  1   2873.8  8223 7398.7   0.0
   CondF   1  0.1  8222 7398.5   0.7
   Biom1  0.2  8221 7398.3   0.7
   LT:CondF1142.1  8220 7256.3 9.413e-33
   LT:Biom 1 70.4  8219 7185.8 4.763e-17
   Warning message:
   fitted probabilities numerically 0 or 1 occurred in: 
 method(x = x[, 
   varseq <= i, drop = FALSE], y = object$y, weights = 
   object$prior.weights,
  
  
   I am having a little difficulty interpreting these results.
   The result from the fit tells me that all predictors are 
   significant, while the anova indicates that besides LT (the main 
   variable), only the interaction of the other terms is 
 significant, 
   but the main effects are not.
   I believe that in the first output (on the glm object), the 
   significance of all terms is calculated considering each of them 
   alone in the model (i.e.
   removing all other terms), while the anova output is (as it says) 
   considering the sequential addition of the terms.
  
   So, there are 2 questions:
   a) Can I tell that the interactions are significant, but not the 
   main effects?
 
 In a model with this structure, the main effects represent slopes 
 over the origin (i.e., where the other variables in the 
 product terms 
 are 0), and aren't meaningfully interpreted as main effects. 
 (Is there 
 even any data near the origin?)
 
   b) Is it legitimate to consider a model where the 
 interactions are 
   considered, but not the main effects CondF and Biom?
 
 Generally, no: That is, such a model is interpretable, but it places 
 strange constraints on the regression surface -- that the CondF and 
 Biom slopes are 0 over the origin.
 
 None of this is specific to logistic regression -- it applies generally to generalized linear models, including linear models.

Re: [R] basic mac question

2005-11-09 Thread John Fox
Dear George,

I'm afraid that people won't understand the context of your question -- that
is, that you're using the Rcmdr GUI, but that the crux of your problem has to
do with changing the names of variables in the data editor using R on a Mac. 

(As you know, I'm not a Mac user and am unable to answer that question. I did
answer the first part of your question in our private email correspondence,
which is that the New Data Set window goes away only after the data editor
is closed.)

I hope that, with this clarification, a Mac R user will come to your
assistance.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of George 
 Skountrianos
 Sent: Wednesday, November 09, 2005 6:51 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] basic mac question
 
 ok i got r 2.2.0 to work on my mac just fine but im having 
 problems with when i try to create a new data set. when i go 
 to new data...new data set it opens up the window. so i name 
 my set but when i click ok the window does not go away. is 
 this normal. also the data editor will not allow me to name 
 my variables. for example i click on the create new variable 
 columb and row button and i get var1, var2  and then n/a n/a. 
 ok well im able to put numbers into the n/a slots but i 
 cannot change the var1, var2, ect names. any suggestions. ooo 
 i have tiger 10.4. thanks
 



Re: [R] Need advice about models with ordinal input variables

2005-11-08 Thread John Fox
Dear Paul,

I'll try to answer these question briefly.

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Paul Johnson
 Sent: Tuesday, November 08, 2005 2:42 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Need advice about models with ordinal input variables
 
 Dear colleagues:
 
 I've been storing up this question for a long time and 
 apologize for the length and verbosity of it.  I am having 
 trouble in consulting with graduate students on their 
 research projects.  They are using surveys to investigate the 
 sources of voter behavior or attitudes.  They have predictors 
 that are factors, some ordered, but I am never confident in 
 telling them what they ought to do.  Usually, they come in 
 with a regression model fitted as though these variables are 
 numerical, and if one looks about in the social science 
 literature, one finds that many people have published doing the same.
 
 I want to ask your advice about some cases.
 
 1. An ordered factor that masquerades as a numerical 
 interval level score.
 
 In the research journals, these are the ones most often 
 treated as numerical variables in regressions. For example: 
 Thermometer scores 
 for Presidential candidates range from 0 to 100 in integer units.
 

In my opinion, such scales aren't even really measurements, but they have a
prima-facie reasonableness, and I wouldn't hesitate to use them in the
absence of something better.

 What's a better idea?  In an OLS model with just one input 
 variable, a plot will reveal if there is a significant 
 nonlinearity. One can recode the assigned values to 
 linearize the final model or take the given values and make a 
 nonlinear model.
 
 In the R package acepack I found avas, which works like a 
 rubber ruler and recodes variables in order to make 
 relationships as linear and homoskedastic as possible.  I've 
 never seen this used in the social science literature.  It 
 works like magic.  Take an ugly scatterplot and shazam, out 
 come transformed variables that have a beautiful plot.  But 
 what do you make of these things?  There is so much going on 
 in these transformations that interpretation of the results 
 is very difficult. 
 You can't say a one unit increase in x causes a b increase in y. 
 Furthermore, if the model is a survival model, a logistic 
 regression, or other non-OLS model, I don't see how the avas 
 approach will help.
 

As a general matter, the central issue seems to me the functional form of
the relationship between the variables, whether or not X really has a unit
of measurement. If you use a nonparametric fit (such as avas) or even if you
use a complex parametric fit, such as a regression (rather than smoothing)
spline, then why not think of the graph of the fit as the best description?

 I've tried fiddling about with smoothers, treating the input 
 scores as if they were numerical.  I got this idea from Prof. 
 Harrell's Regression Modeling Strategies.  In his Design 
 package for R, one can include a cubic spline for a variable 
 in a model by replacing x with rcs(x). Very convenient. If 
 the results say the relationship is mostly linear, then we 
 might as well treat the input variable as a numerical score 
 and save some degrees of freedom.
 
 But if the higher order terms are statistically significant, 
 it is difficult to know what to do. The best strategy I have 
 found so far is to calculate fitted values for particular 
 inputs and then try to tell a story about them.
 

This seems reasonable, though I'd usually prefer a graph to a table.

 
 2. Ordinal variables with less than 10 values.
 
 Consider variables like self-reported ideology, where 
 respondents are asked to place themselves on a 7 point scale 
 ranging from very conservative to very liberal.  Or Party 
 Identification on a 7 point scale, ranging (in the US) from 
 Strong Democrat to Strong Republican.
 
 It has been quite common to see these thrown into regression 
 models as if they were numerical.
 
 I've sometimes found it useful to run a regression treating 
 them as unordered factors, and then I attempt to glean a 
 pattern in the coefficients.  If the parameter estimates step 
 up by a fixed proportion, then one might think there's no 
 damage from treating them as numerical variables.
 

Since linear (and other similar) models don't make distributional
assumptions about the X's (other than independence from the errors), nothing
in principle changes.

 Yesterday, it occurred to us that there should be a 
 significance test to determine if one loses predictive power 
 by replacing the factor-treatment of x with x itself.  Is 
 there a non-nested model test that is most appropriate?
 

Actually, the models are nested, so, e.g., for a linear model, you could do
an incremental F-test. That is, a linear relationship is a special case of
any relationship at all, which is what you get by treating X as a factor.
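A minimal sketch of the nested comparison described here, on simulated data (made up for illustration, not from the thread):

```r
# Simulated example: a 7-point ordinal predictor coded as numeric (1 df)
# versus as an unordered factor (6 df). The incremental F-test asks whether
# the extra 5 df for an arbitrary pattern of means improve on the linear fit.
set.seed(1)
x <- rep(1:7, each = 20)              # 7-point ordinal scale
y <- 0.5 * x + rnorm(length(x))       # truth is linear here
m.linear <- lm(y ~ x)                 # x treated as numeric
m.factor <- lm(y ~ factor(x))         # x treated as an unordered factor
anova(m.linear, m.factor)             # incremental F-test on 5 df
```

Since the linear model is a special case of the factor model, anova() on the two fits gives exactly the incremental F-test mentioned above.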

 3. Truly numerical variables that are 

Re: [R] Interpretation of output from glm

2005-11-08 Thread John Fox
Dear Pedro,


 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Pedro de Barros
 Sent: Tuesday, November 08, 2005 9:47 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Interpretation of output from glm
 Importance: High
 
 I am fitting a logistic model to binary data. The response 
 variable is a factor (0 or 1) and all predictors are 
 continuous variables. The main predictor is LT (I expect a 
 logistic relation between LT and the probability of being 
 mature) and the other are variables I expect to modify this relation.
 
 I want to test if all predictors contribute significantly for 
 the fit or not I fit the full model, and get these results
 
   summary(HMMaturation.glmfit.Full)
 
 Call:
 glm(formula = Mature ~ LT + CondF + Biom + LT:CondF + LT:Biom,
  family = binomial(link = logit), data = HMIndSamples)
 
 Deviance Residuals:
  Min   1Q   Median   3Q  Max
 -3.0983  -0.7620   0.2540   0.7202   2.0292
 
 Coefficients:
   Estimate Std. Error z value Pr(>|z|)
 (Intercept) -8.789e-01  3.694e-01  -2.379  0.01735 *
 LT   5.372e-02  1.798e-02   2.987  0.00281 **
 CondF   -6.763e-02  9.296e-03  -7.275 3.46e-13 ***
 Biom-1.375e-02  2.005e-03  -6.856 7.07e-12 ***
 LT:CondF 2.434e-03  3.813e-04   6.383 1.74e-10 ***
 LT:Biom  7.833e-04  9.614e-05   8.148 3.71e-16 ***
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 
 (Dispersion parameter for binomial family taken to be 1)
 
  Null deviance: 10272.4  on 8224  degrees of freedom 
 Residual deviance:  7185.8  on 8219  degrees of freedom
 AIC: 7197.8
 
 Number of Fisher Scoring iterations: 8
 
 However, when I run anova on the fit, I get   
 anova(HMMaturation.glmfit.Full, test='Chisq') Analysis of 
 Deviance Table
 
 Model: binomial, link: logit
 
 Response: Mature
 
 Terms added sequentially (first to last)
 
 
 Df Deviance Resid. Df Resid. Dev P(>|Chi|)
 NULL822410272.4
 LT  1   2873.8  8223 7398.7   0.0
 CondF   1  0.1  8222 7398.5   0.7
 Biom1  0.2  8221 7398.3   0.7
 LT:CondF1142.1  8220 7256.3 9.413e-33
 LT:Biom 1 70.4  8219 7185.8 4.763e-17
 Warning message:
 fitted probabilities numerically 0 or 1 occurred in: method(x 
 = x[, varseq <= i, drop = FALSE], y = object$y, weights = 
 object$prior.weights,
 
 
 I am having a little difficulty interpreting these results.
 The result from the fit tells me that all predictors are 
 significant, while 
 the anova indicates that besides LT (the main variable), only the 
 interaction of the other terms is significant, but the main 
 effects are not.
 I believe that in the first output (on the glm object), the 
 significance of 
 all terms is calculated considering each of them alone in the 
 model (i.e. 
 removing all other terms), while the anova output is (as it says) 
 considering the sequential addition of the terms.
 
 So, there are 2 questions:
 a) Can I tell that the interactions are significant, but not 
 the main effects?

In a model with this structure, the main effects represent slopes over the
origin (i.e., where the other variables in the product terms are 0), and
aren't meaningfully interpreted as main effects. (Is there even any data
near the origin?)
 
 b) Is it legitimate to consider a model where the interactions are 
 considered, but not the main effects CondF and Biom?

Generally, no: That is, such a model is interpretable, but it places strange
constraints on the regression surface -- that the CondF and Biom slopes are
0 over the origin.

None of this is specific to logistic regression -- it applies generally to
generalized linear models, including linear models.

I hope this helps,
 John



Re: [R] A Quick and (Very) Dirty Intro to Stats in R

2005-11-08 Thread John Fox
Dear Jarrett,


 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Jarrett Byrnes
 Sent: Tuesday, November 08, 2005 6:41 PM
 To: R-help@stat.math.ethz.ch
 Subject: [R] A Quick and (Very) Dirty Intro to Stats in R
 

. . .

 Most importantly, there are still a few holes that need to be 
 filled - if they can be
 
 1) A SIMPLE explanation for how to do mixed models using lme. 
  I am quite unsatisfied with most of what I've seen on the 
 net, and if it even comes close to going over my head, it 
 really won't fly with most folk I know.  I've done the best I 
 can, but I know it falls short.

Possibly take a look at the appendix on mixed models to my R and S-PLUS
Companion to Applied Regression, available at
http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/appendix-mixed-models.pdf.
This was intended to be a simple explanation.

Regards,
 John



Re: [R] OLS variables

2005-11-07 Thread John Fox
Dear Brian,

I don't have a strong opinion, but R's interpretation seems more consistent
to me, and as Kjetil points out, one can use polym() to specify a
full-polynomial model. It occurs to me that ^ and ** could be differentiated
in model formulae to provide both.
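For reference, a toy sketch of the two parameterizations as they currently stand in R (made-up data; polym() supplies the pure quadratic terms as orthogonal-polynomial columns):

```r
# Toy data: "(x + z)^2" expands to x, z, and x:z only, while polym()
# also includes the squared terms, giving the full quadratic surface.
set.seed(1)
dat <- data.frame(x = rnorm(20), z = rnorm(20), y = rnorm(20))
length(coef(lm(y ~ (x + z)^2, data = dat)))               # 4: intercept, x, z, x:z
length(coef(lm(y ~ polym(x, z, degree = 2), data = dat))) # 6: full quadratic
```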

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
 Sent: Monday, November 07, 2005 4:05 AM
 To: Kjetil Brinchmann halvorsen
 Cc: John Fox; r-help@stat.math.ethz.ch
 Subject: Re: [R] OLS variables
 
 On Sun, 6 Nov 2005, Kjetil Brinchmann halvorsen wrote:
 
  John Fox wrote:
 
  I assume that you're using lm() to fit the model, and that 
 you don't 
  really want *all* of the interactions among 20 predictors: 
 You'd need 
  quite a lot of data to fit a model with 2^20 terms in it, 
 and might 
  have trouble interpreting the results.
 
  If you know which interactions you're looking for, then why not 
  specify them directly, as in lm(y ~  x1*x2 + x3*x4*x5 + 
 etc.)? On the 
  other hand, if you want to include all interactions, say, up to 
  three-way, and you've put the variables in a data frame, 
 then lm(y ~ .^3, data=DataFrame) will do it.
 
  This is nice with factors, but with continuous variables, 
 and need of 
  a response-surface type of model, will not do. For instance, with 
  variables x, y, z in data frame dat
 lm( y ~ (x+z)^2, data=dat )
  gives a model with the terms x, z and x*z, not the square terms.
  There is a need for a semi-automatic way to get these, for 
 instance, 
  use poly() or polym() as in:
 
  lm(y ~ polym(x,z,degree=2), data=dat)
 
 This is an R-S difference (FAQ 3.3.2).  R's formula parser 
 always takes
 x^2 = x whereas the S one does so only for factors.  This 
 makes sense if you interpret `interaction' strictly as in 
 John's description - S chose to see an interaction of any two 
 continuous variables as multiplication (something which 
 puzzled me when I first encountered it, as it was not well 
 documented back in 1991).
 
 I have often wondered if this difference was thought to be an 
 improvement, or if it is just a different implementation of the 
 Rogers-Wilkinson syntax.
 Should we consider changing it?
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] OLS variables

2005-11-07 Thread John Fox
Dear Brian,

I like the idea of providing support for raw polynomials in poly() and
polym(), if only for pedagogical reasons.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
 Sent: Monday, November 07, 2005 11:14 AM
 To: John Fox
 Cc: r-help@stat.math.ethz.ch; 'Kjetil Brinchmann halvorsen'
 Subject: RE: [R] OLS variables
 
 On Mon, 7 Nov 2005, John Fox wrote:
 
  Dear Brian,
 
  I don't have a strong opinion, but R's interpretation seems more 
  consistent to me, and as Kjetil points out, one can use polym() to 
  specify a full-polynomial model. It occurs to me that ^ and 
 ** could 
  be differentiated in model formulae to provide both.
 
 However, poly[m] only provide orthogonal polynomials, and I 
 have from time to time considered extending them to provide 
 raw polynomials too.
 Is that a better-supported idea?
 
 
  Regards,
  John
 
  
  John Fox
  Department of Sociology
  McMaster University
  Hamilton, Ontario
  Canada L8S 4M4
  905-525-9140x23604
  http://socserv.mcmaster.ca/jfox
  
 
  -Original Message-
  From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
  Sent: Monday, November 07, 2005 4:05 AM
  To: Kjetil Brinchmann halvorsen
  Cc: John Fox; r-help@stat.math.ethz.ch
  Subject: Re: [R] OLS variables
 
  On Sun, 6 Nov 2005, Kjetil Brinchmann halvorsen wrote:
 
  John Fox wrote:
 
  I assume that you're using lm() to fit the model, and that
  you don't
  really want *all* of the interactions among 20 predictors:
  You'd need
  quite a lot of data to fit a model with 2^20 terms in it,
  and might
  have trouble interpreting the results.
 
  If you know which interactions you're looking for, then why not 
  specify them directly, as in lm(y ~  x1*x2 + x3*x4*x5 +
  etc.)? On the
   other hand, if you want to include all interactions, say, up to 
  three-way, and you've put the variables in a data frame,
  then lm(y ~ .^3, data=DataFrame) will do it.
 
  This is nice with factors, but with continuous variables,
  and need of
   a response-surface type of model, will not do. For 
 instance, with 
  variables x, y, z in data frame dat
 lm( y ~ (x+z)^2, data=dat )
   gives a model with the terms x, z and x*z, not the square terms.
  There is a need for a semi-automatic way to get these, for
  instance,
  use poly() or polym() as in:
 
  lm(y ~ polym(x,z,degree=2), data=dat)
 
  This is an R-S difference (FAQ 3.3.2).  R's formula parser always 
  takes
  x^2 = x whereas the S one does so only for factors.  This 
 makes sense 
   if you interpret `interaction' strictly as in John's 
 description - S 
  chose to see an interaction of any two continuous variables as 
  multiplication (something which puzzled me when I first 
 encountered 
  it, as it was not well documented back in 1991).
 
  I have often wondered if this difference was thought to be an 
  improvement, or if it is just a different implementation of the 
  Rogers-Wilkinson syntax.
  Should we consider changing it?
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] OLS variables

2005-11-06 Thread John Fox
Dear Leaf,

I assume that you're using lm() to fit the model, and that you don't really
want *all* of the interactions among 20 predictors: You'd need quite a lot
of data to fit a model with 2^20 terms in it, and might have trouble
interpreting the results. 

If you know which interactions you're looking for, then why not specify them
directly, as in lm(y ~  x1*x2 + x3*x4*x5 + etc.)? On the other hand, it you
want to include all interactions, say, up to three-way, and you've put the
variables in a data frame, then lm(y ~ .^3, data=DataFrame) will do it.
There are many terms in this model, however, if not quite 2^20.

The introductory manual that comes with R has information on model formulas
in Section 11.

I hope this helps,
 John 


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Leaf Sun
 Sent: Sunday, November 06, 2005 3:11 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] OLS variables
 
 Dear all,
 
 Is there any simple way in R that I can put all the 
 interactions of the variables in the OLS model?
 
 e.g.
 
 I have a bunch of variables, x1,x2, x20... I expect then 
 to have interaction (e.g. x1*x2, x3*x4*x5... ) with some 
 combinations(2 way or higher dimensions). 
 
 Is there any way that I can write the model simpler?
 
 Thanks!
 
 Leaf
 




Re: [R] cox models

2005-11-06 Thread John Fox
Dear Bertrand,

There are only 4 degrees of freedom for a 5-level factor. By default, R
creates contrasts for an unordered factor in a model formula using the
function contr.treatment(), which, also by default, treats the first level
of the factor (class 1, in your case) as the baseline or reference level. To
see the contrasts, enter the command contrasts(igr1$class). 

A bit more information on contrasts is in Section 11.1.1 of the introductory
manual that comes with R.
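A minimal sketch of what the default contrasts do (toy factor, not the poster's data):

```r
# Toy 5-level factor: with the default treatment contrasts, the first level
# is dropped, so only 4 dummy columns appear and level "1" is the all-zero
# baseline -- which is why no "class1" coefficient shows up in the fit.
class <- factor(c(1, 2, 3, 4, 5))
contrasts(class)                 # a 5 x 4 indicator matrix
colnames(model.matrix(~ class))  # "(Intercept)" plus class2 ... class5
```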

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of bertrand
 Sent: Sunday, November 06, 2005 4:21 AM
 To: [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Subject: [R] cox models
 
 Hello,
 i'm a french student of medical oncology and i'm working on 
 breast cancer. I have a variable with the histologic type of 
 tumor which is between 1 and 5. I use the as.factor function to 
 make some variable with level between 1 and 5. When i put it 
 in the cox model i have only the level between 2 and 5. The 
 level 1 doesn't appear. I think i have to change the number 
 of level but i don't really know. Please can you help me?
 
 
 Class
 Levels: 1 2 3 4 5
 coxph(formula = Surv(delai.etat, etat) ~ class, data = igr1)
 
 
   coef exp(coef) se(coef) z   p
 class2 -0.093 0.9110.245 -0.38 7.0e-01
 class3 -0.749 0.4730.286 -2.62 8.9e-03
 class4 -0.600 0.5490.343 -1.75 8.0e-02
 class5 -0.874 0.4170.197 -4.44 8.9e-06
 
 Likelihood ratio test=24.9  on 4 df, p=5.28e-05  n=740 (1 
 observations deleted due to missing)
 
 
 
 Thanks for your help
 
 
 Bertrand
 



Re: [R] (no subject)

2005-11-05 Thread John Fox
Dear Erez,

How about C <- A + B ?
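In vector form, using the arrays from the question, element-wise "+" implements all three rules at once:

```r
# The three merge rules are exactly element-wise addition, so no loop is
# needed, even for very large vectors.
A <- c(0, 1, 0, 0, 1, 1, 0, 0)
B <- c(0, 0, 0, 1, 0, 1, 1, 1)
C <- A + B
C  # 0 1 0 1 1 2 1 1
```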

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Erez
 Sent: Tuesday, November 01, 2005 5:39 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] (no subject)
 
 Hi
 
 I need an advise if any one can help me.
 i have mass of data in 2 array A and B:
 A = 0 1 0 0 1 1 0 0
 B = 0 0 0 1 0 1 1 1
 and i have 3 rules to merge them into 3rd array C:
 if A[i] + B[i] == 0 then C[i]=0
 if A[i] + B[i] == 1 then C[i]=1
 if A[i] + B[i] == 2 then C[i]=2
 it looks easy but with the regular way (loop) with large data 
 it takes days (i test it).
 If any one can advise me what to do i'll be happy.
 
 Thanks
 Erez  
  
   [[alternative HTML version deleted]]
 



Re: [R] function effect and standard error

2005-11-01 Thread John Fox
Dear Emilie,

This is, I think, the effect() function in the effects package.

By the way, the model that you've fit with glm() is just a linear
model, and could also have been fit with lm().

For a model with this simple structure, effect() computes the adjusted
means for the factor sp, holding other predictors to their average
values. These effects are just fitted values under the model, and the
standard errors reported are for the fitted values.

For details, see the paper at
http://www.jstatsoft.org/counter.php?id=75&url=v08/i15/effect-displays-revised.pdf.

I hope this helps.

John

On Tue, 1 Nov 2005 12:55:17 -0500
 Emilie Berthiaume [EMAIL PROTECTED] wrote:
 Hi list!
 
 I did the following regression:
 reg1 <- glm(alti~sp + ovent + vivent + nuage, family=gaussian,
 data=meteo1)
 
 I was interested in knowing the effect of the species (sp) in reg1
 and so I used the function «effect»:
 
 effect.sp <- effect("sp", reg1, se=TRUE)
 
 with this output:
 sp
   AK   BW   NH   OS   RT   SS 
 2.730101 2.885363 2.753774 2.750311 3.084606 2.834390 
 
 If I enter the following command:
 effect.sp$se
 I get this output:
 1253   3100488514   6100 
 0.07924610 0.06713200 0.11493178 0.13106639 0.05252749 0.04208334
 
 
 My question is:  Do the numbers on the second line of this output
 represent the standard error?  What do the numbers on the top line
 represent?
 
 Thank you,
 
 Emilie Berthiaume
 graduate student
 Biology Departement
 Université de Sherbrooke
 2500 boul. de l'Université
 Sherbrooke, Québec
 J1K 2R1 CANADA
 
 Tél: 1-819-821-8000 poste 2059
 Fax: 1-819-821-8049
 [EMAIL PROTECTED]
   [[alternative HTML version deleted]]
 


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/



Re: [R] GLM

2005-10-31 Thread John Fox
Dear Jeff,

One way to do this is with by():

summaries <- by(example, example$Individual, function(data){ 
mod <- glm(Type~Dwater+Habitat, data=data, family=binomial)
list(AIC=AIC(mod), coef=coef(mod))
})

sapply(summaries, function(x) x$coef)
rowMeans(sapply(summaries, function(x) x$coef))
sapply(summaries, function(x) x$AIC)

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Jeff Row
 Sent: Monday, October 31, 2005 12:01 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] GLM
 
 Hello R-users,
 
 I am a relatively new user of R and I have a question 
 regarding glm. I want to run the function GLM on multiple 
 individuals separately and then get means for summary 
 statistics such as AIC and coef. Is there a way to do this in 
 R without separately going through each individual and 
 subsetting the dataset, getting the summary statistics and 
 averaging them? For example, in the dataset below I would 
 like to run the function:
 
 glm(Type~Dwater+Habitat, data=example, family=binomial())
 
 on each individual separately and then get the average for 
 the coef's Dwater and Habitat for the 3 individuals.
 
 example
Individual Type Dwater Habitat
 1   10 40   0
 2   11 80   1
 3   10 30   0
 4   11 90   0
 5   20 15   1
 6   21 95   1
 7   20 20   0
 8   21 75   1
 9   30 15   0
 10  31 60   1
 11  30 10   1
 12  31 95   0
 
 Thanks,
 
 Jeff.
 



Re: [R] Problem with llines in lattice

2005-10-30 Thread John Fox
Dear Deepayan,

The application in which I encountered the problem is much more complicated,
so I'm not sure whether using groups will work there, but I'll give it a
try. 

Thanks for this,
 John

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Deepayan Sarkar
 Sent: Saturday, October 29, 2005 6:01 PM
 To: John Fox
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] Problem with llines in lattice
 
 On 10/29/05, John Fox [EMAIL PROTECTED] wrote:
  Dear r-help list members,
 
  I'm experiencing problems getting type="b" (or "o" or "c") 
 to work in 
  llines(). Try, e.g., the following scaled-down example:
 
  x <- factor(c("a", "b", "a", "b"))
  y <- c(1,2,1,2)
  z <- factor(c("A", "A", "B", "B"))
  symbols <- 1:2
  lines <- 1:2
  colors <- 1:2
  zvals <- levels(z)
  xyplot(y~x|z, panel = function(x, y, subscripts, ...) {
      for (i in 1:length(zvals)) {
          sub <- z[subscripts] == zvals[i]
          llines(x[sub], y[sub], lwd = 2, type = "b",
            col = colors[i], pch = symbols[i], lty = lines[i])
      }
  })
 
  Only the lines (properly coloured with correct line types) 
 appear, but 
  no symbols are plotted.
 
 It's a bug in lplot.xy. Fortunately, panel.xyplot will do the 
 same thing, so you can use it instead of llines.
 
 One comment: I'm not sure if this would work in your real 
 application, but for your toy example a more direct approach would be
 
 xyplot(y ~ x | z, type = 'b', groups = z,
col = colors, pch = symbols, lty = lines,
lwd = 2)
 
 -Deepayan
 



[R] Problem with llines in lattice

2005-10-29 Thread John Fox
Dear r-help list members,

I'm experiencing problems getting type="b" (or "o" or "c") to work in
llines(). Try, e.g., the following scaled-down example:

x <- factor(c("a", "b", "a", "b"))
y <- c(1, 2, 1, 2)
z <- factor(c("A", "A", "B", "B"))
symbols <- 1:2
lines <- 1:2
colors <- 1:2
zvals <- levels(z)
xyplot(y ~ x | z, panel = function(x, y, subscripts, ...) {
    for (i in 1:length(zvals)) {
        sub <- z[subscripts] == zvals[i]
        llines(x[sub], y[sub], lwd = 2, type = "b",
               col = colors[i], pch = symbols[i], lty = lines[i])
    }
})

Only the lines (properly coloured with correct line types) appear, but no
symbols are plotted. On the other hand, if I call llines() with type="l" and
type="p" separately in the panel function, both lines and points appear:

xyplot(y ~ x | z, panel = function(x, y, subscripts, ...) {
    for (i in 1:length(zvals)) {
        sub <- z[subscripts] == zvals[i]
        llines(x[sub], y[sub], lwd = 2, type = "p",
               col = colors[i], pch = symbols[i], lty = lines[i])
        llines(x[sub], y[sub], lwd = 2, type = "l",
               col = colors[i], pch = symbols[i], lty = lines[i])
    }
})

I've looked at both the documentation and code for llines() and lplot.xy(),
which llines() calls, and don't see the source of the problem.

I'm using R 2.2.0 under Windows XP. I'm pretty sure that this used to work,
but I got the same problem in R 2.1.1 when I tried it there.

Any help would be appreciated.

Thanks,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox



Re: [R] lm type of sums of squares

2005-10-28 Thread John Fox
Dear Jarrett,

anova() gives sequential sums of squares (as ?anova.lm says). See
Anova() in the car package for something similar to Type II and III
sums of squares.
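A small base-R illustration of the order dependence (the built-in mtcars data is used purely as an example; the car call is shown commented out since car may not be installed):

```r
# Sequential (Type I) sums of squares depend on the order of the terms:
fit1 <- lm(mpg ~ wt + hp, data = mtcars)
fit2 <- lm(mpg ~ hp + wt, data = mtcars)

anova(fit1)["wt", "Sum Sq"]  # SS for wt, entered first
anova(fit2)["wt", "Sum Sq"]  # SS for wt, adjusted for hp -- different

# With the car package installed, Type II tests are order-independent:
# library(car); Anova(fit1, type = "II")
```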

I hope this helps,
 John

On Fri, 28 Oct 2005 10:05:39 -0700
 Jarrett Byrnes [EMAIL PROTECTED] wrote:
 I'm curious, I realize there are methods for Type II and III sums of 
 squares, and yet, when I've been constructing models with lm, I've 
 noticed that position of the term of the model has not mattered in 
 terms of its p-value.  Does lm use sequential Type I sums of squares,
 
 or something else?
 
 Thanks!
 
 -Jarrett
 


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/



Re: [R] Help: partial.cor significance test

2005-10-26 Thread John Fox
Dear Simon,

The population partial correlation rho[12|3...p] is 0 when the regression
coefficient beta[2] for x[2] from the regression of x[1] on x[2] ... x[p] is
0. Thus, the usual t-test for a regression coefficient also tests that the
partial correlation is 0.

Now, the sample partial correlation r[12|3...p] = t/sqrt(t^2 + dfe), where
dfe = n - p is the degrees of freedom for error and t is the t-statistic for
testing that beta[2] is 0, and thus t = sqrt(dfe*r^2[12|3...p]/(1 -
r^2[12|3...p])), so it is easy to compute the (unsigned) t-statistics from
the partial correlations.
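A quick numerical check of these identities, using the built-in mtcars data as an illustrative stand-in: the partial correlation of mpg and wt given hp, computed once from the regression t-statistic and once from residuals.

```r
fit  <- lm(mpg ~ wt + hp, data = mtcars)
t.wt <- coef(summary(fit))["wt", "t value"]
dfe  <- df.residual(fit)            # n - p

r1 <- t.wt / sqrt(t.wt^2 + dfe)     # partial correlation from t

# the same quantity from residuals of the two auxiliary regressions
r2 <- cor(resid(lm(mpg ~ hp, data = mtcars)),
          resid(lm(wt  ~ hp, data = mtcars)))
```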

Why one would want to do this, however, is another matter. What would one do
with a matrix of 2-sided p-values?

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of sp219
 Sent: Wednesday, October 26, 2005 5:09 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Help: partial.cor significance test
 
 Hi,
 I have been using the partial.cor function in Rcmdr but I was 
 wondering if there is any easy way to get statistical 
 significance tests (two tailed) along with the partial 
 correlation coefficients?
 Simon Pickett
 
 Simon Pickett
 Centre for Ecology and Conservation Biology University of 
 Exeter in Cornwall Tremough Campus Penryn Cornwall TR10 9EZ UK
 Tel: 01326371852
 



Re: [R] making an indicator variable

2005-10-25 Thread John Fox
Dear Jen,

There are lots of different ways to do what you want -- you've already had
two suggestions -- but you might consider whether you really need to do it.
In particular, R will generate its own indicator variables (and other kinds
of contrasts) in linear and other statistical models (see Section 11 of the
Introduction to R manual that comes with R). It's hard to give really good
advice without knowing more about the context.
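For the simple 1-and-2 recoding itself, any of the following sketches works; model.matrix() shows the indicator column R itself would build from a factor:

```r
x  <- c(1, 2, 2, 1, 2)
x1 <- x - 1                # arithmetic recode to 0/1
x2 <- as.numeric(x == 2)   # logical test coerced to 0/1
x3 <- ifelse(x == 2, 1, 0) # explicit recode

# the indicator R generates internally for a factor:
f <- factor(x)
d <- model.matrix(~ f)[, -1]  # drop the intercept column
```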

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Jennifer Dillon
 Sent: Tuesday, October 25, 2005 2:52 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] making an indicator variable
 
 Hello,
 
 I am almost a total novice, and I am sure there must be an easy (and
 basic) way to turn a variable of 1's and 2's into a variable 
 of zeros and ones.  This is in a data frame, but if I could 
 do it with vectors, that's all I need.
 
 Can someone tell me how?
 
 Thanks so much,
 
 Jen
 



Re: [R] String manipulation

2005-10-20 Thread John Fox
Dear Luis,

How about gsub("[0-9]", "", x)? This assumes that x contains the character
data and not a factor, as would usually be the case in a data frame. If the
variable is really a factor, then use as.character(x) in the call to gsub().
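Stripping the digits yields a grouping variable that tapply() can use directly (the values being averaged below are made up):

```r
x   <- c("II11", "II18", "III1", "VII14", "VII4")
y   <- c(10, 20, 5, 7, 9)        # invented values to average
grp <- gsub("[0-9]", "", x)      # keep only the roman-numeral part
tapply(y, grp, mean)             # group means by roman prefix
```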

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Luis Ridao Cruz
 Sent: Thursday, October 20, 2005 8:24 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] String manipulation
 
 R-help,
 
 I have a data frame which contains a character string column 
 that is something like;
 
 II11
 II18
 II23
 III1
 III13
 III16
 III19
 III2
 III7
 IV10
 IV11
 IV12
 IX16
 IX4
 V12
 V18
 V2
 V20
 V23
 V4
 VII14
 VII18
 VII21
 VII26
 VII28
 VII33
 VII4
 VII48
 VII5
 
 
 
 
 I want to apply a function (e.g mean) by grouping according 
 to the roman part of the string, i.e,
 
 by I
 by V
 by VII
 ...
 ...
 and so on.
 
 I have looked at string manipulation functions (grep, 
 pmatch,,,) but I can't really get it the way I want.
 Can anyone help?
 
 Thanks in advance.
 



Re: [R] spliting an integer

2005-10-20 Thread John Fox
Dear Sundar and Dimitri,

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Sundar 
 Dorai-Raj
 Sent: Thursday, October 20, 2005 3:50 PM
 To: Dimitri Szerman
 Cc: R-Help
 Subject: Re: [R] spliting an integer
 
 
 
 Dimitri Szerman wrote:
  Hi there,
  
 From the vector X of integers,
  
  X = c(11999, 122000, 81997)
  
  I would like to make these two vectors:
  
  Z= c(1999, 2000, 1997)
  Y =c(1 , 12 , 8)
  
  That is, each entry of vector Z receives the four last 
 digits of each entry of X, and Y receives the rest.
  
  Any suggestions?
  
  Thanks in advance,
  
  Dimitri
  [[alternative HTML version deleted]]
 
 Try:
 
  X <- c(11999, 122000, 81997)
  Y <- X %/% 10000
  Z <- X - Y * 10000
 

Or even

  X %% 10000
[1] 1999 2000 1997
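In both cases the divisor for the last four digits is 10000; a quick check against the desired output:

```r
X <- c(11999, 122000, 81997)
Z <- X %% 10000    # last four digits
Y <- X %/% 10000   # everything before them
```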

Regards,
 John

 See ?Arithmetic for more details.
 
 HTH,
 
 --sundar
 



Re: [R] how to import such data to R?

2005-10-17 Thread John Fox
Dear ronggui,

I didn't find any attachments, but using the data lines in your
message, and assuming that . represents missing data, the following
appears to do what you want:

as.data.frame(scan("c:/temp/ronggui.txt",
    list(year=1, apps=1, top25=1, ver500=1,
         mth500=1, stufac=1, bowl=1, btitle=1, finfour=1, lapps=1,
         d93=1,
         avg500=1, cfinfour=1, clapps=1, cstufac=1, cbowl=1, cavg500=1,
         cbtitle=1, lapps_1=1, school="", ctop25=1, bball=1, cbball=1),
    na.strings="."))

See ?scan for details.
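The same pattern can be tried on a couple of inline records via a text connection (the field names here are abbreviated placeholders, and "." marks a missing value):

```r
txt <- "1992 6245 49 . alabama
1993 7677 58 15 alabama"

# scan() fills the list fields cyclically, so records may span lines
rec <- scan(textConnection(txt),
            what = list(year = 0, apps = 0, top25 = 0, stufac = 0,
                        school = ""),
            na.strings = ".", quiet = TRUE)
d <- as.data.frame(rec, stringsAsFactors = FALSE)
```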

I hope this helps,
 John

On Sat, 15 Oct 2005 15:57:42 +0800
 ronggui [EMAIL PROTECTED] wrote:
 the data file has such structure:
 
  1992   6245 49  .  . 20
  1
 0  0   8.739536  0  .  .
  .
 .  .  .  .  .
alabama
 .  0  .
  1993   7677 58  .  . 15
  1
 0  0   8.945984  1  .  0
   .2064476
-5  0  .  0   8.739536
alabama
 9  0  0
  1992  13327 57 36 58 16
  0
 0  0   9.497547  0 47  .
  .
 .  .  .  0  .
arizona
 .  0  .
  1993  19860 57 36 58 16
  1
 1  0   9.896463  1 47  0
   .3989162
 0  1  0  1   9.497547
arizona
 0  1  1
  1992  10422 37 28 58 20
  0
 0  0   9.251675  0 43  .
  .
 .  .  . -1  .  arizona
 state
 .  0  .
 
 --snip-
 
 the data descriptions is:
 
 variable names:
 
 year  apps  top25 ver500mth500stufacbowl
  btitle   
 finfour   lapps d93   avg500cfinfour  clappscstufac
   cbowl
 cavg500   cbtitle   lapps_1   schoolctop25bball cbball

 
   Obs:   118
 
   1. year 1992 or 1993
   2. apps # applics for admission
   3. top25perc frosh class in 25th high sch
 percen
   4. ver500   perc frosh = 500 on verbal SAT
   5. mth500   perc frosh = 500 on math SAT
   6. stufac   student-faculty ratio
   7. bowl = 1 if bowl game in prev year
   8. btitle   = 1 if men's cnf chmps prev year
   9. finfour  = 1 if men's final 4 prev year
  10. lappslog(apps)
  11. d93  =1 if year = 1993
  12. avg500   (ver500+mth500)/2
  13. cfinfour change in finfour
  14. clapps   change in lapps
  15. cstufac  change in stufac
  16. cbowlchange in bowl
  17. cavg500  change in avg500
  18. cbtitle  change in btitle
  19. lapps_1  lapps lagged
  20. school   university name
  21. ctop25   change in top25
  22. bball=1 if btitle or finfour
  23. cbball   change in bball
 
 
 so the each four lines represent  one case,can some variables are
 numeric and some are character.
 I though the scan can read it in ,but it seems somewhat tricky as the
 mixed type of variables.any suggestions?
 
 the attachmen is the raw data and the description of the data.
 
 
 
 2005-10-15
 
 --
 Deparment of Sociology
 Fudan University
 
 My new mail addres is [EMAIL PROTECTED]
 Blog:http://sociology.yculblog.com


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/



Re: [R] asking the user for data

2005-10-16 Thread John Fox
Dear Iain,

There's an ask() function in the sm package that does what you want, but
you'll have to compose your message properly: ask(paste("Please enter the z
value for", x)).

Alternatively, eval(parse(text=readline(paste("Please enter the z value for
", x, ": ", sep="")))) will do what you want, as would
as.numeric(readline(paste("Please enter the z value for ", x, ": ",
sep=""))).
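readline() only works in an interactive session, but the conversion step can be checked on a fixed string standing in for the user's reply:

```r
reply <- "1.96"                 # stands in for readline()'s return value
z1 <- as.numeric(reply)         # simple numeric conversion
z2 <- eval(parse(text = reply)) # parse-and-evaluate alternative
```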

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Iain Gallagher
 Sent: Sunday, October 16, 2005 7:34 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] asking the user for data
 
 Hello everyone.
 
 How do I get R to ask users for data to be entered? 
 Specifically I want to ask for a z score to be entered (the 
 user would look this up in a
 table) and then use the entered data to compute a Dunn's 
 post-hoc test (post kruskal.test).
 
 I've tried the ask function but it's not recognised - maybe 
 I don't have to appropriate libary installed. A pointer to 
 the right one would be appreciated.
 
 e.g.  z <- ask(message="Please enter the z value for x")
 
 Any help would be gratefully received.
 
 Thanks
 
 Iain Gallagher
 Institiute for Infection and Immunology Research Edinburgh University
 



Re: [R] SEM with dichotomous indicators

2005-10-16 Thread John Fox
Dear Renata,

You can do this by combining the sem() and boot.sem() functions in the
sem package with hetcor() in the polycor package; boot.sem() computes
standard errors for parameter estimates by bootstrapping.

See ?boot.sem for an example of a confirmatory factor-analysis model
with ordinal indicators. See ?sem for many examples of
structural-equation models with latent variables. Just combine the two.
Note that a dichotomous indicator is in effect a two-category ordinal
variable and may be handled as such. For example, the polychoric
correlation between two dichtomous variables is just their tetrachoric
correlation.

I hope this helps,
 John 

On Mon, 10 Oct 2005 11:23:25 -0300
 Renata [EMAIL PROTECTED] wrote:
 Hello,
 
  
 
 I'd like to know if there is a way to fit a Structural equation model
 with
 dichotomous indicators (ex: problem with a phone solved/ or not)
 having
 effects on a ordinal variable.
 
 How I do that using R?
 
 Do you have an example with the code in R that you can send to me?
 
  
 
 Thanks a lot! 
 
 Renata Estrella
 
 UFRJ, Brasil, Rio de Janeiro
 
  
 
  
 
 Renata Leite Estrella
 
 Assistente de Pesquisa
 
 (21) 3978- Ramal  8841
 
 www.lipe.com.br
 
  
 
 
   [[alternative HTML version deleted]]
 


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/



Re: [R] zip package

2005-10-14 Thread John Fox
Dear Sara,

It looks to me as if there are three problems here: (1) Is the zip file for
the package really at c:/ProgramFiles/R/rw2011/library/lpSolve_1.1.9? That
is, isn't there a space in Program Files? (2) You have to specify
repos=NULL to install from a local zip file, as ?install.packages tells you.
(3) You don't seem to be using the destdir argument correctly; you can omit
it.

Why not avoid all this and just use the R for Windows menus: Packages ->
Install package(s) from local zip files?

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Sara Mouro
 Sent: Friday, October 14, 2005 4:45 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] zip package
 
   Dear all
 
   I can not understand how to install the package 
 lpsolve_1.1.9.zip
 
   I have read the FAQ and the help pages carefully, but 
 it still not clear for me.
 
   I have tried the following (and obtained the respective error
 messages):
 
   
  install.packages("c:/ProgramFiles/R/rw2011/library/lpSolve_1.1.9",
    destdir="c:/ProgramFiles/R/rw2011/library/lpSolve")
   Mensagem de aviso:
   no package 'c:/Program
 Files/R/rw2011/library/lpSolve_1.1.9' at the repositories in:
 download.packages(pkgs, destdir = tmpd, available = available,  
 
   
  install.packages("lpSolve_1.1.9",
    destdir="c:/ProgramFiles/R/rw2011/library/lpSolve")
   Erro em download.packages(pkgs, destdir = tmpd, 
 available = available,  : 
   'destdir' is not a directory
 
  install.packages("lpSolve_1.1.9", destdir=NULL)
   Mensagem de aviso:
   no package 'lpSolve_1.1.9' at the 
 repositories in:
 download.packages(pkgs, destdir = tmpd, available = available,  
 
   Could you please tell me how to do that or when can I 
 find a simple example of that kind of installation?
 
   Thank you so much and sorry for such basic question.
 
   Sara Mouro
 
   [[alternative HTML version deleted]]
 



Re: [R] R/S-Plus equivalent to Genstat predict: predictions over averages of covariates

2005-10-06 Thread John Fox
Dear Peter,

See the effects package, described in
http://www.jstatsoft.org/counter.php?id=75&url=v08/i15/effect-displays-revised.pdf.
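A base-R sketch of the underlying computation (the built-in mtcars data is an illustrative stand-in): hold the covariate at a chosen value and average the level-specific predictions over the observed distribution of the factor.

```r
d   <- transform(mtcars, cyl = factor(cyl))
fit <- lm(mpg ~ wt + cyl, data = d)

# one prediction per factor level, at wt = 3
nd <- data.frame(wt = 3,
                 cyl = factor(levels(d$cyl), levels = levels(d$cyl)))
p  <- predict(fit, nd)

# weight the level predictions by the observed level proportions
w  <- prop.table(table(d$cyl))
wm <- sum(p * w)   # fitted mean at wt = 3, averaged over cyl
```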

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dunn
 Sent: Wednesday, October 05, 2005 9:06 PM
 To: R-help mailing list
 Subject: [R] R/S-Plus equivalent to Genstat predict: 
 predictions over averages of covariates
 
 Hi all
 
 I'm doing some things with a colleague comparing different 
 sorts of models.  My colleague has fitted a number of glms in 
 Genstat (which I have never used), while the glm I have been 
 using is only available for R.
 
 He has a spreadsheet of fitted means from each of his models 
 obtained from using the Genstat predict function.  For 
 example, suppose we fit the model of the type
  glm.out <- glm(y ~ factor(F1) + factor(F2) + X1 + poly(X2,2) +
 poly(X3,2), family=...)
 
 Then he produces a table like this (made up, but similar):
 
  F1(level1)  12.2
  F1(level2)  14.2
  F1(level3)  15.3
  F2(level1)  10.3
  F2(level2)   9.1
  X1=0        10.2
  X1=0.5      10.4
  X1=1        10.4
  X1=1.5      10.5
  X1=2        10.9
  X1=2.5      11.9
  X1=3        11.8
  X2=0        12.0
  X2=0.5      12.2
  X2=1        12.5
  X2=1.5      12.9
  X2=2        13.0
  X2=2.5      13.1
  X2=3        13.5
 
 Each of the numbers are a predicted mean.  So when X1=0, on 
 average we predict an outcome of 10.2.
 
 To obtain these figures in Genstat, he uses the Genstat predict
 function.  When I asked for an explanation of how it was done 
 (ie to make the predictions, what values of the other 
 covariates were used) I was told:
 
  So, for a one-dimensional table of fitted means for any factor (or 
  variate), all other variates are set to their average 
 values; and the 
  factor constants (including the first, at zero) are given a 
 weighted 
  average depending on their respective numbers of observations.
 
 So for quantitative variables (such as pH), one uses the mean 
 pH in the data set when making the predictions.  Reasonable and easy.
 
 But for categorical variables (like Month), he implies we use 
 a weighted average of the fitted coefficients for all the 
 months, depending on the proportion of times those factor 
 levels appear in the data.
 
 (I hope I explained that OK...)
 
 Is there an equivalent way in R or S-Plus of doing this?  I 
 have to do it for a number of sites and species, so an 
 automated way would be useful.  I have tried searching to no 
 avail (but may not be searching on the correct terms), and 
 tried hard-coding something myself as yet unsuccessfully:  
 The  poly  terms and the use of the weighted averaging over 
 the factor levels are proving a bit too much for my limited skills.
 
 Any assistance appreciated.  (Any clarification of what I 
 mean can be provided if I have not been clear.)
 
 Thanks, as always.
 
 P.
 
   version
   _
 platform i386-pc-linux-gnu
 arch i386
 os   linux-gnu
 system   i386, linux-gnu
 status
 major    2
 minor    1.0
 year 2005
 month04
 day  18
 language R
  
 
 
 
 --
 Dr Peter Dunn  |  Senior Lecturer in Statistics Faculty of 
 Sciences, University of Southern Queensland
Web:http://www.sci.usq.edu.au/staff/dunn
Email:  dunn at usq.edu.au
 CRICOS:  QLD 00244B |  NSW 02225M |  VIC 02387D |  WA 02521C
 



Re: [R] data.frame error using sem package

2005-10-06 Thread John Fox
Dear Suzanne,

Take a look at your model specification:

 p.ram
      [,1]              [,2]              [,3]
 [1,] "LM1 -> LSMA"     "LM1 -> LSMA"     NA
 [2,] "LSMA -> RSMA"    "LSMA -> RSMA"    NA
 [3,] "RSMA -> RM1"     "RSMA -> RM1"     NA
 [4,] "LSMA <-> LSMA"   "LSMA <-> LSMA"   NA
 [5,] "RSMA <-> RSMA"   "RSMA <-> RSMA"   NA
 [6,] "RM1 <-> RM1"     "RM1 <-> RM1"     NA
 

This matrix should have three columns, the first giving the path, the second
the name of the corresponding parameter, and the third the start value for
the parameter (or NA if you want sem() to compute a start value). You've
apparently left out the parameter names. Please see the sem examples for
details and the paper at
http://socserv.socsci.mcmaster.ca/jfox/sem-package.pdf.
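For reference, a three-column specification has the shape below (path, then parameter name, then start value); the parameter names b1, v.lsma, etc. are invented placeholders, and sem() itself is not needed to check the form:

```r
p.ram <- matrix(c("LM1 -> LSMA",   "b1",     NA,
                  "LSMA -> RSMA",  "b2",     NA,
                  "RSMA -> RM1",   "b3",     NA,
                  "LSMA <-> LSMA", "v.lsma", NA,
                  "RSMA <-> RSMA", "v.rsma", NA,
                  "RM1 <-> RM1",   "v.rm1",  NA),
                ncol = 3, byrow = TRUE)
```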

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Suzanne Witt
 Sent: Thursday, October 06, 2005 2:55 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] data.frame error using sem package
 
 I am trying to use sem to measure effective connectivity 
 among four brain regions.  I have pasted the code that I am 
 trying to run since that seems easier than trying to come up 
 with another example.
 
 The input data is time series data taken from SPM; they are each
 1x121 columns of numbers.  I get the error either when I 
 source the whole code, or if I enter it line by line when I 
 go to get the summary.
 
 Thanks,
 
 Suzanne
 
 library(sem)
 
 # Load the region timecourses.
 
  lsma1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_LSMA.dat")
  rsma1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_RSMA.dat")
  lmc1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_LM1.dat")
  rmc1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_RM1.dat")
 
 # Combine all the timecourses from each session into a single 
 data frame and name the columns appropriately.
 
  lcomf <- cbind(lsma1, rsma1, lmc1, rmc1)
  names(lcomf) <- c("LSMA", "RSMA", "LM1", "RM1")
 
 # Type this at the command line to see a summary of your data
 
 str(lcomf)
 
 # Set up the structural equation model.
 
  p.ram <- matrix(c('LM1 -> LSMA',   'LM1 -> LSMA',   NA,
                    'LSMA -> RSMA',  'LSMA -> RSMA',  NA,
                    'RSMA -> RM1',   'RSMA -> RM1',   NA,
                    'LSMA <-> LSMA', 'LSMA <-> LSMA', NA,
                    'RSMA <-> RSMA', 'RSMA <-> RSMA', NA,
                    'RM1 <-> RM1',   'RM1 <-> RM1',   NA),
                  ncol = 3, byrow = TRUE)
 
 # Tell which variables are exogenous ('fixed').
 
  p.fixed <- c('LM1')
 
 # Do the fitting for session 1.
 
  C <- cor(lcomf)
  nt <- dim(lcomf)[1]
 #attach(lcomf)
  lcomf.results <- sem(p.ram, C, nt, obs.variables = 
 rownames(C), fixed.x = p.fixed)
 
 # Check out the results using the summary function
 
 summary(lcomf.results)
 
 
   [[alternative HTML version deleted]]
 



Re: [R] Rcmdr and scatter3d

2005-10-05 Thread John Fox
Dear Ted,

I assumed that since Naiara was using scatter3d(), he wants a 3D dynamic
scatterplot. He could add points (actually, spheres) to the rgl graph
produced by scatter3d() -- the analog of plot() followed by points() for a
2D graph -- but doing so would be much more work than plotting by groups.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: ecatchpole [mailto:[EMAIL PROTECTED] 
 Sent: Tuesday, October 04, 2005 8:55 PM
 To: John Fox
 Cc: 'Naiara S. Pinto'; r-help@stat.math.ethz.ch
 Subject: Re: [R] Rcmdr and scatter3d
 
 Naiara,
 
 Alternatively, instead of scatter3d, the analogy to hold on 
 in Matlab is to use plot() for the first set of data, then 
 points() for the remainder. See
 
 ?plot
 ?points
 
 Ted.
 
 On 05/10/05 11:18,  John Fox wrote,:
  Dear Naiara,
  
  Combine the data sets and differentiate among them with a 
 factor. Then 
  use the groups argument to scatter3d (see ?scatter3d). If 
 you're using 
  the R Commander to make the plot, the 3D scatterplot dialog 
 box has a 
  plot by groups button. You can also fit colour-coded 
 regression surfaces by group.
  
  I've appended a new version of the scatter3d function, not 
 yet in the 
  Rcmdr package, which will also plot data ellipsoids (for the whole 
  data set or by groups).
  
  I hope this helps,
   John
  
  --- snip --
 
  
  John Fox
  Department of Sociology
  McMaster University
  Hamilton, Ontario
  Canada L8S 4M4
  905-525-9140x23604
  http://socserv.mcmaster.ca/jfox
  
  
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Naiara S. Pinto
 Sent: Tuesday, October 04, 2005 6:13 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Rcmdr and scatter3d
 
 Hi folks,
 
 I'd like to use scatter3d (which is in R commander) to plot 
 more than 
 one dataset in the same graph, each dataset with a different color. 
  The kind of stuff you would do with "hold on"
  in Matlab.
 
 I read a recent message that was posted to this list with a similar 
 problem, but I couldn't understand the reply. Could someone give me 
 one example? How do you plot subgroups using scatter3d?
 
 Thanks a lot!
 
 Naiara.
 
 
 
 Naiara S. Pinto
 Ecology, Evolution and Behavior
 1 University Station A6700
 Austin, TX, 78712
 
  
 
 
 --
 Dr E.A. Catchpole
 Visiting Fellow
 Univ of New South Wales at ADFA, Canberra, Australia and 
 University of Kent, Canterbury, England
 - www.ma.adfa.edu.au/~eac
 - fax: +61 2 6268 8786
 - ph:  +61 2 6268 8895



Re: [R] testing non-linear component in mgcv:gam

2005-10-05 Thread John Fox
Dear Denis,

Take a closer look at the anova table: The models provide identical fits to
the data. The differences in degrees of freedom and deviance between the two
models are essentially zero, 5.5554e-10 and 2.353e-11 respectively.
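The same point can be reproduced in miniature with lm(): comparing two parameterizations of one and the same linear fit yields a change in df and deviance that is zero up to rounding error.

```r
set.seed(1)
x <- 1:20
y <- 2 + 3 * x + rnorm(20)

m1 <- lm(y ~ x)           # raw parameterization
m2 <- lm(y ~ poly(x, 1))  # orthogonal-polynomial parameterization
a  <- anova(m1, m2)       # Df and Sum of Sq differences are ~0
```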

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot
 Sent: Wednesday, October 05, 2005 8:22 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] testing non-linear component in mgcv:gam
 
 Hi,
 
 I need further help with my GAMs. Most models I test are very 
 obviously non-linear. Yet, to be on the safe side, I report 
 the significance of the smooth (default output of mgcv's 
 summary.gam) and confirm it deviates significantly from linearity.
 
 I do the latter by fitting a second model where the same 
 predictor is entered without the s(), and then use anova.gam 
 to compare the two. I thought this was the equivalent of the 
 default output of anova.gam using package gam instead of mgcv.
 
 I wonder if this procedure is correct because one of my 
 models appears to be linear. In fact mgcv estimates df to be 
 exactly 1.0 so I could have stopped there. However I 
 inadvertently repeated the procedure outlined above. I would 
 have thought in this case the anova.gam comparing the smooth 
 and the linear fit would for sure have been not significant. 
 To my surprise, P was 6.18e-09!
 
 Am I doing something wrong when I attempt to confirm the non- 
 parametric part a smoother is significant? Here is my example 
 case where the relationship does appear to be linear:
 
 library(mgcv)
  This is mgcv 1.3-7
 Temp <- c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 
 0.38, 0.62, 0.88, 1.12,
 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12, 
 3.38, 3.62, 3.88,
 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62, 5.88, 
 6.12, 6.38, 6.62, 6.88,
 7.12, 8.38, 13.62)
 N.sets <- c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 
 29, 31, 22, 26, 24, 23,
  15, 25, 24, 27, 19, 26, 24, 22, 13, 10, 2, 5, 3, 
 1, 1, 1, 1, 1)
 wm.sed <- c(0.0, 0.016129032, 
 0.0, 0.062046512, 0.396459596, 0.189082949,
  0.054757925, 0.142810440, 0.168005168, 
 0.180804428, 0.111439628, 0.128799505,
  0.193707937, 0.105921610, 0.103497845, 
 0.028591837, 0.217894389, 0.020535469,
  0.080389068, 0.105234450, 0.070213450, 
 0.050771363, 0.042074434, 0.102348837,
  0.049748344, 0.019100478, 0.005203125, 
 0.101711864, 0.0, 0.0,
  0.014808824, 0.0, 0.22200, 
 0.16700, 0.0, 0.0,
  0.0)
 
 sed.gam <- gam(wm.sed ~ s(Temp), weight=N.sets)
 summary.gam(sed.gam)
  Family: gaussian
  Link function: identity
 
  Formula:
  wm.sed ~ s(Temp)
 
  Parametric coefficients:
   Estimate Std. Error t value Pr(>|t|)
   (Intercept)  0.08403    0.01347   6.241 3.73e-07 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 
  Approximate significance of smooth terms:
           edf Est.rank     F  p-value
   s(Temp)   1        1 13.95 0.000666 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 
  R-sq.(adj) =  0.554   Deviance explained = 28.5%
  GCV score = 0.09904   Scale est. = 0.093686  n = 37
 
 # testing non-linear contribution
 sed.lin <- gam(wm.sed ~ Temp, weight=N.sets)
 summary.gam(sed.lin)
  Family: gaussian
  Link function: identity
 
  Formula:
  wm.sed ~ Temp
 
  Parametric coefficients:
    Estimate Std. Error t value Pr(>|t|)
  (Intercept)  0.162879   0.019847   8.207 1.14e-09 ***
  Temp-0.023792   0.006369  -3.736 0.000666 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 
 
  R-sq.(adj) =  0.554   Deviance explained = 28.5%
  GCV score = 0.09904   Scale est. = 0.093686  n = 37
anova.gam(sed.lin, sed.gam, test="F")
  Analysis of Deviance Table
 
  Model 1: wm.sed ~ Temp
  Model 2: wm.sed ~ s(Temp)
     Resid. Df Resid. Dev         Df  Deviance      F   Pr(>F)
   1 3.5000e+01      3.279
   2 3.5000e+01      3.279 5.5554e-10 2.353e-11 0.4521 6.18e-09 ***
 
 
 Thanks in advance,
 
 
 Denis Chabot
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] transparent surface in rgl

2005-10-05 Thread John Fox
Dear Paul,

I don't have experience with rgl.postscript(), which is relatively new, but
find that the png graphs produced by rgl.snapshot() are of reasonably good
quality and preserve transparency. Perhaps the developers of the rgl package
can shed more light on the matter.

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Prof. 
 Paul R. Fisher
 Sent: Wednesday, October 05, 2005 8:32 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] transparent surface in rgl
 
 Hi all
 I am a complete newbie to this list (just subscribed) and a 
 newcomer to R (an S user from olden times). I have been using 
 scatter3d to create a 3d scatter plot with surface. The 
 graphic is created within the rgl package and I have used 
 rgl.postscript to export it so I can generate a publication 
 quality image. My problem is that the plotted surface is no 
 longer transparent in the postscript output ie. the 
 rgl.spheres that are behind the surface disappear in the 
 postscript image. Can't seem to find any info on this 
 anywhere. Am I doing something wrong? Is there an easy fix?
 
 Anyway, thanks.
 Hope I've not broken some netiquette rule sending this.
 
 Cheers,
 Paul Fisher.
 --
 Prof. Paul R. Fisher,
 Chair in Microbiology,
 La Trobe University,
 VIC 3086,
 AUSTRALIA.
 
 Tel. + 61 3 9479 2229
 Fax. + 61 3 9479 1222
 Email. [EMAIL PROTECTED]
 Web. http://www.latrobe.edu.au/mcbg/my.html
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] testing non-linear component in mgcv:gam

2005-10-05 Thread John Fox
Dear Denis,

The chi-square test is formed in analogy to what's done for a GLM: The
difference in residual deviance for the nested models is divided by the
estimated scale parameter -- i.e., the estimated error variance for a model
with normal errors. Otherwise, as you point out, the test would be dependent
upon the scale of the response.
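A minimal sketch of that computation on simulated data (the sig2 and edf
components are those of mgcv gam objects; the data and model names here are
invented, not the poster's):

```r
library(mgcv)
set.seed(1)
x <- runif(100)
y <- x + rnorm(100, sd = 0.1)
m.lin <- gam(y ~ x)      # purely parametric fit
m.smo <- gam(y ~ s(x))   # fit with a smooth term
# F in analogy to a GLM: deviance difference over the difference in
# (effective) degrees of freedom, divided by the scale estimate
df.diff <- sum(m.smo$edf) - sum(m.lin$edf)
F.stat  <- ((deviance(m.lin) - deviance(m.smo)) / df.diff) / m.smo$sig2
```

anova.gam(m.lin, m.smo, test="F") carries out essentially this computation.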

John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot
 Sent: Wednesday, October 05, 2005 9:04 AM
 To: John Fox
 Cc: R list
 Subject: Re: [R] testing non-linear component in mgcv:gam
 
 Hi John,
 
 On 2005-10-05, at 09:45, John Fox wrote:
 
  Dear Denis,
 
  Take a closer look at the anova table: The models provide identical 
  fits to the data. The differences in degrees of freedom and 
 deviance 
  between the two models are essentially zero, 5.5554e-10 and 
 2.353e-11 
  respectively.
 
  I hope this helps,
   John
 This is one of my difficulties. In some examples I found on 
 the web, the difference in deviance is compared directly 
 against the chi-squared distribution. But my y variable has 
 a very small range (between 0 and 0.5, most of the time) so 
 the difference in deviance is always very small and if I 
 compared it against the chi-squared distribution as I have 
 seen done in examples, the non-linear component would always 
 be not significant. Yet it is (with one exception), tested 
 with both mgcv:gam and gam:gam. I think the examples I have 
 read were wrong in this regard, the scale factor seen in 
 mgcv output seems to intervene. But exactly how is still 
 mysterious to me and I hesitate to judge the size of the 
 deviance difference myself.
 
 I agree it is near zero in my example. I guess I need to have 
 more experience with these models to better interpret the output...
 
 Denis
 
 
  -Original Message-
  From: [EMAIL PROTECTED] 
  [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot
  Sent: Wednesday, October 05, 2005 8:22 AM
  To: r-help@stat.math.ethz.ch
  Subject: [R] testing non-linear component in mgcv:gam
 
  Hi,
 
  I need further help with my GAMs. Most models I test are very 
  obviously non-linear. Yet, to be on the safe side, I report the 
  significance of the smooth (default output of mgcv's
  summary.gam) and confirm it deviates significantly from linearity.
 
  I do the latter by fitting a second model where the same 
 predictor is 
  entered without the s(), and then use anova.gam to compare 
 the two. I 
  thought this was the equivalent of the default output of anova.gam 
  using package gam instead of mgcv.
 
  I wonder if this procedure is correct because one of my models 
  appears to be linear. In fact mgcv estimates df to be 
 exactly 1.0 so 
  I could have stopped there. However I inadvertently repeated the 
  procedure outlined above. I would have thought in this case the 
  anova.gam comparing the smooth and the linear fit would 
 for sure have 
  been not significant.
  To my surprise, P was 6.18e-09!
 
  Am I doing something wrong when I attempt to confirm that the 
  non-parametric part of a smoother is significant? Here is my example case 
  where the relationship does appear to be linear:
 
  library(mgcv)
 
  This is mgcv 1.3-7
 
  Temp <- c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62,
            0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12,
            3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62,
            5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62)
  N.sets <- c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31,
              22, 26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10,
              2, 5, 3, 1, 1, 1, 1, 1)
  wm.sed <- c(0.0, 0.016129032, 0.0, 0.062046512, 0.396459596,
              0.189082949, 0.054757925, 0.142810440, 0.168005168,
              0.180804428, 0.111439628, 0.128799505, 0.193707937,
              0.105921610, 0.103497845, 0.028591837, 0.217894389,
              0.020535469, 0.080389068, 0.105234450, 0.070213450,
              0.050771363, 0.042074434, 0.102348837, 0.049748344,
              0.019100478, 0.005203125, 0.101711864, 0.0, 0.0,
              0.014808824, 0.0, 0.22200, 0.16700, 0.0, 0.0, 0.0)
 
  sed.gam <- gam(wm.sed ~ s(Temp), weight = N.sets)
  summary.gam(sed.gam)
 
  Family: gaussian
  Link function: identity
 
  Formula:
  wm.sed ~ s(Temp)
 
  Parametric coefficients:
                Estimate Std. Error t value Pr(>|t|)
   (Intercept)  0.08403    0.01347   6.241 3.73e-07 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 
  Approximate significance of smooth terms:
           edf Est.rank     F  p-value
   s(Temp)   1        1 13.95 0.000666

Re: [R] testing non-linear component in mgcv:gam

2005-10-05 Thread John Fox
Dear Denis,

You got me: I would have thought from ?summary.gam that this would be the
same as the adjusted R^2 for a linear model. Note, however, that the
percentage of deviance explained checks with the R^2 from the linear model,
as expected.

Maybe you should address this question to the package author.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: Denis Chabot [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, October 05, 2005 3:33 PM
 To: John Fox
 Cc: R list
 Subject: Re: [R] testing non-linear component in mgcv:gam
 
 Thank you everyone for your help, but my introduction to GAM 
 is turning my brain to mush. I thought the one part of the 
 output I understood the best was r-sq (adj), but now even 
 this is becoming foggy.
 
 In my original message I mentioned a gam fit that turns out 
 to be a linear fit. By curiosity I analysed it with a linear 
 predictor only with mgcv package, and then as a linear model. 
 The output was identical in both, but the r-sq (adj) was 0.55 
 in mgcv and 0.26 in lm. In lm I hope that my interpretation 
 that 26% of the variance in y is explained by the linear 
 relationship with x is valid. Then what does r2 mean in mgcv?
 
 Denis
   summary.gam(lin)
 
 Family: gaussian
 Link function: identity
 
 Formula:
 wm.sed ~ Temp
 
 Parametric coefficients:
                Estimate Std. Error t value Pr(>|t|)
  (Intercept)  0.162879   0.019847   8.207 1.14e-09 ***
  Temp        -0.023792   0.006369  -3.736 0.000666 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 
 
 R-sq.(adj) =  0.554   Deviance explained = 28.5%
 GCV score = 0.09904   Scale est. = 0.093686  n = 37
 
 
   summary(sed.true.lin)
 
 Call:
 lm(formula = wm.sed ~ Temp, weights = N.sets)
 
 Residuals:
  Min  1Q  Median  3Q Max
 -0.6138 -0.1312 -0.0325  0.1089  1.1449
 
 Coefficients:
                Estimate Std. Error t value Pr(>|t|)
  (Intercept)  0.162879   0.019847   8.207 1.14e-09 ***
  Temp        -0.023792   0.006369  -3.736 0.000666 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 
 Residual standard error: 0.3061 on 35 degrees of freedom
 Multiple R-Squared: 0.285,Adjusted R-squared: 0.2646
 F-statistic: 13.95 on 1 and 35 DF,  p-value: 0.000666
 
 
  On 2005-10-05, at 09:45, John Fox wrote:
 
  Dear Denis,
 
  Take a closer look at the anova table: The models provide identical 
  fits to the data. The differences in degrees of freedom and 
 deviance 
  between the two models are essentially zero, 5.5554e-10 and 
 2.353e-11 
  respectively.
 
  I hope this helps,
   John
 
  
  John Fox
  Department of Sociology
  McMaster University
  Hamilton, Ontario
  Canada L8S 4M4
  905-525-9140x23604
  http://socserv.mcmaster.ca/jfox
  
 
 
  -Original Message-
  From: [EMAIL PROTECTED] 
  [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot
  Sent: Wednesday, October 05, 2005 8:22 AM
  To: r-help@stat.math.ethz.ch
  Subject: [R] testing non-linear component in mgcv:gam
 
  Hi,
 
  I need further help with my GAMs. Most models I test are very 
  obviously non-linear. Yet, to be on the safe side, I report the 
  significance of the smooth (default output of mgcv's
  summary.gam) and confirm it deviates significantly from linearity.
 
  I do the latter by fitting a second model where the same 
 predictor is 
  entered without the s(), and then use anova.gam to compare 
 the two. I 
  thought this was the equivalent of the default output of anova.gam 
  using package gam instead of mgcv.
 
  I wonder if this procedure is correct because one of my models 
  appears to be linear. In fact mgcv estimates df to be 
 exactly 1.0 so 
  I could have stopped there. However I inadvertently repeated the 
  procedure outlined above. I would have thought in this case the 
  anova.gam comparing the smooth and the linear fit would 
 for sure have 
  been not significant.
  To my surprise, P was 6.18e-09!
 
  Am I doing something wrong when I attempt to confirm the non- 
  parametric part a smoother is significant? Here is my example case 
  where the relationship does appear to be linear:
 
  library(mgcv)
 
  This is mgcv 1.3-7
 
  Temp <- c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62,
            0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12,
            3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62,
            5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62)
  N.sets <- c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31,
              22, 26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10,
              2, 5, 3, 1, 1, 1, 1, 1)
  wm.sed <- c(0.0, 0.016129032, 0.0, 0.062046512, 0.396459596, 0.189082949

Re: [R] Rcmdr and scatter3d

2005-10-04 Thread John Fox
(predict(mod, newdata=dat),
grid.lines, grid.lines)
if (fill) rgl.surface(vals, vals, yhat,
color=surface.col[j], 
alpha=0.5, lit=FALSE)
if (grid) rgl.surface(vals, vals, yhat,
color=if (fill) grid.col else surface.col[j],
alpha=0.5, lit=FALSE, front="lines", back="lines")
rgl.texts(0, predict(mod,
newdata=data.frame(x=0, z=0, groups=group)), 0,
paste(group, " "), adj=1, color=surface.col[j])
if (residuals){
yy <- y[select.obs]
xx <- x[select.obs]
zz <- z[select.obs]
fitted <- fitted(mod)
rgl.lines(as.vector(rbind(xx,xx)),
as.vector(rbind(yy,fitted)), 
as.vector(rbind(zz,zz)),
col=surface.col[j])
}
}
}
}
}
}
if (revolutions > 0) {
for (i in 1:revolutions){
for (angle in seq(1, 360, length=360/speed))
rgl.viewpoint(-angle, fov=fov)
}
}
if (model.summary) return(summaries) else return(invisible(NULL))
}
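With the function above sourced (and the rgl package installed), a call of
roughly the following form would draw each subgroup in its own color; the
use of the iris data and the exact argument names are illustrative
assumptions, not part of the original reply:

```r
library(rgl)
# hypothetical usage: Species serves as the grouping factor
with(iris, scatter3d(Sepal.Length, Petal.Length, Sepal.Width,
                     groups = Species))
```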



John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Naiara S. Pinto
 Sent: Tuesday, October 04, 2005 6:13 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Rcmdr and scatter3d
 
 Hi folks,
 
 I'd like to use scatter3d (which is in R commander) to plot 
 more than one dataset in the same graph, each dataset with a 
 different color. The kind of stuff you would do with "hold on" 
 in Matlab.
 
 I read a recent message that was posted to this list with a 
 similar problem, but I couldn't understand the reply. Could 
 someone give me one example? How do you plot subgroups using 
 scatter3d?
 
 Thanks a lot!
 
 Naiara.
 
 
 
 Naiara S. Pinto
 Ecology, Evolution and Behavior
 1 University Station A6700
 Austin, TX, 78712
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] plot question when type = b and pch is a vector

2005-10-02 Thread John Fox
Dear tokas,

How about:

x <- seq(0.01, 10, length = 20)
xx <- x[7]
x[7] <- NA

plot(c(0, 10), c(-20, 20), type = "n", xlab = "x", 
ylab = expression(2 * alpha * log(x)))
for(i in 1:4){
lines(x, 2 * i * log(x), lty = 1)
text(xx,  2 * i * log(xx), i) 
}

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of toka tokas
 Sent: Sunday, October 02, 2005 5:37 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] plot question when type = b and pch is a vector
 
 Dear R users,
 
 I've been struggling some days with the following
 problem: I'm interesting in producing the following plot
 
 x <- seq(0.01, 10, length = 20)
 
 plot(c(0, 10), c(-20, 20), type = "n", xlab = "x", 
 ylab = expression(2 * alpha * log(x)))
 
 pch. <- rep(NA, length(x))
 for(i in 1:4){
 pch.[7] <- as.character(i)
 lines(x, 2 * i * log(x), type = "b", pch = pch., lty = 1)
 }
 
 where all the line segments are connected, except from the 
 7th one where I've put the value of alpha -- in other words 
 I'd like to produce a line plot where the label appears at 
 each line with some white space around it.
 
 thanks in advance,
 tokas
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] generalized linear model for multinomial data?

2005-10-02 Thread John Fox
Dear Hongyu,

See multinom() in the nnet package, associated with Venables and Ripley's
Modern Applied Statistics with S.
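A minimal illustration of multinom() (the built-in iris data stand in here
for a nominal response; the real analysis would substitute the poster's
variables):

```r
library(nnet)  # multinom() lives here
fit <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris)
summary(fit)   # one row of coefficients per non-reference level
```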

John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Hongyu Sun
 Sent: Sunday, October 02, 2005 3:07 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] generalized linear model for multinomial data?
 
 Dear All:
 
 Does R have the package as in SAS's generalized logits model 
 for nominal response data? I have searched but cannot find 
 the windows package.
 
 Many thanks,
 
 HS
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Are least-squares means useful or appropriate?

2005-09-24 Thread John Fox
Dear Peter, Doug, and Felipe,

My effects package (on CRAN, also see the article at
http://www.jstatsoft.org/counter.php?id=75&url=v08/i15/effect-displays-revised.pdf)
will compute and graph adjusted effects of various kinds for linear
and generalized linear models -- generalizing so-called least-squares
means (or population marginal means or adjusted means).

A couple of comments: 

By default, the all.effects() function in the effects package computes
effects for high-order terms in the model, absorbing terms marginal to them.
You can ask the effect() function to compute an effect for a term that's
marginal to a higher-order term, and it will do so with a warning, but this
is rarely sensible.
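A sketch of the kind of call meant here (the model and data are invented
for illustration; the Prestige data frame is assumed available from the car
package):

```r
library(car)      # for the Prestige data
library(effects)
m <- lm(prestige ~ income*type + education, data = Prestige)
eff <- all.effects(m)  # effects for high-order terms, absorbing marginal ones
plot(eff)
```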

Peter's mention of floating variances (or quasi-variances) in this context
is interesting, but what I would most like to see, I think, are the
quasi-variances for the adjusted effects, that is for terms merged with
their lower-order relatives. These, for example, are unaffected by contrast
coding. How to define reasonable quasi-variances in this context has been
puzzling me for a while.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard
 Sent: Friday, September 23, 2005 10:23 AM
 To: Douglas Bates
 Cc: Felipe; R-help@stat.math.ethz.ch
 Subject: Re: [R] Are least-squares means useful or appropriate?
 
 Douglas Bates [EMAIL PROTECTED] writes:
 
  On 9/20/05, Felipe [EMAIL PROTECTED] wrote:
   -BEGIN PGP SIGNED MESSAGE-
   Hash: SHA1
  
   Hi.
   My question was just theoric. I was wondering if someone who were 
   using SAS and R could give me their opinion on the topic. I was 
   trying to use least-squares means for comparison in R, but then I 
   found some indications against them, and I wanted to know if they 
   had good basis (as I told earlier, they were not much detailed).
   Greetings.
  
   Felipe
  
  As Deepayan said in his reply, the concept of least squares 
 means is 
  associated with SAS and is not generally part of the theory 
 of linear 
  models in statistics.  My vague understanding of these (I 
 too am not a 
  SAS user) is that they are an attempt to estimate the 
 mean response 
  for a particular level of a factor in a model in which that 
 factor has 
  a non-ignorable interaction with another factor.  There is 
 no clearly 
  acceptable definition of such a thing.
 
 (PD goes and fetches the SAS manual)
 
 Well, yes. it'll do that too, although only if you ask for 
 the lsmeans of A when an interaction like A*B is present in 
 the model. This is related to the tests of main effects when 
 an interaction is present using type III sums of squares, 
 which has been beaten to death repeatedly on the list. In 
 both cases, there seems to be an implicit assumption that 
 categorical variables by nature come from an underlying 
 fully balanced design.
 
 If the interaction is absent from the model, the lsmeans are 
 somewhat more sensible in that they at least reproduce the 
 parameter estimates as contrasts between different groups. 
 All continuous variables in the design will be set to their 
 mean, but values for categorical design variables are 
 weighted inversely as the number of groups. So if you're 
 doing an lsmeans of lung function by smoking adjusted for age 
 and sex you get estimates for the mean of a population of 
 which everyone has the same age and half are male and half 
 are female. This makes some sense, but if you do it for sex 
 adjusting for smoking and age, you are not only forcing the 
 sexes to smoke equally much, but actually adjusting to  
 smoking rates of 50%, which could be quite far from reality. 
 
 The whole operation really seems to revolve around 2 things: 
 
 (1) pairwise comparisons between factor levels. This can alternatively
 be done fairly easily using parameter estimates for the relevant
 variable and associated covariances. You don't really need all the
 mumbo-jumbo of adjusting to particular values of other variables.
 
 (2) plotting effects of a factor with error bars as if they were
 simple group means. This has some merit since the standard
 parametrizations are misleading at times (e.g. if you choose the
 group with the least data as the reference level, std. err. for
 the other groups will seem high). However, it seems to me that
 concepts like floating variances (see float() in the Epi package)
 are more to the point.
 
  R is an interactive language where it is a simple matter to fit a 
  series of models and base your analysis on a model that is 
  appropriate.  An approach of give me the answer to any possible 
  question about this model, whether or not it make sense is 
  unnecessary.
  
  In many ways statistical

Re: [R] text editor TinR?

2005-09-19 Thread John Fox
Dear Mihai,

It's Tinn-R (with two n's), at http://www.sciviews.org/Tinn-R/.

John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Mihai Nica
 Sent: Monday, September 19, 2005 4:07 PM
 To: r-help@stat.math.ethz.ch
 Subject: Re: [R] text editor TinR?
 
 Greetings,
 
 Please help me remember the name of the tiny text editor 
 that works with R TinR maybe? I cannot find it at all, 
 and cannot remember it, it is really frustrating...
 
 Thanks,
 
 Mihai Nica, ABD
 Jackson State University
 170 East Griffith Street G5
 Jackson, MS 39201
 601 914 0361

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Rcommander and simple chisquare

2005-09-15 Thread John Fox
Dear Christian,

From the Rcmdr menus, select Statistics -> Summaries -> Frequency
distributions, and check the "Chisquare goodness of fit test" box in the
resulting dialog. This will bring up a dialog box where you can enter
hypothesized probabilities from which expected frequencies will be
calculated.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Christian Jost
 Sent: Thursday, September 15, 2005 5:40 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Rcommander and simple chisquare
 
 In this years biostat teaching I will include Rcommander (it 
 indeed simplifies syntax problems that makes students 
 frequently miss the core statistical problems). But I could 
 not find how to make a simple chisquare comparison between 
 observed frequencies and expected frequencies (eg in genetics 
 where you expect phenotypic frequencies corresponding to 3:1 
 in standard dominant/recessif alleles). Any idea where this 
 feature might be hidden? Or could it be added to Rcommander?
 
 Thanks, Christian.
 
 ps: in case I am not making myself clear, can Rcommander be 
 made to perform
   chisq.test(c(61,39),p=c(0.75,0.25))
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Rcommander and simple chisquare

2005-09-15 Thread John Fox
Dear Christian,

The Rcmdr assumes that you have a data frame with the original data, in
which the variable in question is a factor. The frequency distribution is
constructed for the factor. Thus, in your example, you'd have 100
observations classified on a two-category factor. What you enter directly
are the hypothesized probabilities.
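In the example from this thread, the 100 observations can be reconstructed
from the two counts like this (the level names are invented for
illustration):

```r
# expand the counts 61 and 39 into raw observations on a factor
phenotype <- factor(rep(c("dominant", "recessive"), c(61, 39)))
Genetics <- data.frame(phenotype)  # a data frame the Rcmdr can use
# the equivalent direct test:
chisq.test(table(phenotype), p = c(0.75, 0.25))
```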

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: Christian Jost [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, September 15, 2005 11:38 AM
 To: John Fox; 'Philippe Grosjean'
 Cc: r-help@stat.math.ethz.ch
 Subject: RE: [R] Rcommander and simple chisquare
 
 Dear John and Philippe,
 
 thanks for your replys, I finally  found this menu, but I am 
 somewhat at a loss how I should enter the observed 
 frequencies. To take my example below, If I enter a 
 one-column data.frame with the numbers 61 and 39, John's 
 indicated menu is not highlighted. If I add a second column 
 containing some factor, the menu is highlighted by I cannot 
 select the first column. However, if I edit the data and 
 declare the first column to be of type 'character' I can 
 select it in the menu dialog and declare the expected 
 frequencies, but the chisquare output doesn't make any sense. 
 For the moment I cannot make any sense of that :-( Any help 
 most appreciated, or a link to the tutorial/faq that explains 
 such kind of problems.
 
 Thanks, Christian.
 
 At 11:31 -0400 15/09/05, John Fox wrote:
 Dear Philippe,
 
 This does a chi-square test of independence in a contingency 
 table, not 
 a chi-square goodness-of-fit test (which is done in the Rcmdr via 
Statistics -> Summaries -> Frequency distribution).
 
 Regards,
   John
 
 
 John Fox
 Department of Sociology
 McMaster University
 Hamilton, Ontario
 Canada L8S 4M4
 905-525-9140x23604
 http://socserv.mcmaster.ca/jfox
 
 
   -Original Message-
   From: [EMAIL PROTECTED]  
  [mailto:[EMAIL PROTECTED] On Behalf Of  Philippe 
  Grosjean
   Sent: Thursday, September 15, 2005 7:32 AM
   To: Christian Jost
   Cc: r-help@stat.math.ethz.ch
   Subject: Re: [R] Rcommander and simple chisquare
 
   Hello,
 
   Just look at Statistics -> Contingency tables. There is an option
   for making the chi square test there.
   Best,
 
   Philippe Grosjean,
 
 
   Christian Jost wrote:
    In this years biostat teaching I will include Rcommander (it indeed
    simplifies syntax problems that makes students frequently miss the
    core statistical problems). But I could not find how to make a simple
    chisquare comparison between observed frequencies and expected
    frequencies (eg in genetics where you expect phenotypic frequencies
    corresponding to 3:1 in standard dominant/recessif alleles). Any idea
    where this feature might be hidden? Or could it be added to Rcommander?
    
    Thanks, Christian.
    
    ps: in case I am not making myself clear, can Rcommander be made to
    perform chisq.test(c(61,39),p=c(0.75,0.25))
   
   
 __


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Collineariy Diagnostics

2005-09-13 Thread John Fox
Dear Sundar and Antoine,

In addition, the vif function in the car package will calculate generalized
variance inflation factors.
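A small sketch of that (the model is invented; the Prestige data ship with
the car package, and type is a factor, so generalized VIFs apply):

```r
library(car)
m <- lm(prestige ~ income + education + type, data = Prestige)
vif(m)  # reports GVIF, Df, and GVIF^(1/(2*Df)) when factors are present
```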

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Sundar 
 Dorai-Raj
 Sent: Tuesday, September 13, 2005 5:16 AM
 To: Antoine de Bary
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] Collineariy Diagnostics
 
 
 
 Antoine de Bary wrote:
  Hi, and thanks for your help
  
  in order to do collinearity analysis I downloaded the 
 perturb package. 
  I run a lm (regression) and on that the ³calldiag² commad to get 
  condition numbers but i get the following message: the variable XY 
  with modus ³numeric² was not found (it does the same with all 
  predictors despite all variables are numeric and exists).
  
  Can anyone tell me how can I go arround this problem? Is 
 there another 
  way to have ³condition numbers²? What about VIF?
  
  Please return message to: [EMAIL PROTECTED]
 
 I cannot comment on the perturb package. However for 
 condition numbers see ?kappa.lm, and for variance inflation 
 factors see ?vif. The latter is in the Design package.
 
 set.seed(1)
 x1 <- rnorm(100)
 x2 <- x1 + 0.1 * rnorm(100)
 y  <- rnorm(100)
 f  <- lm(y ~ x1 + x2)
 
 vif(f)
 kappa(f)
 
 HTH,
 
 --sundar
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] is library loaded

2005-09-13 Thread John Fox
Dear Omar,

The following function tests whether a package is in the search path (with
the package name given in quotes):

packageLoaded <- function(name) 0 != length(grep(paste("^package:", name,
"$", sep=""), search()))
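For instance (a quick usage sketch of the one-liner above, checked against a base package that is always attached; the non-existent package name is made up):

```r
# Test whether a package is attached to the search path.
packageLoaded <- function(name) {
  0 != length(grep(paste("^package:", name, "$", sep = ""), search()))
}

stopifnot(packageLoaded("stats"))            # base packages are on the search path
stopifnot(!packageLoaded("no.such.package")) # an unattached name gives FALSE
```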

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Omar Lakkis
 Sent: Tuesday, September 13, 2005 2:09 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] is library loaded
 
 Is there a way to test if a library has been loaded?
 
 is.loaded does not give me what I want, I am looking to say:
 
 if loaded("Rdbi") dbSendQuery(conn, q)
 if loaded("RODBC") sqlQuery(conn, q)
 
 I need this to support both unix and windows platforms as I 
 could not find a windows distribution for RdbiPgSQL. I am 
 using R 2.1.0 and postgresql. I will be connecting to the 
 database using Rdbi and RdbiPgSQL and have other developers 
 using windows connect with RODBC, unless someone can suggest 
 a better solution.
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] is library loaded

2005-09-13 Thread John Fox
Dear Robert,

packageLoaded() may well be a bad name but loadedNamespaces() won't detect a
package without a namespace. It therefore seemed safe to me to check the
path, which would include both packages with and without namespaces. With
respect to loading and attaching, I thought that library() both loaded a
package (with or without a namespace) and attached it to the search path,
but I must admit that I'm easily confused about these distinctions. 

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: Robert Gentleman [mailto:[EMAIL PROTECTED] 
 Sent: Tuesday, September 13, 2005 3:53 PM
 To: John Fox
 Cc: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
 Subject: Re: [R] is library loaded
 
 Hi,
 
   Almost surely this is a bad name. With the advent of name 
 spaces it is important to distinguish between loading and 
 attaching. This function tests for attached packages. To test 
 for loaded packages we already have
   loadedNamespaces.
 
   Best wishes,
 Robert
 
 John Fox wrote:
  Dear Omar,
  
  The following function tests whether a package is in the 
 search path 
  (with the package name given in quotes):
  
  packageLoaded <- function(name) 0 != length(grep(paste("^package:", 
  name, "$", sep=""), search()))
  
  I hope this helps,
   John
  
  
  John Fox
  Department of Sociology
  McMaster University
  Hamilton, Ontario
  Canada L8S 4M4
  905-525-9140x23604
  http://socserv.mcmaster.ca/jfox
  
  
  
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Omar Lakkis
 Sent: Tuesday, September 13, 2005 2:09 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] is library loaded
 
 Is there a way to test if a library has been loaded?
 
 is.loaded does not give me what I want, I am looking to say:
 
 if loaded("Rdbi") dbSendQuery(conn, q)
 if loaded("RODBC") sqlQuery(conn, q)
 
 I need this to support both unix and windows platforms as I 
 could not 
 find a windows distribution for RdbiPgSQL. I am using R 2.1.0 and 
 postgresql. I will be connecting to the database using Rdbi and 
 RdbiPgSQL and have other developers using windows connect 
 with RODBC, 
 unless someone can suggest a better solution.
 
  
 
 --
 Robert Gentleman, PhD
 Program in Computational Biology
 Division of Public Health Sciences
 Fred Hutchinson Cancer Research Center
 1100 Fairview Ave. N, M2-B876
 PO Box 19024
 Seattle, Washington 98109-1024
 206-667-7700
 [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] [R-pkgs] Internationalized version of the Rcmdr package

2005-09-09 Thread John Fox
Dear R-packages list members,

I've recently uploaded to CRAN version 1.1-1 of the Rcmdr (R Commander)
package. Based on the tcltk package, the Rcmdr provides a basic-statistics
graphical interface to R.

The general functionality of the R Commander has not changed much since
version 1.0-0. There are some small improvements to the package, but the
reason that I'm re-announcing it is that the latest version supports
translation into other languages, using the new internationalization and
localization features of R 2.1.x. 

A Catalan translation (courtesy of Manel Salamero), a French translation
(courtesy of Philippe Grosjean), a Japanese translation (courtesy of
Takaharu Araki), and a Slovenian translation (courtesy of Jaro Lajovic), are
provided with the Rcmdr package. I understand that Portuguese, Russian, and
Spanish translations are in the works, and these will be incorporated when
they are completed. 

If you're interested in translating the Rcmdr into another language, please
contact me.

As before, bug reports, suggestions, and other feedback are appreciated.

 John 


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Plotting an ellipse in 3D

2005-09-09 Thread John Fox
=data.frame(x=0,
z=0, groups=group)), 0,
paste(group, " "), adj=1, color=surface.col[j])
if (residuals){
yy - y[select.obs]
xx - x[select.obs]
zz - z[select.obs]
fitted - fitted(mod)
rgl.lines(as.vector(rbind(xx,xx)),
as.vector(rbind(yy,fitted)), as.vector(rbind(zz,zz)),
col=surface.col[j])
}
}
}
}
}
}
if (revolutions > 0) {
for (i in 1:revolutions){
for (angle in seq(1, 360, length=360/speed))
rgl.viewpoint(-angle, fov=fov)
}
}
if (model.summary) return(summaries) else return(invisible(NULL))
}


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Duncan Murdoch
 Sent: Friday, September 09, 2005 8:03 AM
 To: Mike White
 Cc: R-help@stat.math.ethz.ch
 Subject: Re: [R] Plotting an ellipse in 3D
 
 Mike White wrote:
  I have been using the ellipse function from the car package and the 
  covariance matrix to draw an ellipse around a group of 
 points to show 
  the confidence limits.  However, the points are actually 
 represented 
  by 3 variables so rather than plot each pairwise combination of 
  variables in 2D I would like to plot the 'ellipse' in 3D using the 
  djmrgl package.  Can anyone offer advice on how I can plot 
 the surface 
  of  a 3D 'ellipse' using the covariance matrix to define 
 the shape, so 
  that the points inside can also be seen.
 
 You should use rgl, rather than djmrgl.  It now has most of 
 the same functions plus a lot more.
 
 Then you can plot the ellipse as a wireframe or transparent 
 object.  See the demo(regression) example for that kind of 
 drawing; demo(shapes3d) for ellipses.  (The demo names are 
 from memory, I don't have access to it right now.)
 
 Duncan Murdoch
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] tcltk, X11 protocol error: Bug?

2005-09-08 Thread John Fox
Dear Nicholas,

This problem has been reported before (enter X11 protocol error on the R
site search at http://finzi.psych.upenn.edu/search.html to see the previous
threads), but as far as I know, there's no definitive explanation or
solution. As well, things appear to work fine, despite the warnings. The way
I handle the problem in the Rcmdr package is simply to intercept the
warnings.

I hope this helps,
 John 


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Nicholas Lewin-Koh
 Sent: Monday, September 05, 2005 5:16 PM
 To: [EMAIL PROTECTED]
 Subject: [R] tcltk, X11 protocol error: Bug?
 
 Hi,
 I am having trouble debugging this one. The code is attached 
 below, but it seems to be a problem at the C-tk interface. If 
 I run this 1 time there are no problems if I run it more than 
 once I start to get warnings that increase in multiples of 11 
 everytime I run it. Here is a sample session
 
 
  > source("clrramp2.r")
 Loading required package: tcltk
 Loading Tcl/Tk interface ... done
  > clrRamp()
 
  > tt <- clrRamp()
  > tt
 function (n)
 {
 x <- ramp(seq(0, 1, length = n))
 rgb(x[, 1], x[, 2], x[, 3], max = 255) }
 <environment: 0x8b8674c>
  > image(matrix(1:10), col=tt(10))
  > tt <- clrRamp()
 There were 22 warnings (use warnings() to see them)
  > image(matrix(1:10), col=tt(10))
 There were 11 warnings (use warnings() to see them)
  > warnings()
 Warning messages:
 1: X11 protocol error: BadWindow (invalid Window parameter)
 2: X11 protocol error: BadWindow (invalid Window parameter)
 3: X11 protocol error: BadWindow (invalid Window parameter)
 4: X11 protocol error: BadWindow (invalid Window parameter)
 5: X11 protocol error: BadWindow (invalid Window parameter)
 6: X11 protocol error: BadWindow (invalid Window parameter)
 7: X11 protocol error: BadWindow (invalid Window parameter)
 8: X11 protocol error: BadWindow (invalid Window parameter)
 9: X11 protocol error: BadWindow (invalid Window parameter)
 10: X11 protocol error: BadWindow (invalid Window parameter)
 11: X11 protocol error: BadWindow (invalid Window parameter)
 
 I am running R-2.1.1 on ubuntu linux 5.04, compiled from 
 source (not the deb package).
 My version of tcl/tk is 8.4. The code is below. If anyone 
 sees something I am doing foolish let me know, otherwise I 
 will file a bug report.
 
 Nicholas
 
 # File clrramp2.r ##
 
 require(tcltk)
 clrRamp <- function(n.col, b.color=NULL, e.color=NULL){
 
   B.ChangeColor <- function()
   {
     b.color <- tclvalue(tkcmd("tk_chooseColor", initialcolor=e.color,
                               title="Choose a color"))
     if (nchar(b.color) > 0){
       tkconfigure(canvas.b, bg=b.color)
       Rmp.Draw()
     }
   }
 
   E.ChangeColor <- function()
   {
     e.color <- tclvalue(tkcmd("tk_chooseColor", initialcolor=e.color,
                               title="Choose a color"))
     ## cat(e.color)
     if (nchar(e.color) > 0){
       tkconfigure(canvas.e, bg=e.color)
       Rmp.Draw()
     }
   }
 
   Rmp.Draw <- function(){
     cr <- colorRampPalette(c(b.color, e.color), space="Lab",
                            interpolate="spline")
     rmpcol <- cr(n.col)
     ## rmpcol <- rgb(rmpcol[,1], rmpcol[,2], rmpcol[,3])
     inc <- 300/n.col
     xl <- 0
     for(i in 1:n.col){
       ## item <-
       tkitemconfigure(canvas.r, barlst[[i]], fill=rmpcol[i],
                       outline=rmpcol[i])
       # xl <- xl+inc
     }
   }
 
   save.ramp <- function(){
     cr <- colorRampPalette(c(b.color, e.color), space="Lab",
                            interpolate="spline")
     tkdestroy(tt)
     ## invisible(cr)
   }
 
   tt <- tktoplevel()
   tkwm.title(tt, "Color Ramp Tool")
   frame <- tkframe(tt)
   bframe <- tkframe(frame, relief="groove", borderwidth=1)
 
   if(is.null(b.color)) b.color <- "blue"
   if(is.null(e.color)) e.color <- "yellow"
   if(missing(n.col)) n.col <- 100
 
   canvas.b <- tkcanvas(bframe, width=50, height=25, bg=b.color)
   canvas.e <- tkcanvas(bframe, width=50, height=25, bg=e.color)
   canvas.r <- tkcanvas(tt, width=300, height=50, bg="white")
 
   BColor.button <- tkbutton(bframe, text="Begin Color",
                             command=B.ChangeColor)
   ## tkgrid(canvas.b, BColor.button)
   EColor.button <- tkbutton(bframe, text="End Color",
                             command=E.ChangeColor)
   killbutton <- tkbutton(bframe, text="Save", command=save.ramp)
   tkgrid(canvas.b, BColor.button, canvas.e, EColor.button)
   tkgrid(bframe)
   tkgrid(frame)
   tkgrid(canvas.r)
   tkgrid(killbutton)
 
   cr <- colorRampPalette(c(b.color, e.color), space="Lab",
                          interpolate="spline")
   ## rmpcol <- hex(mixcolor(alpha, bc, ec, where="LUV"))
   rmpcol <- cr(n.col)
   inc <- 300/n.col
   xl <- 0
   # barlst <- vector(length=n.col, mode="list")
   barlst <- tclArray()
   for(i in 1:n.col){
     item <- tkcreate(canvas.r, "rect", xl, 0, xl+inc, 50,
                      fill=rmpcol[i], outline=rmpcol[i])
     ## tkaddtag(canvas.r, "point", "withtag", item

Re: [R] Rcmdr extensions

2005-08-29 Thread John Fox
Dear Simone,

You'll find instructions for extending the R Commander in Section 4 of the
paper at http://socserv.socsci.mcmaster.ca/jfox/Papers/R-commander.pdf. In
most instances, you should cause commands to be executed and printed in the
Script window and output to appear in the Output window by using the
function doItAndPrint().

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Simone 
 Gabbriellini
 Sent: Monday, August 29, 2005 9:48 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Rcmdr extensions
 
 Dear List,
 
 I am trying to extend Rcmdr with some functions useful to my 
 study... I have successfully added a menu and some submenus to 
 the GUI, and I have placed an .R file in the /etc folder... I 
 am able to call functions in that file, but I cannot see the 
 results: how can I tell Rcmdr to write the output in the output box?
 
 Second question: how can I refer to the currently selected 
 Dataset in my .R functions' file?
 
 Thank you,
 Simone Gabbriellini
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Matrix oriented computing

2005-08-26 Thread John Fox
Dear Mark,

For that matter, the loop isn't a whole a slower (on my 3GHz Win XP system):

 > x <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995)
 > df <- 1:1000
 > system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE)
 [1] 0.08 0.00 0.08   NA   NA
 
 > mat <- matrix(0, 1000, 11)
 > system.time(for (i in 1:length(df)) mat[i,] <- qchisq(x, df[i]))
 [1] 0.09 0.00 0.10   NA   NA
 

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Marc 
 Schwartz (via MN)
 Sent: Friday, August 26, 2005 10:26 AM
 To: Peter Dalgaard
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] Matrix oriented computing
 
 On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote:
  Marc Schwartz [EMAIL PROTECTED] writes:
  
    x <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 
   0.95, 0.975, 0.99, 0.995)
    
    df <- c(1:100)
    
    mat <- sapply(x, qchisq, df)
  
dim(mat)
   [1] 100  11

str(mat)
num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 
 4.12e-01 ...
  
  outer() is perhaps a more natural first try... It does give the 
  transpose of the sapply approach, though.
  
  round(t(outer(x,df,qchisq)),2)
  
  should be close. You should likely add dimnames.
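Indeed, the two approaches agree up to transposition (a quick check with small, arbitrarily chosen values):

```r
# sapply over the probabilities gives a length(df) x length(x) matrix;
# outer() gives its transpose, so t(outer(...)) matches it exactly.
x  <- c(0.05, 0.5, 0.95)  # arbitrary probabilities
df <- 1:5                 # arbitrary degrees of freedom
m1 <- sapply(x, qchisq, df)
m2 <- t(outer(x, df, qchisq))
stopifnot(all.equal(m1, m2))
```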
 
 
 
 What I find interesting, is that I would have intuitively expected
 outer() to be faster than sapply().  However:
 
 
   system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE)
 [1] 0.01 0.00 0.01 0.00 0.00
  
   system.time(mat1 <- round(t(outer(x, df, qchisq)), 2),
gcFirst = TRUE)
 [1] 0.01 0.00 0.01 0.00 0.00
  
 # No round() or t() to test for overhead
   system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE)
 [1] 0.01 0.00 0.02 0.00 0.00
 
 
 # Bear in mind the round() on mat1 above
  all.equal(mat, mat1)
 [1] Mean relative  difference: 4.905485e-05
 
  all.equal(mat, t(mat2))
 [1] TRUE
 
 
 Even when increasing the size of 'df' to 1:1000:
 
 
   system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE)
 [1] 0.16 0.01 0.16 0.00 0.00
 
   system.time(mat1 <- round(t(outer(x, df, qchisq)), 2), gcFirst =
 TRUE)
 [1] 0.16 0.00 0.18 0.00 0.00
  
   # No round() or t() to test for overhead
   system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE)
 [1] 0.16 0.01 0.17 0.00 0.00
 
 
 
 It also seems that, at least in this case, t() and round() do 
 not add much overhead.
 
 Best regards,
 
 Marc
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Manually Calculating Odds from POLR Model

2005-08-15 Thread John Fox
Dear Tate,

Your question pertains generally to the treatment of factors in model
formulas and is not particular to polr(). For a brief explanation, see
Section 11.1, Defining statistical models; formulae, and in particular
Section 11.1.1 on Contrasts in the manual An Introduction to R, which is
distributed with R. More detailed explanations are in texts such as Venables
and Ripley, Modern Applied Statistics With S, and my own, An R and S-PLUS
Companion to Applied Regression. 
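As a small illustration of the default (treatment) contrasts, where the first factor level becomes the baseline and the others each get a 0/1 column in the model matrix (the level names here are made up, not from Tate's data):

```r
# A three-level factor expands to two dummy columns under treatment contrasts;
# the first level ("Low") is the baseline and gets no column of its own.
f  <- factor(c("Low", "Medium", "High", "Medium"),
             levels = c("Low", "Medium", "High"))
mm <- model.matrix(~ f)
colnames(mm)  # "(Intercept)" "fMedium" "fHigh"
# A "Medium" observation has fMedium = 1, fHigh = 0; a "Low" one has 0, 0.
```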

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: Tate Avery [mailto:[EMAIL PROTECTED] 
 Sent: Monday, August 15, 2005 3:58 PM
 To: [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Subject: RE: [R] Manually Calculating Odds from POLR Model
 
 John,
 
 Thank you, the document was very helpful.  I can now 
 calculate the same values generated by predict() when I am 
 using purely numeric input data.
 
 Another small question arises when I look at the example 
 using 'housing' in the polr() documentation page:
 
 Running the example produces the following coefficients...
 
 Coefficients:
     InflMedium      InflHigh TypeApartment    TypeAtrium   TypeTerrace      ContHigh
      0.5663924     1.2888218    -0.5723552    -0.3661912    -1.0910195     0.3602834
 
 Now, if I am trying to perform a prediction and the value for 
 INFL comes in as 'Medium' what is done?  And, what is done for 'low'?
 
 That seems to be the last missing piece in my understanding 
 of how to convert the model values into predictions.
 
 Thank you,
 Tate
 
 From: John Fox [EMAIL PROTECTED]
 To: 'Tate Avery' [EMAIL PROTECTED]
 CC: r-help@stat.math.ethz.ch
 Subject: RE: [R] Manually Calculating Odds from POLR Model
 Date: Fri, 12 Aug 2005 19:22:23 -0400
 
 Dear Tate,
 
 If I understand correctly what you're asking, the formulas 
 are on p. 21 
 of the paper at 
 http://socserv.socsci.mcmaster.ca/jfox/Papers/logit-effect-d
isplays.pdf.
 But why do you want to do this when you can get the fitted 
 probabilities from predict()?
 
 I hope this helps.
   John
 
 
 John Fox
 Department of Sociology
 McMaster University
 Hamilton, Ontario
 Canada L8S 4M4
 905-525-9140x23604
 http://socserv.mcmaster.ca/jfox
 
 
   -Original Message-
   From: [EMAIL PROTECTED] 
   [mailto:[EMAIL PROTECTED] On Behalf Of Tate Avery
   Sent: Friday, August 12, 2005 2:50 PM
   To: r-help@stat.math.ethz.ch
   Subject: [R] Manually Calculating Odds from POLR Model
  
   Hello,
  
   I am using polr(...) to generate a model.  The summary shows the 
   coefficients and the intercepts.
  
   For example:
  
   coefficient for x1 = c1
   coefficient for x2 = c2
  
   intercept A|B = i1
   intercept B|C = i2
  
   I can then run predict(..., type=p) with the model and see the 
   odds for each factor.
  
   For example:
  
       A    B    C
  1  0.3  0.5  0.2
  2  0.4  0.1  0.5
  
   What I really want to be able to do is take the 2 
 coefficients, the 
   2 intercepts, the x1  x2 values and manually calculate the 
   probabilities generated by predict().
  
   I have been searching quite extensively for the underlying 
   calculations that transform the polr output and the input 
 variables 
   into the final output odds.  I have tried a number of 
 dead-end roads 
   so far.
  
   So, if anyone has any information on how to do this or 
 where I can 
   find out, I would be extremely grateful.
  
   Thank you for your time,
   Tate Avery
  
 
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] path analysis

2005-08-14 Thread John Fox
Dear Manuel,

Polychoric correlations imply only that the *latent* variables are
continuous -- the observed variables are ordered categories. Tetrachoric and
point-biserial correlations are special cases respectively of polychoric and
polyserial correlations. As long as you're willing to think of the
dichotomous variable as the dissection into two categories of a latent
continuous variable (and assuming multinormality of the latent variables),
you can use the approach that I suggested. This isn't logistic regression,
but it's similar to a probit model.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Manel Salamero
 Sent: Sunday, August 14, 2005 12:34 PM
 To: r-help@stat.math.ethz.ch
 Subject: Re: [R] path analysis
 
 This solves part of my problem with the independent ordinal 
 variables, but my dependent variable is truly categorical 
 (illness/no illness). Polychoric correlation implies that 
 data are continuous, which is not the case. Is it possible to 
 implement logistic regression in the path model?
 
 Thanks,
 
 Manel Salamero
 
 -- Original Message --
 De: John Fox [EMAIL PROTECTED]
 Data:  Sat, 13 Aug 2005 19:35:24 -0400
 
 Dear Manel,
 
  -Original Message-
  From: [EMAIL PROTECTED] 
  [mailto:[EMAIL PROTECTED] On Behalf Of 
 SALAMERO BARO, 
  MANUEL
  Sent: Saturday, August 13, 2005 2:02 PM
  To: r-help@stat.math.ethz.ch
  Subject: [R] path analysis
  
  Someone knows if it is possible to perform a path analysis with sem 
  package (or any other) to explain a dependent
  *dichotomous* variable?
  
 
 Yes -- you can use the hetcor() function in the polycor 
 package to generate
 a correlation matrix and boot.sem() in the sem package to get standard
 errors or confidence intervals. Make sure that the 
 dichotomous variables are
 represented as factors. See ?boot.sem for an example.
 
 I hope this helps,
  John
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] path analysis

2005-08-14 Thread John Fox
Dear Manuel and list,

I see that I wrote point-biserial when I meant biserial.

Sorry,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of John Fox
 Sent: Sunday, August 14, 2005 1:34 PM
 To: [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] path analysis
 
 Dear Manuel,
 
 Polychoric correlations imply only that the *latent* 
 variables are continuous -- the observed variables are 
 ordered categories. Tetrachoric and point-biserial 
 correlations are special cases respectively of polychoric and 
 polyserial correlations. As long as you're willing to think 
 of the dichotomous variable as the dissection into two 
 categories of a latent continuous variable (and assuming 
 multinormality of the latent variables), you can use the 
 approach that I suggested. This isn't logistic regression, 
 but it's similar to a probit model.
 
 Regards,
  John
 
 
 John Fox
 Department of Sociology
 McMaster University
 Hamilton, Ontario
 Canada L8S 4M4
 905-525-9140x23604
 http://socserv.mcmaster.ca/jfox
  
 
  -Original Message-
  From: [EMAIL PROTECTED] 
  [mailto:[EMAIL PROTECTED] On Behalf Of 
 Manel Salamero
  Sent: Sunday, August 14, 2005 12:34 PM
  To: r-help@stat.math.ethz.ch
  Subject: Re: [R] path analysis
  
  This solves part of my problem with the independent ordinal 
 variables, 
  but my dependent variable is truly categorical (illness/no illness). 
  Polychoric correlation implies that data are continuous, 
 which is not 
  the case. Is it possible to implement logistic regression in the path 
  model?
  
  Thanks,
  
  Manel Salamero
  
  -- Original Message --
  De: John Fox [EMAIL PROTECTED]
  Data:  Sat, 13 Aug 2005 19:35:24 -0400
  
  Dear Manel,
  
   -Original Message-
   From: [EMAIL PROTECTED] 
   [mailto:[EMAIL PROTECTED] On Behalf Of
  SALAMERO BARO,
   MANUEL
   Sent: Saturday, August 13, 2005 2:02 PM
   To: r-help@stat.math.ethz.ch
   Subject: [R] path analysis
   
   Someone knows if it is possible to perform a path 
 analysis with sem 
   package (or any other) to explain a dependent
   *dichotomous* variable?
   
  
  Yes -- you can use the hetcor() function in the polycor package to 
  generate a correlation matrix and boot.sem() in the sem 
 package to get 
  standard errors or confidence intervals. Make sure that the 
  dichotomous variables are represented as factors. See 
 ?boot.sem for an 
  example.
  
  I hope this helps,
   John
  

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] path analysis

2005-08-13 Thread John Fox
Dear Manel,

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 SALAMERO BARO, MANUEL
 Sent: Saturday, August 13, 2005 2:02 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] path analysis
 
 Someone knows if it is possible to perform a path analysis 
 with sem package (or any other) to explain a dependent 
 *dichotomous* variable?
 

Yes -- you can use the hetcor() function in the polycor package to generate
a correlation matrix and boot.sem() in the sem package to get standard
errors or confidence intervals. Make sure that the dichotomous variables are
represented as factors. See ?boot.sem for an example.

I hope this helps,
 John

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Issues with tcltk for Tiger Mac OSX

2005-08-12 Thread John Fox
Dear Bernard,

The Rcmdr web site has some instructions prepared by Rob Goedman for Mac
users at
http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html
that might prove useful. (I'm not a Mac user myself so can't offer specific
advice.)

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Bernard Leong
 Sent: Friday, August 12, 2005 8:51 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Issues with tcltk for Tiger Mac OSX
 
 Dear R-helpers,
 
 I have installed the latest version of R 2.1.1 in the Tiger OSX.
 However, when I load up R and use the following command:
 
  library(tcltk)
 
 I encounter the following error:
 
 Loading Tcl/Tk interface ... Error in dyn.load(x, as.logical(local),
 as.logical(now)) :
   unable to load shared library
 '/Library/Frameworks/R.framework/Resources/library/tcltk/libs/
 tcltk.so':
   
 dlopen(/Library/Frameworks/R.framework/Resources/library/tcltk
 /libs/tcltk.so,
 10): corrupt binary, library ordinal too big
 Error: .onLoad failed in 'loadNamespace' for 'tcltk'
 Error: package/namespace load failed for 'tcltk'
 
 I have done a couple of remedies: 
 1) Tried installing tcl and tk 8.4 from source
 2) Tried to install the version of tcltk from the fink 
 sourceforge net.
 3) Tried to change the path
 
 and it does not work.
 
 Does anyone have a solution to this problem  as I need tcltk 
 to run limmaGUI?
 
 Best regards,
 Bernard
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Manually Calculating Odds from POLR Model

2005-08-12 Thread John Fox
Dear Tate,

If I understand correctly what you're asking, the formulas are on p. 21 of
the paper at
http://socserv.socsci.mcmaster.ca/jfox/Papers/logit-effect-displays.pdf.
But why do you want to do this when you can get the fitted probabilities
from predict()?
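In outline, the proportional-odds calculation on that page works from the cumulative logits: P(Y <= j) = 1/(1 + exp(-(zeta_j - eta))), where eta = c1*x1 + c2*x2, and the category probabilities are successive differences of the cumulative probabilities. A sketch with made-up values standing in for polr() output (c1, c2 are coefficients; i1, i2 are the A|B and B|C intercepts; none of these come from a real fit):

```r
# Hypothetical polr()-style parameters and one observation:
c1 <- 0.8;  c2 <- -0.5   # coefficients for x1, x2
i1 <- -0.3; i2 <- 1.1    # intercepts (zetas) for A|B and B|C
x1 <- 0.2;  x2 <- 1.0

eta  <- c1 * x1 + c2 * x2        # linear predictor (no intercept term)
cumP <- plogis(c(i1, i2) - eta)  # P(Y <= A), P(Y <= B)
probs <- c(A = cumP[1],
           B = cumP[2] - cumP[1],
           C = 1 - cumP[2])
stopifnot(all.equal(sum(probs), 1))  # category probabilities sum to one
```

With parameters and data taken from an actual polr() fit, probs reproduces the corresponding row of predict(..., type="p").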

I hope this helps.
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Tate Avery
 Sent: Friday, August 12, 2005 2:50 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Manually Calculating Odds from POLR Model
 
 Hello,
 
 I am using polr(...) to generate a model.  The summary shows 
 the coefficients and the intercepts.
 
 For example:
 
 coefficient for x1 = c1
 coefficient for x2 = c2
 
 intercept A|B = i1
 intercept B|C = i2
 
 I can then run predict(..., type=p) with the model and see 
 the odds for each factor.
 
 For example:
 
      A    B    C
 1  0.3  0.5  0.2
 2  0.4  0.1  0.5
 
 What I really want to be able to do is take the 2 
 coefficients, the 2 intercepts, the x1  x2 values and 
 manually calculate the probabilities generated by predict().
 
 I have been searching quite extensively for the underlying 
 calculations that transform the polr output and the input 
 variables into the final output odds.  I have tried a number 
 of dead-end roads so far.
 
 So, if anyone has any information on how to do this or where 
 I can find out, I would be extremely grateful.
 
 Thank you for your time,
 Tate Avery
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] tcltk programming guide

2005-08-11 Thread John Fox
Dear Simone,

tcltk functions correspond closely to Tcl and Tk commands, so documentation
for the latter, available at http://wiki.tcl.tk/3109, is helpful. I also
found Welch's Practical Programming in Tcl and Tk useful. I expect that
you've already seen Peter Dalgaard's two R News articles on the tcltk
package.

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Simone 
 Gabbriellini
 Sent: Thursday, August 11, 2005 9:43 AM
 To: John Zhang
 Cc: R-help@stat.math.ethz.ch
 Subject: Re: [R] tcltk programming guide
 
 thank you, I knew that link, but I need something more 
 document-oriented, more specific, i.e. if I want to know how 
 to use tkadd, where should I look? or tkinsert and so on...
 
 thanx,
 simone
 
 Il giorno 11/ago/05, alle ore 14:18, John Zhang ha scritto:
 
  Go to 
 http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/. There are 
  good examples.
 
 
 
  To: R-help@stat.math.ethz.ch
  From: Simone Gabbriellini [EMAIL PROTECTED]
  Date: Thu, 11 Aug 2005 14:04:58 +0200
  Subject: [R] tcltk programming guide
 
  Dear List,
  I'm looking for some documentation about the R tcltk 
  package. The one I found in the help doesn't look exhaustive; I need 
  information on the use of the individual tk widgets, maybe with 
  some examples.
 
  thank you,
  simone gabbriellini
 
  __
  R-help@stat.math.ethz.ch mailing list 
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
 http://www.R-project.org/posting- 
  guide.html
 
 
  Jianhua Zhang
  Department of Medical Oncology
  Dana-Farber Cancer Institute
  44 Binney Street
  Boston, MA 02115-6084
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! http://www.R-project.org/posting- 
  guide.html
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] partial SS for anova

2005-07-31 Thread John Fox
Dear Natalia,

See the Anova() function in the car package. Also see the warning in ?Anova
about Type III sums of squares.
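As a base-R alternative to the car route (for an lm fit, not lmer), drop1() produces marginal F tests for each term after all others, in contrast to the sequential table from anova(); the data below are simulated purely for illustration.

```r
# Partial (marginal) F tests via drop1(), versus sequential anova()
set.seed(2)
d <- data.frame(y = rnorm(30), x = rnorm(30), z = rnorm(30))
fit <- lm(y ~ x + z, data = d)
anova(fit)               # sequential sums of squares
drop1(fit, test = "F")   # each term tested after all others
```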

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 NATALIA F TCHETCHERINA
 Sent: Saturday, July 30, 2005 6:25 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] partial SS for anova
 
 Hello,
 I use  lme4 package.
 library(lme4)
 fit=lmer(y ~ time+dye+trt+trt:time + (1|rep), data=dataset, 
 na.action='na.omit')
 anova(fit)
 
 The anova gives sequential F-tests and sequential SS.
 
 My question is: how I can get partial F-tests and partial SS?
  For lm (not lmer)
 anova(lm(y~x+z))
 we can use
 anova(fit, ssType=3)
 but it does not work for lmer.
 
 Natalia.
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Unexpected behavior in recode{car}

2005-07-28 Thread John Fox
Dear Tom and David,

The source of the problem isn't hard to see if you trace the execution of
recode() via debug(): The test for whether the result can be coerced to
numeric is faulty. I've fixed the bug and will upload a new version of car
to CRAN shortly. In the meantime, you can use 

ss <- recode(nn, "2='Num2'; 4='Num4'", as.factor=TRUE)

Thanks for bringing this bug to my attention.

John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Mulholland, Tom
 Sent: Thursday, July 28, 2005 2:01 AM
 To: D. Dailey; r-help@stat.math.ethz.ch
 Subject: Re: [R] Unexpected behavior in recode{car}
 
 require( car )
 set.seed(12345)
 nn <- sample( c( 2, 4 ), size=50, replace=TRUE )
 rr <- recode( nn, "2='TWO';4='FOUR'" )
 table( rr, exclude=NULL )
 ss <- recode( nn, "2='Num2';4='Num4'" )  # Doesn't work as expected
 table( ss, exclude=NULL )
 ss <- recode( nn, "2='Num2';4='Num4'", TRUE )  #?
 table( ss, exclude=NULL )
 tt <- recode( nn, "2='TWO'; 4='Num4'" )
 table( tt, exclude=NULL )
 uu <- recode( nn, "2='Num2'; 4='FOUR'" )
 table( uu, exclude=NULL )
 
 I looked at the code and found it too difficult to 
 immediately decipher. So does making the result a factor 
 cause any real problems?
 
 I noticed that the same response happens with any letterset 
 followed by a number: recode( nn, "2='Num2'; 4='abc5'" )
 
 Tom
 
  -Original Message-
  From: [EMAIL PROTECTED] 
  [mailto:[EMAIL PROTECTED] Behalf Of D. Dailey
  Sent: Thursday, 28 July 2005 11:45 AM
  To: r-help@stat.math.ethz.ch
  Subject: [R] Unexpected behavior in recode{car}
  
  
  Thanks to the R creators for such a great statistical system. 
  Thanks to
  the R help list, I have (finally) gotten far enough in R to have a 
  question I hope to be worth posting.
  
  I'm using the recode function from John Fox's car package and have 
  encountered some unexpected behavior.
  
  Consider the following example:
  
  ## Begin cut-and-paste example
  require( car )
  set.seed(12345)
   nn <- sample( c( 2, 4 ), size=50, replace=TRUE )
   rr <- recode( nn, "2='TWO';4='FOUR'" )
   table( rr, exclude=NULL )
   ss <- recode( nn, "2='Num2';4='Num4'" )  # Doesn't work as expected
   table( ss, exclude=NULL )
   tt <- recode( nn, "2='TWO'; 4='Num4'" )
   table( tt, exclude=NULL )
   uu <- recode( nn, "2='Num2'; 4='FOUR'" )
   table( uu, exclude=NULL )
   ## End cut-and-paste example
  
  All but the recoding to ss work as expected: I get a 
 character vector 
   with 23 instances of either "FOUR" or "Num4" and 27 instances of "TWO" 
   or "Num2".
  
  But for the ss line, wherein all the strings to be assigned 
 contain a 
  digit, the resulting vector contains all NAs. Using str(), 
 I note that 
  ss is a numeric vector.
  
  Is there a tidy way (using recode) to recode numeric values into 
  character strings, all of which contain a digit? I have a 
 workaround 
  for my current project, but it would be nice to be able to 
 use mixed 
  alphanumeric strings in this context.
  
  Thanks in advance for any insight you can give into this question.
  
  Using R 2.1.1 (downloaded binary) on Windows XP Pro, car version 
  1.0-17 (installed from CRAN via Windows GUI). Complete version 
  information
  below:
  
   > version
_
  platform i386-pc-mingw32
  arch i386
  os   mingw32
  system   i386, mingw32
  status
  major2
  minor1.1
  year 2005
  month06
  day  20
  language R
  
   > t(t( installed.packages()['car',] ))
[,1]
  Package  car
  LibPath  C:/Programs/R/rw2011/library
  Version  1.0-17
  Priority NA
  Bundle   NA
  Contains NA
   Depends  R (>= 1.9.0)
  Suggests MASS, nnet, leaps
  Imports  NA
  Built2.1.0
  
  
  I subscribe to the help list in digest form, so would 
 appreciate being 
  copied directly in addition to seeing responses sent to the list.
  
  David Dailey
  Shoreline, Washington, USA
  [EMAIL PROTECTED]
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] logistic regression: categorical value, and multinomial

2005-07-27 Thread John Fox
Dear Ed,

See ?glm for fitting binomial logit models, and ?multinom (in the nnet
package) for multinomial logit models. Neither function will handle a
character (text) variable as the response, but you could easily convert
the variable to a factor.
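A minimal sketch of the conversion described above for the binomial case; the toy data and variable names are invented for illustration.

```r
# Character response converted to a factor for glm();
# the first factor level ("no") is treated as the failure category.
resp <- c("yes", "no", "yes", "yes", "no", "yes", "no", "no")
x    <- c(1.2, 0.4, 1.8, 2.1, 0.3, 0.5, 1.5, 0.2)
fit  <- glm(factor(resp) ~ x, family = binomial)
coef(fit)  # intercept and slope on the logit scale
```

The same factor() conversion works as the response for multinom() in the nnet package.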

John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Haibo Huang
 Sent: Wednesday, July 27, 2005 11:23 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] logistic regression: categorical value, and multinomial
 
 I have two questions:
 
 1. If I want to do a binomial logit, how to handle the 
 categorical response variable? Data for the response 
 variables are not numerical, but text.
 
 2. What if I want to do a multinomial logit, still with 
 categorical response variable? The variable has 5 
 non-numerical response levels, I have to do it with a 
 multinomial logit.
 
 Any input is highly appreciated! Thanks!
 
 Ed

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] setting elements to NA across an array

2005-07-27 Thread John Fox
Dear Dr Carbon,


 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Dr Carbon
 Sent: Wednesday, July 27, 2005 12:19 PM
 To: [EMAIL PROTECTED]
 Subject: [R] setting elements to NA across an array
 
 Please excuse what is obviously a trivial matter...
 
 I have a large 3-d array. I wish to set the third dimension 
 (z) to NA if there are any NA values in the first two 
 dimnesions (xy). So, given array foo:
 
    foo <- array(data = NA, dim = c(5,5,3))
    foo[,,1] <- matrix(rnorm(25), 5, 5)
    foo[,,2] <- matrix(rnorm(25), 5, 5)
    foo[,,3] <- matrix(rnorm(25), 5, 5)
 
 I'll set two elements to NA
 
    foo[2,2,1] <- NA
    foo[3,5,2] <- NA
   foo
 
 Now I want to set foo[2,2,] <- NA and foo[3,5,] <- NA. How 
 can I build a logical statement to do this?
 

That should work just as you've specified it: That is, elements in all
layers in the second row, second column and in the the third row, fifth
column of the array should be NA. Since you posed the question, I suppose
that I'm missing something.

I hope this helps,
 John

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] setting elements to NA across an array

2005-07-27 Thread John Fox
Dear Dr Carbon,

Actually, it appears that I'm the one who was being obtuse. To make sure
that I'm now interpreting what you want correctly, you'd like to set all
entries in a layer to NA if any entry in a layer is NA.  If that's correct,
then how about the following?

foo[array(apply(foo, c(1,2), function(x) any(is.na(x))), dim(foo))] <- NA
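A quick check of that one-liner on an array like the example's (seed and values are illustrative):

```r
# Propagate NAs through all layers at any (row, column) position
# that contains an NA in some layer.
set.seed(42)
foo <- array(rnorm(75), dim = c(5, 5, 3))
foo[2, 2, 1] <- NA
foo[3, 5, 2] <- NA
foo[array(apply(foo, c(1, 2), function(x) any(is.na(x))), dim(foo))] <- NA
sum(is.na(foo))  # 6: both affected cells, in all three layers
```

The apply() call builds a 5 x 5 logical matrix marking positions with any NA; array() recycles it across the three layers to index the full array.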

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: Dr Carbon [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, July 27, 2005 12:45 PM
 To: John Fox
 Cc: [EMAIL PROTECTED]
 Subject: Re: [R] setting elements to NA across an array
 
 Sorry for being obtuse. 
 
 How can I build a logical statement that will set foo[2,2,] 
 <- NA and foo[3,5,] <- NA? Something like: if any element at a 
 given row and column position is NA, then set that position to NA 
 across the entire array.
 
 
 
 On 7/27/05, John Fox [EMAIL PROTECTED] wrote:
  Dear Dr Carbon,
  
  
   -Original Message-
   From: [EMAIL PROTECTED] 
   [mailto:[EMAIL PROTECTED] On Behalf Of Dr Carbon
   Sent: Wednesday, July 27, 2005 12:19 PM
   To: [EMAIL PROTECTED]
   Subject: [R] setting elements to NA across an array
  
   Please excuse what is obviously a trivial matter...
  
   I have a large 3-d array. I wish to set the third dimension
   (z) to NA if there are any NA values in the first two dimnesions 
   (xy). So, given array foo:
  
  foo <- array(data = NA, dim = c(5,5,3))
  foo[,,1] <- matrix(rnorm(25), 5, 5)
  foo[,,2] <- matrix(rnorm(25), 5, 5)
  foo[,,3] <- matrix(rnorm(25), 5, 5)
  
   I'll set two elements to NA
  
  foo[2,2,1] <- NA
  foo[3,5,2] <- NA
  foo
  
    Now I want to set foo[2,2,] <- NA and foo[3,5,] <- NA. How can I 
   build a logical statement to do this?
  
  
  That should work just as you've specified it: That is, 
 elements in all 
  layers in the second row, second column and in the the third row, 
  fifth column of the array should be NA. Since you posed the 
 question, 
  I suppose that I'm missing something.
  
  I hope this helps,
   John
  
   __
   R-help@stat.math.ethz.ch mailing list 
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide!
   http://www.R-project.org/posting-guide.html
  
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] grep help needed

2005-07-26 Thread John Fox
Dear Denis,

I don't believe that anyone fielded your question -- my apologies if I
missed a response.

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot
 Sent: Monday, July 25, 2005 9:46 PM
 To: R list
 Subject: [R] grep help needed
 
 Hi,
 
 In another thread (PBSmapping and shapefiles) I asked for 
 an easy way to read shapefiles and transform them in data 
 that PBSmapping could use. One person is exploring some ways 
 of doing this, but it is possible I'll have to do this manually.
 
 With package maptools I am able to extract the information 
 I need from a shapefile but it is formatted like this:
 
 [[1]]
 [,1] [,2]
 [1,] -55.99805 51.68817
 [2,] -56.00222 51.68911
 [3,] -56.01694 51.68911
 [4,] -56.03781 51.68606
 [5,] -56.04639 51.68759
 [6,] -56.04637 51.69445
 [7,] -56.03777 51.70207
 [8,] -56.02301 51.70892
 [9,] -56.01317 51.71578
 [10,] -56.00330 51.73481
 [11,] -55.99805 51.73840
  attr(,"pstart")
  attr(,"pstart")$from
  [1] 1
  
  attr(,"pstart")$to
  [1] 11
  
  attr(,"nParts")
  [1] 1
  attr(,"shpID")
  [1] NA
 
 [[2]]
[,1] [,2]
 [1,] -57.76294 50.88770
 [2,] -57.76292 50.88693
 [3,] -57.76033 50.88163
 [4,] -57.75668 50.88091
 [5,] -57.75551 50.88169
 [6,] -57.75562 50.88550
 [7,] -57.75932 50.88775
 [8,] -57.76294 50.88770
  attr(,"pstart")
  attr(,"pstart")$from
  [1] 1
  
  attr(,"pstart")$to
  [1] 8
  
  attr(,"nParts")
  [1] 1
  attr(,"shpID")
  [1] NA
 
 I do not quite understand the structure of this data object 
 (list of lists I think)

Actually, it looks like a list of matrices, each with some attributes
(which, I gather, aren't of interest to you).

 but at this point I resorted to 
 printing it on the console and imported that text into Excel 
 for further cleaning, which is easy enough. I'd like to 
 complete the process within R to save time and to circumvent 
 Excel's limit of around 64000 lines. But I have a hard time 
 figuring out how to clean up this text in R.
 

If I understand correctly what you want, this seems a very awkward way to
proceed. Why not just extract the matrices from the list, stick on the
additional columns that you want, stick the matrices together, name the
columns, and then output the data to a file?

M1 <- Data[[1]]  # assuming that the original list is named Data
M2 <- Data[[2]]
M1 <- cbind(1, 1:nrow(M1), M1)
M2 <- cbind(2, 1:nrow(M2), M2)
M <- rbind(M1, M2)
colnames(M) <- c("PID", "POS", "X", "Y")
write.table(M, "Data.txt", row.names=FALSE, quote=FALSE)

It wouldn't be hard to generalize this to any number of matrices and to
automate the process.
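The generalization to any number of matrices might look like the sketch below; Data here is a stand-in list of two-column coordinate matrices, with random values in place of real shapefile coordinates.

```r
# Build PID/POS columns for every matrix in a list at once
Data <- list(matrix(rnorm(8), ncol = 2), matrix(rnorm(6), ncol = 2))
M <- do.call(rbind, lapply(seq_along(Data), function(i) {
  cbind(PID = i, POS = seq_len(nrow(Data[[i]])), Data[[i]])
}))
colnames(M) <- c("PID", "POS", "X", "Y")
# write.table(M, "Data.txt", row.names = FALSE, quote = FALSE)
```

Each polygon gets its list index as PID and a within-polygon running POS, matching the PBSmapping layout described below.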

I hope that this helps,
 John

 What I need to produce for PBSmapping is a file where each 
 block of coordinates shares one ID number, called PID, and a 
 variable POS indicates the position of each coordinate within 
 a shape. All other lines must disappear. So the above would become:
 
 PID POS X Y
 1 1 -55.99805 51.68817
 1 2 -56.00222 51.68911
 1 3 -56.01694 51.68911
 1 4 -56.03781 51.68606
 1 5 -56.04639 51.68759
 1 6 -56.04637 51.69445
 1 7 -56.03777 51.70207
 1 8 -56.02301 51.70892
 1 9 -56.01317 51.71578
 1 10 -56.00330 51.73481
 1 11 -55.99805 51.73840
 2 1 -57.76294 50.88770
 2 2 -57.76292 50.88693
 2 3 -57.76033 50.88163
 2 4 -57.75668 50.88091
 2 5 -57.75551 50.88169
 2 6 -57.75562 50.88550
 2 7 -57.75932 50.88775
 2 8 -57.76294 50.88770
 
 First I imported this text file into R:
  test <- read.csv2("test file.txt", header=F, sep=";", colClasses =
  "character")
 
  I used sep=";" to ensure there would be only one variable in 
 this file, as it contains no ;
 
 To remove lines that do not contain coordinates, I used the 
 fact that longitudes are expressed as negative numbers, so 
 with my very limited knowledge of grep searches, I thought of 
 this, which is probably not the best way to go:
 
  a <- rep("-", length(test$V1))
  b <- grep(a, test$V1)
 
 this gives me a warning (Warning message:
 the condition has length  1 and only the first element will be used
 in: if (is.na(pattern)) {
 but seems to do what I need anyway
 
  c <- seq(1, length(test$V1))
  d <- c %in% b
  
  e <- test$V1[d]
 
 Partial victory, now I only have lines that look like [1,] 
 -57.76294 50.88770
 
 But I don't know how to go further: the number in square 
 brackets can be used for variable POS, after removing the 
 square brackets and the comma, but this requires a better 
 knowledge of grep than I have.  
 Furthermore, I don't know how to add a PID (polygon ID) 
 variable, i.e. all lines of a polygon must have the same ID, 
 as in the example above (i.e. each time POS == 1, a new 
 polygon starts and PID needs to be incremented by 1, and PID 
  is kept constant for lines where POS != 1).
 
 Any help will be much appreciated.
 
 Sincerely,
 
 Denis Chabot

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] sciviews installation

2005-07-20 Thread John Fox
Dear Philippe,

The development version 1.1-0 of the Rcmdr package (i.e., the one with
localization facilities) isn't yet on CRAN. Georges doesn't say which
version of the Rcmdr package he's using, but I assume 1.0-2 from CRAN, which
I believe should work with SciViews.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Philippe Grosjean
 Sent: Wednesday, July 20, 2005 5:24 AM
 To: [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] sciviews installation
 
 Hello,
 
 Several changes in the latest Rcmdr broken the SciViews 
 compatibility. I am working on it... but it takes longer than 
 expected. Sorry for the inconvenience.
 Best,
 
 Philippe
 
 ..°}))
   ) ) ) ) )
 ( ( ( ( (Prof. Philippe Grosjean
   ) ) ) ) )
 ( ( ( ( (Numerical Ecology of Aquatic Systems
   ) ) ) ) )   Mons-Hainaut University, Pentagone (3D08)
 ( ( ( ( (Academie Universitaire Wallonie-Bruxelles
   ) ) ) ) )   8, av du Champ de Mars, 7000 Mons, Belgium
 ( ( ( ( (
   ) ) ) ) )   phone: + 32.65.37.34.97, fax: + 32.65.37.30.54
 ( ( ( ( (email: [EMAIL PROTECTED]
   ) ) ) ) )
 ( ( ( ( (web:   http://www.umh.ac.be/~econum
   ) ) ) ) )  http://www.sciviews.org
 ( ( ( ( (
 ..
 
 ghjmora g mail wrote:
  Hello
  
  
  1. a few months ago, I had sciviews working fine with R 
 (rw2001) under 
  windows XP 2. now, upgrading to rw2011, the stuff seems fine (every 
  package installed),but I find a conflict when launching sciviews:
  - it runs, apparently
  - but when I try to work (import/export In: text for 
 instance), it 
  asks for Rcmdr (Would you like to install it now?)
  
  3. Rcmdr is already installed (with all dependencies) and 
 works well 
  when called directly in R gui 4. and it's impossible to make it 
  reconized or to install it under sciviews
  
  
  I have all the latest packages, and I am going to get mad.
  
  what do you suggest to solve my problem ?
  
  Thanks
  
  Georges Moracchini
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
  http://www.R-project.org/posting-guide.html
  
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] 3D ellipsoid confidence region

2005-06-28 Thread John Fox
Dear Melanie,

I think that you're referring to the data.ellipse function in the car
package. What you're suggesting is a nice idea but I haven't done it. I
think that it might be easier to draw the ellipsoid using the rgl
package rather than scatterplot3d. (See, e.g., the scatter3d function
in the Rcmdr package.)

Regards,
 John

On Tue, 28 Jun 2005 12:35:35 -0700
 Melanie Edwards [EMAIL PROTECTED] wrote:
 I am curious if there is code developed to plot confidence regions in
 3D.
 The scatterplot3d function generates the plot I want, but would like
  a 3D
 equivalent to the data.ellipse function.
 
 Any help in this direction would be appreciated, be it theoretical,
 graphical, or otherwise.
 
 Melanie Edwards
 Senior Statistician
 
 Exponent
 15375 SE 30th PL, Suite 250
 Bellevue, WA  98007
 Tel:  (425) 519-8714
 Fax:  (425) 643-9827
 Cell:  (206) 852-5739
 www.exponent.com
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to save changed options in Rcmdr

2005-06-23 Thread John Fox
Dear Shige,

You can set appropriate options when R starts up. In R for Windows, the
easiest thing to do is probably to put an options() command in the Rprofile
file in R's etc directory. For more information on setting Rcmdr options,
see ?Commander (also accessible from the Rcmdr Help menu). For more
information on R initialization, see ?Startup.
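An Rprofile fragment along these lines might look as follows; the option names shown here are purely illustrative, so consult ?Commander for the names your Rcmdr version actually recognizes.

```r
# Illustrative Rprofile entry: persist R Commander settings across
# sessions by setting the "Rcmdr" option at startup.
# (Option names below are examples only, not a guaranteed Rcmdr API.)
options(Rcmdr = list(log.font.size = 12))
```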

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Shige Song
 Sent: Thursday, June 23, 2005 12:19 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] How to save changed options in Rcmdr
 
 Dear All,
 
 I want to change the default options of Rcmdr; it seemed to 
 work when I made changes and click the Exit and Restart R 
 Commander. However, next time I open Rcmdr, it automatically 
 restored to the default options. Is there a way to change 
 Rcmdr's options permanently? Thanks!
 
 Shige
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Another question about Rcmdr

2005-06-23 Thread John Fox
Dear Shige,

I'm afraid that my Linux system isn't up-to-date (and isn't a Suse system).
Since I'm about to leave town, I won't be able to check this out further
now. I expect that if the problem were general, I'd have heard of it before,
but I can't be sure that's the case. Perhaps someone else has used the Rcmdr
under R 2.1.1 on a Linux system and can report success or failure.

I'm sorry that I can't be of more help,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Shige Song
 Sent: Thursday, June 23, 2005 2:53 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Another question about Rcmdr
 
 Hi,
 
 I downloaded the new R 2.1.1 source, compiled and installed on my suse
 9.3 box. Then I installed the package Rcmdr. Surprisingly, I 
 cannot type anything onto the script window. I uninstalled R 
 and grabbed an rpm version from CRAN, installed it, and 
 installed Rcmdr, same problem!
 
 Has anyone else had this problem?
 
 Shige
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] test for equality of two data sets withmultidimensional variables

2005-06-21 Thread John Fox
Dear Petr and Shengzhe,

Also see ?manova.
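A toy sketch of what ?manova leads to for the two-group comparison (simulated data; with two groups the Hotelling-Lawley statistic corresponds to Hotelling's T^2):

```r
# One-way MANOVA on simulated data: two groups, two response variables
set.seed(1)
g <- gl(2, 20)                    # grouping factor, 20 cases per group
Y <- cbind(rnorm(40), rnorm(40))  # matrix of two response variables
fit <- manova(Y ~ g)
summary(fit, test = "Hotelling-Lawley")
```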

John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Petr Pikal
 Sent: Tuesday, June 21, 2005 7:30 AM
 To: wu sz; r-help@stat.math.ethz.ch
 Subject: Re: [R] test for equality of two data sets 
 withmultidimensional variables
 
 Hi
 
 searching in CRAN homepage for hotelling gave me this response
 
 Thanks everyone for your help on this question. I solved the 
 problem by writing a procedure to calculate Hotelling's T^2 
 for a one-sample, multivariate t-test. Here's how it looks, 
 perhaps it will be useful to others. 
 
 
  data <- cbind(rnorm(50, 0.1, .01), rnorm(50, .1, .01),
  rnorm(50, .1, .01))
  k <- ncol(data)
  n <- nrow(data)
  xbar <- apply(data, 2, mean)
  mubar <- rep(0, k)  # hypothesized means are zero
  dbar <- xbar - mubar
  v <- var(data)
  t2 <- n * dbar %*% solve(v) %*% dbar
  F <- (n - k) * t2 / ((n - 1) * k)
  P <- 1 - pf(F, k, n - k)
 
 
 A previous post by Peter B. Mandeville was very helpful, as 
 well as the Johnson/Wichern book on multivariate stats. 
 -S. Schultz 
 
 and this
 
  cran.r-project.org/doc/packages/agce.pdf
 
 CRAN - Package SharedHT2
 SharedHT2: Shared Hotelling T2 test for small sample 
 microarray experiments ...
 Derives a Hotelling T2 statistic having an F-distribution 
 using an empirical ...
 
 Maybe this is what you want.
 
 HTH
 Petr
 
 
 
 
 
 On 21 Jun 2005 at 13:00, wu sz wrote:
 
  Hello there,
  
  I have two data sets with 14 variables each, and wish to do 
 the test 
  for equality of their covariance matrices and mean vectors. 
 Normally 
  these tests should be done by chi square test (box provided) and 
  Hotelling's T square test respectively. Which R functions could do 
  this kind of test? I just find some functions could do for one 
  dimension, but no for multidimension. Some one suggests 
 bartlett.test, 
  but it seems just works for one dimension. Do you know which ones 
  could do that, or I have to do R programming by myself?
  
  Thank you,
  Shengzhe
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide!
  http://www.R-project.org/posting-guide.html
 
 Petr Pikal
 [EMAIL PROTECTED]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Factanal loadings as large as 1.2 with promax -- how unusual?

2005-06-20 Thread John Fox
Dear Ben,

To get big factor loadings like this, I'd guess that you have large
communalities (not a bad thing, of course, and probably less rather than
more likely with crudely measured variables) and strong correlations between
factors. I suppose that if the latter get too large that might suggest that
you could get by with fewer factors.

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Ben Fairbank
 Sent: Monday, June 20, 2005 4:10 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Factanal loadings as large as 1.2 with promax -- 
 how unusual?
 
 I am performing a large (105 variable) factor analysis with 
  factanal, specifying promax rotation.  I know that some 
  loadings over 1.0 are not unusual with that rotation, but I 
 have some as large as 1.2, which seems extreme.  I am 
 skirting the assumptions of the model by using responses on a 
 7-point rating scale as data; I may have to go back and 
 compute polychoric correlations instead of product moment, 
 but before doing that I would like to know if others have had 
 equally large factor loadings using data that are truly 
 interval level data on continuous scales.
  
 Thanks for suggestions or information,
  
 Ben Fairbank
  
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problem with multinom ?

2005-06-11 Thread John Fox
Dear Marc,

I get the same results -- same coefficients, standard errors, and fitted
probabilities -- from multinom() and glm(). It's true that the deviances
differ, but they, I believe, are defined only up to an additive constant:

> dt
  output factor  n
1      m    1.2 10
2      f    1.2 12
3      m    1.3 14
4      f    1.3 14
5      m    1.4 15
6      f    1.4 12

> dt.m <- multinom(output ~ factor, data=dt, weights=n)
# weights:  3 (2 variable)
initial  value 53.372333 
iter  10 value 53.115208
iter  10 value 53.115208
iter  10 value 53.115208
final  value 53.115208 
converged

> dt2
   m  f factor
1 10 12    1.2
2 14 14    1.3
3 15 12    1.4

> dt.b <- glm(cbind(m,f) ~ factor, data=dt2, family=binomial)

> summary(dt.m)
Call:
multinom(formula = output ~ factor, data = dt, weights = n)

Coefficients:
               Values Std. Err.
(Intercept) -2.632443  3.771265
factor       2.034873  2.881479

Residual Deviance: 106.2304 
AIC: 110.2304 

Correlation of Coefficients:
[1] -0.9981598

> summary(dt.b)

Call:
glm(formula = cbind(m, f) ~ factor, family = binomial, data = dt2)

Deviance Residuals: 
   1 2 3  
 0.01932  -0.03411   0.01747  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -2.632      3.771  -0.698    0.485
factor         2.035      2.881   0.706    0.480

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 0.5031047  on 2  degrees of freedom
Residual deviance: 0.0018418  on 1  degrees of freedom
AIC: 15.115

Number of Fisher Scoring iterations: 2

> predict(dt.b, type = "response")
        1         2         3 
0.4524946 0.5032227 0.5538845 
 
> predict(dt.m, type = "probs")
        1         2         3         4         5         6 
0.4524948 0.4524948 0.5032229 0.5032229 0.5538846 0.5538846 

These fitted probabilities *are* correct: since the members of each pair
(1,2), (3,4), and (5,6) have identical values of factor, they have identical
fitted probabilities.

(Note, by the way, the large negative correlation between the coefficients,
produced by the configuration of factor values.)

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Marc Girondot
 Sent: Saturday, June 11, 2005 3:06 AM
 To: John Fox
 Cc: r-help@stat.math.ethz.ch
 Subject: [R] Problem with multinom ?
 
 Thanks for your response.
 OK, multinom() is a more logical in this context.
 
 But similar problem occurs:
 
 Let these data to be analyzed using classical glm with binomial error:
 
 m   f   factor  m theo       f theo       -Ln L model  -Ln L full   intercept    factor
 10  12  1.2     0.452494473  0.547505527  1.778835688  1.778648963  2.632455675  -2.034882223
 14  14  1.3     0.503222759  0.496777241  1.901401922  1.900820284
 15  12  1.4     0.553884782  0.446115218  1.877062369  1.876909821
 
                                Sum -Ln L:  5.557299979  5.556379068
                                Deviance:  11.11459996  11.11275814   Residual deviance: 0.001841823
 
 If I try to use multinom() function to analyze these data, 
 the fitted parameters are correct but the residual deviance not.
 
   dt <- read.table('/try.txt', header=TRUE)
   dt
     output factor  n
  1      m    1.2 10
  2      f    1.2 12
  3      m    1.3 14
  4      f    1.3 14
  5      m    1.4 15
  6      f    1.4 12
   dt.plr <- multinom(output ~ factor, data=dt, weights=n, maxit=1000)
 # weights:  3 (2 variable)
 initial  value 53.372333
 iter  10 value 53.115208
 iter  10 value 53.115208
 iter  10 value 53.115208
 final  value 53.115208
 converged
   dt.plr
 Call:
 multinom(formula = output ~ factor, data = dt, weights = n, 
 maxit = 1000)
 
 Coefficients:
 (Intercept)   factor
   -2.632443 2.034873
 
 Residual Deviance: 106.2304
 AIC: 110.2304
 
   dt.pr1 <- predict(dt.plr, type="probs")
   dt.pr1
         1         2         3         4         5         6
 0.4524948 0.4524948 0.5032229 0.5032229 0.5538846 0.5538846
 
 The probabilities for 2, 4 and 6 are not correct, and this explains 
 the non-correct residual deviance obtained in R.
 Probably the problem I have is due to an incorrect data 
 format... could someone help me... 
 Thanks
 
 (I know there is a simple way to analyze binomial data, but 
 ultimately I want to use multinom() for 5 categories of outputs.)
 
 
 Thanks a lot
 
 Marc
 -- 
 
 __
 Marc Girondot, Pr
 Laboratoire Ecologie, Systématique et Evolution Equipe de 
 Conservation des Populations et des Communautés CNRS, ENGREF 
 et Université Paris-Sud 11 , UMR 8079 Bâtiment 362
 91405 Orsay Cedex, France
 
 Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1 69 
 15 56 96   e-mail: [EMAIL PROTECTED]
 Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html

Re: [R] Problem with multinom ?

2005-06-11 Thread John Fox
Dear Marc,


 -Original Message-
 From: Marc Girondot [mailto:[EMAIL PROTECTED] 
 Sent: Saturday, June 11, 2005 2:16 PM
 To: John Fox
 Cc: [EMAIL PROTECTED]
 Subject: RE: [R] Problem with multinom ?
 
 Dear Marc,
 
 I get the same results -- same coefficients, standard errors, and 
 fitted probabilities -- from multinom() and glm(). It's true 
 that the 
 deviances differ, but they, I believe, are defined only up 
 to an additive constant:
 
    predict(dt.b, type="response")
         1         2         3
 0.4524946 0.5032227 0.5538845
 
    predict(dt.m, type="probs")
         1         2         3         4         5         6
 0.4524948 0.4524948 0.5032229 0.5032229 0.5538846 0.5538846
 
 These fitted probabilities *are* correct: Since the members of each 
 pair (1,2), (3,4), and (5,6) have identical values of factor 
 they are 
 identical fitted probabilities.
 
 I expected rather to obtain (note 1- before some terms):
  1 2 3 4 5 6
 0.4524948 1-0.4524948 0.5032229 1-0.5032229 0.5538846 1-0.5538846
 
 ...

But the fitted probabilities are for each observation in the data set; in
your data, these have identical values of factor for each pair and hence
identical fitted probabilities. When the response is dichotomous, you get
only one of the two fitted probabilities for each observation; for a
polytomous response, you get a matrix of fitted probabilities which sum to 1
row-wise.
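
A minimal sketch (my own illustration, not from the original posts; data invented) of the row-wise structure described above: for a polytomous response, predict(..., type="probs") from multinom() returns a matrix with one row per observation whose rows sum to 1.

```r
## Hypothetical 3-level response, just to show the shape of the output.
library(nnet)
d <- data.frame(y = factor(c("a", "b", "c", "a", "b", "c")),
                x = c(1, 2, 3, 4, 5, 6))
fit <- multinom(y ~ x, data = d, trace = FALSE)
p <- predict(fit, type = "probs")   # 6-row, 3-column matrix
rowSums(p)                          # each row sums (numerically) to 1
```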

Regards,
 John

 
 Marc
 -- 
 
 __
 Marc Girondot, Pr
 Laboratoire Ecologie, Systématique et Evolution Equipe de 
 Conservation des Populations et des Communautés CNRS, ENGREF 
 et Université Paris-Sud 11 , UMR 8079 Bâtiment 362
 91405 Orsay Cedex, France
 
 Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1 69 
 15 56 96   e-mail: [EMAIL PROTECTED]
 Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
 Skype: girondot
 Fax in US: 1-425-732-6934



Re: [R] problem with polr ?

2005-06-10 Thread John Fox
Dear Marc,

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Marc Girondot
 Sent: Friday, June 10, 2005 3:44 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] problem with polr ?
 
 I want to fit a multinomial model with logit link.
 For example let this matrix to be analyzed:
 male   female   aborted   factor
 10 12 1 1.2
 14 14 4 1.3
 15 12 3 1.4
 
 (this is an example, not the true data which are far more complex...)
 
 I suppose the correct function to analyze these data is polr 
 from MASS library.
 

Actually, polr fits the proportional-odds logistic regression model (or a
similar ordered probit model), which requires an ordinal response. I don't
believe that it makes sense to think of a < f < m. For the multinomial logit
model, see the multinom function in the nnet package.

 The data have been entered in a text file like this:
 
 output  factor  n
 m   1.2 10
 f   1.2 12
 a   1.2 1
 m   1.3 14
 f   1.3 14
 a   1.3 4
 m   1.4 15
 f   1.4 12
 a   1.4 3
 
 However, after having performed the analysis, it appears this 
 is not correct as the 3 outputs per experiment are not linked...
 
 library(MASS)
 dt.plr <- polr(output ~ factor, data=dt, weights=n)
 dt.pr1 <- predict(dt.plr, type="probs")
 dt.pr1
 a f m
 1 0.09987167 0.4578184 0.4423099
 2 0.09987167 0.4578184 0.4423099
 3 0.09987167 0.4578184 0.4423099
 4 0.09437078 0.4477902 0.4578390
 5 0.09437078 0.4477902 0.4578390
 6 0.09437078 0.4477902 0.4578390
 7 0.08914287 0.4374067 0.4734505
 8 0.08914287 0.4374067 0.4734505
 9 0.08914287 0.4374067 0.4734505
 

These are the fitted probabilities of response in each of the three
categories. Note that each row sums to 1, as it should. Of course, the model
likely doesn't make sense.

 
 ---Another linked problem
 
 Also, I don't understand what the meaning of the residual 
 deviance that is displayed.
 Let modify the file so that the model can also be analyzed 
 using binomial model:
 
 output  factor  n
 m   1.2 10
 f   1.2 12
 a   1.2 0
 m   1.3 14
 f   1.3 14
 a   1.3 0
 m   1.4 15
 f   1.4 12
 a   1.4 0
 
 dt.plr
 Call:
 polr(formula = output ~ factor, data = dt, weights = n)
 
 Coefficients:
factor
 2.034848
 
 Intercepts:
        a|f       f|m
 -16.306511  2.632410
 
 Residual Deviance: 106.2304
 AIC: 112.2304
 
 whereas the corresponding scaled deviance for the binomial 
 model (removing a column) is 1.842e-3...
 
 

I'm surprised that you were able to get estimates here at all, since there
are no observations in category a; nevertheless, polr is estimating two
thresholds, one between the never-observed a and f. I expect that this is a
numerical artifact. Note that if you simply remove the rows for a rather
than set the counts to 0, polr will complain that there are only two
categories.

I hope this helps,
 John

 -- 
 
 __
 Marc Girondot, Pr
 Laboratoire Ecologie, Systématique et Evolution
 Equipe de Conservation des Populations et des Communautés
 CNRS, ENGREF et Université Paris-Sud 11 , UMR 8079
 Bâtiment 362
 91405 Orsay Cedex, France
 
 Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1 69 
 15 56 96   e-mail: [EMAIL PROTECTED]
 Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
 Skype: girondot
 Fax in US: 1-425-732-6934
 



RE: [R] Forecasting with macroeconomic structural equations models?

2005-06-09 Thread John Fox
Dear Ronaldo,

As you discovered, there are no predict methods in the sem package, either
for tsls or sem objects. You might have a look at the systemfit package,
which does support predict().

More generally, you should be able to substitute out identities and base
predictions on the reduced form of the model. To do so, you'll need
forecasts for the exogenous variables in the system.

I hope that this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Ronaldo Carpio
 Sent: Wednesday, June 08, 2005 9:10 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Forecasting with macroeconomic structural 
 equations models?
 
 
 Hello,
 
 Is there a package or sample code that shows how to do ex 
 ante forecasts with a macroeconomic structural equations 
 model?  I looked at the sem package, which lets you 
 estimate e.g. Klein's model, but I'm not sure how to make 
 simulations using the full set of equations, including the identities.
 
 
 Thank you,
 
 Ronaldo Carpio
 [EMAIL PROTECTED]




RE: [R] chisq.test and anova problems

2005-06-06 Thread John Fox
Dear Richard,

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Richard Mendes
 Sent: Monday, June 06, 2005 8:44 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] chisq.test and anova problems
 
 we just started to use R and are having some problems that no one 
 in our school could solve. I hope someone here can help me out.
 
 the first problem is with the chisquare test. we want to 
 exclude the missing values from the data. we used na.omit and 
 made two new variables.
 
 now we want to use the chi-square method but get the error 'x 
 and y must have the same length'.
 
 how do i use the chisquare method where i exclude the missing values?
 and can i use this method if there is a difference in length?
 

Take a look at the help file for chisq.test (e.g., ?chisq.test): Missing
data are excluded when the contingency table for x and y is formed, so you
need do nothing special to get what you want -- just chisq.test(x, y).
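
A toy sketch of this behavior (data invented here, not from the thread): chisq.test() builds the contingency table with table(x, y), which silently drops any pair containing an NA.

```r
## Pairs 4 and 5 contain an NA and are excluded from the table.
x <- factor(c("a", "b", "a", NA, "b", "a"))
y <- factor(c("u", "u", "v", "v", NA, "v"))
table(x, y)       # built from the 4 complete pairs only
chisq.test(x, y)  # same exclusion happens internally
```

(With counts this small, chisq.test() will warn that the chi-squared approximation may be inaccurate; the point is only the NA handling.)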

 the second problem is with anova in the data set we are 
 working on we have to use this method on multiple variables 
 with a difference in length, can this be done. this is the 
 syntax we used and the error is stated behind.
 
 anova(lm(test~test1)) and the error states variable lengths differ.
 

I'm guessing that test and test1 are two samples on the response variable
that you want to compare, though that's not entirely clear from your
question. To do a one-way anova via lm(), you should have all of the
observations on the response in one variable and a factor giving group
membership of each observation; then anova(lm(response~factor)) will give
you the one-way ANOVA table. Also take a look at ?aov. Finally, if I've
correctly guessed what you're trying to do, then t.test(test, test1) is an
alternative.
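
A hedged sketch of the reshaping described above (the sample values are invented; 'test' and 'test1' stand in for the poster's two variables): stack both samples into one response vector and build a matching group factor.

```r
## Two samples of unequal length, stacked for a one-way ANOVA.
test  <- c(5.1, 4.8, 6.0, 5.5)
test1 <- c(6.2, 6.8, 7.1)
response <- c(test, test1)
group <- factor(rep(c("test", "test1"),
                    c(length(test), length(test1))))
anova(lm(response ~ group))  # one-way ANOVA despite unequal group sizes
```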

More generally, have you looked at the introductory manual that comes with
R?

I hope this helps,
 John

 i think there has to be a way to use this method on 
 differences in variable lengths; does anybody know how to do this
 
 thanks in advance for your help
 
 richard



RE: [R] R GUI for Linux?

2005-05-30 Thread John Fox
Dear Charles,


 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of White, 
 Charles E WRAIR-Wash DC
 Sent: Monday, May 30, 2005 10:52 AM
 To: r-help@stat.math.ethz.ch
 Cc: [EMAIL PROTECTED]
 Subject: [R] R GUI for Linux?
 

. . .

 
 Rcmdr: There are all sorts of things in FC3 that seem to be 
 tcl/tk related but Rcmdr doesn't seem to work with them. 
 Since some are part of the base FC3 installation, I'm 
 nervous about replacing them or installing competing 
 software. Potentially conflicting software in FC3 are listed below:
 
 tcl.i386 8.4.7-2  
   installed
 tclx.i3868.3.5-4  
   installed
 db4-tcl.i386 4.2.52-6   base
 postgresql-tcl.i386  7.4.8-1.FC3.1
   updates-released
 ruby-tcltk.i386  1.8.2-1.FC3.1
   updates-released
 tcl-devel.i386   8.4.7-2base
 tcl-html.i3868.4.7-2base
 tclx-devel.i386  8.3.5-4base
 tclx-doc.i3868.3.5-4base
 

. . .

I haven't tried the Rcmdr package with FC3, but I ran it under an older
version of Red Hat Linux some time ago, and the current version (Rcmdr
1.0-2) under Quantian.

Can you be more specific about the problems that you encountered? Are these
general to the tcltk package or specific to the Rcmdr?

I'm sorry that you're experiencing problems.

John



RE: [R] Errors in Variables

2005-05-29 Thread John Fox
Dear Spencer,

The reason that I didn't respond to the original posting (I'm the author of
the sem package) is that, without additional information (such as the
error variance of x), a model with error in both x and y will be
underidentified (unless there are multiple indicators of x, which didn't
seem to be the case here). I figured that what Jacob had in mind was
something like minimizing the least (orthogonal) distance of the points to
the regression line (implying by the way that x and y are on the same scale
or somehow standardized), which isn't doable with sem as far as I'm aware.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Spencer Graves
 Sent: Saturday, May 28, 2005 4:47 PM
 To: Eric-Olivier Le Bigot
 Cc: r-help@stat.math.ethz.ch; Jacob van Wyk
 Subject: Re: [R] Errors in Variables
 
 I'm sorry, I have not followed this thread, but I 
 wonder if you have considered library(sem), structural 
 equations modeling?  Errors in variables problems are the 
 canonical special case.
 
 Also, have you done a search of www.r-project.org 
 -> search -> R site search for terms like errors in 
 variables regression?  This just led me to ODRpack, which 
 is NOT a CRAN package but is apparently available after a 
 Google search.  If it were my problem, I'd first try to 
 figure out sem;  if that seemed too difficult, I might then 
 look at ODRpack.
 
 Also, have you read the posting guide! 
 http://www.R-project.org/posting-guide.html?  This suggests, 
 among other things, that you provide a toy example that a 
 potential respondent could easily copy from your email, test 
 a few modifications, and phrase a reply in a minute or so.  
 This also helps clarify your question so any respondents are 
 more likely to suggest something that is actually useful to 
 you.  Moreover, many people have reported that they were able 
 to answer their own question in the course of preparing a 
 question for this list using the posting guide.
 
 hope this helps.  spencer graves
 
 Eric-Olivier Le Bigot wrote:
 
  I'm interested in this 2D line fitting too!  I've been looking, 
  without success, in the list of R packages.
  
  It might be possible to implement quite easily some of the 
 formalism 
  that you can find in Numerical Recipes (Fortran 77, 2nd ed.), 
  paragraph 15.3.  As a matter of fact, I did this in R but 
 only for a 
  model of the form y ~ x (with a given covariance matrix 
 between x and 
  y).  I can send you the R code (preliminary version: I 
 wrote it yesterday), if you want.
  
  Another interesting reference might be Am. J. Phys. 60, p. 
 66 (1992).  
  But, again, you would have to implement things by yourself.
  
  All the best,
  
  EOL
  
  -- 
  Dr. Eric-Olivier LE BIGOT (EOL)CNRS 
 Associate Researcher
  
 ~~~o~o
 o~o~~~
  Kastler Brossel Laboratory (LKB)   
 http://www.lkb.ens.fr
 Université P. & M. Curie and Ecole Normale Supérieure, Case 74
  4 place Jussieu  75252 Paris CEDEX 05   
   France
  
 ~~~o~o
 o~o~~~
  office  : 01 44 27 73 67 fax: 
 01 44 27 38 45
  ECR room: 01 44 27 47 12  x-ray room: 
 01 44 27 63 00
  home: 01 73 74 61 87  For int'l calls: 33 + number 
 without leading 0
  
  
  On Wed, 25 May 2005, Jacob van Wyk wrote:
  
  I hope somebody can help.
  A student of mine is doing a study on Measurement Error models 
  (errors-in-variables, total least squares, etc.). I have an old 
  reference to a multi archive  that contains
  leiv3: Programs for best line fitting with errors in both 
 coordinates.
  (The date is October 1989, by B.D. Ripley et al.) I have done a 
  search for something similar in R withour success. Has this been 
  implemented in a R-package, possibly under some sort of 
 assumptions 
  about variances. I would lke my student to apply some regression 
  techniques to data that fit this profile.
  Any help is much appreciated.
  (If I have not done my search more carefully - my 
 apologies.) Thanks 
  Jacob
 
 
  Jacob L van Wyk
  Department of Mathematics and Statistics University of 
 Johannesburg 
  APK P O Box 524 Auckland Park 2006 South Africa
  Tel: +27-11-489-3080
  Fax: +27-11-489-2832
 

RE: [R] text-function: adding text in an x,y-plot

2005-05-29 Thread John Fox
Dear Helmut,

The problem is that you're implicitly coercing the entire data frame lab to
character, producing

> as.character(lab)
[1] "c(1, 3, 4, 5, 6, 7, 8, 9, 10, 2)"

The numbers themselves come from the numeric coding of the factor V1 in this
data frame; as.numeric(lab$V1) shows you what's going on.

Instead you could specify labels=lab$V1 in the text() command. But why put
the row names in a data frame in the first place; why not simply use
labels=rownames(out.pca)?
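
A small sketch of that suggestion, with made-up data standing in for the poster's samples. One caveat (my observation, assuming out.pca comes from prcomp() as in the question): out.pca$x is a matrix, so its columns are indexed with [ , "PC1"] rather than $PC1, and the row names live on out.pca$x.

```r
## Hypothetical stand-in for the s1-s10 data set.
set.seed(1)
m <- matrix(rnorm(30), nrow = 10,
            dimnames = list(paste("s", 1:10, sep = ""),
                            paste("var", 1:3, sep = "")))
out.pca <- prcomp(m)
pts <- out.pca$x
plot(pts[, "PC1"], pts[, "PC2"], type = "n")          # empty frame
text(pts[, "PC1"], pts[, "PC2"],
     labels = rownames(out.pca$x))                    # sample names at (x, y)
```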

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Helmut 
 Kudrnovsky
 Sent: Sunday, May 29, 2005 11:08 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] text-function: adding text in an x,y-plot
 
 Hello R-friends,
 
 i have a question to the text-function.
 
 a little test-dataset for better understanding:
 
 -the dataset was imported with read.table(,header=TRUE) 
 s1-s10 are the samplenames
 
      var1 var2 var3
  s1     1    1    2
  s2     2    3    1
  s3     2    2    3
  s4     5    4    3
  s5     4    2    3
  s6     6    3    2
  s7     8    5    4
  s8     7    2    1
  s9     9    3    2
  s10    8    6    4
 
 -then i've performed a principal component analysis with prcomp
 
 -for displaying the result in a x,y-graph i transformed 
 out.pca$x into the dataframe points
   points <- out.pca$x
   points
 PC1 PC2PC3
 s1  -4.7055777 -0.14781544 -0.3683602
 s2  -3.1854599  0.19661476  1.5212455
 s3  -3.2687980  0.78193513 -0.6352458
 s4   0.2278948  1.00061498  0.2164108
 s5  -1.4382847  0.02500633 -0.9114340
 s6   0.6216283 -0.68606440  0.2071083
 s7   3.4951878  1.17343675 -0.3266629
 s8   1.0153619 -2.37274378  0.1978058
 s9   3.3673983 -1.82145761 -0.2071739
 s10  3.8706492  1.85047328  0.3063065
 
 - with plot(points$PC1, points$PC2, type="n") i start a graph 
 without displaying anything at the x,y-coordinates
 
 - i want now to display the samplenames at the right 
 x,y-coordinate;  i generated a dataframe called lab with the 
 samplenames
 
   lab
  V1
 1   s1
 2   s2
 3   s3
 4   s4
 5   s5
 6   s6
 7   s7
 8   s8
 9   s9
 10 s10
 
 - i´ve studied the text-helppage and so i tried the following:
 
 text(pca$PC1, pca$PC2, labels=lab)
 
 - in the plot there is now c(1,3,4,5,6,7,8,9,10,2) displayed 
 instead of s1-s10
 
 does anyone have an idea what i'm doing wrong? is there maybe 
 another way of displaying the samplenames at the right x,y-coordinate?
 
 greetings from the sunny tirol with thanks in advance helli
 
 platform i386-pc-mingw32
 arch i386
 os   mingw32 - win xp
 system   i386, mingw32
 status
 major2
 minor1.0
 year 2005
 month04
 day  18
 language R
 



RE: [R] Errors in Variables

2005-05-29 Thread John Fox
Dear Spencer,

 -Original Message-
 From: Spencer Graves [mailto:[EMAIL PROTECTED] 
 Sent: Sunday, May 29, 2005 4:13 PM
 To: John Fox
 Cc: r-help@stat.math.ethz.ch; 'Jacob van Wyk'; 'Eric-Olivier Le Bigot'
 Subject: Re: [R] Errors in Variables
 
 Hi, John:
 
 Thanks for the clarification.  I know that the 
 errors in X problem 
 requires additional information, most commonly one of the 
 variances or the correlation.  The question I saw (below) 
 indicated he had tried model of the form y ~ x (with a given 
 covariance matrix ...), which made me think of sem.
 
 If he wants the least (orthogonal) distance, could 
 he could get it indirectly from sem by calling sem 
 repeatedly giving, say, a variance for x, then averaging 
 the variances of x and y and trying that in sem?
 

I'm not sure how that would work, but seems similar to averaging the
regressions of y on x and x on y.

 Also, what do you know about ODRpack?  It looks 
 like that might solve the least (orthogonal) distance.
 

I'm not familiar with ODRpack, but it seems to me that one could fairly
simply minimize the sum of squared least distances using, e.g., optim.
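
A rough sketch of that idea (my own construction, untested against ODRpack; data invented): parameterize the line as y = a + b*x, note that the squared orthogonal distance from (x, y) to the line is (y - a - b*x)^2 / (1 + b^2), and minimize the sum with optim().

```r
## Orthogonal (total least squares) line fit via optim().
set.seed(1)
x <- 1:20
y <- 2 + 0.5 * x + rnorm(20, sd = 0.3)
ssq <- function(p) {
  a <- p[1]; b <- p[2]
  sum((y - a - b * x)^2 / (1 + b^2))  # summed squared orthogonal distances
}
fit <- optim(c(0, 1), ssq)  # Nelder-Mead from a crude start
fit$par                     # estimated intercept and slope
```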

Regards,
 John

 Thanks again for your note, John.
 Best Wishes,
 Spencer Graves
 
 John Fox wrote:
 
  Dear Spencer,
  
  The reason that I didn't respond to the original posting (I'm the 
  author of the sem package), that that without additional 
 information 
  (such as the error variance of x), a model with error in 
 both x and y 
  will be underidentified (unless there are multiple indicators of x, 
  which didn't seem to be the case here). I figured that what 
 Jacob had 
  in mind was something like minimizing the least 
 (orthogonal) distance 
  of the points to the regression line (implying by the way 
 that x and y 
  are on the same scale or somehow standardized), which isn't 
 doable with sem as far as I'm aware.
  
  Regards,
   John
  
  
  John Fox
  Department of Sociology
  McMaster University
  Hamilton, Ontario
  Canada L8S 4M4
  905-525-9140x23604
  http://socserv.mcmaster.ca/jfox
  
  
  
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Spencer Graves
 Sent: Saturday, May 28, 2005 4:47 PM
 To: Eric-Olivier Le Bigot
 Cc: r-help@stat.math.ethz.ch; Jacob van Wyk
 Subject: Re: [R] Errors in Variables
 
   I'm sorry, I have not followed this thread, but I 
 wonder if you 
 have considered library(sem), structural equations modeling?  
 Errors in variables problems are the canonical special case.
 
   Also, have you done a search of www.r-project.org 
 - search - R site search for terms like errors in
 variables regression?  This just led me to ODRpack, 
 which is NOT a 
 CRAN package but is apparently available after a Google 
 search.  If it 
 were my problem, I'd first try to figure out sem;  if that seemed 
 too difficult, I might then look at ODRpack.
 
   Also, have you read the posting guide! 
 http://www.R-project.org/posting-guide.html?  This suggests, among 
 other things, that you provide a toy example that a potential 
 respondant could easily copy from your email, test a few 
 modifications, and prase a reply in a minute or so.
 This also helps clarify your question so any respondants are more 
 likely to suggest something that is actually useful to you. 
  Moreover, 
 many people have reported that they were able to answer their own 
 question in the course of preparing a question for this 
 list using the 
 posting guide.
 
   hope this helps.  spencer graves
 
 Eric-Olivier Le Bigot wrote:
 
 
 I'm interested in this 2D line fitting too!  I've been looking, 
 without success, in the list of R packages.
 
 It might be possible to implement quite easily some of the
 
 formalism
 
 that you can find in Numerical Recipes (Fortran 77, 2nd ed.), 
 paragraph 15.3.  As a matter of fact, I did this in R but
 
 only for a
 
 model of the form y ~ x (with a given covariance matrix
 
 between x and
 
 y).  I can send you the R code (preliminary version: I
 
 wrote it yesterday), if you want.
 
 Another interesting reference might be Am. J. Phys. 60, p. 
 
 66 (1992).  
 
 But, again, you would have to implement things by yourself.
 
 All the best,
 
 EOL
 
 -- 
 Dr. Eric-Olivier LE BIGOT (EOL)CNRS 
 
 Associate Researcher
 
 ~~~o~o
 o~o~~~
 
 Kastler Brossel Laboratory (LKB)   
 
 http://www.lkb.ens.fr
 
 Université P.  M. Curie and Ecole Normale Supérieure, Case 74
 4 place Jussieu  75252 Paris CEDEX 05   
 
   France
 
 ~~~o~o
 o~o~~~
 
 office  : 01 44 27 73 67 fax: 
 
 01 44 27 38 45
 
 ECR room: 01 44 27 47 12  x-ray room: 
 
 01 44 27 63 00
 
 home

RE: [R] precision problem

2005-05-25 Thread John Fox
Dear Omar,

Perhaps I'm missing something, but why not just subtract one matrix from the
other and test the difference in relation to the precision that you require
for the comparison? E.g., to test near equality, something like abs(A - B)
< 1e-13.
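
A minimal sketch of this element-wise tolerance test (matrices and tolerance invented for illustration):

```r
## Two matrices that differ only by rounding-level noise.
A <- matrix(c(1, 2, 3, 4), 2, 2)
B <- A + 1e-15
all(abs(A - B) < 1e-13)   # TRUE: equal to within the chosen tolerance
isTRUE(all.equal(A, B))   # TRUE: all.equal uses a relative tolerance
```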

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Omar Lakkis
 Sent: Wednesday, May 25, 2005 9:09 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] precision problem
 
 I have prices that I am finding difficult to compare with ==, 
 <, and >, due to precision. For example: the numbers should 
 match with '==', but they differ in the magnitude of 1e-14 
 due to a bunch of calculations that I run on them. Programming 
 with Java, I am used to implementing a function that compares 
 the difference between the numbers to a predetermined 
 precision factor. This could be very slow when I have two 
 matrices of numbers that I could otherwise compare with a 
 simple '==', '<' or '>' in R.
 What is the best solution for this problem?
 Can I control the precision of ==, < and > without having to 
 reimplement the operations in a slow way?
 



RE: [R] precision problem

2005-05-25 Thread John Fox
Dear Omar,

It wasn't clear to me from your original question that you wanted to test
that *all* the corresponding entries were equal, as opposed to each
individual entry.

In any event, I don't think that you'll find a similar function for testing
inequality, so you can do as you suggest, but of course without abs().

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Omar Lakkis
 Sent: Wednesday, May 25, 2005 10:08 AM
 To: r-help@stat.math.ethz.ch
 Subject: Re: [R] precision problem
 
 all.equal is helpful when I am comparing the equality of two matrices.
 However, when I am comparing two individual numbers with < or 
 >, is my best bet doing if (abs(x - y) < tolerance), or is 
 there a function like all.equal that has the same default tolerance?
 
 On 5/25/05, Omar Lakkis [EMAIL PROTECTED] wrote:
  Thank you all.
  all.equal is very helpful since I am also interested in finding the 
  mismatched prices.
  
  On 5/25/05, John Fox [EMAIL PROTECTED] wrote:
   Dear Omar,
  
   Perhaps I'm missing something, but why not just subtract 
 one matrix 
   from the other and test the difference in relation to the 
 precision 
   that you require for the comparison? E.g., to test near equality, 
   something like abs(A - B) < 1e-13.
  
   I hope this helps,
John
  
   
   John Fox
   Department of Sociology
   McMaster University
   Hamilton, Ontario
   Canada L8S 4M4
   905-525-9140x23604
   http://socserv.mcmaster.ca/jfox
   
  
-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of 
 Omar Lakkis
Sent: Wednesday, May 25, 2005 9:09 AM
To: r-help@stat.math.ethz.ch
Subject: [R] precision problem
   
I have prices that I am finding difficult to compare with ==,
 <, and >, due to precision. For example: the numbers should
match, with '==', but they differ in the magnitude of 
 1e-14 due to 
bunch of calculations that I run on them. Programming 
 with java, I 
am used to implementing a function that compares the difference 
between the numbers to a pre determined precision factor. This 
could be very slow when  I have two matrices of numbers that I 
 could otherwise compare with a simple '==', '<' or '>' in R.
 What is the best solution for this problem?
 Can I control the precision of ==, < and > without having to 
reimplement the operations in a slow way?
   
  
  
 
 



RE: [R] colors and palettes and things...

2005-05-23 Thread John Fox
Dear Jeff,

Some time ago, Don McQueen posted to r-help a nice function for displaying
the named colours. You'll find it at
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/31607.html. As well, if
you're using R for Windows, the various named colours are defined in the
file rgb.txt in R's etc directory. 
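
For a quick look without downloading the posted function, a rough sketch along the same lines (the layout details are my own, not Don McQueen's):

```r
## Display the first 64 named colours as labelled swatches.
cols <- colors()[1:64]
plot(0, 0, type = "n", xlim = c(0, 8), ylim = c(0, 8),
     axes = FALSE, xlab = "", ylab = "")
for (i in seq_along(cols)) {
  x <- (i - 1) %% 8          # column in an 8-wide grid
  y <- (i - 1) %/% 8         # row
  rect(x, y, x + 1, y + 0.5, col = cols[i], border = NA)
  text(x + 0.5, y + 0.75, cols[i], cex = 0.5)
}
```

Changing the subscript of colors() pages through the remaining names.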

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Jeff D. Hamann
 Sent: Monday, May 23, 2005 10:35 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] colors and palettes and things... 
 
 After trying to find if there was a color picker in the FAQs 
 and the help, I thought I would send a post here. I was 
 overwhelmed with all the wonderful color choices R has 
 predefined (discovered after typing in
 colors()) but can't figure out what they all (by name) look 
 like. Is there a color picker or some other method to display 
 all those colors next to the name?
 
 I think I can put together palettes, but another question I 
 have then regards the building of palettes (a list of 
 variable length I can select or create myself other than the 
 ones defined by Palette) so I can pass these colors into 
 functions instead of having to predefine a bunch of colors 
 myself or use the predefined colors like terrain.colors(n)?
 
 Are there groups of colors in the colors() that I can group 
 together to make some nice palettes for drawing barplots, etc?
 
 Thanks,
 Jeff.
 
 
 --
 Jeff D. Hamann
 Forest Informatics, Inc.
 PO Box 1421
 Corvallis, Oregon 97339-1421
 phone 541-754-1428
 fax 541-752-0288
 [EMAIL PROTECTED]
 www.forestinformatics.com
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] installing R and Rcmdr onMacOS 10.3.9

2005-05-23 Thread John Fox
Dear Karla,

It's likely that you don't have Tcl/Tk installed on your Mac.

I know that it's possible to get the Rcmdr working on a Mac, but that doing
so requires some additional steps. I'm not a Mac user, so I'm afraid that I
can't be of much direct help. If someone who's using the Rcmdr on a Mac is
willing to prepare notes about how to get it installed and working, I'd
gladly post them on my web site.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Karla Van Meter
 Sent: Monday, May 23, 2005 11:43 AM
 To: R-help@stat.math.ethz.ch
 Subject: [R] installing R and Rcmdr onMacOS 10.3.9
 
 I have been trying to install the new, patched R 2.1.0a along 
 with the Rcmdr on my mac. I got the basic R package 
 installed, but in Project Manager, the tcltk package will not install.
 
   When I downloaded the Rcmdr binary package from the 
 Berkeley site, it did not show up on the package installer 
 list until a second attempt, but it still does not show up on 
 the package manager list.
 
 On the package manager list, the tcltk package is listed as 
 'not installed'. When I click on it to install, it won't. I tried the
 library(tcltk) command, which failed.
 
 The following is the output from the last command:
 
   library(Rcmdr)
 Loading required package: tcltk
 Loading Tcl/Tk interface ... Error in dyn.load(x, as.logical(local),
 as.logical(now)) :
   unable to load shared library
 '/Library/Frameworks/R.framework/Resources/library/tcltk/libs/
 tcltk.so':
dlcompat: dyld: /Applications/R.app/Contents/MacOS/R can't open
 library: /usr/X11R6/lib/libX11.6.dylib  (No such file or 
 directory, errno = 2)
 Error: .onLoad failed in 'loadNamespace' for 'tcltk'
 Error: package 'tcltk' could not be loaded   library(tcltk) 
 Loading Tcl/Tk interface ... Error in dyn.load(x, as.logical(local),
 as.logical(now)) :
   unable to load shared library
 '/Library/Frameworks/R.framework/Resources/library/tcltk/libs/
 tcltk.so':
dlcompat: dyld: /Applications/R.app/Contents/MacOS/R can't open
 library: /usr/X11R6/lib/libX11.6.dylib  (No such file or 
 directory, errno = 2)
 Error: .onLoad failed in 'loadNamespace' for 'tcltk'
 Error: package/namespace load failed for 'tcltk'
  
 
 I am NOT a developer, but an application-oriented grad student.
 
 I'd appreciate any help that will make Rcmdr work.
 
 Thanks,
 
 Karla Van Meter
 tel: 707.765.0420
 net: [EMAIL PROTECTED]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] R annoyances

2005-05-19 Thread John Fox
Dear Uwe,

I've often wondered why T and F aren't reserved words in R as TRUE and FALSE
are. Perhaps there's some use of T and F as variables, but that seems
ill-advised.
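The trap is easy to demonstrate (a minimal sketch; run it in a fresh
session, since it deliberately shadows F):

```r
# TRUE and FALSE are reserved words; T and F are ordinary variables
# bound to them, so an innocent assignment silently shadows F.
m <- matrix(1:4, 2, 2)
is.matrix(m[, 1, drop = F])   # F is FALSE here, so the matrix is kept

F <- 1.5                      # shadows FALSE; R does not complain
# drop = F now means drop = 1.5, which is coerced to TRUE, so the
# column silently drops to a vector:
is.matrix(m[, 1, drop = F])
rm(F)                         # the default binding is visible again
```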

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Uwe Ligges
 Sent: Thursday, May 19, 2005 10:08 AM
 To: Chalasani, Prasad
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] R annoyances
 
 Chalasani, Prasad wrote:
 
  Thanks all for pointing out that I can use 
  mtx[,1,drop=F]
 
 
 Which, for example, won't work for
   F <- 10.25
 
 ---> drop=FALSE <---  !
 
 Uwe Ligges
 
 
 
  
  
  -Original Message-
  From: Uwe Ligges [mailto:[EMAIL PROTECTED]
  Sent: Thursday, May 19, 2005 10:49 AM
  To: Chalasani, Prasad
  Cc: r-help@stat.math.ethz.ch
  Subject: Re: [R] R annoyances
  
  
  Chalasani, Prasad wrote:
  
 Dear R Folks,
 I'm a big fan of R, but there are a couple of things that 
 repeatedly 
 annoy me, and I wondered if anyone has neat ways to deal with them.
 
 (a) When using apply row-wise to a matrix, it returns
 the results column-wise, and to preserve the original
 orientation, I've to do a transpose. E.g. I've to keep
 doing a transpose, which I consider to be quite annoying.
 
 transformed.mtx <- t(apply(mtx, 1, exp))
  
  
  I'd rather type
  
 exp(mtx)
  
  
  
  
 (b) When extracting 2 or more columns of a matrix, 
 R returns the result as a matrix, BUT when extracting
 just one column, it returns a vector/array, rather than
 a matrix, so I've to keep doing as.matrix, which is annoying.
 
 sub.mtx <- as.matrix(mtx[,1])
 
 Of course I could write a suitable function
 cols <- function(mtx, range) as.matrix(mtx[, range])
 but then I lose the syntactic sugar of being able to say [,1].
  
  
  The docs suggest:
  
 mtx[ , 1, drop = FALSE]
  
  
  Uwe Ligges
  
  
  
 __
 R-help@stat.math.ethz.ch mailing list 
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] R annoyances

2005-05-19 Thread John Fox
Dear Jan,

Since you can use variables named c, q, or t in any event, I don't see why
the existence of functions with these names is much of an impediment.

The problem that I see with T and F is that allowing them to be redefined
sets a trap for people. If R wants to discourage use of T and F for TRUE and
FALSE, then why provide standard global variables by these names? On the
other hand, if providing T and F is considered desirable (e.g., for S-PLUS
compatibility), then why not make them reserved names?

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: Jan T. Kim [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, May 19, 2005 12:22 PM
 To: John Fox
 Subject: Re: [R] R annoyances
 
 On Thu, May 19, 2005 at 11:55:22AM -0400, John Fox wrote:
  Dear Uwe,
  
  I've often wondered why T and F aren't reserved words in R 
 as TRUE and 
  FALSE are. Perhaps there's some use of T and F as 
 variables, but that 
  seems ill-advised.
 
 Personally, I'd rather argue the other way around: Reserved 
 words should be words that should be more unique and 
 expressive than just a single letter.
 
 In fact, I've found it slightly irritating at times that c, q 
 and t are functions in the base package, as I'm somewhat 
 prone to use all of these as local variable names...
 
 Best regards, Jan
 --
 +- Jan T. Kim ------------------------------------------+
 | *NEW* email: [EMAIL PROTECTED]                         |
 | *NEW* WWW:   http://www.cmp.uea.ac.uk/people/jtk       |
 *-=  hierarchical systems are for files, not for humans  =-*

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Difference between difference of means

2005-05-04 Thread John Fox
Dear Peter,

This looks like a two-way ANOVA, with product as one factor and condition as
the other. The test that you describe is for the interaction between product
and condition.
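A minimal sketch of that layout with simulated data (`product`,
`condition`, and `logy` are stand-ins for Peter's actual variables):

```r
set.seed(1)
# Two products crossed with two conditions, 10 observations per cell;
# the log response gets an extra effect in one cell so the condition
# effect differs between products.
d <- expand.grid(product = c("A", "B"), condition = c("c1", "c2"))
d <- d[rep(1:4, each = 10), ]
d$logy <- rnorm(nrow(d)) +
  ifelse(d$product == "A" & d$condition == "c2", 1, 0)

fit <- lm(logy ~ product * condition, data = d)
anova(fit)  # the product:condition row tests whether the condition
            # effect differs between the two products
```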

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Peter Zorg
 Sent: Wednesday, May 04, 2005 9:24 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Difference between difference of means
 
 Hi
 I would like to compare two differences of means.
 I have the means of the log of one variable for two 
 conditions for one product, and for another product.
 The differences look different between both products for the 
 two conditions, but a statistical test is required.
 Thank you for any hint
  
 Peter
 
 __
 
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Anova - adjusted or sequential sums of squares?

2005-04-20 Thread John Fox
Dear Mick,

The Anova() function in the car package will compute what are often called
type-II and -III sums of squares. 

Without wishing to reinvigorate the sums-of-squares controversy, I'll just
add that the various types of sums of squares correspond to different
hypotheses. The hypotheses tested by type-I sums of squares are rarely
sensible; that the results vary by the order of the factors is a symptom of
this, but sums of squares that are invariant with respect to ordering of the
factors don't necessarily correspond to sensible hypotheses. 

If you do decide to use type-III sums of squares, be careful to use a
contrast type (such as contr.sum) that produces an orthogonal row basis for
the terms in the model.
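For readers without the car package, a roughly equivalent base-R sketch
(an illustration only): with sum-to-zero contrasts, drop1() over all
terms gives marginal, type-III-style F tests:

```r
set.seed(3)
# A deliberately unbalanced 2 x 2 layout (cell sizes 4/3/4/4)
d <- data.frame(a = factor(rep(c("a1", "a2"), c(7, 8))),
                b = factor(rep(c("b1", "b2"), length.out = 15)),
                y = rnorm(15))
# contr.sum provides the orthogonal row basis that marginal tests need
op <- options(contrasts = c("contr.sum", "contr.poly"))
fit <- lm(y ~ a * b, data = d)
drop1(fit, ~ ., test = "F")   # marginal F test for each term,
                              # including main effects
options(op)                   # restore the default contrasts
```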

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 michael watson (IAH-C)
 Sent: Wednesday, April 20, 2005 9:38 AM
 To: [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Subject: RE: [R] Anova - adjusted or sequential sums of squares?
 
 I guess the real problem is this:
 
 As I have a different number of observations in each of the 
 groups, the results *change* depending on which order I 
 specify the factors in the model.  This unnerves me.  With a 
 completely balanced design, this doesn't happen - the results 
 are the same no matter which order I specify the factors.  
 
 It's this reason that I have been given for using the 
 so-called type III adjusted sums of squares...
 
 Mick
 
 -Original Message-
 From: Douglas Bates [mailto:[EMAIL PROTECTED]
 Sent: 20 April 2005 15:07
 To: michael watson (IAH-C)
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] Anova - adjusted or sequential sums of squares?
 
 
 michael watson (IAH-C) wrote:
  Hi
  
  I am performing an analysis of variance with two factors, each with 
  two levels.  I have differing numbers of observations in 
 each of the 
  four combinations, but all four combinations *are* present 
 (2 of the 
  factor combinations have 3 observations, 1 has 4 and 1 has 5)
  
  I have used both anova(aov(...)) and anova(lm(...)) in R 
 and it gave 
  the same result - as expected.  I then plugged this into minitab, 
  performed what minitab called a General Linear Model (I have to use 
  this in minitab as I have an unbalanced data set) and got a 
 different 
  result. After a little mining this is because minitab, by default, 
  uses the type III adjusted SS.  Sure enough, if I changed 
 minitab to 
  use the type I sequential SS, I get exactly the same 
 results as aov()
 and lm() in R.
  
  So which should I use?  Type I adjusted SS or Type III 
 sequential SS? 
  Minitab help tells me that I would usually want to use type III 
  adjusted SS, as  type I sequential sums of squares can differ when 
  your design is unbalanced - which mine is.  The R functions I am 
  using are clearly using the type I sequential SS.
 
 Install the fortunes package and try
   fortune(Venables)
 
  I'm really curious to know why the two types of sum of squares are
  called Type I and Type III! This is a very common misconception,
  particularly among SAS users who have been fed this nonsense quite
  often for all their professional lives. Fortunately the reality is
  much simpler. There is, by any sensible reckoning, only ONE type of
  sum of squares, and it always represents an improvement sum of
  squares of the outer (or alternative) model over the inner (or null
  hypothesis) model. What the SAS highly dubious classification of
  sums of squares does is to encourage users to concentrate on the
  null hypothesis model and to forget about the alternative. This is
  always a very bad idea and not surprisingly it can lead to
  nonsensical tests, as in the test it provides for main effects even
  in the presence of interactions, something which beggars definition,
  let alone belief.
  -- Bill Venables
     R-help (November 2000)
 
 In the words of the master, there is ... only one type of sum of 
 squares, which is the one that R reports.  The others are awkward 
 fictions created for times when one could only afford to fit 
 one or two 
 linear models per week and therefore wanted the output to 
 give results 
 for all possible tests one could conceive, even if the models being 
 tested didn't make sense.
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Pearson corelation and p-value for matrix

2005-04-18 Thread John Fox
Dear Dren,

Since cor(), on which Andy's solution is based, can compute pairwise-present
correlations, you could adapt his function -- you'll have to adjust the df
for each pair. Alternatively, you could probably save some time (programming
time + computer time) by just using my solution:

 R <- diag(100)
 R[upper.tri(R)] <- R[lower.tri(R)] <- .5
 library(mvtnorm)
 X <- rmvnorm(6000, sigma=R)
 system.time(for (i in 1:50) cor.pvalues(X), gc=TRUE)
[1] 518.19   1.11 520.23 NA NA

I know that time is money, but nine minutes (on my machine) probably won't
bankrupt anyone.

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Dren Scott
 Sent: Monday, April 18, 2005 12:41 PM
 To: 'R-Help'
 Subject: RE: [R] Pearson corelation and p-value for matrix
 
 Hi all,
  
 Thanks Andy, Mark and John for all the help. I really 
 appreciate it. I'm new to both R and statistics, so please 
 excuse any gaffes on my part. 
  
 Essentially what I'm trying to do is to evaluate, for each 
 row, how many other rows would have a p-value < 0.05. So, 
 after I get my N x N p-value matrix, I'll just filter out 
 values that are > 0.05.
  
 Each of my datasets (6000 rows x 100 columns) would consist 
 of some NA's. The iterative procedure (cor.pvalues) proposed 
 by John would yield the values, but it would take an 
 inordinately long time (I have 50 of these datasets to 
 process). The solution proposed by Andy, although fast, would 
 not be able to incorporate the NA's.
  
 Is there any workaround for the NA's? Or possibly do you 
 think I could try something else?
  
 Thanks very much. 
  
 Dren
 
 
 Liaw, Andy [EMAIL PROTECTED] wrote:
  From: John Fox
  
  Dear Andy,
  
  That's clearly much better -- and illustrates an effective strategy 
  for vectorizing (or matricizing) a computation. I think I'll add 
  this to my list of programming examples. It might be a little 
  dangerous to pass ...
  through to cor(), since someone could specify type="spearman", for 
  example.
 
 Ah, yes, the ... isn't likely to help here! Also, it will 
 only work correctly if there are no NA's, for example (or 
 else the degree of freedom would be wrong). 
 
 Best,
 Andy
 
  Thanks,
  John
  
  
  John Fox
  Department of Sociology
  McMaster University
  Hamilton, Ontario
  Canada L8S 4M4
  905-525-9140x23604
  http://socserv.mcmaster.ca/jfox
  
  
   -Original Message-
   From: Liaw, Andy [mailto:[EMAIL PROTECTED]
   Sent: Friday, April 15, 2005 9:51 PM
   To: 'John Fox'; [EMAIL PROTECTED]
   Cc: 'R-Help'; 'Dren Scott'
   Subject: RE: [R] Pearson corelation and p-value for matrix
   
   We can be a bit sneaky and `borrow' code from cor.test.default:
   
   cor.pval <- function(x, alternative = "two-sided", ...) {
       corMat <- cor(x, ...)
       n <- nrow(x)
       df <- n - 2
       STATISTIC <- sqrt(df) * corMat / sqrt(1 - corMat^2)
       p <- pt(STATISTIC, df)
       p <- if (alternative == "less") {
           p
       } else if (alternative == "greater") {
           1 - p
       } else 2 * pmin(p, 1 - p)
       p
   }
   
   The test:
   
    system.time(c1 <- cor.pvals(X), gcFirst=TRUE)
   [1] 13.19 0.01 13.58 NA NA
    system.time(c2 <- cor.pvalues(X), gcFirst=TRUE)
   [1] 6.22 0.00 6.42 NA NA
    system.time(c3 <- cor.pval(X), gcFirst=TRUE)
   [1] 0.07 0.00 0.07 NA NA
   
   Cheers,
   Andy
   
From: John Fox

Dear Mark,

I think that the reflex of trying to avoid loops in R is often 
mistaken, and so I decided to try to time the two
  approaches (on a
3GHz Windows XP system).

I discovered, first, that there is a bug in your 
 function -- you 
appear to have indexed rows instead of columns; fixing that:

 cor.pvals <- function(mat)
 {
     cols <- expand.grid(1:ncol(mat), 1:ncol(mat))
     matrix(apply(cols, 1,
                  function(x) cor.test(mat[, x[1]], mat[, x[2]])$p.value),
            ncol = ncol(mat))
 }


My function is cor.pvalues and yours cor.pvals. This is
  for a data
matrix with 1000 observations on 100 variables:

  R <- diag(100)
  R[upper.tri(R)] <- R[lower.tri(R)] <- .5
  library(mvtnorm)
  X <- rmvnorm(1000, sigma=R)
 dim(X)
[1] 1000 100
 
 system.time(cor.pvalues(X))
[1] 5.53 0.00 5.53 NA NA
 
 system.time(cor.pvals(X))
[1] 12.66 0.00 12.66 NA NA
 

I frankly didn't expect the advantage of my approach to be
   this large,
but there it is.

Regards,
John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox


 -Original Message-
 From: Marc

[R] Discrepancy between gam from gam package and gam in S-PLUS

2005-04-18 Thread John Fox
Dear Trevor,

I've noticed a discrepancy in the degrees of freedom reported by gam() from
the gam package in R vs. gam() in S-PLUS. The nonparametric df differ by 1;
otherwise (except for things that depend upon the df), the output is the
same:

- snip  

*** From R (gam version 0.93):

 mod.gam <- gam(prestige ~ lo(income, span=.6), data=Prestige)
 summary(mod.gam)

Call: gam(formula = prestige ~ lo(income, span = 0.6), data = Prestige)
Deviance Residuals:
Min  1Q  Median  3Q Max 
-17.163  -8.205  -1.998   8.070  32.326 

(Dispersion Parameter for gaussian family taken to be 122.5047)

Null Deviance: 29895.43 on 101 degrees of freedom
Residual Deviance: 12004.72 on 97.9939 degrees of freedom
AIC: 785.82 

Number of Local Scoring Iterations: 2 

DF for Terms and F-values for Nonparametric Effects

   Df Npar Df Npar F Pr(F)
(Intercept) 1 
lo(income, span = 0.6)  1   2 10.626 6.537e-05 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 


*** From S-PLUS 6.2.1 for Windows:

 library(car)
 mod.gam <- gam(prestige ~ lo(income, span=.6), data=Prestige)
 summary(mod.gam)

Call: gam(formula = prestige ~ lo(income, span = 0.6), data = Prestige)
Deviance Residuals:
   Min1QMedian   3Q  Max 
 -17.16264 -8.205246 -1.997525 8.070375 32.32608

(Dispersion Parameter for Gaussian family taken to be 123.7677 )

Null Deviance: 29895.43 on 101 degrees of freedom

Residual Deviance: 12004.72 on 96.99392 degrees of freedom

Number of Local Scoring Iterations: 1 

DF for Terms and F-values for Nonparametric Effects

   Df Npar Df   Npar FPr(F) 
   (Intercept)  1  
lo(income, span = 0.6)  1   3 7.018995 0.0002517358
 

- snip  

I suppose that one of these must be in error.

Regards,
 John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Pearson corelation and p-value for matrix

2005-04-18 Thread John Fox
Dear Andy,

Very nice! (My point was that if this is a one-time thing, for Dren to
puzzle over it is probably more time-consuming than simply doing it
inefficiently.)

Regards,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: Liaw, Andy [mailto:[EMAIL PROTECTED] 
 Sent: Monday, April 18, 2005 2:40 PM
 To: 'John Fox'; 'Dren Scott'
 Cc: 'R-Help'
 Subject: RE: [R] Pearson corelation and p-value for matrix
 
 I believe this will do:
 
 cor.pval2 <- function(x, alternative = "two-sided") {
     corMat <- cor(x, use = if (any(is.na(x)))
                       "pairwise.complete.obs" else "all")
     df <- crossprod(!is.na(x)) - 2
     STATISTIC <- sqrt(df) * corMat / sqrt(1 - corMat^2)
     p <- pt(STATISTIC, df)
     p <- if (alternative == "less") {
         p
     } else if (alternative == "greater") {
         1 - p
     } else 2 * pmin(p, 1 - p)
     p
 }
 
 Some test:
 
  set.seed(1)
  x <- matrix(runif(2e3 * 1e2), 2e3)
  system.time(res1 <- cor.pval(t(x)), gcFirst=TRUE)
 [1] 17.28  0.77 18.16    NA    NA
  system.time(res2 <- cor.pval2(t(x)), gcFirst=TRUE)
 [1] 19.51  1.05 20.70    NA    NA
  max(abs(res1 - res2))
 [1] 0
  x[c(1, 3, 6), c(2, 4)] <- NA
  x[30, 61] <- NA
  system.time(res2 <- cor.pval2(t(x)), gcFirst=TRUE)
 [1] 24.48  0.71 25.28    NA    NA
 
 This is a bit slower because of the extra computation for 
 df.  One can try to save some computation by only computing 
 with the lower (or upper) triangular part.
 
 Cheers,
 Andy
 
  From: John Fox
  
  Dear Dren,
  
  Since cor(), on which Andy's solution is based, can compute 
  pairwise-present correlations, you could adapt his function 
 -- you'll 
  have to adjust the df for each pair. Alternatively, you 
 could probably 
  save some time (programming time + computer time) by just using my 
  solution:
  
    R <- diag(100)
    R[upper.tri(R)] <- R[lower.tri(R)] <- .5
    library(mvtnorm)
    X <- rmvnorm(6000, sigma=R)
    system.time(for (i in 1:50) cor.pvalues(X), gc=TRUE)
  [1] 518.19   1.11 520.23 NA NA
  
  I know that time is money, but nine minutes (on my machine) 
 probably 
  won't bankrupt anyone.
  
  Regards,
   John
  
  
  John Fox
  Department of Sociology
  McMaster University
  Hamilton, Ontario
  Canada L8S 4M4
  905-525-9140x23604
  http://socserv.mcmaster.ca/jfox
  
  
   -Original Message-
   From: [EMAIL PROTECTED] 
   [mailto:[EMAIL PROTECTED] On Behalf Of Dren Scott
   Sent: Monday, April 18, 2005 12:41 PM
   To: 'R-Help'
   Subject: RE: [R] Pearson corelation and p-value for matrix
   
   Hi all,

   Thanks Andy, Mark and John for all the help. I really 
 appreciate it. 
   I'm new to both R and statistics, so please excuse any 
 gaffes on my 
   part.

    Essentially what I'm trying to do is to evaluate, for each row, 
    how many other rows would have a p-value < 0.05. So, after I 
    get my N x N p-value matrix, I'll just filter out values that 
    are > 0.05.

   Each of my datasets (6000 rows x 100 columns) would 
 consist of some 
   NA's. The iterative procedure (cor.pvalues) proposed by 
 John would 
   yield the values, but it would take an inordinately long time (I 
   have 50 of these datasets to process). The solution proposed by 
   Andy, although fast, would not be able to incorporate the NA's.

   Is there any workaround for the NA's? Or possibly do you think I 
   could try something else?

   Thanks very much. 

   Dren
   
   
   Liaw, Andy [EMAIL PROTECTED] wrote:
From: John Fox

Dear Andy,

That's clearly much better -- and illustrates an
  effective strategy
for vectorizing (or matricizing) a computation. I think
  I'll add
this to my list of programming examples. It might be a little 
dangerous to pass ...
through to cor(), since someone could specify
   type="spearman", for
example.
   
   Ah, yes, the ... isn't likely to help here! Also, it will only 
   work correctly if there are no NA's, for example (or else 
 the degree 
   of freedom would be wrong).
   
   Best,
   Andy
   
Thanks,
John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox


 -Original Message-
 From: Liaw, Andy [mailto:[EMAIL PROTECTED]
 Sent: Friday, April 15, 2005 9:51 PM
 To: 'John Fox'; [EMAIL PROTECTED]
 Cc: 'R-Help'; 'Dren Scott'
 Subject: RE: [R] Pearson corelation and p-value for matrix
 
 We can be a bit sneaky and `borrow' code from 
 cor.test.default:
 
 cor.pval - function(x, alternative=two-sided, ...) {
  corMat -
 cor(x, ...) n - nrow(x) df - n - 2 STATISTIC -
   sqrt(df

RE: [R] Pearson corelation and p-value for matrix

2005-04-16 Thread John Fox
Dear Andy,

That's clearly much better -- and illustrates an effective strategy for
vectorizing (or matricizing) a computation. I think I'll add this to my
list of programming examples. It might be a little dangerous to pass ...
through to cor(), since someone could specify type=spearman, for example.

Thanks,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: Liaw, Andy [mailto:[EMAIL PROTECTED] 
 Sent: Friday, April 15, 2005 9:51 PM
 To: 'John Fox'; [EMAIL PROTECTED]
 Cc: 'R-Help'; 'Dren Scott'
 Subject: RE: [R] Pearson corelation and p-value for matrix
 
 We can be a bit sneaky and `borrow' code from cor.test.default:
 
  cor.pval <- function(x, alternative = "two-sided", ...) {
      corMat <- cor(x, ...)
      n <- nrow(x)
      df <- n - 2
      STATISTIC <- sqrt(df) * corMat / sqrt(1 - corMat^2)
      p <- pt(STATISTIC, df)
      p <- if (alternative == "less") {
          p
      } else if (alternative == "greater") {
          1 - p
      } else 2 * pmin(p, 1 - p)
      p
  }
 
 The test:
 
   system.time(c1 <- cor.pvals(X), gcFirst=TRUE)
  [1] 13.19  0.01 13.58    NA    NA
   system.time(c2 <- cor.pvalues(X), gcFirst=TRUE)
  [1] 6.22 0.00 6.42   NA   NA
   system.time(c3 <- cor.pval(X), gcFirst=TRUE)
  [1] 0.07 0.00 0.07   NA   NA
 
 Cheers,
 Andy
 
  From: John Fox
  
  Dear Mark,
  
  I think that the reflex of trying to avoid loops in R is often 
  mistaken, and so I decided to try to time the two approaches (on a 
  3GHz Windows XP system).
  
  I discovered, first, that there is a bug in your function -- you 
  appear to have indexed rows instead of columns; fixing that:
  
   cor.pvals <- function(mat)
   {
       cols <- expand.grid(1:ncol(mat), 1:ncol(mat))
       matrix(apply(cols, 1,
                    function(x) cor.test(mat[, x[1]], mat[, x[2]])$p.value),
              ncol = ncol(mat))
   }
  
  
  My function is cor.pvalues and yours cor.pvals. This is for a data 
  matrix with 1000 observations on 100 variables:
  
    R <- diag(100)
    R[upper.tri(R)] <- R[lower.tri(R)] <- .5
    library(mvtnorm)
    X <- rmvnorm(1000, sigma=R)
   dim(X)
  [1] 1000  100
   
   system.time(cor.pvalues(X))
  [1] 5.53 0.00 5.53   NA   NA
   
   system.time(cor.pvals(X))
  [1] 12.66  0.00 12.66NANA
   
  
  I frankly didn't expect the advantage of my approach to be 
 this large, 
  but there it is.
  
  Regards,
   John
  
  
  John Fox
  Department of Sociology
  McMaster University
  Hamilton, Ontario
  Canada L8S 4M4
  905-525-9140x23604
  http://socserv.mcmaster.ca/jfox
  
  
   -Original Message-
   From: Marc Schwartz [mailto:[EMAIL PROTECTED]
   Sent: Friday, April 15, 2005 7:08 PM
   To: John Fox
   Cc: 'Dren Scott'; R-Help
   Subject: RE: [R] Pearson corelation and p-value for matrix
   
   Here is what might be a slightly more efficient way to 
 get to John's
   question:
   
    cor.pvals <- function(mat)
    {
        rows <- expand.grid(1:nrow(mat), 1:nrow(mat))
        matrix(apply(rows, 1,
                     function(x) cor.test(mat[x[1], ], mat[x[2], ])$p.value),
               ncol = nrow(mat))
    }
   
   HTH,
   
   Marc Schwartz
   
   On Fri, 2005-04-15 at 18:26 -0400, John Fox wrote:
Dear Dren,

How about the following?

  cor.pvalues <- function(X){
      nc <- ncol(X)
      res <- matrix(0, nc, nc)
      for (i in 2:nc){
          for (j in 1:(i - 1)){
              res[i, j] <- res[j, i] <- cor.test(X[,i], X[,j])$p.value
          }
      }
      res
  }

 What one then does with all of those non-independent tests 
 is another question, I guess.

I hope this helps,
 John
   
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of
  Dren Scott
 Sent: Friday, April 15, 2005 4:33 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Pearson corelation and p-value for matrix
 
 Hi,
  
 I was trying to evaluate the pearson correlation and
  the p-values
 for an nxm matrix, where each row represents a vector. 
   One way to do
 it would be to iterate through each row, and find its
  correlation
 value( and the p-value) with respect to the other rows. 
  Is there
 some function by which I can use the matrix as input? 
   Ideally, the
 output would be an nxn matrix, containing the p-values
   between the
 respective vectors.
  
 I have tried cor.test for the iterations, but couldn't find a 
 function that would take the matrix as input.
  
 Thanks for the help.
  
 Dren
   
  
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
  http://www.R-project.org/posting

RE: [R] Pearson corelation and p-value for matrix

2005-04-16 Thread John Fox
Dear Mark,

 -Original Message-
 From: Marc Schwartz [mailto:[EMAIL PROTECTED] 
 Sent: Friday, April 15, 2005 9:41 PM
 To: John Fox
 Cc: 'R-Help'; 'Dren Scott'
 Subject: RE: [R] Pearson corelation and p-value for matrix
 
 John,
 
 Interesting test. Thanks for pointing that out.
 
 You are right, there is a knee-jerk reaction to avoid loops, 
 especially nested loops.
 
 On the indexing of rows, I did that because Dren had 
 indicated in his initial post:
 
  I was trying to evaluate the pearson correlation and the p-values
   for an nxm matrix, where each row represents a vector.
   One way to do it would be to iterate through each row, and find its
   correlation value( and the p-value) with respect to the other rows.
 
 So I ran the correlations by row, rather than by column.
 

That's the second time yesterday that I responded to a posting without
reading it carefully enough -- a good lesson for me. I guess that Dren could
just apply my solution to the transpose of his matrix -- i.e.,
cor.pvalues(t(X)).

Sorry,
 John

 Thanks again. Good lesson.
 
 Marc
 
 On Fri, 2005-04-15 at 21:36 -0400, John Fox wrote:
  Dear Marc,
  
  I think that the reflex of trying to avoid loops in R is often 
  mistaken, and so I decided to try to time the two approaches (on a 
  3GHz Windows XP system).
  
  I discovered, first, that there is a bug in your function -- you 
  appear to have indexed rows instead of columns; fixing that:
  
  cor.pvals <- function(mat)
  {
    cols <- expand.grid(1:ncol(mat), 1:ncol(mat))
    matrix(apply(cols, 1,
                 function(x) cor.test(mat[, x[1]], mat[, x[2]])$p.value),
           ncol = ncol(mat))
  }
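
As a quick sanity check (a sketch on hypothetical simulated data, not from
the original thread), the corrected apply-based version returns a symmetric
ncol-by-ncol matrix of p-values that agrees with a direct cor.test call:

```r
# Fixed apply-based version (as above), exercised on a small toy matrix.
cor.pvals <- function(mat) {
  cols <- expand.grid(1:ncol(mat), 1:ncol(mat))
  matrix(apply(cols, 1,
               function(x) cor.test(mat[, x[1]], mat[, x[2]])$p.value),
         ncol = ncol(mat))
}

set.seed(42)
X <- matrix(rnorm(60), ncol = 3)   # 20 observations on 3 variables
p <- cor.pvals(X)
stopifnot(dim(p) == c(3, 3),
          isTRUE(all.equal(p, t(p))),          # symmetric
          isTRUE(all.equal(p[1, 2],
                           cor.test(X[, 1], X[, 2])$p.value)))
```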
  
  
  My function is cor.pvalues and yours cor.pvals. This is for a data 
  matrix with 1000 observations on 100 variables:
  
   > R <- diag(100)
   > R[upper.tri(R)] <- R[lower.tri(R)] <- .5
   > library(mvtnorm)
   > X <- rmvnorm(1000, sigma=R)
   > dim(X)
   [1] 1000  100

   > system.time(cor.pvalues(X))
   [1] 5.53 0.00 5.53   NA   NA

   > system.time(cor.pvals(X))
   [1] 12.66  0.00 12.66    NA    NA
   
  
  I frankly didn't expect the advantage of my approach to be this
  large, but there it is.
  
  Regards,
   John
  
  
  John Fox
  Department of Sociology
  McMaster University
  Hamilton, Ontario
  Canada L8S 4M4
  905-525-9140x23604
  http://socserv.mcmaster.ca/jfox
  
  
   -Original Message-
   From: Marc Schwartz [mailto:[EMAIL PROTECTED]
   Sent: Friday, April 15, 2005 7:08 PM
   To: John Fox
   Cc: 'Dren Scott'; R-Help
   Subject: RE: [R] Pearson correlation and p-value for matrix
   
   Here is what might be a slightly more efficient way to get to
   John's question:
   
   cor.pvals <- function(mat)
   {
     rows <- expand.grid(1:nrow(mat), 1:nrow(mat))
     matrix(apply(rows, 1,
                  function(x) cor.test(mat[x[1], ], mat[x[2], ])$p.value),
            ncol = nrow(mat))
   }
   
   HTH,
   
   Marc Schwartz
   
   On Fri, 2005-04-15 at 18:26 -0400, John Fox wrote:
Dear Dren,

How about the following?

 cor.pvalues <- function(X){
     nc <- ncol(X)
     res <- matrix(0, nc, nc)
     for (i in 2:nc){
         for (j in 1:(i - 1)){
             res[i, j] <- res[j, i] <- cor.test(X[, i], X[, j])$p.value
         }
     }
     res
 }
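
The double-loop version can be exercised the same way (a sketch on
hypothetical toy data; the function body is as corrected above, with the
diagonal left at its initial value of 0):

```r
# Loop-based version, then a small sanity check on simulated data.
cor.pvalues <- function(X) {
  nc <- ncol(X)
  res <- matrix(0, nc, nc)
  for (i in 2:nc) {
    for (j in 1:(i - 1)) {
      res[i, j] <- res[j, i] <- cor.test(X[, i], X[, j])$p.value
    }
  }
  res
}

set.seed(7)
X <- matrix(rnorm(100), ncol = 4)  # 25 observations on 4 variables
p <- cor.pvalues(X)
stopifnot(all(diag(p) == 0),           # diagonal never touched by the loops
          isTRUE(all.equal(p, t(p))),  # symmetric by construction
          isTRUE(all.equal(p[2, 1],
                           cor.test(X[, 2], X[, 1])$p.value)))
```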

What one then does with all of those non-independent tests
is another question, I guess.

I hope this helps,
 John
   
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Dren 
 Scott
 Sent: Friday, April 15, 2005 4:33 PM
 To: r-help@stat.math.ethz.ch
  Subject: [R] Pearson correlation and p-value for matrix
 
  Hi,

  I was trying to evaluate the Pearson correlation and the p-values
  for an n x m matrix, where each row represents a vector. One way
  to do it would be to iterate through each row and find its
  correlation value (and the p-value) with respect to the other
  rows. Is there some function by which I can use the matrix as
  input? Ideally, the output would be an n x n matrix containing
  the p-values between the respective vectors.

  I have tried cor.test for the iterations, but couldn't find a
  function that would take the matrix as input.

  Thanks for the help.

  Dren
   
  
  


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


<    1   2   3   4   5   6   7   >