Re: [R] sciviews installation
Dear Philippe,

The development version 1.1-0 of the Rcmdr package (i.e., the one with localization facilities) isn't yet on CRAN. Georges doesn't say which version of the Rcmdr package he's using, but I assume 1.0-2 from CRAN, which I believe should work with SciViews.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Philippe Grosjean
Sent: Wednesday, July 20, 2005 5:24 AM
To: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] sciviews installation

Hello,

Several changes in the latest Rcmdr broke SciViews compatibility. I am working on it... but it is taking longer than expected. Sorry for the inconvenience.

Best,
Philippe

Prof. Philippe Grosjean
Numerical Ecology of Aquatic Systems
Mons-Hainaut University, Pentagone (3D08)
Academie Universitaire Wallonie-Bruxelles
8, av du Champ de Mars, 7000 Mons, Belgium
phone: +32.65.37.34.97, fax: +32.65.37.30.54
email: [EMAIL PROTECTED]
web: http://www.umh.ac.be/~econum
http://www.sciviews.org

ghjmora g mail wrote:

Hello,

1. A few months ago, I had SciViews working fine with R (rw2001) under Windows XP.
2. Now, after upgrading to rw2011, everything seems fine (every package installed), but I find a conflict when launching SciViews:
   - it runs, apparently,
   - but when I try to work (import/export In: text, for instance), it asks for Rcmdr ("Would you like to install it now?").
3. Rcmdr is already installed (with all dependencies) and works well when called directly in the R GUI.
4. It's impossible to make it recognized, or to install it, under SciViews.

I have all the latest packages, and I am going to go mad. What do you suggest to solve my problem?
Thanks,
Georges Moracchini

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] grep help needed
Dear Denis,

I don't believe that anyone fielded your question -- my apologies if I missed a response.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot
Sent: Monday, July 25, 2005 9:46 PM
To: R list
Subject: [R] grep help needed

Hi,

In another thread (PBSmapping and shapefiles) I asked for an easy way to read shapefiles and transform them into data that PBSmapping could use. One person is exploring some ways of doing this, but it is possible I'll have to do this manually.

With package maptools I am able to extract the information I need from a shapefile, but it is formatted like this:

[[1]]
            [,1]     [,2]
 [1,] -55.99805 51.68817
 [2,] -56.00222 51.68911
 [3,] -56.01694 51.68911
 [4,] -56.03781 51.68606
 [5,] -56.04639 51.68759
 [6,] -56.04637 51.69445
 [7,] -56.03777 51.70207
 [8,] -56.02301 51.70892
 [9,] -56.01317 51.71578
[10,] -56.00330 51.73481
[11,] -55.99805 51.73840
attr(,"pstart")
attr(,"pstart")$from
[1] 1
attr(,"pstart")$to
[1] 11
attr(,"nParts")
[1] 1
attr(,"shpID")
[1] NA

[[2]]
          [,1]     [,2]
[1,] -57.76294 50.88770
[2,] -57.76292 50.88693
[3,] -57.76033 50.88163
[4,] -57.75668 50.88091
[5,] -57.75551 50.88169
[6,] -57.75562 50.88550
[7,] -57.75932 50.88775
[8,] -57.76294 50.88770
attr(,"pstart")
attr(,"pstart")$from
[1] 1
attr(,"pstart")$to
[1] 8
attr(,"nParts")
[1] 1
attr(,"shpID")
[1] NA

I do not quite understand the structure of this data object (list of lists, I think)

Actually, it looks like a list of matrices, each with some attributes (which, I gather, aren't of interest to you).

but at this point I resorted to printing it on the console and imported that text into Excel for further cleaning, which is easy enough. I'd like to complete the process within R to save time and to circumvent Excel's limit of around 64000 lines. But I have a hard time figuring out how to clean up this text in R.

If I understand correctly what you want, this seems a very awkward way to proceed.
Why not just extract the matrices from the list, stick on the additional columns that you want, stick the matrices together, name the columns, and then output the data to a file?

M1 <- Data[[1]]  # assuming that the original list is named Data
M2 <- Data[[2]]
M1 <- cbind(1, 1:nrow(M1), M1)
M2 <- cbind(2, 1:nrow(M2), M2)
M <- rbind(M1, M2)
colnames(M) <- c("PID", "POS", "X", "Y")
write.table(M, "Data.txt", row.names=FALSE, quote=FALSE)

It wouldn't be hard to generalize this to any number of matrices and to automate the process.

I hope that this helps,
John

What I need to produce for PBSmapping is a file where each block of coordinates shares one ID number, called PID, and a variable POS indicates the position of each coordinate within a shape. All other lines must disappear. So the above would become:

PID POS         X        Y
  1   1 -55.99805 51.68817
  1   2 -56.00222 51.68911
  1   3 -56.01694 51.68911
  1   4 -56.03781 51.68606
  1   5 -56.04639 51.68759
  1   6 -56.04637 51.69445
  1   7 -56.03777 51.70207
  1   8 -56.02301 51.70892
  1   9 -56.01317 51.71578
  1  10 -56.00330 51.73481
  1  11 -55.99805 51.73840
  2   1 -57.76294 50.88770
  2   2 -57.76292 50.88693
  2   3 -57.76033 50.88163
  2   4 -57.75668 50.88091
  2   5 -57.75551 50.88169
  2   6 -57.75562 50.88550
  2   7 -57.75932 50.88775
  2   8 -57.76294 50.88770

First I imported this text file into R:

test <- read.csv2("test file.txt", header=FALSE, sep=";", colClasses="character")

I used sep=";" to ensure there would be only one variable in this file, as it contains no ";". To remove lines that do not contain coordinates, I used the fact that longitudes are expressed as negative numbers, so with my very limited knowledge of grep searches, I thought of this, which is probably not the best way to go:

a <- rep("-", length(test$V1))
b <- grep(a, test$V1)

this gives me a warning

Warning message: the condition has length > 1 and only the first element will be used in: if (is.na(pattern)) {

but seems to do what I need anyway

c <- seq(1, length(test$V1))
d <- c %in% b
e <- test$V1[d]

Partial victory, now I only have lines that
look like

[1,] -57.76294 50.88770

But I don't know how to go further: the number in square brackets can be used for variable POS, after removing the square brackets and the comma, but this requires a better knowledge of grep than I have. Furthermore, I don't know how to add a PID (polygon ID) variable, i.e. all lines of a polygon must have the same ID, as in the example above (i.e. each time POS == 1, a new polygon starts and PID needs to be incremented by 1, and PID is kept constant for lines where POS != 1).

Any help will be much appreciated.

Sincerely,
Denis Chabot
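John's two-matrix recipe above generalizes directly to a list of any length. A minimal sketch, assuming the maptools output is a list of coordinate matrices named Data; the helper name polysToPBS is made up for illustration:

```r
# Generalization of the approach in the reply: tag each matrix in the
# list with its index (PID) and its row position (POS), then stack
# everything into one table.
polysToPBS <- function(Data) {
  blocks <- lapply(seq_along(Data), function(i) {
    M <- Data[[i]]
    cbind(PID = i, POS = seq_len(nrow(M)), X = M[, 1], Y = M[, 2])
  })
  as.data.frame(do.call(rbind, blocks))
}

# Toy example with two small "polygons":
Data <- list(matrix(c(-56.0, 51.7, -56.1, 51.8), 2, 2, byrow = TRUE),
             matrix(c(-57.8, 50.9, -57.7, 50.8), 2, 2, byrow = TRUE))
polysToPBS(Data)
```

The result can then be written out with write.table(), as in the two-matrix version.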
Re: [R] logistic regression: categorical value, and multinomial
Dear Ed,

See ?glm for fitting binomial logit models, and ?multinom (in the nnet package) for multinomial logit models. Neither function will handle a character (text) variable as the response, but you could easily convert the variable to a factor.

John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Haibo Huang
Sent: Wednesday, July 27, 2005 11:23 AM
To: r-help@stat.math.ethz.ch
Subject: [R] logistic regression: categorical value, and multinomial

I have two questions:

1. If I want to do a binomial logit, how do I handle the categorical response variable? Data for the response variable are not numerical, but text.

2. What if I want to do a multinomial logit, still with a categorical response variable? The variable has 5 non-numerical response levels, so I have to do it with a multinomial logit.

Any input is highly appreciated! Thanks!

Ed
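As a hedged illustration of the factor conversion suggested in the reply (the data here are invented for the example):

```r
# Toy data: a character response and one numeric predictor.
resp <- c("yes", "no", "no", "yes", "no", "yes", "yes", "no")
x    <- c(1.2, 0.5, 2.3, 1.8, 0.2, 0.7, 2.0, 0.4)

# glm() will not accept the character vector directly, but a factor
# works; the second level ("yes") is modeled as the "success".
fit <- glm(factor(resp) ~ x, family = binomial)
coef(fit)

# With five response levels, the same conversion works for
# multinom() in the nnet package (names here are placeholders):
# library(nnet)
# fit5 <- multinom(factor(resp5) ~ x1 + x2, data = mydata)
```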
Re: [R] setting elements to NA across an array
Dear Dr Carbon,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dr Carbon
Sent: Wednesday, July 27, 2005 12:19 PM
To: [EMAIL PROTECTED]
Subject: [R] setting elements to NA across an array

Please excuse what is obviously a trivial matter... I have a large 3-d array. I wish to set the third dimension (z) to NA if there are any NA values in the first two dimensions (x, y). So, given array foo:

foo <- array(data = NA, dim = c(5, 5, 3))
foo[,,1] <- matrix(rnorm(25), 5, 5)
foo[,,2] <- matrix(rnorm(25), 5, 5)
foo[,,3] <- matrix(rnorm(25), 5, 5)

I'll set two elements to NA:

foo[2,2,1] <- NA
foo[3,5,2] <- NA
foo

Now I want to set foo[2,2,] <- NA and foo[3,5,] <- NA. How can I build a logical statement to do this?

That should work just as you've specified it: that is, elements in all layers in the second row, second column and in the third row, fifth column of the array should be NA. Since you posed the question, I suppose that I'm missing something.

I hope this helps,
John
Re: [R] setting elements to NA across an array
Dear Dr Carbon,

Actually, it appears that I'm the one who was being obtuse. To make sure that I'm now interpreting what you want correctly: you'd like to set a cell to NA in every layer if that cell is NA in any layer. If that's correct, then how about the following?

foo[array(apply(foo, c(1,2), function(x) any(is.na(x))), dim(foo))] <- NA

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: Dr Carbon [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 27, 2005 12:45 PM
To: John Fox
Cc: [EMAIL PROTECTED]
Subject: Re: [R] setting elements to NA across an array

Sorry for being obtuse. How can I build a logical statement that will set foo[2,2,] <- NA and foo[3,5,] <- NA? Something like: if any row-and-column cell is NA, then set that row and column to NA for the entire array.

On 7/27/05, John Fox [EMAIL PROTECTED] wrote:

Dear Dr Carbon,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dr Carbon
Sent: Wednesday, July 27, 2005 12:19 PM
To: [EMAIL PROTECTED]
Subject: [R] setting elements to NA across an array

Please excuse what is obviously a trivial matter... I have a large 3-d array. I wish to set the third dimension (z) to NA if there are any NA values in the first two dimensions (x, y). So, given array foo:

foo <- array(data = NA, dim = c(5, 5, 3))
foo[,,1] <- matrix(rnorm(25), 5, 5)
foo[,,2] <- matrix(rnorm(25), 5, 5)
foo[,,3] <- matrix(rnorm(25), 5, 5)

I'll set two elements to NA:

foo[2,2,1] <- NA
foo[3,5,2] <- NA
foo

Now I want to set foo[2,2,] <- NA and foo[3,5,] <- NA. How can I build a logical statement to do this?

That should work just as you've specified it: that is, elements in all layers in the second row, second column and in the third row, fifth column of the array should be NA. Since you posed the question, I suppose that I'm missing something.
I hope this helps,
John
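The one-liner in the reply can be checked on a small deterministic array; a minimal sketch:

```r
# Demonstrate the masking idiom on a 2 x 2 x 2 array.
foo <- array(1:8, dim = c(2, 2, 2))
foo[1, 2, 1] <- NA            # one NA in the first layer

# apply() over margins (1, 2) gives a row-by-column logical matrix
# marking cells that are NA in any layer; array() recycles that mask
# across all layers, so assignment propagates the NA downward.
mask <- apply(foo, c(1, 2), function(x) any(is.na(x)))
foo[array(mask, dim(foo))] <- NA

foo[1, 2, 2]   # the matching cell in the second layer is now NA too
```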
Re: [R] Unexpected behavior in recode{car}
Dear Tom and David,

The source of the problem isn't hard to see if you trace the execution of recode() via debug(): the test for whether the result can be coerced to numeric is faulty. I've fixed the bug and will upload a new version of car to CRAN shortly. In the meantime, you can use

ss <- recode(nn, "2='Num2'; 4='Num4'", as.factor=TRUE)

Thanks for bringing this bug to my attention.

John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mulholland, Tom
Sent: Thursday, July 28, 2005 2:01 AM
To: D. Dailey; r-help@stat.math.ethz.ch
Subject: Re: [R] Unexpected behavior in recode{car}

require(car)
set.seed(12345)
nn <- sample(c(2, 4), size=50, replace=TRUE)
rr <- recode(nn, "2='TWO';4='FOUR'")
table(rr, exclude=NULL)
ss <- recode(nn, "2='Num2';4='Num4'")  # Doesn't work as expected
table(ss, exclude=NULL)
ss <- recode(nn, "2='Num2';4='Num4'", TRUE)  #?
table(ss, exclude=NULL)
tt <- recode(nn, "2='TWO'; 4='Num4'")
table(tt, exclude=NULL)
uu <- recode(nn, "2='Num2'; 4='FOUR'")
table(uu, exclude=NULL)

I looked at the code and found it too difficult to immediately decipher. So does making the result a factor cause any real problems? I noticed that the same response happens with any letter set followed by a number:

recode(nn, "2='Num2'; 4='abc5'")

Tom

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of D. Dailey
Sent: Thursday, 28 July 2005 11:45 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Unexpected behavior in recode{car}

Thanks to the R creators for such a great statistical system. Thanks to the R help list, I have (finally) gotten far enough in R to have a question I hope to be worth posting.

I'm using the recode function from John Fox's car package and have encountered some unexpected behavior.
Consider the following example:

## Begin cut-and-paste example
require(car)
set.seed(12345)
nn <- sample(c(2, 4), size=50, replace=TRUE)
rr <- recode(nn, "2='TWO';4='FOUR'")
table(rr, exclude=NULL)
ss <- recode(nn, "2='Num2';4='Num4'")  # Doesn't work as expected
table(ss, exclude=NULL)
tt <- recode(nn, "2='TWO'; 4='Num4'")
table(tt, exclude=NULL)
uu <- recode(nn, "2='Num2'; 4='FOUR'")
table(uu, exclude=NULL)
## End cut-and-paste example

All but the recoding to ss work as expected: I get a character vector with 23 instances of either FOUR or Num4 and 27 instances of TWO or Num2. But for the ss line, wherein all the strings to be assigned contain a digit, the resulting vector contains all NAs. Using str(), I note that ss is a numeric vector.

Is there a tidy way (using recode) to recode numeric values into character strings, all of which contain a digit? I have a workaround for my current project, but it would be nice to be able to use mixed alphanumeric strings in this context.

Thanks in advance for any insight you can give into this question.

Using R 2.1.1 (downloaded binary) on Windows XP Pro, car version 1.0-17 (installed from CRAN via Windows GUI). Complete version information below:

version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    1.1
year     2005
month    06
day      20
language R

t(t(installed.packages()['car',]))
         [,1]
Package  car
LibPath  C:/Programs/R/rw2011/library
Version  1.0-17
Priority NA
Bundle   NA
Contains NA
Depends  R (>= 1.9.0)
Suggests MASS, nnet, leaps
Imports  NA
Built    2.1.0

I subscribe to the help list in digest form, so would appreciate being copied directly in addition to seeing responses sent to the list.

David Dailey
Shoreline, Washington, USA
[EMAIL PROTECTED]
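The workaround from the reply, collected into one runnable sketch (it assumes the car package is installed and that its recode() accepts the as.factor=TRUE argument, as stated above):

```r
# With as.factor=TRUE, mixed alphanumeric targets such as 'Num2' are
# returned as factor levels rather than being coerced to numeric,
# sidestepping the faulty numeric-coercion test described above.
library(car)
set.seed(12345)
nn <- sample(c(2, 4), size = 50, replace = TRUE)
ss <- recode(nn, "2='Num2'; 4='Num4'", as.factor = TRUE)
table(ss)   # counts of Num2 and Num4, no NAs
```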
Re: [R] partial SS for anova
Dear Natalia,

See the Anova() function in the car package. Also see the warning in ?Anova about Type III sums of squares.

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of NATALIA F TCHETCHERINA
Sent: Saturday, July 30, 2005 6:25 PM
To: r-help@stat.math.ethz.ch
Subject: [R] partial SS for anova

Hello,

I use the lme4 package:

library(lme4)
fit <- lmer(y ~ time + dye + trt + trt:time + (1|rep), data=dataset, na.action='na.omit')
anova(fit)

The anova gives sequential F-tests and sequential SS. My question is: how can I get partial F-tests and partial SS? For lm (not lmer), as in anova(lm(y ~ x + z)), we can use anova(fit, ssType=3), but that does not work for lmer.

Natalia.
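A hedged sketch of the suggested Anova() usage, on an lm fit with built-in data (support for lmer objects varies by car version, so a plain linear model is shown here):

```r
# Partial ("type II") tests with Anova() from the car package:
# each term is tested after the others, rather than sequentially
# as in anova().
library(car)
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)
Anova(fit)               # type-II partial SS and F-tests (the default)
# Anova(fit, type="III") # type III, with the caveats noted in ?Anova
```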
Re: [R] tcltk programming guide
Dear Simone,

The tcltk functions correspond closely to Tcl and Tk commands, so documentation for the latter, available at http://wiki.tcl.tk/3109, is helpful. I also found Welch's Practical Programming in Tcl and Tk useful. I expect that you've already seen Peter Dalgaard's two R News articles on the tcltk package.

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Simone Gabbriellini
Sent: Thursday, August 11, 2005 9:43 AM
To: John Zhang
Cc: R-help@stat.math.ethz.ch
Subject: Re: [R] tcltk programming guide

thank you, I knew that link, but I need something more document-oriented, more specific, i.e. if I want to know how to use tkadd, where should I look? Or tkinsert, and so on...

thanks,
simone

Il giorno 11/ago/05, alle ore 14:18, John Zhang ha scritto:

Go to http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/. There are good examples.
From: Simone Gabbriellini [EMAIL PROTECTED]
Date: Thu, 11 Aug 2005 14:04:58 +0200
To: R-help@stat.math.ethz.ch
Subject: [R] tcltk programming guide

Dear List,

I'm looking for some documentation about the R tcltk package. The one I found in the help doesn't look exhaustive; I need information on the use of the single Tk widgets, maybe with some examples.

thank you,
simone gabbriellini

Jianhua Zhang
Department of Medical Oncology
Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115-6084
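As a small sketch of the kind of widget-level usage being asked about (tkinsert and friends), using only the tcltk package; this opens a window, so it assumes an interactive session with Tk available:

```r
# Build a listbox and insert items with tkinsert(), mirroring the
# Tk "listbox insert" subcommand documented on the Tcl/Tk wiki.
library(tcltk)
tt <- tktoplevel()
lb <- tklistbox(tt, height = 4)
tkpack(lb)
for (item in c("alpha", "beta", "gamma")) {
  tkinsert(lb, "end", item)   # append each item at the end of the list
}
```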
Re: [R] Issues with tcltk for Tiger Mac OSX
Dear Bernard,

The Rcmdr web site has some instructions prepared by Rob Goedman for Mac users at http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html that might prove useful. (I'm not a Mac user myself, so I can't offer specific advice.)

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bernard Leong
Sent: Friday, August 12, 2005 8:51 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Issues with tcltk for Tiger Mac OSX

Dear R-helpers,

I have installed the latest version of R 2.1.1 on Tiger OS X. However, when I load up R and use the following command:

library(tcltk)

I encounter the following error:

Loading Tcl/Tk interface ... Error in dyn.load(x, as.logical(local), as.logical(now)) :
  unable to load shared library '/Library/Frameworks/R.framework/Resources/library/tcltk/libs/tcltk.so':
  dlopen(/Library/Frameworks/R.framework/Resources/library/tcltk/libs/tcltk.so, 10): corrupt binary, library ordinal too big
Error: .onLoad failed in 'loadNamespace' for 'tcltk'
Error: package/namespace load failed for 'tcltk'

I have tried a couple of remedies:

1) Installing Tcl and Tk 8.4 from source.
2) Installing the version of tcltk from the Fink project on SourceForge.
3) Changing the path, which does not work.

Does anyone have a solution to this problem, as I need tcltk to run limmaGUI?

Best regards,
Bernard
Re: [R] Manually Calculating Odds from POLR Model
Dear Tate,

If I understand correctly what you're asking, the formulas are on p. 21 of the paper at http://socserv.socsci.mcmaster.ca/jfox/Papers/logit-effect-displays.pdf. But why do you want to do this when you can get the fitted probabilities from predict()?

I hope this helps.
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tate Avery
Sent: Friday, August 12, 2005 2:50 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Manually Calculating Odds from POLR Model

Hello,

I am using polr(...) to generate a model. The summary shows the coefficients and the intercepts. For example:

coefficient for x1 = c1
coefficient for x2 = c2
intercept A|B = i1
intercept B|C = i2

I can then run predict(..., type="p") with the model and see the probabilities for each level. For example:

     A   B   C
1  0.3 0.5 0.2
2  0.4 0.1 0.5

What I really want to be able to do is take the 2 coefficients, the 2 intercepts, and the x1 and x2 values, and manually calculate the probabilities generated by predict().

I have been searching quite extensively for the underlying calculations that transform the polr output and the input variables into the final output probabilities. I have tried a number of dead-end roads so far. So, if anyone has any information on how to do this or where I can find out, I would be extremely grateful.

Thank you for your time,
Tate Avery
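For reference, the manual calculation for a three-level proportional-odds model can be sketched in a few lines of base R. All numeric values below are invented; the parameterization (cumulative logit equals intercept minus linear predictor) is the one used by polr() in MASS:

```r
# logit P(Y <= k) = zeta_k - (c1*x1 + c2*x2)
c1 <- 0.8;  c2 <- -0.5     # hypothetical coefficients for x1, x2
i1 <- 0.3;  i2 <- 1.7      # hypothetical intercepts A|B and B|C
x1 <- 1.0;  x2 <- 2.0      # one observation

eta <- c1 * x1 + c2 * x2   # linear predictor
pA <- plogis(i1 - eta)           # P(Y = A) = P(Y <= A)
pB <- plogis(i2 - eta) - pA      # P(Y = B)
pC <- 1 - plogis(i2 - eta)       # P(Y = C)
c(A = pA, B = pB, C = pC)        # the three probabilities sum to 1
```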
Re: [R] path analysis
Dear Manel,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of SALAMERO BARO, MANUEL
Sent: Saturday, August 13, 2005 2:02 PM
To: r-help@stat.math.ethz.ch
Subject: [R] path analysis

Does anyone know if it is possible to perform a path analysis with the sem package (or any other) to explain a dependent *dichotomous* variable?

Yes -- you can use the hetcor() function in the polycor package to generate a correlation matrix and boot.sem() in the sem package to get standard errors or confidence intervals. Make sure that the dichotomous variables are represented as factors. See ?boot.sem for an example.

I hope this helps,
John
Re: [R] path analysis
Dear Manuel,

Polychoric correlations imply only that the *latent* variables are continuous -- the observed variables are ordered categories. Tetrachoric and point-biserial correlations are special cases respectively of polychoric and polyserial correlations. As long as you're willing to think of the dichotomous variable as the dissection into two categories of a latent continuous variable (and assuming multinormality of the latent variables), you can use the approach that I suggested. This isn't logistic regression, but it's similar to a probit model.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Manel Salamero
Sent: Sunday, August 14, 2005 12:34 PM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] path analysis

This solves part of my problem with the independent ordinal variables, but my dependent variable is truly categorical (illness/no illness). Polychoric correlation implies that data are continuous, which is not the case. Is it possible to implement logistic regression in the path model?

Thanks,
Manel Salamero

-- Original Message --
From: John Fox [EMAIL PROTECTED]
Date: Sat, 13 Aug 2005 19:35:24 -0400

Dear Manel,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of SALAMERO BARO, MANUEL
Sent: Saturday, August 13, 2005 2:02 PM
To: r-help@stat.math.ethz.ch
Subject: [R] path analysis

Does anyone know if it is possible to perform a path analysis with the sem package (or any other) to explain a dependent *dichotomous* variable?

Yes -- you can use the hetcor() function in the polycor package to generate a correlation matrix and boot.sem() in the sem package to get standard errors or confidence intervals. Make sure that the dichotomous variables are represented as factors. See ?boot.sem for an example.
I hope this helps,
John
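The first step of the suggested approach can be sketched as follows, assuming the polycor package is installed and using invented data; hetcor() is assumed to return its result in a component named correlations, as in the package's documentation:

```r
# Heterogeneous correlation matrix: Pearson for numeric pairs,
# biserial/polyserial for numeric-factor pairs, tetrachoric/polychoric
# for factor pairs. The dichotomous variable must be a factor.
library(polycor)
set.seed(1)
d <- data.frame(
  x   = rnorm(200),
  ill = factor(rbinom(200, 1, 0.5))  # dichotomous outcome, as a factor
)
R <- hetcor(d)$correlations  # correlation matrix to feed to sem()
R
```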
Re: [R] path analysis
Dear Manuel and list,

I see that I wrote "point-biserial" when I meant "biserial".

Sorry,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John Fox
Sent: Sunday, August 14, 2005 1:34 PM
To: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] path analysis

Dear Manuel,

Polychoric correlations imply only that the *latent* variables are continuous -- the observed variables are ordered categories. Tetrachoric and point-biserial correlations are special cases respectively of polychoric and polyserial correlations. As long as you're willing to think of the dichotomous variable as the dissection into two categories of a latent continuous variable (and assuming multinormality of the latent variables), you can use the approach that I suggested. This isn't logistic regression, but it's similar to a probit model.

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Manel Salamero
Sent: Sunday, August 14, 2005 12:34 PM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] path analysis

This solves part of my problem with the independent ordinal variables, but my dependent variable is truly categorical (illness/no illness). Polychoric correlation implies that data are continuous, which is not the case. Is it possible to implement logistic regression in the path model?
Thanks,
Manel Salamero

-- Original Message --
From: John Fox [EMAIL PROTECTED]
Date: Sat, 13 Aug 2005 19:35:24 -0400

Dear Manel,

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of SALAMERO BARO, MANUEL
Sent: Saturday, August 13, 2005 2:02 PM
To: r-help@stat.math.ethz.ch
Subject: [R] path analysis

Does anyone know if it is possible to perform a path analysis with the sem package (or any other) to explain a dependent *dichotomous* variable?

Yes -- you can use the hetcor() function in the polycor package to generate a correlation matrix and boot.sem() in the sem package to get standard errors or confidence intervals. Make sure that the dichotomous variables are represented as factors. See ?boot.sem for an example.

I hope this helps,
John
Re: [R] Manually Calculating Odds from POLR Model
Dear Tate,

Your question pertains generally to the treatment of factors in model formulas and is not particular to polr(). For a brief explanation, see Section 11.1, "Defining statistical models; formulae", and in particular Section 11.1.1 on "Contrasts", in the manual "An Introduction to R", which is distributed with R. More detailed explanations are in texts such as Venables and Ripley, Modern Applied Statistics with S, and my own, An R and S-PLUS Companion to Applied Regression.

I hope this helps,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: Tate Avery [mailto:[EMAIL PROTECTED]
Sent: Monday, August 15, 2005 3:58 PM
To: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Subject: RE: [R] Manually Calculating Odds from POLR Model

John,

Thank you, the document was very helpful. I can now calculate the same values generated by predict() when I am using purely numeric input data.

Another small question arises when I look at the example using 'housing' in the polr() documentation page. Running the example produces the following coefficients:

Coefficients:
   InflMedium      InflHigh TypeApartment    TypeAtrium   TypeTerrace      ContHigh
    0.5663924     1.2888218    -0.5723552    -0.3661912    -1.0910195     0.3602834

Now, if I am trying to perform a prediction and the value for Infl comes in as 'Medium', what is done? And what is done for 'Low'? That seems to be the last missing piece in my understanding of how to convert the model values into predictions.

Thank you,
Tate

From: John Fox [EMAIL PROTECTED]
To: 'Tate Avery' [EMAIL PROTECTED]
CC: r-help@stat.math.ethz.ch
Subject: RE: [R] Manually Calculating Odds from POLR Model
Date: Fri, 12 Aug 2005 19:22:23 -0400

Dear Tate,

If I understand correctly what you're asking, the formulas are on p. 21 of the paper at http://socserv.socsci.mcmaster.ca/jfox/Papers/logit-effect-displays.pdf.
But why do you want to do this when you can get the fitted probabilities from predict()?

I hope this helps. John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tate Avery Sent: Friday, August 12, 2005 2:50 PM To: r-help@stat.math.ethz.ch Subject: [R] Manually Calculating Odds from POLR Model

Hello,

I am using polr(...) to generate a model. The summary shows the coefficients and the intercepts. For example:

coefficient for x1 = c1
coefficient for x2 = c2
intercept A|B = i1
intercept B|C = i2

I can then run predict(..., type="p") with the model and see the odds for each factor. For example:

     A    B    C
1  0.3  0.5  0.2
2  0.4  0.1  0.5

What I really want to be able to do is take the 2 coefficients, the 2 intercepts, and the x1, x2 values, and manually calculate the probabilities generated by predict(). I have been searching quite extensively for the underlying calculations that transform the polr output and the input variables into the final output odds, and I have tried a number of dead-end roads so far. So, if anyone has any information on how to do this or where I can find out, I would be extremely grateful.

Thank you for your time, Tate Avery

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
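The hand calculation Tate is after follows from polr()'s parameterization, logit P(Y <= j) = zeta_j - eta, where eta is the linear predictor. A hedged sketch (using the housing example from MASS rather than Tate's x1/x2 data, which we don't have):

```r
library(MASS)

# Fit the ordered-logit model from the polr() help page
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)

# Linear predictor for the first observation (drop the intercept column:
# polr absorbs it into the zeta cutpoints)
X   <- model.matrix(Sat ~ Infl + Type + Cont, data = housing)[, -1]
eta <- drop(X[1, ] %*% coef(fit))

# Cumulative probabilities at each cutpoint, then difference to get
# the per-category probabilities
cum   <- plogis(fit$zeta - eta)   # P(Y <= j)
probs <- diff(c(0, cum, 1))       # P(Y == Low), P(Y == Medium), P(Y == High)

# probs should agree with predict(fit, type = "probs")[1, ]
```

With two cutpoints i1, i2 and coefficients c1, c2 as in Tate's notation, this is P(A) = plogis(i1 - eta), P(B) = plogis(i2 - eta) - plogis(i1 - eta), P(C) = 1 - plogis(i2 - eta), with eta = c1*x1 + c2*x2.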
Re: [R] Matrix oriented computing
Dear Marc,

For that matter, the loop isn't a whole lot slower (on my 3GHz Win XP system):

x <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995)
df <- 1:1000
system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE)
[1] 0.08 0.00 0.08 NA NA
mat <- matrix(0, 1000, 11)
system.time(for (i in 1:length(df)) mat[i,] <- qchisq(x, df[i]))
[1] 0.09 0.00 0.10 NA NA

Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Marc Schwartz (via MN) Sent: Friday, August 26, 2005 10:26 AM To: Peter Dalgaard Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Matrix oriented computing

On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote:

Marc Schwartz [EMAIL PROTECTED] writes:

x <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995)
df <- c(1:100)
mat <- sapply(x, qchisq, df)
dim(mat)
[1] 100 11
str(mat)
num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ...

outer() is perhaps a more natural first try... It does give the transpose of the sapply approach, though. round(t(outer(x, df, qchisq)), 2) should be close. You should likely add dimnames.

What I find interesting is that I would have intuitively expected outer() to be faster than sapply().
However:

system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE)
[1] 0.01 0.00 0.01 0.00 0.00
system.time(mat1 <- round(t(outer(x, df, qchisq)), 2), gcFirst = TRUE)
[1] 0.01 0.00 0.01 0.00 0.00

# No round() or t() to test for overhead
system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE)
[1] 0.01 0.00 0.02 0.00 0.00

# Bear in mind the round() on mat1 above
all.equal(mat, mat1)
[1] Mean relative difference: 4.905485e-05
all.equal(mat, t(mat2))
[1] TRUE

Even when increasing the size of 'df' to 1:1000:

system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE)
[1] 0.16 0.01 0.16 0.00 0.00
system.time(mat1 <- round(t(outer(x, df, qchisq)), 2), gcFirst = TRUE)
[1] 0.16 0.00 0.18 0.00 0.00

# No round() or t() to test for overhead
system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE)
[1] 0.16 0.01 0.17 0.00 0.00

It also seems that, at least in this case, t() and round() do not add much overhead.

Best regards, Marc

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
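Peter's suggestion to add dimnames makes the resulting table self-describing; a small sketch of the outer() variant with labeled margins:

```r
x  <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995)
df <- 1:100

# outer() evaluates qchisq(p, df) over the full grid; transpose so that
# rows are indexed by df and columns by probability, as with sapply()
mat <- t(outer(x, df, qchisq))
dimnames(mat) <- list(df = df, p = x)

mat["10", "0.5"]   # the chi-squared median on 10 degrees of freedom
```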
Re: [R] Rcmdr extensions
Dear Simone,

You'll find instructions for extending the R Commander in Section 4 of the paper at http://socserv.socsci.mcmaster.ca/jfox/Papers/R-commander.pdf. In most instances, you should cause commands to be executed and printed in the Script window, and output to appear in the Output window, by using the function doItAndPrint().

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Simone Gabbriellini Sent: Monday, August 29, 2005 9:48 AM To: r-help@stat.math.ethz.ch Subject: [R] Rcmdr extensions

Dear List,

I am trying to extend Rcmdr with some functions useful to my study. I have successfully added a menu and some submenus to the GUI, and I have placed a .R file in the /etc folder. I am able to call functions in that file, but I cannot see the results: how can I tell Rcmdr to write the output in the output box? Second question: how can I refer to the currently selected Dataset in my .R functions file?

Thank you, Simone Gabbriellini

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
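Both of Simone's questions can be sketched in one hypothetical menu callback. doItAndPrint() and ActiveDataSet() are Rcmdr functions per the conventions John describes; the function name and the command built here are illustrative, not part of the package, and this only runs inside a running R Commander session:

```r
# Hypothetical Rcmdr menu callback (assumes the Rcmdr package is loaded):
mySummary <- function(){
    .activeDataSet <- ActiveDataSet()   # name of the data set selected in the GUI
    # Echo the command to the Script window and print its result in the
    # Output window:
    doItAndPrint(paste("summary(", .activeDataSet, ")", sep = ""))
}
```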
Re: [R] tcltk, X11 protocol error: Bug?
Dear Nicholas,

This problem has been reported before (enter "X11 protocol error" in the R site search at http://finzi.psych.upenn.edu/search.html to see the previous threads), but as far as I know there's no definitive explanation or solution. As well, things appear to work fine despite the warnings. The way I handle the problem in the Rcmdr package is simply to intercept the warnings.

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nicholas Lewin-Koh Sent: Monday, September 05, 2005 5:16 PM To: [EMAIL PROTECTED] Subject: [R] tcltk, X11 protocol error: Bug?

Hi,

I am having trouble debugging this one. The code is attached below, but it seems to be a problem at the C-tk interface. If I run this once there are no problems; if I run it more than once, I start to get warnings that increase in multiples of 11 every time I run it. Here is a sample session:

source("clrramp2.r")
Loading required package: tcltk
Loading Tcl/Tk interface ...
done

clrRamp()
tt <- clrRamp()
tt
function (n)
{
    x <- ramp(seq(0, 1, length = n))
    rgb(x[, 1], x[, 2], x[, 3], max = 255)
}
<environment: 0x8b8674c>
image(matrix(1:10), col = tt(10))
tt <- clrRamp()
There were 22 warnings (use warnings() to see them)
image(matrix(1:10), col = tt(10))
There were 11 warnings (use warnings() to see them)
warnings()
Warning messages:
1: X11 protocol error: BadWindow (invalid Window parameter)
2: X11 protocol error: BadWindow (invalid Window parameter)
...
11: X11 protocol error: BadWindow (invalid Window parameter)

I am running R-2.1.1 on ubuntu linux 5.04, compiled from source (not the deb package). My version of tcl/tk is 8.4. The code is below. If anyone sees something I am doing foolish, let me know; otherwise I will file a bug report.
Nicholas

## File clrramp2.r ##
require(tcltk)

clrRamp <- function(n.col, b.color=NULL, e.color=NULL){

  B.ChangeColor <- function() {
    b.color <- tclvalue(tkcmd("tk_chooseColor",
                              initialcolor=b.color,  # the post had e.color here, apparently a typo
                              title="Choose a color"))
    if (nchar(b.color) > 0){
      tkconfigure(canvas.b, bg=b.color)
      Rmp.Draw()
    }
  }

  E.ChangeColor <- function() {
    e.color <- tclvalue(tkcmd("tk_chooseColor", initialcolor=e.color,
                              title="Choose a color"))
    ## cat(e.color)
    if (nchar(e.color) > 0){
      tkconfigure(canvas.e, bg=e.color)
      Rmp.Draw()
    }
  }

  Rmp.Draw <- function(){
    cr <- colorRampPalette(c(b.color, e.color), space="Lab", interpolate="spline")
    rmpcol <- cr(n.col)
    # rmpcol <- rgb(rmpcol[,1], rmpcol[,2], rmpcol[,3])
    inc <- 300/n.col
    xl <- 0
    for(i in 1:n.col){
      ## item <-
      tkitemconfigure(canvas.r, barlst[[i]], fill=rmpcol[i], outline=rmpcol[i])
      # xl <- xl + inc
    }
  }

  save.ramp <- function(){
    cr <- colorRampPalette(c(b.color, e.color), space="Lab", interpolate="spline")
    tkdestroy(tt)
    ## invisible(cr)
  }

  tt <- tktoplevel()
  tkwm.title(tt, "Color Ramp Tool")
  frame <- tkframe(tt)
  bframe <- tkframe(frame, relief="groove", borderwidth=1)
  if(is.null(b.color)) b.color <- "blue"
  if(is.null(e.color)) e.color <- "yellow"
  if(missing(n.col)) n.col <- 100
  canvas.b <- tkcanvas(bframe, width=50, height=25, bg=b.color)
  canvas.e <- tkcanvas(bframe, width=50, height=25, bg=e.color)
  canvas.r <- tkcanvas(tt, width=300, height=50, bg="white")
  BColor.button <- tkbutton(bframe, text="Begin Color", command=B.ChangeColor)
  ## tkgrid(canvas.b, BColor.button)
  EColor.button <- tkbutton(bframe, text="End Color", command=E.ChangeColor)
  killbutton <- tkbutton(bframe, text="Save", command=save.ramp)
  tkgrid(canvas.b, BColor.button, canvas.e, EColor.button)
  tkgrid(bframe)
  tkgrid(frame)
  tkgrid(canvas.r)
  tkgrid(killbutton)
  cr <- colorRampPalette(c(b.color, e.color), space="Lab", interpolate="spline")
  ## rmpcol <- hex(mixcolor(alpha, bc, ec, where="LUV"))
  rmpcol <- cr(n.col)
  inc <- 300/n.col
  xl <- 0
  # barlst <- vector(length=n.col, mode="list")
  barlst <- tclArray()
  for(i in 1:n.col){
    item <- tkcreate(canvas.r, "rect", xl, 0, xl + inc, 50,
                     fill=rmpcol[i], outline=rmpcol[i])
    ## tkaddtag(canvas.r, "point",
    ## "withtag", item
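John mentions above that the Rcmdr simply intercepts these warnings. A generic sketch of that approach (not the Rcmdr's actual code) using a calling handler that muffles only the X11 messages and lets everything else through:

```r
# Run an expression, swallowing only "X11 protocol error" warnings:
muffleX11 <- function(expr)
    withCallingHandlers(expr,
        warning = function(w) {
            if (grepl("X11 protocol error", conditionMessage(w)))
                invokeRestart("muffleWarning")   # discard just these warnings
        })
```

Other warnings still propagate normally, so genuine problems are not hidden.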
[R] [R-pkgs] Internationalized version of the Rcmdr package
Dear R-packages list members,

I've recently uploaded to CRAN version 1.1-1 of the Rcmdr (R Commander) package. Based on the tcltk package, the Rcmdr provides a basic-statistics graphical interface to R.

The general functionality of the R Commander has not changed much since version 1.0-0. There are some small improvements to the package, but the reason that I'm re-announcing it is that the latest version supports translation into other languages, using the new internationalization and localization features of R 2.1.x. A Catalan translation (courtesy of Manel Salamero), a French translation (courtesy of Philippe Grosjean), a Japanese translation (courtesy of Takaharu Araki), and a Slovenian translation (courtesy of Jaro Lajovic) are provided with the Rcmdr package. I understand that Portuguese, Russian, and Spanish translations are in the works, and these will be incorporated when they are completed. If you're interested in translating the Rcmdr into another language, please contact me.

As before, bug reports, suggestions, and other feedback are appreciated.

John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Plotting an ellipse in 3D
=data.frame(x=0, z=0, groups=group)), 0, paste(group, " "), adj=1, color=surface.col[j])
if (residuals){
    yy <- y[select.obs]
    xx <- x[select.obs]
    zz <- z[select.obs]
    fitted <- fitted(mod)
    rgl.lines(as.vector(rbind(xx,xx)), as.vector(rbind(yy,fitted)),
              as.vector(rbind(zz,zz)), col=surface.col[j])
} } } } } }
if (revolutions > 0) {
    for (i in 1:revolutions){
        for (angle in seq(1, 360, length=360/speed)) rgl.viewpoint(-angle, fov=fov)
    }
}
if (model.summary) return(summaries) else return(invisible(NULL))
}

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Duncan Murdoch Sent: Friday, September 09, 2005 8:03 AM To: Mike White Cc: R-help@stat.math.ethz.ch Subject: Re: [R] Plotting an ellipse in 3D

Mike White wrote: I have been using the ellipse function from the car package and the covariance matrix to draw an ellipse around a group of points to show the confidence limits. However, the points are actually represented by 3 variables, so rather than plot each pairwise combination of variables in 2D I would like to plot the 'ellipse' in 3D using the djmrgl package. Can anyone offer advice on how I can plot the surface of a 3D 'ellipse' using the covariance matrix to define the shape, so that the points inside can also be seen?

You should use rgl rather than djmrgl. It now has most of the same functions plus a lot more. Then you can plot the ellipse as a wireframe or transparent object. See the demo(regression) example for that kind of drawing; demo(shapes3d) for ellipses. (The demo names are from memory, I don't have access to it right now.)

Duncan Murdoch

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
Re: [R] Collineariy Diagnostics
Dear Sundar and Antoine,

In addition, the vif function in the car package will calculate generalized variance-inflation factors.

Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sundar Dorai-Raj Sent: Tuesday, September 13, 2005 5:16 AM To: Antoine de Bary Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Collineariy Diagnostics

Antoine de Bary wrote: Hi, and thanks for your help. In order to do collinearity analysis I downloaded the perturb package. I ran a lm (regression) and on that the "colldiag" command to get condition numbers, but I get the following message: the variable XY with mode "numeric" was not found (it does the same with all predictors, although all variables are numeric and exist). Can anyone tell me how I can get around this problem? Is there another way to get "condition numbers"? What about VIF? Please return message to: [EMAIL PROTECTED]

I cannot comment on the perturb package. However, for condition numbers see ?kappa.lm, and for variance inflation factors see ?vif. The latter is in the Design package.

set.seed(1)
x1 <- rnorm(100)
x2 <- x1 + 0.1 * rnorm(100)
y <- rnorm(100)
f <- lm(y ~ x1 + x2)
vif(f)
kappa(f)

HTH, --sundar

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] is library loaded
Dear Omar,

The following function tests whether a package is in the search path (with the package name given in quotes):

packageLoaded <- function(name)
    0 != length(grep(paste("^package:", name, "$", sep=""), search()))

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Omar Lakkis Sent: Tuesday, September 13, 2005 2:09 PM To: r-help@stat.math.ethz.ch Subject: [R] is library loaded

Is there a way to test if a library has been loaded? is.loaded does not give me what I want. I am looking to say:

if loaded(Rdbi) dbSendQuery(conn, q)
if loaded(RODBC) sqlQuery(conn, q)

I need this to support both unix and windows platforms, as I could not find a windows distribution for RdbiPgSQL. I am using R 2.1.0 and postgresql. I will be connecting to the database using Rdbi and RdbiPgSQL and have other developers using windows connect with RODBC, unless someone can suggest a better solution.

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
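Putting John's function together with Omar's intended dispatch gives a self-contained sketch (conn and q are placeholders from Omar's message, so the database calls are left commented):

```r
# Test whether a package is attached to the search path:
packageLoaded <- function(name)
    0 != length(grep(paste("^package:", name, "$", sep=""), search()))

# Omar's dispatch between the two database interfaces:
if (packageLoaded("Rdbi")) {
    ## dbSendQuery(conn, q)
} else if (packageLoaded("RODBC")) {
    ## sqlQuery(conn, q)
}
```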
Re: [R] is library loaded
Dear Robert,

packageLoaded() may well be a bad name, but loadedNamespaces() won't detect a package without a namespace. It therefore seemed safe to me to check the path, which would include packages both with and without namespaces. With respect to loading and attaching, I thought that library() both loaded a package (with or without a namespace) and attached it to the search path, but I must admit that I'm easily confused about these distinctions.

Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: Robert Gentleman [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 13, 2005 3:53 PM To: John Fox Cc: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch Subject: Re: [R] is library loaded

Hi,

Almost surely this is a bad name. With the advent of name spaces it is important to distinguish between loading and attaching. This function tests for attached packages. To test for loaded packages we already have loadedNamespaces.

Best wishes, Robert

John Fox wrote: Dear Omar, The following function tests whether a package is in the search path (with the package name given in quotes):

packageLoaded <- function(name)
    0 != length(grep(paste("^package:", name, "$", sep=""), search()))

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Omar Lakkis Sent: Tuesday, September 13, 2005 2:09 PM To: r-help@stat.math.ethz.ch Subject: [R] is library loaded

Is there a way to test if a library has been loaded? is.loaded does not give me what I want. I am looking to say:

if loaded(Rdbi) dbSendQuery(conn, q)
if loaded(RODBC) sqlQuery(conn, q)

I need this to support both unix and windows platforms, as I could not find a windows distribution for RdbiPgSQL. I am using R 2.1.0 and postgresql.
I will be connecting to the database using Rdbi and RdbiPgSQL and have other developers using windows connect with RODBC, unless someone can suggest a better solution.

-- Robert Gentleman, PhD Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M2-B876 PO Box 19024 Seattle, Washington 98109-1024 206-667-7700 [EMAIL PROTECTED]

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Rcommander and simple chisquare
Dear Christian,

From the Rcmdr menus, select Statistics -> Summaries -> Frequency distributions, and check the "Chi-square goodness-of-fit test" box in the resulting dialog. This will bring up a dialog box where you can enter hypothesized probabilities, from which expected frequencies will be calculated.

Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Christian Jost Sent: Thursday, September 15, 2005 5:40 AM To: r-help@stat.math.ethz.ch Subject: [R] Rcommander and simple chisquare

In this year's biostat teaching I will include Rcommander (it indeed simplifies syntax problems that make students frequently miss the core statistical problems). But I could not find how to make a simple chisquare comparison between observed frequencies and expected frequencies (e.g. in genetics, where you expect phenotypic frequencies corresponding to 3:1 for standard dominant/recessive alleles). Any idea where this feature might be hidden? Or could it be added to Rcommander?

Thanks, Christian.

ps: in case I am not making myself clear, can Rcommander be made to perform chisq.test(c(61,39), p=c(0.75,0.25))

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Rcommander and simple chisquare
Dear Christian,

The Rcmdr assumes that you have a data frame with the original data, in which the variable in question is a factor. The frequency distribution is constructed for the factor. Thus, in your example, you'd have 100 observations classified on a two-category factor. What you enter directly are the hypothesized probabilities.

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: Christian Jost [mailto:[EMAIL PROTECTED] Sent: Thursday, September 15, 2005 11:38 AM To: John Fox; 'Philippe Grosjean' Cc: r-help@stat.math.ethz.ch Subject: RE: [R] Rcommander and simple chisquare

Dear John and Philippe,

thanks for your replies. I finally found this menu, but I am somewhat at a loss as to how I should enter the observed frequencies. To take my example below: if I enter a one-column data.frame with the numbers 61 and 39, John's indicated menu is not highlighted. If I add a second column containing some factor, the menu is highlighted but I cannot select the first column. However, if I edit the data and declare the first column to be of type 'character', I can select it in the menu dialog and declare the expected frequencies, but the chisquare output doesn't make any sense. For the moment I cannot make any sense of that :-( Any help most appreciated, or a link to the tutorial/faq that explains this kind of problem.

Thanks, Christian.

At 11:31 -0400 15/09/05, John Fox wrote: Dear Philippe, This does a chi-square test of independence in a contingency table, not a chi-square goodness-of-fit test (which is done in the Rcmdr via Statistics -> Summaries -> Frequency distributions).
Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Philippe Grosjean Sent: Thursday, September 15, 2005 7:32 AM To: Christian Jost Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Rcommander and simple chisquare

Hello,

Just look at Statistics -> Contingency tables. There is an option for making the chi-square test there.

Best, Philippe Grosjean

Prof. Philippe Grosjean

Christian Jost wrote: In this year's biostat teaching I will include Rcommander (it indeed simplifies syntax problems that make students frequently miss the core statistical problems). But I could not find how to make a simple chisquare comparison between observed frequencies and expected frequencies (e.g. in genetics, where you expect phenotypic frequencies corresponding to 3:1 for standard dominant/recessive alleles). Any idea where this feature might be hidden? Or could it be added to Rcommander? Thanks, Christian. ps: in case I am not making myself clear, can Rcommander be made to perform chisq.test(c(61,39), p=c(0.75,0.25))

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] text editor TinR?
Dear Mihai,

It's Tinn-R (with two n's), at http://www.sciviews.org/Tinn-R/.

John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mihai Nica Sent: Monday, September 19, 2005 4:07 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] text editor TinR?

Greetings,

Please help me remember the name of the tiny text editor that works with R ("TinR" maybe?). I cannot find it at all, and cannot remember it; it is really frustrating...

Thanks, Mihai Nica, ABD Jackson State University 170 East Griffith Street G5 Jackson, MS 39201 601 914 0361

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Are least-squares means useful or appropriate?
Dear Peter, Doug, and Felipe,

My effects package (on CRAN; also see the article at http://www.jstatsoft.org/counter.php?id=75&url=v08/i15/effect-displays-revised.pdf) will compute and graph adjusted effects of various kinds for linear and generalized linear models -- generalizing so-called least-squares means (or population marginal means, or adjusted means). A couple of comments: By default, the all.effects() function in the effects package computes effects for high-order terms in the model, absorbing terms marginal to them. You can ask the effect() function to compute an effect for a term that's marginal to a higher-order term, and it will do so with a warning, but this is rarely sensible. Peter's mention of floating variances (or quasi-variances) in this context is interesting, but what one would most like to see, I think, are the quasi-variances for the adjusted effects, that is, for terms merged with their lower-order relatives. These, for example, are unaffected by contrast coding. How to define reasonable quasi-variances in this context has been puzzling me for a while.

Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard Sent: Friday, September 23, 2005 10:23 AM To: Douglas Bates Cc: Felipe; R-help@stat.math.ethz.ch Subject: Re: [R] Are least-squares means useful or appropriate?

Douglas Bates [EMAIL PROTECTED] writes: On 9/20/05, Felipe [EMAIL PROTECTED] wrote:

Hi. My question was just theoretical. I was wondering if someone who was using SAS and R could give me their opinion on the topic. I was trying to use least-squares means for comparison in R, but then I found some indications against them, and I wanted to know if they had a good basis (as I told earlier, they were not much detailed). Greetings.
Felipe

As Deepayan said in his reply, the concept of least-squares means is associated with SAS and is not generally part of the theory of linear models in statistics. My vague understanding of these (I too am not a SAS user) is that they are an attempt to estimate the mean response for a particular level of a factor in a model in which that factor has a non-ignorable interaction with another factor. There is no clearly acceptable definition of such a thing.

(PD goes and fetches the SAS manual)

Well, yes, it'll do that too, although only if you ask for the lsmeans of A when an interaction like A*B is present in the model. This is related to the tests of main effects when an interaction is present using type III sums of squares, which has been beaten to death repeatedly on the list. In both cases, there seems to be an implicit assumption that categorical variables by nature come from an underlying fully balanced design.

If the interaction is absent from the model, the lsmeans are somewhat more sensible in that they at least reproduce the parameter estimates as contrasts between different groups. All continuous variables in the design will be set to their mean, but values for categorical design variables are weighted inversely as the number of groups. So if you're doing an lsmeans of lung function by smoking adjusted for age and sex, you get estimates for the mean of a population in which everyone has the same age and half are male and half are female. This makes some sense, but if you do it for sex adjusting for smoking and age, you are not only forcing the sexes to smoke equally much, but actually adjusting to smoking rates of 50%, which could be quite far from reality.

The whole operation really seems to revolve around 2 things: (1) pairwise comparisons between factor levels. This can alternatively be done fairly easily using parameter estimates for the relevant variable and associated covariances.
You don't really need all the mumbo-jumbo of adjusting to particular values of other variables. (2) plotting effects of a factor with error bars as if they were simple group means. This has some merit, since the standard parametrizations are misleading at times (e.g., if you choose the group with the least data as the reference level, the standard errors for the other groups will seem high). However, it seems to me that concepts like floating variances (see float() in the Epi package) are more to the point.

R is an interactive language where it is a simple matter to fit a series of models and base your analysis on a model that is appropriate. An approach of "give me the answer to any possible question about this model, whether or not it makes sense" is unnecessary. In many ways statistical
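John's effects package, mentioned at the top of this thread, computes the adjusted effects being discussed. A hedged sketch of its use (the model and data are illustrative -- Duncan ships with the car package -- and the function names follow the package's API as described above):

```r
library(car)      # for the Duncan occupational-prestige data
library(effects)

# A linear model with a factor ('type') and two covariates
mod <- lm(prestige ~ income + education + type, data = Duncan)

# Adjusted means for 'type', holding the other terms at typical values --
# the effects-package analogue of SAS's lsmeans
eff <- effect("type", mod)
summary(eff)      # estimates with confidence limits
plot(eff)         # effect display with error bars
```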
Re: [R] plot question when type = b and pch is a vector
Dear tokas,

How about:

x <- seq(0.01, 10, length = 20)
xx <- x[7]
x[7] <- NA
plot(c(0, 10), c(-20, 20), type = "n", xlab = "x",
     ylab = expression(2 * alpha * log(x)))
for(i in 1:4){
    lines(x, 2 * i * log(x), lty = 1)
    text(xx, 2 * i * log(xx), i)
}

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of toka tokas Sent: Sunday, October 02, 2005 5:37 AM To: r-help@stat.math.ethz.ch Subject: [R] plot question when type = "b" and pch is a vector

Dear R users,

I've been struggling some days with the following problem: I'm interested in producing the following plot

x <- seq(0.01, 10, length = 20)
plot(c(0, 10), c(-20, 20), type = "n", xlab = "x",
     ylab = expression(2 * alpha * log(x)))
pch. <- rep(NA, length(x))
for(i in 1:4){
    pch.[7] <- as.character(i)
    lines(x, 2 * i * log(x), type = "b", pch = pch., lty = 1)
}

where all the line segments are connected, except for the 7th one where I've put the value of alpha -- in other words, I'd like to produce a line plot where the label appears at each line with some white space around it.

thanks in advance, tokas

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] generalized linear model for multinomial data?
Dear Hongyu, See multinom() in the nnet package, associated with Venables and Ripley's Modern Applied Statistics with S. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Hongyu Sun Sent: Sunday, October 02, 2005 3:07 PM To: r-help@stat.math.ethz.ch Subject: [R] generalized linear model for multinomial data? Dear All: Does R have a package, like SAS's generalized logits model, for nominal response data? I have searched but cannot find the Windows package. Many thanks, HS __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
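As a minimal illustration of the suggestion, a multinomial logit can be fit like this (a sketch using the built-in iris data, which is not from the original question; trace = FALSE merely suppresses the optimizer's progress output):

```r
library(nnet)  # a recommended package, shipped with standard R installations

# Fit a multinomial logit for a three-level nominal response
fit <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris,
                trace = FALSE)

# One row of coefficients per non-reference response level
coef(fit)

# Predicted class memberships
head(predict(fit, type = "class"))
```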
Re: [R] Rcmdr and scatter3d
(predict(mod, newdata=dat), grid.lines, grid.lines)
if (fill) rgl.surface(vals, vals, yhat, color=surface.col[j], alpha=0.5, lit=FALSE)
if (grid) rgl.surface(vals, vals, yhat, color=if (fill) grid.col else surface.col[j], alpha=0.5, lit=FALSE, front="lines", back="lines")
rgl.texts(0, predict(mod, newdata=data.frame(x=0, z=0, groups=group)), 0, paste(group, " "), adj=1, color=surface.col[j])
if (residuals){
    yy <- y[select.obs]
    xx <- x[select.obs]
    zz <- z[select.obs]
    fitted <- fitted(mod)
    rgl.lines(as.vector(rbind(xx,xx)), as.vector(rbind(yy,fitted)), as.vector(rbind(zz,zz)), col=surface.col[j])
    }
}
}
}
}
}
if (revolutions > 0) {
    for (i in 1:revolutions){
        for (angle in seq(1, 360, length=360/speed)) rgl.viewpoint(-angle, fov=fov)
        }
}
if (model.summary) return(summaries) else return(invisible(NULL))
}

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Naiara S. Pinto Sent: Tuesday, October 04, 2005 6:13 PM To: r-help@stat.math.ethz.ch Subject: [R] Rcmdr and scatter3d

Hi folks, I'd like to use scatter3d (which is in R commander) to plot more than one dataset in the same graph, each dataset with a different color. The kind of stuff you would do with hold on in Matlab. I read a recent message that was posted to this list with a similar problem, but I couldn't understand the reply. Could someone give me one example? How do you plot subgroups using scatter3d? Thanks a lot! Naiara. Naiara S. Pinto Ecology, Evolution and Behavior 1 University Station A6700 Austin, TX, 78712

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Rcmdr and scatter3d
Dear Ted, I assumed that since Naiara was using scatter3d(), he wants a 3D dynamic scatterplot. He could add points (actually, spheres) to the rgl graph produced by scatter3d() -- the analog of plot() followed by points() for a 2D graph -- but doing so would be much more work than plotting by groups. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: ecatchpole [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 04, 2005 8:55 PM To: John Fox Cc: 'Naiara S. Pinto'; r-help@stat.math.ethz.ch Subject: Re: [R] Rcmdr and scatter3d Naiara, Alternatively, instead of scatter3d, the analogy to "hold on" in Matlab is to use plot() for the first set of data, then points() for the remainder. See ?plot ?points Ted. On 05/10/05 11:18, John Fox wrote: Dear Naiara, Combine the data sets and differentiate among them with a factor. Then use the groups argument to scatter3d (see ?scatter3d). If you're using the R Commander to make the plot, the 3D scatterplot dialog box has a plot-by-groups button. You can also fit colour-coded regression surfaces by group. I've appended a new version of the scatter3d function, not yet in the Rcmdr package, which will also plot data ellipsoids (for the whole data set or by groups). I hope this helps, John --- snip -- John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Naiara S. Pinto Sent: Tuesday, October 04, 2005 6:13 PM To: r-help@stat.math.ethz.ch Subject: [R] Rcmdr and scatter3d Hi folks, I'd like to use scatter3d (which is in R commander) to plot more than one dataset in the same graph, each dataset with a different color. The kind of stuff you would do with "hold on" in Matlab. 
I read a recent message that was posted to this list with a similar problem, but I couldn't understand the reply. Could someone give me one example? How do you plot subgroups using scatter3d? Thanks a lot! Naiara. Naiara S. Pinto Ecology, Evolution and Behavior 1 University Station A6700 Austin, TX, 78712 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Dr E.A. Catchpole Visiting Fellow Univ of New South Wales at ADFA, Canberra, Australia and University of Kent, Canterbury, England - www.ma.adfa.edu.au/~eac - fax: +61 2 6268 8786 - ph: +61 2 6268 8895
Re: [R] testing non-linear component in mgcv:gam
Dear Denis, Take a closer look at the anova table: The models provide identical fits to the data. The differences in degrees of freedom and deviance between the two models are essentially zero, 5.5554e-10 and 2.353e-11 respectively. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot Sent: Wednesday, October 05, 2005 8:22 AM To: r-help@stat.math.ethz.ch Subject: [R] testing non-linear component in mgcv:gam Hi, I need further help with my GAMs. Most models I test are very obviously non-linear. Yet, to be on the safe side, I report the significance of the smooth (default output of mgcv's summary.gam) and confirm it deviates significantly from linearity. I do the latter by fitting a second model where the same predictor is entered without the s(), and then use anova.gam to compare the two. I thought this was the equivalent of the default output of anova.gam using package gam instead of mgcv. I wonder if this procedure is correct because one of my models appears to be linear. In fact mgcv estimates df to be exactly 1.0 so I could have stopped there. However I inadvertently repeated the procedure outlined above. I would have thought in this case the anova.gam comparing the smooth and the linear fit would for sure have been not significant. To my surprise, P was 6.18e-09! Am I doing something wrong when I attempt to confirm the non-parametric part of a smoother is significant? 
Here is my example case where the relationship does appear to be linear:

library(mgcv)
This is mgcv 1.3-7

Temp <- c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62, 0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12, 3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62, 5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62)
N.sets <- c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31, 22, 26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10, 2, 5, 3, 1, 1, 1, 1, 1)
wm.sed <- c(0.0, 0.016129032, 0.0, 0.062046512, 0.396459596, 0.189082949, 0.054757925, 0.142810440, 0.168005168, 0.180804428, 0.111439628, 0.128799505, 0.193707937, 0.105921610, 0.103497845, 0.028591837, 0.217894389, 0.020535469, 0.080389068, 0.105234450, 0.070213450, 0.050771363, 0.042074434, 0.102348837, 0.049748344, 0.019100478, 0.005203125, 0.101711864, 0.0, 0.0, 0.014808824, 0.0, 0.22200, 0.16700, 0.0, 0.0, 0.0)

sed.gam <- gam(wm.sed ~ s(Temp), weight = N.sets)
summary.gam(sed.gam)

Family: gaussian
Link function: identity
Formula: wm.sed ~ s(Temp)
Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.08403    0.01347   6.241 3.73e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
        edf Est.rank     F  p-value
s(Temp)   1        1 13.95 0.000666 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) = 0.554   Deviance explained = 28.5%
GCV score = 0.09904   Scale est. = 0.093686   n = 37

# testing non-linear contribution
sed.lin <- gam(wm.sed ~ Temp, weight = N.sets)
summary.gam(sed.lin)

Family: gaussian
Link function: identity
Formula: wm.sed ~ Temp
Parametric coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.162879   0.019847   8.207 1.14e-09 ***
Temp        -0.023792   0.006369  -3.736 0.000666 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) = 0.554   Deviance explained = 28.5%
GCV score = 0.09904   Scale est.
= 0.093686   n = 37

anova.gam(sed.lin, sed.gam, test = "F")
Analysis of Deviance Table
Model 1: wm.sed ~ Temp
Model 2: wm.sed ~ s(Temp)
  Resid. Df Resid. Dev         Df   Deviance      F   Pr(>F)
1 3.5000e+01      3.279
2 3.5000e+01      3.279 5.5554e-10  2.353e-11 0.4521 6.18e-09 ***

Thanks in advance, Denis Chabot

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
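The situation Denis describes -- a smooth that collapses to a straight line, so the smooth and linear models fit essentially identically -- is easy to reproduce end to end. Here is a self-contained sketch on simulated data (not Denis's data; all names and values are invented):

```r
library(mgcv)  # a recommended package, shipped with standard R installations

set.seed(1)
x <- runif(100)
y <- 2 * x + rnorm(100, sd = 0.3)   # the true relationship is linear

fit.smooth <- gam(y ~ s(x))
fit.linear <- gam(y ~ x)

# When the smooth shrinks to a line, its effective degrees of freedom
# are close to 1 and the two fits are nearly identical, so the anova
# comparison has near-zero df and deviance differences.
summary(fit.smooth)$edf
anova(fit.linear, fit.smooth, test = "F")
```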
Re: [R] transparent surface in rgl
Dear Paul, I don't have experience with rgl.postscript(), which is relatively new, but find that the png graphs produced by rgl.snapshot() are of reasonably good quality and preserve transparency. Perhaps the developers of the rgl package can shed more light on the matter. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Prof. Paul R. Fisher Sent: Wednesday, October 05, 2005 8:32 AM To: r-help@stat.math.ethz.ch Subject: [R] transparent surface in rgl Hi all I am a complete newbie to this list (just subscribed) and a newcomer to R (an S user from olden times). I have been using scatter3d to create a 3d scatter plot with surface. The graphic is created within the rgl package and I have used rgl.postscript to export it so I can generate a publication quality image. My problem is that the plotted surface is no longer transparent in the postscript output, i.e. the rgl.spheres that are behind the surface disappear in the postscript image. Can't seem to find any info on this anywhere. Am I doing something wrong? Is there an easy fix? Anyway, thanks. Hope I've not broken some netiquette rule sending this. Cheers, Paul Fisher. -- Prof. Paul R. Fisher, Chair in Microbiology, La Trobe University, VIC 3086, AUSTRALIA. Tel. + 61 3 9479 2229 Fax. + 61 3 9479 1222 Email. [EMAIL PROTECTED] Web. http://www.latrobe.edu.au/mcbg/my.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] testing non-linear component in mgcv:gam
Dear Denis, The chi-square test is formed in analogy to what's done for a GLM: The difference in residual deviance for the nested models is divided by the estimated scale parameter -- i.e., the estimated error variance for a model with normal errors. Otherwise, as you point out, the test would be dependent upon the scale of the response. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot Sent: Wednesday, October 05, 2005 9:04 AM To: John Fox Cc: R list Subject: Re: [R] testing non-linear component in mgcv:gam Hi John, On 2005-10-05 at 09:45, John Fox wrote: Dear Denis, Take a closer look at the anova table: The models provide identical fits to the data. The differences in degrees of freedom and deviance between the two models are essentially zero, 5.5554e-10 and 2.353e-11 respectively. I hope this helps, John This is one of my difficulties. In some examples I found on the web, the difference in deviance is compared directly against the chi-squared distribution. But my y variable has a very small range (between 0 and 0.5, most of the time) so the difference in deviance is always very small, and if I compared it against the chi-squared distribution as I have seen done in examples, the non-linear component would always be not significant. Yet it is (with one exception), tested with both mgcv:gam and gam:gam. I think the examples I have read were wrong in this regard; the scale factor seen in mgcv output seems to intervene. But exactly how is still mysterious to me and I hesitate to judge the size of the deviance difference myself. I agree it is near zero in my example. I guess I need to have more experience with these models to better interpret the output... 
Denis -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot Sent: Wednesday, October 05, 2005 8:22 AM To: r-help@stat.math.ethz.ch Subject: [R] testing non-linear component in mgcv:gam Hi, I need further help with my GAMs. Most models I test are very obviously non-linear. Yet, to be on the safe side, I report the significance of the smooth (default output of mgcv's summary.gam) and confirm it deviates significantly from linearity. I do the latter by fitting a second model where the same predictor is entered without the s(), and then use anova.gam to compare the two. I thought this was the equivalent of the default output of anova.gam using package gam instead of mgcv. I wonder if this procedure is correct because one of my models appears to be linear. In fact mgcv estimates df to be exactly 1.0 so I could have stopped there. However I inadvertently repeated the procedure outlined above. I would have thought in this case the anova.gam comparing the smooth and the linear fit would for sure have been not significant. To my surprise, P was 6.18e-09! Am I doing something wrong when I attempt to confirm the non-parametric part of a smoother is significant? 
Here is my example case where the relationship does appear to be linear: library(mgcv) This is mgcv 1.3-7 Temp <- c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62, 0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12, 3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62, 5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62) N.sets <- c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31, 22, 26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10, 2, 5, 3, 1, 1, 1, 1, 1) wm.sed <- c(0.0, 0.016129032, 0.0, 0.062046512, 0.396459596, 0.189082949, 0.054757925, 0.142810440, 0.168005168, 0.180804428, 0.111439628, 0.128799505, 0.193707937, 0.105921610, 0.103497845, 0.028591837, 0.217894389, 0.020535469, 0.080389068, 0.105234450, 0.070213450, 0.050771363, 0.042074434, 0.102348837, 0.049748344, 0.019100478, 0.005203125, 0.101711864, 0.0, 0.0, 0.014808824, 0.0, 0.22200, 0.16700, 0.0, 0.0, 0.0) sed.gam <- gam(wm.sed~s(Temp),weight=N.sets) summary.gam(sed.gam) Family: gaussian Link function: identity Formula: wm.sed ~ s(Temp) Parametric coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0.08403 0.01347 6.241 3.73e-07 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Approximate significance of smooth terms: edf Est.rank F p-value s(Temp) 1 1 13.95 0.000666
Re: [R] testing non-linear component in mgcv:gam
Dear Denis, You got me: I would have thought from ?summary.gam that this would be the same as the adjusted R^2 for a linear model. Note, however, that the percentage of deviance explained checks with the R^2 from the linear model, as expected. Maybe you should address this question to the package author. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Denis Chabot [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 05, 2005 3:33 PM To: John Fox Cc: R list Subject: Re: [R] testing non-linear component in mgcv:gam Thank you everyone for your help, but my introduction to GAM is turning my brain to mush. I thought the one part of the output I understood the best was r-sq (adj), but now even this is becoming foggy. In my original message I mentioned a gam fit that turns out to be a linear fit. Out of curiosity I analysed it with a linear predictor only with the mgcv package, and then as a linear model. The output was identical in both, but the r-sq (adj) was 0.55 in mgcv and 0.26 in lm. In lm I hope that my interpretation that 26% of the variance in y is explained by the linear relationship with x is valid. Then what does r2 mean in mgcv? Denis

summary.gam(lin)
Family: gaussian
Link function: identity
Formula: wm.sed ~ Temp
Parametric coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.162879   0.019847   8.207 1.14e-09 ***
Temp        -0.023792   0.006369  -3.736 0.000666 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R-sq.(adj) = 0.554   Deviance explained = 28.5%
GCV score = 0.09904   Scale est. = 0.093686   n = 37

summary(sed.true.lin)
Call: lm(formula = wm.sed ~ Temp, weights = N.sets)
Residuals:
    Min      1Q  Median      3Q     Max
-0.6138 -0.1312 -0.0325  0.1089  1.1449
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.162879   0.019847   8.207 1.14e-09 ***
Temp        -0.023792   0.006369  -3.736 0.000666 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1 Residual standard error: 0.3061 on 35 degrees of freedom Multiple R-Squared: 0.285, Adjusted R-squared: 0.2646 F-statistic: 13.95 on 1 and 35 DF, p-value: 0.000666 On 2005-10-05 at 09:45, John Fox wrote: Dear Denis, Take a closer look at the anova table: The models provide identical fits to the data. The differences in degrees of freedom and deviance between the two models are essentially zero, 5.5554e-10 and 2.353e-11 respectively. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot Sent: Wednesday, October 05, 2005 8:22 AM To: r-help@stat.math.ethz.ch Subject: [R] testing non-linear component in mgcv:gam Hi, I need further help with my GAMs. Most models I test are very obviously non-linear. Yet, to be on the safe side, I report the significance of the smooth (default output of mgcv's summary.gam) and confirm it deviates significantly from linearity. I do the latter by fitting a second model where the same predictor is entered without the s(), and then use anova.gam to compare the two. I thought this was the equivalent of the default output of anova.gam using package gam instead of mgcv. I wonder if this procedure is correct because one of my models appears to be linear. In fact mgcv estimates df to be exactly 1.0 so I could have stopped there. However I inadvertently repeated the procedure outlined above. I would have thought in this case the anova.gam comparing the smooth and the linear fit would for sure have been not significant. To my surprise, P was 6.18e-09! Am I doing something wrong when I attempt to confirm the non-parametric part of a smoother is significant? 
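On the lm() side of this comparison, at least, the adjusted R^2 is a simple deterministic function of the multiple R^2, the sample size, and the number of predictors. A quick check on made-up data (not Denis's) is:

```r
set.seed(42)
x <- 1:40
y <- 0.5 * x + rnorm(40, sd = 5)
fit <- lm(y ~ x)

s <- summary(fit)
n <- length(y)
p <- 1  # number of predictors

# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)
r2.adj <- 1 - (1 - s$r.squared) * (n - 1) / (n - p - 1)
all.equal(r2.adj, s$adj.r.squared)  # TRUE
```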
Here is my example case where the relationship does appear to be linear: library(mgcv) This is mgcv 1.3-7 Temp <- c(-1.38, -1.12, -0.88, -0.62, -0.38, -0.12, 0.12, 0.38, 0.62, 0.88, 1.12, 1.38, 1.62, 1.88, 2.12, 2.38, 2.62, 2.88, 3.12, 3.38, 3.62, 3.88, 4.12, 4.38, 4.62, 4.88, 5.12, 5.38, 5.62, 5.88, 6.12, 6.38, 6.62, 6.88, 7.12, 8.38, 13.62) N.sets <- c(2, 6, 3, 9, 26, 15, 34, 21, 30, 18, 28, 27, 27, 29, 31, 22, 26, 24, 23, 15, 25, 24, 27, 19, 26, 24, 22, 13, 10, 2, 5, 3, 1, 1, 1, 1, 1) wm.sed <- c(0.0, 0.016129032, 0.0, 0.062046512, 0.396459596, 0.189082949
Re: [R] R/S-Plus equivalent to Genstat predict: predictions over averages of covariates
Dear Peter, See the effects package, described in http://www.jstatsoft.org/counter.php?id=75&url=v08/i15/effect-displays-revised.pdf. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dunn Sent: Wednesday, October 05, 2005 9:06 PM To: R-help mailing list Subject: [R] R/S-Plus equivalent to Genstat predict: predictions over averages of covariates Hi all I'm doing some things with a colleague comparing different sorts of models. My colleague has fitted a number of glms in Genstat (which I have never used), while the glm I have been using is only available for R. He has a spreadsheet of fitted means from each of his models obtained from using the Genstat predict function. For example, suppose we fit a model of the type

glm.out <- glm(y ~ factor(F1) + factor(F2) + X1 + poly(X2, 2) + poly(X3, 2), family = ...)

Then he produces a table like this (made up, but similar):

F1(level1)  12.2
F1(level2)  14.2
F1(level3)  15.3
F2(level1)  10.3
F2(level2)   9.1
X1=0    10.2
X1=0.5  10.4
X1=1    10.4
X1=1.5  10.5
X1=2    10.9
X1=2.5  11.9
X1=3    11.8
X2=0    12.0
X2=0.5  12.2
X2=1    12.5
X2=1.5  12.9
X2=2    13.0
X2=2.5  13.1
X2=3    13.5

Each of the numbers is a predicted mean. So when X1=0, on average we predict an outcome of 10.2. To obtain these figures in Genstat, he uses the Genstat predict function. When I asked for an explanation of how it was done (i.e. to make the predictions, what values of the other covariates were used) I was told: So, for a one-dimensional table of fitted means for any factor (or variate), all other variates are set to their average values; and the factor constants (including the first, at zero) are given a weighted average depending on their respective numbers of observations. So for quantitative variables (such as pH), one uses the mean pH in the data set when making the predictions. 
Reasonable and easy. But for categorical variables (like Month), he implies we use a weighted average of the fitted coefficients for all the months, depending on the proportion of times those factor levels appear in the data. (I hope I explained that OK...) Is there an equivalent way in R or S-Plus of doing this? I have to do it for a number of sites and species, so an automated way would be useful. I have tried searching to no avail (but may not be searching on the correct terms), and tried hard-coding something myself as yet unsuccessfully: The poly terms and the use of the weighted averaging over the factor levels are proving a bit too much for my limited skills. Any assistance appreciated. (Any clarification of what I mean can be provided if I have not been clear.) Thanks, as always. P.

version
         _
platform i386-pc-linux-gnu
arch     i386
os       linux-gnu
system   i386, linux-gnu
status
major    2
minor    1.0
year     2005
month    04
day      18
language R

-- Dr Peter Dunn | Senior Lecturer in Statistics Faculty of Sciences, University of Southern Queensland Web: http://www.sci.usq.edu.au/staff/dunn Email: dunn at usq.edu.au CRICOS: QLD 00244B | NSW 02225M | VIC 02387D | WA 02521C __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
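The quantitative-covariate part of what Genstat does is easy to sketch by hand with predict(): build a newdata frame with the factor at each of its levels and the numeric covariates fixed at their means. (Made-up data and names; the weighted averaging over *other* factors that Genstat performs is not shown here -- automating that is what the effects package is for.)

```r
# A hand-rolled version of "fitted means" for one factor, with the
# numeric covariate held at its mean (simulated data, for illustration)
set.seed(1)
d <- data.frame(F1 = factor(rep(c("a", "b", "c"), each = 20)),
                X1 = runif(60))
d$y <- as.numeric(d$F1) + 2 * d$X1 + rnorm(60)

fit <- glm(y ~ F1 + X1, data = d)

# One row per factor level, X1 fixed at its sample mean
nd <- data.frame(F1 = levels(d$F1), X1 = mean(d$X1))
cbind(nd, fitted.mean = predict(fit, newdata = nd, type = "response"))
```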
Re: [R] data.frame error using sem package
Dear Suzanne, Take a look at your model specification:

p.ram
     [,1]            [,2]            [,3]
[1,] LM1 -> LSMA     LM1 -> LSMA     NA
[2,] LSMA -> RSMA    LSMA -> RSMA    NA
[3,] RSMA -> RM1     RSMA -> RM1     NA
[4,] LSMA <-> LSMA   LSMA <-> LSMA   NA
[5,] RSMA <-> RSMA   RSMA <-> RSMA   NA
[6,] RM1 <-> RM1     RM1 <-> RM1     NA

This matrix should have three columns, the first giving the path, the second the name of the corresponding parameter, and the third the start value for the parameter (or NA if you want sem() to compute a start value). You've apparently left out the parameter names. Please see the sem examples for details and the paper at http://socserv.socsci.mcmaster.ca/jfox/sem-package.pdf. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Suzanne Witt Sent: Thursday, October 06, 2005 2:55 PM To: r-help@stat.math.ethz.ch Subject: [R] data.frame error using sem package I am trying to use sem to measure effective connectivity among four brain regions. I have pasted the code that I am trying to run since that seems easier than trying to come up with another example. The input data is time series data taken from SPM; they are each 1x121 columns of numbers. I get the error either when I source the whole code, or if I enter it line by line when I go to get the summary. Thanks, Suzanne

library(sem)

# Load the region timecourses.
lsma1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_LSMA.dat")
rsma1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_RSMA.dat")
lmc1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_LM1.dat")
rmc1 <- read.table("/Users/witt/parkinsons/rmrkm010905/R_files/010905_lcomf_RM1.dat")

# Combine all the timecourses from each session into a single data frame and name the columns appropriately. 
lcomf <- cbind(lsma1, rsma1, lmc1, rmc1)
names(lcomf) <- c("LSMA", "RSMA", "LM1", "RM1")

# Type this at the command line to see a summary of your data
str(lcomf)

# Set up the structural equation model.
p.ram <- matrix(c(
    'LM1 -> LSMA', 'LM1 -> LSMA', NA,
    'LSMA -> RSMA', 'LSMA -> RSMA', NA,
    'RSMA -> RM1', 'RSMA -> RM1', NA,
    'LSMA <-> LSMA', 'LSMA <-> LSMA', NA,
    'RSMA <-> RSMA', 'RSMA <-> RSMA', NA,
    'RM1 <-> RM1', 'RM1 <-> RM1', NA),
    ncol = 3, byrow = TRUE)

# Tell which variables are exogenous ('fixed').
p.fixed <- c('LM1')

# Do the fitting for session 1.
C <- cor(lcomf)
nt <- dim(lcomf)[1]
#attach(lcomf)
lcomf.results <- sem(p.ram, C, nt, obs.variables = rownames(C), fixed.x = p.fixed)

# Check out the results using the summary function
summary(lcomf.results)

[[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
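For reference, a ram matrix with all three columns filled in, as John describes, might look like the following sketch (the parameter names 'b1', 'v1', etc. are invented for illustration, and the arrow directions are only a guess at the intended model):

```r
# Three columns: path, parameter name, start value (NA = let sem choose).
# Parameter names ("b1", ..., "v3") are made up for illustration.
p.ram <- matrix(c(
    'LM1 -> LSMA',   'b1', NA,
    'LSMA -> RSMA',  'b2', NA,
    'RSMA -> RM1',   'b3', NA,
    'LSMA <-> LSMA', 'v1', NA,
    'RSMA <-> RSMA', 'v2', NA,
    'RM1 <-> RM1',   'v3', NA),
    ncol = 3, byrow = TRUE)
p.ram
```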
Re: [R] zip package
Dear Sara, It looks to me as if there are three problems here: (1) Is the zip file for the package really at c:/ProgramFiles/R/rw2011/library/lpSolve_1.1.9? That is, isn't there a space in Program Files? (2) You have to specify repos=NULL to install from a local zip file, as ?install.packages tells you. (3) You don't seem to be using the destdir argument correctly; you can omit it. Why not avoid all this and just use the R for Windows menus: Packages -> Install package(s) from local zip files? I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sara Mouro Sent: Friday, October 14, 2005 4:45 AM To: r-help@stat.math.ethz.ch Subject: [R] zip package Dear all I can not understand how to install the package lpsolve_1.1.9.zip I have read the FAQ and the help pages carefully, but it is still not clear for me. I have tried the following (and obtained the respective error messages):

install.packages("c:/ProgramFiles/R/rw2011/library/lpSolve_1.1.9",
                 destdir = "c:/ProgramFiles/R/rw2011/library/lpSolve")
Mensagem de aviso: no package 'c:/Program Files/R/rw2011/library/lpSolve_1.1.9' at the repositories in: download.packages(pkgs, destdir = tmpd, available = available,

install.packages("lpSolve_1.1.9",
                 destdir = "c:/ProgramFiles/R/rw2011/library/lpSolve")
Erro em download.packages(pkgs, destdir = tmpd, available = available, : 'destdir' is not a directory

install.packages("lpSolve_1.1.9", destdir = NULL)
Mensagem de aviso: no package 'lpSolve_1.1.9' at the repositories in: download.packages(pkgs, destdir = tmpd, available = available,

Could you please tell me how to do that, or where can I find a simple example of that kind of installation? Thank you so much and sorry for such a basic question. 
Sara Mouro [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
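The menu route John suggests corresponds to a single call; the key point is repos = NULL, which tells install.packages() that its first argument is a local file rather than a package name to look up in a repository (the path below is illustrative only, and the full file name including the .zip extension is required):

```r
# Install from a local zip file on Windows; repos = NULL means
# "this is a file path, not a repository package name".
# The path is hypothetical -- substitute the real location of the zip.
install.packages("C:/Program Files/R/rw2011/library/lpSolve_1.1.9.zip",
                 repos = NULL)
```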
Re: [R] Test for equality of coefficients in multivariate multiple regression
Dear Berwin, Simply stacking the problems and treating the resulting observations as independent will give you the correct coefficients, but incorrect coefficient variances and artificially zero covariances. The approach that I suggested originally -- testing a linear hypothesis using the coefficient estimates and covariances from the multivariate linear model -- seems simple enough. For example, to test that all three coefficients are the same across the two equations,

b <- as.vector(coef(x.mlm))
V <- vcov(x.mlm)
L <- c(1, 0, 0, -1,  0,  0,
       0, 1, 0,  0, -1,  0,
       0, 0, 1,  0,  0, -1)
L <- matrix(L, nrow = 3, byrow = TRUE)
t(L %*% b) %*% solve(L %*% V %*% t(L)) %*% (L %*% b)

The test statistic is chi-square with 3 df under the null hypothesis. (Note: not checked carefully.) (BTW, it's a bit unclear to me how much of this exchange was on r-help, but I'm copying to r-help since at least one of Ulrich's messages referring to alternative approaches appeared there. I hope that's OK.) Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: Berwin A Turlach [mailto:[EMAIL PROTECTED] On Behalf Of Berwin A Turlach Sent: Tuesday, July 18, 2006 9:28 PM To: Andrew Robinson Cc: Ulrich Keller; John Fox Subject: Re: [R] Test for equality of coefficients in multivariate multiple regression G'day all, AR == Andrew Robinson [EMAIL PROTECTED] writes: AR> I suggest that you try to rewrite the model system into a AR> single mixed-effects model, [...] Why a mixed-effect model, wouldn't a fixed effect be o.k. too? 
Something like:

DF <- data.frame(x1 = rep(c(0,1), each = 50), x2 = rep(c(0,1), 50))
tmp <- rnorm(100)
DF$y1 <- tmp + DF$x1*.5 + DF$x2*.3 + rnorm(100, 0, .5)
DF$y2 <- tmp + DF$x1*.5 + DF$x2*.7 + rnorm(100, 0, .5)
x.mlm <- lm(cbind(y1, y2) ~ x1 + x2, data = DF)
coef(x.mlm)
                     y1          y2
(Intercept) -0.08885266 -0.05749196
x1           0.33749086  0.60395258
x2           0.72017894  1.11932077

DF2 <- with(DF, data.frame(y = c(y1, y2)))
DF2$x11 <- with(DF, c(x1, rep(0,100)))
DF2$x21 <- with(DF, c(x2, rep(0,100)))
DF2$x12 <- with(DF, c(rep(0,100), x1))
DF2$x22 <- with(DF, c(rep(0,100), x2))
DF2$x1 <- with(DF, c(x1, x1))
DF2$wh <- rep(c(0,1), each = 100)

fm1 <- lm(y ~ wh + x11 + x21 + x12 + x22, DF2)
fm1
Call: lm(formula = y ~ wh + x11 + x21 + x12 + x22, data = DF2)
Coefficients:
(Intercept)       wh      x11      x21      x12      x22
   -0.08885  0.03136  0.33749  0.72018  0.60395  1.11932

fm2 <- lm(y ~ wh + x1 + x21 + x22, DF2)
anova(fm2, fm1)
Analysis of Variance Table
Model 1: y ~ wh + x1 + x21 + x22
Model 2: y ~ wh + x11 + x21 + x12 + x22
  Res.Df     RSS Df Sum of Sq      F Pr(>F)
1    195 246.919
2    194 246.031  1     0.888 0.6998 0.4039

Cheers, Berwin == Full address Berwin A Turlach Tel.: +61 (8) 6488 3338 (secr) School of Mathematics and Statistics +61 (8) 6488 3383 (self) The University of Western Australia FAX : +61 (8) 6488 1028 35 Stirling Highway Crawley WA 6009 e-mail: [EMAIL PROTECTED] Australia http://www.maths.uwa.edu.au/~berwin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
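Putting the Wald test together with simulated data of the kind Berwin uses gives a runnable end-to-end sketch (seed and effect sizes invented; note that the middle matrix must be inverted for the statistic to be chi-squared):

```r
set.seed(123)
DF <- data.frame(x1 = rep(c(0, 1), each = 50), x2 = rep(c(0, 1), 50))
tmp <- rnorm(100)
DF$y1 <- tmp + DF$x1 * 0.5 + DF$x2 * 0.3 + rnorm(100, 0, 0.5)
DF$y2 <- tmp + DF$x1 * 0.5 + DF$x2 * 0.7 + rnorm(100, 0, 0.5)

x.mlm <- lm(cbind(y1, y2) ~ x1 + x2, data = DF)

b <- as.vector(coef(x.mlm))  # 6 coefficients, stacked equation by equation
V <- vcov(x.mlm)             # 6 x 6 covariance matrix of the coefficients

# H0: each of the 3 coefficients is equal across the two equations
L <- matrix(c(1, 0, 0, -1,  0,  0,
              0, 1, 0,  0, -1,  0,
              0, 0, 1,  0,  0, -1), nrow = 3, byrow = TRUE)

wald <- as.numeric(t(L %*% b) %*% solve(L %*% V %*% t(L)) %*% (L %*% b))
p <- pchisq(wald, df = 3, lower.tail = FALSE)
c(statistic = wald, p.value = p)
```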
Re: [R] equality constraint
Dear Ana Maria, How about replacing the last of the four parameters with the negative of the sum of the others? I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED] Sent: Friday, July 21, 2006 4:31 PM To: r-help@stat.math.ethz.ch Subject: [R] equality constraint

I would like to optimize a (log-)likelihood function subject to linear equality constraints on the parameters. My function has eight parameters. The first, third, and fourth are greater than zero; the second must be less than zero. The problem is that the last four must sum to zero. Function constrOptim handles only inequality constraints. Is there an alternative way to deal with this problem? Ana Maria
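John's suggestion amounts to optimizing over seven free parameters and reconstructing the eighth. A minimal sketch of that reparameterization (the objective here is an arbitrary stand-in, not Ana Maria's likelihood; the sign constraints on the first two parameters are imposed with method = "L-BFGS-B" box bounds):

```r
# hypothetical objective standing in for the negative log-likelihood
negloglik <- function(theta) {
  sum((theta - c(1, -1, 2, 1, 0.5, -0.25, 0.25, -0.5))^2)
}

# optimize over 7 free parameters; the 8th is minus the sum of
# parameters 5-7, so parameters 5-8 always sum to zero
obj <- function(free) {
  theta <- c(free, -sum(free[5:7]))
  negloglik(theta)
}

fit <- optim(c(1, -1, 1, 1, 0, 0, 0), obj, method = "L-BFGS-B",
             lower = c(1e-8, -Inf, 1e-8, 1e-8, rep(-Inf, 3)),
             upper = c(Inf, -1e-8, rep(Inf, 5)))
theta.hat <- c(fit$par, -sum(fit$par[5:7]))
sum(theta.hat[5:8])  # zero by construction
```

The equality constraint is thus eliminated rather than enforced, which is why constrOptim() is not needed for it.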
[R] [R-pkgs] Version 0.9-4 of sem to CRAN
Dear Kurt, I've uploaded a new version (0.9-4) of the sem package to CRAN. I hope that everything is well with you. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox ___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] Syntax of Levene's test
Dear Paul, The argument y is the response variable and group is a factor defining groups (as ?levene.test says). If you have more than one factor, then you can use interaction() to create from them a factor with levels given by the product set of the levels of the individual factors. Here's an example:

library(car)
data(Moore)
attach(Moore)
levene.test(conformity, interaction(fcategory, partner.status))

Levene's Test for Homogeneity of Variance
      Df F value Pr(>F)
group  5  1.4694 0.2219
      39

levels(interaction(fcategory, partner.status))
[1] "high.high"   "low.high"    "medium.high" "high.low"    "low.low"
[6] "medium.low"

levels(fcategory)
[1] "high"   "low"    "medium"

levels(partner.status)
[1] "high" "low"

I'll add a couple of examples to the help page. I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith Sent: Wednesday, August 02, 2006 5:33 AM To: r-help@stat.math.ethz.ch Subject: [R] Syntax of Levene's test

Dear All, I am trying to use Levene's test (of package car), but I do not understand quite well how to use it. '?levene.test' unfortunately does not provide any example. My data are in a data frame and correspond to 4 factors plus response. Could someone please give me an example of how to use the command levene.test(y, group)? Thanks in advance, Paul
Re: [R] Syntax of Levene's test
Dear Paul, Levene's test tests the null hypothesis that the variances are equal, so a small p-value suggests that they are not. Looking at your output, it seems odd that you have as many as 96 groups. John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Smith Sent: Wednesday, August 02, 2006 5:02 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] Syntax of Levene's test

On 8/2/06, John Fox <[EMAIL PROTECTED]> wrote: [...]

Thanks, John. Now I understand how to use levene.test. There is only one question remaining: does the null hypothesis correspond to homogeneity of variances? I.e., should one conclude that

Levene's Test for Homogeneity of Variance
       Df F value    Pr(>F)
group  95  3.5919 < 2.2e-16 ***
      864

tells us that the hypothesis that the variances are equal is (highly) significant? Paul

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Doubt about Student t distribution simulation
Dear Jose, The problem is that you're using the population standard deviation (sigma) rather than the sample SD of each sample [i.e., t[i] = (mean(amo.i) - mu) / (sd(amo.i) / sqrt(n))], so your values should be normally distributed, as they appear to be. A couple of smaller points: (1) Even after this correction, you're sampling from a discrete population (albeit with replacement), and so the values won't be exactly t-distributed. You could draw the samples directly from N(mu, sigma) instead. (2) It would be preferable to make a quantile-comparison plot against the t-distribution, since you'd get a better picture of what's going on in the tails. I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jose Claudio Faria Sent: Friday, August 04, 2006 3:09 PM To: R-help@stat.math.ethz.ch Subject: [R] Doubt about Student t distribution simulation

Dear R list, I would like to illustrate the origin of the Student t distribution using R. So, if (sample.mean - pop.mean) / standard.error(sample.mean) has a t distribution with (sample.size - 1) degrees of freedom, what is wrong with the simulation below?
I think that the theoretical curve should agree with the relative frequencies of the t values calculated:

#== begin options ==============
# parameters
mu    = 10
sigma = 5

# size of sample
n = 3

# repetitions
nsim = 1

# histogram parameter
nchist = 150
#== end options ================

t   = numeric()
pop = rnorm(1, mean = mu, sd = sigma)

for (i in 1:nsim) {
  amo.i = sample(pop, n, replace = TRUE)
  t[i]  = (mean(amo.i) - mu) / (sigma / sqrt(n))
}

win.graph(w = 5, h = 7)
split.screen(c(2,1))

screen(1)
hist(t, main = "histogram", breaks = nchist, col = "lightgray",
     xlab = '', ylab = "Fi", font.lab = 2, font = 2)

screen(2)
hist(t, probability = T, main = 'f.d.p and histogram',
     breaks = nchist, col = 'lightgray',
     xlab = 't', ylab = 'f(t)', font.lab = 2, font = 2)

x = t
curve(dt(x, df = n-1), add = T, col = "red", lwd = 2)

Many thanks for any help, ___ Jose Claudio Faria Brasil/Bahia/Ilheus/UESC/DCET Estatística Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
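Putting John's two corrections together, here is a sketch of the repaired simulation: each t statistic uses its own sample's SD, the samples are drawn directly from N(mu, sigma), and a quantile-comparison plot replaces the histogram (nsim = 10000 is chosen here purely for illustration):

```r
set.seed(123)
mu <- 10; sigma <- 5; n <- 3; nsim <- 10000

t.stat <- replicate(nsim, {
  amo <- rnorm(n, mean = mu, sd = sigma)    # draw directly from N(mu, sigma)
  (mean(amo) - mu) / (sd(amo) / sqrt(n))    # sample SD, not sigma
})

# quantile-comparison plot against t(n - 1), per John's point (2)
qqplot(qt(ppoints(nsim), df = n - 1), t.stat,
       xlab = "Theoretical t quantiles", ylab = "Simulated t statistics")
abline(0, 1, col = "red")
```

The points should now follow the reference line in the middle of the distribution, with the usual scatter in the extreme tails.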
Re: [R] Doubt about Student t distribution simulation
Dear Sundar, Try qq.plot(t, dist="t", df=n-1) from the car package, which includes a 95-percent pointwise confidence envelope that helps you judge how extreme the outliers are relative to expectations. Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: Sundar Dorai-Raj [mailto:[EMAIL PROTECTED] Sent: Friday, August 04, 2006 3:27 PM To: John Fox Cc: [EMAIL PROTECTED]; R-help@stat.math.ethz.ch Subject: Re: [R] Doubt about Student t distribution simulation

Hi, Jose/John, Here's an example to help Jose that highlights John's advice. It also includes set.seed, which should be included in all simulations posted to R-help.

set.seed(42)
mu <- 10
sigma <- 5
n <- 3
nsim <- 1
m <- matrix(rnorm(n * nsim, mu, sigma), nsim, n)
t <- apply(m, 1, function(x) (mean(x) - mu) / (sd(x) / sqrt(n)))
library(lattice)
qqmath(t, distribution = function(x) qt(x, n - 1),
       panel = function(x, ...) {
         panel.qqmath(x, col = "darkblue", ...)
         panel.qqmathline(x, col = "darkred", ...)
       })

With n = 3, expect a few outliers. --sundar

John Fox wrote: Dear Jose, The problem is that you're using the population standard deviation (sigma) rather than the sample SD of each sample [i.e., t[i] = (mean(amo.i) - mu) / (sd(amo.i) / sqrt(n))], so your values should be normally distributed, as they appear to be. A couple of smaller points: (1) Even after this correction, you're sampling from a discrete population (albeit with replacement), and so the values won't be exactly t-distributed. You could draw the samples directly from N(mu, sigma) instead. (2) It would be preferable to make a quantile-comparison plot against the t-distribution, since you'd get a better picture of what's going on in the tails.
Re: [R] polychoric correlation error
Dear Janet, Because you didn't set the value of the random-number-generator seed, your example isn't precisely reproducible, but the problem is apparent anyway:

set.seed(12345)
n <- 100
test.x <- rnorm(n, mean=0, sd=1)
test.c <- test.x + rnorm(n, mean=0, sd=.5)
thresh.x <- c(-2.5, -1, -.5, .5, 1000)
thresh.c <- c(-1, 1, 2, 3, 1000)
discrete.x <- discrete.c <- vector(length=n)
for (i in 1:n) {
  discrete.x[i] <- which.min(thresh.x < test.x[i])
  discrete.c[i] <- which.min(thresh.c < test.c[i])
}

table(discrete.x, discrete.c)
          discrete.c
discrete.x  1  2  3  4  5
         2 12  1  0  0  0
         3  3 12  0  0  0
         4  2 19  2  0  0
         5  0 18 21  9  1

cor(test.x, test.c)
[1] 0.9184189

pc <- polychor(discrete.x, discrete.c, std.err=T, ML=T)
Warning messages:
1: NaNs produced in: log(x)
2: NaNs produced in: log(x)
3: NaNs produced in: log(x)

pc

Polychoric Correlation, ML est. = 0.9077 (0.03314)
Test of bivariate normality: Chisquare = 3.103, df = 11, p = 0.9893

  Row Thresholds
  Threshold Std.Err.
1  -1.12200   0.1609
2  -0.56350   0.1309
3   0.03318   0.1235

  Column Thresholds
  Threshold Std.Err.
1   -0.9389   0.1489
2    0.4397   0.1292
3    1.2790   0.1707
4    2.3200   0.3715

The variables that you've created are indeed bivariate normal, but they are highly correlated, and your choice of cut points makes it hard to estimate the correlation from the contingency table, apparently producing some difficulty in the maximization of the likelihood. Nevertheless, the ML estimates of the correlation and thresholds for the set of data above are pretty good. (In your case, the optimization failed.)
BTW, a more straightforward way to create the categorical variables would be

discrete.x <- cut(test.x, c(-Inf, -2.5, -1, -.5, .5, Inf))
discrete.c <- cut(test.c, c(-Inf, -1, 1, 2, 3, Inf))

I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rosenbaum, Janet Sent: Friday, August 04, 2006 5:49 PM To: r-help@stat.math.ethz.ch Subject: [R] polychoric correlation error

Dear all, I get a strange error when I find polychoric correlations with the ML method, which I have been able to reproduce using randomly generated data. What is wrong? I realize that the data I generated randomly are a bit strange, but it is the only way I could duplicate the error message.

n <- 100
test.x <- rnorm(n, mean=0, sd=1)
test.c <- test.x + rnorm(n, mean=0, sd=.5)
thresh.x <- c(-2.5, -1, -.5, .5, 1000)
thresh.c <- c(-1, 1, 2, 3, 1000)
discrete.x <- discrete.c <- vector(length=n)
for (i in 1:n) {
  discrete.x[i] <- which.min(thresh.x < test.x[i])
  discrete.c[i] <- which.min(thresh.c < test.c[i])
}
pc <- polychor(discrete.x, discrete.c, std.err=T, ML=T)

Error in optim(c(optimise(f, interval = c(-1, 1))$minimum, rc, cc), f, :
  non-finite finite-difference value [1]
In addition: There were 50 or more warnings (use warnings() to see the first 50)

print(pc)
Error in print(pc) : object "pc" not found

warnings()
Warning messages:
1: NaNs produced in: log(x)
2: NA/Inf replaced by maximum positive value
3: NaNs produced in: log(x)

Thanks, Janet
Re: [R] Specifying Path Model in SEM for CFA
Dear Rick, There are a couple of problems here: (1) You've fixed the error-variance parameters for each of the observed variables to 1 rather than defining each as a free parameter to estimate. For example, use

X1 <-> X1, theta1, NA

rather than

X1 <-> X1, NA, 1

The general principle is that if you give a parameter a name, it's a free parameter to be estimated; if you give the name as NA, then the parameter is given a fixed value (here, 1). (There is some more information on this and on error-variance parameters in ?sem.) (2) I believe that the model you're trying to specify -- in which all variables but X6 load on F1, and all variables but X1 load on F2 -- is underidentified. In addition, you've set the metric of the factors by fixing one loading to 0.20 and another to 0.25. That should work but strikes me as unusual, and makes me wonder whether this was what you really intended. It would be more common in a CFA to fix the variance of each factor to 1 and let the factor loadings be free parameters. Then the factor covariance would be their correlation. You should not have to specify start values for free parameters (such as g11, g22, and g12 in your model), though it is not wrong to do so. I would not, however, specify start values that imply a singular covariance matrix among the factors, as you've done; I'm surprised that the program was able to get past the start values to produce a solution. BTW, the Thurstone example in ?sem is for a confirmatory factor analysis (albeit a slightly more complicated one with a second-order factor). There's also an example of a one-factor CFA in the paper at http://socserv.socsci.mcmaster.ca/jfox/Misc/sem/SEM-paper.pdf, though this is for ordinal observed variables.
I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rick Bilonick Sent: Tuesday, August 15, 2006 11:50 PM To: R Help Subject: [R] Specifying Path Model in SEM for CFA

I'm using specify.model for the sem package. I can't figure out how to represent the residual errors for the observed variables for a CFA model. (Once I get this working I need to add some further constraints.) Here is what I've tried:

model.sa <- specify.model()
F1 ->  X1, l11, NA
F1 ->  X2, l21, NA
F1 ->  X3, l31, NA
F1 ->  X4, l41, NA
F1 ->  X5, NA,  0.20
F2 ->  X1, l12, NA
F2 ->  X2, l22, NA
F2 ->  X3, l32, NA
F2 ->  X4, l42, NA
F2 ->  X6, NA,  0.25
F1 <-> F2, g12, 1
F1 <-> F1, g11, 1
F2 <-> F2, g22, 1
X1 <-> X1, NA,  1
X2 <-> X2, NA,  1
X3 <-> X3, NA,  1
X4 <-> X4, NA,  1
X5 <-> X5, NA,  1
X6 <-> X6, NA,  1

This at least converges:

summary(fit.sem)

Model Chisquare = 2147   Df = 10   Pr(>Chisq) = 0
Chisquare (null model) = 2934   Df = 15
Goodness-of-fit index = 0.4822
Adjusted goodness-of-fit index = -0.087387
RMSEA index = 0.66107   90% CI: (NA, NA)
Bentler-Bonnett NFI = 0.26823
Tucker-Lewis NNFI = -0.098156
Bentler CFI = 0.26790
BIC = 2085.1

Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 -5.990  -0.618   0.192   0.165   1.700   3.950

Parameter Estimates
    Estimate  Std Error z value  Pr(>|z|)
l11 -0.245981 0.21863   -1.12510 0.26054748 X1 <--- F1
l21 -0.308249 0.22573   -1.36555 0.17207875 X2 <--- F1
l31  0.202590 0.07910    2.56118 0.01043175 X3 <--- F1
l41 -0.235156 0.21980   -1.06985 0.28468885 X4 <--- F1
l12  0.839985 0.21962    3.82476 0.00013090 X1 <--- F2
l22  0.828460 0.22548    3.67418 0.00023862 X2 <--- F2
l32  0.066722 0.08369    0.79725 0.42530606 X3 <--- F2
l42  0.832037 0.21840    3.80963 0.00013917 X4 <--- F2
g12  0.936719 0.64331    1.45609 0.14536647 F2 <--> F1
g11  2.567669 1.25608    2.04418 0.04093528 F1 <--> F1
g22  1.208497 0.55040    2.19567 0.02811527 F2 <--> F2

Iterations = 59

And it produces the following path diagram:

path.diagram(fit.sem)

digraph "fit.sem" {
  rankdir=LR;
  size="8,8";
  node [fontname="Helvetica" fontsize=14 shape=box];
  edge [fontname="Helvetica" fontsize=10];
  center=1;
  F2 [shape=ellipse]
  F1 [shape=ellipse]
  F1 -> X1 [label="l11"];
  F1 -> X2 [label="l21"];
  F1 -> X3 [label="l31"];
  F1 -> X4 [label="l41"];
  F1 -> X5 [label=""];
  F2 -> X1 [label="l12"];
  F2 -> X2 [label="l22"];
  F2 -> X3 [label="l32"];
  F2 -> X4 [label="l42"];
  F2 -> X6 [label=""];
}

But I don't see the residual error terms that go into each of the observed variables X1 - X6. I've tried:

model.sa <- specify.model()
E1 ->  X1, e1, 1
E2 ->  X2, e2, 1
E3 ->  X3, e3, 1
E4 ->  X4, e4, 1
E5 ->  X5, e5, 1
E6 ->  X6, e6, 1
E1 <-> E1, s1, NA
E2 <-> E2, s2, NA
E3 <-> E3, s3, NA
E4 <-> E4, s4, NA
E5 <-> E5, s5, NA
E6 <-> E6, s6, NA
F1 ->  X1, l11, NA
F1 ->  X2, l21, NA
F1 ->  X3, l31, NA
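Putting John's points together, here is a sketch of what an identified two-factor CFA specification might look like in sem's specify.model() syntax (entered at the console, terminated by a blank line): each variable loads on a single factor, factor variances are fixed to 1, and error variances are free. The particular zero-loading pattern is an illustrative assumption, not Rick's intended model.

```r
library(sem)
model.cfa <- specify.model()
F1 ->  X1, l11, NA
F1 ->  X2, l21, NA
F1 ->  X3, l31, NA
F2 ->  X4, l42, NA
F2 ->  X5, l52, NA
F2 ->  X6, l62, NA
F1 <-> F1, NA,  1     # factor variances fixed to 1 sets the metric
F2 <-> F2, NA,  1
F1 <-> F2, r12, NA    # free covariance = factor correlation
X1 <-> X1, e1,  NA    # free error variances (named parameters)
X2 <-> X2, e2,  NA
X3 <-> X3, e3,  NA
X4 <-> X4, e4,  NA
X5 <-> X5, e5,  NA
X6 <-> X6, e6,  NA
```

The double-headed `<->` arrows are variances/covariances; naming a parameter (e1, r12, ...) makes it free, while NA in the name field fixes it to the given value.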
Re: [R] Specifying Path Model in SEM for CFA
Dear Rick, It's unclear to me what you mean by constraining each column of the factor matrix to sum to one. If you intend to constrain the loadings on each factor to sum to one, sem() won't do that, since it supports only equality constraints, not general linear constraints on the parameters of the model; but why such a constraint would be reasonable in the first place escapes me. More common in confirmatory factor analysis would be to constrain more of the loadings to zero. Of course, one would do this only if it made substantive sense in the context of the research. Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message----- From: Rick Bilonick [mailto:[EMAIL PROTECTED] Sent: Wednesday, August 16, 2006 12:07 PM To: John Fox Cc: 'R Help' Subject: Re: [R] Specifying Path Model in SEM for CFA

On Wed, 2006-08-16 at 08:47 -0400, John Fox wrote: [...]
[...]

Thanks for the information. I think I understand how to handle the residual variance after reading the sem help file more carefully. Now I have to figure out how to constrain each column of the factor matrix to sum to one. Maybe this will fix the problem with being under-identified. Rick B.
Re: [R] tkinser
Dear Julie,

-----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of julie7.josse Sent: Thursday, August 17, 2006 4:01 AM To: r-help Subject: [R] tkinser Importance: High

Dear list, I'd like to know if it is possible to clear my text window after running it. I have added a Menu to my text window, so I can, for example, open a script; then I run it and get the result on the R console. But I'd like the code to disappear automatically after running -- something that cleans my text window.

I think that what you mean is that you'd like to delete the text in the window rather than the window itself. If the text widget is called txt, then you can do

tkdelete(txt, "0.0", "end")

I have a second problem: my text window and all my widgets are not fixed. I use combo boxes, message boxes... and they always move; they appear on the right of my screen, on the bottom, or on the task bar. I'd like my text window not to move at all and, if possible, for my widgets to appear in the same place on my screen. When they appear at the bottom on my task bar, I have to open them each time...

If the top-level Tk window is named tt, then, after creating the window, e.g.,

tkwm.geometry(tt, "-100+100")

will position it 100 pixels by 100 pixels from the top-right corner of the display. More generally, I found it useful to read Welch, Jones, and Hobbs, Practical Programming in Tcl and Tk, to learn these kinds of things. I hope this helps, John

Thanks a lot. Julie.
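A minimal self-contained sketch combining both answers (assuming the standard tcltk package; the widget names and the "Run & clear" button are illustrative, not Julie's actual GUI):

```r
library(tcltk)

tt <- tktoplevel()
tkwm.geometry(tt, "-100+100")   # pin the window 100px from the top-right corner
txt <- tktext(tt, height = 10)
tkpack(txt)

run.and.clear <- function() {
  code <- tclvalue(tkget(txt, "0.0", "end"))    # grab the widget's contents
  eval(parse(text = code), envir = .GlobalEnv)  # run them in the workspace
  tkdelete(txt, "0.0", "end")                   # then clear the text widget
}
tkpack(tkbutton(tt, text = "Run & clear", command = run.and.clear))
```

Because the geometry is set right after the window is created, it should appear in the same place each time rather than wherever the window manager chooses.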
Re: [R] Specifying Path Model in SEM for CFA
Dear Rick, -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rick Bilonick Sent: Thursday, August 17, 2006 7:07 AM To: John Fox Cc: 'R Help'; 'Rick Bilonick' Subject: Re: [R] Specifying Path Model in SEM for CFA . . . I'm trying to build a multivariate receptor model as described by Christensen and Sain (Technometrics, vol 44 (4) pp. 328-337). The model is x_t = Af_t + e_t where A is the matrix of nonnegative source compositions, x_t are the observed pollutant concentrations at time t, and f_t are the unobserved factors. The columns of A are supposed to sum to no more than 100%. They say they are using a latent variable model. If sem can't handle this, do you know of another R package that could? sem() handles only equality constraints among parameters, and this model requires linear inequality constraints. I'm aware of SEM software that handles inequality constraints, but I'm not aware of anything in R that will do it out of the box. One possibility is to write out the likelihood (or fitting function) for your model and perform a bounded optimization using optim(). It would probably be a fair amount of work setting up the problem. Finally, there are tricks that permit the imposition of general constraints and inequality constraints using software, like sem(), that handles only equality constraints. It's probably possible to do what you want using such a trick, but it would be awkward. See the references given in Bollen, Structural Equations with Latent Variables (Wiley, 1989), pp. 401-403. I'm sorry that I can't be of more direct help. John
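A rough sketch of the optim() route John describes, on a toy receptor problem. Everything here is an illustrative assumption, not code from sem or from Christensen and Sain: the fitting function is crude least squares on the sample covariance, nonnegativity of A comes from L-BFGS-B box bounds, the column-sum constraint is imposed by a penalty, and rotational identifiability is ignored entirely.

```r
set.seed(1)
p <- 4; k <- 2; nT <- 200
A.true <- matrix(c(.5, .3, .1, .1,
                   .1, .1, .4, .3), p, k)       # column sums 1.0 and 0.9
f <- matrix(rexp(nT * k), nT, k)                # nonnegative "source" factors
x <- f %*% t(A.true) + matrix(rnorm(nT * p, 0, .05), nT, p)

S <- cov(x)
fitfn <- function(a) {
  A <- matrix(a, p, k)
  # implied covariance, assuming uncorrelated unit-variance factors
  Sigma <- A %*% t(A) + diag(.05^2, p)
  # penalty pushes each column sum of A down to at most 1
  sum((S - Sigma)^2) + 100 * sum(pmax(colSums(A) - 1, 0)^2)
}
fit <- optim(rep(.25, p * k), fitfn, method = "L-BFGS-B",
             lower = 0, upper = 1)              # A >= 0 elementwise
matrix(fit$par, p, k)                           # estimated composition matrix
```

A real application would need a proper likelihood, identifying restrictions on A, and standard errors, which is the "fair amount of work" John mentions.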
Re: [R] Total (un)standardized effects in SEM?
0 0 0
             Alienation67 Alienation71
Anomia67        1.0000000     0.000000
Powerless67     1.0265597     0.000000
Anomia71        0.0000000     1.000000
Powerless71     0.0000000     0.970892
Education       0.0000000     0.000000
SEI             0.0000000     0.000000
Alienation67    0.0000000     0.000000
Alienation71    0.6173153     0.000000

(C <- A[endog, exog, drop=FALSE])   # direct effects, exogenous -> endogenous
                    SES
Anomia67      0.0000000
Powerless67   0.0000000
Anomia71      0.0000000
Powerless71   0.0000000
Education     1.0000000
SEI           0.5163168
Alienation67 -0.5498096
Alienation71 -0.2115088

I <- diag(nrow(B))
IBinv <- solve(I - B)

(Ty <- IBinv - I)   # total effects, endogenous -> endogenous
             Anomia67 Powerless67 Anomia71 Powerless71 Education SEI
Anomia67            0           0        0           0         0   0
Powerless67         0           0        0           0         0   0
Anomia71            0           0        0           0         0   0
Powerless71         0           0        0           0         0   0
Education           0           0        0           0         0   0
SEI                 0           0        0           0         0   0
Alienation67        0           0        0           0         0   0
Alienation71        0           0        0           0         0   0
             Alienation67 Alienation71
Anomia67        1.0000000     0.000000
Powerless67     1.0265597     0.000000
Anomia71        0.6173153     1.000000
Powerless71     0.5993465     0.970892
Education       0.0000000     0.000000
SEI             0.0000000     0.000000
Alienation67    0.0000000     0.000000
Alienation71    0.6173153     0.000000

(Tx <- IBinv %*% C)   # total effects, exogenous -> endogenous
                    SES
Anomia67     -0.5498096
Powerless67  -0.5644124
Anomia71     -0.5509147
Powerless71  -0.5348786
Education     1.0000000
SEI           0.5163168
Alienation67 -0.5498096
Alienation71 -0.5509147

Ty - B   # indirect effects, endogenous -> endogenous
             Anomia67 Powerless67 Anomia71 Powerless71 Education SEI
Anomia67            0           0        0           0         0   0
Powerless67         0           0        0           0         0   0
Anomia71            0           0        0           0         0   0
Powerless71         0           0        0           0         0   0
Education           0           0        0           0         0   0
SEI                 0           0        0           0         0   0
Alienation67        0           0        0           0         0   0
Alienation71        0           0        0           0         0   0
             Alienation67 Alienation71
Anomia67        0.0000000            0
Powerless67     0.0000000            0
Anomia71        0.6173153            0
Powerless71     0.5993465            0
Education       0.0000000            0
SEI             0.0000000            0
Alienation67    0.0000000            0
Alienation71    0.0000000            0

Tx - C   # indirect effects, exogenous -> endogenous
                    SES
Anomia67     -0.5498096
Powerless67  -0.5644124
Anomia71     -0.5509147
Powerless71  -0.5348786
Education     0.0000000
SEI           0.0000000
Alienation67  0.0000000
Alienation71 -0.3394059

These results agree with those in the LISREL manual (and for another example there as well), but I haven't checked the method carefully.
It would, of course, be simple to encapsulate the steps above in a function, but here's a caveat: The idea of indirect and total effects makes sense to me for a recursive model, and for the exogenous variables in a nonrecursive model, where they are the reduced-form coefficients (supposing, of course, that the model makes sense in the first place, which is often problematic), but not for the endogenous variables in a nonrecursive model. That is why I haven't put such a function in the sem package; perhaps I should reconsider. Having said that, I'm ashamed to add that I believe that I was the person who suggested the definition of total and indirect effects currently used for these models. Finally, you can get standardized effects similarly by using standardized structural coefficients. In the sem package, these are computed and printed by standardized.coefficients(). This function doesn't return the standardized A matrix in a usable form, but could be made to do so. Regards, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox
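For reference, the steps above can be wrapped up as John suggests (a sketch only, subject to his caveat about nonrecursive models; A is the matrix of structural coefficients and endog/exog index the endogenous and exogenous variables, as in the example):

```r
# direct, total, and indirect effects from a structural-coefficient matrix A,
# following the computation in the message above
sem.effects <- function(A, endog, exog) {
  B <- A[endog, endog]                 # direct effects, endogenous -> endogenous
  C <- A[endog, exog, drop = FALSE]    # direct effects, exogenous -> endogenous
  IBinv <- solve(diag(nrow(B)) - B)
  Ty <- IBinv - diag(nrow(B))          # total effects, endogenous -> endogenous
  Tx <- IBinv %*% C                    # total effects, exogenous -> endogenous
  list(direct.endog = B,        direct.exog = C,
       total.endog  = Ty,       total.exog  = Tx,
       indirect.endog = Ty - B, indirect.exog = Tx - C)
}
```

Indirect effects fall out as total minus direct, exactly as in the `Ty - B` and `Tx - C` lines above.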
Re: [R] polychor error
Dear Janet, Performing a traceback after the error gives a hint:

tmp.pcc <- polychor(tmp.mat, ML=T, std.err=T)
traceback()
8: stop("at least one element of ", sQuote("lower"), " is larger than ",
       sQuote("upper"))
7: checkmvArgs(lower = lower, upper = upper, mean = mean, corr = corr,
       sigma = sigma)
6: pmvnorm(lower = c(row.cuts[i], col.cuts[j]),
       upper = c(row.cuts[i + 1], col.cuts[j + 1]), corr = R)
5: binBvn(rho, row.cuts, col.cuts)
4: fn(par, ...)
3: function (par) fn(par, ...)(c(0.422816748044139, -2.20343446037221,
       -2.2163627792244, -1.79075055316993, -1.11077161663679,
       -0.323037731981826, 0.664036943094355, -2.71305188847272,
       -1.58338678422633, -0.534011853182102, 0.65365619155084))
2: optim(c(optimise(f, interval = c(-1, 1))$minimum, rc, cc), f,
       control = control, hessian = std.err)
1: polychor(tmp.mat, ML = T, std.err = T)

The parameters are in the order correlation, row thresholds (of which there are 6), column thresholds (of which there are 4), and at this point in the maximization of the likelihood the first row threshold (-2.20343446037221) is *above* the second threshold (-2.2163627792244), causing pmvnorm() to complain. This can happen when the problem is ill-conditioned, since optim() doesn't constrain the order of the thresholds. So, why is the problem ill-conditioned? Here is your contingency table:

tmp.mat
   3  4  5  6  7
1  0  2  0  0  0
2  0  0  1  1  0
3  0  0  5  1  1
4  0  5 10 12  2
5  0  5 27 29 11
6  1  3 20 57 31
7  0  1  9 34 32

There are only two observations in the first row, two in the second row, and one in the first column. You're expecting a lot out of ML to get estimates of the first couple of thresholds for rows and the first for columns. One approach would be to eliminate one or more sparse rows or columns; e.g.,

polychor(tmp.mat[-1,], ML = TRUE, std.err = TRUE)

Polychoric Correlation, ML est. = 0.3932 (0.05719)
Test of bivariate normality: Chisquare = 14.21, df = 19, p = 0.7712

  Row Thresholds
  Threshold Std.Err.
1 -2.4410 0.24270 2 -1.8730 0.14280 3 -1.1440 0.09236 4 -0.3353 0.07408 50.6636 0.07857 Column Thresholds Threshold Std.Err. 1 -2.6500 0.30910 2 -1.6420 0.12100 3 -0.5494 0.07672 40.6508 0.07833 polychor(tmp.mat[,-1], ML = TRUE, std.err = TRUE) Polychoric Correlation, ML est. = 0.4364 (0.05504) Test of bivariate normality: Chisquare = 14.85, df = 17, p = 0.6062 Row Thresholds Threshold Std.Err. 1 -2.4940 0.26020 2 -2.2080 0.19160 3 -1.7850 0.13400 4 -1.1090 0.09113 5 -0.3154 0.07371 60.6625 0.07821 Column Thresholds Threshold Std.Err. 1 -1.6160 0.11970 2 -0.5341 0.07639 30.6507 0.07800 A more defensible alternative would be to collapse sparse rows or columns. BTW, you *can* get estimated standard-errors from polychor() for your original table for the two-step estimator: polychor(tmp.mat, ML = FALSE, std.err = TRUE) Polychoric Correlation, 2-step est. = 0.4228 (0.05298) Test of bivariate normality: Chisquare = 19.22, df = 23, p = 0.6883 That's because the two-step estimator estimates the thresholds from the marginal distributions of the variables rather than from their joint distribution. So, is this a bug or a feature? I suppose that it's a bug to allow the thresholds to get out of order, though to constrain the optimization to prevent that from happening is probably not worth the effort and could cause some strange results. On the other hand, the error tells you something about the data, so maybe it's a feature. I noticed that you posted another version of this question two days before this one. I apologize for the slow response -- I wasn't able to read my email for a few days and it's taken me most of today to catch up. Regards, John original message --- Janet Rosenbaum jrosenba at rand.org Thu Aug 24 00:41:16 CEST 2006 Hi. Does anyone know whether the following error is a result of a bug or a feature? I can eliminate the error by making ML=F, but I would like to see the values of the cut-points and their variance. Is there anything that I can do? 
tmp.vec-c(0, 0, 0 , 0 ,0 , 1, 0, 2, 0 , 0, 5 ,5 ,3 ,1, 0 , 1, 5, 10, 27, 20, 9, 0, 1, 1, 12, 29, 57, 34, 0, 0, 1, 2, 11, 31, 32) tmp.mat-matrix(tmp.vec, nrow=7) rownames(tmp.mat)-1:7 colnames(tmp.mat)-3:7 tmp.pcc-polychor(tmp.mat, ML=T, std.err=T) Error in checkmvArgs(lower = lower, upper = upper, mean = mean, corr = corr, : at least one element of 'lower' is larger than 'upper' Thanks, Janet John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
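John's more defensible alternative -- collapsing sparse categories rather than dropping them -- can be sketched as follows. This is an illustrative example, not code from the thread: the choice of which categories to merge, the object names, and the use of the polycor package (where polychor() lives) are assumptions.

```r
library(polycor)  # assumed home of polychor()

# Rebuild Janet's 7 x 5 table
tmp.vec <- c(0, 0, 0, 0, 0, 1, 0,
             2, 0, 0, 5, 5, 3, 1,
             0, 1, 5, 10, 27, 20, 9,
             0, 1, 1, 12, 29, 57, 34,
             0, 0, 1, 2, 11, 31, 32)
tmp.mat <- matrix(tmp.vec, nrow=7,
                  dimnames=list(1:7, 3:7))

# Merge the two sparse low rows (1 and 2) into one category, and the
# sparse first column ("3") into its neighbour; the thresholds now
# refer to the coarser categories, but no observations are discarded.
collapsed <- rbind("1-2" = colSums(tmp.mat[1:2, ]), tmp.mat[-(1:2), ])
collapsed <- cbind("3-4" = rowSums(collapsed[, 1:2]), collapsed[, -(1:2)])

polychor(collapsed, ML=TRUE, std.err=TRUE)
```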
Re: [R] Type II and III sum of square in Anova (R, car package)
Dear Amasco, A complete explanation of the issues that you raise is awkward in an email, so I'll address your questions briefly. Section 8.2 of my text, Applied Regression Analysis, Linear Models, and Related Methods (Sage, 1997), has a detailed discussion. (1) In balanced designs, so-called Type I, II, and III sums of squares are identical. If the STATA manual says that Type II tests are only appropriate in balanced designs, then that doesn't make a whole lot of sense (unless one believes that Type-II tests are nonsense, which is not the case). (2) One should concentrate not directly on different types of sums of squares, but on the hypotheses to be tested. Sums of squares and F-tests should follow from the hypotheses. Type-II and Type-III tests (if the latter are properly formulated) test hypotheses that are reasonably construed as tests of main effects and interactions in unbalanced designs. In unbalanced designs, Type-I sums of squares usually test hypotheses of interest only by accident. (3) Type-II sums of squares are constructed obeying the principle of marginality, so the kinds of contrasts employed to represent factors are irrelevant to the sums of squares produced. You get the same answer for any full set of contrasts for each factor. In general, the hypotheses tested assume that terms to which a particular term is marginal are zero. So, for example, in a three-way ANOVA with factors A, B, and C, the Type-II test for the AB interaction assumes that the ABC interaction is absent, and the test for the A main effect assumes that the ABC, AB, and AC interactions are absent (but not necessarily the BC interaction, since the A main effect is not marginal to this term). A general justification is that we're usually not interested, e.g., in a main effect that's marginal to a nonzero interaction. (4) Type-III tests do not assume that terms higher-order to the term in question are zero.
For example, in a two-way design with factors A and B, the type-III test for the A main effect tests whether the population marginal means at the levels of A (i.e., averaged across the levels of B) are the same. One can test this hypothesis whether or not A and B interact, since the marginal means can be formed whether or not the profiles of means for A within levels of B are parallel. Whether the hypothesis is of interest in the presence of interaction is another matter, however. To compute Type-III tests using incremental F-tests, one needs contrasts that are orthogonal in the row-basis of the model matrix. In R, this means, e.g., using contr.sum, contr.helmert, or contr.poly (all of which will give you the same SS), but not contr.treatment. Failing to be careful here will result in testing hypotheses that are not reasonably construed, e.g., as hypotheses concerning main effects. (5) The same considerations apply to linear models that include quantitative predictors -- e.g., ANCOVA. Most software will not automatically produce sensible Type-III tests, however.

I hope this helps, John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Amasco Miralisus Sent: Saturday, August 26, 2006 5:07 PM To: r-help@stat.math.ethz.ch Subject: [R] Type II and III sum of square in Anova (R, car package)

Hello everybody, I have some questions on ANOVA in general and on ANOVA in R particularly. I am not a statistician, so I would appreciate it if you could answer in a simple way. 1. First of all, a more general question. The standard anova() function for lm() or aov() models in R implements Type I (sequential) sums of squares, which are not well suited for unbalanced ANOVA. Therefore it is better to use the Anova() function from the car package, which was programmed by John Fox to use Type II and Type III sums of squares.
Did I get the point? 2. Now a more specific question. Type II sums of squares are not well suited for unbalanced ANOVA designs either (as stated in the STATISTICA help), so is the general rule of thumb to use the Anova() function with Type II SS only for balanced ANOVA and with Type III SS for unbalanced ANOVA? Is this a correct interpretation? 3. I have found a post from John Fox in which he wrote that Type III SS could be misleading if someone uses certain contrasts. What is this about? Could you please advise when it is appropriate to use Type II and when Type III SS? I do not use contrasts for comparisons, just general ANOVA with subsequent Tukey post-hoc comparisons. Thank you in advance, Amasco
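The workflow John describes can be sketched as follows. This is a hedged illustration, not code from the thread: the data frame `mydata`, the factors `A` and `B`, and the response name are assumptions; only the car package's Anova() and its `type` argument are taken from the discussion.

```r
library(car)  # provides Anova()

# Type-II tests: by the marginality principle, the contrast coding is
# irrelevant, so the default contr.treatment is fine.
mod2 <- lm(response ~ A * B, data=mydata)
Anova(mod2, type="II")

# Type-III tests: the contrasts must be orthogonal in the row basis of
# the model matrix -- e.g., contr.sum (or contr.helmert, contr.poly).
# With contr.treatment the "main effect" hypotheses would be wrong.
mod3 <- lm(response ~ A * B, data=mydata,
           contrasts=list(A=contr.sum, B=contr.sum))
Anova(mod3, type="III")
```

With balanced data the two calls give identical sums of squares; they differ only when the design is unbalanced, which is exactly the case the thread is about.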
Re: [R] Total (un)standardized effects in SEM?
Dear Rense, I already wrote a function, appended below, to compute effects for SEMs, as a method for the effects() generic. I've been thinking about adding this to the sem package, possibly with asymptotic standard errors computed by the delta method, as suggested by Michael Sobel. If I do so, I can also incorporate standardized effects. As I mentioned, I have serious doubts about the utility of these ideas, and worry about encouraging them, but I guess that I could say the same thing for SEMs in general. My original purpose in writing the sem package was to have something in R that I use in teaching.

Regards, John

--- snip ---

effects.sem <- function(object, ...){
    A <- object$A
    exog <- apply(A, 1, function(x) all(x == 0))
    endog <- !exog
    B <- A[endog, endog, drop=FALSE]
    C <- A[endog, exog, drop=FALSE]
    I <- diag(nrow(B))
    IBinv <- solve(I - B)
    TotEndog <- IBinv - I
    TotExog <- IBinv %*% C
    result <- list(B=B, C=C, TotEndog=TotEndog, TotExog=TotExog)
    class(result) <- "semEffect"
    result
}

print.semEffect <- function(x, digits=5, ...){
    B <- x$B
    C <- x$C
    cat("\nDirect Effects of Exogenous on Endogenous Variables:\n")
    print(C, digits=digits)
    cat("\n\nDirect Effects of Endogenous Variables on Each Other:\n")
    print(B, digits=digits)
    cat("\n\nIndirect Effects of Exogenous on Endogenous Variables:\n")
    print(x$TotExog - C, digits=digits)
    cat("\n\nIndirect Effects of Endogenous Variables on Each Other:\n")
    print(x$TotEndog - B, digits=digits)
    cat("\n\nTotal Effects of Exogenous on Endogenous Variables:\n")
    print(x$TotExog, digits=digits)
    cat("\n\nTotal Effects of Endogenous Variables on Each Other:\n")
    print(x$TotEndog, digits=digits)
    invisible(x)
}

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rense Nieuwenhuis Sent: Sunday, August 27, 2006 2:13 PM To: r-help@stat.math.ethz.ch Cc: John Fox Subject: Re: [R] Total (un)standardized effects in
SEM? Dear John, thank you very much for your reply. The suggestions you make for calculating the direct and indirect effects are exactly what I was looking for. Although I'm very new to SEM and not at all very experienced in R, I tried to put them together in a function (called decomp) and expanded it to be able to calculate standardized effects as well. For that, I made a few changes to your standardized.coefficients() and added the little function std.matrix(), which converts the output of standardized.coefficients. It's not a very coherent set of functions, but it works (for now). I have performed a preliminary test using LISREL. The results were identical. Again, I would like to thank you for the reply and the work on the sem package. With highest regards, Rense Nieuwenhuis

(Note: The syntax below consists of three functions. The first one decomposes SEM objects. It needs the other two functions to work. I'm sure this can be programmed a lot cleaner, but for now, it serves my needs.)

decomp <- function(x, std=FALSE) {
    if (std == FALSE) {
        A <- x$A                      # unstandardized structural coefficients
    }
    if (std == TRUE) {
        A <- std.matrix(std.coef(x))  # standardized structural coefficients
    }
    exog <- apply(A, 1, function(x) all(x == 0))
    endog <- !exog
    B <- A[endog, endog, drop=FALSE]  # direct effects, endogenous -> endogenous
    C <- A[endog, exog, drop=FALSE]   # direct effects, exogenous -> endogenous
    I <- diag(nrow(B))
    IBinv <- solve(I - B)
    total.endo.endo <- IBinv - I          # total effects, endogenous -> endogenous
    total.exo.endo <- IBinv %*% C         # total effects, exogenous -> endogenous
    ind.endo.endo <- total.endo.endo - B  # indirect effects, endogenous -> endogenous
    ind.exo.endo <- total.exo.endo - C    # indirect effects, exogenous -> endogenous
    temp <- list(
        total.endo.endo = total.endo.endo,
        total.exo.endo = total.exo.endo,
        ind.endo.endo = ind.endo.endo,
        ind.exo.endo = ind.exo.endo)
    return(temp)
}

std.matrix <- function(x) { i <- length(x$D) ii <- length(x
Re: [R] Type II and III sum of square in Anova (R, car package)
Dear Amasco, Again, I'll answer briefly (since the written source that I previously mentioned has an extensive discussion):

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Amasco Miralisus Sent: Monday, August 28, 2006 2:21 PM To: r-help@stat.math.ethz.ch Cc: John Fox; Prof Brian Ripley; Mark Lyman Subject: Re: [R] Type II and III sum of square in Anova (R, car package)

Hello, First of all, I would like to thank everybody who answered my question. Every post has added something to my knowledge of the topic. I now know why Type III SS are so questionable. As I understood from the R FAQ, there is disagreement among statisticians about which SS to use (http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-does-the-output-from-anova_0028_0029-depend-on-the-order-of-factors-in-the-model_003f). However, most commercial statistical packages use Type III as the default (with orthogonal contrasts), just like STATISTICA, from which I am currently trying to migrate to R. This was probably done for the convenience of end-users who are not very experienced in theoretical statistics.

Note that the contrasts are only orthogonal in the row basis of the model matrix, not, with unbalanced data, in the model matrix itself.

I am aware that the same result could be produced using the standard anova() function with Type I sequential SS, supplemented by the drop1() function, but this approach will look quite complicated to people without a substantial background in statistics, like non-math students. I would prefer an easier way, possibly more universal, though also probably more for dummies :) If I am not mistaken, the car package by John Fox, with its nice Anova() function, is a reasonable alternative for anyone who wishes to perform a quick statistical analysis without fear of messing up the model fitting. Of course, orthogonal contrasts have to be specified (for example contr.sum) in the case of Type III SS.
Therefore, I would like to reformulate my questions, to make it easier for you to answer: 1. The first question relates to the answer by Professor Brian Ripley: Did I understand correctly from the suggested paper (Bill Venables' 'exegeses' paper) that there is not much sense in testing main effects if the interaction is significant?

Many are of this opinion. I would put it a bit differently: Properly formulated, tests of main effects in the presence of interactions make sense (i.e., have a straightforward interpretation in terms of population marginal means) but probably are not of interest.

2. If I understood the post by John Fox correctly, I could safely use Anova(., type="III") from car for ANOVA analyses in R, both for balanced and unbalanced designs? Of course, providing the model was fitted with orthogonal contrasts. Something like below:

mod <- aov(response ~ factor1 * factor2, data=mydata,
           contrasts=list(factor1=contr.sum, factor2=contr.sum))
Anova(mod, type="III")

Yes (or you could reset the contrasts option), but why do you appear to prefer the type-III tests to the type-II tests?

It was also said in most of your posts that the decision of which type of SS to use has to be made on the basis of the hypothesis we want to test. Therefore, let's assume that I would like to test the significance of both factors, and if some of them are significant, I plan to use post-hoc tests to explore the difference(s) between levels of the significant factor(s).

Your statement is too vague to imply what kind of tests you should use. I think that people are almost always interested in main effects when interactions to which they are marginal are negligible. In this situation, both type-II and type-III tests are appropriate, and type-II tests would usually be more powerful.
Regards, John

Thank you in advance, Amasco
[R] [R-pkgs] Version 1.2-0 of the Rcmdr package
I've submitted a new, and substantially enhanced, version (1.2-0) of the Rcmdr package to CRAN. Some highlights (from the CHANGES file):

o Added ability to import from Excel, Access or dBase files (contributed by Samir Messad, Renaud Lancelot and Matthieu Lesnoff).

o Added ability to read data from the clipboard (suggested by Graham Smith).

o Added Data -> Active data set -> Stack variables in active data set (suggested by Richard Heiberger).

o Added many probability distributions, courtesy of Miroslav Ristic and G. Jay Kerns, Andy Chang, and Theophilius Boye.

o Added dialogs to sample from probability distributions; divided distributions menus into Continuous and Discrete.

o Added totPercents function. Added total percentages and components of chi-square to two-way tables (suggested by Richard Heiberger).

o Added Statistics -> Summaries -> Count missing observations.

o Added Statistics -> Summaries -> Correlation test, contributed by Stefano Calza.

o The numerical summaries, table of statistics, frequency distributions, numeric-to-factor, and recode dialogs will now process multiple variables in parallel (suggested by Bo Sommerlund).

o Added function numSummaries for neater output from Statistics -> Summaries -> Numerical summaries (motivated by comments from Bo Sommerlund).

o XY conditioning plots dialog added, courtesy of Richard Heiberger.

o Enhancements to 3D scatterplots.

o Small interface improvements (e.g., pressing a letter key in a list box moves to the next entry starting with that letter).

o New Rcmdr options (etc and etcMenus) determine where the Commander will look for its configuration and menu files (motivated by a suggestion by Richard Heiberger).

Rcmdr version 1.2-0 includes revised Catalan and Japanese translations (courtesy of Manel Salamero and Takaharu Araki); as other revised translations become available, I'll submit updated versions of the package to CRAN. Version 1.2-1 will also have a revised introductory manual.
As usual, bug reports, comments, and suggestions are welcome. John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

___ R-packages mailing list R-packages@stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] Question on Chi-square of null model in sem package
Dear Wei-Wei, As I explained to you in private email yesterday (perhaps you didn't receive my reply?), the problem that you point out is due to a bug in the sem function that I fixed some time ago and then inadvertently reintroduced. Yesterday, I sent a corrected version of the sem package (0.9-5) to CRAN; the source package is there now and I'm sure that the compiled Windows package will appear in due course. Thank you once more for bringing the problem to my attention. John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Guo Wei-Wei Sent: Monday, September 04, 2006 12:34 AM To: r-help@stat.math.ethz.ch Subject: [R] Question on Chi-square of null model in sem package

Dear all, I met a problem while doing SEM with the sem package: I got a negative chi-square for the null model. Because the theoretical value of a chi-square cannot be negative, I checked the source code of sem.R in the sem package, and I found the chi-square of the null model was computed by the following expression:

result$chisqNull <- (N - 1) * (sum(diag(S %*% diag(1/diag(S)))) + log(prod(diag(S))))

I think the reason for the negative chi-square is the too-small value of prod(diag(S)) for my data. I'm working on a data frame named emc.data from a sample of a 16-item questionnaire. The variances of the items are:

> diag(cov(emc.data))
     EMC1      EMC2      EMC3      EMC4      EMC5      EMC6      EMC7      EMC8
0.3640000 0.2350041 0.2488009 0.2901653 0.3195399 0.3107343 0.3436622 0.2345912
     EMC9     EMC10     EMC11     EMC12     EMC13     EMC14     EMC15     EMC16
0.2621680 0.3230400 0.4039245 0.3803105 0.2773370 0.4348342 0.2757216 0.3405252

The fit indices RMSEA and GFI are good, so I think the problem might be solved by computing the chi-square of the null model in another way. I'm not well trained in maths, so I come for help. Any advice is appreciated.
Best wishes, Wei-Wei
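The algebra behind the null-model chi-square discussed above can be sketched as follows. This is illustrative likelihood arithmetic under the standard independence null model Sigma0 = diag(diag(S)), not the sem package's actual code; the function name is hypothetical.

```r
# ML fit function for the independence model:
#   F0 = log det(Sigma0) + tr(S %*% solve(Sigma0)) - log det(S) - n
# Since tr(S %*% diag(1/diag(S))) = n, the trace term cancels the -n:
chisq.null <- function(S, N) {
  # sum(log(diag(S))) is a numerically safer form of log(prod(diag(S)))
  # when there are many small variances, as in the 16-item data above
  (N - 1) * (sum(log(diag(S))) - log(det(S)))
}
```

By Hadamard's inequality, det(S) <= prod(diag(S)) for a covariance matrix, so this quantity is never negative; a negative value, as Wei-Wei observed, signals missing terms in an implementation rather than a problem with the data.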
Re: [R] problems with R and tckl/tk on Mac OS X
Dear Alex, It's possible, though not clear from the original posting, that Ingo is trying to install the Rcmdr package. If that's the case, then he might find the installation instructions for Mac users at http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html helpful.

Regards, John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alex Brown Sent: Friday, September 29, 2006 8:48 AM Cc: r-help@stat.math.ethz.ch Subject: Re: [R] problems with R and tckl/tk on Mac OS X

I assume you are running MacOS X 10.4? The Mac version of R does not require X11 or Tcl/Tk, since it uses a separate GUI developed exclusively for the Mac using native widgets and a Quartz-based graphing mechanism. Try downloading the following binary installer, which includes the GUI: http://cran.r-project.org/bin/macosx/R-2.3.1.dmg -Alex Brown

On 29 Sep 2006, at 08:04, Ingo wrote: Dear R-help team, I am trying to run R on my Intel-based Mac. I have installed R, X11 and Tcl/Tk (I thought), but the GUI doesn't run. My system administrator doesn't support R and therefore, I'm a little helpless. What can I do? Regards, Ingo
Re: [R] scatter3d() model.summary coefficients?
Dear Anja, As you suggest, models in scatter3d() are fit via lm() and also mgcv(). scatter3d() rescales the three variables to fit in the unit cube; I believe that the new version of rgl makes the rescaling unnecessary, so eventually I'll probably rework scatter3d() to avoid it. It would be better if ?scatter3d mentioned this; I've made that change in the development version of the package. BTW, a nice thing about R is that the source code is there, so you can look to see what a function does. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of angela baldo Sent: Friday, September 29, 2006 5:23 PM To: r-help@stat.math.ethz.ch Subject: [R] scatter3d() model.summary coefficients? Hello All, I am a R newbie and am probably misinterpreting something really obvious... In the Rcmdr package there is a scatter3d() function that can fit a curve and also provide coefficients for the model. If I'm understanding this right, I think it's calling the lower level stats package function lm(), which is the part that actually does the curve fitting. Anyway, what has me perplexed is that the model summary from scatter3d() has different coefficients than the one generated by lm(). However, the actual surface plotted by scatter3d() looks like the function generated by lm(). In the scatter3d() docs I didn't see anything about transforming the coefficients or changing them somehow - perhaps I have not been looking in the right place? I'm using a Linux box: 2.6.17-1.2187_FC5smp, R version 2.3.1, Rcmdr version 1.2-0, in case that helps. Thanks very much for any enlightenment! anja Here's an example of the output on the same data by both functions. 
If anyone wants the dataset, let me know:

> scatter3d(samples$x1, samples$y, samples$x2, fit="linear",
    residuals=TRUE, bg="white", axis.scales=TRUE, grid=TRUE,
    ellipsoid=FALSE, xlab="x1", ylab="y", zlab="x2", model.summary=TRUE)
$linear

Call:
lm(formula = y ~ x + z)

Residuals:
      Min        1Q    Median        3Q       Max
-0.096984 -0.022303  0.004758  0.029354  0.091188

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.708945   0.007005  101.20   <2e-16 ***
x            0.278540   0.011262   24.73   <2e-16 ***
z           -0.688175   0.011605  -59.30   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.03936 on 105 degrees of freedom
Multiple R-Squared: 0.972, Adjusted R-squared: 0.9715
F-statistic: 1822 on 2 and 105 DF, p-value: < 2.2e-16

> summary(lm(formula=samples$y ~ samples$x1 + samples$x2))

Call:
lm(formula = samples$y ~ samples$x1 + samples$x2)

Residuals:
    Min      1Q  Median      3Q     Max
-7865.0 -1808.6   385.8  2380.5  7394.9

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) 92204.502   1323.217   69.68   <2e-16 ***
samples$x1    225.882      9.133   24.73   <2e-16 ***
samples$x2   -558.076      9.411  -59.30   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3192 on 105 degrees of freedom
Multiple R-Squared: 0.972, Adjusted R-squared: 0.9715
F-statistic: 1822 on 2 and 105 DF, p-value: < 2.2e-16

-- Angela M. Baldo, Computational Biologist, USDA, ARS Plant Genetic Resources Unit / Grape Genetics Research Unit, New York State Agricultural Experiment Station, 630 W. North Street, Geneva, NY 14456-0462 USA, voice 315 787-2413 or 607 254-9413, fax 315 787-2339 or 607 254-9339, [EMAIL PROTECTED], http://www.ars.usda.gov/NAA/Geneva
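John's explanation of the differing coefficients can be checked directly: refitting by lm() after rescaling each variable to [0, 1] (the unit cube that scatter3d() uses internally) should reproduce the coefficients in scatter3d()'s model.summary. A hedged sketch, assuming Angela's `samples` data frame with columns `y`, `x1`, `x2`; the rescaling function is illustrative:

```r
# Map a vector onto [0, 1], as scatter3d() does for each axis
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))

raw  <- lm(y ~ x1 + x2, data=samples)                   # Angela's second output
cube <- lm(rescale01(y) ~ rescale01(x1) + rescale01(x2),
           data=samples)                                # should match scatter3d()

summary(raw)
summary(cube)
```

The intercept and slopes change under the affine rescaling, but the t-values, R-squared, and overall F are invariant, which is exactly the pattern in the two outputs Angela posted (t = 24.73 and -59.30 in both).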
Re: [R] RODBC ERROR on Rcmdr install
Dear Sarosh, As I understand it, RODBC isn't useful on non-Windows systems, since the necessary ODBC drivers aren't available. (Someone will correct me, I'm sure, if I don't have that entirely straight.) The RODBC package is used in the Rcmdr to read Excel and some other files under Windows; in the latest version of the Rcmdr, you won't even see this menu item in non-Windows systems. You should, I believe, be able to install and use the Rcmdr package on your system without RODBC.

I hope this helps, John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sarosh Jamal Sent: Friday, September 29, 2006 11:53 AM To: R-help@stat.math.ethz.ch Subject: [R] RODBC ERROR on Rcmdr install

I've been trying to install the RODBC dependency for Rcmdr on Rv2.4 on a SunOS9 system. It claims not to be able to create gcc output files (executables) for the installation. This is puzzling since I've been able to install other packages with the same PATH variables and all. Feedback would be much appreciated. Thank you! Sarosh Jamal, Geo Computing IT Specialist, Department of Geography, University of Toronto at Mississauga, e: [EMAIL PROTECTED] p: (905) 569-4497
Re: [R] RODBC ERROR on Rcmdr install
Dear Vittorio, Thanks for the clarification. This raises two questions: (1) If a *nix system has an ODBC driver, can one then read Excel, Access, and dBase data sets via RODBC (which is what the Rcmdr menu item in question provides for)? (2) If so, is there a way for me to detect whether unixODBC is present, or would I have to rely, e.g., on a user-set option? I'm moving this message to the r-devel list.

Regards, John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of vittorio Sent: Saturday, September 30, 2006 12:40 PM To: r-help@stat.math.ethz.ch Subject: Re: [R] RODBC ERROR on Rcmdr install

At 22:55 on Friday, 29 September 2006, John Fox wrote: As I understand it, RODBC isn't useful on non-Windows systems, since the necessary ODBC drivers aren't available. (Someone will correct me, I'm sure, if I don't have that entirely straight.) The RODBC package is used in the Rcmdr to read Excel and some other files under Windows; in the latest version of the Rcmdr, you won't even see this menu item in non-Windows systems.

As a matter of fact, RODBC can be profitably used under *nix OSes together with unixODBC to connect to many DBs. I've been using RODBC with unixODBC on Linux, FreeBSD and Win XP to connect smoothly to PostgreSQL, MySQL and Oracle (somewhat tricky for me under *nix: you need ** to buy ** a driver), and I know that connections are possible to many other DBs under *nix itself. Ciao Vittorio
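On John's second question, one pragmatic possibility is to probe at run time for the unixODBC command-line tools and for a loadable RODBC. This is a hypothetical helper, not code from the Rcmdr or RODBC packages, and the reliance on the `odbcinst` utility is an assumption about how unixODBC is installed:

```r
# Hypothetical detection helper -- not from Rcmdr itself.
odbcAvailable <- function() {
  if (.Platform$OS.type == "windows") return(TRUE)
  # unixODBC ships the odbcinst utility; Sys.which() returns "" if absent
  nzchar(Sys.which("odbcinst")) &&
    suppressWarnings(require("RODBC", quietly=TRUE))
}
```

A menu item could then be shown or hidden based on the returned value, avoiding a user-set option.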
Re: [R] Linear model with hidden variables
Dear Richard, This looks like a latent-variable structural-equation model of the kind that one can fit with the sem package. If, however, both of the Y's affect all of the Z's, then the model isn't identified. (You also don't say what you assume about the errors from different equations -- are they correlated?)

I hope this helps, John

John Fox
Department of Sociology, McMaster University
Hamilton, Ontario, Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Richard A. O'Keefe Sent: Wednesday, October 04, 2006 12:37 AM To: r-help@stat.math.ethz.ch Subject: [R] Linear model with hidden variables

I have some data on a moving vehicle where, amongst other things, it looks as though it would be informative to fit a model with the following structure:

Z = B.Y + error_Z
Y = C.X + error_Y

The X variables are observed predictor variables; 6 of the variables look promising (on the basis of what they mean). The Z variables are observed response variables; there are 4 of them. There is a priori reason to believe, and scatterplots to suggest, that the Z variables are really essentially two-dimensional, so the Y variables are hidden intermediate variables. There should be 2 of them. There are actually two physical candidates for what they might be, but they happen not to have been measured. There are 800 cases. (More precisely, there are 14 periods, each with about 800 samples, and I am interested in fitting a separate model in each period.) A simple least squares fit for this model would minimise the sum of error squares for the Zs. Oh, I do mean there to be constant terms. I suppose I could go back to first principles and work it all out, but has anyone ever done something like this in R? There seems to be every imaginable variation on lm and some that I find unimaginable, so presumably a means to do this is already around somewhere.
__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] glm and plot.effects
Dear Michael, Look more closely at the two plots: With rescale.axis=TRUE (the default), the vertical axis is on the scale of the linear predictor (i.e., the logit scale), but the tick marks are labelled on the scale of the response (i.e., the probability scale); this preserves the linear structure of the model. With rescale.axis=FALSE, the vertical axis is on the scale of the response (directly on the probability scale). If you want the vertical axis labelled on the scale of the linear predictor, specify type="link" (the default is "response"). I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael Kubovy Sent: Friday, October 06, 2006 7:19 AM To: R list Subject: [R] glm and plot.effects

Dear R-helpers, I don't see a difference between the following two plots of effect objects, which I understand should be different. What am I missing?
require(doBy)
require(effects)
data(budworm)
m1 <- glm(ndead/20 ~ sex + log(dose), data=budworm, weight=ntotal, family=binomial)
m1.eff <- all.effects(m1)
plot(m1.eff, rescale.axis = FALSE, selection = 2, main = 'rescale = F')
plot(m1.eff, rescale.axis = TRUE, selection = 2, main = 'rescale = T')

R version 2.4.0 (2006-10-03), powerpc-apple-darwin8.7.0; locale: C

attached base packages: [1] grid tools splines methods utils [6] stats graphics grDevices datasets base

other attached packages: effects 1.0-9, doBy 0.8, Hmisc 3.1-1, chron 2.3-8, weaver 1.0.0, codetools 0.0-3, digest 0.2.2, proptest 0.1-1, survival 2.29, lme4 0.9975-0, Matrix 0.9975-1, JGR 1.4-11, iplots 1.0-4, JavaGD 0.3-5, rJava 0.4-10, MASS 7.2-29, lattice 0.14-9

_ Professor Michael Kubovy University of Virginia Department of Psychology USPS: P.O. Box 400400, Charlottesville, VA 22904-4400 Parcels: Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903 Office: B011 +1-434-982-4729 Lab: B019 +1-434-982-4751 Fax: +1-434-982-4766 WWW: http://www.people.virginia.edu/~mk9y/

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
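[Editor's note: the link- vs. response-scale distinction John describes can be seen numerically without plotting at all, since the response scale is just the inverse logit of the linear predictor. A small sketch on simulated grouped binomial data (the data frame and variable names here are invented, loosely patterned on the budworm example):]

```r
## Simulated dose-response data, 20 trials per row.
set.seed(2)
d <- data.frame(dose = rep(2^(0:5), 2),
                sex  = rep(c("M", "F"), each = 6))
d$ndead <- rbinom(12, 20, plogis(-3 + 1.2 * log(d$dose)))

m <- glm(cbind(ndead, 20 - ndead) ~ sex + log(dose),
         family = binomial, data = d)

eta <- predict(m, type = "link")      # linear predictor (logit scale)
p   <- predict(m, type = "response")  # probability scale
## The two scales are related by the inverse link: p = plogis(eta).
all.equal(as.vector(plogis(eta)), as.vector(p))
```

So rescale.axis=TRUE draws on the eta scale with tick marks labelled in p, while rescale.axis=FALSE draws directly on the p scale.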
Re: [R] Differences in estimates calculated in R and SPSS
Dear Katja, If the fitted values are the same and the coefficients differ, then the parametrizations of the models differ. I suspect that SPSS is using sum-to-zero contrasts (contr.sum in R, coded -1, 0, 1), while R uses dummy-coded contrasts (contr.treatment, coded 0, 1) by default. If this is the case, then you should get the same coefficients in R by setting options(contrasts=c("contr.sum", "contr.poly")). There are other ways of doing this as well. See ?options, ?contr.treatment, etc., and section 11.1.1 of the introductory manual that comes with R (and perhaps read something more detailed about how R handles factors in linear models). I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Katja Walentowitz Sent: Saturday, October 14, 2006 11:08 AM To: Dirk Eddelbuettel Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Differences in estimates calculated in R and SPSS

I have compared the models; they are exactly the same. The same for the data. The problem occurs most often when I use the aov-function. When I calculate the fitted values, they are the same, but the intercept is somewhat different. In R the intercept is more or less the mean of the regressand values, as defined in some statistical books -- but still: it is not the exact mean. In SPSS it is not. Thanks again for helping me. Katja

On 10/14/06, Dirk Eddelbuettel [EMAIL PROTECTED] wrote: On 14 October 2006 at 16:59, Katja Walentowitz wrote: | I am a bit confused because I get different estimates for the | regression coefficients from R and SPSS. | Does anyone have an explanation for this? As I am not so | experienced in data analysis, I don't know why this happens. Maybe you are not estimating the same model -- one may have an intercept, and the other doesn't.
Or maybe something happens while you read the data -- start by comparing summary statistics, tabulations and plots of your data. But as you didn't tell us what you estimated, and how, it is a tad hard for us to guess. There is a Posting Guide recommending how to phrase questions and what to supply ... Dirk -- Hell, there are no rules here - we're trying to accomplish something. -- Thomas A. Edison [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
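[Editor's note: the contrast behaviour John describes is easy to demonstrate. The sketch below fits the same one-way model under R's default treatment contrasts and under sum-to-zero contrasts (the parametrization SPSS typically reports): the coefficients differ but the fitted values agree, and under contr.sum the intercept is the mean of the group means, which matches Katja's "more or less the mean" observation. The data are simulated.]

```r
## Same model, two parametrizations.
set.seed(3)
d <- data.frame(g = factor(rep(c("a", "b", "c"), each = 10)),
                y = rnorm(30, mean = rep(c(1, 2, 3), each = 10)))

m.treat <- lm(y ~ g, data = d)   # default: contr.treatment (dummy coding)
old <- options(contrasts = c("contr.sum", "contr.poly"))
m.sum <- lm(y ~ g, data = d)     # sum-to-zero coding, as in SPSS
options(old)                     # restore the previous setting

## Under contr.sum the intercept is the mean of the group means
## (equal to the grand mean here only because the design is balanced):
all.equal(unname(coef(m.sum)[1]), mean(tapply(d$y, d$g, mean)))
## The fitted values are identical under either parametrization:
all.equal(fitted(m.treat), fitted(m.sum))
```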
Re: [R] regression analyses using a vector of means and a variance-covariance matrix
Dear John, I'm not aware of an existing function that will do what you want conveniently, but it's not hard to write one:

reg <- function(V, means, y, x, n){
    # V: covariance matrix of variables
    # means: vector of means
    # y: index of response in V
    # x: indices of regressors in V
    SSP <- (n - 1)*V + n*outer(means, means)
    SSP <- cbind(n*means, SSP)
    SSP <- rbind(c(n, n*means), SSP)
    XtXinv <- solve(SSP[c(1, x+1), c(1, x+1)])
    b <- as.vector(XtXinv %*% SSP[c(1, x+1), y+1])
    R2 <- as.vector(V[y, x] %*% solve(V[x, x]) %*% V[x, y])/V[y, y]
    s2 <- (1 - R2)*V[y, y]*(n - 1)/(n - length(x) - 1)
    Vb <- s2*XtXinv
    list(coef=b, vcov=Vb, R2=R2, s2=s2, SSP=SSP)
    }

For example,

library(car)
data <- Duncan[, c("prestige", "income", "education")]
reg(var(data), colMeans(data), 1, 2:3, nrow(data))

$coef
[1] -6.0646629  0.5987328  0.5458339

$vcov
                          income    education
           18.2494814 -0.15184501 -0.150706025
income     -0.1518450  0.01432027 -0.008518551
education  -0.1507060 -0.00851855  0.009653582

$R2
[1] 0.8281734

$s2
[1] 178.7309

$SSP
               prestige income education
             45    2146   1884      2365
prestige   2146  146028 118229    147936
income     1884  118229 105148    122197
education  2365  147936 122197    163265

summary(lm(prestige ~ income + education, data=Duncan))  # check

Call:
lm(formula = prestige ~ income + education, data = Duncan)

Residuals:
     Min       1Q   Median       3Q      Max
-29.5380  -6.4174   0.6546   6.6051  34.6412

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.06466    4.27194  -1.420    0.163
income       0.59873    0.11967   5.003 1.05e-05 ***
education    0.54583    0.09825   5.555 1.73e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 13.37 on 42 degrees of freedom
Multiple R-Squared: 0.8282, Adjusted R-squared: 0.82
F-statistic: 101.2 on 2 and 42 DF, p-value: < 2.2e-16

Some caveats: This function will work OK for well-conditioned data, since I'm inverting the matrix of sums of squares and products; one could take a more sophisticated approach.
As well, there's a bit of redundancy in the computation of R2, but I didn't have time to figure out how to avoid it. Finally, there's an obvious advantage to computing on the original data -- e.g., one can get residuals. I hope this helps, John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John Sorkin Sent: Saturday, October 14, 2006 3:27 PM To: r-help@stat.math.ethz.ch Subject: [R] regression analyses using a vector of means and a variance-covariance matrix

R 2.2.0, Windows XP. How can I perform a regression analysis using a vector of means and a variance-covariance matrix? I looked at the help screen for lm and did not see any option for using the aforementioned structures as input to lm. Thanks, John

John Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics Baltimore VA Medical Center GRECC, University of Maryland School of Medicine Claude D. Pepper OAIC, University of Maryland Clinical Nutrition Research Unit, and Baltimore VA Center Stroke of Excellence University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) [EMAIL PROTECTED] Confidentiality Statement: This email message, including any attachments, is for the\...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
RE: [R] ANCOVA
Dear Matt, The sequential sums of squares produced by anova() test for g ignoring x (and the interaction), x after g (and ignoring the interaction), and the x:g interaction after g and x. The second and third tests are generally sensible, but the first doesn't adjust for x, which is probably not what you want in general (although you've constructed x to be independent of g, which is typically not the case). Note that you would not usually want to test for equal intercepts in the model that doesn't constrain the slopes to be equal (though you haven't done this): Among other things, 0 is outside the range of x. The Anova() function in the car package will produce so-called type-II tests. I hope that this helps. John

-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Matt Oliver Sent: Friday, August 27, 2004 4:10 PM To: [EMAIL PROTECTED] Subject: [R] ANCOVA

Dear R-help list, I am attempting to understand the proper formulation of ANCOVAs in R. I would like to test both parallelism and intercept equality for some data sets, so I have generated an artificial data set to ease my understanding. This is what I have done:

# Limits of random error added to vectors
min <- -0.1
max <- 0.1
x <- c(c(1:10), c(1:10)) + runif(20, min, max)
x1 <- c(c(1:10), c(1:10)) + runif(20, min, max)
y <- c(c(1:10), c(10:1)) + runif(20, min, max)
z <- c(c(1:10), c(11:20)) + runif(20, min, max)
g <- as.factor(c(rep(1, 10), rep(2, 10)))

# x and x1 have similar slopes and similar intercepts,
# x and y have different slopes and different intercepts,
# x and z have similar slopes with different intercepts.

# These are my full effects models
fitx1x <- lm(x1 ~ g + x + x:g)
fityx <- lm(y ~ g + x + x:g)
fitzx <- lm(z ~ g + x + x:g)

anova(fitx1x)
Analysis of Variance Table

Response: x1
          Df  Sum Sq Mean Sq    F value Pr(>F)
g          1   0.002   0.002     0.3348 0.5709
x          1 163.927 163.927 23456.8319 <2e-16 ***
g:x        1   0.002   0.002     0.2671 0.6123
Residuals 16   0.112   0.007
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.'
0.1 ` ' 1

These results confirm that x and x1 do not have significantly different means with respect to g; there is a significant linear relationship between x and x1 independent of g; and there is no evidence that the slopes of x and x1 in the two g groups are different.

anova(fityx)
Analysis of Variance Table

Response: y
          Df  Sum Sq Mean Sq    F value Pr(>F)
g          1   0.012   0.012     1.7344 0.2064
x          1   0.003   0.003     0.4399 0.5166
g:x        1 164.947 164.947 24274.4246 <2e-16 ***
Residuals 16   0.109   0.007
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

These results confirm that x and y do not have significantly different means with respect to g; there is not a significant linear relationship between x and y independent of g; and there is evidence that the slopes of x and y in the two g groups are significantly different.

anova(fitzx)
Analysis of Variance Table

Response: z
          Df Sum Sq Mean Sq    F value Pr(>F)
g          1 501.07  501.07 52709.9073 <2e-16 ***
x          1 165.39  165.39 17398.4057 <2e-16 ***
g:x        1   0.02    0.02     1.7472 0.2048
Residuals 16   0.15    0.01
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

These results confirm that x and z have significantly different means with respect to g; there is a significant linear relationship between x and z independent of g; and there is no evidence that the slopes of x and z in the two g groups are significantly different.

What I don't understand is how to formulate the model so that I can tell if the intercepts between the g groups are different. Also, how would I formulate an ANCOVA if I am dealing with Model II regressions? Any help would be greatly appreciated. Matt

== When you reach an equilibrium in biology, you're dead. - A. Mandell == Matthew J. Oliver Institute of Marine and Coastal Sciences 71 Dudley Road, New Brunswick New Jersey, 08901 http://marine.rutgers.edu/cool/ __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
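[Editor's note: the intercept-equality test Matt asks about is usually done by comparing nested models. With parallel slopes assumed (no g:x term), the g coefficient is the difference in intercepts, so an F test of the common-line model against the parallel-lines model tests equal intercepts directly. A sketch on data simulated the same way as Matt's z variable:]

```r
## Parallel lines with different intercepts (same construction as Matt's z).
set.seed(4)
x <- rep(1:10, 2) + runif(20, -0.1, 0.1)
z <- c(1:10, 11:20) + runif(20, -0.1, 0.1)  # same slope, intercepts differ by ~10
g <- factor(rep(1:2, each = 10))

fit.parallel <- lm(z ~ g + x)   # common slope, separate intercepts
fit.common   <- lm(z ~ x)       # single line (equal intercepts imposed)

## F test of equal intercepts, given equal slopes:
anova(fit.common, fit.parallel)
## car::Anova(fit.parallel) gives the analogous type-II test for g,
## as John suggests.
```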
Re: [R] Rcmdr X11 protocol error message
Dear Peter, On 30 Aug 2004 17:41:10 +0200 Peter Dalgaard [EMAIL PROTECTED] wrote: John Fox [EMAIL PROTECTED] writes: Dear Peter and Michael, Peter: Thank you for fielding this question. Michael initially contacted me directly, and I suggested that he write to the list since I didn't know the possible source of the problem, had never seen it (either under Windows or Linux), and never had a report of it before. Michael: Does this problem occur only with certain Rcmdr dialogs or generally? I'd also be curious to learn whether anyone else has experienced this problem, and whether anyone has run the Rcmdr package under Mandrake Linux without a problem. I can't seem to reproduce it on RH8 with R-1.9.x. I see a couple of other quirks though: If you bring up a dialog and click help, you get the help window OK, but it is not scrollable until you exit the dialog with OK or Cancel (by which time the point is moot either way, presumably). My Linux system badly needs updating, so I can't check this directly now. Does this problem occur with HTML help? As well, setting the Rcmdr grab.focus option to FALSE might get things unstuck. The View data set feature gets stuck in a tight loop and requires a Ctl-C in the console to allow it to display the data and process any events. Does this problem occur when showData() (from the relimp package) is called directly, or only when it's called from the Rcmdr? I'm sorry to impose these questions on you -- in the longer term, I have to update my Linux system so that I can again perform these tests myself. Thank you for your help, John -p -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph.
N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] initialization error with XEmacs init.el file
Dear Melanie, Are you using forward-slashes (/) or double-back-slashes (\\) to separate directories, as I believe you should, and placing quotes around the path? I hope that this helps, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Melanie A. Link-Perez Sent: Monday, August 30, 2004 5:08 PM To: [EMAIL PROTECTED] Subject: [R] initialization error with XEmacs init.el file I have successfully used XEmacs on my home PC for several months. On 8/25/04 I downloaded R version 1.9.1 (rw1091) from one of the R mirror sites, XEmacs installer (setup.exe) from http://www.xemacs.org/Download/win32/, and Jim Fox's latest configuration files (fox-ess-config.zip) from http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/ESS/ to my Dell laptop, which is running MS Windows XP Professional version 2002. Upon launching XEmacs, I get the following error: (1) (initialization/error) An error has occurred while loading c:\.xemacs\init.el: Cannot open load file: executable The init.el file is version 0.5.6. I have triple-checked path names and placement of files in directories, but have not been able to track down this problem. I also tried replacing the init.el file with my older version (0.5.4) to no avail. I have tried to get around this by using Alt-x and shift-R to launch R, but the following appears below the buffer: Searching for program: No such file or directory, C:Program FilesR^Mw1091^HinRterm Certainly, there is no Mw1091 or HinRterm and I don't know where the program is getting that info (with the obvious typos). I am grateful for any assistance that is offered. Thanks, Melanie Link-Perez Miami University __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Rcmdr X11 protocol error message
Dear Michael, A question: Do you observe these problems with tcltk in general or just with the Rcmdr package? Is it possible for you to test a Tcl/Tk program outside of R? Regards, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard Sent: Tuesday, August 31, 2004 2:32 AM To: Michael Bibo Cc: [EMAIL PROTECTED] Subject: Re: [R] Rcmdr X11 protocol error message Michael Bibo [EMAIL PROTECTED] writes: What's happening? Both Java and R are reading events from the X11 event loop ``at the same time''. As a result, R is reading events meant for Java and vice-versa and bad things happen! There are two solutions: * use the Java graphics device which uses the Java graphics facilities to do the rendering; * don't run the GUI and the X11 device at the same time. The second solution is clearly not ideal. What we can do is to compile the X11 device code to be thread-safe. However, the same effect will be seen if we load another X11 event handler (e.g. the Tcl/Tk library). I don't think that is the issue. The X11 device and Tk sit on different connections to the X device, so they shouldn't be seeing each other's events (the event loops can starve each other, though, since they are multiplexed). I'd rather suspect a race condition with the Tk code itself - sending events to a window that is in the process of being destroyed and things like that. Which Tk version are we talking about, by the way? -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Rcmdr X11 protocol error message
Dear Peter and Michael, I installed Quantian on a spare machine that I have and observed the same warning messages that Michael has been reporting. These (and the problem with help files but not with viewing data sets that Peter reported) occurred with version 0.9-11 of Rcmdr but not with an earlier version. Since the code for the Rcmdr package was substantially reworked this summer, that seems to me a good candidate for the source of these problems, though I don't see why the changes should be problematic. I'm afraid that I'm insufficiently familiar with the inner workings of X11 and Tcl/Tk to be much help in figuring out what's wrong. Everything seems to work fine under Windows, as far as I can tell. It occurs to me that if the warning messages are benign, one approach would be to suppress them. I already intercept warnings and present them in dialog boxes; I could grep for X11 protocol error and simply ignore these. That doesn't seem to me a good solution, however. It would be better to understand what's happening. I'm copying this message to Dirk since he's mentioned that he plans to put the newer Rcmdr in Quantian. Dirk: Have you tested with Rcmdr 0.9-11? Thank you. John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard Sent: Tuesday, August 31, 2004 2:37 PM To: Michael Bibo Cc: [EMAIL PROTECTED] Subject: Re: [R] Rcmdr X11 protocol error message Michael Bibo [EMAIL PROTECTED] writes: John Fox jfox at mcmaster.ca writes: Dear Michael, A question: Do you observe these problems with tcltk in general or just with the Rcmdr package? Is it possible for you to test a Tcl/Tk program outside of R? And Peter asked which version of Tcl/Tk. Apparently my system has version 8.4 installed, specifically: Tcl-8.4.5-3-mdk (tclsh8.4) and Tk-8.4.5-3-mdk (libtk8.4.so). As I understand it, these are installed from RPMs on the installation DVD, and I note that they are mandrake specific. 
John - I wasn't sure if I had any other Tcl/Tk applications installed (it's not always obvious when installing from RPMs). I have certainly not encountered these error messages with any other application. I quickly downloaded WISH Supernotepad 1.2.3. This application requires Tcl/Tk 8.4 or greater. There are no graphics windows in this application, but plenty of dialogue boxes. It gave no errors. Is this an appropriate test? If this is a mandrake-configuration-specific problem, it may not be worth spending too much time investigating, as R Commander still works. I can always try re-installing Tcl/Tk from source when/if I have time. I don't think we have evidence that it's the Tcl installation, although it could be (Google suggests that there have been problems with at least some versions, although most references seem rather old). I can't seem to reproduce the effect with SUSE's tk-8.4.6-28 either. If it is a bug in Rcmdr, then we'd want to find it, and you have the only handle on it. BTW, sometimes Tk errors allow you to see a trace of the execution. Would this happen to be one of those situations? -- O__ Peter Dalgaard Blegdamsvej 3 c/ /'_ --- Dept. of Biostatistics 2200 Cph. N (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Rcmdr X11 protocol error message
Dear Michael, I can confirm Michael's observations: no problems until after a graphics device window is opened. Then increasing numbers of warnings each time commands are executed from within the Rcmdr. This happens whether the commands are executed from dialog boxes or from the (upper) script window. The warnings continue even if the graphics window is closed. Exiting from the Commander and restarting it (within the same R session) stops the warnings until another graphics window is opened. John -Original Message- From: Michael Bibo [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 01, 2004 6:21 AM To: John Fox Cc: 'Peter Dalgaard'; [EMAIL PROTECTED]; 'Dirk Eddelbuettel' Subject: Re: [R] Rcmdr X11 protocol error message John Fox wrote: Dear Peter and Michael, I installed Quantian on a spare machine that I have and observed the same warning messages that Michael has been reporting. These (and the problem with help files but not with viewing data sets that Peter reported) occurred with version 0.9-11 of Rcmdr but not with an earlier version. Since the code for the Rcmdr package was substantially reworked this summer, that seems to me a good candidate for the source of these problems, though I don't see why the changes should be problematic. I'm afraid that I'm insufficiently familiar with the inner workings of X11 and Tcl/Tk to be much help in figuring out what's wrong. Everything seems to work fine under Windows, as far as I can tell. It occurs to me that if the warning messages are benign, one approach would be to suppress them. I already intercept warnings and present them in dialog boxes; I could grep for X11 protocol error and simply ignore these. That doesn't seem to me a good solution, however. It would be better to understand what's happening. I'm copying this message to Dirk since he's mentioned that he plans to put the newer Rcmdr in Quantian. Dirk: Have you tested with Rcmdr 0.9-11? Thank you.
John John, Peter and Dirk, I'm glad I'm not the only one with a handle on it now. I've already exceeded my level of knowledge and skill in both Linux and R. But I'm happy to help track it down, if I can be of assistance. As I have said to John, the major value of Rcmdr to me is to help 'sell' R to others in my organisation currently using SPSS, and for this purpose, the Windows version at work is working fine. It is more of an annoyance at home, as it doesn't seem to stop anything working. Peter asked: BTW, sometimes Tk errors allow you to see a trace of the execution. Would this happen to be one of those situations? The short answer is I don't know (see my first line :-)). There is no button on the dialogue box saying 'more details...' or anything that obvious. Is there a log file somewhere on the system or some other way to generate such a trace? I have been experimenting, and the following seem reliable (at least on my system): Error messages don't seem to happen for the first graph generated. The second graph drawn also generates an error dialogue box with the error message repeated about 11 times. The third and subsequent times exactly the same graph is generated leads to an error message repeated about 21 times. (I don't know if the number of repetitions is meaningful). Note that the graphs are generated and visible when the error messages appear. Running any analysis that only writes output to the output window (such as fitting a regression model) generates the error messages before the output is written to the output window. The output appears when the error dialogue box is OK'd. I guess it's more likely to be responding to exiting the dialogue box than writing the output. But this only happens when such error messages have been generated previously in that session by creating a (second) graph. 
If the diagnostic panel plots for a regression model are called, the error messages appear, but this time the plots themselves are not visible in the graphics device (it is blank). When the error messages are OK'd, the plots appear in the device. When exiting Rcmdr, then, and OK'ing the Exit? dialogue box, the final error messages appear, but again only if they have already been generated by a graph call. I don't know if this will be helpful, but I thought reliable observations might give some clues. Regards, Michael [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Rcmdr X11 protocol error message
Dear Michael and Peter, I've made the following changes to the current development version of the Rcmdr package (on my web site, at http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/index.html, but not posted to CRAN): (1) The default for the Rcmdr grab.focus option is now set to FALSE for non-Windows systems. (2) There's a new report.X11.warnings option, which is set to FALSE by default; this suppresses these warnings so that they don't appear in Rcmdr warning dialogs. This just papers over the problem without really solving it. I'm not happy with these solutions, but perhaps they'll help until we discover the source of the problems. (I tested on my Quantian system.) Thanks for your help. John -Original Message- From: Michael Bibo [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 01, 2004 6:21 AM To: John Fox Cc: 'Peter Dalgaard'; [EMAIL PROTECTED]; 'Dirk Eddelbuettel' Subject: Re: [R] Rcmdr X11 protocol error message John Fox wrote: Dear Peter and Michael, I installed Quantian on a spare machine that I have and observed the same warning messages that Michael has been reporting. These (and the problem with help files but not with viewing data sets that Peter reported) occurred with version 0.9-11 of Rcmdr but not with an earlier version. Since the code for the Rcmdr package was substantially reworked this summer, that seems to me a good candidate for the source of these problems, though I don't see why the changes should be problematic. I'm afraid that I'm insufficiently familiar with the inner workings of X11 and Tcl/Tk to be much help in figuring out what's wrong. Everything seems to work fine under Windows, as far as I can tell. It occurs to me that if the warning messages are benign, one approach would be to suppress them. I already intercept warnings and present them in dialog boxes; I could grep for X11 protocol error and simply ignore these. That doesn't seem to me a good solution, however. It would be better to understand what's happening. 
I'm copying this message to Dirk since he's mentioned that he plans to put the newer Rcmdr in Quantian. Dirk: Have you tested with Rcmdr 0.9-11? Thank you. John John, Peter and Dirk, I'm glad I'm not the only one with a handle on it now. I've already exceeded my level of knowledge and skill in both Linux and R. But I'm happy to help track it down, if I can be of assistance. As I have said to John, the major value of Rcmdr to me is to help 'sell' R to others in my organisation currently using SPSS, and for this purpose, the Windows version at work is working fine. It is more of an annoyance at home, as it doesn't seem to stop anything working. Peter asked: BTW, sometimes Tk errors allow you to see a trace of the execution. Would this happen to be one of those situations? The short answer is I don't know (see my first line :-)). There is no button on the dialogue box saying 'more details...' or anything that obvious. Is there a log file somewhere on the system or some other way to generate such a trace? I have been experimenting, and the following seem reliable (at least on my system): Error messages don't seem to happen for the first graph generated. The second graph drawn also generates an error dialogue box with the error message repeated about 11 times. The third and subsequent times exactly the same graph is generated leads to an error message repeated about 21 times. (I don't know if the number of repetitions is meaningful). Note that the graphs are generated and visible when the error messages appear. Running any analysis that only writes output to the output window (such as fitting a regression model) generates the error messages before the output is written to the output window. The output appears when the error dialogue box is OK'd. I guess it's more likely to be responding to exiting the dialogue box than writing the output. But this only happens when such error messages have been generated previously in that session by creating a (second) graph. 
If the diagnostic panel plots for a regression model are called, the error messages appear, but this time the plots themselves are not visible in the graphics device (it is blank). When the error messages are OK'd, the plots appear in the device. When exiting Rcmdr, then, and OK'ing the Exit? dialogue box, the final error messages appear, but again only if they have already been generated by a graph call. I don't know if this will be helpful, but I thought reliable observations might give some clues. Regards, Michael [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
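[Editor's note: the two workarounds John describes in the message above (grab.focus and report.X11.warnings) are set through the Rcmdr entry in R's options list before the package is loaded. A minimal sketch, assuming the option names from John's message; the exact set of recognized options depends on the Rcmdr version in use:]

```r
## Set Rcmdr options before loading the package. Both option names here are
## taken from John's description of the development version; they are
## assumptions, not a documented API for every release.
options(Rcmdr = list(grab.focus = FALSE,          # don't grab keyboard focus
                     report.X11.warnings = FALSE) # suppress X11 protocol warnings
        )
# library(Rcmdr)  # load the package only after the options are set
```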
RE: [R] Rcmdr X11 protocol error message
Dear Michael, -Original Message- From: Michael Bibo [mailto:[EMAIL PROTECTED]] Sent: Saturday, September 04, 2004 9:59 AM To: John Fox Cc: 'Peter Dalgaard'; [EMAIL PROTECTED]; 'Dirk Eddelbuettel' Subject: Re: [R] Rcmdr X11 protocol error message . . . I have another reliable observation with the new version of Rcmdr, which I offer only because it might provide a clue as to what's going on: If you generate a scatterplot for two numerical variables grouped by a factor, the plot is drawn, but the dialogue box remains on screen and is unresponsive. When the graphics device is closed, you get an error message: 'Error: No graphics device active'. If you then try to generate the same scatterplot ungrouped (which previously worked fine), you receive the message 'Warning: display list redraw incomplete'. This only occurs the first time you try to draw the ungrouped scatterplot. Hope it helps. This works fine for me, under both Windows and Quantian. Is it possible that you forgot to left-click to position the legend for the scatterplot by groups (as is indicated at the bottom of the Groups dialog box)? I hope that this helps, John Peter, any hints as to how to generate that trace of the execution you mentioned? Michael [EMAIL PROTECTED]
RE: [R] a little question about R
You can use letters[1:4]. I hope that this helps, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of ?? Sent: Tuesday, September 07, 2004 7:42 PM To: [EMAIL PROTECTED] Subject: [R] a little question about R Hello, sir: Here's a little question about R which needs your help. Thanks in advance. If I want to make a sequence just like a, b, c, d (in other words, a vector consisting of the 4 characters a, b, c, d), how can I do it in a shortcut manner? Yes, I can do it as follows: c("a","b","c","d"), and the result is [1] "a" "b" "c" "d". But I remember there's a shortcut way to do the same thing, something like c("a":"d"). But the system said: 'Error in "a":"d" : NA/NaN argument'. So I wonder what the correct method is (I remember the expression is very similar to c("a":"d")). Thanks from the bottom of my heart. My best regards!
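A quick sketch of the built-in constant John refers to, runnable at the R prompt:

```r
# 'letters' and 'LETTERS' are built-in constants holding the 26
# lower- and upper-case letters; indexing gives any consecutive run.
x <- letters[1:4]
x
# [1] "a" "b" "c" "d"
LETTERS[1:4]
# [1] "A" "B" "C" "D"
```

There is no c("a":"d") shortcut because the `:` operator works on numbers, not character strings.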
RE: [R] R conversion
Dear Mark, See ?factanal for a factor-analysis function in the standard stats package (though factanal does ML factor analysis and not principal axes). Note that help.search("factor analysis") turns this up. With respect to your last question, it's not possible to know the source of the difference without knowing what the difference is. A guess is that specifying proportion=0.9 in SAS causes the program to use communality estimates rather than 1's on the diagonal of the correlation matrix. I hope this helps, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Mark Strivens Sent: Thursday, September 09, 2004 7:26 PM To: [EMAIL PROTECTED] Subject: [R] R conversion I am a newcomer to R trying to convert a SAS program to R. Does anyone know if there is a functional equivalent of the SAS 'FACTOR' procedure? For example, in SAS: proc factor DATA=cor method=principal rotate=varimax proportion=0.9 scree; where 'cor' is a correlation matrix (as from the R 'cor' function). This should get you a list of eigenvalues as well as a factor pattern matrix. Also, why, when I use the 'eigen' function in R, does it seem to give a subtly different answer for the eigenvalues than those generated by the above program? Many thanks for any help. Mark Strivens
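A hedged sketch of the factanal() route John points to, using the ability.cov example from the factanal help page in place of the poster's (unavailable) data:

```r
# ML factor analysis with the default varimax rotation, fitted directly
# to a covariance matrix -- roughly the role PROC FACTOR plays in SAS.
fa <- factanal(factors = 2, covmat = ability.cov)
print(fa$loadings, cutoff = 0.4)

# Eigenvalues of the corresponding correlation matrix, as eigen() reports them:
eigen(cov2cor(ability.cov$cov))$values
```

Note that factanal() reports loadings and uniquenesses rather than the SAS-style eigenvalue/scree output; eigen() on the correlation matrix supplies the latter.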
RE: [R] Proposal for New R List: Criticism? Comments?
Dear Bert, I believe that you've identified an important issue -- and one that's occasionally been discussed on this list previously -- but I'm not sure that another email list is a good solution. Some method of indexing functions in packages that would allow people to more easily locate them (e.g., author-supplied [i.e., not simply standard] keywords for each public object in a package) seems to me a more promising approach. Regards, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Berton Gunter Sent: Thursday, September 09, 2004 11:50 AM To: [EMAIL PROTECTED] Subject: [R] Proposal for New R List: Criticism? Comments? Folks: I would like to propose a new R list, tentatively labeled r-contents. I wish to briefly explain the purpose and format here and solicit public comments, pro or con, so feel free to criticize or suggest a better name and other improvements or alternatives. R presently consists of a suite of about a dozen core recommended packages and several hundred contributed packages comprising thousands -- perhaps tens of thousands -- of functions. Hopefully, this will continue to grow rapidly. No one can possibly keep track of all of this, and it is therefore a daunting task for someone seeking specific functionality to find it, especially when they are relatively new to R. Of course, R and CRAN (and Google and ...) have various search capabilities that help, but these are essentially keyword-based and so require the searcher to guess search terms that are at least reasonably close to function names and keywords. A lot of the time this works, but it can be tedious; some of the time one guesses wrong, and it doesn't work. S-Plus and much other software addresses this by providing a semantically-based Contents Index (or something like it) in their Help functionality. 
I find this quite useful, but creating and maintaining such an index seems to me to be extremely labor intensive, fraught with its own issues (what heading should I look under?), and, I think, not a good fit to the spirit and dynamics of R anyway. Not surprisingly, as a result, many of the questions addressed to r-help are of the form: "I want to do such and such. How do I do it?" While this certainly gives answers, I think the breadth of r-help and its etiquette and posting conventions result in an abruptness in many of our replies ("Read the posting guide!" "Read the help files and do what they say!") that discourages many users -- especially casual ones -- from posting questions, and may thus discourage use of R. Clearly, if true, this is not a good thing; on the other hand, I think that given r-help's purpose and practices, many of these abrupt replies may well be appropriate (I'm a curmudgeon at heart!). Hence, there is a mismatch between user needs and r-help services. To address this mismatch, I would like to propose a new list, r-contents, to serve essentially the same purpose as the S-Plus Contents index. It would be a place for users to post queries **only** of the form "I want to do such and such. How do I do it?" and receive answers that would all be **single phrases** of the form "package suchandsuch" or "?suchandsuchfunction". No further explanations regarding usage would be provided, though users would be free to follow up answers with private questions to the responder, although there should be no expectation of any response. Queries could be framed with as much or as little supporting detail as desired, with the obvious consequence that a more clearly framed question would be more likely to get a (better) response. No other posting conventions (aside from the usual ones regarding civility and adherence to topic) would be expected.
My hope is that such a list would both reduce unnecessary traffic on r-help and satisfy a genuine need in a less threatening way. I can certainly see downsides (I often learn a lot from How can I do this? queries), but I think, on balance, this approach might be useful. So I would like to subject the idea to public scrutiny and criticism, as well as the opportunity for improvement from suggested modifications or alternatives. If it's useful, this will be recognized; if it's not and/or no one is interested, that, too, will be made manifest. I would be especially grateful for the opinions of casual users or newbies, either publicly or privately. Cheers, -- Bert Gunter Genentech Non-Clinical Statistics South San Francisco, CA __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Proposal for New R List: Criticism? Comments?
Dear Brian et al., Jonathan's search site is excellent -- I use it frequently -- and for some reason new users seem unaware of help.search(), which, despite the fact that it searches only in installed packages, I also find very useful. A couple of comments, however: First, if help pages from all packages were available at a central location -- e.g., at CRAN -- help.search() could have an option to search that location. Second, I still feel that it would be useful to provide some other way of searching the space of all available functions. One idea, which I mentioned in an earlier message on this thread, would be a keyword system (again, different from the current set of standard keywords). The keywords could be accessed by help.search() and also compiled into an index. Regards, John -Original Message- From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] Sent: Friday, September 10, 2004 5:26 AM To: Jonathan Baron Cc: Adaikalavan Ramasamy; John Fox; R-help; 'Berton Gunter' Subject: Re: [R] Proposal for New R List: Criticism? Comments? On Fri, 10 Sep 2004, Jonathan Baron wrote: On 09/10/04 03:54, Adaikalavan Ramasamy wrote: There is another issue to be considered. Currently you need to have the relevant packages installed before help.search() bring it up. My work around this is to install all available packages just in case the function I need is nestled in some non-standard packages. I also update them rather frequently. I do this too, at my search site (where frequently=monthly) and you can search functions only, and use Boolean search expressions and phrases. But right now the entire set of packages takes about 885 meg (if I'm reading du correctly), which is less than my very modest collection of digital photos, and a tiny fraction of a 3-year-old standard hard disk. In other words, it is no big deal to install all the packages if you have your own computer. 
I am seeing about 520Mb for all base + CRAN packages under 1.9.1, and it will be rather less under 2.0.0 as more parts are stored compressed. BioC is a lot larger. It is, however, a BIG deal to install *all* the packages, and I am currently 10 short, since they depend on other software that I do not have a licence for or that will not compile (and there are three others I cannot reinstall using the current gcc). On AMD64 and Solaris there are several others, and something like 20 do not install on Windows. (I could use --install-fake as the CRAN checks do, but I have the almost-complete set installed to test R changes, not to test packages.) So I do see some merit in having a full-text search for R help available at some URL, as Jonathan has kindly provided. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
RE: [R] boxplot() from list
Dear Laura, You don't say what kind of objects are in the list, but suppose that they are matrices; here's a scaled-down example using 3 list elements:

M1 <- matrix(rnorm(30*4), 30, 4)
M2 <- matrix(rnorm(30*4), 30, 4)
M3 <- matrix(rnorm(30*4), 30, 4)
L <- list(M1=M1, M2=M2, M3=M3)
par(mfrow=c(3, 4))
lapply(L, function(x) apply(x, 2, boxplot))

I hope that this helps, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Laura Quinn Sent: Sunday, September 12, 2004 10:10 AM To: [EMAIL PROTECTED] Subject: [R] boxplot() from list I have a list containing 48 objects (each with 30 rows and 4 columns, all numeric), and wish to produce 4 boxplot series (with 48 plots in each), one for each column of each object. Basically I want a boxplot from boxplot(mylist[[]][,i]) for i in 1:4. It seems that I can create a boxplot of length 48 from the entire list, but I don't seem able to subscript to return 4 boxplots from the list. I have also tried to create 4 new lists (one for each column of each object) using variations on the following, but none seems to work:

newlist <- oldlist[,1]
newlist <- oldlist[[]][,1]
newlist <- oldlist[[]][,$colone]

Can anyone please offer some insight? Thanks in advance, Laura Quinn Institute of Atmospheric Science School of Earth and Environment University of Leeds Leeds LS2 9JT
RE: [R] mahalanobis distance
Dear Murli, Try ?mahalanobis, which, by the way, is turned up by help.search("mahalanobis"). I hope this helps, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Murli Nair Sent: Sunday, September 12, 2004 3:17 PM To: [EMAIL PROTECTED] Subject: [R] mahalanobis distance Is there a function that calculates the Mahalanobis distance in R? The dist function calculates 'euclidean', 'maximum', 'manhattan', 'canberra', 'binary' or 'minkowski'. Thanks ../Murli
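A minimal sketch of the mahalanobis() function John points to (the data here are simulated purely for illustration):

```r
# Squared Mahalanobis distance of each row of x from the sample mean,
# using the sample covariance matrix.
set.seed(42)
x <- matrix(rnorm(100 * 3), 100, 3)
d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))
head(round(d2, 2))
```

Note that mahalanobis() returns *squared* distances; take sqrt(d2) if ordinary distances are wanted.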
Re: [R] replacing NA's with 0 in a dataframe for specified columns
Dear David, How about the following?

cols <- c(1, 3)
x[, cols][is.na(x[, cols])] <- 0

I hope that this helps, John On Wed, 15 Sep 2004 14:44:53 -0400 David Kane [EMAIL PROTECTED] wrote: I know that there must be a cool way of doing this, but I can't think of it. Let's say I have a dataframe with NA's:

x <- data.frame(a = c(0,1,2,NA), b = c(0,NA,1,2), c = c(NA, 0, 1, 2))
x
   a  b  c
1  0  0 NA
2  1 NA  0
3  2  1  1
4 NA  2  2

I know it is easy to replace all the NA's with zeroes:

x[is.na(x)] <- 0
x
  a b c
1 0 0 0
2 1 0 0
3 2 1 1
4 0 2 2

But how do I do this for just columns a and c, leaving the NA in column b alone? Thanks, Dave Kane

R.version
         _
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    1
minor    9.1
year     2004
month    06
day      21
language R

John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/
Re: [R] Splitting vector into individual elements
Dear Paul, How about do.call(rgb, as.list(offred.rgb))? I hope that this helps, John On Wed, 15 Sep 2004 15:20:24 -0500 (CDT) Paul Roebuck [EMAIL PROTECTED] wrote: Is there a means to split a vector into its individual elements without going the brute-force route for arguments to a predefined function call?

offred.rgb <- c(1, 0, 0) * 0.60

## Brute force style
offred.col <- rgb(offred.rgb[1], offred.rgb[2], offred.rgb[3], names = "offred")

## Desired style
offred.col <- rgb(silver.bullet(offred.rgb), names = "offred")

Neither of my attempts gets it right:

silver.bullet.try1 <- function(x) {
    expr <- cat(x, sep = ",")
    return(parse(text = expr))
}

silver.bullet.try2 <- function(x) {
    expr <- expression(cat(x, sep = ","))
    return(eval(expr))
}

-- SIGSIG -- signature too long (core dumped) John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/
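John's do.call() suggestion, spelled out with the poster's own variables:

```r
# do.call() spreads the list elements out as individual arguments,
# i.e. this evaluates rgb(0.6, 0, 0).
offred.rgb <- c(1, 0, 0) * 0.60
offred.col <- do.call(rgb, as.list(offred.rgb))
offred.col
# [1] "#990000"

# To keep the names= argument as well, append it to the argument list:
do.call(rgb, c(as.list(offred.rgb), names = "offred"))
```

The failed attempts with cat() couldn't work because cat() prints and returns NULL rather than building an expression; do.call() sidesteps string manipulation entirely.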
RE: [R] How do I insert a newline in my title in a plot?
Dear Christos, This works for me, and has many times in the past. Is it possible that you used a forward-slash (/) rather than a back-slash (\)? Alternatively, perhaps this has something to do with using Greek fonts. I hope this helps, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Christos Rodopoulos Sent: Thursday, September 16, 2004 8:00 AM To: '[EMAIL PROTECTED]' Subject: FW: [R] How do I insert a newline in my title in a plot? Yes, I have tried, and nothing. It just shows the string with the \n, just as I typed it. -Original Message- From: Rashid Nassar [mailto:[EMAIL PROTECTED]] Sent: Thursday, 16 September 2004 15:44 To: Christos Rodopoulos Subject: Re: [R] How do I insert a newline in my title in a plot? Have you not tried what you have already suggested: title("this is a title\nIn 2 lines")? On Thu, 16 Sep 2004, Christos Rodopoulos wrote: Hello, I want you to help me with a simple, I think, question: How do I insert a newline into the title of a plot I have made? Is it only done with Hershey fonts (vfont = ??? and so on)? I do not understand Hershey fonts, and I am afraid to use them, since I use Greek fonts (iso8859-7). I probably need something like: title("This is a title\nIn 2 lines"); any help?
RE: [R] Proposal for New R List: Criticism? Comments?
Dear Martin, Thanks for pointing this out -- I'm ashamed to say that I forgot about \concept{} entries. As you say (aside from people stupidly forgetting that they exist), the problem is to get people to use them. How about requiring one or more concept entries for each help file? Regards, John -Original Message- From: Martin Maechler [mailto:[EMAIL PROTECTED] Sent: Friday, September 17, 2004 7:57 AM To: John Fox Cc: 'R-help' Subject: RE: [R] Proposal for New R List: Criticism? Comments? Hi John et al. I'm coming late to this thread (because of vacation), JohnF == John Fox [EMAIL PROTECTED] on Fri, 10 Sep 2004 10:56:51 -0400 writes: JohnF Dear Brian et al., JohnF Jonathan's search site is excellent -- I use it JohnF frequently -- and for some reason new users seem JohnF unaware of help.search(), which, despite the fact JohnF that it searches only in installed packages, I also JohnF find very useful. yes and yes. JohnF A couple of comments, however: First, if help pages JohnF from all packages were available at a central JohnF location -- e.g., at CRAN -- help.search() could have JohnF an option to search that location. Second, I still JohnF feel that it would be useful to provide some other JohnF way of searching the space of all available JohnF functions. One idea, which I mentioned in an earlier JohnF message on this thread, would be a keyword system JohnF (again, different from the current set of standard JohnF keywords). \concept{} was introduced for this JohnF The keywords could be accessed by help.search() and this happens (by default) for \concept{} entries JohnF and also compiled into an index. this doesn't happen yet. The ``real problem'' of course is that package authors need to write all these \concept{} entries before such an index can really become useful. 
Martin Maechler
Re: [R] Issue with predict() for glm models
# #Degrees of Freedom: 39 Total (i.e. Null); 37 Residual #Null Deviance: 55.45 #Residual Deviance: 41.67 AIC: 47.67

# Predicted Probabilities for Test Data #
New.Data <- data.frame(predictors.train1=predictors.test[,1], predictors.train2=predictors.test[,2])
logreg.pred.prob.test <- predict.glm(log.reg, New.Data, type="response")
logreg.pred.prob.test
# [1] 0.51106406 0.15597423 0.04948404 0.03863875 0.35587589 0.71331091
# [7] 0.17320087 0.14176632 0.30966718 0.61878952 0.12525988 0.21271139
#[13] 0.70068113 0.18340723 0.10295501 0.44591568 0.72285161 0.31499339
#[19] 0.65789420 0.42750139 0.14435889 0.93008117 0.70798465 0.80109005
#[25] 0.89161472 0.47480625 0.56520952 0.63981834 0.57595189 0.60075882
#[31] 0.96493393 0.77015507 0.87643986 0.62973986 0.63043351 0.45398955
#[37] 0.80855782 0.90835588 0.54809117 0.11568637

Of course, notice that the vector of predicted probabilities has only 40 elements, while New.Data has 50 rows (since n.test has 25 per group for 2 groups) and thus should yield 50 predicted probabilities. As it turns out, the output is for the training-data predictors and not for the New.Data, as I would like it to be. I should also note that I have made sure that the names of the predictors in New.Data are the same as the names of the predictors within the glm object (i.e., within log.reg), as this is what is done in the example for predict.glm() in the help files. Could someone help me understand either what I am doing incorrectly or what problems there might be within the predict() function? I should mention that I tried the same program using predict.glm() and obtained the same problematic results. Thanks and take care, Joe Joe Rausch, M.A. Psychology Liaison Lab for Social Research 917 Flanner Hall University of Notre Dame Notre Dame, IN 46556 (574) 631-3910 "If we knew what it was we were doing, it would not be called research, would it?"
- Albert Einstein John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/
RE: [R] Issue with predict() for glm models
Dear Uwe, Unless I've somehow messed this up, as I mentioned yesterday, what you suggest doesn't seem to work when the predictor is a matrix. Here's a simplified example:

X <- matrix(rnorm(200), 100, 2)
y <- (X %*% c(1,2) + rnorm(100)) > 0
dat <- data.frame(y=y, X=X)
mod <- glm(y ~ X, family=binomial, data=dat)
new <- data.frame(X = matrix(rnorm(20), 2))
predict(mod, new)
          1           2           3            4           5           6
 1.81224443 -5.92955128  1.98718051 -10.05331521  2.6506    -2.50635812
          7           8           9           10          11          12
 5.63728698 -0.94845276 -3.61657377  -1.63141320  5.03417372  1.80400271
         13          14          15           16          17          18
 9.32876273 -5.32723406  5.29373023  -3.90822713 -10.95065186 4.90038016
. . .
         97          98          99          100
-6.92509812  0.59357486 -1.17205723   0.04209578

Note that there are 100 rather than 10 predicted values. But with individual predictors (rather than a matrix),

x1 <- X[,1]
x2 <- X[,2]
dat.2 <- data.frame(y=y, x1=x1, x2=x2)
mod.2 <- glm(y ~ x1 + x2, family=binomial, data=dat.2)
new.2 <- data.frame(x1=rnorm(10), x2=rnorm(10))
predict(mod.2, new.2)
         1          2          3          4          5          6          7
 6.5723823  0.6356392  4.0291018 -4.7914650  2.1435485 -3.1738096 -2.8261585
         8          9         10
-1.5255329 -4.7087592  4.0619290

works as expected (?). Regards, John -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Uwe Ligges Sent: Thursday, September 23, 2004 1:33 AM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: [R] Issue with predict() for glm models [EMAIL PROTECTED] wrote: Hello everyone, I am having a problem using the predict (or the predict.glm) function in R. Basically, I run the glm model on a training data set and try to obtain predictions for a set of new predictors from a test data set (i.e., not the predictors that were utilized to obtain the glm parameter estimates). Unfortunately, every time that I attempt this, I obtain the predictions for the predictors that were used to fit the glm model.
I have looked at the R mailing list archives and don't believe I am making the same mistakes that have been made in the past, and I have also tried to closely follow the predict.glm example in the help file. Here is an example of what I am trying to do:

set.seed(545345)

# Necessary Variables #
p <- 2
train.n <- 20
test.n <- 25
mean.vec.1 <- c(1,1)
mean.vec.2 <- c(0,0)
Sigma.1 <- matrix(c(1,.5,.5,1), p, p)
Sigma.2 <- matrix(c(1,.5,.5,1), p, p)

# Load MASS Library #
library(MASS)

# Data to Parameters for Logistic Regression Model #
train.data.1 <- mvrnorm(train.n, mu=mean.vec.1, Sigma=Sigma.1)
train.data.2 <- mvrnorm(train.n, mu=mean.vec.2, Sigma=Sigma.2)
train.class.var <- as.factor(c(rep(1,train.n), rep(2,train.n)))
predictors.train <- rbind(train.data.1, train.data.2)

# Test Data Where Predictions for Probabilities Using Logistic Reg. #
# From Training Data are of Interest #
test.data.1 <- mvrnorm(test.n, mu=mean.vec.1, Sigma=Sigma.1)
test.data.2 <- mvrnorm(test.n, mu=mean.vec.2, Sigma=Sigma.2)
predictors.test <- rbind(test.data.1, test.data.2)

# Run Logistic Regression on Training Data #
log.reg <- glm(train.class.var ~ predictors.train, family=binomial(link="logit"))

Well, you haven't specified the data argument, but have given the two variables directly. Exactly those variables will be used in the predict() step below! If you want the predict() step to work, use something like:

train <- data.frame(class = train.class.var, predictors = predictors.train)
log.reg <- glm(class ~ ., data = train, family=binomial(link="logit"))

log.reg
#Call: glm(formula = train.class.var ~ predictors.train, family = binomial(link = "logit"))
#
#Coefficients:
#      (Intercept)  predictors.train1  predictors.train2
#           0.5105            -0.2945            -1.0811
#
#Degrees of Freedom: 39 Total (i.e.
Null); 37 Residual #Null Deviance: 55.45 #Residual Deviance: 41.67 AIC: 47.67

# Predicted Probabilities for Test Data #
New.Data <- data.frame(predictors.train1=predictors.test[,1], predictors.train2=predictors.test[,2])
logreg.pred.prob.test <-
RE: [R] Issue with predict() for glm models
Dear Uwe, -Original Message- From: Uwe Ligges [mailto:[EMAIL PROTECTED]] Sent: Thursday, September 23, 2004 8:06 AM To: John Fox Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [R] Issue with predict() for glm models John Fox wrote: Dear Uwe, Unless I've somehow messed this up, as I mentioned yesterday, what you suggest doesn't seem to work when the predictor is a matrix. Here's a simplified example:

X <- matrix(rnorm(200), 100, 2)
y <- (X %*% c(1,2) + rnorm(100)) > 0
dat <- data.frame(y=y, X=X)
mod <- glm(y ~ X, family=binomial, data=dat)
new <- data.frame(X = matrix(rnorm(20), 2))
predict(mod, new)

Dear John, the questioner had a 2-column matrix with 40 and one with 50 observations (not a 100-column matrix with 2 observations), and for those matrices it works... Indeed, and in my example the matrix predictor X has 2 columns and 100 rows; I did screw up the matrix for the new data to be used for predictions (in the example I sent today but not yesterday), but even when this is done right -- where the new data has 10 rows and 2 columns -- there are 100 (not 10) predicted values:

X <- matrix(rnorm(200), 100, 2)   # original predictor matrix with 100 rows
y <- (X %*% c(1,2) + rnorm(100)) > 0
dat <- data.frame(y=y, X=X)
mod <- glm(y ~ X, family=binomial, data=dat)
new <- data.frame(X = matrix(rnorm(20), 10, 2))   # corrected -- note 10 rows
predict(mod, new)   # note 100 predicted values
         1          2           3           4           5          6
5.75238091 0.31874587 -3.00515893 -3.77282121 -1.97511221 0.54712914
         7          8           9          10          11          12
1.85091226 4.38465524 -0.41028694 -1.53942869  0.57613555 -1.82761518
. . .
        91         92          93          94          95         96
0.36210780 1.71358713 -9.63612775 -4.54257576 -5.29740468 2.64363405
        97          98         99         100
-4.45478627 -2.44973209 2.51587537 -4.09584837

Actually, I now see the source of the problem: the data frames dat and new don't contain a matrix named X; rather, the matrix is split column-wise:

names(dat)
[1] "y"   "X.1" "X.2"
names(new)
[1] "X.1" "X.2"

Consequently, both glm and predict pick up the X in the global environment (since there is none in dat or new), which accounts for why there are 100 predicted values. Using list() rather than data.frame() produces the originally expected behaviour:

new <- list(X = matrix(rnorm(20), 10, 2))
predict(mod, new)
         1          2          3          4          5          6          7
 5.9373064  0.3687360 -8.3793045  0.7645584 -2.6773842  2.4130547  0.7387318
         8          9         10
-0.4347916  8.4678728 -0.8976054

Regards, John Best, Uwe
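The column-splitting behaviour John identifies can also be sketched the other way around: write the formula in terms of the names data.frame() actually creates, and predict() then behaves. A minimal sketch (the seed is arbitrary, not from the thread):

```r
# data.frame() splits a matrix column X into variables X.1 and X.2, so a
# formula written in terms of those names finds them in newdata as well.
set.seed(1)
X <- matrix(rnorm(200), 100, 2)
y <- as.integer(X %*% c(1, 2) + rnorm(100) > 0)
dat <- data.frame(y = y, X = X)                   # columns: y, X.1, X.2
mod <- glm(y ~ X.1 + X.2, family = binomial, data = dat)
new <- data.frame(X = matrix(rnorm(20), 10, 2))   # columns: X.1, X.2
length(predict(mod, new))                         # 10, as expected
```

This is equivalent to Uwe's glm(class ~ ., data = train) suggestion: either spell out the split names or let `.` pick them up.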
Basically, I run the glm model on a training data set and try to obtain predictions for a set of new predictors from a test data set (i.e., not the predictors that were utilized to obtain the glm parameter estimates). Unfortunately, every time that I attempt this, I obtain the predictions for the predictors that were used to fit the glm model. I have looked at the R mailing list archives and don't believe I am making the same mistakes that have been made in the past and also have tried to closely follow the predict.glm example in the help file. Here is an example
RE: [R] Issue with predict() for glm models
Dear Uwe, -Original Message- From: Uwe Ligges [mailto:[EMAIL PROTECTED] Sent: Thursday, September 23, 2004 11:37 AM To: John Fox Cc: [EMAIL PROTECTED] Subject: Re: [R] Issue with predict() for glm models . . . John, note that I used glm(y ~ .) (the dot!), because the names are automatically chosen to be X.1 and X.2, hence you cannot use X in the formula in this case ... Best, Uwe OK -- I see. I did notice that you used . in the formula but didn't make the proper connection! Thanks, John __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
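Another way to keep a matrix predictor intact inside a data frame is I(), which marks the matrix "as is" so that data.frame() does not split it into X.1 and X.2; here is a quick sketch with fresh simulated data (not the objects from the messages above):

```r
set.seed(1)
X <- matrix(rnorm(200), 100, 2)
y <- drop(X %*% c(1, 2) + rnorm(100)) > 0
dat <- data.frame(y = y, X = I(X))   # I() keeps X as a single matrix column
names(dat)                           # "y" "X" -- not "y" "X.1" "X.2"
mod <- glm(y ~ X, family = binomial, data = dat)
new <- data.frame(X = I(matrix(rnorm(20), 10, 2)))
length(predict(mod, new))            # 10 predictions, as originally expected
```

With I(), the formula y ~ X finds the matrix inside the data frame, so predict() uses the 10 rows of the new data rather than falling back to the X in the global environment.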
[R] RE: using tcltk in R under ESS/XEmacs on Windows
Dear Andy,

Perhaps it's possible, but I've never been able to get tcltk to work properly under ESS/XEmacs on Windows.

Regards,
John

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Liaw, Andy
Sent: Friday, September 24, 2004 3:03 PM
To: [EMAIL PROTECTED]; ESS (Help list)
Subject: using tcltk in R under ESS/XEmacs on Windows

Sorry for the cross-post. Not sure where the problem is... A while back I posted an R function to R-help:

cd <- function (dir = tclvalue(tkchooseDirectory()), saveOld = FALSE,
                loadNew = TRUE) {
    stopifnot(require(tcltk))
    if (saveOld) save.image(compress = TRUE)
    setwd(dir)
    rm(list = ls(all = TRUE, envir = .GlobalEnv), envir = .GlobalEnv)
    if (loadNew && file.exists(".RData")) {
        loaded <- load(".RData", envir = .GlobalEnv)
        return(invisible(loaded))
    }
}

where the default value for the `dir' argument is to run the tcltk directory chooser and get the directory name chosen. (Thanks to Prof. John Fox for the tcltk part!!) While this function works fine under Rgui on Windows, it doesn't work when running R within ESS (5.2.3) and XEmacs (21.4.13). The directory chooser never shows up, and dir just gets the empty string. Does anyone have any idea what could be the problem? I'd very much appreciate any pointers.

Best,
Andy

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/ess-help
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] RE: using tcltk in R under ESS/XEmacs on Windows
Dear Andy,

I've taken the liberty of copying this reply to the r-help and ess-help lists, since my last reply was misleading. I believe that the problem with tkchooseDirectory() (you can see it by just calling that function directly) is that it brings up a native Windows dialog, and it's that kind of tcltk command that I've had trouble with in ESS/XEmacs under Windows.

Regards,
John

-----Original Message-----
From: Liaw, Andy [mailto:[EMAIL PROTECTED]]
Sent: Friday, September 24, 2004 3:23 PM
To: 'John Fox'
Subject: RE: using tcltk in R under ESS/XEmacs on Windows

Hi John,

demo(tkfaq) did work for me. Maybe that's the exception...

Thanks!
Andy

From: John Fox

Dear Andy,

Perhaps it's possible, but I've never been able to get tcltk to work properly under ESS/XEmacs on Windows.

Regards,
John

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Liaw, Andy
Sent: Friday, September 24, 2004 3:03 PM
To: [EMAIL PROTECTED]; ESS (Help list)
Subject: using tcltk in R under ESS/XEmacs on Windows

Sorry for the cross-post. Not sure where the problem is... A while back I posted an R function to R-help:

cd <- function (dir = tclvalue(tkchooseDirectory()), saveOld = FALSE,
                loadNew = TRUE) {
    stopifnot(require(tcltk))
    if (saveOld) save.image(compress = TRUE)
    setwd(dir)
    rm(list = ls(all = TRUE, envir = .GlobalEnv), envir = .GlobalEnv)
    if (loadNew && file.exists(".RData")) {
        loaded <- load(".RData", envir = .GlobalEnv)
        return(invisible(loaded))
    }
}

where the default value for the `dir' argument is to run the tcltk directory chooser and get the directory name chosen. (Thanks to Prof. John Fox for the tcltk part!!) While this function works fine under Rgui on Windows, it doesn't work when running R within ESS (5.2.3) and XEmacs (21.4.13). The directory chooser never shows up, and dir just gets the empty string. Does anyone have any idea what could be the problem? I'd very much appreciate any pointers.
Best, Andy __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/ess-help -- Notice: This e-mail message, together with any attachments, contains information of Merck Co., Inc. (One Merck Drive, Whitehouse Station, New Jersey, USA 08889), and/or its affiliates (which may be known outside the United States as Merck Frosst, Merck Sharp Dohme or MSD and in Japan, as Banyu) that may be confidential, proprietary copyrighted and/or legally privileged. It is intended solely for the use of the individual or entity named on this message. If you are not the intended recipient, and have received this message in error, please notify us immediately by reply e-mail and then delete it from your system. -- __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Diagnosing trouble with R-2.0, Fedora Core 2, and Rcmdr
Dear Paul,

I'm currently out of town, limiting a bit my ability to answer these questions definitively, but I'll give it a shot (hoping that my memory isn't faulty):

On Mon, 11 Oct 2004 13:08:39 -0500 Paul Johnson [EMAIL PROTECTED] wrote:

Greetings, R-help!

On 2 Fedora Core 2 Linux systems, I've completely erased the previous R and all packages and then installed R-2.0 and installed fresh packages. In using Rcmdr, I see some trouble and I wonder if other people see this and if it is due to the tcl/tk, or R, or Rcmdr. (If readers have not yet tried Rcmdr, I recommend it not just because of the GUI it provides, but also because Prof. Fox has created several very handy commands, like scatter3d, which make it possible to use advanced packages like rgl. Rcmdr provides several very handy wrapper commands that just work and users don't have to fuss over so many details! And, if you are new at R, you can use pull-down menus and then Rcmdr displays the commands that correspond with those menu options. Very educational!)

In no particular order, here are the issues I see with R-2.0 and Rcmdr:

1. In the panel to select a dataset from a package (menu: Data/Data in packages), I can type in the name of a dataset, but the GUI will not let me choose the name of a package from which to select a dataset. Nothing happens when I pick a package name. (With previous R/Rcmdr, I did not have this trouble.)

The problem is that the Rcmdr code doesn't quote the name of the package. I believe that I've fixed this problem, but haven't posted the fixed version either to my web site or to CRAN.

2. When using Rcmdr, some of the output from commands shows in the Rcmdr bottom window, but some is sent to the xterm in which R was started. For example, with a scatter3d() command, if I add the option model.summary=T, the model summary shows in the xterm. But some other models show results in the bottom pane of Rcmdr.
And, it seems to me, there are some commands in which I've noticed some of the output going to the Rcmdr bottom buffer and some to the xterm.

As I recall, I've fixed this problem as well, but perhaps not in a completely general way. Again, I haven't yet released the fixed version.

3. When I want to exit Rcmdr and choose the pull-down exit from Commander and R, I get a Segmentation Fault that closes Rcmdr and R. This I haven't seen before.

4. 3d plots do not display on the screen until I click in the rgl window that displays the graph. This might be a video card/AGP driver problem, I suppose. These systems have the NVidia FX5200 cards and the NVidia proprietary X driver. If other users don't see the same on other kinds of systems, I guess that will tell me where the problem lies.

I believe that I've noticed this problem on a Quantian system.

5. At random times, I get an error tcl/tk window popping up with a message about MASS and categorical variables. It says

Error in data(package=parse(text=packageName)) : 'MASS' must be character string or NULL

It says that over and over; sometimes it has other package names, as in:

Error in data(package=parse(text=packageName)) : 'relimp' must be character string or NULL

These seem to be harmless?

This reflects the first problem, and is produced by the way that the Rcmdr initially suppresses and then reports warnings. I'll check further when I return home.

Sorry for the problems.

John

-- Paul E. Johnson email: [EMAIL PROTECTED] Dept. of Political Science http://lark.cc.ku.edu/~pauljohn 1541 Lilac Lane, Rm 504 University of Kansas Office: (785) 864-9086 Lawrence, Kansas 66044-3177 FAX: (785) 864-5700

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Diagnosing trouble with R-2.0, Fedora Core 2, and Rcmdr
Dear Paul,

I've now had a chance to check into these problems further:

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Paul Johnson
Sent: Monday, October 11, 2004 1:09 PM
To: r help
Subject: [R] Diagnosing trouble with R-2.0, Fedora Core 2, and Rcmdf

Greetings, R-help!

On 2 Fedora Core 2 Linux systems, I've completely erased the previous R and all packages and then installed R-2.0 and installed fresh packages. In using Rcmdr, I see some trouble and I wonder if other people see this and if it is due to the tcl/tk, or R, or Rcmdr. (If readers have not yet tried Rcmdr, I recommend it not just because of the GUI it provides, but also because Prof. Fox has created several very handy commands, like scatter3d, which make it possible to use advanced packages like rgl. Rcmdr provides several very handy wrapper commands that just work and users don't have to fuss over so many details! And, if you are new at R, you can use pull-down menus and then Rcmdr displays the commands that correspond with those menu options. Very educational!)

In no particular order, here are the issues I see with R-2.0 and Rcmdr:

1. In the panel to select a dataset from a package (menu: Data/Data in packages), I can type in the name of a dataset, but the GUI will not let me choose the name of a package from which to select a dataset. Nothing happens when I pick a package name. (With previous R/Rcmdr, I did not have this trouble.)

I had in fact fixed this in the development version of the package (Rcmdr 0.9-12). I've now posted that to my web site, at http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/index.html. I'd like to do some more testing before sending this version to CRAN.

2. When using Rcmdr, some of the output from commands shows in the Rcmdr bottom window, but some is sent to the xterm in which R was started. For example, with a scatter3d() command, if I add the option model.summary=T, the model summary shows in the xterm.
But some other models show results in the bottom pane of Rcmdr. And, it seems to me, there are some commands in which I've noticed some of the output going to the Rcmdr bottom buffer and some to the xterm.

I did in fact try to deal with this problem. The solution that I employed didn't work for scatter3d() with model.summary=TRUE, since the printed summaries were produced as side effects by scatter3d(). That's a bad practice, and I've fixed it; now scatter3d() returns a list of summaries, and the Rcmdr can cope with that. (As you know, it's not my intention that model.summary=TRUE be set from the Rcmdr GUI, but it's better to have it work if you enter this from the script editor.) You imply, however, that there are other cases of this problem. It would help to know what they are -- assuming that they're not fixed in the development version now on my web site.

3. When I want to exit Rcmdr and choose the pull-down exit from Commander and R, I get a Segmentation Fault that closes Rcmdr and R.

I haven't been able to duplicate this problem, to this point testing only under Windows XP. Does this occur only when an rgl graphics device is open? (The Rcmdr should close the rgl device before it exits.)

4. 3d plots do not display on the screen until I click in the rgl window that displays the graph. This might be a video card/AGP driver problem, I suppose. These systems have the NVidia FX5200 cards and the NVidia proprietary X driver. If other users don't see the same on other kinds of systems, I guess that will tell me where the problem lies.

As I said, I believe that I've seen this problem before under Quantian. It would help to know whether it's peculiar to the Rcmdr, scatter3d(), or rgl.

5. At random times, I get an error tcl/tk window popping up with a message about MASS and categorical variables.
It says

Error in data(package=parse(text=packageName)) : 'MASS' must be character string or NULL

It says that over and over; sometimes it has other package names, as in:

Error in data(package=parse(text=packageName)) : 'relimp' must be character string or NULL

These seem to be harmless?

This should go away now that problem 1 is fixed.

Regards,
John

-- Paul E. Johnson email: [EMAIL PROTECTED] Dept. of Political Science http://lark.cc.ku.edu/~pauljohn 1541 Lilac Lane, Rm 504 University of Kansas Office: (785) 864-9086 Lawrence, Kansas 66044-3177 FAX: (785) 864-5700

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Seeking advice on introducing R
Dear Arin,

I have a variety of materials at http://socserv.socsci.mcmaster.ca/jfox/Courses/S-course/index.html that may be helpful.

Regards,
John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of data Analytics
Sent: Wednesday, October 13, 2004 8:19 AM
To: [EMAIL PROTECTED]
Subject: [R] Seeking advice on introducing R

Dear List:

This is in search of advice/opinions for a lecture presentation on R. I have this presentation scheduled on Saturday (16-October-2004); the topic is introduction to R. The audience is a group of researchers and scholars who have either never used R or are beginning to use R, but otherwise have used a package called limdeb, and SPSS. There is also a request to cover Rgnumeric, and Excel bindings to R. I have no knowledge of limdeb.

Would greatly appreciate your advice and opinions, and I look forward to learning from your experiences in introducing R to similar audiences. Also, pointers to web resources would be greatly appreciated.

TIA,
Arin Basu
Kolkata, India

[[alternative HTML version deleted]]

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Testing for normality of residuals in a regression model
Dear Federico,

A problem with applying a standard test of normality to LS residuals is that the residuals are correlated and heteroskedastic even if the standard assumptions of the model hold. In a large sample, this is unlikely to be problematic (unless there's an unusual data configuration), but in a small sample the effect could be nontrivial.

One approach is to use BLUS residuals, which transform the LS residuals into a smaller set of uncorrelated, homoskedastic residuals (assuming the correctness of the model). A search of R resources didn't turn up anything for BLUS, but they shouldn't be hard to compute. This is a standard topic covered in many econometrics texts.

You might consider the alternative of generating a bootstrapped confidence envelope for the QQ plot; the qq.plot() function in the car package will do this for a linear model.

I hope this helps,
John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Federico Gherardini
Sent: Friday, October 15, 2004 7:44 AM
To: [EMAIL PROTECTED]
Subject: [R] Testing for normality of residuals in a regression model

Hi all,

Is it possible to have a test value for assessing the normality of residuals from a linear regression model, instead of simply relying on qqplots? I've tried to use fitdistr to try and fit the residuals with a normal distribution, but fitdistr only returns the parameters of the distribution and the standard errors, not the p-value. Am I missing something?

Cheers,
Federico

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
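For anyone wanting to try the QQ-plot suggestion directly, here is a minimal sketch using the Duncan data that ships with car. Note that the function was called qq.plot() in car versions of that era and is qqPlot() in later versions, so the sketch tries both names:

```r
library(car)  # provides the Duncan occupational-prestige data
mod <- lm(prestige ~ income + education, data = Duncan)
# QQ plot of studentized residuals with a simulated (bootstrap) confidence
# envelope; points outside the envelope suggest non-normal errors
if (exists("qq.plot")) qq.plot(mod, simulate = TRUE) else qqPlot(mod, simulate = TRUE)
```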
RE: [R] one more Rcmdr problem
Dear Adrian, Since I moved the discussion of this problem to the r-devel list, I should have reported back to r-help (1) that the problem was general to tlctk (not just the Rcmdr package), and (2) that Duncan Murdoch made a change to the patched Windows version of R 2.0.0 that made the problem go away. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Adrian Dusa Sent: Friday, October 15, 2004 11:44 AM To: 'Erich Neuwirth'; 'John Fox'; 'Duncan Murdoch' Cc: [EMAIL PROTECTED] Subject: RE: [R] one more Rcmdr problem Brilliant. With R 2.0.0 patched and Rcmdr 0.9-12 all problems are gone. Thank you all, and a special thank you for the R Commander; it saves a lot of effort for teaching purposes. Best regards, Adrian -Original Message- From: Erich Neuwirth [mailto:[EMAIL PROTECTED] Sent: 15 octombrie 2004 18:27 To: Adrian Dusa Cc: [EMAIL PROTECTED] Subject: Re: [R] one more Rcmdr problem I did experience the same problem. After installing R 2.0.0 patched and downloading the source for Rcmdr_0.99-12 from John Fox's Web page http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/ and recompiling the package it now works. Ctrl-C, Ctrl-X, and Ctrl-V, in the text boxes of Rcmdr now work. Adrian Dusa wrote: Hello, I'm using R 2.0.0 with the latest Rcmdr package installed from CRAN, on Windows XP Professional. When trying to copy some commands or results, either from the upper or lower text window, this causes Rcmdr to crash: R for Windows GUI front-end has encountered a problem and needs to close Did anyone have the same problem? I don't think it's my system, as it happened to reinstall my Windows just a few days ago, and the same problem occurred in the former one. Regards, Adrian Adrian Dusa Romanian Social Data Archive 1 Schitu Magureanu Bd. 050025 Bucharest sector 5 Tel. 
+40 21 3126618\ +40 21 3153122/ int.101 -- Erich Neuwirth, Computer Supported Didactics Working Group Visit our SunSITE at http://sunsite.univie.ac.at Phone: +43-1-4277-38624 Fax: +43-1-4277-9386 -- This message was scanned for spam and viruses by BitDefender For more information please visit http://linux.bitdefender.com/ -- This message was scanned for spam and viruses by BitDefender For more information please visit http://linux.bitdefender.com/ __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Testing for normality of residuals in a regression model
Dear Kjetil,

I don't believe that these are BLUS residuals, but since the last n - r effects are projections onto an orthogonal basis for the residual subspace, they should do just fine (as long as the basis vectors have the same length, which I think is the case, but perhaps someone can confirm). The general idea is to transform the LS residuals into an uncorrelated, equal-variance set.

Regards,
John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: Kjetil Brinchmann Halvorsen [mailto:[EMAIL PROTECTED]]
Sent: Friday, October 15, 2004 9:12 AM
To: John Fox
Cc: 'Federico Gherardini'; [EMAIL PROTECTED]
Subject: Re: [R] Testing for normality of residuals in a regression model

John Fox wrote:

Dear Federico,

A problem with applying a standard test of normality to LS residuals is that the residuals are correlated and heteroskedastic even if the standard assumptions of the model hold. In a large sample, this is unlikely to be problematic (unless there's an unusual data configuration), but in a small sample the effect could be nontrivial. One approach is to use BLUS residuals, which transform the LS residuals to a smaller set of uncorrelated, homoskedastic residuals (assuming the correctness of the model).

I'm not sure if these are BLUS residuals, but the following function transforms to a smaller set of independent, homoscedastic residuals and then calls shapiro.test. (I've proposed making this a method of shapiro.test for lm objects, but it was not accepted.)

shapiro.test.lm <- function (obj) {
    eff <- effects(obj)
    rank <- obj$rank
    df.r <- obj$df.residual
    if (df.r < 3)
        stop("Too few residual degrees of freedom for the test.")
    data.name <- deparse(substitute(obj))
    x <- eff[-(1:rank)]
    res <- shapiro.test(x)
    res$data.name <- data.name
    res$method <- paste(res$method, "for residuals of linear model")
    res
}

Kjetil

A search of R resources didn't turn up anything for BLUS, but they shouldn't be hard to compute. This is a standard topic covered in many econometrics texts. You might consider the alternative of generating a bootstrapped confidence envelope for the QQ plot; the qq.plot() function in the car package will do this for a linear model.

I hope this helps,
John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Federico Gherardini
Sent: Friday, October 15, 2004 7:44 AM
To: [EMAIL PROTECTED]
Subject: [R] Testing for normality of residuals in a regression model

Hi all,

Is it possible to have a test value for assessing the normality of residuals from a linear regression model, instead of simply relying on qqplots? I've tried to use fitdistr to try and fit the residuals with a normal distribution, but fitdistr only returns the parameters of the distribution and the standard errors, not the p-value. Am I missing something?

Cheers,
Federico

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

-- Kjetil Halvorsen. Peace is the most effective weapon of mass construction. -- Mahdi Elmandjra

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
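To see the effects()-based idea numerically, here is a self-contained sketch with simulated data (the model and sample size are arbitrary choices for illustration):

```r
set.seed(42)
x <- rnorm(30)
y <- 1 + 2 * x + rnorm(30)
m <- lm(y ~ x)
# the first `rank` elements of effects() belong to the fitted terms; the
# remaining n - rank elements are uncorrelated, equal-variance projections
# of the errors onto an orthogonal basis for the residual subspace
eff <- effects(m)[-(1:m$rank)]
length(eff)        # n - rank = 30 - 2 = 28 transformed residuals
shapiro.test(eff)  # Shapiro-Wilk applied to the transformed residuals
```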
RE: [R] Testing for normality of residuals in a regression model
Dear Federico, The problem is the same with GLS residuals -- even if the GLS transformation produces homoskedastic errors, the residuals will be correlated and heteroskedastic (with this problem tending to disappear in most instances as n grows). The central point is that residuals don't behave quite the same as errors. Regards, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Federico Gherardini Sent: Friday, October 15, 2004 11:22 AM To: [EMAIL PROTECTED] Subject: Re: [R] Testing for normality of residuals in a regression model Thank you very much for your suggestions! The residuals come from a gls model, because I had to correct for heteroscedasticity using a weighted regression... can I simply apply one of these tests (like shapiro.test) to the standardized residuals from my gls model? Cheers, Federico __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Testing for normality of residuals in a regression model
Dear Andy,

At the risk of muddying the waters (and certainly without wanting to advocate the use of normality tests for residuals), I believe that your point #4 is subject to misinterpretation: that is, while it is true that t- and F-tests for regression coefficients in large samples retain their validity well when the errors are non-normal, the efficiency of the LS estimates can (depending upon the nature of the non-normality) be seriously compromised, not only absolutely but in relation to alternatives (e.g., robust regression).

Regards,
John

John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Liaw, Andy
Sent: Friday, October 15, 2004 11:55 AM
To: 'Federico Gherardini'; Berton Gunter
Cc: R-help mailing list
Subject: RE: [R] Testing for normality of residuals in a regression model

Let's see if I can get my stat 101 straight. We learned that linear regression has a set of assumptions:

1. Linearity of the relationship between X and y.
2. Independence of errors.
3. Homoscedasticity (equal error variance).
4. Normality of errors.

Now, we should ask: Why are they needed? Can we get away with less? What if some of them are not met?

It should be clear why we need #1. Without #2, I believe the least squares estimator is still unbiased, but the usual estimates of SEs for the coefficients are wrong, so the t-tests are wrong. Without #3, the coefficients are, again, still unbiased, but not as efficient as they can be. Interval estimates for the prediction will surely be wrong. Without #4, well, it depends: if the residual DF is sufficiently large, the t-tests are still valid because of the CLT. You do need normality if you have small residual DF.

The problem with normality tests, I believe, is that they usually have fairly low power at small sample sizes, so that doesn't quite help.
There's no free lunch: a normality test with good power will usually have good power against a fairly narrow class of alternatives, and almost no power against others (a directional test). How do you decide what to use? Has anyone seen a data set where the normality test on the residuals is crucial in coming up with the appropriate analysis?

Cheers,
Andy

From: Federico Gherardini

Berton Gunter wrote:

Exactly! My point is that normality tests are useless for this purpose for reasons that are beyond what I can take up here.

Thanks for your suggestions, I understand that! Could you possibly give me some (not too complicated!) links so that I can investigate this matter further?

Cheers,
Federico

Hints: Balanced designs are robust to non-normality; independence (especially clustering of subjects due to systematic effects), not normality, is usually the biggest real statistical problem; hypothesis tests will always reject when samples are large -- so what!; trust refers to prediction validity, which has to do with study design and the validity/representativeness of the current data for the future. I know that all the stats 101 texts say to test for normality, but they're full of baloney! Of course, this is free advice -- so caveat emptor!

Cheers,
Bert

__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
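Andy's point about low power at small sample sizes is easy to check by simulation; a sketch (the t_5 alternative, the 5% level, and the replication count are arbitrary choices for illustration, not from the thread):

```r
set.seed(1)
# estimated power of the Shapiro-Wilk test against t_5 errors at the 5% level
sw.power <- function(n, reps = 1000)
    mean(replicate(reps, shapiro.test(rt(n, df = 5))$p.value < 0.05))
sw.power(10)   # at n = 10, power is barely above the nominal level
sw.power(100)  # at n = 100, the same alternative is detected far more often
```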
RE: [R] one more Rcmdr problem
Dear Adrian, I've just been able to verify this problem. It occurs for me when I try to copy via Ctrl-C, but not when I use the Rcmdr Edit - Copy menu, nor when I use the right-click context menu. As near as I can tell, the problem is general to all Ctrl-key combinations. I don't have time at the moment to investigate further, but I suspect a problem between tcltk and Rgui. I'm copying this response to the r-devel list, since that might be a better place to pursue the problem. Sorry for the trouble. John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Adrian Dusa Sent: Wednesday, October 13, 2004 6:35 AM To: [EMAIL PROTECTED] Subject: [R] one more Rcmdr problem Hello, I'm using R 2.0.0 with the latest Rcmdr package installed from CRAN, on Windows XP Professional. When trying to copy some commands or results, either from the upper or lower text window, this causes Rcmdr to crash: R for Windows GUI front-end has encountered a problem and needs to close Did anyone have the same problem? I don't think it's my system, as it happened to reinstall my Windows just a few days ago, and the same problem occurred in the former one. Regards, Adrian Adrian Dusa Romanian Social Data Archive 1 Schitu Magureanu Bd. 050025 Bucharest sector 5 Tel. +40 21 3126618\ +40 21 3153122/ int.101 -- This message was scanned for spam and viruses by BitDefender For more information please visit http://linux.bitdefender.com/ __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] multinom object :way of plotting??
Dear Camille, You might be interested in a paper (available at http://socserv.socsci.mcmaster.ca/jfox/logit-effect-displays.pdf) that Bob Andersen and I wrote on this topic. The paper deals with graphing multinomial-logit and proportional-odds models. Regards, John On 27 Sep 2004 18:16:53 +0100 Camille Szmaragd [EMAIL PROTECTED] wrote: Dear all, I'm fitting a multinom function to my dataset (multinom(outcome~age+K+D)) and I need to present my results on a poster. Does someone know a nice way of doing that? I think I saw in an archive that you cannot plot a multinom.object, is it true? Thank you by advance for your help, Cheers Camille __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html John Fox Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] polr (MASS) and lrm (Design) differences in tests of statistical significance
Dear Paul,

I tried polr() and lrm() on a different problem and (except for the difference in signs for the cut-points/intercepts) got identical results for both coefficients and standard errors. There might be something ill-conditioned about your problem that produces the discrepancy -- I noticed, for example, that some of the upper categories of the response are very sparse. Perhaps the two functions use different forms of the information matrix. I expect that someone else will be able to supply more details. I believe that the t-statistics in the polr() output are actually Wald statistics.

I hope this helps,
John

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Johnson
Sent: Thursday, September 30, 2004 4:41 PM
To: r-help
Subject: [R] polr (MASS) and lrm (Design) differences in tests of statistical significance

Greetings: I'm running R-1.9.1 on Fedora Core 2 Linux. I tested a proportional-odds logistic regression with MASS's polr and Design's lrm. Parameter estimates between the two are consistent, but the standard errors are quite different, and the conclusions from the t and Wald tests are dramatically different. I cranked the abstol argument up quite a bit in the polr fit and it did not make the differences go away. So:

1. Can you help me see why the std. errors in the polr output are so much smaller?
2. Can I hear more opinions on the question of t vs. Wald in making these significance tests? So far, I understand the t is based on the asymptotic normality of the estimate of b, and for finite samples b/se is not exactly distributed as a t. But I also had the impression that the Wald value was an approximation as well.

summary(polr(as.factor(RENUCYC) ~ DOCS + PCT65PLS*RANNEY2 + OLDCRASH + FISCAL2 + PCTMETRO + ADMLICEN, data=elaine1))

Re-fitting to get Hessian

Call:
polr(formula = as.factor(RENUCYC) ~ DOCS + PCT65PLS * RANNEY2 +
    OLDCRASH + FISCAL2 + PCTMETRO + ADMLICEN, data = elaine1)

Coefficients:
                         Value   Std. Error    t value
DOCS               0.004942217  0.002952001   1.674192
PCT65PLS           0.454638558  0.113504288   4.005475
RANNEY2            0.110473483  0.010829826  10.200855
OLDCRASH           0.139808663  0.042245692   3.309418
FISCAL2            0.025592117  0.011465812   2.232037
PCTMETRO           0.018184093  0.007792680   2.333484
ADMLICEN          -0.028490387  0.011470999  -2.483688
PCT65PLS:RANNEY2  -0.008559228  0.001456543  -5.876400

Intercepts:
        Value    Std. Error  t value
2|3      6.6177  0.3019      21.9216
3|4      7.1524  0.2773      25.7938
4|5     10.5856  0.2149      49.2691
5|6     12.2132  0.1858      65.7424
6|8     12.2704  0.1856      66.1063
8|10    13.0345  0.2184      59.6707
10|12   13.9801  0.3517      39.7519
12|18   14.6806  0.5587      26.2782

Residual Deviance: 587.0995
AIC: 619.0995

lrm(RENUCYC ~ DOCS + PCT65PLS*RANNEY2 + OLDCRASH + FISCAL2 + PCTMETRO + ADMLICEN, data=elaine1)

Logistic Regression Model

lrm(formula = RENUCYC ~ DOCS + PCT65PLS * RANNEY2 + OLDCRASH +
    FISCAL2 + PCTMETRO + ADMLICEN, data = elaine1)

Frequencies of Responses
 2  3   4  5 6  8 10 12 18
21 12 149 46 1 10  6  2  2

Frequencies of Missing Values Due to Each Variable
RENUCYC  DOCS  PCT65PLS  RANNEY2  OLDCRASH  FISCAL2  PCTMETRO  ADMLICEN
      5     0         0        6         0        5         0         5

Obs  Max Deriv  Model L.R.  d.f.  P  C      Dxy    Gamma  Tau-a  R2    Brier
249  7e-05      56.58       8     0  0.733  0.465  0.47   0.278  0.22  0.073

                         Coef       S.E.      Wald Z  P
y=3                  -6.617857   6.716688   -0.99   0.3245
y=4                  -7.152561   6.716571   -1.06   0.2869
y=5                 -10.585705   6.74       -1.57   0.1164
y=6                 -12.213340   6.755656   -1.81   0.0706
y=8                 -12.270506   6.755571   -1.82   0.0693
y=10                -13.034584   6.756829   -1.93   0.0537
y=12                -13.980235   6.767724   -2.07   0.0389
y=18                -14.680760   6.786639   -2.16   0.0305
DOCS                  0.004942   0.002932    1.69   0.0918
PCT65PLS              0.454653   0.552430    0.82   0.4105
RANNEY2               0.110475   0.076438    1.45   0.1484
OLDCRASH              0.139805   0.042104    3.32   0.0009
FISCAL2               0.025592   0.011374    2.25   0.0245
PCTMETRO              0.018184   0.007823    2.32   0.0201
ADMLICEN             -0.028490   0.011576   -2.46   0.0138
PCT65PLS * RANNEY2   -0.008559   0.006417   -1.33   0.1822

--
Paul E. Johnson                      email: [EMAIL PROTECTED]
Dept. of Political Science           http://lark.cc.ku.edu/~pauljohn
1541 Lilac Lane, Rm 504
University of Kansas                 Office: (785) 864-9086
Lawrence, Kansas 66044-3177          FAX: (785) 864-5700
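[Editorial note: John's two points -- that polr()'s "t value" column is really a Wald z statistic (estimate divided by standard error), and that polr() and lrm() differ only in the sign convention on the cut-points -- can be checked on a small reproducible example. A minimal sketch using the housing data that ships with MASS; Design is the older name of what is now the rms package, so the lrm() comparison is left as a comment in case it is not installed.]

```r
# Sketch: proportional-odds fit with polr(); the reported "t value"
# is a Wald z statistic, i.e. Value / Std. Error.
library(MASS)

fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
            Hess = TRUE)
ctab <- summary(fit)$coefficients   # columns: Value, Std. Error, t value

wald.z <- ctab[, "Value"] / ctab[, "Std. Error"]
p.vals <- 2 * pnorm(-abs(wald.z))   # two-sided normal approximation
print(cbind(ctab, "p value" = p.vals))

# For comparison (Design, nowadays rms):
# library(rms)
# lrm(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
# Note that lrm() reports the intercepts with the opposite sign from
# polr()'s cut-points, as John describes.
```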