date:20100214

Re: [R] histbackback

2010-02-14 Thread Iasonas Lamprianou

Dear all, 
does anyone know how I can run the histbackback function from R Commander? I am 
going to teach this in my class and I hesitate to ask my students to write code

Thanks
Jason




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem in performing goodness of fit test in R.

2010-02-14 Thread Bernardo Rangel Tura

On Sun, 2010-02-14 at 12:42 +0500, Faiz Rasool wrote:
 I am trying to perform goodness of fit test using R. I am using this website 
 http://wiener.math.csi.cuny.edu/Statistics/R/simpleR/stat013.html for help. 
 However, I am unable to carry out the test successfully. My code follows. It 
 is taken from the website just mentioned. 
 freq=c(22,21,22,27,22,36) # frequencies obtained after rolling the dice 150 
 times.
 prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category.
 chisq.test(freq,p=prob) # I do not know what this line means. I just followed 
 instructions on the website.
 The erorr I receive is erorr in chisq.test(freq,p=prob)/6 probabilities must 
 sum to 1 
 
 I am very new to R, so any help would be appreciated. 
 Faiz.

Faiz,


Well ... 
In my computer( Phenom X4 9650, runing Ubuntu 9.10 and R 2.10.1) the
script work


 freq=c(22,21,22,27,22,36) # frequencies obtained after rolling the
dice 150 times.
 prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category.
 chisq.test(freq,p=prob) # I do not know what this line means

Chi-squared test for given probabilities

data:  freq 
X-squared = 6.72, df = 5, p-value = 0.2423


About the third line You must read ?chisq.test for better know the
command, but you execute one chi-square test with uniform probability
distribution 



-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem in performing goodness of fit test in R.

2010-02-14 Thread Ted Harding

On 14-Feb-10 07:42:12, Faiz Rasool wrote:
 I am trying to perform goodness of fit test using R. I am using this
 website
 http://wiener.math.csi.cuny.edu/Statistics/R/simpleR/stat013.html
 for help. However, I am unable to carry out the test successfully. My
 code follows. It is taken from the website just mentioned. 
 
  freq=c(22,21,22,27,22,36) # frequencies obtained after
# rolling the dice 150 times.
  prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category.
  chisq.test(freq,p=prob) # I do not know what this line means.
  # I just followed instructions on the website.
 
 The erorr I receive is erorr in chisq.test(freq,p=prob)/6
 probabilities must sum to 1 
 
 I am very new to R, so any help would be appreciated. 
 Faiz.

I suspect that you must have made an error in entering the commands
into R. Prime suspect: You did not have 6 1's in p -- for example
you may have put

  prob=c(1,1,1,1,1)/6

(with only five). I copied your code (as trivially reformatted above)
straight into R using copypaste with the mouse, with results:

  freq=c(22,21,22,27,22,36) # frequencies obtained after
# rolling the dice 150 times.
  prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category.
  chisq.test(freq,p=prob) # I do not know what this line means.

  #Chi-squared test for given probabilities
  # data:  freq 
  # X-squared = 6.72, df = 5, p-value = 0.2423
  # I just followed instructions on the website.

So it worked as it should work. Therefore something went wrong
when you entered the code. Check your input!

Hoping this helps,
Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 14-Feb-10   Time: 09:02:01
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem in performing goodness of fit test in R.

2010-02-14 Thread Joshua Wiley

Dear Faiz,

On Sat, Feb 13, 2010 at 11:42 PM, Faiz Rasool fai...@gmail.com wrote:

 The erorr I receive is erorr in chisq.test(freq,p=prob)/6 probabilities
 must sum to 1


This error message seems a little off.  It looks like the entire chi squared
test was divided by 6.  Notice that the /6 occurs *after* the closing
parenthesis for the test.

As others have noted, it is probably an input problem, but here is another
sample of code you could try.

## all of the data is entered directly into the function.  x is the data, p
are the probabilities.
## the biggest difference is rather than dividing by 6 to make the
probabilities sum to 1, the argument rescale.p=T will make sure that they
sum to 1

chisq.test(x=c(22,21,22,27,22,36), p=c(1,1,1,1,1,1), rescale.p=T)

Chi-squared test for given probabilities
data:  c(22, 21, 22, 27, 22, 36)
X-squared = 6.72, df = 5, p-value = 0.2423




 I am very new to R, so any help would be appreciated.
 Faiz.

Best of luck to you!  I have found the list to be very helpful and
informative.

Joshua



-- 
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem in performing goodness of fit test in R.

2010-02-14 Thread Ted Harding

Good that you got it to work somehow!

Faiz: can you report the result, exactly as returned by R, of

  sum(c(1,1,1,1,1,1)/6) - 1

??

On my Linux with R version 2.10.0 (2009-10-26) this gives 0.
(And, by the way, did you really mean to say you were using
R 2.1.01? Was this a typing error for R 2.10.1?)

Ted.

On 14-Feb-10 09:26:00, Faiz Rasool wrote:
 Hi  all,
 
 Following Denis's code, I am able to carry out goodness of fit test.
 
 However the responses I have received, indicate that the code I
 mentioned in 
 a prior email was not incorrect. inclusion of rep(1,6)/6 made a
 difference 
 for me. I am using windows xp and R 2.1.01. Does difference of
 operating 
 system makes a difference? As without rep(1,6)/6 R would not carry out 
 chisq.test on my computer. The code I got it to work follows.
 
 freq=c(22,21,22,27,22,36)#frequencies obtained after rolling the dice
 150 
 times.
 prob=rep(1,6)/6#change made after Dennis's suggestion.
 chisq.test(freq,p=prob)#no error message received now.
 
 Chi-squared test for given probabilities
 
 data:  freq
 X-squared = 6.72, df = 5, p-value = 0.2423
 
 thanks everyone,
 Faiz.
 
 - Original Message - 
 From: Ted Harding ted.hard...@manchester.ac.uk
 To: R-help@r-project.org
 Cc: Faiz Rasool fai...@gmail.com
 Sent: Sunday, February 14, 2010 2:02 PM
 Subject: RE: [R] Problem in performing goodness of fit test in R.
 
 
 On 14-Feb-10 07:42:12, Faiz Rasool wrote:
 I am trying to perform goodness of fit test using R. I am using this
 website
 http://wiener.math.csi.cuny.edu/Statistics/R/simpleR/stat013.html
 for help. However, I am unable to carry out the test successfully. My
 code follows. It is taken from the website just mentioned.

  freq=c(22,21,22,27,22,36) # frequencies obtained after
# rolling the dice 150 times.
  prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category.
  chisq.test(freq,p=prob) # I do not know what this line means.
  # I just followed instructions on the
  # website.

 The erorr I receive is erorr in chisq.test(freq,p=prob)/6
 probabilities must sum to 1

 I am very new to R, so any help would be appreciated.
 Faiz.

 I suspect that you must have made an error in entering the commands
 into R. Prime suspect: You did not have 6 1's in p -- for example
 you may have put

  prob=c(1,1,1,1,1)/6

 (with only five). I copied your code (as trivially reformatted above)
 straight into R using copypaste with the mouse, with results:

  freq=c(22,21,22,27,22,36) # frequencies obtained after
# rolling the dice 150 times.
  prob=c(1,1,1,1,1,1)/6 # specify expected frequency for each category.
  chisq.test(freq,p=prob) # I do not know what this line means.

  #Chi-squared test for given probabilities
  # data:  freq
  # X-squared = 6.72, df = 5, p-value = 0.2423
  # I just followed instructions on the
  # website.

 So it worked as it should work. Therefore something went wrong
 when you entered the code. Check your input!

 Hoping this helps,
 Ted.

 
 E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
 Fax-to-email: +44 (0)870 094 0861
 Date: 14-Feb-10   Time: 09:02:01
 -- XFMail -- 
 


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 14-Feb-10   Time: 10:53:34
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Labels on a pyramide

2010-02-14 Thread Jim Lemon


On 02/14/2010 01:27 AM, Orvalho Augusto wrote:

I am using pyramid.plot() from the plotrix package.
...
The problem is (1) I do not want plot agelabels on the center and (2)
I want plot different labels for each pair of the bars (one label for
masculine and the other feminine).

The data represent the 10 most frequent cancer in a group of individuals.



Bueno troglodita,
You have discovered a deficiency in pyramid.plot, and for that you get 
an answer to your question and brand new source code!


source(pyramid.plot.R)

after loading the plotrix package. Use it quickly, for it will be in the 
next version of plotrix and then everyone (todo el mundo, hombre!) will 
be using it.


Jim

# it must be wider than the default to accommodate the long labels
x11(width=10)
par(mar=pyramid.plot(dados$masfr,dados$femfr,
 laxlab=c(0,10,20,30,40),raxlab=c(0,10,20,30,40),
 main=Primeiras 10 cancros mais frequentes por sexo,
 top.labels=c(Masculino, Tipo de cancro, Feminino),
 labels=dados[,c(maslab,femlab)],
 xycol=rep(#ff,10),xxcol=rep(#ff88ff,10), gap=25))
pyramid.plot-function(xy,xx,labels=NA,top.labels=c(Male,Age,Female),
 main=,laxlab=NULL,raxlab=NULL,unit=%,xycol,xxcol,gap=1,
 labelcex=1,mark.cat=NA,add=FALSE) {

 if(any(c(xy,xx)0)) stop(Negative quantities not allowed)
 xydim-dim(xy)
 if(length(labels)==1) labels-1:xydim[1]
 ncats-ifelse(is.null(dim(labels)),length(labels),length(labels[,1]))
 if(is.null(xydim)) {
  if(length(xy) != ncats || length(xx) != ncats)
   stop(xy, xx and labels must all be the same length)
  halfwidth-ceiling(max(c(xy,xx)))+gap
 }
 else {
  if(length(xy[,1]) != ncats || length(xx[,1]) != ncats)
   stop(xy, xx and labels must all be the same length)
  halfwidth-ceiling(max(c(rowSums(xy),rowSums(xx+gap
 }
 oldmar-par(mar)
 if(!add) {
  par(mar=c(4,2,4,2))
  plot(0,xlim=c(-halfwidth,halfwidth),ylim=c(0,ncats+1),
  type=n,axes=FALSE,xlab=,ylab=,xaxs=i,yaxs=i,main=main)
  if(is.null(laxlab)) {
   laxlab-seq(halfwidth-gap,0,by=-1)
   axis(1,at=-halfwidth:-gap,labels=laxlab)
  }
  else axis(1,at=-(laxlab+gap),labels=laxlab)
  if(is.null(raxlab)) {
   raxlab-0:(halfwidth-gap)
   axis(1,at=gap:halfwidth,labels=raxlab)
  }
  else axis(1,at=raxlab+gap,labels=raxlab)
  axis(2,at=1:ncats,labels=rep(,ncats),pos=gap,tcl=-0.25)
  axis(4,at=1:ncats,labels=rep(,ncats),pos=-gap,tcl=-0.25)
  if(!is.na(mark.cat)) boxed.labels(0,mark.cat,labels[mark.cat])
  if(is.null(dim(labels))) text(0,1:ncats,labels,cex=labelcex)
  else {
   text(-gap*0.9,1:ncats,labels[,1],cex=labelcex,adj=0)
   text(gap*0.9,1:ncats,labels[,2],cex=labelcex,adj=1)
  }
  mtext(top.labels,3,0,at=c(-halfwidth/2,0,halfwidth/2),
   adj=0.5,cex=labelcex)
  mtext(c(unit,unit),1,2,at=c(-halfwidth/2,halfwidth/2))
 }
 if(is.null(xydim)) {
  if(missing(xycol)) xycol-rainbow(ncats)
  if(missing(xxcol)) xxcol-rainbow(ncats)
  rect(-(xy+gap),1:ncats-0.4,rep(-gap,ncats),1:ncats+0.4,col=xycol)
  rect(rep(gap,ncats),1:ncats-0.4,(xx+gap),1:ncats+0.4,col=xxcol)
 }
 else {
  if(missing(xycol)) xycol-rainbow(xydim[2])
  if(missing(xxcol)) xxcol-rainbow(xydim[2])
  xystart-xxstart-rep(gap,ncats)
  for(i in 1:xydim[2]) {
   xycolor-rep(xycol[i],ncats)
   xxcolor-rep(xxcol[i],ncats)
   rect(-(xy[,i]+xystart),1:ncats-0.4,-xystart,1:ncats+0.4,
col=xycolor)
   rect(xxstart,1:ncats-0.4,xx[,i]+xxstart,1:ncats+0.4,
col=xxcolor)
   xystart-xy[,i]+xystart
   xxstart-xx[,i]+xxstart
  }
 }
 return(oldmar)
}
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Labels on a pyramide

2010-02-14 Thread Orvalho Augusto

Jim thanks for your great! I will try to use your source code.

Caveman

Ps:
Reconheco a minha trogolodice e por isso pedi ajuda. Lamento perturbar
a todos por isso.


On Sun, Feb 14, 2010 at 1:05 PM, Jim Lemon j...@bitwrit.com.au wrote:
 On 02/14/2010 01:27 AM, Orvalho Augusto wrote:

 I am using pyramid.plot() from the plotrix package.
 ...
 The problem is (1) I do not want plot agelabels on the center and (2)
 I want plot different labels for each pair of the bars (one label for
 masculine and the other feminine).

 The data represent the 10 most frequent cancer in a group of individuals.


 Bueno troglodita,
 You have discovered a deficiency in pyramid.plot, and for that you get an
 answer to your question and brand new source code!

 source(pyramid.plot.R)

 after loading the plotrix package. Use it quickly, for it will be in the
 next version of plotrix and then everyone (todo el mundo, hombre!) will be
 using it.

 Jim

 # it must be wider than the default to accommodate the long labels
 x11(width=10)
 par(mar=pyramid.plot(dados$masfr,dados$femfr,
  laxlab=c(0,10,20,30,40),raxlab=c(0,10,20,30,40),
  main=Primeiras 10 cancros mais frequentes por sexo,
  top.labels=c(Masculino, Tipo de cancro, Feminino),
  labels=dados[,c(maslab,femlab)],
  xycol=rep(#ff,10),xxcol=rep(#ff88ff,10), gap=25))




-- 
OpenSource Software Consultant
CENFOSS (www.cenfoss.co.mz)
SP Tech (www.sptech.co.mz)
email: orvaq...@cenfoss.co.mz
cell: +258828810980

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] multiple-test correlation

2010-02-14 Thread Manuel Jesús López Rodríguez

Dear all,
I am trying to study the correlation between one independent variable (V1) 
and several others dependent among them (V2,V3,V4 and V5). For doing 
so, I would like to analyze my data by multiple-test (applying the Bonferroni´s 
correction or other similar), but I do not find the proper command in R. What I 
want to do is to calculate Kendall´s correlation between V1 and the others 
variables (i.e. V1 vs V2, V1 vs V3, etc.) and to correct the p values 
by Bonferroni or other. I have found outlier.test, but I do not know if this 
is what I need (also, I would prefer to use a less conservative method than 
Bonferroni´s, if possible).
Thank you very much in advance!

 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Labels on a pyramide

2010-02-14 Thread Orvalho Augusto

Pardon my ignorance here.

The Jim code works. But the problem the text comes over the plot.
Adjusting the laxlab and raxlab, to get more plot are, the text keeps
being cut.

And one more thing that might be added as a new functionality is an
option to show the value of the bar.

Thanks for your help

Caveman


On Sun, Feb 14, 2010 at 1:05 PM, Jim Lemon j...@bitwrit.com.au wrote:
 On 02/14/2010 01:27 AM, Orvalho Augusto wrote:

 I am using pyramid.plot() from the plotrix package.
 ...
 The problem is (1) I do not want plot agelabels on the center and (2)
 I want plot different labels for each pair of the bars (one label for
 masculine and the other feminine).

 The data represent the 10 most frequent cancer in a group of individuals.


 Bueno troglodita,
 You have discovered a deficiency in pyramid.plot, and for that you get an
 answer to your question and brand new source code!

 source(pyramid.plot.R)

 after loading the plotrix package. Use it quickly, for it will be in the
 next version of plotrix and then everyone (todo el mundo, hombre!) will be
 using it.

 Jim

 # it must be wider than the default to accommodate the long labels
 x11(width=10)
 par(mar=pyramid.plot(dados$masfr,dados$femfr,
  laxlab=c(0,10,20,30,40),raxlab=c(0,10,20,30,40),
  main=Primeiras 10 cancros mais frequentes por sexo,
  top.labels=c(Masculino, Tipo de cancro, Feminino),
  labels=dados[,c(maslab,femlab)],
  xycol=rep(#ff,10),xxcol=rep(#ff88ff,10), gap=25))




-- 
OpenSource Software Consultant
CENFOSS (www.cenfoss.co.mz)
SP Tech (www.sptech.co.mz)
email: orvaq...@cenfoss.co.mz
cell: +258828810980

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lm function in R

2010-02-14 Thread Douglas Bates

On Sat, Feb 13, 2010 at 7:09 PM, Daniel Malter dan...@umd.edu wrote:
 It seems to me that your question is more about the econometrics than about
 R. Any introductory econometric textbook or compendium on econometrics will
 cover this as it is a basic. See, for example, Greene 2006 or Wooldridge
 2002.

 Say X is your data matrix, that contains columns for each of the individual
 variables (x), columns with their interactions, and one column of 1s for the
 intercept. Let y be your dependent variable. Then, OLS estimates are
 computed by

 X'X inverse X'y

 Or in R

 solve(t(X)%*%X)%*%t(X)%*%y

But that is not the way that the coefficients are calculated in R.  It
is the formula that is given in text books but, like most text book
formulas, is not suitable for computation, especially with large data
sets and complex models.

There are many practical ways to calculate the least squares
coefficients in large data sets and they all involve decompositions.
For a Hadoop environment a scatter-gather approach would be to form
Cholesky decompositions of the cross-product of sections of rows of
the model matrix then combine the decompositions.

 Best,
 Daniel


 -
 cuncta stricte discussurus
 -
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of Something Something
 Sent: Saturday, February 13, 2010 5:04 PM
 To: Bert Gunter
 Cc: r-help@r-project.org
 Subject: Re: [R] lm function in R

 I tried..

 mod = lm(Y ~ X1*X2*X3, na.action = na.exclude)
 formula(mod)

 This produced
 Y ~ X1 * X2 * X3


 When I typed just mod I got:

 Call:
 lm(formula = Y ~ X1 * X2 * X3, na.action = na.exclude)

 Coefficients:
 (Intercept)          X11          X21          X31      X11:X21      X11:X31
     X21:X31  X11:X21:X31
   177.9245       0.2005       2.4482       3.1216       0.8127     -26.6166
     -3.0398      29.6049


 I am trying to figure out how R computed all these coefficients.





 On Sat, Feb 13, 2010 at 1:30 PM, Bert Gunter gunter.ber...@gene.com wrote:

 ?formula


 Bert Gunter
 Genentech Nonclinical Statistics

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On
 Behalf Of Something Something
 Sent: Saturday, February 13, 2010 1:24 PM
 To: Daniel Nordlund
 Cc: r-help@r-project.org
 Subject: Re: [R] lm function in R

 Thanks Dan.  Yes that was very helpful.  I didn't see the change from '*'
 to
 '+'.

 Seems like when I put * it means - interaction  when I put + it's not an
 interaction.

 Is it correct to assume then that...

 When I put + R evaluates the following equation:
 Y-Hat = b0 + b1X1 + b2X2 + . . . bkXk + 7 7 7 + bkXk


 But when I put * R evaluates the following equation;
 Y-Hat = b0 + b1X1 + b2x2 + ... + bkXk + b12 X12+ b13 X13 +  + c

 Is this correct?  If it is then can someone point me to any sources that
 will explain how the coefficients (such as b0... bk, b12.. , b123..) are
 calculated.  I guess, one source is the R source code :) but is there any
 other documentation anywhere?

 Please let me know.  Thanks.



 On Fri, Feb 12, 2010 at 5:54 PM, Daniel Nordlund
 djnordl...@verizon.netwrote:

   -Original Message-
   From: r-help-boun...@r-project.org [mailto:
 r-help-boun...@r-project.org]
   On Behalf Of Something Something
   Sent: Friday, February 12, 2010 5:28 PM
   To: Phil Spector; r-help@r-project.org
   Subject: Re: [R] lm function in R
  
   Thanks for the replies everyone.  Greatly appreciate it.  Some
 progress,
   but
   now I am getting the following values when I don't use as.factor
  
   13.14167 25.11667 28.34167 49.14167 40.39167 66.86667
  
   Is that what you guys get?
 
 
  If you look at Phil's response below, no, that is not what he got.  The
  difference is that you are specifying an interaction, whereas Phil did
 not
  (because the equation you initially specified did not include an
  interaction.  Use Y ~ X1 + X2 instead of Y ~ X1*X2 for your formula.
 
  
  
   On Fri, Feb 12, 2010 at 5:00 PM, Phil Spector
   spec...@stat.berkeley.eduwrote:
  
By converting the two variables to factors, you are fitting
an entirely different model.  Leave out the as.factor stuff
and it will work exactly as you want it to.
   
 dat
   
 V1 V2 V3 V4
1 s1 14  4  1
2 s2 23  4  2
3 s3 30  7  2
4 s4 50  7  4
5 s5 39 10  3
6 s6 67 10  6
   
names(dat) = c('id','y','x1','x2')
z = lm(y~x1+x2,dat)
predict(z)
   
      1        2        3        4        5        6 15.16667
 24.7
27.7 46.7 40.16667 68.7
   
   
                                       - Phil Spector
                                        Statistical Computing
 Facility
                                        Department of Statistics
                                        UC Berkeley
                                        spec...@stat.berkeley.edu

[R] Help for programming a short code in R

2010-02-14 Thread Michael Haenlein

Dear all,

I'm looking for a person who could help me to program a short code in R. The
code involves Bayesian analysis so some familiarity with WinBUGS or another
package/ software dealing with Bayesian estimation would be helpful. 

I have an academic paper in which the code is described (Abe, M. (2009),
Counting your customers one by one: A hierarchical Bayes extension to the
Pareto/NBD model, Marketing science, Vol. 28 No. 3, pp. 541 - 53) as well
as one of the datasets mentioned in this manuscript to test the code. My
assumption is that the job does not take very long -- although I cannot give
a precise estimate of the number of hours required.

If anyone is interested, please let me know and I can send you an electronic
copy of the manuscript mentioned above.

Best,

Michael




Michael Haenlein
Professor of Marketing
ESCP Europe - The School of Management for Europe

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Newbie woes with *apply

2010-02-14 Thread Dimitri Shvorob


Bug fix:

first.day.of.quarter = function(date)
{
  t = first.day.of.month(date)
  l = month(date) %% 3
  if (l == 0) return(t)
  t = seq.Date(t, by = -1 month, length = l)
  return(t[length(t)])
}

But the *apply part still does not work.

-- 
View this message in context: 
http://n4.nabble.com/Newbie-woes-with-apply-tp1555149p1555167.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Newbie woes with *apply

2010-02-14 Thread Dimitri Shvorob


Dataframe cust has Date-type column open.date. I wish to set up another
column, with (first day of) the quarter of open.date.

To be comprehensive (of course, improvement suggestions are welcome),

month = function(date)
{
  return(as.numeric(format(date,%m)))
}

first.day.of.month = function(date)
{
  return(date + 1 - as.numeric(format(date,%d)))
}

first.day.of.quarter = function(date)
{
  t = seq.Date(first.day.of.month(date), by = -1 month, length =
month(date) %% 3)
  return(t[length(t)])
}

Now the main part,

 cust$open.quarter  = apply(cust$open.date, 1, FUN = first.day.of.quarter)
Error in apply(cust$open.date, 1, FUN = first.day.of.quarter) : 
  dim(X) must have a positive length

 cust$open.quarter  = tapply(cust$open.date, FUN = first.day.of.quarter)
Error in tapply(cust$open.date, FUN = first.day.of.quarter) : 
  element 1 is empty;
   the part of the args list of 'is.list' being evaluated was:
   (INDEX)

 cust$open.quarter  = lapply(cust$open.date, FUN = first.day.of.quarter)
Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L,  : 
  invalid 'trim' argument

Can anyone suggest the right syntax?

Thank you.

-- 
View this message in context: 
http://n4.nabble.com/Newbie-woes-with-apply-tp1555149p1555149.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Multiple missing values

2010-02-14 Thread John

Does anyone know, or know documentation that describes, how to declare
multiple values in R as missing that does not involve coding them as NA? I
wish to be able to treate values as missing, while still retaining codes
that describe the reason for the value being missing.

Thanks

John MAcInnes


-- 
Professor John MacInnes
Sociology,
School of Social and Political Studies,
No 8 Buccleuch Place
University of Edinburgh
Edinburgh EH8 9LN
+44 (0)131 651 3867

Centre d'Estudis Demogràfics
Universitat Autònoma de Barcelona
Edifici E-2
08193 Bellaterra (Barcelona)
Spain
+34 93 581 3060
The University of Edinburgh is a charitable body, registered in Scotland,
with registration number SC005336.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Newbie woes with *apply

2010-02-14 Thread Gabor Grothendieck

Try

library(zoo)
x - as.Date(seq(1, 500, 50)) # test data
as.Date(as.yearmon(x))
as.Date(as.yearqtr(x))


On Sun, Feb 14, 2010 at 8:59 AM, Dimitri Shvorob
dimitri.shvo...@gmail.com wrote:

 Dataframe cust has Date-type column open.date. I wish to set up another
 column, with (first day of) the quarter of open.date.

 To be comprehensive (of course, improvement suggestions are welcome),

 month = function(date)
 {
  return(as.numeric(format(date,%m)))
 }

 first.day.of.month = function(date)
 {
  return(date + 1 - as.numeric(format(date,%d)))
 }

 first.day.of.quarter = function(date)
 {
  t = seq.Date(first.day.of.month(date), by = -1 month, length =
 month(date) %% 3)
  return(t[length(t)])
 }

 Now the main part,

 cust$open.quarter  = apply(cust$open.date, 1, FUN = first.day.of.quarter)
 Error in apply(cust$open.date, 1, FUN = first.day.of.quarter) :
  dim(X) must have a positive length

 cust$open.quarter  = tapply(cust$open.date, FUN = first.day.of.quarter)
 Error in tapply(cust$open.date, FUN = first.day.of.quarter) :
  element 1 is empty;
   the part of the args list of 'is.list' being evaluated was:
   (INDEX)

 cust$open.quarter  = lapply(cust$open.date, FUN = first.day.of.quarter)
 Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L,  :
  invalid 'trim' argument

 Can anyone suggest the right syntax?

 Thank you.

 --
 View this message in context: 
 http://n4.nabble.com/Newbie-woes-with-apply-tp1555149p1555149.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Get rid of the first row of the LaTeX table generated by xtable

2010-02-14 Thread Shige Song

Dear All,

I am trying to generate a LaTeX table from a small data frame using
xtable. I have three variable and 10 records. However, the resulted
LaTeX table has four columns (instead of three), of which the first
column seems to be an automatically generated ID. Is there a way to
get rid of this column?

Thanks.

Shige

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SAS and RODBC

2010-02-14 Thread Uwe Ligges

On 14.02.2010 08:19, Daniel Nordlund wrote:

-Original Message-
From: Frank E Harrell Jr [mailto:f.harr...@vanderbilt.edu]
Sent: Saturday, February 13, 2010 5:49 AM
To: Daniel Nordlund
Cc: r-help@r-project.org
Subject: Re: [R] SAS and RODBC

Daniel Nordlund wrote:
. . .

This is just a quick follow-up to my previous post.  Based on Prof.

Ripley's response I went back and looked at the SAS log file and reread
the RODBC help pages.  The problem of writing a SAS dataset was solved by
setting colQuote=NULL in the call to the odbcConnect() function.

ch- odbcConnect('sasodbc', believeNRows=FALSE, colQuote=NULL)

I hope this will be useful to others who may have the SAS BASE product

and want to do graphics or statistical analyses with their SAS data, but
can't afford the high licensing fees for the SAS STAT and GRAPH modules.
Thanks to Prof. Ripley for the fine RODBC package.

Dan

Daniel Nordlund
Bothell, WA USA

Daniel since you have SAS BASE installed why not use sas.get in the
Hmisc package and also get access to metadata such as variable labels
that ODBC does not handle?  Besides providing better documentation, the
labels are very useful as axis labels in plotting, etc.

Frank

Frank,

I have used sas.get from Hmisc before, and I will continue to use it. I 
appreciate the work that you and your colleagues have done with Hmisc and the 
Design and rms packages.  However, the sas.get function still appears to be 
broken on Windows platforms (or Windows is broken :-).  I know how to fix the 
problem, but I am always looking for approaches where I don't have to fix 
things.   It may well be that the better documentation provided by sas.get will 
prove to out weigh the inconvenience of having to source an edited version of 
sas.get for my regular use.

Daniel,

if the function is broken and needs to be fixed, why don't you report 
your findings (both the error with reproducible code) as well as your 
bugfixes to the package maintainer?

Best wishes,
Uwe

As for moving data from R to SAS, I don't know of any methods other than the 
RODBC package with the SAS ODBC driver for writing SAS datasets.  Yes, I can 
write to csv or other file types that SAS can import, but if I can eliminate 
extra steps when going from R to SAS then that is a plus for me.

Thanks again for the great tools,

Dan

Daniel Nordlund
Bothell, WA USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Get rid of the first row of the LaTeX table generated by xtable

2010-02-14 Thread Liviu Andronic

On 2/14/10, Shige Song shiges...@gmail.com wrote:
  column seems to be an automatically generated ID. Is there a way to
  get rid of this column?

Perhaps
?print.xtable
include.rownames=F

Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Get rid of the first row of the LaTeX table generated by xtable

2010-02-14 Thread Uwe Ligges


See ?print.xtable and its argument include.rownames.

Uwe Ligges

On 14.02.2010 16:06, Shige Song wrote:

Dear All,

I am trying to generate a LaTeX table from a small data frame using
xtable. I have three variable and 10 records. However, the resulted
LaTeX table has four columns (instead of three), of which the first
column seems to be an automatically generated ID. Is there a way to
get rid of this column?

Thanks.

Shige

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Multiple missing values

2010-02-14 Thread Gabor Grothendieck

NA, Inf, -Inf, NaN would give you 4 possibilities and is.finite would
check if its any of them:

 x - c(1, NA, 2, Inf, 3, -Inf, 4, NaN, 5)
 is.finite(x)
[1]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE

You might need to map them all to NA before using it with various
functions depending on how the functions deal with these values.

Other possibilities are to have an attribute with a factor defining
the type of each NA.

x - c(1, NA, 2, NA, 3, NA)
attr(x, type.of.na) - factor(c(A, B, A))

and depending on how much work you are prepared to do you could define
a new R class that handles objects with such an attribute.

On Sun, Feb 14, 2010 at 9:33 AM, John john.macin...@ed.ac.uk wrote:
 Does anyone know, or know documentation that describes, how to declare
 multiple values in R as missing that does not involve coding them as NA? I
 wish to be able to treate values as missing, while still retaining codes
 that describe the reason for the value being missing.

 Thanks

 John MAcInnes


 --
 Professor John MacInnes
 Sociology,
 School of Social and Political Studies,
 No 8 Buccleuch Place
 University of Edinburgh
 Edinburgh EH8 9LN
 +44 (0)131 651 3867

 Centre d'Estudis Demogràfics
 Universitat Autònoma de Barcelona
 Edifici E-2
 08193 Bellaterra (Barcelona)
 Spain
 +34 93 581 3060
 The University of Edinburgh is a charitable body, registered in Scotland,
 with registration number SC005336.

        [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Get rid of the first row of the LaTeX table generated by xtable

2010-02-14 Thread Shige Song

Dear Livlu and Uwe,

This is exactly what I need, thanks.

Shige

On Sun, Feb 14, 2010 at 10:11 AM, Uwe Ligges
lig...@statistik.tu-dortmund.de wrote:
 See ?print.xtable and its argument include.rownames.

 Uwe Ligges

 On 14.02.2010 16:06, Shige Song wrote:

 Dear All,

 I am trying to generate a LaTeX table from a small data frame using
 xtable. I have three variable and 10 records. However, the resulted
 LaTeX table has four columns (instead of three), of which the first
 column seems to be an automatically generated ID. Is there a way to
 get rid of this column?

 Thanks.

 Shige

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?

2010-02-14 Thread Johannes Graumann

Hello,

When drawing barcharts, I find it not helpful if ylim[1] != 0 - bars for a 
quantity of 0, that do not show a length of 0 are quite non-intuitive.

I have tried to study
 library(lattice)
 panel.barchart
but am unable to figure out where ylim is taken care of and how one might 
fix ylim[1] to 0 for barcharts ...

Can anyone point out how to tackle this?

Thanks, Joh

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Multiple missing values

2010-02-14 Thread Patrick Burns


I can think of a few solutions, none perfect.

* You could have a master dataset that has the
missing value codes you want, and a dataset that
you use which is a copy of it with real NA's in it.

* You could add an attribute that gives the types
of missing values in the various positions.  The
downside is that attributes tend to disappear with
subsetting.

* If you only have two types, you might be able to
get away with using NaN as the second type of NA.

On 14/02/2010 14:33, John wrote:

Does anyone know, or know documentation that describes, how to declare
multiple values in R as missing that does not involve coding them as NA? I
wish to be able to treate values as missing, while still retaining codes
that describe the reason for the value being missing.

Thanks

John MAcInnes





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Patrick Burns
pbu...@pburns.seanet.com
http://www.burns-stat.com
(home of 'The R Inferno' and 'A Guide for the Unwilling S User')

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] evaluate variable within expression - how?

2010-02-14 Thread Mark Heckmann

#  I want to plot bold text. The text should depend on a variable  
containing a character string.

plot.new()
text(.5, .5, expression(bold(Some text)))

# now I would like to do the same replacing some text by a variable.

plot.new()
myText - some text
text(.5, .5, expression(bold(myText)))

# This obviouyls does not work.
# How can I combine substitute, parse, paste etc. do get what I want?
# And could someone add a brief explanation what happens, as I find  
parse, expression, deparse, substitute etc. not easy to understand.

Thanks,
Mark


Mark Heckmann
Dipl. Wirt.-Ing. cand. Psych.
Vorstraße 93 B01
28359 Bremen
Blog: www.markheckmann.de
R-Blog: http://ryouready.wordpress.com





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Shapiro-Wilk for levels of factor

2010-02-14 Thread Ravi Kulkarni

Hello,
  I have data for an ANOVA where the between-subjects factor has three
levels. How do I run a test of normality (using shapiro.test) on each
of the levels of the factor for the dependent variable separately
without creating extra datasets?

  Thanks,

Ravi

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Shapiro-Wilk for levels of factor

2010-02-14 Thread Achim Zeileis


On Sun, 14 Feb 2010, Ravi Kulkarni wrote:


Hello,
 I have data for an ANOVA where the between-subjects factor has three
levels. How do I run a test of normality (using shapiro.test) on each
of the levels of the factor for the dependent variable separately
without creating extra datasets?


You can use tapply(y, x, shapiro.test) which will then conduct as many 
Shapiro-Wilk tests as x has levels (without adjusting for multiple 
testing).


Another approach might be to look at shapiro.test(residualslm(y ~ x))) 
which tests the null hypothesis that the residuals in all groups come from 
the same normal distribution.


A worked example for the chickwts data is included below.
Z

## data
summary(chickwts)

## linear model and ANOVA
fm - lm(weight ~ feed, data = chickwts)
anova(fm)

## QQ plot for residuals + Shapiro-Wilk test
shapiro.test(residuals(fm))

## separate tests for all groups of observations
## (with some formatting)
do.call(rbind, with(chickwts, tapply(weight, feed,
  function(x) unlist(shapiro.test(x)[c(statistic, p.value)]



 Thanks,

   Ravi

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] evaluate variable within expression - how?

2010-02-14 Thread baptiste auguie

Hi,

Try with bquote,

plot.new()
myText - some text
text(.5, .5, bquote(bold(.(myText

basically, bquote( .(myText) ) performs the substitution before
applying bold() (see ?bquote).

HTH,

baptiste


On 14 February 2010 17:36, Mark Heckmann mark.heckm...@gmx.de wrote:
 #  I want to plot bold text. The text should depend on a variable
 containing a character string.

 plot.new()
 text(.5, .5, expression(bold(Some text)))

 # now I would like to do the same replacing some text by a variable.

 plot.new()
 myText - some text
 text(.5, .5, expression(bold(myText)))

 # This obviouyls does not work.
 # How can I combine substitute, parse, paste etc. do get what I want?
 # And could someone add a brief explanation what happens, as I find
 parse, expression, deparse, substitute etc. not easy to understand.

 Thanks,
 Mark

 –––
 Mark Heckmann
 Dipl. Wirt.-Ing. cand. Psych.
 Vorstraße 93 B01
 28359 Bremen
 Blog: www.markheckmann.de
 R-Blog: http://ryouready.wordpress.com





        [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] evaluate variable within expression - how?

2010-02-14 Thread Peter Ehlers


Try

text(.5, .5, bquote(bold(.(myText

 -Peter Ehlers

Mark Heckmann wrote:
#  I want to plot bold text. The text should depend on a variable  
containing a character string.


plot.new()
text(.5, .5, expression(bold(Some text)))

# now I would like to do the same replacing some text by a variable.

plot.new()
myText - some text
text(.5, .5, expression(bold(myText)))

# This obviouyls does not work.
# How can I combine substitute, parse, paste etc. do get what I want?
# And could someone add a brief explanation what happens, as I find  
parse, expression, deparse, substitute etc. not easy to understand.


Thanks,
Mark

–––
Mark Heckmann
Dipl. Wirt.-Ing. cand. Psych.
Vorstraße 93 B01
28359 Bremen
Blog: www.markheckmann.de
R-Blog: http://ryouready.wordpress.com





[[alternative HTML version deleted]]





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Peter Ehlers
University of Calgary

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Multiple missing values

2010-02-14 Thread Frank E Harrell Jr


Patrick Burns wrote:

I can think of a few solutions, none perfect.

* You could have a master dataset that has the
missing value codes you want, and a dataset that
you use which is a copy of it with real NA's in it.

* You could add an attribute that gives the types
of missing values in the various positions.  The
downside is that attributes tend to disappear with
subsetting.


The sas.get function in the Hmisc exemplifies that approach, and it has 
a subsetting method that preserves the special.miss attribute.


Frank



* If you only have two types, you might be able to
get away with using NaN as the second type of NA.

On 14/02/2010 14:33, John wrote:

Does anyone know, or know documentation that describes, how to declare
multiple values in R as missing that does not involve coding them as 
NA? I

wish to be able to treate values as missing, while still retaining codes
that describe the reason for the value being missing.

Thanks

John MAcInnes




--
Frank E Harrell Jr   Professor and ChairmanSchool of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?

2010-02-14 Thread David Winsemius



On Feb 14, 2010, at 10:33 AM, Johannes Graumann wrote:


Hello,

When drawing barcharts, I find it not helpful if ylim[1] != 0 -  
bars for a

quantity of 0, that do not show a length of 0 are quite non-intuitive.

I have tried to study
 library(lattice)
 panel.barchart
but am unable to figure out where ylim is taken care of and how one  
might

fix ylim[1] to 0 for barcharts ...

Can anyone point out how to tackle this?


Looking at Sarkar's Lattice text in chapter 8 section 3 Limits and  
Aspect Ratio, it appears from subsection 1 that the prepanel function  
can used to supply values of xlim and ylim values. From subsection 2  
he clarifies that xlim and ylim can also be specified on a per panel  
basis (and here I am guessing that this would be within a scales  
argument) when relation=free. At the end of that section he offers  
two examples using ylim: the first is not plotted but the second uses  
the prepanel mechanism for Fig 8.1 and that is probably available on  
the Lattice website.


In the same subsection is offered an alternative to specifying an  
explicit scales$y$limits to be interpreted as ylim values.


My hope it that these ideas and references will be of some use in  
identifying productive places to look for further documentation.


--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] evaluate variable within expression - how?

2010-02-14 Thread Jorge Ivan Velez

Hi Mark,

Try also:

plot(1:10)
text(2, 5, Some text, font = 2)

HTH,
Jorge


On Sun, Feb 14, 2010 at 11:36 AM, Mark Heckmann  wrote:

 #  I want to plot bold text. The text should depend on a variable
 containing a character string.

 plot.new()
 text(.5, .5, expression(bold(Some text)))

 # now I would like to do the same replacing some text by a variable.

 plot.new()
 myText - some text
 text(.5, .5, expression(bold(myText)))

 # This obviouyls does not work.
 # How can I combine substitute, parse, paste etc. do get what I want?
 # And could someone add a brief explanation what happens, as I find
 parse, expression, deparse, substitute etc. not easy to understand.

 Thanks,
 Mark

 
 Mark Heckmann
 Dipl. Wirt.-Ing. cand. Psych.
 Vorstraße 93 B01
 28359 Bremen
 Blog: www.markheckmann.de
 R-Blog: http://ryouready.wordpress.com





[[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?

2010-02-14 Thread Deepayan Sarkar

On Sun, Feb 14, 2010 at 7:33 AM, Johannes Graumann
johannes_graum...@web.de wrote:
 Hello,

 When drawing barcharts, I find it not helpful if ylim[1] != 0 - bars for a
 quantity of 0, that do not show a length of 0 are quite non-intuitive.

 I have tried to study
         library(lattice)
         panel.barchart
 but am unable to figure out where ylim is taken care of and how one might
 fix ylim[1] to 0 for barcharts ...

 Can anyone point out how to tackle this?

Are you sure you are not looking for 'origin=0' (described in ?panel.barchart)?

-Deepayan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] mlogit function cut off formular

2010-02-14 Thread Han Qin


I'm trying to fit a multinominal logistic model using package mlogit. I have
15 independent variables. The code looks like this:
m-mlogit(score~0|f1+f2+f3+f4+f5+f6+f7+f8+f9+f10+f11+f12+f13+f14+f15, data,
reflevel=1)

And it gives the following error message:
Error in parse(text = x) : 
  unexpected ')' in score ~ 0 + alt:(f1 + f2 + f3 + f4 + f5 + f6 + f7 + f8
+ f9 + f10 + f11 + f12 + )

Obviously, the mlogit function somehow cut off my formula, probably because
it is too long. Is this considered a bug or I'm doing something wrond?
Thanks
-- 
View this message in context: 
http://n4.nabble.com/mlogit-function-cut-off-formular-tp1555298p1555298.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Multiple missing values

2010-02-14 Thread Joe King

Gary King's Amelia package for R and a stand alone version does EM algorithm
multiple imputation.

Joe King
206-913-2912
j...@joepking.com
Never throughout history has a man who lived a life of ease left a name
worth remembering. --Theodore Roosevelt

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Frank E Harrell Jr
Sent: Sunday, February 14, 2010 9:39 AM
To: Patrick Burns
Cc: r-help@r-project.org; john.macin...@ed.ac.uk
Subject: Re: [R] Multiple missing values

Patrick Burns wrote:
 I can think of a few solutions, none perfect.
 
 * You could have a master dataset that has the
 missing value codes you want, and a dataset that
 you use which is a copy of it with real NA's in it.
 
 * You could add an attribute that gives the types
 of missing values in the various positions.  The
 downside is that attributes tend to disappear with
 subsetting.

The sas.get function in the Hmisc exemplifies that approach, and it has 
a subsetting method that preserves the special.miss attribute.

Frank

 
 * If you only have two types, you might be able to
 get away with using NaN as the second type of NA.
 
 On 14/02/2010 14:33, John wrote:
 Does anyone know, or know documentation that describes, how to declare
 multiple values in R as missing that does not involve coding them as 
 NA? I
 wish to be able to treate values as missing, while still retaining codes
 that describe the reason for the value being missing.

 Thanks

 John MAcInnes



-- 
Frank E Harrell Jr   Professor and ChairmanSchool of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Newbie woes with *apply

2010-02-14 Thread Dimitri Shvorob


Many thanks, but my focus is actually on *apply usage.
-- 
View this message in context: 
http://n4.nabble.com/Newbie-woes-with-apply-tp1555149p1555329.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Keeping the attributes of an observation

2010-02-14 Thread John Lipkins

Hey R list,

Can someone help me with following question?

I can create a listing of the unique observations in a dataframe by using:

df - df [order(df $start),]
df $first_ob[!duplicated(df $ID)] - 1
df $first_ob[duplicated(df $ID)] - 0

Now I want to hold specific attributes of the first observation for other
(the second or third ) observation of a specific element (named ID).

So for example if I have an element with ID X that is ok in the first
observation and not ok in the second I want to hold the ok in a
therefore specified column. An example of the output would be:

IDstartfirst obT1T2
X 1/1/20101okok
X2/2/20100not okok


Thanks in advance.
Kind regards,

John

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Problems with boxplot in ggplot2:qplot

2010-02-14 Thread Dimitri Shvorob


Dataframe closed contains balances of closed accounts: each row has month of
closure (Date-type column month) and latest balance. I would like to plot
by-month distributions of balances. A qplot call below produces several
warnings and no output. 

Can anyone help?

Thank you.

PS. A really basic task, very similar to the examples on p. 71 of the
ggplot2 book, apart from a Date grouping column; I am quite surprised to
have problems with it. lattice package to the rescue?


 qplot(factor(month), balance, data = closed, geom = boxplot, xlim =
 range(closed$month))
There were 13 warnings (use warnings() to see them)

 warnings()
Warning messages:
1: Removed 1 rows containing missing values (stat_boxplot).
2: Removed 7 rows containing missing values (geom_point).
3: Removed 5 rows containing missing values (geom_point).
4: Removed 8 rows containing missing values (geom_point).
5: Removed 3 rows containing missing values (geom_point).
6: Removed 5 rows containing missing values (geom_point).
7: Removed 2 rows containing missing values (geom_point).
8: Removed 12 rows containing missing values (geom_point).
9: Removed 2 rows containing missing values (geom_point).
10: Removed 1 rows containing missing values (geom_point).
11: Removed 2 rows containing missing values (geom_point).
12: Removed 3 rows containing missing values (geom_point).
13: Removed 4 rows containing missing values (geom_point).

 p = qplot(factor(month), balance, data = closed, geom = boxplot, xlim =
 range(closed$month))
 plot(p)
Error in plot.window(...) : need finite 'xlim' values
-- 
View this message in context: 
http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1555338.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-14 Thread baptiste auguie

Hi,

it's hard to tell what's wrong without a reproducible example, but I
noted two things:

- AFAIK there is no plot method for ggplot2. You probably meant print(p) instead

- if you map x to factor(month), I think it will be incompatible with
your xlim values range(month).

HTH,

baptiste

On 14 February 2010 19:55, Dimitri Shvorob dimitri.shvo...@gmail.com wrote:

 Dataframe closed contains balances of closed accounts: each row has month of
 closure (Date-type column month) and latest balance. I would like to plot
 by-month distributions of balances. A qplot call below produces several
 warnings and no output.

 Can anyone help?

 Thank you.

 PS. A really basic task, very similar to the examples on p. 71 of the
 ggplot2 book, apart from a Date grouping column; I am quite surprised to
 have problems with it. lattice package to the rescue?


 qplot(factor(month), balance, data = closed, geom = boxplot, xlim =
 range(closed$month))
 There were 13 warnings (use warnings() to see them)

 warnings()
 Warning messages:
 1: Removed 1 rows containing missing values (stat_boxplot).
 2: Removed 7 rows containing missing values (geom_point).
 3: Removed 5 rows containing missing values (geom_point).
 4: Removed 8 rows containing missing values (geom_point).
 5: Removed 3 rows containing missing values (geom_point).
 6: Removed 5 rows containing missing values (geom_point).
 7: Removed 2 rows containing missing values (geom_point).
 8: Removed 12 rows containing missing values (geom_point).
 9: Removed 2 rows containing missing values (geom_point).
 10: Removed 1 rows containing missing values (geom_point).
 11: Removed 2 rows containing missing values (geom_point).
 12: Removed 3 rows containing missing values (geom_point).
 13: Removed 4 rows containing missing values (geom_point).

 p = qplot(factor(month), balance, data = closed, geom = boxplot, xlim =
 range(closed$month))
 plot(p)
 Error in plot.window(...) : need finite 'xlim' values
 --
 View this message in context: 
 http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1555338.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Newbie woes with *apply

2010-02-14 Thread Gabor Grothendieck

Your function is not receiving what you may think its receiving.  Try this:

 x - Sys.Date() + 1:3
 junk - lapply(x, print)
[1] 14655
[1] 14656
[1] 14657

So add this as the first statement in the body of each of your functions:
date - as.Date(date, 1970-01-01)

and then using x from above try these:
lapply(x, first.day.of.quarter)
xx - x; xx[] - sapply(x, first.day.of.quarter)






On Sun, Feb 14, 2010 at 8:59 AM, Dimitri Shvorob
dimitri.shvo...@gmail.com wrote:

 Dataframe cust has Date-type column open.date. I wish to set up another
 column, with (first day of) the quarter of open.date.

 To be comprehensive (of course, improvement suggestions are welcome),

 month = function(date)
 {
  return(as.numeric(format(date,%m)))
 }

 first.day.of.month = function(date)
 {
  return(date + 1 - as.numeric(format(date,%d)))
 }

 first.day.of.quarter = function(date)
 {
  t = seq.Date(first.day.of.month(date), by = -1 month, length =
 month(date) %% 3)
  return(t[length(t)])
 }

 Now the main part,

 cust$open.quarter  = apply(cust$open.date, 1, FUN = first.day.of.quarter)
 Error in apply(cust$open.date, 1, FUN = first.day.of.quarter) :
  dim(X) must have a positive length

 cust$open.quarter  = tapply(cust$open.date, FUN = first.day.of.quarter)
 Error in tapply(cust$open.date, FUN = first.day.of.quarter) :
  element 1 is empty;
   the part of the args list of 'is.list' being evaluated was:
   (INDEX)

 cust$open.quarter  = lapply(cust$open.date, FUN = first.day.of.quarter)
 Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L,  :
  invalid 'trim' argument

 Can anyone suggest the right syntax?

 Thank you.

 --
 View this message in context: 
 http://n4.nabble.com/Newbie-woes-with-apply-tp1555149p1555149.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-14 Thread Dimitri Shvorob


... Unfortunately, a problem remains: I cannot label x ticks a la 'names.arg
=  '. 

month has values like '2009-01-01', '2009-02-01', etc., while I would prefer
'Jan', 'Feb'. Using

closed$month = format(closed$month, %b) 

disrupts the order of plot's panels, which now follows the alphabetic order
of month names.

-- 
View this message in context: 
http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1555358.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] unexpected results with higher-order functions and lapply

2010-02-14 Thread Jyotirmoy Bhattacharya

I want to use lapply and a function returning a function in order to build a
list of functions.

 genr1 - function(k) {function() {k}}
 l1 - lapply(1:2,genr1)
 l1[[1]]()
[1] 2

This was unexpected. I had expected the answer to be 1, since that is the
value k should be bound to when genr1 is applied to the first element of
1:2.

By itself genr1 seems to work fine.
 genr1(5)()
[1] 5

I defined a slightly different higher-order function:
 genr2 - function(k) {k;function() {k}}
 l2 - lapply(1:2,genr2)
 l2[[1]]()
[1] 1

This gives the answer I expected.

Now I am confused. The function returned by genr2 is exactly the same
function that was being returned by genr1. Why should evaluating k make a
difference?

I am using R 2.9.2 on Ubuntu Linux.

Jyotirmoy Bhattacharya

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problems with boxplot in ggplot2:qplot

2010-02-14 Thread Dimitri Shvorob


My bad: once I ran dev.off(), I did get a plot, albeit a blank one. Then I
removed xlim - which I put in after qplot's complain about xlim - and voila!

Thanks a lot.
-- 
View this message in context: 
http://n4.nabble.com/Problems-with-boxplot-in-ggplot2-qplot-tp1555338p1555352.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] unexpected results with higher-order functions and lapply

2010-02-14 Thread baptiste auguie

Hi,

I believe it's lazy evaluation. See ?force

HTH,

baptiste

On 14 February 2010 20:32, Jyotirmoy Bhattacharya
jyotir...@jyotirmoy.net wrote:
 I want to use lapply and a function returning a function in order to build a
 list of functions.

 genr1 - function(k) {function() {k}}
 l1 - lapply(1:2,genr1)
 l1[[1]]()
 [1] 2

 This was unexpected. I had expected the answer to be 1, since that is the
 value k should be bound to when genr1 is applied to the first element of
 1:2.

 By itself genr1 seems to work fine.
 genr1(5)()
 [1] 5

 I defined a slightly different higher-order function:
 genr2 - function(k) {k;function() {k}}
 l2 - lapply(1:2,genr2)
 l2[[1]]()
 [1] 1

 This gives the answer I expected.

 Now I am confused. The function returned by genr2 is exactly the same
 function that was being returned by genr1. Why should evaluating k make a
 difference?

 I am using R 2.9.2 on Ubuntu Linux.

 Jyotirmoy Bhattacharya

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Problem with specifying variance-covariance matrix for random effects (nlme package)

2010-02-14 Thread Benjamin Cheah

Hi all,

I've been struggling with trying to specify a diagnoal matrix for linear mixed 
effects model. I think I've got nearly everything correct, except the following 
message appears:

In lme.formula(fixed = fwave ~ sex + sexXbulbar + visit + age +  :
  Fewer observations than random effects in all level 1 groups

Not sure if i've provided enough details, but I'm basically trying to perform a 
mixed effects model analysis, controlling for several covariates. Incorporating 
random effects for the intercept and slope.

What does the second line, 'Fewer observations than random effects in all level 
1 groups' mean?

Many thanks in advance,

Ben Cheah



  
__
[[elided Yahoo spam]]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?

2010-02-14 Thread Johannes Graumann

David Winsemius wrote:

 
 On Feb 14, 2010, at 10:33 AM, Johannes Graumann wrote:
 
 Hello,

 When drawing barcharts, I find it not helpful if ylim[1] != 0 -
 bars for a
 quantity of 0, that do not show a length of 0 are quite non-intuitive.

 I have tried to study
  library(lattice)
  panel.barchart
 but am unable to figure out where ylim is taken care of and how one
 might
 fix ylim[1] to 0 for barcharts ...

 Can anyone point out how to tackle this?
 
 Looking at Sarkar's Lattice text in chapter 8 section 3 Limits and
 Aspect Ratio, it appears from subsection 1 that the prepanel function
 can used to supply values of xlim and ylim values. From subsection 2
 he clarifies that xlim and ylim can also be specified on a per panel
 basis (and here I am guessing that this would be within a scales
 argument) when relation=free. At the end of that section he offers
 two examples using ylim: the first is not plotted but the second uses
 the prepanel mechanism for Fig 8.1 and that is probably available on
 the Lattice website.
 
 In the same subsection is offered an alternative to specifying an
 explicit scales$y$limits to be interpreted as ylim values.
 
 My hope it that these ideas and references will be of some use in
 identifying productive places to look for further documentation.

I am quite certain that this is the most elaborately worded version of 
RTFM I have ever come across.

I shall go and do so.

Sincerely, Joh

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] sas.get and Windows: WAS RE: SAS and RODBC

2010-02-14 Thread Daniel Nordlund

 -Original Message-
 From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
 Sent: Sunday, February 14, 2010 7:09 AM
 To: Daniel Nordlund
 Cc: r-help@r-project.org
 Subject: Re: [R] SAS and RODBC

 On 14.02.2010 08:19, Daniel Nordlund wrote:
  -Original Message-
  From: Frank E Harrell Jr [mailto:f.harr...@vanderbilt.edu]
  Sent: Saturday, February 13, 2010 5:49 AM
  To: Daniel Nordlund
  Cc: r-help@r-project.org
  Subject: Re: [R] SAS and RODBC

snip

  Daniel since you have SAS BASE installed why not use sas.get in the
  Hmisc package and also get access to metadata such as variable labels
  that ODBC does not handle?  Besides providing better documentation, the
  labels are very useful as axis labels in plotting, etc.

  Frank

  Frank,

  I have used sas.get from Hmisc before, and I will continue to use it. I
 appreciate the work that you and your colleagues have done with Hmisc and
 the Design and rms packages.  However, the sas.get function still appears
 to be broken on Windows platforms (or Windows is broken :-).  I know how
 to fix the problem, but I am always looking for approaches where I don't
 have to fix things.   It may well be that the better documentation
 provided by sas.get will prove to out weigh the inconvenience of having to
 source an edited version of sas.get for my regular use.

 Daniel,

 if the function is broken and needs to be fixed, why don't you report
 your findings (both the error with reproducible code) as well as your
 bugfixes to the package maintainer?

 Best wishes,
 Uwe

snip

Uwe,

The solution has been posted on R-help more than once, but you are right, I 
should have specifically emailed the maintainer of the package.  So, I just 
did.  Thanks for the prompt.

Dan

Daniel Nordlund
Bothell, WA USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] xyplot, overlay two variables on one plot with group factors

2010-02-14 Thread Pat Schmitz

All

I want to overlay two variables on the same plot following their appropriate
grouping.  I have attempted to use subscripting in panel with panel.xyplot,
but I can't get the grouping to follow into the panel...here is an
example...


dat-data.frame(
y= log(1:10),
y2=10:19,
 x=1:10,
grp = as.factor(1)
)

dat2-data.frame(
 y= log(10:19),
y2= 20:29,
x=1:10,
 grp = as.factor(c(2))
)

dat-rbind(dat, dat2)

# Here I plot both response variables on y axis by the same x
xyplot(y2 ~ x, dat, groups=grp, type=l)
xyplot(y ~ x, dat, groups=grp, type=l)

# I want to overlay both in the same plot to compare
xyplot(y~ x, dat,
 groups = grp,
ylim = c(0,40),
panel=function(x,y,subscripts, groups,...){
 panel.xyplot(x, y, type=l)
panel.xyplot(dat$x[subscripts], dat$y2[subscripts],type=l)
 }
)


Thanks

-- 
Patrick Schmitz
Graduate Student
Plant Biology
1206 West Gregory Drive
RM 1500

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?

2010-02-14 Thread Johannes Graumann

Deepayan Sarkar wrote:

 On Sun, Feb 14, 2010 at 7:33 AM, Johannes Graumann
 johannes_graum...@web.de wrote:
 Hello,

 When drawing barcharts, I find it not helpful if ylim[1] != 0 - bars
 for a quantity of 0, that do not show a length of 0 are quite
 non-intuitive.

 I have tried to study
  library(lattice)
  panel.barchart
 but am unable to figure out where ylim is taken care of and how one might
 fix ylim[1] to 0 for barcharts ...

 Can anyone point out how to tackle this?
 
 Are you sure you are not looking for 'origin=0' (described in
 ?panel.barchart)?

I sure am - thank you! Following the same path for bwplot I found the 
embarrassingly simple answer to my earlier question regarding varwidth in 
...

Sincerely, Joh

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] multiple-test correlation

2010-02-14 Thread Robert A LaBudde


At 07:35 AM 2/14/2010, Manuel Jesús López Rodríguez wrote:

Dear all,
I am trying to study the correlation between one 
independent variable (V1) and several others 
dependent among them (V2,V3,V4 and V5). 
For doing so, I would like to analyze my data by 
multiple-test (applying the Bonferroni´s 
correction or other similar), but I do not find 
the proper command in R. What I want to do is to 
calculate Kendall´s correlation between V1 and 
the others variables (i.e. V1 vs V2, V1 vs 
V3, etc.) and to correct the p values by 
Bonferroni or other. I have found 
outlier.test, but I do not know if this is 
what I need (also, I would prefer to use a less 
conservative method than Bonferroni´s, if possible).

Thank you very much in advance!


One approach might be to first test for any 
correlations via a likelihood ratio test:


Ho: P = I   (no correlations) or covariances are diagonal
T = -a ln V ~ chi-square [p(p-1)/2]

where

V = det(R)
a = N -1 - (2 p +5)/6   N = # data
p = # variables

Reject Ho if T  X^2 (alpha, p(p-1)/2)

Then do the pairwise tests without familywise 
error control. I.e., this is similar to doing the 
F test in ANOVA before doing LSD testing.



Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: r...@lcfltd.com
Least Cost Formulations, Ltd.URL: http://lcfltd.com/
824 Timberlake Drive Tel: 757-467-0954
Virginia Beach, VA 23464-3239Fax: 757-467-2947

Vere scire est per causas scire


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?

2010-02-14 Thread David Winsemius



On Feb 14, 2010, at 3:40 PM, Johannes Graumann wrote:


David Winsemius wrote:



On Feb 14, 2010, at 10:33 AM, Johannes Graumann wrote:


Hello,

When drawing barcharts, I find it not helpful if ylim[1] != 0 -
bars for a
quantity of 0, that do not show a length of 0 are quite non- 
intuitive.


I have tried to study

library(lattice)
panel.barchart

but am unable to figure out where ylim is taken care of and how one
might
fix ylim[1] to 0 for barcharts ...

Can anyone point out how to tackle this?


Looking at Sarkar's Lattice text in chapter 8 section 3 Limits and
Aspect Ratio, it appears from subsection 1 that the prepanel  
function

can used to supply values of xlim and ylim values. From subsection 2
he clarifies that xlim and ylim can also be specified on a per panel
basis (and here I am guessing that this would be within a scales
argument) when relation=free. At the end of that section he offers
two examples using ylim: the first is not plotted but the second uses
the prepanel mechanism for Fig 8.1 and that is probably available on
the Lattice website.

In the same subsection is offered an alternative to specifying an
explicit scales$y$limits to be interpreted as ylim values.

My hope it that these ideas and references will be of some use in
identifying productive places to look for further documentation.


I am quite certain that this is the most elaborately worded version of
RTFM I have ever come across.


Well, not really, it was a version of ... here's my best untested wild  
guess from someone who persistently struggles with trying to get panel  
arguments right.



I shall go and do so.

Sincerely, Joh

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] xyplot, overlay two variables on one plot with group factors

2010-02-14 Thread Pat Schmitz

Ok...I realized that if I don't wish to compare the groupings directly

i can plot the groups as separate panels, which is a much more simple
solution, but doesn't necessarily give the same effect.

xyplot(y+y2 ~ x | grp, dat)


On Sun, Feb 14, 2010 at 3:19 PM, Dennis Murphy djmu...@gmail.com wrote:

 Hi:

 Here's one approach: the idea is to essentially stack the responses in both
 data sets, create a factor that identifies the stacked variables, merges
 the data
 together and then combines 'variable' and 'grp' from the merged data frame
 into a single factor for use as the groups = element in xyplot. (Got all
 that?)

 library(lattice)
 library(reshape)
 # use the melt function of reshape to stack the responses and create an
 # identifier (variable) for them. The responses themselves are put into a
 # variable named 'value'. id.vars are kept separate, the others are stacked
 # into a variable identifier called 'variable' and the vector of values
 called
 # 'values'...
 Dat - melt(dat, id.vars = c('x', 'grp'))
 Dat2 - melt(dat2, id.vars = c('x', 'grp'))
 Dat3 - rbind(Dat, Dat2) # concatenate the two data frames
 Dat3$gv - with(Dat3, paste(variable, grp, sep = ))  # combine factors

 xyplot(value ~ x, data = Dat3, groups = gv, type = l)

 There are more elegant ways to create the grouping factor, which you may
 need if you intend to create a 'nice' legend, but at this level it works.

 HTH,
 Dennis

 On Sun, Feb 14, 2010 at 12:44 PM, Pat Schmitz pschm...@illinois.eduwrote:

 All

 I want to overlay two variables on the same plot following their
 appropriate
 grouping.  I have attempted to use subscripting in panel with
 panel.xyplot,
 but I can't get the grouping to follow into the panel...here is an
 example...


 dat-data.frame(
 y= log(1:10),
 y2=10:19,
  x=1:10,
 grp = as.factor(1)
 )

 dat2-data.frame(
  y= log(10:19),
 y2= 20:29,
 x=1:10,
  grp = as.factor(c(2))
 )

 dat-rbind(dat, dat2)

 # Here I plot both response variables on y axis by the same x
 xyplot(y2 ~ x, dat, groups=grp, type=l)
 xyplot(y ~ x, dat, groups=grp, type=l)

 # I want to overlay both in the same plot to compare
 xyplot(y~ x, dat,
  groups = grp,
 ylim = c(0,40),
 panel=function(x,y,subscripts, groups,...){
  panel.xyplot(x, y, type=l)
 panel.xyplot(dat$x[subscripts], dat$y2[subscripts],type=l)
  }
 )


 Thanks

 --
 Patrick Schmitz
 Graduate Student
 Plant Biology
 1206 West Gregory Drive
 RM 1500

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Patrick Schmitz
Graduate Student
Plant Biology
1206 West Gregory Drive
RM 1500

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] xyplot, overlay two variables on one plot with group factors

2010-02-14 Thread David Winsemius



On Feb 14, 2010, at 3:44 PM, Pat Schmitz wrote:


All

I want to overlay two variables on the same plot following their  
appropriate
grouping.  I have attempted to use subscripting in panel with  
panel.xyplot,

but I can't get the grouping to follow into the panel...here is an
example...


dat-data.frame(
y= log(1:10),
y2=10:19,
x=1:10,
grp = as.factor(1)
)

dat2-data.frame(
y= log(10:19),
y2= 20:29,
x=1:10,
grp = as.factor(c(2))
)

dat-rbind(dat, dat2)

# Here I plot both response variables on y axis by the same x
xyplot(y2 ~ x, dat, groups=grp, type=l)
xyplot(y ~ x, dat, groups=grp, type=l)

# I want to overlay both in the same plot to compare
xyplot(y~ x, dat,
groups = grp,
ylim = c(0,40),
panel=function(x,y,subscripts, groups,...){
panel.xyplot(x, y, type=l)
panel.xyplot(dat$x[subscripts], dat$y2[subscripts],type=l)
}
)



My reading of the xyplot help page made me think that subscripts was  
logical and not an indexing variable, although the opposite is  
suggested by panel.superpose's help page. (Just my continued confusion  
about how to use lattice functions properly.) It sounds as though you  
are trying to superimpose what would otherwise be separate panels. So  
I'm never quite sure that when I mess around with panel arguments that  
I'm doing it right,  but this produces what I think you are trying to  
do:


 xyplot(y~ x, dat,
 groups = grp,
 ylim = c(0,40),
 panel=function(x,y, groups,...){
  panel.xyplot(x, y, groups,  type=l, ...)
  panel.superpose(x, dat$y2, groups,  type=l, ...)
})

My hypothesis after experimenting with trying to leave out the , ...  
argument is that it is necessary to accept the subscripts argument,  
even though you don't actually need to do anything with it. I think  
that is implied by the fact that subscripts (with no default) appears  
before the ellipsis in the arguments list.




Thanks

--
Patrick Schmitz
Graduate Student
Plant Biology
1206 West Gregory Drive
RM 1500

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?

2010-02-14 Thread Rolf Turner


On 15/02/2010, at 9:40 AM, Johannes Graumann wrote

SNIP

(In response to some advice from David Winsemius):

 I am quite certain that this is the most elaborately worded version of 
 RTFM I have ever come across.


I nominate this as a fortune.  (Despite Prof. Winsemius's later protestation
that his advice was *not* a version of RTFM.)

cheers,

Rolf Turner

##
Attention:\ This e-mail message is privileged and confid...{{dropped:9}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Dimensional reduction package

2010-02-14 Thread Michael Denslow

Hi Maura,

 Is there any R package which implements non-linear dimensionality reduction 
 (LLE, ISOMAP, GTM, and so on)  and/or intrinsic dimensionality estimation ?

I am not exactly sure what is meant by non-linear dimensionality
reduction but there is an isomap function in vegan. This is the isomap
of Tenenbaum et al. 2000 and De'ath 1999.

library(vegan)
?isomap

De'ath, G. (1999) Extended dissimilarity: a method of robust
estimation of ecological distances from high beta diversity data.
Plant Ecology 144, 191–199

Tenenbaum, J.B., de Silva, V.  Langford, J.C. (2000) A global network
framework for nonlinear dimensionality reduction. Science 290,
2319–2323.


Hope this helps,
Michael

 Thank you,
 Maura


 tutti i telefonini TIM!


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Michael Denslow

I.W. Carpenter Jr. Herbarium [BOON]
Department of Biology
Appalachian State University
Boone, North Carolina U.S.A.
-- AND --
Communications Manager
Southeast Regional Network of Expertise and Collections
sernec.org

36.214177, -81.681480 +/- 3103 meters

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Plot different regression models on one graph

2010-02-14 Thread kMan

I would use all of the data. If you want to drop one, control for it in
the model  sacrifice a degree of freedom.

Why the call to poly() by the way?

KeithC.

-Original Message-
From: Peter Ehlers [mailto:ehl...@ucalgary.ca] 
Sent: Saturday, February 13, 2010 1:35 PM
To: David Winsemius
Cc: Rhonda Reidy; r-help@r-project.org
Subject: Re: [R] Plot different regression models on one graph

Rhonda:

As David points out, a cubic fit is rather speculative. I think that one
needs to distinguish two situations: 1) theoretical justification for a
cubic model is available, or 2) you're dredging the data for the best fit.
Your case is the second. That puts you in the realm of EDA (exploratory data
analysis). You're free to fit any model you wish, but you should assess the
value of the model. One convenient way is to use the influence.measures()
function, which will show that points 9 and 10 in your data have a strong
influence on your cubic fit (as, of course, your eyes would tell you). A
good thing to do at this point is to fit 3 more cubic models:
1) omitting point 9, 2) omitting point 10, 3) omitting both.

Try it and plot the resulting fits. You may be surprised.

Conclusion: Without more data, you can conclude nothing about a
non-straightline fit.

(And this hasn't even addressed the relative abundance of x=0 data.)

  -Peter Ehlers

David Winsemius wrote:
 
 On Feb 13, 2010, at 1:35 PM, Rhonda Reidy wrote:
 
 The following variables have the following significant relationships 
 (x is the explanatory variable): linear, cubic, exponential, logistic.
 The linear relationship plots without any trouble.

 Cubic is the 'best' model, but it is not plotting as a smooth curve 
 using the following code:

 cubic.lm- lm(y~poly(x,3))
 
 Try:
 
 lines(0:80, predict(cubic.lm, data.frame(x=0:80)),lwd=2)
 
 But I really must say the superiority of that relationship over a 
 linear one is far from convincing to my eyes. Seems to be caused by 
 one data point. I hope the quotes around best mean tha tyou have the
same qualms.
 
 
 lines(x,predict(cubic.lm),lwd=2)

 How do I plot the data and the estimated curves for all of these 
 regression models in the same plot?

 x - c(62.5,68.5,0,52,0,52,0,52,23.5,86,0,0,0,0,0,0,0,0,0,0)

 y -
 c(0.054,0.055,0.017,0.021,0.020,0.028,0.032,0.073,0.076,0.087,0.042,0
 .042,0.041,0.045,0.021,0.018,0.017,0.018,0.028,0.022)


 Thanks in advance.

 Rhonda Reidy


--
Peter Ehlers
University of Calgary

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Plot different regression models on one graph

2010-02-14 Thread kMan

Dear Rhonda,

Consider curve().

KeithC.

-Original Message-
From: Rhonda Reidy [mailto:rre...@gmail.com] 
Sent: Saturday, February 13, 2010 11:36 AM
To: r-help@r-project.org
Subject: [R] Plot different regression models on one graph

The following variables have the following significant relationships (x is
the explanatory variable): linear, cubic, exponential, logistic. The linear
relationship plots without any trouble. 

Cubic is the 'best' model, but it is not plotting as a smooth curve using
the following code:

cubic.lm- lm(y~poly(x,3))
lines(x,predict(cubic.lm),lwd=2)

How do I plot the data and the estimated curves for all of these regression
models in the same plot? 

x - c(62.5,68.5,0,52,0,52,0,52,23.5,86,0,0,0,0,0,0,0,0,0,0)

y -
c(0.054,0.055,0.017,0.021,0.020,0.028,0.032,0.073,0.076,0.087,0.042,0.042,0.
041,0.045,0.021,0.018,0.017,0.018,0.028,0.022)

Thanks in advance.

Rhonda Reidy

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Adding a regression line to an xyplot

2010-02-14 Thread Andrew McFadden

Hi R users

 I am trying to add a regression line to a graph for c for factor 2
 only. Any suggestions?
 
library(lattice)

a=(1:10)
b=c(1:5,16:20)
c=as.factor(c(rep(1,5),rep(2,5)))

d=data.frame(a,b,c)

xyplot(a~b, pch=c(6,8),data = tavg, groups=d$c,
reg.line=lm, smooth=FALSE, type=c(p,g),xlab=a,ylab=b)

 I would appreciate your help.
 
 Kind regards
 
 andy
 
 Andrew McFadden MVS BVSc
 Incursion Investigator
 Investigation  Diagnostic Centres - Wallaceville Biosecurity New 
 Zealand Ministry of Agriculture and Forestry
 
 Phone 04 894 5600 Fax 04 894 4973 Mobile 029 894 5611 Postal address: 
 Investigation and Diagnostic Centre- Wallaceville Box 40742 Ward St 
 Upper Hutt
 


This email message and any attachment(s) is intended solely for the
addressee(s) named above. The information it contains is confidential
and may be legally privileged.  Unauthorised use of the message, or the
information it contains, may be unlawful. If you have received this
message by mistake please call the sender immediately on 64 4 8940100
or notify us by return email and erase the original message and
attachments. Thank you.

The Ministry of Agriculture and Forestry accepts no responsibility for
changes made to this email or to any attachments after transmission from
the office.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Estimated Standard Error for Theta in zeroinfl()

2010-02-14 Thread Lam, Tzeng Yih

Dear R Users,

When using zeroinfl() function to fit a Zero-Inflated Negative Binomial (ZINB) 
model to a dataset, the summary() gives an estimate of log(theta) and its 
standard error, z-value and Pr(|z|) for the count component. Additionally, it 
also provided an estimate of Theta, which I believe is the exp(estimate of 
log(theta)).

However, if I would like to have an standard error of Theta itself (not the 
SE.logtheta), how would I obtain or calculate that standard error?

Thank you very much for your time.

Best regards,
Tzeng Yih Lam

--
PhD Candidate
Department of Forest Engineering, Resources and Management
College of Forestry
Oregon State University
321 Richardson Hall
Corvallis OR 97330 USA
Phone: +1.541.713.7504
Fax: +1.541.713.7504
--
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Adding a regression line to an xyplot

2010-02-14 Thread David Winsemius



On Feb 14, 2010, at 6:33 PM, Andrew McFadden wrote:


Hi R users


I am trying to add a regression line to a graph for c for factor 2
only. Any suggestions?


library(lattice)

a=(1:10)
b=c(1:5,16:20)
c=as.factor(c(rep(1,5),rep(2,5)))

d=data.frame(a,b,c)

xyplot(a~b, pch=c(6,8),data = tavg, groups=d$c,


Seems likely that you want the dataframe defined above to be used as  
the argument to
'data='. Otherwise xyplot throws an error unless tavg is defined  
elsewhere.


If so, then try:

tavg=data.frame(a,b,c)

xyplot(a~b, pch=c(6,8),data = tavg, groups=c,
 panel=function(x,y, groups,...) {
   panel.xyplot(x,y,groups,type=c(p),...);
   panel.lmline(x[c==2],y[c==2],groups,...)},
  xlab=a,ylab=b)


reg.line=lm, smooth=FALSE, type=c(p,g),xlab=a,ylab=b)


I would appreciate your help.

Kind regards

andy

Andrew McFadden MVS BVSc
Incursion Investigator
Investigation  Diagnostic Centres - Wallaceville Biosecurity New
Zealand Ministry of Agriculture and Forestry

Phone 04 894 5600 Fax 04 894 4973 Mobile 029 894 5611 Postal address:
Investigation and Diagnostic Centre- Wallaceville Box 40742 Ward St
Upper Hutt




This email message and any attachment(s) is intended solely for the
addressee(s) named above. The information it contains is confidential
and may be legally privileged.  Unauthorised use of the message, or  
the

information it contains, may be unlawful. If you have received this
message by mistake please call the sender immediately on 64 4 8940100
or notify us by return email and erase the original message and
attachments. Thank you.

The Ministry of Agriculture and Forestry accepts no responsibility for
changes made to this email or to any attachments after transmission  
from

the office.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Estimated Standard Error for Theta in zeroinfl()

2010-02-14 Thread Rolf Turner


On 15/02/2010, at 12:49 PM, Lam, Tzeng Yih wrote:

 Dear R Users,
 
 When using zeroinfl() function to fit a Zero-Inflated Negative Binomial 
 (ZINB) model to a dataset, the summary() gives an estimate of log(theta) and 
 its standard error, z-value and Pr(|z|) for the count component. 
 Additionally, it also provided an estimate of Theta, which I believe is the 
 exp(estimate of log(theta)).
 
 However, if I would like to have an standard error of Theta itself (not the 
 SE.logtheta), how would I obtain or calculate that standard error?


It's tricky.  The exp and log functions aren't linear!!!
So nothing really works very well.

To start with, if you have an unbiased estimate of log(theta),
say lt.hat, then exp(lt.hat) is NOT an unbiased estimate of theta.

If you are willing to assume that lt.hat is normally distributed
then the expected value of exp(lt.hat) is theta*exp(sigma^2/2)
where sigma^2 is the variance of lt.hat.

The variance of exp(lt.hat) is 

(*) theta^2 * exp(sigma^2) * (exp(sigma^2) - 1).

You can ``plug in'' the SE of lt.hat for sigma into the foregoing and get
an ***``approximately''*** unbiased estimate of theta:

theta.hat = exp(lt.hat - SE^2/2)

and then an approximate estimate of the variance of this ``theta.hat''
(by plugging in theta.hat for theta and SE for sigma in (*)).  The results
won't be correct, but they'll probably be in the right ball park.  I think!

This is all posited on the distribution of the estimate of log(theta) being
normal (or ``Gaussian'').  Whether this is a justifiable assumption in your
setting is questionable.

Some simulation experiments might be illuminating.

cheers,

Rolf Turner

P. S.  The formulae I gave about are the result of quickly scribbled
calculations, and could be wrong.  They should be checked.

R. T.
##
Attention: 
This e-mail message is privileged and confidential. If you are not the 
intended recipient please delete the message and notify the sender. 
Any views or opinions presented are solely those of the author.

This e-mail has been scanned and cleared by MailMarshal 
www.marshalsoftware.com
##

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R project

2010-02-14 Thread navid safavi


Dear Sir/Madam

Could you pls help me at these subjects in R-project?
How could I construct a table for 5 or 10 records of data set?


How could I construct a bar chart for categorical variable with overlay?
How could I construct a histogram for a categorical variable?

pls inform me.
Thanks in advanced.
Navid
  
_
Hotmail: Powerful Free email with security by Microsoft.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] gnls not optimizing values

2010-02-14 Thread Brian Doctrow

Sometimes when I try to fit a model to data using the gnls function,  
it doesn't return optimized values of the model parameters.  Instead  
it just returns the exact same values I used as initial guesses, but  
with standard errors calculated.  Example is as follows:

two_site_mHb - gnls(y ~ (CS_AH2 + CS_AH1*10^(n1*(x-pKa1)) + CS_A_*10^ 
((n1+1)*x-n1*pKa1-pKa2)) / (1 + 10^(n1*(x-pKa1)) + 10^((n1+1)*x- 
n1*pKa1-pKa2)), start=list(CS_AH2=y[1], CS_A_=y[length(y)], CS_AH1=y 
[length(y)/2], pKa1=3.3, pKa2=5.9, n1=1.5))

and the results are:

Generalized nonlinear least squares fit
   Model: y ~ (CS_AH2 + CS_AH1 * 10^(n1 * (x - pKa1)) + CS_A_ * 10^ 
((n1 +  1) * x - n1 * pKa1 - pKa2))/(1 + 10^(n1 * (x - pKa1)) +  
10^((n1 +  1) * x - n1 * pKa1 - pKa2))
   Data: NULL
   AIC  BIC   logLik
-112.2501 -105.6390 63.12505

Coefficients:
Value Std.Errort-value  p-value
CS_AH2  8.390 0.0081958 1023.6969  0
CS_A_   8.629 0.0051456 1676.9804  0
CS_AH1  8.663 0.0085524 1012.9321  0
pKa13.300 0.0443557  74.3985   0
pKa25.900 0.4938263  11.9475   0
n1  1.500 0.22881796.5554  0

  Correlation:
CS_AH2  CS_A_  CS_AH1 pKa1  pKa2
CS_A_   -0.044
CS_AH1  -0.288  0.179
pKa10.436   0.083   0.413
pKa20.179   -0.504  -0.681  -0.295
n1  0.641   -0.095 -0.585   0.065   0.373

Standardized residuals:
Min Q1  Med Q3  
Max
-1.85468073 -0.10985684  0.05285202  0.45446735  1.81225021

Residual standard error: 0.01055068
Degrees of freedom: 19 total; 13 residual

Any thoughts as to why it's just spitting back the initial guesses,  
and how it might be avoided?  Thanks.

-Brian Doctrow
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] how to delete a parameter from list after running negative binomial error

2010-02-14 Thread Michelle Waterman

Hello everyone,

Sorry if my question is not clear, my first language is not English, but
Portuguese.

I am building a model for my data, using non-binomial error. I am having a
bit of a problem when updating the model to remove parameters that I no do
no autocorrelate with other variables (I have used a autocorrelation
function for this).

So my first model looks like this:

model1.glm.nb-glm.nb(ozone ~ ., data=climate.dat)
summary(model1.glm.nb)
model1AIC.glm.nb-stepAIC(model1.glm.nb)

When I run it, it doesn't give me any significance. So I run a second model,
with my autocorrelation table, and remove one more variable that does not
autocorrelate with other (anything below .07)

model2.glm.nb2 - update(model1.glm.nb, ~ . - rain, data=climate.dat)
summary(model2.glm.nb2)

Now, when I run a 3rd. model to remove rain, it shows after summary that
rain has been removed. Once I do a forth model rain is again in it. How can
I remove more than one variable in one go?

I was thinking doing something like this:

model2.glm.nb2 -update(model1.glm.nb, ~ . -rain, -temp, data=climate.dat)
summary(model2.glm.nb2)

Obviously using the - sign is not working and I am getting an error message.

Thank you for your help.

Michelle

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] error message in endseq

2010-02-14 Thread nomis24


Hi there,
I am new to R and feel so bloody stupid. Abut in spite of a search of
several hrs I could not find an answer to my problem.

I have imported SPSS data to R, and apart from some warnings regarding
duplicate labels, everything looks fine and names lists all my variables.

Then I try to run a seqdef command - and get the error message below:

 fam.seq- seqdef(family_seq, var= c (fseq2_15, fseq2_16,
 fseq2_17,fseq2_18, fseq2_19,fseq2_20,fseq2_21,
+ fseq2_22, fseq2_23,fseq2_24, fseq2_25,
fseq2_26,fseq2_27,fseq2_28, fseq2_29,fseq2_30, fseq2_31,
fseq2_32,fseq2_33,
+ fseq2_34, fseq2_35,fseq2_36, fseq2_37,
fseq2_38,fseq2_39,fseq2_40, fseq2_41,fseq2_42, fseq2_43,
fseq2_44,fseq2_45,
+ fseq2_46, fseq2_47,fseq2_48, fseq2_49, fseq2_50),
+ alphabet =c(1,2,3,4,5,6,7,8),labels=c(not left-never
married-no children,not left -never married - at least 1 child,not left
-has married-no children , not left - has married -  at least 1
child,left - never married- no children , left - never married - at
least 1 child,left - has married - no children , left - has married - at
least 1 child))
Error in subset.default(data, , var) : element 1 is empty;
   the part of the args list of 'is.logical' being evaluated was:
   (subset)


I understand that the problem seems to be with the list of variables I use
to define what my sequences are.
I tried changing lots of things but nothing helped.
Have I skipped an important step?

Many thanks for your help (I have to sleep now).

nomis

-- 
View this message in context: 
http://n4.nabble.com/error-message-in-endseq-tp1555607p1555607.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R project

2010-02-14 Thread Sarah Goslee

Those sound like basic questions. You should start by reading the Intro to R
that can be found here http://cran.r-project.org/doc/manuals/R-intro.html
or at www.r-project.org

For your specific questions, searching for histogram or bar chart at
http://www.rseek.org would be very helpful.

If you have problems beyond that, putting together a specific example
as requested by the posting guide will generally get you much better
advice than vague queries.

Sarah

On Sun, Feb 14, 2010 at 4:05 PM, navid safavi navid...@hotmail.com wrote:

 Dear Sir/Madam

 Could you pls help me at these subjects in R-project?
 How could I construct a table for 5 or 10 records of data set?


 How could I construct a bar chart for categorical variable with overlay?
 How could I construct a histogram for a categorical variable?

 pls inform me.
 Thanks in advanced.
 Navid

-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [Requires Classification] Re: Adding a regression line to an xyplot

2010-02-14 Thread Andrew McFadden

Thanks, yes that does the trick, sorry inadvertently left the tavg in
my example. I haven't used the panel function before looks like it maybe
useful. Really appreciate your help.

regards 


Andrew McFadden MVS BVSc | Incursion Investigator (animals), 
Investigation and Diagnostic Centre | Biosecurity New Zealand
Ministry of Agriculture and Forestry | 66 Ward St,  Wallaceville | PO
Box 40 742 | Upper Hutt | New Zealand 
Telephone: 64-4-894 5611 | Facsimile: 64-4-894 4973| Mobile:
027-733-1791 | Web: www.maf.govt.nz


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Monday, 15 February 2010 1:02 p.m.
To: Andrew McFadden
Cc: r-help@r-project.org
Subject: [Requires Classification] Re: [R] Adding a regression line to
an xyplot


On Feb 14, 2010, at 6:33 PM, Andrew McFadden wrote:

 Hi R users

 I am trying to add a regression line to a graph for c for factor 2 
 only. Any suggestions?

 library(lattice)

 a=(1:10)
 b=c(1:5,16:20)
 c=as.factor(c(rep(1,5),rep(2,5)))

 d=data.frame(a,b,c)

 xyplot(a~b, pch=c(6,8),data = tavg, groups=d$c,

Seems likely that you want the dataframe defined above to be used as the
argument to 'data='. Otherwise xyplot throws an error unless tavg is
defined elsewhere.

If so, then try:

tavg=data.frame(a,b,c)

xyplot(a~b, pch=c(6,8),data = tavg, groups=c,
  panel=function(x,y, groups,...) {
panel.xyplot(x,y,groups,type=c(p),...);
panel.lmline(x[c==2],y[c==2],groups,...)},
   xlab=a,ylab=b)

 reg.line=lm, smooth=FALSE, type=c(p,g),xlab=a,ylab=b)

 I would appreciate your help.

 Kind regards

 andy

 Andrew McFadden MVS BVSc
 Incursion Investigator
 Investigation  Diagnostic Centres - Wallaceville Biosecurity New 
 Zealand Ministry of Agriculture and Forestry

 Phone 04 894 5600 Fax 04 894 4973 Mobile 029 894 5611 Postal address:
 Investigation and Diagnostic Centre- Wallaceville Box 40742 Ward St 
 Upper Hutt


 ##
 ## This email message and any attachment(s) is intended solely for the
 addressee(s) named above. The information it contains is confidential 
 and may be legally privileged.  Unauthorised use of the message, or 
 the information it contains, may be unlawful. If you have received 
 this message by mistake please call the sender immediately on 64 4 
 8940100 or notify us by return email and erase the original message 
 and attachments. Thank you.

 The Ministry of Agriculture and Forestry accepts no responsibility for

 changes made to this email or to any attachments after transmission 
 from the office.
 ##
 ##

   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Plot different regression models on one graph

2010-02-14 Thread Peter Ehlers


kMan wrote:

I would use all of the data. If you want to drop one, control for it in
the model  sacrifice a degree of freedom.


You like to live dangerously.

 -Peter Ehlers



Why the call to poly() by the way?

KeithC.

-Original Message-
From: Peter Ehlers [mailto:ehl...@ucalgary.ca] 
Sent: Saturday, February 13, 2010 1:35 PM

To: David Winsemius
Cc: Rhonda Reidy; r-help@r-project.org
Subject: Re: [R] Plot different regression models on one graph

Rhonda:

As David points out, a cubic fit is rather speculative. I think that one
needs to distinguish two situations: 1) theoretical justification for a
cubic model is available, or 2) you're dredging the data for the best fit.
Your case is the second. That puts you in the realm of EDA (exploratory data
analysis). You're free to fit any model you wish, but you should assess the
value of the model. One convenient way is to use the influence.measures()
function, which will show that points 9 and 10 in your data have a strong
influence on your cubic fit (as, of course, your eyes would tell you). A
good thing to do at this point is to fit 3 more cubic models:
1) omitting point 9, 2) omitting point 10, 3) omitting both.

Try it and plot the resulting fits. You may be surprised.

Conclusion: Without more data, you can conclude nothing about a
non-straightline fit.

(And this hasn't even addressed the relative abundance of x=0 data.)

  -Peter Ehlers

David Winsemius wrote:

On Feb 13, 2010, at 1:35 PM, Rhonda Reidy wrote:

The following variables have the following significant relationships 
(x is the explanatory variable): linear, cubic, exponential, logistic.

The linear relationship plots without any trouble.

Cubic is the 'best' model, but it is not plotting as a smooth curve 
using the following code:


cubic.lm- lm(y~poly(x,3))

Try:

lines(0:80, predict(cubic.lm, data.frame(x=0:80)),lwd=2)

But I really must say the superiority of that relationship over a 
linear one is far from convincing to my eyes. Seems to be caused by 
one data point. I hope the quotes around best mean tha tyou have the

same qualms.



lines(x,predict(cubic.lm),lwd=2)

How do I plot the data and the estimated curves for all of these 
regression models in the same plot?


x - c(62.5,68.5,0,52,0,52,0,52,23.5,86,0,0,0,0,0,0,0,0,0,0)

y -
c(0.054,0.055,0.017,0.021,0.020,0.028,0.032,0.073,0.076,0.087,0.042,0
.042,0.041,0.045,0.021,0.018,0.017,0.018,0.028,0.022)


Thanks in advance.

Rhonda Reidy



--
Peter Ehlers
University of Calgary






--
Peter Ehlers
University of Calgary

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Plot different regression models on one graph

2010-02-14 Thread kMan

Peter wrote:
You like to live dangerously.

Clue me in, Professor.

Sincerely,
KeithC.

-Original Message-
From: Peter Ehlers [mailto:ehl...@ucalgary.ca] 
Sent: Sunday, February 14, 2010 6:49 PM
To: kMan
Cc: 'David Winsemius'; 'Rhonda Reidy'; r-help@r-project.org
Subject: Re: [R] Plot different regression models on one graph

kMan wrote:
 I would use all of the data. If you want to drop one, control for it in
 the model  sacrifice a degree of freedom.

You like to live dangerously.

  -Peter Ehlers

 
 Why the call to poly() by the way?
 
 KeithC.
 
 -Original Message-
 From: Peter Ehlers [mailto:ehl...@ucalgary.ca] 
 Sent: Saturday, February 13, 2010 1:35 PM
 To: David Winsemius
 Cc: Rhonda Reidy; r-help@r-project.org
 Subject: Re: [R] Plot different regression models on one graph
 
 Rhonda:
 
 As David points out, a cubic fit is rather speculative. I think that one
 needs to distinguish two situations: 1) theoretical justification for a
 cubic model is available, or 2) you're dredging the data for the best
fit.
 Your case is the second. That puts you in the realm of EDA (exploratory
data
 analysis). You're free to fit any model you wish, but you should assess
the
 value of the model. One convenient way is to use the influence.measures()
 function, which will show that points 9 and 10 in your data have a strong
 influence on your cubic fit (as, of course, your eyes would tell you). A
 good thing to do at this point is to fit 3 more cubic models:
 1) omitting point 9, 2) omitting point 10, 3) omitting both.
 
 Try it and plot the resulting fits. You may be surprised.
 
 Conclusion: Without more data, you can conclude nothing about a
 non-straightline fit.
 
 (And this hasn't even addressed the relative abundance of x=0 data.)
 
   -Peter Ehlers
 
 David Winsemius wrote:
 On Feb 13, 2010, at 1:35 PM, Rhonda Reidy wrote:

 The following variables have the following significant relationships 
 (x is the explanatory variable): linear, cubic, exponential, logistic.
 The linear relationship plots without any trouble.

 Cubic is the 'best' model, but it is not plotting as a smooth curve 
 using the following code:

 cubic.lm- lm(y~poly(x,3))
 Try:

 lines(0:80, predict(cubic.lm, data.frame(x=0:80)),lwd=2)

 But I really must say the superiority of that relationship over a 
 linear one is far from convincing to my eyes. Seems to be caused by 
 one data point. I hope the quotes around best mean tha tyou have the
 same qualms.

 lines(x,predict(cubic.lm),lwd=2)

 How do I plot the data and the estimated curves for all of these 
 regression models in the same plot?

 x - c(62.5,68.5,0,52,0,52,0,52,23.5,86,0,0,0,0,0,0,0,0,0,0)

 y -
 c(0.054,0.055,0.017,0.021,0.020,0.028,0.032,0.073,0.076,0.087,0.042,0
 .042,0.041,0.045,0.021,0.018,0.017,0.018,0.028,0.022)


 Thanks in advance.

 Rhonda Reidy

 
 --
 Peter Ehlers
 University of Calgary
 
 
 
 

-- 
Peter Ehlers
University of Calgary

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Multiple missing values

2010-02-14 Thread Jim Lemon


John wrote:

...
Does anyone know, or know documentation that describes, how to declare
multiple values in R as missing that does not involve coding them as NA? I
wish to be able to treate values as missing, while still retaining codes
that describe the reason for the value being missing.


I would suggest leaving the missing values as is in your data file and 
recoding these to NA at the top of each analysis script you run. I find 
that the only place I usually make use of such information is in the 
initial descriptives, although you may want to selectively recode for 
different analyses.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R: Dimensional reduction package

2010-02-14 Thread mauede

Thank you.
Sorry. My question is rather ambiguous. Imeant to aske for Dimensionality 
Reduction methods and/or Intrinsic Dimensionality Estimation methods that can 
deal with non-linear data. That is, data that do not lie on a hyper-plane.
ISOMAP is one of such methods. 
LLE, AutoEncoder, GTM Kernel PCA are other methods that are used with 
non-linear data.

Maura

-Messaggio originale-
Da: Michael Denslow [mailto:michael.dens...@gmail.com]
Inviato: dom 14/02/2010 23.46
A: mau...@alice.it
Cc: r-h...@stat.math.ethz.ch
Oggetto: Re: [R] Dimensional reduction package
 
Hi Maura,

 Is there any R package which implements non-linear dimensionality reduction 
 (LLE, ISOMAP, GTM, and so on)  and/or intrinsic dimensionality estimation ?

I am not exactly sure what is meant by non-linear dimensionality
reduction but there is an isomap function in vegan. This is the isomap
of Tenenbaum et al. 2000 and De'ath 1999.

library(vegan)
?isomap

De'ath, G. (1999) Extended dissimilarity: a method of robust
estimation of ecological distances from high beta diversity data.
Plant Ecology 144, 191-199

Tenenbaum, J.B., de Silva, V.  Langford, J.C. (2000) A global network
framework for nonlinear dimensionality reduction. Science 290,
2319-2323.


Hope this helps,
Michael

 Thank you,
 Maura


 tutti i telefonini TIM!


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Michael Denslow

I.W. Carpenter Jr. Herbarium [BOON]
Department of Biology
Appalachian State University
Boone, North Carolina U.S.A.
-- AND --
Communications Manager
Southeast Regional Network of Expertise and Collections
sernec.org

36.214177, -81.681480 +/- 3103 meters




tutti i telefonini TIM!


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] optimization problem with linear constraints

2010-02-14 Thread Kathie


Dear R users,

I need some advises on how to use R to optimize this function with the
following constraints.

f(x1,x2,x3,y1,y2,y3,)  

= gamma(x1+x2-1)/{gamma(x1)*gamma(x2)} * y1^(x2-1) * y2^(x1-1)
+ gamma(x1+x3-1)/{gamma(x1)*gamma(x3)} * y1^(x3-1) * y3^(x1-1)
+ gamma(x2+x3-1)/{gamma(x2)*gamma(x3)} * y2^(x3-1) * y3^(x2-1)

s.t

0  x1
0  x2
0  x3
1  x1+x2
1  x1+x3
1  x2+x3
0  y1
0  y2
0  y3

I've tried constrOptim, but I got several errors; e.g.  value out of
range in 'gammafn', etc.

Is there any other built-in functions or something for these constraints??

Any suggestion will be greatly appreciated.

Regards,

Kathryn Lord 
-- 
View this message in context: 
http://n4.nabble.com/optimization-problem-with-linear-constraints-tp1555718p1555718.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Shapiro-Wilk for levels of factor

2010-02-14 Thread Ravi Kulkarni


Thanks! Exactly what I wanted.

  Ravi
-- 
View this message in context: 
http://n4.nabble.com/Shapiro-Wilk-for-levels-of-factor-tp1555254p1555720.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Difference in Levene's test between R and SPSS

2010-02-14 Thread Ravi Kulkarni


Hello,
  I notice that when I do Levene's test to test equality of variances across
levels of a factor, I get different answers in R and SPSS 16.
  e.g.: For the chickwts data, in R, levene.test(weight, feed) gives
F=0.7493, p=0.5896.
SPSS 16 gives F=0.987, p=0.432

  Why this difference? Which one should I believe? (I would like to believe
R :)

  Ravi
-- 
View this message in context: 
http://n4.nabble.com/Difference-in-Levene-s-test-between-R-and-SPSS-tp1555725p1555725.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Using text put into a dialog box

2010-02-14 Thread RagingJim


I have written in R this:

require(tcltk)
tt-tktoplevel()
Name - tclVar()
entry.Name -tkentry(tt,width=20,textvariable=Name)
tkgrid(tklabel(tt,text=Please enter site number.))
tkgrid(entry.Name)
OnOK - function()

{
NameVal - tclvalue(Name)
use.this=NameVal
tkdestroy(tt)
}

OK.but -tkbutton(tt,text=   OK   ,command=OnOK)
tkbind(entry.Name, Return,OnOK)
tkgrid(OK.but)
tkfocus(tt)

Fairly simple, yet I am trying to figure out how to save the text that the
user writes into the text box. This data will then be used for a query into
a SQL database. How do I go about using what is written?

Cheers
-- 
View this message in context: 
http://n4.nabble.com/Using-text-put-into-a-dialog-box-tp1555761p1555761.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Difference in Levene's test between R and SPSS

2010-02-14 Thread Peter Ehlers


Ravi Kulkarni wrote:

Hello,
  I notice that when I do Levene's test to test equality of variances across
levels of a factor, I get different answers in R and SPSS 16.
  e.g.: For the chickwts data, in R, levene.test(weight, feed) gives
F=0.7493, p=0.5896.
SPSS 16 gives F=0.987, p=0.432

  Why this difference? Which one should I believe? (I would like to believe
R :)


When in doubt, believe R.

Which levene.test() are you using? There are at least three
functions by that name (packages car, lawstat, s20x; you
should always state which package is being used).

All default to what I would consider to be the better test,
namely using the median as location measure. lawstat gives
you options: try it with location=mean.

Then switch to the Fligner-Killeen test (fligner.test() in
package 'stats'). The reference cited in ?fligner.test makes
for good reading.

 -Peter Ehlers



  Ravi


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Plot different regression models on one graph

2010-02-14 Thread Peter Ehlers


kMan wrote:

Peter wrote:

You like to live dangerously.


Clue me in, Professor.

Sincerely,
KeithC.



Okay, Keith, here goes:

dat -
structure(list(x = c(62.5, 68.5, 0, 52, 0, 52, 0, 52, 23.5, 86,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0), y = c(0.054, 0.055, 0.017, 0.021,
0.02, 0.028, 0.032, 0.073, 0.076, 0.087, 0.042, 0.042, 0.041,
0.045, 0.021, 0.018, 0.017, 0.018, 0.028, 0.022)), .Names = c(x,
y), row.names = c(NA, -20L), class = data.frame)

fm1 - lm(y ~ poly(x,3), data = dat)
fm2 - lm(y ~ poly(x,3), data = dat, subset = -9)

xx - 0:86
yhat1 - predict(fm1, list(x = xx))
yhat2 - predict(fm2, list(x = xx))

plot(x,y)
lines(xx, yhat1, col=blue, lwd=2)
lines(xx, yhat2, col=red, lwd=2)


That's how much difference *one* point makes in a cubic fit!
I'm not much of a gambler, so I wouldn't base any important
decisions on the results of the fit.

Cheers,

 -Peter Ehlers


-Original Message-
From: Peter Ehlers [mailto:ehl...@ucalgary.ca] 
Sent: Sunday, February 14, 2010 6:49 PM

To: kMan
Cc: 'David Winsemius'; 'Rhonda Reidy'; r-help@r-project.org
Subject: Re: [R] Plot different regression models on one graph

kMan wrote:

I would use all of the data. If you want to drop one, control for it in
the model  sacrifice a degree of freedom.


You like to live dangerously.

  -Peter Ehlers


Why the call to poly() by the way?

KeithC.

-Original Message-
From: Peter Ehlers [mailto:ehl...@ucalgary.ca] 
Sent: Saturday, February 13, 2010 1:35 PM

To: David Winsemius
Cc: Rhonda Reidy; r-help@r-project.org
Subject: Re: [R] Plot different regression models on one graph

Rhonda:

As David points out, a cubic fit is rather speculative. I think that one
needs to distinguish two situations: 1) theoretical justification for a
cubic model is available, or 2) you're dredging the data for the best

fit.

Your case is the second. That puts you in the realm of EDA (exploratory

data

analysis). You're free to fit any model you wish, but you should assess

the

value of the model. One convenient way is to use the influence.measures()
function, which will show that points 9 and 10 in your data have a strong
influence on your cubic fit (as, of course, your eyes would tell you). A
good thing to do at this point is to fit 3 more cubic models:
1) omitting point 9, 2) omitting point 10, 3) omitting both.

Try it and plot the resulting fits. You may be surprised.

Conclusion: Without more data, you can conclude nothing about a
non-straightline fit.

(And this hasn't even addressed the relative abundance of x=0 data.)

  -Peter Ehlers

David Winsemius wrote:

On Feb 13, 2010, at 1:35 PM, Rhonda Reidy wrote:

The following variables have the following significant relationships 
(x is the explanatory variable): linear, cubic, exponential, logistic.

The linear relationship plots without any trouble.

Cubic is the 'best' model, but it is not plotting as a smooth curve 
using the following code:


cubic.lm- lm(y~poly(x,3))

Try:

lines(0:80, predict(cubic.lm, data.frame(x=0:80)),lwd=2)

But I really must say the superiority of that relationship over a 
linear one is far from convincing to my eyes. Seems to be caused by 
one data point. I hope the quotes around best mean tha tyou have the

same qualms.

lines(x,predict(cubic.lm),lwd=2)

How do I plot the data and the estimated curves for all of these 
regression models in the same plot?


x - c(62.5,68.5,0,52,0,52,0,52,23.5,86,0,0,0,0,0,0,0,0,0,0)

y -
c(0.054,0.055,0.017,0.021,0.020,0.028,0.032,0.073,0.076,0.087,0.042,0
.042,0.041,0.045,0.021,0.018,0.017,0.018,0.028,0.022)


Thanks in advance.

Rhonda Reidy


--
Peter Ehlers
University of Calgary





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] error message error

2010-02-14 Thread Roslina Zakaria

Hi r-users,
 
I hope somebody can help me to understand the error message.  Here is my code;

## Newton iteration
newton_gam - function(z)
{ n   - length(z)
  r   - runif(n)
  tol - 1E-6
  cdf - vector(length=n, mode=numeric)
  fprime - vector(length=n, mode=numeric)
  f   - vector(length=n, mode=numeric)
 
  for (i in 1:1000)
  { cdf  - integrate(fprime, lower = 0, upper = z)$value
    f    - cdf - r
    # Newton method
    z    - z - f/fprime
    
    if (any(f  tol)) break
   }
  cbind(z,cdf)
}
 
alp  - 2.0165
bt1  - 29.107 ; bt2 - 41.517
x1   - d1d4pos[,1];x1[1:10]
x2   - d1d4pos[,2];x2[1:10]
 
 x1   - d1d4pos[,1];x1[1:10]
 [1] 28.4 53.6  1.3 29.5 52.1 65.9 72.6 67.6 58.7 34.5
 x2   - d1d4pos[,2];x2[1:10]
 [1]  43.5  56.2   0.3  16.6  71.1  86.3 172.8 111.8  89.9  70.2
 
z - (x1/bt1)+(x2/bt2); z
 
newton_gam(min(z))
newton_gam(max(z))
 
Error in -`*tmp*` : invalid argument to unary operator
 newton_gam(min(z))
   z  cdf
[1,] Inf 1.141004e-05
 newton_gam(max(z))
Error in integrate(fprime, lower = 0, upper = z) : 
  non-finite function value
 
 z[100]
[1] 7.544834
 newton_gam(z[100])
Error in integrate(fprime, lower = 0, upper = z) : 
  non-finite function value
 
Thank you.




  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Separating columns, and sorting by rows

2010-02-14 Thread RagingJim


Dear anyone who knows more about R than me (so everyone). I have been bashing
my head on the keyboard all day trying to do something with my table.

I have some data, like so:
 -mm Rainfall(mm)
1   1977-0217.4
2   1977-0334.0
3   1977-0426.2
4   1977-0542.6
5   1977-0658.6
6   1977-0723.2
7   1977-0826.8
8   1977-0948.4
9   1977-1047.0
10  1977-1137.2
11  1977-1215.0
12  1978-01 2.6
13  1978-02 6.8
14  1978-03 9.0
15  1978-0446.6

The data continues on for x number of hundreds of data points. Simply put, I
need to make that data.frame into this data.frame/table/matrix/grid/... you
get the picture.

 Jan  Feb  Mar ... etc
Year   Rain Rain Rain
Year   Rain Rain Rain
Year   Rain Rain Rain
Year   etc...
Year
Year

How on earth do I do it? I have made little to no progress on it all day.
Also, just like all the other problems, the year and month will change every
time depending upon which csv file or sql query I load into the program. If
anyone has any pointers, that would be awesome.

Cheers lads.
-- 
View this message in context: 
http://n4.nabble.com/Separating-columns-and-sorting-by-rows-tp1555806p1555806.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Separating columns, and sorting by rows

2010-02-14 Thread RagingJim


the other alternative would be to edit my sql query so that that data is
brought in from the database and put in to the correct format initially.

sqlQuery(conn, select lsd,ttl_mo_prcp from mo_rains where stn_num=23090)

That is my very basic query. I have also been given this for use in orcale
(I believe):

my $query = select to_char(stn_num,'9') as stn,
to_char(lsd,'') as yr,
   
nvl(max(decode(to_char(lsd,'MON'),'JAN',ttl_mo_prcp)),-) jan,
   
nvl(max(decode(to_char(lsd,'MON'),'FEB',ttl_mo_prcp)),-) feb,
   
nvl(max(decode(to_char(lsd,'MON'),'MAR',ttl_mo_prcp)),-) mar,
   
nvl(max(decode(to_char(lsd,'MON'),'APR',ttl_mo_prcp)),-) apr,
   
nvl(max(decode(to_char(lsd,'MON'),'MAY',ttl_mo_prcp)),-) may,
   
nvl(max(decode(to_char(lsd,'MON'),'JUN',ttl_mo_prcp)),-) jun,
   
nvl(max(decode(to_char(lsd,'MON'),'JUL',ttl_mo_prcp)),-) jul,
   
nvl(max(decode(to_char(lsd,'MON'),'AUG',ttl_mo_prcp)),-) aug,
   
nvl(max(decode(to_char(lsd,'MON'),'SEP',ttl_mo_prcp)),-) sep,
   
nvl(max(decode(to_char(lsd,'MON'),'OCT',ttl_mo_prcp)),-) oct,
   
nvl(max(decode(to_char(lsd,'MON'),'NOV',ttl_mo_prcp)),-) nov,
   
nvl(max(decode(to_char(lsd,'MON'),'DEC',ttl_mo_prcp)),-) dec
 from mo_rains 
 where (stn_num in ($stns)) 
 group by stn_num, to_char(lsd,'')
 order by to_char(lsd,'') desc;;

But I think that sorts by station number, as it is designed for multiple
stations at a time, whereas mine is for one station only. Yet if I plug it
into R just to see what happens, I get a plethora of extraordinarily long
errors which I can post if needed.


-- 
View this message in context: 
http://n4.nabble.com/Separating-columns-and-sorting-by-rows-tp1555806p1555813.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] GAM for non-integer proportions

2010-02-14 Thread Julian Burgos

Dear list,

I´m using the mgcv package to model the proportion by weight of certain prey
on the stomach content of a predator.  This proportion is the ratio of two
weights (prey weight over stomach weight), and ranges between 0 and 1.  The
variance is low when proportion is close to 0 and 1, and higher at
intermediate values.

It seems that the best way to go is to model this using the quasi family
with a logit link and a mu(1-mu) variance.  Or I am missing something
obvious?  I will be thankful for any input.

All the best,

Julian

-- 
Julian Mariano Burgos
Hafrannsóknastofnunin/Marine Research Institute
Skúlagata 4, 121 Reykjavík, Iceland
Sími/Telephone : +354-5752037
Bréfsími/Telefax:  +354-5752001
Netfang/Email: jul...@hafro.is, jmbur...@uw.edu

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Separating columns, and sorting by rows

2010-02-14 Thread milton ruser

Hi Raging Jim

may be this is a starting point.

myDF-read.table(stdin(),head=T,sep=,)
mm,Rainfall
1977-02,17.4
1977-03,34.0
1977-04,26.2
1977-05,42.6
1977-06,58.6
1977-07,23.2
1977-08,26.8
1977-09,48.4
1977-10,47.0
1977-11,37.2
1977-12,15.0
1978-01,2.6
1978-02,6.8
1978-03,9.0
1978-04,46.6

myDF$-substr(myDF$mm,1,4)
myDF$mm-substr(myDF$mm,6,7)
myDF-subset(myDF, select=c(,mm,Rainfall))
myDF.reshape-reshape(myDF,v.names=Rainfall,idvar=,
 timevar=mm,direction=wide)
myDF.reshape
best regards

milton

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Exponential Fitting to Credit default data - A theoretical query

2010-02-14 Thread Julia Cains

Dear R helpers
 
I am working on the credit risk default data and am referring to An 
introduction to Credit Risk Modelling by Christian Bluhm. The literature 
affirms that 'the default frequencies grow exponentially with decreasing credit 
worthiness'.  

I have a data of rating wise default frequencies. The idea is to fit 
exponentail of the type 
Y = a * exp( b*X ) where Y is (dependent variable) default frequency and X is 
rating class code. So e.g. suppose I have rating classes AAA, AA, A and so 
on coded as 1, 2, 3  etc. and suppose the observed default frequencies are  say 
0.00011, 0.0029, 0.0083 respectively, then my values of X and Y are as follows.
 
X Y
1   0.0001
2   0.0029 
3   0.0083
..
..
 
etc and to this data I am fitting Y = a * exp(b*X)
 
My quereis are 
(1) is there any R function which will find out the estimated values of a and b 
instead of taking logaritham and converting both sides into linear equation and 
solve for a and b;
 
(2) How do I find out out Statistically that this is the best fit? e.g. had it 
been linear regression, the R^2 gives me some idea about the fitted linear 
line. In otehr words, is there any way of comparing the observed values of Y 
against the estimated values of Y i.e. values obtained from the equation Y = a 
* exp(b*X). Will the Chi-Square will be a good choice? 
 
I was trying the t-test as given below and am not sure whether I am right to do 
so. 
For each rating class, I have the observed value of Y and estimates value. Say 
for AAA, the observed value is 0.0001 whereas the estimated value is say 
0.0017. I have tested the following hypothesis 
 
Ho : Y(AAA) = 0.0017
H1 : Y(AAA) not equal 0.0017
 
So this is a two tailed case and I have applied the 't' test as 
 
tcal (AAA) = [Y(obse) - Y(estim)]/(s / sqrt(n)).
 
Probelm is in some cases, I have observed significant difference ie. the 
t(calculated value) falling in Critical region. Thus, the need to find out 
whetehr my fit is correct or not.
 
Thanking you in advance
 
Regards
 
Julia Cains, Brisbane
 
 
**

Only a man of Worth sees Worth in other men

**


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Separating columns, and sorting by rows

2010-02-14 Thread David Winsemius



On Feb 15, 2010, at 1:22 AM, milton ruser wrote:


Hi Raging Jim

may be this is a starting point.

myDF-read.table(stdin(),head=T,sep=,)


Those mm entries will become factors, which can lead to  
confusion for newbies. Might be more straightforward to always use  
stringsAsFactors=FALSE in the read.table arguments.  I see that the  
yyymm column later gets thrown away so it may not matter here.



mm,Rainfall
1977-02,17.4
1977-03,34.0
1977-04,26.2
1977-05,42.6
1977-06,58.6
1977-07,23.2
1977-08,26.8
1977-09,48.4
1977-10,47.0
1977-11,37.2
1977-12,15.0
1978-01,2.6
1978-02,6.8
1978-03,9.0
1978-04,46.6



When I did a very similar maneuver, I added an extra NA entry at the  
beginning:


myDF - rbind(list(mm=1977-01, Rainfall=NA), myDF)

... so the columns would start with January. (The warning is harmless.)


myDF$-substr(myDF$mm,1,4)
myDF$mm-substr(myDF$mm,6,7)
myDF-subset(myDF, select=c(,mm,Rainfall))
myDF.reshape-reshape(myDF,v.names=Rainfall,idvar=,
timevar=mm,direction=wide)
myDF.reshape
best regards


When the time comes to rename those columns, knowing that there is a  
system constant called month.names may come in handy. Perhaps  
(untested):


names(myDF.reshape) - c(Year, month.names[1:12])



milton

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Plot different regression models on one graph

2010-02-14 Thread kMan

Dear Peter,

Ah, I see your point, Professor. The point at x=23.5 is carrying the model.
Allow me to clarify. I was making a similar point.

I was suggesting that the cube term could be left in the model, as you did,
but rather than dropping the data-point, another model is fit with an
additional parameter added to control for the suspected outlier -
essentially a point discontinuity.

Note that the call to poly() is not necessary for the graphical
representation, so I'll continue the example without it.
fm3-lm(y~x+I(x^2)+I(x^3), data=dat)
fm3.c-coef(fm3)

# drop data point alternative subseting using logical indexes.
index-dat$x!=23.5
x2-x[index]
x2-dat$x[index]
y2-dat$y[index]
fm4-lm(y2~x2+I(x2^2)+I(x2^3), data=dat)

# controlling for the potential outlier in the model, without throwing out
the data. 
ctrl-rep(0,length=dim(dat)[1])
ctrl[dat$x==23.5]-resid(fm3)[[dat$x==23.5]
fm5-lm(y~x+I(x^2)+I(x^3)+ctrl, data=dat)

# avoids the predict  lines calls, but requires knowledge of the model. 
curve(fm3.c[1]+fm3.c[2]*x+fm3.c[3]*x^2+fm3.c[4]*x^3, col=green, add=TRUE)
curve(fm4.c[1]+fm4.c[2]*x+fm4.c[3]*x^2+fm4.c[4]*x^3, col=green, add=TRUE)
# plot dropped outlier
curve(fm5.c[1]+fm5.c[2]*x+fm5.c[3]*x^2+fm5.c[4]*x^3, col=purple, add=TRUE)
# plot without ctrl variable

anova(fm5) 

Tells us how much difference one point made, sacrificing a df just to ensure
we're not throwing out information haphazardly. F1 or whatever cutoff you
want to use. If it turns out that the point is important, then at least its
existence  effect was documented. 

There is some irony, I suppose, that the graphical representation of
dropping the point vs. adding a control variable, is equivalent for this
example. I have not decided how I feel about that yet, but I do have a
splitting headache. 

Sincerely,
KeithC.

-Original Message-
From: Peter Ehlers [mailto:ehl...@ucalgary.ca] 
Sent: Sunday, February 14, 2010 10:04 PM
To: kMan
Cc: r-help@r-project.org
Subject: Re: [R] Plot different regression models on one graph

kMan wrote:
 Peter wrote:
 You like to live dangerously.
 
 Clue me in, Professor.
 
 Sincerely,
 KeithC.
 

Okay, Keith, here goes:

dat -
structure(list(x = c(62.5, 68.5, 0, 52, 0, 52, 0, 52, 23.5, 86, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0), y = c(0.054, 0.055, 0.017, 0.021, 0.02, 0.028, 0.032,
0.073, 0.076, 0.087, 0.042, 0.042, 0.041, 0.045, 0.021, 0.018, 0.017, 0.018,
0.028, 0.022)), .Names = c(x, y), row.names = c(NA, -20L), class =
data.frame)

fm1 - lm(y ~ poly(x,3), data = dat)
fm2 - lm(y ~ poly(x,3), data = dat, subset = -9)

xx - 0:86
yhat1 - predict(fm1, list(x = xx))
yhat2 - predict(fm2, list(x = xx))

plot(x,y)
lines(xx, yhat1, col=blue, lwd=2)
lines(xx, yhat2, col=red, lwd=2)


That's how much difference *one* point makes in a cubic fit!
I'm not much of a gambler, so I wouldn't base any important decisions on the
results of the fit.

Cheers,

  -Peter Ehlers

 -Original Message-
 From: Peter Ehlers [mailto:ehl...@ucalgary.ca]
 Sent: Sunday, February 14, 2010 6:49 PM
 To: kMan
 Cc: 'David Winsemius'; 'Rhonda Reidy'; r-help@r-project.org
 Subject: Re: [R] Plot different regression models on one graph
 
 kMan wrote:
 I would use all of the data. If you want to drop one, control for 
 it in the model  sacrifice a degree of freedom.
 
 You like to live dangerously.
 
   -Peter Ehlers
 
 Why the call to poly() by the way?

 KeithC.

 -Original Message-
 From: Peter Ehlers [mailto:ehl...@ucalgary.ca]
 Sent: Saturday, February 13, 2010 1:35 PM
 To: David Winsemius
 Cc: Rhonda Reidy; r-help@r-project.org
 Subject: Re: [R] Plot different regression models on one graph

 Rhonda:

 As David points out, a cubic fit is rather speculative. I think that 
 one needs to distinguish two situations: 1) theoretical justification 
 for a cubic model is available, or 2) you're dredging the data for the
best
 fit.
 Your case is the second. That puts you in the realm of EDA 
 (exploratory
 data
 analysis). You're free to fit any model you wish, but you should 
 assess
 the
 value of the model. One convenient way is to use the 
 influence.measures() function, which will show that points 9 and 10 
 in your data have a strong influence on your cubic fit (as, of 
 course, your eyes would tell you). A good thing to do at this point is to
fit 3 more cubic models:
 1) omitting point 9, 2) omitting point 10, 3) omitting both.

 Try it and plot the resulting fits. You may be surprised.

 Conclusion: Without more data, you can conclude nothing about a 
 non-straightline fit.

 (And this hasn't even addressed the relative abundance of x=0 data.)

   -Peter Ehlers

 David Winsemius wrote:
 On Feb 13, 2010, at 1:35 PM, Rhonda Reidy wrote:

 The following variables have the following significant 
 relationships (x is the explanatory variable): linear, cubic,
exponential, logistic.
 The linear relationship plots without any trouble.

 Cubic is the 'best' model, but it is not plotting as a smooth curve 
 using the following code:

Re: [R] lattice/ylim: how to fix ylim[1], but have ylim[2] dynamically calculated?

2010-02-14 Thread Johannes Graumann

Rolf Turner wrote:

 
 On 15/02/2010, at 9:40 AM, Johannes Graumann wrote
 
 SNIP
 
 (In response to some advice from David Winsemius):
 
 I am quite certain that this is the most elaborately worded version of
 RTFM I have ever come across.
 
 
 I nominate this as a fortune.  (Despite Prof. Winsemius's later
 protestation that his advice was *not* a version of RTFM.)

Uh, oh ... certified notoriety?

Joh

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] lm function in R

2010-02-14 Thread kMan

Dear Something,

 

Ah, you need accessor functions. Set your lm() call to a variable name. If
my data frame or object is called dat, for example, I usually use dat.lm.

 

summary(dat.lm)

resid(dat.lm)

?summary.lm gives a host of information, including summary(dat.lm)$sigma and
$df for resolving SSE.

Alternatively, sum(c(resid(dat.lm)^2)).

 

See anova() as well.

 

Also, vif() [car] for the variance inflation factor (e.g. tolerance=1/vif)

 

Hope this helps.

 

Sincerely,

KeithC.

 

From: Something Something [mailto:mailinglist...@gmail.com] 
Sent: Sunday, February 14, 2010 11:49 PM
To: kMan
Subject: Re: [R] lm function in R

 

Thank you so much, kMan.  That makes sense.  Only one question, how can I
see the value of 'error'?  Here's what I see:

Call:

lm(formula = Y ~ X1 * X2 * X3, na.action = na.exclude)

 

Coefficients:

(Intercept)   X1   X2   X3X1:X2X1:X3
X2:X3 X1:X2:X3  

   177.9245   0.2005   2.4482   3.1216   0.8127 -26.6166
-3.0398  29.6049  


I don't see 'error' here.  Please explain.  Thanks again for your help.



 

On Sun, Feb 14, 2010 at 3:26 PM, kMan kchambe...@gmail.com wrote:

Dear Something,

Tricky to differentiate subscripts vs. different variables in plain text, so
I'll just use 2, X and Z.

Y~X+Z yields the model Y=b0 + b1*X + b2*Z + error
Y~X*Z yields the model Y=b0 + b1*X + b2*Z + b3*X*Z + error |b3 is the
interaction term.
Y~X+Z+I(X*Z) is the same thing but with the interaction term spelled out.
See ?I
XZint-X*Z
Y~X+Z+XZint same thing but avoids the call to I() in the model.

Note that:
Y~a*a or Y~a^2 does not evaluate an interaction, hence the use of I().

Sincerely,
KeithC.


-Original Message-
From: Something Something [mailto:mailinglist...@gmail.com]

Sent: Saturday, February 13, 2010 2:24 PM
To: Daniel Nordlund

Cc: r-help@r-project.org
Subject: Re: [R] lm function in R

Thanks Dan.  Yes that was very helpful.  I didn't see the change from '*' to
'+'.

Seems like when I put * it means - interaction  when I put + it's not an
interaction.

Is it correct to assume then that...

When I put + R evaluates the following equation:
Y-Hat = b0 + b1X1 + b2X2 + . . . bkXk + . . . + bkXk


But when I put * R evaluates the following equation; Y-Hat = b0 + b1X1 +
b2x2 + ... + bkXk + b12 X12+ b13 X13 +  + c

Is this correct?  If it is then can someone point me to any sources that
will explain how the coefficients (such as b0... bk, b12.. , b123..) are
calculated.  I guess, one source is the R source code :) but is there any
other documentation anywhere?

Please let me know.  Thanks.



On Fri, Feb 12, 2010 at 5:54 PM, Daniel Nordlund
djnordl...@verizon.netwrote:

  -Original Message-
  From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org]
  On Behalf Of Something Something
  Sent: Friday, February 12, 2010 5:28 PM
  To: Phil Spector; r-help@r-project.org
  Subject: Re: [R] lm function in R
 
  Thanks for the replies everyone.  Greatly appreciate it.  Some
  progress, but now I am getting the following values when I don't use
  as.factor
 
  13.14167 25.11667 28.34167 49.14167 40.39167 66.86667
 
  Is that what you guys get?


 If you look at Phil's response below, no, that is not what he got.
 The difference is that you are specifying an interaction, whereas Phil
 did not (because the equation you initially specified did not include
 an interaction.  Use Y ~ X1 + X2 instead of Y ~ X1*X2 for your formula.

 
 
  On Fri, Feb 12, 2010 at 5:00 PM, Phil Spector
  spec...@stat.berkeley.eduwrote:
 
   By converting the two variables to factors, you are fitting an
   entirely different model.  Leave out the as.factor stuff and it
   will work exactly as you want it to.
  
dat
  
V1 V2 V3 V4
   1 s1 14  4  1
   2 s2 23  4  2
   3 s3 30  7  2
   4 s4 50  7  4
   5 s5 39 10  3
   6 s6 67 10  6
  
   names(dat) = c('id','y','x1','x2') z = lm(y~x1+x2,dat)
   predict(z)
  
 123456 15.16667 24.7
   27.7 46.7 40.16667 68.7
  
  
  - Phil Spector
   Statistical Computing Facility
   Department of Statistics
   UC Berkeley
   spec...@stat.berkeley.edu
  
  
  
   On Fri, 12 Feb 2010, Something Something wrote:
  
Hello,
  
   I am trying to learn how to perform Multiple Regression Analysis in
R.
  I
   decided to take a simple example given in this PDF:
   http://www.utdallas.edu/~herve/abdi-prc-pretty.pdf
  
   I created a small CSV called, students.csv that contains the
   following
   data:
  
   s1 14 4 1
   s2 23 4 2
   s3 30 7 2
   s4 50 7 4
   s5 39 10 3
   s6 67 10 6
  
   Col headers:  Student id, Memory span(Y), age(X1), speech
   rate(X2)
  
   Now the expected results are:

Re: [R] Creating a new Access database with R

2010-02-14 Thread Ryusuke Kenji

I have just got an idea which is using RExcel and coding with Excel VBA may 
cope with it, its work!!!

Ryusuke

From: urishim...@optiver.com
To: ryusukeke...@hotmail.com
Date: Mon, 15 Feb 2010 08:31:49 +0100
Subject: RE: RE:Creating a new Access database with R

No idea, sorry. Haven$B!G(Bt checked it since. You might want to look
at simply creating a new file titled $B!F(Bjustanothername.mdb$B!G(B and 
see if it works
(probably not)

Sorry I can$B!G(Bt be more helpful, which is why I keep this off the
list$B!D(B

Uri

From: Ryusuke Kenji
[mailto:ryusukeke...@hotmail.com] 

Sent: Thursday 11 February 2010 16:51

To: Uri Shimron; r-help@r-project.org

Subject: RE:Creating a new Access database with R

I am facing the same problem as well, I would like
to code with following concept but wondering how to cope it.

if (*mdb file exist)

{ add new row/col }

else

   { add new *mdb file }

--

From: Uri Shimron UriShimron_at_optiver.com

Date: Thu 21 Sep
2006 - 13:27:35 GMT

First of all, since this is my first posting, I would like to thank anybody who
[[elided Hotmail spam]]

My question is: how do I create a new Access database with R? I need a channel
before I can do anything, but if the mdb-file doesn't exist, I can't connect to
it with odbcConnectAccess. 

I've looked at the RODBC.pdf on CRAN, searched the mailing-lists, and looked at
test.R file in the package. But probably I've overlooked something. 

It is of course possible to keep a clean new mdb-file somewhere and then copy
it to the required directory with: shell(copy EmptyDB.mdb
NewLocation.mdb) 

But that isn't very elegant... 

Thanks in advance, 

Uri Shimron 

USB$B%a%b%jBe$o$j$K$*;H$$$/$...@$5$$!#l5na$g;H$($k(B25GB$B!#(B 
SkyDrive$B$r:#$9$0BN83(B

***

This email and any files transmitted with it are confide...{{dropped:23}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] SLR 500 Random Samples

2010-02-14 Thread Schwonka


Consider the SLR y = 50 + 10x + e where e is NID(0.16). n = 20 pairs of
observations. Generate 500 samples of 20 obersvations drawing 1 observations
drawing one observation from each level of x = .5,1,1.5 ... 10 for each
sample

A) For each sample computer the least squares regression estimates of the
slop and intercept. Construct histograms of the sample values of B_0_hat and
B_1_hat.

B) For each sample compute an estimate of E(y|x=5) Construct a histogram
from these estimates.

C) For each sample compute a 95% CI on the slope. How manyt of these contain
the true value B_1 = 10

D) For each estimate of E(y|x=5) in part b compute the 95% CI. How many of
these intervals contain the trye value of E(y|x=5)=100?

I have 0 idea how to do these as R was not a prereq for my regression course
but is needed for these problems. As much help as you could offer would be
amazing.
-- 
View this message in context: 
http://n4.nabble.com/SLR-500-Random-Samples-tp1555769p1555769.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SLR 500 Random Samples

2010-02-14 Thread Dimitris Rizopoulos

it seems that this is a homework! Please read carefully the Posting 
Guide: http://www.r-project.org/posting-guide.html -- the R-help mailing 
list is not intended for Basic Statistics and classroom homework!



Best,
Dimitris


Schwonka wrote:

Consider the SLR y = 50 + 10x + e where e is NID(0.16). n = 20 pairs of
observations. Generate 500 samples of 20 obersvations drawing 1 observations
drawing one observation from each level of x = .5,1,1.5 ... 10 for each
sample

A) For each sample computer the least squares regression estimates of the
slop and intercept. Construct histograms of the sample values of B_0_hat and
B_1_hat.

B) For each sample compute an estimate of E(y|x=5) Construct a histogram
from these estimates.

C) For each sample compute a 95% CI on the slope. How manyt of these contain
the true value B_1 = 10

D) For each estimate of E(y|x=5) in part b compute the 95% CI. How many of
these intervals contain the trye value of E(y|x=5)=100?

I have 0 idea how to do these as R was not a prereq for my regression course
but is needed for these problems. As much help as you could offer would be
amazing.


--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

89 matches

Mail list logo