Re: [R] \ll and \gg in expression()
Prof Brian Ripley wrote:
On Sat, 9 Feb 2008, Michael Kubovy wrote:
On Feb 9, 2008, at 4:41 PM, Prof Brian Ripley wrote:
On Sat, 9 Feb 2008, Michael Kubovy wrote:

[Kubovy] How do I enter 'much greater than' and 'much less than' symbols in an expression?

[Ripley] Those are not in the Adobe Symbol encoding used for plotmath. Since you have not told us your platform and locale as requested in the posting guide...

[Kubovy] R version 2.7.0 Under development (unstable) (2008-02-05 r44340), i386-apple-darwin8.10.1; locale: C

[Ripley] I don't know if the following is relevant to you. If you have a suitable Unicode font and the means to use it (which most likely means a UTF-8 locale in R 2.7.0), they are the glyphs for \u226a and \u226b (see http://www.alanwood.net/unicode/mathematical_operators.html). A quick check suggests that not many fonts do.

[Kubovy] Thanks. The Mac character palette tells me that they correspond to Unicode 226A and B, or UTF-8 E2 89 AA and AB. My question now is, how do I tell expression() to use the glyphs for these?

[Ripley] \u226a and \u226b, as I said. But not in a C locale.

[Dalgaard] Otherwise, a quick approximation could be expression(x~~1). (We don't do negative thin space, do we? That could make it look better.)

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr. B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph: (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])        FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
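A minimal sketch of Ripley's suggestion (my own illustration, not code from the thread): in a UTF-8 locale, with a font that carries U+226A/U+226B, the glyph can be embedded directly in a plotmath label. The off-screen pdf(NULL) device is only there so the example runs headless.

```r
# Assumes a UTF-8 locale and a font containing the glyphs (as Ripley notes,
# many fonts lack them).
pdf(NULL)                                   # off-screen device, no file written
plot(1, type = "n", xlab = "", ylab = "",
     main = expression(x ~ "\u226a" ~ y))   # U+226A, MUCH LESS-THAN
invisible(dev.off())
```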
Re: [R] writing a function
mohamed nur anisah [EMAIL PROTECTED] [Fri, Feb 08, 2008 at 04:42:41PM CET]:
> Dear lists, I'm in the process of learning to write functions. I tried to write simple functions for a matrix and a vector. Here is the code:
>
> mm <- function(m, n) {  # matrix function
>   w <- matrix(nrow = m, ncol = n)
>   for (i in 1:m) {
>     for (j in 1:n) {
>       w[i, j] = i + j
>     }
>   }
>   return(w[i, j])
> }

In addition to the other comments, allow me to remark that R provides a lot of convenience functions on vectors that make explicit looping unnecessary. An error such as yours wouldn't have occurred to a more experienced expRt because indices wouldn't turn up in the code at all:

mm <- function(m, n) {
  a <- matrix(nrow = m, ncol = n)
  row(a) + col(a)
}

Greetings Johannes
-- Johannes Hüsing                  There is something fascinating about science. One gets such
mailto:[EMAIL PROTECTED]           wholesale returns of conjecture from such a trifling
http://derwisch.wikidot.com        investment of fact. (Mark Twain, Life on the Mississippi)
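The same idea can be written still more compactly with outer(), which builds the whole matrix w[i, j] = i + j in one call (my own illustration of the vectorized style Johannes describes; the name mm is kept from the thread):

```r
# Vectorized construction of w[i, j] = i + j, with no explicit loops:
mm <- function(m, n) outer(1:m, 1:n, `+`)

mm(2, 3)
#      [,1] [,2] [,3]
# [1,]    2    3    4
# [2,]    3    4    5
```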
[R] Testing for differences between groups: need help finding the right test in R
Dear all, I have a data set with four different groups; for each group I have several observations (the numbers of observations in the groups are unequal), and I want to test whether there are differences in the values between the groups. What would be the most appropriate way to test this in R? Regards, Kes
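A sketch of two standard answers, on hypothetical data (the data frame, its columns, and the group sizes are all my invention): a one-way ANOVA if the usual normality/equal-variance assumptions are plausible, and the rank-based Kruskal-Wallis test otherwise. Both handle unequal group sizes.

```r
# Hypothetical data: four groups with unequal numbers of observations
set.seed(1)
d <- data.frame(
  value = rnorm(3 + 5 + 4 + 6),
  group = factor(rep(c("A", "B", "C", "D"), times = c(3, 5, 4, 6)))
)

summary(aov(value ~ group, data = d))   # one-way ANOVA (assumes normality)
kruskal.test(value ~ group, data = d)   # rank-based, fewer assumptions
```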
[R] Using 'sapply' and 'by' in one function
Greetings, I'm having a problem with something that I think is very simple: I'd like to be able to use the 'sapply' and 'by' functions in one function, to be able (for example) to get regression coefficients from multiple models by a grouping variable. I think that I'm missing something that is probably obvious to experienced users. Here's a simple (trivial) example of what I'd like to do:

new <- data.frame(Outcome.1 = rnorm(10), Outcome.2 = rnorm(10), sex = rep(0:1, 5), Pred = rnorm(10))
fxa <- function(x, data) { lm(x ~ Pred, data = data)$coef }
sapply(new[, 1:2], fxa, new)  # this yields coefficients for the predictor in separate models
fxb <- function(x) { lm(Outcome.1 ~ Pred, data = x)$coef }
by(new, new$sex, fxb)  # yields the coefficient for Outcome.1 for each sex

## I'd like to combine 'sapply' and 'by' to get the regression coefficients for Outcome.1 and Outcome.2 for each sex, rather than running fxb a second time predicting 'Outcome.2', or subsetting the data by sex before I run the function, but the following doesn't work:

by(new, new$sex, FUN = function(x) sapply(x[, 1:2], fxa, new))
'Error in model.frame.default(formula = x ~ Pred, data = data, drop.unused.levels = TRUE) : variable lengths differ (found for 'Pred')'

## I understand the error message (the length of 'Pred' is 10 while the length of each sex group is 5), but I'm not sure how to correctly write the 'by' function to use 'sapply' inside it. Could someone please point me in the right direction?

Thanks very much in advance, David S Freedman, CDC (Atlanta USA) [definitely not the well-known statistician David A Freedman, in Berkeley]
Re: [R] Using 'sapply' and 'by' in one function
By passing new to fxa via the second argument of fxa, new is not being subsetted, hence the error. Try this:

by(new, new$sex, function(x) sapply(x[1:2], function(y) coef(lm(y ~ Pred, x))))

Actually, you can do the above without sapply, as lm can take a matrix for the dependent variable:

by(new, new$sex, function(x) coef(lm(as.matrix(x[1:2]) ~ Pred, x)))

On Feb 10, 2008 8:19 AM, David Natalia [EMAIL PROTECTED] wrote:
> [original question snipped]
Re: [R] Using 'sapply' and 'by' in one function
Actually, thinking about this, not only do you not need sapply but you don't even need by:

new2 <- transform(new, sex = factor(sex))
coef(lm(as.matrix(new2[1:2]) ~ sex/Pred - 1, new2))

On Feb 10, 2008 8:43 AM, Gabor Grothendieck [EMAIL PROTECTED] wrote:
> [earlier suggestions snipped]
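A quick way to convince oneself that the nested formula reproduces the per-group coefficients (a sketch on simulated data with the thread's column names; the seed and check are my own):

```r
set.seed(42)
new <- data.frame(Outcome.1 = rnorm(10), Outcome.2 = rnorm(10),
                  sex = rep(0:1, 5), Pred = rnorm(10))
new2 <- transform(new, sex = factor(sex))

# Per-sex fits via by():
bycoef <- by(new2, new2$sex, function(x) coef(lm(as.matrix(x[1:2]) ~ Pred, x)))

# Single nested fit:
allcoef <- coef(lm(as.matrix(new2[1:2]) ~ sex/Pred - 1, new2))

# The Pred slope for sex == 0, Outcome.1 agrees between the two approaches:
stopifnot(isTRUE(all.equal(unname(bycoef[[1]]["Pred", "Outcome.1"]),
                           unname(allcoef["sex0:Pred", "Outcome.1"]))))
```

The coefficient values coincide; as noted in the thread, the single fit pools the error variance across groups, so the standard errors differ.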
Re: [R] Which package should I use if I estimate a recursive model?
Dear Yongfu He, If you mean a recursive structural-equation model, then, if you're willing to assume normally distributed errors, equation-by-equation OLS regression, using lm(), will give you the full-information maximum-likelihood estimates of the structural coefficients. You could also use the sem() function in the sem package, but, aside from getting a test of over-identifying restrictions (assuming that the model is overidentified), there's not much reason to do so; you'll get the same estimates. I hope this helps, John

John Fox, Professor, Department of Sociology, McMaster University, Hamilton, Ontario, Canada L8S 4M4, 905-525-9140 x23604, http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Yongfu He
Sent: February-09-08 9:16 PM
To: r-help@r-project.org
Subject: [R] Which package should I use if I estimate a recursive model?

Dear All: I want to estimate a simple recursive model in R. Which package should I use? Thank you very much in advance. Yongfu He
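A sketch of the equation-by-equation approach John describes, on a hypothetical two-equation recursive system (variable names and coefficients are my invention, purely for illustration):

```r
# Hypothetical recursive system:  y1 depends on x;  y2 depends on y1 and x
set.seed(7)
n  <- 200
x  <- rnorm(n)
y1 <- 0.5 * x + rnorm(n)
y2 <- 0.8 * y1 + 0.3 * x + rnorm(n)

# For a recursive model with normal errors, equation-by-equation OLS
# gives the FIML estimates of the structural coefficients:
eq1 <- lm(y1 ~ x)
eq2 <- lm(y2 ~ y1 + x)
coef(eq1)
coef(eq2)
```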
Re: [R] R on Mac Pro: does anyone have experience with R on such a platform?
On Feb 10, 2008 2:29 AM, Maura E Monville [EMAIL PROTECTED] wrote:
> I saw there exists an R version for Mac OS. I'd like to hear from someone who is running R on a Mac OS before venturing on getting the following computer system. I am in the process of choosing a powerful laptop: a 17-inch MacBook Pro, 2.6 GHz (dual-core), 4 GB RAM. Thank you so much, Maura E.M

You can see the R Mac OS X FAQ, http://cran.es.r-project.org/bin/macosx/RMacOSX-FAQ.html. Also, you can post in the Mac R list (R-sig-mac). Rod.
Re: [R] Do I need to use dropterm()??
Hi Dani, it would be better to start with a question you are trying to ask of your data rather than trying to figure out what a particular function does. With your variables and model, even if the component terms were not significant, they must be in the model, or the product of sunlight and aspect will NOT represent the interaction. Also note that the tests of your components are probably not what you think they are. In general, tests of components of interactions test the simple effect of that variable when the other variable is 0. Hence, your 'significant' result for aspect pertains to when log sunlight is 0, which probably isn't what you want to be testing. What the significant effect for sunlight means depends on how aspect was coded; you should check to see what coding was used to know what zero means. Gary McClelland, Colorado

On Sun, Feb 10, 2008 at 6:40 AM, DaniWells [EMAIL PROTECTED] wrote:
> Hello, I'm having some difficulty understanding the usage of the dropterm() function in the MASS library. What exactly does it do? I'm very new to R, so any pointers would be very helpful. I've read many definitions of what dropterm() does, but none seem to stick in my mind or click with me. I've coded everything fine for an interaction that runs as follows: two sets of data (one for North aspect, one for South aspect), with a log scale on the x axis and survival on the y. After calculating my ANOVA results I have all significant results (i.e. aspect = sig, log scale of sunlight = sig, and aspect:llight = sig). When I have all significant results in my ANOVA table, do I need dropterm(), or is that just to remove insignificant terms? Many thanks, Dani
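For what dropterm() actually reports, here is a sketch on hypothetical data shaped like Dani's problem (two aspects, a log-sunlight covariate, a binary survival outcome; all names and coefficients are my invention). Note that dropterm() respects marginality: with the interaction present, only the interaction is offered for dropping, never its component main effects.

```r
library(MASS)  # for dropterm()

# Hypothetical survival-style data
set.seed(3)
d <- data.frame(aspect = factor(rep(c("N", "S"), each = 20)),
                llight = rnorm(40))
d$y <- rbinom(40, 1, plogis(0.5 + (d$aspect == "S") + 0.7 * d$llight))

fit <- glm(y ~ aspect * llight, family = binomial, data = d)
dt  <- dropterm(fit, test = "Chisq")  # only aspect:llight is eligible to drop
dt
```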
[R] Error in optim while using fitdistr() function
I get the digest, so I apologize if this is a little late. For your situation (based on the description and what I think your code is doing, more on that below), it looks like you are modeling a Poisson flow where the number of hits per unit time is a random integer with some mean value. If I understand your code correctly, you are trying to put your data into k bins of width f <- (max(V1) - min(V1))/k. In that case I would think something like this would work more efficiently:

m <- min(V1)
k <- floor(1 + log2(length(V1)))
f <- (max(V1) - min(V1))/k
binCount <- NULL
for (i in seq(length = k)) {
  binIndex <- which(m + (i - 1) * f < V1 & V1 <= m + i * f)
  binCount[i] <- sum(V2[binIndex])
}

where i becomes the index of time intervals. Hope it helps. Sincerely, Jason Q. McClintic
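The explicit loop can also be replaced by cut() plus tapply(), which bins and sums in two lines (a sketch; V1 and V2 here are hypothetical stand-ins for the poster's times and request counts):

```r
# Hypothetical stand-ins for the poster's V1 (times) and V2 (counts):
set.seed(11)
V1 <- runif(421, 0, 420)
V2 <- rpois(421, 5)

k        <- floor(1 + log2(length(V1)))        # Sturges-style bin count
bins     <- cut(V1, breaks = k)                # k equal-width intervals
binCount <- as.vector(tapply(V2, bins, sum))   # summed requests per bin
binCount[is.na(binCount)] <- 0                 # empty bins, if any

stopifnot(sum(binCount) == sum(V2))            # nothing lost in binning
```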
Re: [R] Using 'sapply' and 'by' in one function
On Feb 10, 2008 8:25 AM, Gabor Grothendieck [EMAIL PROTECTED] wrote:
> Actually, thinking about this, not only do you not need sapply but you don't even need by:
> new2 <- transform(new, sex = factor(sex))
> coef(lm(as.matrix(new2[1:2]) ~ sex/Pred - 1, new2))

Although that's a very slightly different model, as it assumes that both sexes have the same error variance.

Hadley
-- http://had.co.nz/
Re: [R] Error while using fitdistr() function or goodfit() function
Hello, Thanks, that helped for the Poisson. When I changed the method to ML it worked for the Poisson, but when I used it for the negative binomial I got errors. Why is this happening?

gf <- goodfit(binCount, type = "poisson")
summary(gf)
Goodness-of-fit test for poisson distribution:
                      X^2  df  P(> X^2)
Likelihood Ratio  2730.24  30

gf <- goodfit(binCount, type = "nbinomial")
Warning messages:
1: NaNs produced in: dnbinom(x, size, prob, log)
2: NaNs produced in: dnbinom(x, size, prob, log)
summary(gf)
Goodness-of-fit test for nbinomial distribution:
                      X^2  df      P(> X^2)
Likelihood Ratio 64.53056   2  9.713306e-15

But how can I interpret the above results? When I was using goodfit with method MinChisq I was getting a P value. The higher the P value among the goodness-of-fit tests for the different distributions (poisson, binomial, nbinomial), the better the fit. Am I correct? If I am wrong, correct me. But now, with the ML method, how can I decide which distribution fits best? Thank you. Aswad

On 2/10/08, Jason Q. McClintic [EMAIL PROTECTED] wrote: Try changing your method to ML and try again. I tried to run the first example from the documentation and it failed with the same error. Changing the estimation method to ML worked. @List: Can anyone else verify the error I got? I literally ran the following two lines interactively from the example for goodfit:

dummy <- rnbinom(200, size = 1.5, prob = 0.8)
gf <- goodfit(dummy, type = "nbinomial", method = "MinChisq")

and got back:
Warning messages:
1: In pnbinom(q, size, prob, lower.tail, log.p) : NaNs produced
2: In pnbinom(q, size, prob, lower.tail, log.p) : NaNs produced

Again, I hope this helps. Sincerely, Jason Q. McClintic

Aswad Gurjar wrote: Hello, Thanks for the help, but I am facing a different problem.
I have 421 readings of time and the number of requests coming in at each particular time. Basically I have data at one-minute intervals and the corresponding number of requests; it is discrete in nature. I am collecting data from 9 AM to 4 PM, but some of the readings are 0. When I plotted a histogram of the data I could not get the shape of any standard distribution. Now my aim is to find the distribution which best fits my data among the standard ones. There was a huge amount of data, so I tried to collect the data into a number of bins. That was working properly, and the code you have given works properly too, and is more efficient. The problem comes at the next stage: when I apply fitdistr() for continuous data or goodfit() for discrete data I get the following errors, which I am not able to remove. Please help me if you can. The errors are as follows:

library(vcd)
gf <- goodfit(binCount, type = "nbinomial", method = "MinChisq")
Warning messages:
1: NaNs produced in: pnbinom(q, size, prob, lower.tail, log.p)
  [warning repeated 5 times]
summary(gf)
Goodness-of-fit test for nbinomial distribution:
             X^2  df     P(> X^2)
Pearson 9.811273   2  0.007404729
Warning message:
Chi-squared approximation may be incorrect in: summary.goodfit(gf)

For another distribution:

gf <- goodfit(binCount, type = "poisson", method = "MinChisq")
Warning messages:
1: NA/Inf replaced by maximum positive value in: optimize(chi2, range(count))
  [warning repeated 8 times]
summary(gf)
Goodness-of-fit test for poisson distribution:
                  X^2  df  P(> X^2)
Pearson 1.660931e+115  30
Warning message:
Chi-squared approximation may be incorrect in: summary.goodfit(gf)

Aswad

On 2/10/08, Jason Q. McClintic [EMAIL PROTECTED] wrote:
> [earlier binning suggestion snipped]
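On the "which distribution fits best under ML?" question: with maximum-likelihood fits, candidate distributions can be compared by log-likelihood or AIC rather than by chi-square P values. A sketch using MASS::fitdistr on hypothetical counts (the data here are simulated, standing in for binCount; this is my suggestion, not from the thread):

```r
library(MASS)  # fitdistr(), with a logLik method so AIC() works

# Hypothetical over-dispersed counts standing in for binCount:
set.seed(5)
counts <- rnbinom(200, size = 1.5, mu = 4)

fp <- fitdistr(counts, "Poisson")
fn <- fitdistr(counts, "negative binomial")

# Lower AIC = better trade-off of fit vs. number of parameters:
AIC(fp)
AIC(fn)
```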
Re: [R] Using 'sapply' and 'by' in one function
> Although that's a very slightly different model, as it assumes that both sexes have the same error variance.

But the output is the coefficients, and they are identical. For the sake of an example, I'm sure that David simply omitted the part of his analysis where he looked at the standard errors as well ;)

Hadley
-- http://had.co.nz/
Re: [R] building packages for Linux vs. Windows
On 10/02/2008 1:07 PM, Erin Hodgess wrote:
> Hi R People: I'm sure that this is a really easy question, but here goes: I'm trying to build a package that will run on both Linux and Windows. However, there are several commands in a section that will be different in Linux than they are in Windows. Would I be better off just to build two separate packages, please? If just one is needed, how could I determine which system is running in order to use the correct command, please?

You will find it much easier to build just one package. You can use .Platform or (for more detail) Sys.info() to find out what kind of system you're running on. Remember that R doesn't just run on Linux and Windows: there's also Mac OS X, and other Unix and Unix-like systems (Solaris, etc.).

Duncan Murdoch
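A minimal sketch of Duncan's suggestion (the dispatch function and commands are my own illustration):

```r
# Coarse split: "windows" vs "unix" (the latter covers Linux, macOS, Solaris, ...)
os_type <- .Platform$OS.type

# Finer detail when the coarse split is not enough:
sysname <- Sys.info()[["sysname"]]   # e.g. "Linux", "Windows", "Darwin"

# Pick a platform-specific command once, at load time:
run_cmd <- switch(os_type,
                  windows = function() shell("dir"),
                  unix    = function() system("ls"))
stopifnot(os_type %in% c("unix", "windows"), is.function(run_cmd))
```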
[R] building packages for Linux vs. Windows
Hi R People: I'm sure that this is a really easy question, but here goes: I'm trying to build a package that will run on both Linux and Windows. However, there are several commands in a section that will be different in Linux than they are in Windows. Would I be better off just to build two separate packages, please? If just one is needed, how could I determine which system is running in order to use the correct command, please? Thanks in advance, Erin

-- Erin Hodgess, Associate Professor, Department of Computer and Mathematical Sciences, University of Houston - Downtown, mailto: [EMAIL PROTECTED]
[R] prcomp vs. princomp vs fast.prcomp
Hi R People: When performing PCA, should I use prcomp, princomp, or fast.prcomp, please? Thanks, Erin

-- Erin Hodgess, Associate Professor, Department of Computer and Mathematical Sciences, University of Houston - Downtown, mailto: [EMAIL PROTECTED]
Re: [R] building packages for Linux vs. Windows
From my Windows XP system running R 2.6.1:

> version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          6.1
year           2007
month          11
day            26
svn rev        43537
language       R
version.string R version 2.6.1 (2007-11-26)

John Sorkin M.D., Ph.D., Chief, Biostatistics and Informatics, University of Maryland School of Medicine, Division of Gerontology, Baltimore VA Medical Center, 10 North Greene Street, GRECC (BT/18/GR), Baltimore, MD 21201-1524. (Phone) 410-605-7119, (Fax) 410-605-7913 (please call the phone number above prior to faxing).

Ted Harding [EMAIL PROTECTED] 2/10/2008 1:39 PM wrote:
> [quoted message snipped]
Re: [R] building packages for Linux vs. Windows
On 10-Feb-08 18:07:56, Erin Hodgess wrote:
> I'm trying to build a package that will run on both Linux and Windows. However, there are several commands in a section that will be different in Linux than they are in Windows. [...] How could I determine which system is running in order to use the correct command?

There is the version (a list) variable:

version
# platform i486-pc-linux-gnu
# arch     i486
# os       linux-gnu
# system   i486, linux-gnu
# status   Patched
# major    2
# minor    4.0
# year     2006
# month    11
# day      25
# svn rev  39997
# language R

from which you can extract the os component:

version$os
# [1] "linux-gnu"

I don't know what this says on a Windows system, but it surely won't mention Linux! So testing this will enable you to set a flag, e.g.

Linux <- ifelse(length(grep("linux", version$os)) > 0, TRUE, FALSE)
if (Linux) { window <- function(...) X11(...) } else { window <- function(...) windows(...) }

Hoping this helps, Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 10-Feb-08 Time: 18:39:29
------------------------------ XFMail ------------------------------
Re: [R] [R-sig-Geo] Comparing spatial point patterns - Syrjala test
Hi, I went ahead and implemented something. However:
- I cannot guarantee it gives correct results since, unfortunately, the data used in Syrjala 1996 is not published along with the paper. To avoid mistakes, I started by coding things in a fast and simple way and then tried to optimize the code. At least all versions give the same results.
- As expected, the test is still quite slow since it relies on permutations to compute the p.value. The successive optimizations allowed me to go from 73 to 13 seconds on my machine, but 13 seconds is still a long time. Furthermore, I don't know how the different versions would scale with the number of points (I only tested with one dataset).

I'm not very good at thinking in vectors, so if someone could look at this and further improve it, I would welcome patches. Maybe the only real solution would be to go the Fortran way and link some code to R, but I did not want to wander in such scary places ;)

The code and test data are here: http://cbetm.univ-perp.fr/irisson/svn/distribution_data/tetiaroa/trunk/data/lib_spatial.R

Warning: it probably uses non-canonical S syntax, sorry for those with sensitive eyes.

On 2008-February-10, at 17:02, Jan Theodore Galkowski wrote: I'm also interested here in comparing spatial point patterns. So, if anyone finds any further R-based or S-plus-based work on the matter, or any more recent references, might you please include me in the distribution list? Thanks much!

Begin forwarded message: From: jiho [EMAIL PROTECTED] Subject: Comparing spatial point patterns - Syrjala test

Dear Lists, At several stations distributed regularly in space [1], we sampled repeatedly (4 times) the abundance of organisms and measured environmental parameters. I now want to compare the spatial distributions of various species (and test whether they differ or not), or to compare the distribution of a particular organism with the distribution of some environmental variable.
Syrjala's test [2] seems appropriate for such comparisons. The Hamming distance is also used (but it is not associated with a test). However, as far as I understand it, Syrjala's test only compares the distributions gathered during one sampling event, while I have four successive repeats, and:
- I am interested in testing whether, on average, the distributions are the same;
- I would prefer to keep the information regarding the variability of the abundances in time, rather than just comparing the means, since the abundances are quite variable.

Therefore I have two questions for all the knowledgeable R users on these lists:
- Is there a package in which Syrjala's test is implemented for R?
- Is there another way (a better way) to test for such differences?

Thank you very much in advance for your help.

[1] http://jo.irisson.free.fr/work/research_tetiaroa.html
[2] http://findarticles.com/p/articles/mi_m2120/is_n1_v77/ai_18066337/pg_7

JiHO --- http://jo.irisson.free.fr/
Re: [R] grep etc.
sub("-", "--", v, fixed=TRUE) See ?sub. Gabor On Sun, Feb 10, 2008 at 02:14:48PM -0500, Michael Kubovy wrote: Dear R-helpers, How do I transform v <- c('insd-otsd', 'sppr-unsp') into c('insd--otsd', 'sppr--unsp') ? _ Professor Michael Kubovy University of Virginia Department of Psychology USPS: P.O. Box 400400, Charlottesville, VA 22904-4400 Parcels: Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903 Office: B011, +1-434-982-4729 Lab: B019, +1-434-982-4751 Fax: +1-434-982-4766 WWW: http://www.people.virginia.edu/~mk9y/ -- Csardi Gabor [EMAIL PROTECTED] UNIL DGM __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
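[Archive note] For readers finding this in the archives: sub() replaces only the first match in each element, while gsub() replaces all matches, and fixed=TRUE treats the pattern as a literal string rather than a regular expression. A minimal sketch:

```r
v <- c('insd-otsd', 'sppr-unsp')
sub("-", "--", v, fixed = TRUE)    # "insd--otsd" "sppr--unsp"
# Each element here has a single hyphen, so sub() suffices;
# with several hyphens per element, use gsub() instead:
gsub("-", "--", c('a-b-c'), fixed = TRUE)   # "a--b--c"
```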
[R] grep etc.
Dear R-helpers, How do I transform v <- c('insd-otsd', 'sppr-unsp') into c('insd--otsd', 'sppr--unsp') ? _ Professor Michael Kubovy University of Virginia Department of Psychology USPS: P.O. Box 400400, Charlottesville, VA 22904-4400 Parcels: Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903 Office: B011, +1-434-982-4729 Lab: B019, +1-434-982-4751 Fax: +1-434-982-4766 WWW: http://www.people.virginia.edu/~mk9y/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Vector Size
You just have too large a vector for your memory. There is not much you can do with an object of 500 MB. You have over 137 million numbers in it. What are you trying to do with this vector? --- Oscar A [EMAIL PROTECTED] wrote: Hello everybody!! I'm from Colombia (South America) and I'm new to R. I've been trying to generate all of the possible 6-number combinations drawn from the numbers 1 to 53. I've used the following commands: datos <- c(1:53) M <- matrix(data=combn(datos, 6, FUN=NULL, simplify=TRUE), nrow=22957480, ncol=6, byrow=TRUE) Once the commands are executed, the program shows the following: Error: cannot allocate vector of size 525.5 Mb How can I fix this problem? -- View this message in context: http://www.nabble.com/Vector-Size-tp15366901p15366901.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
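[Archive note] The 525.5 Mb figure is consistent with the size of the combn() result itself: choose(53, 6) combinations of 6 values, stored as a 4-byte-per-element integer matrix (datos is an integer vector). A quick sanity check of that arithmetic:

```r
choose(53, 6)              # 22957480 combinations
22957480 * 6 * 4 / 2^20    # bytes -> MB: about 525.5 MB for an integer matrix
```

Whether R can actually allocate an object of that size depends on the platform and available memory, which is why the original poster hit the allocation error.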
[R] [OT] good reference for mixed models and EM algorithm
Dear R People: Sorry for the off-topic post. Could someone recommend a good reference for using the EM algorithm on mixed models, please? I've been looking, and there are so many of them. Perhaps someone here can narrow things down a bit. Thanks in advance, Sincerely, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: [EMAIL PROTECTED] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] data frame question
Hello, I have 2 data frames, df1 and df2. I would like to create a new data frame, new_df, which contains only the rows common to both, matched on the first 2 columns (chrN and start). The score column in the new data frame should be replaced with a column containing the average score (average_score) from df1 and df2.

df1 = data.frame(chrN= c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"), start= c(23, 82, 95, 108, 95, 108, 121), end= c(33, 92, 105, 118, 105, 118, 131), score= c(3, 6, 2, 4, 9, 2, 7))
df2 = data.frame(chrN= c("chr1", "chr2", "chr2", "chr2", "chr2"), start= c(23, 50, 95, 20, 121), end= c(33, 60, 105, 30, 131), score= c(9, 3, 7, 7, 3))
new_df = data.frame(chrN= c("chr1", "chr2", "chr2"), start= c(23, 95, 121), end= c(33, 105, 131), average_score= c(6, 8, 5))

Thank you for your help Joseph [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [OT] good reference for mixed models and EM algorithm
Hi, Erin: Have you looked at Pinheiro and Bates (2000), Mixed-Effects Models in S and S-PLUS (Springer)? As far as I know, Doug Bates has been the leading innovator in this area for the past 20 years. Pinheiro was one of his graduate students. The 'nlme' package was developed by him or under his supervision, and 'lme4' is his current development platform. The ~R\library\nlme\scripts subdirectory contains ch01.R, ch02.R, etc. = script files to work the examples in the book (where ~R = your R installation directory). There are other good books, but I recommend you start with Pinheiro and Bates. Spencer Graves Erin Hodgess wrote: Dear R People: Sorry for the off-topic post. Could someone recommend a good reference for using the EM algorithm on mixed models, please? I've been looking, and there are so many of them. Perhaps someone here can narrow things down a bit. Thanks in advance, Sincerely, Erin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
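[Archive note] For archive readers, a minimal 'nlme' fit in the style of Pinheiro and Bates, using the Orthodont data shipped with the package (illustrative only; the book's scripts cover the details):

```r
library(nlme)

# Random intercept and slope for age, grouped by Subject
fm <- lme(distance ~ age, data = Orthodont, random = ~ age | Subject)
summary(fm)   # fixed effects, variance components, and fit criteria
```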
Re: [R] data frame question
On 10/02/2008, joseph [EMAIL PROTECTED] wrote: Hello, I have 2 data frames, df1 and df2. I would like to create a new data frame, new_df, which contains only the rows common to both, matched on the first 2 columns (chrN and start). The score column in the new data frame should be replaced with a column containing the average score (average_score) from df1 and df2. Try this (avoiding underscores):

new.df <- merge(df1, df2, by=c('chrN','start'))
new.df$average.score <- apply(new.df[,c('score.x','score.y')], 1, mean, na.rm=TRUE)

As always, interested to see whether it can be done in one line... -- Dr. Mark Wardle Specialist registrar, Neurology Cardiff, UK __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] data frame question
joseph [EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]: I have 2 data frames, df1 and df2. I would like to create a new data frame, new_df, which contains only the rows common to both, matched on the first 2 columns (chrN and start). The score column in the new data frame should be replaced with a column containing the average score (average_score) from df1 and df2. df1 = data.frame(chrN= c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"), start= c(23, 82, 95, 108, 95, 108, 121), end= c(33, 92, 105, 118, 105, 118, 131), score= c(3, 6, 2, 4, 9, 2, 7)) df2 = data.frame(chrN= c("chr1", "chr2", "chr2", "chr2", "chr2"), start= c(23, 50, 95, 20, 121), end= c(33, 60, 105, 30, 131), score= c(9, 3, 7, 7, 3)) Clunky to be sure, but this worked for me:

df3 <- merge(df1, df2, by=c("chrN","start"))  # non-matching variables get auto-relabeled
df3$avg.scr <- with(df3, (score.x + score.y)/2)  # or mean( )
df3 <- df3[, c("chrN","start","avg.scr")]  # drops the variables not of interest
df3
  chrN start avg.scr
1 chr1    23       6
2 chr2   121       5
3 chr2    95       8

-- David Winsemius __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
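[Archive note] Combining the two replies, a compact version of the same idea, assuming the df1/df2 definitions from the original post, uses rowMeans() instead of apply() or explicit arithmetic (note merge() may order the rows differently from the poster's desired new_df):

```r
df1 <- data.frame(chrN = c("chr1","chr1","chr1","chr1","chr2","chr2","chr2"),
                  start = c(23, 82, 95, 108, 95, 108, 121),
                  end = c(33, 92, 105, 118, 105, 118, 131),
                  score = c(3, 6, 2, 4, 9, 2, 7))
df2 <- data.frame(chrN = c("chr1","chr2","chr2","chr2","chr2"),
                  start = c(23, 50, 95, 20, 121),
                  end = c(33, 60, 105, 30, 131),
                  score = c(9, 3, 7, 7, 3))

m <- merge(df1, df2, by = c("chrN", "start"))      # keeps common rows only
m$average_score <- rowMeans(m[, c("score.x", "score.y")])
new_df <- m[, c("chrN", "start", "end.x", "average_score")]
```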
Re: [R] prcomp vs. princomp vs fast.prcomp
On 2/10/08, Erin Hodgess [EMAIL PROTECTED] wrote: When performing PCA, should I use prcomp, princomp or fast.prcomp, please? You can take a look here [1] and here [2] for some short references. From the first page: Principal Components Analysis (PCA) is available in prcomp() (preferred) and princomp() in standard package stats. There are also - at least - FactoMineR, psych and ade4 that provide PCA functions. I imagine that it would depend much on what you want to do. Liviu [1] http://cran.miscellaneousmirror.org/src/contrib/Views/Environmetrics.html [2] http://cran.r-project.org/src/contrib/Views/Psychometrics.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
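[Archive note] For the archives, a minimal prcomp() call on a built-in dataset; scale. = TRUE is usually advisable when the variables are on different scales:

```r
p <- prcomp(USArrests, scale. = TRUE)   # PCA on the correlation scale
summary(p)   # standard deviations and proportion of variance per component
head(p$x)    # the principal component scores for the first few states
```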
Re: [R] Applying lm to data with combn
I think that what you want to do is stepwise regression; see the step function. On 09/02/2008, AliR [EMAIL PROTECTED] wrote: Thank you, can you suggest what is the shortest way to store the combination with minimum residual error term? AliR wrote: http://www.nabble.com/file/p15359204/test.data.csv test.data.csv Hi, I have used apply to get certain combinations, but when I try to use these combinations I get the error [Error in eval(expr, envir, enclos) : object 'X.GDAXI' not found]. Being a novice, I do not understand why, after applying combinations to the data, I can't access it and use lm on these combinations. Either the data frame is no longer a matrix, or... how can I access the data and make it work for lm!! Any help please!

fruit = read.csv(file="test.data.csv", head=TRUE, sep=",")  # read it in matrix format
#fruit = read.file(row.names=1)$data
mD = head(fruit[, 1:5])  # only first five used in combinations
#X.SSMII = head(fruit[, 6])  # keep it for reference
nmax = NULL
n = ncol(mD)  # don't take the last column, for reference purposes
if(is.null(nmax)) nmax = n
mDD = apply(combn(5, 1), 1, FUN= function(y) mD[, y])
fg = lm(X.SSMII ~ X.GDAXI + X.FTSE + X.FCHI + X.IBEX, data = mDD)  # regress on combos
s = cbind(s, Residuals = residuals(fg))  # take residuals
print(mD)

-- View this message in context: http://www.nabble.com/Applying-lm-to-data-with-combn-tp15359204p15391159.html Sent from the R help mailing list archive at Nabble.com. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] reshape
Dear colleagues, I'd like to reshape a data frame from long format to wide format, but I do not quite get what I want. Here is an example of the data I have (dat):

sp <- c("a", "a", "a", "a", "b", "b", "b", "c", "d", "d", "d", "d")
tr <- c("A", "B", "B", "C", "A", "B", "C", "A", "A", "B", "C", "C")
code <- c("a1", "a2", "a2", "a3", "a3", "a3", "a4", "a4", "a4", "a5", "a5", "a6")
dat <- data.frame(id=1:12, sp=sp, tr=tr, val=31:42, code=code)

and below is what I'd like to obtain. That is, I'd like the tr variable in different columns (as a timevar) with their value (val).

sp code tr.A tr.B tr.C
 a   a1   31   NA   NA
 a   a2   NA   32   NA
 a   a2   NA   33   NA  **
 a   a3   NA   NA   34
 b   a3   35   36   NA
 b   a4   NA   NA   37
 c   a4   38   NA   NA
 d   a4   39   NA   NA
 d   a5   NA   40   41
 d   a6   NA   NA   42

Using reshape: reshape(dat[,2:5], direction="wide", timevar="tr", idvar=c("code","sp")) I'm getting very close. The only difference is in the 3rd row (**): when sp and code are the same I only get one record. Is there a way to get all records? Any idea? Thank you very much for any help Juli Pausas -- http://www.ceam.es/pausas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape
reshape(dat, direction="wide", timevar="tr", idvar=c("id", "code", "sp"))[,2:6] But I don't understand why you use reshape. On 10/02/2008, juli pausas [EMAIL PROTECTED] wrote: Dear colleagues, I'd like to reshape a data frame from long format to wide format, but I do not quite get what I want. Here is an example of the data I have (dat):

sp <- c("a", "a", "a", "a", "b", "b", "b", "c", "d", "d", "d", "d")
tr <- c("A", "B", "B", "C", "A", "B", "C", "A", "A", "B", "C", "C")
code <- c("a1", "a2", "a2", "a3", "a3", "a3", "a4", "a4", "a4", "a5", "a5", "a6")
dat <- data.frame(id=1:12, sp=sp, tr=tr, val=31:42, code=code)

and below is what I'd like to obtain. That is, I'd like the tr variable in different columns (as a timevar) with their value (val).

sp code tr.A tr.B tr.C
 a   a1   31   NA   NA
 a   a2   NA   32   NA
 a   a2   NA   33   NA  **
 a   a3   NA   NA   34
 b   a3   35   36   NA
 b   a4   NA   NA   37
 c   a4   38   NA   NA
 d   a4   39   NA   NA
 d   a5   NA   40   41
 d   a6   NA   NA   42

Using reshape: reshape(dat[,2:5], direction="wide", timevar="tr", idvar=c("code","sp")) I'm getting very close. The only difference is in the 3rd row (**): when sp and code are the same I only get one record. Is there a way to get all records? Any idea? Thank you very much for any help Juli Pausas -- http://www.ceam.es/pausas -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape
This isn't really well defined. Suppose we have two rows that both have a, a2 and a value for B. Now suppose we have another row with a, a2 but with a value for C. Does the third row go with the first one? The second one? A new row? Both the first and the second? Here is one possibility, but without a good definition of the problem we don't know whether it's answering the question that is intended. In the code below we assume that all dat rows that have the same sp value and the same code value are adjacent, and that if a tr occurs among those dat rows that is equal to or less than the prior row's tr in factor level order, then the new dat row must start a new output row, else not. Thus within an sp/code group we assign each row a 1 until we get a tr that is less than or equal to the prior row's tr, and then we start assigning 2, and so on. This is the new column seq below. We then use seq as part of our idvar in reshape. For the particular example in your post this does give the same answer.

f <- function(x) cumsum(c(1, diff(x) <= 0))
dat$seq <- ave(as.numeric(dat$tr), dat$sp, dat$code, FUN = f)
reshape(dat[-1], direction="wide", timevar="tr", idvar=c("code","sp","seq"))[-3]

On Feb 10, 2008 4:58 PM, juli pausas [EMAIL PROTECTED] wrote: Dear colleagues, I'd like to reshape a data frame from long format to wide format, but I do not quite get what I want. Here is an example of the data I have (dat):

sp <- c("a", "a", "a", "a", "b", "b", "b", "c", "d", "d", "d", "d")
tr <- c("A", "B", "B", "C", "A", "B", "C", "A", "A", "B", "C", "C")
code <- c("a1", "a2", "a2", "a3", "a3", "a3", "a4", "a4", "a4", "a5", "a5", "a6")
dat <- data.frame(id=1:12, sp=sp, tr=tr, val=31:42, code=code)

and below is what I'd like to obtain. That is, I'd like the tr variable in different columns (as a timevar) with their value (val).

sp code tr.A tr.B tr.C
 a   a1   31   NA   NA
 a   a2   NA   32   NA
 a   a2   NA   33   NA  **
 a   a3   NA   NA   34
 b   a3   35   36   NA
 b   a4   NA   NA   37
 c   a4   38   NA   NA
 d   a4   39   NA   NA
 d   a5   NA   40   41
 d   a6   NA   NA   42

Using reshape: reshape(dat[,2:5], direction="wide", timevar="tr", idvar=c("code","sp")) I'm getting very close.
The only difference is in the 3rd row (**): when sp and code are the same I only get one record. Is there a way to get all records? Any idea? Thank you very much for any help Juli Pausas -- http://www.ceam.es/pausas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Do I need to use dropterm()??
dropterm() is a tool for model building, not primarily for significance testing. As the name suggests, it tells you what the effect would be were you to drop each *accessible* term in the model as it currently stands. By default it displays the effect on AIC of dropping each term, in turn, from the model. If you request them, though, it can also give you test statistics and significance probabilities. If there is an A:B interaction in the model, the main effects A or B, if present, are not considered until a decision has been made on including A:B. The meaning of A:B in a model is not absolute: it is conditional on which main-effect terms you have there as well. This is one reason why the process is ordered in this way, but the main reason is the so-called 'marginality' issue. If you do ask for test statistics and significance probabilities, you get almost a SAS-style Type III anova table, with the important restriction noted above: you will not get main-effect terms shown along with interactions. If you want the full SAS, uh, version, there are at least two possibilities. 1. Use SAS. 2. Use John Fox's Anova() function from the 'car' package, along with his excellent book, which should explain how to avoid shooting yourself in the foot over this. (This difference of opinion on what should sensibly be done, by the way, predates R by a long shot. My first exposure to it was the very acrimonious dispute between Nelder and Kempthorne in the mid 70's. It has remained a cross-Atlantic dispute pretty well ever since, with the latest shot being the paper by Lee and Nelder in 2004. Curiously, the origin of a piece of software can almost be determined by the view taken on this issue, with Genstat going one way and SAS, SPSS, ... the other. S-PLUS was a latecomer... but I digress!) Bill Venables.
Bill Venables CSIRO Laboratories PO Box 120, Cleveland, 4163 AUSTRALIA Office Phone (email preferred): +61 7 3826 7251 Fax (if absolutely necessary): +61 7 3826 7304 Mobile: +61 4 8819 4402 Home Phone: +61 7 3286 7700 mailto:[EMAIL PROTECTED] http://www.cmis.csiro.au/bill.venables/ -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of DaniWells Sent: Sunday, 10 February 2008 11:40 PM To: r-help@r-project.org Subject: [R] Do I need to use dropterm()?? Hello, I'm having some difficulty understanding the usage of the dropterm() function in the MASS library. What exactly does it do? I'm very new to R, so any pointers would be very helpful. I've read many definitions of what dropterm() does, but none seem to stick in my mind or click with me. I've coded everything fine for an interaction that runs as follows: two sets of data (one for North aspect, one for South aspect) with a log scale on the x axis and survival on the y. After calculating my anova results I have all significant results (i.e. aspect = sig, log scale of sunlight = sig, and aspect:light = sig). When I have all significant results in my ANOVA table, do I need dropterm(), or is that just to remove insignificant terms? Many thanks, Dani -- View this message in context: http://www.nabble.com/Do-I-need-to-use-dropterm%28%29---tp15396151p15396151.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
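[Archive note] To make the marginality point concrete, an illustrative dropterm() call on a built-in dataset (not the poster's data): with the interaction present, only the interaction term is "accessible".

```r
library(MASS)

# Model with main effects and an interaction
fit <- lm(mpg ~ wt * factor(cyl), data = mtcars)

# Only the wt:factor(cyl) interaction is listed for dropping;
# the main effects wt and factor(cyl) are not considered while
# the interaction remains in the model (the marginality restriction).
dropterm(fit, test = "F")
```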
[R] j and jcross queries
Hi: I have a query related to the J and Jcross functions in the SpatStat package. I use J to find indications of clustering in my data, and Jcross to look for dependence between point patterns. I use the envelope function to do Monte Carlo tests of significance. So far so good. My question is how I can test whether two such results are significantly different. For example, if I find J of pattern X and J of pattern Y, how could I determine the likelihood that those results come from different processes? Similarly, if I find J of marks X and Y, and X and Z, how could I determine the likelihood that Y and Z come from different processes? I would appreciate advice. Cheers Robert Biddle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
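[Archive note] For context, the kind of spatstat workflow being described looks roughly like this (a sketch with simulated data; it illustrates the envelope test, not an answer to the comparison question):

```r
library(spatstat)

set.seed(42)
X <- rpoispp(100)                   # simulated Poisson point pattern (unit square)
E <- envelope(X, Jest, nsim = 39)   # pointwise Monte Carlo envelope for the J-function
plot(E)  # an empirical J outside the envelope suggests clustering or inhibition
```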
[R] Questions about histograms
Hello, I'm doing some experiments with the various histogram functions and I have two questions, about the prob option and about binning. First, here's a simple plot of my data using the default hist() function: hist(data[,1], prob = TRUE, xlim = c(0, 35)) http://go.sneakymustard.com/tmp/hist.jpg My first question is regarding the resulting plots from hist.scott() and hist.FD() from the MASS package. I'm setting prob to TRUE in these functions, but as can be seen in the images below, the value for the first bar of the histogram is well above 1.0. Shouldn't the total area be 1.0 in the case of prob = TRUE? hist.scott(data[,1], prob = TRUE, xlim=c(0, 35)) http://go.sneakymustard.com/tmp/scott.jpg hist.FD(data[,1], prob = TRUE, xlim=c(0, 35)) http://go.sneakymustard.com/tmp/FD.jpg Is there anything I can do to fix these plots? My second question is related to binning. Is there a function or package that allows one to use logarithmic binning in R, that is, create bins such that the length of a bin is a multiple of the length of the one before it? Pointers to the appropriate docs are welcome; I've been searching for this and couldn't find any info. Best regards, Andre __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Questions about histograms
On 10/02/2008 8:14 PM, Andre Nathan wrote: Hello I'm doing some experiments with the various histogram functions and I have a two questions about the prob option and binning. First, here's a simple plot of my data using the default hist() function: hist(data[,1], prob = TRUE, xlim = c(0, 35)) http://go.sneakymustard.com/tmp/hist.jpg My first question is regarding the resulting plot from hist.scott() and hist.FD(), from the MASS package. I'm setting prob to TRUE in these functions, but as it can be seen in the images below, the value for the first bar of the histogram is well above 1.0. Shouldn't the total area be 1.0 in the case of prob = TRUE? hist.scott(data[,1], prob = TRUE, xlim=c(0, 35)) It looks to me as though the area is one. The first bar is about 3.6 units high, and about 0.2 units wide: area is 0.72. There are no gaps between bars in an R histogram, so the gaps you see in this jpg are bars with zero height. Duncan Murdoch http://go.sneakymustard.com/tmp/scott.jpg hist.FD(data[,1], prob = TRUE, xlim=c(0, 35)) http://go.sneakymustard.com/tmp/FD.jpg Is there anything I can do to fix these plots? My second question is related to binning. Is there a function or package that allows one to use logarithmic binning in R, that is, create bins such that the length of a bin is a multiple of the length of the one before it? Pointers to the appropriate docs are welcome, I've been searching for this and couldn't find any info. Best regards, Andre __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Questions about histograms
Andre, Regarding your first question, it is by no means clear there is anything to fix; in fact I'm sure there is nothing to fix. The fact that the height of any bar is greater than one is irrelevant - the width of the bar is much less than one, as is the product of height by width. Area is height x width, not just height. Regarding the second question - logarithmic breaks - I'm not aware of anything currently available to do this, but the tools are there for you to do it yourself. The 'breaks' argument to hist allows you to specify your breaks explicitly (among other things), so it's just a matter of setting up the logarithmic (or, more precisely, 'geometric progression') bins yourself and passing them on to hist. Bill Venables CSIRO Laboratories PO Box 120, Cleveland, 4163 AUSTRALIA Office Phone (email preferred): +61 7 3826 7251 Fax (if absolutely necessary): +61 7 3826 7304 Mobile: +61 4 8819 4402 Home Phone: +61 7 3286 7700 mailto:[EMAIL PROTECTED] http://www.cmis.csiro.au/bill.venables/ -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andre Nathan Sent: Monday, 11 February 2008 11:14 AM To: r-help@r-project.org Subject: [R] Questions about histograms Hello I'm doing some experiments with the various histogram functions and I have a two questions about the prob option and binning. First, here's a simple plot of my data using the default hist() function: hist(data[,1], prob = TRUE, xlim = c(0, 35)) http://go.sneakymustard.com/tmp/hist.jpg My first question is regarding the resulting plot from hist.scott() and hist.FD(), from the MASS package. I'm setting prob to TRUE in these functions, but as it can be seen in the images below, the value for the first bar of the histogram is well above 1.0. Shouldn't the total area be 1.0 in the case of prob = TRUE?
hist.scott(data[,1], prob = TRUE, xlim=c(0, 35)) http://go.sneakymustard.com/tmp/scott.jpg hist.FD(data[,1], prob = TRUE, xlim=c(0, 35)) http://go.sneakymustard.com/tmp/FD.jpg Is there anything I can do to fix these plots? My second question is related to binning. Is there a function or package that allows one to use logarithmic binning in R, that is, create bins such that the length of a bin is a multiple of the length of the one before it? Pointers to the appropriate docs are welcome, I've been searching for this and couldn't find any info. Best regards, Andre __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
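[Archive note] Following Bill's suggestion, a sketch of geometric-progression ("logarithmic") breaks built by hand and passed to hist(); this assumes strictly positive data:

```r
set.seed(1)
x <- rexp(1000)                              # positive-valued sample
r <- 1.5                                     # each bin r times as wide as the previous
k <- ceiling(log(max(x) / min(x)) / log(r))  # number of bins needed to cover the data
breaks <- min(x) * r^(0:k)                   # geometric progression of break points
h <- hist(x, breaks = breaks, prob = TRUE)
sum(h$density * diff(h$breaks))              # total area of the density bars is 1
```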
[R] Using R in a university course: dealing with proposal comments
Hi All, I am scheduled to teach a graduate course on research methods in health sciences at a university. While drafting the course proposal, I decided to include a brief introduction to R, primarily with the objective of enabling the students to do data analysis using R. It is expected that enrolled students of this course have all had at least a formal first-level introduction to quantitative methods in health sciences, and following completion of the course they are all expected to either evaluate, interpret, or conduct primary research studies in health. The course would be delivered over 5 months, and R was proposed to be taught as several laboratory-based hands-on sessions along with required readings within the coursework. The course proposal went to a few colleagues in the university for review. I received review feedback from them; two of them commented on the inclusion of R in the proposal. In quoting parts of these mails, I have masked the names/identities of the referees and have included just the relevant text with their comments. Here are the comments: Comment 1: In my quick glance, I did not see that statistics would be taught, but I did see that R would be taught. Of course, R is a statistics programme. I worry that teaching R could overwhelm the class. Or teaching R would be worthless, because the students do not understand statistics. (Prof LR) Comment 2: Finally, on a minor point, why is R the statistical software being used? SPSS is probably more widely available in the workplace – certainly in areas of social policy etc. (Prof NB) I am interested to know if any of you have faced similar questions from colleagues about the inclusion of R in non-statistics-based university graduate courses. If you did and were required to address these concerns, how would you respond?
TIA, Arin Basu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using R in a university course: dealing with proposal comments
Comment 1 raises a real issue. R is just a tool. Too often people do confuse the tool with the real skill that the people who use it should have. There are plenty of questions on R-help that demonstrate this confusion. It's well worth keeping in mind and acting upon if you can see a problem emerging, but I would not take it quite at face value and abandon R on those grounds. Comment 2 is one of those comments that belongs to a very particular period of time, one that passes as we look on. It reminds me of the time I tried to introduce some new software into my courses, (back in the days when I was a teacher, long, long ago...). The students took to it like ducks to water, but my colleagues on the staff were very slow to adapt, and some never did. Also, R wins every time on price! Bill Venables CSIRO Laboratories PO Box 120, Cleveland, 4163 AUSTRALIA Office Phone (email preferred): +61 7 3826 7251 Fax (if absolutely necessary): +61 7 3826 7304 Mobile: +61 4 8819 4402 Home Phone: +61 7 3286 7700 mailto:[EMAIL PROTECTED] http://www.cmis.csiro.au/bill.venables/ -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Arin Basu Sent: Monday, 11 February 2008 1:41 PM To: r-help@r-project.org Subject: [R] Using R in a university course: dealing with proposal comments Hi All, I am scheduled to teach a graduate course on research methods in health sciences at a university. While drafting the course proposal, I decided to include a brief introduction to R, primarily with an objective to enable the students to do data analysis using R. It is expected that enrolled students of this course have all at least a formal first level introduction to quantitative methods in health sciences and following completion of the course, they are all expected to either evaluate, interpret, or conduct primary research studies in health. 
The course would be delivered over 5 months, and R was proposed to be taught as several laboratory based hands-on sessions along with required readings within the coursework. The course proposal went to a few colleagues in the university for review. I received review feedbacks from them; two of them commented about inclusion of R in the proposal. In quoting parts these mails, I have masked the names/identities of the referees, and have included just part of the relevant text with their comments. Here are the comments: Comment 1: In my quick glance, I did not see that statistics would be taught, but I did see that R would be taught. Of course, R is a statistics programme. I worry that teaching R could overwhelm the class. Or teaching R would be worthless, because the students do not understand statistics. (Prof LR) Comment 2: Finally, on a minor point, why is R the statistical software being used? SPSS is probably more widely available in the workplace - certainly in areas of social policy etc. (Prof NB) I am interested to know if any of you have faced similar questions from colleagues about inclusion of R in non-statistics based university graduate courses. If you did and were required to address these concerns, how you would respond? TIA, Arin Basu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using R in a university course: dealing with proposal comments
Hello Arin, If your future students do not know statistics, you might consider buffering their introduction to R with the help of a GUI package, such as Rcmdr (if functionality is missing, you could add it yourself via the plugin infrastructure). Another way to help students would be to direct them to easy-to-use, straightforward resources, like this [1], this [2] or this [3]. On the "why not SPSS?" point, I would imagine the answer is quality and price, and all the corollary arguments (say, that students can use it at home or over the weekend, etc). No more than my two cents. Liviu [1] http://oit.utk.edu/scc/RforSASSPSSusers.pdf [2] http://www.statmethods.net/index.html [3] http://zoonek2.free.fr/UNIX/48_R/all.html On 2/11/08, Arin Basu [EMAIL PROTECTED] wrote: Comment 1: In my quick glance, I did not see that statistics would be taught, but I did see that R would be taught. Of course, R is a statistics programme. I worry that teaching R could overwhelm the class. Or teaching R would be worthless, because the students do not understand statistics. (Prof LR) Comment 2: Finally, on a minor point, why is R the statistical software being used? SPSS is probably more widely available in the workplace – certainly in areas of social policy etc. (Prof NB)
[R] Help with write.csv
Dear all, I am new to R. I am using the impute package with data contained in a csv file. I have followed the example in the impute package as follows: mydata <- read.csv("sample_impute.csv", header = TRUE) mydata.expr <- mydata[-1, -(1:2)] mydata.imputed <- impute.knn(as.matrix(mydata.expr)) The imputation is successful. Then I try to write the imputation results (mydata.imputed) to a csv file as follows: write.csv(mydata.imputed, file = "sample_imputed.csv") Error in data.frame(data = c(-0.07, -1.22, -0.09, -0.6, 0.65, -0.36, 0.25, : arguments imply differing number of rows: 18, 1, 0 I need help understanding the error message and overcoming the write.csv problem. TQVM!
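A likely explanation for the error above: write.csv() expects a matrix or data frame, but impute.knn() (from the Bioconductor impute package) returns a list whose imputed matrix is stored in the data component; the extra list components are what trigger the "differing number of rows" message. The sketch below uses a stand-in list in place of a real impute.knn() result, so it runs without the impute package installed:

```r
## Stand-in for an impute.knn() result: a list whose $data component holds
## the imputed matrix (the real result also carries RNG state components).
mydata.imputed <- list(data = matrix(1:6, nrow = 2), rng.seed = 362436069)

## write.csv(mydata.imputed, ...) would fail, because the list components
## have differing lengths. Write the imputed matrix in $data instead:
write.csv(mydata.imputed$data, file = "sample_imputed.csv", row.names = FALSE)
```

If impute.knn() follows its documented return value, writing mydata.imputed$data rather than mydata.imputed should resolve the error in the original code.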
Re: [R] Using R in a university course: dealing with proposal comments
R is just a tool, but so is English. R is the platform of choice for an increasing portion of people involved in new statistical algorithm development. R is not yet the de facto standard for nearly all serious research internationally, to the extent that English is. However, I believe that is only a matter of time. There will always be a place for software with a nicer graphical user interface, etc., than R. For an undergraduate course, it may be wise to stick with SPSS, SAS, Minitab, etc. Are you teaching graduate students to solve yesterday's problems or tomorrow's? Much of my work in 2007 was in Matlab, because I am working with colleagues who use only Matlab. Matlab has better debugging tools. However, R now has well over 1,000 contributed packages, and r-help and r-sig-x provide better support and extensibility than you will likely get from commercial software. Twice in the past year, an executive said I should get some Matlab toolbox. In the first case, after thinking about it for a few days, I finally requested and received official permission from a Vice President. From that point, it took roughly a week to get a quote from MathWorks, then close to two weeks to get approval from our Chief Financial Officer, then a few more days to actually get the software. With R, that month-long process is reduced to seconds: I download the package and try it. This has allowed me to do things today that I only dreamed of doing a few years ago. Moreover, R makes it much easier for me to learn new statistical techniques. When I'm not sure I understand the math, I can trace through a worked example in R, and the uncertainties almost always disappear. For that, 'debug(fun)' helps a lot. If I want to try something different, I don't have to start from scratch to develop code to perform an existing analysis.
I now look for companion R code before I decide to buy a book, or when I prioritize how much time I will spend with different books or articles: if something has companion R code, I know I can learn much more quickly how to use, modify, and extend the statistical tools discussed. Spencer Graves
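The 'debug(fun)' workflow mentioned above can be sketched as follows; f is a made-up example function, not one from the thread:

```r
## Tracing a worked example line by line with base R's debugger.
f <- function(x) {
  s <- sum(x)       # step 1: accumulate
  s / length(x)     # step 2: divide by n (i.e., the mean)
}

debug(f)            # the next call to f() drops into the browser
# f(1:10)           # interactive: 'n' steps through, variables can be inspected, 'Q' quits
undebug(f)          # switch tracing off again

f(1:10)             # runs normally and returns 5.5
```

Stepping through each line while inspecting intermediate values is what makes a worked example so effective for checking one's understanding of the math.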
[R] tree() producing NA's
Hi, hoping someone can help me (a newbie). I am trying to construct a tree using tree() in package tree. One of the fields is a factor field (owner), with many levels. In the resulting tree, I see many NA's (see below), yet in the actual data there are none. rr200.tr <- tree(backprof ~ ., rr200) rr200.tr 1) root 200 1826.00 -0.2332 ... [snip] ... 5) owner: Cliveden Stud,NA,NA,NA,NA,NA,NA,NA,NA 10 14.25 1.5870 * 3) owner: B E T Partnership,Flaming Sambuca Syndicate,NA,NA,NA,NA,NA,NA,NA,NA 11 384.40 10.5900 6) decodds < 12 5 74.80 6.3000 * 7) decodds > 12 6 140.80 14.1700 * Can anyone tell me why this happens and what I can do about it? Regards Amnon
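One common cause of NA entries in printed factor splits (an assumption here; the thread itself does not confirm the diagnosis) is a factor carrying levels that never occur in the data, e.g. after subsetting to 200 rows. A minimal base-R sketch of checking for and dropping empty levels, using a made-up 'owner' factor:

```r
## A factor can carry levels that no longer occur in the data; such empty
## levels can surface as NA in model output. "Unused Stable" is invented
## for illustration.
owner <- factor(c("Cliveden Stud", "B E T Partnership"),
                levels = c("Cliveden Stud", "B E T Partnership", "Unused Stable"))
nlevels(owner)        # 3, including the empty level

owner <- factor(owner)  # re-creating the factor keeps only levels present
nlevels(owner)          # 2
```

If this is the cause, re-creating the factor column (rr200$owner <- factor(rr200$owner)) before calling tree() should make the NA's disappear.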