Re: [R] Finding data association in R

2009-05-19 Thread phen_ys



Johannes Hüsing wrote:
 
 
 On 19.05.2009 at 05:39, phen_ys wrote:
 

 surgery <- data.frame(outcome = c(0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0,
   0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0), age = c(50, 50, 51,
   51, 53, 54, 54, 54, 55, 55, 56, 56, 56, 57, 57, 57, 57, 58,
   59, 60, 61, 61, 61, 62, 62, 62, 62, 63, 63, 63, 64, 64, 65,
   67, 67, 68, 68, 69, 70, 71))

 How can I use R to find the association between death rate and age
 in the above data?

 
 with(surgery, boxplot(age ~ outcome))
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

I tried to fit this with the lm function, but the result doesn't make much
sense. The question I'm trying to answer is whether the death rate is
associated with age, e.g. whether the death rate is higher at older ages.

-- 
View this message in context: 
http://www.nabble.com/Finding-data-association-in-R-tp23609249p23610952.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Finding data association in R

2009-05-19 Thread Bill.Venables
Your problem is statistical and has nothing particularly to do with R.

It looks like homework to me.  

You may care to look at it this way:

###

> fm <- glm(outcome ~ age, binomial, surgery)
> summary(fm)

Call:
glm(formula = outcome ~ age, family = binomial, data = surgery)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.6601  -0.8099  -0.5839   1.0491   1.7079  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -10.48174    4.30409  -2.435   0.0149
age           0.16295    0.07018   2.322   0.0202

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 51.796  on 39  degrees of freedom
Residual deviance: 45.301  on 38  degrees of freedom
AIC: 49.301

Number of Fisher Scoring iterations: 3

###
 
There are still a few questions left, of course.
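
One way to read the fit above is sketched here. It assumes that `outcome == 1` codes a death, which the thread never states, and the ages 50, 60 and 70 are picked only for illustration. The sketch refits the same model, converts the age coefficient to an odds ratio per year, and compares null and residual deviance as a likelihood-ratio test.

```r
surgery <- data.frame(
  outcome = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0,
              0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0),
  age = c(50, 50, 51, 51, 53, 54, 54, 54, 55, 55, 56, 56, 56, 57, 57, 57,
          57, 58, 59, 60, 61, 61, 61, 62, 62, 62, 62, 63, 63, 63, 64, 64,
          65, 67, 67, 68, 68, 69, 70, 71))

# Same logistic regression as in the reply above
fm <- glm(outcome ~ age, family = binomial, data = surgery)

# Multiplicative change in the odds of outcome == 1 per extra year of age
or_age <- exp(coef(fm)[["age"]])

# Fitted probabilities at a few illustrative ages (assumed values)
new_ages <- data.frame(age = c(50, 60, 70))
p_hat <- predict(fm, newdata = new_ages, type = "response")

# Likelihood-ratio test of the age effect: null minus residual deviance
lr_p <- pchisq(fm$null.deviance - fm$deviance, df = 1, lower.tail = FALSE)
```

With only 40 patients the interval around the odds ratio is wide, so `confint.default(fm)` is worth a look before drawing conclusions.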



Bill Venables
http://www.cmis.csiro.au/bill.venables/ 


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of phen_ys
Sent: Tuesday, 19 May 2009 5:17 PM
To: r-help@r-project.org
Subject: Re: [R] Finding data association in R






Re: [R] Predicting complicated GAMMs on response scale

2009-05-19 Thread Gavin Simpson
On Mon, 2009-05-18 at 11:48 -0700, William Paterson wrote:
 Hi,
 
 I am using GAMMs to show a relationship of temperature differential over
 time with a model that looks like this:-
 
 gamm(Diff~s(DaysPT)+AirToC,method="REML") 
 
 where DaysPT is time in days since injury and Diff is repeat measures of
 temperature differentials with regards to injury sites compared to
 non-injured sites in individuals over the course of 0-24 days. I use the
 following code to plot this model on the response scale with 95% CIs which
 works fine:-
 
 g.m<-gamm(Diff~s(DaysPT)+AirToC,method="REML")
 p.d<-data.frame(DaysPT=seq(min(DaysPT),max(DaysPT)))
 p.d$AirToC<-(6.7)
 b<-predict.gam(g.m$gam,p.d,se=TRUE)
 range<-c(min(b$fit-2*b$se.fit),max(b$fit+2*b$se.fit))
 plot(p.d$DaysPT,b$fit,ylim=c(-4,12),xlab="Days post-tagging",ylab="dTmax
 (ºC)",type="l",lab=c(24,4,12),las=1,cex.lab=1.5, cex.axis=1,lwd=2)
 lines(p.d$DaysPT,b$fit+b$se.fit*1.96,lty=2,lwd=1.5)
 lines(p.d$DaysPT,b$fit-b$se.fit*1.96,lty=2,lwd=1.5)
 points(DaysPT,Diff)
 
 
 However, when I add a correlation structure and/or a variance structure so
 that the model may look like:- 
 
 
 gamm(Diff~s(DaysPT3)+AirToC,correlation=corCAR1(form=~DaysPT|
 Animal),weights=varPower(form=~DaysPT),method="REML")
 
 
 I get this message at the point of inputting the line
 b<-predict.gam(g.m$gam,p.d,se=TRUE)

Note that p.d doesn't contain Animal. Not sure this is the problem, but
I would have thought you'd need to supply new values of Animal for the
data you wish to predict for in order to get the CAR(1) errors correct.
Is it possible that the model is finding another Animal variable in the
global environment?

I have predicted from several thousand GAMMs containing correlation
structures similar to the way you do above so this does work in general.
If the above change to p.d doesn't work, you'll probably need to speak
to Simon Wood to take this further.

Is mgcv up-to-date? I am using 1.5-5 that was released in the last week
or so.

For example, this dummy example runs without error for me and is similar
to your model

y1 <- arima.sim(list(order = c(1,0,0), ar = 0.5), n = 200, sd = 1)
y2 <- arima.sim(list(order = c(1,0,0), ar = 0.8), n = 200, sd = 3)
x1 <- rnorm(200)
x2 <- rnorm(200)
ind <- rep(1:2, each = 200)
d <- data.frame(Y = c(y1,y2), X = c(x1,x2), ind = ind, time = rep(1:200,
times = 2))
require(mgcv)
mod <- gamm(Y ~ s(X), data = d, corr = corCAR1(form = ~ time | ind),
weights = varPower(form = ~ time))
p.d <- data.frame(X = rep(seq(min(d$X), max(d$X), len = 20), 2),
  ind = rep(1:2, each = 20),
  time = rep(1:20, times = 2))
pred <- predict(mod$gam, newdata = p.d, se = TRUE)

Does this work for you? If not, the above would be a reproducible
example (as asked for in the posting guide) and might help Simon track
down the problem if you are running an up-to-date mgcv.

HTH

G

 
 
 Error in model.frame(formula, rownames, variables, varnames, extras,
 extranames,  : 
 variable lengths differ (found for 'DaysPT')
 In addition: Warning messages:
 1: not all required variables have been supplied in  newdata!
  in: predict.gam(g.m$gam, p.d, se = TRUE) 
 2: 'newdata' had 25 rows but variable(s) found have 248 rows 
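
The first warning says that some model variable is absent from newdata. A quick base-R check can list which ones. This is only a sketch: `missing_vars` is a hypothetical helper, not part of mgcv, and whether `DaysPT3` in the posted formula is a typo or a real variable is not known from the thread.

```r
# Hypothetical helper: names used in a formula that newdata does not supply.
# Takes the last component so it works for y ~ rhs and for one-sided ~ rhs.
missing_vars <- function(formula, newdata) {
  rhs_vars <- all.vars(formula[[length(formula)]])
  setdiff(rhs_vars, names(newdata))
}

# newdata built as in the post (the 0:24 range is taken from "0-24 days")
p.d <- data.frame(DaysPT = 0:24)
p.d$AirToC <- 6.7

# Smooth/fixed part of the model as posted
miss_model <- missing_vars(Diff ~ s(DaysPT3) + AirToC, p.d)

# Grouping structure of the correlation term as posted
miss_corr <- missing_vars(~ DaysPT | Animal, p.d)
```

Any name returned here is looked up outside newdata at prediction time, which would explain a 25-row newdata meeting a 248-row variable.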
 
 
 Is it possible to predict a more complicated model like this on the response
 scale? How can I incorporate a correlation structure and variance structure
 in a dataframe when using the predict function for GAMMs?
 
 Any help would be greatly appreciated.
 
 William Paterson
 
 
 
 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] Overdispersion using repeated measures lmer

2009-05-19 Thread ONKELINX, Thierry
Dear Christine,

The poisson family does not allow for overdispersion (nor
underdispersion). Try using the quasipoisson family instead.
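
A rough way to gauge overdispersion is the Pearson chi-square divided by the residual degrees of freedom. The sketch below uses a plain `glm` on simulated counts, not the mixed model from the question, and the data-generating values are assumptions chosen only to produce overdispersion. It shows that the ratio matches the dispersion a quasipoisson fit reports.

```r
set.seed(1)

# Simulated overdispersed counts (negative binomial), purely illustrative
n <- 200
x <- runif(n)
y <- rnbinom(n, mu = exp(1 + x), size = 1)

# Poisson fit: the dispersion is fixed at 1 by assumption
fit_p <- glm(y ~ x, family = poisson)

# Pearson chi-square over residual df; values well above 1 suggest
# overdispersion relative to the Poisson assumption
phi <- sum(residuals(fit_p, type = "pearson")^2) / df.residual(fit_p)

# Quasipoisson fit: same coefficients, dispersion estimated from the data
fit_q <- glm(y ~ x, family = quasipoisson)
phi_q <- summary(fit_q)$dispersion
```

The coefficients of the two fits agree; only the standard errors differ, scaled by the square root of the dispersion.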

HTH,

Thierry

 




ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Christine Griffiths
Sent: Monday 18 May 2009 13:26
To: r-help@r-project.org
Subject: [R] Overdispersion using repeated measures lmer

Dear All

I am trying to do a repeated measures analysis using lmer and have a
number of issues. I have non-orthogonal, unbalanced data.  Count data
was obtained over 10 months for three treatments, which were arranged
into 6 blocks. 
Treatment is not nested in Block but crossed, as I originally designed
an orthogonal, balanced experiment but subsequently lost a treatment
from 2 blocks. My fixed effects are Treatment and Month, and my random
effect is Block, which was repeatedly sampled.  My model is:

Model <- lmer(Count ~ Treatment*Month + (Month|Block), data = dataset,
              family = poisson(link = sqrt))

Is this the only way in which I can specify my random effects? I.e. can
I specify them as: (1|Block)+(1|Month)?

When I run this model, I do not get any residuals in the error term or
estimated scale parameters and so do not know how to check if I have
overdispersion. Below is the output I obtained.

Generalized linear mixed model fit by the Laplace approximation
Formula: Count ~ Treatment * Month + (Month | Block)
   Data: dataset
   AIC   BIC logLik deviance
 310.9 338.5 -146.4    292.9
Random effects:
 Groups Name        Variance   Std.Dev. Corr
 Block  (Intercept) 0.06882396 0.262343
        Month       0.00011693 0.010813 1.000
Number of obs: 160, groups: Block, 6

Fixed effects:
                          Estimate Std. Error z value Pr(>|z|)
(Intercept)               1.624030   0.175827   9.237  < 2e-16 ***
Treatment2.Radiata        0.150957   0.207435   0.728 0.466777
Treatment3.Aldabra       -0.005458   0.207435  -0.026 0.979009
Month                    -0.079955   0.022903  -3.491 0.000481 ***
Treatment2.Radiata:Month  0.048868   0.033340   1.466 0.142717
Treatment3.Aldabra:Month  0.077697   0.033340   2.330 0.019781 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Trt2.R Trt3.A Month  T2.R:M
Trtmnt2.Rdt -0.533
Trtmnt3.Ald -0.533  0.450
Month       -0.572  0.585  0.585
Trtmnt2.R:M  0.474 -0.882 -0.402 -0.661
Trtmnt3.A:M  0.474 -0.402 -0.882 -0.661  0.454


Any advice on how to account for overdispersion would be much
appreciated.

Many thanks in advance
Christine

--
Christine Griffiths
School of Biological Sciences
University of Bristol
Woodland Road
Bristol BS8 1UG
Tel: 0117 9287593
Fax 0117 925 7374
christine.griffi...@bristol.ac.uk
http://www.bio.bris.ac.uk/research/mammal/tortoises.html


The views expressed in this message and any annex are purely those of the
writer and may not be regarded as stating an official position of INBO, as
long as the message is not confirmed by a duly signed document.



[R] As mat.or.vec

2009-05-19 Thread Barbara . Rogo
Is there a command like mat.or.vec for an array that I have to fill in a
for loop?
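
A minimal sketch of one possible answer, assuming the goal is a zero-filled array of known dimensions that a loop then fills: base R's `array()` plays the role that `mat.or.vec()` plays for matrices. The dimensions 2 x 3 x 4 and the values written are made up for illustration.

```r
# Preallocate a 2 x 3 x 4 array of zeros (dimensions are illustrative)
a <- array(0, dim = c(2, 3, 4))

# Fill one slice per iteration; here slice k simply holds the value k
for (k in seq_len(dim(a)[3])) {
  a[, , k] <- k
}
```

Preallocating once and assigning into slices avoids growing the object inside the loop.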



Re: [R] Generic 'diff'

2009-05-19 Thread Wacek Kusnierczyk
Stavros Macrakis wrote:
 On Mon, May 18, 2009 at 6:00 PM, Gabor Grothendieck <ggrothendi...@gmail.com>
 wrote:
 

   
 I understood what you were asking but R is an oo language so
 that's the model to use to do this sort of thing.

 

 I am not talking about creating a new class with an analogue to the
 subtraction function.  I am talking about a function which applies another
 function to a sequence and its lagged version.

 Functional arguments are used all over the place in R's base package
 (Xapply, sweep, outer, by, not to mention Map,  Reduce, Filter, etc.) and
 they seem perfectly natural here.
   

perhaps 'diff' would not be the best name, something like 'lag' would be
better for the more generic function, but 'lag' is already taken.

i agree it would be reasonable to have diff (lag) to accept an extra
argument for the function to be applied.  the solution of wrapping the
vector into a new class to be diff'ed with a non-default diff does not
seem to make much sense, as (a) what you seem to want is to custom-diff
plain vectors, (b) to keep the diff family coherent, you'd need to
upgrade the other diffs to have the extra argument anyway.

as you say, it's trivial to implement an extended diff, say difff,
reusing code from diff:

difff = function(x, ...)
   UseMethod('difff')
difff.default = function(x, lag=1, differences=1, fun=`-`, ...) {
   ismat = is.matrix(x)
   xlen = if (ismat) dim(x)[1L] else length(x)
   if (length(lag) > 1L || length(differences) > 1L || lag < 1L ||
         differences < 1L)
      stop("'lag' and 'differences' must be integers >= 1")
   if (lag * differences >= xlen) return(x[0])
   r = unclass(x)
   i1 = -1L:-lag
   if (ismat)
      for (i in 1L:differences)
         r = fun(r[i1, , drop = FALSE],
                 r[-nrow(r):-(nrow(r) - lag + 1), , drop = FALSE])
   else
      for (i in 1L:differences)
         r = fun(r[i1], r[-length(r):-(length(r) - lag + 1)])
   class(r) = oldClass(x)
   r }

now, this naive version seems to work close to what you'd like:

difff(1:4)
# 1 1 1

difff(1:4, fun=`+`)
# 3 5 7

it might be useful if the original diff were working this way.
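
For plain vectors the same idea fits in a few lines. The sketch below is a stripped-down variant, not the code from base R: `lagged_apply` is a hypothetical name, and it handles a single application to a vector only (no matrices, no repeated differences).

```r
# Hypothetical helper: apply `fun` to a vector and its lagged version,
# i.e. fun(x[t], x[t - lag]) for every t where both elements exist.
lagged_apply <- function(x, fun = `-`, lag = 1L) {
  n <- length(x)
  stopifnot(length(lag) == 1L, lag >= 1L, lag < n)
  fun(x[(lag + 1L):n], x[1L:(n - lag)])
}

d_minus <- lagged_apply(1:4)                    # same as diff(1:4)
d_plus  <- lagged_apply(1:4, fun = `+`)         # sums of neighbours
d_ratio <- lagged_apply(c(1, 2, 4, 8), fun = `/`)  # successive ratios
```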

vQ



Re: [R] fitting distribution

2009-05-19 Thread Stefan Grosse
On Tue, 19 May 2009 14:04:19 +1000 Kon Knafelman <konk2...@hotmail.com>
wrote:

KK i have the sample variances for 1000 samples, and i want to fit it
KK to a chi-squared distribution.

KK can someone please help me fit this to a chi-squared distribution
KK with n degrees of freedom. Thanks a lot 

Dear Kon, 

1. Please mail only to r-h...@stat.math.ethz.ch OR to
r-help@r-project.org, as we receive every mail from you twice if
you mail to both. 

2. Please read the posting guide. A link is attached to
every e-mail:

PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented,
minimal, self-contained, reproducible code.

especially the section "Do your homework". That means: use the search on
r-project.org and rseek.org and have a look at the documentation.

Don't expect others to do your homework. And show some effort!

Using search would have pointed you to a guide of Ricci on fitting
distributions for example:
http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf
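
For orientation only, here is a hedged sketch of what such a fit can look like. It rests on assumptions that the question does not confirm: the samples are normal with known variance 1, each has size n = 10, and the degrees of freedom are estimated by maximum likelihood with base R's `optimize`. The data are simulated, not Kon's.

```r
set.seed(42)

# 1000 sample variances from normal samples of size n (simulated stand-in)
n  <- 10
s2 <- replicate(1000, var(rnorm(n)))

# For normal data, (n - 1) * s^2 / sigma^2 follows a chi-squared
# distribution with n - 1 degrees of freedom; sigma^2 = 1 is assumed here
z <- (n - 1) * s2

# Negative log-likelihood of the chi-squared density as a function of df
negll <- function(df) -sum(dchisq(z, df = df, log = TRUE))

# One-dimensional maximum-likelihood estimate of the degrees of freedom
df_hat <- optimize(negll, interval = c(0.1, 100))$minimum
```

A QQ plot of `z` against `qchisq(ppoints(length(z)), df_hat)` gives a visual check of the fit.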

If you programmed something and it does not work as expected- THEN mail
to the list. 

hth
Stefan

PS You don't know Debbie Zhang by coincidence?...



Re: [R] Generic 'diff'

2009-05-19 Thread Wacek Kusnierczyk
Wacek Kusnierczyk wrote:
 Stavros Macrakis wrote:
   


[...]
 I am not talking about creating a new class with an analogue to the
 subtraction function.  I am talking about a function which applies another
 function to a sequence and its lagged version.

 Functional arguments are used all over the place in R's base package
 (Xapply, sweep, outer, by, not to mention Map,  Reduce, Filter, etc.) and
 they seem perfectly natural here.
   
 

   
[...]

 as you say, it's trivial to implement an extended diff, say difff,
 reusing code from diff:

 difff = function(x, ...)
UseMethod('difff')
 difff.default = function(x, lag=1, differences=1, fun=`-`, ...) {
ismat = is.matrix(x)
xlen = if (ismat) dim(x)[1L] else length(x)
  if (length(lag) > 1L || length(differences) > 1L || lag < 1L ||
  differences < 1L)
 stop("'lag' and 'differences' must be integers >= 1")
   

btw., the error message here is confusing:

lag = 1:2
diff(1:10, lag=lag)
# Error in diff.default(1:10, lag = lag) :
#  'lag' and 'differences' must be integers >= 1

is.integer(lag)
# TRUE
all(lag >= 1)
# TRUE
  
what is meant is that lag and differences must be atomic 1-element
vectors of positive integers.  or rather integer-representing numerics:

lag = 1
diff(1:5, lag=1)
# fine
is.integer(lag)
# FALSE

(the usual confusion between 'integer' as the underlying representation
and 'integer' as the represented number.)
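
The two senses of 'integer' can be told apart explicitly. In this sketch `is_whole` is a hypothetical helper modelled on the whole-number check suggested in `?is.integer`; the tolerance is an assumption.

```r
# Storage type: is the value held as an R integer?
stored_int_1  <- is.integer(1)    # a double that represents an integer
stored_int_1L <- is.integer(1L)   # an actual integer

# Represented number: is the value numerically a whole number?
is_whole <- function(x, tol = .Machine$double.eps^0.5) {
  is.numeric(x) && all(abs(x - round(x)) < tol)
}

whole_1  <- is_whole(1)
whole_pi <- is_whole(pi)
```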

vQ



Re: [R] Overdispersion using repeated measures lmer

2009-05-19 Thread Christine Griffiths

Thanks. I did try using quasipoisson and a negative binomial error but am
unsure of the degree of overdispersion and whether it is simply due to
missing values. I am investigating to see if I can replace these missing
values so that I can have a balanced orthogonal design and use lme or aov
instead which is easier to interpret. Any ideas on whether it is feasible to
replace missing values for a small dataset with repeated measures? I have 6
blocks with 3 treatments sampled over 10 months. Two blocks are missing one
treatment, albeit a different one. Also any suggestions about how I would go
about this would be much appreciated. 

I am also unsure of whether my random effects (Month|Block) for repeated
measures with random slope and intercept is correct and whether (1|Month) +
(1|Block) represents repeated measures. Any confirmation would be great. 

Cheers
Christine 




-- 
View this message in context: 
http://www.nabble.com/Overdispersion-using-repeated-measures-lmer-tp23595955p23612349.html
Sent from the R help mailing list archive at Nabble.com.



[R] R copula - empirical distributions

2009-05-19 Thread matejp

Dear list,

Has anyone used the 'copula' or 'fCopulae' package with empirical
distributions? I have two distributions (10,000 samples each) which I need
to combine using Archimedean copulas (probably Clayton and/or Frank).

Is this possible? Is there an existing empirical distribution function
defined which I can use or do I need to define my own?
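
A base-R sketch of the idea, deliberately avoiding the 'copula' and 'fCopulae' APIs so that no package call is guessed at. Everything concrete here is an assumption: the two samples are simulated stand-ins, theta = 2 is arbitrary, and the Clayton pair is drawn by the standard conditional-inversion formula. The empirical margins enter through `quantile()` with `type = 1`, which inverts the empirical distribution function.

```r
set.seed(123)

# Stand-ins for the two observed samples of 10,000 values each
x_obs <- rlnorm(10000)
y_obs <- rgamma(10000, shape = 2)

# Draw dependent uniforms (u, v) from a Clayton copula by conditional
# inversion; theta > 0 controls the strength of dependence (assumed value)
theta <- 2
n_sim <- 2000
u <- runif(n_sim)
w <- runif(n_sim)
v <- (u^(-theta) * (w^(-theta / (1 + theta)) - 1) + 1)^(-1 / theta)

# Map the uniforms through the inverse empirical distribution functions
x_sim <- quantile(x_obs, probs = u, type = 1, names = FALSE)
y_sim <- quantile(y_obs, probs = v, type = 1, names = FALSE)

# For a Clayton copula, Kendall's tau equals theta / (theta + 2)
tau_hat <- cor(x_sim, y_sim, method = "kendall")
```

The simulated pairs keep the observed marginal values while their rank dependence follows the chosen copula.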

Thanks and best regards,

Matej
-- 
View this message in context: 
http://www.nabble.com/R-copula---empirical-distributions-tp23612397p23612397.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Generic 'diff'

2009-05-19 Thread Wacek Kusnierczyk
Wacek Kusnierczyk wrote:

 btw., the error message here is confusing:

 lag = 1:2
 diff(1:10, lag=lag)
 # Error in diff.default(1:10, lag = lag) :
 #  'lag' and 'differences' must be integers >= 1

 is.integer(lag)
 # TRUE
 all(lag >= 1)
 # TRUE
   
 what is meant is that lag and differences must be atomic 1-element
 vectors of positive integers.  or rather integer-representing numerics:

 lag = 1
 diff(1:5, lag=1)
 # fine
 is.integer(lag)
 # FALSE

   

... and even non-integer-representing non-integers are fine:

diff(1:5, lag=pi)
# 3 3
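
Why the result is 3 3 can be traced by hand. The sketch below replays only the two index expressions that appear in the difff code earlier in the thread, on the assumption that diff.default indexes the same way. The colon operator steps by 1 from its left operand and stops before passing the right one, so a fractional lag is silently truncated.

```r
x   <- 1:5
lag <- pi

# Indices dropped from the front: -1, -2, -3 (stops before -pi)
i1 <- -1L:-lag

# Indices dropped from the back: -5, -4, -3 (stops before -(5 - pi + 1))
i2 <- -length(x):-(length(x) - lag + 1)

# What is left on each side, and their difference
head_part <- x[i1]   # elements 4 and 5
tail_part <- x[i2]   # elements 1 and 2
res <- head_part - tail_part
```

So `lag = pi` behaves like a lag of 3.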


vQ



Re: [R] Wilcoxon nonparametric p-values

2009-05-19 Thread Peter Dalgaard

cvandy wrote:

When I use wilcox.test, I get vastly different p-values than the problems
from Statistics textbooks.
For example:
The following problem comes from Applied Statistics and Probability for
Engineers, 2nd Edition, by D. C. Montgomery, page 736, problem 14.7.  The
problem is to compare the sample data with a population median of 8.5.  The
book answer is p = 0.25, wilcox.test answer is p = 0.573.
I've tried several other similar problems with similar results.  I've copied
the following directly from my workspace.


wilcox.exact (from exactRankTests) gives

 wilcox.exact(x - 8.5)

Exact Wilcoxon signed rank test

data:  x - 8.5
V = 80.5, p-value = 0.5748

so I'd suspect the textbook. One-sided p-value perhaps? or table 
limitation (as in p > .25). If you want to dig deeper, you'll probably 
have to check the computations implied by the text.
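
The one-sided guess is easy to check in R. This sketch reuses the data posted in the thread and sets `exact = FALSE` explicitly so the normal approximation with continuity correction is used; choosing `alternative = "less"` is an assumption about which one-sided test the textbook might have meant.

```r
x <- c(8.32, 8.05, 8.93, 8.65, 8.25, 8.46, 8.52, 8.35, 8.36, 8.41,
       8.42, 8.30, 8.71, 8.75, 8.6, 8.83, 8.5, 8.38, 8.29, 8.46)

# Two-sided signed rank test against a median of 8.5
two_sided <- wilcox.test(x, mu = 8.5, exact = FALSE)

# One-sided version: is the true location below 8.5?
one_sided <- wilcox.test(x, mu = 8.5, alternative = "less", exact = FALSE)

p_two <- two_sided$p.value
p_one <- one_sided$p.value
```

The one-sided p-value is half the two-sided one, which moves it toward the book's 0.25 without reaching it.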



Thanks for any help,
CHV

x <- c(8.32,8.05,
8.93,8.65,8.25,8.46,8.52,8.35,8.36,8.41,8.42,8.30,8.71,8.75,8.6,8.83,8.5,8.38,8.29,8.46)
wilcox.test(x,y=NULL,mu=8.5)

Wilcoxon signed rank test with continuity correction
 data:  x 
V = 80.5, p-value = 0.573
alternative hypothesis: true location is not equal to 8.5 
 
Warning messages:

1: In wilcox.test.default(x, y = NULL, mu = 8.5) :
  cannot compute exact p-value with ties
2: In wilcox.test.default(x, y = NULL, mu = 8.5) :
  cannot compute exact p-value with zeroes
  
Charles H Van deZande
 





 



--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



Re: [R] Wilcoxon nonparametric p-values

2009-05-19 Thread Keith Jewell
I just tried it in Minitab and got
---
Test of median = 8.500 versus median not = 8.500

          N for   Wilcoxon          Estimated
     N     Test  Statistic      P      Median
C1  20       19       80.5  0.573       8.460
---
One tailed gave me closer to the textbook, but still not very close.
---
Test of median = 8.500 versus median < 8.500

          N for   Wilcoxon          Estimated
     N     Test  Statistic      P      Median
C1  20       19       80.5  0.287       8.460
---

I agree with Peter Dalgaard, the book has it wrong (for some value of
wrong).

Regards

KJ



Re: [R] Overdispersion using repeated measures lmer

2009-05-19 Thread ONKELINX, Thierry
Dear Christine,

(Month|Block) and (1|Block) + (1|Month) are completely different random 
effects. The first assumes that each Block exhibits a different linear trend 
along Month. The latter assumes that each block has a random effect, each month 
has a random effect and that the random effects of block and month are 
independent. So each month has a different effect, but within a given month 
that effect is the same on each block. It is up to you to see if that kind of 
assumption is valid in your design.
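
Written out, the two specifications correspond to different linear predictors. The notation is an assumption introduced here for illustration: $g$ is the link function, $i$ indexes treatment, $b$ block and $t$ month, and $\mathbf{x}_{ibt}^\top\boldsymbol\beta$ collects the fixed effects.

```latex
% (Month | Block): each block gets its own intercept and its own slope
% along Month, and the two are allowed to be correlated
g(\mu_{ibt}) = \mathbf{x}_{ibt}^\top \boldsymbol\beta
             + u_{0b} + u_{1b}\,\mathrm{Month}_t,
\qquad (u_{0b}, u_{1b})^\top \sim N(\mathbf{0}, \Sigma)

% (1 | Block) + (1 | Month): one shift per block plus one shift per month,
% independent of each other and shared across blocks within a month
g(\mu_{ibt}) = \mathbf{x}_{ibt}^\top \boldsymbol\beta + u_b + v_t,
\qquad u_b \sim N(0, \sigma_B^2), \quad
       v_t \sim N(0, \sigma_M^2), \quad u_b \perp v_t
```

In the first form the month effect is a straight line whose slope varies by block; in the second each month has a free level that is the same in every block.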

Missing values should not be a problem, as long as they are missing at random. 
I would not try to impute the missing values. How would you determine the 
imputed values? That requires a lot of assumptions and they could affect your 
model parameters.

HTH,

Thierry




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of 
Christine Griffiths
Sent: Tuesday 19 May 2009 11:01
To: r-help@r-project.org
Subject: Re: [R] Overdispersion using repeated measures lmer


Thanks. I did try using quasipoisson and a negative binomial error but am 
unsure of the degree of overdispersion and whether it is simply due to missing 
values. I am investigating to see if I can replace these missing values so that 
I can have a balanced orthogonal design and use lme or aov instead which is 
easier to interpret. Any ideas on whether it is feasible to replace missing 
values for a small dataset with repeated measures? I have 6 blocks with 3 
treatments sampled over 10 months. Two blocks are missing one treatment, albeit 
a different one. Also any suggestions about how I would go about this would be 
much appreciated. 

I am also unsure of whether my random effects (Month|Block) for repeated 
measures with random slope and intercept is correct and whether (1|Month) +
(1|Block) represents repeated measures. Any confirmation would be great. 

Cheers
Christine 



Christine Griffiths-2 wrote:
 
 Dear All
 
 I am trying to do a repeated measures analysis using lmer and have a 
 number of issues. I have non-orthogonal, unbalanced data.  Count data 
 was obtained over 10 months for three treatments, which were arranged 
 into 6 blocks.
 Treatment is not nested in Block but crossed, as I originally designed 
 an orthogonal, balanced experiment but subsequently lost a treatment 
 from 2 blocks. My fixed effects are treatment and Month, and my random 
 effect is Block, which was repeatedly sampled.  My model is:
 
 Model <- lmer(Count ~ Treatment * Month + (Month | Block), data = dataset,
               family = poisson(link = sqrt))
 
 Is this the only way in which I can specify my random effects? I.e. 
 can I specify them as: (1|Block)+(1|Month)?
 
 When I run this model, I do not get any residuals in the error term or 
 estimated scale parameters and so do not know how to check if I have 
 overdispersion. Below is the output I obtained.
 
 Generalized linear mixed model fit by the Laplace approximation
 Formula: Count ~ Treatment * Month + (Month | Block)
    Data: dataset
    AIC   BIC logLik deviance
  310.9 338.5 -146.4    292.9
 Random effects:
  Groups Name        Variance   Std.Dev. Corr
  Block  (Intercept) 0.06882396 0.262343
         Month       0.00011693 0.010813 1.000
 Number of obs: 160, groups: Block, 6
 
 Fixed effects:
                           Estimate Std. Error z value Pr(>|z|)
 (Intercept)               1.624030   0.175827   9.237  < 2e-16 ***
 Treatment2.Radiata        0.150957   0.207435   0.728 0.466777
 Treatment3.Aldabra       -0.005458   0.207435  -0.026 0.979009
 Month                    -0.079955   0.022903  -3.491 0.000481 ***
 Treatment2.Radiata:Month  0.048868   0.033340   1.466 0.142717
 Treatment3.Aldabra:Month  0.077697   0.033340   2.330 0.019781 *
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 
 Correlation of Fixed Effects:
             (Intr) Trt2.R Trt3.A Month  T2.R:M
 Trtmnt2.Rdt -0.533
 Trtmnt3.Ald -0.533  0.450
 Month       -0.572  0.585  0.585
 Trtmnt2.R:M  0.474 -0.882 -0.402 -0.661
 Trtmnt3.A:M  0.474 -0.402 -0.882 -0.661  0.454
 
 
 Any advice on how to account for overdispersion would be much appreciated.
 
 Many thanks in advance
 

[R] problem with installing a local zip file : GFCURE

2009-05-19 Thread marc bernard

Dear all,

 

I am trying to install a package called GFCURE from a local zip file. This 
package fits a cure survival model and  has been downloaded from: 

http://post.queensu.ca/~pengp/software.html  

 

The problem is that when I try to install this package from a local zip file 
using R,  I've got the following error message:

 

Error in gzfile(file, "r") : cannot open the connection
In addition: Warning message:
In gzfile(file, "r") : cannot open compressed file 'gfcureWinR/DESCRIPTION', 
probable reason 'No such file or directory'

 

First, I thought it was an internal problem.  I then asked  some of my 
colleagues to do the same thing and they  had the same error message.

 

I would be very grateful if you can help me on that matter.

 

All the best

 

Marc


 

 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Coord_equal in ggplot2

2009-05-19 Thread ONKELINX, Thierry
Dear all,

I'm plotting some points on a graph where both axes need to have the
same scale. See the example below. Coord_equal does that trick but in
this case it wastes a lot of space on the y-axis. Setting the limits of
the y-axis myself was to no avail. 

Any suggestions to solve this problem?  

library(ggplot2)
ds <- data.frame(x = runif(1000, min = 0, max = 30), y = runif(1000,
min = 14, max = 26))
ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal()
ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal() +
scale_x_continuous(limits = c(0, 30)) + scale_y_continuous(limits =
c(14, 26))

Regards,

Thierry





Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer 
en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
door een geldig ondertekend document. The views expressed in  this message 
and any annex are purely those of the writer and may not be regarded as stating 
an official position of INBO, as long as the message is not confirmed by a duly 
signed document.



[R] remove empty objects from workspace

2009-05-19 Thread Katharina May
Hi,

how can I remove all empty objects (which are NA or have zero rows)
from my workspace?

Thanks,

 Katharina



[R] Remove objects names like character String

2009-05-19 Thread Katharina May
Hi,

how can I use rm() on objects named like:
paste("site", i, "_data", sep="") while looping
through i?
I tried rm(paste("site", i, "_data", sep="")) but I get the error that
rm() must contain names or
text strings which is confusing me as I thought paste() would create
something like that...?

Thanks,


 Katharina



-- 
Time flies like an arrow, fruit flies like bananas.



Re: [R] Coord_equal in ggplot2

2009-05-19 Thread Mike Lawrence
If you use coord_equal on data where the range on the x-axis is larger
than the range on the y-axis, then of course you'll observe extra
space on the y-axis. What did you expect?

Also, this post may be better suited to the ggplot2 mailing list:
http://had.co.nz/ggplot2/

On Tue, May 19, 2009 at 7:17 AM, ONKELINX, Thierry
thierry.onkel...@inbo.be wrote:
 Dear all,

 I'm plotting some points on a graph where both axes need to have the
 same scale. See the example below. Coord_equal does that trick but in
 this case it wastes a lot of space on the y-axis. Setting the limits of
 the y-axis myself was to no avail.

 Any suggestions to solve this problem?

 library(ggplot2)
 ds <- data.frame(x = runif(1000, min = 0, max = 30), y = runif(1000,
 min = 14, max = 26))
 ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal()
 ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal() +
 scale_x_continuous(limits = c(0, 30)) + scale_y_continuous(limits =
 c(14, 26))

 Regards,

 Thierry

 
 




-- 
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University

Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~



Re: [R] remove empty objects from workspace

2009-05-19 Thread Jim Lemon

Katharina May wrote:

Hi,

how can I remove all empty objects (which are NA or have zero rows)
from my workspace?

  

Hi Katharina,
To remove objects that are all NA:

for(object in objects()) if(all(is.na(get(object)))) rm(list=object)

If by zero rows you mean objects that do not have a dimension:

for(object in objects()) if(is.null(dim(get(object)))) rm(list=object)

Jim



Re: [R] remove empty objects from workspace

2009-05-19 Thread Katharina May
Thanks Jim, the removal of objects which are NA works perfectly!

For my second problem I didn't express myself correctly:
I  actually meant objects with rows (attributes?) but no data in it
but I solved this
adjusting your approach:

for(object in objects()) if(is.null(dim(get(object))[1]) ||
dim((get(object)))[1] == 0) rm(list=object)

Thanks a lot!
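
The loops above can be folded into one function. This is only a minimal sketch: is_empty() and remove_empty() are made-up names, and the definition of "empty" (NULL, zero length, zero rows, or an atomic vector that is all NA) is an assumption about what was meant in this thread. Functions are skipped so that is.na() is never called on them.

```r
# Hypothetical helper: decide whether a single object counts as "empty".
is_empty <- function(x) {
  if (is.function(x)) return(FALSE)                       # never remove functions
  if (is.null(x) || length(x) == 0L) return(TRUE)         # NULL or zero-length
  if (!is.null(dim(x)) && nrow(x) == 0L) return(TRUE)     # has columns but no rows
  is.atomic(x) && all(is.na(x))                           # atomic and entirely NA
}

# Hypothetical helper: remove every empty object from an environment
# and return the removed names invisibly.
remove_empty <- function(env = globalenv()) {
  nms <- ls(envir = env)
  empty <- nms[vapply(nms, function(n) is_empty(get(n, envir = env)), logical(1))]
  rm(list = empty, envir = env)
  invisible(empty)
}

# Try it on a scratch environment rather than the real workspace.
e <- new.env()
assign("a", NA, envir = e)
assign("b", data.frame(x = numeric(0)), envir = e)
assign("c", 1:3, envir = e)
removed <- remove_empty(e)
ls(e)
```

Running it against a scratch environment first is a cheap way to check what would be deleted before pointing it at globalenv().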


2009/5/19 Jim Lemon j...@bitwrit.com.au:
 Katharina May wrote:

 Hi,

 how can I remove all empty objects (which are NA or have zero rows)
 from my workspace?



 Hi Katharina,
 To remove objects that are all NA:

 for(object in objects()) if(all(is.na(get(object)))) rm(list=object)

 If by zero rows you mean objects that do not have a dimension:

 for(object in objects()) if(is.null(dim(get(object)))) rm(list=object)

 Jim





-- 
Time flies like an arrow, fruit flies like bananas.



[R] loglinear analysis

2009-05-19 Thread Zoltan Kmetty
Dear R Users,

I would like to fit a loglinear model to a three-dimensional contingency
table, but I don't want to run a fully saturated model. Is there any package
in R that could handle some kind of stepwise search to choose the best
solution? And how can I fit a non-saturated model, which only uses the
important interactions?

Best Regards
Zoltan Kmetty



Re: [R] loglinear analysis

2009-05-19 Thread Steve_Friedman
Look to the glm function then pass the output to the step function
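
A minimal sketch of that suggestion, assuming the table can be turned into one row per cell with a count column. The built-in HairEyeColor table (Hair x Eye x Sex) stands in for the poster's data, which is not shown in this thread.

```r
# One row per cell of the three-way table, with the count in Freq.
tab <- as.data.frame(HairEyeColor)

# Saturated loglinear model: all main effects and interactions.
saturated <- glm(Freq ~ Hair * Eye * Sex, family = poisson, data = tab)

# Backward stepwise search by AIC, starting from the saturated model.
reduced <- step(saturated, direction = "backward", trace = 0)
formula(reduced)

# A specific non-saturated model: all two-way interactions,
# no three-way interaction.
two_way <- glm(Freq ~ (Hair + Eye + Sex)^2, family = poisson, data = tab)
anova(two_way, saturated, test = "Chisq")
```

Which terms survive the search depends on the data, so the result of formula(reduced) is best read off rather than assumed.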





Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147


   
 From:    Zoltan Kmetty zkme...@gmail.com
 Sent by: r-help-boun...@r-project.org
 To:      r-help@r-project.org
 cc:
 Subject: [R] loglinear analysis
 Date:    05/19/2009 02:12 PM ZE2
   
   
   
   




Dear R Users,

I would like to fit a loglinear model to a three-dimensional contingency
table, but I don't want to run a fully saturated model. Is there any package
in R that could handle some kind of stepwise search to choose the best
solution? And how can I fit a non-saturated model, which only uses the
important interactions?

Best Regards
Zoltan Kmetty



Re: [R] problem with installing a local zip file : GFCURE

2009-05-19 Thread Prof Brian Ripley

On Tue, 19 May 2009, marc bernard wrote:



Dear all,

I am trying to install a package called GFCURE from a local zip 
file. This package fits a cure survival model and has been 
downloaded from:


http://post.queensu.ca/~pengp/software.html


However, it is not an R package.  Read the Readme.txt in the zip file 
for the instructions for use with R (under Windows).
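
The error message further down shows R looking for 'gfcureWinR/DESCRIPTION' inside the zip file. A minimal sketch of checking for that entry before trying to install; has_description() is a made-up helper and both file listings are invented for illustration, not the real contents of the download.

```r
# Hypothetical helper (not part of R): TRUE if a zip listing contains a
# top-level "<name>/DESCRIPTION" entry, which is the file the error
# message below shows R trying to open.
has_description <- function(entries) {
  any(grepl("^[^/]+/DESCRIPTION$", entries))
}

# Invented listings, for illustration only.
pkg_like   <- c("mypkg/DESCRIPTION", "mypkg/R/mypkg.rdb", "mypkg/help/AnIndex")
plain_code <- c("somecode/Readme.txt", "somecode/functions.R")

has_description(pkg_like)    # TRUE
has_description(plain_code)  # FALSE

# For a real file, the listing can be read without unpacking:
#   has_description(unzip("path/to/file.zip", list = TRUE)$Name)
```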


The problem is that when I try to install this package from a local 
zip file using R, I've got the following error message:


Error in gzfile(file, "r") : cannot open the connection
In addition: Warning message:
In gzfile(file, "r") : cannot open compressed file 'gfcureWinR/DESCRIPTION', 
probable reason 'No such file or directory'

First, I thought it was an internal problem.  I then asked some of 
my colleagues to do the same thing and they had the same error 
message.


I would be very grateful if you can help me on that matter.


It would have been reasonable to ask the author (Cc:ed here) for help, 
since then he will become aware that potential users have been 
confused by his instructions.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Generic 'diff'

2009-05-19 Thread Gabor Grothendieck
Note that this could be done like this for ordinary
vectors:

> x <- seq(1:4)^2

> apply(embed(x, 2), 1, function(x, f) f(rev(x)), f = diff)
[1] 3 5 7
> apply(embed(x, 2), 1, function(x, f) f(rev(x)), f = sum)
[1]  5 13 25

or a method to rollapply in zoo could be added for ordinary vectors.
Here it is applied to zoo objects:

> library(zoo)
> rollapply(zoo(x), 2, diff)
1 2 3
3 5 7
> rollapply(zoo(x), 2, sum)
 1  2  3
 5 13 25
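
For plain vectors the same results can be had without embed() or zoo. A minimal sketch: lag_apply() is a made-up name, and it takes a binary function (such as `-` or `+`) applied to the vector and its lagged copy, which differs from the window functions (diff, sum) used above.

```r
# Hypothetical helper: apply a binary function to x and its lagged copy.
lag_apply <- function(x, fun = `-`, lag = 1L) {
  n <- length(x)
  if (lag >= n) return(x[0])          # nothing to pair up
  fun(x[(lag + 1L):n], x[1L:(n - lag)])
}

x <- (1:4)^2            # 1 4 9 16
lag_apply(x)            # 3 5 7
lag_apply(x, `+`)       # 5 13 25
lag_apply(x, `-`, 2L)   # 8 12
```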


On Tue, May 19, 2009 at 4:23 AM, Wacek Kusnierczyk
waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
 Stavros Macrakis wrote:
 On Mon, May 18, 2009 at 6:00 PM, Gabor Grothendieck ggrothendi...@gmail.com

 wrote:



 I understood what you were asking but R is an oo language so
 that's the model to use to do this sort of thing.



 I am not talking about creating a new class with an analogue to the
 subtraction function.  I am talking about a function which applies another
 function to a sequence and its lagged version.

 Functional arguments are used all over the place in R's base package
 (Xapply, sweep, outer, by, not to mention Map,  Reduce, Filter, etc.) and
 they seem perfectly natural here.


 perhaps 'diff' would not be the best name, something like 'lag' would be
 better for the more generic function, but 'lag' is already taken.

 i agree it would be reasonable to have diff (lag) to accept an extra
 argument for the function to be applied.  the solution of wrapping the
 vector into a new class to be diff'ed with a non-default diff does not
 seem to make much sense, as (a) what you seem to want is to custom-diff
 plain vectors, (b) to keep the diff family coherent, you'd need to
 upgrade the other diffs to have the extra argument anyway.

 as you say, it's trivial to implement an extended diff, say difff,
 reusing code from diff:

     difff = function(x, ...)
         UseMethod('difff')
     difff.default = function(x, lag=1, differences=1, fun=`-`, ...) {
         ismat = is.matrix(x)
         xlen = if (ismat) dim(x)[1L] else length(x)
         if (length(lag) > 1L || length(differences) > 1L || lag < 1L ||
             differences < 1L)
             stop("'lag' and 'differences' must be integers >= 1")
         if (lag * differences >= xlen) return(x[0])
         r = unclass(x)
         i1 = -1L:-lag
         if (ismat)
             for (i in 1L:differences)
                 r = fun(r[i1, , drop = FALSE],
                         r[-nrow(r):-(nrow(r) - lag + 1), , drop = FALSE])
         else
             for (i in 1L:differences)
                 r = fun(r[i1], r[-length(r):-(length(r) - lag + 1)])
         class(r) = oldClass(x)
         r }

 now, this naive version seems to work close to what you'd like:

    difff(1:4)
    # 1 1 1

    difff(1:4, fun=`+`)
    # 3 5 7

 it might be useful if the original diff were working this way.

 vQ




Re: [R] remove empty objects from workspace

2009-05-19 Thread Henrique Dallazuanna
Try this also:

rm(list = names(which(unlist(eapply(globalenv(),
    function(a) all(is.na(a)) || is.null(a))))))


On Tue, May 19, 2009 at 9:07 AM, Katharina May may.kathar...@googlemail.com
 wrote:

 Thanks Jim, the removal of objects which are NA works perfectly!

 For my second problem it didn't express myself correctly:
 I  actually meant objects with rows (attributes?) but no data in it
 but I solved this
 adjusting your approach:

 for(object in objects()) if(is.null(dim(get(object))[1]) ||
 dim((get(object)))[1] == 0) rm(list=object)

 Thanks a lot!


 2009/5/19 Jim Lemon j...@bitwrit.com.au:
  Katharina May wrote:
 
  Hi,
 
  how can I remove all empty objects (which are NA or have zero rows)
  from my workspace?
 
 
 
  Hi Katharina,
  To remove objects that are all NA:
 
  for(object in objects()) if(all(is.na(get(object)))) rm(list=object)
 
  If by zero rows you mean objects that do not have a dimension:
 
  for(object in objects()) if(is.null(dim(get(object)))) rm(list=object)
 
  Jim
 
 



 --
 Time flies like an arrow, fruit flies like bananas.





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] problem with installing a local zip file : GFCURE

2009-05-19 Thread Gabor Grothendieck
On Tue, May 19, 2009 at 6:17 AM, marc bernard
marc_bern...@hotmail.co.uk wrote:

 Dear all,



 I am trying to install a package called GFCURE from a local zip file. This 
 package fits a cure survival model and  has been downloaded from:


You are assuming it's in the form of an R *package* but it's not.
Unzip it and read the readme.txt file.  If you still have problems,
contact the author.

 http://post.queensu.ca/~pengp/software.html



 The problem is that when I try to install this package from a local zip file 
 using R,  I've got the following error message:



 Error in gzfile(file, "r") : cannot open the connection
 In addition: Warning message:
 In gzfile(file, "r") : cannot open compressed file 'gfcureWinR/DESCRIPTION', 
 probable reason 'No such file or directory'



 First, I thought it was an internal problem.  I then asked  some of my 
 colleagues to do the same thing and they  had the same error message.



 I would be very grateful if you can help me on that matter.



 All the best



 Marc










[R] memory stack overflow

2009-05-19 Thread Nora Pérez








Dear colleagues,

I am trying a glm.nb for the distribution of a plant species with 93 
environmental variables. I execute the instruction and I get the following 
message: Error: C stack usage is too close to the limit.

How can I increase the memory of R?

Yours sincerely,

Nora.




Re: [R] memory stack overflow

2009-05-19 Thread ivan valencia
Have you tried principal component analysis to reduce the number of variables?
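
A minimal sketch of what that could look like. Everything here is simulated: the 93 columns stand in for the environmental variables, the counts are random, and keeping 5 components is an arbitrary choice. Whether this avoids the C stack error is not established in this thread.

```r
set.seed(1)

# Simulated stand-in for 93 environmental variables on 200 sites.
env <- as.data.frame(matrix(rnorm(200 * 93), nrow = 200))

# Principal components on centred, scaled variables.
pca <- prcomp(env, center = TRUE, scale. = TRUE)

# Keep the first k component scores as predictors (k is arbitrary here).
k <- 5
scores <- as.data.frame(pca$x[, 1:k])

# Simulated overdispersed counts as the response.
scores$count <- rnbinom(200, mu = 3, size = 1.5)

# Negative binomial GLM on the reduced predictor set, if MASS is available.
if (requireNamespace("MASS", quietly = TRUE)) {
  fit <- MASS::glm.nb(count ~ ., data = scores)
  print(coef(fit))
}
```

summary(pca) shows how much variance each component carries, which is one way to pick k for real data.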

2009/5/19 Nora Pérez norichu...@hotmail.com









 Dear colleagues,

 I am trying a glm.nb for the distribution of a plant species with 93
 environmental variables. I execute the instruction and I get the following
 message: Error: C stack usage is too close to the limit.

 How can I increase the memory of R?

 Your sincerely,

 Nora.






-- 
Luis Iván Ortiz Valencia
Estatístico Msc.
...
Curriculum Lattes

http://buscatextual.cnpq.br/buscatextual/visualizacv.jsp?id=K4778724J3
...
Aquarela Cusco Hostel

http://www.aquarelacuscohostel.com/
...



Re: [R] Remove objects names like character String

2009-05-19 Thread Uwe Ligges



Katharina May wrote:

Hi,

how can I use rm() on objects named like:
paste("site", i, "_data", sep="") while looping
through i?
I tried rm(paste("site", i, "_data", sep="")) but I get the error that
rm() must contain names or
text strings which is confusing me as I thought paste() would create
something like that...?



Well, I would try to avoid the creation of so many objects, but once you 
have them you can do even without a loop:


e.g. for the first 5:

i <- 1:5
do.call(rm, list(paste("site", i, "_data", sep="")))

Uwe Ligges
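
For the record, rm(paste(...)) fails because the unnamed arguments of rm() are taken as unevaluated names or literal strings, and a call to paste() is neither. A minimal sketch of the alternative through the list argument of rm(), which does accept a computed character vector. The scratch environment and the object names site1_data to site5_data are made up for illustration.

```r
# Scratch environment so the example does not touch the real workspace.
env <- new.env()
for (i in 1:5) assign(paste0("site", i, "_data"), i, envir = env)
ls(env)

# Build the names as a character vector and hand them to rm() via 'list'.
rm(list = paste0("site", 1:5, "_data"), envir = env)
ls(env)
```

Dropping the envir argument applies the same call to the current workspace.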



Thanks,


 Katharina







Re: [R] stringsAsFactors param in expand.grid not working

2009-05-19 Thread Martin Maechler
 RT == Rolf Turner r.tur...@auckland.ac.nz
 on Tue, 19 May 2009 11:02:08 +1200 writes:

RT On 19/05/2009, at 10:20 AM, Steve Lianoglou wrote:

 Hi all,
 
 I've (tried) to look through the bug tracker, and gmane-search the  
 R list to
 see if this has been mentioned before, and it looks like it hasn't.

Yes, thank you.

That's a bug on which we  (R-core) currently work.
More about this, and notably about Rolf's (.)*^%)(#%$)
proposal on the   R-devel  list.

Martin Maechler


 According to the R 2.9.0 release notes[1], the expand.grid function  
 should now
 take a stringsAsFactor=LOGICAL argument which controls whether or  
 not the
 function coerces strings as factors. While the parameter is indeed  
 in the
 function, a quick examination of the function's source shows that  
 the value
 of this argument is never checked, and all strings are converted to  
 factors
 as a matter of course.
 
 The fix is pretty easy, and I believe only requires changing the  
 `if` check
 here:
 
 if (!is.factor(x) && is.character(x))
     x <- factor(x, levels = unique(x))
 
 To:
 
 if (!is.factor(x) && is.character(x) && stringsAsFactors)
     x <- factor(x, levels = unique(x))
 
 I can open a ticket regarding this issue and add this there if  
 necessary.
 
 Thanks,
 -steve
 
 [1] http://article.gmane.org/gmane.comp.lang.r.general/146891
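
A quick way to check the behaviour on any given installation. This is a sketch that assumes an R version in which the argument is honoured; on the 2.9.0 release discussed here, both calls would return factors.

```r
# Same grid built twice, differing only in stringsAsFactors.
g_factor <- expand.grid(x = c("a", "b"), y = 1:2, stringsAsFactors = TRUE)
g_string <- expand.grid(x = c("a", "b"), y = 1:2, stringsAsFactors = FALSE)

class(g_factor$x)   # "factor"
class(g_string$x)   # "character" once the argument is honoured
```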

RT While we're at it --- would it not make sense to have the
RT stringsAsFactors argument (once it's working) of expand.grid()
RT default to options()$stringsAsFactors, rather than to FALSE?

RT This would make no difference to me personally, since I set
RT options(stringsAsFactors=FALSE) in my .Rprofile.  But it might make some
RT people happier 

RT cheers,

RT Rolf Turner

RT ##
RT Attention:\ This e-mail message is privileged and confid...{{dropped:9}}



Re: [R] problem with installing a local zip file : GFCURE

2009-05-19 Thread marc bernard

Dear Gabor,

 

Many thanks for your answer. I indeed didn't check the read me text.

 

Bests

 


 
 From: ggrothendi...@gmail.com
 Date: Tue, 19 May 2009 08:44:13 -0400
 Subject: Re: [R] problem with installing a local zip file : GFCURE
 To: marc_bern...@hotmail.co.uk
 CC: r-help@r-project.org
 
 On Tue, May 19, 2009 at 6:17 AM, marc bernard
 marc_bern...@hotmail.co.uk wrote:
 
  Dear all,
 
 
 
  I am trying to install a package called GFCURE from a local zip file. 
  This package fits a cure survival model and  has been downloaded from:
 
 
 You are assuming it's in the form of an R *package* but it's not.
 Unzip it and read the readme.txt file. If you still have problems,
 contact the author.
 
  http://post.queensu.ca/~pengp/software.html
 
 
 
  The problem is that when I try to install this package from a local zip 
  file using R,  I've got the following error message:
 
 
 
  Error in gzfile(file, "r") : cannot open the connection
  In addition: Warning message:
  In gzfile(file, "r") : cannot open compressed file 
  'gfcureWinR/DESCRIPTION', probable reason 'No such file or directory'
 
 
 
  First, I thought it was an internal problem.  I then asked  some of my 
  colleagues to do the same thing and they  had the same error message.
 
 
 
  I would be very grateful if you can help me on that matter.
 
 
 
  All the best
 
 
 
  Marc
 
 
 
 
 
 
 



Re: [R] Coord_equal in ggplot2

2009-05-19 Thread Dieter Menne
ONKELINX, Thierry Thierry.ONKELINX at inbo.be writes:
 
 I'm plotting some points on a graph where both axes need to have the
 same scale. See the example below. Coord_equal does that trick but in
 this case it wastes a lot of space on the y-axis. Setting the limits of
 the y-axis myself was to no avail. 
 
 Any suggestions to solve this problem?  
 
 library(ggplot2)
 ds <- data.frame(x = runif(1000, min = 0, max = 30), y = runif(1000,
 min = 14, max = 26))
 ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal()
 ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal() +
 scale_x_continuous(limits = c(0, 30)) + scale_y_continuous(limits =
 c(14, 26))

I think you need to set ratio in addition to cut off the extra space.
(Not tried)

From Docs:

Equal scales. coord_equal ensures that the x and y axes have equal scales: i.e.
1 cm along the x axis represents the same range of data as 1 cm along the y
axis. By default it will assume that you want a one-to-one ratio, but you can
change this with the ratio parameter. The aspect ratio will also be set to
ensure that the mapping is maintained regardless of the shape of the output
device. See the documentation of coord_equal() for more details.

Dieter
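
Since the aspect ratio of the panel is fixed by the data ranges, one way to lose the empty space is to give the output device the same shape. A minimal sketch, assuming the ranges from the example above; the 10 inch width is an arbitrary choice, and axis labels and margins mean the match will only be approximate.

```r
# Data ranges taken from the example in this thread.
x_range <- c(0, 30)
y_range <- c(14, 26)

# With equal scales the panel is (30 - 0) / (26 - 14) = 2.5 times
# wider than it is tall.
panel_aspect <- diff(x_range) / diff(y_range)

# Pick a width (arbitrary) and derive the matching height.
width_in  <- 10
height_in <- width_in / panel_aspect

# Build the plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  ds <- data.frame(x = runif(1000, min = 0, max = 30),
                   y = runif(1000, min = 14, max = 26))
  p <- ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal(ratio = 1)
  # To write it out with the matching shape:
  #   ggsave("equal_scales.png", p, width = width_in, height = height_in)
}
```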



Re: [R] Wilcoxon nonparametric p-values

2009-05-19 Thread Charles Van deZande
Thanks Peter,
There are 8 measurements less than 8.5, so calculating the probability
(binomial)  of 8, or fewer, happening by chance with n = 20 and p = 0.50
gives P = 0.25-- the book answer.  I've tried several problems in other
textbooks and in each case I get vastly different P-values than I get with
wilcox.test or wilcox.exact.
However, upon further testing, I've found good agreement when the calculated
P-values are small, but disagreement when P-values are large.  This might
mean a problem with wilcox.test and wilcox.exact when P-values are large or
I might be misinterpreting something.
CHV
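
The two numbers being compared may come from different tests. A minimal sketch, assuming the textbook answer is a one-sided sign test (an assumption; the book's method is not quoted in this thread). The binomial calculation described above is reproduced first, then the sign counts and the signed rank test on the data as posted, with the value 8.5 read as a separate observation.

```r
# Data as posted in this thread (20 observations).
x <- c(8.32, 8.05, 8.93, 8.65, 8.25, 8.46, 8.52, 8.35, 8.36, 8.41,
       8.42, 8.30, 8.71, 8.75, 8.60, 8.83, 8.50, 8.38, 8.29, 8.46)

# The binomial calculation described above: P(X <= 8) for n = 20, p = 0.5.
p_sign <- pbinom(8, size = 20, prob = 0.5)
p_sign   # about 0.25

# Sign counts relative to the hypothesised median, from the posted data.
n_above <- sum(x > 8.5)
n_below <- sum(x < 8.5)
c(above = n_above, below = n_below)

# Wilcoxon signed rank test: uses the ranks of |x - 8.5|, not just the signs.
# exact = FALSE because ties and a zero difference rule out the exact p-value.
w <- wilcox.test(x, mu = 8.5, exact = FALSE)
w$p.value
```

A sign test only counts how many values fall on each side of 8.5, while the signed rank test also weighs how far away they are, so the two p-values need not agree.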
   
Charles H Van deZande
---Original Message---
 
From: Peter Dalgaard
Date: 5/19/2009 5:35:07 AM
To: cvandy
Cc: r-help@r-project.org
Subject: Re: [R] Wilcoxon nonparametric p-values
 
cvandy wrote:
 When I use wilcox.test, I get vastly different p-values than the problems
 from Statistics textbooks.
 For example:
 The following problem comes from Applied Statistics and Probability for
 Engineers, 2nd Edition, by D. C. Montgomery.  Page736, problem 14.7.  The
 problem is to compare the sample data with a population median of 8.5. 
The
 book answer is p = 0.25, wilcox.test answer is p = 0.573.
 I've tried several other similar problems with similar results.  I've
copied
 the following directly from my workspace.
 
wilcox.exact (from exactRankTests) gives
 
   wilcox.exact(x - 8.5)
 
  Exact Wilcoxon signed rank test
 
data:  x - 8.5
V = 80.5, p-value = 0.5748
 
so I'd suspect the textbook. One-sided p-value perhaps? or table
limitation (as in p > .25). If you want to dig deeper, you'll probably
have to check the computations implied by the text.
 
 Thanks for any help,
 CHV
 x <- c(8.32,8.05,
 8.93,8.65,8.25,8.46,8.52,8.35,8.36,8.41,8.42,8.30,8.71,8.75,8.6,8.83,8.5,
 8.38,8.29,8.46)
 wilcox.test(x,y=NULL,mu=8.5)
 Wilcoxon signed rank test with continuity correction
  data:  x
 V = 80.5, p-value = 0.573
 alternative hypothesis: true location is not equal to 8.5

 Warning messages:
 1: In wilcox.test.default(x, y = NULL, mu = 8.5) :
   cannot compute exact p-value with ties
 2: In wilcox.test.default(x, y = NULL, mu = 8.5) :
   cannot compute exact p-value with zeroes
  
 Charles H Van deZande





  
 
 
--
O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to calculate means of matrix elements

2009-05-19 Thread dxc13

Easy enough.  What if some of the matrix elements contained missing values? 
Then how could you still calculate the means?  Example code below:
mat1 <- matrix(c(1,2,3,4,5,NA,7,8,9),3,3)
mat2 <- matrix(c(NA,6,1,9,0,5,8,2,7),3,3)
mat3 <- matrix(c(5,9,1,8,NA,3,7,2,4),3,3)
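
One possible approach, offered only as a sketch: stack the matrices into a
3-d array and take the mean over the third dimension with na.rm = TRUE, so
that each cell is averaged over whichever matrices actually have a value
there. The names arr and mat_mean are made up for illustration.

```r
mat1 <- matrix(c(1,2,3,4,5,NA,7,8,9),3,3)
mat2 <- matrix(c(NA,6,1,9,0,5,8,2,7),3,3)
mat3 <- matrix(c(5,9,1,8,NA,3,7,2,4),3,3)

# Stack the three 3x3 matrices along a third dimension
arr <- array(c(mat1, mat2, mat3), dim = c(3, 3, 3))

# Mean over the third dimension for every [row, column] position,
# skipping missing values
mat_mean <- apply(arr, c(1, 2), mean, na.rm = TRUE)
mat_mean
```

A position that is NA in all of the matrices would come out as NaN with this
approach.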


Gabor Grothendieck wrote:
 
 Try this:
 
 (mat1 + mat2 + mat3) / 3
 
 On Mon, May 18, 2009 at 8:40 PM, dxc13 dx...@health.state.ny.us wrote:

 useR's,
 I have several matrices of size 4x4 that I want to calculate means of
 their
 respective positions with.  For example, consider I have 3 matrices given
 by
 the code:
 mat1 <- matrix(sample(1:20,16,replace=T),4,4)
 mat2 <- matrix(sample(-5:15,16,replace=T),4,4)
 mat3 <- matrix(sample(5:25,16,replace=T),4,4)

 The result I want is one matrix of size 4x4 in which position [1,1] is
 the
 mean of position [1,1] of the given three matrices.  The same goes for
 all
 other positions of the matrix.  If these three matrices are given in
 separate text files, how can I write code that will get this result I
 need?

 Thanks in advance,
 dxc13
 --
 View this message in context:
 http://www.nabble.com/how-to-calculate-means-of-matrix-elements-tp23607694p23607694.html
 Sent from the R help mailing list archive at Nabble.com.

 
 

-- 
View this message in context: 
http://www.nabble.com/how-to-calculate-means-of-matrix-elements-tp23607694p23615755.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Wilcoxon nonparametric p-values

2009-05-19 Thread Peter Dalgaard

Charles Van deZande wrote:

Thanks Peter,
There are 8 measurements less than 8.5, so calculating the probability 
(binomial)  of 8, or fewer, happening by chance with n = 20 and p = 0.50 
gives P = 0.25-- the book answer.  I've tried several problems in other 
textbooks and in each case I get vastly different P-values than I get 
with wilcox.test or wilcox.exact.


Ah, but that is NOT a signed-rank test, just a sign test. (Using the 
former as a test of the median is BTW not really a good idea unless you 
assume symmetry of the distribution.)


It is also still a one-sided test, with two tails you get

 binom.test(8,20)

Exact binomial test

data:  8 and 20
number of successes = 8, number of trials = 20, p-value = 0.5034
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.1911901 0.6394574
sample estimates:
probability of success
   0.4

(and that is disregarding that one observation is exactly 8.5, so you 
should really look at 7 in 19 rather than 8 in 20.)



However, upon further testing, I've found good agreement when the 
calculated P-values are small, but disagreement when P-values are 
large.  This might mean a problem with wilcox.test and wilcox.exact when 
P-values are large or I might be misinterpreting something.


You need to read some more theory.

The extreme cases (all signs equal) are equally unlikely for the sign 
test and the signed-rank test.
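
To make the distinction concrete, here is a small sketch using the data from
the original post. It reproduces the one-sided binomial figure behind the
book's answer and then runs the sign test on the non-tied observations. The
variable names (mu0, n_above, n_nonzero, p_book, sign_test) are only
illustrative.

```r
x <- c(8.32, 8.05, 8.93, 8.65, 8.25, 8.46, 8.52, 8.35, 8.36, 8.41,
       8.42, 8.30, 8.71, 8.75, 8.6, 8.83, 8.5, 8.38, 8.29, 8.46)
mu0 <- 8.5

# The sign test only uses the signs of x - mu0; observations equal to mu0 are dropped
n_above   <- sum(x > mu0)    # 7 observations above the hypothesised median
n_nonzero <- sum(x != mu0)   # 19 observations not equal to it

# One-sided binomial probability of 8 or fewer out of 20 (the book-style figure)
p_book <- pbinom(8, 20, 0.5)

# Two-sided sign test on the non-tied observations
sign_test <- binom.test(n_above, n_nonzero, p = 0.5)
sign_test$p.value
```

The signed-rank test (wilcox.test) additionally uses the ranks of the absolute
differences, so there is no reason for the two p-values to coincide.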





CHV
  
Charles H Van deZande
/---Original Message---/
 
/*From:*/ Peter Dalgaard mailto:p.dalga...@biostat.ku.dk

/*Date:*/ 5/19/2009 5:35:07 AM
/*To:*/ cvandy mailto:cvand...@gmail.com
/*Cc:*/ r-help@r-project.org mailto:r-help@r-project.org
/*Subject:*/ Re: [R] Wilcoxon nonparametric p-values
 
cvandy wrote:

  When I use wilcox.test, I get vastly different p-values than the problems
  from Statistics textbooks.
  For example:
  The following problem comes from Applied Statistics and Probability for
  Engineers, 2nd Edition, by D. C. Montgomery.  Page736, problem 
14.7.  The
  problem is to compare the sample data with a population median of 
8.5.  The

  book answer is p = 0.25, wilcox.test answer is p = 0.573.
  I've tried several other similar problems with similar results.  I've 
copied

  the following directly from my workspace.
 
wilcox.exact (from exactRankTests) gives
 
   wilcox.exact(x - 8.5)
 
  Exact Wilcoxon signed rank test
 
data:  x - 8.5

V = 80.5, p-value = 0.5748
 
so I'd suspect the textbook. One-sided p-value perhaps? or table

limitation (as in p > .25). If you want to dig deeper, you'll probably
have to check the computations implied by the text.
 
  Thanks for any help,

  CHV
  x <- c(8.32,8.05,8.93,8.65,8.25,8.46,8.52,8.35,8.36,8.41,8.42,8.30,8.71,8.75,8.6,8.83,8.5,8.38,8.29,8.46)

  wilcox.test(x,y=NULL,mu=8.5)
  Wilcoxon signed rank test with continuity correction
   data:  x
  V = 80.5, p-value = 0.573
  alternative hypothesis: true location is not equal to 8.5
 
  Warning messages:
  1: In wilcox.test.default(x, y = NULL, mu = 8.5) :
cannot compute exact p-value with ties
  2: In wilcox.test.default(x, y = NULL, mu = 8.5) :
cannot compute exact p-value with zeroes
   
  Charles H Van deZande
 
 
 
 
 
   
 
 
--

O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk 
mailto:p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907






--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



Re: [R] Remove objects names like character String

2009-05-19 Thread Henrique Dallazuanna
Try this:

rm(list=ls(patt="site[0-9]$"))

On Tue, May 19, 2009 at 7:47 AM, Katharina May may.kathar...@googlemail.com
 wrote:

 Hi,

 how can I use rm() on objects named like:
 paste("site",i,"_data",sep="") while looping
 through i?
 I tried rm(paste("site",i,"_data",sep="")) but I get the error that
 rm() must contain names or
 text strings which is confusing me as I thought paste() would create
 something like that...?

 Thanks,


 Katharina



 --
 Time flies like an arrow, fruit flies like bananas.





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] memory stack overflow

2009-05-19 Thread Prof Brian Ripley
That is very unusual, but the C stack can be increased under Windows 
by recompiling R (see src/gnuwin32/front-ends/Makefile). On most other 
OSes it is much easier, just adjust the setting via ulimit or limit in 
your shell.


But I suspect the problem is that your model is too complex. 
Incidentally, is your R current?  Stack usage in expanding models was 
reduced recently.


On Tue, 19 May 2009, Nora Pérez wrote:


Dear colleagues,

I am trying a glm.nb for the distribution of a plant species with 93 
environmental variables. I execute the instruction and I get the 
following message: Error: C stack usage is too close to the limit.


Well, we don't know what you did (see the footer of this message) but 
if you mean 93 explanatory variables I hope you have tens of thousands 
of cases and are just looking for a predictive model.



How can I increase the memory of R?

Yours sincerely,

Nora.





--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] Remove objects names like character String

2009-05-19 Thread Katharina May
thanks to all your solutions, works out perfectly!


2009/5/19 Henrique Dallazuanna www...@gmail.com:
 Try this:

 rm(list=ls(patt="site[0-9]$"))

 On Tue, May 19, 2009 at 7:47 AM, Katharina May
 may.kathar...@googlemail.com wrote:

 Hi,

 how can I use rm() on objects named like:
 paste("site",i,"_data",sep="") while looping
 through i?
 I tried rm(paste("site",i,"_data",sep="")) but I get the error that
 rm() must contain names or
 text strings which is confusing me as I thought paste() would create
 something like that...?

 Thanks,


         Katharina



 --
 Time flies like an arrow, fruit flies like bananas.




 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O




-- 
Time flies like an arrow, fruit flies like bananas.



Re: [R] Remove objects names like character String

2009-05-19 Thread Ronggui Huang
I don't get the error you mention:

 site1_data <- 1
 site2_data <- 2
 site3_data <- 3
 for (i in 1:3) paste("site",i,"_data",sep="")


In my example, another way is: rm(list=paste("site",1:3,"_data",sep=""))

Or you can use rm(list=ls(pattern="your pattern")); in my example, it is:
rm(list=ls(pattern="site[1-3]_data"))
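
For what it is worth, the likely reason rm(paste(...)) fails is that the
arguments passed through ... in rm() are taken as unevaluated names or
literal strings, so the paste() call itself is never evaluated. The list
argument, by contrast, accepts a character vector of object names. A minimal
sketch (obj_names and any_left are illustrative names):

```r
site1_data <- 1
site2_data <- 2
site3_data <- 3

# Build the object names as a character vector; paste() is vectorised over 1:3
obj_names <- paste("site", 1:3, "_data", sep = "")

# Pass the names through the 'list' argument rather than '...'
rm(list = obj_names)

# None of the three objects should be left
any_left <- exists("site1_data") || exists("site2_data") || exists("site3_data")
any_left
```

No loop over i is needed, since the whole vector of names can be removed in
one call.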

Ronggui

2009/5/19 Katharina May may.kathar...@googlemail.com:
 Hi,

 how can I use rm() on objects named like:
 paste("site",i,"_data",sep="") while looping
 through i?
 I tried rm(paste("site",i,"_data",sep="")) but I get the error that
 rm() must contain names or
 text strings which is confusing me as I thought paste() would create
 something like that...?

 Thanks,


         Katharina



 --
 Time flies like an arrow, fruit flies like bananas.





-- 
HUANG Ronggui, Wincent
PhD Candidate
Dept of Public and Social Administration
City University of Hong Kong
Home page: http://asrr.r-forge.r-project.org/rghuang.html



[R] error glmpath()

2009-05-19 Thread David A.G

Hi R-users!

I am trying to learn how to use the glmpath package. I have a dataframe like 
this

 dim(data)
[1] 605 109

and selected the following

 response <- data[,1]
 features <- as.matrix(data[,3:109])
 mymodel <- glmpath(features,response, family = binomial)
Error in if (lambda <= min.lambda) { :
  missing value where TRUE/FALSE expected


Reading the glmpath pdf, I don't understand why I get this error since lambda 
and min.lambda seem to have default values.

Any suggestions will be very much welcomed

Dave

 sessionInfo()
R version 2.9.0 Under development (unstable) (2009-01-14 r47602)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=es_ES.UTF-8;LC_NUMERIC=C;LC_TIME=es_ES.UTF-8;LC_COLLATE=es_ES.UTF-8;LC_MONETARY=C;LC_MESSAGES=es_ES.U

TF-8;LC_PAPER=es_ES.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=es_ES.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] splines   stats graphics  grDevices utils datasets  methods
[8] base

other attached packages:
[1] foreign_0.8-34  glmpath_0.94    survival_2.35-4




[R] nlrwr package. Error when fitting the optimal Box-Cox transformation with two variables

2009-05-19 Thread Ikerne del Valle



Dear all:

	I'm trying to fit the optimal Box-Cox transformation related to nls
(see the code below) for the demand-for-money data in Greene (3rd
Edition), but in the last step R gives the following error message:

	Error in `[.data.frame`(eval(object$data), ,
as.character(formula(object)[[2]])[2]) :
  undefined columns selected

Any idea how to solve the problem?
Thanks in advance,


library(nlrwr)
r <- c(4.50,4.19,5.16,5.87,5.95,4.88,4.50,6.44,7.83,6.25,5.50,5.46,7.46,10.28,11.77,13.42,11.02,8.50,8.80,7.69)
M <- c(480.00,524.30,566.30,589.50,628.20,712.80,805.20,861.00,908.40,1023.10,1163.60,1286.60,1388.90,1497.90,1631.40,1794.40,1954.90,2188.80,2371.70,2563.60)
Y <- c(2208.30,2271.40,2365.60,2423.30,2416.20,2484.80,2608.50,2744.10,2729.30,2695.00,2826.70,2958.60,3115.20,3192.40,3187.10,3248.80,3166.00,3277.70,3492.00,3573.50)
money <- data.frame(r,M,Y)
attach(money)
ols1 <- lm(log(M)~log(r)+log(Y))
output1 <- summary(ols1)
coef1 <- ols1$coefficients
a1 <- coef1[[1]]
b11 <- coef1[[2]]
b21 <- coef1[[3]]
money.m1 <- nls(log(M)~a+b*r^g+c*Y^g,data=money,start=list(a=a1,b=b11,g=1,c=b21))
summary(money.m1)
money.m2 <- boxcox(money.m1)



Prof. Ikerne del Valle Erkiaga
Department of Applied Economics V
Faculty of Economic and Business Sciences
University of the Basque Country
Avda. Lehendakari Agirre, Nº 83
48015 Bilbao (Bizkaia) Spain



[R] Spearman rho

2009-05-19 Thread mauede
I read that Spearman's rho can be used to detect the presence of a trend in a
time series. However, I cannot figure out how to use such a test for this
purpose. First of all, which one of the available functions should I use, and
how do I pass it my mono-channel time series, which contains both positive and
negative values?
I would love to see some examples.
Thank you very much.
Maura 
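
A minimal sketch, under the assumption that "trend" here means a monotone
relationship between the series and its time index: cor.test() from the stats
package with method = "spearman" tests whether the rank correlation between
the values and their position in time is zero. The series below is made up
for illustration; negative values are no obstacle because only the ranks are
used.

```r
# Made-up series with positive and negative values and an upward drift
x <- c(-3.1, -2.0, -2.5, -0.4, 0.3, -0.1, 1.2, 2.8, 2.1, 3.5)
time_index <- seq_along(x)

# Spearman rank correlation between the series and its time index
res <- cor.test(x, time_index, method = "spearman")

res$estimate   # Spearman's rho
res$p.value    # p-value for H0: no monotone association with time
```

A rho close to +1 or -1 with a small p-value would point to an increasing or
decreasing trend. Note that this sketch ignores serial correlation in the
series, which can make the p-value too optimistic.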




[R] File too big for filehash?

2009-05-19 Thread Rocko22

Dear R users,

I am trying to use a very large file (~3 GiB) with the filehash package. The
dataset has around 4,000,000 observations. I get this message from R
while trying to load the dataset (named cc084.csv):

 dumpDF(read.csv("cc084.csv", header=T), dbName="db01")
Erreur : impossible d'allouer un vecteur de taille 15.6 Mo (French)
Error: cannot allocate vector of size 15.6 Mb (my English translation)

Is there anything I can do?
My R version is 2.8.1.

-
Rock Ouimet
Forest Soil Scientist
MRNF-QC
-- 
View this message in context: 
http://www.nabble.com/File-too-big-for-filehash--tp23618709p23618709.html
Sent from the R help mailing list archive at Nabble.com.



[R] Using while statements to insert rows in a dataframe

2009-05-19 Thread Eric McKibben
Hi.
I am very new to R and have been diligently working my way through the manual 
and various tutorials.  I am now trying to work with some of my own data and 
have encountered a problem that I need to fix.  I have a dataframe with 8 
columns and approximately 850 rows.  I have provided an excerpt of the 
dataframe below.  Within column 6 (Question) the numbers 1:33 repeat down the 
entire column.  Occasionally, however, another value (-32767) appears.  I need 
to locate this value every time it appears and in its place insert 33 rows that 
are numbered 1:33 in column Question.  Additionally, I need to maintain the 
integrity of the other columns so that the values at that location in each 
column are also repeated 33 times.  So, in the example below, I currently have 
68 rows of data, but I actually need 132 rows (two -32767 values need to be 
replaced).  Based on my reading I am guessing that I need to use a while loop, 
but I cannot seem to get it right.  Is this the appropriate function, 
or is there another more efficient method for achieving my goal?  Again, I am 
quite new to R.  Thanks for your help!

Year Month Day Time PartID Question Latency Response
2008 2 7 194556 6 1 265 -1
2008 2 7 194556 6 2 466 84
2008 2 7 194556 6 3 199 68
2008 2 7 194556 6 4 152 83
2008 2 7 194556 6 5 177 100
2008 2 7 194556 6 6 177 61
2008 2 7 194556 6 7 400 43
2008 2 7 194556 6 8 225 88
2008 2 7 194556 6 9 249 32
2008 2 7 194556 6 10 172 8
2008 2 7 194556 6 11 163 17
2008 2 7 194556 6 12 326 70
2008 2 7 194556 6 13 232 26
2008 2 7 194556 6 14 157 22
2008 2 7 194556 6 15 135 -1
2008 2 7 194556 6 16 133 2
2008 2 7 194556 6 17 222 2
2008 2 7 194556 6 18 357 4
2008 2 7 194556 6 19 131 -1
2008 2 7 194556 6 20 222 90
2008 2 7 194556 6 21 230 35
2008 2 7 194556 6 22 374 32
2008 2 7 194556 6 23 275 85
2008 2 7 194556 6 24 141 -1
2008 2 7 194556 6 25 264 19
2008 2 7 194556 6 26 380 17
2008 2 7 194556 6 27 240 21
2008 2 7 194556 6 28 127 -1
2008 2 7 194556 6 29 232 92
2008 2 7 194556 6 30 205 95
2008 2 7 194556 6 31 185 96
2008 2 7 194556 6 32 319 61
2008 2 7 194556 6 33 101 -1
2008 2 8 122203 6 -32767 0 NA
2008 2 7 194556 6 1 265 -1
2008 2 7 194556 6 2 466 84
2008 2 7 194556 6 3 199 68
2008 2 7 194556 6 4 152 83
2008 2 7 194556 6 5 177 100
2008 2 7 194556 6 6 177 61
2008 2 7 194556 6 7 400 43
2008 2 7 194556 6 8 225 88
2008 2 7 194556 6 9 249 32
2008 2 7 194556 6 10 172 8
2008 2 7 194556 6 11 163 17
2008 2 7 194556 6 12 326 70
2008 2 7 194556 6 13 232 26
2008 2 7 194556 6 14 157 22
2008 2 7 194556 6 15 135 -1
2008 2 7 194556 6 16 133 2
2008 2 7 194556 6 17 222 2
2008 2 7 194556 6 18 357 4
2008 2 7 194556 6 19 131 -1
2008 2 7 194556 6 20 222 90
2008 2 7 194556 6 21 230 35
2008 2 7 194556 6 22 374 32
2008 2 7 194556 6 23 275 85
2008 2 7 194556 6 24 141 -1
2008 2 7 194556 6 25 264 19
2008 2 7 194556 6 26 380 17
2008 2 7 194556 6 27 240 21
2008 2 7 194556 6 28 127 -1
2008 2 7 194556 6 29 232 92
2008 2 7 194556 6 30 205 95
2008 2 7 194556 6 31 185 96
2008 2 7 194556 6 32 319 61
2008 2 7 194556 6 33 101 -1
2008 2 8 143056 6 -32767 0 NA




Eric S McKibben
Industrial-Organizational Psychology Graduate Student
Clemson University
Clemson, SC


Re: [R] Using while statements to insert rows in a dataframe

2009-05-19 Thread Dieter Menne



Eric McKibben wrote:
 
 Within column 6 (Question) the numbers 1:33 repeat down the entire column. 
 Occasionally, however, another value (-32767) appears.  I need to locate
 this value every time it appears and in its place insert 33 rows that are
 numbered 1:33 in column Question.  
 Additionally, I need to maintain the integrity of the other columns so
 that the values at that location in each column are also repeated 33
 times.  So, in the example below, I currently have 68 rows of data, but I
 actually need 132 rows (two -32767 values need to be replaced). 
 
 Year Month Day Time PartID Question Latency Response
 2008 2 7 194556 6 1 265 -1
 2008 2 7 194556 6 2 466 84
 2008 2 7 194556 6 3 199 68
 ..
 2008 2 8 122203 6 -32767 0 NA
 

It's always good to boil your example down to the minimum possible; your
example is too big.
To clarify your point, assume there are only two questions.

You have:

Question Latency Response
1        265     -1
2        466     84
-32767   0       NA

You need?

Question Latency Response
1        265     -1
2        466     84
1        265     -1
2        466     84






-- 
View this message in context: 
http://www.nabble.com/Using-while-statements-to-insert-rows-in-a-dataframe-tp23618849p23619171.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Using while statements to insert rows in a dataframe

2009-05-19 Thread Luc Villandre

Eric McKibben wrote:

Hi.
I am very new to R and have been diligently working my way through the manual 
and various tutorials.  I am now trying to work with some of my own data and 
have encountered a problem that I need to fix.  I have a dataframe with 8 
columns and approximately 850 rows.  I have provided an excerpt of the 
dataframe below.  Within column 6 (Question) the numbers 1:33 repeat down the 
entire column.  Occasionally, however, another value (-32767) appears.  I need 
to locate this value every time it appears and in its place insert 33 rows that 
are numbered 1:33 in column Question.  Additionally, I need to maintain the 
integrity of the other columns so that the values at that location in each 
column are also repeated 33 times.  So, in the example below, I currently have 
68 rows of data, but I actually need 132 rows (two -32767 values need to be 
replaced).  Based on my reading I am guessing that I need to use a while loop, 
but I cannot seem to get it right.  Is this the appropriate function, 
or is there another more efficient method for achieving my goal?  Again, I am quite new to R.  Thanks for your help!


Year Month Day Time PartID Question Latency Response
2008 2 7 194556 6 1 265 -1
2008 2 7 194556 6 2 466 84
2008 2 7 194556 6 3 199 68
..
2008 2 8 143056 6 -32767 0 NA




Eric S McKibben
Industrial-Organizational Psychology Graduate Student
Clemson University
Clemson, SC
  

Hi Eric,

Using a /while/ statement would probably work, but it would imply not 
making use of R's convenient indexing aspect. What I suggest is the 
following (my.data is the data.frame you provided) :



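## To locate the rows

row.pos = which(my.data$Question == -32767)
repeat.index = rep(row.pos, 33)

## To output the result data.frame

index.vector = sort(c(seq_along(my.data$Question)[my.data$Question != -32767],
                      repeat.index))

final.result = my.data[index.vector, ]

## To number the inserted rows 1:33 in column Question
## (each marker row now appears as 33 consecutive copies)

final.result$Question[final.result$Question == -32767] = rep(1:33, length(row.pos))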

This should do the trick.
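
As a quick check of the idea on something small, here is a sketch using the
two-question layout from Dieter's message instead of 33 questions. The toy
data frame and the names n.questions, marker, is.marker, times and expanded
are made up for illustration.

```r
# Toy data: two questions per block, one marker row in between
my.data <- data.frame(
  PartID   = c(6, 6, 6, 6, 6),
  Question = c(1, 2, -32767, 1, 2),
  Latency  = c(265, 466, 0, 265, 466),
  Response = c(-1, 84, NA, -1, 84)
)

n.questions <- 2
marker <- -32767

# Repeat marker rows n.questions times, keep every other row once
is.marker <- my.data$Question == marker
times <- ifelse(is.marker, n.questions, 1)
expanded <- my.data[rep(seq_len(nrow(my.data)), times = times), ]

# Renumber the repeated marker rows 1..n.questions within each block
expanded$Question[expanded$Question == marker] <-
  rep(seq_len(n.questions), times = sum(is.marker))

rownames(expanded) <- NULL
expanded
```

The other columns of each marker row are carried along unchanged, so the
inserted rows keep Latency 0 and Response NA. Setting n.questions to 33
should give the layout asked for in the original post.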

Cheers,
--
*Luc Villandré*
/Biostatistician
McGill University Health Center -
Montreal Children's Hospital Research Institute/



[R] exists function on list objects gives always a FALSE

2009-05-19 Thread Žroutík
Dear R-users,

in a minimal example, exists() gives FALSE on an object which obviously does
exist. How else can I check for that list element, please?

 SmoothData <- list(exists=TRUE, span=0.001)
 SmoothData
$exists
[1] TRUE

$span
[1] 0.001

 exists("SmoothData")
TRUE

 exists("SmoothData$span")
FALSE

 exists("SmoothData[[2]]")
FALSE

Thank you for any opinion regarding this topic.
Zroutik



[R] create string of comma-separated content of vector

2009-05-19 Thread Katharina May
Hi,

how do I create a string of the comma-separated content of a vector?

I've got the vector i with several numeric values as content:
str(i)
num 99

and want to create a SQL statement to look like the following where
the part '(2, 4, 6, 7)' should be
the content of the vector i:
select * from  [biomass_data$] where site_no in (2, 4, 6, 7)

Here my approach (which doesn't work):
site_all_data =  sqlQuery(channel, "select * from  [biomass_data$]
where site_no in (",paste(i,sep=","),")" )
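
A sketch of one way to build such a string: the sep argument of paste() only
separates the different arguments, whereas collapse joins the elements of a
single vector into one string. The vector values and the names in_list and
query below are illustrative; the sqlQuery() call is left as a comment because
it needs an open RODBC channel.

```r
i <- c(2, 4, 6, 7)

# collapse joins the elements of one vector into a single string
in_list <- paste(i, collapse = ", ")

# Build the complete statement as one character string
query <- paste("select * from [biomass_data$] where site_no in (",
               in_list, ")", sep = "")
query

# With an open channel this could then be passed on, e.g.:
# site_all_data <- sqlQuery(channel, query)
```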


sorry for spamming the mailing list so much today...

-Katharina

-- 
Time flies like an arrow, fruit flies like bananas.



Re: [R] exists function on list objects gives always a FALSE

2009-05-19 Thread Linlin Yan
SmoothData$span is not an object which can be checked by exists(), but
part of an object which can be checked by is.null().

On Wed, May 20, 2009 at 12:07 AM, Žroutík zrou...@gmail.com wrote:
 Dear R-users,

 in a minimal example exists() gives FALSE on an object which obviously does
 exist. How can I check on that list object anyway else, please?

 SmoothData <- list(exists=TRUE, span=0.001)
 SmoothData
 $exists
 [1] TRUE

 $span
 [1] 0.001

 exists("SmoothData")
 TRUE

 exists("SmoothData$span")
 FALSE

 exists("SmoothData[[2]]")
 FALSE

 Thank you for any opinion regarding this topic.
 Zroutik

        [[alternative HTML version deleted]]



Re: [R] exists function on list objects gives always a FALSE

2009-05-19 Thread Duncan Murdoch

On 5/19/2009 12:07 PM, Žroutík wrote:

Dear R-users,

in a minimal example exists() gives FALSE on an object which obviously does
exist. How can I check on that list object anyway else, please?


SmoothData <- list(exists=TRUE, span=0.001)
SmoothData

$exists
[1] TRUE

$span
[1] 0.001


exists("SmoothData")

TRUE


exists("SmoothData$span")

FALSE


exists("SmoothData[[2]]")

FALSE

Thank you for any opinion regarding this topic.


There is no variable with name SmoothData$span, there is an element of 
SmoothData with name span.


To test for that, the safest test is probably

"span" %in% names(SmoothData)

but a common convention is to use

is.null(SmoothData$span)

because NULL elements are rare in lists.
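
A small sketch of where the two tests can disagree, assuming a list that
really does contain a NULL element (the names has_span, has_foo and with_null
are illustrative):

```r
SmoothData <- list(exists = TRUE, span = 0.001)

# Test by name: TRUE only if an element called "span" is present
has_span <- "span" %in% names(SmoothData)
has_foo  <- "foo" %in% names(SmoothData)

# The is.null() convention: a missing element is returned as NULL
span_missing <- is.null(SmoothData$span)
foo_missing  <- is.null(SmoothData$foo)

# A list built with an explicit NULL element: the name is there,
# but is.null() cannot tell it apart from a missing element
with_null <- list(span = NULL)
name_present  <- "span" %in% names(with_null)
looks_missing <- is.null(with_null$span)
```

Another point to keep in mind is that $ does partial matching on list names,
so the is.null() form can also be fooled by an element whose name merely
starts with the requested one.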

Duncan Murdoch



Re: [R] exists function on list objects gives always a FALSE

2009-05-19 Thread Romain Francois

Žroutík wrote:

Dear R-users,

in a minimal example exists() gives FALSE on an object which obviously does
exist. How can I check on that list object anyway else, please?

  

SmoothData <- list(exists=TRUE, span=0.001)
SmoothData


$exists
[1] TRUE

$span
[1] 0.001

  

exists("SmoothData")


TRUE
  

exists("SmoothData$span")


FALSE
  

This checks for existence of an object called SmoothData$span, as in :

`SmoothData$span` <- 1:10
exists("SmoothData$span")

You can do:

is.list( SmoothData ) && !is.null(names(SmoothData)) && "span" %in% 
names(SmoothData)


  

exists("SmoothData[[2]]")


FALSE
  

Similarly:

`SmoothData[[2]]` <- 1
exists("SmoothData[[2]]")

You can do:

is.list( SmoothData ) && length(SmoothData) > 1


Thank you for any opinion regarding this topic.
Zroutik



--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr



Re: [R] exists function on list objects gives always a FALSE

2009-05-19 Thread Wacek Kusnierczyk
Žroutík wrote:

 SmoothData <- list(exists=TRUE, span=0.001)
 SmoothData
 
 $exists
 [1] TRUE

 $span
 [1] 0.001

   
 exists("SmoothData")
 
 TRUE

   
 exists("SmoothData$span")
 
 FALSE
 FALSE

   

'SmoothData$span' = 'foo'
exists(SmoothData$span)
# TRUE

 exists(SmoothData[[2]])
 

'SmoothData[[2]]' = 'bar'
exists(SmoothData[[2]])
# TRUE


the problem in your case is that you have an object named 'SmoothData'
with a nested component named 'span', but you're testing for the
existence of an object named 'SmoothData$span'. 

as shown in a recent post, one attempt to do your task would be

exists('SmoothData') && 'span' %in% names(SmoothData)
# TRUE

vQ



Re: [R] Coord_equal in ggplot2

2009-05-19 Thread ONKELINX, Thierry
Dear Dieter,

I tried that, but it rescales one of the axes. The resulting graph is
still square, but now 1 cm on the Y-axis equals 2.5 cm on the X-axis.
This does not seem to be the documented behaviour.

ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal(ratio = 2/5) 

From sessionInfo()

R 2.9.0 on WinXP
ggplot2_0.8.3 reshape_0.8.3 plyr_0.1.8 proto_0.3-8

Regards,

Thierry




ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

---Original Message---
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On behalf of Dieter Menne
Sent: Tuesday 19 May 2009 16:15
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] Coord_equal in ggplot2

ONKELINX, Thierry Thierry.ONKELINX at inbo.be writes:
 
 I'm plotting some points on a graph where both axes need to have the 
 same scale. See the example below. Coord_equal does that trick but in 
 this case it wastes a lot of space on the y-axis. Setting the limits 
 of the y-axis myself was no avail.
 
 Any suggestions to solve this problem?  
 
 library(ggplot2)
 ds <- data.frame(x = runif(1000, min = 0, max = 30),
                  y = runif(1000, min = 14, max = 26))
 ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal()
 ggplot(ds, aes(x = x, y = y)) + geom_point() + coord_equal() +
   scale_x_continuous(limits = c(0, 30)) +
   scale_y_continuous(limits = c(14, 26))

I think you need to set ratio in addition to cut off the extra space.
(Not tried)

From Docs:

Equal scales. coord_equal ensures that the x and y axes have equal
scales: i.e.
1 cm along the x axis represents the same range of data as 1 cm along
the y axis. By default it will assume that you want a one-to-one ratio,
but you can change this with the ratio parameter. The aspect ratio will
also be set to ensure that the mapping is maintained regardless of the
shape of the output device. See the documentation of coord_equal() for
more details.
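
The arithmetic behind the quoted description can be sketched in base R. This is only an illustration and assumes that ratio is read as "length of one y unit divided by length of one x unit", as in current ggplot2 documentation; whether ggplot2 0.8.3 behaved that way is exactly what the thread questions. The helper name panel_aspect is made up.

```r
# Data ranges taken from the example in this thread.
x_range <- c(0, 30)
y_range <- c(14, 26)

# Panel height/width needed so that one data unit on y is drawn 'ratio'
# times as long as one data unit on x.
panel_aspect <- function(ratio, x_range, y_range) {
  ratio * diff(y_range) / diff(x_range)
}

aspect_equal <- panel_aspect(1, x_range, y_range)     # equal scales
aspect_25    <- panel_aspect(2/5, x_range, y_range)   # the ratio tried in the thread
```

Under that reading, equal scales call for a panel that is 0.4 times as tall as it is wide, so a square plot region cannot show both axes at the same scale without empty space.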

Dieter


The views expressed in this message and any annex are purely those of the
writer and may not be regarded as stating an official position of INBO, as
long as the message is not confirmed by a duly signed document.



Re: [R] exists function on list objects gives always a FALSE

2009-05-19 Thread Wacek Kusnierczyk
Linlin Yan wrote:
 SmoothData$span is not an object which can be checked by exists(), but
 part of an object which can be checked by is.null().

   

is.null is unhelpful here, in that lists can contain NULL as a named
element, and retrieving a non-existent element returns NULL:

foo = list(bar=NULL)
is.null(foo$bar)
# TRUE
is.null(foo$foo)
# TRUE

i must admit i find it surprising that ?'$' does not appropriately
explain what happens if a list is indexed with a name not included in
the list's names.  the closest is

 When extracting, a numerical, logical or character 'NA' index
 picks an unknown element and so returns 'NA' in the corresponding
 element of a logical, integer, numeric, complex or character
 result, and 'NULL' for a list. 

but it's valid for NAs in the index, and

  If no match is found then 'NULL' is returned. 

but it's in the section on environments.


vQ



Re: [R] Wilcoxon nonparametric p-values

2009-05-19 Thread Charles Van deZande
Thanks Peter,
You are correct!  After I sent the previous message, I realized that I was
comparing the sign test against the Wilcoxon test.  I would have replied
sooner, but I realized that while I was out walking my dogs.
CHV
   
Charles H Van deZande
---Original Message---
 
From: Peter Dalgaard
Date: 05/19/09 10:32:30
To: Charles Van deZande
Cc: r-help@r-project.org
Subject: Re: [R] Wilcoxon nonparametric p-values
 
Charles Van deZande wrote:
 Thanks Peter,
 There are 8 measurements less than 8.5, so calculating the probability
 (binomial)  of 8, or fewer, happening by chance with n = 20 and p = 0.50
 gives P = 0.25-- the book answer.  I've tried several problems in other
 textbooks and in each case I get vastly different P-values than I get
 with wilcox.test or wilcox.exact.
 
Ah, but that is NOT a signed-rank test, just a sign test. (Using the
former as a test of the median is BTW not really a good idea unless you
assume symmetry of the distribution.)
 
It is also still a one-sided test, with two tails you get
 
   binom.test(8,20)
 
  Exact binomial test
 
data:  8 and 20
number of successes = 8, number of trials = 20, p-value = 0.5034
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
  0.1911901 0.6394574
sample estimates:
probability of success
0.4
 
(and that is disregarding that one observation is exactly 8.5, so you
should really look at 7 in 19 rather than 8 in 20.)
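
A small sketch of the distinction drawn here, using the data quoted further down in this message. One value is garbled in the archive and is read as 8.38 here, which is an assumption; the variable names are illustrative.

```r
# Data as posted; the 18th value is read as 8.38 (assumed: garbled in the archive).
x <- c(8.32, 8.05, 8.93, 8.65, 8.25, 8.46, 8.52, 8.35, 8.36, 8.41,
       8.42, 8.30, 8.71, 8.75, 8.60, 8.83, 8.50, 8.38, 8.29, 8.46)
mu <- 8.5

# Sign test: drop observations equal to mu, count those above it.
n_above <- sum(x > mu)
n_used  <- sum(x != mu)
sign_test <- binom.test(n_above, n_used, p = 0.5)

# The textbook-style one-sided calculation with 8 out of 20.
p_one_sided <- pbinom(8, 20, 0.5)

# Signed-rank test on the same data, for comparison (it warns about ties/zeros).
signed_rank <- suppressWarnings(wilcox.test(x, mu = mu))
```

The sign test only uses the direction of each difference, while the signed-rank test also uses the ranks of their sizes, so the two p-values need not be close.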
 
 
 However, upon further testing, I've found good agreement when the
 calculated P-values are small, but disagreement when P-values are
 large.  This might mean a problem with wilcox.test and wilcox.exact when
 P-values are large or I might be misinterpreting something.
 
You need to read some more theory.
 
The extreme cases (all signs equal) are equally unlikely for the sign
test and the signed-rank test.
 
 
 
 CHV
   
 Charles H Van deZande
 ---Original Message---

 From: Peter Dalgaard p.dalga...@biostat.ku.dk
 Date: 5/19/2009 5:35:07 AM
 To: cvandy cvand...@gmail.com
 Cc: r-help@r-project.org
 Subject: Re: [R] Wilcoxon nonparametric p-values

 cvandy wrote:
   When I use wilcox.test, I get vastly different p-values than the problems
   from Statistics textbooks.
   For example:
   The following problem comes from Applied Statistics and Probability for
   Engineers, 2nd Edition, by D. C. Montgomery.  Page 736, problem 14.7.  The
   problem is to compare the sample data with a population median of 8.5.  The
   book answer is p = 0.25, wilcox.test answer is p = 0.573.
   I've tried several other similar problems with similar results.  I've copied
   the following directly from my workspace.

 wilcox.exact (from exactRankTests) gives

wilcox.exact(x - 8.5)

   Exact Wilcoxon signed rank test

 data:  x - 8.5
 V = 80.5, p-value = 0.5748

 so I'd suspect the textbook. One-sided p-value perhaps? or table
 limitation (as in p > .25). If you want to dig deeper, you'll probably
 have to check the computations implied by the text.

   Thanks for any help,
   CHV
   x<-c(8.32,8.05,8.93,8.65,8.25,8.46,8.52,8.35,8.36,8.41,8.42,8.30,8.71,8.75,8.6,8.83,8.5,8.38,8.29,8.46)
   wilcox.test(x,y=NULL,mu=8.5)
   Wilcoxon signed rank test with continuity correction
   data:  x
   V = 80.5, p-value = 0.573
   alternative hypothesis: true location is not equal to 8.5
  
   Warning messages:
   1: In wilcox.test.default(x, y = NULL, mu = 8.5) :
 cannot compute exact p-value with ties
   2: In wilcox.test.default(x, y = NULL, mu = 8.5) :
 cannot compute exact p-value with zeroes
    
   Charles H Van deZande
  
  
  
  
  



 --
 O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
   (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
 ~~ - (p.dalga...@biostat.ku.dk
 mailto:p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907


 
 
--
O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907


Re: [R] anova(cph(..) output

2009-05-19 Thread pompon

Hi,

Thank you very much for the answer.

However, I still have some misunderstandings.
From the output, can we say that plant and leaf age are significant but not
their interaction?
And the last question, I promise: what would you advise me to write in the
paper to explain the different method and acknowledge the df?

Thank you again,
julien.

 

Frank E Harrell Jr wrote:
 
 pompon wrote:
 Hello,
 
 I am a beginner in R and statistics, so my question may be trivial. Sorry in
 advance.
 I performed a Cox proportional hazards regression with 2 categorical variables
 with cph{design}. Then an anova on the results.
 the output is 
 
 anova(cph(Surv(survival, censor) ~ plant + leaf.age + plant*leaf.age,
 Mpnymph))
 
 Wald Statistics          Response: Surv(survival, censored)
 
  Factor                                           Chi-Square d.f. P
  plant  (Factor+Higher Order Factors)              96.96     12   <.0001
   All Interactions                                 10.58      6   0.1022
  leaf.age  (Factor+Higher Order Factors)           29.11      7   0.0001
   All Interactions                                 10.58      6   0.1022
  plant * leaf.age  (Factor+Higher Order Factors)   10.58      6   0.1022
  TOTAL                                            106.63     13   <.0001
 
 What does "All Interactions" stand for?
 The real df for plant is 6, and 1 for leaf.age. Then, which chi-square is
 the one for my main factors and their interaction?
 
 thank you,
 Julien.
 
 Julien,
 
 I know what you mean when you say 'real df' but that's not the whole 
 story as plant has 6 more df by interacting with a single df variable. 
 There is no such thing as 'the' main effect test for plant.  The 12 df 
 test is unique and tests whether plant is associated with Y for any 
 level of leaf.age.
 
 You can see exactly what is being tested by using various print options 
 for anova.Design, as described in the help file.  The dots option is 
 easy on the eyes.
 
 Frank
 -- 
 Frank E Harrell Jr   Professor and Chair   School of Medicine
   Department of Biostatistics   Vanderbilt University
 
 
 



Re: [R] text() to label points in ggplot

2009-05-19 Thread Sunil Suchindran
# Here are two options:

p <- ggplot(mtcars, aes(wt, mpg)) + geom_point() +
  geom_text(aes(x = 5, y = 30, label = "A Label"))

#or

response <- c(2,4)
xvar <- c(1,2)
label <- response
myData <- data.frame(response,xvar,label)
p <- ggplot(myData, aes(y=response, x=xvar))
p + geom_bar(position="dodge", stat="identity") +
geom_text(aes(y=response+1,label=label))

On Thu, May 14, 2009 at 1:22 PM, stephenb sten...@go.com wrote:


 is there a way to label points in a graph using text(locator(1),text)
 after ggplot() or qplot() ?

  qplot(date, psavert, data = economics, geom = "line",main="jhdjd")->p
  p+opts(text(locator(1),),new=T)

 does not work.
 --
 View this message in context:
 http://www.nabble.com/text%28%29-to-label-points-in-ggplot-tp23545135p23545135.html
 Sent from the R help mailing list archive at Nabble.com.





Re: [R] create string of comma-separated content of vector

2009-05-19 Thread Mark Wardle
see ?paste

e.g.

x <- seq(0,10,1)
paste(x, collapse=", ")
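
Applied to the SQL statement from the question, this could look roughly as follows. The values in i are hypothetical stand-ins for the poster's vector, and the table name is copied from the question; treat it as a sketch rather than tested database code.

```r
# Hypothetical site numbers standing in for the vector 'i' in the question.
i <- c(2, 4, 6, 7)

# collapse= joins the elements of one vector into a single string;
# sep= only separates the different arguments of paste().
in_list <- paste(i, collapse = ", ")
query <- paste("select * from [biomass_data$] where site_no in (",
               in_list, ")", sep = "")
```

The resulting string could then be passed as the query argument in the sqlQuery(channel, ...) call shown in the question.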


2009/5/19 Katharina May may.kathar...@googlemail.com:
 Hi,

 how do I create a string of the comma-separated content of a vector?

 I've got the vector i with several numeric values as content:
str(i)
 num 99

 and want to create a SQL statement to look like the following where
 the part '(2, 4, 6, 7)' should be
 the content of the vector i:
 select * from  [biomass_data$] where site_no in (2, 4, 6, 7)

 Here my approach (which doesn't work):
 site_all_data =  sqlQuery(channel, "select * from  [biomass_data$]
 where site_no in (",paste(i,sep=","),")" )


 sorry for spamming the mailing list so much today...

 -Katharina

 --
 Time flies like an arrow, fruit flies like bananas.






-- 
Dr. Mark Wardle
Specialist registrar, Neurology
Cardiff, UK



Re: [R] anova(cph(..) output

2009-05-19 Thread Frank E Harrell Jr

pompon wrote:

Hi,

Thank you very much for the answer.

However, I still have some misunderstandings.
From the output, can we say that plant and leaf age are significant but not
their interaction?
And the last question, I promise: what would you advise me to write in the
paper to explain the different method and acknowledge the df?

Thank you again,
julien.


I would say there is moderate evidence for an interaction (P=0.10) and 
strong evidence for both a plant effect (at least at some level of leaf) 
and a leaf effect (at least at some level of plant).
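
The d.f. and P figures in the quoted anova table can be checked with a few lines of base R. The degrees of freedom for the main effects (6 and 1) are the ones stated in the thread; the variable names are illustrative.

```r
# Degrees of freedom as stated in the thread: plant 6, leaf.age 1.
df_plant    <- 6
df_leaf_age <- 1
df_interact <- df_plant * df_leaf_age

# "Factor + Higher Order Factors" rows pool the main effect with the interaction.
df_plant_total    <- df_plant + df_interact
df_leaf_age_total <- df_leaf_age + df_interact
df_model          <- df_plant + df_leaf_age + df_interact

# P-value for the interaction Wald statistic from the table (chi-square 10.58).
p_interact <- pchisq(10.58, df = df_interact, lower.tail = FALSE)
```

This reproduces the 12, 7 and 13 d.f. rows and the interaction P of about 0.10.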


Frank



 


Frank E Harrell Jr wrote:

pompon wrote:

Hello,

I am a beginner in R and statistics, so my question may be trivial. Sorry in
advance.
I performed a Cox proportional hazards regression with 2 categorical variables
with cph{design}. Then an anova on the results.
the output is 

anova(cph(Surv(survival, censor) ~ plant + leaf.age + plant*leaf.age,
Mpnymph))

Wald Statistics          Response: Surv(survival, censored)

 Factor                                           Chi-Square d.f. P
 plant  (Factor+Higher Order Factors)              96.96     12   <.0001
  All Interactions                                 10.58      6   0.1022
 leaf.age  (Factor+Higher Order Factors)           29.11      7   0.0001
  All Interactions                                 10.58      6   0.1022
 plant * leaf.age  (Factor+Higher Order Factors)   10.58      6   0.1022
 TOTAL                                            106.63     13   <.0001

What does "All Interactions" stand for?
The real df for plant is 6, and 1 for leaf.age. Then, which chi-square is
the one for my main factors and their interaction?

thank you,
Julien.

Julien,

I know what you mean when you say 'real df' but that's not the whole 
story as plant has 6 more df by interacting with a single df variable. 
There is no such thing as 'the' main effect test for plant.  The 12 df 
test is unique and tests whether plant is associated with Y for any 
level of leaf.age.


You can see exactly what is being tested by using various print options 
for anova.Design, as described in the help file.  The dots option is 
easy on the eyes.


Frank
--
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University








--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University



[R] length(grep(...

2009-05-19 Thread Dennis Fisher

Colleagues

R2.8.1 in OSX

I often combine two commands as follows:
length(grep(TEXT, OBJECT)) > 0
to see if a particular snippet of text exists within an object.

Is there a single command that would accomplish this?

Dennis

Dennis Fisher MD
P  (The P Less Than Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-415-564-2220
www.PLessThan.com



Re: [R] length(grep(...

2009-05-19 Thread Romain Francois

Hi,

R >= 2.9.0 ships grepl, which lets you do that:

any( grepl( TEXT, OBJECT) )

You can also:

install.packages( "operators" )
require( operators )
OBJECT %~+% TEXT
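
A short base-R sketch comparing the idiom from the question with the grepl() form. The vector and patterns are made-up example data.

```r
# Made-up example data.
OBJECT <- c("alpha", "beta", "gamma")

old_way <- length(grep("et", OBJECT)) > 0   # idiom from the question
new_way <- any(grepl("et", OBJECT))          # grepl() returns a logical vector

no_match <- any(grepl("zzz", OBJECT))        # FALSE when nothing matches
```

Both forms answer the same yes/no question; grepl() simply skips the detour through match positions.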

Romain

Dennis Fisher wrote:

Colleagues

R2.8.1 in OSX

I often combine two commands as follows:
length(grep(TEXT, OBJECT)) > 0
to see if a particular snippet of text exists within an object.

Is there a single command that would accomplish this?

Dennis

Dennis Fisher MD
P  (The P Less Than Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-415-564-2220
www.PLessThan.com



--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr



[R] S data sets in R?

2009-05-19 Thread Michael Hannon

Greetings.  I'm trying to learn to program in R.  (I'm definitely NOT new to
programming, just to R.)  A colleague suggested that I have a look at the
book:

An Introduction to S and S-Plus
by:
Phil Spector

I've glanced at the book, and it does indeed seem to be the kind of thing I
wanted, but in the Introduction to the book, the author says he'll be using
several example data sets throughout the book, including:

1. auto.stats

2. saving.x

3. rain.nyc1

4. state.x77

The author states:

These data sets should be available as part of the standard
S distribution, so you can simply refer to them as they are
used in the examples.

Of course I want to use R, not S.  I have every R-* package installed on my
Fedora linux system, but I can't find any of the data sets mentioned above.
(The command locate rain.nyc produces no output, for instance.)

It's entirely possible that these data sets are installed, but I just don't
know enough about R to determine that.

Hence, I need help to find out if the data sets are installed, or if they CAN
be installed, etc.

If you can steer me in the right direction, please do so.

Thanks.

-- Mike



Re: [R] S data sets in R?

2009-05-19 Thread Chu, Roy
Maybe you should just bypass that book for one of these?

http://www.springer.com/series/6991

-Ro

On Tue, May 19, 2009 at 12:01 PM, Michael Hannon jm_han...@yahoo.com wrote:

 Greetings.  I'm trying to learn to program in R.  (I'm definitely NOT new to
 programming, just to R.)  A colleague suggested that I have a look at the
 book:

    An Introduction to S and S-Plus
 by:
    Phil Spector

 I've glanced at the book, and it does indeed seem to be the kind of thing I
 wanted, but in the Introduction to the book, the author says he'll be using
 several example data sets throughout the book, including:

    1. auto.stats

    2. saving.x

    3. rain.nyc1

    4. state.x77

 The author states:

    These data sets should be available as part of the standard
    S distribution, so you can simply refer to them as they are
    used in the examples.

 Of course I want to use R, not S.  I have every R-* package installed on my
 Fedora linux system, but I can't find any of the data sets mentioned above.
 (The command locate rain.nyc produces no output, for instance.)

 It's entirely possible that these data sets are installed, but I just don't
 know enough about R to determine that.

 Hence, I need help to find out if the data sets are installed, or if they
 CAN be installed, etc.

 If you can steer me in the right direction, please do so.

 Thanks.

 -- Mike





Re: [R] exists function on list objects gives always a FALSE

2009-05-19 Thread Stavros Macrakis
On Tue, May 19, 2009 at 12:07 PM, Žroutík zrou...@gmail.com wrote:

  SmoothData <- list(exists=TRUE, span=0.001)
  exists("SmoothData$span")
 FALSE


As others have said, this just checks for the existence of a variable with
the (strange) name SmoothData$span.

In some sense, in R semantics, xxx$yyy *always* exists if xxx is a list (or
other recursive object):

  xxx <- list()
  xxx$hello
 NULL

You might think that you can check names(xxx) to see if the slot has been
explicitly set, but it depends on *how* you have explicitly set the slot to
NULL:

xxx$hello <- 3
xxx$hello <- NULL
names(xxx)
   character(0)  # no names -- assigning to NULL kills slot
xxx <- list(hello=NULL)
names(xxx)
   [1] "hello"   # 1 name -- constructing with NULL-valued slot

Welcome to R!

-s



Re: [R] exists function on list objects gives always a FALSE

2009-05-19 Thread Wacek Kusnierczyk
Stavros Macrakis wrote:

 You might think that you can check names(xxx) to see if the slot has been
 explicitly set, but it depends on *how* you have explicitly set the slot to
 NULL:

 xxx$hello <- 3
 xxx$hello <- NULL
 names(xxx)
character(0)  # no names -- assigning to NULL kills slot
   

kills indeed:

foo = list(bar=1)

with(foo, bar)
# 1

foo$bar = NULL
with(foo, bar)
# error: object 'bar' not found

 xxx <- list(hello=NULL)
 names(xxx)
[1] "hello"   # 1 name -- constructing with NULL-valued slot
   

but:

# cleanup -- don't do it in mission critical session
rm(list=ls())

foo
# error: object 'foo' not found

foo = NULL
foo
# NULL

that is, foo$bar = NULL kills bar within foo (even though NULL is a
valid component of lists), but foo = NULL does *not* kill foo.
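
For completeness, a sketch of how an existing element can be set to NULL without removing it. This is not from the thread; it relies on single-bracket assignment with list(NULL), and the element names are made up.

```r
foo <- list(bar = 1, baz = 2)

# Assigning NULL with $ (or [[ ]]) removes the element.
foo$bar <- NULL
bar_gone <- !("bar" %in% names(foo))

# Assigning list(NULL) through single brackets keeps the name with a NULL value.
foo["baz"] <- list(NULL)
baz_kept <- "baz" %in% names(foo)
baz_null <- is.null(foo$baz)
```

So a NULL-valued element can be created both at construction time and afterwards, as long as the assignment goes through single brackets.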

 Welcome to R!
   

... and its zemanticks.

vQ



[R] Replace / swap values of subset of a data.frame

2009-05-19 Thread tsunhin wong
Dear R users,

I have 1 data.frame of 1500x80 - data1. I found out that there are a
few cells of data that I have misplaced, and I need to fix their
ordering.
In an attempt to swap columns 22 & 23 of the Subject with
misplaced data, I did the following:
 data2 <- data1
 subset(data1,(Subject==25 & Session==1))[,22] <- subset(data2,(Subject==25 & Session==1))[,23]
 (error messages... Could not find function "subset<-")
 subset(data1,(Subject==25 & Session==1))[,23] <- subset(data2,(Subject==25 & Session==1))[,22]
 (error messages... Could not find function "subset<-")

Please, please point me to some ways to achieve the swapping.
Thanks a lot!

Cheers,

 John
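
One possible way to do such a swap, sketched on a small made-up data frame because the real data1 is not shown. Columns 3 and 4 here play the role of columns 22 and 23, and the approach (a logical row index instead of subset() on the left-hand side) is a suggestion, not something from the thread.

```r
# Small made-up stand-in for data1; columns 3 and 4 play the role of 22 and 23.
data1 <- data.frame(Subject = c(24, 25, 25),
                    Session = c(1, 1, 2),
                    A = c(1, 2, 3),
                    B = c(10, 20, 30))

# Logical row index instead of subset(): this can be used on the left of <-.
rows <- data1$Subject == 25 & data1$Session == 1

# Swap the two columns for the selected rows only.
data1[rows, c(3, 4)] <- data1[rows, c(4, 3)]
```

Because both columns are read before either is written, no copy such as data2 is needed for the swap.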



[R] help with error

2009-05-19 Thread deanj2k

while (theta1!=theta) {...}

gives the error message:
Error in while (theta1 != theta) { : 
  missing value where TRUE/FALSE needed

but when i extract theta1!=theta and paste it into the console it comes up
with the output TRUE which contradicts the error message- im not sure what I
am doing wrong
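
The post does not show the loop body, so the following is only a guess at a common cause: theta1 or theta becomes NA or NaN during an iteration, so the condition is fine when pasted into the console with the starting values but fails on a later pass. The values below are made up to reproduce the error.

```r
theta  <- 1
theta1 <- NA   # e.g. the result of an update step that produced NaN/NA

# Evaluating the condition directly shows the problem: it is NA, not TRUE/FALSE.
cond <- theta1 != theta

# while()/if() stop with an error on an NA condition.
res <- tryCatch({ while (theta1 != theta) break; "ok" },
                error = function(e) "error")

# A guarded loop condition that treats NA as "stop".
safe_cond <- isTRUE(theta1 != theta)
```

Printing theta and theta1 inside the loop would show at which iteration the missing value appears.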


[R] what is wrong with this code?

2009-05-19 Thread deanj2k

dlogl <- -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))

d2logl <- (n/theta^2) - sum((-2y/theta^3)*(1-exp(y/theta))/(1+exp(y/theta)))
- sum(((2*y/theta^4)*exp(y/theta))/((1+exp(y/theta))^2))

returns the error message:
Error: unexpected symbol in:
"dlogl <- -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))
d2logl"

do you know what i have done wrong


Re: [R] (no subject)

2009-05-19 Thread Douglas Bates
On Mon, May 18, 2009 at 9:22 AM, Thomas Lumley tlum...@u.washington.edu wrote:
 On Mon, 18 May 2009, Debbie Zhang wrote:

 Based on a set of binomial sample data, how would you utilize the nlm
 function in R to estimate the true proportion of the population?

 I can't see why anyone would want to use nlm() for this.  The sample
 proportion is the MLE, and binom.test() gives an exact confidence interval.

Homework exercise intended to teach the use of optimization when you
can separately work out what the answer should be?

And, as you probably know, the exact confidence interval from
binom.test is not as good as the approximate interval described by
Agresti and B.A. Coull in a 1998 American Statistician article.  (The
coverage of the exact interval is at least the nominal value but it
can be greater because the binomial is discrete.)
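
A rough sketch of the three things mentioned in this exchange: an nlm() fit of the binomial proportion, the exact interval from binom.test(), and the Agresti-Coull approximate interval. The counts x and n are made up, and the logit parametrisation is a choice made here so that nlm() can search without bounds.

```r
# Made-up sample: x successes out of n trials.
x <- 12
n <- 40

# Negative log-likelihood on the logit scale, so nlm() can search unconstrained.
negloglik <- function(eta) -dbinom(x, n, plogis(eta), log = TRUE)
fit   <- nlm(negloglik, p = 0)
p_mle <- plogis(fit$estimate)      # should agree with x / n

# Exact (Clopper-Pearson) interval from binom.test().
ci_exact <- binom.test(x, n)$conf.int

# Agresti-Coull approximate 95% interval.
z       <- qnorm(0.975)
n_tilde <- n + z^2
p_tilde <- (x + z^2 / 2) / n_tilde
ci_ac   <- p_tilde + c(-1, 1) * z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
```

The numerical optimum should land on the sample proportion, which is the point of the exercise: the answer can be worked out separately and used to check the optimiser.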



Re: [R] what is wrong with this code?

2009-05-19 Thread Sundar Dorai-Raj
You're missing a ")" off the end of the first line. You should consider
using an editor (e.g. ESS/Emacs) that does parentheses matching. I
found this in less than 5 sec (less time than I'm taking to write you
a note) by cutting and pasting into Emacs.

--sundar

On Tue, May 19, 2009 at 12:52 PM, deanj2k dl...@le.ac.uk wrote:

 dlogl <- -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))

 d2logl <- (n/theta^2) - sum((-2y/theta^3)*(1-exp(y/theta))/(1+exp(y/theta)))
 - sum(((2*y/theta^4)*exp(y/theta))/((1+exp(y/theta))^2))

 returns the error message:
 Error: unexpected symbol in:
 "dlogl <- -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))
 d2logl"

 do you know what i have done wrong




Re: [R] what is wrong with this code?

2009-05-19 Thread Ted Harding
On 19-May-09 19:52:20, deanj2k wrote:
 dlogl <-
 -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))
 
 d2logl <- (n/theta^2) -
 sum((-2y/theta^3)*(1-exp(y/theta))/(1+exp(y/theta)))
 - sum(((2*y/theta^4)*exp(y/theta))/((1+exp(y/theta))^2))
 
 returns the error message:
 Error: unexpected symbol in:
 "dlogl <-
 -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))
 d2logl"
 
 do you know what i have done wrong

The error message strongly suggests that the line beginning "d2logl <-"
is being seen as a continuation of the preceding line.

Counting parentheses, I find that you are 1 short of what is required
to complete the expression in the line beginning "-(n/theta)".

In that case, R will continue on to the next line seeking the completion,
and will encounter "d2logl" non-syntactically.
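
The parenthesis count can be reproduced in R itself. This sketch keeps the offending line as a string so that it is inspected rather than evaluated; the variable names are illustrative.

```r
# The first assignment from the question, kept as a string so it is not evaluated.
line1 <- "dlogl <- -(n/theta)-sum((y/(theta)^2)*((1-exp(y/theta))/(1+exp(y/theta)))"

# Count opening minus closing parentheses; 0 means balanced.
chars   <- strsplit(line1, "")[[1]]
n_open  <- sum(chars == "(")
n_close <- sum(chars == ")")
unbalanced <- n_open - n_close

# parse() is another check: an incomplete expression fails to parse.
parses <- !inherits(try(parse(text = line1), silent = TRUE), "try-error")
```

A positive value of unbalanced means R is still waiting for that many closing parentheses when it reaches the end of the line.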

Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 19-May-09   Time: 22:12:40
-- XFMail --



Re: [R] S data sets in R?

2009-05-19 Thread Douglas Bates
On Tue, May 19, 2009 at 2:01 PM, Michael Hannon jm_han...@yahoo.com wrote:

 Greetings.  I'm trying to learn to program in R.  (I'm definitely NOT new to
 programming, just to R.)  A colleague suggested that I have a look at the
 book:

    An Introduction to S and S-Plus
 by:
    Phil Spector

 I've glanced at the book, and it does indeed seem to be the kind of thing I
 wanted, but in the Introduction to the book, the author says he'll be using
 several example data sets throughout the book, including:

    1. auto.stats

    2. saving.x

    3. rain.nyc1

    4. state.x77

 The author states:

    These data sets should be available as part of the standard
    S distribution, so you can simply refer to them as they are
    used in the examples.

 Of course I want to use R, not S.  I have every R-* package installed on my
 Fedora linux system, but I can't find any of the data sets mentioned above.
 (The command "locate rain.nyc" produces no output, for instance.)

Not an unreasonable first guess but in R you need parentheses around
the arguments in function calls and you would need to quote the name
of the object.  Even when you do those things and guess at the
function name being "find" instead of "locate" you still won't get any
joy.

 find("rain.nyc")
character(0)

The state.x77 data set is part of the datasets package but the others
never seemed to make it from S to R.  If you want to find out what is
available you can try

ls.str("package:datasets")

and stare at the output for a while until it begins to make sense.  In
general, an experienced programmer can learn a lot about the structure
of an object in R by applying Martin Maechler's str function to it.
The ls.str function is the equivalent of asking for a listing of the
objects in a namespace and applying str to each of those names.
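
As a small illustration of the functions just mentioned (a sketch that assumes a default R session, where the datasets package is attached):

```r
# find() takes the object name as a character string and reports
# where on the search path an object of that name lives.
find("state.x77")    # found via the datasets package
find("rain.nyc1")    # character(0): nothing by that name is attached

# str() gives a compact description of a single object.
str(state.x77)

# The first few names that the datasets package makes available.
head(ls("package:datasets"))
```

Running str() on anything unfamiliar is usually the quickest way to see whether it is a matrix, a data frame, a list and so on.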

Two recent books that I would recommend for learning R are Robert
Gentleman's "R Programming for Bioinformatics" and John Chambers'
"Software for Data Analysis".  Robert (one of the two R's who
started the R Project) gives you a broad overview of tools available
and considerable detail on the important parts.  John, the designer
and implementor of the S language that preceded R, describes how to
think about the programming task in R.  Both are worth reading.

 It's entirely possible that these data sets are installed, but I just don't
 know enough about R to determine that.

 Hence, I need help to find out if the data sets are installed, or if they
 CAN be installed, etc.

 If you can steer me in the right direction, please do so.

 Thanks.

 -- Mike



Re: [R] binom package (was: no subject)

2009-05-19 Thread spencerg
There are 17 different help pages in 5 different packages citing
Agresti and Coull.  This is quickly displayed using the RSiteSearch
package as follows:

library(RSiteSearch)
HTML(RSiteSearch.function("Agresti and Coull"))

I have not checked all these 17, but they doubtless help explain
Agresti and Coull's point that the term "exact confidence interval" is
like a lot of terms in Marketing:  The substance falls far short of the
hype for most purposes.



 Hope this helps. 
 Spencer Graves



Douglas Bates wrote:

On Mon, May 18, 2009 at 9:22 AM, Thomas Lumley tlum...@u.washington.edu wrote:
  

On Mon, 18 May 2009, Debbie Zhang wrote:



  

Based on a set of binomial sample data, how would you utilize the nlm
function in R to estimate the true proportion of the population?
  


  

I can't see why anyone would want to use nlm() for this.  The sample
proportion is the MLE, and binom.test() gives an exact confidence interval.
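
For reference, the exercise as posed can be sketched in a few lines. The counts x and n are made up, and the logit parametrization is my own choice to keep the optimizer away from the boundaries 0 and 1; neither comes from the thread.

```r
x <- 7    # made-up number of successes
n <- 20   # made-up number of trials

# Negative binomial log-likelihood as a function of eta = logit(p).
negloglik <- function(eta) {
  p <- plogis(eta)
  -dbinom(x, size = n, prob = p, log = TRUE)
}

fit <- nlm(negloglik, p = 0)   # start at eta = 0, i.e. p = 0.5
p_hat <- plogis(fit$estimate)

p_hat   # numerically very close to the sample proportion
x / n
```

The optimizer simply rediscovers x/n, which is the point made above: the sample proportion already is the MLE.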



Homework exercise intended to teach the use of optimization when you
can separately work out what the answer should be?

And, as you probably know, the exact confidence interval from
binom.test is not as good as the approximate interval described by
Agresti and B.A. Coull in a 1998 American Statistician article.  (The
coverage of the exact interval is at least the nominal value but it
can be greater because the binomial is discrete.)
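
The two intervals can be put side by side as follows. This is a sketch only: x and n are made-up counts, and the Agresti-Coull interval is written out by hand from the usual recipe (add z^2/2 successes and z^2/2 failures, then use the Wald formula) rather than taken from any particular package.

```r
x <- 7       # made-up number of successes
n <- 20      # made-up number of trials
conf <- 0.95

# "Exact" (Clopper-Pearson) interval as returned by binom.test().
exact <- binom.test(x, n, conf.level = conf)$conf.int

# Agresti-Coull interval computed by hand.
z       <- qnorm(1 - (1 - conf) / 2)
n_tilde <- n + z^2
p_tilde <- (x + z^2 / 2) / n_tilde
half    <- z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
agresti_coull <- c(p_tilde - half, p_tilde + half)

rbind(exact = as.numeric(exact), agresti_coull = agresti_coull)
```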



Re: [R] S data sets in R?

2009-05-19 Thread spencerg
My favorite tool for finding things like this is
RSiteSearch.function in the RSiteSearch package.  For the objects
you mention, I get the following:

library(RSiteSearch)
hits(a.s <- RSiteSearch.function("auto.stats")) # 0
hits(sx <- RSiteSearch.function("saving.x"))  # 0
hits(rn <- RSiteSearch.function("rain.nyc1")) # 0
hits(s77 <- RSiteSearch.function("state.x77")) # 12
HTML(s77) # View the 12 and find states in the datasets package.

hits(ps <- RSiteSearch.function("Phil Spector"))  # 0

If you are still interested in that book, you might write to the
author, suggesting he might get more readers by providing a package that
includes those data sets.  If he were really interested in having more
readers, he might also include script files providing R scripts for
working all the examples in the book, as Doug Bates does in the nlme
package, which can be found using system.file('scripts',
package='nlme').  These provide R code to work essentially all the
examples in Pinheiro and Bates (2000) Mixed-Effects Models in S and
S-Plus (Springer).  For me, those files made reading that book much
easier, more pleasant and memorable.



 Hope this helps. 
 Spencer Graves





[R] Getting lm() to work with a matrix

2009-05-19 Thread MikSmith

Hi

I'm fairly new to R and am trying to analyse some large spectral datasets
using stepwise regression (fairly standard in this area). I have a field
sampled dataset, of which a proportion has been held back for validation. I
gather that step() needs to be fed a regression model and lm() can produce a
multiple regression. I had thought something like:

spectra.lm <- lm(response[,3]~spectra.spec[,2:20])

might work but lm() doesn't appear to like being fed a range of columns. I
suspect I've missed something fairly fundamental here.

Any help much appreciated

best wishes

mike


[R] panel question (plm)

2009-05-19 Thread Stephen J. Barr
Hello,

I am working on a data set (already as a plm.data object) located
here: http://econsteve.com/arch/plmWithDensity.Robj

With the following R session:
 library(plm)
...
load("plmWithDensity.Robj")
model <- plm(RATE ~ density08, data=plmWithDensity)
Error: subscript out of bounds

I am not understanding the "subscript out of bounds" error, as this is
a balanced panel and there are no holes in the data set. Any help
would be very much appreciated. The model I am trying to run is

 model2 <- plm(RATE ~ AGR.PCT+SVC.PCT+IND.PCT+density08, data=plmWithDensity)

This code runs fine, but I do not get any coefficients for density08

 summary(model2)
Oneway (individual) effect Within Model

Call:
plm(formula = RATE ~ AGR.PCT + SVC.PCT + IND.PCT + density08,
    data = plmWithDensity)

Balanced Panel: n=89, T=26, N=2314

Residuals :
   Min. 1st Qu.  Median 3rd Qu.    Max.
-1860.0  -475.0    11.3   526.0  1250.0

Coefficients :
         Estimate Std. Error  t-value  Pr(>|t|)
AGR.PCT  34192.60    4281.07   7.9869 1.383e-15 ***
SVC.PCT   4024.83     457.17   8.8037 < 2.2e-16 ***
IND.PCT -16545.62    1541.32 -10.7347 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares:    98515
Residual Sum of Squares: 85206
F-statistic: 115.692 on 3 and  DF, p-value: < 2.22e-16


What is going on? Any advice is appreciated.
Thanks,
-stephen
==
Stephen J. Barr
University of Washington
WEB: www.econsteve.com
==



Re: [R] panel question (plm)

2009-05-19 Thread Achim Zeileis

On Tue, 19 May 2009, Stephen J. Barr wrote:

Hello,

I am working on a data set (already as a plm.data object) located
here: http://econsteve.com/arch/plmWithDensity.Robj

With the following R session:
 library(plm)
...
load("plmWithDensity.Robj")
model <- plm(RATE ~ density08, data=plmWithDensity)
Error: subscript out of bounds

I am not understanding the "subscript out of bounds" error, as this is


I agree that the error is not very meaningful but the problem is due to
your data: density08 does not vary within your id variable (COURT), hence
the default "within" model cannot be estimated. And it is also the reason
why density08 gets no coefficient in a larger model.


Also note that your RATE variable is a factor...I'm pretty certain you
want a numeric variable here!


Yves & Giovanni: What happens in the code is that the model.matrix()
method silently omits the column from the regressor matrix. Hence, this
goes unnoticed in the larger model and results in a regressor matrix
without any columns in the case above. Thus, the subscript error.
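
To see why a regressor that is constant within each unit drops out, here is a sketch in base R of the within transformation on made-up data. It does not use plm; the helper demean() and all variable names are hypothetical.

```r
set.seed(1)
id      <- rep(1:3, each = 4)             # three made-up units, four periods each
density <- rep(c(10, 20, 30), each = 4)   # constant within each id
x       <- rnorm(12)                      # varies within id

# Within transformation: subtract each unit's own mean.
demean <- function(v, g) v - ave(v, g)

demean(density, id)      # all zeros: nothing left to estimate a slope from
var(demean(x, id)) > 0   # x still carries within-unit variation
```

A column of zeros carries no information, which is why such a regressor cannot receive a coefficient in a "within" model.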


hth,
Z



[R] create string of comma-separated content of vector

2009-05-19 Thread Law, Jason
See ?toString

x <- 0:10
toString(x)

See ?sQuote for cases where the vector is a character and needs to be quoted.
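
Applied to the query in the question, this might look as follows. It is a sketch: the site numbers are the ones from the example query, the character vector sites is made up, and the q = FALSE argument to sQuote() assumes R 3.6.0 or later.

```r
i <- c(2, 4, 6, 7)

# Numeric vector: toString() joins the values with ", ".
query <- paste0("select * from [biomass_data$] where site_no in (",
                toString(i), ")")
query

# Character vector (made-up values): quote each element first.
sites <- c("north", "south")
in_clause <- paste0("(", toString(sQuote(sites, q = FALSE)), ")")
in_clause
```

The resulting string can then be passed as the single query argument to sqlQuery().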

Jason Law
Statistician
City of Portland
Bureau of Environmental Services
Water Pollution Control Laboratory
6543 N Burlington Avenue
Portland, OR 97203-5452
jason@bes.ci.portland.or.us

 Hi,

 how do I create a string of the comma-separated content of a vector?

 I've got the vector i with several numeric values as content:
str(i)
 num 99

 and want to create a SQL statement to look like the following where
 the part '(2, 4, 6, 7)' should be
 the content of the vector i:
 select * from  [biomass_data$] where site_no in (2, 4, 6, 7)

 Here my approach (which doesn't work):
 site_all_data =  sqlQuery(channel, "select * from  [biomass_data$]
 where site_no in (",paste(i,sep=","),")" )


 sorry for spamming so much today to the mailing list...

 -Katharina

 --
 Time flies like an arrow, fruit flies like bananas.



Re: [R] panel question (plm)

2009-05-19 Thread Stephen J. Barr
Ah, thank you for the help, and for the explanation of what is going
on. I suppose I will have to reload my data with plm.data set such
that RATE is not a factor. For my time index, will
2000,2000.25,2000.5, etc. work? Meaning 2000 quarter 1, 2000 quarter
2, etc? Or is there some special way that I need to format the time?

Thanks,
-stephen
==
Stephen J. Barr
University of Washington
WEB: www.econsteve.com
==





Re: [R] Barchart in lattice - wrong order of groups, data labels on top of each other, and a legend question

2009-05-19 Thread Deepayan Sarkar
On Mon, May 18, 2009 at 11:47 AM, Dimitri Liakhovitski ld7...@gmail.com wrote:
 Hello!
 I have a question about my lattice barchart that I am trying to build
 in Section 3 below. I can't figure out a couple of things:
 1. When I look at the dataframe test that I am trying to plot, it
 looks right to me (the group Total is always the first out of 5).
 However, in the chart it is the last. Why?
 2. How can I make sure the value labels (on y) are not sitting on top
 of each other but on top of the respective bar?
 3. Is there any way to make the legend group items horizontally as
 opposed to now (vertically - taking up too much space)

For 1 and 3, use

 auto.key = list(points = FALSE,
 rectangles = TRUE,
 reverse.rows = TRUE,
 columns = 2,
 space = "bottom")

From ?xyplot (under 'key'):

   'reverse.rows' logical, defaulting to 'FALSE'.  If
'TRUE', all components are reversed _after_ being
replicated (the details of which may depend on the
value of 'rep').  This is useful in certain
situations, e.g. with a grouped 'barchart' with
'stack = FALSE' with the categorical variable on
the vertical axis, where the bars in the plot will
usually be ordered from bottom to top, but the
corresponding legend will have the levels from top
to bottom (unless, of course, 'reverse.rows =
TRUE').  Note that in this case, unless all columns
have the same number or rows, they will no longer
be aligned.

   'columns' the number of columns column-blocks the key is
to be divided into, which are drawn side by side.


2 is hard with a simple custom panel function, because you need to
replicate some fairly involved calculations that are performed in
panel.barchart. Your best bet is to start with a copy of
panel.barchart, and then add calls to panel.text at suitable places.

-Deepayan


 Thanks a lot!
 Dimitri

 ### Section 1: generates my data set "data" - just run: #

 N<-100
 myset1<-c(1,2,3,4,5)
 probs1<-c(.05,.10,.15,.40,.30)
 myset2<-c(0,1)
 probs2<-c(.65,.30)
 myset3<-c(1,2,3,4,5,6,7)
 probs3<-c(.02,.03,.10,.15,.20,.30,.20)

 group<-unlist(lapply(1:4,function(x){
        out<-rep(x,25)
        return(out)
 }))
 set.seed(1)
 a<-sample(myset1, N, replace = TRUE,probs1)
 a[which(rbinom(100,2,.01)==1)]<-NA
 set.seed(12)
 b<-sample(myset1, N, replace = TRUE,probs1)
 b[which(rbinom(100,2,.01)==1)]<-NA
 set.seed(123)
 c<-sample(myset2, N, replace = TRUE,probs2)
 set.seed(1234)
 d<-sample(myset2, N, replace = TRUE,probs2)
 set.seed(12345)
 e<-sample(myset3, N, replace = TRUE,probs3)
 e[which(rbinom(100,2,.01)==1)]<-NA
 set.seed(123456)
 f<-sample(myset3, N, replace = TRUE,probs3)
 f[which(rbinom(100,2,.01)==1)]<-NA
 data<-data.frame(group,a=a,b=b,c=c,d=d,e=e,f=f)
 data["group"]<-lapply(data["group"],function(x) {
        x[x %in% 1]<-"Group 1"
        x[x %in% 2]<-"Group 2"
        x[x %in% 3]<-"Group 3"
        x[x %in% 4]<-"Group 4"
        return(x)
 })
 data$group<-as.factor(data$group)
 lapply(data,table,exclude=NULL)

 tables<-lapply(data,function(x){
        out<-table(x)
        out<-prop.table(out)
        out<-round(out,3)*100
        return(out)
 })
 str(tables[2])

 # Section 2: Generating a list of tables with percentages to be
 plotted in barcharts - just run: #

 listoftables<-list()
 for(i in 1:(length(data)-1)) {
  listoftables[[i]]<-data.frame()
 }
 for(i in 1:length(listoftables)) {
    total<-table(data[[i+1]])
    groups<-table(data[[1]],data[[i+1]])
    total.percents<-as.data.frame(t(as.vector(round(total*100/sum(total),1))))
    groups.percents<-as.data.frame(t(apply(groups,1,function(x){
      out<-round(x*100/sum(x),1)
     return(out)
  })))
  names(total.percents)<-names(groups.percents)
  final.table<-rbind(total.percents,groups.percents)
  row.names(final.table)[1]<-"Total"
  final.table<-as.matrix(final.table)
  listoftables[[i]]<-final.table
 }
 names(listoftables)<-names(data)[2:(length(listoftables)+1)]


 ### Section 3 - building the graph for the very first table of the
 listoftables ###
 library(lattice)
 i<-1
 test <- data.frame(Group = rep(row.names(listoftables[[i]]),5), a =
 rep(1:5,each=5),Percentage = as.vector(listoftables[[i]]))
 par.settings=trellis.par.set(reference.line = list(col = "gray", lty
 ="dotted"))
 barchart(Percentage~a, test, groups = Group, horizontal = F,
 auto.key = list(points = FALSE, rectangles = TRUE, space =
 "bottom"),ylim = c(0,50),
    panel = function(y,x,...) {
    panel.grid(h = -1, v = -1)
    panel.barchart(x, y, ...)
    ltext(x, y, labels=round(y,0),cex=.7,col="black",font=2,pos=3)
 })


 --
 Dimitri Liakhovitski
 MarketTools, Inc.
 dimitri.liakhovit...@markettools.com


Re: [R] panel question (plm)

2009-05-19 Thread Achim Zeileis

On Tue, 19 May 2009, Stephen J. Barr wrote:


Ah, thank you for the help, and for the explanation of what is going
on. I suppose I will have to reload my data with plm.data set such
that RATE is not a factor.


plmWithDensity$RATE <- as.numeric(as.character(plmWithDensity$RATE))

should suffice.
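
The detour through as.character() matters because as.numeric() applied directly to a factor returns the internal level codes, not the printed values. A small sketch with a made-up factor:

```r
rate <- factor(c("10.5", "3", "10.5"))   # made-up values stored as a factor

as.numeric(rate)                  # internal level codes, not the rates
as.numeric(as.character(rate))    # the values as printed

codes  <- as.numeric(rate)
values <- as.numeric(as.character(rate))
```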


For my time index, will
2000,2000.25,2000.5, etc. work? Meaning 2000 quarter 1, 2000 quarter
2, etc? Or is there some special way that I need to format the time?


That's ok. Internally, plm.data always stores it as a factor anyway.

Best,
Z




Re: [R] Getting lm() to work with a matrix

2009-05-19 Thread Gabor Grothendieck
Try this (note dot after ~):

lm(response[, 3] ~., as.data.frame(spectra.spec[, 2:20]))
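
A self-contained sketch of the same idea on simulated data, followed by step(). The matrix spectra, the response y and the band names are all made up; only the pattern (matrix converted to a data frame, "." on the right-hand side) follows the line above.

```r
set.seed(42)

# Simulated "spectra": 50 samples by 5 bands, with column names so that
# the data frame gets usable variable names.
spectra <- matrix(rnorm(50 * 5), nrow = 50,
                  dimnames = list(NULL, paste0("band", 1:5)))

# Simulated response that truly depends on band1 and band3 only.
y <- 2 * spectra[, 1] - spectra[, 3] + rnorm(50, sd = 0.1)

# "." expands to every column of the data frame.
fit <- lm(y ~ ., data = as.data.frame(spectra))

# Stepwise selection by AIC starting from the full model.
reduced <- step(fit, trace = 0)
names(coef(reduced))
```

Passing a matrix directly on the right-hand side gives lm() a single matrix term, which is why step() has nothing to add or drop; the data frame form gives one term per column.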




[R] [R-pkgs] New version of actuar

2009-05-19 Thread Vincent Goulet

Dear useRs,

A new version of actuar is available since last Friday. This is mainly  
a bugfix release. From the NEWS file:


Version 1.0-2
=============

USER-VISIBLE CHANGES

o mfoo() and levfoo() now return Inf instead of NaN for infinite
moments. (Thanks to David Humke for the idea.)

BUG FIXES

o Non-ascii characters in one R source file prevented compilation of
the package in a C locale (at least on OS X).

o For probability laws that have a strictly positive mode or a mode
at zero depending on the value of one or more shape parameters,
dfoo(0, ...) did not handle correctly the case exactly at the
boundary condition.

actuar is a package offering additional actuarial science
functionality to R, mostly in the fields of loss distributions, risk
theory (including ruin theory), simulation of compound hierarchical
models and credibility theory.


See also: http://www.actuar-project.org.

--
Vincent Goulet, Associate Professor
École d'actuariat
Université Laval, Québec
vincent.gou...@act.ulaval.ca   http://vgoulet.act.ulaval.ca

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] panel question (plm)

2009-05-19 Thread Stephen J. Barr
Thank you for the advice. For the density08 variable, that is
population density in year 2008. I also have population densities for
year 2000, so I could put them both in, and interpolate between them
for the times that are covered by the panel (2000-2008), and then just
have a density column that will vary both over time and across
various courts. I would assume that this would fix the problem of
density not showing up in my coefficients list, although I think it is
more of an econometrics issue :)

Thanks again,
-stephen



[R] how to copy files from one direction to another?

2009-05-19 Thread XinMeng
There are 10 files in c:\\
I want to copy 3 of them to d:\\

How can I do it via R?


Thanks!



Re: [R] how to copy files from one direction to another?

2009-05-19 Thread jim holtman
?file.copy
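
A sketch of how that might look, using temporary directories in place of c:\\ and d:\\ so that it can be run anywhere. The file names and the choice of which three files to copy are made up.

```r
# Stand-ins for the source and destination directories.
src <- tempfile("src_")
dst <- tempfile("dst_")
dir.create(src)
dir.create(dst)

# Create ten small made-up files in the source directory.
all_files <- file.path(src, sprintf("file%02d.txt", 1:10))
for (f in all_files) writeLines("demo", f)

# Pick three of them and copy them into the destination directory.
wanted <- all_files[c(1, 4, 7)]
copied <- file.copy(from = wanted, to = dst)

copied            # one logical per file, TRUE where the copy succeeded
list.files(dst)
```

With real paths, from would be something like file.path("c:", c(...)) and to would be the destination directory.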

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Replace / swap values of subset of a data.frame

2009-05-19 Thread jim holtman
Exactly what are you trying to do?  Are you trying to just change a subset
of the values?  'subset' does not have an 'assignment' operator.  Maybe you
want something like this, but it is not clear from your description.  Also
it is not clear if you have exactly the same set of matching values in the
two data frames for the subset conditions.  If you do, then this might work:

data1[(data1$Subject==25) & (data1$Session==1), 22] <-
data2[(data2$Subject==25) & (data2$Session==1), 23]
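
On a toy data frame the whole swap can be done in one assignment, without a second copy. This is a sketch only: the data frame and the column names A and B are made up and stand in for columns 22 and 23.

```r
data1 <- data.frame(Subject = c(25, 25, 26),
                    Session = c(1, 2, 1),
                    A = c(1, 2, 3),       # stands in for column 22
                    B = c(10, 20, 30))    # stands in for column 23

# Logical index of the rows whose values were misplaced.
rows <- data1$Subject == 25 & data1$Session == 1

# The right-hand side is evaluated before anything is assigned and is
# matched to the target columns by position, so this swaps A and B.
data1[rows, c("A", "B")] <- data1[rows, c("B", "A")]
data1
```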

On Tue, May 19, 2009 at 3:50 PM, tsunhin wong thjw...@gmail.com wrote:

 Dear R users,

 I have 1 data.frame of 1500x80 - data1. I found out that there are a
 few cells of data that I have misplaced, and I need to fix the ordering
 of them.
 In an attempt to swap columns 22 & 23 of the Subject with
 misplaced data, I did the following:
  data2 <- data1
  subset(data1,(Subject==25 & Session==1))[,22] <-
 subset(data2,(Subject==25 & Session==1))[,23]
  (error messages... Could not find function "subset<-")
  subset(data1,(Subject==25 & Session==1))[,23] <-
 subset(data2,(Subject==25 & Session==1))[,22]
  (error messages... Could not find function "subset<-")

 Please, please point me to some ways to achieve the swapping.
 Thanks a lot!

 Cheers,

 John




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Replace / swap values of subset of a data.frame

2009-05-19 Thread Gabor Grothendieck
If DF is your data frame:

DF2 <- edit(DF)

and then make the changes manually in the spreadsheet that
pops up.

On Tue, May 19, 2009 at 3:50 PM, tsunhin wong thjw...@gmail.com wrote:
 Dear R users,

 I have 1 data.frame of 1500x80 - data1. I found out that there are a
 few cells of data that I have misplaced, and I need to fix their
 ordering.
 In an attempt to swap columns 22 & 23 for the Subject with
 misplaced data, I did the following:
 data2 <- data1
 subset(data1,(Subject==25 & Session==1))[,22] <- subset(data2,(Subject==25 &
 Session==1))[,23]
 (error messages... Could not find function "subset<-")
 subset(data1,(Subject==25 & Session==1))[,23] <- subset(data2,(Subject==25 &
 Session==1))[,22]
 (error messages... Could not find function "subset<-")

 Please, please point me to some ways to achieve the swapping.
 Thanks a lot!

 Cheers,

     John



[R] Too large a data set to be handled by R?

2009-05-19 Thread tsunhin wong
Dear R users,

I have been extracting data dynamically from raw files so far, but
that takes a very long time.
To save time, I am planning to generate a single data set of size
1500 x 2, with each data point a 9-digit decimal number.
I know R is limited to 2^31-1 elements per vector and that my data set
is not going to exceed this limit. But my laptop only has 2 GB of RAM
and is running 32-bit Windows (XP or Vista).

I have run into R memory problems before. Please let me know your
opinion, based on your experience.
Thanks a lot!

- John
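
A rough way to size such a data set before building it, offered as a sketch rather than a definitive answer: R stores each numeric value as an 8-byte double, so the raw size is rows x columns x 8 bytes. The helper estimate_mb and the column count below are hypothetical; in particular n_cols_guess is a made-up placeholder, not the figure from the post.

```r
# Hypothetical helper: raw size in MB of a numeric matrix of doubles
estimate_mb <- function(n_rows, n_cols, bytes_per_value = 8) {
  n_rows * n_cols * bytes_per_value / 2^20
}

# Placeholder column count for illustration only (an assumption)
n_cols_guess <- 20000
size_mb <- estimate_mb(1500, n_cols_guess)
print(size_mb)

# Sanity check of the 8-bytes-per-value rule against an actual R object
measured_bytes <- as.numeric(object.size(numeric(1e6)))
print(measured_bytes / 1e6)   # bytes per element, slightly above 8
```

Many R operations make temporary copies of their inputs, so the working memory needed is likely to be a multiple of this raw figure.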



[R] Extracting correlation in a nlme model

2009-05-19 Thread Kenneth Roy Cabrera Torres
Hi R users:

Is there a function to obtain the correlation within groups
from this very simple lme model?

> modeloMx1
Linear mixed-effects model fit by REML
  Data: barrag 
  Log-restricted-likelihood: -70.92739
  Fixed: fza_tension ~ 1 
(Intercept) 
   90.86667 

Random effects:
 Formula: ~1 | molde
        (Intercept) Residual
StdDev:    2.610052 2.412176

Number of Observations: 30
Number of Groups: 3 

I want to obtain \rho = \sigma_b^2 / (\sigma_b^2 + \sigma^2)

I know that I obtain \sigma_b^2 and \sigma^2 with

> VarCorr(modeloMx1)
molde = pdLogChol(1) 
            Variance StdDev  
(Intercept) 6.812374 2.610052
Residual    5.818593 2.412176

But I want to know whether I can obtain
\rho = 6.8123/(6.8123 + 5.8185) = 0.53934 directly.

Thank you for your help.

Kenneth
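
One possible way to get \rho from the fitted object, shown as a sketch under assumptions: the first part simply redoes the arithmetic with the variances printed by VarCorr() in the post, and the second part pulls the "Variance" column out of VarCorr() for a stand-in model. The Orthodont data set that ships with nlme is used only because barrag is not available here.

```r
# Variance components copied from the VarCorr() output in the post
var_between <- 6.812374   # (Intercept) variance, sigma_b^2
var_within  <- 5.818593   # Residual variance, sigma^2
rho_posted  <- var_between / (var_between + var_within)
print(round(rho_posted, 5))   # 0.53934

# Same calculation taken from a fitted model (stand-in data, not barrag)
if (requireNamespace("nlme", quietly = TRUE)) {
  library(nlme)
  fit  <- lme(distance ~ 1, random = ~ 1 | Subject, data = Orthodont)
  vc   <- VarCorr(fit)                     # character matrix for lme fits
  vars <- as.numeric(vc[, "Variance"])     # c(sigma_b^2, sigma^2)
  rho  <- vars[1] / sum(vars)
  print(rho)
}
```

This relies on the model having a single random intercept, so that VarCorr() returns exactly two rows, (Intercept) and Residual, in that order.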



[R] glht problem

2009-05-19 Thread Stats Wolf
I am struggling with a simple repeated-measure model:

fit <- lme(trait ~ year * A, random = ~1|subj/year)

A being a factor with three levels. I get the following results
for anova(fit):

            numDF denDF   F-value p-value
(Intercept)     1   126 2471.4720  <.0001
year           20    60   10.4126  <.0001
A               2   126   23.0721  <.0001
year:A         40   126    1.6499  0.0193

Now I try to use glht for A, but fail:

Linear Hypotheses:
             Estimate Std. Error z value p value
A2 - A1 == 0     0.25       1.10   0.227   0.972
A3 - A1 == 0     1.00       1.10   0.909   0.634
A3 - A2 == 0     0.75       1.10   0.682   0.774
(Adjusted p values reported -- single-step method)

Warning message:
In mcp2matrix(model, linfct = linfct) :
  covariate interactions found -- default contrast might be inappropriate


What can be going on with this?

many thanks in advance,
Wolf



[R] Package Inline under windows

2009-05-19 Thread _

Hi all,
I installed the package inline (Windows version) but cannot compile any
code. I always get the error message

ERROR(s) during compilation : source code errors or compiler
configuration errors!

Unfortunately there is no description of where the package looks for a
C compiler or where to set the configuration.

With the Linux version, everything works.

Thanks for your help!
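
A quick diagnostic that may help narrow this down, given as a sketch and not a confirmed fix: as far as I know, inline compiles through R's own build machinery (R CMD SHLIB), so the compiler and make must be visible on the PATH of the R session. On Windows that usually means installing Rtools. The tool names checked below are an assumption about a typical GCC-based setup.

```r
# Ask R where (if anywhere) it finds the build tools on the current PATH
tools <- Sys.which(c("gcc", "make"))
print(tools)

# An empty string means the tool was not found
not_found <- names(tools)[tools == ""]
if (length(not_found) > 0) {
  message("Not found on PATH: ", paste(not_found, collapse = ", "))
} else {
  message("Both tools are visible to R.")
}
```

If either entry comes back empty, adding the toolchain's bin directory to PATH and restarting R is the first thing to try.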
