Re: [R] drc results differ for different versions

2009-05-22 Thread Hans Vermeiren
Hello
Thanks a lot, Marc, for the suggestion to explore the issue a bit more
systematically. So I did, and the conclusion is that with the old drc
1.4-2 I get SE=0.003, while with the new drc 1.5-2 I get SE=0.4,
irrespective of the R version or the versions of the packages drc
depends on.

I hope somebody can help me further. I still have the feeling that an SE
of 0.003 on an IC50 of 1.2, for a reasonably good fit, is more realistic
than an SE of 0.4 on an IC50 of 1.2.

Regards,
Hans

Here are the results of the following test script:
d <- data.frame(
  dose = c(2.00e-05, 4.00e-06, 8.00e-07, 1.60e-07, 3.20e-08, 6.40e-09,
           1.28e-09, 2.56e-10, 5.10e-11, 1.00e-11, 2.00e-05, 4.00e-06,
           8.00e-07, 1.60e-07, 3.20e-08, 6.40e-09, 1.28e-09, 2.56e-10,
           5.10e-11, 1.00e-11),
  response = c(97.202, 81.670, 47.292, 16.924, 16.832, 6.832, 11.118,
               1.319, 5.495, -3.352, 102.464, 83.114, 50.631, 22.792,
               18.348, 19.066, 27.794, 14.682, 11.992, 12.868))

m <- drm(response ~ log10(dose*1e6), data = d,
         fct = l4(fixed = c(NA, NA, NA, NA),
                  names = c("hs", "bottom", "top", "ec50")),
         logDose = 10, control = drmc(useD = TRUE))

summary(m)
RESULTS:
sessionInfo()
R version 2.7.0 (2008-04-22) 
i386-pc-mingw32 

locale:
LC_COLLATE=Dutch_Belgium.1252;LC_CTYPE=Dutch_Belgium.1252;LC_MONETARY=Dutch_Belgium.1252;LC_NUMERIC=C;LC_TIME=Dutch_Belgium.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base


other attached packages:
[1] drc_1.5-2      plotrix_2.4-2  nlme_3.1-89    MASS_7.2-41    lattice_0.17-6
[6] alr3_1.1.7

loaded via a namespace (and not attached):
[1] grid_2.7.0  tools_2.7.0

RESULT: ec50:(Intercept)=1.27447   SE=0.43541
CONCLUSION: R 2.7.0 with recent drc 1.5-2 (older dependencies) gives
SE=0.43541

===
R version 2.9.0 (2009-04-17) 
i386-pc-mingw32 

locale:
LC_COLLATE=Dutch_Belgium.1252;LC_CTYPE=Dutch_Belgium.1252;LC_MONETARY=Dutch_Belgium.1252;LC_NUMERIC=C;LC_TIME=Dutch_Belgium.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base


other attached packages:
[1] drc_1.4-2   plotrix_2.5-5   nlme_3.1-90 MASS_7.2-46
[5] lattice_0.17-22 alr3_1.1.7 

loaded via a namespace (and not attached):
[1] grid_2.9.0  tools_2.9.0
RESULT: ec50:(Intercept)=1.2039e+00   SE=0.003
CONCLUSION: R 2.9.0 with old drc 1.4-2 (newer dependencies) gives
SE=0.003


Hans,

You have three important factors changing here. The version of R, the  
version of drc and the versions of any relevant drc dependencies  
(alr3, lattice, magic, MASS, nlme, plotrix).

I would first try to install the newer version of drc on the older R  
system (all else staying the same) and see what you get. Don't run  
update.packages() here, lest you change other things. Just install the  
newer version of drc.

If you get the same results as the older version, then it might lead  
you to something in R or one of the package dependencies changing.

If you get a different result, then it would lead to something in drc  
changing.


You can also install the old version of drc on your more recent R  
system to see what you get, which might help to confirm behavior.

The old source version of drc would be available from:

   http://cran.r-project.org/src/contrib/Archive/drc/

I also found a Windows binary of the old package here:

   http://cran.r-project.org/bin/windows/contrib/2.6/drc_1.4-2.zip
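
For what it's worth, a minimal sketch of installing one of these files
(untested here; it assumes a working internet connection, and the source
build needs the package build tools):

download.file("http://cran.r-project.org/src/contrib/Archive/drc/drc_1.4-2.tar.gz",
              destfile = "drc_1.4-2.tar.gz")
install.packages("drc_1.4-2.tar.gz", repos = NULL, type = "source")
## or, for the Windows binary, download the .zip and use
## install.packages("drc_1.4-2.zip", repos = NULL)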


I have also copied Christian Ritz, the drc package author/maintainer,  
so that he may be able to assist you further with the problem.

HTH,

Marc Schwartz


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] arrangement of crowded labels

2009-05-22 Thread Jim Lemon

Thomas Zumbrunn wrote:

...
Thanks for your answers. This was almost what I was looking for, except that I 
would need something for a 2-dimensional context (my question was not specific 
enough).
  

Hi Thomas,
The thigmophobe.labels function just works out how to place each label
away from the point nearest to the point that is being labeled. The
pointLabel function in the maptools package and the spread.labs function
in TeachingDemos use more sophisticated algorithms to do this, and may
work in situations where thigmophobe.labels wouldn't separate all the
labels.
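
For instance, a rough sketch (not from the original reply; it assumes
the packages are installed):

library(plotrix)     # thigmophobe.labels
library(maptools)    # pointLabel
set.seed(42)
x <- rnorm(20); y <- rnorm(20)
plot(x, y)
thigmophobe.labels(x, y, labels = 1:20)
## or let simulated annealing search for non-overlapping positions:
## pointLabel(x, y, labels = as.character(1:20))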


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Confirmatory factor analysis problems using sem package (works in Amos)

2009-05-22 Thread S. Messing

Hello all,

I'm trying to replicate a confirmatory factor analysis done in Amos.  The
idea is to compare a one-factor and a two-factor model.  I get the following
warning message when I run either model:

Could not compute QR decomposition of Hessian.
Optimization probably did not converge.

I have no idea what to do here.  I believe other posters have reported
the same problem.  It seems that the ability to invert the correlation
matrix may have something to do with this error, but solve(correl)
yields a solution.

Here are my correlation matrix and model specifications:

--- R CODE BEGIN


correl <- matrix(c(
   1.000,     -0.6657822,  0.6702089,  0.7997673, -0.7225454,  0.8992372, -0.8026491, -0.6715168, -0.5781565,
  -0.6657822,  1.000,     -0.5107568, -0.5030886,  0.6971188, -0.6306937,  0.7200848,  0.5121227,  0.4496810,
   0.6702089, -0.5107568,  1.000,      0.7276558, -0.3893792,  0.6043672, -0.7176532, -0.5247434, -0.4670362,
   0.7997673, -0.5030886,  0.7276558,  1.000,     -0.6251056,  0.8164190, -0.6728515, -0.6371453, -0.5531964,
  -0.7225454,  0.6971188, -0.3893792, -0.6251056,  1.000,     -0.7760765,  0.6175124,  0.5567924,  0.4914176,
   0.8992372, -0.6306937,  0.6043672,  0.8164190, -0.7760765,  1.000,     -0.7315507, -0.6625136, -0.5814590,
  -0.8026491,  0.7200848, -0.7176532, -0.6728515,  0.6175124, -0.7315507,  1.000,      0.5910688,  0.5096898,
  -0.6715168,  0.5121227, -0.5247434, -0.6371453,  0.5567924, -0.6625136,  0.5910688,  1.000,      0.8106496,
  -0.5781565,  0.4496810, -0.4670362, -0.5531964,  0.4914176, -0.5814590,  0.5096898,  0.8106496,  1.000),
  nrow = 9, ncol = 9)

rownames(correl) = c("pvote", "jmposaff", "jmnegaff",
                     "boposaff", "bonegaff",
                     "obama.therm", "mccain.therm",
                     "oddcand.D", "evencand.D")

colnames(correl) = c("pvote", "jmposaff", "jmnegaff",
                     "boposaff", "bonegaff",
                     "obama.therm", "mccain.therm",
                     "oddcand.D", "evencand.D")

#One Factor Model:
model.all <- specify.model()
allmeasures -> pvote,          b1, NA
allmeasures -> obama.therm,    b2, NA
allmeasures -> mccain.therm,   b3, NA
allmeasures -> jmposaff,       b4, NA
allmeasures -> jmnegaff,       b5, NA
allmeasures -> boposaff,       b6, NA
allmeasures -> bonegaff,       b7, NA
allmeasures -> oddcand.D,      b8, NA
allmeasures -> evencand.D,     b9, NA
allmeasures <-> allmeasures,   NA, 1
pvote <-> pvote,               v1, NA
obama.therm <-> obama.therm,   v2, NA
mccain.therm <-> mccain.therm, v3, NA
jmposaff <-> jmposaff,         v4, NA
jmnegaff <-> jmnegaff,         v5, NA
boposaff <-> boposaff,         v6, NA
bonegaff <-> bonegaff,         v7, NA
oddcand.D <-> oddcand.D,       v8, NA
evencand.D <-> evencand.D,     v9, NA

sem.all <- sem(model.all, correl, 1100)

summary(sem.all)

#Two Factor Model:
model.vi <- specify.model()
verbal -> pvote,               b1, NA
verbal -> obama.therm,         b2, NA
verbal -> mccain.therm,        b3, NA
verbal -> jmposaff,            b4, NA
verbal -> jmnegaff,            b5, NA
verbal -> boposaff,            b6, NA
verbal -> bonegaff,            b7, NA
imp -> oddcand.D,              b8, NA
imp -> evencand.D,             b9, NA
imp <-> imp,                   NA, 1
verbal <-> verbal,             NA, 1
pvote <-> pvote,               v1, NA
obama.therm <-> obama.therm,   v2, NA
mccain.therm <-> mccain.therm, v3, NA
jmposaff <-> jmposaff,         v4, NA
jmnegaff <-> jmnegaff,         v5, NA
boposaff <-> boposaff,         v6, NA
bonegaff <-> bonegaff,         v7, NA
oddcand.D <-> oddcand.D,       v8, NA
evencand.D <-> evencand.D,     v9, NA
imp <-> verbal,                civ, NA

sem.vi <- sem(model.vi, correl, 1100)
summary(sem.vi)

--- R CODE END


Thanks very much.

-Solomon
-- 
View this message in context: 
http://www.nabble.com/Confirmatory-factor-analysis-problems-using-sem-package-%28works-in-Amos%29-tp23664618p23664618.html
Sent from the R help mailing list archive at Nabble.com.

[R] Goodness of fit in quantile regression

2009-05-22 Thread laura m.

Dear R users,


I've used the function qr.fit.sfn to estimate a quantile regression on a
panel data set. Now I would like to compute a statistic to measure the
goodness of fit of this model.

Does someone know how I could do that? I could compute a pseudo-R2, but
in order to do that I would need the value of the objective function at
the optimum, and I don't see how to get this from the function
qr.fit.sfn.
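
For reference, the objective here is just the check-function loss, which
can be evaluated directly -- a sketch (not part of the original
question; it assumes the response y, the fitted values and the quantile
tau are at hand):

rho <- function(u, tau) sum(u * (tau - (u < 0)))   # check-function loss

## Koenker-Machado pseudo-R2: full fit versus an intercept-only fit
## (the tau-th sample quantile of y)
pseudoR2 <- function(y, fitted, tau)
    1 - rho(y - fitted, tau) / rho(y - quantile(y, tau), tau)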

If someone has any good idea about how to solve this problem I would be most
grateful!

Best

Laura
-- 
View this message in context: 
http://www.nabble.com/Goodness-of-fit-in-quantile-regression-tp23666962p23666962.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] drc results differ for different versions

2009-05-22 Thread Christian Ritz
Hi Hans,

I hope I can resolve your problems below (Marc, thank you very much for cc'ing 
me on your
initial response!).

Have a look at the following R lines:


## Fitting the model using drm() (from the latest version)
m1 <- drm(response ~ dose, data = d, fct = LL.4())
summary(m1)
plot(m1)

## Checking the fit by using nls()
## (we have very good guesses for the parameter estimates)
m2 <- nls(response ~ c + (d - c)/(1 + (dose/e)^b), data = d,
          start = list(b = -0.95, c = 10, d = 106, e = 1.2745e-06))
summary(m2)


The standard errors agree quite well. The minor discrepancies between
the two fits are attributable to different numerical approximations of
the variance-covariance matrix used in drm() and nls().

So I would use the latest version of 'drc', especially for datasets with
really small doses. One recent change to drm() was to incorporate
several layers of scaling prior to estimation (as well as subsequent
back-scaling after estimation):

1) scaling of parameters with the same scale as the x axis
2) scaling of parameters with the same scale as the y axis
3) scaling of parameters in optim()


The effect of scaling is to temporarily convert the dataset (and the
model) to scales that are more convenient for the estimation procedure.
Any feedback on this would be much appreciated.

Therefore it should also not be necessary to manually do any scaling prior to 
using drm()
(like what you did). Compare, for instance, your specification of drm() to mine 
above.

Is this explanation useful?!

Christian

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] likelihood

2009-05-22 Thread Mark Bilton

I have a problem related to measuring likelihood between
-an observed presence absence dataset (containing 0 or 1)
-a predicted simulation matrix of the same dimensions (containing values from 0 
to 1)

This must be a common problem but I am struggling to find the answer in the 
literature.

Within the simulation model I have a parameter 'm' which I can alter to find 
the best fit (maximum likelihood). 

Currently I just use a 'sum of squares of differences' to measure
likelihood, i.e. likelihood = sum((obs - pred)^2). It is then very easy
to find (using numerical optimisation techniques) the value of 'm' which
gives my maximum likelihood (least sum of squares).

However, I do not think my likelihood function is the correct one to be
using for this purpose. Firstly, if sum of squares is the correct
method, maybe I should be taking the square root of the likelihood (it
makes no difference to the optimum), and possibly the 'mean' values of
the datasets may need to be included in my calculations.

However, a sum of squares suggests my data are normally distributed,
which they clearly are not:
Obs (boolean, 0 or 1)
Pred (beta, 0 to 1)
Difference (beta, -1 to 1)
My guess is that I should be using a beta (or uniform) defined
likelihood measure, or maybe just a simple transformation.
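
For what it's worth, a sketch of the Bernoulli log-likelihood for this
situation (not from the original post; obs is the 0/1 matrix, pred the
matrix of predicted probabilities):

loglik <- function(obs, pred, eps = 1e-12) {
    p <- pmin(pmax(pred, eps), 1 - eps)   # keep log() away from 0 and 1
    sum(obs * log(p) + (1 - obs) * log(1 - p))
}
## maximise this over 'm', e.g. with optimise(..., maximum = TRUE)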

Any help greatly appreciated

Mark




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Behavior of seq with vector from

2009-05-22 Thread baptiste auguie

Hi,

Perhaps you can try this,



seq.weave <- function(froms, by, length, ...) {
c(
matrix(c(sapply(froms, seq, by=by, length = length/2,  ...)),
nrow=length(froms), byrow=T)
)
}

seq.weave(c(2, 3), by=3, length=8)
seq.weave(c(2, 3, 4), by=2, length=8)



HTH,

baptiste

On 21 May 2009, at 21:08, Rowe, Brian Lee Yung (Portfolio Analytics)  
wrote:



Hello,

I want to use seq with multiple from values and am getting unexpected
(to me) behavior. I'm wondering if this behavior is intentional or  
not.



seq(2, by=3, length.out=4)

[1]  2  5  8 11


seq(3, by=3, length.out=4)

[1]  3  6  9 12

Now if I want the combined sequence, I thought I could pass in c(2,3),
and I get:

seq(c(2,3), by=3, length.out=8)

[1]  2  6  8 12 14 18 20 24

However, the result is not what I expected (i.e. what I wanted):
[1]  2  3  5  6  8  9 11 12

It seems that this is a consequence of vector recycling during the
summation in seq.default:
 if (missing(to)) from + (0L:(length.out - 1L)) * by

To get the value I want, I am using the following code:

sort(as.vector(apply(array(c(2,3)), 1, seq, by=3,length.out=4)))

[1]  2  3  5  6  8  9 11 12

So two questions:
1. Is seq designed/intended to be used with a vector from argument,  
and

is this the desired behavior?
2. If so, is there a cleaner way of implementing what I want?

Thanks,
Brian




--
This message w/attachments (message) may be privileged...{{dropped:26}}


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] System crash when using surv=T in cph

2009-05-22 Thread Uwe Ligges



maxb wrote:

Can someone help me. I am very new to R. I am fitting a Cox model using Frank
Harrell's cph as I want to produce a Nomogram. This is what I have done:

Srv <- Surv(time, cens)
f.cox <- cph(Srv ~ v1+v2+v3+v4, x=T, y=T, surv=T)


This is not reproducible for us. Where are the data?
What is cph? I presume the one from package Design?


As soon as I press enter, Windows XP crashes. If I remove surv=T, then it
works. I have R version 2.9.0.


Have you updated all your packages? Which versions of survival and 
Design are you using?

Is it really Windows that crashes, or just R?


See
http://cran.r-project.org/web/checks/check_results_Design.html
to learn that even the newest version of Design produces WARNINGs.
If you can send reproducible code to let R crash, please send it to the 
package maintainer of the involved packages.


Best,
Uwe Ligges




Is there a way of displaying the parameter estimates (ie beta coefficients)
and HR when I type 
anova(f.cox) as this only displays the Chi squared and p-values.

Any help or advise drawing a Nomogram will be appreciated.

Thanks in advance

Max


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Paste Strings as logical for functions?

2009-05-22 Thread Uwe Ligges



tsunhin wong wrote:

Dear R Users,

I have some dynamic selection rules that I want to pass around for my functions:


rules <- paste("g$TrialList==1 & g$Session==2")


I guess you do not want to paste() at all:

rules <- g$TrialList==1 & g$Session==2
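
If the rules really do need to travel as objects, one alternative (a
sketch, not from the original reply; it assumes g is a data frame) is an
unevaluated expression:

rules <- quote(TrialList == 1 & Session == 2)   # stored unevaluated
index <- which(eval(rules, g))                  # evaluated in g's columns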

Uwe Ligges



myfunction <- function(rules) {
  index <- which(rules)
  anotherFunction(index)
}


However, I can't find a way to pass around these selection rules
easily (for subset, for which, etc)
Please let me know if you have some idea.

Thank you very much!

- John

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] step by step debugger in R?

2009-05-22 Thread Uwe Ligges



Michael wrote:

Could anybody point me to the latest status of the most user-friendly
debugger in R?



I have always been happy with ?debug, ?recover and friends cited there,
although you can also find additional software like the debug package ...


Best,
Uwe Ligges



How I wish I didn't have to stare at my (long) code for ages and get stuck...

Thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] drc results differ for different versions

2009-05-22 Thread Hans Vermeiren
Yes, thanks that's very useful
Apart from checking the fit with nls() as you suggested, I've also used Prism, 
which gave the following results

Equation 1  
Best-fit values 
 BOTTOM 10.96
 TOP106.4
 LOGEC50-5.897
 HILLSLOPE  0.9501
 EC50   1.2670e-006
Std. Error  
 BOTTOM 2.196
 TOP9.337
 LOGEC500.1439
 HILLSLOPE  0.2270
95% Confidence Intervals
 BOTTOM 6.301 to 15.61
 TOP86.62 to 126.2
 LOGEC50-6.202 to -5.592
 HILLSLOPE  0.4689 to 1.431
 EC50   6.2750e-007 to 2.5560e-006
Goodness of Fit 
 Degrees of Freedom 16
 R² 0.9622
 Absolute Sum of Squares787.5
 Sy.x   7.015
Data
 Number of X values 20
 Number of Y replicates 1
 Total number of values 20
 Number of missing values   0

In other words: also in line with the drc 1.6-3 and nls() results.
As for the scaling: yes, this is useful, because I can't predict whether
concentrations are in molar, micromolar, ...; right now I indeed scaled
the dose values manually, and it's better/more robust when the drm
function takes care of that.
I suppose this also means I don't have to do the log transformation
anymore?
Thanks (both of you) for your swift feedback.

Hans

-Original Message-
From: Christian Ritz [mailto:r...@life.ku.dk] 
Sent: vrijdag 22 mei 2009 11:30
To: Hans Vermeiren
Cc: r-help@r-project.org; marc_schwa...@me.com
Subject: Re: [R] drc results differ for different versions

Hi Hans,

I hope I can resolve your problems below (Marc, thank you very much for cc'ing 
me on your
initial response!).

Have a look at the following R lines:


## Fitting the model using drm() (from the latest version)
m1 <- drm(response ~ dose, data = d, fct = LL.4())
summary(m1)
plot(m1)

## Checking the fit by using nls()
## (we have very good guesses for the parameter estimates)
m2 <- nls(response ~ c + (d - c)/(1 + (dose/e)^b), data = d,
          start = list(b = -0.95, c = 10, d = 106, e = 1.2745e-06))
summary(m2)


The standard errors agree quite well. The minor discrepancies between
the two fits are attributable to different numerical approximations of
the variance-covariance matrix used in drm() and nls().

So I would use the latest version of 'drc', especially for datasets with
really small doses. One recent change to drm() was to incorporate
several layers of scaling prior to estimation (as well as subsequent
back-scaling after estimation):

1) scaling of parameters with the same scale as the x axis
2) scaling of parameters with the same scale as the y axis
3) scaling of parameters in optim()


The effect of scaling is to temporarily convert the dataset (and the
model) to scales that are more convenient for the estimation procedure.
Any feedback on this would be much appreciated.

Therefore it should also not be necessary to manually do any scaling prior to 
using drm()
(like what you did). Compare, for instance, your specification of drm() to mine 
above.

Is this explanation useful?!

Christian


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Behavior of seq with vector from

2009-05-22 Thread Peter Dalgaard
Rowe, Brian Lee Yung (Portfolio Analytics) wrote:

 To get the value I want, I am using the following code:
 sort(as.vector(apply(array(c(2,3)), 1, seq, by=3,length.out=4)))
 [1]  2  3  5  6  8  9 11 12
 
 So two questions:
 1. Is seq designed/intended to be used with a vector from argument, and
 is this the desired behavior?
 2. If so, is there a cleaner way of implementing what I want?

1. Hmm, not really. NA.

2. I'd view it as an outer sum, stringed out to a single vector, hence:

> c(outer(c(2,3), seq(0,,3,4), "+"))
[1]  2  3  5  6  8  9 11 12


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] drc results differ for different versions

2009-05-22 Thread Christian Ritz
Yes, you're right: taking logarithms is no longer needed!

Christian

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] anova leads to an error

2009-05-22 Thread Skotara

Dear R-list,

the following code had been running well over the last months:

  exam <- matrix(rnorm(100, 0, 1), 10, 10)
  gg <- factor(c(rep("A", 5), rep("B", 5)))
  mlmfit <- lm(exam ~ 1); mlmfitG <- lm(exam ~ gg)
  result <- anova(mlmfitG, mlmfit, X = ~0, M = ~1)

Until, all of a sudden, the following error occurred:

Error in apply(abs(sapply(deltassd, function(X) diag(T %*% X %*% t(T)))),  :

 dim(X) must have a positive length

I have not kept track of the changes in my R-version, so it might have 
to do with that.

Now it is: R version 2.9.0 (2009-04-17).

Does anybody know more about this error? It would help me a lot!

Thank you very much!

Nils

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Changing a legend title to bold font

2009-05-22 Thread Steve Murray

Dear all,

I have seen a response from Duncan Murdoch which comes close to solving this 
one, but I can't quite seem to tailor it to fit my needs!

I am trying to make just the title in my legend as bold font, with the legend 
'items' in normal typeface.

I've tried setting par(font=2) external to the legend command, but this makes 
everything bold. I've also tried (in the legend code), font.main=2, but I 
didn't have any luck with this.
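
One approach that appears to work in base graphics (a sketch, not from
the eventual replies; the labels here are made up) is to pass the title
as a plotmath expression:

plot(1:10)
legend("topright", legend = c("first", "second"), lty = 1:2,
       title = expression(bold("My title")))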

Any suggestions would be most welcome,

Steve

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Cannot Install Cairo Library

2009-05-22 Thread Lorenzo Isella

Dear All,
I am running Debian testing on my box and I have R 2.9.0 installed from 
the standard repositories.

I downloaded the package source from
http://cran.r-project.org/web/packages/Cairo/index.html
but when I try to install it on my system, this is what I get

$ sudo R CMD INSTALL Cairo_1.4-4.tar.gz
* Installing to library ‘/usr/local/lib/R/site-library’
* Installing *source* package ‘Cairo’ ...
checking for gcc... gcc -std=gnu99
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc -std=gnu99 accepts -g... yes
checking for gcc -std=gnu99 option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -std=gnu99 -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/wait.h that is POSIX.1 compatible... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for string.h... (cached) yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for unistd.h... (cached) yes
checking for an ANSI C-conforming const... yes
checking for pkg-config... /usr/bin/pkg-config
checking whether pkg-config knows about cairo... yes
checking for configurable backends... cairo cairo-ft cairo-pdf cairo-png 
cairo-ps cairo-xlib cairo-xlib-xrender
configure: CAIRO_CFLAGS=-D_REENTRANT -I/usr/include/cairo 
-I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb 
-I/usr/include/libpng12

checking if R was compiled with the RConn patch... no
checking cairo.h usability... yes
checking cairo.h presence... yes
checking for cairo.h... yes
checking for PNG support in Cairo... yes
checking for ATS font support in Cairo... no
configure: CAIRO_LIBS=-lfreetype -lfontconfig -lpng12 -lz -lXrender 
-lcairo -lX11

checking for library containing deflate... none required
checking whether Cairo programs can be compiled... yes
checking whether cairo_image_surface_get_format is declared... no
checking for FreeType support in cairo... yes
checking whether FreeType needs additional flags... no
checking wheter libjpeg works... yes
checking wheter libtiff works... no
configure: creating ./config.status
config.status: creating src/Makevars
config.status: creating src/cconfig.h
** libs
gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo 
-I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb 
-I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c cairobem.c 
-o cairobem.o
gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo 
-I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb 
-I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c cairogd.c 
-o cairogd.o
gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo 
-I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb 
-I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c cairotalk.c 
-o cairotalk.o
gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo 
-I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb 
-I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c 
img-backend.c -o img-backend.o
gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo 
-I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb 
-I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c img-jpeg.c 
-o img-jpeg.o
gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo 
-I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb 
-I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c img-tiff.c 
-o img-tiff.o
gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo 
-I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb 
-I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c 
pdf-backend.c -o pdf-backend.o
gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo 
-I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb 
-I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c 
ps-backend.c -o ps-backend.o
gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo 
-I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb 
-I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c 
svg-backend.c -o svg-backend.o
gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo 
-I/usr/include/pixman-1 

Re: [R] Cannot Install Cairo Library

2009-05-22 Thread Dirk Eddelbuettel

On 22 May 2009 at 13:42, Lorenzo Isella wrote:
| Dear All,
| I am running Debian testing on my box and I have R 2.9.0 installed from 
| the standard repositories.
| I downloaded the package source from
| http://cran.r-project.org/web/packages/Cairo/index.html
| but when I try to install it on my system, this is what I get
| 
| $ sudo R CMD INSTALL Cairo_1.4-4.tar.gz
| * Installing to library ‘/usr/local/lib/R/site-library’
| * Installing *source* package ‘Cairo’ ...
| checking for gcc... gcc -std=gnu99
[...]
| gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo 
| -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb 
| -I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c 
| xlib-backend.c -o xlib-backend.o
| xlib-backend.c:34:74: error: X11/Intrinsic.h: No such file or directory
^

e...@ron:~$ dpkg -S `locate Intrinsic.h`
libxt-dev: /usr/include/X11/Intrinsic.h
e...@ron:~$

| xlib-backend.c: In function ‘Rcairo_init_xlib’:
| xlib-backend.c:158: warning: implicit declaration of function 
| ‘XrmUniqueQuark’
| make: *** [xlib-backend.o] Error 1
| ERROR: compilation failed for package ‘Cairo’
| * Removing ‘/usr/local/lib/R/site-library/Cairo’
| 
| Is there a header file missing? Or is there anything wrong with my system?
| Many thanks

Please run 'sudo apt-get install libxt-dev' and try again.

Dirk

-- 
Three out of two people have difficulties with fractions.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cannot Install Cairo Library

2009-05-22 Thread Lorenzo Isella
Dirk Eddelbuettel wrote:
 On 22 May 2009 at 13:42, Lorenzo Isella wrote:
 | Dear All,
 | I am running Debian testing on my box and I have R 2.9.0 installed from 
 | the standard repositories.
 | I downloaded the package source from
 | http://cran.r-project.org/web/packages/Cairo/index.html
 | but when I try to install it on my system, this is what I get
 | 
 | $ sudo R CMD INSTALL Cairo_1.4-4.tar.gz
 | * Installing to library ‘/usr/local/lib/R/site-library’
 | * Installing *source* package ‘Cairo’ ...
 | checking for gcc... gcc -std=gnu99
 [...]
 | gcc -std=gnu99 -I/usr/share/R/include -D_REENTRANT -I/usr/include/cairo 
 | -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/directfb 
 | -I/usr/include/libpng12 -I. -Iinclude -g -O2 -fpic -g -O2 -c 
 | xlib-backend.c -o xlib-backend.o
 | xlib-backend.c:34:74: error: X11/Intrinsic.h: No such file or directory
 ^

 e...@ron:~$ dpkg -S `locate Intrinsic.h`
 libxt-dev: /usr/include/X11/Intrinsic.h
 e...@ron:~$

 | xlib-backend.c: In function ‘Rcairo_init_xlib’:
 | xlib-backend.c:158: warning: implicit declaration of function 
 | ‘XrmUniqueQuark’
 | make: *** [xlib-backend.o] Error 1
 | ERROR: compilation failed for package ‘Cairo’
 | * Removing ‘/usr/local/lib/R/site-library/Cairo’
 | 
 | Is there a header file missing? Or is there anything wrong with my system?
 | Many thanks

 Please run 'sudo apt-get install libxt-dev' and try again.

 Dirk

   
Perfect! Now Cairo installs nicely.
Many thanks

Lorenzo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Reading iputs from adjacency matrix

2009-05-22 Thread pankaj borah
Dear all ,

I am just wondering if there is a way to read inputs from an adjacency
matrix. I am using the igraph module. I want to load the input directly
as an adjacency matrix instead of reading it as an edgelist or in pajek
format. Can anyone help?
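
For illustration, graph.adjacency() appears to be the relevant igraph
function -- a hedged sketch (the matrix is made up):

library(igraph)
m <- matrix(c(0, 1, 1,
              1, 0, 0,
              1, 0, 0), nrow = 3, byrow = TRUE)
g <- graph.adjacency(m, mode = "undirected")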

Thanks ,

Pankaj Barah

Mathematical Modelling & Computational Biology Group
Centre for Cellular & Molecular Biology
Uppal Road, Hyderabad 500 007, AP, India





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need a faster function to replace missing data

2009-05-22 Thread Dieter Menne
Tim Clark mudiver1200 at yahoo.com writes:

 
 I need some help in coming up with a function that will take two data sets,
determine if a value is missing in
 one, find a value in the second that was taken at about the same time, and
substitute the second value in for
 where the first should have been.  

This is the type of job I would do with a database, not R (alone). The
main advantage is that you have to do the cleanup job only once and can
retrieve the data in a rather well-documented way later (it's possible
with R, I know).

 Put the 5-minute data into one table. I would add two new columns
giving the delta to the next value, for easier linear interpolation, but
that's secondary. Make sure to index the table.

 Put the 1-second data into another table, adding values rounded to
5 seconds, and giving these an index.

From R/ODBC or with RSQLite, make a join of all values in Table 1
that have NULL values in the coordinates. If you do not want to
do a linear interpolation, you could even do this within the database
with SQL alone.
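
A hedged sketch of that join step (RSQLite; the table and column names
are made up for illustration):

library(RSQLite)
con <- dbConnect(SQLite(), dbname = "tracks.db")
missing <- dbGetQuery(con, "
    SELECT a.ts, b.lat, b.lon
    FROM   five_min a
    JOIN   one_sec  b ON a.ts_rounded = b.ts_rounded
    WHERE  a.lat IS NULL")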

 Compute the linear interpolation, and write the data back into
the database. If you want to be careful, you might mark the interpolated
values as computed in a separate field.

When at a later time new data come in, you can run the procedure again
without penalty.

Dieter

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Step by step: Making an R package with Fortran 95

2009-05-22 Thread sjnovick

To all.  I need your help.  The big question is:  How do I make an R library
with Fortran 95 files?  You may assume that I am a pretty decent programmer
in both R and Fortran.  I lay out a scenario and need your help!

I know how to make an ordinary R package and have dabbled with R + Fortran
95 *.dll linking.  I do not have a great handle on the whole Makevars file
and whatever else I might need to get things working to build a new R
package.  By the way, I'm using R 2.8.1 and Windows XP (and sometimes
Linux).


Let's take this simple situation.  Suppose I'm using Windows XP and place
the Folder for the package Testme in C:\Program
Files\R\R-2.8.1\src\library\Testme

Files and folders:

  DESCRIPTION file -- I understand this file
  NAMESPACE file:  This one has the contents
   useDynLib(Testme)
   exportPattern("^[^\\.]")

  man directory:  I put my *.Rd files here.  I understand this part.
  R directory:   I put my *.R files here.  One of these files calls upon a
Fortran 95 subroutine with
 .Fortran("MySubroutine", var1=as.double(x),
          var2=as.double(y), etc.)
 I understand this part.

  src directory:  I put my *.f95 files here.  Also, I put a Makevars file
here.

  Suppose that I have two *.f95 files, named  CalculatorModule.f95 and 
ANiceSubroutine.f95.
  
  CalculatorModule.f95 contains a Fortran 95 module.
  ANiceSubroutine.f95 uses the functions in
CalculatorModule.f95 with the USE command.  To be consistent with my R
file (above), ANiceSubroutine.f95 contains the subroutine MySubroutine.

Finally, I issue the command from a DOS shell in the directory C:\Program
Files\R\R-2.8.1\src\gnuwin32:

make pkg-Testme

This results in errors.



Here are the problems:
  1.  The order of compilation must be f95 -c CalculatorModule.f95
ANiceSubroutine.f95.  Unfortunately, R compiles them alphabetically.  So, I
was given errors making the *.o files.
   
   I overcame this problem by renaming the files  
a01.CalculatorModule.f95 and a02.ANiceSubroutine.f95.  I would rather not
have to name the files this way if I don't have to!  When I did this, R
compiled the Fortran files in the correct order, but I still have errors.

   2.  Here was the error output.  Help?

C:\Program Files\R\R-2.8.1\src\gnuwin32make pkg-Testme

-- Making package Testme 
  adding build stamp to DESCRIPTION
  installing NAMESPACE file and metadata
  making DLL ...
g95  -O3 -c a01.CalculatorMod.f95 -o a01.CalculatorMod.o
g95  -O3 -c a02.ANiceSubroutine.f95 -o a02.ANiceSubroutine.o
windres --preprocessor=gcc -E -xc -DRC_INVOKED -I
C:/PROGRA~1/R/R-28~1.1/inclu
de  -i Testme_res.rc -o Testme_res.o
g95 -shared -s  -o Testme.dll Testme.def a01.CalculatorMod.o
a02.ANiceSubroutine.o
Testme_res.o  -LC:/PROGRA~1/R/R-28~1.1/bin-lR
ld: Testme.def:6: syntax error
ld:Testme.def: file format not recognized; treating as linker script
ld:Testme.def:2: syntax error
make[3]: *** [Testme.dll] Error 1
make[2]: *** [srcDynlib] Error 2
make[1]: *** [all] Error 2
make: *** [pkg-Testme] Error 2



Thanks for helping!!! :)
-- 
View this message in context: 
http://www.nabble.com/Step-by-step%3A-Making-an-R-package-with-Fortran-95-tp23669355p23669355.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to compute the score-statistics from a logistic model using lrm {Design}

2009-05-22 Thread Christian Ametz

Greetings,

is there are way to simply compute the score-statistics for a logistic 
model generated with lrm?


For example, I want to compare the wald-statistics for a given model 
against the score-statistics in order to find the relevant predictors.


> attach(iris)
> model = lrm(Species ~ ., iris)
> anova(model)
                Wald Statistics          Response: Species

 Factor       Chi-Square d.f. P
 Sepal.Length 1.06       1    0.3032
 Sepal.Width  2.22       1    0.1358
 Petal.Length 3.96       1    0.0465
 Petal.Width  3.52       1    0.0605
 TOTAL        5.56       4    0.2346


So what I now basically want is an output of the score statistics, to
verify whether it leads to the same predictors as the Wald statistics
(anova) and therefore to estimate whether the score statistics would
produce a better model.



I hope I stated everything clearly.
Thanks for your help!

Christian

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confirmatory factor analysis problems using sem package (works in Amos)

2009-05-22 Thread John Fox
Dear Solomon,

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On
 Behalf Of S. Messing
 Sent: May-22-09 1:27 AM
 To: r-help@r-project.org
 Subject: [R] Confirmatory factor analysis problems using sem package
(works
 in Amos)
 
 
 Hello all,
 
 I'm trying to replicate a confirmatory factor analysis done in Amos. 

Occasionally in an ill-conditioned problem, one program will produce a
solution and another won't. As a general matter, I'd expect Amos to be more
robust than sem() since Amos is written specifically for SEMs, while sem()
uses nlm(), a general-purpose optimizer.

 The
 idea is to compare a one-factor and a two-factor model.  I get the
following
 warning message when I run either model:
 
 Could not compute QR decomposition of Hessian.
 Optimization probably did not converge.
 
 I have no idea what to do here.  

A general strategy is to set debug=TRUE in the call to sem() and see what
happens in the optimization. Then there are several things that you can do
to try to get the optimization to converge; see ?sem. In this case, however,
I wasn't able to get a solution.

The one-factor model is equivalent to a one-factor exploratory FA, which can
be fit by ML using factanal():

> factanal(factors=1, covmat=correl, n.obs=1100)

Call:
factanal(factors = 1, covmat = correl, n.obs = 1100)

Uniquenesses:
       pvote     jmposaff     jmnegaff     boposaff     bonegaff 
       0.100        0.496        0.497        0.277        0.397 
 obama.therm mccain.therm    oddcand.D   evencand.D 
       0.129        0.312        0.466        0.585 

Loadings:
 Factor1
pvote-0.949 
jmposaff  0.710 
jmnegaff -0.709 
boposaff -0.850 
bonegaff  0.777 
obama.therm  -0.934 
mccain.therm  0.830 
oddcand.D 0.731 
evencand.D0.645 

   Factor1
SS loadings  5.744
Proportion Var   0.638

Test of the hypothesis that 1 factor is sufficient.
The chi square statistic is 1710.03 on 27 degrees of freedom.
The p-value is 0

As you can see, the one-factor model fits the data very poorly (as does a
two-factor EFA); I suspect, but am not sure, that this is the source of the
problem in sem(). I couldn't get a solution from sem() even when I used the
factanal() solution as start values.


 I believe posters reported the same
 problem.  

In almost all cases, the models haven't been properly specified, which is
not the case here. Here, the model just doesn't fit the data.

 It seems that the ability to invert the correlation matrix may
 have something to do with this error, but solve(correl) yields a solution.

No, the input correlation matrix is positive-definite. sem() would have
complained if it were not:

> eigen(correl, only.values=TRUE)
$values
[1] 6.12561630 0.82418329 0.71616585 0.51263750 0.24467315 0.18248909
0.17024374
[8] 0.13905585 0.08493524


I'll keep your problem as a test case to see whether I can produce a
solution, possibly using a different optimizer -- as I mentioned, sem() uses
nlm().

Regards,
 John


 
 Here are my correlation matrix and model specifications:
 
 --- R CODE BEGIN
 
 
 library(sem)
 correl <- matrix(c(
    1.000,     -0.6657822,  0.6702089,  0.7997673, -0.7225454,  0.8992372, -0.8026491, -0.6715168, -0.5781565,
   -0.6657822,  1.000,     -0.5107568, -0.5030886,  0.6971188, -0.6306937,  0.7200848,  0.5121227,  0.4496810,
    0.6702089, -0.5107568,  1.000,      0.7276558, -0.3893792,  0.6043672, -0.7176532, -0.5247434, -0.4670362,
    0.7997673, -0.5030886,  0.7276558,  1.000,     -0.6251056,  0.8164190, -0.6728515, -0.6371453, -0.5531964,
   -0.7225454,  0.6971188, -0.3893792, -0.6251056,  1.000,     -0.7760765,  0.6175124,  0.5567924,  0.4914176,
    0.8992372, -0.6306937,  0.6043672,  0.8164190, -0.7760765,  1.000,     -0.7315507, -0.6625136, -0.5814590,
   -0.8026491,  0.7200848, -0.7176532, -0.6728515,  0.6175124, -0.7315507,  1.000,      0.5910688,  0.5096898,
   -0.6715168,  0.5121227, -0.5247434, -0.6371453,  0.5567924, -0.6625136,  0.5910688,  1.000,      0.8106496,
   -0.5781565,  0.4496810, -0.4670362, -0.5531964,  0.4914176, -0.5814590,  0.5096898,  0.8106496,  1.000),
   nrow = 9, ncol = 9)
 
 rownames(correl) = c("pvote", "jmposaff", "jmnegaff",
                      "boposaff", "bonegaff",
                      "obama.therm", "mccain.therm",
                      "oddcand.D", "evencand.D")

 colnames(correl) = c("pvote", "jmposaff", "jmnegaff",
                      "boposaff", "bonegaff",
                      "obama.therm", "mccain.therm",
                      "oddcand.D", "evencand.D")
 
 #One Factor Model:
 model.all <- specify.model()
 allmeasures -> pvote,         b1, NA
 allmeasures -> obama.therm,

Re: [R] survfit, summary, and survmean (was Changelog for survival package)

2009-05-22 Thread Terry Therneau

 Further I appreciate your new function survmean(). At the moment it 
 seems to be intended as internal, and not documented in the help. 

The computations done by print.survfit are now a part of the results returned 
by 
summary.survfit.  See  'table' in the output list of ?summary.survfit.  Both 
call an internal survmean() function to ensure that any future updates stay in 
synchrony.
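
A small sketch (using the aml data shipped with survival):

library(survival)
fit <- survfit(Surv(time, status) ~ x, data = aml)
summary(fit)$table   # the table print.survfit shows, now as an object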

This was a perennial (and justified) complaint with print.survfit.  Per
the standard, print(x) always returns x, so there was no way to get the
results of the print as an S object.

  Terry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] System crash when using surv=T in cph

2009-05-22 Thread Frank E Harrell Jr

maxb wrote:

Can someone help me. I am very new to R. I am fitting a Cox model using Frank
Harrell's cph as I want to produce a Nomogram. This is what I have done:

Srv <- Surv(time, cens)
f.cox <- cph(Srv ~ v1+v2+v3+v4, x=T, y=T, surv=T)

As soon as I press enter, Windows XP crashes. If I remove surv=T, then it
works. I have R version 2.9.0.

Is there a way of displaying the parameter estimates (ie beta coefficients)
and HR when I type 
anova(f.cox) as this only displays the Chi squared and p-values.

Any help or advise drawing a Nomogram will be appreciated.

Thanks in advance

Max  


Until Design is updated (which will be very soon),
source('http://biostat.mc.vanderbilt.edu/tmp/cphnew.s') after
library(Design).


anova is not supposed to display parameter estimates.  For one thing, 
there is often more than one parameter associated with a term in the 
model.  Use summary(fit) to get hazard ratios for sensible covariate ranges.


Frank
--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Class for time of day?

2009-05-22 Thread Stavros Macrakis
On Thu, May 21, 2009 at 8:28 PM, Gabor Grothendieck ggrothendi...@gmail.com
 wrote:

 It uses hours/minutes/seconds for values < 1 day and uses days and
 fractions of a day otherwise.


Yes, my examples were documenting this idiosyncrasy.


 For values and operations that it has not considered it falls back to
 the internal representation.


Yes, my examples were documenting this bad behavior.


 Most of your examples start to make sense once you realize this.


Of course I realize this.  That was the point of my examples.  I
understand perfectly well what is *causing* the bad behavior.  That doesn't
make it less bad.

What is the point of a class system if functions ignore the class and
perform meaningless calculations on the internal representation?

-s

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] postscript problems (landscape orientation)

2009-05-22 Thread Marc Schwartz

On May 21, 2009, at 6:31 PM, Ted Harding wrote:


On 21-May-09 23:02:28, David Scott wrote:

Well most people deal with that problem by not using Acrobat to
read .pdf files. On linux you can use evince or xpdf. On windows
just use gsview32. Those readers don't lock the .pdf.

I am with Peter and generally go straight to pdf these days. The only
reason for going through postscript is if you want to use psfrag.

David Scott


Going off at a tangent to the original query, I would say that
this is too limited a view! For one thing, PostScript (in its
Encasulated manifestation) is readily imported and re-scalable
by software which does not import PFF. Also, PS is an editable
plain-text file (even though there may be chunks in hexadecimal
for some graphics -- but it's still ASCII). One thing which I
quite often do is edit the %%BoundingBox:  line to improve the
framing of the graphic -- and viewing it in ghostscript with
watch mode on as one edits, one can easily adjust things to
a satisfactory visual effect.

If you know what you are doing, you can if you wish move things
around, or add or delete things (especially bits of text) by
using any plain-text editor on the PostScript file.

Finally (though this may be a symptom of serious masochism on
my part), if I download a PDF in which the author has plotted
the data, after I print to file in PostScript from Acrobat
Reader I can usually obtain a very close approximation to the
original data values by extracting the PS coordinates of the
plotted points (and axis tick-marks) from the PostScript file.

The only reason for going through postscript is if you want
to use psfrag -- or psnup and/or psbook or ...




PSTricks (http://www.tug.org/PSTricks/), which I use for creating
flow-chart types of figures, such as subject disposition charts in
clinical trials, in Sweave; I can then fill in text labels in the
various boxes using \Sexpr with counts, etc.


Examples of use here:

  http://www.tug.org/PSTricks/main.cgi?file=examples

On OS X I find that OS X's Preview works quite well in place of Adobe  
Reader, save for certain animations in PDF files. Also, for those  
still reading this thread and are on OSX, if you are not aware, there  
are additional plugins for QuickLook for EPS files and other such  
things:


  http://www.quicklookplugins.com/

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Step by step: Making an R package with Fortran 95

2009-05-22 Thread Duncan Murdoch

On 5/22/2009 8:18 AM, sjnovick wrote:

To all.  I need your help.  The big question is:  How do I make an R library
with Fortran 95 files?  You may assume that I am a pretty decent programmer
in both R and Fortran.  I lay out a scenario and need your help!

I know how to make an ordinary R package and have dabbled with R + Fortran
95 *.dll linking.  I do not have a great handle on the whole Makevars file
and whatever else I might need to get things working to build a new R
package.  By the way, I'm using R 2.8.1 and Windows XP (and sometimes
Linux).


Let's take this simple situation.  Suppose I'm using Windows XP and place
the Folder for the package Testme in C:\Program
Files\R\R-2.8.1\src\library\Testme


I'd put your own files in their own directory, rather than in the R 
source tree.  But this shouldn't have caused your error...



Files and folders:

  DESCRIPTION file -- I understand this file
  NAMESPACE file:  This one has the contents
   useDynLib(Testme)
   exportPattern("^[^\\.]")

  man directory:  I put my *.Rd files here.  I understand this part.
  R directory:   I put my *.R files here.  One of these files calls upon a
Fortran 95 subroutine with
 .Fortran("MySubroutine", var1=as.double(x),
var2=as.double(y), etc.)
 I understand this part.

  src directory:  I put my *.f95 files here.  Also, I put a Makevars file
here.


What is in Makevars?  You don't need this unless you're doing something 
special.  In your case, you might be, see below.




  Suppose that I have two *.f95 files, named  CalculatorModule.f95 and 
ANiceSubroutine.f95.
  
  CalculatorModule.f95 contains a Fortran 95 module.

  ANiceSubroutine.f95 uses the functions in
Calculator.Module.f95 with the USE command.  To be consistent with my R
file (above), ANiceSubroutine.f95 contains the subroutine: MySubroutine.

Finally:  I issue the command from a DOS Shell in the directory C:\Program
Files\R\R_2.8.1\src\gnuwin32

make pkg-Testme

This results in errors.



Here are the problems:
  1.  The order of compilation must be f95 -c CalculatorModule.f95
ANiceSubroutine.f95.  Unfortunately, R compiles them alphabetically.  So, I
was given errors making the *.o files.


You could put the dependency

ANiceSubroutine.o:  ANiceSubroutine.f95 CalculatorModule.o

into your Makevars file, but see the instructions in Writing R 
Extensions if you add targets there.


   
   I overcame this problem by renaming the files  
a01.CalculatorModule.f95 and a02.ANiceSubroutine.f95.  I would rather not

have to name the files this way if I don't have to!  When I did this, R
compiled the Fortran files in the correct order, but I still have errors.

   2.  Here was the error output.  Help?

C:\Program Files\R\R-2.8.1\src\gnuwin32make pkg-Testme

-- Making package Testme 
  adding build stamp to DESCRIPTION
  installing NAMESPACE file and metadata
  making DLL ...
g95  -O3 -c a01.CalculatorMod.f95 -o a01.CalculatorMod.o
g95  -O3 -c a02.ANiceSubroutine.f95 -o a02.ANiceSubroutine.o
windres --preprocessor=gcc -E -xc -DRC_INVOKED -I
C:/PROGRA~1/R/R-28~1.1/inclu
de  -i Testme_res.rc -o Testme_res.o
g95 -shared -s  -o Testme.dll Testme.def a01.CalculatorMod.o
a02.ANiceSubroutine.o
Testme_res.o  -LC:/PROGRA~1/R/R-28~1.1/bin-lR
ld: Testme.def:6: syntax error
ld:Testme.def: file format not recognized; treating as linker script
ld:Testme.def:2: syntax error
make[3]: *** [Testme.dll] Error 1
make[2]: *** [srcDynlib] Error 2
make[1]: *** [all] Error 2
make: *** [pkg-Testme] Error 2


Did you create the Testme.def file? What is in it?

Duncan Murdoch





Thanks for helping!!! :)


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Query regarding na.omit function

2009-05-22 Thread Moumita Das
Hi friends,
I have a query regarding the na.omit function. Please, someone help me.

I have a function
xyz_function <- function(arguments)
{
   some code
   return(list(matrix=dataset))
}

xyz_function_returnvalue <- xyz_function(passed arguments)

*Case-I*
xyz_function_returnvalue_deletingNArows <- na.omit((xyz_function_returnvalue))
*Case-II*
xyz_function_returnvalue_dataframe_deletingNArows <-
  na.omit(as.data.frame((pair_raw_data$matrix)))

Neither case I nor case II works. My dataset has rows with NA values,
that's for sure.



Whereas this simple code works fine. I mean, the rows containing NA
values can easily be deleted from the data frame.

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA))
DF <- na.omit(DF)
print(DF)

-- 
Thanks
Moumita

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Forcing a variableinto a model using stepAIC

2009-05-22 Thread Laura Bonnett
Dear All,

I am attempting to use forward and/or backward selection to determine
the best model for the variables I have.  Unfortunately, because I am
dealing with patients and every patient is receiving treatment I need
to force the variable for treatment into the model.  Is there a way to
do this using R?  (Additionally, the model is stratified by
randomisation period).  I know that SAS can be used to do this but my
SAS coding is poor and consequently it would be easier for me to use
R, especially given the fractional polynomial transformations!

Currently the model is as follows (without treatment).

coxfita=coxph(Surv(rem.Remtime,rem.Rcens)~sind(nearma)+fsh(nearma)+fdr(nearma)+th1(nearma)+th2(nearma)+fp(cage)+fp(fint)+fp(tsb)+strata(rpa),data=nearma)


Thank you for your help,

Laura



Re: [R] Query regarding na.omit function

2009-05-22 Thread jim holtman

We have no idea of the structure of 'dataset' that is being returned by your
function.  This example works fine for data frames:

> x <- list(matrix = data.frame(x = c(1, 2, 3), y = c(0, 10, NA)))
> x
$matrix
  x  y
1 1  0
2 2 10
3 3 NA
> na.omit(x$matrix)
  x  y
1 1  0
2 2 10
> na.omit(as.data.frame(x$matrix))
  x  y
1 1  0
2 2 10

You might also look at 'complete.cases', but a real example would be helpful
in providing information.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Class for time of day?

2009-05-22 Thread Gabor Grothendieck
You could create a subclass of times with its own print and format
methods that print and format hours/minutes/seconds even when the value
is greater than one day, if that is the main item you need.
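
A hedged sketch of that idea (every name below is hypothetical, not part
of chron): keep chron's numeric days-based representation, but format it
as h:m:s even past 24 hours.

library(chron)

# hypothetical 'tod' subclass of chron's 'times'
as.tod <- function(x) structure(as.numeric(x), class = c("tod", "times"))

format.tod <- function(x, ...) {
  secs <- as.integer(round(as.numeric(x) * 86400))   # days -> seconds
  sprintf("%02d:%02d:%02d",
          secs %/% 3600, (secs %% 3600) %/% 60, secs %% 60)
}

print.tod <- function(x, ...) { print(format(x)); invisible(x) }

t <- as.tod(times("12:00:00") + times("13:00:00"))
t   # "25:00:00" instead of 1.041667 days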

Regarding division you could contribute that to the chron package.
I've contributed a few missing items and they were incorporated.

Giving an error when it does not understand something would be
dangerous, as it could break much existing code, so that would
probably not be possible at this stage.

The idea of defaulting to internal representations is that you get
many features for free, since the way the internal representations
work gives the right answer in many cases.  It's best to stick with
the implicit philosophy that underlies a package.  If you want a
different philosophy then it's really tantamount to creating a new
package.  I don't think that one is right and the other wrong; they
simply represent different viewpoints.

On Fri, May 22, 2009 at 9:33 AM, Stavros Macrakis macra...@alum.mit.edu wrote:
 On Thu, May 21, 2009 at 8:28 PM, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:

 It uses hours/minutes/seconds for values < 1 day and uses days and
 fractions of a day otherwise.

 Yes, my examples were documenting this idiosyncrasy.


 For values and operations that it has not considered it falls back to
 the internal representation.

 Yes, my examples were documenting this bad behavior.


 Most of your examples start to make sense once you realize this.

 Of course I realize this.  That was the point of my examples.  I
 understand perfectly well what is *causing* the bad behavior.  That doesn't
 make it less bad.

 What is the point of a class system if functions ignore the class and
 perform meaningless calculations on the internal representation?

     -s





Re: [R] Forcing a variable into a model using stepAIC

2009-05-22 Thread Chuck Cleland

  See the scope argument to stepAIC in the MASS package.  You can
specify a formula in the 'lower' component of scope which includes the
treatment variable.  That will force the treatment variable to remain in
every model examined in the stepwise search.
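
A hedged sketch of this suggestion (the variable and data names below are
hypothetical stand-ins, not taken from Laura's model):

library(MASS)       # stepAIC
library(survival)   # coxph

# 'treat' and the strata term stay in every candidate model via 'lower'
full <- coxph(Surv(time, status) ~ treat + age + sex + strata(period),
              data = dat)
sel  <- stepAIC(full,
                scope = list(lower = ~ treat + strata(period),
                             upper = ~ treat + age + sex + strata(period)))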


-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894



Re: [R] Query regarding na.omit function

2009-05-22 Thread Moumita Das
Hi Jim,

xyz_function <- function(arguments)
{
   # some code

   pair_raw_data <- changeMissingValuestoNA(pair_raw_data$matrix,
                                            pair_raw_data$dim,
                                            missing_values)

   return(list(matrix = pair_raw_data))
}

The type of my dataset was a list.  I tried checking that by changing my
dataset to some other types.

Thanks
Moumita



-- 
Thanks
Moumita



[R] EM algorithm mixture of multivariate gaussian

2009-05-22 Thread daniele riggi
Hi, I would like to know if someone has ever written code to estimate the
parameters (mixing proportion, means, var/cov matrices) of a mixture of two
multivariate normal distributions.  I wrote it and it partly works (it can
find the means and the mixing proportion if I fix the var/cov matrices),
but if I leave the var/cov matrices free it doesn't work.  My suspicion is
that something goes wrong when the algorithm iterates on the var/cov
matrices, but I don't know how to solve this problem.
I enclose my script; if someone would give it a look, I would appreciate
it very much.
thank you
daniele

#
#Start Script
#

library(mvtnorm)
library(scatterplot3d)
library(MASS)
n <- 100

m1 <- c(5, -5)
m2 <- c(-3, 3)
s1 <- matrix(c(2, 1, 1, 3), 2, 2)
s2 <- matrix(c(4, 1, 1, 6), 2, 2)
alpha <- 0.3

c1 <- mvrnorm(round(n * alpha), m1, s1)
c2 <- mvrnorm(round(n * (1 - alpha)), m2, s2)
allval <- rbind(c1, c2)
x <- allval[sample(n, n), ]

# mixture density
mixm <- function(x, m1, m2, s1, s2, alpha)
{
  f1 <- dmvnorm(x, m1, s1, log = FALSE)
  f2 <- dmvnorm(x, m2, s2, log = FALSE)
  alpha * f1 + (1 - alpha) * f2
}

plot(x)

den <- mixm(x, m1, m2, s1, s2, alpha)
scatterplot3d(x[, 1], x[, 2], den, highlight.3d = TRUE, col.axis = "blue",
              col.grid = "lightblue", angle = 120, pch = 20)

em2mn <- function(y)
{
  n <- length(y[, 1])
  tau <- matrix(0, n, 2)
  eps <- 0.0001
  itermax <- 100          # was never set in the original script
  mu01 <- c(0, 0)
  mu02 <- c(0, 0)

  # 1. initialize the values
  alpha0 <- runif(1, 0, 1)
  for (j in 1:2) {
    mu01[j] <- runif(1, min = quantile(y[, j], probs = 0.25),
                     max = quantile(y[, j], probs = 0.75))
    mu02[j] <- runif(1, min = quantile(y[, j], probs = 0.25),
                     max = quantile(y[, j], probs = 0.75))
  }
  sd01 <- var(y[1:round(n / 2), ])
  sd02 <- var(y[round(n / 2):n, ])
  # sd01 <- s1
  # sd02 <- s2

  # first iteration
  f1 <- dmvnorm(y, mu01, sd01, log = FALSE)
  f2 <- dmvnorm(y, mu02, sd02, log = FALSE)
  p <- alpha0 * f1 + (1 - alpha0) * f2

  scatterplot3d(y[, 1], y[, 2], p, highlight.3d = TRUE, col.axis = "blue",
                col.grid = "lightblue",
                main = "initial values, multivariate normal mixture",
                angle = 120, pch = 20)

  # initial log-likelihood
  l1 <- sum(log(p))
  alpha <- alpha0
  mu1 <- mu01
  mu2 <- mu02
  sd1 <- sd01
  sd2 <- sd02

  for (iter in 1:itermax)
  {
    # E step
    for (i in 1:n) {
      tau[i, 1] <- (alpha * f1[i]) / p[i]
      tau[i, 2] <- ((1 - alpha) * f2[i]) / p[i]
    }

    # M step
    alpha <- mean(tau[, 1])
    mu1 <- colSums(tau[, 1] * y) / sum(tau[, 1])
    mu2 <- colSums(tau[, 2] * y) / sum(tau[, 2])
    # centre column-wise; plain y - mu1 would recycle mu1 down the rows
    ycen1 <- sweep(y, 2, mu1)
    ycen2 <- sweep(y, 2, mu2)

    cov1 <- matrix(0, 2, 2)
    cov2 <- matrix(0, 2, 2)
    for (i in 1:n) {
      cov1 <- cov1 + tau[i, 1] * (ycen1[i, ] %*% t(ycen1[i, ]))
      cov2 <- cov2 + tau[i, 2] * (ycen2[i, ] %*% t(ycen2[i, ]))
    }
    # w1 <- sqrt(tau[, 1])
    # w2 <- sqrt(tau[, 2])
    # ywei1 <- w1 * ycen1
    # ywei2 <- w2 * ycen2
    sd1 <- cov1 / sum(tau[, 1])
    sd2 <- cov2 / sum(tau[, 2])
    f1 <- dmvnorm(y, mu1, sd1, log = FALSE)
    f2 <- dmvnorm(y, mu2, sd2, log = FALSE)
    p <- alpha * f1 + (1 - alpha) * f2

    L <- sum(log(p))
    cat(iter, L, "\t", alpha, "\t", mu1, "\t", mu2, "\n")

    if (abs(l1 - L) < eps) break
    l1 <- L
  }

  denfin <- mixm(x, mu1, mu2, sd1, sd2, alpha)
  scatterplot3d(x[, 1], x[, 2], denfin, highlight.3d = TRUE,
                col.axis = "blue", col.grid = "lightblue",
                main = "final values density", angle = 120, pch = 20)

  list(alpha0 = alpha0, alpha = alpha, mu1 = mu1, mu2 = mu2,
       sd1 = sd1, sd2 = sd2)
}

em2mn(x)

##
end script
##



-- 
Dr. Daniele Riggi, PhD student
University of Milano-Bicocca
Department of Statistics
Building U7, Via Bicocca degli Arcimboldi, 8
20126 Milano, Italy
cell. +39 328 3380690
mailto: daniele.ri...@gmail.com



[R] bug in rpart?

2009-05-22 Thread Yuanyuan
Greetings,

I checked the Indian diabetes data again and get one tree for the data with
reordered columns and another tree for the original data.  I compared these
two trees: the split points are exactly the same, but the fitted classes
differ for some cases, and the misclassification errors differ too.  I know
how CART deals with ties --- even if we use the same data, the subjects
sent to the left and right need not be the same if we just rearrange the
order of the covariates.

But the problem is that the fitted trees are exactly the same in their
split points.  Shouldn't we get the same fitted values if the decisions are
the same at each step?  Why do identically structured trees put different
observations in the nodes?

The source code for running the diabetes data example and the output of
trees are attached. Your professional opinion is very much appreciated.

library(mlbench)
data(PimaIndiansDiabetes2)
mydata <- PimaIndiansDiabetes2
library(rpart)
fit2 <- rpart(diabetes ~ ., data = mydata, method = "class")
plot(fit2, uniform = TRUE, main = "CART for original data")
text(fit2, use.n = TRUE, cex = 0.6)
printcp(fit2)
table(predict(fit2, type = "class"), mydata$diabetes)
## misclassification table: rows are fitted class
      neg pos
  neg 437  68
  pos  63 200


pmydata <- data.frame(mydata[, c(1, 6, 3, 4, 5, 2, 7, 8, 9)])
fit3 <- rpart(diabetes ~ ., data = pmydata, method = "class")
plot(fit3, uniform = TRUE, main = "CART after exchanging mass & glucose")
text(fit3, use.n = TRUE, cex = 0.6)
printcp(fit3)
table(predict(fit3, type = "class"), pmydata$diabetes)
## after exchanging the order of BODY mass and PLASMA glucose
      neg pos
  neg 436  64
  pos  64 204


Best,

-- 
--
Yuanyuan Huang
Email: sunnyua...@gmail.com


ReorderedTree.pdf
Description: Adobe PDF document


OriginalTree.pdf
Description: Adobe PDF document


[R] Returning only a file path on Windows

2009-05-22 Thread amvds
I am choosing a file like this:

# Bring up file selection box
fn <- file.choose()
fp <- file.path(fn, fsep = '\\')

Unfortunately, the file path contains the short file name and extension as
well. I had hoped to get only the path so I could make my own long
filenames (for output graphs) by concatenation with this file path.

Of course I can split the string and assemble the components from the
returned list:

fp <- strsplit(fp, '\\', fixed = TRUE)


But there must be a better way?

Thanks in advance,
Alex van der Spek



Re: [R] Behavior of seq with vector from

2009-05-22 Thread Rowe, Brian Lee Yung (Portfolio Analytics)
So if I want to concatenate the output of multiple seq calls, there's no
clear way to do this?

For background, I have a number of data.frames with the same structure in a 
list. I want to 'collapse' the list into a single data.frame but only keeping 
certain columns from each underlying data.frame. In my example below, I want to 
keep columns 2,3 in each underlying data.frame.

I'm using do.call('cbind', my.list) and then using the statement below to 
extract only the columns I need (other details omitted for brevity). If there's 
a built-in or pre-built function to do this, I'm all eyes.


Brian

PS if this is unclear, flame away, and I'll post some code

-Original Message-
From: Peter Dalgaard [mailto:p.dalga...@biostat.ku.dk] 
Sent: Friday, May 22, 2009 6:20 AM
To: Rowe, Brian Lee Yung (Portfolio Analytics)
Cc: r-help@r-project.org
Subject: Re: [R] Behavior of seq with vector from


Rowe, Brian Lee Yung (Portfolio Analytics) wrote:

 To get the value I want, I am using the following code:
 sort(as.vector(apply(array(c(2,3)), 1, seq, by=3,length.out=4)))
 [1]  2  3  5  6  8  9 11 12
 
 So two questions:
 1. Is seq designed/intended to be used with a vector from argument, and
 is this the desired behavior?
 2. If so, is there a cleaner way of implementing what I want?

1. Hmm, not really. NA.

2. I'd view it as an outer sum, stringed out to a single vector, hence:

> c(outer(c(2,3), seq(0, , 3, 4), "+"))
[1]  2  3  5  6  8  9 11 12


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907





[R] fitting Autoregressive Conditional Duration and Cox Proportional Hazard model in R

2009-05-22 Thread Michael
Hi all,

Could anybody point me to some existing code in R for fitting
Autoregressive Conditional Duration and Cox proportional hazard model
and with model selection and model specification tests?

Thank you!



Re: [R] step by step debugger in R?

2009-05-22 Thread Michael
Really, I think that if there were a Visual Studio-strength debugger, our
collective time spent developing R code would be greatly reduced.

2009/5/22 Uwe Ligges lig...@statistik.tu-dortmund.de:


 Michael wrote:

 Could anybody point me to the latest status of the most user-friendly
 debugger in R?


 I always have been happy with ?debug, ?recover and friends cited there.
 Although you also find additional software like the debug package ..

 Best,
 Uwe Ligges


 How I wish I didn't have to stare at my (long) code for ages and get stuck...

 Thanks!




[R] axis(): disable prevention of overlapping tick labels for pgfSweave()

2009-05-22 Thread Gerhard Schön
 

Dear R Users,

I'm using pgfSweave() from R-Forge.
How can I disable the prevention of overlapping tick labels in axis()?
If I use axis() with Sweave("Test1", driver = pgfSweaveDriver), R
treats the LaTeX formatting as long strings and therefore tries not to
draw them overlapping.

Gerhard


\documentclass{article}
\usepackage{pgf}

\begin{document}

<<Test, echo = FALSE, fig = TRUE, pgf = TRUE>>=
plot(1:15, axes = FALSE, xlab = "")
axis(1, at = 1:2, tick = FALSE, line = 0, labels = 1:2)
axis(1, at = 1:7, tick = FALSE, line = 1, labels = 1:7)
axis(1, at = 1:2, tick = FALSE, line = 2,
     labels = paste("{\\textit{", 1:2, "}}", sep = ""))
axis(1, at = 1:7, tick = FALSE, line = 3,
     labels = paste("{\\textit{", 1:7, "}}", sep = ""))
@

\end{document}


-- 
Gerhard Schön
Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf
Martinistr. 52, building W34
Phone: +49(40) 7410 57458
Fax:   +49(40) 7410 57790
Web: www.uke.de/imbe



Re: [R] Class for time of day?

2009-05-22 Thread Stavros Macrakis
On Fri, May 22, 2009 at 10:03 AM, Gabor Grothendieck 
ggrothendi...@gmail.com wrote:

 Regarding division you could contribute that to the chron package.
 I've contributed a few missing items and they were incorporated.


Good to know.  Maybe I'll do that


 Giving an error when it does not understand something would be
 dangerous as it could break much existing code so that would
 probably not be possible at this stage.


But would it break any existing *correct* code?  I find it hard to imagine
any case where adding 1 hour of difftime to times("12:00:00") should return
1.5 days rather than "13:00:00".


 The idea of defaulting to internal representations is based on
 the idea that you get many features for free since the way the
 internal representations work gives the right answer in many
 cases.


I must admit I am rather shocked by this approach.  Getting something for
free is a bad bargain if what you get is nonsense.


 Its best to stick with the implicit philosophy that
 underlies a package.  If you want a different philosophy then
 its really tantamount to creating a new package.  I don't
 think that one is right and the other wrong but simply
 represent different viewpoints.


So you would defend the viewpoint that 1 hour is the same thing as 1 day?

 -s



Re: [R] anova leads to an error

2009-05-22 Thread Peter Dalgaard
Skotara wrote:
 Dear R-list,
 
 the following code had been running well over the last months:
 
  exam <- matrix(rnorm(100, 0, 1), 10, 10)
  gg <- factor(c(rep("A", 5), rep("B", 5)))
  mlmfit <- lm(exam ~ 1); mlmfitG <- lm(exam ~ gg)
  result <- anova(mlmfitG, mlmfit, X = ~0, M = ~1)
 
 Until, all of a sudden the following error occured:
 
 Error in apply(abs(sapply(deltassd, function(X) diag(T %*% X %*%
 t(T)))), 1, max) :
  dim(X) must have a positive length
 
 I have not kept track of the changes in my R-version, so it might have
 to do with that.
 Now it is: R version 2.9.0 (2009-04-17).
 
 Does anybody know more about this error? I would help me a lot!


Hmm, It didn't work in 2.8.1 either.

Anyways, the direct cause of the problem is that the construction

sapply(deltassd, function(X) diag(T %*% X %*% t(T)))

will return a vector, not matrix, if T is 1 x p. This happens inside

apply(abs(sapply(deltassd, function(X) diag(T %*% X %*% t(T)))), 1, max)

and obviously, apply() is confused. We need an as.matrix(), which we do
have at other points in the function.

For a quick workaround, use test = "Spherical".  It is really a univariate
problem, so all the tests are equivalent (in fact,
anova(lm(rowMeans(exam) ~ gg)) also works).
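
Spelled out against the example above, the workaround reads:

# avoids the code path that lacks the as.matrix() mentioned above
anova(mlmfitG, mlmfit, X = ~0, M = ~1, test = "Spherical")

# equivalent univariate check
anova(lm(rowMeans(exam) ~ gg))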

 
 Thank you very much!
 
 Nils
 


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



Re: [R] Returning only a file path on Windows

2009-05-22 Thread Duncan Murdoch

On 5/22/2009 10:45 AM, am...@xs4all.nl wrote:

I am choosing a file like this:

# Bring up file selection box
fn <- file.choose()
fp <- file.path(fn, fsep = '\\')


file.path() constructs a path from component parts, it doesn't extract 
the path from a filename.


You want dirname(fn).  You may also want to look at normalizePath, to 
convert short names to long ones, or shortPathName, for the reverse.
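
For instance (a small sketch; the output file name is made up):

fn  <- file.choose()
dir <- normalizePath(dirname(fn))     # directory part, short names expanded
out <- file.path(dir, "myplot.png")   # long output name built from the path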


Duncan Murdoch







Re: [R] Behavior of seq with vector from

2009-05-22 Thread Gabor Grothendieck
Try it this way:

# test list of data frames
L - list(anscombe[1:4], anscombe[5:8], anscombe[1:4], anscombe[5:8])

# get columns 2 and 3 from each component; cbind those together
do.call(cbind, lapply(L, "[", 2:3))






Re: [R] Returning only a file path on Windows

2009-05-22 Thread Uwe Ligges

Are you looking for choose.dir() ?
Or basename() and dirname()?

Uwe Ligges





Re: [R] step by step debugger in R?

2009-05-22 Thread Duncan Murdoch

On 5/22/2009 10:59 AM, Michael wrote:

Really I think if there is a Visual Studio strength debugger, our
collective time spent in developing R code will be greatly reduced.


If someone who knows how to write a debugger plugin for Eclipse wants to 
help, we could have that fairly easily.  All the infrastructure is 
there; it's the UI part that's missing.


Duncan Murdoch



2009/5/22 Uwe Ligges lig...@statistik.tu-dortmund.de:



Michael wrote:


Could anybody point me to the latest status of the most user-friendly
debugger in R?



I always have been happy with ?debug, ?recover and friends cited there.
Although you also find additional software like the debug package ..

Best,
Uwe Ligges



How I wish I don't have to stare at my (long) code for ages and stuck...

Thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] survfit, summary, and survmean (was Changelog for survival package)

2009-05-22 Thread Heinz Tuechler

Dear Terry,

sorry that I did not see this change, and thank you for it. It is very useful.

Heinz

At 15:28 22.05.2009, Terry Therneau wrote:


 Further I appreciate your new function survmean(). At the moment it
 seems to be intended as internal, and not documented in the help.

The computations done by print.survfit are now a part of the results
returned by summary.survfit.  See 'table' in the output list of
?summary.survfit.  Both call an internal survmean() function to ensure
that any future updates stay in synchrony.

This was a perennial (and justified) complaint with print.survfit.  Per
the standard, print(x) always returns x, so there was no way to get the
results of the print as an S object.

  Terry





Re: [R] Behavior of seq with vector from

2009-05-22 Thread Bert Gunter
## Is this what you mean??

do.call(rbind, lapply(yourlist, "[", , seq(from = 1, by = 3, length = 4)))

## note the ", ," that omits the row argument in the call to "["

-- Bert Gunter
Genentech Nonclinical Biostatistics



Re: [R] Behavior of seq with vector from

2009-05-22 Thread jim holtman
Yet another way of doing it:

> x <- c(2, 3)
> x + rep(seq(0, by = 3, length = 4), each = length(x))
[1]  2  3  5  6  8  9 11 12





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



[R] levelplot in combination with xyplot

2009-05-22 Thread Eelke Folmer
With levelplot I would like to combine fitted values and raw values in
one plot.  The surface should be based on the fitted values, and on top of
it I would like distinguishable points based on the raw data, with the
same colour scheme as the surface.


# raw data
x1 <- rep(1:10, times = 10)
x2 <- rep(1:10, each = 10)
y  <- 2 + (x1 + rnorm(100, mean = 0, sd = 4)) + (x2 + rnorm(100, mean = 0, sd = 2))
levelplot(y ~ x1 + x2)
# predicted data
fit <- lm(y ~ x1 + x2)
x1.mesh <- do.breaks(range(x1), 50)
x2.mesh <- do.breaks(range(x2), 50)
grid <- expand.grid(x1 = x1.mesh, x2 = x2.mesh)
grid[["fit"]] <- predict(fit, newdata = grid)
levelplot(fit ~ x1 + x2, data = grid)

So, instead of plotting both levelplots (like here), I want the first on
top of the second.  I tried a bunch of things (including panel.xyplot
within panel) but am now doubting whether it is possible.
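
One hedged possibility (a sketch against the code above): share one set of
breaks and one palette between the fitted surface and the raw points, and
draw the points inside a custom panel.

library(lattice)

brks <- do.breaks(range(c(y, grid$fit)), 50)
cols <- cm.colors(100)
levelplot(fit ~ x1 + x2, data = grid, at = brks, col.regions = cols,
          panel = function(...) {
            panel.levelplot(...)
            # raw points, filled with the same colour scale as the surface
            panel.points(x1, x2, pch = 21, col = "black",
                         fill = level.colors(y, at = brks, col.regions = cols))
          })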

Thanks in advance,
Eelke



Re: [R] Behavior of seq with vector from

2009-05-22 Thread Rowe, Brian Lee Yung (Portfolio Analytics)
Brilliant! Thanks, Brian



Re: [R] Axis Limits in Scatterplot3d

2009-05-22 Thread Uwe Ligges



Alan wrote:

Hi,

How do you obtain the limits of the plotting region in a scatterplot3d
plot?  par('usr') does not seem to give sensible values, and that
vector only has 4 elements (not the expected 6).



Well, I never designed anything to do that, but it is possible with the 
following function:



gets3dusr <- function(s3dobject){
  es3d <- environment(s3dobject$points3d)
  g3d <- function(name) get(name, envir = es3d)
  c(c(g3d("x.min"), g3d("x.max")) * g3d("x.scal"),
    c(0, g3d("y.max")) * g3d("y.scal") + g3d("y.add"),
    c(g3d("z.min"), g3d("z.max")) * g3d("z.scal"))
}


# Example:

s3d <- scatterplot3d(rnorm(5), rnorm(5), rnorm(5))
gets3dusr(s3d)


Uwe Ligges






Alan





[R] lattice strip argument check

2009-05-22 Thread Sebastien Bihorel

Dear R-Users,

Is there a way to check, within the following dummy function, whether the
strip argument differs from the default set in the function declaration?
FYI, this argument will be used downstream in an xyplot call in my
actual function.


dummyfunction <- function(..., strip = function(...)
  strip.default(..., strip.names = c(TRUE, TRUE))) {}
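
A hedged sketch of one way to check: missing() is TRUE when the caller did
not supply the argument, even when the argument has a default.

library(lattice)

dummyfunction <- function(..., strip = function(...)
  strip.default(..., strip.names = c(TRUE, TRUE))) {
  if (missing(strip)) {
    # caller left 'strip' at the default declared above
  }
  xyplot(..., strip = strip)   # pass through to the downstream call
}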


Thanks for your help

--
*Sebastien Bihorel, PharmD, PhD*
PKPD Scientist
Cognigen Corp
Email: sebastien.biho...@cognigencorp.com 

Phone: (716) 633-3463 ext. 323



Re: [R] Class for time of day?

2009-05-22 Thread Gabor Grothendieck

The way this might appear in code is if someone wanted to calculate the
number of one hour intervals in 18 hours.  One could write:

t18 <- times("18:00:00")
t1 <- times("1:00:00")
as.numeric(t18) / as.numeric(t1)

but since we all know that it uses internal representations unless
indicated otherwise, a typical code snippet might shorten it to:

as.numeric(t18 / t1)

and all such code would break if one were to make that generate an error.
(I think it would be OK if it generated a numeric automatically, and that
was the enhancement I had suggested to you.)

You can try to argue that the user should not code that way, but a huge
amount of chron code exists by now (note that chron pre-dates R).  If it
were a new package one could consider raising new errors, but at this
point in time it would be unwise.



Re: [R] good numerical optimization to use in R?

2009-05-22 Thread spencerg
 Have you tried the maxLik package?  If that is not adequate, you
can check the CRAN Task View Optimization and Mathematical Programming
(http://cran.fhcrc.org/web/views/Optimization.html), or try the new
RSiteSearch.function in the RSiteSearch package.  If none of these is
adequate, please provide another post with more detail about what you've
tried and why it does not seem adequate, preferably providing commented,
minimal, self-contained, reproducible code, as suggested in the posting
guide http://www.R-project.org/posting-guide.html.

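For what it's worth, base R's optim() already goes a long way for MLE 
fitting; a minimal sketch on made-up data:

set.seed(1)
x <- rnorm(100, mean = 2, sd = 3)
# negative log-likelihood; sd is parameterized on the log scale so it
# stays positive during the search
negll <- function(p) -sum(dnorm(x, mean = p[1], sd = exp(p[2]), log = TRUE))
fit <- optim(c(0, 0), negll, hessian = TRUE)
fit$par                          # estimates of mean and log(sd)
sqrt(diag(solve(fit$hessian)))   # asymptotic standard errors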


 Hope this helps. 
 Spencer


Michael wrote:

Hi all,

Could anybody point me to a good/robust numerical optimization program
to use in R?

I am doing some MLE fitting. Thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error in FUN with tapply and by

2009-05-22 Thread Thomas Levine
A subset of my raw data looks like this:

--
Grip, Technique, Baseline.integrated, Task, Stroke..direction., Engag,
Disen, PenDG, PenUG, PenDS, PenUS, Duration
Tripod,Barrel,Integrated,7,S70,230,510,270,510,781,1011,1011
Tripod,Barrel,Integrated,7,S71,na,na,na,na,na,na,na
Round,NonPrefHand,Baseline,0,S00,na,na,110,250,380,520,520
Round,NonPrefHand,Baseline,0,S01,na,na,220,360,460,620,620
--


I computed some values (times) from the raw data


---
t_p1=PenDG
t_c1=PenUG-PenDG
t_p2=PenDS-PenUG
t_c2=PenUS-PenDS
---


And I put those times in a data frame called times. For each of these
times, I want to subtract the average for Baseline trials from the average
for Integrated trials within the Grip and Technique factors. Call
these differences the true cost of mode selection.


 truecost <-
function(time){as.numeric(tapply(time,Baseline.integrated,mean,na.rm=T)[2]-tapply(time,Baseline.integrated,mean,na.rm=T)[1])}

To help explain what the truecost function does:
 tapply(t_p1,Baseline.integrated,mean,na.rm=T)
  Baseline Integrated
  212.8000   252.8402
 truecost(t_p1)
[1] 40.04021


Then I try to create a table of average truecost as a function of levels of
a factor. I think this is the same error with tapply and by.


 tapply(t_p1,list(Grip,Technique),truecost,na.rm=T)
Error in FUN(X[[1L]], ...) : unused argument(s) (na.rm = TRUE)
 by(times,list(Grip,Technique),truecost,na.rm=T)
Error in FUN(data[x, , drop = FALSE], ...) :
  unused argument(s) (na.rm = TRUE)


Any ideas?


Thomas Levine!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sciplot question

2009-05-22 Thread Jarle Bjørgeengen

Hi,

I would like to have lineplot.CI and barplot.CI actually plot  
confidence intervals, instead of standard errors.


I understand I have to use the ci.fun option, but I'm not quite sure  
how.


Like this :

  qt(0.975,df=n-1)*s/sqrt(n)

but how can I apply it to visualize the length of the Student's t  
confidence intervals rather than the standard error of the plotted  
means?

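A sketch of what that might look like -- assuming sciplot's ci.fun
expects a function returning the lower and upper limits for each group
(check against ?lineplot.CI; the data names below are placeholders):

ci.t <- function(x) {
  n <- length(na.omit(x))
  m <- mean(x, na.rm = TRUE)
  half <- qt(0.975, df = n - 1) * sd(x, na.rm = TRUE) / sqrt(n)
  c(m - half, m + half)   # 95% t-based limits
}
lineplot.CI(x.factor, response, data = mydata, ci.fun = ci.t)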

--
~~
Best regards   .~.
Jarle Bjørgeengen  /V\
Mob: +47 9155 7978// \\
http://www.uio.no/sok?person=jb  /(   )\
while(){if(s/^(.*\?)$/42 !/){print $1 $_}}^`~'^
~~

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] regrouping factor levels

2009-05-22 Thread ravi

Hi all,
I had some trouble in regrouping factor levels for a variable. After some 
experiments, I have figured out how I can recode to modify the factor levels. I 
would now like some help to understand why some methods work and others don't.

Here's my code :
rm(list=ls())
###some trials in recoding factor levels
char <- letters[1:10]
fac <- factor(char)
levels(fac)
print(fac)
## first method of recoding factors
fac1 <- fac
levels(fac1)[c("a","b","c")] <- "A"
levels(fac1)[c("d","e","f")] <- "B"
levels(fac1)[c("g","h","i","j")] <- "C"
levels(fac1)
print(fac1)
## second method
fac2 <- fac
levels(fac2)[c(1,2,3)] <- "A"
levels(fac2)[c(2,3,4)] <- "B" # not c(4,5,6)
levels(fac2)[c(3,4,5,6)] <- "C" # not c(7,8,9,10)
levels(fac2)
print(fac2)
## third method
fac3 <- fac
levels(fac3) <- list(A=c("a","b","c"), B=c("d","e","f"), C=c("g","h","i","j"))
levels(fac3)
print(fac3)

I first tried method 1 and had no luck with it at all. The levels A, B, and C 
just got added to the existing levels without affecting the fac variable.
After some time, I was able to figure out how I should use method 2.
After reading the help documentation, I arrived at method 3.

I would appreciate help in understanding why the first method does not work. In 
my application, I had long factor names and Tinn-R just would not 
accept statements running to several lines. Partial substitution was desirable 
then. Having spent a considerable amount of time on this, I would like to 
understand the underlying problem with method 1 as it is. The deeper 
understanding could be useful for me later. 
Thanking You,
Ravi 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Goodness of fit in quantile regression

2009-05-22 Thread Brian S Cade
Laura:  Part of the issue may depend on what you mean by goodness-of-fit. 
If you are looking for some global measure like a pseudo R-squared or AIC to 
select among models, you ought to be able to make those calculations off 
the objective function that was minimized as you recognized.  If 
qr.fit.sfn() does not return the objective function like rq.fit.br(), the 
simplex routine, you still ought to be able to do the calculations by 
performing the asymmetric weighting of the residuals from the model (see 
Roger Koenker's 2005 book).  Now, if by goodness-of-fit you mean how the 
model fits in local regions of the predictor space, then you might want to 
check out Stef van Buuren's work on worm plots to diagnose fit in quantile 
regression.  Don't remember where he published this but his email is 
stef.vanbuu...@tno.nl

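To make the asymmetric weighting concrete, here is a minimal sketch of 
the Koenker-Machado pseudo R1 using quantreg's rq() on the built-in 
stackloss data; the same idea applies to residuals from qr.fit.sfn:

library(quantreg)
rho <- function(u, tau) u * (tau - (u < 0))   # the quantile check function
tau <- 0.5
fit  <- rq(stack.loss ~ ., data = stackloss, tau = tau)
fit0 <- rq(stack.loss ~ 1, data = stackloss, tau = tau)  # intercept-only
V1 <- sum(rho(residuals(fit),  tau))
V0 <- sum(rho(residuals(fit0), tau))
1 - V1/V0                                     # pseudo R1
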
Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
laura m. mayorala...@gmail.com
To:
r-help@r-project.org
Date:
05/22/2009 03:29 AM
Subject:
[R]  Goodness of fit in quantile regression
Sent by:
r-help-boun...@r-project.org




Dear R users,


I've used the function qr.fit.sfn to estimate a quantile regression on a
panel data set. Now I would like to compute a statistic to measure the
goodness of fit of this model.

Does someone know how I could do that? I could compute a pseudo R2 but in
order to do that I would need the value of the objective function at the
optimum and I don't see how to get this from the function qr.fit.sfn.

If someone has any good idea about how to solve this problem I would be 
most
grateful!

Best

Laura
-- 
View this message in context: 
http://www.nabble.com/Goodness-of-fit-in-quantile-regression-tp23666962p23666962.html

Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error in FUN with tapply and by

2009-05-22 Thread jim holtman
Error message is self-explanatory: there is an unused parameter
'na.rm=TRUE'.  You are calling your function 'truecost' which only has a
single parameter 'time' and you are attempting to pass in 'na.rm=TRUE' which
it will not accept.  You don't need it.

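One way to get the per-cell differences without fighting tapply's argument
passing -- a sketch assuming t_p1 and the factors are full-length vectors
in your workspace:

# mean per Grip x Technique x Baseline.integrated cell
agg <- aggregate(t_p1,
                 list(Grip = Grip, Technique = Technique,
                      BI = Baseline.integrated),
                 mean, na.rm = TRUE)
# one row per Grip x Technique cell, with the two means side by side
wide <- reshape(agg, direction = "wide",
                idvar = c("Grip", "Technique"), timevar = "BI")
wide$truecost <- wide$x.Integrated - wide$x.Baseline
wide
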
On Fri, May 22, 2009 at 12:36 PM, Thomas Levine thomas.lev...@gmail.comwrote:

 A subset of my raw data looks like this:

 --
 Grip, Technique, Baseline.integrated, Task, Stroke..direction., Engag,
 Disen, PenDG, PenUG, PenDS, PenUS, Duration
 -
 Tripod,Barrel,Integrated,7,S70,230,510,270,510,781,1011,1011
 Tripod,Barrel,Integrated,7,S71,na,na,na,na,na,na,na
 Round,NonPrefHand,Baseline,0,S00,na,na,110,250,380,520,520
 Round,NonPrefHand,Baseline,0,S01,na,na,220,360,460,620,620
 --


 I computed some values (times) from the raw data


 ---
 t_p1=PenDG
 t_c1=PenUG-PenDG
 t_p2=PenDS-PenUG
 t_c2=PenUS-PenDS
 ---


 And I put those times in a data frame called times. For each of these
 times, I want to subtract the average for Baseline trials from the
 average
 for Integrated trials within the Grip and Technique factors. Call
 these differences the true cost of mode selection.


  truecost <-

 function(time){as.numeric(tapply(time,Baseline.integrated,mean,na.rm=T)[2]-tapply(time,Baseline.integrated,mean,na.rm=T)[1])}

 To help explain what the truecost function does:
  tapply(t_p1,Baseline.integrated,mean,na.rm=T)
  Baseline Integrated
  212.8000   252.8402
  truecost(t_p1)
 [1] 40.04021


 Then I try to create a table of average truecost as a function of levels of
 a factor. I think this is the same error with tapply and by.


  tapply(t_p1,list(Grip,Technique),truecost,na.rm=T)
 Error in FUN(X[[1L]], ...) : unused argument(s) (na.rm = TRUE)
  by(times,list(Grip,Technique),truecost,na.rm=T)
 Error in FUN(data[x, , drop = FALSE], ...) :
  unused argument(s) (na.rm = TRUE)


 Any ideas?


 Thomas Levine!

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need a faster function to replace missing data

2009-05-22 Thread William Dunlap
Here are 2 functions, which.just.above and which.just.below,
which may help you.  They will tell which element in a reference
dataset is the first just above (or just below) each element
in the main dataset (x).  They return NA if there is no reference
element above (or below) an element of x.  The strict argument
lets you say if the inequalities are strict or if equality is
acceptable.
They are vectorized so are pretty quick.

E.g.,
which.just.below(c(14,14.5), 11:15, strict=TRUE)
   [1] 3 4
which.just.above(c(14,14.5), 11:15, strict=FALSE)
   [1] 4 5
They should work with any class of data that order() and sort()
work on.  In particular, POSIXct times work.  The attached file
has a demonstration function called 'test' with some examples.

In your case the 'reference' data would be the times at which your
backup measurements were taken and the 'x' data would be the
times of the pings.  You can look at the elements of 'reference' just
before and just after each ping (or just the pings that are missing
locations) and decide how to combine the data from the bracketing
reference elements to impute a location for the ping.

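A sketch of that imputation step, with object and column names taken
from the thread:

lo <- which.just.below(myvscan$DateTime, mygarmin$DateTime, strict = FALSE)
hi <- which.just.above(myvscan$DateTime, mygarmin$DateTime, strict = TRUE)
miss <- is.na(myvscan$Latitude) & !is.na(lo)
# simplest choice: take the reference fix just before the ping;
# combining (e.g. interpolating between) lo and hi is also possible
myvscan$Latitude[miss] <- mygarmin$Latitude[lo[miss]]
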
Here are the functions, in case the attachment doesn't make it
through.  I'm sure some mailer will throw in some newlines so
it will be corrupted.

which.just.above <-
function(x, reference, strict = T)
{
        # output[k] will be index of smallest value in reference vector
        # larger than x[k].  If strict=F, replace 'larger than' by
        # 'larger than or equal to'.
        # We should allow NA's in x (but we don't). NA's in reference
        # should not be allowed.
        if(any(is.na(x)) || any(is.na(reference))) stop("NA's in input")
        if(strict)
                i <- c(rep(T, length(reference)),
                       rep(F, length(x)))[order(c(reference, x))]
        else i <- c(rep(F, length(x)),
                    rep(T, length(reference)))[order(c(x, reference))]
        i <- cumsum(i)[!i] + 1
        i[i > length(reference)] <- NA
        # i is length of x and has values in range 1:length(reference) or NA
        # following needed if reference is not sorted
        i <- order(reference)[i]
        # following needed if x is not sorted
        i[order(order(x))]
}

which.just.below <-
function(x, reference, strict = T)
{
        # output[k] will be index of largest value in reference vector
        # less than x[k].  If strict=F, replace 'less than' by
        # 'less than or equal to'.  Neither x nor reference need be
        # sorted, although they should not have NA's (in theory, NA's
        # in x are ok, but not in reference).
        if(any(is.na(x)) || any(is.na(reference))) stop("NA's in input")
        if(!strict)
                i <- c(rep(T, length(reference)),
                       rep(F, length(x)))[order(c(reference, x))]
        else i <- c(rep(F, length(x)),
                    rep(T, length(reference)))[order(c(x, reference))]
        i <- cumsum(i)[!i]
        i[i <= 0] <- NA
        # i is length of x and has values in range 1:length(reference) or NA
        # following needed if reference is not sorted
        i <- order(reference)[i]
        # following needed if x is not sorted
        i[order(order(x))]
}

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com  

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Tim Clark
 Sent: Thursday, May 21, 2009 9:45 PM
 To: r-help@r-project.org
 Subject: [R] Need a faster function to replace missing data
 
 
 Dear List,
 
 I need some help in coming up with a function that will take 
 two data sets, determine if a value is missing in one, find a 
 value in the second that was taken at about the same time, 
 and substitute the second value in for where the first should 
 have been.  My problem is from a fish tracking study.  We put 
 acoustic tags in fish and track them for several days.  
 Location data is supposed to be automatically recorded every 
 time we detect a ping from the fish.  Unfortunately the GPS 
 had some problems and sometimes the fishes depth was recorded 
 but not its location.  I fortunately had a back-up GPS that 
 was taking location data every five minutes.  I would like to 
 merge the two files, replacing the missing value in the vscan 
 (automatic) file with the location from the garmin file.  
 Since we were getting vscan records every 1-2 seconds and 
 garmin records every 5 minutes, I need to find the right 
 place in the vscan file to place the garmin record - i.e. the
  closest in time, but not greater than 5 minutes.  I have 
 written a function that does this. However, it works with my 
 test data but locks up my computer with my real data.  I have 
 several million vscan records and several thousand garmin 
 records.  Is there a better way to do this?
 
 
 My function and test data:
 
 myvscan <- data.frame(c(1,NA,1.5),
   times(c("12:00:00","12:14:00","12:20:00")))
 

Re: [R] Need a faster function to replace missing data

2009-05-22 Thread jim holtman
I think this does what you want.  It uses 'findInterval' to determine where
a possible match is:


myvscan <- data.frame(c(1,NA,1.5),
    as.POSIXct(c("12:00:00","12:14:00","12:20:00"), format="%H:%M:%S"))
 # convert to numeric
 names(myvscan) <- c("Latitude","DateTime")
 myvscan$tn <- as.numeric(myvscan$DateTime)  # numeric for findInterval

mygarmin <- data.frame(c(20,30,40),
    as.POSIXct(c("12:00:00","12:10:00","12:15:00"), format="%H:%M:%S"))
 names(mygarmin) <- c("Latitude","DateTime")
 mygarmin$tn <- as.numeric(mygarmin$DateTime)

 # use 'findInterval'
 na.indx <- which(is.na(myvscan$Latitude))  # find NAs
 # replace with garmin latitude
 myvscan$Latitude[na.indx] <-
    mygarmin$Latitude[findInterval(myvscan$tn[na.indx], mygarmin$tn)]


 myvscan
  LatitudeDateTime tn
1  1.0 2009-05-22 12:00:00 1243008000
2 30.0 2009-05-22 12:14:00 1243008840
3  1.5 2009-05-22 12:20:00 1243009200


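If you also want to enforce Tim's five-minute window, a possible
refinement (untested sketch on the same objects):

idx <- findInterval(myvscan$tn[na.indx], mygarmin$tn)
ok  <- idx > 0 & (myvscan$tn[na.indx] - mygarmin$tn[idx]) <= 300  # 5 min in seconds
myvscan$Latitude[na.indx[ok]] <- mygarmin$Latitude[idx[ok]]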

On Fri, May 22, 2009 at 12:45 AM, Tim Clark mudiver1...@yahoo.com wrote:


 Dear List,

 I need some help in coming up with a function that will take two data sets,
 determine if a value is missing in one, find a value in the second that was
 taken at about the same time, and substitute the second value in for where
 the first should have been.  My problem is from a fish tracking study.  We
 put acoustic tags in fish and track them for several days.  Location data is
 supposed to be automatically recorded every time we detect a ping from the
 fish.  Unfortunately the GPS had some problems and sometimes the fishes
 depth was recorded but not its location.  I fortunately had a back-up GPS
 that was taking location data every five minutes.  I would like to merge the
 two files, replacing the missing value in the vscan (automatic) file with
 the location from the garmin file.  Since we were getting vscan records
 every 1-2 seconds and garmin records every 5 minutes, I need to find the
 right place in the vscan file to place the garmin record - i.e. the
  closest in time, but not greater than 5 minutes.  I have written a
 function that does this. However, it works with my test data but locks up my
 computer with my real data.  I have several million vscan records and
 several thousand garmin records.  Is there a better way to do this?


 My function and test data:

 myvscan <- data.frame(c(1,NA,1.5), times(c("12:00:00","12:14:00","12:20:00")))
 names(myvscan) <- c("Latitude","DateTime")
 mygarmin <- data.frame(c(20,30,40), times(c("12:00:00","12:10:00","12:15:00")))
 names(mygarmin) <- c("Latitude","DateTime")

 minute.diff <- 1/24/12   # Time diff is in days, so this is 5 minutes
 for (k in 1:nrow(myvscan))
 {
 if (is.na(myvscan$Latitude[k]))
 {
 if ((min(abs(mygarmin$DateTime - myvscan$DateTime[k]))) < minute.diff )
 {
 index.min.date <- which.min(abs(mygarmin$DateTime - myvscan$DateTime[k]))
 myvscan$Latitude[k] <- mygarmin$Latitude[index.min.date]
 }}}

 I appreciate your help and advice.

 Aloha,

 Tim




 Tim Clark
 Department of Zoology
 University of Hawaii

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] step by step debugger in R?

2009-05-22 Thread Romain Francois

Duncan Murdoch wrote:

On 5/22/2009 10:59 AM, Michael wrote:

Really I think if there is a Visual Studio strength debugger, our
collective time spent in developing R code will be greatly reduced.


If someone who knows how to write a debugger plugin for Eclipse wants 
to help, we could have that fairly easily.  All the infrastructure is 
there; it's the UI part that's missing.


Duncan Murdoch
[I've copied Mark Bravington and Robert Gentleman to the list as they 
are likely to have views here, and I am not sure they monitor R-help]


Hello,

Making a front-end to debugging was one of the proposed google summer of 
code for this year [1], it was not retained eventually, but I am still 
motivated.


Pretty much all infrastructure is there, and some work has been done 
__very recently__ in R's debugging internals (ability to step up). As I 
see it, the ability to call some sort of hook each time the debugger 
waits for input would make it much easier for someone to write 
front-ends. A recent post of mine (patch included) [2] on R-devel 
suggested a custom prompt for browser which would do the trick, but I 
now think that a hook would be more appropriate. Without something 
similar to that, there is no way that I know of for making a front-end, 
unless maybe if you embed R ... (please let me know how I am wrong)


There is also the debug package [3,4] which does __not__ work with R 
internals but rather works with instrumenting tricks at the R level. 
debug provides a tcl/tk front-end. It is my understanding that it does 
not work using R internals (do_browser, ...) because it was not possible 
at the time, and I believe this is still not possible today, but I might 
be wrong. I'd prefer to be wrong actually.


Romain

[1] : http://www.r-project.org/soc09/ideas.html#p5
[2] : https://stat.ethz.ch/pipermail/r-devel/2009-April/053128.html
[3] : http://cran.r-project.org/web/packages/debug/index.html
[4] : http://cran.r-project.org/doc/Rnews/Rnews_2003-3.pdf

--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bug in rpart?

2009-05-22 Thread Uwe Ligges



Yuanyuan wrote:

Greetings,

I checked the Indian diabetes data again and get one tree for the data with
reordered columns and another tree for the original data. I compared these
two trees, the split points for these two trees are exactly the same but the
fitted classes are not the same for some cases. And the misclassification
errors are different too. I know how CART deals with ties --- even when we are
using the same data, the subjects to the left and right would not be the
same if we just rearrange the order of covariates.

But the problem is, the fitted trees are exactly the same on the split
points. Shouldn't we get the same fitted values if the decisions are the
same at each step? Why do identically structured trees have different
observations on the nodes?



Because they may use different surrogate variables. Note that your data 
contain missing values that are handled by surrogates.

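To see which surrogates were actually used, the standard rpart tools
should help:

summary(fit2)   # lists primary and surrogate splits at each node
fit2$splits     # the raw split/surrogate table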

Best,
Uwe Ligges






The source code for running the diabetes data example and the output of
trees are attached. Your professional opinion is very much appreciated.

library(mlbench)
data(PimaIndiansDiabetes2)
mydata <- PimaIndiansDiabetes2
library(rpart)
fit2 <- rpart(diabetes ~ ., data = mydata, method = "class")
plot(fit2, uniform = T, main = "CART for original data")
text(fit2, use.n = T, cex = 0.6)
printcp(fit2)
table(predict(fit2, type = "class"), mydata$diabetes)
## misclassification table: rows are fitted class
  neg pos
  neg 437  68
  pos  63 200


pmydata <- data.frame(mydata[, c(1,6,3,4,5,2,7,8,9)])
fit3 <- rpart(diabetes ~ ., data = pmydata, method = "class")
plot(fit3, uniform = T, main = "CART after exchanging mass & glucose")
text(fit3, use.n = T, cex = 0.6)
printcp(fit3)
table(predict(fit3, type = "class"), pmydata$diabetes)
## after exchanging the order of BODY mass and PLASMA glucose
  neg pos
  neg 436  64
  pos  64 204


Best,





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Class for time of day?

2009-05-22 Thread Stavros Macrakis
On Fri, May 22, 2009 at 12:28 PM, Gabor Grothendieck 
ggrothendi...@gmail.com wrote:

...The way this might appear in code is if someone wanted to calculate the
 number of one hour intervals in 18 hours.  One could write:

 t18 <- times("18:00:00")
 t1 <- times("1:00:00")
 as.numeric(t18) / as.numeric(t1)

 but since we all know that it uses internal representations unless it
 indicates otherwise


Um, yes, I suppose that was the attitude in the 60's and 70's, but I think
we have moved on from there.  cf.
http://en.wikipedia.org/wiki/Data_abstraction


 a typical code snippet might shorten it to:

 as.numeric(t18 / t1)

 and all such code would break if one were to cause that to generate an
 error.


(18/24 day)/(1/24 day) is the perfectly meaningful dimensionless number 18,
so this code should not break with a correct implementation of '/'.  (cf.
http://en.wikipedia.org/wiki/Dimensional_analysis).  Alas, chron gives the
nonsense result of 18 days.

-s

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How can I estimate a Box-Cox function with R?

2009-05-22 Thread Ikerne del Valle



	Thanks Gregory. I see that with 
boxcox.lm() the optimal lambda is obtained and 
plotted against log-likelihood.


library(MASS)
boxcox(Volume ~ log(Height) + log(Girth), data = trees,
   lambda = seq(-0.25, 0.25, length = 10))

But how can I see the fit of the same linear 
model with the optimal Box-Cox transformation, the 
standard error for lambda, etc.?

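A sketch of one way to do that with MASS::boxcox on the trees example 
above: pick the lambda maximizing the profile likelihood and refit (the 
dashed lines on the boxcox plot give an approximate confidence interval 
for lambda):

bc <- boxcox(Volume ~ log(Height) + log(Girth), data = trees,
             lambda = seq(-0.25, 0.25, length = 100), plotit = FALSE)
lambda.hat <- bc$x[which.max(bc$y)]
fit <- lm((Volume^lambda.hat - 1)/lambda.hat ~ log(Height) + log(Girth),
          data = trees)
summary(fit)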

Best. Ikerne.



Have you looked at the boxcox function in the 
MASS package?  That may do what you want.


--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111



 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Ikerne del Valle
 Sent: Thursday, May 21, 2009 4:29 AM
 To: fernando.tus...@ehu.es

  Cc: r-help@r-project.org

 Subject: [R] How can I estimate a Box-Cox function with R?


Dear Fernando and all:

Thanks for your help. Now it works. This is
 a training example to learn how to estimate a
 Box-Cox (right and/or left side transformations)
 with R (as LIMDEP does) in order to compare these
 estimations with the ones derived by applying
 NLS, once the dependent variable has been divided
 by its geometric mean (see below), as suggested by
 Zarembka (1974) and Spitzer (1984). However, the
 example of the demand for money seems not to work.
 Any idea how to handle the error messages or how to
 estimate a Box-Cox function with R?

Best regards,
Ikerne

 library(nlrwr)
 r <- c(4.50,4.19,5.16,5.87,5.95,4.88,4.50,6.44,7.83,6.25,5.50,5.46,7.46,
   10.28,11.77,13.42,11.02,8.50,8.80,7.69)
 Lr <- log(r)
 M <- c(480.00,524.30,566.30,589.50,628.20,712.80,805.20,861.00,908.40,
   1023.10,1163.60,1286.60,1388.90,1497.90,1631.40,1794.40,1954.90,2188.80,
   2371.70,2563.60)
 LM <- log(M)
 Y <- c(2208.30,2271.40,2365.60,2423.30,2416.20,2484.80,2608.50,2744.10,
   2729.30,2695.00,2826.70,2958.60,3115.20,3192.40,3187.10,3248.80,3166.00,
   3277.70,3492.00,3573.50)
 LY <- log(Y)
 gmM <- exp((1/20)*sum(LM))
 GM <- M/gmM
 Gr <- r/gmM
 GY <- Y/gmM
 money <- data.frame(r,M,Y,Lr,LM,LY,GM,Gr,GY)
 attach(money)
 ols1 <- lm(GM ~ r + Y)
 output1 <- summary(ols1)
 coef1 <- ols1$coefficients
 a1 <- coef1[[1]]
 b11 <- coef1[[2]]
 b21 <- coef1[[3]]
 ols2 <- lm(GM ~ Gr + GY)
 output2 <- summary(ols2)
 coef2 <- ols2$coefficients
 a2 <- coef2[[1]]
 b12 <- coef2[[2]]
 b22 <- coef2[[3]]
 money.m1 <- nls(GM ~ a + b*r^g + c*Y^g, data = money,
   start = list(a = a1, b = b11, g = 1, c = b21))
 money.m2 <- nls(GM ~ a + b*Gr^g + c*GY^g, data = money,
   start = list(a = a2, b = b12, g = 1, c = b22))


Ikerne del Valle Erkiaga
Department of Applied Economics V
Faculty of Economic and Business Sciences
University of the Basque Country
Avda. Lehendakari Agirre, Nº 83
48015 Bilbao (Bizkaia) Spain

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Class for time of day?

2009-05-22 Thread Gabor Grothendieck
On Fri, May 22, 2009 at 1:55 PM, Stavros Macrakis macra...@alum.mit.edu wrote:
 On Fri, May 22, 2009 at 12:28 PM, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:

 ...The way this might appear in code is if someone wanted to calculate the
 number of one hour intervals in 18 hours.  One could write:

 t18 <- times("18:00:00")
 t1 <- times("1:00:00")
 as.numeric(t18) / as.numeric(t1)

 but since we all know that it uses internal representations unless it
 indicates otherwise

 Um, yes, I suppose that was the attitude in the 60's and 70's, but I think
 we have moved on from there.  cf.
 http://en.wikipedia.org/wiki/Data_abstraction


 a typical code snippet might shorten it to:

 as.numeric(t18 / t1)

 and all such code would break if one were to cause that to generate an
 error.

 (18/24 day)/(1/24 day) is the perfectly meaningful dimensionless number 18,
 so this code should not break with a correct implementation of '/'.  (cf.
 http://en.wikipedia.org/wiki/Dimensional_analysis).  Alas, chron gives the
 nonsense result of 18 days.

Your point was that otherwise undefined operations should produce an
error and I was illustrating why that could not be introduced at this stage.
I had already suggested that you implement division if you found it important
and that was not the source of any disagreement.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Naming a random effect in lmer

2009-05-22 Thread spencerg
 The first example on the lmer help page uses the formula 
Reaction ~ Days + (Days|Subject).  Here, Subject is the name of a 
column in the data.frame sleepstudy, with levels 308, 309, ... . 



 Does this answer your question?  If not, please provide commented, 
minimal, self-contained, reproducible code, as requested in the posting 
guide http://www.R-project.org/posting-guide.html.  Your example is 
neither self-contained nor reproducible. 



 Hope this helps. 
 Spencer



Leigh Ann Starcevich wrote:

Dear guRus:

I am using lmer for a mixed model that includes a random intercept for a 
set of effects that have the same distribution, Normal(0, sig2b).  This set 
of effects is of variable size, so I am using an as.formula statement to 
create the formula for lmer.  For example, if the set of random effects has 
dimension 8, then the lmer call is:


Zs <- paste("Z", 1:mb, sep = "")
Trendformula <- as.formula(paste("LogY ~ WYear + (1+WYear|Site) + (1|",
    paste(paste(Zs, collapse = "+"), ")")))

fit2.4a <- lmer(Trendformula, data = testsamp)

which, for mb=8, expands to:

fit1 <- lmer(LogY ~ WYear + (1 | Site) + (1 | Year) + (1 | Z1+ Z2 + Z3 + Z4 + 
Z5 + Z6 + Z7 + Z8), data = testsamp)



I have no problems with this.  However, if the set of random effects has a 
dimension of 30, then the lmer call is:


fit2 <- lmer(LogY ~ WYear + (1 | Site) + (1 | Year) + (1 | Z1+Z2 + Z3 + Z4 + 
Z5 + Z6 + Z7 + Z8 + Z9 + Z10 + Z11 + Z12 + Z13 + Z14 + Z15 +  Z16 + Z17 + 
Z18 + Z19 + Z20 + Z21 + Z22 + Z23 + Z24 + Z25 + Z26 + Z27 + Z28 + Z29+ 
Z30), data = testsamp)


In this case, I get an error because the name Z1+Z2 + Z3 + Z4 + Z5 + Z6 + 
Z7 + Z8 + Z9 + Z10 + Z11 + Z12 + Z13 + Z14 + Z15 +  Z16 + Z17 + Z18 + Z19 + 
Z20 + Z21 + Z22 + Z23 + Z24 + Z25 + Z26 + Z27 + Z28 + Z29+ Z30 is too long 
to print in the output.   Is there any way to name the random effect in 
lmer so that the shorter (and more descriptive) name may be used and the 
error avoided?   Or is there a way to combine these into a single variable 
prior to the lmer function call?  In SAS, I am able to parameterize these 
as a Toeplitz structure with bandwidth 1.


Thanks for any help.

Leigh Ann Starcevich
Doctoral student
Oregon State University
Corvallis, Oregon
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Object not found

2009-05-22 Thread Gaertner, Stefanie
Hello,

I ran into a problem:

 ftable(table(Fire, Standard, StoAll), col.vars = c("Fire", "Standard"))
Error in table(Fire, Standard, StoAll) : object 'Fire' not found

I do not understand that because when I read the table everything seems
correct.


Stocking_all <- read.table("P://Benchmark//analysis//r//stocking_10//stocking_all.csv",
    header = TRUE, sep = ",")
 names(Stocking_all)
 [1] ID_basic_tallesttree FID_plot Fire

 [4] Time_FireStandard Fire_Standard

 [7] ESR  Fire_Stand   StoARS2000

[10] StoWAS2008   StoAll 

And I also get a summary for the variables.  I have no clue and would
very much appreciate your help.
Stefanie



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Object not found

2009-05-22 Thread jim holtman
You read them into a data frame, so you need to do

ftable(table(Fire = Stocking_all$Fire, Standard = Stocking_all$Standard,
    StoAll = Stocking_all$StoAll), col.vars = c("Fire", "Standard"))

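Or, equivalently, evaluate inside the data frame with with(), so the
bare names work and the dimnames keep the labels that col.vars refers to
(a sketch):

with(Stocking_all,
     ftable(table(Fire, Standard, StoAll),
            col.vars = c("Fire", "Standard")))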

On Fri, May 22, 2009 at 2:33 PM, Gaertner, Stefanie 
stefanie.gaert...@ales.ualberta.ca wrote:

 Hello,

 I run into a problem:

  ftable(table(Fire, Standard, StoAll), col.vars = c("Fire", "Standard"))
 Error in table(Fire, Standard, StoAll) : object 'Fire' not found

 I do not understand that because when I read the table everything seems
 correct.

 
  Stocking_all <- read.table("P://Benchmark//analysis//r//stocking_10//stocking_all.csv",
      header = TRUE, sep = ",")
  names(Stocking_all)
  [1] ID_basic_tallesttree FID_plot Fire

  [4] Time_FireStandard Fire_Standard

  [7] ESR  Fire_Stand   StoARS2000

 [10] StoWAS2008   StoAll

 And I also get a summary for the variables.  I have no clue and would
 very much appreciate your help.
 Stefanie



[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error in FUN with tapply and by

2009-05-22 Thread Thomas Levine
 str(time)
function (x, ...)
 str(t_p1)
 num [1:576] 190 180 190 200 210 200 220 190 230 230 ...
 str(Baseline.integrated)
 Factor w/ 2 levels "Baseline","Integrated": 1 1 1 1 1 1 1 1 1 1 ...
 str(Technique)
 Factor w/ 2 levels "Barrel","NonPrefHand": 1 1 1 1 1 1 1 1 1 1 ...
 str(Grip)
 Factor w/ 2 levels "Round","Tripod": 1 1 1 1 1 1 1 1 1 1 ...


On Fri, May 22, 2009 at 2:46 PM, jim holtman jholt...@gmail.com wrote:

 You need to supply str for the original arguments; the error message had a
 different set of parameters.


 On Fri, May 22, 2009 at 2:36 PM, Thomas Levine thomas.lev...@gmail.comwrote:

 That produces the following error

  tapply(t_p1,list(Grip,Technique),truecost)
 Error in tapply(time, Baseline.integrated, mean, na.rm = T) :
   arguments must have same length




 On Fri, May 22, 2009 at 1:06 PM, jim holtman jholt...@gmail.com wrote:

 Error message is self-explanatory: there is an unused parameter
 'na.rm=TRUE'.  You are calling your function 'truecost' which only has a
 single parameter 'time' and you are attempting to pass in 'na.rm=TRUE' which
 it will not accept.  You don't need it.

   On Fri, May 22, 2009 at 12:36 PM, Thomas Levine 
 thomas.lev...@gmail.com wrote:

  A subset of my raw data looks like this:

 --
 Grip, Technique, Baseline.integrated, Task, Stroke..direction., Engag,
 Disen, PenDG, PenUG, PenDS, PenUS, Duration
 -
 Tripod,Barrel,Integrated,7,S70,230,510,270,510,781,1011,1011

 Tripod,Barrel,Integrated,7,S71,na,na,na,na,na,na,na
 Round,NonPrefHand,Baseline,0,S00,na,na,110,250,380,520,520
 Round,NonPrefHand,Baseline,0,S01,na,na,220,360,460,620,620
 --


 I computed some values (times) from the raw data


 ---
 t_p1=PenDG
 t_c1=PenUG-PenDG
 t_p2=PenDS-PenUG
 t_c2=PenUS-PenDS
 ---


 And I put those times in a data frame called times. For each of these
 times, I want to subtract the average for Baseline trials from the
 average
 for Integrated trials within the Grip and Technique factors. Call
 these differences the true cost of mode selection.


  truecost <-

 function(time){as.numeric(tapply(time,Baseline.integrated,mean,na.rm=T)[2]-tapply(time,Baseline.integrated,mean,na.rm=T)[1])}

 To help explain what the truecost function does:
  tapply(t_p1,Baseline.integrated,mean,na.rm=T)
  Baseline Integrated
  212.8000   252.8402
  truecost(t_p1)
 [1] 40.04021


 Then I try to create a table of average truecost as a function of levels
 of
 a factor. I think this is the same error with tapply and by.


  tapply(t_p1,list(Grip,Technique),truecost,na.rm=T)
 Error in FUN(X[[1L]], ...) : unused argument(s) (na.rm = TRUE)
  by(times,list(Grip,Technique),truecost,na.rm=T)
 Error in FUN(data[x, , drop = FALSE], ...) :
  unused argument(s) (na.rm = TRUE)


 Any ideas?


 Thomas Levine!

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?





 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Barchart in lattice - wrong order of groups, data labels on top of each other, and a legend question

2009-05-22 Thread Dimitri Liakhovitski
Thank you very much, Gabor!

On Thu, May 21, 2009 at 9:37 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 If you are willing to go down one level and work at the grid level
 then you can do it without modifying the panel function.

 Below gg.ls$name lists the grid object names.  Within that
 list look at the ones that have rect in their name.  Among
 those are 5 in success (the 2nd through 6th rect objects).
 Similarly look for a text object in that vicinity. That would
 be the the third object of those with text in their name.  We
 want to reset the positioning of the text in the text object
 using the info from the rect objects that form the bars. Thus:

 # first run your code as posted, then run this
 library(grid)
 gg <- grid.grab()
 gg.ls <- grid.ls(gg)
 # based on gg.ls$name we make the observations above

 rect.names <- grep("rect", gg.ls$name, value = TRUE)[2:6]
 text.names <- grep("text", gg.ls$name, value = TRUE)[3]

 rect.x <- c(sapply(rect.names, function(x) grid.get(x)$x))
 rect.height <- c(sapply(rect.names, function(x) grid.get(x)$height))
 text.name <- grep("text", gg.ls$name, value = TRUE)[3]

 grid.edit(text.name,
        x = unit(rect.x, "native"),
        y = unit(rect.height + 1, "native"))

 On Thu, May 21, 2009 at 3:58 PM, Dimitri Liakhovitski ld7...@gmail.com 
 wrote:
 Deepayan, thank you very much for your response.
 I have a general question. And please remember - I am really just a
 beginner in R.
 Is it truly the case that in order to build quite a basic bar chart
 with value labels attached to it I have to be a true R graphics guru -
 because the only way to achieve what I am trying to achieve is to
 modify the underlying R function (panel.barchart)?
 Really?

 Dimitri


 On Tue, May 19, 2009 at 7:53 PM, Deepayan Sarkar
 deepayan.sar...@gmail.com wrote:
 On Mon, May 18, 2009 at 11:47 AM, Dimitri Liakhovitski ld7...@gmail.com 
 wrote:
 Hello!
 I have a question about my lattice barchart that I am trying to build
 in Section 3 below. I can't figure out a couple of things:
 1. When I look at the dataframe test that I am trying to plot, it
 looks right to me (the group Total is always the first out of 5).
 However, in the chart it is the last. Why?
 2. How can I make sure the value labels (on y) are not sitting on top
 of each other but on top of the respective bar?
 3. Is there any way to make the legend group items horizontally as
 opposed to now (vertically - taking up too much space)

 For 1 and 3, use

          auto.key = list(points = FALSE,
                          rectangles = TRUE,
                          reverse.rows = TRUE,
                          columns = 2,
                          space = "bottom")

 From ?xyplot (under 'key'):

               'reverse.rows' logical, defaulting to 'FALSE'.  If
                    'TRUE', all components are reversed _after_ being
                    replicated (the details of which may depend on the
                    value of 'rep').  This is useful in certain
                    situations, e.g. with a grouped 'barchart' with
                    'stack = FALSE' with the categorical variable on
                    the vertical axis, where the bars in the plot will
                    usually be ordered from bottom to top, but the
                    corresponding legend will have the levels from top
                    to bottom (unless, of course, 'reverse.rows =
                    TRUE').  Note that in this case, unless all columns
                    have the same number or rows, they will no longer
                    be aligned.

               'columns' the number of columns column-blocks the key is
                    to be divided into, which are drawn side by side.


 2 is hard with a simple custom panel function, because you need to
 replicate some fairly involved calculations that are performed in
 panel.barchart. Your best bet is to start with a copy of
 panel.barchart, and then add calls to panel.text at suitable places.

 -Deepayan


 Thanks a lot!
 Dimitri

 ### Section 1: generates my data set data - just run: #

  N <- 100
  myset1 <- c(1,2,3,4,5)
  probs1 <- c(.05,.10,.15,.40,.30)
  myset2 <- c(0,1)
  probs2 <- c(.65,.30)
  myset3 <- c(1,2,3,4,5,6,7)
  probs3 <- c(.02,.03,.10,.15,.20,.30,.20)

  group <- unlist(lapply(1:4, function(x){
         out <- rep(x,25)
         return(out)
  }))
  set.seed(1)
  a <- sample(myset1, N, replace = TRUE, probs1)
  a[which(rbinom(100,2,.01)==1)] <- NA
  set.seed(12)
  b <- sample(myset1, N, replace = TRUE, probs1)
  b[which(rbinom(100,2,.01)==1)] <- NA
  set.seed(123)
  c <- sample(myset2, N, replace = TRUE, probs2)
  set.seed(1234)
  d <- sample(myset2, N, replace = TRUE, probs2)
  set.seed(12345)
  e <- sample(myset3, N, replace = TRUE, probs3)
  e[which(rbinom(100,2,.01)==1)] <- NA
  set.seed(123456)
  f <- sample(myset3, N, replace = TRUE, probs3)
  f[which(rbinom(100,2,.01)==1)] <- NA
  data <- data.frame(group, a=a, b=b, c=c, d=d, e=e, f=f)
  data["group"] <- lapply(data["group"], function(x) {
         x[x %in% 1] <- "Group 1"
         x[x %in% 2] <- "Group 2"
         x[x %in% 3] <- "Group 3"
        

Re: [R] step by step debugger in R?

2009-05-22 Thread Robert Gentleman
Hi,

Romain Francois wrote:
 Duncan Murdoch wrote:
 On 5/22/2009 10:59 AM, Michael wrote:
 Really I think if there is a Visual Studio strength debugger, our
 collective time spent in developing R code will be greatly reduced.

 If someone who knows how to write a debugger plugin for Eclipse wants
 to help, we could have that fairly easily.  All the infrastructure is
 there; it's the UI part that's missing.

 Duncan Murdoch
 [I've copied Mark Bravington and Robert Gentleman to the list as they
 are likely to have views here, and I am not sure they monitor R-help]
 
 Hello,
 
 Making a front-end to debugging was one of the proposed google summer of
 code for this year [1], it was not retained eventually, but I am still
 motivated.
 
 Pretty much all infrastructure is there, and some work has been done
 __very recently__ in R's debugging internals (ability to step up). As I
 see it, the ability to call some sort of hook each time the debugger
 waits for input would make it much easier for someone to write

 I have still not come to an understanding of what this is supposed to do? When
you have the browser prompt you can call any function or code you want to. There
is no need for something special to allow you to do that.

 front-ends. A recent post of mine (patch included) [2] on R-devel
 suggested a custom prompt for browser which would do the trick, but I
 now think that a hook would be more appropriate. Without something
 similar to that, there is no way that I know of for making a front-end,
 unless maybe if you embed R ... (please let me know how I am wrong)

 I think you are wrong. I can't see why it is needed. The external debugger has
lots of options for handling debugging. It can rewrite code (see examples in
trace for how John Chambers has done this to support tracing at a location),
which is AFAIK a pretty standard approach to writing debuggers. It can figure
out where the break point is (made a bit easier by allowing it to put in pieces
of text in the call to browser).  These are things the internal debugger can't 
do.

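(For readers following along: the code-rewriting approach mentioned here
can be seen in base R's trace(); a standard usage is

trace(mean.default, tracer = browser, at = 1)  # browser() at step 1
mean(1:10)                                     # drops into the browser
untrace(mean.default)

which instruments the function without editing its source.)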
 
 There is also the debug package [3,4] which does __not__ work with R
 internals but rather works with instrumenting tricks at the R level.
 debug provides a tcl/tk front-end. It is my understanding that it does
 not work using R internals (do_browser, ...) because it was not possible
 at the time, and I believe this is still not possible today, but I might
 be wrong. I'd prefer to be wrong actually.

  I don't understand this statement. It has always been possible to work with
the internal version - but one can also take the approach of rewriting code.
There are some difficulties supporting all the operations that one would like by
rewriting code and I think a combination of external controls and the internal
debugger will get most of the functionality that anyone wants.

  There are some things that are hard, and once I have a more complete list I will
be adding this to the appropriate manual. I will also be documenting the changes
that I have been making, but that project is in flux and won't be done until the
end of August, so people who want to look at it are welcome (it is in R-devel),
but it is in development and could change pretty much without notice.
  Romain noted that we now support stepping out from one place to another
function.  We also have a debugonce flag that lets you get close to step in, but
step in is very hard in R.

  I am mostly interested in writing tools in R that can be used by anyone that
wants to write an external debugger and am not that interested in any particular
external debugger. I would be happy to listen to feature requests or issues that
arise - but the discussion should probably be on R-devel mailing list.

  best wishes
Robert


 
 Romain
 
 [1] : http://www.r-project.org/soc09/ideas.html#p5
 [2] : https://stat.ethz.ch/pipermail/r-devel/2009-April/053128.html
 [3] : http://cran.r-project.org/web/packages/debug/index.html
 [4] : http://cran.r-project.org/doc/Rnews/Rnews_2003-3.pdf
 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgent...@fhcrc.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] step by step debugger in R?

2009-05-22 Thread John Lindsey
As a newbie I'm trying to figure out how much more than what RKWard does is 
wanted. The code changes color as syntax is checked and errors are noted. 
It seems like a reasonable IDE. Maybe someone is looking for the same in 
Windows?


John F Lindsey
803-790-5006 Home , 803-790-5008 Cell
919-439-9088 Home Office
3850 Northshore Rd., Columbia,SC 29206-3350
john.f.lind...@gmail.com , john-lind...@sc.rr.com 
http://www.linkedin.com/in/johnlindseysc





Robert Gentleman wrote:

Hi,

Romain Francois wrote:
  

Duncan Murdoch wrote:


On 5/22/2009 10:59 AM, Michael wrote:
  

Really I think if there is a Visual Studio strength debugger, our
collective time spent in developing R code will be greatly reduced.


If someone who knows how to write a debugger plugin for Eclipse wants
to help, we could have that fairly easily.  All the infrastructure is
there; it's the UI part that's missing.

Duncan Murdoch
  

[I've copied Mark Bravington and Robert Gentleman to the list as they
are likely to have views here, and I am not sure they monitor R-help]

Hello,

Making a front-end to debugging was one of the proposed google summer of
code for this year [1], it was not retained eventually, but I am still
motivated.

Pretty much all infrastructure is there, and some work has been done
__very recently__ in R's debugging internals (ability to step up). As I
see it, the ability to call some sort of hook each time the debugger
waits for input would make it much easier for someone to write



 I have still not come to an understanding of what this is supposed to do? When
you have the browser prompt you can call any function or code you want to. There
is no need for something special to allow you to do that.

  

front-ends. A recent post of mine (patch included) [2] on R-devel
suggested a custom prompt for browser which would do the trick, but I
now think that a hook would be more appropriate. Without something
similar to that, there is no way that I know of for making a front-end,
unless maybe if you embed R ... (please let me know how I am wrong)



 I think you are wrong. I can't see why it is needed. The external debugger has
lots of options for handling debugging. It can rewrite code (see examples in
trace for how John Chambers has done this to support tracing at a location),
which is AFAIK a pretty standard approach to writing debuggers. It can figure
out where the break point is (made a bit easier by allowing it to put in pieces
of text in the call to browser).  These are things the internal debugger can't 
do.

  

There is also the debug package [3,4] which does __not__ work with R
internals but rather works with instrumenting tricks at the R level.
debug provides a tcl/tk front-end. It is my understanding that it does
not work using R internals (do_browser, ...) because it was not possible
at the time, and I believe this is still not possible today, but I might
be wrong. I'd prefer to be wrong actually.



  I don't understand this statement. It has always been possible to work with
the internal version - but one can also take the approach of rewriting code.
There are some difficulties supporting all the operations that one would like by
rewriting code and I think a combination of external controls and the internal
debugger will get most of the functionality that anyone wants.

  There are somethings that are hard and once I have a more complete list I will
be adding this to the appropriate manual. I will also be documenting the changes
that I have been making, but that project is in flux and won't be done until the
end of August, so people who want to look at it are welcome (it is in R-devel),
but it is in development and could change pretty much without notice.
  Romain noted that we now support stepping out from one place to another
function.  We also have a debugonce flag that lets you get close to step in, but
step in is very hard in R.

  I am mostly interested in writing tools in R that can be used by anyone that
wants to write an external debugger and am not that interested in any particular
external debugger. I would be happy to listen to feature requests or issues that
arise - but the discussion should probably be on R-devel mailing list.

  best wishes
Robert


  

Romain

[1] : http://www.r-project.org/soc09/ideas.html#p5
[2] : https://stat.ethz.ch/pipermail/r-devel/2009-April/053128.html
[3] : http://cran.r-project.org/web/packages/debug/index.html
[4] : http://cran.r-project.org/doc/Rnews/Rnews_2003-3.pdf




  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] vcd package --- change layout of plot

2009-05-22 Thread Achim Zeileis

On Thu, 21 May 2009, Prew, Paul wrote:


Hello,

I'm trying to use the vcd package to analyze survey data.  Expert judges
ranked possible features for product packaging.  Seven features were
listed, and 19 judges split between 2 cities ranked them.

The following code (1) works, but the side-by-side plots for Cities PX,
SF are shrunk too much.  Stacking PX on top of SF would make for a
better plot.  (I could switch the order of Feature and Rank dimensions,
and go with the default side-by-side, but would prefer not to).


Several comments:

   - Adding keep_aspect_ratio = TRUE probably changes what you call
shrunk too much. By default the aspect ratio is kept fixed for
2-way displays.

  - I would switch the splitting order of Feature/Rank because you
probably want the distribution of ranks for a given feature and
not the other way round. In that case, you might find setting
split_vertical = TRUE aesthetically more pleasing. But it's probably
a matter of taste.

  - Setting gp = shading_max is not really the best choice: If you would
want a conditional independence test based on the maximum of residuals
you should use panel = cotab_coindep. See the JCGS paper in the
references and the accompanying vignette("residual-shadings", package
= "vcd") for more details. But note that the double maximum test
is not the most powerful test against ordinal alternatives (as one
might expect due to the ordered Rank variable).

hth,
Z


(1)
cotabplot(~ Rank + Feature | Cities, data = Pack.dat, gp = shading_max,
rot_labels = c(90, 0, 0, 0), just_labels = c("left", "left",
"left", "right"), set_varnames = c(Feature = ""))

Reading the vcd help, I got lost trying to understand the
panel-generating parameters I should use.  My best guess was below (2),
but gave an error message.  Clearly, I don't know what the paneling is
asking for.  This is where I would like some advice, if anyone is
familiar with vcd.

(2)
  Tried to change the layout of trellis plot from horizontal to
vertical
Pack.mos <- mosaic(~ Feature + Rank, data = Pack.tab, gp = shading_max,
rot_labels = c(0, 0, 0, 0), just_labels = c("left", "left", "left",
"right"), set_varnames = c(Feature = ""))
## attempt to create an object for panel argument in cotabplot function

pushViewport(viewport(layout = grid.layout(ncol = 1)))
pushViewport(viewport(layout.pos.row = 1))
## tell vcd to change the default layout, and what to put in the top
plot

cotabplot(~ Feature + Rank | Cities, data = Pack.dat, panel = Pack.mos,
Pack.dat[["PX"]], gp = shading_max, rot_labels = c(0, 0, 0, 0))
## create mosaic plot that's conditional on Cities;  first plot Cities =
PX
## panel argument is an attempt to modify an example in the vcd help
file

popViewport()
## create the graphic

Error:  Cannot pop the top-level viewport (grid and graphics output
mixed?)

# no point in gong on to code the plot for layout.pos.row = 2


str(Pack.tab)

Error in `[.structable`(x, i, args[[2]]) : subscript out of bounds

class(Pack.tab)

[1] "structable" "ftable"

dim(Pack.tab)

[1] 7 2 7

 Cities PX SF
Rank Feature
1Flexible 2  0
Integrate.Probes 1  2
Large/heavy  1  0
Lockout  0  1
Recyclable   3  5
Rigid0  0
Small/light  2  1
2Flexible 1  6
Integrate.Probes 2  0
Large/heavy  1  1
Lockout  1  0
Recyclable   2  0
Rigid1  0
Small/light  2  2
3Flexible 1  1
Integrate.Probes 3  0
Large/heavy  1  1
Lockout  2  1
Recyclable   1  3
Rigid0  0
Small/light  0  3
4Flexible 3  0
Integrate.Probes 0  2
Large/heavy  0  0
Lockout  2  2
Recyclable   0  1
Rigid1  2
Small/light  3  2
5Flexible 1  1
Integrate.Probes 1  1
Large/heavy  0  3
Lockout  0  2
Recyclable   2  0
Rigid3  1
Small/light  2  1
6Flexible 0  1
Integrate.Probes 1  3
Large/heavy  3  2
Lockout  3  1
Recyclable   0  0
Rigid2  2
Small/light  0  0
7Flexible 1  0
Integrate.Probes 1  1
Large/heavy  3  2
Lockout  1  2
Recyclable   1  0
Rigid2  4
Small/light  0  0




sessionInfo()

R version 2.9.0 RC (2009-04-10 r48318)
i386-pc-mingw32
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United

Re: [R] Naming a random effect in lmer

2009-05-22 Thread Leigh Ann Starcevich
Here is a test data set and code.  I am including the data set after the 
code and discussion to make reading easier.  Apologies for the size of the 
data set, but my problem occurs when there are a lot of Z 
variables.   Thanks for your time.

# Enter data below

# Sample code
library(lme4)
mb <- length(unique(testsamp$WYear))

# Create the formula for the set of identically distributed random effects
Zs <- paste("Z", 2:(mb-1), sep="")
Trendformula <- as.formula(paste("LogY ~ WYear + (1+WYear|Site) + (1|",
    paste(Zs, collapse="+"), ")"))

fittest <- lmer(Trendformula, data = testsamp)
summary(fittest)

# Here I get an error because the name of the random effect is too long to 
print
# in the random effects output (I think).
# The error message is:  Error in names(bars) <- unlist(lapply(bars, function(x)
# deparse(x[[3]]))) : 'names' attribute [3] must be the same length as the
# vector [2]

# However, when fewer Z variables are used in the random portion of the model,
# there is no error.
# Using only Z2 + … + Z9 for the random intercept

Zs2 <- paste("Z", 2:9, sep="")
Trendformula2 <- as.formula(paste("LogY ~ WYear + (1+WYear|Site) + (1|",
    paste(Zs2, collapse="+"), ")"))
fittest2 <- lmer(Trendformula2, data = testsamp)
summary(fittest2)


# Is there a way to either name the set of iid random effects something 
else or
# to define a random variable that could be used in the model to create a
# random intercept?

# I have had some success in lme, but it would be helpful for my simulation 
if I
# could conduct this analysis with lmer.  My model in lme is not correctly
# estimating one of the variance components (random Site intercept).
# I am using:

detach("package:lme4")
library(nlme)
random.model.lme <-
    as.formula(paste("~-1+", paste(paste("Z", 2:(mb-1), sep=""), collapse="+")))

n <- dim(testsamp)[1]
testsampgroup <- rep(1, n)
testsamp.lme <- cbind(testsamp, testsampgroup)
testgroupSamp <- groupedData(LogY ~ WYearCen | testsampgroup, inner = ~Site,
data = data.frame(testsamp.lme))

fittest3 <- lme(LogY ~ WYearCen, random =
pdBlocked(list(pdIdent(~-1+WYearCen:as.factor(Site)),
pdIdent(~-1+as.factor(Site)), pdIdent(random.model.lme))), data = testgroupSamp)
summary(fittest3)
VarCorr(fittest3)


# Data

testsamp <-
structure(list(Year = c(2008, 2008, 2008, 2008, 2008, 2008, 2008,
2009, 2009, 2009, 2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010,
2010, 2010, 2010, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2012,
2012, 2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2013,
2013, 2013, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2015, 2015,
2015, 2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2018, 2018, 2018,
2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019, 2019, 2019,
2020, 2020, 2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021,
2021, 2021, 2021, 2022, 2022, 2022, 2022, 2022, 2022, 2022),
 WYear = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2,
 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4,
 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7,
 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10,
 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12,
 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14,
 14, 14), Site = c(4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40,
 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67,
 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75,
 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94,
 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4,
 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18,
 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26,
 40, 67, 75, 94), LogY = c(0.848648866298552, 0.809143925760456,
 0.734173952725014, 1.46749967704437, 0.716106254860468, 
0.843871512951468,
 1.09120092433378, 0.800893809796851, 0.996674977596997, 
0.917613604481207,
 1.71928772722884, 0.797853604855215, 0.922298691760041, 
0.964654422529188,
 0.782903180421921, 1.13457969553106, 1.21628917384868, 2.03084776647495,
 0.872954085910578, 1.02192794559856, 0.746307251774509, 
0.439812203188778,
 1.11164109549224, 1.18414357836729, 2.00157711459358, 0.66577753155877,
 0.856374433428581, 0.343862060001402, 0.278505653147057,
 1.20632152691478, 1.32289150679746, 2.19814430598707, 0.538363941164496,
 0.820038163290321, 0.070054765524828, 0.0479738684639024,
 1.29137364087568, 1.52436357249377, 2.32150525025777, 0.595507040392793,
 0.851417610550757, -0.115908144193410, 0.0118306018140099,
 1.39448009350962, 1.71677754106603, 2.59146662837284, 0.595750060671620,
 0.855387479311679, -0.430729785591898, 0.0178423104900579,
 1.60964246000316, 1.99184029256509, 2.86865842252168, 0.69512483409,
 0.96175860396451, -0.600991172113926, -0.174420349224615,
 1.73794158380868, 2.06718359946362, 3.04502112974038, 0.730730638403177,
 0.961110819398807, -0.856693722990918, 

[R] stripchart and ordering of plot

2009-05-22 Thread Andrew Yee
Take for example the following stripchart that's created:

b <- 1:5
a <- 11:15
e <- 21:25
f <- -11:-15

foo <- rbind(b, a, e, f)

stripchart(foo ~ rownames(foo))

In this case, I would like the bottom part of the plot to be the f vector,
followed by the e vector, etc.

However, R reorders the matrix and plots according to the alphabetical order
of the rownames of foo.  Is there a way to plot the matrix in the current
order of the rownames?

Thanks,
Andrew


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] vcd package --- change layout of plot

2009-05-22 Thread Prew, Paul
Dear Achim,  thank you very much for the suggestions; they work well.  Agreed 
that an ordinal logistic regression seems a more powerful choice of analysis 
method, and I'm using that also.

Thanks for providing the vcd package, it's proving quite helpful.
Regards, Paul

Paul Prew  |  Statistician
651-795-5942   |   fax 651-204-7504 
Ecolab Research Center  | Mail Stop ESC-F4412-A 
655 Lone Oak Drive  |  Eagan, MN 55121-1560 


-Original Message-
From: Achim Zeileis [mailto:achim.zeil...@wu-wien.ac.at] 
Sent: Friday, May 22, 2009 3:06 PM
To: Prew, Paul
Cc: r-help@r-project.org
Subject: Re: [R] vcd package --- change layout of plot

On Thu, 21 May 2009, Prew, Paul wrote:

 Hello,

 I'm trying to use the vcd package to analyze survey data.  Expert judges
 ranked possible features for product packaging.  Seven features were
 listed, and 19 judges split between 2 cities ranked them.

 The following code (1) works, but the side-by-side plots for Cities PX,
 SF are shrunk too much.  Stacking PX on top of SF would make for a
 better plot.  (I could switch the order of Feature and Rank dimensions,
 and go with the default side-by-side, but would prefer not to).

Several comments:

   - Adding keep_aspect_ratio = FALSE will probably change what you call
 "shrunk too much": by default the aspect ratio is kept fixed for
 2-way displays.

   - I would switch the splitting order of Feature/Rank because you
 probably want the distribution of ranks for a given feature and
 not the other way round. In that case, you might find setting
 split_vertical = TRUE aesthetically more pleasing. But it's probably
 a matter of taste.

   - Setting gp = shading_max is not really the best choice: if you
 want a conditional independence test based on the maximum of residuals,
 you should use panel = cotab_coindep. See the JCGS paper in the
 references and the accompanying vignette("residual-shadings", package
 = "vcd") for more details. But note that the double maximum test
 is not the most powerful test against ordinal alternatives (as one
 might expect, given the ordered Rank variable).

hth,
Z

 (1)
 cotabplot(~ Rank + Feature | Cities, data = Pack.dat, gp = shading_max,
 rot_labels = c(90, 0, 0, 0), just_labels = c("left", "left",
 "left", "right"), set_varnames = c(Feature = ""))

 Reading the vcd help, I got lost trying to understand the
 panel-generating parameters I should use.  My best guess is below (2),
 but it gave an error message.  Clearly, I don't know what the paneling is
 asking for.  This is where I would like some advice, if anyone is
 familiar with vcd.

 (2)
 ## Tried to change the layout of the trellis plot from horizontal to
 ## vertical
 Pack.mos <- mosaic(~ Feature + Rank, data = Pack.tab, gp = shading_max,
 rot_labels = c(0, 0, 0, 0), just_labels = c("left", "left", "left",
 "right"), set_varnames = c(Feature = ""))
 ## attempt to create an object for panel argument in cotabplot function

 pushViewport(viewport(layout = grid.layout(ncol = 1)))
 pushViewport(viewport(layout.pos.row = 1))
 ## tell vcd to change the default layout, and what to put in the top
 plot

 cotabplot(~ Feature + Rank | Cities, data = Pack.dat, panel = Pack.mos,
 Pack.dat[["PX"]], gp = shading_max, rot_labels = c(0, 0, 0, 0))
 ## create mosaic plot that's conditional on Cities;  first plot Cities =
 PX
 ## panel argument is an attempt to modify an example in the vcd help
 file

 popViewport()
 ## create the graphic

 Error:  Cannot pop the top-level viewport (grid and graphics output
 mixed?)

 # no point in going on to code the plot for layout.pos.row = 2

 str(Pack.tab)
 Error in `[.structable`(x, i, args[[2]]) : subscript out of bounds
 class(Pack.tab)
 [1] structable ftable
 dim(Pack.tab)
 [1] 7 2 7

                       Cities PX SF
 Rank Feature
 1    Flexible                 2  0
      Integrate.Probes         1  2
      Large/heavy              1  0
      Lockout                  0  1
      Recyclable               3  5
      Rigid                    0  0
      Small/light              2  1
 2    Flexible                 1  6
      Integrate.Probes         2  0
      Large/heavy              1  1
      Lockout                  1  0
      Recyclable               2  0
      Rigid                    1  0
      Small/light              2  2
 3    Flexible                 1  1
      Integrate.Probes         3  0
      Large/heavy              1  1
      Lockout                  2  1
      Recyclable               1  3
      Rigid                    0  0
      Small/light              0  3
 4    Flexible                 3  0
      Integrate.Probes         0  2
      Large/heavy              0  0
      Lockout                  2  2
      Recyclable               0  1
      Rigid                    1  2
      Small/light              3  2
 5    Flexible                 1  1
      Integrate.Probes         1  1
      Large/heavy              0  3
      Lockout                  0  2
      Recyclable               2  0
      Rigid                    3  1

[R] Goodness of fit for MLE?

2009-05-22 Thread Michael
Hi all,

How do I evaluate how good my MLE fit is? Moreover, suppose I have
30 data points in one case and 50 in another; how do I compare which one
gives the better goodness of fit?

What are the common test procedures to assess the goodness of fit for MLE?

Thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Naming a random effect in lmer

2009-05-22 Thread spencerg
 I miscommunicated: In every application I've seen with large numbers 
of parameters to estimate, most of those parameters are specific 
instances of different levels of a random effect. For example, a 
colleague recently did a fixed effects analysis of a longitudinal 
abundance survey of a large number of species of wildlife. This 
colleague claimed that treating species as a random effect was 
inappropriate, because the species were not selected by use of random 
numbers. With only two or three species, if the scientists were only 
concerned with those two or three species, this may be reasonable.



However, most scientists -- and people who study their work -- will want 
to generalize beyond only the species in the study. Even if scientists 
were only interested in two or three species for which they had data, 
Stein's phenomenon suggests that even a rabid Bayesophobe would likely 
get lower mean square error at the expense of some bias from pretending 
the species were randomly selected from some larger population 
(http://en.wikipedia.org/wiki/James-Stein_estimator).



If I were asked to referee a paper or serve on a thesis committee for a 
study estimating 30 random effects, I would not be happy unless there 
were a convincing discussion of why it was appropriate to estimate all 
those random effects separately; right now, I can not imagine an 
application where that would be appropriate.



If you and your advisor still feel that what you are doing makes sense, 
I suggest you first get the source code via "svn checkout 
svn://svn.r-forge.r-project.org/svnroot/lme4" (or by downloading 
lme4_0.999375-30.tar.gz from 
http://cran.fhcrc.org/web/packages/lme4/index.html), then walk through 
the code line by line using the debug function (or browser or the 
debug package). From this, you will likely see either (a) how you can 
do what you want differently to achieve the same result or (b) how to 
modify the code so it does what you want.
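
As a rough sketch of that walkthrough (hedged: lmerFactorList is the
internal named by the traceback discussed later in this thread, and the
name and its accessibility via ::: may differ between lme4 versions):

library(lme4)
debug(lme4:::lmerFactorList)   # flag the formula-processing internal
# fittest <- lmer(Trendformula, data = testsamp)  # now pauses inside;
# step with 'n', inspect variables, and quit with 'Q'
undebug(lme4:::lmerFactorList)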



Hope this helps.
Spencer


Leigh Ann Starcevich wrote:
Here is a test data set and code. I am including the data set after 
the code and discussion to make reading easier. Apologies for the size 
of the data set, but my problem occurs when there are a lot of Z 
variables. Thanks for your time.


# Enter data below

# Sample code
library(lme4)
mb <- length(unique(testsamp$WYear))

# Create the formula for the set of identically distributed random 
effects

Zs <- paste("Z", 2:(mb-1), sep="")
Trendformula <- as.formula(paste("LogY ~ WYear + (1+WYear|Site) + (1|",
    paste(Zs, collapse="+"), ")"))


fittest <- lmer(Trendformula, data = testsamp)
summary(fittest)

# Here I get an error because the name of the random effect is too 
long to print

# in the random effects output (I think).
# The error message is: Error in names(bars) <- unlist(lapply(bars, function(x)
# deparse(x[[3]]))) : 'names' attribute [3] must be the same length as
# the vector [2]


# However, when fewer Z variables are used in the random portion of 
the model,

# there is no error.
# Using only Z2 + … + Z9 for the random intercept

Zs2 <- paste("Z", 2:9, sep="")
Trendformula2 <- as.formula(paste("LogY ~ WYear + (1+WYear|Site) + (1|",
    paste(Zs2, collapse="+"), ")"))

fittest2 <- lmer(Trendformula2, data = testsamp)
summary(fittest2)


# Is there a way to either name the set of iid random effects 
something else or

# to define a random variable that could be used in the model to create a
# random intercept?

# I have had some success in lme, but it would be helpful for my 
simulation if I

# could conduct this analysis with lmer. My model in lme is not correctly
# estimating one of the variance components (random Site intercept).
# I am using:

detach("package:lme4")
library(nlme)
random.model.lme <-
    as.formula(paste("~-1+", paste(paste("Z", 2:(mb-1), sep=""), collapse="+")))



n <- dim(testsamp)[1]
testsampgroup <- rep(1, n)
testsamp.lme <- cbind(testsamp, testsampgroup)
testgroupSamp <- groupedData(LogY ~ WYearCen | testsampgroup, inner = ~Site,
data = data.frame(testsamp.lme))

fittest3 <- lme(LogY ~ WYearCen, random =
pdBlocked(list(pdIdent(~-1+WYearCen:as.factor(Site)),
pdIdent(~-1+as.factor(Site)), pdIdent(random.model.lme))), data =
testgroupSamp)

summary(fittest3)
VarCorr(fittest3)


# Data

testsamp <-
structure(list(Year = c(2008, 2008, 2008, 2008, 2008, 2008, 2008,
2009, 2009, 2009, 2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010,
2010, 2010, 2010, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2012,
2012, 2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2013,
2013, 2013, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2015, 2015,
2015, 2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2018, 2018, 2018,
2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019, 2019, 2019,
2020, 2020, 2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021,
2021, 2021, 2021, 2022, 2022, 2022, 2022, 2022, 2022, 2022),
WYear = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2,
2, 2, 2, 2, 2, 3, 3, 3, 3, 

[R] mle() question

2009-05-22 Thread Stephen Collins
Is there a way to code the mle() function in library stats4 such that it
switches optimizing methods midstream (e.g., BFGS to Newton and back to
BFGS, etc.)?
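
One hedged workaround sketch: mle() passes its method argument through
to optim(), so restarting mle() from the previous fit's coefficients
approximates a mid-stream switch (toy normal model; the log-scale sigma
keeps the search unconstrained):

library(stats4)
set.seed(1)
x <- rnorm(100, mean = 2, sd = 3)
nll <- function(mu, lsigma) -sum(dnorm(x, mu, exp(lsigma), log = TRUE))
fit1 <- mle(nll, start = list(mu = 0, lsigma = 0), method = "BFGS")
## restart where BFGS stopped, now with Nelder-Mead
fit2 <- mle(nll, start = as.list(coef(fit1)), method = "Nelder-Mead")
coef(fit2)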

Thanks,
 
Stephen Collins, MPP | Analyst
Health  Benefits | Aon Consulting


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Naming a random effect in lmer

2009-05-22 Thread William Dunlap
 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg
 Sent: Friday, May 22, 2009 3:01 PM
 To: Leigh Ann Starcevich
 Cc: r-help@r-project.org
 Subject: Re: [R] Naming a random effect in lmer
 
 [ ... elided statistical advice ... ]
 If you and your advisor still feel that what you are doing 
 makes sense, 
 I suggest you first get the source code via "svn checkout 
 svn://svn.r-forge.r-project.org/svnroot/lme4" (or by downloading 
 lme4_0.999375-30.tar.gz from 
 http://cran.fhcrc.org/web/packages/lme4/index.html), then 
 walk through 
 the code line by line using the debug function (or browser or the 
 debug package). From this, you will likely see either (a) 
 how you can 
 do what you want differently to achieve the same result or (b) how to 
 modify the code so it does what you want.

The coding error is right in the error message:
  Error in names(bars) <- unlist(lapply(bars, function(x)
deparse(x[[3]])))
and I suspect that traceback() would tell you that it came from a call
to lmerFactorList.

That code implicitly assumes that deparse() will produce a scalar
character vector, but it doesn't if the input expression is complicated
enough. Changing the
    deparse(x[[3]])
to
    deparse(x[[3]])[1]
or
    paste(collapse = " ", deparse(x[[3]]))
would fix it.  The first truncates the name and the second may make a
very long name.

There is at least one other use of that idiom in the lme4 code and your
dataset and analysis may require that all of them be fixed.
 
 
 
 Hope this helps.
 Spencer
 
 
 Leigh Ann Starcevich wrote:
  Here is a test data set and code. I am including the data set after 
  the code and discussion to make reading easier. Apologies 
 for the size 
  of the data set, but my problem occurs when there are a lot of Z 
  variables. Thanks for your time.
 
  # Enter data below
 
  # Sample code
  library(lme4)
  mb <- length(unique(testsamp$WYear))
 
  # Create the formula for the set of identically distributed random 
  effects
  Zs <- paste("Z", 2:(mb-1), sep="")
  Trendformula <- as.formula(paste("LogY ~ WYear + (1+WYear|Site) + (1|",
      paste(Zs, collapse="+"), ")"))
 
  fittest <- lmer(Trendformula, data = testsamp)
  summary(fittest)
 
  # Here I get an error because the name of the random effect is too 
  long to print
  # in the random effects output (I think).
  # The error message is: Error in names(bars) <- unlist(lapply(bars, function(x)
  # deparse(x[[3]]))) : 'names' attribute [3] must be the same length as
  # the vector [2]
 
  # However, when fewer Z variables are used in the random portion of 
  the model,
  # there is no error.
  # Using only Z2 + ... + Z9 for the random intercept
 
  Zs2 <- paste("Z", 2:9, sep="")
  Trendformula2 <- as.formula(paste("LogY ~ WYear + (1+WYear|Site) + (1|",
      paste(Zs2, collapse="+"), ")"))
  fittest2 <- lmer(Trendformula2, data = testsamp)
  summary(fittest2)
 
 
  # Is there a way to either name the set of iid random effects 
  something else or
  # to define a random variable that could be used in the 
 model to create a
  # random intercept?
 
  # I have had some success in lme, but it would be helpful for my 
  simulation if I
  # could conduct this analysis with lmer. My model in lme is 
 not correctly
  # estimating one of the variance components (random Site intercept).
  # I am using:
 
  detach("package:lme4")
  library(nlme)
  random.model.lme <-
      as.formula(paste("~-1+", paste(paste("Z", 2:(mb-1), sep=""), collapse="+")))
 
 
  n <- dim(testsamp)[1]
  testsampgroup <- rep(1, n)
  testsamp.lme <- cbind(testsamp, testsampgroup)
  testgroupSamp <- groupedData(LogY ~ WYearCen | testsampgroup, inner = ~Site,
  data = data.frame(testsamp.lme))
 
  fittest3 <- lme(LogY ~ WYearCen, random =
  pdBlocked(list(pdIdent(~-1+WYearCen:as.factor(Site)),
  pdIdent(~-1+as.factor(Site)), pdIdent(random.model.lme))), data =
  testgroupSamp)
  summary(fittest3)
  VarCorr(fittest3)
 
 
  # Data
 
  testsamp <-
  structure(list(Year = c(2008, 2008, 2008, 2008, 2008, 2008, 2008,
  2009, 2009, 2009, 2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010,
  2010, 2010, 2010, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2012,
  2012, 2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2013,
  2013, 2013, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2015, 2015,
  2015, 2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016, 2016, 2016,
  2016, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2018, 2018, 2018,
  2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019, 2019, 2019,
  2020, 2020, 2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021,
  2021, 2021, 2021, 2022, 2022, 2022, 2022, 2022, 2022, 2022),
  WYear = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2,
  2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4,
  5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7,
  7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10,
  10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12,
  12, 12, 12, 13, 13, 13, 13, 13, 13, 

[R] how to insert NULLs in lists?

2009-05-22 Thread Kynn Jones
I'm an experienced programmer, but learning R is making me lose the little
hair I have left...

 list(NULL)
[[1]]
NULL

 length(list(NULL))
[1] 1
 x <- list()
 x[[1]] <- NULL
 x
list()
 length(x)
[1] 0


From the above experiment, it is clear that, although one can create a
one-element list consisting of a NULL element, one can't get the same result
by assigning NULL to the first element of an empty list.  And it gets worse:

 x <- list(1, 2, 3)
 length(x)
[1] 3
 x[[2]] <- NULL
 length(x)
[1] 2



I just could NOT believe my eyes!  Am I going crazy???

What I'm trying to do is so simple and straightforward: I want to be able to
append NULL to a list, and, after the appending, have the last element of
the list be NULL.  Is that so unreasonable?  How can it be done?

TIA!

Kynn


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to insert NULLs in lists?

2009-05-22 Thread Bert Gunter

Continuing your example below,

x[4] <- list(NULL)

I won't try to defend the semantics, which have been complained about before.
However, note that with lists, x[i] is the list which consists of one
member, the ith component of x, which is not the same as x[[i]], the ith
component itself; so the assignment above sets the list that contains the
4th component of x to the list that contains NULL, doing what you desire.
For similar reasons, x <- c(x, list(NULL)) would also work.

I believe all of this is documented in the man page for [ ; but I grant
that it takes some close reading and experimentation to get it.
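
A short transcript showing both idioms end to end:

x <- list(1, 2, 3)
x[4] <- list(NULL)     # x[4] is a length-one sublist, so this stores NULL
length(x)              # 4
is.null(x[[4]])        # TRUE
x <- c(x, list(NULL))  # appending via c() works the same way
length(x)              # 5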

-- Bert Gunter
Genentech Nonclinical Statistics

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Kynn Jones
Sent: Friday, May 22, 2009 3:52 PM
To: r-help@r-project.org
Subject: [R] how to insert NULLs in lists?

I'm an experienced programmer, but learning R is making me lose the little
hair I have left...

 list(NULL)
[[1]]
NULL

 length(list(NULL))
[1] 1
 x <- list()
 x[[1]] <- NULL
 x
list()
 length(x)
[1] 0


From the above experiment, it is clear that, although one can create a
one-element list consisting of a NULL element, one can't get the same result
by assigning NULL to the first element of an empty list.  And it gets worse:

 x <- list(1, 2, 3)
 length(x)
[1] 3
 x[[2]] <- NULL
 length(x)
[1] 2



I just could NOT believe my eyes!  Am I going crazy???

What I'm trying to do is so simple and straightforward: I want to be able to
append NULL to a list, and, after the appending, have the last element of
the list be NULL.  Is that so unreasonable?  How can it be done?

TIA!

Kynn


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to insert NULLs in lists?

2009-05-22 Thread markleeds

   Hi Kynn: this oddity is discussed in Patrick Burns's document called "The R
   Inferno". I don't recall the fix, so I'm not sure if the below is the same as
   what his book says to do, but it seems to do what you want.
   x <- list()
   x[[1]] <- 2
   x
   length(x)
   print(str(x))
   x[2] <- list(NULL)
   x
   length(x)
   print(str(x))
   Still, I would look at that document rather than just using the above,
   because I'm not an expeRt, so there might be a better way.
   Also, if you can't find Pat's website, let me know and I'll find it. I'm
   pretty sure that, if you google "Patrick Burns", his site should be up at
   the top, and from there his R Inferno is easy to find. It's quite a useful
   document, so I highly recommend it.

   On May 22, 2009, Kynn Jones kyn...@gmail.com wrote:

 I'm an experienced programmer, but learning R is making me lose the little
 hair I have left...
  list(NULL)
 [[1]]
 NULL
  length(list(NULL))
 [1] 1
  x <- list()
  x[[1]] <- NULL
  x
 list()
  length(x)
 [1] 0
 From the above experiment, it is clear that, although one can create a
 one-element list consisting of a NULL element, one can't get the same
 result
 by assigning NULL to the first element of an empty list. And it gets
 worse:
  x <- list(1, 2, 3)
  length(x)
 [1] 3
  x[[2]] <- NULL
  length(x)
 [1] 2
 I just could NOT believe my eyes! Am I going crazy???
 What I'm trying to do is so simple and straightforward: I want to be able
 to
 append NULL to a list, and, after the appending, have the last element of
 the list be NULL. Is that so unreasonable? How can it be done?
 TIA!
 Kynn
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need a faster function to replace missing data

2009-05-22 Thread Tim Clark

Jim,

Thanks!  I like the way you use indexing instead of the loops.  However, the 
findInterval function does not give the right result.  I have been playing 
with it, and it seems to give the closest number that is less than the one of 
interest.  In this case, the correct replacement should have been 40, not 30, 
since 12:15 from mygarmin is closer to 12:14 in myvscan than 12:10.  Is there a 
way to get the function to find the closest value instead of the next 
smaller one?  I was trying to use which.min to get the closest date but can't 
seem to get it to work right either.

Aloha,

Tim


Tim Clark
Department of Zoology 
University of Hawaii


--- On Fri, 5/22/09, jim holtman jholt...@gmail.com wrote:

 From: jim holtman jholt...@gmail.com
 Subject: Re: [R] Need a faster function to replace missing data
 To: Tim Clark mudiver1...@yahoo.com
 Cc: r-help@r-project.org
 Date: Friday, May 22, 2009, 7:24 AM
 I think this does what you
 want.  It uses 'findInterval' to determine where a
 possible match is:
  
 
 myvscan <- data.frame(c(1,NA,1.5),
     as.POSIXct(c("12:00:00","12:14:00","12:20:00"), format="%H:%M:%S"))
 # convert to numeric
 names(myvscan) <- c("Latitude","DateTime")
 myvscan$tn <- as.numeric(myvscan$DateTime)  # numeric for findInterval
 
 mygarmin <- data.frame(c(20,30,40),
     as.POSIXct(c("12:00:00","12:10:00","12:15:00"), format="%H:%M:%S"))
 names(mygarmin) <- c("Latitude","DateTime")
 mygarmin$tn <- as.numeric(mygarmin$DateTime)
 
 # use 'findInterval'
 na.indx <- which(is.na(myvscan$Latitude))  # find NAs
 
 # replace with garmin latitude
 myvscan$Latitude[na.indx] <-
     mygarmin$Latitude[findInterval(myvscan$tn[na.indx], mygarmin$tn)]
 
 myvscan
   Latitude            DateTime         tn
 1      1.0 2009-05-22 12:00:00 1243008000
 2     30.0 2009-05-22 12:14:00 1243008840
 3      1.5 2009-05-22 12:20:00 1243009200
  
 
 
 
 On Fri, May 22, 2009 at 12:45 AM,
 Tim Clark mudiver1...@yahoo.com
 wrote:
 
 
 Dear List,
 
 I need some help in coming up with a function that will
 take two data sets, determine if a value is missing in one,
 find a value in the second that was taken at about the same
 time, and substitute the second value in for where the first
 should have been.  My problem is from a fish tracking
 study.  We put acoustic tags in fish and track them for
 several days.  Location data is supposed to be
 automatically recorded every time we detect a
 ping from the fish.  Unfortunately the GPS had
 some problems, and sometimes the fish's depth was recorded
 but not its location.  I fortunately had a back-up GPS that
 was taking location data every five minutes.  I would like
 to merge the two files, replacing the missing value in the
 vscan (automatic) file with the location from the garmin
 file.  Since we were getting vscan records every 1-2
 seconds and garmin records every 5 minutes, I need to find
 the right place in the vscan file to place the garmin record
 - i.e. the
 
  closest in time, but not greater than 5 minutes.  I have
 written a function that does this. However, it works with my
 test data but locks up my computer with my real data.  I
 have several million vscan records and several thousand
 garmin records.  Is there a better way to do this?
 
 
 
 My function and test data:
 
 myvscan <- data.frame(c(1,NA,1.5), times(c("12:00:00","12:14:00","12:20:00")))
 names(myvscan) <- c("Latitude","DateTime")
 
 mygarmin <- data.frame(c(20,30,40), times(c("12:00:00","12:10:00","12:15:00")))
 names(mygarmin) <- c("Latitude","DateTime")
 
 minute.diff <- 1/24/12   # time diff is in days, so this is 5 minutes
 
 for (k in 1:nrow(myvscan))
 {
 if (is.na(myvscan$Latitude[k]))
 {
 if ((min(abs(mygarmin$DateTime - myvscan$DateTime[k]))) < minute.diff)
 {
 index.min.date <- which.min(abs(mygarmin$DateTime - myvscan$DateTime[k]))
 myvscan$Latitude[k] <- mygarmin$Latitude[index.min.date]
 }}}
 
 I appreciate your help and advice.
 
 Aloha,
 
 Tim
 
 
 
 
 Tim Clark
 Department of Zoology
 University of Hawaii
 
 __
 
 R-help@r-project.org
 mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 
 and provide commented, minimal, self-contained,
 reproducible code.
 
 
 
 
 -- 
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390
 
 What is the problem that you are trying to solve?
 
 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rank reduction method in time series analysis?

2009-05-22 Thread Michael
Hi all,

Suppose I have 100 simultaneous time series. What's the best
statistical procedure to figure out a transformation of the data and
see whether we could squeeze most of the information into a few transformed
time series?
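
One standard candidate, offered only as a hedged illustration, is
principal components on the series matrix: the leading components carry
most of the shared variance:

## ts.mat: one column per series (simulated stand-in for the 100 series)
set.seed(1)
ts.mat <- matrix(rnorm(500 * 100), nrow = 500, ncol = 100)
pc <- prcomp(ts.mat, center = TRUE, scale. = TRUE)
summary(pc)$importance[3, 1:5]  # cumulative variance of the first 5 PCs
reduced <- pc$x[, 1:5]          # a few transformed series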

Thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need a faster function to replace missing data

2009-05-22 Thread jim holtman
Here is a modification that should now find the closest:


myvscan <- data.frame(c(1,NA,1.5),
    as.POSIXct(c("12:00:00","12:14:00","12:20:00"), format="%H:%M:%S"))
# convert to numeric
names(myvscan) <- c("Latitude","DateTime")
myvscan$tn <- as.numeric(myvscan$DateTime)  # numeric for findInterval

mygarmin <- data.frame(c(20,30,40),
    as.POSIXct(c("12:00:00","12:10:00","12:15:00"), format="%H:%M:%S"))
names(mygarmin) <- c("Latitude","DateTime")
mygarmin$tn <- as.numeric(mygarmin$DateTime)

# use 'findInterval'
na.indx <- which(is.na(myvscan$Latitude))  # find NAs

# create matrix of values to test the range
indices <- findInterval(myvscan$tn[na.indx], mygarmin$tn)
x <- cbind(indices,
    abs(myvscan$tn[na.indx] - mygarmin$tn[indices]),      # lower neighbour
    abs(myvscan$tn[na.indx] - mygarmin$tn[indices + 1]))  # higher neighbour
# now determine which index is closer
closest <- x[,1] + (x[,2] > x[,3])  # add 1 when the higher neighbour is nearer
# replace with garmin latitude
myvscan$Latitude[na.indx] <- mygarmin$Latitude[closest]



 myvscan
  Latitude            DateTime         tn
1      1.0 2009-05-23 12:00:00 1243080000
2     40.0 2009-05-23 12:14:00 1243080840
3      1.5 2009-05-23 12:20:00 1243081200
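
One hedged caveat on the sketch above: if an NA timestamp falls before
the first or after the last garmin fix, indices can be 0 or indices + 1
can run past the end of the table. Clamping both ends keeps the lookup
valid:

indices <- pmax(findInterval(myvscan$tn[na.indx], mygarmin$tn), 1)  # clamp low
hi      <- pmin(indices + 1, nrow(mygarmin))                        # clamp high
nearer  <- abs(myvscan$tn[na.indx] - mygarmin$tn[hi]) <
           abs(myvscan$tn[na.indx] - mygarmin$tn[indices])
closest <- ifelse(nearer, hi, indices)
myvscan$Latitude[na.indx] <- mygarmin$Latitude[closest]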



On Fri, May 22, 2009 at 7:39 PM, Tim Clark mudiver1...@yahoo.com wrote:


 Jim,

 Thanks!  I like the way you use indexing instead of the loops.  However,
 the findInterval function does not give the right result.  I have been
 playing with it and it seems to give the closest number that is less than
 the one of interest.  In this case, the correct replacement should have been
 40, not 30, since 12:15 from mygarmin is closer to 12:14 in myvscan than
 12:10.  Is there a way to get the function to find the closest in value
 instead of the next smaller value?  I was trying to use which.min to get the
 closest date but can't seem to get it to work right either.

 Aloha,

 Tim


 Tim Clark
 Department of Zoology
 University of Hawaii


 --- On Fri, 5/22/09, jim holtman jholt...@gmail.com wrote:

  From: jim holtman jholt...@gmail.com
  Subject: Re: [R] Need a faster function to replace missing data
  To: Tim Clark mudiver1...@yahoo.com
  Cc: r-help@r-project.org
  Date: Friday, May 22, 2009, 7:24 AM
   I think this does what you
  want.  It uses 'findInterval' to determine where a
  possible match is:
 
  
 
 myvscan <- data.frame(c(1,NA,1.5),
     as.POSIXct(c("12:00:00","12:14:00","12:20:00"), format="%H:%M:%S"))
  # convert to numeric
  names(myvscan) <- c("Latitude","DateTime")
  myvscan$tn <- as.numeric(myvscan$DateTime)  # numeric for findInterval
 
 mygarmin <- data.frame(c(20,30,40),
     as.POSIXct(c("12:00:00","12:10:00","12:15:00"), format="%H:%M:%S"))
  names(mygarmin) <- c("Latitude","DateTime")
  mygarmin$tn <- as.numeric(mygarmin$DateTime)
 
  # use 'findInterval'
  na.indx <- which(is.na(myvscan$Latitude))  # find NAs
 
  # replace with garmin latitude
  myvscan$Latitude[na.indx] <-
      mygarmin$Latitude[findInterval(myvscan$tn[na.indx], mygarmin$tn)]
 
  myvscan
    Latitude            DateTime         tn
  1      1.0 2009-05-22 12:00:00 1243008000
  2     30.0 2009-05-22 12:14:00 1243008840
  3      1.5 2009-05-22 12:20:00 1243009200
  
 
 
 
  On Fri, May 22, 2009 at 12:45 AM,
  Tim Clark mudiver1...@yahoo.com
  wrote:
 
 
  Dear List,
 
  I need some help in coming up with a function that will
  take two data sets, determine if a value is missing in one,
  find a value in the second that was taken at about the same
  time, and substitute the second value in for where the first
  should have been.  My problem is from a fish tracking
  study.  We put acoustic tags in fish and track them for
  several days.  Location data is supposed to be
  automatically recorded every time we detect a
  ping from the fish.  Unfortunately the GPS had
  some problems, and sometimes the fish's depth was recorded
  but not its location.  I fortunately had a back-up GPS that
  was taking location data every five minutes.  I would like
  to merge the two files, replacing the missing value in the
  vscan (automatic) file with the location from the garmin
  file.  Since we were getting vscan records every 1-2
  seconds and garmin records every 5 minutes, I need to find
  the right place in the vscan file to place the garmin record
  - i.e. the
 
   closest in time, but not greater than 5 minutes.  I have
  written a function that does this. However, it works with my
  test data but locks up my computer with my real data.  I
  have several million vscan records and several thousand
  garmin records.  Is there a better way to do this?
 
 
 
  My function and test data:
 
 
 myvscan <- data.frame(c(1,NA,1.5), times(c("12:00:00","12:14:00","12:20:00")))
  names(myvscan) <- c("Latitude","DateTime")
 
 
 mygarmin <- data.frame(c(20,30,40), times(c("12:00:00","12:10:00","12:15:00")))
  names(mygarmin) <- c("Latitude","DateTime")
 
  minute.diff <- 1/24/12   # time diff is in days, so this is 5 minutes
 
  for (k in 1:nrow(myvscan))
  {
  if (is.na(myvscan$Latitude[k]))
  {
  if 

Re: [R] stripchart and ordering of plot

2009-05-22 Thread Andrew Yee
Thanks, that works great.
Andrew

On Fri, May 22, 2009 at 5:42 PM, William Dunlap wdun...@tibco.com wrote:

  -Original Message-
  From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] On Behalf Of Andrew Yee
  Sent: Friday, May 22, 2009 2:16 PM
  To: r-h...@stat.math.ethz.ch
  Subject: [R] stripchart and ordering of plot
 
  Take for example the following stripchart that's created:
 
  b <- 1:5
  a <- 11:15
  e <- 21:25
  f <- -11:-15
 
  foo <- rbind(b, a, e, f)
 
  stripchart(foo ~ rownames(foo))
 
  In this case, I would like the bottom part of the plot to be
  the f vector,
  followed by the e vector, etc.
 
  However, R reorders the matrix and plots according to the
  alphabetical order
  of the rownames of foo.  Is there a way to plot the matrix in
  the current
  order of the rownames?

 If you supply a factor it will use the order of the factor
 levels.  If you supply a character vector, it converts it
 into a factor and the converter puts the levels in alphabetical
 order by default.  Hence one solution is to change that
 character vector into  a factor with the desired order
 of levels:
   stripchart(foo ~ factor(rownames(foo), levels=c("f","e","a","b")))

 Bill Dunlap
 TIBCO Software Inc - Spotfire Division
 wdunlap tibco.com

 
  Thanks,
  Andrew
 
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.