Re: [R] Interpolating / smoothing missing time series data

2005-09-08 Thread Thomas Petzoldt
Francisco J. Zagmutt wrote:
 I don't have much experience in the subject but it seems that library(akima) 
 should be useful for your problem. Try library(help=akima) to see a list 
 of the functions available in the library.
 
 I hope this helps
 
 Francisco

Yes, function aspline() of package akima is well suited for such things: 
it avoids the wiggles of spline() and reduces variance less than approx() 
does. But in any case: excessive interpolation will definitely lead to 
biased results, in particular artificial autocorrelations.
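
For illustration, a minimal sketch of the idea (the toy series and output
grid are made up, not from the original question):

library(akima)   # provides aspline()

## toy irregular series with a gap, for illustration only
x <- c(1, 2, 3, 6, 7, 8)
y <- c(2.1, 2.3, 2.2, 3.0, 3.1, 2.9)
xout <- 1:8

fit.lin <- approx(x, y, xout = xout)    # linear interpolation, shrinks variance
fit.asp <- aspline(x, y, xout = xout)   # Akima spline, fewer wiggles than spline()

plot(x, y)
lines(fit.lin, lty = 2)
lines(fit.asp, col = "red")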

If at all possible, David should look for methods capable of dealing with 
missing data directly.

Thomas P.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Creating very small plots (2.5 cm wide) in Sweave

2005-09-08 Thread Andrew Robinson
Dear Francisco,

thanks for your solution.  It turns out that it's best for me to use

\setkeys{Gin}{width=0.15\textwidth}

directly before I call the plot - that seems to work just fine.
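
For context, a minimal Sweave fragment along these lines (the chunk name and
options are illustrative; the data are taken from the example quoted below):

\setkeys{Gin}{width=0.15\textwidth}
<<thumb, fig=TRUE, echo=FALSE>>=
y <- c(40, 46, 39, 44, 23, 36, 70, 39, 30, 73, 53, 74)
x <- c(6, 4, 3, 6, 1, 5, 6, 2, 1, 8, 4, 6)
opar <- par(mar=c(3,3,0,0))
plot(x, y, xlab="", ylab="")
abline(h=mean(y), col="red")
par(opar)
@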

Andrew

On Thu, Sep 08, 2005 at 05:44:59AM +, Francisco J. Zagmutt wrote:
 Others may propose more elegant solutions but, in Windows, one quick and 
 dirty option would be to change the arguments 'pin' and 'fin' within par to 
 get an image of exactly 1 inch (2.54 cm), i.e.
 
 y <- c(40, 46, 39, 44, 23, 36, 70, 39, 30, 73, 53, 74)
 x <- c(6, 4, 3, 6, 1, 5, 6, 2, 1, 8, 4, 6)
 par(pin=c(1,1), fin=c(1,1))
 plot(x, y, xlab="", ylab="")
 abline(h=mean(y), col="red")
 
 #Save the plot in bmp format
 savePlot("myplot", type="bmp")
 
 and then manually crop the picture using your favorite picture package or 
 even within a word processor.
 
 I hope this helps
 
 Francisco
 
 
 From: Andrew Robinson [EMAIL PROTECTED]
 To: R-Help Discussion r-help@stat.math.ethz.ch
 Subject: [R] Creating very small plots (2.5 cm wide) in Sweave
 Date: Thu, 8 Sep 2005 13:40:17 +1000
 
 Hi everyone,
 
 I was wondering if anyone has any code they could share for creating
 thumbnail plots in Sweave.  For example, I'd like a plot like the
 following:
 
 y <- c(40, 46, 39, 44, 23, 36, 70, 39, 30, 73, 53, 74)
 x <- c(6, 4, 3, 6, 1, 5, 6, 2, 1, 8, 4, 6)
 opar <- par(mar=c(3,3,0,0))
 plot(x, y, xlab="", ylab="")
 abline(h=mean(y), col="red")
 par(opar)
 
 to come out about 2.5 cm wide.
 
 Thanks for any assistance,
 
 Andrew
 --
 Andrew Robinson
 Senior Lecturer in Statistics   Tel: +61-3-8344-9763
 Department of Mathematics and StatisticsFax: +61-3-8344-4599
 University of Melbourne, VIC 3010 Australia
 Email: [EMAIL PROTECTED]Website: 
 http://www.ms.unimelb.edu.au
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 

-- 
Andrew Robinson
Senior Lecturer in Statistics   Tel: +61-3-8344-9763
Department of Mathematics and StatisticsFax: +61-3-8344-4599
University of Melbourne, VIC 3010 Australia
Email: [EMAIL PROTECTED]Website: http://www.ms.unimelb.edu.au

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Text Size in Legend

2005-09-08 Thread Chris Diehl
Hello,

I need to reduce the size of the text in a legend since the legend is 
overlapping with the curves in my plot. I've not been able to identify any 
way to achieve this in the documentation. Anyone have any suggestions on how 
to scale down the text or the overall legend?

Thanks in advance for your help!

Chris Diehl

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Text Size in Legend

2005-09-08 Thread Spyridoula Tsonaka
Hi Chris,

To change the scale of the whole legend you can use the argument 'cex' in 
legend.
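
A small made-up example of the idea (not from the original post):

x <- 1:10
plot(x, x, type = "l")
lines(x, 3 * sqrt(x), lty = 2)
## cex = 0.7 shrinks the legend text and, with it, the legend box
legend(1, 10, legend = c("linear", "square root"), lty = 1:2, cex = 0.7)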

I hope this helps!

Regards,
Roula

=
Spyridoula Tsonaka
Doctoral Student
Biostatistical Centre
Catholic University of Leuven
Kapucijnenvoer 35
B-3000 Leuven
Belgium
Tel: +32/16/336899
Fax: +32/16/337015


- Original Message - 
From: Chris Diehl [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Wednesday, September 07, 2005 10:31 PM
Subject: [R] Text Size in Legend


 Hello,

 I need to reduce the size of the text in a legend since the legend is
 overlapping with the curves in my plot. I've not been able to identify any
 way to achieve this in the documentation. Anyone have any suggestions on 
 how
 to scale down the text or the overall legend?

 Thanks in advance for your help!

 Chris Diehl

 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Using R map data to determine associated state for a coordinate?

2005-09-08 Thread Werner Wernersen
Hi!

I have no idea if this is maybe an easy task utilizing
R since I read there is 
geographical map data in some package:

I have a huge number of geographical points with their
coordinates in Germany. 
Now I want to determine for each point in which
Bundesland = state it is located.

Can anybody tell me if this is doable relatively easy
in R and if so give me 
some hints or links how to do it?

Thanks a million,
   Werner

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Too long to display problem

2005-09-08 Thread Wuming Gong
Dear list, 

I used read.xls in gdata package to read a worksheet in which certain
field contains very long character strings (nucleotide sequences,
nchar > 10,000). Then, the values in these fields are automatically
converted to "TOO LONG TO DISPLAY". How can I get those original
characters instead of "TOO LONG TO DISPLAY"?

Thanks,

Wuming

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Survival model with cross-classified shared frailties

2005-09-08 Thread Shige Song
Dear All,

The coxph function in the survival package allows multiple frailty 
terms. In all the examples I saw, however, the frailty terms are nested. 
What will happen if I have non-nested (that is, cross-classified) frailties 
in the model? Will the model still work? Do I need to take special care 
when specifying these models? Thanks!

Shige

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] variables from command line

2005-09-08 Thread Martin Maechler
>>>>> "Omar" == Omar Lakkis [EMAIL PROTECTED]
>>>>>     on Wed, 7 Sep 2005 10:47:43 -0400 writes:

Omar How can I pass parameters to an R script from the
Omar command line. And how can I read them from within the
Omar script?

Omar This is how I want to invoke the script: R CMD BATCH
Omar r.in r.out input values

Omar The script will read in the input values, process them
Omar and spit the output to r.out.

I think commandArgs() should solve this.
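
A minimal sketch of that idea (the argument handling and the invocation shown
here are illustrative, not part of Martin's reply):

## r.in -- keep only what follows --args on the command line
args <- commandArgs()
pos  <- match("--args", args)
args <- if (is.na(pos)) character(0) else args[-seq_len(pos)]
print(args)

## one possible invocation:
##   R --no-save --args input values < r.in > r.out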

Regards,
Martin Maechler

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] FW: Re: Doubt about nested aov output

2005-09-08 Thread Ken Knoblauch
Your response nicely clarifies a question that I've had for a long time,
but which I've dealt
with by giving each subject a unique label.  Unless I'm missing something,
both techniques should
work as the toy example below gives exactly the same output in all 3 cases
below (forgetting
about the convergence problem).  Would there be a reason to prefer
labeling the levels
one way or another or is it just a matter of convenience?

library(lme4)   # lmer() lives in lme4
y <- rnorm(15)
cond <- gl(3, 5, 15)
obs <- gl(15, 1)
subj <- gl(5, 1, 15)
dd <- data.frame(y = y, cond = cond, obs = obs, subj = subj)

l1 <- lmer(y~cond + (1|cond:obs), dd)
l2 <- lmer(y~cond + (1|cond:subj), dd)
l3 <- lmer(y~cond + (1|obs), dd)

Douglas Bates wrote:

The difference between models like
  lmer(Glycogen~Treatment+(1|Rat)+(1|Rat:Liver))
and
  lmer(Glycogen~Treatment+(1|Treatment:Rat)+(1|Treatment:Rat:Liver))

is more about the meaning of the levels of Rat than about the
meaning of Treatment.  As I understood it there are three different
rats labelled 1.  There is a rat 1 on treatment 1 and a rat 1 on
treatment 2 and a rat 1 on treatment 3.  Thus the levels of Rat do not
designate the experimental unit, it is the levels of Treatment:Rat
that do this.

-- 
Ken Knoblauch
Inserm U371
Cerveau et Vision
Dept. of Cognitive Neuroscience
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.lyon.inserm.fr/371/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Too long to display problem

2005-09-08 Thread Wuming Gong
Dear list, 

Please ignore this thread - the "TOO LONG TO DISPLAY" value is produced by
another tool when parsing the data sets. Sorry for this ...

Wuming

On 9/8/05, Wuming Gong [EMAIL PROTECTED] wrote:
 Dear list,
 
 I used read.xls in gdata package to read a worksheet in which certain
 field contains very long character strings (nucleotide sequences,
 nchar > 10,000). Then, the values in these fields are automatically
 converted to "TOO LONG TO DISPLAY". How can I get those original
 characters instead of "TOO LONG TO DISPLAY"?
 
 Thanks,
 
 Wuming


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Using R map data to determine associated state for a coordinate?

2005-09-08 Thread Thomas Petzoldt
Werner Wernersen wrote:
 Hi!
 
 I have no idea if this is maybe an easy task utilizing
 R since I read there is 
 geographical map data in some package:
 
 I have a huge number of geographical points with their
 coordinates in Germany. 
 Now I want to determine for each point in which
 Bundesland = state it is located.
 
 Can anybody tell me if this is doable relatively easy
 in R and if so give me 
 some hints or links how to do it?
 
 Thanks a million,
Werner

Hello Werner,

two building blocks, but don't know if the precision meets your needs.

1. Do you have a good map of Germany *with* Federal States?

* If YES and if it's free:
==> I would be interested! Please post its source.

* If NO:
==> Downloadable map data are available at:
 http://www.vdstech.com/map_data.htm

2. The following approach reads and converts a shapefile with functions 
from maptools and then follows the example of inside.owin() from the 
spatstat package.

Hope that helps

Thomas Petzoldt


##
library(maptools)
library(spatstat)

ger <- read.shape("germany.shp")
plot(ger)

pger <- Map2poly(ger)
sx <- pger[[13]]
lines(sx, type="l", col="red") # Saxony ;-)

## Create an owin (observation window) object
# direction of coordinates must be reversed, in some cases
# if error message: remove rev()'s
saxony <- owin(poly=list(x=rev(sx[,1]), y=rev(sx[,2])))

# random points in rectangle
x <- runif(1000, min= 6, max=15)
y <- runif(1000, min=46, max=56)

ok <- inside.owin(x, y, saxony)

points(x[ok], y[ok])
points(x[!ok], y[!ok], pch=".")
##

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Interpolating / smoothing missing time series data

2005-09-08 Thread Sean Davis
On 9/7/05 10:19 PM, Gabor Grothendieck [EMAIL PROTECTED] wrote:

 On 9/7/05, David James [EMAIL PROTECTED] wrote:
 The purpose of this email is to ask for pre-built procedures or
 techniques for smoothing and interpolating missing time series data.
 
 I've made some headway on my problem in my spare time.  I started
 with an irregular time series with lots of missing data.  It even had
 duplicated data.  Thanks to zoo, I've cleaned that up -- now I have a
 regular time series with lots of NA's.
 
 I want to use a regression model (e.g. ARIMA) to fill in the gaps.  I
 am certainly open to other suggestions, especially if they are easy
 to implement.
 
 My specific questions:
 1.  Presumably, once I get ARIMA working, I still have the problem of
 predicting the past missing values -- I've only seen examples of
 predicting into the future.
 2.  When predicting the past (backcasting), I also want to take
 reasonable steps to make the data look smooth.
 
 I guess I'm looking for a really good example in a textbook or white
 paper (or just an R guru with some experience in this area) that can
 offer some guidance.
 
 Venables and Ripley was a great start (Modern Applied Statistics with
 S).  I really had hoped that the Seasonal ARIMA Models section on
 page 405 would help.  It was helpful, but only to a point.  I have a
 hunch (based on me crashing arima numerous times -- maybe I'm just
 new to this and doing things that are unreasonable?) that using
 hourly data just does not mesh well with the seasonal arima code?
 
 Not sure if this answers your question but if you are looking for something
 simple then na.approx in the zoo package will linearly interpolate for you.
 
 > z <- zoo(c(1,2,NA,4,5))
 > na.approx(z)
 1 2 3 4 5 
 1 2 3 4 5

Alternatively, if you are looking for more smoothing, you could look at
using a moving average or median applied at points of interest with an
appropriate window size--see wapply in the gplots package (gregmisc
bundle).  There are a number of other functions that can accomplish the same
task.  A search for "moving window" or "moving average" in the archives may
produce some other ideas.
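
In base R alone, a rough sketch of the moving-window idea (the toy data and
window widths are made up for illustration):

set.seed(1)
z <- cumsum(rnorm(100))                         # toy regular series

ma <- stats::filter(z, rep(1/7, 7), sides = 2)  # centred 7-point moving average
md <- runmed(z, k = 7)                          # running median, window of 7

plot(z, type = "l", col = "grey")
lines(ma, col = "red")
lines(md, col = "blue", lty = 2)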

Sean

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Using R map data to determine associated state for a coordinate?

2005-09-08 Thread Roger Bivand
On Thu, 8 Sep 2005, Thomas Petzoldt wrote:

 Werner Wernersen wrote:
  Hi!
  
  I have no idea if this is maybe an easy task utilizing
  R since I read there is 
  geographical map data in some package:
  
  I have a huge number of geographical points with their
  coordinates in Germany. 
  Now I want to determine for each point in which
  Bundesland = state it is located.
  
  Can anybody tell me if this is doable relatively easy
  in R and if so give me 
  some hints or links how to do it?
  
  Thanks a million,
 Werner
 
 Hello Werner,
 
 two building blocks, but don't know if the precision meets your needs.
 
 1. Do you have a good map of Germany *with* Federal States?
 
 * If YES and if it's free:
 == I would be interested! Please post it's source.
 
 * If NO:
 == Downloadable map data are available on:
  http://www.vdstech.com/map_data.htm
 
 2. The following approach reads and converts a shapefile with functions 
 from maptools and then follows the example of inside.owin() from the 
 spatstat package.

The example code will work very well, but since yesterday, when we
released a new version of maptools depending on the sp package, it can
look like:

> library(maptools)
Loading required package: foreign
Loading required package: sp
> nc <- readShapePoly(system.file("shapes/sids.shp", package="maptools")[1])
> plot(nc, lwd=2, border="grey")
> bbox(nc)
         min       max
r1 -84.32385 -75.45698
r2  33.88199  36.58965
> x <- runif(1000, min=-84.32385, max=-75.45698)
> y <- runif(1000, min=33.88199, max=36.58965)
> xypts <- SpatialPoints(cbind(x, y))
> plot(xypts, add=TRUE, pch=19, cex=0.2)
> where_am_i <- overlay(xypts, nc)
> plot(xypts[is.na(where_am_i),], add=TRUE, pch=19, cex=0.2, col="grey80")
> summary(where_am_i)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
   1.00   29.00   51.00   53.11   77.00  100.00  450.00 

and in this case the points would be read into the SpatialPoints object 
directly. Should overlay have trouble with the huge number of points, 
you could take them in smaller batches, storing the intermediate results. 
As Thomas said, you need the boundaries of the Bundesland first, and the 
accuracy of your results will depend on the degree of detail of the 
boundary polygons.

Roger Bivand

 
 Hope that helps
 
 Thomas Petzoldt
 
 
 ##
 library(maptools)
 library(spatstat)
 
 ger <- read.shape("germany.shp")
 plot(ger)
 
 pger <- Map2poly(ger)
 sx <- pger[[13]]
 lines(sx, type="l", col="red") # Saxony ;-)
 
 ## Create an owin (observation window) object
 # direction of coordinates must be reversed, in some cases
 # if error message: remove rev()'s
 saxony <- owin(poly=list(x=rev(sx[,1]), y=rev(sx[,2])))
 
 # random points in rectangle
 x <- runif(1000, min= 6, max=15)
 y <- runif(1000, min=46, max=56)
 
 ok <- inside.owin(x, y, saxony)
 
 points(x[ok], y[ok])
 points(x[!ok], y[!ok], pch=".")
 ##
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Prediction with multiple zeros in the dependent variable

2005-09-08 Thread Ted Harding
On 08-Sep-05 John Sorkin wrote:
 I have a batch of data in each line of data contains three values,
 calcium score, age, and sex. I would like to predict calcium scores
 as a function of age and sex, i.e. calcium=f(age,sex). Unfortunately
 the calcium scorers have a very ugly distribution. There are
 multiple zeros, and multiple values between 300 and 600. There are
 no values between zero and 300. Needless to say, the calcium scores
 are not normally distributed, however, the values between 300 and 600
 have a distribution that is log normal. As you might imagine, the
 residuals from the regression are not normally distributed and thus
 violates the basic assumption of regression analyses. Does anyone
 have a suggestion for a method (or a transformation) that will allow
 me predict calcium from age and sex without violating the assumptions
 of the model?
 Thanks,
 John

From your description (but only from your description) one might be
tempted to suggest (borrowing a term from Joe Shafer) a semi-continuous
model. This means that each observation either takes a discrete value,
or takes a value with a continuous distribution. In your case this
might be

Score = 0 with probability p which is a function of Age and Sex
Score = X with probability (1-p) where X has a log-normal distribution.

Whether using such a model, for data arising in the context you refer
to, is reasonable  depends on whether Calcium Score = 0 is a reasonable
description of a biological state of things. Even if not a reasonable
biological state, it may be a reasonable description of the outcome
of a measurement process (e.g. too small to measure), in which case
there may be a consequential issue -- what is the likely distribution
of calcium values which give rise to Score = 0? (Though your data may
be uninformative about this). However, if your aim is simply predicting
calcium scores, then this may be irrelevant.

With such a model, you should be able to make progress by using
a log-linear model for the probability p (which may be adequately
addressed by simply using a logistic regression for the event
Score = 0 or equivalently score != 0, though you may need to
be careful about how you represent Age as a covariate; Sex, being
binary, should not present problems). This then allows you to predict
the probability of zero score, and the complementary probability
of non-zero score.

Then you can consider the problem of estimating the relationship
between Score and (Age, Sex) conditional on Score != 0.

This, in turn, is no more (and no less!) complicated than estimating
the continuous distribution of non-zero scores from the subset of
the data which carries such scores.

If the distribution of non-zero scores were (as you suggest) a simple
log-normal distribution, then a regression of log(Score) on Age and
Sex might do well.

However, from your description, it may not be a simple log-normal.

The absence of scores between 0 and 300, and the containment of
score values betweem 300 and 600, suggests a 3-parameter log-normal
in which, as well as the mean and SD for the normal distribution of
log(X) there is also a lower limit S0, so that it is

  log(S - S0)

which has the N(mean,SD^2) distribution. The distribution might be
more complicated than this.

So, in summary, provided a semi-continuous model is acceptable,
you can proceed by estimating its two aspects separately: The
discrete part by a logistic (or other suitable binary) regression,
using 'glm' in R; the continuous part by a suitable regression
(using e.g. 'lm' in R) perhaps after suitable transformation
(though this may need care). In each case, it is only the relevant
part of the data (the proportions with Score = 0 and Score != 0
on the one hand, the values of Score where Score != 0 on the other
hand, in each case using the corresponding (Age, Sex) as covariates)
which will be needed.

Once you have these estimated models, they can be used straightforwardly
for prediction: Given Age and Sex, the Score will be zero with
estimated probability p(Age,Sex) or, with probability (1 - p(Age,Sex)),
will have a distribution implied by your regression.

So the structure of the predicted values will be the same as the
structure of the observed values. All very straightforward, provided
this is a reasonable way to go.
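
As a rough sketch of such a two-part fit in R (a hypothetical data frame dat
with columns calcium, age and sex is assumed; none of this is from the
original posts):

## Part 1: discrete part -- probability of a zero score
dat$zero <- as.numeric(dat$calcium == 0)
fit.zero <- glm(zero ~ age + sex, family = binomial, data = dat)

## Part 2: continuous part -- log-normal regression for the non-zero scores
## (per the discussion above, log(calcium - S0) with an estimated lower
## limit S0 may be more appropriate than a plain log)
pos <- subset(dat, calcium > 0)
fit.pos <- lm(log(calcium) ~ age + sex, data = pos)

## Combine the two parts for a predicted mean score at new covariate values
new <- data.frame(age = 65, sex = "F")            # "F" is a made-up level
p0  <- predict(fit.zero, new, type = "response")  # P(score = 0)
mu  <- predict(fit.pos, new)                      # E[log(score) | score > 0]
s2  <- summary(fit.pos)$sigma^2
pred.mean <- (1 - p0) * exp(mu + s2/2)            # lognormal mean x P(score > 0)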

However, there is a complication in that the above might well not
be a reasonable model (as hinted at above). As an example, consider
the following (purely hypothetical assumptions).

1. The true distribution of Calcium Score is (say) simple log-normal
   such that log(Score) is normal with mean linearly dependent on Age
   and Sex, in all subjects.

2. In attempting to measure true Score (i.e. in obtaining observed
   Calcium Score data), there is a probability that Score = 0
   will be obtained, and this probability depends on the true Score
   (e.g. the smaller the true Score, the higher the probability of
   obtaining Score = 0).

The resulting non-zero score data will then 

[R] Time series ARIMAX and multivariate models

2005-09-08 Thread nhy303
Dear List,

The purpose of this e-mail is to ask about R time series procedures - as a
biologist with only basic time series knowledge and about a year's
experience in R.

I have been using ARIMAX models with seasonal components on seasonal data.
 However I am now moving on to annual data (with only 34 time points) and
understand that ARIMA is not suitable for these shorter time periods -
does R have other, more robust, methods?

I have tried looking through the R help pages and documentation for packages
but am unsure what model type is suitable.

Secondly, I wish to start building multivariate time series models in R to
look at how fish condition (for several sizes of fish) is affected by
environmental factors and numbers of prey.  It would be great if someone
could suggest what R packages/documentation would be useful to research?

Thankyou,

Lillian.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] tcltk, X11 protocol error: Bug?

2005-09-08 Thread John Fox
Dear Nicholas,

This problem has been reported before (enter "X11 protocol error" on the R
site search at http://finzi.psych.upenn.edu/search.html to see the previous
threads), but as far as I know, there's no definitive explanation or
solution. As well, things appear to work fine, despite the warnings. The way
I handle the problem in the Rcmdr package is simply to intercept the
warnings.
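
For example, a call can be wrapped so that the spurious warnings never
surface (a generic sketch, not the actual Rcmdr code):

## silence everything raised inside the call
tt <- suppressWarnings(clrRamp())

## or muffle only the X11 noise, letting other warnings through
tt <- withCallingHandlers(clrRamp(),
        warning = function(w) {
            if (grepl("X11 protocol error", conditionMessage(w)))
                invokeRestart("muffleWarning")
        })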

I hope this helps,
 John 


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Nicholas Lewin-Koh
 Sent: Monday, September 05, 2005 5:16 PM
 To: [EMAIL PROTECTED]
 Subject: [R] tcltk, X11 protocol error: Bug?
 
 Hi,
 I am having trouble debugging this one. The code is attached 
 below, but it seems to be a problem at the C-tk interface. If 
 I run this once there are no problems; if I run it more than 
 once I start to get warnings that increase in multiples of 11 
 every time I run it. Here is a sample session:
 
 
 > source("clrramp2.r")
 Loading required package: tcltk
 Loading Tcl/Tk interface ... done
 > clrRamp()
 
 > tt <- clrRamp()
 > tt
 function (n)
 {
     x <- ramp(seq(0, 1, length = n))
     rgb(x[, 1], x[, 2], x[, 3], max = 255)
 }
 <environment: 0x8b8674c>
 > image(matrix(1:10), col=tt(10))
 > tt <- clrRamp()
 There were 22 warnings (use warnings() to see them)
  image(matrix(1:10),col=tt(10))
 There were 11 warnings (use warnings() to see them)
  warnings()
 Warning messages:
 1: X11 protocol error: BadWindow (invalid Window parameter)
 2: X11 protocol error: BadWindow (invalid Window parameter)
 3: X11 protocol error: BadWindow (invalid Window parameter)
 4: X11 protocol error: BadWindow (invalid Window parameter)
 5: X11 protocol error: BadWindow (invalid Window parameter)
 6: X11 protocol error: BadWindow (invalid Window parameter)
 7: X11 protocol error: BadWindow (invalid Window parameter)
 8: X11 protocol error: BadWindow (invalid Window parameter)
 9: X11 protocol error: BadWindow (invalid Window parameter)
 10: X11 protocol error: BadWindow (invalid Window parameter)
 11: X11 protocol error: BadWindow (invalid Window parameter)
 
 I am running R-2.1.1 on ubuntu linux 5.04, compiled from 
 source (not the deb package).
 My version of tcl/tk is 8.4. The code is below. If anyone 
 sees something I am doing foolish let me know, otherwise I 
 will file a bug report.
 
 Nicholas
 
 # File clrramp2.r ##
 
 require(tcltk)
 clrRamp <- function(n.col, b.color=NULL, e.color=NULL){
 
   B.ChangeColor <- function()
   {
     b.color <- tclvalue(tkcmd("tk_chooseColor", initialcolor=e.color,
                               title="Choose a color"))
     if (nchar(b.color) > 0){
       tkconfigure(canvas.b, bg=b.color)
       Rmp.Draw()
     }
   }
 
   E.ChangeColor <- function()
   {
     e.color <- tclvalue(tkcmd("tk_chooseColor", initialcolor=e.color,
                               title="Choose a color"))
     ##cat(e.color)
     if (nchar(e.color) > 0){
       tkconfigure(canvas.e, bg=e.color)
       Rmp.Draw()
     }
   }
 
   Rmp.Draw <- function(){
     cr <- colorRampPalette(c(b.color, e.color), space="Lab",
                            interpolate="spline")
     rmpcol <- cr(n.col)
     #rmpcol <- rgb(rmpcol[,1], rmpcol[,2], rmpcol[,3])
     inc <- 300/n.col
     xl <- 0
     for(i in 1:n.col){
       ##item <-
       tkitemconfigure(canvas.r, barlst[[i]], fill=rmpcol[i], outline=rmpcol[i])
       #xl <- xl+inc
     }
   }
 
   save.ramp <- function(){
     cr <- colorRampPalette(c(b.color, e.color), space="Lab",
                            interpolate="spline")
     tkdestroy(tt)
     ##invisible(cr)
   }
 
   tt <- tktoplevel()
   tkwm.title(tt, "Color Ramp Tool")
   frame <- tkframe(tt)
   bframe <- tkframe(frame, relief="groove", borderwidth=1)
 
   if(is.null(b.color)) b.color <- "blue"
   if(is.null(e.color)) e.color <- "yellow"
   if(missing(n.col)) n.col <- 100
 
   canvas.b <- tkcanvas(bframe, width=50, height=25, bg=b.color)
   canvas.e <- tkcanvas(bframe, width=50, height=25, bg=e.color)
   canvas.r <- tkcanvas(tt, width=300, height=50, bg="white")
 
   BColor.button <- tkbutton(bframe, text="Begin Color",
                             command=B.ChangeColor)
   ##tkgrid(canvas.b, BColor.button)
   EColor.button <- tkbutton(bframe, text="End Color",
                             command=E.ChangeColor)
   killbutton <- tkbutton(bframe, text="Save", command=save.ramp)
   tkgrid(canvas.b, BColor.button, canvas.e, EColor.button)
   tkgrid(bframe)
   tkgrid(frame)
   tkgrid(canvas.r)
   tkgrid(killbutton)
 
   
 cr-colorRampPalette(c(b.color,e.color),space=Lab,interpolat
 e=spline)
   ##rmpcol - hex(mixcolor(alpha,bc,ec,where=LUV))
   rmpcol - cr(n.col)
   inc - 300/n.col
   xl - 0
   #barlst - vector(length=n.col,mode=list)
   barlst - tclArray()
   for(i in 1:n.col){
 item-tkcreate(canvas.r,rect,xl,0,xl+inc,50,
fill=rmpcol[i],outline=rmpcol[i])
 ##tkaddtag(canvas.r, point, withtag, item)

[R] Converting a matrix to a dataframe: how to prevent conversion to factor

2005-09-08 Thread Dennis Fisher
Colleages

I am running R 2.1.0 on a Mac (same problem occurs in Linux).  In  
some situations, I have mixed text/numeric data that is stored as  
characters in a matrix.  If I convert this matrix to a dataframe, the  
numeric data becomes factors, not what I intend.

 TEXT <- paste("Text", 1:4, sep="")
 NUMBERS <- 10 + 4:1
 MATRIX <- cbind(TEXT, NUMBERS)
 FRAME <- as.data.frame(MATRIX)

 > str(FRAME)
 `data.frame':   4 obs. of  2 variables:
 $ TEXT   : Factor w/ 4 levels "Text1","Text2",..: 1 2 3 4
 $ NUMBERS: Factor w/ 4 levels "11","12","13",..: 4 3 2 1

One work-around is to write the matrix (or the dataframe) to a file,  
then read the file back using the as.is argument.
 write.table(MATRIX, "JUNK", row.names=F)
 NEWFRAME <- read.table("JUNK", as.is=T, header=T)

 > str(NEWFRAME)
 `data.frame':   4 obs. of  2 variables:
 $ TEXT   : chr  "Text1" "Text2" "Text3" "Text4"
 $ NUMBERS: int  14 13 12 11

This restores the NUMBERS to their intended mode (integers, not  
factors).  The text column is also not read as a factor (not a  
problem for me).

It appears that the function AsIs [I(x)] would enable me to  
accomplish this without the write/read steps.  However, it is not  
obvious to me how to implement I(x).  Can anyone advise?

Thanks in advance.

Dennis Fisher

Dennis Fisher MD
P  (The P Less Than Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-415-564-2220
www.PLessThan.com


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Prediction with multiple zeros in the dependent variable

2005-09-08 Thread Frank E Harrell Jr
John Sorkin wrote:
 I have a batch of data in each line of data contains three values,
 calcium score, age, and sex. I would like to predict calcium scores as a
 function of age and sex, i.e. calcium=f(age,sex). Unfortunately the
 calcium scorers have a very ugly distribution. There are multiple
 zeros, and multiple values between 300 and 600. There are no values
 between zero and 300. Needless to say, the calcium scores are not
 normally distributed, however, the values between 300 and 600 have a
 distribution that is log normal. As you might imagine, the residuals
 from the regression are not normally distributed and thus violates the
 basic assumption of regression analyses. Does anyone have a suggestion
 for a method (or a transformation) that will allow me predict calcium
 from age and sex without violating the assumptions of the model?
 Thanks,
 John
  
 John Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 Baltimore VA Medical Center GRECC and
 University of Maryland School of Medicine Claude Pepper OAIC

John - first I would try a proportional odds model, with zero as its own 
category, then treating all other values as continuous or collapsing them 
into 20-tiles.  If the PO assumption happens to hold (look at partial 
residual plots) you have a simple solution.
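
A rough sketch of that idea using MASS::polr rather than Frank's own
Design/Hmisc tools (hypothetical data frame dat with columns calcium, age
and sex; the 20-tile grouping follows the suggestion above):

library(MASS)

pos    <- dat$calcium > 0
breaks <- unique(quantile(dat$calcium[pos], probs = seq(0, 1, by = 0.05)))
cats   <- rep(0, nrow(dat))                        # zero scores stay category 0
cats[pos] <- as.integer(cut(dat$calcium[pos], breaks, include.lowest = TRUE))
dat$ycat  <- ordered(cats)                         # ordered response

fit <- polr(ycat ~ age + sex, data = dat)
summary(fit)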

Frank

-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Time Series Analysis: book?

2005-09-08 Thread jfontain
There have been a few questions on the subject lately.
Is there any book on the subject, if possible with a computational flavor,
that you would highly recommend?


Many thanks in advance,


--
Jean-Luc

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Converting a matrix to a dataframe: how to prevent conversion to factor

2005-09-08 Thread Peter Dalgaard
Dennis Fisher [EMAIL PROTECTED] writes:

 Colleages
 
 I am running R 2.1.0 on a Mac (same problem occurs in Linux).  In  
 some situations, I have mixed text/numeric data that is stored as  
 characters in a matrix.  If I convert this matrix to a dataframe, the  
 numeric data becomes factors, not what I intend.
 
  TEXT <- paste("Text", 1:4, sep="")
  NUMBERS <- 10 + 4:1
  MATRIX <- cbind(TEXT, NUMBERS)
  FRAME <- as.data.frame(MATRIX)
 
  > str(FRAME)
  `data.frame':   4 obs. of  2 variables:
  $ TEXT   : Factor w/ 4 levels "Text1","Text2",..: 1 2 3 4
  $ NUMBERS: Factor w/ 4 levels "11","12","13",..: 4 3 2 1
 
 One work-around is to write the matrix (or the dataframe) to a file,  
 then read the file back using the as.is argument.
  write.table(MATRIX, "JUNK", row.names=F)
  NEWFRAME <- read.table("JUNK", as.is=T, header=T)
 
  > str(NEWFRAME)
  `data.frame':   4 obs. of  2 variables:
  $ TEXT   : chr  "Text1" "Text2" "Text3" "Text4"
  $ NUMBERS: int  14 13 12 11
 
 This restores the NUMBERS to their intended mode (integers, not  
 factors).  The text column is also not read as a factor (not a  
 problem for me).
 
 It appears that the function AsIs [I(x)] would enable me to  
 accomplish this without the write/read steps.  However, it is not  
 obvious to me how to implement I(x).  Can anyone advise?

I don't think that is going to help 

There are really several issues here: Your numeric column was
converted to character by the cbind, using as.data.frame(I(MATRIX))
will not split it into individual columns, and things like
apply(MATRIX,2,f) may do the right thing to begin with, but then
there's coercion due to an implicit cbind at the end.

It's a bit awkward, but this may do it:

 > FRAME <- as.data.frame(lapply(split(MATRIX, col(MATRIX)), type.convert))
 > names(FRAME) <- colnames(MATRIX)
 > str(FRAME)
 `data.frame':   4 obs. of  2 variables:
  $ TEXT   : Factor w/ 4 levels "Text1","Text2",..: 1 2 3 4
  $ NUMBERS: int  14 13 12 11

whereas this isn't right:

 > str(apply(MATRIX, 2, type.convert))
  int [1:4, 1:2] 1 2 3 4 14 13 12 11
  - attr(*, "dimnames")=List of 2
   ..$ : NULL
   ..$ : chr [1:2] "TEXT" "NUMBERS"
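
As a side note (not part of Peter's reply): if, as in the example, the
original vectors are still available, the matrix step can be skipped
entirely, and I() answers the question about keeping a character column
as-is:

FRAME2 <- data.frame(TEXT = I(TEXT), NUMBERS = NUMBERS)
str(FRAME2)   # NUMBERS stays numeric; TEXT stays character (class "AsIs")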



-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] FW: Re: Doubt about nested aov output

2005-09-08 Thread Douglas Bates
On 9/8/05, Ken Knoblauch [EMAIL PROTECTED] wrote:
 Your response nicely clarifies a question that I've had for a long time,
 but which I've dealt
 with by giving each subject a unique label.  Unless I'm missing something,
 both techniques should
 work as the toy example below gives exactly the same output in all 3 cases
 below (forgetting
 about the convergence problem).  Would there be a reason to prefer
 labeling the levels
 one way or another or is it just a matter of convenience?
 
  library(lme4)   # lmer() lives in lme4
  y <- rnorm(15)
  cond <- gl(3, 5, 15)
  obs <- gl(15, 1)
  subj <- gl(5, 1, 15)
  dd <- data.frame(y = y, cond = cond, obs = obs, subj = subj)
 
  l1 <- lmer(y~cond + (1|cond:obs), dd)
  l2 <- lmer(y~cond + (1|cond:subj), dd)
  l3 <- lmer(y~cond + (1|obs), dd)

I prefer to have a grouping factor constructed with unique levels for
each distinct unit.  The only reason I mention constructions like
Treatment:Rat in the original part of this thread is that data are
often provided in that form.
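
For example, a unique per-unit label can be built up front (a small sketch
based on the toy data above):

## one distinct level per experimental unit, instead of reusing "1".."5"
dd$unit <- factor(paste(dd$cond, dd$subj, sep = ":"))
l2b <- lmer(y ~ cond + (1 | unit), dd)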

Reusing subject labels within another group is awkward and can be
error prone.  One of the data sets I examine in the MlmSoftRev
vignette of the mlmRev package is called Exam and has student
identifiers within schools.  The student identifiers are not unique
but the school:student combination should be.  It isn't.  These data
have been analyzed in scores of books and articles and apparently none
of the other authors bothered to check this.  There are some
interesting ramifications such as some of the schools that are
classified as mixed-sex schools are likely single-sex schools because
the only student of one of the sexes in that school is apparently
mislabelled.

BTW, in your example you have only one observation per level of 'obs'
so you can't use obs as a grouping factor as this variance component
would be completely confounded with the per-observation noise.

 
 Douglas Bates wrote:
 
 The difference between models like
   lmer(Glycogen~Treatment+(1|Rat)+(1|Rat:Liver))
 and
   lmer(Glycogen~Treatment+(1|Treatment:Rat)+(1|Treatment:Rat:Liver))
 
 is more about the meaning of the levels of Rat than about the
 meaning of Treatment.  As I understood it there are three different
 rats labelled 1.  There is a rat 1 on treatment 1 and a rat 1 on
 treatment 2 and a rat 1 on treatment 3.  Thus the levels of Rat do not
 designate the experimental unit, it is the levels of Treatment:Rat
 that do this.
 
 --
 Ken Knoblauch
 Inserm U371
 Cerveau et Vision
 Dept. of Cognitive Neuroscience
 18 avenue du Doyen Lépine
 69500 Bron
 France
 tel: +33 (0)4 72 91 34 77
 fax: +33 (0)4 72 91 34 61
 portable: +33 (0)6 84 10 64 10
 http://www.lyon.inserm.fr/371/
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] FW: Re: Doubt about nested aov output

2005-09-08 Thread Ken Knoblauch
Thank you for your response.  The single response per observer most probably
explains the complaints that lmer was giving for my example.  Maybe this small
modification provides a better example and corrects a more serious error in my
previous post:

library(lme4)
y <- rnorm(30)
cond <- rep(gl(3, 5, 15), 2)
obs <- rep(gl(15, 1), 2)
subj <- rep(gl(5, 1, 15), 2)
dd <- data.frame(y = y, cond = cond, obs = obs, subj = subj)

l1 <- lmer(y~cond + (1|cond:obs), data=dd)
l2 <- lmer(y~cond + (1|cond:subj), data=dd)
l3 <- lmer(y~cond + (1|obs), dd)

Understanding the notation is often about 99% of the job, and it is
very helpful to see multiple ways to accomplish the same thing.

 Douglas Bates wrote:
 I prefer to have a grouping factor constructed with unique levels for
 each distinct unit.  The only reason I mention constructions like
 Treatment:Rat in the original part of this thread is that data are
 often provided in that form.

 Reusing subject labels within another group is awkward and can be
 error prone.  One of the data sets I examine in the MlmSoftRev
 vignette of the mlmRev package is called Exam and has student
 identifiers within schools.  The student identifiers are not unique
 but the school:student combination should be.  It isn't.  These data
 have been analyzed in scores of books and articles and apparently none
 of the other authors bothered to check this.  There are some
 interesting ramifications such as some of the schools that are
 classified as mixed-sex schools are likely single-sex schools because
 the only student of one of the sexes in that school is apparently
 mislabelled.

 BTW, in your example you have only one observation per level of 'obs'
 so you can't use obs as a grouping factor as this variance component
 would be completely confounded with the per-observation noise.


 Douglas Bates wrote:

 The difference between models like
   lmer(Glycogen~Treatment+(1|Rat)+(1|Rat:Liver))
 and
   lmer(Glycogen~Treatment+(1|Treatment:Rat)+(1|Treatment:Rat:Liver))

 is more about the meaning of the levels of Rat than about the
 meaning of Treatment.  As I understood it there are three different
 rats labelled 1.  There is a rat 1 on treatment 1 and a rat 1 on
 treatment 2 and a rat 1 on treatment 3.  Thus the levels of Rat do not
 designate the experimental unit, it is the levels of Treatment:Rat
 that do this.

 --
 Ken Knoblauch
 Inserm U371
 Cerveau et Vision
 Dept. of Cognitive Neuroscience
 18 avenue du Doyen Lépine
 69500 Bron
 France
 tel: +33 (0)4 72 91 34 77
 fax: +33 (0)4 72 91 34 61
 portable: +33 (0)6 84 10 64 10
 http://www.lyon.inserm.fr/371/





-- 
Ken Knoblauch
Inserm U371
Cerveau et Vision
Dept. of Cognitive Neuroscience
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.lyon.inserm.fr/371/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Prediction with multiple zeros in the dependent variable

2005-09-08 Thread Thomas Lumley
On Thu, 8 Sep 2005, John Sorkin wrote:
 I have a batch of data in each line of data contains three values,
 calcium score, age, and sex. I would like to predict calcium scores as a
 function of age and sex, i.e. calcium=f(age,sex). Unfortunately the
 calcium scorers have a very ugly distribution. There are multiple
 zeros, and multiple values between 300 and 600. There are no values
 between zero and 300. Needless to say, the calcium scores are not
 normally distributed, however, the values between 300 and 600 have a
 distribution that is log normal.

[Coronary artery calcium by EBCT, I presume]

Our approach to modelling calcium scores is to do it in two parts.  First 
fit something like a logistic regression model where the outcome is zero 
vs non-zero calcium.  Then, for the non-zero use something like a linear 
regression model for log calcium.

You could presumably use such a model for prediction or imputation too, 
and you can work out means, medians etc from the two models.

One particular reason for using this two-part model is that we find 
different predictors of zero/non-zero and of amount. This makes biological 
sense -- a factor that makes arterial plaques calcify might well have no 
impact until you have arterial plaques.

Or you could use smooth quantile regression in the rq package.

-thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Survival model with cross-classified shared frailties

2005-09-08 Thread Thomas Lumley
On Thu, 8 Sep 2005, Shige Song wrote:

 Dear All,

 The coxph function in the survival package allows multiple frailty
 terms.

Um, no, it doesn't.

 In all the examples I saw, however, the frailty terms are nested.
 What will happen if I have non-nested (that is, cross-classified) frailties
 in the model?

This wouldn't work even if it did allow multiple frailty terms.


You may want the coxme() function in the kinship package.


-thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] array indices in synced vectors

2005-09-08 Thread Erich Neuwirth
Let us start with the following definitions

xxx <- rep(c(1,2), times=5)
yyy <- rep(c(1,2), each=5)
a <- c(11,12)
b <- matrix(1:4, 2, 2)

a[xxx] produces
[1] 11 12 11 12 11 12 11 12 11 12

b[xxx,yyy] produces
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    1    1    1    1    1    3    3    3    3     3
 [2,]    2    2    2    2    2    4    4    4    4     4
 [3,]    1    1    1    1    1    3    3    3    3     3
 [4,]    2    2    2    2    2    4    4    4    4     4
 [5,]    1    1    1    1    1    3    3    3    3     3
 [6,]    2    2    2    2    2    4    4    4    4     4
 [7,]    1    1    1    1    1    3    3    3    3     3
 [8,]    2    2    2    2    2    4    4    4    4     4
 [9,]    1    1    1    1    1    3    3    3    3     3
[10,]    2    2    2    2    2    4    4    4    4     4

so it does an implicit outer for the indices in xxx and yyy.

sapply(1:length(xxx),function(x)b[xxx[x],yyy[x]])
does what I need and produces
 [1] 1 2 1 2 1 4 3 4 3 4

Is there a function taking xxx,yyy, and b as arguments
producing the same result?

Essentially, I am asking for a version of lapply and/or sapply
which works with functions of more than one argument and takes the
iteration arguments as vectors or lists of equal length.




-- 
Erich Neuwirth, Didactic Center for Computer Science
University of Vienna
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39902 Fax: +43-1-4277-9399

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] array indices in synced vectors

2005-09-08 Thread Thomas Lumley
On Thu, 8 Sep 2005, Erich Neuwirth wrote:

 sapply(1:length(xxx),function(x)b[xxx[x],yyy[x]])
 does what I need and produces
 [1] 1 2 1 2 1 4 3 4 3 4

 Is there a function taking xxx,yyy, and b as arguments
 producing the same result?

b[cbind(xxx,yyy)]

 Essentially, I am asking for a version of lapply and/or sapply
 which works with functions of more than one argument and takes the
 iteration arguments as vectors or lists of equal length.

More generally there is mapply(), but the matrix subscript solution is 
better in this example
 mapply(function(i,j) b[i,j], xxx,yyy)
  [1] 1 2 1 2 1 4 3 4 3 4

-thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Predicting responses using ace

2005-09-08 Thread Luis Pineda
I'm trying to run the print method, but according to the documentation it 
needs as a parameter an object created by summary.areg.boot. The thing is 
that summary.areg.boot gives me the following error: "Error in bootj[, 1] : 
incorrect number of dimensions", when I do the simple call summary(ace.r).

I started the debug browser to see what was going on inside and I noticed 
that 'bootj' is a numeric variable with the same number of elements as the 
'evaluation' parameter for areg.boot. What I found is that it has only one 
dimension and summary is asking for bootj[, 1], which is an error.

Is that the intended behavior and I'm doing something wrong elsewhere, or 
should I try to adjust it myself (to bootj[1], for example)? In case the 
answer is the latter, I would appreciate some insight about how to do it, 
because I don't know how to edit the file.

Thanks for your help,
Luis Pineda

On 9/7/05, Frank E Harrell Jr [EMAIL PROTECTED] wrote:
 
 Luis Pineda wrote:
 
  2.) I'm evaluating the model's goodness of fit using the Fraction of
  Variance Unexplained, which I'm calculating as:
 
  rsa = za - zs
  FVUa = sum(rsa*rsa)/(1*var(zs)) #1 is the size of the test set
 
 That is not corrected for overfitting. You need to use the print method
 for the areg.boot object and note the Bootstrap validated R2
 
 --
 Frank E Harrell Jr Professor and Chair School of Medicine
 Department of Biostatistics Vanderbilt University


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Predicting responses using ace

2005-09-08 Thread Luis Pineda
I gave a quick read to the documentation again and noticed I misinterpreted 
it. It was print.summary.areg.boot I was referring to (although the summary 
error I reported still applies). Sorry for the inconvenience.

Anyway, I used the print method on my |areg.boot| object and I got this:
--
Apparent R2 on transformed Y scale: 0.798
Bootstrap validated R2 : 0.681
...
Residuals on transformed scale:
          Min            1Q        Median            3Q           Max
-1.071312e+00 -2.876245e-01 -3.010081e-02  2.123566e-01  1.867036e+00
         Mean          S.D.
 1.290634e-17  4.462159e-01
--
I suppose that's the R^2 evaluated using the training set, but how do I 
evaluate the performance of the model on an uncontaminated test set?


On 9/8/05, Luis Pineda [EMAIL PROTECTED] wrote:
 
 I'm trying to run the print method, but according to the documentation it 
 needs as a parameter an object created by |summary.areg.boot| . 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Prediction with multiple zeros in the dependent variable

2005-09-08 Thread Berton Gunter
John:

1. As George Box long ago emphasized and proved, normality is **NOT** that
important in regression, certainly not for estimation and not even for
inference in balanced designs. Independence of the observations is far more
important. 

2. That said, it sounds like what you have here is a mixture of some sort.
Before running off to do fancy modeling, I would work very hard to look for
some kind of lurking variable or experimental aberration -- what was going
on in the experiment or study that might have caused all the zeros? Was
there an instrument problem? -- a bad reagent? -- improper handling of the
samples? It might very well be that you need to throw away part of the data
because it's useless, rather than artificially attempt to model it.

3. And having said that, if a comprehensive model IS called for, one rather
cynical approach to take is just to add a grouping variable as a covariate
that has a value of 1 for all data in the zero group and 2 for all the
nonzero data. Your model is f(age,sex) = 0 for all data in group 1 and your
linear or nonlinear regression for group 2. Of course, this merely cloaks
the cynicism in respectable dress. It's hard for me to believe that it was
Mother Nature and not some kind of experimental problem that you see. 

A slightly less cynical approach might be to use some sort of changepoint
model (in both age and sex) of the form f(age, sex) = g(age, sex) for age <= k1
and sex <= k2, and h(age, sex) otherwise. Well, perhaps **not** less cynical --
the response data are so widely separated that you'll just be using a bunch
of extra (nonlinear, incidentally) parameters to essentially reproduce the
use of a covariate.

So I guess the point is that unless you already have a previously developed
nonlinear model that could explain the behavior you see (perhaps based on
some kind of mechanistic reasoning) it's not a good idea to try to develop
an artificial empirical model that comprehends all the data. The fact is (a
horrible phrase) that no modeling at all is needed for the most important
message the data have to convey: rather, focus on the cause of the message
instead of statistical artifice. Once you have determined that, you may be
able to do something sensible. Clear thinking trumps muddy modeling every
time.

(Hopefully, this is sufficiently inflammatory that others will vigorously
and wisely dispute me).

Cheers,

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
The business of the statistician is to catalyze the scientific learning
process.  - George E. P. Box
 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of John Sorkin
 Sent: Wednesday, September 07, 2005 9:06 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Prediction with multiple zeros in the dependent variable
 
 I have a batch of data in each line of data contains three values,
 calcium score, age, and sex. I would like to predict calcium 
 scores as a
 function of age and sex, i.e. calcium=f(age,sex). Unfortunately the
 calcium scorers have a very ugly distribution. There are multiple
 zeros, and multiple values between 300 and 600. There are no values
 between zero and 300. Needless to say, the calcium scores are not
 normally distributed, however, the values between 300 and 600 have a
 distribution that is log normal. As you might imagine, the residuals
 from the regression are not normally distributed and thus violates the
 basic assumption of regression analyses. Does anyone have a suggestion
 for a method (or a transformation) that will allow me predict calcium
 from age and sex without violating the assumptions of the model?
 Thanks,
 John
  
 John Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 Baltimore VA Medical Center GRECC and
 University of Maryland School of Medicine Claude Pepper OAIC
  
 University of Maryland School of Medicine
 Division of Gerontology
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
  
 410-605-7119 
 -- NOTE NEW EMAIL ADDRESS:
 [EMAIL PROTECTED]
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] ROracle install problem

2005-09-08 Thread Ariel Chernomoretz
Hi, 

I am trying to install the ROracle package in a Linux-64 machine.
I downloaded from Oracle's site their Instant Client bundle but it seems that
ROracle needs some stuff not included in that kit in order to compile (in 
particular, the 'proc' executable).

I did not find any other linux client suite in Oracle's site, (our db runs on 
a Solaris server, so I can not use the included binaries).

Does anybody know how to solve this? Is there any workaround? 
 
Thanks,

Ariel./


-- 
Ariel Chernomoretz, Ph.D.
Centre de recherche du CHUL
2705 Blv Laurier, bloc T-367
Sainte-Foy, Qc
G1V 4G2
(418)-525- ext 46339

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] package installation error (LF versus CR)

2005-09-08 Thread sonia
Hello, 
I have the following problem in installing a package (in windows xp) 

rcmd install -c  dlm

[ ..stuff deleted ]

  ... DLL made
  installing DLL
  installing R files
  installing inst files
  installing data files
  installing man source files
  installing indices
Errore in load(zfile, envir = envir) : l'input è stato danneggiato, LF 
sostituiti da CR
Esecuzione interrotta
make[2]: *** [indices] Error 1
make[1]: *** [all] Error 2
make: *** [pkg-dlm] Error 2
*** Installation of dlm failed ***

Does anybody please have suggestions? 
If relevant, the source package was downloaded from a unix server using cvs+ssh

Thanks a lot, 
best, 
Sonia 

-- 
Sonia Petrone
Istituto di Metodi Quantitativi
Università Bocconi
Viale Isonzo 25
20135 Milano, Italia.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] R API call from delphi

2005-09-08 Thread Laurent TESSIER
Hello,
Has anyone tried to call R API from Delphi under windows ? How was it done if 
it was ? Or has anyone any idea about how it could be done ?
Thanks for your answers
Laurent Tessier
[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Time Series Analysis: book?

2005-09-08 Thread Wensui Liu
TS is a huge topic. The book recommended by a statistician might be different 
from the one recommended by an econometrician. A finance guy might recommend 
yet another.

Could you please be more specific?

On 9/8/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 
 There has been a few questions on the subject lately.
 Is there any book on the subject, if possible with a computer processing 
 flavor,
 that you would highly recommend?
 
 
 Many thanks in advance,
 
 
 --
 Jean-Luc
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 



-- 
WenSui Liu
(http://statcompute.blogspot.com)
Senior Decision Support Analyst
Cincinnati Children Hospital Medical Center

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] data manipulation

2005-09-08 Thread Marc Bernard
Dear All,
 
I would be grateful if you can help me. My problem is the following:
I have a data set like:
 
ID  time  X1    X2
1   1     x111  x211
1   2     x112  x212
2   1     x121  x221
2   2     x122  x222
2   3     x123  x223
 
where X1 and X2 are 2 covariates and time is the time of observation and ID 
indicates the cluster.
 
I want to merge the above data by creating new variables X and type, as 
follows:
 
ID   timeXtype
1 1  x111 X1
1 2  x112 X1
1 1  x211 X2
1 2  x212 X2
2 1  x121 X1
2 2  x122 X1
2 3  x123 X1
2 1  x221 X2
2 2  x222 X2
2 3  x223 X2

 
Where type is a factor variable indicating if the observation is related to 
X1 or X2...
 
Many thanks in advance,
 
Bernard


-


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] package installation error (LF versus CR)

2005-09-08 Thread Uwe Ligges
sonia wrote:

 Hello, 
 I have the following problem in installing a package (in windows xp) 
 
 
rcmd install -c  dlm
 
 
 [ ..stuff deleted ]
 
   ... DLL made
   installing DLL
   installing R files
   installing inst files
   installing data files
   installing man source files
   installing indices
 Errore in load(zfile, envir = envir) : l'input è stato danneggiato, LF 
 sostituiti da CR
 Esecuzione interrotta
 make[2]: *** [indices] Error 1
 make[1]: *** [all] Error 2
 make: *** [pkg-dlm] Error 2
 *** Installation of dlm failed ***
 
 Does anybody please have suggestions? 
 If relevant, the source package was downloaded from a unix server using 
 cvs+ssh

In principle, it should not matter if it works on the unix machine.
Anyway, can you try to *build* the package on that unix machine and 
install from the tar.gz file on Windows. Maybe some line endings got 
mixed up by cvs...

For further report, please set
LANGUAGE=en
before a sample-run you want to include in a question to R-help, because 
not everybody understands italian.

Uwe Ligges


 Thanks a lot, 
 best, 
 Sonia 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Interpolating / smoothing missing time series data

2005-09-08 Thread Spencer Graves
(see inline)

Sean Davis wrote:

 On 9/7/05 10:19 PM, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 
 
On 9/7/05, David James [EMAIL PROTECTED] wrote:

The purpose of this email is to ask for pre-built procedures or
techniques for smoothing and interpolating missing time series data.

I've made some headway on my problem in my spare time.  I started
with an irregular time series with lots of missing data.  It even had
duplicated data.  Thanks to zoo, I've cleaned that up -- now I have a
regular time series with lots of NA's.

I want to use a regression model (i.e. ARIMA) to fill in the gaps.  I
am certainly open to other suggestions, especially if they are easy
to implement.

My specific questions:
1.  Presumably, once I get ARIMA working, I still have the problem of
predicting the past missing values -- I've only seen examples of
predicting into the future.
2.  When predicting the past (backcasting), I also want to take
reasonable steps to make the data look smooth.

I guess I'm looking for a really good example in a textbook or white
paper (or just an R guru with some experience in this area) that can
offer some guidance.

Venables and Ripley was a great start (Modern Applied Statistics with
S).  I really had hoped that the Seasonal ARIMA Models section on
page 405 would help.  It was helpful, but only to a point.  I have a
hunch (based on me crashing arima numerous times -- maybe I'm just
new to this and doing things that are unreasonable?) that using
hourly data just does not mesh well with the seasonal arima code?


  Have you looked at Durbin, J. and Koopman, S. J. (2001) _Time Series 
Analysis by State Space Methods._  Oxford University Press, cited with 
?arima?  They explain that Kalman filtering is predicting the future, 
while Kalman smoothing is using all the data to fill the gaps, which 
seems to match your question.  I was able to reproduce Figure 2.1 in 
that book but got bogged down with Figure 2.2 before I dropped the 
project.  I can send you the script file I developed when working on 
that if it would help you.
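
   As a simple illustration of Kalman smoothing to fill gaps, the 
structural time-series tools in base R can be used; the sketch below uses 
the built-in Nile series with some observations removed artificially (it 
is not the Durbin-Koopman code, just an illustration):

z <- Nile                          # built-in annual series
z[c(21:25, 60)] <- NA              # punch some artificial holes
fit <- StructTS(z, type = "level") # local-level model; NAs are allowed
sm <- tsSmooth(fit)                # Kalman smoother: uses all the data
level <- if (is.matrix(sm)) sm[, 1] else sm
filled <- z
filled[is.na(z)] <- level[is.na(z)]  # plug the smoothed level into the gaps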

  I'm still interested in learning how to reproduce in R all the 
examples in that book, and I'd happily receive suggestions from others 
on how to do that.

  spencer graves

Not sure if this answers your question but if you are looking for something
simple then na.approx in the zoo package will linearly interpolate for you.


z <- zoo(c(1,2,NA,4,5))
na.approx(z)

1 2 3 4 5 
1 2 3 4 5
 
 
 Alternatively, if you are looking for more smoothing, you could look at
 using a moving average or median applied at points of interest with an
 appropriate window size--see wapply in the gplots package (gregmisc
 bundle).  There are a number of other functions that can accomplish the same
 task.  A search for moving window or moving average in the archives may
 produce some other ideas.
 
 Sean
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]
www.pdf.com http://www.pdf.com
Tel:  408-938-4420
Fax: 408-280-7915

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Tip: I() can designate constants in a regression

2005-09-08 Thread David James
Just thought I would share a tip I learned:
The function I() is useful for specifying constants to formulas and  
regressions.

It will prevent nls (for example) from trying to treat the variable  
inside I() as something it needs to estimate.  An example is below.

-David

P.S.  This may be obvious to some, but it is not made clear to me by  
the documentation or common books that I reviewed.  These books, of  
course, do tend to mention other aspects of I(), which seems to be a  
very diverse function.  For example:
* ISwR by Dalgaard (p. 160, 177)
* MASwS by Venables and Ripley (p.18)

However, the books I looked at do not mention the specific tip here:  
Wrapping I() around a variable will make it a constant from the  
perspective of a regression.

A humble suggestion to the many authors of the many great R and S  
books out there: I would find it helpful if more R books had the word  
constants in the index.  Perhaps there could be a brief section  
that explained how to create constants in a regression.  These sorts  
of problems, I would guess, occur more commonly with nls models than  
lm models.

- - - - - - - -

Here is the example that motivated my tip:

 weather.df : a data frame, where each row is one hour
 weather.df$temp : the temperature
 weather.df$annual : time offset, adjusted so that its period is one year
 weather.df$daily : time offset, adjusted so that its period is one day

 # I want a1,a2 to be constants from the point of view of nls
 a1 <- 66
 a2 <- -18
 nls.example <- nls( temp ~ I(a1) + I(a2)*sin( ts.annual ) + a3*sin( ts.daily ),
                     data=weather.df, start=c(a3=1) )
 # leaving out the I() will cause nls to estimate values for a1 and a2



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] data manipulation

2005-09-08 Thread Martin Lam
Hi,

This may not be the best solution, but at least it's
easy to see what I'm doing. Assume that your data set
is called "data":

# remove the 4th column and rename the value column
data1 <- data[, -4]
colnames(data1)[3] <- "X"

# remove the 3rd column and rename the value column
data2 <- data[, -3]
colnames(data2)[3] <- "X"

# use cbind to add an extra column filled with "X1"
data1 <- cbind(data1, type = rep("X1", nrow(data1)))

# use cbind to add an extra column filled with "X2"
data2 <- cbind(data2, type = rep("X2", nrow(data2)))

# use rbind to stack them together as rows
data3 <- rbind(data1, data2)

# rename the columns
colnames(data3) <- c("ID", "time", "X", "type")

# show output
data3

The only thing I couldn't figure out is how to sort the
rows of the resulting data set; perhaps someone else could
help us out on this?
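
One way to get the rows into the order shown in the original post is 
order(); a quick sketch, assuming the data3 built above:

data3[order(data3$ID, data3$type, data3$time), ]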

Martin

--- Marc Bernard [EMAIL PROTECTED] wrote:

 Dear All,
  
 I would be grateful if you can help me. My problem
 is the following:
 I have a data set like:
  
 ID  time  X1  X2
 11  x111  x211
 12  x112  x212
 21  x121  x221
 22  x122  x222
 23  x123  x223
  
 where X1 and X2 are 2 covariates and time is the
 time of observation and ID indicates the cluster.
  
 I want to merge the above data by creating a new
 variable  X and type as follows:
  
 ID   timeXtype
 1 1  x111 X1
 1 2  x112 X1
 1 1  x211 X2
 1 2  x212 X2
 2 1  x121 X1
 2 2  x122 X1
 2 3  x123 X1
 2 1  x221 X2
 2 2  x222 X2
 2 3  x223 X2
 
  
 Where type is a factor variable indicating if the
 observation is related to X1 or X2...
  
 Many thanks in advance,
  
 Bernard
 
   
 -
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html
 






__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] data manipulation

2005-09-08 Thread Sebastian Luque
Marc Bernard [EMAIL PROTECTED] wrote:
 Dear All,

 I would be grateful if you can help me. My problem is the following:
 I have a data set like:

 ID  time  X1  X2
 11  x111  x211
 12  x112  x212
 21  x121  x221
 22  x122  x222
 23  x123  x223

 where X1 and X2 are 2 covariates and time is the time of observation and ID
   indicates the cluster.

 I want to merge the above data by creating a new variable X and type as
   follows:

 ID   timeXtype
 1 1  x111 X1
 1 2  x112 X1
 1 1  x211 X2
 1 2  x212 X2
 2 1  x121 X1
 2 2  x122 X1
 2 3  x123 X1
 2 1  x221 X2
 2 2  x222 X2
 2 3  x223 X2


 Where type is a factor variable indicating if the observation is related to
   X1 or X2...


Say your original data is in dataframe df, then this might do what you
want:

R> newdf <- rbind(df[, 1:3], df[, c(1, 2, 4)])
R> names(newdf)[3] <- "X"
R> newdf$type <- substr(c(df[[3]], df[[4]]), 1, 2)

Cheers,

-- 
Sebastian P. Luque

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Tip: I() can designate constants in a regression

2005-09-08 Thread Peter Dalgaard
David James [EMAIL PROTECTED] writes:

 Just thought I would share a tip I learned:
 The function I() is useful for specifying constants to formulas and  
 regressions.
 
 It will prevent nls (for example) from trying to treat the variable  
 inside I() as something it needs to estimate.  An example is below.
 
 -David
 
 P.S.  This may be obvious to some, but it is not made clear to be by  
 the documentation or common books that I reviewed.  These books, of  
 course, do tend to mention others aspects of I(), which seems to be a  
 very diverse function.  For example:
 * ISwR by Dalgaard (p. 160, 177)
 * MASwS by Venables and Ripley (p.18)
 
 However, the books I looked at do not mention the specific tip here:  
 Wrapping I() around a variable will make it a constant from the  
 perspective of a regression.
 
 A humble suggestion to the many authors of the many great R and S  
 books out there: I would find it helpful if more R books had the word  
 constants in the index.  Perhaps there could be a brief section  
 that explained how to create constants in a regression.  These sorts  
 of problems, I would guess, occur more commonly with nls models than  
 lm models.

First check whether your claim is actually correct:

> x = 1:10
> y = x  # perfect fit
> yeps = y + rnorm(length(y), sd = 0.01) # added noise
> nls(yeps ~ a + b*x, start = list(a = 0.12345, b = 0.54321),
+   trace = TRUE)
74.2686 :  0.12345 0.54321
0.0006529895 :  -0.002666984  1.000334031
Nonlinear regression model
  model:  yeps ~ a + b * x
   data:  parent.frame()
           a            b 
-0.002666984  1.000334031 
 residual sum-of-squares:  0.0006529895
> a <- 0
> nls(yeps ~ a + b*x, start = list(b = 0.54321), trace = TRUE)
80.31713 :  0.54321
0.0006682311 :  0.53
Nonlinear regression model
  model:  yeps ~ a + b * x
   data:  parent.frame()
   b
0.53
 residual sum-of-squares:  0.0006682311

I.e., turning a into a constant works quite happily without the I().


 Here is the example that motivated my tip:
 
  weather.df : a data frame, where each row is one hour
  weather.df$temp : the temperature
  weather.df$annual : time offset, adjusted so that its period is one  
  year
  weather.df$daily : time offset, adjusted so that its period is one day
 
  # I want a1,a2 to be constants from the point of view of nls
  a1 - 66
  a2 - -18
  nls.example  - nls( temp ~ I(a1) + I(a2)*sin( ts.annual ) + a3*sin 
  ( ts.daily ), data=weather.df, start=c(a3=1) )
  # leaving out the I() will cause nls to estimate values for a1 and a2

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Multinomial Logit and p-values

2005-09-08 Thread Sangick Jeon

Hi,

I am trying to obtain p-values for coefficient estimates in a multinomial
logit model.  Although I am able to test for significance using other
methods (e.g., Wald statistics), I can't seem to get R to give me simple
p-values. I am sure there is a very simple solution to this, but the R
archives seem to have nothing on this issue. I would appreciate any help. 
Thanks in advance!
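
(For reference, one route that seems to work is a Wald-type normal
approximation on a multinom() fit from the nnet package; the iris example
below is purely illustrative, and your own model object and data will
differ:

library(nnet)
fit <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris,
                Hess = TRUE, trace = FALSE)
z <- summary(fit)$coefficients / summary(fit)$standard.errors
p <- 2 * pnorm(-abs(z))   # two-sided p-values from the Wald z statistics
round(p, 4)
)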

Best,
Sangick Jeon

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] data manipulation

2005-09-08 Thread Thomas Lumley

This is what reshape() does.
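
For example (a sketch on a made-up copy of the posted data; only the 
column names are taken from the original post):

df <- data.frame(ID = c(1, 1, 2, 2, 2), time = c(1, 2, 1, 2, 3),
                 X1 = I(c("x111", "x112", "x121", "x122", "x123")),
                 X2 = I(c("x211", "x212", "x221", "x222", "x223")))
long <- reshape(df, direction = "long",
                varying = c("X1", "X2"), v.names = "X",
                timevar = "type", times = c("X1", "X2"),
                idvar = c("ID", "time"))
long[order(long$ID, long$type, long$time), ]   # rows in the requested order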

-thomas

On Thu, 8 Sep 2005, Marc Bernard wrote:

 Dear All,

 I would be grateful if you can help me. My problem is the following:
 I have a data set like:

 ID  time  X1  X2
 11  x111  x211
 12  x112  x212
 21  x121  x221
 22  x122  x222
 23  x123  x223

 where X1 and X2 are 2 covariates and time is the time of observation and ID 
 indicates the cluster.

 I want to merge the above data by creating a new variable  X and type as 
 follows:

 ID   timeXtype
 1 1  x111 X1
 1 2  x112 X1
 1 1  x211 X2
 1 2  x212 X2
 2 1  x121 X1
 2 2  x122 X1
 2 3  x123 X1
 2 1  x221 X2
 2 2  x222 X2
 2 3  x223 X2


 Where type is a factor variable indicating if the observation is related to 
 X1 or X2...

 Many thanks in advance,

 Bernard


 -


   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] execute R expression from command line

2005-09-08 Thread Omar Lakkis
Can I execute an R expression from the command line without having it
in an infile, something like perl's -e flag? So it would look like:

R {Rexpression;} > outfile

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] data manipulation

2005-09-08 Thread Jim Porzak
Also see Hadley Wickham's reshape package for more bells  whistles.
-- 
HTH!
Jim Porzak
Loyalty Matrix Inc.



On 9/8/05, Thomas Lumley [EMAIL PROTECTED] wrote:
 
 This is what reshape() does.
 
 -thomas
 
 On Thu, 8 Sep 2005, Marc Bernard wrote:
 
  Dear All,
 
  I would be grateful if you can help me. My problem is the following:
  I have a data set like:
 
  ID  time  X1  X2
  11  x111  x211
  12  x112  x212
  21  x121  x221
  22  x122  x222
  23  x123  x223
 
  where X1 and X2 are 2 covariates and time is the time of observation and 
  ID indicates the cluster.
 
  I want to merge the above data by creating a new variable  X and type 
  as follows:
 
  ID   timeXtype
  1 1  x111 X1
  1 2  x112 X1
  1 1  x211 X2
  1 2  x212 X2
  2 1  x121 X1
  2 2  x122 X1
  2 3  x123 X1
  2 1  x221 X2
  2 2  x222 X2
  2 3  x223 X2
 
 
  Where type is a factor variable indicating if the observation is related 
  to X1 or X2...
 
  Many thanks in advance,
 
  Bernard
 
 
  -
 
 
[[alternative HTML version deleted]]
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
  http://www.R-project.org/posting-guide.html
 
 
 Thomas Lumley   Assoc. Professor, Biostatistics
 [EMAIL PROTECTED]University of Washington, Seattle
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Re-evaluating the tree in the random forest

2005-09-08 Thread Martin Lam
Dear mailinglist members,

I was wondering if there was a way to re-evaluate the
instances of a tree (in the forest) again after I have
manually changed a splitpoint (or split variable) of a
decision node. Here's an illustration:

library(randomForest)

forest.rf <- randomForest(formula = Species ~ ., data
= iris, do.trace = TRUE, ntree = 3, mtry = 2,
norm.votes = FALSE)

# I am going to change the splitpoint of the root node
# of the first tree to 1
forest.rf$forest$xbestsplit[1,]
forest.rf$forest$xbestsplit[1,1] <- 1
forest.rf$forest$xbestsplit[1,]

Because I've changed the splitpoint, some instances in
the leafs are not supposed where they should be. Is
there a way to reappoint them to the correct leaf?


I was also wondering how I should interpret the output
of do.trace:

ntree  OOB  1  2  3
1:   3.70%  0.00%  6.25%  5.88%
2:   3.49%  0.00%  3.85%  7.14%
3:   3.57%  0.00%  5.56%  5.26%

What's OOB and what do the percentages mean?
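
For what it's worth, the per-tree rates printed by do.trace also seem to 
be stored on the fitted object; a sketch using the forest.rf fit from 
above (see ?randomForest for the authoritative description):

forest.rf$err.rate    # OOB (out-of-bag) error after each tree, overall and per class
forest.rf$confusion   # OOB confusion matrix with class-wise error rates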

Thanks in advance,

Martin





__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R API call from delphi

2005-09-08 Thread Duncan Temple Lang

One approach is to create a native/foreign interface to R by
linking R as a library (libR.a and R.dll) and
calling the C routines in the library to
     i) initialize the R interpreter
    ii) call an R function

We have done this with many languages and the procedure is well
understood at this point, but requires some C-level programming
and converting between the standard data types of both systems.

Another approach is to use R via DCOM.  There are two different
approaches to this. One has the R interpreter as the DCOM
server and the client (the Delphi application here) would send
R commands to that server. The other approach has regular
DCOM servers that are implemented via R functions.  (The
ability to send R commands is a simple case of this.)  It is up
to you to define the servers via a few extra lines of R code.

I would suggest you pursue the DCOM route unless you are
keen to do the necessary work to embed the R  library
in Delphi and deal with some technical details about
calling conventions of C routines.

  D

Laurent TESSIER wrote:
 Hello,
 Has anyone tried to call R API from Delphi under windows ? How was it done if 
 it was ? Or has anyone any idea about how it could be done ?
 Thanks for your answers
 Laurent Tessier
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Predicting responses using ace

2005-09-08 Thread Frank E Harrell Jr
Luis Pineda wrote:
 I gave the documentation another quick read and noticed I had misinterpreted 
 it. The method I was referring to was print.summary.areg.boot (although the 
 summary error should still work). Sorry for the inconvenience
 
 Anyway, I used the print method on my |areg.boot| object and I got this:
 --
 Apparent R2 on transformed Y scale: 0.798
 Bootstrap validated R2 : 0.681
 ...
 Residuals on transformed scale:
 Min 1Q Median 3Q Max
 -1.071312e+00 -2.876245e-01 -3.010081e-02 2.123566e-01 1.867036e+00
 Mean S.D.
 1.290634e-17 4.462159e-01
 --
 I suppose that's the R^2 evaluated using the training set, but how do I 
 evaluate the performance of the model on an uncontaminated test set?

Please read my last note.  Bootstrap validated R2 is corrected for 
overfitting and is an estimate of the likely future R2 on a totally 
independent dataset.  The bootstrap is more efficient than data 
splitting for this purpose.

Frank

 
 
 On 9/8/05, Luis Pineda [EMAIL PROTECTED] wrote:
 
I'm trying to run the print method, but according to the documentation it 
needs as a parameter an object created by |summary.areg.boot| . 

 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] data manipulation

2005-09-08 Thread Jean Eid
I am sure all this work but If you want exaclty the output to be the way
you mentioned do this

temp <- read.table("yourfile", as.is=T, header=T)
temp1 <- temp[, 1:3]
temp2 <- temp[, c(1,2,4)]
colnames(temp1)[3] <- "X"
colnames(temp2)[3] <- "X"
temp3 <- merge(temp1, temp2, all=T)
temp3$type <- toupper(substr(temp3$X, 1, 2))


after which you can generate factors and such..
note the as.is=T in read.table keeps the variables X1, X2, as characters.
This is done for substr...


P.S. I am sure you can use reshape instead of the second to the fifth
commands above

?reshape

Jean

On Thu, 8 Sep 2005, Sebastian Luque wrote:

 Marc Bernard [EMAIL PROTECTED] wrote:
  Dear All,

  I would be grateful if you can help me. My problem is the following:
  I have a data set like:

  ID  time  X1  X2
  11  x111  x211
  12  x112  x212
  21  x121  x221
  22  x122  x222
  23  x123  x223

  where X1 and X2 are 2 covariates and time is the time of observation and 
  ID
  indicates the cluster.

  I want to merge the above data by creating a new variable X and type as
  follows:

  ID   timeXtype
  1 1  x111 X1
  1 2  x112 X1
  1 1  x211 X2
  1 2  x212 X2
  2 1  x121 X1
  2 2  x122 X1
  2 3  x123 X1
  2 1  x221 X2
  2 2  x222 X2
  2 3  x223 X2


  Where type is a factor variable indicating if the observation is related 
  to
  X1 or X2...


 Say your original data is in dataframe df, then this might do what you
 want:

 R newdf - rbind(df[, 1:3], df[, c(1, 2, 4)])
 R names(newdf)[3] - X
 R newdf$type - substr(c(df[[3]], df[[4]]), 1, 2)

 Cheers,

 --
 Sebastian P. Luque

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] writing data to sheet in excel workbook

2005-09-08 Thread adalbert duerrer
Hi,

I believe to remember there is a package that lets you
write data from R to different sheets in a Excel
workbook. I've been looking around on CRAN but could
not find what I am looking for.

Any help would be greatly appreciated.
Cheers,
Adi

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Leading in line-wrapped Lattice value and panel labels

2005-09-08 Thread Deepayan Sarkar
On 9/7/05, Paul Murrell [EMAIL PROTECTED] wrote:
 Hi
 
 
 Deepayan Sarkar wrote:
   On 9/7/05, Tim Churches [EMAIL PROTECTED] wrote:
  
   Version 2.1.1 Platforms: all
  
   What is the trellis parameter (or is there a trellis parameter) to
   set the leading (the gap between lines) when long axis values
   labels or panel header labels wrap over more than one line? By
   default, there is a huge gap between lines, and much looking and
   experimentation has not revealed to me a suitable parameter to
   adjust this.
  
  
  
   There is none. Whatever grid.text does happens.
 
 
 grid does have a lineheight graphical parameter.  For example,
 
 library(grid)
 grid.text("line one\nlinetwo",
x=rep(1:3/4, each=3),
y=rep(1:3/4, 3),
gp=gpar(lineheight=1:9/2))
 
 Could you add this in relevant places in trellis.par Deepayan?

I will (don't know how soon). The description in ?gpar is not very
informative though:

   lineheight  The height of a line as a multiple of the size of text

(or maybe it's a standard term in typography that I'm not familiar with).

Deepayan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] execute R expression from command line

2005-09-08 Thread Seth Falcon
On  8 Sep 2005, [EMAIL PROTECTED] wrote:

 Can I execute an R expression from the command line without having
 it in an infile, something like perl's -e flag. So it would look
 like:

 R {Rexpression;} > outfile

With a bash-like shell, you can do:

echo "library(foo); somefunc(5)" | R --slave

HTH,

+ seth

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Survival model with cross-classified shared frailties

2005-09-08 Thread Shige Song
Hi Thomas,

Thanks for the reply, coxme() seems to be the one I need. 

Best,
Shige


On 9/8/05, Thomas Lumley [EMAIL PROTECTED] wrote:
 
 On Thu, 8 Sep 2005, Shige Song wrote:
 
  Dear All,
 
  The coxph function in the survival package allows multiple frailty
  terms.
 
 Um, no, it doesn't.
 
  In all the examples I saw, however, the frailty terms are nested.
  What will happen if I have non-nested (that is, cross-classified) 
 frailties
  in the model?
 
 This wouldn't work even if it did allow multiple frailty terms.
 
 
 You may want the coxme() function in the kinship package.
 
 
 -thomas


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Setting width in batch mode

2005-09-08 Thread Jonathan Dushoff
As instructed, I have spent a long time searching the web for an answer
to this question.

I am trying to use Sweave to produce lecture slides, and have the
problem that I can't control the formatting of my R source.  Setting
options(width), as recommended in this forum, works fine on the R
_output_, but seems to have unpredictable effects on the echoing of the
source code.

If I try setting options(width) directly in R, I note that it has _no_
effect on echoed source code, whereas Sweave does sometimes break source
code, but not predictably, and not to the same width as output code.
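
One possible explanation (only a guess; I have not checked the Sweave
sources): the echoed source appears to go through deparse(), whose line
breaking is controlled by its own width.cutoff argument (default 60),
not by options(width).  A small illustration:

expr <- quote(data.frame(area = mean(state.area), pop = mean(state.pop),
                         hop = mean(state.area)))
cat(deparse(expr), sep = "\n")                     # breaks near 60 characters
cat(deparse(expr, width.cutoff = 55), sep = "\n")  # breaks closer to 55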

I would be happy with any method of manually or automatically
controlling the line width of Sweave source, using R, Sweave or LaTeX
options.  Making the font smaller does not count, though; I want to
break the lines.

Any help is appreciated.

An example of Sweave input and output is appended.

The last break is right, while the others are too late.

Jonathan Dushoff

--
bug.rnw

<<>>=
options(width=55)
data(state)
data.frame(area=mean(state.area),   pop=mean(state.pop),   hop=mean(state.area))
c(medianarea=median(state.area),   medianpop=median(state.pop))
c(medianarea=median(median(state.area)),   medianpop=median(state.pop))
@

--
bug.tex

\begin{Schunk}
\begin{Sinput}
> options(width = 55)
> data(state)
> data.frame(area = mean(state.area), pop = mean(state.pop), 
+ hop = mean(state.area))
\end{Sinput}
\begin{Soutput}
   area pop  hop
1 72367.98 4246420 72367.98

\end{Soutput}
\begin{Sinput}
> c(medianarea = median(state.area), medianpop = median(state.pop))
\end{Sinput}
\begin{Soutput}
medianarea  medianpop
  562222838500

\end{Soutput}
\begin{Sinput}
> c(medianarea = median(median(state.area)), 
+ medianpop = median(state.pop))
\end{Sinput}
\begin{Soutput}
medianarea  medianpop
  562222838500

\end{Soutput}
\end{Schunk}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] writing data to sheet in excel workbook

2005-09-08 Thread Gabor Grothendieck
On 9/8/05, adalbert duerrer [EMAIL PROTECTED] wrote:
 Hi,
 
 I believe to remember there is a package that lets you
 write data from R to different sheets in a Excel
 workbook. I've been looking around on CRAN but could
 not find what I am looking for.
 

See

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/58249.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R API call from delphi

2005-09-08 Thread Francisco J. Zagmutt
Follow this thread 
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/50598.html

Cheers

Francisco

From: Laurent TESSIER [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject: [R] R API call from delphi
Date: Thu,  8 Sep 2005 17:47:49 +0200 (CEST)

Hello,
Has anyone tried to call R API from Delphi under windows ? How was it done 
if it was ? Or has anyone any idea about how it could be done ?
Thanks for your answers
Laurent Tessier
   [[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] can't successfully use installed evir package

2005-09-08 Thread Gallagher Dionysus Polyn
I'm new at installing packages. I seem to have successfully installed evir, 
but I can't use it. I'm wondering if I need to specify the installation to 
match my working directory, or something else.
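
If the install itself went fine, the package still has to be attached in 
each session before its functions are visible (the working directory 
should not matter); a minimal check:

library(evir)        # attach the installed package
ls("package:evir")   # list the functions it provides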

thx,

G

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] clustering: Multivariate t mixtures

2005-09-08 Thread Nicholas Lewin-Koh
Hi,
Before I write code to do it, does anyone know of existing code for fitting
mixtures of multivariate-t distributions?
I can't use McLachlan's EMMIX code because the license is "for
non-commercial use only". 
I checked mclust and flexmix, but both only do Gaussian. 

Thanks
Nicholas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] generating a vector from clusters of logicals

2005-09-08 Thread Troels Ring
dear friends,
I have a vector of clusters of TRUE and FALSE like 
c(TRUE,TRUE,TRUE...,FALSE,FALSE,FALSE,TRUE,TRUE...) and want to make 
that into a vector
of c(1,1,1,1...2,2,2,2,.3,3,3,3) increasing the number assigned to each 
cluster as they change.
How would I do that ?

Best wishes

Troels Ring, Aalborg, Denmark

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] clustering: Multivariate t mixtures

2005-09-08 Thread Achim Zeileis
On Thu, 08 Sep 2005 15:38:55 -0500 Nicholas Lewin-Koh wrote:

 Hi,
 Before I write code to do it does anyone know of code for fitting
 mixtures of multivariate-t distributions.
 I can't use McLachan's EMMIX code because the license is For non
 commercial use only. 
 I checked, mclust and flexmix but both only do Gaussian. 

The Gaussian case is available in a pre-packaged function FLXmclust(),
but the flexmix framework is not limited to that case. There is a paper
which appeared in the Journal of Statistical Software
(http://www.jstatsoft.org/) that explains how to write new M-steps for
flexmix. It is also contained in the package as
  vignette(flexmix-intro)

Best,
Z

 Thanks
 Nicholas
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] generating a vector from clusters of logicals

2005-09-08 Thread Achim Zeileis
On Thu, 08 Sep 2005 23:03:03 +0200 Troels Ring wrote:

 dear friends,
 I have a vector of clusters of TRUE and FALSE like 
 c(TRUE,TRUE,TRUE...,FALSE,FALSE,FALSE,TRUE,TRUE...) and want to
 make that into a vector
 of c(1,1,1,1...2,2,2,2,.3,3,3,3) increasing the number assigned to
 each cluster as they change.
 How would I do that ?

Does this what you want:

R> set.seed(123)
R> x <- sample(c(TRUE, FALSE), 10, replace = TRUE)
R> x
 [1]  TRUE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE
R> c(1, cumsum(abs(diff(x))) + 1)
 [1] 1 2 3 4 4 5 6 6 6 7

?
Z

 Best wishes
 
 Troels Ring, Aalborg, Denmark
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] clustering: Multivariate t mixtures

2005-09-08 Thread Nicholas Lewin-Koh
Hi,
Actually, my plan was to implement a new flexmix class.
Thanks for the pointer to the JSS paper, that will be helpful.

Nicholas 
On Thu, 8 Sep 2005 23:07:13 +0200, Achim Zeileis
[EMAIL PROTECTED] said:
 On Thu, 08 Sep 2005 15:38:55 -0500 Nicholas Lewin-Koh wrote:
 
  Hi,
  Before I write code to do it does anyone know of code for fitting
  mixtures of multivariate-t distributions.
  I can't use McLachan's EMMIX code because the license is For non
  commercial use only. 
  I checked, mclust and flexmix but both only do Gaussian. 
 
 The Gaussian case is available in a pre-packaged function FLXmclust(),
 but the flexmix framework is not limited to that case. There is a paper
 which appeared in the Journal of Statistical Software
 (http://www.jstatsoft.org/) that explains how to write new M-steps for
 flexmix. It is also contained in the package as
   vignette(flexmix-intro)
 
 Best,
 Z
 
  Thanks
  Nicholas
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide!
  http://www.R-project.org/posting-guide.html
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Time Series Analysis: book?

2005-09-08 Thread Jean-Luc Fontaine
Wensui Liu wrote:

 TS is a huge topic. The book recomended by statisitcian might be
 different from the one recommended by econometrician. Finance guy
 might recommend another. Could you please be more specific?


My software (http://moodss.sourceforge.net) collects, archives in a
SQL database and displays data from monitored devices, mostly
computers, databases and network equipment.
My idea is to use the stored data to perform predictions for capacity
planning purposes. For example, based on the trafic on a network
line for the last 12 months, what is the expected evolution in the
next 3 months.
So the data is more of the engineering type, I guess. But since the
software is modular, somebody could also use it to monitor the stock
market.
Actually, anything can be monitored so the data could come from
any source although practically mostly from computing related devices
and activities.

So I would like a book covering at least those subjects if possible.

Thanks very much for your help.

-- 
Jean-Luc Fontaine  http://jfontain.free.fr/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] generating a vector from clusters of logicals

2005-09-08 Thread Ray Brownrigg
 From: Troels Ring [EMAIL PROTECTED]
 
 I have a vector of clusters of TRUE and FALSE like 
 c(TRUE,TRUE,TRUE...,FALSE,FALSE,FALSE,TRUE,TRUE...) and want to make 
 that into a vector
 of c(1,1,1,1...2,2,2,2,.3,3,3,3) increasing the number assigned to each 
 cluster as they change.
 How would I do that ?
 
How about:
> TF <- c(TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,TRUE,TRUE,FALSE)
> rep(1:length(rlel <- rle(TF)$lengths), rlel)
[1] 1 1 1 2 2 2 3 3 4

HTH,
Ray Brownrigg

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] generating a vector from clusters of logicals

2005-09-08 Thread Greg Snow
Try:

x <- c(TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,TRUE,TRUE)
tmp <- rle(x)
tmp$values <- seq(along=tmp$lengths)
new.x <- inverse.rle(tmp)
new.x
 


Greg Snow, Ph.D.
Statistical Data Center, LDS Hospital
Intermountain Health Care
[EMAIL PROTECTED]
(801) 408-8111

 Troels Ring [EMAIL PROTECTED] 09/08/05 03:03PM 
dear friends,
I have a vector of clusters of TRUE and FALSE like 
c(TRUE,TRUE,TRUE...,FALSE,FALSE,FALSE,TRUE,TRUE...) and want to
make 
that into a vector
of c(1,1,1,1...2,2,2,2,.3,3,3,3) increasing the number assigned to
each 
cluster as they change.
How would I do that ?

Best wishes

Troels Ring, Aalborg, Denmark

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Data Expo 2006 (off-topic)

2005-09-08 Thread Paul Murrell
Hi

This is to let R folks know about the Data Expo that is being run by the 
ASA Sections on Statistical Graphics, Statistical Computing,
and Statistics and the Environment for JSM 2006.

This competition provides a data set of geographic and
atmospheric data from NASA and entrants are asked to provide
a graphical summary of the important features of the data set.
The emphasis is on graphical display, but the data set has
time series, spatial, and multivariate features that allow the
focus to be directed in a number of different ways.

Entries will be presented in a poster session at JSM 2006 and
the best entries will receive cash prizes totalling $1700
plus NASA merchandise.  Student and group entries are encouraged.

It would be good to see some R-based entries!

For more information, please see the Data Expo web site
http://www.amstat-online.org/sections/graphics/dataexpo/2006.php

Paul Murrell
(on behalf of the Data Expo organising team)
-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ROracle install problem

2005-09-08 Thread dhinds
Ariel Chernomoretz [EMAIL PROTECTED] wrote:
 Hi, 

 I am trying to install the ROracle package in a Linux-64 machine.
 I downloaded from Oracle's site their Instant Client bundle but it seems that
 ROracle needs some stuff not included in that kit in order to compile (in 
 particuar, the 'proc' executable).

You can't use the Instant Client.  You need to get the full client CD
for your platform.

 I did not find any other linux client suite in Oracle's site

It is there.  Go to:

  http://www.oracle.com/technology/software/index.html

and click on Oracle Database 10g (or 9i, or whatever), then on your
platform, and look for the Client CD.

-- Dave

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Re: General Matrix Inner Product?

2005-09-08 Thread Khayat, Roger


Roger E. Khayat, Professor
Department of Mechanical and Materials Engineering
The University of Western Ontario
London, Ontario, Canada N6A 5B9

Email: [EMAIL PROTECTED]
Tel: (519) 661-2111 Ext 88253
Fax: (519) 661-3020

http://www.engga.uwo.ca/people/rkhayat/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Coarsening Factors

2005-09-08 Thread Murray Jorgensen
It is not uncommon to want to coarsen a factor by grouping levels 
together. I have found one way to do this in R:

> sites
 [1] F A A D A A B F C F A D E E D C F A E D F C E D E F F D B C
Levels: A B C D E F
> regions <- list(I = c("A","B","C"), II = "D", III = c("E","F"))
> library(Epi)
> region <- Relevel(sites, regions)
> region
 [1] III I   I   II  I   I   I   III I   III I   II  III III II  I   III I   III
[20] II  III I   III II  III III III II  I   I
Levels: I II III

However this seems like using a sledgehammer to crack a nut. Can someone 
suggest a simpler way to do this task?

Murray Jorgensen
-- 
Dr Murray Jorgensen  http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: [EMAIL PROTECTED]Fax 7 838 4155
Phone  +64 7 838 4773 wk Home +64 7 825 0441   Mobile 021 1395 862

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Coarsening Factors

2005-09-08 Thread Peter Dalgaard
Murray Jorgensen [EMAIL PROTECTED] writes:

 It is not uncommon to want to coarsen a factor by grouping levels 
 together. I have found one way to do this in R:
 
   sites
   [1] F A A D A A B F C F A D E E D C F A E D F C E D E F F D B C
 Levels: A B C D E F
   regions - list(I = c(A,B,C), II = D, III = c(E,F))
   library(Epi)
   region - Relevel(sites,regions)
   region
   [1] III I   I   II  I   I   I   III I   III I   II  III III II  I 
 III I   III
 [20] II  III I   III II  III III III II  I   I
 Levels: I II III
 
 However this seems like using a sledgehammer to crack a nut. Can someone 
 suggest a simpler way to do this task?

Yes,

> regions <- list(I = c("A","B","C"), II = "D", III = c("E","F"))
> levels(sites) <- regions
> sites
 [1] III I   I   II  I   I   I   III I   III I   II  III III II  I   III I   III 
[20] II  III I   III II  III III III II  I   I
Levels: I II III


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] change in read.spss, package foreing?

2005-09-08 Thread Heinz Tuechler
Dear All,

it seems to me that the function read.spss of package foreign changed its
behaviour regarding factors. I noted that in version 0.8-8 variables with
value labels in SPSS were transformed into factors with the labels in
alphabetic order.
In version 0.8-10 they seem to be ordered preserving the order
corresponding to their numerical codes in SPSS.
However I could not find a description of this supposed change. Since the
different behaviour seems to depend on the installed version of the
foreign-package I don't know how to give a reproducible example.
It also affects spss.get of the Hmisc-package, which is not surprising.
I prefer the new behaviour and would like to know, if it will persist in
future versions.

Comments?

Heinz Tüchler

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Time Series Analysis: book?

2005-09-08 Thread Spencer Graves
  1.  Have you read the appropriate chapter in Venables and Ripley 
(2002) Modern Applied Statistics with S (Springer)?  If no, I suggest you 
start there.

  2.  Have you worked through the vignettes associated with the zoo 
package?  If no, you might find that quite useful.  [Are you aware that 
edit(vignette(...)) will provide a script file with the R code discussed 
in the vignette, which can be viewed in Adobe Acrobat while you are 
working throught the examples line by line, modifying them, etc.?  I've 
found this to be very useful.  If you use XEmacs, edit(vignette(...)) 
may not work.  Instead, try Stangle(vignette(...)$file).  This will copy 
the R code to a file in the working directory, which you can then open.]
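
   For example, with the zoo vignettes mentioned above (a sketch; 
vignette names and object components may vary by version):

library(zoo)
vignette(package = "zoo")   # list the vignettes shipped with zoo
v <- vignette("zoo")        # the main vignette; print(v) displays the PDF
edit(v)                     # pop the vignette's R code into an editor
Stangle(v$file)             # or write that code to a .R file in getwd()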

  3.  Have you considered Durbin, J. and Koopman, S. J. (2001) _Time 
Series Analysis by State Space Methods._  Oxford University Press?  If 
no, you might want to spend some time with that.

   I'm still looking for the right kind of introduction and overview to 
what is available in R for time series analysis, especially with a 
Bayesian approach to Kalman filtering and smoothing.  Unfortunately, I 
have yet to find the key I feel I need to get started, though I found 
the vignettes with zoo to be quite helpful.

  spencer graves

Jean-Luc Fontaine wrote:

 Wensui Liu wrote:
 
 
TS is a huge topic. The book recomended by statisitcian might be
different from the one recommended by econometrician. Finance guy
might recommend another. Could you please be more specific?
 
 
 
 My software (http://moodss.sourceforge.net) collects, archives in a
 SQL database and displays data from monitored devices, mostly
 computers, databases and network equipment.
 My idea is to use the stored data to perform predictions for capacity
 planning purposes. For example, based on the trafic on a network
 line for the last 12 months, what is the expected evolution in the
 next 3 months.
 So the data is more of the engineering type, I guess. But since the
 software is modular, somebody could also use it to monitor the stock
 market.
 Actually, anything can be monitored so the data could come from
 any source although practically mostly from computing related devices
 and activities.
 
 So I would like a book covering at least those subjects if possible.
 
 Thanks very much for your help.
 

-- 
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA

[EMAIL PROTECTED]
www.pdf.com http://www.pdf.com
Tel:  408-938-4420
Fax: 408-280-7915

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] SPSS Dataset

2005-09-08 Thread ERICK YEGON
How would one read SPSS data sets directly into R?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Time Series Analysis: book?

2005-09-08 Thread jfontain
Quoting Spencer Graves [EMAIL PROTECTED]:

 1.  Have you read the appropriate chapter in Venables and Ripley
 (2002) Modern Applied Statists with S (Springer)?  If no, I suggest you
 start there.

 2.  Have you worked through the vignettes associated with the zoo
 package?  If no, you might find that quite useful.  [Are you aware that
 edit(vignette(...)) will provide a script file with the R code discussed
 in the vignette, which can be viewed in Adobe Acrobat while you are
 working throught the examples line by line, modifying them, etc.?  I've
 found this to be very useful.  If you use XEmacs, edit(vignette(...))
 may not work.  Instead, try Stangle(vignette(...)$file).  This will copy
 the R code to a file in the working directory, which you can then open.]

 3.  Have you considered Durbin, J. and Koopman, S. J. (2001) _Time
 Series Analysis by State Space Methods._  Oxford University Press?  If
 no, you might want to spend some time with that.

  I'm still looking for the right kind of introduction and overview to
 what is available in R for time series analysis, especially with a
 Bayesian approach to Kalman filtering and smoothing.  Unfortunately, I
 have yet to find the key I feel I need to get started, though I found
 the vignettes with zoo to be quite helpful.

Thank you very much Spencer and all who responded.

I think I have enough to get started with all this valuable information.


--
Jean-Luc

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] strata in crr (cmprsk library)

2005-09-08 Thread Williams Scott
Hi all, I am aware that crr lacks the friendly command structure of
functions such as cph. All is clear to me about including covariates
until I want to include a stratification term in the competing risk
framework (no nice strat command). 

I am still a bit of a novice in R - I am looking for an example to help
me with this, but can't seem to find one. Any advice appreciated (no
matter how simple).

Thanks

Scott Williams MD
Peter MacCallum Cancer Centre
Melbourne, Australia

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Debugging R/Fortran in Windows

2005-09-08 Thread James Wettenhall
Hi,

I'm trying to debug an R interface to a Fortran subroutine from Windows. 
(Yes, I know I should try Unix/Linux as well, but a quick attempt
suggested that the (MinGW g77) Fortran compiler I have installed on my
Windows laptop works better on this Fortran code.)

I'm trying to follow the instructions in the Writing R Extensions Manual:

Start R under the debugger after setting a breakpoint for WinMain.
  gdb .../bin/Rgui.exe
  (gdb) break WinMain
  (gdb) run

But when I run gdb on Rgui.exe, I get the message:
no debugging symbols found
and then when I try break WinMain, I get:
No symbol table is loaded.  use the 'file' command.

I'm using R 2.1.1 on Windows 2000 and gdb 5.2.1 from MSys's MinGW.

I'm calling a Fortran function (several times) from R.  And I seem to have
the basic two-way data communication working - I appear to have
succesfully passed all required data types (integer, real, double
precision) to and from Fortran with sensible results both from within R
and from using WRITE(FILENUM,*) from within Fortran.  But unfortunately
there is still evidence of memory leakage.

Any suggestions would be greatly appreciated.

Regards,
James

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] SPSS Dataset

2005-09-08 Thread TEMPL Matthias
RSiteSearch("read spss data")
--
library(foreign)
?read.spss
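
For example (the file name is just a placeholder):

dat <- read.spss("mydata.sav", use.value.labels = TRUE, to.data.frame = TRUE)
str(dat)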

Best,
Matthias

 -----Original Message-----
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of ERICK YEGON
 Sent: Friday, 09 September 2005 07:02
 To: R-help@stat.math.ethz.ch
 Subject: [R] SPSS Dataset
 
 
 How would one read SPSS data sets directly into R
 
 __
 R-help@stat.math.ethz.ch mailing list 
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read 
 the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html