Re: [R] memory.size help

2007-09-01 Thread Michael Dewey
At 13:27 31/08/2007, dxc13 wrote:

I keep getting the 'memory.size' error message when I run a program I have
been writing.  It always says it cannot allocate a vector of a certain size.  I
believe the error comes in the code fragment below, where I have multiple
arrays that could be taking up space.  Does anyone know a good way around
this?

It is a bit hard without knowing the dimensions of the arrays but see below.

w1 <- outer(xk$xk1, data[,x1], function(y,z) abs(z-y))
w2 <- outer(xk$xk2, data[,x2], function(y,z) abs(z-y))
w1[w1 > d1] <- NA
w2[w2 > d2] <- NA
i1 <- ifelse(!is.na(w1), yvals[col(w1)], NA)
i2 <- ifelse(!is.na(w2), yvals[col(w2)], NA)

If I read this correctly after this point you no longer need w1 and 
w2 so what happens if you remove them?

zk <- numeric(nrow(xk))  #DEFINING AN EMPTY VECTOR TO HOLD ZK VALUES
for(x in 1:nrow(xk)) {
 k <- intersect(i1[x,], i2[x,])
 zk[x] <- mean(unlist(k), na.rm = TRUE)
}
xk$zk <- zk
data <- na.omit(xk)
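Acting on the suggestion above, a minimal sketch (assuming the objects from the fragment exist):

```r
## Once i1 and i2 have been computed, w1 and w2 are no longer needed:
## remove them and let the garbage collector reclaim the memory.
rm(w1, w2)
gc()  # triggers a collection and reports current memory use
```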
--
View this message in context: 
http://www.nabble.com/memory.size-help-tf4359846.html#a12425401
Sent from the R help mailing list archive at Nabble.com.

Michael Dewey
http://www.aghmed.fsnet.co.uk

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Exact Confidence Intervals for the Ratio of Two Binomial

2007-08-24 Thread Michael Dewey
At 18:28 23/08/2007, [EMAIL PROTECTED] wrote:
Hello everyone,

I would like to know if there is an R package to compute exact confidence
intervals for the ratio of two binomial proportions.

If you go to
https://www.stat.math.ethz.ch/pipermail/r-help//2006-November/thread.html
and search that part of the archive for relative risk you will find 
a number of suggestions. Unfortunately the responses are not all 
threaded so you need to search the whole thing.

Tony.

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] How to fit a linear model without intercept

2007-08-23 Thread Michael Dewey
At 10:56 23/08/2007, Michal Kneifl wrote:
Please could anyone help me?
How can I fit a linear model where an intercept has no sense?

Well, the line has to have an intercept somewhere I suppose.
If you use the site search facility and look for "known intercept"
you will get some clues.
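Two common cases, as a sketch (the names y, x, dat and the value a0 are illustrative):

```r
## 1. Regression through the origin (intercept fixed at zero):
fit0 <- lm(y ~ x - 1, data = dat)        # equivalently y ~ 0 + x

## 2. Known non-zero intercept a0: subtract it from the response
##    and fit without an intercept.
a0 <- 2.5                                # the known intercept
fit1 <- lm(I(y - a0) ~ x - 1, data = dat)
```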

Thanks in advance..

Michael



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] GLMM: MEEM error due to dichotomous variables

2007-08-11 Thread Michael Dewey
At 14:31 07/08/2007, Elva Robinson wrote:
I am trying to run a GLMM on some binomial data. My fixed factors 
include 2 dichotomous variables, day, and distance. When I run the model:

modelA <- glmmPQL(Leaving ~ Trial*Day*Dist, random = ~1|Indiv, family = binomial)

I get the error:

iteration 1
Error in MEEM(object, conLin, control$niterEM) :
Singularity in backsolve at level 0, block 1

 From looking at previous help topics
 (http://tolstoy.newcastle.edu.au/R/help/02a/4473.html)
I gather this is because of the dichotomous predictor variables - 
what approach should I take to avoid this problem?

I seem to remember a similar error message (although possibly not 
from glmmPQL). Does every combination of Trial * Day * Dist occur in 
your dataset?

You would find it easier to read your code if you used your space 
bar. Computer storage is cheap.
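To check the empty-cell conjecture above, a sketch (the data frame name `dat` is an assumption, since the glmmPQL call does not show one):

```r
## Cross-tabulate the three fixed factors; empty cells can produce
## singularities in the mixed-model fit.
tab <- with(dat, table(Trial, Day, Dist))
tab              # counts per combination
any(tab == 0)    # TRUE if some combination never occurs
```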


Thanks, Elva.




Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] proportional odds model in R

2007-08-02 Thread Michael Dewey
At 08:51 02/08/2007, Ramon Martínez Coscollà wrote:
Hi all!!

There is no need to post twice, nor to also post on allstat.

Pages 204-205 of MASS, for which this software is
a support tool, provide ample information on how to compare models.
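For nested polr fits, one standard comparison is a likelihood-ratio test; a sketch using the housing data shipped with MASS:

```r
library(MASS)
fit1 <- polr(Sat ~ Infl, weights = Freq, data = housing)
fit0 <- polr(Sat ~ 1,    weights = Freq, data = housing)
anova(fit0, fit1)   # likelihood-ratio test of the Infl effect
AIC(fit0, fit1)     # or compare information criteria directly
```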

I am using a proportional odds model to study some ordered categorical
data. I am trying to predict one ordered categorical variable taking
into account only another categorical variable.

I am using polr from the R MASS library. It seems to work ok, but I'm
still getting familiar and I don't know how to assess goodness of fit.
I have this output, when using response ~ independent variable:

Residual Deviance: 327.0956
AIC: 333.0956
> polr.out$df.residual
[1] 278
> polr.out$edf
[1] 3

When taking out every variable... (i.e., making 
formula: response ~ 1), I have:

Residual Deviance: 368.2387
AIC: 372.2387

How can I test if the model fits well? How can I check that the
independent variable effectively explains the model? Is there any
test?

Moreover, sending summary(polr.out) I get this error:


Error in optim(start, fmin, gmin, method = "BFGS", hessian = Hess, ...) :
initial value in 'vmmin' is not finite

Something to do with the optimisation procedure... but, how can I fix
it? Any help would be greatly appreciated.

Thanks.

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] generating symmetric matrices

2007-07-31 Thread Michael Dewey
At 16:29 30/07/2007, Gregory Gentlemen wrote:


Douglas Bates [EMAIL PROTECTED] wrote: On 7/27/07, Gregory 
Gentlemen  wrote:
  Greetings,

  I have a seemingly simple task which I have not been able to 
 solve today. I want to construct a symmetric matrix of arbitrary 
 size w/o using loops. The following I thought would do it:

  p <- 6
  Rmat <- diag(p)
  dat.cor <- rnorm(p*(p-1)/2)
  Rmat[outer(1:p, 1:p, "<")] <- Rmat[outer(1:p, 1:p, "<")] - dat.cor

  However, the problem is that the matrix is filled by column and 
 so the resulting matrix is not symmetric.

Could you provide more detail on the properties of the symmetric
matrices that you would like to generate?  It seems that you are
trying to generate correlation matrices.  Is that the case?  Do you
wish the matrices to be a random sample from a specific distribution?
If so, what distribution?

Yes, my goal is to generate correlation matrices whose entries have 
been sampled independently from a normal with a specified mean and variance.

Would it be sufficient to use one of the results of
RSiteSearch("random multivariate normal", restrict = "functions")
or have I completely misunderstood what you want? (I appreciate this 
is not exactly what you say you want.)
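A loop-free construction of a symmetric matrix, as a sketch (note that independently sampled entries will not in general yield a positive-definite correlation matrix):

```r
p <- 6
Rmat <- diag(p)
Rmat[upper.tri(Rmat)] <- rnorm(p * (p - 1) / 2)     # fill upper triangle
Rmat[lower.tri(Rmat)] <- t(Rmat)[lower.tri(Rmat)]   # mirror it
isSymmetric(Rmat)                                   # TRUE
eigen(Rmat)$values    # check positive definiteness separately
```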

Thanks for the help.

Greg



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] SLLOOOWWW function ...

2007-07-18 Thread Michael Dewey
At 12:32 17/07/2007, Johannes Graumann wrote:
Does anybody have any insight into how to make this faster?

I am not an expert on R programming by any means but I notice you are 
growing your new data frame row by row. I believe it is normally 
recommended to allocate enough space to start with.

I suspect that the rounding going on may be an issue, as is the stepping
through data frame rows using integers ...

If you have the patience to teach a noob, he will highly appreciate it ;0)

Joh

digit <- 4
for (minute in seq(from=25, to=lrange[2])) {
   # Extract all data associated with the current time (minute)
   frame <- subset(mylist, mylist[["Time"]] == minute)
   # Sort by Intensity
   frame <- frame[order(frame[["Intensity"]], decreasing = TRUE),]
   # Establish output frame using the most intense candidate
   newframe <- frame[1,]
   # Establish overlap-checking vector using the most intense candidate
   lowppm <- round(newframe[1,][["Mass"]] - newframe[1,][["Mass"]]/1E6*ppmrange, digits=digit)
   highppm <- round(newframe[1,][["Mass"]] + newframe[1,][["Mass"]]/1E6*ppmrange, digits=digit)
   presence <- seq(from=lowppm, to=highppm, by=10^(-digit))
   # Walk through the entire original frame and check whether peaks are
   # overlap-free ... do so until max of 2000 entries
   for (int in seq(from=2, to=nrow(frame))) {
     if (nrow(newframe) < 2000) {
       lowppm <- round(frame[int,][["Mass"]] - frame[int,][["Mass"]]/1E6*ppmrange, digits=digit)
       highppm <- round(frame[int,][["Mass"]] + frame[int,][["Mass"]]/1E6*ppmrange, digits=digit)
       windowrange <- seq(from=lowppm, to=highppm, by=10^(-digit))
       if (sum(round(windowrange, digits=digit) %in% round(presence, digits=digit)) < 1) {
         newframe <- rbind(newframe, frame[int,])
         presence <- c(presence, windowrange)
       }
     } else {
       break
     }
   }
}
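A sketch of the preallocation idea suggested above, reusing names from the fragment (the overlap test itself is elided):

```r
## Instead of growing newframe with rbind() on every hit, record which
## rows to keep and subset once at the end.
keep <- logical(nrow(frame))
keep[1] <- TRUE                        # the most intense candidate
for (int in seq(from = 2, to = nrow(frame))) {
  if (sum(keep) >= 2000) break
  ## ... overlap test as above; when the peak is overlap-free:
  # keep[int] <- TRUE
}
newframe <- frame[keep, ]
```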

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Partial Proportional Odds model using vglm

2007-07-17 Thread Michael Dewey
At 13:01 16/07/2007, Rizwan Younis wrote:
Hi:
I am trying to fit a PPO model using vglm from VGAM, and get an error while
executing the code.

You seem to keep posting the same problem.
Since the only person who can tell you what is happening inside VGAM 
is probably the maintainer, a more efficient strategy would be to 
email him, as the instructions ask you to do.

However if that fails then try simplifying your problem to see if the 
error goes away
1) try merging ages 1 and 2, and ages 4 and 5
2) try merging column 3 2 with 2 1 or 4 3
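A sketch of suggestion 1, assuming the data frame rc13 shown below (age groups 1 and 2 merged, likewise 4 and 5, by summing the counts):

```r
## Map ages 1,2 -> 1; 3 -> 2; 4,5 -> 3, then sum counts within groups.
grp   <- c(1, 1, 2, 3, 3)[rc13$a]
rc13m <- aggregate(rc13[, -1], by = list(a = grp), FUN = sum)
rc13m
```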


Here is the data, code, and error:

Data = rc13; the first row is the column names. a = age, and 1, 2, 3, 4 and 5 are
condition grades.

   a  1 2 3  4 5
   1  1 0 0  0 0
   2 84 2 7 10 2
   3 16 0 6  6 2
   4 13 0 3  4 0
   5  0 0 0  1 0

library(VGAM)

rc13 <- read.table("icg_rcPivot_group2.txt", header=F)
names(rc13) <- c("a","1","2","3","4","5")

ppo <- vglm(cbind(rc13[,2],rc13[,3],rc13[,4],rc13[,5],rc13[,6]) ~ a, family =
cumulative(link = "logit", parallel = F, reverse = F), na.action = na.pass,
data = rc13)
summary(ppo)

I get the following error:

Error in [-(`*tmp*`, , index, value = c(1.13512932539841,
0.533057528200189,  :
 number of items to replace is not a multiple of replacement length
In addition: Warning messages:
1: NaNs produced in: log(x)
2: fitted values close to 0 or 1 in: tfun(mu = mu, y = y, w = w, res =
FALSE, eta = eta, extra)
3: 19 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace,
wzeps = control$wzepsilon)

I will appreciate any help to fix this problem.
Thanks

Reez You
Grad Student
University of Waterloo
Waterloo, ON Canada

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Difference in linear regression results for Stata and R

2007-07-13 Thread Michael Dewey
At 17:17 12/07/2007, kdestler wrote:

Hi
I recently imported data from R into Stata.  I then ran the linear
regression model I've been working on, only to discover that the results are
somewhat (though not dramatically) different.  The standard errors vary more
between the two programs than do the coefficients themselves.  Any
suggestions on what I've done that causes this mismatch?

You really need to find a small example which exhibits the problem 
and then post that.


Thanks,
Kate
--
View this message in context: 
http://www.nabble.com/Difference-in-linear-regression-results-for-Stata-and-R-tf4069072.html#a11563283
Sent from the R help mailing list archive at Nabble.com.

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Meta-Analysis of proportions

2007-06-28 Thread Michael Dewey
At 09:58 28/06/2007, Chung-hong Chan wrote:
OpenBUGS should be something related to Bayesian statistics.

You may refer to Chapter 12 of Handbook
http://cran.r-project.org/doc/vignettes/HSAUR/Ch_meta_analysis.pdf
It talks about meta-regression.



On 6/28/07, Monica Malta [EMAIL PROTECTED] wrote:
Dear colleagues,

I'm conducting a meta-analysis of studies evaluating adherence of 
HIV-positive drug users into AIDS treatment, therefore I'm looking 
for some advice and syntax suggestion for running the 
meta-regression using proportions, not the usual OR/RR frequently 
used on RCT studies.

Monica, you have a number of options.
1 - weight each study equally
2 - weight each individual equally
3 - use the usual inverse variance procedure, possibly transforming 
the proportions first
4 - something else I have not thought of

You could do 3 using rmeta which is available from CRAN. Programming 
1 or 2 is straightforward.
Of course, you do need to decide which corresponds to your scientific question.
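A sketch of option 3 with invented numbers: logit-transform each study's proportion and pool by inverse variance.

```r
events <- c(12, 30, 8)                   # e.g. adherent patients per study
n      <- c(40, 100, 25)                 # study sizes (illustrative only)
p      <- events / n
y      <- log(p / (1 - p))               # logit transform
v      <- 1/events + 1/(n - events)      # approximate variance of the logit
w      <- 1 / v
pooled <- sum(w * y) / sum(w)            # fixed-effect pooled logit
se     <- sqrt(1 / sum(w))
plogis(pooled + c(-1.96, 0, 1.96) * se)  # back-transformed lower, estimate, upper
```

rmeta's meta.summaries(y, sqrt(v)) performs the same inverse-variance pooling.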


I have already searched several handbooks, R manuals, mailing 
lists, professors, but... no clue at all...

Has anyone already tried this? A colleague of mine recently 
published a similar study in JAMA, but he used OpenBUGS - 
software I'm not familiar with...

If there is any tip/suggestion for a possible syntax, could someone 
send it to me? I need to finish this paper before my PhD qualifying 
exam, but I'm completely stuck...

So, any tip will be more than welcome...I will really appreciate it!!!

Thanks in advance and congrats on the amazing mailing-list.



Bests from Rio de Janeiro, Brazil.

Monica





Monica Malta
Researcher
Oswaldo Cruz Foundation - FIOCRUZ
Social Science Department - DCS/ENSP
Rua Leopoldo Bulhoes, 1480 - room 905
Manguinhos
Rio de Janeiro - RJ 21041-210
Brazil
phone +55.21.2598-2715
fax +55.21.2598-2779



--
The scientists of today think deeply instead of clearly. One must be
sane to think clearly, but one can think deeply and be quite insane.
Nikola Tesla
http://www.macgrass.com



Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html



Re: [R] merging dataframes with diffent rownumbers

2007-06-18 Thread Michael Dewey
At 09:09 18/06/2007, Thomas Hoffmann wrote:
Dear R-Helpers,

I have following problem:

I do have two data frames dat1 and dat2 with a common column BNUM 
(long integer). dat1 has a larger number of BNUM than dat2, and 
different rows of dat2 have equal BNUM. The numbers of rows in dat1 
and dat2 are not equal.  I applied the tapply function to dat2 with 
BNUM as index. I would like to add the columns from dat1 to the results of

b.sum <- tapply(dat2, BNUM, sum).

However the BNUM of b.sum are only a subset of the dat1.

Does anybody know an elegant way to solve the problem?

If I understand you correctly
?merge
should help you here
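A sketch with invented data showing the merge step:

```r
dat1 <- data.frame(BNUM = 1:4, x = letters[1:4])
dat2 <- data.frame(BNUM = c(1, 1, 3), y = c(10, 5, 7))

## Sum dat2 by BNUM, then join the result onto dat1.
b.sum <- aggregate(y ~ BNUM, data = dat2, FUN = sum)
merge(dat1, b.sum, by = "BNUM", all.x = TRUE)   # keeps every row of dat1
```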

Thanks in advance

Thomas H.



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Responding to a posting in the digest

2007-06-14 Thread Michael Dewey
At 08:26 14/06/2007, Moshe Olshansky wrote:
Is there a convenient way to respond to a particular
posting which is a part of the digest?
I mean something that will automatically quote the
original message, subject, etc.

Yes, if you use appropriate mailing software. I use Eudora, receive 
the emails in MIME format and the responses do get properly threaded 
as far as I can see from the mailing list archives.

Thank you!

Moshe Olshansky
[EMAIL PROTECTED]

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] export to a dat file that SAS can read

2007-06-13 Thread Michael Dewey
At 09:05 13/06/2007, Rina Miehs wrote:
Hello

I have a data frame in R that some SAS users need to use in their
programs, and they want it in a .dat file. Is that possible,
and which functions should I use for that?

Does
library(foreign)
?write.foreign
get you any further forward?
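A sketch (file names are illustrative): write.foreign produces a plain data file plus a SAS program that reads it.

```r
library(foreign)
df <- data.frame(farid = c(10007995, 10076495), antal1 = c(3, 3))
write.foreign(df, datafile = "out13.dat", codefile = "out13.sas",
              package = "SAS")

## Alternatively, a bare text file the SAS users can read themselves:
write.table(df, "out13.dat", row.names = FALSE, quote = FALSE)
```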


my data frame is like this:

> out13[1:100,]
       farid       niveau1       niveau3           p1           p3 antal1
2   10007995  0.0184891394  4.211306e-10 5.106471e-02 2.594580e-02      3
9   10076495  0.0140812953  3.858757e-10 1.065804e-01 3.743271e-02      3
10  10081892  0.0241760590  7.429612e-10 1.628295e-02 3.021538e-04      6
13  10101395  0.0319517576  3.257375e-10 2.365204e-03 6.633232e-02     19
16  10104692  0.0114040787  3.661169e-10 1.566721e-01 4.550663e-02      4
17  10113592  0.0167586526  4.229634e-10 6.922003e-02 2.543987e-02      2
18  10113697  0.0259205504  2.888646e-10 1.096366e-02 9.118995e-02      6
22  10121697 -0.0135341273 -5.507914e-10 1.157417e-01 5.501947e-03     16
28  10146495  0.0093514076  3.493487e-10 2.041883e-01 5.340801e-02      4
29  10150497  0.0091611504  3.455925e-10 2.089893e-01 5.531904e-02      4
36  10171895  0.0089116669  2.956742e-10 2.153844e-01 8.614259e-02      4
42  10198295  0.0078515166  3.147140e-10 2.437943e-01 7.314111e-02      5


Thanks

Rina





Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] A function for raising a matrix to a power?

2007-05-07 Thread Michael Dewey
At 15:48 06/05/2007, Ron E. VanNimwegen wrote:
  Hi,

Is there a function for raising a matrix to a power? For example if 
you like to compute A%*%A%*%A, is there an abbreviation similar to A^3?

Atte Tenkanen

I may be revealing my ignorance here, but is MatrixExp in the msm 
package (available from CRAN) not relevant here?

[snip other suggestion]



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] digest readability/ thunderbird email client/ R-help

2007-04-29 Thread Michael Dewey
At 21:00 28/04/2007, Martin Maechler wrote:
Hi Sam,

I posted a request for information on how various mailers handled 
this so I could summarise the results and put it on a helpful 
webpage. Nobody was willing to sing the praises of his/her favourite 
mailer so the only one I can suggest is the one I use myself, Eudora, 
although that is now no longer under development.

My offer is still open if anyone has anything to contribute.


  SamMc == Sam McClatchie [EMAIL PROTECTED]
  on Fri, 27 Apr 2007 11:21:56 -0700 writes:

 SamMc System:Linux kernel 2.6.15 Ubuntu dapper
 ...


 SamMc Has anyone figured out how to make the R-help digest
 SamMc more easily readable in the Thunderbird mail client?
 SamMc I think the digest used to be more readable than is
 SamMc currently the case with all of the messages as
 SamMc attachments.

 SamMc I know the work around is to get individual messages
 SamMc rather than the digest, use another mail client, or
 SamMc just look at the archives on the web...

 {and there are at least two alternatives to the standard
  archives, notably Gmane.
  BTW: Just found an URL that should list all R lists carried
   by them :  http://dir.gmane.org/search.php?match=.r.


It had been pointed out more than once that Thunderbird
unfortunately is not adhering to the standard(s) of such
digests-- too bad it still is not.

One alternative is to get the digest as plain digest
((which BTW is another standard-complying digest format, that
  e.g. emacs Rmail or VM can easily deal with))
which will most probably just appear as one long e-mail in
Thunderbird, with table of contents of all the subject lines,
but nothing clickable

Regards,
Martin

Martin Maechler, ETH Zurich

Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html



Re: [R] graphs superimposed on pictures?

2007-04-12 Thread Michael Dewey
At 14:39 11/04/2007, Robert Biddle wrote:
Hi:

I am doing some work that involves plotting points of interest
superimposed on photographs and maps. I can produce the plots fine 
in R, but so far
I have had to do the superimposition externally, which makes it 
tedious to do exploratory work.
I have looked to see if there is some capability to put a background 
picture on a plot window,
but I have not found anything.
Advice, anyone?

Although my situation was not exactly the same as yours you may find

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/42884.html

a help

Cheers
Robert Biddle



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] summary polr

2007-02-20 Thread Michael Dewey
At 15:21 19/02/2007, Paolo Accadia wrote:
Hi all,

I have a problem to estimate Std. Error and 
t-value by "polr" in library MASS.
They result from the summary of a polr object.

I can obtain them working in the R environment with the following statements:

  temp <- polr(formula = formula1,  data = data1)
  coeff <- summary(temp),

but when the above statements are enclosed in a 
function, summary reports the following error:

Error in eval(expr, envir, enclos) : object dat not found

Does someone know how I can solve the problem?

By giving us a short reproducible example?

Specifically we do not know:
1 - what formula1 is
2 - what the structure of data1 is
3 - what the enclosing function looks like
4 - what dat is


Thanks for any help.
Paolo

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] summary polr

2007-02-20 Thread Michael Dewey
At 14:41 20/02/2007, you wrote:
Please do not just reply to me,
1 - I might not know
2 - it breaks the threading

Hi

here there is an example extracted from polr help in MASS:

The function could be:

temp <- function(form, dat) {
 house.plr <- polr(formula = form, weights = Freq, data = dat)
 coeff <- summary(house.plr)
 return(coeff)
}

Why do you try to redefine the "coef" extractor function?
Try calling the result of summary "summ" or some 
other obvious name and I think you will find the problem goes away.

See also
  library(fortunes)
  fortune("dog")

Firstly, don't call your matrix 'matrix'. Would you call your dog 'dog'?
Anyway, it might clash with the function 'matrix'.
-- Barry Rowlingson
   R-help (October 2004)
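One workaround, alongside the renaming suggestion: ask polr for the Hessian at fit time, so that summary() does not need to re-fit the model (the re-fit is what fails to find `dat`). A sketch:

```r
library(MASS)
temp <- function(form, dat) {
  house.plr <- polr(form, weights = Freq, data = dat, Hess = TRUE)
  summary(house.plr)
}
temp(Sat ~ Infl + Type + Cont, housing)
```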



the function can be called by:

   temp(Sat ~ Infl + Type + Cont, housing)

where all data is available from MASS, as it is 
an example in R Help on 'polr'.

Results are:

Re-fitting to get Hessian

Error in eval(expr, envir, enclos) : object dat not found

Paolo Accadia


  Michael Dewey [EMAIL PROTECTED] 20/02/07 1:43 PM 
At 15:21 19/02/2007, Paolo Accadia wrote:
 Hi all,
 
 I have a problem to estimate Std. Error and
 t-value by "polr" in library MASS.
 They result from the summary of a polr object.
 
 I can obtain them working in the R environment 
 with the following statements:
 
   temp <- polr(formula = formula1,  data = data1)
   coeff <- summary(temp),
 
 but when the above statements are enclosed in a
 function, summary reports the following error:
 
 Error in eval(expr, envir, enclos) : object dat not found
 
 Someone knows how I can solve the problem?

By giving us a short reproducible example?

Specifically we do not know:
1 - what formula1 is
2 - what the structure of data1 is
3 - what the enclosing function looks like
4 - what dat is


 Thanks for any help.
 Paolo

Michael Dewey
http://www.aghmed.fsnet.co.uk




Re: [R] enhanced question / standardized coefficients

2007-02-09 Thread Michael Dewey
At 15:51 07/02/2007, Brian S Cade wrote:
There was a nice paper in The American Statistician by Johan Bring (1994.
How to standardize regression coefficients.  The American Statistician
48(3):209-213) pointing out that comparing ratios of t-test statistic
values (for null hypothesis that parameter = 0) is equivalent to comparing
ratios of standardized coefficients where standardization is based on the
partial (conditional) standard deviations of the parameter estimates.  And
this is equivalent to thinking about the incremental improvement in
R-squared that is obtained by including a variable in the regression model
after all others are already in the model.   It would seem possible to
extend this idea to categorical factor variables with more than 2 levels
(1 indicator variable), given the relation between an F and t-test
statistic.

You may also be interested in
http://www.tfh-berlin.de/~groemp/rpack.html
As well as her package relaimpo, this also links to her article in JSS, 
which I found thought-provoking.


Anyway, something to think about, though there are no doubt still
limitations in trying to equate effects of variables measured on disparate
scales.

Brian

Brian S. Cade

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  [EMAIL PROTECTED]
tel:  970 226-9326



John Fox [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
02/07/2007 07:49 AM

To
'Simon P. Kempf' [EMAIL PROTECTED]
cc
r-help@stat.math.ethz.ch
Subject
Re: [R] enhanced question / standardized coefficients






Dear Simon,

In my opinion, standardized coefficients only offer the illusion of
comparison for quantitative explanatory variables, since there's no deep
reason that the standard deviation of one variable has the same meaning as
the standard deviation of another. Indeed, if the variables are in the
same
units of measurement in the first place, permitting direct comparison of
unstandardized coefficients, then separate standardization of each X is
like
using a rubber ruler.

That said, as you point out, it makes no sense to standardize the dummy
regressors for a factor, so you can just standardize the quantitative
variables (Y and X's) in the regression equation.
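Following this advice, a sketch with invented data: standardize only the quantitative response and predictor, leaving the factor's dummy regressors alone.

```r
set.seed(1)
dat <- data.frame(y = rnorm(50), x = rnorm(50),
                  g = factor(sample(c("a", "b", "c"), 50, replace = TRUE)))
## scale() standardizes y and x; g enters as ordinary dummy variables.
fit <- lm(scale(y) ~ scale(x) + g, data = dat)
coef(fit)   # the scale(x) coefficient is a standardized slope
```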

I hope that this helps,
  John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox


  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of Simon P. Kempf
  Sent: Wednesday, February 07, 2007 9:27 AM
  To: r-help@stat.math.ethz.ch
  Subject: [R] enhanced question / standardized coefficients
 
  Hello,
 
 
 
  I would like to repost the question of Joerg:
 
 
 
  Hello everybody,
 
  a question that connects to the question of Frederik Karlsons
  about 'how to stand. betas'.
  With the stand. betas I can compare the influence of the
  different explaining variables. What do I do with the betas of
  factors? I can't use the solution of John Fox, because there
  is no sd of a factor. How can I compare the influence of the
  factor with the influence of the numeric variables?
 
  I got the same problem. In my regression equation there are
  several categorical variables and  I would like to compute
  the standard coefficients. How can I do this?
 
 
 
  Simon
 
 
 
 
 
 
 
 
 
 
 




Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] How to find series of small numbers in a big vector?

2007-01-30 Thread Michael Dewey
At 01:49 30/01/2007, Ed Holdgate wrote:

Hello:

I have a vector with 120,000 reals
between 0.0 and 0.

They are not sorted but the vector index is the
time-order of my measurements, and therefore
cannot be lost.

How do I use R to find the starting and ending
index of ANY and ALL the series or sequences
in that vector where ever there are 5 or more
members in a row between 0.021 and 0.029 ?

You could look at ?rle, which encodes a vector into runs.


For example:

search_range <- c(0.021, 0.029) # inclusive searching
search_length <- 5   # find ALL series of 5 members within search_range
my_data <- c(0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.028, 0.024, 0.027, 0.023,
             0.022, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.024, 0.029, 0.023,
             0.025, 0.026, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.022,
             0.023, 0.025, 0.333, 0.027, 0.028,
             0.900, 0.900, 0.900, 0.900, 0.900)

I seek the R program to report:
start_index of 12 and an end_index of 16
-- and also --
start_index of 23 and an end_index of 27
because that is where there happens to be
search_length numbers within my search_range.

It should _not_ report the series at start_index 40
because that 0.333 in there violates the search_range.

I could brute-force hard-code an R program, but
perhaps an expert can give me a tip for an
easy, elegant existing function or a tactic
to approach?

Execution speed or algorithm performance is not,
for me in this case, important.  Rather, I
seek an easy R solution to find the time windows
(starting & ending indices) where 5 or more
small numbers in my search_range were measured
all in a row.

Advice welcome and many thanks in advance.
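A sketch of the rle() approach, assuming my_data and search_range as defined in the example above:

```r
in_range <- my_data >= search_range[1] & my_data <= search_range[2]
r      <- rle(in_range)           # code the logical vector into runs
ends   <- cumsum(r$lengths)       # last index of each run
starts <- ends - r$lengths + 1    # first index of each run
hits   <- r$values & r$lengths >= search_length
cbind(start = starts[hits], end = ends[hits])
##      start end
## [1,]    12  16
## [2,]    23  27
```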

Ed Holdgate

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] replicating the odds ratio from a published study

2007-01-29 Thread Michael Dewey
At 21:13 28/01/2007, Bob Green wrote:
Michael,

Thanks. Yes, clearly the volume number for the Schanda paper I cited is wrong.

Where things are a bit perplexing, is that I used the same method as 
Peter suggested on two papers by Eronen (referenced below). I can 
reproduce in R a similar odds ratio to the first published paper e.g 
OR = 9.7 (CI= 7.4-12.6) whereas I obtained quite different results 
from the second published paper (Eronen 2) of OR =  10.0 (8.1-12.5). 
One reason why I wanted to work out the calculations was so I could 
analyse data from studies using the same method, for confirmation.

Now the additional issue, is that Woodward, who is also the author 
of an epidemiological text, says in a review that Eronen used 
wrong  formula in a 1995 paper and indicates that this comment 
applies also to later studies - he stated the they use methods 
designed for use with binomial data when they really have Poisson 
data. Consequently, they quote odds ratios when they really have 
relative rates and their confidence intervals are 
inaccurate.  Eronen1 cites the formula that was used for OR. 
Schanda sets out his table for odds ratio the same as Eronen1

There do seem to be difficulties in what they are doing as they have 
not observed all the non-homicides, they estimate how many they are 
and then estimate the number of people with a given diagnosis using 
prevalence estimates from another study. I think you are moving 
towards writing an article criticising the statistical methods used 
in this whole field which I think is going beyond the resources of R-help.


For the present purpose, my primary question is: as you have now 
seen the Schanda paper, would you consider Schanda calculated odds 
or relative risk?

Also, when I tried the formula suggested by Peter (below) I obtained 
an error - do you know what M might be or the source of the error?

exp(log(41*2936210/920/20068)+qnorm(c(.025,.975))*sqrt(sum(1/M)))
Error in sum(1/M) : object 'M' not found


> eronen1 <- as.table(matrix(c(58, 852, 13600-58, 1947000-13600-852),
+   ncol = 2, dimnames = list(group = c("scz", "nonscz"),
+   who = c("sample", "population"))))
> fisher.test(eronen1)


p-value < 2.2e-16
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
   7.309717 12.690087
sample estimates:
odds ratio
   9.713458

> eronen2 <- as.table(matrix(c(86, 1302, 13530-86, 1933000-13530-1302),
+   ncol = 2, dimnames = list(group = c("scz", "nonscz"),
+   who = c("sample", "population"))))
> fisher.test(eronen2)

p-value < 2.2e-16
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
   7.481272 11.734136
sample estimates:
odds ratio
9.42561

References

Eronen, M. et al. (1996 - 1) Mental disorders and homicidal behavior 
in Finland. Archives of General Psychiatry, 53, 497-501

Eronen, M et al (1996 - 2). Schizophrenia & homicidal 
behavior.  Schizophrenia Bulletin, 22, 83-89

Woodward, Mental disorder & homicide. Epidemiologia E Psichiatria 
Sociale, 9, 171-189

Any comments are welcomed,

Bob

At 01:57 PM 28/01/2007 +, Michael Dewey wrote:
At 22:01 26/01/2007, Peter Dalgaard wrote:
Bob Green wrote:
Peter & Michael,

I now see my description may have confused the issue.  I do want 
to compare odds ratios across studies - in the sense that I want 
to create a table with the respective odds ratio for each study. 
I do not need to statistically test two sets of odds ratios.

What I want to do is ensure the method I use to compute an odds 
ratio is accurate and intended to check my method against published sources.

The paper I selected by Schanda et al (2004). Homicide and major 
mental disorders. Acta Psychiatr Scand, 11:98-107 reports a total 
sample of 1087. Odds ratios are reported separately for men and 
women. There were 961 men all of whom were convicted of homicide. 
Of these 961 men, 41 were diagnosed with schizophrenia. The 
unadjusted odds ratio is for this  group of 41 is cited as 
6.52   (4.70-9.00).  They also report the general population aged 
over 15 with schizophrenia =20,109 and the total population =2,957,239.

Looking at the paper (which is in volume 110 by the way) suggests 
that Peter's reading of the situation is correct and that is what 
the authors have done.

Any further clarification is much appreciated,
A fisher.test on the following matrix seems about right:
> matrix(c(41,920,20109-41,2957239-20109-920),2)

 [,1][,2]
[1,]   41   20068
[2,]  920 2936210

> fisher.test(matrix(c(41,920,20109-41,2957239-20109-920),2))

Fisher's Exact Test for Count Data

data:  matrix(c(41, 920, 20109 - 41, 2957239 - 20109 - 920), 2)
p-value < 2.2e-16
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
4.645663 8.918425
sample estimates:
odds ratio
  6.520379

The c.i. is not precisely the same as your source. This could be 
down to a different approximation (R's is based on the noncentral 
hypergeometric distribution), but the classical

Re: [R] replicating the odds ratio from a published study

2007-01-28 Thread Michael Dewey
At 22:01 26/01/2007, Peter Dalgaard wrote:
Bob Green wrote:
Peter & Michael,

I now see my description may have confused the issue.  I do want to 
compare odds ratios across studies - in the sense that I want to 
create a table with the respective odds ratio for each study. I do 
not need to statistically test two sets of odds ratios.

What I want to do is ensure the method I use to compute an odds 
ratio is accurate and intended to check my method against published sources.

The paper I selected by Schanda et al (2004). Homicide and major 
mental disorders. Acta Psychiatr Scand, 11:98-107 reports a total 
sample of 1087. Odds ratios are reported separately for men and 
women. There were 961 men all of whom were convicted of homicide. 
Of these 961 men, 41 were diagnosed with schizophrenia. The 
unadjusted odds ratio is for this  group of 41 is cited as 
6.52   (4.70-9.00).  They also report the general population aged 
over 15 with schizophrenia =20,109 and the total population =2,957,239.

Looking at the paper (which is in volume 110 by the way) suggests 
that Peter's reading of the situation is correct and that is what the 
authors have done.

Any further clarification is much appreciated,


A fisher.test on the following matrix seems about right:
> matrix(c(41,920,20109-41,2957239-20109-920),2)

 [,1][,2]
[1,]   41   20068
[2,]  920 2936210

> fisher.test(matrix(c(41,920,20109-41,2957239-20109-920),2))

Fisher's Exact Test for Count Data

data:  matrix(c(41, 920, 20109 - 41, 2957239 - 20109 - 920), 2)
p-value < 2.2e-16
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
4.645663 8.918425
sample estimates:
odds ratio
  6.520379

The c.i. is not precisely the same as your source. This could be 
down to a different approximation (R's is based on the noncentral 
hypergeometric distribution), but the classical asymptotic formula gives

> exp(log(41*2936210/920/20068)+qnorm(c(.025,.975))*sqrt(sum(1/M)))
[1] 4.767384 8.918216

which is closer, but still a bit narrower.
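
The undefined M that tripped up Bob earlier in the thread is just the 2x2 table itself; a sketch with M defined explicitly (the variable names are mine):

```r
# sum(1/M) is the usual Woolf variance of the log odds ratio, summed
# over all four cells of the 2x2 table.
M <- matrix(c(41, 920, 20109 - 41, 2957239 - 20109 - 920), 2)
or <- (M[1, 1] * M[2, 2]) / (M[1, 2] * M[2, 1])
exp(log(or) + qnorm(c(0.025, 0.975)) * sqrt(sum(1/M)))
# [1] 4.767384 8.918216
```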


Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] replicating the odds ratio from a published study

2007-01-26 Thread Michael Dewey
At 09:04 26/01/2007, Bob Green wrote:
I wanted to compare odds ratio across studies and tried to replicate 
the results from a study but have not been able to determine how to 
do this in R.

The study reported a sample of 961 men, of whom 41 had a disorder. 
The reported raw odds ratio was 6.52 (4.70-9.00)

For an odds ratio you require two odds from which you form the odds ratio.
You only have one odds.
Do you have another one lying around somewhere?


I did a search of the archives and came across script that looks 
like it should perform this task, however the results I obtained did not match

> prop.test(41,961)

 1-sample proportions test with continuity correction

data:  41 out of 961, null probability 0.5
X-squared = 802.1686, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
  0.03115856 0.05795744
sample estimates:
  p
0.04266389

 
> ci.p <- prop.test(920, 961)$conf
> ci.odds <- ci.p/(1-ci.p)
> ci.odds
[1] 16.25404 31.09390
attr(,"conf.level")
[1] 0.95


Any advice regarding the script I require is appreciated,

regards

bob Green



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] selecting rows for inclusion in lm

2007-01-18 Thread Michael Dewey
At 08:19 18/01/2007, David Barron wrote:
Why not use the subset option?  Something like:

lm(diff ~ Age + Race, data = data, subset = Meno == "PRE")

should do the trick, and be much easier to read!

And indeed the advice in
> library(fortunes)
> fortune("dog")

Firstly, don't call your matrix 'matrix'. Would you call your dog 'dog'?
Anyway, it might clash with the function 'matrix'.
-- Barry Rowlingson
   R-help (October 2004)

 
also helps to make life clearer I find
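
A self-contained sketch of the subset= idiom (made-up data; the column names simply mirror John's post):

```r
set.seed(1)
dat <- data.frame(diff = rnorm(40), Age = rnorm(40, mean = 50),
                  Race = factor(rep(c("B", "W"), 20)),
                  Meno = factor(rep(c("PRE", "POST"), each = 20)))
# subset= is evaluated inside 'dat', so no dat$ prefix is needed
fit <- lm(diff ~ Age + Race, data = dat, subset = Meno == "PRE")
nobs(fit)               # 20: only the PRE rows enter the fit
sum(dat$Meno == "PRE")  # 20
```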


On 18/01/07, John Sorkin [EMAIL PROTECTED] wrote:
I am having trouble selecting rows of a dataframe that will be included
in a regression. I am trying to select those rows for which the variable
Meno equals PRE. I have used the code below:

difffitPre <- lm(data[,"diff"] ~ data[,"Age"] + data[,"Race"], data = data[data[,"Meno"]=="PRE",])
summary(difffitPre)

The output from the summary indicates that more than 76 rows are
included in the regression:

Residual standard error: 2.828 on 76 degrees of freedom

where in fact only 22 rows should be included as can be seen from the
following:

print(length(data[data[,"Meno"]=="PRE","Meno"]))
[1] 22

I would appreciate any help in modifying the data= parameter of the lm
so that I include only those subjects for which Meno=PRE.

R 2.3.1
Windows XP

Thanks,
John

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
[EMAIL PROTECTED]

Confidentiality Statement:
This email message, including any attachments, is for the so...{{dropped}}



--
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] How to format R code in LaTex documents

2007-01-16 Thread Michael Dewey
At 14:12 15/01/2007, Frank E Harrell Jr wrote:
Benjamin Dickgiesser wrote:
Hi,
I am planning on putting some R script in an appendix of a LaTex
document. Can anyone recommend me a way of how to format it? Is there
a way to keep all line breaks without having to insert \\ in every
single line?
Thank you!
Benjamin

Here's one way and I would appreciate anyone's 
improvements.  I've also included solutions from 
two others.  Please let me know what you decide to use.  -Frank

Some interesting ideas below so just to add that 
I find I often have to use the linewidth 
parameter to avoid lines wrapping. This is 
valuable if you are also going to include bits of 
R output as well (you can control the width of 
your own code fragments of course).


\usepackage{listings,relsize}
\lstloadlanguages{R}
\lstset{language=R,basicstyle=\smaller[2],commentstyle=\rmfamily\smaller,
  showstringspaces=false,%
  xleftmargin=4ex,literate={<-}{{$\leftarrow$}}1 {~}{{$\sim$}}1}
\lstset{escapeinside={(*}{*)}}   % for (*\ref{ }*) inside lstlistings (S code)

. . .
\begin{lstlisting}
. . . S code . . .
\end{lstlisting}

The following code was provided by Vincent Goulet:


listings is a great package to highlight R keywords and comments and --- that
was my main use of the package --- index those keywords. I found that I had
to slightly redefine the list of keywords included in listings. I still did
not take the time to submit a patch to the author, though...

In any case, here's what I use, if it can be of any help to anyone:

\lstloadlanguages{R}
\lstdefinelanguage{Renhanced}[]{R}{%
   morekeywords={acf,ar,arima,arima.sim,colMeans,colSums,is.na,is.null,%
 mapply,ms,na.rm,nlmin,replicate,row.names,rowMeans,rowSums,seasonal,%
 sys.time,system.time,ts.plot,which.max,which.min},
   deletekeywords={c},
   alsoletter={.\%},%
   alsoother={:_\$}}
\lstset{language=Renhanced,extendedchars=true,
   basicstyle=\small\ttfamily,
   commentstyle=\textsl,
   keywordstyle=\mdseries,
   showstringspaces=false,
   index=[1][keywords],
   indexstyle=\indexfonction}

with

   [EMAIL PROTECTED]

-- Vincent Goulet, Associate Professor École 
d'actuariat Université Laval, Québec 
[EMAIL PROTECTED] http://vgoulet.act.ulaval.ca

Anupam Tyagi provided the following:

\documentclass{report}
\usepackage{listings}
\begin{document}

Somethings .

\lstset{% general command to set parameter(s)
basicstyle=\small, % print whole in small
stringstyle=\ttfamily, % typewriter type for strings
numbers=left, % numbers on the left
numberstyle=\tiny, % Tiny numbers
stepnumber=2, % number every second line of code
numbersep=5pt, % 5pt seperation between numbering and code listing
language=R }

\lstinputlisting{text1.R}

\end{document}



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] proc GLM with R

2006-12-19 Thread Michael Dewey
At 05:15 18/12/2006, Cressoni, Massimo \(NIH/NHLBI\) [F] wrote:
I want to migrate from SAS to R.
I used proc mixed to do comparison between multiple groups and to perform
multiple comparison between groups since, as far as I know, proc 
mixed does not make assumptions about the data and so
it is better than a simple anova (data must only be normal).
Es. how can I translate a code like this (two way anova with a factor of
repetition) :


proc mixed;
class kind  PEEP codice;
model PaO2_FiO2 = kind PEEP kind*PEEP;
repeated /type = un sub=codice;
lsmeans kind*PEEP /adjust=bon;
run;

codice is a unique identifier of patient
kind is a variable which subdivided the patient (i.e. red or brown hairs)
PEEP is positive end expiratory pressure. These are the steps of a clinical
trial. Patient did the trial at PEEP = 5 and PEEP = 10

You could investigate either nlme or lme4
The best documentation for nlme (which should be included in your system) is
@BOOK{pinheiro00,
   author = {Pinheiro, J C and Bates, D M},
   year = 2000,
   title = {Mixed-effects models in {S} and {S-PLUS}},
   publisher = {Springer-Verlag},
   address = {New York},
   keywords = {glm; mixed models}
}
lme4 is a more recent development by Bates which as yet has slightly 
fewer helper functions and no book.

Since you are assuming normal error you can use nlme. I am afraid I 
do not read SAS so I think it would be wrong of me to try to 
translate your example (traduttore, traditore and all that)
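
At the risk of committing exactly that treason, here is a rough nlme sketch of what the SAS call appears to fit, run on simulated stand-in data (the random intercept per patient plays the role of repeated/sub=codice; the lsmeans/Bonferroni step is omitted and would need separate work):

```r
library(nlme)
# Simulated stand-in for the trial: 20 patients ('codice'), each measured
# at PEEP 5 and PEEP 10, split into two 'kind' groups
set.seed(42)
dat <- data.frame(
  codice = factor(rep(1:20, each = 2)),
  kind   = factor(rep(rep(c("red", "brown"), each = 2), 10)),
  PEEP   = factor(rep(c(5, 10), 20))
)
dat$PaO2_FiO2 <- 300 + 20 * (dat$PEEP == "10") +
  rep(rnorm(20, sd = 10), each = 2) +   # patient-level random effect
  rnorm(40, sd = 15)
fit <- lme(PaO2_FiO2 ~ kind * PEEP, random = ~ 1 | codice, data = dat)
anova(fit)   # tests for kind, PEEP and their interaction
```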


Thank you

Massimo Cressoni

run;

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] ifelse misusage become more and more frequent...

2006-12-13 Thread Michael Dewey
At 10:42 13/12/2006, Martin Maechler wrote:
  [EMAIL PROTECTED] == [EMAIL PROTECTED] fr [EMAIL PROTECTED]
  on Tue, 12 Dec 2006 22:24:33 +0100 writes:

 [EMAIL PROTECTED] ...ifelse, a function of three **vector**
 [EMAIL PROTECTED] arguments  Yes !!  I misunderstood the
 [EMAIL PROTECTED] functioning of ifelse.

Seems to happen more and more often.
When I teach R programming I nowadays usually emphasize that people
should often *NOT* use ifelse().
In other words, I think ifelse() is much over-used in situations
where something else would be both clearer and more efficient.

Is there a document / book around which lures people into
misusing ifelse() so frequently?

Perhaps it is because here two concepts are involved which may be 
less familiar to people: program control and vectorisation. I wonder 
whether the manual pages for ifelse and if need to do more than just 
See also the other one. I notice that the page for if has a very 
helpful health warning about putting braces round everything. This is 
unusual in the manual pages which usually describe what is rather 
than tell you explicitly what to do to achieve salvation. Perhaps I 
could suggest that the if page tells you what error message you get 
when you wanted ifelse and that the ifelse page has a health warning 
along the lines of 'ifelse is not if and they are often confused with 
difficult to understand results'.
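
A two-line illustration of the distinction (mine, not from the thread):

```r
x <- c(-2, 0, 3)
ifelse(x > 0, "pos", "nonpos")  # vectorised: one result per element
# [1] "nonpos" "nonpos" "pos"
# By contrast, if (x > 0) ... is scalar control flow: historically it
# silently used only x[1]; current R makes a length > 1 condition an error.
```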

[snip]


Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Multiple Imputation / Non Parametric Models / Combining Results

2006-12-11 Thread Michael Dewey
At 08:12 08/12/2006, Simon P. Kempf wrote:
Dear R-Users,



The following question is more of general nature than a merely technical
one.  Nevertheless I hope someone get me some answers.



I am in no sense an expert in this area but since 
it seems that no one else has answered so far; I 
wonder whether the mitools package from CRAN helps?
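
For the mechanical side of the question, the fit-then-pool cycle in mice (which the poster already uses) looks like this with an ordinary linear model; whether Rubin's rules transfer cleanly to GAM smooth terms is precisely the open theoretical issue. A sketch using the nhanes data shipped with mice:

```r
library(mice)
# five imputed data sets from the built-in incomplete nhanes data
imp <- mice(nhanes, m = 5, seed = 1, printFlag = FALSE)
fit <- with(imp, lm(chl ~ age + bmi))  # fit the model in each completed set
summary(pool(fit))  # Rubin's rules: combined estimates plus extra variance
```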


I have been using the mice package to perform the multiple imputations. So
far, everything works fine with the standard regressions analysis.



However, I am wondering, if it is theoretically correct to perform
nonparametric models (GAM, spline smoothing etc.) with multiple imputed
datasets. If yes, how can I combine the results in order to show the
uncertainty?



In the research field of real estate economics, the problem of missing data
is often ignored respectively unmentioned. However, GAM, spline smoothing
etc. become increasingly popular. In my research, I would like to use
multiple imputed datasets and GAM, but I am unsure how present single
results.



Again I want to apologize that this is a rather theoretical statistical
question than a technical question on R.



Thanks in advance for any hints and advices.



Simon











Simon P. Kempf

Dipl.-Kfm. MScRE Immobilienökonom (ebs)

Wissenschaftlicher Assistent



Büro:

IREBS Immobilienakademie

c/o ebs Immobilienakademie GmbH

Berliner Str. 26a

13507 Berlin



Privat:

Dunckerstraße 60

10439 Berlin



Mobil: 0176 7002 6687

Email:  mailto:[EMAIL PROTECTED] [EMAIL PROTECTED]




 [[alternative HTML version deleted]]

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] beginning my R-learning

2006-12-07 Thread Michael Dewey
At 22:16 04/12/2006, Michael McCulloch wrote:
Hello,
I'm just beginning to learn R. What books/online learning modules 
with datasets would you suggest?
Thank you!

If
a) you already know some statistics
b) you want to use R as a tool in your applied statistical work
then
@BOOK{venables02,
   author = {Venables, W N and Ripley, B D},
   year = 2002,
   title = {Modern applied statistics with {S}},
   publisher = {Springer-Verlag},
   address = {New York},
   keywords = {statistics, general, software}
}
is worth considering. It is quite terse though (it is one of the few 
books I have which I wish were longer) and not so suitable if you are 
also learning statistics at the same time.



Best wishes,
Michael




Michael McCulloch
Pine Street Clinic
Pine Street Foundation
124 Pine Street, San Anselmo, CA 94960-2674
tel 415.407.1357
fax 415.485.1065
email:  [EMAIL PROTECTED]
web:www.pinest.org
 www.pinestreetfoundation.org
 www.medepi.net/meta



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] prop.trend.test issue

2006-12-03 Thread Michael Dewey
At 05:55 03/12/2006, Ethan Johnsons wrote:
I have the clinical study data.

                      Year 0        Year 3
Retinol (nmol/L)   N  Mean +- sd    Mean +- sd
Vitamin A group   73  1.89 +- 0.36  2.06 +- 0.53
Trace group       57  1.83 +- 0.31  1.78 +- 0.30

where N is the number of male for the clinical study.

I want to test if the mean serum retinol has increased over 3 years
among subjects in the vitamin A group.

If you want to test means why did you think a test for proportions 
was a good idea?


> 1.89+0.36
[1] 2.25
> 1.89-0.36
[1] 1.53
> 2.06+0.53
[1] 2.59
> 2.06-0.53
[1] 1.53


prop.trend.test(c(2.25, 1.53),c( 2.59, 1.53))

Chi-squared Test for Trend in Proportions

data:  c(2.25, 1.53) out of c(2.59, 1.53) ,
using scores: 1 2
X-squared = 0.2189, df = 1, p-value = 0.6399

The issue I am seeing that N of Vitamin A group = 73 seems not reflected.
This leads me to think that I can't implement the test based on the
data just presented.
Nor a two-tailed test is possible.

2-sample test for equality of proportions with continuity correction

data:  c(2.25, 1.53) out of c(2.59, 1.53)
X-squared = 0, df = 1, p-value = 1
alternative hypothesis: two.sided
95 percent confidence interval:
-0.6738203  0.4112720
sample estimates:
   prop 1    prop 2
0.8687259 1.0000000

Warning message:
Chi-squared approximation may be incorrect in: prop.test(c(2.25,
1.53), c(2.59, 1.53))

Any ideas, please?

thx

ej
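
For what it is worth, a test of means can be pieced together from the published summary statistics. A sketch comparing the two groups at Year 3 (a Welch-type approximation; my code, not from the thread — and note the within-group Year 0 to Year 3 change cannot be tested this way because the table does not give the paired correlation):

```r
# Welch-type two-sample comparison from published summary statistics
m1 <- 2.06; s1 <- 0.53; n1 <- 73   # vitamin A group, Year 3
m2 <- 1.78; s2 <- 0.30; n2 <- 57   # trace group, Year 3
se    <- sqrt(s1^2/n1 + s2^2/n2)
tstat <- (m1 - m2) / se
dfw   <- se^4 / ((s1^2/n1)^2/(n1 - 1) + (s2^2/n2)^2/(n2 - 1))
pval  <- 2 * pt(-abs(tstat), dfw)  # two-sided p-value
c(t = tstat, df = dfw, p = pval)
```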



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] prop.trend.test issue

2006-12-03 Thread Michael Dewey
At 13:46 03/12/2006, Ethan Johnsons wrote:
I don't find any other test avail for this?
Am I missing something?

I do not want to seem unhelpful but the only response that springs to 
mind is a knowledge of statistics.

I hope people's lives are not at stake with the results of your analysis

thx

ej

On 12/3/06, Michael Dewey [EMAIL PROTECTED] wrote:
At 05:55 03/12/2006, Ethan Johnsons wrote:
 I have the clinical study data.
 
                       Year 0        Year 3
 Retinol (nmol/L)   N  Mean +- sd    Mean +- sd
 Vitamin A group   73  1.89 +- 0.36  2.06 +- 0.53
 Trace group       57  1.83 +- 0.31  1.78 +- 0.30
 
 where N is the number of male for the clinical study.
 
 I want to test if the mean serum retinol has increased over 3 years
 among subjects in the vitamin A group.

If you want to test means why did you think a test for proportions
was a good idea?


   1.89+0.36
 [1] 2.25
 1.89-0.36
 [1] 1.53
 2.06+0.53
 [1] 2.59
 2.06-0.53
 [1] 1.53
 
 
 prop.trend.test(c(2.25, 1.53),c( 2.59, 1.53))
 
 Chi-squared Test for Trend in Proportions
 
 data:  c(2.25, 1.53) out of c(2.59, 1.53) ,
 using scores: 1 2
 X-squared = 0.2189, df = 1, p-value = 0.6399
 
 The issue I am seeing that N of Vitamin A group = 73 seems not reflected.
 This leads me to think that I can't implement the test based on the
 data just presented.
 Nor a two-tailed test is possible.
 
 2-sample test for equality of proportions with continuity 
 correction
 
 data:  c(2.25, 1.53) out of c(2.59, 1.53)
 X-squared = 0, df = 1, p-value = 1
 alternative hypothesis: two.sided
 95 percent confidence interval:
 -0.6738203  0.4112720
 sample estimates:
prop 1prop 2
 0.8687259 1.000
 
 Warning message:
 Chi-squared approximation may be incorrect in: prop.test(c(2.25,
 1.53), c(2.59, 1.53))
 
 Any ideas, please?
 
 thx
 
 ej
 
 

Michael Dewey
http://www.aghmed.fsnet.co.uk




--
No virus found in this incoming message.
Checked by AVG Free Edition.
Version: 7.5.431 / Virus Database: 268.15.3/562 - Release Date: 
01/12/2006 13:12

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Is there a better way for inputing data manually?

2006-12-02 Thread Michael Dewey
At 04:40 02/12/2006, Duncan Murdoch wrote:
On 12/1/2006 11:00 PM, Wei-Wei Guo wrote:
Dear All,
I have worked with R for some time. It's a great tool for data 
analysis. But it's too hard to inputing raw data manually with R (I 
don't mean importing data. R is good at importing data). Maybe it's 
not a focused topic in this list, but I don't know other place 
where I can ask the question. How do you do when inputing data from 
a paper material, such as questionnaire, or what do you use ?

I would not use R for this.  Depending on how many questionnaires I 
had, from small number to large, I would use:

1.  A text file.
2.  A spreadsheet, like Excel, or the OpenOffice one, or the R data editor.
3.  A custom program written specifically to handle the particular 
questionnaire.

You can do 1 and 2 in R, but you can't do them as well as programs 
dedicated to those tasks, and you can't do 3 at all well.  It 
depends a lot on the specific conventions of the platform you're 
working on.  R is aimed at writing cross-platform programs, and 
isn't particularly good at writing GUI programs, which is what you 
want here.  I would personally use Delphi for this, but there are 
lots of alternatives.

I usually use Epidata, available from http://www.epidata.dk/, which 
is free and enables you to write your own questionnaire file and then 
input your data. It also enables you to implement checks during data 
entry which in my view is an important feature. It enables export in 
a number of formats but its native output format, the .rec file, can 
be read into R using the foreign package.

Duncan Murdoch



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] random effect question and glm

2006-11-27 Thread Michael Dewey
At 06:28 26/11/2006, you wrote:
Below is the output for p5.random.p,p5.random.p1 and m0
My question is
in p5.random.p, variance for P is 5e-10.
But in p5.random.p1,variance for P is 0.039293.
Why they are so different?

Please do as the posting guide asks and reply to the list, not just 
the individual.
a - I might not know the answer
b - the discussion might help others

You give very brief details of the underlying problem so it is hard 
to know what information will help you most.

Remember, if a computer estimates a non-negative quantity as very 
small perhaps it is really zero.

I think you might benefit from reading Pinheiro and Bates, again see 
the posting list.

Is that wrong to write Y~P+(1|P) if I consider P as random effect?

I suppose terminology differs but I would have said the model with 
Y~1+(1|P) was a random intercept model
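
A small simulation of the near-zero variance point (my sketch, written in current lme4 syntax since the 2006 thread used an older interface): with P both as a fixed effect and as the random-intercept grouping factor, the fixed part already absorbs all between-group differences, so the random-effect variance has essentially nothing left to estimate.

```r
library(lme4)
set.seed(7)
P <- factor(rep(letters[1:5], each = 200))
eta <- rep(rnorm(5, sd = 0.6), each = 200)   # true group effects
d <- data.frame(P = P, Y = rbinom(1000, 1, plogis(eta)))
m_both <- glmer(Y ~ P + (1 | P), data = d, family = binomial)
m_rand <- glmer(Y ~ 1 + (1 | P), data = d, family = binomial)
unlist(VarCorr(m_both))  # essentially zero: nothing left for it to explain
unlist(VarCorr(m_rand))  # estimates the between-group variability instead
```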

Also in p5.random.p and m0, it seems that there are little 
difference in estimate for P. Why?

thanks,

Aimin Yan

> p5.random.p <- lmer(Y ~ P + (1|P), data = p5, family = binomial,
+   control = list(usePQL = FALSE, msV = 1))
> summary(p5.random.p)
Generalized linear mixed model fit using Laplace
Formula: Y ~ P + (1 | P)
Data: p5
  Family: binomial(logit link)
   AIC  BIC logLik deviance
  1423 1452 -705.4 1411
Random effects:
  Groups NameVariance Std.Dev.
  P  (Intercept) 5e-10    2.2361e-05
number of obs: 1030, groups: P, 5

Estimated scale (compare to  1 )  0.938

Fixed effects:
   Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.153404   0.160599  0.9552   0.3395
P8ABP   -0.422636   0.202181 -2.0904   0.0366 *
P8adh0.009634   0.194826  0.0495   0.9606
P9pap0.108536   0.218875  0.4959   0.6200
P9RNT0.376122   0.271957  1.3830   0.1667
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
   (Intr) P8ABP  P8adh  P9pap
P8ABP -0.794
P8adh -0.824  0.655
P9pap -0.734  0.583  0.605
P9RNT -0.591  0.469  0.487  0.433
> p5.random.p1 <- lmer(Y ~ 1 + (1|P), data = p5, family = binomial,
+   control = list(usePQL = FALSE, msV = 1))
> summary(p5.random.p1)
Generalized linear mixed model fit using Laplace
Formula: Y ~ 1 + (1 | P)
Data: p5
  Family: binomial(logit link)
   AIC  BIC logLik deviance
  1425 1435 -710.6 1421
Random effects:
  Groups NameVariance Std.Dev.
  P  (Intercept) 0.039293 0.19823
number of obs: 1030, groups: P, 5

Estimated scale (compare to  1 )  0.9984311

Fixed effects:
 Estimate Std. Error z value Pr(>|z|)
(Intercept)   0.1362 0.1109   1.2280.219

> m0 <- glm(Y ~ P, data = p5, family = binomial(logit))
> summary(m0)

Call:
glm(formula = Y ~ P, family = binomial(logit), data = p5)

Deviance Residuals:
 Min   1Q   Median   3Q  Max
-1.4086  -1.2476   0.9626   1.1088   1.2933

Coefficients:
  Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.154151   0.160604   0.960   0.3371
P8ABP   -0.422415   0.202180  -2.089   0.0367 *
P8adh0.009355   0.194831   0.048   0.9617
P9pap0.108214   0.218881   0.494   0.6210
P9RNT0.374693   0.271945   1.378   0.1683
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

 Null deviance: 1425.5  on 1029  degrees of freedom
Residual deviance: 1410.8  on 1025  degrees of freedom
AIC: 1420.8

Number of Fisher Scoring iterations: 4


At 06:13 AM 11/24/2006, you wrote:
At 21:52 23/11/2006, Aimin Yan wrote:
consider p as random effect with 5 levels, what is difference between these
two models?

p5.random.p <- lmer(Y ~ p + (1|p), data = p5, family = binomial,
  control = list(usePQL = FALSE, msV = 1))
p5.random.p1 <- lmer(Y ~ 1 + (1|p), data = p5, family = binomial,
  control = list(usePQL = FALSE, msV = 1))

Well, try them and see. Then if you cannot understand the output tell us
a) what you found
b) how it differed from what you expected

in addtion, I try these two models, it seems they are same.
what is the difference between these two model. Is this a cell means model?
m00 <- glm(sc ~ aa - 1, data = p5)
m000 <- glm(sc ~ 1 + aa - 1, data = p5)

See above


thanks,

Aimin Yan

Michael Dewey
http://www.aghmed.fsnet.co.uk






Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] GLM and LM singularities

2006-11-27 Thread Michael Dewey
At 23:12 26/11/2006, Tim Sippel wrote:
Hi-

I'm wrestling with some of my data apparently not being called into a GLM or
an LM.  I'm looking at factors affecting fish annual catch rates (ie. CPUE)
over 30 years.  Two of the factors I'm using are sea surface temperature and
sea surface temperature anomaly.  A small sample of my data is below:



I could not make much sense of the dataset and your output is 
incomplete (what happened to Year?) but usually the message you 
received means that one of your predictors is a linear combination of 
some of the others.
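As a sketch of what that message means (toy data and hypothetical variable names, not the poster's): when one column is an exact linear combination of others, lm()/glm() reports NA for the aliased coefficient, and alias() names the offending combination.

```r
# Toy illustration (hypothetical variables): z is an exact linear
# combination of x and y, so its coefficient cannot be estimated.
set.seed(1)
x <- rnorm(20)
y <- rnorm(20)
z <- 2 * x - y                  # exact linear dependence
fit <- lm(rnorm(20) ~ x + y + z)
is.na(coef(fit)["z"])           # TRUE: "not defined because of singularities"
alias(fit)                      # reports the dependence z = 2*x - y
```

In the poster's data SST appears constant within each Year, so once Year enters as a factor, SST carries no extra information, which would produce exactly this kind of NA row.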



CPUE      Year  Vessel_ID  Base_Port  Boat_Lgth  Planing       SST   Anomaly
0.127     1976  2          BOI        40         Planing       19.4  -0.308
0.059     1976  30         BOI        40         Displacement  19.4  -0.308
0.03      1977  46         BOI        45         Displacement  18.5  -1.008
0.169231  1977  2          BOI        40         Planing       18.5  -1.008
0.044118  1977  19         BOI        34         Planing       18.5  -1.008
0.114286  1977  29         WHANGAROA  42         Displacement  18.5  -1.008



Have defined Year, Vessel_ID, Base_Port, Boat_Lgth, Planing as factors, and
CPUE, SST, Anomaly are continuous variables.



The formula I'm calling is: glm(formula = log(CPUE) ~ Year + Vessel_ID +
SST, data = marlin).  A summary of the glm includes the following output:



Coefficients: (1 not defined because of singularities)

             Estimate Std. Error t value Pr(>|t|)

Vessel_ID80 -0.54093    0.23380  -2.314 0.021373 *

Vessel_ID83 -0.36499    0.20844  -1.751 0.080977 .

Vessel_ID84 -0.23386    0.19098  -1.225 0.221718

SST               NA         NA      NA       NA



Can someone explain the output Coefficients: (1 not defined because of
singularities)? What does this mean?  I'm guessing that something to do
with my SST data is causing a singularity but I don't know where to go
from there?  Have inspected various R discussions on this, but I haven't
found the information I need.



Many thanks,



Tim Sippel (MSc)

School of Biological Sciences

Auckland University

Private Bag 92019

Auckland 1020

New Zealand

+64-9-373-7599 ext. 84589 (work)

+64-9-373-7668 (Fax)

+64-21-593-001 (mobile)





Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] show unique combination of two factor

2006-11-24 Thread Michael Dewey
At 22:02 23/11/2006, Aimin Yan wrote:
p factor have 5 levels
aa factor have 19 levels.
totally it should have 95 combinations.
but I just find there are 92 combinations.
Does anyone know how to code to find what combinations are missed?

Does
which(table(p,aa) == 0, arr.ind=TRUE)
do what you wanted.
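A small self-contained illustration (toy factors, not the poster's data):

```r
# Two toy factors where one of the four combinations never occurs
p  <- factor(c("a", "a", "b", "b"))
aa <- factor(c("x", "y", "x", "x"))
missed <- which(table(p, aa) == 0, arr.ind = TRUE)
missed   # indices of the empty cell: here the combination (p = "b", aa = "y")
```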



Thanks,

Aimin



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] random effect question and glm

2006-11-24 Thread Michael Dewey
At 21:52 23/11/2006, Aimin Yan wrote:
consider p as a random effect with 5 levels; what is the difference between these
two models?

   p5.random.p <- lmer(Y
~p+(1|p),data=p5,family=binomial,control=list(usePQL=FALSE,msV=1))
   p5.random.p1 <- lmer(Y
~1+(1|p),data=p5,family=binomial,control=list(usePQL=FALSE,msV=1))

Well, try them and see. Then if you cannot understand the output tell us
a) what you found
b) how it differed from what you expected

In addition, I tried these two models; it seems they are the same.
What is the difference between these two models? Is this a cell means model?
m00 <- glm(sc ~ aa - 1, data = p5)
m000 <- glm(sc ~ 1 + aa - 1, data = p5)

See above


thanks,

Aimin Yan



Michael Dewey
http://www.aghmed.fsnet.co.uk



[R] Summary, was Re: Confidence interval for relative risk

2006-11-22 Thread Michael Dewey
At 14:43 10/11/2006, Michael Dewey wrote:

After considerable help from list members and some digging of my own
I have prepared a summary of the findings which I have posted (see
link below). Broadly there were four suggestions
1 - Wald-type intervals,
2 - transforming the odds ratio confidence interval,
3 - method based on score test,
4 - method based on likelihood.

Method 3 was communicated to me off-list
  ===
I haven't followed all of the thread either but has someone already 
pointed out to you confidence intervals that use the score method? 
For example Agresti (Categorical Data Analysis 2nd edition, p. 77-78) 
note that 'although computationally more complex, these methods often 
perform better'. However, they also note that 'currently they are not 
available in standard software'.

But then again, R is not standard software: the code (riskscoreci) 
can be found from here: 
http://www.stat.ufl.edu/~aa/cda/R/two_sample/R2/index.html

best regards,
Jukka Jokinen


and so I reproduce it here. Almost a candidate for the fortunes package there.
The other three can be found from the archive
under the same subject although not all in the same thread.

Methods 3 and 4 seem to have more going for them as far
as I can judge.

Thanks to David Duffy, Spencer Graves,
Jukka Jokinen, Terry Therneau, and Wolfgang Viechtbauer.
The views and calculations presented here and in the summary
are my own and
any errors are my responsibility not theirs.

The summary document is available from here

http://www.zen103156.zen.co.uk/rr.pdf


Original post follows.


The concrete problem is that I am refereeing
a paper where a confidence interval is
presented for the risk ratio and I do not find
it credible. I show below my attempts to
do this in R. The example is slightly changed
from the authors'.

I can obtain a confidence interval for
the odds ratio from fisher.test of
course

=== fisher.test example ===

  outcome <- matrix(c(500, 0, 500, 8), ncol = 2, byrow = TRUE)
  fisher.test(outcome)

 Fisher's Exact Test for Count Data

data:  outcome
p-value = 0.00761
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
  1.694792  Inf
sample estimates:
odds ratio
Inf

=== end example ===

but in epidemiology authors often
prefer to present risk ratios.

Using the facility on CRAN to search
the site I find packages epitools and Epi
which both offer confidence intervals
for the risk ratio

=== Epi example ===

  library(Epi)
  twoby2(outcome[c(2,1),c(2,1)])
2 by 2 table analysis:
--
Outcome   : Col 1
Comparing : Row 1 vs. Row 2

          Col 1  Col 2   P(Col 1)  95% conf. interval
Row 1         8    500     0.0157    0.0079    0.0312
Row 2         0    500     0.0000    0.0000       NaN

                                  95% conf. interval
             Relative Risk:   Inf     NaN       Inf
         Sample Odds Ratio:   Inf     NaN       Inf
Conditional MLE Odds Ratio:   Inf  1.6948       Inf
    Probability difference: 0.0157  0.0027    0.0337

  Exact P-value: 0.0076
 Asymptotic P-value: NaN
--

=== end example ===

So Epi gives me a lower limit of NaN but the same confidence
interval and p-value as fisher.test

=== epitools example ===

  library(epitools)
  riskratio(outcome)
$data
   Outcome
Predictor  Disease1 Disease2 Total
  Exposed1      500        0   500
  Exposed2      500        8   508
  Total        1000        8  1008

$measure
   risk ratio with 95% C.I.
Predictor  estimate lower upper
  Exposed1        1    NA    NA
  Exposed2      Inf   NaN   Inf

$p.value
   two-sided
Predictor  midp.exact fisher.exact  chi.square
   Exposed1 NA   NA  NA
   Exposed2 0.00404821  0.007610478 0.004843385

$correction
[1] FALSE

attr(,"method")
[1] "Unconditional MLE & normal approximation (Wald) CI"
Warning message:
Chi-squared approximation may be incorrect in: chisq.test(xx, correct =
correction)

=== end example ===

And epitools also gives a lower limit
of NaN.

=== end all examples ===

I would prefer not to have to tell the authors of the
paper I am refereeing that
I think they are wrong unless I can help them with what they
should have done.

Is there another package I should have tried?

Is there some other way of doing this?

Am I doing something fundamentally wrong-headed?




Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Newbie problem ... Forest plot

2006-11-22 Thread Michael Dewey
At 15:16 22/11/2006, you wrote:
When I use the studlab parameter I don't have (don't know how to put) 
names inside the graph, X and Y axis, and I have a black/white forest 
plot, and I need it coloured.

1 - please cc to r-help so that others can learn from your experience
2 - when you use the studlab parameter what happens, and how does 
that differ from what you wanted to happen?



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Newbie problem ... Forest plot

2006-11-17 Thread Michael Dewey
At 15:59 16/11/2006, Peter Bolton wrote:
Hello!
I have some data stored in 2 separate csv files. One file (called 
A.csv) holds 12 odds ratios (named Group1, Group2, Group3, etc...), 
the second file (called B.csv) the 12 corresponding errors.
How do I import that data into R and make a forest plot like the ones 
in the help files for rmeta and meta, with different font colours and 
names along the X and Y axes?

You need to show us what you tried, and why it did not do what you 
want, for us to be able to point you in the right direction.
I think you also need to get a good book on meta-analysis if you 
think that meta.MH from rmeta is the right tool for your dataset

@BOOK{sutton00,
   author = {Sutton, A J and Abrams, K R and Jones, D R and Sheldon, T A and
Song, F},
   year = 2000,
   title = {Methods for meta-analysis in medical research},
   publisher = {Wiley},
   address = {Chichester},
   keywords = {meta-analysis, general}
}
may help you.

I know for the meta library:
...
out <- metagen(name1, name2)
plot(out, xlab = "abcd")


But I need for my data to look like this? (copy from help file rmeta)

library(rmeta)
op <- par(lend="square", no.readonly=TRUE)
data(catheter)
a <- meta.MH(n.trt, n.ctrl, col.trt, col.ctrl, data=catheter,
  names=Name, subset=c(13,6,5,3,7,12,4,11,1,8,10,2))
# angry fruit salad
metaplot(a$logOR, a$selogOR, nn=a$selogOR^-2, a$names,
  summn=a$logMH, sumse=a$selogMH, sumnn=a$selogMH^-2,
  logeffect=TRUE, colors=meta.colors(box="magenta",
  lines="blue", zero="red", summary="orange",
  text="forestgreen"))
par(op) # reset parameters

Of course if someone have better idea to import data - not trough 
csv file no problem any format is OK - data frame or something else 
and to make forest plot ... no problem ... I need them on the forest plot.
Greetings...
Peter Boolton
MCA adcc..

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Confidence interval for relative risk

2006-11-12 Thread Michael Dewey
At 15:54 10/11/2006, Viechtbauer Wolfgang (STAT) wrote:

Thanks for the suggestion Wolfgang, but whatever the original authors 
did that is not it.

Hello,

A common way to calculate the CI for a relative risk is the 
following. Given a 2x2 table of the form:

a b
c d

the log of the risk ratio is given by:

lrr = log[ a/(a+b) / ( c/(c+d) ) ]

which is asymptotically normal with variance:

vlrr = 1/a - 1/(a+b) + 1/c - 1/(c+d).

So an approximate 95% CI for the risk ratio is given by:

exp[ lrr - 1.96*sqrt(vlrr) ], exp[ lrr + 1.96*sqrt(vlrr) ].

A common convention is to add 1/2 to each cell when there are zeros.

So, for the table:

Col 1 Col 2
Row 1 8   500
Row 2 0   500

lrr = log[ 8.5/509 / ( 0.5/501 ) ] = 2.817
vlrr = 1/8.5 - 1/509 + 1/0.5 - 1/501 = 2.1137

exp[ 2.817-1.96*sqrt(2.1137) ] = .97
exp[ 2.817+1.96*sqrt(2.1137) ] = 289.04
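For reference, the hand calculation above in a few lines of R (same add-1/2 convention; tiny differences from the rounded figures above are expected):

```r
# Wald-type CI for the risk ratio, adding 1/2 to each cell of
#   a b     8 500
#   c d  =  0 500
n <- c(8, 500, 0, 500) + 0.5
lrr  <- log((n[1] / (n[1] + n[2])) / (n[3] / (n[3] + n[4])))
vlrr <- 1/n[1] - 1/(n[1] + n[2]) + 1/n[3] - 1/(n[3] + n[4])
ci   <- exp(lrr + c(-1, 1) * 1.96 * sqrt(vlrr))
ci   # approximately 0.97 and 289
```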

Maybe that is what the authors did.

Best,

--
Wolfgang Viechtbauer
  Department of Methodology and Statistics
  University of Maastricht, The Netherlands
  http://www.wvbauer.com/


  -Original Message-
  From: [EMAIL PROTECTED] [mailto:r-help-
  [EMAIL PROTECTED] On Behalf Of Michael Dewey
  Sent: Friday, November 10, 2006 15:43
  To: r-help@stat.math.ethz.ch
  Subject: [R] Confidence interval for relative risk
 
  The concrete problem is that I am refereeing
  a paper where a confidence interval is
  presented for the risk ratio and I do not find
  it credible. I show below my attempts to
  do this in R. The example is slightly changed
  from the authors'.
 
  I can obtain a confidence interval for
  the odds ratio from fisher.test of
  course
 
  === fisher.test example ===
 
  outcome <- matrix(c(500, 0, 500, 8), ncol = 2, byrow = TRUE)
  fisher.test(outcome)
 
   Fisher's Exact Test for Count Data
 
  data:  outcome
  p-value = 0.00761
  alternative hypothesis: true odds ratio is not equal to 1
  95 percent confidence interval:
1.694792  Inf
  sample estimates:
  odds ratio
  Inf
 
  === end example ===
 
  but in epidemiology authors often
  prefer to present risk ratios.
 
  Using the facility on CRAN to search
  the site I find packages epitools and Epi
  which both offer confidence intervals
  for the risk ratio
 
  === Epi example ===
 
library(Epi)
twoby2(outcome[c(2,1),c(2,1)])
  2 by 2 table analysis:
  --
  Outcome   : Col 1
  Comparing : Row 1 vs. Row 2
 
           Col 1  Col 2   P(Col 1)  95% conf. interval
  Row 1        8    500     0.0157    0.0079    0.0312
  Row 2        0    500     0.0000    0.0000       NaN
 
                                    95% conf. interval
               Relative Risk:   Inf     NaN       Inf
           Sample Odds Ratio:   Inf     NaN       Inf
  Conditional MLE Odds Ratio:   Inf  1.6948       Inf
      Probability difference: 0.0157  0.0027    0.0337
 
Exact P-value: 0.0076
   Asymptotic P-value: NaN
  --
 
  === end example ===
 
  So Epi gives me a lower limit of NaN but the same confidence
  interval and p-value as fisher.test
 
  === epitools example ===
 
library(epitools)
riskratio(outcome)
  $data
 Outcome
  Predictor  Disease1 Disease2 Total
    Exposed1      500        0   500
    Exposed2      500        8   508
    Total        1000        8  1008
 
  $measure
 risk ratio with 95% C.I.
  Predictor  estimate lower upper
    Exposed1        1    NA    NA
    Exposed2      Inf   NaN   Inf
 
  $p.value
 two-sided
  Predictor  midp.exact fisher.exact  chi.square
 Exposed1 NA   NA  NA
 Exposed2 0.00404821  0.007610478 0.004843385
 
  $correction
  [1] FALSE
 
  attr(,"method")
  [1] "Unconditional MLE & normal approximation (Wald) CI"
  Warning message:
  Chi-squared approximation may be incorrect in: chisq.test(xx, correct =
  correction)
 
  === end example ===
 
  And epitools also gives a lower limit
  of NaN.
 
  === end all examples ===
 
  I would prefer not to have to tell the authors of the
  paper I am refereeing that
  I think they are wrong unless I can help them with what they
  should have done.
 
  Is there another package I should have tried?
 
  Is there some other way of doing this?
 
  Am I doing something fundamentally wrong-headed?
 
 
 
  Michael Dewey
  http://www.aghmed.fsnet.co.uk

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Confidence interval for relative risk

2006-11-12 Thread Michael Dewey
At 12:35 12/11/2006, Peter Dalgaard wrote:
Michael Dewey [EMAIL PROTECTED] writes:

  At 15:54 10/11/2006, Viechtbauer Wolfgang (STAT) wrote:
 
  Thanks for the suggestion Wolfgang, but whatever the original authors
  did that is not it.

Did you ever say what result they got?

 -p


No, because I did not want to use the original numbers in the 
request. So as the snippet below indicates I changed the numbers. If 
I apply Wolfgang's suggestion (which I had already thought of but 
discarded) I get about 13 for the real example where the authors quote about 5.

My question still remains though as to how I can get a confidence 
interval for the risk ratio without adding a constant to each cell.

it credible. I show below my attempts to
do this in R. The example is slightly changed
from the authors'.

Michael Dewey
http://www.aghmed.fsnet.co.uk



[R] Confidence interval for relative risk

2006-11-10 Thread Michael Dewey
The concrete problem is that I am refereeing
a paper where a confidence interval is
presented for the risk ratio and I do not find
it credible. I show below my attempts to
do this in R. The example is slightly changed
from the authors'.

I can obtain a confidence interval for
the odds ratio from fisher.test of
course

=== fisher.test example ===

  outcome <- matrix(c(500, 0, 500, 8), ncol = 2, byrow = TRUE)
  fisher.test(outcome)

 Fisher's Exact Test for Count Data

data:  outcome
p-value = 0.00761
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
  1.694792  Inf
sample estimates:
odds ratio
Inf

=== end example ===

but in epidemiology authors often
prefer to present risk ratios.

Using the facility on CRAN to search
the site I find packages epitools and Epi
which both offer confidence intervals
for the risk ratio

=== Epi example ===

  library(Epi)
  twoby2(outcome[c(2,1),c(2,1)])
2 by 2 table analysis:
--
Outcome   : Col 1
Comparing : Row 1 vs. Row 2

          Col 1  Col 2   P(Col 1)  95% conf. interval
Row 1         8    500     0.0157    0.0079    0.0312
Row 2         0    500     0.0000    0.0000       NaN

                                  95% conf. interval
             Relative Risk:   Inf     NaN       Inf
         Sample Odds Ratio:   Inf     NaN       Inf
Conditional MLE Odds Ratio:   Inf  1.6948       Inf
    Probability difference: 0.0157  0.0027    0.0337

  Exact P-value: 0.0076
 Asymptotic P-value: NaN
--

=== end example ===

So Epi gives me a lower limit of NaN but the same confidence
interval and p-value as fisher.test

=== epitools example ===

  library(epitools)
  riskratio(outcome)
$data
   Outcome
Predictor  Disease1 Disease2 Total
  Exposed1      500        0   500
  Exposed2      500        8   508
  Total        1000        8  1008

$measure
   risk ratio with 95% C.I.
Predictor  estimate lower upper
  Exposed1        1    NA    NA
  Exposed2      Inf   NaN   Inf

$p.value
   two-sided
Predictor  midp.exact fisher.exact  chi.square
   Exposed1 NA   NA  NA
   Exposed2 0.00404821  0.007610478 0.004843385

$correction
[1] FALSE

attr(,"method")
[1] "Unconditional MLE & normal approximation (Wald) CI"
Warning message:
Chi-squared approximation may be incorrect in: chisq.test(xx, correct =
correction)

=== end example ===

And epitools also gives a lower limit
of NaN.

=== end all examples ===

I would prefer not to have to tell the authors of the
paper I am refereeing that
I think they are wrong unless I can help them with what they
should have done.

Is there another package I should have tried?

Is there some other way of doing this?

Am I doing something fundamentally wrong-headed?



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] struggling to plot subgroups

2006-11-05 Thread Michael Dewey
At 04:06 05/11/2006, Sumitrajit Dhar wrote:
Hi Folks,

I have data that looks like this:

freqgender  xBar
1000m   2.32
1000f   3.22
2000m   4.32
2000f   4.53
3000m   3.21
3000f   3.44
4000m   4.11
4000f   3.99

I want to plot two lines (with symbols) for the two groups m and
f. I have tried the following:

plot(xBar[gender=="m"]~freq[gender=="f"]) followed by

So you want to plot the value of xBar for a man against the 
corresponding freq for a woman? But how are we to tell which man maps 
to which woman?


lines(xBar[gender=="m"]~freq[gender=="f"])

with different symbols and lines colors (which I know how to do).

However, these initial plots are not giving me the right output. The
initial plot command generates the following error.
Error in plot.window(xlim, ylim, log, asp, ...) :
 need finite 'xlim' values
In addition: Warning messages:
1: no non-missing arguments to min; returning Inf
2: no non-missing arguments to max; returning -Inf
3: no non-missing arguments to min; returning Inf
4: no non-missing arguments to max; returning -Inf

A second issue, I would also like to offset the second set of points
along the x-axis so that the error bars will be visible.
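For what it is worth, a sketch covering both issues, rebuilding the data shown above (values copied from the message; the 50-unit offset is an arbitrary choice): the subsets need quoted levels (gender == "m", not gender == m), and the offset is just a small shift added to freq.

```r
# Rebuild the posted data
d <- data.frame(freq   = rep(c(1000, 2000, 3000, 4000), each = 2),
                gender = rep(c("m", "f"), times = 4),
                xBar   = c(2.32, 3.22, 4.32, 4.53, 3.21, 3.44, 4.11, 3.99))
m <- d$gender == "m"                      # note the quotes around "m"
plot(d$freq[m], d$xBar[m], type = "b", pch = 16,
     ylim = range(d$xBar), xlab = "freq", ylab = "xBar")
# second group offset slightly along x so error bars would stay visible
lines(d$freq[!m] + 50, d$xBar[!m], type = "b", pch = 17, lty = 2)
```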

Any help is appreciated.

Regards,
Sumit



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] latent class model

2006-10-27 Thread Michael Dewey
At 19:59 26/10/2006, karen xie wrote:
Dear List,

I try to implement the latent class model with the unknown number of
classes. I wonder whether someone can provide me some sample codes.

RSiteSearch("latent class")

brings up 160 documents some of which may be relevant

Thank you for your help.

Karen


Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Error when naming rows of dataset

2006-10-25 Thread Michael Dewey
At 17:30 24/10/2006, yongchuan wrote:
I get the following error when I try reading in a table.
How are 1.1, 1.2, 1.3 duplicate row names? Thx.

R gives you brief details of where it was when it fell over.
Have you checked in 'latestWithNumber.txt' to see whether R is right?


  table <- read.table('latestWithNumber.txt', header=T)
Error in `row.names<-.data.frame`(`*tmp*`, value = c("1.1", "1.2", "1.3",  :
  duplicate 'row.names' are not allowed

Yongchuan

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Error when naming rows of dataset

2006-10-25 Thread Michael Dewey
At 12:44 25/10/2006, yongchuan wrote:
I'm pretty new with R, so the only error message I see is
the below that I pasted. I'm attaching the first few rows
of the file for reference.

You seem to have 2 rows called 2.1

The layout looks screwy when I
attach it here but 'Start' to 'closingCoupon' is the first
row in the .txt file. Thx!

     Start  Stop  PrepayDate  modBalance  closingCoupon
1.1      6     7           0    811.2769           8.35
1.2      7     8           0    811.2769           8.35
1.3      8     9           1    811.2769           8.35
2.1      4     5           0   2226.0825           8.7
2.2      5     6           0   2226.0825           8.7
2.3      6     7           0   2226.0825           8.7
2.4      7     8           0   2226.0825           8.7
2.5      8     9           0   2226.0825           8.7
2.6      9    10           0   2226.0825           8.7
2.7     10    11           0   2226.0825           8.7
2.8     11    12           0   2226.0825           8.7
2.9     12    13           0   2226.0825           8.7
2.1     13    14           0   2226.0825           8.7
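One hedged workaround, assuming (as it appears) that read.table is taking the duplicated first column as row names because the header line has one field fewer than the data lines: pass row.names = NULL so the labels are kept as an ordinary data column instead.

```r
# Minimal reproduction: first column has a duplicate ("2.1" twice)
f <- tempfile()
writeLines(c("Start Stop",
             "2.1 4 5",
             "2.1 13 14"), f)
## read.table(f, header = TRUE)            # fails: duplicate 'row.names'
dat <- read.table(f, header = TRUE, row.names = NULL)
dat    # duplicated labels kept as the first data column, rows numbered 1:2
```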

 
  From: Michael Dewey [EMAIL PROTECTED]
  Date: Wed 25/10/2006 6:38 PM SGT
  To: yongchuan [EMAIL PROTECTED], r-help@stat.math.ethz.ch
  Subject: Re: [R] Error when naming rows of dataset
 
  At 17:30 24/10/2006, yongchuan wrote:
  I get the following error when I try reading in a table.
  How are 1.1, 1.2, 1.3 duplicate row names? Thx.
 
  R gives you brief details of where it was when it fell over.
  Have you checked in latestWithNumber.txt' to see whether R is right?
 
 
    table <- read.table('latestWithNumber.txt', header=T)
  Error in `row.names<-.data.frame`(`*tmp*`, value = c("1.1",
 "1.2", "1.3",  :
   duplicate 'row.names' are not allowed
  
  Yongchuan
 
  Michael Dewey
  http://www.aghmed.fsnet.co.uk
 
 





Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] glm cannot find valid starting values

2006-10-16 Thread Michael Dewey
At 16:12 15/10/2006, Ronaldo ReisJunior wrote:
On Saturday 14 October 2006 11:15, Gabor Grothendieck wrote:
  Try using OLS starting values:
 
  glm(Y~X,family=gaussian(link=log), start = coef(lm(Y~X)))
 

OK, using starting values the model works.

But some time ago it worked without any starting values. Why do I need
starting values now?

Since this involves a prediction about how a 
computer will work in the future I would have 
thought the quote from Samuel Goldwyn in your .sig was helpful.

More seriously I think you need to give more 
detail about when it worked for those more expert 
than me to tell you what happened.


Thanks
Ronaldo
--
Never make predictions, especially about the future.
 -- Samuel Goldwyn
--
  Prof. Ronaldo Reis Júnior
|  .''`. UNIMONTES/Depto. Biologia Geral/Lab. Ecologia Evolutiva
| : :'  : Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia
| `. `'` CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil
|   `- Fone: (38) 3229-8190 | [EMAIL PROTECTED]
| ICQ#: 5692561 | LinuxUser#: 205366

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] glm cannot find valid starting values

2006-10-14 Thread Michael Dewey
At 15:31 13/10/2006, Ronaldo ReisJunior wrote:
Hi,

I have a similar problem. Some time ago this problem did not exist.

Look this simple example:

  Y <- c(0,0,0,1,4,8,16)
  X <- c(1,2,3,4,5,6,7)

  m <- glm(Y~X,family=gaussian(link=log))
Error in eval(expr, envir, enclos) : cannot find valid starting 
values: please
specify some

Have you tried specifying some starting values as it asks?
Try start=c(-3,1) for instance
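A sketch of both suggestions from this thread (the explicit guesses above, or Gabor Grothendieck's OLS coefficients from the later message as starting values):

```r
Y <- c(0, 0, 0, 1, 4, 8, 16)
X <- c(1, 2, 3, 4, 5, 6, 7)
# explicit guesses on the log-link scale ...
m1 <- glm(Y ~ X, family = gaussian(link = log), start = c(-3, 1))
# ... or seed the fit with the ordinary least squares coefficients
m2 <- glm(Y ~ X, family = gaussian(link = log), start = coef(lm(Y ~ X)))
coef(m1)
```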



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] error from pmvnorm

2006-10-01 Thread Michael Dewey
At 20:11 30/09/2006, [EMAIL PROTECTED] wrote:
Hi all,

Can anyone tell me what the following error message means?
 Error in mvt(lower = lower, upper = upper, df = 0, corr = corr, 
delta = mean,
:  NA/NaN/Inf in foreign function call (arg 6)

It was generated when I used the 'pmvnorm' function in the 'mvtnorm' package.

You are supposed I think to send questions about contributed packages 
to the maintainer, but before you do, or repost to this list, it 
might be a good idea to collect more information.
What was the call of pmvnorm?
What version of mvtnorm do you have?
What version of R?


Thanks a lot.

Yonghai Li

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Variable im Data Frame Namen

2006-09-21 Thread Michael Dewey
At 12:03 20/09/2006, Thorsten Muehge wrote:

Hello R Experts,
how can I incorporate a variable in a data frame definition?

Thorsten, you have already had a reply about this, but I wonder 
whether you _really_ want to do this. You do not say what the 
scientific question is you are trying to answer, but I find that when 
I am tempted to do this it is almost always better to make a list (in 
your case of data frames) and then access the elements of the list

Example:
week <- 28
test(week) <- data.frame(a,b,s,c);
test28
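Under the list approach suggested above, the sketch below (with placeholder columns a, b, s standing in for the real vectors) keeps one data frame per week and retrieves it by name:

```r
# Placeholder data; in practice a, b, s would be the real vectors
a <- 1:3; b <- 4:6; s <- 7:9
test <- list()                       # one slot per week
week <- 28
test[[as.character(week)]] <- data.frame(a, b, s)
test[["28"]]                         # retrieve week 28's data frame
names(test)                          # all weeks stored so far
```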

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Consulta sobre bases en R

2006-09-01 Thread Michael Dewey
At 17:04 31/08/2006, Marcela Corrales wrote:
Dear all,

I would like to know the maximum size of data my database can contain
that the R software is able to process and support.

Marcela, you will get more help if you
(a) post in English
(b) give us more information about your problem and your system.

At the moment the answer to your question is 'How long is a piece of string?'

Try using the site search facility for large databases


Thank you in advance, and best regards.

You're welcome



Marcela Corrales

CPData Optimum

And please do not send HTML.



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] How to reply to a thread if receiving R-help mails in digest form

2006-08-14 Thread Michael Dewey
At 14:21 13/08/2006, Dirk Enzmann wrote:
I receive R-help messages in digest form that makes it difficult to 
answer a post. I noticed that my answer is not added to the thread 
(instead, a new thread is started) although I use the same subject 
line (starting with Re: ) as the original post. Is there a 
solution (I prefer the digest to separate messages for several 
reasons and don't want to change my email reader)?

Dirk, I asked some while ago for people to tell me how they read 
digests in their news reader so I could put together a list of 
helpful hints. Unfortunately I got nowhere. In the email client I use 
(Eudora) there is an option to receive digests as attachments (as Ted 
Harding has also outlined in another email). It is rather hidden so I 
suspect if you can do it in Thunderbird it will be similarly hidden. 
Try looking for such an option and let me know if you find it.


The way I answer post up to now is:
1) I press the reply button of my email program (Mozilla / 
Thunderbird, Windows)
2) I delete all contents of the digest except for the post 
(including name and mail address of the posting person) I want to 
answer so that the original question will be included (cited) in my answer.
3) I add the email address to the individual sender to cc: to the 
automatically generated address of the R-help list.
4) I replace the automatically generated subject line (for example 
"Re: R-help Digest, Vol 42, Issue 13") by "Re: " followed by a copy 
of the original subject line of the post.
5) I write my answer and send the mail to the mailing list.

It's not that this is tedious - the problem is that the thread is 
broken. Is there a better way even if I want to keep receiving 
messages in digest form? The posting guide is silent about this.

Dirk

*
Dr. Dirk Enzmann
Institute of Criminal Sciences
Dept. of Criminology
Edmund-Siemers-Allee 1
D-20146 Hamburg
Germany

phone: +49-(0)40-42838.7498 (office)
+49-(0)40-42838.4591 (Billon)
fax:   +49-(0)40-42838.2344
email: [EMAIL PROTECTED]
www: 
http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html



Michael Dewey
http://www.aghmed.fsnet.co.uk

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Vector Join

2006-08-14 Thread Michael Dewey
At 13:23 13/08/2006, Michael Zatorsky wrote:
Hi,

I'm working on producing a simple cumulative frequency
distribution.

Thanks to the help of the good people on this list I
now have four vectors that I'd like to join/relate
into a table. e.g.


v1 <- myHistogram$breaks # classes
v2 <- myHistogram$counts # freqs
v3 <- cumsum(v2) # cumulative freq
v4 <- ((v3 / length(myData)) * 100)  # cumulative %

data.frame(v1 = myHistogram$breaks, v2 = myHistogram$counts, and so on ...)
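A fuller sketch of that suggestion (data invented; note that hist() returns one more break than counts, so the first break is dropped here):

```r
myData <- rnorm(100)                      # invented example data
myHistogram <- hist(myData, plot = FALSE)
tab <- data.frame(class   = myHistogram$breaks[-1],   # upper class limits
                  freq    = myHistogram$counts,
                  cumfreq = cumsum(myHistogram$counts),
                  cumpct  = 100 * cumsum(myHistogram$counts) / length(myData))
# one row per class, ready to write out as text
write.table(tab, "cumfreq.txt", row.names = FALSE, quote = FALSE)
```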



What is the recommended approach to turning these into a
single table with four columns?  i.e. effectively doing
a relational join on row id?

The goal is to ultimately have the data with one row
per class in a format I can write out to a text file
as:

   v1  v2  v3  v4
   v1  v2  v3  v4
   etc...

Any advice will be appreciated.

Regards
Michael.








Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html



Re: [R] Great R documentation

2006-07-31 Thread Michael Dewey
At 10:09 31/07/2006, hadley wickham wrote:
Dear all,

I'm trying to improve the documentation I provide my R packages, and
to that end I'd like to find out what you think is great R
documentation.  I'm particularly interested in function documentation,

Hadley, I do not think any bit of function documentation, if you mean 
the manual page type documents, is ever that enlightening unless you 
already know what the function does. For me the useful documents are 
the more extended narrative documents.

What I find helpful is:
start with what is the aim of this document
tell me what I am assumed to know first (and ideally tell me where to 
look if I do not)
start with an example using the defaults
tell me what the output means in terms of the scientific problem
tell me what action I might take when it gives me a warning (if that 
is predictable)
tell me about other packages with the same aim and why I should use this one
tell me where to go for more information

but great vignettes, websites or book are also of interest.

What is your favourite bit of R documentation, and why?

Thanks,

Hadley



Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Joining variables

2006-05-25 Thread Michael Dewey
At 21:01 24/05/2006, [EMAIL PROTECTED] wrote:
Hello,

Although I cannot speak for the original individual posting this 
question but I would like to see an example, if possible, of how to 
construct a factor of several factors.

?interaction (assuming I understand you correctly)
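For instance (values taken from the example quoted below):

```r
Var1 <- factor("SA100055113")
Var2 <- factor("19851113")
# paste() gives a character vector; interaction() gives a factor directly
as.character(interaction(Var1, Var2, sep = ""))  # "SA10005511319851113"
paste(Var1, Var2, sep = "")                      # same strings, but character
```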


I think this would be more efficient than paste() if such a 
procedure were possible and, of course, would also answer the 
poster's original question.

Many thanks,

Mark LoPresti
Thomson Financial/Research Group

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Berton Gunter
Sent: Wednesday, May 24, 2006 2:54 PM
To: 'Guenther, Cameron'; r-help@stat.math.ethz.ch
Subject: Re: [R] Joining variables


What does "combination of both" mean exactly? I can think of two
interpretations that have two different answers. If you give a small example
(as the posting guide suggests) it would certainly help me provide an
answer.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA





  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of
  Guenther, Cameron
  Sent: Wednesday, May 24, 2006 11:46 AM
  To: r-help@stat.math.ethz.ch
  Subject: [R] Joining variables
 
  Hello,
 
  If I have two variables that are factors or characters and I want to
  create a new variable that is the combination of both what
  function can
  I use to accomplish this?
 
  Ex.
 
  Var1  Var2
  SA100055113   19851113
 
  And I want
 
  NewVar
  SA10005511319851113
 
  Thanks in advance.
 
  Cameron Guenther, Ph.D.
  Associate Research Scientist
  FWC/FWRI, Marine Fisheries Research
  100 8th Avenue S.E.
  St. Petersburg, FL 33701
  (727)896-8626 Ext. 4305
  [EMAIL PROTECTED]
 


Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html



Re: [R] logistic regression model with non-integer weights

2006-04-12 Thread Michael Dewey
At 17:12 09/04/06, Ramón Casero Cañas wrote:

I have not seen a reply to this so far apologies if I missed something.


When fitting a logistic regression model using weights I get the
following warning

> data.model.w <- glm(ABN ~ TR, family = binomial(logit), weights = WEIGHT)
Warning message:
non-integer #successes in a binomial glm! in: eval(expr, envir, enclos)

Details follow

***

I have a binary dependent variable of abnormality

ABN = T, F, T, T, F, F, F...

and a continous predictor

TR = 1.962752 1.871123 1.893543 1.685001 2.121500, ...



As the number of abnormal cases (ABN==T) is only 14%, and there is large
overlapping between abnormal and normal cases, the logistic regression
found by glm is always much closer to the normal cases than for the
abnormal cases. In particular, the probability of abnormal is at most 0.4.

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   0.7607     0.7196   1.057   0.2905
TR2          -1.4853     0.4328  -3.432   0.0006 ***
---

I would like to compensate for the fact that the a priori probability of
abnormal cases is so low. I have created a weight vector

I am not sure what the problem you really want to solve is but it seems that
a) abnormality is rare
b) the logistic regression predicts it to be rare.
If you want a prediction system why not try different cut-offs (other than 
0.5 on the probability scale) and perhaps plot sensitivity and specificity 
to help to choose a cut-off?
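A minimal sketch of that idea (ABN, TR and the 14% prevalence come from the post; the data and cut-off values here are invented):

```r
set.seed(1)
ABN <- rbinom(200, 1, 0.14) == 1                   # ~14% abnormal cases
TR  <- rnorm(200, mean = ifelse(ABN, 1.7, 2.0), sd = 0.2)
fit <- glm(ABN ~ TR, family = binomial(logit))     # unweighted fit
p   <- fitted(fit)                                 # predicted P(abnormal)
# try cut-offs below 0.5 and see how sensitivity/specificity trade off
for (cut in c(0.1, 0.2, 0.3, 0.5)) {
  pred <- p > cut
  cat(sprintf("cut-off %.1f: sensitivity %.2f, specificity %.2f\n",
              cut, mean(pred[ABN]), mean(!pred[!ABN])))
}
```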

WEIGHT <- ABN
WEIGHT[ ABN == TRUE ]  <- 1 / na / 2
WEIGHT[ ABN == FALSE ] <- 1 / nn / 2

so that all weights add up to 1, where ``na'' is the number of abnormal
cases, and ``nn'' is the number of normal cases. That is, normal cases
have less weight in the model fitting because there are so many.

But then I get the warning message at the beginning of this email, and I
suspect that I'm doing something wrong. Must weights be integers, or at
least greater than one?

Regards,

--
Ramón Casero Cañas

http://www.robots.ox.ac.uk/~rcasero/wiki
http://www.robots.ox.ac.uk/~rcasero/blog

Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Wikis etc.

2006-01-10 Thread Michael Dewey
At 16:06 09/01/06, Gabor Grothendieck wrote:

[snip various earlier posts]


In addition to books, the various manuals, contributed documents and
mailing list archives, all of which one should review,
the key thing to do if you want to really learn R is to read source code
and lots of it.  I think there is no other way.  Furthermore, the fact that
you can do this is really a key advantage of open source.

But that is the solution to a different problem.
Reading the source for merge tells you how R merges two dataframes; the 
beginning user wants to know how to link together the information s/he has 
in two files but does not know what the name of the relevant command is, or 
indeed whether it is even possible.

To give you some idea of how ignorant some of us are it was only quite 
recently that I realised (despite several years reading free documentation, 
books on R and R-help) that if I type cor at the prompt what I see looks 
like source code but is not _the_ source code.


Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Wikis etc.

2006-01-09 Thread Michael Dewey
At 20:12 08/01/06, Jack Tanner wrote:
Philippe's idea to start a wiki that grows out of the content on 
http://zoonek2.free.fr/UNIX/48_R/all.html is really great. Here's why.

My hypothesis is that the basic reason that people ask questions on R-help 
rather than first looking elsewhere is that looking elsewhere doesn't get 
them the info they need.

People think in terms of the tasks they have to do. The documentation for 
R, which can be very good, is organized in terms of the structure of R, 
its functions. This mismatch -- people think of tasks, the documentation 
thinks in functions -- causes people to turn to the mailing list.

Further to that I feel that (perhaps because they do not like to blow their 
own trumpet too much) the authors of books on R do not stress how much most 
questioners could gain by buying and reading at least one of the many books 
on R. When I started I found the free documents useful but I made most 
progress when I bought MASS. I do realise that liking books is a bit last 
millennium.




Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] How to create a correlation table for categorical data???

2006-01-05 Thread Michael Dewey
At 19:07 04/01/06, fabio crepaldi wrote:
Hi,
   I need to create the correlation table of a set of categorical data 
 (sex, marital_status, car_type, etc.) for a given population.
   Basically what I'm looking for is a function like cor( ) working on 
 factors (and, if possible, considering NAs as a level).

If they are ordered have you considered downloading the polycor package?
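A sketch of what that might look like (assumes the polycor package is installed; the data are invented):

```r
library(polycor)
set.seed(1)
z <- rnorm(200)
x <- cut(z + rnorm(200), 3)      # two related categorical variables
y <- cut(z + rnorm(200), 4)
polychor(table(x, y))            # polychoric correlation from the contingency table
# hetcor() builds a whole mixed (Pearson/polychoric/polyserial) correlation
# matrix from a data frame, e.g. hetcor(data.frame(x, y))
```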


Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html



[R] How do I get sub to insert a single backslash?

2006-01-05 Thread Michael Dewey
Something about the way R processes backslashes is defeating me.
Perhaps this is because I have only just started using R for text processing.

I would like to change occurrences of the ampersand "&" into ampersand 
preceded by a backslash.

> temp <- "R & D"
> sub("&", "\&", temp)
[1] "R & D"
> sub("&", "\\&", temp)
[1] "R & D"
> sub("&", "\\\&", temp)
[1] "R & D"
> sub("&", "\\\\&", temp)
[1] "R \\& D"

So I can get zero, or two backslashes, but not one. I am sure this is 
really simple but I did not find the answer by doing, for example ?regexp 
or ?Quotes
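A note that was not part of the original post, but may explain the puzzle: print() escapes backslashes when displaying a string, while cat() writes the raw contents, so a string holding a single backslash is displayed with two by print():

```r
temp <- "R & D"
out  <- sub("&", "\\\\&", temp)  # replacement "\\&": literal backslash, then "&"
print(out)       # the deparsed form shows the backslash doubled
cat(out, "\n")   # R \& D  -- just one backslash in the string itself
nchar(out)       # 6 characters: R, space, backslash, &, space, D
```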


Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] from Colombia - help

2005-12-22 Thread Michael Dewey
At 13:50 20/12/05, andres felipe wrote:
   Hi, my name is Andres Felipe Barrientos, I'm a student of Statistics and 
 I don't speak English well. For my degree thesis I need to use the 
 function smooth.spline and I need to know which kind of spline it works 
 with (B-splines or natural splines).

Since I have not yet seen a reply here goes.
It is very honest of you to tell us it is your 'trabajo de grado' (degree 
thesis) but should you not ask your tutor for help?
Have you tried ?smooth.spline which tells you what the function does 
(admittedly fairly tersely)?

   I would also like to know which basis is used to find these splines, 
 for example cosines, sines, polynomials, among others. Another question 
 I have is what the relationship is between smooth.spline and ns. 
 Thanking you for your attention and hoping for a reply, from the 
 Universidad del Valle, yours, Andres Felipe




Michael Dewey
http://www.aghmed.fsnet.co.uk



[R] Comments please, how to set your mailer to read R-help in digest format

2005-12-12 Thread Michael Dewey
There are occasional comments on this list about how difficult it is to 
read the digest format. Since it took me a few false starts to configure my 
mailer to do it I have (with the encouragement of the list owner) put 
together a few hints. All I need now is for people who use other mailers to 
fill in the details in the same way as I have done for Eudora.

The draft is available at
http://www.aghmed.fsnet.co.uk/r/digest.htm

Any other comments welcome of course.


Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] Digest reading is tedious

2005-09-02 Thread Michael Dewey
At 11:43 01/09/05, Martin Maechler wrote:

[snip section about Trevor Hastie's experience]


What are other readers' experiences with mailman mailing lists
in digest mode -- using MIME type delivery?

I use Eudora 6.2.1.2 (which is not the very latest version) running under 
Windows 98 or Windows XP and the digests are fine once you have found the 
option 'Receive MIME digest as mailbox attachment' and turned it on. I can 
then see the whole set of messages, sort them by subject and reply to them 
individually without having to change the subject of the message.

Hope that helps.


Michael Dewey
http://www.aghmed.fsnet.co.uk



Re: [R] (sans objet)

2005-06-27 Thread Michael Dewey
At 11:12 26/06/05, pir2.jv wrote:
My first question: may I write in French? My English is worse than
bad. I'll try in French; if not, I'll try a translation!

I am a beginner. My question is therefore simple, I imagine.

I think ?merge is your friend here.

I am working with tables of statistics. Let me take an example:
I have a data frame, which I call eurostat, giving me trade
statistics:

eurostat: Pays  Produit  Tonnage
          CI    801      10123
...


I have another frame, cp, which tells me

Produit  Nom
801      Coconut

and another which tells me that CI is Côte d'Ivoire.
I want to create a new table, for example Importation, which gives
me the following data:

Importation  Pays           Produit  Tonnage
             Côte d'Ivoire  Coconut  10123

... and I cannot manage it. Yet it must surely be basic.

Thank you for your help.
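A sketch of the merge approach with the tables above (the country-name lookup table and its column names are invented; only the one row shown is used):

```r
eurostat <- data.frame(Pays = "CI", Produit = 801, Tonnage = 10123)
cp       <- data.frame(Produit = 801, Nom = "Coconut")
pays     <- data.frame(Pays = "CI", NomPays = "Cote d'Ivoire")  # hypothetical lookup
# merge() matches rows on the named key column; chain two merges
Importation <- merge(merge(eurostat, cp, by = "Produit"), pays, by = "Pays")
Importation[, c("NomPays", "Nom", "Tonnage")]
```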

Jacques Vernin
PiR2
10, Boulevard de Brazza
13008 Marseille
0491 734 736
[EMAIL PROTECTED]



Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html



Re: [R] Problem trying to use boot and lme together

2005-06-22 Thread Michael Dewey
At 23:09 21/06/05, Prof Brian Ripley wrote:
On Tue, 21 Jun 2005, Douglas Bates wrote:

On 6/21/05, Søren Højsgaard [EMAIL PROTECTED] wrote:

Thanks everyone for your help, more comments at the foot

The problem with simulate.lme is that it only returns logL for a given 
model fitted to a simulated data set  - not the simulated data set 
itself (which one might have expected a function with that name to 
do...). It would be nice with that functionality...
Søren

You could add it.  You just need to create a matrix that will be large
enough to hold all the simulated data sets and fill a column (or row
if you prefer but column is probably better because of the way that
matrices are stored) during each iteration of the simulation and
remember to include that matrix in the returned object.

Note: you don't need to store it: you can do the analysis at that point 
and return the statistics you want, rather than just logL.

I did say `see simulate.lme', not `use simulate.lme'.  I know nlme is no 
longer being developed, but if it were I would be suggesting/contributing 
a modification that allowed the user to specify an `extraction' function 
from the fit -- quite a few pieces of bootstrap code work that way.

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

This has been most informative for me. Given the rather stern comments in 
Pinheiro and Bates that most things do not have the reference distribution 
you might naively think I now realise that simulate.lme is more important 
than its rather cursory treatment in the book. As suggested I have looked 
at the code but although I can see broadly what each section does I lack 
the skill to modify it myself. I will have to wait for someone more gifted.

If there is to be a successor edition to Pinheiro and Bates perhaps I could 
suggest that this topic merits a bit more discussion?


Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html



[R] Problem trying to use boot and lme together

2005-06-21 Thread Michael Dewey
The outcome variable in my dataset is cost. I prefer not to transform it as 
readers will really be interested in pounds, not log(pounds) or 
sqrt(pounds). I have fitted my model using lme and now wish to use boot on 
it. I have therefore plagiarised the example in the article in RNews 2/3 
December 2002

after loading the dataset I source this file


require(boot)
require(nlme)
model.0 <- lme(tot ~ time + timepost + pct + totpat
    + (time + timepost) : single + single
    + (time + timepost) : train + train
    + time:gpattend + timepost:gpattend + gpattend,
    data = common,
    random = ~time + timepost | gp
)
ints.9 <- intervals(model.0, which = "fixed")$fixed[9,]
#
common$fit <- fitted(model.0)
common$res <- resid(model.0, type = "response")
cats.fit <- function(data) {
    mod <- lme(tot ~ time + timepost + pct + totpat
        + (time + timepost) : single + single
        + (time + timepost) : train + train
        + time:gpattend + timepost:gpattend + gpattend,
        data = data,
        random = ~ time + timepost | gp)
    summ <- summary(mod)
    c(fixef(summ), diag(summ$varFix))
}
model.fun <- function(d, i) {
    d$tot <- d$fit + d$res[i]
    cats.fit(d)
}
tot.boot <- boot(common, model.fun, R = 999)

So I fit the model and then generate fitted values and residuals which I 
use within the model.fun function to generate the bootstrap resample.

If I do this the plot looks fine as does the jack.after.boot plot but the 
confidence intervals are about 10% of the width of the ones from the lme 
output. I wondered whether I was using the wrong fitted and residuals so I 
added level = 0 to the calls of fitted and resid respectively (level = 1 is 
the default) but this seems to lead to resamples to which lme cannot fit 
the model.

Can anyone spot what I am doing wrong?

I would appreciate a cc'ed response as my ISP has taken to bouncing the 
digest posts from R-help with probability approximately 0.3.

FWIW I am using 2.1.0 under XP (SP2) with the versions of boot and nlme 
which shipped with the binary.


Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html



Re: [R] How I calculate nCr with R ? (Como calculo nCr con R? )

2005-03-19 Thread Michael Dewey
At 10:37 19/03/05, Mario Morales wrote:
En español  (In Spanish)
Necesito calcular la en numeros de combinaciones de n cosas
tomando k al tiempo.
In English we usually read this as "n choose r" and with that clue you might 
go ?choose
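For example:

```r
choose(5, 2)                                   # 10 ways to pick 2 from 5
factorial(5) / (factorial(2) * factorial(3))   # the same, from the definition
combn(5, 2)                                    # the combinations themselves, one per column
```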

Incidentally your English is fine although I see the logic in giving us both.
Como hago eso en R ???
Yo escribí mi propia función pero pienso que  de esa forma no es
fácil para mis estudiantes .
He estado buscando en la ayuda y no he encontrado información
sobre una función que calcule eso directamente. Podrían ayudarme
In English (en Inglés )
I need calculate the combination of n things taking k at time.
How do I do that in R ?
I wrote my own function but in this form isn't easy for my
students.
Can you help me ?

Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html


[R] Tetrachoric and polychoric ceofficients (for sem) - any tips?

2004-11-28 Thread Michael Dewey
About two years ago there was a thread about this which suggested that at 
that time nobody had these coefficients ready to go.
(a) has anyone in the meanwhile programmed them?
(b) I think I can see how to do the tetrachoric one with mvtnorm on similar 
lines to an example on the help page so will try that if nobody else 
already has
(c) looking at the polychoric one makes me realise yet again that I wish I 
knew more mathematics. I would also find it difficult to test it as I do 
not have any worked examples. Does anyone have any tips about how to 
program it and how to test the resultant code?
(d) I appreciate this last item is not strictly an R question, but my 
intention is to use these as input into the sem package for structural 
equation models. If anyone thinks that is misguided I would be interested to 
hear.

Michael Dewey
[EMAIL PROTECTED]


Summary, was [R] Plotting panels at arbitrary places on a map, rather than on a lattice

2004-10-06 Thread Michael Dewey
At 17:58 01/10/04, Michael Dewey wrote:
I think it is easiest to describe what I want in terms of the concrete
problem I have. I have data from a number of countries in each of which
a sample of people was interviewed. In presenting the results in a
forthcoming collaborative publication much emphasis will be placed on
the multi-centre nature of the study. Although I suspect colleagues may
do this with shaded maps I would prefer to avoid them as (a) they
present one fact per country per map (b) they are unfair to the
Netherlands and other high density countries.

What I would like to do is to make the background represent Europe
(ideally with a map but that is a frill) then place simple scattergrams
(or radar plots) on it located roughly where the country is. Another
way of describing it might be to say that I want something like the
panels produced by lattice but at arbitrary coordinates rather than on
a rectangular grid. I suspect I have to do this from scratch and I
would welcome hints.

Am I right that there is no off the shelf way to do this? Is grid the
way to go? Looking at the article in Rnews 2(2) and a brief scan of the
documentation suggests so. If grid is the way to go then bearing in
mind I have never used grid before (a) any hints about the overall
possible solution structure would be welcome (b) is this realistic to
do within a week or shall I revert to lattice and lose the geography?

Is there a simple way to draw a map in the background? It needs to
cover as far as Sweden, Spain and Greece. It can be crude; as long as
Italy looks roughly like a boot that is fine. I am an epidemiologist
not a geographer.
I received some very helpful hints and was able to get a satisfactory solution.
Roger Bivand  pointed me in the right way with map. After loading maps
and mapproj I go
map("world", region = c("France", "Belgium", "Greece", "Spain", "Italy",
    "Switzerland", "Sweden", "Germany", "Netherlands", "Austria",
    "Denmark", "Sicily", "Sardinia"),
    xlim = c(-10, 30), ylim = c(30, 60),
    projection = "albers", parameters = c(30, 60), col = "lightgreen")
then bearing in mind that my data is in a file called merg
which contains the coordinates and the data to plot
merg$x <- mapproject(merg$long, merg$lat)$x
merg$y <- mapproject(merg$long, merg$lat)$y
gets the coordinates in the new system.
Deepayan Sarkar reminded me about stars which I should
have remembered myself from reading MASS.
I now go
stars(2*merg[,ord+4], scale = FALSE, len = 0.07,
   locations = cbind(merg$x,merg$y), labels = NULL,
   cex = 0.3,
   key.loc = c(mapproject(-10,60)$x,mapproject(-10,60)$y),
   add = TRUE
)
and get a plot which does what I wanted.
Roger had pointed out that gridBase was probably the way to put 
scatterplots on the map
but I decided after looking at the stars that scatterplots would end up too 
small to use
so I stayed with lattice for them.
(When I said scattergrams I meant what nearly everyone else calls 
scatterplots.)

Thanks to both of them, and to the people who made maps and stars so easy
when you know what to look for.

Michael Dewey
[EMAIL PROTECTED]


[R] Plotting panels at arbitrary places on a map, rather than on a lattice

2004-10-01 Thread Michael Dewey
I think it is easiest to describe what I want in terms of the concrete
problem I have. I have data from a number of countries in each of which
a sample of people was interviewed. In presenting the results in a
forthcoming collaborative publication much emphasis will be placed on
the multi-centre nature of the study. Although I suspect colleagues may
do this with shaded maps I would prefer to avoid them as (a) they
present one fact per country per map (b) they are unfair to the
Netherlands and other high density countries.

What I would like to do is to make the background represent Europe
(ideally with a map but that is a frill) then place simple scattergrams
(or radar plots) on it located roughly where the country is. Another
way of describing it might be to say that I want something like the
panels produced by lattice but at arbitrary coordinates rather than on
a rectangular grid. I suspect I have to do this from scratch and I
would welcome hints.

Am I right that there is no off the shelf way to do this? Is grid the
way to go? Looking at the article in Rnews 2(2) and a brief scan of the
documentation suggests so. If grid is the way to go then bearing in
mind I have never used grid before (a) any hints about the overall
possible solution structure would be welcome (b) is this realistic to
do within a week or shall I revert to lattice and lose the geography?

Is there a simple way to draw a map in the background? It needs to
cover as far as Sweden, Spain and Greece. It can be crude; as long as
Italy looks roughly like a boot that is fine. I am an epidemiologist
not a geographer.


Michael Dewey
[EMAIL PROTECTED]


Re: [R] rbind with missing columns

2004-05-02 Thread Michael Dewey
At 11:03 01/05/04, you wrote:
Date: Fri, 30 Apr 2004 13:47:35 +0200
From: [EMAIL PROTECTED]
Subject: [R] rbind with missing columns
To: [EMAIL PROTECTED]
Hello,
I have several data sets, which I want to combine column-by-column using
rbind (the data sets have ~100 columns). The problem is, that in some
data sets some of the columns are missing.
Simply using rbind gives me:
Error in match.names(clabs, names(xi)) : names don't match previous
names
Is there a simple way to make R do the rbind despite the missing columns
and to fill out the missing data with NA's? (I could program this
somehow, but probably there is one very simple solution available)
To make it clear here a simplified example. unknown.command is what I
am looking for.
A <- data.frame(a = c(1,2,3), b = c(4,5,6))
B <- data.frame(a = c(1,2,3))
unknown.command(A, B)  # should give
A B
1 4
2 5
3 6
4 NA
5 NA
6 NA
Does
A$id <- 1:nrow(A)
B$id <- 1:nrow(B) + nrow(A)
AandB <- merge(A, B, all = TRUE)
> A
  a b id
1 1 4  1
2 2 5  2
3 3 6  3
> B
  a id
1 1  4
2 2  5
3 3  6
> AandB
  a id  b
1 1  1  4
2 1  4 NA
3 2  2  5
4 2  5 NA
5 3  3  6
6 3  6 NA

give you what you wanted? You can always strip out the id column if you 
wish (after sorting on it if you would like the original order back).

Thank you for your help
Klaus
Michael Dewey
[EMAIL PROTECTED]


[R] Re: How transform excell file to text file in R

2004-03-30 Thread Michael Dewey
At 12:00 30/03/04 +0200, you wrote:
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Date: Mon, 29 Mar 2004 17:22:14 -0300 (ART)
Subject: [R] how transform excell file to text file in R
Hello, I want to know if there is a library that transforms an Excel
file (*.xls) to a text file (*.txt, *.prn, *.csv).  Thanks
library(RODBC) will read your Excel file into R. You do not need a copy of 
Excel to do this.
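A sketch of one way to do that (the file name is hypothetical; RODBC's Excel driver is Windows-only):

```r
library(RODBC)
channel <- odbcConnectExcel("mydata.xls")   # open the workbook via ODBC
sqlTables(channel)                          # list the sheets it contains
dat <- sqlFetch(channel, "Sheet1")          # read one sheet as a data frame
odbcClose(channel)
write.csv(dat, "mydata.csv", row.names = FALSE)  # write it out as text
```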

Michael Dewey
[EMAIL PROTECTED]


[R] Lattice, skip= and layout= problem, plotting object from nlme output

2004-03-22 Thread Michael Dewey
I generate a groupedData object

library(nlme)
obj <- groupedData(mg10 ~ time | gp, data = common, outer = ~pct)
gp has 101 levels, and pct has 3. There are 38, 25, 38 gps in each of the 
levels of pct respectively.

I fit my model

fit.rtg <- lme(mg10 ~ time * group,
    data = obj,
    random = ~time * group | gp)
Now I try to plot the results. I would like to print 40 panels on each page 
and have a new level of pct start a new page. The effect of using outer in 
the call of groupedData is that the values of gp are presented by pct.

plot.rtg <- plot(augPred(fit.rtg),
    skip = c(rep(FALSE, 38), TRUE, TRUE,
             rep(FALSE, 25), rep(TRUE, 15),
             rep(FALSE, 38), TRUE, TRUE),
    layout = c(5, 8, 3),
    strip = FALSE
)
What I get is 38 panels on page 1 with 2 blank panels top right. I had 
hoped to get 25 on the next page with 15 blanks but I get 38 again with 2 
blanks.

If I alter layout to (say) layout = c(12, 10) I get blanks as expected on 
the single page produced so I surmise that skip is forcing the same format 
on each page. There is an example in \cite{pinheiro00} which suggests that 
what I wanted can be done (p113,  p445-447) so I suspect I am doing 
something stupid here.

I am using R 1.8.1 with the versions of lattice and nlme that came with it. 
I am using Windows 98SE if that is relevant.

@BOOK{pinheiro00,
  author = {Pinheiro, J C and Bates, D M},
  year = 2000,
  title = {Mixed-effects models in {S} and {S-PLUS}},
  publisher = {Springer-Verlag}
}
Michael Dewey
[EMAIL PROTECTED]


Re: [R] Lattice, skip= and layout= problem, plotting object from nlme output

2004-03-22 Thread Michael Dewey
At 13:28 22/03/04 -0600, you wrote:

[snip my original problem]


You are seeing documented lattice behavior, but it looks like that is
inconsistent with what happens in S-PLUS. lattice replicates the skip
vector to be as long as the number of panels per page and uses that for
every page, while S-PLUS replicates it to be as long as the total
number of panels spanning all pages.
I can easily fix this for 1.9.0, but this may break old (but probably
rare) lattice code. For example,
layout = c(2,2,2), skip = c(T,F,F)

will now expand to

page 1: T, F, F, T
page 2: F, F, T, F
as opposed to the current behavior

page 1: T, F, F, T
page 2: T, F, F, T
Any objections to that ?
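A minimal sketch of the recycling difference described above (the data and
panel counts here are invented purely for illustration):

```r
library(lattice)

## Six panels laid out 2 x 2 over two pages.
d <- data.frame(y = rnorm(60), x = rep(1:10, 6), g = gl(6, 10))

## skip = c(TRUE, FALSE, FALSE) is recycled to the four panel slots of
## one page -- c(TRUE, FALSE, FALSE, TRUE) -- and, in the behaviour
## described above, the same pattern is reused on every page.
xyplot(y ~ x | g, data = d,
       layout = c(2, 2, 2),
       skip = c(TRUE, FALSE, FALSE))
```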
From my point of view the current behaviour is not very useful, and in 
your example either behaviour seems a bit strange, so I would not object to 
it changing.

Thanks for the quick response, and thanks for making lattice available. I 
find myself making so many more graphical displays because they are (a) 
easy (b) aesthetically pleasing (c) revealing about my dataset.

Deepayan
Michael Dewey
[EMAIL PROTECTED]


[R] Specifying suitable PC to run R

2003-10-09 Thread Michael Dewey
If I am buying a PC where the most compute intensive task will be running R 
and I do not have unlimited resources what trade-offs should I make?
Specifically should I go for
1 - more memory, or
2 - faster processor, or
3 - something else?
If it makes a difference, I shall be running Windows on it, and I am thinking 
about getting a portable, which I understand makes upgrading more difficult.

Extra background: the tasks I notice going slowly at the moment are fitting 
models with lme which have complex random effects, and bootstrapping. By the 
standards of r-help posters I have small datasets (a few thousand cases, a few 
hundred variables). In order to facilitate working with colleagues I need 
to stick with Windows even if Linux would be more efficient.

Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html