Re: [R] How I can read the binary file with different type?

2008-08-22 Thread Prof Brian Ripley

On Fri, 22 Aug 2008, Bingxiang Miao wrote:


Hi all,

I have a binary file of 8*100 bytes. The structure of the file is as
follows: every eight bytes represent 2 data values: the first four bytes are
a little-endian integer, the next four bytes are a little-endian 32-bit
float. The structure of the following bytes is the same as the first
eight bytes'.

The function readBin only reads a binary file with one structure, like
this: readBin("my data file", what = "integer", size = 4, n = 200) or
readBin("my data file", what = "numeric", size = 4, n = 200). But then only
one type of the data can be read properly.


That's plain wrong, and the examples on the help page disprove it. Please 
do read the help before posting. The concept is to use multiple readBin 
calls from an open connection (as in the examples on the help page).
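The multiple-readBin pattern from the help page can be sketched as follows (a self-contained illustration, not the original poster's file: a demo file in the described 8*100-byte layout is written first just so the read-back can be shown):

```r
# Write a demo file of 100 interleaved (int32, float32) pairs, then read it
# back with alternating readBin() calls on one open connection.
tf <- tempfile()
con <- file(tf, "wb")
for (i in 1:100) {
  writeBin(as.integer(i), con, size = 4, endian = "little")  # int32
  writeBin(i + 0.5, con, size = 4, endian = "little")        # float32
}
close(con)

con <- file(tf, "rb")
ints   <- integer(100)
floats <- numeric(100)
for (i in 1:100) {
  ints[i]   <- readBin(con, what = "integer", size = 4, endian = "little")
  floats[i] <- readBin(con, what = "numeric", size = 4, endian = "little")
}
close(con)
```

Because the connection stays open, each readBin call continues where the previous one stopped, so the two types come out interleaved exactly as stored.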



Can anyone suggest how I should read this file?


We DO suggest you 'should' read the posting guide and the help pages.



Thanks in advance.

Miao

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] Help Regarding 'integrate'

2008-08-22 Thread Prof Brian Ripley
As my post showed, it is a scaling issue.  The function has so small a 
peak that it is effectively 0 -- when scaled sensibly, integrate then works 
out of the box.
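The rescaling idea can be sketched as follows (the integrand is the one from the original post further down; the scale factor 1e6 is an assumption for illustration, chosen so the peak of the integrand is of order 1 rather than of order 1e-6):

```r
f <- function(x) dchisq(x, 9, 77) * ((13.5 / x)^5) * exp(-13.5 / x)
sc <- 1e6  # hypothetical scale factor; any constant bringing the peak to O(1)
res <- integrate(function(x) sc * f(x), 0, Inf)
res$value / sc  # dividing by the same constant recovers the original integral
```

The value agrees with the split-interval results quoted below (about 1.654e-4).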


On Thu, 21 Aug 2008, Thomas Lumley wrote:


On Thu, 21 Aug 2008, Moshe Olshansky wrote:

The phenomenon is most likely caused by numerical errors. I do not know how 
'integrate' works, but numerical integration over a very long interval does 
not look like a good idea to me.


Numerical integration over a long (or infinite) interval is fine. The problem 
here is that integrate appears not to find the peak of the integrand.  The 
integrand is very flat near zero, which makes things harder.


It seems to work well to split the integral somewhere not too far to the left 
of its peak. Splits at 20, 50, 100, or 200 give good agreement.



> integrate(f, 0, 100)
0.0001600355 with absolute error < 6.5e-10
> integrate(f, 100, Inf)
5.355248e-06 with absolute error < 3.4e-06

> integrate(f, 0, 200)
0.0001653797 with absolute error < 3.1e-05
> integrate(f, 200, Inf)
3.141137e-13 with absolute error < 1.2e-13

> integrate(f, 0, 50)
2.702318e-05 with absolute error < 1.3e-16
> integrate(f, 50, Inf)
0.0001383181 with absolute error < 8e-05

Transforming the integral is only going to be helpful if you have a good idea 
of what it looks like and what the integrate() function is expecting. An 
automated transformation to a bounded interval won't help since that is the 
first thing the underlying Fortran code does.


-thomas


I would do the following:

f1 <- function(x) {
  return(dchisq(x, 9, 77) * ((13.5 / x)^5) * exp(-13.5 / x))
}

f2 <- function(y) {
  return(dchisq(1 / y, 9, 77) * ((13.5 * y)^5) * exp(-13.5 * y) / y^2)
}
# substitute y = 1/x

I1 <- integrate(f1, 0, 1, abs.tol = 1.0e-20 * .Machine$double.eps^0.25)

I2 <- integrate(f2, 0, 1, abs.tol = 1.0e-20 * .Machine$double.eps^0.25)

--- On Thu, 21/8/08, A [EMAIL PROTECTED] wrote:


From: A [EMAIL PROTECTED]
Subject: [R] Help Regarding 'integrate'
To: r-help@r-project.org
Received: Thursday, 21 August, 2008, 4:44 PM
I have an R function defined as:

f <- function(x) {
  return(dchisq(x, 9, 77) * ((13.5 / x)^5) * exp(-13.5 / x))
}

Numerically integrating this function, I observed a couple
of things:

A) Integrating the function f over the entire positive real line gives an
error:

> integrate(f, 0, Inf)
Error in integrate(f, 0, Inf) : the integral is probably divergent

B) Increasing the interval of integration actually decreases the value of
the integral:

> integrate(f, 0, 10^5)
9.813968e-06 with absolute error < 1.9e-05
> integrate(f, 0, 10^6)
4.62233e-319 with absolute error < 4.6e-319


Since the function f is uniformly positive, B) cannot occur. Also, the
theory tells me that the infinite integral actually exists and is finite,
so A) cannot occur. That means there are certain problems with the usage
of the function 'integrate' which I do not understand. The help document
tells me that 'integrate' uses quadrature approximation to evaluate
integrals numerically. Since I do not come from the numerical-methods
community, I do not know the pros and cons of the various methods of
quadrature approximation. One naive way I thought of to evaluate the
above integral was to first locate the maximum of the function (maybe
using 'optimize' in R) and then apply Simpson's rule to an interval
around the maximum. However, I am sure that the people behind the R
project and other users have much better ideas, and I am sure the
'correct' method has already been implemented in R. Therefore, I would
appreciate it if someone could help me find it.


Thanks







Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle






Re: [R] R memory limitations under Windows vs UNIX

2008-08-22 Thread Prof Brian Ripley

On Fri, 22 Aug 2008, Fiona Giannini wrote:



Hi all,

Our section at work is looking at buying a high-powered computer (32 Gb 
RAM, 64-bit) to be able to run models on large datasets and have more 
processing power for the software we commonly use. We mostly use R to 
fit GLMs and GAMs. Our department uses Windows as the standard OS (we 
are upgrading to Vista in the next few months) and so most people would 
be in favour of having the same OS on this computer. I have concerns 
with Windows as I'm not convinced it will make full use of the increased 
RAM.


I have done some searching on the web and as far as I can see, the most


Well, that *is* in the rw-FAQ -- so do we take it that as an R for Windows 
user you have yet to read it?



amount of RAM under Windows I can expect to make use of is 4 Gb using
Vista 64-bit whereas UNIX will make use of closer to the full amount of 
RAM.


Is this correct? Is there a way to make the windows version of R use 
more of the available RAM? Or is going for a UNIX OS the better 
solution?


See the rw-FAQ for info on 64-bit builds of R under Windows.


Any advice would be very much appreciated!

Fiona







[R] Predictive Analytics event Sept 24-25, Chicago

2008-08-22 Thread Elise Johnson

Hi, I wanted to make sure you were all aware of these upcoming events. There
is a seminar in Predictive Analytics on Sept 24-25 in Chicago, and two
following in D.C. (Oct.) and San Francisco (Nov.).  This is intensive
training for managers, marketers, and IT people who need to make sense of
customer data to predict buying behavior, profit, etc.  Past attendees have
given rave reviews. 

You can find more info at
http://www.predictionimpact.com/predictive-analytics-training.html, e-mail
[EMAIL PROTECTED], or call (415) 683-1146.

thanks --Elise Johnson, Prediction Impact

-- 
View this message in context: 
http://www.nabble.com/Predictive-Analytics-event-Sept-24-25%2C-Chicago-tp19101661p19101661.html
Sent from the R help mailing list archive at Nabble.com.



[R] subset grouped data with quantile and NA's

2008-08-22 Thread David Carslaw

I can't quite seem to solve a problem subsetting a data frame.  Here's a
reproducible example. 

Given a data frame:

dat <- data.frame(fac = rep(c("a", "b"), each = 100),
                  value = c(rnorm(130), rep(NA, 70)),
                  other = rnorm(200))

What I want is a new data frame (with the same columns as dat) excluding the
top 5% of 'value', separately for "a" and "b". For example, this produces the
results I'm after in an array:

sub <- tapply(dat$value, dat$fac, function(x) x[x < quantile(x, probs =
0.95, na.rm = TRUE)])

My difficulty is putting them into a data frame along with the other columns
'fac' and 'other'. Note that this returns different-length vectors
due to the different numbers of NAs for "a" and "b".

There's something I'm just not seeing - can you help?
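One possible approach (a sketch, not part of the original post): compute a logical keep index per group with split()/unsplit(), then subset the whole data frame, so all columns come along together.

```r
set.seed(1)  # reproducible example data, as in the question
dat <- data.frame(fac = rep(c("a", "b"), each = 100),
                  value = c(rnorm(130), rep(NA, 70)),
                  other = rnorm(200))
# TRUE where 'value' is below its own group's 95th percentile (NA rows give NA)
keep <- unsplit(lapply(split(dat$value, dat$fac),
                       function(x) x < quantile(x, probs = 0.95, na.rm = TRUE)),
                dat$fac)
newdat <- dat[which(keep), ]  # which() drops both FALSE and NA rows
```

Because the index is built in the original row order, the differing group sizes (and NA counts) take care of themselves.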

Many thanks.

David Carslaw

-
Institute for Transport Studies
University of Leeds



[R] Parabolic cylinder function

2008-08-22 Thread Zornitsa Luleva
Dear all,

I need your advice since I am looking for an implementation of the parabolic
cylinder function in R. I found implementations of the hypergeometric
functions (the Whittaker and confluent hypergeometric functions) in the
package fAsianOptions, but the parabolic cylinder function was unfortunately
not there. Do you know of such an implementation?

Thank you very much for your advice.

Cheers,
Zoe




Re: [R] GAM-binomial logit link

2008-08-22 Thread Fabrizio Cipollini
 Dear all,

 I'm using a binomial distribution with a logit link function to fit a GAM
 model. I have 2 questions about it.
 First, I am not sure if I've chosen the most adequate distribution. I don't
 have presence/absence data (0/1), but I do have a rate whose values vary
 between 0 and 1. This means the response variable is continuous, even if
 within a limited interval. Should I use binomial?

I guess it is safer to use the option
family = quasibinomial
since, with a continuous [0,1] response, the empirical (conditional)
variance of y can differ significantly from the corresponding theoretical
binomial variance.

You can find fuller references in
Papke & Wooldridge (1996), 'Journal of Applied Econometrics' (vol. 11, pp.
619-632).
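In mgcv this amounts to something like the following (a sketch on simulated data; the variable names and the simulation itself are hypothetical, not from the thread):

```r
library(mgcv)  # provides gam()

set.seed(42)
d <- data.frame(x = runif(200))
d$rate <- plogis(2 * d$x - 1) + rnorm(200, sd = 0.05)
d$rate <- pmin(pmax(d$rate, 0), 1)  # a continuous rate in [0, 1]

# Logit link with quasi-binomial variance: the dispersion is estimated
# from the data rather than fixed at the binomial value of 1.
m <- gam(rate ~ s(x), family = quasibinomial(link = "logit"), data = d)
summary(m)$dispersion
```

The estimated dispersion in the summary is what distinguishes this from a plain binomial fit.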

 Secondly, in the numerical output I get negative values of the UBRE score. I
 would like to know if one should consider the lowest absolute value or the
 lowest real value to select the best model.

Hmmm... On the basis of the UBRE formula within gam{mgcv}, UBRE scores
should be nonnegative. Please inspect the values of the single elements
inside the formula to discover possible problems.


 Thank you in advance for your help.
 Marina

Fabrizio Cipollini



[R] Censored Poisson Data

2008-08-22 Thread Justine Rochon
Dear list,

I am wondering whether R allows one to perform Poisson regression on count
data which include censored observations, that is, observations of the form
"2 events or more", "less than 2 events", or even "1 or 2 events".

Can anyone give me a hint whether this is already possible and how to do it?

Data of the form (obstime is the observation time, offset in Poisson 
regression):

obs   obstime   cmin   cmax
1     1         2      Inf
2     2         0      1
3     1         1      2
and so on.

Best wishes,

Justine

--



Re: [R] Simple look up

2008-08-22 Thread bartjoosen

Please take a look at the manuals before posting (DO read the posting guide!)

Bart
PS: yourdataframe[yourdataframe$TAZ==103,2]





PDXRugger wrote:
 
 So I am hoping this solution is simple, which I believe it is.  I would
 like to look up a value in one column of a data set and display the
 corresponding value in the 2nd column.  For example:
 TAZ    VACANT ACRES
 100    45
 101    62
 102    23
 103    64
 104    101
 105    280
 
 So if I want to find TAZ 103, it would give me 64 as the corresponding
 value.  Hope you guys can help.
 
 Cheers
 
 
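Bart's one-liner, applied to the data from the question (a sketch; the column is renamed VACANT_ACRES here only because the original header contains a space):

```r
taz <- data.frame(TAZ = 100:105,
                  VACANT_ACRES = c(45, 62, 23, 64, 101, 280))
taz[taz$TAZ == 103, 2]            # -> 64
taz$VACANT_ACRES[taz$TAZ == 103]  # the same lookup, by column name
```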




Re: [R] read.csv : double quoted numbers

2008-08-22 Thread Aval Sarri
On Thu, Aug 21, 2008 at 12:54 AM, Gabor Grothendieck
[EMAIL PROTECTED] wrote:

 The problem is the commas, not the quotes:

First, thanks to everyone for taking the time to help me.

As Mr. Gabor rightly pointed out, the commas were my problem and not the
double quotes (in numerical values). This solves my problem. But I
have a couple more questions.

1) Is it recommended to set the default locale to C when working with R?

I am asking this since the file that I am reading is .xls (but
containing text only). The file command on a Linux terminal returns
"ASCII English text, with very long lines, with CRLF line
terminators".

gsub fails on this file due to a particular value, "Côte d'Ivoire"
(with a corrupted byte in the accented character), which is present in
the file. This is when my locale is set to en_US.utf8, but when I set
the locale to C it (gsub) works. So is it okay to set the default
locale to C?

2) The other problem is escaping the double quote in pipe and sed. I have
tried almost all possible combinations that came to my limited mind:

 a = scan(pipe("sed -e s/\"//g WEOall.xls"), sep="\t")
 a = scan(pipe('sed -e s/\"//g WEOall.xls'),  sep="\t")
 a = scan(pipe("sed -r s/\\\"//g WEOall.xls"), sep="\t")
 a = scan(pipe('sed -e s/"//g WEOall.xls'), sep="\t")
 a = scan(pipe('sed -e "s///g" WEOall.xls'), sep="\t")
 a = scan(pipe(paste('sed -e s/\"//g WEOall.xls')), sep="\t")
 a = scan(pipe(paste('sed -e s/"//g WEOall.xls')), sep="\t")

For all of the above I am getting: sh: Syntax error: Unterminated quoted string

> sessionInfo()
R version 2.6.2 (2008-02-08)
i486-pc-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

Thanks and Regards
Aval S.







 Lines.raw <- '"1,001.23"\t"1,008,000.456"'
 Lines <- readLines(textConnection(Lines.raw))
 Lines <- gsub(",", "", Lines)
 read.table(textConnection(Lines))
       V1      V2
 1 1001.23 1008000







 On Wed, Aug 20, 2008 at 2:27 PM, Aval Sarri [EMAIL PROTECTED] wrote:
 Hello;

 I am new user of R; so pardon me.

 I am reading a .txt file that has around 50+ numeric columns with '\t'
 as separator. I am using the read.csv function along with colClasses, but
 that fails to recognize double-quoted numeric values. (My numeric
 values are something like 1,001.23; 1,008,000.456.)  Basically
 read.csv fails with: scan() expected 'a real', got '1,044.059'.

 What I have tried and problems with them:


 1) I tried  scan and pipe but getting following error message; that is
 how do I replace all double quotes with nothing. I tired enclosing sed
 command in single quotes but that does not help.
 (Though the sed command works from shell)

 scan(pipe("sed -e s/\"//g DataAll.txt"), sep="\t")
 sh: Syntax error: Unterminated quoted string

 2) On mailing list on solution I found was setAs() described here
 http://www.nabble.com/Re%3A--R--read.table()-and-scientific-notation-p6734890.html

 3) Other than using as.is=TRUE and then doing as.numeric for numeric
 columns what is the solution?  But then how do I efficiently convert
 50+ columns to numeric using regular expression? That is all my
 numeric columns name starts with 'X' character, so how do I use sapply
 and/or regular expression to convert all columns starting with X to
 numeric? What is the alternate method to do so?
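For (3), the name-based bulk conversion can be sketched as follows (a hypothetical data frame for illustration; not from the thread):

```r
# Convert every column whose name starts with "X" to numeric,
# stripping the thousands-separator commas first.
df <- data.frame(X1 = c("1,001.23", "2,000"), X2 = c("10", "20"),
                 country = c("a", "b"), stringsAsFactors = FALSE)
xcols <- grep("^X", names(df))
df[xcols] <- lapply(df[xcols], function(col) as.numeric(gsub(",", "", col)))
str(df)  # X1 and X2 are now numeric; 'country' is untouched
```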

 Basically 2 and 3 work, but which one is the efficient and correct way to do
 this?

 (Also, what is the most efficient way to apply field-level validation and
 conversion while reading a file? Does one have to read the file first, and
 only after that can validation and conversion happen?)

 Thanks for taking out time to read through the mail.

 Thanks and Regards
 -Aval





Re: [R] skipping values when reading binary files

2008-08-22 Thread Uwe Ligges



Pieter Beck wrote:

Hi,

I would like to skip a preset number of values when reading in a binary file
(readBin). Is there any way to do this?



See ?seek

Uwe Ligges
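A minimal sketch of seek() combined with readBin (the 8-byte header and the float32 payload are assumptions for illustration; a demo file is written first so the example is self-contained):

```r
# Build a demo file: an 8-byte header followed by ten float32 values.
tf <- tempfile()
con <- file(tf, "wb")
writeBin(raw(8), con)                                  # header to be skipped
writeBin(as.numeric(1:10), con, size = 4, endian = "little")
close(con)

con <- file(tf, "rb")
seek(con, where = 8)                                   # skip the first 8 bytes
vals <- readBin(con, what = "numeric", size = 4, n = 10, endian = "little")
close(con)
```

seek() moves the read position in bytes, so "skipping a preset number of values" means skipping (number of values) * (size per value) bytes.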



kind regards,

and thanks in advance for the help.

 


Pieter Beck

 








Re: [R] barplot with anchored bars

2008-08-22 Thread Uwe Ligges



Paulo Barata wrote:


Dear R list members,

How do I produce barplots anchored to the x-axis (not floating
above the x-axis), with a box around?

With both following codes, the lower horizontal line of the
box is below the y = 0 line:

# first code
x <- c(1,2,3,4)
barplot(x, yaxs = 'i')
box()

# second code
x <- c(1,2,3,4)
op <- par(yaxs = 'i')
barplot(x)
box()
par(op)


You have to set ylim in that case.
If you read the barplot.default code, you will find that rAdj is an 
adjustment value that can only be omitted by specifying ylim.


Uwe Ligges
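A sketch of that suggestion (not from the thread):

```r
x <- c(1, 2, 3, 4)
# Supplying ylim explicitly makes barplot skip its internal range
# adjustment, so the bars are anchored at y = 0 and box() closes on the axis.
barplot(x, ylim = c(0, max(x)))
box()
```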





The parameter yaxs='i' does not seem to work with barplot.

I use R version 2.7.1, running on Windows XP.

Thank you very much.

Paulo Barata


Paulo Barata
Fundacao Oswaldo Cruz - Oswaldo Cruz Foundation
Rua Leopoldo Bulhoes 1480 - 8A
21041-210  Rio de Janeiro - RJ
Brazil

E-mail: [EMAIL PROTECTED]
Alternative e-mail: [EMAIL PROTECTED]





Re: [R] read.csv : double quoted numbers

2008-08-22 Thread Philipp Pagel

 gsub fails on this file due to a particular value, "Côte d'Ivoire"
 (with a corrupted byte in the accented character), which is present in
 the file. This is when my locale is set to en_US.utf8, but when I set
 the locale to C it (gsub) works.

Have a look at ?file, especially the section on Encoding. 

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://mips.gsf.de/staff/pagel



[R] chisq

2008-08-22 Thread Marco Chiapello
Hi,
the chi-squared value for the .05 level of significance and 3 df is 7.81
(one-sided). Is there a way to calculate it in R (given the level of
significance and df)?

Marco
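The usual answer is the chi-squared quantile function (a short sketch):

```r
# Upper 5% critical value of a chi-squared distribution with 3 df
qchisq(0.95, df = 3)                       # 7.814728
qchisq(0.05, df = 3, lower.tail = FALSE)   # same value, stated as an upper tail
```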



[R] Help on competing risk package cmprsk with time dependent covariate

2008-08-22 Thread Philippe Guardiola

Dear R users,


I'd like to assess the effect of a treatment covariate on disease relapse
risk with the package cmprsk.
However, the effect of this covariate on survival is time-dependent
(assessed with cox.zph): no significant effect during the first year of
follow-up, then after 1 year a favorable effect is observed on survival
(a step function might be the correct way to say that?).
For overall survival analysis I have used a time-dependent Cox model which
has confirmed this positive effect after 1 year.
Now I'm moving to disease relapse incidence, and a similar time dependency
seems to be present.

What I'd like to have is this: for patients without treatment, the code for
the treatment covariate is always 0; for patients who received treatment,
I'd like to have it = 0 during the time interval 0 to 1 year, and equal to
1 after 1 year. Correct me if I'm wrong in trying to do so.


First, I have run the following script (R 2.7.1 under XP Pro) according to
previous advice:

library(cmprsk)
attach(LAMrelapse)
fit1 <- crr(rel.t, rel.s, treatment, treatment,
            function(uft) cbind(ifelse(uft <= 1, 1, 0), ifelse(uft > 1, 1, 0)),
            failcode = 1, cencode = 0, na.action = na.omit,
            gtol = 1e-06, maxiter = 10)
fit1

where:
rel.t = time to event (in years)
rel.s = status: = 1 if disease relapse, = 2 if death from a non-disease-related
cause (toxicity of previous chemotherapy), = 0 if alive and not in relapse
treatment = binary covariate (value 0 or 1) representing the treatment to test
(different from the chemotherapy above, with no known toxicity)
I have not yet added other covariates in the model.


This script gave me the following result:

> fit1 <- crr(relcmp.t, relcmp.s, treatment, treatment,
+             function(uft) cbind(ifelse(uft <= 1, 1, 0), ifelse(uft > 1, 1, 0)),
+             failcode = 1, cencode = 0, na.action = na.omit,
+             gtol = 1e-006, maxiter = 10)
> fit1
convergence:  TRUE 
coefficients:
[1] -0.6808  0.7508
standard errors:
[1] 0.2881 0.3644
two-sided p-values:
[1] 0.018 0.039

...which I don't understand at all, since it looks as if the treatment
covariate also had a significant effect in the first period of time!?
This is absolutely not the case.
So I'm surely wrong about a part of this script... cov2 and tf are
pretty obscure to me in the help file of the package. I would really
appreciate advice regarding these 2 terms.

I was thinking that I might change: cbind(ifelse(uft <= 1, 1, 0),
ifelse(uft > 1, 1, 0)) into: cbind(ifelse(uft <= 1, 0, 1),
ifelse(uft > 1, 1, 0))

But since I only have one covariate (treatment) to test, shouldn't I write
only the following:
fit1 <- crr(rel.t, rel.s, treatment, treatment,
            function(uft) ifelse(uft <= 1, 0, 1),
            failcode = 1, cencode = 0, na.action = na.omit,
            gtol = 1e-06, maxiter = 10)

which gives me:

> fit1
convergence:  TRUE 
coefficients:
[1]  0.06995 -0.75080
standard errors:
[1] 0.2236 0.3644
two-sided p-values:
[1] 0.750 0.039

which, if I understand things correctly (I'm not sure at all!), confirms
that before 1 year the effect of the treatment covariate is not significant,
but is significant after 1 year of follow-up. But there again I'm not sure
of the result I obtain...

Any help would be greatly appreciated with cov2 and tf.

Thanks if you have some time for this,


Philippe Guardiola


  




[R] Retrieving sample points

2008-08-22 Thread Yemi Oyeyemi
Hi everyone. My problem is about multivariate analysis. Given multivariate 
data X of dimension (n by p), n is the number of observations while p is the 
number of variables. I want a sample of size p+1 whose product of the 
eigenvalues of the covariance matrix is the minimum among all possible 
samples (n choose (p+1)). I have written code which worked well, but the 
problem is that it does not work when the dimension of X is large (when n 
is greater than 25). I think the problem is memory size; is there any way 
I can maximize the memory?
Here is the code that I used:
 
bestunits <- function(x) {
  n <- nrow(x)
  p <- ncol(x)
  h <- p + 1
  nchoosek <- function(n, h) {
    factorial(n) / (factorial(h + 1) * factorial(n - h - 1))
  }
  vec11 <- matrix(nrow = nchoosek(n, h), ncol = p)
  vec12 <- matrix(nrow = nchoosek(n, h), ncol = h)
  for (i in 1:nchoosek(n, h)) {
    flag <- sample(1:n, h, replace = FALSE)
    z <- x[flag, ]
    y <- eigen(var(z), only.values = TRUE)$values
    vec11[i, ] <- y
    vec12[i, ] <- flag
  }
  product <- apply(vec11, 1, prod)
  best <- which.min(product)
  sample <- vec12[best, ]
  bestsample <- x[sample, ]
  return(bestsample)
}
Thanks
Oyeyemi, Gafar Matanmi
Department of Statistics
University of Ilorin
Ilorin, Nigeria


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] psychometric functions

2008-08-22 Thread Mario Maiworm
Paul,
Thanks a lot for your advice! I got the Kuss et al. paper and started to
read; this is more than I had expected! A very cool paper, and an algorithm
implemented in R that seems to be even superior to psignifit in Matlab. So
thanks again!
Mario


__

Mario Maiworm
Biological Psychology and Neuropsychology
University of Hamburg
Von-Melle-Park 11
D-20146 Hamburg

Phone: +49 40 42838 8265
Fax: +49 40 42838 6591

http://bpn.uni-hamburg.de/Maiworm_e.html
http://cinacs.org
__

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 project.org] On Behalf Of Paul Artes
 Sent: Thursday, August 21, 2008 5:37 PM
 To: r-help@r-project.org
 Subject: Re: [R] psychometric functions
 
 
 There is a nice paper by Yssaad-Fesselier and Knoblauch on Modelling
 Psychometric Functions in R.
 http://hal.archives-ouvertes.fr/docs/00/13/17/99/PDF/B125.pdf
 
 You might also be interested in this:
 http://www.journalofvision.org/5/5/8/article.aspx
 which comes from the same group as the psignifit toolbox for matlab
 (methinks), but is a step ahead.
 
 Best wishes
 
 Paul
 



[R] swap

2008-08-22 Thread amor Gandhi
Hello everybody,
 
I wonder if there is any swap function in R that does the following:
x <- seq(1,10,12)
x1 <- swap(x)
x1
1 8 3 4 5 6 7 2 10
Thank you very much!
 
Amor
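There is no built-in swap() in base R, but one can be sketched in a few lines (the two positions to exchange are an assumption about what was intended, since the question does not show them):

```r
# Exchange the elements of x at positions i and j
swap <- function(x, i, j) {
  x[c(i, j)] <- x[c(j, i)]
  x
}
swap(1:10, 2, 8)  # -> 1 8 3 4 5 6 7 2 9 10
```

Vectorized indexing on both sides of the assignment does the exchange without a temporary variable.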




Re: [R] Very confused with class

2008-08-22 Thread Dan Davison

Hi Rich,


Richard M. Heiberger wrote:
 
 Dan,
 
 The real problem is the use of csv files.  csv files don't handle missing
 values (#VALUE is most likely from Excel), dates, or other complications
 very well.
 
 Read your Excel file directly into R with one of the packages designed
 specifically for that purpose.  I recommend RExcel (Windows only), which
 allows complete two-way communication between R and Excel.
 Missing values and dates are handled correctly.
 You can download the RExcelInstaller package from CRAN.
 

I'm sure RExcel is an excellent technology. However, it is an unnecessarily
complex technology in this instance. What I was trying to do was help the
original poster read in tabular data stored in a standard text format, which
is a fundamental skill for any R programmer. In general, I would encourage
people (beginners especially) to avoid the use of hi-tech solutions, when
simple text-based solutions suffice. But when people do need to have more
sophisticated integration of R and e.g. Excel, it's nice that the tools
exist.

Dan


Richard M. Heiberger wrote:
 
 
 Rich
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Very-confused-with-class-tp19090246p19104343.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Help on competing risk package cmprsk with time dependent covariate

2008-08-22 Thread Arthur Allignol

Hello,

Something I don't understand
in your question.
Is treatment a time-dependent covariate?
That is, do patients receive the treatment
at the beginning of the study or later?

cmprsk cannot handle time-dependent covariates.

But if treatment is a baseline covariate,
but has a time-varying effect (i.e. does the subdistribution hazard 
ratio vary with time?), your solution

to assess that is weird, because you will transform
your baseline covariate into a time-dependent one,
thus considering all the patients to receive no treatment
the first year. For sure, the treatment won't have any
effect for the first year.
To assess a time-varying effect on competing risks,
I would either follow the cmprsk documentation, including
an interaction with functions of time, or use the comp.risk
function in the timereg package, which fits more flexible
models for the cumulative incidence functions.

Best regards,
Arthur Allignol


Philippe Guardiola wrote:

Dear R users,


I'd like to assess the effect of a treatment covariate on disease relapse risk with the package cmprsk. 
However, the effect of this covariate on survival is time-dependent

(assessed with cox.zph): no significant effect during the first year of 
follow-up,
then after 1 year a favorable effect is observed on survival (step
function might be the correct way to say that ?). 
For overall survival analysis

I have used a time dependent Cox model which has confirmed this positive effect 
after
1 year.
Now I'm moving to disease relapse incidence, and a similar time dependency seems to be present. 


What I'd like to have is this: for
patients without treatment, the code for the treatment covariate is
always 0; for patients who received treatment, I'd like
it to be 0 during the time interval 0 to 1 year, and equal to 1 after 1
year. Correct me if I'm wrong in trying to do so.


First, I have run the following script (R2.7.1 under XPpro) according to 
previous advices:

library(cmprsk)
attach(LAMrelapse)
fit1 <- crr(rel.t, rel.s, treatment, treatment, function(uft)
cbind(ifelse(uft<=1,1,0),ifelse(uft>1,1,0)), failcode=1,
cencode=0, na.action=na.omit, gtol=1e-06, maxiter=10)
fit1

where:
rel.t = time to event (in years)
rel.s = status , =1 if disease relapse, =2 if death from non disease
related cause (toxicity of previous chemotherapy), =0 if alive 
not in relapse
treatment =
binary covariate (value: 0 or 1) representing the treatment to test
(different from chemotherapy above, with no known toxicity)
I have not yet added other covariates in the model.


this script gave me the following result:
fit1 <- crr(relcmp.t, relcmp.s, treatment, treatment, function(uft) cbind(ifelse(uft <= 1, 1, 0), ifelse(uft > 1, 1, 0)), failcode = 1, cencode = 0, 

na.action = na.omit, gtol = 1e-006, maxiter = 10)

fit1
convergence:  TRUE 
coefficients:

[1] -0.6808  0.7508
standard errors:
[1] 0.2881 0.3644
two-sided p-values:
[1] 0.018 0.039

...which I don't understand at all, since it looks as if the treatment
covariate also had a significant effect in the first period of time!? 
This is absolutely not the case. 
So I'm surely wrong with a part of this script... cov2 and tf are
pretty obscure for me in the help file of the package. I would really
appreciate advice regarding these 2 terms. 


I was thinking that I might change: cbind(ifelse(uft <= 1, 1, 0), ifelse(uft > 1, 
1, 0)) into: cbind(ifelse(uft <= 1, 0, 1), ifelse(uft > 1, 1, 
0))

But since I only have one covariate (treatment) to test, shouldn't I only write 
the following:
fit1 <- crr(rel.t,
rel.s, treatment, treatment, function(uft) ifelse(uft<=1,0,1), failcode=1,
cencode=0, na.action=na.omit, gtol=1e-06, maxiter=10)

which gives me :

fit1
convergence:  TRUE 
coefficients:

[1]  0.06995 -0.75080
standard errors:
[1] 0.2236 0.3644
two-sided p-values:
[1] 0.750 0.039

which, if I understand things
correctly (I'm not sure at all!), confirms that before 1 year the effect of 
the treatment covariate is not
significant, but is significant after 1 year of follow-up. But there I'm again 
not sure of the result I obtain...

any help would be greatly appreciated with cov2 and tf

thanks if you have some time for this,


Philippe Guardiola








Re: [R] swap

2008-08-22 Thread Richard . Cotton
 I wonder if there is any swap function in R that does the following:
 x <- seq(1,10,12)

This just makes x equal to 1.  I'm guessing you meant something like
x <- 1:10

 x1 <- swap(x)
 x1 
 1 8 3 4 5 6 7 2 10

It's not terribly clear what you want swap to do.  You can reorder the 
elements of x using sample, e.g.
sample(x)
[1]  6 10  8  4  2  5  1  9  3  7

Regards,
Richie.

Mathematical Sciences Unit
HSL






[R] Coordinate systems for geostatistics in R

2008-08-22 Thread imicola

Hi,

I read somewhere that when carrying out geostatistical analysis in R you
should not use latitude and longitude...can anyone expand on this a little
for me, and what would be the best coordinate system to use?

I have my data in a geographic coordinate system, WGS84, decimal
degrees... is this the wrong format for such analyses?

I have also converted my data to the UTM projection and so have it in
metres (ranging from 480,000 to 550,000 E and 170,000 to 230,000 N).  

If I was to use the UTM coordinates, should I be using the actual
coordinates in metres, or should I convert this into an arbitrary coordinate
system (i.e. from 0 - 1) somehow?  

I have noticed that running an analysis on the data gives different results
depending on which type of system you use, so I want to make sure I have the
correct system.  I should also probably note that I am a geostatistical
novice!
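The short answer is that geostatistical models assume the distances computed from your coordinates are meaningful, and Euclidean distance on decimal degrees is not: a degree of longitude spans fewer metres the further you move from the equator, so distances in "degree units" are anisotropic. A small base-R sketch (the mean Earth radius of 6371 km is an assumption) shows the distortion:

```r
# Great-circle (haversine) distance in metres between two lon/lat points.
haversine <- function(lon1, lat1, lon2, lat2, R = 6371000) {
  rad <- pi / 180
  dlat <- (lat2 - lat1) * rad
  dlon <- (lon2 - lon1) * rad
  a <- sin(dlat / 2)^2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dlon / 2)^2
  2 * R * asin(sqrt(a))
}
haversine(0, 0, 1, 0)    # one degree of longitude at the equator: ~111 km
haversine(0, 51, 1, 51)  # the same degree at 51 N: only ~70 km
```

So projected metric coordinates such as your UTM values are the right choice, and you can use the metres directly; rescaling to 0-1 is not required, although subtracting the coordinate means can sometimes help numerical stability.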

Thanks,
-- 
View this message in context: 
http://www.nabble.com/Coordinate-systems-for-geostatistics-in-R-tp19104598p19104598.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Large data sets with R (binding to hadoop available?)

2008-08-22 Thread Rory.WINSTON
Hi

Apart from database interfaces such as sqldf which Gabor has mentioned, there 
are also packages specifically for handling large data: see the ff package, 
for instance.

I am currently playing with parallelizing R computations via Hadoop. I haven't 
looked at PIG yet though.

Rory


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Roland Rau
Sent: 21 August 2008 20:04
To: Avram Aelony
Cc: r-help@r-project.org
Subject: Re: [R] Large data sets with R (binding to hadoop available?)

Hi

Avram Aelony wrote:

 Dear R community,

 I find R fantastic and use R whenever I can for my data analytic needs.
 Certain data sets, however, are so large that other tools seem to be
 needed to pre-process data such that it can be brought into R for
 further analysis.

 Questions I have for the many expert contributors on this list are:

 1. How do others handle situations of large data sets (gigabytes,
 terabytes) for analysis in R ?

I usually try to store the data in an SQLite database and interface via 
functions from the packages RSQLite (and DBI).

No idea about Question No. 2, though.

Hope this helps,
Roland


P.S. When I am sure that I only need a certain subset of large data sets, I 
still prefer to do some pre-processing in awk (gawk).
2.P.S. The size of my data sets are in the gigabyte range (not terabyte range). 
This might be important if your data sets are *really large* and you want to 
use sqlite: http://www.sqlite.org/whentouse.html
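For the common case where the raw file is one huge delimited text file, chunked reading through a connection is another low-tech option: only one chunk is ever in memory. A hedged sketch (`process_chunks` and its arguments are invented names, not from any package):

```r
# Stream a large CSV in fixed-size chunks, applying FUN to each chunk.
process_chunks <- function(path, chunk_size = 10000, FUN = nrow) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # parse the header line once to recover the column names
  col_names <- names(read.csv(text = readLines(con, n = 1)))
  out <- list()
  repeat {
    chunk <- tryCatch(
      read.csv(con, header = FALSE, nrows = chunk_size,
               col.names = col_names),
      error = function(e) NULL)   # raised once the connection is exhausted
    if (is.null(chunk) || nrow(chunk) == 0) break
    out[[length(out) + 1]] <- FUN(chunk)
    if (nrow(chunk) < chunk_size) break
  }
  out
}
```

Each chunk can be filtered or aggregated and the per-chunk results combined at the end, which keeps the working set small even for multi-gigabyte files.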


***
The Royal Bank of Scotland plc. Registered in Scotland No 90312. Registered 
Office: 36 St Andrew Square, Edinburgh EH2 2YB. 
Authorised and regulated by the Financial Services Authority 

This e-mail message is confidential and for use by the=2...{{dropped:22}}



Re: [R] call perl

2008-08-22 Thread Rory.WINSTON
Maybe

?system

Might be what you need here?
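For example (a minimal sketch, assuming a perl binary is on the PATH):

```r
# Run a Perl one-liner from R; intern = TRUE captures its stdout as a
# character vector instead of printing it to the console.
out <- system("perl -e 'print 2 + 2'", intern = TRUE)
out
```

For richer integration (passing R objects back and forth) there are dedicated bridges, but for simply invoking an existing Perl script, system() is usually enough.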

-Rory


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jay an
Sent: 22 August 2008 03:08
To: r-help@r-project.org
Subject: [R] call perl

Hi,

This may be an old question:
can anyone tell me how to call Perl from R?
thanks

Y.





[[alternative HTML version deleted]]





Re: [R] help needed for HWE.exact in library genetics

2008-08-22 Thread Neil Shephard

You could follow the advice given when loading the library (see the code you
posted for details) and use the enhanced genetics packages to cross-validate
the results (ideally you should get the same answer).

The results are not weird, though.

You're working with SNPs, and having a homozygote with a frequency of zero
isn't that unlikely; it depends on the minor allele frequency (MAF).  In
this instance it's the controls you appear to be concerned about, and for
this locus the MAF is 0.0145, so in a sample of 173
individuals you would expect to see 0.0363 individuals with the recessive
genotype (basic Mendelian genetics: 173 * 0.0145^2).  Thus at best you
might expect to see one person in a sample of that size, and it's not
surprising that you've not seen any.
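The arithmetic is easy to check directly, and the binomial probability makes the "not surprising" point quantitative (q = 0.0145 and n = 173 copied from the numbers above):

```r
# Expected count of rare homozygotes under Hardy-Weinberg, and the
# probability of observing none at all in the sample.
q <- 0.0145   # minor allele frequency
n <- 173      # number of control individuals
n * q^2                          # expected count: ~0.036 individuals
dbinom(0, size = n, prob = q^2)  # chance of seeing zero: ~0.96
```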

You may also find the following references of interest...

Guangyong Zou, Allan Donner
The Merits of Testing Hardy-Weinberg Equilibrium in the Analysis of
Unmatched Case-Control Data: A Cautionary Note
Annals of Human Genetics
Volume 70, Issue 6, Pages 923-933

(basically advocates the use of a robust test for association such as the
trend test over the two-stage design of testing for HWeqm and then testing
for association).

There is follow up discussion of the article in 

Yik Y. Teo, Andrew E. Fry, Taane G. Clark, E. S. Tai, Mark Seielstad
On the Usage of HWE for Identifying Genotyping Errors 
Annals of Human Genetics
Volume 71 Issue 5 , Pages 701 - 703

...and a rejoinder from Zou and Donner...

Guangyong Zou, Allan Donner
The Reply of Zou  Donner to Teo's Commentary on their Manuscript (p
704-704)
Annals of Human Genetics
Volume 71 Issue 5 , Pages 701 - 703

Neil

Sun, Ying [BSD] - HGD wrote:
 
 
 library( genetics )
 NOTE: THIS PACKAGE IS NOW OBSOLETE.
   The R-Genetics project has developed an set of enhanced genetics
   packages to replace 'genetics'. Please visit the project homepage
   at http://rgenetics.org for informtion.
 

-- 
View this message in context: 
http://www.nabble.com/help-needed-for-HWE.exact-in-library-%22genetics%22-tp19100974p19105112.html
Sent from the R help mailing list archive at Nabble.com.



[R] Bias reduction method

2008-08-22 Thread Julio Rojas
Dear all, I'm doing a simulation study of an M/M/1 queue and I would like to 
know if there are bias correction methods in R for time series (estimation of 
warm-up points or something else).

Hope you can help me.



  




Re: [R] barplot with anchored bars

2008-08-22 Thread Jim Lemon
On Thu, 2008-08-21 at 15:47 -0300, Paulo Barata wrote:
 Dear R list members,
 
 How to produce barplots anchored to the x-axis (not floating
 above the x-axis) with a box around?
 
 With both following codes, the lower horizontal line of the
 box is below the y = 0 line:
 
 # first code
 x - c(1,2,3,4)
 barplot(x,yaxs='i')
 box()
 
 # second code
 x - c(1,2,3,4)
 op - par(yaxs='i')
 barplot(x)
 box()
 par(op)
 
 The parameter yaxs='i' does not seem to work with barplot.
 
 I use R version 2.7.1, running on Windows XP.
 
Hi Paulo,
Have a look at the barp function in the plotrix package.

Jim



[R] simple generation of artificial data with defined features

2008-08-22 Thread drflxms
Dear R-colleagues,

I am quite a newbie to R fighting my stupidity to solve a probably quite
simple problem of generating artificial data with defined features.

I am conducting a study of inter-observer-agreement in
child-bronchoscopy. One of the most important measures is Kappa
according to Fleiss, which is very comfortable available in R through
the irr-package.
Unfortunately medical doctors like me don't really understand much of
statistics. Therefore I'd like to give the reader an easy understandable
example of Fleiss-Kappa in the Methods part. To achieve this, I obtained
a table with the results of the German election from 2005:

party     number of votes   percent

SPD       16194665          34.2
CDU       13136740          27.8
CSU       3494309           7.4
Gruene    3838326           8.1
FDP       4648144           9.8
PDS       4118194           8.7

I want to show the agreement of voters measured by Fleiss-Kappa. To
calculate this with the kappam.fleiss-function of irr, I need a
data.frame like this:

(id of 1st voter) (id of 2nd voter)

party spd cdu

Of course I don't plan to calculate this with the million of cases
mentioned in the table above (I am working on a small laptop). A
division by 1000 would be more than perfect for this example. The exact
format of the table is generally not so important, as I could reshape
nearly every format with the help of the reshape-package.

Unfortunately I could not figure out how to create such a
fictive/artificial dataset as described above. Any data.frame would be
nice, that keeps at least the percentage. String-IDs of parties could be
substituted by numbers of course (would be even better for function
kappam.fleiss in irr!).

I would appreciate any kind of help very much indeed.
Greetings from Munich,

Felix Mueller-Sarnowski
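One way to do this (a sketch with invented object names, using the counts from the table above scaled down by 1000) is to expand the counts into one row per fictive voter with rep():

```r
# Rebuild a voter-level data frame from aggregate vote counts,
# scaled down by 1000 so it stays laptop-sized.
votes <- c(SPD = 16194665, CDU = 13136740, CSU = 3494309,
           Gruene = 3838326, FDP = 4648144, PDS = 4118194)
n <- round(votes / 1000)
voters <- data.frame(id = seq_len(sum(n)),
                     party = rep(names(votes), times = n))
head(voters)
# shares among the six listed parties (the published percentages also
# count parties omitted from the table, so they differ slightly):
round(100 * prop.table(table(voters$party)), 1)
```

The party labels can be replaced by integer codes with as.integer(factor(voters$party)) if that suits kappam.fleiss better.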



[R] random effects in lmer

2008-08-22 Thread Line Johansen

Dear all
I am running several generalized mixed model using lmer. The models 
typical look like this: 
model2xx <- lmer(numbers ~ Treatment + b + (1|Genotype) + (1|Field) + (1|Genotype:Treatment), 
family=quasipoisson)
All factors are categorical.  


And the output looks like this:
Generalized linear mixed model fit using Laplace
formula: numbers ~ Treatment + Dead + (1|Genotype) + (1|Field) + (1|Genotype:Treatment), 
family = quasipoisson

Family: quasipoisson(log link)

 AIC  BIC logLik deviance

1033 1051 -509.4 1019

Random effects:
Groups   NameVariance Std.Dev.
Genotype:Treatment (Intercept) 0.40555  0.63683
Genotype(Intercept) 1.16642  1.08001
Field(Intercept) 1.23738  1.11238
Residual 9.47740  3.07854
number of obs: 100, groups: Genotype1:Treatment1, 14; Genotype1, 5; 
Felt1, 4


Fixed effects:

Estimate Std. Error t value
(Intercept)  3.200610.80010   4.000
Treatment a -0.054820.42619  -0.129
Treatment c  0.083160.46395   0.179
Dead No  0.276040.14873   1.856

Correlation of Fixed Effects:
(Intr) Tretmnt1 Trtmnt1c
Treatment a -0.247 
Treatment c -0.239  0.450  
Dead No -0.111 -0.1320.003  



I need some help to evaluate the importance of the random factors. The 
random factor of interest is Genotype. I have tried to delete random 
factors from the model and comparing the model with the original model 
by log likelihood-ratio statistics. Is this an appropriate method for 
testing the random factors in lmer? Is it possible to evaluate how much 
of the total variation the random factor Genotype explains?
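For the second question, a rough first answer can be read straight off the printed variance components: the share of the summed components attributed to Genotype (on the link scale; this glosses over the quasipoisson scale details):

```r
# Variance components copied from the printed output above.
vc <- c(GenotypeTreatment = 0.40555, Genotype = 1.16642,
        Field = 1.23738, Residual = 9.47740)
share <- 100 * vc["Genotype"] / sum(vc)
round(share, 1)  # Genotype accounts for roughly 9.5 per cent
```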
 
I am a new user of lmer and my questions are probably very naive. 
But I appreciate any help.
Thanks for the help.  


Line



--

Line Johansen   

Department of Biology   
Norwegian University of Natural Science and Technology

Høgskoleringen 15
No-7491 Trondheim
Phone: 73 55 12 94



Re: [R] subset grouped data with quantile and NA's

2008-08-22 Thread jim holtman
This will also remove the NAs from the output;  you will have to
change it to also keep the NAs.  Wasn't sure what you wanted to do
with them.

dat <- data.frame(fac = rep(c("a", "b"), each = 100),
 value = c(rnorm(130), rep(NA, 70)),
 other = rnorm(200))
# split the data
x.s <- split(dat, dat$fac, drop=TRUE)
# process the quantiles
x.l <- lapply(x.s, function(.fac){
# remove NAs from the output -- need to change if you want to keep NAs
.fac[(!is.na(.fac$value)) & (.fac$value <= quantile(.fac$value,
prob=0.95, na.rm=TRUE)),]
})
# put back into a dataframe
dat.new <- do.call(rbind, x.l)
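An equivalent vectorised route, assuming the same rule (drop NAs and keep values at or below each group's 95th percentile): compute a per-row cutoff with ave() and subset once, so no split/recombine step is needed.

```r
set.seed(1)
dat <- data.frame(fac = rep(c("a", "b"), each = 100),
                  value = c(rnorm(130), rep(NA, 70)),
                  other = rnorm(200))
# ave() replicates each group's 95th percentile to every row of the group
cutoff <- ave(dat$value, dat$fac,
              FUN = function(v) quantile(v, probs = 0.95, na.rm = TRUE))
dat.new <- dat[!is.na(dat$value) & dat$value <= cutoff, ]
```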


On Fri, Aug 22, 2008 at 3:35 AM, David Carslaw
[EMAIL PROTECTED] wrote:

 I can't quite seem to solve a problem subsetting a data frame.  Here's a
 reproducible example.

 Given a data frame:

 dat <- data.frame(fac = rep(c("a", "b"), each = 100),
  value = c(rnorm(130), rep(NA, 70)),
  other = rnorm(200))

 What I want is a new data frame (with the same columns as dat) excluding the
 top 5% of "value" separately by "a" and "b". For example, this produces the
 results I'm after in an array:

 sub <- tapply(dat$value, dat$fac, function(x) x[x < quantile(x, probs =
 0.95, na.rm = TRUE)])

 My difficulty is putting them into a data frame along with the other columns
 "fac" and "other". Note that this will return different-length vectors
 due to different numbers of NAs for "a" and "b".

 There's something I'm just not seeing - can you help?

 Many thanks.

 David Carslaw

 -
 Institute for Transport Studies
 University of Leeds
 --
 View this message in context: 
 http://www.nabble.com/subset-grouped-data-with-quantile-and-NA%27s-tp19102795p19102795.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Combining multiple datasets

2008-08-22 Thread jim holtman
Here is the way that I usually do it using lapply:

myData <- lapply(list.files(pattern = ".RWL$"), function(.file)
read.rwl(.file, header = TRUE))
# combine into a single dataframe
myDF <- do.call(rbind, myData)
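If the goal is instead to match the frames side by side on their row names (calendar years), as the merge() call in the question suggests, a fold over merge() avoids typing every object name. The toy frames here are illustrative, not the poster's data:

```r
# Hypothetical list of data frames, each with years as row names.
lst <- list(a = data.frame(x = 1:3, row.names = 2001:2003),
            b = data.frame(y = 4:6, row.names = 2002:2004))
merge2 <- function(x, y) {
  m <- merge(x, y, by = 0, all = TRUE)  # by = 0 means "by row names"
  rownames(m) <- m$Row.names
  m[order(rownames(m)), -1, drop = FALSE]
}
combined <- Reduce(merge2, lst)
combined
```

Because Reduce() folds pairwise, the same two lines work unchanged however many data frames the directory yields.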



On Thu, Aug 21, 2008 at 10:42 PM, Alison Macalady [EMAIL PROTECTED] wrote:
 Hi,
 I've tried to figure this out using Intro to R and help(), to no avail - I
 am new at this.

 I'm trying to write a script that will read multiple files from a directory
 and then merge them into a single new data frame.
 The original data are in a tree-ring specific format, and so I've first used
 a function (read.rwl) from the dplR package to read each file, translate
 each into a more intuitive time series format, and then put each new data
 frame into a single object, demo.Rwl:

demo.Rwl <- list.files(pattern = ".RWL$"); for(i in demo.Rwl) { x <-
 read.rwl(i, header=TRUE); assign(print(i, quote=FALSE), x)}

 This part seems to work.  Now, I want to be able to put all of the
 individual data frames contained in demo.Rwl into a single data frame,
 merging by rows (rows are calendar years in the time series). I think I know
 how to do this by typing each data set name into a merge:

merge(x,y..., by.x=0, all=TRUE )

 However, I'd like to be able to script something that will save me the time
 of typing in all of the individual data set names, as I will be repeating
 this operation many times with many different numbers of original input
 files. Is there a way to code a merge such that every individual data frame
 contained in demo.Rwl (a different number every time) gets combined into a
 single data set?

 Thanks for the help!

 Ali





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



[R] nesting environments

2008-08-22 Thread Antje

Hello,

I got the hint to use nesting environments (for a GUI problem), but I'm not 
sure whether I understand what he means (and I did not really succeed with 
reading documentation - probably I looked for the wrong keywords?).

That's what he wrote:

 Another good way to control the scoping is nesting environments (e.g. 
through closures). That is, your object does not need to be global (in the 
workspace) but it can be in a parent environment. In fact, global variables 
won't even work when you put your code into a namespaced package, because the 
namespace will be locked.


Can anybody explain it (probably with a simple example)
That would be great!
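A minimal closure example of what that advice means (nothing here is GUI-specific): the state lives in the environment created by one function call, and the returned function can still see and update it.

```r
# `count` lives in the environment created by the call to make_counter(),
# not in the global workspace, yet the inner function can read and
# update it across calls.
make_counter <- function() {
  count <- 0                 # state kept in the enclosing environment
  function() {
    count <<- count + 1      # <<- assigns in the enclosing environment
    count
  }
}
nxt <- make_counter()
nxt()              # 1
nxt()              # 2
exists("count")    # FALSE: nothing leaked into the workspace
```

In a GUI, the callbacks play the role of the inner function: they are all created inside one setup function and share its environment instead of global variables.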

Thanks,
Antje



[R] swap

2008-08-22 Thread amor Gandhi
Hello Richie,
I would like to do three (or k) swap steps, in each step recursively 
swapping just 2 IDs:
x <- 1:10
swap <- function(x){
  a <- sample(x,2)
  x[x==a[1]] <- a[2]
  x[x==a[2]] <- a[1]
  return(x)
  }
  swap(swap(swap(x))) -> mix
  
Is this possible?
Thank you in advance!
Amor


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Re : Help on competing risk package cmprsk with time dependent covariate

2008-08-22 Thread phguardiol

 Hello again,

I'm trying to use the timereg package as suggested (R 2.7.1 on XP Pro). 
Here is my script, based on the example from timereg, for a Fine & Gray model in 
which
relt = time to event, rels = status 0/1/2 2=competing, 1=event of interest, 
0=censored
random = covariate I want to test
 
library(timereg)
rel <- read.csv("relapse2.csv", header = TRUE, sep = ",", quote = "\"", dec = ".", 
fill = TRUE, comment.char = "")
names(rel)

> names(rel)
[1] "upn"    "rels"   "relt"   "random"

times <- rel$relt[rel$rels==1]

fg1 <- comp.risk(Surv(relt,rels>0)~const(random),rel,rel$rels,times[-1],causeS=1,resample.iid=1,model="prop")
summary(fg1)

fg2 <- comp.risk(Surv(relt,rels>0)~random,rel,rel$rels,times[-1],causeS=1,resample.iid=1,model="prop")
summary(fg2)

Question 1: can you confirm that fg1 evaluates "random" with the hypothesis 
that it has a constant hazard over time, and that it is assumed to be time 
dependent in fg2?

Question 2: both summaries give me the following, which I don't understand at all; 
is there a mistake in my script?

Competing risks Model 

Test for nonparametric terms 

Test for non-significant effects 
            sup| hat B(t)/SD(t) |  p-value H_0: B(t)=0
(Intercept)                    0                     0
randomC2                       0                     0

Test for time invariant effects 
            supremum test  p-value H_0: B(t)=b t
(Intercept)             0                      0
random                  0                      0

            int (b(t)-g(t,gam))^2dt  p-value H_0: constant effect 
(Intercept)                       0                             0
random                            0                             0

  Call: 
comp.risk(Surv(relt, rels > 0) ~ random, rel, rel$rels, times[-1],     
    causeS = 1, resample.iid = 1, model = "prop")


any help is very welcome
regards


 


Philippe G

 


 

-----Original e-mail-----
From: Arthur Allignol [EMAIL PROTECTED]
To: Philippe Guardiola [EMAIL PROTECTED]
Cc: R-help@r-project.org; phgua
[EMAIL PROTECTED]
Sent: Friday, 22 August 2008 11:53
Subject: Re: [R] Help on competing risk package cmprsk with time dependent 
covariate









Hello, 
 

Something i don't understand 

in your question. 

Is treatment a time-dependent covariate? 

That is, do patients receive the treatment 

at the beginning of the study or later? 
 

cmprsk cannot handle time-dependent covariates. 
 

But if treatment is a baseline covariate, 

but has a time-varying effect (i.e. does the subdistribution hazard 
ratio varies with time?), your solution 

to assess that is weird, because you will transform 

your baseline covariate into a time-dependent one, 

thus considering all the patients to receive no treatment 

the first year. For sure, the treatment wont have any 

effect for the first year. 

To assess a time-varying effect on competing risks, 

i would either follow the cmprsk documentation, including 

an interaction with functions of time, or use the comp.risk 

function in the timereg package, which fits more flexible 

models for the cumulative incidence functions. 
 

Best regards, 

Arthur Allignol 
 


Philippe Guardiola wrote: 

 Dear R users, 

 
 
 I d like to assess the effect of treatment covariate on a disease relapse 
 risk with the package cmprsk. 
 However, the effect of this covariate on survival is time-dependent 

 (assessed with cox.zph): no 
significant effect during the first year of follow-up, 

 then after 1 year a favorable effect is observed on survival (step 

 function might be the correct way to say that ?). 
 For overall survival analysis 

 I have used a time dependent Cox model which has confirmed this positive 
 effect after 

 1 year. 

 Now I m moving to disease relapse incidence and a similar time dependency 
 seems to be present. 
 
 what I d like to have is that: for 

 patients without treatment the code for treatment covariate is 

 always 0, and for patients who received treatment covariate I d like 

 to have it = 0 during time interval 0 to 1 year, and equal to 1 after 1 

 year. Correct me if I m wrong in trying to do so. 

 
 
 First, I have run the following script (R2.7.1 under XPpro) according to 
 previous advices: 

 
 library(cmprsk) 

 attach(LAMrelapse) 

 fit1- crr(rel.t, rel.s, treatment, treatment, function(uft) 

 cbind(ifelse(uft=1,1,0),ifelse(uft1,1,0)), failcode=1, 

 cencode=0, na.action=na.omit, gtol-06, maxiter) 

 fit1 

 
 where: 

 rel.t = time to event (in years) 

 rel.s = status , =1 if disease relapse, =2 if death from non disease 

 related cause (toxicity of previous chemotherapy), =0 if alive  

 not in relapse 

 treatment = 

 binary covariate (value: 0 or 1) 

Re: [R] swap

2008-08-22 Thread jim holtman
Is this what you want:

> swap <- function(z){
+ a <- sample(length(z), 2)
+ z[a] <- z[rev(a)]
+ z
+ }
> swap(1:10)
 [1]  2  1  3  4  5  6  7  8  9 10

> swap(1:10)
 [1]  5  2  3  4  1  6  7  8  9 10
> swap(1:10)
 [1]  8  2  3  4  5  6  7  1  9 10
> swap(1:10)
 [1]  2  1  3  4  5  6  7  8  9 10
> swap(1:10)
 [1]  4  2  3  1  5  6  7  8  9 10
> swap(swap(swap(1:10)))
 [1]  8 10  3  5  4  6  7  1  9  2
> swap(swap(swap(1:10)))
 [1]  8  2  7  4  5  6  3 10  9  1
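To chain a variable number of swaps without writing the nested calls by hand, wrap the same function in a small loop; the result is always a permutation of the input:

```r
# One random transposition of two positions.
swap <- function(z) {
  a <- sample(length(z), 2)
  z[a] <- z[rev(a)]
  z
}
# Apply k transpositions in sequence.
kswap <- function(z, k) {
  for (i in seq_len(k)) z <- swap(z)
  z
}
mix <- kswap(1:10, 3)
mix
```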



On Fri, Aug 22, 2008 at 8:57 AM, amor Gandhi [EMAIL PROTECTED] wrote:
 Hello Richie,
 I would like to do three (or k) swap steps, in each step recursively
 swapping just 2 IDs
 x <- 1:10
 swap <- function(x){
   a <- sample(x,2)
   x[x==a[1]] <- a[2]
   x[x==a[2]] <- a[1]
   return(x)
   }
   swap(swap(swap(x))) -> mix

 Is this possible?
 Thanks you in advance!
 Amor






-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nesting environments

2008-08-22 Thread Gabor Grothendieck
Look at the example section of ?new.env

Also the proto package on CRAN defines proto objects (which are just
environments with
slightly different semantics) and in the Applications section of the home page
has links to examples of using proto objects to create GUIs with gWidgets:
http://r-proto.googlecode.com
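A minimal sketch of what "nesting environments through closures" means in practice: the counter's state lives in make_counter()'s environment, not in the global workspace (names here are my own illustration):

```r
# make_counter() returns a function whose state is kept in the
# enclosing (non-global) environment
make_counter <- function() {
  count <- 0                # lives in make_counter's environment
  function() {
    count <<- count + 1     # '<<-' assigns in that enclosing environment
    count
  }
}

counter <- make_counter()
counter()  # 1
counter()  # 2
```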

On Fri, Aug 22, 2008 at 8:54 AM, Antje [EMAIL PROTECTED] wrote:
 Hello,

 I got the hint to use nesting environments (for a GUI problem), but I'm not
 sure whether I understand what he means (and I did not really succeed with
 reading documentation - probably I looked for the wrong keywords?).
 That's what he wrote:

 Another good way to control the scoping is nesting environments (e.g.
 through closures). That is, your object does not need to be global (in the
 workspace) but it can be in a parent environment. In fact, global variables
won't even work when you put your code into a namespaced package, because
 the namespace will be locked.

Can anybody explain it (probably with a simple example)?
 That would be great!

 Thanks,
 Antje

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nesting environments

2008-08-22 Thread Antje

Thanks a lot! I'll give it a closer look :-)

Ciao,
Antje



Gabor Grothendieck schrieb:

Look at the example section of ?new.env

Also the proto package on CRAN defines proto objects (which are just
environments with
slightly different semantics) and in the Applications section of the home page
has links to examples of using proto objects to create GUIs with gWidgets:
http://r-proto.googlecode.com

On Fri, Aug 22, 2008 at 8:54 AM, Antje [EMAIL PROTECTED] wrote:

Hello,

I got the hint to use nesting environments (for a GUI problem), but I'm not
sure whether I understand what he means (and I did not really succeed with
reading documentation - probably I looked for the wrong keywords?).
That's what he wrote:


Another good way to control the scoping is nesting environments (e.g.
through closures). That is, your object does not need to be global (in the
workspace) but it can be in a parent environment. In fact, global variables
won't even work when you put your code into a namespaced package, because
the namespace will be locked.

Can anybody explain it (probably with a simple example)?
That would be great!

Thanks,
Antje

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Coordinate systems for geostatistics in R

2008-08-22 Thread Paul Hiemstra

imicola schreef:

Hi,

I read somewhere that when carrying out geostatistical analysis in R you
should not use latitude and longitude...can anyone expand on this a little
for me, and what would be the best coordinate system to use?

I have my data in a geographic coordinate system, WGS84, decimal
degrees ... is this the wrong format for such analyses?

I have also converted my data in the UTM projection and so have it in
metres(ranging from 480,000 to 550,000 E and 170,000 to 230,000 N).  


If I was to use the UTM coordinates, should I be using the actual
coordinates in metres, or should I convert this into an arbitrary coordinate
system (i.e. from 0 - 1) somehow?  


I have noticed that running an analysis on the data gives different results
depending on which type of system you use, so I want to make sure I have the
correct system.  I should also probably note that I am a geostatistical
novice!

Thanks,
  

Hi,

I use the gstat package for geostatistics. For doing the analysis I 
don't think it is necessary to convert to UTM. But maybe just do it to 
be on the safe side. If you use the spatial objects provided by the 
sp-package (http://cran.r-project.org/web/packages/sp/vignettes/sp.pdf) 
you can transform your data to other projections with the spTransform function.


Questions regarding geostatistics and spatial data will result in more 
answers on the r-sig-geo list.


cheers,
Paul

--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +31302535773
Fax:+31302531145
http://intamap.geo.uu.nl/~paul

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] swap

2008-08-22 Thread Richard . Cotton
 Hello Richie,
 I would like to do three (or k) swap steps in each step just 2 ID 
 recursive swaping
 x <- 1:10
 swap <- function(x){
   a <- sample(x,2)
   x[x==a[1]] <- swap[2]
   x[x==a[2]] <- swap[1]
   return(x)
   }
   swap(swap(swap(x))) -> mix

I tried my best with a response before, but if you want a sensible answer 
you are going to have to try harder explaining what you really want.

What do you mean by 'swap step'?

If you want to swap the position of two elements in a vector (as I suspect 
you might) then which positions do you want swapping?  Do you specify them 
yourself (as inputs to the swap function perhaps), or should they be 
randomly generated?

If you provide more context (the rest of your code, what you are trying to 
achieve etc.) the help will be better.

Regards,
Richie.

Mathematical Sciences Unit
HSL



ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to read malformed csv files with read.table?

2008-08-22 Thread Martin Ballaschk

Hi,

how do I read files that have two header fields less than they have  
columns? The easiest solution would be to insert one or two additional  
header fields, but I have a lot of files and that would be quite a lot  
of awful work.


Any ideas on how to solve that problem?

###
R stuff:

 read.table("myfile.CSV", sep = "\t", header = T)
  Error in read.table("myfile.CSV", sep = "\t",  :
  more columns than column names

 count.fields("myfile.CSV", sep = "\t")
   [1] 10 12 12 12 12 12 12 12 12 12 12 [...]

###
ugly sample (Exported by SDL DataTable component):

	time/ms	C550.KMS	Cyt_b559.KMS	Cyt_b563.KMS	Cyt_f_.KMS	P515FR.KMS	 
Scatt.KMS	Zea2.KMS	PC	P700
0	Point1	-599.500	   0.000	   0.000	   0.000	
0.000	   0.000	   0.000	   0.000	   0.000	   0.000
0	Point2	-598.000	  -0.012	  -0.013	   0.040	
0.013	   0.027	   0.010	   0.022	   0.000	   0.000
0	Point3	-596.500	  -0.015	  -0.015	   0.044	
0.020	   0.025	   0.010	   0.033	   0.000	   0.000

[...]


Cheers,
Martin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Re : Help on competing risk package cmprsk with time dependent covariate

2008-08-22 Thread Philippe Guardiola
Hello again,

I'm trying to use the timereg package as you suggested (R 2.7.1 on XP Pro). 
Here is my script, based on the example from timereg, for a Fine & Gray model in 
which
relt = time to event, rels = status 0/1/2 (2 = competing, 1 = event of interest, 
0 = censored)
random = covariate I want to test
 
library(timereg)
rel <- read.csv("relapse2.csv", header = TRUE, sep = ",", quote = "\"",
    dec = ".", fill = TRUE, comment.char = "")
names(rel)

 names(rel)
[1] "upn"    "rels"   "relt"   "random"

times <- rel$relt[rel$rels == 1]

fg1 <- comp.risk(Surv(relt, rels > 0) ~ const(random), rel, rel$rels,
    times[-1], causeS = 1, resample.iid = 1, model = "prop")
summary(fg1)

fg2 <- comp.risk(Surv(relt, rels > 0) ~ random, rel, rel$rels,
    times[-1], causeS = 1, resample.iid = 1, model = "prop")
summary(fg2)

Question1: can you confirm that fg1 evaluates random with the
hypothesis that it has a constant hazard over time and that it is
assumed as being time dependent in fg2 ?

Question 2: both summaries give me the following output, which I don't understand at all; 
is there a mistake in my script?

Competing risks Model 

Test for nonparametric terms 

Test for non-significant effects 
sup| hat B(t)/SD(t) | p-value H_0: B(t)=0
(Intercept) 0   0
random  0   0
Test for time invariant effects 
supremum test p-value H_0: B(t)=b t
(Intercept) 0 0
random  0 0

int (b(t)-g(t,gam))^2dt p-value H_0:constant effect 
(Intercept)   00
random00

  Call: 
comp.risk(Surv(relt, rels > 0) ~ random, rel, rel$rels, times[-1], causeS = 
1, resample.iid = 1, model = "prop")


any help is very welcome
regards


Philippe G


- Message d'origine 
De : Arthur Allignol [EMAIL PROTECTED]
À : Philippe Guardiola [EMAIL PROTECTED]
Cc : R-help@r-project.org; [EMAIL PROTECTED]
Envoyé le : Vendredi, 22 Août 2008, 11h53mn 42s
Objet : Re: [R] Help on competing risk package cmprsk with time dependent 
covariate

Hello,

Something i don't understand
in your question.
Is treatment a time-dependent covariate?
That is, do patients receive the treatment
at the beginning of the study or later?

cmprsk cannot handle time-dependent covariates.

But if treatment is a baseline covariate,
but has a time-varying effect (i.e. does the subdistribution hazard 
ratio vary with time?), your solution
to assess that is weird, because you will transform
your baseline covariate into a time-dependent one,
thus considering all the patients to receive no treatment
the first year. For sure, the treatment won't have any
effect for the first year.
To assess a time-varying effect on competing risks,
i would either follow the cmprsk documentation, including
an interaction with functions of time, or use the comp.risk
function in the timereg package, which fits more flexible
models for the cumulative incidence functions.

Best regards,
Arthur Allignol


Philippe Guardiola wrote:
 Dear R users,
 
 
 I'd like to assess the effect of the treatment covariate on a disease relapse 
 risk with the package cmprsk. 
 However, the effect of this covariate on survival is time-dependent
 (assessed with cox.zph): no significant effect during the first year of 
 follow-up,
 then after 1 year a favorable effect is observed on survival (step
 function might be the correct way to say that ?). 
 For overall survival analysis
 I have used a time dependent Cox model which has confirmed this positive 
 effect after
 1 year.
 Now I'm moving to disease relapse incidence and a similar time dependency 
 seems to be present. 
 
 what I'd like to have is that: for
 patients without treatment the code for the treatment covariate is
 always 0, and for patients who received treatment I'd like
 to have it = 0 during the time interval 0 to 1 year, and equal to 1 after 1
 year. Correct me if I'm wrong in trying to do so.
 
 
 First, I have run the following script (R2.7.1 under XPpro) according to 
 previous advices:
 
 library(cmprsk)
 attach(LAMrelapse)
 fit1 <- crr(rel.t, rel.s, treatment, treatment, function(uft)
 cbind(ifelse(uft <= 1, 1, 0), ifelse(uft > 1, 1, 0)), failcode = 1,
 cencode = 0, na.action = na.omit, gtol = 1e-06, maxiter = 10)
 fit1
 
 where:
 rel.t = time to event (in years)
 rel.s = status, =1 if disease relapse, =2 if death from non disease
 related cause (toxicity of previous chemotherapy), =0 if alive and
 not in relapse
 treatment =
 binary covariate (value: 0 or 1) representing the treatment to test
 (different from chemotherapy above, with no known toxicity)
 I have not yet added other covariates in the model.
 
 
 this script gave me the following result:
 fit1 <- crr(relcmp.t, relcmp.s, treatment, treatment, function(uft) 
 cbind(ifelse(uft <= 1, 1, 0), ifelse(uft > 1, 1, 0)), failcode = 1, cencode 
 = 0, 
 na.action = na.omit, gtol = 1e-006, maxiter = 10)
 fit1
 convergence:  TRUE 
 coefficients:
 [1] -0.6808  0.7508
 

[R] draw a function over a histogram

2008-08-22 Thread Daniela Garavaglia
Sorry, I have some troubles with the graph device.

How can I draw a function over a histogram?

Thanks so much.

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] mapping a vector into predefined bins

2008-08-22 Thread Dr. G. Lawyer

   Hi,
   I have a mapping M consisting of a vector of bin centers (say 0.01, 0.02,
   0.03 ..., 1) with associated values (say red, orange, yellow, ...),
   stored as a 2-column matrix with column 1 the bin center and column 2 the
   value.
   I also have a vector D of data (say 0.9056, 0.4118,  0.2162,  ...).
   I want to map each datum in D to the value associated with its bin in M (say
   0.0111 -> red, 0.0198 -> orange, ...).
   Does R have a built-in function for this?
   Thanks for tips/suggestions.

   +glenn
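R's built-in findInterval() fits this task well: place the bin edges halfway between the bin centers, and each datum picks up the index of its nearest center. A minimal sketch (the centers, colors, and data below are invented to match the post's description):

```r
# Bin centers and their associated values (illustrative only)
M_centers <- seq(0.01, 1, by = 0.01)
M_values  <- c("red", "orange", "yellow", rep("grey", 97))

D <- c(0.9056, 0.4118, 0.2162)   # data to map

# Edges halfway between consecutive centers; findInterval() then
# returns, for each datum, the index of the bin it falls into
edges <- c(-Inf, head(M_centers, -1) + diff(M_centers) / 2)
idx <- findInterval(D, edges)

M_centers[idx]  # nearest bin center for each datum
M_values[idx]   # the mapped values
```

cut(D, breaks = edges, labels = M_values) is an alternative when a factor result is acceptable.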
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] filtering out data

2008-08-22 Thread Kate Rohrbaugh
Greetings,
 
Apologies for such a simple question, but I have been trying to figure this out 
for a few days now (I'm quite new to R), have read manuals, list posts, etc., 
and not finding precisely what I need.  I am accustomed to working in Stata, 
but I am currently working through the multilevel package right now.  In Stata, 
I would include the statement if model1 == 1 at the end of my regression 
statement, but I can't seem to find the correct syntax for doing the same thing 
in R.
 
I have successfully converted Stata datasets, merged, aggregated, run lm, etc., 
but this simple task is evading me entirely.  
 
Here's what I have tried (and various iterations):
 
if(merged.dataset$model1 == 1) model1.coeff <- lm(dv ~ iv1 + iv2 + iv3, data = 
merged.dataset)
I get this:
Warning message:
In if (merged.dataset$model1 == 1) model1.coeff <- lm(dv ~ iv +  :
  the condition has length > 1 and only the first element will be used
What it seems to do is just run lm on the first opportunities of model1==1, but 
then doesn't try it again.  I tried sorting so that the model1==1 appear first, 
but that doesn't seem to be a great solution either.
 
I'll just go back into Stata and create separate new datasets if I have to, but 
there HAS to be a more elegant way.
 
Thank you for ANY feedback!
 
Kate
 
 
Kate Rohrbaugh
Independent Project Analysis, Inc.
44426 Atwater Drive
Ashburn, VA  20147
 
office - 703.726.5465
fax - 703.729.8301
email - [EMAIL PROTECTED] 
website - www.ipaglobal.com ( http://www.ipaglobal.com/ )





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] draw a function over a histogram

2008-08-22 Thread Rubén Roa-Ureta

Daniela Garavaglia wrote:

Sorry, I have some troubles with the graph device.

How can I draw a function over a histogram?

Thanks so much.
  

Daniela,
What function?
Here is one example using density() and using dnorm()
x <- rnorm(1000, 2, 2)
hist(x, prob = TRUE)
lines(density(x, na.rm = TRUE), col = "red")
library(MASS)
y <- fitdistr(x, "normal")
curve(dnorm(x, mean = y$estimate[1], sd = y$estimate[2]), add = TRUE)

HTH
R.


[[alternative HTML version deleted]]




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to read malformed csv files with read.table?

2008-08-22 Thread jim holtman
Try this.  It will read the file and see if there is a difference and
add in the extra headers:

x <- "time/ms C550.KMS    Cyt_b559.KMS    Cyt_b563.KMS
Cyt_f_.KMS  P515FR.KMS  Scatt.KMS   Zea2.KMS    PC
 P700
0   Point1  -599.500   0.000   0.000
0.000   0.000  0.000   0.000   0.000
0.000   0.000
0   Point2  -598.000  -0.012  -0.013
0.040   0.013  0.027   0.010   0.022
0.000   0.000
0   Point3  -596.500  -0.015  -0.015
0.044   0.020  0.025   0.010   0.033
0.000   0.000"
# find out how many dummy headers you have to add
x.c - count.fields(textConnection(x))
x.diff - x.c[2] - x.c[1]  # assume first line is short
x.connection - textConnection(x)  # setup connection
if (x.diff  0){
# read first line
x.first - readLines(x.connection, n=1)
# add dummy headers
x.first - paste(x.first, paste(LETTERS[1:x.diff], collapse= ))
pushBack(x.first, x.connection)   # push back the line so it is
ready for read.table
}

input - read.table(x.connection, header=TRUE)
closeAllConnections()



On Fri, Aug 22, 2008 at 10:19 AM, Martin Ballaschk
[EMAIL PROTECTED] wrote:
 Hi,

 how do I read files that have two header fields less than they have columns?
 The easiest solution would be to insert one or two additional header fields,
 but I have a lot of files and that would be quite a lot of awful work.

 Any ideas on how to solve that problem?

 ###
 R stuff:

 read.table("myfile.CSV", sep = "\t", header = T)
  Error in read.table("myfile.CSV", sep = "\t",  :
  more columns than column names

 count.fields("myfile.CSV", sep = "\t")
   [1] 10 12 12 12 12 12 12 12 12 12 12 [...]

 ###
 ugly sample (Exported by SDL DataTable component):

time/ms C550.KMSCyt_b559.KMSCyt_b563.KMSCyt_f_.KMS
P515FR.KMS  Scatt.KMS   Zea2.KMSPC  P700
 0   Point1  -599.500   0.000   0.000   0.000
   0.000  0.000   0.000   0.000
 0.000   0.000
 0   Point2  -598.000  -0.012  -0.013   0.040
   0.013  0.027   0.010   0.022
 0.000   0.000
 0   Point3  -596.500  -0.015  -0.015   0.044
   0.020  0.025   0.010   0.033
 0.000   0.000
 [...]


 Cheers,
 Martin

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to read malformed csv files with read.table?

2008-08-22 Thread Daniel Folkinshteyn

on 08/22/2008 10:19 AM Martin Ballaschk said the following:
how do I read files that have two header fields less than they have 
columns? The easiest solution would be to insert one or two additional 
header fields, but I have a lot of files and that would be quite a lot 
of awful work.


Any ideas on how to solve that problem?



you could use read.table with header = F, that way it will read the 
table without worrying about column names (they will end up in the first 
row of the data).


Then, you can just delete the first row, or assign it to names(), or 
whatever.


if all the columns in all your files have the same names, you can read 
them all with header=F and col.names=vectorofcolumnnames, and then 
delete first row (which will contain the incomplete col names from the 
file).


hope this helps :)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] filtering out data

2008-08-22 Thread jim holtman
Exactly what do you want this statement to do:

if(merged.dataset$model1 == 1) model1.coeff <- lm(dv ~ iv1 + iv2 +
iv3, data = merged.dataset)

The result in the if is a vector of logical values (assuming that
merged.dataset is longer than 1); that is the source of the error
message.  Do you want the regression to be done on every row?  So what
is the problem you are trying to solve?
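If the intent is simply "fit the regression only on rows where model1 == 1", lm()'s subset argument mirrors Stata's if clause directly. A sketch with invented data, reusing the poster's variable names for illustration:

```r
# Made-up data shaped like the poster's merged.dataset
set.seed(1)
merged.dataset <- data.frame(dv = rnorm(20), iv1 = rnorm(20), iv2 = rnorm(20),
                             model1 = rep(c(0, 1), each = 10))

# subset= keeps only the rows satisfying the condition, like Stata's 'if'
model1.coeff <- lm(dv ~ iv1 + iv2, data = merged.dataset,
                   subset = model1 == 1)

length(resid(model1.coeff))  # 10: only the model1 == 1 rows were used
```

Equivalently, one can pass data = subset(merged.dataset, model1 == 1).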

On Fri, Aug 22, 2008 at 10:47 AM, Kate Rohrbaugh
[EMAIL PROTECTED] wrote:
 Greetings,

 Apologies for such a simple question, but I have been trying to figure this 
 out for a few days now (I'm quite new to R), have read manuals, list posts, 
 etc., and not finding precisely what I need.  I am accustomed to working in 
 Stata, but I am currently working through the multilevel package right now.  
 In Stata, I would include the statement if model1 == 1 at the end of my 
 regression statement, but I can't seem to find the correct syntax for doing 
 the same thing in R.

 I have successfully converted Stata datasets, merged, aggregated, run lm, 
 etc., but this simple task is evading me entirely.

 Here's what I have tried (and various iterations):

 if(merged.dataset$model1 == 1) model1.coeff - lm(dv ~ iv1 + iv2 + iv3, data 
 = merged.dataset)
 I get this:
 Warning message:
 In if (merged.dataset$model1 == 1) model1.coeff - lm(dv ~ iv +  :
  the condition has length  1 and only the first element will be used
 What it seems to do is just run lm on the first opportunities of model1==1, 
 but then doesn't try it again.  I tried sorting so that the model1==1 appear 
 first, but that doesn't seem to be a great solution either.

 I'll just go back into Stata and create separate new datasets if I have to, 
 but there HAS to be a more elegant way.

 Thank you for ANY feedback!

 Kate

 Kate Rohrbaugh
 Independent Project Analysis, Inc.
 44426 Atwater Drive
 Ashburn, VA  20147

 office - 703.726.5465
 fax - 703.729.8301
 email - [EMAIL PROTECTED]
 website - www.ipaglobal.com


 HTMLHEAD
 META http-equiv=Content-Type content=text/html; charset=iso-8859-15
 META content=MSHTML 6.00.5730.11 name=GENERATOR/HEAD
 BODY style=MARGIN: 4px 4px 1px; FONT: 12pt Comic Sans MS; COLOR: #00
 DIVnbsp;/DIV
 DIVnbsp;/DIV
 DIV
 HR
 /DIV
 DIVFONT face=Arial size=1This email message and any attached files are 
 confidential and are intended solely for the use of the addressee(s) named 
 above. This communication may contain material protected by legal privileges. 
 If you are not the intended recipient or person responsible for delivering 
 this confidential communication to the intended recipient, you have received 
 this communication in error; any review, use, dissemination, forwarding, 
 printing, copying or other distribution of this email message and any 
 attached files is strictly prohibited. Independent Project Analysis Inc. 
 reserves the right to monitor any communication that is created, received, or 
 sent on its network. If you have received this confidential communication in 
 error, please notify the sender immediately by reply email message and 
 permanently delete the original message. Thank you for your 
 cooperation./FONT/DIV/BODY/HTML

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] testing for differences in sums - how to?

2008-08-22 Thread James Howey
If I want to test for a salesman effect on average sale, I model it as Sale ~ 
Salesman. But what if I want to test for a salesman effect on total sales? 
How do I model that?

Given a significant salesman effect, how do I follow up with the equivalent of 
Tukey's HSD to determine who the (indistinguishly) top salesmen are?

If I add another factor, say region to the design, how do I then model the 
result?

Thanks,

jkh
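For the mean-effect part of the question, the usual route is aov() followed by TukeyHSD(); whether a *total*-sales effect is meaningful depends on whether salesmen had comparable opportunity counts, which the post doesn't say. A sketch on invented data:

```r
# Simulated sales: four salesmen, 25 sales each, two with higher means
set.seed(2)
sales <- data.frame(Salesman = gl(4, 25, labels = c("A", "B", "C", "D")),
                    Sale = rnorm(100, mean = rep(c(10, 10, 12, 15), each = 25)))

fit <- aov(Sale ~ Salesman, data = sales)   # salesman effect on average sale
tk <- TukeyHSD(fit)                         # pairwise comparisons, family-wise adjusted
round(tk$Salesman[, "diff"], 1)             # estimated pairwise mean differences
```

Adding a region factor would be aov(Sale ~ Salesman + Region) (or with an interaction term), given a Region column in the data.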

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to read malformed csv files with read.table?

2008-08-22 Thread Prof Brian Ripley

On Fri, 22 Aug 2008, Daniel Folkinshteyn wrote:


on 08/22/2008 10:19 AM Martin Ballaschk said the following:
how do I read files that have two header fields less than they have 
columns? The easiest solution would be to insert one or two additional 
header fields, but I have a lot of files and that would be quite a lot of 
awful work.


Any ideas on how to solve that problem?



you could use read.table with header = F, that way it will read the table 
without worrying about column names (they will end up in the first row of the 
data).


Or, better, use header=FALSE, skip=1 and the col.names arg of 
read.table().
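A sketch of that skip = 1 / col.names approach, writing a small file whose header row has two fewer fields than its data rows (the file and column names are invented):

```r
# A header with 2 fields over data rows with 4 fields
tmp <- tempfile(fileext = ".csv")
writeLines(c("a\tb",
             "0\tPoint1\t-599.5\t0",
             "0\tPoint2\t-598\t-0.012"), tmp)

# skip the short header and supply full column names ourselves
dat <- read.table(tmp, sep = "\t", header = FALSE, skip = 1,
                  col.names = c("t", "pt", "time.ms", "C550"))

str(dat)  # time.ms is numeric, not a factor
unlink(tmp)
```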




Then, you can just delete the first row, or assign it to names(), or 
whatever.


if all the columns in all your files have the same names, you can read them 
all with header=F and col.names=vectorofcolumnnames, and then delete first 
row (which will contain the incomplete col names from the file).


The trouble with that approach is that you will get all factor columns.


hope this helps :)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Test of Homogeneity of Variances

2008-08-22 Thread Daren Tan

I am testing the homogeneity of variances via bartlett.test and fligner.test. 
Using the following example, how should I interpret the p-value in order to 
accept or reject the null hypothesis?

set.seed(5)
x <- rnorm(20)
bartlett.test(x, rep(1:5, each=4))


Bartlett test of homogeneity of variances


data:  x and rep(1:5, each = 4)
Bartlett's K-squared = 1.7709, df = 4, p-value = 0.7778

fligner.test(x, rep(1:5, each=4))

 Fligner-Killeen test of homogeneity of variances


data:  x and rep(1:5, each = 4)
Fligner-Killeen:med chi-squared = 1.0819, df = 4, p-value = 0.8971
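The usual reading (a sketch, not specific advice for this data): H0 for both tests is "variances are equal across groups", so a large p-value means no evidence against homogeneity at the conventional threshold:

```r
set.seed(5)
x <- rnorm(20)
g <- rep(1:5, each = 4)

p <- bartlett.test(x, g)$p.value   # ~0.78, as in the output above
alpha <- 0.05
p < alpha  # FALSE: do not reject H0 of equal variances
```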

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] HELP: how to add weight to a [x,y] coordinate

2008-08-22 Thread Greg Snow
There are several options.

You can use the sunflowerplot function.

You can plot solid points with a small alpha value (not all graphics devices 
support this), so that just a few points show up pale, but a lot show up as 
more opaque.

You can use the symbols function (or bubbleplot, or my.symbols, or ..., from 
other packages).

You can use the hexbin package and function (Bioconductor) (I think this is my 
preferred method).
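A sketch of the symbols() option on simulated data (the variable names and the 
simulation are illustrative, not from the original post):

```r
set.seed(42)
x <- round(runif(1000, 0, 170))          # months on the 0-170 raster
y <- round(runif(1000, 0, 6), 1)         # 0-6 in steps of 0.1
w <- runif(1000, 0, 0.5)                 # per-point weights
# accumulate the weights falling on each (x, y) cell
agg <- aggregate(list(w = w), by = list(x = x, y = y), FUN = sum)
# circle size reflects the accumulated weight at each cell
symbols(agg$x, agg$y, circles = sqrt(agg$w), inches = 0.1,
        xlab = "months", ylab = "y")
```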

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111



 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Jan Akko Eleveld
 Sent: Thursday, August 21, 2008 3:11 PM
 To: r-help@r-project.org
 Subject: [R] HELP: how to add weight to a [x,y] coordinate


 Anyone who can help me with the following question?


 How can I add weight to [x,y] coordinates on a graph/scatterplot?


 Background:
 Monte Carlo simulation generated 730,000 [x,y] coordinates
 with a weight attached (from 0-0.5).
 Both x and y are rounded and fit on a raster with x-axis
 0-170 months (smallest unit = 1 month) and y-axis 0-6
 (smallest unit=0.1).
 I would like every [x,y] to add its weight, so that after all
 730,000 [x,y] are plotted, a maximum likelihood becomes
 apparent through the size of the points that are made up by
 accumulated weights.

 Thank you in advance for any thoughts!

 Best, Akko Eleveld



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Test of Homogeneity of Variances

2008-08-22 Thread Richardson, Patrick
What are your hypotheses?  Once you state what they are, interpretation should 
be straightforward.
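For the example in the original post, the conventional decision rule at alpha = 0.05 
can be sketched as follows (the null hypothesis of both tests is equal variances):

```r
set.seed(5)
x  <- rnorm(20)
g  <- rep(1:5, each = 4)
bt <- bartlett.test(x, g)
# A large p-value means no evidence against H0 (equal variances);
# strictly we "fail to reject" H0 rather than accept it.
bt$p.value > 0.05   # TRUE for these data
```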



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Daren Tan
Sent: Friday, August 22, 2008 11:18 AM
To: [EMAIL PROTECTED]
Subject: [R] Test of Homogeneity of Variances


I am testing the homogeneity of variances via bartlett.test and fligner.test. 
Using the following example, how should I interpret the p-value in order to 
accept or reject the null hypothesis ?

set.seed(5)
x <- rnorm(20)
bartlett.test(x, rep(1:5, each=4))


Bartlett test of homogeneity of variances


data:  x and rep(1:5, each = 4)
Bartlett's K-squared = 1.7709, df = 4, p-value = 0.7778

fligner.test(x, rep(1:5, each=4))

 Fligner-Killeen test of homogeneity of variances


data:  x and rep(1:5, each = 4)
Fligner-Killeen:med chi-squared = 1.0819, df = 4, p-value = 0.8971

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lme questions re: repeated measures covariance structure

2008-08-22 Thread Mark Na
Hello,

We are attempting to use nlme to fit a linear mixed model to explain bird
abundance as a function of habitat:


lme(abundance~habitat-1, data=data, method="ML", random=~1|sampleunit)

The data consist of repeated counts of birds in sample units across multiple
years, and we have two questions:

1) Is it necessary (and, if so, how) to specify the repeated measure
(years)? As written, the above code does not.

2) How can we specify a Toeplitz heterogeneous covariance structure for this
model? We have searched the help file for lme, and the R-help archives, but
cannot find any pertinent information. If that's not possible, can we adapt
an existing covariance structure, and if so how?

Thanks, Mark

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Null and Alternate hypothesis for Significance test

2008-08-22 Thread Greg Snow
The problem with trying to prove that 2 distributions are equal is that there 
is exactly one way in which they can be equal and an infinite number of ways 
that they can be different (and its rather a large infinity).  The traditional 
use of equality as the Null hypothesis works, because we can assume that it is 
true and then compute the probability of seeing data as or more extreme than 
that observed.  If we just want to assume that the 2 distributions are 
different and compute the probability of the similarity between the datasets, 
then we can assume differences that are small enough, but still non-zero, to 
easily result in the similarity of the data seen.  It is not hard to come up 
with 2 distributions that are clearly different, but which would yield 
identical datasets.  Given this, any test to show exact equality would need to 
have a p-value identically set at 1 (data observed is completely plausible with 
null hypothesis).  Or we could get a significantly higher powered !
 test of the correct size by generating a p-value from a uniform(0,1) 
distribution.  Neither option has much scientific merit.

Stepping away from proving equality, the more common approach is to prove 
equivalence, where the alternative hypothesis is that the 2 distributions are 
close enough even if not equal.  Determining close enough is subjective and 
dependent on the scientific question.  Close enough is often determined by 
visual inspection of graphs rather than a hypothesis test.  If you insist on a 
hypothesis test, then you need to determine in advance what is meant by close 
enough, both in deciding the distance measure and how big that distance has to 
be before they are no longer equivalent.

You asked about the KS distance measure, that is one option for choosing a 
distance, there are others, which works best depends on you and the scientific 
question.  Take for example 2 distributions F and G.  F is uniform(0,1) meaning 
that the density function is 1 between 0 and 1 and 0 elsewhere; the density 
of G is equal to 1 between 0 and 0.99 and also equal to 1 between 99.99 and 
100, zero elsewhere.  Are these 2 functions equivalent?  The 2 functions have 
99% overlap, the KS distance is small (0.01 if I remember correctly), but the 
means and variances are quite different.  When generating random values from 
the 2 distributions we will see very similar numbers with the exception that G 
will generate outliers near 100 1% of the time.  Some people would consider 
these 2 distributions equivalent (they are pretty much the same if we discard 
the outliers), while others would consider the potential outliers (very 
extreme) to make them non-equivalent.  You need to decide that based on the 
science from which your data comes.
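The F-versus-G example above can be checked numerically; a simulation sketch 
(100,000 draws from each distribution):

```r
set.seed(1)
n <- 1e5
f <- runif(n)                            # F: uniform(0, 1)
g <- ifelse(runif(n) < 0.99,             # G: 99% mass on (0, 0.99),
            runif(n, 0, 0.99),           #    1% mass on (99.99, 100)
            runif(n, 99.99, 100))
unname(ks.test(f, g)$statistic)          # KS distance D stays near 0.01
mean(g) - mean(f)                        # yet the means differ by about 1
```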

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111



 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Nitin Agrawal
 Sent: Thursday, August 21, 2008 2:59 PM
 To: r-help@r-project.org
 Subject: [R] Null and Alternate hypothesis for Significance test

 Hi,
 I had a question about specifying the Null hypothesis in a
 significance test.
 Advance apologies if this has already been asked previously
 or is a naive question.

 I have two samples A and B, and I want to test whether A and
 B come from the same distribution. The default Null
 hypothesis would be H0: A=B But since I am trying to prove
 that A and B indeed come from the same distribution, I think
 this is not the right choice for the null hypothesis (it
 should be one that is set up to be rejected)

 How do I specify a null hypothesis H0: A not equal to B for
 say a KS test.
 An example to do this in R would be greatly appreciated.

 On a related note: what is a good way to measure the
 difference between observed and expected PDFs? Is the D
 statistic of the KS test a good choice?

 Thanks!
 Nitin

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] swap

2008-08-22 Thread Patrick Burns

I'm guessing that the following (untested)
does what is wanted:

function(x) {
   pos <- sample(length(x), 2, replace=FALSE)
   x[pos] <- x[pos[2:1]]
   x
}
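Naming the function for illustration, three swap steps (as the poster wanted) look 
like this; each call exchanges two randomly chosen positions:

```r
swap2 <- function(x) {
  pos <- sample(length(x), 2, replace = FALSE)
  x[pos] <- x[pos[2:1]]
  x
}
set.seed(1)
mix <- swap2(swap2(swap2(1:10)))   # three recursive swap steps
sum(mix != 1:10)                   # at most 6 positions can differ
```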


Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and A Guide for the Unwilling S User)

[EMAIL PROTECTED] wrote:

Hello Richie,
I would like to do three (or k) swap steps in each step just 2 ID 
recursive swaping

x <- 1:10
swap <- function(x){
  a <- sample(x,2)
  x[x==a[1]] <- swap[2]
  x[x==a[2]] <- swap[1]
  return(x)
  }
  swap(swap(swap(x))) -> mix



I tried my best with a response before, but if you want a sensible answer 
you are going to have to try harder explaining what you really want.


What do you mean by 'swap step'?

If you want to swap the position of two elements in a vector (as I suspect 
you might) then which positions do you want swapping?  Do you specify them 
yourself (as inputs to the swap function perhaps), or should they be 
randomly generated?


If you provide more context (the rest of your code, what you are trying to 
achieve etc.) the help will be better.


Regards,
Richie.

Mathematical Sciences Unit
HSL




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lme questions re: repeated measures covariance structure

2008-08-22 Thread Christoph Scherber

Dear Mark,

I would include the repeated measure as the smallest stratum in the random 
effects specification:

random=~1|sampleunit/year

Setting up user-defined variance structures should be possible using for 
example:

weights=varPower(form=~habitat)

or also try out the available corStruct() classes (found in Pinheiro and Bates 
2000)
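A sketch of one way to combine these pieces (untested here; `data`, `abundance`, 
`habitat`, `sampleunit` and `year` as in the original post). nlme has no built-in 
Toeplitz class, so a general (unstructured) corSymm correlation plus varIdent 
variances by year is a common stand-in:

```r
library(nlme)
# Unstructured within-sampleunit correlation across repeated years,
# with a separate residual variance per year.
fit <- lme(abundance ~ habitat - 1, data = data, method = "ML",
           random = ~ 1 | sampleunit,
           correlation = corSymm(form = ~ 1 | sampleunit),
           weights = varIdent(form = ~ 1 | year))
```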

HTH
Christoph




Mark Na schrieb:

Hello,

We are attempting to use nlme to fit a linear mixed model to explain bird
abundance as a function of habitat:


lme(abundance~habitat-1, data=data, method="ML", random=~1|sampleunit)

The data consist of repeated counts of birds in sample units across multiple
years, and we have two questions:

1) Is it necessary (and, if so, how) to specify the repeated measure
(years)? As written, the above code does not.

2) How can we specify a Toeplitz heterogeneous covariance structure for this
model? We have searched the help file for lme, and the R-help archives, but
cannot find any pertinent information. If that's not possible, can we adapt
an existing covariance structure, and if so how?

Thanks, Mark

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

.



--
Dr. rer.nat. Christoph Scherber
University of Goettingen
DNPW, Agroecology
Waldweg 26
D-37073 Goettingen
Germany

phone +49 (0)551 39 8807
fax   +49 (0)551 39 8806

Homepage http://www.gwdg.de/~cscherb1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mapping a vector into predefined bins

2008-08-22 Thread Wolfgang Huber

Have a look at

  ? cut
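A sketch with made-up bin centers and values (the real M supplies both columns); 
cut() with labels = FALSE returns the bin index directly:

```r
centers <- seq(0.01, 1, by = 0.01)
values  <- rainbow(length(centers))   # stand-ins for "red", "orange", ...
breaks  <- c(0, centers + 0.005)      # bin edges halfway between centers
D   <- c(0.9056, 0.4118, 0.2162)
bin <- cut(D, breaks = breaks, labels = FALSE)
values[bin]                           # the value for each datum's bin
```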



Best wishes
 Wolfgang

--
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber



22/08/2008 14:14 Dr. G. Lawyer scripsit
Hi,
I have a mapping M consisting of a vector of bin centers (say 0.01, 0.02,
0.03 ..., 1) with associated values (say red, orange, yellow, ...),
stored as a 2-column matrix with column 1 the bin center and column 2 the
value.
I also have a vector D of data (say 0.9056, 0.4118,  0.2162,  ...).
I want to map each datum in D to the value associated with its bin in M 
 (say
0.0111 -> red, 0.0198 -> orange, ...) .
Does R have a built-in function for this?
Thanks for tips/suggestions.
 
+glenn

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] filtering out data

2008-08-22 Thread Daniel Folkinshteyn
If I understand correctly the result you are trying to achieve, I think 
what you may be looking for is this:


model1.coeff <- lm(dv ~ iv1 + iv2 + iv3, data = 
merged.dataset[merged.dataset$model1 == 1, ])


This will give you a regression using only the data for which model1 == 1.
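The same filter via subset() reads closest to Stata's trailing "if"; a sketch on toy 
data (the variable names follow the thread, the data are simulated):

```r
set.seed(1)
merged.dataset <- data.frame(dv = rnorm(20), iv1 = rnorm(20),
                             iv2 = rnorm(20), iv3 = rnorm(20),
                             model1 = rep(0:1, 10))
model1.coeff <- lm(dv ~ iv1 + iv2 + iv3,
                   data = subset(merged.dataset, model1 == 1))
nobs(model1.coeff)   # 10: only the model1 == 1 rows were used
```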

on 08/22/2008 11:07 AM jim holtman said the following:

Exactly what do you want this statement to do:

if(merged.dataset$model1 == 1) model1.coeff <- lm(dv ~ iv1 + iv2 +
iv3, data = merged.dataset)

The result in the if is a vector of logical values (assuming that
merged.dataset is longer than 1); that is the source of the error
message.  Do you want the regression to be done on every row?  So what
is the problem you are trying to solve?

On Fri, Aug 22, 2008 at 10:47 AM, Kate Rohrbaugh
[EMAIL PROTECTED] wrote:

Greetings,

Apologies for such a simple question, but I have been trying to figure this out for a few 
days now (I'm quite new to R), have read manuals, list posts, etc., and not finding 
precisely what I need.  I am accustomed to working in Stata, but I am currently working 
through the multilevel package right now.  In Stata, I would include the statement 
if model1 == 1 at the end of my regression statement, but I can't seem to 
find the correct syntax for doing the same thing in R.

I have successfully converted Stata datasets, merged, aggregated, run lm, etc., 
but this simple task is evading me entirely.

Here's what I have tried (and various iterations):

if(merged.dataset$model1 == 1) model1.coeff <- lm(dv ~ iv1 + iv2 + iv3, data = 
merged.dataset)
I get this:
Warning message:
In if (merged.dataset$model1 == 1) model1.coeff <- lm(dv ~ iv +  :
 the condition has length > 1 and only the first element will be used
What it seems to do is just run lm on the first opportunities of model1==1, but 
then doesn't try it again.  I tried sorting so that the model1==1 appear first, 
but that doesn't seem to be a great solution either.

I'll just go back into Stata and create separate new datasets if I have to, but 
there HAS to be a more elegant way.

Thank you for ANY feedback!

Kate

Kate Rohrbaugh
Independent Project Analysis, Inc.
44426 Atwater Drive
Ashburn, VA  20147

office - 703.726.5465
fax - 703.729.8301
email - [EMAIL PROTECTED]
website - www.ipaglobal.com



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.







__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Test of Homogeneity of Variances

2008-08-22 Thread Daren Tan

I am testing whether the sample variances are equal. When the p-value > 0.05 
(alpha), should I accept the null hypothesis (sample variances are equal) or reject 
it ?


The two new examples with each having same sample variances also puzzle me. Why 
are the p-values different ?

bartlett.test(rep(rnorm(5),times=4), rep(1:5, each=4))


Bartlett test of homogeneity of variances


data:  rep(rnorm(5), times = 4) and rep(1:5, each = 4)
Bartlett's K-squared = 0.8681, df = 4, p-value = 0.929


bartlett.test(rep(rnorm(5),times=4), rep(1:5, each=4))


Bartlett test of homogeneity of variances


data:  rep(rnorm(5), times = 4) and rep(1:5, each = 4)
Bartlett's K-squared = 3.5599, df = 4, p-value = 0.4688


 From: [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
 Date: Fri, 22 Aug 2008 11:25:36 -0400
 Subject: RE: [R] Test of Homogeneity of Variances

 What are your hypotheses? Once you state what they are, interpretation should 
 be straightforward.



 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Daren Tan
 Sent: Friday, August 22, 2008 11:18 AM
 To: [EMAIL PROTECTED]
 Subject: [R] Test of Homogeneity of Variances


 I am testing the homogeneity of variances via bartlett.test and fligner.test. 
 Using the following example, how should I interpret the p-value in order to 
 accept or reject the null hypothesis ?

 set.seed(5)
 x <- rnorm(20)
 bartlett.test(x, rep(1:5, each=4))


 Bartlett test of homogeneity of variances


 data: x and rep(1:5, each = 4)
 Bartlett's K-squared = 1.7709, df = 4, p-value = 0.7778

 fligner.test(x, rep(1:5, each=4))

 Fligner-Killeen test of homogeneity of variances


 data: x and rep(1:5, each = 4)
 Fligner-Killeen:med chi-squared = 1.0819, df = 4, p-value = 0.8971

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] filtering out data

2008-08-22 Thread Kate Rohrbaugh
Thank you, thank you everyone!
 
This last worked the best for what I need to do.  The other options were 
definitely helpful in that they help me understand how R is structured 
(something I haven't been able to get my head around yet) and gave me good 
ideas about how to manipulate data, however they generated regressions of data 
I don't care about (i.e., when model1==0).   
 
Regards -- Kate
 
Kate Rohrbaugh
Independent Project Analysis, Inc.
44426 Atwater Drive
Ashburn, VA  20147
 
office - 703.726.5465
fax - 703.729.8301
email - [EMAIL PROTECTED] 
website - www.ipaglobal.com


 Daniel Folkinshteyn [EMAIL PROTECTED] 8/22/2008 12:08 PM 
If I understand correctly the result you are trying to achieve, I think 
what you may be looking for is this:

model1.coeff <- lm(dv ~ iv1 + iv2 + iv3, data = 
merged.dataset[merged.dataset$model1 == 1, ])

This will give you a regression using only the data for which model1 == 1.

on 08/22/2008 11:07 AM jim holtman said the following:
 Exactly what do you want this statement to do:
 
 if(merged.dataset$model1 == 1) model1.coeff <- lm(dv ~ iv1 + iv2 +
 iv3, data = merged.dataset)
 
 The result in the if is a vector of logical values (assuming that
 merged.dataset is longer than 1); that is the source of the error
 message.  Do you want the regression to be done on every row?  So what
 is the problem you are trying to solve?
 
 On Fri, Aug 22, 2008 at 10:47 AM, Kate Rohrbaugh
 [EMAIL PROTECTED] wrote:
 Greetings,

 Apologies for such a simple question, but I have been trying to figure this 
 out for a few days now (I'm quite new to R), have read manuals, list posts, 
 etc., and not finding precisely what I need.  I am accustomed to working in 
 Stata, but I am currently working through the multilevel package right now.  
 In Stata, I would include the statement if model1 == 1 at the end of my 
 regression statement, but I can't seem to find the correct syntax for doing 
 the same thing in R.

 I have successfully converted Stata datasets, merged, aggregated, run lm, 
 etc., but this simple task is evading me entirely.

 Here's what I have tried (and various iterations):

 if(merged.dataset$model1 == 1) model1.coeff <- lm(dv ~ iv1 + iv2 + iv3, data 
 = merged.dataset)
 I get this:
 Warning message:
 In if (merged.dataset$model1 == 1) model1.coeff <- lm(dv ~ iv +  :
  the condition has length > 1 and only the first element will be used
 What it seems to do is just run lm on the first opportunities of model1==1, 
 but then doesn't try it again.  I tried sorting so that the model1==1 appear 
 first, but that doesn't seem to be a great solution either.

 I'll just go back into Stata and create separate new datasets if I have to, 
 but there HAS to be a more elegant way.

 Thank you for ANY feedback!

 Kate

 Kate Rohrbaugh
 Independent Project Analysis, Inc.
 44426 Atwater Drive
 Ashburn, VA  20147

 office - 703.726.5465
 fax - 703.729.8301
 email - [EMAIL PROTECTED] 
 website - www.ipaglobal.com 



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help 
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
 
 


[R] WinBUGS with R

2008-08-22 Thread artimon

Dear Users,
I am new to both of things, so do not blame me too much...

I am busy with semiparametric regression and use WinBUGS to sample
posteriors.
The code to call Winbugs is as follows: 

data        <- list(y,X,n,m) #My variables
inits.beta  <- rep(0,K)
inits.beta0 <- 0
inits       <- 
function(){list(beta=inits.beta,beta0=inits.beta0,taueps=1.0E-3)}
parameters  <- list("sigma","beta","beta0","y.star")
fitm        <- bugs(data,inits,parameters,model.file="model.bug",
                n.chains=3, n.iter=n.iter, n.burnin=n.burnin, n.thin = 
                n.thin,
                debug=FALSE,DIC=FALSE,digit=5,codaPkg=FALSE,
                bugs.directory="C:/Program Files/WinBUGS14/")

but I always get the following error:
Error in FUN(X[[1L]], ...) : 
  .C(..): 'type' must be real for this format


I tried the web, but failed. Could anyone give me a clue?
Best!

-- 
View this message in context: 
http://www.nabble.com/WinBUGS-with-R-tp19108557p19108557.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combining multiple datasets

2008-08-22 Thread bogdan romocea

Here's a function that does what you asked, you may need to adjust the column
names of your data frames before using it. If all your data frames are
similar (same number of rows, same years) then try do.call('cbind',
yourList).


#This function takes a list of data frames and merges them into one data
#frame.
#   The data frames are assumed to have one common column name (the one used 
#   for joins) and customized name(s) for the remaining column(s).
#LDF = list of data frames to be merged

rmerge <- function(LDF, verbose=FALSE)
{
DF <- LDF[[1]]
cat(paste("Started with", nrow(DF), "rows from", names(LDF)[1]), "\n")
for (i in 2:length(LDF)) {
DF <- merge(DF, LDF[[i]], all=TRUE)  #outer join
if (verbose) cat(paste("Adding ", nrow(LDF[[i]]), " rows from ",
names(LDF)[i], 
" (", nrow(DF), " rows so far)...", sep=""), "\n")
}
rownames(DF) <- NULL
cat(paste("Got", nrow(DF), "rows after recursive merging"), "\n")
DF
}
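The same recursive outer join can be expressed with base R's Reduce(); a sketch on 
toy data frames sharing a "year" column (shapes are made up):

```r
a  <- data.frame(year = 1990:1994, s1 = 1:5)
b  <- data.frame(year = 1992:1996, s2 = 1:5)
c_ <- data.frame(year = 1991:1993, s3 = 1:3)
# fold merge() over the list; all = TRUE keeps the union of years
m <- Reduce(function(x, y) merge(x, y, all = TRUE), list(a, b, c_))
nrow(m)   # 7: the union of years 1990..1996
```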




Alison Macalady wrote:
 
 Hi,
 I've tried to figure this out using Intro to R and help(), to no avail 
 - I am new at this.
 
 I'm trying to write a script that will read multiple files from a 
 directory and then merge them into a single new data frame.
 The original data are in a tree-ring specific format, and so I've first 
 used a function (read.rwl) from the dplR package to read each file, 
 translate each into a more intuitive time series format, and then put 
 each new data frame into a single object, demo.Rwl:
 
 demo.Rwl <- list.files(pattern=".RWL$"); for(i in demo.Rwl) { x <- 
 read.rwl(i, header=TRUE); assign(print(i, quote=FALSE), x)}
 
 This part seems to work.  Now, I want to be able to put all of the 
 individual data frames contained in demo.Rwl into a single data frame, 
 merging by rows (rows are calendar years in the time series). I think I 
 know how to do this by typing each data set name into a merge:
 
 merge(x, y, ..., by.x=0, all=TRUE)
 
 However, I'd like to be able to script something that will save me the 
 time of typing in all of the individual data set names, as I will be 
 repeating this operation many times with many different numbers of 
 original input files. Is there a way to code a merge such that every 
 individual data frame contained in demo.Rwl (a different number every 
 time) gets combined into a single data set?
 
 Thanks for the help!
 
 Ali
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Combining-multiple-datasets-tp19100369p19108774.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Quantile regression with complex survey data

2008-08-22 Thread Brian S Cade
Just as a follow-up on T. Lumley's post, 2 citations that may be useful in 
reference to application of quantile regression with survey samples are:

Bassett and Saleh.  1994.  L_1 estimation of the median of a survey 
population.  Nonparametric Statistics 3: 277-283.  (L_1 estimation of 
median is 0.50 quantile regression).

Bassett et al.  2002.  Quantile models and estimators for data analysis. 
Metrika 55: 17-26.  (describes weighted QR for survey of school 
characteristics and student achievement scores).

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  [EMAIL PROTECTED]
tel:  970 226-9326



Stas Kolenikov [EMAIL PROTECTED] 
Sent by: [EMAIL PROTECTED]
08/20/2008 01:14 PM

To
Cheng, Yiling (CDC/CCHP/NCCDPHP) [EMAIL PROTECTED]
cc
r-help@r-project.org
Subject
Re: [R] Quantile regression with complex survey data






On Wed, Aug 20, 2008 at 8:12 AM, Cheng, Yiling (CDC/CCHP/NCCDPHP)
[EMAIL PROTECTED] wrote:
 I am working on the NHANES survey data, and want to apply quantile
 regression on these complex survey data. Does anyone know how to do
 this?

There are no references in technical literature (thinking, Annals,
JASA, JRSS B, Survey Methodology). Absolutely none. Zero. You might be
able to apply the procedure mechanically and then adjust the standard
errors, but God only knows what the population equivalent is of
whatever that model estimates. If there is a population analogue at
all.

In general, a quantile regression is a heavily model based concept:
for each value of the explanatory variables, there is a well defined
distribution of the response, and quantile regression puts additional
structure on it -- linearity of quantiles wrt to some explanatory
variables. That does not mesh well with the design paradigm according
to which the survey estimation is usually conducted. With the latter,
the finite population and characteristics of every unit are assumed
fixed, and randomness comes only from the sampling procedure. Within
that paradigm, you can define the marginal distribution of the
response (or any other) variable, but the conditional distributions
may simply be unavailable because there are no units in the population
satisfying the conditions.

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] grep, gsub and metacharacters

2008-08-22 Thread Judith Flores
Hello,

   I have an expression that I want to substitute for something else.

myvector <- c("ajkhfkiuwe", "Variable(kg/min)")

if I use the following code to extract "Variable(kg/min)" and substitute it for 
"va"

gsub(myvector[grep("var", myvector, ignore=T)], "va", myvector)

 grep identifies the element of the vector that matches my query, but it won't 
substitute the value. I have been reading the help pages for regular 
expression, but still can't figure out the syntax to read parenthesis and 
forward slashes, which are considered metacharacters.


As usual, thank you for your help,

Judith

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survey Design / Rake questions

2008-08-22 Thread Farley, Robert
I *think* I'm making progress, but I'm still failing at the same step.  My rake 
call fails with:
Error in postStratify.survey.design(design, strata[[i]], 
population.margins[[i]],  : 
  Stratifying variables don't match

To my naïve eyes, it seems that my factors are in the wrong order.  If so, 
how do I assert an ordering in my survey dataframe, or copy an image from 
the survey dataframe to my marginals dataframes?  I'd prefer to pull the 
original marginals dataframe(s) from the survey dataframe so that I can 
automate that in production.

If that's not my problem, where might I look for enlightenment?  Neither ?why 
nor ?whatamimissing return citations.  :-)


**
 How I fail Now **
SurveyData <- read.spss("C:/Data/R/orange_delivery.sav", use.value.labels=TRUE, 
max.value.labels=Inf, to.data.frame=TRUE)
 #===
 temp <- sub(' +$', '', SurveyData$direction_) 
 SurveyData$direction_ <- temp
 #===
 SurveyData$NumStn=abs(as.numeric(SurveyData$lineon)-as.numeric(SurveyData$lineoff))
 mean(SurveyData$NumStn)
[1] 6.785276
 ### Kludge
 SurveyData$NumStn <- pmax(1, SurveyData$NumStn)
 mean(SurveyData$NumStn)
[1] 6.789877
 SurveyData$NumStn <- as.factor(SurveyData$NumStn)
 ###
 EBSurvey <- subset(SurveyData, direction_ == "EASTBOUND")
 XTTable <- xtabs(~direction_ , EBSurvey)
 XTTable
direction_
EASTBOUND 
  345 
 WBSurvey <- subset(SurveyData, direction_ == "WESTBOUND")
 XTTable <- xtabs(~direction_ , WBSurvey)
 XTTable
direction_
WESTBOUND 
  307 
 #
 EBDesign <- svydesign(id=~sampn, weights=~expwgt, data=EBSurvey)
 #   svytable(~lineon+lineoff, EBDesign)
 StnName <- c("Warner Center", "De Soto", "Pierce College", "Tampa", 
 "Reseda", "Balboa", "Woodley", "Sepulveda", "Van Nuys", "Woodman", 
 "Valley College", "Laurel Canyon", "North Hollywood")
 EBOnNewTots <- c(1000,   600, 1200, 500, 
 1000,  500,   200, 250,   1000,   300,  
 100,  123.65,0 )
 StnTraveld  <- c(as.character(1:12))
 EBNumStn <- c(673.65, 800, 1000, 1000,  800,  700,  600, 500, 400, 
 200,  50, 50 )
 ByEBOn  <- data.frame(StnName,   Freq=EBOnNewTots)
 ByEBNum <- data.frame(StnTraveld, Freq=EBNumStn)
 RakedEBSurvey <- rake(EBDesign, list(~lineon, ~NumStn), list(ByEBOn, ByEBNum) 
 )
Error in postStratify.survey.design(design, strata[[i]], 
population.margins[[i]],  : 
  Stratifying variables don't match
 
 str(EBSurvey$lineon)
 Factor w/ 13 levels Warner Center,..: 3 1 1 1 2 13 1 5 1 5 ...
 EBSurvey$lineon[1:5]
[1] Pierce College Warner Center  Warner Center  Warner Center  De Soto   
Levels: Warner Center De Soto Pierce College Tampa Reseda Balboa Woodley 
Sepulveda Van Nuys Woodman Valley College Laurel Canyon North Hollywood
 str(ByEBOn$StnName)
 Factor w/ 13 levels Balboa,De Soto,..: 11 2 5 8 6 1 12 7 10 13 ...
 ByEBOn$StnName[1:5]
[1] Warner Center  De SotoPierce College Tampa  Reseda
Levels: Balboa De Soto Laurel Canyon North Hollywood Pierce College Reseda 
Sepulveda Tampa Valley College Van Nuys Warner Center Woodley Woodman
 
 str(EBSurvey$NumStn)
 Factor w/ 12 levels 1,2,3,4,..: 10 12 4 12 8 1 8 8 12 4 ...
 EBSurvey$NumStn[1:5]
[1] 10 12 4  12 8 
Levels: 1 2 3 4 5 6 7 8 9 10 11 12
 str(ByEBNum$StnTraveld)
 Factor w/ 12 levels 1,10,11,..: 1 5 6 7 8 9 10 11 12 2 ...
 ByEBNum$StnTraveld[1:5]
[1] 1 2 3 4 5
Levels: 1 10 11 12 2 3 4 5 6 7 8 9
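
[Editorial sketch, not part of the original message.] The str() output above
shows the likely problem: the marginal data frames use different variable
names (StnName, StnTraveld) and differently ordered factor levels than the
design variables named in the rake() formulas. One plausible fix, untested
against these data, is to rename and re-level the margins before raking:

```r
## Hypothetical fix: make the population margins use the same variable
## names and factor levels as the design variables in EBSurvey.
ByEBOn$lineon  <- factor(as.character(ByEBOn$StnName),
                         levels = levels(EBSurvey$lineon))
ByEBNum$NumStn <- factor(as.character(ByEBNum$StnTraveld),
                         levels = levels(EBSurvey$NumStn))
RakedEBSurvey  <- rake(EBDesign, list(~lineon, ~NumStn),
                       list(ByEBOn[c("lineon", "Freq")],
                            ByEBNum[c("NumStn", "Freq")]))
```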


**
**


Robert Farley
Metro
www.Metro.net 


-Original Message-
From: Thomas Lumley [mailto:[EMAIL PROTECTED] 
Sent: Thursday, August 21, 2008 13:55
To: Farley, Robert
Cc: r-help@r-project.org
Subject: Re: [R] Survey Design / Rake questions

On Tue, 19 Aug 2008, Farley, Robert wrote:

 While I'm trying to catch up on the statistical basis of my task, could
 someone point me to how I should fix my R error?

The variables in the formula in rake() need to be the raw variables in the 
design object, not summary tables.

   -thomas


 Thanks


 
 
 library(survey)
 SurveyData <- read.spss("C:/Data/R/orange_delivery.sav",
 use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)

 #===
 
 temp <- sub(' +$', '', SurveyData$direction_)
 SurveyData$direction_ <- temp

 #===
 

 SurveyData$NumStn=abs(as.numeric(SurveyData$lineon)-as.numeric(SurveyData$lineoff))
 EBSurvey <- subset(SurveyData, direction_ == "EASTBOUND")
 XTTable <- xtabs(~direction_ , EBSurvey)
 

Re: [R] WinBUGS with R

2008-08-22 Thread Uwe Ligges



artimon wrote:

Dear Users,
I am new to both of things, so do not blame me too much...

I am busy with semiparametric regression and use WinBUGS to sample
posteriors.
The code to call Winbugs is as follows: 


data        <- list(y, X, n, m)  # my variables
inits.beta  <- rep(0, K)
inits.beta0 <- 0
inits       <- function(){list(beta=inits.beta, beta0=inits.beta0, taueps=1.0E-3)}
parameters  <- list("sigma", "beta", "beta0", "y.star")
fitm        <- bugs(data, inits, parameters, model.file="model.bug",
                    n.chains=3, n.iter=n.iter, n.burnin=n.burnin, n.thin=n.thin,
                    debug=FALSE, DIC=FALSE, digit=5, codaPkg=FALSE,
                    bugs.directory="C:/Program Files/WinBUGS14/")



but I always get the following error:
Error in FUN(X[[1L]], ...) : 
  .C(..): 'type' must be real for this format



I tried the web, but failed. Could anyone give me a clue?
Best!




- Your example is not reproducible for us (without the data).
- Is WinBUGS running correctly (see its output with debug=TRUE)?
- What does traceback() tell us?

Which versions of R, WinBUGS and R2WinBUGS are in use?

Uwe Ligges

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to read malformed csv files with read.table?

2008-08-22 Thread Martin Ballaschk

Hi folks,

thank you for your friendly and immediate help!

Am 22.08.2008 um 17:14 schrieb Prof Brian Ripley:
Or, better, use header=FALSE, skip=1 and the col.names arg of  
read.table().


My solution is reading the files without the headers (skip = 1) and  
separately reading the headers with scan (scan("myfile.CSV", what =  
character(), sep = "\t", nlines = 1)). After throwing out the first two  
columns it should be possible to assign the scanned colnames to the  
data.frame colnames.


Cheers
Martin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grep, gsub and metacharacters

2008-08-22 Thread Greg Snow
Try this:

 myvector <- c("ajkhfkiuwe", "Variable(kg/min)")
 gsub( "\\(kg/min\\)", "va", myvector )

Does that come close to what you want?  If not, maybe give an example of what 
you want the result to look like.
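
[Editorial aside, not part of the original reply:] there are two equivalent
ways to sidestep the metacharacter problem, both using only base gsub():

```r
myvector <- c("ajkhfkiuwe", "Variable(kg/min)")

## 1. Treat the pattern as a literal string, no escaping needed:
gsub("Variable(kg/min)", "va", myvector, fixed = TRUE)

## 2. Escape the regex metacharacters '(' and ')' with backslashes:
gsub("Variable\\(kg/min\\)", "va", myvector)

## Both return: "ajkhfkiuwe" "va"
```

fixed = TRUE is the simpler choice when the pattern is a known literal string
rather than a true regular expression.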

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111



 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Judith Flores
 Sent: Friday, August 22, 2008 10:57 AM
 To: RHelp
 Subject: [R] grep, gsub and metacharacters

 Hello,

I have an expression that I want to substitute for something else.

  myvector <- c("ajkhfkiuwe", "Variable(kg/min)")
 
  if I use the following code to extract "Variable(kg/min)" and
  substitute it for "va"
 
  gsub(myvector[grep("var", myvector, ignore=T)], "va", myvector)

  grep identifies the element of the vector that matches my
 query, but it won't substitute the value. I have been reading
 the help pages for regular expression, but still can't figure
 out the syntax to read parenthesis and forward slashes, which
 are considered metacharacters.


 As usual, thank you for your help,

 Judith

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to read malformed csv files with read.table?

2008-08-22 Thread Prof Brian Ripley

On Fri, 22 Aug 2008, Martin Ballaschk wrote:


Hi folks,

thank you for your friendly and immediate help!

Am 22.08.2008 um 17:14 schrieb Prof Brian Ripley:

Or, better, use header=FALSE, skip=1 and the col.names arg of read.table().


My solution is reading the files without the headers (skip = 1) and 
separately reading the headers with scan (scan("myfile.CSV", what = 
character(), sep = "\t", nlines = 1)). After throwing out the first two 
columns it should be possible to assign the scanned colnames to the 
data.frame colnames.


Yes, but if you read the header first you can set the col.names via the 
arg to read.table().


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] simple generation of artificial data with defined features

2008-08-22 Thread Christos Hatzis
On the general question on how to create a dataset that matches the
frequencies in a table, the function as.data.frame can be useful.  It takes as
its argument an object of class 'table' and returns a data frame of
frequencies.

Consider for example table 6.1 of Fleiss et al (3rd Ed):

 birth.weight <- c(10,15,40,135)
 attr(birth.weight, "class") <- "table"
 attr(birth.weight, "dim") <- c(2,2)
 attr(birth.weight, "dimnames") <- list(c("A", "Ab"), c("B", "Bb"))
 birth.weight
 B  Bb
A   10  40
Ab  15 135
 summary(birth.weight)
Number of cases in table: 200 
Number of factors: 2 
Test for independence of all factors:
Chisq = 3.429, df = 1, p-value = 0.06408
 
 bw.dt <- as.data.frame(birth.weight)

Observations (rows) in this table can then be replicated according to their
corresponding frequencies to yield the expanded dataset that conforms with
the original table. 

 bw.dt.exp <- bw.dt[rep(1:nrow(bw.dt), bw.dt$Freq), -ncol(bw.dt)]
 dim(bw.dt.exp)
[1] 200   2
 table(bw.dt.exp)
Var2
Var1   B  Bb
  A   10  40
  Ab  15 135 

The above approach is not restricted to 2x2 tables, and it should be
straightforward to generate datasets that conform to arbitrary n x m frequency
tables.
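
[Editorial illustration of that last point; all names and counts below are
made up.] The same rep() trick reproduces an arbitrary n x m table:

```r
## Build a hypothetical 2x3 table, expand it to raw observations, and
## check that tabulating the expansion recovers the original counts.
tab <- as.table(matrix(c(5, 2, 3, 1, 4, 6), nrow = 2,
                dimnames = list(rater = c("r1", "r2"),
                                party = c("SPD", "CDU", "FDP"))))
df       <- as.data.frame(tab)
expanded <- df[rep(seq_len(nrow(df)), df$Freq), -ncol(df)]
all(table(expanded) == tab)   # TRUE
```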

-Christos Hatzis


 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Greg Snow
 Sent: Friday, August 22, 2008 12:41 PM
 To: drflxms; r-help@r-project.org
 Subject: Re: [R] simple generation of artificial data with 
 defined features
 
 I don't think that the election data is the right data to 
 demonstrate Kappa, you need subjects that are classified by 2 
 or more different raters/methods.  The election data could be 
 considered classifying the voters into which party they voted 
 for, but you only have 1 rater.  Maybe if you had some survey 
 data that showed which party each voter voted for in 2 or 
 more elections, then that may be a good example dataset.  
 Otherwise you may want to stick with the sample datasets.
 
 There are other packages that compute Kappa values as well (I 
 don't know if others calculate this particular version), but 
 some of those take the summary data as input rather than the 
 raw data, which may be easier if you just have the summary tables.
 
 
 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 [EMAIL PROTECTED]
 (801) 408-8111
 
 
 
  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of drflxms
  Sent: Friday, August 22, 2008 6:12 AM
  To: r-help@r-project.org
  Subject: [R] simple generation of artificial data with defined 
  features
 
  Dear R-colleagues,
 
  I am quite a newbie to R fighting my stupidity to solve a probably 
  quite simple problem of generating artificial data with defined 
  features.
 
  I am conducting a study of inter-observer-agreement in 
  child-bronchoscopy. One of the most important measures is Kappa 
  according to Fleiss, which is very comfortable available in 
 R through 
  the irr-package.
  Unfortunately medical doctors like me don't really 
 understand much of 
  statistics. Therefore I'd like to give the reader an easy 
  understandable example of Fleiss-Kappa in the Methods part. 
 To achieve 
  this, I obtained a table with the results of the German 
 election from 
  2005:
 
  party     number of votes   percent
 
  SPD       16194665          34.2
  CDU       13136740          27.8
  CSU       3494309           7.4
  Gruene    3838326           8.1
  FDP       4648144           9.8
  PDS       4118194           8.7
 
  I want to show the agreement of voters measured by Fleiss-Kappa. To 
  calculate this with the kappam.fleiss-function of irr, I need a 
  data.frame like this:
 
  (id of 1st voter) (id of 2nd voter)
 
  party spd cdu
 
  Of course I don't plan to calculate this with the million of cases 
  mentioned in the table above (I am working on a small laptop). A 
  division by 1000 would be more than perfect for this example. The 
  exact format of the table is generally not so important, as I could 
  reshape nearly every format with the help of the reshape-package.
 
  Unfortunately I could not figure out how to create such a 
  fictive/artificial dataset as described above. Any 
 data.frame would be 
  nice, that keeps at least the percentage. String-IDs of 
 parties could 
  be substituted by numbers of course (would be even better 
 for function 
  kappam.fleiss in irr!).
 
  I would appreciate any kind of help very much indeed.
  Greetings from Munich,
 
  Felix Mueller-Sarnowski
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 __
 R-help@r-project.org mailing list
 

Re: [R] How to read malformed csv files with read.table?

2008-08-22 Thread Martin Ballaschk


Hi Brian,


Am 22.08.2008 um 19:22 schrieb Prof Brian Ripley:

On Fri, 22 Aug 2008, Martin Ballaschk wrote:
My solution is reading the files without the headers (skip = 1) and  
seperately reading the headers with scan (scan(myfile.CSV, what =  
character, sep = \t, nlines = 1). After throwing out the first  
two columns it should be possible to assign the scanned colnames to  
the data.frame colnames.


Yes, but if you read the header first you can set the col.names via  
the arg to read.table().


Thanks! I plan to do it like that (actually it will be stuffed into a  
loop to read a bunch of files), seems to work:


 headernames <- scan("test.CSV", what = character(), sep = "\t",  
nlines = 1, skip = 4)
 my.table <- read.table("test.CSV", header=FALSE, skip = 5, col.names =  
c("crap.1", "crap.2", headernames))


 head(my.table)
  crap.1 crap.2 time.ms C550.KMS Cyt_b559.KMS [...] etc.
1  0 Point1  -599.50.0000.000
2  0 Point2  -598.00.019   -0.014
3  0 Point3  -596.50.025   -0.023
4  0 Point4  -595.00.034   -0.029
5  0 Point5  -593.50.049   -0.033
6  0 Point6  -592.00.068   -0.033

Cheers
Martin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Test of Homogeneity of Variances

2008-08-22 Thread Mark Difford



Have you read the documentation to either of the functions you are using?

?bartlett.test

Performs Bartlett's test of the null that the variances in each of the
groups (samples) are the same.

This explicitly tells you what is being tested, i.e. the null tested is that
var1 = var2.

?rnorm

Generates random deviates, so the answer [i.e. p-value] will (almost
certainly) differ each time. Only by setting a seed for the random number
generator to use will rnorm() generate the same number-set/distribution and
therefore the same p-value.

?set.seed

##
set.seed(7)
bartlett.test(rep(rnorm(5),times=4), rep(1:5, each=4))
set.seed(7)
bartlett.test(rep(rnorm(5),times=4), rep(1:5, each=4))
set.seed(23)
bartlett.test(rep(rnorm(5),times=4), rep(1:5, each=4))
bartlett.test(rep(rnorm(5),times=4), rep(1:5, each=4))

HTH, Mark.


Daren Tan wrote:
 
 
  I am testing whether the sample variances are equal. When the p-value > 0.05
  (alpha), should I accept the null hypothesis (sample variances are equal) or
  reject it?
 
 
  The two new examples, each having the same sample variances, also puzzle
  me. Why are the p-values different?
 
 bartlett.test(rep(rnorm(5),times=4), rep(1:5, each=4))
 
 
 Bartlett test of homogeneity of variances
 
 
 data:  rep(rnorm(5), times = 4) and rep(1:5, each = 4)
 Bartlett's K-squared = 0.8681, df = 4, p-value = 0.929
 
 
 bartlett.test(rep(rnorm(5),times=4), rep(1:5, each=4))
 
 
 Bartlett test of homogeneity of variances
 
 
 data:  rep(rnorm(5), times = 4) and rep(1:5, each = 4)
 Bartlett's K-squared = 3.5599, df = 4, p-value = 0.4688
 
 
 From: [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
 Date: Fri, 22 Aug 2008 11:25:36 -0400
 Subject: RE: [R] Test of Homogeneity of Variances

 What are your hypotheses? Once you state what they are, interpretation
 should be straightforward.



 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 On Behalf Of Daren Tan
 Sent: Friday, August 22, 2008 11:18 AM
 To: [EMAIL PROTECTED]
 Subject: [R] Test of Homogeneity of Variances


 I am testing the homogeneity of variances via bartlett.test and
 fligner.test. Using the following example, how should I interpret the
 p-value in order to accept or reject the null hypothesis ?

 set.seed(5)
 x - rnorm(20)
 bartlett.test(x, rep(1:5, each=4))


 Bartlett test of homogeneity of variances


 data: x and rep(1:5, each = 4)
 Bartlett's K-squared = 1.7709, df = 4, p-value = 0.7778

 fligner.test(x, rep(1:5, each=4))

 Fligner-Killeen test of homogeneity of variances


 data: x and rep(1:5, each = 4)
 Fligner-Killeen:med chi-squared = 1.0819, df = 4, p-value = 0.8971

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 This email message, including any attachments, is for ...{{dropped:6}}
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Test-of-Homogeneity-of-Variances-tp19109383p19112613.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Test of Homogeneity of Variances

2008-08-22 Thread Liaw, Andy
You don't need to test that the _sample_ variances are different:  They
already are.  Statistical tests of hypotheses are not about sample
statistics, but distributional characteristics.

It seems to me that reinforcement of some basic stat concept may do you
quite a bit of good.  If you don't have the basics, you're just going to
get more and more puzzled every step of the way.  Just a frank
suggestion.

Best,
Andy 

From: Daren Tan
 
 I am testing whether the sample variances are equal. When the 
 p-value > 0.05 (alpha), should I accept the null hypothesis (sample 
 variances are equal) or reject it?
 
 
 The two new examples, each having the same sample variances, 
 also puzzle me. Why are the p-values different?
 
 bartlett.test(rep(rnorm(5),times=4), rep(1:5, each=4))
 
 
 Bartlett test of homogeneity of variances
 
 
 data:  rep(rnorm(5), times = 4) and rep(1:5, each = 4)
 Bartlett's K-squared = 0.8681, df = 4, p-value = 0.929
 
 
 bartlett.test(rep(rnorm(5),times=4), rep(1:5, each=4))
 
 
 Bartlett test of homogeneity of variances
 
 
 data:  rep(rnorm(5), times = 4) and rep(1:5, each = 4)
 Bartlett's K-squared = 3.5599, df = 4, p-value = 0.4688
 
 
  From: [EMAIL PROTECTED]
  To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
  Date: Fri, 22 Aug 2008 11:25:36 -0400
  Subject: RE: [R] Test of Homogeneity of Variances
 
  What are your hypotheses? Once you state what they are, 
 interpretation should be straightforward.
 
 
 
  -Original Message-
  From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Daren Tan
  Sent: Friday, August 22, 2008 11:18 AM
  To: [EMAIL PROTECTED]
  Subject: [R] Test of Homogeneity of Variances
 
 
  I am testing the homogeneity of variances via bartlett.test 
 and fligner.test. Using the following example, how should I 
 interpret the p-value in order to accept or reject the null 
 hypothesis ?
 
  set.seed(5)
  x - rnorm(20)
  bartlett.test(x, rep(1:5, each=4))
 
 
  Bartlett test of homogeneity of variances
 
 
  data: x and rep(1:5, each = 4)
  Bartlett's K-squared = 1.7709, df = 4, p-value = 0.7778
 
  fligner.test(x, rep(1:5, each=4))
 
  Fligner-Killeen test of homogeneity of variances
 
 
  data: x and rep(1:5, each = 4)
  Fligner-Killeen:med chi-squared = 1.0819, df = 4, p-value = 0.8971
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
  This email message, including any attachments, is for 
 ...{{dropped:6}}
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
Notice:  This e-mail message, together with any attachme...{{dropped:12}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] inner margins for lattice

2008-08-22 Thread Todd Hatfield
I would like to control the inner margins of a lattice graph.  The graph is
a single superposed panel, but there is too much white space around the data
for my liking.  I've played around with some of the layout options, but so
far have found only how to control outer margins.

R version 2.7.1 (2008-06-23)
MS Vista OS

Thanks for any help.

Todd Hatfield

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R and GNUPLOT

2008-08-22 Thread Fristachi, Anthony
Hello,

 I am trying to send commands to GNUPLOT to load a .plt file that was generated 
with a C++ software module called with R, and to replot.

 I am able to open the program with shell.exec (below), but I am at a loss as 
to what to do next.  Any suggestions??

shell.exec("C:\\gnuplot\\bin\\wgnuplot.exe")

Thanks

---
Tony Fristachi
Battelle, Statistics  Information Analysis
505 King Avenue, Rm 11-7-056
Columbus, OH 43201
Email: [EMAIL PROTECTED]mailto:[EMAIL PROTECTED]
Phone: 614/424.4910   Fax: 614/458.4910
Cell: 513/375.3263
Web: 
http://www.battelle.org/solutions/default.aspx?Nav_Area=TechNav_SectionID=3

We are what we repeatedly do. Excellence then, is not an act, but a habit.
Aristotle


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R and GNUPLOT

2008-08-22 Thread Greg Snow
There are some interface commands in the TeachingDemos package (see ?gp.open) 
for communicating between R and Gnuplot.  These commands are pretty basic, but 
looking at the code may give you some ideas of what to do next in your project 
(or maybe what not to do).
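
[Editorial sketch, not part of the original reply; paths and file names below
are hypothetical, and gnuplot is assumed to be on the PATH.] Without any extra
package, a script file can be run in one shot with system(), or commands can
be streamed over a pipe (on Windows, pgnuplot reads stdin; plain wgnuplot
does not):

```r
## One-shot: run the generated .plt script and keep the plot window open.
system('gnuplot -persist "C:/path/to/file.plt"')

## Interactive: keep a pipe open and send commands as needed.
gp <- pipe("pgnuplot", open = "w")
writeLines(c('load "C:/path/to/file.plt"', "replot"), gp)
close(gp)
```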

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111



 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Fristachi, Anthony
 Sent: Friday, August 22, 2008 12:43 PM
 To: r-help@R-project.org
 Subject: [R] R and GNUPLOT

 Hello,

  I am trying to send commands to GNUPLOT to load a .plt file
 that was generated with a C++ software module called with R,
 and to replot.

  I am able to open the program with shell.exec (below), but I
 am at a loss as to what to do next.  Any suggestions??

 shell.exec("C:\\gnuplot\\bin\\wgnuplot.exe")

 Thanks

 ---
 Tony Fristachi
 Battelle, Statistics  Information Analysis
 505 King Avenue, Rm 11-7-056
 Columbus, OH 43201
 Email: [EMAIL PROTECTED]mailto:[EMAIL PROTECTED]
 Phone: 614/424.4910   Fax: 614/458.4910
 Cell: 513/375.3263
 Web:
 http://www.battelle.org/solutions/default.aspx?Nav_Area=TechN
 av_SectionID=3

 We are what we repeatedly do. Excellence then, is not an
 act, but a habit.
 Aristotle


 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sending ... to a C external

2008-08-22 Thread John Tillinghast
I'm trying to figure this out with Writing R Extensions but there's not a
lot of detail on this issue.
I want to write a (very simple really) C external that will be able to take
... as an argument.
(It's for optimizing a function that may have several parameters besides the
ones being optimized.)

I got the showArgs code (from R-exts) to compile and install, but it only
gives me error messages when I run it. I think I'm supposed to pass it
different arguments from what I'm doing, but I have no idea which ones.

What exactly are CAR, CDR, and CADR anyway? Why did the R development team
choose this very un-C-like set of commands? They are not explained much in
R-exts.

Thanks in advance for any info.

John

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lme questions re: repeated measures covariance structure

2008-08-22 Thread Mark Na
Great, thanks for your reply!

I've just tracked down a copy of Pinheiro  Bates (2000) so I'll look at
that, too.

Thanks again, Mark


On Fri, Aug 22, 2008 at 9:56 AM, Christoph Scherber 
[EMAIL PROTECTED] wrote:

 Dear Mark,

 I would include the repeated measure as the smallest stratum in the random
 effects specification:

 random=~1|sampleunit/year

 Setting up user-defined variance structures should be possible using for
 example:

 weights=varPower(form=~habitat)

 or also try out the available corStruct() classes (found in Pinheiro and
 Bates 2000)
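
[Editorial sketch combining the suggestions above; untested.] Two caveats:
varPower() expects a numeric variance covariate, so if habitat is a factor,
varIdent() is the analogous choice; and nlme has no built-in Toeplitz
corStruct, so corSymm() (general positive-definite) or corARMA() are the usual
stand-ins -- a true Toeplitz structure would need a user-written corStruct
class:

```r
## Hypothetical combined call; 'abundance', 'habitat', 'sampleunit' and
## 'data' are from the original post.
library(nlme)
fit <- lme(abundance ~ habitat - 1, data = data, method = "ML",
           random = ~ 1 | sampleunit,
           weights = varIdent(form = ~ 1 | habitat),       # per-habitat variances
           correlation = corSymm(form = ~ 1 | sampleunit)) # general within-unit corr.
```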

 HTH
 Christoph




 Mark Na schrieb:

 Hello,

 We are attempting to use nlme to fit a linear mixed model to explain bird
 abundance as a function of habitat:


 lme(abundance~habitat-1,data=data,method=ML,random=~1|sampleunit)

 The data consist of repeated counts of birds in sample units across
 multiple
 years, and we have two questions:

 1) Is it necessary (and, if so, how) to specify the repeated measure
 (years)? As written, the above code does not.

 2) How can we specify a Toeplitz heterogeneous covariance structure for
 this
 model? We have searched the help file for lme, and the R-help archives,
 but
 cannot find any pertinent information. If that's not possible, can we
 adapt
 an existing covariance structure, and if so how?

 Thanks, Mark

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 .


 --
 Dr. rer.nat. Christoph Scherber
 University of Goettingen
 DNPW, Agroecology
 Waldweg 26
 D-37073 Goettingen
 Germany

 phone +49 (0)551 39 8807
 fax   +49 (0)551 39 8806

 Homepage http://www.gwdg.de/~cscherb1 http://www.gwdg.de/%7Ecscherb1


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sd(NA)

2008-08-22 Thread Roy Mendelssohn

Hi All:

This was discussed in the R-developers list   (see the thread  
starting at http://tolstoy.newcastle.edu.au/R/e3/devel/ 
07/12/0560.html).  It has to do with the behavior of sd() when the  
entire vector is NA.  The behavior has changed between 2.6 and 2.7.1  
as follows:


Run in Version 2.7.1
 tt <- rep(NA, 10)
 mean(tt, na.rm=T)
[1] NaN
 sd(tt, na.rm=T)
Error in var(x, na.rm=na.rm) : no complete element pairs

Run in Version 2.6.1
 tt <- rep(NA, 10)
 mean(tt, na.rm=T)
[1] NaN
 sd(tt, na.rm=T)
[1] NA

If I understand the discussion, 2.7.1 fails because 'na.rm=TRUE' removes  
the NA's so that the vector is then empty. What I have been unable to find in  
any of the mail lists is how you are supposed to program this in  
2.7.1 when you are looping through a large number of  
variables (we have time series on a grid and are calculating the  
mean and sd of the time series at each grid point).


Any suggestions much appreciated.
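
[Editorial sketch, not part of the original message.] One simple wrapper that
restores the 2.6-style behaviour for this use case, returning NA whenever a
grid cell's series is entirely missing:

```r
## Return NA instead of an error when every value is missing.
sd_safe <- function(x) {
  if (all(is.na(x))) NA else sd(x, na.rm = TRUE)
}

sd_safe(rep(NA, 10))   # NA, where 2.7.1's sd() would throw an error
sd_safe(c(1, 2, NA))   # ordinary sd of the non-missing values
```

Applied over a grid, e.g. apply(grid_array, c(1, 2), sd_safe), this avoids
wrapping every call in try().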

-Roy M.

**
The contents of this message do not reflect any position of the U.S.  
Government or NOAA.

**
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097

e-mail: [EMAIL PROTECTED] (Note new e-mail address)
voice: (831)-648-9029
fax: (831)-648-8440
www: http://www.pfeg.noaa.gov/

Old age and treachery will overcome youth and skill.
From those who have been given much, much will be expected

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sd(NA)

2008-08-22 Thread hadley wickham
Here's one approach:

try_default <- function(expr, default = NA, quiet = FALSE) {
  result <- default
  if (quiet) {
    tryCatch(result <- expr, error = function(e) {})
  } else {
    try(result <- expr)
  }
  result
}

failwith <- function(default = NULL, f, quiet = FALSE) {
  function(...) try_default(f(...), default, quiet = quiet)
}


sd2 <- failwith(NA, sd)
sd2(NA, na.rm=T)

sd3 <- failwith(NA, sd, quiet = T)
sd3(NA, na.rm=T)

Hadley

-- 
http://had.co.nz/



Re: [R] Writing Rcmdr Plugins

2008-08-22 Thread Richard M. Heiberger
Bernhard and Irina, 

While tailoring the Rcmdr menu directly does work, I strongly recommend
against it.  It will lead to mass confusion, most likely for the author,
several months from now when it is installed on a different machine or
when R or Rcmdr is updated.

It is much cleaner to have a separate package.  It is also much easier to
write and maintain a separate package than it is to edit the original
Rcmdr-menus.txt.  While the startup cost to the author of writing an
RcmdrPlugin.xxx is about the same as the cost of modifying Rcmdr itself,
the maintenance is vastly simpler for a plugin package.

It is clear to the user of a plugin package that your changes are not the
original.  When you modify the original Rcmdr, users will not be told
explicitly that they are using a modified version.  That will lead to
confusion on this list.

I distribute a plugin (RcmdrPlugin.HH) on CRAN.  I originally distributed
it as a modification of the Rcmdr.  John Fox invented plugins partially as
a result of my experience.  It is much better for both of us to have a
clear separation of his responsibilities as designer of the Rcmdr and mine
as designer of several specific functions.

Rich

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Pfaff, Bernhard Dr.
Sent: Thursday, August 21, 2008 06:01
To: G. Jay Kerns; Irina Ursachi
Cc: r-help@r-project.org
Subject: Re: [R] Writing Rcmdr Plugins

Dear Irina,

though you asked explicitly about writing an RCommander plugin package, I
just wanted to add that the former approach of tailor-making menus in the
Commander still works.  That is, just copy your R file with the tcl/tk
functions into the /etc directory of the RCommander and include your menu
structure in the file Rcmdr-menus.txt.

Best,
Bernhard



Re: [R] History pruning

2008-08-22 Thread Richard M. Heiberger
We have different starting points.  Please be sure that your modularity
allows a cleaned region as well as a history log to be the input to your
next step.  The history log is incomplete; lines sent to the *R* buffer by
C-c C-n are explicitly excluded from history.  Lines picked up from a saved
transcript file aren't in history.  Will your program handle this correctly:

aa <- bb +
cc

?  It is valid code.  Suppressing the line with cc will make it not valid
code.


Rich



[R] Building colorspace on RHEL5

2008-08-22 Thread hadley wickham
Dear all,

I'm having problems installing the colorspace package on Red Hat
Enterprise Linux 5:

* Installing *source* package 'colorspace' ...
** libs
gcc -std=gnu99 -I/usr/lib/R/include -I/usr/lib/R/include
-I/usr/local/include -fpic  -O2 -g -pipe -Wall
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
--param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic
-fasynchronous-unwind-tables -c colorspace.c -o colorspace.o
In file included from /usr/include/features.h:352,
 from /usr/include/stdlib.h:25,
 from /usr/lib/R/include/R.h:28,
 from colorspace.c:1:
/usr/include/gnu/stubs.h:7:27: error: gnu/stubs-32.h: No such file or directory
colorspace.c: In function 'CheckHex':
colorspace.c:411: warning: implicit declaration of function 'isxdigit'
colorspace.c: In function 'RGB_to_RColor':
colorspace.c:1024: warning: unused variable 'Zn'
colorspace.c:1024: warning: unused variable 'Yn'
colorspace.c:1024: warning: unused variable 'Xn'
colorspace.c: In function 'hex_to_RGB':
colorspace.c:1115: warning: passing argument 1 of 'decodeHexStr'
discards qualifiers from pointer target type
colorspace.c: In function 'polarLAB_to_LAB':
colorspace.c:218: warning: control reaches end of non-void function
colorspace.c: In function 'XYZ_to_LUV':
colorspace.c:314: warning: control reaches end of non-void function
make: *** [colorspace.o] Error 1

I'm running 2.6.1 because lab computers can only have official rpms
and uname -a gives:

Linux rl102-07.lab.rice.edu 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15
EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

Any ideas as to why the compile is failing?

Thanks,

Hadley

-- 
http://had.co.nz/



Re: [R] Building colorspace on RHEL5

2008-08-22 Thread Marc Schwartz
Hi Hadley,

on 08/22/2008 03:52 PM hadley wickham wrote:
 Dear all,
 
 I'm having problems installing the colorspace package on Red Hat
 Enterprise Linux 5:
 
 * Installing *source* package 'colorspace' ...
 ** libs
 gcc -std=gnu99 -I/usr/lib/R/include -I/usr/lib/R/include
 -I/usr/local/include -fpic  -O2 -g -pipe -Wall
 -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
 --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic
 -fasynchronous-unwind-tables -c colorspace.c -o colorspace.o
 In file included from /usr/include/features.h:352,
  from /usr/include/stdlib.h:25,
  from /usr/lib/R/include/R.h:28,
  from colorspace.c:1:
 /usr/include/gnu/stubs.h:7:27: error: gnu/stubs-32.h: No such file or 
 directory

It looks like the above (gnu/stubs-32.h) is the key missing file with
the required headers preventing compilation.

 colorspace.c: In function 'CheckHex':
 colorspace.c:411: warning: implicit declaration of function 'isxdigit'
 colorspace.c: In function 'RGB_to_RColor':
 colorspace.c:1024: warning: unused variable 'Zn'
 colorspace.c:1024: warning: unused variable 'Yn'
 colorspace.c:1024: warning: unused variable 'Xn'
 colorspace.c: In function 'hex_to_RGB':
 colorspace.c:1115: warning: passing argument 1 of 'decodeHexStr'
 discards qualifiers from pointer target type
 colorspace.c: In function 'polarLAB_to_LAB':
 colorspace.c:218: warning: control reaches end of non-void function
 colorspace.c: In function 'XYZ_to_LUV':
 colorspace.c:314: warning: control reaches end of non-void function
 make: *** [colorspace.o] Error 1
 
 I'm running 2.6.1 because lab computers can only have official rpms
 and uname -a gives:
 
 Linux rl102-07.lab.rice.edu 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15
 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
 
 Any ideas as to why the compile is failing?

At least on F9:

$ locate stubs-32.h
/usr/include/gnu/stubs-32.h

$ rpm -q --whatprovides /usr/include/gnu/stubs-32.h
glibc-devel-2.8-8.i386


So:

  # yum install glibc-devel

should take care of it, presuming that you are not missing anything else...

HTH,

Marc Schwartz



[R] Newbie programming help

2008-08-22 Thread Ranney, Steven
All - 

Not sure if this is a real programming question, but here goes:

I have data that looks like

LakeLength  Weight
1   158 45
1   179 70
1   200 125
1   202 150
1   206 145
1   209 165
1   210 140
1   215 175
1   216 152
1   220 150
1   221 165
...

where lake goes from 1 - 84 and the number of rows for each lake is variable 
(but < ~20).
I'm trying to do two things: 1) build a simple linear model of the form

lm(log10(Weight) ~ log10(Length))

for every separate lake in the data set; 2) I'd like to save the intercepts
and slopes from each of these linear regressions into a separate data frame.
Any ideas?  I think it would probably require some kind of 'for' statement,
but I'm just not that smart.

Thanks for your help, 

SR  

Steven H. Ranney
Graduate Research Assistant (Ph.D)
USGS Montana Cooperative Fishery Research Unit
Montana State University
PO Box 173460
Bozeman, MT 59717-3460

phone: (406) 994-6643
fax:   (406) 994-7479





Re: [R] help needed for HWE.exact in library genetics

2008-08-22 Thread David Duffy

Sun, Ying asked:


I have genotype data for both cases and controls and would
like to calculate the HW p-value.  However, since the
count of one genotype is 0, I got a weird result.  Would
someone help me to figure it out?  Or confirm it's right?
Thanks a lot.

...

> HWE.exact(g1)
Exact Test for Hardy-Weinberg Equilibrium
data:  g1
N11 = 71, N12 = 9, N22 = 0, N1 = 151, N2 = 9,
p-value = 1


Yes, that is correct.  Double check it by calculating the
goodness-of-fit chi-square for the same hypothesis:

> p <- 151/160; q <- 1-p; 80*c(p^2, 2*p*q, q^2)
[1] 71.253125  8.493750  0.253125
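A hedged completion of that spot check (expected counts as above, plus the goodness-of-fit statistic and its chi-square p-value, which is what the one-liner stops short of):

```r
# Observed genotype counts from the post: N11 = 71, N12 = 9, N22 = 0.
obs <- c(71, 9, 0)
p <- 151/160; q <- 1 - p
expd <- 80 * c(p^2, 2*p*q, q^2)            # 71.253  8.494  0.253
X2 <- sum((obs - expd)^2 / expd)           # ~0.284
pchisq(X2, df = 1, lower.tail = FALSE)     # ~0.59: no evidence against HWE
```

The chi-square approximation is dubious with an expected count of 0.25, which is exactly why the exact test is the better tool here; the point is only that both agree there is nothing to see.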
...

David Duffy.

--
| David Duffy (MBBS PhD) ,-_|\
| email: [EMAIL PROTECTED]  ph: INT+61+7+3362-0217 fax: -0101  / *
| Epidemiology Unit, Queensland Institute of Medical Research   \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v



Re: [R] Newbie programming help

2008-08-22 Thread Chuck Cleland
On 8/22/2008 5:34 PM, Ranney, Steven wrote:
 All - 
 
 Not sure if this is a real programming question, but here goes:
 
 I have data that looks like
 
 Lake  Length  Weight
 1 158 45
 1 179 70
 1 200 125
 1 202 150
 1 206 145
 1 209 165
 1 210 140
 1 215 175
 1 216 152
 1 220 150
 1 221 165
 ...
 
 where lake goes from 1 - 84 and the number of rows for each lake is variable 
 (but  ~20).  
 I'm trying to do two things: 1) build a simple linear model of the form 
 
 {lm(log10(Weight)~log10(Length)}
 
 for every separate lake in the data set; 2) I'd like to save the intercepts 
 and slopes 
 from each of these linear regressions into a seperate data frame.  Any ideas? 
  I think it would 
 probably require some kind of 'for' statement, but I'm just not that smart.

  Assuming the data are in a data frame called mydf:

library(nlme)

fm1 <- lmList(log10(Weight) ~ log10(Length) | Lake, mydf)

coef(fm1)

?lmList

or

t(sapply(split(mydf, mydf$Lake),
function(x){coef(lm(log10(Weight)~log10(Length), data=x))}))
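A self-contained sketch of that second, base-R route, with invented toy data standing in for the poster's mydf so the output can be checked:

```r
# Toy data standing in for the poster's mydf (values invented so the
# per-lake fits are exact: Weight = Length^2 for lake 1, Length^3 for 2).
toy <- data.frame(Lake   = rep(1:2, each = 4),
                  Length = rep(c(10, 20, 40, 80), 2),
                  Weight = c(c(10, 20, 40, 80)^2, c(10, 20, 40, 80)^3))

# One (Lake, intercept, slope) row per lake, base R only:
coefs <- do.call(rbind, lapply(split(toy, toy$Lake), function(d) {
  cf <- coef(lm(log10(Weight) ~ log10(Length), data = d))
  data.frame(Lake = d$Lake[1], intercept = cf[[1]], slope = cf[[2]])
}))
coefs   # slopes 2 and 3, intercepts ~0, by construction
```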

 Thanks for your help, 
 
 SR  
 
 Steven H. Ranney
 Graduate Research Assistant (Ph.D)
 USGS Montana Cooperative Fishery Research Unit
 Montana State University
 PO Box 173460
 Bozeman, MT 59717-3460
 
 phone: (406) 994-6643
 fax:   (406) 994-7479
 
 
 


-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894



[R] Using interactive plots to get information about data points

2008-08-22 Thread jcarmichael

I have been experimenting with interactive packages such as iplots and playwith.
Consider the following sample dataset:

A  B  C  D
1  5  5  9
3  2  8  4
1  7  3  0
7  2  2  6

Let's say I make a plot of variable A.  I would like to be able to click on
a data point (e.g. 3) and have a pop-up window tell me the corresponding
value for variable D (e.g. 4).  I am also trying to produce multiple small
plots.  For example, four side-by-side boxplots for each of the four
variables A, B, C, D.

Any help would be greatly appreciated!  Thanks!
-- 
View this message in context: 
http://www.nabble.com/Using-interactive-plots-to-get-information-about-data-points-tp19116114p19116114.html
Sent from the R help mailing list archive at Nabble.com.



[R] How to handle ~ character after csv importation

2008-08-22 Thread Sébastien

Dear R users,

I have to import some csv files in which column headers contain the
character "~".  After the following import call, the character seems to be
replaced by dots in the column names of my data frame.  Also, I cannot
query names(mydata) to find the index of a column whose header contains
"~" or ".".


> mydata <- data.frame(read.table("mycsv.csv", sep = ",", header =
TRUE, na.strings = ".", as.is = TRUE))

> mydata
  P1 P2 P3 P1.P2 P1.P3 P2.P3 A B C
1  1  2  3     4     5     6 7 8 9
> names(mydata)
[1] "P1"    "P2"    "P3"    "P1.P2" "P1.P3" "P2.P3" "A"     "B"     "C"
> grep("~", names(mydata))
integer(0)
> grep(".", names(mydata))
[1] 1 2 3 4 5 6 7 8 9

How can I check for this character?  And by the way, can anyone explain
to me the result of the last grep call?
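For what it's worth, a short self-contained sketch (using a tiny throw-away file with hypothetical headers) of what is going on and one way out: read.table() runs headers through make.names() unless check.names = FALSE, and grep(".") matches every name because "." is a regex wildcard, so a literal match needs fixed = TRUE:

```r
tf <- tempfile(fileext = ".csv")
writeLines(c("P1,P2,P1~P2", "1,2,3"), tf)

mangled <- read.table(tf, sep = ",", header = TRUE)   # check.names = TRUE (default)
intact  <- read.table(tf, sep = ",", header = TRUE, check.names = FALSE)

names(mangled)                           # "~" replaced by "." via make.names()
names(intact)                            # "P1" "P2" "P1~P2"
grep("~", names(intact), fixed = TRUE)   # 3 -- literal match
grep(".", names(intact))                 # 1 2 3 -- "." is a regex wildcard
```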


Thank you in advance.

Sebastien



[R] dimension values of an array

2008-08-22 Thread Jan Akko Eleveld

Question: 
 
I created an array with 164 columns and 70 rows. 
 
I would like every column to have unit value 1 
and the rows 0.1, 0.2, 0.3 up to 7. How can I define that?
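One reading of the request (hedged, since the intent is not fully spelled out in the post) is a 70 x 164 matrix filled with 1s whose row labels run 0.1, 0.2, ..., 7:

```r
# Sketch under the assumption that "unit value 1" means the cell contents
# and 0.1 ... 7 are the row labels (dimnames):
m <- matrix(1, nrow = 70, ncol = 164)
rownames(m) <- seq_len(70) / 10   # "0.1" "0.2" ... "7"
m["0.3", 1]   # 1
```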
 
Thanks for your help!
 
Good weekend!
 
Akko
_





[R] nlme gls front end crash

2008-08-22 Thread Daniel Malter
Hi, when I try to run GLS, R crashes. I have seen on the mailing list that
somebody has encountered the problem before (
http://tolstoy.newcastle.edu.au/R/help/05/01/10915.html ), but there seemed
to be no solution / the thread appears incomplete. The program runs gls1
(without correlation structure; code below) and crashes when I run gls2
(with correlation structure; code further below). Windows then shows a
standard crash window (the one where Windows asks to send an error report to
Microsoft) saying something like: R GUI has encountered a problem and has
to close. We are sorry for the inconvenience...  

The crash seems to depend on the specific correlation structure. It does not
crash with all correlation structures. My data has about 2000 data rows. Is
this a memory problem?

Thanks for any advice.
Daniel


The R code I am running is the following

newdata.new <- groupedData(cash.end ~ rs | subj, data = data.new)
cs1 <- corARMA(value = c(0.9), p = 1, form = ~ rs | subj)
cs1 <- Initialize(cs1, data = newdata.new)
corMatrix(cs1)

gls1 <- gls(log(cash.end+1) ~ factor(rs)+nogs+nols+ret+factor(poc)+posh+posl+negh+negl,
            data = newdata.new, na.action = na.omit)

gls2 <- gls(log(cash.end+1) ~ factor(rs)+nogs+nols+ret+factor(poc)+posh+posl+negh+negl,
            data = newdata.new, na.action = na.omit, corr = cs1)

And the session info is: 

R version 2.6.2 (2008-02-08) 
i386-pc-mingw32 

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] nlme_3.1-87

loaded via a namespace (and not attached):
[1] grid_2.6.2 lattice_0.17-4


-
cuncta stricte discussurus



Re: [R] Visualization of two-mode matrix/ networks/ graphs

2008-08-22 Thread Brendan Casey

Jakob,

Could you please tell me how you got your 2-mode data INTO R/statnet/sna,
etc... ?

I can do many things with this data in ucinet, and even export it to pajek,
but I cannot get it to import even using the read.paj routine.  I have about
620 actors involved in 230 events.  When I convert it to an actor only
network, I also cannot get rid of the arrows!  I am trying to import from a
Pajek .net file

Success so far with drawing the two modes differently in UCInet, but I want
to do more statnet work on these networks, and also use the event timestamps
to show dynamic features of the network down the line.  Anyone?

-Brendan Casey



Jakob Mumm wrote:
 
 Hey all,
 I'm looking for a plotting method for two-mode networks. Having a  n x m
 matrix where n is the first set of entities/ actors and m representing the
 second set, I want to colour both sets of actors differently. I tried
 plot.network from the network-package, but I did not succeed. Even the
 proposed solution from Gabor Grothendiek
 (http://finzi.psych.upenn.edu/R/Rhelp02a/archive/87003.html ) seems not
 well
 suited for the problem. Further, I'm wondering how to interpret
 vertex.col :
 color for vertices; may be given as a vector or a vertex attribute name,
 if
 vertices are to be of different colors. . For the latter solution via
 attribute name the documentation gives the following example
 vertex.col=2+(network.vertex.names(nflo)==Medici which is not a
 problem,
 but how to use the former solution via vector? 
  The command gplot out of the sna-package should have incorporated a way
 of
 considering two-mode matrix via gmode==twomode (see help for gplot), but
 then one has to work with graphs and will loose (so far I have caught it)
 information which is stored in the class of network. 
 For example, take the following adjacency matrix, where 1-6 (rows) is the
 first set of entities and 7-9 (cols) the second:
   7 8 9
 1 0 0 0
 2 1 0 0
 3 1 0 0
 4 0 0 0
 5 0 0 0
 6 0 0 0
 Thanks in advance,
 Jakob 
 
   [[alternative HTML version deleted]]
 
 

-- 
View this message in context: 
http://www.nabble.com/Visualization-of-two-mode-matrix--networks--graphs-tp12959659p19114539.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sending ... to a C external

2008-08-22 Thread Emmanuel Charpentier
On Friday, 22 August 2008 at 15:16 -0400, John Tillinghast wrote:
 I'm trying to figure this out with Writing R Extensions but there's not a
 lot of detail on this issue.
 I want to write a (very simple really) C external that will be able to take
 ... as an argument.
 (It's for optimizing a function that may have several parameters besides the
 ones being optimized.)

!!! That's a hard one. I have never undertaken this kind of job, but I expect
that your ... argument, if you can reach it from C (which I don't know), will
be bound to a Lisp-like structure, notoriously hard to decode in C. Basically,
you'll have to create very low-level code (and duplicate a good chunk of the R
parser-interpreter...).

I'd rather treat the ... argument in a wrapper that could call the relevant 
C function with all arguments interpreted and bound... This wrapper would
probably be an order of magnitude slower than C code, but two orders of 
magnitude
easier to write (and maintain !). Since ... argument parsing would be done 
*once*
before the grunt work is accomplished by C code, the slowdown would (probably)
be negligible...
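A minimal sketch of that wrapper strategy (pure R; the final sapply() is a stand-in for the eventual .C()/.Call(), so the whole thing runs anywhere):

```r
# Collect and evaluate the ... arguments once, in R, then hand a plain,
# already-bound list to the compiled routine.
wrapper <- function(x, ...) {
  extras <- list(...)                 # ... decoded here, not in C
  sapply(extras, function(a) x + a)   # stand-in for the .C()/.Call() step
}

wrapper(1, a = 10, b = 100)   # named vector: a = 11, b = 101
```

Since list(...) forces evaluation exactly once before the grunt work starts, this is the "parse once, compute many" pattern the paragraph above recommends.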

 I got the showArgs code (from R-exts) to compile and install, but it only
 gives me error messages when I run it. I think I'm supposed to pass it
 different arguments from what I'm doing, but I have no idea which ones.
 
 What exactly are CAR, CDR, and CADR anyway? Why did the R development team
 choose this very un-C-like set of commands? 

'Cause they are fundamental to the Lisp-like language that is S/R. Read on...

 They are not explained much in
 R-exts.

At the risk of incurring the R-help deities' wrath:

In the beginning, John McCarthy created Lisp 1.5, which begat MACLISP,
which begat ..., which begat Scheme ... .  Notwithstanding a long
inheritance story, full of sound, fury, holy wars and bastardry (sometimes
evoking European royalty lines...), all Lisp-like languages bear one mark
(dominant allele) of their common ancestor, which is the list structure.
This structure is implemented as a pair of pointers, the first one pointing
to the first element of the list, the second pointing to ... the list of
the rest of the members (or nothing, aka NIL).

In McCarthy's implementation, which was machine code on an IBM 704 (we were
in 1957, mind you...), such a pair was split across different parts of a CPU
word, known in the technical documentation of this dinosaur as the address
register and the decrement register respectively.  Hence the mnemonic names
Contents of the Address Register and Contents of the Decrement Register
(aka CAR and CDR) for these two pointers, names which were used for the
(Lisp) functions allowing access to them.

Those names have stuck, mainly for historical (sentimental? hysterical?)
reasons.  Even when reasonable synonyms were introduced (e.g. "first" and
"rest" in Common Lisp), all the old hands (and the young dummies who wanted
to emulate them) kept car and cdr close to their hearts.

So, fifty-one years after McCarthy's stroke of genius, this piece of CS
snobbery is still with us (and probably will be until the 64-bit Linux time
counters roll over...).

Despite their C-like syntax, their huge collection of array and array-like
structures, and their loops, S and R are fundamentally Lisp-like languages.
Most notably, the representation of executable code is accessible by the
code itself: one can create a function which computes another function.
Lisp was the first language explicitly created for this purpose, and it is
no happenstance that many (most?) such languages build on Lisp
fundamentals.  Many common R/S idioms have a Lisp smell: for example, it is
no accident that {s|t|l}apply(), outer() and suchlike are *way* more
efficient than for(), and they are strongly reminiscent of Lisp's mapcar...

BTW, this ability to compute on the language is probably the most
fundamental point distinguishing S/R from all the rest of the statistical
packages (SPSS, Stata, SAS and the rest of the crowd...).

Now I have to say that I'm not old enough to have first-hand knowledge of
this history.  I was born, but not weaned, when McCarthy unleashed Lisp on
an unsuspecting world...  I learned all that ca. 1978-9, while discovering
VLisp.

While I can't really help you (I still think that processing ... at the C
level is either hubris of the worst sort, pure folly or a desperate case),
I hope to have entertained you.

Sincerely,

Emmanuel Charpentier



[R] Deleting NA in list()

2008-08-22 Thread Dong-hyun Oh

Dear useRs,

I would like to know the way of deleting NA in list().

Following is a example.

lt <- list(a = 1:3, b = NA, c = letters[1:3], d = NA)

for(i in length(lt):1) {
    if(is.na(lt[[i]])) lt[[i]] <- NULL
}

How can I simplify the for() loop by using the (l)apply family?
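One hedged sketch of an answer, using base R's Filter() (a functional relative of lapply; all(is.na(x)) also avoids the warning that if(is.na(...)) gives for elements of length > 1):

```r
lt <- list(a = 1:3, b = NA, c = letters[1:3], d = NA)

# Keep only the elements that are not entirely NA:
lt2 <- Filter(function(x) !all(is.na(x)), lt)
names(lt2)   # "a" "c"
```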

Thank you in advance.




=
Dong-hyun Oh
Center of Excellence for Science and Innovation Studies
Royal Institute or Technology, Sweden
e-mail: [EMAIL PROTECTED]
cel: +46 73 563 45 22



[R] R

2008-08-22 Thread leidy patricia garzon rodriguez
Hi, my name is Leidy.  I am a systems engineering student, and I do not
speak English well, so please excuse my English.

I am working on a project that uses R behind a Java interface, and I have
a problem because I need access to the p-values in the following output:

> s <- box.cox.powers(resp, 2)
> s

Box-Cox Transformation to Normality

 Est.Power Std.Err. Wald(Power=0) Wald(Power=1)
   -0.0419   0.0274        -1.528      -38.0037

L.R. test, power = 0:  2.3469   df = 1   p = 0.1255
L.R. test, power = 1:  1732.1631   df = 1   p = 0

I need to get the p-value of each test, for power = 0 and power = 1; that
is, in this case I need to get 0.1255 and 0, but I cannot get them with
the instruction above.
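A hedged workaround, in case the fitted object does not expose the p-values directly: each "L.R. test" p-value is just an upper-tail chi-square probability on 1 df, so it can be recomputed from the printed statistic (the two statistics below are copied by hand from the output above):

```r
p0 <- pchisq(2.3469,    df = 1, lower.tail = FALSE)  # ~0.1255
p1 <- pchisq(1732.1631, df = 1, lower.tail = FALSE)  # ~0 (underflows to 0)
round(p0, 4)
```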




[R] problem with rbind

2008-08-22 Thread kayj


I am trying to use rbind to stack two data sets on top of each other, but I
am getting an extra X in the column headers and the rows are numbered.  How
do I get rid of this problem?  I appreciate your help.

> x1 <- read.table(file="data1.txt", header=T, sep="\t")
> x2 <- read.table(file="data2.txt", header=T, sep="\t")
> y <- rbind(x1, x2)
> y
      X0   X0.0.05  X0.05.0.1  X0.1.0.2  X0.2.0.3  X0.3.0.4  X0.4.0.5
1  0.1971 0.108 0.1 0.1 0.156 0.38 0.22
2  0.2024 0.106 0.0 0.1 0.15 0.1408173 0.55
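A self-contained sketch of what is happening and one fix (throw-away files with hypothetical headers): the leading "X" comes from make.names() repairing headers that start with a digit, which check.names = FALSE switches off, and the row numbers are just data.frame row names, which print() can suppress:

```r
tf1 <- tempfile(); tf2 <- tempfile()
writeLines(c("0\t0-0.05", "0.1971\t0.108"), tf1)
writeLines(c("0\t0-0.05", "0.2024\t0.106"), tf2)

x1 <- read.table(tf1, header = TRUE, sep = "\t", check.names = FALSE)
x2 <- read.table(tf2, header = TRUE, sep = "\t", check.names = FALSE)
y <- rbind(x1, x2)

names(y)                       # "0"  "0-0.05"  -- no X prefix
print(y, row.names = FALSE)    # output without the row numbers
```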


-- 
View this message in context: 
http://www.nabble.com/problem-with-rbind-tp19116234p19116234.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Simple look up

2008-08-22 Thread PDXRugger

WTF, mate.  You think I'm about wasting time?  I've been seeking a solution
for this for two weeks.  Excuse my lack of programming savvy, but this is
why I posted.

bartjoosen wrote:
 
 Please take a look at the manuals before posting (DO read the posting
 guide!)
 
 Bart
 PS: yourdataframe[yourdataframe$TAZ==103,2]
 
 
 
 
 
 PDXRugger wrote:
 
 So i am hoping this solution is simple, which i beleive it is.  I would
 like to look up a value in one column of a data set and display the
 corresponding value in the 2nd column.  For example
 TAZVACANT ACRES
 100   45
 101   62
 102   23
 103   64
 104   101
 105   280
 
 So if i want to find TAZ 103 it would give me 64 as the corresponding
 value.  Hope you guys can help
 
 Cheers
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Simple-look-up-tp19098777p19113270.html
Sent from the R help mailing list archive at Nabble.com.


